In this article, I want to discuss how we can recover deleted records from an SQLite database, or rather how we can recover all records and distinguish between those that are live in the DB and those that are found in unused areas and do not match a live record. I will also show how the first few bytes of records are regularly overwritten by SQLite structures and how these partial records can be recovered.

Before I can discuss how we do this, it’s quite straight forward with SQLite Forensic Recovery, I need to take you briefly through a slightly simplified structure of a database explaining how the database fits together and how records are stored within the ‘pages’ of the database. At that point, I can give a basic overview of the algorithm used to recover the non-live records which will give you, as the investigator, a handle on how much confidence you can ascribe to one of these records.

I then show you how the deleted records and partial deleted records look when you open the database in The Forensic Browser for SQLite.

Database structure

SQLite databases are made up of a number of pages of a fixed number of bytes, either 1024, 2048, 4096, 8192, 16384, 32768 or 65536. The first 100 bytes of the first page contain a header the remainder of this page follows the same structure as every other page (although the ‘data’ area is reduced by 100 bytes).

A database is made up of a number of tables and indexes (indexes are just a special sort of table) and each table/index uses one or more pages. A page can only be used for one table/index i.e. you will not find live records from different tables in the same page. Each table or index is stored as a B-Tree (Balanced tree) with each B-Tree occupying one or more pages.

There is a master table (sqlite_master) that resides (has its root) in the first page of the database and may overflow into additional pages. The master table tells us where the root of every user created table and index can be found.

The schema for the sqlite_master table is

CREATE TABLE sqlite_master(
type text,
name text,
tbl_name text,
rootpage integer,
sql text

A typical entry for a row in this table follows (in csv form), this is the DbMeta table from a Skype database, we can see the SQL CREATE statement for the table in the SQL column and we can see that the rootpage for this table is at page 2.

“table”,”DbMeta”,”DbMeta”,”2″,”CREATE TABLE DbMeta (key TEXT NOT NULL PRIMARY KEY, value TEXT)”

So to find out where the root page of any table or index is located we just need to read the sqlite_master table at page 1 and locate the entry for the table/index and get the location for the root (rootpage). The first few rows of the sqlite_master table for a Skype database are shown below:

SQLite Pages within a B-tree are classified as either internal or leaf pages, internal pages contain pointers to other internal pages or to leaf pages. Leaf pages contain the data for the table. Typically a table will comprise one root page (it must always have exactly one root page) which points to a number of leaf pages. Larger tables/indexes contain a root page that points to internal pages which in turn point either to more internal pages or to leaf pages. In very small tables the root page can double as a leaf page, essentially a root page is either an internal page or a leaf page without a parent.
As tables are appended to, records are modified or records are deleted the number of pages used by a table can grow or shrink and records can be moved between pages. A list of pages that are not currently in use is held in the freelist, the freelist itself is stored as one or more pages.

Now on to how a page is constructed. Although a page has a structure it is a little more free-form and slightly more complex to describe. There are five main types of pages – leaf and internal pages for both tables and indexes and overflow pages for entries that cannot fit into a single page (Binary Large Objects (BLOBs) such as pictures).

A page starts with an 8 or 12-byte header (varies depending on the type of page – 12 bytes for interior pages and 8 bytes for leaf pages). This is followed immediately by the variable length cell pointer array. Each entry in the cell pointer array is an offset to (points to) the start of a record (a row from the table). The size of the cell pointer array is recorded in the cell header.

The following graphic shows the structure, a header followed by a cell pointer array with each element in the array pointing to a record. The records are variable length and can be interspersed with areas of unused space.

We can see from the above that there is unused space within a database page which we need to consider when carving records along with those pages that are marked as completely unused.

However, before we carve it would be useful to have a look of the structure or a record/row within a table to see how each column is stored and the limitations this imposes on us when carving deleted or partial records.

It is worth noting at this point that SQLite Forensic Explorer allows a user to step through each and every byte in a database page, decoding the various structures. Some of the screenshots that follow have been created using this program.

So how is a record stored on a disk, A record is usually stored as a contiguous series of bytes, a header which holds a few different lengths (and almost invariably holds the rowid) followed by the body of payload.

The payload is split into two portions – a serial type, one for each column in the row, followed immediately by the data for each column appended back to back, The following diagram may help:

In the diagram above the record structure is for a table of the following form (it’s not quite correct but it makes the explanation easier):

CREATE TABLE test (id int, name text, age int, initials text, gender text, state text, country text)

The first thing to note is that there are 7 columns in the table definition but only 6 in our diagram. When SQLite stores a row of data it only stores columns up until the last column that contains data, any other columns (containing NUL) are ignored to save space. So by the lack of a record for the last column (country), we can conclude that in this record the country field is NUL.

The data for the first column is an int and this integer value is stored immediately following the last serial type (txt7).

The second column is an 8 character text column and the data follows the integer in the previous column, column 1,

The third column, an int, is stored following column 2,

Column 4, a three character text column, follows column 3,

Column 5 is NUL, there is a serial type for this column but NO DATA is stored,

Column 6 is a 7 character text column and the data for this column now immediately follows column 4.

We can see from the above that, in simple terms, the record structure can be sparse at two levels, once there is no more data to store at the end of a row then all column serial types (and naturally no column data) is stored in the table. If column is nul but is followed by further populated columns then the serial type is recorded as null but, logically, no data is associated with it.

From the above you can see that there are two types of unallocated records:

  1. Those in unused pages where there is a pointer from the cell pointer array to the page
  2. Those in unused space that we can match to one of the tables

Freeblock Records

There is a final consideration when dealing with deleted records, the freeblock structure.

SQLite maintains a linked list of freeblocks in a page. A free block is a block of free space that follows the first allocated record.

In the graphic below the record header points to the offset of the first freeblock (this entry will be 0 if there are no freeblocks), i.e. offset 753. The first two bytes at this offset are a 16-bit integer that point to the next freeblock 0x0379 = offset 889, the second two bytes are the size of this free block 0x0026 = 98 bytes.

The second freeblock has the first two bytes = 0x0000 this signifies that this is the last freeblock in the database page. This freeblock is 0x0024 bytes in length i.e. 36 bytes.

When a record is deleted (or moved to another page) and the space allocated is freed up then a freeblock is written to the page and the freeblock list updated accordingly. If we look at the record at offset 889 before it was deleted and the record afterwards then we can see the effect this has on the deleted record

In the combined screenshots from The Forensic Explorer for SQLite below the upper image shows the record before it was deleted along with a breakdown of the record structure (to the right). The lower image shows the freeblock and how the first four bytes of the record have been overwritten with 0x00000024:

There is, of course, a need to be able to forensically carve all three types of deleted record. The Forensic Toolkit for SQLite can do this

Recovering deleted data from databases

Now we need to look, in simplified form, at the process used when carving data from a deleted page, i.e. a page where the cell pointer array is intact and each entry in the cell pointer array ‘points’ to a row of data from a table.
The initial process is simple, identify a leaf page and for each entry in the cell pointer array read the associated record/row.
In the screenshot below the centre panel has a complete record highlighted in yellow, cell pointer array at offset 8 (0x00D4 = 212) points to the start of this record. The record has 6 bytes (offset 216) that comprise the record serial types and these serial types are:

  • Txt5
  • Txt11
  • Txt11
  • Int
  • Txt99

All that is required now is that we find a matching table definition that has the form txt, txt, txt, int, txt. If you have been paying attention you will recall that the sqlite_master table has the form:

CREATE TABLE sqlite_master(
type text,
name text,
tbl_name text,
rootpage integer,
sql text

This matches nicely with the record structure we have recovered so we can insert the data associated with each of these serial types into a new (recovered) sqlite_master table.

If we now look at a more complex example from a Kik database, in the screenshot below, we can see that this record has 6 bytes of record serial types (record keys). We can also see that the serial types are:

  • NULL
  • Txt36
  • NULL
  • Int
  • Txt41
  • NULL
So how do we match up this carved record with the table to which it belongs? Simply put we have to match the NULL columns with any column. An examination of the schema for the Kik database shows that the following table matches NULL, Txt, NULL, Int, Txt, NULL.

content_id VARCHAR,
platform VARCHAR,
priority INT,
content_uri VARCHAR,

Using the forensic Browser to carve deleted records from an Existing SQLite database

From version 2.0.0 the Forensic Browser can automatically identify and carve all three types of deleted record from an SQLite database.

Simply create a case file and select a source database. The Browser will then ask whether you want to identify deleted records and if so will create a new database with some changes to allow for possible duplicate records and older, different, versions of a record that might have the same primary key as an existing record.

You can also select whether to identify records (including partial records) in any associated journal or WAL files.

The Forensic Browser will now automatically attempt to identify deleted records and populate a new table with all the records, existing (live), recovered and partial.

Recovered records are those where the header is intact and all columns from the table should be present (unless records overflow into another database page)

Partial records are those where the row header has been overwritten, in this case, there will be some missing data and the Forensic Browser will attempt to reconstruct this data.

The Browser also creates an MD5 hash of the content of each column in a row to help identify duplicate records, the Browser will not insert a recovered record into the table if a live record with the same hash exists.

The following animated gif shows the full procedure for recovering deleted records from an IOS mmssms.db file and creating a simple report showing converted date and time along with the message details and the status (live/recovered/partial) and filtering on just the recovered records: