Speech Bubbles

This short video shows how to use the speech bubbles report functionality for the Forensic Browser for SQLite. This new report option will be available in version 3.3.0 which will be released shortly.

Investigating a database using foreign keys

SQL is an extremely powerful programming language, and understanding SQL database schemas can often help immensely when creating queries on the database in question. 

The particular feature that I want to discuss in this blog is foreign keys, and I intend to show their value by example, as if I have just been asked to create a report on a database (a BlackBerry Messenger master.db from an iOS device) of which I have little knowledge. To help with this, let’s assume I have been asked to investigate file transfers between the user of this DB and his/her contacts and to create a simple report detailing the transfers.

This example is, of course, a little contrived but it should, I hope, show the principles well.

First, what are Foreign Keys?

Simply put, foreign keys provide a way of ensuring that relationships between tables are valid. For example, in the case of a simple messaging database, they can ensure that for every message with a userID there is an entry for that userID in the users table.

Code:

CREATE TABLE messages(
  time INTEGER PRIMARY KEY, 
  message TEXT,
  userID INTEGER,
  FOREIGN KEY (userID) REFERENCES users(userID)
);

CREATE TABLE users(
  userID INTEGER PRIMARY KEY,
  username TEXT
);

In the example above, trying to add a new message with a userID that does not exist in the users table will fail; likewise, attempting to delete a user while a message still refers to that userID will also fail.
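The behaviour is easy to demonstrate. The short Python script below is a sketch using the stdlib sqlite3 module; note that SQLite only enforces foreign keys once PRAGMA foreign_keys = ON has been issued on the connection (enforcement is off by default):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# SQLite does NOT enforce foreign keys unless this pragma is set
# on the connection.
con.execute("PRAGMA foreign_keys = ON")

con.execute("CREATE TABLE users (userID INTEGER PRIMARY KEY, username TEXT)")
con.execute("""
CREATE TABLE messages(
  time INTEGER PRIMARY KEY,
  message TEXT,
  userID INTEGER,
  FOREIGN KEY (userID) REFERENCES users(userID)
)""")

con.execute("INSERT INTO users VALUES (1, 'alice')")
con.execute("INSERT INTO messages VALUES (100, 'hello', 1)")  # ok: user 1 exists

insert_error = delete_error = None
try:
    # no user 99 exists, so this insert breaks the constraint
    con.execute("INSERT INTO messages VALUES (101, 'oops', 99)")
except sqlite3.IntegrityError as e:
    insert_error = e

try:
    # user 1 is still referenced by a message, so the delete fails too
    con.execute("DELETE FROM users WHERE userID = 1")
except sqlite3.IntegrityError as e:
    delete_error = e

print(insert_error)  # FOREIGN KEY constraint failed
print(delete_error)  # FOREIGN KEY constraint failed
```

Both statements are rejected with an IntegrityError, exactly as described above.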

When they exist, they therefore provide a useful clue to help us understand the relationships between tables.

Note there is much more to foreign keys than in my explanation above, but this is sufficient to understand the main points of this blog. There is a link to the SQLite website at the end of this blog where foreign keys are discussed at length.

The rest of this blog follows the basic thought processes I might go through when initially determining which tables relate to each other and creating the query for my final report.

The DB I am looking at has 58 tables and working out the relations between those tables is obviously going to be long-winded, so where do we start?

To decide what we need to do it is, of course, useful to understand something about the investigation. In this case, as discussed above, we are interested in file transfers between users.

There is a table named FileTransfers, so this would be a logical place to start looking. The content of the table looks as follows:

and the schema for the table:

Code:

CREATE TABLE FileTransfers
(
  FileTransferId   integer not null primary key autoincrement,
  ConversationId   integer not null,
  RemotePin        text not null,
  UserId           integer not null,
  ParticipantId    integer null,
  ContactId        integer null,
  Incoming         integer not null,
  State            integer not null,
  AbortReason      integer null,
  Path             text null,
  ContentType      text not null,
  Description      text not null,
  SuggestedFilename   text null,
  TotalSize        integer null,
  BytesTransferred integer not null,
  GroupId          integer null,
  foreign key (UserId)   references Users (UserId),
  foreign key (ParticipantId)  references Participants (ParticipantId)  on delete set null,
  foreign key (ContactId)  references Contacts (ContactId)  on delete set null
)

The foreign key references should stand out straight away: they show us that the Users, Participants and Contacts tables are all related to this table, and we can see which columns are used to formulate each relationship. Useful information.

The users table looks as follows:

We can now create a simple join to start building some sort of report. What I would like is a report that shows:

the transfer ID, the file path and name, whether the file was sent or received, when, and by whom.

Before I do this I like to take a look at the base table, in this case the FileTransfers table, to get an idea of what I expect to see in my query. The FileTransfers table contains 22 rows, and so it follows that my report, after I have created my JOINs, should also have 22 rows. This might seem a little obvious, but taking stock and making sanity checks as the report is built will pick up any errors early in the process.

The fields we want from the FileTransfers table are:

FileTransferId, 
Path,
SuggestedFilename,
Incoming,
UserId

The field from Users is:

DisplayName

We can see from the foreign key definition above that the two tables are joined on the UserId column, so we will use the same tables and columns as the foreign key for our JOIN.

As we want one row in the final report for each row in the FileTransfers table, we want a LEFT JOIN; if you need a refresher on join types, have a look at the article at this link.

The query to create this report is shown below, and we can see that there are 22 rows in it, as expected:

The only thing missing is when the files were transferred. An examination of the FileTransfers table shows that this information is not stored there, and it is unlikely to be in Users, Participants or Contacts (I have checked – trust me). So how do we find out when the file was transferred? One method is to query the sqlite_master table to see if any tables reference the FileTransfers table (rather than the other way around, which is what we have been looking at so far).

The query to do this is shown below. Note that I tend to use “like” in my queries rather than “=” i.e.:

Code:

SELECT *
FROM sqlite_master
WHERE sqlite_master.sql LIKE '%references filetransfers%'

rather than

Code:

SELECT *
FROM sqlite_master
WHERE sqlite_master.sql = '%references filetransfers%'

This is simply because “=” is case sensitive in SQLite while “LIKE” is case insensitive (and “=” treats the % characters as literals rather than as wildcards). Using LIKE avoids missing something, particularly, as in this case, where the schema has capitalized portions of the table names.
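A quick way to satisfy yourself of the difference, sketched here in Python with the stdlib sqlite3 module and a made-up one-row table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo (sql TEXT)")
con.execute("INSERT INTO demo VALUES "
            "('foreign key (UserId) references FileTransfers (FileTransferId)')")

# LIKE is case insensitive (for ASCII) and expands % as a wildcard...
like_rows = con.execute(
    "SELECT * FROM demo WHERE sql LIKE '%references filetransfers%'").fetchall()

# ...whereas = is case sensitive and treats % as a literal character,
# so this matches nothing.
eq_rows = con.execute(
    "SELECT * FROM demo WHERE sql = '%references filetransfers%'").fetchall()

print(len(like_rows), len(eq_rows))  # 1 0
```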

There are three tables with foreign keys that reference the FileTransfers table, the most promising of these looks to be the TextMessages table, which has the following foreign key:

Code:

CREATE TABLE TextMessages
(
  ...
  foreign key (FileTransferId)  references FileTransfers (FileTransferId)  on delete set null,
  ...
)

So we take a quick look at the TextMessages table and see what is in the FileTransferId column, as this is the table and column referenced above:

Not what we were hoping for: there are no values at all in the FileTransferId column.

So we look at one of the other two tables that reference the FileTransfers table. The PictureTransfers table references it on both the SmallFileTransferId column and the LargeFileTransferId column. I have no idea what these mean, so let’s take a look at this table:

Now this is more interesting: there are also 22 rows in this table, and the SmallFileTransferId column has the same 22 unique IDs that we saw in the FileTransfers table. But there is still no date information.

So we repeat the same process as before and look to see what references the PictureTransfers table, and we see that the TextMessages table references PictureTransfers:

The line of interest here is reproduced below:

CREATE TABLE TextMessages
(
  ...
  foreign key (PictureTransferId)  references PictureTransfers (PictureTransferId)  on delete set null,
  ...
)

This leads us back to the TextMessages table, but this time the referenced field is PictureTransferId, so we check out that column:

An examination of the TextMessages table shows that there is also a timestamp column.

So, before we create our final query, let’s summarise what we have found.

The FileTransfers table is referenced by the PictureTransfers table which is in turn referenced by the TextMessages table. We want a report showing each File Transfer along with various other data but particularly user name and date, so we want a report with 22 rows.
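As a rough sketch of what that final query might look like, the Python script below builds cut-down versions of the four tables (column names taken from the schemas above, except TextMessages.Timestamp and all the sample values, which are assumptions for illustration) and runs the three LEFT JOINs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Minimal versions of the four tables, keeping only the columns our
# report needs. The Timestamp column name and sample data are made up.
con.executescript("""
CREATE TABLE FileTransfers (FileTransferId INTEGER PRIMARY KEY,
    Path TEXT, SuggestedFilename TEXT, Incoming INTEGER, UserId INTEGER);
CREATE TABLE Users (UserId INTEGER PRIMARY KEY, DisplayName TEXT);
CREATE TABLE PictureTransfers (PictureTransferId INTEGER PRIMARY KEY,
    SmallFileTransferId INTEGER);
CREATE TABLE TextMessages (TextMessageId INTEGER PRIMARY KEY,
    PictureTransferId INTEGER, Timestamp INTEGER);
INSERT INTO FileTransfers VALUES (1, '/tmp/pic.jpg', 'pic.jpg', 1, 7);
INSERT INTO Users VALUES (7, 'Alice');
INSERT INTO PictureTransfers VALUES (50, 1);
INSERT INTO TextMessages VALUES (900, 50, 1389721605);
""")

# One row per FileTransfers row, following the chain
# FileTransfers -> PictureTransfers -> TextMessages for the date.
report = con.execute("""
SELECT ft.FileTransferId, ft.Path, ft.SuggestedFilename,
       ft.Incoming, u.DisplayName, tm.Timestamp
FROM FileTransfers ft
LEFT JOIN Users u             ON u.UserId = ft.UserId
LEFT JOIN PictureTransfers pt ON pt.SmallFileTransferId = ft.FileTransferId
LEFT JOIN TextMessages tm     ON tm.PictureTransferId = pt.PictureTransferId
""").fetchall()
print(report)
```

Because every join is a LEFT JOIN from FileTransfers, the row count of the report always matches the row count of the base table, which is the sanity check described earlier.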

The final query is shown below along with the visual designer window. The Incoming column has a yes/no boolean conversion applied, and the timestamp column, a Unix epoch date, is converted by the Browser as such. While the SQL looks complex with three JOINs, the visual query designer shows this in a much more intuitive way.

Of course, all of the above was created using the Forensic Browser for SQLite using drag and drop query building.

If you are a Forensic Browser user and have a similar problem I hope this helps and gives you an idea of how you could approach it, but if you do need further assistance then please do not hesitate to get in touch, it’s all part of the service.

The SQLite website has some very detailed information on Foreign Keys here:

https://www.sqlite.org/foreignkeys.html

How NOT to examine SQLite WAL files

At a recent conference, while talking about SQLite forensics, I found out that some people still use non-forensic tools to investigate databases with WAL files and were quite happy that they would not miss anything of importance. This is something I disagree with very, very strongly, and I hope in this article to explain the very basics of how WAL files work and show why you should not use non-forensic tools to examine SQLite databases.

Most of the databases we examine are appended to as user interactions take place: be it a web browser, where new URLs are appended to the associated tables as and when the user navigates to them, or a messaging application, where each message to and from the user is appended to the end of a list of such messages and new contacts are likewise added to the end of a list of contacts. In the example below we will consider a simplified database which is appended to in the same manner, and we will show what happens when the user tries to hide an event by deleting it.

While the example is simple, the same sequence of events will play out in exactly the same way in a real-world application if a user chooses to delete an incriminating message, or to delete the URL of a site to hide the history of their visit.

For those of you who are not aware, WAL files are Write Ahead Logs. Simply put, when a page is to be updated in the database, the complete new page is appended to the WAL file and the old page is left in the DB unchanged. At some later point the latest copy of each page in the WAL file is copied back to the appropriate place in the DB, and any new pages are then written to the WAL starting from the beginning again.

When an application uses the SQLite library to read a record from the DB, the library determines whether the latest copy is in the DB or the WAL and reads it from the appropriate location.

Typically the pages are copied from the WAL to the DB either when the WAL exceeds 1000 pages or when the application is shut down gracefully. So when investigating a DB which has an associated WAL, all of the most recent pages and changes to the DB are held in the WAL, NOT in the DB.
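This behaviour can be observed directly. The Python sketch below (stdlib sqlite3 module; automatic checkpointing is turned off purely so the demonstration is deterministic) shows committed transactions accumulating in the -wal file, and only reaching the main database file when a checkpoint runs:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "waldemo.db")
con = sqlite3.connect(path)
con.execute("PRAGMA journal_mode = WAL")
con.execute("PRAGMA wal_autocheckpoint = 0")  # suspend automatic copy-back
con.execute("CREATE TABLE t (x TEXT)")
con.commit()

wal = path + "-wal"
size_after_create = os.path.getsize(wal)

# every committed transaction appends new page images to the WAL file
for i in range(10):
    con.execute("INSERT INTO t VALUES (?)", ("row %d" % i,))
    con.commit()
size_after_inserts = os.path.getsize(wal)

# a checkpoint copies the latest pages back into the main DB file
# and (with TRUNCATE) resets the WAL
con.execute("PRAGMA wal_checkpoint(TRUNCATE)")
size_after_checkpoint = os.path.getsize(wal)

print(size_after_create, size_after_inserts, size_after_checkpoint)
```

Until the checkpoint runs, the newest copy of every changed page lives only in the -wal file, which is exactly why that file matters so much forensically.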

The process that some people use, believing it will get all the latest pages (and also show the older version of these pages) is to rename the WAL and open the DB with a non-forensic tool believing this will show the old state of the database – this is correct but also very wrong. They then rename the WAL back and open the DB again believing that this will show the latest state of the database, again this is correct but very, very limiting.

The rest of this article will show, with a very simple example, why you SHOULD NOT use this technique.

The example is a simple database that has events appended from time to time. This could be some logging application or a messaging application. Our example has just one table (of course it could be, and normally would be, more complex) and the sequence of events for initializing the database, creating the table and populating the records is documented below. In the example, three records are added, one is then deleted, and then another record is added. The SQLite commands are as follows (the order is important):

Open an SQLite command prompt creating a test database

Code:

sqlite3 atest.db

Put the DB into WAL mode

Code:

PRAGMA journal_mode = WAL;

Create a test table with one text column

Code:

CREATE TABLE testtable (textcol text);

Add 3 records (you could add more but this keeps it simple)

Code:

INSERT INTO testtable VALUES ('our first text string');
INSERT INTO testtable VALUES ('our second text string');
INSERT INTO testtable VALUES ('our third text string');

Delete the second record

Code:

DELETE FROM testtable WHERE textcol = 'our second text string';

Add one more record

Code:

INSERT INTO testtable VALUES ('our fourth text string');

Now kill the command shell – do not exit SQLite first; this will preserve the WAL file.

The folder containing your test database should look as follows:

The database you created should be present along with the associated WAL file (you can ignore the .db-shm file)

Now make a backup/archive of the database and WAL file (so we can use them both again later) using whatever application you choose.

And now rename the WAL file so that SQLite does not associate it with the parent database

Code:

ren atest.db-wal atest.db-walx

We now want to open the database in a tool of our choosing – we could use the SQLite command prompt but for variation I will use the Firefox SQLite manager plugin.

SQLite Manager shows that there are no tables in the database at all:

And the database settings tab shows that the database is in WAL mode:

So why is the database empty? This is simply a function of the WAL mode as discussed above. All changes to the database are first written to the WAL until either the application is shut down gracefully (remember we killed the command shell, so our sqlite3 exe was NOT shut down gracefully) or more than 1000 pages (by default) are written to the WAL (ours is just eight 1K pages). But we renamed the WAL file, so SQLite can no longer associate it with its parent database – hence it cannot find the records we added.

Now re-instate the database and the WAL file from your backup and, without renaming the WAL file, open the database in the Firefox SQLite manager plugin again (or any non-forensic SQLite tool). You can see from the screenshot below that all the records that we added are there and the record we deleted is not present – this is the live database and this is what we would expect to see.

If we look at the results using the Forensic Browser for SQLite, we see the deleted record still within the WAL, and a complete picture of what has happened in this example.

In more detail:

But why are there multiple copies of some of the records?

When a working copy database is created with the Browser and a WAL file is processed, the additional columns supplied by the Forensic Browser for SQLite provide information about where each record was found.

The first additional column, sfIsLive, shows whether a record is live or not. Live records are those that the user would see if they opened the database using a normal, non-forensic SQLite database viewer. Records where sfIsLive = false are records that have been deleted OR older copies of live records.

sfSource shows where the record was recovered from; in this case, all of the records are from the WAL file.

sfPage is the database page number that the WAL frame holds; in this case, all of the records are from DB page number 2.

sfCommit is the commit number (more on this shortly).

sfpk is just a unique number for each record – we will use this to refer to each record in our description of events below.

Some background

SQLite is an ACID-compliant database manager and, without going into too much detail, maintains its integrity by using journals and transactions. Each transaction can be one or more writes (pages) to a database. The last write/page of a transaction is known as the commit frame. When using the SQLite command line as we have just done, every write is a single transaction and every write is, therefore, a commit frame. So the sfCommit number shows all of the pages (and rows) that were written to the database in each transaction.

So looking at the example above we can see that sfCommit 1 (sfpk 4) comprised a single row (our first text string) written to frame 2 of the WAL, representing page 2 of the database.

The second transaction, sfCommit 2, is two rows (sfpk 5 & 6): this is a new page written to frame 3 of the WAL, but again it is a new version of page 2 of the database. This transaction is the addition of the second row of the table, but it entails a complete rewrite of page 2, hence this new copy of page 2 holds both rows.

The third transaction (sfpk 7-9) involves page 2 being rewritten again with the addition of the third entry to the table.

The fourth transaction (sfpk 10 & 11) again rewrites page 2 but in this case, the change is the deletion of the second row.

The final transaction (sfpk 1-3) is the addition of the fourth row of text.

So using the Forensic Browser for SQLite we can see the deleted record that we would not otherwise have seen using the ‘trick’ that some offer as a method of dealing with WAL files.

We can also see when this particular record was deleted, i.e. in transaction 4. In a normal application, such as a messaging app, there may be date and time information in transaction 4 that could show us approximately when this transaction occurred – or there might be date and time information in transactions 3 and 5 that would allow us to ‘bracket’ the deletion between two other events.

Imagine that this was a real-world database such as internet history or a messaging database on a mobile phone (we never normally shut down applications on phones) where a user had repeatedly deleted browsing history or messages. There could be many WAL pages containing data that precede the most current page, so potentially lots of pages containing data that cannot be recovered using the non-forensic technique that is being proposed.

There is another article on my web site that deals with this in more detail:
https://sandersonforensics.com/forum…eted-in-SQLite

Summary

Using non-forensic tools in the manner described above will show you either:

  • the live records from the current database (when the WAL is present), or
  • the state of the database before the first transaction in the WAL – i.e. just the records in the DB without any of the changes in any of the transactions in the WAL

Using an appropriate Forensic SQLite Browser such as the Forensic Browser for SQLite you can see all of the records from the DB along with all of the changes that have occurred (often many copies of the same page comprising multiple transactions).

Why can’t I see who sent that deleted iOS SMS message?

I have seen a number of posts on bulletin boards recently that refer to some of the mainstream software failing to attribute a contact to a deleted message in iOS SMS.db recoveries. My previous post “SMS recovered records and contacts – three ways” shows another method of establishing a relationship between two sets of deleted records. However, in this post I wanted to show how you can use the database schema, in particular foreign key constraints and triggers, to understand why deleting one item can have a knock-on effect, and how the schema can be used to explain why these records are no longer available.

My previous article is available here: https://sandersonforensics.com/forum…cts-three-ways

In summary – understanding triggers and foreign key definitions in a database schema can give a much better understanding of how tables relate to each other and give an overview of how the database “works”.

The following covers triggers and foreign key constraints in a little detail, but full coverage is well beyond the scope of this article. Links to further detail for those interested are provided within the text. Of course, if you are a Forensic Toolkit for SQLite user and need any help with this, or any other aspect of SQLite forensics, then please get in touch – it’s all part of the service.

Triggers

Triggers are database operations that are automatically performed when a specified database event occurs. It is important to understand that a trigger is a function of SQLite and the database it is defined on; once defined, whether it runs or not is not within the control of the programmer but is controlled by the definition in the database – i.e. the actions are carried out automatically by the SQLite library.

An example is the sms.db from an iOS device, which maintains a table of deleted messages. By examining the code for the trigger we can see that after a record is deleted from the message table, the guid of the deleted record is automatically added to the deleted_messages table:

A trigger can be created that will run before, after or instead of a delete, insert, or update event.

Code:

CREATE TRIGGER add_to_deleted_messages AFTER DELETE ON message
BEGIN
   INSERT INTO deleted_messages (guid) VALUES (OLD.guid);
END
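The effect of this trigger can be reproduced with a few lines of Python (a sketch using the stdlib sqlite3 module and a minimal, made-up message table):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# minimal stand-ins for the real sms.db tables
con.execute("CREATE TABLE message (guid TEXT, text TEXT)")
con.execute("CREATE TABLE deleted_messages (guid TEXT)")

# the trigger from the sms.db schema, verbatim
con.execute("""
CREATE TRIGGER add_to_deleted_messages AFTER DELETE ON message
BEGIN
   INSERT INTO deleted_messages (guid) VALUES (OLD.guid);
END""")

con.execute("INSERT INTO message VALUES ('ABC-123', 'hello')")
con.execute("DELETE FROM message WHERE guid = 'ABC-123'")

# the guid was recorded automatically, with no action by the application
rows = con.execute("SELECT guid FROM deleted_messages").fetchall()
print(rows)  # [('ABC-123',)]
```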

The SQLite railroad diagram for a trigger is below:

Simplified we:

  • Create a trigger and give it a name
  • Decide when it executes (before/after/instead of) an event
  • Decide what the event is (delete/insert/update)
  • Define what the trigger does

Rather than create some triggers from scratch, it might be useful to show some real-world triggers and explain what they do.

The Viber database has the following trigger:

Code:

CREATE TRIGGER Delete_Contact AFTER DELETE ON Contact
BEGIN
   DELETE FROM ContactRelation WHERE ContactID = old.ContactID;
END

This trigger deletes any row in the ContactRelation table that refers to a contact when that contact is deleted from the Contact table.

Another real-world example is the insert_mms_pending_on_update trigger found in an Android MMS database. This trigger:

  • is called insert_mms_pending_on_update
  • executes after
  • a record in the pdu table is updated, when the new value for msg_box is changed to 4 and the new value for m_type = 128
  • whereupon a new record is inserted into pending_msgs with values (1, new._id, new.m_type, 0, 0, 0, 0)

This trigger is also interesting because it gives a bit of a clue as to what some of the values in the pdu table mean. It would seem (without further investigation) that a record in the pdu table with m_type = 128 and msg_box = 4 could be a pending message of some sort.

Triggers are stored in the sqlite_master table and can be examined along with the tables on which they operate by running queries such as:

Code:

SELECT *
FROM sqlite_master WHERE type = 'trigger'

To show all triggers in the schema.

Code:

SELECT *
FROM sqlite_master WHERE type = 'trigger' AND tbl_name = '<insert table name>'

To show the triggers for a specified table.

There is a wealth of information regarding triggers on the SQLite web site here https://sqlite.org/lang_createtrigger.html

Foreign keys

SQL foreign key constraints are used to enforce “exists” relationships between tables. The SQLite database engine will not allow a change to records that will break these constraints.

An example might be a messaging database which holds a table of contacts and a second table with the messages. It does not make sense for there to be a record in the messages table with a contact_id field for which there is no matching record in the contacts table. This can be enforced by a foreign key constraint in the table definition. The REFERENCES clause ensures that for every message there must be a record in the contacts table linked to by the contact_id column:

Code:

CREATE TABLE contacts (id integer primary key, name text);
CREATE TABLE messages (id integer primary key, contact_id integer REFERENCES contacts(id), message text);

With the above table definitions if you try to insert a record into the messages table with a contact_id that does not exist in the contacts table then SQLite will not allow you to break referential integrity and will raise an exception.

But what happens if you try and edit a record that is already in the database, or you delete a record? SQLite provides for this by allowing you to specify ON DELETE and ON UPDATE actions. The actions are:

NO ACTION – Do nothing
RESTRICT – prevents the update or delete action taking place
SET NULL – sets the child key of all rows mapping to the parent to NULL
SET DEFAULT – as above but sets to the default value
CASCADE – propagates the action on the parent key to all child keys

For instance, the master.db file from a BlackBerry app contains an AppIcons table with a foreign key on the Apps table AppId.

Code:

CREATE TABLE "AppIcons"
(
   AppId          integer primary key not null,
   Data           blob not null,
   foreign key (AppId) REFERENCES Apps (AppId) ON DELETE CASCADE
)

Perhaps not surprisingly when an App is deleted from the parent Apps table all entries in the AppIcons table that match the AppId of the deleted application are also deleted automatically.
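A minimal sketch of this cascade, using cut-down versions of the two tables in Python's stdlib sqlite3 module (remember that foreign keys, and therefore cascades, only operate when PRAGMA foreign_keys = ON):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # cascades only run when FKs are enforced

con.execute("CREATE TABLE Apps (AppId INTEGER PRIMARY KEY, Name TEXT)")
con.execute("""
CREATE TABLE AppIcons
(
   AppId          integer primary key not null,
   Data           blob not null,
   foreign key (AppId) REFERENCES Apps (AppId) ON DELETE CASCADE
)""")

con.execute("INSERT INTO Apps VALUES (1, 'BBM')")
con.execute("INSERT INTO AppIcons VALUES (1, x'00')")

# deleting the parent row automatically deletes the matching icon row
con.execute("DELETE FROM Apps WHERE AppId = 1")
icons = con.execute("SELECT * FROM AppIcons").fetchall()
print(icons)  # []
```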

SQLite foreign keys are discussed in depth at this page:

https://www.sqlite.org/foreignkeys.html

You can probably see how this helps with our initial question. To summarise: we have recovered a number of deleted messages from an SMS message table, but we cannot find the related contacts with whom our user was communicating. We want to know why.

It may help further if we look at the structure of the SMS database and how the tables interrelate. The diagram below shows the different tables in the sms.db and shows the joins between the tables (all joins are left joins).

There are four main tables:

Chat: contains an entry (row) for each chat (i.e. thread or conversation) between the ‘owner’ of the DB being examined and any of their contacts. Each chat is uniquely identified by the primary key of the table, the ROWID column.
Message: contains the details of each of the messages.
Handle: contains the third-party details – i.e. this is the contacts or participants table.
Attachment: contains a list of file attachment details.

So, in summary, the database contains a series of conversations (chat table) each chat will have one or more messages (message table) associated with it, each chat can have one or more participants (handle table) and each message can have zero or more attachments (attachment table).

Each major table is joined to other tables via an intermediate join table (e.g. chat -> chat_handle_join -> handle) this allows one chat to have multiple participants without having to duplicate entries in the chat table or the handle table.

For the sake of brevity for the rest of this article, I will ignore the attachment and message_attachment_join tables.

To explain this further, let’s take a look at a real chat from my phone – chat number 109.
In the display below I have provided the SQL behind three queries that show the relevant data for chat 109.

It can be seen that chat 109 is between me and one friend (handle_id 109) so there is one entry in the chat_handle_join table for chat 109 that joins to the relevant contact (handle 108)

There are 42 messages in this conversation, so there are 42 entries in the chat_message_join table for chat 109 (only 7 shown for brevity) each of which points to one of the 42 messages in the message table.

Note that there is also a handle_id column in the message table that links directly to the handle table (the dotted line in the picture above). More on this later.

So what happens if we delete an entire conversation by deleting entry 109 in the chat table? To find out we need to look at the schema for the database.

It would seem sensible to start with the schema for the chat table and any triggers created on this table. The following shows the schema for the table. There are no triggers, and we can see straight away that there are no foreign key constraints defined on this table.

So we move on to the chat_message_join table. First the table schema and its foreign keys:

CREATE TABLE chat_message_join (
   chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE,
   message_id INTEGER REFERENCES message (ROWID) ON DELETE CASCADE,
   PRIMARY KEY (chat_id, message_id)
)

We can see from the above table schema that there are two foreign key constraints. The first is:

chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE,

When a record is deleted from the chat table this instructs the SQLite engine to automatically delete any entry in the chat_message_join table where the chat_id matches the ROWID of the deleted entry from the chat table.

message_id INTEGER REFERENCES message (ROWID) ON DELETE CASCADE,

The second foreign key constraint ensures that when a message is deleted from the message table that any entry in the chat_message_join table that has a message_id that matches the ROWID of the deleted message will also be deleted.

So far the constraints ensure that matching entries from the chat_message_join table are deleted when a row is deleted from either the chat or the message tables.
What we can’t see yet is how a message is deleted when its parent ‘chat’ record is deleted. For this we need to look at the trigger at the bottom of the screenshot above:

Code:

CREATE TRIGGER clean_orphaned_messages AFTER DELETE ON chat_message_join
BEGIN    
   DELETE FROM message WHERE
   (SELECT 1 FROM chat_message_join WHERE message_id = message.rowid   LIMIT 1) IS NULL;
END
CREATE TRIGGER clean_orphaned_messages AFTER DELETE ON chat_message_join

We can see from the first line that this trigger fires when a row is deleted from the chat_message_join table.

DELETE FROM message WHERE

Line three instructs the trigger to delete rows from the message table when the condition on the following line evaluates to true.

(SELECT 1 FROM chat_message_join WHERE message_id = message.rowid LIMIT 1) IS NULL;

Line four deletes any row in the message table that does not have a matching entry in the chat_message_join table.

So the foreign key constraint ensures that a delete on the chat table causes any matching records in chat_message_join to be deleted. The trigger fires when any row in the chat_message_join table is deleted and ensures that any rows in the message table that do not have a parent record in the chat_message_join table are also deleted.

A matching set of constraints exists in the chat_handle_join table, ensuring that when a chat or a handle is deleted, any associated rows in the chat_handle_join table are also deleted.

Code:

CREATE TABLE chat_handle_join (
chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE,
handle_id INTEGER REFERENCES handle (ROWID) ON DELETE CASCADE,
UNIQUE(chat_id, handle_id))

The trigger is slightly different

Code:

CREATE TRIGGER clean_orphaned_handles AFTER DELETE ON chat_handle_join
BEGIN    
   DELETE FROM handle WHERE
   handle.ROWID = old.handle_id    
   AND        
   (SELECT 1 from chat_handle_join WHERE handle_id = old.handle_id LIMIT 1) IS NULL    
   AND        
   (SELECT 1 from message WHERE handle_id = old.handle_id LIMIT 1) IS NULL;
END

After a delete in the chat_handle_join table, any handles with the same ROWID as the deleted chat_handle_join handle_id will be deleted, provided that there are no other rows in the chat_handle_join table that have the same handle_id and there are no rows left in the message table that also have a handle_id that matches the deleted handle_ID.

There is one further trigger, on the message table, that is relevant to our question:

Code:

CREATE TRIGGER after_delete_on_message AFTER DELETE ON message
BEGIN    
   DELETE FROM handle         
   WHERE
      handle.ROWID = OLD.handle_id    
      AND        
      (SELECT 1 FROM chat_handle_join WHERE handle_id = OLD.handle_id LIMIT 1) IS NULL    
      AND        
      (SELECT 1 FROM message WHERE handle_id = OLD.handle_id LIMIT 1) IS NULL    
      AND
      (SELECT 1 FROM message WHERE other_handle = OLD.handle_id LIMIT 1) IS NULL;
END

The trigger executes after a record is deleted from the message table and will delete any row in the handle table where the ROWID matches the handle_id of the deleted record, provided that the row in the handle table does not match a record in the chat_handle_join table and there is no additional record in the message table where either the handle_id or other_handle columns use the handle_id of the deleted record.

In answer to our question, “So what happens if we delete an entire conversation by deleting entry 109 in the chat table?” (diagram reproduced below):

  • The ROWID in the chat table is the primary key for this table and as such there can only be one entry with value 109
  • Deleting this row causes a constraint violation with both the chat_message_join table and the chat_handle_join table due to the foreign key definitions
  • The foreign key definitions have an ON DELETE CASCADE clause so all rows in the chat_message_join table and the chat_handle_join tables with a chat_id of 109 will also be deleted
  • The clean_orphaned_messages trigger that fires after a delete on the chat_message_join table will ensure that any of the messages from the message table that do not have a matching entry in the chat_message_join table will be deleted, i.e. all 42 messages associated with chat 109
  • The clean_orphaned_handles trigger that fires after a delete on the chat_handle_join will ensure that as long as the handle_id associated with chat 109 is not in use by any other chats then this handle will be deleted
  • The after_delete_on_message trigger that fires after a delete on the message table has a similar action to the clean_orphaned_handles but also checks two other fields in the message table before deleting the associated handle*

*Note that this last trigger will also ensure that if messages are deleted individually (rather than en masse by deleting, as above, starting with the chat table) then when all messages that relate to a handle are deleted the corresponding handle will also be deleted.

So there you have it – deleting all of the messages relating to a conversation one by one, or deleting the entire conversation will ensure that the contact information is also deleted, provided the same contact is not part of any other conversation.
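The whole chain can be exercised in miniature. This is a sketch only: the tables are simplified, only the clean_orphaned_messages trigger is included, and note that SQLite's foreign key enforcement must be switched on per connection:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # FK enforcement is off by default
con.executescript("""
CREATE TABLE chat (ROWID INTEGER PRIMARY KEY, chat_identifier TEXT);
CREATE TABLE message (ROWID INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE chat_message_join (
  chat_id INTEGER REFERENCES chat (ROWID) ON DELETE CASCADE,
  message_id INTEGER REFERENCES message (ROWID) ON DELETE CASCADE);

CREATE TRIGGER clean_orphaned_messages AFTER DELETE ON chat_message_join
BEGIN
  DELETE FROM message WHERE
  message.ROWID = old.message_id
  AND (SELECT 1 FROM chat_message_join WHERE message_id = old.message_id LIMIT 1) IS NULL;
END;

INSERT INTO chat VALUES (109, '+441234567890');
INSERT INTO message VALUES (1, 'first'), (2, 'second');
INSERT INTO chat_message_join VALUES (109, 1), (109, 2);
""")

# One DELETE on the chat row cascades into the join table, and the trigger
# then removes the now-orphaned messages.
con.execute("DELETE FROM chat WHERE ROWID = 109")
msgs = con.execute("SELECT COUNT(*) FROM message").fetchone()[0]
joins = con.execute("SELECT COUNT(*) FROM chat_message_join").fetchone()[0]
print(msgs, joins)   # 0 0
```

A single delete on the chat table empties both the join table and the message table, exactly as described above.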

SMS recovered records and contacts – 3 ways

In a recent forensic case involving deleted SMS messages recovered from an sms.db file on an iOS mobile device, none of the mainstream mobile phone forensic software made the link between sender and recipient for the recovered records of interest.

I have been asked a few times recently about identifying the third party to a deleted iOS SMS message that has been recovered by the Forensic Browser. The procedure is simply to create a JOIN between two (or three – there are two ways of establishing the third party) tables, and if the data is in the relevant tables then a link is made. Unfortunately, when messages are deleted, particularly when entire conversations are deleted, the primary keys in the all-important tables are often overwritten, making these joins impossible.

When the case mentioned was examined with the Browser, neither of the JOINs mentioned above allowed the investigators to ascertain who the third party in a deleted conversation was. So I was asked to look at the DB further, and I managed to find a third, rather obscure, route to make the link. This was made possible by the new Structured Storage Manager in the Forensic Browser for SQLite.

In this article, I’ll describe the structure of the sms.db database and how the different tables relate to each other. I’ll then explain the two ‘normal’ joins and why they usually don’t work when trying to ID the third party for a deleted message and I’ll show the third route mentioned above which can often work where the first two don’t.

First I’ll provide an overview of the relationships between the tables. I am not going to discuss all elements and constraints of the different tables as this goes beyond the scope of this article, but I’ll cover in overview an example of a user deleting a conversation to explain what happens and how data can be recovered.

Firstly the relationships between the different tables:

There are four main tables:

  • Chat – contains an entry (row) for each chat (or conversation) between the ‘owner’ of the DB being examined and any of their contacts. Each chat is uniquely identified by the primary key of the table, the ROWID column. Both the chat_identifier and the guid columns contain the phone number or, occasionally, the name of the third party to the conversation.
  • Message – contains the details of each of the messages.
  • Handle – contains the third party details in the id and uncanonicalized_id fields.
  • Attachment – contains a list of file attachment details.

So, in summary, the database contains a series of conversations (chat table); each chat will have one or more messages (message table) associated with it, each chat can have one or more participants (handle table), and each message can have zero or more attachments (attachment table).

Each major table is joined to the others via an intermediate join table (e.g. chat_handle_join); this allows one chat to have multiple participants without duplicating entries in the chat table or the handle table.

Let’s take a look at a real, but anonymized, chat from my phone – chat number 109.
In the display below I have provided the SQL behind the queries that show the relevant data for chat 109; it can be seen that chat 109 is between me and one friend.

You can also see (the dotted line) that there is a further connection between the message and handle table (message.handle_id -> handle.ROWID) that is not explicitly defined by the table schemas.

What happens when a conversation is deleted?

Intuitively if you deleted a conversation from a phone you would expect all of the associated messages to be deleted and if you deleted all of the associated messages in a conversation then you would expect the associated entry in the conversation (chat) table to be deleted.

The same can be said for the handle table. The default setup for the sms.db is for the handle entry to be deleted when all message entries that point to it are deleted (more on this in another article).

So, in summary, when a conversation is deleted in its entirety the relevant entries in the message table are deleted, and if there are no further messages involving this contact the relevant row in the handle table is deleted.

So what should be self-evident from the diagram above is that if we recover a deleted text in the message table then the associated entry in the handle table must either exist or be recovered for us to establish who the third party in the conversation is, or we need to follow the links back to the chat table (each of which must also exist or be recovered).

Identifying the third party the ‘standard way’

If there is a single additional participant in the conversation then the handle can be obtained from the handle_id in the messages table and a simple join used to show the other party:
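That join might be sketched as follows. This is a hedged reconstruction run with Python's sqlite3 module against invented rows (the cut-down tables and phone number are assumptions, not data from the case):

```python
import sqlite3

# The 'standard' join: message.handle_id -> handle.ROWID.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE handle (ROWID INTEGER PRIMARY KEY, id TEXT);
CREATE TABLE message (ROWID INTEGER PRIMARY KEY, handle_id INTEGER, text TEXT);
INSERT INTO handle VALUES (108, '+441234567890');
INSERT INTO message VALUES (1, 108, 'No time for pleasantries...');
""")

rows = con.execute("""
    SELECT message.ROWID, handle.id, message.text
    FROM message
    JOIN handle ON handle.ROWID = message.handle_id
""").fetchall()
print(rows)   # each message paired with the other party's identifier
```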

If the entry in the handle table is not present or cannot be recovered then providing that the relevant records in BOTH the chat_message_join and chat table are recovered then a slightly more complex join can be made to retrieve the same information.

If there are multiple participants, then the chain of tables must be followed back around, i.e. message -> chat_message_join -> chat -> chat_handle_join -> handle, in order to recover all participants. I’ll leave this query to you to work on – it’s straightforward and just the same as the previous query, just involving more tables.

There is however a problem when dealing with deleted records and this relates to an SQLite artifact called freeblocks.

Freeblocks are structures within an SQLite database page that track all blocks of unused space greater than 3 bytes in size. They do this by writing a 4-byte structure to the start of each block of free space that records the size of the current block and a pointer to the next block. When a record is deleted from an SQLite table this usually results in a new block of free space comprising the deleted record, the first four bytes of which are overwritten by this 4-byte structure. The screenshot below shows a page with two freeblocks (the unused space before the first record is not part of the linked list of freeblocks); the freeblock structure is underlined (a freeblock pointer of 0x0000 signifies the last freeblock):

A major complication has now arisen that will affect our success rate when trying to recover records.

The first two entries in an SQLite record (row) are the Payload length followed by the ROWID.

In most cases these two variable length integers take up less than 4 bytes of storage, so when a record is deleted they are usually overwritten by the freeblock structure. As the ROWID usually holds the primary key, the loss of this value means that recovered records in different tables often cannot be related to each other.

Luckily, when two adjacent records are deleted at the same time, a single freeblock is created that encompasses both records, and only the start of the first record is overwritten. Unfortunately, this does not help in our scenario, because there would be only one chat record per conversation and normally only one other participant in the handle table for a conversation; and even if there were more than one other participant, it is not likely that they would have consecutive handle_ids.
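The freeblock chain described above can be walked with a few lines of code. This sketch fabricates a page rather than parsing a real database; in a real page the offset of the first freeblock comes from the page header:

```python
import struct

# Each freeblock starts with two big-endian 16-bit values: the offset of
# the next freeblock (0 = end of chain) and the size of this block.
page = bytearray(1024)
# Fabricate two freeblocks: one at offset 100 (40 bytes, next at 300)
# and one at offset 300 (25 bytes, last in the chain).
page[100:104] = struct.pack(">HH", 300, 40)
page[300:304] = struct.pack(">HH", 0, 25)

def walk_freeblocks(page: bytes, first: int):
    """Yield (offset, size) for each freeblock in the chain."""
    offset = first
    while offset:
        nxt, size = struct.unpack_from(">HH", page, offset)
        yield offset, size
        offset = nxt

print(list(walk_freeblocks(page, 100)))   # [(100, 40), (300, 25)]
```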

So what is the solution?

Happily, conversations usually involve more than one message, normally lots, and they are very immediate in that a message often gets an immediate reply (message and reply are usually adjacent in the database). So we can hopefully recover lots of messages, some of them with intact (not overwritten) ROWIDs.

All we need to do is to link one of the recovered messages back to either the chat table or the handle table to establish who the third party participant was.

But before I show you how to do this I need to digress and talk about the Structured Storage Manager mentioned in the opening lines of this article. The Structured Storage Manager (SSM) can be used to break down structured storage (think binary plists, Facebook orca blobs etc.) into a table structure so that the Forensic Browser can query them and happily there is one such blob in the database that can help us.

The chat table contains a column called “properties” that often holds a binary plist.

Which, when decoded using the Forensic Browser's built-in Binary Plist viewer, looks like this:

The interesting item here is the CKChatWatermarkMessageID integer, which my testing shows is the ID of one of the messages associated with this chat. In fact what seems to happen is that this blob is updated regularly and the messageID (and following timestamp) represents the last message in a conversation.

We can use the SSM to break down the binary plist into a table. To do this we run the SSM from the Tools menu:

In the following dialog, we identify the table and fieldname of the field holding the Binary Plist blob (properties) and we also identify the primary key for the table so we can relate the decoded data back to the original row. We also tell the SSM what table we want the new data stored in and what type of data it is (Binary Plist in this case):

Once we hit OK the new table is created and we can easily create a query that associates a chat identifier (phone number) with each message (CKChatWatermarkMessageID) from the decoded BPList:

The query above is a little complicated particularly when we need to use it in conjunction with another query so I want to simplify the above SQL by using a VIEW. This is best explained by example – I choose “Create view on current visual query” from the Queries menu:

I give this the name watermarks (it could, of course, be any meaningful name):

The newly created VIEW is a sort of virtual table so I can now run a new very much simpler query that exactly replaces the above:

The list of CKChatWatermarkMessageID and telephone numbers/contacts details above are from all recovered conversations, as well as from some conversations that are still live.

Finally, I want to find any one (or more) of the rows from the above query that has a CKChatWatermarkMessageID that is *IN* a list of any of the ROWIDs (message IDs) for any message with handle_id = 108 i.e. our deleted conversation.

A query showing these ROWIDs and a few extra fields for context is below:

Note that the list of messages that we have recovered from chat 108 is not complete, and some of the recovered messages may have had their ROWIDs overwritten as previously described. But all we need is to be lucky enough to find one matching recovered message to determine who the conversation was with.

The highlighted *IN* above gives a clue as to the SQL syntax. We can use the SQLite IN expression to filter on values that are in a specified list; the format is:

Code:

SELECT * FROM xxx WHERE expression IN (value1, value2, ... value_n)

value1, value2… can be replaced by an SQL query that returns just one column (a subquery):
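The subquery approach can be sketched as follows; the data is invented, and the watermarks VIEW from the article is stood in for here by an ordinary table of the same shape:

```python
import sqlite3

# Find the row from 'watermarks' whose CKChatWatermarkMessageID is IN the
# set of recovered message ROWIDs for handle 108.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE watermarks (CKChatWatermarkMessageID INTEGER, chat_identifier TEXT);
CREATE TABLE message (ROWID INTEGER PRIMARY KEY, handle_id INTEGER);
INSERT INTO watermarks VALUES (17, '+441234567890'), (99, '+449999999999');
INSERT INTO message VALUES (15, 108), (16, 108), (17, 108);
""")

hit = con.execute("""
    SELECT chat_identifier FROM watermarks
    WHERE CKChatWatermarkMessageID IN
          (SELECT ROWID FROM message WHERE handle_id = 108)
""").fetchone()
print(hit[0])   # the third party for the deleted conversation
```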

There you have it – we have managed to associate a message ID from the deleted conversation with the decoded binary plist from a recovered chat record with an overwritten primary key, and thus establish the third party for all of the messages in chat 108. We have also managed to do what the mainstream mobile forensics tools failed to do, and we therefore reiterate that although they quite rightly have their place, you should always back up anything they find, or fail to find, with another method.

Forensic Browser for SQLite – Structured Storage Manager

Often data held within tables in databases is stored within a BLOB (Binary Large OBject); this is often structured data encoded in a particular format. XML and Binary Plists are examples of these structured storage objects. Often the data in each blob in a table is in the same format, and it would be useful to query these objects and include selected data in a report.

The Structured Storage Manager does this by using a template to break down the items in each BLOB object and converts the data to a table held within the case file.

The following screenshot shows the msg_blob records from the messages table in a Facebook orca2.db file. The blobs are shown in their raw (hex) form and are clearly in a binary (non-text) format, and thus it is not possible to query these objects using normal SQL commands:

We can decode the data as follows:

First, create a case file and then open the Facebook orca2.db (the decoded data from the orca blobs will be written to a new table in the case file).

Then invoke the structured storage manager from the Tools menu:

In the following dialog we need to provide some data:

Source table (main.messages) is the database.tablename that contains the blob column

ID field (msg_id) is the primary key of this table – we need something unique so that a query can be made tying the extracted data back to its source

Structured Storage field (msg_blob) is the field/column that contains the blob data

Destination table name (StructuredStorage_messages) is the name of a new table that will be created in the case file to hold the extracted data

Structured storage type (Facebook orca blob) is the encoding type used to store the structured data, selected from the list of currently supported types

Once all the above has been selected we are ready to decide which items from the decoded blob we want to select to copy to the extracted data table. The simplest solution here is to select “Add all elements” from the pop-up menu:

The Browser will then parse a structured storage blob, decode each of the data types, create a tree structure that represents the underlying data, and create an associated table with a new column for each element.

The following screenshot shows the decoded orca blob structure:

You can select a subset of the above but as all of the data is added to individual columns in a new table it is easier to use the SQL features of the Browser to select your chosen columns. 

The screenshot below shows a JOIN created on the two tables, and just the columns I require (msg_id, date, userID, message text and senderID) are selected for my custom report:

Using the Forensic Browser for SQLite to display maps based on data from exiftool

I recently saw a Twitter conversation where a user wanted to see the EXIF data from some image files displayed as maps and showing a clickable URL for Google Maps. The latter part of this problem can easily be solved with the Browser – the steps are as follows:

  1. Run exiftool and export the relevant results as a csv
  2. Import the csv into an SQLite database
  3. Use the Browser to create a query displaying the lat and long as two fields
  4. Create a VIEW to represent this query
  5. Use the Browser to display a map for each row showing the location defined by the lat and longs
  6. Use the Browser to combine the lat and longs into a clickable URL

This example assumes that you want to display the locations of all the files in the path “E:\\My Pictures”

1. Run exiftool and export the relevant results as a csv

Run the following command line in exiftool

Code:

exiftool -n -gpsposition -csv "e:\\my Pictures" > "e:\\geo.csv"

The command instructs exiftool to parse all of the files in the specified folder and pipe the output in csv format to the specified file.

-gpsposition specifies that just the GPS tags from the EXIF data will be exported
-n tells exiftool to save GPS data in numerical (floating point) form

A few of the lines from the exported geo.csv file are shown below:

Code:

e:/my pictures/hugh.jpg,
e:/my pictures/image.png,
e:/my pictures/image1.JPG,50.0867083333333 -5.31498611111111
e:/my pictures/IMG_1568.JPG,50.0888333333333 -5.10166666666667
e:/my pictures/IMG_1697.MOV,50.1567 -5.0683

We can see that for those files that have GPS information it is displayed as a lat and long. The keen-eyed among you will have noted that the lat and long are actually in a single column, i.e. there is no comma separating the two; this can be resolved later with the Forensic Browser.

2. Import the csv into an SQLite database

Using the sqlite command line tool (or another tool of your choice) create a new database:

Code:

sqlite3 geo.db

Now within the command line tool create a table with two columns for the new data

Code:

CREATE TABLE files (filename TEXT, latlon TEXT);

Set SQLite to work in csv mode

Code:

.mode csv

Import the csv file created with exiftool, specifying the destination table:

Code:

.import geo.csv files

3. Use the Browser to create a query and then a VIEW displaying the lat and long as two fields

A query showing the data from the files table looks as follows:

What we need is a query that splits the lat and long from the latlon column into two separate entities, i.e. two new columns. SQLite provides an inbuilt function, SubStr, to extract a portion of a field, and a second function, InStr, to find the offset of a particular element of a string.

Notice that in the latlon field above the two elements are separated by a space; the following query extracts the characters from the latlon field starting at character 1 and stopping one character before the space.

SubStr(files.latlon, 1, InStr(files.latlon, ' ') - 1)

This can be combined with a similar query that extracts the part of the latlon string after the space. The combined query looks as follows:

You can see the original latlon column plus two new columns (which we have called lat and lon using the AS qualifier in the query above).
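The split can be checked outside the Browser too; this sketch runs the same SubStr/InStr expressions with Python's sqlite3 against a single sample row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE files (filename TEXT, latlon TEXT);
INSERT INTO files VALUES ('IMG_1568.JPG', '50.0888333333333 -5.10166666666667');
""")

# Split the single latlon column into lat and lon at the space character.
lat, lon = con.execute("""
    SELECT SubStr(latlon, 1, InStr(latlon, ' ') - 1) AS lat,
           SubStr(latlon, InStr(latlon, ' ') + 1)    AS lon
    FROM files
""").fetchone()
print(lat, lon)   # 50.0888333333333 -5.10166666666667
```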

4. Create a VIEW to represent this query

A VIEW is a sort of virtual table and the VIEW can then be used in place of the query itself. The SQLite command we would use is:

Code:

CREATE VIEW geo AS SELECT files.filename,
files.latlon,
SubStr(files.latlon, 1, InStr(files.latlon, ' ') - 1) AS lat,
SubStr(files.latlon, InStr(files.latlon, ' ') + 1) AS lon
FROM files

However, the Browser has a menu option that allows you to simply create a VIEW on the current visual query.
Once a view has been created, the VIEW name can be used instead of the previous query, e.g.:

5. Use the Browser to display a map for each row showing the location defined by the lat and longs

The Browser has a built-in function that creates geolocated maps based on lat and long fields:

You are just prompted for the table, an ID field and the lat and long columns:

A new table is created and populated with maps for each row in the “source” table. Once the maps have been created for you a simple visual query is automatically built joining the two tables allowing you to customize your query:

6. Use the Browser to combine the lat and longs into a clickable URL

The final step is to create a URL column. This simply uses some hard-coded string values concatenated together with data from the lat and lon columns we created above.

The format for a google maps URL at zoom level 9 is as follows:

http://maps.google.com/?ll=<lat>,<lon>,z9

All we need to do is replace the <lat> and <lon> elements with values from our table

The SQL for this row is below:

'http://maps.google.com/?ll=' || CaseDB.Geodata1.lat || ',' || CaseDB.Geodata1.lon || ',z9' AS url

Hard-coded strings are enclosed in single quotes, and the SQLite concatenation operator || is used to join successive strings and field values together; we call the column url.
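The concatenation can be sketched in isolation; here the lat and lon values are supplied as literals rather than coming from the Geodata table:

```python
import sqlite3

# SQLite's || operator builds the clickable Google Maps URL.
con = sqlite3.connect(":memory:")
url = con.execute("""
    SELECT 'http://maps.google.com/?ll=' || '50.0888' || ',' || '-5.1017' || ',z9' AS url
""").fetchone()[0]
print(url)   # http://maps.google.com/?ll=50.0888,-5.1017,z9
```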

The final report is shown below

Q. When is secure delete not secure?

A. When a journal is in use (potentially).

The raison d'être for a journal, be it a traditional rollback journal or the newer SQLite Write Ahead Log (WAL), is to maintain database integrity. Simply put, if an operation fails for whatever reason then the changes to the database are unwound to put the DB back into its last known good state. It might seem obvious, then, that a copy of securely deleted data would need to be kept in order to facilitate this functionality. This securely deleted data can, and sometimes does, exist for quite some time.

We also need to understand that SQLite groups actions together in transactions; a transaction can be one database update (write, modify, delete) or many thousands of such actions, as determined by the user. Think of a simple messaging application whereby a message is received asking to be “friends” – our hypothetical app needs to write the friend request to the messages table and add a user to the users table. It would make no sense to update the messages table with a message that referred to user x when user x's details had not yet been added to the users table, so both these actions could be wrapped in a transaction so that either both tables are updated or neither is.
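The friend-request example can be sketched with a transaction in Python's sqlite3 (the schema is invented for the demo); if anything fails mid-transaction, both inserts are rolled back together:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE users (userID INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE messages (id INTEGER PRIMARY KEY, userID INTEGER, message TEXT);
""")

try:
    with con:   # one transaction: commits or rolls back as a unit
        con.execute("INSERT INTO users VALUES (3, 'Darcy')")
        con.execute("INSERT INTO messages VALUES (9, 3, 'be my friend')")
        raise RuntimeError("simulated crash mid-transaction")
except RuntimeError:
    pass

# Both inserts were rolled back together.
users_left = con.execute("SELECT COUNT(*) FROM users").fetchone()[0]
msgs_left = con.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(users_left, msgs_left)   # 0 0
```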

So how does this impact secure delete? 

First we need to understand what secure delete does. According to the SQLite website, the pragma that enables secure delete works as follows: “When secure-delete is on, SQLite overwrites deleted content with zeros.”

Well, that gives me a warm fuzzy feeling: when I delete it, it's gone! But hang on – if SQLite immediately overwrites something I delete and something goes wrong before the transaction completes, how can SQLite rewind the DB to its last good state?

The simple answer is it can’t. In order to maintain database integrity SQLite MUST maintain a copy of the deleted data somewhere until it ‘knows’ the last transaction has completed correctly; that somewhere is the journal.

Rollback Journals

Let’s look at an example with a rollback journal. For the purposes of this demo I do the following:

  • create a couple of tables (messages and users)
  • add some users
  • add 9 numbered records to the messages table
  • delete records 1-3
  • add a further 9 records
  • delete records 11 and 12

Also, unless you specifically tell SQLite otherwise, it treats each individual database update as a transaction, so I group the updates into sets of four so that each set forms a single transaction. The simplified instructions I use are as follows:

Code:

begin transaction;
insert into users values (1, 'System');
insert into users values (2, 'Helen');
insert into messages values (1, strftime('%s','2016-03-01 09:12:45'), 0, 1, 1, 'Hello welcome to Sanderson Forensics test messaging database');
insert into messages values (2, strftime('%s','2016-03-02 15:08:14'), 1, 1, 2, 'Hi Honey just got this new messaging app - it''s awful :)');
end transaction;
begin transaction;
insert into messages values (3, strftime('%s','2016-03-02 15:10:12'), 0, 1, 2, 'Sounds great - another app, I''m soo excited');
insert into messages values (4, strftime('%s','2016-03-02 15:12:45'), 1, 1, 2, 'haha, thought you''d like it');
insert into messages values (5, strftime('%s','2016-03-02 15:14:42'), 1, 1, 2, 'anyway i''ll be home in a couple of hours hour xx');
insert into messages values (6, strftime('%s','2016-03-03 17:12:45'), 1, 1, 2, 'just left the office....');
end transaction;
begin transaction;
insert into messages values (7, strftime('%s','2016-03-04 09:02:01'), 1, 1, 2, 'just got to work');
insert into messages values (8, strftime('%s','2016-03-04 09:10:12'), 0, 1, 2, 'cool have a nice day x');
insert into messages values (9, strftime('%s','2016-03-04 09:22:02'), 0, 2, 1, 'Paul, be my friend, Darcy');
insert into users values (3, 'Darcy');
end transaction;
begin transaction;
delete from messages where id < 4;
insert into messages values (10, strftime('%s','2016-03-04 09:25:43'), 1, 1, 3, 'Hiya mate - didn''t know you were on this app');
insert into messages values (11, strftime('%s','2016-03-04 09:27:43'), 0, 1, 3, 'No time for pleasantries, Ive transferred the money from the company account');
insert into messages values (12, strftime('%s','2016-03-04 09:29:22'), 0, 1, 3, 'This is really scary - I dont do illegal - no more after this');
insert into messages values (13, strftime('%s','2016-03-04 09:29:22'), 1, 1, 3, 'OK good - thats it we are quits now');
end transaction;
begin transaction;
insert into messages values (14, strftime('%s','2016-03-04 10:03:21'), 0, 1, 2, 'Hi honey - are you working late? what time will you be home?');
insert into messages values (15, strftime('%s','2016-03-04 10:05:10'), 1, 1, 2, 'about 8pm hopefully');
insert into messages values (16, strftime('%s','2016-03-04 13:13:40'), 0, 1, 3, 'Oh - I forgot to say, delete any trace of this conversation');
insert into messages values (17, strftime('%s','2016-03-04 14:08:21'), 0, 1, 2, 'Hi honey - did you get my message form earlier?');
end transaction;
begin transaction;
insert into messages values (18, strftime('%s','2016-03-04 17:05:08'), 1, 1, 2, 'been in a meeting, leaving shortly, quick pint with the boys and then ill be on the train');
delete from messages where id = 11 or id = 12;
end transaction;

Before we look at the recovered data with the Forensic Browser for SQLite a quick summary of how a rollback journal works.

Essentially, before a change is made to a page in the database the page is copied to the rollback journal. The database is then updated, and only when this has succeeded is the rollback journal invalidated by wiping the journal header (in fact the default mode is to delete the rollback journal file in its entirety - more on this later). If something goes wrong, e.g. the application/device crashes, then when it is restarted the valid rollback journal is still there, so SQLite will know there has been a problem and will transfer the cached/saved pages back into the database to restore it to its last known good state.

When opening a database with the Forensic Browser for SQLite, if there is an associated journal or WAL the user is prompted as to whether they want to process this as well. All records from both the database and the journal/WAL are then added to the parsed data and marked as to their source:

The following screenshot shows an SQLite database and its associated journal loaded into the Forensic Browser for SQLite. You can see that the records that are in the DB are marked as live and that records 1-3 and 11 & 12 are missing as we expect.

The journal however still has a copy of the deleted records 11 & 12, from the last transaction. Records 1-3 have been overwritten in the journal by the subsequent transactions since they were deleted.

So while the journal exists, secure delete is not as effective as you may think. To be clear, this is an intended consequence of the operation of SQLite and should not be considered a bug.

It is also useful to consider when we may find helpful data in a journal. Rollback journals held on disk operate in one of three modes: delete, truncate and persist. These three modes work as follows.

When a journal is in delete mode (the default) and a transaction is completed successfully the journal file is deleted. In this instance, the journal could potentially be recovered using normal forensic techniques and any securely deleted data from the last transaction recovered.

When a journal is in truncate mode and a transaction completes successfully, the journal file size is reset to 0 bytes, the file is truncated. As above the content of the truncated journal could potentially be recovered forensically and securely deleted records from the last transaction recovered.

When a journal is in persist mode (this is the mode I used for my demo) and a transaction completes successfully, the header for the journal file is overwritten but otherwise the journal stays on disk and of course, all securely deleted data from the last transaction will be easily accessible.
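The persist-mode behaviour can be demonstrated with a small sketch (invented file names and data): secure_delete zeroes the record in the database file, but the journal keeps the pre-transaction copy of the page.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)
con.execute("PRAGMA journal_mode = PERSIST")
con.execute("PRAGMA secure_delete = ON")
con.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT)")
con.execute("INSERT INTO messages VALUES (11, 'transferred-the-money')")
con.commit()
con.execute("DELETE FROM messages WHERE id = 11")   # 'securely' deleted
con.commit()
con.close()

db = open(path, "rb").read()
journal = open(path + "-journal", "rb").read()
print(b"transferred-the-money" in db)        # zeroed in the database file
print(b"transferred-the-money" in journal)   # pre-image survives in the journal
```

The deleted text is gone from the database file but is still sitting in the persist-mode journal.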

WAL Journals

What happens when we do the same operations with WAL files? This is much more fun.

First - how do WALs operate? The explanation from the SQLite website is that they turn traditional journaling on its head: rather than copying the page that is about to change to the journal, they leave the original page in the database and write the new page to the WAL.

The database reader when it needs to read a page looks for the latest copy in the WAL and only reads the main database if it can't find a copy in the WAL.

The database is always left in a good state as any errors (power cuts etc.) will only affect writes to the WAL.

There is another difference/complication. Rather than copying the changed pages from the WAL to the database after every transaction, multiple transactions are appended to the WAL one after another until a WAL 'checkpoint' occurs. By default this happens when a transaction takes the WAL above 1000 pages of data, when the application explicitly triggers a checkpoint, or when the application shuts down gracefully.

At a checkpoint all changed pages are written back to the main database file. However, after a successful checkpoint the WAL is not deleted or truncated; new pages are simply written from the start of the WAL file, leaving all older pages in a sort of WAL slack.

The WAL file is only deleted when the application is shut down gracefully and all of the changed pages in the WAL have been successfully written to the database.

After opening the database and associated WAL in the Forensic Browser for SQLite the summary of the data in the messages table is as follows. As you can see from the messages marked as true in the sfIsLive column the live records are missing records 1, 2, 3, 11 & 12 as we expect:

We can also see multiple copies of the securely deleted records 1-3 and a copy of the securely deleted records 11 & 12.

There are multiple copies of records 1-3, which were in database page 3, because each time page 3 was updated a new copy of it was written to the WAL. We can see that records 1-9 were all in DB page 3 at some time, and also that page 3 was updated in commits 1, 2 & 3.

Of interest, we can get an idea of what SQLite is doing in the background: message ID 4 has been in both DB page 3 and DB page 4, so at some point SQLite has moved these records within the DB.

We can also see that there are no pages of data actually in the database; this is because in our example a checkpoint has not yet occurred and all pages are waiting to be written to the main database file.

As stated writes to WAL files start at the beginning of a file after a checkpoint takes place. WAL files are deleted when the application shuts down (after any associated successful checkpoint). So if a WAL file exists or can be recovered then it may be possible to recover records from multiple previous transactions.

There is a further feature in the Forensic Browser for SQLite that allows you to "look" at a database as it used to be. When you open a DB and choose to process the associated WAL the Browser will ask "which is the last commit frame that you want to use?". Typically this will be the latest but by choosing an older commit frame you can get the Browser to show you the database as it was when that frame was the last to be written. Essentially you can wind back the clock on the database state.

The table below shows the database when commit frame 5 was the last live frame. In the screenshot, you can see that records 1-3 have yet to be deleted and that records 7 onwards have not yet been written (although they are still shown as this is a forensic tool after all).

So in summary:

If a journal file exists or can be recovered then you can potentially find evidence of any securely deleted records from the last transaction.

If a WAL file still exists or can be recovered you can potentially find evidence of any securely deleted records from often many previous transactions.

Determining when a record was deleted in SQLite

In a recent article, I discussed how I identified deleted records in a database that was using WAL journalling. In this article, I want to take this a little further and show how we can see what the live records were at a specific point in time, how we can timeline the frames in a WAL, and how we can use this to determine approximately when the records were deleted.

To help with this I created a test messaging application with a few artificial messages and users to keep the demonstration simple.

The “application” I have developed has just two tables “messages” and “users” and I created a simple conversation series and added and deleted users and messages.

The messages table, when viewed with the Firefox SQLite Manager plugin (i.e. the live database as the application would display it), looks as follows:

The users table is even simpler:
The final step in setting the ‘picture’ is to show the database as it would be seen in the Forensic Browser, with a simple join between the tables and the columns displayed with simple conversions so the report makes more sense:
That is how most SQLite viewers would see the database, and all looks OK. So what happens when the database is opened with the Browser and the WAL journal is processed alongside the master database? The following screenshot shows just a few of the 101 messages that were recovered by the Forensic Browser for SQLite. While there is a lot of duplication in this report (see later), there are also some records that don’t appear in the live database (some of which are highlighted); it is these records we will examine in the rest of this article.

So why 101 records? If you have read my previous article you will be aware that when a database is using a WAL journal, any new or changed pages are appended to the end of the WAL file; the associated database file is not touched at all until a WAL checkpoint occurs (typically when the WAL grows beyond 1000 pages). The other reason for the extra records, of course, is that some of them are deleted records that are not in the live database.

So how can we identify just the deleted records when there are exact copies of live records that are also held in the WAL?

The simplest method is to use something called a correlated subquery, which first identifies the MD5 of each of the live records and then returns every record whose MD5 does not match that of a live record.

The query is:

Code:

SELECT *
FROM messages
WHERE sfcMD5 NOT IN (SELECT sfcMD5 FROM messages WHERE sfIsLive = 'True')
And the results look like this: still some duplicated records (which is always to be expected when WALs are active), but all messages that we haven’t seen before:
This query, however, returns every row from the table and will include duplicate copies of any rows where there are multiple copies. We can filter this query further by simply adding a GROUP BY clause on the sfcMD5 column to display just one unique copy of each record, based on the MD5 of all of the columns in the original database (i.e. not including those columns that start with sf…).
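As a sketch of the de-duplicated query, the following mocks up the Browser's sfcMD5 and sfIsLive columns in a throwaway table (the column names are the Browser's; the table contents are invented) and adds the GROUP BY step:

```python
import sqlite3

# Mock up a 'messages' table as the Browser would present it: the same
# record can appear several times (once per WAL page copy), with only
# one copy flagged live.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE messages (sfcMD5 TEXT, sfIsLive TEXT, message TEXT)")
con.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        ("aaa", "True",  "live message"),
        ("aaa", "False", "live message"),     # older WAL copy of the live record
        ("bbb", "False", "deleted message"),
        ("bbb", "False", "deleted message"),  # second WAL copy of the deleted record
    ],
)
rows = con.execute(
    """
    SELECT *
    FROM messages
    WHERE sfcMD5 NOT IN (SELECT sfcMD5 FROM messages WHERE sfIsLive = 'True')
    GROUP BY sfcMD5   -- one copy per unique record
    """
).fetchall()
print(rows)  # only a single copy of the deleted record survives
```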

This query is further explained here:

So what else can the Forensic Browser do for us as part of an investigation?

Firstly, we can use one of the parsing features of the Browser, supplying a commit or transaction number to limit which records the Browser includes in the list of live records.

The commit number for the last WAL frame containing the deleted records (highlighted above) is 20, and the WAL frame itself is frame 25. When the Browser is run and the WAL is identified, the investigator gets the option to process all frames, none of the frames, or to select the maximum frame. Note that all records are still processed, but the value selected will determine which records are marked as live:
The following screenshot shows the result of parsing with the last commit number set to 20. The results have been filtered to show just the live records, and we can see at the top left that this is 16 of 101 records. So, in summary, this is the database/table as it was before the messages were deleted.
When were the records deleted?

As well as knowing what records were deleted, it would of course be useful to determine when a record was deleted. The success of this technique depends on the content of the database and works best if there is a timestamp that is set automatically. In this example we are assuming that the “date” is the date a message was sent/received and is set automatically by the application, i.e. not manually by the user.

The first step is to try and work out the last possible date on which the last deleted record was live in the table. We can then determine:

  1. The date and time of the last message in the table before the record was deleted
  2. The date and time of the next message after the record was deleted.

The highlighted record below is the last occurrence of any of the deleted records; we can see that this is in commit number 20 (as discussed above).

To summarize what this shows:

  • The record of interest is on DB page 5
  • At commit 20 page 5 is updated and the record is live
  • Commit 21 just affects page 6 and adds a record to the database
  • At commit 22 page 5 is updated and the record is no longer in the table
  • The record was therefore deleted in commit 22.
As we know the record was not present in page 5 at commit 22, the numbered bullet points above become:

  1. The date and time of the last transaction in the table before commit 22 (LastLiveDate)
  2. The date and time of the earliest transaction at or after commit 22 (LatestDeleteDate)

If we can determine the date of the last record that was sent or received before commit number 22 then we know the last time that the record was in the live DB, the following query does this:

Getting the other bracketing date (the following date) is slightly more complex because WAL frames/pages after commit 22 can also include records that are live, both before and after the deletion occurred. The following SQL uses the query above as a subquery that returns the last live date and finds the earliest date that follows the last live date from the following WAL pages.
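The article's actual queries are shown as screenshots, but the shape of the two bracketing queries can be sketched as follows. Here 'commit_no' is a hypothetical stand-in for whichever column your tool exposes for the WAL commit number, and the dates are invented:

```python
import sqlite3

# Toy data: (commit number, message date) pairs around the deletion at commit 22.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE msg (commit_no INTEGER, date INTEGER)")
con.executemany("INSERT INTO msg VALUES (?, ?)",
                [(20, 100), (20, 110), (21, 120), (22, 130), (23, 140)])

# 1. Last message date written before commit 22 (LastLiveDate)
last_live = con.execute(
    "SELECT MAX(date) FROM msg WHERE commit_no < 22").fetchone()[0]

# 2. Earliest date at or after commit 22 that follows LastLiveDate (LatestDeleteDate)
latest_delete = con.execute(
    "SELECT MIN(date) FROM msg WHERE commit_no >= 22 AND date > ?",
    (last_live,)).fetchone()[0]

print(last_live, latest_delete)  # the deletion happened between these two times
```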

Code:

insert into messages (id, date, sent, type, user_id, message) values (9, strfTime('%s','2016-03-04 09:22:02'), 0, 2, 1, 'Paul, be my friend, Darcy');
insert into users (id, username) values (3, 'Darcy');
insert into messages (id, date, sent, type, user_id, message) values (10, strfTime('%s','2016-03-04 09:25:43'), 1, 1, 3, 'Hiya mate - didn''t know you were on this app');
insert into messages (id, date, sent, type, user_id, message) values (11, strfTime('%s','2016-03-04 09:27:43'), 0, 1, 3, 'No time for pleasantries, Ive transferred the money from the company account');
insert into messages (id, date, sent, type, user_id, message) values (12, strfTime('%s','2016-03-04 09:29:22'), 0, 1, 3, 'This is really scary - I dont do illegal - no more after this');
insert into messages (id, date, sent, type, user_id, message) values (13, strfTime('%s','2016-03-04 09:29:22'), 1, 1, 3, 'OK good - thats it we are quits now');
insert into messages (id, date, sent, type, user_id, message) values (14, strfTime('%s','2016-03-04 10:03:21'), 0, 1, 2, 'Hi honey - are you working late? what time will you be home?');
insert into messages (id, date, sent, type, user_id, message) values (15, strfTime('%s','2016-03-04 10:05:10'), 1, 1, 2, 'about 8pm hopefully');
insert into messages (id, date, sent, type, user_id, message) values (16, strfTime('%s','2016-03-04 13:13:40'), 0, 1, 3, 'Oh - I forgot to say, delete any trace of this conversation');
insert into messages (id, date, sent, type, user_id, message) values (17, strfTime('%s','2016-03-04 14:08:21'), 0, 1, 2, 'Hi honey - did you get my message form earlier?');
delete from messages where (id >= 9 and id <= 13) or (id = 16);
insert into messages (id, date, sent, type, user_id, message) values (18, strfTime('%s','2016-03-04 17:05:08'), 1, 1, 2, 'been in a meeting, leaving shortly, quick pint with the boys and then ill be on the train');
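The effect of the delete statement can be checked by replaying just the statements shown above against an empty copy of an assumed schema (messages with ids 1-8, created earlier in the test data, are omitted, so only the later part of the conversation appears):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
con.execute("""CREATE TABLE messages (id INTEGER PRIMARY KEY, date INTEGER,
               sent INTEGER, type INTEGER, user_id INTEGER, message TEXT)""")
con.executescript("""
insert into messages (id, date, sent, type, user_id, message) values (9, strftime('%s','2016-03-04 09:22:02'), 0, 2, 1, 'Paul, be my friend, Darcy');
insert into users (id, username) values (3, 'Darcy');
insert into messages (id, date, sent, type, user_id, message) values (10, strftime('%s','2016-03-04 09:25:43'), 1, 1, 3, 'Hiya mate - didn''t know you were on this app');
insert into messages (id, date, sent, type, user_id, message) values (11, strftime('%s','2016-03-04 09:27:43'), 0, 1, 3, 'No time for pleasantries, Ive transferred the money from the company account');
insert into messages (id, date, sent, type, user_id, message) values (12, strftime('%s','2016-03-04 09:29:22'), 0, 1, 3, 'This is really scary - I dont do illegal - no more after this');
insert into messages (id, date, sent, type, user_id, message) values (13, strftime('%s','2016-03-04 09:29:22'), 1, 1, 3, 'OK good - thats it we are quits now');
insert into messages (id, date, sent, type, user_id, message) values (14, strftime('%s','2016-03-04 10:03:21'), 0, 1, 2, 'Hi honey - are you working late? what time will you be home?');
insert into messages (id, date, sent, type, user_id, message) values (15, strftime('%s','2016-03-04 10:05:10'), 1, 1, 2, 'about 8pm hopefully');
insert into messages (id, date, sent, type, user_id, message) values (16, strftime('%s','2016-03-04 13:13:40'), 0, 1, 3, 'Oh - I forgot to say, delete any trace of this conversation');
insert into messages (id, date, sent, type, user_id, message) values (17, strftime('%s','2016-03-04 14:08:21'), 0, 1, 2, 'Hi honey - did you get my message form earlier?');
delete from messages where (id >= 9 and id <= 13) or (id = 16);
insert into messages (id, date, sent, type, user_id, message) values (18, strftime('%s','2016-03-04 17:05:08'), 1, 1, 2, 'been in a meeting, leaving shortly, quick pint with the boys and then ill be on the train');
""")
# Only the messages the delete statement left behind remain live.
live = [row[0] for row in con.execute("SELECT id FROM messages ORDER BY id")]
print(live)  # [14, 15, 17, 18]
```

In the live test each of these statements was a separate transaction, which is what produces the string of commit frames in the WAL.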
This article has shown why it is essential to look at a database and its WAL as a whole, and not just at the latest live incarnation of the database; it has also shown why we as investigators need to actively examine SQLite databases rather than rely on a canned report from a generic investigations application to produce our reports for us.

Identifying deleted records in DB and WAL

I have been working with a user who is processing an Android mmssms.db with its associated WAL journal and it became clear that another forensic tool that he was using was not identifying deleted records. My colleague wanted to create a report showing just the deleted records from the SMS table for a specific phone number.

First, a little background regarding Write Ahead Log (WAL) journals.

When a new record is to be written to an SQLite database and WAL journalling is in operation, the SQLite engine identifies which page of the database needs to be updated with the new record and writes a new copy of that page (containing the new record) to the WAL file, leaving the old page in the DB untouched. If another record is subsequently written to the same page, then another copy of the page (containing both new records) is appended to the WAL file.

The same happens if a record is deleted – the updated page (now without the deleted record) is appended to the WAL and the DB is left unchanged.

If you have been following all that, you will realise that you can get multiple copies of a page in the WAL file, and therefore multiple copies of the records in a page.
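This accumulation of page copies is easy to demonstrate with Python's sqlite3 module (the file and table names are just an illustration). Each commit that touches a page appends a fresh copy of that page to the WAL, so the WAL keeps growing even though the row count does not:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sms.db")
con = sqlite3.connect(path)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE sms (id INTEGER PRIMARY KEY, body TEXT)")
con.execute("INSERT INTO sms VALUES (1, 'v0')")
con.commit()

sizes = []
for i in range(3):
    con.execute("UPDATE sms SET body = ? WHERE id = 1", (f"v{i + 1}",))
    con.commit()                        # the same DB page is appended again
    sizes.append(os.path.getsize(path + "-wal"))

# One more page copy per commit: the WAL grows with every update, and each
# older copy still holds the earlier version of the record.
print(sizes[0] < sizes[1] < sizes[2])
con.close()
```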

The Forensic Browser can process a DB along with the WAL file and will identify which of all of these records is the current “live” record, but will still display all of the other records, because, as forensic investigators, we need to see everything. If the record is not the actual live record that would be displayed by a normal SQLite browser, then it is marked as False in the sfIsLive column.

If you have really been following this, you will realise that if a record is written to the WAL and then subsequently deleted, you will get both copies of the page (one with the live record and one without it – i.e. with it deleted) in the WAL file. The recovered deleted record will of course not be marked as live, because it is not.

The task, therefore, is to identify all the records for the user (last four digits of phone number = 9937) that have been deleted. The process I used was as follows.

Open the Forensic Browser for SQLite and create a case file – we will be creating some views later and these will be saved to the case file.

Open the DB and the associated WAL file, and choose to recover records.

Choose to process all of the frames in the WAL.

When the tables have all loaded, we run our first query to select all of the records from the SMS table that belong to the user we are interested in. The query is:

Code:

SELECT *
FROM sms
WHERE sms.ADDRESS = '+xxxxxxxx9937'
To make our final query simpler we now need to create a view based on this query. This is done from the queries menu, and when prompted we give the view a name – I called it “all_9937”.
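The Browser builds the view for you, but the underlying SQL is just a CREATE VIEW statement. A sketch using Python's sqlite3 and a mocked-up sms table (invented rows; the redacted number follows the article):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sms (_id INTEGER, address TEXT, sfIsLive TEXT)")
con.executemany("INSERT INTO sms VALUES (?, ?, ?)",
                [(1, '+xxxxxxxx9937', 'True'),
                 (2, '+xxxxxxxx9937', 'False')])

# The view wraps the filtering query so later queries can refer to it by name.
con.execute("""
    CREATE VIEW all_9937 AS
    SELECT * FROM sms WHERE sms.address = '+xxxxxxxx9937'
""")
rows = con.execute("SELECT * FROM all_9937").fetchall()
print(len(rows))  # 2
```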

The query above can now be replaced by:
Code:

SELECT *
FROM all_9937
The next step is to create a similar view, but only on the live records for this particular phone number. The Browser records which records are live and which are not by using ‘True’ or ‘False’ as appropriate in the sfIsLive column. So the query to identify live records is simply:

Code:

SELECT *
FROM sms
WHERE sms.ADDRESS = '+xxxxxxxx9937' and sms.sfisLive = 'True'
We create a view on this query and call it “live_9937”.

The final task is to create a correlated subquery to return the difference between the two tables. This sounds more complex than it is; the query is simply:

Code:

SELECT *
FROM ALL_9937
WHERE ALL_9937._ID NOT IN (SELECT LIVE_9937._ID FROM LIVE_9937)
This, in English, simply says: select all records for 9937 that are not present in the live records for 9937.
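The three steps can be put together end-to-end on a tiny mocked-up sms table (invented data; sfIsLive is the Browser's column, _id is Android's):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sms (_id INTEGER, address TEXT, sfIsLive TEXT, body TEXT)")
con.executemany("INSERT INTO sms VALUES (?, ?, ?, ?)", [
    (1, '+xxxxxxxx9937', 'True',  'kept message'),
    (1, '+xxxxxxxx9937', 'False', 'kept message'),     # older WAL copy of the live record
    (2, '+xxxxxxxx9937', 'False', 'deleted message'),  # never appears as live
])
con.execute("CREATE VIEW all_9937 AS "
            "SELECT * FROM sms WHERE address = '+xxxxxxxx9937'")
con.execute("CREATE VIEW live_9937 AS "
            "SELECT * FROM sms WHERE address = '+xxxxxxxx9937' AND sfIsLive = 'True'")

# The correlated subquery: everything for 9937 whose _id has no live counterpart.
deleted = con.execute("""
    SELECT * FROM all_9937
    WHERE all_9937._id NOT IN (SELECT live_9937._id FROM live_9937)
""").fetchall()
print(deleted)  # only the record that was deleted
```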

The final results looked like this:

The views can be replaced by the original queries but this makes the SQL a little harder to follow:

Code:

SELECT *
FROM sms
WHERE sms.ADDRESS = '+xxxxxxx9937' AND sms._ID NOT IN
(SELECT sms._ID FROM sms WHERE sms.ADDRESS = '+xxxxxxx9937' AND sms.sfIsLive = 'True')