Converting the new Firefox cache2 files to an SQLite DB for investigating

A few weeks ago I was contacted by a user who asked me to have a look at the new Firefox cache2 format files and see if I could convert the metadata (along with the cached file) into an SQLite table, in a similar manner to the Chrome Cache extension I did for the Forensic Browser a few months ago. The result of this work is the free tool FirefoxCache2ToSQLite.

Before I discuss the tool and the structure of the data it presents to the user of the Browser, I’ll look a little at the format of the data. For this exercise I’ll ignore the index file for the cache, as this just provides a quick way of linking cache entries to the individually cached files; this is not important when working with both sets of data in SQLite, as you’ll see.

But first a quick taster: a screenshot of a query on the resulting tables created by FirefoxCache2ToSQLite, viewed with the Forensic Browser for SQLite. This will give you an idea of the sort of data you can see in a formatted report before I go on with the details.

Cache2 file format

Individual cached files are simply written to disk as normal files, each with a name that is the SHA-1 hash of the URL of the cached item. The file content is followed immediately by the metadata, which is in turn followed by a 4-byte big-endian integer (all integers in the format are big endian) holding the length of the original file. These four bytes are the last four bytes of the file and also equate to the offset of the start of the metadata.

So what is the metadata that follows the file content? The first few bytes are hashes of some of the data and are not really relevant for this article; we do, however, need to skip them, and as the amount of data is variable we need to understand a little of what is there. Essentially there is a 4-byte hash followed by a further 2 bytes for every 262,144-byte chunk of the original file. So we divide the file size by 262,144 and round the result up, multiply by 2, add 4, and then add this to the offset value (the last four bytes of the cache file) to get to the start of the metadata itself.
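
For the file in the screenshot below, for example, the stored offset is 0x0258 = 600 bytes, so there is one 262,144-byte chunk, the hash block is 4 + (1 x 2) = 6 bytes long and the seven metadata numbers start at offset 606. The same arithmetic can be run against the cache2 table that FirefoxCache2ToSQLite creates (described below); a minimal sketch, assuming only the filesize column and SQLite's truncating integer division:

Code:

SELECT filename,
       filesize,
       -- 4-byte hash plus 2 bytes per 262,144-byte chunk, rounded up
       filesize + 4 + 2 * ((filesize + 262143) / 262144) AS metadata_start
FROM cache2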

The first section of metadata is 28 bytes split into 7 big-endian numbers; this is followed immediately by the URL of the cached page.

The screenshot below shows that the last four bytes of the cache file point to the end of the file proper, i.e. the beginning of the metadata at offset 0x0258. As this is a small file there follows the 4-byte hash 45C6AA5D, followed in turn by the 2-byte hash 0339 covering the first (and only) 262,144-byte chunk. This is then followed by the 7 big-endian numbers and then the URL of the cached file.

These 7 big-endian numbers, in sequence, are:

Version             00000001   1
Fetch Count         00000001   1
Last Fetched date   54DDEFE3   13/02/2015 12:36:51
Last Modified date  54DDEFE3   13/02/2015 12:36:51
Frequency           3AD345A3
Expiry Date         54E13BA3   16/02/2015 00:36:51
Key Length          00000031   49

The remaining bytes comprise a varying number of tuples made up of name/value pairs, i.e. pairs of strings, as seen in the screenshot below. These strings are written to the tuples table in the name and value columns as appropriate.

Examining the cache

FirefoxCache2ToSQLite creates two SQLite tables. The first contains the fixed fields and has the following schema:

CREATE TABLE cache2 (
filename text primary key,
filesize int,
file blob,
Version int,
FetchCount int,
LastFetchedDate int,
LastModifiedDate int,
Frequency int,
ExpiryDate int,
KeyLength int,
uri text)

As determined by the primary key field, there can be only one entry for each filename. The filesize is the size of the cached file excluding any metadata, and the file BLOB field contains the data for the cached file itself – the Forensic Browser for SQLite will allow you to view this as hex, as text or, for images, as a picture. The remaining fields are as above and their definitions are outside the scope of this article.

The tuples table has the following schema:

CREATE TABLE tuples (
filename text,
name text,
value text)

The filename field in this table is not a primary key and there may be more than one entry for each corresponding entry in the cache2 table. Each entry has a filename field so a join can be created to the main cache2 table. The name and value columns contain the associated data for each tuple.

The following screenshot shows three of the tuples for the file with name ending in 7d62, these tuples are security-data, request-method and response-head along with the associated data for each of them.

A join can be used to create custom queries/reports such as the one below, which includes the file data displayed as hex, the dates and just the response-head data from the tuples table:
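
A minimal sketch of the kind of query behind such a report (the columns chosen here are illustrative, and the date and blob formatting seen in the screenshot is applied within the Browser):

Code:

SELECT cache2.filename,
       cache2.LastFetchedDate,
       cache2.ExpiryDate,
       cache2.uri,
       cache2.file,
       tuples.value AS response_head
FROM cache2
LEFT JOIN tuples ON tuples.filename = cache2.filename
WHERE tuples.name = 'response-head'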

Using group_concat to amalgamate the results of queries

Recently one of our users contacted me and asked for help creating a custom report for a Skype database after other forensic software had failed to include some very relevant data in its reports.

In a nutshell, the messages report he had produced using the other software only included the author of a given message and did not list the other person(s) who were part of that conversation. This information is maintained in different places in a Skype main.db SQLite database. This article discusses how to include this information in a user-friendly format.

In order to include the required information, the two tables that we need are the messages table and the participants table. Normally in a Skype installation the messages table includes most of the information we want in its dialog_partner column; however, in this database (from an HTC One phone) this column was blank.

In order to understand the problem more clearly, have a look at this extract of selected columns from the participants table:

The convo_id column is a unique number that refers to each conversation; the identity column is the identity of the participants. Note that conversation 426 has 4 participants. Also note that r3scue193 (that’s me) appears in every conversation, as you would expect.

If we now look at the messages table we can see that there is a corresponding convo_id column and we can use this column to perform a join between the two tables so that we can include a list of all the participants.

So how do we get a list of all of the participants in a conversation?

If we simply do a LEFT JOIN between the two tables on convo_id then for each row in the messages table we will get a joined row for each matching row in the participants table, i.e. for a conversation with two participants we would get a duplicated row for each participant (the identity column below), for three participants we would get three rows, and so on – this would be very confusing:

The answer is the SQLite aggregate function group_concat, from the SQLite web site (https://www.sqlite.org/lang_aggfunc.html)

group_concat(X)
group_concat(X,Y)
The group_concat() function returns a string which is the concatenation of all non-NULL values of X. If parameter Y is present then it is used as the separator between instances of X. A comma (“,”) is used as the separator if Y is omitted. The order of the concatenated elements is arbitrary.

The query we use is below and utilizes the group_concat function along with the GROUP BY expression:

Code:

SELECT Participants.convo_id,
Group_Concat(Participants.identity, ', ') AS ConversationParticipants
FROM Participants
GROUP BY Participants.convo_id

In English, this selects all of the convo_ids and associated participants from the participants table and groups them by convo_id; the identity of each participant in each group is then concatenated into a single string, with each identity separated by a comma.

This might be clearer if we review the original table:

And then look at the results of the query:

Remember you can use the Case Manager to save the query you have just created for re-use on another case.

We now want to use the query created above with the messages table so that we can list the participants alongside each message. But before we do that we will make a VIEW based on the above query. A VIEW can be thought of as a sort of temporary table and the Forensic Browser for SQLite allows you to create a VIEW by selecting the appropriate option from the Queries menu:

You then need to enter a name for the view
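
If you prefer to create the view by hand, the equivalent SQL is simply the query wrapped in a CREATE VIEW statement (a sketch, using the name chosen above):

Code:

CREATE VIEW Convo_participants AS
SELECT Participants.convo_id,
       Group_Concat(Participants.identity, ', ') AS ConversationParticipants
FROM Participants
GROUP BY Participants.convo_id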

This view can now be used in the same way as any table by dragging it to the query designer. The screenshot below shows that:

Code:

SELECT Participants.convo_id,
Group_Concat(Participants.identity, ', ') AS ConversationParticipants
FROM Participants
GROUP BY Participants.convo_id

Can now be replaced with:

Code:

SELECT *
FROM Convo_participants

All that now remains is to create a JOIN on the messages table using our new view and select the columns we want to display:
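
A sketch of that final query is below. The Skype messages table is called Messages and the author, timestamp and body_xml columns are typical of a Skype main.db, but do check the schema of the database you are working with:

Code:

SELECT Messages.timestamp,
       Messages.author,
       Convo_participants.ConversationParticipants,
       Messages.body_xml
FROM Messages
LEFT JOIN Convo_participants ON Messages.convo_id = Convo_participants.convo_id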

Dealing with records found in SQLite Rollback Journals

I was contacted recently by a police client who had identified an incriminating file name in an SQLite rollback journal and wanted some assistance getting the data into a user-friendly format. The intelligence provided was that the suspect was using Kik Messenger on an Android device and had deleted a number of messages. The file itself, a JPEG, had been found, but they needed to link it back to the message and put it in context with the other messages around it.

The process I suggested, using the Forensic Toolkit for SQLite, is rather straightforward and is detailed at the end of this article. But I thought I would write up some of my observations on whether to manually ‘roll back’ the journal myself and why I chose not to do so.

SQLite maintains two sorts of journals: the older rollback journal file, which I will discuss in most of this article, and the newer Write Ahead Logging (WAL) file, which I won’t cover here (however, the same technique offered at the end of this article can be used to deal with WAL files). Only one journal type can be used at any one time. Journals are maintained so that if an error occurs while updating a database, SQLite can revert to the last good position, i.e. the state before the write transaction took place.

When a record is updated, created or deleted in a database that utilises a rollback journal, the entire database page that contains that record is first copied to the rollback journal file. The database is then updated. If SQLite crashes then, when it restarts, it checks the journal for valid entries (more on ‘valid’ later); if any are present it “rolls back” the journal, i.e. copies the pages from the journal back into the database.

Database updates normally take place as part of a transaction, and a transaction often touches many records at once, for instance when multiple records are updated or deleted together. When the transaction is complete the changes are ‘committed’ to the database. The records for different tables are stored in the database in pages, so if multiple pages are updated as part of a transaction then multiple pages will exist in the journal.

A valid journal starts with a header section that is padded out to the size of one sector (so usually 512 bytes); the sector size is recorded in the header and only the first 28 bytes of the header are actually used. The header also records the number of pages in the current journal, a random seed (known as a nonce) used to calculate a checksum for each page, and the database page size.

This is followed by the stored pages from the database with each page stored consecutively as follows:

The file format is detailed here:

http://www.sqlite.org/fileformat.html

There are a few points to note from the description at the file format link above:

  • The journal is only valid if it contains a valid header
  • The same page must not exist more than once in the journal
  • The checksum for each page is calculated on certain bytes with the nonce from the journal header
  • A new nonce is used for each transaction
  • When a transaction is committed, the journal is marked as invalid by deleting it, truncating it to 0 bytes, or clearing the journal header sector.

In practice when a transaction is committed it seems that the journal header is normally zeroed and the data in the journal remains. This is not a problem when it comes to reading each page from the journal as we can obtain the page size from the database itself and as discussed above, the header is almost invariably 512 bytes.

However, there is a complication. If a large amount of data, say 1000 pages, is written to a journal and then committed, these 1000 pages remain after the transaction is committed and just the journal header is wiped. If the next transaction contains just 3 pages then these 3 pages overwrite the first 3 pages of the previous transaction. The next transaction might be 30 pages, etc. This of course could, and does, happen multiple times with different page counts.

What this means in practice is that the journal often contains data from multiple previous transactions. If the header is present then the page count field in the header can be used to determine which pages belong to the current transaction, or the checksum (using the nonce) could be used to determine the valid pages. As mentioned above, it is unusual to see a non-zeroed header.

In the journal sent to me the first 21 pages were as follows:

If the header had been intact we could use the nonce (sometimes referred to as a salt) from the header and the data from each stored page to verify the checksum of each page in turn. Once we find a mismatching checksum we know that the pages prior to the first mismatch belong to the current journal, and that all of the pages further on (including the first mismatch) are from one or more previous journals.

We can see that DB page 1 is at journal pages 3 and 4 and DB page 90 is at pages 14 and 21. Referring back to our bullet points above we see that a page can only exist once in a valid journal, so from this, we can conclude that pages from at least three transactions are present in this journal.

As page one of a database is always updated when a database is using journal mode (not WAL mode), we can make an assumption that the last write consisted of database pages 86, 4 and 1. Because the specification says that a page from the database can only occur once in any valid journal, the copy of database page 1 at journal page 4 MUST be from a previous transaction. Likewise, we can also say that the journal changes again somewhere between journal page 15 and journal page 21 (because database page 90 can’t appear twice in the same journal).

 

Back to the problem of viewing the data, what are our options?

We could roll back the complete journal anyway (we would need to write a program to write every single page in the journal back to its correct location in the database). This I decided was too fraught with potential problems. Page 1 is just the DB header and this is updated on every database commit, but DB page 90 is also present in the journal in two locations, so which one do we use?

I don’t like this idea!

We could insert every record in every page in the journals into the KikDatabase.db file (the main SQLite database for Kik). This also has its problems:

We know that there are multiple transactions in the journal; therefore, there may be multiple copies of the same record. The _id for each row in messagesTable is an integer primary key (see later) and the table constraint for a primary key (it must be unique) would be broken if we tried to insert more than one record with the same id. With more complex database schemas this issue becomes even more relevant. And what about inserting a page that broke the parent-child relationships in the B-tree in which each table is stored?
Even if this was successful, the data from the journal would now be interspersed with the data that currently exists in the live database and there would be no way of differentiating between the two sets of records.

I don’t like this idea!

We could try to determine where the last valid journal ends and roll back just those entries up to that point. But this point is not always obvious and it still doesn’t deal with the pages beyond the last ‘good’ page that we choose.

So I don’t like this idea either!

The solution I chose turns out to also be the simplest one and required no new code: use Forensic Recovery for SQLite to carve directly from the journal and create a new database containing just the records from the journal. All we need to do is ensure that a template is created for the Kik database; this can be done by just pointing Forensic Recovery for SQLite at the existing database when following the wizard (these steps are very straightforward and are detailed later). Forensic Recovery for SQLite will display tables as per the original schema (with constraints removed so duplicate records are allowed) showing all of the records, and it also creates a new SQLite database that we can examine with the Forensic Browser for SQLite.

Although a report can now be created on the database we have populated with the content of the journal, it makes sense to take one further step. As our brief is to look specifically for deleted records we need to filter the recovered data to remove those records in the journal that still exist in the live database. Remember that if a single record is deleted from a database then the complete page, including all other records in that page – whether they have been deleted or not – will be copied to the journal, so the journal will normally contain both deleted records and copies of live records.

To understand how we can do this let’s digress a little and look at the SQL for the messagesTable – as follows:

 

Code:

CREATE TABLE messagesTable (_id INTEGER PRIMARY KEY AUTOINCREMENT, body VARCHAR, partner_jid VARCHAR, was_me INT, read_state INT, uid VARCHAR, length INTEGER, timestamp LONG, bin_id VARCHAR, sys_msg VARCHAR, stat_msg VARCHAR, stat_user_jid VARCHAR, req_read_reciept BOOLEAN, content_id VARCHAR, app_id VARCHAR, message_retry_count INT, encryption_failure BOOLEAN default 0, encryption_key BLOB, render_instructions VARCHAR)

Of particular relevance is the first column: “_id INTEGER PRIMARY KEY AUTOINCREMENT”.
This states that _id is the primary key – meaning this field must be unique. It also states that the field ‘auto-increments’, which means that even if a row is deleted the next inserted row will not re-use the deleted _id; instead the value is incremented from the last (largest) value used.

Incidentally, the next value to use for auto-increment is stored in the sqlite_sequence table. In our database this table, as shown below, tells us that the last _id used for messagesTable was 7603, i.e. the next to be used will be 7604.
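
You can confirm this from the SQL editor with a quick query on that table (sqlite_sequence is a standard SQLite table with just the name and seq columns):

Code:

SELECT name, seq
FROM sqlite_sequence
WHERE name = 'messagesTable'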

So within the Forensic Browser for SQLite we create a query that filters the recovered data based on the _id field and restricts the returned data to those rows that do not exist in the original. First, open the live database – this database will have the fully qualified prefix “main”. Now attach the recovered database as “recovered”, then paste the query below into the SQL query editor and execute it:

Code:

SELECT * FROM recovered.messagesTable
WHERE recovered.messagesTable._ID NOT IN
(SELECT main.messagesTable._id FROM main.messagesTable)

In English this does the following, row by row:

a) Selects all of the messages from the recovered table
b) but, only if the ID of the record in a) is not in:
c) the list of the complete set of _id fields returned from the live (main) database.

So basically the query will just return those rows carved from the journal that are not in the current live database. The main browser display will look as follows:

As mentioned above, the process of setting up the template and carving from the journal file is actually quite straightforward:

  • Run Forensic Recovery for SQLite and work through the wizard
  • When you get to the templates page, “right click” the “Apps table” and choose “import tables from SQLite databases”
  • Navigate to the KikDatabase.db file and select it
  • Call your template something like “Kik – Android”
  • Make sure just the new template that you create (Kik – Android in my example) is check marked (you don’t want tabs to be created for all the other applications)
  • Uncheck the “Search for database definitions” box – there is no need as we are carving from a known journal
  • Finally select the KikDatabase.db-journal file, under the files tab
The wizard should now run to completion and show you all of the records that are in the journal. The carved messages table looks as follows in Forensic Recovery for SQLite:

To see exactly where a specific record was found you can double-click the row (in the picture above) and Forensic Recovery for SQLite will display the source file, database and offset of the carved record, as below:

You can now also open the carved database from the folder you selected at the first page of the wizard using the Forensic Browser for SQLite and create custom reports and/or run the SQL query detailed above to select just the deleted records.

Using Forensic Browser for SQLite to examine ANY SQLite database

We all know that SQLite has become pervasive and is common on pretty much every investigation we do, and we often rely on our Swiss-army-knife type tools to produce reports on the supported databases found in an image. We quite often leave the investigation there and look no further. This might be OK, but we are potentially missing a whole host of evidence.

• What happens if the database schema has changed? (this happens regularly)
• What do we do if our tool doesn’t support that DB? (there are lots)
• Is our tool extracting all the relevant information for our case? (very often not)

These are all valid questions. New tables and fields are added to databases all the time and although a tool might produce what looks like a comprehensive report, without looking further we don’t know what we are missing!

Some tables are huge, and by that I mean contain lots of data well beyond the ability of a generic tool to display in a nicely formatted report. For instance the Skype contacts table contains (when I last counted) 98 different columns. A tool that produces a nice simple report cannot possibly extract all of the relevant data from these columns – if we don’t look at them all how do we know that we are not missing crucial evidence?

There are literally millions of apps that use SQLite as storage on both phones and desktop and as mentioned above these apps usually have a changing DB structure and contain masses of data. We really need a tool that will allow us to create a nicely formatted custom report on just those tables and columns that we want, restricting the report, if required, to just certain users/rows and on databases that we may never have seen before.

The Forensic Browser for SQLite was written to address all these issues.

In simple terms The Forensic Browser for SQLite is a visual, Drag and Drop, SQL query generator that allows a user to examine every column and row in every table in any database and produce custom compound reports across multiple tables. Once a query has been created it can be saved for future use or shared with other users. In this short article I go through a whirlwind tour of some of the features of The Browser showing the results on different databases, it briefly covers:

• Querying a table
• Displaying an integer column as a formatted date string with an appropriate timezone offset
• Displaying blobs as pictures/hex/decoded binary plist, Boolean integers as a more meaningful yes/no/true/false/on/off
• Choosing specific columns for a report
• Filtering reports on particular users or for a custom period
• Creating SQL joins using just drag and drop
• Exporting all of the above (including pictures and maps) in a report to HTML/XLS/PDF

Creating a report on selected columns from a table is simplicity itself, as this example from a Skype database shows:

1. Open the database in the normal manner from the “File” menu
2. Drag the table you want to report on from the tables list on the right-hand side into the central visual query designer and select the columns you want in your report
3. The SQL for your query is automatically generated
4. Hit “Execute SQL” to run the query
5. Examine the results and when happy
6. Hit “Create Report” to create and save your report to disk as a PDF

In the examples above the date is just displayed as an integer, so what about changing the way a column is displayed – again this is straightforward. The screenshot below shows the results of a query on just four fields from the 96 in the contacts table. The data shown is the “raw” data, i.e. as it is present in the database with no conversion; the blob field is shown as hex (the default for the Forensic Browser):

Right clicking on any column allows the user to change the display of the data in that column.

Any numeric date fields (Unix 10 digit in the example above) can be displayed in any text format and timezone and DST conversions can be applied. Boolean 1/0 fields can be displayed as yes/no, on/off or true/false. Blobs can be displayed as pictures, etc.

Now we have our report we may want to filter so that just certain rows are returned. We can filter on any value in a column by clicking on the ‘filter icon’ for that particular column header and just ‘check mark’ those entries that we want to see:

Or if we are interested in a particular date range all we need to do is choose the custom option from the filter menu:

And enter the range of dates we would like to filter on:

So what about more complex databases? A brief examination of the Kik Messenger DB shows that the messages table records the message details, date and time etc., but the user is stored as an integer and we need to do a look-up (JOIN) on the primary key of the users table.

Creating this join in The Forensic Browser is straight forward, we first drag the ZKIKMESSAGE table to the query designer window, then drag the ZKIKUSER table to the query designer window and then left click the ZUSER row in the ZKIKMESSAGE table and drag the mouse to the Z_PK row in the ZKIKUSER table.

The designer and corresponding (automatically created) SQL query looks as below:
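
In text form the generated query is along these lines (a sketch; the ZUSERNAME and ZBODY column names are typical of the iOS Kik database but may differ between app versions):

Code:

SELECT ZKIKMESSAGE.ZTIMESTAMP,
       ZKIKUSER.ZUSERNAME,
       ZKIKMESSAGE.ZBODY
FROM ZKIKMESSAGE
LEFT JOIN ZKIKUSER ON ZKIKMESSAGE.ZUSER = ZKIKUSER.Z_PK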

We can now select the columns we want in our report (by check-marking them as discussed above) and apply a conversion to the ZTIMESTAMP column (this time it is a Mac absolute date) and our report is ready to print.

More complex reports can be made across multiple tables in the same manner.

To create a report just select the appropriate output option from the menu. Reports are an exact mirror of what you see on screen, so if you click on a column to add a sort or adjust the width of a column then this is reflected in the output report. Of course, if you output your report to Excel then further adjustments can be made in that application.

There are many more features in The Forensic Browser for SQLite including:

• Decoding geolocation coordinates and displaying a map alongside a post
• Importing a picture held externally to a file to display alongside a message
• Maintaining a comprehensive query library so that when custom queries/reports have been built they can be saved (and/or shared) for future use
• Searching all tables for multiple keywords and creating complex queries against the search results
• Importing tables from multiple databases and creating queries across all tables
• Creating Forensic Browser extensions to decode, display and query complex structures held as blobs

Using SQL as a date/time conversion tool

In a slight aside from my recent articles on using the Forensic Toolkit for SQLite, I have put together a short tutorial on using SQL to convert dates. This article came about as I was using third-party utilities (and in some cases SQLite) to validate the date and time conversion procedures in the Forensic Browser for SQLite.

First, what am I trying to achieve? Basically I want a table where I can insert a date and have all of the possible valid dates in different formats (Unix, NSDate, Chrome, FileTime etc.) calculated and displayed alongside. Sounds simple?

This is what it should look like after a valid NSDate has been added to the base column:

The first thing to do is to create a table with a column for our “base” date and then additional columns for each of the date formats that we support, the CREATE statement is below:

Code:

CREATE TABLE dates (base, unix10 text, unix13 text, NSDate text, chrome text, filetime text)

The base column has no defined type – we could be looking at integers, floating point numbers or text. This actually is not really important with SQLite as it is pretty lax with column affinity, i.e. you can write a text value to an integer or float field, or a float to an integer…

The next thing to do is create a series of triggers. Triggers are actions that take place when certain things change. So, for instance, you could create a log table that had a date and time entry every time another table changed or you could recalculate the total of all of the values in a column when one of them is updated.

Our first trigger is a simple one: before a new row is inserted into our table I want to delete every row in the table. This ensures that our table only ever has one row, which keeps things simple.

More information about SQLite triggers can be found here:
https://www.sqlite.org/lang_createtrigger.html

The trigger is as follows

Code:

CREATE TRIGGER trig_del before insert on dates begin delete from dates; end

The first line defines when the trigger activates, in this case before a new INSERT takes place on our dates table. The line(s) bracketed by begin and end define what our trigger does, in this case deletes all rows from the dates table.

Before we move on to the rest of our triggers we’ll digress and look briefly at the SQLite DateTime function.

In simple terms this converts a number into a string depicting the date; all we need to do is provide the number and tell the function what the number represents (a Unix date or a Julian date). You can do more, such as controlling the format of the resulting text string, but this is beyond the scope of this article.

The DateTime function is detailed here – https://www.sqlite.org/lang_datefunc.html

So to convert 1234567890 into a text string we use

Code:

DateTime(1234567890, 'unixepoch')

and we get 2009-02-13 23:31:30

if we wanted to update a particular column (unix10) with this value the SQL is

Code:

UPDATE dates SET unix10 = DateTime(1234567890, 'unixepoch')

But we need to take our Unix date value from the newly entered base column. The trigger functionality of SQLite adds a qualifier that allows us to access the old value of a column (before it was updated) or the new value (after it was updated), so to get the new value of the base column to replace our hard-coded Unix date we use new.base.

So now to our first trigger that does some of the real work

Code:

CREATE TRIGGER trig_ins AFTER INSERT ON dates BEGIN UPDATE dates SET unix10 = (DateTime(new.base, 'unixepoch')); END

This trigger activates after a new row has been inserted and simply calls the built-in SQLite function DateTime, passing it the new value from the base column and converting it to a text string, assuming the value in base is a 10-digit Unix value.

If you have read the page on triggers you will know that you can have multiple statements within the begin and end block so we can further update our insert trigger with additional statements to deal with additional date and time formats.

For Unix dates expressed as 13 digit integers recording the date to millisecond accuracy we can divide the date by 1000 to give us seconds rather than milliseconds and use:

Code:

DateTime(new.base/1000, 'unixepoch')

For Google Chrome dates (microseconds since 1/1/1601) we also need to convert to a Unix date (seconds since 1/1/1970). The first thing we can do is convert from microseconds to seconds by simply dividing the date by 1,000,000. We can then adjust for the difference between the two epochs, expressed in seconds – but what is that difference?

SQLite helpfully provides a way to do this too: we use strftime. The strftime function, given the %s format specifier, returns a date as the number of seconds relative to the Unix epoch (1/1/1970), so

Code:

strftime('%s', '1601-01-01 00:00:00')

Gives us the difference between 1/1/1970 and 1/1/1601 – neat eh? Our query now becomes

Code:

datetime((new.base/1000000)+strftime('%s', '1601-01-01 00:00:00'), 'unixepoch')

(We add the difference as strftime will return a negative number)

Microsoft FILETIMEs can be dealt with in exactly the same way. A FILETIME is the number of 100-nanosecond intervals since 1/1/1601, so we convert to seconds by dividing by 10,000,000 and then use the same equation as for Chrome time.

Code:

datetime((new.base/10000000)+strftime('%s', '1601-01-01 00:00:00'), 'unixepoch')

If you have read the triggers web page you will know that we can have multiple statements within a trigger, so we can combine all of the above, as below:

Code:

CREATE TRIGGER trig_ins after insert on dates begin
  update dates set unix10 = datetime(new.base, 'unixepoch');
  update dates set unix13 = datetime((new.base/1000), 'unixepoch');
  update dates set chrome = datetime((new.base/1000000)+strftime('%s', '1601-01-01 00:00:00'), 'unixepoch');
  update dates set filetime = datetime((new.base/10000000)+strftime('%s', '1601-01-01 00:00:00'), 'unixepoch');
  update dates set nsdate = datetime((new.base)+strftime('%s', '2001-01-01 00:00:00'), 'unixepoch');
end

There is one last convenience trigger we can add. I have been using the excellent SQLite Manager plugin for Firefox to create my tables and indexes as well as to insert new rows to test my statements – the Forensic Browser for SQLite is not meant to have this functionality. I found though that rather than adding a new row, I was simply editing the current (and only) row. So we need to add a new trigger that fires after just the base column in the table changes.

The trigger is the same as above, with just a change to the first line. I hope it needs no further explanation.

Code:

CREATE TRIGGER trig_upd after update of base on dates begin
  update dates set unix10 = datetime(new.base, 'unixepoch');
  update dates set unix13 = datetime((new.base/1000), 'unixepoch');
  update dates set chrome = datetime((new.base/1000000) + strftime('%s', '1601-01-01 00:00:00'), 'unixepoch');
  update dates set filetime = datetime((new.base/10000000) + strftime('%s', '1601-01-01 00:00:00'), 'unixepoch');
  update dates set nsdate = datetime((new.base) + strftime('%s', '2001-01-01 00:00:00'), 'unixepoch');
end
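
Both triggers can be extended with further formats by adding extra statements inside the begin…end block. As an illustration only, here is a sketch of a statement for a Mozilla/PRTime value (microseconds since 1/1/1970), assuming a prtime column has been added to the dates table:

Code:

update dates set prtime = datetime(new.base/1000000, 'unixepoch');
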
If you don’t want to cut and paste the SQL above to create your own table (although I would encourage it) you can download the table I created using the above commands at the link below. It should be pretty straightforward to add new date formats, but if you do have trouble with a particular format then please feel free to get in touch.

datesandtimes.sqlite

To use the database above with the Firefox SQLite Manager plugin: open the file and select the dates table, add a new row (using the add button) or edit the existing row if present, and change the value in the base column to any date value you have. The database will update the remaining columns and populate them with valid dates where possible. A screenshot of the resulting table after a valid Chrome date has been posted is below:

Chrome history with Recursive Common Table Expressions

Like many applications, Google Chrome uses an SQLite database (or rather a number of SQLite databases) to store information relating to pages visited. One of these databases is the history DB, which uses a set of normalised tables that, amongst other things, record the date and time of every page visited, where appropriate the previous page (the page on which the user clicked a link, i.e. the referring page), and an internal ID number of an entry in the URLs table that contains the text of the URL itself.

The purpose of this article is to show how we can generate a list of the pages that a user has followed while browsing the internet, or more correctly showing the chain of webpages that led to a specific page. To achieve this we will use a feature of SQL known as Recursive Common Table Expressions (RCTE), don’t worry it sounds worse than it is!

But first let’s look at a simple RCTE (from the SQLite web site) and see how it all works:

Code:

WITH RECURSIVE cte(x) AS (
SELECT 1
UNION ALL
SELECT cte.x + 1
FROM cte
LIMIT 10)
SELECT *
FROM cte

So what does the above query do? Very simply, it creates a table with one column (x) containing the value 1 and writes out that value, then adds 1 to the column and writes out the new value, and so on until it has written all the results from 1 to 10, i.e.:

The essential feature of any RCTE is that there is a UNION between two queries: the first part of the UNION (SELECT 1) sets up the starting conditions of the query and the second half of the UNION (SELECT cte.x + 1 FROM cte) refers back to the first half (this is the recursive element).

If that’s not clear, then hopefully things will become clearer as we work through this explanation.

But first, a bit more background on the tables we are interested in.

The first table we want to look at from the Chrome history is the visits table. This table contains a number of columns; we are just interested in the first four (the purpose of the remainder is outside the scope of this article).

These columns are:

  • id – a unique identification number for this record (this is the table’s primary key)
  • URL – the id of the entry in the URLs table that contains the text of the URL visited
  • visit_time – the time of the visit encoded as a Chrome date (we will decode this later)
  • from_visit – the id in this table of the previous page.

The keen-eyed amongst you will have noted that the number in the from_visit column is sometimes the same as the id of a previous row; this is our link between pages visited and we can follow this chain back until the from_visit entry is 0, which indicates the start of a viewing chain.

  • The row with id = 27585 has a from_visit = 27584
  • The row with id = 27584 has a from_visit = 27583
  • The row with id = 27583 has a from_visit = 27582
  • The row with id = 27582 has a from_visit = 0

Clearly, from an investigation standpoint, it would be useful to have a report of each of the pages in a chain displayed in order, showing the route through the internet history to a given page.

The second table is the URLs table. It contains a number of columns that would interest a forensic investigator; for this article we will concern ourselves with just the first two.

  • id – again the primary key and a unique identifier for this row in this table
  • url – the URL of the page visited

These two tables can be joined via a LEFT JOIN (see our article on JOINs) on the visits.url field to the urls.id field. When we do this and apply a Chrome date conversion to the visit_time column we get a results table that looks like this:

The query that we use to create the results table above is:

Code:

SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id

For the sake of this article, we will follow the chain of web sites back from entry ID 27585 and as mentioned above we will use a Recursive Common Table Expression (RCTE). But before we do that we will digress a little and look a little at what a Common Table Expression (CTE) actually is.

A CTE is analogous, in simple terms, to a temporary table (or to a view), but unlike a temporary table or view, it exists just for the life of the query. As discussed above the query to select the rows we are interested in, is:

Code:

SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id

Expressed as a Common Table Expression this would be:

Code:

WITH cte AS (SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id
)
SELECT *
FROM cte

The first five lines simply set up the common table and the final two lines are the query; simply put, cte becomes an alias for the SELECT expression and this alias can then be referred to in place of the query.

Recursion

As beautifully shown in the tongue-in-cheek dictionary definition, “Recursion – def: see Recursion”, a Recursive Common Table Expression is an expression that references itself. The basic format of a Recursive CTE (RCTE) is:
Code:

WITH RECURSIVE (alias_name) AS (
Initial-select
UNION ALL
Recursive-select)
SELECT * from alias_name

Or from the SQLite web site:

Where the initial-select initializes the query and the recursive-select part of the UNION provides the recursive element.

We can now work with our RCTE skeleton above and replace a section at a time to build up our final query. Start by giving the RCTE an alias-name; we’ll call this (imaginatively) rcte:

Code:

WITH RECURSIVE rcte AS (
Initial-select
UNION ALL
Recursive-select)
SELECT * from rcte

Now we need to replace the initial-select portion of the query (the first half of the UNION), as discussed before, the basic query we will be using to create the table is reproduced here:

Code:

SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id

We need however to take this one step further and decide which record we want to start with, as mentioned above we want ID = 27585. So the initial-select becomes:

Code:

SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id
WHERE visits.id = 27585

We can copy and paste this directly into our ‘in-progress’ query which now becomes as follows:

Code:

WITH RECURSIVE rcte AS (
SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id
WHERE visits.id = 27585
UNION ALL
Recursive-select)
SELECT * from rcte

Now for the second, recursive part of the UNION; again we use the starting query but with a slight modification. The query needs to be recursive, i.e. it needs to refer to itself, and the changed element is the FROM clause:

Code:

SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM rcte
LEFT JOIN urls ON visits.url = urls.id

This is not quite enough though, we also need to tell the query how to follow our chain of pointers, we can do this using an INNER JOIN (see my previous article on JOINS for more information) and we JOIN the two tables on the from_visit and the id fields.

Code:

SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM rcte
LEFT JOIN urls ON visits.url = urls.id
INNER JOIN visits ON rcte.from_visit = visits.id

Now we can copy and paste this subquery to our in-progress query:

Code:

WITH RECURSIVE rcte AS (
SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id
WHERE visits.id = 27585
UNION ALL
SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM rcte
LEFT JOIN urls ON visits.url = urls.id
INNER JOIN visits ON rcte.from_visit = visits.id
)
SELECT * from rcte

Although this query works (you can open an SQLite command window and try it – but make sure you use a valid visits.id) I want to make two very small changes.

The first is easy and simply orders the results by id. The second limits the number of iterations through the recursive loop; while not essential, if you don’t do this and you create a query incorrectly, or the table you are querying loops back on itself, then a recursive query can just hang. The final query is below with the two changes (the LIMIT clause and the ORDER BY):

Code:

WITH RECURSIVE rcte AS (
SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM visits
LEFT JOIN urls ON visits.url = urls.id
WHERE visits.id = 27585
UNION ALL
SELECT visits.id,
visits.url,
visits.visit_time,
visits.from_visit,
urls.url AS URL1
FROM rcte
LEFT JOIN urls ON visits.url = urls.id
INNER JOIN visits ON rcte.from_visit = visits.id
LIMIT 100
)
SELECT * from rcte
ORDER BY rcte.id

By way of explanation, the recursive-select part of the UNION references back to the initial-select part and the query continues, row by row, until rcte.from_visit no longer matches any visits.id. The INNER JOIN will only match rows from the visits table that are also referenced by the rcte table – see my previous article on joins for more information.

In English, it works like this:

  1. The initializing select runs and selects the one row from the visits table that has ID 27585. This row is inserted into our results table
  2. The recursing select uses this value and selects the rows (in our case a single row) which has a from_visit equal to the ID in step 1
  3. The row from step 2 becomes our new step 1 and the process repeats until step 2 fails to return a row, or we have been through this loop 100 times

In The Forensic Browser for SQLite this looks as follows:

We can, of course, display the Chrome date in a more meaningful format using the built-in options in the Forensic Browser for SQLite and the report produced can be exported to PDF, HTML, XLSX or csv.

Using The Forensic Browser for SQLite to display maps alongside Facebook messages

I suspect that many of you have looked at Facebook SQLite databases and seen data in the messages table coordinates column of the form:

Code:

{"longitude":-0.19690104212,"latitude":51.085777776363003,"accuracy":5}

Since release 1.1.0 the Forensic Browser for SQLite has been able to display data stored in two columns, one for the latitude and one for the longitude as a map of the location.

In this article, I want to show how we can create a query using SQL that extracts the latitude and longitude (lat and long) from the single field above and create a temporary VIEW (a sort of temporary virtual table) that contains the lat and long. This VIEW can then be used to create a new permanent table (in the Forensic Browser case database) that holds three maps at different zoom levels that can be linked to the messages table.

First, we create a short query just to show the format of the text coordinates column from our database:

Code:

SELECT messages.coordinates
FROM messages
WHERE messages.coordinates IS NOT NULL

This query returns just the coordinates column for those rows where there are values in this column, as below:

Now we need to identify where the lat and long are in the text above. This is easy: they immediately follow the text ‘label’ before them, i.e. the longitude number starts 11 characters after the start of the word ‘longitude’ and the latitude starts 10 characters after the start of the word ‘latitude’.

So now we need to determine how long each number is; this can be done by looking at what follows each number, so the longitude ends 2 characters before the start of the word ‘latitude’ and similarly the latitude ends 2 characters before the word ‘accuracy’. We need to extract these numbers. This diagram for the longitude calculation may make this clearer:

The first digit of the longitude value starts 11 characters after the start of the word longitude, which here is 25 characters from the start of the text, i.e. 25 + 11 = 36. The end of the longitude is 2 characters before the start of the word latitude, i.e. 50 - 2 = 48. Therefore the length of the longitude value is 48 - 36 = 12 characters.

We can now utilize two of the SQLite core functions, first to identify the start offsets (characters) of the lat and long and second to extract the value.

The first function is InStr, a function that finds the start of one string within another string (or field). So InStr(coordinates, 'longitude') will find the start of the word ‘longitude’. Note the single inverted commas around ‘longitude’; these instruct SQLite to treat longitude as a text string and not a field name.

The start offset of the longitude value is obtained with:

Code:

InStr(messages.coordinates, 'longitude') + 11

The end offset of the longitude value is obtained with:

Code:

InStr(messages.coordinates, 'latitude') - 2

But we don’t want the end offset, we want the length of the longitude, we can get this by subtracting the start offset of the longitude value from the end offset, i.e.:

Code:

(InStr(messages.coordinates, 'latitude') - 2) - (InStr(messages.coordinates, 'longitude') + 11)

We can then use the SubStr function which, as the name implies, extracts a substring from some text (or field). So SubStr(coordinates, 36, 12) will extract the 12 characters of text starting at character 36, which we have just seen is the longitude value. We can substitute the expressions above into SubStr to get:

Code:

SubStr(messages.coordinates, InStr(messages.coordinates, 'longitude') + 11, (InStr(messages.coordinates, 'latitude') - 2) - (InStr(messages.coordinates, 'longitude') + 11))

Now we need to devise expressions in the same way for the start of the latitude number and its length, i.e.:

Code:

SubStr(messages.coordinates, InStr(messages.coordinates, 'latitude') + 10, (InStr(messages.coordinates, 'accuracy') - 2) - (InStr(messages.coordinates, 'latitude') + 10))

To add these to the SQL query we go to the fields editor and in the bottom blank expression box, we can type, or cut and paste the expression above. 

When the expression has been entered we can change the alias column to read ‘Lon’.

In the screenshot below the Forensic Browser has fully qualified the field names by inserting the name of the table in front of the field name separated by a period (i.e. messages.).

The expression for latitude is derived in exactly the same way and should be entered in the next blank box:

Code:

SubStr(coordinates,
InStr(coordinates, 'latitude') + 10,
(InStr(coordinates, 'accuracy') - 2) - (InStr(coordinates, 'latitude') + 10))

There is just one thing to add to the query before we can create our VIEW, that is to add a final field to the query that we can use to relate the lat and long back to the correct row in the messages table. This is almost invariably the primary key from the table from which we get our data, in the case of the messages table this is the msg_id field. Our final query becomes:

Code:

SELECT messages.msg_id,
messages.coordinates,
SubStr(messages.coordinates, InStr(messages.coordinates, 'longitude') + 11,
       (InStr(messages.coordinates, 'latitude') - 2) - (InStr(messages.coordinates, 'longitude') + 11)) AS lon,
SubStr(messages.coordinates, InStr(messages.coordinates, 'latitude') + 10,
       (InStr(messages.coordinates, 'accuracy') - 2) - (InStr(messages.coordinates, 'latitude') + 10)) AS lat
FROM messages

To create a VIEW on this table, execute the query and hit the F5 function key (or select “Create temp view on current query” from the Queries menu). When prompted enter a name for the VIEW, I have chosen “LatAndLongs”.

Once the VIEW has been created it is available to use as a virtual table and is displayed in the tables tree at the top right of the Forensic Browser display, so queries can be run on it exactly as if it were a real table.

So the whole of the above complex query can now be replaced by:

Code:

SELECT * from LatAndLongs

Both of these SQL queries show exactly the same results, i.e.

The next step is to turn these lat and long fields into maps. Provided the Forensic Browser for SQLite is connected to the internet you can do this by selecting “Create geolocated images” from the Tools menu.

In the “Select Source Table” box type the name that you gave to the VIEW you have just created, in my case “LatAndLongs”. The Browser should auto-detect the two fields named lat and lon (if you have used different names select them from the options provided). You also need to select (or enter) the ID field, i.e. the field that will link these back to the messages table – in this case it is msg_id, as displayed below:

Finally, you should either accept or type in a new table name for the destination data (the maps) – the default is Geodata1.

We can now run a query on Geodata1 to see the content of the table:

Code:

SELECT * from geodata1

All that remains to do now is create a query joining the messages table to the geodata1 table and select the columns (including the maps) that we want in our final report.
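
A sketch of that join is below. The text column name and the map column name are assumptions here (the columns in the maps table depend on the names generated by the “Create geolocated images” option), so adjust them to match your own tables:

Code:

SELECT messages.msg_id,
       messages.text,
       geodata1.map1
FROM messages
LEFT JOIN geodata1 ON geodata1.msg_id = messages.msg_id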

And finally create the report:

Basic SQL Joins

I have written a few articles lately about designing queries using The Forensic Browser for SQLite and shown how easy it is to do this using drag and drop. But I thought it might be useful to go back to basics a little, go over joins in a little more detail and try and put the data in terms that we understand in the computer forensics world.

Before we start though we need some tables to work with, so to make things relevant we will use a file list and a list of hashes of contraband material. Our files table lists a file id (say the MFT number), a file name and the hash of the file (we will just use a 4-digit integer representation of a hash to keep things readable):

Code:

CREATE TABLE files (id int primary key, name text, hash int)

We also have a hashes table which contains a 4-digit hash and a level, which could signify a number of things but in our case is just a number between 1 and 5:

Code:

CREATE TABLE hashes (hash int, level int)

And to make our example workable we need to populate the tables with data. The tables become:

FILES

ID  Name             Hash
1   Horse            1233
2   house sale.doc   3434
3   ingredients.txt  8573
4   cmg0003.img      7375
5   house.jpg        8573
6   desktop.ini      9426
7   dinkey.doc       4534

HASHES

Hash  Level
5456  2
7452  3
7538  2
8573  5
1234  1
3434  3
1233  1
5555  2

For anyone reading this who doesn’t know the significance of hashes in investigations, here is a brief summary. Each file in the file list has a digital signature (the hash) that uniquely identifies it; any two files with the same hash are, to a statistical certainty, exactly the same. The hashes table contains a list of digital signatures that correspond to known contraband files: these could be illegal images, instructions for making drugs, other contraband, etc. Note that for this example the hash column has not been designated a primary key; this is intentional.

So on to the different joins:

Left Join

Suppose we want a list of all files and their associated level, whether or not the file has a matching hash in the hashes table. This is known as a LEFT JOIN (or LEFT OUTER JOIN) and we want our results table to show all of the rows in the files table with the matching level from the hashes table where applicable. Visually this can be represented by a Venn diagram with the files table represented by the left-hand circle and the hashes table by the right-hand circle.

A LEFT JOIN joins all of the rows in the left table with those rows in the right table that match on the join field. The resulting table shows us the four fields we have selected, with those files that match our list of contraband files from the hash table also listing their “level”; if they don’t have a matching hash the level column is blank. The results we expect can be visualised in the following table:

This shows that every row in the files table will be shown in our query results, joined to those rows in the hashes table that have a matching hash. In the Forensic Browser we can also visually see the relationship between the tables in the query, along with the results:

Using the Forensic Browser we can create the join the old-fashioned way by just typing in the text of the query; once we tab out of (or click outside) the text box the visual elements will be created automatically and your query will be nicely formatted. Or we can create the query visually as follows:

  • Drag each of the two tables from the “Tables” panel at the top right into the central visual query designer (or just double-click on each of them); the first table you add is your “left table”
  • Click on the checkbox next to each column that you want in the final report
  • Select the column in the left table with the mouse and drag the cursor to the column in the right table that you wish to “join” with
  • Execute your query

The LEFT JOIN is the default join in The Forensic Browser when two tables/rows are joined in this way.

The SQL for the query is shown below and can be explained, in English, as follows:

Code:

SELECT files.id,
files.name,
files.hash,
hashes.level
FROM files
LEFT OUTER JOIN hashes ON files.hash = hashes.hash

The first four lines simply select the columns that we wish to see in our final report, three columns from the files table and one from the hashes table.

Line 5 says we want to gather our data from the files table

And line 6 says we want a LEFT JOIN, i.e. all the rows from the left table (files, the first named table) irrespective of whether they have a matching row in the hashes table, with those rows in the hashes table that match based on the join field specified in this row. The joining field in this example is the hash field from both tables.

So our final result should show all of the rows and all three columns from the files table and just a single column and those rows from the hashes table that have a matching hash with the files table.

Note that if there were two or more rows in the hashes table that matched a hash (i.e. duplicate hash values) then there would be a row in the query results for each matching file AND each matching hash.

If the hashes table is modified such that there are duplicate hashes stored within, it looks as follows:

HASHES
Hash Level
5456 2
7452 3
7538 2
8573 5
1234 1
3434 3
1233 1
5555 2
8573 4
Then the result table is quite different. In the table below each of the two rows in the files table with hash 8573 has been joined with both of the rows in the hashes table that have this hash.
It is important to understand the schema of the tables that you perform a join upon and in particular the definition of the columns. A LEFT JOIN on two columns each of which is guaranteed to have unique values, i.e. a primary key, will result in a one to one relationship, i.e. one row in the results table for each row in the left table. A LEFT JOIN where one or both of the joined columns has duplicate entries can result in many more output rows than input rows.

Right Join

The RIGHT JOIN is the reverse of the LEFT JOIN, i.e. ALL of the rows from the right table with just those rows from the left table that match the joining condition. RIGHT JOINS are not supported by SQLite; however, it is simple enough to replicate one with a LEFT JOIN, using the tables in the opposite order from that above.
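For example, the equivalent of a RIGHT JOIN of files onto hashes can be sketched as a LEFT JOIN with hashes named first, so that every row of the hashes table appears in the results:

Code:

SELECT files.id, files.name, files.hash, hashes.level
FROM hashes
LEFT OUTER JOIN files ON files.hash = hashes.hash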

Inner Join

Our examination so far is going well, but if we have many thousands of files we might not initially want to see those files that aren’t contraband. So we want a query that lists all the files in our file list that have a matching entry in the hashes list; this brings us on to the INNER JOIN, represented by the Venn diagram below:
This join selects the rows from both tables but only where the hashes match.
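In SQL, using the same four columns as the LEFT JOIN above, the INNER JOIN can be written as:

Code:

SELECT files.id, files.name, files.hash, hashes.level
FROM files
INNER JOIN hashes ON files.hash = hashes.hash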
It can be seen that the results of the INNER JOIN (above) are just those rows from the LEFT JOIN that have a matching hash (the results of LEFT JOIN are reproduced below).

Left excluding join

Our investigation has now moved on, we have examined all of the files with matching hashes but we need to look at those unknown files to see if there is any new evidence. So we want to isolate those files that don’t have a matching hash in the hashes table. For this, we need what is sometimes known as a LEFT EXCLUDING JOIN, which is essentially a LEFT JOIN with a qualifying clause.
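In SQL this is simply the LEFT JOIN from before with a WHERE clause that keeps only the rows that have no match in the hashes table (i.e. where the joined hash is NULL); a minimal sketch:

Code:

SELECT files.id, files.name, files.hash
FROM files
LEFT OUTER JOIN hashes ON files.hash = hashes.hash
WHERE hashes.hash IS NULL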

Right Excluding Joins

The RIGHT EXCLUDING JOIN is the reverse of this. RIGHT JOINS are not supported by SQLite; however, it is simple enough to achieve the same result with a LEFT EXCLUDING JOIN using the tables in the reverse order, as above.

Full Outer Join

While undertaking our investigation we come across a clone of the hard disk we are examining and we would like to compare this with the original and see what the differences are. So we want to design a query that returns just those rows that don’t appear in both databases. The query we want is a FULL OUTER JOIN and the Venn diagram for it is:

Unfortunately, FULL OUTER JOINS are not supported by SQLite so we need to devise a workaround. Our list of files from both devices is reproduced below:

FILES
ID Name Hash
1 Horse 1233
2 house sale.doc 3434
3 ingredients.txt 8573
4 cmg0003.img 7375
5 house.jpg 8573
6 desktop.ini 9426
7 dinkey.doc 4534
Clone
ID Name Hash
1 Horse 1233
2 house sale.doc 3434
3 ingredients.txt 9432
4 cmg0003.img 7375
5 house.jpg 8573
6 Drugs.xls 3476
7 dinkey.doc 4534
8 Contacts.db 4723

It’s quite easy to see the differences in the examples above, but with many thousands of files reviewing the data manually would be daunting. For our examination, we are interested in the files on the original disk that are not on the clone and vice versa. To do this we will again use the hash value.

So how do we simulate a FULL OUTER JOIN?

I hope that you can all see that a FULL OUTER JOIN is essentially the results of a LEFT EXCLUDING JOIN merged with the result of a RIGHT EXCLUDING JOIN, i.e.

Our first query is a LEFT EXCLUDING JOIN and as expected it returns just the one file in the files table that is not in the clone table:

And then as previously discussed we simulate the RIGHT EXCLUDING JOIN by swapping the table names and doing a LEFT EXCLUDING JOIN. 
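A sketch of the two queries, assuming the second file list is held in a table simply called clone, might look like this:

Code:

-- files that are on the original disk but not on the clone
SELECT files.id, files.name, files.hash
FROM files
LEFT OUTER JOIN clone ON files.hash = clone.hash
WHERE clone.hash IS NULL;

-- files that are on the clone but not on the original disk (the simulated RIGHT EXCLUDING JOIN)
SELECT clone.id, clone.name, clone.hash
FROM clone
LEFT OUTER JOIN files ON files.hash = clone.hash
WHERE files.hash IS NULL;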

We could leave things there, as the results are easy enough to understand at this level and indeed in some ways more useful from an investigator’s standpoint than just one list of files that differ. But we started this section with the intention of simulating a FULL OUTER JOIN, so we will continue and complete our task.

To get a results table with the combined contents of the two tables above we simply need a UNION of the two queries. A UNION can be applied to any two (or more) queries provided that the queries produce the same output columns, in the same order. Our queries clearly do this.

Creating a UNION in The Forensic Browser for SQLite is straightforward.

First create each query in turn (as we did above), then with the first query displayed in The Forensic Browser click on the “add union” button (the plus symbol below)

In the subsequent new visual query designer add the second part of the query, The Forensic Browser will join the two queries with a UNION and display the SQL.

*Note that the union above is not a technically accurate simulation of a full outer join (the reasoning behind this is complex and beyond the scope of this article), and it makes no distinction between the source of the files, i.e. we can’t see from the query which device a file was on. To partially get around this restriction we can add a label to each part of the UNION (remember each part must have the same columns in the same order). Our final SQL query and the results are shown below, with an additional column showing the device from which each file is missing.
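A sketch of that final query, again assuming the second table is called clone, with each half of the UNION labelled with the device the file is missing from:

Code:

SELECT files.id, files.name, files.hash, 'missing from clone' AS Device
FROM files
LEFT OUTER JOIN clone ON files.hash = clone.hash
WHERE clone.hash IS NULL
UNION
SELECT clone.id, clone.name, clone.hash, 'missing from original' AS Device
FROM clone
LEFT OUTER JOIN files ON files.hash = clone.hash
WHERE files.hash IS NULL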

Creating a Forensic Browser for SQLite extension to load Kik images from a binary plist

This short article will show how a user (you) can write code to extend the functionality of The Forensic Browser for SQLite by either decoding a binary structure within an SQLite database or decoding such a structure in a file external to the database.

For this example, we will consider attachments to messages in the Kik messaging application.

In order to create a meaningful report including an attachment on a Kik database, we need to take selected fields from three tables: ZKIKMESSAGE, ZKIKUSER and ZKIKATTACHMENT. Investigation has shown that the ZKIKATTACHMENT table has a field ZMESSAGE which relates to the ZKIKMESSAGE table primary key (Z_PK), and a further field ZCONTENT which relates to the actual attachment, which is stored externally (more on this shortly).

So our first task is to create a query that joins every row in the ZKIKMESSAGE table with the ZKIKATTACHMENT table on the fields discussed. The screenshot below shows our join with a few meaningful additional columns included. I have also included the ZKIKUSER table so our report can contain the user name:
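Expressed as SQL, a sketch of that join might look like the following; note that the join to ZKIKUSER and the column names other than Z_PK, ZMESSAGE and ZCONTENT (e.g. ZUSER, ZBODY, ZTIMESTAMP and ZUSERNAME) are assumptions based on a typical Kik schema and may differ in your database:

Code:

SELECT ZKIKUSER.ZUSERNAME,        -- assumed user name column
       ZKIKMESSAGE.ZTIMESTAMP,    -- assumed message timestamp column
       ZKIKMESSAGE.ZBODY,         -- assumed message body column
       ZKIKATTACHMENT.ZCONTENT    -- GUID naming the external attachment file
FROM ZKIKMESSAGE
INNER JOIN ZKIKATTACHMENT ON ZKIKATTACHMENT.ZMESSAGE = ZKIKMESSAGE.Z_PK
INNER JOIN ZKIKUSER ON ZKIKMESSAGE.ZUSER = ZKIKUSER.Z_PK   -- assumed relationship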

If we execute this query we can see that the ZCONTENT column from the attachments table contains what looks like a GUID and if we look in the folder structure for Kik the attachments folder contains files with a similar naming convention:

Further investigation shows that these files do indeed have a name matching the database entry and that the files are in fact binary Plists. When these Plists are decoded we can see that there are binary structures stored within this Plist and anyone familiar with looking at hex will recognise that the icon and preview data are a PNG and a JPG respectively.

So we now know there are pictures in the Kik Binary Plists and we could easily carve them out using an appropriate tool. But what we really want to do is access them through The Forensic Browser for SQLite and create a report including any pictures as and where appropriate. The remainder of this article shows you how to do this.
The process is simply to create a script that gets the data you want and writes it to an SQLite database that can then be “attached” to the SQLite database you are examining. The tables from the new database can then be queried by using a fully qualified table name, i.e. <database>.<table>.<field> (if there is any naming conflict).
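For example (a minimal sketch, assuming the attached database is given the name kik and contains the decoded_preview table created by the script below):

Code:

ATTACH DATABASE 'kik_attachments.db' AS kik;

SELECT kik.decoded_preview.msg_id, kik.decoded_preview.preview
FROM kik.decoded_preview;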
The pseudo code to create this database is:

  • Create databases and tables
  • Open each file in turn
    • Get handle to Plist images dictionary
    • Extract images and write to the new database
  • Flush changes to the database

When the script runs you are prompted for the location of the Kik attachments (usually a folder structure ending in com.kik.chat\documents\attachments). On completion, there will be a database in the same folder as Kik.py with the file name kik_attachments.db.

The Python code for the above is as follows:

Code:

import os
import sqlite3

import SFBPList  # plist parsing module used by this script

# create (or reuse) the output database and its two tables
outDB = sqlite3.connect('kik_attachments.db')
outDB.execute("create table if not exists decoded_preview (msg_id text, preview blob)")
outDB.execute("create table if not exists decoded_icon (msg_id text, icon blob)")

# individual tables can make any subsequent logic for joins simpler
outDB.execute("delete from decoded_preview")
outDB.execute("delete from decoded_icon")

filepath = input("Enter the file path to the folder containing the plists\n: ")

# list just the files (not sub-folders) in the attachments folder
filenames = next(os.walk(filepath))[2]

# for each file
for filename in filenames:
    try:
        plist = SFBPList.ProcessPlist(os.path.join(filepath, filename))
    except Exception:
        # skip any file that cannot be parsed as a plist
        continue

    # parse the image dictionary from the plist
    image = plist["image"]

    # loop through every item and determine whether the picture is a preview or an icon
    for i in image:
        if i['name'] == 'preview':
            outDB.execute("insert into decoded_preview(msg_id, preview) values(?, ?)",
                          (filename, sqlite3.Binary(i['value'])))
        else:
            outDB.execute("insert into decoded_icon(msg_id, icon) values(?, ?)",
                          (filename, sqlite3.Binary(i['value'])))

# flush the changes to the database
outDB.commit()

Now we have our table the rest is quite straightforward. As of version 1.0.5 of the Forensic Browser for SQLite, you can attach a second (or subsequent) database to an open database.

Select “Attach DB” from the File menu and choose the kik_attachments.db created with our Python script. You will then be prompted for a name for this database (a default name is supplied).

The database is now attached and you can use the tables exactly as you would any other within The Browser.

All that remains now is to complete our query and report. Add the decoded_preview table to the query created earlier, join the new table in with a LEFT JOIN, and select “view BLOB as pictures” from the results grid context menu.

The final query design and the table look like this:
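Expressed as SQL, a sketch of that final query might look like the following (as before, the attached database name kik and the field names other than Z_PK, ZMESSAGE and ZCONTENT are assumptions and may differ in your database):

Code:

SELECT ZKIKUSER.ZUSERNAME,   -- assumed user name column
       ZKIKMESSAGE.ZBODY,    -- assumed message body column
       dp.preview            -- the picture extracted by the script
FROM ZKIKMESSAGE
INNER JOIN ZKIKATTACHMENT ON ZKIKATTACHMENT.ZMESSAGE = ZKIKMESSAGE.Z_PK
INNER JOIN ZKIKUSER ON ZKIKMESSAGE.ZUSER = ZKIKUSER.Z_PK              -- assumed relationship
LEFT JOIN kik.decoded_preview AS dp ON dp.msg_id = ZKIKATTACHMENT.ZCONTENT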

Creating a timeline with the Forensic Browser for SQLite

Timelines are a hot topic in the forensic field and rightly so; viewing data from any one source alone gives only part of the picture, whereas data viewed in context alongside other columns often paints a very different picture, and the evidence can sometimes just jump out at you.

There have been many articles written about timelines and there are various programs such as log2timeline from the SANS Institute that will help an investigator create timelines. Some of these programs have plugins that allow you to work with specific SQLite databases, but not all databases have plugins written for them and sometimes the table schema changes.

To complement these programs you can use The Forensic Browser for SQLite to create a timeline on multiple tables and output the data in a comma separated format, such that the timeline can be examined with another tool of your choice.

The following tutorial shows you how to create an arbitrary timeline on two tables from a Skype database. This could easily be extended to multiple tables on almost any database. The tutorial is also a useful source of information regarding some slightly more complex SQL constructs and some of the inbuilt “core” SQLite functions.

The main feature of a timeline is that it has one main datetime field and additional fields that describe an event that happened at that time. Depending on which format you use there can be anything from about 5 to 17 fields. For our example we will use the following:

  • Date
  • Time
  • DatabaseName
  • Table
  • User
  • Description

Let’s start building our timeline field by field (this is a sensible way to proceed as it lets us deal with problems in isolation before we move on); we will start with the Skype Messages table.

Open a Skype database and add the message table to the query designer.

Add the timestamp field with an alias Date.

Execute the query and right click on the results to choose a conversion.

Select Unix 10 digit timestamp and “Date Only” using the format yyyy/MM/dd (note that you can set different formats for each and every date/time field within The Forensic Browser).

Add another timestamp by clicking on a down arrow in the “fields designer” in the empty row at the bottom.
Execute the query again.

For this time field set the format to “Time Only” and HH:mm:ss.

The display should now look like the screenshot below:
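Incidentally, the same conversion can be expressed directly in SQL with SQLite’s built-in date and time functions, should you prefer to do it in the query rather than via the Browser’s display settings; a minimal sketch for a 10-digit Unix timestamp:

Code:

SELECT strftime('%Y/%m/%d', Messages."timestamp", 'unixepoch') AS Date,
       strftime('%H:%M:%S', Messages."timestamp", 'unixepoch') AS Time
FROM Messages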

The next two columns are constants and have values that we will enter ourselves. We do this by entering a single quoted string in the expression column and an unquoted alias (no spaces allowed here) in the Alias column. This will now look as follows:
When we execute the query now it should look like this:
Our next task is to add the user name. We could add this field as a constant as it will not change for this query, but we want to make our query portable across Skype databases so this can be taken from the Skype Accounts table using the skypename field.

Double click on the accounts table in the table list to add it to the query designer and check the skypename field to add it to the field designer, again add the alias, in this case, User. The complete display should look something like this:

Finally, we need to add the description. What I want to add here is text in the form of:

Message from <sender> --> <recipient> : <message text>

For this expression, we need to get clever with the skype database and some of the functions built into SQLite. There are a number of fields in the messages table that can help us with what we want but the best is the chatname field, an example of which is:

#johndoe/$r3scue193;5cf150f3ff4e274

This shows a message from johndoe to r3scue193.

The SQLite core functions we will use are as follows

Code:

substr(X,Y,Z) returns a substring of string X that is Z characters long, starting at character Y
instr(X,Y) returns the position of the first occurrence of string Y in string (or field) X
replace(X,Y,Z) replaces every occurrence of string Y in string (or field) X with string Z
The first thing to do is extract all characters from the chatname field starting at character 2 (the first character in the string “#” is position 1) up until the “;” character. So our code will look like

Code:

Substr(messages.chatname, 2, x-2)
where x is the position of the “;” character

We get the value of x by use of the instr function as follows

Code:

Instr(messages.chatname, ';')

Now substituting the expression above for x, this gives us the following compound expression

Code:

Substr(messages.chatname, 2, Instr(messages.chatname, ';') - 2)

Finally, we want to replace the two characters that separate the names, “/$”, with “ --> ”.
So we need to use the replace function with our compound function above as the source string – and we give this expression the Alias Description.

Code:

replace(substr(messages.chatname, 2, instr(messages.chatname, ';')-2), '/$', ' --> ') AS Description

While developing your expression it makes sense to build it step by step; you can execute each function above in turn and see whether you get any errors. For instance, in the first case you could use an arbitrary value, 21, and see what happens when you just use the Substr part of the expression, e.g. Substr(messages.chatname, 2, 21).

I am sure you have noted that there is one thing missing from our query, the message itself. 

To get this we just need to specify the messages.body_xml field at the end of the above expression and use the SQLite concatenation operator || (placing this between two strings joins them together); we also want to add a couple of space characters before the message field to make it readable.

This field expression now becomes:

Code:

Replace(SubStr(Messages.chatname, 2, instr(Messages.chatname, ';') - 2), '/$', ' --> ') || ' ' || Messages.body_xml AS Description
And our final SQL becomes:

Code:
SELECT Messages."timestamp" AS Date,
Messages."timestamp" AS Time,
'Main.db' AS DataBaseName,
'Messages' AS TableName,
Accounts.skypename AS User,
Replace(SubStr(Messages.chatname, 2, instr(Messages.chatname, ';') - 2), '/$', ' --> ') || ' ' || Messages.body_xml AS Description
FROM Messages,
Accounts
When we started this tutorial I said that we would build our timeline on multiple different tables; we do this by means of an SQLite UNION. A UNION simply combines the results of multiple queries whose columns have the same names (there was another reason for adding an alias to each column in our query above).

For our example I will use the Transfers table; this lists files transferred between users.
So how do we create a union? Click on the plus symbol at the top right of the query designer window; this will create a union subquery. A second “Q” will appear next to the first and clicking on each will cycle between the two subqueries.

Within the new blank subquery, we can start creating our new query exactly as we did before, this time using the Transfers table.

There is no timestamp field in this table, but we can use the starttime column; add two instances of it as above and give one the alias Date and the other Time.

Add the two literal text fields as above, making sure that the second uses the TableName ‘Transfers’ (it’s always good to fully qualify your field names when performing joins; tablename.fieldname is much clearer when debugging).

Add the account name as above.
And finally, just to keep this tutorial simple, we will just add the transfers.filename concatenated with the filesize as the final field, with the alias Description. The final SQL looks like this:

Code:

SELECT Messages."timestamp" AS Date,
Messages."timestamp" AS Time,
'Main.db' AS DataBaseName,
'Messages' AS TableName,
Accounts.skypename AS User,
Replace(SubStr(Messages.chatname, 2, instr(Messages.chatname, ';') - 2), '/$', ' --> ') || ' ' || Messages.body_xml AS Description
FROM Messages,
Accounts
UNION
SELECT Transfers.starttime AS Date,
Transfers.starttime AS Time,
'Main.db' AS DataBaseName,
'Transfers' AS TableName,
Accounts.skypename AS User,
Transfers.filename || ' ' || Transfers.filesize AS Description
FROM Transfers,
Accounts

A few things to keep in mind:

  • All of the fields/columns in each part of the union MUST have the same name or alias (this is one reason we used aliases)
  • All of the fields/columns in each part of the union must be in the same order, i.e. date first, followed by time, followed by DatabaseName…
  • You cannot execute a subquery until both of the above criteria are met

The final output of The Forensic Browser for SQLite looks as below:

There is one final thing that we may wish to do: output this table into a form suitable for importing into another forensic tool. Provided we do not have any graphics or hex displays in our output, we can use the inbuilt “Export current query to CSV” option available from the File | Report menu.

You can, of course, continue and create additional unions on different tables and then save your final SQL query for re-use next time you see the same database.

https://www.sqlite.org/lang.html

https://www.sqlite.org/lang_corefunc.html