At a recent conference while talking about SQLite forensics I found out that some people still use non-forensic tools to investigate databases with WAL files and were quite happy that they would not miss anything of importance. This is something I disagree with very, very, strongly and I hope in this article to explain the very basics of how WAL files work and show why you should not use non-forensic tools to examine SQLite databases.
Most of the databases we examine are appended to as user interactions take place. Be it a web browser which contains tables related to browsing events whereby new URLs are appended to associated tables as and when the user navigates to them. Or a messaging application where each message to and from the user is appended to the end of a list of such messages and new contacts are likewise added to the end of a list of contacts. In the example below, we will consider a simplified database which is appended to in the same manner and we will show what happens when the user tries to hide an event by deleting it.
While the example is simply the same sequence events as shown here will happen in the same way if a user chooses to delete an incriminating message or delete a URL of a side to hide the history of their visit from a real-world application.
For those of you who are not aware WAL files are Write Ahead Logs and simply put when a page is to be updated in the database the complete new page is appended to the WAL file and the old page is left in the DB unchanged. At some later point the latest copy of each page in the WAL file is copied back to the appropriate place in the DB and any new pages are written to the WAL starting from the beginning again.
When an application uses the SQLite library to read a record from the DB, the library determines whether the latest copy is in the DB or the WAL and reads it from the appropriate location.
Typically the pages are copied from the WAL to the DB either when it exceeds 1000 pages or when the application is shut down gracefully. So when investigating a DB which has an associated WAL then all of the most recent pages and changes to the DB are held in the WAL NOT in the DB.
The process that some people use, believing it will get all the latest pages (and also show the older version of these pages) is to rename the WAL and open the DB with a non-forensic tool believing this will show the old state of the database – this is correct but also very wrong. They then rename the WAL back and open the DB again believing that this will show the latest state of the database, again this is correct but very, very limiting.
The rest of this article will show why you SHOULD NOT use this technique and with a very simple example will show why.
The example is a simple database that has events appended from time to time. This could be some logging application or a messaging application. Our example has just one table (of course it could be and normally would be more complex) and the sequence of events for initializing the database, creating the table and populating the records is documented below. In the example a number of records are added 3 records are added, one then deleted and then another record is added. The SQLite commands are as follows (the order is important):
Open an SQLite command prompt creating a test database
Put the DB into WAL mode
PRAGMA journal_mode = WAL;
Create a test table with one text column
CREATE TABLE testtable (textcol text);
Add 3 records (you could add more but this keeps it simple)
INSERT INTO testtable VALUES (‘our first text string’); INSERT INTO testtable VALUES (‘our second text string’); INSERT INTO testtable VALUES (‘our third text string’);
Delete the second record
DELETE FROM testtable where text col = ‘our second text string’;
Add one more record
INSERT INTO testtable VALUES (‘our fourth text string’);
Now kill the command shell – do not exit SQLite first, this will preserve the WAL file
The folder containing your test database should look as follows:
The database you created should be present along with the associated WAL file (you can ignore the .db-shm file)
Now make a backup/archive of the database and WAL file (so we can use them both again later) using whatever application you choose.
And now rename the WAL file so that SQLite does not associate it with the parent database
Ren atest.db-wal atest.db-walx
We now want to open the database in a tool of our choosing – we could use the SQLite command prompt but for variation I will use the Firefox SQLite manager plugin.
SQLite Manager shows that there are no tables in the database at all:
And the database settings tab shows that the database is in WAL mode:
So why is the database empty? This is simply a function of the WAL mode as discussed above. All changes to the database are first written to the WAL until either the application is shut down gracefully (remember we killed the command shell – so our SQlite exe was NOT shut down gracefully) or more than 1000 pages (by default) are written to the WAL (ours is just eight 1K pages). But we renamed the WAL file so SQLite can no longer associate it with its parent database – hence it can’t find the records we added.
Now re-instate the database and the WAL file from your backup and, without renaming the WAL file, open the database in the Firefox SQLite manager plugin again (or any non-forensic SQLite tool). You can see from the screenshot below that all the records that we added are there and the record we deleted is not present – this is the live database and this is what we would expect to see.
If we look at the results using the Forensic Browser for SQLite we see the deleted record is still within the WAL and a complete picture of what has happened in this example.
In more detail:
But why are there multiple copies of some of the records?
When a working copy database is created with the Browser and a WAL file is processed the additional columns supplied by the Forensic Browser for SQLite provide information about where the record was found.
The first additional column sfIsLive shows whether a record is a live record or ‘not live’. Live records are those that the user would see if they opened the database using a normal non-forensic SQLite database viewer. Where not live = false these are records that are deleted OR copies of live records.
sfSource shows where the record has been recovered from, in this case, all of the records are from the WAL file.
sfPage is the database page number that the WAL file is holding, in this case, all of the records are from DB page number 2.
sfCommit is the commit number (more on this shortly)
sfpk is just a unique number for each record – we will can this to refer to each record in our description of events below.
SQLite is an ACID compliant database manager and without going into too much detail
SQLite maintains its integrity by using journals and transactions. Each transaction can be one or more writes (pages) to a database. The last write/page of a transaction is known as the commit frame. When using the SQLite command line as we have just done every write is a single transaction and every write is, therefore, the commit frame. So the sfCommit number shows all of the pages (and rows) that were written to a database in each transaction.
So looking at the example above we can see the sfCommit 1 (sfpk 4) comprised a single row (our first text string) written to frame2 of the WAL and representing page 2 of the database.
The second transaction sfCommit 2 is two rows (sfpk 5 & 6) this is a new page written to frame 3 of the WAL but again this is a new version of page 2 of the database. This transaction is the addition of the second row of the database but entails a complete rewrite of page 2 hence this new copy of page 2 has both rows.
The third transaction (sfpk 7-9) involves page 2 being rewritten again with the addition of the third entry to the table.
The fourth transaction (sfpk 10 & 11) again rewrites page 2 but in this case, the change is the deletion of the second row.
The final transaction (sfpk 1-3) is the addition of the fourth row of text.
So using the Forensic Browser for SQLite we can see the deleted record that we would not otherwise have seen using the ‘trick’ that some offer as a method of dealing with WAL files.
We can also see when this particular record was deleted, i.e. in transaction 4. In a normal application, such as a messaging app, there may be date and time information in transaction 4 that could show us approximately when this transaction occurred – or there might be date and time information in transactions 3 and 5 that would allow us to ‘bracket’ the deletion between two other events.
Imagine that this was a real-world database such as internet history or a messaging database on a mobile phone (we never normally shut down applications on phones) where a user had repeatably deleted browsing history or messages. There could be many WAL pages containing data that precede the most current page so potentially lots of pages containing data that can not be recovered using the non-forensic technique that is being proposed.
There is another article on my web site that deals with this in more detail:
Using non forensic tools in the manner describe above will show you either:
- the live files from the current database (when the WAL is present), or
- the state of the database before the first transaction in the WAL – i.e. just the records in the DB without any of the changes in any of the transactions in the WAL
Using an appropriate Forensic SQLite Browser such as the Forensic Browser for SQLite you can see all of the records from the DB along with all of the changes that have occurred (often many copies of the same page comprising multiple transactions).