I have written a few articles lately about designing queries using The Forensic Browser for SQLite and shown how easy it is to do this using drag and drop. But I thought it might be useful to go back to basics a little, go over joins in a little more detail and try and put the data in terms that we understand in the computer forensics world.
Before we start though we need some tables to work with, so to make things relevant we will use a file list and a list of hashes of contraband material. Our files table lists a file id (say the MFT no.) a file name and the hash of the file (we will just use a 4 digit integer representation of a hash to make things readable):
Code:
CREATE TABLE files (id int primary key, name text, hash int)
CREATE TABLE hashes (hash int, level int)
|
|
Left Join
Suppose we want a list of all files and their associated level whether or not the file has a matching hash in the hashes table. This is known as a LEFT JOIN (or LEFT OUTER JOIN) and we want our results table to show all of the rows in the files table with the matching level from the hashes table if applicable. Visually this can be represented by a Venn diagram with the file table represented by the left-hand circle and hashes table the right-hand circle.Using The Forensic Browser we can create the join the old fashioned way by just typing in the text of the query, once we tab out (or click outside) of the text box the visual elements will be created automatically (your query will be nicely formatted). Or, we can create the query visually as follows:
Drag each of the two tables from the “Tables” panel at the top right into the central visual query designer (or just double-click on each of them), the first table you add is your “left table”. Click on the checkbox next to each column that you want in the final report. Select the column in the left table with the mouse and drag the cursor to the column in the right table that you wish to “join” with. Execute your query.
The LEFT JOIN is the default join in The Forensic Browser when two tables/rows are joined in this way.
In English this can be explained as follows:
Code:
SELECT files.id,
files.name,
files.hash,
hashes.level
FROM files
LEFT OUTER JOIN hashes ON files.hash = hashes.hash
The first four lines simply select the columns that we wish to see in our final report, three columns from the files table and one from the hashes table.
Line 5 says we want to gather our data from the files table
And line 6 says we want a LEFT JOIN, i.e. all the rows from the left table (files, the first named table) irrespective of whether they have a matching row in the hashes table, with those rows in the hashes table that match based on the join field specified in this row. The joining field in this example is the hash field from both tables.
So our final result should show all of the rows and all three columns from the files table and just a single column and those rows from the hashes table that have a matching hash with the files table.
Note that if there were two or more rows in the hashes table that matched a hash (i.e. duplicate hash values) then there would be a row in the query results for each matching file AND each matching hash.
If the hashes table is modified such that there are duplicate hashes stored within, it looks as follows:
HASHES | |
Hash | Level |
5456 | 2 |
7452 | 3 |
7538 | 2 |
8573 | 5 |
1234 | 1 |
3434 | 3 |
1233 | 1 |
5555 | 2 |
8573 | 4 |
Right Join
The RIGHT JOIN is the reverse of the LEFT JOIN, i.e. ALL of the rows from the right table with just those rows from the left table that match the joining condition. RIGHT JOINS are not supported by SQLite, however, it is simple enough to replicate utilising a LEFT JOIN using the tables in the opposite order from that above.Inner Join
Our examination so far is going well but if we have many thousands of files we might not initially want to see those files that aren’t contraband, so we want a query that lists all the files in our file list that have a matching entry in the hashes list, this brings us on to the INNER JOIN, represented by the Venn diagram below:Left excluding join
Our investigation has now moved on, we have examined all of the files with matching hashes but we need to look at those unknown files to see if there is any new evidence. So we want to isolate those files that don’t have a matching hash in the hashes table. For this, we need what is sometimes known as a LEFT EXCLUDING JOIN, which is essentially a LEFT JOIN with a qualifying clause.Right Excluding Joins
The RIGHT EXCLUDING JOIN is the reverse of this. RIGHT JOINS are not supported by SQLite, however, it is simple enough to create a LEFT EXCLUDING JOIN using the tables in the reverse order, as above.Full Outer Join
While undertaking our investigation we come across a clone of the hard disk we are examining and we would like to compare this with the original and see what the differences are. So we want to design a query that returns just those rows that don’t appear in both databases. The query we want is a FULL OUTER JOIN and the Venn diagram for it is:Unfortunately, FULL OUTER JOINS are not supported by SQLite so we need to devise a workaround. Our list of files from both devices is reproduced below:
|
|
It’s quite easy to see the differences in the examples above, but with many thousands of files reviewing the data manually would be daunting. For our examination, we are interested in the files on the original disk that are not on the clone and vice versa. To do this we will again use the hash value.
So how do we simulate a FULL OUTER JOIN?
I hope that you can all see that a FULL OUTER JOIN is essentially the results of a LEFT EXCLUDING JOIN merged with the result of a RIGHT EXCLUDING JOIN, i.e.
Our first query is a LEFT EXCLUDING JOIN and as expected it returns just the one file in the files table that is not in the clone table:
And then as previously discussed we simulate the RIGHT EXCLUDING JOIN by swapping the table names and doing a LEFT EXCLUDING JOIN.
We could leave things there as the results are easy enough to understand at this level and indeed in some ways more useful from an investigators standpoint than just one list of files that differ. But we started this section with the intention of simulating a FULL OUTER JOIN so we will continue and complete our task.
To get a results table with the combined contents of the two tables above we simply need a UNION of the two queries. A UNION can be applied to any two (or more) queries provided that the queries produce the same output columns, in the same order. Our queries clearly do this.
Creating a UNION in The Forensic Browser for SQLite is straight forward.
First create each query in turn (as we did above), then with the first query displayed in The Forensic Browser click on the “add union” button (the plus symbol below)
In the subsequent new visual query designer add the second part of the query, The Forensic Browser will join the two queries with a UNION and display the SQL.