Database v7.2 Schema
This page outlines version 7.2 of the TSK database schema. It is used in Autopsy 4.6.0 and beyond. The database is made by using the tsk_loaddb command line tool or the equivalent library-level methods.
The major change in this release is the addition of communications and account-related tables.
Some general notes on this schema:
- Every type of data is assigned a unique ID, called the Object ID
- Data sources are grouped by devices (to allow a computer or phone to have multiple drives in it)
- Data in a disk image has a hierarchy. Images are the root, with volume systems or file systems below them, followed by volumes and files.
- The tsk_objects table is used to keep track of what object IDs have been used and to map the parent and child relationship.
- This schema has been designed to store more than what TSK initially imports. It has been designed to support carved files and a folder full of local files, etc.
- This schema supports the blackboard so that modules in Autopsy can communicate and save results.
- Virtual files are made of unallocated space with the naming format of:
- Unalloc_[PARENT-OBJECT-ID]_[BYTE-START]_[BYTE-END]
NOTE: This may be a bit out of date. The code is the best reference. See the initialize() method in db_sqlite.cpp.
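As a small illustration of the virtual file naming format noted above, the following Python sketch splits such a name into its three fields. It assumes the fields are written as decimal integers, which this page does not state. `UnallocName` and `parse_unalloc_name` are hypothetical names used only for this example.

```python
import re
from typing import NamedTuple, Optional


class UnallocName(NamedTuple):
    """Fields encoded in an Unalloc_ virtual file name (hypothetical helper type)."""
    parent_obj_id: int
    byte_start: int
    byte_end: int


# Assumption: all three fields are plain decimal integers.
_UNALLOC_RE = re.compile(r"^Unalloc_(\d+)_(\d+)_(\d+)$")


def parse_unalloc_name(name: str) -> Optional[UnallocName]:
    """Return the parsed fields, or None if the name does not match the format."""
    match = _UNALLOC_RE.match(name)
    if match is None:
        return None
    return UnallocName(*(int(group) for group in match.groups()))
```

For example, `parse_unalloc_name("Unalloc_5_1024_4096")` would yield a parent object ID of 5 and the byte range 1024 to 4096, while a name that does not follow the pattern yields `None`.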
General Information Tables
tsk_db_info
Metadata about the database.
- schema_ver - Version of the database schema used to create database
- tsk_ver - Version of TSK used to create database
- schema_minor_ver - Minor release version of the schema to allow for backward compatible changes.
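A tool that reads the database may want to check the schema version before relying on the tables described on this page. The sketch below is one way to do that with Python's standard `sqlite3` module. It assumes a SQLite case database and that the minor version column is named `schema_minor_ver` (SQLite matches column names case-insensitively). `schema_version` is a hypothetical helper name.

```python
import sqlite3
from typing import Optional, Tuple


def schema_version(conn: sqlite3.Connection) -> Optional[Tuple[int, int]]:
    """Return (major, minor) schema version from tsk_db_info, or None if the table is empty."""
    row = conn.execute(
        "SELECT schema_ver, schema_minor_ver FROM tsk_db_info"
    ).fetchone()
    if row is None:
        return None
    return (row[0], row[1])
```

A caller could compare the result against `(7, 2)` and refuse to continue if the major version differs.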
Object Tables
tsk_objects
Every object (image, volume system, file, etc.) has an entry in this table. This table allows you to find the parent of a given object.
- obj_id - Unique id
- par_obj_id - The object id of the parent object (null for root objects). The parent of a volume system is an image, the parent of a directory is a directory or filesystem, the parent of a filesystem is a volume or an image, etc.
- type - Object type (as TSK_DB_OBJECT_TYPE_ENUM enum).
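Because every object records its parent in par_obj_id, the path from any object up to its root can be recovered by following that column repeatedly. The following is a minimal sketch of that walk, assuming a SQLite database opened with Python's `sqlite3` module. `ancestors` is a hypothetical helper name, and the `seen` set is only a defensive guard against malformed data.

```python
import sqlite3
from typing import List, Tuple


def ancestors(conn: sqlite3.Connection, obj_id: int) -> List[Tuple[int, int]]:
    """Return (obj_id, type) for the object and each of its parents, child first."""
    chain: List[Tuple[int, int]] = []
    seen = set()
    current = obj_id
    while current is not None and current not in seen:
        seen.add(current)
        row = conn.execute(
            "SELECT par_obj_id, type FROM tsk_objects WHERE obj_id = ?",
            (current,),
        ).fetchone()
        if row is None:  # unknown object ID: stop walking
            break
        chain.append((current, row[1]))
        current = row[0]  # NULL (None) for root objects
    return chain
```

The type values in the result are the raw integers from the table and still need to be interpreted through TSK_DB_OBJECT_TYPE_ENUM.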
Data Source/Device Tables
data_source_info
Contains information about a data source, which could be an image. This is where data sources are grouped into devices (based on device ID).
- obj_id - Id of image/data source in tsk_objects
- device_id - Unique ID (GUID) for the device that contains the data source.
- time_zone - Timezone that the data source was originally located in.
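The grouping of data sources into devices can be read back by collecting rows that share a device_id. The sketch below shows one way to do this in Python, assuming a SQLite database. `data_sources_by_device` is a hypothetical helper name.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, List


def data_sources_by_device(conn: sqlite3.Connection) -> Dict[str, List[int]]:
    """Map each device_id to the object IDs of the data sources it contains."""
    groups: Dict[str, List[int]] = defaultdict(list)
    rows = conn.execute(
        "SELECT device_id, obj_id FROM data_source_info ORDER BY device_id, obj_id"
    )
    for device_id, obj_id in rows:
        groups[device_id].append(obj_id)
    return dict(groups)
```

A device with two drives would appear as one key with two object IDs in its list.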
tsk_image_info
Contains information about each set of images that is stored in the database.
- obj_id - Id of image in tsk_objects
- type - Type of disk image format (as TSK_IMG_TYPE_ENUM)
- ssize - Sector size of device in bytes
- tzone - Timezone where image is from (the same format that TSK tools want as input)
- size - Size of the original image (in bytes)
- md5 - Hash of the image. Currently, this is populated only if the input image is E01.
- display_name - display name of the image.
tsk_image_names
Stores path(s) to file(s) on disk that make up an image set.
- obj_id - Id of image in tsk_objects
- name - Path to location of image file on disk
- sequence - Position in sequence of image parts
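For a split image, the parts need to be read in the order given by the sequence column. A minimal Python sketch of that lookup, assuming a SQLite database, follows. `image_paths` is a hypothetical helper name.

```python
import sqlite3
from typing import List


def image_paths(conn: sqlite3.Connection, image_obj_id: int) -> List[str]:
    """Return the paths of the files that make up one image set, in sequence order."""
    rows = conn.execute(
        "SELECT name FROM tsk_image_names WHERE obj_id = ? ORDER BY sequence",
        (image_obj_id,),
    )
    return [row[0] for row in rows]
```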
Volume System Tables
tsk_vs_info
Contains one row for every volume system found in the images.
- obj_id - Id of volume system in tsk_objects
- vs_type - Type of volume system / media management (as TSK_VS_TYPE_ENUM)
- img_offset - Byte offset where VS starts in disk image
- block_size - Size of blocks in bytes
tsk_vs_parts
Contains one row for every volume / partition in the images.
- obj_id - Id of volume in tsk_objects
- addr - Address of this partition
- start - Sector offset of start of partition
- length - Number of sectors in partition
- desc - Description of partition (volume system type-specific)
- flags - Flags for partition (as TSK_VS_PART_FLAG_ENUM)
File System Tables
tsk_fs_info
Contains one row for every file system in the images.
- obj_id - Id of filesystem in tsk_objects
- img_offset - Byte offset that filesystem starts at
- fs_type - Type of file system (as TSK_FS_TYPE_ENUM)
- block_size - Size of each block (in bytes)
- block_count - Number of blocks in filesystem
- root_inum - Metadata address of root directory
- first_inum - First valid metadata address
- last_inum - Last valid metadata address
- display_name - Display name of file system (could be volume label) (New in V3)
tsk_files
Contains one row for every file found in the images. Has the basic metadata for the file.
- obj_id - Id of file in tsk_objects
- fs_obj_id - Id of filesystem in tsk_objects (NULL if file is not located in a file system -- carved in unpartitioned space, etc.)
- type - Type of file: filesystem, carved, etc. (as TSK_DB_FILES_TYPE_ENUM enum)
- attr_type - Type of attribute (as TSK_FS_ATTR_TYPE_ENUM)
- attr_id - Id of attribute
- name - Name of attribute. Will be NULL if attribute doesn't have a name. Must not have any slashes in it.
- meta_addr - Address of the metadata structure that the name points to.
- meta_seq - Sequence of the metadata address - New in V3
- has_layout - True if file has an entry in tsk_file_layout
- has_path - True if file has an entry in tsk_files_path
- dir_type - File type information: directory, file, etc. (as TSK_FS_NAME_TYPE_ENUM)
- meta_type - File type (as TSK_FS_META_TYPE_ENUM)
- dir_flags - Flags that describe allocation status etc. (as TSK_FS_NAME_FLAG_ENUM)
- meta_flags - Flags for this file for its allocation status etc. (as TSK_FS_META_FLAG_ENUM)
- size - File size in bytes
- ctime - Last file / metadata status change time (stored in number of seconds since Jan 1, 1970 UTC)
- crtime - Created time
- atime - Last file content accessed time
- mtime - Last file content modification time
- mode - Unix-style permissions (as TSK_FS_META_MODE_ENUM)
- uid - Owner id
- gid - Group id
- md5 - MD5 hash of file contents
- known - Known status of file (as TSK_DB_FILES_KNOWN_ENUM)
- parent_path - full path of parent folder. Must begin and end with a '/' (Note that a single '/' is valid).
- mime_type - MIME type of the file content, if it has been detected.
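Two common tasks with this table are rebuilding a file's full path from parent_path and name, and turning the stored time values into readable timestamps. The Python sketch below illustrates both, assuming a SQLite database. It also assumes that mtime uses the same seconds-since-1970 UTC encoding that this page states for ctime. `to_utc` and `list_files` are hypothetical helper names.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterator, Optional, Tuple


def to_utc(seconds: Optional[int]) -> Optional[datetime]:
    """Convert seconds since Jan 1, 1970 UTC to an aware datetime (None stays None)."""
    if seconds is None:
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def list_files(conn: sqlite3.Connection) -> Iterator[Tuple[str, int, Optional[datetime]]]:
    """Yield (full path, size in bytes, modification time) for each named row."""
    sql = (
        "SELECT parent_path, name, size, mtime FROM tsk_files "
        "WHERE name IS NOT NULL ORDER BY parent_path, name"
    )
    for parent_path, name, size, mtime in conn.execute(sql):
        # parent_path begins and ends with '/', so plain concatenation is enough.
        yield (parent_path or "") + name, size, to_utc(mtime)
```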
tsk_file_layout
Stores the layout of a file within the image. A file will have one or more rows in this table depending on how fragmented it was. All file types use this table (file system, carved, unallocated blocks, etc.).
- obj_id - Id of file in tsk_objects
- sequence - Position of this run in the file (0-based and the obj_id and sequence pair will be unique in the table)
- byte_start - Byte offset of fragment relative to the start of the image file
- byte_len - Length of fragment in bytes
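The layout rows describe where a file's content sits in the image, so the content can be reassembled by reading each run in sequence order. The following is a rough Python sketch, assuming a SQLite database and a single raw image that can be opened as an ordinary binary file (it does not handle E01 or split images). `read_file_content` is a hypothetical helper name, and the result is not trimmed to the size recorded in tsk_files.

```python
import sqlite3
from typing import BinaryIO


def read_file_content(conn: sqlite3.Connection, image: BinaryIO, obj_id: int) -> bytes:
    """Concatenate the byte runs recorded for obj_id, in sequence order."""
    runs = conn.execute(
        "SELECT byte_start, byte_len FROM tsk_file_layout "
        "WHERE obj_id = ? ORDER BY sequence",
        (obj_id,),
    ).fetchall()
    chunks = []
    for byte_start, byte_len in runs:
        image.seek(byte_start)  # offsets are relative to the start of the image file
        chunks.append(image.read(byte_len))
    return b"".join(chunks)
```

A caller would typically pass a file opened with `open(path, "rb")` as the `image` argument.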
tsk_files_path
If a "locally-stored" file has been imported into the database for analysis, then this table stores its path. Used for derived files and other files that are not directly in the image file.
- obj_id - Id of file in tsk_objects
- path - Path to where the file is locally stored in a file system.
- encoding_type - Method used to store the file on the disk.
file_encoding_types
Methods that can be used to store files on local disks to prevent them from being quarantined by antivirus
- encoding_type - ID of method used to store data. See EncodingType enum.
- name - Display name of technique.
tsk_files_derived_method
Derived files are those that result from analyzing another file. For example, files that are extracted from a ZIP file will be considered derived. This table keeps track of the derivation techniques that were used to make the derived files.
- derived_id - Unique id for this derivation method.
- tool_name - Name of derivation method/tool
- tool_version - Version of tool used in derivation method
- other - Other details
tsk_files_derived
Each derived file has a row that captures the information needed to re-derive it
- obj_id - Id of file in tsk_objects
- derived_id - Id of derivation method in tsk_files_derived_method
- rederive - Details needed to re-derive file (will be specific to the derivation method)
Blackboard Tables
The blackboard is used to store results from analysis modules.
blackboard_artifacts
Stores artifacts associated with objects.
- artifact_id - Id of the artifact (assigned by the database)
- obj_id - Id of the associated object
- artifact_type_id - Id for the type of artifact (can be looked up in the blackboard_artifact_types table)
blackboard_attributes
Stores name-value pairs associated with an artifact. Only one of the value columns should be populated.
- artifact_id - Id of the associated artifact.
- source - Source string, should be module name that created the entry.
- context - Additional context string
- attribute_type_id - Id for the type of attribute (can be looked up in the blackboard_attribute_types)
- value_type - The type of value (0 for string, 1 for int, 2 for long, 3 for double, 4 for byte array)
- value_byte - A blob of binary data (should be empty unless the value type is byte)
- value_text - A string of text (should be empty unless the value type is string)
- value_int32 - An integer (should be 0 unless the value type is int)
- value_int64 - A long integer (should be 0 unless the value type is long)
- value_double - A double (should be 0.0 unless the value type is double)
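Since only one value column is meaningful per row, a reader has to pick the column that matches value_type. The Python sketch below shows one way to do that using the type codes listed above (0 for string through 4 for byte array), assuming a SQLite database. `attribute_values` is a hypothetical helper name.

```python
import sqlite3
from typing import Any, List, Tuple

# Value columns ordered so that the list index equals the value_type code.
_VALUE_COLUMNS = ("value_text", "value_int32", "value_int64", "value_double", "value_byte")


def attribute_values(conn: sqlite3.Connection, artifact_id: int) -> List[Tuple[int, Any]]:
    """Return (attribute_type_id, value) pairs for one artifact."""
    sql = (
        "SELECT attribute_type_id, value_type, " + ", ".join(_VALUE_COLUMNS) +
        " FROM blackboard_attributes WHERE artifact_id = ? ORDER BY attribute_type_id"
    )
    result = []
    for row in conn.execute(sql, (artifact_id,)):
        attribute_type_id, value_type = row[0], row[1]
        if not 0 <= value_type < len(_VALUE_COLUMNS):
            raise ValueError(f"unexpected value_type {value_type}")
        # The value columns start at index 2 of the selected row.
        result.append((attribute_type_id, row[2 + value_type]))
    return result
```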
blackboard_artifact_types
Types of artifacts
- artifact_type_id - Id for the type (this is used by the blackboard_artifacts table)
- type_name - A string identifier for the type (unique)
- display_name - A display name for the type (not unique, should be human readable)
blackboard_attribute_types
Types of attributes
- attribute_type_id - Id for the type (this is used by the blackboard_attributes table)
- type_name - A string identifier for the type (unique)
- display_name - A display name for the type (not unique, should be human readable)
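The type tables exist so that the integer IDs stored in blackboard_artifacts and blackboard_attributes can be turned into names. As an illustration, this Python sketch counts artifacts per type name by joining blackboard_artifacts to blackboard_artifact_types, assuming a SQLite database. `artifact_counts` is a hypothetical helper name.

```python
import sqlite3
from typing import Dict


def artifact_counts(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return the number of artifacts stored for each artifact type_name."""
    sql = """
        SELECT t.type_name, COUNT(*)
        FROM blackboard_artifacts AS a
        JOIN blackboard_artifact_types AS t
          ON t.artifact_type_id = a.artifact_type_id
        GROUP BY t.type_name
    """
    return dict(conn.execute(sql).fetchall())
```

Types that have no artifacts do not appear in the result because of the inner join.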
Tags
tag_names table
Defines what tag names the user has created and can therefore be applied.
- tag_name_id - Unique ID for each tag name
- display_name - Display name of tag
- description - Description (can be empty string)
- color - Color choice for tag (can be empty string)
content_tags table
One row for each file tagged.
- tag_id - unique ID
- obj_id - object id of Content that has been tagged
- tag_name_id - Tag name that was used
- comment - optional comment
- begin_byte_offset - optional byte offset into file that was tagged
- end_byte_offset - optional byte ending offset into file that was tagged
blackboard_artifact_tags table
One row for each artifact that is tagged.
- tag_id - unique ID
- artifact_id - Artifact ID of artifact that was tagged
- tag_name_id - Tag name that was used
- comment - optional comment
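To list what has been tagged, the tag rows can be joined to tag_names on tag_name_id. The following Python sketch does this for content tags, assuming a SQLite database. `content_tag_report` is a hypothetical helper name, and the same pattern would apply to blackboard_artifact_tags with artifact_id in place of obj_id.

```python
import sqlite3
from typing import List, Optional, Tuple


def content_tag_report(conn: sqlite3.Connection) -> List[Tuple[int, str, Optional[str]]]:
    """Return (obj_id, tag display name, comment) for every content tag."""
    sql = """
        SELECT c.obj_id, n.display_name, c.comment
        FROM content_tags AS c
        JOIN tag_names AS n ON n.tag_name_id = c.tag_name_id
        ORDER BY c.tag_id
    """
    return conn.execute(sql).fetchall()
```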
Communications / Accounts
These tables keep track of which accounts were found and who communicated with whom.
Currently in DRAFT form (Nov 20, 2017)
accounts
A row is created in this table for each unique account in the _case_. Each account has a type and identifier (as assigned by the account provider). For example, EMAIL is an account type and the unique identifier is jdoe@gmail.com.
- account_id - Assigned by the database
- account_type_id - Type of account
- account_unique_identifier - Unique identifier assigned by the provider.
- UNIQUE: account_type_id, account_unique_identifier
account_types
A row is created here for each account type. Module writers in the future will be able to make their own types. Some types are predefined: http://sleuthkit.org/sleuthkit/docs/jni-docs/4.4.1/enumorg_1_1sleuthkit_1_1datamodel_1_1_account_1_1_type.html.
- account_type_id - Assigned by the database
- type_name - Short, enum name for the type
- display_name - Display name (spaces, etc.)
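Because an account is identified by its type together with its provider-assigned identifier, looking one up means matching on both. The Python sketch below illustrates this by joining accounts to account_types, assuming a SQLite database. Note that these tables are described as a draft on this page, so column names may differ in practice. `find_account_id` is a hypothetical helper name.

```python
import sqlite3
from typing import Optional


def find_account_id(conn: sqlite3.Connection, type_name: str, identifier: str) -> Optional[int]:
    """Return the account_id for (type_name, identifier), or None if it is not in the case."""
    sql = """
        SELECT a.account_id
        FROM accounts AS a
        JOIN account_types AS t ON t.account_type_id = a.account_type_id
        WHERE t.type_name = ? AND a.account_unique_identifier = ?
    """
    row = conn.execute(sql, (type_name, identifier)).fetchone()
    return row[0] if row is not None else None
```

With the example from the text, `find_account_id(conn, "EMAIL", "jdoe@gmail.com")` would return the matching account_id if that account was found in the case.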
relationships
- relationship_id -
- account1_id -
- account2_id -
- relationship_source_obj_id -
- date_time -
- ....
- UNIQUE: account1_id, account2_id, relationship_source_obj_id
account_to_instances_map
- account_id
- account_instance_id
- UNIQUE: account_id, account_instance_id
Ingest Module Status
These tables keep track of which Autopsy modules were run on the data sources.
ingest_module_types table
Defines the types of ingest modules supported.
- type_id INTEGER PRIMARY KEY
- type_name TEXT NOT NULL
ingest_modules
Defines which modules were installed. One row for each module.
- ingest_module_id INTEGER PRIMARY KEY
- display_name TEXT NOT NULL
- unique_name TEXT UNIQUE NOT NULL
- type_id INTEGER NOT NULL
- version TEXT NOT NULL
ingest_job_status_types table
- type_id INTEGER PRIMARY KEY
- type_name TEXT NOT NULL
ingest_jobs
One row is created each time ingest is started. Each ingest job runs a set of modules in a pipeline.
- ingest_job_id INTEGER PRIMARY KEY
- obj_id INTEGER NOT NULL
- host_name TEXT NOT NULL
- start_date_time INTEGER NOT NULL
- end_date_time INTEGER NOT NULL
- status_id INTEGER NOT NULL
- settings_dir TEXT
ingest_job_modules
Defines the order of the modules in a given pipeline (i.e. ingest_job)
- ingest_job_id INTEGER
- ingest_module_id INTEGER
- pipeline_position INTEGER
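The pipeline for a given ingest job can be reconstructed by joining ingest_job_modules to ingest_modules and sorting by pipeline_position. A minimal Python sketch, assuming a SQLite database, follows. `pipeline_modules` is a hypothetical helper name.

```python
import sqlite3
from typing import List


def pipeline_modules(conn: sqlite3.Connection, ingest_job_id: int) -> List[str]:
    """Return the display names of the modules in one ingest job, in pipeline order."""
    sql = """
        SELECT m.display_name
        FROM ingest_job_modules AS jm
        JOIN ingest_modules AS m ON m.ingest_module_id = jm.ingest_module_id
        WHERE jm.ingest_job_id = ?
        ORDER BY jm.pipeline_position
    """
    return [row[0] for row in conn.execute(sql, (ingest_job_id,))]
```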
Indexes
parObjId
Index to speed up the process of finding parent objects.
- artifactID ON blackboard_artifacts(artifact_id)
- artifact_objID ON blackboard_artifacts(obj_id)
- attrsArtifactID ON blackboard_attributes(artifact_id)
- layout_objID ON tsk_file_layout(obj_id)
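To check which of these indexes are actually present in a given database, the SQLite catalog can be queried directly. The sketch below assumes the SQLite backend (the page refers to db_sqlite.cpp) and uses the built-in `sqlite_master` table. `list_indexes` is a hypothetical helper name.

```python
import sqlite3
from typing import List, Tuple


def list_indexes(conn: sqlite3.Connection) -> List[Tuple[str, str]]:
    """Return (index name, table name) for every explicitly created index."""
    sql = """
        SELECT name, tbl_name
        FROM sqlite_master
        WHERE type = 'index' AND sql IS NOT NULL  -- skip automatic indexes
        ORDER BY name
    """
    return conn.execute(sql).fetchall()
```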