Difference between revisions of "ISO9660 Implementation Notes"

From SleuthKitWiki
Jump to: navigation, search
(Added link)
(Reverting to remove vandalism)
 
(6 intermediate revisions by 2 users not shown)
Line 1: Line 1:
Note: This was copied from the skins_iso9660.txt file from [[TSK]]. It will need some help to integrate it into the Wiki.
 
 
 
 
=Introduction=
 
=Introduction=
The [[ISO9660]] file system is used on many platforms and has many
+
The [[ISO9660]] file system is used on many platforms and has many variations and extensions.  At the most basic level of ISO9660 there are several differences than traditional file systems due to the type of media available.
variations and extensions.  At the most basic level of ISO9660 there
+
are several differences than traditional filesystems due to the type
+
of media available.
+
  
This document gives a quick overview of ISO9660 and how it was  
+
This document describes how it was implemented.  It assumes that you know the basics of the [[ISO9660]] format.  
implemented.
+
  
[[The Sleuth Kit]] allows one to investigate an ISO9660 image in the same
+
= General Notes =  
ways as any UNIX image, including:
+
 
+
* Creation of ASCII timeline of file activity
+
* File and directory level analysis
+
 
+
= ISO9660 Overview =
+
This provides a quick introduction to the ISO9660 file system.  The
+
terms used are different then with other file systems.  For a full
+
overview of the file system, refer to the document "Volume and File
+
Structure of CDROM for Information Interchange"
+
 
+
http://www.ecma-international.org/publications/standards/Ecma-119.htm
+
 
+
 
+
== Volume Descriptors ==
+
ISO9660 uses structures called Volume Descriptors to store information
+
about the directory hierarchy of an ISO9660 volume.  At 32768 bytes
+
into the image there is a contiguous list of volume descriptors.
+
A primary volume descriptor contains an address of a Path Table which
+
is a list of every directory on the volume.  In this path table each
+
directory record has a single run of contiguous bytes known as an
+
Extent.  Each directory's single data extent contains a group of
+
contiguous directory descriptors which represent files, directories
+
or other standard file types.
+
 
+
Primary volume descriptors only allow uppercase filenames in the
+
8.3 format (8 chars dot 3 chars).
+
 
+
Supplementary volume descriptors are very similar to primary volume
+
descriptors.  The main difference is that supplementary volume
+
descriptors store filenames as UCS-2 characters and are used
+
in Microsoft Joliet extensions to allow mixed case filenames up to
+
103 characters.
+
 
+
All volume descriptors are stored at least once, with there being a
+
requirement to have only a single primary volume descriptor for an
+
image to be valid.  Supplementary volume descriptors usually contain
+
the same data as primary volume descriptors.
+
 
+
== Files ==
+
ISO9660 file are stored in an extent whose size is measured in bytes.
+
 
+
A file is considered unique if its extent address is unique.
+
 
+
== Directories==
+
Directory names are only stored in the path table of the volume
+
descriptor.  As a directory is encountered as a directory descriptor
+
inside another directory's extent, the address of its data extent
+
is examined by the ISO9660 implementation to see if we've seen this
+
directory before and figure out what its name is.
+
 
+
Directories are unusual in the way they are identified as a unique
+
inode.  If we examine the root directory using a primary volume
+
descriptor then its extent address is where on the volume the extent
+
containing the list of directory descriptors with 8.3 encoded names
+
exists.  If we examine the root directory of that same volume using
+
a supplementary volume descriptor we will find that the extent
+
address is different because these directory descriptors are UCS-2
+
encoded, even though each directory descriptor will point at the same
+
data extent for each file.
+
 
+
This last paragraph is quite complicated.  Lets simplify:
+
 
+
Imagine a CD with 3 files on it: file-1.txt, file-2.txt, file3.txt.
+
 
+
The path table in a primary volume descriptor has one directory in it
+
and its extent contains 3 directory descriptor structures with 8.3
+
uppercase encoding.  The path table in a supplementary volume
+
descriptor describing this same volume has one directory but its extent
+
is different because those 3 directory descriptor structures are
+
different than the previous 3.  The files are not considered unique
+
because their extent addresses (where their data lies) is not unique.
+
 
+
== Of Note: ==  
+
 
Due to many reports of mastering software errata, there are some
 
Due to many reports of mastering software errata, there are some
 
issues that The Sleuth Kit handles that the specifications for ISO9660
 
issues that The Sleuth Kit handles that the specifications for ISO9660
Line 92: Line 11:
 
possibility of finding more and alerts the user to this.
 
possibility of finding more and alerts the user to this.
  
Inodes don't really exist in ISO9660 so the implementation is
+
When TSK loads the file system, it chooses a secondary volume descriptor, runs through its path table, and processes each directory listed in itEach file found during this process is saved in memory and assigned a metadata address.
improvised based on anything thats extent is unique is a different
+
fileThe pseudo inode strucutre is stored in a linked list to make
+
viewing an entire image faster.
+
  
ISO9660 stores many fields as both byte orderA 32 bit number
+
After the secondary volume descriptor is loaded, the other volume descriptors are processed in the same manorNow though, we check for an entry that duplicates an entry that was added by the previous process.  We skip entries for files that duplicate the starting block and size of an existing entry. If the file is unique, it is added to the internal list and given an address. However, these files will be treated specially because they cannot be reached from the directory tree of the initially processed volume descriptor. They will be marked as being "unallocated" (because they can't be reached from the root directory) and will end up as orphan files in the $OrphanFiles directory. 
will take 8 bytes, the first 4 are little endian, the last 4 are
+
big endian.
+
  
= Using The Sleuth Kit with ISO9660 =
+
The file name code in TSK processes each directory, but needs to figure out the metadata address. Therefore, it searches the previously loaded data. It needs to match the loaded files with the current file. It does this based on the byte offset of the directory entry.  
The Sleuth Kit allows one to view all aspects of the ISO9660 structure.
+
 
+
All Sleuth Kit commands should work the same as their counterparts.
+
 
+
Note that Autopsy can automate this process for you and allows you
+
to view all attributes.
+
 
+
  http://www.sleuthkit.org/autopsy
+
  
 
= What TSK Cannot Currently Do =
 
= What TSK Cannot Currently Do =
Line 118: Line 24:
 
* High Sierra is not handled.
 
* High Sierra is not handled.
 
* Files that are stored with an interleave gap
 
* Files that are stored with an interleave gap
 
Original Version By: Wyatt Banks, Crucial Security
 

Latest revision as of 06:53, 4 February 2013

Introduction

The ISO9660 file system is used on many platforms and has many variations and extensions. At the most basic level of ISO9660 there are several differences than traditional file systems due to the type of media available.

This document describes how it was implemented. It assumes that you know the basics of the ISO9660 format.

General Notes

Due to many reports of mastering software errata, there are some issues that The Sleuth Kit handles that the specifications for ISO9660 say will never happen. The specs say that there is only one unique primary volume descriptor per volume. The Sleuth Kit handles the possibility of finding more and alerts the user to this.

When TSK loads the file system, it chooses a secondary volume descriptor, runs through its path table, and processes each directory listed in it. Each file found during this process is saved in memory and assigned a metadata address.

After the secondary volume descriptor is loaded, the other volume descriptors are processed in the same manor. Now though, we check for an entry that duplicates an entry that was added by the previous process. We skip entries for files that duplicate the starting block and size of an existing entry. If the file is unique, it is added to the internal list and given an address. However, these files will be treated specially because they cannot be reached from the directory tree of the initially processed volume descriptor. They will be marked as being "unallocated" (because they can't be reached from the root directory) and will end up as orphan files in the $OrphanFiles directory.

The file name code in TSK processes each directory, but needs to figure out the metadata address. Therefore, it searches the previously loaded data. It needs to match the loaded files with the current file. It does this based on the byte offset of the directory entry.

What TSK Cannot Currently Do

There are a few things that The Sleuth Kit is not yet able to do with ISO9660:

  • Multisessions CDs are not handled.
  • High Sierra is not handled.
  • Files that are stored with an interleave gap