dfimagetools's People

Contributors

joachimmetz

dfimagetools's Issues

enhance bodyfile output

  • add signature - #57
  • extend character escaping for special Unicode and non-Unicode characters - #77
  • add time zone information for FAT and equivalent, e.g. "# time zone: Europe/Amsterdam"

enhance artifact filters

  • add Windows Registry support
  • determine what the desired behavior of path artifact type should be
    • all file entries that glob resolves to or only directories?
  • determine what the desired behavior of file artifact type should be
    • all file entries that glob resolves to or only files?

Add script to create extent/block map

Create a script to build a map of image extent/block offsets to file path + data stream names. Write map to a SQLite database.

  • support file system/volume offset of data streams / file content
  • support image offset
  • support inline/resident data? - extents API not designed for this purpose
  • support file system metadata? - extents API not designed for this purpose
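
A minimal sketch of what such a map could look like in SQLite, assuming one row per extent. The table name, column names and helper functions below are hypothetical and not part of dfimagetools; inline/resident data and file system metadata are not covered.

```python
import sqlite3

# Hypothetical schema: one row per extent of a data stream.
SCHEMA = """
CREATE TABLE IF NOT EXISTS extents (
    image_offset INTEGER NOT NULL,
    size INTEGER NOT NULL,
    path TEXT NOT NULL,
    data_stream TEXT NOT NULL DEFAULT ''
);
CREATE INDEX IF NOT EXISTS extents_offset ON extents (image_offset);
"""


def create_map(database_path=":memory:"):
    """Opens (or creates) the map database and ensures the schema exists."""
    connection = sqlite3.connect(database_path)
    connection.executescript(SCHEMA)
    return connection


def add_extent(connection, image_offset, size, path, data_stream=""):
    """Records that [image_offset, image_offset + size) belongs to path."""
    connection.execute(
        "INSERT INTO extents (image_offset, size, path, data_stream) "
        "VALUES (?, ?, ?, ?)",
        (image_offset, size, path, data_stream))


def lookup(connection, offset):
    """Returns (path, data_stream) tuples of the extents containing offset."""
    cursor = connection.execute(
        "SELECT path, data_stream FROM extents "
        "WHERE image_offset <= ? AND ? < image_offset + size "
        "ORDER BY image_offset",
        (offset, offset))
    return cursor.fetchall()
```

Storing the image offset directly keeps lookups to a single range query; a file system or volume relative offset could be added as an extra column if both are needed.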

Depends on:

Create script to create bodyfile of allocated file entries

Per google/turbinia#959 there is a need for a script that can create a bodyfile of allocated file entries. There is https://github.com/open-source-dfir/dfvfs-snippets/blob/main/scripts/list_file_entries.py; it should be relatively straightforward to create a similar script and extend it with the necessary features.

  • add list entries script #3
  • add unit tests #4
  • add functionality to output to bodyfile format - #7
  • add support for fraction of a second - #8
  • add documentation about bodyfile format used - #12
    • requires #9
  • add support for data streams - #13
  • add support for NTFS $FILE_NAME attributes - #15
  • find alternative solution to correct rounding errors due to float - #16
  • add support for symbolic link - #17
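
One possible way to avoid float rounding for fractions of a second is to keep the timestamp as an integer and split it with integer division. The sketch below assumes an NTFS FILETIME value (100-nanosecond intervals since 1601-01-01) and a 7-digit fraction; the function name and output layout are illustrative, not the approach actually taken in #16. A double cannot represent every value of this magnitude exactly, which is where the rounding errors come from.

```python
# Seconds between 1601-01-01 (FILETIME epoch) and 1970-01-01 (POSIX epoch).
FILETIME_EPOCH_DELTA = 11644473600
# A FILETIME tick is 100 nanoseconds.
TICKS_PER_SECOND = 10_000_000


def filetime_to_bodyfile_timestamp(filetime):
    """Formats a FILETIME integer as POSIX seconds with a 7-digit fraction.

    Only integer arithmetic is used, so no precision is lost.
    """
    ticks = filetime - FILETIME_EPOCH_DELTA * TICKS_PER_SECOND
    sign = "-" if ticks < 0 else ""
    seconds, fraction = divmod(abs(ticks), TICKS_PER_SECOND)
    return f"{sign}{seconds}.{fraction:07d}"


print(filetime_to_bodyfile_timestamp(132854688001234567))
# prints 1640995200.1234567
```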

list_file_entries script not accepting 'all' partitions

Package: dfimagetools-tools
Version: 20211228-1ppa1~focal
# /usr/bin/list_file_entries.py /dev/sdb
The following partitions were found:

Identifier      Offset (in bytes)       Size (in bytes)
p1              116391936 (0x06f00000)  119.9GiB / 128.7GB (128732610048 B)
p14             1048576 (0x00100000)    4.0MiB / 4.2MB (4194304 B)
p15             5242880 (0x00500000)    106.0MiB / 111.1MB (111149056 B)

Please specify the identifier of the partition that should be
processed. All partitions can be defined as: "all". Note that you can
abort with Ctrl^C.

Partition identifier(s): all

Unsupported partition identifier(s), please try again or abort with
Ctrl^C.


Please specify the identifier of the partition that should be
processed. All partitions can be defined as: "all". Note that you can
abort with Ctrl^C.

Partition identifier(s): 

recursive_hasher: enhancements

Originally from: open-source-dfir/dfvfs-snippets#42

  • add CLI options for selecting partitions
  • print volume (VSS)
  • skip /␀␀␀␀HFS+ Private Data
/␀␀␀␀HFS+ Private Data contains the file content for hard links, which by default is already hashed. Consider adding an option to have the recursive hasher skip this directory. Maybe also add an option to allow hashing file system metadata file entries?
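
A rough sketch of how such a skip could be decided, assuming paths are plain strings with "/" as segment separator. The constant, function name and flag are hypothetical and do not exist in the recursive hasher.

```python
# The directory name starts with four NUL characters.
HFS_PRIVATE_DATA = "/\x00\x00\x00\x00HFS+ Private Data"


def should_skip(path, skip_hfs_private_data=True):
    """Returns True if path is the HFS+ private data directory or below it."""
    if not skip_hfs_private_data:
        return False
    return path == HFS_PRIVATE_DATA or path.startswith(HFS_PRIVATE_DATA + "/")
```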

Extend list_file_entries script

for bodyfile generation

  • to consider: add support for APFS, GPT and LVM volume identifier in path - #21
  • improve handling of ($FILE_NAME): ensure parent file entry matches - #23
  • add support for NTFS mode_as_string value - #24
  • add d, l file type indicators in the mode_as_string field for NTFS - #35
  • add CLI options to handle encrypted volumes - #94
  • to consider: add resource fork for HFS
  • improve handling of NTFS DOS file name, either print DOS name or skip? Maybe add an option to control this behavior?
    • extend dfvfs.FileNameNTFSAttribute with a name space attribute
  • to consider: add support for NTFS UID and GID value
  • to consider: add support for NTFS index name
  • to consider: add support to calculate MD5
  • print VSS volume names (from open-source-dfir/dfvfs-snippets#41)

Move Plaso image_export to imagetools

As part of the effort to make parts of Plaso more reusable, move Plaso image_export to the imagetools project

  • determine if artifact filter and pre-processing logic should be moved to a separate project
  • filter based on artifact definitions
  • filter based on single path - #98
  • filter based on find specs (use YAML filter file?)
    • date and time range (image export --date-filter)
    • filename (image export --names)
    • filename extension (image export --extensions)
  • filter based on (content) signatures (image export --signatures)
  • generate hashes.json output

bodyfile: extend character escaping for special Unicode and non-Unicode characters

Certain file systems allow characters that either have a special meaning in Unicode, such as U+d800, or are not Unicode characters at all.

The extended bodyfile 3 format currently does not specify how to handle these characters. The proposal is to escape such characters as "\u####" and "\U########", preferring the short form over the long form where possible.

  • Control characters U+1-U+8, U+B-U+C, U+E-U+1F, U+7F-U+84, U+86-U+9F (already covered)
  • Unicode surrogate characters U+d800-U+dfff - #78
  • Undefined Unicode characters - #95
    • U+FDD0-U+FDDF
    • U+fffe-U+ffff
    • U+1FFFE-U+1FFFF
    • U+2FFFE-U+2FFFF
    • U+3FFFE-U+3FFFF
    • U+4FFFE-U+4FFFF
    • U+5FFFE-U+5FFFF
    • U+6FFFE-U+6FFFF
    • U+7FFFE-U+7FFFF
    • U+8FFFE-U+8FFFF
    • U+9FFFE-U+9FFFF
    • U+AFFFE-U+AFFFF
    • U+BFFFE-U+BFFFF
    • U+CFFFE-U+CFFFF
    • U+DFFFE-U+DFFFF
    • U+EFFFE-U+EFFFF
    • U+FFFFE-U+FFFFF
    • U+10FFFE-U+10FFFF
  • Other values observed to be not printable - #95
    • U+2028, U+2029, U+E000, U+F8FF, U+F0000, U+FFFFD, U+100000, U+10FFFD
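
The proposal above could be sketched roughly as follows. The ranges are copied from the list as written; the function names and the use of lowercase hexadecimal digits are assumptions, and escaping of the backslash itself or of path segment separators is outside this sketch.

```python
# Inclusive code point ranges taken from the list above.
_ESCAPE_RANGES = (
    (0x01, 0x08), (0x0B, 0x0C), (0x0E, 0x1F), (0x7F, 0x84), (0x86, 0x9F),
    (0xD800, 0xDFFF),  # surrogates
    (0xFDD0, 0xFDDF),  # as listed in the issue
)

# Individual values observed to be not printable.
_ESCAPE_VALUES = frozenset((
    0x2028, 0x2029, 0xE000, 0xF8FF, 0xF0000, 0xFFFFD, 0x100000, 0x10FFFD))


def needs_escaping(code_point):
    """Returns True if the code point is on the list of values to escape."""
    # The last two code points of every plane: U+FFFE-U+FFFF, U+1FFFE-U+1FFFF, ...
    if code_point & 0xFFFE == 0xFFFE:
        return True
    if code_point in _ESCAPE_VALUES:
        return True
    return any(first <= code_point <= last for first, last in _ESCAPE_RANGES)


def escape_path(path):
    """Escapes listed characters, preferring \\u#### over \\U########."""
    parts = []
    for character in path:
        code_point = ord(character)
        if not needs_escaping(code_point):
            parts.append(character)
        elif code_point <= 0xFFFF:
            parts.append(f"\\u{code_point:04x}")
        else:
            parts.append(f"\\U{code_point:08x}")
    return "".join(parts)
```

The short form is only used when the code point fits in 4 hexadecimal digits, which matches the stated preference for "\u####" where possible.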

Open questions

  • What about "Unicode compatibility characters"?
  • What about U+110000-U+ffffffff?
  • What if the original path uses a specific codepage (encoding) and is converted to Unicode, but the resulting string can be encoded back into multiple variations in the original encoding, e.g. encoding U+2252 to cp932? What if there are 2 paths that decode to the same string? How should the original path be best preserved?
  • What if a filename contains a path segment separator (e.g. \ or /)? If not escaped, this leads to ambiguity, e.g. if / is the path segment separator, is 'test/1234' a single file name or a path?

A related discussion: dfxml-working-group/dfxml_schema#34

Also consider whether the format should be extended with a header to specify its encoding.
