DFXML

Welcome to the Digital Forensics XML (DFXML) git repository housing the Python codebase.

Overview

DFXML is a file format designed to capture metadata and provenance information about the operation of software tools in a systematic fashion. The original motivation was to represent the output of digital forensics tools, specifically the SleuthKit tools. DFXML was later expanded to work with the bulk_extractor digital forensics tool, and then to cover the output of the tcpflow tool. With the lessons learned from handling all of those programs, we were able to separate two uses of DFXML: documenting the runtime provenance of any program, and representing specific digital forensics artifacts such as files and hash sets.
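
To make the artifact-representation use more concrete, here is a minimal sketch that parses a tiny hand-written fragment with Python's standard xml.etree.ElementTree. The fragment is an assumption made for this example: it is simplified, omits XML namespaces, has not been validated against the DFXML schema, and its element names (fileobject, filename, filesize, hashdigest) are illustrative. The helper list_files is hypothetical and does not use this repository's dfxml module.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment (assumption: no namespaces, not schema-validated).
SAMPLE = """\
<dfxml>
  <fileobject>
    <filename>docs/report.txt</filename>
    <filesize>1024</filesize>
    <hashdigest type="md5">d41d8cd98f00b204e9800998ecf8427e</hashdigest>
  </fileobject>
  <fileobject>
    <filename>images/photo.jpg</filename>
    <filesize>204800</filesize>
  </fileobject>
</dfxml>
"""


def list_files(xml_text):
    """Return (filename, size, {hash type: digest}) for each fileobject element."""
    root = ET.fromstring(xml_text)
    results = []
    for fobj in root.iter("fileobject"):
        name = fobj.findtext("filename")
        # Treat a missing filesize element as 0 for this illustration.
        size = int(fobj.findtext("filesize", default="0"))
        # Collect every hashdigest child, keyed by its "type" attribute.
        hashes = {h.get("type"): h.text for h in fobj.findall("hashdigest")}
        results.append((name, size, hashes))
    return results


if __name__ == "__main__":
    for name, size, hashes in list_files(SAMPLE):
        print(name, size, hashes)
```

Real DFXML files carry namespaces and many more elements, so for actual work the tools under dfxml/bin/ or the dfxml module itself are likely the better starting point.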

Content of the repository

This repository contains original DFXML implementations in Python for writing DFXML files, as well as an assortment of tools for reading, generating, and processing DFXML files. The folder layout is as follows:

dfxml/		- The Python DFXML module.
dfxml/bin/	- Standalone tools usable when the DFXML package is installed.
dfxml/tests/	- Unit tests for the DFXML modules.
tests/		- Unit tests for the DFXML package.
samples/	- Example .dfxml files.
schema/		- The DFXML schema. Not directly tracked; run `make schema-init` to retrieve.

Usage

Installation

To install the dfxml module for use in your own scripts, run Python's package manager pip from the directory that contains setup.py:

cd dfxml_python
pip3 install .
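
One way to confirm that the installation worked is to check whether the package can be found for import. The sketch below uses only the standard library; the helper name is_installed is hypothetical, and the module name "dfxml" is taken from the package directory listed in the folder layout.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "dfxml" is the package directory named in the folder layout.
    print("dfxml importable:", is_installed("dfxml"))
```

If this reports False, check that pip3 and the python3 interpreter you are running belong to the same environment.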

Installed utilities

Some tools are provided in this repository, under dfxml/bin/. That directory provides an overview of the tools, as well as links to documentation for command-line programs added to $PATH when the dfxml module is installed.

Using this as a git submodule

This DFXML module can be used as a submodule inside another git repository.

We've noticed that people typically start development inside the submodule and then want to push the changes back to master. This causes a problem with git: a submodule is normally checked out at a specific commit rather than at the head of a branch, so your new commits end up on a detached HEAD. If this happens to you, create a new branch at your current location, check out the master branch, merge your branch into master, and then delete the temporary branch. You can do that with this sequence of git commands:

$ git checkout -b newbranch
$ git checkout master
$ git merge newbranch
$ git branch -d newbranch

or, more succinctly:

$ git checkout -b tmp && git checkout master && git merge tmp && git branch -d tmp

Usage with the DFXML Schema

The DFXML schema is tracked here in a manner similar to a Git submodule, but without the Git submodule mechanism, which avoids some operational deployment issues. To check out the tracked schema version, run make schema-init. You only need to do this if you are testing validation of DFXML content against the schema.
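
A script that depends on the schema may want to confirm that it has been retrieved before trying to validate anything. The following is a minimal sketch under one assumption: a missing or empty schema/ directory means make schema-init has not been run yet. The helper name schema_is_checked_out is hypothetical.

```python
from pathlib import Path


def schema_is_checked_out(repo_root):
    """Return True if <repo_root>/schema exists and contains at least one entry."""
    schema_dir = Path(repo_root) / "schema"
    # Assumption: an absent or empty directory means the schema was never retrieved.
    return schema_dir.is_dir() and any(schema_dir.iterdir())


if __name__ == "__main__":
    if not schema_is_checked_out("."):
        print("schema/ is missing or empty; run `make schema-init` first")
```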

Release Notes

  • 2018-07-22 @simsong Significant redesign of the Python library.
    • Configured the Python module with a module directory and moved most of dfxml.py to __init__.py.
    • Renamed Objects.py to objects.py, since Python 3 naming conventions use only lowercase filenames.
    • Moved tests to a test/ subdirectory and redesigned most of them to work with py.test. The tests that require arguments on the python command line were not updated.
    • Removed calls to logging within files and modules that are not tests, so that using DFXML doesn't inherently start emitting logging messages.
    • Removed calls to logging in Objects tests where the only thing the test program logged was the fact that it had run. py.test provides similar logging now.

--- Simson Garfinkel, May 6, 2021

Disclaimer

Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

Contributors

ajnelson, ajnelson-nist, brucemty, dkogan, jgru, kamwoods, kieranjol, pzhur, simsong, uckelman-sf
