Code Monkey home page Code Monkey logo

msoffcrypto-tool's Introduction

msoffcrypto-tool

PyPI PyPI downloads build Coverage Status Documentation Status

msoffcrypto-tool (formerly ms-offcrypto-tool) is Python tool and library for decrypting encrypted MS Office files with password, intermediate key, or private key which generated its escrow key.

Contents

Installation

pip install msoffcrypto-tool

Examples

As CLI tool (with password)

msoffcrypto-tool encrypted.docx decrypted.docx -p Passw0rd

Password is prompted if you omit the password argument value:

$ msoffcrypto-tool encrypted.docx decrypted.docx -p
Password:

Test if the file is encrypted or not (exit code 0 or 1 is returned):

msoffcrypto-tool document.doc --test -v

As library

Password and more key types are supported with library functions.

Basic usage:

import msoffcrypto

encrypted = open("encrypted.docx", "rb")
file = msoffcrypto.OfficeFile(encrypted)

file.load_key(password="Passw0rd")  # Use password

with open("decrypted.docx", "wb") as f:
    file.decrypt(f)

encrypted.close()

Basic usage (in-memory):

import msoffcrypto
import io
import pandas as pd

decrypted = io.BytesIO()

with open("encrypted.xlsx", "rb") as f:
    file = msoffcrypto.OfficeFile(f)
    file.load_key(password="Passw0rd")  # Use password
    file.decrypt(decrypted)

df = pd.read_excel(decrypted)
print(df)

Advanced usage:

# Verify password before decryption (default: False)
# The ECMA-376 Agile/Standard crypto system allows one to know whether the supplied password is correct before actually decrypting the file
# Currently, the verify_password option is only meaningful for ECMA-376 Agile/Standard Encryption
file.load_key(password="Passw0rd", verify_password=True)

# Use private key
file.load_key(private_key=open("priv.pem", "rb"))

# Use intermediate key (secretKey)
file.load_key(secret_key=binascii.unhexlify("AE8C36E68B4BB9EA46E5544A5FDB6693875B2FDE1507CBC65C8BCF99E25C2562"))

# Check the HMAC of the data payload before decryption (default: False)
# Currently, the verify_integrity option is only meaningful for ECMA-376 Agile Encryption
file.decrypt(open("decrypted.docx", "wb"), verify_integrity=True)

Supported encryption methods

MS-OFFCRYPTO specs

  • ECMA-376 (Agile Encryption/Standard Encryption)
    • MS-DOCX (OOXML) (Word 2007-2016)
    • MS-XLSX (OOXML) (Excel 2007-2016)
    • MS-PPTX (OOXML) (PowerPoint 2007-2016)
  • Office Binary Document RC4 CryptoAPI
    • MS-DOC (Word 2002, 2003, 2004)
    • MS-XLS (Excel 2002, 2003, 2004) (experimental)
    • MS-PPT (PowerPoint 2002, 2003, 2004) (partial, experimental)
  • Office Binary Document RC4
    • MS-DOC (Word 97, 98, 2000)
    • MS-XLS (Excel 97, 98, 2000) (experimental)
  • ECMA-376 (Extensible Encryption)
  • XOR Obfuscation

Other

  • Word 95 Encryption (Word 95 and prior)
  • Excel 95 Encryption (Excel 95 and prior)
  • PowerPoint 95 Encryption (PowerPoint 95 and prior)

PRs are welcome!

Tests

With coverage and pytest:

poetry install
poetry run coverage run -m pytest -v

Todo

  • Add tests
  • Support decryption with passwords
  • Support older encryption schemes
  • Add function-level tests
  • Add API documents
  • Publish to PyPI
  • Add decryption tests for various file formats
  • Integrate with more comprehensive projects handling MS Office files (such as oletools?) if possible
  • Add the password prompt mode for CLI
  • Improve error types (v4.12.0)
  • Redesign APIs (v6.0.0)
  • Introduce something like ctypes.Structure
  • Support encryption
  • Isolate parser

Resources

Alternatives

Use cases and mentions

General

Malware/maldoc analysis

CTF

In other languages

In publications

Contributors

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.