iocextractor's Introduction

IOCextractor

IOC (Indicator of Compromise) Extractor: a program to help extract IOCs from text files. The general goal is to speed up the process of parsing structured data (IOCs) from unstructured or semi-structured data (like case reports or security bulletins).

Compatibility and Requirements

The program is written in Python 2.7, and a binary version for Windows is provided (IOCextractor.zip).

Usage

This program helps extract indicators of compromise from a plain text file. It currently identifies MD5 hashes, IPv4 addresses, domains, URLs, and email addresses. First, when a user opens a file, the program identifies potential IOCs using regular expressions (ignoring a few obvious false positives, like IP addresses that start with 10). It tags and highlights the potential IOCs for a user to review.
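
The identification pass described above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: the patterns, the PATTERNS table, and the find_candidates name are all assumptions, and only three of the five IOC types are shown.

```python
import re

# Hypothetical patterns for illustration; the program's real expressions may differ.
PATTERNS = {
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}

def find_candidates(text):
    """Return (start, end, kind, value) tuples for candidate IOCs, ordered by position."""
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            value = match.group(0)
            # Skip an obvious false positive: addresses that start with 10.
            if kind == "ipv4" and value.startswith("10."):
                continue
            hits.append((match.start(), match.end(), kind, value))
    return sorted(hits)

sample = ("Beacon to 203.0.113.7 from 10.1.2.3, dropper "
          "d41d8cd98f00b204e9800998ecf8427e, contact admin@example.com")
for start, end, kind, value in find_candidates(sample):
    print(kind, value)
```

Keeping the start and end offsets alongside each value is what would let a reviewer see the match highlighted in place before accepting or rejecting it.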

A user can remove a tag by selecting its range of text and then either clicking the "Clear" button or right-clicking the selected text (command-click instead in Mac OS). It's also possible to remove all the tags from a large range of text, like a list of victim IP addresses, by selecting the whole range and clicking "Clear" or right-clicking. A user can add a tag by selecting a range of text and then clicking the corresponding button, for example "MD5." For any range of text that is either rejected or added, the program will search through the rest of the text to apply the same change everywhere. So if a user removes a tag from a victim IP address, the program will un-tag that IP address everywhere; it works the same for tagging a new IP address.
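
One way to picture the "apply the same change everywhere" behavior is a table of tagged ranges that is updated for every occurrence of the selected string. The sketch below assumes tags are stored as a dictionary from (start, end) offsets to a tag name; that storage model and the function names are hypothetical and are not taken from the program.

```python
import re

def occurrences(text, value):
    """Return the (start, end) span of every literal occurrence of value in text."""
    return [match.span() for match in re.finditer(re.escape(value), text)]

def add_tag(tags, text, value, kind):
    """Tag every occurrence of value with kind; tags maps (start, end) to a tag name."""
    for span in occurrences(text, value):
        tags[span] = kind

def clear_tag(tags, text, value):
    """Remove the tag from every occurrence of value, if one is present."""
    for span in occurrences(text, value):
        tags.pop(span, None)

text = "victim 192.0.2.5 contacted 198.51.100.9; victim 192.0.2.5 again"
tags = {}
add_tag(tags, text, "192.0.2.5", "IPv4")
add_tag(tags, text, "198.51.100.9", "IPv4")
clear_tag(tags, text, "192.0.2.5")  # un-tags both victim addresses at once
```

Because this matches literal substrings, clearing "192.0.2.5" would also touch the start of "192.0.2.50"; a real implementation would probably want boundary checks.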

After a user has reviewed the tagging for accuracy, the program will export a list of unique tagged IOCs. It currently exports either to the console or to a file in one of the following formats: CSV, CybOX Observables XML, or OpenIOC 1.1. It is also set up so anyone could easily add another output format for a specific application.
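
As a rough sketch of the CSV case, the export step amounts to de-duplicating the tagged values and writing one row per IOC. The column names, function names, and the (kind, value) tuple layout below are assumptions for illustration, and the example targets Python 3 text streams rather than the Python 2.7 the program is written in.

```python
import csv
import io

def unique_iocs(tagged):
    """Drop duplicate (kind, value) pairs while keeping first-seen order."""
    seen = set()
    result = []
    for kind, value in tagged:
        if (kind, value) not in seen:
            seen.add((kind, value))
            result.append((kind, value))
    return result

def write_csv(tagged, stream):
    """Write a header row and one row per unique IOC to a text stream."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["type", "value"])  # hypothetical column names
    writer.writerows(unique_iocs(tagged))

buffer = io.StringIO()
write_csv(
    [("IPv4", "203.0.113.7"),
     ("MD5", "d41d8cd98f00b204e9800998ecf8427e"),
     ("IPv4", "203.0.113.7")],
    buffer,
)
print(buffer.getvalue())
```

Another output format would follow the same shape: a function that takes the unique pairs and a stream and writes them in the target layout.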

A simple demonstration case report (DemonstrationCaseReport.txt) and a testing file (TestDocument.txt) are also provided.

Credits

iocextractor's People

Contributors

stephenbrannon, williamgibb

iocextractor's Issues

Line Breaks

Indicators that were not previously recognized can't be tagged when they are split across multiple lines.

Handle [dot] obfuscation

When URLs and IP addresses are "neutralized" using "[dot]" instead of "." or the common "[.]", IOC Extractor should still recognize them.
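
A possible approach, sketched here under the assumption that only the two spellings named above need handling, is to rewrite the neutralized dots before the regular expressions run. The refang name is hypothetical and not part of IOC Extractor.

```python
import re

# Matches "[dot]" in any letter case, and "[.]".
_NEUTRALIZED_DOT = re.compile(r"\[(?:dot|\.)\]", re.IGNORECASE)

def refang(text):
    """Replace neutralized dots with real ones so normal patterns can match."""
    return _NEUTRALIZED_DOT.sub(".", text)

print(refang("hxxp is untouched, but evil[dot]example[.]com is restored"))
print(refang("203[dot]0[.]113[DOT]7"))
```

Note that the replacement shortens the text, so any offsets found afterwards refer to the rewritten string; highlighting the original document would need a mapping back to the original positions.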

License

Does anyone know what license, if any, this code is licensed under? It is not specified under the readme or in any of the source code files.

Get only valid IPv4

Things like version numbers (e.g. 5.24.12.1335) are extracted as IP addresses.

Suggestion: use an alternative way to validate them:

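import socket

def valid_ip(address):
    """Return True if inet_aton accepts address as a four-part IPv4 address."""
    # inet_aton also accepts short forms such as "127.1", so require four parts.
    if address.count(".") != 3:
        return False
    try:
        socket.inet_aton(address)
        return True
    except socket.error:  # malformed address (socket.error is OSError in Python 3)
        return False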

display refresh issue

I'm not sure whether this is actually a bug or a problem with my Python install or usage. I'm running IOCextractor under Python 3.2 on Windows 7 x64. Whenever I highlight all or part of one of the identified IOCs and then click elsewhere in the document, the portion that I had highlighted loses its highlight color completely. Additionally, all other occurrences of that string in the document that are currently part of an identified IOC also lose their highlighting. Even when I select a complete IOC and then click the appropriate assignment box at the top to mark it as IPv4 or another IOC type, the highlighting goes away as soon as I click elsewhere in the document. I also just tested exporting, and it appears that exporting IOCs to a CSV file only exports those that are still highlighted. Any idea what's going on here?

Case Sensitivity

When extracting IOCs, the output is case sensitive. As a result, you can end up with duplicate hashes/domains/etc in your output.
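
One possible fix, sketched below, is to compare values on a lower-cased key when building the unique list. The function name is hypothetical, and the sketch assumes it is applied only to types where case carries no meaning (hashes, domains, email addresses); URL paths can be case-sensitive and are better left alone.

```python
def dedupe_ignoring_case(values):
    """Keep the first spelling of each value, treating letter case as irrelevant."""
    seen = set()
    unique = []
    for value in values:
        key = value.lower()
        if key not in seen:
            seen.add(key)
            unique.append(value)
    return unique

hashes = [
    "D41D8CD98F00B204E9800998ECF8427E",
    "d41d8cd98f00b204e9800998ecf8427e",
]
print(dedupe_ignoring_case(hashes))
```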

STIX support?

Good afternoon,

Do you have any plans to add STIX output to IOCextractor?

Thank you.

-David
