
pytaxonomies's Introduction

PyTaxonomies


A Pythonic way to work with the taxonomies defined in https://github.com/MISP/misp-taxonomies

Usage

Taxonomies and predicates are represented as immutable Python dictionaries.
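
To illustrate what "immutable dictionary" means in practice, here is a minimal sketch built on collections.abc.Mapping. The class name ReadOnlyTaxonomy and its contents are hypothetical stand-ins, not the library's own code; the point is only that lookups work like a dict while assignment is rejected.

```python
from collections.abc import Mapping


class ReadOnlyTaxonomy(Mapping):
    """Hypothetical stand-in: a dictionary-like object that cannot be modified."""

    def __init__(self, entries):
        self._entries = dict(entries)  # private copy, never exposed directly

    def __getitem__(self, key):
        return self._entries[key]

    def __iter__(self):
        return iter(self._entries)

    def __len__(self):
        return len(self._entries)


sample = ReadOnlyTaxonomy({"topic": ["finance", "ict"]})
print(list(sample.keys()))     # keys(), get(), len() and `in` come from Mapping
print(sample.get("missing"))   # get() returns None for an unknown key

try:
    sample["topic"] = []       # Mapping defines no __setitem__
except TypeError as exc:
    print("read-only:", exc)
```

Because get() returns None for unknown keys, chained calls such as get('x').get('y') should be guarded when the names come from user input.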

Installation

pip3 install git+https://github.com/MISP/PyTaxonomies

or

git clone https://github.com/MISP/PyTaxonomies
cd PyTaxonomies
git submodule init && git submodule update
python3 setup.py install

Basics

In [1]: from pytaxonomies import Taxonomies

In [2]: taxonomies = Taxonomies()

In [3]: taxonomies.version
Out[3]: '20160725'

In [4]: taxonomies.license
Out[4]: 'CC-BY'

In [5]: taxonomies.description
Out[5]: 'Manifest file of MISP taxonomies available.'

# How many taxonomies have been imported
In [6]: len(taxonomies)
Out[6]: 27

# Names of the taxonomies
In [7]: list(taxonomies.keys())
Out[7]:
['tlp',
 'eu-critical-sectors',
 'dni-ism',
 'de-vs',
 'osint',
 'ms-caro-malware',
 'open-threat',
 'circl',
 'iep',
 'euci',
 'kill-chain',
 'europol-events',
 'veris',
 'information-security-indicators',
 'estimative-language',
 'adversary',
 'europol-incident',
 'malware_classification',
 'ecsirt',
 'dhs-ciip-sectors',
 'csirt_case_classification',
 'nato',
 'fr-classif',
 'enisa',
 'misp',
 'admiralty-scale',
 'ms-caro-malware-full']

In [8]: taxonomies.get('enisa').description
Out[8]: 'The present threat taxonomy is an initial version that has been developed on the basis of available ENISA material. This material has been used as an ENISA-internal structuring aid for information collection and threat consolidation purposes. It emerged in the time period 2012-2015.'

In [9]: taxonomies.get('enisa').version
Out[9]: 201601

In [10]: taxonomies.get('enisa').name
Out[10]: 'enisa'

In [11]: list(taxonomies.get('enisa').keys())
Out[11]:
['legal',
 'outages',
 'eavesdropping-interception-hijacking',
 'nefarious-activity-abuse',
 'physical-attack',
 'failures-malfunction',
 'disaster',
 'unintentional-damage']

In [12]: list(taxonomies.get('enisa').get('physical-attack'))
Out[12]:
['fraud-by-employees',
 'theft',
 'unauthorised-physical-access-or-unauthorised-entry-to-premises',
 'theft-of-documents',
 'information-leak-or-unauthorised-sharing',
 'vandalism',
 'damage-from-the-wafare',
 'sabotage',
 'coercion-or-extortion-or-corruption',
 'theft-of-mobile-devices',
 'theft-of-fixed-hardware',
 'terrorist-attack',
 'theft-of-backups',
 'fraud']

In [13]: taxonomies.get('enisa').get('physical-attack').get('vandalism').value
Out[13]: 'vandalism'

In [14]: taxonomies.get('enisa').get('physical-attack').get('vandalism').expanded
Out[14]: 'Vandalism'

In [15]: taxonomies.get('enisa').get('physical-attack').get('vandalism').description
Out[15]: 'Act of physically damaging IT assets.'

Get machine tags

In [1]: print(taxonomies)  # or taxonomies.all_machinetags()

<display the machine tags for all the taxonomies>

In [2]: print(taxonomies.get('circl'))  # or taxonomies.get('circl').machinetags()
circl:incident-classification="vulnerability"
circl:incident-classification="malware"
circl:incident-classification="fastflux"
circl:incident-classification="system-compromise"
circl:incident-classification="sql-injection"
circl:incident-classification="scan"
circl:incident-classification="XSS"
circl:incident-classification="information-leak"
circl:incident-classification="scam"
circl:incident-classification="copyright-issue"
circl:incident-classification="denial-of-service"
circl:incident-classification="phishing"
circl:incident-classification="spam"
circl:topic="undefined"
circl:topic="industry"
circl:topic="ict"
circl:topic="finance"
circl:topic="services"
circl:topic="individual"
circl:topic="medical"

# All entries
In [3]: taxonomies.get('circl').amount_entries()
Out[3]: 28

# Number of predicates
In [4]: len(taxonomies.get('circl'))
Out[4]: 2
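
The machine tags printed above all follow the shape namespace:predicate="value". As a rough sketch, assuming that shape holds for the tags you handle, a tag can be split and rebuilt with plain Python. The helper names split_machinetag and build_machinetag are hypothetical and are not part of pytaxonomies.

```python
import re

# Assumed shape, inferred from the printed output: namespace:predicate="value"
MACHINETAG_RE = re.compile(
    r'^(?P<namespace>[^:]+):(?P<predicate>[^=]+)="(?P<value>.*)"$'
)


def split_machinetag(tag):
    """Return (namespace, predicate, value) for one machine tag string."""
    match = MACHINETAG_RE.match(tag)
    if match is None:
        raise ValueError(f'not a namespace:predicate="value" tag: {tag!r}')
    return match.group("namespace"), match.group("predicate"), match.group("value")


def build_machinetag(namespace, predicate, value):
    """Inverse of split_machinetag."""
    return '{}:{}="{}"'.format(namespace, predicate, value)


parts = split_machinetag('circl:incident-classification="phishing"')
print(parts)
print(build_machinetag(*parts))
```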

Expanded machine tag

In [10]: print(taxonomies.get('circl').machinetags_expanded())
circl:topic="Individual"
circl:topic="Services"
circl:topic="Finance"
circl:topic="Medical"
circl:topic="Industry"
circl:topic="Undefined"
circl:topic="ICT"
circl:incident-classification="Phishing"
circl:incident-classification="Malware"
circl:incident-classification="XSS"
circl:incident-classification="Copyright issue"
circl:incident-classification="Spam"
circl:incident-classification="SQL Injection"
circl:incident-classification="Scan"
circl:incident-classification="Scam"
circl:incident-classification="Vulnerability"
circl:incident-classification="Denial of Service"
circl:incident-classification="Information leak"
circl:incident-classification="Fastflux"
circl:incident-classification="System compromise"

pytaxonomies's People

Contributors

adulau, dependabot[bot], fukusuket, rafiot

pytaxonomies's Issues

Publish on PyPI

Publishing on PyPI would make installing the package and keeping it up to date easier for users.

UnicodeDecodeError on Windows (locales where UTF-8 is not the default encoding)

Hello, thank you for maintaining the tool :) I ran into an encoding error, so I am reporting it.

Describe the issue
An error occurs on Windows in locales where UTF-8 is not the default encoding.

Steps to Reproduce

pip3 install git+https://github.com/MISP/PyTaxonomies
python
>>> from pytaxonomies import Taxonomies
>>> taxonomies = Taxonomies()

Actual behavior
A UnicodeDecodeError occurs, as follows.

C:\Users\fukusuke>python
Python 3.11.4 (tags/v3.11.4:d2340ef, Jun  7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pytaxonomies import Taxonomies
>>> taxonomies = Taxonomies()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\fukusuke\AppData\Local\Programs\Python\Python311\Lib\site-packages\pytaxonomies\api.py", line 257, in __init__
    self.manifest = self.loader(manifest_path)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\fukusuke\AppData\Local\Programs\Python\Python311\Lib\site-packages\pytaxonomies\api.py", line 282, in __load_path
    return json.load(f)
           ^^^^^^^^^^^^
  File "C:\Users\fukusuke\AppData\Local\Programs\Python\Python311\Lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
                 ^^^^^^^^^
UnicodeDecodeError: 'cp932' codec can't decode byte 0x93 in position 15191: illegal multibyte sequence

Expected behavior
UnicodeDecodeError does not occur.

Environment

  • OS: Windows 11
  • Python: 3.11.4
PS C:\Users\fukusuke> [System.Text.Encoding]::Default
BodyName          : iso-2022-jp
EncodingName      : 日本語 (シフト JIS)
HeaderName        : iso-2022-jp
WebName           : shift_jis
WindowsCodePage   : 932
IsBrowserDisplay  : True
IsBrowserSave     : True
IsMailNewsDisplay : True
IsMailNewsSave    : True
IsSingleByte      : False
EncoderFallback   : System.Text.InternalEncoderBestFitFallback
DecoderFallback   : System.Text.InternalDecoderBestFitFallback
IsReadOnly        : True
CodePage          : 932
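
The traceback shows json.load(f) failing because the file was opened with the locale's default codec (cp932) rather than UTF-8. A minimal sketch of the usual remedy, assuming the JSON files are UTF-8 encoded, is to pass encoding="utf-8" to open(). The helper name load_json_utf8 and the demo file are hypothetical; this is not the patch applied to the library.

```python
import json
import os
import tempfile


def load_json_utf8(path):
    """Hypothetical helper: read JSON as UTF-8 regardless of the system locale."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Demonstration with a throwaway file that contains non-ASCII characters.
with tempfile.TemporaryDirectory() as tmp:
    manifest_path = os.path.join(tmp, "MANIFEST.json")
    with open(manifest_path, "w", encoding="utf-8") as f:
        json.dump({"description": "caf\u00e9 \u201cquoted\u201d"}, f, ensure_ascii=False)
    loaded = load_json_utf8(manifest_path)

print(ascii(loaded["description"]))
```

Without the explicit encoding argument, open() falls back to the locale's preferred encoding, which is why the same code can work on one machine and fail on another.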

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 4: ordinal not in range(128)

The following should work according to the documentation, but it does not:

from pytaxonomies import Taxonomies
taxonomies = Taxonomies()
taxonomies.all_machinetags()
Traceback (most recent call last):
  File "/Users/bla/Documents/scripts/interactiveshell.py", line 2878, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-10-55b0011c9157>", line 1, in <module>
    taxonomies.all_machinetags()
  File "/Users/bla/Documents/scripts/bla/venv3/lib/python2.7/site-packages/pytaxonomies/api.py", line 328, in all_machinetags
    return [taxonomy.machinetags() for taxonomy in self.values()]
  File "/Users/bla/Documents/scripts/bla/venv3/lib/python2.7/site-packages/pytaxonomies/api.py", line 202, in machinetags
    to_return.append('{}:{}="{}"'.format(self.name, p, k))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 4: ordinal not in range(128)
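
The paths in the traceback point to a python2.7 site-packages directory. In Python 2, calling .format on a byte-string literal with a unicode argument implicitly encodes that argument as ASCII, which fails for a character such as é. As a small sketch, assuming the code is run under Python 3 (where str is Unicode), the same format expression accepts non-ASCII text. The values below are hypothetical; the report does not say which taxonomy triggered the error.

```python
# Hypothetical values; the real name and predicate are not shown in the report.
name, predicate, value = "demo", "pr\u00e9dicat", "valeur"

# Same format expression as the line quoted in the traceback.
tag = '{}:{}="{}"'.format(name, predicate, value)
print(ascii(tag))
```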

taxonomies.get("c").machinetags_expanded() AttributeError: 'NoneType'

If you try the following:

from pytaxonomies import Taxonomies
taxonomies = Taxonomies()
taxonomies.get("c").machinetags_expanded()

you get:

  File "<ipython-input-13-21c1a4715166>", line 1, in <module>
    taxonomies.get("c").machinetags_expanded()
AttributeError: 'NoneType' object has no attribute 'machinetags_expanded'

IMHO, I would expect either all circl results (since the name starts with c), a proper error message, or a new method such as:

from pytaxonomies import Taxonomies
taxonomies = Taxonomies()
taxonomies.search("c").machinetags_expanded()

It would then return the closest match (ideally searching not only the keys but also the values).

Hope that makes sense.
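
A rough sketch of the two behaviours requested here, written against a plain dict so it runs without the library: a prefix search over the keys, and a lookup that raises a descriptive error instead of returning None. The names search_taxonomies and get_or_raise are hypothetical and do not exist in pytaxonomies; searching the values as well is left out.

```python
def search_taxonomies(taxonomies, query):
    """Hypothetical helper: names in `taxonomies` that start with `query`."""
    query = query.lower()
    return sorted(name for name in taxonomies if name.lower().startswith(query))


def get_or_raise(taxonomies, name):
    """Like .get(), but fail with a descriptive message instead of returning None."""
    taxonomy = taxonomies.get(name)
    if taxonomy is None:
        hints = search_taxonomies(taxonomies, name)
        raise KeyError(f"unknown taxonomy {name!r}; names starting with it: {hints}")
    return taxonomy


# Stand-in data: a plain dict keyed by a few taxonomy names from the README.
sample = {"circl": object(), "csirt_case_classification": object(), "tlp": object()}

print(search_taxonomies(sample, "c"))

try:
    get_or_raise(sample, "c")
except KeyError as exc:
    print(exc)
```

Since the library's objects are described as dictionaries, the same functions should in principle accept a Taxonomies instance, but that has not been verified here.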
