Code Monkey home page Code Monkey logo

whois-oracle's Introduction

The WHOIS Oracle, forked from pythonwhois

Because it is all knowing! A WHOIS retrieval and parsing library for Python, forked from pythonwhois and updated by me.

Dependencies

None! All you need is the Python standard library.

Instructions

The manual (including install instructions) can be found in the doc/ directory. A HTML version is also viewable here.

Cache configuration

Using pythonwhois.set_persistent_cache a cache can be set. If a cache is set, whois-oracle will look there for WHOIS servers for TLD's. For domains with thin WHOIS servers, only the 'head' WHOIS server is cached, not the referral servers. Otherwise it would be impossible to get the correct information because the information for the domain might not be on that WHOIS server at all.

Cool down configuration

This feature is not useful for single lookups, but for bulk this comes in really handy. Every WHOIS server gets a certain time before it will be asked again, to prevent spamming and possibly refused connections. This can be configured by passing a configuration file to pythonwhois.set_cool_down_config. This file can contain the following to elements, but doesn't have to. [general]
default_cool_down_length : 1
This is the general part. Currently, only one variable can be defined. It is optional to do so.

[whois.eu]
cool_down_length : 10
max_requests_minute : 5
max_requests_hour : 20
max_requests_day : 50
This is how sections for specific WHOIS servers are defined. The section name is the name of the server and the section can contain the listed properties. None of them are required. Multiple WHOIS servers can be added to the configuration file.

Goals

  • 100% coverage of WHOIS formats.
  • Accurate and complete data.
  • Consistently functional parsing; constant tests to ensure the parser isn't accidentally broken.

Features

  • WHOIS data retrieval
    • Able to follow WHOIS server redirects
    • Won't get stuck on multiple-result responses from verisign-grs
  • WHOIS data parsing
    • Base information (registrar, etc.)
    • Dates/times (registration, expiry, ...)
    • Full registrant information (!)
    • Nameservers
  • Optional WHOIS data normalization
    • Attempts to intelligently reformat WHOIS data for better accuracy and (human) readability
    • Converts various abbreviation types to full locality names
      • Airport codes
      • Country names (2- and 3-letter ISO codes)
      • US states and territories
      • Canadian states and territories
      • Australian states
    • Identifies both organization and person names, and moves or reformats them where necessary
    • Identifies names where the first and last name are swapped around, and fixes them
    • Deduplicates names, even across fields, and even when they're not 100% identical
    • Recognizes common (legal) abbreviations, and ensures that they are in the correct case
  • pwhois, a simple WHOIS tool using pythonwhois
    • Easily readable output format
    • Can also output raw WHOIS data
    • ... and JSON.
  • Automated testing suite
    • Will detect and warn about any changes in parsed data compared to previous runs
    • Guarantees that previously working WHOIS parsing doesn't unintentionally break when changing code

IP range WHOIS

pythonwhois does not yet support WHOIS lookups on IP ranges (including single IPs), although this will be added at some point in the future. In the meantime, consider using ipwhois - it offers functionality and an API similar to pythonwhois, but for IPs. It also supports delegated RWhois.

Do note that ipwhois does not offer a normalization feature, and does not (yet) come with a command-line tool. Additionally, ipwhois is maintained by Philip Hane and not by me; please make sure to file bugs relating to it in the ipwhois repository, not in that of pythonwhois.

It doesn't work!

  • It doesn't work at all?
  • It doesn't parse the data for a particular domain?
  • There's an inaccuracy in parsing the data for a domain, even just a small one?

If any of those apply, don't hesitate to file an issue! The goal is 100% coverage, and we need your feedback to reach that goal.

License

This library may be used under the WTFPL - or, if you take issue with that, consider it to be under the CC0.

Data sources

This library uses a number of third-party datasets for normalization:

Be aware that the OpenFlights database in particular has potential licensing consequences; if you do not wish to be bound by these potential consequences, you may simply delete the airports.dat file from your distribution. pythonwhois will assume there is no database available, and will not perform airport code conversion (but still function correctly otherwise). This also applies to other included datasets.

Contributing

Feel free to fork and submit pull requests (to the develop branch)! If you change any parsing or normalization logic, ensure to run the full test suite before opening a pull request. Instructions for that are below.

Please note that this project uses tabs for indentation.

All commands are relative to the root directory of the repository.

Pull requests that do not include output from test.py will be rejected!

Adding new WHOIS data to the testing set

pwhois --raw thedomain.com > test/data/thedomain.com

Checking the currently parsed data (while editing the parser)

./pwhois -f test/data/thedomain.com/ .

(don't forget the dot at the end!)

Marking the current parsed data as correct for a domain

Make sure to verify (using pwhois or otherwise) that the WHOIS data for the domain is being parsed correctly, before marking it as correct!

./test.py update thedomain.com

Running all tests

./test.py run all

Testing a specific domain

./test.py run thedomain.com

Running the full test suite including support for multiple python versions

tox

Generating documentation

You need ZippyDoc (which can be installed through pip install zippydoc).

zpy2html doc/*.zpy

whois-oracle's People

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.