Comments (8)

UmanShahzad commented on August 25, 2024
# mmdb -> json -> mmdb
$ ipinfo mmdb export --format json GeoLite2-City.mmdb city.json
$ ipinfo mmdb import --json --no-network --size 28 --in city.json --out city.mmdb

$ ipinfo mmdb metadata city.mmdb
- Binary Format 2.0
- Database Type ipinfo city.mmdb
- IP Version    6
- Record Size   28
- Node Count    4755530
- Description
    en ipinfo city.mmdb
- Languages     en
- Build Epoch   1701062232

$ ipinfo mmdb metadata GeoLite2-City.mmdb
- Binary Format 2.0
- Database Type GeoLite2-City
- IP Version    6
- Record Size   28
- Node Count    4755629
- Description
    en GeoLite2City database
- Languages     de, en, es, fr, ja, pt-BR, ru, zh-CN
- Build Epoch   1700588285
  • Use the --size 28 flag: we use 32 by default, and the input MMDB uses 28.
  • Make sure --no-network is set.
  • The final MMDB is still exactly 824 KB larger than the input MMDB and has 99 fewer nodes.

Another attempt with --ignore-empty-values --disallow-reserved:

$ ipinfo mmdb import --json --no-network --size 28 --ignore-empty-values --disallow-reserved --in city.json --out city2.mmdb

$ ipinfo mmdb metadata city2.mmdb
- Binary Format 2.0
- Database Type ipinfo city2.mmdb
- IP Version    6
- Record Size   28
- Node Count    4755615
- Description
    en ipinfo city2.mmdb
- Languages     en
- Build Epoch   1701062800

$ ipinfo mmdb metadata GeoLite2-City.mmdb
- Binary Format 2.0
- Database Type GeoLite2-City
- IP Version    6
- Record Size   28
- Node Count    4755629
- Description
    en GeoLite2City database
- Languages     de, en, es, fr, ja, pt-BR, ru, zh-CN
- Build Epoch   1700588285
  • The file size is unchanged, but the output now has 14 fewer nodes than the input instead of 99 fewer.

Now adding in --alias-6to4:

$ ipinfo mmdb import --json --no-network --size 28 --ignore-empty-values --disallow-reserved --alias-6to4 --in city.json --out city3.mmdb

$ ipinfo mmdb metadata city3.mmdb
- Binary Format 2.0
- Database Type ipinfo city3.mmdb
- IP Version    6
- Record Size   28
- Node Count    4755630
- Description
    en ipinfo city3.mmdb
- Languages     en
- Build Epoch   1701063705

$ ipinfo mmdb metadata GeoLite2-City.mmdb
- Binary Format 2.0
- Database Type GeoLite2-City
- IP Version    6
- Record Size   28
- Node Count    4755629
- Description
    en GeoLite2City database
- Languages     de, en, es, fr, ja, pt-BR, ru, zh-CN
- Build Epoch   1700588285
  • The node count is now only 1 more than the input MMDB, instead of being in a deficit.
  • The file size increased by 4 KB.

No matter which flag combination I use, the file size barely changes. The Go MMDB writer that mmdbctl uses may simply be less efficient at deduplicating data than the writer used to produce the input MMDB.
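
A rough arithmetic check is consistent with the size gap living in the data section rather than the search tree. In the MMDB format each search-tree node holds two records, so the tree occupies node_count * record_size * 2 / 8 bytes. The sketch below plugs in the node counts reported above (the function and variable names are illustrative, not part of mmdbctl). The node-count differences amount to well under 1 KB of tree, far less than the 824 KB difference in file size.

```python
def search_tree_bytes(node_count: int, record_size_bits: int) -> int:
    """Size of an MMDB search tree in bytes: each node stores two records."""
    return node_count * record_size_bits * 2 // 8


RECORD_SIZE = 28        # bits, as reported by `mmdb metadata` for every file
INPUT_NODES = 4755629   # GeoLite2-City.mmdb

# Node counts of the three re-imported files from the metadata output above.
outputs = {
    "city.mmdb": 4755530,
    "city2.mmdb": 4755615,
    "city3.mmdb": 4755630,
}

input_tree = search_tree_bytes(INPUT_NODES, RECORD_SIZE)
print(f"GeoLite2-City.mmdb tree: {input_tree} bytes")
for name, nodes in outputs.items():
    delta = search_tree_bytes(nodes, RECORD_SIZE) - input_tree
    print(f"{name}: tree differs by {delta:+d} bytes")
```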

from mmdbctl.

UmanShahzad commented on August 25, 2024

@zuozhehao what commands were run and can you provide the 64.2MB file if possible for debugging?

But it's likely that some extra fields get added in - is the output of a read from both MMDBs exactly the same for most IPs? I'm thinking you've likely got a new network field added into the 287MB one, which can be fixed with --no-network.

zuozhehao commented on August 25, 2024

@UmanShahzad I did not modify the data. I only exported the MMDB to JSON and then imported that JSON back into an MMDB.

1. GeoLite2-City.mmdb (64.2 MB), downloaded from https://www.maxmind.com/
2. Commands:
./mmdbctl export --format json GeoLite2-City.mmdb city.json
./mmdbctl import --json --in city.json --out city.mmdb

zuozhehao commented on August 25, 2024

With --no-network, the output file city.mmdb is 69.5 MB.
What accounts for the extra 5 MB?

UmanShahzad commented on August 25, 2024

@zuozhehao can you show an example output of mmdbctl read <ip> --format json-pretty <mmdb> for each file? And also the mmdbctl metadata output?

zuozhehao commented on August 25, 2024

@UmanShahzad This is the output from the newly re-imported database:
./mmdbctl read 8.8.8.8 --format json-pretty city.mmdb

{
  "continent": {
    "code": "NA",
    "geoname_id": 6255149,
    "names": {
      "de": "Nordamerika",
      "en": "North America",
      "es": "Norteamérica",
      "fr": "Amérique du Nord",
      "ja": "北アメリカ",
      "pt-BR": "América do Norte",
      "ru": "Северная Америка",
      "zh-CN": "北美洲"
    }
  },
  "country": {
    "geoname_id": 6252001,
    "iso_code": "US",
    "names": {
      "de": "Vereinigte Staaten",
      "en": "United States",
      "es": "Estados Unidos",
      "fr": "États Unis",
      "ja": "アメリカ",
      "pt-BR": "EUA",
      "ru": "США",
      "zh-CN": "美国"
    }
  },
  "location": {
    "accuracy_radius": 1000,
    "latitude": 37.751,
    "longitude": -97.822,
    "time_zone": "America/Chicago"
  },
  "registered_country": {
    "geoname_id": 6252001,
    "iso_code": "US",
    "names": {
      "de": "Vereinigte Staaten",
      "en": "United States",
      "es": "Estados Unidos",
      "fr": "États Unis",
      "ja": "アメリカ",
      "pt-BR": "EUA",
      "ru": "США",
      "zh-CN": "美国"
    }
  }
}

./mmdbctl read 8.8.8.8 --format json-pretty GeoLite2-City.mmdb

{
  "continent": {
    "code": "NA",
    "geoname_id": 6255149,
    "names": {
      "de": "Nordamerika",
      "en": "North America",
      "es": "Norteamérica",
      "fr": "Amérique du Nord",
      "ja": "北アメリカ",
      "pt-BR": "América do Norte",
      "ru": "Северная Америка",
      "zh-CN": "北美洲"
    }
  },
  "country": {
    "geoname_id": 6252001,
    "iso_code": "US",
    "names": {
      "de": "Vereinigte Staaten",
      "en": "United States",
      "es": "Estados Unidos",
      "fr": "États Unis",
      "ja": "アメリカ",
      "pt-BR": "EUA",
      "ru": "США",
      "zh-CN": "美国"
    }
  },
  "location": {
    "accuracy_radius": 1000,
    "latitude": 37.751,
    "longitude": -97.822,
    "time_zone": "America/Chicago"
  },
  "registered_country": {
    "geoname_id": 6252001,
    "iso_code": "US",
    "names": {
      "de": "Vereinigte Staaten",
      "en": "United States",
      "es": "Estados Unidos",
      "fr": "États Unis",
      "ja": "アメリカ",
      "pt-BR": "EUA",
      "ru": "США",
      "zh-CN": "美国"
    }
  }
}
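
The two outputs above are the same for 8.8.8.8. To check a larger sample of IPs, the comparison could be automated. The sketch below assumes each `mmdbctl read` result has already been captured as text; the `records_match` helper and the abbreviated sample strings are hypothetical. It parses both results as JSON and compares them structurally, so key order and whitespace do not matter.

```python
import json


def records_match(a_text: str, b_text: str) -> bool:
    """Compare two JSON documents structurally (ignores key order and whitespace)."""
    return json.loads(a_text) == json.loads(b_text)


# Abbreviated stand-ins for the two read outputs shown above.
from_reimported = '{"country": {"iso_code": "US", "geoname_id": 6252001}}'
from_original = """
{
  "country": {
    "geoname_id": 6252001,
    "iso_code": "US"
  }
}
"""

print(records_match(from_reimported, from_original))
```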

UmanShahzad commented on August 25, 2024

@zuozhehao could you please provide either the source MMDB file or the metadata output for both MMDBs? It seems I would have to sign up and go through a procedure to obtain it myself.

The extra 5 MB could have many causes, but it is most likely due to differences in how the two MMDB writers (the Go one we use, and the one MaxMind used to produce the MMDB you originally downloaded) optimize the data section.
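
As a toy illustration of why data-section handling can matter (this is not the actual layout used by either writer): many networks often share an identical record, and a writer that stores each distinct record once and points every network at it produces a smaller data section than one that stores a copy per network. The MMDB format also allows pointers inside the data section, so sharing can happen below the record level too. The sketch uses JSON byte length as a stand-in for the real MMDB encoding, and all names and sample records are hypothetical.

```python
import json


def data_section_sizes(records):
    """Return (naive_bytes, deduplicated_bytes) for a list of per-network records.

    naive: every network gets its own copy of its record.
    deduplicated: each distinct record is stored only once.
    """
    encoded = [json.dumps(r, sort_keys=True).encode("utf-8") for r in records]
    naive = sum(len(e) for e in encoded)
    deduplicated = sum(len(e) for e in set(encoded))
    return naive, deduplicated


# Five hypothetical networks that map to only two distinct records.
us = {"country": {"iso_code": "US"}}
de = {"country": {"iso_code": "DE"}}
records = [us, us, us, de, de]

naive, dedup = data_section_sizes(records)
print(f"one copy per network: {naive} bytes; deduplicated: {dedup} bytes")
```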

zuozhehao commented on August 25, 2024

@UmanShahzad The file exceeds the attachment size limit, so I uploaded it to online drives:
send.cm GEOLITE2-CITY.MMDB
drive.google.com GEOLITE2-CITY.MMDB
