mmdb-from-go-blogpost's Introduction

Enriching MMDB files with your own data using Go

MaxMind DB (or MMDB) files facilitate the storage and retrieval of data in connection with IP addresses and IP address ranges, making queries for such data very fast and easy to perform. While MMDB files are usable on a variety of platforms and in a number of different programming languages, this article will focus on building MMDB files using the Go programming language.

MaxMind offers several prebuilt MMDB files, like the free GeoLite2 Country MMDB file. For many situations these MMDB files are useful enough as is. If, however, you have your own data associated with IP address ranges, you can create hybrid MMDB files, enriching existing MMDB contents with your own data. In this article, we're going to add details about a fictional company's IP address ranges to the GeoLite2 Country MMDB file. We'll be building a new MMDB file, one that contains both MaxMind's and our fictional company's data.

If you don't need any of the MaxMind data, but you still want to create a fast, easy-to-query database keyed on IP addresses and IP address ranges, you can consult this example code showing how to create an MMDB file from scratch.

Prerequisites

  • you must have git installed in order to clone the code and install the dependencies, and it must be in your $PATH
  • Go 1.14 or later must be installed, and go must be in your $PATH
  • the mmdbinspect tool must be installed and be in your $PATH
  • a copy of the GeoLite2 Country database must be in your working directory
  • your working directory (which can be located under any parent directory) must be named mmdb-from-go-blogpost (if you clone the code using the instructions below, this directory will be created for you)
  • a basic understanding of Go and of IP addresses and CIDR notation will be helpful, but allowances have been made for the intrepid explorer for whom these concepts are novel!

Using Docker or Vagrant

The code repository comes with a Dockerfile and a Vagrantfile included. If you'd like to begin work in an environment which has all of the necessary software dependencies pre-installed, see our documentation for getting started with Docker and Vagrant.

AcmeCorp's data

For the purposes of this tutorial, I have mocked up some data for a fictional company, AcmeCorp. This method can be adapted for your own real data, as long as that data maps to IP addresses or IP address ranges.

AcmeCorp has three departments:

  • SRE, whose IP addresses come from the 56.0.0.0/16 range,
  • Development, whose IP addresses come from the 56.1.0.0/16 range, and
  • Management, whose IP addresses come from the 56.2.0.0/16 range.

Members of the SRE department have access to all three of AcmeCorp's environments: development, staging, and production. Members of the Development and Management departments have access to the development and staging environments (but not to production).
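
To make the mapping above concrete, here is a minimal standard-library sketch that looks up which AcmeCorp department (if any) an IP address belongs to. It does not touch any MMDB file; the department type and the deptFor helper are hypothetical names introduced purely for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// department pairs a CIDR range with the AcmeCorp data described above.
// This type is a hypothetical helper, not part of any MaxMind library.
type department struct {
	cidr         string
	name         string
	environments []string
}

var departments = []department{
	{"56.0.0.0/16", "SRE", []string{"development", "staging", "production"}},
	{"56.1.0.0/16", "Development", []string{"development", "staging"}},
	{"56.2.0.0/16", "Management", []string{"development", "staging"}},
}

// deptFor returns the department whose range contains ip.
// The boolean is false when ip is invalid or falls outside every range.
func deptFor(ip string) (department, bool) {
	addr := net.ParseIP(ip)
	if addr == nil {
		return department{}, false
	}
	for _, d := range departments {
		_, ipNet, err := net.ParseCIDR(d.cidr)
		if err != nil {
			continue
		}
		if ipNet.Contains(addr) {
			return d, true
		}
	}
	return department{}, false
}

func main() {
	for _, ip := range []string{"56.0.0.1", "56.1.200.7", "56.2.0.54", "56.3.0.1"} {
		if d, ok := deptFor(ip); ok {
			fmt.Printf("%s: %s %v\n", ip, d.name, d.environments)
		} else {
			fmt.Printf("%s: no AcmeCorp department\n", ip)
		}
	}
}
```

A linear scan like this is fine for three ranges; the point of an MMDB file is that the same kind of lookup stays fast when there are millions of ranges.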

Each valid record in GeoLite2 Country is a map, containing data about the country associated with the IP address.

For each of the AcmeCorp ranges, we're going to add to the existing data the AcmeCorp.Environments and AcmeCorp.DeptName keys. More on this later.

The steps we're going to take

We're going to write some Go code that makes use of the MaxMind mmdbwriter Go module to:

  1. Load the GeoLite2 Country MaxMind DB.
  2. Add our own internal department data to the appropriate IP address ranges.
  3. Write the enriched database to a new MMDB file.
  4. Look up the new data in the enriched database to confirm our additions.
    • We will use the mmdbinspect tool to see our new data in the MMDB file we've built and compare a few ranges in it to those in the old GeoLite2 Country MMDB file.

The full code is presented in the next section. Let's dive in!

The code, explained

The repo for this tutorial is available on GitHub. You can clone it locally and cd into the repo dir by running the following in a terminal window:

me@myhost:~/dev $ git clone https://github.com/maxmind/mmdb-from-go-blogpost.git
me@myhost:~/dev $ cd mmdb-from-go-blogpost

Now I’m going to break down the contents of main.go from the repo, the code that will perform steps 1-3 of the tutorial. If you prefer to read the code directly, you can skip to the next section.

Every executable Go program starts with package main, indicating that this file will contain a main function, which is where our program's execution begins. This program is no exception.

package main

Most programs have a list of imported packages next. In our case, the list includes some packages from the standard library: log, which we use for reporting errors; net, for the net.ParseCIDR function and the net.IPNet type, which we use when inserting new data into the MMDB tree; and os, which we use when creating the file into which we will write the MMDB tree. We also import some packages from MaxMind's mmdbwriter repo, which are designed specifically for building MMDB files and for working with MMDB trees; you'll see how we use those below.

import (
	"log"
	"net"
	"os"

	"github.com/maxmind/mmdbwriter"
	"github.com/maxmind/mmdbwriter/inserter"
	"github.com/maxmind/mmdbwriter/mmdbtype"
)

Now we're at the start of the program execution. We begin by loading the existing database, GeoLite2-Country.mmdb, that we're going to enrich.

func main() {
	// Load the database we wish to enrich.
	writer, err := mmdbwriter.Load("GeoLite2-Country.mmdb", mmdbwriter.Options{})
	if err != nil {
		log.Fatal(err)
	}

Having loaded the existing GeoLite2 Country database, we begin defining the data we wish to enrich it with. The second return value of the net.ParseCIDR() function is of type *net.IPNet, which is what we need for the first parameter of our upcoming writer.InsertFunc() call, so we use net.ParseCIDR() to go from the string-literal CIDR form "56.0.0.0/16" to the desired *net.IPNet.

	// Define and insert the new data.
	_, sreNet, err := net.ParseCIDR("56.0.0.0/16")
	if err != nil {
		log.Fatal(err)
	}
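
As a standalone aside (separate from main.go), the following sketch shows what net.ParseCIDR() hands back. The first return value is the address exactly as written, while the second is the network with the host bits masked off, which is why the tutorial discards the first value. The describeCIDR helper is a hypothetical name used only for this illustration.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

// describeCIDR parses a CIDR string and reports the address as written,
// the network it belongs to, and the prefix length.
func describeCIDR(cidr string) (string, string, int, error) {
	ip, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", "", 0, err
	}
	// Mask.Size returns the number of leading ones and the total bit count.
	ones, _ := ipNet.Mask.Size()
	return ip.String(), ipNet.String(), ones, nil
}

func main() {
	addr, network, prefixLen, err := describeCIDR("56.0.12.34/16")
	if err != nil {
		log.Fatal(err)
	}
	// The host bits (12.34) survive in addr but are zeroed in network.
	fmt.Println(addr, network, prefixLen)
}
```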

sreData is the data we will be merging into the existing records for the SRE range. We must define this data in terms of the mmdbtype.DataType interface. mmdbwriter uses this interface to determine the data type to associate with the data when inserting it into the database.

As the existing GeoLite2 Country records are maps, we use a mmdbtype.Map as the top level data structure. This map contains our two new keys, AcmeCorp.DeptName and AcmeCorp.Environments.

AcmeCorp.DeptName is an mmdbtype.String containing the name of the department for the IP address range.

AcmeCorp.Environments is an mmdbtype.Slice. A slice contains an ordered list of values. In this case, it is a list of the environments that the IP address range is allowed to access. These environments are represented as mmdbtype.String values.

[An aside: If you look at the output of running the mmdbinspect -db GeoLite2-Country.mmdb 56.0.0.1 command in your terminal, examining the $.[0].Records[0].Record JSONPath (i.e. the sole record, stripped of its wrappers), then you'll see that it is a JSON Object, which as expected corresponds to the mmdbtype.Map type.]

	sreData := mmdbtype.Map{
		"AcmeCorp.DeptName": mmdbtype.String("SRE"),
		"AcmeCorp.Environments": mmdbtype.Slice{
			mmdbtype.String("development"),
			mmdbtype.String("staging"),
			mmdbtype.String("production"),
		},
	}

Now that we've got our data, we insert it into the MMDB using InsertFunc. We use InsertFunc instead of Insert as it allows us to pass in an inserter function that will merge our new data with any existing data.

In this case, we are using the inserter.TopLevelMergeWith function. This updates the existing map with the keys from our new map.

After inserting, our MMDB tree will have the AcmeCorp SRE IP addresses in the 56.0.0.0/16 range, whose maps contain the new environment and department name keys in addition to whatever GeoLite2 Country data they returned previously. (Note that we carefully picked non-clashing, top-level keys; no key in the GeoLite2 Country data starts with AcmeCorp.)

What happens if there is an IP address for which no record exists? With the inserter.TopLevelMergeWith strategy, that IP address simply takes our new top-level keys as its record.

	if err := writer.InsertFunc(sreNet, inserter.TopLevelMergeWith(sreData)); err != nil {
		log.Fatal(err)
	}
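
To illustrate the merge behavior described above, here is a plain-Go model of a top-level merge. This is not mmdbwriter's implementation: it uses ordinary maps instead of mmdbtype values, the topLevelMerge name is hypothetical, and the "country" record is a made-up placeholder rather than real GeoLite2 data. In this sketch a clashing key would be overwritten by the new value, which is an assumption of the sketch and one more reason to pick non-clashing keys.

```go
package main

import "fmt"

// topLevelMerge models the merge described above with plain maps:
// existing top-level keys are kept, new top-level keys are added, and a
// missing (nil) existing record simply yields the new keys.
// The inputs are not modified.
func topLevelMerge(existing, addition map[string]interface{}) map[string]interface{} {
	merged := make(map[string]interface{}, len(existing)+len(addition))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range addition {
		merged[k] = v // sketch assumption: new value wins on a clash
	}
	return merged
}

func main() {
	// Placeholder stand-in for an existing record; not real GeoLite2 data.
	existing := map[string]interface{}{
		"country": map[string]interface{}{"iso_code": "XX"},
	}
	addition := map[string]interface{}{
		"AcmeCorp.DeptName":     "SRE",
		"AcmeCorp.Environments": []string{"development", "staging", "production"},
	}

	fmt.Println(topLevelMerge(existing, addition)) // record existed
	fmt.Println(topLevelMerge(nil, addition))      // no record existed
}
```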

We repeat the process for the Development and Management departments, taking care to update the range itself, the list of environments, and the department name as we go.

	_, devNet, err := net.ParseCIDR("56.1.0.0/16")
	if err != nil {
		log.Fatal(err)
	}
	devData := mmdbtype.Map{
		"AcmeCorp.DeptName": mmdbtype.String("Development"),
		"AcmeCorp.Environments": mmdbtype.Slice{
			mmdbtype.String("development"),
			mmdbtype.String("staging"),
		},
	}
	if err := writer.InsertFunc(devNet, inserter.TopLevelMergeWith(devData)); err != nil {
		log.Fatal(err)
	}

	_, mgmtNet, err := net.ParseCIDR("56.2.0.0/16")
	if err != nil {
		log.Fatal(err)
	}
	mgmtData := mmdbtype.Map{
		"AcmeCorp.DeptName": mmdbtype.String("Management"),
		"AcmeCorp.Environments": mmdbtype.Slice{
			mmdbtype.String("development"),
			mmdbtype.String("staging"),
		},
	}
	if err := writer.InsertFunc(mgmtNet, inserter.TopLevelMergeWith(mgmtData)); err != nil {
		log.Fatal(err)
	}

Finally we write the new database to disk.

	// Write the newly enriched DB to the filesystem.
	fh, err := os.Create("GeoLite2-Country-with-Department-Data.mmdb")
	if err != nil {
		log.Fatal(err)
	}
	_, err = writer.WriteTo(fh)
	if err != nil {
		log.Fatal(err)
	}
}

Building the code and running it

So that's our code! Now we build the program and run it. On my 2015-model laptop it takes under 10 seconds to run.

me@myhost:~/dev/mmdb-from-go-blogpost $ go build
me@myhost:~/dev/mmdb-from-go-blogpost $ ./mmdb-from-go-blogpost 

This will have built the enriched database. Finally, we will compare some IP address and range queries on the original and enriched database using the mmdbinspect tool.

me@myhost:~/dev/mmdb-from-go-blogpost $ mmdbinspect -db GeoLite2-Country.mmdb -db GeoLite2-Country-with-Department-Data.mmdb 56.0.0.1 56.1.0.0/24 56.2.0.54 56.3.0.1 | less

The output from this command, elided here for brevity, shows us that the AcmeCorp.Environments and AcmeCorp.DeptName keys are not present in the original MMDB file at all and that they are present in the enriched MMDB file when expected. The 56.3.0.1 IP address remains identical across both databases (without any AcmeCorp fields) as a control.
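
If you would rather check the result programmatically than page through it, a sketch like the one below could decode the JSON and test for the two new keys. It assumes the output shape implied by the JSONPath mentioned earlier (a top-level array whose elements have a Records array, each entry holding a Record object); your mmdbinspect version may differ. The sampleOutput string is hand-written for illustration and is not real tool output, and hasAcmeCorpKeys is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// lookupResult mirrors only the fields named in the JSONPath
// $.[0].Records[0].Record; all other fields are ignored.
type lookupResult struct {
	Records []struct {
		Record map[string]interface{}
	}
}

// Hand-written sample in the assumed shape; not real mmdbinspect output.
const sampleOutput = `[{"Records":[{"Record":{
	"AcmeCorp.DeptName":"SRE",
	"AcmeCorp.Environments":["development","staging","production"]
}}]}]`

// hasAcmeCorpKeys reports whether the first record of the first result
// carries both AcmeCorp keys.
func hasAcmeCorpKeys(output []byte) (bool, error) {
	var results []lookupResult
	if err := json.Unmarshal(output, &results); err != nil {
		return false, err
	}
	if len(results) == 0 || len(results[0].Records) == 0 {
		return false, nil
	}
	rec := results[0].Records[0].Record
	_, hasDept := rec["AcmeCorp.DeptName"]
	_, hasEnvs := rec["AcmeCorp.Environments"]
	return hasDept && hasEnvs, nil
}

func main() {
	ok, err := hasAcmeCorpKeys([]byte(sampleOutput))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("AcmeCorp keys present:", ok)
}
```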

And that's it! You've now built yourself a GeoLite2 Country MMDB file enriched with custom data.

Contacting Us

Feel free to open an issue in the repo if you have any questions or just want to tell us what you've created.

Copyright and License

This software is Copyright (c) 2020-2023 by MaxMind, Inc.

This is free software, licensed under the Apache License, Version 2.0 or the MIT License, at your option.

mmdb-from-go-blogpost's People

Contributors

dependabot-preview[bot], dependabot[bot], faktas2, nchelluri, oalders, oschwald, rafl, ugexe

mmdb-from-go-blogpost's Issues

show custom geoip fields

geoip {
	source => "client_ip"
	database => "D:\elk\conf\geoIP\go\mmdb-from-go-blogpost\GeoLite2-Country-with-Department-Data.mmdb"
	fields => ["city_name", "continent_code", "country_code2", "country_code3", "country_name", "dma_code", "ip", "latitude", "longitude", "postal_code", "region_name", "timezone", "AcmeCorp.DeptName"]
}

How can I show the [AcmeCorp.DeptName] field in geoip with the Logstash plugin?
Could you give me some help?

Incorrect result after import from CSV file

Hi,

I have created a custom MMDB file from the 'GeoIP2-ISP-CSV_20230210.zip' file using the following code...
When I query the network address, the response does not match.

For Example, Query: 1.0.4.0/24, Response: 1.0.4.0/22. (former is from CSV, and the latter is from custom MMDB)
But querying GeoIP2-ISP.mmdb gives the correct response.

package main

import (
"encoding/csv"
"fmt"
"io"
"log"
"net"
"os"

"github.com/maxmind/mmdbwriter"
"github.com/maxmind/mmdbwriter/mmdbtype"

)

func main() {
writer, err := mmdbwriter.New(
mmdbwriter.Options{
DatabaseType: "IP",
Description: map[string]string{"en": "IP Database"},
IncludeReservedNetworks: true,
Languages: []string{"en"},
RecordSize: 24,
},
)

if err != nil {
	log.Fatal(err)
}

for _, file := range []string{"GeoIP2-ISP-Blocks-IPv4.csv"} {
	fh, err := os.Open(file)

	if err != nil {
		log.Fatal(err)
	}

	r := csv.NewReader(fh)

	r.Read() // skip header

	for {
		row, err := r.Read()

		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}

		if len(row) != 7 {
			log.Fatalf("unexpected CSV rows: %v", row)
		}

		_, network, err := net.ParseCIDR(row[0])

		if err != nil {
			log.Fatal(err)
		}

		if row[2] != "" {
			record := mmdbtype.Map{}
			record["org"] = mmdbtype.String(row[2])

			err = writer.Insert(network, record)

			if err != nil {
				log.Fatal(err)
			}
		}
	}
}

fh, err := os.Create("IP.db")

if err != nil {
	log.Fatal(err)
}

_, err = writer.WriteTo(fh)

if err != nil {
	log.Fatal(err)
}

fmt.Println("Success!")

}

How to insert custom region data with ipaddr

I want to insert some ips into a new mmdb file.
The code for inserting custom data:

writer, err := mmdbwriter.New(
	mmdbwriter.Options{
		DatabaseType: "GeoLite2-City",
		RecordSize:   24,
		Languages: []string{"en"},
	},
)
if err != nil {
	return err
}

nameId := 0
for i:=0; i<len(customData); i++ {
		record := mmdbtype.Map{}
		record["country"] = mmdbtype.Map{"geoname_id": mmdbtype.Uint64(nameId),"names":mmdbtype.Map{"en": mmdbtype.String(customData[i].Country)}}
		nameId++

		record["province"] = mmdbtype.Map{"geoname_id": mmdbtype.Uint64(nameId),"names":mmdbtype.Map{"en": mmdbtype.String(customData[i].Province)}}
		nameId++

		record["city"] = mmdbtype.Map{"geoname_id": mmdbtype.Uint64(nameId),"names":mmdbtype.Map{"en": mmdbtype.String(customData[i].City)}}
		nameId++

		record["county"] = mmdbtype.Map{"geoname_id": mmdbtype.Uint64(nameId),"names":mmdbtype.Map{"en": mmdbtype.String(customData[i].County)}}
		nameId++

		tmpIp := net.ParseIP(customData[i].ipVal)
		writer.Insert(&net.IPNet{IP: tmpIp, Mask: tmpIp.DefaultMask()}, record)
}
fh, err := os.Create("out/out.mmdb")
if err != nil {
	log.Fatal(err)
}
_, err = writer.WriteTo(fh)
if err != nil {
	log.Fatal(err)
}

Then I search ip for testing:

city, err := geoip2Reader.City(net.ParseIP(ipStr))
	if err == nil {
		marshal, _ := json.Marshal(city)
		zap.S().Info(string(marshal))
	}

The result of searching is not correct.
Am I doing something wrong?

Splunk can't load the updated mmdb file

The situation is this: I use an mmdb file to determine IP locations in traffic in Splunk using the | iplocation command.

When I load a regular, unmodified mmdb file, it loads fine.
[screenshot: well]

When I load an mmdb file that has been processed by the library, the outcome is completely different, and bad.
[screenshot: bad]

That is, I simply loaded and rewrote the mmdb file without making any changes, and Splunk still would not accept it.

Question: what does the library do that makes Splunk stop accepting the file? (If you look at the metadata, only build_epoch changes. The file size does not change.)

Code that just rewrites the mmdb file:
package main

import (
"log"
"os"

"github.com/maxmind/mmdbwriter"

)

func main() {

// Load the database we wish to enrich.
var path_to_db string
path_to_db = "test/GeoLite2-City.mmdb"

writer, err := mmdbwriter.Load(path_to_db, mmdbwriter.Options{})
if err != nil {
	log.Fatal(err)
}


// Write the newly enriched DB to the filesystem.
fh, err := os.Create("test/GeoLite2-City1.mmdb")
if err != nil {
	log.Fatal(err)
}
_, err = writer.WriteTo(fh)
if err != nil {
	log.Fatal(err)
}

}

Custom DB with Logstash GeoIP filter

I wrote a custom mmdb using the instructions here https://blog.maxmind.com/2020/09/01/enriching-mmdb-files-with-your-own-data-using-go/ and it is my understanding that if we build a custom mmdb it must follow the tree structure of the geolite2_city.mmdb or geolite2_asn.mmdb in order for the GeoIP filter in Logstash to work.
I went ahead and used the tree structure (i think) of city and have no errors from logstash but my fields will not populate in my Elasticsearch index. Can anyone help me understand what i might be doing wrong? I can share my go code if it helps to understand.

add new localipaddress

Hi,

Please provide an example of how to add a new IP address range to GeoLite2-City.mmdb.

For example, I want to add a private IP address range such as 10.200.11.0-10.200.11.20 under the Singapore region.

One node disappeared after inserting data

After running this code, the new file is about 250 KB smaller than the original and has one node fewer (node_count went from 4960213 to 4960212, according to the metadata). Is this normal, or is something wrong? The code is taken exactly from the example in your article.
The enrichment itself works as expected, though.

metadata before updating:
maxminddb.reader.Metadata(node_count=4960213, record_size=28, ip_version=6, database_type='GeoLite2-City', languages=['de', 'en', 'es', 'fr', 'ja', 'pt-BR', 'ru', 'zh-CN'], binary_format_major_version=2, binary_format_minor_version=0, build_epoch=1664466441, description={'en': 'GeoLite2 City database'})

metadata after updating:
maxminddb.reader.Metadata(node_count=4960212, record_size=28, ip_version=6, database_type='GeoLite2-City', languages=['de', 'en', 'es', 'fr', 'ja', 'pt-BR', 'ru', 'zh-CN'], binary_format_major_version=2, binary_format_minor_version=0, build_epoch=1698236150, description={'en': 'GeoLite2 City database'})

import (
"log"
"net"
"os"

"github.com/maxmind/mmdbwriter"
"github.com/maxmind/mmdbwriter/inserter"
"github.com/maxmind/mmdbwriter/mmdbtype"

)

func main() {

// Load the database we wish to enrich.
var path_to_db string
path_to_db = "GeoLite2-City.mmdb"

writer, err := mmdbwriter.Load(path_to_db, mmdbwriter.Options{})
if err != nil {
	log.Fatal(err)
}

// Define and insert the new data.
_, sreNet, err := net.ParseCIDR("56.1.0.0/16")
if err != nil {
	log.Fatal(err)
}

sreData := mmdbtype.Map{
	"AcmeCorp.DeptName": mmdbtype.String("SRE"),
	"AcmeCorp.Environments": mmdbtype.Slice{
		mmdbtype.String("development"),
		mmdbtype.String("staging"),
		mmdbtype.String("production"),
	},
}

if err := writer.InsertFunc(sreNet, inserter.TopLevelMergeWith(sreData)); err != nil {
	log.Fatal(err)
}

// Write the newly enriched DB to the filesystem.
fh, err := os.Create("GeoLite2-City-test.mmdb")
if err != nil {
	log.Fatal(err)
}
_, err = writer.WriteTo(fh)
if err != nil {
	log.Fatal(err)
}

}
