Comments (14)

cjheath commented on July 30, 2024

I don't use logstash or Jira. If you do, please test the change and make a pull request. Thanks!

from geoip.

OkkeKlein commented on July 30, 2024

Unfortunately, I am not a programmer and know nothing about Ruby. But if someone makes the changes, I will gladly test them with Logstash.

cjheath commented on July 30, 2024

I'm sorry, but you'll need to find someone who knows how to replicate this problem before anyone can attempt a fix.

batbast commented on July 30, 2024

Hi,

I can reproduce this bug:
Exception in filterworker {"exception"=>#<NoMethodError: undefined method `to_hash' for "MYORG":String>, "backtrace"=>["file:/opt/elasticsearch/logstash-1.2.2/logstash-1.2.2-flatjar.jar!/logstash/filters/geoip.rb:104:in `filter'", "(eval):64:in `initialize'", "org/jruby/RubyProc.java:271:in `call'", "file:/opt/elasticsearch/logstash-1.2.2/logstash-1.2.2-flatjar.jar!/logstash/pipeline.rb:250:in `filter'", "file:/opt/elasticsearch/logstash-1.2.2/logstash-1.2.2-flatjar.jar!/logstash/pipeline.rb:191:in `filterworker'", "file:/opt/elasticsearch/logstash-1.2.2/logstash-1.2.2-flatjar.jar!/logstash/pipeline.rb:134:in `start_filters'"], :level=>:error}

I made a fix to geoip.rb (version 1.3.2, packaged with Logstash 1.2.2), similar to the fix for the previous ASN bug (#38):

diff geoip.rb ../new/geoip.rb
155a156,163
>
> class ISP < Struct.new(:isp)
>
>   def to_hash
>     Hash[each_pair.to_a]
>   end
>
> end
350c358
< record
---
> ISP.new(record)
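
To see why the wrapper helps, here is a minimal, self-contained sketch (not the gem's actual code). The backtrace shows the Logstash filter calling `to_hash` on the lookup result; a bare String has no such method, while a one-field Struct carrying the `to_hash` from the patch does. The value `"MYORG"` is taken from the exception above, and the `respond_to?` checks are only there for illustration.

```ruby
# Sketch only: mirrors the ISP wrapper from the patch, outside the gem.
class ISP < Struct.new(:isp)
  # Turn the struct's members into a Hash keyed by member name.
  def to_hash
    Hash[each_pair.to_a]
  end
end

raw = "MYORG"            # what the lookup returned before the patch
wrapped = ISP.new(raw)   # what it returns after the patch

puts raw.respond_to?(:to_hash)      # a plain String has no to_hash
puts wrapped.respond_to?(:to_hash)  # the wrapper does
p wrapped.to_hash                   # hash with the single :isp key
```

Wrapping the value in a Struct, rather than returning a bare Hash, keeps `record.isp` available to callers that want the string itself.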

You can test it with an ORG file built by a script derived from https://github.com/mteodoro/mmutils.
I can send you this new script, or a sample ORG dat file.

Thanks for your code.

cjheath commented on July 30, 2024

Thanks batbast.

I think I have applied your change correctly, but I don't have a license to access the ISP data. If you can send me a sample ISP data file, I'll test and release it....

... or you could just send me a pull request :)

batbast commented on July 30, 2024

Hi Clifford,

Thank you for your response.

I am sending you 3 files:
- a dat sample
- a csv sample
- a Python script used to create the dat file from the csv file, based on
https://github.com/mteodoro/mmutils/raw/master/csv2dat.py

Usage (install the python-ipaddr package first):

$ ./csvORG2dat.py -w organizations.dat mmorg_net organizations.csv
wrote 42-node trie with 5 networks (3 distinct labels) in 0 seconds

Test

$ geoiplookup -f organizations.dat 10.1.1.1
GeoIP Organization Edition: ORG1

On 2013-12-11 21:25, Clifford Heath wrote:

Thanks batbast.

I think I have applied your change correctly, but I don't have a
license to access the ISP data. If you can send me a sample ISP data
file, I'll test and release it....

... or you could just send me a pull request :)

10.1.1.0/24,ORG1
10.10.0.0/16,ORG1
10.20.0.0/16,ORG2
10.1.3.0/24,ORG3
13.0.0.0/16,ORG3
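
As a rough cross-check of the sample data, the sketch below does a longest-prefix lookup over the five CSV rows using Ruby's standard `ipaddr` library. It is not how the .dat trie or the geoip gem works internally; `parse_rows` and `lookup_org` are hypothetical helpers, and the result should only agree with the `geoiplookup` output shown above for 10.1.1.1.

```ruby
require 'ipaddr'

# Parse "network,label" rows into [IPAddr, label] pairs.
def parse_rows(text)
  text.each_line.map do |line|
    cidr, org = line.strip.split(',', 2)
    [IPAddr.new(cidr), org]
  end
end

# Hypothetical helper: label of the most specific network containing ip,
# or nil when no network matches.
def lookup_org(networks, ip)
  addr = IPAddr.new(ip)
  match = networks.select { |net, _| net.include?(addr) }
                  .max_by { |net, _| net.prefix }
  match && match[1]
end

# The five rows of the sample CSV from this comment.
rows = <<~CSV
  10.1.1.0/24,ORG1
  10.10.0.0/16,ORG1
  10.20.0.0/16,ORG2
  10.1.3.0/24,ORG3
  13.0.0.0/16,ORG3
CSV

networks = parse_rows(rows)
p lookup_org(networks, '10.1.1.1')   # address inside 10.1.1.0/24
p lookup_org(networks, '192.0.2.1')  # address outside every listed network
```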

#!/usr/bin/env python

# Source : https://github.com/mteodoro/mmutils

import sys
import logging
import logging.handlers
import optparse

import csv
import fileinput
import itertools
import struct
import time

from functools import partial

import ipaddr

def init_logger(opts):
    level = logging.INFO
    handler = logging.StreamHandler()
    #handler = logging.handlers.SysLogHandler(address='/dev/log')
    if opts.debug:
        level = logging.DEBUG
        handler = logging.StreamHandler()
    root = logging.getLogger()
    root.setLevel(level)
    root.addHandler(handler)

def parse_args(argv):
    if argv is None:
        argv = sys.argv[1:]
    p = optparse.OptionParser()

    cmdlist = []
    for cmd, (f, usage) in sorted(cmds.iteritems()):
        cmdlist.append('%-8s\t%%prog %s' % (cmd, usage))
    cmdlist = '\n  '.join(cmdlist)

    p.usage = '%%prog [options] <cmd> <arg>+\n\nExamples:\n  %s' % cmdlist

    p.add_option('-d', '--debug', action='store_true',
            default=False, help="debug mode")
    p.add_option('-g', '--geoip', action='store_true',
            default=False, help='test with C GeoIP module')
    p.add_option('-w', '--write-dat', help='write filename.dat')
    opts, args = p.parse_args(argv)

    #sanity check
    if not args or args[0] not in cmds:
        p.error('missing command. choose from: %s' % ' '.join(sorted(cmds)))

    return opts, args

def gen_csv(f):
    """peek at rows from a csv and start yielding when we get past the comments
    to a row that starts with an int (split at : to check IPv6)"""
    def startswith_int(row):
        try:
            int(row[0].split(':', 1)[0])
            return True
        except ValueError:
            return False

    cr = csv.reader(f)
    #return itertools.dropwhile(lambda x: not startswith_int(x), cr)
    return cr

class RadixTreeNode(object):
    __slots__ = ['segment', 'lhs', 'rhs']
    def __init__(self, segment):
        self.segment = segment
        self.lhs = None
        self.rhs = None

class RadixTree(object):
    def __init__(self, debug=False):
        self.debug = False

        self.netcount = 0
        self.segments = [RadixTreeNode(0)]
        self.data_offsets = {}
        self.data_segments = []
        self.cur_offset = 1

    def __setitem__(self, net, data):
        self.netcount += 1
        inet = int(net)
        node = self.segments[0]
        for depth in range(self.seek_depth, self.seek_depth - (net.prefixlen-1), -1):
            if inet & (1 << depth):
                if not node.rhs:
                    node.rhs = RadixTreeNode(len(self.segments))
                    self.segments.append(node.rhs)
                node = node.rhs
            else:
                if not node.lhs:
                    node.lhs = RadixTreeNode(len(self.segments))
                    self.segments.append(node.lhs)
                node = node.lhs

        if not data in self.data_offsets:
            self.data_offsets[data] = self.cur_offset
            enc_data = self.encode(*data)
            self.data_segments.append(enc_data)
            self.cur_offset += (len(enc_data))

        if self.debug:
            #store net after data for easier debugging
            data = data, net

        if inet & (1 << self.seek_depth - (net.prefixlen-1)):
            node.rhs = data
        else:
            node.lhs = data

    def gen_nets(self, opts, args):
        raise NotImplementedError

    def load(self, opts, args):
        for nets, data in self.gen_nets(opts, args):
            for net in nets:
                self[net] = data

    def dump_node(self, node):
        if not node:
            #empty leaf
            return '--'
        elif isinstance(node, RadixTreeNode):
            #internal node
            return node.segment
        else:
            #data leaf
            data = node[0] if self.debug else node
            return '%d %s' % (len(self.segments) + self.data_offsets[data], node)

    def dump(self):
        for node in self.segments:
            print node.segment, [self.dump_node(node.lhs), self.dump_node(node.rhs)]

    def encode(self, *args):
        raise NotImplementedError

    def encode_rec(self, rec, reclen):
        """encode rec as 4-byte little-endian int, then truncate it to reclen"""
        assert(reclen <= 4)
        return struct.pack('<I', rec)[:reclen]

    def serialize_node(self, node):
        if not node:
            #empty leaf
            rec = len(self.segments)
        elif isinstance(node, RadixTreeNode):
            #internal node
            rec = node.segment
        else:
            #data leaf
            data = node[0] if self.debug else node
            rec = len(self.segments) + self.data_offsets[data]
        return self.encode_rec(rec, self.reclen)

    def serialize(self, f):
        if len(self.segments) >= 2 ** (8 * self.segreclen):
            logging.warning('too many segments for final segment record size!')

        for node in self.segments:
            f.write(self.serialize_node(node.lhs))
            f.write(self.serialize_node(node.rhs))

        f.write(chr(42)) #So long, and thanks for all the fish!
        f.write(''.join(self.data_segments))

        f.write('bat.bast') #.dat file comment - can be anything
        f.write(chr(0xFF) * 3)
        f.write(chr(self.edition))
        f.write(self.encode_rec(len(self.segments), self.segreclen))

class ORGIPRadixTree(RadixTree):
    usage = '-w mmorg.dat mmorg_ip GeoIPORG.csv'
    cmd = 'mmorg_ip'
    seek_depth = 31
    edition = 5
    reclen = 4
    segreclen = 4

    def gen_nets(self, opts, args):
        for lo, hi, org in gen_csv(fileinput.input(args)):
            lo, hi = ipaddr.IPAddress(lo), ipaddr.IPAddress(hi)
            nets = ipaddr.summarize_address_range(lo, hi)
            #print 'lo %s - li %s - nets %s - org %s' % (lo, hi, nets, org)
            yield nets, (org,)

    def encode(self, data):
        return data + '\0'

class ORGNetworkRadixTree(RadixTree):
    usage = '-w mmorg.dat mmorg_net GeoIPORG.csv'
    cmd = 'mmorg_net'
    seek_depth = 31
    edition = 5
    reclen = 4
    segreclen = 4

    def gen_nets(self, opts, args):
        for net, org in gen_csv(fileinput.input(args)):
            net = [ipaddr.IPNetwork(net)]
            yield net, (org,)

    def encode(self, data):
        return data + '\0'

def build_dat(RTree, opts, args):
    tstart = time.time()
    r = RTree(debug=opts.debug)

    r.load(opts, args)

    if opts.debug:
        r.dump()

    with open(opts.write_dat, 'wb') as f:
        r.serialize(f)

    tstop = time.time()
    print 'wrote %d-node trie with %d networks (%d distinct labels) in %d seconds' % (
            len(r.segments), r.netcount, len(r.data_offsets), tstop - tstart)

rtrees = [ORGIPRadixTree, ORGNetworkRadixTree]
cmds = dict((rtree.cmd, (partial(build_dat, rtree), rtree.usage)) for rtree in rtrees)

def main(argv=None):
    global opts
    opts, args = parse_args(argv)
    init_logger(opts)
    logging.debug(opts)
    logging.debug(args)

    cmd = args.pop(0)
    cmd, usage = cmds[cmd]
    return cmd(opts, args)

if __name__ == '__main__':
    rval = main()
    logging.shutdown()
    sys.exit(rval)
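
For readers following along from the Ruby side, the record encoding done by `encode_rec` in the script can be sketched in Ruby. This is an illustrative translation, not code from the geoip gem; `encode_rec` and `decode_rec` below are hypothetical helpers. The first packs an integer as a 4-byte little-endian value and keeps the first `reclen` bytes; the second reverses that step.

```ruby
# Pack rec as a 4-byte little-endian unsigned int, then keep the first
# reclen bytes (same idea as struct.pack('<I', rec)[:reclen] in the script).
def encode_rec(rec, reclen)
  raise ArgumentError, 'reclen must be <= 4' if reclen > 4
  [rec].pack('V').byteslice(0, reclen)
end

# Reverse step: pad the bytes back to 4 with zeros and unpack.
def decode_rec(bytes)
  padded = bytes + ("\x00".b * (4 - bytes.bytesize))
  padded.unpack1('V')
end

encoded = encode_rec(42, 3)
p encoded.bytes        # the three low-order bytes, least significant first
p decode_rec(encoded)  # round-trips back to the original integer
```

Truncation only loses information when the value does not fit in `reclen` bytes, which is what the script's "too many segments" warning guards against for the final segment count.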

cjheath commented on July 30, 2024

I don't see any data files. I had already cloned the mmutils repository and got the conversion program running - I just didn't have any data to convert.

Also, I don't think that github's issue mail system is useful for sending files around. Put them in a gist if you must send files this way. Or attach them in normal email to [email protected]

cjheath commented on July 30, 2024

GitHub had mangled your code, above. I figured it out, built the DAT file and tested it. The new version of GeoIP has been released; please test it.

batbast commented on July 30, 2024

Hi Clifford,

There is a small mistake at line 157: replace lsp with isp.

With this correction, I ran 2 tests:

  • with a known IP --> OK
  • with an unknown IP --> OK

On 2013-12-18 00:01, Clifford Heath wrote:

Github had mangled your code, above. I figured it out, built the DAT
file and tested it. The new version of GeoIP has been released, please
test it.

cjheath commented on July 30, 2024

Ouch, thanks. Update pushed.

OkkeKlein commented on July 30, 2024

Thanks to everybody for contributing.

I will test and see if this works for Logstash.

The first odd thing is that it can't find the gem at version 1.3.5, but when I change the version to 1.3.4, it finds version 1.3.5 and fetches it.

OkkeKlein commented on July 30, 2024

Ok. Got it working in Logstash. ISP is now showing, but I am still missing Organization. Not sure if this is an extra field or if this value is stored in ISP.

batbast commented on July 30, 2024

The Organization file is similar to the ISP file, so I think the result
returned by the current geoip.rb code is correct, but that is only my opinion.

On 21/12/2013 14:21, OkkeKlein wrote:

Ok. Got it working in Logstash. ISP is now showing, but I am still
missing Organization.


OkkeKlein commented on July 30, 2024

Assuming this is true, the issue is closed. I appreciate the help.
