
sexmachine's Introduction

Sex Machine

This package uses the underlying data from the program "gender" by Jorg Michael (described here). Its use is pretty straightforward:

>>> import sexmachine.detector as gender
>>> d = gender.Detector()
>>> d.get_gender(u"Bob")
u'male'
>>> d.get_gender(u"Sally")
u'female'
>>> d.get_gender(u"Pauley") # should be androgynous
u'andy'

The result will be one of andy (androgynous), male, female, mostly_male, or mostly_female. Unknown names are also reported as andy by default, but you can set the value returned for unknown names to whatever you want:

>>> d = gender.Detector(unknown_value=u"ferhat")
>>> d.get_gender(u"Pauley")
u'ferhat'

I18N is fully supported:

>>> d.get_gender(u"Álfrún")
u'female'

Additionally, you can give preference to specific countries:

>>> d.get_gender(u"Jamie")
u'mostly_female'
>>> d.get_gender(u"Jamie", u'great_britain')
u'mostly_male'

You can also create a detector that is not case sensitive (the default is case sensitive):

>>> d = gender.Detector(case_sensitive=False)
>>> d.get_gender(u"sally")
u'female'
>>> d.get_gender(u"Sally")
u'female'

Try to avoid creating many Detectors, as each creation means reading the data file.
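
One way to follow this advice is to build the detector once and reuse it. The sketch below is only an illustration: `get_detector` is a hypothetical helper, and `StubDetector` is a stand-in for `sexmachine.detector.Detector` so that the example runs without the package installed.

```python
from functools import lru_cache


class StubDetector:
    """Stand-in for sexmachine.detector.Detector (assumption, for illustration)."""

    instances = 0  # counts how often the "data file" would have been read

    def __init__(self, case_sensitive=True):
        StubDetector.instances += 1
        self.case_sensitive = case_sensitive


@lru_cache(maxsize=None)
def get_detector(case_sensitive=True):
    """Return one shared detector per distinct argument value (hypothetical helper)."""
    # With the real package this would construct
    # sexmachine.detector.Detector(case_sensitive=case_sensitive) instead.
    return StubDetector(case_sensitive=case_sensitive)


first = get_detector(True)
second = get_detector(True)
print(first is second, StubDetector.instances)  # True 1
```

Because the cache is keyed on the arguments, a case-insensitive detector requested with `get_detector(False)` is created separately, and also only once.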

Licenses

The generator code is distributed under the GPLv3. The data file nam_dict.txt is released under the GNU Free Documentation License.

sexmachine's Issues

Parallelize Data Loading

Loading the data seems to take 5-10s on my machine. It'd be great if we could speed this up by doing the data loading in parallel.

Some suggestions for code improvement

I'm a big fan of the work you've done to make this great dataset usable. I've had to request that a major international company install "sexmachine", which shows how useful it has been *. While using the code, I've come up with some suggestions for improvement. I'm happy to implement some of them if you think they are useful.

  • Country naming support: Countries are currently identified by the names defined in detector.py,

    COUNTRIES = u"""great_britain ireland usa italy ... other_countries""".split()

    but they could additionally be mapped from ISO 3166-1 alpha-2 codes. For example,

    d.get_gender(u"Andrea", 'IT')

    would return the same as

    d.get_gender(u"Andrea", 'italy')

  • Distinguish "not found" and "androgynous": Names that are not found should be distinguished from names that are androgynous. For example,

    print d.get_gender(u"Pauley")
    print d.get_gender(u"dssad12jkasdl")

    should print "andy" and then "unknown" instead of printing "andy" twice.

  • Include some data: Return country probability for a name. For example:

    print d.get_gender(u"Andrea")

would return something like:

italy, male, 99%
usa, mostly_female, 85%
global, female, 95%
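
The first suggestion could be sketched as a thin lookup placed in front of `get_gender`. Everything here is an assumption rather than part of sexmachine: the `ISO_TO_COUNTRY` table and the `normalize_country` helper are hypothetical, and only the four country names visible in the `COUNTRIES` excerpt above are mapped.

```python
# Hypothetical mapping from ISO 3166-1 alpha-2 codes to the country names
# used in detector.py. Only names quoted in the excerpt above are covered.
ISO_TO_COUNTRY = {
    "GB": "great_britain",
    "IE": "ireland",
    "US": "usa",
    "IT": "italy",
}


def normalize_country(country):
    """Translate an ISO code to a country name; pass other values through."""
    if country is None:
        return None
    return ISO_TO_COUNTRY.get(country.upper(), country)


print(normalize_country("IT"))     # italy
print(normalize_country("italy"))  # italy
```

A caller would then write `d.get_gender(u"Andrea", normalize_country('IT'))`, leaving the existing name-based lookups untouched.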

* The Ruby package was renamed gender_detector based on the blog post here.

Didn't see a contributing.md, would like to contribute data

Hi! I would like to contribute data to nam_dict.txt, specifically Indian male and female names (as in Asia/India). I am a native of South India. Do I just send a PR against the file, or is there something I should bear in mind before I contribute data? Thanks

Troubles with detector module

I have the following problem:

davidam@debian:~/git/python-examples$ pip install sexmachine
Requirement already satisfied: sexmachine in /usr/local/lib/python2.7/dist-packages

davidam@debian:~/git/python-examples$ python sexmachine.py
Traceback (most recent call last):
  File "sexmachine.py", line 1, in <module>
    import sexmachine.detector as gender
  File "/home/davidam/git/python-examples/sexmachine.py", line 1, in <module>
    import sexmachine.detector as gender
ImportError: No module named detector

The source code is:

$ cat sexmachine.py
import sexmachine.detector as gender
d = gender.Detector()
if (d.get_gender(u"Bob") == 'male'):
    print("Python")
elif (d.get_gender(u"Sally") == 'female'):
    print("Shell")
else:
    print("PHP")

Regards.
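
The second File entry in the traceback points at the script itself, which suggests that naming the script sexmachine.py makes `import sexmachine.detector` resolve to the script rather than to the installed package; renaming the script would be the usual remedy. A minimal sketch of a check for this situation, where `shadows_package` is a hypothetical helper and not part of sexmachine:

```python
import os


def shadows_package(script_path, package_name):
    """Return True if the script's file name matches the package it imports."""
    stem = os.path.splitext(os.path.basename(script_path))[0]
    return stem == package_name


# Path taken from the traceback above.
print(shadows_package("/home/davidam/git/python-examples/sexmachine.py",
                      "sexmachine"))  # True
```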

How to port sexmachine to Python3

I had to change a few lines to make it work in Python 3:

mapping.py
Comment out lines 75 and 76:

#u = u.replace(pattern.decode("utf-8"), unichr(code))
#u = u.replace(pattern, unichr(code))

And add a line 76:
pass

detector.py
Change line 80:
best = list(self.names[name].keys())[0]

Change line 98:
return (len(list(country_values)),
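
The detector.py changes reflect that Python 3 returns lazy views and iterators where Python 2 returned lists. The sketch below uses a made-up `names` dictionary and assumes `country_values` is a lazy iterator such as the result of `filter`; neither is taken from the sexmachine source. It also shows `chr`, the Python 3 counterpart of `unichr`, which might be a starting point for keeping the two mapping.py lines instead of commenting them out.

```python
# Made-up data for illustration; not the real sexmachine structures.
names = {"Jamie": {"mostly_female": 1, "mostly_male": 2}}

keys = names["Jamie"].keys()
try:
    keys[0]  # worked in Python 2, where keys() returned a list
except TypeError as exc:
    print("dict views are not indexable:", type(exc).__name__)

best = list(keys)[0]  # same pattern as the change on line 80
print(best)

# Assumption: country_values is a lazy iterator, as filter() returns in Python 3.
country_values = filter(lambda value: value > 0, [3, 0, 5])
count = len(list(country_values))  # len() needs a concrete list
print(count)

# chr() in Python 3 covers what unichr() did in Python 2.
print(chr(0xE9))
```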
