
tz-trout's Introduction

Timezone Trout

This library tries to solve the common problem of figuring out what time zone a specific address or phone number is in. It does so by using several mappings that are generated with the help of pytz, Geonames.org, and TimezoneFinder.

The current version is fairly accurate for the United States, Canada, Australia, and countries that fit within a single time zone.

Vocabulary used in this library:

  • PST - time zone name
  • America/Los_Angeles - time zone identifier
  • UTC-07:00 or -420 - UTC offset (the latter given in minutes)
  • DST - Daylight Saving Time
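
To make the two UTC offset notations above concrete, here is a minimal sketch that converts an offset in minutes (such as -420) into the UTC-07:00 form. The helper name format_utc_offset is an assumption made for illustration and is not part of tztrout.

```python
def format_utc_offset(offset_minutes: int) -> str:
    """Render a UTC offset given in minutes as a 'UTC+HH:MM' or 'UTC-HH:MM' string.

    Hypothetical helper for illustration only; not part of tztrout.
    """
    sign = "-" if offset_minutes < 0 else "+"
    # Split the absolute offset into whole hours and leftover minutes.
    hours, minutes = divmod(abs(offset_minutes), 60)
    return f"UTC{sign}{hours:02d}:{minutes:02d}"


print(format_utc_offset(-420))  # UTC-07:00
print(format_utc_offset(120))   # UTC+02:00
```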

The US Zipcode data is provided by www.geonames.org under the Creative Commons Attribution 3.0 License.

Starting in v1.0.0, this library requires Python version 3.6 or above.

Examples

>>> import tztrout
>>> tztrout.tz_ids_for_phone('+16503334444')
[u'America/Los_Angeles']
>>> tztrout.tz_ids_for_phone('+49 (0)711 400 40990')
[u'Europe/Berlin', u'Europe/Busingen']
>>> tztrout.tz_ids_for_address('US', state='CA')
[u'America/Los_Angeles']
>>> tztrout.tz_ids_for_address('PL')
[u'Europe/Warsaw']
>>> tztrout.tz_ids_for_address('CN')
[
    u'Asia/Shanghai',
    u'Asia/Harbin',
    u'Asia/Chongqing',
    u'Asia/Urumqi',
    u'Asia/Kashgar'
]
>>> tztrout.tz_ids_for_tz_name('PDT')  # ran during DST
[
    u'America/Dawson',
    u'America/Los_Angeles',
    u'America/Santa_Isabel',
    u'America/Tijuana',
    u'America/Vancouver',
    u'America/Whitehorse',
    u'Canada/Pacific',
    u'US/Pacific'
]
>>> tztrout.tz_ids_for_tz_name('PDT')  # ran outside of the DST period
[]
>>> tztrout.local_time_for_phone('+1 (650) 333-4444')
datetime.datetime(2013, 9, 16, 17, 45, 43, 0000000, tzinfo=<DstTzInfo 'America/Los_Angeles' PDT-1 day, 17:00:00 DST>)

>>> tztrout.local_time_for_phone('+48 601 941 311')
datetime.datetime(2013, 9, 17, 2, 45, 43, 0000000, tzinfo=<DstTzInfo 'Europe/Warsaw' CEST+2:00:00 DST>)
>>> tztrout.local_time_for_address('US', state='CA')
datetime.datetime(2013, 9, 16, 17, 45, 43, 0000000, tzinfo=<DstTzInfo 'America/Los_Angeles' PDT-1 day, 17:00:00 DST>)
>>> tztrout.local_time_for_address('PL')
datetime.datetime(2013, 9, 17, 2, 45, 43, 0000000, tzinfo=<DstTzInfo 'Europe/Warsaw' CEST+2:00:00 DST>)
>>> tztrout.tz_ids_for_offset(-7 * 60)  # during DST
[
    u'America/Creston',
    u'America/Dawson',
    u'America/Dawson_Creek',
    u'America/Hermosillo',
    u'America/Los_Angeles',
    u'America/Phoenix',
    u'America/Santa_Isabel',
    u'America/Tijuana',
    u'America/Vancouver',
    u'America/Whitehorse',
    u'Canada/Pacific',
    u'US/Arizona',
    u'US/Pacific'
]
>>> tztrout.tz_ids_for_offset(+2 * 60)  # during DST
[
    "Africa/Blantyre",
    "Africa/Bujumbura",
    "Africa/Cairo",
    "Africa/Ceuta",
    "Africa/Gaborone",
    "Africa/Harare",
    "Africa/Johannesburg",
    "Africa/Kigali",
    "Africa/Lubumbashi",
    "Africa/Lusaka",
    "Africa/Maputo",
    "Africa/Maseru",
    "Africa/Mbabane",
    "Africa/Tripoli",
    "Africa/Windhoek",
    "Arctic/Longyearbyen",
    "Europe/Amsterdam",
    "Europe/Andorra",
    "Europe/Belgrade",
    "Europe/Berlin",
    "Europe/Bratislava",
    "Europe/Brussels",
    "Europe/Budapest",
    "Europe/Busingen",
    "Europe/Copenhagen",
    "Europe/Gibraltar",
    "Europe/Ljubljana",
    "Europe/Luxembourg",
    "Europe/Madrid",
    "Europe/Malta",
    "Europe/Monaco",
    "Europe/Oslo",
    "Europe/Paris",
    "Europe/Podgorica",
    "Europe/Prague",
    "Europe/Rome",
    "Europe/San_Marino",
    "Europe/Sarajevo",
    "Europe/Skopje",
    "Europe/Stockholm",
    "Europe/Tirane",
    "Europe/Vaduz",
    "Europe/Vatican",
    "Europe/Vienna",
    "Europe/Warsaw",
    "Europe/Zagreb",
    "Europe/Zurich"
]

Testing

Just run pytest

Regenerating the data

Time zones, addresses, and phone numbers are in constant flux, so the data used by this library needs to be regenerated periodically. To do so, upgrade the pytz and timezonefinder dependencies, and run python regenerate_data.py. If this doesn't fix the problem, consider opening an issue or adding an exception in data_exceptions.py.

Known Issues

  • Australian Central Western Standard Time (CWST) is treated as Australian Central Standard Time (ACST). See Australian anomalies for more details.
  • Lord Howe Standard Time (LHST) is treated as Australian Eastern Standard Time (AEST). In reality, they're 30 minutes apart.
  • The whole province of British Columbia (Canada) is recognized as Pacific Time, although a small portion of its south-east territory should be recognized as Mountain Time.
  • The whole province of Ontario (Canada) is recognized as Eastern Time, although a small portion of its west territory should be recognized as Central Time.
  • All +1 867 phone numbers are recognized as Mountain Time, although this prefix is shared by three Canadian territories in the Arctic far north, spanning across Pacific, Mountain, and Central Time.

Releasing a New Version

  1. Make sure the code has been thoroughly reviewed and tested in a realistic production environment.
  2. Update setup.py and CHANGELOG.md. Make sure you include any breaking changes.
  3. Run python setup.py sdist and twine upload dist/<PACKAGE_TO_UPLOAD>.
  4. Push a new tag pointing to the released commit, format: v0.13 for example.

tz-trout's People

Contributors

anemitz, dependabot[bot], eengoron, mpessas, neob91-close, nsaje, philfreo, tsx, wojcikstefan


tz-trout's Issues

`area` guessing based on `phonenumbers.geocoder.description_for_number` is wrong for some numbers

When a single area is extracted from the output of description_for_number, that's interpreted as if it were a state.

elif len(area) == 1 and area[0]:
    state = area[0].lower().strip()
    state = td.normalized_states['US'].get(state, None)

Here's an example where, for a phone number from Philadelphia, PA, the state is taken to be philadelphia instead of pennsylvania, so the lookup in normalized_states fails. Because of this, all the US timezone IDs are returned for this number.

This behaviour is still present in the latest version of phonenumbers (8.13.1).

Example:

import phonenumbers
from phonenumbers.geocoder import description_for_number
from tztrout import tz_ids_for_phone

phone_number = "+14455000001"

tz_ids = tz_ids_for_phone(phone_number, "US")
print(tz_ids) # contains all of the US timezones

phone = phonenumbers.parse(phone_number, "US")
area = description_for_number(phone, "en").split(',')
print(area) # ["Philadelphia"]
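
One possible direction, sketched below under stated assumptions: treat a description part as a state only if it is actually a known state name, and otherwise fall back to a city lookup. The guess_state function and both mapping tables are hypothetical stand-ins (the real td.normalized_states data and its value format are not shown in this issue), so this is an illustration of the idea rather than the library's fix.

```python
from typing import Dict, Optional

# Hypothetical stand-in for td.normalized_states['US']; values are assumed
# to be two-letter state codes purely for this illustration.
NORMALIZED_US_STATES: Dict[str, str] = {
    "california": "CA",
    "ca": "CA",
    "pennsylvania": "PA",
    "pa": "PA",
}

# Hypothetical fallback table mapping well-known city names to a state.
CITY_TO_STATE: Dict[str, str] = {
    "philadelphia": "PA",
}


def guess_state(
    description: str,
    normalized_states: Dict[str, str],
    city_to_state: Dict[str, str],
) -> Optional[str]:
    """Guess a state from a geocoder description such as 'Philadelphia'."""
    parts = [p.strip().lower() for p in description.split(",") if p.strip()]
    # Prefer an explicit state; "City, ST" style descriptions put it last.
    for part in reversed(parts):
        if part in normalized_states:
            return normalized_states[part]
    # Otherwise try to recognize a city instead of assuming it is a state.
    for part in parts:
        if part in city_to_state:
            return city_to_state[part]
    return None


print(guess_state("Philadelphia", NORMALIZED_US_STATES, CITY_TO_STATE))  # PA
```

Returning None for an unrecognized description keeps the existing "all US time zones" behaviour available as a last resort.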

Shouldn't return duplicate offsets

In [3]: tztrout.non_dst_offsets_for_address('US', state='NY')
Out[3]: [-300, -300, -300, -300, -300, -300, -300, -300, -300, -300]

In [5]: tztrout.non_dst_offsets_for_phone('12125556666')
Out[5]: [-300, -300, -300, -300, -300, -300, -300, -300, -300, -300]
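
A minimal sketch of the deduplication being asked for, assuming the result should keep the order in which offsets were first seen. The unique_offsets helper is hypothetical and only illustrates the idea; it is not the library's code.

```python
from typing import Iterable, List


def unique_offsets(offsets: Iterable[int]) -> List[int]:
    """Drop repeated offsets while keeping first-seen order.

    dict preserves insertion order (Python 3.7+), so dict.fromkeys
    acts as an ordered set here.
    """
    return list(dict.fromkeys(offsets))


print(unique_offsets([-300] * 10))  # [-300]
```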

US Zipcode list is out of date

  • regenerate_data.py script should download http://download.geonames.org/export/zip/US.zip and convert it to a new json file in tztrout/data.
    • Keep JSON output consistent and diff-friendly (use this)
    • Only keep zip code, city name, 2-letter state code and coordinates to keep the size down.
    • This should be the first step since other files depend on this data.
    • Commit the data file into repo.
    • Provide attribution in the readme: ZIP code data is provided by www.geonames.org under the Creative Commons Attribution 3.0 License
  • Load the data in memory in place of existing pyzipcode data. Make sure to deduplicate city name and state code strings.
    • Get rid of pyzipcode dependency (setup.py + requirements.txt)
  • Since old database had timezones and we don't have it from geonames, use https://github.com/MrMinimal64/timezonefinder to look up timezones by zipcode's coordinates.
  • Measure memory usage and tz lookup time before and after changes.
  • Turn these into test cases:
    • print tztrout.tz_ids_for_address('US', state='NY', zipcode="10065") should return America/New_York but it returns None. This zipcode was introduced in 2006/7.
    • print tztrout.tz_ids_for_address('US', state='IL', zipcode="60642") should return America/Chicago but it returns None. This zipcode was introduced in 2008.
    • print tztrout.tz_ids_for_address('US', state='DC', zipcode="20022") should return America/New_York but it returns None. There's no data on when this zip was introduced at the end of 2004.
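
The parsing and serialization steps above could look roughly like the sketch below. It assumes the tab-separated column layout described in the GeoNames postal code export readme (country code, postal code, place name, admin name 1, admin code 1, admin name 2, admin code 2, admin name 3, admin code 3, latitude, longitude, accuracy). The function names and the output shape are assumptions for illustration, not what regenerate_data.py actually does; downloading and unzipping US.zip is left out.

```python
import json
import sys
from typing import Dict, Iterable, List


def parse_geonames_zip_rows(lines: Iterable[str]) -> Dict[str, List]:
    """Map zip code -> [city, state_code, latitude, longitude]."""
    zips: Dict[str, List] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # skip blank or malformed rows
        zipcode, city, state_code = fields[1], fields[2], fields[4]
        try:
            lat, lon = float(fields[9]), float(fields[10])
        except ValueError:
            continue  # skip rows without usable coordinates
        # Intern repeated strings so identical city and state names
        # share one object in memory.
        zips[zipcode] = [sys.intern(city), sys.intern(state_code), lat, lon]
    return zips


def dump_diff_friendly(data: Dict[str, List]) -> str:
    """Serialize with sorted keys and one value per line for stable diffs."""
    return json.dumps(data, sort_keys=True, indent=1) + "\n"


sample = "US\t10065\tNew York\tNew York\tNY\tNew York\t061\t\t\t40.7651\t-73.9638\t4\n"
print(dump_diff_friendly(parse_geonames_zip_rows([sample])))
```

The coordinates kept here are what a timezonefinder lookup would need in the later step.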

We're not going to use pyzipcode3 since @tsx has concerns about maintenance of this library. It only had one database upgrade 2 years ago and no transparent process for future upgrades.

442 area code isn't properly assigned

Should be America/Los_Angeles, but instead it defaults to the whole US:

> tztrout.tz_ids_for_phone('+14422425150')
[u'America/New_York',
 u'America/Detroit',
 u'America/Kentucky/Louisville',
 u'America/Kentucky/Monticello',
 u'America/Indiana/Indianapolis',
 u'America/Indiana/Vincennes',
 u'America/Indiana/Winamac',
 u'America/Indiana/Marengo',
 u'America/Indiana/Petersburg',
 u'America/Indiana/Vevay',
 u'America/Chicago',
 u'America/Indiana/Tell_City',
 u'America/Indiana/Knox',
 u'America/Menominee',
 u'America/North_Dakota/Center',
 u'America/North_Dakota/New_Salem',
 u'America/North_Dakota/Beulah',
 u'America/Denver',
 u'America/Boise',
 u'America/Phoenix',
 u'America/Los_Angeles',
 u'America/Metlakatla',
 u'America/Anchorage',
 u'America/Juneau',
 u'America/Sitka',
 u'America/Yakutat',
 u'America/Nome',
 u'America/Adak',
 u'Pacific/Honolulu']

Internal reference: https://help.close.io/agent/case/74267

Support for Australia

I think time zones are state-driven (the eight three-letter acronyms below), so that is not too complex: just 8. All postcodes are mapped to a state, so that is easy too. You can get a CSV of all Australian postcodes mapped to states from Australia Post (we just bought some recent data that maps suburbs as well).

Australian phone codes are simple too:

  • Mobiles start with 04 or +614 all over the country, so they are of no use for this.
  • Landline codes are +612 for NSW and ACT, +613 for VIC and TAS, +617 for QLD, and +618 for WA, SA, and NT, which are not necessarily in the same time zone (Source: http://australia.gov.au/about-australia/our-country/telephone-country-and-area-codes)
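
The landline prefixes described above can be sketched as a small lookup. The table and the au_states_for_phone helper are hypothetical, built only from the prefixes listed in this issue; they are not how tztrout implements Australian support.

```python
import re
from typing import Dict, List, Optional

# First digit of the national number (after +61 or a leading 0) -> states,
# taken from the landline prefixes listed in this issue.
AU_LANDLINE_STATES: Dict[str, List[str]] = {
    "2": ["NSW", "ACT"],
    "3": ["VIC", "TAS"],
    "7": ["QLD"],
    "8": ["WA", "SA", "NT"],
}


def au_states_for_phone(phone: str) -> Optional[List[str]]:
    """Return candidate states for an Australian landline, else None."""
    digits = re.sub(r"\D", "", phone)
    if digits.startswith("61"):
        national = digits[2:]
        # Tolerate the "+61 (0)2 ..." style of writing the trunk prefix.
        if national.startswith("0"):
            national = national[1:]
    elif digits.startswith("0"):
        national = digits[1:]
    else:
        return None
    if not national:
        return None
    # Mobiles (04 / +614) are nationwide, so they are absent from the table.
    states = AU_LANDLINE_STATES.get(national[0])
    return list(states) if states else None


print(au_states_for_phone("+61 2 9999 0000"))  # ['NSW', 'ACT']
```

Because +618 covers WA, SA, and NT, a prefix lookup like this can only narrow a number down to a set of candidate states, not a single time zone.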
