opengovernment / legistar-scrape
A Python library for scraping Legistar sites and storing the results in a Mongo database
Similar to opengovernment/opengovernment-local#9
Error is generated when scraping:
saving Councilmember O'Brien 522f2ce812a96023f0e70e95
saving Councilmember Oh 522f2ce812a96023f0e70e96
saving Councilmember O'Neill 522f2ce812a96023f0e70e97
saving Traceback (most recent call last):
File "./import.py", line 89, in <module>
print 'saving', member['Person Name'], council_member_id
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 17: ordinal not in range(128)
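One possible workaround, sketched here under assumptions rather than taken from the repository: the traceback suggests that `print` is implicitly encoding a unicode name with the `ascii` codec, so encoding the line explicitly before writing it should avoid the error. The helper names `format_saving_line` and `write_line` are hypothetical, UTF-8 is an assumed output encoding, and the councilmember name used in the usage note is made up for illustration.

```python
# -*- coding: utf-8 -*-
import sys


def format_saving_line(name, record_id):
    """Build the progress line as text, without encoding it yet."""
    return u'saving {0} {1}'.format(name, record_id)


def write_line(text, stream=None, encoding='utf-8'):
    """Encode text explicitly and write the resulting bytes to a stream."""
    if stream is None:
        # Python 3 exposes the byte stream as sys.stdout.buffer;
        # on Python 2, sys.stdout itself accepts byte strings.
        stream = getattr(sys.stdout, 'buffer', sys.stdout)
    stream.write(text.encode(encoding) + b'\n')
```

In `import.py` the failing `print` could then become something like `write_line(format_saving_line(member['Person Name'], council_member_id))`. Setting `PYTHONIOENCODING=utf-8` in the environment is another option that leaves the script untouched.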
Downloading/unpacking pdfminer (from -r /u/apps/legistar-scrape-staging/requirements.txt (line 12))
Could not find a version that satisfies the requirement pdfminer (from -r /u/apps/legistar-scrape-staging/requirements.txt (line 12)) (from versions: 20100829, 20101226, 20091219, 20100213, 20100327, 20100424, 20100322, 20101017, 20100104, 20110227, 20091129, 20091024, 20100131, 20100619p1, 20110515, dist-20080727, dist-20090201, dist-20080629, dist-20080429, dist-20080906, dist-20090110, dist-20090517, dist-20090117)
Cleaning up...
No distributions matching the version for pdfminer (from -r /u/apps/legistar-scrape-staging/requirements.txt (line 12))
The current scraper hardcodes pa-philadelphia; it should be parameterized so that different local governments can be scraped by passing command-line arguments.
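A minimal sketch of what that parameterization might look like with `argparse`. The positional argument name `jurisdiction` and the function names are assumptions, not part of the existing scraper; the default keeps the currently hardcoded value so that existing invocations would behave the same.

```python
import argparse


def build_parser():
    """Create a parser with one optional positional argument."""
    parser = argparse.ArgumentParser(
        description='Scrape a Legistar site into MongoDB.')
    # Hypothetical argument name; defaults to the value hardcoded today.
    parser.add_argument(
        'jurisdiction', nargs='?', default='pa-philadelphia',
        help='identifier of the local government to scrape')
    return parser


def main(argv=None):
    """Parse argv (or sys.argv when None) and return the chosen jurisdiction."""
    args = build_parser().parse_args(argv)
    return args.jurisdiction
```

Calling `main([])` falls back to `pa-philadelphia`, while `main(['some-other-city'])` selects a different identifier. How an identifier maps to a Legistar hostname is left open here.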
We need a settings file for specifying 4 connection parameters for MongoDB:
Billy scrapers assign those values within a MongoDB URL this way:
MONGO_HOST = 'mongodb://username:password@host:port/database'
Is there some standard settings file already implied that I just need to populate?
Also reference http://api.mongodb.org/python/current/api/pymongo/mongo_client.html
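If a single URL setting like the one above is adopted, the individual pieces can be recovered with the standard library. This is only a sketch: the helper name `parse_mongo_url`, the returned key names, and the fallback port 27017 (MongoDB's default) are assumptions, and percent-encoded characters in the username or password are not decoded here.

```python
from urllib.parse import urlsplit  # Python 3; on Python 2 use the urlparse module


def parse_mongo_url(url):
    """Split a mongodb:// URL into its connection parameters."""
    parts = urlsplit(url)
    return {
        'host': parts.hostname,
        'port': parts.port or 27017,  # fall back to MongoDB's default port
        'username': parts.username,
        'password': parts.password,
        # The path is '/database'; an empty path means no database was given.
        'database': parts.path.lstrip('/') or None,
    }
```

pymongo's `MongoClient` also accepts a `mongodb://` URL directly as its `host` argument, so splitting the URL may only be needed where the database name or credentials are used separately.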