Comments (12)
Gonna close this for now. Feel free to re-open with a specific refactoring to tackle.
from congress.
I apparently don't have the proper permissions to reopen this, but I think now would be a good time to tackle this: Moving the bill processing functions to a separate file (or perhaps to a separate project, combining with similar functions from congress-legislators) so that they can be reused instead of recreated.
With the addition of American Memory, we now get bill information from at least 3 different sources (THOMAS/Congress.gov, Statutes at Large), and they all effectively produce the same format once they're parsed. I think it's time to untie the output functions from a single parser.
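As a rough sketch of what "untying" could look like: each parser returns the same plain dict, and one shared function writes it out. Everything here is hypothetical (the stub parsers, the `write_bill` name, the dict keys); none of it is the existing congress code.

```python
import json
from pathlib import Path


def parse_thomas_stub(raw):
    """Hypothetical stand-in for a THOMAS/Congress.gov parser."""
    return {"bill_id": raw["id"], "title": raw["title"], "source": "thomas"}


def parse_statutes_stub(raw):
    """Hypothetical stand-in for a Statutes at Large parser."""
    return {"bill_id": raw["citation"], "title": raw["heading"], "source": "statutes"}


def write_bill(bill, out_dir):
    """Shared output step: it does not care which parser produced `bill`."""
    path = Path(out_dir) / f"{bill['bill_id']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(bill, indent=2, sort_keys=True))
    return path
```

A new source would then only need its own `parse_*` function that returns the same keys; the output side stays untouched.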
from congress.
Sounds like the perfect time to do it. Thanks for tackling this.
from congress.
An incomplete list of functions that would be useful for American Memory processing:
- congress.utils.format_datetime()
- congress.bill_info.latest_status()
- congress.utils.write()
- congress.utils.data_dir()
from congress.
I'd be happy to do it, but I think we should do a little planning and coordination first. In particular, where should it go? A new file or a new project?
from congress.
Oh, silly me: The reason I don't have access to those functions is that I'm working outside of unitedstates/congress. So a new project/repository would help with that.
But the real problem is probably with functions like congress.bill_versions.fetch_version(), which I couldn't use for the statutes. That could benefit from just being split up into pieces.
from congress.
We're duplicating the tiny utils functions across a bunch of places. Fortunately, they're small enough that it's not a big deal. It'd add complexity to have a generic utils repo that we have to dynamically link into the others.
If the American Memory code is outputting bill information in a standard form, is it appropriate to actually bring it into unitedstates/congress...? I know the plan is to bring it into unitedstates in some way, but if the code is really that similar, maybe even putting it in this repo is a good idea?
from congress.
There's no point in moving files to a new repository. That doesn't make it any easier to access the functions. Just use PYTHONPATH=path/to/congress or some other method to make the congress project modules available to your American Memory project.
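For the "some other method" case, one option is to extend `sys.path` from inside the script, which has the same effect as `PYTHONPATH` for that one process. A minimal sketch, assuming the helper name `make_importable` (hypothetical) and that you pass it the path to your local congress checkout:

```python
import sys
from pathlib import Path


def make_importable(checkout_dir):
    """Prepend a directory to sys.path so its modules can be imported.

    Same effect as setting PYTHONPATH, but only for the current process.
    """
    path = str(Path(checkout_dir).resolve())
    if path not in sys.path:
        sys.path.insert(0, path)
    return path
```

After calling `make_importable("path/to/congress")`, ordinary `import` statements will also search that directory.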
The only refactoring that I think is necessary is to isolate the part that converts the JSON to GovTrack-style XML. Everything else should be fine.
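To illustrate the kind of isolation meant here, the conversion could live in a function that takes the already-parsed dict and returns an XML string, with no fetching or file writing mixed in. This is only a sketch: the function name, element names, and attribute names are made up for illustration and are not the actual GovTrack schema.

```python
import xml.etree.ElementTree as ET


def bill_to_xml(bill):
    """Convert a parsed bill dict to an XML string.

    Element and attribute names are illustrative only, not the real
    GovTrack format.
    """
    root = ET.Element("bill", {"id": bill["bill_id"]})
    ET.SubElement(root, "title").text = bill["title"]
    return ET.tostring(root, encoding="unicode")
```

Because it only depends on the dict, any parser that produces the same keys could reuse it.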
from congress.
The more I work on the American Memory parser, the more I tend to agree that it seems to belong in unitedstates/congress. (Though perhaps that's because I'm making an effort to make it similar.) I'm not clear where the parser fits into @tauberer's big plans for American Memory. :)
from congress.
Yeah, that sounds right to me, Josh - the simpler the integration the better, and that works just fine.
from congress.
Sorry, I got my terminology wrong: What I meant to suggest was that we create a separate Python package that could be installed/imported on its own to do all the basic work.
from congress.
This was mooted by #95 and will be further handled by unitedstates/utils.
from congress.