Comments (4)
read_table uses the pandas tool for this. In fact, it is the only thing we
use out of pandas. It is pretty sophisticated and we replaced the more
straightforward csv reader that was in my original tables implementation.
We might need to offer that as an alternative when read_table can't figure
it out. csv is a really awful world in many ways.
David E. Culler
Friesen Professor of Computer Science
Electrical Engineering and Computer Sciences
University of California, Berkeley
On Mon, Sep 14, 2015 at 12:04 PM, davidwagner wrote:
Try this:
Table.read_table('https://data.oaklandnet.com/api/views/7axi-hi5i/rows.csv?accessType=DOWNLOAD')
Table.read_table() fails to recognize the columns; it stuffs everything
into one column. Compare to
Table.read_table('https://data.oaklandnet.com/api/views/7axi-hi5i/rows.csv')
which does recognize that there are three columns.
Perhaps it is looking at the URL and trying to parse out the filename
extension, and then using that to decide how to decode the data. If so,
maybe it should be smarter about how to parse URLs (to remove fragments and
parameters), or maybe it should ignore the URL/filename and have smarter
format detection (e.g., auto-detect it as CSV based on the contents of the
data rather than the filename).
from datascience.
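The content-based detection suggested in the quoted report could look roughly like the sketch below. It uses the standard library's csv.Sniffer on a text sample rather than anything in datascience or pandas; the function name sniff_delimiter and the candidate delimiter set are assumptions made for illustration, not part of either library.

```python
import csv


def sniff_delimiter(sample, candidates=",\t;|"):
    """Guess the delimiter of a text sample (hypothetical helper).

    Returns the detected delimiter, or None if the sniffer cannot decide.
    """
    try:
        # Restrict the guess to a small set of plausible delimiters.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Raised when no consistent delimiter is found in the sample.
        return None


if __name__ == "__main__":
    print(repr(sniff_delimiter("a,b,c\n1,2,3\n4,5,6\n")))
```

A reader could pass the first few kilobytes of the downloaded data to such a helper and hand the result to read_table as sep. Sniffing is a heuristic, so an explicit sep remains the safer choice when the format is known.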
Cool, thank you! I wonder if this line in datascience/tables.py is causing the problem:
    if filepath_or_buffer.endswith('.csv') and 'sep' not in vargs:
        vargs['sep'] = ','
Note to self: investigate when I get a chance.
Anyway, this is absolutely not a big deal, just a super-minor annoyance I thought I'd document.
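If the endswith check quoted above is indeed the cause, one possible way to make it tolerate query strings is to test only the path component of the URL. This is a minimal sketch using the standard library's urllib.parse, not the change the maintainers actually made; looks_like_csv is a hypothetical helper name.

```python
from urllib.parse import urlparse


def looks_like_csv(filepath_or_url):
    """Return True if the path part ends in .csv (hypothetical helper).

    The query string and fragment are ignored, so a trailing
    '?accessType=DOWNLOAD' does not hide the extension.
    """
    path = urlparse(filepath_or_url).path
    return path.lower().endswith('.csv')


if __name__ == "__main__":
    url = ('https://data.oaklandnet.com/api/views/7axi-hi5i/'
           'rows.csv?accessType=DOWNLOAD')
    print(looks_like_csv(url))
```

One caveat of this approach: a local filename that contains a literal '?' or '#' would also be split at that character, so the check is best suited to inputs that are known to be URLs.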
The table reader doesn't inspect the file, just the path. I think that behavior is here to stay. Instead, you'll have to specify the separator manually.
address = 'https://data.oaklandnet.com/api/views/7axi-hi5i/rows.csv?accessType=DOWNLOAD'
Table.read_table(address, sep=',')
Slightly improved in the new release (it handles the HTTP query string case).