
Comments (4)

deculler commented on July 25, 2024

read_table uses the pandas reader for this. In fact, it is the only thing we
use out of pandas. It is pretty sophisticated, and it replaced the more
straightforward csv reader that was in my original tables implementation.
We might need to offer that simpler reader as an alternative when read_table
can't figure it out. csv is a really awful world in many ways.

David E. Culler
Friesen Professor of Computer Science
Electrical Engineering and Computer Sciences
University of California, Berkeley

On Mon, Sep 14, 2015 at 12:04 PM, davidwagner [email protected]
wrote:

Try this:

    Table.read_table('https://data.oaklandnet.com/api/views/7axi-hi5i/rows.csv?accessType=DOWNLOAD')

Table.read_table() fails to recognize the columns; it stuffs everything
into one column.

Compare to

    Table.read_table('https://data.oaklandnet.com/api/views/7axi-hi5i/rows.csv')

which does recognize that there are three columns.

Perhaps it is looking at the URL and trying to parse out the filename
extension, and then using that to decide how to decode the data. If so,
maybe it should be smarter about how to parse URLs (to remove fragments and
parameters), or maybe it should ignore the URL/filename and have smarter
format detection (e.g., auto-detect it as CSV based on the contents of the
data rather than the filename).
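To make the content-based idea concrete, here is a minimal sketch using the standard library's csv.Sniffer. This is only an illustration, not what datascience or pandas actually does; guess_separator and its candidate list are hypothetical names chosen for this example, and the sniffer is a heuristic that can guess wrong on unusual data.

```python
import csv


def guess_separator(sample, candidates=',\t;|'):
    """Guess the delimiter from a text sample of the data.

    Hypothetical helper for illustration only. Falls back to ','
    when the heuristic cannot decide.
    """
    try:
        # Restrict the sniffer to a few plausible delimiters.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return ','


# Small in-memory samples; real use would pass the first few KB of the file.
comma_sample = "a,b,c\n1,2,3\n4,5,6\n"
tab_sample = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"
print(repr(guess_separator(comma_sample)))
print(repr(guess_separator(tab_sample)))
```

The guessed value could then be passed along as the sep argument, so the file name or URL would no longer matter.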



from datascience.

davidwagner commented on July 25, 2024

Cool, thank you! I wonder if this line in datascience/tables.py is causing the problem:

    if filepath_or_buffer.endswith('.csv') and 'sep' not in vargs:
        vargs['sep'] = ','

Note to self: investigate when I get a chance.

Anyway, this is absolutely not a big deal, just a super-minor annoyance I thought I'd document.
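If that check is indeed the cause, the difference is easy to see in isolation. The sketch below assumes the extension test runs on the whole URL string, and shows one possible way to test only the path component with urllib.parse; looks_like_csv is a hypothetical helper, not a function in datascience.

```python
from urllib.parse import urlparse


def looks_like_csv(address):
    """Return True if the path part of address ends with '.csv'.

    Hypothetical helper: the query string and fragment are ignored
    because urlparse splits them off from the path.
    """
    return urlparse(address).path.endswith('.csv')


address = ('https://data.oaklandnet.com/api/views/7axi-hi5i/rows.csv'
           '?accessType=DOWNLOAD')

# The raw string ends with '?accessType=DOWNLOAD', so the plain check misses it.
print(address.endswith('.csv'))
# Checking only the path component finds the '.csv' extension.
print(looks_like_csv(address))
```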


papajohn commented on July 25, 2024

The table reader doesn't inspect the file, just the path. I think that behavior is here to stay. Instead, you'll have to specify the separator manually.

    address = 'https://data.oaklandnet.com/api/views/7axi-hi5i/rows.csv?accessType=DOWNLOAD'
    Table.read_table(address, sep=',')


papajohn commented on July 25, 2024

Slightly improved in the new release (it handles the HTTP query string case).

