Comments (25)

rufuspollock commented on August 19, 2024

/cc @mk270 @domoritz

from messytables.

mk270 commented on August 19, 2024

You just need a mapping, like the one in ktbh:

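# Assumes these type classes are importable from messytables.types.
from messytables.types import (StringType, IntegerType, FloatType,
                               DecimalType, DateType, DateUtilType)

# Map each messytables type class to its JSON Table Schema type name.
classes = {
    StringType: "string",
    IntegerType: "integer",
    FloatType: "number",
    DecimalType: "number",
    DateType: "date",
    DateUtilType: "date",
}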

There's no need for everyone depending on messytables to change their code
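
A minimal sketch of how such a mapping could be applied downstream, assuming guessed types arrive as instances (or classes) of the type classes. The classes below are empty stand-ins so the snippet runs without messytables installed, and `jts_name` is a hypothetical helper, not part of any library mentioned in this thread.

```python
# Stand-in classes (assumption); in real use they would come from messytables.
class StringType:
    pass


class IntegerType:
    pass


class DecimalType:
    pass


class DateType:
    def __init__(self, format=None):
        self.format = format


CLASSES = {
    StringType: "string",
    IntegerType: "integer",
    DecimalType: "number",
    DateType: "date",
}


def jts_name(cell_type, default="string"):
    """Look up the type name for a type class or instance (hypothetical helper)."""
    cls = cell_type if isinstance(cell_type, type) else type(cell_type)
    return CLASSES.get(cls, default)


guessed = [StringType(), DateType("%Y-%m-%d"), DecimalType()]
names = [jts_name(t) for t in guessed]
print(names)
```

Because the lookup lives outside the type classes, code that already depends on the classes themselves does not need to change.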

rufuspollock commented on August 19, 2024

A mapping where, though? messytables flows through into dataconverters and thence into dataproxy, etc.

domoritz commented on August 19, 2024

I really like the json table schema but I don't think that messytables is the right point to set the type names. Especially since we use only classes and not strings to identify types. I'll close this for now.

rufuspollock commented on August 19, 2024

So where would it go? DataConverters?

domoritz commented on August 19, 2024

Yes, it should go into DataConverters. Imo, messytables is a basic library that should not care about the actual output to the user.

mk270 commented on August 19, 2024

Please do NOT stick it in DataConverters! It's already in one downstream package; it should be horizontally integrated.

Either stick it in a shim, or stick it in messytables.

I shall write a shim; we can then use that, and decide later whether the shim should be in messytables

mk270 commented on August 19, 2024

I've stuck a shim up at: https://github.com/mk270/messytables-jts

domoritz commented on August 19, 2024

@mk270 Great. Now the question is whether the mapping should become part of messytables. I think it makes sense as long as it's not too confusing for people who use messytables. Would you like to add it and then we can discuss things based on the proposal?

mk270 commented on August 19, 2024

Well, people who use messytables aren't going to get confused unless they call any JTS-related functionality.

Let's wait for the JTS spec to be formally frozen, and for messytables-jts to have a few weeks in production, before worrying about whether to incorporate the mapping in messytables itself. There's no hurry, right, and we'd benefit from comparing what other people are doing with it, too?

rufuspollock commented on August 19, 2024

I'm +1 for getting this in. I think it would be really good if messytables were JTS conformant ...

rufuspollock commented on August 19, 2024

@domoritz hmmm - I should have reviewed this more carefully :-)

I think there was a bit of confusion here. I don't think we need JTS output from messytables - we just wanted the celltypes to conform to the JTS types list.

Questions:

  • What is the feeling on actually doing the latter (i.e. using JTS-compatible types)?
  • Could we remove the current jts.py module, as it adds additional dependencies and isn't really needed for anything useful?

domoritz commented on August 19, 2024

Hmm. I'm afraid I still don't understand what you want.

> What is the feeling on actually doing the latter (i.e. using JTS compatible types ...)

What would that actually mean? The types can already be mapped to the JTS types easily.

> Could we remove the current jts.py module as it adds additional dependencies and isn't really needed for anything useful ....

I'll revert 6e98d06 later.

rufuspollock commented on August 19, 2024

I mean not mapping to JTS types but using JTS types as default :-)

I want messytables to just support JTS types out of the box - not have to use some special version of messytables to get the JTS types ...

domoritz commented on August 19, 2024

Well, that is a huge change because it would break every application that uses messytables. Also, we could not use things like the format for dates any more. How about we add a __str__ method to every messytables type that returns the JTS type?
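
A rough sketch of what the `__str__` idea might look like, using stand-in classes rather than the real messytables types (the class bodies and the `format` attribute name here are assumptions for illustration). The point is that the type stays a class, so existing `isinstance` checks and the date format keep working, while `str()` yields the JTS name.

```python
# Stand-in type classes (assumption); not the actual messytables definitions.
class IntegerType:
    def __str__(self):
        return "integer"


class DateType:
    def __init__(self, format):
        # The guessed date format stays available on the instance.
        self.format = format

    def __str__(self):
        return "date"


column_type = DateType("%d/%m/%Y")

label = str(column_type)                      # JTS-style name for output
is_date = isinstance(column_type, DateType)   # class-based checks still work
print(label, is_date, column_type.format)
```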

mk270 commented on August 19, 2024

This is going to couple messytables to JTS such that there'll be pressure to break JTS to accommodate new types in messytables, and vice versa.

That means people will be more hesitant to rely on either of them.

rufuspollock commented on August 19, 2024

@mk270 I don't see how depending on a set of types is going to be that tough, and if there are key types not supported in JTS it would warrant adding them.

@domoritz can you explain re formats for dates?

Generally we should not make any change until we have very clear agreement here :-)

domoritz commented on August 19, 2024

messytables.types.DateType also contains the format of the dates when doing type guessing. JTS does not specify a format but recommends ISO. I think we should not support JTS at all because, as far as I understand it, they have different intentions. Messytables is for parsing tabular data from files. JTS is for defining a schema. We should not mix them, IMHO.

@rgrp @mk270 What would be the advantage of supporting only JTS in messytables? Doesn't it make more sense to have this as a return type from dataconverters?

mk270 commented on August 19, 2024

No, it doesn't make more sense to return it from dataconverters, as stated further up at #40 (comment)

The only questions are whether JTS should be output by messytables (I don't mind, but it seems like a waste of effort given that messytables-jts exists), and whether messytables should break a bunch of things which depend on it and change its interface the better to reflect JTS.

If we break downstream, then people won't use messytables. So it's like taking messytables in house.

rufuspollock commented on August 19, 2024

@domoritz JTS does contain an optional format field (but I'm not sure I understand you).

I note that most people using messytables downstream should be using dataconverters (shouldn't they?)

@mk270 Would dataconverters support JTS types (or not)?

domoritz commented on August 19, 2024

@rgrp Well, it's not in the list of attributes. I made a pull request to fix that frictionlessdata/datapackage#45.

Most people (~90%) should probably use dataconverters but not everyone.

I don't see any problems when we tie messytables and JTS because JTS should be a standard and be used in implementations. @rgrp I'm not 100% sure I understand what you want. Do you only want us to return strings instead of classes for types, or a whole JTS with the data? I think we need an example to know what we are talking about.

Is it okay if I revert the JTS commit from messytables for now, until we come up with a proper solution?

rufuspollock commented on August 19, 2024

Yes, ok to revert the JTS stuff - we certainly should not change any interface with folks until we have thoroughly discussed etc.

What I'm motivated by is what you get when we convert the "metadata" we get from messytables to JSON, e.g. look at the metadata here (which is messytables type info converted through dataconverters to JSON):

http://jsonpdataproxy.appspot.com/?url=https://raw.github.com/datasets/gold-prices/master/data/data.csv&max-results=2&guess-types=1

mk270 commented on August 19, 2024

Can't you just use messytables-jts in dataconverters?

domoritz commented on August 19, 2024

👍 for using the JTS in dataconverters instead of messytables. I understand the motivation but I don't think that messytables, as a low-level library, is the right place for it. What I would do is add a field called jts_type to the messytables type classes.

Alternatively, we can keep headers_and_typed_as_jts and rowset_as_jts, but I'd prefer the jts_type field as it is much easier to use and people can switch to the JTS types very easily.
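
To make the `jts_type` proposal concrete, here is a hedged sketch using stand-in classes (not the real messytables types). `fields_for` is a hypothetical helper of the kind a downstream package such as dataconverters might write; passing the guessed date format through unchanged is also an assumption, since JTS may expect a different format notation.

```python
import json


# Stand-in type classes (assumption), each carrying the proposed jts_type field.
class StringType:
    jts_type = "string"


class DecimalType:
    jts_type = "number"


class DateType:
    jts_type = "date"

    def __init__(self, format=None):
        self.format = format


def fields_for(headers, types):
    """Build JTS-style field descriptors from headers and guessed types."""
    fields = []
    for name, cell_type in zip(headers, types):
        field = {"name": name, "type": cell_type.jts_type}
        fmt = getattr(cell_type, "format", None)
        if fmt:
            # Assumption: the format string is copied over as-is.
            field["format"] = fmt
        fields.append(field)
    return fields


schema = {"fields": fields_for(["date", "price"],
                               [DateType("%Y-%m-%d"), DecimalType()])}
print(json.dumps(schema, indent=2))
```

With this shape, the type classes keep their current interface and callers that want JTS names only read one extra attribute.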

rufuspollock commented on August 19, 2024

@domoritz this sounds good - can you explain the proposed change to messytables a bit more, and how it would be used in e.g. dataconverters?
