
Comments (3)

afrendeiro commented on May 28, 2024

I'm aware of this.
I have not found good solutions besides:

  • assuming the number of column header rows equals the number of sample_attributes (not a safe assumption, and it would not work if sample_attributes is not set)
  • reading through the file without a header and treating every non-numeric row as a column level (not good, because a factor may be entirely numeric for all samples, as in the case above)
  • encoding the number of header rows in e.g. the file name (ugly and implicit).

This just shows that CSV is not a good format for storing a matrix together with its metadata.
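To illustrate why the second option is fragile, here is a minimal sketch using only the standard library. The function name `guess_header_rows` and the toy CSV are made up for illustration and are not part of the toolkit. It counts leading rows whose data cells are not all numeric, and undercounts as soon as one header level (here a hypothetical "replicate" factor) is purely numeric.

```python
import csv
import io


def is_number(cell):
    """Return True if the cell can be parsed as a float."""
    try:
        float(cell)
    except ValueError:
        return False
    return True


def guess_header_rows(csv_text):
    """Count leading rows whose data cells (all but the first column) are not all numeric.

    Hypothetical heuristic, shown only to demonstrate its failure mode.
    """
    count = 0
    for row in csv.reader(io.StringIO(csv_text)):
        cells = row[1:]
        if cells and all(is_number(c) for c in cells):
            break  # first fully numeric row is assumed to be data
        count += 1
    return count


# Two header levels: the sample name and a purely numeric "replicate" factor.
csv_text = (
    "sample_name,s1,s2\n"
    "replicate,1,2\n"
    "region_a,0.5,1.5\n"
    "region_b,2.0,3.0\n"
)

# The file really has 2 header rows, but the "replicate" row looks like data,
# so the heuristic stops after the first row.
guessed_header_rows = guess_header_rows(csv_text)
print(guessed_header_rows)
```

The heuristic cannot tell the numeric "replicate" level apart from the first row of values, which is the ambiguity described in the list above.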

What I recommend for now:

  • do not use Analysis.annotate_samples with save=True. This is safe and entirely backward compatible. In addition, functions which need MultiIndex columns can produce them on the fly.
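As a rough sketch of what "on the fly" could look like, the snippet below builds MultiIndex columns from a separate sample table whenever they are needed, so the matrix on disk keeps a single header row. The helper `with_sample_annotation`, the toy matrix and the sample table are assumptions made for illustration; they are not the toolkit's own implementation.

```python
import pandas as pd

# Toy matrix with plain (single-level) columns, as it would be stored on disk.
matrix = pd.DataFrame(
    [[0.5, 1.5], [2.0, 3.0]],
    index=["region_a", "region_b"],
    columns=["s1", "s2"],
)

# Toy sample metadata, kept separately from the matrix.
sample_table = pd.DataFrame(
    {"sample_name": ["s1", "s2"], "cell_type": ["A", "B"], "replicate": [1, 2]}
)


def with_sample_annotation(matrix, sample_table, attributes):
    """Return a copy of `matrix` whose columns are a MultiIndex of `attributes`.

    Hypothetical helper: rows of `sample_table` are aligned to the matrix
    columns through the "sample_name" column.
    """
    annotation = sample_table.set_index("sample_name", drop=False).loc[
        matrix.columns, attributes
    ]
    annotated = matrix.copy()
    annotated.columns = pd.MultiIndex.from_frame(annotation)
    return annotated


annotated = with_sample_annotation(
    matrix, sample_table, ["sample_name", "cell_type", "replicate"]
)
print(annotated.columns.names)
```

The original matrix is left untouched, so it can still be written to and read from CSV with a single header row.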

What the future will likely be:

  • using serialization formats that accommodate large numeric matrices together with metadata associated with both samples and features, such as HDF5 or h5ad/AnnData.


afrendeiro commented on May 28, 2024

For more context, see pandas-dev/pandas#17086.


afrendeiro commented on May 28, 2024

60b71f6 now assumes that the "matrix_norm" key in Analysis.load_data refers to a dataframe without MultiIndex columns.
However, it checks that all columns are of type float (as they should be for a matrix_norm attribute) and, if they are not, gives a detailed message on the likely cause (a MultiIndex CSV) and on how to read the file in properly by hand: analysis.matrix_norm = pandas.read_csv('{file}', index_col=0, header=list(range(x))), where x is the number of header rows.
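The check and the manual fix can be sketched roughly as follows. This is not the code from 60b71f6: the toy matrix, the level names and the `all_float` variable are assumptions for illustration, and only the `pandas.read_csv` call mirrors the one quoted above.

```python
import io

import pandas as pd

# Toy matrix with two column levels, written out as CSV text.
columns = pd.MultiIndex.from_arrays(
    [["s1", "s2"], ["A", "B"]], names=["sample_name", "cell_type"]
)
matrix = pd.DataFrame(
    [[0.5, 1.5], [2.0, 3.0]], index=["region_a", "region_b"], columns=columns
)
csv_text = matrix.to_csv()

# Naive read: only the first row is treated as header, so the second
# header row ends up in the data and the columns are no longer float.
naive = pd.read_csv(io.StringIO(csv_text), index_col=0)
all_float = all(dtype.kind == "f" for dtype in naive.dtypes)
if not all_float:
    print("Columns are not all float; the file may have MultiIndex columns.")

# Manual fix: state the number of header rows explicitly.
x = 2  # number of header rows, must be known by the caller
proper = pd.read_csv(io.StringIO(csv_text), index_col=0, header=list(range(x)))
print(proper.dtypes)
```

Note that this dtype check only catches the problem when at least one extra header row contains non-numeric values; a header level that is entirely numeric would still parse as float.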

Since Analysis.annotate_samples now defaults to save=False, I think this should solve the issue.
