Comments (12)

tstoeger avatar tstoeger commented on August 19, 2024

I overlooked the need for a very specific Python 2.7 environment (outlined in https://clue.io/cmapPy/build.html#install). That requirement goes beyond the information provided in the README, and it is inconsistent with the tutorial: following it sets up a cmapPy version that requires parse.parse() instead of parse().

Adding to the confusion, the file names had changed between the tutorial and the public version of GSE70138, which opened the possibility that the file format had changed as well.

from cmappy.

oena avatar oena commented on August 19, 2024

Hi @tstoeger, sorry you had difficulties in using the tutorial. If you have suggestions as to how to make installation instructions more clear, feel free to let us know; the README currently links out to ReadTheDocs in order to help us keep documentation in a centralized place and (hopefully) up to date.

Regarding the tutorial, I'll update the inconsistencies regarding use of parse methods. With regard to scope, we definitely hope to add more tutorials in the future, but for the time being only have one with GEO data because we guessed that would be the most common use case for the package. Just for the record--should you want to investigate error messages/bugs without dealing with external datasets in the future--we do already have a variety of files used for testing to disambiguate code vs. file issues; these are located in cmapPy/cmapPy/pandasGEXpress/tests/functional_tests.

tstoeger avatar tstoeger commented on August 19, 2024

Hi @oena, let me first thank you, both for your inquiry and for the existing documentation of cmapPy, which has already been very useful. The tutorial is indeed a very nice extra.

My troubles arose from running into several slightly different problems and noticing that at least three distinct aspects seemed to have changed: the version of the dataset used, something related to external Python code, and something related to cmapPy. Since I took the tutorial as my reference, this hinted that I was overlooking something, but I could not tell for sure which aspect I should trust or follow.

Possibly, the tutorial could:

  • prominently state that cmapPy, somewhat unexpectedly, needs a Python 2.7 environment, possibly adding a check (I misread the statement on the virtual environment as: "Follow good practice, and set up a dedicated virtual environment for individual tasks - and then import the packages listed in requirements.txt")
  • check the cmapPy version number and print it
  • be complemented by a second tutorial (similar to the supplied unit tests) dedicated to testing the most basic workings, using one of the supplied data sets rather than an external one
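
The first two suggestions could look roughly like the sketch below, placed at the top of a tutorial notebook. It assumes the package is installed under the distribution name cmapPy; the helper names installed_version and environment_report are hypothetical and not part of cmapPy.

```python
import sys

try:
    from importlib.metadata import version as _dist_version  # Python 3.8+
except ImportError:
    _dist_version = None  # older interpreters, including Python 2.7


def installed_version(package_name):
    """Return the installed version string of a package, or None if unknown."""
    try:
        if _dist_version is not None:
            return _dist_version(package_name)
        import pkg_resources  # ships with setuptools
        return pkg_resources.get_distribution(package_name).version
    except Exception:
        # Package not installed, or no metadata tooling available.
        return None


def environment_report(package_name="cmapPy"):
    """Describe the interpreter and package version in one line."""
    python_version = "{0}.{1}".format(sys.version_info[0], sys.version_info[1])
    package_version = installed_version(package_name) or "not installed"
    return "Python {0}, {1} {2}".format(python_version, package_name, package_version)


print(environment_report())
```

Printing this line before any parsing step would let a reader compare their interpreter and cmapPy version against the ones the tutorial was written for.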

oena avatar oena commented on August 19, 2024

Those points all seem very reasonable to me, thanks! I'll see what we can do to address them better than we do currently.

benanbardak avatar benanbardak commented on August 19, 2024

Hi,
Although I'm using Python 2.7, I get the same error that you received above: "Exception: parse_gctx check_id_validity some of the ids being used to subset the data are not present in the metadata for the file being parsed - mismatch_ids: ...". The file I'm trying to parse is from GSE92742. I would appreciate it if you could tell me how you solved the above problem.

tstoeger avatar tstoeger commented on August 19, 2024

I made a Python 3 compatible version of cmapPy; credit for identifying the critical section goes to @heltena.

In my usage scenario, a single-line addition was sufficient:

curr_dset.read_direct(temp_array)
temp_array = np.core.defchararray.decode(temp_array, 'utf8')  # <- introduced for Python3 compatibility
header_values[str(k)] = temp_array

My usage scenario was restricted to gctx files, which simplifies the problem of Python 3 compatibility. I didn't check the gctx definition for future compatibility of the encoding. I have only constructed tests with GSE92742 level 5. I also bypassed GCToo instances as output, since I have only ever used the data frame contained within them; hence, I did not check their creation for compatibility with Python 3. The above covers my usage of cmapPy.
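
To illustrate why a decoding step like the one above can matter under Python 3, here is a minimal, self-contained sketch. It does not touch cmapPy or a real gctx file: the array raw_ids is a made-up stand-in for fixed-width byte strings of the kind an HDF5 string dataset can return, and whether this is the exact cause of the mismatch_ids error reported in this thread is an assumption.

```python
import numpy as np

# Made-up stand-in for ids read from a file as fixed-width byte strings.
raw_ids = np.array([b"sample_1", b"sample_2"], dtype="S8")

# An id as a user would typically type it in Python 3: a text string.
requested = "sample_1"

# Under Python 3, bytes and str never compare equal, so the lookup fails.
found_before = requested in set(raw_ids.tolist())

# np.char.decode is the public alias of np.core.defchararray.decode.
decoded_ids = np.char.decode(raw_ids, "utf8")
found_after = requested in set(decoded_ids.tolist())

print(found_before, found_after)
```

Under Python 3 the first lookup misses and the second one succeeds, which is consistent with a one-line decode being enough for the usage described here.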

saksham219 avatar saksham219 commented on August 19, 2024

Hi @benanbardak
It would be helpful if you could mention which file from GEO you are using to read in the metadata. There are five files given here

benanbardak avatar benanbardak commented on August 19, 2024

First, thank you for your response.
I am using "GSE92742_Broad_LINCS_Level3_INF_mlr12k_n1319138x12328.gctx.gz", but I get the error "Exception: parse_gctx check_id_validity some of the ids being used to subset the data are not present in the metadata for the file being parsed - mismatch_ids:.."

saksham219 avatar saksham219 commented on August 19, 2024

That is a 48 GB file, so it will take me some time to download it. I tried another file from the same series, "GSE92742_Broad_LINCS_Level2_GEX_delta_n49216x978.gctx.gz", and metadata parsing works in Python 2.
If you try this smaller file and it fails, the issue might be with your version of cmapPy. If it does not fail, the 48 GB file might contain something different that the package is not able to handle.

benanbardak avatar benanbardak commented on August 19, 2024

And please check your email, @tstoeger.

benanbardak avatar benanbardak commented on August 19, 2024

@saksham219 I tried to run the tutorial with this file, "GSE92742_Broad_LINCS_Level2_GEX_delta_n49216x978.gctx.gz", but I get the same error again. How can I solve this problem? What do you mean by "the issue might be with your version of cmapPy"? How can I fix my version of cmapPy?
Thank you so much.

saksham219 avatar saksham219 commented on August 19, 2024

@benanbardak What I mean is that you might not be using the latest version from the master branch of this repo.
You can try running this from the terminal:

$ git clone https://github.com/cmap/cmapPy
$ pip install cmapPy/

and then try reading the file again in a new Python environment.

If the problem persists, it would be helpful if you could list the versions of the packages in your Python environment with
$ pip freeze
