Dependencies problem about 2020plus HOT 3 CLOSED

karchinlab commented on June 3, 2024

Dependencies problem

from 2020plus.

Comments (3)

ctokheim commented on June 3, 2024

Hi @vmelichar.

I've attached an updated environment file for installing the 2020plus environment (2020plus_environment.yml.gz). It worked for another user who had issues recently, and I just confirmed that it lets me run the unit tests successfully.

To install the environment, I would first either delete your current environment or change the name of the environment in the yaml file.

Then run conda (or mamba) to install.

conda env create -f 2020plus_environment.yml

For me, conda was quite slow to create the environment, so I actually installed it with mamba:

conda install mamba
mamba env create -f 2020plus_environment.yml

This should install everything that is necessary, including the R dependencies. You then just need to activate the env.
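The install steps above can also be scripted. This is a minimal sketch, assuming the attachment was saved as 2020plus_environment.yml.gz in the working directory; the helper names `decompress_env_file` and `build_create_command` are hypothetical and not part of 2020plus. It decompresses the file and prefers mamba when it is on PATH, falling back to conda otherwise.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def decompress_env_file(gz_path):
    """Write the decompressed copy next to the .gz file and return its path."""
    gz_path = Path(gz_path)
    yml_path = gz_path.with_suffix("")  # strips the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(yml_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return yml_path


def build_create_command(env_file, which=shutil.which):
    """Return the env-create command, preferring mamba over conda if found."""
    tool = "mamba" if which("mamba") else "conda"
    return [tool, "env", "create", "-f", str(env_file)]


# Example usage (assumed file name, run from the download directory):
#   env_file = decompress_env_file("2020plus_environment.yml.gz")
#   subprocess.run(build_create_command(env_file), check=True)
```

The `which` parameter exists only so the tool selection can be checked without having either tool installed.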

A quick way to check whether the 2020plus code is installed correctly is to run the unit tests (this is quicker than running 20/20+ on large data and only finding out later in the pipeline that there was a problem):

pip install nose
nosetests tests/test_features.py
nosetests tests/test_train.py
nosetests tests/test_classify.py
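If you would rather stop at the first failing test file, the three commands above can be wrapped in a small script. This is only a sketch: `run_until_failure` is a hypothetical helper, and it assumes `nosetests` is on PATH and that the script is run from the root of the 2020plus checkout.

```python
import subprocess

TEST_FILES = [
    "tests/test_features.py",
    "tests/test_train.py",
    "tests/test_classify.py",
]


def run_until_failure(test_files, run=subprocess.call):
    """Run nosetests on each file in order.

    Returns the first file whose run exits non-zero, or None if all pass.
    """
    for path in test_files:
        if run(["nosetests", path]) != 0:
            return path
    return None


# Example usage:
#   failed = run_until_failure(TEST_FILES)
#   print("all tests passed" if failed is None else "failed: " + failed)
```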


vmelichar commented on June 3, 2024

Thank you @ctokheim for the response. The installation of the env worked perfectly, and all the tests passed. But I am still getting errors when running on actual data. It might still be a problem with pandas...

This is the first error I get in the pipeline:

Traceback (most recent call last):
  File "/home/melichv/miniconda3/envs/2020plus/bin/probabilistic2020", line 8, in <module>
    sys.exit(cli_main())
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/probabilistic2020.py", line 284, in cli_main
    main(opts)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/probabilistic2020.py", line 229, in main
    result_df = rt.main(opts, mutation_df)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/randomization_test.py", line 392, in main
    opts['use_unmapped'])
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/count_frameshifts.py", line 37, in count_frameshift_total
    gene_df = fs_df[fs_df['Gene']==bed.gene_name]
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/frame.py", line 2982, in __getitem__
    return self._getitem_frame(key)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/frame.py", line 3082, in _getitem_frame
    return self.where(key)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/generic.py", line 9276, in where
    cond, other, inplace, axis, level, errors=errors, try_cast=try_cast
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/generic.py", line 9123, in _where
    axis=block_axis,
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 557, in where
    return self.apply("where", **kwargs)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 436, in apply
    kwargs[k] = obj.reindex(b_items, axis=axis, copy=align_copy)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/util/_decorators.py", line 221, in wrapper
    return func(*args, **kwargs)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/frame.py", line 3976, in reindex
    return super().reindex(**kwargs)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/generic.py", line 4514, in reindex
    axes, level, limit, tolerance, method, fill_value, copy
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/frame.py", line 3858, in _reindex_axes
    columns, method, copy, level, fill_value, limit, tolerance
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/frame.py", line 3906, in _reindex_columns
Dropped 139 mutations after only keeping valid SNVs
    allow_dups=False,
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/generic.py", line 4577, in _reindex_with_indexers
    copy=copy,
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 1251, in reindex_indexer
    self.axes[axis]._can_reindex(indexer)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3362, in _can_reindex
    raise ValueError("cannot reindex from a duplicate axis")
ValueError: cannot reindex from a duplicate axis

Then there is also this one:

Traceback (most recent call last):
  File "/home/melichv/miniconda3/envs/2020plus/bin/mut_annotate", line 8, in <module>
    sys.exit(cli_main())
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 432, in cli_main
    main(opts)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 417, in main
    multiprocess_permutation(bed_dict, mut_df, opts, indel_df)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 86, in multiprocess_permutation
    indel_cts_dict = indel_df['Gene'].value_counts().to_dict()
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/generic.py", line 5179, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'value_counts'

And this one:

Traceback (most recent call last):
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/utils.py", line 131, in wrapper
    result = f(*args, **kwds)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/randomization_test.py", line 46, in singleprocess_permutation
    genes_with_mut = set(mut_df['Gene'].unique())
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/generic.py", line 5179, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'unique'
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/utils.py", line 131, in wrapper
    result = f(*args, **kwds)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/randomization_test.py", line 46, in singleprocess_permutation
    genes_with_mut = set(mut_df['Gene'].unique())
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/pandas/core/generic.py", line 5179, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'unique'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/melichv/miniconda3/envs/2020plus/bin/probabilistic2020", line 8, in <module>
    sys.exit(cli_main())
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/probabilistic2020.py", line 284, in cli_main
    main(opts)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/probabilistic2020.py", line 229, in main
    result_df = rt.main(opts, mutation_df)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/randomization_test.py", line 414, in main
    permutation_result = multiprocess_permutation(bed_dict, mut_df, opts)
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/randomization_test.py", line 178, in multiprocess_permutation
    for chrom_result in process_results:
  File "/home/melichv/miniconda3/envs/2020plus/lib/python3.6/multiprocessing/pool.py", line 761, in next
    raise value
AttributeError: 'DataFrame' object has no attribute 'unique'

Do you think there is a problem with my data and not with 2020plus?

Thank you for your help!


vmelichar commented on June 3, 2024

There was indeed an error in my input file. I produced the MAF by merging multiple MAFs, but 20/20+ takes only 8 columns as input, so many rows were duplicates. I corrected the MAF so that it contains only the specific input columns and no duplicate rows, and now everything works.
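For anyone hitting the same tracebacks: `'DataFrame' object has no attribute 'unique'` and a `_getitem_frame` call ending in `cannot reindex from a duplicate axis` are consistent with a column label such as `Gene` occurring more than once, because `df['Gene']` then returns a DataFrame rather than a Series. Below is a minimal sketch of checking for that and cleaning the table. The toy values, the `Tumor_Sample` column name, and the helpers `find_duplicate_columns` and `clean_mutations` are illustrative assumptions, not part of 2020plus, and how the duplicate label arises inside the pipeline is not confirmed here.

```python
import pandas as pd


def find_duplicate_columns(df):
    """Return column labels that occur more than once."""
    return sorted(set(df.columns[df.columns.duplicated()]))


def clean_mutations(df, keep_columns):
    """Keep the first copy of each wanted column and drop duplicate rows."""
    deduped = df.loc[:, ~df.columns.duplicated()]
    return deduped[keep_columns].drop_duplicates().reset_index(drop=True)


# Toy table with the 'Gene' label twice, as can happen after merging files.
merged = pd.DataFrame(
    [["TP53", "s1", "TP53"], ["TP53", "s1", "TP53"], ["KRAS", "s2", "KRAS"]],
    columns=["Gene", "Tumor_Sample", "Gene"],
)

# With a duplicated label, selecting the column yields a DataFrame, not a
# Series, so Series-only methods such as .unique() are unavailable.
selection = merged["Gene"]
print(type(selection).__name__)        # DataFrame
print(find_duplicate_columns(merged))  # ['Gene']

# After cleaning, 'Gene' is a single column again and duplicate rows are gone.
cleaned = clean_mutations(merged, ["Gene", "Tumor_Sample"])
print(cleaned["Gene"].unique())
```

Pass the columns your 20/20+ input actually requires as `keep_columns`; the two used here are placeholders.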

I appreciate your help earlier.

