nsalomonis / altanalyze

AltAnalyze is a multi-functional and easy-to-use software package for automated single-cell and bulk gene and splicing analyses. Easy-to-use precompiled graphical user-interface versions available from our website.

Home Page: http://www.altanalyze.org

License: Apache License 2.0

Python 98.82% Batchfile 0.01% Rich Text Format 1.18%
rna-seq-data alternative-splicing scrna-seq-data bioinformatics-pipeline clustering pathway-analysis network-analysis splicing-quantification splicing-visualization python

altanalyze's Introduction

AltAnalyze

An automated cross-platform workflow for RNA-Seq gene, splicing and pathway analysis

AltAnalyze is an extremely user-friendly and open-source analysis tool that can be used for a broad range of genomics analyses. These analyses include the direct processing of raw single-cell and bulk RNASeq or microarray data files, advanced methods for single-cell population discovery and dataset comparison, differential expression analyses, analysis of alternative splicing/promoter/polyadenylation and advanced isoform function prediction analysis (protein, domain and microRNA targeting). Multiple advanced visualization tools and à la carte analysis methods are supported in AltAnalyze (e.g., network, pathway, splicing graph). AltAnalyze is compatible with various data inputs for RNASeq data (FASTQ, BAM, BED), microarray platforms (Gene 1.0, Exon 1.0, junction and 3' arrays) for automated gene expression and splicing analysis. This software requires no advanced knowledge of bioinformatics programs or scripting or advanced computer hardware. User friendly videos, online tutorials and blog posts are also available.

Dependencies

If installed from PyPI (pip install AltAnalyze), the below dependencies should be included in the installed package. When running from source code you will need to install the following libraries.

  • Required: Python 2.7, numpy, scipy, matplotlib, sklearn (scikit-learn)
  • Recommended: umap-learn, nimfa, numba, python-louvain, annoy, networkx, R 3+, fastcluster, pillow, pysam, requests, python-igraph, cairo
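When running from source, it can help to check which of these libraries are importable before starting an analysis. The following is a minimal sketch, not part of AltAnalyze. The import names are assumptions, since several packages install under a module name that differs from the package name (for example scikit-learn as `sklearn`, python-louvain as `community`, pillow as `PIL`, python-igraph as `igraph`).

```python
# Assumed import names for the libraries listed above (not taken from AltAnalyze).
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn"]
RECOMMENDED = ["umap", "nimfa", "numba", "community", "annoy", "networkx",
               "fastcluster", "PIL", "pysam", "requests", "igraph", "cairo"]


def missing_modules(names):
    """Return the entries of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            # Covers ModuleNotFoundError on Python 3 as well.
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED)` and `missing_modules(RECOMMENDED)` returns the names that still need to be installed; an empty list means everything in that group was found.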

AltAnalyze documentation and stand-alone archives are provided on SourceForge as well as on GitHub. For questions not addressed here, please contact us.

News Update 5/17/20

http://altanalyze.readthedocs.io/en/latest/images/AltAnalyzeOverview.gif

altanalyze's People

Contributors

kpeaton, nsalomonis

altanalyze's Issues

Error message in processing CEL files

What steps will reproduce the problem?
1.Begin Analysis
2.EnsMart65 / Affymetrix / Canis Familiaris / Affymetrix expression array
3.Process CEL files

What is the expected output? What do you see instead?
Receive Error Message: "The CEL files indicate that the proper species is Cr, 
however, you indicated Cf. The species indicated by the CEL files will be used 
instead."

What version of the product are you using? On what operating system?
Version 2.0 / Windows 7 Enterprise

Please provide any additional information below.
The  CEL files are Canis familiaris

Original issue reported on code.google.com by [email protected] on 27 Feb 2015 at 2:42

Won't run in OS 10.9.1 to "Add new species"

What steps will reproduce the problem?
1.  Open AltAnalyze and begin
2.  Click on Add New Species
3.  receive error and program crashes

What is the expected output? What do you see instead?


What version of the product are you using? On what operating system?
Mac OS 10.9.1

Please provide any additional information below.
Console says:
com.apple.launchd.peruser.501: 
([0x0-0x6b96b9].org.pythonmac.unspecified.AltAnalyze[46436]) Exited with code 
255

Original issue reported on code.google.com by [email protected] on 10 Jan 2014 at 6:18

Cannot open altanalyze on mac

What steps will reproduce the problem?
1. When I click on the AltAnalyze icon, after downloading to my applications 
folder, I get the message "Altanalyze has encountered a fatal error, and will 
now terminate."
2.
3.

What is the expected output? What do you see instead?


What version of the product are you using? On what operating system? all 
versions I try. mac os X version 10.5.8


Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 3 Oct 2011 at 6:12

Changes to ASPIRE p-value calculation

Background
In version 1.16 beta of AltAnalyze, we introduced the ASPIRE p-value as the 
default metric for assessing reproducibility of ASPIRE scores for reciprocal 
junction analysis of HJAY, MJAY and AltMouse arrays. Previously, the 
recommended metric was the permutation p-value (still optional). An issue with 
the permutation p-value is that its power depends on the number of possible 
permutations, which is determined by the number of replicates.

The ASPIRE p-value was designed as an optional replacement for the permutation 
p-value. This algorithm calculates and stores an ASPIRE score for each 
experimental sample for the two reciprocal probesets (normalized baseline 
expression and normalized experimental expression), versus the mean normalized 
baseline expression for all baseline samples. The same is done for each 
baseline sample (normalized baseline sample values treated as experimental in 
the ASPIRE equation). These scores are stored in two different lists (baseline 
replicates versus baseline average and experimental replicates versus baseline 
average). A one-way ANOVA p-value is calculated between these two lists to 
derive the ASPIRE p-value. 

Updates
While this p-value initially seemed an appropriate replacement, the baseline 
replicate comparisons appear too variable when only a small number of 
replicates is present. Thus, we changed this metric as follows:
Normalized baseline replicate values are now compared to the mean normalized 
experimental replicate values. Here, the baseline replicates are treated as 
experimental and the experimental mean is treated as baseline in the ASPIRE 
equation. 

As before, experimental replicate values are compared to the baseline mean and 
the two resulting replicate ASPIRE score lists are compared by one-way ANOVA. 
Preliminary analysis of RNASeq data (version 2.0) where only two replicates are 
present seems to indicate this is a reasonable measure of variability. 
Nonetheless, we recommend using the FDR adjustment of this p-value, present in 
the results file, as an additional filter.
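The updated procedure can be sketched as follows. This is an illustration under stated assumptions, not the AltAnalyze implementation: the ASPIRE equation itself is not given here, so it is passed in as a hypothetical `score(reference, replicate)` callable, and each sample is assumed to be a pair of normalized values for the two reciprocal probesets. Only the one-way ANOVA F statistic is computed; the p-value would come from the F distribution (for example via `scipy.stats.f_oneway`).

```python
def mean(values):
    return sum(values) / float(len(values))


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of numeric lists.

    Raises ZeroDivisionError when there is no within-group variance.
    """
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def replicate_score_lists(baseline, experimental, score):
    """Build the two replicate score lists described above.

    baseline, experimental: lists of (probeset1, probeset2) normalized values,
    one pair per replicate (assumed layout).
    score: hypothetical stand-in for the ASPIRE equation, called as
    score(reference_pair, replicate_pair) with the reference in the
    "baseline" role and the replicate in the "experimental" role.
    """
    base_mean = tuple(mean(col) for col in zip(*baseline))
    exp_mean = tuple(mean(col) for col in zip(*experimental))
    # Baseline replicates are scored against the experimental mean.
    baseline_scores = [score(exp_mean, rep) for rep in baseline]
    # Experimental replicates are scored against the baseline mean.
    experimental_scores = [score(base_mean, rep) for rep in experimental]
    return baseline_scores, experimental_scores
```

The two returned lists are what the one-way ANOVA compares, for example `f_statistic(list(replicate_score_lists(baseline, experimental, score)))`.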


Original issue reported on code.google.com by [email protected] on 5 Feb 2011 at 6:52

Batch effect error

Hello!

I kindly wanted to ask if you could help me to solve the batch effect removal 
error:

What steps will reproduce the problem?
1. Batch effect removal

What is the expected output? What do you see instead?

Traceback (most recent call last):
  File "ExpressionBuilder.pyc", line 1492, in remoteExpressionBuilder
  File "combat.pyc", line 7, in <module>
  File "patsy\__init__.pyc", line 77, in <module>
  File "patsy\highlevel.pyc", line 20, in <module>
  File "patsy\desc.pyc", line 84, in <module>
  File "<string>", line 1, in <module>
ImportError: No module named builtins

What version of the product are you using? On what operating system?

2.0.8.1, Windows 7

Thank you!!

Original issue reported on code.google.com by [email protected] on 19 May 2015 at 12:26

Coordinate mismatch for extremely small exon regions

Description: A novel splice junction was observed for the human EnsMart91 database from an RNA-Seq analysis, where the assigned junction IDs are incorrect (relative to the reported coordinates, which are correct).

AGAP5:ENSG00000172650:I2.1_73696925-E14.1|ENSG00000172650:E2.1-E3.1 with the coordinates
chr10:73696925-73694804|chr10:73697097-73694806

The exon coordinates in this region are:
ENSG00000172650 E3.1 chr10 - 73694804 73694806
ENSG00000172650 E3.2 chr10 - 73694736 73694804

The problem is complicated by the fact that E14.1 doesn't exist, the true 3' exon is E3.1 (not E14.1), and E3.2 starts at 73694804 and not 73694806. Hence this is possibly an issue with extremely small exon regions in the database build process (EnsemblImport.py) and the RNA-Seq annotation alignment function (RNASeq.py - annotateNovelJunctions()).
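One way to see why such small regions are delicate is to look up which listed exon regions contain a given junction coordinate. The sketch below uses only the two coordinate rows quoted above; the function name and tuple layout are hypothetical and this is not how RNASeq.py performs the lookup.

```python
# The two exon region rows quoted above: (gene, region, chr, strand, start, stop).
EXON_REGIONS = [
    ("ENSG00000172650", "E3.1", "chr10", "-", 73694804, 73694806),
    ("ENSG00000172650", "E3.2", "chr10", "-", 73694736, 73694804),
]


def regions_containing(position, regions):
    """Return the region IDs whose [start, stop] interval includes position."""
    return [r[1] for r in regions if r[4] <= position <= r[5]]
```

With these rows, position 73694804 falls inside both E3.1 and E3.2 because the two regions share that boundary coordinate, while 73694806 falls only inside E3.1. Any lookup that assumes one region per coordinate would be ambiguous here.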

Evaluation: There are two problems here:

  1. Database coordinate reference issue (database build)
  2. Splice-junction annotation (bed file import)

For issue 1: Rebuild the database for just this chromosome region and determine where the bug is triggered and under what circumstances (does not appear to be a widespread issue).
For issue 2: Once the above is complete, verify the issue still persists with a sample that contains reads aligning to the novel junction for selected samples (AML). If so, determine why E14.1 is being assigned (is it from the prior gene model, left over - does it break the model).

Multi-group analysis and FIRMA Stalls

What steps will reproduce the problem?
1. Run FIRMA with more than two groups with 3 or more samples per group
2. Aggravated when analyzing extended or full (rather than core) probesets 

What is the expected output? What do you see instead?
 * AltAnalyze should complete the analysis normally but instead will stall or crash with a memory error while running. 
 * This is the result of a bad key value assigned to a global dictionary value (probeset instead of gene for original_avg_const_exp_db in PerformExpressionAnalysis) when updating the constitutive expression value to report in the summary file.

== Solution ==
 * Instead of a probeset variable this should be a gene. Fixed in AltAnalyze version 1.16.


Original issue reported on code.google.com by [email protected] on 23 Dec 2010 at 5:53

Felis Catus Only AltAnalyze Issue

When AltAnalyze is installed only with the Fc database and the Human database is not installed, the program crashes after installing the database with the warning:

self.arraycomp.selectitem(self.default_option[0])

Upon restarting AltAnalyze, it crashes again before the application reaches the main menu. This error does not occur with dog, cow or human with EnsMart72. It looks like no default options are being loaded for cat. It occurs together with a write-permission error when updating array.txt.

Downloading and installing a species-specific database via command-line requires version

What steps will reproduce the problem?
1.Executing...
<python AltAnalyze.py --species Hs --update Official>
2.Gives error...
<Finished downloading the latest configuration files.
Species name to update: Homo sapiens
Ensembl version current
Traceback (most recent call last):
  File "AltAnalyze.py", line 7198, in runCommandLineVersion
    commandLineRun()
  File "AltAnalyze.py", line 6825, in commandLineRun
    for ad in db_versions_vendors[db_version]:
KeyError: 'current'>
3.Executing...
<python AltAnalyze.py --species Hs --update Official --version EnsMart65>
4.Completes successfully

What is the expected output? What do you see instead?
It appears that the default "current" value substituted when a version isn't 
provided isn't being handled correctly.
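A possible way to handle the placeholder is to resolve "current" to a real key before indexing the dictionary. This is a minimal sketch, not the AltAnalyze fix: `db_versions_vendors` is the name from the traceback, but its contents, the helper name and the rule for choosing the newest version (highest numeric suffix) are assumptions.

```python
def resolve_db_version(requested, db_versions_vendors):
    """Map a requested version (possibly the placeholder 'current') to a real key."""
    if requested in db_versions_vendors:
        return requested
    if requested == "current" and db_versions_vendors:
        # Assumption: versions look like 'EnsMart65', so the digits order them.
        def numeric_suffix(name):
            digits = "".join(c for c in name if c.isdigit())
            return int(digits) if digits else 0
        return max(db_versions_vendors, key=numeric_suffix)
    raise ValueError("Unknown database version: %r" % requested)
```

With this helper, the loop from the traceback would iterate over `db_versions_vendors[resolve_db_version(db_version, db_versions_vendors)]` and an unknown version would produce a readable error instead of a bare KeyError.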

What version of the product are you using? On what operating system?
Version 2.0.8 on Ubuntu x64 12.10

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 6 Jan 2014 at 12:27

The viewer is not opening

Hello,

Thank you for the great applications.
I updated to the latest software, but I have a problem opening the viewer. The system systematically reports that the file is damaged.
I am running it on a MacBook Pro under the latest system, "Big Sur". I tried to use an earlier version of the viewer but got the same message.

Thanks in advance for your help,

Chris

Error running altanalyze after pip install -- FileNotFoundError: [Errno 2] No such file or directory: 'altanalyze'

Hi,
I installed altanalyze using the simple command

pip install altanalyze

However, invoking "altanalyze" gives an error that seems to be a Python code error.

Traceback (most recent call last):
  File "/home/user/.local/bin/altanalyze", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.9/site-packages/altanalyze/__init__.py", line 29, in main
    os.chdir("altanalyze")
FileNotFoundError: [Errno 2] No such file or directory: 'altanalyze'

Apparently the script does not properly locate and change to the directory.
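The traceback shows `os.chdir("altanalyze")`, a relative path that is resolved against whatever directory the command was launched from. A common way to avoid that is to build the path from the module's own location. The sketch below is an assumption about what the entry point intends, not the package's actual code; `package_dir` is a hypothetical helper.

```python
import os


def package_dir(module_file):
    """Absolute directory that contains the given module file."""
    return os.path.dirname(os.path.abspath(module_file))


def main():
    # Change into the installed package directory instead of a path
    # relative to the user's current working directory.
    os.chdir(package_dir(__file__))
```

Until the entry point is changed, the relative `chdir` only succeeds when the current directory happens to contain a folder named `altanalyze`.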

Any help is appreciated!

Thanks

Segmentation Fault on OS X 10.6 when running AltAnalyze from Source

The error occurs when downloading the AltAnalyze python source code and running 
from the command-line. I've replicated this now on 3 machines with different 
hardware configurations (32 bit and 64 bit) running OS X 10.6 (Snow Leopard). 
To replicate
1. Don't install python - instead use the OS X default 2.6.1 version
2. Download AltAnalyze program source code
3. Open AltAnalyze (python AltAnalyze.py - from Terminal)
4. Leave AltAnalyze and switch to another application or sometimes select the 
browse button to search for input files in AltAnalyze.
5. An AltAnalyze critical error is reported with the program crashing in a new 
window and a "Segmentation Fault" is reported in the Terminal 

The following should fix this.
1. Install Python version 2.7.1 (not version 2.7). Python 2.7 for OS X 10.5 
won't properly install and configure Tkinter (this is the version linked to on 
the Python download page). Instead install from here: 
http://www.python.org/download/releases/2.7.1/
2. Repeat steps 3 through 4 and the program should run as expected.

The above should resolve this issue (Python developers are aware of it - 
http://bugs.python.org/issue9227). If you happened to have installed Python 
2.7, just install 2.7.1 and make sure the installer is the one for 10.6 rather 
than 10.5.


Original issue reported on code.google.com by [email protected] on 18 Jan 2011 at 11:29

A memory error occurs when running FIRMA

What steps will reproduce the problem?
1. Run FIRMA on a 32 bit installation of Python with multiple experimental 
groups.

What is the expected output? What do you see instead?
Normally AltAnalyze should finish its run but instead, a window appears 
indicating an unexpected error has occurred.


{{{
Calculating FIRMA scores (please be patient)...
Importing probe-to-probeset annotations (please be patient)...
Traceback (most recent call last):
  File "G:\My_Elements\AltAnalyze_Latest\AltAnalyze_v1164release\AltAnalyze_v1164release\AltAnalyze.py", line 3994, in AltAnalyzeSetup
    StatusWindow(root,expr_var, alt_var, goelite_var, additional_var, exp_file_location_db)
  File "G:\My_Elements\AltAnalyze_Latest\AltAnalyze_v1164release\AltAnalyze_v1164release\AltAnalyze.py", line 3641, in __init__
    sys.stdout = status; root.after(100, AltAnalyzeMain(expr_var, alt_var, goelite_var, additional_var, exp_file_location_db, root))
  File "G:\My_Elements\AltAnalyze_Latest\AltAnalyze_v1164release\AltAnalyze_v1164release\AltAnalyze.py", line 4368, in AltAnalyzeMain
    summary_results_db, aspire_output_gene_list, number_events_analyzed = RunAltAnalyze()
  File "G:\My_Elements\AltAnalyze_Latest\AltAnalyze_v1164release\AltAnalyze_v1164release\AltAnalyze.py", line 3598, in RunAltAnalyze
    summary_results_db, summary_results_db2, aspire_output, aspire_output_gene, number_events_analyzed = splicingAnalysisAlgorithms(relative_splicing_ratio,adj_fold_dbase,dataset_name,gene_expression_diff_db,exon_db,ex_db,si_db,dataset_dir)
  File "G:\My_Elements\AltAnalyze_Latest\AltAnalyze_v1164release\AltAnalyze_v1164release\AltAnalyze.py", line 1446, in splicingAnalysisAlgorithms
    splice_event_list, p_value_call, permute_p_values, excluded_probeset_db = FIRMAanalysis(fold_dbase)
  File "G:\My_Elements\AltAnalyze_Latest\AltAnalyze_v1164release\AltAnalyze_v1164release\AltAnalyze.py", line 2460, in FIRMAanalysis
    probe_probeset_db = importProbeToProbesets(fold_dbase); global firma_scores; firma_scores = {}
  File "G:\My_Elements\AltAnalyze_Latest\AltAnalyze_v1164release\AltAnalyze_v1164release\AltAnalyze.py", line 2422, in importProbeToProbesets
    for probe in probeset_probe_db[probeset]: probe_probeset_db[probe] = probeset
MemoryError
}}}

Original issue reported on code.google.com by [email protected] on 19 Nov 2010 at 6:09

Error in altanalyze calling getopt.py

Hi!

I want to use AltAnalyze for my RNA-seq data (specifically to determine transcripts TPMs and splice junctions).
I ran altanalyze as follows:

altanalyze --platform "RNASeq" --species Hs -fastq_dir /fast/work/users/tmari_m/nbp_project/alt_test/ --groupdir /fast/work/users/tmari_m/nbp_project/alt_test/groups.txt --compdir /fast/work/users/tmari_m/nbp_project/alt_test/comps.txt --output /fast/work/users/tmari_m/nbp_project/alt_test/output --expname test

but the software immediately stops giving this error message:

`Arguments input: ['AltAnalyze.py', '--platform', 'RNASeq', '--species', 'Hs', '-fastq_dir', '/fast/work/users/tmari_m/nbp_project/alt_test/', '--groupdir', '/fast/work/users/tmari_m/nbp_project/alt_test/groups.txt', '--compdir', '/fast/work/users/tmari_m/nbp_project/alt_test/comps.txt', '--output', '/fast/work/users/tmari_m/nbp_project/alt_test/output', '--expname', 'test']

Traceback (most recent call last):
File "AltAnalyze.py", line 6347, in commandLineRun
'downsample=','query=','referenceFull='])
File "/fast/users/tmari_m/work/miniconda/lib/python2.7/getopt.py", line 90, in getopt
opts, args = do_shorts(opts, args[0][1:], shortopts, args[1:])
File "/fast/users/tmari_m/work/miniconda/lib/python2.7/getopt.py", line 190, in do_shorts
if short_has_arg(opt, shortopts):
File "/fast/users/tmari_m/work/miniconda/lib/python2.7/getopt.py", line 206, in short_has_arg
raise GetoptError('option -%s not recognized' % opt, opt)
GetoptError: option -f not recognized

There is an error in the supplied command-line arguments (each flag requires an argument)
`
Do you have any idea what could be the issue?
Thank you!
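The "option -f not recognized" message is consistent with how Python's getopt treats an argument that starts with a single dash: it is read as a cluster of short options, so `-fastq_dir` is parsed as `-f`, `-a`, `-s` and so on. The sketch below reproduces that behavior with a hypothetical subset of long options; the real list in AltAnalyze.py is much longer, and treating `--fastq_dir` as the intended flag is an assumption.

```python
import getopt

# Hypothetical subset of long options, for illustration only.
LONG_OPTIONS = ["platform=", "species=", "fastq_dir=", "output="]


def try_parse(argv):
    """Parse long options only; return a dict or the getopt error text."""
    try:
        opts, _ = getopt.getopt(argv, "", LONG_OPTIONS)
        return dict(opts)
    except getopt.GetoptError as err:
        return "GetoptError: %s" % err.msg
```

Here `try_parse(["--species", "Hs", "-fastq_dir", "/data"])` returns the same "option -f not recognized" error, while the double-dash form `--fastq_dir` parses normally.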

unable to download database files

What steps will reproduce the problem?
during database file download.

What is the expected output? What do you see instead?
The expected output is database files downloaded to the directory in the 
altanalyze database.
Instead it shows a no-internet-connection error.

What version of the product are you using? On what operating system?
latest version 2.7.0 on windows 7 ultimate

Please provide any additional information below.
Earlier it worked the first 2-3 times and downloaded the files, but now it does not.

Original issue reported on code.google.com by [email protected] on 12 Aug 2013 at 7:49

Any problems in RNA-seq analysis

AltAnalyze version 2.01 alpha_Py was run with default parameters and the sample 
data "hESC-NP_BED_files.zip", and we get an error: "print 
len(junction_db),'junctions present in',algorithm,'format BED files.' # 
('+str(pos_count),str(neg_count)+' by strand).'
UnboundLocalError: local variable 'algorithm' referenced before assignment";

thanks for your attention!

Original issue reported on code.google.com by [email protected] on 22 Feb 2011 at 7:42

Non-log expression error - AltAnalyze v.1.155 and below

What steps will reproduce the problem?
1. Analyze non-log exon, gene or junction normalized expression file produced 
outside of AltAnalyze (e.g., PLIER, MAS5) using the option "Process Expression 
File". Appropriately designate that the expression values are non-log under the 
option "Expression data format".
2. After successfully analyzing the expression file, return to AltAnalyze and 
select the option "Process AltAnalyze Filtered".
3. Since the option "Expression data format" is not an option when selecting 
"Process AltAnalyze Filtered", AltAnalyze will assume that the data is log2, 
when indeed it is not.

What is the expected output? What do you see instead?
An error will likely occur, although it is possible the analysis will run and 
produce erroneous results (it is unlikely an actual user has encountered this, 
since most users don't use the "Process AltAnalyze Filtered" option; it was 
discovered while performing ongoing software development).

To fix, we will likely log2 transform the expression values prior to export 
from ExonArray and change the status of logtype from non-log to log2 for 
remaining analyses (in ExpressionBuilder.py and in AltAnalyze.py), when the 
data is first analyzed.
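The proposed fix can be sketched as a single conversion step that transforms the values and updates the status together, so later steps never see non-log values labeled as log2. This is an illustration only: the function name, the "non-log"/"log2" labels as strings and the `increment` added to avoid log2(0) are assumptions, not the code in ExpressionBuilder.py.

```python
import math


def ensure_log2(values, logtype, increment=1.0):
    """Return (values, logtype) with the values guaranteed to be log2.

    increment is an assumed offset added before the transform so that
    zero expression values do not raise a math domain error.
    """
    if logtype == "log2":
        return list(values), logtype
    transformed = [math.log(v + increment, 2) for v in values]
    return transformed, "log2"
```

For example, `ensure_log2([0.0, 1.0, 3.0], "non-log")` gives values close to 0, 1 and 2 with the status "log2", and calling it again on the result leaves the values unchanged.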


Original issue reported on code.google.com by [email protected] on 21 Jan 2011 at 8:19

setup.py py2exe includes bug

https://code.google.com/p/altanalyze/source/browse/trunk/AltAnalyze_release/setup.py#88

I don't use your application but just noticed a bug in that the py2exe options 
dictionary has the includes dictionary key repeated, meaning that only the last 
instance will be kept in the dictionary.
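The behavior described here is standard for Python dictionary literals and can be shown with a small example. The module names below are placeholders, not the actual contents of setup.py.

```python
# A dictionary literal with the same key written twice: Python keeps only
# the last value and raises no error. Module names are placeholders.
options = {
    "py2exe": {
        "includes": ["module_a", "module_b"],
        "includes": ["module_c"],  # silently replaces the first list
    }
}

# Writing the key once with a combined list keeps every entry.
fixed_options = {
    "py2exe": {
        "includes": ["module_a", "module_b", "module_c"],
    }
}
```

In the first dictionary, `options["py2exe"]["includes"]` contains only `module_c`, which is why repeated keys in the py2exe options would drop the earlier includes.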

Original issue reported on code.google.com by [email protected] on 6 Aug 2013 at 9:56

Error encountered: Exon or junction is zero

What steps will reproduce the problem?
1. Generate the exon coordinate .bed file from AltAnalyze.
2. Sort that file using BEDTools.
3. Re-run AltAnalyze.

What is the expected output? What do you see instead?
No output.

What version of the product are you using? On what operating system?
AltAnalyze v2.0.6 - Windows

Please provide any additional information below.
The .bam file and junctions.bed (from TopHat) are already in the folder. The 
exon file was generated from AltAnalyze and then sorted with BEDTools. 
Re-running the process now raises the problem. Looking forward to your support.

Thank You!

Original issue reported on code.google.com by [email protected] on 27 May 2012 at 12:27

Misc. 2.0.7 related issues

We have identified the following minor bugs associated with AltAnalyze v. 2.0.7:

1) Relative fold changes for hierarchical clustering have fewer headers than 
values.
2) Adjusted p-values in the alternative exon file are close to 1 in almost all 
cases (likely due to not including non-regulated exons).

Original issue reported on code.google.com by [email protected] on 26 Sep 2012 at 3:33

Buggy 2.0.7

Dear Developers,
There seems to be a problem with the newer versions of AltAnalyze in my case. 
I'm sure if it's working for other people then it's something that can be 
solved. But here's a short description:
I have a table of Illumina expression values that I can feed successfully into 
version 2.0.3 for example and get an ExpressionOutput folder with my results. 
It doesn't provide me with any annotation data though, but that is secondary.
When I use the same file in 2.0.6 or 2.0.7 it gives me an error that looks like 
this, regardless of the data I pass on to it:

"Beginning to Process the Mm 3'array dataset
Adding additional gene, GO and WikiPathways annotations
* * * * * * * * * * * * * * * * * * * ArrayID annotations imported in 15 seconds
45599 Array IDs with annotations from Illumina annotation files imported.
Processing the expression file: 
R:/Data/Transcriptomics/Illumina/TiKC_den+PC_GA+TA/20120803_ExpressionValues_Raw_DM.txt
25697 IDs imported...beginning to calculate statistics for all group comparisons
Traceback (most recent call last):
  File "AltAnalyze.pyc", line 4916, in AltAnalyzeSetup
  File "AltAnalyze.pyc", line 4374, in __init__
  File "AltAnalyze.pyc", line 5207, in AltAnalyzeMain
  File "ExpressionBuilder.pyc", line 1360, in remoteExpressionBuilder
  File "ExpressionBuilder.pyc", line 145, in calculate_expression_measures
  File "reorder_arrays.pyc", line 124, in reorder
  File "statistics.pyc", line 492, in log_fold_conversion
OverflowError: math range error


...exiting AltAnalyze due to unexpected error"

Since the issue doesn't change with the type of data set I'm feeding into it, 
I'm finding it hard to figure out what the problem is. I also checked the files 
for NAs or non-numeric entries, but there are none.
Any idea what the problem might be?

Thanks a bunch!

PS: I'm operating on a 64-bit windows 7 system.

Original issue reported on code.google.com by [email protected] on 14 Aug 2012 at 8:59

NKX2-5 missing from HuEx database

This appears to be due to a very long overlapping RNA (UCSC) that is a part of a 
gene neighboring NKX2.5 and spanning its gene structure. NKX2.5 should be 
included in the database so this should be considered a bug.

Original issue reported on code.google.com by [email protected] on 8 May 2012 at 4:28

Predicted sample export failed in GUI

Hello,
I am trying to map a list of FASTQ files to the Mus musculus reference for my research. My system has 32 GB of RAM and an i7 processor, but when I start the run I get the following error. I don't have any idea how to solve this and would be very grateful for your help.
I have attached the image for your reference. Thank you for your time.

Error

pair end rna seq issue

What steps will reproduce the problem?
1. Run AltAnalyze to analyze paired-end RNA-seq data.
2. Sequences were mapped with TopHat 1.4.1.
3. Build the exon BED file from the BAM with the instructed command line.
4. Run AltAnalyze as instructed.

What is the expected output? What do you see instead?
AltAnalyze should finish, but instead quits with the following messages:
Processing exon/junction coordinates sequentially by chromosome...
* * * * * * * * * * * * * * * * * * * * * * user coordinates imported/processed
Importing read counts from coordinate data...
129485 junction read counts present for TopHat
Normalizing junction expression (RPKM analogue - 60nt length)... 13_exons.bed 
('chr1', 183074182, 183072530) v.2.0.6 test
Error Encountered... Exon or Junction of zero length encoutered... RPKM 
failed... Exiting AltAnalyze.
[1, 0, 60.0]
Traceback (most recent call last):
  File "AltAnalyze.py", line 4916, in AltAnalyzeSetup
  File "AltAnalyze.py", line 4374, in __init__
  File "AltAnalyze.py", line 5179, in AltAnalyzeMain
  File "RNASeq.pyc", line 1106, in alignExonsAndJunctionsToEnsembl
  File "RNASeq.pyc", line 616, in calculateRPKM
NameError: global name 'kill' is not defined


...exiting AltAnalyze due to unexpected error


What version of the product are you using? On what operating system?
v2.0.7 on a Windows 64-bit system
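The log above reports a zero-length exon or junction during RPKM normalization, and the traceback then ends in a NameError for `kill`, which suggests the error path itself referenced an undefined name. A guarded RPKM calculation that raises an explicit, descriptive error could look like the sketch below. It uses the standard RPKM definition and hypothetical names; it is not the code in RNASeq.py.

```python
def rpkm(read_count, length_nt, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads.

    Raises ValueError instead of dividing by a zero or negative length.
    """
    if length_nt <= 0:
        raise ValueError("Exon or junction has non-positive length: %r" % (length_nt,))
    if total_mapped_reads <= 0:
        raise ValueError("Total mapped reads must be positive: %r" % (total_mapped_reads,))
    return (read_count * 1e9) / (float(length_nt) * total_mapped_reads)
```

A caller can catch the ValueError, report the offending coordinates and skip or abort, rather than failing on an unrelated name lookup.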

Original issue reported on code.google.com by [email protected] on 6 Oct 2012 at 7:17

Bugs with constitutive vs. known exons (e.g. core probesets)

What steps will reproduce the problem?
1. In version 2.02, run AltAnalyze for CEL or junction BED files and under 
Expression Analysis Parameters and "Determine gene expression levels using", 
select either "core probesets" or "known exons".

What is the expected output? What do you see instead?
-For exon and junction arrays, constitutive probesets will still be used for 
all calculations.

SOLUTION:
This bug is fixed in version 2.03. The bug was due to a conflicting variable 
name in the config file options.txt.

Original issue reported on code.google.com by [email protected] on 27 May 2011 at 10:36

Windows serialization load error - misopy

Issue: Failure to produce SashimiPlots on Windows 64. This was not an issue in Windows 32. The error presents as an EOFError in pickle_utils.py of misopy at the line loaded_obj = pickle.load(pickled_file).

AltAnalyze v.2.0.7 crash - local variable 'log_fold'

A critical error can occur in AltAnalyze when the first row of read counts 
starts with 0. 

The error will look something like this:

  File "...AltAnalyze/ExpressionBuilder", line 1359, in remoteExpressionBuilder
  File "...AltAnalyze/ExpressionBuilder", line 118, in calculate_expression_measures
UnboundLocalError: local variable 'log_fold' referenced before assignment

This is a filtering error in AltAnalyze that can cause some low expression 
genes to be reported as expressed in the gene expression analysis only. This 
shouldn't affect analyses in any major way as it only results in some genes 
being reported "expressed" when they have low overall counts but reasonable 
RPKMs (greater than the user designated threshold).

Two patches are available to fix this for Mac and Windows:

http://altanalyze.org/AltAnalyze_v.2.0.7_MacOSX_patch.zip
http://altanalyze.org/AltAnalyze_v.2.0.7-Win64_patch.zip

Original issue reported on code.google.com by [email protected] on 9 Jan 2013 at 5:29

fewer options displayed

What steps will reproduce the problem?

no option for affymetrix gene arrays.
no options for other manufacturers.

What is the expected output? What do you see instead?
According to manual several options for array types should be present.
Instead  only affymetrix expression arrays seen. 

What version of the product are you using? On what operating system?
AltAnalyze version 2.0.7 on windows 7 ultimate.

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 12 Aug 2013 at 7:51

Containerization?

Great tool here!

Due to the need for specific dependencies, would it be possible to create this repo with a Dockerfile so as to containerize the application?

Cytoscape installation fails without warning

What steps will reproduce the problem(s)?
1. Download AltAnalyze for a machine running Windows and install a species 
database. Cytoscape will be downloaded and extracted automatically. However, 
Cytoscape will either simply fail to open or prompt an 
incompatibility warning (32bit or 64bit incompatibility).
2. Download AltAnalyze for a machine running OSX and install a species 
database. Cytoscape will be downloaded and extracted automatically. However, 
Cytoscape will fail to open when the Cytoscape.app is selected without warning. 
Only opening it in a terminal window with the command "./Cytoscape.app" or 
opening the .jar file will initiate Cytoscape.

Cause of the error:
(Windows) After the cytoscape.zip file is downloaded, zip extraction creates 
faulty binary files due to an apparent problem with Python's builtin zipfile 
library.
(Mac) Unlike Windows, zipfile properly extracts the cytoscape.zip download; 
however, permissions on the directories restrict access to files 
unless the permissions are changed (chmod -R 777 directory).

Solution in next release:
(all operating systems) The Cytoscape folder will be distributed as a 
cytoscape.gz.tar file which is extracted properly by the AltAnalyze supported 
extraction methods. Cytoscape was shown to run properly when extracted from 
this file.
(Mac) Permissions on the Cytoscape directory have been changed. 

Original issue reported on code.google.com by [email protected] on 26 Mar 2011 at 6:49

Wish for checkboxes and option to save settings

Hi,
This is just a wish to increase the usability.
I have been using AltAnalyze intensively over the last 10 days to analyze 
different murine expression data sets.
Certain things I had to redo with certain settings (different from the 
default ones) for several datasets.
It would have been practical if I had the chance to save my settings and then 
just load them again. Instead I had to type them in again and again and of 
course forgot to change the "rawp" to "adjp" and had to redo the thing again.
If I do ORA I can choose between "one" of the listed ones and "all".
I was interested in only three of them. It would have been nice to set that by 
checkboxes (or something equivalent). I think checkboxes were even part of 
a prior GO-Elite version (if I am not mistaken...).
THX
Mark

Original issue reported on code.google.com by [email protected] on 2 Oct 2014 at 7:58

Gene expression filtering not working in command-line mode

What steps will reproduce the problem?
1. Run an analysis with or without specifying --GEcutoff

What is the expected output? What do you see instead?
You will obtain many more alternative exon predictions than when performing the 
same query in the GUI. This occurs because the program was not converting the 
listed gene expression fold cutoff from a string to a float. As a result, 
Python doesn't remove any genes from the analysis based on differential 
constitutive levels.
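
The sketch below illustrates the kind of fix described: convert the command-line value to a float before comparing. In Python 2, comparing a number with a string does not raise an error, and numbers order before strings, so a test like fold > "2" is never true and no gene is filtered. The function names and the data layout are hypothetical, and the direction of the filter (keep genes whose absolute fold is below the cutoff) is an assumption, not AltAnalyze's actual code.

```python
def parse_fold_cutoff(raw_value):
    """Convert a command-line cutoff such as '2' to a float."""
    return float(raw_value)


def genes_passing_cutoff(gene_folds, raw_cutoff):
    """Return gene IDs whose absolute gene-expression fold is below the cutoff.

    gene_folds maps a gene ID to its fold change (illustrative layout).
    """
    cutoff = parse_fold_cutoff(raw_cutoff)
    return sorted(g for g, fold in gene_folds.items() if abs(fold) < cutoff)


# Hypothetical gene IDs and fold values, with the cutoff given as a string
# exactly as it would arrive from the command line.
folds = {"ENSG_A": 1.2, "ENSG_B": 3.5, "ENSG_C": -0.4}
print(genes_passing_cutoff(folds, "2"))
```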

Please use labels and text to provide additional information.
This is fixed in AltAnalyze 2.0.

Original issue reported on code.google.com by [email protected] on 5 Feb 2011 at 9:12

Reduce database size

Species databases are particularly large for human, mouse and rat. To increase 
usability for systems with minimal hard-drive space, we can collapse 
junction/exon/reciprocal-junction annotation flat files into combined entries. 
To do this, we will need to combine unique IDs (e.g., exon/junction/junction 
pairs) into a single entry where the values are the same. This will eliminate 
the subfolders "exon" and "junction" for RNASeq and junction array databases 
and reduce database size by an estimated 20-40%.
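
A minimal sketch of the collapsing idea, assuming each flat-file row can be read as an ID plus a tuple of annotation values. The function name, the "|" separator, and the example IDs are hypothetical, not the format used in the AltAnalyze databases.

```python
from collections import defaultdict


def collapse_identical_entries(annotations):
    """Merge IDs that share identical annotation values into one combined entry.

    annotations maps a feature ID to a sequence of annotation values.
    Returns a dict mapping '|'-joined IDs to the shared tuple of values.
    """
    ids_by_values = defaultdict(list)
    for feature_id, values in annotations.items():
        ids_by_values[tuple(values)].append(feature_id)
    return {"|".join(sorted(ids)): values
            for values, ids in ids_by_values.items()}


# Illustrative rows: two features share the same annotation values.
rows = {
    "E1.1": ("ENSG_X", "chr1", "+"),
    "E1.1-E2.1": ("ENSG_X", "chr1", "+"),
    "E5.1": ("ENSG_Y", "chr2", "-"),
}
collapsed = collapse_identical_entries(rows)
print(len(rows), "->", len(collapsed))
```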

Original issue reported on code.google.com by [email protected] on 21 Mar 2011 at 6:40

TPM or RPKM?

[Question from user by email]

Did AltAnalyze change its exp. file in the ExpressionInput folder? There are now ENS numbers with the : exon numbers… there is also the -steady state file, but if I rerun AltAnalyze using the steady state file I get a warning not to run it using that file… Is there a way to get back to just having a gene matrix table with FPKM (nonlog) and the Gene Symbol?

GO-Elite Reports a goelite_run variable error

What steps will reproduce the problem?
1. Run an analysis where no genes are differentially expressed

What is the expected output? What do you see instead?
goelite_run variable error in AltAnalyze.py line 5432


Original issue reported on code.google.com by [email protected] on 1 Oct 2014 at 3:41

Group Files Annotations

What steps will reproduce the problem?
1. Uploading a gene expression file in log 2
2. Assigning group names for group-wise comparisons: Group 1 Exp; Group 1 Base; 
Group 2 Exp; Group 2 Base
3. Assigning the comparisons

What is the expected output? What do you see instead?
comparing Group 1 (exp vs. base) and Group 2 (exp vs. base)
the following error message occurs: "you must at least assign two groups for 
each comparison"

What version of the product are you using? On what operating system?
v.2.0.8

Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 27 Aug 2013 at 12:57

t-test calculation for unequal sized groups is inaccurate

Applicable to version 1.15 and below. Three issues with the t-test calculation 
were recently discovered:
  1) The equal group size t-test calculation was used for unequal group sizes (http://en.wikipedia.org/wiki/Student's_t-test)
  2) The probability equation for t-test p-values rounded the degrees of freedom rather than rounding down.
  3) The equation assuming equal variance was incorrect.

The t-test calculation is only used for the calculation of splicing-index and 
FIRMA p-values and assumes only unequal variance. Thus, this issue has a 
minor impact on the t-test p-values of splicing scores when the group sizes are 
unequal. Issue 3 is not really applicable, since equal variance is not 
currently assumed anywhere in AltAnalyze. To test this, load the statistics.py 
module and submit group values for equal sized and unequal sized groups, with 
equal and unequal variance (2 and 3), to the ttest() and t_probability() functions.

To correct this, we have replaced the ttest method with the 
statistics.OneWayANOVA() function in version 1.16.
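
For reference, the unequal-variance (Welch) t statistic for groups of different sizes, with the degrees of freedom rounded down, can be sketched as follows. This is a standalone illustration written for this note. It is not the code in statistics.py and not the OneWayANOVA replacement, and the function name welch_t is hypothetical.

```python
import math


def welch_t(group1, group2):
    """Return (t, df) for Welch's unequal-variance t-test.

    df is the Welch-Satterthwaite estimate rounded down to an integer.
    Each group needs at least two values and the groups must not both
    have zero variance.
    """
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / float(n1)
    mean2 = sum(group2) / float(n2)
    # Unbiased sample variances (divide by n - 1).
    var1 = sum((x - mean1) ** 2 for x in group1) / float(n1 - 1)
    var2 = sum((x - mean2) ** 2 for x in group2) / float(n2 - 1)
    se1, se2 = var1 / n1, var2 / n2
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, int(math.floor(df))


# Unequal group sizes (4 vs 5 values).
t_stat, dof = welch_t([1, 2, 3, 4], [2, 4, 6, 8, 10])
print(round(t_stat, 4), dof)
```

Here the unrounded degrees of freedom are about 5.52, so rounding to the nearest integer would give 6 while rounding down gives 5, which is the distinction raised in issue 2.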

Original issue reported on code.google.com by [email protected] on 2 Oct 2010 at 5:56

A custom database

Good afternoon,
I did not find it in the manual but is it now possible to make an AltAnalyze compatible species database using my own annotation (for example, a custom gtf file)?
Thank you!

PageRank presents with an error

ICGS-NMF errors when performing Down-Sampling

The error is encountered in the PageRankSampling function of the ICGS_NMF module and is raised inside the networkx library when calling:
neighbours=list(G.adj[key1])

which produces a KeyError at return AtlasView(self._atlas[name])

The error can be overcome by increasing the down-sampling threshold (default = 2500 cells). However, the down-sampling option is not available in the version 2.1.4.2 GUI.

Missing exon array probesets from database

What steps will reproduce the problem?
1. Look at probeset 2352152, overlapping with ENST00000271277
2. Look at probeset 3161639 with the transcript cluster annotation 3161566

What is the expected output? What do you see instead?
Both Exon 1.0 probesets should be included in the AltAnalyze human EnsMart62 
database, but are missing. In the first case, it appears that there is a second 
gene overlapping with this probeset region, possibly resulting in exclusion. 
The entire overlapping exon (ENSE00001450658) is properly annotated in 
ensembl/Hs_Ensembl_exon.txt; hence, the issue arises in the module 
ExonArrayEnsemblRules.py (probably overlapping transcript cluster annotations). 
3161639 aligns to an intron of ENSG00000107077 and should be considered a 
"full" annotation. Although it shares the same transcript cluster ID as the exon 
aligning probesets for this gene, other annotated intron aligning probesets 
have more than one distinct transcript cluster. Hence, it is probably excluded 
due to multiple overlapping transcript clusters. Since both 3161639 and the 
exon aligning probesets for this gene share the same transcript cluster, it 
should be included.

Testing during the next Ensembl build (EnsMart 63) should be conducted with 
these probesets to assess where they are excluded and resolve if possible.

Original issue reported on code.google.com by [email protected] on 12 Jun 2011 at 7:57

How to enable cell type annotation like previous AltAnalyze Version.

I have the following data downloadable here.

Now, I'm using the most recent version of AltAnalyze.

However when I tried the following script:

ALTANALYZE=/home/ubuntu/storage2/Tools/altanalyze/AltAnalyze.py
/home/ubuntu/anaconda2/bin/python $ALTANALYZE \
    --runICGS yes \
    --expdir test_outdir \
    --platform RNASeq \
    --species Mm \
    --column_method hopach --rho 0.4 \
    --ExpressionCutoff 4 \
    --FoldDiff 3 \
    --SamplesDiffering 3 \
    --excludeCellCycle conservative

I cannot get this kind of plot, where the cell type is assigned on the left, 
as in the previous version of AltAnalyze.

IMG_20200605_130618

I have removed the old version and no longer know which previous version could create that plot.
Please advise how I can go about it.

UI function defaults stored in option_db

Currently, defaults have to be imported in hard-coded format and are stored in 
a list rather than as attributes in the options attribute dictionary. This is 
not ideal and should be cleaned up in the future (set all as default options in 
option_db and read in variable names from file).
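
One way the cleanup could look, sketched under assumptions: each option's default is stored as an attribute on the option record itself, and the records are read from a tab-delimited file keyed by variable name. The OptionData class, the load_option_db function, the three-column layout, and the example option names are all hypothetical and do not reflect the actual UI module.

```python
class OptionData(object):
    """Hypothetical option record that carries its own default value."""

    def __init__(self, name, display, default):
        self.name = name
        self.display = display
        self.default = default


def load_option_db(lines):
    """Build {variable name: OptionData} from tab-delimited rows.

    Each row is assumed to hold: variable name, display label, default.
    Blank lines and lines starting with '#' are skipped.
    """
    option_db = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        name, display, default = line.split("\t")
        option_db[name] = OptionData(name, display, default)
    return option_db


# Illustrative rows; a real file would be opened and passed in instead.
example_rows = [
    "# name\tdisplay\tdefault\n",
    "p_threshold\tMaximum p-value\t0.05\n",
    "probability_statistic\tStatistic\tmoderated t-test\n",
]
options = load_option_db(example_rows)
print(options["p_threshold"].default)
```

With this layout a default is looked up by variable name rather than by its position in a list.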

Original issue reported on code.google.com by [email protected] on 28 Apr 2012 at 11:09

2.01 alpha non-constitutive gene inclusion issue

What steps will reproduce the problem?
1. Analyze any array dataset looking at genes WITHOUT constitutive annotated 
identifiers (probesets or junctions).
2. Run with different expression filters

What is the expected output? What do you see instead?
If no constitutive probesets/junctions are present, all features that align to 
mRNAs (annotated with an AltAnalyze exon ID) should be included to assess gene 
expression. Instead, only the first probeset or junction is included.

This was fixed in 2.02 beta to now include all probesets/junctions that align 
to mRNAs.

Original issue reported on code.google.com by [email protected] on 25 Feb 2011 at 5:51

Error: No such file or directory: python2.7/site-packages/altanalyze/Config//arrays.txt

I just installed AltAnalyze through pip in a conda environment (with Python 2.7) and, following the instructions here, tried to install the organism database with
altanalyze --species Dr --update Official --additional all --version EnsMart102

This threw the following error (I've replaced some paths with /path/to/):

Current database version: EnsMart102
AltAnalyze.py --species Dr --update Official --additional all --version EnsMart102
Traceback (most recent call last):
  File "AltAnalyze.py", line 8435, in runCommandLineVersion
    commandLineRun()
  File "AltAnalyze.py", line 7951, in commandLineRun
    array_codes = UI.remoteArrayInfo()
  File "/path/to/lib/python2.7/site-packages/altanalyze/UI.py", line 3679, in remoteArrayInfo
    importArrayInfo()
  File "/path/to/lib/python2.7/site-packages/altanalyze/UI.py", line 3685, in importArrayInfo
    for line in open(fn,'rU').readlines():
IOError: [Errno 2] No such file or directory: '/path/to/lib/python2.7/site-packages/altanalyze/Config//arrays.txt'

I don't know what I should do about it. Any help is appreciated.

Errors and differences in results while running the CLI version for ICGS2

I've used the latest github version (Dec 14 2020) and my goal is to run ICGS2 clustering. I've created two versions of input counts - one is plain tsv file (called 'tsv' below) and the other a directory with matrix.mtx.gz, features.tsv.gz and barcodes.tsv.gz (called 'mtx') to emulate the 10X output (I don't know how to make a compatible h5 file). I also had to replace ':' with '_' in gene symbols/names as it seemed to cause some issues. The counts (and gene names) are identical in both cases, but the results are not. In both cases, the ICGS seems to have finished - the last line in the log files is ICGS run complete... halted prior to full differential comparison analysis.

These are the commands I've issued:

tsv:

python altanalyze-master_github12.14.2020/AltAnalyze.py --platform RNASeq --species Dr --expname test1a --output test1a --runICGS yes --expdir 56hpf-LTA-counts/56hpf-LTA-counts.tsv --dataFormat counts

mtx:

python altanalyze-master_github12.14.2020/AltAnalyze.py --platform RNASeq --species Dr --expname test2 --output test2 --runICGS yes --ChromiumSparseMatrix 56hpf-LTA-counts/ --dataFormat counts

I had 4884 cells. For the tsv counts I have 4585 lines (cells) in ICGS-NMF/FinalGroups.txt while for mtx there are 4281.

I have these questions:

  • can I trust the clustering (ICGS2) results despite the multiple logged errors (please see below)? It seems that most of them are related to gene biotype annotations which are missing ...
  • why are some cells missing from the ICGS-NMF/FinalGroups.txt files?
  • why do the two versions give different results? Is there some random component here?

Thank you for your help!

Here are the errors:

try({hopg<-hopach(data,dmat=distmatg,ord="own")})
Error in base::rowMeans(x, na.rm = na.rm, dims = dims, ...) : 
  'x' must be an array of at least two dimensions
In addition: Warning message:
In collap(data, level, d, dmat, newmed) :
  Not enough medoids to use newmed='medsil' in collap() - 
 using newmed='nn' instead 


Traceback (most recent call last):
  File "/path/to/altanalyze-master_github12.14.2020/visualization_scripts/clustering.py", line 261, in heatmap
    newFilename, Z1, Z2 = R_interface.remoteHopach(inputFilename,cluster_method,metric_gene,metric_array)
  File "/path/to/altanalyze-master_github12.14.2020/R_interface.py", line 106, in remoteHopach
    z.Hopach(cluster_method,metric_gene,force_gene,metric_array,force_array)
  File "/path/to/altanalyze-master_github12.14.2020/R_interface.py", line 626, in Hopach
    if 'clustering' in hopach_run:
UnboundLocalError: local variable 'hopach_run' referenced before assignment

hopach failed... continue with an alternative method
Traceback (most recent call last):
  File "/path/to/altanalyze-master_github12.14.2020/RNASeq.py", line 4122, in correlateClusteredGenesParameters
    except Exception: TFs = importGeneSets('BioTypes',filterType='transcription regulator',geneAnnotations=gene_to_symbol_db)
  File "/path/to/altanalyze-master_github12.14.2020/RNASeq.py", line 2826, in importGeneSets
    for line in open(fn,'rU').xreadlines():
IOError: [Errno 2] No such file or directory: '/path/to/altanalyze-master_github12.14.2020/AltDatabase/EnsMart72/goelite/Dr/gene-mapp/Ensembl-BioTypes.txt'
Traceback (most recent call last):
  File "/path/to/altanalyze-master_github12.14.2020/GO_Elite.py", line 1357, in runGOElite
    try:go_to_mod_genes, mapp_to_mod_genes, timediff, mappfinder_input, resource = mappfinder.generateMAPPFinderScores(species,species_code,source_data,mod,system_codes,permute,resources,file_dirs,root,Multi=mlp)
  File "/path/to/altanalyze-master_github12.14.2020/mappfinder.py", line 462, in generateMAPPFinderScores
    if PoolVar: q.put([print_out]); return None
AttributeError: 'NoneType' object has no attribute 'put'
gene associations assigned
Traceback (most recent call last):
  File "/path/to/altanalyze-master_github12.14.2020/stats_scripts/ICGS_NMF.py", line 1295, in CompleteICGSWorkflow
    annotatedGroupsFile = RNASeq.predictCellTypesFromClusters(finalgrpfile, goelite_path)
  File "/path/to/altanalyze-master_github12.14.2020/RNASeq.py", line 5583, in predictCellTypesFromClusters
    for line in open(goelite_path,'rU').xreadlines():
IOError: [Errno 2] No such file or directory: '/path/to/test1a/NMF-SVM/SVMOutputs/GO-Elite/clustering/MarkerFinder-subsampled-ordered/GO-Elite_results/pruned-results_z-score_elite.txt'

Unable to export annotated groups file with predicted cell type names.
Parent directory not found locally for ['/DataPlots/', '/DataPlots/exp.56hpf-LTA-counts-ICGS-UMAP_scores.txt']

Kallisto-splice empty BAM files

Issue: When running Kallisto-splice prior to AltAnalyze version 2.1.4.1, the user can sometimes obtain empty BAM and junction.bed files when analyzing FASTQ files. This may not happen consistently on the same machine. The user may also obtain BAM files of different sizes, which can later result in BAM file indexing errors.

File "WikiPathways_webservice.pyc", line 64, in getPathwayAs NameError: global name 'client' is not defined

What steps will reproduce the problem?
1. Click AltAnalyze icon to start and choose "Begin Analysis"
2. Select database version - EnsMart65; Select vendor/data type - RNASeq; 
Select Species - Homo Sapiens; Select platform - RNA-seq aligned read counts
3. Analysis options - Process AltAnalyze filtered
4. Select directory
5. Analyze Exon or Junction Data - default parameters except for "Analyze 
ontologies and pathways with GO-Elite" - Run Immediately 

What do you see instead?

Exporting pathway images for 3 Wikipathways (expect 0-1 minute runtime)
...restricting to pdf
Error encountered with Multiple Processor Mode or server timeout... trying 
single processor mode
exported to the folder "WikiPathways"
3 Wikipathways failed to be exported (e.g., WP35 )
Traceback (most recent call last):
  File "GO_Elite.pyc", line 1949, in visualizePathways
  File "WikiPathways_webservice.pyc", line 249, in visualizePathwayAssociations
  File "WikiPathways_webservice.pyc", line 64, in getPathwayAs
NameError: global name 'client' is not defined

Wikipathways output in 1 seconds

What version of the product are you using? On what operating system?
AltAnalyze_v.2.0.8.1-Win64

Please provide any additional information below.

Seems like an issue with WikiPathways webservice access parameters.

Attached is the log file from the AltAnalyze Run.
Thanks, 
Tom

Original issue reported on code.google.com by [email protected] on 18 Feb 2014 at 9:43
