mortazavilab / pywgcna

PyWGCNA is a Python package designed to do Weighted Gene Correlation Network analysis (WGCNA)

Home Page: https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad415/7218311

License: MIT License

Python 0.94% Jupyter Notebook 98.99% HTML 0.08%

wgcna bioinformatics network-analysis weighted-gene-correlation-network

pywgcna's Introduction

PyWGCNA

PyWGCNA is a Python library designed to do weighted correlation network analysis (WGCNA). It can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Users can also compare WGCNA networks from different datasets, or to external gene lists, to assess the conservation or functional enrichment of each module.
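
As a rough illustration of what summarizing a module by its eigengene means: a module eigengene is commonly defined as the first principal component of the module's standardized expression matrix. The sketch below computes it with NumPy on synthetic data. The function name `module_eigengene` and the sign-alignment step are assumptions made for illustration and are not taken from PyWGCNA's source.

```python
import numpy as np


def module_eigengene(expr):
    """Return (eigengene, variance_explained) for one module.

    expr is a samples x genes array holding only the genes of that module.
    Hypothetical helper for illustration, not PyWGCNA's implementation.
    """
    # Standardize each gene (column) to zero mean and unit variance.
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    # The first left singular vector is the first principal component
    # across samples, up to sign and scale.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # Orient the eigengene so it correlates positively with mean expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    variance_explained = s[0] ** 2 / np.sum(s ** 2)
    return eigengene, variance_explained


# Synthetic module: four genes that follow one shared signal plus noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=12)
expr = np.column_stack(
    [w * signal + rng.normal(scale=0.1, size=12) for w in (1.0, 0.8, 1.2, 0.9)]
)
eigengene, variance_explained = module_eigengene(expr)
print(eigengene.shape, round(float(variance_explained), 3))
```

Because all four synthetic genes track the same signal, the eigengene should explain most of the variance; in real modules the fraction is usually lower.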

Documentation

PyWGCNA's full documentation can be found here

Installation

To install PyWGCNA, Python version 3.10 or greater is required.

Install from PyPI (recommended)

To install the most recent release, run

pip install PyWGCNA

Install with the most recent commits

Git clone the PyWGCNA repository, cd to the PyWGCNA directory, and run

pip install .
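
Since Python 3.10 or greater is required, it can help to check the interpreter before installing. A minimal sketch using only the standard library; the helper name `python_is_supported` is hypothetical:

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the installation notes


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version is new enough for PyWGCNA")
else:
    print("Python %d.%d is older than 3.10" % tuple(sys.version_info[:2]))
```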

Tutorials

Data input, cleaning and pre-processing: How to format, clean and preprocess your input data for PyWGCNA
Quick Start: How to load data into PyWGCNA, find modules, and analyze them
PyWGCNA object: How to interact with PyWGCNA objects, which parameters they store, and how to access them
Compare two PyWGCNA objects: How to compare two PyWGCNA objects
Compare more than two PyWGCNA objects: How to compare three PyWGCNA objects
Compare PyWGCNA objects to gene marker list: How to compare PyWGCNA objects to external gene lists (here shown on marker genes from single-cell data)
Functional enrichment analysis: How to perform functional enrichment analysis on a PyWGCNA object using databases such as GO, KEGG, and REACTOME
Visualize modules as network: How to visualize PyWGCNA objects as a network
Recover Protein-Protein Interaction: How to find and plot PPI using the STRING database
Module-trait relationships heatmap: How to calculate correlation between modules and traits
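
To make the last item concrete: a module-trait relationship is typically the correlation between each module eigengene and each numeric sample trait. The sketch below shows that calculation with pandas on made-up values. The function name `module_trait_correlation`, the module names, and the trait columns are all hypothetical, and the code does not call PyWGCNA itself.

```python
import pandas as pd


def module_trait_correlation(eigengenes, traits):
    """Pearson correlation of every module eigengene with every trait.

    Both inputs are DataFrames indexed by sample; traits must be numeric.
    """
    # Put the traits in the same sample order as the eigengenes.
    traits = traits.loc[eigengenes.index]
    combined = pd.concat([eigengenes, traits], axis=1)
    # Keep only the module-by-trait block of the full correlation matrix.
    return combined.corr(method="pearson").loc[eigengenes.columns, traits.columns]


samples = ["s1", "s2", "s3", "s4", "s5"]
eigengenes = pd.DataFrame(
    {"MEblue": [0.1, 0.2, 0.3, 0.4, 0.5], "MEred": [0.5, 0.1, 0.4, 0.2, 0.3]},
    index=samples,
)
traits = pd.DataFrame(
    {"age": [4, 8, 12, 16, 20], "is_female": [1, 0, 1, 0, 1]},
    index=samples,
)
corr = module_trait_correlation(eigengenes, traits)
print(corr)
```

Categorical traits need to be encoded as numbers (for example 0/1 indicators) before a correlation like this is meaningful.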

Cite

PyWGCNA is now online in Bioinformatics. Please cite our paper when using PyWGCNA:

Narges Rezaie and others, PyWGCNA: A Python package for weighted gene co-expression network analysis, Bioinformatics, 2023; https://doi.org/10.1093/bioinformatics/btad415


pywgcna's Issues

Error in pyWGCNA dependences

Hi!
I installed PyWGCNA, but loading it with "import PyWGCNA" fails with a dependency error involving the numpy package and its MachAr class. According to the numpy release notes, this class was deprecated in version 1.22 (https://numpy.org/devdocs/release/1.22.0-notes.html); the most current version is 1.24.

Would you have any solution?

Thanks.

Error message:

AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_11288/4062357097.py in
----> 1 import PyWGCNA

~\anaconda3\lib\site-packages\PyWGCNA\__init__.py in
----> 1 from PyWGCNA.wgcna import *
2 from PyWGCNA.utils import *
3 from PyWGCNA.geneExp import *
4 from PyWGCNA.comparison import *

~\anaconda3\lib\site-packages\PyWGCNA\wgcna.py in
11 from scipy.cluster.hierarchy import linkage, cut_tree, dendrogram, fcluster
12 from scipy.stats import t
---> 13 from statsmodels.formula.api import ols
14 import resource
15 from matplotlib import colors as mcolors

~\anaconda3\lib\site-packages\statsmodels\formula\api.py in
----> 1 import statsmodels.regression.linear_model as lm_
2 import statsmodels.discrete.discrete_model as dm_
3 import statsmodels.regression.mixed_linear_model as mlm_
4 import statsmodels.genmod.generalized_linear_model as glm_
5 import statsmodels.robust.robust_linear_model as roblm_

~\anaconda3\lib\site-packages\statsmodels\regression\__init__.py in
----> 1 from .linear_model import yule_walker
2
3 from statsmodels.tools._testing import PytestTester
4
5 __all__ = ['yule_walker', 'test']

~\anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in
44 from statsmodels.tools.decorators import (cache_readonly,
45 cache_writable)
---> 46 import statsmodels.base.model as base
47 import statsmodels.base.wrapper as wrap
48 from statsmodels.emplike.elregress import _ELRegOpts

~\anaconda3\lib\site-packages\statsmodels\base\model.py in
14 cached_value, cached_data)
15 import statsmodels.base.wrapper as wrap
---> 16 from statsmodels.tools.numdiff import approx_fprime
17 from statsmodels.tools.sm_exceptions import ValueWarning,
18 HessianInversionWarning

~\anaconda3\lib\site-packages\statsmodels\tools\numdiff.py in
49
50 # NOTE: we only do double precision internally so far
---> 51 EPS = np.MachAr().eps
52
53 _hessian_docs = """

~\anaconda3\lib\site-packages\numpy\__init__.py in __getattr__(attr)
282 return Tester
283
--> 284 raise AttributeError("module {!r} has no attribute "
285 "{!r}".format(__name__, attr))
286

AttributeError: module 'numpy' has no attribute 'MachAr'
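
Note on the failing line: the traceback ends at `EPS = np.MachAr().eps` inside statsmodels, not inside PyWGCNA. The documented NumPy way to get the same double-precision machine epsilon is `np.finfo`, as in the sketch below. Whether upgrading statsmodels or pinning an older numpy resolves this particular installation is an assumption that has not been verified here.

```python
import numpy as np

# Machine epsilon for double precision, without the deprecated np.MachAr class.
EPS = np.finfo(float).eps
print(EPS)
```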

Error: IndexError: positional indexers are out-of-bounds

When we ran the ".findModules" function, Python reported the error "positional indexers are out-of-bounds". Could you please tell us why this happened and what we should do to fix it?

Error in analyseWGCNA()---color

Hi
When I run pywgcna.analyseWGCNA(), I get this error:
-> 4439 colors = mcolors.to_rgba_array(c)
ValueError: 'c' argument must be a color, a sequence of colors, or a sequence of numbers, not array(['darkviolet', 'darkviolet', 'darkvio

However, my sample info file and color settings are the same as the files in the Quick Start tutorial.
How can I solve this problem?

Thank you so much!

barplotModuleEigenGene

Hi,
The following figure was generated by barplotModuleEigenGene. Could you explain the meaning of the bar heights, the grey points, the red points, and the red vertical lines?

I have also found a flaw in barplotModuleEigenGene: when there are only a few traits, the figure is not displayed properly. The title is very large and the bars disappear.

I hope you can fix it.

module number

Hi,
PyWGCNA now determines the number of modules automatically. How can I adjust it if I want more or fewer modules?

Input data format different than the tutorial

Hi,

In the Quick Start tutorial, loading geneExp = '5xFAD_paper/expressionList_sorted.csv' is shown to produce this pandas dataframe:

However, when I download the data and load it the same way, I get a transposed version of that dataframe:

Can you please help me understand what the actual format is? According to the Data input, cleaning and pre-processing doc, the dataframe I get on my side seems to be the correct one.

Which is the actual format that PyWGCNA works on?

Thanks.
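
Whichever orientation turns out to be expected, switching between the two is a single `DataFrame.T`. The sketch below checks which axis holds the sample IDs and transposes only when needed. The helper `orient_expression`, its `samples_as_rows` flag, and the toy IDs are hypothetical, and the default of samples as rows is an assumption, not a statement about what PyWGCNA requires.

```python
import pandas as pd


def orient_expression(df, sample_ids, samples_as_rows=True):
    """Return df with samples on the requested axis, transposing if needed."""
    sample_ids = set(sample_ids)
    rows_are_samples = sample_ids.issubset(df.index)
    cols_are_samples = sample_ids.issubset(df.columns)
    if rows_are_samples == cols_are_samples:
        # Either no axis or both axes contain the sample IDs.
        raise ValueError("Cannot tell which axis holds the samples")
    if rows_are_samples != samples_as_rows:
        df = df.T
    return df


# Toy table loaded as genes x samples.
genes_by_samples = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    index=["geneA", "geneB"],
    columns=["sample1", "sample2", "sample3"],
)
oriented = orient_expression(genes_by_samples, ["sample1", "sample2", "sample3"])
print(oriented.shape)
```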

AttributeError: 'WGCNA' object has no attribute 'top_n_hub_genes'

Hi!

First of all, thank you so much for this package! It has helped me a lot in my data analysis.

I have determined the different modules and now I would like to find the most important hub genes.

However when I run:
pyWGCNA_NBC.top_n_hub_genes(moduleName="lime", n=10)

I get error:
AttributeError: 'WGCNA' object has no attribute 'top_n_hub_genes'

I also looked into the source file and it seems like I cannot find the function.
Could you please help me with this?

moduleEigengenes function error, failure to use hubgenes

I received the error below during the findModules call; preprocessing had already been done. I have all settings on default, including hubGenes = True and trapErrors = False. I tried changing trapErrors to True, but this seemed to make things worse.

error:
Calculating 56 module eigengenes in given set...
..principal component calculation for module antiquewhite failed with the following error:
..hub genes will be used instead of principal components.

%tb:
MEList = WGCNA.moduleEigengenes(expr=self.datExpr.to_df(), colors=self.datExpr.var['dynamicColors'])

File ~\anaconda3\lib\site-packages\PyWGCNA\wgcna.py:2036 in moduleEigengenes
sys.exit("Error!")

I will attach the original data file.
transposed.csv

Many thanks in advance.

Setting TOMType to unsigned triggers an unexpected error

Thank you very much to the developers for being willing to develop PyWGCNA

Recently, I found that when TOMType is set to unsigned, the following error is triggered: 'TOMType' cannot be 'none' for this function.

This is probably because, when TOMType is set to unsigned, TOMTypeC is set to 0 (see here), which triggers this error.

In addition, it seems that TOMTypeC should not be assigned a value of 2; it is unclear whether that assignment here is intended.

Consensus Modules

Hello,

I really like your PyWGCNA! I'm interested in using it for some of my research.

I've been looking through what you currently have on GitHub and I can't seem to find a 'detect consensus modules' function. Specifically, I'm looking for the Python equivalent of blockwiseConsensusModules.

I'd like to repeat some of the analysis from "Eigengene networks for studying the relationships between co-expression modules" by Langfelder and Horvath.

Did I miss something? Is there a way for me to detect consensus modules with what you already have? If not, will you be implementing anything like that soon?

Thanks,
Nate

ValueError: Linkage 'Z' contains negative distances

Hi,

When I tested PyWGCNA with your 5xFAD sample data, I got the error "ValueError: Linkage 'Z' contains negative distances" in the findModules step. The same error occurred with my own data. Please give me some suggestions. Below is the log:

pyWGCNA_5xFAD.preprocess()
Pre-processing...
Detecting genes and samples with too many missing values...
Done pre-processing..

pyWGCNA_5xFAD.findModules()
Run WGCNA...
pickSoftThreshold: calculating connectivity for given powers...
will use block size 1876
Power SFT.R.sq slope ... mean(k) median(k) max(k)
0 1 0.368857 -0.481613 ... 2444.750756 2260.416614 5665.102661
1 2 0.7253 -0.99165 ... 840.665489 673.081241 3009.058821
2 3 0.791986 -1.194264 ... 385.685335 258.451265 1916.810605
3 4 0.835392 -1.3419 ... 207.404152 113.456087 1332.762771
4 5 0.853842 -1.472183 ... 123.232581 54.784481 984.036824
5 6 0.870673 -1.553348 ... 78.455923 28.47124 752.959999
6 7 0.886736 -1.600869 ... 52.572016 15.594822 591.514192
7 8 0.896672 -1.639343 ... 36.65884 9.454046 475.817182
8 9 0.903531 -1.677747 ... 26.397061 6.024431 389.237531
9 10 0.906045 -1.706474 ... 19.521431 3.975959 322.823838
10 11 0.905582 -1.731076 ... 14.767291 2.623921 270.867416
11 13 0.914482 -1.751347 ... 8.941254 1.205108 196.222414
12 15 0.912684 -1.771227 ... 5.759987 0.568044 146.575349
13 17 0.912188 -1.774908 ... 3.905403 0.273242 112.189052
14 19 0.907649 -1.774186 ... 2.766824 0.135454 87.594344

[15 rows x 7 columns]
Selected power to have scale free network is 9.
calculating adjacency matrix ...
Done..

calculating TOM similarity matrix ...
Done..

Traceback (most recent call last):
File "", line 1, in
File "/home/abc/miniconda3/envs/eqtl/lib/python3.8/site-packages/PyWGCNA/wgcna.py", line 293, in findModules
dynamicMods = WGCNA.cutreeHybrid(dendro=self.geneTree, distM=dissTOM, deepSplit=2, pamRespectsDendro=False,
File "/home/xukuipeng/miniconda3/envs/eqtl/lib/python3.8/site-packages/PyWGCNA/wgcna.py", line 1265, in cutreeHybrid
dendro_height = WGCNA.get_heights(dendro)
File "/home/abc/miniconda3/envs/eqtl/lib/python3.8/site-packages/PyWGCNA/wgcna.py", line 1189, in get_heights
clusternode = to_tree(Z, True)
File "/home/abc/miniconda3/envs/eqtl/lib/python3.8/site-packages/scipy/cluster/hierarchy.py", line 1461, in to_tree
is_valid_linkage(Z, throw=True, name='Z')
File "/home/abc/miniconda3/envs/eqtl/lib/python3.8/site-packages/scipy/cluster/hierarchy.py", line 2284, in is_valid_linkage
raise ValueError('Linkage %scontains negative distances.' %
ValueError: Linkage 'Z' contains negative distances.
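
One possible cause, offered here as an unverified assumption: if the dissimilarity is computed as 1 - TOM, floating-point round-off can leave entries that are very slightly below zero, which hierarchical clustering then rejects. The sketch below clips such values before clustering. The helper name `sanitize_dissimilarity` and the tolerance are hypothetical, and this is not taken from PyWGCNA's code.

```python
import numpy as np


def sanitize_dissimilarity(diss, tol=1e-8):
    """Clip tiny negative round-off in a square dissimilarity matrix."""
    diss = np.asarray(diss, dtype=float).copy()
    if diss.min() < -tol:
        # Clearly negative values point to a real problem, not round-off.
        raise ValueError("Dissimilarity has entries below -%g" % tol)
    diss = np.clip(diss, 0.0, None)   # remove tiny negatives
    diss = (diss + diss.T) / 2.0      # enforce exact symmetry
    np.fill_diagonal(diss, 0.0)       # self-distance is zero
    return diss


# A similarity entry that overshoots 1 by round-off gives a negative distance.
tom = np.array(
    [
        [1.0, 0.3, 1.0 + 1e-12],
        [0.3, 1.0, 0.5],
        [1.0 + 1e-12, 0.5, 1.0],
    ]
)
diss = sanitize_dissimilarity(1.0 - tom)
print(diss.min() >= 0.0)
```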

Error with findModules()

Hello, I am trying to begin using WGCNA for a project of ours.
When I attempt to load the modules for the PyWGCNA object I receive an error when eigengenes are being calculated.

import PyWGCNA, os
import numpy as np
import pandas as pd

PATH_1 = "datasets/Heart_Left_Ventricle.v8.normalized_expression.bed"
PATH_2 = "datasets/gene_tpm_heart_left_ventricle.gct"
PATH_3 = "datasets/expressionList.csv"
# Assuming `fulldata` is your DataFrame and it has been loaded correctly
fulldata = pd.read_csv(PATH_1, sep='\t', header=0)

fulldata_2 = pd.read_csv(PATH_2, sep='\t', header=0)

# fulldata_3 = pd.read_csv(PATH_3, header=0)


# Transpose the dataframe and select columns
datExpr0 = fulldata.iloc[:, 4:390].T  # Python uses 0-based indexing
datExpr1 = fulldata_2.iloc[:, 3:389].T  # Python uses 0-based indexing
# Set column names to gene_id
datExpr0.columns = fulldata['gene_id'].values
datExpr1.columns = fulldata_2['Name'].values


# Set row names (index)
datExpr0.index = fulldata.columns[4:390]
print(datExpr0.index.str.replace("-", ""))
datExpr1.index = fulldata_2.columns[3:389]

datExpr0.index=datExpr0.index.str.replace("-", "")
datExpr1.index=datExpr0.index.str.replace("-", "")
# View the dataframe
print("#1", datExpr0)
print("#2", datExpr1)
# print("#3", fulldata_3)
gsg = PyWGCNA.WGCNA.goodSamplesGenes(datExpr0)

allOK = gsg[-1]
print(allOK)
if not allOK:
    # Optionally, print the gene and sample names that were removed
    if sum(not gsg[1]) > 0:
        print("Removing genes:", ", ".join(list(datExpr0.columns[~gsg[1]])))
    if sum(not gsg[0]) > 0:
        print("Removing samples:", ", ".join(list(datExpr0.index[~gsg[0]])))

    # Remove the offending genes and samples from the data
    datExpr0 = datExpr0.loc[gsg[0], gsg[1]]
print("PICKING THRESHOLD")

DSet = PyWGCNA.WGCNA(name='GTEX-111FC',
                     species='Humans',
                     geneExp=datExpr0,
                     outputPath=os.path.realpath(''),
                     TOMType='unsigned',
                     powers=[10],
                     # sft=sfT,
                     # cut=2,
                     networkType="unsigned",
                     save=True)

DSet2 = PyWGCNA.WGCNA(name='GTEX',
                      species='Humans',
                      geneExp=datExpr1,
                      outputPath=os.path.realpath(''),
                      TOMType='unsigned',
                      powers=[10],
                      # sft=sfT,
                      # cut=2,
                      networkType="unsigned",
                      save=True)





print("PREPROCESSING")
DSet.preprocess()
# DSet2.preprocess()
print("FINDING MODULES")
DSet.findModules()

This is the dataframe I pass to the PyWGCNA object; I have to modify it to be in the correct format.

I am running this script on Python 3.7

setuptools >=66 causing error on pip install; package naming convention ('0.23ubuntu1') must be changed

I received the following error (below) when installing, both via PyPI and via the GitHub clone method.

I believe it may be related to this noted error, which regards PEP440 enforcement in newer versions of setuptools (>=66):
https://bugs.launchpad.net/ubuntu/+source/distro-info/+bug/1991606

Some discussion on StackOverflow:
https://stackoverflow.com/questions/75272737/error-invalid-version-0-23ubuntu1-package-distro-info

SETUP
Ubuntu: 20.04
Python: 3.8.10
setuptools: 67.4.0 (latest at this time; released Feb 21, 2023)

ERROR: Exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/_internal/cli/base_command.py", line 186, in _main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/_internal/commands/install.py", line 357, in run
    resolver.resolve(requirement_set)
  File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 177, in resolve
    discovered_reqs.extend(self._resolve_one(requirement_set, req))
  File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 333, in _resolve_one
    abstract_dist = self._get_abstract_dist_for(req_to_install)
  File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 293, in _get_abstract_dist_for
    req.check_if_exists(self.use_user_site)
  File "/usr/lib/python3/dist-packages/pip/_internal/req/req_install.py", line 443, in check_if_exists
    self.satisfied_by = pkg_resources.get_distribution(str(no_marker))
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 514, in get_distribution
    dist = get_provider(dist)
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 386, in get_provider
    return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 956, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 815, in resolve
    dist = self._resolve_dist(
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 844, in _resolve_dist
    env = Environment(self.entries)
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1044, in __init__
    self.scan(search_path)
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1077, in scan
    self.add(dist)
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1096, in add
    dists.sort(key=operator.attrgetter('hashcmp'), reverse=True)
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2640, in hashcmp
    self.parsed_version,
  File "/home/username/.local/lib/python3.8/site-packages/pkg_resources/__init__.py", line 2694, in parsed_version
    raise packaging.version.InvalidVersion(f"{str(ex)} {info}") from None
pkg_resources.extern.packaging.version.InvalidVersion: Invalid version: '0.23ubuntu1' (package: distro-info)

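To see why '0.23ubuntu1' is rejected, the sketch below tests version strings against the canonical public-version pattern published in PEP 440 (Appendix B). It uses only the standard library. The helper name `is_canonical_pep440` is hypothetical, and the canonical pattern is stricter than the normalizing parser in the packaging library, so treat it as an illustration rather than the exact check setuptools performs.

```python
import re

# Canonical public-version pattern from PEP 440, Appendix B.
_CANONICAL = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical_pep440(version):
    """Return True if version is a canonical PEP 440 public version."""
    return _CANONICAL.match(version) is not None


for candidate in ("0.23ubuntu1", "0.23", "67.4.0"):
    print(candidate, is_canonical_pep440(candidate))
```

The offending version belongs to the system package distro-info, not to PyWGCNA, which matches the linked Ubuntu bug report.
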
Questions about one error in the analysis process

Hi, when I run this code:

pyWGCNA_5xFAD.findModules()

I get this error:

2553 k = Y.shape[0]
2554 if k == 0:
-> 2555 raise ValueError("The number of observations cannot be determined on "
2556 "an empty distance matrix.")
2557 d = int(np.ceil(np.sqrt(k * 2)))
2558 if (d * (d - 1) / 2) != k:

ValueError: The number of observations cannot be determined on an empty distance matrix.

I can use IPA to find my module, so I wonder if there are any problems and how to solve them. Thanks a lot.

index out of bounds error

Hi Again,

I am running into an error in module processing as pasted below. I loaded data from an Anndata object and ran it:

wgObj=wg.WGCNA(name='caudate_15',species='human',anndata=adata,save=True,outputPath=wdir)
wgObj.preprocess()
wgObj.saveWGCNA()
wgObj.findModules()

adata
AnnData object with n_obs × n_vars = 15 × 62703
obs: 'Sample_id', 'Age', 'Sex', 'PrimaryDx', 'Best_RIN_PFC'

Log:

Saving data to be True, checking requirements ...
Figure directory does not exist!
Creating figure directory!
Pre-processing...
Detecting genes and samples with too many missing values...
MoTTY X11 proxy: Authorisation not recognised

In case you are trying to start a graphical application with "sudo", read this article in order to avoid this issue:
https://blog.mobatek.net/post/how-to-keep-X11-display-after-su-or-sudo/

    Done pre-processing..

Saving WGCNA as caudate_15.p
Run WGCNA...
pickSoftThreshold: calculating connectivity for given powers...
will use block size 938
Power SFT.R.sq slope truncated R.sq mean(k) median(k) max(k)
0 1 0.669481 1.05534 0.957384 31051.154672 34456.969227 38340.107901
1 2 0.497196 0.576056 0.965458 23742.416236 27164.831475 33212.334557
2 3 0.291156 0.315183 0.971497 19212.214492 22133.603972 29652.331155
3 4 0.114866 0.152961 0.977983 16080.835202 18386.478558 26944.114085
4 5 0.011362 0.041235 0.984779 13772.355995 15495.97514 24771.758022
5 6 0.016052 -0.04471 0.983994 11995.33809 13195.557949 22982.569073
6 7 0.112313 -0.113847 0.966303 10583.894625 11324.825903 21471.944854
7 8 0.249469 -0.171249 0.936059 9435.706678 9791.003668 20168.378655
8 9 0.390201 -0.222952 0.908155 8483.8443 8510.554845 19027.425415
9 10 0.504946 -0.266045 0.882287 7682.50703 7434.62295 18020.018942
10 11 0.593963 -0.304442 0.857825 6999.209965 6521.724537 17120.836504
11 13 0.71338 -0.371937 0.833508 5897.81059 5079.215524 15577.442657
12 15 0.772407 -0.426479 0.812742 5051.430162 4006.483937 14294.715799
13 17 0.798184 -0.476529 0.794877 4383.223243 3192.709391 13207.285919
14 19 0.808266 -0.5211 0.777081 3844.252587 2563.927356 12270.888872
No power detected to have scale free network!
Found the best given power which is 19.
calculating adjacency matrix ...
Done..

calculating TOM similarity matrix ...
Done..

Going through the merge tree...
..cutHeight not given, setting it to 0.995 ===> 99% of the (truncated) height range in dendro.
Traceback (most recent call last):
File "/users/kkathuri/nscripts/python/runWGCNA.v2.py", line 34, in
wgObj.saveWGCNA()
File "/users/kkathuri/.local/lib/python3.7/site-packages/PyWGCNA/wgcna.py", line 276, in findModules
minClusterSize=self.minModuleSize)
File "/users/kkathuri/.local/lib/python3.7/site-packages/PyWGCNA/wgcna.py", line 1377, in cutreeHybrid
onBranch[int(gene)] = clust
IndexError: index 47659 is out of bounds for axis 0 with size 47659

Any help appreciated! Thanks!

Kunal
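
Note on the power selection in the log above: the output suggests a two-step rule, namely take the lowest power whose scale-free fit (SFT.R.sq) reaches a cutoff, and fall back to the best-fitting power when none does. A minimal sketch of that rule, using the SFT.R.sq values from this log. The helper `pick_power` is hypothetical and the cutoff of 0.9 is an assumption that happens to be consistent with the logs quoted on this page; it is not confirmed from the source.

```python
def pick_power(powers, r_squared, r2_cut=0.9):
    """Return (power, reached_cutoff) for a scale-free fit table."""
    # Lowest power whose fit reaches the cutoff.
    for power, r2 in zip(powers, r_squared):
        if r2 >= r2_cut:
            return power, True
    # Otherwise fall back to the power with the best fit.
    best_power = max(zip(r_squared, powers))[1]
    return best_power, False


powers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19]
r_squared = [
    0.669481, 0.497196, 0.291156, 0.114866, 0.011362,
    0.016052, 0.112313, 0.249469, 0.390201, 0.504946,
    0.593963, 0.71338, 0.772407, 0.798184, 0.808266,
]
power, reached = pick_power(powers, r_squared)
print(power, reached)
```

With only 15 samples the fit never reaches the cutoff here, so the fallback branch is the one that applies.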

"unsigned" option not working

Hi,

I've set the networkType="unsigned" and TOMType="unsigned", and use obj.findModules(), but after calculating adjacency matrix ... Done ...

I get the following error:

"An exception has occurred, use %tb to see the full traceback.

SystemExit: 'TOMType' cannot be 'none' for this function."

Maybe TOMType="unsigned" is not necessary as long as the networkType is already unsigned? Please let me know. Overall great package and thanks in advance!

Best,
Hamilton

'WGCNA' object has no attribute 'top_n_hub_genes'

import PyWGCNA
pyWGCNA_5xFAD = PyWGCNA.readWGCNA("5xFAD.p")
pyWGCNA_5xFAD.top_n_hub_genes(moduleName="coral", n=10)

AttributeError: 'WGCNA' object has no attribute 'top_n_hub_genes'

Import issue on Windows with ModuleNotFoundError: No module named 'resource'

Hi,

I am trying to run the code on my Windows computer and I frequently run into this Windows-specific error:
running import PyWGCNA results in ModuleNotFoundError: No module named 'resource'. I have tried different solutions on Windows, but none has worked so far.

I suggest expanding the initial tutorial to include the imports; examples of creating anndata objects would also be very useful. Thank you.
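
For context: `resource` is a Unix-only module in the Python standard library, so an unconditional `import resource` fails on Windows. A common pattern is to guard the import, as sketched below. How PyWGCNA actually uses the module is not known here, so the helper `peak_memory_raw` is purely hypothetical.

```python
try:
    import resource  # available on Unix-like systems only
except ImportError:
    resource = None  # e.g. on Windows


def peak_memory_raw():
    """Return peak resident memory as reported by the OS, or None.

    Units differ by platform (kilobytes on Linux, bytes on macOS).
    """
    if resource is None:
        return None
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


print(peak_memory_raw())
```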

Error in .findModules()

After executing the preprocess() function without any problem, in the next step in .findModules(), specifically during the step "Going through the merge tree..." I get an index error:
IndexError: index 67 is out of bounds for axis 0 with size 67
What could be the error?
Thanks in advance!

single cell data?

Hi,

I was trying to run PyWGCNA on single cell data (pseudobulk, in the suggested format). I was wondering if there are any downsides to this and if I should try hdWGCNA instead. In particular, our goal is to compare genes in 2 phenotype groups (by trying to run PyWGCNA on each group). Thank you!

Kunal

Error in `preprocess()` when loading from anndata

Hi, I keep getting an error in preprocess() after loading from anndata. Any advice is much appreciated!

Works with dataframe itself:

---
anndata     0.8.0
scanpy      1.9.1
---
PIL                 9.2.0
PyQt5               NA
PyWGCNA             NA
asciitree           NA
asttokens           NA
backcall            0.2.0
beta_ufunc          NA
binom_ufunc         NA
biomart             NA
brotli              NA
certifi             2022.06.15
cffi                1.15.1
charset_normalizer  2.1.0
cloudpickle         2.0.0
colorama            0.4.5
cycler              0.10.0
cython_runtime      NA
dask                2021.11.0
dask_image          0.6.0
dateutil            2.8.2
decorator           5.1.1
defusedxml          0.7.1
docrep              0.3.2
entrypoints         0.4
executing           0.8.3
fasteners           NA
fsspec              2021.11.0
gseapy              0.14.0
h5py                3.7.0
hypergeom_ufunc     NA
idna                3.3
igraph              0.9.8
imagecodecs         2022.8.8
imageio             2.10.3
ipykernel           5.5.5
ipython_genutils    0.2.0
ipywidgets          7.7.1
itables             1.1.2
jedi                0.18.1
jinja2              3.1.2
joblib              1.1.0
jsonpickle          2.2.0
jupyter_server      1.18.0
kiwisolver          1.4.4
leidenalg           0.8.8
llvmlite            0.39.0
louvain             0.7.1
markupsafe          2.1.1
matplotlib          3.5.3
matplotlib_inline   NA
matplotlib_scalebar 0.8.1
mpl_toolkits        NA
natsort             8.1.0
nbinom_ufunc        NA
ncf_ufunc           NA
networkx            2.8.6
numba               0.56.0
numcodecs           0.9.1
numexpr             2.7.3
numpy               1.22.4
packaging           21.3
pandas              1.4.3
parso               0.8.3
patsy               0.5.2
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
prompt_toolkit      3.0.30
ptyprocess          0.7.0
pure_eval           0.2.2
pycparser           2.21
pygments            2.12.0
pyparsing           3.0.9
pytz                2022.2.1
pyvis               0.3.0
pywt                1.1.1
requests            2.28.1
scipy               1.9.0
seaborn             0.11.2
session_info        1.0.0
setuptools          65.2.0
setuptools_scm      NA
sip                 NA
six                 1.16.0
skimage             0.19.3
sklearn             1.1.2
socks               1.7.1
squidpy             1.2.2
stack_data          0.3.0
statsmodels         0.13.2
texttable           1.6.4
threadpoolctl       3.1.0
tifffile            2021.11.2
tlz                 0.11.2
toolz               0.11.2
tornado             6.1
traitlets           5.3.0
typing_extensions   NA
urllib3             1.26.9
validators          0.20.0
wcwidth             0.2.5
xarray              0.20.1
yaml                6.0
zarr                2.11.0a2
zipp                NA
zmq                 19.0.2

IPython             8.4.0
jupyter_client      7.0.6
jupyter_core        4.10.0
jupyterlab          3.4.3
notebook            6.4.12

Python 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0]
Linux-5.15.0-48-generic-x86_64-with-glibc2.17

Session information updated at 2022-10-11 19:01

pyWGCNA.findModules()—— sys.exit("Error!")

Hi!

I have been using PyWGCNA for a while. Thank you a lot, it has been very useful, and I had not encountered any errors before.

Recently I got a new computer, so I had to re-install everything (I previously had version 1.16.8 of PyWGCNA).
I saw that this issue has appeared before and also tried versions 1.2.4 and 1.2.3 (as you suggested trying the newest versions), but still got the same error. I also tried 1.16.8, which used to work, and the error was still there. My colleague encountered the same error, even with the dataset provided in the tutorial.

I checked the requirements and re-installed PyWGCNA several times.

Currently, I am trying to run .findModules() and this is the error I encountered:

Selected power to have scale free network is 9.
calculating adjacency matrix ...
Done..

calculating TOM similarity matrix ...
Done..

Going through the merge tree...
..cutHeight not given, setting it to 0.994 ===> 99% of the (truncated) height range in dendro.
Done..

Calculating 45 module eigengenes in given set...
..principal component calculation for module antiquewhite failed with the following error:
..hub genes will be used instead of principal components.
An exception has occurred, use %tb to see the full traceback.

SystemExit: Error!

I also ran it on a dataset that did not produce the error before I re-installed PyWGCNA, so I do not think the dataset is the problem. I have attached it just in case (a final transposed version):

https://fromsmash.com/rnaseqdata

I am really running out of ideas of what could be the problem so help would be really appreciated.

pyWGCNA.findModules()—— sys.exit("Error!")

Hi
Thanks for providing PyWGCNA.
I am running it on my own data. In the pyWGCNA.findModules() step, I got this error:
Calculating 29 module eigengenes in given set...
..principal component calculation for module black failed with the following error:
..hub genes will be used instead of principal components.
An exception has occurred, use %tb to see the full traceback.

SystemExit: Error!

SystemExit Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 pyWGCNA.findModules()

File ~\AppData\Roaming\Python\Python39\site-packages\PyWGCNA\wgcna.py:300, in WGCNA.findModules(self, **kwargs)
297 self.datExpr.var['dynamicColors'] = WGCNA.labels2colors(labels=dynamicMods)
299 # Calculate eigengenes
--> 300 MEList = WGCNA.moduleEigengenes(expr=self.datExpr.to_df(), colors=self.datExpr.var['dynamicColors'])
302 self.MEs = MEList['eigengenes']
303 if 'MEgrey' in self.MEs.columns:

File ~\AppData\Roaming\Python\Python39\site-packages\PyWGCNA\wgcna.py:2036, in WGCNA.moduleEigengenes(expr, colors, impute, nPC, align, excludeGrey, grey, subHubs, softPower, scaleVar, trapErrors)
2034 if not check:
2035 if not trapErrors:
-> 2036 sys.exit("Error!")
2037 print(" ..ME calculation of module", modulename, "failed with the following error:", flush=True)
2038 print(" ", pc, " ..the offending module has been removed.", flush=True)

SystemExit: Error!

The run log is as follows:
Run WGCNA...
pickSoftThreshold: calculating connectivity for given powers...
will use block size 3104
Power SFT.R.sq slope truncated R.sq mean(k) median(k) max(k)
0 1 0.019005 0.439419 0.944844 1583.917996 1554.807821 2757.869438
1 2 0.06366 -0.584601 0.929494 670.43699 652.433442 1459.46165
2 3 0.309456 -1.103302 0.942454 354.841864 335.688239 910.290623
3 4 0.463625 -1.258173 0.934568 214.286872 195.668517 627.926617
4 5 0.551524 -1.377 0.910081 141.322811 123.364176 477.597429
5 6 0.665052 -1.332488 0.935139 99.308738 82.414462 375.88095
6 7 0.808912 -1.275045 0.99298 73.223957 57.040702 308.184438
7 8 0.861059 -1.346268 0.995675 56.071326 40.420707 269.243033
8 9 0.899288 -1.39177 0.995843 44.268053 29.410026 238.667545
9 10 0.915354 -1.439295 0.99548 35.840468 21.702019 217.831659
10 11 0.930287 -1.462413 0.99669 29.635947 16.249537 200.730955
11 13 0.94553 -1.491679 0.996215 21.327587 9.572117 174.80621
12 15 0.95332 -1.479453 0.996157 16.193779 5.931626 156.010401
13 17 0.954481 -1.464538 0.993863 12.811008 3.774937 141.336406
14 19 0.953151 -1.434312 0.988609 10.466524 2.496161 129.549861
Selected power to have scale free network is 10.
calculating adjacency matrix ...
Done..

calculating TOM similarity matrix ...
Done..

Going through the merge tree...
..cutHeight not given, setting it to 0.994 ===> 99% of the (truncated) height range in dendro.
Done..

Calculating 29 module eigengenes in given set...
..principal component calculation for module black failed with the following error:
..hub genes will be used instead of principal components.
An exception has occurred, use %tb to see the full traceback.

SystemExit: Error!

thanks

Error appearance in PyWGCNA (version 1.2.4) in Google colaboratory

I ran PyWGCNA in the Google Colaboratory environment.

①　pyWGCNA_CC.findModules()
→UnboundLocalError: local variable 'merge' referenced before assignment

②　pyWGCNA_CC.analyseWGCNA()
→ValueError: 'box_aspect' and 'fig_aspect' must be positive

Details are below.
PyWGCNA problem

We welcome your valuable input.
Thank you in advance.

zero variance

Hi,
when I ran pyWGCNA.preprocess(), an error occurred:

Pre-processing...
Detecting genes and samples with too many missing values...

..Excluding 9 genes from the calculation due to too many missing samples or zero variance.

ValueError: operands could not be broadcast together with shapes (3407,) (3398,)

Could you check the code to see whether this was caused by the zero-variance genes?
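
The two shapes in the error differ by exactly the 9 excluded genes (3407 - 3398 = 9), which suggests a filter mask was computed on one array and applied to another that had already been filtered. A minimal sketch of keeping the two in step, assuming a samples-by-genes matrix; the function name `drop_zero_variance` and the toy data are illustrative and not part of PyWGCNA:

```python
import numpy as np

def drop_zero_variance(expr, gene_names):
    """Remove zero-variance genes from a samples x genes matrix.

    The boolean mask is computed once and applied to both the matrix
    columns and the gene names, so their lengths cannot drift apart.
    """
    expr = np.asarray(expr, dtype=float)
    gene_names = np.asarray(gene_names)
    keep = np.nanvar(expr, axis=0) > 0  # one entry per gene (column)
    return expr[:, keep], gene_names[keep]

# Toy data: the middle gene is constant across samples.
expr = np.array([[1.0, 5.0, 2.0],
                 [2.0, 5.0, 4.0],
                 [3.0, 5.0, 9.0]])
genes = ["g1", "g2", "g3"]

filtered, kept = drop_zero_variance(expr, genes)
print(filtered.shape, list(kept))
```

Whether this is the actual cause inside preprocess() is not confirmed by the report above.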

outputPath does not work

When creating a WGCNA object I cannot set the outputPath argument and always receive an error, i.e.

data = PyWGCNA.WGCNA(anndata=counts, save=True, outputPath='./figures')
Traceback (most recent call last):
  File "pywgcna_test.py", line 21, in <module>
    minModuleSize=30,
  File "/PHShome/sr1068/.conda/envs/pywgcna/lib/python3.7/site-packages/PyWGCNA/wgcna.py", line 173, in __init__
    if not os.path.exists(self.outputPath + '/figures/'):
AttributeError: 'WGCNA' object has no attribute 'outputPath'

I looked at the code and in the WGCNA class the variable is never instantiated, adding the line

self.outputPath = outputPath

to the WGCNA class in wgcna.py fixed this issue and allows me to set outputPath.
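
The pattern behind this fix can be sketched as follows: the constructor has to store the argument on the instance before any later line reads it. The class `OutputConfig` below is a hypothetical stand-in written for illustration, not the real WGCNA constructor:

```python
import os
import tempfile

class OutputConfig:
    """Hypothetical stand-in for the part of a constructor that handles output paths."""

    def __init__(self, outputPath="", save=False):
        # Store the argument first; reading self.outputPath before this
        # assignment is what raises the AttributeError.
        self.outputPath = outputPath
        self.save = save
        self.figuresPath = None
        if self.save:
            self.figuresPath = os.path.join(self.outputPath, "figures")
            os.makedirs(self.figuresPath, exist_ok=True)

# Use a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    cfg = OutputConfig(outputPath=tmp, save=True)
    created = os.path.isdir(cfg.figuresPath)

print(created)
```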

x and y ticks

Hi,
Currently the x and y ticks are not displayed in the plot. Could you add them so that we can read the x and y coordinates clearly?

high resolution PyWGCNA_overview.png

Hi,
Some letters in PyWGCNA_overview.png are too small to read clearly. Would you mind providing a higher-resolution figure?

I can't follow the tutorial

I am following the tutorial, but at this point:
pyWGCNA_5xFAD = PyWGCNA.WGCNA(name='5xFAD', species='mouse',
                              geneExpPath=geneExp,
                              save=True)
I get this error:
ValueError: no types given
As I am using the data provided for the tutorial, I do not know what might be wrong.
Thanks in advance.

PyWGCNA.getGeneList not working

Hi!

I really like this package and am trying to use it for my analysis.

I was going through the tutorial, but PyWGCNA.getGeneList fails with an error: HTTPError: 504 Server Error: Gateway Time-out for url: http://uswest.ensembl.org/biomart/martservice

It seems this is because the website used in server = biomart.BiomartServer('http://uswest.ensembl.org/biomart') cannot be opened.

For my own analysis I will use my own gene annotation, but I would still like to be able to go through this tutorial. Is there another way to get the gene list?

Thank you,

Module Colors

Hi,

Thanks for your fix for the last bug!

I think I found another one in version 1.0.0

When I run findModules, I get the error that "Gray" isn't in colorSeq.

I think you can fix this by just removing line 1822 in wgcna.py that says:
colorSeq.remove("Gray")

Thanks,
Nate
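
For context, `list.remove` raises a ValueError when the value is absent, which matches the error described. A minimal sketch of a removal that tolerates a missing entry; the helper name `remove_if_present` and the colour list are illustrative assumptions, not the contents of colorSeq in wgcna.py:

```python
def remove_if_present(items, value):
    """Return a copy of items without value; do nothing if value is absent."""
    return [item for item in items if item != value]

# Illustrative colour sequence that does not contain "Gray".
color_seq = ["turquoise", "blue", "brown", "grey"]

try:
    list(color_seq).remove("Gray")  # the strict form fails here
    strict_failed = False
except ValueError:
    strict_failed = True

cleaned = remove_if_present(color_seq, "Gray")
print(strict_failed, cleaned)
```

Simply deleting the line, as suggested above, has the same effect when the entry is never present.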

Input data for WGCNA

Hi! I am working with your library to perform WGCNA, but I continuously get an error and cannot figure out what I did wrong.
I have data on protein RPM in various phago and non-phago samples. The table is as follows:

	Fam129a	Susd3
p1	0.7	0.3
n1	0.2	0.1

As indicated in the quick guide, as a first step I use pyWGCNA_5xFAD = PyWGCNA.WGCNA(name='genes', species='human', geneExpPath=geneExp, save=True) (where geneExp is the path to the table above). With this, pyWGCNA_5xFAD.preprocess() and pyWGCNA_5xFAD.findModules() find the modules without any problems.

After this I try to use pyWGCNA_5xFAD.analyseWGCNA(), but after printing the heatmap for each module on "sample_id" it gives me the error: figure size must be positive finite not (26, 0). I would also like to know how to change the heatmap so that it is not drawn only on sample_id, because I need it on the various genes.
The full error output is attached.

Thank you very much for your help!

cannot preprocess / manipulate Anndata object

Hello,

I have managed to import my count table, annotation and metadata and combine them into an AnnData object, named "Nc_coexp".
To help verify the structure of the object, here is the output of the head command:

Nc_coexp.geneExpr.to_df().head(5)
Out[54]:
NCU00003-t26_1 NCU00004-t26_1 ... NCU17284-t26_1 NCU17285-t26_1
sample_id ...
ERR6120389 7.05213 20.6476 ... 0.574325 3.15034
ERR6120390 7.12926 19.5792 ... 0.741205 6.68712
ERR6120391 9.01246 33.8402 ... 0.826182 1.60666
ERR6120392 22.02220 57.4943 ... 3.428080 9.33067
ERR6120393 23.10760 50.7808 ... 2.335950 10.58590

[5 rows x 10813 columns]

As I understand it, Nc_coexp.geneExpr only looks at the count table, which is also what I want to target for preprocessing. However, this does not work:

Nc_coexp.geneExpr.preprocess()
Traceback (most recent call last):

Cell In[55], line 1
Nc_coexp.geneExpr.preprocess()

AttributeError: 'AnnData' object has no attribute 'preprocess'

Any idea what I am doing wrong here? It seems that some of the pyWGCNA functions are correctly implemented (like PyWGCNA.geneExp.GeneExp(), which allowed me to create the AnnData object in the first place). preprocess, on the other hand, does not appear among the listed options when I type "Nc_coexp." and wait for IDE suggestions.

analyseWGCNA() improvement

Hi,
Currently the analysis and plotting steps are integrated in analyseWGCNA(). Would it be better to separate these two processes? analyseWGCNA() could then focus on analyzing the data, while visualization is carried out by other functions (such as plotModuleEigenGene and barplotModuleEigenGene). At the moment the plotting in analyseWGCNA() is time-consuming, and errors are sometimes triggered by the large figure size. Although we can set show=False, the plots are still generated internally.
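
The proposed split can be sketched as two independent steps: a pure computation that returns results, and a rendering step that only runs when it is called. Everything below (the `analyse` and `plot` functions, the toy eigengenes and trait) is a hypothetical illustration of the design, not PyWGCNA's API:

```python
def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def analyse(eigengenes, trait):
    """Computation only: map each module to its correlation with the trait."""
    return {module: pearson(values, trait) for module, values in eigengenes.items()}

def plot(results, draw=print):
    """Rendering only: no work happens unless this function is called."""
    for module, r in sorted(results.items()):
        draw(f"{module}: {r:+.2f}")

# Toy module eigengenes and a toy numeric trait.
eigengenes = {"blue": [1, 2, 3], "brown": [3, 2, 1]}
trait = [10, 20, 30]

results = analyse(eigengenes, trait)  # fast, no figures created
lines = []
plot(results, draw=lines.append)      # optional, run only when wanted
```

With this layout, skipping the plots costs nothing, because no figure code runs during the analysis step.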

Integer Gene Labels with geneExp.py

Suppose we have integer gene labels, eg. Entrez Identifiers.

On line 63 of geneExp.py you say
index=expressionList.values[:, 0]

If the first column of expression list (the gene labels) contains all entries of type int, this forces it to be the same type as the rest of the entries in expressionList. Namely, a float or double. This leads to an error in line 69 where you write

self.geneExpr = ad.AnnData(X=expressionList, obs=sampleInfo, var=geneInfo)

A fix for this is to write np.array(expressionList.iloc[:,0]) instead of expressionList.values[:, 0] on line 63.
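
The coercion described here can be reproduced on a small frame: `DataFrame.values` returns a single array with one common dtype, so integer labels next to float columns are upcast, while selecting the column with `iloc` keeps its own dtype. The frame below is a toy example with illustrative integer IDs, not the package's real input:

```python
import numpy as np
import pandas as pd

# Toy expression table: integer labels in the first column, floats elsewhere.
expressionList = pd.DataFrame({
    "gene_id": [7157, 672],   # illustrative integer identifiers
    "sample1": [0.5, 1.5],
    "sample2": [2.0, 3.0],
})

# .values builds one 2-D array for the whole frame, so everything becomes float64.
coerced = expressionList.values[:, 0]

# Selecting the column first keeps the integer dtype.
preserved = np.array(expressionList.iloc[:, 0])

print(coerced.dtype, preserved.dtype)
```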

pyWGCNA maintenance

Hi,
There is a lack of WGCNA implementations in the Python ecosystem, so we're very glad to see the emergence of pyWGCNA! However, we still have some concerns.
1. I have read the paper "Systematic phenotyping and characterization of the 5xFAD mouse model of Alzheimer’s disease", cited in the pyWGCNA preprint. The preprint says "We applied PyWGCNA to 192 bulk RNA-seq samples of cortex and hippocampus of the 5xFAD mouse model and matching C57BL6/J mice at 4 ages (4, 8, 12, and 18 months) in both sexes", but I didn't find pyWGCNA (only WGCNA) in the original 5xFAD paper. So was pyWGCNA or WGCNA used in the paper, and what is the relationship between them?
2. There is a related Python package named iterative WGCNA (https://github.com/cstoeckert/iterativeWGCNA), which has been on bioRxiv for many years without being peer reviewed. I suspect it is no longer supported. Do you know the features of iterative WGCNA, and do you plan to support pyWGCNA for a long time? We have been hoping for a Python version of WGCNA and want to learn how to use pyWGCNA. Migrating to pyWGCNA will take us a great deal of effort, so we hope it will be supported and stable.

TypeError in analyseWGCNA

Hi nargesr,

When I run pyWGCNA_5xFAD.analyseWGCNA(geneList=geneList), I got a TypeError: enrichr() got an unexpected keyword argument 'description'. How should I resolve this?

Thank you

AttributeError when running

I get this error when I run .findModules().
I have already installed all the requirements.

EDIT: The test works OK, but it doesn't work with data that I downloaded.
EDIT: I have already solved it.

Going through the merge tree...
..cutHeight not given, setting it to 0.082  ===>  99% of the (truncated) height range in dendro.
cutHeight set too low: no merges below the cut.
Traceback (most recent call last):
  File "/Users/fmelis/Documents/Doctorado/Tesis/phd-thesis/__main__.py", line 21, in <module>
    main()
  File "/Users/fmelis/Documents/Doctorado/Tesis/phd-thesis/__main__.py", line 17, in main
    test(dataset)
  File "/Users/fmelis/Documents/Doctorado/Tesis/phd-thesis/core/wgcna.py", line 9, in test
    pyWGCNA_5xFAD.findModules()
  File "/Users/fmelis/opt/anaconda3/envs/wgcna/lib/python3.10/site-packages/PyWGCNA/wgcna.py", line 297, in findModules
    self.datExpr.var['dynamicColors'] = WGCNA.labels2colors(labels=dynamicMods)
  File "/Users/fmelis/opt/anaconda3/envs/wgcna/lib/python3.10/site-packages/PyWGCNA/wgcna.py", line 1860, in labels2colors
    if all(isinstance(x, int) for x in labels.Value):
  File "/Users/fmelis/opt/anaconda3/envs/wgcna/lib/python3.10/site-packages/pandas/core/generic.py", line 5575, in __getattr__
    return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'Value'. Did you mean: 'values'?

import PyWGCNA

AttributeError: 'NoneType' object has no attribute 'render'

This line of code:

pyWGCNA_m1_10x.CoexpressionModulePlot(modules=modules, numGenes=100, numConnections=1000, minTOM=0, file_name="all")

giving me following error.
AttributeError: 'NoneType' object has no attribute 'render'

What am I doing wrong?

TypeError: enrichr() got an unexpected keyword argument 'description'

Hi !
Thank you for developing PyWGCNA !
Running the command pyWGCNA_5xFAD.analyseWGCNA(geneList=geneList) in the Quick start notebook, I get the following error.

TypeError: enrichr() got an unexpected keyword argument 'description'

So I checked the source code in PyWGCNA-main/PyWGCNA/wgcna.py and found that this argument is passed to gp.enrichr() on line 3019.

However, reading the official GSEApy web documentation, I could not find description='' as an optional argument to gp.enrichr(). In my environment, gseapy==0.14.0 is installed from PyPI.

Do I need to install another version of gseapy to solve this problem? Or do I just modify PyWGCNA.py? Which is better? Also, if I modify PyWGCNA.py, how should I install it?

Lastly... it's much easier to use than the R version of WGCNA, and I'm impressed!

Sincerely,
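
One general way to cope with a keyword that exists in one library version but not another is to pass only the arguments the installed function actually declares. The sketch below uses `inspect.signature` for that; `fake_enrichr` is a stand-in defined here for illustration and does not reproduce the real gseapy signature:

```python
import inspect

def call_with_supported_kwargs(func, **kwargs):
    """Call func with only the keyword arguments its signature accepts."""
    params = inspect.signature(func).parameters
    takes_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if takes_var_kwargs:
        return func(**kwargs)  # the function accepts arbitrary keywords
    supported = {k: v for k, v in kwargs.items() if k in params}
    return func(**supported)

def fake_enrichr(gene_list, gene_sets, outdir=None):
    """Stand-in for a library call that has no 'description' parameter."""
    return {"n_genes": len(gene_list), "gene_sets": gene_sets, "outdir": outdir}

result = call_with_supported_kwargs(
    fake_enrichr,
    gene_list=["A", "B"],
    gene_sets="GO",
    description="",   # silently dropped because fake_enrichr does not declare it
    outdir="out",
)
print(result)
```

Pinning the dependency to a version that still accepts the argument is the other common route; which gseapy versions those are is not stated in this thread.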

Error with function analyseWGCNA()

Hi, I am trying to find the correlation of the modules of my WGCNA object with the traits in my metadata, but I get the following error. Does anyone know how to solve it? Thank you.

pyWGCNA_TCGA_PAAD.analyseWGCNA()

Analysing WGCNA...
Calculating module trait relationship ...
Done..

Adding (signed) eigengene-based connectivity (module membership) ...
Done..

plotting module heatmap eigengene...
Done..

plotting module barplot eigengene...

ValueError Traceback (most recent call last)
Cell In[10], line 1
----> 1 pyWGCNA_TCGA_PAAD.analyseWGCNA()

File ~/miniconda3/envs/machine_learning/lib/python3.11/site-packages/PyWGCNA/wgcna.py:447, in WGCNA.analyseWGCNA(self, order, geneList, show)
445 print(f"{OKCYAN}plotting module barplot eigengene...{ENDC}")
446 for module in modules:
--> 447 self.barplotModuleEigenGene(module, metadata, colorBar=metadata[-1], show=True)
448 print("\tDone..\n")
450 if self.save:

File ~/miniconda3/envs/machine_learning/lib/python3.11/site-packages/PyWGCNA/wgcna.py:2946, in WGCNA.barplotModuleEigenGene(self, moduleName, metadata, combine, colorBar, show)
2944 df['all'] = df['all'].apply(lambda x: x[1:])
2945 cat = pd.DataFrame(pd.unique(df['all']), columns=['all'])
-> 2946 cat[metadata] = cat['all'].str.split('_', expand=True)
2947 ybar = df[['all', 'eigengeneExp']].groupby(['all']).mean()['eigengeneExp']
2948 ebar = df[['all', 'eigengeneExp']].groupby(['all']).std()['eigengeneExp']

File ~/miniconda3/envs/machine_learning/lib/python3.11/site-packages/pandas/core/frame.py:3968, in DataFrame.setitem(self, key, value)
3966 self._setitem_frame(key, value)
3967 elif isinstance(key, (Series, np.ndarray, list, Index)):
-> 3968 self._setitem_array(key, value)
3969 elif isinstance(value, DataFrame):
3970 self._set_item_frame_value(key, value)
...
402 else:
403 # Missing keys in columns are represented as -1
404 if len(columns.get_indexer_non_unique(key)[0]) != len(value.columns):

ValueError: Columns must be same length as key
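
The failing line splits a combined label on '_' and assigns the pieces to the metadata columns, so the number of pieces must equal the number of metadata columns. One plausible cause, not confirmed in this report, is a metadata value that itself contains an underscore. A minimal sketch of that mismatch and a check for it; the column names, values, and the helper `find_conflicting_values` are made up for illustration:

```python
def find_conflicting_values(rows, sep="_"):
    """Return metadata values that already contain the separator."""
    return sorted({str(v) for row in rows for v in row if sep in str(v)})

# Two metadata columns, but the second one contains underscores.
metadata = ["genotype", "age"]
rows = [("5xFAD", "4_mon"), ("WT", "8_mon")]

# Joining and splitting on the same character no longer round-trips.
joined = ["_".join(row) for row in rows]
n_parts = [len(label.split("_")) for label in joined]

conflicts = find_conflicting_values(rows)
print(n_parts, len(metadata), conflicts)
```

If such values are found, renaming them (for example replacing '_' with '-') before running the analysis would avoid the mismatch.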

principal component calculation for module black failed with the following error:

Hi,

I closed #35 by mistake. I switched to a different compute node, and the 'ValueError: Linkage 'Z' contains negative distances' no longer occurs. But I got another error when following the 'Quick start':

import PyWGCNA
geneExp = '5xFAD_paper/expressionList.csv'
pyWGCNA_5xFAD = PyWGCNA.WGCNA(name='5xFAD',species='mus musculus',geneExpPath=geneExp,outputPath='',save=True)
Saving data to be True, checking requirements ...
Figure directory does not exist!
Creating figure directory!
pyWGCNA_5xFAD.preprocess()
Pre-processing...
Detecting genes and samples with too many missing values...
Done pre-processing..

pyWGCNA_5xFAD.findModules()
Run WGCNA...
pickSoftThreshold: calculating connectivity for given powers...
will use block size 1876
Power SFT.R.sq slope ... mean(k) median(k) max(k)
0 1 0.368857 -0.481613 ... 2444.750756 2260.416614 5665.102661
1 2 0.7253 -0.99165 ... 840.665489 673.081241 3009.058821
2 3 0.791986 -1.194264 ... 385.685335 258.451265 1916.810605
3 4 0.835392 -1.3419 ... 207.404152 113.456087 1332.762771
4 5 0.853842 -1.472183 ... 123.232581 54.784481 984.036824
5 6 0.870673 -1.553348 ... 78.455923 28.47124 752.959999
6 7 0.886736 -1.600869 ... 52.572016 15.594822 591.514192
7 8 0.896672 -1.639343 ... 36.65884 9.454046 475.817182
8 9 0.903531 -1.677747 ... 26.397061 6.024431 389.237531
9 10 0.906045 -1.706474 ... 19.521431 3.975959 322.823838
10 11 0.905582 -1.731076 ... 14.767291 2.623921 270.867416
11 13 0.914482 -1.751347 ... 8.941254 1.205108 196.222414
12 15 0.912684 -1.771227 ... 5.759987 0.568044 146.575349
13 17 0.912188 -1.774908 ... 3.905403 0.273242 112.189052
14 19 0.907649 -1.774186 ... 2.766824 0.135454 87.594344

[15 rows x 7 columns]
Selected power to have scale free network is 9.
calculating adjacency matrix ...
Done..

calculating TOM similarity matrix ...
Done..

Going through the merge tree...
..cutHeight not given, setting it to 0.996 ===> 99% of the (truncated) height range in dendro.
Done..

Calculating 22 module eigengenes in given set...
..principal component calculation for module black failed with the following error:
..hub genes will be used instead of principal components.
Error!

KeyError: "gene_id" in running " Comparing two PyWGCNA objects"

Hi!

I successfully obtained the modules, enrichr results, and networks from each of my two datasets.
Next, I tried to compare these two PyWGCNA objects and got a KeyError regarding "gene_id".
The two geneExpr objects have the same ENSG IDs, but the sample_id values are completely different.

Any advice would be appreciated !!

local variable 'merge' referenced before assignment

when running pyWGCNA_5xFAD.runWGCNA()
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
Cell In [8], line 1
----> 1 pyWGCNA_5xFAD.runWGCNA()

File /opt/software/python/lib/python3.8/site-packages/PyWGCNA/wgcna.py:353, in WGCNA.runWGCNA(self)
348 """
349 Preprocess and find modules
350 """
351 WGCNA.preprocess(self)
--> 353 WGCNA.findModules(self)
355 return self

File /opt/software/python/lib/python3.8/site-packages/PyWGCNA/wgcna.py:337, in WGCNA.findModules(self, **kwargs)
334 self.datExpr.var['moduleLabels'] = [colorOrder.index(x) if x in colorOrder else None for x in
335 self.datExpr.var['moduleColors']]
336 # Eigengenes of the new merged modules:
--> 337 self.MEs = merge['newMEs']
339 # Recalculate MEs with color labels
340 self.datME = WGCNA.moduleEigengenes(self.datExpr.to_df(), self.datExpr.var['moduleColors'])['eigengenes']

UnboundLocalError: local variable 'merge' referenced before assignment
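
This kind of UnboundLocalError typically appears when a variable is assigned only inside a branch that did not run and is then read afterwards. A minimal sketch of the pattern and a guard against it; the function `merge_modules` and its inputs are hypothetical and do not reproduce the logic in wgcna.py:

```python
def merge_modules(heights, cut_height):
    """Count merges below a cut, with a defined result when there are none."""
    merge = None  # initialised up front so the name always exists
    below = [h for h in heights if h < cut_height]
    if below:
        merge = {"n_merged": len(below)}
    if merge is None:
        # Without the initialisation above, reading 'merge' here would raise
        # UnboundLocalError whenever no height falls below the cut.
        return {"n_merged": 0, "note": "no merges below the cut"}
    return merge

nothing_merged = merge_modules([0.5, 0.9], cut_height=0.082)
one_merged = merge_modules([0.05, 0.9], cut_height=0.082)
print(nothing_merged, one_merged)
```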

getModulenames

Hi, there is a typo in this function:
def getModuleName(self):
    """
    get names of modules
    :return: name of modules
    :rtype: ndarray
    """
    return np.unique(self.datExpr.obs['moduleColors']).tolist()  # obs should be "var"
Besides, this typo exists in the return statements of the other getModule functions as well, not just this one.

server error in getting gene list

Hello!

I encountered the following error when running

geneList = PyWGCNA.getGeneList(dataset='mmusculus_gene_ensembl',
attributes=['ensembl_gene_id',
'external_gene_name',
'gene_biotype'])

Traceback (most recent call last):
File "/users/kkathuri/nscripts/python/runWGCNA.v2.py", line 55, in
'gene_biotype'])
File "/users/kkathuri/.local/lib/python3.7/site-packages/PyWGCNA/utils.py", line 97, in getGeneList
server = biomart.BiomartServer('http://uswest.ensembl.org/biomart')
File "/users/kkathuri/.local/lib/python3.7/site-packages/biomart/server.py", line 27, in init
self.assert_alive()
File "/users/kkathuri/.local/lib/python3.7/site-packages/biomart/server.py", line 48, in assert_alive
self.get_request()
File "/users/kkathuri/.local/lib/python3.7/site-packages/biomart/server.py", line 104, in get_request
r.raise_for_status()
File "/jhpce/shared/jhpce/core/python/3.7.3/lib/python3.7/site-packages/requests/models.py", line 943, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Temporarily Unavailable for url: http://uswest.ensembl.org/biomart/martservice

Should I just keep trying or is there a different work-around?

Thank you!

Best,

Kunal
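
A 503 or 504 from a remote service is often temporary, so retrying with increasing pauses is a common first step. A minimal sketch of such a retry helper; the names `retry` and `flaky_fetch` are illustrative, and the flaky function only simulates a server that recovers on the third call:

```python
import time

def retry(func, attempts=4, base_delay=1.0, sleep=time.sleep, retry_on=(Exception,)):
    """Call func until it succeeds, doubling the pause after each failure."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)

# Simulated service that fails twice and then answers.
calls = {"n": 0}

def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("503 Service Temporarily Unavailable")
    return ["ensembl_gene_id", "external_gene_name", "gene_biotype"]

delays = []  # record the pauses instead of actually sleeping
attributes = retry(flaky_fetch, sleep=delays.append)
print(calls["n"], delays)
```

In practice the wrapped call would be something like `lambda: PyWGCNA.getGeneList(...)` with `retry_on=(requests.exceptions.HTTPError,)`. If the server stays unavailable, retrying will not help and a local gene annotation file is the fallback.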

Could PyWGCNA draw Hierarchical cluster tree?

Hi
Thanks for your work.
I have a question: can PyWGCNA draw a hierarchical cluster tree, i.e. a cluster tree of genes with annotated modules?
I did not find this in the tutorials.

Thanks!

help documentation of WGCNA

Hi,
I read the help documentation of WGCNA using help(PyWGCNA.WGCNA) and found that it lists many redundant arguments. Does the help documentation need to be updated?

Issues running test.py

Error log: I am trying to run the test.py script in the tutorial and am getting the following error.

Saving data to be True, checking requirements ...
Pre-processing...
Detecting genes and samples with too many missing values...
Traceback (most recent call last):
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3653, in _dendrogram_calculate_info
_dendrogram_calculate_info(
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3653, in _dendrogram_calculate_info
_dendrogram_calculate_info(
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3653, in _dendrogram_calculate_info
_dendrogram_calculate_info(
[Previous line repeated 992 more times]
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3620, in _dendrogram_calculate_info
_dendrogram_calculate_info(
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3550, in _dendrogram_calculate_info
_append_singleton_leaf_node(Z, p, n, level, lvs, ivl,
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3422, in _append_singleton_leaf_node
ivl.append(labels[int(i - n)])
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 5038, in getitem
key = com.cast_scalar_indexer(key, warn_float=True)
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/pandas/core/common.py", line 175, in cast_scalar_indexer
if lib.is_float(val) and val.is_integer():
RecursionError: maximum recursion depth exceeded while calling a Python object
(bioinfo) yashsondhi@Yashs-MBP tutorials % python test.py
Saving data to be True, checking requirements ...
Pre-processing...
Detecting genes and samples with too many missing values...
Traceback (most recent call last):
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3653, in _dendrogram_calculate_info
_dendrogram_calculate_info(
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3653, in _dendrogram_calculate_info
_dendrogram_calculate_info(
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3653, in _dendrogram_calculate_info
_dendrogram_calculate_info(
[Previous line repeated 992 more times]
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3620, in _dendrogram_calculate_info
_dendrogram_calculate_info(
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3550, in _dendrogram_calculate_info
_append_singleton_leaf_node(Z, p, n, level, lvs, ivl,
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 3422, in _append_singleton_leaf_node
ivl.append(labels[int(i - n)])
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 5038, in getitem
key = com.cast_scalar_indexer(key, warn_float=True)
File "/Users/yashsondhi/opt/anaconda3/envs/bioinfo/lib/python3.9/site-packages/pandas/core/common.py", line 175, in cast_scalar_indexer
if lib.is_float(val) and val.is_integer():
RecursionError: maximum recursion depth exceeded while calling a Python object
(bioinfo) yashsondhi@Yashs-MBP tutorials % conda deactivate


