
desc's Introduction

Deep Embedding for Single-cell Clustering (DESC)

DESC is an unsupervised deep learning algorithm for clustering scRNA-seq data. The algorithm constructs a non-linear mapping from the original scRNA-seq data space to a low-dimensional feature space by iteratively learning cluster-specific gene expression representations and cluster assignments with a deep neural network. This iterative procedure moves each cell toward its nearest cluster, balances biological and technical differences between clusters, and reduces the influence of batch effects. DESC also enables soft clustering by assigning cluster-specific probabilities to each cell, which facilitates the identification of cells clustered with high confidence and the interpretation of results.
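In the paper's notation, each iteration minimizes the Kullback-Leibler divergence between the soft cluster assignment distribution $Q$ and an auxiliary target distribution $P$; restated briefly here for orientation:

$$L = \mathrm{KL}(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}$$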

[Figure: DESC workflow]

For thorough details, see our paper: https://www.nature.com/articles/s41467-020-15851-3

Usage

The desc package is an implementation of deep embedding for single-cell clustering. With desc, you can:

  • Preprocess single-cell gene expression data from various formats.
  • Build a low-dimensional representation of the single-cell gene expression data.
  • Obtain soft-clustering assignments of cells.
  • Visualize the cell clustering results and the gene expression patterns.
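
A minimal end-to-end sketch of this workflow, assembled from the function names used in the tutorial and examples below (parameter values are illustrative, the input file is hypothetical, and the exact API differs slightly between desc versions):

import desc
import scanpy as sc

adata = sc.read_h5ad("data.h5ad")  # hypothetical input file

# Preprocess: normalize, log-transform, select highly variable genes, scale
desc.normalize_per_cell(adata, counts_per_cell_after=1e4)
desc.log1p(adata)
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5, subset=True)
desc.scale(adata, zero_center=True, max_value=3)

# Learn the low-dimensional embedding and soft cluster assignments
adata = desc.train(adata, dims=[adata.shape[1], 64, 32], tol=0.005,
                   louvain_resolution=[0.8], save_dir="desc_result",
                   do_tsne=True, do_umap=True)

# Inspect the results: hard labels, soft probabilities, embedding, 2-D plots
print(adata.obs["desc_0.8"].value_counts())  # cluster labels
prob = adata.uns["prob_matrix0.8"]           # per-cell soft assignments
emb = adata.obsm["X_Embeded_z0.8"]           # learned representation
sc.pl.scatter(adata, basis="umap0.8", color="desc_0.8")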

Because of the differences between TensorFlow 1.x and TensorFlow 2.x, we maintain two versions of the desc algorithm so that it is compatible with TensorFlow 1.x and TensorFlow 2.x, respectively.

  1. For TensorFlow 1.x, we released desc 2.0.3; see our Jupyter notebook example desc_2.0.3_paul.ipynb.
  2. For TensorFlow 2.x, we released desc 2.1.1; see our Jupyter notebook example desc_2.1.1_paul.ipynb.

Installation

To install the desc package, make sure that your Python version is either 3.5.x or 3.6.x. If you don't know which version of Python you have, you can check it with:

>>> import platform
>>> platform.python_version()
'3.5.3'
>>> import tensorflow as tf
>>> tf.__version__
'1.7.0'

Note: because desc depends on TensorFlow, make sure your TensorFlow version is lower than 2.0 if you want to reproduce the results in our paper. You can install the current release of desc in any of the following three ways.

  • PyPI
    Directly install the package from PyPI.
$ pip install desc

Note: make sure your pip is for Python 3; otherwise, install desc with

python3 -m pip install desc 
#or
pip3 install desc

If you do not have write permission (i.e., you get a permission-denied error), install desc into your user site-packages:

$ pip install --user desc
  • Github
    Download the package from Github and install it locally:
git clone https://github.com/eleozzr/desc
cd desc
pip install .
  • Anaconda

If you do not have Python 3.5 or Python 3.6 installed, consider installing Anaconda (see Installing Anaconda). After installing Anaconda, you can create a new environment, for example DESC (you can use any name you like):

conda create -n DESC python=3.5.3
# activate your environment 
source activate DESC 
git clone https://github.com/eleozzr/desc
cd desc
python setup.py build
python setup.py install
# now you can check whether `desc` installed successfully!
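# e.g. a quick import check (no error means the installation succeeded)
python -c "import desc"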

Please check the desc Tutorial for more details. We also provide a simple example reproducing the results for the Paul et al. data in our paper.


Contributing

Source code: Github

We are continually adding new features. Bug reports and feature requests are welcome.


References

Please consider citing the following reference:

  • Li, X., Wang, K., Lyu, Y. et al. Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis. Nat. Commun. 11, 2338 (2020). https://doi.org/10.1038/s41467-020-15851-3

desc's People

Contributors

eleozzr, Yafei611


desc's Issues

OSError: Unable to open file (unable to open file: name = '\data\paul15\paul15.h5ad', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

Hello, author. I encountered the following problem. I found suggestions that it could be solved by modifying the path, but after the modification I still get the same error. Thank you very much!
Traceback (most recent call last):
  File "D:/desc-master/desc/original/train.py", line 317, in <module>
    adata = sc.read("\data\paul15\paul15.h5ad")
  File "E:\anaconda\envs\DESC\lib\site-packages\scanpy\readwrite.py", line 122, in read
    **kwargs,
  File "E:\anaconda\envs\DESC\lib\site-packages\scanpy\readwrite.py", line 675, in _read
    return read_h5ad(filename, backed=backed)
  File "E:\anaconda\envs\DESC\lib\site-packages\anndata\_io\h5ad.py", line 408, in read_h5ad
    with h5py.File(filename, "r") as f:
  File "E:\anaconda\envs\DESC\lib\site-packages\h5py\_hl\files.py", line 427, in __init__
    swmr=swmr)
  File "E:\anaconda\envs\DESC\lib\site-packages\h5py\_hl\files.py", line 190, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 96, in h5py.h5f.open

OSError: Unable to open file (unable to open file: name = '\data\paul15\paul15.h5ad', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

@eleozzr
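
A hedged note for anyone hitting this error: in a plain Python string, backslashes begin escape sequences, and a leading \data is drive-root-relative on Windows, so the path may not point where intended. Raw strings or forward slashes avoid both pitfalls (the paths below are illustrative):

import os
import scanpy as sc

path = r"D:\desc-master\data\paul15\paul15.h5ad"   # raw string; illustrative location
# path = "D:/desc-master/data/paul15/paul15.h5ad"  # forward slashes also work on Windows
print(os.path.exists(path))                        # verify the file actually exists first
adata = sc.read(path)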

Auxiliary distribution P in the paper

Hi Xiangjie,

May I ask why you choose the specific auxiliary distribution P in the KL-divergence in the DESC paper?
(It makes sense that small q's are shrunk more by the squaring, but I didn't get the idea of constructing p_{ij} from q_{ij}; P and Q are usually different distributions in a KL divergence.)
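
For reference, in DEC-style methods, which DESC builds on, $Q$ is the Student's t soft assignment of cells $z_i$ to cluster centers $\mu_j$, and $P$ is derived from $Q$ itself by squaring and renormalizing, so the model self-trains on its own sharpened predictions (written here with $\alpha = 1$; see the paper for DESC's exact choice):

$$q_{ij} = \frac{(1 + \lVert z_i - \mu_j \rVert^2)^{-1}}{\sum_{j'} (1 + \lVert z_i - \mu_{j'} \rVert^2)^{-1}}, \qquad p_{ij} = \frac{q_{ij}^2 / f_j}{\sum_{j'} q_{ij'}^2 / f_{j'}}, \qquad f_j = \sum_i q_{ij}$$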

Questions about running time of DESC

Hi, I am trying to reimplement the computation of the q values in PyTorch, but if I use for loops the running time gets very long. Could you give me any suggestions for this part? Thanks a lot.
[screenshot of the loop-based implementation]
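
A sketch of one way to vectorize the q computation in PyTorch, replacing the per-cell loop with a single pairwise-distance call (alpha, names, and shapes are assumptions matching the DEC-style formulation):

import torch

def soft_assign(z: torch.Tensor, mu: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Student's t soft assignments q, computed without Python loops.

    z:  (n_cells, latent_dim) embeddings
    mu: (n_clusters, latent_dim) cluster centers
    """
    dist2 = torch.cdist(z, mu) ** 2                    # (n_cells, n_clusters) squared distances
    q = (1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)              # normalize so rows sum to 1

# Example: 1000 cells, 10-d latent space, 7 clusters
q = soft_assign(torch.randn(1000, 10), torch.randn(7, 10))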

License

Hello, Thank you for the tool. Just wondering what the license is for it. Many thanks, Carmen

Unusual UMAP plot

Thanks for your amazing work. I've been using DESC for a while now, and no matter what dataset I use I always get these strange-looking UMAP plots. t-SNE looks good, though, and the clustering result is great.

[UMAP screenshot]

python: 3.6.13
tensorflow: 2.6.1
scanpy: 1.7.2

Bipolar Cells

Many thanks for the interesting algorithm!

As you mentioned in Supplementary Note 2, only bipolar cells in the macaque retina datasets were analyzed.
However, I failed to find the cell annotation file that indicates which of the macaque retina cells are bipolar cells. Could you please provide a link to this cell annotation file?

Best Regards

Identification of rare celltypes

Hi,

I really like the idea behind DESC, great work!

We have a single-cell dataset, which we harmonized using harmony. As a test I wanted to check whether we can identify similar or the same cell types when processing the data with DESC.

However, with the settings I use the algorithm does not seem to identify rare cell types. In my previous clustering with harmony & scanpy I obtained 13 clusters, ranging from 35% to 0.4% of total cell abundance. The markers of the rare clusters are in line with known/validated markers, so I have high confidence that the rare clusters are real. I see this behaviour for cell types with an abundance of <1%. As some cell types are very similar, I added more hidden layers to improve the model performance (my dataset has 80k cells in total). Increasing the batch size did not help.

desc.train(adata, dims=[adata.shape[1], 64, 64, 32, 32], tol=0.005, n_neighbors=20,
           batch_size=256, louvain_resolution=[0.5],
           save_dir="test1", do_tsne=False, learning_rate=300,
           do_umap=True, num_Cores_tsne=4,
           save_encoder_weights=True, pretrain_epochs=100)

Happy to hear your thoughts on this!

float() argument must be a string or a number, not 'SparseCSRView'

Ran desc test without any issues. Then put together my AnnData object following your tutorial:

View of AnnData object with n_obs × n_vars = 98465 × 3567
    obs: 'barcode'
    var: 'gene_ids', 'gene_symbols', 'highly_variable', 'means', 'dispersions', 'dispersions_norm'

Then I ran DESC.train according to the tutorial:

adata = DESC.train(adata, dims=[adata.shape[1], 32, 16], tol=0.005, n_neighbors=10,
                   batch_size=256, louvain_resolution=[0.8],
                   save_dir="result_desc_combined_sct", do_tsne=True, learning_rate=300,
                   do_umap=True, num_Cores_tsne=35,
                   save_encoder_weights=True)

And got the following error:


TypeError                                 Traceback (most recent call last)
TypeError: float() argument must be a string or a number, not 'SparseCSRView'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<ipython-input-17-f81efb2c99ea> in <module>
      3                    save_dir="result_desc_combined_sct", do_tsne=True, learning_rate=300,
      4                    do_umap=True, num_Cores_tsne=35,
----> 5                    save_encoder_weights=True)

~/anaconda3/envs/DESC/lib/python3.7/site-packages/desc-2.0.2-py3.7.egg/desc/models/desc.py in train(data, dims, alpha, tol, init, louvain_resolution, n_neighbors, pretrain_epochs, batch_size, activation, actincenter, drop_rate_SAE, is_stacked, use_earlyStop, use_ae_weights, save_encoder_weights, save_encoder_step, save_dir, max_iter, epochs_fit, num_Cores, num_Cores_tsne, use_GPU, GPU_id, random_seed, verbose, do_tsne, learning_rate, perplexity, do_umap, kernel_clustering)
    335             perplexity=perplexity,
    336             do_umap=do_umap,
--> 337             kernel_clustering=kernel_clustering)
    338         #update adata
    339         data=res

~/anaconda3/envs/DESC/lib/python3.7/site-packages/desc-2.0.2-py3.7.egg/desc/models/desc.py in train_single(data, dims, alpha, tol, init, louvain_resolution, n_neighbors, pretrain_epochs, batch_size, activation, actincenter, drop_rate_SAE, is_stacked, use_earlyStop, use_ae_weights, save_encoder_weights, save_encoder_step, save_dir, max_iter, epochs_fit, num_Cores, num_Cores_tsne, use_GPU, GPU_id, random_seed, verbose, do_tsne, learning_rate, perplexity, do_umap, kernel_clustering)
    151               save_encoder_step=save_encoder_step,
    152               save_dir=save_dir,
--> 153               kernel_clustering=kernel_clustering
    154     )
    155     desc.compile(optimizer=SGD(0.01,0.9),loss='kld')

~/anaconda3/envs/DESC/lib/python3.7/site-packages/desc-2.0.2-py3.7.egg/desc/models/network.py in __init__(self, dims, x, alpha, tol, init, louvain_resolution, n_neighbors, pretrain_epochs, epochs_fit, batch_size, random_seed, activation, actincenter, drop_rate_SAE, is_stacked, use_earlyStop, use_ae_weights, save_encoder_weights, save_encoder_step, save_dir, kernel_clustering)
    171         tf.set_random_seed(random_seed) if tf.__version__ < "2.0" else tf.random.set_seed(random_seed)
    172         #pretrain autoencoder
--> 173         self.pretrain()
    174 
    175 

~/anaconda3/envs/DESC/lib/python3.7/site-packages/desc-2.0.2-py3.7.egg/desc/models/network.py in pretrain(self)
    192                 print("The file ae_weights.h5 is not exits")
    193                 if self.is_stacked:
--> 194                     sae.fit(self.x,epochs=self.pretrain_epochs)
    195                 else:
    196                     sae.fit2(self.x,epochs=self.pretrain_epochs)

~/anaconda3/envs/DESC/lib/python3.7/site-packages/desc-2.0.2-py3.7.egg/desc/models/SAE.py in fit(self, x, epochs, decaying_step)
    188 
    189     def fit(self, x, epochs=300,decaying_step=3): # use stacked autoencoder pretrain and fine tuning
--> 190         self.pretrain_stacks(x, epochs=int(epochs/2),decaying_step=decaying_step)
    191         self.pretrain_autoencoders(x, epochs=epochs)
    192 

~/anaconda3/envs/DESC/lib/python3.7/site-packages/desc-2.0.2-py3.7.egg/desc/models/SAE.py in pretrain_stacks(self, x, epochs, decaying_step)
    155                 if self.use_earlyStop is True:
    156                     callbacks=[EarlyStopping(monitor='loss',min_delta=1e-4,patience=10,verbose=1,mode='auto')]
--> 157                     self.stacks[i].fit(features,features,callbacks=callbacks,batch_size=self.batch_size,epochs=math.ceil(epochs/decaying_step))
    158                 else:
    159                     self.stacks[i].fit(x=features,y=features,batch_size=self.batch_size,epochs=math.ceil(epochs/decaying_step))

~/anaconda3/envs/DESC/lib/python3.7/site-packages/Keras-2.1.0-py3.7.egg/keras/models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
    958                               initial_epoch=initial_epoch,
    959                               steps_per_epoch=steps_per_epoch,
--> 960                               validation_steps=validation_steps)
    961 
    962     def evaluate(self, x, y, batch_size=32, verbose=1,

~/anaconda3/envs/DESC/lib/python3.7/site-packages/Keras-2.1.0-py3.7.egg/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
   1646                               initial_epoch=initial_epoch,
   1647                               steps_per_epoch=steps_per_epoch,
-> 1648                               validation_steps=validation_steps)
   1649 
   1650     def evaluate(self, x=None, y=None,

~/anaconda3/envs/DESC/lib/python3.7/site-packages/Keras-2.1.0-py3.7.egg/keras/engine/training.py in _fit_loop(self, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch, steps_per_epoch, validation_steps)
   1211                     batch_logs['size'] = len(batch_ids)
   1212                     callbacks.on_batch_begin(batch_index, batch_logs)
-> 1213                     outs = f(ins_batch)
   1214                     if not isinstance(outs, list):
   1215                         outs = [outs]

~/anaconda3/envs/DESC/lib/python3.7/site-packages/Keras-2.1.0-py3.7.egg/keras/backend/tensorflow_backend.py in __call__(self, inputs)
   2350         session = get_session()
   2351         updated = session.run(fetches=fetches, feed_dict=feed_dict,
-> 2352                               **self.session_kwargs)
   2353         return updated[:len(self.outputs)]
   2354 

~/anaconda3/envs/DESC/lib/python3.7/site-packages/tensorflow-1.15.3-py3.7-linux-x86_64.egg/tensorflow_core/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    954     try:
    955       result = self._run(None, fetches, feed_dict, options_ptr,
--> 956                          run_metadata_ptr)
    957       if run_metadata:
    958         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/anaconda3/envs/DESC/lib/python3.7/site-packages/tensorflow-1.15.3-py3.7-linux-x86_64.egg/tensorflow_core/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1147             feed_handles[subfeed_t] = subfeed_val
   1148           else:
-> 1149             np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
   1150 
   1151           if (not is_tensor_handle_feed and

~/anaconda3/envs/DESC/lib/python3.7/site-packages/numpy-1.19.0rc1-py3.7-linux-x86_64.egg/numpy/core/_asarray.py in asarray(a, dtype, order)
     81 
     82     """
---> 83     return array(a, dtype, copy=False, order=order)
     84 
     85 

ValueError: setting an array element with a sequence.

I am using Python 3.7.3, as the installation with Python 3.6 did not work. Do you think this is the problem?
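
A common workaround (hedged; not an official fix): the traceback ends in np.asarray choking on a sparse AnnData view, so materializing the view and densifying .X before desc.train usually resolves this error:

import scipy.sparse as sp

adata = adata.copy()               # turn the AnnData *view* into a real object
if sp.issparse(adata.X):
    adata.X = adata.X.toarray()    # the Keras feed expects a dense array here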

ValueError: Cannot assign value to variable ' encoder_0/kernel:0': Shape mismatch.

I could run through the PBMC tutorial data set with no problems, but got the ValueError when running desc.train on my own data. What did I do wrong?

ValueError: Cannot assign value to variable ' encoder_0/kernel:0': Shape mismatch.The variable shape (1141, 32), and the assigned value shape (2013, 32) are incompatible.

>>> desc.__version__
'2.1.1'
>>> tf.__version__
'2.12.0'
>>> scanpy.__version__
'1.9.3'

My AnnData object after running:

desc.normalize_per_cell(adata, counts_per_cell_after=1e4)
desc.log1p(adata)
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5, subset=True)
desc.scale(adata, zero_center=True, max_value=3)
adata

AnnData object with n_obs × n_vars = 5141 × 1141
obs: 'BATCH', 'n_genes', 'n_counts'
var: 'n_cells', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
uns: 'log1p', 'hvg'
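
A hedged guess at the cause: a previous run saved autoencoder weights (e.g. ae_weights.h5) for a 2013-gene input, and desc reloaded them for your 1141-gene data. Using a fresh save_dir, or deleting the stale weights, forces retraining at the right input dimension:

import os

# Illustrative path; use whatever save_dir the previous run wrote to.
stale = os.path.join("result_desc", "ae_weights.h5")
if os.path.exists(stale):
    os.remove(stale)
# ...or simply pass a brand-new save_dir to desc.train(...)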

How to remove batch effect?

Hi, eleozzr!
Thanks for your great tool. The advantage that attracts me is batch effect removal. I have two 10X scRNA-seq datasets that I want to combine and then cluster. So I have two questions:
1. Can I use DESC to remove the batch effect and then cluster?
2. If DESC can handle this situation, could you provide some simple example code? I can't find related information in your tutorial.

Thanks
Mengyan Zhu
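
For what it's worth, a hedged sketch of one way to set this up, using only function names that appear elsewhere in this README (a BATCH column in .obs identifies the datasets; whether additional per-batch scaling is needed depends on your data):

import desc
import scanpy as sc

# adata1, adata2: the two 10X datasets, already loaded as AnnData objects
adata = adata1.concatenate(adata2, batch_key="BATCH")

desc.normalize_per_cell(adata, counts_per_cell_after=1e4)
desc.log1p(adata)
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5, subset=True)
desc.scale(adata, zero_center=True, max_value=3)

adata = desc.train(adata, dims=[adata.shape[1], 64, 32],
                   louvain_resolution=[0.8], save_dir="desc_batch",
                   do_umap=True)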

DESC Question

Hello Xiangjie,

I am sorry for creating an 'issue'; I have more of a question regarding DESC's functionality. I want to see the probability that an individual cell has of belonging to a particular cluster; what's the best way of doing that?
I understand that there is a 'prob_matrix0.8' entry in the DESC object that contains a probability matrix. However, I don't know how that probability matrix aligns with individual cells.

Is there a way to see the probability of an individual cell belonging to a cell cluster?

Thank you,
Behram
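
A hedged sketch of how the probability matrix can be lined up with cells, assuming its rows follow the order of adata.obs_names (worth verifying on your own object):

import pandas as pd

prob = pd.DataFrame(adata.uns["prob_matrix0.8"], index=adata.obs_names)
prob.columns = [f"cluster_{c}" for c in prob.columns]

print(prob.head())                 # per-cell soft assignment probabilities
print(prob.max(axis=1).head())     # each cell's confidence in its best cluster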

TypeError: add_weight() got multiple values for argument 'name'

Dear author,

I have encountered with a running error with "desc.run_desc_test()":

TypeError: add_weight() got multiple values for argument 'name'

It appears that the error is raised in the function "add_weight":

/extraspace/hruan/softs/anaconda3/envs/DESC/lib/python3.6/site-packages/desc/models/network.py in build(self, input_shape)
73 input_dim = input_shape[1]
74 self.input_spec = InputSpec(dtype=K.floatx(), shape=(None, input_dim))
---> 75 self.clusters = self.add_weight((self.n_clusters, input_dim), initializer='glorot_uniform', name='clusters')
76 if self.initial_weights is not None:
77 self.set_weights(self.initial_weights)

Please see the attached package version for my env:

In [2]: keras.__version__
Out[2]: '2.3.1'

In [4]: tensorflow.__version__
Out[4]: '1.14.0'

In [6]: scanpy.__version__
Out[6]: '1.3.6'

I am guessing that this could be a package compatibility issue. What do you think?
What do you think?

Thank you so much for your help

Input contains NaN, infinity or a value too large for dtype('float32')

Hello, you really did a good job, but I ran into some trouble; could you tell me how to fix it? The question is as follows:
ValueError Traceback (most recent call last)
in
----> 1 adata=desc.train(adata,
2 dims=[adata.shape[1],64,32],
3 tol=0.005,
4 n_neighbors=10,
5 batch_size=256,

~/anaconda3/envs/R_env/lib/python3.8/site-packages/desc/models/desc.py in train(data, dims, alpha, tol, init, louvain_resolution, n_neighbors, pretrain_epochs, batch_size, activation, actincenter, drop_rate_SAE, is_stacked, use_earlyStop, use_ae_weights, save_encoder_weights, save_encoder_step, save_dir, max_iter, epochs_fit, num_Cores, num_Cores_tsne, use_GPU, GPU_id, random_seed, verbose, do_tsne, learning_rate, perplexity, do_umap, kernel_clustering)
301 print("Start to process resolution=",str(resolution))
302 use_ae_weights=use_ae_weights if ith==0 else True
--> 303 res=train_single(data=data,
304 dims=dims,
305 alpha=alpha,

~/anaconda3/envs/R_env/lib/python3.8/site-packages/desc/models/desc.py in train_single(data, dims, alpha, tol, init, louvain_resolution, n_neighbors, pretrain_epochs, batch_size, activation, actincenter, drop_rate_SAE, is_stacked, use_earlyStop, use_ae_weights, save_encoder_weights, save_encoder_step, save_dir, max_iter, epochs_fit, num_Cores, num_Cores_tsne, use_GPU, GPU_id, random_seed, verbose, do_tsne, learning_rate, perplexity, do_umap, kernel_clustering)
162 if do_tsne:
163 num_Cores_tsne=int(num_Cores_tsne) if total_cpu>int(num_Cores_tsne) else int(math.ceil(total_cpu/2))
--> 164 sc.tl.tsne(adata,use_rep="X_Embeded_z"+str(louvain_resolution),learning_rate=learning_rate,perplexity=perplexity,n_jobs=num_Cores_tsne)
165 adata.obsm["X_tsne"+str(louvain_resolution)]=adata.obsm["X_tsne"].copy()
166 print('tsne finished and added X_tsne'+str(louvain_resolution),' into the umap coordinates (adata.obsm)\n')

~/anaconda3/envs/R_env/lib/python3.8/site-packages/scanpy/tools/_tsne.py in tsne(adata, n_pcs, use_rep, perplexity, early_exaggeration, learning_rate, random_state, use_fast_tsne, n_jobs, copy)
113 tsne = TSNE(**params_sklearn)
114 logg.info(' using sklearn.manifold.TSNE with a fix by D. DeTomaso')
--> 115 X_tsne = tsne.fit_transform(X)
116 # update AnnData instance
117 adata.obsm['X_tsne'] = X_tsne # annotate samples with tSNE coordinates

~/anaconda3/envs/R_env/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py in fit_transform(self, X, y)
    889         Embedding of the training data in low-dimensional space.
    890         """
--> 891         embedding = self._fit(X)
    892         self.embedding_ = embedding
    893         return self.embedding_

~/anaconda3/envs/R_env/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py in _fit(self, X, skip_num_points)
667 raise ValueError("'angle' must be between 0.0 - 1.0")
668 if self.method == 'barnes_hut':
--> 669 X = self._validate_data(X, accept_sparse=['csr'],
670 ensure_min_samples=2,
671 dtype=[np.float32, np.float64])

~/anaconda3/envs/R_env/lib/python3.8/site-packages/sklearn/base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
418 f"requires y to be passed, but the target y is None."
419 )
--> 420 X = check_array(X, **check_params)
421 out = X
422 else:

~/anaconda3/envs/R_env/lib/python3.8/site-packages/sklearn/utils/validation.py in inner_f(*args, **kwargs)
70 FutureWarning)
71 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72 return f(**kwargs)
73 return inner_f
74

~/anaconda3/envs/R_env/lib/python3.8/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
642
643 if force_all_finite:
--> 644 _assert_all_finite(array,
645 allow_nan=force_all_finite == 'allow-nan')
646

~/anaconda3/envs/R_env/lib/python3.8/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
94 not allow_nan and not np.isfinite(X).all()):
95 type_err = 'infinity' if allow_nan else 'NaN, infinity'
---> 96 raise ValueError(
97 msg_err.format
98 (type_err,

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
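
A hedged diagnostic before re-running: check the matrix handed to desc/t-SNE for NaN or inf values, which often come from scaling zero-variance genes or from double log-transforming:

import numpy as np

X = adata.X.toarray() if hasattr(adata.X, "toarray") else adata.X
print("NaN:", np.isnan(X).any(), " inf:", np.isinf(X).any())
print("zero-variance genes:", int((X.std(axis=0) == 0).sum()))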

Install errors

Hi:
when I installed DESC with the following command lines, I ran into some problems; could you give me some help if possible?
conda create -n DESC python=3.5.3
# activate your environment
source activate DESC
git clone https://github.com/eleozzr/desc
cd desc
python setup.py build
python setup.py install

error message:
../../../source/igraph/src/bliss/graph.cc: In constructor ‘bliss::AbstractGraph::AbstractGraph()’:
../../../source/igraph/src/bliss/graph.cc:72:13: error: ‘stdout’ was not declared in this scope
verbstr = stdout;
^
Makefile:8801: recipe for target 'bliss/libigraph_la-graph.lo' failed
make[3]: *** [bliss/libigraph_la-graph.lo] Error 1
make[3]: Leaving directory '/tmp/easy_install-0ffjxgqf/python-igraph-0.8.3/vendor/build/igraph/src'
Makefile:1746: recipe for target 'all' failed
make[2]: *** [all] Error 2
make[2]: Leaving directory '/tmp/easy_install-0ffjxgqf/python-igraph-0.8.3/vendor/build/igraph/src'
Makefile:497: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/tmp/easy_install-0ffjxgqf/python-igraph-0.8.3/vendor/build/igraph'
Makefile:404: recipe for target 'all' failed
make: *** [all] Error 2
Could not compile the C core of igraph.

error: Setup script exited with 1

Evaluation metric for batch effect removal

Hello, I recently read your paper, and it is excellent work. But I cannot find the code for the KL divergence evaluation metric for batch effect removal from your paper; could you send it to me? Thanks very much.


UnicodeEncodeError: 'ascii' codec can't encode character '\xd7' in position 26: ordinal not in range(128)

Hi, I've installed desc and am trying to run the basic tests. However, I'm running into a small problem (I believe the training finishes successfully):
[attached: pipfreeze_desc.txt]

delta_label  0.004814814814814815 < tol  0.005
Reached tolerance threshold. Stop training.
The final prediction cluster is:
0    1492
1    1208
dtype: int64
The desc has been trained successfully!!!!!!
The summary of desc model is:
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 100)               0         
_________________________________________________________________
encoder_0 (Dense)            (None, 64)                6464      
_________________________________________________________________
encoder_1 (Dense)            (None, 16)                1040      
_________________________________________________________________
clustering (ClusteringLayer) (None, 2)                 32        
=================================================================
Total params: 7,536
Trainable params: 7,536
Non-trainable params: 0
_________________________________________________________________
The runtime of (resolution=0.1)is: 45.76247525215149
The run time for all resolution is: 45.7717764377594
After training, the information of adata is:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/conda/lib/python3.6/site-packages/desc/tools/test.py", line 9, in run_desc_test
    adata = train(adata, dims=[100, 64, 16], louvain_resolution=0.1)
  File "/opt/conda/lib/python3.6/site-packages/desc/models/desc.py", line 330, in train
    print("After training, the information of adata is:\n",adata)
UnicodeEncodeError: 'ascii' codec can't encode character '\xd7' in position 26: ordinal not in range(128)
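
A hedged workaround: the '×' in AnnData's repr cannot be encoded by an ASCII stdout. Forcing UTF-8 output before training avoids the crash:

import io
import sys

# Wrap stdout in a UTF-8 writer (works on Python 3.6, where
# sys.stdout.reconfigure() is not yet available).
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding="utf-8")
# Alternatively, set PYTHONIOENCODING=utf-8 in the shell before launching Python.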

Attribute error on desc.py

Hi,
I am unable to figure out the following error; could you help? Thanks.
When I run the quick desc test in a Python interpreter session, I type

import desc
desc.run_desc_test()
AttributeError: 'tuple' object has no attribute 'tocsr'.
My env:
'matplotlib==3.0.3'
'pydot==1.4.1',
'tensorflow==1.7',
'keras==2.1',
'scanpy==1.4.4',
'louvain==0.7.0',
'python-igraph==0.8.0',
'h5py==2.10.0',
'pandas==1.0.3',

monocle3_alpha has been deprecated, :(

Hi, I am trying to reproduce the outcomes of your paper, but the monocle3_alpha version has been deprecated. Could you kindly provide a way to deal with this problem?

Maybe a repository for monocle3_alpha, or a way to download the alpha version?

Thank you in advance

Adding DESC to data integration benchmarking

Hi @eleozzr,

We were thinking about adding DESC to our benchmark of data integration tools (https://github.com/theislab/scib). We would be running our own pre-processing for the input to DESC for this, which is reliant on Scanpy version 1.4.5+. Do you think it would be possible to use just the desc.train() function if we remove the Scanpy requirement and install via github? Would this also be okay for using Keras 2.2.4?

Also, to compare the methods properly we would not be able to use the clustering output you provide, but instead we would use the embedding at a default clustering resolution (resolution=0.8 as in your tutorial). Would this be a suitable way of evaluating DESC?

Kind regards,

Question: DESC does not use batch labels?

Hi, I've been playing with DESC and I just wanted to make sure I wasn't being stupid about something.

DESC does not use batch labels at any point in the process?

Dimension 0 in both shapes must be equal

Hi, starting to tinker with the package (looks really useful, thanks for developing it!). Trying to load in my own dataset (tutorial code otherwise more or less the same). When I run DESC.train I get:

Start to process resolution= 0.8
The number of cpu in your computer is 8
Checking whether result_pbmc3k/ae_weights,h5  exists in the directory
Traceback (most recent call last):
  File "/anaconda2/envs/DESC/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1659, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 2769 and 1838. Shapes are [2769,32] and [1838,32]. for 'Assign_17' (op: 'Assign') with input shapes: [2769,32], [1838,32].

It's probably a simple thing, but it would be good to have an error catch for this that gives a more informative message.

consult

Hello, author! Do you have a pre-trained model and code? Could you share them?
@eleozzr

Attribute error on desc.train

Hi,
I've got the following error message desc.train with anndata object.
AttributeError: module 'scanpy.api.logging' has no attribute 'msg'
Thanks,
Javier

Scaled data post-batch-correction

Hi,

Your paper caught my eye and I'm looking into using DESC to merge ~200-250k cells from ~40 libraries. How do I get a batch-corrected matrix of counts for visualization after batch correction?

From what I understand, you store a representation of the batch-corrected data in adata.obsm['X_Embeded_zRES'], which is used for the t-SNE and UMAP, but what if I want batch-corrected gene expression values for overlaying gene expression on the UMAP, or for plotting a heatmap?
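
For context, a hedged sketch of the usual pattern with embedding-based methods like DESC: the batch-balanced information lives in the embedding and the coordinates derived from it, and gene overlays are drawn from the (uncorrected) expression matrix on those coordinates (key names follow this README; 'Cd19' is just an illustrative gene):

import scanpy as sc

# Put the DESC-derived coordinates where scanpy's plotting expects them
adata.obsm["X_umap"] = adata.obsm["X_umap0.8"]

# Cluster labels and gene expression overlaid on the batch-balanced layout
sc.pl.umap(adata, color=["desc_0.8", "Cd19"])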

Outdated dependencies

Hi!

I was wondering if you are working on/thinking about upgrading the package to dependencies with Python 3.7 and higher versions of scanpy? Right now, your package is not appealing to use as it is not up-to-date with recent releases of packages (scanpy, tensorflow, anndata and python itself).

I could make a new conda environment that meets the dependencies that you state, but then I won't be able to use new functions for downstream-analyses in scanpy so that won't solve my issue.

msg error

Hi,
I am unable to figure out the following error; could you help? Thanks.

dtype: int64
The desc has been trained successfully!!!!!!
The summary of desc model is:


Layer (type)                 Output Shape              Param #
=================================================================
input (InputLayer)           (None, 1000)              0
_________________________________________________________________
encoder_0 (Dense)            (None, 64)                64064
_________________________________________________________________
encoder_1 (Dense)            (None, 32)                2080
_________________________________________________________________
clustering (ClusteringLayer) (None, 10)                320
=================================================================
Total params: 66,464
Trainable params: 66,464
Non-trainable params: 0


The runtime of (resolution=0.8)is: 136.45320010185242
/Users/tommy/miniconda3/envs/DESC/lib/python3.6/site-packages/umap/spectral.py:229: UserWarning: Embedding a total of 3 separate connected components using meta-embedding (experimental)
n_components

AttributeError Traceback (most recent call last)
in
17 save_encoder_step=3,# save_encoder_weights is False, this parameter is not used
18 use_ae_weights=False,
---> 19 do_umap=True) #if do_uamp is False, it will don't compute umap coordiate

~/desc-1.0.0.2/desc/models/desc.py in train(data, dims, alpha, tol, init, n_clusters, louvain_resolution, n_neighbors, pretrain_epochs, batch_size, activation, actincenter, drop_rate_SAE, is_stacked, use_earlyStop, use_ae_weights, save_encoder_weights, save_encoder_step, save_dir, max_iter, epochs_fit, num_Cores, num_Cores_tsne, use_GPU, random_seed, verbose, do_tsne, learning_rate, perplexity, do_umap, kernel_clustering)
324 perplexity=perplexity,
325 do_umap=do_umap,
--> 326 kernel_clustering=kernel_clustering)
327 #update adata
328 data=res

~/desc-1.0.0.2/desc/models/desc.py in train_single(data, dims, alpha, tol, init, n_clusters, louvain_resolution, n_neighbors, pretrain_epochs, batch_size, activation, actincenter, drop_rate_SAE, is_stacked, use_earlyStop, use_ae_weights, save_encoder_weights, save_encoder_step, save_dir, max_iter, epochs_fit, num_Cores, num_Cores_tsne, use_GPU, random_seed, verbose, do_tsne, learning_rate, perplexity, do_umap, kernel_clustering)
167 sc.tl.umap(adata)
168 adata.obsm["X_umap"+str(louvain_resolution)]=adata.obsm["X_umap"].copy()
--> 169 sc.logging.msg(' umap finished', t=True, end=' ', v=4)
170 sc.logging.msg('and added\n'
171 ' 'X_umap''+str(louvain_resolution),'the umap coordinates (adata.obsm)\n', v=4)

AttributeError: module 'scanpy.api.logging' has no attribute 'msg'

cells clustering by sample

My dataset contains one cell type from 10 samples. Each sample was sequenced independently and is therefore its own batch as well. When I cluster using desc, the cells cluster purely by sample (with minimal overlap).

I have tried varying the learning rate with limited success.
Are there other parameters I can adjust to encourage more rigorous batch correction?

Installation problem

Hi,

I keep having the following error when installing desc. Your help would be greatly appreciated.

>>> import platform
>>> platform.python_version()
'3.6.8'
>>> import tensorflow as tf
>>> tf.__version__
'1.7.0'
>>> import desc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tommy/desc/desc/__init__.py", line 2, in <module>
    from . import tools
  File "/Users/tommy/desc/desc/tools/__init__.py", line 1, in <module>
    from anndata import read_h5ad
ImportError: cannot import name 'read_h5ad'

Installation error on Mac OS X

Hi all, I am having an issue with installing this package on my Mac OS X. I am using an anaconda environment with python 3.6, so I am using the 'git clone' followed by 'python setup.py build and install' method, and I am getting this error: "fatal error: 'string' file not found" during what looks to be the installation of the "louvain" package, which I believe is a dependency for your package. I have tried installing Xcode and Xcode Command Line Tools, but still get the same error. Any help would be greatly appreciated, thanks so much for your time!

Can not get same results from tutorial desc_2.1.1_paul.ipynb

I cannot get the same results as the tutorials desc_2.1.1_paul.ipynb and desc_2.0.3_paul.ipynb.
https://eleozzr.github.io/desc/tutorial.html

my desc_2.1.1 env:
conda environment created with python==3.6.13

Name Version Build
absl-py 0.12.0 pypi_0
anndata 0.7.5 pypi_0
anyio 2.2.0 py36ha15d459_0
argon2-cffi 20.1.0 py36h68aa20f_2
astunparse 1.6.3 pypi_0
async_generator 1.1 py_0
attrs 20.3.0 pyhd3deb0d_0
babel 2.9.0 pyhd3deb0d_0
backports 1 py_2
backports.functools_lru_cache 1.6.3 pyhd8ed1ab_0
bleach 3.3.0 pyh44b312d_0
cached-property 1.5.2 pypi_0
cachetools 4.2.1 pypi_0
certifi 2020.12.5 py36ha15d459_1
cffi 1.14.5 py36he58ceb7_0
chardet 4.0.0 pypi_0
click 7.1.2 pypi_0
colorama 0.4.4 pyh9f0ad1d_0
contextvars 2.4 py_0
cycler 0.10.0 pypi_0
dataclasses 0.8 pyh787bdff_0
decorator 4.4.2 pypi_0
defusedxml 0.7.1 pyhd8ed1ab_0
desc 2.1.1 pypi_0
entrypoints 0.3 pyhd8ed1ab_1003
flatbuffers 1.12 pypi_0
gast 0.3.3 pypi_0
get-version 2.1 pypi_0
google-auth 1.28.0 pypi_0
google-auth-oauthlib 0.4.4 pypi_0
google-pasta 0.2.0 pypi_0
grpcio 1.32.0 pypi_0
h5py 2.10.0 pypi_0
idna 2.1 pypi_0
immutables 0.15 py36h68aa20f_0
importlib-metadata 3.10.0 py36ha15d459_0
ipykernel 5.5.3 py36hfacbf0b_0
ipython 5.8.0 py36_1
ipython_genutils 0.2.0 py_1
jinja2 2.11.3 pyh44b312d_0
joblib 1.0.1 pypi_0
json5 0.9.5 pyh9f0ad1d_0
jsonschema 3.2.0 pyhd8ed1ab_3
jupyter-packaging 0.7.12 pyhd8ed1ab_0
jupyter_client 6.1.12 pyhd8ed1ab_0
jupyter_core 4.7.1 py36ha15d459_0
jupyter_server 1.5.1 py36ha15d459_0
jupyterlab 3.0.12 pyhd8ed1ab_0
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0
jupyterlab_server 2.4.0 pyhd8ed1ab_0
keras-preprocessing 1.1.2 pypi_0
kiwisolver 1.3.1 pypi_0
legacy-api-wrap 1.2 pypi_0
libsodium 1.0.18 h8d14728_1
llvmlite 0.36.0 pypi_0
m2w64-gcc-libgfortran 5.3.0 6
m2w64-gcc-libs 5.3.0 7
m2w64-gcc-libs-core 5.3.0 7
m2w64-gmp 6.1.0 2
m2w64-libwinpthread-git 5.0.0.4634.697f757 2
markdown 3.3.4 pypi_0
markupsafe 1.1.1 py36h68aa20f_3
matplotlib 3.3.4 pypi_0
mistune 0.8.4 py36h68aa20f_1003
msys2-conda-epoch 20160418 1
natsort 7.1.1 pypi_0
nbclassic 0.2.6 pyhd8ed1ab_0
nbclient 0.5.3 pyhd8ed1ab_0
nbconvert 6.0.7 py36ha15d459_3
nbformat 5.1.3 pyhd8ed1ab_0
nest-asyncio 1.5.1 pyhd8ed1ab_0
networkx 2.5.1 pypi_0
notebook 6.3.0 py36ha15d459_0
numba 0.53.1 pypi_0
numexpr 2.7.3 pypi_0
numpy 1.19.5 pypi_0
oauthlib 3.1.0 pypi_0
opt-einsum 3.3.0 pypi_0
packaging 20.9 pyh44b312d_0
pandas 1.1.5 pypi_0
pandoc 2.13 h8ffe710_0
pandocfilters 1.4.2 py_1
patsy 0.5.1 pypi_0
pickleshare 0.7.5 py_1003
pillow 8.2.0 pypi_0
pip 21.0.1 pyhd8ed1ab_0
prometheus_client 0.10.0 pyhd8ed1ab_0
prompt_toolkit 1.0.15 py_1
protobuf 3.15.7 pypi_0
pyasn1 0.4.8 pypi_0
pyasn1-modules 0.2.8 pypi_0
pycparser 2.2 pyh9f0ad1d_2
pydot 1.4.2 pypi_0
pygments 2.8.1 pyhd8ed1ab_0
pynndescent 0.5.2 pypi_0
pyparsing 2.4.7 pyh9f0ad1d_0
pyrsistent 0.17.3 py36h68aa20f_2
python 3.6.13 h39d44d4_0_cpython
python-dateutil 2.8.1 py_0
python_abi 3.6 1_cp36m
pytz 2021.1 pyhd8ed1ab_0
pywin32 300 py36h68aa20f_0
pywinpty 0.5.7 py36h9f0ad1d_1
pyzmq 22.0.3 py36h1d5d788_1
requests 2.25.1 pypi_0
requests-oauthlib 1.3.0 pypi_0
rsa 4.7.2 pypi_0
scanpy 1.5.1 pypi_0
scikit-learn 0.24.1 pypi_0
scipy 1.5.4 pypi_0
seaborn 0.11.1 pypi_0
send2trash 1.5.0 py_0
setuptools 54.2.0 pypi_0
setuptools-scm 6.0.1 pypi_0
simplegeneric 0.8.1 py_1
six 1.15.0 pyh9f0ad1d_0
sniffio 1.2.0 py36ha15d459_1
statsmodels 0.12.2 pypi_0
stdlib-list 0.8.0 pypi_0
tables 3.6.1 pypi_0
tensorboard 2.4.1 pypi_0
tensorboard-plugin-wit 1.8.0 pypi_0
tensorflow 2.4.1 pypi_0
tensorflow-estimator 2.4.0 pypi_0
termcolor 1.1.0 pypi_0
terminado 0.9.4 py36ha15d459_0
testpath 0.4.4 py_0
texttable 1.6.3 pypi_0
threadpoolctl 2.1.0 pypi_0
tornado 6.1 py36h68aa20f_1
tqdm 4.60.0 pypi_0
traitlets 4.3.3 py36h9f0ad1d_1
typing_extensions 3.7.4.3 py_0
umap-learn 0.5.1 pypi_0
urllib3 1.26.4 pypi_0
vc 14.2 hb210afc_4
vs2015_runtime 14.28.29325 h5e1d092_4
wcwidth 0.2.5 pyh9f0ad1d_2
webencodings 0.5.1 py_1
werkzeug 1.0.1 pypi_0
wheel 0.36.2 pyhd3deb0d_0
wincertstore 0.2 py36ha15d459_1006
winpty 0.4.3 4
wrapt 1.12.1 pypi_0
zeromq 4.3.4 h0e60522_0
zipp 3.4.1 pyhd8ed1ab_0

Then, I followed the tutorial desc_2.1.1_paul.ipynb.

1. Different cell matrix

desc_2.1.1_paul.ipynb.
AnnData object with n_obs × n_vars = 2730 × 999
obs: 'paul15_clusters', 'celltype', 'celltype2', 'desc_0.8', 'desc_1.0'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
uns: 'iroot', 'log1p', 'prob_matrix0.8', 'prob_matrix1.0'
obsm: 'X_Embeded_z0.8', 'X_tsne', 'X_tsne0.8', 'X_Embeded_z1.0', 'X_tsne1.0'

my script:
AnnData object with n_obs × n_vars = 2730 × 1000
obs: 'paul15_clusters', 'celltype', 'celltype2', 'desc_0.8', 'desc_1.0'
var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
uns: 'iroot', 'log1p', 'prob_matrix0.8', 'prob_matrix1.0'
obsm: 'X_Embeded_z0.8', 'X_tsne', 'X_tsne0.8', 'X_Embeded_z1.0', 'X_tsne1.0'

2. Different t-SNE plot
desc_2.1.1_paul.ipynb.
https://ibb.co/GsPv38j

my script:
https://ibb.co/NxhHgq0

What is the cause of these differences?

Install error in Ubuntu 20.04

Hi team,

Congratulation on the new pipeline. However, I spent a day trying to install desc on my ubuntu server (20.04, amd64) but it failed. Here are my attempts:

  1. I created a new conda environment with Python 3.5.4 on my server, installed TensorFlow 1.7.0 in the new env, and then installed desc per the tutorial. Everything looked fine. However, when I ran desc.run_desc_test(), it failed with the error
    ImportError: ('Failed to import pydot. You must pip install pydot and install graphviz (https://graphviz.gitlab.io/download/), ', 'for pydotprint to work.'). I installed pydot and graphviz through PyPI and ran the test again; I got the same error.

  2. I created another new environment and installed Python 3.6.6. The TensorFlow installation was fine, but when I installed desc, building the wheels for louvain and igraph failed.

  3. I created another environment and installed Python 3.8.3. The TensorFlow 1.7.0 installation failed.

  4. I repeated attempt 1 as described above and built desc from scratch by downloading the repo. It threw an error that pandas and matplotlib were not found. I installed pandas and matplotlib and re-installed desc; it threw the same errors.

I wonder which Ubuntu version you were using when you tested desc on Linux?

Many thanks!

Zhang

Data set

Dear teacher, hello! I have successfully reproduced your code, but when I ran it with my own dataset, the dataset could not be read. The error message says that the format of my dataset is different from yours. I don't know the specific layout of your Paul15 dataset, and I can't download Paul15 from the internet, so I can't make my dataset's format match yours. Could you share your Paul15 dataset? I want to reformat my own data according to your dataset's format and see whether my data then runs. I sincerely ask for your advice and hope you can reply to my request. Thank you very much. Good luck.
@eleozzr @Yafei611
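
For what it's worth, the Paul et al. (2015) dataset used in the tutorials can be fetched directly through scanpy, which also shows the expected AnnData format:

import scanpy as sc

adata = sc.datasets.paul15()   # downloads and caches the Paul15 dataset
print(adata)                   # inspect the structure your own data should match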

ImportError: Cannot load backend 'TkAgg' which requires the 'tk' interactive framework, as 'headless' is currently running

Hello

I have been trying to reproduce this. However, I keep getting an error with TkAgg.

This is the block that is giving me the error (with a few modifications, but the original gives me the same message):

import os              
os.environ['PYTHONHASHSEED'] = '0'

import matplotlib
matplotlib.use('TKAgg', force = True)
import matplotlib.pyplot as plt

import desc          
import pandas as pd                                                    
import numpy as np                                                     
import scanpy.api as sc                                                                                 
from time import time                                                       
import sys
import matplotlib.pyplot as plt
%matplotlib inline 
sc.settings.set_figure_params(dpi=300)

The output is:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-9cdf2f4155a6> in <module>
      4 import matplotlib
      5 matplotlib.use('TKAgg', force = True)
----> 6 import matplotlib.pyplot as plt
      7 
      8 import desc

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/matplotlib/pyplot.py in <module>
   2228     dict.__setitem__(rcParams, "backend", rcsetup._auto_backend_sentinel)
   2229 # Set up the backend.
-> 2230 switch_backend(rcParams["backend"])
   2231 
   2232 # Just to be safe.  Interactive mode can be turned on without

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/matplotlib/pyplot.py in switch_backend(newbackend)
    273         if (current_framework and required_framework
    274                 and current_framework != required_framework):
--> 275             raise ImportError(
    276                 "Cannot load backend {!r} which requires the {!r} interactive "
    277                 "framework, as {!r} is currently running".format(

ImportError: Cannot load backend 'TkAgg' which requires the 'tk' interactive framework, as 'headless' is currently running


It seemed to be a problem with TkAgg (hence the modified first few lines). But when I run

print("Using:", matplotlib.get_backend())

It seems the backend TkAgg can be used fine:

Using: TkAgg

Here is the output when I run exactly what is in the tutorial:

import os              
os.environ['PYTHONHASHSEED'] = '0'
import desc          
import pandas as pd                                                    
import numpy as np                                                     
import scanpy.api as sc                                                                                 
from time import time                                                       
import sys
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline 
sc.settings.set_figure_params(dpi=300)
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-2-792688ab25c3> in <module>
      1 import os
      2 os.environ['PYTHONHASHSEED'] = '0'
----> 3 import desc
      4 import pandas as pd
      5 import numpy as np

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/desc/__init__.py in <module>
      1 #from . import original as og
----> 2 from . import tools
      3 from . import models
      4 from . import datasets
      5 

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/desc/tools/__init__.py in <module>
      2 from scanpy.preprocessing import normalize_per_cell, highly_variable_genes, log1p, scale
      3 
----> 4 from .test import run_desc_test
      5 from .read import read_10X
      6 from .write import write_desc_result

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/desc/tools/test.py in <module>
      1 
      2 from ..datasets import pbmc_processed
----> 3 from ..models.desc import train
      4 
      5 

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/desc/models/__init__.py in <module>
----> 1 from .desc import train
      2 

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/desc/models/desc.py in <module>
     22 #if we have a display use a plotting backend
     23 if havedisplay:
---> 24     matplotlib.use('TkAgg')
     25 else:
     26     matplotlib.use('Agg')

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/matplotlib/__init__.py in use(backend, force)
   1142                 # user does not have the libraries to support their
   1143                 # chosen backend installed.
-> 1144                 plt.switch_backend(name)
   1145             except ImportError:
   1146                 if force:

/data04/projects04/MarianaBoroni/lbbc_members/lib/conda_envs/diogoamb/lib/python3.9/site-packages/matplotlib/pyplot.py in switch_backend(newbackend)
    273         if (current_framework and required_framework
    274                 and current_framework != required_framework):
--> 275             raise ImportError(
    276                 "Cannot load backend {!r} which requires the {!r} interactive "
    277                 "framework, as {!r} is currently running".format(

ImportError: Cannot load backend 'TkAgg' which requires the 'tk' interactive framework, as 'headless' is currently running

Any ideas?

clustering indicators

Dear teacher, how are you? I have reproduced your results, but there are no specific evaluation metrics. I see you have two evaluation metrics, ARI and KL divergence, in the paper. Could you please send me the code for these evaluation metrics? I would like to see whether my reproduced result is consistent with yours. I sincerely hope that you can reply. Thank you!
@Yafei611 @eleozzr

DESC: Finding Cluster Specific Markers

Is there a way to get a list of marker genes that are specific to certain cell clusters? For example, something similar to the 'FindMarkers' function from Seurat or the 'rank_genes_groups' method from Scanpy.

The only potential way I see is creating a violin plot of selected 'marker' genes within cell clusters (towards the end of DESC's tutorial). Is there a way to obtain a list of the most prominent genes for a particular cluster?
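
A hedged sketch: DESC writes its hard labels into adata.obs (e.g. 'desc_0.8'), so scanpy's marker detection can run directly on them:

import scanpy as sc

adata.obs["desc_0.8"] = adata.obs["desc_0.8"].astype("category")
sc.tl.rank_genes_groups(adata, groupby="desc_0.8", method="wilcoxon")
sc.pl.rank_genes_groups(adata, n_genes=20, sharey=False)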

Question on DESC2.0.3 K-means initialization

Hi,

Thank you for the great work applying autoencoders to single-cell data clustering and batch effect removal. When I tried the K-means initialization with the DESC 2.0.3 version, I got TypeError: train() got an unexpected keyword argument 'n_clusters'.

Here is how I call the function:

adata = desc.train(adata,
                       dims=[adata.shape[1], 128, 32],
                       tol=0.005,
                       n_clusters=10,
                       n_neighbors=10, louvain_resolution=[0.8],
                       batch_size=256,
                       save_dir=result_dir,
                       do_tsne=True, learning_rate=300, # learning_rate for tSNE
                       do_umap=True, num_Cores_tsne=4,
                       save_encoder_weights=True,
                       use_GPU=False)

My environment information is:
TF 1.15.0 and scanpy==1.5.1 anndata==0.7.4 umap==0.4.6 numpy==1.19.2 scipy==1.5.2 pandas==1.1.3 scikit-learn==0.23.2 statsmodels==0.12.0 python-igraph==0.8.3 louvain==0.7.0.

Could you please give me some insights? Thank you in advance!

Sincerely,

ImportError: cannot import name 'stacked_violin'

python 3.6
tensorflow 1.7.0

When I install desc and run

import desc

it gives me the following error:

ImportError: cannot import name 'stacked_violin'

I guess it is caused by an incompatible scanpy version. Do you have an environment file that I can use for conda or miniconda?

The learning rate of pre-trained autoencoder

In the code comments, it says that the learning rate of the pre-trained autoencoder decays from 0.01, but the actual code decays from 1. Which learning rate should it be?

Update citation to Nat Comms paper

Hi!

Now that the DESC manuscript has been peer-reviewed and published in Nature Communications, the citation in the README.md should be updated.

Cheers!

matplotlib backend problem!

Whenever I import desc, my matplotlib backend is switched to 'Agg', which is very inconvenient! How do I stop this behavior?
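
A hedged workaround, mirroring the import shown in the TkAgg issue above: switch the backend back after importing desc (requires a display and tk support):

import desc                           # desc switches the backend to 'Agg' on import when headless
import matplotlib
matplotlib.use("TkAgg", force=True)   # restore an interactive backend
import matplotlib.pyplot as plt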
