Comments (8)
Hi, I don’t see any errors in your example. Can you please specify what you expected to happen and how what happened differs from that?
Also please provide code as text (in a Markdown code block), not as images.
from scanpy.
Hi, I intended to select 13634 HVGs, but it gave me 13652 genes. I am curious about the difference. Is it a bug?
Code:
adata_atac.X = (adata_atac.X > 0)*1
sc.pp.highly_variable_genes(adata_atac, n_top_genes=13634)
adata_atac = adata_atac[:,adata_atac.var['highly_variable']]
This could have all kinds of reasons, for example genes sharing the same score or duplicate gene names.
We won’t be able to tell without a reproducible example.
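To illustrate the "genes sharing the same score" possibility, here is a minimal sketch in plain NumPy. It assumes a selection rule of the form "keep everything scoring at least as high as the n-th best score"; whether a given scanpy version selects this way is an assumption, not something confirmed in this thread. The helper names `select_by_cutoff` and `select_exactly` are hypothetical.

```python
import numpy as np


def select_by_cutoff(scores, n_top):
    """Keep every item whose score is >= the n_top-th largest score.

    With tied scores at the cutoff, this can keep more than n_top items.
    """
    cutoff = np.sort(scores)[::-1][n_top - 1]
    return scores >= cutoff


def select_exactly(scores, n_top):
    """Keep exactly n_top items; ties are broken by original position."""
    # A stable sort keeps tied items in their original order.
    order = np.argsort(-scores, kind="stable")
    mask = np.zeros(scores.shape[0], dtype=bool)
    mask[order[:n_top]] = True
    return mask


# Toy scores: three items tie at 0.5, right at the cutoff for n_top=2.
scores = np.array([0.9, 0.5, 0.5, 0.5, 0.1])
by_cutoff = select_by_cutoff(scores, n_top=2)
exact = select_exactly(scores, n_top=2)
print(by_cutoff.sum(), exact.sum())
```

Under that assumption, the cutoff rule returns all tied items, while ranking with a deterministic tie-break returns exactly the requested number. Binarized data may make tied scores more likely, since many features can end up with identical statistics.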
If it is caused by genes sharing the same score, how can I ensure uniqueness? Thanks. I did not receive the var.unique.name error, so I think it was not caused by duplicate gene names.
It is the NeurIPS 2021 single-cell competition dataset.
I’ll take a look if you update your issue with a code block that I can copy, that will download the dataset and then reproduce the error without me having to go to any website and download anything manually.
Thanks. The dataset is here:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122
Since it is from GSE, I think it is easy to download, and the multiome part is already in h5ad format:
GSE194122_openproblems_neurips2021_multiome_BMMC_processed.h5ad.gz
Code:
adata_atac.X = (adata_atac.X > 0)*1
sc.pp.highly_variable_genes(adata_atac, n_top_genes=13634)
adata_atac = adata_atac[:,adata_atac.var['highly_variable']]
So this would be a reproducible example:
import gzip
import shutil
from pathlib import Path
from urllib.request import urlopen

import scanpy as sc
from tqdm.notebook import tqdm

url = "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE194122&format=file&file=GSE194122%5Fopenproblems%5Fneurips2021%5Fmultiome%5FBMMC%5Fprocessed%2Eh5ad%2Egz"
path = Path("data/GSE194122_openproblems_neurips2021_multiome_BMMC_processed.h5ad")

# Download and decompress the dataset once; later runs reuse the local file.
if not path.is_file():
    # Create the target directory first, otherwise opening the file fails.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Parenthesized context managers need Python 3.10 or newer.
    with (
        urlopen(url) as raw,
        tqdm.wrapattr(raw, "read", total=int(raw.headers["Content-Length"])) as wrapped,
        gzip.open(wrapped, "rb") as f_in,
        path.open("wb") as f_out,
    ):
        shutil.copyfileobj(f_in, f_out)

adata_atac = sc.read(path)
# Binarize the matrix: any nonzero entry becomes 1.
adata_atac.X = (adata_atac.X > 0) * 1
sc.pp.highly_variable_genes(adata_atac, n_top_genes=13634)
adata_atac = adata_atac[:, adata_atac.var["highly_variable"]]
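To check whether tied scores account for the extra genes, one could inspect the annotation table before subsetting. The sketch below works on a toy pandas DataFrame standing in for `adata_atac.var`; the helper `tie_report` is hypothetical, and the column names `highly_variable` and `dispersions_norm` are assumed to match what `sc.pp.highly_variable_genes` writes with its default flavor.

```python
import pandas as pd


def tie_report(var, n_top, score_col="dispersions_norm", flag_col="highly_variable"):
    """Summarize how many selected features sit exactly at the cutoff score."""
    selected = var[flag_col].astype(bool)
    # The lowest score among selected features acts as the effective cutoff.
    cutoff = var.loc[selected, score_col].min()
    n_above = int((var[score_col] > cutoff).sum())
    n_at_cutoff = int((var[score_col] == cutoff).sum())
    n_selected = int(selected.sum())
    return {
        "n_selected": n_selected,
        "surplus": n_selected - n_top,
        "n_above_cutoff": n_above,
        "n_at_cutoff": n_at_cutoff,
        # True when fewer than n_top features score strictly above the cutoff
        # and the selection is exactly "above the cutoff" plus "tied at it".
        "ties_explain_surplus": n_above < n_top and n_selected == n_above + n_at_cutoff,
    }


# Toy stand-in for adata_atac.var: three features tie at the cutoff score 2.0.
var = pd.DataFrame(
    {"dispersions_norm": [3.0, 2.0, 2.0, 2.0, 1.0]},
    index=[f"peak{i}" for i in range(5)],
)
var["highly_variable"] = var["dispersions_norm"] >= 2.0

report = tie_report(var, n_top=3)
print(report)
```

On real data this would be called as `tie_report(adata_atac.var, n_top=13634)` right after `sc.pp.highly_variable_genes` and before subsetting. A positive surplus together with `ties_explain_surplus` being true would point to tied scores rather than duplicate names.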
The problem seems to be fixed in the newest scanpy.