
mpox's Introduction

Nextstrain repository for mpox virus


This repository contains three workflows for the analysis of mpox virus (MPXV) data:

  • ingest/ - Download data from GenBank, clean and curate it, and upload it to S3
  • phylogenetic/ - Filter sequences, align, construct phylogeny and export for visualization
  • nextclade/ - Make Nextclade datasets for nextstrain/nextclade_data

Each folder contains a README.md with more information. The results of running these workflows are publicly visible at nextstrain.org/mpox.

Installation

Follow the standard installation instructions for Nextstrain's suite of software tools.

Quickstart

Run the default phylogenetic workflow via:

cd phylogenetic/
nextstrain build .
nextstrain view .

Documentation

mpox's People

Contributors

babarlelephant, chaoran-chen, corneliusroemer, dependabot[bot], emmahodcroft, huddlej, ivan-aksamentov, j23414, jameshadfield, joverlee521, pre-commit-ci[bot], pvanheus, rneher, theosanderson, trvrb, tsibley, victorlin


mpox's Issues

ingest: notify Slack with metadata diff

Context

It would be helpful for build maintainers to see metadata diffs in Slack along with notifications of the metadata TSV being updated.

Possible solution

  1. Use diff. The output may be a chore to read since it outputs the entire line that has changed. (A minimal notification sketch for this option follows the list.)
  2. Use csv-diff as it was used in ncov-ingest/bin/notify-on-metadata-change. This sends notifications for changes and additions separately. We eventually stopped using it because it ran out of memory due to the large number of SARS-CoV-2 sequences.
  3. Use daff. Outputs diff in a table with a new column marking changes, additions, and deletions. (I personally use daff when comparing tabular files locally and find the output much easier to understand)
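
To make option 1 concrete, here is a minimal sketch in Python. It assumes a Slack incoming webhook URL in a SLACK_WEBHOOK_URL environment variable; the file paths and the truncation limit are hypothetical:

import os
import subprocess

import requests


def notify_metadata_diff(old_tsv, new_tsv):
    # diff exits non-zero when files differ, so don't use check=True.
    result = subprocess.run(
        ["diff", "--unified=0", old_tsv, new_tsv],
        capture_output=True, text=True,
    )
    if not result.stdout:
        return  # no changes, nothing to post
    # Truncate so the message stays within Slack's size limits.
    snippet = result.stdout[:3000]
    response = requests.post(
        os.environ["SLACK_WEBHOOK_URL"],
        json={"text": f"Metadata TSV changed:\n```{snippet}```"},
        timeout=30,
    )
    response.raise_for_status()


if __name__ == "__main__":
    notify_metadata_diff("data/metadata-previous.tsv", "data/metadata.tsv")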

`IndexError: tuple index out of range` in Snakemake file related to Nextalign rule, line 127

Current Behavior

An error is triggered when the monkeypox pipeline is run using Snakemake. The chunk of code that starts at line 127, which executes Nextalign, fails. This part of the code can be run directly on the command line and the output is produced correctly, but this extra step is not ideal.

Expected behavior

Snakemake should execute all the jobs without failing at line 127.

How to reproduce

Steps to reproduce the current behavior:

  1. Install Nextstrain using the Ambient directions
  2. Install the monkeypox Nextstrain pipeline according to instructions
  3. Run the pipeline after installation is complete using the command snakemake -j 1 -p --configfile config/config_hmpxv1.yaml
  4. See error:
Job 8: 
        Aligning sequences to config/reference.fasta
          - filling gaps with N
        
Reason: Missing output files: results/hmpxv1/aligned.fasta

RuleException in rule align in line 127 of /home/lmarcelat/monkeypox/workflow/snakemake_rules/core.smk:
IndexError: tuple index out of range, when formatting the following:

        nextalign run             --jobs {3}             --reference {input.reference}             --genemap {input.genemap}             --max-indel {params.max_indel}             --seed-spacing {params.seed_spacing}             --retry-reverse-complement             --output-fasta -             --output-insertions {output.insertions}             {input.sequences} | seqkit seq -i > {output.alignment}
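
For what it's worth, this IndexError is characteristic of Snakemake's shell-command handling: Snakemake runs Python's str.format() over the command string, so a literal {3} is read as positional argument number 3 and raises "tuple index out of range". A hedged sketch of the likely fix (directive names assumed from the error above) is to use the named {threads} placeholder instead, or to escape literal braces as {{3}}:

rule align:
    # ...inputs, outputs, and params as in core.smk...
    threads: 3
    shell:
        """
        nextalign run \
            --jobs {threads} \
            --reference {input.reference} \
            --genemap {input.genemap} \
            --max-indel {params.max_indel} \
            --seed-spacing {params.seed_spacing} \
            --retry-reverse-complement \
            --output-fasta - \
            --output-insertions {output.insertions} \
            {input.sequences} | seqkit seq -i > {output.alignment}
        """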

My environment: if running Nextstrain locally

Windows operating system running locally using WSL

ingest: thread Slack notifications

Context

We get multiple Slack notifications per ingest run, so it would be cleaner to have these notifications threaded.

Possible Solution

  • See the ncov workflow for an example of how this can be done with PersistentDict; a minimal threading sketch follows this list.

  • Maybe another way we can go about this is edit the global Snakemake config to store the thread_ts? (Just a thought, haven't really tested if this is possible...)
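
Here is a minimal sketch of the threading itself, assuming a bot token and channel ID in hypothetical SLACK_TOKEN and SLACK_CHANNEL environment variables (persisting thread_ts across rules, e.g. via PersistentDict, is the part that still needs deciding):

import os

import requests


def post_to_slack(text, thread_ts=None):
    """Post a message and return its ts, so follow-ups can thread under it."""
    payload = {
        "channel": os.environ["SLACK_CHANNEL"],
        "text": text,
    }
    if thread_ts:
        # When thread_ts is set, Slack attaches the message to that thread.
        payload["thread_ts"] = thread_ts
    response = requests.post(
        "https://slack.com/api/chat.postMessage",
        headers={"Authorization": f"Bearer {os.environ['SLACK_TOKEN']}"},
        json=payload,
        timeout=30,
    )
    data = response.json()
    if not data.get("ok"):
        raise RuntimeError(f"Slack API error: {data.get('error')}")
    return data["ts"]


# The first notification starts the thread; later ones reply to it.
thread = post_to_slack("Ingest run started")
post_to_slack("Metadata uploaded to S3", thread_ts=thread)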

CI is using an incompatible version of the Conda runtime

Currently (as observed in #176), the Conda runtime job instance of pathogen-ci is failing with the following error:

Current augur version: 22.1.0. Minimum required: 22.2.0

Augur version 22.1.0 is coming from this version of the Conda runtime: nextstrain-base 20230717T174555Z.

This used to work without any noticeable changes. Example: when the Augur minimum version was bumped to 22.2.0, Augur version 22.2.0 was available in this CI run. Notably, the version of the Conda runtime is nextstrain-base 20230731T212806Z.

This also seems to be working fine in the ncov repo, where the latest run resolved to nextstrain-base 20230830T164409Z.

My outstanding question is: why is an older version of the Conda runtime being resolved now, and seemingly only in this repo?

The geographic map is not appearing behind the geolocations in the mapbox image on our installed server as it does on the nextstrain.org server at https://nextstrain.org/monkeypox/mpxv?f_host=Homo%20sapiens

Current Behavior

The geographic map is not appearing behind the geolocations in the mapbox image on our installed server as it does on the nextstrain.org server at https://nextstrain.org/monkeypox/mpxv?f_host=Homo%20sapiens

Expected behavior

The geographic map should appear behind the geolocations, as it does on the nextstrain.org server.

How to reproduce

Steps to reproduce the current behavior:

  1. Installed nextstrain/monkeypox today (28 June 2022) using Docker on our Amazon Server (with nextstrain.cli 3.0.5):

nextstrain-cli/bin/nextstrain build --docker . data/sequences.fasta data/metadata.tsv

nextstrain-cli/bin/nextstrain build --docker --cpus 50 . --configfile config/config_mpxv.yaml

nextstrain-cli/bin/nextstrain build --docker --cpus 50 . --configfile config/config_hmpxv1.yaml

Visualize results

nextstrain-cli/bin/nextstrain view auspice/ --allow-remote-access

In Chrome browser:

http://awsgenomep:4000/monkeypox/mpxv?f_host=Homo%20sapiens

Your environment: if running Nextstrain locally

  • Operating system: Amazon Linux 2 AMI
  • Browser: Chrome
  • Version (e.g. auspice 2.7.0):

Additional context

[Screenshot attached: Screen Shot 2022-06-28 at 12.22.24 PM]

Transmission Line Visualization Feature in Monkeypox Pipeline, Similar to the ncov Build

Context

This feature request aims to enhance the understanding of monkeypox epidemiology and transmission patterns.

Description

Currently, in the ncov build of Nextstrain, when running the pipeline locally, the output JSON file includes a feature that displays the transmission lines between sequences from different countries. I would like to propose extending this feature to the monkeypox pipeline as well.

By including the transmission line visualization in the monkeypox pipeline, researchers and public health professionals can gain valuable insights into the spread and transmission dynamics of monkeypox. This visualization will help track the movement of the virus across geographical regions and identify potential sources of outbreaks.

Implementing this feature in the monkeypox pipeline will contribute to a better understanding of the epidemiology of monkeypox and support more effective disease surveillance and control measures.

Thank you for considering this feature request.

ingest: adopt geolocation rules

Context

Use standard geolocation rules to annotate geolocations so that we do not have to make an annotation for the same geolocation edits for multiple records. This would be a similar process to how ncov-ingest uses the gisaid_geoLocationRules.tsv.

Description

Ideally, this would use a centralized geolocation rules TSV (could be within augur/augur/data/) for the most general rules.
Then, there can be a monkeypox-specific TSV within the repo.

Within the ingest pipeline, we can fetch the general rules from augur's master branch and concatenate the local rules.
For the function that loads the geolocation rules, we can make sure that the local monkeypox rules can overwrite the general rules. Then include a transform step that overwrites geolocation fields using the full geolocation rules.
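
A minimal sketch of the override behavior, assuming a simple two-column TSV of raw and annotated region/country/division/location values joined by "/" (the real rule format may differ):

import csv

GEO_FIELDS = ("region", "country", "division", "location")


def load_geolocation_rules(*tsv_paths):
    """Load rules from general to local; later files overwrite earlier ones."""
    rules = {}
    for path in tsv_paths:
        with open(path, newline="") as handle:
            for row in csv.reader(handle, delimiter="\t"):
                if len(row) >= 2 and not row[0].startswith("#"):
                    rules[row[0]] = row[1]
    return rules


def transform_geolocation(record, rules):
    """Overwrite a record's geolocation fields using the combined rules."""
    raw = "/".join(record.get(field, "") for field in GEO_FIELDS)
    if raw in rules:
        record.update(zip(GEO_FIELDS, rules[raw].split("/")))
    return record


# General rules fetched from augur's master branch, then repo-local overrides.
rules = load_geolocation_rules("general_rules.tsv", "monkeypox_rules.tsv")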

Support for GISAID data

For users who want to use GISAID data with this workflow, the following steps work nearly as expected.

These steps assume you have downloaded:

  • all sequences in FASTA format with whitespace replaced by underscore
  • patient metadata
# Download sequences: data/gisaid_pox_2022_06_16_19.fasta
# Download patient metadata: data/gisaid_pox_2022_06_16_19.tsv
# Note: patient metadata lacks submitting/originating lab.

# Parse out metadata from sequence deflines.
augur parse \
  --sequences data/gisaid_pox_2022_06_16_19.fasta \
  --fields strain gisaid_epi_isl date \
  --output-sequences data/sequences.fasta \
  --output-metadata data/sequence_metadata.tsv

# Join sequence metadata with patient metadata.
csvtk --tabs join -f 1 \
  data/sequence_metadata.tsv \
  data/gisaid_pox_2022_06_16_19.tsv > data/metadata.tsv

# TODO: Need a transform for GISAID locations like the one we have for GenBank.

# Run workflow.
# TODO: This step requires users to know that the "wrangling" of metadata renames the "strain" column to "strain_original"
# so they can rename it back to "strain". Correspondingly, the user has to tell the workflow not to use "strain_original"
# as the display strain name.
nextstrain build \
  --docker \
  --image=nextstrain/base:branch-nextalign-v2 \
  --cpus 1 \
  . \
  --configfile config/config_mpxv.yaml \
  --config strain_id_field=strain_original display_strain_field=strain

Note, the biggest issue with the implementation above is that there is no transform command to convert GISAID's location field to the standard Nextstrain geographic columns (region, country, division, and location). This means the default Augur filter logic that groups by country and year prints a warning message that it cannot find a "country" column and only groups by year. In Augur 16.0.0, this missing group-by column will produce an error message, so we should consider implementing the transform for GISAID locations.
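
A minimal sketch of that transform, assuming GISAID's single location field is formatted as "Region / Country / Division / Location" with " / " separators (worth verifying against real exports):

GEO_FIELDS = ("region", "country", "division", "location")


def split_gisaid_location(location):
    """Split a GISAID location string into Nextstrain's geographic columns."""
    parts = [part.strip() for part in location.split("/")]
    # Pad with empty strings so short locations still yield all four columns.
    parts += [""] * (len(GEO_FIELDS) - len(parts))
    return dict(zip(GEO_FIELDS, parts[:len(GEO_FIELDS)]))


print(split_gisaid_location("Europe / Germany / Bavaria"))
# {'region': 'Europe', 'country': 'Germany', 'division': 'Bavaria', 'location': ''}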

Given the commands above, however, I get the following tree from the workflow:

[Tree image attached]

The very long branches also indicate that users will need to manage their own list of strains to exclude, since strain names will not match GenBank accessions.

ingest: Split monolith transform rule

Context

Currently in the ingest pipeline, there is a single transform rule that runs a shell pipeline of multiple Python scripts. This works in the automated ingest pipeline, but may be tedious to debug when developing or when there's an error in the pipeline.

Description

We can split up the single rule into multiple rules by using Snakemake's piped outputs feature. I don't think anyone in the group has used this feature, so we don't know the pitfalls.

README says 'not public'

I see that you just committed to this repo. Your README.md says 'this is not public', yet this GitHub Repo is public. Just wanted to let you know, in case you forgot to set this GitHub Repo to private.

`fix_tree.py` can create invalid tree

The hmpxv1_big build failed yesterday with a validation error from augur export v2

[batch] [2024-01-21T16:43:57-08:00] Validating schema of 'results/hmpxv1_big/nt_muts.json'...
[batch] [2024-01-21T16:43:57-08:00] Validating schema of 'results/hmpxv1_big/aa_muts.json'...
[batch] [2024-01-21T16:43:57-08:00] Validating config file config/hmpxv1_big/auspice_config.json against the JSON schema
[batch] [2024-01-21T16:43:57-08:00] Validating schema of 'config/hmpxv1_big/auspice_config.json'...
[batch] [2024-01-21T16:43:57-08:00] Validating produced JSON
[batch] [2024-01-21T16:43:57-08:00] Validating schema of 'results/hmpxv1_big/raw_tree.json'...
[batch] [2024-01-21T16:43:57-08:00] Validating that the JSON is internally consistent...
[batch] [2024-01-21T16:43:57-08:00] Node OP615261 appears multiple times in the tree.
[batch] [2024-01-21T16:43:57-08:00] ------------------------
[batch] [2024-01-21T16:43:57-08:00] Validation of results/hmpxv1_big/raw_tree.json failed. Please check this in a local instance of `auspice`, as it is not expected to display correctly. 

I searched for OP615261 in the results files and see that it only appears once in the tree_raw.nwk (produced by augur tree) but appears twice in the tree_fixed.nwk (produced by scripts/fix_tree.py). Somehow scripts/fix_tree.py is duplicating the node.
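
A quick diagnostic sketch using Biopython to confirm which tip names are duplicated in the fixed tree (file path taken from the build above):

from collections import Counter

from Bio import Phylo

tree = Phylo.read("results/hmpxv1_big/tree_fixed.nwk", "newick")
# Count terminal (tip) names; anything above 1 will fail augur validation.
counts = Counter(tip.name for tip in tree.get_terminals())
for name, count in counts.items():
    if count > 1:
        print(f"{name} appears {count} times")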

ENH: Don't subsample non-B.1 lineages

Context

Right now we may sample out some non-B.1 sequences because I argued we shouldn't subsample on lineage/country/month/year, only on country/month/year. My mistake.

We should sample by country within B.1 - but not subsample outside B.1. That way we combine the best of both.

Raised by @rambaut

Possible clade misannotations

Thank you very much for this resource.

On my very naive tree, the following sequences cluster with West African sequences:

MPXV_WRAIR7_61__Walter_Reed_267
COP_58
Liberia_1970_184
Ivory_Coast_2012
Sierra_Leone
USA_2003_044
USA_2003_039

added 23/5:
MPXV_TNP_2017_North_Ponan
3030

though they are annotated as CA. In every case I have so far looked into in the literature, a West Africa-clade annotation would seem reasonable to my non-expert eyes. (But I may well be wrong in terms of how you choose to define the clades, etc.)

Thanks again

/usr/bin/bash: line 1: tsv-filter: command not found

Running the monkeypox pipeline, the workflow errors out with "tsv-filter: command not found". I'm not sure which script file or Nextstrain package this command comes from. Am I missing a specific extra Nextstrain package or something that I should have installed?

[phylo] CI workflow DAG includes `update_example_data`

Context

Sometimes running the phylo workflow with the CI configs locally includes the update_example_data rule in the DAG:

$ nextstrain build . --configfile profiles/ci/builds.yaml -n
Building DAG of jobs...
Job stats:
job                            count
---------------------------  -------
align                              1
all                                1
ancestral                          1
clades                             1
colors                             1
combine_samples                    1
copy_example_data                  1
decompress                         1
download                           1
export                             1
filter                             1
final_strain_name                  1
fix_tree                           1
mask                               1
mutation_context                   1
recency                            1
refine                             1
rename_clades                      1
reverse_reverse_complements        1
subsample                          2
traits                             1
translate                          1
tree                               1
update_example_data                1
total                             25
...
Reasons:
    (check individual jobs above for details)
    code has changed since last execution:
        decompress
    input files updated by another job:
        align, all, ancestral, clades, colors, combine_samples, copy_example_data, decompress, export, filter, final_strain_name, fix_tree, mask, mutation_context, recency, refine, rename_clades, reverse_reverse_complements, subsample, traits, translate, tree, update_example_data
    missing output files:
        download
    set of input files has changed since last execution:
        decompress
Some jobs were triggered by provenance information, see 'reason' section in the rule displays above.
If you prefer that only modification time is used to determine whether a job shall be executed, use the command line option '--rerun-triggers mtime' (also see --help).
If you are sure that a change for a certain output file (say, <outfile>) won't change the result (e.g. because you just changed the formatting of a script or environment definition), you can also wipe its metadata to skip such a trigger via 'snakemake --cleanup-metadata <outfile>'. 
Rules with provenance triggered jobs: decompress

This is not an issue in our automated CI runs via GitHub Action because the GH Action workflow does a clean clone of the repo.

Possible solutions

  1. Manually removing the local .snakemake directory clears the Snakemake cache and resolves the issue.
  2. Move the chores.smk file to be conditionally included in the core phylo workflow (see the sketch after this list)
  3. Move the chores.smk file to a separate build-config that extends the workflow with custom_rules (conforms to the pathogen-repo-guide)
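
A sketch of option 2, assuming a hypothetical include_chores config flag; Snakemake supports conditional includes since Snakefiles are Python:

# In the core Snakefile (path to chores.smk assumed): only pull in the
# maintenance rules when explicitly requested, so local and CI DAGs
# never pick up update_example_data.
if config.get("include_chores", False):

    include: "workflow/snakemake_rules/chores.smk"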

Push aligned sequences up to data.nextstrain.org for download availability

Context

We currently provide links to download the curated sequences & metadata, which is great. However, many times one just wants to start with an aligned sequence set (particularly in cases when alignment can be tricky, as with MPX). We generate this as part of our workflow; it would be great to:

  • Add a rule that uploads aligned.fasta to data.nextstrain.org (either after alignment or at the end); a rule sketch follows this list
  • Include a link to this aligned file in the description at the bottom of builds (and in the github repo)
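
A hedged rule sketch for the upload step; the destination mirrors the s3://nextstrain-data/files/workflows/ layout mentioned elsewhere in this repo, but the exact path and rule wiring are assumptions:

rule upload_alignment:
    input:
        alignment="results/aligned.fasta",
    shell:
        """
        nextstrain remote upload \
            s3://nextstrain-data/files/workflows/monkeypox/ \
            {input.alignment}
        """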

Ingest: remove `reverse` column from metadata TSV

(Originally flagged the obsolete reverse column in #207 (comment))

Reverse complement sequences were initially manually flagged by the reverse column added in #79.

Since Nextclade v2.2.0, there's a built-in --retry-reverse-complement option that adds a new column isReverseComplement. This feature was used in the ingest pipeline starting from #89. Then in #94, the ingest/bin/reverse_reversed_sequences.py script was replaced with the built-in Nextclade functionality as well.

In #191, the phylogenetic pipeline switched over from using the reverse column to the is_reverse_complement column output from Nextclade. This seemingly makes the reverse column obsolete. When checking the latest metadata TSV (2023-10-13), the reverse column is completely empty.

From my point of view, we can just remove the reverse column from the metadata.tsv file, but wanted to confirm with other users of the pipeline/metadata.tsv file (cc: @corneliusroemer, @chaoran-chen).
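
The removal itself is a one-liner; a sketch with pandas, assuming the change is made wherever the final metadata TSV is written:

import pandas as pd

metadata = pd.read_csv("metadata.tsv", sep="\t", dtype=str)
# errors="ignore" makes this a no-op once the column is gone for good.
metadata.drop(columns=["reverse"], errors="ignore").to_csv(
    "metadata.tsv", sep="\t", index=False
)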

Ingest currently blocked by `fetch-from-ncbi-virus`

Current Behavior

Because of the behavior described in nextstrain/ingest#18, the ingest pipeline does not include sequences in its fetch from NCBI Virus. This results in all of the records being dropped in the pipeline, and the final outputs to s3://nextstrain-data/files/workflows/monkeypox/ are empty. This was first flagged internally by downstream CZI consumers on Slack.

We don't have insight into the undocumented NCBI Virus API and whether this new behavior is intentional, so the best thing might be to just switch to the NCBI Datasets CLI to fetch data.

Rename repo & builds

Context

General naming recommendations continue to deprecate (and are expected to further deprecate) the use of 'monkeypox'. We likely should replace 'monkeypox' with 'MPXV' (and possibly Mpox in some places). This will require:

ingest: deduplicate sequences using strain names

Context

Once we've completed #32, we can use strain names to deduplicate sequences.
This is necessary in case different groups sequence the same virus or if sequences are generated from different protocols.
(NOTE: This is separate from the versioning in GenBank; we already pull in the latest version of GenBank sequences.)

Description

The duplicate sequences should probably be filtered out in a new script (e.g. ingest/bin/deduplicate-records) or potentially with the augur deduplicate command (see nextstrain/augur#919).

We probably want to keep a file with all sequences in case people want the duplicate sequences for any reason.
The deduplicated files will be the main ones used for LAPIS and/or our monkeypox builds.
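
A minimal sketch of such a script, keeping the first record seen per strain; the field and file names are assumptions:

import csv


def deduplicate_records(metadata_tsv, output_tsv):
    """Write one record per strain name, keeping the first occurrence."""
    seen = set()
    with open(metadata_tsv, newline="") as infile, \
         open(output_tsv, "w", newline="") as outfile:
        reader = csv.DictReader(infile, delimiter="\t")
        writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames,
                                delimiter="\t")
        writer.writeheader()
        for row in reader:
            if row["strain"] in seen:
                continue  # duplicates stay available in the full file
            seen.add(row["strain"])
            writer.writerow(row)


deduplicate_records("data/metadata.tsv", "data/metadata_deduplicated.tsv")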

ingest: include `url` field

Context

See #72 (comment)

Possible solution

GenBank URLs can be constructed as https://www.ncbi.nlm.nih.gov/nuccore/<genbank_accession>.
URLs for arbitrary non-GenBank sequences will have to be added through manual annotations.
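
A sketch of the transform step, grounded in the URL pattern above; the record field names are assumptions:

def add_url(record):
    """Add a url field for GenBank records; manual annotations handle the rest."""
    accession = record.get("genbank_accession")
    if accession:
        record["url"] = f"https://www.ncbi.nlm.nih.gov/nuccore/{accession}"
    return record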

Potentially include year-only sequences

Right now we seem to exclude sequences from the B.1 build that lack a month, i.e. year-only dates such as 2022-XX-XX.

They get filtered out in subsampling as they don't fit neatly into a year/month sampling scheme. We could add a separate "year-only" filter to get them back in.
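
A sketch of selecting the year-only records for such a filter, assuming pandas and the XX-masked date convention shown above:

import pandas as pd

metadata = pd.read_csv("results/metadata.tsv", sep="\t", dtype=str)
# Dates like "2022" or "2022-XX-XX" carry a year but no usable month.
year_only = metadata["date"].str.fullmatch(r"\d{4}(-XX-XX)?", na=False)
metadata[year_only].to_csv("results/metadata_year_only.tsv", sep="\t", index=False)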

Update README

The README contains a bunch of outdated instructions.

ingest: canonicalize strain names

Context

Currently, the ingest pipeline accepts any format for the strain names.
We should canonicalize them to have prettier names for display in Auspice and to have a way to deduplicate sequences.

Description

We need a clear standard format for strain names. If we follow the existing pattern we use for other pathogens (e.g. SARS-CoV-2), this would be <country>/<sample_id>/<year>

Once we've decided on a format, we should add necessary transforms to ingest/bin/transform-strain-names.
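
A minimal sketch of that transform, assuming the <country>/<sample_id>/<year> format and that metadata already carries country and date fields (the sample-id derivation is hypothetical):

import re


def canonicalize_strain(record):
    """Build <country>/<sample_id>/<year>, falling back to the raw name."""
    country = (record.get("country") or "").replace(" ", "")
    year = (record.get("date") or "")[:4]
    # Hypothetical: treat the raw strain name as the sample id, after replacing
    # characters that are awkward in display names.
    sample_id = re.sub(r"[^A-Za-z0-9._-]", "-", record.get("strain", ""))
    if country and year.isdigit() and sample_id:
        return f"{country}/{sample_id}/{year}"
    return record.get("strain", "")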

LAPIS data: cannot reindex on an axis with duplicate labels

Context

When using LAPIS data (data_source: "lapis"), the filter rule exits with the error ValueError: cannot reindex on an axis with duplicate labels. I think augur is unhappy that a year column already exists in the LAPIS data.

Additional Context

I'm using a conda environment rather than the docker image. But the conda environment works flawlessly for Nextstrain data, just not LAPIS. I'm guessing it's because I'm using a newer version of pandas (v1.4.2) since augur is also raising FutureWarning: reindexing with a non-unique Index is deprecated.

Possible Solution

One way to solve this, would be to drop the year column before the filter rule. Adding the following segment to scripts/wrangle_metadata.py fixes the issue for me:

import pandas as pd  # needed for the pd.isna checks below

# Remove the year column, because it will break augur filter
if "year" in metadata.columns:
  new_dates = []
  # Iterate through the 'date' and 'year' columns
  for s_date, s_year in zip(metadata["date"], metadata["year"]):

    # If date is null, we use the year
    if pd.isna(s_date) and not pd.isna(s_year):
      new_dates.append("{}-XX-XX".format(int(s_year)))

    # if date is not null, use it
    elif not pd.isna(s_date):
      new_dates.append(s_date)

    # Otherwise, use none
    else:
      new_dates.append(None)

  metadata["date"] = new_dates
  metadata.drop(columns=["year"], inplace=True)

Steps to Reproduce

Here is the shell command in isolation (after LAPIS download):

augur filter \
  --sequences data/sequences.fasta \
  --metadata results/metadata.tsv \
  --exclude config/exclude_accessions_hmpxv1.txt \
  --output-sequences results/hmpxv1_lapis/filtered.fasta \
  --output-metadata results/hmpxv1_lapis/metadata.tsv \
  --group-by country year \
  --sequences-per-group 1000 \
  --min-date 2017 \
  --min-length 10000 \
  --output-log results/hmpxv1_lapis/filtered.log

Environment

name: nextstrain-mpx
channels:
  - bioconda
  - conda-forge
  - anaconda
  - defaults
dependencies:
  - anaconda::python=3.9.10
  - anaconda::pip=22.0.3
  - conda-forge::pandas=1.4.2
  # Workflow
  - bioconda::snakemake=7.3.6
  # Phylogeny
  - bioconda::iqtree=2.2.0.3
  # Misc
  - bioconda::epiweeks=2.1.4
  - conda-forge::gzip>=1.6
  - pip:
    - nextstrain-augur==16.0.1

# Notes:
# - nextclade and nextalign: v2 must be manually installed and renamed to nextclade2 and nextalign2
#     wget -O $CONDA_PREFIX/bin/nextclade2 https://github.com/nextstrain/nextclade/releases/download/2.0.0-beta.5/nextclade-x86_64-unknown-linux-gnu
#     wget -O $CONDA_PREFIX/bin/nextalign2 https://github.com/nextstrain/nextclade/releases/download/2.0.0-beta.5/nextalign-x86_64-unknown-linux-gnu

Full Traceback

/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/augur/filter.py:953: FutureWarning: reindexing with a non-unique Index is deprecated and will raise in a future version.
  df_skip = metadata[metadata['year'].isnull()]
Traceback (most recent call last):
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/augur/__init__.py", line 81, in run
    return args.__command__.run(args)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/augur/filter.py", line 1424, in run
    group_by_strain, skipped_strains = get_groups_for_subsampling(
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/augur/filter.py", line 953, in get_groups_for_subsampling
    df_skip = metadata[metadata['year'].isnull()]
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/frame.py", line 3492, in __getitem__
    return self.where(key)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/frame.py", line 10955, in where
    return super().where(cond, other, inplace, axis, level, errors, try_cast)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/generic.py", line 9308, in where
    return self._where(cond, other, inplace, axis, level, errors=errors)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/generic.py", line 9075, in _where
    cond = cond.reindex(self._info_axis, axis=self._info_axis_number, copy=False)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/util/_decorators.py", line 324, in wrapper
    return func(*args, **kwargs)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/frame.py", line 4804, in reindex
    return super().reindex(**kwargs)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/generic.py", line 4966, in reindex
    return self._reindex_axes(
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/frame.py", line 4617, in _reindex_axes
    frame = frame._reindex_columns(
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/frame.py", line 4662, in _reindex_columns
    return self._reindex_with_indexers(
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/generic.py", line 5032, in _reindex_with_indexers
    new_data = new_data.reindex_indexer(
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/internals/managers.py", line 679, in reindex_indexer
    self.axes[axis]._validate_can_reindex(indexer)
  File "/home/keaton/.conda/envs/nextstrain-mpx/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 4107, in _validate_can_reindex
    raise ValueError("cannot reindex on an axis with duplicate labels")
ValueError: cannot reindex on an axis with duplicate labels
