
clmr's Issues

Weights for million song dataset?

Hi,

Thanks for releasing this nice repository!

Do you plan on releasing the weights for the linear classifier trained on the Million Song Dataset as well? I would be very happy to use them in my work, because I care about the more abstract classes such as "happy" or "sad".

On another note: I may simply be missing it, but I could not find an easy way to map the predictions of the linear classifier trained on the MagnaTagATune dataset to their corresponding labels (a sketch of what I mean is below). In the paper, you say that you chose the top 50 most common labels. Is there a list for this somewhere in the repository, or can I look it up on the dataset site? I do not want to mess up the order of the labels, for obvious reasons.
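To make the question concrete, this is the mapping I have in mind; only a sketch, where the tag file name is hypothetical and the tags would have to be in the exact order used during training:

    import torch

    # hypothetical file: one tag per line, in the exact training order
    with open("magnatagatune_top50_tags.txt") as f:
        tags = [line.strip() for line in f]

    logits = torch.randn(50)       # stand-in for the linear classifier's output
    probs = torch.sigmoid(logits)  # multi-label task, so per-tag probabilities
    top5 = sorted(zip(tags, probs.tolist()), key=lambda t: t[1], reverse=True)[:5]
    print(top5)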

Thanks again and kind regards,
Anton

modules.linear_evaluation is missing parameters when calling Accuracy and AveragePrecision

When running the LinearEvaluation module I get the following error:

TypeError: Accuracy.__new__() missing 1 required positional argument: 'task'

and similar when calling torchmetrics.AveragePrecision without the task argument

Proposed solution

        # the task-based API for multilabel metrics takes num_labels;
        # the old pos_label argument no longer exists in recent torchmetrics
        self.accuracy = torchmetrics.Accuracy(
            task="multilabel",
            num_labels=output_dim,
        )
        self.average_precision = torchmetrics.AveragePrecision(
            task="multilabel",
            num_labels=output_dim,
        )

Import Error in colab

Hello, I'm trying to run the Colab version of your source code, but I hit an error when I run this cell:
when I run this code:

!git clone https://github.com/spijkervet/clmr
%cd /content/clmr
!pip install . -q

output:

Cloning into 'clmr'...
remote: Enumerating objects: 3460, done.
remote: Counting objects: 100% (817/817), done.
remote: Compressing objects: 100% (546/546), done.
remote: Total 3460 (delta 414), reused 579 (delta 245), pack-reused 2643
Receiving objects: 100% (3460/3460), 45.50 MiB | 29.01 MiB/s, done.
Resolving deltas: 100% (2047/2047), done.
/content/clmr
DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default. pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
[download progress bars and a tcmalloc large-alloc warning omitted]
Building wheel for clmr (setup.py) ... done
Building wheel for future (setup.py) ... done
Building wheel for julius (setup.py) ... done
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchtext 0.10.0 requires torch==1.9.0, but you have torch 1.10.0 which is incompatible.

t-sne visualization

Can you share more details about the t-SNE visualization? I reproduced your program on the MagnaTagATune dataset and got a PR-AUC of 0.3496 and a ROC-AUC of 0.886. I noticed that you used a t-SNE visualization in your paper, but when I tried to visualize the extracted features, the result was poor. I converted the representations returned by extract_representations to a list and used that as the input to t-SNE. The resulting feature map shows that the features are not well separated; when I printed the representations, I found that they were all nearly identical (see the log below). What exactly is your input when visualizing with t-SNE? I would like to get a good feature visualization.

def extract_representations(self, dataloader: DataLoader) -> Dataset:

    # collect encoder outputs (and labels) for every batch in the dataloader
    representations = []
    ys = []
    for x, y in tqdm(dataloader):
        with torch.no_grad():
            h0 = self.encoder(x)
            representations.append(h0)
            ys.append(y)

    # concatenate batches into single tensors (one batch needs no concatenation)
    if len(representations) > 1:
        representations = torch.cat(representations, dim=0)
        ys = torch.cat(ys, dim=0)
    else:
        representations = representations[0]
        ys = ys[0]

    tensor_dataset = TensorDataset(representations, ys)
    return tensor_dataset

1%| | 1/147 [00:08<21:49, 8.97s/it]
[tensor([[ 0.0147,  0.0385,  0.0067,  ..., -0.0021, -0.0179, -0.0170],
         [ 0.0146,  0.0385,  0.0067,  ..., -0.0021, -0.0179, -0.0170],
         [ 0.0146,  0.0385,  0.0067,  ..., -0.0021, -0.0179, -0.0170],
         ...,
         [ 0.0147,  0.0385,  0.0067,  ..., -0.0021, -0.0179, -0.0170],
         [ 0.0146,  0.0385,  0.0067,  ..., -0.0021, -0.0179, -0.0170],
         [ 0.0146,  0.0385,  0.0067,  ..., -0.0021, -0.0179, -0.0170]])]

(the following batches print nearly identical values)
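For reference, the t-SNE pipeline I am using looks like this; a minimal sketch, assuming scikit-learn and the TensorDataset returned by extract_representations above:

    from sklearn.manifold import TSNE

    # `dataset` is the TensorDataset returned by extract_representations above
    representations, ys = dataset.tensors
    embedding = TSNE(n_components=2, perplexity=30).fit_transform(representations.numpy())
    # embedding has shape (n_clips, 2); color the points by `ys` to inspect clusters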

config for reproducing the provided model clmr_checkpoint_10000.pt

I trained and evaluated with the default config, which gave a ROC-AUC of 86.3.

Then I found that the config is inconsistent with the paper's reference settings:

                  config.yaml     paper
    batch size         48           96
    max epochs        200         1000
    sample rate     16000        22050

I changed these parameters and retrained, which gave a ROC-AUC of 87.3, a lot better, but still far from the 88.5 reported in the paper.

Meanwhile, fine-tuning the linear classifier from clmr_checkpoint_10000.pt did work.

So I wonder if you could provide the config used to train the provided pretrained model.
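For concreteness, the overrides I used look roughly like this; a sketch, assuming the key names match the repo's config.yaml schema:

    # assumed key names, to be checked against config.yaml
    batch_size: 96        # default was 48
    max_epochs: 1000      # default was 200
    sample_rate: 22050    # default was 16000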

Extracting embeddings?

Hey, I want to use the trained model to extract audio embeddings. Could you help with an example of how to do it? I understand that the repo is no longer maintained and probably abandoned, but maybe someone will see this. =)
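In case it helps anyone landing here later, this is the kind of thing I had in mind; it is only a sketch, and the SampleCNN constructor arguments, segment length, and checkpoint key layout are assumptions that should be checked against the repo:

    import torch
    import torchaudio
    from clmr.models import SampleCNN  # the encoder used in this repo

    # hypothetical constructor call; see clmr/models/sample_cnn.py for the real one
    encoder = SampleCNN(strides=[3, 3, 3, 3, 3, 3, 3, 3, 3], supervised=False, out_dim=50)
    state = torch.load("clmr_checkpoint_10000.pt", map_location="cpu")
    encoder.load_state_dict(state, strict=False)  # checkpoint keys may need re-mapping
    encoder.eval()

    audio, sr = torchaudio.load("track.wav")   # mono audio at the pretraining sample rate
    segment = audio[:, :59049].unsqueeze(0)    # one fixed-length input segment (assumed size)
    with torch.no_grad():
        embedding = encoder(segment)           # the representation for this segment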

PyTorch Lightning version

Hi,

When I followed these steps to test the pre-trained model, I got this error:

(clmr) yunusemre.ozkose@server:/CLMR$ python main.py --dataset audio --dataset_dir /path/to/test_audio_dataset_dir
Global seed set to 42
Traceback (most recent call last):
  File "main.py", line 129, in <module>
    module = module.load_from_checkpoint(
  File "/miniconda3/envs/cmlr/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
    return _load_from_checkpoint(
  File "/miniconda3/envs/cmlr/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 205, in _load_from_checkpoint
    return _load_state(cls, checkpoint, strict=strict, **kwargs)
  File "/miniconda3/envs/cmlr/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 250, in _load_state
    obj = cls(**_cls_kwargs)
TypeError: __init__() missing 1 required positional argument: 'args'

There was a related bug in Lightning that seems to have been solved. Could you state your PyTorch Lightning version?
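In the meantime, since load_from_checkpoint forwards extra keyword arguments to the module's __init__, a possible workaround (untested, and checkpoint_path is a placeholder) is to pass the missing argument explicitly:

    # sketch: supply the init argument that is not stored in the checkpoint
    module = module.load_from_checkpoint(
        checkpoint_path,
        args=args,  # the same parsed args used to construct the module
    )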

colab installation broken (dependencies)

Hi! It seems there is some dependency issue going on. I've been troubleshooting locally with no luck.

When running the installation bit of the colab notebook I'm getting the following:

  Preparing metadata (setup.py) ... done
ERROR: Could not find a version that satisfies the requirement torch==1.9.0 (from clmr) (from versions: 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==1.9.0

I'm currently investigating a good dependency configuration for this repo. Any info is welcome.
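One direction I am trying, as a rough and untested sketch (the unpinned versions are a guess): install a torch build that actually exists for the current runtime, then install clmr without letting pip re-resolve the stale torch==1.9.0 pin:

    # sketch: sidestep the torch==1.9.0 pin, which no longer has wheels
    pip install torch torchaudio
    pip install --no-deps .
    pip install pytorch-lightning  # plus any remaining deps listed in setup.py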

Is there any length limit of the music with this method?

It's nice to see contrastive learning used in the music domain. Is there any limit on the length of the music? Is it possible to get a meaningful representation (for example, a vector of a few hundred dimensions) of a whole song a few minutes long with this method? Looking forward to your reply, thanks a lot.
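A common workaround (a sketch of my own, not something from this repo) is to split a long track into the fixed-length segments the encoder expects and average the segment embeddings; the segment length of 59049 samples is an assumption:

    import torch

    def song_embedding(encoder, audio, segment_len=59049):
        # audio: (1, n_samples) mono tensor; segment_len: the encoder's input size
        n = audio.shape[1] // segment_len
        segments = audio[:, : n * segment_len].reshape(n, 1, segment_len)
        with torch.no_grad():
            embs = encoder(segments)  # (n, d) per-segment representations
        return embs.mean(dim=0)       # (d,) song-level vector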

the pretrained model on GTZAN

Hello, I appreciate your work on music representation, but I have a question: for how many epochs do you train the model on the GTZAN dataset? And can you release the pretrained model for GTZAN?

Possible major bug in evaluation

Hi @Spijkervet !

I played around with the pretrained MagnaTagATune weights and noticed something: the predicted values were relatively low, and I was wondering why.

Now I have discovered that in evaluation.py, line 29, there is output = torch.nn.functional.softmax(output, dim=1). But as far as I understand, MagnaTagATune (and the MSD dataset) are multi-label tasks; their loss function in the code is also binary cross-entropy. Hence, I suppose that torch.sigmoid should be used there instead of the softmax.

Please let me know if I'm wrong, but as I see it now this could change the results of CLMR quite significantly (for the better), given that this code was used to generate the results.
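Concretely, the change I am proposing in evaluation.py is (sketch):

    # before: treats the 50 tags as mutually exclusive classes
    output = torch.nn.functional.softmax(output, dim=1)

    # proposed: independent per-tag probabilities, matching the BCE training loss
    output = torch.sigmoid(output)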

ValueError: The `target` has to be an integer tensor.

Hi,
When I run the linear evaluation, I get the following error:

Traceback (most recent call last):
  File "linear_evaluation.py", line 149, in <module>
    trainer.fit(module, train_loader, valid_loader)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 741, in fit
    self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 685, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 777, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1199, in _run
    self._dispatch()
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1279, in _dispatch
    self.training_type_plugin.start_training(self)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 202, in start_training
    self._results = trainer.run_stage()
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1289, in run_stage
    return self._run_train()
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1311, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1375, in _run_sanity_check
    self._evaluation_loop.run()
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 110, in advance
    dl_outputs = self.epoch_loop.run(dataloader, dataloader_idx, dl_max_batches, self.num_dataloaders)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/loops/base.py", line 145, in run
    self.advance(*args, **kwargs)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 122, in advance
    output = self._evaluation_step(batch, batch_idx, dataloader_idx)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 217, in _evaluation_step
    output = self.trainer.accelerator.validation_step(step_kwargs)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 239, in validation_step
    return self.training_type_plugin.validation_step(*step_kwargs.values())
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 219, in validation_step
    return self.model.validation_step(*args, **kwargs)
  File "/data1/simon/projects/musicrepresentations/clmr/modules/linear_evaluation.py", line 64, in validation_step
    self.log("Valid/accuracy", self.accuracy(preds, y))
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/torchmetrics/metric.py", line 205, in forward
    self.update(*args, **kwargs)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/torchmetrics/metric.py", line 263, in wrapped_func
    return update(*args, **kwargs)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/torchmetrics/classification/accuracy.py", line 228, in update
    mode = _mode(preds, target, self.threshold, self.top_k, self.num_classes, self.multiclass)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/torchmetrics/functional/classification/accuracy.py", line 59, in _mode
    preds, target, threshold=threshold, top_k=top_k, num_classes=num_classes, multiclass=multiclass
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/torchmetrics/utilities/checks.py", line 251, in _check_classification_inputs
    _basic_input_validation(preds, target, threshold, multiclass)
  File "/data1/anaconda3/envs/CLMR/lib/python3.6/site-packages/torchmetrics/utilities/checks.py", line 33, in _basic_input_validation
    raise ValueError("The `target` has to be an integer tensor.")
ValueError: The `target` has to be an integer tensor.
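For anyone else hitting this: torchmetrics' Accuracy rejects floating-point targets, and the multi-label annotations here presumably arrive as floats (as the BCE loss expects). A possible workaround, sketched rather than tested, is to cast the labels to integers just for the metric call in linear_evaluation.py:

    # in validation_step (sketch): cast multi-label targets to int for the metric
    self.log("Valid/accuracy", self.accuracy(preds, y.int()))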

Where is 'processed_annotations' for Million Song Dataset?

Hello, thank you very much for your wonderful GitHub repository. 😊

I tried to run the training code in this repository on the Million Song Dataset, but it failed with FileNotFoundError: [Errno 2] No such file or directory: 'data/million_song_dataset/processed_annotations/output_labels_msd.txt'. 😢

Where can I download the files in the processed_annotations folder, such as 'train_gt_msd.tsv', 'output_labels_msd.txt', 'index_msd.tsv', etc.?

I couldn't find any way to download these files from the official website of the Million Song Dataset, and the MSD data I downloaded did not contain them. 😭

Thank you for your time and assistance. 😊

Train / Validation / Test splits for million song dataset

Hi!
Thank you for releasing this repo :)

I was wondering where I can find the train/validation/test splits you used for MSD? My team and I are trying to reproduce this study, but unfortunately we can't find the 201 680 / 11 774 / 28 435 splits and the corresponding tags from Last.fm. Any assistance on this would be very helpful!

Kind regards,
Cody

train by raw dataset

Hi, I want to use this method to implement a feature for my graduation project, but how can I train a model on a raw audio dataset such as Universal Music? Looking forward to your reply.
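Judging from the commands used elsewhere in these issues, training on your own audio appears to go through the generic audio dataset; a sketch, with the flags to be double-checked against the repo's README:

    # sketch: self-supervised pretraining on a directory of your own audio files
    python main.py --dataset audio --dataset_dir /path/to/your/audio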
