fosfrancesco / asap-dataset
A dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music.
License: Other
Hi,
Thank you for publishing such a useful dataset!
Everything looks great, except that there seem to be some errors in the beat annotations of Bach/Fugue/bwv_856/LuoJ01M.
When listening to the beats/downbeats with Bach/Fugue/bwv_856/LuoJ01M.wav, I found that some beats are unreasonably close to downbeats (starting at around 0:14).
I'm wondering if this could be a systematic error or just a single case.
(I only found this one, but I didn't check many.)
What I did was apply librosa.clicks to the beats and downbeats (with a different click_freq for each) and listen.
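For anyone who wants to repeat this listening check without librosa, here is a rough stand-in that uses only the Python standard library. It is a sketch, not librosa's implementation: the function names (`click_track`, `mix`, `write_wav`), the decaying-sine click shape, and the default sample rate are all my own assumptions.

```python
import math
import struct
import wave


def click_track(times, sr=22050, click_freq=1000.0, click_duration=0.05):
    """Return float samples with a short decaying sine burst at each time (seconds)."""
    n_click = int(sr * click_duration)
    length = int(max(times, default=0.0) * sr) + n_click
    track = [0.0] * length
    for t in times:
        start = int(round(t * sr))
        for i in range(min(n_click, length - start)):
            envelope = math.exp(-5.0 * i / n_click)  # fast decay so clicks stay short
            track[start + i] += envelope * math.sin(2.0 * math.pi * click_freq * i / sr)
    return track


def mix(a, b):
    """Sum two sample lists, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def write_wav(path, samples, sr=22050):
    """Write mono 16-bit PCM, normalised to the peak amplitude."""
    peak = max((abs(s) for s in samples), default=0.0) or 1.0
    frames = b"".join(struct.pack("<h", int(32767 * s / peak)) for s in samples)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sr)
        f.writeframes(frames)
```

Beats and downbeats can then be rendered at different pitches, e.g. `mix(click_track(beats, click_freq=1000.0), click_track(downbeats, click_freq=2000.0))`, and written out with `write_wav("clicks.wav", ...)` (the file name is only an example). Two clicks that land almost on top of each other are easy to hear this way.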
If you need any further information, please let me know.
Thank you!
Dear Authors,
Thanks for the great work!
However, I found some errors in the beat annotations while calculating statistics on the inter-beat intervals (IBIs) for all beats.
I converted all IBIs to BPM and found 19 songs with a maximum tempo above 300 BPM.
You may check the .xlsx (https://drive.google.com/file/d/1mluXcJQQGjr-K5Cm0JAoFBP521eSUKgK/view?usp=sharing) for these songs. In particular, for the song Beethoven/Piano_Sonatas/26-3/HONG05M.wav, the fastest beat is 6000 BPM.
(A screenshot of the corresponding annotations is also shown in the .xlsx.)
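The check described above can be sketched roughly as follows. This is a minimal illustration, not the exact script behind the .xlsx: it assumes the beat times have already been read into a sorted list of seconds, and the function names and the `max_bpm` parameter are hypothetical.

```python
def ibi_to_bpm(beat_times):
    """Convert consecutive beat times (seconds) into instantaneous tempi in BPM."""
    bpms = []
    for prev, cur in zip(beat_times, beat_times[1:]):
        ibi = cur - prev
        # A zero or negative interval is reported as infinite tempo so it is always flagged.
        bpms.append(60.0 / ibi if ibi > 0 else float("inf"))
    return bpms


def suspicious_beats(beat_times, max_bpm=300.0):
    """Return (beat index, beat time, bpm) for every beat that follows an implausibly short IBI."""
    return [
        (i + 1, beat_times[i + 1], bpm)
        for i, bpm in enumerate(ibi_to_bpm(beat_times))
        if bpm > max_bpm
    ]


# Toy beat list: the beat at 1.01 s follows the previous one by only 10 ms.
print(suspicious_beats([0.0, 0.5, 1.0, 1.01, 1.5]))
```

An IBI of 10 ms corresponds to 60 / 0.01 = 6000 BPM, which is the kind of value reported for HONG05M.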
Could you please check whether there is a systematic error in the automatic annotation algorithm that causes this?
If there's any further information required from me, please let me know.
Thank you!
Hi, I found that some songs haven't been fully annotated, most notably the beats in the "txt" file. I'm wondering whether they are only partially annotated or whether the annotations are matched to the wrong song. Here is the list:
I greatly appreciate your work! The source dataset I used is maestro v2.0 and the git repository was cloned on 01 Feb 2023, both matching the description of the ASAP dataset.
Best, Zhanh
Hey, I'm not actually 100% sure whether this is a bug or just me not understanding how exactly these MIDI files are meant to work, but I'm noticing something a little odd: the midi_scores for some pieces contain MIDI that looks something like this:
---- note_on, 60, time=0 -----
---- note_off, 60, time=0 -----
---- note_on, 60, time=0 -----
< other stuff that takes time >
---- note_off, 60, time=(something >0) ----
This example happens in the midi_score for Bach Fugue bwv_846, 7.0 seconds into the piece.
In other words, there is often a quick succession of on-off-on, then a gap, then the off signal for the actual note. Is this unintentional (meaning my processor should ignore these zero-length hits) or does it mean something different when processing the files?
The other pattern I notice (much less frequently) is what seems to be "duplicated notes" for lack of a better term, which looks like:
---- note_on, 63, time=0 -----
---- note_on, 63, time=0 -----
< other stuff that takes time >
---- note_off, 63, time=(something >0) -----
---- note_off, 63, time=0 ----
In this case, it's like both the on and off signals occur in a place that makes sense, it's just that they are both duplicated. This example happens in the score for Bach Fugue bwv_848, 22.75 seconds into the piece.
I could write my code to ignore both of these types of occurrences (in other words, ignore 0-length and duplicated notes) but I just want to make sure these aren't meant to convey any other special meaning before destroying that information.
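A minimal sketch of what that filtering might look like, in case it helps the discussion. It assumes the messages have been reduced to hypothetical `(kind, pitch, delta)` tuples, where `delta` is the time since the previous message, and that a `note_on` with velocity 0 has already been converted to a `note_off`. The function name `clean_notes` is my own.

```python
from collections import defaultdict


def clean_notes(messages):
    """Pair note_on/note_off messages and drop zero-length and duplicated notes.

    Returns a list of (pitch, onset, offset) tuples in absolute time.
    """
    now = 0
    open_onsets = defaultdict(list)  # pitch -> onsets still waiting for a note_off
    seen = set()
    notes = []
    for kind, pitch, delta in messages:
        now += delta  # every message advances the clock, including non-note ones
        if kind == "note_on":
            open_onsets[pitch].append(now)
        elif kind == "note_off" and open_onsets[pitch]:
            onset = open_onsets[pitch].pop(0)  # match the oldest open onset first
            note = (pitch, onset, now)
            if now > onset and note not in seen:
                seen.add(note)
                notes.append(note)
    return notes


# Pattern 1: on-off-on at the same instant, then the real note_off later.
zero_length = [("note_on", 60, 0), ("note_off", 60, 0), ("note_on", 60, 0),
               ("other", None, 240), ("note_off", 60, 240)]
# Pattern 2: both the note_on and the note_off are duplicated.
duplicated = [("note_on", 63, 0), ("note_on", 63, 0),
              ("note_off", 63, 240), ("note_off", 63, 0)]
print(clean_notes(zero_length))  # a single note on pitch 60
print(clean_notes(duplicated))   # a single note on pitch 63
```

Both patterns collapse to one note each, so nothing audible is lost. Whether the zero-length or duplicated events carry extra meaning (for example, two voices sharing a pitch in the score) is exactly the open question, so this is only safe if the answer is no.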
Thanks for your help!
The entries for Schumann's Toccata with the repeat have the "repeat" in the piece title in metadata.csv. This poses an issue when creating dataset splits that don't overlap multiple versions of the same piece. According to the README:
For the applications where unique pieces are needed (e.g., to create a training/test dataset with not overlapping) look for the unique couple (title,composer).
However, on lines 1054 and 1055 of metadata.csv, the second column (title) is Toccata_repeat, in contrast to lines 1050-1053, which have the title Toccata only. This means the repeat version gets its own unique (title,composer) pair despite not being a unique piece.
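As a workaround until the metadata is changed, the title can be normalised before building the (title,composer) key. The sketch below is only an illustration: it assumes the CSV has columns named `composer` and `title`, the `_repeat` suffix handling is my own guess at a fix, and the sample rows are made up rather than copied from metadata.csv.

```python
import csv
import io


def piece_key(row, suffix="_repeat"):
    """Return (composer, title) with a trailing '_repeat' stripped from the title."""
    title = row["title"]
    if title.endswith(suffix):
        title = title[: -len(suffix)]
    return (row["composer"], title)


def unique_pieces(rows):
    """Collect the distinct pieces after title normalisation."""
    return sorted({piece_key(row) for row in rows})


# Illustrative rows only, not copied from metadata.csv.
sample = io.StringIO(
    "composer,title\n"
    "Schumann,Toccata\n"
    "Schumann,Toccata\n"
    "Schumann,Toccata_repeat\n"
)
print(unique_pieces(csv.DictReader(sample)))
```

With this key, the repeat and non-repeat versions fall into the same group, so a train/test split by piece keeps them on the same side.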
I can submit a pull request if you'd like!
Thanks