
songrec's People

Contributors

a33k, albanobattistella, amogusussy, damonhayhurst, dotx12, ducaton, gnuhead-chieb, heldderarbeit, hummer12007, kepet19, m0rf30, marcelocripe, marin-m, matteogauthier, nemobis, nycex, orhun, tsar, tyd, vistaus


songrec's Issues

No microphone input detects sound from speakers

Hello,

SongRec works great when I use my actual microphone. However, none of the options for microphone input manage to detect sound from Chrome or VLC, for instance, even though I can actually hear the sound. I've tried with and without "Recognize from my speakers instead of microphone" every time it was available.

I don't really know what to do to make it work, or how to help you help me figure it out...

I'm using Manjaro KDE and I installed SongRec from the AUR, version 0.1.7-2. Let me know if you need any specific technical details.

Thank you

I wonder how you got to know the signature format. Is this the result of reverse engineering? Or do you have any insider knowledge? Also, do you know anything about alternative signature formats, for example, the one from the Windows version that I tried to reverse myself?

Your project filled in all the blank spots in my understanding, and I was able to build my own Shazam client in .NET.

I learned a few things that I'd like to share with you.

Signature format

Peak detection

EDIT: turned out to be wrong; skips important peaks

I found that peak detection can be as simple as scanning for the maximum magnitude within a ±48-bin interval horizontally and vertically.
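For illustration, the (later-retracted) scan can be sketched in plain Python; `find_peaks` and its cross-shaped window are my own naive reading of the description above, not code from either project:

```python
def find_peaks(spectrogram, radius=48):
    """Naive peak picking over a 2-D magnitude matrix (first index =
    time step, second index = frequency bin): a cell is kept as a peak
    when it holds the maximum magnitude within +/- `radius` positions
    along each axis (a cross-shaped window, not a square)."""
    n_rows, n_cols = len(spectrogram), len(spectrogram[0])
    peaks = []
    for t in range(n_rows):
        for f in range(n_cols):
            magnitude = spectrogram[t][f]
            # Window along the time axis, same frequency bin.
            column = [spectrogram[i][f]
                      for i in range(max(0, t - radius), min(n_rows, t + radius + 1))]
            # Window along the frequency axis, same time step.
            row = spectrogram[t][max(0, f - radius):min(n_cols, f + radius + 1)]
            if magnitude >= max(column) and magnitude >= max(row):
                peaks.append((t, f))
    return peaks
```

As the edit above notes, this turned out to be too simple: it skips important peaks that the official app keeps.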

retryms

Shazam API returns a retryms hint when there are no matches:

{ "matches": [], "tagid": "34fe8632-7e6b-47c0-a5fa-1fedaa623c37", "retryms": 5000 }
{ "matches": [], "tagid": "34fe8632-7e6b-47c0-a5fa-1fedaa623c37", "retryms": 9000 }
{ "matches": [], "tagid": "34fe8632-7e6b-47c0-a5fa-1fedaa623c37", "retryms": 12000 }
{ "matches": [], "timestamp": 0, "tagid": "34fe8632-7e6b-47c0-a5fa-1fedaa623c37" }

This suggests submitting signatures for incrementally growing spectrograms; in other words, it allows up to four attempts before giving up.

Looks like they have multiple databases with different levels of detail, because popular songs are usually recognized faster.
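A client-side retry loop honoring this hint could look like the sketch below; `submit` is a hypothetical callable that records an incrementally longer sample, sends its signature, and returns the parsed JSON response:

```python
import time

def recognize_with_retry(submit, max_attempts=4):
    """Retry recognition while the server keeps sending a `retryms`
    hint; the final response carries no hint, which ends the loop."""
    response = None
    for attempt in range(max_attempts):
        response = submit(attempt)
        if response.get("matches"):
            return response              # got a match
        retry_ms = response.get("retryms")
        if retry_ms is None:
            break                        # no hint: the server gave up
        time.sleep(retry_ms / 1000.0)    # wait the hinted delay
    return response
```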

Two SongRec instances: No match for this song

Hello,

Platform 1 : Calculate Linux 21 - Flatpak 1.10.2
Platform 2 : Linux Mint 19.3 - Flatpak 1.0.9

Name: SongRec
Application ID: com.github.marinm.songrec
Version: 0.1.8
Branch: stable
Installation: system

Steps to reproduce:
Open one songrec instance.
Play some music.
Open a second songrec instance.
(Optionally, close the second instance.)
The message "No match for this song" pops up several times, and I am unable to stop these messages while an instance of songrec is running.

There is no problem if only one instance of songrec is started (no such message, even though I had used songrec for long, long hours before I saw this bug).

Could you fix this, or prevent a second instance of songrec from starting if one is already running?

Thanks

Translations

Thanks for this great app! Is there a way to translate the app? I'm willing to do so, in Greek!

algorithm is too slow

Awesome project!
I'm using the Python version, and the function do_peak_spreading takes over 4 seconds every time.
Any ideas on how to improve it? Thank you.

getting ID3 tag

Is there a way to get the ID3 tag as JSON, either from the Shazam API used or as a function in songrec?
It would really help a lot of people who use songrec in scripts to recognize songs, rename files, and add metadata.

thanks

calling from python with chunks of sets

Great Project!

I want to use this to get the IDs of tracks from DJ sets. So let's say I have the audio file of a set; how would I approach it? I would like to have something like a sliding window over the set, running songrec on each window, so that I get the timestamps where the songs change.

I would write a small Python script which calls SongRec (I don't know any Rust).

Does this code just take the raw 12 s of PCM from the middle of the song and return it? Does the algorithm always expect 12 s of PCM data?

pub fn make_signature_from_file(file_path: &str) -> Result<DecodedSignature, Box<dyn Error>> {

continuous recognition from file

I'm not that good with Rust, but it seems like it just skips to the middle of a file when you do recognition on a file.

To be able to get a full setlist from a DJ set, it would be neat to run it in continuous mode on the whole file instead (rather than working around it with a null sink in PulseAudio/PipeWire for loopback from a player to songrec).

It should probably not fire off requests as rapidly as the file is decoded, though, but rather pace them at normal playback speed.

Ignore the same song repeatedly being recognised

I love this app! It passively listens to tracks so I can get a playlist populated pretty quickly. The bars and cafes won't know what hit 'em!

That being said, I would like it if the app compared the previous Shazam lure with the currently generated lure and skipped re-adding the song to the list if it just played. That way, if the same song plays again further down the list, you can still see that it's been played more than once, but the current song won't be added several times over.

The only issue with this is if the song actually plays again (repeated in the playlist), meaning the app would have to differentiate between the beginning of a song and quite possibly even its end (something which would require a fair bit of coding, I expect, given that Shazam doesn't support checking for parts of a song). But I think the SongRec playlist and polling issue takes precedence.

I also think this would solve issue #25, since it would compare the lure locally and ignore it if it's the same, minimizing the number of times the Shazam service is polled.

Screenshot attached for reference.

Global Command

When using the software as root it works fine, but if I try to call it from Python using os.system(), it says "command not found". Is there a way to fix this?

Doesn't work with .wma files

I tried to recognize a song with a .wma format and it didn't work. I can play the song without any problem outside of it.

On a side note: does it work with video files? I tried with one and it didn't recognize the song, but it didn't give me the "Unrecognized format" error.

Two more insights

Hello

I'd like to share two more insights that I got, that helped me better understand the signature format.

Bin interpolation

let peak_variation_1: f32 = peak_magnitude * 2.0 - peak_magnitude_before - peak_magnitude_after;
let peak_variation_2: f32 = (peak_magnitude_after - peak_magnitude_before) * 32.0 / peak_variation_1;
let corrected_peak_frequency_bin: u16 = bin_position as u16 * 64 + peak_variation_2 as u16;

This is precisely the Quadratic Interpolation of Spectral Peaks described very well at https://ccrma.stanford.edu/~jos/sasp/Quadratic_Interpolation_Spectral_Peaks.html. They also suggest an interpolation formula for the magnitude, and I guess this is what is referred to as 'curvature' in the binary.
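Writing α, β, γ for the log-magnitudes at bins k−1, k, k+1, `peak_variation_2` works out to 64·p, where p = ½(α−γ)/(α−2β+γ) is the standard fractional-bin offset, so `corrected_peak_frequency_bin` is just 64·(k + p): frequency counted in 1/64-bin steps. A minimal Python transcription of the textbook formulas (not SongRec code):

```python
def quadratic_peak(alpha, beta, gamma):
    """Quadratic interpolation of a spectral peak: alpha, beta, gamma
    are the (log-)magnitudes at bins k-1, k, k+1, with beta a local
    maximum. Returns the fractional bin offset p in [-1/2, 1/2] and
    the interpolated peak magnitude."""
    p = 0.5 * (alpha - gamma) / (alpha - 2.0 * beta + gamma)
    magnitude = beta - 0.25 * (alpha - gamma) * p
    return p, magnitude
```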

Magnitude scaling

let peak_magnitude: f32 = fft_minus_46[bin_position].ln().max(1.0 / 64.0) * 1477.3 + 6144.0;

Empirically, I found that the exact expression is

3 * 4096 * (log(m) / log(64) - 1)

where m is the clean magnitude (i.e. without scaling and squaring), and 64 happens to be the minimum value that yields non-negative results.

It seems to me that here

real_fft_results[index] = (
    (
        complex_fft_results[index].re.powi(2) +
        complex_fft_results[index].im.powi(2)
    ) / ((1 << 17) as f32)
).max(0.0000000001);

the divider should be 1 << 18 to match the formula. Of course, it doesn't affect recognition; I just found it interesting to share.
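The equivalence is easy to check numerically: 1477.3 ≈ 12288 / (2·ln 64) and 6144 = 12288 / 2, so feeding the Rust expression a squared magnitude divided by 1 << 18 reproduces the closed form. A small sketch of my own check, valid for inputs well above the 1/64 clamp:

```python
import math

def rust_magnitude(m):
    """The Rust expression, applied to the squared magnitude m^2
    scaled by 1 << 18 (the divider argued for above)."""
    x = m * m / (1 << 18)
    return max(math.log(x), 1.0 / 64.0) * 1477.3 + 6144.0

def clean_magnitude(m):
    """The empirical closed form: 3 * 4096 * (log_64(m) - 1)."""
    return 3 * 4096 * (math.log(m, 64) - 1)
```

The two agree to a fraction of a unit; the small residual comes from 1477.3 being a rounded constant.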

Cheers

New release?

There are a bunch of new fixes in the code, but no new release?

How to recognize from system sound ?

How do I change the input device to system sounds?

For example, when I listen to music from YouTube or Coub and want to know the name of the track, I can't, because it only recognizes from the microphone.

SongRec as a lib

As mentioned in #11, I want to use SongRec from Python. I have never coded in Rust, but I set up a small Rust wrapper to marshal the types; however, I am unable to import anything, even though SongRec is in my crate dependencies (can't find crate for `songrec`).

I think this is because SongRec does not have a [lib] section and therefore does not expose anything (first time using Rust...)?

Flatpak packaging

Your app looks good, but having to compile it makes it difficult for people to try it out. Maybe Flatpak packaging could help people on multiple distros (except for Ubuntu) to easily test it.
Thanks!

Would it be possible to add an option to listen to music from Bluetooth headphones or from the browser?

Hello @marin-m, you are a legend! I was looking for something similar to SoundHound or Shazam for Linux and... boom, I stumbled upon your amazing project. Really, thanks for this precious milestone in the Linux world.

Now, to the point: I was wearing my earphones watching a video on the internet, one with great music in the background, and I noticed that Shazam, for example, can recognize the audio even when you are using Bluetooth earphones/headphones. I'm not sure which method it uses to read the audio data: from the OS audio management before it reaches the Bluetooth audio hardware, or from the Bluetooth hardware in the phone before it reaches the earphones. Do you think it would be easy to implement this feature, or even to let the user select the app that produces the audio (before it reaches the device) so it can recognize what song is playing?

Dark Theme on Windows

I saw in another issue that SongRec has a dark theme, but since there's no options panel, I think it uses some default settings from the OS.

But, for some reason, this dark theme doesn't work on Windows. I have the dark theme and the application just opens in the light theme. :(

OS: Windows 10 x64

Consider tracking Cargo.lock

Thanks for your work on this tool, it's great to have something like this available on the desktop.

One very small issue, though: the project's Cargo.lock file is currently not tracked since it's listed in .gitignore. Would you consider adding it to the repository? This is the approach recommended by the Cargo book for "end product"-type software, such as applications (emphasis mine):

If you’re building a non-end product, such as a rust library that other rust packages will depend on, put Cargo.lock in your .gitignore. If you’re building an end product, which are executable like command-line tool or an application, or a system library with crate-type of staticlib or cdylib, check Cargo.lock into git. If you're curious about why that is, see "Why do binaries have Cargo.lock in version control, but not libraries?" in the FAQ.

Having the lockfile available would bring reproducibility benefits for distributions which package SongRec, since they would be able to ensure builds use the exact same set of dependencies as upstream. (I'm also considering packaging SongRec in nixpkgs, which more or less requires having this level of reproducibility for packages to be accepted.)

Using SongRec's implementation in my own app.

Hello, I would like to ask, whether I could use your fingerprinting and signature algorithms in my app - OneTagger (https://github.com/Marekkon5/onetagger), since you don't release this as a library, but as a standalone app. My app is using the Apache license, however I could switch to GPL as well. I am asking because I would practically copy the functionality of your app into mine, but with some other features as well. Thanks

Spotify

Hi, can you add a function to automatically insert the song into a Spotify playlist?

Non-French?

The French location and language are hardcoded into the comms; any chance of making them configurable? I've noticed the results have French in them, even if the bits you are parsing out for the GUI are language-independent.

Doesn't have to be all languages, but English would be nice.

More on peak detection

Hello, it's me again

I became really obsessed with figuring out how Shazam actually picks peaks, so I went data-driven. I captured many signatures from the official app and explored the distances between peaks. Having analyzed 16,000 peaks, I found two empty areas around each:

  • ±47 time units
  • ±9 freq bins by ±3 time units rectangle

These parameters produce quite dense constellations:
spectro1

Obviously, only the most intense peaks are wanted. But what should the threshold be? Intuitively, I didn't like an absolute or relative magnitude threshold, because a single noise spike can spoil the entire spectrogram. Then I noticed that in original signatures, the number of peaks is often limited to approximately 12 × sample duration:

sample len: 4.19
peaks in band 0: 14
peaks in band 1: 48
peaks in band 2: 48

sample len: 14.6
peaks in band 0: 63
peaks in band 1: 168
peaks in band 2: 168

This made me think of a rate limiter. After a bit of trial and error, I wrote a specialized rate limiter (code). When the next peak doesn't fit, the algorithm checks whether there is a less intense peak among those previously recorded. If one is found, it is replaced with the new peak; otherwise, the new peak is rejected.
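The replace-the-weakest policy can be sketched with a min-heap keeping the weakest recorded peak at the root; this is an illustrative reimplementation of the description above, not the linked code:

```python
import heapq

def limit_peaks(candidates, budget):
    """Keep at most `budget` peaks, preferring the most intense.
    `candidates` yields (magnitude, time, bin) tuples; tuple ordering
    compares magnitudes first. When the budget is full, an incoming
    peak replaces the weakest recorded one if it is more intense,
    and is rejected otherwise."""
    kept = []                            # min-heap: weakest peak on top
    for peak in candidates:
        if len(kept) < budget:
            heapq.heappush(kept, peak)
        elif peak > kept[0]:
            heapq.heapreplace(kept, peak)
    return sorted(kept, reverse=True)    # most intense first
```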

For 12 peaks per second per band, spectrograms look like this:
spectro2

Recognition worked fairly well.

Then I tried the lowest limit of 1. And you know what? It worked too. These 20 points were correctly tagged as Smells Like Teen Spirit:
spectro3

Maybe you'll want to check this approach in your app.
Thank you.

HTTP error spam

Bug description:
When network cuts out or there is some other API error, the user is spammed with an error dialog every time a request is sent.

The particular message I was getting was along the lines of (not exactly this):

Unable to decode JSON
Unexpected token at line 2 position 0
error decoding response body: EOF while parsing a value at line 1 column 0

Expected behavior:
Either:

  • API errors are logged to the terminal, a file, or someplace else.
  • Error dialogs are suppressed if the last error message is the same as the next one.
  • Some way to disable error dialogs.

Version:
songrec: 0.1.6-1 (AUR)

Relevant code:

Err(error) => {
    gui_tx.send(GUIMessage::ErrorMessage(error.to_string())).unwrap();
}

ErrorMessage(string) => {
    recognize_file_button.show();
    spinner.hide();
    if !(string == "No match for this song" && microphone_stop_button.is_visible()) {
        let dialog = gtk::MessageDialog::new(Some(&window),
            gtk::DialogFlags::MODAL, gtk::MessageType::Error, gtk::ButtonsType::Ok, &string);
        dialog.connect_response(|dialog, _| dialog.close());
        dialog.show_all();
    }
}

Great project by the way :)

Capture output audio stream as source

It would be nice to have the software pickup audio from the output stream.
Often while watching a movie or a twitch stream I would like to know the song which is playing and headphones -> mic isn't the best solution.

I guess this may be hard to implement but still a nice to have.

What data it collects?

As I understand it, this talks to Shazam's servers; if so, what data does it collect about me?

Mic input makes SongRec crash (and crash on reopen)

Hello,

In my microphone inputs I've got two options that make SongRec crash immediately: upmix and vdownmix.
I don't know where those come from; I've been installing and uninstalling multiple sound programs over the years, and they might be a remnant of something that's no longer there.
Anyway, when I select one of them, SongRec crashes, and since it remembers the last selected input, SongRec crashes again on reopening. I have to empty ~/.local/share/SongRec/device_name.txt for SongRec to reopen without crashing.
When I try one of these options with SongRec opened from a terminal, I get the following error: "Erreur de segmentation (core dumped)" (French for "Segmentation fault").

I'm on Manjaro KDE and I installed SongRec from AUR on version 0.1.7-2. Let me know if you need more specific technical details.

Genre?

Shazam also provides a music genre in the results, which is often useful to know.

Default device (speakers)

I love SongRec!!!! Thanks for this fine piece of software!
Maybe, in a future release, we could have the "recognize from speakers" checkbox state also stored in the device_name.txt file?

Ability to Save Microphone Input

Hey, just wanted to say I love your app. It has been working great so far, but I wanted to suggest adding the ability to save the microphone input choice, so the user doesn't have to choose between pipewire, pulse, etc. every time.

Recognition from files not possible

TLDR: File input does not work as intended; only microphone recognition works.

Hey,

SongRec is quite good at recognizing songs via the desktop microphone.

However, most of the songs I tag are on the go, on my smartphone (Android). I was really happy that there is an alternative to the Shazam app. But when I record audio files on my phone and copy them onto the laptop for SongRec, it does not recognize them.

The audio file is always displayed as "Unrecognized format", and the .wav file gives an empty JSON string (only some metadata), but no song. However, when I play the recording, the microphone recognition outputs the right song (so it is not due to bad quality of the .wav recording).

I don't know why file input does not work with either file format; I even tried the command line, but with the same result :( I can upload these files if you want.

PS Dreaming: It would be really cool, if this app will ever find its way on androids (f-droid).

Kind regards

Differences between the Python and Rust implementations

Hi!

I found that the Python version and the Rust version give slightly different signatures for the sample file 'stupeflip.wav'. The input PCM buffer is identical for both.

After some digging, I found the following things that make the behaviour differ:

  1. Python-version uses max_value from outside the cycle:

    max_value = spread_last_fft[position]
    for former_fft_num in [-1, -3, -6]:
        former_fft_output = self.spread_ffts_output[(self.spread_ffts_output.position + former_fft_num) % self.spread_ffts_output.buffer_size]
        former_fft_output[position] = max_value = max(former_fft_output[position], max_value)

    Rust-version just uses max of 2 values inside the cycle:
    former_fft_output[position] = former_fft_output[position]
    .max(spread_fft_results_copy[position]);

    This is the main reason the peak counts differ (the signature from the Python version is smaller and has fewer peaks).

  2. Rust-version casts value to u16 earlier:

    let corrected_peak_frequency_bin: u16 = bin_position as u16 * 64 + peak_variation_2 as u16;

    Python-version uses float almost till the end:
    corrected_peak_frequency_bin = bin_position * 64 + peak_variation_2

    This caused some peaks to be matched to other frequency bands.

  3. ln/log and max operations order is different.

    let peak_magnitude: f32 = fft_minus_46[bin_position].ln().max(1.0 / 64.0) * 1477.3 + 6144.0;
    let peak_magnitude_before: f32 = fft_minus_46[bin_position - 1].ln().max(1.0 / 64.0) * 1477.3 + 6144.0;
    let peak_magnitude_after: f32 = fft_minus_46[bin_position + 1].ln().max(1.0 / 64.0) * 1477.3 + 6144.0;

    peak_magnitude = log(max(1 / 64, fft_minus_46[bin_position])) * 1477.3 + 6144
    peak_magnitude_before = log(max(1 / 64, fft_minus_46[bin_position - 1])) * 1477.3 + 6144
    peak_magnitude_after = log(max(1 / 64, fft_minus_46[bin_position + 1])) * 1477.3 + 6144

    This didn't affect the result on the sample file, but it might on some other.

Now I'm confused, because I do not understand the algorithm deeply, so I can't say which solution is correct.
Both seem to work actually, but is one of them better?
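The divergence in point 1 can be reproduced with plain scalars. The two functions below are a toy model of a single frequency bin with hypothetical magnitudes, not the actual ring-buffer code:

```python
def spread_python_style(older_ffts, current):
    """Python version: the running max_value is carried across the
    loop, so a maximum promoted into a nearer FFT also propagates
    into the farther ones."""
    out, max_value = [], current
    for value in older_ffts:       # offsets -1, -3, -6 in the real code
        max_value = max(value, max_value)
        out.append(max_value)
    return out

def spread_rust_style(older_ffts, current):
    """Rust version: each older FFT is maxed against the current
    frame only, independently of its neighbours."""
    return [max(value, current) for value in older_ffts]
```

For older magnitudes [7, 3, 4] and a current magnitude of 5, the Python style yields [7, 7, 7] while the Rust style yields [7, 5, 5], which is exactly the kind of discrepancy that changes the peak counts.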

GUI features but in command-line only?

Hi,

The continuous song recognition and the history export to csv are great features!
However I would like to run these on a headless server (e.g. rpi with mic connected).

Is there a simple way to run songRec with all these amazing features (continuous recognition and csv export), but from a CLI? Alas I don't know rust at all...

Thanks in advance for your help :)

Consider moving fingerprinting to different crate licensed as MIT

Hi,
Recently I searched for any music hash/fingerprint crate, but I couldn't find any.

It looks like src/fingerprinting/algorithm.rs is more or less the solution to my problem, but I can't use it directly due to license problems (my project is licensed under MIT, but this code is GPL).

Also, it looks like this code returns only the data needed by Shazam, but I need a continuous hash, e.g. [u8; 8], so I can check the Hamming distance between two hashes to tell whether two pieces of music are similar or not.

I think it should look similar to the img_hash crate, which produces 64-bit (or 256-bit, etc.) hashes from images: https://crates.io/crates/img_hash

Android port ?

Is there any hope that this gets ported to Android?

PS: Thanks for this amazing work!

batch music file renaming

My dad has several compilation CDs that are unrecognised by CDDB, MusicBrainz, etc., so to rip them properly he would have had to manually rename every track. That's a tedious task, and nearly all of the tracks are well-known songs that are recognised by Shazam, so I wrote a script using songrec that automates renaming every supported music file within a directory:

https://github.com/danboid/songrec-rename

Might you consider adding this feature into songrec itself?
