lyricsgenius's Introduction

LyricsGenius: a Python client for the Genius.com API

lyricsgenius provides a simple interface to the song, artist, and lyrics data stored on Genius.com.

The full documentation for lyricsgenius is available online at Read the Docs.

Setup

Before using this package you'll need to sign up for a (free) account that authorizes access to the Genius API. The Genius account provides an access_token that is required by the package. See the Usage section below for examples.

Installation

lyricsgenius requires Python 3.

Use pip to install the package from PyPI:

pip install lyricsgenius

Or, install the latest version of the package from GitHub:

pip install git+https://github.com/johnwmillr/LyricsGenius.git

Usage

Import the package and initiate Genius:

import lyricsgenius
genius = lyricsgenius.Genius(token)

If you don't pass a token to the Genius class, lyricsgenius will look for an environment variable called GENIUS_ACCESS_TOKEN and attempt to use that for authentication.

genius = lyricsgenius.Genius()
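
The fallback described above can be pictured roughly as follows. This is a minimal sketch, not the library's implementation: `resolve_token` is a hypothetical helper, and only the variable name GENIUS_ACCESS_TOKEN comes from the text above.

```python
import os


def resolve_token(token=None, environ=None):
    """Return an explicit token if given, else fall back to the environment.

    Hypothetical helper for illustration; not part of lyricsgenius.
    """
    environ = os.environ if environ is None else environ
    if token:
        return token
    token = environ.get("GENIUS_ACCESS_TOKEN")
    if not token:
        raise ValueError("No token passed and GENIUS_ACCESS_TOKEN is not set")
    return token
```

Passing the mapping in as a parameter keeps the helper easy to exercise without touching the real environment.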

Search for songs by a given artist:

artist = genius.search_artist("Andy Shauf", max_songs=3, sort="title")
print(artist.songs)

By default, search_artist() only returns songs where the given artist is the primary artist. To get every song the artist appears on, set the include_features argument to True.

artist = genius.search_artist("Andy Shauf", max_songs=3, sort="title", include_features=True)
print(artist.songs)

Search for a single song by the same artist:

song = artist.song("To You")
# or:
# song = genius.search_song("To You", artist.name)
print(song.lyrics)

Add the song to the artist object:

artist.add_song(song)
# the Artist object also accepts song names:
# artist.add_song("To You")

Save the artist's songs to a JSON file:

artist.save_lyrics()

Searching for an album and saving it:

album = genius.search_album("The Party", "Andy Shauf")
album.save_lyrics()

There are various options configurable as parameters within the Genius class:

genius.verbose = False # Turn off status messages
genius.remove_section_headers = True # Remove section headers (e.g. [Chorus]) from lyrics when searching
genius.skip_non_songs = False # Include hits thought to be non-songs (e.g. track lists)
genius.excluded_terms = ["(Remix)", "(Live)"] # Exclude songs with these words in their title

You can also call the package from the command line:

export GENIUS_ACCESS_TOKEN="my_access_token_here"
python3 -m lyricsgenius --help

Search for and save lyrics to a given song and album:

python3 -m lyricsgenius song "Begin Again" "Andy Shauf" --save
python3 -m lyricsgenius album "The Party" "Andy Shauf" --save

Search for five songs by 'The Beatles' and save the lyrics:

python3 -m lyricsgenius artist "The Beatles" --max-songs 5 --save

Example projects

Contributing

Please contribute! If you want to fix a bug, suggest improvements, or add new features to the project, just open an issue or send me a pull request.

lyricsgenius's Issues

search_artist should use Genius's list songs endpoint from the artist's page

Is your feature request related to a problem? Please describe.
The current Genius.search_artist method relies on a heuristic for finding songs by the requested artist. This method is slow, inefficient, and may miss songs that belong to the artist.

Describe the solution you'd like
Use the same endpoint Genius.com uses when listing songs on an artist's page.

Here is an example of the all songs endpoint for Jay-Z:

Additional context
Not sure if this API endpoint is publicly listed by Genius, but the endpoint returns a 200 when I make a request to it.

Song.save_lyrics doesn't include song title in default file name

Describe the bug
The Song.save_lyrics method saves a file name with artist name but not song title, potentially overwriting different songs by the same artist.

Expected behavior
Default file name should be f"Lyrics_{song.title}_{song.artist}.txt".

To Reproduce
Describe the steps required to reproduce the behavior.

  1. song = api.search_song("99 problems Jay-z")
  2. song.save_lyrics()

Additional context
Problem is an issue if saving multiple songs individually, potentially by the same artist. I should provide a save_songs method that accepts a list of songs.
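
A default name that includes the title could be built along these lines. This is only a sketch: `default_lyrics_filename` is a hypothetical helper, and the pattern simply follows the expected behavior stated in this issue.

```python
def default_lyrics_filename(title, artist, extension="txt"):
    """Build a default file name containing both the song title and the artist.

    Hypothetical helper; the "Lyrics_{title}_{artist}" pattern follows the
    expected behavior described in this issue.
    """
    return "Lyrics_{}_{}.{}".format(title, artist, extension)
```

With the title in the name, two songs by the same artist no longer map to the same file.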

It's not an issue, but..

I got a little question here. Is there a node.js version of this, or can somebody convert it?

I'm making a project that gets lyrics, and this is exactly what I need - but it's Python :(

Thanks!

Write more tests

There really need to be more unit tests for this package. Help wanted!

The package currently runs with continuous integration on Travis-CI.

Installation - AUR package link

I created a package of LyricsGenius for Arch Linux and published it to AUR.
Maybe you could put Arch Linux installation instructions under "Installation" like this:

Install the AUR package for Arch Linux manually:

curl -L -O https://aur.archlinux.org/cgit/aur.git/snapshot/python-lyricsgenius.tar.gz
tar -xvf python-lyricsgenius.tar.gz
cd python-lyricsgenius
makepkg -si

Lyrics only?

Is there a way to only pull down the lyrics?
I'm having to sort through the files to remove year/album/artist/etc, and I got to thinking that there just has to be a better way of doing it.

Feature request: add support for the Genius annotations

If we're using the Genius API we really should allow the user to access the lyric annotations, not just the lyrics themselves. It'd take some thought to figure how to properly organize and structure the lyrics, but those decisions may be guided by how Genius already formats their API responses.

Would the lyrics be keys in a dictionary corresponding to the annotation? Would the annotations just be stored sequentially in a list? What's the best format?

How to avoid "SKIPPING `song name` (already found in artist collection)"?

Hi, great wrapper!

I'm trying to grab all the Radiohead lyrics from Genius to do some analysis on them. When I try and save all the songs to the json file I get the message

SKIPPING song name (already found in artist collection)

In my case it's

SKIPPING "Morning Bell/Amnesiac" (already found in artist collection)
SKIPPING "Hunting Bears" (already found in artist collection)
SKIPPING "Feral" (already found in artist collection)

How can I avoid this? I need the data for these three songs.

Thanks!

_result_is_lyrics customization

🏷 Enhancement

I agree with most of the filters being applied to reject songs, but having the ability to pass in a list of extra lyric filters or customize the existing criteria could provide additional value to users.

def _result_is_lyrics(self, song_title):
    """Returns False if result from Genius is not actually song lyrics"""
    regex = re.compile(
        r"(tracklist)|(track list)|(album art(work)?)|(liner notes)|(booklet)|(credits)|(remix)|(interview)|(skit)", re.IGNORECASE)
    return not regex.search(song_title)
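
One way the requested customization might look is sketched below as a standalone function. The `extra_terms` parameter and the `DEFAULT_TERMS` list are assumptions made for illustration (the terms are copied from the regex above); this is not the package's API.

```python
import re

# Terms copied from the existing regex; each entry is a regex fragment.
DEFAULT_TERMS = [
    "tracklist", "track list", "album art(work)?", "liner notes",
    "booklet", "credits", "remix", "interview", "skit",
]


def result_is_lyrics(song_title, extra_terms=None):
    """Return False if the title matches any default or user-supplied term.

    `extra_terms` is a hypothetical parameter holding extra regex fragments.
    """
    terms = DEFAULT_TERMS + list(extra_terms or [])
    pattern = "|".join("({})".format(term) for term in terms)
    return re.search(pattern, song_title, re.IGNORECASE) is None
```

For example, `result_is_lyrics("Hey Jude (Demo)", extra_terms=["demo"])` rejects a title that the default terms would accept.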

Add more usage examples and documentation

Is your feature request related to a problem? Please describe.
Users aren't aware of what features are available in lyricsgenius and are requesting features that are already a part of the package.

Describe the solution you'd like
Add documentation to the README that includes examples for more use cases. Eventually it would be nice to have a dedicated documentation site.

Trouble with cyrillic

TRY:

import lyricsgenius as genius
api = genius.Genius('token')
song = api.search_song('Возможно')

print(song.lyrics)

Possible Solution:

api.py

110: lyrics = html.find("div", class_="lyrics").get_text().encode('ascii','ignore').decode('ascii')

change to

110: lyrics = html.find("div", class_="lyrics").get_text()
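
The effect of the original line can be reproduced with the standard library alone. The snippet below is a small illustration, independent of lyricsgenius, of why encoding to ASCII with errors ignored removes Cyrillic text entirely.

```python
title = "Возможно"

# Encoding to ASCII with errors="ignore" silently drops every non-ASCII character.
stripped = title.encode("ascii", "ignore").decode("ascii")
print(repr(stripped))  # prints ''

mixed = "Привет, world"
mixed_stripped = mixed.encode("ascii", "ignore").decode("ascii")
print(repr(mixed_stripped))  # prints ', world'
```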

Artist.save_lyrics failing

Describe the bug
Using the code artist.save_lyrics() I am given an error when running the script

Expected behavior
I expected the lyrics of a chosen song to be saved to a file

To Reproduce
Describe the steps required to reproduce the behavior.
Use the following code:

import lyricsgenius as genius

api = genius.Genius("MY TOKEN") # Replaced my api token with "MY TOKEN"
artist = api.search_artist("Ariana Grande", max_songs=1)
song = api.search_song("thank u, next", artist.name)
artist.add_song(song)
artist.save_lyrics()

Include the error message associated with the bug.

Traceback (most recent call last):
  File "C:\Users\sebfa\PycharmProjects\TTS\main.py", line 8, in <module>
    artist.save_lyrics()
  File "C:\Users\sebfa\PycharmProjects\TTS\venv\lib\site-packages\lyricsgenius\artist.py", line 109, in save_lyrics
    filename = "Lyrics_{}.{}".format(self.artist.replace(" ", ""), format_)
AttributeError: 'Artist' object has no attribute 'artist'

Version info

  • Package version: 1.0.0
  • OS: Windows 10

Additional context
Add any other context about the problem here.

Duplicative Effort?

1 def songsAreSame(s1, s2):
2 from difflib import SequenceMatcher as sm # For comparing similarity of lyrics
3 seqA = sm(None, s1.lyrics, s2['lyrics'])
4 seqB = sm(None, s2['lyrics'], s1.lyrics)
5 return seqA.ratio() > 0.5 or seqB.ratio() > 0.5

I'm curious about the purpose of the second SequenceMatcher on line 4 (line 80 in artist.py). Wouldn't this be one possible cause of the bottleneck during the JSON writing (line 101 in artist.py) that is mentioned in the comment above the line? If the second SequenceMatcher is necessary, I believe a permutation approach to the lyric checks could reduce the time to write to file.

E.g - A temp list would be created and "Song A" would be compared with "B" and "C", then "A" would be removed from the temp list and "B" would be compared with only "C"
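
The pairwise idea in the example above can be sketched with itertools.combinations, which visits each unordered pair exactly once. The function name, the dict input, and the 0.5 default threshold (taken from the snippet above) are illustrative assumptions. Both directions are still checked because the difflib documentation cautions that ratio() may depend on the order of its arguments, which may be why the original compares twice.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_duplicate_pairs(lyrics_by_title, threshold=0.5):
    """Compare each unordered pair of songs once; return the pairs that look alike."""
    duplicates = []
    for (title_a, text_a), (title_b, text_b) in combinations(lyrics_by_title.items(), 2):
        # ratio() may depend on argument order, so check both directions.
        forward = SequenceMatcher(None, text_a, text_b).ratio()
        backward = SequenceMatcher(None, text_b, text_a).ratio()
        if max(forward, backward) > threshold:
            duplicates.append((title_a, title_b))
    return duplicates
```

This still performs n(n-1)/2 comparisons for n songs, so it avoids repeated pairs rather than changing the overall growth rate.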

Error message

Hi, I get an error message while using your code:

import lyricsgenius as genius
api = genius.Genius('----my api code ---')
artist = api.search_artist('Andy Shauf', max_songs=3)

Error message:

Traceback (most recent call last):
  File "C:/Users/Chris/AppData/Local/Programs/Python/Python37-32/top2000/181208 top2000.py", line 3, in <module>
    artist = api.search_artist('Andy Shauf', max_songs=3)
  File "C:\Users\Chris\AppData\Local\Programs\Python\Python37-32\lib\site-packages\lyricsgenius\api.py", line 283, in search_artist
    found_name = artist_info['artist']['name']
TypeError: 'NoneType' object is not subscriptable

Can you help me with this?
Many thanks!

Artist search fails on "Tupac"

Artist search fails when searching for "Tupac" because Genius.com lists him as "2Pac".

The artist page for 2Pac has an AKA section that includes "Tupac". It would probably be possible to check if the user's search term is included in the AKA section of the first artist search result, continuing with the search if a match is found.
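
The check proposed above might look roughly like this. Everything here is hypothetical: `matches_artist` is an illustrative helper, and how the AKA list would be obtained from Genius is not covered.

```python
def matches_artist(query, primary_name, alternate_names=()):
    """Return True if the query equals the primary name or any listed alias.

    All arguments are hypothetical inputs supplied by the caller.
    """
    wanted = query.strip().casefold()
    candidates = [primary_name, *alternate_names]
    return any(wanted == name.strip().casefold() for name in candidates)
```

With the alias list from the artist page, `matches_artist("Tupac", "2Pac", ["Tupac"])` would allow the search to continue instead of failing on the name mismatch.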

Is there a way to just return a JSON object with save_lyrics and not actually download the file?

Is your feature request related to a problem? Please describe.
Write a clear and concise description of what the problem is -- e.g. "I'm always frustrated when [...]"

Describe the solution you'd like
Write a clear and concise description of what you want to happen.

Describe alternatives you've considered
Write a clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Speed issues

Hello! I've been attempting to use this wrapper (thank you for putting this up!), but I've been noticing that a lot of times the search_artist function slows to a crawl and takes quite a long time to return any results. Is this to avoid some sort of rate limiting? Is there anything that I can do on my end to improve the speed at which lyrics are returned? Thanks again!

EDIT: I think the speed issues were a result of some of the first songs not having any lyrics. Those results seem to take a lot longer than results with lyrics.

Skipping songs taking longer than fetching one

First of all, thanks for the nice program, seems to work well for the most part.
I'm trying to build a corpus of lyrics for a project at my university, so I try to fetch all the songs of the artists I want to incorporate.
Once the program fetched most of the songs, it seems to find many duplicates and attempts to skip, but skipping takes way longer than fetching a song.
Is there any way to speed up the skipping process?
Best regards.

Genius API returns non-songs masquerading as songs

The Genius API includes entries the site refers to as songs that aren't actually songs.

For example, searching for Taylor Swift will return entries for liner notes and a booklet along with actual song lyrics.

My wrapper needs to be able to identify and reject these non-song entries. From what I can tell, the Genius API does not flag these items as non-songs — their type is still listed as "song" in the JSON object.

artist.save_lyrics failing

import lyricsgenius as genius
access_token = 'XXXX'
api = genius.Genius(access_token)
artist = api.search_artist("The Beatles", max_songs=3)
artist.save_lyrics(format_='json', filename='out.json')


.\python\lyrics>py -3 ./genius.py
Searching for songs by The Beatles...

Song 1: "12-Bar Original"
Song 2: "1822!"
"1 [Booklet]" is not valid. Skipping.
"20 Greatest Hits - Art and Tracklist" is not valid. Skipping.
Song 3: ""Abbey Road" side two"

Reached user-specified song limit (3).
Done. Found 3 songs.
Traceback (most recent call last):
  File "./genius.py", line 19, in <module>
    artist.save_lyrics(format_='json', filename='out.json')
  File "C:\Python3\lib\site-packages\lyricsgenius\artist.py", line 129, in save_lyrics
    lyrics_to_write['songs'][-1]['album'] = song.album
  File "C:\Python3\lib\site-packages\lyricsgenius\song.py", line 45, in album
    if 'album' in self._body and 'name' in self._body['album']:
TypeError: argument of type 'NoneType' is not iterable

.\python\lyrics>py -3 --version
Python 3.6.2

Originally posted by @robot3498712 in #71 (comment)
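
The traceback suggests that either the song payload or its album entry is None when the membership test runs. A defensive lookup could be sketched as follows; `album_name` is a hypothetical standalone function, and `body` stands in for the `_body` attribute seen in the traceback.

```python
def album_name(body):
    """Return the album name from a song payload, or None when it is missing.

    `body` is a hypothetical dict (or None) shaped like the `_body` attribute
    in the traceback above.
    """
    album = (body or {}).get("album")
    if not album:
        return None
    return album.get("name")
```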

Search is case sensitive

Search is case sensitive, but shouldn't be.
For example:
song = api.search_song('lose yourself', 'Eminem')
returns no results, whereas if you search by url:
https://genius.com/search?q=lose%20yourself
it returns the correct result.
To fix, edit _clean() function:

def _clean(self, s):
    return s.translate(str.maketrans('','',punctuation)).replace('\u200b', " ").strip().lower()

I.e. just add .lower()

The "save lyrics" methods should be Song and Artist class methods

It'd make sense to at least have the option to do the following:

# Save lyrics for a single song
song = api.search_song("Hello, Goodbye", "The Beatles")
song.save_lyrics()

# Save all lyrics from a given artist
artist = api.search_artist("The Beatles")
artist.save_lyrics()

Currently you save lyrics by calling api.save_artist_lyrics(artist).

Character not recognized

Describe the bug
When I try to scrape lyrics of the top 10 popular Kanye songs, it doesn't recognize one character.

Expected behavior
return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\u0150' in position 2716: character maps to <undefined>

This error will pop up, and I think this means it encountered the character u0150.

To Reproduce
Describe the steps required to reproduce the behavior.

if scrape_mode is True:
    artist = genius.search_artist("Kanye West", max_songs=10, sort="popularity")
    lyrics = ''

    for i in range(10):
        with open('Kanye.txt', 'a') as file:
            file.write(artist.songs[i].lyrics)

Include the error message associated with the bug.

Version info

  • Package version [import lyricsgenius; print(lyricsgenius.__version__)]
  • OS: [e.g. macOS, Windows, etc.]

Additional context
Add any other context about the problem here.
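
A 'charmap' error of this kind usually means the file was opened with a legacy platform encoding. The sketch below assumes that default is a code page such as cp1252 (an assumption about the reporter's system) and shows that passing encoding="utf-8" to open() avoids the failure. The temporary file path is only for demonstration.

```python
import os
import tempfile

text = "Lyrics containing \u0150"

# A legacy code page such as cp1252 cannot represent this character.
try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# Passing encoding="utf-8" makes the write independent of the platform default.
path = os.path.join(tempfile.mkdtemp(), "lyrics.txt")
with open(path, "a", encoding="utf-8") as handle:
    handle.write(text)

with open(path, encoding="utf-8") as handle:
    round_trip = handle.read()
```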

FileNotFoundError when saving song with a "/"

I was trying to save all the lyrics for songs from an artist, but the save_lyrics() function stopped once it hit a song that has a "/" in the song title.

Here is the error message I received:
FileNotFoundError: [Errno 2] No such file or directory: 'lyrics_arianagrande_blessed/rainbow.json'

To reproduce:
artist_name = "{Ariana Grande}"
artist = api.search_artist(artist_name)
artist.save_lyrics()

(The song is the 32nd song of hers pulled up.)
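
The "/" is interpreted as a directory separator, so the file name points into a folder that does not exist. One possible remedy is to replace unsafe characters before building the path. The helper below is hypothetical, and the set of characters it replaces is an assumption.

```python
import re


def sanitize_filename(name, replacement="_"):
    """Replace path separators and other characters that are unsafe in file names.

    Hypothetical helper; the character set below is an assumption.
    """
    return re.sub(r'[\\/:*?"<>|]', replacement, name)


safe_name = sanitize_filename("lyrics_arianagrande_blessed/rainbow.json")
print(safe_name)  # prints lyrics_arianagrande_blessed_rainbow.json
```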

Can't find certain songs

Describe the bug
When searching for certain songs, no songs are returned. Examples include:

  • "Sunflower" by Post Malone and Swae Lee
  • "The Glorious Five" by Logic

Expected behavior
Results should show for songs that are easily searchable using the genius.com UI.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. From the CLI, run the following command: lyricsgenius song "Sunflower" "Post Malone"

Error message associated with the bug:

Searching for "Sunflower" by Post Malone...
Could not find specified song. Check spelling?
Could not find specified song. Check spelling?

Version info

  • 1.0.2
  • OS: macOS

Additional context
Doesn't appear to be an issue with special characters or too many characters (in the song or artist).

set up as a pypi module?

If this is actually considered, this will require

  • adding a setup.py
  • moving config from config to inside python code

Error while searching for all lyrics by Kanye West

From a comment on my blog:

I'm looking to use it to analyze how an artist's lyrics change over different albums. My first thought was just to pull all of the artist's songs, but I believe there is a song in their directory with missing lyrics that is causing the search to quit.

So is there either a.) a way to avoid the search from stopping or b.) a way to pull songs by album instead of by artist?

I got the error when using the search function on Kanye West. The search will run up to "All Falls Down" and it prints this AttributeError: 'NoneType' object has no attribute 'get_text' and stops. Looking on the website, the next song on his list of songs is "All Falls Down (Live)" and says it is "Missing Lyrics" so I assumed this caused the error.

So this probably has to do with calling the get_text() function when there aren't actually lyrics available.
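
A guard of the following shape would let the search continue past a page with no lyrics. It is a sketch under assumptions: `extract_lyrics` is a hypothetical helper, and `FakeNode` is a stand-in for a parsed HTML element so the example runs without an HTML parser.

```python
def extract_lyrics(lyrics_node):
    """Return stripped lyrics text, or None when no lyrics element was found.

    `lyrics_node` stands in for the result of an HTML lookup, which may be
    None; it only needs a get_text() method. Hypothetical helper.
    """
    if lyrics_node is None:
        return None
    return lyrics_node.get_text().strip()


class FakeNode:
    """Minimal stand-in for a parsed HTML element, used for demonstration."""

    def __init__(self, text):
        self._text = text

    def get_text(self):
        return self._text
```

A caller can then skip any song for which `extract_lyrics` returns None rather than aborting the whole artist search.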

Searching for song or artist name requires exact match

My code in the search_song() and search_artist() functions requires an exact match between the user's query and the result returned from the Genius.com search.

Here's an example of the issue:

python genius.py --search_song "Hello Goodbye" "The Beatles"
    Searching for "Hello Goodbye" by The Beatles...
    Specified song was not first result :(

search_song() didn't find "Hello Goodbye" because the top result from Genius.com was "Hello, Goodbye" (note the comma).

Whereas this works:

python genius.py --search_song "Hello, Goodbye" "The Beatles"
   Searching for "Hello, Goodbye" by The Beatles...
   Done.

      "Hello, Goodbye" by The Beatles:
      You say yes, I say no
      You say stop and I say go go go, oh no
      You say goodbye and I say hello
      Hello h...

One simple fix would be stripping any punctuation and capitalization from both the user's search term and the Genius.com search results.
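
That fix can be sketched with the standard library. `normalize` is an illustrative name, and only ASCII punctuation from string.punctuation is removed.

```python
from string import punctuation


def normalize(text):
    """Lowercase the text and drop ASCII punctuation and surrounding whitespace."""
    return text.translate(str.maketrans("", "", punctuation)).strip().lower()


print(normalize("Hello Goodbye") == normalize("Hello, Goodbye"))  # prints True
```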

Is it possible to add a timeout parameter for api.search_song()?

I'm scraping lyrics of a list of songs, got a Read Timed Out error. Is it possible to change timeout parameter from 5 to 30?

error message:
ReadTimeout: HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)

Version info

  • Package version: 0.9.5
  • OS: MacOS Mojave 10.14

Song search needs titles check

Describe the bug
The search_song method doesn't check that it's returning the correct song.

Expected behavior
Searching for "99 problems" returns a Drake song, "All Me", instead of the expected "99 Problems" by Jay-Z.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. song = api.search_song("99 problems")
  2. print(song)

Additional context
I should probably add a check in to make sure we're not missing a search result that actually matches the song name.
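
Such a check might scan the returned hits for a title that matches the query instead of taking the first result. In this sketch `pick_matching_hit` and the list-of-dicts shape of `hits` are assumptions, not the structure of the actual API response.

```python
def pick_matching_hit(query, hits):
    """Return the first hit whose title matches the query, ignoring case.

    `hits` is a hypothetical list of dicts with a "title" key, ordered as the
    search returned them. Returns None when nothing matches.
    """
    wanted = query.strip().casefold()
    for hit in hits:
        if hit["title"].strip().casefold() == wanted:
            return hit
    return None
```

Falling back to the first hit when this returns None would preserve the current behavior for queries with no exact title match.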

UnicodeEncodeError when parsing Genius.com search results

Occasionally my code barfs when it encounters a character the ascii codec can't encode.

python genius/genius.py --search_song "Begin Again"

Searching for "Begin Again"...
Traceback (most recent call last):
    File "genius/genius.py", line 397, in <module>
        song = G.search_song(sys.argv[2])                                
    File "genius/genius.py", line 147, in search_song
        found_title  = str(search_hit['title']).translate(None,' ').lower()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u200b' in position 0: ordinal not in range(128)

I assume there is a standard easy fix to this issue. So, I should fix it.

Encoding error during saving of lyrics for an artist

I tried fetching the lyrics of the French rapper Nekfeu and saving them in txt format, but I got this error:

Traceback (most recent call last):
  File "lyrics_fetch.py", line 6, in <module>
    artist.save_lyrics(format = "txt")
  File "D:\Anaconda3\lib\site-packages\lyricsgenius\artist.py", line 134, in save_lyrics
    lyrics_file.write(lyrics_to_write)
  File "D:\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 20121: character maps to <undefined>

Song titles and artist names need to be too exact when searching

Describe the bug
Searching for a song by a given artist requires the song title and artist name inputs to be very close to the exact song title and artist name.

Expected behavior
Searching for "problems" by "jay z" should get the song "99 problems", but the search fails, even though the search works on Genius.com. Searching for "99 problems" without an artist argument does find the correct song.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. song = api.search_song("99 problems", "jay z")
  2. song is None, but it should have found the song.

Additional context
The lyricsgenius search should be just as flexible as the Genius.com search.

JSON is not well formatted

When viewing any of the JSON files exported by any of the save() functions in a Quicklook preview or trying to open the file in Sublime, I get a warning: JSON is not well formatted: Unexpected EOF. The JSON files can still be read into Python just fine using the json module, but I should figure out why I get this warning.

Package won't install

Describe the bug
When I try to install globally using pip install lyricsgenius, I get the following output:

Collecting lyricsgenius
  Using cached https://files.pythonhosted.org/packages/9d/4e/8cd3ff464d5c08e745bfae7c8ea96e64a3584e248ed8b57b9c2d102150d1/lyricsgenius-1.0.0.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-kJMjH9/lyricsgenius/setup.py", line 21, in <module>
        with open(path.join(this_directory, 'README.md'), encoding='utf-8') as f:
    TypeError: 'encoding' is an invalid keyword argument for this function
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-kJMjH9/lyricsgenius/

Expected behavior
A global pip install would work without errors.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. Open terminal
  2. pip install lyricsgenius

Include the error message associated with the bug.

TypeError: 'encoding' is an invalid keyword argument for this function
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-kJMjH9/lyricsgenius/

Version info

  • Package version: Latest
  • OS: macOS

Additional context
I'm coming from a Node background so this could easily be something I'm doing but I tried this with pipenv, virtualenv, global pip install, and on an AWS Cloud9 instance (to make sure my global pip isn't muddied) and I got similar results each time so I'm thinking there could be an issue at play.
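
The TypeError is consistent with setup.py being run by a Python 2 interpreter, whose built-in open() does not accept an encoding keyword. The package requires Python 3, so installing with a Python 3 pip is the likely remedy. For code that must read a file under either version, io.open accepts the keyword on both. The sketch below uses a temporary stand-in file rather than the real README.

```python
import io
import os
import tempfile

# Write a small stand-in README so the example does not depend on a real project.
directory = tempfile.mkdtemp()
readme_path = os.path.join(directory, "README.md")
with io.open(readme_path, "w", encoding="utf-8") as f:
    f.write(u"LyricsGenius \u266b")

# io.open accepts the encoding keyword on both Python 2 and Python 3.
with io.open(readme_path, encoding="utf-8") as f:
    long_description = f.read()
```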
