darel13712 / rs_datasets
Tool for autodownloading recommendation systems datasets
Home Page: https://darel13712.github.io/rs_datasets/
License: MIT License
Hello, this seems to be a very useful tool. It would be nice if you could add a license to make it clearer for others how they can (re)use your code.
>>> rs_datasets.Amazon("fashion").ratings
Traceback (most recent call last):
$PY/rs_datasets/amazon.py:64 in __init__ columns=['user_id', 'item_id', 'rating']
$PY/datatable/utils/fread.py:382 in _override_columns return _apply_columns_list(colspec, coldescs)
$PY/datatable/utils/fread.py:444 in _apply_columns_list % (plural(n, "column"), plural(nn, "column")))
(with $PY = /home/boris/projects/recommender-systems-course/venv/lib/python3.7/site-packages/)
ValueError: Input contains 4 columns, whereas columns parameter specifies only 3 columns
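The mismatch reported above is between the number of fields per row (4) and the number of column names supplied (3). A minimal sketch of one way to handle such a file, using only the standard library `csv` module rather than datatable; the sample rows and the helper name `read_first_columns` are assumptions made up for illustration, and the extra fourth field is simply dropped:

```python
import csv
import io

# Hypothetical sample: four fields per row, no header (values are made up).
SAMPLE = "u1,i1,5.0,1400000000\nu2,i2,3.0,1400000001\n"


def read_first_columns(text, names):
    """Read header-less CSV text and keep only the first len(names) fields."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) < len(names):
            raise ValueError(
                f"expected at least {len(names)} fields, got {len(record)}"
            )
        # Pair each name with the field in the same position; extras are ignored.
        rows.append(dict(zip(names, record[: len(names)])))
    return rows


ratings = read_first_columns(SAMPLE, ["user_id", "item_id", "rating"])
print(ratings[0])
```

Whether the real file stores its fields in this order is not confirmed here, so the names should be checked against the actual data before relying on them.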
>>> rs_datasets.Jester()
...
ImportError: Missing optional dependency 'openpyxl'. Use pip or conda to install openpyxl.
Downloading works, but opening the data requires the optional dependency openpyxl.
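A quick way to see whether an optional dependency such as openpyxl is available before loading a dataset is to ask the import system directly. This is a small sketch using the standard library; the helper name `missing_modules` is hypothetical and not part of rs_datasets:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["openpyxl"])
if missing:
    print("Install before loading the data: pip install " + " ".join(missing))
```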
rs_datasets.DatingAgency()
urllib.error.HTTPError: HTTP Error 403: Forbidden
The site is down: http://www.occamslab.com/petricek/data/ now serves a domain-for-sale ad.
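When a dataset host disappears like this, the raw `HTTPError` can be turned into a clearer message. The sketch below is only an illustration with the standard library; `try_download` and its `opener` parameter are hypothetical names, and the 403 response is simulated so that no network access is needed:

```python
import urllib.error
import urllib.request


def try_download(url, opener=urllib.request.urlopen):
    """Return (data, None) on success or (None, reason) on failure."""
    try:
        with opener(url) as response:
            return response.read(), None
    except urllib.error.HTTPError as err:
        # The server answered, but refused or could not serve the file.
        return None, f"HTTP {err.code} {err.reason} for {url}"
    except urllib.error.URLError as err:
        # The server could not be reached at all.
        return None, f"could not reach {url}: {err.reason}"


def fake_forbidden(url):
    """Stand-in opener that simulates the 403 response from the report."""
    raise urllib.error.HTTPError(url, 403, "Forbidden", None, None)


data, reason = try_download("http://example.invalid/data", opener=fake_forbidden)
print(reason)
```

Passing the opener as a parameter keeps the error handling testable without contacting the real site.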
Upon using the code:
from rs_datasets import Diginetica
d = Diginetica('~/datasets/diginetica')
d.info()
I am receiving the following error message:
Access denied with the following error:
Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.
>>> rs_datasets.Diginetica()
Access denied with the following error:
Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.
You may still be able to access the file from the browser:
https://drive.google.com/uc?id=0B7XZSACQf0KdenRmMk8yVUU5LWc
If I log in to https://drive.google.com/uc?id=0B7XZSACQf0KdenRmMk8yVUU5LWc, I see the message that I don't have access to the file.
The pre-release of datatable
works with other datasets but not with some from Amazon:
Python 3.9.9 (main, Nov 20 2021, 11:10:09)
[GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import datatable
>>> datatable.__version__
'1.1.0a0+build.1641309144.user'
>>> import rs_datasets
>>> rs_datasets.Amazon("automotive")
Traceback (most recent call last):
/home/sweet/home/rs_datasets/rs_datasets/amazon.py:62 in __init__ self.ratings = dt.fread(
IOError: Too few fields on line 830052: expected 4 but found only 2 (with sep=','). Set fill=True to ignore this error. <<B000JFHNEC,A1NDJGRZG5MK4E>>
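The error message itself suggests `fill=True`, which lets short rows through instead of raising. The sketch below shows the same idea with the standard library `csv` module; it is not datatable's implementation, the first sample row is made up, and `read_padded` is a hypothetical helper:

```python
import csv
import io

# Hypothetical sample: the short second row mirrors the one quoted in the error.
SAMPLE = "B000000001,A000000001,5.0,1400000000\nB000JFHNEC,A1NDJGRZG5MK4E\n"


def read_padded(text, width, fill=None):
    """Read CSV text, padding rows shorter than `width` with `fill`."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        # A negative repeat count yields an empty list, so long rows are kept as-is.
        rows.append(record + [fill] * (width - len(record)))
    return rows


rows = read_padded(SAMPLE, width=4)
print(rows[1])
```

Padding hides a truncated row rather than repairing it, so it may also be worth checking whether the file was downloaded completely.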
>>> rs_datasets.Lastfm().ratings
...
$PY/rs_datasets/lastfm.py:19 in __init__ columns=['user_id', 'artist_id', 'artist_name', 'play_count']
...
IOError: Too few fields on line 26169: expected 4 but found only 3 (with sep='\t'). Set fill=True to ignore this error. <<005eb8248718e55422075bb1678699c63ff5cf09\t7746d775-9550-4360-b8d5-c37bd448ce01\t"weird al" yankovic\t29>>
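The quoted row contains a literal double quote inside the artist name, which is a plausible (unconfirmed) reason the parser counted three fields instead of four. Assuming the fields are tab-separated, the sketch below parses the same row with the standard library `csv` module and quote handling switched off; it illustrates the idea only and is not a datatable call:

```python
import csv
import io

# The row from the error message, with tab characters as field separators.
LINE = (
    "005eb8248718e55422075bb1678699c63ff5cf09\t"
    "7746d775-9550-4360-b8d5-c37bd448ce01\t"
    '"weird al" yankovic\t'
    "29\n"
)


def parse_tsv_without_quoting(text):
    """Split tab-separated text, treating double quotes as ordinary characters."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


rows = parse_tsv_without_quoting(LINE)
print(len(rows[0]), rows[0][2])
```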