Comments (4)
Hello,
Apologies for the delayed response. Can you please provide the exact command that you ran and your directory structure?
Here is the process I followed to get a successful run. Let DIR/ be the base directory. I downloaded train-balanced.csv.bz2, test-balanced.csv.bz2 and comments.json.bz2 from https://nlp.cs.princeton.edu/SARC/2.0/pol/, extracted each of them with bzip2 -d, and moved them to DIR/data/pol/. I also cloned the SARC and text_embedding repos into DIR/. My directory structure looks like this:
DIR/
-> SARC/
-> text_embedding/
-> data/
-----> pol/
--------> train-balanced.csv
--------> test-balanced.csv
--------> comments.json
I changed SARC_DATA = 'DIR/data/' in utils.py as per the instructions and ran the following commands:
```
$ export PYTHONPATH=$PYTHONPATH:DIR/
$ python SARC/eval.py pol -l
Load SARC data
Create bongs
Dimension of representation: 12683
Evaluate the classifier on all responses
Train acc: 0.767047117354
Test acc: 0.697298884322
Evaluate the classifier on the original dataset
Train acc: 0.843576236465
Test acc: 0.759248385203
```
Alternatively, instead of adding DIR/ to PYTHONPATH, you can do what you suggested: add `import sys; sys.path.append('../')` to the top of eval.py and run it from the SARC directory. Hope this helps.
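One caveat with `sys.path.append('../')` is that the relative path is resolved against the current working directory, so it only works when eval.py is launched from inside SARC. A minimal sketch of a variant that resolves the parent directory from the script's own location instead is shown here; the helper name `add_parent_to_path` is hypothetical and is not part of the SARC repo.

```python
import os
import sys


def add_parent_to_path(script_path):
    """Append the parent of the script's directory to sys.path.

    Hypothetical helper, not part of SARC. For DIR/SARC/eval.py this
    appends DIR, so that sibling repos cloned into DIR can be imported.
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    parent_dir = os.path.dirname(script_dir)
    if parent_dir not in sys.path:
        sys.path.append(parent_dir)
    return parent_dir


# At the top of eval.py this would be called as:
#     add_parent_to_path(__file__)
```

With this in place the script should behave the same whether it is started as `python eval.py` from SARC or as `python SARC/eval.py` from DIR.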
Hi,
My apologies, I swore I commented this but I had a brain fart. This is the directory structure:
DIR/
-> SARC/
-----> main/
--------> train-balanced.csv
--------> test-balanced.csv
--------> comments.json
-----> pol/
--------> train-balanced.csv
--------> test-balanced.csv
--------> comments.json
-> text_embedding/
I downloaded the data from https://nlp.cs.princeton.edu/SARC/2.0/ and extracted it with 7zip.
I changed SARC_DATA = 'DIR/SARC/' in utils.py.
I added `import sys; sys.path.append('../')` to the top of eval.py.
From inside the SARC directory, I run the following:
$ python eval.py pol -l
Load SARC data
Traceback (most recent call last):
  File "eval.py", line 119, in <module>
    main()
  File "eval.py", line 45, in main
    load_sarc_responses(train_file, test_file, comment_file, lower=args.lower)
  File "C:\Users\matth\Desktop\Data-Science-Projects\SARC\utils.py", line 34, in load_sarc_responses
    responses = row[1].split(' ')
IndexError: list index out of range
Can you please print out the first few entries of DIR/SARC/pol/train-balanced.csv by running `cat DIR/SARC/pol/train-balanced.csv | head -n 3`? It seems like you are working on Windows. Did you by any chance open the csv file with software like Microsoft Excel? I do not see any concrete reason for this error, other than the csv package not parsing the file correctly. (I am assuming you are using python3 instead of python2, since that could have led to other errors.)
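Since `cat` and `head` are not available in a plain Windows command prompt, the same check can be sketched in Python. The traceback fails at `row[1]`, which means at least one parsed row has fewer than two fields, so the sketch below lists exactly those rows. The function name `find_short_rows` is hypothetical, and the `|` delimiter is an assumption about how utils.py reads the file; adjust it to match the `csv.reader` call there.

```python
import csv


def find_short_rows(lines, delimiter="|", min_fields=2):
    """Return (row_number, row) pairs for rows with fewer than min_fields columns.

    `lines` can be any iterable of strings, e.g. an open file handle.
    The default delimiter is an assumption, not confirmed from utils.py.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    return [
        (number, row)
        for number, row in enumerate(reader, start=1)
        if len(row) < min_fields
    ]


if __name__ == "__main__":
    # In-memory demo: the second line uses commas, so it parses as one field.
    sample = ["a b|c d|1 0", "a b,c d,1 0"]
    for number, row in find_short_rows(sample):
        print(number, row)
```

To check the real file, pass an open handle instead of the sample list, for example `find_short_rows(open('pol/train-balanced.csv', newline=''))` from inside the SARC directory. Any rows it reports would be the ones that trigger the `IndexError`.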
My apologies for the delayed response. I think the issue was opening the data in Excel, as you suggested! After unzipping the data again, I reran the code and it worked just fine!
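For anyone who wants to redo the extraction step without a separate archive tool, here is a minimal sketch using only the Python standard library. The function name `decompress_bz2` and the file paths in the usage note are illustrative assumptions; the file is written byte for byte, so nothing reformats the csv along the way.

```python
import bz2
import shutil


def decompress_bz2(src_path, dst_path):
    """Stream-decompress a .bz2 file to dst_path and return dst_path.

    Hypothetical helper. Copies in chunks, so the whole file is never
    held in memory at once.
    """
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path
```

A call such as `decompress_bz2('pol/train-balanced.csv.bz2', 'pol/train-balanced.csv')` would then recreate the csv from the downloaded archive.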