skblaz / rakun
Rank-based Unsupervised Keyword Extraction via Metavertex Aggregation
License: GNU General Public License v3.0
Hello,
I can't find the definition of the function hairball_plot. Where can I get it?
E NameError: name 'hairball_plot' is not defined
mrakun/__init__.py:71: NameError
Can I use a specific BERT model with this code? Say I want to use bert-base-uncased to extract keywords. Will it work here?
I have tried the code and it is impressive. However, I am having difficulty getting the output to show 2-grams or 3-grams, rather than single words, as the keywords for a given text. Is there any information on how to tune the hyperparameters?
Concerning the hyperparameters, could you explain some of them, such as:
"pair_diff_length":2,
"bigram_count_threshold":2,
"num_tokens":[1,2],
"max_similar" : 3, ## n most similar can show up n times
"max_occurrence" : 3} ## maximum frequency overall
Thanks.
Hello, I majored in computer science at Okayama University in Japan.
While using the RaKUn algorithm for an NLP task on Japanese text,
I found that the for loop in the "missing connectives" part may contain a bug.
It is the following part:
file: __init__.py
line: 449 ...
    for ind in i1_indexes:
        if ind + 2 in i2_indexes_map:
            joint_kw = " ".join([
                p1, self.raw_text[ind + 1],
                self.raw_text[ind + 2]
            ])
            final_keywords.append((joint_kw, kw[1]))
            joint = True
I think that if this for loop is not exited with break, multiple similar keywords may be added. Is this the intended behavior?
Could you please confirm this?
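The concern can be reproduced in isolation. The following is a minimal sketch, not the library's code: the function name join_with_connective, the stop_at_first flag, and the sample tokens are all hypothetical, and the loop body only mirrors the snippet quoted above.

```python
def join_with_connective(tokens, p1, p2, score, stop_at_first=True):
    """Join p1 and p2 wherever exactly one token separates them.

    Hypothetical stand-in for the quoted loop; `tokens` plays the role
    of self.raw_text and `score` the role of kw[1].
    """
    i1_indexes = [i for i, tok in enumerate(tokens) if tok == p1]
    i2_indexes = {i for i, tok in enumerate(tokens) if tok == p2}
    found = []
    for ind in i1_indexes:
        if ind + 2 in i2_indexes:
            joint_kw = " ".join([p1, tokens[ind + 1], tokens[ind + 2]])
            found.append((joint_kw, score))
            if stop_at_first:
                break  # stop after the first match to avoid repeated entries
    return found


# The same three-token phrase occurs twice in this sample text.
tokens = "bank of japan and bank of japan".split()

with_break = join_with_connective(tokens, "bank", "japan", 0.5)
without_break = join_with_connective(tokens, "bank", "japan", 0.5, stop_at_first=False)

print(len(with_break))     # 1
print(len(without_break))  # 2
```

Without the break, every occurrence of the pattern appends another entry, so a repeated phrase yields repeated keywords. Whether that is intended depends on whether a later step deduplicates the results, which the quoted snippet does not show.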
Hi,
When using fastText in Google Colab, I get the error:
module 'gensim.models.fasttext' has no attribute 'load_facebook_vectors'
Any idea?
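One possible cause, which is an assumption and not confirmed in this thread, is that the installed gensim release predates the load_facebook_vectors function. A quick diagnostic sketch is shown below; the helper name has_facebook_loader is hypothetical, and the check only reports whether the attribute exists in the current environment.

```python
import importlib


def has_facebook_loader(module_name="gensim.models.fasttext"):
    """Return True if the module imports and exposes load_facebook_vectors."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or gensim itself) is not installed in this environment.
        return False
    return hasattr(module, "load_facebook_vectors")


if __name__ == "__main__":
    if has_facebook_loader():
        print("load_facebook_vectors is available")
    else:
        print("load_facebook_vectors is missing; a newer gensim may be needed")
```

If the check reports that the function is missing, upgrading gensim in the notebook and restarting the runtime would be the first thing to try.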
https://github.com/SkBlaz/rakun/blob/master/mrakun/__init__.py#L29
logging.basicConfig(format='%(asctime)s - %(message)s',
                    datefmt='%d-%b-%y %H:%M:%S')
logging.getLogger().setLevel(logging.INFO)
Calling getLogger() without any arguments returns the root logger, so this line
overwrites the root logger's level against the user's wishes. This makes integrating the package into larger applications
very difficult, as it takes control of the logs away from the developer.
Although I am not sure what the best practice for setting up loggers in a package is, replacing those lines with the following significantly reduces log clutter:
logging.basicConfig(format='%(asctime)s - %(message)s', datefmt='%d-%b-%y %H:%M:%S')
logging.getLogger(__name__).setLevel(logging.INFO)
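Note that the replacement above still calls basicConfig, which attaches a handler to the root logger. The Python logging HOWTO advises libraries to attach only a NullHandler to their own named logger and to leave handlers and levels to the application. A minimal sketch of that pattern follows; the logger name "mrakun_example" and the function extract_keywords_stub are hypothetical and only illustrate the idea.

```python
import logging

# Library side: a named logger with a NullHandler.
# No basicConfig call and no change to the root logger's level.
logger = logging.getLogger("mrakun_example")  # hypothetical package name
logger.addHandler(logging.NullHandler())


def extract_keywords_stub(text):
    """Hypothetical stand-in for a library function that logs its progress."""
    tokens = text.split()
    logger.info("processing %d tokens", len(tokens))
    return tokens[:2]


if __name__ == "__main__":
    # Application side: the user decides whether and how logs are shown.
    logging.basicConfig(format="%(asctime)s - %(message)s",
                        datefmt="%d-%b-%y %H:%M:%S",
                        level=logging.INFO)
    extract_keywords_stub("rank based keyword extraction")
```

With this layout the library stays silent by default, and an application that wants the INFO messages opts in through its own logging configuration.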
I tried to run the sample code provided, but am getting the above error. Unsure how I could debug it, please do help!