crack521 / dkpro-keyphrases
Automatically exported from code.google.com/p/dkpro-keyphrases
License: Other
This unit test uses code from Stanford CoreNLP, which is not under the ASL.
Also, it does not follow the camel case convention.
Original issue reported on code.google.com by [email protected]
on 1 Oct 2013 at 3:32
There is no filter that removes from the indexes keyphrase candidates that
do not occur in a filter corpus.
Original issue reported on code.google.com by [email protected]
on 23 Jan 2014 at 12:51
DKPro Lab is now available on Maven Central; however, it is not declared
in the poms of some modules.
Original issue reported on code.google.com by [email protected]
on 20 Jun 2014 at 2:42
Since the textgraphs module uses the MTJ dependency, its license should be
changed to ASL.
Original issue reported on code.google.com by [email protected]
on 1 Oct 2013 at 5:25
There is no filter for obtaining only the keyphrases that occur within a
specific range of the text.
Original issue reported on code.google.com by [email protected]
on 23 Jan 2014 at 12:43
The class KeyphrasePerformanceCounter should be refactored because it is too
big.
Original issue reported on code.google.com by [email protected]
on 10 Nov 2013 at 2:41
Since decompounding might affect the lemmas of each compound part,
lemmatization must run after decompounding is done. The preprocessing task
should be changed accordingly.
Original issue reported on code.google.com by [email protected]
on 1 Oct 2013 at 8:17
There is no filter based on the character length of the keyphrases.
Original issue reported on code.google.com by [email protected]
on 23 Jan 2014 at 12:50
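A character-length filter of the kind requested here could look roughly like the sketch below. It works on plain strings and does not use the DKPro Keyphrases or UIMA APIs; the class name LengthFilterSketch, the method filterByLength, and the inclusive minChars/maxChars bounds are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not part of DKPro Keyphrases. */
final class LengthFilterSketch {

    /**
     * Keeps only the candidates whose character length lies in
     * [minChars, maxChars] (both bounds inclusive).
     */
    static List<String> filterByLength(List<String> candidates, int minChars, int maxChars) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            int length = candidate.length();
            if (length >= minChars && length <= maxChars) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("a", "tree", "keyphrase extraction");
        // Only "tree" has between 2 and 10 characters.
        System.out.println(filterByLength(candidates, 2, 10));
    }
}
```

In the real module such a filter would presumably operate on keyphrase annotations in the CAS rather than on a list of strings.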
There is no filter based on the frequency of the keyphrase.
Original issue reported on code.google.com by [email protected]
on 23 Jan 2014 at 12:48
Textgraphs module's pom file does not declare a dependency on the parameter
module from DKPro Core.
Original issue reported on code.google.com by [email protected]
on 20 Jun 2014 at 2:45
The package core.candidate in the core module needs better coverage. At
least 70% would be reasonable.
Original issue reported on code.google.com by [email protected]
on 23 Jan 2014 at 11:53
Some dependencies being used are out of date.
Original issue reported on code.google.com by [email protected]
on 23 Jun 2014 at 8:04
KeyphraseCandidate type is not necessary since the candidate annotator and the
filters can work on the Keyphrase type without any problem.
Original issue reported on code.google.com by [email protected]
on 31 Jan 2014 at 1:45
- Move DictionaryFilter from module wikipediafilter to core
- It should extend from AbstractFilter
Original issue reported on code.google.com by [email protected]
on 3 Feb 2014 at 10:14
This module uses the stanfordnlp module from DKPro Core, so its license
should be changed to GPL.
Original issue reported on code.google.com by [email protected]
on 1 Oct 2013 at 1:06
The lab module uses code from the Stanford CoreNLP wrapper in DKPro Core.
For this reason, its license cannot be ASL.
Original issue reported on code.google.com by [email protected]
on 1 Oct 2013 at 3:38
Currently the report contains no information about the precision at 5, 10
and 15.
Original issue reported on code.google.com by [email protected]
on 14 Nov 2013 at 2:36
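For reference, precision at k is the fraction of the top k ranked keyphrases that also appear in the gold standard. The sketch below computes it on plain strings; the class PrecisionAtK and its method are hypothetical helpers, not part of DKPro Keyphrases, and dividing by k even when fewer than k keyphrases were extracted is an assumption about the desired convention.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of DKPro Keyphrases. */
final class PrecisionAtK {

    /**
     * Fraction of the top-k ranked keyphrases that occur in the gold set.
     * Assumption: the denominator is always k, even if fewer than k
     * keyphrases were extracted.
     */
    static double precisionAtK(List<String> ranked, Set<String> gold, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        int hits = 0;
        int limit = Math.min(k, ranked.size());
        for (int i = 0; i < limit; i++) {
            if (gold.contains(ranked.get(i))) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("graph", "keyphrase", "the", "ranking", "corpus");
        Set<String> gold = new HashSet<>(Arrays.asList("keyphrase", "ranking", "extraction"));
        // Report the three cut-offs mentioned in the issue.
        for (int k : new int[] {5, 10, 15}) {
            System.out.println("P@" + k + " = " + precisionAtK(ranked, gold, k));
        }
    }
}
```

With the sample data above, two of the five ranked keyphrases are in the gold set, so the precision at 5 is 2/5.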
Implement a filter that removes candidates that either do or do not match a
POS pattern.
Original issue reported on code.google.com by [email protected]
on 29 Jan 2014 at 7:06
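One possible shape for such a filter is sketched below. It assumes each candidate comes with its POS tags as a space-separated string and that the pattern is a regular expression over that string; the class PosPatternFilterSketch, the removeMatching flag, and the tag names in the example are illustrative assumptions, not the DKPro Keyphrases API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch; not part of DKPro Keyphrases. */
final class PosPatternFilterSketch {

    /**
     * @param candidates     candidate text mapped to its space-separated POS tags
     * @param pattern        regular expression over the tag string
     * @param removeMatching if true, drop candidates that match the pattern;
     *                       if false, drop candidates that do not match
     * @return the candidate texts that survive the filter, in input order
     */
    static List<String> filter(Map<String, String> candidates, Pattern pattern,
            boolean removeMatching) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, String> entry : candidates.entrySet()) {
            boolean matches = pattern.matcher(entry.getValue()).matches();
            // Keep the candidate when its match status is the one NOT being removed.
            if (matches != removeMatching) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> candidates = new LinkedHashMap<>();
        candidates.put("keyphrase extraction", "NN NN");
        candidates.put("quickly run", "RB VB");
        candidates.put("big graph", "JJ NN");
        // Noun phrases: any number of adjectives or nouns followed by a noun.
        Pattern nounPhrase = Pattern.compile("(JJ |NN )*NN");
        System.out.println(filter(candidates, nounPhrase, false));
        System.out.println(filter(candidates, nounPhrase, true));
    }
}
```

A single boolean covers both modes from the request, so one filter class can either whitelist or blacklist a pattern.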
Hi,
I'm using DKPro Keyphrases' CooccurrenceGraphExtractor to extract keyphrases
from various texts. The keyphrase extraction for the texts is
performed sequentially. However, my Windows Task-Manager reports that the
tree-tagger processes do not terminate. So although I process the texts
sequentially, a growing number of tree-tagger processes accumulates in my RAM
until my RAM is used up completely.
My code that invokes the keyphrase extraction looks like this:
for (String text : allTexts) {
    // A new extractor is created for every text.
    CooccurrenceGraphExtractor extractor = new CooccurrenceGraphExtractor();
    extractor.setMinKeyphraseLength(2);
    extractor.setCandidate(new Candidate(CandidateType.Token, PosType.N));
    List<Keyphrase> keyphrases = extractor.extract(text);
    keyphrases = getTopRankedUniqueKeyphrases(keyphrases, keyphrases.size());
    // save text
    ...
}
Is there a way to avoid this accumulation of tree-tagger processes?
Thanks in advance.
Sincerely yours,
Laura
Original issue reported on code.google.com by [email protected]
on 8 Jul 2014 at 1:01
The purpose of KeyphraseWriter is to support evaluation. However, if we run a
keyphrase extraction pipeline for several configurations using DKPro Lab, it
is not easy to analyze the output printed to the console. KeyphraseWriter
should offer the option of writing the extracted keyphrases to a file.
Original issue reported on code.google.com by [email protected]
on 5 May 2014 at 8:02
KeyphrasePatternCounter should not be an inner class of
KeyphraseDatasetStatistics.
Original issue reported on code.google.com by [email protected]
on 6 Nov 2013 at 1:16
KeyphraseDatasetStatistics needs a unit test.
Original issue reported on code.google.com by [email protected]
on 8 Nov 2013 at 3:40
The MTJ dependency is used only by the textgraphs module, so it should be
declared only in the textgraphs module's pom file.
Original issue reported on code.google.com by [email protected]
on 30 Sep 2013 at 2:08
The core module license should be changed to ASL.
Original issue reported on code.google.com by [email protected]
on 30 Sep 2013 at 2:44
The Weka dependency is used only in the wrappers module, so it should be
declared only in the wrappers module.
Original issue reported on code.google.com by [email protected]
on 30 Sep 2013 at 1:55
The KEA dependency is used only in the wrappers module, so it should be
declared only in the wrappers module.
Original issue reported on code.google.com by [email protected]
on 30 Sep 2013 at 1:03
The KEA dependency is not on Maven Central; the release should not be blocked by that.
Original issue reported on code.google.com by [email protected]
on 23 Jun 2014 at 3:17
It would be good if the filters extended an abstract type representing a
candidate filter.
Original issue reported on code.google.com by [email protected]
on 31 Jan 2014 at 2:47
This class is not used by any other class.
Original issue reported on code.google.com by [email protected]
on 8 Nov 2013 at 3:43
DKPro Core ASL and GPL 1.6.1 are already available on maven central. Their
dependencies should be updated wherever they are declared.
Original issue reported on code.google.com by [email protected]
on 20 Jun 2014 at 2:47
The current frequency filter only works on the document text.
Original issue reported on code.google.com by [email protected]
on 3 Feb 2014 at 8:55
Since this module uses the DKPro stanfordnlp module, its license should be
changed to GPL.
Original issue reported on code.google.com by [email protected]
on 1 Oct 2013 at 12:51
No model for tf-idf is provided.
Original issue reported on code.google.com by [email protected]
on 3 Feb 2014 at 9:31
Wikipediafilter uses JWPL, so it cannot be licensed under ASL.
Original issue reported on code.google.com by [email protected]
on 1 Oct 2013 at 6:46
Several deprecated methods from uimaFIT are still being used in dkpro-keyphrases
Original issue reported on code.google.com by [email protected]
on 22 Jan 2014 at 9:49
Should we replace that with OpenNLP?
Original issue reported on code.google.com by [email protected]
on 20 Mar 2014 at 10:12
Filter factories should be moved to a factory package.
Original issue reported on code.google.com by [email protected]
on 31 Jan 2014 at 2:54
Only keyphrase candidates should be removed from the indexes, not the
keyphrases themselves. A keyphrase candidate is only a candidate, so it may
be removed from the indexes during the filtering phase. Once a candidate has
become a keyphrase, it no longer makes sense to remove it through a filter.
Therefore, the StructureFilter mode that filters out Keyphrase types should
be removed.
Original issue reported on code.google.com by [email protected]
on 29 Jan 2014 at 2:35