The dkpro-keyphrases from crack521

DecompoundingPOSTaggingTest shall be refactored

This unit test uses code from stanfordcorenlp, which is not licensed under the 
ASL. Its name also does not follow the camel case convention.

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 3:32

Implement a corpus based filter

There is no filter that removes from the indexes the keyphrase candidates that 
do not occur in a filter corpus.
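
A minimal sketch of the idea, assuming candidates and corpus phrases are plain 
strings compared case-insensitively. The class and method names are 
hypothetical and not part of the DKPro Keyphrases API; a real filter would work 
on annotations in the CAS indexes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not the DKPro Keyphrases API.
final class CorpusFilterSketch {

    // Keeps only the candidates that also occur in the filter corpus.
    // Comparison is case-insensitive (an assumption of this sketch).
    static List<String> filter(List<String> candidates, Set<String> corpusPhrases) {
        Set<String> normalized = new HashSet<>();
        for (String phrase : corpusPhrases) {
            normalized.add(phrase.toLowerCase(Locale.ROOT));
        }
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (normalized.contains(candidate.toLowerCase(Locale.ROOT))) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```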

Original issue reported on code.google.com by [email protected] on 23 Jan 2014 at 12:51

Update dkpro lab version and artifact id

DKPro Lab is now available on Maven Central; however, it is not yet declared 
in the poms of some modules.

Original issue reported on code.google.com by [email protected] on 20 Jun 2014 at 2:42

Textgraphs module should have its license changed to GPL

Since the textgraphs module uses the MTJ dependency, its license should be 
changed to GPL.

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 5:25

Implement a position filtering

There is no filter for obtaining only the keyphrases from a specific range of 
the text.
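
A minimal sketch of such a filter, assuming each candidate carries character 
offsets into the document and that the range is half-open. The `Candidate` 
class and all names here are hypothetical, not the DKPro Keyphrases types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the DKPro Keyphrases API.
final class PositionFilterSketch {

    // Hypothetical candidate: covered text plus character offsets in the document.
    static final class Candidate {
        final String text;
        final int begin;
        final int end;

        Candidate(String text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }
    }

    // Keeps the candidates that lie completely inside [rangeBegin, rangeEnd).
    static List<Candidate> filter(List<Candidate> candidates, int rangeBegin, int rangeEnd) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate candidate : candidates) {
            if (candidate.begin >= rangeBegin && candidate.end <= rangeEnd) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```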

Original issue reported on code.google.com by [email protected] on 23 Jan 2014 at 12:43

KeyphrasePerformanceCounter should be refactored

The class KeyphrasePerformanceCounter should be refactored because it is too 
big.

Original issue reported on code.google.com by [email protected] on 10 Nov 2013 at 2:41

Preprocessing task must have lemmatizing after decompounding

Since decompounding might affect the lemmas of each compound part, 
lemmatization must run after decompounding is done. The preprocessing task 
should therefore be changed.

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 8:17

Implement a character length filter

There is no filter based on the character length of the keyphrases.
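
A minimal sketch, assuming candidates are plain strings and both bounds are 
inclusive. The names are hypothetical and not part of DKPro Keyphrases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the DKPro Keyphrases API.
final class CharacterLengthFilterSketch {

    // Keeps the candidates whose character length lies in [minLength, maxLength].
    static List<String> filter(List<String> candidates, int minLength, int maxLength) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            int length = candidate.length();
            if (length >= minLength && length <= maxLength) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```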

Original issue reported on code.google.com by [email protected] on 23 Jan 2014 at 12:50

Implement a frequency filtering

There is no filter based on the frequency of the keyphrase.
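
A minimal sketch, assuming the input is the list of all candidate occurrences 
in one document and that a candidate is kept when it occurs at least 
`minFrequency` times. The names are hypothetical and not part of DKPro 
Keyphrases.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the DKPro Keyphrases API.
final class FrequencyFilterSketch {

    // Returns the distinct phrases that occur at least minFrequency times,
    // in order of first occurrence.
    static List<String> filter(List<String> occurrences, int minFrequency) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String phrase : occurrences) {
            counts.merge(phrase, 1, Integer::sum);
        }
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minFrequency) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }
}
```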

Original issue reported on code.google.com by [email protected] on 23 Jan 2014 at 12:48

Textgraphs module does not use parameter dependency

Textgraphs module's pom file does not declare a dependency on the parameter 
module from DKPro Core.

Original issue reported on code.google.com by [email protected] on 20 Jun 2014 at 2:45

core.candidate package needs a better coverage by the unit tests

The package core.candidate from the core module needs better unit test 
coverage. At least 70% would be reasonable.

Original issue reported on code.google.com by [email protected] on 23 Jan 2014 at 11:53

Some dependencies are outdated

Some dependencies being used are out of date.

Original issue reported on code.google.com by [email protected] on 23 Jun 2014 at 8:04

Remove KeyphraseCandidate type

KeyphraseCandidate type is not necessary since the candidate annotator and the 
filters can work on the Keyphrase type without any problem.

Original issue reported on code.google.com by [email protected] on 31 Jan 2014 at 1:45

Refactor DictionaryFilter

- Move DictionaryFilter from module wikipediafilter to core
- It should extend from AbstractFilter

Original issue reported on code.google.com by [email protected] on 3 Feb 2014 at 10:14

Coreference module license should be changed to gpl

This module uses stanfordnlp module from dkpro core. Its license shall be 
changed to gpl.

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 1:06

Lab module license should be changed to GPL

Lab module uses code from Stanford CoreNLP wrapper from dkpro. For this reason, 
its license cannot be ASL.

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 3:38

Precision/Recall at 5, 10 and 15 should be in the report

Currently the report contains no information about precision and recall at 5, 
10 and 15.
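
For reference, a minimal sketch of how these measures are commonly computed, 
assuming a ranked list without duplicates and exact string matching against 
the gold keyphrases. Precision at k divides by k even when fewer than k 
keyphrases were extracted, which is one common convention. The names are 
hypothetical and this is not the existing evaluation code.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the DKPro Keyphrases evaluation code.
final class PrecisionRecallAtKSketch {

    // Number of gold keyphrases among the top-k ranked keyphrases.
    // Assumes the ranked list contains no duplicates.
    static int hitsAtK(List<String> ranked, Set<String> gold, int k) {
        int limit = Math.min(k, ranked.size());
        int hits = 0;
        for (int i = 0; i < limit; i++) {
            if (gold.contains(ranked.get(i))) {
                hits++;
            }
        }
        return hits;
    }

    // Precision at k: hits in the top k divided by k.
    static double precisionAtK(List<String> ranked, Set<String> gold, int k) {
        return k <= 0 ? 0.0 : (double) hitsAtK(ranked, gold, k) / k;
    }

    // Recall at k: hits in the top k divided by the number of gold keyphrases.
    static double recallAtK(List<String> ranked, Set<String> gold, int k) {
        return gold.isEmpty() ? 0.0 : (double) hitsAtK(ranked, gold, k) / gold.size();
    }
}
```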

Original issue reported on code.google.com by [email protected] on 14 Nov 2013 at 2:36

Implement a pos filter

Implement a filter which removes candidates that either do or do not match a 
pos pattern.
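
A minimal sketch, assuming each candidate exposes its POS tags as a 
space-separated string and the pattern is a regular expression over that 
string. The `Candidate` class, the tag names and the `keepMatching` flag are 
hypothetical, not the DKPro Keyphrases types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; not the DKPro Keyphrases API.
final class PosFilterSketch {

    // Hypothetical candidate: covered text plus its POS tags, e.g. "ADJ N".
    static final class Candidate {
        final String text;
        final String posSequence;

        Candidate(String text, String posSequence) {
            this.text = text;
            this.posSequence = posSequence;
        }
    }

    // keepMatching = true keeps the candidates whose whole POS sequence matches
    // the pattern; keepMatching = false keeps the ones that do not match.
    static List<Candidate> filter(List<Candidate> candidates, Pattern posPattern,
            boolean keepMatching) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate candidate : candidates) {
            boolean matches = posPattern.matcher(candidate.posSequence).matches();
            if (matches == keepMatching) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```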

Original issue reported on code.google.com by [email protected] on 29 Jan 2014 at 7:06

Tree-tagger Processes don't terminate

Hi,

I'm using DKPro Keyphrases' CooccurrenceGraphExtractor to extract keyphrases 
from various texts. The keyphrase extraction for the texts is
performed sequentially. However, my Windows Task-Manager reports that the 
tree-tagger processes do not terminate. So although I process the texts 
sequentially, a growing number of tree-tagger processes accumulates in my RAM 
until my RAM is used up completely.

My code that invokes the keyphrase extraction looks like this:


for (String text : allTexts) {
    CooccurrenceGraphExtractor extractor = new CooccurrenceGraphExtractor();
    extractor.setMinKeyphraseLength(2);
    extractor.setCandidate(new Candidate(CandidateType.Token, PosType.N));
    List<Keyphrase> keyphrases = extractor.extract(text);
    keyphrases = getTopRankedUniqueKeyphrases(keyphrases, keyphrases.size());

    // save text 
    ...
}

Is there a way to avoid this accumulation of tree-tagger processes?
Thanks in advance.

Sincerely yours,
Laura




Original issue reported on code.google.com by [email protected] on 8 Jul 2014 at 1:01

KeyphraseWriter should also output the keyphrases to a file

The purpose of KeyphraseWriter is to support evaluation. However, if we run a 
keyphrase extraction pipeline for several configurations using dkpro lab, it 
is not easy to analyze the outcome printed to the console. KeyphraseWriter 
should offer the option of writing the extracted keyphrases to a file.
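
A minimal sketch of the requested behaviour, assuming keyphrases are plain 
strings written one per line and that a missing target file means "print to 
the console". The class and method names are hypothetical; the real 
KeyphraseWriter is a UIMA component and would take the file as a parameter.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch; not the actual KeyphraseWriter.
final class KeyphraseOutputSketch {

    // Writes one keyphrase per line to the given writer and flushes it.
    static void writeTo(List<String> keyphrases, PrintWriter out) {
        for (String keyphrase : keyphrases) {
            out.print(keyphrase);
            out.print('\n');
        }
        out.flush();
    }

    // Writes to targetFile if one is given, otherwise to the console.
    static void writeToFileOrConsole(List<String> keyphrases, Path targetFile) {
        if (targetFile == null) {
            writeTo(keyphrases, new PrintWriter(System.out));
            return;
        }
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(targetFile, StandardCharsets.UTF_8))) {
            writeTo(keyphrases, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```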

Original issue reported on code.google.com by [email protected] on 5 May 2014 at 8:02

KeyphrasePatternCounter should not be an inner-class in KeyphraseDatasetStatistics

KeyphrasePatternCounter should not be an inner-class in 
KeyphraseDatasetStatistics

Original issue reported on code.google.com by [email protected] on 6 Nov 2013 at 1:16

KeyphraseDatasetStatistics needs to have unit test

KeyphraseDatasetStatistics needs to have unit test

Original issue reported on code.google.com by [email protected] on 8 Nov 2013 at 3:40

Mtj dependency should not be on core pom file

The mtj dependency is used only by the textgraphs module. It should therefore 
be declared only in the textgraphs module pom file.

Original issue reported on code.google.com by [email protected] on 30 Sep 2013 at 2:08

Core module license should be changed

The core module license should be changed to ASL.

Original issue reported on code.google.com by [email protected] on 30 Sep 2013 at 2:44

Weka dependency should not be on core pom file

The weka dependency is used only in the wrappers module. It should therefore 
be declared only in the wrappers module pom file.

Original issue reported on code.google.com by [email protected] on 30 Sep 2013 at 1:55

Kea dependency should not be on core pom file

The kea dependency is used only in the wrappers module. It should therefore be 
declared only in the wrappers module pom file.

Original issue reported on code.google.com by [email protected] on 30 Sep 2013 at 1:03

Drop kea wrapper

The Kea dependency is not on Maven Central, and the release should not be 
blocked by that.

Original issue reported on code.google.com by [email protected] on 23 Jun 2014 at 3:17

Create an abstract type for the CandidateFilters

It would be good if the filters extended an abstract type representing a 
candidate filter.
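
A minimal sketch of what such a type could look like, assuming candidates are 
plain strings and each concrete filter only decides whether a single candidate 
is kept. Both class names are hypothetical; the real filters operate on UIMA 
annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstract type: subclasses only implement the accept decision.
abstract class CandidateFilterSketch {

    // Returns true if the candidate should be kept.
    protected abstract boolean accept(String candidate);

    // Shared filtering loop, identical for every concrete filter.
    final List<String> apply(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (accept(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}

// Example subclass: keeps candidates with at least minLength characters.
final class MinLengthFilterSketch extends CandidateFilterSketch {

    private final int minLength;

    MinLengthFilterSketch(int minLength) {
        this.minLength = minLength;
    }

    @Override
    protected boolean accept(String candidate) {
        return candidate.length() >= minLength;
    }
}
```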

Original issue reported on code.google.com by [email protected] on 31 Jan 2014 at 2:47

KeyphrasePatternCounter should be removed

This class is not used by any other class.

Original issue reported on code.google.com by [email protected] on 8 Nov 2013 at 3:43

Update DKPro Core version to 1.6.1

DKPro Core ASL and GPL 1.6.1 are already available on maven central. Their 
dependencies should be updated wherever they are declared.

Original issue reported on code.google.com by [email protected] on 20 Jun 2014 at 2:47

Add FrequencyFilters with in-document and external frequency

The current frequency filter only works on the document text.

Original issue reported on code.google.com by [email protected] on 3 Feb 2014 at 8:55

Bookindexing license should be changed to gpl

Since this module uses dkpro stanfordnlp module, its license should be changed 
to gpl.

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 12:51

Tests for variants of TfidfRanker fail

No model for tf-idf is provided.

Original issue reported on code.google.com by [email protected] on 3 Feb 2014 at 9:31

Wikipediafilter module needs to have its license changed

Wikipediafilter uses JWPL, so it cannot be licensed under ASL.

Original issue reported on code.google.com by [email protected] on 1 Oct 2013 at 6:46

Remove deprecated methods from code

Several deprecated methods from uimaFIT are still being used in dkpro-keyphrases

Original issue reported on code.google.com by [email protected] on 22 Jan 2014 at 9:49

Wrappers rely on TreeTagger which is difficult to use for novice users

Should we replace that with OpenNLP?

Original issue reported on code.google.com by [email protected] on 20 Mar 2014 at 10:12

Move Filter Factories to factory package

Filter factories should be moved to a factory package.

Original issue reported on code.google.com by [email protected] on 31 Jan 2014 at 2:54

Remove the Keyphrase mode from StructureFilter

Only keyphrase candidates should be removed from the indexes, not the 
keyphrases themselves. A keyphrase candidate is only a candidate, so it may be 
removed from the indexes during the filtering phase. Once a candidate has 
become a keyphrase, it no longer makes sense to remove it through a filter. 
The StructureFilter mode that filters out Keyphrase types should therefore be 
removed.

Original issue reported on code.google.com by [email protected] on 29 Jan 2014 at 2:35
