michael-rapp / apriori Goto Github PK

A Java implementation of the Apriori algorithm for finding frequent item sets and (optionally) generating association rules

License: Apache License 2.0

Kotlin 100.00%

apriori association-rules frequent-itemsets data-mining machine-learning


apriori's Issues

Running of Program

Hi Michael,

I have been trying to run your program in several different ways. Could you possibly provide a comprehensive, step-by-step list of commands (including prerequisites such as Kotlin) that I can use to execute the program?

Item-Iterator should check hasNext() and not be reused

This was a tricky bug to find:

Apriori/src/main/java/de/mrapp/apriori/modules/FrequentItemSetMinerModule.java

Line 70 in b03b7cb

while ((transaction = iterator.next()) != null) {

My first ugly work-around looks like this:

        Collection<Transaction<Article>> customers = getCustomers();
        customers.add(null);

        Iterator<Transaction<Article>> iterator = customers.iterator();
        Output<Article> output = apriori.execute(iterator);

Edit: this work-around doesn't work. Iterators are also reused, see below.
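To illustrate the point of this report, here is a minimal sketch of the loop idiom the title asks for. It is not the library's code: the class `IteratorLoopDemo` and its methods are hypothetical names, and plain lists of strings stand in for transactions. It relies only on the `java.util.Iterator` contract, under which `next()` on an exhausted iterator throws `NoSuchElementException` rather than returning null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorLoopDemo {

    // Counts the remaining elements, checking hasNext() before each next().
    // Comparing the result of next() against null is not a reliable end test.
    static <T> int countWithHasNext(Iterator<T> iterator) {
        int count = 0;
        while (iterator.hasNext()) {
            iterator.next();
            count++;
        }
        return count;
    }

    // Makes several passes over the data. Every pass requests a fresh iterator
    // from the Iterable instead of reusing one that is already exhausted.
    static <T> int countOverPasses(Iterable<T> data, int passes) {
        int total = 0;
        for (int pass = 0; pass < passes; pass++) {
            total += countWithHasNext(data.iterator());
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> transactions = new ArrayList<>();
        transactions.add(Arrays.asList("bread", "butter"));
        transactions.add(Arrays.asList("coffee", "milk"));

        Iterator<List<String>> once = transactions.iterator();
        System.out.println(countWithHasNext(once)); // first pass sees every transaction
        System.out.println(countWithHasNext(once)); // the reused iterator yields nothing
        System.out.println(countOverPasses(transactions, 2));
    }
}
```

Passing an `Iterable` rather than a single `Iterator` is what makes repeated passes possible, which matters for an algorithm that scans the transactions more than once.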

ruleSet is null

The JUnit test code is:

File inputFile = new File("c:/data1.txt");
double minSupport = 0.1;
double minConfidence = 0.2;
Apriori<NamedItem> apriori = new Apriori.Builder<NamedItem>(minSupport)
    .generateRules(minConfidence).create();
Iterator<Transaction<NamedItem>> iterator = new DataIterator(inputFile);
Output<NamedItem> output = apriori.execute(iterator);
RuleSet<NamedItem> ruleSet = output.getRuleSet();
		
if (ruleSet != null) {
    Iterator<AssociationRule<NamedItem>> iteratorItemSet = ruleSet.iterator();

    while (iteratorItemSet.hasNext()) {
        AssociationRule<NamedItem> itemSet = iteratorItemSet.next();
        System.out.println("result ............." + itemSet.toString());
    }
} else {
    System.out.println("ruleSet is null");
}

data1.txt content is:

bread   butter  sugar
coffee  milk    sugar
bread   coffee  milk    sugar
coffee  milk

The output is:

ruleSet is null

The support of {coffee, milk} is 3/4 = 0.75 and the confidence of "coffee -> milk" is 3/3 = 1. Both satisfy minSupport = 0.1 and minConfidence = 0.2, so ruleSet should contain at least "coffee -> milk".
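The arithmetic in this report can be checked independently of the library. The following is a minimal sketch that computes support and confidence directly from their definitions on the four transactions of data1.txt. The class `SupportConfidenceDemo` and its methods are hypothetical helpers written for this illustration; they do not use or mirror the library's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SupportConfidenceDemo {

    // Support: fraction of transactions that contain all of the given items.
    static double support(List<Set<String>> transactions, Set<String> items) {
        int matches = 0;
        for (Set<String> transaction : transactions) {
            if (transaction.containsAll(items)) {
                matches++;
            }
        }
        return (double) matches / transactions.size();
    }

    // Confidence of "body -> head": support(body and head) / support(body).
    // The result is NaN if the body never occurs in the data.
    static double confidence(List<Set<String>> transactions,
                             Set<String> body, Set<String> head) {
        Set<String> union = new HashSet<>(body);
        union.addAll(head);
        return support(transactions, union) / support(transactions, body);
    }

    static Set<String> setOf(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    // The four transactions from data1.txt.
    static List<Set<String>> sampleData() {
        List<Set<String>> data = new ArrayList<>();
        data.add(setOf("bread", "butter", "sugar"));
        data.add(setOf("coffee", "milk", "sugar"));
        data.add(setOf("bread", "coffee", "milk", "sugar"));
        data.add(setOf("coffee", "milk"));
        return data;
    }

    public static void main(String[] args) {
        List<Set<String>> data = sampleData();
        System.out.println(support(data, setOf("coffee", "milk")));        // 0.75
        System.out.println(confidence(data, setOf("coffee"), setOf("milk"))); // 1.0
    }
}
```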

Nullpointer Exception

I'm using the following dataset as input with a minSupport of 0.5 and a minConfidence of 1.0 (version 1.3.0):

0 1 2 3
0 1 2 3
0 1 3 4 5
0 1 4
0 1 4

This dataset produces the following NullPointerException:

java.lang.NullPointerException
    at de.mrapp.apriori.modules.FrequentItemSetMinerModule.generateInitialItemSets(FrequentItemSetMinerModule.java:70)
    at de.mrapp.apriori.modules.FrequentItemSetMinerModule.findFrequentItemSets(FrequentItemSetMinerModule.java:230)
    at de.mrapp.apriori.tasks.FrequentItemSetMinerTask.findFrequentItemSets(FrequentItemSetMinerTask.java:104)
    at de.mrapp.apriori.Apriori.execute(Apriori.java:830)

If one of the first two identical entries is removed, the algorithm works fine:

0 1 2 3
0 1 3 4 5
0 1 4
0 1 4

Cannot access de.mrapp.util.datastructure.SortedArraySet

Hi. I got an error: Cannot access de.mrapp.util.datastructure.SortedArraySet. Can you help me?

This is my code:

double minSupport = 0.5;
Apriori apriori = new Apriori.Builder(minSupport).create();
Iterable<Transaction> iterable = () -> new DataIterator(new File("sad.txt"));
Output output = apriori.execute(iterable);
FrequentItemSets frequentItemSets = output.getRuleSet();

Frequent item sets missing

The JUnit test code is:

File inputFile = new File("c:/data1.txt");
double minSupport = 0.2;
Apriori<NamedItem> apriori = new Apriori.Builder<NamedItem>(minSupport).create();
Iterator<Transaction<NamedItem>> iterator = new DataIterator(inputFile);
Output<NamedItem> output = apriori.execute(iterator);
SortedSet<ItemSet<NamedItem>> frequentItemSets = output.getFrequentItemSets();		
System.out.println("frequentItemSets.size():" + frequentItemSets.size());		
Iterator<ItemSet<NamedItem>> iteratorItemSet = frequentItemSets.iterator();

while (iteratorItemSet.hasNext()) {
    ItemSet<NamedItem> itemSet = iteratorItemSet.next();
    System.out.println("result ............."+ itemSet.toString());
}

data1.txt content is:

# Test data for the Apriori algorithm
# One transaction per line, items are separated with whitespaces

bread   butter  sugar
coffee  milk    sugar
bread   coffee  milk    sugar
coffee  milk

The output is:

result .............[milk]
result .............[coffee, milk, sugar]
result .............[bread, coffee, milk, sugar]

the support of bread is 2/4=0.5;
the support of coffee is 3/4=0.75;
the support of sugar is 3/4=0.75;
the support of milk is 3/4=0.75;

All of these are greater than minSupport = 0.2, but the only single-item set in the result is [milk].
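The per-item supports listed above can be reproduced with a short counting pass. The sketch below is a hypothetical helper (`ItemSupportDemo` is not part of the library) that tallies in how many transactions each item occurs and divides by the number of transactions, using the data rows of data1.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ItemSupportDemo {

    // Returns the support of every single item, sorted by item name.
    static Map<String, Double> itemSupports(List<List<String>> transactions) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> transaction : transactions) {
            // De-duplicate so that an item counts once per transaction.
            for (String item : new HashSet<>(transaction)) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        Map<String, Double> supports = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            supports.put(entry.getKey(),
                    (double) entry.getValue() / transactions.size());
        }
        return supports;
    }

    // The four data rows from data1.txt (comment lines omitted).
    static List<List<String>> sampleData() {
        List<List<String>> data = new ArrayList<>();
        data.add(Arrays.asList("bread", "butter", "sugar"));
        data.add(Arrays.asList("coffee", "milk", "sugar"));
        data.add(Arrays.asList("bread", "coffee", "milk", "sugar"));
        data.add(Arrays.asList("coffee", "milk"));
        return data;
    }

    public static void main(String[] args) {
        System.out.println(itemSupports(sampleData()));
    }
}
```

Running it prints `{bread=0.5, butter=0.25, coffee=0.75, milk=0.75, sugar=0.75}`, so every single item, including butter, clears a minimum support of 0.2.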

frequentItemSetCount incorrect

The JUnit test code is:

File inputFile = new File("c:/data1.txt");
int frequentItemSetCount=1;
Apriori<NamedItem> apriori = new Apriori.Builder<>(frequentItemSetCount)
    .supportDelta(0.1).maxSupport(1.0).minSupport(0.0).create();
Iterator<Transaction<NamedItem>> iterator = new DataIterator(inputFile);
Output<NamedItem> output = apriori.execute(iterator);
SortedSet<ItemSet<NamedItem>> frequentItemSets = output.getFrequentItemSets();		
System.out.println("frequentItemSets.size():"+frequentItemSets.size());	
Iterator<ItemSet<NamedItem>> iteratorItemSet = frequentItemSets.iterator();

while (iteratorItemSet.hasNext()) {
    ItemSet<NamedItem> itemSet = (ItemSet<NamedItem>) iteratorItemSet.next();
    System.out.println("result ............."+ itemSet.toString());
}

data1.txt content is:

# Test data for the Apriori algorithm
# One transaction per line, items are separated with whitespaces

bread   butter  sugar
coffee  milk    sugar
bread   coffee  milk    sugar
coffee  milk

The output is:

frequentItemSets.size():4
result .............[coffee, milk]
result .............[sugar]
result .............[milk]
result .............[coffee]

frequentItemSetCount = 1, but frequentItemSets.size() = 4.

Add functionality to easily sort/filter frequent item sets

As pointed out in #2, it might be useful to provide functionality to sort and/or filter the frequent item sets contained in the Apriori algorithm's Output. This would require creating a custom implementation of the type SortedSet that provides sort and filter methods, as the class RuleSet does.
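Until such functionality exists, sorting and filtering can be done on a copy of the results. The following is only a rough sketch of the idea using standard Java collections and streams: `ScoredItemSet` is a hypothetical stand-in for an item set together with its support, not the library's ItemSet class, and the method names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class ItemSetSortFilterDemo {

    // Hypothetical stand-in: a list of items plus the support of that set.
    static final class ScoredItemSet {
        final List<String> items;
        final double support;

        ScoredItemSet(double support, String... items) {
            this.items = Arrays.asList(items);
            this.support = support;
        }

        @Override
        public String toString() {
            return items + " (support " + support + ")";
        }
    }

    // Returns a new list ordered by support, highest first (stable sort).
    static List<ScoredItemSet> sortBySupportDescending(List<ScoredItemSet> sets) {
        List<ScoredItemSet> sorted = new ArrayList<>(sets);
        sorted.sort(Comparator
                .comparingDouble((ScoredItemSet s) -> s.support)
                .reversed());
        return sorted;
    }

    // Keeps only the item sets that contain at least minSize items.
    static List<ScoredItemSet> filterByMinSize(List<ScoredItemSet> sets, int minSize) {
        return sets.stream()
                .filter(s -> s.items.size() >= minSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ScoredItemSet> sets = new ArrayList<>();
        sets.add(new ScoredItemSet(0.5, "bread"));
        sets.add(new ScoredItemSet(0.75, "coffee", "milk"));
        sets.add(new ScoredItemSet(0.75, "sugar"));

        System.out.println(sortBySupportDescending(sets));
        System.out.println(filterByMinSize(sets, 2));
    }
}
```

Both helpers return new lists and leave the input untouched, so they can be chained in either order.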

java.lang.NoClassDefFoundError: de/mrapp/util/Condition

It seems that the Condition class is missing. I cannot find it in the jar file or your repository either.
