
apriori's People

Contributors

falcaopetri, michael-rapp

apriori's Issues

Running of Program

Hi Michael,

I have been trying to run your program in several different ways. Could you possibly provide a comprehensive, step-by-step list of commands (including prerequisites such as Kotlin) that I can use to run the program?

Item-Iterator should check hasNext() and not be reused

This was a tricky bug to find:

while ((transaction = iterator.next()) != null) {

My first ugly work-around looks like this:

        Collection<Transaction<Article>> customers = getCustomers();
        customers.add(null);

        Iterator<Transaction<Article>> iterator = customers.iterator();
        Output<Article> output = apriori.execute(iterator);

Edit: this workaround doesn't work either. Iterators are also reused, see below.
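
For illustration, here is a minimal sketch of the loop shape the issue title asks for, written against plain java.util collections rather than the library itself. The class and method names (IteratorLoopSketch, drain, nullTerminatedLoopThrows) are made up for this example. It shows that a standard iterator throws NoSuchElementException instead of returning null at the end, and that an exhausted iterator yields nothing on a second pass, so each pass over the data needs a fresh iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorLoopSketch {

    // Drains an iterator the way the java.util.Iterator contract intends:
    // hasNext() decides when to stop, next() is only called after it returned true.
    static <T> List<T> drain(Iterator<T> iterator) {
        List<T> result = new ArrayList<>();
        while (iterator.hasNext()) {
            result.add(iterator.next());
        }
        return result;
    }

    // Mirrors the loop quoted in this issue. Standard collection iterators do not
    // return null to signal the end; they throw NoSuchElementException instead.
    static <T> boolean nullTerminatedLoopThrows(Iterator<T> iterator) {
        try {
            while (iterator.next() != null) {
                // a transaction would be processed here
            }
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> transactions =
                new ArrayList<>(Arrays.asList("bread butter", "coffee milk"));

        // The null-terminated loop runs off the end of a standard iterator.
        System.out.println(nullTerminatedLoopThrows(transactions.iterator())); // true

        Iterator<String> iterator = transactions.iterator();
        System.out.println(drain(iterator).size()); // 2
        System.out.println(drain(iterator).size()); // 0, the iterator is exhausted

        // A second pass over the data needs a fresh iterator.
        System.out.println(drain(transactions.iterator()).size()); // 2
    }
}
```

Accepting an Iterable (or anything else that can hand out a new iterator per pass) instead of a single Iterator would be one way to avoid the reuse problem.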

ruleSet is null

The JUnit test code is:

File inputFile = new File("c:/data1.txt");
double minSupport = 0.1;
double minConfidence = 0.2;
Apriori<NamedItem> apriori = new Apriori.Builder<NamedItem>(minSupport)
    .generateRules(minConfidence).create();
Iterator<Transaction<NamedItem>> iterator = new DataIterator(inputFile);
Output<NamedItem> output = apriori.execute(iterator);
RuleSet<NamedItem> ruleSet = output.getRuleSet();
		
if (ruleSet != null) {
    Iterator<AssociationRule<NamedItem>> iteratorItemSet = ruleSet.iterator();

    while (iteratorItemSet.hasNext()) {
        AssociationRule<NamedItem> itemSet = iteratorItemSet.next();
        System.out.println("result ............." + itemSet.toString());
    }
} else {
    System.out.println("ruleSet is null");
}

The content of data1.txt is:

bread   butter  sugar
coffee  milk    sugar
bread   coffee  milk    sugar
coffee  milk

The output is:

ruleSet is null

The support of {coffee, milk} is 3/4 = 0.75 and the confidence of "coffee -> milk" is 3/3 = 1. Both satisfy minSupport = 0.1 and minConfidence = 0.2, so the rule set should contain at least "coffee -> milk".
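
The arithmetic above can be checked independently of the library. The following is a minimal sketch that computes support and confidence directly from the four transactions in data1.txt; the class SupportConfidenceSketch and its helpers are hypothetical and only use java.util.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SupportConfidenceSketch {

    // Convenience helper to build an item set from names.
    static Set<String> items(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Fraction of transactions that contain every item of the given item set.
    static double support(List<Set<String>> transactions, Set<String> itemSet) {
        int hits = 0;
        for (Set<String> transaction : transactions) {
            if (transaction.containsAll(itemSet)) {
                hits++;
            }
        }
        return (double) hits / transactions.size();
    }

    // confidence(body -> head) = support(body and head together) / support(body)
    static double confidence(List<Set<String>> transactions,
                             Set<String> body, Set<String> head) {
        Set<String> union = new HashSet<>(body);
        union.addAll(head);
        return support(transactions, union) / support(transactions, body);
    }

    // The four transactions from data1.txt as listed in this issue.
    static List<Set<String>> data1() {
        return Arrays.asList(
                items("bread", "butter", "sugar"),
                items("coffee", "milk", "sugar"),
                items("bread", "coffee", "milk", "sugar"),
                items("coffee", "milk"));
    }

    public static void main(String[] args) {
        List<Set<String>> transactions = data1();
        System.out.println(support(transactions, items("coffee", "milk")));              // 0.75
        System.out.println(confidence(transactions, items("coffee"), items("milk")));    // 1.0
    }
}
```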

Nullpointer Exception

I'm using the following dataset as input with a minSupport of 0.5 and a minConfidence of 1.0 (version 1.3.0):

0 1 2 3
0 1 2 3
0 1 3 4 5
0 1 4
0 1 4

This dataset produces the following NullPointerException:

java.lang.NullPointerException
    at de.mrapp.apriori.modules.FrequentItemSetMinerModule.generateInitialItemSets(FrequentItemSetMinerModule.java:70)
    at de.mrapp.apriori.modules.FrequentItemSetMinerModule.findFrequentItemSets(FrequentItemSetMinerModule.java:230)
    at de.mrapp.apriori.tasks.FrequentItemSetMinerTask.findFrequentItemSets(FrequentItemSetMinerTask.java:104)
    at de.mrapp.apriori.Apriori.execute(Apriori.java:830)

If one of the first two identical entries is removed, the algorithm works fine:

0 1 2 3
0 1 3 4 5
0 1 4
0 1 4

Cannot access de.mrapp.util.datastructure.SortedArraySet

Hi. I'm getting an error: Cannot access de.mrapp.util.datastructure.SortedArraySet. Can you help me?

This is my code:

double minSupport = 0.5;
Apriori apriori = new Apriori.Builder(minSupport).create();
Iterable<Transaction> iterable = () -> new DataIterator(new File("sad.txt"));
Output output = apriori.execute(iterable);
FrequentItemSets frequentItemSets = output.getRuleSet();

Frequent item sets missing

The JUnit test code is:

File inputFile = new File("c:/data1.txt");
double minSupport = 0.2;
Apriori<NamedItem> apriori = new Apriori.Builder<NamedItem>(minSupport).create();
Iterator<Transaction<NamedItem>> iterator = new DataIterator(inputFile);
Output<NamedItem> output = apriori.execute(iterator);
SortedSet<ItemSet<NamedItem>> frequentItemSets = output.getFrequentItemSets();		
System.out.println("frequentItemSets.size():" + frequentItemSets.size());		
Iterator<ItemSet<NamedItem>> iteratorItemSet = frequentItemSets.iterator();

while (iteratorItemSet.hasNext()) {
    ItemSet<NamedItem> itemSet = iteratorItemSet.next();
    System.out.println("result ............."+ itemSet.toString());
}

The content of data1.txt is:

# Test data for the Apriori algorithm
# One transaction per line, items are separated with whitespaces

bread   butter  sugar
coffee  milk    sugar
bread   coffee  milk    sugar
coffee  milk

The output is:

result .............[milk]
result .............[coffee, milk, sugar]
result .............[bread, coffee, milk, sugar]

the support of bread is 2/4=0.5;
the support of coffee is 3/4=0.75;
the support of sugar is 3/4=0.75;
the support of milk is 3/4=0.75;

All of these are greater than minSupport = 0.2, but the only single-item set in the result is [milk].
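
As a cross-check of the supports listed above, here is a minimal sketch that parses the file content shown in this issue and computes the support of every single item. It assumes that lines starting with '#' and blank lines are skipped, as the file's own header suggests; how the library's DataIterator treats such lines is not shown here. ItemSupportSketch and itemSupports are hypothetical names.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class ItemSupportSketch {

    // Computes the support of every single item in the given file content.
    // Assumption: '#' comment lines and blank lines are not transactions.
    static Map<String, Double> itemSupports(String content) {
        Map<String, Integer> counts = new TreeMap<>();
        int transactions = 0;
        for (String line : content.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            transactions++;
            // A TreeSet ensures an item is counted once per transaction.
            for (String item : new TreeSet<>(Arrays.asList(trimmed.split("\\s+")))) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        Map<String, Double> supports = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            supports.put(entry.getKey(), (double) entry.getValue() / transactions);
        }
        return supports;
    }

    public static void main(String[] args) {
        String content = String.join("\n",
                "# Test data for the Apriori algorithm",
                "# One transaction per line, items are separated with whitespaces",
                "",
                "bread   butter  sugar",
                "coffee  milk    sugar",
                "bread   coffee  milk    sugar",
                "coffee  milk");
        // prints {bread=0.5, butter=0.25, coffee=0.75, milk=0.75, sugar=0.75}
        System.out.println(itemSupports(content));
    }
}
```

Under that parsing assumption, all five items reach minSupport = 0.2, so each of them would be expected among the frequent one-item sets.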

frequentItemSetCount incorrect

The JUnit test code is:

File inputFile = new File("c:/data1.txt");
int frequentItemSetCount=1;
Apriori<NamedItem> apriori = new Apriori.Builder<>(frequentItemSetCount)
    .supportDelta(0.1).maxSupport(1.0).minSupport(0.0).create();
Iterator<Transaction<NamedItem>> iterator = new DataIterator(inputFile);
Output<NamedItem> output = apriori.execute(iterator);
SortedSet<ItemSet<NamedItem>> frequentItemSets = output.getFrequentItemSets();		
System.out.println("frequentItemSets.size():"+frequentItemSets.size());	
Iterator<ItemSet<NamedItem>> iteratorItemSet = frequentItemSets.iterator();

while (iteratorItemSet.hasNext()) {
    ItemSet<NamedItem> itemSet = (ItemSet<NamedItem>) iteratorItemSet.next();
    System.out.println("result ............."+ itemSet.toString());
}

The content of data1.txt is:

# Test data for the Apriori algorithm
# One transaction per line, items are separated with whitespaces

bread   butter  sugar
coffee  milk    sugar
bread   coffee  milk    sugar
coffee  milk

The output is:

frequentItemSets.size():4
result .............[coffee, milk]
result .............[sugar]
result .............[milk]
result .............[coffee]

frequentItemSetCount = 1, but frequentItemSets.size() = 4.

Add functionality to easily sort/filter frequent item sets

As pointed out in #2, it might be useful to provide functionality for sorting and/or filtering the frequent item sets contained in the Apriori algorithm's Output. This would require a custom implementation of the type SortedSet that provides sort and filter methods, as the class RuleSet does.
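
Until such functionality exists, sorting and filtering can be done on the caller's side. The following is a rough sketch using java.util.stream on a hypothetical stand-in type, ScoredItemSet, which is not the library's ItemSet class; the method names are likewise invented for this example. The supports in main are the values for data1.txt from the issues above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class ItemSetSortFilterSketch {

    // Hypothetical stand-in for an item set together with its support.
    static final class ScoredItemSet {
        final Set<String> items;
        final double support;

        ScoredItemSet(double support, String... items) {
            this.items = new TreeSet<>(Arrays.asList(items));
            this.support = support;
        }

        @Override
        public String toString() {
            return items + " (" + support + ")";
        }
    }

    // Returns a new list ordered by support, highest first.
    static List<ScoredItemSet> sortBySupportDescending(Collection<ScoredItemSet> itemSets) {
        return itemSets.stream()
                .sorted(Comparator.comparingDouble((ScoredItemSet s) -> s.support).reversed())
                .collect(Collectors.toList());
    }

    // Returns only the item sets that contain at least minSize items.
    static List<ScoredItemSet> filterByMinSize(Collection<ScoredItemSet> itemSets, int minSize) {
        return itemSets.stream()
                .filter(s -> s.items.size() >= minSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ScoredItemSet> itemSets = Arrays.asList(
                new ScoredItemSet(0.75, "coffee", "milk"),
                new ScoredItemSet(0.5, "bread"),
                new ScoredItemSet(0.75, "sugar"),
                new ScoredItemSet(0.5, "bread", "sugar"));

        System.out.println(sortBySupportDescending(itemSets));
        System.out.println(filterByMinSize(itemSets, 2));
    }
}
```

Returning new lists rather than mutating the input keeps the original collection intact, which matters if the Output's set is meant to stay in its natural order.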
