
dci's People

Contributors

gahag


dci's Issues

Performance for converting transactions to BitMatrix

I am not familiar with the BitMatrix used in this library. If I want to pass a list of transactions from Python through PyO3, would converting them from Vec<Vec<i64>> to a BitMatrix add too much overhead, given that I have to set the values one by one?
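To make the cost of that conversion concrete, here is a minimal sketch of packing transactions into bit rows. It does not use the library's BitMatrix, whose API is not shown in this thread; PackedRows, from_transactions, and get are hypothetical names used only for illustration. Under those assumptions, each item costs one bounds check and one OR, so the conversion is linear in the total number of item occurrences.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a bit matrix (NOT the library's BitMatrix):
/// one row per transaction, each row stored as `words_per_row` u64 words.
pub struct PackedRows {
    pub words_per_row: usize,
    pub words: Vec<u64>,
}

impl PackedRows {
    /// Packs the transactions into bits. Every item must lie in
    /// `0..n_features`; anything else is reported as an error.
    pub fn from_transactions(
        transactions: &[Vec<i64>],
        n_features: usize,
    ) -> Result<Self, String> {
        // Round up so that every feature index has a bit available.
        let words_per_row = (n_features + 63) / 64;
        let mut words = vec![0u64; words_per_row * transactions.len()];

        for (row, items) in transactions.iter().enumerate() {
            for &item in items {
                // Reject negative or too-large items instead of wrapping.
                let ix = usize::try_from(item)
                    .ok()
                    .filter(|&ix| ix < n_features)
                    .ok_or_else(|| {
                        format!("item {} out of range in row {}", item, row)
                    })?;
                // One OR per item: word index is ix / 64, bit is ix % 64.
                words[row * words_per_row + ix / 64] |= 1u64 << (ix % 64);
            }
        }

        Ok(Self { words_per_row, words })
    }

    /// Returns true if transaction `row` contains feature `ix`.
    pub fn get(&self, row: usize, ix: usize) -> bool {
        self.words[row * self.words_per_row + ix / 64] & (1u64 << (ix % 64)) != 0
    }
}
```

With 300 features, each row in this sketch takes five u64 words (40 bytes), so 100000 rows would occupy roughly 4 MB. Whether the library's own BitMatrix has a comparable layout is an assumption that would need to be checked against its source.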

How do I extend ItemSet8 to support many more features?

Say I have a transaction list with 100000 rows and 300 unique features:

[[25, 14, 22], [8, 23], [8, 1, 10, 27, 0], [12, 26, 6, 7, 18, 2], [2, 21], [1, 13, 27], [25, 24, 11, 8, 22, 27], [0, 21, 23, 28], [21, 7, 11, 26, 16], [24], [19, 3, 2], [6], [5, 18, 0, 1], [11, 9, 17], [12, 14, 5, 17], [11], [], [4, 0], [24, 9, 12], [13, 25, 8, 23], [26, 4], [22, 1], [20], [4, 0], [19, 16, 1, 27, 22, 25], [5, 26], [9, 17, 25], [8, 3, 10], [26, 16, 11], [20, 14, 9, 6], [17, 5, 26, 6], [21], [0, 17], [15, 23, 2], [17, 29, 22, 20, 6, 13], [8, 20, 7, 14, 2, 15, 18, 10, 19, 17], [6, 13], [25, 11, 19], [28, 21, 18, 13, 25, 0, 29, 5, 11], [14, 24], [20, 17, 13, 11], [24], [6, 14], [12, 14], [25, 3, 6, 8], [16, 26, 3, 21], [4, 0, 23, 17, 25], [14, 5, 21, 24, 8], [11, 13, 3], [], [6, 0, 3], [1, 0, 25], [9, 18, 25], [14], [28, 22, 19, 7, 5], [28, 9, 19, 17], [14], [6, 10], [18, 25], [5, 29, 24, 27], [3, 27, 2], [22, 8], [17, 23], [14, 2, 22, 7], [20, 15, 27, 1, 29], [4], [8, 15], [7, 29], [12], [1, 26], [8, 9, 2], [6, 29, 22], [3, 6, 23], [27, 20, 4, 21, 5], [25, 0, 23, 14], [1, 8, 0], [19, 9], [13, 29], [0, 23, 25], [11, 16, 19, 21], [2, 20, 19, 10], [22, 14, 7, 26, 16], [28], [23, 13], [27, 10, 6, 22, 3, 5, 18], [21, 12], [12], [19, 7], [28, 2, 25], [21, 26, 17], [18, 19, 12], [6, 19, 18, 9], [14, 10, 4], [19, 8], [23], [19, 21], [15, 20, 19, 4], [4, 27, 5], [], [20, 8]] ...

I had a hard time mining rules from my transactions. With ItemSet8, it fails with pyo3_runtime.PanicException: attempt to shift left with overflow. When I changed data: u8 to data: u64, the panic disappeared, but I was not able to mine any rules.

So my questions are:

  1. Since the implementation of ItemSet8 uses a bit shift (self.data & (1 << self.ix)), does this limit the algorithm to transactions with no more than a certain number of features?
  2. Why do I sometimes see an empty rule mined, such as [], 30?
  3. What is the time complexity of the algorithm in terms of n (number of transactions) and m (number of unique features)?
  4. How do I adapt ItemSet8 to mine legitimate rules on a larger dataset?

Thanks a lot!
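For reference on questions 1 and 4, a mask built as 1 << ix over a u8 can only address items 0 through 7, and over a u64 only items 0 through 63; in a Rust debug build, shifting by the full bit width or more panics with "attempt to shift left with overflow". The sketch below shows one possible way around that limit, using several 64-bit words. WideItemSet and u8_mask are hypothetical names, not part of the library, and this is not a drop-in replacement for ItemSet8.

```rust
/// Hypothetical item set backed by several 64-bit words, so the item
/// index is not limited by the width of a single integer.
/// This is an illustration, not the library's ItemSet8.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct WideItemSet {
    words: Vec<u64>,
    capacity: usize,
}

impl WideItemSet {
    /// Creates an empty set able to hold items `0..capacity`.
    pub fn new(capacity: usize) -> Self {
        Self {
            words: vec![0; (capacity + 63) / 64],
            capacity,
        }
    }

    /// Adds item `ix`. Panics if `ix` is outside `0..capacity`.
    pub fn insert(&mut self, ix: usize) {
        assert!(ix < self.capacity, "item index out of range");
        // The shift amount is always below 64, so it cannot overflow.
        self.words[ix / 64] |= 1u64 << (ix % 64);
    }

    /// Returns true if item `ix` is in the set.
    pub fn contains(&self, ix: usize) -> bool {
        ix < self.capacity && self.words[ix / 64] & (1u64 << (ix % 64)) != 0
    }

    /// Lists the items in ascending order.
    pub fn items(&self) -> Vec<usize> {
        (0..self.capacity).filter(|&ix| self.contains(ix)).collect()
    }
}

/// Builds the single-bit mask for item `ix` in a u8, or returns None
/// when the shift amount is 8 or more (the case that panics with a
/// plain `1 << ix` in a debug build).
pub fn u8_mask(ix: u32) -> Option<u8> {
    1u8.checked_shl(ix)
}
```

Switching data to u64 would raise the limit to 64 items under this reasoning, which is still well below 300 features. Whether that alone explains the missing rules is not something this thread can confirm.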
