imylu's Introduction

What's new?

Unlike Imylu 0.1, Imylu 0.2 will be based on NumPy and SciPy to make the code simple and efficient.

What does imylu do?

Most popular machine learning algorithms are implemented in pure Python in this project. It is especially recommended for people who would rather learn the details of an algorithm by reading Python code than by working through many mathematical formulas. The necessary formulas and derivations are still included in the code comments.

Imylu is compatible with: Python 3.6-3.7.

Why this name, imylu?

Chinese speakers tend to pronounce ML as [aimu'ailu], and this pronunciation inspired the name imylu.

Folders guide

[Image: folders guide]

imylu's People

Contributors

tushushu

imylu's Issues

There seems to be no missing-value handling

Thank you for sharing.
However, real development environments always contain missing values.
None of the algorithms seems to handle missing values.

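Calculation in the _get_split_mse function

When _get_split_mse computes the MSE, shouldn't split_sqr_sum[0] - split_sum[0] * split_avg[0] be split_sqr_sum[0] - split_avg[0] * split_avg[0]? What you are computing is the square of (the actual value of Y minus the mean of Y).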
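The two expressions in this question can be checked numerically. The sketch below assumes, from the variable names alone, that split_sqr_sum is the sum of squared y values, split_sum is the sum of y, and split_avg is the mean of y on one side of the split; these meanings are not confirmed against the imylu source, and the function names are hypothetical.

```python
def sum_sq_dev_direct(ys):
    """Sum of squared deviations from the mean, computed term by term."""
    avg = sum(ys) / len(ys)
    return sum((y - avg) ** 2 for y in ys)


def expr_in_code(ys):
    """sqr_sum - sum * avg, the form quoted from _get_split_mse (assumed meanings)."""
    sqr_sum = sum(y * y for y in ys)
    total = sum(ys)
    avg = total / len(ys)
    return sqr_sum - total * avg


def expr_proposed(ys):
    """sqr_sum - avg * avg, the form proposed in the question."""
    sqr_sum = sum(y * y for y in ys)
    avg = sum(ys) / len(ys)
    return sqr_sum - avg * avg


ys = [1.0, 2.0, 4.0, 5.0]
direct = sum_sq_dev_direct(ys)
in_code = expr_in_code(ys)
proposed = expr_proposed(ys)
print(direct, in_code, proposed)
```

Under these assumptions the existing expression equals the sum of squared deviations, because sum((y - avg)^2) = sum(y^2) - 2 * avg * sum(y) + n * avg^2 and n * avg = sum(y). The proposed expression agrees with it only when the split contains a single element.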

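Matrix factorization: discussion of negative values in the product

Hello, and thank you for sharing your code. It is neatly written and very well suited to learning.
I found your code while searching for ALS, and I would like to ask you a question.

When the two matrices produced by ALS are multiplied together, the product contains many negative values. How should these negative values be interpreted? Here is an example: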

userId,movieId,rating,timestamp
1,2,4.5,964982703
1,3,2.0,964981247
2,1,4.0,964982224
2,3,3.5,964983815
3,2,5.0,964982931
3,4,2.0,964982400
4,2,3.5,964980868
4,3,4.0,964982176
4,4,1.0,964984041

This corresponds to the rating matrix
{{0,   4.5,  2,   0}, 
 {4.0, 0,  3.5,  0},
 {0,   5.0,  0,   2.0}, 
 {0,   3.5, 4.0, 1.0}}

The result computed with your code is as follows

model = ALS()
model.fit(X, k=2, max_iter=10)

## ===>

user_matrix = 
   [[0.7570282336382094, 0.03844973056986965, 0.8341276635923996, 0.7159946957782793],
    [0.2987763898981603, 1.2236650397020115, -0.09214284357731845, 0.661286003351013]]

item_matrix = 
   [[-0.9352925817978953, 5.779445419172115, 1.301780265376408, 1.4251036233547738], 
    [2.7179214250950294, -0.33085416701657533, 3.272648111651669, -0.2344265756021138]]

user_matrix.transpose x item_matrix = 
[[0.10400786028337483, 4.276351943480331, 1.9632744030892986, 1.0088025527850888],
 [3.2898636807717296, -0.1826365582074784, 4.054678181939848, -0.23206475458923115],
 [-1.0305904247583555, 4.851281148112146, 0.7842998282335908, 1.2103190870120526], 
 [1.1276588690551161, 3.919263034868889, 3.0962241552067207, 0.865343621997239]]

I also computed the same result with a Mathematica program

R = {{0, 4.5, 2, 0}, {4.0, 0, 3.5, 0}, {0, 5.0, 0, 2.0}, {0, 3.5, 4.0,1.0}};
X = RandomReal[1, {4, 2}]
(* {{0.242376,0.511595}, {0.661005,0.105123}, {0.00835057,0.137917}, {0.980405,0.334949}} *)
Do[
   (* compute Y *)
   Xt = Transpose[X];
   Y = Transpose[LinearSolve[Xt.X, Xt.R]];
   (* compute X *)
   Yt = Transpose[Y];
   X = Transpose[LinearSolve[Yt.Y, Yt.Transpose[R]]],
   {i, 10}
]

It is easy to see that the product of the two matrices contains negative values. How should this be understood? Thank you.
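For readers without Mathematica, the loop above can be mirrored in NumPy. This is a minimal sketch, not the imylu ALS class: the function name als_dense, the seed, and the random initialisation are assumptions made for illustration, and, like the Mathematica version, it treats the zeros in R as observed ratings.

```python
import numpy as np


def als_dense(R, k=2, n_iter=10, seed=0):
    """Alternating least squares on a dense matrix (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = rng.random((R.shape[0], k))  # random start, like RandomReal[1, {4, 2}]
    for _ in range(n_iter):
        # Fix X and solve (X^T X) Y^T = X^T R for Y.
        Y = np.linalg.solve(X.T @ X, X.T @ R).T
        # Fix Y and solve (Y^T Y) X^T = Y^T R^T for X.
        X = np.linalg.solve(Y.T @ Y, Y.T @ R.T).T
    return X, Y


R = np.array([[0.0, 4.5, 2.0, 0.0],
              [4.0, 0.0, 3.5, 0.0],
              [0.0, 5.0, 0.0, 2.0],
              [0.0, 3.5, 4.0, 1.0]])
X, Y = als_dense(R)
approx = X @ Y.T  # rank-2 approximation of R
print(np.round(approx, 3))
```

Each update is an unconstrained least-squares solve, so nothing in the procedure restricts the factors or their product to be non-negative.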

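Problem with the calculation in the ALS get_rmse function

n_elements = sum(map(len, ratings.values()))
The number of elements is too large. During your matrix computation, missing entries are assigned a default value of 0,
which makes the size of ratings m*n and so makes the RMSE value too small.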
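The effect of the denominator can be illustrated with a small example. The sketch below is not the imylu get_rmse implementation: it assumes a dense rating matrix in which 0 marks a missing rating, and the function name rmse_observed is hypothetical.

```python
import numpy as np


def rmse_observed(R, R_hat):
    """RMSE over observed ratings only, assuming 0 in R means 'missing'."""
    mask = R != 0
    sq_err = ((R - R_hat)[mask] ** 2).sum()
    return float(np.sqrt(sq_err / mask.sum()))


R = np.array([[0.0, 4.0],
              [2.0, 0.0]])
R_hat = np.array([[1.5, 3.0],
                  [1.0, 2.5]])

mask = R != 0
sq_err = ((R - R_hat)[mask] ** 2).sum()

# Same squared error, two denominators: observed count vs. m * n.
rmse_obs = rmse_observed(R, R_hat)
rmse_mn = float(np.sqrt(sq_err / R.size))
print(rmse_obs, rmse_mn)
```

With two observed ratings in a 2x2 matrix, dividing by m*n instead of the observed count shrinks the RMSE by a factor of sqrt(2).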

imylu/imylu/utils/load_data.py/

def gen_data(low, high, n_rows, n_cols=None):

Returns:       
        list -- 1d or 2d list with int

However, in imylu/imylu/utils/preprocessing.py, def min_max_scale(X) expects X to be an ndarray.

See also lines 51-53 of imylu/examples/kd_tree_example.py:

X = gen_data(low, high, n_rows, n_cols)        
y = gen_data(low, high, n_rows)        
Xi = gen_data(low, high, n_cols)

I think it is enough to add one line to gen_data():

# Assumed imports; the original snippet does not show them.
from random import randint
from numpy import array

def gen_data(low, high, n_rows, n_cols=None):
    """Generate dataset randomly.
    Arguments:
        low {int} -- The minimum value of element generated.
        high {int} -- The maximum value of element generated.
        n_rows {int} -- Number of rows.
        n_cols {int} -- Number of columns.
    Returns:
        ndarray -- 1d or 2d array with int
    """
    if n_cols is None:
        ret = [randint(low, high) for _ in range(n_rows)]
    else:
        ret = [[randint(low, high) for _ in range(n_cols)]
               for _ in range(n_rows)]
    ret = array(ret)  # Proposed addition: convert the list to an ndarray.
    return ret
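The mismatch described in this issue can be reproduced with a stand-in scaler. The function below is a hypothetical sketch written for illustration, not the min_max_scale from imylu; it only assumes that the real function relies on ndarray methods such as min(axis=0).

```python
import numpy as np


def min_max_scale_sketch(X):
    """Hypothetical stand-in for min_max_scale: scale each column to [0, 1]."""
    X_min = X.min(axis=0)  # ndarray method; a plain list has no .min
    X_max = X.max(axis=0)
    return (X - X_min) / (X_max - X_min)


rows = [[1, 10], [3, 30], [5, 20]]

# Works once the nested list is converted to an ndarray.
scaled = min_max_scale_sketch(np.array(rows))
print(scaled)

# Fails on the plain list, which is what gen_data returns without the fix.
try:
    min_max_scale_sketch(rows)
    list_ok = True
except AttributeError:
    list_ok = False
print(list_ok)
```

Converting inside gen_data, as proposed above, keeps the examples unchanged; converting at the call site would be an alternative.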

Maybe I have overlooked something somewhere. Thanks!

Sorry to bother you, and thank you very much.
