transdim's Introduction

👋 I'm Xinyu Chen, now a Postdoctoral Associate at MIT (see MIT sites). Before joining MIT, I received my PhD degree from the University of Montreal in Canada.

Latest Publications

transdim's People

Contributors

hanty, lijunsun, mengyinglei, vadermit, xinychen, yxnchen

transdim's Issues

transportation

Can we use a function that lets people gather together using GPS?

How to input data with missing values to obtain complete data after training

Hello, thank you very much for your open source project, which has been very helpful to me!
I have a question. After training the BATF algorithm, I want to input a piece of data with missing values and obtain the complete data. What should I do?
I would greatly appreciate it if you could reply to my question.

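How should the rank parameter be set in the BTMF data imputation program?

Hello! I think your work is excellent and meaningful, but I have run into two problems when applying it. First, as the title says: what does the rank parameter in BTMF mean, and how does it affect the results? Second, I found that speed series become very smooth after applying BTMF, to the point of looking distorted. Is there a way to solve this, for example by tuning the parameters?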

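Forecasting with LATC

Thank you very much for such a fine paper and code; the paper has been very helpful to me. However, I ran into some problems while reproducing your code. When forecasting with the LATC model, can it forecast forward like a traditional time series model, that is, predict values that do not yet exist in the real world? If so, how should the code be implemented? While reproducing your LATC-predict code, I found that the predictor function you define treats the values to be predicted as missing values, completes them with LATC, and then compares them with the ground truth. How should I proceed when I want to predict data that has no ground truth?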
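
The idea described in this issue, treating values to be forecast as missing entries, can be illustrated with a small sketch. It only shows how a block of unobserved future time steps might be appended to the input; `pad_future` and `horizon` are hypothetical names, the zero-as-missing encoding is an assumption taken from other issues on this page, and this is not the repository's predictor function.

```python
import numpy as np

def pad_future(sparse_mat, horizon):
    """Append `horizon` all-zero columns along the time axis.

    Assumes rows are series, columns are time steps, and 0 marks a missing entry.
    """
    n_series = sparse_mat.shape[0]
    future = np.zeros((n_series, horizon))
    return np.concatenate([sparse_mat, future], axis=1)

history = np.arange(1.0, 9.0).reshape(2, 4)   # 2 series, 4 observed time steps
padded = pad_future(history, horizon=3)       # 3 future steps left as "missing"
```

A completion model run on `padded` would then fill the last `horizon` columns, and those filled values would serve as the forecast, with no ground truth required.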

About Your RMSE

In CP_ALS and Tucker_ALS, why do you calculate the RMSE on the training set (sparse_tensor) rather than on the test set, as you do in BGCP?

By the way, in final_mape = np.sum(np.abs(dense_tensor[pos] - tensor_hat[pos]) / dense_tensor[pos]) / dense_tensor[pos].shape[0], the np.abs() should cover (dense_tensor[pos] - tensor_hat[pos]) / dense_tensor[pos] instead of only dense_tensor[pos] - tensor_hat[pos].
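
To make the MAPE remark concrete, here is a minimal sketch comparing the two placements of np.abs(). The function names and the toy arrays are made up for illustration. The two variants agree when every true value is positive and diverge as soon as a true value is negative.

```python
import numpy as np

def mape_abs_numerator(actual, predicted):
    # Formula quoted in the issue: absolute value around the difference only.
    return np.sum(np.abs(actual - predicted) / actual) / actual.shape[0]

def mape_abs_ratio(actual, predicted):
    # Variant suggested in the issue: absolute value around the whole ratio.
    return np.sum(np.abs((actual - predicted) / actual)) / actual.shape[0]

positive = np.array([2.0, 4.0])
mixed = np.array([2.0, -4.0])

# Every prediction is half of the true value, so each relative error is 0.5.
same_on_positive = np.isclose(
    mape_abs_numerator(positive, 0.5 * positive),
    mape_abs_ratio(positive, 0.5 * positive),
)
numerator_on_mixed = mape_abs_numerator(mixed, 0.5 * mixed)  # errors cancel out
ratio_on_mixed = mape_abs_ratio(mixed, 0.5 * mixed)          # stays at 0.5
```

For strictly positive data such as traffic speeds the two formulas give the same number, so the difference only matters for data that can be negative.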

A question about the null value generation of time series

Hello, I would like to ask a question. I am studying data on changes in the number of physical stores, mainly to detect outliers in those changes. Of course, a blank value is itself an outlier. Which algorithm should I use to fill in the blank values so that I can find other types of outliers? Thank you.

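Non-positive-definite error when applying BATF to my own dataset

LinAlgError: 8-th leading minor of the array is not positive definite
This is the final error I get. The parameters are the same as those provided in the code. From searching online, I found that the cause is NaN values in my data. I would like to use your method to handle missing values. How can BATF be applied to data containing NaN? Thank you very much!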
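
One possible preprocessing step is sketched below, under the assumption (suggested by other issues on this page, not confirmed here) that the algorithms expect missing entries to be encoded as 0 rather than NaN. `nan_to_missing` is a hypothetical helper, not part of the repository.

```python
import numpy as np

def nan_to_missing(data):
    """Replace NaN with 0 and return the cleaned array plus a mask of observed entries."""
    data = np.asarray(data, dtype=float)
    observed_mask = ~np.isnan(data)
    sparse = np.where(observed_mask, data, 0.0)
    return sparse, observed_mask

raw = np.array([[1.0, np.nan],
                [3.0, 4.0]])
sparse_demo, mask_demo = nan_to_missing(raw)
```

Note that with this encoding a genuinely observed value of 0 can no longer be told apart from a missing one, which is why the mask is returned alongside the array.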

The difference in data

About the dataset: each data file contains three named arrays, tensor, random_tensor, and random_matrix.
What do these three stand for, and how do they differ?

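Problem when applying to my own dataset

Hello, I have three correlated time series with missing values set to 0. I reorganized them into shape (6720, 3) and used LRTC for imputation, but an error occurred while running.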
ValueError Traceback (most recent call last)
in
11 epsilon = 1e-4
12 maxiter = 200
---> 13 tensor_hat = LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
14 end = time.time()
15 print('Running time: %d seconds'%(end - start))

in LRTC(failed resolving arguments)
55 Z[pos_missing] = np.mean(X + T / rho, axis = 0)[pos_missing]
56 T = T + rho * (X - np.broadcast_to(Z, np.insert(dim, 0, len(dim))))
---> 57 tensor_hat = np.einsum('k, kmnt -> mnt', alpha, X)
58 tol = np.sqrt(np.sum((tensor_hat - last_tensor) ** 2)) / snorm
59 last_tensor = tensor_hat.copy()

<array_function internals> in einsum(*args, **kwargs)

~\AppData\Roaming\Python\Python36\site-packages\numpy\core\einsumfunc.py in einsum(out, optimize, *operands, **kwargs)
1348 if specified_out:
1349 kwargs['out'] = out
-> 1350 return c_einsum(*operands, **kwargs)
1351
1352 # Check the kwargs to avoid a more cryptic error later, without having to

ValueError: einstein sum subscripts string contains too many subscripts for operand 1

Thank you very much for your open source project!
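
The traceback is consistent with a dimension mismatch: the subscripts 'k, kmnt -> mnt' describe a 4-D operand (one copy of a third-order tensor per mode), while stacking copies of a (6720, 3) matrix gives only a 3-D array. The sketch below reproduces that mismatch on random data and shows an ellipsis-based einsum that accepts any number of trailing dimensions. The arrays are stand-ins, and whether the rest of LRTC handles 2-D input is not something this page confirms.

```python
import numpy as np

alpha = np.ones(2) / 2            # one weight per mode of a 2-D input
X = np.random.rand(2, 6720, 3)    # stacked copies of a (6720, 3) matrix

try:
    np.einsum('k,kmnt->mnt', alpha, X)    # subscripts expect a 4-D operand
    fixed_subscripts_failed = False
except ValueError:
    fixed_subscripts_failed = True

# Ellipsis subscripts work for 2-D, 3-D, or higher-order inputs alike.
matrix_hat = np.einsum('k,k...->...', alpha, X)
```

An alternative that leaves the notebook code untouched is to fold the time axis so the input becomes a genuine third-order tensor before calling LRTC.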

A bug in LRTC-TNN.ipynb.

In the svt_tnn code:

def svt_tnn(mat, alpha, rho, theta):
    tau = alpha / rho
    [m, n] = mat.shape
    if 2 * m < n:
        u, s, v = np.linalg.svd(mat @ mat.T, full_matrices = 0)
        s = np.sqrt(s)
        idx = np.sum(s > tau)
        mid = np.zeros(idx)
        mid[:theta] = 1
        mid[theta:idx] = (s[theta:idx] - tau) / s[theta:idx]
        return (u[:, :idx] @ np.diag(mid)) @ (u[:, :idx].T @ mat)
    elif m > 2 * n:
        return svt_tnn(mat.T, tau, theta).T # this call is missing an argument :( It passes only 3 of the 4 arguments.
    u, s, v = np.linalg.svd(mat, full_matrices = 0)
    idx = np.sum(s > tau)
    vec = s[:idx].copy()
    vec[theta:idx] = s[theta:idx] - tau
    return u[:, :idx] @ np.diag(vec) @ v[:idx, :]

The error shows:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[18], line 7
      5 epsilon = 1e-4
      6 maxiter = 200
----> 7 x = LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
      8 # end = time.time()
      9 # print('Running time: %d seconds'%(end - start))

Cell In[8], line 17, in LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)
     15 rho = min(rho * 1.05, 1e5)
     16 for k in range(len(dim)):
---> 17     X[k] = mat2ten(svt_tnn(ten2mat(Z - T[k] / rho, k), alpha[k], rho, int(np.ceil(theta * dim[k]))), dim, k)
     18 Z[pos_missing] = np.mean(X + T / rho, axis = 0)[pos_missing]
     19 T = T + rho * (X - np.broadcast_to(Z, np.insert(dim, 0, len(dim))))

Cell In[6], line 13, in svt_tnn(mat, alpha, rho, theta)
     11     return (u[:, :idx] @ np.diag(mid)) @ (u[:, :idx].T @ mat)
     12 elif m > 2 * n:
---> 13     return svt_tnn(mat.T, tau, theta).T
     14 u, s, v = np.linalg.svd(mat, full_matrices = 0)
     15 idx = np.sum(s > tau)

TypeError: svt_tnn() missing 1 required positional argument: 'theta'
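
A possible fix, sketched under the assumption that the recursive call is meant to forward the same three parameters it received: pass alpha, rho, and theta to the transposed call. Forwarding alpha rather than tau matters, because tau = alpha / rho is recomputed inside the function. The random test matrix is only for illustration.

```python
import numpy as np

def svt_tnn(mat, alpha, rho, theta):
    """Singular value thresholding that leaves the first `theta` singular values intact."""
    tau = alpha / rho
    m, n = mat.shape
    if 2 * m < n:
        # Wide matrix: decompose the small m-by-m Gram matrix instead.
        u, s, _ = np.linalg.svd(mat @ mat.T, full_matrices=False)
        s = np.sqrt(s)
        idx = np.sum(s > tau)
        mid = np.zeros(idx)
        mid[:theta] = 1
        mid[theta:idx] = (s[theta:idx] - tau) / s[theta:idx]
        return (u[:, :idx] @ np.diag(mid)) @ (u[:, :idx].T @ mat)
    elif m > 2 * n:
        # Tall matrix: recurse on the transpose, forwarding all three parameters.
        return svt_tnn(mat.T, alpha, rho, theta).T
    u, s, v = np.linalg.svd(mat, full_matrices=False)
    idx = np.sum(s > tau)
    vec = s[:idx].copy()
    vec[theta:idx] = s[theta:idx] - tau
    return u[:, :idx] @ np.diag(vec) @ v[:idx, :]

rng = np.random.default_rng(0)
tall = rng.standard_normal((50, 10))   # m > 2 * n triggers the recursive branch
result = svt_tnn(tall, alpha=3.0, rho=0.5, theta=2)
```

With this change the tall-matrix branch no longer raises the TypeError, and its output should match what the plain SVD branch would return for the same input.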

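About splitting training and test sets

Dear author, when using machine learning algorithms for data imputation, is it unnecessary to split the data into training and test sets, as is done with deep learning methods?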

Compare GAIN with BGCP about imputation

Hi, I'm learning about GAIN now, and it seems that BGCP performs better than GAIN in terms of imputation. What are GAIN's advantages and disadvantages compared with BGCP? And do BGCP and GAIN differ in their intended applications? Thanks!

some question about your paper

From the paper "Low-Rank Autoregressive Tensor Completion for Spatiotemporal Traffic Data Imputation":
Could you tell me how you derived this formula?
image

Looking forward to your reply.

After the forecast of the full data

What do the two values returned by the BGCP algorithm, as defined, represent?
If I need the imputed data to be returned, which one should I use?
Thank you!

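Is traffic flow data available?

Hello,
Do your datasets include traffic flow data? If not, are there other public datasets that provide flow? All the public datasets I can find contain speed data, and I cannot find flow data. Thank you!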

Dataset question

Are random_matrix.mat and random_tensor.mat under dataset\Guangzhou-data-set randomly generated, or do they follow some distribution? I cannot tell from plotting them.

PeMS graph

Hi, for the PeMS dataset, which graph should I use: graph_pems_new or graph_pems?

Thanks

LinAlgError: SVD did not converge using LRTC-TNN

I have non-random missing values, about 50% of the original values, across 5 features. I tried to use LRTC-TNN to restore the missing values, but it raises LinAlgError: SVD did not converge. What can I do? Is there another method I can use to impute my data? Thanks.
The original data is shown below (ignore the last panel at the bottom right, which is empty):

image

dataset

I think it is very interesting. Unfortunately, when I try to apply your algorithm to my own data, I run into problems. I am stuck on generating the third-order tensor. Could you please send me the source code of your data processing?

Appreciate your help.
Thanks and regards.
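
For readers stuck at the same step, here is one common way to fold a (locations, time) matrix into a third-order tensor, sketched under assumptions: the axis order (locations, days, intervals per day) and the helper name `matrix_to_tensor` are illustrative and are not taken from the author's preprocessing code.

```python
import numpy as np

def matrix_to_tensor(mat, intervals_per_day):
    """Fold a (locations, time) matrix into (locations, days, intervals_per_day)."""
    n_locations, n_steps = mat.shape
    if n_steps % intervals_per_day != 0:
        raise ValueError("time axis must cover a whole number of days")
    n_days = n_steps // intervals_per_day
    return mat.reshape(n_locations, n_days, intervals_per_day)

# Toy example: 2 locations, 2 days, 3 intervals per day.
demo = np.arange(12, dtype=float).reshape(2, 6)
tensor_demo = matrix_to_tensor(demo, intervals_per_day=3)
```

Because reshape keeps the row-major order, `tensor_demo[i, d, t]` equals `demo[i, d * intervals_per_day + t]`, so each day's readings stay contiguous.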

How can I set the parameter "low_rank" in different application scenarios?

First of all, your open source work is very well done! It gave me a good understanding of the main content of your paper.

I would like to discuss an initial parameter setting mentioned in the code and the paper. In the code, the value of the initial parameter low_rank must be specified when running the BGCP notebook. How should this value be chosen for different time series? Or is the choice completely arbitrary? I'm troubled by that.

If convenient, please reply. I will be very grateful! Thank you again for your open source work.

How to apply LATC algorithm to my own dataset

I would like to ask you a question. My dataset is incomplete, so I want to use LATC for completion. This means I don't have a dense_tensor, so what should I do? My dataset is a matrix with dimensions 21x4081, but I added a new dimension to it and converted it into a tensor with dimensions 21x1x4081.
If I use sparse_tensor.copy() in place of dense_tensor, RMSE and MAPE are NaN.
image
Looking forward to your reply
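
A likely explanation for the NaN metrics, assuming the notebook evaluates on entries where dense_tensor is nonzero and sparse_tensor is zero: when dense_tensor is just a copy of sparse_tensor, that set is empty, and averaging over an empty set gives NaN. One workaround is to hide a fraction of the observed entries and use them for validation. The sketch below uses synthetic data, and `hold_out` and `rate` are hypothetical names.

```python
import numpy as np

def hold_out(observed_tensor, rate=0.2, seed=0):
    """Hide a random fraction of the nonzero (observed) entries for validation."""
    rng = np.random.default_rng(seed)
    observed = observed_tensor != 0
    hide = observed & (rng.random(observed_tensor.shape) < rate)
    sparse_tensor = observed_tensor.copy()
    sparse_tensor[hide] = 0
    return observed_tensor, sparse_tensor

# Synthetic stand-in for an incomplete 21 x 1 x 4081 dataset (0 marks missing).
data = np.random.default_rng(1).random((21, 1, 4081)) + 0.1
data[data < 0.3] = 0
dense_tensor, sparse_tensor = hold_out(data)

# Entries that are known but hidden from the model: usable for RMSE and MAPE.
pos_test = np.where((dense_tensor != 0) & (sparse_tensor == 0))
```

The metrics then measure how well the hidden entries are recovered. Entries that were missing from the start are still imputed, but they cannot be scored.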

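Question about applying the models to two-dimensional matrix data

I noticed that these models are all applied to third-order tensor data. What if my dataset is two-dimensional? For example, I want to use them on PEMS-BAY and METR-LA. One approach I tried was adding a dimension to the matrix (from (N, T) to (1, N, T)), but this raises an out-of-memory error.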

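Parameter settings for forecasting

Thank you for such a fine paper and code. I ran into some difficulties when reproducing your code with other data. While forecasting with the LATC model, setting the time_horizon, pred_time_steps, time_intervals, and back_time_steps parameters led to problems including SVD not converging and dimensions that cannot be negative. Are there any requirements or tips for setting these parameters?
My code is: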
image

image

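Note: my data is the GDP of 4 countries over 104 time periods. I folded it by quarter into a 26×4×4 tensor and use the quarterly data from 1995 to 2020 to forecast the 2021 GDP values.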
