
llmtime's People

Contributors

andrewgordonwilson, carmarpe, eltociear, kashif, ngruver, nkulkarni, shikaiqiu


llmtime's Issues

Description not found for p_extra

I couldn't find an explanation in the appendix for the p_extra term in the NLL/D calculation. Could you please comment on this step? If I missed something, could you point me to the right place?

# adjust logprobs by removing extraneous and renormalizing (see appendix of paper)
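For context, here is my reading of that step as a minimal sketch (p_extra here stands for the total probability mass assigned to tokens that cannot occur in a valid number string; the function name is hypothetical):

import numpy as np

def adjust_logprobs(logprobs, p_extra):
    # renormalizing over the allowed tokens scales each probability
    # by 1 / (1 - p_extra), i.e. log p becomes log p - log(1 - p_extra)
    return logprobs - np.log(1.0 - p_extra)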

I am also curious whether this function guarantees that the returned values are non-negative.
Thanks in advance!

Question about the continuous likelihood.

Hello. I am a master's student at Korea University.

First of all, I really appreciate the inspiration I got from your interesting paper "Large Language Models Are Zero-Shot Time Series Forecasters". And a big congratulations on being published at NeurIPS 2023!

I have read it many times, but I still can't understand the "continuous likelihood" part.

The first thing is the factorization p(u_1, ..., u_n) = p(u_n | u_{n-1}, ..., u_0) * p(u_1 | u_0) * p(u_0).
It is related to hierarchical softmax, but I don't understand it 100%.
If this is meant to be the standard autoregressive language-model factorization, it should be p(u_1, ..., u_n) = p(u_n | u_{n-1}, ..., u_0) * ... * p(u_1 | u_0) * p(u_0).

The second thing is the definition of U_k(x).
I think U_k(x) should just be an indicator function; I can't understand the reason for the B^n term in its definition.
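
For reference, my current understanding of the B^n factor (please correct me if this is wrong) is that it converts a discrete bin probability into a continuous density: with n digits of precision in base B, each bin has width B^{-n}, so spreading the bin's probability mass uniformly over the bin gives

p(x) = B^n * p(U_k)  for x in bin k,  since each bin has width B^{-n}.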

Thank you.

Question about the generate_predictions() function

When I run the demo.ipynb file without changing anything and try to get the autotuned predictions, gpt3 works fine, but once I use gpt4 or the promptcast model, I get this error:


TypeError Traceback (most recent call last)
Cell In[9], line 6
4 hypers = list(grid_iter(model_hypers[model]))
5 num_samples = 2
----> 6 pred_dict = get_autotuned_predictions_data(train, test, hypers, num_samples, model_predict_fns[model], verbose=False, parallel=False)
7 out[model] = pred_dict
8 plot_preds(train, test, pred_dict, model, show_samples=True)

File /mnt/aamv_data/nimeesha_workspace/nimeesha_workspace/first_paper/AAMV/llmtime/models/validation_likelihood_tuning.py:119, in get_autotuned_predictions_data(train, test, hypers, num_samples, get_predictions_fn, verbose, parallel, n_train, n_val)
117 best_val_nll = float('inf')
118 print(f'Sampling with best hyper... {best_hyper} \n with NLL {best_val_nll:3f}')
--> 119 out = get_predictions_fn(train, test, **best_hyper, num_samples=num_samples, n_train=n_train, parallel=parallel)
120 out['best_hyper']=convert_to_dict(best_hyper)
121 return out

File /mnt/aamv_data/nimeesha_workspace/nimeesha_workspace/first_paper/AAMV/llmtime/models/promptcast.py:278, in get_promptcast_predictions_data(train, test, model, settings, num_samples, temp, dataset_name, **kwargs)
275 input_strs = None
276 if num_samples > 0:
277 # Generate predictions
--> 278 preds, completions_list, input_strs = generate_predictions(model, inputs, steps, settings, scalers,
279 num_samples=num_samples, temp=temp, prompts=prompts, post_prompts=post_prompts,
280 parallel=True, return_input_strs=True, constrain_tokens=False, strict_handling=True, **kwargs)
281 # skip bad samples
282 samples = [pd.DataFrame(np.array([p for p in preds[i] if p is not None]), columns=test[i].index) for i in range(len(preds))]

TypeError: models.promptcast.generate_predictions() got multiple values for keyword argument 'parallel'

Changing parallel=False to True, or removing the parameter from the function call altogether, doesn't work. What should I do?
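
For reference, a workaround I am experimenting with (a sketch based on the traceback above; it assumes the duplicate comes from 'parallel' being passed both explicitly and again through **kwargs):

# in get_promptcast_predictions_data, drop any caller-supplied 'parallel'
# so that generate_predictions() only receives it once
kwargs.pop('parallel', None)
preds, completions_list, input_strs = generate_predictions(model, inputs, steps, settings, scalers,
    num_samples=num_samples, temp=temp, prompts=prompts, post_prompts=post_prompts,
    parallel=True, return_input_strs=True, constrain_tokens=False, strict_handling=True, **kwargs)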

Thank you!

Autoformer experiments

Hi, thanks for the great repository. I could not find run scripts for the datasets used in the Informer/Autoformer papers. Are there plans to add them?

Size of the test set of the Informer datasets

Hi Nate!

I just scanned through your marvelous work. I found that the precomputed outputs of Autoformer on the Informer datasets are substantially smaller than the original test sets. You mention in your paper that the test set was narrowed, but what is its actual size?

Add suggestions for usage requirements for OpenAI

After running and debugging the demo notebook a bit, I got the following error message:

RateLimitError: You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.

I only have a free account on OpenAI. Could you provide, in the documentation files and/or the demo notebook, some indication of how much usage is needed to run the demo script once or, say, 10 times? I will check my usage logs (although they don't appear to be updated in real time), but it would be helpful to have a sense of how much a run of one of these models churns through API limits, and how different model parameters might change that. Thanks!
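
To make the question concrete, here is how I am currently estimating prompt size before a run (a sketch assuming the tiktoken package; actual billed usage also depends on the completion length and the number of samples requested):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
prompt = "0.1, 0.2, 0.3, 0.4, 0.5"  # a serialized series, as in the demo
print(len(enc.encode(prompt)), "prompt tokens")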

import darts.models error

Hello,
when I run demo.ipynb I get the following error, which confuses me. Could you help me?

ImportError Traceback (most recent call last)
File /home/ssd2/mashichao/anaconda3/envs/llmtime_new/lib/python3.9/site-packages/sklearn/__check_build/__init__.py:45
44 try:
---> 45 from ._check_build import check_build # noqa
46 except ImportError as e:

ImportError: dlopen: cannot load any more object with static TLS

During handling of the above exception, another exception occurred:

ImportError Traceback (most recent call last)
/home/ssd2/mashichao/llmtime-main/demo.ipynb Cell 1 line 1
12 from models.utils import grid_iter
13 from models.promptcast import get_promptcast_predictions_data
---> 14 from models.darts import get_arima_predictions_data
15 from models.llmtime import get_llmtime_predictions_data
16 from data.small_context import get_datasets

File /home/ssd2/mashichao/llmtime-main/models/darts.py:3
1 import pandas as pd
2 from darts import TimeSeries
----> 3 import darts.models
4 import numpy as np
5 from darts.utils.likelihood_models import LaplaceLikelihood, GaussianLikelihood
...
to build the package before using it: run python setup.py install or
make in the source directory.

If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.
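
For what it's worth, the "cannot load any more object with static TLS" failure is a known glibc limitation rather than an llmtime bug. A workaround I am trying (an assumption on my part, not a verified fix) is to import the affected library before the heavier ones at the top of the notebook, so it can still claim a static TLS slot:

import sklearn  # import early, before torch/darts, to work around the static TLS limit
from models.darts import get_arima_predictions_data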

Reproducibility of LLM-Time results on Informer datasets

Hi,
First, I want to thank you for your insightful paper and the valuable resources in your repository. I am currently attempting to replicate your results for the Informer datasets (ETTm2, exchange_rate, electricity, etc.). However, I was unable to find a run_informer.py file to facilitate this, as was the case for Monash or DARTS. Could you please guide me on how to reproduce these results using your code, especially with the autoformer_dataset.py? Thank you in advance for your assistance and time.

How were the normalized scores aggregated?

Thank you for releasing the code! This is a very interesting piece of work. Congrats on the NeurIPS acceptance! 🎉

As per my understanding, you're aggregating normalized scores to report the final scaled score. It looks like you're using the arithmetic mean to aggregate the normalized scores. Please correct me if I am wrong.

Using the arithmetic mean may not be the best way of summarizing a normalized metric. This may lead to misleading conclusions. A better way to aggregate normalized scores is using the geometric mean. Please check this paper out for details:

Fleming, Philip J., and John J. Wallace. "How not to lie with statistics: the correct way to summarize benchmark results." Communications of the ACM 29.3 (1986): 218-221.

Based on the numbers in https://github.com/ngruver/llmtime/blob/main/precomputed_outputs/deterministic_csvs/monash.csv, here are the plots that I get using the arithmetic and geometric mean.

[Plots: Monash normalized scores aggregated with the arithmetic mean vs. the geometric mean.]
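
For concreteness, this is the aggregation I am suggesting (a sketch with made-up scores):

import numpy as np
from scipy.stats import gmean

normalized_scores = np.array([0.8, 1.2, 0.5])  # per-dataset scores, normalized by a baseline
print(np.mean(normalized_scores))  # arithmetic mean, as the current code appears to use
print(gmean(normalized_scores))    # geometric mean, as recommended by Fleming & Wallace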

Missing LLaMa from experiments

Hello,

Thanks for sharing the code for the exciting work.
It seems that LLaMa is not in the experiments you shared.
In Monash, llama is initialized with empty hyperparameters and is never called. Similarly, it is not initialized in other experiments.

Since it is an open-source model, it would be easier to work with. Could you please share the code for it?

Thanks!

How to run Llama 70B?

Was there a specific command used to run the Llama 70B model, for example to do model parallelism?
What GPU configuration did the authors use?
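
For context, one common configuration for a model of this size (a sketch of what I would try, not necessarily what the authors used) is to shard the checkpoint across GPUs with Hugging Face's device_map:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-70b-hf"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",          # split layers across all visible GPUs
    torch_dtype=torch.float16,  # halve memory vs. float32
)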

Prediction length for Monash benchmark

Hi, may I check how the baseline results for the Monash benchmark (Figure 4, e.g. Wavenet, Transform., DeepAR, etc.) were obtained? From my understanding of the codebase, it is using the huggingface monash_tsf dataset repository to obtain the Monash time series. The prediction length is based on this:

pred_len = len(val_example) - len(train_example)

My concern is that the prediction lengths from the huggingface dataset are different from the default prediction lengths in the Monash dataset. For example, solar 10 minutes from the hf dataset has a prediction length of 60, while the Monash baseline results use a prediction length of 1008. Please correct me if I am mistaken about anything here. Thank you!
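
To illustrate the discrepancy, this is how I am checking the lengths (a sketch assuming the monash_tsf Hub dataset with a solar_10_minutes config and a "target" field, as the codebase appears to use):

from datasets import load_dataset

ds = load_dataset("monash_tsf", "solar_10_minutes")
train_example = ds["train"][0]["target"]
val_example = ds["validation"][0]["target"]
print(len(val_example) - len(train_example))  # 60 here, vs. 1008 in the Monash baselines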

How to use local CSV data for testing?

Date,c1,c2,c3,c4,c5,c6,c7
2001/5/30,22,24,29,31,35,4,11
2001/6/2,15,22,31,34,35,5,12
2001/6/4,3,4,18,23,32,1,6
.......
My local CSV data looks like the above. How can I use your demo code with it?
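
In case it helps to be concrete, this is what I have tried so far (a sketch; the column choice and split ratio are my own guesses, and get_llmtime_predictions_data is the function imported in demo.ipynb):

import pandas as pd

df = pd.read_csv("my_data.csv", parse_dates=["Date"], index_col="Date")
series = df["c1"]               # forecast a single column
split = int(len(series) * 0.8)  # 80/20 train/test split
train, test = series.iloc[:split], series.iloc[split:]
# then pass train and test to get_llmtime_predictions_data as in the demo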
