ngruver / llmtime Goto Github PK

View Code? Open in Web Editor NEW

627.0 627.0 138.0 76.08 MB

Home Page: https://arxiv.org/abs/2310.07820

License: MIT License

Python 7.22% Jupyter Notebook 92.75% Shell 0.03%

llmtime's People

Contributors

Stargazers

Watchers

Forkers

mitkox valeman yynnxu wklm francyjglisboa jugglingnumbers akai01 gauravvgat shanthshivam eltociear zergey toandreyhse techthiyanes sarthakpujari agbaezehenry surfcao antonioliu97 kashif bassemfg ai-jie01 danx0r duongtrung gjmulder ailabteam vatogato haoxin1998 ayushrakesh hankniu01 yesouicom ffrujeri roozbehsanaei hanlaoshi tusharsinghbaghel franklong1 luciferjason ssrisunt yi-zhi111 badboy1314 baojinming arelkeselbri americast wang-lucy dean-south carmarpe rishabhmallik umesh92x s2014628 goed505s ljunius jsquires0 nkulkarni taylor-olmst 2132660698 reichlab lizonglingo ggzhang0071 rivera-paleo zxq-0058 shubh-81 12345rupali zaizou avibrahms bsweger alexandru-victor-andrei raunakdey mivanovitch annakrystalli elray1 donga0223 mzorn-58 mmkerr lshandross lumiqai darbetter yunjiao-chen deephaejoong mkim425 marievozanne julian-corbet sarahmish smisham96 cmsc720-foundation-in-deep-learning jamesliu forrestgg sanghy grace-go yueyangu skaiphd svorwerk-flextg quhaoh233 zhiyuanyaoj eemichel rohitkrjha bigdatamatta mann1105 huibinshen vishalbelsare m6129 aditya-oikawa13 abhinavnarang777

llmtime's Issues

Description not found for p_extra

I couldn't find the reason in Appendix to account for p_extra in NLL/D calculation. Could you please comment on this step? If I missed something, can you please point me to the right place?

llmtime/models/gpt.py

Line 121 in adefc38

    
           # adjust logprobs by removing extraneous and renormalizing (see appendix of paper)

I am also curious whether this function will ensure a non-negative constraint on the return values.
Thanks in advance!

Question about the continuous likelihood.

Hello. I am a master degree student at Korea university.

First of all, I really appreciate to give me a good inspiration from your interesting paper "Large Language Models Are Zero-Shot Time Series Forecasters". And also a big congratulations on being published in NeurIPS 2023!

I read a lot of time, but I can't understand the part of "continuous likelihood".

First thing is the part of p(u_1, ..., u_n) = p(u_n | u_n-1, .. u_0) * p(u_1 | u_0) * p(u_0)
It is related to hierarchical softmax, but I can't understand 100%.
If this part means the definition of general language model, it should be p(u_1, ..., u_n) = p(u_n | u_n-1, .. u_0) * ... * p(u_1 | u_0) * p(u_0).

Second thing is part of the definition of U_k(x).
I think U_k(x) should be just composed of an indicator function. I can't understand the reason for the B^n term in the part of the definition.

Thnak you.

NLL function is missing for GPT

The gpt_nll_fn should be added for gpt-3.5 in nll_fns

ImportError: cannot import name 'load_dataset' from 'datasets' (unknown location) from monash.py

Can you also provide the monash datasets you used to reproduce the results?

Question about the generate_predictions() function

When I run the demo.ipynb file without changing anything and try getting the autotuned predictions, gpt3 works fine, but once I use gpt4 and the promptcast model, I get this error:

TypeError Traceback (most recent call last)
Cell In[9], line 6
4 hypers = list(grid_iter(model_hypers[model]))
5 num_samples = 2
----> 6 pred_dict = get_autotuned_predictions_data(train, test, hypers, num_samples, model_predict_fns[model], verbose=False, parallel=False)
7 out[model] = pred_dict
8 plot_preds(train, test, pred_dict, model, show_samples=True)

File /mnt/aamv_data/nimeesha_workspace/nimeesha_workspace/first_paper/AAMV/llmtime/models/validation_likelihood_tuning.py:119, in get_autotuned_predictions_data(train, test, hypers, num_samples, get_predictions_fn, verbose, parallel, n_train, n_val)
117 best_val_nll = float('inf')
118 print(f'Sampling with best hyper... {best_hyper} \n with NLL {best_val_nll:3f}')
--> 119 out = get_predictions_fn(train, test, **best_hyper, num_samples=num_samples, n_train=n_train, parallel=parallel)
120 out['best_hyper']=convert_to_dict(best_hyper)
121 return out

File /mnt/aamv_data/nimeesha_workspace/nimeesha_workspace/first_paper/AAMV/llmtime/models/promptcast.py:278, in get_promptcast_predictions_data(train, test, model, settings, num_samples, temp, dataset_name, **kwargs)
275 input_strs = None
276 if num_samples > 0:
277 # Generate predictions
--> 278 preds, completions_list, input_strs = generate_predictions(model, inputs, steps, settings, scalers,
279 num_samples=num_samples, temp=temp, prompts=prompts, post_prompts=post_prompts,
280 parallel=True, return_input_strs=True, constrain_tokens=False, strict_handling=True, **kwargs)
281 # skip bad samples
282 samples = [pd.DataFrame(np.array([p for p in preds[i] if p is not None]), columns=test[i].index) for i in range(len(preds))]

TypeError: models.promptcast.generate_predictions() got multiple values for keyword argument 'parallel'

Changing parallel=False to True, or removing the parameter in the function call altogether doesn't work. What should I do?

Thank you!

Autoformer experiments

Hi, Thanks for the great repository. I could not find a run script for the datasets in the informer/autoformer papers. Are there plans to add them?

Fine-tuning with tabular data

Could you publish code/instructions on how to fine-tune with personal data?

Size of the test set of the Informer datasets

Hi Nate!

Just scanned through you marvelous work. I found that the precomputed output of Autoformer on the Informer datasets are substantially smaller than the original test set. As you mentioned in your paper that the test set has been narrowed, but what is the actual size of the test set?

add suggestions for usage requirements for openAI

After running and debugging the demo notebook a bit I got the following error message

RateLimitError: You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.

I only have a free account on OpenAI. Could you provide in the documentation files and/or the demo notebook some indication of how much usage might be needed to run the demo script one, or, say, 10 times? I will check my usage logs (although they don't appear to be updated in real-time) but it would be helpful to have a sense of how much a run of one of these models churns through API limits, and how different model parameters might change that. Thanks!

import darts.models error

hello,
when I run the demo.ipynp, there is a mistake. I feel confused. could you help me:

ImportError Traceback (most recent call last)
File /home/ssd2/mashichao/anaconda3/envs/llmtime_new/lib/python3.9/site-packages/sklearn/__check_build/init.py:45
44 try:
---> 45 from ._check_build import check_build # noqa
46 except ImportError as e:

ImportError: dlopen: cannot load any more object with static TLS

During handling of the above exception, another exception occurred:

ImportError Traceback (most recent call last)
/home/ssd2/mashichao/llmtime-main/demo.ipynb Cell 1 line 1
12 from models.utils import grid_iter
13 from models.promptcast import get_promptcast_predictions_data
---> 14 from models.darts import get_arima_predictions_data
15 from models.llmtime import get_llmtime_predictions_data
16 from data.small_context import get_datasets

File /home/ssd2/mashichao/llmtime-main/models/darts.py:3
1 import pandas as pd
2 from darts import TimeSeries
----> 3 import darts.models
4 import numpy as np
5 from darts.utils.likelihood_models import LaplaceLikelihood, GaussianLikelihood
...
to build the package before using it: run python setup.py install or
make in the source directory.

If you have used an installer, please check that it is suited for your
Python version, your operating system and your platform.

Reproducibility of LLM-Time results on Informer datasets

Hi,
First, I want to thank you for your insightful paper and the valuable resources in your repository. I am currently attempting to replicate your results for the Informer datasets (ETTm2, exchange_rate, electricity, etc.). However, I was unable to find a run_informer.py file to facilitate this, as was the case for Monash or DARTS. Could you please guide me on how to reproduce these results using your code, especially with the autoformer_dataset.py? Thank you in advance for your assistance and time.

Basic usage without an LLM key?

I was wondering if I could quickly use your model without an LLM key (e.g. OpenAI key)?

How were the normalized scores aggregated?

Thank you for releasing the code! This is a very interesting piece of work. Congrats on the NeurIPS acceptance! 🎉

As per my understanding, you're aggregating normalized scores to report the final scaled score. It looks like you're using the arithmetic mean to aggregate the normalized scores. Please correct me if I am wrong.

Using the arithmetic mean may not be the best way of summarizing a normalized metric. This may lead to misleading conclusions. A better way to aggregate normalized scores is using the geometric mean. Please check this paper out for details:

Fleming, Philip J., and John J. Wallace. "How not to lie with statistics: the correct way to summarize benchmark results." Communications of the ACM 29.3 (1986): 218-221.

Based on the numbers in https://github.com/ngruver/llmtime/blob/main/precomputed_outputs/deterministic_csvs/monash.csv, here are the plots that I get using the arithmetic and geometric mean.

Missing LLaMa from experiments

Hello,

Thanks for sharing the code for the exciting work.
It seems that LLaMa is not in the experiments you shared.
In Monash, llama is initialized with empty hyperparameters and is never called. Similarly, it is not initialized in other experiments.

Since it is an open-source model, it is easier to work with that. Can you share the code for that please?

Thanks!

How to run Llama 70B?

Was there a specific command that was used to run the Llama 70B model? For example to do model-parallelism?
What GPU configuration did the authors use?

nvm.

The example does not support openai after the version is upgraded

Would you consider upgrading the source code to solve the call problem of the new version of openai？

how to get future forecasting points?

now, the demo is completion,can you share a demo to forcasting future data and load local csv file?

Prediction length for Monash benchmark

Hi, may I check how the baseline results for the Monash benchmark (Figure 4, e.g. Wavenet, Transform., DeepAR, etc.) were obtained? From my understanding of the codebase, it is using the huggingface monash_tsf dataset repository to obtain the Monash time series. The prediction length is based on this:

llmtime/data/monash.py

Line 43 in 37d0a33

pred_len = len(val_example) - len(train_example)

My concern is that the prediction lengths from the huggingface dataset are different from the default prediction length in the Monash dataset. For example, solar 10 minutes from the hf dataset has a prediction length of 60 while the Monash baseline results have a prediction length of 1008. Please correct me if I am mistaking anything here. Thank you!

how to use local csv data to test?

Date,c1,c2,c3,c4,c5,c6,c7
2001/5/30,22,24,29,31,35,4,11
2001/6/2,15,22,31,34,35,5,12
2001/6/4,3,4,18,23,32,1,6
.......
my local csv data like above, how to use your demo code

ngruver / llmtime Goto Github PK

llmtime's People

Contributors

Stargazers

Watchers

Forkers

llmtime's Issues

hello, when I run the demo.ipynp, there is a mistake. I feel confused. could you help me:

Recommend Projects

Recommend Topics

Recommend Org

hello,
when I run the demo.ipynp, there is a mistake. I feel confused. could you help me: