I was getting the RateLimitReached Error yesterday (after around the 80th generation e

Token generation limit about gpt-llm-trainer HOT 5 OPEN

mshumer commented on July 17, 2024 2

Token generation limit

from gpt-llm-trainer.

Comments (5)

xiscoding commented on July 17, 2024 1

I believe this error comes from OpenAI rate-limiting your requests. You can either request a higher rate limit or adjust the retry backoff. To request an increase, submit this form: https://docs.google.com/forms/d/e/1FAIpQLSc6gSL3zfHFlL6gNIyUcjkEv29jModHGxg5_XGyr-PrE2LaHw/viewform. The rate-limit guide also covers model limits and error mitigation: https://platform.openai.com/docs/guides/rate-limits/error-mitigation.

For me, the problem with raising the rate limit is the added cost. The code I posted above is basically a simple retry backoff, and I haven't had any issues with it. I was hoping for a solution that would cap the output token count as usage approaches the rate limit, but that messes up the outputs.
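
For anyone who missed the earlier snippet, a retry with exponential backoff can look roughly like the sketch below. It is not the code referred to above, only a minimal illustration: `RateLimitError` is a stand-in for whatever exception your API client raises, and `call_with_backoff` and its parameters are hypothetical names chosen for this example.

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the rate-limit exception raised by your API client (hypothetical)."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=60.0,
                      retry_on=(RateLimitError,), sleep=time.sleep):
    """Call fn(), retrying with exponential backoff plus jitter on rate-limit errors."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            # Delay doubles on each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(base_delay * (2 ** attempt), max_delay)
            # Add up to 10% random jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

To use it, wrap the generation call in a zero-argument function (for example a `lambda`) and pass your client's rate-limit exception class via `retry_on`. This does not raise the limit itself, it only spaces requests out so a long run can finish.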

fredzannarbor commented on July 17, 2024

Having same issue. Matt probably has a way higher token limit than most of us!

nurena24 commented on July 17, 2024

same issue

tuanha1305 commented on July 17, 2024

I believe this error comes from OpenAI rate-limiting your requests. You can either request a higher rate limit or adjust the retry backoff. To request an increase, submit this form: https://docs.google.com/forms/d/e/1FAIpQLSc6gSL3zfHFlL6gNIyUcjkEv29jModHGxg5_XGyr-PrE2LaHw/viewform. The rate-limit guide also covers model limits and error mitigation: https://platform.openai.com/docs/guides/rate-limits/error-mitigation.

ishaan-jaff commented on July 17, 2024

If you have multiple deployments of the same model, you can try the litellm router. It spreads requests across the deployments, which raises your effective rate limit.
Docs: https://docs.litellm.ai/docs/routing

import asyncio
import os

from litellm import Router

# List of model deployments. All three share the alias "gpt-3.5-turbo",
# so the router can send a request to any of them.
model_list = [{
    "model_name": "gpt-3.5-turbo",  # model alias
    "litellm_params": {  # params for the litellm completion/embedding call
        "model": "azure/chatgpt-v-2",  # actual model name
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "azure/chatgpt-functioncalling",
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo",
    "litellm_params": {
        "model": "gpt-3.5-turbo",
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
}]

router = Router(model_list=model_list)


async def main():
    # openai.ChatCompletion.create replacement
    response = await router.acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hey, how's it going?"}],
    )
    print(response)


# `await` only works inside a coroutine, so run the call through asyncio.
asyncio.run(main())
