🪙Token Limits and AI Models

Understanding Tokens

Before we proceed, let's take a moment to understand what tokens are. In natural language processing and AI models, tokens are the individual units of text that the model processes. They vary in size and can represent whole words, parts of words, or single characters such as punctuation marks. For example, a simple split of the sentence "Hello, how are you?" gives six tokens: "Hello", ",", "how", "are", "you", and "?". The exact split depends on the tokenizer each model uses. Models read and generate text as sequences of tokens, which is why limits are counted in tokens rather than in words or characters.
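To make the example concrete, here is a minimal sketch of a word-and-punctuation splitter that reproduces the six tokens above. The function name `naive_tokenize` is made up for illustration, and this is not the tokenizer any particular model uses; real tokenizers usually split text into subword pieces, so actual counts will differ.

```python
import re


def naive_tokenize(text: str) -> list[str]:
    """Split text into runs of word characters and single punctuation marks.

    Illustration only: real model tokenizers work on subword pieces.
    """
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches one character that is neither a word character
    # nor whitespace, such as "," or "?".
    return re.findall(r"\w+|[^\w\s]", text)


tokens = naive_tokenize("Hello, how are you?")
print(tokens)       # ['Hello', ',', 'how', 'are', 'you', '?']
print(len(tokens))  # 6
```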

Token Limitations and Budget

The AI models we employ have different maximum token limits, which depend on the model and the subscription tier. Free and 'Get a Taste' tier users have a token limit of 2048, 'True Supporter' tier users have access to over 4096 tokens, and 'All In' tier users can use up to 8192 tokens on certain models. The limit covers both the input text sent to the model and the output the model generates. The number of tokens reserved for the response (max_new_tokens) therefore comes out of the same limit and reduces the tokens available for the input. Keep this in mind when creating chatbots so that the conversation stays within the token budget.
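The arithmetic can be sketched as follows. The tier limits are the figures quoted above (4096 is used as a conservative value for 'True Supporter', since the text only says "over 4096"). The `max_new_tokens` value of 250 and the helper name `input_budget` are assumptions chosen for illustration, not settings taken from the service.

```python
# Context limits per tier, as quoted in the text above.
TIER_CONTEXT_LIMITS = {
    "Free": 2048,
    "Get a Taste": 2048,
    "True Supporter": 4096,  # text says "over 4096"; conservative value
    "All In": 8192,          # on certain models
}


def input_budget(context_limit: int, max_new_tokens: int) -> int:
    """Tokens left for the input once the response allowance is reserved."""
    if not 0 <= max_new_tokens <= context_limit:
        raise ValueError("max_new_tokens must be between 0 and context_limit")
    return context_limit - max_new_tokens


ASSUMED_MAX_NEW_TOKENS = 250  # hypothetical response allowance

for tier, limit in TIER_CONTEXT_LIMITS.items():
    print(f"{tier}: {input_budget(limit, ASSUMED_MAX_NEW_TOKENS)} input tokens")
```

With these assumed numbers, a 2048-token limit leaves 1798 tokens for the chatbot definition and the message history combined.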

Chatbot's Personality and Example Dialogue

When creating chatbots, it's important to be mindful of the token budget. Defining a chatbot and providing example dialogues consume tokens. The chatbot's personality description, including traits, background information, and other details, can significantly impact the available token count. As a good practice, aim to keep the chatbot's personality description within the range of 900-1100 tokens to leave room for other aspects of the conversation.

Message History

The message history is the ongoing conversation with the chatbot. It includes the user's input, the chatbot's responses, and any contextual information needed for the conversation to flow naturally. Because of the token limit, only part of the history can be included, so avoid creating chatbots whose definitions use too many tokens. For example, excluding memory, a 2k context covers about 16-18 previous messages on average, a 4k context about 42, and an 8k context about 84.

These figures are only estimates. The number of messages included in the history can vary greatly depending on factors such as message length, the chatbot's persona, and memory.
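One way to picture how the history is limited is to count back from the newest message until the remaining budget runs out. The sketch below assumes that approach; it is not a description of how the service actually selects messages. The function name `fit_history` and all the numbers in the example (a 1000-token definition, a 250-token response allowance, 45 tokens per message) are hypothetical.

```python
def fit_history(message_token_counts: list[int], history_budget: int) -> int:
    """Return how many of the most recent messages fit in history_budget.

    message_token_counts is ordered oldest first; messages are kept from
    the newest backwards until the next one would exceed the budget.
    """
    used = 0
    kept = 0
    for count in reversed(message_token_counts):
        if used + count > history_budget:
            break
        used += count
        kept += 1
    return kept


# Hypothetical figures for a 2048-token context.
context_limit = 2048
definition_tokens = 1000  # assumed size of the chatbot definition
max_new_tokens = 250      # assumed response allowance
history_budget = context_limit - definition_tokens - max_new_tokens  # 798

messages = [45] * 30  # thirty messages of 45 tokens each (assumed)
print(fit_history(messages, history_budget))  # 17
```

Longer messages or a larger definition shrink the result quickly, which is why the message counts above can only be rough estimates.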

Optimizing Token Usage

To make the most of the limited token budget, it's important to be concise and prioritize essential information. Here are a few tips to optimize token usage when creating chatbots:

  1. Keep the chatbot's personality description and examples brief but effective, aiming for a total character definition within the range of 800 to 1100 tokens.

  2. Use concise language and avoid unnecessary verbosity in the Personality and Scenario.

  3. Consider summarizing or paraphrasing information to save tokens while maintaining clarity.
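A quick way to check a definition against the 1100-token upper bound mentioned in the first tip is to estimate its size before saving it. The sketch below assumes roughly four characters per token, a common rule of thumb for English text and not the count any specific model's tokenizer would give. The function names and the three-field split (personality, scenario, example dialogue) are illustrative assumptions.

```python
import math

CHARS_PER_TOKEN = 4  # assumed rule of thumb for English; not model-specific


def estimate_tokens(text: str) -> int:
    """Rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def check_definition(personality: str, scenario: str, examples: str,
                     target_max: int = 1100) -> tuple[int, bool]:
    """Return the estimated total and whether it stays within target_max."""
    total = sum(estimate_tokens(part)
                for part in (personality, scenario, examples))
    return total, total <= target_max


total, within_target = check_definition(
    personality="A cheerful librarian who loves obscure trivia.",
    scenario="The user is visiting a small-town library at closing time.",
    examples="User: Any book tips?\nBot: Oh, do I have a shelf for you!",
)
print(total, within_target)
```

Because the estimate is approximate, it is safer to leave some margin below the target than to aim for it exactly.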


Creating chatbots using AI models offers exciting possibilities, but it's essential to work within the limitations of token budgets. By understanding the constraints and optimizing token usage, you can create engaging and interactive chatbots while maintaining the coherence of the conversation. Now that you have a clear understanding of tokens and their impact, you're ready to embark on the journey of bringing your chatbots to life!
