Talking Tokens: A ChatGPT Primer for the Inquisitive Mind

Let's begin unraveling the mysteries of tokens.
I worked this out and wrote it up a while ago, but in the meantime OpenAI seems to have published an official post on the subject, so it might be better to read that instead. After some deliberation I decided to post mine anyway.
First I will try to describe tokens; near the end I will show them in practice so you can have a go and understand them too.
A "token" in the context of GPT/LLMs (some of you might be more familiar with the term "ChatGPT") is essentially a unit of information. Think of it as a piece of a puzzle that makes up a language. In GPT, tokens can be words, pieces of words, characters, punctuation, you name it.
These tokens are the building blocks LLMs use to process text input, understand context, and spit out relevant responses or predictions. It's like having a machine "think" and "write" like a human, but without the mood swings and sarcasm. Not bad, huh?
Now, there's this thing called "embeddings," but I won't bore you with the details in this post. Just know that once we have a token, it's turned into an embedding vector to grasp its meaning and relationships with other tokens.
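If you are curious what "turned into an embedding vector" looks like mechanically, here is a minimal sketch. It is only a lookup table from token ID to a list of numbers. Everything in it is an assumption for illustration: the vector size, the random values (real models learn theirs during training) and the helper names are all made up, and the token IDs are simply borrowed from the example later in this post.

```python
import random

# Hypothetical size; real models use hundreds or thousands of dimensions.
EMBEDDING_SIZE = 4

def make_embedding_table(token_ids, size=EMBEDDING_SIZE, seed=0):
    """Build a toy table mapping each token ID to a vector of random floats.
    Real embeddings are learned in training; these are placeholders."""
    rng = random.Random(seed)
    return {
        tid: [rng.uniform(-1.0, 1.0) for _ in range(size)]
        for tid in sorted(set(token_ids))
    }

def embed(token_ids, table):
    """Look up the vector for each token ID, keeping the original order."""
    return [table[tid] for tid in token_ids]

token_ids = [15496, 616, 1438]  # example IDs, used here only for illustration
table = make_embedding_table(token_ids)
vectors = embed(token_ids, table)
print(len(vectors), "vectors of length", len(vectors[0]))
```

The point is just that the model never sees letters, only these lists of numbers.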
However, the biggest drawback is the token limit.
The token limit is important because it determines how much text ChatGPT can process and generate in a single interaction. When using GPT-3.5 or any other LLM, you need to be aware of the token limit, which in ChatGPT's case is 2048 tokens (a total of 4096, split between input and output). However, please note I do not know this for sure; I am basing it on what I have read in the dev docs and on experience. In reality OpenAI could turn around and suddenly up it to 8000 tokens or whatever; I do not have that information.
To keep things simple I will be talking about ChatGPT, but the same rules apply to GPT-3.5 Turbo if you are using the API.
Why should you care about token limits? Well, if your input text exceeds the token limit, you'll have to truncate or shorten it, which might affect the context and result in inaccurate or incomplete responses. Additionally, if the generated output reaches the token limit, it might get cut off, leaving you with a nonsensical or incomplete reply.
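To make the budgeting idea concrete, here is a rough sketch of checking how much room a prompt leaves for the reply. Two loud assumptions: the 4096 total is the figure I quoted above (which I am not certain of), and the token count is only estimated with the common "about 4 characters per token" rule of thumb for English, not with the real tokenizer. The function names are made up for this example.

```python
# Assumed total budget (input + output); not an official, guaranteed figure.
CONTEXT_LIMIT = 4096

def rough_token_count(text):
    """Estimate tokens as roughly 4 characters each (rounded up).
    This is a rule of thumb for English, not the real tokenizer."""
    return (len(text) + 3) // 4

def remaining_for_reply(prompt, limit=CONTEXT_LIMIT):
    """Estimated tokens left for the model's answer after the prompt."""
    return max(0, limit - rough_token_count(prompt))

prompt = "Hello my name is Izumi"
print("prompt is about", rough_token_count(prompt), "tokens")
print("room left for the reply:", remaining_for_reply(prompt))
```

If the number left over is small, the reply is the thing that gets cut off.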
So, knowing the token limit helps you manage the input and output effectively, ensuring that you get the most coherent and relevant responses from ChatGPT. It's like making sure you don't overflow your coffee cup – nobody wants a mess, right?
When having a conversation with ChatGPT, each follow-up question adds to the token count of the following input. If you keep asking questions without considering the token limit, your text might exceed the allowed 2048 tokens. What happens then? By observation it seems ChatGPT will automatically truncate the text, which could lead to losing important context from your conversation. If you are using the API you must truncate this yourself.
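For API users who have to do the truncating themselves, one simple approach is to keep only the newest messages that fit the budget. This is a sketch of that idea, not what ChatGPT actually does internally (I can only observe that from the outside). The token count is again a rough 4-characters-per-token estimate, and `truncate_history` is a name I made up.

```python
def rough_token_count(text):
    """Estimate tokens as roughly 4 characters each; not the real tokenizer."""
    return (len(text) + 3) // 4

def truncate_history(messages, budget):
    """Keep the most recent messages whose estimated total fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = rough_token_count(message)
        if used + cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

history = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
print(len(truncate_history(history, budget=25)), "messages kept")
```

Notice that the oldest messages are the first to go, which is exactly the "memory loss" you feel in long conversations.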
ChatGPT's method of truncating is interesting. Have you ever noticed that when you ask a question, it sometimes puts a summary of what you asked into its answer? That is because when you ask a follow-up question, it reads your question's tokens and then whatever sits above them (previous answers, previous questions, etc.). If the summary comes near the end of its answer, it may still fit within the token limit. By continuing to summarize this way, it does its best to keep context. It is extremely annoying reading such nonsense in the reply, isn't it?
What happens when the token limit is reached? ChatGPT might not understand the query, or worse, give you a completely irrelevant response. It's like trying to stuff an oversized sandwich into your mouth – it's a disaster waiting to happen. So, when using ChatGPT, you need to keep track of the token count and manage your questions accordingly to ensure a somewhat coherent "conversation".
When dealing with ChatGPT, memory loss is a real pain. If it forgets something, you'll have to remind it, but here's the catch-22: rehashing the context only makes it forget parts of the previous context. Yeah, real helpful, right? So, if you're planning on using ChatGPT for long-term conversations, brace yourself for a wild, frustrating ride. Good luck with that.
OK, let's dive into a practical example so everyone can understand better.
OpenAI has provided us with a tokenizer tool, and it was through using it that I gained an understanding of how this all works.

Let's do a basic example and break it down.
Input: "Hello my name is Izumi"
Output

Let's try and understand what is happening here. "Hello": 1 word, 1 token. Simple.
But let's look at my name: "Iz" is blue and "umi" is purple. Umi could be 海, but it's interesting to see how it breaks down "words". This may seem strange to us, so let's see how the machine interprets it. In the grey box where the text is highlighted, click on "TOKEN IDS". We will get this array.
[15496, 616, 1438, 318, 28493, 12994]
These are the tokens: 28493 = Iz, 12994 = umi
When an LLM replies, it produces a sequence of tokens similar to the example above. As mentioned earlier, these tokens, which could represent words, characters, punctuation, etc., are converted back into human-readable text, making it appear as if LLMs are speaking our language, just so we can understand it. Aren't we lucky?
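Here is a toy version of that round trip, so you can see text become IDs and IDs become text again. The IDs are copied from the array above, but the mapping of each ID to a piece of text (including the leading spaces) is my assumption for illustration. Also note the encoder below uses a simple greedy longest-match, which is a simplification; real GPT tokenizers use byte-pair-encoding merge rules.

```python
# Toy vocabulary: IDs copied from the array above; the text pieces
# (including leading spaces) are assumed for illustration.
TOY_VOCAB = {
    15496: "Hello",
    616: " my",
    1438: " name",
    318: " is",
    28493: " Iz",
    12994: "umi",
}

def decode(token_ids, vocab=TOY_VOCAB):
    """Turn token IDs back into text by joining their pieces."""
    return "".join(vocab[tid] for tid in token_ids)

def encode(text, vocab=TOY_VOCAB):
    """Greedy longest-match encoding against the toy vocabulary.
    Simplified stand-in for real byte-pair encoding."""
    pieces = sorted(vocab.items(), key=lambda item: len(item[1]), reverse=True)
    ids = []
    position = 0
    while position < len(text):
        for tid, piece in pieces:
            if text.startswith(piece, position):
                ids.append(tid)
                position += len(piece)
                break
        else:
            raise ValueError(f"no token covers text at position {position}")
    return ids

ids = encode("Hello my name is Izumi")
print(ids)
print(decode(ids))
```

Anything outside this tiny vocabulary raises an error here, whereas a real tokenizer can always fall back to smaller pieces.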
The thing is, ChatGPT is personified to make it feel more relatable and engaging to users like you. By understanding tokens and the inner workings of the model, you're able to see past the human-like facade and realize that it's just a machine doing some clever language processing. But hey, having a more "human" interaction makes it more enjoyable, right?
Also, one last thing: if you coax ChatGPT into responding a certain way and it starts responding the way you want, don't be fooled into thinking you're influencing its "thoughts." It's an advanced machine, but it's still just a machine, not a person. When it starts responding the way you want, it's merely matching patterns and similarities based on the input you provide. So, enjoy the conversation, appreciate its capabilities, but let's not get carried away by treating it as if it has feelings. Those "feelings" could just be "tokens" after all.