Dear reader, this is just going to my take on what was published so the source of truth is https://openai.com/blog/function-calling-and-other-api-updates, https://platform.openai.com/docs/guides/gpt/function-calling and the emails they sent out to the developers on the day.
As I read through the latest batch of OpenAI updates, I find myself torn between slight exasperation and intrigue. There is a lot of give and take here, if I was being pessimistic I'd even say it's like a magic trick, they distract you with the shiny new toy in one hand while they pick your pocket with the other. But I'll give them the benefit of the doubt here as this is entirely opt it and it still seems like it was done under good intentions. Time to dive in.
- Cost: GPT3.5 has a general cost reduction of about 25%, GPT4 cost remains the same, however they have provided developers new ways to increase the token burn. The GPT3.5 cost reduction is as murky as a foggy night in London, before this update I think you just paid for an output token, but now you have to pay for both input AND output, double the fun right?. It's now the same as GPT4's pricing structure.
- GPT3.5 can now following the 'System' instruction a lot better and from quick testing, it seemed to be ok? Before it was really hopeless.
- GPT3.5 new 16k content model, with the ability to follow instructions and at this price point, this feels like a significant improvement. It’s like having a pet that finally sits when you tell it to. Good job.
- At the time of writing, models 'gpt-3.5-turbo' and 'gpt-4' are still using the models from March, and on 2023/07/23 the update models 'gpt-3.5-turbo-0613' and 'gpt-4-0613' will automatically become the default model names, so keep that in mind in case you need to adjust your API. GPT3.5 16k context and GPT4 32k context would have to be specified separately.
- The waitlist for the GPT4 model is about to become history, which is good news, it means I can finally access that sweet 32k context GPT4 model without having to wait for a Golden ticket to fall from the heavens.
Alright, let's get this out of the way. I know you've been waiting on pins and needles for my initial impressions of these 'upgraded' LLM models. Spoiler alert: prepare for a slight letdown. Sure, GPT3.5 has finally mastered the fine art of following instruction prompts properly – congratulations, it's learned how to sit and stay like a well-behaved little tech-puppy. But beyond this minor miracle of obedience, I'm sorry to report that it hasn't shown any other upgraded abilities compared to its previous iteration.
And as for our star player, GPT4, brace yourself for this thrilling news – it has... drumroll please... no noticeable increase or decrease in ability. I can sense your surprise. It's like going to a concert and the band plays all the same songs in the exact same way as the last time you saw them. But hey, let's not be too harsh – at least it hasn't regressed, right? Small victories.
I already discussed function calling in a different post but I also want to talk about it here in more general. This shiny new feature opens up a universe of possibilities.
The biggest merit is you can now receive your data in a neat JSON format, making you feel like a conductor leading your tech-orchestra. You're in control, right?
But here is the catch, as written in their dev docs, functions are injected as system messages, that means with each call you have to be careful of not only your input, your system message but now also all your functions. They consume tokens. It's like taking a bite out of your bank account every time you use your shiny new function toy. And guess what? It counts as 'input token' usage, too. Oh, how I love the smell of stealthy costs in the morning. So much for that alluring 25% cost decrease. However credit where credit is due this is entirely opt in, so use caution.
This is ultimate what I hoped for in the ChatGPT's 'plugins' only to realize 'plugins' are not actually what I want, I wanted to call and intercept the response with my own code. This kind of paves the way to a back road to implementing GPT into your own product a bit better as you have more control. Also the enticing idea of back-to-back function calls before reply to the user is so exciting, ideas, ideas ideas!
When I saw the unscientific unit of measurement of 'pages' pop up in their announcement all I could think of was silly jokes about imperial unit users, my condolences. Maybe developers need to start measuring their API or system response time in candle lengths. Anyway it reminded me of an anecdote. In Japan the page size of B5 is more common for every day items. It's mainly in office scenarios where A4 is used and perhaps you might receive the occasional mail in an A4 sized albeit folded multiple times to fit in the envelope. Textbooks, notepads, bills, and manga and magazines etc all vary in sizes (Defiantly not A4). So what should I envision when I see the 'pages' in the digital world and my physical world hardly revolves around photocopying things in an office.
In conclusion OpenAI has provided us more new set of tools which have limitless possibilities and nice quality of life updates here. I do look forward to playing with these, so here's to wading through token costs and experimenting with functions.