The Chamber 🏰 of Tech Secrets is open. I watched the OpenAI DevDay keynote and was impressed with a lot of the changes. That will be the subject of today’s post… though it will be less about the specifics of the announcements and more about some ways this may impact all of our work.
OpenAI DevDay
There will be countless posts and videos about OpenAI DevDay, recapping the announcements and hyping the future of AI. “I watched the keynote so you don’t have to” will be a common approach. This post will not be that. Instead, I want to consider the impacts — near and mid term — of some key changes. Longer term is unknowable. As Sam said in closing, “We hope you’ll be back next year because what we showed today will look very quaint compared to what we have then”.
1. GPT-4 Turbo w/ 128k tokens
There were several things I tried to build with OpenAI over the past year, and the main limit on my creativity was the size of the context window I was working with. I consistently struggled to slim my context down to the allowed number of tokens for the job I wanted to do.
For those completely unfamiliar with a context window, think of it as the short-term memory of a Large Language Model (LLM). Tokens are roughly equivalent to words, though some words require multiple tokens.
128k tokens allows for inputs that are significantly larger than previous model versions and opens the door to processing a lot more input data, or generating much lengthier outputs, without the model “forgetting” what it is talking about.
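To get a feel for what 128k tokens means in practice, here is a back-of-the-envelope sketch using the common heuristic that one token is roughly four characters of English text (for exact counts you would use a real tokenizer like OpenAI’s tiktoken; the numbers below are estimates, not guarantees):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate the token count of a piece of English text.

    Uses the rough rule of thumb that 1 token ~= 4 characters.
    """
    return max(1, round(len(text) / chars_per_token))

# A 128k-token window corresponds to roughly this many characters
# (on the order of 300 pages of a typical book):
window_chars = 128_000 * 4  # ~512,000 characters

sample = "Tokens are roughly equivalent to words, though some words require multiple tokens."
print(estimate_tokens(sample))  # a rough estimate, not an exact count
```

The point of the arithmetic: the jump to 128k is what turns “summarize this paragraph” workloads into “summarize this entire codebase or book” workloads.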
Matt Rickard made an astute observation about context window growth this week, which I will call “Rickard’s Law” (and hope it sticks).
The maximum context length of state-of-the-art Large Language Models is expected to double approximately every two years, driven by advances in neural network architectures, data processing techniques, and hardware capabilities.
This significant and consistent growth in context opens some incredible possibilities that I am excited to play with. If the law holds, we are likely to see:
Models get increasingly “rememberful” for years to come
Capabilities to produce long-form outputs (like a novel)
Abilities to summarize and synthesize much more complex and interrelated datasets over time
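To make the doubling concrete, here is a quick projection of where “Rickard’s Law” would take context lengths, starting from GPT-4 Turbo’s 128k tokens in 2023. This is purely illustrative compounding, not a roadmap from OpenAI or from Rickard:

```python
def projected_context(base_tokens: int, base_year: int, year: int,
                      doubling_years: float = 2.0) -> int:
    """Project context size for a future year, assuming steady doubling
    every `doubling_years` years (the hypothesized "Rickard's Law")."""
    return int(base_tokens * 2 ** ((year - base_year) / doubling_years))

# Starting point: GPT-4 Turbo's 128k-token window, announced in 2023.
for year in (2023, 2025, 2027, 2029):
    print(year, projected_context(128_000, 2023, year))
```

Under that assumption, a million-token window arrives around the end of the decade, which is the kind of scale where whole-corpus summarization stops being a chunking problem.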
2. Model control w/ JSON output
Getting reliable, well-formed JSON outputs is cool. There were open source tools and prompt structures that helped with this, but they won’t be required when working with the OpenAI APIs anymore. This moves us another step toward my thesis that eventually all applications will have a model assisting with some element of the application, much like how nearly all applications have a database today.
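Here is a sketch of what the new JSON mode looks like at the request level. The payload mirrors the `response_format={"type": "json_object"}` parameter announced at DevDay; no network call is made here, so the “reply” below is a stand-in rather than real model output:

```python
import json

# Chat Completions request shape with JSON mode enabled.
# The model id is the GPT-4 Turbo preview announced at DevDay.
payload = {
    "model": "gpt-4-1106-preview",
    "response_format": {"type": "json_object"},  # constrains output to valid JSON
    "messages": [
        {"role": "system", "content": "Reply in JSON with keys 'name' and 'score'."},
        {"role": "user", "content": "Rate the keynote."},
    ],
}

# With JSON mode on, the model's reply is guaranteed to parse,
# so application code can consume it directly (reply is illustrative):
hypothetical_reply = '{"name": "DevDay keynote", "score": 9}'
parsed = json.loads(hypothetical_reply)
print(parsed["score"])
```

That guarantee is the whole story: the `json.loads` call stops being a thing you wrap in retries and validators.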
3. Custom Models
Custom Models are exactly that… a partnership between an enterprise and OpenAI to create a custom model specific to the customer’s use case. This is the first time I have seen OpenAI move from global services to customer-specific offerings. To strip away the flash of this announcement: OpenAI is getting into the professional services business. Sam: “We won’t be able to do this with many companies to start. In the interest of expectations, it won’t be cheap… but if you want to push things as far as they can currently go, get in touch with us, because we believe we can do something great”.
Assuming this service grows and scales, this becomes a potentially powerful capability for enterprises that will never be able to acquire the people necessary to do something like this on their own. There are only so many OpenAI-scale model researchers to go around.
4. Assistant API
At first glance, the Assistant API (now live in beta) takes the complexities of building an agent with something like the open source LangChain and moves them into the closed source OpenAI ecosystem. The trade-off is that it looks like OpenAI is going to make this pretty easy.
The Assistant API offers persistent threads (for managing the state of an interaction over time), built-in retrieval, a code interpreter (for writing code to solve problems on the fly), and function calling (for outside interactions). It also integrates with text-to-speech (TTS) and Whisper, along with some new and fairly impressive natural-language voices that nail the delivery of the sentences they speak.
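To show how those pieces fit together, here is the assistant → thread → run flow sketched as plain request payloads rather than live client calls. The ids and field values are illustrative, not from a real session, though the tool type names match the beta announcement:

```python
# Define an assistant with the two built-in tools announced at DevDay.
assistant_def = {
    "model": "gpt-4-1106-preview",
    "instructions": "You are a data analyst.",
    "tools": [{"type": "code_interpreter"}, {"type": "retrieval"}],
}

# 1. A thread is the persistent conversation state; the caller never
#    resends history, because it lives server-side with the thread.
thread = {"id": "thread_abc123", "messages": []}

# 2. Append a user message; earlier turns stay attached to the thread.
thread["messages"].append({"role": "user", "content": "Chart my Q3 sales."})

# 3. A "run" executes the assistant against the thread; tool calls
#    (code interpreter, retrieval, functions) happen inside the run.
run = {"thread_id": thread["id"], "status": "queued"}

print(len(thread["messages"]), run["status"])
```

The design choice worth noticing: state management, retrieval storage, and tool execution all move to OpenAI’s side, which is exactly the work you previously stitched together from separate open source pieces.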
This moves us another step toward useful agents and removes the need for a lot of open source tooling (vector databases, chaining tools, etc.).
Summary
There were a lot of exciting updates here, but most exciting may be the hint that these capabilities “will look quaint next year”. I am guessing OpenAI is significantly ahead on their research internally and has more splashes to come. I remain bullish on OpenAI and their ability to execute. They are taking the AI landscape to another level and look poised to continue to do so.
From a business perspective, it’s been a big year for OpenAI. They found a capital and cloud computing partner in Microsoft and solidified the relationship. They released some great models. They found inroads to enterprises. They entered the professional services business. They entered the software-around-models business (Assistant API, etc.). And, right out of the AWS playbook, they are lowering prices while improving capabilities.
Now let’s see what the other players have up their sleeves. 2024 should be fun… unless AI destroys us all. ;)