2023, the AI year in review

Saturday 23rd December, 2023 - Bruce Sterling

*Matt Wolfe saw a lot going on.*


Hey there! As the year comes to a close, I’ve been thinking back on all of the crazy AI developments that happened in the last 12 months. Many of the tools and stories I’ve covered recently could’ve been the stuff of sci-fi blockbusters just a few years ago. There were hundreds, if not thousands, of major AI announcements—if you blinked, you probably missed something.

January 23: Microsoft invests $10 billion in OpenAI

The details: Microsoft announced a $10 billion investment in OpenAI. In exchange for funding and computing infrastructure, the deal allowed Microsoft to deploy OpenAI’s models across its various products and services.

Why it mattered: As AI’s most influential startup to date, OpenAI was strategically important for Microsoft. Its standard-setting AI models gave Microsoft a competitive edge over other tech giants in the ongoing AI race.

February 7: Microsoft sneaks GPT-4 into Bing Chat

The details: The day after Google announced Bard, Microsoft dropped Bing Chat. As we would come to find out a month later, Bing was the first publicly available chatbot running on OpenAI’s powerful GPT-4 model. At the time, ChatGPT itself wasn’t yet running on GPT-4: a testament to the power of Microsoft’s $10 billion investment in OpenAI.

Why it mattered: The integration bumped Bing’s daily active users to 100 million the following month, cementing its position as the world’s second-largest search engine (just behind you-know-who).

March 15: OpenAI launches GPT-4

The details: Five weeks after Bing Chat (unofficially) gave us a taste of the model, OpenAI finally unveiled GPT-4. Outperforming all existing LLMs, GPT-4 stunned everyone with its multimodal capabilities, able to handle both text and image inputs. The new model also came with huge improvements in intelligence—according to OpenAI, improvements were evident in the system’s performance on benchmarks like the bar exam, the LSAT, and the SAT.

Why it mattered: At the time, the release of GPT-4 was the biggest advancement in LLM development. As the most powerful LLM on the market, GPT-4 set the performance standards for all existing and future AI models.

March 16: MidJourney rolls out V5


The details: MidJourney made a huge splash in AI image generation with its latest version, V5 (which came just one day after OpenAI dropped GPT-4—what a crazy week that was). V5’s ability to produce incredibly realistic images (remember the viral puffy jacket Pope?) blew everyone’s minds. Compared to its predecessor, V5 understood user prompts better and boasted a wider range of styles, higher image resolution, and more.

Why it mattered: V5’s release was a milestone in generative AI development. Its huge leap towards photorealistic images set a high bar for the AI image generators that would follow.

March 21: Adobe launches Firefly

The details: At this point, March felt more like a year of AI news than a mere month. In its first major step into the AI game, Adobe launched Firefly, its new AI image generator, as a web-only beta. Firefly was trained on Adobe’s stock image library, openly licensed content, and content without copyright restrictions to produce images that are safe for commercial use.

Why it mattered: Firefly was Adobe’s ticket to enter the race for the most popular AI image generator. But it only really started taking off a few weeks later (see next point).

May 23: Adobe brings AI to Photoshop

The details: When Firefly initially launched, it didn’t appear to have any mind-blowing features compared to MidJourney and Stable Diffusion. But that changed when Adobe integrated Firefly into Photoshop and introduced the new, mind-blowing Generative Fill feature, which could add and remove specific objects in an image based on simple text prompts.

Why it mattered: Firefly’s integration into Photoshop helped creators work more efficiently within an app they already use. This kind of integration into existing workflows was key to unlocking the mainstream adoption of generative AI.

June 7: Runway’s Gen-2 revolutionizes text-to-video

The details: AI startup Runway dropped its new text-to-video model, Gen-2. While Runway’s Gen-1 could only change the style of an existing video, Gen-2 was able to create completely new video scenes from a one-sentence prompt. It could also generate short video clips (just a few seconds long, but still!) from an existing image or the combination of an image and a text description.

Why it mattered: While Gen-2 wasn’t the first text-to-video tool, it was the first to catch on. Gen-2’s huge leap in generation capability and video quality kicked off a new era for text-to-video tools.

July 11: Anthropic drops Claude 2

The details: Anthropic’s rollout of the Claude 2 model seriously improved the Claude chatbot. In addition to performing better than the prior model on several benchmarks, Claude 2 set the chatbot apart with its 100,000-token context window. This meant that Claude could receive inputs the length of an entire book—the largest context window of any available AI model at the time (since then, Anthropic has upped Claude’s context window to 200,000 tokens, or roughly 150,000 words).

Why it mattered: Claude 2’s impressive context window gave it a huge edge over ChatGPT when it came to handling lengthy PDFs—so much so that many people jumped ship from ChatGPT to Claude. I use Claude just as much as ChatGPT because it simply works better for summarizing long documents.

September 25: ChatGPT can see, hear, and speak

The details: The ChatGPT experience got even better in September when OpenAI rolled out new voice and image capabilities. The voice feature, powered by a new text-to-speech model, allowed users to have spoken conversations with one of ChatGPT’s realistic synthetic voices. The image features allowed users to upload images and converse with the bot about their content.

Why it mattered: The upgrade opened up a whole new world of ChatGPT use cases (like the ability to input a picture of your broken bike and ask ChatGPT how to fix it…how crazy was that?!). It also added to the growing momentum of multimodality in AI models.