There are a number of predictions already out there covering the wide spectrum of what may be coming in 2024 in the field of Generative AI. I’m filling in the gaps between “GPT-5 is coming” and “Our jobs are going away.”

MidJourney v6 via X user @Shudh

Top Predictions

1. Trough of Disillusionment

We’re going to keep bouncing around the hype cycle. 2023 was a doom loop iterating between the Technology Trigger phase (driven by rapid innovation) and the Peak of Inflated Expectations (as everything felt like magic). 2024 will add the Trough of Disillusionment to the mix: we’re already realizing that applying Gen AI to practical, scalable use cases has many limiting factors (“it just doesn’t work”, “so many wrong answers”, “what is RAG?”, “which model is better?”, “whoa, this isn’t cheap!”, “we don’t have the skills for this”, etc.), some of which still require academic/research breakthroughs and potentially new architectures. Don’t give up though; keep learning! Our jobs will be safe till at least 2025! 😉

<aside> 🤖 There are lots of resources made available by Microsoft, Google and others to enable upskilling on Gen AI.

</aside>

- Artificial Intelligence Skills
- Google Cloud Skills Boost
- Fundamentals of Generative AI - Training

MidJourney v6 via X user @scottiewick

2. Responsible AI

Responsible AI was a big talking point in 2023 in boardrooms and the halls of Congress, but I don’t think average consumers will care as much as enterprises will. The business models of companies like OpenAI, with its 100M users and reported $1.6B in annualized revenue, will be challenged on Responsible AI grounds - though it still needs to be proven that adding more guardrails enables more growth and profit. In 2024, we’ll see more and more consumers share their data with multi-modal models of their own free will and with little regard for (or knowledge of) the risks, just as they do with social media today. IMHO, the use of LLMs is not controllable, especially given that there are thousands of open source LLMs, many of them uncensored. The 2024 elections in the U.S. will be a major demonstration (in addition to many minor ones) of the dangers of LLMs powering disinformation campaigns, and a test of any guardrails. 🤷🏽‍♂️

Bing Image Creator via X user @EmmanuelSpamer

3. AI Agents Are Coming

A major potential of Gen AI is to combine the frozen knowledge within LLMs with just-in-time knowledge and tools to enable autonomous decisioning, orchestration, and execution. Several ambitious projects are underway (I’ve tried a few!), and while none of them works well yet, they will mature in 2024.

<aside> 🤖 Here’s what A16Z is tracking in the AI Agents landscape:


</aside>
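Under the hood, most of these projects share the same pattern: the model plans, picks a tool, observes the result, and repeats until done. Here’s a toy sketch of that loop in Python - the “LLM” is a hard-coded stub (a placeholder of my own, not any particular framework’s API), since the orchestration pattern is the point:

```python
# Minimal agent loop: plan -> act -> observe -> repeat.
# stub_llm stands in for a real chat-completion call.

def stub_llm(goal, observations):
    """Pretend planner: fetch data first, then finish."""
    if not observations:
        return {"action": "search", "input": goal}
    return {"action": "finish", "input": observations[-1]}

# Tool registry: name -> callable. Real agents wire in web search,
# code execution, APIs, etc.
TOOLS = {
    "search": lambda query: f"results for '{query}'",
}

def run_agent(goal, max_steps=5):
    observations = []
    for _ in range(max_steps):
        decision = stub_llm(goal, observations)
        if decision["action"] == "finish":
            return decision["input"]
        tool = TOOLS[decision["action"]]
        observations.append(tool(decision["input"]))
    return "gave up"

print(run_agent("cheap GPUs"))  # -> results for 'cheap GPUs'
```

The hard part the real projects are wrestling with is exactly what the stub hides: getting a model to reliably pick the right tool and know when to stop.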

MidJourney v6 via X user @pankociolek

4. Local LLMs Take Over

Local LLMs (mostly open source) will compete with closed-source, proprietary models in production, as 2023 saw multiple approaches to enabling LLMs to run on commodity GPUs and CPUs. I’ve tried many models running locally and can attest that the benefits are very real and the effort is getting easier (even trivial). In 2024, everyone should be able to run some version of an LLM on their home computer. 👨🏽‍💻
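To give a sense of how trivial it has become, here’s one popular route (my example, not the only option): Ollama, which wraps llama.cpp and handles model download and quantization for you.

```shell
# Assumes Ollama is installed (ollama.ai); model downloads are a few GB.
ollama pull llama2
ollama run llama2 "Explain RAG in one sentence."
```

Two commands and you have a 7B-parameter model answering questions entirely on your own hardware - no API key, no per-token bill.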

Meta Emu generated by me on imagine.meta.com with prompt “Lots of different size small robots flying into the screen of a laptop with a human watching with a mind blown expression”

5. LLMs and Edge AI

LLMs on the Edge will emerge in 2024. Large language models will give birth to small language models. Big tech companies like Apple, Google, Meta, and Tesla are already working on running LLMs locally on mobile devices, automobiles, and wearables via nano and small model variants and custom chipsets. All mainstream browsers will support the WebGPU standard to tap into integrated GPU hardware on laptops and desktops, and WebLLM-style inference will live on-device in 2024. 📱 🕸️ 🖥️ 💻 👓

MidJourney v6 via X user @LLMSherpa

6. Superproductivity

Multi-modal models will usher in a new level of creativity and productivity in 2024. As witnessed by the quality and sophistication of MidJourney v6, Meta’s Emu, the open source LLaVA model, OpenAI’s GPT-4V, Google’s VideoPoet, Runway, and many new ones coming, these models will unlock imagination, and new applications will be rapidly built through code generation, even more powerful language models, and LLM application frameworks (LangChain, LlamaIndex, etc.). I’ve experimented with all of them and the potential is awe-inspiring! I can’t wait to experience it at full scale! All images in this post were AI-generated (see captions for details).

MidJourney generated still image that was then animated and filled by AI. From X user @chaseleantj (still) and @elleocalderon (animated)

On this last day of 2023, let’s take a break. Happy New Year! 🎉

MidJourney v6 via X user @EpiphanyLattice: a tardigrade taking a break