Welcome to the 'Layer 2' of AI

Since November 2022, and really since the release of Stable Diffusion several months prior, the world has been obsessed with generative AI, and I mean obsessed. In the mornings, when I read the news, I also read my GitHub Explore tab, which surfaces interesting new projects; I'll read up on some of them and stash others away to review later. Since the release of ChatGPT, 90% of the projects I've seen in that tab each day have been related directly to AI, if not to ChatGPT itself. This is certainly a hype phenomenon (although a justified one in some senses, given how phenomenally powerful the tech is), but it has given rise to an interesting new term: prompt engineering.

If you look up this strange term today, you'll find all manner of e-books on the subject, with their fair share of obscure-to-nonexistent publishers and spelling mistakes throughout, a telltale sign of people trying to capitalise on the next big thing. I won't bother explaining prompt engineering in detail: it's basically figuring out how to talk to AIs effectively, which applies especially to things like Stable Diffusion (ChatGPT, on the other hand, will pretty much figure out what you mean).

I want to take a different perspective on all this, though: my background is in framework development, and in engineering solutions that might best be called 'layer 1', to appropriate Ethereum's terminology: these are programs other people can use to build their own programs, and they range from web development frameworks that automate dozens of things that are normally quite hard (see Perseus) to decentralised platforms that aim to provide a foundation for distributed storage, computation, and smart contracts (see the Quantorium). When a technology has a stable and powerful layer 1, that is usually when it starts to really take off, and I think this is exactly what we're seeing today in AI.

For the last several decades, AI development has been focused on getting those fundamentals right, with everyone starting largely from scratch in their own systems. Sure, we've had basic architectures like convolutional neural networks (CNNs) for image classification and sequence-to-sequence transduction (for machine translation, among other use-cases), but these aren't frameworks; they're just ideas. They're largely consigned to papers, and it was only with the development of framework-style systems like HuggingFace or TensorFlow that machine learning really became open to a wider range of developers (i.e. those without maths degrees).

Today, however, innovation has skyrocketed in AI because the programs that have been recently released (particularly by OpenAI) are capable of being adapted to novel use-cases: if you ask ChatGPT to act as a translator, it will; if you ask it to play a chef, it will; if you ask it to explain quantum physics like you're five years old, it will. This kind of 'layer 2' engineering, of changing the prompts we give to the system to make it do new things, is much easier than layer 1 engineering: all you need is a bit of common sense, an internet connection, and a bit of patience while you figure out the best way to do something. Contrast that with layer 1 engineering, where you're determining the mathematics of weight optimisation, and you'll understand why innovation in AI is suddenly taking off now, as opposed to five years ago.
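To make that concrete, here's a minimal sketch of what layer 2 engineering looks like in practice, assuming the openai Python library as it existed around the time of writing (the model name, function name, and prompt wording are my own illustrative choices, not anyone's specific project):

```python
# Layer 2 engineering in a nutshell: the same model becomes a translator,
# a chef, or a physics tutor depending only on the prompt it's given.
# Assumes the pre-v1 openai library (pip install openai) and an API key.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def repurpose(role_prompt: str, user_input: str) -> str:
    """Adapt a general-purpose chat model to a new use-case via a prompt."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": role_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return response["choices"][0]["message"]["content"]

# No retraining, no weight optimisation: just a different string.
print(repurpose("You are a translator. Translate everything the user says "
                "into French, and output nothing else.",
                "Where is the nearest station?"))
```

Swapping the system prompt for "You are a chef..." gives you an entirely different application, which is the whole point: the layer 1 work is already done.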

To be clear, there have always been frameworks in AI that have helped developers do different things, but now we seem to have a base level of open-ish platforms (with more open alternatives being rapidly developed) that can be built on to provide novel applications very quickly: just the other day, I saw no fewer than three projects all using prompt engineering with ChatGPT to create applications that would allow one to have a conversation with a paper, or a book, or some other source of information, which could facilitate a whole new era of content summarisation. Again, this required very little 'hard' innovation: they just had to write the right prompts and do a bit of JSON parsing.
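For illustration, here's roughly what that pattern looks like under the hood: a hedged sketch calling OpenAI's chat completions endpoint directly, so the 'bit of JSON parsing' is visible. The names and prompt wording are mine, not taken from any of the projects I saw:

```python
# A sketch of the 'chat with a paper' pattern: stuff an excerpt of the
# source text into the prompt, ask a question, and parse the JSON that
# comes back. The endpoint and payload follow OpenAI's chat completions
# API; everything else here is illustrative.
import requests

API_URL = "https://api.openai.com/v1/chat/completions"

def ask_paper(excerpt: str, question: str, api_key: str) -> str:
    payload = {
        "model": "gpt-3.5-turbo",
        "messages": [
            # The prompt engineering: constrain the model to the source.
            {"role": "system",
             "content": "Answer the user's question using only the "
                        "following source text. If the answer isn't "
                        "there, say so.\n\n" + excerpt},
            {"role": "user", "content": question},
        ],
    }
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,
    )
    resp.raise_for_status()
    # The 'bit of JSON parsing': the answer lives in a nested JSON object.
    return resp.json()["choices"][0]["message"]["content"]
```

A full paper won't fit in the model's context window, so real versions of this chunk the text and select relevant passages first, but the core pattern really is this small.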

To wrap up, I think we're now seeing a very interesting development in AI: not because this generation of it is so much more powerful than the previous, but because it is so much more open, not in the sense of free and open-source, but in the sense of being easily expandable. To return to OpenAI's eventual vision of Artificial General Intelligence (AGI), the fundamental promise of AGI is to allow one to build applications that leverage AI trivially, in the same way that one can build a website with a no-code tool.

All this provides an interesting lens on the development of technology more broadly: this is the first major technological shift I have lived through as an adult, and what fascinates me, as a framework developer, is that what seems to drive the upward kink in a technology's adoption curve is ease of expansion: the moment when a technology becomes a framework for future innovation. Perhaps this also offers some hindsight on two projects, one you've definitely heard of and one you probably haven't: the World Wide Web and Project Xanadu. The former was open and expandable; the latter tried to do all the innovation itself, and refused to be open. Today, www means one thing to almost everyone on the planet, while Xanadu conjures images of Coleridge's poetry, not the vast system of data transclusion its founders envisaged. So, from my perspective, if one lesson had to be taken away from all this for those who wish to build the technologies of the future, it would be this: make them open.