GPT-3, a new text prediction algorithm, is causing a lot of buzz in the tech community. We asked our AI geek, Atte Honkasalo, to explain what GPT-3 really is and why we all should be excited about it.
Atte is our Head of Data & Analytics and the mastermind behind Q, our predictive AI platform that can scan, rank, and quantify 700,000 companies based on more than 300 different growth indicators.
First, what is GPT-3?
GPT-3 (Generative Pretrained Transformer) is the groundbreaking third generation of a text-prediction algorithm from OpenAI, an AI research and development company backed by a number of influential companies, investors, and people in the AI industry. Basically, you provide it with text input, and it generates text output that is nearly indistinguishable from what a human would write in the same context.
While it is not a new phenomenon within AI to be able to predict text in this way, what makes GPT-3 so groundbreaking is that it’s the first pre-trained algorithm that can produce text that can pass for something a human would write in any context. Based on the inputs, GPT-3 can produce anything from legal text to blog posts, short stories, songs, poetry, staff memos, even software code that will actually run.
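To make "text in, predicted text out" concrete, here is a deliberately tiny sketch of text prediction. It is not how GPT-3 works internally: it is a toy bigram model that only counts which word tends to follow which, and the corpus and the function names (`train_bigram`, `predict_next`, `generate`) are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(follows, start, max_words=8):
    """Repeatedly append the predicted next word, starting from `start`."""
    out = [start.lower()]
    for _ in range(max_words - 1):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Hypothetical toy corpus; a real model trains on vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" most often in this corpus
print(generate(model, "cat", max_words=4))
```

A model like this only ever echoes word pairs it has already seen. The interesting part of GPT-3 is that the same basic idea, predicting what comes next, is scaled up far enough to stay coherent over whole documents.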
How does GPT-3 work?
GPT-3 is a very complex, pre-trained algorithm. It has been trained on truly massive amounts of text – there are half a trillion words in the algorithm’s training corpus, which is five times the amount of data contained in Wikipedia. It uses a complex neural network model that is many layers deep, containing a massive number of nodes using more than 175 billion (that is, 175 × 10^9) parameters. By contrast, GPT-2, released last year, had only 1.5 billion parameters.
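As a quick back-of-the-envelope check on those two parameter counts, the short sketch below computes how much larger GPT-3 is than GPT-2. The variable names are illustrative only, and the figures are the ones quoted above.

```python
import math

# Parameter counts as quoted in the text above.
gpt3_params = 175e9  # 175 billion
gpt2_params = 1.5e9  # 1.5 billion

ratio = gpt3_params / gpt2_params
orders_of_magnitude = math.log10(ratio)

print(round(ratio, 1))                # 116.7
print(round(orders_of_magnitude, 2))  # 2.07
```

In other words, measured purely by parameter count, the jump from GPT-2 to GPT-3 is a bit more than two orders of magnitude.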
How big of a leap forward is this technology?
While I wouldn’t say this is a world-changing technology, at least not yet, it is an order-of-magnitude advance in AI. What was a hard problem – generating text that can pass for human output – has become a solved problem.
This shift is similar to other AI problems we’ve tackled. For example, 10 years ago, we had no way of reliably recognizing and classifying objects in images. The problem of computer vision has existed for decades (think of post offices and address recognition), but this part was solved roughly five years ago with the ascent of convolutional neural networks and pretrained algorithms like YOLO or RetinaNet.
Today, an AI system can look at an image and tell whether it shows a person, an animal, or an object. Still, the road from foundational technology to real-world use is long: object recognition is a key technology in autonomous driving, but solving autonomous driving requires much more than just the foundational technology.
Another more recent AI problem we’ve solved is exploration, which is not only used in ridesharing and dispatching systems to find the most efficient routes, but can be tested in immersive computer games where the player needs to move in some goal-directed manner. One example is the old arcade game, Montezuma's Revenge. There have been pathfinding AI algorithms that could play this game, but most of them failed after a minute or two. A new reinforcement learning algorithm, Go-Explore, can play the game pretty much forever.
Does the algorithm actually understand what is being asked of it? Can it pass a Turing test?
GPT-3 is essentially mindless. While it can come close to responding as a human would, based on its massive amount of training data, it doesn’t have any comprehension at all. Nor does it have creativity – it can write poetry, but it can’t create new or original poetry. While it may respond like a human, it can’t make the same sort of intellectual and intuitive leaps a human can.
For example, if you ask it a ‘normal’ question like, “Which is bigger, an elephant or a mouse?” it can answer that the elephant is bigger. But if you ask an odd question, like, “Which is bigger, the Empire State Building or a pencil?” – something a human could answer instinctively, but for which there is no context in the training data – the algorithm will guess, and will often answer incorrectly, which makes it clear that it does not actually understand the question.
The Turing test investigates whether people can detect if they are talking to machines or humans. So, it is possible to get responses from GPT-3 that are nearly indistinguishable from those of a human, which brings us very close to having an AI that can pass a Turing test. But we’re not quite there yet.
Lately, Twitter has gone crazy about GPT-3. Is the hype about this technology justified?
An AI that writes predictive text is not new. We’ve had AIs that can write in the style of Shakespeare, AIs that can generate code, even AIs that can generate music in the style of JS Bach. But those AIs are each trained for a specific case. They cannot work in any context other than the one in which they’re trained. GPT-3 works in any textual context, which is a massive scientific advance, one of the major advances of the last few years. The hype about this advance is certainly justified.
But the hype can go too far. Yes, you can use it to create text that accurately predicts what people expect – for example, a business plan that is coherent, logical, and looks like a real business plan. But there will be no analysis or comprehension involved. It’s a lot like the way I respond sometimes when I find myself in a meeting where I don’t know anything about the technology being discussed. When it’s my turn to speak, I might say, “That’s good, but does it scale?” or “Let’s take a few steps back.” It’s an appropriate response but doesn’t illustrate my comprehension of the subject. That’s more or less the way this algorithm will respond under similar circumstances.
What opportunities does GPT-3 create for founders and VCs?
We are really just at the start of the race. The algorithm is in private beta, the first few companies have gained access to the API, and we’re just beginning to see the technology applied to real problems. Going forward, there’s going to be a marathon from scientific research and tech demos to actual products and features. Legal documents and contracts appear to be a good initial use case, but there will be many more that we haven’t even dreamed of yet.
On the con side, as with videos, there is the very real danger of deep fakes. The risks range from school essays that fool teachers to significantly more sophisticated phishing scams and misinformation bots. Typically, you could spot fake text by its quality or tone, but with GPT-3 the fake texts can come across as very authentic. GPT-3 has initially been released as an API, and OpenAI will most likely monitor harmful use cases. But since the research is public, the technology will become commonplace and proliferate to harmful use cases in the future. That’s why we will need to find innovations to determine whether documents have been algorithmically created.
Looking at the big picture, GPT-3 has the potential to create a great many opportunities in a wide range of industries. For example, instead of focusing development efforts on text generation for a specific context, companies can use GPT-3 as a core technology for generating text and focus their development efforts and resources on the capabilities and features that differentiate their products. In the end, text is just data. What creates value from that data – unique product features and capabilities – is what’s important. GPT-3 can remove barriers, enabling companies to do things that were simply not possible previously.
Continue the discussion with @attehonkasalo on Twitter.