OpenAI GPT-2 is amazing

This week I’ve finally gotten around to researching (playing with) OpenAI’s groundbreaking language model GPT-2, using talktotransformer.com.

What is OpenAI?

OpenAI is an AI research organization founded by Elon Musk in January 2016 to explore, develop, and deploy technology for the benefit of humanity. It’s headquartered in San Francisco, with headquarters in Mountain View, California.

And GPT-2?

OpenAI recently unveiled a language model called GPT-2 that, given some input text, predicts the coming sentences. GPT-2 also supports the idea of the “Grammar of Reasoning” (GRR), in which the model attempts to extract sentences that would make the most intuitive sense, in terms of human understanding. For example, if you input “the dog ate the cat” and the model predicts “dogs eat cats”, the GRR system would make that sentence as the most probable result. The problem is that even though GRR is very good at answering the question, its predictions are still highly contingent. For example, it is still possible to be a cat-eating dog, and then someone else who doesn’t eat cats can still eat your dog, but the GRR system would not make such a mistake for the second sentence.

GPT-2 uses an adaptive model to learn the most relevant concepts to solve complex problems. This is accomplished by using the most relevant words as words, rather than the least relevant words. The model is adaptive to the structure of the content being analyzed.

How does it work?

There are several components involved in GPT. GPT uses a deep learning based model to understand a large vocabulary.

The model consists of a neural network, which is composed of multiple layers. Each layer is a separate layer that processes a different input word. GPT uses two neural network layers for each of the following steps:

First, each layer learns to recognize a particular word. The layers for both input and target are trained together to improve the model’s performance over time.

Second, the neural networks are used to predict the target word for the current context. To do this, the first neural network is used to learn about the target word. The target is then used to select a second neural network layer, which is used to predict the word that is closest to the current context.

Is the model too powerful?

This is a good question to ask, but it should be taken with a grain of salt. In fact, the GPT-2 is actually quite limited in the way it learns the most relevant concepts. This is in part due to its learning model being a probabilistic algorithm that assumes a model of the world as a probabilistic system. This model is an attempt to describe all the possible states of the world, but the way it learns these states is by taking into account the best model for each state and comparing to that.

A GPT-2 surprise

Surprise!

This entire blog post above was generated by OpenAI’s GPT-2. I only gave the first sentence and the four headers as input; after that, I let GPT-2 fill in the blanks (of course, I cherry-picked here). The quality is amazing, and I’m pretty sure there are people who made it this far… and had no clue that what they had been reading was 100% generated by a machine.
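
If you’d like to try something similar yourself beyond the web demo, here’s a minimal sketch of prompting GPT-2 locally. I only used talktotransformer.com for this post, so take this as an assumption on my part: it relies on the Hugging Face transformers library (with PyTorch installed) and the small public "gpt2" checkpoint, and simply samples a continuation for a prompt.

# Minimal sketch: sample a GPT-2 continuation for a prompt.
# Assumes the Hugging Face "transformers" library and PyTorch are installed.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "OpenAI GPT-2 is amazing"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# do_sample=True gives varied completions, much like the web demo;
# top_k / top_p restrict sampling to the most likely next tokens.
output_ids = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Run it a few times and you’ll get a different continuation each time, some more convincing than others, which is exactly the cherry-picking game I played above.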

Be sure to play around with it at talktotransformer.com