Language modeling (LM) is one of the most important tasks in modern Natural Language Processing (NLP). A language model is a probabilistic model that predicts the next word or character in a document given the text that precedes it.
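To make the idea of a probabilistic next-word predictor concrete, here is a minimal sketch of a bigram model that estimates the probability of the next word from simple counts. It is far simpler than GPT and is meant only to illustrate the principle; the toy corpus and the function names are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Estimate P(next | prev) by relative frequency; empty dict if prev is unseen."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    if total == 0:
        return {}
    return {word: c / total for word, c in following.items()}


# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat the cat ate".split()
model = train_bigram_model(corpus)
probs = next_word_probs(model, "the")
print(probs)
```

In this toy corpus "the" is followed by "cat" twice and by "mat" once, so the model assigns "cat" the higher probability. A neural language model such as GPT plays the same role but conditions on a much longer context.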
Generative Pre-trained Transformer (GPT) can be considered a game changer in natural language understanding and a front runner in language modeling. It addresses a diverse set of tasks, such as textual entailment, question answering, document classification and semantic similarity assessment. It works with large amounts of unlabeled text, which is abundant but has always been difficult to exploit. GPT harnesses generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In its original evaluation, this approach outperformed discriminatively trained models with architectures crafted specifically for each task on most of the benchmarks studied.
Three versions have been released so far: GPT, GPT-2 and GPT-3.
1. GPT
GPT is trained with a causal language modeling (CLM) objective, which makes it effective at predicting the next token in a sequence. The GPT framework aims to build a strong natural language understanding (NLU) system. It uses a single task-agnostic model trained by generative pre-training followed by discriminative fine-tuning. Pre-training on a vast amount of text gives GPT broad knowledge of language, which in turn helps it solve discriminative tasks such as:
GPT takes a semi-supervised approach to NLU that combines unsupervised pre-training with supervised fine-tuning. It uses a large corpus of unlabeled text together with several datasets of manually annotated training examples (the target tasks). Training proceeds in two stages. In the first stage, a language modeling objective is applied to the unlabeled data to learn the initial parameters of the neural network. In the second stage, these parameters are adapted to a target task using the corresponding supervised objective.
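The two training stages can be written down as objectives to be maximized. The sketch below follows the notation commonly used for the first GPT model; the symbols (the unlabeled token corpus U, the labeled dataset C, the context window size k, the parameters Θ and the weight λ) do not appear in this article and are introduced here as assumptions for illustration.

```latex
% Stage 1: unsupervised pre-training on an unlabeled token corpus U = {u_1, ..., u_n}
L_1(\mathcal{U}) = \sum_{i} \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)

% Stage 2: supervised fine-tuning on a labeled dataset C of inputs x^1, ..., x^m with label y
L_2(\mathcal{C}) = \sum_{(x, y)} \log P(y \mid x^1, \ldots, x^m)

% Optional combined objective that keeps language modeling as an auxiliary term
L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \, L_1(\mathcal{C})
```

The first objective is the causal language modeling objective: each token is predicted from the k tokens before it. The second reuses the pre-trained parameters and adds only a small task-specific output layer.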
2. GPT-2
In 2019, OpenAI published GPT-2, which was trained to predict the next word given the words that precede it. GPT-2 is a Transformer-based language model (Figure 1) that is well suited to generating coherent text.
Figure 1: Transformer Architecture 
The system is a general-purpose language model that applies machine learning to a broad range of language processing tasks. Building on the pre-training approach of the first-generation GPT, GPT-2 can generate syntactically consistent text. It has opened a new direction for working with text data. Like its predecessor, GPT-2 is a pre-trained language model that can be used for various NLP tasks, such as:
The full GPT-2 model contains:
GPT-2 builds on the Transformer (proposed by researchers at Google), which was introduced as an encoder-decoder mechanism for capturing input-output dependencies; GPT-2 itself uses only the decoder stack. Previously generated symbols are fed back as inputs when producing the next output. An additional normalization layer is also added after the final self-attention block. Together, these properties let the model generate a whole article. This is an important development, because many earlier NLP models could only generate a single word or, at best, fill in a missing word in a sentence.
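The feedback of previously generated symbols into the input can be sketched as a simple greedy decoding loop. This is an illustrative sketch only, not GPT-2's actual implementation: the lookup table standing in for a trained model, the "<eos>" end marker and all function names are hypothetical, and the toy model looks only at the last token, whereas a real Transformer conditions on the whole preceding sequence.

```python
def generate(next_token_fn, prompt, max_new_tokens, eos="<eos>"):
    """Greedy autoregressive decoding: each chosen token is appended to the input."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_fn(tokens)       # dict mapping token -> probability
        best = max(scores, key=scores.get)   # pick the most probable next token
        if best == eos:
            break
        tokens.append(best)                  # the output becomes part of the next input
    return tokens


# Hypothetical stand-in for a trained model: a fixed table keyed on the last token.
TOY_TABLE = {
    "the": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 0.6, "ran": 0.4},
    "sat": {"<eos>": 0.9, "down": 0.1},
}


def toy_next_token(tokens):
    """Return next-token probabilities; unknown contexts end the sequence."""
    return TOY_TABLE.get(tokens[-1], {"<eos>": 1.0})


result = generate(toy_next_token, ["the"], max_new_tokens=5)
print(result)
```

Greedy selection is only one possible decoding strategy; sampling from the predicted distribution is a common alternative that yields more varied text.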
3. GPT-3
GPT-3 is the third version of GPT, and it can do some remarkable things. Developed by OpenAI, Generative Pre-trained Transformer 3 (GPT-3) is a language model that can interpret text, answer questions and compose fluent text. It analyzes a sequence of words and other text, then draws on those examples to deliver a unique output, such as an article.
GPT-3 works on the basis of the following details:
GPT-3’s capabilities are remarkable. Potential applications include the following:
GPT-3 takes text as input and produces the most probable continuation as output. It then takes the text entered by the user together with the output generated so far and produces the next piece of text. Having been trained on a huge amount of data, GPT-3 can perform a wide range of tasks; however, it still faces several challenges when compared with human intelligence:
GPT can be considered the next evolutionary step in AI. It opens a possible door to creating human-like intelligence through machine learning. It represents a significant advance in AI and a step toward matching human intelligence, though it is still in its infancy and considerable improvement can be expected. It has also pointed the way toward better neural networks.