ChatGPT in a Nutshell
  • Reporter Kim San
  • Published 2023.03.01 21:06


▲Founders of OpenAI: Sam Altman (left) and Elon Musk / Y Combinator

 ChatGPT is a chatbot system developed by OpenAI that has been taking the world by storm lately, as it can generate remarkably human-like responses to questions on virtually any topic. It is based on a large language model called the Generative Pre-trained Transformer (GPT), also developed by OpenAI, and is fine-tuned for human dialogue using reinforcement learning. Artificial Intelligence is a relatively young field that has seen great success recently with the introduction of modern high-performance chips, and groundbreaking advances are common as more than a hundred papers are published every day. What is so remarkable about ChatGPT, and what sets it apart from other perhaps equally impactful innovations, is its pervasive applicability.
 Language modeling is essentially a statistical inference method that models the probability of the next word given the sequence of previously occurring words; the model then simply chooses the most probable word from that distribution as the next word. The common bottleneck with traditional language models such as Recurrent Neural Networks was handling long-term dependencies: more often than not, the context clue needed to determine the most probable word occurs at the beginning of a passage, and the model has difficulty retaining this information across a long sequence of sentences. GPT aptly solves the long-term dependency issue with two new paradigms, the attention mechanism and the Transformer architecture. The attention mechanism enables the model to attend to the appropriate words in a sentence regardless of their location, and the Transformer is a model architecture that applies the attention mechanism to language modeling.
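 To make the idea concrete, the following is a minimal sketch of scaled dot-product attention, the core operation inside the Transformer, written in plain Python with NumPy. The variable names and the tiny example inputs are illustrative and not taken from GPT itself.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (sequence_length, d) matrices of query, key, and value vectors.
    d = Q.shape[-1]
    # Similarity of every query with every key, scaled to keep values stable.
    scores = Q @ K.T / np.sqrt(d)
    # Softmax over keys: each row becomes a set of attention weights summing to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of value vectors from all positions,
    # so distant context is reachable in a single step, unlike in an RNN.
    return weights @ V

# Toy example: 4 tokens with 8-dimensional embeddings (random placeholder numbers).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)

 Because the attention weights connect every position to every other position directly, the model no longer has to carry early context through a long chain of recurrent steps.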
 OpenAI has been progressively releasing new versions of GPT since 2018, starting with GPT-1; the latest is version 3.5. Both the number of trainable parameters and the size of the training dataset have grown exponentially: GPT-1 had 117 million parameters, while GPT-3 has 175 billion – anecdotally, sources claim GPT-3 was trained on a significant proportion of the Internet itself.
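 To get a rough sense of that scale, here is a back-of-the-envelope sketch (the byte size per parameter is an assumption, not an OpenAI figure): storing 175 billion parameters at 2 bytes each, as in 16-bit floating point, already takes about 350 GB of memory just for the weights.

# Back-of-the-envelope estimate; per-parameter byte size is an assumption.
GPT1_PARAMS = 117_000_000
GPT3_PARAMS = 175_000_000_000
BYTES_PER_PARAM = 2  # assuming 16-bit floating-point weights

print(round(GPT3_PARAMS / GPT1_PARAMS))                 # ~1496x more parameters than GPT-1
print(GPT3_PARAMS * BYTES_PER_PARAM / 1e9, "GB")        # 350.0 GB to hold the weights alone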

 The potential use cases of this technology are vast, as it is able to (1) generate programming code from scratch, (2) find errors in a large codebase, (3) summarize long documents, and (4) generate virtually any type of text. Such an application largely resembles a semantic search engine and has already been incorporated into the Bing search engine; as a consequence, the arms race between Google and Microsoft to gain a competitive advantage has silently begun. On the other hand, ChatGPT has also caused several social disruptions as, for example, students are now able to use it to write essays and cheat on assignments. The company has placed several constraints and safeguards on the model since its initial release, as it initially gave dangerously accurate responses to requests such as how to manufacture a bomb or how to write SQL injection code.
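 Developers can build such features on top of the same model through OpenAI's API. The snippet below is a minimal sketch of use case (3), document summarization, assuming the openai Python package, an API key in the OPENAI_API_KEY environment variable, and a hypothetical input file; the model name and prompt wording are illustrative choices.

import os
import openai  # pip install openai

# Assumes a valid API key is available in the environment.
openai.api_key = os.environ["OPENAI_API_KEY"]

# Hypothetical long document to summarize.
long_document = open("report.txt").read()

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # the model family behind ChatGPT at the time of writing
    messages=[
        {"role": "system", "content": "You are a helpful assistant that summarizes documents."},
        {"role": "user", "content": "Summarize the following in three sentences:\n\n" + long_document},
    ],
)

print(response["choices"][0]["message"]["content"])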
 The world was not ready for the profound impact of ChatGPT – both good and bad. One can easily imagine how it could be used, for example, to disseminate disinformation on a systemic scale. As much as it is a productivity booster for engineers and writers alike, AI researchers must tread carefully with the release of newer models in the future. As Sam Altman, one of the founders of OpenAI, said in an interview about rumors of a new GPT-4 release, “It will come out at some point when we are confident in doing it safely and responsibly. We are going to release technology much more slowly than people would like; we are going to sit on it for much longer than people like.”

▲Team of researchers and engineers at OpenAI / Speak