OpenAI has announced the release of the newest version of its large language model, GPT-4, which the company claims exhibits “human level performance” on various professional tests. This latest model is larger than its predecessors: it was trained on more data and has more weights in its model file, which makes it more expensive to run. GPT-4 was developed by “scaling up” to achieve better results, an approach that many researchers in the field believe is responsible for recent advancements in AI.
The company used Microsoft Azure to train GPT-4; Microsoft has invested billions in the startup. While OpenAI did not reveal details about the specific model size or hardware used in training it, citing “the competitive landscape,” it is known that the model was trained on thousands of supercomputers, a process that could cost tens of millions of dollars.
OpenAI: ChatGPT is getting much smarter!
GPT-4 is expected to power many artificial intelligence demos in the coming weeks. Bing’s AI chatbot is already using it, according to Microsoft. OpenAI claims that the new model will produce fewer factually incorrect answers and go off topic less often. It also outperforms most human test-takers on many exams, achieving scores at the 90th percentile on a simulated bar exam, the 93rd percentile on an SAT reading exam, and the 89th percentile on the SAT Math exam.
GPT-4 vs GPT-3.5
According to OpenAI, while the difference between GPT-3.5 and GPT-4 may not be immediately noticeable during casual conversation, the superiority of GPT-4 becomes evident when the conversation delves deeper. OpenAI claims that as artificial intelligence tasks become more complex, GPT-4 is expected to demonstrate greater reliability and creativity than its predecessor. OpenAI also provides test results to support this advancement, showing that GPT-4 outperforms its predecessor in almost every field. The test results of both GPT-4 and GPT-3.5 are listed below:
Simulated exam (score, estimated percentile) | GPT-4 | GPT-4 (no vision) | GPT-3.5 |
Uniform Bar Exam (MBE+MEE+MPT) | 298 / 400 (~90th) | 298 / 400 (~90th) | 213 / 400 (~10th) |
LSAT | 163 (~88th) | 161 (~83rd) | 149 (~40th) |
SAT Evidence-Based Reading & Writing | 710 / 800 (~93rd) | 710 / 800 (~93rd) | 670 / 800 (~87th) |
SAT Math | 700 / 800 (~89th) | 690 / 800 (~89th) | 590 / 800 (~70th) |
Graduate Record Examination (GRE) Quantitative | 163 / 170 (~80th) | 157 / 170 (~62nd) | 147 / 170 (~25th) |
Graduate Record Examination (GRE) Verbal | 169 / 170 (~99th) | 165 / 170 (~96th) | 154 / 170 (~63rd) |
Graduate Record Examination (GRE) Writing | 4 / 6 (~54th) | 4 / 6 (~54th) | 4 / 6 (~54th) |
USABO Semifinal Exam 2020 | 87 / 150 (99th-100th) | 87 / 150 (99th-100th) | 43 / 150 (31st-33rd) |
USNCO Local Section Exam 2022 | 36 / 60 | 38 / 60 | 24 / 60 |
Medical Knowledge Self-Assessment Program | 75% | 75% | 53% |
Codeforces Rating | 392 (below 5th) | 392 (below 5th) | 260 (below 5th) |
AP Art History | 5 (86th-100th) | 5 (86th-100th) | 5 (86th-100th) |
AP Biology | 5 (85th-100th) | 5 (85th-100th) | 4 (62nd-85th) |
AP Calculus BC | 4 (43rd-59th) | 4 (43rd-59th) | 1 (0th-7th) |
However, the company warns that GPT-4 is not perfect and is less capable than humans in many scenarios. The model still suffers from “hallucination,” or making up facts, and is not always factually reliable. It is prone to insisting that it is correct, even when it is wrong. OpenAI said that GPT-4 has limitations that it is working to address, such as social biases, hallucinations, and adversarial prompts.
The new model will be available to paid ChatGPT subscribers and as part of an API that programmers can integrate into their apps. OpenAI will charge approximately 3 cents for about 750 words of prompts and 6 cents for about 750 words in response.
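To get a feel for what these prices mean in practice, the arithmetic can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: the figures are the approximate ones quoted above, the cost is assumed to scale linearly with word count, and the function name `estimate_cost` is made up for illustration. Actual billing is measured in tokens rather than words, so real charges will differ somewhat.

```python
# Approximate GPT-4 API prices as quoted in the article (US dollars).
# Assumption: cost scales linearly with the number of words.
PROMPT_PRICE_PER_UNIT = 0.03    # about 750 words of prompt
RESPONSE_PRICE_PER_UNIT = 0.06  # about 750 words of response
WORDS_PER_UNIT = 750


def estimate_cost(prompt_words: int, response_words: int) -> float:
    """Return a rough cost estimate in dollars for one request."""
    if prompt_words < 0 or response_words < 0:
        raise ValueError("word counts must be non-negative")
    prompt_cost = prompt_words / WORDS_PER_UNIT * PROMPT_PRICE_PER_UNIT
    response_cost = response_words / WORDS_PER_UNIT * RESPONSE_PRICE_PER_UNIT
    return prompt_cost + response_cost


if __name__ == "__main__":
    # A 1,500-word prompt that gets a 375-word answer.
    print(f"${estimate_cost(1500, 375):.2f}")
```

Under these assumptions, a 1,500-word prompt costs about 6 cents and a 375-word reply about 3 cents, so the request comes to roughly 9 cents.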
Overall, the release of GPT-4 represents a significant step forward in the development of AI and natural language processing. While it is not without its limitations, the model’s ability to perform at or above human levels on standardized tests suggests that it has the potential to be a valuable tool for a wide range of applications, from chatbots to search engines and more. As OpenAI continues to refine and improve its technology, we can expect to see even more impressive strides in the field of AI in the years to come.