OpenAI has announced GPT-4, the latest version of its primary large language model. The company claims that the new version exhibits human-level performance on many standardized tests.
GPT-4 is larger than earlier versions. It was trained on more data and has more weights in its model file, which also makes it more costly to run.
GPT-4 is a good example of an approach that centers on scaling up to achieve better results.
OpenAI used Microsoft Azure to train GPT-4.
Microsoft said that Bing’s AI chatbot uses GPT-4.
OpenAI said that the new model will produce fewer factually incorrect answers, go off the rails and chat about forbidden topics less often, and even perform better than humans on many standardized tests.
OpenAI claimed that GPT-4 performed at the 90th percentile on a simulated bar exam, the 93rd percentile on an SAT reading exam, and the 89th percentile on the SAT Math exam.
OpenAI warned that the new software is not yet perfect and remains less capable than humans in many cases.
The company said, “GPT-4 still has many known limitations that we are working to address, such as social biases, hallucinations, and adversarial prompts.”
OpenAI wrote, “In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference comes out when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.”