GPT-4o

Generative Pre-trained Transformer 4 Omni (GPT-4o)
Developer(s)	OpenAI
Initial release	May 13, 2024
Predecessor	GPT-4 Turbo
Type	Multimodal; Large language model; Generative pre-trained transformer; Foundation model
License	Proprietary
Website	openai.com/index/hello-gpt-4o

GPT-4o (Generative Pre-trained Transformer 4 Omni, also GPT-4 Omni) is a multilingual, multimodal generative pre-trained transformer designed by OpenAI. It was announced by OpenAI's CTO Mira Murati during a live-streamed demo on 13 May 2024 and released the same day.^[1] GPT-4o is available free of charge, with a usage limit that is five times higher for ChatGPT Plus subscribers.^[2] Its API is twice as fast as, and half the price of, its predecessor, GPT-4 Turbo.^[1]

Background

GPT-4o was originally shadow-launched through the Large Model Systems Organization (LMSYS) as three different models, named gpt2-chatbot, im-a-good-gpt2-chatbot, and im-also-a-good-gpt2-chatbot.^[3] On 7 May 2024, Sam Altman tweeted "im-a-good-gpt2-chatbot", which was commonly interpreted as confirmation that these were new OpenAI models being A/B tested.^[4]

Capabilities

GPT-4o achieves state-of-the-art results in voice, multilingual, and vision benchmarks, setting new records in speech recognition and translation.^[5]^[6] GPT-4o scores 88.7 on the Massive Multitask Language Understanding (MMLU) benchmark, compared to 86.5 for GPT-4.^[6] GPT-3.5 and GPT-4 handle voice-to-voice interaction by converting the voice to text, giving the text to the model, and then converting the text response back to voice using another model. GPT-4o instead supports voice-to-voice natively, which makes responses near-instant and seamless.^[6]^[7]

The model supports over 50 languages,^[1] which OpenAI claims cover over 97% of speakers.^[8] Mira Murati demonstrated the model's multilingual capability by speaking Italian to the model and having it translate between English and Italian during the live-streamed OpenAI demo event on 13 May 2024. In addition, the new tokenizer uses fewer tokens for certain languages, especially languages that are not based on the Latin alphabet, which makes the model cheaper to use in those languages.^[6]
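
The link between token count and cost can be illustrated with simple arithmetic. The following is a minimal sketch, not OpenAI's billing logic: the function names, the token counts, and the price per million tokens are all hypothetical placeholders chosen for illustration.

```python
def request_cost(num_tokens: int, price_per_million: float) -> float:
    """Cost of processing num_tokens at a given price per one million tokens."""
    return num_tokens * price_per_million / 1_000_000


def savings_ratio(old_tokens: int, new_tokens: int) -> float:
    """Fraction of the cost saved when the same text needs fewer tokens."""
    return 1 - new_tokens / old_tokens


# Hypothetical numbers: one sentence encoded by an older and a newer tokenizer.
OLD_TOKENS = 120
NEW_TOKENS = 40
PRICE_PER_MILLION = 5.0  # placeholder price in USD, not an official rate

old_cost = request_cost(OLD_TOKENS, PRICE_PER_MILLION)
new_cost = request_cost(NEW_TOKENS, PRICE_PER_MILLION)
print(f"old tokenizer: ${old_cost:.6f}, new tokenizer: ${new_cost:.6f}")
print(f"share of cost saved: {savings_ratio(OLD_TOKENS, NEW_TOKENS):.0%}")
```

Because the price is charged per token, the saving depends only on the ratio of the two token counts and not on the price itself.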

GPT-4o has knowledge up to October 2023^[9]^[10] and a context length of 128k tokens,^[9] with output capped at 2,048 tokens.^[10]
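
How the two limits interact can be sketched as a small budget calculation. This is an illustrative sketch under stated assumptions: it reads "128k" as 128,000 tokens and assumes that the prompt and the response share the same context window. The helper name `max_output_tokens` is hypothetical and is not part of any OpenAI library.

```python
CONTEXT_WINDOW = 128_000  # assumed reading of "128k tokens"
OUTPUT_CAP = 2_048        # output token limit stated above


def max_output_tokens(prompt_tokens: int) -> int:
    """Largest response length that fits both the output cap and the context window.

    Assumes prompt and response tokens count against the same window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # The response is limited by whichever bound is tighter, and never negative.
    return max(0, min(OUTPUT_CAP, remaining))


for prompt in (1_000, 127_000, 200_000):
    print(prompt, max_output_tokens(prompt))
```

For short prompts the output cap is the binding limit. Only prompts that come within 2,048 tokens of the window size reduce the available response length further.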

As of May 2024, GPT-4o is the leading model in the Large Model Systems Organization (LMSYS) Elo Arena Benchmarks by the University of California, Berkeley.^[11]

References

[1] Wiggers, Kyle (2024-05-13). "OpenAI debuts GPT-4o 'omni' model now powering ChatGPT". TechCrunch. Retrieved 2024-05-13.
[2] Field, Hayden (2024-05-13). "OpenAI launches new AI model GPT-4o and desktop version of ChatGPT". CNBC. Retrieved 2024-05-14.
[3] Edwards, Benj (2024-05-13). "Before launching, GPT-4o broke records on chatbot leaderboard under a secret name". Ars Technica. Retrieved 2024-05-17.
[4] Zeff, Maxwell (2024-05-07). "Powerful New Chatbot Mysteriously Returns in the Middle of the Night". Gizmodo. Retrieved 2024-05-17.
[5] van Rijmenam, Mark (2024-05-13). "OpenAI Launched GPT-4o: The Future of AI Interactions Is Here". The Digital Speaker. Retrieved 2024-05-17.
[6] "Hello GPT-4o". OpenAI.
[7] Altman, Sam. Twitter/X. https://twitter.com/sama/status/1790817315069771959. Retrieved 2024-05-16.
[8] Edwards, Benj (2024-05-13). "Major ChatGPT-4o update allows audio-video talks with an 'emotional' AI chatbot". Ars Technica. Retrieved 2024-05-17.
[9] "Models - OpenAI API". OpenAI. Retrieved 2024-05-17.
[10] Conway, Adam (2024-05-13). "What is GPT-4o? Everything you need to know about the new OpenAI model that everyone can use for free". XDA Developers. Retrieved 2024-05-17.
[11] Fedus, William. "GPT-4o is our new state-of-the-art frontier model".


