The AI community is buzzing with excitement as OpenAI CEO Sam Altman confirmed the development of GPT-5 in a January 2024 podcast with Bill Gates. According to insiders, we might see GPT-5 hit the scene as early as mid-2024.
Insider Insights
While GPT-4 is impressive, Altman believes it’s just the tip of the iceberg. At the World Government Summit, he likened current AI technology to the early days of cell phones. We’re about to witness an AI revolution, and GPT-5 is leading the charge.
Development and Release Timeline
Announcement and Development Confirmation
Sam Altman dropped the bombshell news about GPT-5’s development earlier this year. Since then, anticipation has been building as we await more details.
Expected Release Date
Experts are speculating a release window from May to December 2024. Some believe it could coincide with or follow the U.S. elections, pointing towards a late 2024 launch.
The training period for the new model is expected to run four to six months, up to double the roughly three months GPT-4 took to train. This extended period will likely involve reinforcement learning, red teaming, and additional testing before the model is released. However, the timeline remains uncertain, and OpenAI may need to adjust the launch date if unexpected challenges arise.
OpenAI’s GPT-4o launch demo featured, from left to right, CTO Mira Murati and research leads Mark Chen and Barret Zoph.
Key Features and Improvements
Enhanced Intelligence and Capabilities
GPT-5 promises to be significantly smarter than its predecessors. Altman emphasized that intelligence will be the standout feature, enhancing every aspect of the AI model.
Increased Reliability
Reliability has been a key focus for OpenAI. GPT-4 users frequently report issues such as unstable outputs and AI hallucinations, so GPT-5 aims to provide a much more consistent and dependable experience.
Advanced Reasoning Abilities
One of the most exciting improvements in GPT-5 is its enhanced reasoning capability. This means a better grasp of context, stronger inference, and sharper problem-solving skills, a major leap forward for AI.
OpenAI CEO Sam Altman at the GPT-4 Turbo launch event.
Expanded Multimodal Capabilities
Multimodality has been a cornerstone of the recent advancements in GPT technology. OpenAI continues to innovate in this area with the introduction of GPT-4o in May 2024. This new iteration significantly enhances text, voice, and vision capabilities, marking a substantial improvement over GPT-4 Turbo. GPT-4o excels in natural conversations, image analysis, visual descriptions, and complex audio processing.
The advancements in multimodality fundamentally change our interactions with GPT models. The ability to accurately interpret tonal variations and follow human-like speech patterns, as seen in GPT-4o, represents a significant leap in AI-driven natural language processing.
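To make this concrete, here is a minimal sketch of sending mixed text-and-image input to GPT-4o through OpenAI's official Python SDK. The image URL is a placeholder, and the snippet assumes an OPENAI_API_KEY is set in your environment.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# GPT-4o accepts text and image parts in a single user message.
# The URL below is a placeholder; any publicly reachable image works.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```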
A frame from a 60-second clip generated by Sora, showing a woman walking in Tokyo.
Furthermore, OpenAI has hinted at the capabilities of their upcoming text-to-video model, Sora. This model is designed to replicate intricate camera movements and create highly detailed characters and environments in clips up to 60 seconds long. The commitment to multimodality is further emphasized by statements from OpenAI’s CEO, confirming that video processing and enhanced reasoning are top priorities for future GPT models.
Multimodality is becoming a key term in the evolution of AI models, and it’s easy to see why. While GPT-4 has made strides in enhancing its multimodal features, we can expect even greater integration of voice, video, and images in upcoming models.
Customization and Personalization
GPT-4 is frequently utilized as a versatile, all-purpose tool, but future versions are set to become more personalized. On Gates’ podcast, Sam Altman highlighted that customizability and personalization will be crucial for upcoming OpenAI models. “People want very different things out of GPT-4: different styles, different sets of assumptions,” he noted.
OpenAI has already made strides in this direction with Custom GPTs, allowing users to tailor the model for specific tasks, whether it’s teaching a board game or assisting kids with their homework. Although customization might not be the primary focus of the next update, it is expected to become a significant trend in the future.
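Custom GPTs are configured in ChatGPT's interface, but the underlying idea of steering a model with standing instructions can be sketched through the API using a system message. The tutor persona below is a hypothetical illustration, not an official recipe:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The system message plays the role a Custom GPT's instructions do:
# it fixes the style and assumptions before the user says anything.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "You are a patient tutor who explains board game "
                    "rules to children using short, simple sentences."},
        {"role": "user", "content": "How do I castle in chess?"},
    ],
)
print(response.choices[0].message.content)
```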
In the meantime, you can customize a free AI chatbot powered by GPT for your business. It's our specialty. Get started here.
Technical Specifications
Parameter Size Increase
Each new iteration of GPT has seen a substantial increase in parameter size, and it’s anticipated that GPT-5 will follow this trend. In transformer models like GPT, parameters encompass the weights and biases of neural network layers, including attention mechanisms, feedforward layers, and embedding matrices. These parameters are crucial as they determine the model’s capacity to learn from input data.
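As a rough illustration of what "parameters" means here, the toy PyTorch module below contains the same parameter-bearing pieces the paragraph lists (an embedding matrix, attention weights, and a feedforward layer) and counts every weight and bias. The dimensions are arbitrary placeholders, not anything OpenAI has published:

```python
import torch.nn as nn

# Toy decoder components with illustrative dimensions.
d_model, n_heads, vocab = 512, 8, 50_000

model = nn.ModuleDict({
    "embed": nn.Embedding(vocab, d_model),            # embedding matrix
    "attn": nn.MultiheadAttention(d_model, n_heads),  # attention weights
    "ff": nn.Sequential(                              # feedforward layer
        nn.Linear(d_model, 4 * d_model),
        nn.GELU(),
        nn.Linear(4 * d_model, d_model),
    ),
})

# "Parameter count" is simply the total number of weights and biases.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")
```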
Although OpenAI has not disclosed the exact parameter size for their models, estimates suggest it could be around 1.5 trillion for the upcoming GPT-5. This represents a significant jump from GPT-3’s 175 billion parameters and an astronomical increase from GPT-2’s 1.5 billion.
AI expert Alan Thompson, who advises Google and Microsoft, predicts that GPT-5 could have between 2 and 5 trillion parameters. His projection is based on the trend of doubling both computing power and training time with each generation, a trend that has also stretched the testing timeline well beyond GPT-4's.
Larger Context Windows
Visual comparison of GPT-4 Turbo’s context window (128,000 tokens) vs. Gemini’s context window (1 million tokens)
Context windows represent the number of tokens (words or subwords) a model can process at once. A larger context window allows the model to absorb more information from the input text, enhancing the accuracy of its responses.
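To see what a "token" is in practice, you can tokenize text with OpenAI's open-source tiktoken library, which uses the same encodings as the GPT models:

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
text = "Context windows are measured in tokens, not words."

tokens = enc.encode(text)
print(len(tokens))          # how many tokens this sentence consumes
print(enc.decode(tokens))   # decoding round-trips to the original text
```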
One limitation of GPT-4 has been its relatively small context window. For instance, GPT-4 Turbo and GPT-4o support a context window of 128,000 tokens. In contrast, Google's Gemini model offers a context window of up to 1 million tokens.
Currently, if the primary concern is a model’s ability to process large volumes of text, GPT-4 may not be the optimal choice. However, it is anticipated that OpenAI will address these limitations in future model versions.
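Until larger windows arrive, a common workaround (not an official OpenAI recipe) is to split long documents into window-sized chunks before sending them to the model, for example:

```python
import tiktoken  # pip install tiktoken

def chunk_for_window(text: str, max_tokens: int = 120_000,
                     model: str = "gpt-4") -> list[str]:
    """Split text into pieces that each fit the model's context window,
    leaving headroom below the 128k limit for the prompt and the reply."""
    enc = tiktoken.encoding_for_model(model)
    tokens = enc.encode(text)
    return [enc.decode(tokens[i:i + max_tokens])
            for i in range(0, len(tokens), max_tokens)]
```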
Alan Thompson predicts a significant increase in context window size, potentially reaching up to 40 trillion tokens. Such an advancement would surpass the capabilities of the Gemini model, enabling the handling of massive datasets and significantly improving performance for OpenAI enterprise customers and users with substantial data input needs. This development could be a game-changer for AI model performance.
Pricing and Accessibility
GPT-4o pricing
If OpenAI maintains their current pricing model, using GPT-5 will come at a premium. At present, ChatGPT with GPT-4 is available exclusively to paying users at $20 per month, while ChatGPT with GPT-3.5 remains free.
For API access, GPT-4 is priced at $30 per 1 million input tokens and $60 per 1 million output tokens, with these costs doubling for the 32k version. Given the anticipated capabilities of the new model, prices are likely to be higher than those of previous OpenAI GPT models.
However, OpenAI has made strides in affordability with their latest model. GPT-4o is priced at just $5 per 1 million input tokens and $15 per 1 million output tokens. While these pricing differences may not significantly impact enterprise customers, they represent a commendable effort by OpenAI to make their technology more accessible to individuals and small businesses.
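The difference is easy to quantify. The back-of-the-envelope calculator below uses the per-million-token prices quoted above; the token counts in the example request are made up for illustration:

```python
# Prices in USD per 1 million tokens, as quoted above: (input, output).
PRICES = {
    "gpt-4": (30.00, 60.00),
    "gpt-4o": (5.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single request given token counts and per-1M-token prices."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# Example: a 3,000-token prompt with a 1,000-token reply.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 3_000, 1_000):.4f}")
# gpt-4: $0.1500 vs gpt-4o: $0.0300, five times cheaper in this example.
```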
The silver lining? The launch of GPT-5 could potentially make GPT-4 the new free model from OpenAI, offering advanced capabilities at no cost.
Training Data and Legal Considerations
GPT-5 is expected to continue leveraging available information from the internet as training data.
One significant challenge for OpenAI on its path to industry dominance has been the series of lawsuits regarding the model’s comprehensive training methods.
GPT models are trained on extensive datasets sourced from the internet, much of which is copyrighted. This unauthorized use of data has sparked numerous complaints and legal actions, including lawsuits from The New York Times and several U.S. news agencies, as well as allegations that the model’s training process violates the EU’s General Data Protection Regulation.
A California judge recently dismissed one of the copyright lawsuits against OpenAI, filed by a group of writers that includes Sarah Silverman and Ta-Nehisi Coates. Despite these legal challenges, there is currently no indication that they will significantly hinder OpenAI's progress.
Future of AI and ChatGPT
Agent-like Capabilities
OpenAI COO Brad Lightcap hinted at plans to transform human-computer interaction, making future AI models more like agents than simple tools. "Will there be such a thing as a prompt engineer in 2026?" he asked. "You don't prompt engineer your friend."
Predictions for AI Interaction
GPT-5 promises to bring us closer to a future where AI seamlessly integrates into our daily lives. Its enhanced capabilities will redefine how we interact with AI.