OpenAI Claims GPT-4 Turbo can Handle Picture Inputs and Perform Better Problem-Solving

By Consultants Review Team Thursday, 11 April 2024

The Sam Altman-led business has now disclosed a new GPT-4 update. With this latest version, the GPT-4 Turbo can now take picture inputs and solve problems more effectively. GPT-4 is a big multimodal model, according to a blog post by OpenAI, that can "solve difficult problems with greater accuracy than any of their previous models, owing to its broader general knowledge and advanced reasoning capabilities."

The most recent version, dubbed GPT-4 Turbo with Vision, is now publicly accessible as an API for developers. Additionally, OpenAI has stated that GPT-4 Turbo with Vision may shortly be available on ChatGPT. Not many specifics on the same, nevertheless, have been made public.

"GPT-4 Turbo with Vision is now generally available in the API," OpenAI stated in a post on X. JSON format and function calling are now available for vision requests."

GPT-4 Turbo can now interpret and analyze photos, movies, and other multimedia inputs and provide comprehensive answers and insights with to the integration of vision technology. This move into computer vision gives developers new opportunities to work with, allowing them to create cutting-edge apps for a variety of sectors.

The addition of JSON mode and function calling, which enable developers to automate tasks within their apps using JSON code snippets, is one of the update's standout features. By improving productivity and streamlining processes, this should make it simpler for developers to include GPT-4 Turbo with Vision into their applications.

With a context window of 128,000 tokens, the enhanced AI model is trained on data through December 2023.

In a same vein, The New York Times recently reported that OpenAI encountered a lack of training data when creating its Whisper audio transcription model. Despite the questionable legality of this method, the corporation apparently transcribed over a million hours of YouTube videos to train their GPT-4 language model in order to get past this problem. Greg Brockman, the president of OpenAI, supposedly had a direct hand in finding these films.

The research also stated that by 2021, OpenAI will have used up all of its traditional data sources, which sparked debates over transcription of podcasts, audiobooks, and YouTube videos. Before this, the business used a variety of datasets to train its models, such as instructional Quizlet material and computer code from GitHub.

Current Issue