OpenAI has announced that its powerful GPT-4 Turbo with Vision model is now generally available through the company’s API, opening up new opportunities for enterprises and developers to integrate advanced language and vision capabilities into their applications.
The launch of GPT-4 Turbo with Vision on the API follows the initial release of GPT-4’s vision and audio upload features last September and the unveiling of the turbocharged GPT-4 Turbo model at OpenAI’s developer conference in November.
GPT-4 Turbo promises significant speed improvements, larger input context windows of up to 128,000 tokens (equivalent to about 300 pages), and increased affordability for developers.
A key enhancement is that API requests can now combine the model’s vision recognition and analysis capabilities with JSON mode and function calling. This allows developers to have the model generate JSON snippets that can automate actions within connected apps, such as sending emails, making purchases, or posting online. However, OpenAI strongly recommends building user confirmation flows before taking actions that impact the real world.
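To make this concrete, here is a minimal sketch of what such a request payload might look like, combining an image input with a function-calling tool in the Chat Completions format. The tool name `send_email`, its parameters, and the image URL are illustrative placeholders, not part of OpenAI’s announcement:

```python
import json

# Hypothetical Chat Completions request combining vision input with
# function calling. Nothing is sent over the network here; this only
# shows the shape of the payload a developer would submit.
request = {
    "model": "gpt-4-turbo",
    "messages": [
        {
            "role": "user",
            "content": [
                # A text part and an image part in the same user message
                {"type": "text",
                 "text": "Draft an email summarising this whiteboard photo."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/whiteboard.png"}},
            ],
        }
    ],
    # The model may respond with JSON arguments for this tool, which the
    # application then executes -- ideally only after an explicit user
    # confirmation step, as OpenAI recommends for real-world actions.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "send_email",
                "description": "Send an email on the user's behalf.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "to": {"type": "string"},
                        "subject": {"type": "string"},
                        "body": {"type": "string"},
                    },
                    "required": ["to", "subject", "body"],
                },
            },
        }
    ],
}

print(json.dumps(request, indent=2))
```

In practice this dictionary would be passed to the API client (for example, the keyword arguments of `client.chat.completions.create` in the official Python SDK), and the application would inspect any returned tool call before acting on it.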
Several startups are already leveraging GPT-4 Turbo with Vision, including Cognition, whose AI coding agent Devin relies on the model to generate complete code automatically:
Devin, built by @cognition_labs, is an AI software engineering assistant powered by GPT-4 Turbo that uses vision for a variety of coding tasks. pic.twitter.com/E1Svxe5fBu
— OpenAI Developers (@OpenAIDevs) April 9, 2024
Healthify, a health and fitness app, uses the model to provide nutritional analysis and recommendations based on photos of meals:
The @healthifyme team built Snap using GPT-4 Turbo with Vision to give users nutrition insights through photo recognition of foods from around the world. pic.twitter.com/jWFLuBgEoA
— OpenAI Developers (@OpenAIDevs) April 9, 2024
TLDraw, a UK-based startup, employs GPT-4 Turbo with Vision to power its virtual whiteboard and convert user drawings into functional websites:
Make Real, built by @tldraw, lets users draw UI on a whiteboard and uses GPT-4 Turbo with Vision to generate a working website powered by real code. pic.twitter.com/RYlbmfeNRZ
— OpenAI Developers (@OpenAIDevs) April 9, 2024
Despite facing stiff competition from newer models such as Anthropic’s Claude 3 Opus and Google’s Gemini Advanced, the API launch should help solidify OpenAI’s position in the enterprise market as developers await the company’s next large language model.