OpenAI Released GPT-4, the Most Advanced Multimodal AI Model Yet
OpenAI has released GPT-4, a new image- and text-understanding AI model that the company calls the latest milestone in scaling up deep learning.
GPT-4 is available today to OpenAI’s paying users via ChatGPT Plus, and developers can sign up on a waitlist to access the API. Unlike its predecessor GPT-3.5, which accepts only text, GPT-4 is multimodal: it accepts both image and text inputs and generates text.
GPT-4 is priced at $0.03 per 1,000 “prompt” tokens and $0.06 per 1,000 “completion” tokens. In this article, we explore GPT-4’s features and capabilities, its pricing, and its use cases.
GPT-4 Capabilities and Features
GPT-4 is a powerful image- and text-understanding AI model that can accept image and text inputs. It is capable of captioning and interpreting relatively complex images, such as identifying a Lightning Cable adapter from a picture of a plugged-in iPhone.
Beyond simple captioning, GPT-4 can extrapolate from and analyze the content of an image to provide useful insights.
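At launch, image input was limited to select partners rather than being generally available through the API; OpenAI later exposed image understanding in its chat API. As a rough illustration of how such a multimodal request looks in the OpenAI Python library, here is a minimal sketch; the model name, question, and image URL are placeholders, not values from this article:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A single user message can mix text and image parts; the model
# answers questions about the image. The URL here is a placeholder.
response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable GPT-4-class model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What adapter is plugged into this iPhone?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/iphone.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```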
One of GPT-4’s more interesting features is its steerability tooling. With GPT-4, OpenAI is introducing a new API capability, “system” messages, that allows developers to prescribe style and task by giving the model explicit directions.
System messages, which will also come to ChatGPT in the future, are essentially instructions that set the tone and establish boundaries for the AI’s next interactions.
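As a concrete sketch of how a system message is set in an API call (using the OpenAI Python library; the model name and the tutoring instruction are illustrative, not taken from this article):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The system message sets the tone and boundaries; the user
# messages that follow are interpreted within those constraints.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a tutor that always responds in the Socratic style. "
                "Never give the student the answer; ask guiding questions instead."
            ),
        },
        {"role": "user", "content": "How do I solve 2x + 3 = 7?"},
    ],
)
print(response.choices[0].message.content)
```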
GPT-4 Pricing
GPT-4 is priced at $0.03 per 1,000 “prompt” tokens and $0.06 per 1,000 “completion” tokens. Prompt tokens are the parts of words fed to the model, while completion tokens are the content generated by GPT-4.
As a rule of thumb, 1,000 tokens corresponds to roughly 750 words of English text.
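To make the arithmetic concrete, here is a small sketch that estimates the cost of one request from its token counts, using the rates quoted above; the example token counts are made up:

```python
# GPT-4 launch prices quoted in this article, per 1,000 tokens.
PROMPT_RATE = 0.03 / 1000      # dollars per prompt token
COMPLETION_RATE = 0.06 / 1000  # dollars per completion token

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated cost in dollars for a single API call."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# Example: a 1,500-token prompt (~1,100 words) with a 500-token reply
# costs 1500 * $0.00003 + 500 * $0.00006 = $0.045 + $0.030 = $0.075.
print(f"${request_cost(1500, 500):.3f}")  # -> $0.075
```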
GPT-4 Use Cases
GPT-4 has various use cases across different industries, including healthcare, education, e-commerce, and finance. Early adopters of GPT-4 include Microsoft, Stripe, and Duolingo.
Microsoft has confirmed that Bing Chat, its chatbot tech co-developed with OpenAI, is running on GPT-4.
Stripe uses GPT-4 to scan business websites and deliver a summary to customer support staff. Duolingo has built GPT-4 into a new language learning subscription tier.
Another early adopter of GPT-4 is Be My Eyes, which uses GPT-4 to power its new Virtual Volunteer feature.
Powered by GPT-4, the Virtual Volunteer can answer questions about images sent to it. For example, if a user sends a picture of the inside of their refrigerator, the Virtual Volunteer can not only correctly identify what’s in it but also analyze what could be prepared with those ingredients, then offer a number of recipes and send a step-by-step guide on how to make them.
Final Thoughts
GPT-4 is a powerful image- and text-understanding AI model with use cases across many industries. It can caption and interpret relatively complex images and extrapolate from their content to provide useful insights. GPT-4 is available today to OpenAI’s paying users via ChatGPT Plus, and developers can sign up on a waitlist to access the API.
As with any technology, there are potential drawbacks to consider. The use of AI models like GPT-4 can raise concerns around privacy, bias, and ethical use. It is important for developers and users to understand these issues and take steps to mitigate them.