The shift from text to more dynamic AI experiences
Closing the distance between what’s in your mind and your ability to bring it to life with ChatGPT Images and more.
Humans don’t just think in words. In fact, some of our most compelling ideas often begin as images, sounds, movements, and patterns in our minds. For AI to help us reach our full potential, it needs to communicate in ways that match how we naturally absorb and process the world.
Over the past few months, I’ve talked about how ChatGPT is evolving from a reactive, text-based product into something more intuitive and connected to any of the tasks you want to accomplish. The shift from text to multimedia and dynamic UI is an important part of that transformation, and I’m excited about the progress we’re making.
Many people’s first experience with ChatGPT involves turning a text prompt into a picture. It’s a magical way to see what this technology can do, but the chat interface wasn’t originally designed for this. Creating and editing images is a different kind of task and deserves a space built for visuals. Today we launched a new image generation model and a dedicated entry point in ChatGPT for images that works more like a creative studio. The new image viewing and editing screens make it easier to create images that match your vision or get inspiration from trending prompts and preset filters. On top of that, our new model is faster and better at following detailed instructions, so you get more accurate edits and creative transformations. It keeps key elements like lighting, composition, and likeness consistent between inputs and outputs, so the results stay much closer to what you imagined.
There are many other use cases that can benefit from interfaces that go beyond text. For example, when you’re researching products or restaurants, you don’t just want a report describing options; you want to see photos and side-by-side specs that help you decide. When you’re learning about new topics, you want to be able to go deeper without losing your place in a thread. We’re improving answers to bring in more visuals with clear sources and adding new ways to get additional context. Coming soon, answers will start to highlight important people, places, and products, which you can tap to instantly pull up more information without asking a follow-up question. You’ll be able to highlight any word or phrase in an answer, and ChatGPT will tell you more about it.
The same idea applies to other everyday tasks. For things like converting measurements or getting sports scores, you want a fast visual answer that you can absorb at a glance. (This will be great for my husband, who is often doing both in the kitchen.) We’re rolling out a number of these types of utilities in ChatGPT and will continue to add more over time.
We are also improving how writing works inside ChatGPT. Even though writing is text-based, there are important design elements that can make the experience better. Our first version of canvas resonated with people as a writing tool, but it pulled you out of the flow of conversation. We’re working on integrating writing blocks inside of chat, so you can edit in line or switch to full-screen mode when you need it. You’ll also start to see more relevant options based on what you’re writing, so when you’re drafting a report, we’ll make it easy to download a PDF or Word doc. If you’re getting help with an email or text, we’ll make it easy to open the final version in your email or messaging app.
Apps in ChatGPT are another way we’re bringing rich, interactive experiences to your conversations so you can pull in the right tools and actually take action. Earlier this year, we introduced apps from partners like Booking.com, Canva, Coursera, Expedia, Figma, Spotify, and Zillow. Soon even more apps will be available in a new directory, including Adobe, Airtable, Apple Music, Clay, Lovable, OpenTable, Replit, and Salesforce, and other developers will be able to submit their apps for review. We know that we can’t build everything ourselves, and ChatGPT is even more useful when it can connect to the services you already use and surface the right tool at the right moment.
Across all of these areas and others to come, it’s exciting to see ChatGPT move from being primarily text-based and conversational toward a fully generative UI that brings in the right components based on what you want to do. When you’re creating, you should be able to see and shape the thing you’re making. When visuals tell a story better than words alone, ChatGPT should include them. When you need a quick answer or the next step lives in another tool, it should be right there. As we do this, we can keep closing the distance between what’s in your mind and your ability to bring it to life.





A very apt message on ChatGPT’s evolution toward more natural, multimodal interfaces. The ambition to close the distance between intention and action really comes through.
For my part, using AI every day on very concrete problems (technical, human, pedagogical), I’ve come to realize that another layer is becoming key alongside the UI: calibrating to the user.
Before speeding things up, help people understand how a person thinks, learns, copes with friction, or gets themselves into difficulty. Otherwise, we sometimes risk removing too quickly a friction that was precisely what gave structure to learning and autonomy.
Rich interfaces bring AI closer to our natural way of creating.
A fine-grained understanding of the user, in turn, keeps the help provided well judged.
Combined, I think the two open up something truly powerful and responsible.
So true. Every type of user task would benefit so much more from being offered through an intuitive interface. Creating visuals needs a more visual interface. For example, my dad just couldn’t get the hang of what to type to get what he wanted.
Similarly, shopping needs a different experience. It’s fine to tell a chatbot that you need a specific phone at a specific price, but it’s a whole other ball game to buy a Christmas present for your toddler niece.