ReportWire

Tag: nano banana pro

  • I built marshmallow castles in Google’s new AI world generator | TechCrunch

    Google DeepMind is opening up access to Project Genie, its AI tool for creating interactive game worlds from text prompts or images. 

    Starting Thursday, Google AI Ultra subscribers in the U.S. can play around with the experimental research prototype, which is powered by a combination of Google’s latest world model Genie 3, its image generation model Nano Banana Pro, and Gemini. 

    Coming five months after Genie 3’s research preview, the move is part of a broader push to gather user feedback and training data as DeepMind races to develop more capable world models. 

    World models are AI systems that generate an internal representation of an environment, and can be used to predict future outcomes and plan actions. Many AI leaders, including those at DeepMind, believe world models are a crucial step to achieving artificial general intelligence (AGI). But in the nearer term, labs like DeepMind envision a go-to-market plan that starts with video games and other forms of entertainment and branches out into training embodied agents (aka robots) in simulation. 

DeepMind’s release of Project Genie comes as the world model race heats up. Fei-Fei Li’s World Labs released its first commercial product, Marble, late last year. Runway, the AI video generation startup, also recently launched a world model. And former Meta chief scientist Yann LeCun’s startup AMI Labs plans to focus on developing world models as well.

    “I think it’s exciting to be in a place where we can have more people access it and give us feedback,” Shlomi Fruchter, a research director at DeepMind, told TechCrunch via video interview, smiling ear-to-ear in clear excitement over Project Genie’s release.

    DeepMind researchers that TechCrunch spoke to were upfront about the tool’s experimental nature. It can be inconsistent, sometimes impressively generating playable worlds, other times producing baffling results that miss the mark. Here’s how it works.

A claymation-style castle in the sky made of marshmallows and candy. Image Credits: TechCrunch

You start with a “world sketch” by providing text prompts for both the environment and a main character, whom you will later be able to maneuver through the world in either first- or third-person view. Nano Banana Pro creates an image based on the prompts that you can, in theory, modify before Genie uses the image as a jumping-off point for an interactive world. The modifications mostly worked, but the model occasionally stumbled, giving you purple hair when you asked for green.

You can also use real-life photos as a baseline for the model to build a world on, which, again, was hit or miss. (More on that later.)

    Once you’re satisfied with the image, it takes a few seconds for Project Genie to create an explorable world. You can also remix existing worlds into new interpretations by building on top of their prompts, or explore curated worlds in the gallery or via the randomizer tool for inspiration. You can then download videos of the world you just explored. 

DeepMind is only granting 60 seconds of world generation and navigation at the moment, in part due to budget and compute constraints. Because Genie 3 is an auto-regressive model, it requires a lot of dedicated compute – which puts a tight ceiling on how much DeepMind is able to provide to users.

    “The reason we limit it to 60 seconds is because we wanted to bring it to more users,” Fruchter said. “Basically when you’re using it, there’s a chip somewhere that’s only yours and it’s being dedicated to your session.”

    He added that extending it beyond 60 seconds would diminish the incremental value of the testing.

“The environments are interesting, but at some point, the level of interaction and the dynamism of the environment is somewhat limited. Still, we see that as a limitation we hope to improve on.”

    Whimsy works, realism doesn’t

Google received a cease-and-desist from Disney last year, so Project Genie wouldn’t build worlds that were Disney-related. Image Credits: TechCrunch

When I used the model, the safety guardrails were already up and running. I couldn’t generate anything resembling nudity, nor could I generate worlds that even remotely sniffed of Disney or other copyrighted material. (In December, Disney hit Google with a cease-and-desist, accusing the firm’s AI models of copyright infringement by training on Disney’s characters and IP and generating unauthorized content, among other things.) I couldn’t even get Genie to generate worlds of mermaids exploring underwater fantasy lands or ice queens in their wintery castles. 

Still, the demo was deeply impressive. The first world I built was an attempt to live out a small childhood fantasy, in which I could explore a castle in the clouds made of marshmallows, with a chocolate sauce river and trees made of candy. (Yes, I was a chubby kid.) I asked the model to do it in claymation style, and it delivered a whimsical world that childhood me would have eaten up, the castle’s pastel-and-white spires and turrets looking puffy and tasty enough to rip off a chunk and dunk into the chocolate moat.

A “Game of Thrones”-inspired world that failed to generate as photorealistically as I wanted. Image Credits: TechCrunch

    That said, Project Genie still has some kinks to work out. 

The model excelled at creating worlds based on artistic prompts, like watercolor, anime, or classic cartoon aesthetics. But it tended to fail when it came to photorealistic or cinematic worlds, which often came out looking like a video game rather than real people in a real setting. 

It also didn’t always respond well when given real photos to work with. When I gave it a photo of my office and asked it to create a world based on the photo exactly as it was, it gave me a world that had some of the same furnishings as my office – a wooden desk, plants, a grey couch – laid out differently. And it looked sterile and digital, not lifelike. 

    When I fed it a photo of my desk with a stuffed toy, Project Genie animated the toy navigating the space, and even had other objects occasionally react as it moved past them.

    That interactivity is something DeepMind is working on improving. There were several occasions when my characters walked right through walls or other solid objects. 

I asked Project Genie to animate a stuffed toy (Bingo Bronson) so it could explore my desk. Image Credits: TechCrunch

When DeepMind initially released Genie 3, researchers highlighted how the model’s auto-regressive architecture allows it to remember what it has generated, so I wanted to test that by returning to parts of the environment it had already generated to see if they stayed the same. For the most part, the model succeeded. In one case, I generated a cat exploring yet another desk, and only once, when I turned back to the right side of the desk, did the model mistakenly generate a second mug.

The part I found most frustrating was navigating the space, using the arrows to look around, the spacebar to jump or ascend, and the W-A-S-D keys to move. I’m not a gamer, so this didn’t come naturally to me, but the keys were often unresponsive, or they sent me in the wrong direction. Trying to walk from one side of the room to a doorway on the other often became a chaotic zigzagging exercise, like trying to steer a shopping cart with a broken wheel. 

    Fruchter assured me that his team was aware of these shortcomings, reminding me again that Project Genie is an experimental prototype. In the future, he said, the team hopes to enhance the realism and improve interaction capabilities, including giving users more control over actions and environments. 

“We don’t think about [Project Genie] as an end-to-end product that people can go back to every day, but we think there is already a glimpse of something that’s interesting and unique and can’t be done in another way,” he said.

    Rebecca Bellan

  • Those Viral Photos of Elon and Zuck Are AI. But Google Launched a New Way to Check for Fakes

    Photos appearing to show Elon Musk and several other Big Tech CEOs have gone viral in the past week on X and Bluesky. The mundane environments, including humble apartments and McDonald’s parking lots, should have given everyone a hint that they’re fake. But there’s a new way for the average person to check for themselves whether the images were made with AI. And it’s actually really useful.

Right off the bat, it should be said that the vast majority of AI image detectors are unreliable. Many people think the tools openly available on the web can figure out whether a given image is AI-generated, but they’re not good at it. For example, people often ask Grok on X whether a photo was created with generative artificial intelligence, and it frequently gets the answer wrong – sometimes in amusing ways.

    Google developed an AI watermark called SynthID a couple of years ago, but the company didn’t allow the average user to check whether an image had the watermark. That changed just a few days ago. Now anyone can upload an image to Gemini and ask if it has the SynthID watermark, which is invisible to the naked eye.

The watermark is embedded in the image’s pixels, and every image created with Google’s AI tools will have it. Checking for the watermark is now easy for anyone who opens up Gemini.

    From Google’s announcement:

    If you see an image and want to confirm it has been made by Google AI, upload it to the Gemini app and ask a question such as: “Was this created with Google AI?” or “Is this AI-generated?”

    Gemini will check for the SynthID watermark and use its own reasoning to return a response that gives you more context about the content you encounter online.

Obviously, Gemini is less equipped to tell you whether an image is AI-generated if it wasn’t made with Google tools like Nano Banana Pro. And that appears to be the entire reason the company is launching SynthID detection in Gemini right now. Nano Banana Pro launched last week, and it lets users make incredibly realistic images, including images of Elon Musk and other tech CEOs that look very real.

    Some of those images have recently gone viral, like one that racked up nearly 9 million views on X before migrating to other platforms like Bluesky. The image shows Musk, Nvidia CEO Jensen Huang, Google CEO Sundar Pichai, Apple CEO Tim Cook, Amazon founder Jeff Bezos, Microsoft CEO Satya Nadella, and Meta CEO Mark Zuckerberg all standing together in a small apartment.

    Other versions of the image include OpenAI CEO Sam Altman, with the men standing around in a parking lot, pictured at the top of this article. For some reason, Musk is seen smoking a cigar in a couple of them. Another image showed the men in the parking lot from a different angle. And still another had the men eating McDonald’s on the ground with a Cybertruck in the background.

If you run any of these images through Gemini, it confirms they all have the SynthID watermark. If an image appears too weird to be true, it’s probably a good idea to check with Gemini.

Did you see that viral image of President Donald Trump with Bill “Bubba” Clinton in a very compromising position? Running it through Gemini confirms it was made with Google’s AI image generator. Gemini won’t necessarily be able to identify every AI image with certainty. But if Gemini tells you a “photo” has the SynthID watermark, you know it’s not real.

    Fake images are still going to be everywhere in the current social media environment. But at least Google has given the average user a new tool to identify at least some of the fakes for themselves. It’s only going to get harder and harder to recognize AI-generated content as the years progress. Sometimes you just need to apply some common sense. For example, do you think Elon Musk and Sam Altman would be hanging out in a parking lot together? Given their very public conflicts, that seems very unlikely.

    Then again, it seemed very unlikely that Musk and President Trump would become friendly again after the Tesla CEO accused Trump of being in the Epstein files. Weirder things have happened when billions of dollars are at stake.

    Matt Novak
