• July 4, 2024
  • Updated 6:57 pm

Sora AI: OpenAI’s new text-to-video model


OpenAI’s chatbots can already help you get through college. Now a new OpenAI model called Sora AI aims to help you master filmmaking without going to film school. Still a research product, Sora is currently being evaluated by outside security experts working with OpenAI to find vulnerabilities before any wider release.

Read on to find out more!

What is Sora AI?

Sora AI is a generative AI tool that creates videos of up to 60 seconds with highly realistic imagery, following instructions you give in plain text. OpenAI plans to make it available to creators at an as-yet-unannounced date, but for now has decided to keep it in limited preview.

Other companies, from giants like Google to startups like Runway, have also introduced text-to-video AI projects. But OpenAI says what sets Sora apart is its striking photorealism (something not seen in competing products) and its ability to generate clips of up to one minute, far longer than the brief clips other models typically produce.

The researchers we interviewed didn’t say exactly how long it takes to render a full video, but described it as a slow process rather than an instant one. If the carefully curated examples I’ve seen are representative, it’s worth the wait. OpenAI didn’t let me enter my own prompts, but it shared four examples of Sora AI’s capabilities.

None came close to the stated one-minute limit; the longest ran 17 seconds. The first was generated from a detailed prompt, written with the meticulous care of a screenwriter: “The beautiful snow-covered city of Tokyo is bustling with activity. The camera moves through the busy city street, following several people enjoying the snowy weather and shopping at nearby stalls. Gorgeous cherry petals flutter in the wind along with snowflakes.”


How does Sora AI work?

The result is a compelling vision of Tokyo in that magical moment when snow and cherry blossoms coexist. The virtual camera follows a couple as they slowly stroll through the cityscape, as if it were mounted on a drone. One of the passers-by wears a mask. To the left, cars speed by on a riverside road; to the right, small shops line the street as shoppers come and go.

It’s only after watching the clip a few times, the virtual camera still gliding along, that you realize how implausible the main characters’ route is: the couple walk down a snow-covered sidewalk that appears to be a dead end, and they would have had to hop a small fence to reach the odd parallel walkway on their right.

Despite these small flaws, the Tokyo clip is an impressive exercise in world-building; production designers will debate whether it is a powerful new tool or a job killer. Notably, the people in this video, generated entirely by a neural network, are never shown in close-up and display no emotion. But the Sora AI team says there have been cases where its fake actors have shown real emotion.

Other clips are impressive as well, notably “an animated shot of a short, hairy monster kneeling next to a crimson lamp,” generated from detailed scene directions and image descriptions. Seeing the monster’s richly rendered fur, it’s hard not to recall how much was once made of the difficulty Pixar faced in rendering such complex textures.

Sora AI Features

Sora’s most surprising ability is one it wasn’t explicitly trained for. Powered by a version of the diffusion model behind OpenAI’s DALL-E 3 image generator, combined with the transformer-based engine of GPT-4, Sora not only produces videos that satisfy a prompt’s demands but also demonstrates an emerging grasp of cinematic grammar.
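To make the diffusion idea concrete, here is a toy sketch of how diffusion sampling works in general: start from pure noise and repeatedly denoise a latent video tensor. This is an illustrative assumption, not OpenAI’s actual architecture or code; the `toy_denoiser` below is a placeholder for the learned transformer a real system would use.

```python
import numpy as np

def toy_denoiser(latent, t):
    """Placeholder for a learned noise predictor (a real system would
    use a trained transformer here). It simply treats most of the
    current latent as noise to be removed."""
    return latent * 0.9

def sample_video_latent(frames=16, height=8, width=8, channels=4, steps=50, seed=0):
    """Diffusion-style sampling sketch: begin with Gaussian noise shaped
    like a video (frames x height x width x channels), then take many
    small denoising steps toward a clean latent."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal((frames, height, width, channels))
    for t in reversed(range(steps)):
        predicted_noise = toy_denoiser(latent, t)
        latent = latent - predicted_noise / steps  # one small denoising step
    return latent

latent = sample_video_latent()
print(latent.shape)  # (16, 8, 8, 4)
```

The key point the sketch captures is that video diffusion operates on a spatiotemporal block all at once, which is one reason such models can exhibit shot-to-shot coherence without being told to.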

That is a gift for storytelling. Another video was created from a request for “a beautifully illustrated paper world of coral reefs filled with colorful fish and sea life.” Bill Peebles, another researcher on the project, notes that Sora created narrative momentum through its camera angles and timing. “There are actually multiple shot changes. They aren’t stitched together; the model generates them on the fly,” he says. “We didn’t tell it to do that. It did it automatically.”

In another example, Sora AI recreated a zoo. It opened with the zoo’s name on a large sign, then the camera gradually panned through a series of scenes showing the variety of animals living there. It did all this with cinematic flair, a stylistic choice it had never been asked to make.

One feature of Sora that the OpenAI team hasn’t demonstrated, and may not release for some time, is the ability to generate a video from a single image or a sequence of frames. “This will be another fun way to enhance storytelling,” says Brooks. At the same time, OpenAI recognizes that this feature carries significant potential for generating misinformation and disinformation.


Limitations of this model

Sora AI has the same content restrictions as DALL-E 3: no violence, no pornography, and no impersonating real people or imitating the style of named artists. Additionally, as with DALL-E 3, OpenAI provides a way for viewers to identify results as AI-generated.
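The article doesn’t specify the mechanism for labeling outputs, but a common industry approach is to attach provenance metadata to generated media (the C2PA “content credentials” standard works along these lines). A hypothetical sketch, with all field names invented for illustration:

```python
import hashlib
import json

def make_provenance_record(video_bytes: bytes, generator: str) -> dict:
    """Hypothetical provenance record: a content hash plus generator
    info that a viewer-side tool could check to confirm a clip is
    AI-generated. Field names are illustrative, not a real schema."""
    return {
        "sha256": hashlib.sha256(video_bytes).hexdigest(),
        "generator": generator,
        "claim": "ai-generated",
    }

record = make_provenance_record(b"fake video bytes", "text-to-video model")
print(json.dumps(record, indent=2))
```

The hash binds the claim to one specific file, so the label survives copying but breaks if the pixels are altered, which is also the approach’s weakness: re-encoding or cropping strips the link.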

But OpenAI says safety and reliability are ongoing concerns across the industry. “Tackling misinformation will require some restraint on our part, but it also requires public understanding, and social media platforms will need to adapt as well,” said Aditya Ramesh, principal researcher and head of the DALL-E team.

Another potential issue is whether the content of Sora’s videos infringes on someone else’s copyrighted work. The training data comes from publicly available content as well as licensed content, according to Peebles. Whether publicly available copyrighted content is a valid subject for AI training is at the center of the many lawsuits against OpenAI.

It will be a long time before text-to-video becomes a threat to real cinema. You can’t stitch together 120 one-minute Sora clips to make a coherent movie, because the model doesn’t respond to prompts in exactly the same way each time; continuity is impossible. But time limits are no obstacle to Sora and similar programs transforming TikTok, Reels, and other social platforms.

Dev is a seasoned technology writer with a passion for AI and its transformative potential in various industries. As a key contributor to AI Tools Insider, Dev excels in demystifying complex AI Tools and trends for a broad audience, making cutting-edge technologies accessible and engaging.
