OpenAI’s Sora will generate videos with audio

In an interview with the Wall Street Journal, OpenAI’s chief technology officer Mira Murati gave an update on the technology, including when it will be publicly available and what’s to come.

What is Sora?

OpenAI is the company behind the generative AI ChatGPT chatbot. Following on from that technology, OpenAI announced Sora. Sora can generate videos up to a minute long based on a text description. In addition, Sora can generate complex scenes with multiple characters, specific types of motion and accurate details of the subject and background. The AI will also create multiple shots within a single generated video. OpenAI shared a selection of videos generated by Sora, and the results were very impressive.

When will Sora be available?

Currently, Sora is only available to a limited number of visual artists, designers and filmmakers. However, Murati said that Sora will be available to everyone this year, and that the wait could be only “a few months.” In addition, Murati said that OpenAI intends for Sora to be able to incorporate audio to make the videos even more realistic. There was no timescale for this goal, though, other than “eventually.”

Editable videos

Murati also told the Wall Street Journal that OpenAI wants users to be able to edit the content that Sora generates. She said, “We’re trying to figure out how to use this technology as a tool that people can edit and create with.” This would also mean that you could correct the generated video if the AI wasn’t accurate.

Training Sora

The sources of the data used by companies to train generative AI are always a hot topic. Murati didn’t want to go into the specifics, but said it was “publicly available or licensed data.” OpenAI also has a partnership with Shutterstock, and Murati confirmed that content from that site was used when creating Sora.

Expensive to run

Sora is obviously a complex technology and Murati said that, as a result, powering it is “much more expensive.” However, OpenAI hopes that Sora will be available for around the same cost as DALL-E, OpenAI’s text-to-image generator.


One of the biggest concerns around generative AI is the issue of deepfakes. In the United States, this is especially contentious this year as there is a presidential election. However, Murati said that Sora won’t be able to generate videos featuring public figures. There are similar restrictions in place with images generated by DALL-E. In addition, Sora videos will be watermarked to show that they have been generated by AI.

What we think

Even at this relatively early stage in its development, Sora appears capable of producing very realistic videos. While many people will be eager to start using the technology, it’s reassuring to see that OpenAI is implementing policies to control the videos that Sora creates. Generative AI has the potential to be a useful tool, but there is also a significant risk that it could be misused.

Pete Tomkies
Pete Tomkies is a freelance cinematographer and camera operator from Manchester, UK. He also produces and directs short films as Duck66 Films. Pete's latest short Once Bitten... won 15 awards and was selected for 105 film festivals around the world.
