The Not Ready for Prime Video Players
Over the past year, so much of the discourse around AI has centered on its ability to completely replace creators. “Will an AI ever be able to write a script?”
Or “can it make a documentary based entirely on prompts?”
“Do we even need graphic designers any more, now that AI can almost kind of make original images, so long as the hands are tucked safely out of frame?”
But as the limitations of the technology become increasingly difficult to ignore, some of these questions are starting to feel a bit less relevant. At least for now, it seems unlikely that AI is going to entirely replace human storytellers, filmmakers, artists, and writers. Attempts to suggest otherwise – like Fable Studios’ “Showrunner” app, which purports to give users the ability to create their own episodes of TV with just text prompts – are increasingly met with cynicism bordering on ridicule.
Still, it’s hard, if not impossible, to discount the future of AI entirely, if only because of the intense interest in the tech from penny-pinching, union-busting corporations and the massive investments in its development by Wall Street and venture capitalists. The billionaires and conglomerates that power America’s economy are gung-ho and making huge bets on AI, and they don’t like to lose. So it’s going to be with us for a while, whether or not it seems capable of churning out quality screenplays. (It does not.)
Maybe Try Raging WITH The Machine?
At least in the short-term, it seems far more likely that AI applications will become a part of everyone’s job, rather than stealing it from them. This is probably true in most industries, as companies from just about every sector excitedly jump on board the AI hype train to get investors excited and make their Q2 earnings presentations more lively. And it’s certainly true for digital creators, who already have a host of options for integrating AI apps into their workflow.
Just last week, ElevenLabs launched a new AI-powered Sound Effects tool, which can generate up to 22 seconds of sounds and soundscapes based on text prompts. The company trained its model on audio from Shutterstock, which has also licensed its archive to train AI models for OpenAI, Meta, and Google. (Shutterstock’s AI licensing business brought in over $100 million last year.)
As with so many other tempting tools for creators, Sound Effects is a freemium offering; you can start using it for free, but paid users get a number of benefits, including longer prompts, and the commercial license to any sounds they generate.
In a thread on Twitter/X, ElevenLabs shared some sample outputs, and as with so many other AI tools, it’s a bit of a mixed bag. Sound Effects can nail simple prompts, like “Music Techno Loop” or “Rain Thunderstorm Ambience.” But things get a bit more fuzzy and indistinct as the instructions get more layered or complex, like this attempt to simulate a rocket launch. The biggest overall downside remains consistent with other generative AI apps: the inability to go back and make any tweaks or edits.
For example, this prompt asks for a woman singing “Dancing in the sand / We watched the daylight end.” But instead she pretty clearly sings “Dancing in the sand / We watch da daylight in.” Not too bad, right? Close. But there’s no way to go back into Sound Effects and tell it, “Oh, so close, but your second line is slightly wrong.” It’s not really a “brain” with a memory. You just have to try the same prompt again and hope to get a better result.
Putting the “Lab” in Collaborative
This neatly highlights why AI tools function well as a small part of a human’s overall creative process, but poorly when left to their own devices. A creator making a video who just needs a quick bit of audio could theoretically work with Sound Effects for long enough to generate the exact element that they require. But the app could not replace that human creator, whose ability to mentally envision the finished project and make high-level decisions about the work remains essential. It can serve one small, specific function, taking over a job that might have once gone to a sound effects editor, or at least the curator or distributor of a sound effects library, but not a filmmaker or project manager.
Another recently launched AI tool, Cartwheel, serves a similar function. Rather than fully animating an entire sequence, as apps like OpenAI’s Sora attempt to do, Cartwheel takes care of the early, labor-intensive basics so a human artist can come in and finish the job. The app allows animators to skip the initial step of creating elementary movements and basic motions; they can automatically generate these, then export Cartwheel’s creations to use as a starting point for their own work.
It’s a rather elegant solution to the AI “editing” problem. Even if Cartwheel doesn’t do a 100% perfect job, as long as it gets most of the way there, a human animator can come in and make the necessary adjustments. Attempting to generate finished work that doesn’t allow for constant human intervention seems to be what causes so many of these models to go astray.
Everyone’s All In on AI, It Seems
Relatively speaking, we’re still in the very early days of AI development, and it’s likely that even more innovative and thoughtful ways to use these models and apps behind the scenes will arise over the coming years. On a recent episode of the “All In” podcast, co-host David Sacks (yes, the guy hosting that Silicon Valley Trump fundraiser) introduced his new Slack competitor, Glue.
Glue’s basic functionality mirrors Slack’s, but with AI bells and whistles. Most of it is probably more exciting for operations managers than digital media creators, but one particular feature that Sacks showcased on the podcast hints at a whole host of potential future applications. Sacks and one of the show’s producers loaded transcripts of every episode of “All In” into the app, giving Glue the ability to generate detailed data and “reports” about the show.
For example, “All In” producers can ask Glue how many times a particular country has been mentioned on the show, to break down each speaker’s share of airtime by percentage, or to describe the personality profiles of each co-host and guest. From a producer standpoint, the potential here is pretty vast. Traditionally, in order to answer these kinds of specific content-related questions, you’d need interns or production assistants to scan through every episode of the show and take diligent notes, even jotting down notable timecodes for good examples. (In the past, I’ve even worked on podcasts that relied on diligent fans to track these sorts of highlights or key data points.) But Glue or a similar AI app can do all of this work instantly, provided the raw materials are reasonably well organized.
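To make the idea concrete: some of these transcript questions don’t even need a language model. Here’s a minimal Python sketch, using a made-up `(speaker, text)` transcript format (Glue’s internals aren’t public, so the function names and data shape are purely illustrative), that answers two of the queries described above: how often a term is mentioned, and each speaker’s share of the conversation.

```python
from collections import Counter

def speaker_shares(transcript):
    """Given (speaker, text) pairs, return each speaker's share of
    total words spoken, as a rounded percentage."""
    words = Counter()
    for speaker, text in transcript:
        words[speaker] += len(text.split())
    total = sum(words.values())
    return {s: round(100 * n / total, 1) for s, n in words.items()}

def count_mentions(transcript, term):
    """Count how many transcript lines mention a term, case-insensitively."""
    return sum(term.lower() in text.lower() for _, text in transcript)

# Toy episode in the assumed format.
episode = [
    ("Host A", "Welcome back. Today we cover the economy of Japan."),
    ("Host B", "Japan again? We talked about Japan last week."),
    ("Host A", "True, but there is fresh news."),
]

print(count_mentions(episode, "Japan"))  # → 2
print(speaker_shares(episode))
```

Fuzzier questions, like “describe each co-host’s personality,” are where a model earns its keep; but a lot of the grunt work producers describe is plain text processing over well-organized raw material.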
Again, the magic ingredient here is that Glue isn’t designed to be the star of the show itself. It’s just a new kind of business chat software. But when employed in a clever way, it gives a creative team new options for how to approach their jobs. That’s a bit of a different way of thinking about our approach to AI tools, and one that doesn’t actually require anyone to lose their job. Which is certainly a plus.