The Sound Board

Posted: **Jan 26, 2023 6:17 pm**

The age of audio miracles is not yet past.

Adobe Podcast is in Beta. It’s intended as an easy way to make podcasts sound better. Yeah yeah you say. So cutting to the chase… jump to circa 3m 15s to hear the raw recording, and then the one-click speech enhance.

Pro Tools Expert did their own tests with similarly astonishing results - scroll down to the iPhone audio in Conservatory examples here https://www.pro-tools-expert.com/produc ... t-any-good

IMO noise reduction and restoration is the one single area of audio where astonishing progress is being made. Everything else feels trivial by comparison. And as PTE point out, iZotope are being left for dust.

Posted: **Jan 27, 2023 3:45 am**

I've just tried it myself - incredible. Upload some shoddy mic audio here - it's free, you just need to be logged in. https://podcast.adobe.com/enhance#

Posted: **Jan 27, 2023 10:43 pm**

That sounds insanely good. I wonder what the subscription price will be and how well it will work on acoustic music and singing through decent close mic’d stuff in a semi-noisy environment. Definitely going to try it.

Posted: **Jan 27, 2023 11:54 pm**

Real magic!

Posted: **Jan 28, 2023 4:05 am**

Some real wold reports on two challenging podcast episodes. The first had a poor consumer mic in a slighty ambient room, and was further compounded by some baked-in aggressive echo cancellation that was dipping some of the audio out; the second was a clear but unpleasantly thin earbud microphone.

In both cases Adobe performed extraordinary and scarcely believable miracles. The phone mic was perhaps the most remarkable as it really did sound near studio quality - from 40% to 85% perhaps. The voice how has a natural body to it that no other effect could do.

It is of course a trick, and very occasionally the trick goes wrong - once when the contributor reduced their voice level and leant into the microphone, Adobe turned it so some strange gravelly robot-ized unpleastantness. Occasional syllables drop out. One really interesting flaw came as a result of some dodgy editing on my part - it sounded just about ok in the original form, but horribly mangled post-Adobe. I took this as a justified criticism of the editing itself, and redid that section with the audio already processed. What it said to me is that my frankenstein edit did not pass for correct conventional speech, so the AI-trained algorithm didn't know what to do with it.

I'd guesstimate that these flaws affect less than 1% of the material I fed in. That's extraordinary - any Joe or Joanna can simply upload bad audio and it come back sounding better than even the most experienced and talented audio engineer with the very best tools could have done until literally last week. That's sobering. But as a crumb of comfort perhaps, you can't just do the process and forget, you'll need to check for these occasional gremlins and do your best to paper over the cracks. A few times I had to boost a syllable by over 20db - it still sounded a bit odd but it was very brief and result once more intelligible.

I haven't tried vocals yet - I think I'd have to artificially make something bad enough to be worthy of correction in this way! Other tools like RX and especially Waves Clarity do such a good job of smaller problems like background fan noise or the occasional noise off. I guess everyone makes a decent effort when recording vocals - to my knowledge no-one records them down the phone for example. With speech you're routinely dealing with all manner of horrors.

Posted: **Jan 28, 2023 6:27 am**

Thanks for sharing all of that Guy.

I am a bit startled by the strides AI has made with sound in the last few years. This could potentially kill RX in a very short time as it learns from each file uploaded to its servers.

Google has made public Music LM - https://google-research.github.io/seane ... /examples/

I am still trying to confirm this (need to research more) but it looks like Music LM really is generating the tracks from study of audio files - it's not a sample library and it isn't just printing notation that has to be programmed or performed. Like the AI pictures being generated from Gaussian blur, it's frequencies which are currently at 24Khz - no awards to guess how fast it reaches 44.1 Khz resolution. All the examples are in Mono but I am guessing eventually it will be two channel and from there scaled to multi-channel.

This is a very serious and impressive feat.

What this is telling me is that eventually (seems like sooner than later) - RX, Reverbs, Zynaptiq, SPAT, Sample libraries, sample modeling type of tools could go through a multi generational change very, very quickly.

Some people could just buy AI music generated at full resolution and subsequently the market for these tools for musicians and sound designers will drop massively.

I am also curious about the energy use here. How much energy does all of this take if you open it all to the public. Google certainly may not be interested in this as a business but some of the money being used to buy gigantic artist catalogs could be diverted into AI music - as some companies have already done.

It's still not comparable and the costs may not add up but it's most probably going to happen.

Posted: **Jan 28, 2023 7:04 am**

Yes, it's definitely happening everywhere Tanuj. A colleague of mine was putting a pitch deck together recently, and he's mastered using AI pictures. He said, essentially, its now better than using an artist because if the artist doesn't get exactly what they want they are stuck, here they just refine the criteria when its getting close. So that's pretty scary - while I've no doubt that fine art will dismiss it all, jobbing artists are being replaced.

The music above in MusicLM will be just fine for so many purposes. 4th down, the reggae one with the vocals. Put that playing in the background to a bar scene in the Caribbean, no-one would batt an eyelid. And as you say, it will only get better, and better quality.

The same is true in writing. Right now everything as at a still almost-comfortable level where essentially it can create brand new variations of existing material. The wisdom is that it can't do original thought. But what constitutes original thought - we're all products of our experiences - is a pretty grey area.

Returning to Speech Enhance, what strikes me as awe-inspiring about it compared to all the above examples is that it is performing a task far better than previously possible. With the art and music, we think "wow that's like a human could have done it". Speech Enhance is doing something better than humans with their clumsy tools could ever do.

Posted: **Jan 28, 2023 9:45 am**

Watching the lunchtime BBC news (story about a collapsed UK airline) I heard a typical terrible quality down the line interview. Here's a before and after.

Original broadcast audio
https://www.dropbox.com/s/qd68c6lqog7ya ... t.wav?dl=0

Adobe Speech Enhanced audio
https://www.dropbox.com/s/dxdz30wqalwqa ... 9.wav?dl=0

The hole at 5 seconds was in the original so not much you can do about that. This still has some room ambience in it - not unpleasant - but I wondered what would happen if I put it through RX's excellent dialogue de-reverb and added a little compression

Adobe Speech Enhanced and RX Dialogue de-reverb
https://www.dropbox.com/s/mqbi3oufi1rq7 ... 9.wav?dl=0

Amazing. And to compare it with JUST the RX dialogue de-reverb:
https://www.dropbox.com/s/wdqobsh2upyuq ... b.wav?dl=0

RX's Dialogue Isolate isn't great
https://www.dropbox.com/s/d1c4x6554on3b ... 9.wav?dl=0

While Waves Clarity didn't touch it.

The contributor, even in the best case, doesn't sound like she's in a booth with a great mic, there's still a slightly processed feel (though remember this would have originally been using Zoom for the broadcast interview). Although not 100% perfect, this demonstrates that this is a whole new ball game, though some comfort that RX can still be useful as an additional tool.

Posted: **Jan 29, 2023 7:39 am**

Posted: **Jan 29, 2023 10:04 am**

Enhanced (the second one, without de-reverb) is the best for me. De-reverb adds some pumping artefact.

Posted: **Jan 29, 2023 1:42 pm**

Hannes_F wrote: ↑Jan 29, 2023 10:04 am Enhanced (the second one, without de-reverb) is the best for me. De-reverb adds some pumping artefact.

There we are - AI alone better than this human’s involvement!

FWIW I disagree if the aim is pure intelligablilty, which for broadcast or podcast should be the primary aim. The version with RX and compression has the best clarity imo.

Incidentally, I was quite surprised that Adobe left as much room in as it did. Most other similar examples I’ve tried sounded more like the version with RX on top.

Posted: **Jan 30, 2023 9:40 am**

Guy Rowland wrote: ↑Jan 29, 2023 1:42 pm There we are - AI alone better than this human’s involvement!

Yes, that can happen, but needs not be a rule.

As an analogy, human recordings do not need to be better than samples per se, just because a human instrumentalist is involved. Actually, they often are not, for example I had to up my recording game considerably in terms of instruments, microphones, equipment, acoustics and editing in order to stay competitive. Which eventually happened and by a good margin, but it was a struggle.

Maybe now perhaps composers and editors will begin to relate how it feels if you are on the brink of "becoming replaced" by technology - it forces you into an extra struggle for quality, and for that which the algorithms still can not do. Nevertheless there is an element of existential fear involved, and that is not going away easily.

Posted: **Jan 30, 2023 12:52 pm**

Hannes - well... yes and no.

So much of the AI discussion is at the rarefied end. Are artists in danger of not being displayed at the Tate Gallery? No. Are concert composers and musicians under threat from AI stopping them performing at the Albert Hall? No. Are jobbing composers producing muzak for online corporate videos in trouble? Hell yeah.

Those are extremes. With each passing month, that line is going to creep up. Who knows where it would settle. But I already know people who are using AI art in preference to using an artist for particular applications.

Adobe Podcast is interesting. It has no bearing on musicians really. But it will eliminate some boring routine audio tasks that were once someone's job, because it does quicker, cheaper and - crucially - better.

Remember the application here. This is not a tool designed for critical listening. That is what we do all the time as musicians, constantly striving for the most pleasing whole. Spatialisation, humanity, dynamic range and a whole host of other concerns. AI isn't there yet, we hope, we have our skills to contribute still.

But that's not what Adobe Podcast is for. That's for functional listening. Podcasts are not for critical listening, they are for listening in the kitchen when stir frying and in the car doing 70 on the motorway. This is why speech radio audio, unlike TV / film audio, has evolved on particular lines - very close mics, compressed to buggery. In the main voices need to be full bandwidth, crystal clear with all levels equal and pretty much zero dynamic range. Frequently any music underneath gets pushed out of the way dramatically with side-chaining. Any deviation in this with radio or podcasts I tend to find irritating as I'm turning the volume up and down.

And Adobe is doing that specific job better than sound engineers have ever been able to before. Right now I'm finding it needs some massaging, some errors creep in, but it will only improve, and many will decide that level of nicety isn't required.

Posted: **Jan 30, 2023 1:14 pm**

I think there is another factor - the number of people that can operate at the top level is decreasing. I have no idea why, some blame it on computers, some just blame people, but as Hannes put so well, you have to up your game or fall by the wayside.

Adobe Podcast is popular because it is good enough. A good audio engineer can still do a better job - at least today. A good audio engineer takes into account pacing, and rhythm, and for some reason I've yet to hear an example from Adobe Podcast (in fairness I haven't listened to a lot yet) that seems to take that into account.

In terms of audio I do believe that visual audio editors have contributed. It is still far easier to find the EXACT edit point by listening, but unlike tape decks, computers don't let you rock the reels, or stop on a dime, and sometimes it is that lowly dime that makes the difference.

For those in the US, listen to PBS news out of the Washington bureau... there are producers who still care about the rhythm of a story, and there are those that don't, and the difference is audible, unless you are happy with good enough. As a listener I'll admit I could not care less about the rhythm of a news story, but as a recovering broadcaster I still tip my hat to those that pay attention to those smallest of details. It isn't that I can't, or won't, listen to a new story that was put together to be good enough, the goal is to be informed (as much as one can be informed by the 4th estate these days, but that's a different thread) - no, I listen to the news because I want to know what's going on. But when a segment comes on that was produced with the utmost care I do notice. And sometimes it makes me sad that it is a dying art.

That is just one example of where the human touch has value - today. It probably won't be difficult for the developers to update Podcast to pay attention to timing. But they have to want to do it, and that is unlikely because (a) they may not even know it is a thing, and (b) it probably won't sell more copies.

One other point in his post - samples often sound better than real live musicians - they use better gear, and they get to redo each note till they get it right (not sure "get to" is the right wording<G>).

A long time ago I picked up Scarbee's J-Bass for GigaStudio (I said a long time ago). I showed it to a bass player friend, a pretty talented one at that, and he was dismayed. He had a good bass, which he kept in top shape, and a good amplifier, and he felt his takes would never sound as good as the GigaStudio take. So we did an experiment, and in isolation we could both identify which was which, and they were different, and the GigaStudio track did sound better. But in context it was just as obvious, except the human track sounded better because it fit the song better, and the minor difference in gear and technique just didn't matter.

Thanks Hannes!

Posted: **Jan 30, 2023 2:06 pm**

wst3 wrote: ↑Jan 30, 2023 1:14 pmAdobe Podcast is popular because it is good enough. A good audio engineer can still do a better job - at least today.

I’ve posted examples here including a clean clip for comparison - the floor is yours!

Posted: **Jan 30, 2023 2:14 pm**

Not really necessary, your last sentence is spot on, especially that last part.

"Right now I'm finding it needs some massaging, some errors creep in, but it will only improve, and many will decide that level of nicety isn't required."

I wasn't arguing that Podcast and similar tools have no place, but I do lament the acceptance of good enough. And I guess that's why I reacted the way I do, nicety is required, or should be.

Years ago (and we are talking about tape decks and razor blades) I remember the editor asking me to take a second, third, and sometimes fourth shot at a segment. Over time I not only figured out what he was looking for, but I began to appreciate it. And I got to the point that my first edit was ready for air.

And the one INVALUABLE feature that any computer based tool brings to the table is "Undo". Man I hated undoing splices!! It can be done, but it takes longer than the original cut<G>.

Posted: **Jan 30, 2023 2:19 pm**

I think the point though is that, even with the occasional errors, overall it does a better job than any human.

Put it this way, it makes 99.5% of it sound 100% better, whereas a talented sound engineer without Adobe might be able to make 99.9% of it sound 50% better.

Posted: **Jan 30, 2023 4:51 pm**

I misunderstood this tool and tried it on a musical segment, me playing piano at an airport. The results were hilarious as the program tried to turn my (polyphonic) playing into speech!
I think segments of it will make for great humor sfx

Posted: **Jan 31, 2023 7:10 am**

Lawrence wrote: ↑Jan 30, 2023 4:51 pm I misunderstood this tool and tried it on a musical segment, me playing piano at an airport. The results were hilarious as the program tried to turn my (polyphonic) playing into speech!
I think segments of it will make for great humor sfx

OK, now I have to try it.

And Guy, perhaps it is just a philosophical difference of opinion, after all there are very few objective measurements with respect to quality - everyone hears something different. I think it is just that I place a little more weight on factors that may not be important to others.

Posted: **Mar 20, 2023 4:47 pm**

Production Expert did a blind test on 4 audio clean-up examples to see how the AI tools (Adobe and Descript) compared with humans using RX and Acon. Result: AI 4, Humans 0.

https://www.pro-tools-expert.com/produc ... he-results

Posted: **Mar 20, 2023 4:57 pm**

Descript looks insanely useful.

Posted: **Mar 20, 2023 6:22 pm**

Lawrence wrote: ↑Mar 20, 2023 4:57 pm Descript looks insanely useful.

Damn tootin’.

It was new to me. This is their Studio Sound feature, similar to Adobe’s.

Can only be a matter of time when this tech can also replicate natural but clean location audio, not only sat on top of a condenser mic.

The Sound Board

Another new noise reduction miracle

Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle

Re: Another new noise reduction miracle