Question - How to detect music/audio volume

Hi there,

So lately I have been trying to write a macro for ScreenFlow using Keyboard Maestro.

The problem - I have a 20-minute-long voice recording of me narrating a script. During the recording I pause for long periods of time as I re-write the script in real time. As a result, the entire voice recording has long periods of silence followed by short periods of speech.

When I import it into ScreenFlow I can split the clip with the keypress "T".

My idea was that I would activate the macro and let ScreenFlow play the audio track. The Keyboard Maestro macro would detect when the volume of the audio gets above a certain level and, at that point, press "T". Then it would wait until the volume drops below a certain level (for the first time after a period of high volume) and press "T" again.
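To make the idea concrete, here is a rough sketch of that "press T when it gets loud, press T again when it goes quiet" logic in Python. It is not something Keyboard Maestro or ScreenFlow provides; the function name, the two threshold parameters and the list of loudness values are all made up for illustration, and it assumes the loudness values have already been measured somehow.

```python
def find_split_points(levels, window_seconds, start_level, stop_level):
    """Return the times (in seconds) at which the clip would be split.

    levels: one loudness value per fixed-length window (hypothetical input).
    start_level: loudness above which speech is considered to have started.
    stop_level: loudness below which speech is considered to have stopped.
    """
    split_times = []
    speaking = False
    for index, level in enumerate(levels):
        if not speaking and level > start_level:
            # Volume rose above the upper level: this is where "T" would be pressed.
            speaking = True
            split_times.append(index * window_seconds)
        elif speaking and level < stop_level:
            # First drop below the lower level after a loud stretch: "T" again.
            speaking = False
            split_times.append(index * window_seconds)
    return split_times


# Made-up loudness values, one per half second of audio.
levels = [0.0, 0.1, 0.8, 0.9, 0.7, 0.1, 0.0, 0.6, 0.0]
print(find_split_points(levels, 0.5, start_level=0.5, stop_level=0.2))
# prints [1.0, 2.5, 3.5, 4.0]
```

Using two separate levels (a higher one to start, a lower one to stop) keeps the detector from flipping back and forth when the volume hovers around a single threshold.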

Ultimately I want to clip before and after the parts marked in red here:

However, here is where the problem comes in. How do I detect the volume level of said audio track? Sadly ScreenFlow has no AppleScript support, so I guess everything would have to be done system side. One idea I did think of is scanning through the waveform pixel by pixel with Keyboard Maestro and cutting before and after the peaks, and I did indeed start making this macro. However I couldn't help but feel there was an easier way to do it...

If I can detect audio volume levels I could literally scrub through the audio and press T when needed (or scrub back a bit and press T).
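One way to get at volume levels without going through ScreenFlow at all would be to measure them from the audio file itself. The sketch below is only an assumption about how that could be done: it presumes the narration is available as a 16-bit PCM WAV file, and the function name and window length are invented for the example. It uses only Python's standard library and returns one RMS loudness value per short window.

```python
import array
import math
import sys
import wave


def window_rms_levels(wav_path, window_seconds=0.1):
    """Return one RMS loudness value (0.0 to 1.0) per window of a 16-bit PCM WAV file."""
    levels = []
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        frames_per_window = max(1, int(wav.getframerate() * window_seconds))
        while True:
            raw = wav.readframes(frames_per_window)
            if not raw:
                break
            # Signed 16-bit samples; with stereo audio the channels are interleaved.
            samples = array.array("h", raw)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV data is little-endian
            mean_square = sum(s * s for s in samples) / len(samples)
            levels.append(math.sqrt(mean_square) / 32768.0)
    return levels
```

Calling something like window_rms_levels("narration.wav") (the file name is hypothetical) would give a list of loudness values, and the index of each value multiplied by the window length gives its position in seconds on the timeline.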

So does anyone have any ideas?

Thanks,
~Sancarn

Hey Sancarn,

Keyboard Maestro doesn’t have the ability to detect audio-out amplitude. It can only get and set the system volume level.

I fiddled for a while with the ‘Find Image on Screen’ action to see if I could get it to detect lengthy pauses, but I didn’t have very good luck.


Best Regards,
Chris

Hey Chris,

Hmm… Maybe that’s something Keyboard Maestro can improve upon… The ability to detect pixels on a screen, for instance, is extremely useful! I can imagine the ability to detect parts of an audio waveform and maybe even sections of audio (like the ability to detect whole pictures) would be incredibly useful.

That being said, I’ve tried making my own waveform calculator and I am completely at a loss as to how it works… xD So I guess if Peter and Co. don’t know how to get at this stuff I can see why it hasn’t been implemented yet. Either that or it’s just incredibly difficult to get at? I know there is software that can re-route system sound (e.g. Soundflower or WavTap) so maybe one could ‘listen’ to those… If that makes sense… xD

Yeah… My problem with the pixel detection is that the waveform wasn’t high enough resolution, and sometimes there are micro pauses in my speaking. I guess one would have to take an average of the waveform first or something O_o Anyway, I can’t get it working either, so I figure I am stuck until more is implemented in KM…?
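For what it’s worth, the "average the waveform first" idea and the micro-pause problem could be sketched roughly like this in Python. This is only an illustration under assumptions: the loudness values are made up, both function names are invented, and nothing here talks to Keyboard Maestro or ScreenFlow. The first function smooths the levels with a moving average; the second groups loud windows into segments and treats any pause of a few windows or fewer as part of the same stretch of speech.

```python
def moving_average(levels, width):
    """Smooth loudness values by averaging each one with its neighbours."""
    half = width // 2
    smoothed = []
    for i in range(len(levels)):
        neighbours = levels[max(0, i - half): i + half + 1]
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


def speech_segments(levels, threshold, max_pause_windows):
    """Return (start, end) window indices of speech, with end exclusive.

    A quiet gap of max_pause_windows windows or fewer between two loud
    stretches is bridged, so micro pauses do not split a segment.
    """
    segments = []
    for i, level in enumerate(levels):
        if level <= threshold:
            continue
        if segments and i - segments[-1][1] <= max_pause_windows:
            segments[-1][1] = i + 1  # extend the previous segment across the gap
        else:
            segments.append([i, i + 1])
    return [tuple(segment) for segment in segments]


# Made-up levels: speech, a one-window micro pause, more speech, a long pause, speech.
levels = [0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
print(speech_segments(levels, threshold=0.5, max_pause_windows=1))
# prints [(2, 7), (11, 12)]
```

The cuts would then go at the start and end of each segment, so the one-window pause at index 4 no longer produces an extra pair of splits.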