A guide on analyzing audio

To work with audio I’ve recorded, I use the free program, Audacity. The first step is to download it from https://www.audacityteam.org/download/

You will need to download and install two Nyquist plug-ins, Pitch Detect and Measure RMS, before you begin. You can find them by Googling “Audacity Nyquist Plug-ins” and clicking on the link to the Audacity site. To install them, go to Tools>Nyquist Plug-in Installer and select the files. Then go to Tools>Add/Remove Plug-ins, select all and click “Enable”.


Before you start, I suggest setting up keyboard shortcuts by going to Edit>Preferences>Keyboard.


I usually set the following shortcuts:

Amplify – A

Noise Reduction – N

Spectral Edit Multi-tool – S

High-pass filter – H

Low-pass filter – L

Notch Filter – Shift+N

Compress – C

Pitch Detect – P

Label Audio – Shift+L

Measure RMS – M

Tempo – T


I also frequently use the built-in shortcuts:


To type in a label – Ctrl+B

To create a break in the audio, so it can be worked with in sections – Ctrl+I


Select audio


Switch to Multi-view. It is in the drop-down menu, to the left of your recording. This allows you to view the spectrogram and waveform at the same time. When you need to take a closer look at the spectrogram, you can toggle the Waveform off, in the same menu.


Under the Analyze menu, run Measure RMS and note the level. You will use it when searching for silence.
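For anyone curious what the RMS number represents, here is a minimal Python sketch. It is not the plug-in's own code. It assumes the samples are floats between -1.0 and 1.0, with 1.0 as full scale, and the function name `rms_db` is made up for illustration.

```python
import math

def rms_db(samples):
    """Return the RMS level of float samples (-1.0..1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure digital silence
    # 10*log10 of the mean square equals 20*log10 of the RMS value
    return 10 * math.log10(mean_square)

# One second of a full-scale 440 Hz sine wave, sampled at 44100 Hz
sine = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
print(round(rms_db(sine), 1))  # -3.0
```

Quiet background noise will give a much lower (more negative) number than speech, which is why the RMS level is a useful starting point when searching for silence.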


Since the silence finder plug-in has disappeared from my drives and the internet, I now use “label audio” to find silence. I tell it to label the area between sounds. I specify that the silence needs to be 0.25 seconds and, personally, I set it to label silence as “ssh”. I start by searching for sounds that match the RMS level and work my way down if I get an error message.
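The idea behind labeling silence can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how Audacity does it: the function name `label_silence`, the 10 ms measuring window and the float sample format are all hypothetical choices.

```python
import math

def label_silence(samples, rate, threshold_db, min_silence=0.25, window=0.01):
    """Return (start, end) times in seconds of stretches quieter than threshold_db
    that last at least min_silence seconds. Level is measured per short window."""
    hop = max(1, int(rate * window))
    regions = []
    start = None  # sample index where the current quiet stretch began
    for i in range(0, len(samples), hop):
        chunk = samples[i:i + hop]
        mean_square = sum(s * s for s in chunk) / len(chunk)
        level = 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")
        if level < threshold_db:
            if start is None:
                start = i
        elif start is not None:
            if (i - start) / rate >= min_silence:
                regions.append((start / rate, i / rate))
            start = None
    # A quiet stretch that runs to the end of the recording
    if start is not None and (len(samples) - start) / rate >= min_silence:
        regions.append((start / rate, len(samples) / rate))
    return regions

rate = 1000
tone = [0.5 * math.sin(2 * math.pi * 100 * n / rate) for n in range(500)]
audio = tone + [0.0] * 300 + tone  # 0.5 s of sound, 0.3 s of silence, 0.5 s of sound
print(label_silence(audio, rate, threshold_db=-40))  # [(0.5, 0.8)]
```

If nothing is found, lowering `threshold_db` step by step mirrors working your way down from the RMS level.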


Optional: a high-pass filter at 100-150 Hz and a low-pass filter at between 4000 and 6000 Hz, bearing in mind that the lower you set the low-pass filter, the more hollow it will sound and the higher you set the high-pass filter, the more likely you are to cut out part of the voice.
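To show what these two filters do, here is a rough Python sketch using the simplest possible (first-order) filters. Audacity's own filters let you pick steeper roll-offs, so treat this as an illustration only. The function names, the 150 Hz and 5000 Hz cutoffs and the test tones are my assumptions.

```python
import math

def low_pass(samples, rate, cutoff_hz):
    """First-order low-pass: keeps what is below cutoff_hz, softens what is above."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / rate
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out

def high_pass(samples, rate, cutoff_hz):
    """First-order high-pass: keeps what is above cutoff_hz, softens what is below."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / rate
    alpha = rc / (rc + dt)
    out, prev_y, prev_x = [], 0.0, 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        out.append(prev_y)
        prev_x = x
    return out

def steady_peak(samples):
    """Peak of the second half, after the filter has settled."""
    return max(abs(s) for s in samples[len(samples) // 2:])

rate = 44100
hum = [math.sin(2 * math.pi * 50 * n / rate) for n in range(rate)]      # low rumble
voice = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]  # speech range

def band(samples):
    return low_pass(high_pass(samples, rate, 150), rate, 5000)

print(round(steady_peak(band(hum)), 2))    # the rumble comes out much quieter
print(round(steady_peak(band(voice)), 2))  # the 1000 Hz tone is mostly untouched
```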


Amplify the audio to 10 dB higher than the number which is automatically entered (which would bring the peak to 0 dB). To do that, you need to check the “allow clipping” box. If the filled-in number reads 11 dB, for example, change it to 21.
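The arithmetic behind this step can be sketched in Python. This is a simplified model, assuming float samples between -1.0 and 1.0 and that clipping simply flattens anything past full scale. The function name `amplify` is hypothetical.

```python
import math

def amplify(samples, extra_db=10.0):
    """Boost so the peak would land extra_db above 0 dB, then clip to -1.0..1.0.
    Returns the gain applied (in dB) and the clipped samples."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("cannot amplify pure silence")
    to_zero_db = -20 * math.log10(peak)  # gain that would bring the peak to 0 dB
    gain_db = to_zero_db + extra_db
    factor = 10 ** (gain_db / 20)
    clipped = [max(-1.0, min(1.0, s * factor)) for s in samples]
    return gain_db, clipped

peak = 10 ** (-11 / 20)              # a recording whose loudest point sits at -11 dB
samples = [peak, -0.5 * peak, 0.01]  # loud, medium and very quiet samples
gain_db, boosted = amplify(samples)
print(round(gain_db, 1))  # 21.0
print([round(s, 3) for s in boosted])
```

The loud samples hit the ceiling and clip, while the very quiet sample becomes roughly eleven times larger, which is the point of over-amplifying quiet material.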


Select one of the areas of silence and open Noise Reduction from the Effect menu. Press “G” to get the noise profile.


Open noise reduction again and preview what it sounds like with the default settings. You want as few artifacts as possible, so play with the level and smoothing bands. 


The more smoothing bands, the less likely there are to be artifacts (weird electronic twinkle sounds) but the more likely it is to cut out some of the voices. The lower the level, the less likely it is to cut out voices but the more likely there are to be artifacts. For silent sound, the lower you can get both sliders, the better, at least initially. Once you’ve isolated voices, you can use a higher level but try to keep the smoothing low. Always preview. Sometimes, if you’ve found an area of absolute silence, level 24 with 0 smoothing bands works amazingly. Usually, as far as level goes, I find the magic number is 3, with 3 to 9 smoothing bands.


You can check what is being cut out by changing it from “reduce” to “residue” and previewing it. Change it back to reduce before you click “OK” or you’ll only have the noises you meant to get rid of.


After noise reduction, amplify the audio to 0 dB. Listen to what you’ve got so far.


If it isn’t clear yet, I do one or both of two things. I compress the audio, often with a ratio of 10:1 and the “compress based on peaks” box checked, or I start the process over: amplifying to 10 dB above the peak, selecting a new area of silence, opening noise reduction, pressing “G”, opening it again and reducing noise.
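What a 10:1 ratio means can be shown with a small Python sketch. It only models the basic level curve and ignores attack, release and make-up gain. The -12 dB threshold and the function name `compressed_level` are assumptions picked for illustration.

```python
def compressed_level(level_db, threshold_db=-12.0, ratio=10.0):
    """Output level in dB for a given input level: anything above the threshold
    only rises by 1 dB for every `ratio` dB of input."""
    if level_db <= threshold_db:
        return level_db  # below the threshold, nothing changes
    return threshold_db + (level_db - threshold_db) / ratio

for level in (-30.0, -12.0, -2.0):
    print(level, "->", compressed_level(level))
```

With these numbers, a peak that was 10 dB over the threshold ends up only 1 dB over it, so the gap between loud and quiet parts shrinks and the whole thing can then be amplified further.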


If you have horizontal lines spanning your recording at a certain frequency, you can get rid of them using the notch filter. The higher the Q value, the narrower the notch will be. In my case, 0.6 to 1 usually does the trick. My horizontal lines often appear at 800 Hz, 1100 Hz, 2100 Hz, 3100 Hz and 5100 Hz. If you have such a line somewhere between 600 and 1200 Hz, you want a narrow notch, because that is in the range of speech. If you can’t eyeball the frequency or you want the exact number, use Pitch Detect from the Analyze menu.
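To get a feel for how the notch width changes with the setting, here is a small Python sketch. It assumes the filter's number behaves like a textbook Q factor, where the width of the notch is the center frequency divided by Q. The plug-in may define it differently, so check by ear and on the spectrogram. The function name `notch_edges` is hypothetical.

```python
import math

def notch_edges(center_hz, q):
    """Approximate lower and upper edges (in Hz) of a notch centered on center_hz,
    using the textbook relation: width = center_hz / q."""
    half = 1 / (2 * q)
    root = math.sqrt(1 + half * half)
    return center_hz * (root - half), center_hz * (root + half)

for q in (0.6, 1.0, 5.0):
    low, high = notch_edges(800, q)
    print(q, round(low), round(high))
```

Under this assumption, doubling the value halves the width of the notch, so a higher setting takes less of the speech range with it.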


If you have visible artifacts, you can get rid of them by highlighting them and using the Spectral Edit Multi-tool.


After I have cleaned up the audio, I break it up into manageable sections of audio. I find areas with a similar amplitude and press Ctrl+I. Then I am able to amplify those sections more. Remember, when working in sections, you will need to select an area of silence within that section for further noise reduction.


Once you’ve sectioned it off, continue to use the tools I’ve discussed to clarify it.


When I find speech that is hard to understand, I slow down the tempo by about 20%, so I can accurately label it. You can leave it slowed down or speed it back up afterwards. If you choose to speed it back up and then share the audio, people may think you’re nuts, because they don’t hear what you heard when you carefully analyzed it. On the other hand, if you leave it slowed down, some people think you have manipulated the audio too much for it to be of evidentiary value.
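The timing arithmetic for a tempo change is simple enough to sketch in Python. This assumes the tempo setting is a percent change, where -20 means the audio plays at 80% of its original speed. Both function names are made up for illustration.

```python
def slowed_length(seconds, percent_change=-20.0):
    """Length of a clip after changing its tempo by percent_change."""
    return seconds / (1 + percent_change / 100)

def original_time(slowed_seconds, percent_change=-20.0):
    """Map a time in the slowed clip back to the matching time in the original."""
    return slowed_seconds * (1 + percent_change / 100)

print(slowed_length(10.0))  # a 10 second clip becomes 12.5 seconds
print(original_time(5.0))   # 5 seconds into the slowed clip is 4 seconds in the original
```

This is handy if you label the slowed audio and later need to point someone to the same spot in the untouched recording.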


Once you have worked out what a section of audio is saying, select it and press Ctrl+I, to separate it, then press Ctrl+B and type in a label for it. Especially when you have a long section of speech, doing it this way makes it much easier to eventually get the big picture. It’s like doing a puzzle.


I will continue to edit this article. It is a rough draft but so many people ask me how I work with recordings that I wanted to go ahead and get it posted.

September 21, 2021 Update

I neglected to mention, in my original post, something I have been doing recently, which I find helpful. Lately, when I open an audio file, before I do anything to the audio, I press Ctrl+A to select it all and Ctrl+C to copy it. Then I open the pull-down Tracks menu and add a new track to my recording. If my recording is stereo, I select stereo and if it is mono, I select mono.

Next, I paste (Ctrl+V) a copy of my audio into the track, at the very beginning, so the time matches up. I toggle the new track to Mute, in the small menu to the left of the audio.

I work on the original audio and, if the recording is of particular interest, I repeat the whole process from scratch on the pasted copy. This improves the quality of the finished audio immensely. There are so many variables in my process that, while the content of the recording doesn’t change, no two tracks will ever turn out to be identical. What one track lacks, another may have. In fact, when working on the second track, be mindful of anything you wish sounded better. When you get to those parts, make sure to select different areas of silence for noise reduction. When both tracks play together, flaws like pops and artifacts are minimized.
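Why playing several cleaned-up copies together can soften flaws is easy to see with a toy Python example. It assumes the tracks are mixed by combining them sample by sample. Averaging is used here only to keep the numbers in range, and the sample values are invented.

```python
def mix(tracks):
    """Average equal-length tracks sample by sample."""
    length = len(tracks[0])
    if any(len(t) != length for t in tracks):
        raise ValueError("tracks must be the same length")
    return [sum(column) / len(tracks) for column in zip(*tracks)]

# The same steady sound (0.2) cleaned up twice; each copy ended up with
# a pop (0.8) in a different place.
track_a = [0.2, 0.8, 0.2, 0.2]
track_b = [0.2, 0.2, 0.2, 0.8]
print(mix([track_a, track_b]))  # [0.2, 0.5, 0.2, 0.5]
```

The sound both copies share stays at full strength, while each pop, which exists in only one copy, is pulled down toward it.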

In addition, adding extra tracks sometimes helps you to understand what is being said and may even uncover voices which were cut out of the first track. You may also realize that you misheard the content. Leave the old audio in there too but make sure to edit your labels accordingly.

Sometimes I add as many as 6 tracks (after that, especially on a hacked computer, it gets too laggy). The more you add, the better it generally sounds. If you screw up on one of the tracks, delete it, so it doesn’t negatively impact your work.
