Tools used for analysis:
- Python 2.7
- dvb t dongle
- rtl_fm program
The first thing to do is analyze few samples that the rtl_fm program outputs. The rtl_fm is the program that can be used to tune into FM channels. Below is a typical command that can be invoked to tune into any FM channel.
rtl_fm -f 91.1e6 -s 200000 -r 48000 – | aplay -r 48k -f S16_LE
-f Specify the frequency, which is 91.1Mhz in the above line
-s The sampling rate used (Sampling rate for the audio data decoded by RTL sdr)
-r The output sapling rate (The sampling rate that the “rtl_fm” is going to use to output audio)
The audio data is written out onto the STDIO which is redirected to the program “aplay” for playing out the audio data onto the speaker. For analyzing the data, we need to have a kind of “snap shot” of sample data. To get the “snap shot”, we are going to store the data in a file. To do this, the audio data must be file piped (using “>” ) so that the file can be opened and analyzed.
rtl_fm -f 91.1e6 -s 200000 -r 48000 -> audio.raw
Now that we have raw audio data in the file “audio.raw” it is time to read the contents of this file and make sense out of it. This file is going to be a binary file and also from the option given to “aplay” (taken from the “rtl_fm” help page) in the command line, we can infer that the format used for the data in the audio output is signed 16 bit little endian (S16_LE). Following python code block would read 256 short ints (or signed 16 bit numbers) from the file.
Now that we have the audio data, it is time to plot the spectrum of the audio data. I’ve used the modified version of this code for plotting the data (note that you need to have matplot and numpy installed for this to work).
|FFT of the audio signal|
This is the plot of amplitude versus frequency. What this basically tells us is that, at what points in the frequency the signal is strong and at what frequencies it is weak.
For comparison, now we will have a look at the audio data as output by rtl_fm for a channel that does not exist. If you listened to audio the only thing that you would hear is radio static.
|Spectrum of the noise file|
As you can see, the spectrum is all over the place. This is because, noise being random, does not have frequency content belonging to one particular frequency group as opposed to a audio signal where the signal has to be in audible range.
This we are going to use to our advantage. First of, we need to filter all the low frequency component so that we do not let these components interfere with our further analysis. For this, we use well, a filter. We will be using Butterworth filter which is a type of IIR filter, meaning that this is a feedback based filter. You can use signal.butter() to design a butter worth filter from the scipy package. The output of this function will be a bunch of co-efficients that you can use in a difference function or can be passed to “signal.lfilter()” (linear filter) for filtering.
|Spectrum of the filtered audio|
|Spectrum of the filtered noise|
If you compare the spectrum of filtered audio and noise files, the noise file has more variations than the audio file spectrum. Now it is easy to see that by averaging out these values, we should be able to say noise from audio (well, most of the time).
But there was one more problem to consider. The Angstrom distro used in BBB has numpy package in the repo but does not have scipy. Luckily, numpy provides a fft() routine for us to use. But there are still 2 more concerns to address. First, we cannot now design the Butterworth filter using butter() and also, we cannot use lfilter() for filtering. First concern is easily addressed since calculating the co-efficients is an one time affair, once we have the coefficients, all we have to do is implement the difference equation in direct form-II.
In the code below you will find that both the filtering methods are used.
All that is left to do is run some more samples of noise, audio and noisy audio files determine the threshold for the average by multiple trails.