Technical Tidbit of the Day: DSP

Showing posts with label DSP. Show all posts

Sunday, September 9, 2012

libav in C++

libav, which is intended for codecs, also serves as a nice signal processing library for scientific applications as it has assembly FFT routines optimized for various processors including x86 SSE and ARM NEON. Written in C for C, it can be a little tricky to use in C++.

The first trick is to wrap the #includes within extern "C":

extern "C" {
#include <libavutil/avutil.h>
    #include <libavcodec/avfft.h>
}

The second trick is that to use the nice C++ array types std::vector or boost::multi_array, it is necessary to use a custom allocator class that calls the libav av_malloc() and av_free(). The reason is that these ensure the 32-byte alignment that libav assumes (without checking every time) is present when it utilizes SIMD instructions. Without a custom allocator, std::vector and boost::multi_array just use new[], which does not allocate aligned, and libav in the process of ANDing addresses ends up running past the end of the buffer and generating a segmentation fault.

The code below uses a custom allocator adapted from The C++ Standard Library -- a Tutorial and Reference. The advantage of using std::vector or boost::multi_array, of course, is automated memory management using the Resource Acquisition Is Initialization pattern/idiom, similar to C++ auto_ptr (which can't be used for arrays because it is hard-coded to delete instead of delete[]). Although std::vector is used below, the same allocator works equally well with boost::multi_array.

#define __STDC_CONSTANT_MACROS

#include <cstring>
#include <vector>

extern "C" {
#include <libavutil/avutil.h>
#include <libavcodec/avfft.h>
}

template <typename T> class allocator_av {
public:
    typedef T               value_type;
    typedef T*              pointer;
    typedef const T*        const_pointer;
    typedef T&              reference;
    typedef const T&        const_reference;
    typedef std::size_t     size_type;

    template <typename U>
    struct rebind {
        typedef allocator_av<U> other;
    };

    allocator_av() {}
    template <typename U> allocator_av(const allocator_av<U>&) {}
    size_type max_size () const { return 1 << 16; }

    pointer allocate (size_type num, const void* = 0) {
        return static_cast<pointer>(av_malloc(num*sizeof(T)));
    }

    void construct (pointer p, const T& value) {}
    void destroy (pointer p) {}
    void deallocate (pointer p, size_type num) { av_free(p); }
};

int main(int argc, char** argv) {
    std::vector<FFTComplex, allocator_av<FFTComplex> > z(256);
    FFTContext* c = av_fft_init(8, 0);
    av_fft_permute(c, z.data());
    av_fft_calc(c, z.data());
    av_free(c);
}

Tuesday, August 28, 2012

Samsung 11.8 tablet key for signal processing

The Samsung 11.8, information about which came out of the Apple lawsuit, is a good match for in-the-field scientific/data acquisition/NDT applications. Why? Because it will be the first tablet with an ARM Cortex-A15 processor, specifically the Samsung Exynos 5.

It's just an ARM processor, right? ARM is extremely low power consumption, but mediocre computational performance, right? Wrong -- ARM has grown up. ARM has maintained its extreme low power consumption while catching up to Intel performance. And the Cortex-A15 is a huge step forward in that regard. It features NEON, which is the ARM equivalent to Intel MMX/SSE/AVX vector processing, which speeds up by multiples the signal and image processing used in scientific computing. And the NEON in Cortex-A15 can do double-precision. Oddly, the current generation Samsung Galaxy tablet, 10.1, dropped NEON even though its even older predecessor had it. But the Samsung 11.8 Cortex-A15 NEON has 128-bit ALUs instead of the 64-bit ALUs that the earlier ARM NEON processors had.

The Samsung 10.1 has thus far escaped injunction from Apple, so there is hope the 11.8 will as well. The iPad won't get Cortex-A15 until the iPad 5. There is hope that the Samsung 11.8 will be unveiled tomorrow at the Unpacked event in Berlin.

Monday, January 4, 2010

Computing energy from FFT

The straightforward way to compute energy of a signal is just summing the squares in the time domain. But it can also be computed from the FFT, or more properly speaking the spectrum, which is the square root of the sum of the real part squared and the imaginary part squared. The tricky part is realizing two things:

1. Typically only the positive half of a spectrum is displayed, so if you want to compare apples to apples (i.e. amounts of energy in the same units), you have to add in the negative spectrum as well. Since the spectrum of a real-valued function is symmetric, this just means doubling the energy from the positive spectrum.

2. The DC component of the spectrum (0 Hz) appears just once in the spectrum, so it doesn't have to be doubled.

Thus we have:

Energy = Sum of squares in time domain = DC component squared + 2 * (sum of squares of positive spectrum)