Dec 28 2007

Reading and processing binary data in blocks

 

We wanted to code a simple N-Point Averager using c programming language. But reading the data one by one is very time consuming because we need to seek back and forth in the data. Alternatively, reading whole data is a problem when the data size is very large and the memory is limited for such data. For example sometimes, we have gigabytes of data.

The solution is to read and process the data in blocks with reasonable size. Since we are filtering the signal, in addition to the elements of the current block, we also need N-1 of the last elements of the previous block. It is illustrated below. For the first block, there is no previous block and let us choose 0 for the signal data before t=0. Each block will require N-1 + BLOCK_SIZE input elements and will produce BLOCK_SIZE outputs.

blocks.png

A code may be as follows. Let us start with defining the coefficients we need. N is the value for N-point averager. Ex: 3 for y(n) = 0.5x(n)+0.3x(n-1) + 0.2x(n-2)

#define BLOCK_SIZE 100
#define N 3
#define INPUT_FILE "input.dat"
#define OUTPUT_FILE "output.dat"

The first N-1 elements of the input buffer are including the previous block's last N-1 inputs and the remaining are for the current blocks elements in our new type definition.

struct blockType {
    double input[N-1 + BLOCK_SIZE];
    double filterCoefficients[N];
    double output[BLOCK_SIZE];
};

Now, the main loop will be as follows:

int main() {

    FILE * fp_in;  /* input file pointer */
    FILE * fp_out; /* output file pointer */
    int i;

    struct blockType block;
    /* set the filter coefficients for our needs */
    block.filterCoefficients[0] = 0.5;
    block.filterCoefficients[1] = 0.3;
    block.filterCoefficients[2] = 0.2;
    for(i=0; i < N-1; i++) { /* set the previous elements to 0 because there's no signal before time = 0 */
        block.input[i] = 0;
    }

    /*createNoisyRectangularPulse(10000, 200, 10); /* create a rectangular pulse */

    fp_in = fopen(INPUT_FILE, "rb");
    if(!fp_in) {
        printf("Could not open file.n");
    }
    else {
        fp_out = fopen(OUTPUT_FILE, "wb");
        while(!feof(fp_in)) { /* until the end of input file */
            fread(&(block.input[N-1]), sizeof(double), BLOCK_SIZE, fp_in); /* read the input file block by block  */

            nPointAveragingFilter( &block ); /* apply filter to that block */
            /* copy last N-1 elements to the front of the input buffer so we can use them in the next loop */
            memcpy(&(block.input[0]), &(block.input[BLOCK_SIZE]), sizeof(double)*(N-1));
            /* USAGE: memcpy(destination, source, number of bytes) */

            fwrite(&(block.output[0]), sizeof(double), BLOCK_SIZE, fp_out); /* write the output to the output file */

        }
        fclose(fp_out);
    }
    fclose(fp_in);

    return 0;
}

The filtering function nPointAveragingFilter will take the address of the block and update the block's output array in the routine. Remind that, the reason to do was to do processing faster, so we need to use pass by reference instead of pass by value. The block structure will involve the input array where the ith output data will be a weighted sum of (i+N-1)th down to ith input data.

void nPointAveragingFilter(struct blockType *block) {
    int i, j;
    for(i=0; i < BLOCK_SIZE; i++) {
        /* for each output element, calculate the n-point average of corresponding input elements
        be careful that the first N-1 elements are from the previous block.
        */

        block->output[i] = 0;
        for(j=0; j < N; j++) {
            block->output[i] += block->filterCoefficients[j] * block->input[i+N-1 - j];
        }
    }
}

Now let us use some MATLAB script to read and plot the input and output data we dealt with:

numOfNumbersToPlot = 1000;
fid = fopen('input.dat');
fid2 = fopen('output.dat');
input = fread(fid, numOfNumbersToPlot, 'double');
output = fread(fid2, numOfNumbersToPlot, 'double');
fclose(fid);
fclose(fid2);
subplot(2,1,1), plot(input), title('Input data')
subplot(2,1,2), plot(output), title('Output data')

An example is the noisy rectangular plot you can create with the source code attached to this post. It is seen that the signal is smoothed.
noisyrectangle.jpg

For fun, let us add noise to a wav file and listen to the result after filtering. Here is the code to create dat files from wav files for c code to read them.

[x,Fs,bits] = wavread('tuuba');
x = x(:,1); % if sound has 2 channels, i.e. stereo
x = x + max(x)/20 .* rand(length(x),1);

fid = fopen('input.dat','wb'); % save it as input.dat
fwrite(fid,x,'double');
fclose(fid);

Now let us say that we processed the data and want to plot/listen to the result. Sampling frequency is set by the user. You can check it from the initial wav file while converting it to dat.

inputFile = 'output.dat';
Fs = 44100; % Hz
fid = fopen(inputFile);
input = fread(fid, 500000, 'double');
fclose(fid);
plot(input), title('Signal')
wavplay(input, Fs)

All source codes are attached. Enjoy!