
2018: Combat Gesture Recognition (Robotics Club)

  • Writer: Guining Pertin
  • Dec 31, 2018
  • 6 min read

Introduction

This project was first done for the “Combat Assist for Soldier Support” event under the Robotics Club at Kriti 2017, the Annual Technical Competition at IITG.

This was one of my first projects back in my freshman year, and our team came 2nd in the competition that year. Looking back at my work even a few months after the event, I found that I could have made it far more efficient than the “Jugaad” method we applied. So I decided then that I would improve the system someday. That is what finally brought me here, after I gathered enough motivation to work on it.

The Idea

So, in this new and improved version I plan to include the machine learning and mathematical techniques I’ve learned over the last year. But before that, let me first explain the problems with the previous version.

My life is full of problems that I never found the solutions for!

The first version used 5 flex sensors, one for each finger, and an MPU-6050 IMU to determine hand orientation and movement. They were soldered “haphazardly”, held together with several pieces of tape, and the circuit was very cluttered.


Flex sensor readings, which are based on a voltage divider circuit, are really noisy and often vary really fast. High-frequency stuff.


My “classification algorithm”, if I can actually call it one, was nothing but a set of if-else statements, which I’d hope to call a naive decision tree if allowed.


Now you see the big problem: high-frequency noise with nothing but a decision tree for classification would inevitably lead to noisy and wrong classifications. This was partly solved in my first model by using a switch to determine at which point to sample the gesture.

Some solutions I thought of

The circuit can be improved anyway, since I’ve finally started actual electronics labs and courses from the department and am learning practical techniques.


The high-frequency noise can be effectively reduced by what is called a moving average, a form of low-pass filter.


For the classification algorithm, I am planning a supervised approach, with labels for each symbol and data collected at runtime, probably using some calibration function at the start.


This should solve most of the problems.


So I finally completed the project by doing the following –

  1. Fit the 5D flex sensor data with a Gaussian distribution.

  2. Perform a moving average (simple + exponential) to get the current sample.

  3. Use a Naive Bayes classifier with the MAP decision rule to classify the sample.


And this actually works pretty well!


Current Progress

Note: The programs below won’t run standalone and are provided only for tutorial or explanation purposes.


I am using 5 flex sensors, one for each finger.


  1. So I found that a simple moving average is really effective at removing high-frequency noise from time-series data. I basically take the average of the 100 newest samples from each sensor and use that as the current estimate of the actual sensor value.


  2. But the sensor data still varies quite fast, and while this isn’t really a problem for my project, reducing it doesn’t hurt either. What I used is called an exponential moving average, which basically weights the fresh average against the previous estimate (new_val = rate·new_avg + (1 − rate)·prev_val). Fast-changing data due to noisy readings or a faulty sensor might lead to wrong classifications, so I’d be better off performing this step.

//Current rate at 0.2, that is a real slow update
//Current num_samples kept at 100
//Signal conditioning function
int get_conditioned(int pin_num, int prev_val)
{
  //long, since 100 ten-bit readings can overflow a 16-bit int
  long samples = 0;
  //Perform simple moving average
  for (int i=0; i<num_samples; i++)
  {
    samples += analogRead(pin_num);
  }
  //Return the exponential moving average
  return rate*samples/num_samples + (1-rate)*prev_val;
}

I’d say this was the most effective solution I applied in the project.
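For context, the function above leans on a few globals. A minimal declaration of them could look like this (the analog pins A0–A4 are just assumed placeholders, not necessarily the ones used on the actual glove):

//Assumed global configuration (placeholder values where noted)
const float rate = 0.2;          //EMA update rate mentioned above
const int   num_samples = 100;   //samples per simple moving average
const int   num_sensors = 5;     //one flex sensor per finger
const int   sensor_pins[num_sensors] = {A0, A1, A2, A3, A4};  //hypothetical analog pins
int sensor_data[num_sensors] = {0};  //current conditioned estimate per finger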


  3. I’ve decided that I’ll display the gesture icons on an 8×8 LED matrix, making it way cooler. It also makes up for my mistake at the actual event, where I displayed things using a Processing program. Using it is easy AF:

#include <LedControl.h>
//Create matrix
LedControl my_matrix = LedControl(DIN, CLK, CS, 1);
//Look closely, you'll find a tick mark made up of 1s
int done[8]     = {0b00000001,
                   0b00000010,
                   0b00000100,
                   0b10001000,
                   0b01010000,
                   0b00100000,
                   0b00000000,
                   0b00000000
                  };
//Helper function to draw the symbol
void draw_char(int to_draw[8])
{
  for(int i=0; i<8; i++)
  {
    my_matrix.setRow(0, i, to_draw[i]);
  }
}
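As a quick usage sketch (my assumption of how the initialization could look, not the exact code from the project), the MAX7219 driver just needs to be woken up and cleared before drawing; shutdown(), setIntensity() and clearDisplay() are standard LedControl calls:

void setup()
{
  //The MAX7219 starts in power-saving mode, so wake it up first
  my_matrix.shutdown(0, false);
  //Medium brightness (0-15) and a clean display
  my_matrix.setIntensity(0, 8);
  my_matrix.clearDisplay(0);
  //Show the tick mark defined above
  draw_char(done);
}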

  4. Flex sensors – ticked; 8×8 matrix – ticked; symbols – ticked.

    Now I need to develop the classification algorithm!


  5. I finally ended up doing Gaussian fits over the 5D sensor data, collected via a calibration function that is initiated by an interrupt service routine.


Basically, given the current symbol to calibrate for, I collect 100 raw samples for each sensor, raw as in without the signal conditioning mentioned before. Then I compute the mean for each sensor, which gives me the 5D “empirical” mean vector for the Gaussian.


Now, I can reasonably assume no correlation between the different sensors, so the covariance matrix reduces to a diagonal matrix whose entries are found by calculating the “empirical” variance for each sensor. Using the mean and the 100 or so samples, I get the “empirical” standard deviations.
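Concretely, with n = 100 samples per sensor, these are the standard empirical estimates, and they are what the fit_gaussian routine below computes:

    u_i = (1/n) ∑ x_i    and    σ_i² = (1/(n−1)) ∑ (x_i − u_i)²    ∀ i = sensors, sums over the n samples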


//Gaussian fitting for each symbol/class
void fit_gaussian(int y_j)
{
  //First, show calibration logo
  for(int i=0; i<9; i++)
  {
    draw_char(load);
    load[3] += 0b10000000>>i;
    load[4] += 0b10000000>>i;
    delay(200);
  }
  load[3] = 0b00000000;
  load[4] = 0b00000000;
  draw_char(symbols[y_j]);
  delay(1000);

  //Storing mean and std once is cheaper than recomputing them for every prediction
  unsigned int char_data[num_sensors][100] = {0};
  unsigned long char_mean[num_sensors] = {0};   //long, so 100 summed readings can't overflow
  unsigned long char_var[num_sensors] = {0};    //long, the squared deviations add up quickly
  //Get 100 samples
  for(int i=0; i<100; i++)
  {
    //For each sensor
    for(int sen_id=0; sen_id<num_sensors; sen_id++)
    {
      char_data[sen_id][i] = map(analogRead(sensor_pins[sen_id]),0,300,0,300);
      //Accumulate the sum for the empirical mean
      char_mean[sen_id] += char_data[sen_id][i];
      delay(5);
    }
  }
  //We have the samples x_i and their sums for each sensor i
  //Calculate variances for each feature
  for(int sen_id=0; sen_id<num_sensors; sen_id++)
  {
     //Empirical variance over all samples
     for(int i=0; i<100; i++)
     {
        //Get sum((x_i - u_x_i)^2), in signed arithmetic so deviations can't wrap around
        long diff = (long)char_data[sen_id][i] - (long)(char_mean[sen_id]/100);
        char_var[sen_id] += diff*diff;
     }
  }
  //Store the data for the given character: std = sqrt(variance), variance by n-1
  for(int sen_id=0; sen_id<num_sensors; sen_id++)
  {
    calib_means[y_j][sen_id] = char_mean[sen_id]/100;
    calib_stds[y_j][sen_id] = sqrt(char_var[sen_id]/99.0);
  }
}

In the previous Mk. II.1 version, given any new sample, I calculated its L2 distance to the 5D mean vectors of each symbol and took the closest one as the class. Although I thought it would be really noisy and error-prone, it turned out to work quite accurately for the 8 different classes I have.
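For reference, here is a minimal sketch of how that Mk. II.1 nearest-mean rule could look (my reconstruction, not the original code), reusing the calib_means and sensor_data arrays from the rest of the sketch:

//Mk. II.1: pick the class whose mean vector is closest in L2 distance
int nearest_mean_predict()
{
  int argmin = 0;
  long min_dist = 2147483647;  //largest 32-bit value as the initial "infinity"
  //For each class/symbol
  for(int y_j=0; y_j<8; y_j++)
  {
    long dist = 0;
    //Squared L2 distance between the current sample and the class mean
    for(int x_i=0; x_i<num_sensors; x_i++)
    {
      long diff = (long)sensor_data[x_i] - (long)calib_means[y_j][x_i];
      dist += diff*diff;
    }
    //Keep the closest class
    if(dist < min_dist)
    {
      min_dist = dist;
      argmin = y_j;
    }
  }
  return argmin;
}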


In the newer Mk. II.2 version, since I’ve fit the classes with 5D Gaussian pdfs, I can use a Naive Bayes classifier with the MAP decision rule.

//PDF given mu and std
float gaussian_func(int x, int mu, int std)
{
  //No need to recalculate this over and over
  static const float inv_sqrt_2pi = 0.3989422804014327;
  //Cast to float so the standardization isn't done in integer division
  float a = (x - mu)/(float)std;
  return inv_sqrt_2pi/std * exp(-0.5f*a*a);
}
 
//Naive Bayes classification
int naive_predict()
{
  //Consider equiprobable classes, we don't need prior
  //We only need argmax over the conditionals
  int argmax = 0;
  float max_p = 0;
  //For each class
  for(int y_j=0; y_j<8; y_j++)
  {
    float prob_for_y_j = 1;
    //Get product over conditionals
    for (int x_i=0; x_i<num_sensors; x_i++)
    {
      //Get sample
      int x = map(sensor_data[x_i],0,300,0,300);
      prob_for_y_j *= gaussian_func(x, calib_means[y_j][x_i], calib_stds[y_j][x_i]);
    }
    //Get max and argmax
    if (prob_for_y_j > max_p)
    {
      //If curr is max, change max and argmax
      max_p = prob_for_y_j;
      argmax = y_j;
    }
  }
  return argmax;
}

What it does is as follows:

  1. Get all the class-conditional pdfs from the calibration fits

    p(x_i|y_j) = N(u_ij, σ_ij) ∀ i = sensors, j = classes

  2. Consider equiprobable classes

    p(y_j) = 1/8 ∀ j = classes

  3. Apply the Bayes rule

    p(y_j|x) = p(x|y_j)·p(y_j)/p(x)

  4. We can use these posterior probabilities with Maximum a posteriori (MAP) estimation.

Basically, MAP states that the optimal class y for the sample, given the current x (here, a 5D vector), is the one that maximizes the posterior probability

y = argmax over y_j of p(y_j)·∏ p(x_i|y_j), where the product is over the sensors x_i

Each p(x_i|y_j) is evaluated using the Gaussian distribution fitted for sensor i and class j. This MAP-based classification is what the project finally uses for real-time classification of the data.
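Putting the pieces together, the real-time flow is roughly: condition the sensors, classify with naive_predict, and draw the predicted symbol. A minimal sketch of that loop, assuming the symbols array maps class indices to their 8×8 icons as in the calibration code above:

void loop()
{
  //1. Smooth each flex sensor with the moving-average conditioning
  for(int sen_id=0; sen_id<num_sensors; sen_id++)
  {
    sensor_data[sen_id] = get_conditioned(sensor_pins[sen_id], sensor_data[sen_id]);
  }
  //2. MAP classification over the calibrated Gaussians
  int gesture = naive_predict();
  //3. Show the recognized gesture on the 8x8 matrix
  draw_char(symbols[gesture]);
}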

 
 
 
