Mind: How to Build a Neural Network (Part Two)

Thursday, 13 August 2015

In this second part on learning how to build a neural network, we will dive into the implementation of a flexible library in JavaScript. In case you missed it, here is Part One, which goes over what neural networks are and how they operate.

Building the Mind

Building a complete neural network library requires more than just understanding forward and back propagation. We also need to think about how a user of the network will want to configure it (e.g. set total number of learning iterations) and other API-level design considerations.

To simplify our explanation of neural networks via code, the code snippets below build a neural network, Mind, with a single hidden layer. The actual Mind library, however, provides the flexibility to build a network with multiple hidden layers.

Initialization

First, we need to set up our constructor function. Let’s give the option to use the sigmoid activation or the hyperbolic tangent activation function. Additionally, we’ll allow our users to set the learning rate, number of iterations, and number of units in the hidden layer, while providing sane defaults for each. Here’s our constructor:

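var sigmoidPrime = require('sigmoid-prime');
var sigmoid = require('sigmoid');
var htanPrime = require('htan-prime');
var htan = require('htan');

function Mind(opts) {
  if (!(this instanceof Mind)) return new Mind(opts);
  opts = opts || {};

  // pick the activation function and its derivative (htan is the default)
  if (opts.activator === 'sigmoid') {
    this.activate = sigmoid;
    this.activatePrime = sigmoidPrime;
  } else {
    this.activate = htan;
    this.activatePrime = htanPrime;
  }

  // hyperparameters
  this.learningRate = opts.learningRate || 0.7;
  this.iterations = opts.iterations || 10000;
  this.hiddenUnits = opts.hiddenUnits || 3;
}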

Note that here we use the sigmoid, sigmoid-prime, htan, and htan-prime npm modules.
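
If you are curious what those four functions compute, here is a minimal sketch in plain JavaScript. These are illustrative stand-ins written for this post, not the source of the npm modules, which may differ in detail.

```javascript
// Illustrative stand-ins for the sigmoid, sigmoid-prime, htan, and htan-prime modules.

// Logistic sigmoid: squashes any real number into the range (0, 1).
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Derivative of the sigmoid: s(x) * (1 - s(x)).
function sigmoidPrime(x) {
  var s = sigmoid(x);
  return s * (1 - s);
}

// Hyperbolic tangent, written out so it does not rely on Math.tanh.
function htan(x) {
  var a = Math.exp(x);
  var b = Math.exp(-x);
  return (a - b) / (a + b);
}

// Derivative of the hyperbolic tangent: 1 - tanh(x)^2.
function htanPrime(x) {
  var t = htan(x);
  return 1 - t * t;
}
```

The "prime" versions are the derivatives, which is what back propagation needs later on.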

Forward Propagation

The forward propagation process is a series of sum products and transformations. First, we calculate the hidden sum for all four input examples at once by multiplying the input matrix by the input-to-hidden weights. To get the hidden result from the sum, we apply the activation function (sigmoid, in this example) to each element.

Then, we do this again with the hidden result as the new input to get to the final output result. The entire forward propagation code looks like:

Mind.prototype.forward = function(examples) {
  var activate = this.activate;
  var weights = this.weights;
  var ret = {};

  ret.hiddenSum = multiply(weights.inputHidden, examples.input);
  ret.hiddenResult = ret.hiddenSum.transform(activate);
  ret.outputSum = multiply(weights.hiddenOutput, ret.hiddenResult);
  ret.outputResult = ret.outputSum.transform(activate);

  return ret;
};

Note that this.activate is set when a new Mind is initialized with an opts object, while this.weights is set in the learn function shown later. multiply and transform come from an npm module for performing basic matrix operations.
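
To make the matrix shapes concrete, here is a rough sketch of the same forward pass using plain nested arrays. The helpers matMul and mapMatrix, the function forwardSketch, and the weight values are all made up for illustration, and the layout (one example per row, one unit per column) is an assumption that may not match the matrix module Mind uses.

```javascript
// Hypothetical helpers for this sketch; not the matrix module used by Mind.
function matMul(a, b) {
  return a.map(function(row) {
    return b[0].map(function(_, j) {
      return row.reduce(function(sum, v, k) { return sum + v * b[k][j]; }, 0);
    });
  });
}

function mapMatrix(m, fn) {
  return m.map(function(row) { return row.map(function(v) { return fn(v); }); });
}

function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Assumed layout: one example per row, one unit per column.
function forwardSketch(input, weights) {
  var hiddenSum = matMul(input, weights.inputHidden);
  var hiddenResult = mapMatrix(hiddenSum, sigmoid);
  var outputSum = matMul(hiddenResult, weights.hiddenOutput);
  var outputResult = mapMatrix(outputSum, sigmoid);

  return {
    hiddenSum: hiddenSum,
    hiddenResult: hiddenResult,
    outputSum: outputSum,
    outputResult: outputResult
  };
}

// Four examples with two inputs each, three hidden units, one output unit.
var sketchInput = [[0, 0], [0, 1], [1, 0], [1, 1]];
var sketchWeights = {
  inputHidden: [[0.8, 0.4, 0.3], [0.2, 0.9, 0.5]], // arbitrary values
  hiddenOutput: [[0.3], [0.5], [0.9]]              // arbitrary values
};
var sketchResults = forwardSketch(sketchInput, sketchWeights);
```

With this layout, hiddenSum has one row per example and one column per hidden unit, and outputResult has one row per example and one column per output unit.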

Back Propagation

Back propagation is a bit more complicated. Let’s look at the last layer first. We calculate the output error, which is the target output minus the calculated output (same equation as before). In code:

var errorOutputLayer = subtract(examples.output, results.outputResult);
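
As a small worked example, here is roughly what that subtraction does on plain nested arrays. subtractSketch is a hypothetical helper for illustration, and the predicted values are made up.

```javascript
// Hypothetical element-wise difference of two same-shaped matrices.
function subtractSketch(a, b) {
  return a.map(function(row, i) {
    return row.map(function(v, j) { return v - b[i][j]; });
  });
}

var targetOutput = [[0], [1], [1], [0]];            // what we want
var predictedOutput = [[0.5], [0.75], [0.25], [0.5]]; // made-up network output
var outputError = subtractSketch(targetOutput, predictedOutput);
// outputError is [[-0.5], [0.25], [0.75], [-0.5]]
```

A positive error means the network's output was too low for that example, and a negative error means it was too high.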

Then, we determine the change in the output layer sum, or delta output sum, by combining the derivative of the activation function at the output sum with the output error:

var deltaOutputLayer = dot(results.outputSum.transform(activatePrime), errorOutputLayer);
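
Since both arguments have the same shape, this sketch assumes that dot here is an element-wise product. The helpers hadamard and mapMatrix are hypothetical, and the numbers are picked so the arithmetic is easy to follow.

```javascript
function sigmoidPrime(x) {
  var s = 1 / (1 + Math.exp(-x));
  return s * (1 - s);
}

// Hypothetical helpers: apply fn to every element, and multiply element by element.
function mapMatrix(m, fn) {
  return m.map(function(row) { return row.map(function(v) { return fn(v); }); });
}

function hadamard(a, b) {
  return a.map(function(row, i) {
    return row.map(function(v, j) { return v * b[i][j]; });
  });
}

var exampleOutputSum = [[0], [0]];        // sigmoidPrime(0) is 0.25
var exampleError = [[0.5], [-0.5]];
var exampleDelta = hadamard(mapMatrix(exampleOutputSum, sigmoidPrime), exampleError);
// exampleDelta is [[0.125], [-0.125]]
```

The derivative scales each error by how sensitive the activation is at that point, so a saturated unit produces only a small delta.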

Then, we figure out the hidden output changes, which are the adjustments to the weights between the hidden and output layers. Here is the code:

var hiddenOutputChanges = scalar(multiply(deltaOutputLayer, results.hiddenResult.transpose()), learningRate);

Note that we scale the change by learningRate, which is typically a value between 0 and 1. The learning rate controls how much of each computed adjustment is applied to the old weight. If the input varies widely (that is, there is little relationship among the training data) and the rate is set high, the network may not learn well, or at all, because each update can overshoot and the weights may oscillate instead of settling. A rate that is too low, on the other hand, makes training slow.
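
To get a feel for the step size, here is a toy one-dimensional example that has nothing to do with Mind's internals: plain gradient descent on f(w) = 2w², using a hypothetical descend function. It only illustrates how the size of each step affects whether the value settles.

```javascript
// Toy problem: minimize f(w) = 2 * w * w, whose gradient is 4 * w.
function descend(learningRate, steps) {
  var w = 1; // starting point
  for (var i = 0; i < steps; i++) {
    var gradient = 4 * w;
    w = w - learningRate * gradient;
  }
  return w;
}

var settled = descend(0.1, 20);  // each step multiplies w by 0.6, so w shrinks toward 0
var diverged = descend(0.7, 20); // each step multiplies w by -1.8, so w flips sign and grows
```

With the smaller rate the value heads steadily toward the minimum at 0, while the larger rate overshoots on every step and moves further away.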

Since we’re dealing with matrices, we handle the division by multiplying the delta output sum by the transpose of the hidden result matrix.
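
Here is a small numeric sketch of that step on plain nested arrays. It assumes one example per row and one unit per column, so the transpose lands on the hidden result and comes first in the product; the matrix module Mind uses may arrange things differently. matMul, transpose, and the values are hypothetical.

```javascript
// Hypothetical helpers for this sketch.
function matMul(a, b) {
  return a.map(function(row) {
    return b[0].map(function(_, j) {
      return row.reduce(function(sum, v, k) { return sum + v * b[k][j]; }, 0);
    });
  });
}

function transpose(m) {
  return m[0].map(function(_, j) {
    return m.map(function(row) { return row[j]; });
  });
}

var rate = 0.5;
var hiddenResultExample = [[0.5, 0.25, 0.75]]; // 1 example, 3 hidden units
var deltaOutputExample = [[0.125]];            // 1 example, 1 output unit

// One change per hidden-to-output weight: 3 rows (hidden units), 1 column (output unit).
var hiddenOutputChangesExample = matMul(transpose(hiddenResultExample), deltaOutputExample)
  .map(function(row) {
    return row.map(function(v) { return v * rate; });
  });
// hiddenOutputChangesExample is [[0.03125], [0.015625], [0.046875]]
```

Each weight's change is proportional to how active its hidden unit was, so units that contributed more to the output get a larger share of the correction.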

Then, we do this process again for the input to hidden layer.
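
A sketch of that second pass, again on plain nested arrays with one example per row (an assumption; the real matrix module may use a different layout). The delta from the output layer is pushed back through the hidden-to-output weights, scaled by the activation derivative at the hidden sum, and then turned into weight changes. All helpers and values are hypothetical.

```javascript
// Hypothetical helpers for this sketch.
function matMul(a, b) {
  return a.map(function(row) {
    return b[0].map(function(_, j) {
      return row.reduce(function(sum, v, k) { return sum + v * b[k][j]; }, 0);
    });
  });
}

function transpose(m) {
  return m[0].map(function(_, j) {
    return m.map(function(row) { return row[j]; });
  });
}

function hadamard(a, b) {
  return a.map(function(row, i) {
    return row.map(function(v, j) { return v * b[i][j]; });
  });
}

function sigmoidPrime(x) {
  var s = 1 / (1 + Math.exp(-x));
  return s * (1 - s);
}

var stepRate = 0.5;
var inputExample = [[1, 0]];                    // 1 example, 2 input units
var hiddenSumExample = [[0, 0, 0]];             // 1 example, 3 hidden units
var hiddenOutputWeights = [[0.5], [-0.5], [1]]; // 3 hidden units, 1 output unit
var deltaOutput = [[0.125]];                    // 1 example, 1 output unit

// Push the output delta back through the weights, then scale by the derivative.
var hiddenPrime = hiddenSumExample.map(function(row) {
  return row.map(function(v) { return sigmoidPrime(v); });
});
var deltaHidden = hadamard(matMul(deltaOutput, transpose(hiddenOutputWeights)), hiddenPrime);
// deltaHidden is [[0.015625, -0.015625, 0.03125]]

// One change per input-to-hidden weight: 2 rows (input units), 3 columns (hidden units).
var inputHiddenChangesExample = matMul(transpose(inputExample), deltaHidden)
  .map(function(row) {
    return row.map(function(v) { return v * stepRate; });
  });
```

Notice that the second row of inputHiddenChangesExample is all zeros: the second input was 0, so its weights did not contribute to the output and receive no adjustment.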

The code for the back propagation function is below. Note that we’re passing what is returned by the forward function as the second argument:

Mind.prototype.back = function(examples, results) {
  var activatePrime = this.activatePrime;
  var learningRate = this.learningRate;
  var weights = this.weights;

  // compute weight adjustments
  var errorOutputLayer = subtract(examples.output, results.outputResult);
  var deltaOutputLayer = dot(results.outputSum.transform(activatePrime), errorOutputLayer);
  var hiddenOutputChanges = scalar(multiply(deltaOutputLayer, results.hiddenResult.transpose()), learningRate);
  var deltaHiddenLayer = dot(multiply(weights.hiddenOutput.transpose(), deltaOutputLayer), results.hiddenSum.transform(activatePrime));
  var inputHiddenChanges = scalar(multiply(deltaHiddenLayer, examples.input.transpose()), learningRate);

  // adjust weights
  weights.inputHidden = add(weights.inputHidden, inputHiddenChanges);
  weights.hiddenOutput = add(weights.hiddenOutput, hiddenOutputChanges);

  return errorOutputLayer;
};

Note that subtract, dot, scalar, multiply, and add come from the same npm module we used before for performing matrix operations.

Putting both together

Now that we have both the forward and back propagation, we can define the function learn that will put them together. The learn function will accept training data (examples) as an array of matrices. Then, we assign random samples to the initial weights (via sample). Lastly, we use a for loop that runs this.iterations times, doing a forward propagation followed by a back propagation on each pass.

Mind.prototype.learn = function(examples) {
  examples = normalize(examples);

  this.weights = {
    inputHidden: Matrix({
      columns: this.hiddenUnits,
      rows: examples.input[0].length,
      values: sample
    }),
    hiddenOutput: Matrix({
      columns: examples.output[0].length,
      rows: this.hiddenUnits,
      values: sample
    })
  };

  for (var i = 0; i < this.iterations; i++) {
    var results = this.forward(examples);
    var errors = this.back(examples, results);
  }

  return this;
};
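
If you want to experiment without any dependencies, here is a self-contained sketch of the same training loop on plain nested arrays. It is not the Mind source: it assumes one example per row, skips the normalize step, uses sigmoid only, and fills the initial weights with Math.random instead of sample. trainSketch and every helper in it are hypothetical names.

```javascript
// Hypothetical plain-array helpers; Mind itself uses a separate matrix module.
function matMul(a, b) {
  return a.map(function(row) {
    return b[0].map(function(_, j) {
      return row.reduce(function(sum, v, k) { return sum + v * b[k][j]; }, 0);
    });
  });
}
function transpose(m) {
  return m[0].map(function(_, j) {
    return m.map(function(row) { return row[j]; });
  });
}
function mapMatrix(m, fn) {
  return m.map(function(row) { return row.map(function(v) { return fn(v); }); });
}
function zip(a, b, fn) {
  return a.map(function(row, i) {
    return row.map(function(v, j) { return fn(v, b[i][j]); });
  });
}
function sigmoid(x) { return 1 / (1 + Math.exp(-x)); }
function sigmoidPrime(x) { var s = sigmoid(x); return s * (1 - s); }

// Assumption: initial weights are uniform random values in [-1, 1).
function randomMatrix(rows, columns) {
  var m = [];
  for (var i = 0; i < rows; i++) {
    m.push([]);
    for (var j = 0; j < columns; j++) m[i].push(Math.random() * 2 - 1);
  }
  return m;
}

function trainSketch(input, output, opts) {
  var w1 = randomMatrix(input[0].length, opts.hiddenUnits);  // input to hidden
  var w2 = randomMatrix(opts.hiddenUnits, output[0].length); // hidden to output
  var scale = function(v) { return v * opts.learningRate; };
  var sum = function(a, b) { return a + b; };
  var product = function(a, b) { return a * b; };
  var difference = function(a, b) { return a - b; };

  for (var i = 0; i < opts.iterations; i++) {
    // forward propagation
    var hiddenSum = matMul(input, w1);
    var hiddenResult = mapMatrix(hiddenSum, sigmoid);
    var outputSum = matMul(hiddenResult, w2);
    var outputResult = mapMatrix(outputSum, sigmoid);

    // back propagation (both deltas are computed before any weight changes)
    var error = zip(output, outputResult, difference);
    var deltaOutput = zip(mapMatrix(outputSum, sigmoidPrime), error, product);
    var deltaHidden = zip(matMul(deltaOutput, transpose(w2)), mapMatrix(hiddenSum, sigmoidPrime), product);

    // adjust weights
    w2 = zip(w2, mapMatrix(matMul(transpose(hiddenResult), deltaOutput), scale), sum);
    w1 = zip(w1, mapMatrix(matMul(transpose(input), deltaHidden), scale), sum);
  }

  return { inputHidden: w1, hiddenOutput: w2 };
}

var xorInput = [[0, 0], [0, 1], [1, 0], [1, 1]];
var xorOutput = [[0], [1], [1], [0]];
var trained = trainSketch(xorInput, xorOutput, {
  hiddenUnits: 3,
  learningRate: 0.7,
  iterations: 1000
});
```

Because the starting weights are random, the trained weights differ from run to run, so it is worth trying a few runs and different hyperparameters to see how the results change.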

More information about the Mind API here.

Now you have a basic understanding of how neural networks operate, how to train them, and also how to build your own!

If you have any questions or comments, don’t hesitate to find me on Twitter. Shout out to Andy for his help on reviewing this.