Steven Miller2017-06-28T04:30:37.786Zhttp://stevenmiller888.github.comSteven MillerIntruder: How to crack Wi-Fi networks in Node.jshttp://stevenmiller888.github.com/intruder-cracking-wifi-networks-in-node2015-09-25T00:00:00.000ZSteven Miller<p>I’m going to explain how to use <a href="https://github.com/stevenmiller888/intruder">Intruder</a> to crack a Wi-Fi network in Node.js. Then, I’m going to explain how it works at a high level.</p>
<p>Start by finding the name of the network you want to crack. In this case, we’ll use an arbitrary network named “Home”. Then, you’ll want to <code>require</code> Intruder, initialize it, and call the <code>crack</code> function:</p>
<pre><code>var Intruder = require('intruder');
var intruder = Intruder();
intruder.crack('Home', function(err, key) {
if (err) throw err;
console.log(key);
});
</code></pre><p>That’s it. Sort of. It turns out it might take some time for Intruder to crack the network. So maybe you want to monitor its progress? Here’s how to do that:</p>
<pre><code>var Intruder = require('intruder');
Intruder()
.on('attempt', function(ivs) {
console.log(ivs);
})
.crack('Home', function(err, key) {
if (err) throw err;
console.log(key);
});
</code></pre><p>Now, I’ll explain how it works:</p>
<ol>
<li><p>When you call <code>intruder.crack</code>, we first look up all the wireless networks in range. Then, we filter that list to find the network whose name you passed in.</p>
</li>
<li><p>After finding the specific network, we start sniffing network packets on the network channel.</p>
</li>
<li><p>Sniffing packets generates a <code>capture</code> file that contains information about the captured packets. We find that file and pass it to <a href="https://github.com/aircrack-ng/aircrack-ng">aircrack</a>, which attempts to recover the key from it. According to aircrack’s documentation, you usually need at least 80,000 <a href="https://en.wikipedia.org/wiki/Initialization_vector">IVs</a>.</p>
</li>
</ol>
<p>If you have any questions or comments, don’t hesitate to find me on <a href="https://www.twitter.com/stevenmiller888">Twitter</a>.</p>
Mind: How to Build a Neural Network (Part Two)http://stevenmiller888.github.com/mind-how-to-build-a-neural-network-part-22015-08-14T00:00:00.000ZSteven Miller<p><em>In this second part on learning how to build a neural network, we will dive into the implementation of a flexible library in JavaScript. In case you missed it, here is <a href="/mind-how-to-build-a-neural-network">Part One</a>, which goes over what neural networks are and how they operate.</em></p>
<h2 id="building-the-mind">Building the Mind</h2>
<p>Building a complete neural network library requires more than just understanding forward and back propagation. We also need to think about how a user of the network will want to configure it (e.g. set total number of learning iterations) and other API-level design considerations.</p>
<p>To simplify our explanation of neural networks via code, the code snippets below build a neural network, <code>Mind</code>, with a single hidden layer. The actual <a href="https://github.com/stevenmiller888/mind">Mind</a> library, however, provides the flexibility to build a network with multiple hidden layers.</p>
<h3 id="initialization">Initialization</h3>
<p>First, we need to set up our constructor function. Let’s give the option to use the sigmoid activation or the hyperbolic tangent activation function. Additionally, we’ll allow our users to set the learning rate, number of iterations, and number of units in the hidden layer, while providing sane defaults for each. Here’s our constructor:</p>
<pre><code class="lang-javascript">function Mind(opts) {
  if (!(this instanceof Mind)) return new Mind(opts);
  opts = opts || {};
  // choose the activation function and its derivative
  if (opts.activator === 'sigmoid') {
    this.activate = sigmoid;
    this.activatePrime = sigmoidPrime;
  } else {
    this.activate = htan;
    this.activatePrime = htanPrime;
  }
  // hyperparameters
  this.learningRate = opts.learningRate || 0.7;
  this.iterations = opts.iterations || 10000;
  this.hiddenUnits = opts.hiddenUnits || 3;
}
</code></pre>
<blockquote>
<p>Note that here we use the <a href="https://www.npmjs.com/package/sigmoid"><code>sigmoid</code></a>, <a href="https://www.npmjs.com/package/sigmoid-prime"><code>sigmoid-prime</code></a>, <a href="https://www.npmjs.com/package/htan"><code>htan</code></a>, and <a href="https://www.npmjs.com/package/htan-prime"><code>htan-prime</code></a> npm modules.</p>
</blockquote>
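<p>For reference, here is a minimal sketch of what these four functions compute mathematically. It is not the source of the npm modules, which isn’t shown here; it only assumes they implement the standard sigmoid and hyperbolic tangent and their derivatives.</p>

```javascript
// Logistic sigmoid: squashes any real number into the range (0, 1).
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Derivative of the sigmoid, written in terms of the sigmoid itself.
function sigmoidPrime(x) {
  var s = sigmoid(x);
  return s * (1 - s);
}

// Hyperbolic tangent: squashes any real number into the range (-1, 1).
function htan(x) {
  return Math.tanh(x);
}

// Derivative of tanh: 1 - tanh(x)^2.
function htanPrime(x) {
  var t = Math.tanh(x);
  return 1 - t * t;
}

console.log(sigmoid(0), sigmoidPrime(0), htan(0), htanPrime(0));
```

<p>Both derivatives reuse the value of the function itself, which is one reason these two activations are convenient for back propagation.</p>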
<h3 id="forward-propagation">Forward Propagation</h3>
<p>The forward propagation process is a series of sum products and transformations. Let’s calculate the first hidden sum with all four input examples:</p>
<p><img src="http://i.imgur.com/ZhO0Nj2.png" alt=""></p>
<p>This can be represented as such:</p>
<p><img src="http://i.imgur.com/XcSZgTk.png" alt=""></p>
<p>To get the result from the sum, we apply the activation function, sigmoid, to each element:</p>
<p><img src="http://i.imgur.com/rhnNQZW.png" alt=""></p>
<p>Then, we do this again with the hidden result as the new input to get to the final output result. The entire forward propagation code looks like:</p>
<pre><code class="lang-javascript">Mind.prototype.forward = function(examples) {
var activate = this.activate;
var weights = this.weights;
var ret = {};
ret.hiddenSum = multiply(weights.inputHidden, examples.input);
ret.hiddenResult = ret.hiddenSum.transform(activate);
ret.outputSum = multiply(weights.hiddenOutput, ret.hiddenResult);
ret.outputResult = ret.outputSum.transform(activate);
return ret;
};
</code></pre>
<blockquote>
<p>Note that <code>this.activate</code> and <code>this.weights</code> are set at the initialization of a new <code>Mind</code> via <a href="https://github.com/stevenmiller888/mind/blob/master/lib/index.js#L40">passing an <code>opts</code> object</a>. <code>multiply</code> and <code>transform</code> come from an npm <a href="https://www.npmjs.com/package/node-matrix">module</a> for performing basic matrix operations.</p>
</blockquote>
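<p>To make the shapes concrete, here is a self-contained sketch of the same forward pass using plain arrays instead of the matrix module. The <code>multiply</code> and <code>transform</code> helpers below are hypothetical stand-ins written for this example (they take their operands in the usual mathematical order, which may differ from the module’s), and the weights are the illustrative ones from Part One.</p>

```javascript
// Matrix product of a (n x m) and b (m x p), both arrays of row arrays.
function multiply(a, b) {
  return a.map(function (row) {
    return b[0].map(function (_, j) {
      return row.reduce(function (sum, value, k) {
        return sum + value * b[k][j];
      }, 0);
    });
  });
}

// Apply fn to every element of a matrix.
function transform(m, fn) {
  return m.map(function (row) {
    return row.map(function (value) { return fn(value); });
  });
}

function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Illustrative weights taken from the worked example in Part One.
var weights = {
  inputHidden: [[0.8, 0.4, 0.3], [0.2, 0.9, 0.5]], // 2 inputs x 3 hidden units
  hiddenOutput: [[0.3], [0.5], [0.9]]              // 3 hidden units x 1 output
};

function forward(input) {
  var ret = {};
  ret.hiddenSum = multiply(input, weights.inputHidden);
  ret.hiddenResult = transform(ret.hiddenSum, sigmoid);
  ret.outputSum = multiply(ret.hiddenResult, weights.hiddenOutput);
  ret.outputResult = transform(ret.outputSum, sigmoid);
  return ret;
}

// All four XOR inputs, one example per row.
var results = forward([[0, 0], [0, 1], [1, 0], [1, 1]]);
console.log(results.outputResult);
```

<p>Passing all four examples as one matrix means a single call produces one output row per example.</p>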
<h3 id="back-propagation">Back Propagation</h3>
<p>Back propagation is a bit more complicated. Let’s look at the last layer first. We calculate the <code>output error</code> (same equation as before):</p>
<p><img src="http://i.imgur.com/IAddjWL.png" alt=""></p>
<p>And the equivalent in code:</p>
<pre><code class="lang-javascript">var errorOutputLayer = subtract(examples.output, results.outputResult);
</code></pre>
<p>Then, we determine the change in the output layer sum, or <code>delta output sum</code>:</p>
<p><img src="http://i.imgur.com/4qnVb6S.png" alt=""></p>
<p>And the code:</p>
<pre><code class="lang-javascript">var deltaOutputLayer = dot(results.outputSum.transform(activatePrime), errorOutputLayer);
</code></pre>
<p>Then, we figure out the hidden output changes. We use this formula:</p>
<p><img src="http://i.imgur.com/TR7FS2S.png" alt=""></p>
<p>Here is the code:</p>
<pre><code class="lang-javascript">var hiddenOutputChanges = scalar(multiply(deltaOutputLayer, results.hiddenResult.transpose()), learningRate);
</code></pre>
<p>Note that we scale the change by a factor, <code>learningRate</code>, which is typically between 0 and 1. The learning rate applies a greater or lesser portion of the respective adjustment to the old weight. If the input varies widely (there is little relationship among the training data) and the rate is set high, each update can overshoot and the network may not learn well, or at all. Setting the rate too high may also increase the risk of <a href="https://en.wikipedia.org/wiki/Overfitting">‘overfitting’</a>, or training the network to generate a relationship from noise instead of the actual underlying function.</p>
<p>Since we’re dealing with matrices, we handle the division by multiplying the <code>delta output sum</code> with the transpose of the hidden results matrix.</p>
<p>Then, we do this process <a href="https://github.com/stevenmiller888/mind/blob/master/lib/index.js#L200">again</a> for the input to hidden layer.</p>
<p>The code for the back propagation function is below. Note that we’re passing what is returned by the <code>forward</code> function as the second argument:</p>
<pre><code class="lang-javascript">Mind.prototype.back = function(examples, results) {
var activatePrime = this.activatePrime;
var learningRate = this.learningRate;
var weights = this.weights;
// compute weight adjustments
var errorOutputLayer = subtract(examples.output, results.outputResult);
var deltaOutputLayer = dot(results.outputSum.transform(activatePrime), errorOutputLayer);
var hiddenOutputChanges = scalar(multiply(deltaOutputLayer, results.hiddenResult.transpose()), learningRate);
var deltaHiddenLayer = dot(multiply(weights.hiddenOutput.transpose(), deltaOutputLayer), results.hiddenSum.transform(activatePrime));
var inputHiddenChanges = scalar(multiply(deltaHiddenLayer, examples.input.transpose()), learningRate);
// adjust weights
weights.inputHidden = add(weights.inputHidden, inputHiddenChanges);
weights.hiddenOutput = add(weights.hiddenOutput, hiddenOutputChanges);
return errorOutputLayer;
};
</code></pre>
<blockquote>
<p>Note that <code>subtract</code>, <code>dot</code>, <code>scalar</code>, <code>multiply</code>, and <code>add</code> come from the same npm <a href="https://www.npmjs.com/package/node-matrix">module</a> we used before for performing matrix operations.</p>
</blockquote>
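<p>As a rough check of the update rule, here is a sketch of one forward and back step for a single training example, <code>(1, 1) =&gt; 0</code>, written with plain arrays rather than the matrix module. The function signatures and the weight layout are assumptions made for this example; the update itself follows the same steps as <code>back</code> above (error, delta output, hidden-to-output changes, delta hidden, input-to-hidden changes).</p>

```javascript
function sigmoid(x) { return 1 / (1 + Math.exp(-x)); }
function sigmoidPrime(x) { var s = sigmoid(x); return s * (1 - s); }

// Forward pass for one example. w.inputHidden[i][j] connects input i to
// hidden unit j; w.hiddenOutput[j] connects hidden unit j to the output.
function forward(input, w) {
  var hiddenSum = w.inputHidden[0].map(function (_, j) {
    return input.reduce(function (sum, x, i) {
      return sum + x * w.inputHidden[i][j];
    }, 0);
  });
  var hiddenResult = hiddenSum.map(function (s) { return sigmoid(s); });
  var outputSum = hiddenResult.reduce(function (sum, h, j) {
    return sum + h * w.hiddenOutput[j];
  }, 0);
  return {
    hiddenSum: hiddenSum,
    hiddenResult: hiddenResult,
    outputSum: outputSum,
    outputResult: sigmoid(outputSum)
  };
}

// One back propagation step. Returns new weights without mutating the old.
function back(input, target, results, w, learningRate) {
  var errorOutput = target - results.outputResult;
  var deltaOutput = sigmoidPrime(results.outputSum) * errorOutput;
  // hidden-to-output changes: delta output scaled by each hidden result
  var hiddenOutput = w.hiddenOutput.map(function (weight, j) {
    return weight + learningRate * deltaOutput * results.hiddenResult[j];
  });
  // delta hidden is computed from the old hidden-to-output weights
  var deltaHidden = w.hiddenOutput.map(function (weight, j) {
    return weight * deltaOutput * sigmoidPrime(results.hiddenSum[j]);
  });
  var inputHidden = w.inputHidden.map(function (row, i) {
    return row.map(function (weight, j) {
      return weight + learningRate * deltaHidden[j] * input[i];
    });
  });
  return { inputHidden: inputHidden, hiddenOutput: hiddenOutput };
}

var weights = {
  inputHidden: [[0.8, 0.4, 0.3], [0.2, 0.9, 0.5]],
  hiddenOutput: [0.3, 0.5, 0.9]
};
var before = forward([1, 1], weights);
var updated = back([1, 1], 0, before, weights, 0.7);
var after = forward([1, 1], updated);
console.log(before.outputResult, after.outputResult);
```

<p>With these starting weights, the second number printed should be smaller than the first, meaning the output has moved toward the target of 0.</p>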
<h3 id="putting-both-together">Putting both together</h3>
<p>Now that we have both the forward and back propagation, we can define the function <code>learn</code> that will put them together. The <code>learn</code> function will accept training data (<code>examples</code>) as an array of matrices. Then, we assign random samples to the initial weights (via <a href="https://github.com/stevenmiller888/sample"><code>sample</code></a>). Lastly, we use a <code>for</code> loop to run forward and back propagation <code>this.iterations</code> times.</p>
<pre><code class="lang-javascript">Mind.prototype.learn = function(examples) {
examples = normalize(examples);
this.weights = {
inputHidden: Matrix({
columns: this.hiddenUnits,
rows: examples.input[0].length,
values: sample
}),
hiddenOutput: Matrix({
columns: examples.output[0].length,
rows: this.hiddenUnits,
values: sample
})
};
for (var i = 0; i < this.iterations; i++) {
var results = this.forward(examples);
var errors = this.back(examples, results);
}
return this;
};
</code></pre>
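<p>The weight initialization step can be sketched without the matrix module as follows. <code>randomMatrix</code> and <code>uniformSample</code> are hypothetical helpers written for this example; in particular, the uniform sampler is only a stand-in, and the real <code>sample</code> module may draw from a different distribution.</p>

```javascript
// Build a rows x columns matrix (array of row arrays), filling each cell
// by calling sampler(). Hypothetical helper, not part of Mind.
function randomMatrix(rows, columns, sampler) {
  var matrix = [];
  for (var r = 0; r < rows; r++) {
    var row = [];
    for (var c = 0; c < columns; c++) row.push(sampler());
    matrix.push(row);
  }
  return matrix;
}

// Stand-in sampler: uniform in [-1, 1). This is an assumption; the
// `sample` module used by Mind may behave differently.
function uniformSample() {
  return Math.random() * 2 - 1;
}

var inputUnits = 2;
var hiddenUnits = 3;
var outputUnits = 1;

var weights = {
  inputHidden: randomMatrix(inputUnits, hiddenUnits, uniformSample),
  hiddenOutput: randomMatrix(hiddenUnits, outputUnits, uniformSample)
};
console.log(weights);
```

<p>The dimensions follow directly from the layer sizes: one weight for every input-to-hidden connection and one for every hidden-to-output connection.</p>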
<p><em>More information about the Mind API <a href="https://github.com/stevenmiller888/mind">here</a>.</em></p>
<p>Now you have a basic understanding of how neural networks operate, how to train them, and also how to build your own!</p>
<p>If you have any questions or comments, don’t hesitate to find me on <a href="https://www.twitter.com/stevenmiller888">Twitter</a>. Shout out to <a href="https://www.twitter.com/andyjiang">Andy</a> for his help on reviewing this.</p>
<h2 id="additional-resources">Additional Resources</h2>
<p><a href="https://www.youtube.com/watch?v=bxe2T-V8XRs">Neural Networks Demystified</a>, by <a href="https://www.twitter.com/stephencwelch">Stephen Welch</a></p>
<p><a href="http://neuralnetworksanddeeplearning.com/chap3.html">Neural Networks and Deep Learning</a>, by <a href="http://michaelnielsen.org/">Michael Nielsen</a></p>
<p><a href="http://natureofcode.com/book/chapter-10-neural-networks/">The Nature of Code, Neural Networks</a>, by <a href="https://twitter.com/shiffman">Daniel Shiffman</a></p>
<p><a href="https://en.wikipedia.org/wiki/Artificial_neural_network">Artificial Neural Networks</a>, Wikipedia</p>
<p><a href="http://www.cheshireeng.com/Neuralyst/nnbg.htm">Basic Concepts for Neural Networks</a>, by Ross Berteig</p>
<p><a href="http://www.saedsayad.com/artificial_neural_network.htm">Artificial Neural Networks</a>, by <a href="http://www.saedsayad.com/author.htm">Saed Sayad</a></p>
<p><a href="http://www.researchgate.net/post/How_to_decide_the_number_of_hidden_layers_and_nodes_in_a_hidden_layer">How to Decide the Number of Hidden Layers and Nodes in a Hidden Layer</a></p>
<p><a href="http://in.mathworks.com/matlabcentral/answers/72654-how-to-decide-size-of-neural-network-like-number-of-neurons-in-a-hidden-layer-number-of-hidden-lay">How to Decide size of Neural Network like number of neurons in a hidden layer & Number of hidden layers?</a></p>
Mind: How to Build a Neural Network (Part One)http://stevenmiller888.github.com/mind-how-to-build-a-neural-network2015-08-11T00:00:00.000ZSteven Miller<p><a href="https://en.wikipedia.org/wiki/Artificial_neural_network">Artificial neural networks</a> are statistical learning models, inspired by biological neural networks (central nervous systems, such as the brain), that are used in <a href="https://en.wikipedia.org/wiki/List_of_machine_learning_concepts">machine learning</a>. These networks are represented as systems of interconnected “neurons”, which send messages to each other. The connections within the network can be systematically adjusted based on inputs and outputs, making them ideal for supervised learning.</p>
<p>Neural networks can be intimidating, especially for people with little experience in machine learning and cognitive science! However, through code, this tutorial will explain how neural networks operate. By the end, you will know how to build your own flexible, learning network, similar to <a href="https://www.github.com/stevenmiller888/mind">Mind</a>.</p>
<p>The only prerequisites are a basic understanding of JavaScript, high-school calculus, and simple matrix operations. Other than that, you don’t need to know anything. Have fun!</p>
<h2 id="understanding-the-mind">Understanding the Mind</h2>
<p>A neural network is a collection of “neurons” with “synapses” connecting them. The collection is organized into three main parts: the input layer, the hidden layer, and the output layer. Note that you can have <em>n</em> hidden layers, with the term “deep” learning implying multiple hidden layers.</p>
<p><img src="https://cldup.com/ytEwlOfrRZ-2000x2000.png" alt=""></p>
<p><em>Screenshot taken from <a href="https://www.youtube.com/watch?v=bxe2T-V8XRs">this great introductory video</a>, which trains a neural network to predict a test score based on hours spent studying and sleeping the night before.</em></p>
<p>Hidden layers are necessary when the neural network has to make sense of something really complicated, contextual, or non-obvious, like image recognition. These layers are known as “hidden” because they are not visible as a network output. Read more about hidden layers <a href="http://stats.stackexchange.com/questions/63152/what-does-the-hidden-layer-in-a-neural-network-compute">here</a> and <a href="http://www.cs.cmu.edu/~dst/pubs/byte-hiddenlayer-1989.pdf">here</a>.</p>
<p>The circles represent neurons and lines represent synapses. Synapses take the input and multiply it by a “weight” (the “strength” of the input in determining the output). Neurons add the outputs from all synapses and apply an activation function.</p>
<p>Training a neural network basically means calibrating all of the “weights” by repeating two key steps, forward propagation and back propagation.</p>
<p>Since neural networks are great for regression, the best input data are numbers (as opposed to discrete values, like colors or movie genres, whose data is better for statistical classification models). The output data will be a number within a range such as 0 to 1 (this ultimately depends on the activation function; more on this below).</p>
<p>In <strong>forward propagation</strong>, we apply a set of weights to the input data and calculate an output. For the first forward propagation, the set of weights is selected randomly.</p>
<p>In <strong>back propagation</strong>, we measure the margin of error of the output and adjust the weights accordingly to decrease the error.</p>
<p>Neural networks repeat both forward and back propagation until the weights are calibrated to accurately predict an output.</p>
<p>Next, we’ll walk through a simple example of training a neural network to function as an <a href="https://en.wikipedia.org/wiki/Exclusive_or">“Exclusive or” (“XOR”) operation</a> to illustrate each step in the training process.</p>
<h3 id="forward-propagation">Forward Propagation</h3>
<p><em>Note that all calculations will show figures truncated to the thousandths place.</em></p>
<p>The XOR function can be represented by the mapping of the below inputs and outputs, which we’ll use as training data. Once trained, the network should provide a correct output given any input acceptable by the XOR function.</p>
<pre><code>input | output
--------------
0, 0 | 0
0, 1 | 1
1, 0 | 1
1, 1 | 0
</code></pre><p>Let’s use the last row from the above table, <code>(1, 1) => 0</code>, to demonstrate forward propagation:</p>
<p><img src="http://imgur.com/aTFz1Az.png" alt=""></p>
<p><em>Note that we use a single hidden layer with only three neurons for this example.</em></p>
<p>We now assign weights to all of the synapses. Because this is the first time we’re forward propagating, these weights are selected randomly (based on a Gaussian distribution). In this example the initial weights fall between 0 and 1, but the final weights don’t need to.</p>
<p><img src="http://imgur.com/Su6Y4UC.png" alt=""></p>
<p>We sum the product of the inputs with their corresponding set of weights to arrive at the first values for the hidden layer. You can think of the weights as measures of influence the input nodes have on the output.</p>
<pre><code>1 * 0.8 + 1 * 0.2 = 1
1 * 0.4 + 1 * 0.9 = 1.3
1 * 0.3 + 1 * 0.5 = 0.8
</code></pre><p>We show these sums in smaller type inside the circles, because they’re not the final values:</p>
<p><img src="http://imgur.com/gTvxRwo.png" alt=""></p>
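<p>The three hidden sums above can be reproduced with a few lines of JavaScript. This is only a sketch of the arithmetic; the variable names and the way the weights are grouped per hidden neuron are choices made for this example.</p>

```javascript
var input = [1, 1];

// weightsToHidden[j] lists the weights feeding hidden neuron j,
// one weight per input.
var weightsToHidden = [[0.8, 0.2], [0.4, 0.9], [0.3, 0.5]];

// Sum of each input multiplied by its corresponding weight.
function weightedSum(inputs, weights) {
  return inputs.reduce(function (sum, x, i) {
    return sum + x * weights[i];
  }, 0);
}

var hiddenSums = weightsToHidden.map(function (weights) {
  return weightedSum(input, weights);
});

console.log(hiddenSums); // approximately [1, 1.3, 0.8]
```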
<p>To get the final value, we apply the <a href="https://en.wikipedia.org/wiki/Activation_function">activation function</a> to the hidden layer sums. The activation function transforms a neuron’s input signal into an output signal, and it is what lets neural networks model complex non-linear patterns that simpler models might miss.</p>
<p>There are many types of activation functions—linear, sigmoid, hyperbolic tangent, even step-wise. To be honest, I don’t know why one function is better than another.</p>
<p><img src="https://cldup.com/hxmGABAI7Y.png" alt=""></p>
<p><em>Table taken from <a href="http://www.asprs.org/a/publications/pers/2003journal/november/2003_nov_1225-1234.pdf">this paper</a>.</em></p>
<p>For our example, let’s use the <a href="https://en.wikipedia.org/wiki/Sigmoid_function">sigmoid function</a> for activation. The sigmoid function looks like this, graphically:</p>
<p><img src="http://i.imgur.com/RVbqJsg.jpg" alt=""></p>
<p>And applying S(x) to the three hidden layer <em>sums</em>, we get:</p>
<pre><code>S(1.0) = 0.73105857863
S(1.3) = 0.78583498304
S(0.8) = 0.68997448112
</code></pre><p>We add that to our neural network as hidden layer <em>results</em>:</p>
<p><img src="http://imgur.com/yE88Ryt.png" alt=""></p>
<p>Then, we sum the product of the hidden layer results with the second set of weights (also determined at random the first time around) to determine the output sum.</p>
<pre><code>0.73 * 0.3 + 0.79 * 0.5 + 0.69 * 0.9 = 1.235
</code></pre><p>Finally, we apply the activation function to get the final output result.</p>
<pre><code>S(1.235) = 0.7746924929149283
</code></pre><p>This is our full diagram:</p>
<p><img src="http://imgur.com/IDFRq5a.png" alt=""></p>
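<p>The rest of the forward pass can be sketched the same way. Note one difference from the hand calculation: the code keeps full precision instead of rounding the hidden results to two decimals, so its output sum comes out slightly below 1.235, while the final output still rounds to 0.77. The variable names are assumptions made for this example.</p>

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

var hiddenSums = [1, 1.3, 0.8];
var hiddenToOutput = [0.3, 0.5, 0.9]; // w7, w8, w9

// Apply the activation function to each hidden sum.
var hiddenResults = hiddenSums.map(function (sum) {
  return sigmoid(sum);
});

// Weighted sum of the hidden results, then the activation function again.
var outputSum = hiddenResults.reduce(function (sum, h, j) {
  return sum + h * hiddenToOutput[j];
}, 0);
var outputResult = sigmoid(outputSum);

console.log(hiddenResults, outputSum, outputResult);
```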
<p>Because we used a random set of initial weights, the value of the output neuron is off the mark; in this case by +0.77 (since the target is 0). If we stopped here, this set of weights would be a great neural network for inaccurately representing the XOR operation.</p>
<p>Let’s fix that by using back propagation to adjust the weights to improve the network!</p>
<h3 id="back-propagation">Back Propagation</h3>
<p>To improve our model, we first have to quantify just how wrong our predictions are. Then, we adjust the weights accordingly so that the margin of error decreases.</p>
<p>Similar to forward propagation, back propagation calculations occur at each “layer”. We begin by changing the weights between the hidden layer and the output layer.</p>
<p><img src="http://imgur.com/kEyDCJ8.png" alt=""></p>
<p>Calculating the incremental change to these weights happens in two steps: 1) we find the margin of error of the output result (what we get after applying the activation function) to back out the necessary change in the output sum (we call this <code>delta output sum</code>) and 2) we extract the change in weights by multiplying <code>delta output sum</code> by the hidden layer results.</p>
<p>The <code>output sum margin of error</code> is the target output result minus the calculated output result:</p>
<p><img src="http://i.imgur.com/IAddjWL.png" alt=""></p>
<p>And doing the math:</p>
<pre><code>Target = 0
Calculated = 0.77
Target - calculated = -0.77
</code></pre><p>To calculate the necessary change in the output sum, or <code>delta output sum</code>, we take the derivative of the activation function and apply it to the output sum. In our example, the activation function is the sigmoid function.</p>
<p>To refresh your memory, the activation function, sigmoid, takes the sum and returns the result:</p>
<p><img src="http://i.imgur.com/rKHEE51.png" alt=""></p>
<p>So the derivative of sigmoid, also known as sigmoid prime, will give us the rate of change (or “slope”) of the activation function at the output sum:</p>
<p><img src="http://i.imgur.com/8xQ6TiU.png" alt=""></p>
<p>Since the <code>output sum margin of error</code> is the difference in the result, we can simply multiply that with the rate of change to give us the <code>delta output sum</code>:</p>
<p><img src="http://i.imgur.com/4qnVb6S.png" alt=""></p>
<p>Conceptually, this means that the change in the output sum is the margin of error scaled by the slope of the sigmoid at the output sum. Doing the actual math, we get:</p>
<pre><code>Delta output sum = S'(sum) * (output sum margin of error)
Delta output sum = S'(1.235) * (-0.77)
Delta output sum = -0.13439890643886018
</code></pre><p>Here is a graph of the sigmoid function to give you an idea of how we are using the derivative to move the input in the right direction. Note that this graph is not to scale.</p>
<p><img src="http://i.imgur.com/ByyQIJ8.png" alt=""></p>
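<p>The <code>delta output sum</code> calculation can be checked with a short sketch. It uses the rounded values from the worked example (an output sum of 1.235 and a calculated output of 0.77); the variable names are assumptions made for this example.</p>

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Slope of the sigmoid at x.
function sigmoidPrime(x) {
  var s = sigmoid(x);
  return s * (1 - s);
}

var outputSum = 1.235;
var target = 0;
var calculated = 0.77;

// Output sum margin of error: target minus calculated.
var marginOfError = target - calculated;

// Delta output sum: slope at the output sum times the margin of error.
var deltaOutputSum = sigmoidPrime(outputSum) * marginOfError;

console.log(deltaOutputSum); // roughly -0.1344
```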
<p>Now that we have the proposed change in the output layer sum (-0.13), let’s use this in the derivative of the output sum function to determine the new change in weights.</p>
<p>As a reminder, the mathematical definition of the <code>output sum</code> is the product of the hidden layer result and the weights between the hidden and output layer:</p>
<p><img src="http://i.imgur.com/ITudruR.png" alt=""></p>
<p>The derivative of the <code>output sum</code> is:</p>
<p><img src="http://i.imgur.com/57mJyOe.png" alt=""></p>
<p>This can also be represented as:</p>
<p><img src="http://i.imgur.com/TR7FS2S.png" alt=""></p>
<p>This relationship suggests that a greater change in output sum yields a greater change in the weights; input neurons with the biggest contribution (higher weight to output neuron) should experience more change in the connecting synapse.</p>
<p>Let’s do the math:</p>
<pre><code>hidden result 1 = 0.73105857863
hidden result 2 = 0.78583498304
hidden result 3 = 0.68997448112
Delta weights = delta output sum / hidden layer results
Delta weights = -0.1344 / [0.73105, 0.78583, 0.69997]
Delta weights = [-0.1838, -0.1710, -0.1920]
old w7 = 0.3
old w8 = 0.5
old w9 = 0.9
new w7 = 0.1162
new w8 = 0.329
new w9 = 0.708
</code></pre><p>To determine the change in the weights between the <em>input and hidden</em> layers, we perform a similar, but notably different, set of calculations. Note that in the following calculations, we use the initial weights instead of the recently adjusted weights from the first part of the backward propagation.</p>
<p>Remember that the relationship between the hidden result, the weights between the hidden and output layer, and the output sum is:</p>
<p><img src="http://i.imgur.com/ITudruR.png" alt=""></p>
<p>Instead of deriving for <code>output sum</code>, let’s derive for <code>hidden result</code> as a function of <code>output sum</code> to ultimately find out <code>delta hidden sum</code>:</p>
<p><img src="http://i.imgur.com/25TS8NU.png" alt="">
<img src="http://i.imgur.com/iQIR1MD.png" alt=""></p>
<p>Also, remember that the change in the <code>hidden result</code> can also be defined as:</p>
<p><img src="http://i.imgur.com/ZquX1pv.png" alt=""></p>
<p>Let’s multiply both sides by sigmoid prime of the hidden sum:</p>
<p><img src="http://i.imgur.com/X0wvirh.png" alt="">
<img src="http://i.imgur.com/msHbhQl.png" alt=""></p>
<p>All of the pieces in the above equation can be calculated, so we can determine the <code>delta hidden sum</code>:</p>
<pre><code>Delta hidden sum = delta output sum / hidden-to-output weights * S'(hidden sum)
Delta hidden sum = -0.1344 / [0.3, 0.5, 0.9] * S'([1, 1.3, 0.8])
Delta hidden sum = [-0.448, -0.2688, -0.1493] * [0.1966, 0.1683, 0.2139]
Delta hidden sum = [-0.088, -0.0452, -0.0319]
</code></pre><p>Once we get the <code>delta hidden sum</code>, we calculate the change in weights between the input and hidden layer by dividing it by the input data, <code>(1, 1)</code>. The input data here plays the same role that the <code>hidden results</code> played in the earlier back propagation step, where we determined the change in the hidden-to-output weights. Here is the derivation of that relationship, similar to the one before:</p>
<p><img src="http://i.imgur.com/7NmXWSh.png" alt="">
<img src="http://i.imgur.com/1SDxECJ.png" alt="">
<img src="http://i.imgur.com/KYuSAgw.png" alt=""></p>
<p>Let’s do the math:</p>
<pre><code>input 1 = 1
input 2 = 1
Delta weights = delta hidden sum / input data
Delta weights = [-0.088, -0.0452, -0.0319] / [1, 1]
Delta weights = [-0.088, -0.0452, -0.0319, -0.088, -0.0452, -0.0319]
old w1 = 0.8
old w2 = 0.4
old w3 = 0.3
old w4 = 0.2
old w5 = 0.9
old w6 = 0.5
new w1 = 0.712
new w2 = 0.3548
new w3 = 0.2681
new w4 = 0.112
new w5 = 0.8548
new w6 = 0.4681
</code></pre><p>Here are the new weights, right next to the initial random starting weights as comparison:</p>
<pre><code>old        new
----------------------
w1: 0.8    w1: 0.712
w2: 0.4    w2: 0.3548
w3: 0.3    w3: 0.2681
w4: 0.2    w4: 0.112
w5: 0.9    w5: 0.8548
w6: 0.5    w6: 0.4681
w7: 0.3    w7: 0.1162
w8: 0.5    w8: 0.329
w9: 0.9    w9: 0.708
</code></pre><p>Once we arrive at the adjusted weights, we start again with forward propagation. When training a neural network, it is common to repeat both of these processes thousands of times (by default, Mind iterates 10,000 times).</p>
<p>And doing a quick forward propagation, we can see that the final output here is a little closer to the expected output:</p>
<p><img src="http://i.imgur.com/UNlffE1.png" alt=""></p>
<p>Through just one iteration of forward and back propagation, we’ve already improved the network!!</p>
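<p>For readers who want to check the numbers, the input-to-hidden arithmetic from this section can be reproduced with the sketch below. It follows the calculations exactly as written in this post, starting from the rounded <code>delta output sum</code> of -0.1344; the variable names and array layout are assumptions made for this example.</p>

```javascript
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

function sigmoidPrime(x) {
  var s = sigmoid(x);
  return s * (1 - s);
}

var deltaOutputSum = -0.1344;
var hiddenToOutput = [0.3, 0.5, 0.9]; // initial w7, w8, w9
var hiddenSums = [1, 1.3, 0.8];
var input = [1, 1];

// oldWeights[i][j] connects input i to hidden neuron j:
// row 0 holds w1..w3, row 1 holds w4..w6.
var oldWeights = [[0.8, 0.4, 0.3], [0.2, 0.9, 0.5]];

// Delta hidden sum, one value per hidden neuron.
var deltaHiddenSum = hiddenSums.map(function (sum, j) {
  return deltaOutputSum / hiddenToOutput[j] * sigmoidPrime(sum);
});

// New weight = old weight + delta hidden sum / input.
var newWeights = oldWeights.map(function (row, i) {
  return row.map(function (weight, j) {
    return weight + deltaHiddenSum[j] / input[i];
  });
});

console.log(deltaHiddenSum, newWeights);
```

<p>The printed values should agree with the table above to about three decimal places; small differences come from the rounding used in the hand calculation.</p>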
<p><em>Check out <a href="https://www.youtube.com/watch?v=GlcnxUlrtek">this short video</a> for a great explanation of identifying global minima in a cost function as a way to determine necessary weight changes.</em></p>
<p>If you enjoyed learning about how neural networks work, check out <a href="/mind-how-to-build-a-neural-network-part-2">Part Two</a> of this post to learn how to build your own neural network.</p>
<p><strong>Note: I’ve been working on a new project called <a href="https://maji.cloud/products/config">Maji Config</a>. If you’re tired of duplicating config all over your codebase, or having to redeploy all your apps whenever you need to change config, this might work well for you. I’d love to hear what you think of it. Feel free to send me an <a href="mailto:stevenmiller888@me.com?Subject=Hello">email</a>.</strong></p>
Remembering `.shift()` and `.unshift()`http://stevenmiller888.github.com/remembering-shift-vs-unshift2015-05-22T00:00:00.000ZSteven Miller<p>If you’re like me, you forget the difference between <code>.shift()</code> and <code>.unshift()</code> all the time. Here’s a little trick for remembering them. Picture a keyboard. Now think of that keyboard as an array, with the left side of the keyboard corresponding to the front of the array, and the right side of the keyboard corresponding to the back of the array. Imagine yourself pressing down the left <code>shift</code> key. Think of this as “removing” it from the keyboard (array). Similarly, the <code>shift</code> function removes an element from the front of the array. Now picture yourself removing your finger from the left <code>shift</code> key, and it comes back up. You just “added” (or unshifted) an element to the array.</p>
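<p>Here is the keyboard picture as a tiny example. The array contents are made up for illustration; the behavior shown is that of the standard <code>Array.prototype.shift</code> and <code>Array.prototype.unshift</code> methods.</p>

```javascript
// The "keyboard": the left side is the front of the array.
var keys = ['shift', 'z', 'x'];

// Pressing the key down: shift() removes the first element and returns it.
var removed = keys.shift();
var afterShift = keys.slice(); // copy of the array at this point

// Letting the key come back up: unshift() adds to the front and
// returns the new length of the array.
var newLength = keys.unshift(removed);

console.log(removed, afterShift, keys, newLength);
```

<p>Note the different return values: <code>shift</code> gives back the removed element, while <code>unshift</code> gives back the array’s new length.</p>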