%
% Skeleton solution for a backpropagation network.
%
% You need to change the bits that say INSERT CODE HERE.
% Feel free to use as is or start from scratch.
%
% Usage:
%    backprop;
%
% Arguments:
%    none.
%
% Returns:
%    none.
%
% Example:
%
% >> backprop
%
% ----------------------------------------------------------
%
% Modeling Behavior
% Psychology 891C
% Fall 2004
%
% Andrew L. Cohen
% 11/8/04
%
% ---------------------------------------------------------
%

function backprop

% Step 1: Constants ----------------------------------------

% Load the data
load test_patterns.mat -ascii;
load train_patterns.mat -ascii;
load test_targets.mat -ascii;
load train_targets.mat -ascii;

% The data are now in the matrices:
%    train_patterns train_targets test_patterns test_targets
% To see what they look like, uncomment the next 4 lines.
% train_patterns
% train_targets
% test_patterns
% test_targets

A = size(train_patterns, 1);         % How many input units to use
B = 8;                               % Number of hidden units
C = size(train_targets, 1);          % Number of responses
num_train = size(train_patterns, 2); % How many training patterns to use
num_test = size(test_patterns, 2);   % How many testing patterns to use
eta = 0.35;                          % Learning rate for backprop
num_epochs = 500;                    % Number of training epochs

% Step 2: Initialize weights ---------------------------------

% Initialize weights between -.1 and .1 uniformly.
% rand gives a random number between 0 and 1.
% DON'T FORGET THE THRESHOLD UNITS out of (but not into) the input and
% hidden layers!
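%
% A hint on scaling rand (a sketch, not the assignment's answer): one common
% way to get uniform random numbers between lo and hi is
%    lo + (hi - lo) * rand(m, n)
% which returns an m x n matrix. The names lo, hi, m, and n are placeholders
% chosen for this illustration. For example, at the prompt, try
%    r = 2 + (5 - 2) * rand(3, 4)
% and check that size(r) is [3 4] and that every entry lies between 2 and 5.
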
% Input to hidden (#input+1 x #hidden)
w1 = % INSERT CODE HERE

% Hidden to output (#hidden+1 x #output)
w2 = % INSERT CODE HERE

% Step 3: Initialize threshold units --------------------------

% Let the threshold units be the last input and hidden unit
x(A + 1) = % INSERT CODE HERE
h(B + 1) = % INSERT CODE HERE

% Train the network --------------------------------------

% For each epoch
for epoch = 1:num_epochs

    % Total number of training and testing patterns
    num_patterns = num_train + num_test;

    % Reset sse to zero
    sse_train(epoch) = 0;
    sse_test(epoch) = 0;

    % For each train and test pattern
    for p = 1:num_patterns

        % Training or testing? Choose patterns and targets appropriately and
        % set flag.
        if p <= num_train
            training = 1;
            patterns = train_patterns;
            targets = train_targets;
        else
            training = 0;
            patterns = test_patterns;
            targets = test_targets;
        end

        % Select a pattern (training or testing) and get the right target.
        % mod gives the remainder after division, so p indexes the training
        % patterns first and then wraps around to index the testing patterns
        % (this assumes num_test <= num_train).
        pattern = patterns(:, mod(p - 1, num_train) + 1);
        target = targets(:, mod(p - 1, num_train) + 1);

        % The desired output vector (make a row vector so it matches the
        % output). y is 1xC.
        y = target';

        % Step 4: Activate inputs to be the pattern ----------------
        % x is 1xA+1, but note that you shouldn't include the threshold unit
        % here.
        x(1:A) = % INSERT CODE HERE

        % Step 5: Propagate from input to hidden units -------------
        % net1 = sum of the input activations times the first weight layer to
        % the hidden units.
        % This should be easy. Check what [1 2 3] * [1 2; 3 4; 5 6] does. It
        % is [1*1 + 2*3 + 3*5   1*2 + 2*4 + 3*6].
        % When you multiply an axb and a bxc matrix you get an axc matrix.
        % Note that the b's need to match.
        % As an example, say a 3 unit input vector, plus threshold unit, is
        % [.2 .3 .4 1], say the weights from these units to two hidden units
        % are [1 1 1 1] to the first hidden unit and [0 0 0 0] to the second
        % hidden unit. Then the weight matrix is [1 0; 1 0; 1 0; 1 0].
        % So if you want to sum the input to the first hidden unit you want
        % .2*1 + .3*1 + .4*1 + 1*1 and to the second hidden unit,
        % .2*0 + .3*0 + .4*0 + 1*0. Thus when you multiply
        % [.2 .3 .4 1] * [1 0; 1 0; 1 0; 1 0] you get
        % [.2*1 + .3*1 + .4*1 + 1*1   .2*0 + .3*0 + .4*0 + 1*0] which is a
        % 1x2 matrix of net inputs to the hidden units.
        % You can use 'size(x)' to look at how many rows & cols a matrix has.
        % You should include the threshold unit here.
        % x is 1xA+1, w1 is A+1xB. net1 should be 1xB.
        net1 = % INSERT CODE HERE

        % Filter through activation function
        % e^x is exp(x), you can take exp([1 2]), and 1./[1 2 3] = [1 .5 .3333]
        % Note that I'm NOT including the threshold unit here.
        h(1:B) = % INSERT CODE HERE

        % Step 6: Propagate activations from hidden to output ----------
        % net2 = sum of the hidden activations times the second weight layer
        % to the output units. See net1 above for hints.
        % w2 is B+1xC. h is 1xB+1. net2 should be 1xC.
        net2 = % INSERT CODE HERE

        % Filter through activation function, see h above for hints.
        % o should be 1xC and net2 is 1xC.
        o = % INSERT CODE HERE

        % Backprop if training, accumulate the test error if not
        if training

            % Calculate error
            sse_train(epoch) = sse_train(epoch) + sum((o-y).^2);

            % Step 7: Compute errors in output layer -----------------
            % This should be easy too.
            % Note that [1 2 3] .* [2 3 4] = [1*2 2*3 3*4] and that
            % (1 - [1 2 3]).*[2 3 4] = [0 -1 -2].*[2 3 4] = [0 -3 -8].
            % o and y are both 1xC. delta2 should be 1xC.
            delta2 = % INSERT CODE HERE

            % Step 8: Compute errors of the hidden layer --------------
            % Do NOT include the threshold units here. To not include the
            % threshold units in h use h(1:B). To not include them in w2 use
            % w2(1:B, :).
            % Remember there is a difference between '.*' and '*'. You might
            % want to use both in this formula.
            % To multiply you may need to take the transpose of one of the
            % matrices.
            % y' is the transpose of y, so [1 2 3]' = [1; 2; 3].
            % For example, if you have the matrices [1; 2] and [1 2; 3 4] you
            % can't multiply [1; 2] * [1 2; 3 4], but you can multiply
            % [1; 2]' * [1 2; 3 4].
            % You can also do [1; 2]' * [1 2; 3 4] .* [5 6]. Try it.
            delta1 = % INSERT CODE HERE

            % Step 9: Adjust weights between hidden and output layers -----
            % You will need to transpose something.
            % h is 1xB+1, delta2 is 1xC. w2 should be B+1xC. Don't forget eta.
            % Include the threshold unit.
            w2 = w2 + % INSERT CODE HERE

            % Step 10: Adjust weights between input and hidden layers -----
            % x is 1xA+1, delta1 is 1xB. w1 should be A+1xB.
            % Include the threshold unit.
            w1 = w1 + % INSERT CODE HERE

        else

            % Calculate error
            sse_test(epoch) = sse_test(epoch) + sum((o-y).^2);

        end

    end

end

% Plot results (Training is dashed, testing is solid)
close all
plot(sse_train, '--');
hold on
plot(sse_test);
legend('train', 'test');
xlabel('Epoch');
ylabel('SSE');
hold off
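
% ----------------------------------------------------------
% Optional sanity check (a hedged sketch, not part of the original
% assignment). It replays the toy numbers from the hints in the comments so
% you can confirm the matrix shapes before filling in the blanks. The
% function name hint_examples and every variable in it are made up for
% illustration. Because it is a local function, it can only be called from
% inside this file, for example by temporarily adding the line
% 'hint_examples;' right after 'function backprop'.
function hint_examples

% Net input: a 1x4 input (3 units plus threshold) times a 4x2 weight matrix
% gives a 1x2 row of net inputs.
x_demo = [.2 .3 .4 1];
w_demo = [1 0; 1 0; 1 0; 1 0];
net_demo = x_demo * w_demo;
assert(isequal(size(net_demo), [1 2]));
assert(abs(net_demo(1) - 1.9) < 1e-12 && net_demo(2) == 0);

% Elementwise division and multiplication work entry by entry.
assert(max(abs(1 ./ [1 2 4] - [1 .5 .25])) < 1e-12);
assert(isequal((1 - [1 2 3]) .* [2 3 4], [0 -3 -8]));

% Transpose first, then '*' (matrix product), then '.*' (elementwise).
assert(isequal([1; 2]' * [1 2; 3 4] .* [5 6], [35 60]));

% A column (3x1) times a row (1x2) gives a full 3x2 matrix.
outer_demo = [1 2 3]' * [10 20];
assert(isequal(outer_demo, [10 20; 20 40; 30 60]));

disp('All hint examples check out.');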