Machine Learning in Practice: Neural Networks (Part 2) _长笛人倚楼Gloria

http://blog.sina.com.cn/u/2214742602



(2013-10-22 15:36:39)

Tags: neural network, intelligence, bp, costfunction, learning

Category: Machine Learning

Example

Handwritten digit recognition

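1. Cost function

Following the theory described in the previous post, we write a function that computes the cost function J and its gradient.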
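For reference, the regularized cost that the function below computes can be written out as follows. This is a sketch in the usual course notation, which is an assumption here since the post does not spell it out: m is the number of training examples, K is num_labels, y_k^{(i)} is the one-hot encoded label, h_Θ(x) is the output-layer activation, and the first column of each weight matrix (the bias weights) is left out of the penalty.

```latex
J(\Theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}
  \left[ -y_k^{(i)} \log\big(h_\Theta(x^{(i)})\big)_k
         - \big(1 - y_k^{(i)}\big) \log\Big(1 - \big(h_\Theta(x^{(i)})\big)_k\Big) \right]
  + \frac{\lambda}{2m}
  \left[ \sum_{j}\sum_{k \ge 2} \big(\Theta^{(1)}_{jk}\big)^2
       + \sum_{j}\sum_{k \ge 2} \big(\Theta^{(2)}_{jk}\big)^2 \right]
```

The first sum corresponds to the loop over the training examples in the code, and the second bracket corresponds to the sums over Theta1(:, 2:end) and Theta2(:, 2:end).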

function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)

%NNCOSTFUNCTION Implements the neural network cost function for a two layer
%neural network which performs classification
%   [J grad] = NNCOSTFUNCTION(nn_params, input_layer_size, hidden_layer_size, ...
%   num_labels, X, y, lambda) computes the cost and gradient of the neural
%   network. The parameters for the neural network are "unrolled" into the
%   vector nn_params and need to be converted back into the weight matrices.
%
%   The returned parameter grad should be an "unrolled" vector of the
%   partial derivatives of the neural network.

% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));

Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

% Setup some useful variables
m = size(X, 1);

% You need to return the following variables correctly
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));

% ====================== YOUR CODE HERE ======================
% Instructions: You should complete the code by working through the
%               following parts.
%
% Part 1: Feedforward the neural network and return the cost in the
%         variable J. After implementing Part 1, you can verify that your
%         cost function computation is correct by verifying the cost
%         computed in ex4.m
%
% Part 2: Implement the backpropagation algorithm to compute the gradients
%         Theta1_grad and Theta2_grad. You should return the partial derivatives of
%         the cost function with respect to Theta1 and Theta2 in Theta1_grad and
%         Theta2_grad, respectively. After implementing Part 2, you can check
%         that your implementation is correct by running checkNNGradients
%
%         Note: The vector y passed into the function is a vector of labels
%               containing values from 1..K. You need to map this vector into a
%               binary vector of 1's and 0's to be used with the neural network
%               cost function.
%
%         Hint: We recommend implementing backpropagation using a for-loop
%               over the training examples if you are implementing it for the
%               first time.
%
% Part 3: Implement regularization with the cost function and gradients.
%
%         Hint: You can implement this around the code for
%               backpropagation. That is, you can compute the gradients for
%               the regularization separately and then add them to Theta1_grad
%               and Theta2_grad from Part 2.

%% Calculate J

% One-hot encode the labels: column k of Y is the target vector for class k
Y = diag(ones(1, num_labels));

% Forward propagation over all examples at once
X = [ones(m, 1) X];            % 5000 * 401 (bias column added)
aa2 = sigmoid(Theta1 * X');    % 25 * 5000
aa22 = [ones(1, m); aa2];      % 26 * 5000 (bias row added)
aa3 = sigmoid(Theta2 * aa22);  % 10 * 5000

% Accumulate the cross-entropy cost example by example
for i = 1:m
    tempJ = -Y(:, y(i)) .* log(aa3(:, i)) - (1 - Y(:, y(i))) .* log(1 - aa3(:, i));
    sum_tempJ = sum(tempJ);
    J = J + sum_tempJ;
end
J = 1/m * J;

% Regularization term (bias columns excluded)
temp_theta1 = Theta1(:, (2:end));
temp_theta2 = Theta2(:, (2:end));
J = J + lambda * 1/(2*m) * (sum(sum(temp_theta1.^2)) + sum(sum(temp_theta2.^2)));

%% Calculate grad

Delta1 = zeros(size(Theta1));  % 25*401 accumulator
Delta2 = zeros(size(Theta2));  % 10*26 accumulator

for i = 1:m
    % Forward pass for example i (X already includes the bias column)
    a1 = X(i, :)';                 % 401*1
    z2 = Theta1 * a1;              % 25*1
    a2_temp = sigmoid(z2);
    a2 = [1; a2_temp];             % 26*1
    z3 = Theta2 * a2;              % 10*1
    a3_temp = sigmoid(z3);

    % Backward pass: output error, then hidden-layer error (bias excluded)
    delta3 = a3_temp - Y(:, y(i));                               % 10*1
    delta2 = Theta2(:, 2:end)' * delta3 .* sigmoidGradient(z2);  % 25*1

    Delta2 = Delta2 + delta3 * a2';  % 10*26
    Delta1 = Delta1 + delta2 * a1';  % 25*401
end

Theta1_grad = 1/m * Delta1;
Theta2_grad = 1/m * Delta2;

%% Regularized

[m1, n1] = size(Theta1);
[m2, n2] = size(Theta2);
Theta1_grad = Theta1_grad + lambda / m * [zeros(m1, 1), Theta1(:, 2:end)];
Theta2_grad = Theta2_grad + lambda / m * [zeros(m2, 1), Theta2(:, 2:end)];

% -------------------------------------------------------------

% =========================================================================

% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];

end
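The comments above mention checkNNGradients for verifying the backpropagation code. The idea behind such a check is to compare the analytic gradient with a central-difference approximation. Below is a minimal sketch of that idea on a toy cost function whose gradient is known exactly; the toy function, the step size e and the variable names are assumptions made for illustration, not the course's own checker.

```octave
% Toy cost with a known gradient: J(t) = sum(t.^2), so dJ/dt = 2*t
J = @(t) sum(t .^ 2);
theta = [1; -2; 0.5];
analytic = 2 * theta;

% Central-difference approximation, one parameter at a time
e = 1e-4;                         % step size (an assumed, commonly used value)
numgrad = zeros(size(theta));
for p = 1:numel(theta)
    perturb = zeros(size(theta));
    perturb(p) = e;
    numgrad(p) = (J(theta + perturb) - J(theta - perturb)) / (2 * e);
end

% A small relative difference suggests the analytic gradient is right
relative_error = norm(numgrad - analytic) / norm(numgrad + analytic);
fprintf('relative error: %g\n', relative_error);
```

To apply the same check to the network, J would be replaced by a handle such as @(p) nnCostFunction(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda) and analytic by the returned grad, preferably on a very small network, because the loop evaluates the cost twice per parameter.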

2. Gradient of the sigmoid function

g = sigmoid(z) .* (1 - sigmoid(z));
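The one-line expression above follows directly from the definition of the sigmoid. A short derivation, using g for the sigmoid as in the code:

```latex
g(z) = \frac{1}{1 + e^{-z}}, \qquad
g'(z) = \frac{e^{-z}}{\left(1 + e^{-z}\right)^{2}}
      = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}}
      = g(z)\,\bigl(1 - g(z)\bigr)
```

As a quick sanity check, g(0) = 0.5, so the gradient at z = 0 is 0.5 * 0.5 = 0.25, which is also its maximum value.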

3. Training the neural network

Minimizing the cost function yields the optimal parameters theta.

%% =================== Part 8: Training NN ===================
%  You have now implemented all the code necessary to train a neural
%  network. To train your neural network, we will now use "fmincg", which
%  is a function which works similarly to "fminunc". Recall that these
%  advanced optimizers are able to train our cost functions efficiently as
%  long as we provide them with the gradient computations.

fprintf('\nTraining Neural Network... \n')

%  After you have completed the assignment, change the MaxIter to a larger
%  value to see how more training helps.
options = optimset('MaxIter', 50);

%  You should also try different values of lambda
lambda = 1;

% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction(p, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, X, y, lambda);

% Now, costFunction is a function that takes in only one argument (the
% neural network parameters)
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

% Obtain Theta1 and Theta2 back from nn_params
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));

Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));

fprintf('Program paused. Press enter to continue.\n');
pause;
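The training script takes initial_nn_params as given and does not show where it comes from. One common way to build it is to draw every weight uniformly from a small symmetric interval, so that the hidden units do not all start out identical. The sketch below assumes the layer sizes implied by the dimension comments in the cost function (400 inputs, 25 hidden units, 10 labels) and an interval half-width epsilon_init of 0.12; both are assumptions, not values stated in this post.

```octave
% Assumed layer sizes, taken from the 401 / 25 / 10 dimension comments
input_layer_size  = 400;
hidden_layer_size = 25;
num_labels        = 10;

% Assumed half-width of the initialization interval [-epsilon_init, epsilon_init]
epsilon_init = 0.12;

% Each matrix has one extra column for the bias unit
initial_Theta1 = rand(hidden_layer_size, input_layer_size + 1) * 2 * epsilon_init - epsilon_init;
initial_Theta2 = rand(num_labels, hidden_layer_size + 1) * 2 * epsilon_init - epsilon_init;

% Unroll into a single column vector, in the same order that nnCostFunction reshapes
initial_nn_params = [initial_Theta1(:); initial_Theta2(:)];
```

The unrolled vector holds 25 * 401 + 10 * 26 = 10285 entries, which matches the two reshape calls in the training code.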

4. Making predictions

function p = predict(Theta1, Theta2, X)
%PREDICT Predict the label of an input given a trained neural network
%   p = PREDICT(Theta1, Theta2, X) outputs the predicted label of X given the
%   trained weights of a neural network (Theta1, Theta2)

% Useful values
m = size(X, 1);
num_labels = size(Theta2, 1);

% You need to return the following variables correctly
p = zeros(size(X, 1), 1);

h1 = sigmoid([ones(m, 1) X] * Theta1');
h2 = sigmoid([ones(m, 1) h1] * Theta2');
[dummy, p] = max(h2, [], 2);

% =========================================================================

end
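Once predict returns a label for every example, a natural next step is to measure how many labels agree with the ground truth. The sketch below uses two short made-up vectors in place of the real predict output and the real y, so it can run on its own.

```octave
% Made-up stand-ins for pred = predict(Theta1, Theta2, X) and the true labels y
pred = [1; 2; 3; 4];
y    = [1; 2; 3; 1];

% Fraction of matching labels, expressed as a percentage
accuracy = mean(double(pred == y)) * 100;
fprintf('Training Set Accuracy: %f\n', accuracy);
```

Three of the four toy labels match, so this prints an accuracy of 75. Note that accuracy measured on the training set tends to be optimistic compared with accuracy on unseen data.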

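5. A further note on prediction

Training the neural network produces Theta1 and Theta2, so prediction only requires loading these two parameters.

The prediction code is similar to the function above, but it is rewritten here in a slightly more general form.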

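m = size(X, 1);                % number of examples
num_labels = size(Theta2, 1);  % number of output classes

X = [ones(m, 1) X];            % add the bias column

% Label that each output unit stands for (here simply 1..num_labels)
y_predict = 1:num_labels;
y_predict = y_predict';

a2 = sigmoid(Theta1 * X');     % 25 * 5000
a22 = [ones(1, m); a2];        % add the bias row
a3 = sigmoid(Theta2 * a22);    % 10 * 5000

p_all = a3';                   % 5000 * 10

% Pick the most activated output unit and map its index to a label
[p_max, index] = max(p_all, [], 2);
p = y_predict(index);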
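Since only Theta1 and Theta2 are needed at prediction time, they can be written to disk after training and read back in a later session. A minimal sketch follows; the tiny stand-in matrices and the temporary file name are made up for illustration.

```octave
% Stand-in weights; in practice these come out of the training step
Theta1 = [0.5 -1; 2 0.25];
Theta2 = [1 0 -0.5];

% Hypothetical file location; any writable path would do
weights_file = [tempname() '.mat'];
save(weights_file, 'Theta1', 'Theta2');

% Simulate a fresh session, then restore both matrices by name
clear Theta1 Theta2;
load(weights_file);
disp(size(Theta1));

delete(weights_file);   % clean up the temporary file
```

After load, Theta1 and Theta2 are back in the workspace and can be passed to the prediction code above without retraining.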
