I've been messing around with Theano a bit for machine learning, and have been loving it.
I've got a pretty basic neural network class working well -- I've trained it on various examples and gotten good results.
For instance, I just trained a stacked autoencoder on the abalone dataset, and without tuning the hyperparameters for the final training run I got around 59% accuracy on a 6-class classification scheme (the site reports 55-65% performance on a 3-class scheme, so I'm happy with that; I'll play with the hyperparameters to try to do better).
My biggest question is how people implement minibatch gradient descent in Theano. So far I've been using plain stochastic gradient descent, updating on every single example during each epoch, and I'd love to speed it up. I'm not sure how to apply the parameter updates once a minibatch is done -- it seems like I'd have to run all of the minibatch's examples through the net at once, and I'm not sure how to do that without breaking things. I'm also planning on adding momentum, but that looks a lot easier: I think I just need a shared Theano variable for each parameter's velocity, updated along with everything else.
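To make the question concrete: from what I can tell from the Theano deep learning tutorials, the usual trick is to keep the whole dataset in shared variables and compile a training function that takes a minibatch index, with givens splicing the right slice into the graph. Here's my best guess at that pattern as a self-contained toy -- a one-layer softmax model on made-up data, so none of these names come from my actual class:

Code:
import numpy
import theano
from theano import function, tensor

floatX = theano.config.floatX

# Fake dataset: 100 examples, 8 features, 3 one-hot classes
rng = numpy.random.RandomState(0)
inputs = rng.rand(100, 8).astype(floatX)
targets = numpy.eye(3, dtype=floatX)[rng.randint(3, size=100)]
train_x = theano.shared(inputs)
train_y = theano.shared(targets)

# A minimal one-layer softmax model standing in for my network
x = tensor.matrix("x")
y = tensor.matrix("y")
w = theano.shared(numpy.zeros((8, 3), dtype=floatX))
b = theano.shared(numpy.zeros(3, dtype=floatX))
params = [w, b]
cost = tensor.mean(tensor.nnet.categorical_crossentropy(
    tensor.nnet.softmax(tensor.dot(x, w) + b), y))

rate = 0.1
batch_size = 20
index = tensor.lscalar("index")  # which minibatch this call trains on
grads = tensor.grad(cost=cost, wrt=params)
updates = [(p, p - rate * g) for p, g in zip(params, grads)]

# givens swaps x and y for a slice of the shared data, so a single call
# runs a whole minibatch through the graph at once
train_batch = function(
    inputs=[index],
    outputs=cost,
    updates=updates,
    givens={x: train_x[index * batch_size: (index + 1) * batch_size],
            y: train_y[index * batch_size: (index + 1) * batch_size]})

n_batches = inputs.shape[0] // batch_size
for epoch in range(10):
    for i in range(n_batches):
        train_batch(i)

And momentum does look like it's just more shared state -- one velocity per parameter, updated alongside it (again only a sketch, reusing the names above):

Code:
momentum = 0.9
updates = []
for p, g in zip(params, grads):
    v = theano.shared(numpy.zeros(p.get_value().shape, dtype=floatX))
    v_new = momentum * v - rate * g  # computed from the old velocity
    updates.append((v, v_new))
    updates.append((p, p + v_new))

If that's the right idea, my remaining confusion is how to bolt the index/givens pattern onto a class like mine, where _inpt and _otpt are created inside _build_forwardprop.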
Here is the relevant section of my code. The layers are their own objects stored in self.layers, and their make_output method builds up the symbolic expression for that layer's output from the previous layer's output (or the network input, for layer 0).
Code:
# Imports used by the methods below
import random

from theano import function, tensor


def _set_cost(self, type: "cost type"):
    """Set the _costfunc attribute.

    Usage:
        _set_cost(type)

    Arguments:
        type -- the cost function type

    This method raises a NotImplementedError if the requested cost
    function has not been implemented yet.

    This method is used by __init__ to set the cost type appropriately.
    Not intended to be accessed publicly.
    """
    if type == "categorical crossentropy":
        self._costfunc = tensor.nnet.categorical_crossentropy
    elif type == "binary crossentropy":
        self._costfunc = tensor.nnet.binary_crossentropy
    elif type == "quadratic":
        self._costfunc = quadratic_cost
    else:
        raise NotImplementedError("The cost type " + type +
                                  " is unimplemented.")

def _build_forwardprop(self):
    """Compile a theano function for forward propagation.

    Usage:
        _build_forwardprop()

    This method is used by __init__ to create the forwardprop method.
    Not intended to be accessed publicly.
    """
    # Make theano symbols for input and output
    self._inpt = tensor.fmatrix("inpt")
    self._otpt = tensor.fmatrix("otpt")
    # Feed each layer's symbolic output into the next layer
    self.layers[0].make_output(self._inpt)
    for layer in self.layers:
        if layer.id != 0:
            layer.make_output(self.layers[layer.id - 1].output)
    self._output = self.layers[-1].output
    # Compile forwardprop method
    self.forwardprop = function(inputs=[self._inpt],
                                outputs=self._output,
                                allow_input_downcast=True)
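
# Side note: since inpt is an fmatrix, forwardprop already accepts a whole
# matrix of examples (one row per example) and returns one output row per
# input row.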

def _build_backprop(self, rate: "float", reg_coeff: "float"):
    """Compile a theano function for backpropagation.

    Usage:
        _build_backprop(rate, reg_coeff)

    Arguments:
        rate -- The learning rate for the network.
        reg_coeff -- The L2 regularization coefficient.

    This method is used by __init__ to create the backprop method.
    Not intended to be accessed publicly.
    """
    # L2 regularization expression: sum of squared weights over all layers
    regularize = 0
    for layer in self.layers:
        regularize += (layer.weights ** 2).sum()
    self.cost = (tensor.mean(self._costfunc(self._output, self._otpt)) +
                 (reg_coeff * regularize))
    # Gather every layer's weights and biases
    self.params = []
    for layer in self.layers:
        self.params.append(layer.params[0])
        self.params.append(layer.params[1])
    self._gradients = tensor.grad(cost=self.cost, wrt=self.params)
    # One plain gradient descent step per parameter
    self._updates = []
    for grad, param in zip(self._gradients, self.params):
        self._updates.append([param, param - (rate * grad)])
    # Compile backprop method
    self.backprop = function(inputs=[self._inpt, self._otpt],
                             outputs=self.cost,
                             updates=self._updates,
                             allow_input_downcast=True)
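
# Note: because the cost takes a mean over the rows of inpt/otpt, this
# compiled backprop already averages the gradient over however many
# examples I pass in at once -- which I think is most of a minibatch
# update already.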

def train(self, data: "list of lists", epochs: "integer"):
    """Train the neural network using SGD.

    Usage:
        train(data, epochs)

    Arguments:
        data -- A list of training examples, each of the form
                [input vector, intended output vector].
        epochs -- The number of epochs to train for.

    This method updates the weights and biases of the network using the
    backprop method.
    """
    for epoch in range(epochs):
        random.shuffle(data)
        # Plain SGD: one parameter update per training example
        for item in data:
            self.backprop([item[0]], [item[1]])
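
In case the data format matters for an answer, here's roughly how I call train right now (toy numbers; net stands in for an instance of my class, whose constructor isn't shown above):

Code:
# Each example is [input vector, target vector]; train wraps each vector
# in an outer list so the network sees 1-row matrices.
xor_data = [[[0.0, 0.0], [0.0]],
            [[0.0, 1.0], [1.0]],
            [[1.0, 0.0], [1.0]],
            [[1.0, 1.0], [0.0]]]
net.train(xor_data, 1000)
print(net.forwardprop([[0.0, 1.0]]))  # hopefully something close to [[1.0]]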