Pseudo-code

1. Assign matrix of inputs P and vector of targets T as usual (MATLAB's neural network functions expect one sample per column).
2. for i = 1:size(P, 2) # loop over each sample (column) of the input data
     POneLeftOut = P;
     TOneLeftOut = T;
     POneLeftOut(:,i) = []; TOneLeftOut(:,i) = []; # remove the ith sample from both, in order to 'leave out' one instance of the data
     set up new neural network with POneLeftOut, TOneLeftOut as arguments
     set portion of samples used for testing to 0
     set portion of samples used for training to .75
     set portion of samples used for crossval to .25
     train network on POneLeftOut, TOneLeftOut
     simulate network on ith column of P (the one sample excluded earlier)
     assign this output as ith entry of some vector, like testForecast(i)
   end
3. compute mean squared error (or whatever error measurement you like)
by comparing values of testForecast to those of T.  Programming this manually
would be a good test to see if you truly understand what MSE is.  If not,
you can just use the built-in mse function.
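
A minimal runnable sketch of the loop above, written in Python so it can be tried without MATLAB. Everything in it is an assumption made for illustration: the stand-in model fit_line (an ordinary least-squares line, not a neural network), the helper names, and the toy data. The .75/.25 training/crossval division from step 2 has no counterpart here, since a least-squares line needs no validation set. MSE is programmed manually, as step 3 suggests.

```python
def fit_line(xs, ys):
    """Stand-in for 'train network': least-squares line y = a*x + b (illustrative only)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def predict(model, x):
    """Stand-in for 'simulate network' on a single sample."""
    a, b = model
    return a * x + b


def mse(forecasts, targets):
    """Mean squared error: average of the squared differences."""
    return sum((f - t) ** 2 for f, t in zip(forecasts, targets)) / len(targets)


def leave_one_out(P, T):
    """Return one forecast per sample, each made by a model that never saw that sample."""
    test_forecast = []
    for i in range(len(P)):
        # Copy the data with the ith sample removed.
        P_one_left_out = P[:i] + P[i + 1:]
        T_one_left_out = T[:i] + T[i + 1:]
        # Train a fresh model on everything except sample i.
        model = fit_line(P_one_left_out, T_one_left_out)
        # Test on the one sample that was excluded.
        test_forecast.append(predict(model, P[i]))
    return test_forecast


# Toy data (hypothetical): roughly linear with a little noise.
P = [0.0, 1.0, 2.0, 3.0, 4.0]
T = [0.1, 0.9, 2.2, 2.8, 4.1]

test_forecast = leave_one_out(P, T)
loo_mse = mse(test_forecast, T)
print("leave-one-out MSE:", loo_mse)
```

Swapping fit_line and predict for a real network's train and simulate calls would leave the loop itself unchanged.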

Explanation

Multiple networks are created and trained, each with slightly different training and testing data. In each neural network instance, only one piece of data is set aside as testing data (instead of the normal 20% of the data samples). Each network is trained on the entire dataset except for the one entry that has been removed, and then that single entry is run through the network as a test. In this way, you don’t need to dedicate a portion of your dataset to testing, and you also avoid the error of using the same data for training and testing, which would overestimate prediction ability.

You can then use the resulting vector of forecasted test-data outputs to compare to the real outputs, in order to obtain forecast error rates that are comparable to other similarly run neural networks.

Why

Running a neural network once in the normal way (60% train, 20% validate, 20% test) does not give an error rate (MSE) that compares reliably with other networks’ forecast error rates. Even if you run the exact same setup on the same data multiple times, you may get slightly different MSE. This is because most neural networks involve randomness, both initially and throughout training. They start from a random guess at an adequate beginning point and then try to drive forecast error to a minimum. This, along with other factors, can lead to slightly different final weights and therefore slightly different error rates.
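
To see this effect in isolation, here is a small Python sketch. It is an assumption for illustration, not MATLAB's training algorithm: a network with a single tanh hidden unit, trained by plain gradient descent on made-up data. The data and settings are identical in every run; only the random seed for the initial weights changes.

```python
import math
import random


def train_tiny_net(xs, ts, seed, epochs=500, lr=0.1):
    """Train y = w2 * tanh(w1 * x + b1) + b2 by gradient descent; return final training MSE."""
    rng = random.Random(seed)
    # Random starting point: this is the only thing that differs between runs.
    w1, b1, w2, b2 = (rng.uniform(-1.0, 1.0) for _ in range(4))
    n = len(xs)
    for _ in range(epochs):
        g_w1 = g_b1 = g_w2 = g_b2 = 0.0
        for x, t in zip(xs, ts):
            h = math.tanh(w1 * x + b1)
            err = (w2 * h + b2) - t
            # Gradients of the mean squared error with respect to each weight.
            g_w2 += 2.0 * err * h / n
            g_b2 += 2.0 * err / n
            back = 2.0 * err * w2 * (1.0 - h * h) / n
            g_w1 += back * x
            g_b1 += back
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return sum((w2 * math.tanh(w1 * x + b1) + b2 - t) ** 2 for x, t in zip(xs, ts)) / n


# Made-up data: 21 points of a smooth curve.
xs = [i / 10.0 for i in range(-10, 11)]
ts = [math.sin(2.0 * x) for x in xs]

final_mse = [train_tiny_net(xs, ts, seed) for seed in range(5)]
for seed, value in enumerate(final_mse):
    print("seed", seed, "final MSE", value)
```

The printed values should differ from seed to seed even though nothing else changed, which is why a single run is a shaky basis for comparing networks.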

Concept

Here is the concept behind leave-one-out. Think of cross-validation samples as training samples, since they are used in the training process. Then with the default MATLAB settings, you use 80% (60% training + 20% crossval) for training, and the remaining 20% for testing. Notice that we don’t test on any samples that were used for training, because that would give us too optimistic an estimate of how the network will perform on new data. Since the weights of the network depend on the training set, we only test on an independent test set which was not used to determine those weights. So there is a tradeoff between the size of the training set and the size of the test set: the larger your training set, the smaller your test set must be, and vice versa, unless you have unlimited data.

Leave-one-out allows us to get around this tradeoff, at the expense of more computation (n times more if you have n samples in your data set). In the leave-one-out algorithm, we allocate all but one of the samples to the training set (this is probably 99% or more of the entire set). After training, we test on the single sample which was left out of the training set. So we have achieved a very large training set (almost 100%), but only a very small testing set. But then this process is repeated n times, each time using a new network and choosing a different sample to leave out of the training set and use for testing.

The result is that we have artificially created a maximal test set, comprising 100% of the data (we have n test values at the end), but never once have we made the mistake of testing a network on a sample which was used for training that network. And there was almost no loss of training set size, since each training set included 99% of the data. So each network that we trained was very similar to the network which would result from using 100% of the samples for training.
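
The bookkeeping behind this claim can be checked with a short Python sketch. The function name leave_one_out_splits and the choice of n = 100 are assumptions for illustration; no network is trained here, only the index sets are built and inspected.

```python
def leave_one_out_splits(n):
    """Yield (train_indices, test_index) for each of the n leave-one-out rounds."""
    for i in range(n):
        train = [j for j in range(n) if j != i]
        yield train, i


n = 100  # hypothetical number of samples
splits = list(leave_one_out_splits(n))

# One network per sample.
print("networks to train:", len(splits))

# Each network trains on all but one sample.
print("training fraction per network:", len(splits[0][0]) / n)

# Taken together, the test samples cover the whole data set exactly once.
tested = sorted(test for _, test in splits)
print("every sample tested once:", tested == list(range(n)))

# No network is ever tested on a sample it was trained on.
print("any overlap:", any(test in train for train, test in splits))
```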

Caveats

  1. Once you want to use the network on current input data to forecast future outputs, which network do you actually use?
    1. If you somehow average or combine weights from all the various networks, that in itself may be destructive to a delicately balanced network.
    2. If you pick one network of the bunch, it may not be representative of the bunch.
    3. Maybe the answer is saving all of the various networks’ weights, using each of them in turn to predict output, and then combining those results?
    4. ANSWER: Train with 100% of the data
      1. FIXME I am curious whether training with 100% of the data more than once would find substantially different minima.
  2. Leave-one-out may not work on time-series data using MATLAB’s ‘dynamic neural network’ (time-series adjusting) method
    1. I am not as sure of this, and more thought needs to be put into it, as dynamic network functionality is not yet understood

Matt’s Notes

One word of caution about something that misled me (Matt) for a long time, and the reason for which I started doing that “leave-one-out” testing:

As you know, when a neural network is set up, MATLAB randomly allocates the data into 3 groups (train, validate, test). The MSE you are probably seeing when you run the network is the one MATLAB outputs automatically, which is the training MSE. This MSE reflects better performance than you will actually get, since it’s based on data that was used to train the network. The MSE to look at is the test data MSE. You can get it by typing [net, tr] = train(net, P, T) instead of just net = train(net, P, T). Then look at the last entry in the vector tr.tperf.

Also, since the allocation is random, it’s not even fair to compare networks based on the testing performance of a single run. It might be the case that the random training set for one network was “easy” to predict compared to that of the other network. If you don’t want to get into the leave-one-out method (which I can send you when you are ready to do it), the best alternative is to write a script which will train and simulate, say, 1000 networks, each with different testing sets and therefore different test MSEs. Then average those MSEs and you have a much more accurate assessment of your network’s performance, suitable to compare to other networks tested with the same routine.
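
The averaging script described above might look roughly like this in Python. This is a sketch under stated assumptions: the stand-in "network" is a nearest-neighbour lookup (it forecasts the target of the closest training input), the data is made up, and the function names are hypothetical. Only the structure carries over: split at random, train, score on the test part only, repeat, then average.

```python
import random


def split_test_mse(P, T, test_fraction, rng):
    """Make one random split and return the MSE measured on the test samples only."""
    indices = list(range(len(P)))
    rng.shuffle(indices)
    n_test = max(1, round(test_fraction * len(P)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    squared_errors = []
    for i in test_idx:
        # Stand-in for a trained network: copy the target of the nearest training input.
        nearest = min(train_idx, key=lambda j: abs(P[j] - P[i]))
        squared_errors.append((T[nearest] - T[i]) ** 2)
    return sum(squared_errors) / n_test


def averaged_test_mse(P, T, runs=1000, test_fraction=0.2, seed=0):
    """Repeat the random split many times; return (mean, lowest, highest) test MSE."""
    rng = random.Random(seed)
    scores = [split_test_mse(P, T, test_fraction, rng) for _ in range(runs)]
    return sum(scores) / runs, min(scores), max(scores)


# Made-up data: a noisy straight line with 20 samples.
data_rng = random.Random(1)
P = [float(i) for i in range(20)]
T = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.5) for x in P]

mean_mse, lowest, highest = averaged_test_mse(P, T)
print("mean test MSE:", mean_mse)
print("lowest single run:", lowest)
print("highest single run:", highest)
```

The gap between the lowest and highest single-run values shows how much one run's test MSE depends on the luck of the split, while the mean is the figure to compare across networks.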

 
personal/school/financialitsystems_is698/neural_net_evol_alg_project/documentation/leave_one_out.txt · Last modified: 11.20.2008 12:14 by 69.134.58.89
 