TensorFlow in Practice
TensorFlow and Keras together are an efficient way to train deep neural networks.
**********************************************************
Some ready-made functions:
Sequential: defines a SEQUENCE of layers in the neural network.
Flatten: remember earlier where our images were a square when you printed them out? Flatten just takes that square and turns it into a one-dimensional array.
Dense: adds a layer of neurons.
Each layer of neurons needs an activation function to tell it what to do. There are lots of options, but just use these two for now.
ReLU effectively means “if X > 0, return X, else return 0”, so what it does is pass only values of 0 or greater to the next layer in the network.
Softmax takes a set of values and effectively picks the biggest one. For example, if the output of the last layer looks like [0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05], it saves you from fishing through it looking for the biggest value by effectively turning it into [0, 0, 0, 0, 1, 0, 0, 0, 0]. (Strictly speaking, softmax converts the values into probabilities that sum to 1, with the largest input getting by far the highest probability.) The goal is to save a lot of coding!
**********************************************************
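Putting those building blocks together, here is a minimal sketch of a model that uses all of them; the layer sizes are illustrative choices, not prescriptions:

```python
import tensorflow as tf

# A minimal sequential model: Flatten turns each 28x28 image into
# 784 values, one hidden Dense layer uses relu, and the softmax
# output layer has one neuron per class.
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```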
Below we would like to highlight some useful things to know before training a model with TensorFlow.
These points are gathered from deeplearning.ai course materials.
1- You’ll notice that all of the pixel values in the images are between 0 and 255. If we are training a neural network, for various reasons it’s easier if we treat all values as being between 0 and 1, a process called ‘normalizing’…and fortunately in Python it’s easy to normalize an array like this without looping (see the first sketch after this list).
2- Now you might be wondering why there are 2 sets…training and testing. The idea is to have 1 set of data for training, and then another set of data…that the model hasn’t yet seen…to see how good it would be at classifying values (the first sketch after this list loads both sets).
3- By adding more neurons we have to do more calculations, slowing down the process, but in this case they have a good impact: we do get more accurate. That doesn’t mean it’s always a case of ‘more is better’; you can hit the law of diminishing returns very quickly! (A variant with more neurons appears in the second sketch after this list.)
4- The first layer in your network should be the same shape as your data. Right now our data is 28x28 images, and 28 layers of 28 neurons would be infeasible, so it makes more sense to ‘flatten’ that 28x28 into a 784x1 array. Instead of writing all the code to handle that ourselves, we add the Flatten() layer at the beginning.
5- The number of neurons in the last layer should match the number of classes you are classifying. In this case it’s the digits 0–9, so there are 10 of them; hence you should have 10 neurons in your final layer.
6- As for the effect of additional layers in the network: here there isn’t a significant impact, because this is relatively simple data. For far more complex data (including color images to be classified as flowers, which you’ll see in the next lesson), extra layers are often necessary (the second sketch after this list adds one).
7- As for the impact of training for more or fewer epochs: you might see the loss value stop decreasing, and sometimes increase. This is a side effect of something called ‘overfitting’, which you can learn about [somewhere], and it’s something you need to keep an eye out for when training neural networks. There’s no point in wasting your time training if you aren’t improving your loss, right?
8- XX% accuracy might be enough for you, and if you reach that after 3 epochs, why sit around waiting for many more epochs to finish? That’s what callbacks are for (see the final sketch below).
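For points 1 and 2, here is a minimal sketch of loading a dataset that already comes split into a training set and a test set, and normalizing it without looping. Fashion MNIST is used purely as an illustrative example:

```python
import tensorflow as tf

# load_data() returns 2 sets: one for training, and one the model
# never sees during training, used only to evaluate it afterwards.
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()

# Pixel values are between 0 and 255; dividing the whole array by
# 255.0 normalizes every value to the 0-1 range in one step, no loop.
train_images = train_images / 255.0
test_images = test_images / 255.0
```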
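For points 3 to 6, here is a sketch of a wider, deeper variant of the earlier model. The layer sizes are arbitrary choices for illustration; whether the extra neurons and the extra layer actually help depends on your data:

```python
import tensorflow as tf

# Same shape constraints as before: Flatten matches the 28x28 input,
# and the final layer still has 10 neurons, one per class.
bigger_model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation='relu'),   # more neurons
    tf.keras.layers.Dense(256, activation='relu'),   # an extra layer
    tf.keras.layers.Dense(10, activation='softmax')
])
```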
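For points 7 and 8, here is a sketch of a callback that stops training as soon as a target accuracy is reached, so you don’t waste epochs. The class name and the 95% threshold are just example values:

```python
import tensorflow as tf

class StopAtTargetAccuracy(tf.keras.callbacks.Callback):
    """Stop training once training accuracy reaches a target value."""

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get('accuracy', 0.0) >= 0.95:  # example threshold
            print('\nTarget accuracy reached, stopping training.')
            self.model.stop_training = True

# Usage, with the model and data from the earlier sketches:
# model.fit(train_images, train_labels, epochs=20,
#           callbacks=[StopAtTargetAccuracy()])
```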