An overfit model is like a child who mugged up the answers without learning the underlying theory. Such models achieve a low loss during training but do a poor job of predicting on new data. If a model fits the current sample too closely, how can we trust it to make good predictions on new data? Overfitting is caused by making a model more complex than necessary. A fundamental rule of machine learning is to fit the data as simply as possible.
If a picture of a cat is shown to children, they can easily connect the word "cat" with the shape of a cat. If the picture is shown upside down, some children may have difficulty recognizing it. The teacher then needs to tell the child that this, too, represents a cat. Now the child realizes that the shape of an object is independent of its orientation. When working with neural networks and deep learning, a technique called data augmentation is used to provide the same orientation independence. Data augmentation means generating new incarnations of the data from the data you already have. This is often done programmatically by modifying the images in the dataset with random flips and shifts. It makes the training dataset larger, helps the model generalize the shape of the object represented in each image, and teaches the model that the shape is independent of the position and orientation of the objects in the images. This is precisely what a kindergarten teacher does for the child. So data augmentation helps both the child and the model to generalize easily and learn fast.
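To make the idea concrete, here is a minimal pure-Python sketch of random flips and shifts. It is not a real library API: the `augment` function and the tiny list-of-rows "image" are illustrative stand-ins for what augmentation code does to real image arrays.

```python
import random

def augment(image, rng):
    """Return a randomly flipped and shifted copy of a 2-D image,
    given as a list of rows (a stand-in for a real image array)."""
    out = [row[:] for row in image]
    # Random horizontal flip: the cat is still a cat when mirrored.
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]
    # Random horizontal shift by -1, 0, or 1 pixels, padding with zeros.
    dx = rng.randint(-1, 1)
    if dx > 0:
        out = [[0] * dx + row[:-dx] for row in out]
    elif dx < 0:
        out = [row[-dx:] + [0] * (-dx) for row in out]
    return out

rng = random.Random(0)
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
# Four new incarnations of the same image, each the same size as the original.
augmented = [augment(img, rng) for _ in range(4)]
```

Each call produces a slightly different version of the same object, which is exactly what teaches the model that shape does not depend on position or orientation.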
Most machine learning libraries provide an image augmentation API that can be used to create modified versions of the images in the training dataset just-in-time. Overfitting is the enemy of generalization, since it makes the learner mug up the training data so closely that it does not generalize to new data. Data augmentation helps avoid overfitting by exposing all features of the object to the learner, whether it is a child or a deep learning model. To summarize: if you are not able to generalize well, your level of intelligence is low. This is true for both humans and machine learning models.
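The "just-in-time" part can be sketched as a generator that augments each batch as it is requested, instead of enlarging the dataset up front. This is only an illustration of the pattern library APIs follow; the names `augmented_batches` and `flip_horizontal` are ours, not from any particular library.

```python
import random

def flip_horizontal(image):
    """Reverse each row: a horizontal flip of a list-of-rows image."""
    return [row[::-1] for row in image]

def augmented_batches(images, labels, batch_size, rng):
    """Yield training batches just-in-time, randomly flipping each
    image. A sketch of what library augmentation APIs do under the
    hood: the augmented copies are created on demand, never stored."""
    while True:
        idx = rng.sample(range(len(images)), batch_size)
        batch = []
        for i in idx:
            img = images[i]
            if rng.random() < 0.5:
                img = flip_horizontal(img)
            batch.append((img, labels[i]))
        yield batch

images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]]
labels = ["cat", "dog", "cat"]
gen = augmented_batches(images, labels, batch_size=2, rng=random.Random(1))
first = next(gen)  # each call produces a fresh, possibly-flipped batch
```

Because augmentation happens per batch, the model effectively sees a larger dataset without any extra storage.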
As you grow older, you master the technique of generalization and your intelligence grows. Feel free to generalize and learn fast. See you next time…