WSEAS Transactions on Mathematics


Print ISSN: 1109-2769
E-ISSN: 2224-2880

Volume 17, 2018


Adopting Some Good Practices to Avoid Overfitting in the Use of Machine Learning

AUTHORS: Imanol Bilbao, Javier Bilbao, Cristina Feniser

In Machine Learning, different techniques, methods and algorithms are applied in order to achieve a better approach to the problem being solved. Adaptive learning, self-organization of information, generalization, fault tolerance and real-time operation are some of the most exploited capabilities in this field. These systems are dynamic and can learn from the data, adapting to the nature of the information. However, an excessive adaptation or improvement of the response to the training data can in many cases lead to poor generalization. Excessive training with the same set of data will cause the classification curves to over-detail the particular variations of that set. To avoid this overfitting, certain precautions can be taken. One possible option is to use the regularization technique while keeping all the variables. This technique works well when we have many input parameters and each contributes 'a little' to the prediction. We can conclude that the number of input features, compared with the number of training samples, is really important to avoid overfitting.
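The following is a minimal sketch, not the authors' code, of the two points above: L2 regularization keeps every input variable while shrinking its weight, and a feature count far above the sample count is where overfitting bites. The synthetic data, the ridge_fit/mse helper names and the lambda values are all assumptions made for illustration.

import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_features = 20, 200, 50  # far more features than training samples

# Synthetic task in which every feature contributes 'a little' to the target.
true_w = rng.normal(0.0, 0.1, n_features)
X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
y_train = X_train @ true_w + rng.normal(0.0, 0.1, n_train)
y_test = X_test @ true_w + rng.normal(0.0, 0.1, n_test)

def ridge_fit(X, y, lam):
    # Closed-form L2-regularized least squares: w = (X^T X + lam*I)^{-1} X^T y.
    # lam = 0 would be ordinary least squares; lam > 0 shrinks all weights
    # toward zero without discarding any variable.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

for lam in (1e-8, 1.0, 10.0):  # near-zero vs. moderate vs. strong regularization
    w = ridge_fit(X_train, y_train, lam)
    print(f"lambda={lam:g}: train MSE={mse(w, X_train, y_train):.4f}, "
          f"test MSE={mse(w, X_test, y_test):.4f}")

With lam near zero, the 50-weight model reproduces the 20 training points almost exactly but its test error is much larger; a moderate lam trades a slightly worse training fit for markedly better generalization, while keeping all the input variables.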

KEYWORDS: Machine Learning, overfitting, underfitting, regularization.

REFERENCES:

[1] Jagannath Aghav, Poorwa Hirve, Mayura Nene, Deep Learning for Real-Time Collision Detection and Avoidance, Proceedings of the International Conference on Communication, Computing and Networking (ICCCN-2017), Chandigarh, March 2017.

[2] A. Y. Ng, Preventing “Overfitting” of Cross-Validation Data, Proceedings of the Fourteenth International Conference on Machine Learning (ICML '97), pp. 245-253, 1997. Available in July 2018 at: http://www.andrewng.org/portfolio/preventingoverfitting-of-cross-validation-data/

[3] James L. McClelland and David E. Rumelhart, Parallel Distributed Processing (2-vol. set): Explorations in the Microstructure of Cognition, MIT Press, Cambridge, MA, USA, 1986. Available in July 2018 at: https://mitpress.mit.edu/books/paralleldistributed-processing-2-vol-set.

[4] Richard Simon, Michael D. Radmacher, Kevin Dobbin, Lisa M. McShane, Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification, Journal of the National Cancer Institute, vol. 95, no. 1, pp. 14-18, 2003.

[5] Christophe Ambroise, Geoffrey J. McLachlan, Selection bias in gene extraction on the basis of microarray gene-expression data, Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 10, pp. 6562-6566, 2002.

[6] Jyothi Subramanian, Richard Simon, Overfitting in prediction models – Is it a problem only in high dimensions?, Contemporary Clinical Trials, vol. 36, pp. 636-641, 2013.

[7] Imanol Bilbao and Javier Bilbao, Solving problems for new results predictions in artificial neural networks, International Journal of Neural Networks and Advanced Applications, vol. 4, pp. 10-13, 2017.

[8] M. Hardt, B. Recht, and Y. Singer, Train faster, generalize better: Stability of stochastic gradient descent, Proceedings of the 33rd International Conference on Machine Learning, pp. 1225–1234, 20–22 Jun 2016.

[9] J. Lin, R. Camoriano, and L. Rosasco, Generalization properties and implicit regularization for multiple passes SGM, Proceedings of the 33rd International Conference on Machine Learning, pp. 2340–2348, 20–22 Jun 2016.

[10] V. I. Avrutskiy, Avoiding overfitting of multilayer perceptrons by training derivatives, arXiv preprint arXiv:1802.10301, 2018. Available at: https://arxiv.org/pdf/1802.10301

[11] A. S. Weigend, B. A. Huberman, and D. E. Rumelhart, Predicting sunspots and exchange rates with connectionist networks. In M. Casdagli and S. Eubank, editors, Nonlinear Modeling and Forecasting, SFI Studies in the Sciences of Complexity, Proceedings vol. XII, pp. 395–432. Addison-Wesley, 1992.

[12] L. Prechelt, Early Stopping - But When?, In: Orr G.B., Müller K.-R. (eds), Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science, vol. 1524, Springer, Berlin, Heidelberg, 1998.

[13] J. Loughrey, P. Cunningham, Overfitting in Wrapper-Based Feature Subset Selection: The Harder You Try the Worse it Gets, In: Bramer M., Coenen F., Allen T. (eds) Research and Development in Intelligent Systems XXI. SGAI 2004. Springer, London, 2005.

[14] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov, Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Journal of Machine Learning Research, vol. 15, pp. 1929-1958, 2014.

[15] Maren Mahsereci, Lukas Balles, Christoph Lassner, Philipp Hennig, Early Stopping without a Validation Set, arXiv preprint arXiv:1703.09580, 2017.

[16] Kaveh Mahdaviani, Helga Mazyar, Saeed Majidi and Mohammad H. Saraee, A Method to Resolve the Overfitting Problem in Recurrent Neural Networks for Prediction of Complex Systems' Behavior, 2008 International Joint Conference on Neural Networks (IJCNN 2008), pp. 3723-3728, 2008.

[17] Jeffrey L. Elman, Finding structure in time, Cognitive Science, vol. 14, pp. 179-211, 1990.

[18] S. Chiewchanwattana, C. Lursinsap, C. H. Chu, Time-series data prediction based on reconstruction of missing samples and selective ensembling of FIR neural networks, Proceedings of the 9th International Conference on Neural Information Processing, 2002.

[19] Harris Drucker, Corinna Cortes, L. D. Jackel, Yann LeCun and Vladimir Vapnik, Boosting and Other Ensemble Methods, Neural Computation, vol. 6, pp. 1289-1301, 1994.

[20] De-Wang Chen and Jun-Ping Zhang, Time series prediction based on ensemble ANFIS, 2005 International Conference on Machine Learning and Cybernetics, Guangzhou, China, vol. 6, pp. 3552-3556, 2005, doi: 10.1109/ICMLC.2005.1527557.

WSEAS Transactions on Mathematics, ISSN / E-ISSN: 1109-2769 / 2224-2880, Volume 17, 2018, Art. #34, pp. 274-279


Copyright © 2018. Author(s) retain the copyright of this article. This article is published under the terms of the Creative Commons Attribution License 4.0.
