This project evaluates the performance and convergence of several stochastic gradient descent (SGD) variants and classical gradient descent under different step-size settings. The methods are tested on two synthetic datasets, one with high and one with low correlation among the features, on a ridge-penalized empirical risk minimization problem for regression.
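For concreteness, the setup can be sketched roughly as follows, assuming an equicorrelated Gaussian feature design (pairwise correlation ρ) and the usual ridge objective (1/2n)‖Xw − y‖² + (λ/2)‖w‖²; the names and parameters below (make_correlated_data, ridge_loss, lam, noise) are illustrative and need not match the project code.

```python
import numpy as np

def make_correlated_data(n=1000, d=20, rho=0.1, noise=0.5, seed=0):
    """Sample features from N(0, Sigma) with Sigma[i, j] = rho off the
    diagonal and 1 on it, then build targets from a linear model."""
    rng = np.random.default_rng(seed)
    cov = np.full((d, d), rho) + (1.0 - rho) * np.eye(d)
    X = rng.multivariate_normal(np.zeros(d), cov, size=n)
    w_true = rng.standard_normal(d)
    y = X @ w_true + noise * rng.standard_normal(n)
    return X, y, w_true

def ridge_loss(w, X, y, lam):
    """Ridge-penalized empirical risk: (1/2n)||Xw - y||^2 + (lam/2)||w||^2."""
    n = X.shape[0]
    r = X @ w - y
    return 0.5 * (r @ r) / n + 0.5 * lam * (w @ w)

def ridge_grad(w, X, y, lam):
    """Full gradient of the ridge objective above."""
    n = X.shape[0]
    return X.T @ (X @ w - y) / n + lam * w
```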
- Low Correlation Data (ρ=0.1): The SGD switch variant performs well initially, but SGD with momentum (β=0.4) ultimately provides the best performance, with the lowest error and most stable convergence.
- High Correlation Data (ρ=0.7): All methods are impacted by high correlation, but SGD with momentum (β=0.6) outperforms the others by achieving the lowest loss and distance to the minimum, effectively managing the noise introduced by correlated features.
- Low Correlation Data (ρ=0.1): Gradient Descent performs best overall, converging smoothly to very low loss values and effectively reaching the minimum. SGD momentum with β = 0.4 also performs well, surpassing SGD switch in later stages with lower loss and better convergence, though only Gradient Descent effectively reaches the minimum.
- High Correlation Data (ρ=0.7): SGD momentum with β = 0.6 outperforms the other variants, achieving the lowest loss and distance to the minimum (the momentum update is sketched after this list). Gradient Descent, while smooth, converges more slowly on the highly correlated data, and SGD switch struggles with oscillations. Overall, though, all methods perform relatively similarly in the presence of high correlation.
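The momentum variant referred to in these results can be sketched with the common heavy-ball update v ← βv − γ∇fᵢ(w), w ← w + v, alongside a full-batch gradient descent baseline; the constant stepsize gamma, the per-sample sampling scheme, and the zero initialization are illustrative assumptions, not necessarily the exact settings used in the experiments.

```python
import numpy as np

def sgd_momentum(X, y, lam, beta=0.4, gamma=0.01, n_epochs=50, seed=0):
    """Heavy-ball SGD on the ridge objective: one random sample per step,
    with a velocity term that accumulates past stochastic gradients."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, v = np.zeros(d), np.zeros(d)
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            # Stochastic gradient on sample i (the ridge term is kept exact).
            g = X[i] * (X[i] @ w - y[i]) + lam * w
            v = beta * v - gamma * g
            w = w + v
    return w

def gradient_descent(X, y, lam, gamma=0.1, n_iters=500):
    """Full-batch gradient descent baseline on the same ridge objective."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        g = X.T @ (X @ w - y) / n + lam * w
        w = w - gamma * g
    return w
```

A larger β averages over more past stochastic gradients, which is consistent with β = 0.6 coping better with the noisier gradients of the highly correlated data, while β = 0.4 suffices when the correlation is low.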
This project investigated various gradient methods on two datasets with different correlation levels, focusing on ridge-regularized linear regression. The experiments highlighted that high feature correlation introduces more noise and instability into the stochastic gradient estimates, which makes convergence harder. In such cases, strategies such as shrinking step sizes or momentum performed better by managing the noise effectively. Low correlation, on the other hand, resulted in less noise and smoother, more efficient convergence across all methods. The main takeaway is that the choice of gradient strategy should account for the correlation structure of the data in order to obtain good regression results.
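As an illustration of the shrinking-stepsize idea, one common form of a "switch" schedule is to keep the stepsize constant for a warm-up phase and then decay it roughly as 1/t; the sketch below assumes that form, and the switch point t_switch and the schedule itself are chosen purely for illustration.

```python
import numpy as np

def sgd_switch(X, y, lam, gamma0=0.05, t_switch=1000, n_epochs=50, seed=0):
    """SGD with a constant stepsize until step t_switch, then a shrinking
    stepsize gamma0 * t_switch / t, which progressively damps gradient noise."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            t += 1
            gamma = gamma0 if t <= t_switch else gamma0 * t_switch / t
            g = X[i] * (X[i] @ w - y[i]) + lam * w
            w = w - gamma * g
    return w
```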
A working computer with:
- a Python interpreter
- numpy
- matplotlib
- scipy
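Assuming a standard Python environment, these packages can typically be installed with `pip install numpy matplotlib scipy`.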
- Ahmed
- Najlaa
- Jeremie



