

Interdisciplinary Lecture Series, No. 16: Solving the Vanishing/Exploding Gradients Problem via High-Dimensional Probability Theory

Published: 2023-09-27




Host: Professor Zhouchen Lin (林宙辰)


Time: 2023/10/12, 10:00 - 11:00

Venue: Room 115, Teaching Building, Changping Campus / Room 2736, Science Building No. 2, Yanyuan Campus



Talk title: Solving the Vanishing/Exploding Gradients Problem via High-Dimensional Probability Theory


The problem of vanishing and exploding gradients is a long-standing obstacle to the effective training of neural networks. Although various tricks and techniques have been employed to alleviate the problem in practice, satisfactory theories and provable solutions are still lacking. In this paper, we address the problem from the perspective of high-dimensional probability theory. We provide a rigorous result showing that, under mild conditions, the vanishing/exploding gradients problem disappears with high probability if the neural network is sufficiently wide. Our main idea is to constrain both forward and backward signal propagation in a nonlinear neural network through orthogonal weight matrices and a new class of activation functions, namely Gaussian-Poincaré normalized functions. Experiments on both synthetic and real-world data validate our theory and confirm its effectiveness when applied to very deep neural networks in practice.
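
For readers who want to experiment with the idea before the talk, the following is a minimal sketch and not the speaker's implementation. The abstract does not define Gaussian-Poincaré normalized functions, so the sketch assumes the following reading: an activation phi is normalized when E[phi(Z)^2] = 1 and E[phi'(Z)^2] = 1 for Z ~ N(0, 1). It rescales tanh as a * tanh(x) + b to meet those two conditions and passes a signal through layers with random orthogonal weights. The names `gauss_mean`, `gpn_tanh`, `random_orthogonal`, and `forward_norm_ratios`, as well as the widths and depths used, are hypothetical choices made for illustration.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes and weights for expectations under Z ~ N(0, 1).
NODES, WEIGHTS = hermegauss(80)
WEIGHTS = WEIGHTS / np.sqrt(2.0 * np.pi)


def gauss_mean(values):
    """Approximate E[g(Z)] given g evaluated at the quadrature nodes."""
    return float(np.sum(WEIGHTS * values))


def gpn_tanh_coeffs():
    """Find a, b so that phi(x) = a * tanh(x) + b satisfies
    E[phi(Z)^2] = 1 and E[phi'(Z)^2] = 1 (assumed normalization)."""
    f = np.tanh(NODES)
    df = 1.0 - f ** 2                       # derivative of tanh
    a = 1.0 / np.sqrt(gauss_mean(df ** 2))  # fixes the derivative condition
    mean = gauss_mean(f)
    var = gauss_mean(f ** 2) - mean ** 2
    # Solve E[(a f + b)^2] = 1 for b; the Gaussian Poincare inequality
    # Var[f(Z)] <= E[f'(Z)^2] keeps the square root real.
    b = -a * mean + np.sqrt(max(0.0, 1.0 - a ** 2 * var))
    return a, b


A, B = gpn_tanh_coeffs()


def gpn_tanh(x):
    """Rescaled tanh under the assumed normalization."""
    return A * np.tanh(x) + B


def random_orthogonal(n, rng):
    """Random n x n orthogonal matrix from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))          # column sign fix, keeps q orthogonal


def forward_norm_ratios(depth, width, seed=0):
    """Return ||x_l||^2 / width after each layer of x -> gpn_tanh(W x)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    x *= np.sqrt(width) / np.linalg.norm(x)  # start with ||x||^2 = width
    ratios = []
    for _ in range(depth):
        x = gpn_tanh(random_orthogonal(width, rng) @ x)
        ratios.append(float(x @ x) / width)
    return ratios


if __name__ == "__main__":
    print("a, b =", A, B)
    print(forward_norm_ratios(depth=20, width=512))
```

Running the script prints the fitted coefficients and the per-layer squared-norm ratio, which can be compared against a plain tanh network to see how the forward signal behaves as depth grows. The backward pass is not shown here; it would track the gradient norm through the same layers.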


