    Research by PhD Student Lu Zhixing (逯志兴) Accepted by IPM
    2025-03-23 21:11 Lu Junyu (卢俊宇)

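    Research by lab PhD student Lu Zhixing (逯志兴) on improving the generalization ability of optimizers was recently accepted by the journal Information Processing & Management (IPM). IPM is ranked in Zone 1 of the Chinese Academy of Sciences (CAS) journal ranking and is a CCF-recommended Category B journal.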


    Title: Enhancing Orthogonality of Parameter Vectors to Improve Generalization in Deep Neural Networks with Momentum Optimizer


    Abstract: Momentum is a widely adopted technique in deep neural network (DNN) optimization, recognized for enhancing performance. However, our analysis indicates that momentum is not always beneficial for the network. We theoretically demonstrate that increasing the orthogonality of parameter vectors significantly improves the generalization ability of DNNs, while momentum tends to reduce this orthogonality. Our results further show that integrating normalization and residual connections into DNN architectures helps preserve orthogonality, thereby enhancing the generalization of networks optimized with momentum. Extensive experiments across multilayer perceptrons (MLPs), convolutional neural networks (CNNs), and Transformers validate our theoretical findings. Finally, we find that the parameter vectors of commonly used pre-trained language models (PLMs) all maintain better orthogonality.


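    Chinese abstract (translated): Momentum is a technique widely used in deep neural network optimizers and usually improves the convergence speed and stability of the network. However, our study shows that momentum is not always beneficial to the network. We prove theoretically that increasing the orthogonality of parameter vectors improves the generalization ability of deep neural networks, whereas momentum reduces the orthogonality of the parameters. Furthermore, we find that integrating normalization and residual connections into deep neural network architectures helps improve parameter orthogonality, thereby enhancing generalization under momentum optimizers. The theory is extensively validated on multilayer perceptrons, convolutional neural networks, and Transformer-based pre-trained models. Our study helps advance the theory of deep neural network optimization.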
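
    To make "orthogonality of parameter vectors" concrete, the minimal sketch below scores a weight matrix by the mean absolute cosine similarity between its distinct rows. This is only an illustrative assumption: the announcement does not state which measure the paper uses, and the function name `mean_abs_cosine`, the choice of rows as the parameter vectors, and the `eps` guard are hypothetical. A score near 0 means the rows are close to mutually orthogonal; a score near 1 means they are nearly collinear.

```python
import numpy as np


def mean_abs_cosine(weights: np.ndarray, eps: float = 1e-12) -> float:
    """Mean absolute cosine similarity between distinct rows of `weights`.

    Hypothetical helper for illustration; not the metric from the paper.
    Returns 0.0 for mutually orthogonal rows and 1.0 for collinear rows.
    """
    w = np.asarray(weights, dtype=np.float64)
    if w.ndim != 2 or w.shape[0] < 2:
        raise ValueError("expected a 2-D array with at least two rows")

    # Normalize each row to unit length; eps avoids division by zero.
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    unit = w / np.maximum(norms, eps)

    # The Gram matrix of unit rows holds all pairwise cosine similarities.
    gram = unit @ unit.T

    # Drop the diagonal (each row compared with itself) and average the rest.
    off_diagonal = gram[~np.eye(gram.shape[0], dtype=bool)]
    return float(np.mean(np.abs(off_diagonal)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(mean_abs_cosine(np.eye(3)))                     # orthonormal rows -> 0.0
    print(mean_abs_cosine(rng.standard_normal((8, 64))))  # random rows, small but nonzero
```

    Under these assumptions, one could apply the function to a layer's weight matrix at several training checkpoints, with and without momentum, to see how the score changes over time.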


