Article Details

Journal of Computers (indexed in EI, MEDLINE, Scopus)

Title: Effects of BP Algorithm-based Activation Functions on Neural Network Convergence
Volume/Issue: 29:1
Authors: Junguo Hu, Lili Xu, Xin Wang, Xiaojun Xu, Guangyun Su
Pages: 076-085
Keywords: activation functions; back-propagation convergence; gradient factor; initial weights
Publication Date: 2018-02
DOI: 10.3966/199115992018012901007

Abstract

Activation functions map data in artificial neural network computations. In practice, the choice of activation function and of its gradient and translation factors directly affects network convergence, yet these parameters are usually determined by trial and error. In this work, the Cauchy distribution function (Cauchy), the Laplace distribution function (Laplace), and the Gaussian error function (Erf) were used as new activation functions for the back-propagation (BP) algorithm, and their effects were compared with those of the sigmoid-type function (Logsig), the hyperbolic tangent function (Tansig), and the normal distribution function (Normal). The XOR problem was used in simulation experiments to evaluate the effect of these six activation functions on network convergence and to determine their optimal gradient and translation factors. The results show that the gradient factor and the initial weights significantly affect the convergence of the activation functions. The optimal gradient factors for Laplace, Erf-Logsig, Tansig-Logsig, Logsig, and Normal were 0.5, 0.5, 4, 2, and 1, respectively, and the best gradient-factor intervals were [0.5, 1], [0.5, 2], [2, 6], [1, 4], and [1, 2], respectively. Using the optimal gradient factors, the order of convergence speed was Laplace, Erf-Logsig, Tansig-Logsig, Logsig, and Normal. The functions Logsig (gradient factor = 2), Tansig-Logsig (gradient factor = 4), Normal (translation factor = 0, gradient factor = 1), Erf-Logsig (gradient factor = 0.5), and Laplace (translation factor = 0, gradient factor = 0.5) were less sensitive to the initial weights, so their convergence performance was less affected by weight initialization. As the gradient of the activation-function curve increased, network convergence accelerated. These conclusions can serve as a reference for selecting activation functions for BP algorithm-based feedforward neural networks.
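The abstract treats the six activation functions as sigmoid-like mappings parameterized by a gradient factor and a translation factor. The Python sketch below is one plausible reading of that setup, assuming each function is evaluated at z = a * (x - b), where a is the gradient factor and b the translation factor; the paper's exact parameterization may differ.

```python
# A minimal sketch of the six sigmoid-like activation functions compared in
# the paper. Here the gradient factor `a` scales the input and the
# translation factor `b` shifts it, i.e. each function is evaluated at
# z = a * (x - b); this parameterization is an assumption.
import math

def logsig(x, a=1.0, b=0.0):
    """Sigmoid-type function (Logsig), range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a * (x - b)))

def tansig(x, a=1.0, b=0.0):
    """Hyperbolic tangent function (Tansig), range (-1, 1)."""
    return math.tanh(a * (x - b))

def normal(x, a=1.0, b=0.0):
    """Normal (Gaussian) distribution function, range (0, 1)."""
    z = a * (x - b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def cauchy(x, a=1.0, b=0.0):
    """Cauchy distribution function, range (0, 1)."""
    return 0.5 + math.atan(a * (x - b)) / math.pi

def laplace(x, a=1.0, b=0.0):
    """Laplace distribution function, range (0, 1)."""
    z = a * (x - b)
    return 0.5 * math.exp(z) if z < 0.0 else 1.0 - 0.5 * math.exp(-z)

def erf_act(x, a=1.0, b=0.0):
    """Gaussian error function (Erf), rescaled to (0, 1)."""
    return 0.5 * (1.0 + math.erf(a * (x - b)))
```

Since the experiments train a BP network on the XOR problem and measure convergence, a minimal BP-on-XOR loop is sketched below using Logsig with its reported optimal gradient factor of 2. The network size (2-3-1), learning rate, initialization range, and stopping threshold are illustrative assumptions, not the paper's experimental setup.

```python
# A minimal back-propagation sketch on the XOR problem with Logsig
# activations and gradient factor A = 2 (the paper's reported optimum
# for Logsig). Network size, learning rate, weight range, and stopping
# rule are illustrative choices.
import math
import random

A = 2.0    # gradient factor for Logsig
LR = 0.5   # learning rate (assumed)
H = 3      # hidden units (assumed)
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def act(x):
    return 1.0 / (1.0 + math.exp(-A * x))

def act_deriv(y):
    # derivative of Logsig with gradient factor A, in terms of output y
    return A * y * (1.0 - y)

random.seed(0)
# weights for a 2 -> H -> 1 network; the last entry of each vector is a bias
w_h = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(H)]
w_o = [random.uniform(-0.5, 0.5) for _ in range(H + 1)]

for epoch in range(20000):
    sse = 0.0
    for (x1, x2), t in XOR:
        # forward pass
        h = [act(w[0] * x1 + w[1] * x2 + w[2]) for w in w_h]
        o = act(sum(w_o[i] * h[i] for i in range(H)) + w_o[H])
        err = t - o
        sse += err * err
        # backward pass (delta rule)
        d_o = err * act_deriv(o)
        d_h = [d_o * w_o[i] * act_deriv(h[i]) for i in range(H)]
        # weight updates
        for i in range(H):
            w_o[i] += LR * d_o * h[i]
        w_o[H] += LR * d_o
        for i, d in enumerate(d_h):
            w_h[i][0] += LR * d * x1
            w_h[i][1] += LR * d * x2
            w_h[i][2] += LR * d
    if sse < 1e-3:
        print(f"converged after {epoch + 1} epochs, SSE = {sse:.2e}")
        break
```

Under this reading, a larger gradient factor steepens the activation curve and scales up the back-propagated deltas, which is consistent with the abstract's observation that steeper curves accelerated convergence.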
