It is shown that there is a general expression for back-propagation rules, in contrast to previous studies, which have implicitly assumed that there is just one back-propagation rule. An infinite number of possible learning rules obey the general expression. A number of other rules are presented and, along with the original rule, applied to a small but difficult pattern-mapping task using a three-layer network. The performance criteria investigated include the dependence of the rate of learning on the learning-rate parameter η, the information required by the learning rule, and the discreteness of the hidden-unit representations learned. The original algorithm is not the best by any of these criteria. It is suggested that the problems observed with the original back-propagation learning rule, such as the deterioration of performance in very large networks, may not arise for some of the other rules.
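For background, the original back-propagation rule referenced above updates each weight by gradient descent on the error, scaled by η. The following is a minimal sketch for a three-layer network; the sigmoid units, squared-error loss, and all function and variable names are illustrative assumptions, since the abstract does not reproduce the paper's general expression or its alternative rules:

    # Minimal sketch of the original back-propagation rule for a
    # three-layer (input-hidden-output) network. Sigmoid units and
    # the squared-error loss are assumptions for illustration only.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backprop_step(x, t, W1, W2, eta=0.5):
        """One gradient-descent update: delta_w = -eta * dE/dw."""
        # Forward pass.
        h = sigmoid(W1 @ x)   # hidden-unit activations
        y = sigmoid(W2 @ h)   # output-unit activations
        # Backward pass for E = 0.5 * ||y - t||^2.
        delta_out = (y - t) * y * (1 - y)             # output-layer error
        delta_hid = (W2.T @ delta_out) * h * (1 - h)  # hidden-layer error
        # Weight updates proportional to the learning-rate parameter eta.
        W2 -= eta * np.outer(delta_out, h)
        W1 -= eta * np.outer(delta_hid, x)
        return W1, W2

The rules studied in the paper would replace this particular update with other members of the general family; only this standard form is shown here.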
Original language: English (US)
Title of host publication: Unknown Host Publication Title
Number of pages: 6
State: Published - Dec 1 1987