LEARNING STOCHASTIC FEEDFORWARD NETWORKS
Radford M. Neal
Department of Computer Science
University of Toronto
November 1990
Connectionist learning procedures are presented for "sigmoid" and
"noisy-OR" varieties of stochastic feedforward network. These
networks are in the same class as the "belief networks" used in
expert systems. They represent a probability distribution over
a set of visible variables using hidden variables to express
correlations. Conditional probability distributions can be
obtained by stochastic simulation for use in tasks such as
classification. Learning from empirical data is done via a
gradient ascent method analogous to that used in Boltzmann
machines, but due to the feedforward nature of the connections,
the negative phase of Boltzmann machine learning is unnecessary.
Experimental results show that learning in a sigmoid feedforward
network can, as a result, be faster than in a Boltzmann machine.
These networks have other advantages over Boltzmann machines in
pattern classification and decision making applications, and
provide a link between work on connectionist learning and work on
the representation of expert knowledge.
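
The two mechanisms named above can be sketched in a few lines. The
following is a minimal illustration, not the paper's code: a fully
visible sigmoid stochastic feedforward network whose binary units are
ordered so that unit i receives connections only from units j < i.
Ancestral sampling draws each unit in turn from a Bernoulli whose
probability is the sigmoid of its weighted inputs, and the gradient of
the log-likelihood gives a purely "positive phase" update with no
negative phase, since the feedforward structure defines a normalized
distribution by construction. (With hidden variables, as in the paper,
the same update would be averaged over posterior samples obtained by
Gibbs sampling; all function and variable names here are illustrative.)

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(W, b, rng):
    """Ancestral (feedforward) sampling: visit units in topological
    order, drawing each from a Bernoulli with probability
    sigmoid(sum of weighted parent states plus bias)."""
    n = len(b)
    s = np.zeros(n)
    for i in range(n):
        p = sigmoid(W[i, :i] @ s[:i] + b[i])
        s[i] = float(rng.random() < p)
    return s

def log_prob(W, b, s):
    """Exact log-probability of a fully observed binary state vector;
    no partition function is needed because each conditional is
    normalized on its own."""
    lp = 0.0
    for i in range(len(b)):
        p = sigmoid(W[i, :i] @ s[:i] + b[i])
        lp += s[i] * np.log(p) + (1 - s[i]) * np.log(1 - p)
    return lp

def gradient_step(W, b, s, lr=0.1):
    """One gradient-ascent step on log P(s).  The weight update is
    lr * (s_i - p_i) * s_j -- a positive-phase-only rule, with no
    Boltzmann-machine negative phase."""
    for i in range(len(b)):
        p = sigmoid(W[i, :i] @ s[:i] + b[i])
        W[i, :i] += lr * (s[i] - p) * s[:i]
        b[i] += lr * (s[i] - p)

# Repeated gradient steps raise the log-likelihood of a training pattern.
n = 5
W, b = np.zeros((n, n)), np.zeros(n)
pattern = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
before = log_prob(W, b, pattern)
for _ in range(50):
    gradient_step(W, b, pattern)
after = log_prob(W, b, pattern)
```

In a run of this sketch, `after` exceeds `before`, and `sample` returns
only 0/1 state vectors; classification-style queries would be answered
by simulating the network with some variables clamped, as the abstract
describes.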