LEARNING STOCHASTIC FEEDFORWARD NETWORKS

Radford M. Neal
Department of Computer Science
University of Toronto

November 1990

Connectionist learning procedures are presented for "sigmoid" and "noisy-OR" varieties of stochastic feedforward network. These networks are in the same class as the "belief networks" used in expert systems. They represent a probability distribution over a set of visible variables using hidden variables to express correlations. Conditional probability distributions can be exhibited by stochastic simulation for use in tasks such as classification. Learning from empirical data is done via a gradient ascent method analogous to that used in Boltzmann machines, but due to the feedforward nature of the connections, the negative phase of Boltzmann machine learning is unnecessary. Experimental results show that, as a result, learning in a sigmoid feedforward network can be faster than in a Boltzmann machine. These networks have other advantages over Boltzmann machines in pattern classification and decision making applications, and provide a link between work on connectionist learning and work on the representation of expert knowledge.
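To make the abstract's claims concrete, the following is a minimal, hypothetical sketch (not Neal's code) of a sigmoid stochastic feedforward network. All function and variable names are illustrative assumptions. It shows the two ideas the abstract mentions: sampling a layer by stochastic simulation, and a "positive phase only" gradient-ascent update on the log-likelihood of a fully observed configuration, with no negative phase as in Boltzmann machine learning.

```python
# Sketch of a sigmoid stochastic feedforward (belief) network layer.
# Each binary unit j fires with probability sigmoid(b_j + sum_i w_ij x_i),
# where x is the state of the layer below.
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_layer(inputs, weights, biases, rng):
    """Stochastic simulation: sample each unit independently given the
    layer below. `weights[j]` holds the incoming weights of unit j."""
    probs = [sigmoid(b + sum(w * x for w, x in zip(col, inputs)))
             for col, b in zip(weights, biases)]
    states = [1 if rng.random() < p else 0 for p in probs]
    return states, probs


def grad_step(inputs, outputs, weights, biases, lr):
    """One gradient-ascent step on log-likelihood for a fully observed
    (inputs, outputs) pair: dL/dw_ij = x_i * (y_j - p_j), where p_j is
    unit j's firing probability. Because the connections are feedforward,
    this 'positive phase' statistic is the whole gradient."""
    for j, (col, b) in enumerate(zip(weights, biases)):
        p = sigmoid(b + sum(w * x for w, x in zip(col, inputs)))
        err = outputs[j] - p
        for i, x in enumerate(inputs):
            col[i] += lr * x * err
        biases[j] += lr * err
```

In the full procedure described in the paper, hidden-unit states for a given visible pattern would themselves be obtained by stochastic simulation before applying such an update; the sketch above shows only the update for an observed configuration.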