A Simple Artificial Neuron

Conclusion

A single TLU is limited

A single TLU operates as a binary classifier and is restricted to linearly separable classification problems. It is restricted in this way because the two intersecting hypersurfaces, the activation surface and the threshold hyperplane, are both flat, so their intersection (the decision boundary) is also flat: a straight line in the two-input case.

The hypersurface derived from the weight values is flat because the activation is calculated as a linear polynomial of the inputs: it has no higher-order terms. (Here a is the activation, xi is the i-th input and wi is the corresponding weight.)

a = w1x1 + w2x2 + ... + wnxn
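As a minimal sketch of how such a unit can be expressed in code (the function name tlu and its argument names are illustrative, not part of the report), the activation is just this weighted sum, followed by the Heaviside threshold:

    def tlu(inputs, weights, threshold):
        # Linear activation: a = w1*x1 + w2*x2 + ... + wn*xn
        a = sum(w * x for w, x in zip(weights, inputs))
        # Heaviside step: output 1 if the activation reaches the threshold, else 0
        return 1 if a >= threshold else 0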

To solve classification problems that are not linearly separable we need to form a higher-order polynomial. This can be done either by introducing higher-order terms before summation or by pre-processing the input with a preceding layer of TLUs: both methods deform the hypersurface formed by the activation polynomial.
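As an illustration of the first approach (the particular weights below are one possible choice, not taken from the report), adding a single second-order term x1x2 to the activation lets one unit separate classes that no purely linear activation can, XOR being the standard example:

    def quadratic_tlu(x1, x2):
        # Activation with a higher-order term: a = x1 + x2 - 2*x1*x2.
        # The product term bends the activation hypersurface, so the
        # decision boundary is no longer a single straight line.
        a = 1.0 * x1 + 1.0 * x2 - 2.0 * x1 * x2
        return 1 if a >= 0.5 else 0

    # quadratic_tlu reproduces XOR: (0,0)->0, (1,0)->1, (0,1)->1, (1,1)->0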

Networks of TLUs

McCulloch and Pitts proved that a sufficiently large network of TLUs can solve any given Boolean classification problem (including nonlinearly separable ones) by showing how the fundamental operators of propositional logic can be constructed from networks of TLUs. In fact, only two layers of TLUs are needed to emulate any given Boolean function, linearly separable or not.
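To make the construction concrete, each of the fundamental operators can be realised by a single TLU. The weights and thresholds below are one standard, illustrative choice (many others work):

    def tlu(inputs, weights, threshold):
        a = sum(w * x for w, x in zip(weights, inputs))
        return 1 if a >= threshold else 0

    def AND(x1, x2): return tlu([x1, x2], [1, 1], 1.5)   # fires only when both inputs are 1
    def OR(x1, x2):  return tlu([x1, x2], [1, 1], 0.5)   # fires when either input is 1
    def NOT(x):      return tlu([x], [-1], -0.5)         # fires when the input is 0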

As a simple example, XOR can be emulated by a three-unit, two-layer network with two units in layer one and one unit in layer two. Each of the two layer-one TLUs passes a two-dimensional step function of the network inputs to one of the inputs of the layer-two TLU. Thus, for the layer-two TLU, instead of the linear activation polynomial:

a = w1x1 + w2x2

we can now write the nonlinear activation:

a = w1f(x1,x2) + w2g(x1,x2)

where f and g are step functions of the inputs. In the final neuron, the decision boundary is now formed by the intersection of the threshold hyperplane with a stepped hypersurface, and so the intersection can be nonlinear.
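A sketch of this three-unit network (again with one illustrative choice of weights): the two layer-one units compute OR and AND of the inputs, and the layer-two unit fires when the OR unit is on but the AND unit is off, which is exactly XOR:

    def tlu(inputs, weights, threshold):
        a = sum(w * x for w, x in zip(weights, inputs))
        return 1 if a >= threshold else 0

    def xor(x1, x2):
        # Layer one: two step functions f and g of the network inputs
        f = tlu([x1, x2], [1, 1], 0.5)   # OR of the inputs
        g = tlu([x1, x2], [1, 1], 1.5)   # AND of the inputs
        # Layer two: a = w1*f + w2*g with w1 = 1, w2 = -1, threshold 0.5
        return tlu([f, g], [1, -1], 0.5)

    # xor(0,0) = 0, xor(1,0) = 1, xor(0,1) = 1, xor(1,1) = 0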

Concluding Remarks

This report has described each component of the TLU in turn, pointing out the analogous biological component and describing its operation functionally and geometrically. We have also described the whole TLU functionally and procedurally, and discussed the key limitation of TLUs and how it may be overcome.

The report has emphasised that TLUs are biologically inspired. However, the complexity of biological neurons far exceeds that of a TLU. Modern simulations use extensions of one-dimensional cable theory to describe the flow of charge across the surface of individual neurons over time.

Despite this disparity, networks of TLUs are interesting objects of study in their own right, and there are some relatively simple modifications that can easily be made. The Heaviside function can be replaced with a sigmoidal function, causing the neuron to output real values rather than Boolean ones.
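As a sketch of that modification (assuming the standard logistic sigmoid; the function name is illustrative):

    import math

    def sigmoid_unit(inputs, weights, threshold):
        # The same linear weighted sum as before, measured against the threshold...
        a = sum(w * x for w, x in zip(weights, inputs)) - threshold
        # ...but squashed by a smooth sigmoid rather than cut by the Heaviside step,
        # so the unit outputs a real value between 0 and 1 rather than a Boolean.
        return 1.0 / (1.0 + math.exp(-a))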