Dilawar's Blog

watch -n 1 /dev/null

06 Oct 2020

Dense Associative Memory

John Hopfield proposed a neat idea in 1982: a network that could store memories and recall them when presented with a partial cue. There are many tutorials on Hopfield networks out there (my personal favourite is https://neuronaldynamics.epfl.ch/online/Ch17.S1.html; the original paper is a great read as well).
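To make the recall-from-a-partial-cue idea concrete, here is a minimal sketch of a classical Hopfield network: ±1 neurons, Hebbian storage, asynchronous sign updates. This is my own toy NumPy code, not code from the paper or the tutorial above.

    import numpy as np

    def train(patterns):
        """Hebbian rule: W = (1/N) * sum of outer products of the stored patterns."""
        n = patterns.shape[1]
        W = (patterns.T @ patterns) / n
        np.fill_diagonal(W, 0)              # no self-connections
        return W

    def recall(W, cue, sweeps=20, seed=0):
        """Start from a partial/noisy cue and update neurons asynchronously."""
        rng = np.random.default_rng(seed)
        x = cue.copy()
        for _ in range(sweeps):
            for i in rng.permutation(len(x)):
                x[i] = 1 if W[i] @ x >= 0 else -1
        return x

    # Store two random patterns and recover one of them from a corrupted cue.
    rng = np.random.default_rng(1)
    patterns = rng.choice([-1, 1], size=(2, 50))
    W = train(patterns)
    cue = patterns[0].copy()
    cue[:10] *= -1                          # flip 10 of the 50 bits
    print(np.array_equal(recall(W, cue), patterns[0]))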

These networks are highly studied. Despite their limitations and non-biological nature, they remain very popular among neuroscientists for their simplicity and tractability. Their storage capacity, however, is quite limited: a network with N neurons can reliably store only about 0.14N memories (still linear in N, though).
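The 0.14N figure is easy to probe empirically. A rough sketch (my own setup, not from any of the references): store p random patterns with the Hebbian rule and count how many remain exact fixed points of a single synchronous update; the fraction degrades as p/N grows.

    import numpy as np

    N = 200
    rng = np.random.default_rng(0)
    for p in (10, 20, 28, 40):                      # 0.14 * 200 ≈ 28
        xi = rng.choice([-1, 1], size=(p, N))       # p random binary patterns
        W = (xi.T @ xi) / N                         # Hebbian weights
        np.fill_diagonal(W, 0)
        stable = sum(np.array_equal(np.where(W @ x >= 0, 1, -1), x) for x in xi)
        print(f"p = {p:3d}: {stable}/{p} patterns are exact fixed points")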

This note is about recent progress on these networks, proposed by Krotov and Hopfield in 2016, which has shown great promise in practice, e.g. Hopfield Networks is All You Need (vis-à-vis Attention Is All You Need).

There are two major feats achieved in this work.

  • Dense networks can store many more memories than the number of neurons by using rectified polynomials (an EE analogy: a polynomial signal generator in series with a diode) in the update rule, probably because higher-order polynomials can easily tease apart two patterns that look tightly correlated to the traditional update rule. These are very non-biological! (See the sketch after this list.)
  • They point out a connection, or duality, between recurrent networks of the Hopfield type and feedforward networks. Specifically, they work out how the activation function of the feedforward network is related to the update rule of the Hopfield network.
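To make the first point concrete, here is a minimal sketch of the dense update rule with a rectified polynomial F(x) = max(x, 0)^n, based on my reading of the 2016 paper: each neuron flips to whichever sign lowers the energy −Σ_μ F(ξ^μ · σ). The toy sizes, the degree n, and the variable names are mine.

    import numpy as np

    def F(x, n=4):
        """Rectified polynomial of degree n: F(x) = max(x, 0)**n."""
        return np.maximum(x, 0.0) ** n

    def dense_update(memories, state, n=4, seed=0):
        """One asynchronous sweep: neuron i takes the sign of
        sum_mu [ F(<xi_mu, s with s_i=+1>) - F(<xi_mu, s with s_i=-1>) ]."""
        rng = np.random.default_rng(seed)
        s = state.copy()
        for i in rng.permutation(len(s)):
            s_plus, s_minus = s.copy(), s.copy()
            s_plus[i], s_minus[i] = 1, -1
            drive = np.sum(F(memories @ s_plus, n) - F(memories @ s_minus, n))
            s[i] = 1 if drive >= 0 else -1
        return s

    # More memories (40) than neurons (30): far beyond the ~0.14N classical limit.
    rng = np.random.default_rng(2)
    memories = rng.choice([-1, 1], size=(40, 30))
    cue = memories[5].copy()
    cue[:3] *= -1                            # corrupt 3 of the 30 bits
    state = cue
    for _ in range(5):
        state = dense_update(memories, state)
    print(np.array_equal(state, memories[5]))  # did the cue converge back to memory 5?

With n = 1 this reduces (up to constants) to the classical sign-of-weighted-sum update; raising n sharpens the energy landscape around each stored pattern, which is where the extra capacity comes from.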

A sidenote

A long time ago, a Russian psychologist wrote a book, The Mind of a Mnemonist, about a person who couldn’t forget anything. He could recall all the details of any meeting from years ago almost flawlessly. The cost of this seemingly endless memory capacity was the inability to generalize or to see patterns. For him there was no difference between these two lists of numbers: [1,2,3,4,5,6,7…,9] and [9,1,2,4,7,8..]. If he were a deer, he would never be able to learn that “all tigers are dangerous”; only that the particular tiger with that particular stripe pattern which attacked him on that particular day is dangerous!

The point being that there is a trade-off between the ability to generalize (feature learning) and the ability to remember examples (prototype learning). Animals with higher cognitive faculties, such as mice, learn prototypes first and then quickly learn the features. Flies, on the other hand, learn prototypes but can’t generalize well, or at all. Today’s neural networks, AFAIK, seem to go in the reverse direction: they learn features first and, when overtrained, learn prototypes.

https://github.com/dilawar/algorithms/tree/master/MemoryNetwork/DenseAssociativeNetworks has a Python 3 implementation of this paper for the XOR function.
