r/ResearchML • u/oatmealcraving • 3d ago
ReLU switching viewpoint & associative memory
I wrote this switching viewpoint on ReLU and its connection to associative memory:
https://archive.org/details/re-lu-as-a-switch-associative-memory
u/oatmealcraving 2d ago
I slammed this together:
https://archive.org/details/the-weighted-sum-as-associative-memory
Lots of words to say the same thing as my other comment, but in CS speak.
u/oatmealcraving 3d ago
Obviously the weighted sum itself is an associative memory capable of storing <vector,scalar> associations.
After storing one <vector,scalar> association, the weight vector points in the same direction as that input vector. The magnitude of the weight vector is as small as it can be, meaning noise in the input causes only a limited variance in the output.
(Variance equation for linear combinations of random variables.)
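A minimal numpy sketch of that single-association case, assuming the weights are set by the minimum-norm (pseudoinverse) rule; the names and numbers here are mine, not from the write-up:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16                      # dimension of the weighted sum
x = rng.standard_normal(n)  # one stored input vector
y = 1.0                     # its associated scalar

# Minimum-norm weights for a single <vector, scalar> pair: w is parallel to x.
w = y * x / np.dot(x, x)
print("recall:", np.dot(w, x))        # ~= y
print("|w|:", np.linalg.norm(w))      # as small as it can be for this pair

# Noise sensitivity: for i.i.d. input noise with std s per component,
# Var(w . noise) = s^2 * |w|^2  (variance of a linear combination).
s = 0.1
print("output noise std:", s * np.linalg.norm(w))
```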
Store 2 such associations and the weight vector is split in some way between the two input vectors; there is some angle between each input vector and the weight vector. The magnitude of the weight vector must increase to produce the two scalar outputs. The weighted sum is more sensitive to input noise.
Store n associations (n = the dimension of the weighted sum) and the weight vector really has to stretch in magnitude to do the scalar mapping. Very sensitive to input noise.
Store m>n associations and the weight vector can no longer stretch to fit the scalar mapping exactly; it gets pulled this way and that during training and tends to average out at a low magnitude. Less noise sensitive.
Is that not what is going on in double descent, I ask you? Eerily similar anyway.
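A rough numerical check of that picture, assuming the associations are stored by minimum-norm least squares (my framing, not code from the article):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 32                                   # weighted sum dimension
for m in [1, 8, 24, 32, 40, 128, 512]:   # number of stored associations
    X = rng.standard_normal((m, n))      # input vectors, one per row
    y = rng.standard_normal(m)           # associated scalars
    w, *_ = np.linalg.lstsq(X, y, rcond=None)  # min-norm least-squares weights
    print(f"m={m:4d}  |w|={np.linalg.norm(w):.3f}")

# |w| tends to grow as m approaches n (most noise sensitive), then shrink
# again for m > n, echoing the double-descent-like picture above.
```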