…not much has changed in 33 years on the macro level. We’re still setting up differentiable neural net architectures made of layers of neurons and optimizing them end-to-end with backpropagation and stochastic gradient descent. Everything reads remarkably familiar, except it is smaller.
An erudite explanation of the state of neural networks today, placed in the context of their past and their future. I am glad I follow Andrej Karpathy and his writing. My big takeaway from this piece — the world will need more compute, and that is why Apple’s M1 chips are setting the stage for the next evolution of computing needs.