Last week, research scientist Andrej Karpathy launched a course on GitHub named ‘Neural Networks: Zero to Hero’ that focuses on teaching the basics of neural networks. In a series of YouTube videos, learners code and train neural networks along with Karpathy, and the Jupyter notebooks built in the videos are captured in the repository’s lectures directory.
The lectures are divided into two categories. The first, ‘The spelled-out intro to neural networks and backpropagation: building micrograd’, covers the backpropagation and training of neural networks, assuming learners have a basic knowledge of Python and a vague recollection of calculus. The second, ‘The spelled-out intro to language modelling: building makemore’, implements a bigram character-level language model and gradually builds it up into a modern Transformer language model, like GPT.
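The article does not reproduce the course’s code, but the core idea behind micrograd, reverse-mode automatic differentiation over scalars, fits in a short sketch. Everything below (the Value class, the supported operations, the example expression) is an illustrative assumption in the spirit of the lecture, not Karpathy’s exact implementation:

```python
class Value:
    """A scalar that records the operations producing it, so gradients
    can flow backwards through the computation graph (a micrograd-style
    sketch, not the course's actual code)."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad  # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad  # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)

        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Usage: gradients of y = a*b + a with respect to a and b.
a, b = Value(2.0), Value(-3.0)
y = a * b + a
y.backward()
print(a.grad, b.grad)  # -2.0 (= b + 1), 2.0 (= a)
```

Gradients accumulate with += rather than =, so a value that appears in several places in an expression, like a above, still receives its full gradient.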
The second category focuses on introducing torch.Tensor and its subtleties, along with its use in efficiently evaluating neural networks. It also covers the overall framework of language modelling, which includes model training, sampling, and the evaluation of a loss.
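Again as a hedged illustration rather than the course’s own code, a counting-based bigram character model shows all three pieces of that framework in a few lines of PyTorch: ‘training’ by counting bigram frequencies, sampling from the resulting probability table, and evaluating an average negative log-likelihood loss. The three-name dataset and the variable names here are assumptions made for the sketch; the lecture trains on a much larger file of names:

```python
import torch

words = ["emma", "olivia", "ava"]  # stand-in dataset for this sketch

chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0  # '.' marks both the start and the end of a word
itos = {i: c for c, i in stoi.items()}
V = len(stoi)

# "Training": count how often each character follows each other character.
N = torch.zeros((V, V), dtype=torch.int32)
for w in words:
    seq = ["."] + list(w) + ["."]
    for c1, c2 in zip(seq, seq[1:]):
        N[stoi[c1], stoi[c2]] += 1

# Normalise counts into rows of next-character probabilities
# (the +1 is add-one smoothing, so no bigram has zero probability).
P = (N + 1).float()
P /= P.sum(dim=1, keepdim=True)

# Sampling: walk the chain until the end token is drawn.
g = torch.Generator().manual_seed(2147483647)
ix, out = 0, []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))

# Loss: average negative log-likelihood of the training bigrams.
log_likelihood, n = 0.0, 0
for w in words:
    seq = ["."] + list(w) + ["."]
    for c1, c2 in zip(seq, seq[1:]):
        log_likelihood += torch.log(P[stoi[c1], stoi[c2]]).item()
        n += 1
print(f"nll = {-log_likelihood / n:.4f}")
```

The same loss can then be minimised with gradient descent on a parameterised model, which is how the lecture series moves from counting towards neural language models.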
Moreover, each lecture comes with a set of exercises, included in the video description, for a better understanding of the concepts.
With a PhD in computer science from Stanford University, Andrej Karpathy is a computer scientist who specialises in deep learning and computer vision. He joined Tesla in 2017 and served as its Director of AI, and has also had stints at OpenAI.
A primary instructor for the first deep learning course at Stanford, ‘CS 231n: Convolutional Neural Networks for Visual Recognition’, Karpathy now works as an independent researcher who openly trains large deep neural networks.