Department of Brain and Cognitive Sciences (BCS)
Building and Training Deep Learning Models in PyTorch
Description
In this tutorial, we will walk through an example Jupyter Notebook in which we load a dataset, preprocess it, build a "residual-attention" network, train the model, and evaluate its performance on held-out data. Along the way, we will briefly discuss how to run the notebook on OpenMind, how to parallelize training across multiple GPUs, the reasoning behind the choice of network architecture, and the basic theory of the attention/transformer layer.
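To give a rough sense of the workflow before the session, here is a minimal sketch of a self-attention layer wrapped in a residual (skip) connection, trained for a few steps and then checked on held-out data. It is not the notebook's actual model: the class names `ResidualAttentionBlock` and `TinyClassifier`, the layer sizes, and the random synthetic data are all assumptions made for illustration.

```python
import torch
from torch import nn


class ResidualAttentionBlock(nn.Module):
    """Self-attention plus a skip connection and layer norm (hypothetical name)."""

    def __init__(self, embed_dim: int = 32, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Self-attention: queries, keys, and values all come from x.
        attn_out, _ = self.attn(x, x, x)
        # Residual connection: add the input back, then normalize.
        return self.norm(x + attn_out)


class TinyClassifier(nn.Module):
    """One residual-attention block followed by a linear head (hypothetical name)."""

    def __init__(self, embed_dim: int = 32, num_classes: int = 2):
        super().__init__()
        self.block = ResidualAttentionBlock(embed_dim)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Average over the sequence dimension before classifying.
        return self.head(self.block(x).mean(dim=1))


torch.manual_seed(0)
# Synthetic stand-in data: 64 sequences of 10 tokens with 32 features each.
inputs = torch.randn(64, 10, 32)
targets = torch.randint(0, 2, (64,))
train_x, val_x = inputs[:48], inputs[48:]  # hold out the last 16 examples
train_y, val_y = targets[:48], targets[48:]

model = TinyClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A few full-batch training steps.
model.train()
for step in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(train_x), train_y)
    loss.backward()
    optimizer.step()

# Evaluate on the held-out split without tracking gradients.
model.eval()
with torch.no_grad():
    val_acc = (model(val_x).argmax(dim=1) == val_y).float().mean().item()
print(f"validation accuracy: {val_acc:.2f}")
```

Because the labels here are random, the validation accuracy is only a placeholder; the point is the shape of the loop (build, train, evaluate on held-out data), not the number it prints.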