diff --git a/intermediate_source/ddp_tutorial.rst b/intermediate_source/ddp_tutorial.rst
index c63321ad14c..97fd99f700e 100644
--- a/intermediate_source/ddp_tutorial.rst
+++ b/intermediate_source/ddp_tutorial.rst
@@ -20,7 +20,7 @@
 multiple machines, making it perfect for large-scale deep learning applications.
 To use DDP, you'll need to spawn multiple processes and create a single instance of DDP per process.
 But how does it work? DDP uses collective communications from the
-`torch.distributed `__
+`torch.distributed `__
 package to synchronize gradients and buffers across all processes. This means that each process will have its own
 copy of the model, but they'll all work together to train the model
 as if it were on a single machine.
@@ -71,7 +71,7 @@ Basic Use Case

 To create a DDP module, you must first set up process groups properly. More
 details can be found in
-`Writing Distributed Applications with PyTorch `__.
+`PyTorch Distributed Overview <../beginner/dist_overview.html>`__.

 .. code:: python
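The second hunk ends right where the tutorial's first ``.. code:: python`` block begins. For readers of this diff, a minimal sketch of the process-group setup that the surrounding text refers to could look like the following (the ``setup``/``demo_basic`` helper names and the ``MASTER_ADDR``/``MASTER_PORT`` values are illustrative assumptions, not part of this change):

.. code:: python

    import os

    import torch.distributed as dist
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP


    def setup(rank, world_size):
        # Assumed single-machine rendezvous settings; adjust for a real cluster.
        os.environ["MASTER_ADDR"] = "localhost"
        os.environ["MASTER_PORT"] = "29500"
        # gloo works on CPU; prefer the nccl backend for GPU training.
        dist.init_process_group("gloo", rank=rank, world_size=world_size)


    def demo_basic(rank, world_size):
        setup(rank, world_size)
        # Each process wraps its own local model replica in a single DDP instance;
        # DDP then synchronizes gradients across processes during backward.
        ddp_model = DDP(nn.Linear(10, 10))
        dist.destroy_process_group()

In practice these functions are launched with ``torch.multiprocessing.spawn``, one process per rank, so that each process holds exactly one DDP-wrapped copy of the model.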