Commit 24c9adf

Improve tracer_model_split documentation in pipelining tutorial

- Added clear section headers for both splitting options
  - Option 1: Manual Model Splitting
  - Option 2: Tracer-based Model Splitting
- Fixed typo: 'before the before' -> 'before the'
- Added explanation of split_spec dictionary parameters
- Clarified that split_spec specifies module path and split point type
- Made the tracer_model_split function definition more prominent

The tracer_model_split code block was present but users were missing it
because it wasn't clearly labeled as a separate option.

Fixes issue #3530
1 parent f99e9e8 commit 24c9adf

1 file changed: +7 -2 lines changed

intermediate_source/pipelining_tutorial.rst

Lines changed: 7 additions & 2 deletions
@@ -108,6 +108,8 @@ Step 1: Partition the Transformer Model

 There are two different ways of partitioning the model:

+**Option 1: Manual Model Splitting**
+
 First is the manual mode in which we can manually create two instances of the model by deleting portions of
 attributes of the model. In this example for two stages (2 ranks), the model is cut in half.

@@ -139,10 +141,13 @@ As we can see the first stage does not have the layer norm or the output layer,
 The second stage does not have the input embedding layers, but includes the output layers and the final four transformer blocks. The function
 then returns the ``PipelineStage`` for the current rank.

+**Option 2: Tracer-based Model Splitting**
+
 The second method is the tracer-based mode which automatically splits the model based on a ``split_spec`` argument. Using the pipeline specification, we can instruct
 ``torch.distributed.pipelining`` where to split the model. In the following code block,
-we are splitting before the before 4th transformer decoder layer, mirroring the manual split described above. Similarly,
-we can retrieve a ``PipelineStage`` by calling ``build_stage`` after this splitting is done.
+we are splitting before the 4th transformer decoder layer, mirroring the manual split described above. The ``split_spec`` dictionary
+specifies where to split the model by providing the module path (``"layers.4"``) and the split point type (``SplitPoint.BEGINNING``).
+Similarly, we can retrieve a ``PipelineStage`` by calling ``build_stage`` after this splitting is done.

 .. code:: python

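To make the ``split_spec`` wording in this diff concrete, here is a minimal pure-Python sketch of the idea: a dictionary maps a module path to a split point type, and the ordered modules are grouped into stages accordingly. It is only an illustration, not the ``torch.distributed.pipelining`` implementation. The ``SplitPoint`` enum below is a local stand-in for the library class of the same name, ``partition_by_split_spec`` is a hypothetical helper, and the module names (``tok_embeddings``, ``norm``, ``output``) and the count of eight layers are assumptions chosen to match the "cut in half" description.

```python
from enum import Enum


class SplitPoint(Enum):
    """Local stand-in for the library's split point types (assumption)."""
    BEGINNING = "beginning"  # start a new stage before the named module
    END = "end"              # start a new stage after the named module


def partition_by_split_spec(module_paths, split_spec):
    """Group ordered module paths into stages (hypothetical helper)."""
    unknown = set(split_spec) - set(module_paths)
    if unknown:
        raise KeyError(f"split_spec names unknown modules: {sorted(unknown)}")

    stages = [[]]
    for path in module_paths:
        point = split_spec.get(path)
        # BEGINNING: close the current stage before adding this module.
        if point is SplitPoint.BEGINNING and stages[-1]:
            stages.append([])
        stages[-1].append(path)
        # END: close the current stage right after this module.
        if point is SplitPoint.END:
            stages.append([])
    # Drop a trailing empty stage, e.g. when END is set on the last module.
    return [stage for stage in stages if stage]


# Assumed layout: embeddings, eight transformer layers, final norm, output.
module_paths = (
    ["tok_embeddings"]
    + [f"layers.{i}" for i in range(8)]
    + ["norm", "output"]
)

# Same shape as the spec described in the diff: module path -> split type.
split_spec = {"layers.4": SplitPoint.BEGINNING}

stages = partition_by_split_spec(module_paths, split_spec)
for index, stage in enumerate(stages):
    print(f"stage {index}: {stage}")
```

Under these assumptions the first stage holds the embeddings and ``layers.0`` through ``layers.3``, and the second stage holds ``layers.4`` through ``layers.7`` plus the norm and output layers, which mirrors the manual split described in Option 1.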