docs/tutorials/examples.md
#### Complete - Float32

[//]: # (marker_train_single_fp32_complete)
```
import torch
import torchvision
# ... (dataset, model, optimizer setup, and training loop elided in this excerpt)
torch.save({
    # ... (other checkpoint entries elided)
    'optimizer_state_dict': optimizer.state_dict(),
}, 'checkpoint.pth')
```
[//]: # (marker_train_single_fp32_complete)

#### Complete - BFloat16
[//]: # (marker_train_single_bf16_complete)
```
import torch
import torchvision
# ... (dataset, model, and BFloat16 training loop elided in this excerpt)
torch.save({
    # ... (other checkpoint entries elided)
    'optimizer_state_dict': optimizer.state_dict(),
}, 'checkpoint.pth')
```
[//]: # (marker_train_single_bf16_complete)

### Distributed Training
Distributed training with PyTorch DDP is accelerated by oneAPI Collective Communications Library Bindings for Pytorch\* (oneCCL Bindings for Pytorch\*). The extension supports the FP32 and BF16 data types. More detailed information and examples are available at its [Github repo](https://github.com/intel/torch-ccl).

**Note:** When performing distributed training with the BF16 data type, use oneCCL Bindings for Pytorch\*. Due to a PyTorch limitation, distributed training with the BF16 data type is not supported with Intel® Extension for PyTorch\*.

[//]: # (marker_train_ddp_complete)
```
import os
import torch
# ... (process group initialization, model wrapping, and training loop elided in this excerpt)
torch.save({
    # ... (other checkpoint entries elided)
    'optimizer_state_dict': optimizer.state_dict(),
}, 'checkpoint.pth')
```
[//]: # (marker_train_ddp_complete)

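oneCCL's role here is to supply the collective operations, such as allreduce, that DDP uses to average gradients across workers. As a language-level illustration only, not the oneCCL or DDP API, the averaging that allreduce performs can be sketched in pure Python:

```python
# Illustration only: DDP averages gradients across workers via an allreduce
# collective (provided here by oneCCL). This sketch mimics that averaging
# for a list of per-worker gradient vectors.
def allreduce_average(worker_grads):
    """Element-wise mean across workers: allreduce(SUM) / world_size."""
    world_size = len(worker_grads)
    length = len(worker_grads[0])
    # Sum across workers (the "reduce" step)...
    summed = [sum(g[i] for g in worker_grads) for i in range(length)]
    # ...then divide by world size so every worker ends up with the same gradients.
    return [s / world_size for s in summed]

grads = [[1.0, 2.0], [3.0, 4.0]]  # two workers, two parameters each
print(allreduce_average(grads))   # [2.0, 3.0]
```

After this step every rank holds identical averaged gradients, which keeps model replicas in sync after each optimizer step.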
## Inference
The `optimize` function of Intel® Extension for PyTorch\* applies optimizations to the model object.

##### Resnet50

[//]: # (marker_inf_rn50_imp_fp32)
```
import torch
import torchvision.models as models
# ... (model and sample input creation elided in this excerpt)
model = ipex.optimize(model)

with torch.no_grad():
    model(data)
```
[//]: # (marker_inf_rn50_imp_fp32)

##### BERT
[//]: # (marker_inf_bert_imp_fp32)
```
import torch
from transformers import BertModel
# ... (model and sample input creation elided in this excerpt)
model = ipex.optimize(model)

with torch.no_grad():
    model(data)
```
[//]: # (marker_inf_bert_imp_fp32)

#### TorchScript Mode
We recommend taking advantage of Intel® Extension for PyTorch\* with [TorchScript](https://pytorch.org/docs/stable/jit.html) for further optimizations.

##### Resnet50
[//]: # (marker_inf_rn50_ts_fp32)
```
import torch
import torchvision.models as models
# ... (model creation, optimization, and tracing elided in this excerpt)
with torch.no_grad():
    # ... (warm-up steps elided)
    model(data)
```
[//]: # (marker_inf_rn50_ts_fp32)

##### BERT
[//]: # (marker_inf_bert_ts_fp32)
```
import torch
from transformers import BertModel
# ... (model creation, optimization, and tracing elided in this excerpt)
with torch.no_grad():
    # ... (warm-up steps elided)
    model(data)
```
[//]: # (marker_inf_bert_ts_fp32)

#### TorchDynamo Mode (Experimental, _NEW feature from 2.0.0_)
##### Resnet50

[//]: # (marker_inf_rn50_dynamo_fp32)
```
import torch
import torchvision.models as models
# ... (model and sample input creation elided in this excerpt)
model = torch.compile(model, backend="ipex")

with torch.no_grad():
    model(data)
```
[//]: # (marker_inf_rn50_dynamo_fp32)

##### BERT
[//]: # (marker_inf_bert_dynamo_fp32)
```
import torch
from transformers import BertModel
# ... (model and sample input creation elided in this excerpt)
model = torch.compile(model, backend="ipex")

with torch.no_grad():
    model(data)
```
[//]: # (marker_inf_bert_dynamo_fp32)

### BFloat16
We recommend using Auto Mixed Precision (AMP) with the BFloat16 data type.

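BFloat16 keeps float32's eight exponent bits but only seven explicit mantissa bits, which is why AMP selectively keeps precision-sensitive ops in FP32. As an illustration independent of the extension, a float32 value can be rounded to bfloat16 by round-to-nearest-even on the 16 bits the format drops:

```python
import struct

def to_bfloat16(x: float) -> float:
    """Round a float to the nearest bfloat16 value, returned as a float.
    (Sketch only: NaN payloads are not handled specially.)"""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    # Round to nearest even on the low 16 bits that bfloat16 discards.
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFFFFFF
    bits &= 0xFFFF0000  # keep sign, 8 exponent bits, top 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(to_bfloat16(1.0))      # 1.0 (exactly representable)
print(to_bfloat16(3.14159))  # 3.140625 (only ~3 decimal digits survive)
```

The large rounding error relative to float32 is the reason loss-scaling-free BF16 training works at all: the exponent range matches float32, so only precision, not dynamic range, is reduced.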
##### Resnet50
[//]: # (marker_inf_rn50_imp_bf16)
```
import torch
import torchvision.models as models
# ... (model creation and BF16 optimization elided in this excerpt)
with torch.no_grad():
    with torch.cpu.amp.autocast():
        model(data)
```
[//]: # (marker_inf_rn50_imp_bf16)

##### BERT
[//]: # (marker_inf_bert_imp_bf16)
```
import torch
from transformers import BertModel
# ... (model creation and BF16 optimization elided in this excerpt)
with torch.no_grad():
    with torch.cpu.amp.autocast():
        model(data)
```
[//]: # (marker_inf_bert_imp_bf16)

#### TorchScript Mode
We recommend taking advantage of Intel® Extension for PyTorch\* with [TorchScript](https://pytorch.org/docs/stable/jit.html) for further optimizations.

##### Resnet50
[//]: # (marker_inf_rn50_ts_bf16)
```
import torch
import torchvision.models as models
# ... (model creation, BF16 optimization, and tracing elided in this excerpt)
with torch.no_grad():
    # ... (warm-up steps elided)
    model(data)
```
[//]: # (marker_inf_rn50_ts_bf16)

##### BERT
[//]: # (marker_inf_bert_ts_f16)
```
import torch
from transformers import BertModel
# ... (model creation, BF16 optimization, and tracing elided in this excerpt)
with torch.no_grad():
    # ... (warm-up steps elided)
    model(data)
```
[//]: # (marker_inf_bert_ts_f16)

### INT8
Please follow the steps below to perform static calibration:

7. Save the INT8 model into a `pt` file.
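What calibration computes can be illustrated independently of the extension's API: from a min/max range observed on representative data, static quantization derives a scale and zero point that map float values onto 8-bit integers. The helper names below are hypothetical:

```python
# Illustration only, not the Intel Extension for PyTorch API: static
# calibration observes min/max tensor ranges, then derives a scale and
# zero point mapping floats onto the uint8 range used at inference time.
def qparams(rmin, rmax, qmin=0, qmax=255):
    """Scale and zero point mapping [rmin, rmax] onto [qmin, qmax]."""
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)  # range must contain 0.0
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, zero_point

def quantize(xs, scale, zero_point, qmin=0, qmax=255):
    """Map floats to clamped integers."""
    return [max(qmin, min(qmax, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    """Recover approximate floats; the error is bounded by about scale/2."""
    return [(q - zero_point) * scale for q in qs]

scale, zp = qparams(-1.0, 3.0)                # range observed during calibration
print(quantize([-1.0, 0.0, 3.0], scale, zp))  # [0, 64, 255]
```

Values outside the calibrated range are clamped, which is why the representative dataset used in the steps above should cover the activations seen in production.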
[//]: # (marker_int8_static)
```
import os
import torch
# ... (model preparation, calibration over representative data, and INT8 conversion elided in this excerpt)
with torch.no_grad():
    ...  # tracing and warm-up elided in this excerpt

traced_model.save("quantized_model.pt")
```
[//]: # (marker_int8_static)

##### Dynamic Quantization
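Unlike static quantization, dynamic quantization derives activation scales from each batch at runtime, so no calibration pass over representative data is needed. A pure-Python illustration (hypothetical helper, not the extension's API):

```python
# Illustration only, not the Intel Extension for PyTorch API: dynamic
# quantization computes the activation scale from each batch as it arrives,
# trading a little runtime work for skipping the calibration step.
def dynamic_quantize(xs, qmax=127):
    """Symmetric int8 quantization, scale taken from this batch's max |x|."""
    scale = max(abs(x) for x in xs) / qmax  # derived from this batch alone
    qs = [round(x / scale) for x in xs]
    return qs, scale

qs, scale = dynamic_quantize([-0.4, 0.2, 1.0])
print(qs)  # [-51, 25, 127]
```

Because the scale tracks each batch's range, dynamic quantization suits workloads (e.g. variable-length NLP inputs) where activation ranges are hard to calibrate ahead of time.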
oneDNN provides [oneDNN Graph Compiler](https://github.com/oneapi-src/oneDNN/tree/dev-graph-preview4/doc#onednn-graph-compiler) as a prototype feature that can boost performance for selected topologies. No code change is required. Install [a binary](installation.md#installation_onednn_graph_compiler) with this feature enabled. We verified this feature with `Bert-large`, `bert-base-cased`, `roberta-base`, `xlm-roberta-base`, `google-electra-base-generator` and `google-electra-base-discriminator`.