
Commit 43e3821

sync doc to release branch, remove DLPack, horovod from release branch. (#1643)
Signed-off-by: Ye Ting <ting.ye@intel.com>
1 parent 0d58893 commit 43e3821

File tree: 11 files changed, +1085 / -1835 lines


docs/images/DLPack/figure1_DLPack_import.svg

Lines changed: 0 additions & 746 deletions
This file was deleted.

docs/images/DLPack/figure2_DLPack_export.svg

Lines changed: 0 additions & 876 deletions
This file was deleted.

docs/index.rst

Lines changed: 5 additions & 6 deletions
@@ -2,9 +2,9 @@
  Welcome to Intel® Extension for PyTorch* Documentation
  ######################################################

- Intel® Extension for PyTorch* extends `PyTorch\* <https://github.com/pytorch/pytorch>`_ with up-to-date features and optimizations for an extra performance boost on Intel hardware. It is a heterogeneous, high performance deep learning implementation for both CPU and XPU. XPU is user visible device which is counterpart of well-known CPU and CUDA in PyTorch* community. XPU represents Intel® specific kernel and graph optimization for varies “concrete” devices. XPU runtime will choose the actual device when executing AI workloads on XPU device. The default selected device is Intel® GPU. This release introduces optimized solution on XPU particually and provides PyTorch end-users to get up-to-date features and optimizations on Intel® Graphics cards.
+ Intel® Extension for PyTorch* extends `PyTorch\* <https://github.com/pytorch/pytorch>`_ with up-to-date features and optimizations for an extra performance boost on Intel hardware. It is a heterogeneous, high performance deep learning implementation for both CPU and XPU. XPU is a user visible device that is a counterpart of the well-known CPU and CUDA in the PyTorch* community. XPU represents Intel specific kernel and graph optimizations for various “concrete” devices. XPU runtime will choose the actual device when executing AI workloads on the XPU device. The default selected device is Intel GPU. This release introduces optimized solution on XPU particularly and lets PyTorch end-users get up-to-date features and optimizations on Intel Graphics cards.

- Intel® Extension for PyTorch* provides aggressive optimizations for both eager mode and graph mode, however, compared to eager mode, graph mode in PyTorch* normally yields better performance from optimization techniques such as operation fusion, and Intel® Extension for PyTorch* amplified them with more comprehensive graph optimizations. This extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts users can enable it dynamically by ``import intel_extension_for_pytorch``. To execute AI workloads on XPU, the input tensors and models shall be converted to XPU beforehand by ``input = input.to("xpu")`` and ``model = model.to("xpu")``.
+ Intel® Extension for PyTorch* provides aggressive optimizations for both eager mode and graph mode. Graph mode in PyTorch* normally yields better performance from optimization techniques such as operation fusion, and Intel® Extension for PyTorch* amplifies them with more comprehensive graph optimizations. This extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts users can enable it dynamically by ``import intel_extension_for_pytorch``. To execute AI workloads on XPU, the input tensors and models must be converted to XPU beforehand by ``input = input.to("xpu")`` and ``model = model.to("xpu")``.

  Intel® Extension for PyTorch* is structured as shown in the following figure:

@@ -13,11 +13,11 @@ Intel® Extension for PyTorch* is structured as shown in the following figure:
     :align: center
     :alt: Architecture of Intel® Extension for PyTorch*

- PyTorch components are depicted with white boxes while Intel Extensions are with blue boxes. Extra performance of the extension comes from optimizations for both eager mode and graph mode. In eager mode, the PyTorch frontend is extended with custom Python modules (such as fusion modules), optimal optimizers and INT8 quantization API. Further performance boosting is available by converting the eager-mode model into graph mode via the extended graph fusion passes. For XPU backend, optimized operators and kernels are implemented and registered through PyTorch dispatching mechanism. These operators and kernels are accelerated from native vectorization feature and matrix calculation feature of Intel® GPU hardware. In graph mode, further operator fusions are supported to reduce operator/kernel invocation overheads, and thus increase performance.
+ PyTorch components are depicted with white boxes and Intel extensions are with blue boxes. Extra performance of the extension comes from optimizations for both eager mode and graph mode. In eager mode, the PyTorch frontend is extended with custom Python modules (such as fusion modules), optimal optimizers, and INT8 quantization API. Further performance boosting is available by converting the eager-mode model into graph mode via extended graph fusion passes. For the XPU backend, optimized operators and kernels are implemented and registered through PyTorch dispatching mechanism. These operators and kernels are accelerated from native vectorization feature and matrix calculation feature of Intel GPU hardware. In graph mode, further operator fusions are supported to reduce operator/kernel invocation overheads, and thus increase performance.

- Intel® Extension for PyTorch* utilizes `DPC++ <https://github.com/intel/llvm#oneapi-dpc-compiler>`_ compiler which supports the latest `SYCL* <https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html>`_ standard and also a number of extensions to the SYCL* standard, which can be found in the `sycl/doc/extensions <https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions>`_ directory. Intel® Extension for PyTorch* also integrates `oneDNN <https://github.com/oneapi-src/oneDNN>`_ and `oneMKL <https://github.com/oneapi-src/oneMKL>`_ libraries and provides kernels based on that. oneDNN library is used for computation intensive operations. oneMKL library is used for fundamental mathematical operations.
+ Intel® Extension for PyTorch* utilizes the `DPC++ <https://github.com/intel/llvm#oneapi-dpc-compiler>`_ compiler that supports the latest `SYCL* <https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html>`_ standard and also a number of extensions to the SYCL* standard, which can be found in the `sycl/doc/extensions <https://github.com/intel/llvm/tree/sycl/sycl/doc/extensions>`_ directory. Intel® Extension for PyTorch* also integrates `oneDNN <https://github.com/oneapi-src/oneDNN>`_ and `oneMKL <https://github.com/oneapi-src/oneMKL>`_ libraries and provides kernels based on that. The oneDNN library is used for computation intensive operations. The oneMKL library is used for fundamental mathematical operations.

- Intel® Extension for PyTorch* has been released as an open–source project at `Github <https://github.com/intel/intel-extension-for-pytorch>`_.
+ Intel® Extension for PyTorch* has been released as an open–source project on `GitHub <https://github.com/intel/intel-extension-for-pytorch>`_.

  .. toctree::
     :hidden:
@@ -27,7 +27,6 @@ Intel® Extension for PyTorch* has been released as an open–source project at
     tutorials/releases
     tutorials/installation
     tutorials/examples
-    tutorials/performance
     tutorials/api_doc
     tutorials/contribution
     tutorials/license
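
To make the usage described in the docs/index.rst diff above concrete, here is a minimal sketch of the pattern it documents. The toy model, the tensor shapes, and the optional torch.jit.trace step are illustrative assumptions, not part of this commit:

    import torch
    import intel_extension_for_pytorch  # noqa: F401 -- registers the "xpu" device

    # Toy model and input; any torch.nn.Module follows the same pattern.
    model = torch.nn.Linear(4, 2)
    data = torch.rand(1, 4)

    # Convert the model and input tensors to XPU beforehand, as documented.
    model = model.to("xpu")
    data = data.to("xpu")

    with torch.no_grad():
        out = model(data)                      # eager mode on the XPU device
        traced = torch.jit.trace(model, data)  # optional: graph mode, where the
        out = traced(data)                     # extension's fusion passes apply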

docs/tutorials/AOT.md

Lines changed: 3 additions & 3 deletions
@@ -2,17 +2,17 @@
  ## Introduction

- [AOT Compilation](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-cpp-compiler-dev-guide-and-reference/top/compilation/ahead-of-time-compilation.html) is a helpful feature for development lifecycle or distribution time, when you know beforehand what your target device is going to be at application execution time. When AOT compilation is enabled, no additional compilation time is needed when running application. It also benifits the product quality since no just-in-time (JIT) bugs encountered as JIT is skipped and final code executing on the target device can be tested as-is before deliver to end-users. The disadvantage of this feature is that the final distributed binary size will be increased a lot (e.g. from 500MB to 2.5GB for Intel® Extension for PyTorch\*).
+ [AOT Compilation](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-cpp-compiler-dev-guide-and-reference/top/compilation/ahead-of-time-compilation.html) is a helpful feature for development lifecycle or distribution time, when you know beforehand what your target device is going to be at application execution time. When AOT compilation is enabled, no additional compilation time is needed when running application. It also benifits the product quality since no just-in-time (JIT) bugs encountered as JIT is skipped and final code executing on the target device can be tested as-is before delivery to end-users. The disadvantage of this feature is that the final distributed binary size will be increased a lot (e.g. from 500MB to 2.5GB for Intel® Extension for PyTorch\*).

  ## Use case

- Intel® Extension for PyTorch\* provides build option `USE_AOT_DEVLIST` for end-users who install Intel® Extension for PyTorch\* via source compilation to configure device list for AOT compilation. The target device in device list is specified by DEVICE name of the target. Multi-target AOT compilation is supported by using comma (,) as delimiter in device list. See below table for the AOT setting targeting [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/data-center-gpu/flex-series/overview.html).
+ Intel® Extension for PyTorch\* provides build option `USE_AOT_DEVLIST` for end-users who install Intel® Extension for PyTorch\* via source compilation to configure device list for AOT compilation. The target device in device list is specified by DEVICE name of the target. Multi-target AOT compilation is supported by using a comma (,) as a delimiter in device list. See below table for the AOT setting targeting [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/data-center-gpu/flex-series/overview.html).

  | Supported HW | AOT Setting |
  | ------------ |---------------------|
  | Intel® Data Center GPU Flex Series | USE_AOT_DEVLIST='dg2-g10-c0' |

- Intel® Extension for PyTorch\* enables AOT compilation for Intel® GPU target devices in prebuilt wheel files. Intel® Data Center GPU Flex Series is the enabled target device in current release. If Intel® Extension for PyTorch\* is executed on a device which is not pre-configured in `USE_AOT_DEVLIST`, this application can still run as JIT compilation will be triggered automatically for allowing execution on the current device. It causes additional compilation time during execution however.
+ Intel® Extension for PyTorch\* enables AOT compilation for Intel GPU target devices in prebuilt wheel files. Intel® Data Center GPU Flex Series is the enabled target device in current release. If Intel® Extension for PyTorch\* is executed on a device which is not pre-configured in `USE_AOT_DEVLIST`, this application can still run because JIT compilation will be triggered automatically to allow execution on the current device. It causes additional compilation time during execution however.

  ## Requirement
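
As a sketch of how `USE_AOT_DEVLIST` is typically passed to a source build: the clone URL matches the repository named in this commit, but the exact build invocation is an assumption that may differ between releases, so verify it against the installation guide.

    # Assumed setup.py-based source build; check the release's install docs.
    git clone --recursive https://github.com/intel/intel-extension-for-pytorch
    cd intel-extension-for-pytorch

    # Single AOT target, taken from the table above:
    USE_AOT_DEVLIST='dg2-g10-c0' python setup.py install

    # Multiple targets are comma-delimited (second target is a placeholder):
    # USE_AOT_DEVLIST='dg2-g10-c0,<other-device>' python setup.py install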

docs/tutorials/contribution.md

Lines changed: 2 additions & 2 deletions
@@ -64,7 +64,7 @@ To develop on your machine, here are some tips:
  You do not need to repeatedly install after modifying Python files (`.py`). However, you would need to reinstall if you modify a Python interface (`.pyi`, `.pyi.in`) or non-Python files (`.cpp`, `.h`, etc.).

- If you want to reinstall, make sure that you uninstall Intel® Extension for PyTorch\* first by running `pip uninstall intel_extension_for_pytorch` until you see `WARNING: Skipping intel_extension_for_pytorch as it is not installed`; next run `python setup.py clean`. After that, you can install in `develop` mode again.
+ If you want to reinstall, make sure that you uninstall Intel® Extension for PyTorch\* first by running `pip uninstall intel_extension_for_pytorch` until you see `WARNING: Skipping intel_extension_for_pytorch as it is not installed`. Then run `python setup.py clean`. After that, you can install in `develop` mode again.

  ### Tips and Debugging
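
A compact sketch of the reinstall sequence the hunk above describes; the final develop-mode command is assumed from context rather than quoted from the docs.

    # Repeat until pip prints:
    # WARNING: Skipping intel_extension_for_pytorch as it is not installed
    pip uninstall intel_extension_for_pytorch

    # Then clean the build tree and reinstall in develop mode.
    python setup.py clean
    python setup.py develop  # assumed develop-mode install command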

@@ -99,7 +99,7 @@ For more information about unit tests, please read [README.md](../../tests/gpu/R
  ## Writing documentation

- So you want to write some documentation for your code contribution and don't know where to start?
+ Do you want to write some documentation for your code contribution and don't know where to start?

  Intel® Extension for PyTorch\* uses [Google style](http://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) for formatting docstrings. Length of line inside docstrings block must be limited to 80 characters to fit into Jupyter documentation popups.
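
For reference, a minimal Google-style docstring of the kind this guideline calls for, with every line under 80 characters; the function itself is a made-up illustration, not code from the repository.

    def scale(tensor, factor=1.0):
        """Multiply a tensor by a scalar factor.

        Args:
            tensor (torch.Tensor): Input tensor.
            factor (float, optional): Scaling factor. Defaults to 1.0.

        Returns:
            torch.Tensor: The scaled tensor.
        """
        return tensor * factor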
