docs/tutorials/features.rst (1 addition, 1 deletion)
@@ -104,7 +104,7 @@ Intel® Extension for PyTorch* also optimizes operators and implements several c
 .. autoclass:: MergedEmbeddingBag
 .. autoclass:: MergedEmbeddingBagWithSGD

-**Auto kernel selection** is a feature that enables users to tune for better performance with GEMM operations. It is provided as parameter –auto_kernel_selection, with boolean value, of the ipex.optimize() function. By default, the GEMM kernel is computed with oneMKL primitives. However, under certain circumstances oneDNN primitives run faster. Users are able to set –auto_kernel_selection to True to run GEMM kernels with oneDNN primitives.” -> "We aims to provide good default performance by leveraging the best of math libraries and enabled weights_prepack, and it has been verified with broad set of models. If you would like to try other alternatives, you can use auto_kernel_selection toggle in ipex.optimize to switch, and you can diesable weights_preack in ipex.optimize if you are concerning the memory footprint more than performance gain. However in majority cases, keeping default is what we recommend.
+**Auto kernel selection** is a feature that enables users to tune for better performance with GEMM operations. It is provided as parameter –auto_kernel_selection, with boolean value, of the ipex.optimize() function. By default, the GEMM kernel is computed with oneMKL primitives. However, under certain circumstances oneDNN primitives run faster. Users are able to set –auto_kernel_selection to True to run GEMM kernels with oneDNN primitives.” -> "We aim to provide good default performance by leveraging the best of math libraries and enabled weights_prepack, and it has been verified with broad set of models. If you would like to try other alternatives, you can use auto_kernel_selection toggle in ipex.optimize to switch, and you can disable weights_preack in ipex.optimize if you are concerning the memory footprint more than performance gain. However in majority cases, keeping default is what we recommend.
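For illustration, here is a minimal sketch of how these two knobs are toggled through `ipex.optimize()`; the tiny model and the dtype are placeholders, and the keyword names assume the current `intel_extension_for_pytorch` API:

```python
import torch
import intel_extension_for_pytorch as ipex

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.ReLU()).eval()

# Default: keep the library-chosen GEMM backend and weights prepacking.
opt_model = ipex.optimize(model, dtype=torch.float32)

# Alternative: prefer oneDNN GEMM kernels, and disable weights prepacking
# when memory footprint matters more than the last bit of performance.
opt_model = ipex.optimize(
    model,
    dtype=torch.float32,
    auto_kernel_selection=True,
    weights_prepack=False,
)
```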
docs/tutorials/features/graph_capture.md (1 addition, 1 deletion)
@@ -3,7 +3,7 @@ Graph Capture (Experimental)
 ### Feature Description

-This feature automatically applies a combination of TorchScript trace technique and TorchDynamo to try to generate a graph model, for providing a good user experience while keep execution fast. Specifically, the process tries to generate a graph with TorchScript trace functionality first. In case of generation failure or incorrect results detected, it changes to TorchDynamo with TorchScript backend. Failure of the graph generation with TorchDynamo triggers a warning message. Meanwhile the generated graph model falls back to the original one. I.e. the inference workload runs in eager mode. Users can take advantage of this feature through a new knob `--graph_mode` of the `ipex.optimize()` function to automatically run into graph mode.
+This feature automatically applies a combination of TorchScript trace technique and TorchDynamo to try to generate a graph model, for providing a good user experience while keeping execution fast. Specifically, the process tries to generate a graph with TorchScript trace functionality first. In case of generation failure or incorrect results detected, it changes to TorchDynamo with TorchScript backend. Failure of the graph generation with TorchDynamo triggers a warning message. Meanwhile the generated graph model falls back to the original one. I.e. the inference workload runs in eager mode. Users can take advantage of this feature through a new knob `--graph_mode` of the `ipex.optimize()` function to automatically run into graph mode.
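As a rough usage sketch (assuming the `graph_mode` and `sample_input` keywords of the current `ipex.optimize()` API; the toy model is a placeholder):

```python
import torch
import intel_extension_for_pytorch as ipex

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
x = torch.randn(8, 64)

# graph_mode=True asks ipex.optimize() to try TorchScript trace first, then
# TorchDynamo with the TorchScript backend, and finally fall back to eager mode.
opt_model = ipex.optimize(model, graph_mode=True, sample_input=x)

with torch.no_grad():
    y = opt_model(x)
```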
docs/tutorials/features/nhwc.md (3 additions, 3 deletions)
@@ -12,7 +12,7 @@ Look at the following image of illustrating NCHW and NHWC when N=1. Actually whe
 PyTorch refers to NCHW as `torch.contiguous_format` (the default memory format) and to NHWC as `torch.channels_last`, which is a new feature as of the 1.5 release.

-TensorFlow uses NHWC as the default memory format because NHWC has a performance advantage over NCHW. On CPU platforms, we propose to optimize Channels Last memory path for ihe following reasons:
+TensorFlow uses NHWC as the default memory format because NHWC has a performance advantage over NCHW. On CPU platforms, we propose to optimize Channels Last memory path for the following reasons:
 * **Performance** - NHWC performance is not as good as blocked memory format (nChw16c), but it is close, and much better performance than NCHW.
 * **User Experience** - Operator coverage of NHWC would be higher than blocked memory format (`to_mkldnn()` method), so user experience is better. To be specific, it is difficult to enable operators that manipulates `dim` on blocked format such as `sum(dim=?)`. You would need to convert tensor from blocked memory format back to NHWC using `to_dense()`, before feeding it into `sum()`. This is naturally supported on Channels Last memory format already.
 * **Upstream** - Will be easier since CPU doesn't hold secret ingredient and both inference and training will be covered.
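For orientation, a minimal PyTorch sketch of opting a model and its input into the Channels Last path (standard `torch` APIs; the conv layer and sizes are arbitrary examples):

```python
import torch

model = torch.nn.Conv2d(3, 16, kernel_size=3).eval()
x = torch.randn(1, 3, 224, 224)

# Convert both the weights and the activations to channels_last (NHWC).
model = model.to(memory_format=torch.channels_last)
x = x.to(memory_format=torch.channels_last)

with torch.no_grad():
    y = model(x)

# Convolution propagates the channels_last layout to its output.
print(y.is_contiguous(memory_format=torch.channels_last))
```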
@@ -74,7 +74,7 @@ Better to explain the concepts here with a diagram, the **dotted lines** indicat
 Before moving on, I feel it is necessary to explain how PyTorch organizes tensors in memory - the **layout**. Here we only focus on **dense** tensors, skip 'coo' layout of **sparse** tensor.

-The question itself can be reinterpreted as for a tensor of size <N, C, H, W>, how does PyTorch accesses the element with index <n, c, h, w> from memory, the answer is **stride**:
+The question itself can be reinterpreted as for a tensor of size <N, C, H, W>, how does PyTorch access the element with index <n, c, h, w> from memory, the answer is **stride**:
 ```
 tensor: <N, C, H, W>
 index:  <n, c, h, w>
 ```
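To make the stride idea concrete, a small illustrative snippet (not part of the original page) comparing the strides of the same <1, 3, 4, 4> tensor in contiguous (NCHW) and channels_last (NHWC) layouts:

```python
import torch

x = torch.randn(1, 3, 4, 4)      # NCHW, default contiguous layout
print(x.stride())                # (48, 16, 4, 1): offset = n*48 + c*16 + h*4 + w*1

y = x.to(memory_format=torch.channels_last)
print(y.shape)                   # sizes stay <N, C, H, W>: torch.Size([1, 3, 4, 4])
print(y.stride())                # (48, 1, 12, 3): channels are now the fastest-moving dimension
```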
@@ -146,7 +146,7 @@ The general guideline has been listed under reference [Writing-memory-format-awa
 ### c. Register oneDNN Kernel on Channels Last

-Registering a oneDNN kernel under Channels Last memory format on CPU is no different from [cuDNN](https://github.com/pytorch/pytorch/pull/23861): Only very few upper level changes are needed, such as accommodate 'contiguous()' to 'contiguous(suggested_memory_format)'. The automatic reorder of oneDNN weight shall been hidden in ideep.
+Registering a oneDNN kernel under Channels Last memory format on CPU is no different from [cuDNN](https://github.com/pytorch/pytorch/pull/23861): Only very few upper level changes are needed, such as accommodate 'contiguous()' to 'contiguous(suggested_memory_format)'. The automatic reorder of oneDNN weight shall have been hidden in ideep.
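The 'contiguous()' vs 'contiguous(suggested_memory_format)' distinction can also be observed from Python; this is only an illustrative sketch using standard `torch` calls (the change the text refers to lives in the C++ kernels):

```python
import torch

x = torch.randn(1, 8, 4, 4).to(memory_format=torch.channels_last)

# Plain contiguous() copies back to the default NCHW layout, dropping NHWC:
print(x.contiguous().is_contiguous(memory_format=torch.channels_last))  # False

# Passing the suggested memory format preserves the channels-last layout:
x_cl = x.contiguous(memory_format=torch.channels_last)
print(x_cl.is_contiguous(memory_format=torch.channels_last))            # True
```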
To ensure a smooth compilation, a script is provided in the GitHub repo. If you would like to compile the binaries from source, it is highly recommended to use this script.