Releases: microsoft/tensorflow-directml-plugin
tensorflow-directml-plugin 0.4.0
The Python packages are available as a PyPI release. To download the latest Python package automatically, run `pip install tensorflow-directml-plugin`.
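After installing, a quick way to check that the plugin is visible to TensorFlow is to list the physical devices. The snippet below is a minimal sketch, assuming that plugin devices are reported under the `GPU` device type; the helper name `list_plugin_devices` is hypothetical and not part of the package.

```python
def list_plugin_devices():
    """Return the names of devices TensorFlow reports under the "GPU" type.

    Assumption: DirectML plugin devices are registered as "GPU" devices.
    Returns an empty list when TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    devices = list_plugin_devices()
    if devices:
        print("Devices visible to TensorFlow:", devices)
    else:
        print("No GPU-type devices found (or TensorFlow is not installed).")
```

An empty result usually means either that TensorFlow is not importable in the current environment or that no adapter was picked up by the plugin.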
Changes in 0.4.0
- Add DirectML kernels for `CudnnRNNCanonicalToParams` and `CudnnRNNParamsToCanonical`
- Add support for grouped convolution in `Conv2DBackpropFilter` and `Conv3DBackpropFilter`
- Add `float16` support for `_FusedConv2D`
tensorflow-directml-plugin 0.3.0
Changes in 0.3.0
- Set `tensorflow-cpu==2.10.0` as a hard dependency due to incompatibility with Keras 2.11's default optimizers.
- Fix overflow in `BatchNorm` ops when float16 or mixed precision is used.
- Remove unnecessary `Cast` operation in `ReduceMin` and `ReduceMax` ops.
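Because 0.3.0 pins `tensorflow-cpu` to exactly 2.10.0, it can be useful to confirm what is installed before debugging optimizer errors. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `satisfies_pin` are hypothetical.

```python
from importlib import metadata

REQUIRED_VERSION = "2.10.0"  # the pin stated in the 0.3.0 release notes


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def satisfies_pin(version, required=REQUIRED_VERSION):
    """True only when `version` matches the pinned version exactly."""
    return version is not None and version == required


if __name__ == "__main__":
    found = installed_version("tensorflow-cpu")
    if satisfies_pin(found):
        print("tensorflow-cpu matches the pin:", found)
    else:
        print("tensorflow-cpu is", found, "but the plugin expects", REQUIRED_VERSION)
```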
tensorflow-directml-plugin 0.2.0
Changes in 0.2.0
- Improve TensorBoard profiling and capturing chrome traces
- Add support for `exponential_avg_factor != 1.0` in `FusedBatchNorm`
- Add an `int32` kernel registration for `Fill`
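To illustrate what `exponential_avg_factor` controls, the sketch below shows the running-mean update under the usual exponential-moving-average convention. This is an assumption about the operator's semantics, not code from the plugin, and `update_running_mean` is a hypothetical name.

```python
def update_running_mean(running_mean, batch_mean, exponential_avg_factor=1.0):
    """Blend the previous running mean with the current batch mean.

    Assumed convention: new = (1 - factor) * old + factor * batch.
    With factor == 1.0 the running mean is simply replaced by the batch mean.
    """
    factor = exponential_avg_factor
    return [
        (1.0 - factor) * old + factor * new
        for old, new in zip(running_mean, batch_mean)
    ]


if __name__ == "__main__":
    print(update_running_mean([0.0, 2.0], [1.0, 4.0], exponential_avg_factor=0.5))
```

A factor of 1.0 discards the previous statistics entirely, which is why values other than 1.0 need the old running mean as an extra input.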
tensorflow-directml-plugin 0.1.1
Changes in 0.1.1
- Fix a crash in `InTopKV2` when `k` is bigger than the size of the axis dimension.
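For context on the edge case, here is a pure-Python reference sketch of in-top-k semantics as commonly defined: a target counts as "in the top k" when fewer than `k` predictions in its row are strictly larger. It is an illustration only, not the plugin's kernel, and the function name `in_top_k` is used here as a stand-in.

```python
import math


def in_top_k(predictions, targets, k):
    """For each row, report whether the target class is among the k largest.

    Assumptions: out-of-range targets and non-finite target scores give
    False, and ties that straddle the boundary all count as in the top k.
    """
    results = []
    for row, target in zip(predictions, targets):
        if not 0 <= target < len(row) or not math.isfinite(row[target]):
            results.append(False)
            continue
        strictly_higher = sum(1 for score in row if score > row[target])
        results.append(strictly_higher < k)
    return results


if __name__ == "__main__":
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
    labels = [2, 0]
    print(in_top_k(scores, labels, k=1))
    print(in_top_k(scores, labels, k=5))  # k larger than the number of classes
```

Under this definition, a `k` larger than the axis size is well defined: every valid target with a finite score is in the top k.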
tensorflow-directml-plugin 0.1.0
Changes in 0.1.0
- Upgrade the DirectML version to 1.9.1, which includes minor bug fixes and performance improvements.
- Add DirectML kernels for the `RngSkip` and `RngReadAndSkip` operators.
- Add DirectML kernels for the `StatelessRandomGetKeyCounterAlg`, `StatelessRandomGetKeyCounter` and `StatelessRandomGetAlg` operators.
- Add a DirectML kernel for `SparseApplyAdagrad`.
- Add a DirectML kernel for `StatelessRandomUniformV2`.
- Add a DirectML kernel for `InTopKV2`.
- Add DirectML kernels for `MatrixDiagV3` and `MatrixDiagPartV3`.
- Add emulated support for `int64`.
- Add a dependency on `tensorflow-cpu>=2.10.0`. Users should install the `tensorflow-cpu` package instead of `tensorflow` or `tensorflow-gpu` when using `tensorflow-directml-plugin`.
- Add `int32` support for `StridedSlice`.
- Add CPU emulated versions of `UnsortedSegmentSum`, `UnsortedSegmentMax`, `UnsortedSegmentMin` and `UnsortedSegmentProd` to get rid of device placement errors in transformer models.
- Add a C API for Linux. The C API can be downloaded from the releases page in the `tensorflow-directml-plugin` GitHub repository.
- Add support for multiple devices.
- Add integer support for `Relu`.
- Add `int32` support for `Pack`.
- Fix the incomplete adapter description on Linux.
- Fix a crash in `ArgMin` and `ArgMax` when the output type was `int16` or `uint16`.
- Fix an undefined behavior when retrieving a list of strings from an attribute.
- Fix a memory leak in the BFC allocator.
- Fix a memory leak in the graph optimizer.
- Fix a memory leak in `SegmentReduction`.
- Fix a memory leak in `StridedSlice`.
- Fix a memory leak in the emulated random kernels.
- Fix the validation of `Range` to allow values near `INT_MAX`.
- Get rid of warnings related to unsupported `DataFormatDimMap` and `DataFormatVecPermute` operators.
- Prevent unbounded growth of command allocator memory.
- Optimize output allocation for inputs that can be executed in-place and directly forwarded to the output.
- Increase the available memory by allowing devices to allocate shared (nonlocal) memory.
- Improve the performance of the unsorted segment operators by batching GPU->CPU copies together.
- Increase the performance of emulated operators by reducing the number of eager contexts and eager ops that are created.