Labels: new feature (New feature or request)
Description
I noticed that the conversion pass from the arith dialect to the neura dialect currently handles arith.cmpi but does not include a conversion for arith.cmpf.
Since the neura dialect already provides an fcmp operation, this seems to be an unintended omission. If a fix is needed, I would be happy to work on adding support for this conversion.
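As a rough illustration, the missing lowering could be a conversion pattern along these lines. This is only a sketch: `neura::FCmpOp` and its attribute shape are assumptions modeled on how the existing arith.cmpi lowering and `neura.sel` appear, not the actual neura API.

```cpp
// Hypothetical sketch: assumes a neura::FCmpOp that takes the two float
// operands and a string attribute naming the comparison predicate.
struct CmpFToNeuraPattern
    : public mlir::OpConversionPattern<mlir::arith::CmpFOp> {
  using OpConversionPattern::OpConversionPattern;

  mlir::LogicalResult
  matchAndRewrite(mlir::arith::CmpFOp op, OpAdaptor adaptor,
                  mlir::ConversionPatternRewriter &rewriter) const override {
    // Carry the arith predicate (e.g. "ogt") over as an attribute.
    auto cmpType = rewriter.getStringAttr(
        mlir::arith::stringifyCmpFPredicate(op.getPredicate()));
    rewriter.replaceOpWithNewOp<neura::FCmpOp>(
        op, op.getType(), adaptor.getLhs(), adaptor.getRhs(), cmpType);
    return mlir::success();
  }
};
```

The pattern would then be registered alongside the existing cmpi pattern in the --lower-arith-to-neura pass.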
An example:
// RUN: mlir-opt nonlinear.mlir --convert-math-to-llvm --lower-affine --convert-scf-to-cf --convert-cf-to-llvm -o kernel-llvm.mlir
// RUN: mlir-neura-opt --assign-accelerator --lower-arith-to-neura --lower-llvm-to-neura kernel-llvm.mlir
module attributes {} {
  func.func @softmax(%arg0: memref<?xf32>, %arg1: memref<?xf32>, %arg2: i32) attributes {} {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c0_f32 = arith.constant 0.0 : f32
    // Find max value
    %0 = affine.load %arg0[%c0] : memref<?xf32>
    %1 = arith.index_cast %arg2 : i32 to index
    %max_val = affine.for %i = 1 to %1 iter_args(%max = %0) -> (f32) {
      %val = affine.load %arg0[%i] : memref<?xf32>
      %cmp = arith.cmpf ogt, %val, %max : f32
      %new_max = arith.select %cmp, %val, %max : f32
      affine.yield %new_max : f32
    }
    return
  }
}
After running these commands, the generated IR is as below; it still contains arith.cmpf instead of neura dialect operations.
...
^bb2:  // pred: ^bb1
  %9 = memref.load %arg0[%7] : memref<?xf32>
  %10 = arith.cmpf ogt, %9, %6 : f32
  %11 = "neura.sel"(%9, %6, %10) : (f32, f32, i1) -> f32
  %12 = "neura.add"(%7, %1) : (index, index) -> index
  %13 = builtin.unrealized_conversion_cast %12 : index to i64
  neura.br %13, %11 : i64, f32 to ^bb1
^bb3:  // pred: ^bb1
...
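For comparison, with a cmpf lowering in place one would expect the comparison itself to become a neura operation as well, something like the fragment below. The generic-form spelling and the attribute name here are assumptions, mirroring how neura.sel and neura.add print above.

```mlir
// Hypothetical expected output; the cmpType attribute name is assumed.
%10 = "neura.fcmp"(%9, %6) {cmpType = "ogt"} : (f32, f32) -> i1
%11 = "neura.sel"(%9, %6, %10) : (f32, f32, i1) -> f32
```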