
[P1] Missing conversion from arith.cmpf to neura.fcmp #123

@HobbitQia

Description

I noticed that the conversion pass from the arith dialect to the neura dialect currently handles arith.cmpi but does not include a conversion for arith.cmpf.

Since the neura dialect already provides an fcmp operation, this looks like an unintended omission. If a fix is needed, I would be happy to work on adding support for this conversion.
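
For reference, the desired lowering would presumably mirror what the pass already does for integer compares. A rough sketch (the generic-form spelling and the predicate attribute name cmpType are assumptions on my side, modeled on how neura.sel appears in the output below):

    %cmp = arith.cmpf ogt, %val, %max : f32
    // would become something like:
    %cmp = "neura.fcmp"(%val, %max) {cmpType = "ogt"} : (f32, f32) -> i1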

An example:

// RUN: mlir-opt nonlinear.mlir --convert-math-to-llvm --lower-affine --convert-scf-to-cf --convert-cf-to-llvm -o kernel-llvm.mlir
// RUN: mlir-neura-opt --assign-accelerator --lower-arith-to-neura --lower-llvm-to-neura kernel-llvm.mlir
module attributes {} {
  func.func @softmax(%arg0: memref<?xf32>, %arg1: memref<?xf32>, %arg2: i32) attributes {} {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c0_f32 = arith.constant 0.0 : f32
    
    // Find max value
    %0 = affine.load %arg0[%c0] : memref<?xf32>
    %1 = arith.index_cast %arg2 : i32 to index
    %max_val = affine.for %i = 1 to %1 iter_args(%max = %0) -> (f32) {
      %val = affine.load %arg0[%i] : memref<?xf32>
      %cmp = arith.cmpf ogt, %val, %max : f32
      %new_max = arith.select %cmp, %val, %max : f32
      affine.yield %new_max : f32
    }
    return
  }
}

After running the commands above, the generated IR still contains arith.cmpf rather than a neura operation, as shown below:

...
  ^bb2:  // pred: ^bb1
    %9 = memref.load %arg0[%7] : memref<?xf32>
    %10 = arith.cmpf ogt, %9, %6 : f32
    %11 = "neura.sel"(%9, %6, %10) : (f32, f32, i1) -> f32
    %12 = "neura.add"(%7, %1) : (index, index) -> index
    %13 = builtin.unrealized_conversion_cast %12 : index to i64
    neura.br %13, %11 : i64, f32 to ^bb1
  ^bb3:  // pred: ^bb1
...
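
With the missing pattern in place, I would expect the same block to lower fully into the neura dialect, along these lines (again, the exact neura.fcmp form is an assumption):

  ^bb2:  // pred: ^bb1
    %9 = memref.load %arg0[%7] : memref<?xf32>
    %10 = "neura.fcmp"(%9, %6) {cmpType = "ogt"} : (f32, f32) -> i1
    %11 = "neura.sel"(%9, %6, %10) : (f32, f32, i1) -> f32
    ...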

Metadata

Labels

new feature (New feature or request)
