Open
@jim22k

Description

The mask in reduce_to_vector is a Vector, while the input is a Matrix, so the mask cannot be applied before the reduction. If select_by_mask supported Vector-to-Matrix broadcasting, the mask could be applied earlier, possibly saving compute time.

Alternatively, the mask could be brought into the linalg.generic loop and used to skip entire rows.

The second approach (masking inside the loop) would be faster and simpler, if it works.
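
For illustration only, here is a rough dense NumPy sketch of the two options next to the current reduce-then-mask ordering. It is not the MLIR lowering: the function names are hypothetical, the reduction is assumed to be a row-wise sum, and masked-out entries are written as 0, whereas in a sparse result they would simply be absent.

```python
import numpy as np

def reduce_then_mask(A, mask):
    # Current ordering: reduce every row, then drop the masked-out results.
    return np.where(mask, A.sum(axis=1), 0)

def mask_by_broadcast(A, mask):
    # Option 1: broadcast the (nrows,) mask across the columns of A,
    # select the surviving entries, then reduce.
    selected = np.where(mask[:, None], A, 0)
    return selected.sum(axis=1)

def mask_in_loop(A, mask):
    # Option 2: consult the mask inside the row loop and skip whole rows,
    # so masked-out rows are never read or reduced.
    out = np.zeros(A.shape[0], dtype=A.dtype)
    for i in range(A.shape[0]):
        if not mask[i]:
            continue
        out[i] = A[i].sum()
    return out

# Assumed toy input: a 3x3 matrix and a mask that drops the middle row.
A = np.array([[1, 2, 0],
              [0, 3, 4],
              [5, 0, 6]])
mask = np.array([True, False, True])
```

All three give the same result on this input. Only the loop version avoids touching the masked-out row at all, which is where the second approach would be expected to save work.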