Failing to Finalize Execution Plan Using cuDNN Backend to Create a Fused Attention fprop Graph
I am implementing the Fused Attention fprop graph pattern with the cuDNN backend API. For now I am only combining the two matrix multiplications, meaning g3 and g4 are empty. I believe I have met all the requirements for this graph, but none of the engine configurations returned by the engine heuristic work when passed to an execution plan: finalizing the execution plan with any of them returns CUDNN_STATUS_NOT_SUPPORTED.