I tested the example code in this issue: https://github.com/pytorch/pytorch/issues/106255

I fixed `forward` as below:
```cpp
torch::Tensor forward(torch::Tensor x)
{
    x = fc->forward(x);
    return x;
}
```
Here is the result:

```
Memory used: 382
Memory used: 766
Memory used: 1362
Memory used: 1362
Memory used: 978
```
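For reference, numbers like the ones above are typically gathered with a small helper built on `cudaMemGetInfo`; the helper below (`print_memory_used` is a hypothetical name, the linked issue's version may differ) is a minimal sketch of that measurement:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper: reports device memory currently in use, in MB,
// by querying the CUDA runtime. Note this counts everything held by the
// CUDA context (caching-allocator pool, cuBLAS/cuDNN workspaces, the
// context itself), not just live tensors.
void print_memory_used()
{
    size_t free_bytes = 0, total_bytes = 0;
    cudaMemGetInfo(&free_bytes, &total_bytes);
    std::printf("Memory used: %zu\n",
                (total_bytes - free_bytes) / (1024 * 1024));
}
```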
Unlike with the example code, some memory remains even after clearing the cache.
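To be explicit about what "clearing the cache" means here, I am assuming it refers to returning the caching allocator's unused blocks to the driver, along the lines of:

```cuda
#include <c10/cuda/CUDACachingAllocator.h>

// Presumed "clear the cache" step: hand the caching allocator's
// currently-unused blocks back to the CUDA driver. Memory backing
// still-live tensors and the CUDA context itself is not released
// by this call.
void clear_cuda_cache()
{
    c10::cuda::CUDACachingAllocator::emptyCache();
}
```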
I then tried freeing the tensors' storage directly:

```cpp
cudaFree(input.data_ptr());
cudaFree(output.data_ptr());
```
The result is:

```
Memory used: 214
```

Something still remains.
Where does this memory come from, and how can I release it?