Tag Archive for pytorch multi-gpu

How to know how much GPU ram is occupied by a Pytorch model and how much by the data batch?

I have been experimenting with PyTorch by training different models. At runtime, I don't know the maximum model size or batch size I can use before hitting an OOM (out of memory) error on each GPU. I could manually iterate through different batch sizes to find the limit, but it would be more efficient to measure the memory usage of my model and my data directly. nvidia-smi is useful, but it only reports the aggregated total of occupied RAM per GPU, so it cannot tell the model and the batch apart.
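One way to separate the two is to take snapshots with PyTorch's own allocator counters: `torch.cuda.memory_allocated()` reports bytes currently held by tensors, and `torch.cuda.max_memory_allocated()` reports the peak since the last reset (which also captures activations created during forward/backward). A minimal sketch, using a hypothetical toy `Linear` model for illustration:

```python
import torch

def tensor_bytes(model: torch.nn.Module) -> int:
    """Bytes held by a module's parameters and buffers (independent of device)."""
    return (sum(p.numel() * p.element_size() for p in model.parameters())
            + sum(b.numel() * b.element_size() for b in model.buffers()))

# Hypothetical toy model, stand-in for whatever you are training.
model = torch.nn.Linear(1024, 1024)
print(f"model weights: {tensor_bytes(model)} bytes")  # (1024*1024 + 1024) * 4 for float32

if torch.cuda.is_available():
    before = torch.cuda.memory_allocated()
    model = model.cuda()
    after_model = torch.cuda.memory_allocated()
    print(f"model on GPU:  {after_model - before} bytes")

    batch = torch.randn(256, 1024, device="cuda")
    after_batch = torch.cuda.memory_allocated()
    print(f"data batch:    {after_batch - after_model} bytes")

    out = model(batch)
    # Peak allocation seen so far, including forward activations:
    print(f"peak so far:   {torch.cuda.max_memory_allocated()} bytes")
```

Note that the allocator rounds allocations up to block sizes, so the deltas can be slightly larger than the raw tensor sizes, and the peak during training (with gradients and optimizer state) is typically several times the parameter memory alone.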
