CUDA memory errors

#15
by darkmentat - opened

I was able to train a model last night just fine, but trying again today (regardless of how few or how many images I use), I consistently get an out-of-memory error:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 14.76 GiB total capacity; 13.48 GiB already allocated; 11.75 MiB free; 13.60 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

How can I set max_split_size_mb, or is something else happening?
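On the max_split_size_mb part of the question: the error message points at the PYTORCH_CUDA_ALLOC_CONF environment variable, which takes options in the form `max_split_size_mb:<value>`. Below is a minimal sketch of setting it from Python. The helper name `configure_cuda_allocator` and the value 128 are assumptions for illustration, not a recommendation from this thread, and the variable should be set before PyTorch touches the GPU (setting it before `import torch` is the cautious choice).

```python
import os


def configure_cuda_allocator(max_split_size_mb: int = 128) -> str:
    """Set PYTORCH_CUDA_ALLOC_CONF for the current process.

    Hypothetical helper: the name and the default of 128 are illustrative.
    Call it before PyTorch initializes CUDA, otherwise the setting may be ignored.
    """
    if max_split_size_mb <= 0:
        raise ValueError("max_split_size_mb must be a positive number of MiB")
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


# Set the option first, then import torch and start training as usual.
configured = configure_cuda_allocator(128)
print(f"PYTORCH_CUDA_ALLOC_CONF={configured}")
```

The same variable can instead be set in the environment that launches the training process, which avoids any dependence on import order.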

Rebuilding the space/container may have fixed it...

Resolved with restart and rebuild.
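As for whether something else was happening: the error text says max_split_size_mb helps when reserved memory is much larger than allocated memory. A rough way to check that against the numbers in the message above is sketched below. The function names and regular expressions are assumptions written for the message format quoted in this thread; other PyTorch versions may word the message differently.

```python
import re


def parse_oom_sizes(message: str) -> dict:
    """Pull the memory figures out of a CUDA OOM message, converted to MiB.

    The patterns are assumptions based on the message quoted in this thread.
    """
    patterns = {
        "total": r"([\d.]+) (MiB|GiB) total capacity",
        "allocated": r"([\d.]+) (MiB|GiB) already allocated",
        "free": r"([\d.]+) (MiB|GiB) free",
        "reserved": r"([\d.]+) (MiB|GiB) reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match:
            value = float(match.group(1))
            # Normalize everything to MiB so the figures are comparable.
            sizes[name] = value * 1024 if match.group(2) == "GiB" else value
    return sizes


def reserved_minus_allocated(sizes: dict) -> float:
    """Memory held by the caching allocator but not in use by tensors (MiB)."""
    return sizes["reserved"] - sizes["allocated"]


message = (
    "CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 14.76 GiB total "
    "capacity; 13.48 GiB already allocated; 11.75 MiB free; 13.60 GiB reserved "
    "in total by PyTorch)"
)
sizes = parse_oom_sizes(message)
gap = reserved_minus_allocated(sizes)
ratio = sizes["reserved"] / sizes["allocated"]
print(f"reserved - allocated: {gap:.2f} MiB (ratio {ratio:.3f})")
```

For this message the gap is about 123 MiB on top of roughly 13.5 GiB allocated, so reserved memory is not much larger than allocated memory. That suggests the GPU was simply full rather than fragmented, which would be consistent with a restart and rebuild clearing the problem, though the thread does not confirm the cause.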

darkmentat changed discussion status to closed
