I have a large array and I want to mask out certain values (set them to nodata). But I’m experiencing an out-of-memory error despite having sufficient RAM.

The example below reproduces my situation. The array is ~14.5 GB and the mask is ~7 GB, but I have 64 GB of RAM available for this, so I don’t understand why it fails.

```
import numpy as np

arr = np.zeros((1, 71829, 101321), dtype='uint16')
arr.nbytes
# 14555572218
mask = np.random.randint(2, size=(71829, 101321), dtype='bool')
mask.nbytes
# 7277786109
nodata = 0
# this results in an OOM error
arr[:, mask] = nodata
```

Interestingly, if I drop the zeroth dimension, the same operation works:

```
arr = np.zeros((71829, 101321), dtype='uint16')
arr.nbytes
# 14555572218
mask = np.random.randint(2, size=(71829, 101321), dtype='bool')
mask.nbytes
# 7277786109
nodata = 0
# this works
arr[mask] = nodata
```

But that isn’t something I can use: this code will be part of a library module that has to accept arrays with a variable size along the zeroth dimension.
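For concreteness, here is a minimal sketch of the kind of library function I have in mind (the name `apply_nodata` is just for illustration, and the shapes here are tiny stand-ins for the real data):

```
import numpy as np

def apply_nodata(arr, mask, nodata):
    """Set masked positions to nodata in every band of arr.

    arr  : array of shape (bands, rows, cols); bands varies per caller
    mask : boolean array of shape (rows, cols)
    """
    # this is the line that runs out of memory on large inputs
    arr[:, mask] = nodata
    return arr

# small demonstration
a = np.ones((3, 4, 5), dtype='uint16')
m = np.zeros((4, 5), dtype=bool)
m[0, 0] = True
apply_nodata(a, m, 0)
```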

My guess is that `arr[mask] = nodata` modifies the array in place, while `arr[:, mask] = nodata` creates a new array, but I don’t know why that would be the case. Even if it did, there should still be enough space, since the combined size of `arr` and `mask` is only ~22 GB and I have 64 GB of RAM.

I tried searching about this and found this question, but I’m new to NumPy and didn’t understand the explanation in the longer answer. I did try the `np.where` approach from the other answer, but I still get an OOM error.
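For reference, the `np.where` variant I tried was essentially the following (shown here on a tiny array; with the real sizes it presumably fails because `np.where` allocates a whole new array):

```
import numpy as np

arr = np.ones((1, 4, 5), dtype='uint16')  # tiny stand-in for the 14.5 GB array
mask = np.zeros((4, 5), dtype=bool)
mask[0, 0] = True
nodata = 0

# mask (4, 5) broadcasts against arr (1, 4, 5); the result is a new array
arr = np.where(mask, nodata, arr)
```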

Any input would be appreciated.