I think your error comes from setting your mask size to (1024, 1024). The network predicts a few dozen masks for each image and then selects the best of them. If your image is already of size (1024, 1024) and you add several dozen masks of the same size on top of it, your GPU will not have enough memory to hold them all.
In the standard configuration, the masks have size:
mask_height: 33
mask_width: 33
So I suggest lowering your mask size to something close to these values.
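
To get a feel for the difference, here is a rough back-of-the-envelope sketch. It assumes 100 candidate masks per image and 4 bytes (float32) per mask value; both numbers are my assumptions for illustration, not values read from your config, and `mask_memory_bytes` is just a helper I made up. It counts only the mask outputs themselves, so real usage during training (activations, gradients, batch size) will be noticeably higher.

```python
def mask_memory_bytes(num_masks, height, width, bytes_per_value=4):
    """Bytes needed to store num_masks dense masks of size height x width."""
    return num_masks * height * width * bytes_per_value


# Assumed number of candidate masks per image (illustrative only).
NUM_MASKS = 100

small = mask_memory_bytes(NUM_MASKS, 33, 33)
large = mask_memory_bytes(NUM_MASKS, 1024, 1024)

print(f"33 x 33 masks:     {small / 2**20:.2f} MiB")
print(f"1024 x 1024 masks: {large / 2**20:.2f} MiB")
print(f"ratio:             {large / small:.0f}x")
```

Under these assumptions the (1024, 1024) masks need about 400 MiB per image for the mask outputs alone, against less than half a MiB for the 33 x 33 ones.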