At a particular epoch, the NN was using an nd array of shape (5828, 10, 200), which comes to more than 1 crore (10 million) elements: 5828 × 10 × 200 = 11,656,000. That was too large, since 1 crore was the limit for my GPU.
I wrote code that checks whether any batch exceeds this limit and, if so, divides that batch into 2 parts. In the end, every batch's nd array had fewer than 1 crore elements, and training then ran successfully on my GPU.
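My original code is not shown here, but the idea can be sketched roughly as follows. This is a minimal illustration using NumPy, and it assumes the batch is split along its first axis; the names `MAX_ELEMENTS` and `split_batch` are made up for this example and are not from any library.

```python
import numpy as np

# Assumed limit: 1 crore (10 million) elements per batch.
MAX_ELEMENTS = 10_000_000


def split_batch(batch, max_elements=MAX_ELEMENTS):
    """Halve a batch along axis 0 until every piece has fewer than max_elements."""
    # Stop when the piece is small enough, or when it cannot be split further.
    if batch.size < max_elements or batch.shape[0] <= 1:
        return [batch]
    mid = batch.shape[0] // 2
    # Slicing returns views, so no data is copied here.
    return split_batch(batch[:mid], max_elements) + split_batch(batch[mid:], max_elements)


# The shape from the problem above: 5828 * 10 * 200 = 11,656,000 elements.
batch = np.zeros((5828, 10, 200), dtype=np.float32)
parts = split_batch(batch)
print([p.shape for p in parts])  # [(2914, 10, 200), (2914, 10, 200)]
```

Each piece can then be fed to the network as its own batch. Splitting recursively means a batch that is still too large after one split gets halved again.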