Basic logic gates can be a real pain in the ass for a NN to learn.
I did not spot anything wrong in your code, but given the nature of your problem and what you describe, I would bet the issue is in the training data. When there is an imbalance between positives and negatives in the data, a NN can become biased towards the positive cases and generate false positives. If your training data has equal numbers of each input combination, your negative cases are only 25% of the set.
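To illustrate (assuming your target is something like an OR gate trained on its plain truth table; this is a hypothetical sketch, not your code):

```python
import numpy as np

# OR-gate truth table: only [0, 0] is a negative example,
# so negatives are 1 sample out of 4 (25%) if each row appears once.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)

print("negative fraction:", np.mean(y == 0))  # prints 0.25
```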
There are two ways to correct this: one is changing your training set to include more [0,0] cases, or you can give more weight to the false-positive error, e.g. by multiplying that error by a constant before using it to adjust the node weights. A sketch of both options is below.
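Here is a minimal sketch of both ideas, assuming a single sigmoid neuron trained with plain gradient descent on squared error; the names and constants (like `neg_weight = 3.0`) are illustrative, not taken from your code:

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)

# Option 1: oversample the negative case so the classes are balanced.
# (You would then train on X_bal / y_bal instead of X / y.)
X_bal = np.vstack([X, np.tile([[0, 0]], (2, 1))])
y_bal = np.vstack([y, np.zeros((2, 1))])

# Option 2: per-sample weights that penalize errors on negatives more.
neg_weight = 3.0                       # illustrative constant
w_sample = np.where(y == 0, neg_weight, 1.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Single sigmoid neuron, weighted squared error, gradient descent.
w = rng.normal(size=(2, 1))
b = 0.0
lr = 0.5
for _ in range(10000):
    out = sigmoid(X @ w + b)
    err = (out - y) * w_sample         # scale the error before backprop
    grad = err * out * (1 - out)       # derivative of sigmoid + squared error
    w -= lr * X.T @ grad / len(X)
    b -= lr * grad.mean()

print(np.round(sigmoid(X @ w + b), 3))  # outputs should move toward [0, 1, 1, 1]
```

The same error-scaling trick works in a multi-layer network: multiply the output-layer error for the negative samples by the constant before backpropagating it.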