The most straight forward reason for this is every number (except 2) will XOR out to 0.
Now visualize all the numbers in binary representation one below the other.
The ones XORing out to 0 MUST HAVE identical bit masks (of course so, because all, except 2 are duplicates and the duplicates will have identical bit masks, because the values are identical). So, the only values contributing to the final non-zero XOR value are the 2 non-repeating numbers.
So, basically XORing only those 2 non-repeating numbers and XORing all the numbers is effectively same.
And the NON-ZERO final XOR gives in a very important clue. The clue that it is NON-ZERO. And that means, because of XOR’s mutually exclusive properties, the two non-repeating numbers must have AT LEAST one bit different. Since that bit has contributed to final NON-ZERO XOR value, all the rest of the numbers HAVE TO HAVE CANCELLED EACH OTHER. There’s absolutely no way that bit is staying NON-ZERO, otherwise. So, you can trust only one (any one) set bit in the final XOR value. That is bit is sure to be set odd number of times in the entire set (because then only it can be exclusive and hence set in the final XOR value), AND, IMPORTANTLY, it HAS TO be set ON in one number, and OFF in the other. So, that bit can help split the entire set into 2 halves, effectively reducing the problem to finding only one non-repeating number, but this time, in 2 sets from the original set. Think about it. Just pay close attention to the XOR’s properties. It would start making sense.