Apparently, to train the "English" recognition model, the Windows UI language has to be English too, and I don't really want to change the UI language... but it might be worth trying later to see if training can solve the problem.
I turned off everything that was making noise, but the issue still persists, and the confidence level is always around 0.93.
My microphone is a Blue Yeti Nano, so it's not a bad one.
As a programmer, I understand that handling "," for a special case like this is very annoying and not worth the effort :(