I've tried it on my home PC again, and it looks like you may be making a few more synchronization calls per frame than necessary.
You may shave off some cycles if you wrap the sync-call routine in a function that checks whether it has already been called in the current frame and only updates the sync variables on the first call.
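Here is a rough sketch of what I mean, in Python just to show the idea. I don't know what your actual sync routine looks like, so `expensive_sync`, `FrameSync`, and the frame counter are all made-up names and assumptions on my part; swap in whatever your code really uses.

```python
class FrameSync:
    """Runs the wrapped sync routine at most once per frame and caches the result."""

    def __init__(self, update_fn):
        self._update_fn = update_fn   # the real (expensive) sync routine, hypothetical here
        self._last_frame = None       # frame index of the last real update
        self._values = None           # cached sync variables for that frame
        self.update_count = 0         # how many real updates happened (for checking)

    def get(self, frame):
        # Only the first call in a given frame does the real work;
        # later calls in the same frame return the cached values.
        if frame != self._last_frame:
            self._values = self._update_fn(frame)
            self._last_frame = frame
            self.update_count += 1
        return self._values


def expensive_sync(frame):
    # Stand-in for the real synchronization call (an assumption, not your API).
    return {"time": frame / 60.0}


sync = FrameSync(expensive_sync)
for frame in range(3):
    for _ in range(5):            # five callers asking for sync data per frame
        values = sync.get(frame)
```

With this, fifteen requests over three frames trigger only three real updates. This assumes the sync variables don't need to change within a frame; if some caller needs fresh values mid-frame, it would have to bypass the cache.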