I like the composition here. It is simple, yet the other guys around the subject add value to the framing: they are all looking on with interest, which creates a sense of interest for the viewer too.
I also agree that there is no need to shift position in this case. I would snap first and crop later if necessary, rather than move for better framing and miss the moment because of the movement.
Furthermore, I like how TS presented this with a side view instead of a diagonal side view. In this case, the main catch is the light-hearted stunt and the keen interest of the observant passerby. Although the old man's candid expression may add interest to the overall picture, I feel it boils down to what TS wants to capture in that split second: the stunt or the expression. Personally, I would capture the stunt first, as it is "rarer" than a facial expression.