Bandwidth is needed for more things than just the data you send yourself. The XNA Framework handles voice automatically, but every time you speak into the headset, we have to send that data out over the wire.
The voice stream is heavily compressed, using ~500 bytes per second, and only when you are actually talking.
By default, all players can talk to all others. Consider a 16 player game, where one player is talking to the other 15: 500 bytes per second × 15 listeners = 7,500 bytes per second.
Yikes! Remember we only have 8k per second in total. We've nearly used the whole thing up, even before sending any actual game data.
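To put a number on "nearly", here is a rough back-of-the-envelope sketch in Python. It is only an illustration: the function name and constants are made up for this example, it assumes "8k" means 8 kilobytes (8192 bytes) per second, it assumes the talker sends a separate copy of the voice stream to each listener, and it ignores any packet header overhead.

```python
# Rough voice bandwidth estimate. All names here are hypothetical,
# not part of the XNA Framework.

VOICE_BYTES_PER_SECOND = 500      # approximate compressed voice rate quoted above
BANDWIDTH_BUDGET = 8 * 1024       # assumption: "8k" means 8 kilobytes per second


def voice_upload_cost(player_count, bytes_per_second=VOICE_BYTES_PER_SECOND):
    """Bytes per second one talking player sends if everyone else can hear them."""
    listeners = player_count - 1  # a player does not send voice to themselves
    return listeners * bytes_per_second


cost = voice_upload_cost(16)
print(cost)                                   # 7500
print(round(100 * cost / BANDWIDTH_BUDGET))   # 92 (percent of the budget)
```

Under those assumptions, a single talker in a full 16 player session eats roughly 92% of the budget, which leaves only a few hundred bytes per second for game data.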
How can you survive this deadly attack of the killer voice bandwidth gremlins?