Posted on June 6, 2010 by slicer
When we first started with Mumble, we did it because none of the commercial alternatives had quality we were happy with, and there were no open source projects that would have been a good starting base. So, we made our own. This means Mumble was made to satisfy a need; the need we have for high quality voicecom when we play. As such, our two focus areas are voice quality and voice latency. Oh, and “cool technical stuff”. But I digress. Quality is subjective, and is hard to quantify, but latency is much easier to test.
At the time, we were playing a lot of Battlefield 2. And one of our primary problems with the voicecomms we had was that if you saw a grenade and yelled “GRENADE!”, what your clanmate actually heard was a loud boom, followed by your desperate attempt to warn him (or her). It really should be the other way around, and we’ve made sure to emphasize this every chance we got. At first, people were skeptical. Did latency really matter? But as more people moved to use Mumble, more people discovered that quality and latency do matter, as they allow you to speak naturally instead of feeling like you’re on a shared walkie talkie.
A few years pass. Other competing products arrive. Old competing products get released in new versions… And they all mention the magical words “low latency”. There are now YouTube videos comparing latency, and numerous technical quasi-explanations around the web on what latency you can achieve with solution X over solution Y. People discuss individual milliseconds. I’m not sure if I should be proud or ashamed, but we’ve made “low-latency” a new buzzword for gaming VoIP. Proud, because it means we’ve managed to make the entire field of VoIP apps better for the end user. Ashamed, because to most people it is still just a word without any actual definition to it.
Our best repeatable results, achieved using either ALSA hw: or WASAPI exclusive mode, are around 40-50 ms. That’s mouth to ear, including network travel time. But what does that actually mean? According to Google, 40 milliseconds equals 13.6 meters at the speed of sound. So, place two people 15 meters apart, each with a headset, and have person A say “Latency”. Person B will hear it in the headset before he can hear it through the air. That is low latency. Unfortunately, such configurations are useless for actual gaming; exclusive access to the hardware device means games can’t use it.
It’s still a cool result.
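For anyone who wants to check the 13.6 meter figure, here is a minimal sketch of the arithmetic. It assumes a speed of sound of 340 m/s, which is what the Google conversion implies (13.6 m / 0.040 s) and is roughly right for air at around 15 °C. The function names are made up for this illustration and have nothing to do with Mumble's code.

```python
# Assumption: speed of sound in air, roughly valid at about 15 degrees C.
SPEED_OF_SOUND_M_PER_S = 340.0


def sound_distance_m(latency_ms: float) -> float:
    """Distance sound travels through air in the given number of milliseconds."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000.0


def air_delay_ms(distance_m: float) -> float:
    """Time sound needs to cover the given distance through air, in milliseconds."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


if __name__ == "__main__":
    # The 40-50 ms mouth-to-ear range quoted above, expressed as a distance.
    for ms in (40, 50):
        print(f"{ms} ms of latency ~ {sound_distance_m(ms):.1f} m of air")

    # The 15 meter thought experiment: how long does the voice take through air?
    print(f"15 m through air takes ~ {air_delay_ms(15):.1f} ms")
```

Under that assumption, sound needs about 44 ms to cross 15 meters, so the headset only wins the race at the 40 ms end of the range; at 50 ms the two people would have to stand a bit further apart.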