I want to talk about the new machine learning (ML) and artificial intelligence (AI) features in Microsoft Teams that make meetings and calls sound much better, even in the trickiest situations.

Machine learning (ML) model for Microsoft Teams

The machine learning (ML) model behind Microsoft Teams has recently been updated to suppress unwanted echoes. This is good news for anyone whose train of thought has been thrown off by the sound of their own words coming back at them.

The model goes a step further by enabling “full duplex” sound, which improves the way people talk over Teams. Users can now talk and listen at the same time, making conversations feel more natural and less choppy.

Teams also uses AI to reduce reverberation, which makes it easier for users to hear each other in rooms with bad acoustics. Users can now sound as if they are speaking into a headset microphone, even in a large room where speech and other noises bounce from wall to wall. Audio captured in these hard-to-record spaces now sounds just like a conversation in the office.

Teams Audio Improvements

Teams fixes audio issues like echo, talk-over interruptions, and reverberation to improve calls.

Echo is a familiar and annoying artifact in online meetings: sound played by the loudspeaker leaks back into the microphone, so the person on the other end of the call hears their own voice repeated back. The sketch below shows how that mixture forms.
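To make that concrete, here is a minimal, hypothetical sketch in plain NumPy (not Teams’ actual pipeline) of how the signal reaching an echo canceller is formed; the delay, room response, and noise stand-ins are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sr = 16_000                               # 16 kHz sample rate, typical for speech

# Stand-ins for real audio: white noise instead of recorded speech.
far_end = rng.standard_normal(sr)         # what the loudspeaker plays (remote talker)
near_end = 0.5 * rng.standard_normal(sr)  # what the local user says

# Assumed echo path: the loudspeaker signal reaches the microphone
# after a short delay, attenuated and smeared by the room.
delay = int(0.02 * sr)                    # 20 ms acoustic delay (assumption)
room = 0.3 * (0.97 ** np.arange(400))     # crude exponentially decaying tail
echoed = np.convolve(np.pad(far_end, (delay, 0))[:sr], room)[:sr]

# The microphone picks up both, and the echo is often louder than the
# local talker. This mixture is what the echo canceller has to clean up.
mic = near_end + echoed
```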

An echo cancellation module’s job is to detect when sound from the loudspeaker gets into the microphone and remove it from the outgoing signal. The loudspeaker is often closer to the microphone than the person using the device, so the echo signal arrives louder than the end user’s voice. That makes it hard to remove the echo without also removing the user’s speech, especially when both parties try to talk simultaneously. And when only one person can talk at a time, it’s hard for users to get remote attendees’ attention.
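For contrast with the AI approach described next, here is a sketch of the traditional baseline: a normalized least-mean-squares (NLMS) adaptive filter, a textbook technique rather than Teams’ actual implementation (the tap count and step size are assumptions). Its weakness is visible in the update step: during double-talk, the residual `e` contains the local talker’s speech, which corrupts the echo-path estimate.

```python
import numpy as np

def nlms_echo_canceller(mic, far_end, taps=256, mu=0.5, eps=1e-6):
    """Classic NLMS: estimate the echo path from the far-end reference,
    predict the echo, and subtract it from the microphone signal."""
    w = np.zeros(taps)                    # running estimate of the echo path
    out = np.zeros_like(mic, dtype=float)
    for n in range(taps, len(mic)):
        x = far_end[n - taps:n][::-1]     # most recent far-end samples
        echo_hat = w @ x                  # predicted echo at sample n
        e = mic[n] - echo_hat             # residual: near speech + estimation error
        out[n] = e
        # Normalized step size keeps adaptation stable across signal levels,
        # but near-end speech leaking into `e` still drags the estimate off.
        w += (mu / (x @ x + eps)) * e * x
    return out

cleaned = nlms_echo_canceller(mic, far_end)   # signals from the sketch above
```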

Teams’ AI-based approach to echo cancellation and interruptibility fixes the shortcomings of traditional digital signal processing. Microsoft used data from thousands of devices to build a dataset of about 30,000 hours of clean speech, then trained a real-time model on it that can handle extreme audio conditions.
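Microsoft hasn’t published the exact training recipe, but a common way to build such a dataset is to synthesize supervised pairs from clean speech. A minimal sketch under that assumption; the function name, the SER/SNR parameters, and the mixing scheme are all hypothetical:

```python
import numpy as np

def make_training_example(clean, far_end, echo_path, noise, ser_db=-5.0, snr_db=10.0):
    """One synthetic pair: inputs are the mic mixture plus the far-end
    reference; the training target is the clean near-end speech."""
    n = len(clean)
    echo = np.convolve(far_end, echo_path)[:n]
    # Scale the echo to hit the desired signal-to-echo ratio (SER).
    echo *= np.sqrt(np.mean(clean**2) / (np.mean(echo**2) * 10**(ser_db / 10) + 1e-12))
    # Scale the noise to hit the desired signal-to-noise ratio (SNR).
    noise = noise[:n] * np.sqrt(
        np.mean(clean**2) / (np.mean(noise[:n]**2) * 10**(snr_db / 10) + 1e-12))
    mic = clean + echo + noise
    return (mic, far_end), clean          # (model inputs, supervision target)
```

Sampling the SER, SNR, echo paths, and device characteristics widely is what would let a single model cope with the “extreme audio conditions” mentioned above.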

Microsoft has stringent privacy rules, so no customer information was collected for this dataset. Instead, public data and crowdsourced recordings were used to cover specific scenarios, with a good mix of male and female voices across 74 different languages.

Instead of running separate noise suppression and echo cancellation models, which would have made things more complicated, Microsoft used joint training to combine the two. The all-in-one model now runs 10 percent faster than the noise-suppression-only model without sacrificing quality.
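The article doesn’t reveal the architecture, so the following PyTorch sketch only illustrates the structural idea of joint training: one causal network consumes both the microphone spectrogram and the far-end reference and learns a single mask that removes echo and noise together. The GRU, layer sizes, and masking formulation are assumptions, not Microsoft’s design:

```python
import torch
import torch.nn as nn

class JointEchoNoiseModel(nn.Module):
    """Sketch of an all-in-one model: one forward pass handles both
    echo cancellation and noise suppression, instead of chaining two
    separately trained models."""
    def __init__(self, n_bins=257, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(input_size=2 * n_bins, hidden_size=hidden, batch_first=True)
        self.mask = nn.Sequential(nn.Linear(hidden, n_bins), nn.Sigmoid())

    def forward(self, mic_mag, far_mag):
        # mic_mag, far_mag: (batch, frames, n_bins) magnitude spectrograms.
        x = torch.cat([mic_mag, far_mag], dim=-1)
        h, _ = self.rnn(x)                 # unidirectional GRU keeps it real-time capable
        return self.mask(h) * mic_mag      # masked magnitudes = enhanced speech
```

Because the two tasks share one network and one pass over the audio, a design along these lines plausibly explains how a combined model can run faster than a separate noise-suppression model of similar quality.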


Teams also changed its training data to account for reverberation, letting the ML model transform any captured audio signal so that it sounds as if it were spoken into a close-talking microphone. This is something traditional echo cancellers can’t do: Teams makes your voice sound like it’s coming from the office, even if you’re taking the meeting in a stairwell.
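The standard way to teach a model this mapping is the kind of data augmentation the paragraph describes: convolve dry, close-microphone speech with a room impulse response (RIR) and train the model to recover the dry version. A minimal sketch; `clean` and `rir` are hypothetical arrays you would load from a speech corpus and an RIR collection:

```python
import numpy as np
from scipy.signal import fftconvolve

def reverberate(clean, rir):
    """Simulate a large, echoey room by convolving dry speech with a
    room impulse response; (reverberant, dry) pairs then teach the
    model to dereverberate."""
    wet = fftconvolve(clean, rir)[:len(clean)]
    # Match the dry level so only the reverberation differs within a pair.
    wet *= np.sqrt(np.mean(clean**2) / (np.mean(wet**2) + 1e-12))
    return wet
```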

This is only the start of exploring how AI and machine learning can improve sound quality and make calls and meetings in Teams work as well as they possibly can.

Echo cancellation and dereverberation are being added to Windows and Mac devices, and mobile platforms will get them soon.

Clearer Conversations, Better Collaboration

If you have any doubts, questions, or queries, feel free to contact us at +61 3 9005 6868 or email us at hello@techomsystems.com.au. Reach out anytime; we will be happy to help.

Technology Adoption Expert | TECHOM Systems Pty Ltd

Helping organisations adopt technology faster than ever before. Writing about modern workplace technologies like Microsoft Teams, Microsoft Intune, Azure Cloud Services, and emerging security solutions.