Zoom CTO deep dive: How our federated approach to AI maximizes performance, quality, and affordability 

The journey of AI from concept to reality is a winding road marked by continuous disruption, adaptation, and innovation. Having been on this journey for the past 30 years, I’ve had a front-row seat to, and played an active role in, the evolution of AI, from speech recognition and natural language understanding to computer vision. The pace of innovation in the six months since I joined Zoom has been particularly astounding.

At Zoom, we’re using AI to improve human collaboration and productivity. Zoom AI Companion is a cornerstone of our innovation, designed to help increase productivity, facilitate seamless collaboration, and derive deeper insights to enhance how you work across the Zoom platform. Zoom’s federated approach to AI enables us to provide AI Companion at no additional cost with the paid services assigned to your Zoom user account.* Here’s a closer look at our AI and how it provides high-quality performance at a lower cost.

Zoom’s federated approach to AI

For years, Zoom has offered AI services such as speech recognition, computer vision, machine translation, and large language models (LLMs) to enhance communication. The LLMs we use include Zoom’s own LLM as well as the third-party models OpenAI GPT-3.5 and GPT-4 and Anthropic Claude 2. Our federated approach can also incorporate newer LLMs from our partners, such as OpenAI’s GPT-4 Turbo, whether open- or closed-source, to continue improving the end-to-end experience for Zoom customers.

We follow a cost-effective strategy that starts with the lower-cost LLM best suited to the task. Our Z-Scorer then evaluates the quality of that initial completion and, if needed, we use a more advanced LLM to build on what the initial LLM produced. This is similar to the way a cohesive team working together can create a higher-quality product more efficiently than any one individual.
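The escalation strategy described above can be sketched roughly as follows. This is a minimal illustration, not Zoom’s implementation: the function names, the scoring interface, the 0.8 threshold, and the relative cost units are all assumptions made for the example, and the toy models stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadeResult:
    output: str
    escalated: bool  # True if the advanced model was invoked
    cost: float      # total cost in arbitrary, hypothetical units


def federated_complete(
    task: str,
    cheap_llm: Callable[[str], str],
    advanced_llm: Callable[[str, str], str],
    scorer: Callable[[str, str], float],
    threshold: float = 0.8,       # assumed cutoff, not a Zoom value
    cheap_cost: float = 1.0,      # assumed relative costs
    advanced_cost: float = 20.0,
) -> CascadeResult:
    """Try the lower-cost model first; escalate only if the score is too low."""
    draft = cheap_llm(task)
    if scorer(task, draft) >= threshold:
        return CascadeResult(draft, escalated=False, cost=cheap_cost)
    # The advanced model receives the draft so it can build on it.
    refined = advanced_llm(task, draft)
    return CascadeResult(refined, escalated=True, cost=cheap_cost + advanced_cost)


# Toy stand-ins so the sketch runs without any real model.
def toy_cheap(task: str) -> str:
    return task.upper()


def toy_advanced(task: str, draft: str) -> str:
    return f"{draft} (refined)"


def toy_scorer(task: str, draft: str) -> float:
    # Pretend that short tasks are handled well by the cheap model.
    return 0.9 if len(task) < 20 else 0.5


easy = federated_complete("short task", toy_cheap, toy_advanced, toy_scorer)
hard = federated_complete(
    "a much longer and harder task", toy_cheap, toy_advanced, toy_scorer
)
print(easy)
print(hard)
```

In a design like this, the advanced model’s cost is paid only for tasks where the scorer judges the first attempt insufficient, which is where the savings over always calling the most capable model would come from.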

Comparing performance with other LLMs

According to our own internal testing, our federated approach to AI has improved the quality of AI Companion relative to single-model approaches such as OpenAI GPT-3.5 Turbo (a 99% vs. 93% quality rating, per our proprietary quality evaluation methodology) or several other state-of-the-art LLMs.

We measure performance as a combination of lower cost, faster response time, and higher-quality outputs. Compared with OpenAI’s GPT-4-32k model, which we use as a proxy for Microsoft Copilot, Zoom AI Companion’s meeting questions capability offers reduced cost and faster response time while maintaining comparable AI quality, as shown in Figure 1.

Figure 1. Zoom federated AI as a percentage relative to OpenAI GPT-4-32k on key metrics of cost and quality for Zoom AI Companion’s meeting query task. Microsoft Copilot used OpenAI GPT-4 in orchestration with Microsoft Graph and other components. We don’t use customer data to train our AI models, but used Zoom internal meeting data for benchmarking in this chart with OpenAI GPT-4 as the Microsoft Copilot proxy.
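To make the “relative percentage” framing concrete, each metric can be expressed as a percentage of the baseline model’s value. The sketch below is only an illustration of that arithmetic: the metric names and every number in it are placeholders chosen for the example, not Zoom’s benchmark data.

```python
def relative_percent(value: float, baseline: float) -> float:
    """Express value as a percentage of baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * value / baseline


# Placeholder measurements for illustration only; NOT Zoom's benchmark data.
baseline = {"cost": 100, "latency_s": 10, "quality": 90}
candidate = {"cost": 25, "latency_s": 5, "quality": 90}

relative = {
    metric: relative_percent(candidate[metric], baseline[metric])
    for metric in baseline
}
print(relative)
```

For cost and response time a lower percentage is better, while for quality a value near 100% means the candidate is comparable to the baseline.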

AI Companion’s multilingual performance, which now supports 32 languages (in preview) beyond English, further demonstrates the power of our models. Recognizing that most LLMs are pre-trained primarily on English-dominated data, we added translation models to expand our multilingual capabilities. For Zoom AI Companion’s multilingual meeting summarization, we translate non-English transcripts into English using Zoom’s translation models and consider both the translated data and the original data. As shown in Figure 2, this approach not only significantly improved AI quality over GPT-3.5 but also approached GPT-4-32k AI quality (97% relative) at less than 6% of the cost.

Figure 2. Zoom federated AI as a percentage relative to OpenAI GPT-4-32k for Zoom AI Companion’s multilingual summarization task in 32 languages other than English, including Chinese, French, German, Italian, Japanese, Portuguese, and Spanish. Microsoft Copilot used OpenAI GPT-4 together with Microsoft Graph and other components. Zoom internal meeting data was collected for benchmarking in this chart with OpenAI GPT-4 as the Microsoft Copilot proxy.
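One way to picture the multilingual summarization flow, in which both the original transcript and its English translation are considered, is the sketch below. It is an assumption-laden illustration: the function names, the prompt layout, and the toy translator are hypothetical, and Zoom’s actual translation models and summarization inputs are not described in this post.

```python
from typing import Callable


def build_summarization_input(
    original: str,
    source_lang: str,
    translate: Callable[[str, str], str],
) -> str:
    """Combine the original transcript with its English translation.

    English transcripts are passed through unchanged; other languages are
    translated first, and both versions are included in the model input.
    """
    if source_lang == "en":
        return f"Transcript (en):\n{original}"
    english = translate(original, source_lang)
    return (
        f"Original transcript ({source_lang}):\n{original}\n\n"
        f"English translation:\n{english}"
    )


# Toy translator standing in for a real translation model (hypothetical).
def toy_translate(text: str, source_lang: str) -> str:
    return f"[en from {source_lang}] {text}"


print(build_summarization_input("bonjour à tous", "fr", toy_translate))
```

Keeping the original text alongside the translation lets the summarization model fall back on the source wording where the translation loses detail, such as names or domain terms.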

These examples underscore the efficacy of Zoom’s federated approach to AI, seamlessly combining the strengths of different machine learning systems to deliver high-performance results. 

A winning approach for AI 

We believe the benefits of AI should be widely available to as many people as possible. Our federated approach to AI plays a big role in bringing this vision to life. It’s why, while other companies may charge a premium per user, we’re able to include AI Companion at no additional cost for customers on eligible paid Zoom plans.* 

We encourage you to try AI Companion out for yourself — visit our getting-started guide to learn more about enabling and using these features. If you don’t have an eligible paid Zoom plan, upgrade today to access the benefits of AI Companion. 

*Note: AI Companion may not be available for all regions or industry verticals.
