DCT
2:24-cv-01067
Dialect LLC v. Microsoft Corp
I. Executive Summary and Procedural Information
- Parties & Counsel:
  - Plaintiff: Dialect, LLC (Texas)
  - Defendant: Microsoft Corp. (Delaware)
  - Plaintiff’s Counsel: BLUE PEAK LAW GROUP; Miller Fair Henry, PLLC
- Case Identification: 2:24-cv-01067, E.D. Tex., 12/20/2024
- Venue Allegations: Plaintiff alleges venue is proper because Microsoft has regular and established physical places of business in the district, including corporate offices, data centers, network points of presence, and dedicated "store-within-a-store" locations inside Best Buy retail stores.
- Core Dispute: Plaintiff alleges that Defendant’s virtual assistant and artificial intelligence platforms, including Cortana, Copilot, and Azure AI services, infringe nine patents related to natural language understanding and voice recognition technology.
- Technical Context: The technology at issue involves methods for interpreting ambiguous human speech by determining context, selecting specialized software agents, and managing user-specific data, forming the foundation of modern conversational AI.
- Key Procedural History: The complaint alleges a long history of interactions between Microsoft and the patents’ original inventor, VoiceBox, including acquisition discussions where the patent portfolio was allegedly disclosed. The complaint also notes that in October 2024, the Patent Trial and Appeal Board denied institution of inter partes review petitions filed by Google against two of the asserted patents (the ’209 and ’006 patents).
Case Timeline
| Date | Event | 
|---|---|
| 2002-06-03 | Earliest Priority Date for ’209, ’006, ’570 Patents | 
| 2003-06-03 | Earliest Priority Date for ’825, ’468, ’959, ’367, ’659 Patents | 
| 2005-08-31 | Earliest Priority Date for ’409 Patent | 
| 2006-01-01 | Microsoft CEO Steve Ballmer allegedly meets with VoiceBox CEO to discuss technology | 
| 2007-07-01 | Microsoft and VoiceBox teams meet to discuss potential acquisition; VoiceBox allegedly discloses IP portfolio including application for the ’209 Patent | 
| 2008-07-08 | U.S. Patent No. 7,398,209 issues | 
| 2009-12-15 | U.S. Patent No. 7,634,409 issues | 
| 2010-10-05 | U.S. Patent No. 7,809,570 issues | 
| 2011-03-29 | U.S. Patent No. 7,917,367 issues | 
| 2011-09-06 | U.S. Patent No. 8,015,006 issues | 
| 2012-04-20 | VoiceBox allegedly sends Microsoft a "Patent Status Chart" listing the ’209, ’409, ’006, ’570, and ’367 patents and applications leading to the other asserted patents | 
| 2012-06-05 | U.S. Patent No. 8,195,468 issues | 
| 2013-12-31 | U.S. Patent No. 8,620,659 issues | 
| 2014-01-01 | Microsoft's Cortana virtual assistant becomes available | 
| 2016-11-15 | Microsoft launches Azure Bot services | 
| 2017-04-18 | U.S. Patent No. 9,626,959 issues | 
| 2017-08-15 | U.S. Patent No. 9,734,825 issues | 
| 2021-01-17 | Microsoft launches Azure OpenAI Service | 
| 2023-01-01 | Microsoft retires Cortana and replaces it with Copilot | 
| 2024-10-01 | PTAB denies institution of inter partes review for the ’209 Patent and ’006 Patent | 
| 2024-12-20 | Complaint filed | 
II. Technology and Patent(s)-in-Suit Analysis
U.S. Patent No. 9,734,825 - Methods and Apparatus for Determining a Domain Based on the Content and Context of a Natural Language Utterance, issued August 15, 2017
The Invention Explained
- Problem Addressed: The patent addresses the challenge of interpreting user speech that is "incomplete, ambiguous, or subjective" (’825 Patent, col. 1:32-40).
- The Patented Solution: The invention proposes a method where a system determines the correct "domain" for a user's speech by scoring at least two possible contexts. This scoring is based on keywords and "prior probabilities or fuzzy possibilities" received from a plurality of "autonomous executable domain agents." The system then selects the appropriate domain agent—a module with a specific area of expertise—to handle the user's query and generate a response (’825 Patent, Claim 5).
- Technical Importance: This approach provides a systematic way for a voice-enabled system to disambiguate user intent by routing a query to the correct specialized function (e.g., weather, music, navigation) instead of failing or misinterpreting the request (Compl. ¶48).
Key Claims at a Glance
- The complaint asserts independent claim 5 (Compl. ¶125).
- Claim 5 requires a method with the following essential elements:
  - Recognizing words in a user's speech utterance.
  - Receiving, at a parser, keywords and associated probabilities from a system agent or active domain agent.
  - Determining a score for at least two possible contexts for the utterance based on the received keywords and probabilities.
  - Determining a domain for the utterance based on the recognized words and the context scores.
  - Selecting an autonomous executable domain agent based on the determined domain.
  - Providing a query or command to the selected domain agent.
  - Creating one or more queries by the selected domain agent.
  - Sending the queries asynchronously to information sources.
- The complaint does not explicitly reserve the right to assert dependent claims for this patent.
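For orientation only, the scoring and selection steps summarized above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the `DomainAgent` class, the keyword priors, and the rule of summing priors to score a context are hypothetical, and are not drawn from the ’825 Patent's specification or from any accused product.

```python
from dataclasses import dataclass


@dataclass
class DomainAgent:
    """Hypothetical stand-in for a domain agent (illustrative only)."""
    domain: str
    keyword_priors: dict  # keyword -> assumed prior probability

    def handle(self, words):
        return f"[{self.domain}] handling: {' '.join(words)}"


def score_contexts(words, agents):
    """Score each candidate context by summing the priors of matched keywords."""
    return {
        agent.domain: sum(agent.keyword_priors.get(w, 0.0) for w in words)
        for agent in agents
    }


def select_agent(words, agents):
    """Pick the agent whose context scored highest (assumed selection rule)."""
    scores = score_contexts(words, agents)
    best = max(scores, key=scores.get)
    return next(a for a in agents if a.domain == best), scores


agents = [
    DomainAgent("weather", {"rain": 0.8, "forecast": 0.9, "tomorrow": 0.3}),
    DomainAgent("music", {"play": 0.9, "rain": 0.2, "song": 0.8}),
]
words = "will it rain tomorrow".split()
agent, scores = select_agent(words, agents)
print(scores)
print(agent.handle(words))
```

In this toy run the ambiguous word "rain" matches both contexts, and the higher combined score routes the utterance to the weather agent. Whether any accused system performs scoring of this kind is a disputed factual question that the sketch does not address.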
U.S. Patent No. 7,398,209 - Systems And Methods For Responding To Natural Language Speech Utterance, issued July 8, 2008
The Invention Explained
- Problem Addressed: The patent specification describes the "fundamental incompatibility" between how humans ask questions, which relies heavily on context and domain knowledge, and how machines process queries, which are typically "highly structured and not inherently natural to the human user" (’209 Patent, col. 1:27-35).
- The Patented Solution: The invention claims a method for making machine communication more natural. A system receives a user's speech, recognizes words and phrases, and then parses them to determine a "meaning" and "context." Based on this meaning, it selects an "autonomous executable" domain agent. Crucially, the system then formulates the user's request "in accordance with a grammar used by the selected domain agent" before invoking that agent to process the request and present a result (’209 Patent, Claim 1). This process is illustrated in a block diagram provided in the complaint (Compl. ¶55; ’209 Patent, Fig. 6).
- Technical Importance: This patent describes a modular architecture where specialized agents, each with their own specific grammar, can be invoked to handle different types of natural language requests, overcoming the limitations of a single, rigid command-and-control system (Compl. ¶53).
Key Claims at a Glance
- The complaint asserts independent claim 1 (Compl. ¶150).
- Claim 1 requires a method with the following essential elements:
  - Receiving a user's speech utterance containing a request.
  - Maintaining a dynamic set of prior probabilities or fuzzy possibilities.
  - Recognizing words and phrases using dictionary and phrase tables.
  - Parsing the words and phrases to determine a meaning, which includes determining a context based on keywords.
  - Selecting an autonomous executable domain agent based on the determined meaning.
  - Formulating the request in accordance with a grammar used by the selected domain agent.
  - Invoking the selected domain agent to process the formulated request.
  - Presenting the results of the processed request.
- The complaint does not explicitly reserve the right to assert dependent claims for this patent.
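The "formulating" element can be contrasted with plain routing in a short sketch. The template strings below are a hypothetical stand-in for an agent-specific "grammar"; the names `AGENT_GRAMMARS`, `formulate`, and `route_only` are invented for illustration, and whether a template of this kind would satisfy the claim term is a construction question on which the sketch takes no position.

```python
# Each hypothetical agent declares the request shape it accepts.
AGENT_GRAMMARS = {
    "weather": "FORECAST location={location} day={day}",
    "navigation": "ROUTE to={location}",
}


def formulate(domain, slots):
    """Rewrite parsed slots into the selected agent's own request format."""
    return AGENT_GRAMMARS[domain].format(**slots)


def route_only(domain, slots):
    """Contrast case: hand the parsed slots to the agent unchanged."""
    return {"domain": domain, **slots}


slots = {"location": "Seattle", "day": "tomorrow"}
print(formulate("weather", slots))     # request rewritten per the weather agent's format
print(formulate("navigation", slots))  # same slots, different agent format
print(route_only("weather", slots))    # no agent-specific rewriting
```

The same parsed input yields a different request string for each agent under `formulate`, while `route_only` forwards it as-is; that difference is the distinction the claim language appears to draw.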
U.S. Patent No. 8,195,468 - Mobile Systems And Methods Of Supporting Natural Language Human-Machine Interactions, issued June 5, 2012
- Technology Synopsis: The patent addresses the technical problem of processing multi-modal inputs (e.g., a combination of speech and non-speech commands). The claimed solution involves creating separate transcriptions for speech and non-speech inputs, merging them, and then using personalized, general, and environmental cognitive models to determine the most likely context and identify the appropriate domain agent to generate a response (Compl. ¶¶ 60, 61).
- Asserted Claims: At least Claim 19 (Compl. ¶175).
- Accused Features: The complaint alleges that the Accused Products process multi-modal inputs using personal and contextual information, context stacks, and domain agents (Compl. ¶¶ 188-190).
U.S. Patent No. 9,626,959 - Systems And Methods Of Supporting Adaptive Misrecognition in Conversational Speech, issued April 18, 2017
- Technology Synopsis: This patent describes a method for adaptively handling speech misrecognition. After performing a first action based on an initial interpretation of a command, the system accesses a "personalized cognitive model" to "proactively select a second interpretation" if the user indicates the first one was incorrect, and then performs a second action based on this new interpretation (Compl. ¶¶ 66, 67).
- Asserted Claims: At least Claim 1 (Compl. ¶201).
- Accused Features: The complaint alleges that the Accused Products use a personalized cognitive model to interpret user input and improve responses based on user interaction patterns (Compl. ¶214).
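As a rough illustration of the correction flow in the synopsis, the sketch below reduces the "personalized cognitive model" to a per-user count of past choices. That reduction, the function names, and the sample data are all assumptions made for the example, not details taken from the ’959 Patent.

```python
def rank_interpretations(candidates, personal_counts):
    """Order candidates by how often this user chose each one before (assumed model)."""
    return sorted(candidates, key=lambda c: personal_counts.get(c, 0), reverse=True)


def act_with_correction(candidates, personal_counts, user_accepts):
    """Perform the top-ranked action; if the user rejects it, try the next-ranked one."""
    actions = []
    for interpretation in rank_interpretations(candidates, personal_counts):
        actions.append(interpretation)
        if user_accepts(interpretation):
            break
    return actions


candidates = ["call Tom", "call Mom"]
personal_counts = {"call Tom": 5, "call Mom": 2}
# The user rejects the first action and accepts the second.
actions = act_with_correction(candidates, personal_counts, lambda a: a == "call Mom")
print(actions)
```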
U.S. Patent No. 7,634,409 - Dynamic Speech Sharpening, issued December 15, 2009
- Technology Synopsis: The patent claims a method for handling out-of-vocabulary words and tolerating noise. The method involves recognizing a stream of phonemes from a user's utterance and mapping them to an "acoustic grammar that phonemically represents one or more syllables" to generate an interpretation (Compl. ¶¶ 71, 72).
- Asserted Claims: At least Claim 1 (Compl. ¶225).
- Accused Features: The complaint alleges the Accused Products provide out-of-vocabulary interpretation for user input by using "phone-to-word transduction and phone-level alignments" (Compl. ¶238).
U.S. Patent No. 8,015,006 - Systems And Methods For Processing Natural Language Speech Utterances With Context-Specific Domain Agents, issued September 6, 2011
- Technology Synopsis: The patent describes a method for processing speech that includes determining a user's identity based on voice characteristics. This identity is then associated with the recognized words and the user's request, but only if a predetermined confidence level is met. This user identification is combined with dynamic updating of dictionaries and context-based parsing to process a request via a domain agent (Compl. ¶¶ 78, 81).
- Asserted Claims: At least Claim 1 (Compl. ¶249).
- Accused Features: The complaint alleges the Accused Products interpret user input using dynamically updated information and use domain agents to process user input (Compl. ¶¶ 262, 263).
U.S. Patent No. 7,809,570 - Systems And Methods For Responding To Natural Language Speech Utterance, issued October 5, 2010
- Technology Synopsis: This patent addresses the processing of a single utterance containing multiple requests. The claimed method processes these requests in a "multi-threaded environment," sending separate events to different domain agents, which in turn create asynchronous queries. The system then receives response events from the agents and creates a final response based on the combined information (Compl. ¶¶ 86, 87).
- Asserted Claims: At least Claim 1 (Compl. ¶274).
- Accused Features: The complaint alleges the Accused Products use contextual information and multiple domain agents to process user input, citing how "Copilot's orchestrator matches plugins to user queries" (Compl. ¶¶ 287, 288).
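A minimal sketch of the multi-request flow described in the synopsis follows, assuming one thread per request and two hypothetical agents. It simplifies the claimed method (for example, the agents here return canned strings rather than issuing their own asynchronous queries) and is not based on the ’570 Patent's specification or on Copilot's orchestrator.

```python
from concurrent.futures import ThreadPoolExecutor


def weather_agent(request):
    """Hypothetical agent; a real one would query an information source."""
    return f"weather agent answered '{request}'"


def calendar_agent(request):
    """Hypothetical agent; a real one would query an information source."""
    return f"calendar agent answered '{request}'"


AGENTS = {"weather": weather_agent, "calendar": calendar_agent}


def respond(requests):
    """Dispatch each request to its agent on a worker thread, then combine the replies."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(AGENTS[domain], text) for domain, text in requests]
        results = [f.result() for f in futures]  # collected in submission order
    return " | ".join(results)


# One utterance that was split (by an assumed upstream parser) into two requests.
combined = respond([("weather", "rain tomorrow"), ("calendar", "next meeting")])
print(combined)
```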
U.S. Patent No. 7,917,367 - Systems And Methods For Responding To Natural Language Speech Utterance, issued March 29, 2011
- Technology Synopsis: The patent claims a method for synchronizing context across multiple devices. It describes registering a plurality of mobile devices with a "context manager," which then receives a "context change event" from one device and informs the other registered devices of the change to synchronize context across them all (Compl. ¶¶ 91, 92).
- Asserted Claims: At least Claim 11 (Compl. ¶299).
- Accused Features: The complaint alleges the Accused Products, such as Copilot, synchronize contexts across multiple mobile devices (Compl. ¶312).
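The register-and-notify pattern in the synopsis resembles a conventional observer arrangement, sketched below. The `ContextManager` and `Device` classes and their methods are hypothetical names chosen for the example; they are not the patent's terminology beyond the quoted phrases, and nothing here describes how any accused product is built.

```python
class ContextManager:
    """Keeps a registry of devices and relays context changes between them."""

    def __init__(self):
        self._devices = []

    def register(self, device):
        self._devices.append(device)

    def on_context_change(self, source, context):
        # Inform every registered device other than the one that reported the change.
        for device in self._devices:
            if device is not source:
                device.context = context


class Device:
    def __init__(self, name, manager):
        self.name = name
        self.context = None
        self._manager = manager
        manager.register(self)

    def change_context(self, context):
        """Update local context and send a context change event to the manager."""
        self.context = context
        self._manager.on_context_change(self, context)


manager = ContextManager()
phone = Device("phone", manager)
car = Device("car", manager)
tablet = Device("tablet", manager)

phone.change_context("navigation")
print(car.context, tablet.context)
```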
U.S. Patent No. 8,620,659 - Systems And Methods Of Supporting Adaptive Misrecognition in Conversational Speech, issued December 31, 2013
- Technology Synopsis: This patent describes a method for predicting user actions. The system determines whether a "personalized cognitive model" has enough information to predict a user's subsequent actions. If not, it uses a "generalized cognitive model" based on interaction patterns from a plurality of users to make the prediction (Compl. ¶¶ 95, 325).
- Asserted Claims: At least Claim 42 (Compl. ¶323).
- Accused Features: The complaint alleges the Accused Products use both personalized and general cognitive models based on user interactions to interpret and respond to user input (Compl. ¶336).
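The fallback logic in the synopsis can be sketched as follows, assuming that "enough information" means a minimum number of observed actions and that a "model" is simply a frequency count. Both assumptions, along with `MIN_OBSERVATIONS` and `predict_next`, are illustrative inventions rather than anything stated in the ’659 Patent or the complaint.

```python
from collections import Counter

MIN_OBSERVATIONS = 3  # assumed threshold for "enough information"


def predict_next(personal_history, all_users_history):
    """Predict the next action from the user's own history, else from all users' history."""
    use_personal = len(personal_history) >= MIN_OBSERVATIONS
    source = personal_history if use_personal else all_users_history
    action, _count = Counter(source).most_common(1)[0]
    return action, ("personalized" if use_personal else "generalized")


general = ["check weather", "check weather", "play music"]

sparse_user = ["play music"]
print(predict_next(sparse_user, general))  # too little personal data: falls back

frequent_user = ["play music", "play music", "set alarm"]
print(predict_next(frequent_user, general))  # enough personal data: no fallback
```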
III. The Accused Instrumentality
Product Identification
- The accused instrumentalities are Microsoft's Cortana virtual assistant, Copilot virtual assistant, Azure AI services, and Azure OpenAI Services (Compl. ¶7).
Functionality and Market Context
- The Accused Products are artificial intelligence platforms and virtual assistants that allow users to interact with devices and services using natural language speech and text inputs (Compl. ¶¶ 116, 119, 121). The complaint alleges these products are integrated into a vast ecosystem, with Cortana having been available on over 1.4 billion Windows devices and Copilot experiencing significant growth since its 2023 debut (Compl. ¶¶ 117, 120). The Azure AI and OpenAI services are described as enabling developers to build and deploy conversational AI, with the Azure OpenAI Service expected to generate substantial revenue (Compl. ¶121). A screenshot from Microsoft's website shows marketing for "Copilot agents" that "process user speech using domain agents" (Compl. ¶139).
IV. Analysis of Infringement Allegations
’825 Patent Infringement Allegations
| Claim Element (from Independent Claim 5) | Alleged Infringing Functionality | Complaint Citation | Patent Citation | 
|---|---|---|---|
| recognizing, by a speech recognition engine, one or more words in the user generated natural language speech utterance; | The Accused Products receive and process user speech to understand commands and queries (Compl. ¶¶ 126, 138). | ¶126, ¶138 | col. 20:3-5 | 
| receiving, at a parser, keyword and associated prior probabilities or fuzzy possibilities from a system agent or an active domain agent of a plurality of autonomous executable domain agents; | The Accused Products allegedly use "keywords and contextual information" to understand user speech, which is used to determine how to respond (Compl. ¶138). | ¶138 | col. 20:6-11 | 
| determining, for the natural language speech utterance, a score for each of at least two possible contexts, wherein the scores are determined based on the received keyword...; | The Accused Products are alleged to behave differently based on the "conversation state" and "triggering context" to disambiguate user intent when there are multiple matching topics (Compl. ¶138). | ¶138 | col. 20:12-18 | 
| determining by the parser, a domain for the user generated natural language utterance based on the recognized one or more words...and the determined scores for each of the at least two possible contexts; | Based on context, the Accused Products allegedly determine the appropriate domain to process a user's request (Compl. ¶¶ 138, 139). | ¶138, ¶139 | col. 20:19-24 | 
| selecting at least one of the plurality of autonomous executable domain agents based, at least in part, on the determined domain...; | The Accused Products are alleged to process user speech using "domain agents" which are described as autonomous and independently operating to perform plans and orchestrate other agents (Compl. ¶139). A screenshot shows various "Copilot agents" (Compl. ¶139). | ¶139 | col. 20:25-36 | 
| sending, by the selected at least one of the plurality of domain agents, the one or more queries in an asynchronous manner to one or more local or external information sources. | The selected Copilot agents allegedly retrieve information from "grounding data" to summarize or answer questions (Compl. ¶139). | ¶139 | col. 20:43-47 | 
- Identified Points of Contention:
  - Scope Questions: A central question may be whether Microsoft's "Copilot agents" (Compl. ¶139) meet the claim definition of "autonomous executable domain agents." The defense may argue that modern AI plugins operate differently than the specific agent architecture described in the patent.
  - Technical Questions: The complaint alleges the system uses "context" to disambiguate requests (Compl. ¶138), but a key technical question will be whether this process is equivalent to the claimed steps of "receiving...keyword and associated prior probabilities" from an agent and "determining...a score" for different contexts.
’209 Patent Infringement Allegations
| Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation | 
|---|---|---|---|
| receiving the user generated natural language speech utterance, the received user utterance containing at least one request; | The Accused Products are conversational AI platforms that receive speech and text input from users (Compl. ¶¶ 151, 155). | ¶151, ¶155 | col. 22:50-52 | 
| parsing the recognized words and phrases to determine a meaning of the utterance, wherein determining the meaning includes determining a context for the at least one request...based on one or more keywords...; | The Accused Products allegedly use "keywords and contextual information" to interpret user input and determine context, which can lead to different behaviors depending on the "conversation state" (Compl. ¶163). A screenshot shows the importance of "triggering context" (Compl. ¶163). | ¶163 | col. 23:1-7 | 
| selecting at least one domain agent based on the determined meaning, the selected domain agent being an autonomous executable that receives, processes, and responds to requests associated with the determined context; | The Accused Products are alleged to process user speech using "domain agents" that "retrieve information from grounding data" and "take actions when asked" (Compl. ¶164). | ¶164 | col. 23:8-13 | 
| formulating the at least one request contained in the utterance in accordance with a grammar used by the selected domain agent to process requests associated with the determined context; | The complaint does not provide sufficient detail for analysis of this element, but alleges generally that the Accused Products implement all claim elements (Compl. ¶153). | ¶153 | col. 23:14-18 | 
| invoking the selected domain agent to process the formulated request; and presenting results of the processed request to the user... | The invoked agents allegedly "summarize or answer questions" and "orchestrate other agents" to provide a response to the user (Compl. ¶164). | ¶164 | col. 23:19-25 | 
- Identified Points of Contention:
  - Scope Questions: A primary point of contention for the ’209 Patent will likely be the term "grammar used by the selected domain agent." The defense may argue that modern AI systems do not use discrete, agent-specific "grammars" in the manner contemplated by the patent, but rather use more fluid, model-based approaches to process requests.
  - Technical Questions: What evidence does the complaint provide that the Accused Products "formulate" a request according to a specific grammar, as opposed to simply passing parsed intent and entities to a selected software module or API?
V. Key Claim Terms for Construction
For the ’825 Patent:
- The Term: "autonomous executable domain agent"
- Context and Importance: This term is the core of the invention's modular approach. Infringement will depend on whether Microsoft’s "Copilot agents" (Compl. ¶139) are legally and technically equivalent to the claimed agents. Practitioners may focus on this term because its definition will determine whether a modern, plugin-based AI architecture falls within the scope of a patent describing a more structured agent-based system.
- Intrinsic Evidence for Interpretation:
  - Evidence for a Broader Interpretation: The ’825 patent claims priority to the ’209 patent, which describes agents broadly as "autonomous executables that receive, process and respond to user questions, queries and commands" and as "re-distributable packages or modules of functionality, typically for a specific domain or application" (’209 Patent, col. 2:50-54). This language could support construing the term to cover a wide range of software modules.
  - Evidence for a Narrower Interpretation: The specification of the ’209 patent distinguishes between "system agents" and "domain agents," suggesting a specific role for each (’209 Patent, col. 12:21-28). A defendant may argue that the term requires the specific agent architecture shown in figures like Figure 2 of the ’209 patent, which illustrates distinct agent modules, an agent library, and an agent manager.
For the ’209 Patent:
- The Term: "formulating the at least one request contained in the utterance in accordance with a grammar used by the selected domain agent"
- Context and Importance: This limitation is critical because it requires not just selecting an agent, but re-formatting the user's request according to rules ("grammar") specific to that agent. This step differentiates the invention from a simple routing system. The dispute will likely center on whether Microsoft’s AI systems perform this specific formulation step.
- Intrinsic Evidence for Interpretation:
  - Evidence for a Broader Interpretation: The term "grammar" is not explicitly defined, which may allow for a broad interpretation as any set of rules or syntax required by the agent to process the request. The specification notes that the parser can form a question "in a standard format or hierarchical data structure used for processing by the agents" (’209 Patent, col. 18:47-50), which could be argued to be a form of grammar.
  - Evidence for a Narrower Interpretation: The prosecution history, as cited in the complaint, distinguishes the invention from prior art that "omits a grammar used by a domain agent" (Compl. ¶56). A defendant may argue this creates a narrower definition, requiring a formal, predefined grammar distinct from just passing parameters to an API, which is common in modern software.
VI. Other Allegations
- Indirect Infringement: The complaint alleges that Microsoft induces infringement by providing developer tools, SDKs, and instructional websites (e.g., the "Cortana Skills Kit," "Copilot Studio," and "Get better results with Copilot prompting" guides) that actively encourage and instruct developers and end-users to use the Accused Products in an infringing manner (Compl. ¶¶ 131-136, 156-161). It is also alleged that the Accused Products are not staple articles of commerce because they are especially designed to perform the claimed methods (Compl. ¶¶ 137, 162).
- Willful Infringement: The complaint alleges a long history of pre-suit knowledge. It claims that Microsoft was made aware of the asserted patent families during acquisition discussions with VoiceBox as early as 2012, when VoiceBox allegedly provided Microsoft a chart listing the issued patents and pending applications (Compl. ¶101). The complaint further alleges that Microsoft was aware of the infringement of the ’825 Patent as early as its issue date in 2017 and the ’209 Patent as early as 2012 (Compl. ¶¶ 142, 167). This alleged pre-suit knowledge, combined with continued alleged infringement, forms the basis of the willfulness claim.
VII. Analyst’s Conclusion: Key Questions for the Case
- A core issue will be one of technical and definitional scope: Can the patented concepts of "autonomous executable domain agents," context scoring based on "prior probabilities," and agent-specific "grammars," which were described in patents with priority dates from the early 2000s, be construed to cover the architecture of Microsoft's modern, AI-driven platforms like Copilot and Azure AI? The case may turn on whether Microsoft's use of orchestrators, plugins, and large language models is fundamentally different from, or merely an implementation of, the patented methods.
- A key factual question will be one of pre-suit knowledge and intent: Given the complaint's detailed allegations of a decade-long history of meetings, acquisition discussions, and explicit disclosure of the patent portfolio by VoiceBox to Microsoft, what evidence will emerge regarding Microsoft’s state of mind during the development of Cortana, Copilot, and its Azure AI services? This will be central to the claim of willful infringement.
- An important procedural issue will be the impact of prior patent challenges: The complaint notes that the PTAB has already denied institution of inter partes review for the ’209 and ’006 patents. What weight, if any, will those denials carry in shaping the court's and the parties' views on the validity of those patents and potentially others in the asserted portfolio?