DCT

2:24-cv-00207

Dialect LLC v. Bank Of America NA

Key Events

Amended Complaint


I. Executive Summary and Procedural Information

Parties & Counsel:
- Plaintiff: Dialect, LLC (Texas)
- Defendant: Bank of America, N.A. (Federally Chartered National Banking Association)
- Plaintiff’s Counsel: BLUE PEAK LAW GROUP; Ward, Smith & Law, Firm
Case Identification: 2:24-cv-00207, E.D. Tex., 06/17/2024
Venue Allegations: Plaintiff alleges venue is proper in the Eastern District of Texas because Defendant maintains regular and established places of business in the District, including a Technology Center in Plano, Texas, where it has allegedly recruited personnel for the accused product. The complaint also notes that Defendant has not contested venue in prior patent infringement actions in the District.
Core Dispute: Plaintiff alleges that Defendant’s "Erica" virtual financial assistant, a feature within its mobile banking application, infringes five patents related to natural language understanding and voice recognition technology.
Technical Context: The technology at issue involves conversational artificial intelligence that allows users to interact with computer systems using natural speech and text, a foundational technology for modern digital assistants.
Key Procedural History: The patents-in-suit originated with VoiceBox Technologies. The complaint notes that other patents from the same portfolio, sharing common inventors, were previously asserted against Amazon, resulting in a jury verdict of willful infringement and a $46.7 million damages award. The complaint also alleges that VoiceBox and Bank of America engaged in discussions regarding this technology as early as 2016.

Case Timeline

Date	Event
2005-08-05	Priority Date for ’160 and ’039 Patents
2005-08-29	Priority Date for ’468, ’607, and ’957 Patents
2009-12-29	U.S. Patent No. 7,640,160 Issued
2012-06-05	U.S. Patent No. 8,195,468 Issued
2013-05-21	U.S. Patent No. 8,447,607 Issued
2016-01-01	Alleged discussions between VoiceBox and Bank of America began
2016-02-16	U.S. Patent No. 9,263,039 Issued
2016-11-15	U.S. Patent No. 9,495,957 Issued
2018-06-01	Accused Product "Erica" initially deployed
2024-06-17	Complaint Filed

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 7,640,160 - "Systems And Methods For Responding To Natural Language Speech Utterance"

The Invention Explained

Problem Addressed: The patent’s background section describes the difficulty machines face in communicating with humans in a natural manner, noting that human communication relies heavily on context and domain knowledge, whereas machine-based queries are often highly structured and rigid (Compl. ¶35; ’160 Patent, col. 2:21-34).
The Patented Solution: The invention proposes a system to better interpret natural language by comparing text combinations from a user's utterance against both pre-defined "grammar expression entries" and a "context stack" of expected contexts from the ongoing dialogue. The system scores these potential matches to determine the user's most likely intent and then communicates a request to a specialized "domain agent" to execute the command (’160 Patent, Abstract; Compl. ¶36).
Technical Importance: This technology represented a move away from rigid command-and-control voice systems toward more flexible, context-aware conversational interfaces that can better handle ambiguous user input (Compl. ¶¶23, 35).

Key Claims at a Glance

The complaint asserts at least independent claim 12 (’160 Patent, Cl. 12; Compl. ¶70).
Claim 12 recites a method with the essential elements of:
- Receiving a transcription of a natural language utterance at a computer.
- Identifying one or more matching "contexts" by comparing text combinations from the transcription against both "grammar expression entries" and "one or more expected contexts stored in a context stack."
- Scoring each of the identified matching contexts.
- Selecting the matching context with the highest score to determine a "most likely context."
- Communicating a request to a "domain agent" configured to process requests in that most likely context.
The complaint does not explicitly reserve the right to assert dependent claims.
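
For readers less familiar with the technology, the claim 12 steps summarized above can be pictured with a minimal sketch. It is illustrative only: it does not reflect the patent's disclosed embodiment, any claim construction, or the actual operation of Erica. The context names, the word-overlap scoring, the 0.5 bonus for expected contexts, and the agent functions are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    name: str
    grammar: frozenset  # hypothetical stand-in for "grammar expression entries"


# Hypothetical contexts; the names and keywords are assumptions for illustration.
CONTEXTS = [
    Context("transfer", frozenset({"transfer", "move", "money"})),
    Context("pay_bill", frozenset({"pay", "bill", "money"})),
    Context("find_branch", frozenset({"nearest", "branch"})),
]

# Hypothetical "domain agents": one handler per context.
DOMAIN_AGENTS = {
    "transfer": lambda text: f"transfer agent handling: {text}",
    "pay_bill": lambda text: f"bill-pay agent handling: {text}",
    "find_branch": lambda text: f"branch-locator agent handling: {text}",
}


def score_contexts(transcription, context_stack):
    """Identify matching contexts and score each one.

    A context matches if at least one word of the transcription appears in
    its grammar entries. Contexts already on the context stack (expected
    from the dialogue so far) receive an assumed bonus of 0.5.
    """
    words = set(transcription.lower().split())
    scores = {}
    for ctx in CONTEXTS:
        hits = len(words & ctx.grammar)
        if hits == 0:
            continue  # no grammar match, so not a candidate context
        scores[ctx.name] = hits + (0.5 if ctx.name in context_stack else 0.0)
    return scores


def route(transcription, context_stack):
    """Select the highest-scoring context and pass the request to its agent."""
    scores = score_contexts(transcription, context_stack)
    if not scores:
        return None
    most_likely = max(scores, key=scores.get)
    return DOMAIN_AGENTS[most_likely](transcription)


print(route("I would like to transfer money", ["transfer"]))
```

Whether a ranking produced by some other mechanism would meet the "scoring" and "selecting" limitations is the kind of question the infringement analysis below addresses; the sketch takes no position on it.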

U.S. Patent No. 8,195,468 - "Mobile Systems And Methods Of Supporting Natural Language Human-Machine Interactions"

The Invention Explained

Problem Addressed: The patent addresses the technical challenge of processing a combination of different types of user inputs—specifically, both speech ("natural language utterance") and non-speech inputs (e.g., text entry, screen taps) in a mobile environment (’468 Patent, col. 1:15-24; Compl. ¶40).
The Patented Solution: The invention describes a method for processing this "multi-modal" input by transcribing and merging speech and non-speech inputs. The system then uses a "semantic knowledge-based model" to interpret the merged transcription. This model comprises three parts: a "personalized cognitive model" (from the specific user's past interactions), a "general cognitive model" (from interactions with a plurality of users), and an "environmental model" (from the user's environment, such as location). This comprehensive model helps determine the most likely context and generate a response via a domain agent (’468 Patent, Cl. 19, Fig. 8; Compl. ¶41).
Technical Importance: The claimed method provides a framework for creating more sophisticated and personalized user interactions by integrating multiple input types with a multi-layered understanding of the user and their environment (Compl. ¶43).

Key Claims at a Glance

The complaint asserts at least independent claim 19 (’468 Patent, Cl. 19; Compl. ¶102).
Claim 19 recites a method with the essential elements of:
- Receiving a "multi-modal natural language input" including a natural language utterance and a non-speech input.
- Identifying the user who provided the input.
- Creating a speech-based transcription using a speech recognition engine and a "semantic knowledge-based model," where the model includes a "personalized cognitive model," a "general cognitive model," and an "environmental model."
- Merging the speech-based and non-speech-based transcriptions.
- Identifying one or more entries in a "context stack" that match information in the merged transcription.
- Determining a most likely context, identifying an associated "domain agent," communicating a request to it, and generating a response.
The complaint does not explicitly reserve the right to assert dependent claims.
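
The merging and context-stack steps of claim 19 can likewise be sketched in simplified form. This is an assumption-laden illustration, not the patented method or Erica's implementation: the `ModalInput` type, the timestamp ordering, and the word-overlap matching are hypothetical, and the three-part "semantic knowledge-based model" is not modeled at all.

```python
from dataclasses import dataclass


@dataclass
class ModalInput:
    mode: str         # "speech" or "non_speech" (hypothetical labels)
    text: str         # transcription of the input
    timestamp: float  # arrival time, used only to order the inputs


def merge_transcriptions(inputs):
    """Join the transcriptions in arrival order into one merged transcription."""
    ordered = sorted(inputs, key=lambda item: item.timestamp)
    return " ".join(item.text.strip() for item in ordered if item.text.strip())


def match_context_stack(merged, context_stack):
    """Return the context-stack entries sharing at least one word with the input."""
    words = set(merged.lower().split())
    return [entry for entry in context_stack
            if words & set(entry.lower().split())]


# Example drawn from the complaint's allegation: typed text plus a spoken phrase.
inputs = [
    ModalInput("speech", "Transfer money", timestamp=1.0),
    ModalInput("non_speech", "I would like to", timestamp=0.0),
]
merged = merge_transcriptions(inputs)
matches = match_context_stack(merged, ["Transfer between my accounts", "Pay a bill"])
print(merged)
print(matches)
```

The sketch produces a single merged string. Whether the accused system does anything comparable, or instead handles sequential inputs separately, is an open factual question.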

U.S. Patent No. 8,447,607 - "Mobile Systems And Methods Of Supporting Natural Language Human-Machine Interactions"

Technology Synopsis: The ’607 Patent describes a method for processing multi-modal natural language inputs from a user. The system generates transcriptions from both speech and non-speech inputs, merges them, and identifies a user's intent by matching the merged transcription against entries in a "context stack" indicative of prior context. A "cognitive model" based on the user's prior interactions is used to generate the speech transcription (Compl. ¶¶47, 139).
Asserted Claims: Independent claim 12 is asserted (Compl. ¶139).
Accused Features: The complaint alleges that Erica's ability to receive voice, text, and icon-selection inputs, generate transcriptions, merge them, and use a user's interaction history ("cognitive model") to identify a matching context ("context stack") infringes the ’607 Patent (Compl. ¶¶140-153).

U.S. Patent No. 9,263,039 - "Systems And Methods For Responding To Natural Language Speech Utterance"

Technology Synopsis: The ’039 Patent discloses a method of processing both speech and non-speech communications by transcribing and merging them to form a query. The system searches this query for text combinations, compares them to a "context description grammar," generates a "relevance score," and selects one or more "domain agents" based on this score to obtain content and generate an ordered response (Compl. ¶¶51, 172).
Asserted Claims: Independent claim 13 is asserted (Compl. ¶172).
Accused Features: The complaint alleges that Erica infringes by receiving multi-modal inputs, merging them into a query (e.g., "Transfer money"), comparing text combinations to its grammar, generating relevance scores to rank options like "Transfer between my accounts" or "Pay a bill," and selecting a domain agent based on the highest score (Compl. ¶¶173-185).

U.S. Patent No. 9,495,957 - "Mobile Systems And Methods Of Supporting Natural Language Human-Machine Interactions"

Technology Synopsis: The ’957 Patent describes a method that begins by generating a "context stack" based on a plurality of prior utterances. When a new natural language utterance is received, the system determines the words in it, compares those words to entries in the context stack, generates "rank scores" for the matching context entries, and uses those scores to determine the user's command or request (Compl. ¶¶57, 204).
Asserted Claims: Independent claim 7 is asserted (Compl. ¶204).
Accused Features: The complaint alleges that Erica's functionality of maintaining a conversational context (the "context stack"), performing speech recognition on user utterances, comparing keywords like "Transfer" to context entries, generating rank scores to order options presented to the user, and determining the final command infringes the ’957 Patent (Compl. ¶¶206-213).
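
The ’957 Patent's sequence of building a context stack from prior utterances and then ranking matching entries can be illustrated with the following sketch. It is a hypothetical simplification, not the claimed method as construed by any court and not a description of Erica: the most-recent-first ordering, the word-overlap comparison, and the recency-discounted score are assumptions made for the example.

```python
def build_context_stack(prior_utterances):
    """Build a stack of prior utterances, most recent first."""
    return list(reversed(prior_utterances))


def rank_entries(utterance, stack):
    """Score each matching stack entry and return (entry, score) pairs, best first.

    The assumed score is the number of shared words divided by
    (1 + depth), so that more recent entries rank higher on ties.
    """
    words = set(utterance.lower().split())
    ranked = []
    for depth, entry in enumerate(stack):
        overlap = len(words & set(entry.lower().split()))
        if overlap:
            ranked.append((entry, overlap / (1 + depth)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


stack = build_context_stack(["pay a bill", "transfer between my accounts"])
ranking = rank_entries("transfer money to pay rent", stack)
print(ranking)
```

Under these assumptions the highest-ranked entry would be treated as the user's command. The ranked list displayed by the accused product could be generated in many other ways.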

III. The Accused Instrumentality

Product Identification

The accused instrumentality is the "Erica" virtual financial assistant, a feature integrated within Bank of America’s Mobile Banking application for iOS and Android devices (Compl. ¶8).

Functionality and Market Context

The Erica feature is a conversational assistant that allows customers to perform banking tasks by "spoken conversation and/or visual text" (Compl. ¶74). It allegedly combines artificial intelligence, predictive analytics, and natural language processing to understand and respond to user requests (Compl. ¶75). The complaint alleges Erica is designed to "learn from clients' behaviors over time" and from conversations with all Bank of America customers, and that it can use environmental information like a user's location to respond to queries (Compl. ¶¶75-76). A screenshot provided in the complaint shows a user asking Erica for the "nearest branch," prompting Erica to use location data to find a result (Compl. p. 29). The complaint touts Erica's commercial success, noting it had surpassed 1.5 billion client interactions by mid-2023 and is a primary driver of client engagement (Compl. ¶¶62-63).

IV. Analysis of Infringement Allegations

’160 Patent Infringement Allegations

Claim Element (from Independent Claim 12)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
...receiving a transcription of a natural language utterance at a computer comprising the knowledge-enhanced speech recognition engine;	The Erica system transcribes a user's spoken utterance, and this transcription is received by its computer systems for processing. The screenshot of a user's request, "I would like to transfer money," shows a transcribed utterance.	¶80	col. 10:48-52
identifying one or more contexts that completely or partially match one or more text combinations contained in the transcription, wherein identifying the matching contexts includes comparing the text combinations against the grammar expression entries in the context description grammar and against one or more expected contexts stored in a context stack;	Erica allegedly maintains a "context stack" of expected user intents (e.g., "Transfer," "Pay a bill") and compares the transcribed text ("Transfer money") against these contexts and associated grammar expressions to find a match.	¶81	col. 13:5-14
scoring each of the identified matching contexts;	In response to a user's input, Erica allegedly generates relevance scores for the matching contexts. The complaint provides a screenshot showing options ranked in a specific order, which allegedly reflects their underlying scores.	¶82	col. 5:9-14
selecting the matching context having a highest score to determine a most likely context for the utterance; and	Erica allegedly presents the option with the highest score first (e.g., "Transfer between my accounts"), indicating it has been selected as the most likely context.	¶83	col. 13:6-14
communicating a request to a domain agent configured to process requests in the most likely context for the utterance...	When a user selects the highest-scored option, Erica communicates the request to a specialized "domain agent" responsible for that function, such as an agent that transfers money.	¶84	col. 13:10-14

Identified Points of Contention:
- Scope Questions: A central question may be whether the term "context stack" as described in the patent, which implies a specific data structure for managing dialogue history, can be construed to cover the method by which the accused Erica system stores and accesses potential user intents. Similarly, the definition of "domain agent" will be critical—whether it reads on the specific software modules or microservices that Erica allegedly uses to process different types of financial requests.
- Technical Questions: The complaint alleges that the ranked list of options presented by Erica indicates a "scoring" process. A key technical question will be what evidence demonstrates that this ranking is the result of the specific comparison and scoring method required by claim 12, as opposed to a different form of algorithmic decision-making.

’468 Patent Infringement Allegations

Claim Element (from Independent Claim 19)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
receiving a multi-modal natural language input...including a natural language utterance and a non-speech input...	Erica receives input through multiple modes, such as a spoken utterance ("I would like to transfer money") followed by a non-speech, icon-selection input (tapping an account selection box).	¶¶104-105	col. 22:45-51
identifying the user that provided the multi-modal input;	To use Erica, a user must be logged into their Bank of America account, which identifies the user via account information, username, and password.	¶109	col. 23:15-20
creating a speech-based transcription...using a...semantic knowledge-based model, wherein the...model includes a personalized cognitive model...a general cognitive model...and an environmental model...	Erica allegedly uses a three-part model: it learns from an individual client's behavior ("personalized"), from all BofA customers ("general"), and uses location data for queries like "nearest branch" ("environmental").	¶¶111-113	col. 23:58-24:14
merging the speech-based transcription and the non-speech-based transcription to create a merged transcription;	The complaint alleges that in a dialogue where a user types "I would like to" and speaks "Transfer money," the system merges these inputs to form a complete command.	¶114	col. 12:35-38
identifying one or more entries in a context stack matching information contained in the merged transcription;	Based on the merged transcription, Erica allegedly identifies a matching entry in its context stack, such as "Transfer between my accounts."	¶117	col. 24:15-18
...identifying a domain agent associated with the most likely context...communicating a request...and generating a response...	Erica allegedly identifies the domain agent associated with the determined context (e.g., money transfer) and communicates the user's request to it, which then provides content for a response, such as account information.	¶¶118-120	col. 24:25-34

Identified Points of Contention:
- Scope Questions: The dispute may center on the definition of the three-part "semantic knowledge-based model." It raises the question of whether the general machine learning and personalization features described in Bank of America's marketing materials perform the specific functions of the claimed "personalized," "general," and "environmental" cognitive models.
- Technical Questions: An evidentiary question will be whether the accused system actually "merges" transcriptions from different modalities into a single data object as required by the claim, or if it processes sequential inputs in a different manner that does not constitute merging. The complaint provides a screenshot illustrating a multi-modal interaction where a user speaks and then taps the screen (Compl. p. 43).

V. Key Claim Terms for Construction

The Term: "context stack" (’160 Patent, Cl. 12; ’468 Patent, Cl. 19)

Context and Importance: This term appears in the independent claims of multiple asserted patents and is fundamental to how the inventions are alleged to understand conversational flow. Its construction will be critical to determining whether Erica’s method of managing potential user intents and dialogue history falls within the scope of the claims. Practitioners may focus on this term because its definition could determine whether a specific data structure is required or whether a more functional interpretation is appropriate.
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The specification suggests a functional role, stating agents "may update a context stack to enable follow-up requests" (’160 Patent, col. 4:12-14) and that it contains "an ordered list of command contexts" (’468 Patent, col. 20:59-60), which may support an interpretation covering any system that maintains an ordered history of potential commands.
- Evidence for a Narrower Interpretation: The specification also notes that knowledge agents provide information for generating a response by using data supplied from a "grammar stack" (’160 Patent, col. 11:30-32), potentially linking the "context stack" to a more specific, grammar-driven data structure rather than a general-purpose state management system.

The Term: "semantic knowledge-based model" including a "personalized cognitive model," a "general cognitive model," and an "environmental model" (’468 Patent, Cl. 19)

Context and Importance: This composite term is the core technical element of claim 19 of the ’468 Patent. The infringement analysis for this patent will turn on whether Erica's AI system can be shown to contain these three distinct, claimed model components.
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The detailed description defines the models in functional terms: the personalized model is based on "prior interactions with that user," the general model on interactions with "multiple users," and the environmental model on the "user's environment" (’468 Patent, Abstract). This language may support reading the claim on any system that learns from individual, group, and environmental data, regardless of its specific architecture.
- Evidence for a Narrower Interpretation: Figure 8 of the ’468 Patent depicts the "Personalized Cognitive model" (810), "General Cognitive Model" (806), and "Environmental Model" (808) as distinct architectural blocks that feed into a "Conversational Speech Analyzer" (804). A defendant may argue this figure requires three structurally separate and distinct models, potentially limiting the claim scope to a specific implementation rather than a functional outcome.

VI. Other Allegations

Indirect Infringement: The complaint alleges inducement by asserting that Defendant’s website, press releases, and "Financial Center Job Aid" documents instruct and encourage customers to use Erica's multi-modal (speech and text) and context-aware features in a manner that directly infringes the patent claims (Compl. ¶¶88-91, 126-127). Contributory infringement is alleged on the basis that the Erica system is "exclusively available in Bank of America’s Mobile Banking app" and is therefore not a staple article of commerce suitable for substantial non-infringing use (Compl. ¶¶92, 128).
Willful Infringement: The willfulness allegation is based on alleged pre-suit knowledge. The complaint asserts that Bank of America was aware of the patented technology as early as 2016 due to direct business discussions with VoiceBox, the company from which the patents-in-suit originated (Compl. ¶59). It further alleges knowledge based on a 2013 Bank of America patent application that cites a patent publication related to three of the asserted patents (Compl. ¶59). The complaint also alleges post-suit knowledge as of the filing date of the original complaint (Compl. ¶95).

VII. Analyst’s Conclusion: Key Questions for the Case

A core issue will be one of definitional scope: can the specific terms recited in the patents from the 2005-2012 era, such as "context stack" and "domain agent," be construed to cover the architecture of a modern, AI-driven virtual assistant like Erica? The case may turn on whether these terms are interpreted as requiring specific data structures and software modules or as describing broader functions that Erica's system performs.
A key evidentiary question will be one of functional mapping: does the accused Erica system, in its underlying operation, actually implement the multi-part "semantic knowledge-based model" (personalized, general, and environmental) as claimed in the ’468 Patent? While the complaint points to marketing language supporting each element, the case will likely require technical evidence to demonstrate that Erica’s learning and context-awareness capabilities map directly onto the distinct components and processes recited in the claim.