DCT

1:24-cv-00839

VB Assets LLC v. Amazon.com Services LLC

I. Executive Summary and Procedural Information

Parties & Counsel:
- Plaintiff: VB Assets, LLC (Delaware)
- Defendant: Amazon.com Services, LLC (Delaware)
- Plaintiff’s Counsel: Smith, Katzenstein & Jenkins LLP
Case Identification: 1:24-cv-00839, D. Del., 07/18/2024
Venue Allegations: Venue is alleged to be proper in the District of Delaware because Defendant is a Delaware corporation and therefore resides in the district for patent venue purposes.
Core Dispute: Plaintiff alleges that Defendant’s Alexa-enabled products and services infringe five U.S. patents related to cooperative conversational voice interfaces, voice-driven commerce, and voice-based targeted advertising.
Technical Context: The patents relate to the field of natural language understanding and conversational artificial intelligence, a technology central to the virtual assistant and smart speaker market.
Key Procedural History: The complaint follows prior litigation between the parties, which culminated in a November 2023 jury verdict finding that Amazon willfully infringed four related patents and resulted in a damages award of $46.7 million and an ongoing royalty. The complaint also alleges an extensive history of pre-suit interactions, including meetings in 2011 and 2017 at which Plaintiff's predecessor, VoiceBox, allegedly disclosed its patented technology to Amazon, as well as a formal notice letter sent in May 2024.

Case Timeline

Date	Event
2006-10-16	Priority Date for ’249 and ’699 Patents
2007-02-06	Priority Date for ’758 Patent
2009-11-10	Priority Date for ’025 Patent
2011-10-07	VoiceBox presents its technology to Amazon via teleconference
2011-10-26	VoiceBox and Amazon executives and engineers meet at VoiceBox's office
2014-09-16	Priority Date for ’385 Patent
2014-11-01	Amazon announces the launch of Alexa and the Echo smart speaker (approx. date)
2016-11-22	U.S. Patent No. 9,502,025 Issues
2017-02-02	VoiceBox discloses the pending application for the ’249 Patent to Amazon
2018-01-04	’025 Patent or its application is cited during prosecution of an Amazon-assigned patent
2019-05-21	U.S. Patent No. 10,297,249 Issues
2019-07-29	First VoiceBox lawsuit filed against Amazon
2020-08-25	U.S. Patent No. 10,755,699 Issues
2021-08-03	U.S. Patent No. 11,080,758 Issues
2021-08-10	U.S. Patent No. 11,087,385 Issues
2023-11-08	Delaware jury finds Amazon willfully infringed four related VoiceBox patents
2024-05-01	Plaintiff sends notice letter to Amazon regarding infringement of the patents-in-suit
2024-07-18	Complaint Filing Date

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 10,297,249 - "System and Method for a Cooperative Conversational Voice User Interface"

Patent Identification: U.S. Patent No. 10,297,249, "System and Method for a Cooperative Conversational Voice User Interface," issued May 21, 2019 (Compl. ¶41).

The Invention Explained

Problem Addressed: The patent addresses the limitations of prior art "Command and Control" voice systems, which forced users to memorize exact words and navigate rigid verbal menus, failing to provide a seamless or intuitive conversational experience (Compl. ¶¶47-48; ’249 Patent, col. 2:5-9).
The Patented Solution: The patent describes a "cooperative conversational voice user interface" that improves natural language processing by integrating information from multiple sources. It generates "short-term knowledge" based on both voice inputs and prior "multi-modal" non-voice interactions (e.g., screen touches) within the same conversation to better understand context (Compl. ¶42; ’249 Patent, Abstract). The patent also claims a method for enhancing speech recognition accuracy by receiving the same utterance from two different input devices (e.g., microphones), comparing the inputs, and filtering sound based on that comparison to reduce noise (Compl. ¶42; ’249 Patent, col. 4:36-54).
Technical Importance: The technology aimed to improve the functioning of voice interfaces by allowing users to "converse naturally" rather than simplifying or "dumbing down" their requests to fit a rigid system structure (Compl. ¶51).

Key Claims at a Glance

The complaint asserts claims 1-10 and 12-15 (Compl. ¶97). Independent claims 1 (method) and 16 (system) are asserted.
The essential elements of independent system claim 16 include processors programmed to:
- receive, during a first conversation, a first voice input comprising a first natural language utterance via a first input device;
- receive a second voice input comprising the first natural language utterance via a second input device;
- compare the first voice input with the second voice input;
- filter sound from the first voice input and the second voice input based on the comparison;
- obtain, during the first conversation, a user interface state related to one or more non-voice inputs associated with the first voice input;
- generate short-term knowledge based on at least the first voice input and the first non-voice input;
- determine, based on the short-term knowledge, a first context for the first natural language utterance; and
- determine an interpretation and generate a response based on that context.
The complaint reserves the right to assert dependent claims (Compl. ¶97).
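
For orientation only, the claim 16 elements listed above can be pictured as a small data flow. The Python sketch below is a hypothetical illustration, not the patent's disclosed implementation and not a description of how any accused product works. The function names, the sample-agreement rule, the tolerance value, and the "screen" key are all assumptions introduced here, and nothing in the sketch bears on how the claim terms should be construed.

```python
from dataclasses import dataclass, field


def filter_by_comparison(first, second, tolerance=0.25):
    """Toy comparison-based filter (assumed rule, not taken from the patent).

    Samples on which the two inputs agree within `tolerance` are averaged;
    samples on which they disagree are zeroed out as device-local noise.
    """
    if len(first) != len(second):
        raise ValueError("both inputs must have the same number of samples")
    filtered = []
    for a, b in zip(first, second):
        if abs(a - b) <= tolerance:
            filtered.append((a + b) / 2)
        else:
            filtered.append(0.0)
    return filtered


@dataclass
class ShortTermKnowledge:
    """Per-conversation record of voice inputs and non-voice UI states."""
    utterances: list = field(default_factory=list)
    ui_states: list = field(default_factory=list)

    def add(self, utterance, ui_state):
        self.utterances.append(utterance)
        self.ui_states.append(ui_state)


def determine_context(knowledge):
    """Hypothetical rule: the most recent UI state naming a screen wins."""
    for state in reversed(knowledge.ui_states):
        if state and "screen" in state:
            return state["screen"]
    return "general"


# Usage: two microphones capture the same utterance, then context is derived.
clean = filter_by_comparison([0.5, 0.5, 1.0, -0.25], [0.5, 0.25, 0.0, -0.25])
knowledge = ShortTermKnowledge()
knowledge.add("play this one", {"screen": "music"})
context = determine_context(knowledge)
```

The sketch keeps the comparison and the filtering as one step so that the output depends on the comparison itself, which is the feature the complaint says distinguished the claims during prosecution (Compl. ¶55).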

U.S. Patent No. 10,755,699 - "System and Method for a Cooperative Conversational Voice User Interface"

Patent Identification: U.S. Patent No. 10,755,699, "System and Method for a Cooperative Conversational Voice User Interface," issued August 25, 2020 (Compl. ¶44).

The Invention Explained

Problem Addressed: Like its parent ’249 Patent, the ’699 Patent addresses the rigidity of prior art "Command and Control" voice systems (Compl. ¶47).
The Patented Solution: This invention enhances conversational interaction by adapting system responses based on how a user speaks. The system accumulates both "short-term knowledge" from the current conversation and "long-term knowledge" from prior conversations (Compl. ¶45). It then uses this combined knowledge to "identify a manner" in which an utterance was spoken—defined as including tone, pace, inflection, and word use—and generates a response based on both the interpretation of the words and this identified manner (Compl. ¶¶45, 57; ’699 Patent, Abstract). The invention also describes expiring short-term session data after a "psychologically appropriate amount of time to humanize system behavior" (Compl. ¶56).
Technical Importance: The invention aims to significantly improve conversational accuracy and the natural flow of interactions by making the system responsive not just to what is said, but how it is said (Compl. ¶56).

Key Claims at a Glance

The complaint asserts claims 1, 3-8, and 11 (Compl. ¶105). Independent claim 1 is asserted.
The essential elements of independent system claim 1 include processors programmed to:
- receive a user input comprising a natural language utterance;
- recognize words or phrases from the utterance;
- identify a context for the utterance based on the recognized words;
- determine an interpretation of the utterance based on the context;
- accumulate short-term knowledge from utterances received during a single conversation;
- accumulate long-term knowledge from utterances received prior to that conversation period;
- identify a manner in which the utterance was spoken based on the short-term and long-term knowledge; and
- generate a response based on the interpretation and the identified manner.
The complaint reserves the right to assert dependent claims (Compl. ¶105).
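
As a reading aid, the sequence of elements in claim 1 can be sketched in Python. This is a hypothetical toy, not the patent's method and not a description of any accused product. It picks speaking pace (words per second) as the only signal of "manner" purely for simplicity, which is an assumption made here. The names Utterance, identify_manner, and generate_response, the 1.25 ratio, and the labels "hurried", "relaxed", and "typical" are likewise invented for illustration and take no position on claim construction.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    text: str
    duration_s: float  # how long the user took to say it


def pace(utterance):
    """Words per second for one utterance."""
    return len(utterance.text.split()) / utterance.duration_s


def identify_manner(current, short_term, long_term, ratio=1.25):
    """Compare the current pace against a baseline built from both stores.

    short_term: utterances from the current conversation.
    long_term: utterances from earlier conversations.
    """
    history = long_term + short_term
    if not history:
        return "typical"
    baseline = sum(pace(u) for u in history) / len(history)
    current_pace = pace(current)
    if current_pace > baseline * ratio:
        return "hurried"
    if current_pace < baseline / ratio:
        return "relaxed"
    return "typical"


def generate_response(interpretation, manner):
    """Shape the reply using both the interpretation and the manner."""
    if manner == "hurried":
        return interpretation  # keep it brief for a user in a rush
    return f"Sure. {interpretation}"


# Usage with made-up data.
long_term = [Utterance("what is the weather today", 2.5)]
short_term = [Utterance("and tomorrow please", 1.5)]
manner = identify_manner(Utterance("set a timer now", 1.0), short_term, long_term)
reply = generate_response("Timer set.", manner)
```

A pace-based signal requires timing information from the speech itself, whereas a variant keyed only to word choice would not. That difference tracks the scope question the parties are likely to contest.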

U.S. Patent No. 11,087,385 - "Voice Commerce"

Patent Identification: U.S. Patent No. 11,087,385, "Voice Commerce," issued August 10, 2021 (Compl. ¶59).
Technology Synopsis: The patent addresses the problem of cumbersome online shopping on mobile devices, which traditionally required users to browse websites and fill out forms (Compl. ¶¶62-63). The patented solution is a voice commerce system that can identify a product for purchase from a single user utterance, receive a second, separate confirmation input, and then complete the transaction without requiring any further user input for product selection or shipping/payment details (Compl. ¶¶60, 70-71).
Asserted Claims: Independent claims 1, 11, and 31, along with various dependent claims (Compl. ¶113).
Accused Features: The voice-based purchasing functionality integrated into Amazon's Alexa platform (Compl. ¶¶1, 113).
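
The two-input purchase flow described in the synopsis can be pictured as a small state machine. The Python sketch below is a hypothetical illustration under stated assumptions, not the patented system and not any accused product: the catalog, the StoredProfile and VoiceCheckout names, the substring matching rule, and the accepted confirmation words are all invented here.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical catalog; product names and prices are made up.
CATALOG = {"paper towels": 12.99, "batteries": 8.49}


@dataclass
class StoredProfile:
    """Shipping and payment details already on file for the user."""
    shipping_address: str
    payment_method: str


@dataclass
class VoiceCheckout:
    profile: StoredProfile
    pending: Optional[str] = None
    orders: list = field(default_factory=list)

    def handle(self, utterance):
        text = utterance.lower().strip()
        if self.pending is None:
            # First input: identify the product from the single utterance.
            for name, price in CATALOG.items():
                if name in text:
                    self.pending = name
                    return f"Found {name} for ${price:.2f}. Say yes to buy."
            return "No matching product found."
        # Second input: a separate confirmation, then no further prompts.
        product, self.pending = self.pending, None
        if text in {"yes", "confirm"}:
            self.orders.append({
                "product": product,
                "price": CATALOG[product],
                "ship_to": self.profile.shipping_address,
                "pay_with": self.profile.payment_method,
            })
            return f"Ordered {product}."
        return "Order cancelled."


# Usage with made-up data.
checkout = VoiceCheckout(StoredProfile("1 Main St", "card on file"))
prompt = checkout.handle("Order some batteries")
result = checkout.handle("Yes")
```

In this sketch the order is completed from stored profile data, so the user supplies nothing beyond the product utterance and the confirmation.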

U.S. Patent No. 11,080,758 - "System and Method for Delivering Targeted Advertisements and/or Providing Natural Language Processing Based on Advertisements"

Patent Identification: U.S. Patent No. 11,080,758, "System and Method for Delivering Targeted Advertisements and/or Providing Natural Language Processing Based on Advertisements," issued August 3, 2021 (Compl. ¶74).
Technology Synopsis: The patent addresses the failure of prior voice interfaces to facilitate productive dialogue or effectively utilize voice-based interactions for marketing and advertising (Compl. ¶¶78-79). The invention is a system that processes a natural language utterance to determine a context, selects a "purchase opportunity" (advertisement) based on that context, tracks the user’s interaction pattern with the delivered opportunity (including transaction completion), updates a user-specific profile based on that pattern, and then uses the profile to interpret subsequent utterances for selecting future purchase opportunities (Compl. ¶75).
Asserted Claims: Independent claims 1, 9, and 18, along with various dependent claims (Compl. ¶121).
Accused Features: The voice-driven advertisement selection and delivery systems within the Alexa platform (Compl. ¶¶1, 121).
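
The feedback loop in the synopsis (context, purchase opportunity, tracked interaction, profile update, later interpretation) can be sketched as follows. This Python example is a hypothetical illustration only: the keyword table, the ad inventory, the UserProfile fields, and the tie-breaking rule are assumptions made here, and the sketch does not describe the patent's implementation or any accused product.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical keyword-to-context table and ad inventory.
KEYWORDS = {"song": "music", "play": "music", "recipe": "cooking", "bake": "cooking"}
ADS = {"music": ["concert tickets", "headphones"], "cooking": ["cookware", "meal kit"]}


@dataclass
class UserProfile:
    completed_by_ad: Counter = field(default_factory=Counter)
    completed_by_context: Counter = field(default_factory=Counter)


def determine_context(utterance, profile):
    """Use keywords first; fall back to the profile for ambiguous requests."""
    for word in utterance.lower().split():
        if word in KEYWORDS:
            return KEYWORDS[word]
    if profile.completed_by_context:
        return profile.completed_by_context.most_common(1)[0][0]
    return None


def select_opportunity(context, profile):
    """Pick the ad the user has completed most often; ties go to the first listed."""
    if context not in ADS:
        return None
    return max(ADS[context], key=lambda ad: profile.completed_by_ad[ad])


def record_interaction(profile, context, ad, completed):
    """Track the interaction pattern; only completed transactions count here."""
    if completed:
        profile.completed_by_ad[ad] += 1
        profile.completed_by_context[context] += 1


# Usage with made-up data.
profile = UserProfile()
first_ad = select_opportunity(determine_context("play a song", profile), profile)
record_interaction(profile, "music", "headphones", completed=True)
later_context = determine_context("surprise me", profile)
later_ad = select_opportunity(later_context, profile)
```

After one completed purchase, the profile changes both how an ambiguous utterance is interpreted and which opportunity is selected next.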

U.S. Patent No. 9,502,025 - "System and Method for Providing a Natural Language Content Dedication Service"

Patent Identification: U.S. Patent No. 9,502,025, "System and Method for Providing a Natural Language Content Dedication Service," issued November 22, 2016 (Compl. ¶87).
Technology Synopsis: The patent addresses shortcomings in "Command and Control" systems that were ill-suited for multi-modal interactions like dedicating digital content (e.g., sending a song with a personal message) to another person (Compl. ¶¶90-91). The solution is a service that processes a first utterance to identify content for dedication, receives a second utterance to be associated with it, provides the words of the second utterance as textual annotations in the content's metadata, and sends the recipient information to access both the content and the associated second utterance (Compl. ¶88).
Asserted Claims: Independent claims 8 and 14, along with various dependent claims (Compl. ¶129).
Accused Features: Alexa's functionality for sending digital content, such as music or messages, to other users (Compl. ¶129).

III. The Accused Instrumentality

Product Identification: The complaint accuses "Alexa Products," a category defined to include the Alexa virtual assistant, all associated hardware and software, Echo devices, Alexa mobile apps, Fire TV, Amazon smart glasses, and third-party devices with Alexa integrated (Compl. ¶11, n.2).
Functionality and Market Context: The complaint alleges that the accused products provide "improved contextual conversations and voice driven advertisement and commerce," including through the use of large language models (LLMs) (Compl. ¶1). The complaint cites a September 2023 Amazon announcement promoting Alexa's ability to use "personalization and context" by carrying "over relevant context throughout conversations, in the same way that humans do all the time" (Compl. ¶32). This announcement is presented as a screenshot showing Amazon's description of Alexa's contextual capabilities (Compl. p. 14). A second screenshot shows a live demonstration where Alexa allegedly uses knowledge of a user's prior interactions to identify his favorite football team and provide a tailored response (Compl. p. 14, ¶33).

IV. Analysis of Infringement Allegations

The complaint references exemplary claim charts in exhibits that were not provided. The following summary is based on the narrative infringement theories articulated within the complaint.

’249 Patent Infringement Allegations: The complaint alleges that Alexa Products perform the claimed method of facilitating natural language responses using multi-modal inputs (Compl. ¶¶42, 97). The theory suggests Alexa devices receive voice input from multiple microphones, compare these inputs to filter noise, and integrate non-voice user interface states with voice input to generate "short-term knowledge." This knowledge is then allegedly used to determine context and generate a response, mapping to the core steps of the asserted claims (Compl. ¶42).
Identified Points of Contention:
- Technical Question: What evidence does the complaint provide that Alexa performs the specific step of filtering sound "based on the comparison" of two distinct inputs containing the same utterance, as required by the claims and emphasized during prosecution (Compl. ¶55), versus employing more general noise-cancellation or beamforming techniques?
- Scope Question: How does the complaint map the claimed step of obtaining a "user interface state related to one or more non-voice inputs" to the operation of predominantly voice-first devices like an Amazon Echo?
’699 Patent Infringement Allegations: The infringement theory for the ’699 Patent centers on Alexa's alleged ability to adapt responses based on a user's manner of speaking (Compl. ¶¶45, 105). The complaint posits that Alexa accumulates short-term and long-term knowledge about a user's utterances and uses this knowledge to identify the "manner" (e.g., tone, pace, inflection) in which a request is made, generating a response based on this identified manner in addition to the content of the words themselves (Compl. ¶45).
Identified Points of Contention:
- Technical Question: What factual basis does the complaint offer for the allegation that Alexa analyzes prosodic or acoustic features of speech (tone, pace) as the claimed "manner" of speaking, and that this analysis is based on accumulated "short-term and long-term knowledge"?
- Scope Question: Can the term "identify a manner," which the patent defines with specific acoustic examples (Compl. ¶57), be construed to cover inferences of user intent based solely on word choice or conversational history, without analysis of the speech signal itself?

V. Key Claim Terms for Construction

Term from ’249 Patent: "filter sound from the first voice input and the second voice input based on the comparison" (from claim 16)
- Context and Importance: This term appears central to the patent's novelty over prior art, as noted in the complaint's summary of the prosecution history (Compl. ¶55). The definition will be critical to determining whether Alexa's noise-handling technology performs the specific comparative process claimed, or a different, non-infringing one.
- Intrinsic Evidence for Interpretation:
  - Evidence for a Narrower Interpretation: The complaint highlights that the patent was granted because the prior art failed to describe filtering "where the filtering is based on the comparison" of two inputs (Compl. ¶55). This prosecution history suggests a specific, technically distinct filtering method is required, not just any noise reduction.
  - Evidence for a Broader Interpretation: The detailed description may contain broader language describing the general goal of filtering out "environmental and non-human noise" by "comparing a speech signal from the various microphones," which could be argued to encompass a wider range of multi-microphone noise reduction techniques (’249 Patent, col. 4:46-51).
Term from ’699 Patent: "identify a manner in which the natural language utterance was spoken" (from claim 1)
- Context and Importance: This is the core inventive concept of the asserted claim. Its construction will determine whether infringement requires direct analysis of acoustic and prosodic features of speech or if it can be met by higher-level analysis of user intent or sentiment.
- Intrinsic Evidence for Interpretation:
  - Evidence for a Narrower Interpretation: The specification explicitly defines "manner" as including "at least one of tone, pace, timing, inflection, word use, and/or jargon" (Compl. ¶57; ’699 Patent, col. 6:21-23). This list of predominantly acoustic characteristics provides strong support for a construction requiring analysis beyond mere semantics.
  - Evidence for a Broader Interpretation: A party might argue that elements like "word use" and "jargon" from the patent's own definition are not purely acoustic, potentially opening the door for the term to cover systems that adapt based on lexical choice patterns derived from long-term knowledge, without direct prosodic analysis.

VI. Other Allegations

Indirect Infringement: The complaint alleges inducement by asserting that Amazon writes infringing software for Alexa Products, instructs users on its infringing use through manuals and technical support, and profits from the infringing activity (Compl. ¶¶98, 106, 114, 122, 130). Contributory infringement is also alleged on the basis that Alexa Products are especially made or adapted for infringement and are not staple articles of commerce with substantial non-infringing uses (Compl. ¶¶99, 107, 115, 123, 131).
Willful Infringement: Willfulness is a central theme, with the complaint alleging pre-suit knowledge of the patents and technology based on: (1) a series of meetings between VoiceBox and Amazon beginning in 2011 (Compl. ¶¶4-9); (2) specific disclosure of the application leading to the ’249 Patent at a February 2, 2017 meeting (Compl. ¶¶24, 98); (3) the ’025 Patent being cited during the prosecution of an Amazon-assigned patent in 2018 (Compl. ¶130); (4) a May 2024 notice letter (Compl. ¶35); and (5) the November 2023 jury verdict finding Amazon’s infringement of related patents to be willful (Compl. ¶¶1, 31).

VII. Analyst’s Conclusion: Key Questions for the Case

A key evidentiary question will be one of specific technical operation: Does the complaint provide a sufficient factual basis to suggest that Amazon's Alexa platform, which now relies heavily on LLMs, performs the specific, multi-step processes recited in the patent claims, such as filtering sound "based on the comparison" of two identical utterances (’249 Patent) or analyzing acoustic "manner of speaking" (’699 Patent)? The case may turn on whether Plaintiff can prove these granular functions exist within Alexa's complex architecture.
A central legal issue will be Amazon's state of mind: Given the extensive alleged history of disclosures, the citation of the ’025 patent in Amazon's own prosecution, and particularly the prior jury verdict of willful infringement on related patents, can Amazon mount a credible defense against willfulness? This history suggests a significant challenge for the defense and may heavily influence potential damages and the possibility of the case being deemed exceptional.
A core issue will be one of technological scope: Can the claims, which describe systems with more modular components like distinct "context domain agents," be construed to read on the functionality of a more integrated and holistic large language model? The court's interpretation of the claim language in light of this technological shift will likely be a focal point of the dispute.