DCT

1:17-cv-00287

IPA Tech Inc v. NVIDIA Corp

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 1:17-cv-00287, D. Del., 03/20/2017
  • Venue Allegations: Plaintiff alleges venue is proper in the District of Delaware because Defendant is a Delaware corporation and is subject to personal jurisdiction in the district.
  • Core Dispute: Plaintiff alleges that Defendant’s products incorporating Google's voice assistant technology, such as the Shield Tablet and Shield TV, infringe three patents related to systems for speech-based navigation of remotely-stored electronic information.
  • Technical Context: The technology at issue involves using spoken natural language to search for and retrieve information, a foundational capability for modern digital assistants and voice-controlled devices.
  • Key Procedural History: The patents-in-suit originated with SRI International, which was involved in the DARPA-funded CALO project, a precursor to the Siri personal assistant. The patent portfolio was acquired by Plaintiff IPA Technologies in 2016. Subsequent to the filing of this complaint, all asserted claims of all three patents-in-suit (U.S. Patent Nos. 6,742,021; 6,523,061; and 6,757,718) were cancelled in Inter Partes Review (IPR) proceedings before the U.S. Patent and Trademark Office, with certificates of cancellation issuing in March 2020.

Case Timeline

Date | Event
1999-01-05 | Earliest Priority Date for ’021, ’061, and ’718 Patents
2003-02-18 | ’061 Patent Issued
2004-05-25 | ’021 Patent Issued
2004-06-29 | ’718 Patent Issued
2007-01-01 | Siri, Inc. formed by SRI International
2010-02-01 | Siri technology released as an iPhone 3GS app
2012-07-01 | Google Now digital assistant included with Android OS
2016-05-06 | IPA Technologies acquires SRI patent portfolio
2017-03-20 | Complaint Filed
2020-03-09 | All claims of ’061 and ’718 Patents cancelled via IPR
2020-03-12 | All claims of ’021 Patent cancelled via IPR

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 6,742,021 - "Navigating Network-Based Electronic Information Using Spoken Input With Multimodal Error Feedback," Issued May 25, 2004

The Invention Explained

  • Problem Addressed: The patent describes the difficulty users face when navigating large electronic data sources (e.g., the internet, multimedia libraries) through conventional step-by-step, text-and-click interfaces, which do not accommodate intuitive, natural language requests (’021 Patent, col. 1:15-2:19).
  • The Patented Solution: The invention provides a method where a user can make a spoken request. The system interprets the request and, if it is ambiguous or incomplete, solicits additional clarifying input from the user through a different modality (e.g., presenting a visual menu for selection). This "multi-modal dialogue" refines the query until it is specific enough to retrieve the desired information from a remote network server for presentation to the user (’021 Patent, col. 2:26-66, Fig. 4).
  • Technical Importance: This approach sought to make human-computer interaction more conversational and efficient by combining the ease of speech input with the precision of graphical feedback for error correction.

Key Claims at a Glance

  • The complaint asserts independent claim 1 (Compl. ¶26).
  • Essential elements of claim 1 include:
    • receiving a spoken request;
    • rendering an interpretation of the request;
    • constructing a navigation query from the interpretation;
    • soliciting additional input from the user in a non-spoken modality different from the original request, without the user having to explicitly request it;
    • refining the navigation query with the additional input;
    • using the refined query to select data from a remote electronic data source; and
    • transmitting the selected data to the user's client device.
  • The complaint reserves the right to assert additional claims (Compl. ¶28, n.1).
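
For orientation only, the sequence of steps summarized in the claim 1 elements above can be sketched in Python. This is an illustrative aid under stated assumptions, not the patent's disclosed implementation and not a description of any accused product. Every name in it (CATALOG, interpret, select, navigate, choose_from_menu) is hypothetical, the "interpretation" is simple keyword spotting over a text transcript, and the "remote data source" is an in-memory list.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a remote electronic data source.
CATALOG: List[Dict[str, object]] = [
    {"title": "Alien", "genre": "sci-fi", "year": 1979},
    {"title": "Aliens", "genre": "sci-fi", "year": 1986},
    {"title": "Heat", "genre": "crime", "year": 1995},
]


def interpret(spoken_text: str) -> Dict[str, object]:
    """Toy interpretation: spot known genre keywords in a transcript
    and turn them into a partial navigation query."""
    query: Dict[str, object] = {}
    for genre in ("sci-fi", "crime"):
        if genre in spoken_text.lower():
            query["genre"] = genre
    return query


def select(query: Dict[str, object],
           catalog: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return every catalog entry that matches all fields of the query."""
    return [item for item in catalog
            if all(item.get(key) == value for key, value in query.items())]


def navigate(spoken_text: str,
             choose_from_menu: Callable[[List[str]], str],
             catalog: List[Dict[str, object]] = CATALOG) -> List[Dict[str, object]]:
    """Walk through the summarized steps in order (assumed mapping)."""
    query = interpret(spoken_text)        # interpret request, build query
    matches = select(query, catalog)
    if len(matches) > 1:
        # Query is ambiguous: solicit input in a non-spoken modality
        # (a menu) without the user having asked for a menu.
        titles = [str(match["title"]) for match in matches]
        query["title"] = choose_from_menu(titles)   # refine the query
        matches = select(query, catalog)            # select with refined query
    return matches                        # stands in for transmission to client
```

In this sketch, a call such as navigate("show me a sci-fi movie", pick) would invoke the hypothetical pick callback with a two-item menu, whereas a request that already resolves to a single entry never triggers the menu. That contrast loosely mirrors the distinction between proactively soliciting clarifying input and simply returning results.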

U.S. Patent No. 6,523,061 - "System, Method, and Article of Manufacture For Agent-Based Navigation in a Speech-Based Data Navigation System," Issued February 18, 2003

The Invention Explained

  • Problem Addressed: The patent identifies the same core problem as the ’021 Patent: enabling intuitive, voice-driven navigation of complex data sources without requiring users to conform to rigid, pre-existing navigation structures (’061 Patent, col. 1:21-2:20).
  • The Patented Solution: This invention proposes an "agent-based" architecture to solve the problem. A central "facilitator" manages data flow by receiving an interpreted user request and routing it to one or more specialized "agents" (e.g., a web database agent, an email agent). The facilitator maintains a registry of each agent's capabilities, allowing it to delegate tasks to the appropriate module to retrieve information (’061 Patent, Abstract; col. 13:16-14:65, Fig. 6).
  • Technical Importance: This agent-based architecture provides a modular and extensible framework for a voice-assistant system, allowing new capabilities to be added by registering new agents with the facilitator.

Key Claims at a Glance

  • The complaint asserts independent claim 1 (Compl. ¶38).
  • Essential elements of claim 1 include:
    • receiving and interpreting a spoken request;
    • constructing a navigation query;
    • routing the query to at least one agent that uses the query to select data;
    • invoking a user interface agent to output the selected data;
    • wherein a "facilitator manages data flow among multiple agents and maintains a registration of each of said agents' capabilities."
  • The complaint reserves the right to assert additional claims (Compl. ¶39, n.2).
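
As a rough illustration of the architecture summarized above, the following Python sketch shows one way a single component could both route queries among agents and keep a registry of their capabilities. It is a minimal sketch under stated assumptions, not the ’061 Patent's disclosed design and not a model of the accused software. The class and function names (Facilitator, weather_agent, email_agent, ui_agent, handle_request) and the dictionary-shaped query are all hypothetical.

```python
from typing import Callable, Dict, List

# An agent is modeled (hypothetically) as a function from a query to a result.
Agent = Callable[[Dict[str, str]], str]


class Facilitator:
    """Single component that manages data flow among agents and
    maintains a registration of each agent's capabilities."""

    def __init__(self) -> None:
        self._registry: Dict[str, List[Agent]] = {}

    def register(self, capability: str, agent: Agent) -> None:
        """Record that an agent can handle a named capability."""
        self._registry.setdefault(capability, []).append(agent)

    def capabilities(self) -> List[str]:
        """List the registered capabilities in sorted order."""
        return sorted(self._registry)

    def route(self, query: Dict[str, str]) -> List[str]:
        """Send the query to every agent registered for its capability."""
        agents = self._registry.get(query["capability"], [])
        return [agent(query) for agent in agents]


def weather_agent(query: Dict[str, str]) -> str:
    return f"weather for {query['subject']}"


def email_agent(query: Dict[str, str]) -> str:
    return f"email search: {query['subject']}"


def ui_agent(results: List[str]) -> str:
    """Hypothetical user interface agent: formats results for display."""
    return "\n".join(results) if results else "no results"


def handle_request(facilitator: Facilitator, query: Dict[str, str]) -> str:
    """Route the navigation query, then invoke the UI agent on the output."""
    return ui_agent(facilitator.route(query))
```

New capabilities are added here by calling register with another agent, which is the modularity point noted above. A system that spread routing and registration across separate components would look different from this sketch.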

Multi-Patent Capsule

  • Patent Identification: U.S. Patent No. 6,757,718, "Mobile Navigation of Network-Based Electronic Information Using Spoken Input," Issued June 29, 2004.
  • Technology Synopsis: This patent adapts the core speech-based navigation technology to a mobile environment. It claims a method where a user makes a spoken request using a "mobile information appliance" (which includes a portable remote control or a set-top box) that is in communication with remote network servers via an established data link (’718 Patent, Abstract, Claim 1).
  • Asserted Claims: The complaint asserts independent claim 1 (Compl. ¶50).
  • Accused Features: The complaint accuses NVIDIA Android TV products, which utilize a portable remote control for voice-based navigation, of infringing this patent (Compl. ¶51).

III. The Accused Instrumentality

Product Identification

  • The complaint identifies two categories of accused products: "NVIDIA Google Now-enabled products," such as the Shield Tablet K1, and "NVIDIA Android TV products," such as the Shield TV (Compl. ¶19, ¶21).

Functionality and Market Context

  • The core accused functionality is the voice search capability provided by the Google Now digital assistant and the Android TV operating system (Compl. ¶14, ¶20). The complaint alleges that this functionality allows users to interact with their devices using natural spoken language to retrieve information like directions, weather, and media content (Compl. ¶16, ¶18).
  • The system is described as processing a spoken request, retrieving information from remote data sources, and presenting the results to the user visually in the form of "cards" or other displays (Compl. ¶15, ¶17).
  • The complaint provides no probative visual evidence.

IV. Analysis of Infringement Allegations

U.S. Patent No. 6,742,021 Infringement Allegations

Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
A method for speech-based navigation of an electronic data source... | The NVIDIA Google Now-enabled products use speech-based navigation of an electronic data source. | ¶28 | col. 1:15-18
(a) receiving a spoken request for desired information from the user; | The products receive a spoken request for information (such as directions, calendar, weather, flight, sports, and restaurant information). | ¶28 | col. 3:56-61
(b) rendering an interpretation of the spoken request; | The products render an interpretation of the spoken request. | ¶28 | col. 7:1-8:41
(c) constructing at least part of a navigation query based upon the interpretation; | The products construct at least part of a navigation query based on the spoken request. | ¶28 | col. 8:51-10:48
(d) soliciting additional input from the user, including user interaction in a non-spoken modality... without requiring the user to request said non-spoken modality; | The products solicit additional input from the user, including user interaction in a non-spoken modality different than the original request without requiring the user to request the non-spoken modality. | ¶28 | col. 10:50-11:14
(e) refining the navigation query, based upon the additional input; | The complaint makes a conclusory allegation that this step is met. | ¶28 | col. 11:15-34
(f) using the refined navigation query to select a portion of the electronic data source; and | The complaint makes a conclusory allegation that this step is met. | ¶28 | col. 11:15-34
(g) transmitting the selected portion of the electronic data source from the network server to a client device of the user. | The products transmit the selected portion from a network server to the NVIDIA Google Now-enabled products. | ¶28 | col. 11:25-34

Identified Points of Contention

  • Technical Questions: The complaint asserts that the accused products perform the step of "soliciting additional input... in a non-spoken modality" (Compl. ¶28), but provides no specific factual example of this interaction. A central question is what evidence exists that the accused products proactively present non-spoken (e.g., visual menu) options to resolve ambiguity, as opposed to simply returning a list of search results or failing the query.

U.S. Patent No. 6,523,061 Infringement Allegations

Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
A method for utilizing agents for speech-based navigation of an electronic data source... | Defendant's products use speech-based navigation. | ¶39, ¶40 | col. 1:21-27
(a) receiving a spoken request... (b) rendering an interpretation... (c) constructing a navigation query... | The products receive a spoken request, render an interpretation, and construct a navigation query. | ¶39, ¶40 | col. 7:1-8:67
(d) routing the navigation query to at least one agent, wherein the at least one agent utilizes the navigation query to select a portion of the electronic data source; | The products route the navigation query to at least one agent that utilizes the query to select a portion of the electronic data source. | ¶39, ¶40 | col. 13:46-51
(e) invoking a user interface agent for outputting the selected portion... | The products invoke a user interface agent for outputting the selected portion of the electronic data source. | ¶39, ¶40 | col. 14:45-48
wherein a facilitator manages data flow among multiple agents and maintains a registration of each of said agents' capabilities. | The complaint alleges that in the accused products, a facilitator manages data flow among multiple agents and maintains a registration of each of the agents' capabilities. | ¶39, ¶40 | col. 13:25-45

Identified Points of Contention

  • Scope Questions: The complaint alleges that the accused system uses "agents" and a "facilitator" as claimed (Compl. ¶39), but provides no detail on how the Google Now/Android TV software architecture maps to this specific structure. A key question is whether the accused system contains a single "facilitator" component that performs both the functions of managing data flow and maintaining a registration of agent capabilities, as required by the claim.

V. Key Claim Terms for Construction

For the ’061 Patent

  • The Term: "facilitator"
  • Context and Importance: This term is the central limitation of independent claim 1 and defines the specific agent-based architecture. The infringement analysis hinges on whether the accused Google system can be shown to possess a component that meets the structural and functional definition of the claimed "facilitator." Practitioners may focus on this term because the patent appears to teach a specific, centralized architecture, and the accused system may have a different, more distributed or integrated structure.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The claim itself defines the facilitator functionally as something that "manages data flow among multiple agents and maintains a registration of each of said agents' capabilities" (’061 Patent, col. 16:4-7). A party might argue this functional language is not limited to a specific implementation and covers any system that coordinates different services.
    • Evidence for a Narrower Interpretation: The detailed description and Figure 6 depict a distinct, central "Facilitator 600" that connects to various specialized agents (e.g., Speech Recognition Agent, Web Database Agent, Calendar Agent) (’061 Patent, Fig. 6; col. 13:25-45). A party may argue this embodiment limits the term to a single, centralized component that performs all the recited functions, as opposed to a system where those functions might be distributed or handled differently.

VI. Other Allegations

  • Indirect Infringement: For all three patents, the complaint alleges active inducement under 35 U.S.C. § 271(b). The factual basis alleged is that Defendant encourages and instructs end users to perform the claimed methods through marketing materials, product manuals, and instructional videos, citing blog posts and YouTube videos as examples (Compl. ¶31, ¶43, ¶54).
  • Willful Infringement: The complaint alleges that Defendant gained knowledge of the patents "no later than the filing of this complaint or shortly thereafter" (Compl. ¶30, ¶42, ¶53). This allegation supports a claim for willful infringement based only on post-filing conduct. The plaintiff explicitly reserves the right to request a finding of willfulness if pre-suit knowledge is established during discovery (Compl. ¶32, ¶44, ¶56).

VII. Analyst’s Conclusion: Key Questions for the Case

The complaint, as filed, raised several technical and legal questions. However, the subsequent cancellation of all asserted claims appears to be dispositive. The threshold question for this case is now procedural, and the technical questions that follow it are of historical interest:

  • A core issue is one of viability: Given that all asserted claims across all three patents-in-suit were cancelled in Inter Partes Review proceedings after the complaint was filed, the fundamental question is what legal basis, if any, remains for the lawsuit to proceed.
  • A key historical question of architectural mapping would have been whether Google's software architecture, as implemented in NVIDIA's products, contains a "facilitator" that both manages data flow and maintains an agent registry, as specifically required by the ’061 Patent.
  • A key historical question of functional operation would have been what evidence demonstrates that the accused products proactively use "non-spoken" modalities to resolve speech ambiguities, as required by the ’021 Patent, versus merely presenting a list of possible results.