SoundClear Technologies LLC v. Google LLC

I. Executive Summary and Procedural Information

  • Parties: Plaintiff SoundClear Technologies LLC; Defendant Google LLC
  • Case Identification: 1:24-cv-01281, E.D. Va., 07/25/2024
  • Venue Allegations: Venue is alleged to be proper based on Google's regular and established places of business within the Eastern District of Virginia, including offices in Reston, and its business activities and investments, such as data centers, within the district.
  • Core Dispute: Plaintiff alleges that Defendant’s smart devices (Google Home/Nest), mobile devices (Pixel), and the Google Assistant software infringe three patents related to adaptive voice-based user interfaces and multi-touch screen gestures.
  • Technical Context: The technologies at issue involve core features of modern smart assistants and touchscreen devices: dynamically altering voice responses for privacy and clarity, and interpreting multi-touch gestures like pinch-to-zoom.
  • Key Procedural History: The complaint states the patents-in-suit were developed by JVC Kenwood ("JVCK"), described as a "major audio processing product power house," and were subsequently acquired by Plaintiff SoundClear Technologies LLC.

Case Timeline

Date | Event
2011-09-28 | Earliest Priority Date for U.S. Patent No. 9,223,487
2015-12-29 | U.S. Patent No. 9,223,487 Issued
2018-03-06 | Earliest Priority Date for U.S. Patent No. 11,069,337
2018-03-12 | Earliest Priority Date for U.S. Patent No. 11,244,675
2021-07-20 | U.S. Patent No. 11,069,337 Issued
2022-02-08 | U.S. Patent No. 11,244,675 Issued
2024-07-25 | Complaint Filed

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 11,069,337 - "Voice-Content Control Device, Voice-Content Control Method, and Non-Transitory Storage Medium," Issued July 20, 2021

The Invention Explained

  • Problem Addressed: The patent describes a problem with voice-output devices where the audio output may be an annoyance to people other than the user. However, simply lowering the volume may make the content difficult for the intended user to understand (Compl. ¶21; ’337 Patent, col. 1:29-42).
  • The Patented Solution: The invention proposes a system that classifies a user's voice as a "first voice" or a "second voice," for example, based on proximity or whether it is a whisper. Based on this classification, the device generates a different output. When a "second voice" is detected, the device generates a "second output sentence" in which some information is omitted relative to the "first output sentence" generated for a "first voice" (Compl. ¶19; ’337 Patent, Abstract; col. 1:66-col. 2:11). A simplified sketch of this behavior appears after this list.
  • Technical Importance: This approach sought to balance user experience with situational awareness, allowing a device's response to be private or less intrusive in a shared environment without sacrificing intelligibility for the primary user (’337 Patent, col. 1:38-42).
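
To make the omission mechanism concrete, the following minimal Python sketch illustrates the behavior described above: the same underlying response is rendered in full for a "first voice" and with part of the information omitted for a "second voice." The function names, data structure, distance threshold, and near/far mapping are hypothetical illustrations drawn from the complaint's allegations, not the patent's embodiments or any accused implementation.

```python
from dataclasses import dataclass

# Hypothetical threshold; the patent describes classifying by, e.g., proximity
# or whether the voice is a whisper, without fixing a particular number.
NEAR_DISTANCE_METERS = 1.0

@dataclass
class Notification:
    subject: str   # e.g., "Tomorrow's 10 AM appointment"
    detail: str    # e.g., "is with Dr. Smith at the clinic"

def classify_voice(distance_m: float) -> str:
    """Illustrative classification: 'first' for a near user, 'second' otherwise."""
    return "first" if distance_m <= NEAR_DISTANCE_METERS else "second"

def output_sentence(note: Notification, voice_class: str) -> str:
    """Full sentence for a 'first voice'; part of the information is omitted
    when the voice was classified as a 'second voice'."""
    if voice_class == "first":
        return f"{note.subject} {note.detail}."
    return f"{note.subject}."  # detail omitted

if __name__ == "__main__":
    note = Notification("Tomorrow's 10 AM appointment", "is with Dr. Smith at the clinic")
    print(output_sentence(note, classify_voice(0.5)))  # near user: full sentence
    print(output_sentence(note, classify_voice(3.0)))  # distant user: detail omitted
```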

Key Claims at a Glance

  • The complaint asserts independent claim 4 (Compl. ¶44).
  • Essential elements of method claim 4 include (a simplified sketch of these steps follows the list):
    • calculating a distance between a user and a voice-content control device;
    • acquiring a voice spoken by a user;
    • analyzing the acquired voice to classify it as a "first voice" or a "second voice" based on the distance;
    • analyzing the acquired voice to execute processing intended by the user;
    • generating an output sentence based on the executed processing;
    • adjusting a sound volume of voice data from the output sentence, wherein the volume for a "first voice" sentence differs from the volume for a "second voice" sentence.
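
A minimal sketch of how these claimed steps could fit together appears below. It follows the claim's sequence (distance calculation, classification, processing, sentence generation, volume adjustment) at an illustrative level only; the function names, the distance threshold, and the volume values are assumptions, not a reconstruction of the patent's embodiments or of Google's products.

```python
def calculate_distance(sensor_reading_m: float) -> float:
    """Stand-in for the claimed distance calculation between user and device."""
    return sensor_reading_m  # assume the sensor already reports meters

def classify_voice(distance_m: float, near_threshold_m: float = 1.0) -> str:
    """Claimed classification of the acquired voice based on the distance."""
    return "first" if distance_m <= near_threshold_m else "second"

def execute_processing(utterance: str) -> dict:
    """Stand-in for analyzing the voice and executing the user's intended request."""
    return {"intent": "weather", "answer": "sunny with a high of 72"}

def generate_output_sentence(result: dict) -> str:
    """Claimed generation of an output sentence (text data) from the processing."""
    return f"Here is the forecast: {result['answer']}."

def adjust_volume(voice_class: str) -> float:
    """Claimed volume adjustment: a first-voice sentence is output at a volume
    that differs from a second-voice sentence (values here are arbitrary)."""
    return 0.4 if voice_class == "first" else 0.8

if __name__ == "__main__":
    distance = calculate_distance(2.5)
    voice_class = classify_voice(distance)
    sentence = generate_output_sentence(execute_processing("what's the weather"))
    print(sentence, f"(volume={adjust_volume(voice_class)})")
```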

U.S. Patent No. 11,244,675 - "Word Replacement in Output Generation for Detected Intent by Voice Classification," Issued February 8, 2022

The Invention Explained

  • Problem Addressed: The patent addresses the issue of a voice assistant's output being understood by people other than the user, which may be undesirable when the information is sensitive (’675 Patent, col. 1:28-37).
  • The Patented Solution: The invention describes classifying an acquired voice (e.g., based on distance via a proximity sensor) and detecting the user's intent. If the voice is classified as a "first voice," the system generates an output sentence where at least one word from the underlying notification information is "replaced with another word." This acts as a form of obfuscation for privacy. If the voice is determined to be a "second voice," a full, non-replaced output is generated (’675 Patent, Abstract; col. 22:20-28).
  • Technical Importance: This technology provides a method for enhancing user privacy in voice interactions by dynamically altering the content of spoken responses based on contextual cues like user proximity or identity, rather than just adjusting volume (’675 Patent, col. 1:33-37).

Key Claims at a Glance

  • The complaint asserts independent claim 6 (Compl. ¶57).
  • Essential elements of method claim 6 include (a simplified sketch of these steps follows the list):
    • acquiring a voice spoken by a user;
    • calculating a distance between the user and the device via a proximity sensor to classify the voice into a "first voice" or a "second voice";
    • analyzing the voice to detect "intention information";
    • acquiring "notification information" based on the intent;
    • generating a "first output sentence" when the voice is the "first voice," in which at least one word from the notification information is "replaced with another word";
    • generating a "second output sentence" when the voice is the "second voice," which includes all the information.

U.S. Patent No. 9,223,487 - "Electronic Apparatus, Method of Controlling the Same, and Related Computer Program," Issued December 29, 2015

Technology Synopsis

The patent addresses the need for "easy operation" of electronic devices with touch panels (Compl. ¶35; ’487 Patent, col. 5:20-21). It discloses a method for selecting objects on a screen by detecting two touch positions, determining that the distance between them decreases over time (a "pinch-in" gesture), and in response, defining a rectangular area between the initial touch points to select the object(s) within it (Compl. ¶34; ’487 Patent, Abstract).
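
As a rough illustration of the disclosed approach (not of claim 11's precise limitations, which also recite the vector and angle calculations discussed in Section IV), the sketch below tracks two touch positions over time, detects that their separation decreases, and forms the rectangle spanned by the initial touch points as the selection area. The data structures and sample coordinates are hypothetical.

```python
from dataclasses import dataclass
from math import hypot

@dataclass(frozen=True)
class Point:
    x: float
    y: float

def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two touch positions."""
    return hypot(a.x - b.x, a.y - b.y)

def is_pinch_in(a_start: Point, b_start: Point, a_now: Point, b_now: Point) -> bool:
    """True if the distance between the two touch positions decreases over time."""
    return distance(a_now, b_now) < distance(a_start, b_start)

def selection_rectangle(a_start: Point, b_start: Point) -> tuple:
    """Rectangle defined by the initial touch points: (left, top, right, bottom)."""
    return (min(a_start.x, b_start.x), min(a_start.y, b_start.y),
            max(a_start.x, b_start.x), max(a_start.y, b_start.y))

if __name__ == "__main__":
    # Two fingers start apart and move toward each other (pinch-in).
    a0, b0 = Point(100, 100), Point(400, 300)
    a1, b1 = Point(180, 150), Point(320, 260)
    if is_pinch_in(a0, b0, a1, b1):
        print("Pinch-in detected; selection area:", selection_rectangle(a0, b0))
```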

Asserted Claims

The complaint asserts independent method claim 11 (Compl. ¶68).

Accused Features

The complaint alleges that the "pinch with two fingers to zoom in and out" functionality on Google's touchscreen products, such as the Nest Hub and Pixel devices, infringes the ’487 patent (Compl. ¶¶ 71, 74-75).

III. The Accused Instrumentality

Product Identification

The complaint names a broad set of accused products and services, categorized as: (1) "Google Home Products" (e.g., Google Home, Nest Mini, Nest Hub); (2) "Google Nexus/Pixel Products" (all generations); and (3) "Google Assistant" software (Compl. ¶8).

Functionality and Market Context

  • The complaint alleges these products incorporate microphones and speakers for voice interaction, connect to Google servers for processing, and utilize various sensors and algorithms for contextual awareness (Compl. ¶¶ 45-47).
  • Specific functionalities cited as infringing the ’337 and ’675 patents include:
    • Proximity Detection: Using "ultrasound sensing," cameras ("Look and Talk"), and "neural network adaptive beamforming (NAB)" to determine a user's distance from the device (Compl. ¶46). The complaint references a Google research blog describing how "gray dots appear" on a Nest Hub Max display when a user looks at it from within 5 feet, indicating readiness (Compl. ¶46).
    • User Identification: Using "Voice Match" to identify individual users and provide personalized results (Compl. ¶¶ 48, 52).
    • Adaptive Volume: Using "Ambient IQ" to automatically adjust the volume of the Assistant's voice based on ambient noise, which the complaint links to adjusting volume based on interference signals that may correlate with distance (Compl. ¶¶ 51, 53).
  • For the ’487 patent, the accused functionality is the use of multi-touch gestures on touchscreen devices like the Nest Hub and Pixel Tablet, specifically the "pinch with two fingers to zoom in and out" gesture (Compl. ¶¶ 71, 74). The complaint references a YouTube video demonstrating this gesture on a Google device (Compl. ¶71).

IV. Analysis of Infringement Allegations

’337 Patent Infringement Allegations

Claim Element (from Independent Claim 4) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
calculating a distance between a user and a voice-content control device; | Google products allegedly use ultrasound sensing, camera sensing (e.g., "Look and Talk"), and neural network adaptive beamforming (NAB) to determine the distance of a user. | ¶46 | col. 17:40-44
analyzing the acquired voice to classify the acquired voice as either one of a first voice and a second voice based on the distance between the user and the voice-content control device; | Google products allegedly classify a voice as "near" (first voice) or "far" (second voice) based on the distance determined by the various sensing features. The complaint also links this to "Voice Match" for identifying specific users. | ¶48 | col. 17:45-50
generating, based on content of the executed processing, output sentence that is text data for a voice to be output to the user; | Google Assistant allegedly processes a user's voice input and generates a response, which is presented as a text string that can be converted to an audio signal via Text-To-Speech (TTS). | ¶50 | col. 19:26-34
at the generating, a first output sentence is generated...when the acquired voice has been classified as the first voice, and a second output sentence is generated...in which a part of information included in the first output sentence is omitted when the acquired voice has been classified as the second voice; | Google Assistant allegedly generates personalized (and thus different) visual and/or audible responses using "Voice Match" depending on whether the user is recognized. This is alleged to map to the first/second voice and first/second output sentence structure. | ¶52 | col. 20:1-11
at adjusting the sound volume of voice data, further adjusting the sound volume of voice data such that the sound volume of voice data obtained by converting the first output sentence thereinto differs from the sound volume of voice data obtained by converting the second output sentence thereinto. | Google's "Ambient IQ" feature allegedly adjusts volume based on ambient noise. The complaint also alleges that if the voice signal includes significant interference (implying a more distant user, or "second voice"), the device increases its response volume. | ¶53 | col. 20:36-44

’675 Patent Infringement Allegations

Claim Element (from Independent Claim 6) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
calculating a distance between the user and an output-content control device by a proximity sensor to classify the voice into either a first voice or a second voice based on the calculated distance; | Google products allegedly use ultrasound, camera sensing, and/or NAB as a "proximity sensor" to calculate distance and classify the user's voice as "near" (first voice) or "far." This is also linked to user identification via "Voice Match." | ¶60 | col. 20:56-62
analyzing the acquired voice to detect intention information indicating what kind of information is wished to be acquired by the user; | Google Assistant allegedly processes voice input to determine the target of the user's query, which is equated with "intention information." | ¶61 | col. 21:11-15
acquiring notification information which includes content information as a content of information to be notified to the user based on the intention information; | After analyzing the user's intent, Google Assistant allegedly obtains the relevant output (e.g., calendar information), which is presented as the "notification information." | ¶62 | col. 21:16-20
generating, when the voice is determined to be the first voice, a first output sentence in which at least one word selected among words included in the content information of the notification information is replaced with another word; | The complaint alleges that when "Voice Match" identifies a user ("first voice"), it generates personalized results (e.g., a specific calendar entry). This customization is alleged to be equivalent to replacing a word in the output to ensure personal results are not disclosed to others. | ¶63 | col. 22:10-18
generating, when the voice is determined to be the second voice, a second output sentence which includes all of the intention information and the content information. | The complaint alleges that for an unrecognized user or a user at a distance ("second voice"), more/additional, unfiltered, or non-customized information is displayed or transmitted. | ¶64 | col. 23:1-7

Identified Points of Contention

  • Scope Questions:
    • For the ’337 and ’675 patents, a central question will be whether the terms "first voice" and "second voice," which the patents primarily describe in terms of proximity or being a whisper, can be construed to cover the concept of a recognized vs. unrecognized user via a feature like "Voice Match."
    • For the ’675 patent, a central dispute will be over the meaning of "replaced with another word." The question is whether providing personalized content (e.g., "Your meeting is at 10 AM") instead of a generic or null response constitutes "replacement" as required by the claim, or whether the claim requires a more literal substitution within a sentence structure (e.g., changing "meeting" to "appointment").
  • Technical Questions:
    • What evidence supports the allegation that Google's "Ambient IQ" or other volume controls function to create a different volume level for a "second voice" (e.g., a distant user) versus a "first voice" (e.g., a near user), as required by claim 4 of the ’337 patent, rather than simply adjusting for ambient noise regardless of user classification?
    • For the ’487 patent, the infringement analysis will depend on whether the accused "pinch-to-zoom" gestures on Google devices meet the specific geometric and vector-based limitations recited in claim 11, including calculations of angles between vectors and a straight line connecting the initial touch points. The complaint asserts this mapping but does not provide detailed technical evidence (an illustrative sketch of the type of geometry at issue follows this list).
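
For context on the kind of geometry that question implicates, the sketch below computes, for each touch, a motion vector from its initial to its current position and the angle between that vector and the straight line connecting the two initial touch positions. It is only an illustration of the general calculation suggested by the claim language as characterized above, not a construction of claim 11 or a description of Google's implementation.

```python
from math import acos, degrees, hypot

def angle_between(v1: tuple, v2: tuple) -> float:
    """Angle in degrees between two 2-D vectors."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norms = hypot(*v1) * hypot(*v2)
    if norms == 0:
        return 0.0
    # Clamp to guard against floating-point drift outside [-1, 1].
    return degrees(acos(max(-1.0, min(1.0, dot / norms))))

def motion_angles(p1_start, p1_now, p2_start, p2_now):
    """Angles between each touch's motion vector and the straight line joining
    the two initial touch positions (illustrative only)."""
    baseline = (p2_start[0] - p1_start[0], p2_start[1] - p1_start[1])
    v1 = (p1_now[0] - p1_start[0], p1_now[1] - p1_start[1])
    v2 = (p2_now[0] - p2_start[0], p2_now[1] - p2_start[1])
    return angle_between(v1, baseline), angle_between(v2, baseline)

if __name__ == "__main__":
    # Fingers moving roughly toward each other along the baseline.
    print(motion_angles((100, 100), (150, 120), (400, 300), (350, 270)))
```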

V. Key Claim Terms for Construction

Term from ’337 and ’675 Patents: "first voice" / "second voice"

  • The Term: "first voice" and "second voice"
  • Context and Importance: The entire logic of claims in both the ’337 and ’675 patents depends on classifying a user's input into one of these two categories to trigger different device outputs. The dispute will likely center on whether Google's user-identification feature ("Voice Match") falls within the patents' definitions, which are primarily based on physical characteristics like distance or whisper.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The claims themselves tie the classification to "distance" (’337 Patent, col. 20:4; ’675 Patent, col. 22:12), which the complaint alleges Google's products measure. Practitioners may argue that distinguishing recognized from unrecognized users is a proxy for, or a type of, the distance-based or context-based classification contemplated by the patents.
    • Evidence for a Narrower Interpretation: The ’337 patent specification repeatedly uses a whisper as the primary example of a "second voice" (’337 Patent, col. 7:51-56). The ’675 patent ties the classification to a "proximity sensor" (’675 Patent, col. 20:58). This suggests the terms are meant to cover physical proximity or vocal characteristics, not abstract user identity.

Term from ’675 Patent: "replaced with another word"

  • The Term: "replaced with another word"
  • Context and Importance: This term is the core of the novel step in claim 6 of the ’675 patent. Infringement hinges on whether Google's delivery of personalized content is a "replacement." Practitioners may focus on this term because the patent's examples suggest a direct substitution for privacy, which may differ from how Google Assistant assembles responses.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The abstract frames the goal as generating output in which "at least one word...is replaced" when the voice is of the predetermined classification. A party could argue that generating a specific, personalized sentence in place of a generic one functionally achieves this replacement, even if it is not a simple find-and-replace operation on a template.
    • Evidence for a Narrower Interpretation: The patent provides a specific example where "meeting" is replaced with "hospital visit" and "dinner" is replaced with "meeting" (See ’675 Patent, Fig. 6). This intrinsic evidence strongly suggests a direct, one-for-one substitution of a specific word within a sentence to obfuscate meaning, rather than the generation of a different sentence wholesale based on user identity.

VI. Other Allegations

Willful Infringement

The complaint does not contain a separate count for willful infringement, but the prayer for relief requests "enhanced damages for willful infringement" (Compl. p. 34, ¶d). The complaint does not plead specific facts to support a claim of pre-suit knowledge of the patents.

VII. Analyst’s Conclusion: Key Questions for the Case

This case presents three distinct technological assertions. The outcome will likely depend on the court's interpretation of several key claim terms and the factual evidence presented to map the accused products to the patents. The central questions are:

  1. A question of definitional scope: Can the patent terms "first voice" and "second voice," described in the patents with examples like physical proximity and whispering, be construed to encompass the software-based concept of a recognized versus an unrecognized user as implemented in Google's "Voice Match" feature?
  2. A question of functional interpretation: Does Google Assistant's process of generating a personalized response for a recognized user constitute a "replacement" of a word as required by the ’675 patent, or does the patent’s text and examples demand a more literal, one-for-one word substitution for privacy obfuscation?
  3. An evidentiary question on technical mechanics: For the ’487 patent, can the plaintiff provide sufficient technical evidence to demonstrate that Google's "pinch-to-zoom" gesture, as implemented, satisfies all the specific geometric limitations of claim 11, including the derivation of vectors and the calculation of angles relative to a prescribed line?