DCT

2:24-cv-00181

Cerence Operating Co v. Samsung Electronics Co Ltd

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 2:24-cv-00181, E.D. Tex., 03/15/2024
  • Venue Allegations: Plaintiff alleges venue is proper because Samsung Electronics Co., Ltd. is a foreign corporation, and Samsung Electronics America, Inc. commits acts of infringement in the district and maintains a regular and established place of business there.
  • Core Dispute: Plaintiff alleges that Defendant’s virtual assistant and handwriting recognition technologies, as implemented in products like the Samsung Galaxy Note 10+, infringe four patents related to generating synthetic speech with contrastive stress and presenting recognized handwritten symbols.
  • Technical Context: The technologies at issue concern fundamental aspects of human-computer interaction on mobile devices, including the naturalness of synthesized speech and the user-friendliness of handwriting-to-text conversion.
  • Key Procedural History: The complaint alleges that in June 2021, Plaintiff contacted Defendant regarding a potential license to its mobile text entry patents, specifically mentioning U.S. Patent No. 7,680,334 and providing access to a data room with technical information. These pre-suit communications are cited as a basis for willfulness allegations.

Case Timeline

Date | Event
2002-08-16 | ’334 Patent Priority Date
2010-02-12 | ’671, ’486, ’291 Patents Priority Date
2010-03-16 | ’334 Patent Issue Date
2014-03-25 | ’671 Patent Issue Date
2014-09-02 | ’486 Patent Issue Date
2014-12-16 | ’291 Patent Issue Date
2021-06 | Plaintiff allegedly contacted Defendant regarding licensing
2024-03-15 | Complaint Filing Date

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 8,682,671 - Method And Apparatus For Generating Synthetic Speech With Contrastive Stress, issued March 25, 2014

The Invention Explained

  • Problem Addressed: The patent's background section describes conventional text-to-speech (TTS) synthesis as often producing output that is "relatively monotone and flat, lacking the naturalness and emotional expressiveness of the naturally produced human speech" (’671 Patent, col. 8:11-15). It specifically identifies the difficulty of generating "contrastive stress," an emphasis pattern used by human speakers to contrast words or ideas (e.g., "government of the people, by the people, for the people") (’671 Patent, col. 8:17-26).
  • The Patented Solution: The invention provides a method for a speech synthesis system to automatically apply contrastive stress. The system receives multiple text strings, identifies the specific portions that differ between them, and assigns contrastive stress to only those differing portions when generating the audio output (’671 Patent, Abstract). This process is designed to highlight key differences for the listener, making the synthesized speech more natural and intelligible (’671 Patent, col. 8:32-37).
  • Technical Importance: This approach aimed to improve the user experience of automated information systems by making machine-generated speech sound more human and better at conveying nuanced meaning.

Key Claims at a Glance

  • Independent claim 1 is asserted (Compl. ¶13).
  • Essential elements of claim 1 include:
    • Receiving input comprising a plurality of text strings.
    • Identifying a first portion of a first text string as differing from a corresponding first portion of a second text string.
    • Identifying a second portion of the first text string as not differing from a corresponding second portion of the second text string.
    • Assigning contrastive stress to the identified first portion of the first text string, but not to the identified second portion.
    • Generating a speech synthesis output that renders the text strings with the assigned contrastive stress.
    • Providing the speech synthesis output for the speech-enabled application.
  • The complaint does not explicitly reserve the right to assert dependent claims.
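
For illustration only, the claim elements listed above can be pictured with a minimal Python sketch. It is not taken from the ’671 Patent or from the accused Bixby product. The function names `assign_contrastive_stress` and `render`, the asterisk markup standing in for stress, and the sample sentences are all hypothetical, and the word-level comparison via the standard library's `difflib` is an assumed technique rather than the claimed one.

```python
from difflib import SequenceMatcher


def assign_contrastive_stress(first: str, second: str) -> list[tuple[str, bool]]:
    """Return (word, stressed) pairs for `first`.

    Words of `first` that differ from the corresponding words of `second`
    are marked stressed; words common to both strings are left unstressed.
    Hypothetical helper for illustration only.
    """
    a, b = first.split(), second.split()
    marked: list[tuple[str, bool]] = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "equal" spans are the non-differing portion; every other span differs.
        # For "insert" spans i1 == i2, so no words of `first` are added.
        stressed = tag != "equal"
        marked.extend((word, stressed) for word in a[i1:i2])
    return marked


def render(marked: list[tuple[str, bool]]) -> str:
    """Render stressed words between asterisks (a stand-in for prosody markup)."""
    return " ".join(f"*{word}*" if stressed else word for word, stressed in marked)


# A plurality of text strings, as a speech-enabled application might supply them.
strings = [
    "the flight departs at ten forty five",
    "the flight departs at eleven forty five",
]
print(render(assign_contrastive_stress(strings[0], strings[1])))
print(render(assign_contrastive_stress(strings[1], strings[0])))
```

In this sketch only "ten" and "eleven" are marked. Whether the accused system performs any comparable identification step is the factual question the complaint leaves open.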

U.S. Patent No. 8,825,486 - Method And Apparatus For Generating Synthetic Speech With Contrastive Stress, issued September 2, 2014

The Invention Explained

  • Problem Addressed: Similar to the ’671 Patent, this patent addresses the unnatural, "monotone and flat" quality of conventional TTS systems, which fail to replicate the human use of contrastive stress to emphasize differences and focus a listener's attention (’486 Patent, col. 8:11-32).
  • The Patented Solution: The invention describes a speech-enabled application that generates a text input for a speech synthesis engine. The engine, in turn, produces an audio output where at least one portion carries contrastive stress to contrast with another portion (’486 Patent, Abstract). The system can analyze the text to automatically determine where to apply this stress, for instance by identifying tokens that are alternatives to each other (e.g., a list of departure times) (’486 Patent, col. 9:1-15; Fig. 3B).
  • Technical Importance: This technology sought to simplify the development of applications requiring natural-sounding speech by enabling developers to use plain text inputs while the synthesis system handles the complex task of applying appropriate prosody.

Key Claims at a Glance

  • Independent claim 1 is asserted (Compl. ¶22).
  • Essential elements of claim 1 include:
    • Generating a text input comprising a text transcription of a desired speech output.
    • Inputting the text input to at least one speech synthesis engine.
    • Receiving an audio speech output from the engine corresponding to the text input.
    • The audio speech output comprises at least one portion carrying contrastive stress to contrast with at least one other portion of the audio speech output.
    • Providing the audio speech output to a user of the speech-enabled application.
  • The complaint does not explicitly reserve the right to assert dependent claims.

Multi-Patent Capsule: U.S. Patent No. 8,914,291

  • Patent Identification: U.S. Patent No. 8,914,291, Method And Apparatus For Generating Synthetic Speech With Contrastive Stress, issued December 16, 2014.
  • Technology Synopsis: Belonging to the same family as the ’671 and ’486 patents, this patent discloses methods for improving automated speech synthesis. It focuses on systems that analyze text to identify contrasting information and then generate an audio output that uses contrastive stress to emphasize those differences, making the speech more natural and easier to understand.
  • Asserted Claims: Claim 1 is asserted (Compl. ¶31).
  • Accused Features: The complaint accuses Defendant's Bixby virtual assistant, which allegedly "produces text-to-speech output with contrastive stress in response to user queries" (Compl. ¶30).

Multi-Patent Capsule: U.S. Patent No. 7,680,334

  • Patent Identification: U.S. Patent No. 7,680,334, Presenting Recognised Handwritten Symbols, issued March 16, 2010.
  • Technology Synopsis: This patent addresses a problem in handwriting recognition where users receive poor feedback when a character is misinterpreted (’334 Patent, col. 1:41-46). The invention proposes a method where, after recognizing a handwritten pattern, the system presents the user with an image of the best-matching template from its database, rather than a generic, standardized font character. This visual feedback is intended to show the user how the system interpreted their input, allowing them to adjust their handwriting for improved future recognition (’334 Patent, col. 2:10-14, 26-30).
  • Asserted Claims: Claim 1 is asserted (Compl. ¶40).
  • Accused Features: The complaint accuses Defendant's products, including the Samsung Galaxy Note 10+, of infringing by "detecting and recognizing handwritten symbols, and presenting the best interpretation of those handwritten symbols" (Compl. ¶39).
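
The distinction drawn in the Technology Synopsis, between showing the matched template and showing a generic font character, can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the 5x5 character grids, the `TEMPLATES` table, the pixel-mismatch `distance` function, and the `best_template` helper do not come from the ’334 Patent or from any Samsung product.

```python
Grid = tuple[str, ...]  # rows of '#' (ink) and '.' (blank); hypothetical format

# Hypothetical template database: several stored writing styles per symbol.
TEMPLATES: dict[str, list[Grid]] = {
    "1": [
        ("..#..", ".##..", "..#..", "..#..", ".###."),  # serifed style
        ("..#..", "..#..", "..#..", "..#..", "..#.."),  # plain vertical bar
    ],
    "7": [
        ("#####", "....#", "...#.", "..#..", "..#.."),
    ],
}


def distance(a: Grid, b: Grid) -> int:
    """Count the cells in which two equally sized grids disagree."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def best_template(pattern: Grid) -> tuple[str, Grid]:
    """Return (symbol, template) for the stored template closest to `pattern`."""
    candidates = ((label, t) for label, ts in TEMPLATES.items() for t in ts)
    return min(candidates, key=lambda item: distance(pattern, item[1]))


user_pattern: Grid = ("..#..", "..#..", "..#..", "..#..", ".##..")
label, template = best_template(user_pattern)

# Feedback option A: a generic font character for the recognized symbol.
print("font character:", label)
# Feedback option B: the image of the specific template that was matched.
print("matched template:")
print("\n".join(template))
```

Option B corresponds to the feedback the synopsis attributes to the invention. Option A corresponds to the standard-font display that the scope dispute over the ’334 patent concerns.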

III. The Accused Instrumentality

Product Identification

  • The complaint identifies the "Bixby virtual assistant" and products that perform "methods of detecting and recognizing handwritten symbols," using the "Samsung Galaxy Note 10+" as an exemplary device embodying these functionalities (collectively, the "Accused Products") (Compl. ¶¶ 12, 39).

Functionality and Market Context

  • The accused functionality for the contrastive stress patents involves Bixby's ability to convert speech to text and produce "text-to-speech output with contrastive stress in response to user queries" (Compl. ¶¶ 12, 21, 30).
  • The accused functionality for the handwriting recognition patent involves systems that "detect[] and recogniz[e] handwritten symbols, and present[] the best interpretation of those handwritten symbols" (Compl. ¶39).
  • The complaint does not provide sufficient detail for analysis of the products' market context beyond identifying them as consumer electronics products distributed in the United States (Compl. ¶6).

IV. Analysis of Infringement Allegations

’671 Patent Infringement Allegations

Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
receiving, from the speech-enabled application, input comprising a plurality of text strings... | The complaint alleges Bixby converts "input speech to text" and responds to "user queries." | ¶12 | col. 5:1-4
identifying a first portion of a first text string...as differing from a corresponding first portion of a second text string... | The complaint does not provide specific facts on how the accused products identify differing portions of text. | ¶13 | col. 32:41-48
...and a second portion of the first text string as not differing from a corresponding second portion of the second text string; | The complaint does not provide specific facts on how the accused products identify non-differing portions of text. | ¶13 | col. 32:48-52
assigning contrastive stress to the identified first portion of the first text string, but not to the identified second portion... | The complaint alleges Bixby produces "text-to-speech output with contrastive stress." | ¶12 | col. 32:53-57
generating...speech synthesis output to render the plurality of text strings as speech having the assigned contrastive stress... | The complaint alleges Bixby produces "text-to-speech output with contrastive stress." | ¶12 | col. 32:58-61

’486 Patent Infringement Allegations

Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
generating...a text input comprising a text transcription of a desired speech output; | The complaint alleges Bixby "converts input speech to text" in response to user queries. | ¶21 | col. 31:15-18
inputting the text input to at least one speech synthesis engine; | The complaint alleges Bixby utilizes a speech-enabled application that produces text-to-speech output. | ¶21 | col. 31:18-20
receiving from the...engine an audio speech output...comprising at least one portion carrying contrastive stress to contrast with at least one other portion... | The complaint alleges Bixby "produces text-to-speech output with contrastive stress in response to user queries." | ¶21 | col. 31:21-27
providing the audio speech output to at least one user of the speech-enabled application. | The complaint alleges Bixby produces text-to-speech output in response to user queries. | ¶21 | col. 31:28-30

No probative visual evidence provided in complaint.

Identified Points of Contention

  • Scope Questions: A central question for the ’334 patent will be whether "presenting the best interpretation" requires displaying the specific graphical pattern of the matched template, as the patent describes, or whether it can be read to cover the display of a standard font character that semantically corresponds to the interpretation.
  • Technical Questions: For the contrastive stress patents (’671, ’486, ’291), the analysis may focus on whether the accused Bixby system performs the specific claim steps of programmatically "identifying" differing and non-differing portions of text strings and "assigning" stress accordingly. The dispute may turn on evidence of how Bixby's speech synthesis engine actually processes text and generates prosody, versus merely producing speech that a listener might perceive as having some form of emphasis.

V. Key Claim Terms for Construction

The Term: "contrastive stress" (asserted claims of ’671, ’486, ’291 patents)

Context and Importance

  • This term is the core technical concept of three of the four asserted patents. Its construction will determine whether the type of emphasis allegedly used by the Bixby assistant falls within the scope of the claims. Practitioners may focus on whether the term requires a specific, algorithmically determined emphasis pattern applied to contrast two explicit elements, or whether it can cover more general prosodic variations.

Intrinsic Evidence for Interpretation

  • Evidence for a Broader Interpretation: The specification states that contrastive stress is an "important tool in human understanding of meaning as conveyed by spoken language" (’671 Patent, col. 8:32-34), which may support a construction based on the functional effect on a listener.
  • Evidence for a Narrower Interpretation: The specification defines it as "a particular emphasis pattern... applied in speech to words or syllables that are meant to contrast with each other" and provides a specific example: "government of the people, by the people, for the people" (’671 Patent, col. 8:17-26). This suggests a narrower meaning tied to explicit contrasting pairs or sets of words.

The Term: "identifying a first portion... as differing from a corresponding first portion" (asserted claim 1 of ’671 patent)

Context and Importance

  • This active step is a key limitation in the method claim of the ’671 patent. The case may depend on whether Defendant's system is shown to perform an explicit comparison and identification process on text strings, or whether it generates emphasis through other means.

Intrinsic Evidence for Interpretation

  • Evidence for a Broader Interpretation: The patent's flowchart (Fig. 4) shows a high-level block for "Identify Differing Portion(s) In Token(s) To Be Stressed" without specifying the mechanism, which could support a functional interpretation covering any method that achieves this result (’671 Patent, Fig. 4, block 470).
  • Evidence for a Narrower Interpretation: The detailed description explains this step in the context of analyzing fields tagged as "contrastive" and comparing their normalized orthography to find what differs (e.g., comparing "ten_forty_five_a_m" to "eleven_forty_five_a_m" to identify "ten" and "eleven" as the differing portions) (’671 Patent, col. 27:28-40). This may support a construction requiring a more direct, text-based comparison algorithm.
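
The direct, text-based comparison contemplated by the narrower reading can be pictured with a minimal Python sketch that uses the two normalized strings quoted above. The helper name `differing_tokens`, the underscore tokenization, and the position-by-position comparison are assumptions made for illustration and are not represented to be the patent's disclosed algorithm.

```python
def differing_tokens(first: str, second: str) -> tuple[list[str], list[str]]:
    """Compare two underscore-delimited strings token by token.

    Returns the tokens of `first` and of `second` that differ at the same
    position. Hypothetical helper; assumes both strings have the same
    number of tokens.
    """
    a, b = first.split("_"), second.split("_")
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal token counts")
    pairs = [(x, y) for x, y in zip(a, b) if x != y]
    return [x for x, _ in pairs], [y for _, y in pairs]


print(differing_tokens("ten_forty_five_a_m", "eleven_forty_five_a_m"))
# (['ten'], ['eleven'])
```

A system that reaches similar-sounding output without any step resembling this comparison is the scenario that the broader, functional reading would need to capture.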

VI. Other Allegations

Indirect Infringement

  • The complaint alleges induced infringement for all four patents, asserting that Defendant "actively encourage[s] and instruct[s] customers" to use the Accused Products in ways that directly infringe (e.g., Compl. ¶¶ 14, 23, 32, 41). The complaint does not plead specific facts, such as references to user manuals or marketing materials, to support this allegation.

Willful Infringement

  • Willfulness is alleged for all four patents. The complaint alleges pre-suit knowledge of the ’334 patent based on "discussions beginning in 2021" where that patent was allegedly identified (Compl. ¶41). For the ’671, ’486, and ’291 patents, the complaint alleges knowledge arising "[t]hrough at least the filing and service of this Complaint," suggesting a basis for post-suit willfulness (e.g., Compl. ¶¶ 14, 23, 32).

VII. Analyst’s Conclusion: Key Questions for the Case

  • A core issue will be one of functional operation: Does the accused Bixby assistant's speech synthesis engine perform the specific, multi-step process recited in the contrastive stress patent claims—including the explicit identification of differing textual portions and targeted assignment of stress—or does it use a more general prosodic model that produces an output that may be perceived as emphatic but does not map directly to the claimed method?
  • A second key issue will be one of definitional scope for the handwriting patent: Can the claim term "presenting the pattern of the best template" be construed to cover the display of a standard, system-font character, or does it narrowly require the display of the specific graphical image of the internal template that the system matched to the user's input?
  • A third pivotal question will concern willfulness: Did the alleged 2021 licensing discussions provide Defendant with pre-suit notice of infringement for the three contrastive stress patents, or was notice limited to the ’334 patent, potentially confining enhanced damages for the other patents to post-filing conduct?