DCT

6:23-cv-00576

StreamScale Inc v. Cloudera Inc

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 6:23-cv-00576, W.D. Tex., 08/09/2023
  • Venue Allegations: Plaintiff alleges venue is proper in the Western District of Texas because Defendant Cloudera maintains a regular and established place of business within the district, specifically an office in Austin where it employs numerous individuals.
  • Core Dispute: Plaintiff alleges that Defendant's Cloudera Data Platform, when used with the Intel Intelligent Storage Acceleration Library as instructed by Defendant, indirectly infringes five patents related to accelerated erasure coding technology for data storage.
  • Technical Context: Erasure coding is a data protection method that provides fault tolerance with significantly less storage overhead than traditional data replication, making it a critical technology for large-scale cloud and big data storage systems.
  • Key Procedural History: The complaint alleges that Plaintiff previously sued Defendant for infringement of the same patents-in-suit by an older product (Cloudera Data Hub) in a case filed March 2, 2021. In that prior litigation, Plaintiff allegedly served Defendant with Preliminary and Final Infringement Contentions and an expert report detailing the infringement, which Plaintiff asserts establishes Defendant's knowledge for the current allegations of willful and indirect infringement.

Case Timeline

Date        Event
2011-12-30  Earliest Priority Date for all Patents-in-Suit
2013-07-05  StreamScale sends letter to USENIX regarding pending patent applications
2014-03-25  U.S. Patent No. 8,683,296 Issues
2015-03-20  Cloudera employee allegedly informed of StreamScale's infringement claim
2015-10-13  U.S. Patent No. 9,160,374 Issues
2016-07-05  U.S. Patent No. 9,385,759 Issues
2019-05-14  U.S. Patent No. 10,291,259 Issues
2020-05-26  U.S. Patent No. 10,666,296 Issues
2021-03-02  Prior lawsuit (2021 Lawsuit) filed against Cloudera
2021-07-15  Preliminary Infringement Contentions served in 2021 Lawsuit
2022-05-20  Final Infringement Contentions served in 2021 Lawsuit
2023-02-27  Expert Report on Infringement served in 2021 Lawsuit
2023-08-09  Complaint Filing Date

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 8,683,296 - “Accelerated Erasure Coding System and Method”

The Invention Explained

  • Problem Addressed: The patent’s background section describes prior art data protection methods like simple replication as highly inefficient and commercially impracticable due to the large storage overhead required (Compl. ¶4). It also notes that existing erasure coding systems, while more efficient, were considered computationally "brutally expensive" and too slow for practical use, particularly in systems with more than two redundant "check drives" (’296 Patent, col. 2:20-28).
  • The Patented Solution: The invention claims to solve this performance problem through a system and method that uses a "parallel multiplier" to accelerate the complex Galois Field arithmetic required for erasure coding. This approach is designed to process multiple data entries concurrently and to order operations via a "sequencer" in a way that minimizes memory accesses, thereby enabling high-speed performance even in large-scale systems (’296 Patent, Abstract; col. 4:8-14).
  • Technical Importance: This patented approach claims to make erasure coding practical for systems with many check drives, which can increase data reliability while simultaneously lowering costs by enabling the use of larger, more economical groups of data drives (’296 Patent, col. 3:1-12).
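For illustration only, the operation that the term "parallel multiplier" refers to (multiplying many data entries by a single factor in a Galois Field) can be sketched in Python. The field GF(2^8), the reduction polynomial 0x11D, and every function name below are assumptions chosen for the sketch; none is drawn from the ’296 Patent or the complaint, and the code runs sequentially rather than in parallel.

```python
# Illustrative sketch only. GF(2^8) with reduction polynomial 0x11D is an
# assumption; the summary does not state which field or polynomial is used.
POLY = 0x11D


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements using shift-and-XOR arithmetic."""
    product = 0
    while b:
        if b & 1:
            product ^= a   # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= POLY      # reduce the result back into 8 bits
        b >>= 1
    return product


def mul_block_by_factor(block: bytes, factor: int) -> bytes:
    """Multiply every byte of a block by one factor (hypothetical helper)."""
    table = bytes(gf_mul(x, factor) for x in range(256))
    return block.translate(table)  # one table lookup per data byte


def encode(data_rows: list[bytes], encoding_matrix: list[list[int]]) -> list[bytes]:
    """Compute check rows: check[i] = XOR over j of matrix[i][j] * data[j]."""
    width = len(data_rows[0])
    checks = []
    for factors in encoding_matrix:
        acc = bytes(width)
        for factor, row in zip(factors, data_rows):
            scaled = mul_block_by_factor(row, factor)
            acc = bytes(x ^ y for x, y in zip(acc, scaled))
        checks.append(acc)
    return checks
```

With an encoding row of all ones, `encode` reduces to plain XOR parity; other factors give additional, independent check rows, which is what allows more than two check drives.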

Key Claims at a Glance

  • The complaint asserts independent claims 1 and 34, as well as dependent claims 2-4 and 35-36 (Compl. ¶45).
  • Independent Claim 1 (a system) includes the following essential elements:
    • A processing core and a non-volatile storage medium.
    • An erasure coding system implemented thereon, comprising a data matrix, a check matrix, and an encoding matrix held in main memory.
    • A thread for execution on the processing core that includes:
      • A "parallel multiplier" for concurrently multiplying multiple data entries by a single factor.
      • A "first sequencer" for ordering operations using the parallel multiplier to generate the check data.
  • Independent Claim 34 (a method embodied on a storage medium) includes the following essential steps:
    • Arranging original data as a data matrix in main memory.
    • Arranging factors as an encoding matrix in main memory.
    • Generating the check data using a "parallel multiplier" for concurrently multiplying multiple data entries, where the generation step comprises ordering operations via a sequencer.

U.S. Patent No. 9,160,374 - “Accelerated Erasure Coding System and Method”

The Invention Explained

  • Problem Addressed: As a continuation of the application leading to the ’296 Patent, the ’374 Patent addresses the same underlying problem of slow and computationally expensive erasure coding in the prior art (’374 Patent, col. 2:20-50).
  • The Patented Solution: This patent also discloses a system using a "parallel multiplier," but its claims focus more specifically on the process of reconstructing lost data after a drive failure. The invention describes using a "solution matrix" to decode the surviving check data back into the lost original data, with a "second sequencer" ordering the reconstruction operations to maintain high performance (’374 Patent, Abstract; col. 4:32-51).
  • Technical Importance: The invention provides a method for efficiently recovering from data loss in a large-scale erasure-coded system, a critical function for ensuring data availability and fault tolerance (’374 Patent, col. 3:45-54).
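As a rough illustration of the reconstruction step, the sketch below inverts the rows of a generator matrix that correspond to surviving drives and applies the result to the surviving values. This is generic Reed-Solomon-style decoding; treating the inverted matrix as the "solution matrix" is an assumption, as are the field GF(2^8), the polynomial 0x11D, and all function names. The summary does not say how the ’374 Patent computes its decoding factors.

```python
# Illustrative sketch only; field, polynomial, and names are assumptions.
POLY = 0x11D


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements using shift-and-XOR arithmetic."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return product


def gf_inv(a: int) -> int:
    """Multiplicative inverse: a^254 equals a^-1 for nonzero a in GF(2^8)."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def invert(matrix: list[list[int]]) -> list[list[int]]:
    """Gauss-Jordan inversion over GF(2^8); raises StopIteration if singular."""
    n = len(matrix)
    aug = [list(row) + [int(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = gf_inv(aug[col][col])
        aug[col] = [gf_mul(scale, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [v ^ gf_mul(f, p) for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mat_vec(matrix: list[list[int]], vec: list[int]) -> list[int]:
    """Matrix-vector product over GF(2^8)."""
    out = []
    for row in matrix:
        acc = 0
        for factor, value in zip(row, vec):
            acc ^= gf_mul(factor, value)
        out.append(acc)
    return out


def reconstruct(surviving_rows: list[list[int]], surviving_values: list[int]) -> list[int]:
    """Recover the original data from any set of surviving rows that is invertible."""
    solution_matrix = invert(surviving_rows)  # hypothetical stand-in
    return mat_vec(solution_matrix, surviving_values)
```

In this sketch, a system with two data drives and two check drives has generator rows `[1, 0]`, `[0, 1]`, `[1, 1]`, `[1, 2]`; any two surviving rows form an invertible matrix, so any two drive failures can be undone.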

Key Claims at a Glance

  • The complaint asserts independent claims 1 and 5, as well as dependent claim 6 (Compl. ¶63).
  • Independent Claim 1 (a system for reconstruction) includes the following essential elements:
    • A processing core and non-volatile storage medium.
    • An erasure coding system configured with a "surviving data matrix", a "lost data matrix", and a "solution matrix" for holding decoding factors.
    • A thread for execution including:
      • A "parallel multiplier".
      • A "second sequencer" for ordering operations using the parallel multiplier to reconstruct the lost original data.
  • Independent Claim 5 (a system with specific hardware features) includes the following essential elements:
    • A non-volatile storage medium and a processing core.
    • The processing core comprises at least 16 data registers, each with at least 16 bytes.
    • A computer program implementing the erasure coding system.
    • The parallel multiplier is configured to process data in units of at least 64 bytes spread over at least four of the data registers.

U.S. Patent No. 9,385,759 - “Accelerated Erasure Coding System and Method”

Technology Synopsis

This patent, part of the same family, also describes systems and methods for accelerated erasure coding. The claims are directed to both the encoding process (generating check data) and the decoding process (reconstructing lost data), with limitations focusing on the interaction between a parallel multiplier, sequencers, and specific data structures like data, check, encoding, and solution matrices (’759 Patent, Abstract).

Asserted Claims

Independent claims 1 and 5, and dependent claims 2-4 and 6-7 (Compl. ¶81).

Accused Features

The complaint alleges that the combination of Cloudera Data Platform and the Intel ISA-L library performs the patented accelerated erasure coding methods (Compl. ¶23).

U.S. Patent No. 10,291,259 - “Accelerated Erasure Coding System and Method”

Technology Synopsis

This patent describes an accelerated erasure coding system with claims that specify the use of a single-instruction-multiple-data (SIMD) CPU core and associated vector registers. It further claims a scheduler for concurrently executing data generation and I/O operations on separate CPU cores to enhance parallelism and performance (’259 Patent, Abstract; Claim 2).

Asserted Claims

Independent claims 12 and 19, and dependent claims 13-16 (Compl. ¶99).

Accused Features

The complaint alleges that Cloudera’s customers use the accused products on modern multi-core, SIMD-capable processors, thereby practicing the patented methods for parallelized erasure coding (Compl. ¶23).

U.S. Patent No. 10,666,296 - “Accelerated Erasure Coding System and Method”

Technology Synopsis

This patent claims a system for accelerated erasure coding with specific architectural details, including a plurality of data drives and more than two check drives. The claims focus on a multi-core architecture where a scheduler assigns data processing tasks and I/O tasks to different threads and CPU cores to run concurrently, thereby parallelizing the workload (U.S. Patent No. 10,666,296, Abstract; Claim 2).

Asserted Claims

Independent claims 1 and 5, and dependent claims 2-4 and 6-8 (Compl. ¶117).

Accused Features

The complaint alleges that the use of Cloudera Data Platform with the Intel ISA-L library on modern hardware constitutes the claimed parallelized erasure coding system (Compl. ¶23).

III. The Accused Instrumentality

Product Identification

The accused instrumentality is the combination of Defendant's Cloudera Data Platform (“CDP”) with the Intel Intelligent Storage Acceleration Library (“ISA-L”) (Compl. ¶23). The complaint refers to these collectively as the "Cloudera Infringing Products and Services" or "EC Systems" (Compl. ¶¶23, 25).

Functionality and Market Context

  • The complaint alleges that Cloudera does not bundle ISA-L with CDP but instead "specifically instructs its customers to download ISA-L for use with CDP" to implement accelerated erasure coding (EC) technology (Compl. ¶27).
  • Plaintiff cites Defendant's online documentation, which allegedly provides commands "to install the isa-l library" and to verify that it is working, as evidence of these instructions (Compl. ¶47).
  • The complaint alleges that these "EC Systems" provide significant commercial benefits, reducing storage costs by at least 50% compared to traditional triple replication methods while maintaining data reliability (Compl. ¶26).
  • No probative visual evidence provided in complaint.
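The arithmetic behind a storage saving of this size can be sketched as follows. The layout of six data units plus three check units is a hypothetical example chosen for illustration; the complaint, as summarized here, does not specify the configuration behind its "at least 50%" figure.

```python
# Illustrative arithmetic only; the 6+3 layout is an assumed example.

def raw_per_user_byte_replication(copies: int) -> float:
    """Raw bytes stored per byte of user data under n-way replication."""
    return float(copies)


def raw_per_user_byte_erasure(data_units: int, check_units: int) -> float:
    """Raw bytes stored per byte of user data under erasure coding."""
    return (data_units + check_units) / data_units


replication = raw_per_user_byte_replication(3)   # triple replication
erasure = raw_per_user_byte_erasure(6, 3)        # hypothetical 6 data + 3 check
saving = 1 - erasure / replication               # fraction of raw storage saved
```

Under these assumptions, triple replication stores 3.0 raw bytes per user byte and the 6+3 layout stores 1.5, which is a 50% reduction while still tolerating the loss of any three units.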

IV. Analysis of Infringement Allegations

The complaint alleges that customers of Cloudera directly infringe the asserted patents by using the Cloudera Data Platform in combination with the Intel ISA-L library, as instructed by Cloudera (Compl. ¶¶45, 63). The complaint does not, however, provide a detailed, element-by-element mapping of how the specific features of the accused system meet the limitations of the asserted claims. The general infringement theory appears to be that the combination of CDP and ISA-L performs the functions of the claimed "system for accelerated error-correcting code," with the ISA-L library providing the functionality of the claimed "parallel multiplier" and the overall system logic providing the "sequencer" functions to generate and reconstruct erasure-coded data (Compl. ¶¶23, 45-47).

Identified Points of Contention

  • Technical Questions: The complaint's lack of specific technical mappings raises the question of what evidence will be presented to demonstrate that the accused software combination performs the exact functions required by the claims. For instance, for the '296 Patent, it is unclear what specific operations within ISA-L allegedly constitute the claimed "parallel multiplier for concurrently multiplying multiple data entries of a matrix by a single factor."
  • Scope Questions: A central dispute may concern whether the functions performed by the general-purpose ISA-L library, when used with CDP, fall within the scope of the patent claims. For example, regarding claim 1 of the ’374 Patent, a question may arise as to whether the accused system’s data recovery logic constitutes the claimed "second sequencer for ordering operations through the surviving data matrix, the encoding matrix, the check matrix, and the solution matrix."

V. Key Claim Terms for Construction

The Term: "parallel multiplier" ('296 Patent, Claim 1)

Context and Importance

This term is the central technical component of the claimed invention, purported to deliver the novel speed advantage over the prior art. The definition of this term will be critical to determining whether the accused ISA-L library's functionality infringes, as the complaint's theory relies on this library performing the claimed function.

Intrinsic Evidence for Interpretation

  • Evidence for a Broader Interpretation: The claims and specification describe the term's function as "concurrently multiplying multiple data entries of a matrix by a single factor" ('296 Patent, col. 4:10-12). Plaintiff may argue this language covers any software implementation that achieves this parallel mathematical result, regardless of the specific underlying code.
  • Evidence for a Narrower Interpretation: The specification discloses a specific embodiment of the parallel multiplier that uses "two lookup tables" and the "PSHUFB (Packed Shuffle Bytes) instruction" common to certain SIMD processors ('296 Patent, col. 5:8-12). Defendant may argue that the term should be construed more narrowly to be limited to this or similar SIMD-based implementations, potentially excluding other methods of parallel computation.
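To make the narrower reading concrete, the following Python sketch emulates a two-lookup-table multiply of the kind that a byte-shuffle instruction such as PSHUFB makes possible. It is a well-known general technique, shown here under assumptions: the field GF(2^8), the polynomial 0x11D, and all names are hypothetical, and nothing in this summary establishes that the patent's embodiment or ISA-L works exactly this way.

```python
# Illustrative sketch only; field, polynomial, and names are assumptions.
POLY = 0x11D


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements using shift-and-XOR arithmetic."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return product


def build_nibble_tables(factor: int) -> tuple[bytes, bytes]:
    """Two 16-entry tables: products of the factor with each low and high nibble."""
    low = bytes(gf_mul(factor, n) for n in range(16))
    high = bytes(gf_mul(factor, n << 4) for n in range(16))
    return low, high


def shuffle(table: bytes, indices: bytes) -> bytes:
    """Emulate a byte shuffle: lane k of the result is table[indices[k]].

    Simplification: indices are assumed to lie in 0..15, so the hardware
    rule that zeroes a lane when its index has the top bit set never applies.
    """
    return bytes(table[i] for i in indices)


def mul_lanes(lanes: bytes, factor: int) -> bytes:
    """Multiply every lane by one factor using two table lookups and one XOR."""
    low, high = build_nibble_tables(factor)
    lo_idx = bytes(x & 0x0F for x in lanes)   # low 4 bits of each byte
    hi_idx = bytes(x >> 4 for x in lanes)     # high 4 bits of each byte
    lo_part = shuffle(low, lo_idx)
    hi_part = shuffle(high, hi_idx)
    return bytes(a ^ b for a, b in zip(lo_part, hi_part))
```

The split works because multiplication distributes over XOR: a byte equals its low nibble XOR its shifted high nibble, so its product with the factor is the XOR of the two table entries. On SIMD hardware each `shuffle` call would be a single instruction covering 16 or more bytes at once.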

The Term: "sequencer" ('296 Patent, Claim 1)

Context and Importance

This term defines the control logic that orders the high-speed operations of the parallel multiplier. Whether the control flow of the accused CDP and ISA-L combination meets this limitation will be a key point of the infringement analysis.

Intrinsic Evidence for Interpretation

  • Evidence for a Broader Interpretation: The claim language is functional, defining the sequencer by its role of "ordering operations through the data matrix and the encoding matrix using the parallel multiplier" ('296 Patent, Claim 1). This could support a broad interpretation covering any code that dictates the order of the multiplication operations.
  • Evidence for a Narrower Interpretation: The patent describes specific sequencing methods, such as a "row-by-row" data access approach designed to "minimize the number of memory accesses" ('296 Patent, col. 6:28-34, FIG. 4). Defendant may argue the term should be limited to sequencers that implement this specific optimization, rather than any arbitrary ordering of operations.
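The practical difference an operation ordering can make is sketched below with two loop orders that produce identical check data but read the source data a different number of times. This is one plausible illustration of why ordering affects memory accesses, not a description of the sequencing the ’296 Patent actually discloses; the function names, the read counter, the field GF(2^8), and the polynomial 0x11D are all assumptions.

```python
# Illustrative sketch only; names, counters, and field are assumptions.
POLY = 0x11D


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements using shift-and-XOR arithmetic."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return product


def encode_check_major(data: list[int], matrix: list[list[int]]) -> tuple[list[int], int]:
    """Finish one check value at a time, rescanning all data for each."""
    reads = 0
    checks = []
    for factors in matrix:
        acc = 0
        for j, factor in enumerate(factors):
            value = data[j]          # data re-read for every check row
            reads += 1
            acc ^= gf_mul(factor, value)
        checks.append(acc)
    return checks, reads


def encode_data_major(data: list[int], matrix: list[list[int]]) -> tuple[list[int], int]:
    """Read each data value once and fold it into every check accumulator."""
    reads = 0
    checks = [0] * len(matrix)
    for j, value in enumerate(data):  # each data value read exactly once
        reads += 1
        for i, factors in enumerate(matrix):
            checks[i] ^= gf_mul(factors[j], value)
    return checks, reads
```

With four data values and three check rows, the first ordering performs twelve data reads and the second performs four. Whether a given accused implementation uses either ordering is a factual question the complaint, as summarized, does not address.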

VI. Other Allegations

Indirect Infringement

The complaint's five counts are for indirect infringement, alleging both inducement and contributory infringement.

  • Inducement: The complaint alleges Cloudera induces infringement by providing "instructive materials and information," including online documentation with commands and installation guides, that actively instruct customers to download and use ISA-L with CDP to perform the patented methods (Compl. ¶¶46-47).
  • Contributory Infringement: The complaint alleges Cloudera contributes to infringement by making and selling software "especially made or especially adapted to practice the invention" which is not a "staple article or commodity of commerce suitable for substantial non-infringing use" (Compl. ¶51).

Willful Infringement

Willfulness is alleged for all five patents-in-suit. The basis for willfulness includes alleged pre-suit and post-suit knowledge.

  • Pre-suit Knowledge: The complaint alleges Cloudera has known of at least the '296 Patent since March 20, 2015, when a Cloudera employee was allegedly informed of StreamScale’s infringement claims (Compl. ¶54).
  • Post-suit Knowledge (measured from the 2021 Lawsuit): The complaint asserts that the filing of the 2021 Lawsuit on March 2, 2021, and the subsequent service of detailed infringement contentions and an expert report in that case, provided Cloudera with unequivocal and ongoing knowledge of its infringement (Compl. ¶¶55-58).

VII. Analyst’s Conclusion: Key Questions for the Case

  • A central issue will be one of evidentiary sufficiency and technical mapping: What evidence will Plaintiff provide to demonstrate that the accused combination of Cloudera's platform and Intel's library performs the specific, multi-step processes of the claimed "parallel multiplier" and "sequencer," and does that evidence establish direct infringement by Cloudera's customers?
  • A key question for indirect infringement and willfulness will be one of intent and equivalence: Given that the prior lawsuit concerned a product where infringing technology was allegedly bundled, can Plaintiff prove that Cloudera's current practice of instructing customers to separately download and integrate the same or similar technology constitutes a knowing and intentional inducement of infringement?