DCT

6:21-cv-00198

StreamScale Inc v. Cloudera Inc

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 6:21-cv-00198, W.D. Tex., 07/26/21
  • Venue Allegations: Plaintiff alleges venue is proper in the Western District of Texas because each Defendant maintains one or more regular and established places of business within the district, from which it conducts business related to the accused products and services.
  • Core Dispute: Plaintiff alleges that Defendants’ big data storage and processing systems, which incorporate Intel's Intelligent Storage Acceleration Library (ISA-L) and Cloudera's software platform, infringe six patents related to high-speed, accelerated erasure coding.
  • Technical Context: Erasure coding is a data storage method that provides fault tolerance with significantly less storage overhead than traditional data replication, making it a critical technology for large-scale cloud and big data applications.
  • Key Procedural History: The complaint alleges a history of industry awareness of Plaintiff's technology, including a 2013 letter to the USENIX computing association regarding then-pending applications. It alleges that Defendant Intel had knowledge of the patents-in-suit and their relevance to Intel's ISA-L library as early as 2014. The complaint also notes that Plaintiff filed petitions to correct inventorship on two of the patents-in-suit in February 2021 to add an inventor who was allegedly omitted through error.

Case Timeline

Date Event
2011-12-30 Earliest Priority Date for all Patents-in-Suit
2013-07-05 Plaintiff's counsel sends letter to USENIX regarding pending patent applications
2014-03-25 U.S. Patent No. 8,683,296 Issues
2015-10-13 U.S. Patent No. 9,160,374 Issues
2016-07-05 U.S. Patent No. 9,385,759 Issues
2018-06-19 U.S. Patent No. 10,003,358 Issues
2019-05-14 U.S. Patent No. 10,291,259 Issues
2020-05-26 U.S. Patent No. 10,666,296 Issues
2021-02-23 Plaintiff files Petitions for Correction of Inventorship for the '358 and '259 Patents
2021-03-05 Plaintiff serves Original Complaint on Intel
2021-07-07 Plaintiff sends cease and desist letter to Intel
2021-07-26 Second Amended Complaint Filed

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 8,683,296 - “Accelerated Erasure Coding System and Method”

The Invention Explained

  • Problem Addressed: The patent’s background section describes traditional data replication as “highly inefficient and no longer commercially practicable” for large-scale storage systems (Compl. ¶4). It further notes that existing erasure coding methods, while more efficient, were considered computationally "brutally expensive" and too slow for high-speed applications like RAID systems ('296 Patent, col. 1:56-2:10).
  • The Patented Solution: The invention proposes a computer-implemented system to accelerate erasure coding calculations. The system is configured to hold original data, check data (for redundancy), and encoding factors in matrices within a computer's main memory. A specialized software "thread" then executes on a processing core, using a "parallel multiplier" to concurrently multiply multiple data entries by a single factor and a "sequencer" to order these operations efficiently, thereby generating the check data at high speed ('296 Patent, Abstract; col. 3:42-65).
  • Technical Importance: This approach is claimed to make high-performance erasure coding practical, enabling significant reductions in storage overhead compared to replication while maintaining robust fault tolerance (Compl. ¶5).
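
For illustration only, the claimed arrangement of a data matrix, an encoding matrix, and a check matrix can be sketched as a matrix product over the finite field GF(2^8). This is a minimal sketch, not the patent's or ISA-L's implementation: the field polynomial 0x11D, the function names, and the sample values are all assumptions. Here `multiply_row_by_factor` stands in for the "parallel multiplier" (many data entries times one factor) and `encode` stands in for the "sequencer" (ordering operations through the two matrices).

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8); polynomial 0x11D is an assumed choice."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11D           # reduce back into 8 bits
        b >>= 1
    return result


def multiply_row_by_factor(row: bytes, factor: int) -> bytes:
    """Hypothetical stand-in for the 'parallel multiplier':
    every entry of one data row is multiplied by the same single factor."""
    return bytes(gf_mul(value, factor) for value in row)


def xor_rows(left: bytes, right: bytes) -> bytes:
    """Element-wise GF(2^8) addition of two equal-length rows."""
    return bytes(x ^ y for x, y in zip(left, right))


def encode(data_matrix, encoding_matrix):
    """Hypothetical stand-in for the 'sequencer': walks the encoding matrix
    row by row and the data matrix row by row to build the check matrix."""
    width = len(data_matrix[0])
    check_matrix = []
    for factors in encoding_matrix:              # one factor row per check row
        check_row = bytes(width)                 # starts as all zeros
        for factor, data_row in zip(factors, data_matrix):
            check_row = xor_rows(check_row, multiply_row_by_factor(data_row, factor))
        check_matrix.append(check_row)
    return check_matrix


# Sample values (assumed): three data rows, two check rows.
data_matrix = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
encoding_matrix = [[1, 1, 1], [1, 2, 4]]
check_matrix = encode(data_matrix, encoding_matrix)
```

With factors of all ones, the first check row reduces to a plain XOR parity of the data rows; the second row uses distinct factors, which is what lets an erasure code tolerate more than one lost block.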

Key Claims at a Glance

  • The complaint asserts at least independent Claim 1 (Compl. ¶112).
  • Essential elements of Claim 1:
    • A system for accelerated error-correcting code (ECC) processing comprising: a processing core and a non-volatile storage medium.
    • The system is configured to implement an erasure coding system, which itself comprises:
      • a data matrix for holding original data in the main memory;
      • a check matrix for holding check data in the main memory;
      • an encoding matrix for holding first factors in the main memory for encoding the original data; and
      • a thread for executing on the processing core.
    • The thread further comprises:
      • a parallel multiplier for concurrently multiplying multiple data entries of a matrix by a single factor; and
      • a first sequencer for ordering operations through the data and encoding matrices to generate the check data.
  • The complaint does not explicitly reserve the right to assert dependent claims for this patent.

U.S. Patent No. 9,160,374 - “Accelerated Erasure Coding System and Method”

The Invention Explained

  • Problem Addressed: This patent, a continuation of the '296 Patent application, addresses the same problem of computationally intensive erasure coding calculations that made the technology impractical for high-speed data storage systems ('374 Patent, col. 1:13-2:12).
  • The Patented Solution: The solution is structurally similar to that of the '296 Patent but adds hardware specificity. The '374 Patent claims a system where the processing core comprises "at least 16 data registers, each of the data registers comprising at least 16 bytes" ('374 Patent, Claim 1). This language suggests an invention tailored to leverage the capabilities of specific CPU architectures, such as those with Single Instruction, Multiple Data (SIMD) instruction sets, to perform the "parallel multiplier" function more efficiently ('374 Patent, col. 3:45-4:1).
  • Technical Importance: By tying the accelerated coding method to specific processor hardware features, the invention provides a more concrete implementation pathway for achieving the claimed performance increases in erasure coding.

Key Claims at a Glance

  • The complaint asserts at least independent Claim 1 (Compl. ¶135).
  • Essential elements of Claim 1:
    • A system for accelerated ECC processing comprising: a processing core, which itself comprises at least 16 data registers of at least 16 bytes each; and a non-volatile storage medium.
    • The system is configured to implement an erasure coding system comprising a data matrix, a check matrix, and an encoding matrix in main memory.
    • The system includes a thread comprising a parallel multiplier and a first sequencer to generate the check data.
  • The complaint does not explicitly reserve the right to assert dependent claims for this patent.

U.S. Patent No. 9,385,759 - “Accelerated Erasure Coding System and Method”

  • Technology Synopsis: This patent builds on the prior inventions by explicitly adding an "input/output (I/O) controller for controlling data transfers between the main memory and the non-volatile storage media" to the claimed system. This suggests a focus on the complete data path from storage to processing and back, optimizing not just the computation but also the data movement within the system ('759 Patent, Claim 1).
  • Asserted Claims: At least independent Claim 1 (Compl. ¶155).
  • Accused Features: The accused "EC Systems," which comprise hardware (processors, memory, storage) and software (Cloudera CDH, Intel ISA-L), are alleged to practice the claimed system, including the use of I/O controllers to manage data flow for erasure coding operations (Compl. ¶157).

U.S. Patent No. 10,003,358 - “Accelerated Erasure Coding System and Method”

  • Technology Synopsis: This patent further specifies the system's architecture, claiming a processor with at least one single-instruction-multiple-data (SIMD) CPU core having at least 16 vector registers. It also details a storage arrangement with a plurality of data drives and more than two check drives. The claimed thread explicitly includes a parallel adder in addition to the parallel multiplier, breaking down the computation into more granular steps ('358 Patent, Claim 1).
  • Asserted Claims: At least independent Claim 1 (Compl. ¶175).
  • Accused Features: The accused systems are alleged to use SIMD-capable processors (Intel, AMD, etc.) and distributed storage (multiple data and check drives). The complaint alleges the software thread includes a parallel multiplier, parallel adder, and sequencer to compute check data (Compl. ¶177).

U.S. Patent No. 10,291,259 - “Accelerated Erasure Coding System and Method”

  • Technology Synopsis: This patent elaborates on the I/O process, claiming separate first and second I/O controllers. The first controller receives original data and stores it to main memory, while the second stores the computed check data from main memory to the check drives. This separation of I/O paths suggests an architecture designed to prevent bottlenecks in data movement during the erasure coding process ('259 Patent, Claim 1).
  • Asserted Claims: At least independent Claim 1 (Compl. ¶195).
  • Accused Features: The complaint alleges the accused systems include first and second I/O controllers that respectively handle the movement of original data into memory and check data out to storage drives as part of the overall accelerated ECC process (Compl. ¶197).

U.S. Patent No. 10,666,296 - “Accelerated Erasure Coding System and Method”

  • Technology Synopsis: This patent explicitly addresses multi-core processing. It claims a "scheduler for generating ECC data in parallel across a plurality of threads" and assigns these threads to a plurality of CPU cores. This describes a system designed to scale the accelerated erasure coding process across modern multi-core processors ('10-296 Patent, Claim 1).
  • Asserted Claims: At least independent Claim 1 (Compl. ¶215).
  • Accused Features: The accused systems are alleged to use multi-core processors and a scheduler that "generates ECC data in parallel across multiple threads," assigning these threads to the various CPU cores to perform the encoding concurrently (Compl. ¶217).
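
As a rough illustration of what "generating ECC data in parallel across a plurality of threads" could look like, the sketch below hands independent stripes of data to a pool of worker threads. It is a sketch under stated assumptions, not the claimed scheduler or any accused product's code: the names `encode_stripe` and `schedule_encoding`, the stripe layout, and the field polynomial 0x11D are hypothetical. In standard CPython the global interpreter lock limits CPU parallelism, so the example shows only how the work is partitioned, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8); polynomial 0x11D is an assumed choice."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def encode_stripe(stripe, factors):
    """Compute one check block for one stripe (a list of equal-length data blocks)."""
    check = bytearray(len(stripe[0]))
    for factor, block in zip(factors, stripe):
        for i, value in enumerate(block):
            check[i] ^= gf_mul(value, factor)
    return bytes(check)


def schedule_encoding(stripes, factors, workers=4):
    """Hypothetical scheduler: each stripe is an independent unit of work,
    so stripes can be assigned to separate threads in any order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(lambda stripe: encode_stripe(stripe, factors), stripes))


# Sample values (assumed): two stripes of two data blocks each.
stripes = [
    [bytes([1, 2]), bytes([3, 4])],
    [bytes([5, 6]), bytes([7, 8])],
]
factors = [1, 1]
parallel_result = schedule_encoding(stripes, factors)
sequential_result = [encode_stripe(stripe, factors) for stripe in stripes]
```

Because no stripe depends on another, the threaded and sequential results match; that independence is what makes the workload straightforward to spread across cores.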

III. The Accused Instrumentality

Product Identification

  • The complaint collectively refers to the accused instrumentalities as "EC Systems" (Compl. ¶52). These systems are built by Defendants Cloudera, ADP, Experian, and Wargaming and incorporate "Cloudera Erasure Coding Components" such as Cloudera Distribution Including Apache Hadoop ("Cloudera CDH") and Cloudera Enterprise (Compl. ¶51). A key component of these systems is Intel's Intelligent Storage Acceleration Library ("ISA-L"), which is allegedly packaged with and enabled by default in Cloudera CDH (Compl. ¶¶103, 107, 109).

Functionality and Market Context

  • The accused EC Systems provide large-scale, fault-tolerant data storage. Cloudera CDH is a data platform that uses erasure coding as an alternative to data replication to reduce storage costs by approximately 50% (Compl. ¶53). Intel's ISA-L is alleged to be a library of "optimized low-level functions" specifically designed to "accelerate the encoding and decoding calculations" for erasure coding on Intel architecture processors (Compl. ¶¶105, 106, 108).
  • The complaint alleges that Defendants ADP, Experian, and Wargaming have built commercially significant data platforms using these components. For example, ADP's DataCloud service allegedly processes payroll and HR data for millions of people using a Cloudera Enterprise cluster to generate business insights (Compl. ¶¶56-57, 63).
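
The alleged storage saving of approximately 50% can be checked with simple arithmetic. The sketch below assumes three-way replication and a layout of six data blocks plus three check blocks; neither figure comes from the complaint, and the function names are hypothetical.

```python
def raw_storage_replication(logical_units: float, copies: int) -> float:
    """Raw capacity consumed when every unit of data is stored `copies` times."""
    return logical_units * copies


def raw_storage_erasure_coded(logical_units: float, data_blocks: int, check_blocks: int) -> float:
    """Raw capacity consumed when each group of `data_blocks` data blocks
    carries `check_blocks` additional check blocks."""
    return logical_units * (data_blocks + check_blocks) / data_blocks


replicated = raw_storage_replication(1.0, copies=3)      # assumed 3-way replication
erasure_coded = raw_storage_erasure_coded(1.0, 6, 3)     # assumed 6 data + 3 check blocks
savings = 1 - erasure_coded / replicated

print(f"replication: {replicated:.1f}x raw storage")
print(f"erasure coding: {erasure_coded:.1f}x raw storage")
print(f"reduction: {savings:.0%}")
```

Under these assumptions, replication consumes 3.0 units of raw storage per unit of data and erasure coding consumes 1.5, a 50% reduction, while either scheme can lose multiple blocks without losing data.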

IV. Analysis of Infringement Allegations

No probative visual evidence provided in complaint.

’296 Patent Infringement Allegations

Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
a system for accelerated error-correcting code (ECC) processing comprising: a processing core...and a non-volatile storage medium... | The accused Cloudera Infringing Products and Services are systems comprising hardware and software, including processing cores (e.g., Intel, AMD) and non-volatile storage. | ¶115 | col. 3:43-46
configured to implement an erasure coding system comprising: a data matrix for holding original data in the main memory; a check matrix for holding check data in the main memory; an encoding matrix for holding first factors... | The accused systems implement an accelerated ECC system that includes a data matrix for original data, a check matrix for check data, and an encoding matrix for first factors, all held in memory. | ¶115 | col. 3:49-55
a thread for executing on the processing core and comprising: a parallel multiplier for concurrently multiplying multiple data entries of a matrix by a single factor; | The accused systems include a thread for execution on the processing core that includes a parallel lookup multiplier. | ¶115 | col. 3:56-59
and a first sequencer for ordering operations through the data matrix and the encoding matrix using the parallel multiplier to generate the check data. | The thread in the accused systems includes a sequencer for ordering operations through the data and encoding matrices to generate the check data. | ¶115 | col. 3:60-64

Identified Points of Contention

  • Scope Questions: A central question may be whether the claimed "data matrix...in the main memory" can be construed to read on data structures that are distributed across the memory of multiple physical servers in a Hadoop cluster, as is common in the accused systems. The patent figures illustrate a more monolithic system architecture ('296 Patent, Fig. 8).
  • Technical Questions: The complaint alleges the accused systems use a "parallel lookup multiplier" (Compl. ¶115). A factual dispute may arise over whether the functions in Intel's ISA-L, which are designed for SIMD processing, operate in a manner that meets the specific definition of the "parallel multiplier" and "sequencer" limitations as they would be construed from the patent's specification.

’374 Patent Infringement Allegations

Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
a system for accelerated error-correcting code (ECC) processing comprising: a processing core...the processing core comprising at least 16 data registers, each of the data registers comprising at least 16 bytes; | The accused Cloudera Infringing Products and Services comprise a processing core (e.g., Intel, AMD) with at least 16 data registers of at least 16 bytes each. | ¶137 | col. 3:50-54
and a non-volatile storage medium...configured to implement an erasure coding system comprising: a data matrix..., a check matrix..., an encoding matrix... | The accused systems include non-volatile storage and are configured to implement an ECC system with a data matrix, check matrix, and encoding matrix in memory. | ¶137 | col. 3:54-62
and a thread for executing on the processing core and comprising: a parallel multiplier... and a first sequencer... | The accused systems include a thread that executes on the processing core and includes a parallel lookup multiplier and a sequencer to generate check data. | ¶137 | col. 3:63-4:1

Identified Points of Contention

  • Scope Questions: Similar to the '296 Patent, the scope of "system" and its components in "main memory" will be a key issue for these more hardware-specific claims when applied to a distributed software environment.
  • Technical Questions: The infringement allegation hinges on the accused processing cores meeting the "at least 16 data registers of at least 16 bytes each" limitation. While modern Intel and AMD processors likely meet this requirement, the case may require specific evidence linking the execution of the accused ISA-L functions to the use of these specific hardware resources as claimed.

V. Key Claim Terms for Construction

  • The Term: "parallel multiplier"

  • Context and Importance: This term appears in the independent claims of both the '296 and '374 patents and is central to the invention's alleged speed advantage. The dispute will likely focus on whether the accused ISA-L functions, which use SIMD instructions, fall within the scope of this term.

  • Intrinsic Evidence for Interpretation:

    • Evidence for a Broader Interpretation: The claim language defines the term functionally as for "concurrently multiplying multiple data entries of a matrix by a single factor" ('296 Patent, col. 28:28-30). Plaintiff may argue this language is broad enough to cover any software instruction, including a SIMD operation, that achieves this parallelized result.
    • Evidence for a Narrower Interpretation: The specification describes a specific embodiment of the parallel multiplier that uses "two lookup tables" and the PSHUFB (Packed Shuffle Bytes) instruction ('296 Patent, col. 17:51-58). Defendants may argue that the term should be limited to this disclosed embodiment or implementations that are structurally equivalent.
  • The Term: "a data matrix for holding original data in the main memory"

  • Context and Importance: This term defines the location and structure of the data being processed. Its construction is critical for determining whether the claims apply to distributed computing environments like Cloudera CDH, where data is not held in a single contiguous block of one machine's main memory.

  • Intrinsic Evidence for Interpretation:

    • Evidence for a Broader Interpretation: The term "matrix" is a logical construct. Plaintiff may argue that as long as the data is logically organized as a matrix and accessible by the processing core—regardless of its physical distribution across a cluster's memory—it meets the limitation. The patent states the invention applies to "storage and retrieval of digital data distributed across numerous storage devices" ('296 Patent, col. 9:23-26).
    • Evidence for a Narrower Interpretation: The patent's detailed description and figures, such as the system diagram in Figure 8, depict a more conventional, single-node computer architecture with a CPU complex and a shared main memory ('296 Patent, Fig. 8). Defendants may argue that "the main memory" implies the memory of a single system, not the aggregated memory of a distributed cluster.
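
To make the "parallel multiplier" dispute more concrete, the sketch below shows one common way a "two lookup tables" multiply of the kind cited above can be built. It is an assumption-laden illustration, not the embodiment disclosed in the '296 Patent or the code in ISA-L: the field polynomial 0x11D and all function names are hypothetical, and the byte shuffle is emulated in software rather than issued as a PSHUFB instruction. The idea is that multiplication by a fixed factor is linear over XOR, so each byte can be split into its low and high 4-bit halves, each half looked up in a 16-entry table, and the two results XORed together.

```python
def gf_mul(a: int, b: int) -> int:
    """Reference byte multiply in GF(2^8); polynomial 0x11D is an assumed choice."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def build_nibble_tables(factor: int):
    """Two 16-entry tables for one factor: products of the factor with every
    low nibble (0x00-0x0F) and with every high nibble (0x00, 0x10, ..., 0xF0)."""
    low = bytes(gf_mul(i, factor) for i in range(16))
    high = bytes(gf_mul(i << 4, factor) for i in range(16))
    return low, high


def shuffle_bytes(table: bytes, indices: bytes) -> bytes:
    """Software emulation of a 16-lane byte shuffle: each index in 0-15
    selects one byte of the table. (A hardware shuffle does all lanes at once.)"""
    return bytes(table[i] for i in indices)


def multiply_block(block: bytes, factor: int) -> bytes:
    """Multiply every byte of a block by one factor using the two tables."""
    low, high = build_nibble_tables(factor)
    low_indices = bytes(value & 0x0F for value in block)
    high_indices = bytes(value >> 4 for value in block)
    low_products = shuffle_bytes(low, low_indices)
    high_products = shuffle_bytes(high, high_indices)
    return bytes(x ^ y for x, y in zip(low_products, high_products))


# Sample values (assumed): one 16-byte block, matching a 16-byte register width.
block = bytes(range(0xF0, 0x100))
product = multiply_block(block, 0x1D)
```

Whether a table-driven shuffle of this general kind is the only thing the term covers, or merely one example of it, is the substance of the broader-versus-narrower construction arguments summarized above.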

VI. Other Allegations

  • Indirect Infringement: The complaint levels a detailed inducement claim against Intel. It alleges Intel knows of the patents and intends for its customers (the other Defendants) to infringe by providing ISA-L, which is allegedly designed to practice the patents (Compl. ¶119). Affirmative acts of inducement cited include packaging ISA-L with Cloudera CDH, publishing API manuals and marketing materials that instruct users on how to implement the infringing erasure coding, and providing technical support (Compl. ¶¶120-122).
  • Willful Infringement: Willfulness is alleged against Intel based on both pre- and post-suit knowledge. The complaint alleges Intel knew of the parent application to the '296 Patent as early as 2014 and took "deliberate actions to avoid learning" facts about infringement (Compl. ¶¶125-126). It further alleges actual notice from the filing of the original complaint on March 5, 2021, and a cease and desist letter sent on July 7, 2021, after which Intel allegedly continued its infringing conduct (Compl. ¶¶127, 130).

VII. Analyst’s Conclusion: Key Questions for the Case

  • A core issue will be one of definitional scope: can the claim term "system," with its components described as being "in the main memory," be construed to cover a distributed software platform like Cloudera CDH where data and processing are spread across a cluster of independent machines? The resolution will depend on whether the court views these terms through a logical or a physical hardware lens.
  • A key evidentiary question will be one of technical implementation: does the accused Intel ISA-L software, which leverages modern SIMD instructions, actually perform the functions of the claimed "parallel multiplier" and "sequencer" as those terms are defined in the patent specifications, particularly in light of the detailed flowcharts and specific instructions disclosed?
  • For the claims against Intel, a central question will be knowledge and intent: what evidence can be presented to establish that Intel, by developing and distributing its ISA-L library for general use in storage applications, specifically intended for its customers to build systems that would infringe the particular combination of elements recited in the StreamScale patent claims?