1:24-cv-00261
Byteweavr, LLC v. Cloudera, Inc.
I. Executive Summary and Procedural Information
- Parties & Counsel:
- Plaintiff: Byteweavr, LLC (Texas)
- Defendant: Cloudera, Inc. (Delaware)
- Plaintiff’s Counsel: BRAGALONE OLEJKO SAAD PC
- Case Identification: 1:24-cv-00261, W.D. Tex., 03/08/2024
- Venue Allegations: Plaintiff alleges venue is proper in the Western District of Texas because Defendant maintains a regular and established place of business in Austin, Texas, and has committed acts of infringement in the district.
- Core Dispute: Plaintiff alleges that Defendant's enterprise data platforms, including Cloudera Enterprise and Cloudera Data Platform, infringe eight patents related to distributed computing, network agents, data compression, and workflow management.
- Technical Context: The technologies relate to foundational aspects of large-scale, network-based data processing and management, a critical component of modern enterprise cloud computing and big data analytics.
- Key Procedural History: The complaint does not mention any prior litigation, Inter Partes Review (IPR) proceedings, or licensing history related to the asserted patents.
Case Timeline
| Date | Event |
|---|---|
| 1998-10-23 | Priority Date for U.S. Patent No. 6,839,733 |
| 1998-10-23 | Priority Date for U.S. Patent No. 7,949,752 |
| 1999-04-09 | Priority Date for U.S. Patent No. 6,862,488 |
| 1999-04-26 | Priority Date for U.S. Reissued Patent No. RE42153 |
| 2000-06-23 | Priority Date for U.S. Patent No. 7,082,474 |
| 2000-08-02 | Priority Date for U.S. Patent No. 8,275,827 |
| 2001-08-24 | Priority Date for U.S. Patent No. 6,999,961 |
| 2002-01-31 | Priority Date for U.S. Patent No. 6,965,897 |
| 2005-01-04 | Issue Date for U.S. Patent No. 6,839,733 |
| 2005-03-01 | Issue Date for U.S. Patent No. 6,862,488 |
| 2005-11-15 | Issue Date for U.S. Patent No. 6,965,897 |
| 2006-02-14 | Issue Date for U.S. Patent No. 6,999,961 |
| 2006-07-25 | Issue Date for U.S. Patent No. 7,082,474 |
| 2008-01-01 | Cloudera founded |
| 2011-02-15 | Issue Date for U.S. Reissued Patent No. RE42153 |
| 2011-05-24 | Issue Date for U.S. Patent No. 7,949,752 |
| 2012-09-25 | Issue Date for U.S. Patent No. 8,275,827 |
| 2014-01-01 | Cloudera began offering Cloudera Distributed Hadoop (CDH) |
| 2019-01-01 | Cloudera introduced Cloudera Data Platform (CDP) |
| 2024-03-08 | Complaint Filing Date |
II. Technology and Patent(s)-in-Suit Analysis
U.S. Patent No. 6,839,733 - “Network system extensible by users,” issued Jan. 4, 2005
The Invention Explained
- Problem Addressed: The patent describes the problem of customizing technology services for individual subscribers, which previously required time-consuming and expensive modification of provider-side software applications by human developers (Compl. ¶22; ’733 Patent, col. 2:4-29).
- The Patented Solution: The invention proposes a network system augmented with an "agent system" where users can create, modify, or delete software "agents" to perform tasks on their behalf ('733 Patent, col. 2:45-51). These agents utilize network "services" (e.g., e-mail, voice mail) and consume service "resources" (e.g., processing time, storage space), with their consumption mediated and controlled by "service wrappers" to prevent misuse ('733 Patent, Fig. 1; col. 2:51-58).
- Technical Importance: This approach aimed to shift the burden of service customization from the provider to the end-user, enabling programmable and extensible network services in an era before the widespread adoption of modern cloud-based APIs and platforms (Compl. ¶21; ’733 Patent, col. 2:33-37).
Key Claims at a Glance
- The complaint asserts independent method claim 37 (Compl. ¶96).
- Claim 37 Elements: A method comprising the steps of:
- admitting a user to a network system wherein at least one agent is operable to consume a service resource while utilizing a service to perform a task for the user; and
- allowing the user to create, modify, or delete the agent within the network system.
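For illustration only, the agent and "service wrapper" arrangement described above can be sketched in Python. This is a minimal sketch under stated assumptions: the class names, the quota model, and the "mail_filter" agent are all hypothetical and are not drawn from the '733 Patent, the complaint, or any Cloudera product.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceWrapper:
    """Hypothetical wrapper that meters an agent's use of one service."""
    service_name: str
    quota: int  # units of the service resource (e.g., messages) still available

    def use(self, units: int) -> bool:
        # Refuse the request when it would exceed the remaining quota.
        if units > self.quota:
            return False
        self.quota -= units
        return True


@dataclass
class AgentSystem:
    """Hypothetical registry in which an admitted user manages agents."""
    agents: dict = field(default_factory=dict)

    def create(self, name: str, task: str) -> None:
        self.agents[name] = task

    def modify(self, name: str, task: str) -> None:
        if name not in self.agents:
            raise KeyError(name)
        self.agents[name] = task

    def delete(self, name: str) -> None:
        del self.agents[name]


system = AgentSystem()
system.create("mail_filter", "forward urgent e-mail")
system.modify("mail_filter", "forward urgent e-mail to pager")

email = ServiceWrapper("e-mail", quota=2)
# The third request is refused because the quota is already used up.
results = [email.use(1), email.use(1), email.use(1)]

system.delete("mail_filter")
```

The sketch separates the two claim-level ideas: user control over the agent's lifecycle (create, modify, delete) and mediated consumption of a finite service resource.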
U.S. Patent No. 7,949,752 - “Network system extensible by users,” issued May 24, 2011
The Invention Explained
- Problem Addressed: This patent, related to the ’733 patent, addresses the same general problem of enabling user-driven customization of network services ('752 Patent, col. 2:4-29).
- The Patented Solution: The invention describes a method for creating and invoking a network-based agent where the execution is specifically triggered by receiving a URL that defines an event type and identifies the agent ('752 Patent, col. 17:49-58). The agent's operation consumes a "discrete unit" of a service resource, and the result is communicated back over a network link ('752 Patent, col. 18:2-7). This formalizes the invocation mechanism for the agents described in the parent patent family.
- Technical Importance: This method provided a specific, web-centric mechanism (URL-based invocation) for triggering server-side, resource-constrained software agents, reflecting an early architectural pattern for web-based automation (Compl. ¶23).
Key Claims at a Glance
- The complaint asserts independent method claim 24 (Compl. ¶107).
- Claim 24 Elements: A method comprising the steps of:
- receiving, using a computing device, data for creating a network-based agent;
- invoking, using the computing device, and in response to receiving a URL defining a type of event and identifying the network-based agent, execution of the network-based agent;
- wherein the invoking comprises using a service and a service resource configured to be consumed by the network-based agent for performing the operation;
- wherein a discrete unit of the service resource is exhausted upon being consumed by the network-based agent; and
- communicating, using the computing device, a result of the operation over a network communication link.
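As a rough illustration of the claimed sequence, the following Python sketch parses a URL that carries an event type and an agent identifier, invokes the named agent, and exhausts one unit of a resource per invocation. The URL layout, the `event` and `agent` query parameters, and the credit counter are assumptions made for this example; neither the '752 Patent nor the complaint specifies them.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical agent registry: agent name -> callable that performs the operation.
AGENTS = {"report": lambda: "report ready"}
# Hypothetical resource pool: each invocation exhausts one discrete unit.
credits = {"report": 1}


def handle_url(url: str) -> str:
    """Invoke the agent named in the URL when the URL carries a known event type."""
    query = parse_qs(urlparse(url).query)
    event = query.get("event", [""])[0]
    agent = query.get("agent", [""])[0]
    if event != "start" or agent not in AGENTS:
        return "ignored"
    if credits[agent] < 1:
        return "resource exhausted"
    credits[agent] -= 1  # the consumed unit is gone once used
    return AGENTS[agent]()  # result that would be sent back over the network link


first = handle_url("http://example.invalid/events?event=start&agent=report")
second = handle_url("http://example.invalid/events?event=start&agent=report")
```

Under these assumptions the first call returns the agent's result and the second is refused because the single unit has been consumed, which is one possible reading of a "discrete unit" being "exhausted."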
U.S. Patent No. 6,862,488 - “Automated validation processing and workflow management,” issued Mar. 1, 2005 (Multi-Patent Capsule)
- Technology Synopsis: The patent addresses the need to automate the creation and management of validation protocols, particularly for regulated industries like pharmaceuticals and biotechnology, which require extensive documentation to prove equipment and processes meet safety and quality standards (Compl. ¶¶ 24, 48; ’488 Patent, col. 1:12-25). The solution is a system with a user interface for inputting validation data and a processing engine with rules to automatically generate and manage these validation protocols ('488 Patent, col. 2:39-51).
- Asserted Claims: At least claim 11 (independent method) (Compl. ¶118).
- Accused Features: Cloudera's Data Platform, particularly its use in pharmaceutical and biotech applications and its Apache NiFi component for automating data flow validation (Compl. ¶¶ 48-54).
U.S. Patent No. 6,965,897 - “Data Compression Method and Apparatus,” issued Nov. 15, 2005 (Multi-Patent Capsule)
- Technology Synopsis: The patent describes a method for compressing data, particularly for large database tables, by arranging data into a "mixed format physical layout" ('897 Patent, Abstract). This layout separates data into fixed-sized fields and variable-sized fields, which are then compressed, aiming to improve compression ratios while maintaining compatibility with database systems that require random access ('897 Patent, col. 1:21-31).
- Asserted Claims: At least claim 1 (independent method) (Compl. ¶129).
- Accused Features: The Cloudera Enterprise platform's use of the Apache Avro data file schema, which arranges data into fixed-sized fields (e.g., integers, longs) and variable-sized fields (e.g., strings) and supports compression (Compl. ¶¶ 59-62).
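The general idea of separating fixed-sized from variable-sized fields before compression can be sketched as follows. This is an illustrative sketch only: the row contents, the byte layout, and the choice of zlib are assumptions, and the code reproduces neither the '897 Patent's actual "mixed format physical layout" nor the Apache Avro file format.

```python
import struct
import zlib

# Hypothetical rows: (id, quantity, free-text note).
rows = [(1, 10, "alpha"), (2, 20, "a longer note"), (3, 30, "")]

# Fixed-sized region: two 4-byte little-endian integers per row.
fixed = b"".join(struct.pack("<ii", rid, qty) for rid, qty, _ in rows)

# Variable-sized region: each note stored as a 2-byte length plus UTF-8 bytes.
variable = b"".join(
    struct.pack("<H", len(note.encode("utf-8"))) + note.encode("utf-8")
    for _, _, note in rows
)

# Compress the two regions independently.
packed_fixed = zlib.compress(fixed)
packed_variable = zlib.compress(variable)


def read_fixed(index: int) -> tuple:
    """Read one row's fixed fields by row index after decompressing the region."""
    raw = zlib.decompress(packed_fixed)
    return struct.unpack_from("<ii", raw, index * 8)
```

Because every row occupies exactly 8 bytes in the fixed region, a row's offset can be computed directly from its index, which is the kind of random access the synopsis refers to.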
U.S. Patent No. 6,999,961 - “Method of aggregating and distributing informal and formal knowledge using software agents,” issued Feb. 14, 2006 (Multi-Patent Capsule)
- Technology Synopsis: The patent discloses a method for aggregating information from disparate sources using software agents ('961 Patent, Abstract). A central "content aggregator" transmits a search query to multiple remote agents on distinct networks; these agents search their local networks, return results to the aggregator, which then processes the results according to client-defined rules and transmits the processed information to the client ('961 Patent, col. 2:10-24).
- Asserted Claims: At least claim 1 (independent method) (Compl. ¶140).
- Accused Features: The Cloudera Search tool, which is based on Apache Solr and acts as a content aggregator, transmitting queries to a distributed service of servers (remote agents) that search for data across different networks (e.g., data center, multiple clouds) and return processed results (Compl. ¶¶ 63-68).
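The aggregation flow in the synopsis (query sent to remote agents, local searches, results processed under client-defined rules) might be sketched as below. The network names, documents, and filtering rule are hypothetical, and the sketch does not represent how Apache Solr or Cloudera Search distributes queries.

```python
from typing import Callable

# Hypothetical remote agents, each searching its own network's documents.
NETWORKS = {
    "datacenter": ["quarterly sales report", "hr policy"],
    "cloud_a": ["sales forecast", "incident log"],
}


def remote_agent(network: str, query: str) -> list:
    """Search one network's local documents for the query term."""
    return [doc for doc in NETWORKS[network] if query in doc]


def aggregate(query: str, rule: Callable[[str], bool]) -> list:
    """Send the query to every agent, merge the results, apply the client's rule."""
    merged = []
    for network in NETWORKS:
        merged.extend(remote_agent(network, query))
    return sorted(doc for doc in merged if rule(doc))


# Client-defined rule (assumed): drop anything labeled as a forecast.
hits = aggregate("sales", lambda doc: "forecast" not in doc)
```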
U.S. Patent No. 7,082,474 - “Data sharing and file distribution method and associated distributed processing system,” issued Jul. 25, 2006 (Multi-Patent Capsule)
- Technology Synopsis: The patent describes a method for operating a distributed processing system where a server sends a workload and a data index to a host device ('474 Patent, Abstract). The host device accesses the required data using an address from the index and then updates the index to include a new storage address, effectively managing data location in a distributed environment ('474 Patent, col. 4:51-64).
- Asserted Claims: At least claim 1 (independent method) (Compl. ¶151).
- Accused Features: The Cloudera Search tool, where a client submits a query (workload) to a server system (NameNode) that distributes it to host devices (DataNodes/Solr servers). These hosts use an index (HDFS) to locate and access data, and the index is updated to reflect the location of stored data (Compl. ¶¶ 69-73).
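A minimal sketch of the index-update step described in the synopsis follows, assuming a simple in-memory index that maps a data name to a list of storage addresses. The address scheme, host name, and workload are invented for illustration and do not describe HDFS, Solr, or the '474 Patent's actual data structures.

```python
# Hypothetical data index sent by the server with the workload: name -> addresses.
index = {"dataset": ["server://central/dataset"]}
# Hypothetical storage reachable by address.
storage = {"server://central/dataset": [3, 1, 2]}


def run_workload(host: str, workload, data_name: str):
    """Host fetches data via the index, keeps a copy, and records the new address."""
    data = storage[index[data_name][0]]       # access using an indexed address
    local_address = f"host://{host}/{data_name}"
    storage[local_address] = data             # host stores a local copy
    index[data_name].append(local_address)    # index now includes the new address
    return workload(data)


result = run_workload("host1", sorted, "dataset")
```

After the call, the index lists both the original and the host-local address, so a later workload could in principle fetch the data from either location.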
U.S. Patent No. 8,275,827 - “Software-based network attached storage services hosted on massively distributed parallel computing networks,” issued Sep. 25, 2012 (Multi-Patent Capsule)
- Technology Synopsis: The patent details a method for creating software-based Network Attached Storage (NAS) services using under-utilized storage resources on distributed devices ('827 Patent, Abstract). Client agents on these devices assess and represent available storage, process storage workloads, and enable devices to function as location-aware or stand-alone NAS devices ('827 Patent, col. 3:6-24).
- Asserted Claims: At least claims 2 and 14 (independent method claims) (Compl. ¶162).
- Accused Features: Cloudera's Data Hub service, which configures clusters of distributed devices (nodes) with client agents (NodeManagers) that manage workloads and have corresponding software-based NAS components (attached storage volumes) whose resources can be scaled up or down (Compl. ¶¶ 74-81).
U.S. Reissued Patent No. RE42153 - “Dynamic coordination and control of network connected devices for large-scale network site testing and associated architectures,” issued Feb. 15, 2011 (Multi-Patent Capsule)
- Technology Synopsis: The patent describes a method for dynamically coordinating distributed client systems in a computing platform. A server distributes workloads and parameters to client systems, receives "poll communications" from them regarding their status, analyzes these communications, and sends back modified parameters to dynamically change the number of active clients (e.g., for autoscaling) ('153 Patent, col. 2:25-45).
- Asserted Claims: At least claim 1 (independent method) (Compl. ¶175).
- Accused Features: Cloudera's Data Hub service, which uses its management console (server) to distribute workloads to client systems (nodes) and uses autoscaling policies. Client agents (NodeManagers) send poll communications about workload demand and available capacity to a Resource Manager, which analyzes them and modifies poll parameters to scale the number of active nodes up or down (Compl. ¶¶ 82-88).
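The poll-and-adjust loop in the synopsis can be illustrated with a short Python sketch. The load thresholds, the poll interval policy, and the dictionary fields are assumptions chosen for the example; they are not taken from the RE42153 patent or from Cloudera's autoscaling implementation.

```python
def analyze_polls(polls: list, active: int, max_clients: int) -> dict:
    """Server-side sketch: turn client poll reports into modified parameters."""
    # Each poll is a hypothetical dict: {"client": str, "load": float in [0, 1]}.
    average = sum(p["load"] for p in polls) / len(polls)
    if average > 0.8 and active < max_clients:
        active += 1            # demand is high: bring one more client online
    elif average < 0.2 and active > 1:
        active -= 1            # demand is low: retire one client
    # Poll more often under heavy load, less often otherwise (assumed policy).
    interval = 5 if average > 0.8 else 30
    return {"active_clients": active, "poll_interval_s": interval}


params = analyze_polls(
    [{"client": "a", "load": 0.9}, {"client": "b", "load": 0.95}],
    active=2,
    max_clients=4,
)
```

Here the returned dictionary stands in for the "modified parameters" sent back to clients, with the active-client count changing in response to the analyzed polls.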
III. The Accused Instrumentality
- Product Identification: The complaint identifies the "Accused Instrumentalities" as the "Cloudera Platforms," which include Cloudera Enterprise and/or the Cloudera Data Platform (CDP), along with their various software components (Compl. ¶¶ 9, 31). These components include distributions of open-source technologies such as Apache Oozie, NiFi, YARN, Hue, Avro, ZooKeeper, and Solr (Compl. ¶¶ 14, 31).
- Functionality and Market Context: The Cloudera Platforms are described as integrated data management and analytics platforms that provide services for data warehousing, data engineering, and operational database workloads (Compl. ¶4). They are built on the Hadoop ecosystem, which provides scalable storage and distributed computing (Compl. ¶6). Key functionalities accused of infringement include:
- Workflow Scheduling: Using Apache Oozie as a workflow scheduler to manage and schedule Hadoop jobs, which the complaint maps to the claimed "agent" (Compl. ¶32). A screenshot shows the Hue web interface for managing Oozie jobs, including a "DailyAnalytics" workflow (Compl. p. 16, ¶35).
- Resource Management: Using the YARN architecture to allocate and manage cluster resources such as CPU and memory for scheduled tasks (Compl. ¶¶ 37, 38).
- Cluster Management: Using the Cloudera Management Console to create, manage, start, and stop clusters, which the complaint maps to the creation and invocation of a "network-based agent" (Compl. ¶¶ 40, 44). A screenshot from Cloudera's documentation shows the user interface for starting a cluster (Compl. p. 22, ¶44).
- Data Flow and Validation: Using Apache NiFi to automate and validate the flow of data between systems, particularly in regulated environments (Compl. ¶¶ 49-54).
- Data Storage and Compression: Using the Apache Avro file format, which organizes data into fixed-size and variable-size fields and supports data compression schemes like "Deflate" (Compl. ¶¶ 59, 62).
- Distributed Search: Using Cloudera Search (powered by Apache Solr) to provide full-text search across data stored in Hadoop, HBase, or cloud storage by aggregating results from a distributed set of servers (Compl. ¶¶ 63-65).
The complaint alleges these platforms are used across numerous industries and are offered via a subscription model (Compl. ¶¶ 9, 30).
IV. Analysis of Infringement Allegations
’733 Patent Infringement Allegations
| Claim Element (from Independent Claim 37) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| admitting a user to a network system | A user is admitted to the Cloudera network system, for example, by passing login authentication in the Hue web interface. | ¶33 | col. 9:10-14 |
| wherein at least one agent is operable to consume a service resource while utilizing a service to perform a task for the user | The Apache Oozie editor allows a user to create a scheduler agent (e.g., a workflow named "DailyImport") to perform a task, such as importing data. This agent utilizes Cloudera services and consumes service resources (e.g., CPU, memory) allocated by the YARN architecture's Resource Manager. | ¶¶32, 36-38, 97 | col. 3:10-14 |
| and allowing the user to create, modify, or delete the agent within the network system | The user, via the Oozie Editor interface within the Cloudera platform, can create, modify, or delete the Oozie workflow scheduler agent. The complaint provides a screenshot showing agents can be deleted by being "move[d] to trash". | ¶¶39, 97 | col. 3:14-16 |
- Identified Points of Contention:
- Scope Question: A primary issue may be whether an "Oozie workflow scheduler" (Compl. ¶39) as implemented in the Accused Instrumentalities constitutes an "agent" as that term is used in the '733 Patent. The patent describes agents as being created by users to extend or customize the network system ('733 Patent, col. 2:45-51), raising the question of whether a pre-defined workflow scheduler component fits this description.
- Technical Question: It may be disputed whether the allocation of system resources like CPU and memory by YARN's Resource Manager (Compl. ¶37) constitutes the "consumption" of a "service resource" by an "agent" in the specific manner contemplated by the patent's disclosure.
’752 Patent Infringement Allegations
| Claim Element (from Independent Claim 24) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| receiving, using a computing device, data for creating a network-based agent | The Cloudera Management Console (running on a Cloudera server) receives data from a user, such as cluster definition, name, and number of nodes, for creating a network-based agent (a cluster). | ¶¶40, 41 | col. 3:3-9 |
| invoking... in response to receiving a URL defining a type of event and identifying the network-based agent, execution of the network-based agent | The user clicks a "start" icon or "Provision Cluster" button, which the complaint alleges is a hyperlink with a URL that defines a "start" event and identifies the cluster. In response, the Cloudera server is instructed to invoke and start execution of the cluster. | ¶¶42-44 | col. 17:52-58 |
| wherein the invoking comprises using a service and a service resource configured to be consumed by the network-based agent for performing the operation | The execution of the cluster involves multiple services (e.g., HDFS, Hive). The resources for these services (e.g., cluster capacity, memory) are allocated by the Cloudera YARN architecture and consumed during operation. | ¶¶45, 46 | col. 3:11-14 |
| wherein a discrete unit of the service resource is exhausted upon being consumed by the network-based agent | The complaint does not provide sufficient detail for analysis of this element. | ||
| communicating, using the computing device, a result of the operation over a network communication link | The Cloudera management console communicates the result of the cluster's execution, such as resource utilization graphs, back to the user over the network. | ¶47 | col. 3:24-27 |
- Identified Points of Contention:
- Scope Question: A central point of contention may be whether a user clicking a GUI button such as "Provision Cluster" (Compl. p. 20, ¶42) constitutes "receiving a URL defining a type of event" as required by the claim. The defense may argue this is a standard user interface interaction, whereas the patent's language could be interpreted to require a more specific, direct URL-based invocation protocol.
- Technical Question: The complaint's allegations will raise the factual question of what evidence demonstrates that a "discrete unit of the service resource is exhausted" during the cluster invocation process. This may depend on how Cloudera's YARN architecture allocates and accounts for computational resources.
V. Key Claim Terms for Construction
For the ’733 Patent:
- The Term: "agent"
- Context and Importance: The infringement theory for the '733 Patent hinges on classifying Cloudera's "Oozie workflow scheduler" as an "agent" (Compl. ¶39). Construction of this term may be dispositive because it determines whether the accused functionality falls within the scope of the claims. Practitioners may focus on this term because it is a broad, functional term from an older patent being applied to a specific component of a modern, complex software stack.
- Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The specification states that agents are "task-based" and can be implemented as "a software application, program, or process which autonomously, and possibly continuously, runs on behalf of its principal" ('733 Patent, col. 8:38-41). This broad, functional definition could support including a workflow scheduler.
- Evidence for a Narrower Interpretation: The patent's background emphasizes agents as a means for end-users to "extend or customize the network system according to their own particular needs" ('733 Patent, col. 2:45-48). This could support a narrower construction limited to user-created or highly customizable software objects, potentially distinguishing them from a pre-built platform component like Oozie.
For the ’752 Patent:
- The Term: "receiving a URL defining a type of event"
- Context and Importance: This term is the trigger for the claimed "invoking" step. The complaint alleges that a user clicking a "start" icon or "Provision Cluster" button, described as a hyperlink, satisfies this limitation (Compl. ¶¶42, 44). The viability of the infringement claim depends on whether this common GUI action can be characterized as the specific URL-based event reception described in the patent.
- Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The patent does not appear to specify a particular format for the URL beyond it defining an "event's type and the agent... which is event's intended recipient" ('752 Patent, col. 17:52-54). This could be argued to cover any URL that, when accessed, initiates a specific action on a specific object.
- Evidence for a Narrower Interpretation: The specification describes a system where a web server can "receive HyperText Transfer Protocol (HTTP) requests for the web page at the URL so that agent server 20 can relay the event to the particular agent" ('752 Patent, col. 17:55-58). This language might support a narrower construction requiring a direct HTTP request/response mechanism for a specific URL, which may be functionally different from an internal command triggered by a button in a comprehensive management console.
VI. Other Allegations
- Indirect Infringement: The complaint alleges inducement of infringement of the '827 patent. The allegations state that Cloudera provides "training, certifications, demos, webinars, events, resource libraries, documentation, instructions and/or manuals for the Accused Instrumentalities" that allegedly instruct and encourage users to perform the infringing methods (Compl. ¶¶ 87, 165).
- Willful Infringement: The complaint alleges willful infringement of the '827 patent, asserting that Cloudera "disregarded an objectively high likelihood of infringement" (Compl. ¶166). However, the basis for knowledge is alleged to be "at least as early as the filing date of this Complaint" (Compl. ¶164), which may primarily support a claim for post-suit, rather than pre-suit, willful infringement.
VII. Analyst’s Conclusion: Key Questions for the Case
- A core issue will be one of definitional scope: can the term "agent," rooted in the patent's context of user-programmable, task-based software objects from the late 1990s, be construed to cover modern, complex, and highly-structured platform components like an "Oozie workflow scheduler" or a "cluster" in the Hadoop ecosystem?
- A key evidentiary question will be one of functional specificity: does the general operation of the accused Cloudera platforms—such as creating a cluster via a management console or allocating resources via YARN—perform the specific, multi-part functions required by the claims, particularly the claim limitation of "invoking... in response to receiving a URL defining a type of event"?
- The case may also turn on a question of validity: because the asserted claims are foundational in character and are applied to a platform built on widely adopted open-source technologies, a central dispute may be whether the specific methods claimed by these patents, which carry priority dates between 1998 and 2002, represent a distinct, patentable invention over the concurrent and subsequent evolution of distributed computing and web-based systems.