2:24-cv-00162
Byteweavr LLC v. Databricks Inc
I. Executive Summary and Procedural Information
- Parties & Counsel:
  - Plaintiff: Byteweavr, LLC (Texas)
  - Defendant: Databricks, Inc. (Delaware)
  - Plaintiff’s Counsel: BRAGALONE OLEJKO SAAD PC
 
- Case Identification: 2:24-cv-00162, E.D. Tex., 07/18/2024
- Venue Allegations: Plaintiff alleges venue is proper because Defendant Databricks maintains a regular and established place of business in Plano, Texas, within the Eastern District of Texas.
- Core Dispute: Plaintiff alleges that Defendant’s Databricks Lakehouse Platform, a unified data management and analytics platform, infringes seven patents related to extensible network systems, distributed data processing, automated workflow management, and data compression.
- Technical Context: The technology at issue is in the field of large-scale, cloud-based data analytics and artificial intelligence platforms, a commercially significant market for enterprise data warehousing and machine learning applications.
- Key Procedural History: The complaint indicates that all seven asserted patents have expired. This positions the lawsuit as a purely retrospective action focused on claims for past monetary damages, rather than prospective injunctive relief.
Case Timeline
| Date | Event | 
|---|---|
| 1998-07-08 | Earliest Priority Date (’153 Patent) | 
| 1998-10-23 | Priority Date (’733, ’752 Patents) | 
| 2000-06-23 | Priority Date (’474 Patent) | 
| 2001-04-13 | Priority Date (’827 Patent) | 
| 2002-01-28 | Priority Date (’897 Patent) | 
| 2002-07-05 | Priority Date (’488 Patent) | 
| 2005-01-04 | ’733 Patent Issued | 
| 2005-03-01 | ’488 Patent Issued | 
| 2005-11-15 | ’897 Patent Issued | 
| 2006-07-25 | ’474 Patent Issued | 
| 2011-02-15 | ’153 Patent Reissued | 
| 2011-05-24 | ’752 Patent Issued | 
| 2012-09-25 | ’827 Patent Issued | 
| 2013-01-01 | Databricks Founded | 
| 2019-08-28 | Accused MLflow functionality accessible | 
| 2020-01-01 | Accused "Lakehouse" platform introduced | 
| 2022-05-10 | Accused Databricks Workflows introduced | 
| 2024-07-18 | Complaint Filing Date | 
II. Technology and Patent(s)-in-Suit Analysis
U.S. Patent No. 6,839,733 - "Network system extensible by users"
- Patent Identification: U.S. Patent No. 6,839,733, "Network system extensible by users," Issued Jan. 4, 2005.
The Invention Explained
- Problem Addressed: The patent describes a problem in which technological services (e.g., email, voice mail) were generalized for a broad subscriber base, making customization for individual users a difficult, slow, and expensive process that required direct intervention by the service provider's software developers (’733 Patent, col. 2:4-29).
- The Patented Solution: The invention proposes a network system augmented with an "agent system" to solve this problem. Users, through a user interface, can interact with an "agent server" to create, modify, or delete software "agents." These agents are described as autonomous software objects that act on the user's behalf to utilize network services, thereby extending or customizing the network's functionality without direct developer intervention from the service provider (’733 Patent, Abstract; col. 3:11-14; FIG. 1).
- Technical Importance: This architecture aimed to decentralize service customization, empowering end-users to programmatically tailor network services to their specific needs, which could facilitate more rapid and personalized application development (’733 Patent, col. 2:30-34).
Key Claims at a Glance
- The complaint asserts independent claim 37 (Compl. ¶85).
- Claim 37 includes the following essential elements:
  - admitting a user to a network system wherein at least one agent is operable to consume a service resource while utilizing a service to perform a task for the user; and
  - allowing the user to create, modify, or delete the agent within the network system.
 
- The complaint does not explicitly reserve the right to assert dependent claims for this patent.
U.S. Patent No. 7,949,752 - "Network system extensible by users"
- Patent Identification: U.S. Patent No. 7,949,752, "Network system extensible by users," Issued May 24, 2011.
The Invention Explained
- Problem Addressed: This patent, related to the ’733 Patent, addresses the same general problem of inflexible, provider-centric network services (’752 Patent, col. 2:10-24).
- The Patented Solution: The invention details a specific method for invoking the network-based agents described in the patent family. A computing device receives data for creating an agent, and the agent's execution is triggered in response to receiving a Uniform Resource Locator (URL). The URL defines the type of event and identifies the specific agent to be executed. The invoked agent then consumes a "discrete unit" of a service resource to perform its operation, with the result communicated back over the network (’752 Patent, Abstract; col. 18:40-54).
- Technical Importance: The use of URLs as a trigger mechanism provided a standardized, web-native method for invoking and integrating these extensible, agent-based services into other networked applications and workflows (’752 Patent, col. 18:40-44).
Key Claims at a Glance
- The complaint asserts independent claim 24 (Compl. ¶96).
- Claim 24 includes the following essential elements:
  - receiving, using a computing device, data for creating a network-based agent;
  - invoking, in response to receiving a URL defining a type of event and identifying the network-based agent, execution of the network-based agent;
  - wherein the invoking comprises using a service and a service resource configured to be consumed by the agent, and wherein a discrete unit of the service resource is exhausted upon being consumed; and
  - communicating, using the computing device, a result of the operation over a network communication link.
 
- The complaint does not explicitly reserve the right to assert dependent claims for this patent.
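For orientation only, the URL-triggered invocation summarized above can be sketched in a few lines of Python. This is a reader's illustration, not the patent's implementation and not a proposed claim construction. The registry `AGENTS`, the functions `create_agent` and `invoke`, and the URL layout (`/agents/<id>?event=<type>`) are all assumptions made for the example.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical registry mapping an agent identifier to a callable.
AGENTS = {}

def create_agent(agent_id, func):
    """Receive data for creating an agent (here, just a callable)."""
    AGENTS[agent_id] = func

def invoke(url):
    """Run the agent named in the URL path, passing the event type from the query."""
    parts = urlparse(url)
    # Assumed layout: the last path segment identifies the agent.
    agent_id = parts.path.rstrip("/").split("/")[-1]
    # Assumed layout: the "event" query parameter defines the type of event.
    event = parse_qs(parts.query).get("event", [""])[0]
    return AGENTS[agent_id](event)
```

In this sketch a single URL carries both pieces of information the claim language mentions: which agent to run and what kind of event triggered it.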
U.S. Patent No. 6,965,897 - "Data Compression Method and Apparatus"
- Patent Identification: U.S. Patent No. 6,965,897, "Data Compression Method and Apparatus," Issued Nov. 15, 2005.
- Technology Synopsis: The patent addresses inefficient data compression in large databases that require random access. It proposes a "mixed format physical layout" combining fixed-size fields, variable-size fields, and "offset slots" that can act as pointers to a dictionary of common values, thereby improving compression by encoding column-wise redundancy while maintaining fast access (’897 Patent, Abstract; col. 1:21-30).
- Asserted Claims: Claim 1 (Compl. ¶118).
- Accused Features: The complaint alleges that Databricks' use of data formats like Avro, which supports "deflate" compression (a method that uses references to repeated data sequences) and maps data into fixed and variable-sized fields, infringes the patent (Compl. ¶60-65).
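As a rough illustration of the "mixed format physical layout" idea summarized in the Technology Synopsis, the Python sketch below packs one record as a fixed-size part, an offset slot that points into a shared dictionary of common values, and a variable-size field. The field widths, the function names `encode_row` and `decode_row`, and the sample column meanings are assumptions chosen for the example. They are not taken from the ’897 Patent or from any Databricks or Avro format.

```python
import struct

def encode_row(row_id, city, note, dictionary):
    """Encode one record as: fixed-size header | variable-size note bytes."""
    note_bytes = note.encode("utf-8")
    # Offset slot: position of the repeated column value in a shared dictionary.
    slot = dictionary.setdefault(city, len(dictionary))
    # Fixed-size part: 4-byte id, 2-byte dictionary slot, 2-byte note length.
    header = struct.pack("<IHH", row_id, slot, len(note_bytes))
    return header + note_bytes

def decode_row(buf, dictionary_values):
    """Decode a record; the header has a known size, so no scanning is needed."""
    row_id, slot, note_len = struct.unpack_from("<IHH", buf, 0)
    note = buf[8:8 + note_len].decode("utf-8")
    return row_id, dictionary_values[slot], note
```

The repeated column value is stored once in the dictionary and referenced by a two-byte slot in each record, which is one simple way to capture column-wise redundancy while keeping the header at a fixed, directly addressable size.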
U.S. Patent No. 7,082,474 - "Data sharing and file distribution method and associated distributed processing system"
- Patent Identification: U.S. Patent No. 7,082,474, "Data sharing and file distribution method and associated distributed processing system," Issued July 25, 2006.
- Technology Synopsis: The patent describes a method for managing data in a distributed processing system to improve efficiency. A server sends a workload to a host device along with an index defining the location of data needed to process it. After the host processes the workload, the central index is updated to include the host's local storage address as a new location for the data, effectively treating the host's local cache as part of the distributed file system (’474 Patent, Abstract).
- Asserted Claims: Claim 1 (Compl. ¶129).
- Accused Features: The Databricks platform is accused of infringing by distributing workloads to cluster nodes (host devices), sending an index of data locations via the Databricks File System (DBFS), and using disk caching on the nodes, which updates the location of the data to include the node's local storage (Compl. ¶53-59).
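The index-update step described in the Technology Synopsis can be sketched as follows. This is a minimal Python illustration under stated assumptions: the class `IndexServer`, the function `run_on_host`, and the location strings are hypothetical and do not describe DBFS, Databricks disk caching, or the patent's actual embodiments.

```python
class IndexServer:
    """Holds an index mapping each data key to its known storage locations."""

    def __init__(self, index):
        self.index = {key: list(locs) for key, locs in index.items()}

    def dispatch(self, keys):
        """Send a workload: the data keys plus where each one can be found."""
        return {key: list(self.index[key]) for key in keys}

    def report_cached(self, host_address, keys):
        """Record the host's local storage as an additional location."""
        for key in keys:
            if host_address not in self.index[key]:
                self.index[key].append(host_address)

def run_on_host(server, host_address, keys):
    """Host side: fetch from the first listed location, cache locally, report back."""
    locations = server.dispatch(keys)
    local_cache = {key: f"copy of {key} from {locs[0]}" for key, locs in locations.items()}
    server.report_cached(host_address, keys)
    return local_cache
```

After a host finishes a workload, later dispatches of the same data list that host's cache as a valid source, which is the sense in which the local cache becomes part of the distributed file system.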
U.S. Patent No. 8,275,827 - "Software-based network attached storage services hosted on massively distributed parallel computing networks"
- Patent Identification: U.S. Patent No. 8,275,827, "Software-based network attached storage services hosted on massively distributed parallel computing networks," Issued Sep. 25, 2012.
- Technology Synopsis: The patent discloses a method for creating virtual Network Attached Storage (NAS) services using the unused or underutilized storage resources of distributed devices. Client agents installed on these devices are configured with a software-based NAS component, allowing the collection of devices to function as a distributed NAS system for processing data storage and access workloads (’827 Patent, Abstract).
- Asserted Claims: Claims 2 and 14 (Compl. ¶140).
- Accused Features: The complaint alleges that Databricks' clusters, which consist of distributed worker nodes with dedicated storage resources (software-based NAS components), are configured to process workloads and collectively function as a distributed NAS system (Compl. ¶66-71).
U.S. Patent No. 6,862,488 - "Automated validation processing and workflow management"
- Patent Identification: U.S. Patent No. 6,862,488, "Automated validation processing and workflow management," Issued Mar. 1, 2005.
- Technology Synopsis: The invention provides a system to automate the validation of processes and equipment, particularly in regulated industries like biotechnology manufacturing. The system includes a user interface where a user can input data and select options, and a "validation processing engine" that uses pre-defined processing rules to automatically generate formal validation protocols based on the user's input (’488 Patent, Abstract).
- Asserted Claims: Claim 11 (Compl. ¶107).
- Accused Features: Databricks Workflows are accused of infringing by providing a user interface for defining job tasks and parameters, which are then automatically validated against configuration rules to create and execute a valid workflow, a process the complaint aligns with the claimed method (Compl. ¶47-52).
U.S. Reissued Patent No. RE42,153 - "Dynamic coordination and control of network connected devices for large-scale network site testing and associated architectures"
- Patent Identification: U.S. Reissued Patent No. RE42,153, "Dynamic coordination and control of network connected devices for large-scale network site testing and associated architectures," Reissued Feb. 15, 2011.
- Technology Synopsis: The patent describes a method for dynamically coordinating a group of distributed client systems working on a project. A server distributes workloads and initial parameters, receives "poll communications" from the clients, analyzes this polling data to create a "dynamic snapshot" of project status, and then sends modified parameters back to the clients to change their behavior (e.g., adding or removing active clients) in real-time (’153 Patent, Abstract).
- Asserted Claims: Claim 1 (Compl. ¶153).
- Accused Features: Databricks' "Enhanced Autoscaling" feature is accused of infringing. This feature monitors the utilization of worker nodes in a cluster, analyzes this "dynamic snapshot" information, and automatically adds or removes nodes ("clients") to optimize resource allocation, which the complaint maps to the claimed method (Compl. ¶72-77).
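The poll, snapshot, and modified-parameter cycle summarized above can be illustrated with a short Python sketch. It is a simplified reading for orientation only. The function names, the utilization thresholds, and the `target_clients` parameter are assumptions for the example and are not drawn from the ’153 Patent or from Databricks' Enhanced Autoscaling.

```python
def snapshot(polls):
    """Summarize poll reports given as {client_id: utilization between 0 and 1}."""
    active = len(polls)
    mean = sum(polls.values()) / active if active else 0.0
    return {"active_clients": active, "mean_utilization": mean}

def modified_parameters(snap, low=0.3, high=0.8):
    """Derive new parameters from the snapshot (thresholds are illustrative)."""
    active = snap["active_clients"]
    if snap["mean_utilization"] > high:
        return {"target_clients": active + 1}   # add a client
    if snap["mean_utilization"] < low and active > 1:
        return {"target_clients": active - 1}   # remove a client
    return {"target_clients": active}           # leave the group unchanged
```

A server would call `snapshot` on each round of poll communications and send the output of `modified_parameters` back to the clients, closing the feedback loop the synopsis describes.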
III. The Accused Instrumentality
- Product Identification: The complaint identifies the "Accused Instrumentalities" as the Databricks Lakehouse Platform and its components, including Databricks SQL, Delta Lake, Unity Catalog, MLflow, Databricks Marketplace, Data Intelligence Platform, and Databricks Spark Applications (Compl. ¶6, 27).
- Functionality and Market Context: The Databricks Lakehouse Platform is described as a unified platform for data management and analytics that combines the features of data lakes (for unstructured data) and data warehouses (for structured data) (Compl. ¶4). Its core functionality is to allow enterprises to manage, process, and analyze very large datasets for data engineering, business intelligence, and machine learning applications (Compl. ¶32). The platform operates on cloud infrastructure, allowing users to define and run jobs on dynamically configured clusters of computational resources (Compl. ¶44, 66). The complaint alleges the platform is used by over 50% of Fortune 500 companies and has significant commercial importance (Compl. ¶3, 26).
IV. Analysis of Infringement Allegations
’733 Patent Infringement Allegations
| Claim Element (from Independent Claim 37) | Alleged Infringing Functionality | Complaint Citation | Patent Citation | 
|---|---|---|---|
| admitting a user to a network system wherein at least one agent is operable to consume a service resource while utilizing a service to perform a task for the user; | Databricks admits users to a workspace via credentials. The complaint alleges an "MLflow model" is an "agent" that performs tasks (e.g., tracking metrics, deploying models) and consumes compute resources measured by Databricks Units (DBUs). | ¶33, 36, 37 | col. 3:10-13 | 
| and allowing the user to create, modify, or delete the agent within the network system. | Databricks provides an interface that allows users to create, modify, or delete MLflow models and to manage permissions for these actions. A screenshot shows a "Permission Settings" dialog for managing a model. | ¶38, p. 18 | col. 3:13-14 | 
- Identified Points of Contention:
  - Scope Questions: A primary issue may be whether an "MLflow model," which can be a trained data structure with associated code, qualifies as an "agent" as described in the patent. The patent specification characterizes agents as "personal software assistants" that act "autonomously" on behalf of a principal (’733 Patent, col. 8:35-42), a description that may suggest a different type of entity than a machine learning model.
  - Technical Questions: Does admitting a user to a software "workspace" for data analytics constitute "admitting a user to a network system" in the context of the patent, which provides examples rooted in telecommunications and general network services?
 
’752 Patent Infringement Allegations
| Claim Element (from Independent Claim 24) | Alleged Infringing Functionality | Complaint Citation | Patent Citation | 
|---|---|---|---|
| receiving, using a computing device, data for creating a network-based agent; | Databricks receives user-defined data, such as cluster configuration and node counts, to create a "cluster," which the complaint alleges is the claimed "network-based agent." | ¶39, 40 | col. 18:40-42 | 
| invoking, using the computing device, and in response to receiving a URL defining a type of event and identifying the network-based agent, execution of the network-based agent... | The complaint alleges that a user clicking a "Run now" button, which corresponds to a URL, invokes the execution of a job on the created cluster. A screenshot shows a UI with a "Run now" button that triggers a job run. | ¶42, p. 22 | col. 18:42-45 | 
| wherein the invoking comprises using a service and a service resource configured to be consumed by the network-based agent for performing the operation... | The cluster uses Databricks services for data engineering and analytics workloads, consuming computational resources that Databricks measures and bills for using DBUs. | ¶44, 45 | col. 18:45-48 | 
| and wherein a discrete unit of the service resource is exhausted upon being consumed by the network-based agent; | The complaint alleges that compute resources are consumed and measured by DBUs. However, it does not specify what "discrete unit" is "exhausted" by the operation. | ¶37, 45 | col. 18:48-51 | 
| and communicating, using the computing device, a result of the operation over a network communication link. | Databricks communicates the result of a job run (e.g., start, success, or failure) to the user via notifications, such as email. A screenshot shows email notifications being sent for job events. | ¶46, p. 26 | col. 18:51-54 | 
- Identified Points of Contention:
  - Scope Questions: The analysis may turn on whether a "Databricks cluster," a collection of computational resources, can be construed as a "network-based agent," which the patent describes as a software object (’752 Patent, col. 8:35-42).
  - Technical Questions: What evidence does the complaint provide that the accused platform's consumption of resources, measured by the continuous metric of DBUs, meets the claim limitation of a "discrete unit of the service resource" being "exhausted"? The patent's examples of discrete units include finite resources like "long-distance calling time" (’752 Patent, col. 11:21-24), which may differ from a pay-as-you-go compute model.
 
V. Key Claim Terms for Construction
- The Term: "agent" 
- Context and Importance: This term is central to the infringement theories for both the ’733 and ’752 patents. The viability of the plaintiff's case depends on construing "agent" to read on both an "MLflow model" and a "Databricks cluster." Practitioners may focus on this term because its definition will determine whether the core accused functionalities fall within the scope of the claims. 
- Intrinsic Evidence for Interpretation:
  - Evidence for a Broader Interpretation: The patent defines an agent broadly as a "software application, program, or process which autonomously, and possibly continuously, runs on behalf of its principal" (’733 Patent, col. 8:39-42). This language could potentially encompass a wide variety of software and automated processes.
  - Evidence for a Narrower Interpretation: The specification provides numerous examples framing an agent as a "personal software assistant" and an "electronic extension" of a human user, performing tasks like "answering telephone calls," "placing telephone calls," and "negotiating deals" (’733 Patent, col. 7:65-col. 8:42). This context may support a narrower construction limited to more user-interactive, autonomous software entities.
 
- The Term: "a discrete unit of the service resource is exhausted" 
- Context and Importance: This limitation from claim 24 of the ’752 patent requires a specific mode of resource consumption that must be met by the accused product. The complaint's reliance on the DBU metric makes the definition of this phrase critical to the infringement analysis. 
- Intrinsic Evidence for Interpretation:
  - Evidence for a Broader Interpretation: The specification gives examples of service resources comprising "discrete units which are 'consumed'," such as "units of long-distance calling time" or "units of data-access time" (’752 Patent, col. 11:21-29). This could support an argument that any metered unit of service, like a DBU, qualifies as a "discrete unit."
  - Evidence for a Narrower Interpretation: The term "exhausted" suggests the complete consumption of a pre-defined, finite quantum of a resource. This could be argued to be distinct from a pay-as-you-go model where consumption is measured continuously. A defendant might argue that "exhausted" implies a finite resource is fully depleted by the operation, which may not align with how DBUs are calculated and billed.
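The contrast between the two readings may be easier to see in a small Python sketch. Neither class is offered as a construction of the claim term or as a description of how DBUs actually work. `PrepaidUnits` and `MeteredUsage` are hypothetical models, and the rate and unit counts are invented for the example.

```python
class PrepaidUnits:
    """Finite pool of whole units; each operation uses up exactly one unit."""

    def __init__(self, units):
        self.units = units

    def consume(self):
        if self.units == 0:
            raise RuntimeError("no units left")
        self.units -= 1
        return self.units

class MeteredUsage:
    """Open-ended meter: usage accumulates and is billed after the fact."""

    def __init__(self, rate_per_hour):
        self.rate_per_hour = rate_per_hour
        self.hours = 0.0

    def record(self, hours):
        self.hours += hours

    def bill(self):
        return self.hours * self.rate_per_hour
```

Under the narrower reading, only something like `PrepaidUnits` has a unit that can be "exhausted." Under the broader reading, each increment recorded by `MeteredUsage` might itself be argued to be a consumed unit.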
 
VI. Other Allegations
- Indirect Infringement: The complaint alleges induced infringement of the ’827 patent, asserting that Databricks provides documentation, manuals, training, and technical support that instruct and encourage its customers and partners to use the platform in an infringing manner (Compl. ¶143).
- Willful Infringement: Willfulness is alleged for the ’827 patent. The complaint asserts that Defendant had knowledge of the patent "at least as early as the filing date of this Complaint" and continued its infringing conduct despite an "objectively high likelihood of infringement," characterizing the behavior as "willful, wanton, malicious," and "characteristic of a pirate" (Compl. ¶142, 144). This suggests a theory based on post-suit willfulness.
VII. Analyst’s Conclusion: Key Questions for the Case
- A core issue will be one of definitional scope: can the term "agent," which the patents describe in the context of user-centric "personal software assistants," be construed broadly enough to cover modern, large-scale cloud computing constructs like a "Databricks cluster" or an "MLflow model"? The outcome of this claim construction dispute may be dispositive for a significant portion of the case.
- A key evidentiary question will be one of functional mapping: do the accused Databricks platform features, such as DBU-based resource metering, URL-triggered job execution, and dynamic cluster autoscaling, perform the specific, multi-step functions required by the asserted claims, or is there a fundamental mismatch between the patented methods and the accused technology?
- As all asserted patents have expired, the case is purely a dispute over historical monetary damages. A central question will be one of apportionment and valuation: assuming infringement is found, how can a reasonable royalty be calculated for a complex, multi-component platform based on patents that cover discrete features, some of which were introduced years after the platform's initial launch?