DCT
5:18-cv-05624
PersonalWeb Tech LLC v. Upwork Global Inc
Amended Complaint
I. Executive Summary and Procedural Information
- Parties & Counsel:
- Plaintiffs: PersonalWeb Technologies, LLC (Texas) and Level 3 Communications, LLC (Delaware)
- Defendant: Upwork Global Inc. (California)
- Plaintiff’s Counsel: Stubbs, Alderton & Markiles, LLP; Maceiko IP; David D. Wier
- Case Identification: 5:18-cv-05624, N.D. Cal., 10/04/2018
- Venue Allegations: Venue is alleged to be proper because Defendant is incorporated in California, has a regular and established place of business in the Northern District of California, and has committed alleged acts of infringement in the district.
- Core Dispute: Plaintiff alleges that Defendant’s website infrastructure infringes four patents related to using content-based identifiers to manage and efficiently distribute data across computer networks.
- Technical Context: The technology involves generating unique identifiers for data files based on their content (e.g., through cryptographic hashing) to improve the efficiency of data storage, retrieval, and caching in distributed systems like the internet.
- Key Procedural History: The complaint notes that the patents-in-suit have expired and that the infringement allegations pertain to the time period before expiration. It also states that Plaintiff PersonalWeb has previously enforced these patents against third parties, resulting in settlements and non-exclusive licenses.
Case Timeline
| Date | Event |
|---|---|
| 1995-04-11 | Earliest priority date for all Patents-in-Suit |
| 2005-08-09 | U.S. Patent No. 6,928,442 Issued |
| 2010-09-21 | U.S. Patent No. 7,802,310 Issued |
| 2011-05-17 | U.S. Patent No. 7,945,544 Issued |
| 2012-01-17 | U.S. Patent No. 8,099,420 Issued |
| 2018-10-04 | Complaint Filed |
II. Technology and Patent(s)-in-Suit Analysis
U.S. Patent No. 6,928,442 - "Enforcement and Policing of Licensed Content Using Content-Based Identifiers," issued August 9, 2005 (’442 Patent)
The Invention Explained
- Problem Addressed: The patent’s background describes the inefficiency of conventional data identification systems, where data is named relative to a specific context (e.g., a file path). In large distributed networks, this can lead to redundant data copies, difficulty verifying data integrity, and challenges in controlling access (Compl. ¶13; ’442 Patent, col. 1:11-2:12).
- The Patented Solution: The invention proposes a system that identifies data items using a “substantially unique” identifier derived directly from the content of the data itself, using a function like a cryptographic hash (Compl. ¶14-15). This content-based identifier, termed a "True Name," allows any data item to be located, managed, and verified based only on its content, independent of its name or location (’442 Patent, Abstract; col. 3:29-37).
- Technical Importance: This content-centric approach to data identification reduces bandwidth and storage needs, forming a foundational concept for modern cloud computing and content delivery networks (Compl. ¶11).
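For illustration only, the content-based naming concept described above can be sketched in a few lines of Python. This is a minimal sketch, not the patent's implementation: the choice of SHA-256, the function name `content_name`, and the file paths are all assumptions made for the example. The patent refers only to a function such as a cryptographic hash.

```python
import hashlib


def content_name(data: bytes) -> str:
    """Return an identifier derived only from the bytes of a data item.

    SHA-256 is an assumed choice; any hash with a negligible collision
    rate would serve the same illustrative purpose.
    """
    return hashlib.sha256(data).hexdigest()


# Hypothetical data items: two identical copies stored under different
# paths, plus an edited version stored under the original path.
copy_a = {"path": "/srv/site/logo.png", "data": b"example image bytes"}
copy_b = {"path": "/backup/2018/logo-old.png", "data": b"example image bytes"}
edited = {"path": "/srv/site/logo.png", "data": b"example image bytes v2"}

# Identical content yields the same name regardless of path or filename.
same = content_name(copy_a["data"]) == content_name(copy_b["data"])

# Changed content yields a different name even though the path is unchanged.
changed = content_name(copy_a["data"]) != content_name(edited["data"])
```

In this sketch the identifier depends on nothing but the content, which is the property the patent relies on to detect duplicate copies and verify integrity independent of name or location.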
Key Claims at a Glance
- The complaint asserts independent claim 10 and dependent claim 11 (Compl. ¶55).
- Independent Claim 10 recites a method with the following essential elements:
- In a system of distributed files across multiple computers;
- Obtaining a name for a data file based on a function of the file's contents;
- Using that name to determine if a copy of the data file is present on one of the computers; and
- Determining if a present copy is an "unauthorized copy or an unlicensed copy" of the data file.
U.S. Patent No. 7,802,310 - "Controlling Access to Data in a Data Processing System," issued September 21, 2010 (’310 Patent)
The Invention Explained
- Problem Addressed: The technology addresses the need to control the distribution of and access to data in a network of computers, particularly in caching scenarios where it is critical to ensure that users access the correct, authorized version of content (’310 Patent, col. 3:1-12).
- The Patented Solution: The patent describes a method where a first computer (e.g., an origin server) receives a request for a data item from a second computer (e.g., a browser or cache server). The request includes a "content-dependent name" (e.g., a hash or ETag). The first computer uses this name to determine if the content held by the second computer is authorized or licensed. Based on this determination, it either permits access to the existing content or provides updated, authorized content (’310 Patent, Abstract).
- Technical Importance: The invention provides a mechanism for efficient cache validation and access control in distributed computing environments, ensuring data integrity while minimizing unnecessary data transfers (Compl. ¶34).
Key Claims at a Glance
- The complaint asserts independent claims 20 and 69 (Compl. ¶63).
- Independent Claim 20 recites a method with the following essential elements:
- In a system with multiple computers, controlling the distribution of content from a first computer to another;
- This control is in response to a request from the second computer that includes a "content-dependent name" of a data item, where the name is based on a hash function of the item's data;
- Based on this name, the first computer either (A) permits the content to be provided if it is not determined to be unauthorized or unlicensed, or (B) does not permit the content to be provided if it is determined to be unauthorized or unlicensed.
Multi-Patent Capsule: U.S. Patent No. 7,945,544 - "Similarity-Based Access Control of Data in a Data Processing System," issued May 17, 2011 (’544 Patent)
- Technology Synopsis: The patent describes a method for generating a "digital key" for a file composed of multiple parts (e.g., a webpage base file and its asset files). The key for the composite file is determined by a function of "part values," where each part value is itself generated from the content of a constituent part. This hierarchical hashing allows for efficient similarity-based access control, as a change in a single part will alter the final key for the entire file.
- Asserted Claims: 46, 48, 52, and 55 (Compl. ¶73).
- Accused Features: The complaint alleges that Defendant's system generates an ETag (a digital key) for a webpage base file by applying a hash function to the base file's contents, which themselves include the URIs of various asset files. The asset file URIs contain fingerprints (part values) generated from the content of those asset files (Compl. ¶75-76).
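The accused mechanism as the complaint describes it (asset fingerprints embedded in URIs, which in turn feed the base file's ETag) can be sketched as follows. This is a hedged illustration of the allegation, not Defendant's actual code: the hash function, the 12-character fingerprint length, the `/assets/` URI layout, and every function name are assumptions, and asset names are assumed to contain a file extension.

```python
import hashlib
from typing import Mapping


def fingerprint(data: bytes) -> str:
    """Hypothetical "part value": a short hash of one asset's content."""
    return hashlib.sha256(data).hexdigest()[:12]


def fingerprinted_uri(name: str, data: bytes) -> str:
    """Embed the asset's fingerprint in its URI (assumed URI layout)."""
    stem, dot, ext = name.rpartition(".")
    return f"/assets/{stem}-{fingerprint(data)}{dot}{ext}"


def render_base_file(assets: Mapping[str, bytes]) -> bytes:
    """Build a toy base file whose content references each asset URI."""
    lines = [
        f'<link href="{fingerprinted_uri(name, data)}">'
        for name, data in sorted(assets.items())
    ]
    return "\n".join(lines).encode()


def etag(body: bytes) -> str:
    """Hypothetical "digital key": a hash over the base file's content."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


assets_v1 = {"site.css": b"body { color: black; }", "app.js": b"console.log(1);"}
assets_v2 = {**assets_v1, "site.css": b"body { color: navy; }"}

# Changing one asset changes its URI, then the base file, then the ETag.
etag_v1 = etag(render_base_file(assets_v1))
etag_v2 = etag(render_base_file(assets_v2))
```

Here only `site.css` changes between the two versions, so the URI for `app.js` stays the same while the base file's ETag differs. This is the propagation the complaint maps onto the claimed function of part values.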
Multi-Patent Capsule: U.S. Patent No. 8,099,420 - "Accessing Data in a Data Processing System," issued January 17, 2012 (’420 Patent)
- Technology Synopsis: The patent describes a system for accessing data by using content-dependent digital identifiers. The system determines one or more such identifiers for a data item and then selectively permits the data item to be made available based on whether its identifier corresponds to an entry in one or more databases.
- Asserted Claims: 25, 26, 27, 29, 30, 32, 34–36, and 166 (Compl. ¶84).
- Accused Features: Defendant's system allegedly uses content-dependent identifiers (ETags and fingerprints) for its webpage files and maintains databases (on web servers) that map these identifiers to file URIs. The system uses this database to selectively permit downstream caches to access authorized file content via conditional GET requests and HTTP 304/200 responses (Compl. ¶87-89).
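The database lookup alleged for the ’420 Patent can be illustrated with a small sketch. It assumes an in-memory dictionary standing in for the "database," SHA-256 as the identifier function, and hypothetical names (`identifier`, `build_database`, `lookup`). None of these details come from the complaint.

```python
import hashlib
from typing import Dict, Optional


def identifier(data: bytes) -> str:
    """Content-dependent identifier for a file (assumed: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


def build_database(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each file's content-dependent identifier to its URI."""
    return {identifier(data): uri for uri, data in files.items()}


def lookup(database: Dict[str, str], presented_id: str) -> Optional[str]:
    """Return the URI for a known identifier, or None if there is no entry."""
    return database.get(presented_id)


# Hypothetical site content currently held by the server.
site = {"/index.html": b"<h1>v2</h1>", "/about.html": b"<h1>About</h1>"}
database = build_database(site)

# An identifier for current content corresponds to a database entry.
current_hit = lookup(database, identifier(b"<h1>v2</h1>"))

# An identifier for superseded content has no corresponding entry.
stale_hit = lookup(database, identifier(b"<h1>v1</h1>"))
```

Whether a lookup of this kind amounts to "selectively permitting" access, as the claims require, is a question of claim scope that the sketch does not resolve.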
III. The Accused Instrumentality
Product Identification
- Defendant’s website, formerly operated at odesk.com, and its underlying content delivery and caching infrastructure (Compl. ¶29).
Functionality and Market Context
- The complaint alleges the accused website functions as a content distribution system that uses content-based identifiers to manage caching and ensure efficient delivery of webpages (Compl. ¶30, ¶33-34). Two primary mechanisms are described:
- Content-Based ETags: The system generates and serves "ETag" values based on the content of webpage base files and asset files. Downstream browsers and cache servers use these ETags in conditional HTTP "GET" requests (via the "If-None-Match" header) to ask the server if their cached version of a file is still valid (Compl. ¶31, ¶47).
- URI Fingerprinting: The system generates a "fingerprint" (hash) based on the content of an asset file (e.g., a stylesheet or image) and embeds that fingerprint into the asset file's URI. When the asset's content changes, a new fingerprint is generated, resulting in a new URI. This change propagates to any webpage base file that references the asset, which in turn causes the base file's own content and ETag to change (Compl. ¶32, ¶36-37, ¶39).
- The complaint alleges that by using these methods, Defendant reduces bandwidth and computation requirements for its servers, as content is served from the nearest cache and only files with changed content are transmitted (Compl. ¶34). The complaint provides no probative visual evidence.
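The conditional-request exchange described above can be sketched as a simplified server-side handler. This is an illustrative model under stated assumptions, not Defendant's system: the handler name `handle_get`, the in-memory `store`, and the use of SHA-256 for the ETag are hypothetical. Real HTTP also allows weak validators and multiple entity tags in "If-None-Match", which this sketch omits.

```python
import hashlib
from typing import Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], bytes]


def make_etag(body: bytes) -> str:
    """Content-based ETag (assumed: quoted SHA-256 hex digest)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def handle_get(
    store: Dict[str, bytes], uri: str, if_none_match: Optional[str] = None
) -> Response:
    """Answer a GET, honoring a single ETag in If-None-Match."""
    body = store.get(uri)
    if body is None:
        return 404, {}, b""
    etag = make_etag(body)
    if if_none_match == etag:
        # The cached copy matches current content: no body is sent.
        return 304, {"ETag": etag}, b""
    # No validator, or the content changed: send the full current content.
    return 200, {"ETag": etag}, body


store = {"/index.html": b"<h1>v1</h1>"}

# First request: full response with an ETag the client can cache.
status1, headers1, body1 = handle_get(store, "/index.html")

# Revalidation with the cached ETag while content is unchanged.
status2, _, body2 = handle_get(store, "/index.html", headers1["ETag"])

# Content changes on the server; the old ETag no longer matches.
store["/index.html"] = b"<h1>v2</h1>"
status3, headers3, body3 = handle_get(store, "/index.html", headers1["ETag"])
```

In this model a matching ETag produces a 304 response with an empty body, and a non-matching ETag produces a 200 response carrying the new content. Those are the two outcomes the complaint maps onto the claims.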
IV. Analysis of Infringement Allegations
’442 Patent Infringement Allegations
| Claim Element (from Independent Claim 10) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| obtaining a name for a data file, the name being based at least in part on a given function of the data... | Defendant generates or obtains "ETag" values for its webpage base files and asset files. These ETags are generated using a hash function and are based on the contents of the respective files. | ¶57 | ’442 Patent, Abstract |
| determining, using at least the name, whether a copy of the data file is present on at least one of said computers. | An origin or intermediate cache server receives a conditional "GET" request containing a URI and an "ETag". The server determines if it has a file corresponding to that URI and compares the received "ETag" to the "ETag" it has stored. | ¶58 | ’442 Patent, col. 6:3-6 |
| determining whether a copy of the data file that is present... is an unauthorized copy or an unlicensed copy of the data file. | If the "ETag" from the request matches the server's stored "ETag", the cached copy is determined to be "authorized." If there is no match, the cached copy is determined to be "unauthorized." | ¶59 | ’442 Patent, Abstract |
- Identified Points of Contention:
- Scope Questions: A central issue may be whether a standard HTTP "ETag" used for cache validation constitutes the "name" for a data file as contemplated by the patent. Furthermore, the claim recites determining if a copy is "unauthorized or unlicensed," which suggests a context of content licensing or access control, whereas the complaint's theory maps this limitation to cache coherency (i.e., an outdated file is "unauthorized").
- Technical Questions: What evidence does the complaint provide that the accused system’s ETag comparison functionally meets the claim limitation of determining whether a copy is "unlicensed"? The infringement theory appears to equate a technical state (cache invalidity) with a legal or administrative status (unlicensed).
’310 Patent Infringement Allegations
| Claim Element (from Independent Claim 20) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| controlling distribution of content from a first computer to at least one other computer, in response to a request...the request including at least a content-dependent name... | An upstream server (first computer) controls content distribution to a downstream cache or browser (second computer) in response to a conditional "GET" request. The request includes an "ETag", which is a content-dependent name based on a hash function. | ¶65 | ’310 Patent, Abstract |
| based at least in part on said content-dependent name...the first device (A) permitting the content to be provided...if it is not determined that the content is unauthorized or unlicensed, otherwise, (B)...not permitting the content... | The upstream server compares the received "ETag" to its stored "ETag". If they match, it permits use of the cached content by sending an HTTP "304" response. If they do not match, it does not permit use of the old content and instead provides new content via an HTTP "200" response. | ¶66 | ’310 Patent, Abstract |
- Identified Points of Contention:
- Scope Questions: Similar to the ’442 Patent, the dispute may center on whether the terms "content-dependent name" and "unauthorized or unlicensed" can be construed to read on the use of standard "ETags" for cache validation.
- Technical Questions: Does sending an HTTP "304 Not Modified" response constitute "permitting the content to be provided to or accessed by" the other computer, as required by the claim? A defendant might argue that this response merely confirms a status and does not actively "permit" access, which is instead governed by the browser's own caching logic.
V. Key Claim Terms for Construction
The Term: "content-dependent name" (’310 Patent) / "name for a data file, the name being based at least in part on a given function of the data" (’442 Patent)
- Context and Importance: This term is the lynchpin of the infringement allegation. The construction will determine whether standard web technologies like ETags and fingerprinted URIs fall within the scope of the claims. Practitioners may focus on this term because the patents frequently use the specific term "True Name," suggesting the inventors may have had a more specific system in mind than general-purpose web caching identifiers.
- Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The claims define the name as being based on a "function of at least some of the data," where the function can be a "message digest function or a hash function" (’310 Patent, Claim 20). This broad functional language could support an interpretation that covers any content-based hash, including those used to generate ETags.
- Evidence for a Narrower Interpretation: The specifications of the related patents extensively describe a specific ecosystem built around the concept of a "True Name," including data structures like a "True File Registry" (’442 Patent, FIG. 4, col. 8:29-37). A defendant may argue that the claim term should be interpreted in light of these specific embodiments and limited to identifiers used within such a proprietary management system.
The Term: "unauthorized or unlicensed" (’310 and ’442 Patents)
- Context and Importance: The plaintiff’s infringement theory equates a non-matching ETag—signifying an outdated cached file—with content that is "unauthorized or unlicensed." The viability of this theory depends entirely on the construction of this term.
- Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The claims do not explicitly define the reason for the "unauthorized" status. Plaintiff could argue that from the perspective of the content owner, distribution of outdated content is not authorized, and the term should be given its plain meaning in that context.
- Evidence for a Narrower Interpretation: The title of the ’442 Patent is "Enforcement and Policing of Licensed Content..." and its abstract discusses providing files only to "licensed (or authorized) parties." This intrinsic evidence strongly suggests the term relates to rights management, such as copyright licensing or user access permissions, rather than technical cache coherency.
VI. Other Allegations
The complaint makes no allegations of indirect or willful infringement.
VII. Analyst’s Conclusion: Key Questions for the Case
- A core issue will be one of definitional scope: can the patents' central concept of a "content-dependent name" or "True Name," developed in the context of a proprietary data management system, be construed to encompass standard, publicly-defined web technologies like HTTP "ETags" and fingerprinted URIs used for cache validation?
- A second key issue will be one of functional equivalence: does the accused system's process of checking cache validity—where a non-matching "ETag" signifies an outdated file—perform the same function claimed by the patents as "determining whether a copy...is unauthorized or unlicensed," a phrase whose context in the patent specification appears rooted in access rights and content licensing?