DCT

5:18-cv-03583

PersonalWeb Tech LLC v. Rockethub Inc


I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 5:18-cv-03583, N.D. Cal., 10/04/2018
  • Venue Allegations: The complaint alleges that venue was originally proper in the Southern District of New York, where Defendant Rockethub is incorporated, maintains a place of business, and committed infringing acts. The case is before the Northern District of California as part of a transfer for consolidated pretrial proceedings by the Judicial Panel on Multidistrict Litigation.
  • Core Dispute: Plaintiff alleges that Defendant’s website and associated content delivery systems infringe four patents related to using content-based identifiers to uniquely identify, manage, and control access to data in distributed computer networks.
  • Technical Context: The technology concerns foundational methods for data de-duplication and efficient content retrieval in networks by assigning data items a unique name based on their content, a concept central to modern cloud computing and content delivery networks.
  • Key Procedural History: The complaint notes that the patents-in-suit have been successfully enforced against third parties, resulting in settlements and non-exclusive licenses. It also states that the last of the patents-in-suit has expired, indicating the action is for past damages only.

Case Timeline

Date Event
1995-04-11 Priority Date for all Patents-in-Suit
2005-08-09 U.S. Patent No. 6,928,442 Issued
2010-09-21 U.S. Patent No. 7,802,310 Issued
2011-05-17 U.S. Patent No. 7,945,544 Issued
2012-01-17 U.S. Patent No. 8,099,420 Issued
2018-10-04 First Amended Complaint Filed

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 6,928,442 - "Enforcement and Policing of Licensed Content Using Content-Based Identifiers"

The Invention Explained

  • Problem Addressed: The patent's family history describes the inefficiency of conventional data identification systems, which rely on file names and storage locations. In large, distributed networks, this approach could lead to redundant data storage, increased bandwidth usage, and difficulty in verifying that a retrieved file matches the requested content (Compl. ¶¶17-18).
  • The Patented Solution: The invention proposes a system that replaces conventional naming with "substantially unique," content-based identifiers, termed "True Names" (Compl. ¶¶19, 21). These identifiers are generated by applying a cryptographic hash function (such as MD5 or SHA) to the data item's content (Compl. ¶20). This allows any data item—from a file segment to a full directory—to be identified, located, and managed based solely on its content, independent of its name or location (U.S. Patent No. 5,978,791, col. 3:24–32). The '442 patent applies this core technology to a method for policing licensed content.
  • Technical Importance: This approach provided a foundational method for data de-duplication and efficient retrieval, which are critical for reducing bandwidth and storage requirements in distributed systems like content delivery networks and cloud storage platforms (Compl. ¶16).
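
For illustration only, the content-based naming concept summarized above can be sketched in a few lines. This is a minimal sketch and is not drawn from the patent or the complaint: SHA-256 is an assumed choice of hash (the complaint refers to MD5 or SHA generally), and the function name, file paths, and sample data are hypothetical.

```python
import hashlib
from typing import Dict, List


def content_name(data: bytes) -> str:
    """Return a content-derived identifier (hex SHA-256 digest) for the data."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical files: two share identical bytes under different names and locations.
store = {
    "/srv/a/report.txt": b"quarterly numbers",
    "/mnt/b/copy-of-report.txt": b"quarterly numbers",
    "/srv/a/notes.txt": b"meeting notes",
}

# Group paths by content name: identical content collapses to a single key,
# which is the basis for de-duplication independent of file name or location.
paths_by_name: Dict[str, List[str]] = {}
for path, data in store.items():
    paths_by_name.setdefault(content_name(data), []).append(path)
```

In this sketch the two copies of the report resolve to the same identifier, while the notes file receives a different one.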

Key Claims at a Glance

  • The complaint asserts infringement of at least claims 10 and 11 (Compl. ¶60).
  • Independent claim 10 is a method claim comprising the following essential elements:
    • A method in a system with files distributed across multiple computers.
    • Obtaining a name for a data file based at least in part on a function of the file's contents.
    • Using that name to determine whether a copy of the data file is present on at least one of the computers.
    • Determining whether a present copy of the data file is an unauthorized or unlicensed copy.

U.S. Patent No. 7,802,310 - "Controlling Access to Data in a Data Processing System"

The Invention Explained

  • Problem Addressed: As with the parent technology, the patent addresses the challenges of managing data in complex, distributed networks (Compl. ¶¶17-18). This patent focuses specifically on controlling access to and distribution of that data.
  • The Patented Solution: The invention describes a method and system for using content-dependent names to control access to data. A first computer (e.g., an origin server) receives a request from a second computer (e.g., a browser or cache) that includes a content-dependent name for a data item. The first computer then uses this name to determine whether the content is "unauthorized or unlicensed" and, based on that determination, either permits the content to be provided to or accessed by the second computer or does not permit it ('310 Patent, Claim 20).
  • Technical Importance: The claimed solution provides a mechanism for content validation and access control in distributed networks, ensuring that clients and intermediate caches only serve or render authorized versions of content (Compl. ¶¶16, 57).

Key Claims at a Glance

  • The complaint asserts infringement of at least claims 20 and 69 (Compl. ¶68).
  • Independent claim 20 is a method claim comprising the following essential elements:
    • A computer-implemented method in a multi-computer system.
    • Controlling content distribution from a first computer to a second computer in response to a request from the second computer.
    • The request includes a content-dependent name of a data item, where the name is based on a hash function of the data.
    • Based on the content-dependent name, the first computer permits access if the content is not determined to be unauthorized or unlicensed, and otherwise does not permit access.
  • Independent claim 69 is a system claim comprising hardware and software configured to perform analogous steps, including receiving a request with a content-dependent name and, in response, comparing the name to a plurality of values to determine if access is authorized.

Multi-Patent Capsule: U.S. Patent No. 7,945,544 - "Similarity-Based Access Control of Data in a Data Processing System"

  • Patent Identification: U.S. Patent No. 7,945,544, "Similarity-Based Access Control of Data in a Data Processing System," issued May 17, 2011.
  • Technology Synopsis: This patent describes a method for creating a composite "digital key" for a file (like a webpage) that is composed of multiple parts (like asset files). A first function generates "part values" for each component part based on their content. A second function then operates on these part values to generate the final digital key for the composite file. This key is then used to control access to the file.
  • Asserted Claims: 46, 48, 52, and 55 (Compl. ¶78). Independent claim is 46.
  • Accused Features: The complaint alleges that Defendant generates content-based "fingerprints" for individual asset files (the "part values"), includes these in the URIs referenced by a webpage base file, and then generates a content-based ETag (the "digital key") for the webpage base file itself. This ETag, which is dependent on the fingerprints of the constituent assets, is then used in database lookups to manage caching and access (Compl. ¶¶80-82).
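
The two-stage derivation described in the Technology Synopsis can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claim language or Defendant's implementation: SHA-256 stands in for both functions, the part values are joined with newlines before the second hash, and all names and sample data are hypothetical.

```python
import hashlib
from typing import List


def part_value(part: bytes) -> str:
    """First function (assumed SHA-256): a content-based value for one part."""
    return hashlib.sha256(part).hexdigest()


def digital_key(parts: List[bytes]) -> str:
    """Second function (assumed SHA-256): operates on the part values only."""
    joined = "\n".join(part_value(p) for p in parts)
    return hashlib.sha256(joined.encode("ascii")).hexdigest()


# Hypothetical asset files making up one composite item (e.g., a webpage).
assets = [b"body { color: black; }", b"console.log('hi');"]
key_before = digital_key(assets)

# Changing a single part changes its part value and therefore the composite key.
assets_changed = [b"body { color: red; }", b"console.log('hi');"]
key_after = digital_key(assets_changed)
```

The point of the sketch is only that the composite key depends on the content of every constituent part, which is the dependency the complaint attributes to the base-file ETag.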

Multi-Patent Capsule: U.S. Patent No. 8,099,420 - "Accessing Data in a Data Processing System"

  • Patent Identification: U.S. Patent No. 8,099,420, "Accessing Data in a Data Processing System," issued January 17, 2012.
  • Technology Synopsis: This patent claims a system for selectively permitting access to data items. The system determines one or more content-dependent digital identifiers for a data item and then decides whether to grant access by checking if the identifier corresponds to an entry in one or more databases containing a plurality of authorized identifiers.
  • Asserted Claims: 25, 26, 27, 29, 30, 32, 34-36, and 166 (Compl. ¶89). Independent claim is 166.
  • Accused Features: The complaint alleges that Defendant's web servers maintain databases of ETag values associated with the URIs of webpage files. When a conditional GET request is received with an ETag, the system compares the received ETag with the ETags in its database to determine whether to authorize the downstream cache to use its existing file content or to serve new content (Compl. ¶¶93-94).
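
The lookup described in the Technology Synopsis can be sketched as a membership test against a set of known identifiers. This is a minimal sketch under stated assumptions: an in-memory Python set stands in for the "one or more databases," SHA-256 is an assumed hash, and the function names and sample data are hypothetical.

```python
import hashlib
from typing import Set


def identifier(data: bytes) -> str:
    """Content-dependent digital identifier (assumed: hex SHA-256 digest)."""
    return hashlib.sha256(data).hexdigest()


def access_permitted(data: bytes, authorized: Set[str]) -> bool:
    """Permit access only if the item's identifier appears in the authorized set."""
    return identifier(data) in authorized


# Hypothetical "database" holding the identifiers of authorized items.
authorized_ids = {identifier(b"licensed asset, current version")}
```

Whether a comparison of this kind amounts to an authorization decision, rather than a freshness check, is the disputed question discussed in Sections IV and V.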

III. The Accused Instrumentality

Product Identification

  • The accused instrumentality is the website located at "rockethub.com" and its associated back-end system for storing, managing, and delivering webpage content (Compl. ¶34).

Functionality and Market Context

  • The complaint alleges the "rockethub.com" website utilizes a sophisticated content delivery and caching system to optimize performance and reduce bandwidth (Compl. ¶39). This system is alleged to employ two forms of content-based identifiers. First, it generates "fingerprints" from the content of asset files (e.g., images, scripts) and embeds these fingerprints into the filenames, which are then used in the asset's URI (Compl. ¶¶37, 41). Second, for both webpage base files and certain asset files, it generates content-based "ETag" values, which are transmitted in HTTP headers (Compl. ¶¶36, 45, 47).
  • These identifiers are used to manage caching. When a browser or intermediate cache requests a resource it already has, it sends a conditional GET request with the "ETag" in an "If-None-Match" header (Compl. ¶49). An upstream server compares this "ETag" to its current "ETag" for the resource. If they match, the server responds with an HTTP 304 (Not Modified) message, authorizing the cache to use its local copy. If they do not match, the server sends an HTTP 200 (OK) response with the new content and the new "ETag" (Compl. ¶¶54-55). This process ensures that only current, authorized content is rendered while minimizing redundant data transfer (Compl. ¶¶38-39). The complaint notes that Defendant uses third-party services, such as Amazon S3, which also generate content-based ETags for stored objects on Defendant's behalf (Compl. ¶46).
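
The conditional GET exchange described above can be sketched as a single server-side function. This is a simplified illustration, not Defendant's implementation: it assumes the ETag is a quoted SHA-256 digest of the response body, it compares a single ETag only (real If-None-Match headers may carry lists and weak validators), and the function names and sample pages are hypothetical.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Assumed ETag scheme: quoted hex SHA-256 digest of the body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def handle_get(
    current_body: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(current_body)
    if if_none_match == etag:
        # The cached copy matches the current content: 304, no body sent.
        return 304, {"ETag": etag}, b""
    # No ETag supplied, or it is stale: 200 with the current content and ETag.
    return 200, {"ETag": etag}, current_body


page_v1 = b"<html>version 1</html>"
page_v2 = b"<html>version 2</html>"

# First request: the cache has nothing, so it receives the body and an ETag.
status, headers, body = handle_get(page_v1)
cached_etag = headers["ETag"]

# Revalidation while the page is unchanged, then after the page changes.
revalidate_status, _, _ = handle_get(page_v1, cached_etag)
changed_status, _, changed_body = handle_get(page_v2, cached_etag)
```

In the sketch, the unchanged page revalidates with a 304 and the changed page returns a 200 with the new body, which mirrors the behavior the complaint characterizes as authorizing or not authorizing use of the cached copy.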

No probative visual evidence provided in complaint.

IV. Analysis of Infringement Allegations

U.S. Patent No. 6,928,442 Infringement Allegations

Claim Element (from Independent Claim 10) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
...obtaining a name for a data file, the name being based at least in part on a given function of the data, wherein the data used by the function comprises the contents of the particular file. | Defendant's system generates or obtains ETag values for its webpage base files and asset files. These ETags are generated using a hash function based on the contents of the respective files. | ¶62 | U.S. Patent No. 5,978,791, col. 14:1-12
...determining, using at least the name, whether a copy of the data file is present on at least one of said computers. | In response to a conditional GET request containing an ETag in an If-None-Match header, Defendant's origin or intermediate cache servers compare the received ETag with the stored ETag for the requested URI to determine if content with that ETag is present. | ¶63 | U.S. Patent No. 5,978,791, col. 15:35-44
...determining whether a copy of the data file that is present... is an unauthorized copy or an unlicensed copy of the data file. | If the received ETag matches the server's stored ETag, the server determines the copy is authorized. If there is no match, the server determines the copy at the downstream cache is unauthorized. | ¶64 | '442 Patent, Abstract
  • Identified Points of Contention:
    • Scope Questions: A central question may be whether a standard HTTP "ETag", used for cache validation, constitutes a "name for a data file" as contemplated by the patent. A further question is whether a standard caching check for data freshness constitutes "determining whether a copy...is an unauthorized copy or an unlicensed copy," particularly given the patent's title concerning "Licensed Content."
    • Technical Questions: What evidence does the complaint provide that the ETag comparison performs a licensing or authorization function beyond simple version control? The defense may argue that this is a routine and non-infringing implementation of the HTTP standard for performance, not a method for policing content rights.

U.S. Patent No. 7,802,310 Infringement Allegations

Claim Element (from Independent Claim 20) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
...controlling distribution of content from a first computer to at least one other computer, in response to a request...the request including at least a content-dependent name... | An upstream server (first computer) controls distribution of content to a downstream browser or cache (second computer) in response to a conditional GET request that includes an ETag (content-dependent name) in an If-None-Match header. | ¶70 | U.S. Patent No. 5,978,791, col. 15:10-28
...based at least in part on said content-dependent name..., the first device (A) permitting the content to be provided...if it is not determined that the content is unauthorized or unlicensed... | The upstream server compares the received ETag to its stored ETag. If they match, it sends an HTTP 304 response, thereby permitting the use of the cached content. If they do not match, it sends an HTTP 200 response with new content, thereby not permitting the use of the old content, which is deemed unauthorized. | ¶71 | '310 Patent, col. 29:3-23
  • Identified Points of Contention:
    • Scope Questions: Does sending an HTTP 304 or 200 response in a standard caching protocol amount to "permitting" or "not permitting" content to be provided in the manner of an access control system? The interpretation of these terms will be critical.
    • Technical Questions: The dispute may turn on whether the function performed by the accused system—cache validation—is technically equivalent to the claimed function of determining whether content is "unauthorized or unlicensed." The complaint equates an outdated file with an "unauthorized" one, a characterization that will likely be contested.

V. Key Claim Terms for Construction

  • The Term: "unauthorized copy or an unlicensed copy" ('442 Patent); "content is unauthorized or unlicensed" ('310 Patent)
  • Context and Importance: The construction of this term is central to the dispute. The complaint equates an outdated cached file with an "unauthorized" file (Compl. ¶¶64, 71). Practitioners may focus on this term because if it is construed narrowly to require a determination related to intellectual property rights (e.g., copyright, licensing), the infringement case may be significantly weakened. If construed broadly to include "not the most current version," the allegations may more closely map to the accused caching functionality.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The '310 patent specification states that before using a cached item, a client "must either reload the cached item, be informed of changes to the cached item, or confirm that the master item... has not changed." ('310 Patent, col. 2:62-66, via incorporation of U.S. Patent No. 5,978,791). This language focuses on data consistency and change, which may support an interpretation where "unauthorized" means "outdated."
    • Evidence for a Narrower Interpretation: The titles of the patents—"Enforcement and Policing of Licensed Content" ('442 Patent) and "Controlling Access to Data" ('310 Patent)—and the repeated use of the word "unlicensed" suggest an intent focused on rights management, not merely version control. The '442 patent's abstract explicitly mentions providing a file "only to licensed (or authorized) parties." This language may support a narrower construction tied to user permissions or content licenses.
  • The Term: "name for a data file" ('442 Patent); "content-dependent name" ('310 Patent)
  • Context and Importance: This term's definition will determine if standard web identifiers like ETags and fingerprinted filenames fall within the claims. The complaint alleges that both ETags and fingerprints in URIs are infringing "names" (Compl. ¶¶62, 70, 74).
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The specification of the parent '791 patent, incorporated by reference, describes the "name" as the output of a "message digest function" like MD5 or SHA ('791 Patent, col. 13:9-14). This functional definition could support construing any content-derived hash, including an ETag, as the claimed "name."
    • Evidence for a Narrower Interpretation: The specification consistently refers to the generated identifier as a "True Name" and describes it as part of a system-wide mechanism that can replace conventional naming conventions ('791 Patent, col. 5:11-20). A defendant may argue that ETags and fingerprinted filenames are not "True Names" because they supplement, rather than replace, traditional location-based URIs and serve a more limited purpose of cache validation.

VI. Other Allegations

The complaint does not provide sufficient detail for analysis of indirect infringement. The infringement counts are based on direct infringement by making, using, selling, or offering for sale the accused system, and/or by controlling the distribution of webpage content through that system (Compl. ¶¶60, 68, 78, 89).

VII. Analyst’s Conclusion: Key Questions for the Case

  • A core issue will be one of definitional scope: can the patent claims, which are directed at systems for "policing licensed content" and "controlling access," be construed to cover the widespread and standardized use of HTTP ETags and content-hashed filenames for the purpose of cache validation and performance optimization?
  • A key factual question will be one of technical purpose: does the accused "rockethub.com" system function as a rights-management or access-control mechanism as described in the patents, or is its use of content-based identifiers a conventional, non-infringing implementation of web standards designed to ensure data freshness and reduce bandwidth?