DCT

5:18-cv-05203

Personal Web Tech LLC v. Startdate Labs Inc

Key Events
Amended Complaint

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 5:18-cv-05203, N.D. Cal., 10/04/2018
  • Venue Allegations: Venue is alleged to be proper in the Northern District of California because the action was transferred to the district by the Judicial Panel on Multidistrict Litigation for consolidated pretrial proceedings.
  • Core Dispute: Plaintiff alleges that Defendant’s website and associated content delivery architecture infringe four expired patents related to the use of content-based identifiers for managing and distributing data in computer networks.
  • Technical Context: The technology concerns using cryptographic hashes to uniquely identify data content, a foundational technique for modern cloud computing, data deduplication, and Content Delivery Networks (CDNs) to ensure data integrity and reduce bandwidth usage.
  • Key Procedural History: The complaint states that the asserted patents have expired and that the action seeks damages incurred during the life of the patents. The case is part of a multidistrict litigation (MDL) proceeding consolidating numerous similar cases filed by the Plaintiff. Plaintiff also notes a history of successfully enforcing the patents-in-suit, resulting in settlements and non-exclusive licenses.

Case Timeline

Date | Event
1995-04-11 | Earliest Priority Date for all Patents-in-Suit
2005-08-09 | U.S. Patent No. 6,928,442 Issued
2010-09-21 | U.S. Patent No. 7,802,310 Issued
2011-05-17 | U.S. Patent No. 7,945,544 Issued
2012-01-17 | U.S. Patent No. 8,099,420 Issued
2018-10-04 | Complaint Filing Date

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 6,928,442 - “Enforcement and Policing of Licensed Content Using Content-Based Identifiers”

  • Patent Identification: U.S. Patent No. 6,928,442, entitled “Enforcement and Policing of Licensed Content Using Content-Based Identifiers,” issued on August 9, 2005.

The Invention Explained

  • Problem Addressed: The patent’s background describes the inefficiency of conventional data management in expanding computer networks, where file names are not tied to file content (Compl. ¶15). This decoupling could lead to network congestion from duplicate data, difficulty locating specific files, and an inability to verify that a received file matches the requested file (’442 Patent, col. 1:13-25).
  • The Patented Solution: The invention proposes replacing conventional file names with content-based identifiers, called “True Names,” generated by applying a cryptographic hash function (such as MD5 or SHA) to the data itself (Compl. ¶¶16-18). This system allows any data item to be uniquely identified, located, and managed based purely on its content, irrespective of its name or location. The patent describes using these identifiers to police a network for unauthorized or unlicensed copies of content ('442 Patent, col. 2:50-63).
  • Technical Importance: This content-centric approach to data identification provided a method to reduce redundant data storage and transmission while ensuring data integrity across distributed systems, a key principle underlying modern content delivery and cloud storage architectures (Compl. ¶13).
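
As an illustration of the content-based naming described above, the following minimal Python sketch derives an identifier from a file's bytes alone. It is not drawn from the patent or the complaint: the function name true_name, the choice of SHA-256 (the complaint refers to hash functions such as MD5 or SHA), and the sample file names are all assumptions made for the example.

```python
import hashlib


def true_name(data: bytes) -> str:
    """Return a content-based identifier: the SHA-256 hex digest of the data.

    Hypothetical helper for illustration; the hash function is an assumption.
    """
    return hashlib.sha256(data).hexdigest()


# Two files with different conventional names but identical contents receive
# the same identifier; a one-character change produces a different identifier.
files = {
    "report_final.txt": b"quarterly results",
    "copy_of_report.txt": b"quarterly results",
    "report_draft.txt": b"quarterly results!",
}
names = {filename: true_name(contents) for filename, contents in files.items()}
```

Because the identifier depends only on the bytes, a system holding such identifiers could detect duplicates or confirm that a received file matches the requested one without relying on file names or locations.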

Key Claims at a Glance

  • The complaint asserts independent claim 10 (Compl. ¶57).
  • The essential elements of independent claim 10 are:
    • A method in a system with files distributed across multiple computers.
    • Obtaining a name for a data file based on a function of the file's contents.
    • Using that name to determine if a copy of the data file is present on at least one of the computers.
    • Determining if the present copy is an "unauthorized copy or an unlicensed copy."
  • The complaint also asserts dependent claim 11 (Compl. ¶57).

U.S. Patent No. 7,802,310 - “Controlling Access to Data in a Data Processing System”

  • Patent Identification: U.S. Patent No. 7,802,310, entitled “Controlling Access to Data in a Data Processing System,” issued on September 21, 2010.

The Invention Explained

  • Problem Addressed: The patent addresses the same general problem of managing data in distributed systems where identifiers are not tied to content (’310 Patent, col. 1:21-36).
  • The Patented Solution: The invention describes a method and system for controlling the distribution of content between computers using content-dependent names ('310 Patent, Abstract). A requesting computer sends a request that includes a content-dependent name (e.g., a hash). A receiving computer compares this name to a set of values and, based on the comparison, determines whether access is authorized, permitting or denying the content accordingly ('310 Patent, col. 26:50-59). This framework is suited for managing cache validity and access control.
  • Technical Importance: The technology provides a systematic way to use content hashes to validate cached data and control access in a distributed network, which is fundamental to the operation of efficient content delivery networks (Compl. ¶¶35-36).
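
The request-and-compare flow summarized above can be sketched roughly as follows. This is only an illustrative Python sketch under stated assumptions: the names content_name and access_permitted, the use of SHA-256, and the idea of holding the comparison values in an in-memory set are hypothetical and are not taken from the '310 Patent or the complaint.

```python
import hashlib


def content_name(data: bytes) -> str:
    """Content-dependent name for a data item (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def access_permitted(requested_name: str, not_permitted: set) -> bool:
    """Compare the requested name to a set of values.

    Access is permitted unless the name matches a value that has been
    marked as not permitted (hypothetical policy for illustration).
    """
    return requested_name not in not_permitted


# Hypothetical set of values maintained by the receiving computer.
not_permitted = {content_name(b"withdrawn item")}

allowed = access_permitted(content_name(b"ordinary item"), not_permitted)
denied = access_permitted(content_name(b"withdrawn item"), not_permitted)
```

The sketch mirrors the claim structure in which access is allowed whenever the content is not determined to be unauthorized, rather than requiring an affirmative finding of authorization.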

Key Claims at a Glance

  • The complaint asserts independent claims 20 and 69 (Compl. ¶65).
  • The essential elements of independent claim 20 (method) include:
    • Controlling content distribution from a first computer to another in response to a request from the other computer.
    • The request includes a "content-dependent name" (e.g., a hash) of a data item.
    • Based on that name, the first computer permits access if the content is not determined to be "unauthorized or unlicensed," and otherwise does not permit access.
  • The essential elements of independent claim 69 (system) include:
    • A system with hardware and software to receive a request from a second computer at a first computer, the request including a content-dependent name.
    • In response, the system compares the name to a plurality of values.
    • Based on the comparison, it determines if access is "authorized or unauthorized" and allows access if not determined to be unauthorized.
  • The complaint also asserts dependent claims (Compl. ¶65).

U.S. Patent No. 7,945,544 - “Similarity-Based Access Control of Data in a Data Processing System”

  • Patent Identification: U.S. Patent No. 7,945,544, entitled “Similarity-Based Access Control of Data in a Data Processing System,” issued on May 17, 2011.
  • Technology Synopsis: This patent describes a hierarchical method for generating a content-based identifier, or "digital key," for a data file. The method involves first applying a function (e.g., a hash) to individual "parts" of a file to generate "part values." A second function is then applied to these part values to generate the final digital key for the entire file (’544 Patent, Abstract). This two-level approach facilitates efficient data comparison and management, particularly for large or composite files.
  • Asserted Claims: The complaint asserts independent claim 46 (Compl. ¶75).
  • Accused Features: The complaint alleges that Defendant’s system of generating ETags for webpages infringes. This is allegedly done by first generating "fingerprints" (hashes) for individual asset files (the "parts") and embedding them in URIs, and then applying a second hash function to the webpage base file (which contains these URIs) to create the final ETag (the "digital key") (Compl. ¶¶77-78).
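
The two-level scheme described in the synopsis can be illustrated with a short Python sketch. The function names part_values and digital_key are hypothetical, SHA-256 stands in for both functions, and combining the part values by simple concatenation is an assumption; neither the patent nor the complaint is quoted as specifying these details.

```python
import hashlib


def part_values(parts: list) -> list:
    """First level: apply a hash to each part of the file."""
    return [hashlib.sha256(part).digest() for part in parts]


def digital_key(parts: list) -> str:
    """Second level: hash the concatenated part values into one key."""
    combined = b"".join(part_values(parts))
    return hashlib.sha256(combined).hexdigest()


# Changing any single part changes its part value and therefore the key.
original = digital_key([b"header", b"body", b"footer"])
same = digital_key([b"header", b"body", b"footer"])
modified = digital_key([b"header", b"BODY", b"footer"])
```

In the complaint's mapping, the asset-file fingerprints would play the role of the part values and the webpage ETag the role of the digital key; the sketch shows only the general two-step structure, not the accused implementation.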

U.S. Patent No. 8,099,420 - “Accessing Data in a Data Processing System”

  • Patent Identification: U.S. Patent No. 8,099,420, entitled “Accessing Data in a Data Processing System,” issued on January 17, 2012.
  • Technology Synopsis: This patent claims a system for data access control using content-based identifiers. The system is described as determining one or more content-dependent digital identifiers for a data item and then selectively permitting access to that item based on whether the identifier corresponds to an entry in one or more databases (’420 Patent, Abstract).
  • Asserted Claims: The complaint asserts independent claim 166 (Compl. ¶86).
  • Accused Features: Defendant's system is accused of infringing by generating content-dependent identifiers (ETags and file fingerprints) and using them to control access to content. The system allegedly compares a received ETag in a conditional GET request with ETags stored in a database to determine whether a downstream cache should serve its existing content or must retrieve new, authorized content from an upstream server (Compl. ¶¶89, 91).

III. The Accused Instrumentality

Product Identification

  • The website "startwire.com" and its associated systems, methods, and infrastructure for providing webpage content to users (Compl. ¶31).

Functionality and Market Context

  • The complaint alleges the accused instrumentality uses a system to manage the distribution and caching of its webpage content efficiently (Compl. ¶32). This system is alleged to employ standard web technologies, including "conditional" HTTP GET requests that use an "If-None-Match" header to pass a content-based "ETag" value (Compl. ¶33). Additionally, the system allegedly inserts content-based "fingerprints" directly into the filenames of asset files (e.g., JavaScript or CSS files), such that a change in the file's content results in a new fingerprint and thus a new filename and URI (Compl. ¶¶34, 39). When a browser or intermediate cache requests a resource, it provides the ETag or requests the fingerprinted URI; the server then determines if the content is current. If it is, the server sends an HTTP 304 (Not Modified) response, instructing the cache to use its local copy. If the content has changed (and thus has a new ETag), the server sends an HTTP 200 response with the new content and the new ETag (Compl. ¶¶51-52). This process is alleged to reduce bandwidth and server load by avoiding the re-transmission of unchanged files (Compl. ¶36). The complaint provides no probative visual evidence.
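
The conditional-request behavior alleged above follows the standard HTTP pattern, which can be sketched in a few lines of Python. This is a simplified illustration, not the accused system: the helper names etag_for, respond, and fingerprinted_name are hypothetical, SHA-256 is an assumed hash, and the sketch handles only a single exact-match ETag rather than the full If-None-Match rules (multiple values, weak validators).

```python
import hashlib
from typing import Optional, Tuple


def etag_for(body: bytes) -> str:
    """Content-based ETag: the quoted hex digest of the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(current_body: bytes, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Return (status, etag, body) for a GET, honoring If-None-Match."""
    current_etag = etag_for(current_body)
    if if_none_match == current_etag:
        # Cached copy is current: 304 Not Modified, no body re-sent.
        return 304, current_etag, b""
    # No ETag supplied, or content has changed: send the new content.
    return 200, current_etag, current_body


def fingerprinted_name(filename: str, body: bytes) -> str:
    """Embed a short content digest in an asset filename (e.g. app-1a2b3c4d.js)."""
    stem, dot, ext = filename.rpartition(".")
    digest = hashlib.sha256(body).hexdigest()[:8]
    return f"{stem}-{digest}{dot}{ext}" if dot else f"{filename}-{digest}"


# First request carries no ETag; the repeat request echoes the ETag received.
status_first, etag_v1, _ = respond(b"<html>v1</html>", None)
status_repeat, _, body_repeat = respond(b"<html>v1</html>", etag_v1)
status_changed, etag_v2, _ = respond(b"<html>v2</html>", etag_v1)
```

In this sketch the first request yields a 200, the repeat request yields a 304 with an empty body, and the request made after the content changes yields a 200 with a new ETag, which is the sequence the complaint describes.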

IV. Analysis of Infringement Allegations

U.S. Patent No. 6,928,442 Infringement Allegations

Claim Element (from Independent Claim 10) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
a method, in a system in which a plurality of files are distributed across a plurality of computers | Defendant's system distributes webpage files across a network of production servers, origin servers, intermediate caches, and end-user browsers. | ¶58 | col. 1:47-49
obtaining a name for a data file, the name being based at least in part on a given function of the data, wherein the data used by the function comprises the contents of the particular file | Defendant's system generates or obtains content-based ETags for its webpage and asset files by applying a hash function to the contents of those files. | ¶59 | col. 2:50-55
determining, using at least the name, whether a copy of the data file is present on at least one of said computers | Defendant's servers receive a conditional GET request containing an ETag and compare it to the ETag for the requested URI to determine if a matching copy of the content is present. | ¶60 | col. 2:56-59
and determining whether a copy of the data file that is present on a at least one of said computers is an unauthorized copy or an unlicensed copy of the data file | If the received ETag matches the server's ETag, the server determines the downstream copy is authorized; if there is no match, it determines the downstream copy is unauthorized and sends the new content. | ¶61 | col. 2:60-63
  • Identified Points of Contention:
    • Scope Questions: A central issue may be the scope of the term "unauthorized copy or an unlicensed copy." The patent's title and abstract frame this in the context of policing licensed content. The complaint's theory, however, alleges that an "outdated" or "stale" file in a web caching context is an "unauthorized" copy. The court will need to determine if the claim language, in light of the specification, can be construed this broadly.
    • Technical Questions: Claim 10 recites a series of steps. The complaint alleges these steps are performed by a system of computers under Defendant's control (Compl. ¶58). Analysis may focus on whether a single entity performs or directs all the claimed steps, or if the actions are impermissibly divided between Defendant's servers, third-party caches, and end-user browsers.

U.S. Patent No. 7,802,310 Infringement Allegations

Claim Element (from Independent Claim 20) | Alleged Infringing Functionality | Complaint Citation | Patent Citation
a computer-implemented method operable in a system which includes a plurality of computers, controlling distribution of content from a first computer to at least one other computer... | Defendant's system controls the distribution of webpage content from its origin servers or upstream cache servers to downstream intermediate caches and end-user browsers. | ¶66 | col. 26:30-32
...in response to a request obtained by a first device...from a second device..., the request including at least a content-dependent name of a particular data item... | Downstream caches and browsers (second device) send conditional GET requests with an "If-None-Match" header containing an ETag (the content-dependent name) to an upstream server (first device). | ¶67 | col. 26:38-46
based at least in part on said content-dependent name..., the first device (A) permitting the content to be provided...if it is not determined that the content is unauthorized or unlicensed, otherwise, (B)...not permitting the content to be provided... | The upstream server compares the received ETag to its current ETag. If they match, it permits use of the cached content (HTTP 304); if not, it does not permit use of the old content and provides the new version (HTTP 200). | ¶68 | col. 26:50-59
  • Identified Points of Contention:
    • Scope Questions: As with the ’442 Patent, a primary question is whether the claim term "unauthorized or unlicensed" can be construed to read on a technically outdated or stale cached file.
    • Technical Questions: The claim recites a "first device" that makes the determination and then either permits or does not permit the content to be provided. In the accused CDN architecture, this decision-making may occur at any of several layers of caching servers. The infringement analysis may turn on identifying which computer constitutes the "first device" and whether that single device performs the claimed functions.

V. Key Claim Terms for Construction

  • The Term: "unauthorized copy or an unlicensed copy" (appearing in asserted claims of the ’442 and ’310 patents)

  • Context and Importance: This term is critical because the plaintiff's infringement theory hinges on equating a technically outdated cached file with an "unauthorized" one. Defendant will likely argue that "unauthorized" is limited to a legal status (e.g., copyright infringement or license violation), whereas plaintiff will argue for a broader technical meaning of "not approved for use." The outcome of this construction could be dispositive.

  • Intrinsic Evidence for Interpretation:

    • Evidence for a Broader Interpretation: The specification of the ’442 Patent describes a general system for managing data and controlling access, stating the system "tracks possession of specific data items according to content by owner" ('442 Patent, col. 4:32-33). This could be argued to support a general concept of authorization tied to the content owner's intent, including serving the most current version.
    • Evidence for a Narrower Interpretation: The title of the ’442 Patent is “Enforcement and Policing of Licensed Content...” The abstract states a copy is "only provided to licensed (or authorized) parties," and the system checks for "unauthorized or unlicensed content." This language strongly suggests a context of digital rights management (DRM) and license enforcement, not cache coherency.
  • The Term: "determining" (appearing in asserted claim 10 of the ’442 Patent)

  • Context and Importance: Claim 10 recites multiple "determining" steps. The complaint alleges these actions are performed across a distributed system of servers and caches (Compl. ¶¶60-61). Practitioners may focus on this term because its construction will inform whether the claim requires a single entity to perform all steps, which is central to any potential divided infringement defense.

  • Intrinsic Evidence for Interpretation:

    • Evidence for a Broader Interpretation: The patent repeatedly refers to a "system" that performs actions, and the system is defined as comprising a "plurality of computers" ('442 Patent, Abstract; cl. 10). This may support a construction where the "determining" is an emergent property of the system controlled by a single entity, rather than an action confined to one device.
    • Evidence for a Narrower Interpretation: Direct infringement of a method claim generally requires that every step be performed by, or be attributable to, a single actor. A defendant could argue that the claim requires a single computer or process to perform both the step of "determining...whether a copy...is present" and the step of "determining whether a copy...is an unauthorized copy," and that in the accused system these functions are split between different entities (e.g., a cache server and a browser).

VI. Other Allegations

  • Indirect Infringement: While not pleaded as a separate count, the complaint alleges facts that may support a claim for induced infringement. It states that Defendant's system "instructed the browser cache and all intermediate cache servers" to use conditional GET requests and that Defendant "caused" its servers to perform the claimed steps, suggesting active encouragement of third-party actions (Compl. ¶¶46, 59, 67).

VII. Analyst’s Conclusion: Key Questions for the Case

  • A core issue will be one of definitional scope: can the term "unauthorized...copy," which appears rooted in the patent's context of digital rights management and license enforcement, be construed to cover a technically "stale" or "outdated" file within a standard web caching system?
  • A key evidentiary question will be one of functional equivalence: does the accused system's use of ETags and conditional HTTP requests—a standard mechanism for maintaining cache coherency—perform the specific function of "determining whether a copy...is an unauthorized copy" as required by the claims, or is there a fundamental mismatch in the technical purpose and operation?
  • A third question concerns divided infringement: given the distributed nature of the accused web serving architecture, can the plaintiff prove that the defendant directs or controls all steps of the asserted method claims, or are the required actions impermissibly divided among the defendant's servers, third-party CDNs, and end-user browsers?