DCT

5:18-cv-03452

PersonalWeb Tech LLC v. Yotpo Ltd

Key Events
Amended Complaint

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 5:18-cv-03452, N.D. Cal., 10/04/2018
  • Venue Allegations: Venue is alleged to be proper because the Defendant is not a resident of the United States and may therefore be sued in any judicial district. The case was also transferred to this district by the Judicial Panel on Multidistrict Litigation for coordinated pretrial proceedings.
  • Core Dispute: Plaintiff alleges that Defendant’s website content delivery system infringes four patents related to using content-based identifiers to manage and distribute data in computer networks.
  • Technical Context: The technology relates to generating unique digital identifiers for data based on the content itself, a foundational technique for modern cloud computing and content delivery networks (CDNs) to reduce data storage and bandwidth consumption.
  • Key Procedural History: The complaint notes that the patents-in-suit have been the subject of prior enforcement actions resulting in settlements and non-exclusive licenses. It also states that the last of the asserted patents has expired, and the infringement allegations are directed to the time period before this expiration. This case is part of a multidistrict litigation (MDL) proceeding.

Case Timeline

  • 1995-04-11: Priority Date for ’442, ’310, ’544, and ’420 Patents
  • 2005-08-09: U.S. Patent No. 6,928,442 Issued
  • 2010-09-21: U.S. Patent No. 7,802,310 Issued
  • 2011-05-17: U.S. Patent No. 7,945,544 Issued
  • 2012-01-17: U.S. Patent No. 8,099,420 Issued
  • 2018-10-04: Complaint Filed

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 6,928,442 - “Enforcement and Policing of Licensed Content Using Content-Based Identifiers”

  • Issued: August 9, 2005

The Invention Explained

  • Problem Addressed: In expanding computer networks, conventional methods of identifying data by name and location are inefficient. This can lead to redundant copies of data clogging networks, and there is no simple way to verify that a received data item is the correct one corresponding to its name (Compl. ¶¶14-15; U.S. Patent No. 5,978,791, col. 2:12-24).
  • The Patented Solution: The invention proposes replacing conventional naming with "substantially unique," content-based identifiers, which the inventors called "True Names." These identifiers are generated by applying a cryptographic hash function (like MD5 or SHA) to the data's content. Because the identifier depends only on the content itself, it allows any data item to be reliably identified, located, and managed across a distributed network, regardless of its name or location (Compl. ¶¶16-17; U.S. Patent No. 5,978,791, col. 3:26-34).
  • Technical Importance: This content-centric approach to data identification provides a foundational method for reducing bandwidth and storage requirements in large-scale networks, a key challenge in the development of cloud computing and content delivery systems (Compl. ¶13).
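
The content-based identifier described above can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function (the source says only "like MD5 or SHA"), and the function name, file paths, and file contents are hypothetical, chosen purely for illustration. The point shown is that the identifier depends on the bytes alone, so two copies stored under different names and locations receive the same identifier.

```python
import hashlib


def content_identifier(data: bytes) -> str:
    """Return a hex digest that depends only on the bytes of `data`."""
    # SHA-256 is an illustrative assumption, not the patent's mandated function.
    return hashlib.sha256(data).hexdigest()


# Hypothetical copies held under different names and locations.
copy_on_server_a = {"path": "/srv/a/report.txt", "data": b"quarterly figures"}
copy_on_server_b = {"path": "/mnt/b/old-name.txt", "data": b"quarterly figures"}
edited_copy = {"path": "/srv/a/report.txt", "data": b"quarterly figures v2"}

id_a = content_identifier(copy_on_server_a["data"])
id_b = content_identifier(copy_on_server_b["data"])
id_edited = content_identifier(edited_copy["data"])

# Same content yields the same identifier regardless of path;
# edited content under the same path yields a different identifier.
print(id_a == id_b, id_a == id_edited)
```

Because the path never enters the hash, a conventional name can change without affecting the identifier, while any change to the content does.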

Key Claims at a Glance

  • The complaint asserts independent claim 10 (Compl. ¶55).
  • Claim 10 requires a method with the following essential elements:
    • Obtaining a name for a data file, where the name is based at least in part on a function of the data in the file.
    • Determining, using at least the name, whether a copy of the data file is present on at least one computer in a distributed system.
    • Determining whether a present copy of the data file is an unauthorized or unlicensed copy.
  • The complaint also asserts dependent claim 11, reserving the right to assert others (Compl. ¶55).

U.S. Patent No. 7,802,310 - “Controlling Access to Data in a Data Processing System”

  • Issued: September 21, 2010

The Invention Explained

  • Problem Addressed: The patent addresses the same problem of managing and verifying data in large distributed networks as the ’442 Patent (Compl. ¶¶14-15, incorporated by reference in ¶61; U.S. Patent No. 5,978,791, col. 2:12-24).
  • The Patented Solution: The invention describes methods and systems for controlling the distribution of content using content-dependent names. A first computer receives a request from a second computer that includes a content-dependent name (e.g., a hash). The first computer compares this name to its own records to determine if the content held by the second computer is authorized. Based on this determination, it either permits access by sending a "not modified" message or denies access to the old content by providing new, authorized content (Compl. ¶¶65-66, 68-69; U.S. Patent No. 5,978,791, col. 3:26-34).
  • Technical Importance: This solution provides a mechanism for efficiently managing cache coherency and controlling content access in distributed systems, ensuring that users and intermediate servers are only served authorized versions of data (Compl. ¶19).

Key Claims at a Glance

  • The complaint asserts independent claims 20 and 69 (Compl. ¶63).
  • Claim 20 requires a computer-implemented method for controlling content distribution, with the following essential elements:
    • Receiving a request from a second device at a first device, the request including a content-dependent name of a data item generated via a hash function.
    • Based on the name, the first device either (A) permits the content to be provided if it is not determined to be unauthorized or unlicensed, or (B) does not permit the content to be provided if it is determined to be unauthorized or unlicensed.
  • Claim 69 requires a system operable in a network of computers, with the following essential elements:
    • Hardware and software configured to receive a request from a second computer at a first computer, the request including a content-dependent name.
    • In response, the system compares the content-dependent name to a plurality of values.
    • It then determines if access is authorized based on whether the name corresponds to one of the values.
    • Based on that determination, it allows the data item to be provided if access is not determined to be unauthorized.
  • The complaint reserves the right to assert other claims (Compl. ¶63).
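
The compare-and-decide sequence recited in claim 69 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented or accused implementation: the recorded values are assumed to list names of authorized content (so a match means access is authorized), SHA-256 is an assumed hash, and every function name and sample payload is hypothetical.

```python
import hashlib
from typing import Iterable


def content_dependent_name(data: bytes) -> str:
    """Hash-based name for a data item (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def access_authorized(requested_name: str, recorded_values: Iterable[str]) -> bool:
    """Compare the requested name to a plurality of recorded values.

    Assumption: the recorded values list names of authorized content,
    so a match means access is authorized.
    """
    return requested_name in set(recorded_values)


def handle_request(requested_name: str, recorded_values: Iterable[str]) -> str:
    """Allow or refuse provision of the data item based on the comparison."""
    if access_authorized(requested_name, recorded_values):
        return "allow"
    return "refuse"


# Hypothetical records kept by the first computer.
recorded = [
    content_dependent_name(b"release 1.0"),
    content_dependent_name(b"release 1.1"),
]
decision_known = handle_request(content_dependent_name(b"release 1.1"), recorded)
decision_unknown = handle_request(content_dependent_name(b"tampered build"), recorded)
print(decision_known, decision_unknown)
```

Whether a plain membership test of this kind amounts to a determination that content is "authorized" is the question the claim construction dispute turns on; the sketch takes no position on it.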

U.S. Patent No. 7,945,544 - “Similarity-Based Access Control of Data in a Data Processing System”

  • Issued: May 17, 2011
  • Technology Synopsis: This patent describes a method for creating a "digital key" for a file composed of multiple parts. A first function generates content-based values for each individual part. A second function then operates on these part values to generate the final digital key for the entire file. This hierarchical hashing allows for efficient comparison and access control based on the similarity of constituent parts (Compl. ¶¶75-76).
  • Asserted Claims: Claims 46, 48, 52, and 55, with claim 46 being independent (Compl. ¶73).
  • Accused Features: The accused system is alleged to practice this invention by generating content-based "fingerprints" for individual asset files (the first function) and then generating a content-based "ETag" for the main webpage base file using its content, which includes the URIs containing the asset file fingerprints (the second function) (Compl. ¶¶75-76).
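
The two-function scheme summarized in the Technology Synopsis can be sketched as below. The sketch assumes SHA-256 for both functions and a newline-joined list of part values as the input to the second function; those choices, along with all names and sample data, are hypothetical and are not drawn from the patent or the complaint.

```python
import hashlib
from typing import List, Set


def part_value(part: bytes) -> str:
    """First function: a content-based value for one part of the file."""
    return hashlib.sha256(part).hexdigest()


def digital_key(parts: List[bytes]) -> str:
    """Second function: a key computed over the ordered part values."""
    part_values = [part_value(p) for p in parts]
    joined = "\n".join(part_values).encode("ascii")
    return hashlib.sha256(joined).hexdigest()


def shared_part_values(parts_a: List[bytes], parts_b: List[bytes]) -> Set[str]:
    """Part values that two files have in common (a simple similarity signal)."""
    return {part_value(p) for p in parts_a} & {part_value(p) for p in parts_b}


# Hypothetical files made of three parts each; only the last part differs.
file_v1 = [b"header", b"body", b"footer"]
file_v2 = [b"header", b"body", b"footer (revised)"]

key_v1 = digital_key(file_v1)
key_v2 = digital_key(file_v2)
common = shared_part_values(file_v1, file_v2)
print(key_v1 != key_v2, len(common))
```

Changing a single part changes the overall key, while the unchanged part values still match across the two versions, which is what makes part-level comparison possible.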

U.S. Patent No. 8,099,420 - “Accessing Data in a Data Processing System”

  • Issued: January 17, 2012
  • Technology Synopsis: This patent claims a system that uses content-dependent digital identifiers to manage data access in a network. The system determines these identifiers for data items and then uses one or more databases of these identifiers to selectively permit access, ensuring that only authorized content is provided to or accessed by computers in the network (Compl. ¶¶87-88).
  • Asserted Claims: Claims 25, 26, 27, 29, 30, 32, 34–36, and 166, with claim 166 being independent (Compl. ¶84).
  • Accused Features: The infringement theory maps the Defendant's web servers, which allegedly maintain databases of ETag values associated with URIs, to the claimed system. The use of conditional GET requests and HTTP responses containing these ETags is alleged to be the claimed method of selectively permitting access to ensure downstream caches only use authorized content (Compl. ¶89).

III. The Accused Instrumentality

No probative visual evidence provided in complaint.

Product Identification

  • The accused instrumentality is the Defendant's system and method for providing webpage content via its website at yotpo.com (Compl. ¶31).

Functionality and Market Context

  • The complaint alleges the Defendant's system uses a two-tiered, content-based identification scheme to manage web content and caching. First, for subordinate "asset files" (e.g., scripts, stylesheets), the system generates a content-based "fingerprint" and incorporates it into the file's URI (Compl. ¶¶34, 38). Second, for the main "webpage base file" (e.g., an HTML file), the system generates a content-based ETag value by applying a hash function to the file's contents (Compl. ¶42).
  • When an asset file's content changes, its fingerprint and URI change. This change is reflected in the base file that references it, which in turn causes the base file's ETag to change (Compl. ¶41). Browsers and intermediate caches are instructed to use these ETags in conditional HTTP GET requests with an "If-None-Match" header to verify that their cached content is still the authorized version to be served (Compl. ¶¶44, 47). This system allegedly reduces bandwidth and computational load on servers by ensuring that only changed files are re-transmitted, allowing content to be served efficiently from the nearest cache (Compl. ¶36).
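
The alleged two-tier behavior can be illustrated with a simplified sketch. It is not the Defendant's implementation: SHA-256, the truncated 12-character fingerprint, the URI layout, and all function names are assumptions made for illustration. The conditional-request handling is also reduced to an exact match on a single ETag value, whereas real HTTP allows lists of values and weak validators in "If-None-Match".

```python
import hashlib
from typing import Dict, Iterable, Optional, Tuple


def fingerprint(data: bytes) -> str:
    """Short content-based fingerprint for an asset file (assumed format)."""
    return hashlib.sha256(data).hexdigest()[:12]


def fingerprinted_uri(stem: str, ext: str, data: bytes) -> str:
    """Embed the fingerprint in the asset URI (hypothetical URI layout)."""
    return f"/assets/{stem}-{fingerprint(data)}.{ext}"


def build_base_file(asset_uris: Iterable[str]) -> bytes:
    """A toy webpage base file that references its assets by URI."""
    tags = "".join(f'<script src="{uri}"></script>' for uri in asset_uris)
    return f"<html><head>{tags}</head></html>".encode("utf-8")


def etag_for(body: bytes) -> str:
    """Content-based ETag; HTTP entity tags are quoted strings."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(
    current_body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Answer 304 when the cached ETag still matches, else 200 with the body."""
    current_etag = etag_for(current_body)
    headers = {"ETag": current_etag}
    if if_none_match == current_etag:
        return 304, headers, b""
    return 200, headers, current_body


# An asset change alters its URI, which alters the base file and its ETag.
old_page = build_base_file([fingerprinted_uri("app", "js", b"console.log(1);")])
new_page = build_base_file([fingerprinted_uri("app", "js", b"console.log(2);")])
cached_etag = etag_for(old_page)

status_unchanged, _, _ = conditional_get(old_page, cached_etag)
status_changed, _, body_changed = conditional_get(new_page, cached_etag)
print(status_unchanged, status_changed)
```

In this sketch the server's check is purely an equality test on hash-derived strings. The complaint characterizes the two outcomes as "authorized" and "no longer authorized" content.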

IV. Analysis of Infringement Allegations

’442 Patent Infringement Allegations

  • Claim Element (from Independent Claim 10): obtaining a name for a data file, the name being based at least in part on a given function of the data, wherein the data used by the function comprises the contents of the particular file
    • Alleged Infringing Functionality: Defendant's system generates or obtains ETags for its webpage base files using a hash function, where the ETags are based on the contents of those files.
    • Complaint Citation: ¶57
    • Patent Citation: U.S. Patent No. 5,978,791, col. 14:1-13
  • Claim Element (from Independent Claim 10): determining, using at least the name, whether a copy of the data file is present on at least one of said computers
    • Alleged Infringing Functionality: Defendant's origin servers and intermediate cache servers receive a conditional GET request with a URI and an ETag, and compare the ETag in the request to the ETag associated with that URI to determine if content with that ETag is present.
    • Complaint Citation: ¶58
    • Patent Citation: U.S. Patent No. 5,978,791, col. 15:33-40
  • Claim Element (from Independent Claim 10): determining whether a copy of the data file that is present on a at least one of said computers is an unauthorized copy or an unlicensed copy of the data file
    • Alleged Infringing Functionality: If the ETag from the request matches the server's ETag, the cached copy is determined to be authorized. If there is no match, the cached copy is determined to be unauthorized.
    • Complaint Citation: ¶59
    • Patent Citation: U.S. Patent No. 6,928,442, col. 1:54-61

’310 Patent Infringement Allegations

  • Claim Element (from Independent Claim 20): controlling distribution of content from a first computer to at least one other computer, in response to a request obtained by a first device...from a second device..., the request including at least a content-dependent name of a particular data item, the content-dependent name being based at least in part on a function of at least some of the data...wherein the function comprises a message digest function or a hash function
    • Alleged Infringing Functionality: Defendant's system controls content distribution from upstream cache/origin servers (first computer) to downstream caches/browsers (second computer) in response to conditional GET requests containing content-dependent ETags generated via hashing.
    • Complaint Citation: ¶65
    • Patent Citation: U.S. Patent No. 7,802,310, col. 1:15-21
  • Claim Element (from Independent Claim 20): based at least in part on said content-dependent name..., the first device (A) permitting the content to be provided...if it is not determined that the content is unauthorized or unlicensed, otherwise, (B) if it is determined that the content is unauthorized or unlicensed, not permitting the content to be provided...
    • Alleged Infringing Functionality: An upstream server compares the ETag in a request to its maintained ETag. If they match, it determines the downstream content is authorized and sends an HTTP 304 response, permitting its use. If they do not match, it determines the content is no longer authorized and sends an HTTP 200 response with new content.
    • Complaint Citation: ¶66
    • Patent Citation: U.S. Patent No. 7,802,310, col. 3:51-60

Identified Points of Contention

  • Scope Questions: A central dispute may arise over whether a "stale" or "out-of-date" cached file, as identified by a non-matching ETag, constitutes an "unauthorized copy or an unlicensed copy" as required by claim 10 of the ’442 Patent. The defense may argue these terms imply a rights-management or licensing context, whereas the plaintiff's theory appears to equate "unauthorized" with "not the currently correct version for delivery."
  • Technical Questions: What evidence does the complaint provide that the comparison of an ETag in an HTTP header to a stored ETag value on a server performs the specific legal determination of whether a file is "unauthorized"? This raises the question of whether a technical check for content identity is equivalent to the claimed step of making a determination regarding authorization status.

V. Key Claim Terms for Construction

Term: "unauthorized copy or an unlicensed copy" (’442 Patent, Claim 10)

  • Context and Importance: The viability of the infringement allegation for the ’442 Patent may depend on the construction of this term. Practitioners may focus on this term because the Defendant's system for cache validation identifies outdated content, and the Plaintiff equates this "outdated" status with being "unauthorized." If the term is construed narrowly to mean a violation of licensing or access rights, the infringement case could be weakened; if construed broadly to include content that a server no longer authorizes for distribution (i.e., stale content), the case may be strengthened.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The parent patent's background describes a core problem as ensuring a requesting processor can "verify that the data delivered is the correct data" (U.S. Patent No. 5,978,791, col. 2:22-24). This focus on data correctness may support a broader reading where "unauthorized" means "not the correct version authorized for delivery."
    • Evidence for a Narrower Interpretation: The title of the ’442 Patent is “Enforcement and Policing of Licensed Content...”, which suggests a direct connection to legal or contractual licensing. The specification of the parent patent also describes a "license table" for identifying files that "may only be used by licensed users" (U.S. Patent No. 5,978,791, col. 7:51-56), which points toward a rights-management context rather than simple content versioning.

Term: "content-dependent name" (’310 Patent, Claims 20, 69)

  • Context and Importance: Plaintiff’s infringement theory maps both HTTP ETags and fingerprinted URIs to this term. The construction is critical because an ETag is a value transmitted in an HTTP header for verification; unlike a URI, it is not typically used as a primary "name" to request an object. The case may turn on whether an ETag can properly be construed as a "name" for the data item it represents.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The patent specification uses "True Name," "data identity," and "data identifier" interchangeably, suggesting "name" is used broadly for any identifier derived from content (U.S. Patent No. 5,978,791, col. 6:6-9). The core property of the identifier is that it "depends on all of the data in the data item and only on the data in the data item" (U.S. Patent No. 5,978,791, col. 3:28-31), a functional description that an ETag generated from content appears to meet.
    • Evidence for a Narrower Interpretation: The background of the parent patent contrasts "True Names" with conventional file-naming systems that use pathnames and filenames (U.S. Patent No. 5,978,791, col. 2:4-11). A defendant may argue that a "name" must be an identifier used to request or locate a file, a role filled by a URI but not an ETag, which serves as a validation token in a subsequent request.

VI. Other Allegations

Indirect Infringement

  • The complaint does not plead a separate count for indirect infringement. However, the factual allegations are framed in a manner that may support such a theory, stating that "Defendant caused" various downstream and upstream servers and caches to perform the allegedly infringing steps, such as obtaining ETags and determining whether files match (Compl. ¶¶57, 58, 65). This language suggests Plaintiff may pursue a theory of induced infringement based on Defendant's control over the content delivery system.

VII. Analyst’s Conclusion: Key Questions for the Case

  • A core issue will be one of definitional scope: can the patent term "unauthorized copy," which appears in the context of policing licensed content, be construed to cover a merely "stale" or "out-of-date" file in a web caching system, as alleged by the Plaintiff?
  • A second key question will be one of functional and semantic equivalence: does an HTTP "ETag," which functions as a validation token within a request header, qualify as a "content-dependent name" for a data item under the patent's claim language, or is its role fundamentally different from the file-naming identifiers contemplated by the patent?