DCT

5:18-cv-05202

Personal Web Tech LLC v. Sharefile LLC

Amended Complaint

I. Executive Summary and Procedural Information

Parties & Counsel:
- Plaintiffs: PersonalWeb Technologies, LLC (Texas) and Level 3 Communications, LLC (Delaware)
- Defendant: ShareFile, LLC (Delaware)
- Plaintiffs’ Counsel: Stubbs, Alderton & Markiles, LLP
Case Identification: 5:18-cv-05202, N.D. Cal., 10/04/2018
Venue Allegations: Venue is alleged to be proper as the action was transferred to the Northern District of California by the Judicial Panel on Multidistrict Litigation for consolidated pretrial proceedings.
Core Dispute: Plaintiff alleges that Defendant’s "rightsignature.com" website and its associated content delivery system infringe four patents related to using content-based identifiers to manage, distribute, and control access to data in computer networks.
Technical Context: The technology concerns fundamental aspects of cloud computing and content delivery networks, specifically the use of cryptographic hashes to uniquely identify data files for purposes of efficient caching, data integrity verification, and authorization.
Key Procedural History: The complaint notes that the patents-in-suit have been successfully enforced against third parties, resulting in settlements and non-exclusive licenses. It also states that the allegations are directed to the time period before the expiration of the last of the patents-in-suit. This case is part of a multi-district litigation proceeding.

Case Timeline

Date	Event
1995-04-11	Priority Date for all Patents-in-Suit
2005-08-09	U.S. Patent No. 6,928,442 Issued
2010-09-21	U.S. Patent No. 7,802,310 Issued
2011-05-17	U.S. Patent No. 7,945,544 Issued
2011-10-01	Approx. Date of ShareFile, LLC acquisition by Citrix Systems, Inc.
2012-01-17	U.S. Patent No. 8,099,420 Issued
2014-10-01	Approx. Date of RightSignature LLC acquisition by Citrix Systems, Inc.
2017-10-01	Approx. Date of RightSignature LLC merger into ShareFile LLC
2018-10-04	Complaint Filing Date

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 6,928,442 - "Enforcement and Policing of Licensed Content Using Content-Based Identifiers"

The Invention Explained

Problem Addressed: The patent’s background section describes the problem that conventional methods for naming and locating data files in computer networks are context-dependent and cannot keep pace with expanding, distributed systems, leading to data duplication and difficulty in controlling access (Compl. ¶ 16; ’442 Patent, col. 1:49-2:26).
The Patented Solution: The invention proposes replacing conventional file names with “substantially unique,” content-based identifiers, which the inventors termed “True Names” (Compl. ¶¶ 17, 19). These identifiers are generated by applying a cryptographic hash function (such as MD5 or SHA) to the entire content of a data item, making the identifier dependent only on the data itself, not its name or location (’442 Patent, Abstract, col. 4:5-11). This system allows for policing a network to identify unauthorized or unlicensed copies of content by comparing their content-based names (’442 Patent, Abstract).
Technical Importance: The use of content-based identifiers provided a foundational method for data deduplication, integrity verification, and efficient content distribution in large-scale networks, which are core components of modern cloud computing and content delivery networks (Compl. ¶ 15).
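For readers less familiar with the underlying mechanism, the following is a minimal Python sketch of a content-based identifier, not the patent's or any party's implementation. The function name `content_name`, the sample byte strings, and the choice of SHA-256 are assumptions made for illustration; the patent itself refers to MD5 or SHA.

```python
import hashlib

def content_name(data: bytes, algorithm: str = "sha256") -> str:
    """Return a hex digest of the full contents.

    Hypothetical stand-in for a content-based identifier; the name
    depends only on the bytes, not on any filename or location.
    """
    return hashlib.new(algorithm, data).hexdigest()

# Two items stored under different names but holding identical bytes
report_a = b"quarterly figures"
report_b = b"quarterly figures"
edited = b"quarterly figures (revised)"

same = content_name(report_a) == content_name(report_b)    # identical content
differs = content_name(report_a) != content_name(edited)   # any change alters the name
```

Because the identifier is derived solely from the bytes, two copies can be matched without comparing the full data, which is the property the complaint ties to deduplication and integrity checking.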

Key Claims at a Glance

The complaint asserts independent claim 10 and dependent claim 11 (Compl. ¶ 58).
Independent Claim 10 is a method claim comprising the key elements:
- In a system of distributed files across multiple computers, obtaining a name for a data file based on a function of the file's contents.
- Using that name to determine if a copy of the data file is present on at least one of the computers.
- Determining if that present copy is an "unauthorized copy or an unlicensed copy" of the data file.
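The three elements summarized above can be read as a short pipeline. The sketch below is one hedged illustration of that reading, not a construction of the claim. The inventory `computers`, the `license_table`, and both helper functions are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib

def content_name(data: bytes) -> str:
    """Element 1: a name based on a function of the file's contents."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical inventory: computer -> {local filename: contents}
computers = {
    "server-1": {"movie.bin": b"licensed film"},
    "laptop-7": {"copy_of_movie.bin": b"licensed film", "notes.txt": b"todo"},
}

# Hypothetical license table: content name -> computers licensed to hold it
license_table = {content_name(b"licensed film"): {"server-1"}}

def find_copies(name: str) -> list:
    """Element 2: use the name to find which computers hold a copy."""
    return sorted(
        host
        for host, files in computers.items()
        if any(content_name(data) == name for data in files.values())
    )

def unlicensed_copies(name: str) -> list:
    """Element 3: flag copies on computers absent from the license table."""
    licensed = license_table.get(name, set())
    return [host for host in find_copies(name) if host not in licensed]

name = content_name(b"licensed film")
holders = find_copies(name)
flagged = unlicensed_copies(name)
```

In this toy inventory the copy on `laptop-7` is found despite its different filename, because the lookup keys on content rather than on the name the file happens to carry.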

U.S. Patent No. 7,802,310 - "Controlling Access to Data in a Data Processing System"

The Invention Explained

Problem Addressed: As with the parent ’442 Patent, this patent addresses the inefficiencies and control issues arising from context-dependent file naming in distributed computer networks (’310 Patent, col. 1:47-2:25).
The Patented Solution: The invention describes a method and system for controlling content distribution between computers in a network. A requesting computer sends a request that includes a "content-dependent name" (e.g., a hash) of a data item. A receiving computer then uses this content-dependent name to determine if access to the data item is authorized or licensed, and based on this determination, either permits or denies the content from being provided or accessed (’310 Patent, Abstract, col. 4:18-33).
Technical Importance: This technology outlines the server-side logic for an access control system based on content identifiers, enabling efficient cache validation and authorization in content delivery networks without transmitting the full data for comparison (Compl. ¶ 15).

Key Claims at a Glance

The complaint asserts independent claims 20 (method) and 69 (system) (Compl. ¶ 66).
Independent Claim 20 is a method claim comprising the key elements:
- Controlling the distribution of content from a first computer to another computer in response to a request.
- The request includes a "content-dependent name" of a data item, where the name is based on a hash function of the item's data.
- Based on that name, the first computer either (A) permits the content to be provided if it is not determined to be unauthorized/unlicensed, or (B) does not permit it to be provided if it is determined to be unauthorized/unlicensed.
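The permit-or-deny logic in these elements can be sketched as follows. This is an illustrative reading only. The `store` and `flagged` collections, the `publish` and `handle_request` helpers, and the use of SHA-256 are all assumptions introduced here and do not come from the patent or the complaint.

```python
import hashlib
from typing import Optional

def content_name(data: bytes) -> str:
    """Content-dependent name: a hash of the item's data (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()

store = {}       # content name -> data held by the "first computer"
flagged = set()  # names determined to be unauthorized or unlicensed

def publish(data: bytes, flagged_as_unlicensed: bool = False) -> str:
    """Register an item and return its content-dependent name."""
    name = content_name(data)
    store[name] = data
    if flagged_as_unlicensed:
        flagged.add(name)
    return name

def handle_request(name: str) -> Optional[bytes]:
    """Decide, from the name alone, whether to provide the content."""
    if name in flagged:
        return None             # branch (B): content is not provided
    return store.get(name)      # branch (A): provided (None if no such item)

ok_name = publish(b"public brochure")
bad_name = publish(b"pirated film", flagged_as_unlicensed=True)
served = handle_request(ok_name)
refused = handle_request(bad_name)
```

Note that the decision is made on the name carried in the request, without the requester transmitting the data itself.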

Multi-Patent Capsule

Patent Identification: U.S. Patent No. 7,945,544, “Similarity-Based Access Control of Data in a Data Processing System,” Issued May 17, 2011.
Technology Synopsis: This patent describes a method for creating a composite identifier, or "digital key," for a data file. The method first generates identifiers ("part values") for individual parts of the file using a first hash function. It then applies a second hash function to these part values to create the final digital key. This two-tiered identification structure is used to determine if a requested file matches a file stored in a database.
Asserted Claims: Independent claims 46 and 52; dependent claims 48 and 55 (Compl. ¶ 76).
Accused Features: The complaint alleges that content-based "fingerprints" in asset file URIs serve as the "part values," and the content-based "ETag" of a webpage base file (which contains those URIs) serves as the "digital key." The accused system allegedly compares a received "ETag" ("search key") with a database of stored "ETags" to authorize content delivery (Compl. ¶¶ 78-83).
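The two-tier structure described in the synopsis can be illustrated with a short sketch. It assumes SHA-1 for the first hash and SHA-256 for the second purely to show two distinct functions; the sample parts, the `known_keys` set, and the function names are hypothetical and are not drawn from the patent or the accused system.

```python
import hashlib

def part_values(parts):
    """First tier: one identifier per part (SHA-1 assumed for illustration)."""
    return [hashlib.sha1(p).hexdigest() for p in parts]

def digital_key(parts):
    """Second tier: hash the concatenated part values (SHA-256 assumed)."""
    joined = "".join(part_values(parts)).encode("ascii")
    return hashlib.sha256(joined).hexdigest()

# Hypothetical file made of two parts, and a database of stored keys
page_parts = [b"body { margin: 0 }", b"console.log('hi')"]
known_keys = {digital_key(page_parts)}

# A search key computed from identical parts matches the stored key
search_key = digital_key(page_parts)
match = search_key in known_keys

# Changing any single part changes its part value, and so the key
changed_parts = [b"body { margin: 1px }", b"console.log('hi')"]
mismatch = digital_key(changed_parts) not in known_keys
```

Under the complaint's mapping, the per-asset fingerprints would play the role of `part_values` and the base file's "ETag" the role of `digital_key`; whether that mapping holds is a disputed question, not something this sketch resolves.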

Multi-Patent Capsule

Patent Identification: U.S. Patent No. 8,099,420, “Accessing Data in a Data Processing System,” Issued January 17, 2012.
Technology Synopsis: This patent claims a system that determines one or more "content-dependent digital identifiers" for a data item using a given function (e.g., a hash). The system then uses a database of these identifiers to selectively permit or deny access to the data item, based on whether the identifier for a requested item corresponds to an entry in the database.
Asserted Claims: Independent claim 166; dependent claims 25-27, 29, 30, 32, 34-36 (Compl. ¶ 87).
Accused Features: The accused system allegedly uses hash functions to generate "ETags" and fingerprints, which function as "content-dependent digital identifiers." These identifiers are allegedly stored in server databases and compared against identifiers in incoming requests to determine whether a downstream cache is authorized to use its existing content or must fetch new content (Compl. ¶¶ 90-92).

III. The Accused Instrumentality

Product Identification

The "rightsignature.com" website and its associated systems for storing and serving webpage content (Compl. ¶ 32).

Functionality and Market Context

The complaint alleges the accused system uses content-based identifiers to manage web content delivery and caching efficiency (Compl. ¶ 14). Specifically, it is alleged to use two forms of such identifiers: (1) content-based "ETag" values for webpage files, generated by applying a hash function to the file's contents; and (2) "fingerprints," also generated via a hash function, which are inserted directly into the filenames of asset files (e.g., stylesheets, scripts) (Compl. ¶¶ 34, 35, 43, 45).
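As a hedged illustration of the second identifier type, the sketch below inserts a content hash into an asset filename. The naming pattern (`stem-digest.suffix`), the eight-character truncation, and the use of SHA-256 are assumptions; the complaint, as summarized here, does not specify the accused system's exact scheme.

```python
import hashlib
from pathlib import PurePosixPath

def fingerprinted_name(filename: str, data: bytes, digits: int = 8) -> str:
    """Insert a truncated content hash before the file extension.

    Hypothetical naming scheme, shown only to illustrate the idea.
    """
    digest = hashlib.sha256(data).hexdigest()[:digits]
    path = PurePosixPath(filename)
    return f"{path.stem}-{digest}{path.suffix}"

v1 = fingerprinted_name("app.css", b"body { margin: 0 }")
v2 = fingerprinted_name("app.css", b"body { margin: 1px }")
```

Because the filename changes whenever the contents change, a page that references the new name causes caches to fetch the new asset rather than reuse the old one.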
These identifiers are allegedly used in conjunction with conditional HTTP GET requests containing an "If-None-Match" header. An intermediate cache or end-user browser sends the "ETag" of its cached file to the server; the server compares it to the "ETag" of the current version of the file. A match results in an HTTP 304 (Not Modified) response, authorizing use of the cached copy. A mismatch results in an HTTP 200 (OK) response with the new file and new "ETag", indicating the cached copy is no longer authorized for use (Compl. ¶¶ 47, 50-53). This process allegedly reduces bandwidth and computational load on the origin servers (Compl. ¶ 37).
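The conditional-request exchange described above can be sketched as a single server-side function. This is a simplified model, assuming a content-hash "ETag" and an exact string match; a real "If-None-Match" header may also carry several validators, weak validators, or `*`, none of which are handled here. The function names are hypothetical.

```python
import hashlib

def make_etag(body: bytes) -> str:
    """Quoted ETag derived from the body's hash (SHA-256 assumed)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'

def respond(current_body: bytes, if_none_match=None):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(current_body)
    if if_none_match == etag:
        # Cached copy matches the current version: no body is sent
        return 304, {"ETag": etag}, b""
    # No validator, or a stale one: send the current version and its ETag
    return 200, {"ETag": etag}, current_body

# First visit: no cached copy, so the full page is returned
status1, headers1, body1 = respond(b"<html>v1</html>")

# Revalidation while the page is unchanged
status2, _, body2 = respond(b"<html>v1</html>", headers1["ETag"])

# Revalidation after the page has changed on the server
status3, headers3, body3 = respond(b"<html>v2</html>", headers1["ETag"])
```

The sketch shows only the mechanics of the 304 and 200 branches; whether either branch amounts to "authorizing" a copy is the legal question addressed in Sections IV and V.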
No probative visual evidence provided in complaint.

IV. Analysis of Infringement Allegations

6,928,442 Patent Infringement Allegations

Claim Element (from Independent Claim 10)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
a method, in a system in which a plurality of files are distributed across a plurality of computers.	Defendant's system distributes webpage content files across a plurality of computers, including production servers, origin servers, intermediate cache servers, and endpoint browser caches.	¶59	col. 1:21-26
obtaining a name for a data file, the name being based at least in part on a given function of the data, wherein the data used by the function comprises the contents of the particular file.	Defendant generates or otherwise obtains "ETags" for its webpage and asset files. These "ETags" are alleged to be names based on a hash function applied to the contents of the respective files.	¶60	col. 4:5-11
determining, using at least the name, whether a copy of the data file is present on at least one of said computers.	In response to a conditional GET request, Defendant's servers compare the "ETag" received in the request's "If-None-Match" header to the "ETag" maintained for that file's URI to determine if a copy of the content is present and current.	¶61	col. 8:30-38
determining whether a copy of the data file that is present... is an unauthorized copy or an unlicensed copy of the data file.	If the "ETags" match, the server determines the downstream cached copy is authorized. If they do not match, the server determines the downstream copy is unauthorized and serves the new version.	¶62	col. 4:9-13

Identified Points of Contention:
- Scope Questions: A primary question may be whether the claim term "unauthorized copy or an unlicensed copy" can be construed to cover a technically "stale" or "out-of-date" file in a browser cache. The analysis will likely focus on whether a standard cache validation check, which determines if content is current, performs the function of determining a legal or access-based authorization status as contemplated by the patent.
- Technical Questions: Does the complaint provide sufficient evidence that the "ETags" used by the accused system are in fact "based at least in part on a... function of the data... compris[ing] the contents of the particular file"? While this is a common method for generating "ETags", they can also be generated based on other metadata like version numbers or timestamps.
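The distinction raised in the technical question can be made concrete with a short sketch contrasting two ways of deriving an "ETag". Both functions are hypothetical. The metadata-based format shown (hexadecimal modification time and size) is an assumed example of the general approach, not a statement about how the accused system or any particular server behaves.

```python
import hashlib

def etag_from_content(body: bytes) -> str:
    """ETag computed from the file's contents (SHA-256 assumed)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'

def etag_from_metadata(size: int, mtime: int) -> str:
    """ETag computed from size and modification time only (assumed format)."""
    return f'"{mtime:x}-{size:x}"'

body = b"<html>same bytes</html>"

# Identical bytes always give the same content-based ETag
content_a = etag_from_content(body)
content_b = etag_from_content(body)

# Identical bytes redeployed at a later time give a different metadata ETag
meta_a = etag_from_metadata(len(body), 1_700_000_000)
meta_b = etag_from_metadata(len(body), 1_700_000_500)
```

Only the first function uses "the contents of the particular file" as input, which is why the way the accused "ETags" are actually generated may matter to the claim 10 analysis.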

7,802,310 Patent Infringement Allegations

Claim Element (from Independent Claim 20)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
controlling distribution of content from a first computer to at least one other computer, in response to a request...	An upstream origin or intermediate cache server (first computer) controls the distribution of content to a downstream browser or cache (second computer) in response to requests.	¶68	col. 1:21-25
the request including at least a content-dependent name of a particular data item, the content-dependent name being based at least in part on a function of at least some of the data... wherein the function comprises a message digest function or a hash function...	The downstream computer sends a conditional GET request that includes a content-based "ETag" in the "If-None-Match" header. This "ETag" is alleged to be a content-dependent name generated using a hash function.	¶68	col. 4:18-24
based at least in part on said content-dependent name... the first device (A) permitting the content to be provided to or accessed by the at least one other computer if it is not determined that the content is unauthorized or unlicensed, otherwise, (B)... not permitting the content to be provided...	The upstream server compares the received "ETag" to its current "ETag" for the file. A match results in an HTTP 304 response (permitting use of cached content). A mismatch results in an HTTP 200 response with new content (not permitting use of the old, now-unauthorized cached content).	¶69	col. 4:25-33

Identified Points of Contention:
- Scope Questions: Can the term "controlling distribution of content" read on the standard operation of an HTTP server responding to a conditional GET request? Further, as with the ’442 Patent, the construction of "unauthorized or unlicensed" will be central to determining whether cache validation falls within the claim's scope.
- Technical Questions: What evidence does the complaint provide that the accused system's "ETag" comparison logic constitutes a determination of "unauthorized or unlicensed" status, as opposed to simply a determination of content freshness? The dispute may turn on whether the server's response (HTTP 304 vs. 200) is merely an efficiency mechanism or an explicit act of "permitting" or "not permitting" access in the manner claimed.

V. Key Claim Terms for Construction

The Term: "unauthorized copy or an unlicensed copy" (’442 Patent, Claim 10) and "unauthorized or unlicensed" (’310 Patent, Claim 20).
Context and Importance: This language is the linchpin of the infringement allegation. Plaintiff's theory hinges on equating a technically stale file in a cache (as identified by a non-matching "ETag") with a legally "unauthorized or unlicensed" copy. Practitioners may focus on this term because its construction will determine whether a common web caching technique can be considered an act of "policing" or access control under the patents.
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The title of the ’442 patent is "Enforcement and Policing of Licensed Content," and the abstract states a copy is "only provided to licensed (or authorized) parties," suggesting the invention's purpose is to control distribution, which is the functional effect of the accused "ETag" system (Compl. ¶¶ 31, 36; ’442 Patent, Abstract).
- Evidence for a Narrower Interpretation: The specification of the ’442 Patent explicitly discusses a "license table" that identifies "licensed users" (’442 Patent, col. 8:52-56). This could support an interpretation that "unlicensed" requires a determination related to a specific user's permissions, not merely the freshness of a data file in an anonymous cache.
The Term: "content-dependent name" (’310 Patent, Claim 20).
Context and Importance: The complaint alleges that HTTP "ETags" and content-based "fingerprints" in filenames meet this limitation. The viability of the infringement case for this patent family depends on whether these standard web technologies can be classified as the patented "content-dependent name."
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: Claim 20 recites that the "content-dependent name" is based on a "message digest function or a hash function" applied to the data, which is a common method for generating strong "ETags" (’310 Patent, cl. 20; Compl. ¶ 45). The specification also defines "data item" very broadly, potentially encompassing webpage asset files (’310 Patent, col. 1:55-61).
- Evidence for a Narrower Interpretation: The patents describe a comprehensive architecture including specific data structures like a "True File Registry" and "Local Directory Extensions" (’310 Patent, FIG. 1(b)). A defendant might argue that the term "content-dependent name" should be limited to identifiers used within this specific, disclosed system, not generic identifiers used in the different context of standard HTTP.

VI. Other Allegations

Indirect Infringement: The complaint alleges that Defendant "caused" its servers, as well as intermediate and endpoint caches, to perform the claimed method steps, such as determining whether to use a cached file (Compl. ¶¶ 61, 68). This language suggests a potential theory of induced infringement, where Defendant's system architecture and instructions (e.g., HTTP headers) direct the actions of downstream, third-party components like browser caches.

VII. Analyst’s Conclusion: Key Questions for the Case

A core issue will be one of definitional scope: Can claim terms rooted in the context of "policing licensed content," such as "unauthorized copy," be construed to cover the technical state of a "stale" file in a web cache? The case may turn on whether the function of HTTP cache-validation is legally and technically equivalent to the claimed function of determining authorization status.
A key evidentiary question will be one of technological equivalence: Does the accused use of standard HTTP features like "ETags" and content-hashed filenames constitute the specific, integrated "True Name" system for managing and controlling data as described and enabled in the patents' specifications, or is there a fundamental mismatch in the architecture and operation of the accused system versus the patented invention?