1:18-cv-00206

PersonalWeb Tech LLC v. Kickstarter PBC

I. Executive Summary and Procedural Information

Parties & Counsel:
- Plaintiff: PersonalWeb Technologies, LLC (Texas) and Level 3 Communications, LLC (Delaware)
- Defendant: Kickstarter, PBC (Delaware)
- Plaintiff’s Counsel: Kent, Beatty & Gordon, LLP
Case Identification: 1:18-cv-00206, E.D.N.Y., 01/11/2018
Venue Allegations: Venue is alleged to be proper based on Defendant's regular and established place of business in the district, its business dealings in the district, and the commission of alleged acts of infringement within the district.
Core Dispute: Plaintiff alleges that Defendant’s website and associated content delivery system infringe five patents related to using content-based identifiers to manage, locate, and efficiently retrieve data in a distributed network.
Technical Context: The technology concerns generating unique digital fingerprints for data items based on their content, a foundational technique for data de-duplication, caching, and content delivery in modern cloud computing environments.
Key Procedural History: The complaint notes that the patents-in-suit have been the subject of prior enforcement actions resulting in settlements and non-exclusive licenses. It also states that the last of the asserted patents has expired, and the infringement allegations are directed to the time period before expiration, framing this as a case for past damages.

Case Timeline

Date	Event
1995-04-11	Priority Date for all Patents-in-Suit
1999-11-02	U.S. Patent No. 5,978,791 Issued
2005-08-09	U.S. Patent No. 6,928,442 Issued
2010-09-21	U.S. Patent No. 7,802,310 Issued
2011-05-17	U.S. Patent No. 7,945,544 Issued
2012-01-17	U.S. Patent No. 8,099,420 Issued
2018-01-11	Complaint Filing Date

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 5,978,791 - "Data Processing System Using Substantially Unique Identifiers to Identify Data Items, Whereby Identical Data Items Have the Same Identifiers"

The Invention Explained

Problem Addressed: In traditional data processing systems, data is identified by names (e.g., file names) that are context-dependent and bear no relation to the data's actual content. This creates problems in large, distributed networks where the same name can refer to different data in different locations, and different names can refer to identical data, leading to data duplication and difficulty in verifying content integrity (’791 Patent, col. 1:12–2:10).
The Patented Solution: The invention proposes replacing context-dependent names with a "substantially unique identifier," dubbed a "True Name," which is generated by applying a cryptographic hash function (such as MD5) to the content of a data item. This identifier depends "on all of the data in the data item and only on the data in the data item," allowing any data item to be uniquely identified, located, and verified based solely on its content, irrespective of its name or location (’791 Patent, Abstract; col. 13:50–14:2).
Technical Importance: This content-addressable storage approach provided a robust method for de-duplication and efficient data management in increasingly large and distributed computing networks (Compl. ¶ 12).
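
The content-based identifier concept summarized above can be illustrated with a minimal Python sketch. It assumes MD5 (the example hash the patent names) and uses hypothetical file names and contents; it is not the patent's implementation and is not drawn from the accused system.

```python
import hashlib

def content_identifier(data: bytes) -> str:
    """Return a hex digest computed from the bytes of the data item alone."""
    return hashlib.md5(data).hexdigest()

# Hypothetical data items: two different names, identical content, plus one other item.
files = {
    "report_final.txt": b"quarterly numbers",
    "copy_of_report.txt": b"quarterly numbers",
    "notes.txt": b"something else",
}

# The identifier ignores the name entirely, so identical content yields identical identifiers.
ids = {name: content_identifier(body) for name, body in files.items()}

# De-duplication: keep one stored copy per identifier.
store = {}
for name, body in files.items():
    store.setdefault(ids[name], body)
```

In this sketch the two identically filled files share one identifier, so the store ends up holding two entries for three file names.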

Key Claims at a Glance

Independent claim 38 is asserted (Compl. ¶ 36).
Essential elements of claim 38 (a method of locating a data item) include:
- (A) Determining a substantially unique identifier for the data item, where the identifier depends on all of and only the data in the data item.
- (B) Requesting the data item by sending its identifier from a requester location to one of a plurality of provider locations.
- (C) At the provider location: (a) determining and maintaining a set of identifiers for its own data items; (b) determining whether the requested data item is present by checking for the requested identifier in that set; and (c) if present, notifying the requester that the provider has a copy.
The complaint also asserts claim 42, a dependent claim (Compl. ¶ 36).
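
The elements of claim 38 listed above can be loosely sketched in Python as follows. The `Provider` class, the `locate` function, the provider names, and the choice of SHA-256 are all assumptions made for illustration. The sketch simplifies the claim language and should not be read as a claim construction.

```python
import hashlib

def identifier(data: bytes) -> str:
    # Element (A): identifier derived from the item's bytes and nothing else.
    return hashlib.sha256(data).hexdigest()

class Provider:
    """Hypothetical provider location holding some data items."""

    def __init__(self, name, items):
        self.name = name
        # Element (C)(a): maintain a set of identifiers for the provider's own items.
        self.known_ids = {identifier(item) for item in items}

    def has(self, requested_id):
        # Element (C)(b): presence is decided by a lookup in the identifier set.
        return requested_id in self.known_ids

def locate(requested_id, providers):
    # Elements (B) and (C)(c): send the identifier to each provider and
    # collect the names of providers that report holding a copy.
    return [p.name for p in providers if p.has(requested_id)]

providers = [
    Provider("east", [b"alpha", b"beta"]),
    Provider("west", [b"beta"]),
]
found = locate(identifier(b"beta"), providers)
missing = locate(identifier(b"gamma"), providers)
```

Here `found` lists both hypothetical providers and `missing` is empty, because no provider holds the third item.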

U.S. Patent No. 6,928,442 - "Enforcement and Policing of Licensed Content Using Content-Based Identifiers"

The Invention Explained

Problem Addressed: This patent builds on the "True Name" concept to address the challenge of managing and policing licensed content in a distributed system where multiple, potentially unauthorized copies of files may exist across different computers (’442 Patent, Abstract).
The Patented Solution: The invention provides a method for using a content-based name (like a "True Name") to check a distributed system for copies of a data file. By comparing the content-based name of a file on a local computer with a central registry or other computers, the system can determine if that copy is authorized or licensed, enabling enforcement actions or content management (’442 Patent, Abstract; col. 2:27-38).
Technical Importance: The technology provides a mechanism for digital rights management and content policing in networked environments by linking access rights to the content itself rather than to a specific file instance or location (Compl. ¶ 45).

Key Claims at a Glance

Independent claim 10 is asserted (Compl. ¶ 46).
Essential elements of claim 10 (a method in a system of distributed files) include:
- Obtaining a name for a data file based at least in part on a function of the data contents of the file.
- Using at least the name to determine whether a copy of the data file is present on at least one of the computers.
- Determining whether that present copy is an unauthorized or unlicensed copy of the data file.
The complaint also asserts claim 11, a dependent claim (Compl. ¶ 46).

U.S. Patent No. 7,802,310 - "Controlling Access to Data in a Data Processing System"

Technology Synopsis: This patent describes a system for controlling access to data based on content-dependent names. A first computer receives a request from a second computer that includes a content-dependent name (e.g., a hash) for a data item. The first computer compares this name to a plurality of values to determine if access is authorized and responds accordingly (’310 Patent, Abstract).
Asserted Claims: Independent claim 69 is asserted, along with claims 20 and 71 (Compl. ¶ 54).
Accused Features: The accused features include Defendant's upstream servers (first computer) receiving CONDITIONAL GET requests with E-Tags (content-dependent names) from user browsers (second computer) and comparing the E-Tags to a list of stored values to determine whether to authorize access to cached content (Compl. ¶¶ 56–57).

U.S. Patent No. 7,945,544 - "Similarity-Based Access Control of Data in a Data Processing System"

Technology Synopsis: This patent details a method for creating a "digital key" for a file by first generating "part values" for constituent parts of the file using a first function (e.g., hash), and then using a second function on those part values. A search key is determined using the same process, and the system attempts to match the search key against a database of digital keys to provide information about the corresponding file (’544 Patent, Abstract).
Asserted Claims: Independent claim 46 is asserted, along with dependent claims 48, 49, 52, 55, and 56 (Compl. ¶ 61).
Accused Features: The complaint alleges that the E-Tag for a webpage's index file is a "digital key." It is allegedly generated by applying a hash function to the index file's contents, which themselves consist of URIs of asset files that include content fingerprints ("part values"). This E-Tag is used as a "search key" in CONDITIONAL GET requests to match against a database of E-Tags on the server (Compl. ¶¶ 63–68).
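
The two-stage "digital key" described in the synopsis can be sketched as below. SHA-256 for both functions, the concatenation step, and the sample parts and database entry are assumptions for illustration; neither the patent abstract nor the complaint is quoted here as specifying them.

```python
import hashlib

def part_value(part: bytes) -> bytes:
    # First function: one value per constituent part of the file.
    return hashlib.sha256(part).digest()

def digital_key(parts: list[bytes]) -> str:
    # Second function: applied to the part values rather than to the raw parts.
    joined = b"".join(part_value(p) for p in parts)
    return hashlib.sha256(joined).hexdigest()

# Hypothetical database mapping digital keys to information about a file.
database = {digital_key([b"header", b"body", b"footer"]): "page-v1"}

def lookup(parts: list[bytes]):
    # The search key is computed by the same process, then matched against the database.
    return database.get(digital_key(parts))
```

Under these assumptions, `lookup([b"header", b"body", b"footer"])` returns the stored entry, while changing any single part produces a different key and no match.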

U.S. Patent No. 8,099,420 - "Accessing Data in a Data Processing System"

Technology Synopsis: This patent describes a system that uses content-dependent digital identifiers to selectively permit access to data items. The system determines whether a content-dependent identifier corresponds to an entry in a database to resolve whether access is authorized. This allows the system to ensure that downstream caches only access authorized file content (’420 Patent, Abstract).
Asserted Claims: Independent claim 166 is asserted, along with claims 25-27, 29, 30, and 32-36 (Compl. ¶ 73).
Accused Features: Defendant's system allegedly uses E-Tags and fingerprints as "content-dependent digital identifiers." Webpage servers with databases of E-Tag values compare received E-Tags from CONDITIONAL GET messages to the database to selectively determine whether a requesting computer can access its cached content or must receive newly authorized content (Compl. ¶¶ 76–78).

III. The Accused Instrumentality

Product Identification

The website located at kickstarter.com and its associated content delivery system (Compl. ¶ 19).

Functionality and Market Context

The complaint alleges the accused instrumentality uses a Ruby on Rails architecture to compile webpage files (index and asset files) and generate content "fingerprints," which are appended to the files' URLs (Compl. ¶ 21). These files are uploaded as objects to an Amazon S3 host system, which generates an associated "E-Tag" value for each object by applying a hash function to its content (Compl. ¶¶ 22–23). When a user's browser requests a webpage, it sends an HTTP CONDITIONAL GET request with an "IF-NONE-MATCH" header containing the E-Tag for the cached content. Responding servers (origination or intermediate caches) check this E-Tag against their own records. If there is a match, the server sends an HTTP 304 "Not Modified" response, authorizing the browser to use its cached version; if not, it sends an HTTP 200 response with the new content and a new E-Tag (Compl. ¶¶ 26–28). This process is alleged to reduce bandwidth usage by ensuring that only content that has changed is transmitted over the network (Compl. ¶ 20).
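
The request and response flow alleged above can be modeled with a short Python sketch. The `Server` class, the path, and the MD5-based tag are hypothetical stand-ins, not Defendant's code. Real HTTP validation also involves details omitted here, such as weak validators and lists of multiple tags.

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Assumed for illustration: the tag is a quoted MD5 digest of the content.
    return '"' + hashlib.md5(body).hexdigest() + '"'

class Server:
    """Hypothetical origin or intermediate cache server."""

    def __init__(self):
        self.content = {}  # path -> current body

    def publish(self, path, body):
        self.content[path] = body

    def get(self, path, if_none_match=None):
        body = self.content[path]
        etag = make_etag(body)
        if if_none_match == etag:
            # Match: 304 Not Modified, no body sent, client reuses its cached copy.
            return 304, etag, b""
        # No match (or no tag supplied): 200 with the current content and its tag.
        return 200, etag, body

server = Server()
server.publish("/index.html", b"version 1")
first = server.get("/index.html")                            # initial fetch
second = server.get("/index.html", if_none_match=first[1])   # revalidation
server.publish("/index.html", b"version 2")
third = server.get("/index.html", if_none_match=first[1])    # stale tag
```

In this model the second request yields a 304 with an empty body, and the third yields a 200 with new content and a new tag. That is the bandwidth-saving behavior the complaint describes.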

No probative visual evidence provided in complaint.

IV. Analysis of Infringement Allegations

’791 Patent Infringement Allegations

Claim Element (from Independent Claim 38)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
(A) determining a substantially unique identifier for the data item, the identifier depending on and being determined using all of the data in the data item and only the data in the data item...	Defendant's system determines a "substantially unique identifier" by calculating a hash fingerprint (during compilation) and an E-Tag (upon upload to S3) based on the file's contents.	¶38	col. 13:50-14:2
(B) requesting the particular data item by sending the data identifier of the data item from the requester location to at least one location of a plurality of provider locations in the system.	A user's browser (requester location) sends a CONDITIONAL GET request with an IF-NONE-MATCH header containing the E-Tag (data identifier) to an upstream cache or origination server (provider location).	¶39	col. 15:11-20
(C) on at least some of the provider locations, (a) ... (ii) making and maintaining a set of identifiers of data items.	Defendant's origination and intermediate cache servers maintain a database or table mapping the URI of each asset/index file to its associated E-Tag, which constitutes a set of identifiers.	¶40	col. 7:45-8:12
(b) determining, based on the set of identifiers, whether the data item corresponding to the requested data identifier is present at the provider location.	The responding server compares the E-Tag received in the CONDITIONAL GET request to the E-Tag values in its database to determine if there is a match, thereby determining if the content is present.	¶41	col. 4:26-32
(c) based on the determining, when the provider location determines that the particular data item is present... notifying the requestor that the provider has a copy of the given data item.	If a match is found, the server issues an HTTP 304 message, notifying the requesting browser that the same file content is present and that it is authorized to use its cached copy.	¶42	col. 14:1-14

’442 Patent Infringement Allegations

Claim Element (from Independent Claim 10)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
a method, in a system in which a plurality of files are distributed across a plurality of computers.	Defendant's system distributes webpage files across a plurality of computers including origin servers, intermediate cache servers, and end-point browser caches.	¶47	col. 3:55-58
obtaining a name for a data file, the name being based at least in part on a given function of the data, wherein the data used by the function comprises the contents of the particular file.	Defendant's system obtains E-Tags and fingerprints for its index and asset files by using a hash function based on the file contents.	¶48	col. 2:27-38
determining, using at least the name, whether a copy of the data file is present on at least one of said computers.	A responding server (origination or intermediate cache) receives a CONDITIONAL GET request with an E-Tag and compares it to its own list of E-Tags to determine if a copy of the content is present.	¶49	col. 4:26-38
determining whether a copy of the data file that is present on ... said computers is an unauthorized copy or an unlicensed copy of the data file.	If the E-Tag in the request matches an E-Tag at the server, the server determines the copy at the downstream cache is authorized. If there is no match, it determines the copy is unauthorized and sends new content.	¶50	col. 4:39-49

Identified Points of Contention:
- Scope Questions: A central question may be whether standard web caching protocols, such as HTTP's use of E-Tags for cache validation, fall within the scope of the patents' claims, which describe a more comprehensive data processing and file management system. The defense could argue that an E-Tag is used for content validation, not for "locating" a data item as recited in claim 38 of the ’791 patent.
- Technical Questions: What evidence does the complaint provide that the E-Tag generated by Amazon S3 or the fingerprint generated by Ruby on Rails is "determined using all of the data in the data item and only the data in the data item," as required by the claims? While plausible for certain hashing algorithms, this is a technical assertion that will require factual proof, as different implementations of E-Tags can exist.
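
The technical question above turns on how a given tag is computed. The sketch below contrasts two hypothetical schemes: one derived only from content, and one derived from file metadata, a style some web server configurations use for validators. It makes no claim about how Amazon S3 or Ruby on Rails compute their values; the function names and timestamps are invented for illustration.

```python
import hashlib

def content_only_tag(body: bytes) -> str:
    # Depends on all of the bytes in the item and on nothing else.
    return hashlib.md5(body).hexdigest()

def metadata_tag(size: int, mtime: int) -> str:
    # Hypothetical validator built from file size and modification time,
    # so it can change even when the content does not.
    return f"{size:x}-{mtime:x}"

body = b"same bytes"

# Identical content always yields the same content-only tag.
tag_a = content_only_tag(body)
tag_b = content_only_tag(body)

# The same content saved at two different times yields different metadata tags.
meta_1 = metadata_tag(len(body), 1_500_000_000)
meta_2 = metadata_tag(len(body), 1_500_000_600)
```

Only the first scheme would satisfy a requirement that the identifier depend on "all of the data in the data item and only the data in the data item." This is why the actual computation used by the accused system is a matter for factual proof.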

V. Key Claim Terms for Construction

The Term: "substantially unique identifier"
Context and Importance: This term names the central inventive concept of the patent portfolio, so its construction will be critical. Practitioners may focus on whether the term is broad enough to cover any content-based hash used in a network (such as an HTTP E-Tag) or whether it is implicitly limited by the specification to an identifier used within the specifically disclosed file system architecture for locating and de-duplicating files.
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The specification defines a "data item" broadly as any "entity which can be represented by a sequence of bits" (’791 Patent, col. 1:53-59) and states the identifier depends "only on the data itself" (’791 Patent, col. 3:33-35), which may support application to any content-based hash.
- Evidence for a Narrower Interpretation: The specification consistently describes the "True Name" (the embodiment of the identifier) as an integral part of a comprehensive file management system, with data structures like a "True File registry" and processes for "assimilation," suggesting a more specific function than simple cache validation (’791 Patent, col. 7:26-34).
The Term: "locating a particular data item"
Context and Importance: This phrase, from the preamble of claim 38 of the ’791 patent, defines the purpose of the claimed method. Its construction is key to whether the accused cache validation process performs that method. A dispute may arise over whether a server's check for a matching E-Tag to validate a cache constitutes "locating" a file, or whether "locating" requires the more complex search-and-retrieve processes described in the patent.
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: Although the complaint is not itself intrinsic evidence, it alleges that the system is used to "locate and control the distribution of data items" (Compl. ¶ 37). A server checking its own storage for content matching an E-Tag could be framed as a simple form of "locating" that content.
- Evidence for a Narrower Interpretation: The detailed description describes processes like "Locate Remote File," which involves broadcasting requests to a plurality of source processors to find a copy of a file (’791 Patent, col. 16:40-66). This suggests a more active discovery process than the simple match/no-match check of an HTTP conditional GET request.

VI. Other Allegations

Indirect Infringement: The complaint alleges Defendant "controlled" the distribution of its content and "forced" intermediate caches and browsers to perform the claimed steps (e.g., sending CONDITIONAL GET requests with E-Tags) (Compl. ¶¶ 36, 39). These allegations may support a theory of induced infringement, based on the argument that Defendant designed and operates a system that instructs and causes third-party components (browsers, caches) to act in an infringing manner.
Willful Infringement: The complaint does not contain specific factual allegations to support a claim for willful infringement, such as knowledge of the patents prior to the lawsuit.

VII. Analyst’s Conclusion: Key Questions for the Case

A core issue will be one of definitional scope: can the term "substantially unique identifier," which is described in the patents as a "True Name" within a comprehensive file management system, be construed to cover standard, protocol-defined web technologies like HTTP "E-Tags" and framework-generated "fingerprints" used for cache validation?
A key evidentiary question will be one of functional equivalence: does the accused system's process of responding to a conditional GET request perform the same function as the patent's claimed method of "locating a particular data item," or is there a fundamental operational mismatch between a cache validation protocol and the more complex data discovery and retrieval system detailed in the specification?