5:18-cv-00196
PersonalWeb Tech LLC v. Vend Inc
I. Executive Summary and Procedural Information
- Parties & Counsel:
- Plaintiffs: PersonalWeb Technologies, LLC (Texas) and Level 3 Communications, LLC (Delaware)
- Defendants: Vend Inc. (Delaware) and Vend Limited (New Zealand)
- Plaintiffs’ Counsel: IP Law Group, LLP; MacEiko IP; Sethlaw
- Case Identification: 3:18-cv-00196, N.D. Cal., 01/09/2018
- Venue Allegations: Venue is alleged to be proper in the Northern District of California because Defendant Vend Inc. is a Delaware corporation that maintains a regular and established place of business in the District and has allegedly committed acts of infringement there.
- Core Dispute: Plaintiff alleges that Defendant’s website and associated content delivery architecture infringe five patents related to methods for uniquely identifying, locating, and managing data in distributed computer networks using content-based identifiers.
- Technical Context: The patents relate to foundational technologies for content-addressable storage and content delivery networks (CDNs), which use identifiers derived from the content of data itself (such as cryptographic hashes) to de-duplicate storage, verify data integrity, and manage caching efficiently across networks.
- Key Procedural History: The complaint alleges that the patents-in-suit have been successfully enforced against third parties, resulting in settlements and non-exclusive licenses. It also notes that the last of the asserted patents has expired, and the infringement allegations are directed to the time period before expiration, limiting potential remedies to monetary damages.
Case Timeline
| Date | Event |
|---|---|
| 1995-04-11 | Priority Date for U.S. Patent No. 5,978,791 |
| 1999-11-02 | U.S. Patent No. 5,978,791 Issued |
| 2001-11-15 | Priority Date for U.S. Patent No. 6,928,442 |
| 2004-12-22 | Priority Date for U.S. Patent Nos. 7,802,310, 7,945,544, 8,099,420 |
| 2005-08-09 | U.S. Patent No. 6,928,442 Issued |
| 2010-09-21 | U.S. Patent No. 7,802,310 Issued |
| 2011-05-17 | U.S. Patent No. 7,945,544 Issued |
| 2012-01-17 | U.S. Patent No. 8,099,420 Issued |
| 2018-01-09 | Complaint Filing Date |
II. Technology and Patent(s)-in-Suit Analysis
U.S. Patent No. 5,978,791 - "Data Processing System Using Substantially Unique Identifiers to Identify Data Items, Whereby Identical Data Items Have the Same Identifiers"
- Patent Identification: U.S. Patent No. 5,978,791, "Data Processing System Using Substantially Unique Identifiers to Identify Data Items, Whereby Identical Data Items Have the Same Identifiers," issued November 2, 1999.
The Invention Explained
- Problem Addressed: In expanding computer networks, traditional methods of naming files (e.g., pathnames) are context-dependent, making it difficult to verify data integrity, locate content, and avoid storing redundant copies of the same data under different names or in different locations (Compl. ¶13; ’791 Patent, col. 1:12-2:4).
- The Patented Solution: The invention proposes a system where every "data item" (e.g., a file or a part of a file) is assigned a "substantially unique identifier," or "True Name," generated by applying a cryptographic hash function like MD5 directly to the item's content. This content-based identifier is independent of the data's name or location, allowing any processor in a network to identify, locate, and manage data based solely on what it contains, thereby reducing duplication and simplifying data management (Compl. ¶¶14-17; ’791 Patent, Abstract).
- Technical Importance: This content-addressable storage model is a foundational concept for data de-duplication, efficient caching, and content delivery networks that reduce bandwidth and storage requirements (Compl. ¶11).
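The content-based naming idea summarized above can be illustrated with a minimal Python sketch. The function name `true_name` and the sample paths and contents are hypothetical, and MD5 is used only because the patent names it as an example of a suitable hash function. This is an illustration of the general concept, not code from the patent or from any accused system.

```python
import hashlib


def true_name(data: bytes) -> str:
    """Return a content-derived identifier (hex MD5 digest) for a data item.

    Illustrative only: the identifier depends on the bytes and nothing else.
    """
    return hashlib.md5(data).hexdigest()


# Three hypothetical "files": two share identical content under different paths.
files = {
    "/home/alice/report.txt": b"quarterly figures",
    "/backup/2017/copy-of-report.txt": b"quarterly figures",
    "/home/bob/notes.txt": b"meeting notes",
}

names = {path: true_name(content) for path, content in files.items()}

# Identical content yields the same identifier regardless of pathname.
same = names["/home/alice/report.txt"] == names["/backup/2017/copy-of-report.txt"]
# Different content yields a different identifier.
different = names["/home/alice/report.txt"] != names["/home/bob/notes.txt"]
print(same, different)
```

Because the identifier ignores the pathname, a system keyed on it can detect that the two report files are duplicates without comparing their names or locations.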
Key Claims at a Glance
- The complaint asserts independent claim 38 and dependent claim 42 (Compl. ¶37).
- The essential elements of independent claim 38, a method of locating a data item, include:
- (A) Determining a substantially unique identifier for a data item based on all of, and only, the data in that item.
- (B) Requesting the data item by sending its identifier from a requester location to one or more provider locations.
- (C) At a provider location: (a) determining and maintaining a set of identifiers for the data items it holds; (b) using that set to determine if the requested data item is present; and (c) if present, notifying the requester that it has a copy.
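The three elements of claim 38 as summarized above can be sketched in a few lines of Python. All names here (`identifier`, `Provider`, `has_copy`, the provider labels, and the sample contents) are hypothetical, and the mapping of code to claim elements in the comments is an editorial illustration under those assumptions, not a claim construction.

```python
import hashlib


def identifier(data: bytes) -> str:
    # Element (A): identifier computed from the item's bytes and nothing else.
    return hashlib.md5(data).hexdigest()


class Provider:
    """Hypothetical provider location holding data items."""

    def __init__(self, items):
        # Element (C)(a): determine and maintain a set of identifiers.
        self.ids = {identifier(item) for item in items}

    def has_copy(self, requested_id: str) -> bool:
        # Elements (C)(b) and (C)(c): consult the set and report presence.
        return requested_id in self.ids


providers = {
    "provider-1": Provider([b"logo bytes", b"stylesheet bytes"]),
    "provider-2": Provider([b"script bytes"]),
}

# Element (B): the requester sends only the identifier, not a filename.
wanted = identifier(b"script bytes")
holders = [name for name, p in providers.items() if p.has_copy(wanted)]
print(holders)
```

In this sketch the requester learns which provider holds the item, which is the "locating" step that the claim preamble describes.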
U.S. Patent No. 6,928,442 - "Enforcement and Policing of Licensed Content Using Content-Based Identifiers"
- Patent Identification: U.S. Patent No. 6,928,442, "Enforcement and Policing of Licensed Content Using Content-Based Identifiers," issued August 9, 2005.
The Invention Explained
- Problem Addressed: In distributed systems where files are copied across numerous computers, it is difficult to determine whether a given copy of a data file is licensed or authorized for use (Compl. ¶52).
- The Patented Solution: The invention describes a method where a content-based name (identifier) is obtained for a data file. This name is then used to determine not only if a copy of the file is present on a computer, but also whether that copy is "an unauthorized copy or an unlicensed copy" (Compl. ¶¶49-52; ’442 Patent, Abstract).
- Technical Importance: This approach provides a mechanism for enforcing content licensing and access policies within a distributed network, independent of a file's location or user-assigned name.
Key Claims at a Glance
- The complaint asserts claims 10 and 11 (Compl. ¶47).
- The essential elements of independent claim 10, a method in a system with distributed files, include:
- Obtaining a name for a data file, where the name is based at least in part on a function of the file's data content.
- Using that name to determine whether a copy of the data file is present on at least one computer in the system.
- Determining whether a present copy of the data file is an unauthorized or unlicensed copy.
U.S. Patent No. 7,802,310 - "Controlling Access to Data in a Data Processing System"
- Patent Identification: U.S. Patent No. 7,802,310, "Controlling Access to Data in a Data Processing System," issued September 21, 2010.
- Technology Synopsis: The patent describes a system for controlling data access. A first computer receives a request from a second computer that includes a content-dependent name for a data item; the first computer then compares this name to a plurality of values to determine if access is authorized and, if so, allows the data item to be provided (Compl. ¶¶58-59).
- Asserted Claims: 20, 69, and 71 (Compl. ¶56).
- Accused Features: The complaint alleges infringement by Defendant's system of using E-Tags as content-dependent names in HTTP CONDITIONAL GET requests, where upstream servers compare the received E-Tag to a list of stored E-Tag values to determine whether access to the cached content is still authorized (Compl. ¶¶58-59).
U.S. Patent No. 7,945,544 - "Similarity-Based Access Control of Data in a Data Processing System"
- Patent Identification: U.S. Patent No. 7,945,544, "Similarity-Based Access Control of Data in a Data Processing System," issued May 17, 2011.
- Technology Synopsis: The patent discloses a method for creating a hierarchical content-based identifier. A "digital key" for a file (e.g., a webpage index file) is determined by a function of the content-based "part values" of its constituent parts (e.g., asset files). This digital key is then used as a search key to determine if content has changed (Compl. ¶¶66, 68).
- Asserted Claims: 46, 48, 49, 52, 55, and 56 (Compl. ¶63).
- Accused Features: Defendant's system is accused of generating an E-Tag (the "digital key") for a webpage's index file based on the contents of that file, which includes URIs of asset files. This E-Tag is then allegedly used as a "search key" in "IF-NONE-MATCH" requests to check for content changes (Compl. ¶¶66, 68).
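The hierarchical identifier described in the technology synopsis can be sketched as follows. The function names `part_value` and `digital_key`, the use of SHA-256, and the choice to hash the concatenated part values are all assumptions made for illustration; the patent and the complaint may describe a different function.

```python
import hashlib
from typing import Iterable


def part_value(data: bytes) -> str:
    # Hypothetical content-based value for one constituent part.
    return hashlib.sha256(data).hexdigest()


def digital_key(parts: Iterable[bytes]) -> str:
    # Hypothetical key for the whole: a hash over the ordered part values.
    joined = "".join(part_value(p) for p in parts)
    return hashlib.sha256(joined.encode("ascii")).hexdigest()


# Two versions of a page that differ in a single asset.
key_v1 = digital_key([b"logo v1", b"styles v1"])
key_v2 = digital_key([b"logo v1", b"styles v2"])
key_v1_again = digital_key([b"logo v1", b"styles v1"])

# A change in any part changes the key; unchanged parts reproduce it.
print(key_v1 != key_v2, key_v1 == key_v1_again)
```

A key built this way lets a single comparison reveal whether any constituent part has changed, which is the property the synopsis attributes to the "digital key."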
U.S. Patent No. 8,099,420 - "Accessing Data in a Data Processing System"
- Patent Identification: U.S. Patent No. 8,099,420, "Accessing Data in a Data Processing System," issued January 17, 2012.
- Technology Synopsis: The patent covers a system that determines one or more content-dependent digital identifiers for a data item. It then selectively permits the data item to be accessed based on whether at least one of the identifiers corresponds to an entry in one or more databases of authorized identifiers (Compl. ¶77).
- Asserted Claims: 25, 26, 27, 29, 30, 32-36, and 166 (Compl. ¶73).
- Accused Features: The accused system allegedly uses content-based E-Tags and fingerprints for webpage files. These identifiers are compared against databases of E-Tag values on webpage servers to "selectively determine whether the requesting computer could access the file content it already had or must access newly received authorized content" (Compl. ¶¶76-78).
III. The Accused Instrumentality
- Product Identification: The accused instrumentality is the website located at vendhq.com and its underlying content delivery architecture (Compl. ¶20).
- Functionality and Market Context: The complaint alleges the accused system uses a Ruby on Rails architecture hosted on Amazon S3 (Compl. ¶¶22-23). Its relevant functionality is alleged to include generating content-based identifiers for its website files in two ways: (1) "fingerprints" created by a hash function that are appended to file URLs to form a URI, and (2) E-Tag values generated for files upon upload (Compl. ¶¶22-24, 34). The system is alleged to leverage this architecture to manage a distributed caching system. When a user's browser requests content, it sends an HTTP CONDITIONAL GET request containing the E-Tag for the cached content in an "IF-NONE-MATCH" header. An upstream server (either an intermediate cache or the origin server) compares this E-Tag to its stored value. A match results in an HTTP 304 (Not Modified) response, instructing the browser to use its local copy. A mismatch results in an HTTP 200 (OK) response containing the new content and a new E-Tag (Compl. ¶¶27-29). This process allegedly reduces bandwidth and improves efficiency by serving content from the nearest cache and only transferring files whose content has changed (Compl. ¶21).
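The cache-validation exchange alleged in the complaint follows the general pattern of an HTTP conditional GET, which can be simulated without a network as shown below. This is a simplified sketch: the function names `make_etag` and `handle_get` are hypothetical, the E-Tag is assumed to be a quoted MD5 digest of the body, and real servers also handle lists of validators, the `*` wildcard, and weak validators, none of which is modeled here. It does not represent how Defendant's system or Amazon S3 actually computes E-Tags.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    # Assumption: the E-Tag is a quoted hex digest of the response body.
    return '"' + hashlib.md5(body).hexdigest() + '"'


def handle_get(current_body: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes, str]:
    """Simulate an upstream server answering a (conditional) GET."""
    current_etag = make_etag(current_body)
    if if_none_match == current_etag:
        # Match: 304 Not Modified with no body; the client reuses its cached copy.
        return 304, b"", current_etag
    # No validator sent, or mismatch: 200 OK with current content and its E-Tag.
    return 200, current_body, current_etag


# First visit: nothing is cached, so no If-None-Match header is sent.
status1, body1, etag1 = handle_get(b"<html>v1</html>", None)
# Revisit with unchanged content: the cached E-Tag still matches.
status2, body2, _ = handle_get(b"<html>v1</html>", etag1)
# Revisit after the content changed: the cached E-Tag no longer matches.
status3, body3, etag3 = handle_get(b"<html>v2</html>", etag1)
print(status1, status2, status3)
```

The 304 branch transfers no body, which is the bandwidth saving the complaint attributes to the accused architecture.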
No probative visual evidence provided in complaint.
IV. Analysis of Infringement Allegations
5,978,791 Patent Infringement Allegations
| Claim Element (from Independent Claim 38) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| (A) determining a substantially unique identifier for the data item, the identifier depending on and being determined using all of the data in the data item and only the data in the data item... | Defendant's system calculates hash fingerprints and E-Tags based on the contents of its website files (e.g., asset files). An identical sequence of bits in a file results in an identical identifier. | ¶39 | col. 13:10-14 |
| (B) requesting the particular data item by sending the data identifier of the data item from the requester location to at least one location of a plurality of provider locations in the system. | A user's browser (requester location) sends an HTTP CONDITIONAL GET request with an "IF-NONE-MATCH" header containing the E-Tag (the data identifier) to an intermediate cache or origin server (provider locations). | ¶40 | col. 15:11-20 |
| (C) on at least some of the provider locations, (a) ... (i) determining a substantially unique identifier for the data item ... and (ii) making and maintaining a set of identifiers of data items. | Defendant's origin and intermediate cache servers are alleged to store data items (website content) and maintain a set of their corresponding identifiers (E-Tags and fingerprints appended to URIs) in a database or table. | ¶41 | col. 7:26-34 |
| (b) determining, based on the set of identifiers, whether the data item corresponding to the requested data identifier is present at the provider location. | An upstream server that receives the CONDITIONAL GET request compares the E-Tag from the request header against its database of stored E-Tag values to determine if there is a match. | ¶42 | col. 13:46-51 |
| (c) based on the determining, when the provider location determines that the particular data item is present ... notifying the requestor that the provider has a copy of the given data item. | When a match is found between the requested E-Tag and a stored E-Tag, the server sends an HTTP 304 (Not Modified) message, which allegedly notifies the requesting browser that the server has a copy of the same file content. | ¶43 | col. 16:6-10 |
6,928,442 Patent Infringement Allegations
| Claim Element (from Independent Claim 10) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| a method, in a system in which a plurality of files are distributed across a plurality of computers... | The accused system distributes website files (e.g., index and asset files) across origin servers, intermediate cache servers, and end-point browser caches. | ¶48 | col. 3:41-46 |
| obtaining a name for a data file, the name being based at least in part on a given function of the data... | Defendant's system obtains E-Tags and fingerprints for its website files using an alleged hash function, with the resulting name being based on the file's contents. | ¶49 | col. 12:55-62 |
| determining, using at least the name, whether a copy of the data file is present on at least one of said computers. | An origin or intermediate cache server receives a CONDITIONAL GET request containing an E-Tag (the "name") and compares it to its stored E-Tags to determine if a copy of the content associated with that E-Tag is present. | ¶¶50-51 | col. 14:1-5 |
| determining whether a copy of the data file that is present on at least one of said computers is an unauthorized copy or an unlicensed copy of the data file. | If the E-Tag in the request matches a stored E-Tag, the server determines the copy of the file at the downstream/browser cache is an "authorized or licensed copy." If there is no match, it determines the copy is an "unauthorized or unlicensed copy" that needs to be replaced. | ¶52 | col. 16:45-50 |
- Identified Points of Contention:
- Scope Questions: A central issue may be whether claims directed to a proprietary data management and location system can be construed to cover the use of the HTTP "ETag"/"If-None-Match" mechanism, which is a standardized and widely adopted protocol for web caching. The infringement theory appears to map claim terms onto the routine operation of this public internet standard. Another question is whether the claimed "method of locating" a data item reads on a process that primarily serves to determine if a known, cached copy of a data item is still fresh.
- Technical Questions: What is the exact function used by Defendant's system, or by its service provider Amazon S3, to generate E-Tags? The complaint alleges it is a "message digest hash function" based on file content (Compl. ¶¶17-18, 22). The infringement case may depend on whether Plaintiff can prove that the specific function used meets all claim limitations, such as being based on "all of the data in the data item and only the data in the data item" as required by claim 38 of the ’791 patent.
V. Key Claim Terms for Construction
The Term: "substantially unique identifier" (from '791 Patent, claim 38)
Context and Importance: This term is the foundation of the '791 patent's invention. Its construction is critical because the infringement allegation equates this term with HTTP E-Tags and hashed "fingerprints" (Compl. ¶39). Practitioners may focus on this term because the defense could argue that standard E-Tags, particularly "weak" E-Tags, are not "substantially unique" in the manner required by the patent, which emphasizes cryptographic properties.
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The patent specification suggests that "substantially unique" is a probabilistic standard, referring to message digest functions like MD5 and noting that the probability of a collision is "arbitrarily small" but not zero (’791 Patent, col. 13:40-45). This may support reading the term on identifiers that are unique in practice for a given system, even if not globally unique.
- Evidence for a Narrower Interpretation: The claim requires the identifier to depend on "all of the data in the data item and only the data in the data item." An E-Tag generated by a server could potentially be based on other metadata (like a timestamp or inode number), which may support a narrower construction that excludes some forms of E-Tags. The patent title itself states, "Identical Data Items Have the Same Identifiers," suggesting a strict content-based linkage.
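The distinction between a content-only identifier and a metadata-derived validator can be made concrete with a short sketch. The function names, the size-and-timestamp format, and the sample timestamps are hypothetical; the complaint does not allege that the accused system generates E-Tags from metadata, and this example only shows why the question could matter for the "only the data" limitation.

```python
import hashlib


def content_tag(data: bytes) -> str:
    # Depends on the bytes of the item and nothing else.
    return hashlib.md5(data).hexdigest()


def metadata_tag(size: int, mtime: int) -> str:
    # Hypothetical metadata-derived validator: file size and modification time in hex.
    return f"{size:x}-{mtime:x}"


data = b"identical bytes"
# Two copies of the same content written at different times (made-up timestamps).
copy_a = {"data": data, "mtime": 1_500_000_000}
copy_b = {"data": data, "mtime": 1_500_000_600}

content_equal = content_tag(copy_a["data"]) == content_tag(copy_b["data"])
metadata_equal = (
    metadata_tag(len(copy_a["data"]), copy_a["mtime"])
    == metadata_tag(len(copy_b["data"]), copy_b["mtime"])
)
print(content_equal, metadata_equal)
```

Under these assumptions the two copies share a content tag but not a metadata tag, so a metadata-derived validator would not give identical data items the same identifier.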
The Term: "unauthorized copy or an unlicensed copy" (from '442 Patent, claim 10)
Context and Importance: This term connects the technical act of matching (or not matching) a content-based identifier to a legal or policy status. The infringement theory hinges on construing a standard cache validation check (an HTTP 304 vs. 200 response) as a determination of authorization status (Compl. ¶52).
Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The complaint's narrative frames the HTTP 304 "Not Modified" response as a signal that the browser was "reauthorized to again use the previously cached asset files" and the HTTP 200 response as indicating it was "not authorized to use" the old content (Compl. ¶¶28-29). Plaintiff may argue that in the context of controlling content distribution, any signal that permits or denies use of a specific version of a file is a form of authorization determination.
- Evidence for a Narrower Interpretation: The patent is titled "Enforcement and Policing of Licensed Content." A defendant may argue that this context implies a determination related to legal rights, licenses, or explicit permissions, rather than a purely technical determination of data freshness. An HTTP 304 response merely indicates the content is unchanged; it does not necessarily convey any information about the legal license status of that content.
VI. Other Allegations
- Indirect Infringement: The complaint does not provide sufficient detail for analysis of indirect infringement. The claims for relief appear to rest on direct infringement and cite 35 U.S.C. § 271 without specifying subsection (a), (b), or (c).
- Willful Infringement: The complaint does not contain allegations of willful infringement or pre-suit knowledge of the patents.
VII. Analyst’s Conclusion: Key Questions for the Case
- A core issue will be one of definitional scope: can patent claims for a data identification and management system be construed to cover the standardized and ubiquitous use of HTTP E-Tags and CONDITIONAL GET requests, a fundamental mechanism for web caching on the public internet? The case may turn on whether applying the patent claims to this standard functionality is a permissible interpretation of the claim language or an improper attempt to capture a public domain tool.
- A key evidentiary question will be one of functional equivalence: does the Defendant's system of responding to cache-validation requests (sending an HTTP 304 "Not Modified" or HTTP 200 "OK") perform the specific function required by the claims of determining whether a copy of a file is "unauthorized or unlicensed," or is there a fundamental mismatch in technical operation between a data freshness check and the claimed method of policy enforcement?