DCT
5:18-cv-06042
PersonalWeb Tech LLC v. Vimeo Inc
I. Executive Summary and Procedural Information
- Parties & Counsel:
- Plaintiffs: PersonalWeb Technologies, LLC (Texas) and Level 3 Communications, LLC (Delaware)
- Defendant: Vimeo, Inc. (Delaware)
- Plaintiff’s Counsel: Stubbs, Alderton & Markiles, LLP
- Case Identification: 5:18-cv-06042, N.D. Cal., 10/04/2018
- Venue Allegations: The complaint alleges venue is proper because the action was transferred to the Northern District of California by the Judicial Panel on Multidistrict Litigation for consolidated pretrial proceedings. Initial venue was based on Defendant's incorporation in Delaware and its established place of business in the Southern District of New York.
- Core Dispute: Plaintiff alleges that Defendant’s video hosting platform infringes patents related to using content-based identifiers, generated via hash functions, to manage, locate, and control access to data in distributed computer networks.
- Technical Context: The technology addresses data management in large-scale networks by replacing conventional file names with unique identifiers derived from the content itself, a technique used to improve caching efficiency and data integrity.
- Key Procedural History: The complaint notes that the patents-in-suit have expired and that the infringement allegations are directed to the period before expiration. It also states that Plaintiff PersonalWeb has previously enforced its rights against third parties, resulting in settlements and licenses. The case was transferred to the Northern District of California as part of a multidistrict litigation proceeding.
Case Timeline
| Date | Event |
|---|---|
| 1995-04-11 | Earliest Priority Date for all Patents-in-Suit |
| 2005-08-09 | U.S. Patent No. 6,928,442 Issued |
| 2010-09-21 | U.S. Patent No. 7,802,310 Issued |
| 2012-01-17 | U.S. Patent No. 8,099,420 Issued |
| 2018-10-04 | Complaint Filed |
II. Technology and Patent(s)-in-Suit Analysis
U.S. Patent No. 6,928,442 - "Enforcement and Policing of Licensed Content Using Content-Based Identifiers," issued August 9, 2005
The Invention Explained
- Problem Addressed: In expanding computer networks, conventional methods of naming and locating files were inefficient, leading to duplicate data storage and difficulty in verifying that a retrieved file was the correct one, as the file's name had no direct relationship to its content (Compl. ¶¶ 14-15; ’442 Patent, col. 1:13-27).
- The Patented Solution: The invention proposes replacing conventional file names with "substantially unique identifiers" called "True Names," which are generated by applying a cryptographic hash function (such as MD5 or SHA) to the content of the data item itself (Compl. ¶¶ 16-17). This creates an identifier that is independent of the file's location or user-assigned name and is intrinsically tied to its content, allowing any data item to be uniquely identified system-wide based only on its sequence of bits (’442 Patent, col. 3:30-37).
- Technical Importance: This content-based identification approach was designed to reduce network bandwidth and storage requirements by enabling reliable detection of duplicate data and facilitating more efficient data caching and integrity verification across distributed systems (Compl. ¶ 13).
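For illustration only, the content-based naming concept described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patent's implementation: the names `content_name`, `put`, and `verify`, the in-memory `store`, and the use of SHA-256 (standing in for the MD5 or SHA functions mentioned in the complaint) are all hypothetical choices introduced here.

```python
import hashlib

# Hypothetical in-memory store keyed by content-derived identifier.
store = {}


def content_name(data: bytes) -> str:
    """Return an identifier that depends only on the bytes of the data item.

    SHA-256 is an assumption; the complaint refers generally to MD5 or SHA.
    """
    return hashlib.sha256(data).hexdigest()


def put(data: bytes) -> str:
    """Store a data item under its content-derived name and return that name.

    Identical content maps to the same key, so duplicates are stored once.
    """
    name = content_name(data)
    store.setdefault(name, data)
    return name


def verify(name: str, data: bytes) -> bool:
    """Check that retrieved bytes match the name they were requested under."""
    return content_name(data) == name
```

Because the identifier is computed from the bytes alone, storing the same content twice yields a single entry, and a retrieved copy can be checked against the name under which it was requested.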
Key Claims at a Glance
- The complaint asserts independent claim 10 and dependent claim 11 (Compl. ¶ 51).
- Independent Claim 10 includes the essential elements of:
- A method in a system with files distributed across multiple computers.
- Obtaining a name for a data file based on a function of the file's contents.
- Using that name to determine if a copy of the file is present on one of the computers.
- Determining if a present copy of the file is an "unauthorized copy or an unlicensed copy."
U.S. Patent No. 7,802,310 - "Controlling Access to Data in a Data Processing System," issued September 21, 2010
The Invention Explained
- Problem Addressed: The patent addresses the need to control access to and distribution of data in a networked environment, particularly where content may be cached at various locations and authorization to use that content can change (’310 Patent, Abstract).
- The Patented Solution: The invention describes a method where a "first computer" (e.g., an origin server) controls content distribution to another computer (e.g., a browser or cache) in response to a request. The request must include a "content-dependent name" (such as a "True Name" or hash) of the data. Based on this content-dependent name, the first computer determines whether the content is "unauthorized or unlicensed" and either permits or denies the request to provide or access the content (’310 Patent, Abstract; ’310 Patent, col. 42:24-40).
- Technical Importance: The method provides a content-aware access control system that can manage data distribution and enforce authorization policies in a distributed caching environment without relying on traditional location-based or user-based permissions (Compl. ¶ 19).
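As a rough illustration of the access-control idea summarized above, the sketch below shows a "first computer" deciding whether to serve a data item based only on the content-dependent name in a request. It takes no position on how the claims should be construed. The names `BLOCKED_NAMES`, `content_dependent_name`, and `handle_request`, the dictionary-based catalog, and the use of SHA-256 are hypothetical and do not come from the patent or the complaint.

```python
import hashlib

# Hypothetical registry of content-dependent names flagged as
# unauthorized or unlicensed.
BLOCKED_NAMES = set()


def content_dependent_name(data: bytes) -> str:
    """Derive a name from the item's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def handle_request(requested_name: str, catalog: dict) -> tuple:
    """Return (permitted, content) for a request carrying a content-dependent name.

    The decision uses only the name: a flagged name is refused, and any
    other name is served from the catalog if an item is stored under it.
    """
    if requested_name in BLOCKED_NAMES:
        return (False, None)
    return (True, catalog.get(requested_name))
```

In this sketch the decision never consults a file path or a user identity, which mirrors the report's point that the described method does not rely on location-based or user-based permissions.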
Key Claims at a Glance
- The complaint asserts independent claim 20 (Compl. ¶ 59).
- Independent Claim 20 includes the essential elements of:
- A computer-implemented method in a multi-computer system.
- Controlling content distribution from a first computer to another in response to a request from the other computer.
- The request includes a "content-dependent name" of a data item, where the name is generated by a hash or message digest function of the item's data.
- Based on the content-dependent name, the first computer either (A) permits the content to be provided if it is not determined to be unauthorized/unlicensed, or (B) does not permit it to be provided if it is determined to be unauthorized/unlicensed.
U.S. Patent No. 8,099,420 - "Accessing Data in a Data Processing System," issued January 17, 2012
- Technology Synopsis: The patent describes a system for accessing data items in a network. For a given data item, the system determines one or more "content-dependent digital identifiers" based on a function of the item's bits. It then selectively permits the data item to be accessed based on whether the identifier corresponds to an entry in a database, thereby preventing unauthorized access (’420 Patent, Abstract).
- Asserted Claims: Claims 25, 26, 27, 29, 30, 32, 34-36, and 166 are asserted, with claim 166 being independent (Compl. ¶ 66).
- Accused Features: The complaint alleges infringement by Defendant's system of web servers and databases that use content-based ETags to control access to webpage asset files. These ETags are allegedly compared to entries in a database to determine whether a requesting computer is authorized to access a cached version of a file or must receive a new one (Compl. ¶¶ 69-71).
III. The Accused Instrumentality
Product Identification
- The accused instrumentality is Defendant's system and method for operating the website "vimeo.com", specifically its process for distributing webpage content to users (Compl. ¶ 31).
Functionality and Market Context
- The complaint alleges that Defendant's system uses the HTTP protocol's "ETag" mechanism to manage cached content (Compl. ¶ 33). An ETag value, alleged to be a content-based identifier, is generated by applying a hash function to the contents of a webpage asset file (Compl. ¶ 37).
- When a user's browser or an intermediate cache requests an asset file it already has stored, it sends the corresponding ETag in a conditional GET request (Compl. ¶ 43). Defendant's servers allegedly compare the received ETag with the server's current ETag for that file. A match results in an HTTP 304 "Not Modified" response, instructing the browser to use its cached version, while a mismatch results in an HTTP 200 "OK" response with the new file and ETag (Compl. ¶¶ 45-46).
- This process is alleged to reduce bandwidth and server load by avoiding unnecessary retransmission of unchanged files (Compl. ¶ 35). The complaint also alleges that Defendant contracts with Amazon to use its S3 system to store and serve some of these files, with the S3 system generating the ETags on Defendant's behalf (Compl. ¶ 38).
- The complaint provides no probative visual evidence.
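The conditional GET exchange alleged in the complaint can be sketched as follows. This is a simplified model, not Defendant's or Amazon's code: `make_etag` and `respond` are hypothetical names, the ETag is assumed to be a quoted SHA-256 hex digest of the body, and real HTTP servers also handle lists of entity tags, weak validators, and other headers that are omitted here.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build an ETag from the asset's contents.

    Assumption: the tag is the quoted hex digest of the body. The complaint
    alleges only that a hash function is applied to the file contents.
    """
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(current_body: bytes, if_none_match=None) -> tuple:
    """Return (status, headers, body) for a GET request on a single asset.

    if_none_match is the ETag the client sent in a conditional GET, or
    None when the client has no cached copy.
    """
    etag = make_etag(current_body)
    if if_none_match is not None and if_none_match == etag:
        # Tags match: the cached copy is current, so no body is sent.
        return (304, {"ETag": etag}, b"")
    # No tag or a mismatch: send the current file with its ETag.
    return (200, {"ETag": etag}, current_body)
```

A first request returns 200 with the file and its ETag. Repeating the request with that ETag returns 304 while the file is unchanged, and returns 200 with a new ETag once the file's contents change.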
IV. Analysis of Infringement Allegations
’442 Patent Infringement Allegations
| Claim Element (from Independent Claim 10) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| obtaining a name for a data file, the name being based at least in part on a given function of the data... | Defendant allegedly generates or obtains ETags for its asset files by applying a hash function to the contents of the files (Compl. ¶ 53). | ¶ 53 | col. 41:15-24 |
| determining, using at least the name, whether a copy of the data file is present on at least one of said computers... | Defendant’s origin or intermediate cache servers allegedly receive a conditional GET request with an ETag and compare that ETag to the ETag stored for the requested file URI to determine if content with that ETag is present (Compl. ¶ 54). | ¶ 54 | col. 42:25-30 |
| determining whether a copy of the data file that is present... is an unauthorized copy or an unlicensed copy of the data file. | If the ETags match, the server allegedly determines the downstream copy is "authorized." If there is no match, it allegedly determines the copy is "unauthorized" and sends the new, authorized content (Compl. ¶ 55). | ¶ 55 | col. 42:25-30 |
’310 Patent Infringement Allegations
| Claim Element (from Independent Claim 20) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| controlling distribution of content from a first computer to at least one other computer, in response to a request... the request including at least a content-dependent name of a particular data item... | An upstream server ("first computer") allegedly receives a conditional GET request from a downstream browser or cache ("other computer"). The request includes an ETag, which is alleged to be the "content-dependent name" based on a hash of the data item's contents (Compl. ¶ 61). | ¶ 61 | col. 40:1-8 |
| based at least in part on said content-dependent name... the first device (A) permitting the content to be provided... if it is not determined that the content is unauthorized or unlicensed, otherwise, (B)... not permitting... | The upstream server allegedly compares the received ETag with its stored ETag. If they match, it allegedly permits access by sending an HTTP 304 response. If they do not match, it allegedly determines the content is no longer authorized and sends an HTTP 200 response with the new content (Compl. ¶ 62). | ¶ 62 | col. 40:8-20 |
Identified Points of Contention
- Scope Questions: A central question may be whether an HTTP "ETag," a standard entity tag for web cache validation, constitutes a "name for a data file" or "content-dependent name" as those terms are used in the patents. The defense may argue that an ETag is a cache validator, not a file "name" intended to replace a conventional file system's naming structure as described in the patent specifications.
- Technical Questions: The analysis will likely focus on whether the HTTP 304/200 response mechanism performs the function of determining if a file is "unauthorized or unlicensed." Does a mismatch indicating stale cached data (a technical state) equate to a determination of an "unauthorized" status (a permissions state), as required by the claims? The complaint alleges an HTTP 304 response "authoriz[es]" the use of a cached file, a characterization that may be contested (Compl. ¶ 62).
V. Key Claim Terms for Construction
"name for a data file, the name being based at least in part on a given function of the data" ('442 Patent) / "content-dependent name" ('310 Patent)
- Context and Importance: This term is the core of the asserted claims. The plaintiff's case relies on construing this term to read on the HTTP "ETag" values used by the accused system. The outcome of the case may turn on whether an ETag, a standard part of the HTTP protocol for cache validation, is considered a "name" in the manner contemplated by the patents.
- Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The patent specification describes the identifier as depending "only on the data in the data item" and being independent of its "name, origin, location, address, or other information not derivable directly from the data" (’442 Patent, col. 3:32-37). This functional description could support reading the term on any identifier, including an ETag, that is generated from a file's content.
- Evidence for a Narrower Interpretation: The specification repeatedly refers to this identifier as a "True Name" and describes its use within a comprehensive data management system involving specific data structures like a "True File registry" and "local directory extensions" (’442 Patent, col. 7:20-36). Practitioners may argue this context suggests the claimed "name" is not just any content-derived hash, but one integrated into the specific file system architecture disclosed in the patent, which is distinct from the standard HTTP protocol.
"unauthorized copy or an unlicensed copy" ('442 Patent) / "content is unauthorized or unlicensed" ('310 Patent)
- Context and Importance: Plaintiff’s theory is that a server's response to a non-matching ETag constitutes a determination that the cached copy is "unauthorized." The viability of the infringement claims depends on this functional mapping. Practitioners may focus on this term because it links a technical cache-coherency check to a legal or permissions-based status.
- Intrinsic Evidence for Interpretation:
- Evidence for a Broader Interpretation: The title of the ’442 Patent, "Enforcement and Policing of Licensed Content," suggests the invention is directed at authorization. The complaint alleges that if an ETag matches, the downstream browser is "authorized to use that previously cached asset file" (Compl. ¶ 45), and if it does not, the browser is "not authorized to use" the old file (Compl. ¶ 46).
- Evidence for a Narrower Interpretation: The specification describes using the "True Name" to verify that retrieved data is the "correct data" or that a "cached item has not changed" (’442 Patent, col. 2:20-27, col. 3:1-3). This language focuses on data integrity and staleness. A party could argue that determining a cached file is "stale" is a technical check for data corruption or outdatedness, not a determination of "unauthorized" or "unlicensed" status in the legal or access-control sense.
VI. Other Allegations
Indirect Infringement
- The complaint alleges facts that may support theories of induced infringement. It states that Defendant "caused" downstream intermediate cache servers and endpoint caches to perform certain steps, such as obtaining ETags and sending conditional GET requests, by designing its system to operate in this manner (Compl. ¶¶ 53, 54, 61).
VII. Analyst’s Conclusion: Key Questions for the Case
- A core issue will be one of definitional scope: can an HTTP "ETag," a standardized technical tool for web cache validation, be construed as the proprietary "content-dependent name" described in the patents, which is presented as the cornerstone of a comprehensive data management and policing system?
- A key evidentiary question will be one of functional equivalence: does the accused system’s automated response to a non-matching ETag, a standard protocol behavior that signals stale data, perform the specific function of "determining" that a copy of a file is "unauthorized or unlicensed" in the legal or permissions-based sense required by the claims?