PTAB

IPR2013-00087

EMC Corp v. PersonalWeb Technologies LLC

Petition

1. Case Identification

2. Patent Overview

  • Title: Computer File System Using Content-Dependent File Identifiers
  • Brief Description: The ’096 patent discloses a data storage system that uses "substantially unique data identifiers" to identify and access data items. These identifiers, termed "True Names," are generated based solely on the content of a data item (e.g., via a hash function like MD5), making them independent of the data's location, name, or context, thereby addressing issues with duplicate files and pathname-based identification.
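The idea of a content-dependent identifier can be illustrated with a minimal Python sketch. It assumes MD5 as the hash function (the example named above); the file paths, contents, and the helper name `content_id` are hypothetical and are not taken from the patent.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Return an identifier derived only from the bytes of the data item."""
    return hashlib.md5(data).hexdigest()

# Hypothetical store: two copies of the same content under different
# names and locations, plus one unrelated file.
store = {
    "/home/alice/report.txt": b"quarterly results",
    "/backup/2013/copy-of-report.txt": b"quarterly results",
    "/home/bob/notes.txt": b"meeting notes",
}

# The identifier ignores the pathname, so duplicates collapse to one value.
ids = {path: content_id(data) for path, data in store.items()}
unique_items = set(ids.values())
```

Because the identifier depends on nothing but the bytes, the two copies of the report share one identifier, which is how such a scheme exposes duplicate files regardless of pathname.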

3. Grounds for Unpatentability

Ground 1: Anticipation - Claims 1, 2, 81, and 83 are anticipated by Browne

  • Prior Art Relied Upon: Browne (S. Browne et al., “Location-Independent Naming for Virtual Distributed Software Repositories,” a University of Tennessee Technical Report, Feb. 1995).
  • Core Argument for this Ground:
    • Prior Art Mapping: Petitioner argued that Browne disclosed every limitation of the challenged claims. Browne described a Bulk File Distribution (BFD) system that used "Location Independent File Names" (LIFNs) to identify files based on their content, not their location, using the same MD5 hash algorithm mentioned in the ’096 patent. For compound data items ("resources"), Browne taught generating a LIFN for each component file (a hash) and then creating a top-level LIFN for the entire resource by hashing the sequence of component LIFNs (a "hash of hashes"). To access a file, Browne disclosed a two-step mapping process: a query to a "LIFN database" mapped the resource's LIFN to a list of its component LIFNs, and a "LIFN-to-location mapping service" then mapped a component's LIFN to a list of server locations storing that component.
    • Key Aspects: Petitioner contended that Browne’s system of content-based LIFNs, its handling of compound data items with a "hash of hashes," and its two-tiered mapping database structure directly read on the method steps of the challenged claims.
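The structure Petitioner attributed to Browne can be sketched roughly as follows. This is an illustration under stated assumptions, not Browne's implementation: the function names, the way component LIFNs are concatenated before hashing, the dictionary-based tables, and the example server URLs are all hypothetical.

```python
import hashlib
from typing import Dict, List

def lifn(data: bytes) -> str:
    """Content-based identifier for one component file (MD5 of its bytes)."""
    return hashlib.md5(data).hexdigest()

def resource_lifn(component_lifns: List[str]) -> str:
    """Top-level identifier: a hash over the ordered sequence of component LIFNs."""
    return hashlib.md5("".join(component_lifns).encode("ascii")).hexdigest()

# Hypothetical resource made of two component files.
files = [b"source code", b"documentation"]
component_ids = [lifn(f) for f in files]
resource_id = resource_lifn(component_ids)  # the "hash of hashes"

# Step 1 table (assumed shape): resource LIFN -> list of component LIFNs.
lifn_database: Dict[str, List[str]] = {resource_id: component_ids}

# Step 2 table (assumed shape): component LIFN -> servers holding that component.
location_service: Dict[str, List[str]] = {
    component_ids[0]: ["ftp://mirror-a.example/src", "ftp://mirror-b.example/src"],
    component_ids[1]: ["ftp://mirror-a.example/doc"],
}

def locate(resource: str) -> Dict[str, List[str]]:
    """Resolve a resource LIFN to the locations of each of its components."""
    return {c: location_service.get(c, []) for c in lifn_database[resource]}

locations = locate(resource_id)
```

In this sketch the top-level identifier changes if any component's content changes or if the components are reordered, since it is computed over the sequence of component identifiers.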

Ground 2: Obviousness - Claims 1, 2, 81, and 83 are obvious over Langer in view of Satyanarayanan II

  • Prior Art Relied Upon: Langer (an August 1991 Usenet article) and Satyanarayanan II (M. Satyanarayanan et al., “Coda: A Highly Available File System for a Distributed Workstation Environment,” an April 1990 journal article).
  • Core Argument for this Ground:
    • Prior Art Mapping: Petitioner asserted that Langer, like the ’096 patent, solved the problem of uniquely identifying files in a distributed system by proposing a unique identifier based on a file's content using a cryptographic hash function like MD5. Langer also addressed compound data items by hashing individual components and then creating a "hash of hashes" for the entire package. Access was achieved by querying a central database (like Archie or WAIS search engines) that mapped the MD5 hash to physical file locations on FTP servers. While Langer mentioned "mirror sites," Satyanarayanan II was cited for its explicit teaching of a replicated file system (the Coda system) that stored multiple copies of files on different servers to improve availability.
    • Motivation to Combine: Petitioner argued that a person of ordinary skill in the art (POSITA) would have combined Langer’s content-based identification system with Satyanarayanan II’s file replication technology to achieve a predictable and desirable result: improved reliability and performance. According to Petitioner, Satyanarayanan II provided an express motivation by teaching a mechanism to select a "preferred server" based on criteria like physical proximity, directly addressing Langer's goal of automatically informing a user of the nearest location from which to download a file.
    • Expectation of Success: The combination involved applying the known benefits of file replication (taught by Satyanarayanan II) to a known content-based file identification system (taught by Langer), which would have been a straightforward implementation with a high expectation of success.
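The combination described in this ground, a content hash resolved to several replica locations with a "preferred server" chosen among them, can be sketched as below. The distance values, server names, and the `pick_server` helper are hypothetical; the sketch does not reproduce the actual selection logic of the Coda system, which the petition summary does not detail.

```python
from typing import Callable, Dict, List, Optional

def pick_server(
    replicas: List[str],
    distance: Dict[str, int],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the closest available replica, or None if none is reachable."""
    candidates = [s for s in replicas if is_available(s)]
    if not candidates:
        return None
    return min(candidates, key=lambda s: distance[s])

# Hypothetical replicas that all hold the file identified by one content hash.
replicas = ["server-east", "server-west", "server-eu"]
distance = {"server-east": 3, "server-west": 12, "server-eu": 40}  # assumed metric
down = {"server-east"}

# With every server up, the nearest replica is preferred.
preferred = pick_server(replicas, distance, lambda s: True)
# With the nearest server down, replication lets the request fall back.
fallback = pick_server(replicas, distance, lambda s: s not in down)
```

The fallback case shows the reliability benefit Petitioner pointed to: a request still succeeds when the nearest copy is unreachable, because another server holds the same content.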

Ground 3: Obviousness - Claims 1, 2, 81, and 83 are obvious over Kantor in view of Satyanarayanan II

  • Prior Art Relied Upon: Kantor ("The Frederick W. Kantor Contents-Signature System Version 1.22," an August 1993 user manual) and Satyanarayanan II.
  • Core Argument for this Ground:
    • Prior Art Mapping: Petitioner argued that Kantor taught a system (FWKCS) that created "contents-signatures" to uniquely identify files based on their content, using a hash (a 32-bit CRC) and the file length. For compound data items like "zipfiles," Kantor disclosed creating a "zipfile contents-signature" by hashing the individual contents-signatures of the component files—a "hash of hashes." These signatures were stored in a master list, analogous to the ’096 patent's registry. The combination with Satyanarayanan II was argued for the same reasons as in Ground 2: to add known server replication and proximity-based access to Kantor’s system.
    • Motivation to Combine: A POSITA would have been motivated to apply the teachings of Satyanarayanan II to Kantor’s system for bulletin board systems (BBSs) to increase the reliability and response time for file requests. Storing files on multiple servers and providing users with the nearest available copy was a known technique to improve such systems.
    • Expectation of Success: Implementing the well-understood server replication and preferred server selection taught by Satyanarayanan II within the framework of Kantor's content-signature system would have been a predictable application of known technologies.
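A signature built from a 32-bit CRC plus the file length, and a package-level signature computed over the member signatures, might look like the sketch below. The string format, the sorting of member signatures, the reuse of the same CRC function at the package level, and the shape of the master list are all assumptions made for illustration; the actual FWKCS formats are not given in this summary.

```python
import zlib
from typing import Dict, List

def contents_signature(data: bytes) -> str:
    """Signature from a 32-bit CRC of the content plus the content length."""
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return f"{crc:08x}-{len(data)}"

def zipfile_signature(member_signatures: List[str]) -> str:
    """Package signature computed over the member signatures (a hash of hashes).

    Sorting the members first is an assumption, made so that member order
    inside the archive does not affect the result.
    """
    joined = "\n".join(sorted(member_signatures)).encode("ascii")
    return contents_signature(joined)

# Hypothetical archive with two member files.
members = [b"program binary", b"readme text"]
sigs = [contents_signature(m) for m in members]
package_sig = zipfile_signature(sigs)

# Hypothetical master list: signature -> names under which it was uploaded.
master_list: Dict[str, List[str]] = {}
master_list.setdefault(package_sig, []).append("UPLOAD1.ZIP")
# The same members in a different order and under a different archive name.
master_list.setdefault(zipfile_signature(sigs[::-1]), []).append("UPLOAD2.ZIP")
```

In this sketch the second upload lands on the same master-list entry as the first, which is the kind of duplicate detection a content-based signature makes possible.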

4. Key Claim Construction Positions

  • "True Name, data identity, and data identifier": Petitioner argued that these terms should each be construed as the "substantially unique data identifier for a particular item," as defined in the patent's specification. This construction was central because Petitioner argued that the prior art’s use of content-based hashes (e.g., Browne's "LIFN," Kantor's "contents-signature") met this definition and therefore mapped directly onto the core inventive concept claimed in the ’096 patent.

5. Key Technical Contentions (Beyond Claim Construction)

  • Content-Based Identification Was Not Novel: A central theme of the petition was that the fundamental concepts of the ’096 patent—using hash functions to create content-based "fingerprints" for data and using a "hash of hashes" for compound data—were old and well-established in computer science long before the patent's priority date. Petitioner argued that the patentee's representations to the contrary during prosecution were incorrect and that these techniques were widely known for deduplication and file management.

6. Relief Requested

  • Petitioner requested institution of an inter partes review and cancellation of claims 1, 2, 81, and 83 of the ’096 patent as unpatentable under 35 U.S.C. §§ 102 and 103.