PTAB

IPR2013-00086

EMC Corp v. PersonalWeb Technologies LLC

1. Case Identification

2. Patent Overview

  • Title: De-Duplication of Data in a Data Processing System
  • Brief Description: The ’662 patent discloses a data storage system that uses content-based, "substantially unique data identifiers" to identify, access, and manage data items. The system uses these location-independent identifiers, which the patent calls "True Names," to perform file management functions such as deleting unwanted duplicate copies of data across multiple servers.

3. Grounds for Unpatentability

Ground 1: Claim 30 is anticipated by Browne

  • Prior Art Relied Upon: Browne ("Location-Independent Naming for Virtual Distributed Software Repositories," University of Tennessee Technical Report CS-95-278, Feb. 1995).
  • Core Argument for this Ground:
    • Prior Art Mapping: Petitioner argued that Browne disclosed every element of claim 30. Browne’s system created "Location Independent File Names" (LIFNs) by calculating an MD5 hash of a file's contents, directly corresponding to the ’662 patent’s content-based "digital data item identifier." Browne stored these LIFNs in a "LIFN database," which mapped the LIFNs to the locations of replicated files on multiple servers, satisfying the claim’s requirement for a "plurality of servers" and a "list indicating... a corresponding status" (i.e., location). Browne explicitly taught a deletion method where a file server sends a "request[] to delete old LIFN-to-location mappings" to the LIFN database. This process meets the claimed steps of obtaining the content-based identifier (LIFN) in response to a deletion attempt and updating a record in the list (the LIFN database) to reflect the deletion.
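
The mechanism attributed to Browne above can be illustrated with a minimal sketch. It is not Browne's implementation: the class name `LifnDatabase`, its methods, and the server labels are hypothetical, and the only details taken from the summary are that the identifier is an MD5 hash of the file contents and that a database maps each identifier to the locations of replicated copies.

```python
import hashlib
from typing import Dict, Set


def lifn(content: bytes) -> str:
    """Content-based identifier: MD5 hex digest of the file's bytes."""
    return hashlib.md5(content).hexdigest()


class LifnDatabase:
    """Hypothetical stand-in for a LIFN-to-location database."""

    def __init__(self) -> None:
        # identifier -> set of servers currently holding a copy
        self._locations: Dict[str, Set[str]] = {}

    def register(self, content: bytes, server: str) -> str:
        """Record that `server` holds a copy; return the identifier."""
        name = lifn(content)
        self._locations.setdefault(name, set()).add(server)
        return name

    def delete_mapping(self, name: str, server: str) -> None:
        """Handle a request to delete an old identifier-to-location mapping."""
        servers = self._locations.get(name)
        if servers is None:
            return
        servers.discard(server)
        if not servers:
            # No copies remain anywhere, so drop the record entirely.
            del self._locations[name]

    def locations(self, name: str) -> Set[str]:
        """Return a copy of the set of servers holding the item."""
        return set(self._locations.get(name, set()))


db = LifnDatabase()
first = db.register(b"same bytes", "server-a")
second = db.register(b"same bytes", "server-b")
db.delete_mapping(first, "server-a")
```

Because the identifier depends only on content, `first` and `second` are equal even though the copies sit on different servers, and the deletion request updates a single record in the list.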

Ground 2: Claim 30 is obvious over Kantor in view of Satyanarayanan II

  • Prior Art Relied Upon: Kantor ("The Frederick W. Kantor Contents-Signature System Version 1.22," a 1993 user manual) and Satyanarayanan II ("Coda: A Highly Available File System for a Distributed Workstation Environment," a 1990 IEEE article).
  • Core Argument for this Ground:
    • Prior Art Mapping: Petitioner contended that Kantor taught most limitations of claim 30. Kantor’s system for bulletin board systems (BBS) created "contents-signatures" (using a CRC hash and file length) to identify files based on their content. It used lists, such as a "MULTIS" list and a master "CSLIST.SRT," to track duplicate files and mark them for deletion using flags. This system performed the claimed method of obtaining a content-based identifier and updating a list to reflect a deletion. While Kantor focused on file identification and deletion logic, it did not explicitly detail a replicated, multi-server storage architecture. Satyanarayanan II remedied this by teaching the Coda system, a well-known architecture for replicating files across multiple servers in a distributed environment to ensure high availability.
    • Motivation to Combine: A person of ordinary skill in the art (POSITA) would combine the teachings of Satyanarayanan II with Kantor’s system to improve the reliability and fault tolerance of the underlying BBS file storage. Implementing a robust, replicated storage system was a known solution to a common problem, making the combination a predictable and logical step to enhance the performance of Kantor's file management system.
    • Expectation of Success: The integration of a known file replication system (Satyanarayanan II) with a content-based file management system (Kantor) would have been straightforward for a POSITA, yielding the predictable result of a more robust overall system.
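
As a rough illustration of the contents-signature idea described for Kantor, the sketch below pairs a CRC with the file length and flags later files that share a signature. The summary does not specify Kantor's CRC variant or the format of its lists, so the use of `zlib.crc32`, the function names, and the flag dictionary are all assumptions made for illustration.

```python
import zlib
from typing import Dict, Tuple

Signature = Tuple[int, int]  # (CRC value, length in bytes)


def contents_signature(content: bytes) -> Signature:
    """Build a signature from a CRC of the contents plus the file length."""
    # zlib.crc32 is an assumed stand-in for whatever CRC Kantor used.
    return (zlib.crc32(content) & 0xFFFFFFFF, len(content))


def flag_duplicates(files: Dict[str, bytes]) -> Dict[str, bool]:
    """Map each file name to True if it is flagged as a duplicate.

    Files are visited in sorted name order; the first file seen with a
    given signature is kept and later matches are flagged for deletion.
    """
    seen: Dict[Signature, str] = {}
    flags: Dict[str, bool] = {}
    for name in sorted(files):
        sig = contents_signature(files[name])
        if sig in seen:
            flags[name] = True
        else:
            seen[sig] = name
            flags[name] = False
    return flags


flags = flag_duplicates({
    "a.txt": b"hello",
    "b.txt": b"hello",
    "c.txt": b"world",
})
```

Here `b.txt` is flagged because its signature matches that of `a.txt`, while `c.txt` has the same length but a different CRC and is kept.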

Ground 3: Claim 30 is obvious over Woodhill in view of Ritchie

  • Prior Art Relied Upon: Woodhill (Patent 5,649,196) and Ritchie ("The UNIX Time-Sharing System," a 1974 ACM article).
  • Core Argument for this Ground:
    • Prior Art Mapping: Petitioner asserted that Woodhill, which teaches a distributed backup system, disclosed the foundational elements of claim 30. Woodhill’s system used content-based "Binary Object Identifiers" (BOIs) to identify file objects, which were replicated across multiple servers for backup. To manage deletions, Woodhill used a "Backup Queue Database" where records for deleted files were assigned a "File Status" of "DELETED." This database and status field served as the claimed "list." The system obtained the BOIs for a deleted file and updated its status, mapping to the claimed method steps. Petitioner argued that Ritchie’s seminal paper on the UNIX operating system supplied any missing detail regarding robust deletion management. Ritchie described the use of an "i-list" containing "i-nodes" with a "link-count" (or use count) to track shared files. Deleting a file decremented the count, and the file’s storage was freed only when the count reached zero.
    • Motivation to Combine: A POSITA would have been motivated to apply Ritchie's well-known, simple, and rapid link-counting technique to Woodhill's backup system to enhance the management and reliability of deleting shared data. Ritchie's method was a standard, widely used solution for managing file consistency, making its application to Woodhill's system an obvious improvement.
    • Expectation of Success: Applying the standard UNIX file deletion methodology to another file system was a predictable and well-understood task for a POSITA, who would have had a high expectation of success.
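
The link-counting behavior attributed to Ritchie can be sketched as follows. This is a toy model rather than UNIX source code: the class names `INode` and `TinyFileSystem` and their methods are hypothetical, and only the rule described above is modeled, namely that deleting a name decrements the count and storage is freed when the count reaches zero.

```python
from typing import Dict


class INode:
    """Toy i-node: file data plus a count of directory entries pointing at it."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.link_count = 0


class TinyFileSystem:
    """Hypothetical in-memory model of an i-list and a flat directory."""

    def __init__(self) -> None:
        self.i_list: Dict[int, INode] = {}   # i-number -> i-node
        self.directory: Dict[str, int] = {}  # path -> i-number
        self._next_i_number = 1

    def _add_entry(self, path: str, i_number: int) -> None:
        if path in self.directory:
            raise FileExistsError(path)
        self.directory[path] = i_number
        self.i_list[i_number].link_count += 1

    def create(self, path: str, data: bytes) -> int:
        """Create a new file and return its i-number."""
        i_number = self._next_i_number
        self._next_i_number += 1
        self.i_list[i_number] = INode(data)
        self._add_entry(path, i_number)
        return i_number

    def link(self, existing: str, new: str) -> None:
        """Add a second name for the same underlying file."""
        self._add_entry(new, self.directory[existing])

    def unlink(self, path: str) -> bool:
        """Remove a name; return True if the storage was actually freed."""
        i_number = self.directory.pop(path)
        node = self.i_list[i_number]
        node.link_count -= 1
        if node.link_count == 0:
            del self.i_list[i_number]
            return True
        return False


fs = TinyFileSystem()
number = fs.create("/a", b"shared data")
fs.link("/a", "/b")
freed_first = fs.unlink("/a")   # another name remains, so nothing is freed
freed_second = fs.unlink("/b")  # last name removed, so storage is freed
```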

4. Key Claim Construction Positions

  • Petitioner argued that claim terms should be given their broadest reasonable construction but noted that several key terms were explicitly defined in the ’662 patent’s specification.
  • The terms "True Name, data identity, and data identifier" were central, defined together in the specification as referring to a "substantially unique data identifier for a particular item." Petitioner emphasized that this identifier was calculated using a content-based hash function (e.g., MD5), making it independent of the data's location or context. Petitioner used this construction to show that the prior art's "LIFNs," "contents-signatures," and "BOIs" embodied the same technological concept.

5. Key Technical Contentions (Beyond Claim Construction)

  • Petitioner’s central technical contention was that the core concept of the ’662 patent—using content-based hashing to create location-independent file identifiers—was not novel at the time of the invention.
  • Petitioner argued that techniques for creating a data "fingerprint" or "signature" were old and widely used. These included hashing, developed in the 1950s; Merkle trees, from the 1970s; and the specific MD5 algorithm referenced in the patent, which was available in the early 1990s. The petition asserted that the patent’s characterization of all prior art systems as relying solely on context- or location-based names was factually incorrect.

6. Relief Requested

  • Petitioner requested institution of an inter partes review and cancellation of claim 30 of Patent 7,949,662 as unpatentable under 35 U.S.C. §§ 102 and 103.