PTAB

IPR2024-00303

Cloudera Inc v. R2 Solutions LLC

Petition

Title: Data Processing Over a Distributed System
Brief Description: The ’610 patent discloses a method for processing data sets within the MapReduce programming framework. The purported enhancement involves treating input data as a plurality of "data groups," where different groups can have different data schemas, allowing for the processing of heterogeneous data sources.

Prior Art Relied Upon: Pike (Patent 7,590,620)
Core Argument for this Ground:
- Prior Art Mapping: Petitioner argued that Pike, which discloses the foundational MapReduce framework for large-scale distributed data processing, teaches all limitations of the challenged claims. Pike’s disclosure of processing input files that include "a variety of data types," such as "text files" and "tables," was asserted to teach the claimed concept of "data groups" having different schemas. The file name and type (e.g., ".csv") serve as the "mechanism for identifying" each data group. Pike’s system inherently partitions these input files into data blocks, provides them to worker processes that apply map operators, and then merges the results using reduce operators, mapping directly to the method steps of claim 1.
- Motivation to Combine (for §103 grounds): Not applicable as this ground relies on a single reference. Petitioner contended Pike alone renders the claims obvious as it describes applying the MapReduce framework to varied data types, which is precisely what the ’610 patent claims.
- Expectation of Success (for §103 grounds): A person of ordinary skill in the art (POSITA) would have expected success in using Pike’s established MapReduce system to process different types of input files, as this was a known and intended application of the technology.
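For readers unfamiliar with the technology, the mapping Petitioner describes can be illustrated with a minimal, single-process sketch. It is not Pike's implementation or the ’610 patent's claimed method; the file names, the record contents, and the helpers `map_csv`, `map_txt`, `split_into_blocks`, and `run_job` are all hypothetical. It only shows input files with different schemas being identified by file extension, partitioned into blocks, passed to a per-group map function, and merged by a reduce function.

```python
from collections import defaultdict

# Hypothetical input: two "data groups" with different schemas.
# The file extension stands in for the "mechanism for identifying" each group.
INPUT_FILES = {
    "orders.csv": ["alice,3", "bob,5", "alice,2"],   # schema: name,quantity
    "notes.txt": ["alice called", "bob emailed"],    # schema: free text
}

def map_csv(line):
    """Map function for the CSV group: emit (name, quantity)."""
    name, qty = line.split(",")
    yield name, int(qty)

def map_txt(line):
    """Map function for the text group: emit (first word, 1)."""
    yield line.split()[0], 1

# One map function per data group, selected by file type (an assumption).
MAPPERS = {".csv": map_csv, ".txt": map_txt}

def split_into_blocks(lines, block_size=2):
    """Partition a file's lines into fixed-size data blocks."""
    for i in range(0, len(lines), block_size):
        yield lines[i:i + block_size]

def run_job(files):
    """Run map over every block, then reduce by summing values per key."""
    intermediate = defaultdict(list)
    for filename, lines in files.items():
        mapper = MAPPERS[filename[filename.rfind("."):]]
        for block in split_into_blocks(lines):
            # In a distributed system each block would go to a worker process.
            for line in block:
                for key, value in mapper(line):
                    intermediate[key].append(value)
    # Reduce phase: merge all values that share a key.
    return {key: sum(values) for key, values in sorted(intermediate.items())}

result = run_job(INPUT_FILES)
print(result)
```

Here each key's total combines values that originated in both files, which is the sense in which heterogeneous inputs are merged in a single reduce step.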

Prior Art Relied Upon: Pike (Patent 7,590,620) and Chowdhuri (Application # 2006/0218123)
Core Argument for this Ground:
- Prior Art Mapping: Petitioner asserted that Chowdhuri explicitly teaches processing disparate data tables ("order" and "customer" tables with different schemas) using iterators (scan, hashjoin, GroupBy) that correspond to the map and reduce functions of the MapReduce framework. Pike provides the robust, fault-tolerant, and distributed system for implementing such processing. Chowdhuri’s tables directly teach the claimed "data groups," and its iterators teach scanning (mapping) and joining/grouping (reducing) these distinct data groups based on a common key ("customer_id").
- Motivation to Combine (for §103 grounds): A POSITA would combine these references to implement the specific relational data processing techniques taught by Chowdhuri within the more general and powerful distributed framework of Pike. Pike suggests processing "tables," making the integration of Chowdhuri's table-based methods a natural and predictable extension to leverage Pike's fault tolerance and parallel processing capabilities for improved efficiency.
- Expectation of Success (for §103 grounds): Success was expected because it involved applying known database query techniques (from Chowdhuri) to a system designed for large-scale data processing (Pike), a straightforward combination of compatible technologies.
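The asserted correspondence between Chowdhuri's iterators and map and reduce functions can be sketched as follows. This is an illustration under stated assumptions, not code from either reference: the table contents, the column names other than `customer_id`, and the functions `scan`, `shuffle`, `join_and_group`, and `run_join` are hypothetical. The scan step plays the role of mapping each table's rows to the common key, and the join/group step plays the role of reducing rows that share that key.

```python
from collections import defaultdict

# Hypothetical tables with different schemas, sharing the key "customer_id".
ORDERS = [
    {"order_id": 1, "customer_id": 10, "total": 25.0},
    {"order_id": 2, "customer_id": 11, "total": 40.0},
    {"order_id": 3, "customer_id": 10, "total": 15.0},
]
CUSTOMERS = [
    {"customer_id": 10, "name": "Ada"},
    {"customer_id": 11, "name": "Grace"},
]

def scan(table_name, rows):
    """Map-like step: emit (customer_id, (table name, row)) for every row."""
    for row in rows:
        yield row["customer_id"], (table_name, row)

def shuffle(pairs):
    """Group the tagged rows by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, tagged_row in pairs:
        groups[key].append(tagged_row)
    return groups

def join_and_group(customer_id, tagged_rows):
    """Reduce-like step: join the customer row with its orders and sum totals."""
    names = [row["name"] for table, row in tagged_rows if table == "customer"]
    totals = [row["total"] for table, row in tagged_rows if table == "order"]
    return {
        "customer_id": customer_id,
        "name": names[0] if names else None,
        "order_total": sum(totals),
    }

def run_join(orders, customers):
    pairs = list(scan("order", orders)) + list(scan("customer", customers))
    groups = shuffle(pairs)
    return [join_and_group(key, groups[key]) for key in sorted(groups)]

joined = run_join(ORDERS, CUSTOMERS)
print(joined)
```

The table name attached to each row in `scan` is what lets the reduce-like step tell the two data groups apart after they have been grouped under the same key.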

Prior Art Relied Upon: Pike (Patent 7,590,620), Chowdhuri (Application # 2006/0218123), and MacLeod (Patent 6,343,295).
Core Argument for this Ground:
- Prior Art Mapping: This ground builds on the Pike/Chowdhuri combination to address claims requiring the use of "metadata" identifiable to a data group. MacLeod discloses a system for "tracking the lineage of data in a database" using a "lineage identifier" that is attached to data as it moves through a system. This identifier explicitly teaches metadata that identifies the origin of a file (i.e., the data group).
- Motivation to Combine (for §103 grounds): A POSITA would be motivated to incorporate MacLeod's lineage tracking into the Pike/Chowdhuri system to improve the tracking of data from heterogeneous sources through the map and reduce phases. This would predictably enhance data management, particularly for post-processing operations like selectively removing certain data groups from the reduce phase, thereby improving efficiency.
- Expectation of Success (for §103 grounds): A POSITA would have reasonably expected success in adding a known data tracking identifier (MacLeod) to a data processing system (Pike/Chowdhuri) to achieve the known benefit of improved data traceability.
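The role Petitioner assigns to MacLeod's lineage identifier can be illustrated with a short sketch. It does not reproduce MacLeod's system; the `TaggedValue` type, the source labels "A" and "B", and the functions `map_with_lineage` and `reduce_selected` are hypothetical. The sketch attaches an origin identifier to each mapped value as metadata and then uses it to leave a chosen data group out of the reduce phase.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class TaggedValue:
    """A mapped value plus metadata naming the data group it came from."""
    lineage_id: str
    value: int

def map_with_lineage(lineage_id, records):
    """Map step: pass each (key, value) through, tagging it with its origin."""
    for key, value in records:
        yield key, TaggedValue(lineage_id, value)

def reduce_selected(pairs, excluded_lineage=frozenset()):
    """Reduce step: sum values per key, skipping excluded data groups."""
    totals = defaultdict(int)
    for key, tagged in pairs:
        if tagged.lineage_id in excluded_lineage:
            continue  # selectively drop this data group from the reduce
        totals[key] += tagged.value
    return dict(totals)

# Hypothetical records from two sources.
SOURCE_A = [("x", 1), ("y", 2)]
SOURCE_B = [("x", 10), ("z", 5)]

pairs = list(map_with_lineage("A", SOURCE_A)) + list(map_with_lineage("B", SOURCE_B))
all_totals = reduce_selected(pairs)
only_a = reduce_selected(pairs, excluded_lineage={"B"})
print(all_totals, only_a)
```

Because the identifier travels with each value, the exclusion can be decided at reduce time without re-reading the original inputs.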

Petitioner applied constructions from a related district court case for the purposes of the petition.
- "data group": Construed as "a group of data and a mechanism for identifying data from that group." Petitioner argued that file names and table names in the prior art meet this construction.
- "a plurality of mapping functions that are each user-configurable": Construed as "two or more mapping functions that are each configurable by a user." Petitioner argued Pike’s "application-specific" map operators, which are created by a programmer for a given use case, satisfy this limitation.

Petitioner requests institution of an inter partes review and cancellation of claims 1-46 of the ’610 patent as unpatentable.