DCT

1:19-cv-12551

Singular Computing LLC v. Google LLC

Amended Complaint

I. Executive Summary and Procedural Information

Parties & Counsel:
- Plaintiff: Singular Computing LLC (Delaware)
- Defendant: Google LLC (Delaware)
- Plaintiff’s Counsel: Prince Lobel Tye LLP
Case Identification: 1:19-cv-12551, D. Mass., 03/20/2020
Venue Allegations: Plaintiff alleges that venue is proper in the District of Massachusetts because Defendant maintains regular and established places of business in the district, including a major office in Cambridge, and has committed acts of patent infringement within the district.
Core Dispute: Plaintiff alleges that Defendant’s Tensor Processing Unit (TPU) hardware, used for accelerating artificial intelligence workloads, infringes patents related to computer architectures that utilize large numbers of low-precision, high-dynamic-range processing elements.
Technical Context: The technology concerns specialized processors (ASICs) designed to accelerate machine learning computations, a critical component for large-scale cloud computing and artificial intelligence services.
Key Procedural History: The complaint alleges that Plaintiff's founder, Dr. Joseph Bates, met with Google representatives more than three times prior to 2017 under a non-disclosure agreement and, in those meetings, disclosed his patented computer architecture and prototype. Plaintiff alleges that Google subsequently copied the invention in its accused TPU devices. After this complaint was filed, inter partes review (IPR) proceedings at the USPTO resulted in the cancellation of several asserted claims, including representative claim 4 of the ’961 patent. Representative claim 53 of the ’273 patent and representative claim 7 of the ’156 patent, however, survived the IPR challenges.

Case Timeline

Date	Event
2009-06-19	Patent Priority Date for ’273, ’156, and ’961 Patents
2013-03-26	U.S. Patent No. 8,407,273 Issues
2015-12-22	U.S. Patent No. 9,218,156 Issues
2017-05-01	Accused Cloud Tensor Processing Unit Version 2 (TPUv2) Launch
2019-09-17	U.S. Patent No. 10,416,961 Issues
2020-03-20	Amended Complaint Filing Date

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 8,407,273 - "PROCESSING WITH COMPACT ARITHMETIC PROCESSING ELEMENT"

Patent Identification: U.S. Patent No. 8,407,273, titled "PROCESSING WITH COMPACT ARITHMETIC PROCESSING ELEMENT," issued March 26, 2013.

The Invention Explained

Problem Addressed: The patent’s background section describes the inefficiency of conventional computer architectures, where billions of transistors on a CPU chip enable software to perform only a few operations per clock cycle, wasting inherent computing power (Compl. ¶8; ’273 Patent, col. 1:56-62).
The Patented Solution: The invention proposes a processor architecture built from a massive number of "low precision high dynamic range" (LPHDR) processing elements. These elements are designed to be physically small by trading high precision for computational efficiency, allowing many more units to be placed on a single chip. This architecture is intended to dramatically increase the number of arithmetic operations performed per unit of time for applications that can tolerate small numerical errors (’273 Patent, Abstract; col. 2:1-15).
Technical Importance: This design challenged the prevailing trend of increasing arithmetic precision in processors, offering a new architectural path to achieve greater computational density for emerging, error-tolerant workloads like AI and machine learning (Compl. ¶¶9, 11).

Key Claims at a Glance

The complaint asserts independent claim 53 as representative (Compl. ¶31).
Essential elements of Claim 53 include:
- A device comprising at least one first low precision high dynamic range (LPHDR) execution unit.
- The unit operates on inputs whose dynamic range is at least as wide as 1/1,000,000 through 1,000,000.
- For at least 5% of possible valid inputs, the statistical mean of the unit's output differs by at least 0.05% from the result of an exact mathematical calculation.
- The number of LPHDR execution units exceeds by at least 100 the number of execution units adapted for multiplication on floating-point numbers of at least 32 bits.
The complaint also asserts claims 17, 18, and 52 (Compl. ¶87).
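
The numerical limitations listed above lend themselves to a simple threshold check. The following Python sketch is illustrative only. The UnitProfile type, its field names, and the function meets_claim_53_thresholds are hypothetical; the sketch reads "exceeds by at least 100" as a simple difference in unit counts, and it ignores every non-numerical limitation and any claim construction issue. The sample figures are those the complaint alleges for the TPUv2 (a float32 input range of roughly 10⁻³⁸ to 10³⁸, more than 99% of inputs off by more than 0.05%, and 131,072 versus approximately 13,107 units); they are allegations, not findings.

```python
from dataclasses import dataclass


@dataclass
class UnitProfile:
    """Hypothetical summary of a device's alleged or measured characteristics."""
    min_input: float         # smallest positive valid input magnitude
    max_input: float         # largest valid input magnitude
    inexact_fraction: float  # share of valid inputs whose result is off by >= 0.05%
    lphdr_units: int         # count of alleged LPHDR execution units
    wide_fp_units: int       # count of units multiplying floats >= 32 bits wide


def meets_claim_53_thresholds(p: UnitProfile) -> bool:
    """Check only the three numerical limitations summarized for Claim 53."""
    # Dynamic range at least as wide as 1/1,000,000 through 1,000,000.
    range_ok = p.min_input <= 1e-6 and p.max_input >= 1e6
    # At least 5% of valid inputs differ from the exact result by >= 0.05%.
    imprecision_ok = p.inexact_fraction >= 0.05
    # LPHDR units exceed wide floating-point units by at least 100
    # (assumption: "exceeds by" is treated as a plain difference).
    count_ok = p.lphdr_units - p.wide_fp_units >= 100
    return range_ok and imprecision_ok and count_ok


# Figures as alleged in the complaint for the TPUv2 (not findings of fact).
tpu_v2_alleged = UnitProfile(
    min_input=1e-38,
    max_input=1e38,
    inexact_fraction=0.99,
    lphdr_units=131_072,
    wide_fp_units=13_107,
)

print(meets_claim_53_thresholds(tpu_v2_alleged))  # True
```

The check passes on the alleged figures by construction; the disputed questions are whether those figures are accurate and whether each multiply cell counts as an "execution unit" at all.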

U.S. Patent No. 9,218,156 - "PROCESSING WITH COMPACT ARITHMETIC PROCESSING ELEMENT"

Patent Identification: U.S. Patent No. 9,218,156, titled "PROCESSING WITH COMPACT ARITHMETIC PROCESSING ELEMENT," issued December 22, 2015.

The Invention Explained

Problem Addressed: The ’156 Patent addresses the same problem of inefficient transistor use in conventional computers as the ’273 Patent (’156 Patent, col. 1:45-67).
The Patented Solution: This patent claims a heterogeneous architecture that combines the LPHDR processing elements with a conventional computing device. The invention describes a system where a controller, such as a CPU, GPU, or FPGA, directs the operations of the massively parallel LPHDR execution units, effectively creating a specialized co-processor for computationally intensive, low-precision tasks (’156 Patent, col. 8:45-54; Compl. ¶62).
Technical Importance: The invention provides a concrete structural framework for integrating the novel LPHDR processors into a broader computing system, allowing a general-purpose processor to offload specific tasks to the highly efficient, specialized hardware (Compl. ¶63).

Key Claims at a Glance

The complaint asserts independent claim 7 as representative (Compl. ¶47).
Essential elements of Claim 7 include:
- A device comprising at least one LPHDR execution unit with the same dynamic range and imprecision characteristics as in the ’273 Patent.
- The device includes at least one first computing device (e.g., CPU, GPU, FPGA) adapted to control the operation of the LPHDR execution unit.
- The number of LPHDR execution units exceeds by at least 100 the number of execution units adapted for multiplication on floating-point numbers at least 32 bits wide.
The complaint also asserts claims 6, 21, and 22 (Compl. ¶104).

U.S. Patent No. 10,416,961 - "PROCESSING WITH COMPACT ARITHMETIC PROCESSING ELEMENT"

Patent Identification: U.S. Patent No. 10,416,961, titled "PROCESSING WITH COMPACT ARITHMETIC PROCESSING ELEMENT," issued September 17, 2019.

Technology Synopsis

The ’961 Patent is part of the same family and describes a similar heterogeneous architecture. A key distinction in its representative claim is a higher threshold for imprecision: the LPHDR unit's output must differ by at least 0.2% for at least 10% of its possible inputs, which is a greater degree of inexactness than required by the earlier patents (Compl. ¶¶64, 69).

Asserted Claims

Independent claim 4 is asserted as representative, along with claims 1-3, 5, 10, 13, 14, and 15 (Compl. ¶¶64, 122). Subsequent to the complaint's filing, an IPR proceeding resulted in the cancellation of claims 1, 2, 4, 5, 10, 13, 14, 21, 24, and 25 of this patent.

Accused Features

The accused features are the same TPUv2 and TPUv3 devices accused under the ’273 and ’156 Patents, with the complaint alleging that their use of bfloat16 arithmetic meets the higher imprecision threshold of this patent (Compl. ¶¶126, 129).

III. The Accused Instrumentality

Product Identification

Google's Cloud Tensor Processing Unit Version 2 (TPUv2) and Version 3 (TPUv3) Devices, which are custom-designed Application-Specific Integrated Circuits (ASICs) (Compl. ¶¶22, 87).

Functionality and Market Context

The complaint alleges that the accused TPUs are hardware accelerators designed to massively increase the speed and efficiency of machine learning and AI computations (Compl. ¶87). Each TPU contains multiple cores, and each core features one or more Matrix Multiply Units (MXUs) that provide the bulk of the computational power by performing matrix multiplications at "reduced bfloat16 precision" (Compl. ¶¶90, 107a-b). The complaint asserts these TPUs are installed in Google's data centers to power its primary AI-driven services—including Search, Translate, and Photos—and are also offered as a service to the public via Google Cloud (Compl. ¶¶25, 87). The complaint portrays the TPUs as Google's solution to a "scary and daunting" increase in computational demand driven by its AI services (Compl. ¶15). The complaint includes a diagram illustrating how a user's virtual machine connects to a PCI-attached "TPU board" containing multiple TPU cores to run training code (Compl. p. 31).

IV. Analysis of Infringement Allegations

8,407,273 Patent Infringement Allegations

Claim Element (from Independent Claim 53)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
A device comprising at least one first low precision high-dynamic range (LPHDR) execution unit adapted to execute a first operation on a first input signal...	Each "MXU Reduced Precision Multiply Cell," one of over 100,000 such units in each TPU device, is alleged to be an LPHDR execution unit performing a multiplication operation.	¶91	col. 2:1-7
wherein the dynamic range of the possible valid inputs to the first operation is at least as wide as from 1/1,000,000 through 1,000,000...	The MXU cells accept 32-bit floating point (float32) inputs, which have a dynamic range of approximately 10⁻³⁸ to 10³⁸, exceeding the claimed range. A Google diagram comparing number formats is cited as evidence (Compl. p. 39).	¶94	col. 2:25-29
and for at least X = 5% of the possible valid inputs..., the statistical mean... differs by at least Y = 0.05% from the result of an exact mathematical calculation...	The MXU cells perform multiplication at "reduced bfloat16 precision," which is inherently inexact. The complaint presents "Singular test results" showing this method yields a >0.05% difference for over 99% of valid inputs.	¶94	col. 2:16-20
wherein the number of LPHDR execution units... exceeds by at least one hundred the... execution units... adapted to execute... multiplication on floating-point numbers that are at least 32 bits wide.	A TPUv2 allegedly has 131,072 LPHDR units (MXU cells) versus approximately 13,107 32-bit units (in the VPU), and a TPUv3 has 262,144 LPHDR units versus approximately 26,214 32-bit units, far exceeding the 100-unit difference.	¶95	col. 2:8-15
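
To make the "reduced bfloat16 precision" allegation concrete, the following Python sketch estimates how often a bfloat16-style multiplication deviates from the exact product by at least a chosen percentage. It is a rough model under stated assumptions and does not reproduce the "Singular test results" or the actual MXU datapath. It assumes that each input is rounded to bfloat16 (8 exponent bits, 7 stored fraction bits) with round-to-nearest-even, that the product of the rounded inputs is then kept at higher precision, and that inputs are sampled log-uniformly between 1/1,000,000 and 1,000,000. All function names are hypothetical.

```python
import random
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even), returned as a float.

    Assumes x is finite and well inside the float32 range.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits of the float32 encoding. Adding this bias
    # before dropping the low 16 bits rounds to nearest, ties to the even value.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = (bits + bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", rounded))[0]


def relative_error(a: float, b: float) -> float:
    """Relative difference between the exact product and the modeled product."""
    exact = a * b
    # Assumption: only the inputs are rounded; the product is not rounded again.
    approx = to_bfloat16(a) * to_bfloat16(b)
    return abs(approx - exact) / abs(exact)


def fraction_at_least(y: float, samples: int = 10_000, seed: int = 0) -> float:
    """Share of sampled input pairs whose relative error is at least y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Log-uniform magnitudes across the claimed 1/1,000,000 to 1,000,000 range.
        a = 10.0 ** rng.uniform(-6.0, 6.0)
        b = 10.0 ** rng.uniform(-6.0, 6.0)
        if relative_error(a, b) >= y:
            hits += 1
    return hits / samples


if __name__ == "__main__":
    # 0.0005 corresponds to the 0.05% level recited in Claim 53.
    print(fraction_at_least(0.0005))
```

Calling fraction_at_least(0.0005) returns the sampled share of input pairs at or above the 0.05% level. The value depends on the sampling and rounding assumptions above, so it should not be compared directly with the figure reported in the complaint.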

Identified Points of Contention:
- Scope Questions: A central question may be whether each of the thousands of individual "multiply cells" within a Google TPU's systolic array constitutes an "execution unit" as the term is used in the patent, or whether the entire MXU should be considered the unit.
- Technical Questions: The infringement theory for the imprecision element relies on "Singular test results" (Compl. p. 41). The methodology, accuracy, and applicability of these tests to the accused products as they operate in the field will likely be a point of factual dispute. The nature of "reduced bfloat16 precision" will also be examined: is it a 32-bit operation with error, or simply a 16-bit operation outside the scope of a claim limitation based on 32-bit operations?

9,218,156 Patent Infringement Allegations

Claim Element (from Independent Claim 7)	Alleged Infringing Functionality	Complaint Citation	Patent Citation
A device comprising: at least one first low precision high-dynamic range (LPHDR) execution unit...	The allegations are identical to those for the ’273 patent, identifying each "MXU Reduced Precision Multiply Cell" as an LPHDR unit.	¶108	col. 2:1-7
at least one first computing device adapted to control the operation of the at least one first LPHDR execution unit	The CPU of the "Host VM" that sends instructions to the TPU board is alleged to be the controlling computing device.	¶112	col. 8:16-23
wherein the at least one first computing device comprises at least one of a central processing unit (CPU)...	The Host VM's processor is alleged to be the claimed CPU. The complaint cites a Google document describing how a "master" VM runs code that "drives the TensorFlow server running on a TPU worker" (Compl. p. 59).	¶112	col. 8:49-54
wherein the number of LPHDR execution units in the device exceeds by at least one hundred the... execution units... adapted to execute... multiplication on floating point numbers that are at least 32 bits wide.	The allegations are identical to those for the ’273 patent regarding the ratio of LPHDR units to 32-bit floating point units.	¶113	col. 8:58-65

Identified Points of Contention:
- Scope Questions: A key legal question may be whether the combination of a host server (Host VM) and a separate, bus-attached accelerator card (TPU) constitutes a single "device" as claimed in the patent.
- Technical Questions: The analysis may turn on the nature of the "control" exercised by the Host VM over the TPU. A court may need to determine whether the high-level software commands sent from the host meet the claim requirement of a "computing device adapted to control the operation of the... execution unit," or whether the patent's disclosure requires more direct, low-level hardware control.

V. Key Claim Terms for Construction

The Term: "low precision high dynamic range (LPHDR) execution unit"
- Context and Importance: This term is the central inventive concept. Its construction will determine whether Google's architecture, which performs calculations using the bfloat16 format, falls within the claims. Practitioners may focus on whether this term is defined by its functional characteristics (imprecision and range) or by specific structural embodiments disclosed in the patent.
- Intrinsic Evidence for Interpretation:
  - Evidence for a Broader Interpretation: The specification provides a functional definition, describing "low precision" as producing results that "frequently differ from exact results by at least 0.1%" and "high dynamic range" as operating on values "spanning a range at least as large as from one millionth to one million" (’273 Patent, col. 2:16-29). This could support a construction that covers any hardware meeting these functional criteria.
  - Evidence for a Narrower Interpretation: The patent also describes specific embodiments, such as circuits using a "logarithmic representation of the values" or floating-point values with small mantissas ("no more than 10 bits") (’273 Patent, col. 5:61-66). This language may be used to argue that the term should be limited to these disclosed structures or their equivalents, rather than a purely functional definition.
The Term: "a computing device adapted to control the operation of the... LPHDR execution unit" (’156 Patent, Claim 7)
- Context and Importance: This term defines the required relationship between the controller and the LPHDR processors in the claimed heterogeneous system. Its meaning is critical to whether the accused architecture, which pairs a host server with a TPU accelerator, infringes.
- Intrinsic Evidence for Interpretation:
  - Evidence for a Broader Interpretation: The claim lists a wide variety of potential controllers, including a "central processing unit (CPU)" and "microcode-based processor." The specification describes the control unit's function as issuing instructions to the processing elements, a high-level description that could encompass a host server sending commands (’156 Patent, col. 8:16-23).
  - Evidence for a Narrower Interpretation: The patent's Figure 1 depicts a "Control Unit (CU)" and a "Processing Element Array (PEA)" as tightly coupled components of a single system 100. This may support an argument that the claimed "computing device" must be more integrated with the execution units than a separate server connected via a peripheral bus.

VI. Other Allegations

Indirect Infringement: The complaint focuses on allegations of direct infringement by Google through its making, using, and testing of the accused TPU devices in its U.S. data centers (Compl. ¶¶87, 104, 122).
Willful Infringement: The complaint alleges willful infringement based on pre-suit knowledge. It asserts that Singular's founder met with Google representatives under an NDA on multiple occasions between 2010 and early 2017, disclosed the patented technology and a prototype, and informed Google that the technology was patent-protected (Compl. ¶¶16-19). The complaint further alleges that Google "copied and adopted" the invention after these disclosures, supporting this with side-by-side comparisons of Singular's presentation materials and Google's later technical publications (Compl. ¶22, pp. 7-8).

VII. Analyst’s Conclusion: Key Questions for the Case

A core issue will be one of definitional scope: can the term "LPHDR execution unit," defined in the patent with reference to specific numerical precision and range thresholds, be construed to read on Google's "MXU Reduced Precision Multiply Cells," which function by executing 32-bit floating-point operations at a reduced 16-bit (bfloat16) precision?
A second primary question will be one of architectural scope: does the accused system, which pairs a general-purpose host server with a physically separate, bus-connected TPU accelerator, constitute a single infringing "device" under the language of the asserted claims, particularly those requiring an integrated "computing device" that controls the execution units?
A central factual dispute will concern pre-suit conduct: what was the nature of the technology disclosed by Singular to Google under the NDA, and does the evidence support the allegation that Google copied this technology in developing its TPU architecture? The resolution of this issue will be critical to the determination of willfulness.