DCT

3:25-cv-07170

Artificial Intelligence Industry Association Inc v. Osaro Inc

Key Events
Amended Complaint

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 3:25-cv-07170, N.D. Cal., 10/27/2025
  • Venue Allegations: Plaintiff alleges venue is proper in the Northern District of California because Defendant has committed acts of infringement in the district, including offering for sale and selling accused software products to customers with operations in the district, and because Defendant's principal place of business is in San Francisco.
  • Core Dispute: Plaintiff alleges that Defendant’s robotic perception and control software infringes patents related to generating synthetic training data for machine learning, embedding calibration metadata into stereoscopic video, and stabilizing stereoscopic video.
  • Technical Context: The technology at issue resides in the field of computer vision for robotic automation, where AI models are trained on large datasets to enable robots to perceive and interact with their environment.
  • Key Procedural History: Prior to filing, Plaintiff alleges it sent Defendant a formal demand letter identifying the Asserted Patents and asserting that Defendant's software products infringe one or more claims of each patent.

Case Timeline

| Date | Event |
| --- | --- |
| 2015-04-29 | Priority Date for ’315 and ’693 Patents |
| 2018-03-27 | U.S. Patent No. 9,930,315 Issued |
| 2018-09-11 | U.S. Patent No. 10,075,693 Issued |
| 2019-04-25 | Priority Date for ’272 Patent |
| 2022-02-22 | U.S. Patent No. 11,257,272 Issued |
| 2023-06-01 | Defendant's BusinessWire Announcement Date |
| 2025-10-27 | Complaint Filing Date |

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 9,930,315 - Stereoscopic 3D Camera for Virtual Reality Experience

  • Patent Identification: U.S. Patent No. 9,930,315, titled Stereoscopic 3D Camera for Virtual Reality Experience, issued on March 27, 2018 (’315 Patent) (Compl. ¶21).

The Invention Explained

  • Problem Addressed: The patent describes the challenge of stabilizing stereoscopic 3D video, as unwanted camera motion can disrupt the user’s immersive experience, particularly in virtual reality (VR) headsets which are highly sensitive to such motion ('315 Patent, col. 7:49-53).
  • The Patented Solution: The invention is a method for processing stereoscopic video by receiving the video with embedded calibration information, processing frames in real-time to identify stabilization data (e.g., through frame comparison), extracting both static data (like lens distortion profiles) and time-varying data (like inertial measurement unit readings), and then generating and applying a stabilization operation to both the left and right video channels ('315 Patent, Abstract; col. 8:14-26).
  • Technical Importance: This approach allows for real-time correction of instability in stereoscopic video, which is critical for providing a comfortable and realistic VR experience from captured footage ('315 Patent, col. 7:49-57).
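
The claimed flow can be sketched in a few lines of toy Python. This is purely illustrative: the function and variable names, the reduction of each channel to a single (x, y) point, and the roll-only IMU model are all assumptions for exposition, not drawn from the '315 Patent claims or the accused product. The sketch shows static calibration extracted once per sequence, time-varying calibration extracted once per frame, and one stabilization operation applied identically to both channels.

```python
import math

# Purely illustrative sketch; names are assumptions, not from the '315 Patent.
def stabilize_stereo(frames, lens_profile, imu_per_frame):
    """Stabilize a stereoscopic 3D sequence.

    frames        -- list of (left, right) frame pairs; each channel is reduced
                     to a single (x, y) point standing in for image content
    lens_profile  -- static calibration, extracted once per sequence
    imu_per_frame -- time-varying calibration (roll angle, radians), extracted
                     once per frame
    """
    stabilized = []
    for (left, right), roll in zip(frames, imu_per_frame):
        # Generate one stabilization operation from the measured motion...
        c, s = math.cos(-roll), math.sin(-roll)

        def correct(pt):
            # remove the static lens offset, then counter-rotate the roll
            x, y = pt[0] - lens_profile["cx"], pt[1] - lens_profile["cy"]
            return (c * x - s * y, s * x + c * y)

        # ...and apply it identically to the left and right channels
        stabilized.append((correct(left), correct(right)))
    return stabilized
```

The design point the claim turns on is visible here: a single correction is generated from both channels' shared motion and then applied to each channel, rather than stabilizing the left and right feeds independently.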

Key Claims at a Glance

  • The complaint asserts independent Claim 7 ('315 Patent, col. 14:28-50; Compl. ¶43).
  • Essential elements of Claim 7 include:
    • Receiving a stereoscopic 3D video sequence with embedded calibration information.
    • Processing video frames in real-time to identify stabilization data based on a comparison of video frames.
    • Extracting calibration information in at least two ways: one static extraction (e.g., lens distortion profile) per sequence and one time-varying extraction (e.g., inertial data) per frame.
    • Generating and applying a video stabilization operation based on both the left and right channels of the video sequence.
  • The complaint reserves the right to assert other claims (Compl. ¶47).

U.S. Patent No. 10,075,693 - Embedding Calibration Metadata Into Stereoscopic Video Files

  • Patent Identification: U.S. Patent No. 10,075,693, titled Embedding Calibration Metadata Into Stereoscopic Video Files, issued on September 11, 2018 (’693 Patent) (Compl. ¶19).

The Invention Explained

  • Problem Addressed: The patent addresses the need to associate camera and sensor parameters (e.g., lens distortion, accelerometer data) with video files in a time-synchronized manner, which is difficult when parameters change during recording or when video from different cameras is combined ('693 Patent, col. 1:53 - col. 2:12).
  • The Patented Solution: The patent discloses a computerized system that records a stereoscopic video feed along with contemporaneous metadata feeds from sensors. A processor embeds these metadata feeds into the video in real-time by encoding the data into the "subtitles or closed captioning metadata fields of the video file format," which preserves the timing alignment between the data and the corresponding video frames ('693 Patent, Claim 1; col. 8:10-21).
  • Technical Importance: This method provides a standardized mechanism to embed dynamic, time-sequenced sensor data directly into a video file, ensuring that playback devices can accurately calibrate and render the video ('693 Patent, col. 2:5-12).
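
The mechanism the claim recites can be illustrated with a short sketch. The choice of the SRT subtitle format as the carrier and the JSON sample payload are assumptions made here for clarity, not details taken from the '693 Patent or any accused product; the point is only that subtitle cue timestamps can convey the timing of a per-frame metadata feed.

```python
import json

# Illustrative only: the SRT container and the sample payload format are
# assumptions, not taken from the '693 Patent or the accused product.
def srt_timestamp(seconds):
    """Format seconds as an SRT cue timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def metadata_to_srt(samples, frame_duration):
    """Encode one metadata sample per frame as a subtitle cue, so the cue
    timing conveys the timing of the metadata feed."""
    cues = []
    for i, sample in enumerate(samples):
        start, end = i * frame_duration, (i + 1) * frame_duration
        cues.append(
            f"{i + 1}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{json.dumps(sample)}\n"
        )
    return "\n".join(cues)
```

Because each cue's start/end times mirror the frame interval it describes, a playback device that already parses subtitle timing recovers the time alignment of the sensor data without any custom synchronization logic.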

Key Claims at a Glance

  • The complaint asserts independent Claim 1 (’693 Patent, col. 9:52 - col. 10:21; Compl. ¶52).
  • Essential elements of Claim 1 include:
    • A computer store containing a stereoscopic video feed and contemporaneous metadata feeds from a sensor.
    • A computer processor programmed to obtain the video and metadata feeds.
    • The processor is further programmed to embed the metadata feeds into the stereoscopic video feed in real-time as it is recorded.
    • The embedding process involves encoding the metadata into the subtitles or closed captioning fields of the video file format to convey timing.
  • The complaint reserves the right to assert other claims (Compl. ¶57).

U.S. Patent No. 11,257,272 - Systems and Methods for Generating Labeled Image Data for Machine Learning Using a Multi-Stage Image Processing Pipeline

  • Patent Identification: U.S. Patent No. 11,257,272, titled Systems and Methods for Generating Labeled Image Data for Machine Learning Using a Multi-Stage Image Processing Pipeline, issued on February 22, 2022 (’272 Patent) (Compl. ¶17).
  • Technology Synopsis: The patent addresses the technical problem of acquiring large, diverse, and accurately labeled datasets required for training machine learning models in computer vision (Compl. ¶18; ’272 Patent, col. 1:55 - col. 2:4). The patented solution is a multi-stage method for automatically generating synthetic, richly annotated image datasets by constructing virtual 3D scenes, populating them with 3D models, and using virtual cameras with configurable settings to render realistic images and associated data channels like depth maps (Compl. ¶18; ’272 Patent, Abstract).
  • Asserted Claims: The complaint asserts at least method Claim 17 (Compl. ¶¶18, 62).
  • Accused Features: The complaint alleges that Defendant’s SightWorks software generates synthetic image data, including stereoscopic images, for training machine learning models used in its robotic vision systems by creating virtual scenes, populating them with 3D models, and rendering images from various camera views (Compl. ¶¶25, 62).
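
The multi-stage pipeline described in the synopsis (construct a virtual scene, populate it with 3D models, render images plus aligned annotation channels) can be caricatured in toy Python. Everything below is illustrative: the names, the one-pixel-per-object "rendering," and the stage boundaries are assumptions, not drawn from the '272 Patent claims or from SightWorks.

```python
import random

# Toy caricature of a multi-stage synthetic-data pipeline; all names are
# illustrative, not drawn from the '272 Patent or from SightWorks.
def build_scene(models, n_objects, rng):
    """Stages 1-2: construct a virtual 3D scene and populate it with models."""
    return [
        {"model": rng.choice(models),
         "x": rng.uniform(0, 1), "y": rng.uniform(0, 1), "z": rng.uniform(1, 2)}
        for _ in range(n_objects)
    ]

def render(scene, width, height):
    """Stage 3: 'render' the scene into aligned depth and per-pixel label
    channels from a fixed virtual camera (one pixel per object)."""
    depth = [[float("inf")] * width for _ in range(height)]
    labels = [[None] * width for _ in range(height)]
    for obj in scene:
        px = int(obj["x"] * (width - 1))
        py = int(obj["y"] * (height - 1))
        if obj["z"] < depth[py][px]:      # nearest object wins the pixel
            depth[py][px] = obj["z"]
            labels[py][px] = obj["model"] # annotation comes free with rendering
    return depth, labels
```

The sketch shows why synthetic generation is attractive for training data: because the renderer places every object itself, pixel-accurate depth and class labels fall out of the rendering step at no additional annotation cost.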

III. The Accused Instrumentality

Product Identification

The accused instrumentalities are Defendant Osaro, Inc.’s SightWorks perception and control software and associated robotic systems, such as the Osaro Robotic Kitting System (collectively, the "Products") (Compl. ¶¶2-3).

Functionality and Market Context

  • The complaint describes the Products as robotic piece-picking solutions for warehouse automation tasks like bagging, kitting, and depalletizing (Compl. ¶2). The SightWorks software allegedly uses advanced machine learning and vision systems that process stereoscopic 3D images or videos for object recognition and manipulation (Compl. ¶¶2-3).
  • Functionality accused of infringement includes generating synthetic image data for training machine learning models; capturing stereoscopic 3D video feeds while embedding calibration metadata (e.g., camera intrinsic parameters, sensor data) into the video files; and applying video stabilization operations using motion data from sensors (Compl. ¶¶3, 25, 28, 31).
  • The complaint positions the Products as "advanced machine-learning vision software" for robotic operations in the warehouse automation market (Compl. ¶23).

IV. Analysis of Infringement Allegations

No probative visual evidence provided in complaint.

’315 Patent Infringement Allegations

| Claim Element (from Independent Claim 7) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
| --- | --- | --- | --- |
| receiving a stereoscopic 3D video sequence comprising an embedded calibration information; | Osaro's SightWorks systems receive stereoscopic 3D video sequences from a multi-camera rig, which contain embedded calibration information such as lens distortion profiles and camera positions. | ¶31 | col. 14:30-32 |
| processing video frames of the stereoscopic 3D video in a real time...to identify data for stabilizing the stereoscopic 3D video sequence based on a comparison of a portion of the video frames to other frames...; | SightWorks processes video frames in real-time to identify stabilization data by comparing consecutive frames to detect unwanted motion or jitter. | ¶31 | col. 14:33-39 |
| extracting...the embedded calibration information comprising: extracting calibration information in at least two ways...a first way comprising extracting a static calibration information once per the stereoscopic video sequence...a second way comprising extracting a time varying calibration information once per frame... | SightWorks allegedly extracts static calibration information (e.g., lens distortion profiles) once per sequence and time-varying calibration data (e.g., IMU data from gyroscopic or accelerometer sensors) on a per-frame basis. | ¶31 | col. 14:39-47 |
| generating a video stabilization operation based on a left channel and a right channel of the stereoscopic 3D video sequence; and applying the video stabilization operation on the portion of the video frames for both the left and right channels. | SightWorks allegedly generates and applies a stabilization operation to both left and right channels, which includes unwrapping fisheye images and stabilizing for translation and rotation. | ¶31 | col. 14:47-50 |

’693 Patent Infringement Allegations

| Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
| --- | --- | --- | --- |
| a computer store containing data, wherein the data comprises: a stereoscopic video feed...and a plurality of contemporaneous metadata feeds... | The SightWorks system includes a computer store (memory) that holds stereoscopic video feeds from the robotic camera and contemporaneous metadata feeds (e.g., camera intrinsics, IMU data) from sensors. | ¶28 | col. 9:55-63 |
| a computer processor...programmed to: obtain the stereoscopic video feed from the computer store, obtain the plurality of contemporaneous metadata feeds from the computer store, and embed, in real-time as the stereoscopic video feed is recorded, the stereoscopic video feed with the plurality of contemporaneous metadata feeds... | The SightWorks processor retrieves live video and metadata from memory and embeds the metadata into the video stream in real time as it is recorded. | ¶28 | col. 10:1-8 |
| wherein the plurality of metadata feeds is embedded into the stereoscopic video feed by encoding the contemporaneous metadata feeds into the subtitles or closed captioning metadata fields of the video file format, such that the timing of the subtitle or closed captioning metadata conveys the timing of the metadata feeds. | Osaro's system allegedly embeds calibration and sensor data using subtitle or closed captioning fields in the video file to preserve timing alignment between the data and the corresponding video frames. | ¶28 | col. 10:14-21 |

Identified Points of Contention

  • Scope Questions: A central question for the ’693 Patent will be factual: does the accused SightWorks software encode metadata into the specific "subtitles or closed captioning metadata fields" as required by Claim 1, or does it use a different, non-infringing data track or embedding method? The specificity of this claim language suggests little room for interpretation beyond its plain meaning.
  • Technical Questions: For the ’315 Patent, a point of contention may be whether the "stabilization" Osaro performs for robotic precision is the same technical "video stabilization operation" contemplated by the patent, which is described in the context of VR playback for a human viewer. The analysis will turn on whether correcting camera jitter to improve robotic manipulation is functionally equivalent to stabilizing a video sequence for an immersive viewing experience.

V. Key Claim Terms for Construction

’315 Patent, Claim 7

  • The Term: "video stabilization operation"
  • Context and Importance: The scope of this term is critical. The infringement analysis will depend on whether Osaro's methods for enhancing robotic operational stability by correcting for camera motion fall within the definition of the claimed "stabilization operation," which the patent describes in the context of improving a human's VR viewing experience. Practitioners may focus on whether the term is limited to user-centric visual enhancement or broadly covers any algorithmic motion correction.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: The claim language itself describes the operation in functional terms, such as "stabilizing a rotation of the stereoscopic 3D video sequence" and stabilizing "translational and rotational components," which could be read to cover any such correction regardless of application ('315 Patent, col. 15:8-14).
    • Evidence for a Narrower Interpretation: The specification repeatedly frames the purpose of stabilization in the context of VR and the user experience, for example, to "correct, or stabilize, the scene for the user of the playback device" ('315 Patent, col. 8:8-12). The abstract also links the invention to providing a VR experience. This context may support a narrower construction tied to human visual perception.

’693 Patent, Claim 1

  • The Term: "encoding the contemporaneous metadata feeds into the subtitles or closed captioning metadata fields of the video file format"
  • Context and Importance: This term recites a highly specific implementation detail for achieving time-synchronized metadata. Infringement hinges on whether Osaro's products use this exact mechanism. Practitioners may focus on this term because if Defendant uses any other method for embedding timed data (e.g., a custom metadata track within the file container), it may fall outside the literal scope of the claim.
  • Intrinsic Evidence for Interpretation:
    • Evidence for a Broader Interpretation: A party might argue this is merely an exemplary mechanism for the broader invention of time-aligning metadata. The patent's summary describes the goal as embedding parameters "directly into the video file," with the subtitle field being one way to do it ('693 Patent, col. 2:55-58).
    • Evidence for a Narrower Interpretation: The language of Claim 1 is explicit and limiting. The claim itself defines the embedding step by "encoding...into the subtitles or closed captioning metadata fields." The specification reinforces this, stating metadata can be encoded "using subtitle metadata or a table in the metadata header" ('693 Patent, col. 8:9-12). This provides strong support for a narrow construction limited to the recited fields.

VI. Other Allegations

  • Indirect Infringement: The complaint alleges inducement of infringement under 35 U.S.C. § 271(b) based on Defendant providing "detailed technical documentation, tutorials, and customer support services" that allegedly instruct customers on using the infringing functionalities (Compl. ¶¶4, 34). Contributory infringement under § 271(c) is alleged on the basis that Defendant sells software with "specialized modules" for synthetic data generation and metadata embedding that have "no substantial non-infringing use" (Compl. ¶¶7, 11, 35).
  • Willful Infringement: The complaint alleges willful infringement based on Defendant's alleged actual knowledge of the Asserted Patents from a pre-suit "formal demand letter" (Compl. ¶6). The complaint asserts that Defendant’s continued promotion and sale of the accused products despite this notice constitutes deliberate and willful conduct (Compl. ¶¶4, 29, 32).

VII. Analyst’s Conclusion: Key Questions for the Case

  • A core issue will be one of technical implementation: does Osaro's software embed time-aligned metadata using the specific "subtitles or closed captioning metadata fields" as explicitly required by Claim 1 of the ’693 Patent, or does it employ a different, non-infringing mechanism? This is likely to be a central factual dispute dependent on evidence from the accused product's architecture.
  • A key question will be one of functional scope: does the "video stabilization" performed by Osaro's software to enhance the precision of robotic manipulators constitute the same "video stabilization operation" claimed in the ’315 Patent, which the specification frames in the context of improving a human's immersive VR viewing experience? The case may turn on whether the term is construed broadly to cover any motion correction or is limited to the patent's disclosed purpose.
  • A third central question, related to the ’272 Patent, will be one of process fidelity: does the accused synthetic data generation practice every element of the claimed multi-stage pipeline, including the specific steps of constructing a second, distinct synthetic scene and capturing views from camera positions identical to those used for the first scene?