PTAB

IPR2019-01187

Microsoft Corp v. Uniloc 2017 LLC

1. Case Identification

2. Patent Overview

  • Title: Method and System for Indicating Change of Speaker in a Videoconference Application
  • Brief Description: The ’114 patent (Patent 6,473,114) relates to a method for managing video displays in a multi-participant videoconference. The system distinguishes between short and long utterances from a new speaker: it removes the new speaker's image after speech of a short "first duration," but replaces an existing participant's image with the new speaker's if the speech continues for a longer duration.

3. Grounds for Unpatentability

Ground 1: Claims 1-2 are obvious over Tompkins

  • Prior Art Relied Upon: Tompkins (Patent 5,014,267).
  • Core Argument for this Ground:
    • Prior Art Mapping: Petitioner argued that Tompkins alone teaches the specific durational logic of the challenged claims. Tompkins discloses a priority-based system with primary, secondary, and non-displayed ("m-ary") conferees. A non-displayed conferee who speaks for a predetermined "first duration" (e.g., 750 ms) triggers a voice detect signal and is promoted to "secondary conferee," causing that conferee's image to be displayed. If that speaker stops and another non-displayed participant begins speaking, the first speaker is demoted back to non-displayed status and the image is removed. However, if the secondary conferee continues speaking long enough to trigger a second voice detect signal (a "longer duration"), that conferee is promoted to "primary conferee" and replaces the image of the previous primary conferee. According to Petitioner, this sequence maps directly onto the claimed steps of displaying the new speaker's image, removing it after speech of a first duration, and replacing an existing image after speech of a longer duration.
    • Key Aspects: This ground asserted that the core inventive concept, a two-tiered temporal threshold for removing versus replacing a speaker's image, was fully disclosed within a single prior art patent.
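
The two-tier promotion logic that Petitioner attributes to Tompkins can be illustrated with a short sketch. This is only an illustration of the summary above, not Tompkins's actual design or the claimed method. The names Role, role_for_speech, and Display are hypothetical, the 750 ms threshold is the example figure quoted above, and the 3000 ms "longer duration" is an assumed value because the summary gives none. What happens to the displaced primary conferee is also not stated, so the sketch simply drops that image.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

FIRST_DURATION_MS = 750     # example threshold quoted in the summary
LONGER_DURATION_MS = 3000   # ASSUMED value; the summary gives no figure


class Role(Enum):
    NON_DISPLAYED = "m-ary"
    SECONDARY = "secondary"
    PRIMARY = "primary"


def role_for_speech(speech_ms: int) -> Role:
    """Role a non-displayed conferee reaches after speech_ms of continuous speech."""
    if speech_ms >= LONGER_DURATION_MS:
        return Role.PRIMARY
    if speech_ms >= FIRST_DURATION_MS:
        return Role.SECONDARY
    return Role.NON_DISPLAYED


@dataclass
class Display:
    """Hypothetical display state: one primary image and one secondary image."""
    primary: str
    secondary: Optional[str] = None

    def on_speech(self, speaker: str, speech_ms: int) -> None:
        if speaker == self.primary:
            return  # the primary conferee is already displayed
        role = role_for_speech(speech_ms)
        if role is Role.PRIMARY:
            # Longer duration: the speaker replaces the previous primary image.
            self.primary = speaker
            if self.secondary == speaker:
                self.secondary = None
        elif role is Role.SECONDARY:
            # First duration: the speaker takes the secondary slot, so whoever
            # held it is demoted and that image is removed.
            self.secondary = speaker


display = Display(primary="A")
display.on_speech("B", 800)    # B is shown as secondary
display.on_speech("C", 800)    # C takes the secondary slot; B's image is removed
display.on_speech("C", 3500)   # C keeps talking and replaces A as primary
```

In this sketch a short utterance only ever affects the secondary slot, while a long one changes the primary image, which is the "remove versus replace" distinction the ground relies on.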

Ground 2: Claims 1-7 are obvious over Lai in view of Kamata

  • Prior Art Relied Upon: Lai (Patent 6,288,740) and Kamata (Patent 5,953,050).
  • Core Argument for this Ground:
    • Prior Art Mapping: This ground was presented to the extent the claims are interpreted to require visual animation. Petitioner argued Lai teaches the foundational system: a multi-quadrant video display in which one quadrant is voice-activated and shows the current "dominant speaker." When a new speaker becomes dominant, that speaker's video replaces the previous one. Kamata addresses the problem of jarring or abrupt speaker transitions in such systems by teaching the use of "special effects" to create smooth, gradual changes. Specifically, Kamata discloses animations in which an old speaker's image gradually contracts while a new speaker's image gradually dilates to replace it. Petitioner contended that this combination meets the limitations of claim 2, which recites "displacing and contracting," and of the dependent claims reciting "dilating."
    • Motivation to Combine: Petitioner argued that a person of ordinary skill in the art (POSITA) implementing Lai's system would recognize, as Lai itself noted, that speaker changes can be "disruptive." That POSITA would combine Kamata's known solution for smooth visual transitions with Lai's system to improve the user experience, a straightforward and predictable enhancement.
    • Expectation of Success: The ’114 patent itself admits that the technology for generating such visual effects was well-known and posed "no problem" to implement, confirming a POSITA would have had a high expectation of success in combining the references.
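
The contract-while-dilate transition attributed to Kamata can be pictured as two scale factors that change together over the length of the animation. The sketch below is a minimal illustration only. The function name transition_scales, the 500 ms default length, and the linear interpolation are all assumptions; the summary does not say how Kamata times or shapes the effect.

```python
from typing import Tuple


def transition_scales(elapsed_ms: float, total_ms: float = 500.0) -> Tuple[float, float]:
    """Return (old_image_scale, new_image_scale) at a point in the transition.

    Linear interpolation and the 500 ms default are ASSUMED for illustration.
    A scale of 1.0 means full size and 0.0 means fully contracted.
    """
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    # Clamp progress to [0, 1] so times outside the animation are well defined.
    progress = min(max(elapsed_ms / total_ms, 0.0), 1.0)
    return 1.0 - progress, progress


for t in (0, 250, 500):
    old_scale, new_scale = transition_scales(t)
    print(f"t={t} ms: old image x{old_scale:.2f}, new image x{new_scale:.2f}")
```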

Ground 3: Claims 1-2 are obvious over Lai in view of Kannes

  • Prior Art Relied Upon: Lai (Patent 6,288,740) and Kannes (Patent 5,382,972).
  • Core Argument for this Ground:
    • Prior Art Mapping: This ground addressed the core "remove versus replace" logic using a different combination. Lai provides the base voice-switching system, and Kannes provides a method for handling simultaneous speakers. Kannes teaches that if two participants speak simultaneously for at least a "preselected time period" (a first duration), the system will alternate between displaying their images. In this scenario, the first speaker's image is displayed, then removed after the preselected time period in favor of the second speaker. As the alternating pattern continues, the first speaker's speech will necessarily be of a longer duration by the time that speaker's image replaces the second speaker's image. Petitioner argued that this combination satisfies the specific temporal logic of claim 1.
    • Motivation to Combine: Kannes was argued to fill a functional gap in Lai by providing a specific solution for the common problem of simultaneous speakers. A POSITA would be motivated to incorporate Kannes's alternating display logic into Lai's system to create a more robust and fair videoconferencing experience, preventing one speaker from dominating the display during a prolonged exchange.
    • Expectation of Success: Both references operate on the same fundamental principles of voice-activity detection. Integrating Kannes's logic for handling a specific conversational scenario into Lai's more general framework was presented as a predictable and straightforward modification.
  • Additional Grounds: Petitioner asserted additional obviousness challenges based on Lai alone; Tompkins in view of Kamata; and the combination of Lai, Kamata, and Kannes. These grounds relied on similar arguments regarding the known principles of voice-activated switching and the obviousness of incorporating animations or specific logic for handling simultaneous speakers.
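
The alternating display that Ground 3 attributes to Kannes can be sketched as a function of elapsed time. This is a simplified reading of the summary, not Kannes's disclosed implementation. The name displayed_speaker and the 2000 ms period are hypothetical, and the sketch assumes both participants keep speaking for the whole interval.

```python
from typing import Sequence


def displayed_speaker(elapsed_ms: int, speakers: Sequence[str], period_ms: int) -> str:
    """Return whose image is shown elapsed_ms after simultaneous speech begins.

    The display switches to the next speaker each time the preselected
    period elapses (ASSUMED reading of the summary).
    """
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    if not speakers:
        raise ValueError("speakers must not be empty")
    slot = elapsed_ms // period_ms  # number of completed periods so far
    return speakers[slot % len(speakers)]


PERIOD_MS = 2000  # ASSUMED value for the "preselected time period"
for t in (0, 2000, 4000):
    print(t, displayed_speaker(t, ("first", "second"), PERIOD_MS))
```

With these assumed values the first speaker is shown at the start, is removed at 2000 ms in favor of the second, and returns at 4000 ms. By then the first speaker has been talking for two full periods, which is the "longer duration" Petitioner relied on.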

4. Relief Requested

  • Petitioner requests institution of an inter partes review and cancellation of claims 1-7 of Patent 6,473,114 as unpatentable.