DCT

2:17-cv-00596

Word To Info Inc v. Microsoft Corp

I. Executive Summary and Procedural Information

  • Parties & Counsel:
  • Case Identification: 2:17-cv-00596, W.D. Wash., 07/07/2017
  • Venue Allegations: Venue is alleged to be proper in the Western District of Washington because Microsoft's principal place of business is located within the district and it has allegedly committed acts of infringement there.
  • Core Dispute: Plaintiff alleges that Defendant’s Cortana personal assistant software infringes a family of seven U.S. patents related to systems and methods for natural language processing and understanding.
  • Technical Context: The technology concerns advanced computational linguistics, specifically systems that can parse, understand, and store knowledge from natural language to build a contextual memory for use in ongoing conversations.
  • Key Procedural History: The complaint references claim constructions for several key patent terms from prior litigation in the Northern District of California, adopting those constructions for its infringement allegations in the present case. The complaint also alleges that Defendant had knowledge of the asserted technology, noting that at least one of the patents-in-suit was cited during the prosecution of patents assigned to Microsoft.

Case Timeline

| Date | Event |
|---|---|
| 1994-09-30 | Priority Date for all Patents-in-Suit |
| 1998-02-03 | U.S. Patent No. 5,715,468 Issued |
| 2000-10-24 | U.S. Patent No. 6,138,087 Issued |
| 2003-08-19 | U.S. Patent No. 6,609,091 Issued |
| 2008-03-25 | U.S. Patent No. 7,349,840 Issued |
| 2011-01-18 | U.S. Patent No. 7,873,509 Issued |
| 2012-12-04 | U.S. Patent No. 8,326,603 Issued |
| 2014-04-01 | U.S. Patent No. 8,688,436 Issued |
| 2017-07-07 | Complaint Filed |

II. Technology and Patent(s)-in-Suit Analysis

U.S. Patent No. 5,715,468 - "Memory System for Storing and Retrieving Experience and Knowledge with Natural Language"

  • Issued: February 3, 1998

The Invention Explained

  • Problem Addressed: The patent's background section describes prior art natural language processing systems as being limited in their ability to understand language beyond a single sentence and lacking the capability to combine multiple sentences into a coherent representation of "experience and knowledge" (’468 Patent, col. 2:3-13).
  • The Patented Solution: The invention claims a method and apparatus for processing natural language by converting it into a structured format that is stored in a memory system. This system uses a dictionary database containing entries with "syntax usage data" and "word sense numbers," which function as addresses to the meaning of words ('468 Patent, Abstract). Through processes including syntactic and morphological analysis, the system builds a contextual understanding that can be expanded with subsequent language input, enabling it to handle more complex conversational interactions ('468 Patent, col. 4:5-25).
  • Technical Importance: The technology represents an early architectural approach for moving beyond simple command-response interfaces toward more sophisticated conversational AI that can maintain context and learn from interactions.

Key Claims at a Glance

  • The complaint asserts independent claim 1 and dependent claims 8, 21, 29, and 33 (Compl. ¶18).
  • Independent Claim 1 requires:
    • A method of processing natural language.
    • Providing electronically encoded data representative of the natural language.
    • Providing a dictionary database with a plurality of entries, where entries are comprised of syntax usage data, word sense numbers, state representation data, and/or function codes.
    • Lexically processing the encoded data to access the dictionary database.
    • Providing a grammar specification.
    • Utilizing the syntax usage data and grammar specification to produce output data representing a grammatical parse of the natural language.
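
For readers less familiar with the technology, the sequence of steps recited above can be pictured with a toy sketch. It is purely illustrative: it is not drawn from the patent specification, the complaint, or any accused product, and it carries no claim-construction significance. Every name in it (`Entry`, `DICTIONARY`, `GRAMMAR`, `lexically_process`, `parse`) is a hypothetical chosen for this example, and the one-rule grammar is an assumption made to keep the sketch short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical dictionary entry (illustration only)."""
    word: str
    syntax_usage: str        # assumed here to be a part-of-speech tag
    word_sense_number: int   # assumed here to be an identifier for a meaning


# Toy "dictionary database": a plurality of entries keyed by word.
DICTIONARY = {
    "the": Entry("the", "DET", 100),
    "dog": Entry("dog", "NOUN", 200),
    "barks": Entry("barks", "VERB", 300),
}

# Toy "grammar specification": tag sequences accepted as a sentence.
GRAMMAR = {
    ("DET", "NOUN", "VERB"): "S -> NP(DET NOUN) VP(VERB)",
}


def lexically_process(encoded: bytes) -> Optional[List[Entry]]:
    """Decode the encoded input, split it into words, and look each word up."""
    words = encoded.decode("utf-8").lower().split()
    entries = [DICTIONARY.get(w) for w in words]
    # Give up if any word has no dictionary entry.
    if any(e is None for e in entries):
        return None
    return entries


def parse(encoded: bytes) -> Optional[str]:
    """Return a description of the grammatical parse, or None if none applies."""
    entries = lexically_process(encoded)
    if entries is None:
        return None
    tags = tuple(e.syntax_usage for e in entries)
    return GRAMMAR.get(tags)
```

In this sketch, `parse("The dog barks".encode("utf-8"))` yields the single grammar rule's description, while an unknown word or an unsupported word order yields `None`. Whether any real system maps onto steps of this kind is the question the infringement analysis addresses; the sketch takes no position on it.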

U.S. Patent No. 6,138,087 - "Memory System for Storing and Retrieving Experience and Knowledge with Natural Language Utilizing State Representation Data, Word Sense Numbers, Function Codes and/or Directed Graphs"

  • Issued: October 24, 2000

The Invention Explained

  • Problem Addressed: As a continuation of the application leading to the ’468 patent, the ’087 Patent addresses the same fundamental problem of creating a more robust natural language understanding system that can build a knowledge base from conversations (’087 Patent, col. 3:32-40).
  • The Patented Solution: This patent further details the memory architecture, emphasizing the storage of knowledge and experience in "directed graphs" composed of nodes connected by paths (’087 Patent, Abstract). These graphs use word sense numbers and associated "state representation data" to define the meaning and relationships between concepts. The system is designed to traverse these paths to determine relationships, purposes, and context, allowing for more advanced inference and understanding (’087 Patent, col. 13:30-47).
  • Technical Importance: The patent describes a foundational method for representing complex relationships in a machine-readable format, an approach that is conceptually similar to modern knowledge graphs used in search and AI.
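
The general idea of nodes connected by paths, traversed to find a relationship, can be pictured with a toy sketch. It is illustrative only and is not taken from the ’087 Patent or from any accused product. The node identifiers, relation labels, and the choice of breadth-first search are all assumptions made for this example.

```python
from collections import deque

# Hypothetical identifiers standing in for word sense numbers.
DOG, MAMMAL, ANIMAL, BONE = 2001, 2002, 2003, 2004

# Toy directed graph: node -> list of (relation label, target node).
GRAPH = {
    DOG: [("is_a", MAMMAL), ("chews", BONE)],
    MAMMAL: [("is_a", ANIMAL)],
    ANIMAL: [],
    BONE: [],
}


def find_path(start, goal):
    """Breadth-first search over the directed graph.

    Returns the list of (relation, node) steps from start to goal,
    or None when no directed path exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, target in GRAPH.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [(relation, target)]))
    return None
```

Here `find_path(DOG, ANIMAL)` returns the two-step path through `MAMMAL`, while `find_path(ANIMAL, DOG)` returns `None` because the edges are directed.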

Key Claims at a Glance

  • The complaint asserts independent claim 17 and dependent claim 18 (Compl. ¶31).
  • Independent Claim 17 requires:
    • A method of processing natural language.
    • Providing electronically encoded data representative of the natural language.
    • Providing a dictionary database with entries comprised of one or more of syntax usage data and word sense numbers having associated state representation data.
    • Lexically processing the encoded data to access the dictionary.
    • Utilizing the syntax usage data and word sense numbers from the dictionary, with reference to the state representation data, to select and access word sense numbers for words of the natural language.
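
The final step, selecting among several candidate senses of a word by reference to associated data, can be pictured with a toy sketch. It is illustrative only and does not reflect the ’087 Patent's disclosed embodiments, any court's construction, or any accused product. The `Sense` record, the use of a set of attribute strings as "state" data, and the overlap-count scoring rule are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """Hypothetical dictionary sense record (illustration only)."""
    word_sense_number: int
    syntax_usage: str   # assumed here to be a part-of-speech tag
    state: frozenset    # assumed stand-in for "state representation data"


# Toy dictionary: one word with several candidate senses.
DICTIONARY = {
    "bank": [
        Sense(501, "NOUN", frozenset({"institution", "holds_money"})),
        Sense(502, "NOUN", frozenset({"landform", "beside_river"})),
        Sense(503, "VERB", frozenset({"action", "tilt"})),
    ],
}


def select_sense(word, syntax_usage, context_state):
    """Pick the sense whose state data overlaps most with the context.

    Candidates are first filtered by syntax usage. Ties go to the
    earliest entry. Returns the chosen word sense number, or None.
    """
    candidates = [s for s in DICTIONARY.get(word, [])
                  if s.syntax_usage == syntax_usage]
    if not candidates:
        return None
    best = max(candidates, key=lambda s: len(s.state & set(context_state)))
    return best.word_sense_number
```

In this sketch, `select_sense("bank", "NOUN", {"beside_river"})` picks sense 502 and `select_sense("bank", "NOUN", {"holds_money"})` picks sense 501.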

U.S. Patent No. 6,609,091 - "Memory System for Storing and Retrieving Experience and Knowledge with Natural Language Utilizing State Representation Data, Word Sense Numbers, Function Codes and/or Directed Graphs"

  • Issued: August 19, 2003
  • Technology Synopsis: This patent continues the development of the natural language processing system, focusing on a method where the selection of word sense numbers is governed by a "database of requirements." These requirements must be met by the state representation data associated with the word sense numbers for them to be selected, adding a layer of validation to the semantic analysis.
  • Asserted Claims: Claims 1 and 12 (Compl. ¶42).
  • Accused Features: The complaint alleges that Cortana provides a database of requirements that must be met by associated data for word sense numbers to be selected, pointing to user and relationship requirements for entities in its knowledge graph (Compl. ¶52).

U.S. Patent No. 7,349,840 - "Memory System for Storing and Retrieving Experience and Knowledge with Natural Language Utilizing State Representation Data, Word Sense Numbers, Function Codes, Directed Graphs and/or Context Memory"

  • Issued: March 25, 2008
  • Technology Synopsis: This patent adds the concept of a "context data base" to the memory system. This database stores entries related to the ongoing conversation, allowing the system to use situational context (such as time or location) to select the correct word sense numbers and better understand the user's intent.
  • Asserted Claims: Claims 1, 2, 3, and 5 (Compl. ¶56).
  • Accused Features: The complaint alleges that Cortana uses a context database to present data to a user in a specific context, providing an example of Cortana using a user's location and time (Compl. ¶66).

U.S. Patent No. 7,873,509 - "Memory System for Storing and Retrieving Experience and Knowledge with Natural Language Utilizing State Representation Data, Word Sense Numbers, Function Codes, Directed Graphs, Context Memory, and/or Purpose Relations"

  • Issued: January 18, 2011
  • Technology Synopsis: This patent introduces an "experience and knowledge data base" comprising directed graphs with nodes and paths. A key element is the use of "purpose relation identification processing" to find paths between nodes, which allows the system to infer the user's goals or the purpose behind a natural language statement.
  • Asserted Claims: Claims 9, 10, and 16 (Compl. ¶70).
  • Accused Features: The complaint alleges Cortana uses an experience and knowledge database with directed graphs, specifically identifying Resource Description Framework (RDF) triples and "Path-Tree" indexing schemes as the infringing technology for path identification (Compl. ¶71, 76-77).

U.S. Patent No. 8,326,603 - "Memory System for Storing and Retrieving Experience and Knowledge with Natural Language Queries"

  • Issued: December 4, 2012
  • Technology Synopsis: This patent focuses on the processing of natural language queries using the established memory system. The claims describe providing natural language associated with a "clause implying word sense numbers" and using an experience and knowledge database to traverse directed graphs to process the query.
  • Asserted Claims: Claims 14 and 16 (Compl. ¶82).
  • Accused Features: The complaint alleges that Cortana provides natural language with associated clause-implying word sense numbers and uses its knowledge database to traverse directed graphs for query processing (Compl. ¶83, 87-88).

U.S. Patent No. 8,688,436 - "Memory System for Storing and Retrieving Experience and Knowledge by Utilizing Natural Language Responses"

  • Issued: April 1, 2014
  • Technology Synopsis: This patent claims a system that includes a "natural language plausibility and expectedness processor." This component is used to initiate access to the dictionary database and can provide alternate choices to the user, for example, through autocomplete or autosuggest functionalities.
  • Asserted Claims: Claims 1, 2, and 7 (Compl. ¶91).
  • Accused Features: The complaint alleges that Cortana's autocomplete and autosuggest functionalities serve as the claimed "natural language plausibility and expectedness processor" used to initiate access to its dictionary database (Compl. ¶99-100).

III. The Accused Instrumentality

Product Identification

  • Microsoft's Cortana personal assistant software (Compl. ¶18).

Functionality and Market Context

  • The complaint identifies Cortana as personal assistant software that processes natural language inputs, including speech and text (Compl. ¶19). It is alleged to be "powered by Bing" and to utilize Microsoft's "Satori" technology, which allegedly functions as a knowledge graph or dictionary database providing entities and relationships (Compl. ¶20). The accused functionality includes encoding natural language, accessing this database, performing syntactic parsing, and using the results to understand user queries and generate responses (Compl. ¶26-28).

No probative visual evidence provided in complaint.

IV. Analysis of Infringement Allegations

U.S. Patent No. 5,715,468 Infringement Allegations

| Claim Element (from Independent Claim 1) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| A method of processing natural language, which comprises steps: providing electronically encoded data which is representative of said natural language | Microsoft Cortana allegedly provides electronically encoded data by encoding natural language inputs into audio files and/or text files which represent the natural language input. | ¶19 | col. 26:60-65 |
| providing a dictionary data base wherein said dictionary data base contains a plurality of entries which are comprised of one or more of syntax usage data, associated word sense numbers having associated state representation data and/or function codes | Cortana allegedly provides a dictionary database utilizing Microsoft Bing technology, which includes the Satori technology knowledge graph. This database is alleged to contain entries with syntax usage data (e.g., synonyms), associated word sense numbers, and state representation data. | ¶20, ¶21, ¶25 | col. 27:35-43 |
| lexically processing said electronically encoded data to access said dictionary data base | Cortana is alleged to lexically process the encoded data, using Bing speech APIs for entity discovery to assemble semantic meaning. | ¶26 | col. 44:42-52 |
| providing a grammar specification; and | Cortana allegedly provides a grammar specification, including support for pre-defined grammars and custom grammars, such as a Speech Recognition Grammar Specification (SRGS). | ¶27 | col. 32:15-18 |
| utilizing said syntax usage data which are from entries of said dictionary data base and which are associated with words of said natural language with reference to said grammar specification to produce output data representative of a grammatical parse of the natural language | Cortana allegedly utilizes syntax usage data from the database with reference to the grammar specification to produce output representing a grammatical parse. | ¶28 | col. 32:15-32 |

U.S. Patent No. 6,138,087 Infringement Allegations

| Claim Element (from Independent Claim 17) | Alleged Infringing Functionality | Complaint Citation | Patent Citation |
|---|---|---|---|
| A method of processing natural language, which comprises steps: providing electronically encoded data which is representative of said natural language | Microsoft Cortana allegedly provides electronically encoded data by encoding natural language inputs into audio files and/or text files that represent the natural language input. | ¶32 | col. 26:57-62 |
| providing a dictionary data base wherein said dictionary data base contains a plurality of entries which are comprised of one or more of syntax usage data, and word sense numbers having associated state representation data | The complaint alleges that Cortana uses Bing technology, including the Satori knowledge graph, which serves as a dictionary database. The entities in Satori are alleged to be associated with entries of a knowledge repository and have start addresses, which correspond to word sense numbers associated with state representation data. | ¶33, ¶37 | col. 28:50-67 |
| utilizing said syntax usage data and said word sense numbers which are from entries of said dictionary data base and which are associated with words of said natural language with reference to associated state representation data to select and access said word sense numbers for words of said natural language | Microsoft Cortana allegedly utilizes algorithms based on relationships between database entries and synonyms to select and access these entries. The complaint provides an example of Cortana providing information on Mt. Everest when asked about the tallest mountain. | ¶39 | col. 8:36-44 |
  • Identified Points of Contention:
    • Scope Questions: A central question for both the ’468 and ’087 patents will be whether the terms "dictionary data base" and "word sense numbers," as defined and used in the patents from the 1990s, can be construed to cover Microsoft's modern, distributed, web-scale knowledge graph architecture (Bing/Satori). The patents describe a more self-contained system, whereas the accused product is a cloud-based service.
    • Technical Questions: The complaint relies heavily on a prior claim construction of "word sense number" from a different district court (Compl. ¶21, ¶33). A key technical dispute will likely focus on whether the internal data structures of Cortana, such as the "start addresses" and "identification numbers" associated with entities in the Satori knowledge graph, perform the specific functions and have the specific components required by that construction and the patent specifications.

V. Key Claim Terms for Construction

  • The Term: "word sense number"

  • Context and Importance: This term is fundamental to the architecture of the claimed invention and appears in the asserted independent claims of both lead patents. Its definition is critical, as the Plaintiff's infringement theory depends on mapping this term to data identifiers within Microsoft's Bing and Satori technologies. Practitioners may focus on this term because the complaint proactively offers a construction from a previous case, indicating it has been a central point of dispute before and will likely be again.

  • Intrinsic Evidence for Interpretation:

    • Evidence for a Broader Interpretation: The specification provides a high-level definition, stating a "word sense number is an address to the meaning of a word" and an "address to a dictionary definition" ('468 Patent, col. 4:8-9). This language could support an interpretation covering any form of pointer or address that links a word to its semantic meaning in a database.
    • Evidence for a Narrower Interpretation: The specification provides highly detailed formats for different types of word sense numbers, breaking them down into specific components like a "word sense identification number," a "type number," a "specificity number," and an "experience number" ('468 Patent, FIG. 17A, col. 6:40-49). A defendant may argue that the term requires this specific multi-component structure, which may not be present in the accused system's data identifiers.
  • The Term: "dictionary data base"

  • Context and Importance: This term defines the core repository of linguistic information that the claimed method processes. Plaintiff's case alleges that Microsoft's vast, cloud-based Bing and Satori technologies collectively function as the claimed "dictionary data base." The viability of the infringement claim may depend on whether this modern, distributed architecture can be equated with the database structure described in the patent.

  • Intrinsic Evidence for Interpretation:

    • Evidence for a Broader Interpretation: The patent claims describe the database as containing a "plurality of entries" with "syntax usage data" and "word sense numbers" ('468 Patent, cl. 1). This functional description could be argued to encompass any data repository that serves these functions, regardless of its specific implementation.
    • Evidence for a Narrower Interpretation: The patent's figures and detailed description illustrate a more localized and explicitly structured database, with specific tables and formats like "Dictionary 20" ('468 Patent, FIG. 3A, col. 27:35-43). A defendant could argue that this implies a more self-contained and pre-defined structure, unlike the dynamically updated, web-scale graph of the accused system.

VI. Other Allegations

  • Indirect Infringement: The complaint does not contain allegations supporting indirect infringement; each count of infringement alleges direct infringement under 35 U.S.C. § 271(a) (Compl. ¶18, 31, 42, 56, 70, 82, 91).
  • Willful Infringement: The complaint seeks a finding of willful infringement (Compl. p. 42, ¶B). The factual basis alleged for willfulness is pre-suit knowledge, based on the assertion that "one of the patents-in-suit has been cited during prosecution of patents listing Defendant Microsoft as assignee" for related natural language processing technologies (Compl. ¶9).

VII. Analyst’s Conclusion: Key Questions for the Case

  • Definitional Scope: A core issue will be whether claim terms originating in a patent family with a 1994 priority date, such as "word sense number" and "dictionary data base", can be construed broadly enough to read on the fundamentally different architecture of a modern, cloud-based, dynamic knowledge graph like that allegedly used by Microsoft Cortana. The case may turn on whether the court views the claims through a functional lens or a more rigid, structural one.
  • Evidentiary Mapping: A central evidentiary question will be one of technical equivalence. Can the plaintiff produce sufficient evidence to demonstrate that the internal identifiers and data relationships within Microsoft’s proprietary Bing and Satori technologies function in the same way as the multi-component "word sense numbers" and "state representation data" detailed in the patent specifications, or will Microsoft be able to demonstrate a fundamental mismatch in technical operation?
  • Impact of Prior Construction: The plaintiff’s decision to preemptively introduce claim constructions from a prior case in another district (Compl. ¶21) raises a key strategic question: will this court adopt or be persuaded by those earlier constructions, and how will that choice influence the central questions of definitional scope and technical mapping?