Projects

Ambiverse

Ambiverse develops AI technology to automatically understand and analyze large collections of textual data. Our cloud-based natural language understanding and text analytics API is built on highly cited research (cited as state of the art by IBM Watson, among others) at the German Max Planck Institute for Informatics, a world-leading research lab. The Ambiverse core technology identifies persons, organizations, products, and other real-world entities in text. For instance, consider the sentence “Page played the hit Kashmir on his uniquely tuned Les Paul.” Our technology understands that “Page” refers to the famous rock guitarist Jimmy Page and not to Larry Page (or any other “Page”), and that “Kashmir” refers to the song and not the Asian region. The set of entities that our system spots is stored in a continuously expanding knowledge graph that we mine from trusted web sources. In contrast to other services, our technology is also designed to run over customer-specific knowledge graphs for highly specialized text analytics. These graphs can be provided directly by the user or built by Ambiverse according to customer requirements and specific data sources. Our long-term goal is to develop language understanding technology that lets machines extract and understand knowledge in order to answer and fulfill complex human requests.

My Research

AIDA is a framework and online tool for entity detection and disambiguation. Given a natural-language text, such as a news article, it maps mentions of ambiguous names onto canonical entities (e.g., individual people or places) registered in the YAGO knowledge base. The source code, JSON web service API, and demo are available on the AIDA website.

Figure: Finding the meaning in a sentence with AIDA.
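
The idea of mapping ambiguous mentions onto canonical entities can be illustrated with a toy sketch. This is not AIDA's actual method: it swaps in a plain keyword-overlap score, and the candidate dictionary, entity identifiers, and keyword sets below are all hypothetical values made up for illustration, not data from YAGO.

```python
import re

# Hypothetical candidate dictionary: mention -> {entity id: context keywords}.
# Entity ids and keywords are invented for this sketch, not taken from YAGO.
CANDIDATES = {
    "Page": {
        "Jimmy_Page": {"guitarist", "rock", "played", "hit", "tuned", "paul"},
        "Larry_Page": {"google", "search", "founder", "engineer"},
    },
    "Kashmir": {
        "Kashmir_(song)": {"hit", "song", "played", "rock", "album"},
        "Kashmir_(region)": {"region", "asia", "valley", "border"},
    },
}


def disambiguate(text):
    """Pick, for each known mention in the text, the candidate entity
    whose keyword set overlaps most with the words of the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    result = {}
    for mention, candidates in CANDIDATES.items():
        if mention.lower() in tokens:
            # Score each candidate by the number of shared words.
            result[mention] = max(
                candidates, key=lambda entity: len(candidates[entity] & tokens)
            )
    return result


sentence = "Page played the hit Kashmir on his uniquely tuned Les Paul."
print(disambiguate(sentence))
# {'Page': 'Jimmy_Page', 'Kashmir': 'Kashmir_(song)'}
```

A real system would also have to detect the mentions in the first place and draw its candidates from a knowledge base. The sketch only shows why the surrounding words matter for choosing between candidates.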

YAGO is a huge semantic knowledge base derived from Wikipedia, WordNet, and GeoNames. Currently, YAGO has knowledge of more than 10 million entities (such as persons, organizations, and cities) and contains more than 120 million facts about these entities. All the data, as well as several interfaces to browse and query it, are available on the YAGO website.

STICS is an entity-centric search engine that makes use of AIDA and YAGO. By extending the Google slogan of “things, not strings” to also support entity categories, STICS provides powerful functionality for querying and analyzing news and other text corpora in terms of entities, semantic classes, and text phrases. You can search, for example, for presidents of the United States and the JFK airport, and see how STICS distinguishes between JFK the president and JFK the airport.
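
To make the "things, not strings" idea concrete, here is a minimal sketch of searching by entity and by semantic class instead of by surface string. It is not how STICS is implemented: the `search` function, the entity identifiers, the class names, and the three sample documents are all hypothetical, and the documents are assumed to be already annotated with entities.

```python
# Hypothetical entity -> semantic classes mapping (invented for this sketch).
ENTITY_CLASSES = {
    "John_F._Kennedy": {"person", "US_president"},
    "Barack_Obama": {"person", "US_president"},
    "JFK_Airport": {"airport", "location"},
}

# Hypothetical sample documents, assumed to be already annotated with entity ids.
DOCS = {
    "d1": {"text": "JFK addressed the nation.", "entities": {"John_F._Kennedy"}},
    "d2": {"text": "Obama landed at JFK.", "entities": {"Barack_Obama", "JFK_Airport"}},
    "d3": {"text": "Delays at JFK after the storm.", "entities": {"JFK_Airport"}},
}


def search(entities=(), classes=()):
    """Return the sorted ids of documents that mention every requested
    entity and at least one entity of every requested semantic class."""
    hits = []
    for doc_id, doc in DOCS.items():
        has_entities = all(e in doc["entities"] for e in entities)
        has_classes = all(
            any(c in ENTITY_CLASSES[e] for e in doc["entities"]) for c in classes
        )
        if has_entities and has_classes:
            hits.append(doc_id)
    return sorted(hits)


# Any US president together with the JFK airport.
print(search(entities=["JFK_Airport"], classes=["US_president"]))  # ['d2']
# The president only; the two airport documents are not returned.
print(search(entities=["John_F._Kennedy"]))  # ['d1']
```

A plain string search for "JFK" would match all three sample documents. Querying over entity annotations separates the president from the airport.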