How to get a computer to answer a factoid question

Computers can now answer factoid questions–if they can tell what’s being asked…

An architecture for a question-answering system. Picture source: https://commons.wikimedia.org/wiki/File:Architecture_question_reponse.png (By Tinmn (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons)

France’s ability to keep on keeping on after the Paris attacks has been amazing.  In that spirit, here’s a post about something other than how horrified I am.

In a recent post, we talked about factoid questions–questions that typically start with words like who, what, when, or where, and typically have answers that are just a short phrase.  We’re pretty good at getting computers to answer those kinds of questions.

Once upon a time, the assumption in trying to get computers to answer questions (which we’re going to call question-answering in English, or questions-réponses in French) was that there was a database that contained the answers, and you were going to get the computer to process the question in such a way as to retrieve the answer from the database.  Today, the assumption is that there is a web page somewhere that has the answer.  So, how do you get to that answer?

The first step in question-answering is usually to figure out what kind of question you’re dealing with. This lets your system know what kind of answer it should be looking for.  “Where is Paris?” and “Where is the spleen?” call for very different kinds of answers.  On the other hand, “Where is the capital of France?” and “What is the capital of France?” need the same kind of answer.  So, it’s not as simple as just checking whether the question starts with who, what, when, or where. (Of course, there are many other ways that you can ask a factoid question: “When was Mozart born?” could more or less equivalently be asked as “What year was Mozart born?” You can see how difficult this can get.)
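
To make the problem concrete, here is a minimal sketch of the “just check the first word” approach. The answer-type labels and the function name are my own inventions for illustration; no real system is being quoted here.

```python
# Hypothetical mapping from a question's first word to an expected answer type.
# The labels are illustrative assumptions, not a standard inventory.
NAIVE_ANSWER_TYPES = {
    "who": "person",
    "when": "time",
    "where": "location",
}


def naive_answer_type(question: str) -> str:
    """Guess the expected answer type from the first word only."""
    words = question.strip().lower().split()
    if not words:
        return "unknown"
    return NAIVE_ANSWER_TYPES.get(words[0], "unknown")


if __name__ == "__main__":
    examples = [
        "Where is Paris?",
        "Where is the spleen?",
        "Where is the capital of France?",
        "What is the capital of France?",
        "When was Mozart born?",
        "What year was Mozart born?",
    ]
    for q in examples:
        print(f"{q} -> {naive_answer_type(q)}")
```

The first-word rule gives “Where is Paris?” and “Where is the spleen?” the same label even though they need different kinds of answers, and it gives the two Mozart questions different labels even though they are asking for the same thing.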

The French Wikipedia page on questions-réponses talks about some things that are helpful in making these kinds of distinctions between question types (and types of expected answers), and of course Zipf’s Law comes into play, so we’ll need to learn some new words (or, at least, I will–I don’t know about you):

  • le focus: As far as I can tell, this is an unassimilated English loan word that means “focus.”  Le focus d’une question correspond à la propriété ou l’entité recherchée par la question.  “The focus of a question corresponds to the property or the entity sought by the question.”
  • Le thème: theme, subject, or topic.  Le thème de la question (ou topic) est l’objet sur lequel se porte la question.  “The theme of the question (or topic) is the thing that the question is about.”

Question | Focus | Theme
Who is the president of Benin? | Who | the president of Benin
When was Mozart born? | When | Mozart
When do cells divide? | When | cells
How much does a kimono cost? | How much | a kimono
How much does an elephant weigh? | How much | an elephant

You can see from even a few examples that this is hard for a computer to do.  “When was Mozart born?” requires a very different answer from “When do cells divide?”, despite the fact that the focus looks the same in both questions.  Similarly, “How much does a kimono cost?” and “How much does an elephant weigh?” have focuses (foci?) that look the same, but they require very different types of answers.  However, determining the focus and the theme of a factoid question is a good start.  We’ll see in another post how identifying what are known as named entities can help to refine our understanding of the question.
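
For the curious, the focus and theme of the five example questions can be pulled out with a toy heuristic like the one sketched here. Everything in it is an assumption made for illustration: the word lists, the function name, and the rule of dropping the last word as the main verb are mine, and they are tuned to these examples rather than to questions in general.

```python
import re

# Toy word lists; assumptions for illustration only.
FOCUS_PHRASES = ["how much", "how many", "who", "what", "when", "where"]
AUXILIARIES = {"is", "are", "was", "were", "do", "does", "did"}
COPULAS = {"is", "are", "was", "were"}
COPULA_FOCI = {"who", "what", "where"}


def focus_and_theme(question):
    """Return (focus, theme) for a simple factoid question, or (None, None)."""
    # Drop trailing punctuation, then split on whitespace.
    words = re.sub(r"[?.!]+$", "", question.strip()).split()
    lowered = " ".join(words).lower()

    # The focus is the question phrase at the start of the question.
    focus = next(
        (p for p in FOCUS_PHRASES if lowered == p or lowered.startswith(p + " ")),
        None,
    )
    if focus is None:
        return None, None

    rest = words[len(focus.split()):]
    if rest and rest[0].lower() in AUXILIARIES:
        aux, rest = rest[0].lower(), rest[1:]
        # "Who is X?" keeps all of X; otherwise treat the last word as the
        # main verb ("born", "divide", "cost") and drop it.
        is_copular = aux in COPULAS and focus in COPULA_FOCI
        if not is_copular and len(rest) > 1:
            rest = rest[:-1]
    return focus.capitalize(), " ".join(rest)


if __name__ == "__main__":
    for q in [
        "Who is the president of Benin?",
        "When was Mozart born?",
        "When do cells divide?",
        "How much does a kimono cost?",
        "How much does an elephant weigh?",
    ]:
        print(q, "->", focus_and_theme(q))
```

Notice that the sketch returns the same focus for the Mozart and cell-division questions, and the same focus for the kimono and elephant questions. The focus alone does not tell the system what kind of answer to look for.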
