Loebner Prize 2009

US$3000 and a Bronze Annual Medal

Sunday, 6 September 2009 Brighton, England

In conjunction with InterSpeech 2009

 

 

In 1950 Alan Turing wrote:

“…I believe that in about fifty years' time it will be possible, to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning…”

 

Turing’s prediction is ambiguous.  Did he mean a 5 minute test, or did he mean 5 minutes of questioning the program? If the latter, and presuming that the human would also be subjected to a 5 minute questioning period, the test itself would take 10 minutes.

 

The 2008 Loebner Prize put this prediction to the test in the first manner by having 5 minute Turing Tests.  That is, the judge was allowed a total of 5 minutes to respond to both entities.  As a consequence the expected interaction time with the computer program was 2.5 minutes.

 

The 2009 Loebner Prize will test Turing’s assertion in the second manner.  The 2009 rules will require each judge to interrogate each entity for 5 minutes.

 

[Note that there is also the ambiguity of the “70 per cent chance,” since if the computer were able to respond as a human, we would still only expect it to be chosen as the human 50 per cent of the time.]

 

Since the questioning period will only be 10 minutes, the US$25,000 and the Silver Medal will not be at risk.

 

Rules for Loebner Prize 2009.

 

1: IMPORTANT DATES:

 

The 2009 Competition is scheduled for Sunday, 6 September 2009 in Brighton, England in conjunction with InterSpeech 2009.

 

6 Apr 2009 - Opening Date for Entries

4 May 2009 - Closing Date for receipt of Entries

1 Jun 2009 - Final Four announced

 

The date and venue are subject to change, but NO changes to the date and venue will be made after 1 May 2009. In the case that the date is moved, it will not be moved to a date earlier than Sunday, 4 October 2009 or later than Sunday, 1 November 2009.

 

Entrants have three options for submitting their programs.

A. They may submit their entries on CD, DVD or USB Flash via a message service

• requiring a receipt signature and

• having a time/date stamp (e.g. Certified, Registered, FedEx, UPS, etc.)

B. They may install their programs at the testing site on a management supplied computer.

C. They may bring a computer with the program installed to the testing site.

 

Entrants choosing option A. should transmit their entry to:

       Loebner Prize Contest

       c/o Crown Industries, Inc.

       155 North Park St.

       East Orange, NJ 07017

 

Entrants choosing options B. or C. must schedule the time/date of their appearance with me PRIOR to 6 Apr.  The date to be scheduled must be after 6 Apr and prior to 4 May.

 

Final Four entrants who choose submission option A. do NOT have to be present at the competition. Those who choose options B. or C. MUST be present to install and operate their entries.

 

No entry will be tested by contest management which requires contest management to key in path names.

 

No entry will be tested by contest management which requires contest management to modify system variables (although these may be modified by a supplied installer).

 

No entry will be tested by contest management which does not provide, on the transmittal media, all necessary programs, interpreters, etc (e.g. Perl, MySQL, etc).

 

No person may be affiliated with more than one entry.

 

Every entry must be accompanied by a statement asserting that the submitter(s) have intellectual rights to all components of the entry.

 

Entrants under 18 years of age must have written permission from at least one guardian.

 

Only the first 16 compliant entries will be evaluated in depth.  This means that all entries will be tested in order of receipt for compliance with the rules. The 16 compliant entries having the earliest time stamps will be screened according to the criteria in point 4, below.

 

If there is no compliant Entry for the 2009 Competition, the $3000 prize will be added to the 2010 Competition prize making the 2010 prize $6000, and the 2010 Competition will be held under these rules.

 

2: COMMUNICATIONS PROTOCOL. 

The Loebner Prize Protocol (LPP) will be used in the 2009 Competition. Each Entry Program must communicate with a "Judge Communications" program in the following manner:

 

The LPP is a character by character asynchronous communications protocol.

 

Each program, upon startup, must provide a “browse” function to select a directory.  Communications shall be by means of the creation, detection, and deletion of sub-directories within the specified communications directory.

 

To simulate a key press the entry program must create a sub-directory within the communications directory with the following format:

 

“time.keypress-name.extension”

 

where “time” is a monotonically increasing 18 digit number, zero filled to the left so that its lexical and numerical order agree. It is to be retrieved from the system clock and expressed as milliseconds past some initial time as defined by the system clock.

“keypress-name” is either a single letter (case sensitive) or the name of the special character, as appended to these rules.

The extension is “.other”

 

For example: “000001234567890123.bracketleft.other”
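As an illustration of the naming scheme above, the following Python sketch simulates a key press by creating the sub-directory. It is only a sketch under stated assumptions, not part of the official rules: the names `next_stamp`, `keypress_name`, and `send_keypress` are hypothetical, only a few of the special-character names from the Appendix are included, and the Unix epoch is assumed as the "initial time."

```python
import os
import time

# A few of the special-character names listed in the Appendix of these rules.
SPECIAL_NAMES = {
    " ": "space",
    ",": "comma",
    ".": "period",
    "?": "question",
    "[": "bracketleft",
    "\n": "Return",
}

_last_stamp = 0  # last value issued, so that stamps never repeat or go backwards


def next_stamp():
    """Return a strictly increasing millisecond count as 18 zero-filled digits."""
    global _last_stamp
    now_ms = int(time.time() * 1000)  # assumption: milliseconds since the Unix epoch
    _last_stamp = max(now_ms, _last_stamp + 1)
    return f"{_last_stamp:018d}"


def keypress_name(ch):
    """Map one character to its LPP key-press name."""
    if ch in SPECIAL_NAMES:
        return SPECIAL_NAMES[ch]
    if len(ch) == 1 and ch.isascii() and ch.isalpha():
        return ch  # single letters are used as-is (case sensitive)
    # The rules name only letters and special characters, so anything else is rejected here.
    raise ValueError(f"no key-press name known for {ch!r}")


def send_keypress(comm_dir, ch):
    """Create the 'time.keypress-name.other' sub-directory for one key press."""
    name = f"{next_stamp()}.{keypress_name(ch)}.other"
    os.mkdir(os.path.join(comm_dir, name))
    return name
```

Bumping the stamp by one when two key presses fall in the same millisecond keeps the directory names unique and in typing order.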

 

To detect a key press by the judge, the program must detect, within the communications directory, a sub-directory with the same format but with the extension “.judge”, and must then remove or delete the judge’s sub-directory from the communications directory.
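The detection step might be sketched in Python as follows. This is an assumption-laden illustration rather than the official judge program's counterpart: the function name `poll_judge_keypresses` is hypothetical, and the sketch assumes the entry program calls it repeatedly in its own polling loop.

```python
import os


def poll_judge_keypresses(comm_dir):
    """Return pending judge key-press names in time order, removing each one."""
    pending = []
    # Zero-filled time stamps sort correctly as plain strings.
    for entry in sorted(os.listdir(comm_dir)):
        path = os.path.join(comm_dir, entry)
        parts = entry.split(".")
        if len(parts) != 3 or parts[2] != "judge" or not os.path.isdir(path):
            continue  # not a judge key press (e.g. our own ".other" entries)
        stamp, name, _ = parts
        if len(stamp) != 18 or not stamp.isdigit():
            continue  # does not match the "time.keypress-name.extension" format
        os.rmdir(path)  # the rules require removing the judge's sub-directory
        pending.append(name)
    return pending
```

Each returned item is a key-press name such as `H` or `bracketleft`, which the entry program would then translate using the Appendix.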

 

A previous version of the judge program is available at:

 

http://loebner.net/Prizef/JComm.txt

 

In order to run this as a Perl program, change the extension from .txt to .pl (or whatever extension is assigned to Perl programs).

 

Note that there will be an update to this program but the basic communications strategy will not change.

 

3: INTERACTION SEQUENCE.

 

Each judge will begin the round by making an initial comment with the LEFT entity. The judge will continue interacting with the left entity for 5 minutes.  At the conclusion of the five minutes, the judge will begin the interaction with the RIGHT entity and continue for 5 minutes.

 

The decision as to whether the LEFT entity is the human or the computer will be made on a random basis.

 

Both entry programs and human confederates must wait until a judge starts the interaction.

 

Entries will be expected to respond to the judge's initial comment or question.  There will be no restrictions on what names, etc. the entries, humans, or judges can use, nor any other restrictions on the content of the conversations.

 

Participants are advised that transcripts of their conversations will be published.

 

At the conclusion of the 10 minutes of questioning, judges will be allowed 10 minutes to review the conversations.   They will then score one of the two entities as the human. Following this, there will be a 5 minute period for judges and confederates to take their places for the next round.

 

Contest management reserves the right to enter one or more publicly available open source programs.

 

3:  SCORING THE "FINAL FOUR".

 

The Final Four Competition will be scored using the Method of Paired Comparisons.

 

Each judge will select one entry from each pair as being the human.  After the judging has been completed each judge will have judged 4 entries as “non-human.”  Of these 4 perceived “non-human” entities each judge shall then rank them in terms of “degree of humanness” with 4 being “Most Human” and 1 being “Least Human.”

 

The computer program which has been evaluated as “Human” the most times will be declared the winner.

 

We wish (a) each Entry to be compared with every Confederate; (b) each Judge to evaluate every Entry; (c) each Judge to evaluate every Confederate.

 

Label the four Entries E1..E4, the four human Confederates C1..C4, and the four Judges J1..J4.

 

The following matrix has Judges as rows and Entry Programs as columns. The intersection of each row and column shows which human Confederate is assigned to the combination of Entry and Judge.

 

        E1 .... E2 .... E3 .... E4

----------------------------------

J1 .... C1 .... C2 .... C3 .... C4

J2 .... C4 .... C1 .... C2 .... C3

J3 .... C3 .... C4 .... C1 .... C2

J4 .... C2 .... C3 .... C4 .... C1

 

 

For example, reading across row 2 we see that J2 compares E1 with C4, E2 with C1, E3 with C2, and E4 with C3. J2 will have scored every Entry and every Confederate, but in different combinations than Judges J1, J3 and J4.

 

Reading down the third column, we see in the first row that E3 is judged by J1 against confederate C3. Let us enter a 1 in that cell if E3 was chosen as the human and 0 otherwise. We may continue down the column, entering a 1 in the second row if E3 was evaluated as the human against confederate C2, zero otherwise. The sum of the column will be the number of times E3 was judged as "more human" than a Confederate. We may do this for each Entry.

 

The Entry with the highest column total will be declared the winner.

 

If two or more Entries tie for the highest column total, the tied programs shall be evaluated by the mean of their rankings by those judges who judged them not to be the human.
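The column totals and the tie-break can be sketched in Python as below. This is an illustration under assumptions, not the official scoring program: the function name `score_final_four` and the dictionary layout are hypothetical, and it is assumed that a higher mean ranking wins a tie, since 4 denotes “Most Human.” The rules do not say what happens if the mean rankings also tie, or if a tied Entry was never judged non-human.

```python
def score_final_four(chosen_as_human, ranks):
    """Return (winner, column_totals).

    chosen_as_human[judge][entry] is 1 if that judge chose the entry as the
    human and 0 otherwise. ranks[judge][entry] is the 1..4 ranking a judge
    gave an entry that the judge scored as non-human.
    """
    entries = sorted({e for row in chosen_as_human.values() for e in row})
    totals = {e: sum(row.get(e, 0) for row in chosen_as_human.values())
              for e in entries}

    best = max(totals.values())
    tied = [e for e in entries if totals[e] == best]
    if len(tied) == 1:
        return tied[0], totals

    def mean_rank(entry):
        given = [r[entry] for r in ranks.values() if entry in r]
        # Assumption: an entry with no rankings gets 0.0; the rules are silent here.
        return sum(given) / len(given) if given else 0.0

    # Assumption: the higher mean ranking ("more human") breaks the tie.
    return max(tied, key=mean_rank), totals
```

For instance, if E1 and E2 are each chosen as the human once, the function compares the mean of the rankings each received from the judges who scored it as non-human.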

 

Judging will consist of seven rounds of 20 minutes duration with 5 minute intermissions. Not all Judges and Confederates will participate in every round. In each round, Judges will have 10 minutes to interact with a pair and 10 minutes to review and score the programs. After the 10 minute evaluation period there will be a 5 minute break for reassignment.

 

The following table shows each round. In the first round J1 compares E1 with C1 and J2 compares E3 with C2. Judges J3 and J4 and Confederates C3 and C4 are excused from the round. Excused Judges will be kept separate from excused Confederates and both will be kept separate from the competition.

 

Round .... Participating ........... Excused

1 ........ J1E1C1 J2E3C2 ........... J3 J4 C3 C4

2 ........ J4E1C2 J3E3C1 J2E4C3 .... J1 C4

3 ........ J3E1C3 J4E3C4 J1E2C2 .... J2 C1

4 ........ J2E1C4 J3E4C2 ........... J1 J4 C1 C3

5 ........ J2E2C1 J1E3C3 ........... J3 J4 C2 C4

6 ........ J1E4C4 J4E2C3 ........... J2 J3 C1 C2

7 ........ J4E4C1 J3E2C4 ........... J1 J2 C2 C3
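The round table can be cross-checked against the Judge/Entry matrix given earlier in this section. The following Python sketch is one way to do so; the names `ASSIGNMENT`, `ROUNDS`, and `check_schedule` are hypothetical, and the data are transcribed by hand from the two tables.

```python
# ASSIGNMENT[judge][entry] is the Confederate from the Judge/Entry matrix.
ASSIGNMENT = {
    "J1": {"E1": "C1", "E2": "C2", "E3": "C3", "E4": "C4"},
    "J2": {"E1": "C4", "E2": "C1", "E3": "C2", "E4": "C3"},
    "J3": {"E1": "C3", "E2": "C4", "E3": "C1", "E4": "C2"},
    "J4": {"E1": "C2", "E2": "C3", "E3": "C4", "E4": "C1"},
}

# The "Participating" column of the round table, one list per round.
ROUNDS = [
    ["J1E1C1", "J2E3C2"],
    ["J4E1C2", "J3E3C1", "J2E4C3"],
    ["J3E1C3", "J4E3C4", "J1E2C2"],
    ["J2E1C4", "J3E4C2"],
    ["J2E2C1", "J1E3C3"],
    ["J1E4C4", "J4E2C3"],
    ["J4E4C1", "J3E2C4"],
]


def check_schedule(assignment, rounds):
    """Return True if the rounds cover every matrix cell exactly once."""
    seen = set()
    for number, sessions in enumerate(rounds, start=1):
        # Split e.g. "J1E1C1" into ("J1", "E1", "C1").
        triples = [(s[0:2], s[2:4], s[4:6]) for s in sessions]
        for position in range(3):
            used = [t[position] for t in triples]
            if len(used) != len(set(used)):
                raise ValueError(f"round {number}: a participant is double-booked")
        for judge, entry, confederate in triples:
            if assignment[judge][entry] != confederate:
                raise ValueError(f"round {number}: {judge}{entry}{confederate} "
                                 "does not match the matrix")
            if (judge, entry) in seen:
                raise ValueError(f"{judge} judges {entry} more than once")
            seen.add((judge, entry))
    expected = {(j, e) for j, row in assignment.items() for e in row}
    return seen == expected
```

Calling `check_schedule(ASSIGNMENT, ROUNDS)` raises an error on any conflict and otherwise reports whether all 16 Judge/Entry combinations are scheduled.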

 

 

4: SELECTING THE FINALISTS. 

 

The finalists will be chosen based upon ability to respond "intelligently" to the following types of question.

 

The 4 entries with the highest scores will be selected as finalists.

 

It is not necessary that a program be able to respond to the selection questions.  If no entries can respond "intelligently" to these questions, I will evaluate the entries on the general quality of their responses.

 

I will not ask about rare or unusual things.  All nouns, adjectives and verbs will come from a dictionary suitable for children or adolescents under the age of 12.

 

Set 1 - Questions relating to time:

Background facts: For testing purposes, I will consider these to be correct whether or not the time and venue of the contest have been changed, and I will set the system clock accordingly.

 

 a. The system clock will be accurate to within a minute or two.

 b. The competition is scheduled to start at 10:00 AM Sunday, 6 Sept 2009.

 c. There will be 7 rounds of 20 minutes each.

 

Sample Questions

• What time is it?

• What round is this?

• Is it morning, noon, or night?

• etc.
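For a question such as “What round is this?”, an entry could derive the answer from the system clock and the background facts. The Python sketch below is one possible approach, not a required method: the names `CONTEST_START` and `current_round` are hypothetical, and it assumes the 5 minute intermissions described in the scoring section separate consecutive rounds.

```python
from datetime import datetime, timedelta

CONTEST_START = datetime(2009, 9, 6, 10, 0)  # 10:00 AM Sunday, 6 Sept 2009
ROUND_LENGTH = timedelta(minutes=20)
INTERMISSION = timedelta(minutes=5)          # assumption: taken from the scoring section
ROUND_COUNT = 7


def current_round(now):
    """Return the round number (1..7) in progress at `now`, or None."""
    elapsed = now - CONTEST_START
    if elapsed < timedelta(0):
        return None  # the competition has not started yet
    # Each slot is one round followed by one intermission.
    slot, offset = divmod(elapsed, ROUND_LENGTH + INTERMISSION)
    if slot >= ROUND_COUNT or offset >= ROUND_LENGTH:
        return None  # during an intermission, or after the last round
    return slot + 1
```

In practice the entry would call `current_round(datetime.now())` and phrase the reply conversationally.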

 

Set 2 - General questions relating to things.

Sample Questions

• What would I use a hammer for?

• Of what use is a taxi?

• etc.

 

Set 3 - Questions relating to relationships

Sample Questions

• Which is larger, a grape or a grapefruit?

• Which is faster, a train or a plane?

• John is older than Mary, and Mary is older than Sarah.  Which of them is the oldest?

• etc.

 

Set 4 - Questions demonstrating "memory"

Sample Questions

I have a friend named Harry who likes to play tennis.

<This assertion is followed by one or more intervening questions or statements, followed in turn by questions about the assertion, e.g.>

• What is the name of the friend I just told you about?

• Do you know what game Harry likes to play?

• etc.

 


 

Appendix

 

Following are the names for special characters in LPP.  The names are case sensitive.

 

Name           Key

 

braceleft       '{',

braceright      '}',

bracketleft     '[',

bracketright    ']',

parenleft       '(',

parenright      ')',

space           ' ',

comma           ',',

period          '.',

greater         '>',

less            '<',

slash           '/',

backslash       '\',

bar             '|',

quotedbl        '"',

quoteright      "'",

Tab             "\t",

equal           '=',

underscore      '_',

plus            '+',

minus           '-',

exclam          '!',

at              '@',

numbersign      '#',

dollar          '$',

percent         '%',

asterisk        '*',

asciicircum     '^',

asciitilde      '~',

quoteleft       '`',

ampersand       '&',

Return          "\n",

colon           ":",

semicolon       ";",

question        "?",

BackSpace       "BackSpace",
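For programs written in Python, the table above could be carried as a dictionary and used to rebuild the text a judge has typed. This is a sketch only: the names `KEY_NAMES` and `apply_keypresses` are hypothetical, and treating BackSpace as "delete the previous character" is an assumption about how an entry would want to interpret that key.

```python
# Name -> character, transcribed from the Appendix table (BackSpace handled separately).
KEY_NAMES = {
    "braceleft": "{", "braceright": "}",
    "bracketleft": "[", "bracketright": "]",
    "parenleft": "(", "parenright": ")",
    "space": " ", "comma": ",", "period": ".",
    "greater": ">", "less": "<",
    "slash": "/", "backslash": "\\", "bar": "|",
    "quotedbl": '"', "quoteright": "'", "quoteleft": "`",
    "Tab": "\t", "Return": "\n",
    "equal": "=", "underscore": "_", "plus": "+", "minus": "-",
    "exclam": "!", "at": "@", "numbersign": "#", "dollar": "$",
    "percent": "%", "asterisk": "*", "asciicircum": "^",
    "asciitilde": "~", "ampersand": "&",
    "colon": ":", "semicolon": ";", "question": "?",
}


def apply_keypresses(names):
    """Rebuild the typed text from a sequence of LPP key-press names."""
    text = []
    for name in names:
        if name == "BackSpace":
            if text:
                text.pop()  # assumption: BackSpace erases the previous character
        elif name in KEY_NAMES:
            text.append(KEY_NAMES[name])
        elif len(name) == 1:
            text.append(name)  # a single letter stands for itself
        else:
            raise ValueError(f"unknown key-press name: {name!r}")
    return "".join(text)
```

Because the names are case sensitive, `Tab`, `Return`, and `BackSpace` must be matched exactly as written in the table.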