HON 207 Introduction to Cognitive Science
Spring 2011

Small Project 2
Inflecting English Verbs
Due Friday April 22

Overview

In class, we discuss two models of verb inflection: the Words and Rules model (WR) and the Rumelhart-McClelland Model (RMM). WR uses a grammar (rules) to inflect regular verbs and stores irregular verbs in the lexicon. RMM is a connectionist (neural network) model that uses a single network to inflect verbs based on their phonetics.

In this project, you will get hands-on experience with a third model of inflection, which we'll call the Rules-only Model (RO Model), because it uses only rules to inflect all verbs, regular and irregular. You can use this project to better understand the capabilities and limitations of the different models.

You will create a set of rules that can transform an English verb into its past, past participle, or present participle forms. Then you'll analyze and write about your experience, and compare this model to other models of verb inflection.

Background

Linguists broadly categorize English verbs into two classes: regular and irregular. The past tense of regular verbs is formed by adding "ed". For example, the simple present verb "form" becomes "formed" in the past tense. The past and present participle forms of "form" are "formed" and "forming", respectively.

In the RO model, a single set of rules is used to create the different verb tenses from the infinitive. A very simple set of rules for transforming some verbs is shown below:

(
  ((eat past) --> (ate))  ; eat
  ((eat past-part) --> (eaten))

  ((*Ay past) --> (<Aied))  ; e.g. apply, pry, cry
                                  ; (but not say, play, ...)

  ((*A$V$C past) --> (<A<V<C<Ced)) ; ex. pat, tap, grip
  ((*A$V$C past-part) --> (<A<V<C<Ced))
  ((*A$V$C pres-part) --> (<A<V<C<Cing))

  ((*A past) --> (<Aed))  ; default rule
  ((*A past-part) --> (<Aed))
  ((*A pres-part) --> (<Aing))
)

Because these rules are processed by a computer, they must conform to a particular syntax. The parentheses at the very beginning and very end indicate that there is a list of rules. Each of the other lines is a rule. The basic format of a rule is:

(<input-pattern> <tense>) --> (<result-pattern>)

For example, the first rule, "((eat past) --> (ate))", could be translated into English as: if the input verb is "eat" and the intended tense is "past", then return the string "ate". The second rule does the same for the past participle form.

Rules can also contain wildcards. A wildcard consists of a punctuation mark and a letter that stands for a variable. The wildcard is interpreted as follows:

At the end of the third rule is a comment. The system ignores the rest of the line after a semicolon.
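To make the rule syntax and the comment convention concrete, here is a minimal Python sketch that splits one rule line into its three parts. It is only an illustration of the format described above, not the code behind the web application; the names parse_rule and RULE_RE are made up for this example.

```python
import re

# One rule per line: ((input-pattern tense) --> (result-pattern))
# RULE_RE and parse_rule are hypothetical names used only in this sketch.
RULE_RE = re.compile(r"^\(\((\S+)\s+(\S+)\)\s*-->\s*\((\S+)\)\)$")


def parse_rule(line):
    """Split one rule line into (input_pattern, tense, result_pattern).

    Returns None for blank lines, comment-only lines, and the lone
    parentheses that open and close the list of rules.
    """
    text = line.split(";", 1)[0].strip()  # ignore everything after a semicolon
    if text in ("", "(", ")"):
        return None
    match = RULE_RE.match(text)
    if match is None:
        raise ValueError(f"malformed rule: {line!r}")
    return match.groups()


print(parse_rule("  ((eat past) --> (ate))  ; eat"))
print(parse_rule("  ((*Ay past) --> (<Aied))  ; e.g. apply, pry, cry"))
```

A line with a missing or extra parenthesis raises ValueError here, which is the most common kind of mistake to watch for when writing rules by hand.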

The seventh rule above, ((*A past) --> (<Aed)), is a sort of catch-all rule for the simple past tense. The *A wildcard matches the entire word, and the result side of the rule copies the word and adds "ed" to it.

Before you start, you should make sure that you understand each of the rules above, what verbs they will work on, and what verbs they will not work on.

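One more note about the rules: they are tried sequentially, starting from the top. The first rule whose input-pattern matches the current verb, and whose tense is the one requested, is the one that does the inflection. Later rules are not tried, so specific rules must come before general ones.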
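The first-match behavior can be sketched in a few lines of Python. This is a rough model for thinking about rule order, not the application's actual code. It assumes that *A matches any run of letters, that $V matches a single vowel, that $C matches a single consonant, and that <A copies whatever A matched; these readings are inferred from the example rules and may differ from the real engine. The function names are hypothetical.

```python
import re

# The default rules from the handout, as (input-pattern, tense, result-pattern).
RULES = [
    ("eat", "past", "ate"),
    ("eat", "past-part", "eaten"),
    ("*Ay", "past", "<Aied"),
    ("*A$V$C", "past", "<A<V<C<Ced"),
    ("*A$V$C", "past-part", "<A<V<C<Ced"),
    ("*A$V$C", "pres-part", "<A<V<C<Cing"),
    ("*A", "past", "<Aed"),
    ("*A", "past-part", "<Aed"),
    ("*A", "pres-part", "<Aing"),
]

# A wildcard is a punctuation mark plus a variable letter; anything else is literal.
TOKEN = re.compile(r"([*$])([A-Z])|(.)")


def pattern_to_regex(pattern):
    """Translate an input-pattern into a regular expression (assumed semantics)."""
    parts = []
    for mark, var, literal in TOKEN.findall(pattern):
        if mark == "*":    # assumption: any run of letters
            parts.append(f"(?P<{var}>.*)")
        elif mark == "$":  # assumption: one letter; V = vowel, C = consonant
            letters = {"V": "[aeiou]", "C": "[b-df-hj-np-tv-z]"}.get(var, ".")
            parts.append(f"(?P<{var}>{letters})")
        else:
            parts.append(re.escape(literal))
    return re.compile("".join(parts))


def inflect(verb, tense, rules=RULES):
    """Apply the first rule whose pattern and tense both match; None if none does."""
    for pattern, rule_tense, result in rules:
        if rule_tense != tense:
            continue
        match = pattern_to_regex(pattern).fullmatch(verb)
        if match:
            bindings = match.groupdict()
            # Replace each <X in the result with the text that variable X matched.
            return re.sub(r"<([A-Z])", lambda m: bindings[m.group(1)], result)
    return None


for verb in ("eat", "apply", "pat", "form"):
    print(verb, inflect(verb, "past"))

# Moving the catch-all rule to the top hides every rule below it.
print(inflect("eat", "past", rules=[RULES[6]] + RULES))
```

The last line shows why order matters: with the catch-all rule first, the specific rule for "eat" is never reached.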

What to do

First, point your browser to this address: http://alarm.cti.depaul.edu/verbs/check.

You will find a simple application that implements the RO model. On this page there is a text area for rules, a text box for the simple form of the verb that you want to inflect, a widget to select the tense of the verb to create, and a submit button. The rule box comes with a default set of rules preloaded, the same set of rules shown above. Try inflecting a few verbs with the default rules to make sure your predictions were correct. You will modify these rules to create your own rule set.

Now think of as many verbs as you can, both regular and irregular. Identify verbs that will not be processed correctly with the default rules, and come up with rules for correctly processing as many as you can. Can you identify groups of "regular irregulars" where several different irregular verbs follow the same pattern?

Your verb list should include at least 20 verbs (more is better): some regular, some irregular, and especially some groups that are pseudo-regular, that is, groups that don't follow the default inflection but are still systematic. For each verb, list the infinitive, the simple past, the past participle, and the present participle, like this:

   see saw seen seeing
   be was been being
   pat patted patted patting

This will make it very easy to see how well your rules are inflecting the different verbs.

Hints and warnings

The Verb Inflection Engine is not especially clever with regard to checking that your rules are well-formed. If there is an error in your rules (most likely a missing or extra parenthesis), you'll get a simple error message, but the application won't tell you exactly where the problem is. For this reason, you may want to try out your rules one at a time instead of all at once. Once you have successfully tested a rule, however, you should combine it with the rest to make sure the whole set works as a system. To reiterate: the entire set of rules should be combined for your final testing of the system. You should also put comments on the groups of rules, following the example above.

Warning: The web page will not save your rules for you. You will need to copy them and paste them into a file of your own. When you have finalized your rule changes, run a set of verbs through and indicate which verbs are handled properly and which are not.

What to hand in

Write a report that describes your experience and your thoughts about it. Your report should include these sections:

There is no specific requirement on the length of the paper. It should be long enough to make it clear to me what you were doing and what you thought about, but please don't confuse quantity with quality. In other words, it doesn't need to be long to be good.

Discussion questions:

Grading

This project is worth 20 points. The following grading scheme will be applied when reviewing the reports: