The Jerome project is $TRANSMACHINA$'s multilingual database and is a core component of the $SPRAWK$ translation suite. It represents a complex language network linking words and meanings together. For example, the word "fan" has several linked meanings ("cooling device", "a fanatic supporter"). Conversely, the meaning "a small round confectionery baked in an oven" has several words even in English ("biscuit" and "cookie"), depending on dialect.
Words also have links to other words. For example, "dogs" is linked to "dog" because it is the plural form, and "ate" is linked to "to eat" because it is a past tense form. Currently, the database contains only base forms (the singular for nouns and the infinitive for verbs).
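The word/meaning network described above can be pictured as a many-to-many mapping. The following is a minimal in-memory sketch only: the class LexiconSketch and its methods are hypothetical names chosen for illustration, it ignores language tags, and it says nothing about how Jerome actually stores its data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a word <-> meaning network; not the Jerome schema.
class LexiconSketch {
    // word -> meanings it can express
    private final Map<String, Set<String>> meaningsByWord = new LinkedHashMap<>();
    // meaning -> words that express it
    private final Map<String, Set<String>> wordsByMeaning = new LinkedHashMap<>();

    // Record one word-to-meaning link so it can be followed in both directions.
    public void link(String word, String meaning) {
        meaningsByWord.computeIfAbsent(word, k -> new LinkedHashSet<>()).add(meaning);
        wordsByMeaning.computeIfAbsent(meaning, k -> new LinkedHashSet<>()).add(word);
    }

    // All meanings linked to a word, in insertion order (empty if unknown).
    public List<String> meaningsOf(String word) {
        Set<String> found = meaningsByWord.getOrDefault(word, Collections.emptySet());
        return new ArrayList<>(found);
    }

    // All words linked to a meaning, in insertion order (empty if unknown).
    public List<String> wordsFor(String meaning) {
        Set<String> found = wordsByMeaning.getOrDefault(meaning, Collections.emptySet());
        return new ArrayList<>(found);
    }

    // Builds the small example used in the text.
    public static LexiconSketch example() {
        LexiconSketch lex = new LexiconSketch();
        lex.link("fan", "cooling device");
        lex.link("fan", "a fanatic supporter");
        lex.link("biscuit", "a small round confectionery baked in an oven");
        lex.link("cookie", "a small round confectionery baked in an oven");
        return lex;
    }
}
```

With this sketch, LexiconSketch.example().meaningsOf("fan") yields both meanings of "fan", and wordsFor(...) on the confectionery meaning yields "biscuit" and "cookie".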
The goal is to compute, record and link in all word forms for all languages.
For nouns: plurals and cases will need to be added. The rules for automatically generating these forms can be very complex, but are often available on the web. Transmachina has already developed Java methods to calculate these for our focus languages.
For verbs: tenses must be generated for each person and number (e.g. je suis, tu es, il est, nous sommes, vous êtes, ils sont). For Latin languages this can generate hundreds of forms per infinitive. Once again, Transmachina has routines for our core languages.
For adjectives: gender and plural agreement must be generated for most European languages.
For adverbs: comparative and superlative forms need to be generated (e.g. soon, sooner, soonest).
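As an illustration of what a form-generation routine for the categories above might look like, here is a toy sketch. The FormGenerator interface and the EnglishRegularPlural class are hypothetical and are not Transmachina's existing Java methods. The rule set covers only regular English plurals; irregular nouns ("child", "mouse") and other languages would need their own rules.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical interface: one generator per language and word class.
interface FormGenerator {
    // Returns the inflected forms derived from a base form.
    List<String> generate(String baseForm);
}

// Toy generator for regular English noun plurals only (an assumption-laden sketch).
class EnglishRegularPlural implements FormGenerator {
    private static boolean isVowel(char c) {
        return "aeiou".indexOf(c) >= 0;
    }

    @Override
    public List<String> generate(String baseForm) {
        String w = baseForm.toLowerCase(Locale.ROOT);
        int n = w.length();
        String plural;
        if (w.endsWith("s") || w.endsWith("x") || w.endsWith("z")
                || w.endsWith("ch") || w.endsWith("sh")) {
            plural = w + "es";                          // box -> boxes
        } else if (n >= 2 && w.endsWith("y") && !isVowel(w.charAt(n - 2))) {
            plural = w.substring(0, n - 1) + "ies";     // city -> cities
        } else {
            plural = w + "s";                           // dog -> dogs
        }
        return Collections.singletonList(plural);
    }
}
```

Keeping each rule set behind a common interface would let verb, adjective and adverb generators be added per language without changing the code that records the resulting forms.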
All generated forms must link back to the base form via a Word-to-Word link. Thus, if we were to look up the word "dogs", the Meaning "a four-legged mammal" can be reached via the base word. Also, all base words should have links to all their inflected forms. The type of the link depends on the direction: if it is base->inflection, the type is inflection; if it is inflection->base, the type is base form.
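The two-way linking rule can be sketched as follows. This is an illustrative in-memory model under stated assumptions: WordLinkSketch, LinkType and the method names are hypothetical, and only the two link types named above are modelled. Each inflection is recorded as a pair of links, and a lookup on an inflected form follows the base form link to reach the Meaning.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of typed Word-to-Word links; not the Jerome schema.
class WordLinkSketch {
    enum LinkType { INFLECTION, BASE_FORM }

    static final class Link {
        final String from;
        final String to;
        final LinkType type;

        Link(String from, String to, LinkType type) {
            this.from = from;
            this.to = to;
            this.type = type;
        }
    }

    private final List<Link> links = new ArrayList<>();
    // Meanings are attached to base words only.
    private final Map<String, Set<String>> meaningsByBaseWord = new HashMap<>();

    public void addMeaning(String baseWord, String meaning) {
        meaningsByBaseWord.computeIfAbsent(baseWord, k -> new LinkedHashSet<>()).add(meaning);
    }

    // Records both directions: base->inflection and inflection->base.
    public void addInflection(String base, String inflected) {
        links.add(new Link(base, inflected, LinkType.INFLECTION));
        links.add(new Link(inflected, base, LinkType.BASE_FORM));
    }

    // Targets of all links of the given type that start at the given word.
    public List<String> linked(String from, LinkType type) {
        List<String> targets = new ArrayList<>();
        for (Link link : links) {
            if (link.from.equals(from) && link.type == type) {
                targets.add(link.to);
            }
        }
        return targets;
    }

    // Meanings of the word itself plus those reached via its base form links.
    public Set<String> meaningsOf(String word) {
        Set<String> result = new LinkedHashSet<>(
                meaningsByBaseWord.getOrDefault(word, Collections.emptySet()));
        for (String base : linked(word, LinkType.BASE_FORM)) {
            result.addAll(meaningsByBaseWord.getOrDefault(base, Collections.emptySet()));
        }
        return result;
    }
}
```

For example, after addMeaning("dog", "a four-legged mammal") and addInflection("dog", "dogs"), calling meaningsOf("dogs") reaches the meaning through the base word.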
Word-to-Word links have at least the following types:
These can be configured via the Administration interface of the database.