Copyright, 1991, by the Logical Language Group, Inc.
2904 Beau Lane,
Fairfax VA 22031-1303
USA Phone (703) 385-0273
lojbab@lojban.org

All rights reserved.  Permission to copy granted subject to your verification
that this is the latest version of this document, that your distribution be
for the promotion of Lojban, that there is no charge for the product, and that
this copyright notice is included intact in the copy.

			TANRU AND LUJVO-MAKING

tanru are Lojban metaphors.  They are made up of gismu representing concepts
that are related to the concept being communicated.  The relationship isn't
necessarily unambiguous in meaning, although the grammatical relationship
between the words is unambiguous.  A blue-nest in some way nests someone or
something.  It could be a nest for blue eggs, or blue people, or it could be a
house painted blue either partially or completely.  It takes a more elaborate
tanru to distinguish these less ambiguously (if such is important), or
non-tanru methods can be used to expand communication unambiguously.  Most
often, tanru will be appropriate.

tanru are something like English adjective-noun and adverb-verb combinations.
They go beyond these concepts by combining and expanding upon them.  In this,
they are similar to Chinese metaphor more than to English.  In general, the
gismu on the left modify those on the right, and all groupings are in pairs
from the left.  Thus broda brode brodi brodo brodu is a 5-part tanru.  The
meaning is interpreted by grouping gismu in pairs as:

	(((broda brode) brodi) brodo) brodu.
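
The default grouping can be sketched in Python as a left fold.  This is only
an illustration:  the function name group_left and the use of nested tuples to
stand for the parentheses are assumptions made here, not part of Lojban.

```python
from functools import reduce


def group_left(words):
    """Pair tanru components from the left, as in the default grouping.

    Nested tuples stand in for the parentheses used in the text.
    """
    if not words:
        raise ValueError("a tanru needs at least one word")
    # Each step wraps everything grouped so far together with the next word.
    return reduce(lambda left, right: (left, right), words)


print(group_left(["broda", "brode", "brodi", "brodo", "brodu"]))
```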

To change this unambiguous grouping, specific cmavo are used that allow unique
expression of the possible groupings.  The cmavo 'bo' causes two adjacent
gismu to group together before any other groupings.  Thus we get

	((broda brode) (brodi bo brodo)) brodu.

If there are two bo's in a tanru, the leftmost takes precedence, but this is
unlikely to occur in normal usage.  The cmavo 'ke' can be used to change the
grouping.  ke causes everything to the left of it to modify everything to the
right; an example is:

	broda ke (((brode brodi) brodo) brodu).

To terminate right-grouping before the end, close off with ke'e, the right
grouping terminator cmavo:

	(broda ke ((brode brodi) brodo) ke'e) brodu.
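
The effect of bo, ke, and ke'e on grouping can be sketched as a small parser.
This is a minimal sketch under stated assumptions, not the official Lojban
grammar:  the names parse_tanru, _parse_unit, and _parse_sequence are invented
here, nested tuples stand for the parentheses, a missing ke'e is taken to
close at the end of the tanru, and a chain of several bo is resolved leftmost
first, which is one reading of the remark above about two bo's.

```python
def _parse_unit(tokens, pos):
    """Parse one word or one ke ... ke'e group starting at tokens[pos]."""
    if pos >= len(tokens):
        raise ValueError("tanru ends where a word was expected")
    token = tokens[pos]
    if token == "ke":
        tree, pos = _parse_sequence(tokens, pos + 1)
        # ke'e is optional; without it the group runs to the end.
        if pos < len(tokens) and tokens[pos] == "ke'e":
            pos += 1
        return tree, pos
    if token in ("bo", "ke'e"):
        raise ValueError(f"unexpected {token!r}")
    return token, pos + 1


def _parse_sequence(tokens, pos):
    """Parse units up to ke'e or the end, then pair them from the left."""
    units = []
    while pos < len(tokens) and tokens[pos] != "ke'e":
        unit, pos = _parse_unit(tokens, pos)
        # bo binds its two neighbours before any other grouping happens.
        # Assumption: several bo in a row are resolved leftmost first.
        while pos < len(tokens) and tokens[pos] == "bo":
            right, pos = _parse_unit(tokens, pos + 1)
            unit = (unit, right)
        units.append(unit)
    if not units:
        raise ValueError("empty tanru or empty ke group")
    tree = units[0]
    for unit in units[1:]:
        tree = (tree, unit)
    return tree, pos


def parse_tanru(text):
    """Return the grouping of a tanru as nested tuples."""
    tokens = text.split()
    tree, pos = _parse_sequence(tokens, 0)
    if pos != len(tokens):
        raise ValueError("ke'e without a matching ke")
    return tree


print(parse_tanru("broda brode brodi bo brodo brodu"))
print(parse_tanru("broda ke brode brodi brodo ke'e brodu"))
```

The cmavo themselves are dropped from the result, since their whole effect is
captured by the nesting.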

The logical connectives and negation can be used to modify part or all of a
tanru.  The tanru logical connective cmavo are ja, je, jo, ju, and each of
these can be negated using na (negates the term before the connective) and nai
(negates the term after the connective).  In addition, the mixed connectives
of selma'o JOI and the abstraction cmavo of selma'o NU can modify the
components of tanru, and numbers can be incorporated into tanru with the aid
of selma'o MOI.

tanru-making is one of the most important skills in speaking Lojban, because
tanru are the primary source of semantic ambiguity in the language.  The
essence of tanru interpretation depends on imaginatively thinking of possible
meanings (using some simple conventions to limit the possibilities), and then
determining why the speaker used this particular tanru as opposed to some
other, thus weeding out the interpretations that are not intended by the
speaker.

To make a new tanru, the reverse process is used.  Think of a few
possibilities, then try to analyze how a listener might misinterpret each
possibility.  Thus, making tanru gets to the true essence of human
communication:  putting one's self in the mind of the other person, and
figuring out what that mind is thinking.  When thinking Lojbanically - seeing
the world through tanru - one is to some extent practicing a form of
mind-reading.

The following are two of the most basic of the tanru conventions: 

- In a tanru, where a gismu such as 'bajra' (which means 'x1 runs from x2 to
x3, etc.') occurs, the keyword 'run' (which is an English verb) should not be
used in interpreting the tanru.  The sumti 'le bajra' refers to 'the runner
from...', and not 'the run from...'.  This is because x1 in the gismu
definition would be replaced by the one who is running.  Similarly, in tanru,
use the x1 place in building meanings.  A tordu bajra is a 'short-runner', not
a 'short-run'.  To get the 'verb' as in the latter, use nu bajra ('the event
of x1 running ...') in the tanru, giving tordu nu bajra.  'Adjectives' can be
communicated by using the quality abstractor cmavo 'ka'.

- tanru have the place structure of their final gismu, and not a combination
thereof.  Thus a pikta lebna ('ticket-taker') is one who, in some way related
to tickets, takes something from someone, since this is the place structure of
lebna.  There are also ways to get the places of pikta involved, but these are
much more complicated.
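
The second convention can be sketched as a lookup on the final gismu.  This is
a toy illustration only:  the table PLACES and the function tanru_places are
invented here, the labels for lebna merely paraphrase the text above, and the
single label for pikta is a placeholder, since this essay does not give its
place structure.

```python
# Hypothetical place labels; only lebna's are paraphrased from the text.
PLACES = {
    "lebna": ["taker", "thing taken", "one it is taken from"],
    "pikta": ["ticket"],  # placeholder: further places are not given here
}


def tanru_places(tanru):
    """Return the places of a tanru, which are those of its final gismu."""
    final_gismu = tanru.split()[-1]
    return PLACES[final_gismu]


# The places of pikta play no part in the result.
print(tanru_places("pikta lebna"))
```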

Given a tanru which expresses an idea to be used frequently, it can be turned
into a lujvo by following the lujvo-making algorithm.

In building a lujvo, the first step is to replace each gismu with a rafsi that
uniquely represents that gismu.  (Some cmavo found in tanru are also assigned
rafsi.  If a cmavo embedded in a tanru does not have a rafsi, you may have to
paraphrase the tanru in some way.)  These rafsi are then attached together by
fixed rules that allow the resulting compound to be recognized as a single
word and to be broken down in only one way.  Some conventions that have been
adopted for this are listed at the end of this essay, before the rafsi list.



There are four other complications:

- Rules for Lojban word forms

- The lujvo must be formed according to Lojban's word-formation rules.  The
constraints of Lojban word forms forbid any lujvo from ending in a consonant,
so that words most commonly found in the final position of a tanru have been
prioritized to have a rafsi that ends in a vowel.  However, words found in
initial positions often form better sounding combinations if their rafsi end
in a consonant.  (Also, because we usually recognize words by the consonants
in them rather than the vowels, the rafsi of form CVV and CV'V are harder to
memorize.  Note:  C and V in abbreviations of this sort stand for any Lojban
consonant and vowel, respectively.  The apostrophe is the Lojban "'", which is
considered neither a consonant nor a vowel.)

Certain sounds are forbidden to occur next to each other (so-called
'impermissible medial' consonants), and must be separated by a 'hyphen'-sound,
the "uh" of "sofa", represented in Lojban by the letter 'y'.  (This letter is
found only as a hyphen, in lerfu, the words for letters of the alphabet, and
alone to represent the hesitation noise.  It is thus not normally considered a
'V' in the C/V convention scheme.  Indeed, "CyC" is considered a consonant
cluster in Lojban morphology, albeit a hyphenated one.)  In addition, a CVV or
CV'V rafsi at the beginning of any lujvo must either carry the penultimate
stress or be 'glued' to the remaining rafsi with a syllabic 'r' or 'n' sound;
otherwise the rafsi falls off into a separate word, a cmavo.  (Likewise, a
CVV or CV'V rafsi followed by another CVV or CV'V rafsi in a 2-term lujvo must
have the 'r' or 'n' added; otherwise the consonant cluster mandatory in any
brivla is not present, and the rafsi break up into two separate cmavo.)
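
The hyphenation rules above can be sketched as follows.  This is a partial
sketch under stated assumptions, not the full lujvo-making algorithm.  The
test for impermissible pairs (doubled consonants, mixed voicing except with l,
m, n, r, two consonants from c, j, s, z, and the pairs cx, kx, xc, xk, mz) is
taken from later published descriptions of Lojban morphology and is not stated
in this essay.  The choice of 'r' unless the next letter is 'r' is likewise
an assumption, as are all the names used.  Longer rafsi forms and any further
validity checks are left out.

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfgjklmnprstvxz")
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")
SIBILANTS = set("cjsz")
FORBIDDEN = {"cx", "kx", "xc", "xk", "mz"}


def shape(rafsi):
    """Return the C/V pattern of a rafsi, e.g. 'CVC', 'CCV' or "CV'V"."""
    return "".join(
        "C" if ch in CONSONANTS else "V" if ch in VOWELS else ch
        for ch in rafsi
    )


def impermissible(c1, c2):
    """Assumed test for an impermissible medial consonant pair."""
    mixed_voicing = (c1 in VOICED and c2 in UNVOICED) or (
        c1 in UNVOICED and c2 in VOICED
    )
    return (
        c1 == c2
        or mixed_voicing
        or (c1 in SIBILANTS and c2 in SIBILANTS)
        or c1 + c2 in FORBIDDEN
    )


def join_rafsi(rafsi):
    """Join rafsi into one word, adding 'y' and 'r'/'n' hyphens as needed."""
    if len(rafsi) < 2:
        raise ValueError("a lujvo needs at least two rafsi")
    if rafsi[-1][-1] not in VOWELS:
        raise ValueError("a lujvo may not end in a consonant")
    word = rafsi[0]
    for index, nxt in enumerate(rafsi[1:], start=1):
        last, first = word[-1], nxt[0]
        initial_cvv = index == 1 and shape(rafsi[0]) in ("CVV", "CV'V")
        # In a 2-term lujvo ending in a CCV rafsi, the initial rafsi
        # carries the penultimate stress and needs no glue.
        stressed = len(rafsi) == 2 and shape(nxt) == "CCV"
        if initial_cvv and not stressed:
            word += "n" if first == "r" else "r"
        elif (last in CONSONANTS and first in CONSONANTS
              and impermissible(last, first)):
            word += "y"
        word += nxt
    return word


# The rafsi strings below are illustrative inputs.
print(join_rafsi(["bri", "vla"]))
print(join_rafsi(["fu'i", "vla"]))
print(join_rafsi(["soi", "sai"]))
```

Only the shapes of the rafsi matter to this sketch; it does not check that a
string really is a rafsi of any gismu.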

- Multiple rafsi to choose from

- Because of these rules, there is usually more than one rafsi usable for each
gismu.  The one to be used is simply whichever sounds best to the
speaker/writer.  There are many valid combinations of the possible rafsi.  Any
rafsi for a given word is equally valid in place of another, AND ALL MEAN THE
SAME THING.  There is an optional scoring component to the lujvo-making
algorithm which attempts to systematically pick the 'best' one; this algorithm
tries for short forms and tends to push more vowels into the words to make
them easier to say.  Japanese, Chinese, and Polynesian speakers will prefer
this; Russian speakers have a different aesthetic, since they are used to
saying consonant clusters.  But these are not necessarily the criteria you
will wish to use.
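
The idea of picking a 'best' form can be sketched with a toy ranking.  This is
not the scoring algorithm referred to above, whose details are not given in
this section; it is a stand-in that prefers shorter forms and, among forms of
equal length, those with more vowels.  The names toy_rank and pick_form are
invented here, counting the apostrophe toward length is an assumption, and the
candidate forms are illustrative inputs that all stand for the same tanru.

```python
VOWELS = set("aeiou")


def toy_rank(word):
    """Rank a candidate form: shorter first, then more vowels first."""
    vowel_count = sum(ch in VOWELS for ch in word)
    return (len(word), -vowel_count)


def pick_form(candidates):
    """Return the candidate with the best (lowest) toy rank."""
    return min(candidates, key=toy_rank)


# Illustrative candidates; every one of them would mean the same thing.
forms = ["gerzdani", "ge'urzdani", "gerzda", "ge'uzda"]
print(pick_form(forms))
```

A speaker with a different aesthetic could swap in another ranking function
without changing the meaning of the word chosen.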

- lujvo have ONE meaning

- While a tanru is ambiguous, having several possible meanings, a lujvo (one
that would be put into the dictionary) has ONE MEANING.  Just like gismu, a
lujvo is a predicate which encompasses one area of the semantic universe, with
one set of places.  Hopefully this is the most 'useful' or 'logical' of the
possible semantic spaces.  A known source of linguistic drift in Lojban will
arise as Lojbanic society evolves and the concept that is most 'useful' or
'logical' for a given sequence of rafsi changes.  At that time, it might be
decided that we want to redefine the lujvo to assume the new meaning.  lujvo
must not be allowed to retain two meanings.  So those who maintain the
dictionary will be ever watchful of tanru and lujvo usage to ensure this
standard is kept.

You should try to be aware of the possibility of prior meanings of a new
lujvo, especially if you are writing for 'posterity'.  If a lujvo is invented
which involves the same tanru as one that is in the dictionary, and is
assigned a different meaning (including a different place structure),
linguistic drift results.  This isn't necessarily bad; it happens in every
natural language.  You communicate quite well in English even though you don't
know most words in the dictionary, and in spite of the fact that you use some
words in ways not found in the dictionary.  Whenever you use a meaning
different from the dictionary definition, you risk a reader/listener using the
dictionary and therefore misunderstanding you.  One major reason for having a
standard lujvo scoring algorithm is that with several possible rafsi choices
to consider, a dictionary is most efficient when it puts the definition under
the single most preferred form.

You may optionally mark a nonce word that you create without checking a
dictionary by preceding it with "za'e".  "za'e" simply tells the listener that
the word is a nonce word, and may not agree with a dictionary entry for that
sequence of rafsi.  The essential nature of human communication is that if the
listener understands, then all is well.  Let this be the ultimate guideline
for choosing meanings and place structures for invented lujvo. 

- Zipf's law and lujvo

- This complication is simple, but is the scariest.  Zipf's Law (actually a
hypothesis) says that the length of words is inversely proportional to their
frequency of usage.  The shortest words are those which are used most; the
longest ones are used least.  The corollary for Lojban is that commonly used
concepts will tend to be abbreviated.  Speakers will choose the shortest form
for frequently expressed ideas that gets their meaning across, even at the
cost of accuracy in meaning.  In English, we have abbreviations and acronyms
and jargon, all of which are short forms for complex ideas used with high
frequency by a group of people, who shortened them to convey the often-used
information more rapidly.

The jargon-forming interpretation of Zipf's Law may be a cause of multiple
meanings of words in the natural languages, especially of short words.  If
true, it threatens the Lojban rule that all lujvo must have one meaning.  The
Lojbanist, thus resigned, accepts a complication in lujvo-making:  A perfectly
good and clear tanru may have to be abbreviated, if the concept it represents
is likely to be used so often as to cause Zipf's Law to take effect.

Thus, given a tanru with grouping markers, abstraction markers, and other
cmavo in it to make the tanru syntactically unambiguous, in many cases one
drops some of the cmavo to make a shorter (incorrect) tanru, and then uses
that one to make the lujvo.
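
The shortening step can be sketched as a simple filter.  This is an
illustration only:  the function name shorten_tanru, the set DROPPABLE, and
the keep parameter are invented here, and the set lists just the grouping and
abstraction cmavo mentioned in this essay.

```python
# Grouping and abstraction cmavo named in this essay; not a complete list.
DROPPABLE = {"bo", "ke", "ke'e", "nu", "ka"}


def shorten_tanru(tanru, keep=()):
    """Drop grouping and abstraction cmavo, except those listed in keep."""
    words = tanru.split()
    kept = [w for w in words if w not in DROPPABLE or w in keep]
    return " ".join(kept)


print(shorten_tanru("tordu nu bajra"))
print(shorten_tanru("tordu nu bajra", keep={"nu"}))
```

The keep parameter allows for the case where the wordmaker retains a cmavo to
set the meaning apart from a competing shorter form.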

This doesn't lead to ambiguity, as it might seem.  A given lujvo still has
exactly one meaning and place structure.  But now, more than one tanru is
competing for the same lujvo.  This is not as difficult to accept or allow for
as it might seem:  more than one meaning for a single tanru was already
competing for the 'right' to be used for the lujvo.  Someone has to use
judgement in deciding which one meaning is to be chosen over the others.  This
judgement will be made on the basis of usage, presumably by some fairly
logical criteria.

If the lujvo made by a shorter form of tanru is already in use, or is likely
to be useful for another meaning, the wordmaker then retains one or more of
the cmavo, preferably ones that clearly set this meaning apart from the
shorter form meaning that is used or anticipated.  In Lojban, therefore,
shorter lujvo will be used for a less complicated concept, possibly even over
a more frequent word.  If two concepts compete for a single rafsi sequence,
the simpler concept will take a shorter form, and the more complex concept
will have some indication of its more complex nature added into the word
structure.  It is easier to add a cmavo to clarify the meaning of a more
complex term than it is to find a good alternate tanru for the simpler term.

A good lujvo-composer considers the listener, and a good lujvo interpreter
remembers the difficulties of lujvo-making.  If someone hears a word he
doesn't know, decomposes it, and gets a tanru that makes no sense for the
context, he knows that the grouping operators may have been dropped out, so he
may try alternate groupings.  Or he may try using the verb form of the concept
instead of the first sumti, inserting an abstraction operator if it seems
plausible.  Plausibility is key to learning new ideas, and evaluating
unfamiliar lujvo.
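
Trying alternate groupings can be sketched by listing every way of pairing up
the components of a decomposed lujvo.  This is a minimal sketch:  the function
name all_groupings and the use of nested tuples for the parentheses are
assumptions made here, and judging which grouping is plausible in context is
left to the listener.

```python
def all_groupings(words):
    """Return every pairwise grouping of a word list as nested tuples."""
    if len(words) == 1:
        return [words[0]]
    results = []
    # Split the list at every position and combine the groupings of
    # the left part with the groupings of the right part.
    for split in range(1, len(words)):
        for left in all_groupings(words[:split]):
            for right in all_groupings(words[split:]):
                results.append((left, right))
    return results


for grouping in all_groupings(["broda", "brode", "brodi"]):
    print(grouping)
```

The number of groupings grows quickly with the number of components, which is
one reason plausibility in context matters so much.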