Polysemy and length

1. Problem and history

The relation between polysemy and word length is one of the oldest problems in quantitative linguistics, introduced by Zipf (see esp. 1949). It is also part of Köhler's self-regulation cycle (1986). The question is the direction of dependence. Since word lengthening by affixation, compounding, reduplication etc. results from semantic needs, namely the specification of meaning, polysemy is taken as the independent variable, though in a complete self-regulation cycle both directions can be tested. Hypotheses were set up and tested by Altmann, Beöthy and Best (1982) for German, Hungarian and Slovak, by Sambor (1984) for Polish, German, Slovak and Hungarian, by Fickermann, Markner-Jäger and Rothe (1984) for German, Swedish and Indonesian, and by Köhler (1999a) for Maori (New Zealand). Köhler considered both directions (cf. also Length, frequency, polysemy).

2. Hypothesis

Length is a function of polysemy.

3. Derivation

The relative rate of change of length (L) is proportional to the relative rate of change of polysemy (P), amplified by an additive constant. This is expressed as

(1) \frac{dL}{L}= ( a+\frac{b}{P} )dP

resulting in

(2) L = CP^b e^{aP}\quad.
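
The step from (1) to (2) can be spelled out by integrating both sides. The following is a sketch of that standard integration; the integration constant c is a symbol introduced here for illustration and does not appear in the original text, while C in (2) is taken to be e^c.

```latex
\begin{align*}
\int \frac{dL}{L} &= \int \left( a + \frac{b}{P} \right) dP \\
\ln L &= aP + b \ln P + c \\
L &= e^{c} P^{b} e^{aP} = C P^{b} e^{aP}, \qquad C = e^{c}
\end{align*}
```

Setting a = 0 in the last line leaves the pure power function L = CP^b.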

The analogous relation holds if one exchanges the variables in (1), i.e. if one considers P = f(L). The result is evidently a special case of the Unified theory (see Example 1 in § 4.1).

Example. Length and polysemy in Maori

Köhler (1999a) showed on Maori data that the direction P = f(L) is slightly more significant than the other way round and that in most cases a = 0 is sufficient, i.e. mathematically L = CP^b would be more correct. The result of fitting is shown in Table 1 and Fig. 1.

Table 1 (image: Tabelle1 PaL.jpg)
Fig. 1 (image: Grafik1 PaL.jpg)
Fig. 1. Length of lexemes as a function of polysemy in Maori (Köhler 1999)
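
A fit of the reduced form L = CP^b can be sketched in a few lines. This is only an illustration under stated assumptions: it uses ordinary least squares on the log-transformed relation ln L = ln C + b ln P, which is not necessarily the fitting procedure used by Köhler, and the function names `fit_power_law` and `predict_length` as well as the demo values are hypothetical. The demo data are synthetic and are not the Maori data.

```python
import math


def fit_power_law(polysemy, length):
    """Least-squares fit of ln L = ln C + b ln P; returns (C, b)."""
    if len(polysemy) != len(length) or len(polysemy) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    x = [math.log(p) for p in polysemy]
    y = [math.log(v) for v in length]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope and intercept of the regression line in log-log space.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    C = math.exp(mean_y - b * mean_x)
    return C, b


def predict_length(p, C, b, a=0.0):
    """Evaluate L = C * P**b * exp(a*P); a = 0 gives the pure power function."""
    return C * p ** b * math.exp(a * p)


# Synthetic values for illustration only (assumed parameters, not the Maori data).
demo_polysemy = [1, 2, 3, 4, 5, 6, 7, 8]
demo_length = [predict_length(p, C=8.0, b=-0.3) for p in demo_polysemy]
C_hat, b_hat = fit_power_law(demo_polysemy, demo_length)
```

Because the synthetic lengths are generated without noise from the same model, the fit should return the assumed parameters up to rounding error. With real data, a fit in log space weights the observations differently from a direct nonlinear fit, so the estimates may differ somewhat.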


4. Authors: U. Strauss, G. Altmann

5. References

Altmann, G., Bagheri, D., Goebl, H., Köhler, R., Prün, C. (2002). Einführung in die quantitative Lexikologie. Göttingen: Peust & Gutschmidt.

Altmann, G., Beöthy, E., Best, K.-H. (1982). Die Bedeutungskomplexität der Wörter und das Menzerathsche Gesetz. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 35, 537-543.

Altmann, G., Schwibbe, M. (1989). Das Menzerathsche Gesetz in informationsverarbeitenden Systemen. Hildesheim: Olms.

Breiter, M.A. (1994). Length of Chinese words in relation to their other systemic features. J. of Quantitative Linguistics 1(3), 224-231.

Drebet, V.V., Levickij, V.V., Cherubim, D. (1996). Morphologische Faktoren bei der Polysemie der deutschen Adjektive. Naukovy Visnyk Cerniveckoho Universitetu 1, 29-32.

Elts, J. (1995). Word length and its semantic complexity. In: Mikk, J. (ed.), Family and Textbooks: 115-126. Tartu: University of Tartu.

Fickermann, I., Markner-Jäger, B., Rothe, U. (1984). Wortlänge und Bedeutungskomplexität. Glottometrika 6, 115-126.

Gieseking, K. (2002). Untersuchungen zur Synergetik der englischen Lexik. In: Köhler, R. (ed.), Korpuslinguistische Untersuchungen zur quantitativen und systemtheoretischen Linguistik: 387-433. http://ubt.opus.hbz-nrw.de/volltexte/2004/279/

Guiter, H. (1977). Les relations /fréquence-longueur-sens/ des mots (langues romanes et anglais). Actes du 14ème congrès international de linguistique et philologie romanes. Napoli 15-20 Aprile 1974, Vol. 4: 373-381. Napoli: Macchiaroli.

Hammerl, R. (1991). Untersuchungen zur Struktur der Lexik: Aufbau eines lexikalischen Basismodells. Trier: WVT.

Hammerl, R., Sambor, J. (1993). O statystycznych prawach jezykowych. Warszawa: Polskie Towarzystwo Semiotyczne.

Hoffmann, Ch. (2001). Polylexie lexikalischer Einheiten in Texten. In: Uhlířová, L., Wimmer, G., Altmann, G., Köhler, R. (Eds.), Text as a linguistic paradigm: levels, constituents, constructs. Festschrift in honour of Luděk Hřebíček: 76-97. Trier: WVT.

Köhler, R. (1986). Zur linguistischen Synergetik: Struktur und Dynamik der Lexik. Bochum: Brockmeyer.

Köhler, R. (1999a). Der Zusammenhang zwischen Lexemlänge und Polysemie im Maori. In: Ondrejovič, S., Genzor, J. (eds.), Pange lingua. Zborník na počesť Viktora Krupu: 27-33. Bratislava: Veda.

Krylov, Ju.K. (1982). Eine Untersuchung statistischer Gesetzmäßigkeiten auf der paradigmatischen Ebene der Lexik natürlicher Sprachen. In: Guiter, H., Arapov, M.V. (Eds.), Studies on Zipf's law: 234-262. Bochum: Brockmeyer.

Mikk, J., Uibo, H., Elts, J. (2001). Word length as an indicator of semantic complexity. In: Uhlířová, L., Wimmer, G., Altmann, G., Köhler, R. (Eds.), Text as a linguistic paradigm: levels, constituents, constructs. Festschrift in honour of Luděk Hřebíček: 187-195. Trier: WVT.

Rothe, U. (1983). Wortlänge und Bedeutungsmenge. Eine Untersuchung zum Menzerathschen Gesetz an drei romanischen Sprachen. Glottometrika 5, 101-112.

Sambor, J. (1984). Menzerath's law and the polysemy of words. Glottometrika 6, 94-114.

Zipf, G.K. (1935). The psycho-biology of language. Boston: Houghton Mifflin.

Zipf, G.K. (1945). The meaning-frequency relationship of words. J. of General Psychology 33, 251-256.

Zipf, G.K. (1949). Human behavior and the principle of least effort. Cambridge: Addison-Wesley.