Difference between revisions of "Diversification"

m
 
(19 intermediate revisions by 5 users not shown)
Line 1: Line 1:
 
'''1. Problem and history'''
 
'''1. Problem and history'''
  
Diversification is a process of enlarging the number of forms or meanings of any linguistic entity. It can be ''paradigmatic'', e.g. the rise of cases, numbers, tenses, etc., ''syntactic'', e.g. the rise of allophones, allomorphs etc., ''geographical'', e.g. the increase of different expressions of a concept, ''social'', e.g. the rise different words or meanings of a word or different pronunciations, ''idiolectal'' within a community, ''semantic'', e.g. the increase of synonymy and polysemy, ''contextual'', e.g. the increase of using a unit in different contexts. It comprises a number of phenomena dispersed in this volume.
+
Diversification is a process of enlarging the number of forms or meanings of any linguistic entity. It can be ''paradigmatic'', e.g. the rise of cases, numbers, tenses, etc., ''phono-morphemic'', e.g. the rise of allophones, allomorphs etc., ''geographical'', e.g. the increase in the number of different expressions of a concept, ''social'', e.g. the rise of different words or meanings of a word or different pronunciations, ''idiolectal'' within a community, ''semantic'', e.g. the increase in synonymy and polysemy, ''contextual'', e.g. the increase in the usage of a unit in different contexts. Diversification comprises a number of phenomena dispersed in this volume.
  
 
For the sake of illustration let us show some concrete examples:
 
For the sake of illustration let us show some concrete examples:
  
(1) The word can enlarge its class membership without any change, e.g. through conversion: “the hand”, “to hand”.
+
1) The word can enlarge its class membership without any change, e.g. through conversion: “the hand”, “to hand”.
  
(2) The stem enlarges its class membership through derivation, e.g. German "Bild",       "bilden", "bildhaft", or vocalization in Semitic languages, etc.
+
2) The stem enlarges its class membership through derivation, e.g. German "Bild", "bilden", "bildhaft", or vocalization in Semitic languages, etc.
  
(3) The stem can enlarge its applicability within one class through derivation e.g. German "Blut", "Blutung", "Bluter", or through vocalization, etc.
+
3) The stem can enlarge its applicability within one class through derivation e.g. German "Blut", "Blutung", "Bluter", or through vocalization, etc.
  
(4) The stem can enlarge its applicability within one class through compounding e.g. "Blut", "Blutdruck", "Blutdurst", etc.
+
4) The stem can enlarge its applicability within one class through compounding e.g. "Blut", "Blutdruck", "Blutdurst", etc.
  
(5) If a language abandons the isolating morphology, then morphemes diversify into several morphs because of agglutination or inflection (sequential or syntactic dependence).
+
5) If a language abandons isolating morphology, then morphemes diversify into several morphs because of agglutination or inflection (sequential or syntactic dependence).
  
(6) The word can enlarge its applicability in the sentence by acquiring several functions, i.e. it enlarges its dispositional properties, which are different from the constant grammatical properties, e.g. practically every word can become the subject of a sentence.
+
6) The word can enlarge its applicability in the sentence by acquiring several functions, i.e. it enlarges its dispositional properties, which are different from the constant grammatical properties, e.g. practically every word can become the subject of a sentence.
  
(7) Verbs can enlarge their valence, i.e. their combinability with different cases.
+
7) Verbs can enlarge their valence, i.e. their combinability with different cases.
  
(8) The word can enlarge its cotextuality (cf. Köhler 1986), i.e. its ability to occur in several contexts (where "context" can be defined in several ways). The reverse of this kind of diversification process is a part of style formation, where a "position" diversifies, i.e., a position in a given context can be filled with different units (words, sentences, etc.).
+
8) The word can enlarge its cotextuality (cf. Köhler 1986), i.e. its ability to occur in several contexts (where "context" can be defined in several ways). The reverse of this kind of diversification process is part of style formation, where a "position" diversifies, i.e., a position in a given context can be filled with different units (words, sentences, etc.).
 
   
 
   
(9) A concept can be expressed by different forms, giving rise to dialects, sociolects, idiolects, or to synonymy.
+
9) A concept can be expressed by different forms, giving rise to dialects, sociolects, idiolects, or to synonymy.
  
(10) A word can acquire different meaning (polysemy).
+
10) A word can acquire different meanings (polysemy).
  
(11) Every word can acquire different associations (connotations).  
+
11) Every word can acquire different associations (connotations).  
  
 
Diversified entities abide by a ranking law, i.e. if the members of the diversified entity are ordered according to their frequency, then the frequencies are “lawfully” connected.
 
Diversified entities abide by a ranking law, i.e. if the members of the diversified entity are ordered according to their frequency, then the frequencies are “lawfully” connected.
 +
 
The factors of diversification can be as follows (Altmann 2005):  
 
The factors of diversification can be as follows (Altmann 2005):  
  
(a) ''Random fluctuation'' which is omnipresent in any language phenomena.
+
a) ''Random fluctuation'', which is omnipresent in all language phenomena.
 
   
 
   
(b) ''Environmentally conditioned variation'' forcing an element to acquire different forms or meaning nuances in different environments.
+
b) ''Environmentally conditioned variation'' forcing an element to acquire different forms or meaning nuances in different environments.
 
   
 
   
(c) ''Conscious change'' through conscious creation, borrowing, emotionality etc.
+
c) ''Conscious change'' through conscious creation, borrowing, emotionality etc.
 
   
 
   
(d) ''Self-organisatory triggering'' of a process to a limit, causing changes in other levels, too.
+
d) ''Self-organisatory triggering'' of a process to a limit, causing changes on other levels, too.
 
   
 
   
(e) ''System modification'' joined with local or global modifications in a subsystem,
+
e) ''System modification'' joined with local or global modifications in a subsystem.
 
   
 
   
(f) ''Köhlerian requirements'' (Köhler 1986, 1987, 1989, 1990, 1991) forcing to take into account collateral pressures form different sides. They are as follows: (i) ''The trend for minimal coding and deciding effort'', (ii) ''sufficient redundancy'', (iii) ''the coding requirement in general'', (iv) ''context economy vs. context specificity'', and (v) ''invariance vs. flexibility of relation between expression and meaning.''
+
f) ''Köhlerian requirements'' (Köhler 1986, 1987, 1989, 1990, 1991), which force the system to take into account collateral pressures from different sides. They are as follows: (i) ''the trend for minimal coding and decoding effort'', (ii) ''sufficient redundancy'', (iii) ''the coding requirement in general'', (iv) ''context economy vs. context specificity'', and (v) ''invariance vs. flexibility of relation between expression and meaning.''
The concepts of diversification and unification go back to G.K. Zipf (1935, 1949). Together they are called “Zipfian processes”. The scope of the phenomena is enormous. Semantic phenomena have been examined by Beöthy and Altmann (1984a,b, 1991), Altmann (1985a), Altmann, Best, Kind (1987); grammatical phenomena  were analyzed in the omnibus volume edited by Rothe (1991), where also a study on spelling errors in Japanese can be found, and dialectal diversification was studied by Altmann (1985b)
+
 
 +
The concepts of diversification and unification go back to G.K. Zipf (1935, 1949). Together they are called “Zipfian processes”. The scope of the phenomena is enormous. Semantic phenomena have been examined by Beöthy and Altmann (1984a,b, 1991), Altmann (1985a), Altmann, Best, Kind (1987); grammatical phenomena were analyzed in the omnibus volume edited by Rothe (1991), which also contains a study on spelling errors in English; and dialectal diversification was studied by Altmann (1985b).
 +
 
 
The laws hold usually for ranked nominal classes of limited size.
 
The laws hold usually for ranked nominal classes of limited size.
  
Line 57: Line 60:
 
“Classes” are built by a class-building criterion, e.g. derivatives, compounds, declension classes, word classes (Wortarten), even semantic classes, etc.

“Classes” are built by a class-building criterion, e.g. derivatives, compounds, declension classes, word classes (Wortarten), even semantic classes, etc.
  
'''Corollary''': If the above hypothesis holds, then the frequencies of elements of a linguistic class are not distributed uniformly.
+
'''Corollary''': ''If the above hypothesis holds, then the frequencies of elements of a linguistic class are not distributed uniformly''.
  
 
In a “uniform distribution” all frequencies are equal.  
 
In a “uniform distribution” all frequencies are equal.  
Line 65: Line 68:
 
'''3. Derivation'''
 
'''3. Derivation'''
  
'''3.1.  Altmann´s approach A (1991).'''
+
'''3.1.  Altmann´s approach A (1991).'''
  
Since the entities are ranked and the corollary holds, it is true that for the probabilities of classes it holds that
+
Because the entities are ranked and the corollary holds, the probabilities of the classes satisfy
  
 
<math>P_x\le P_{x-1}</math>
 
<math>P_x\le P_{x-1}</math>
  
Since <math>P_x</math> and  <math>P_{x-1}</math> (x = 2,3,…) are joined in a law-like manner, we can write
+
Since <math>P_x</math> and  <math>P_{x-1}</math> (<math>x</math> = 2,3,…) are joined in a law-like manner, we can write
  
(1) <math>P_x=g(x)P_{x-1}</math>, where <math>g(x)\le 1</math>.
+
(1) <math>P_x=g(x)P_{x-1}\quad</math>, where <math>g(x)\le 1\quad</math>.
  
Furthermore, g(x) can be written as
+
Furthermore, <math>g(x)</math> can be written as
  
 
<math>g(x)=\frac{f(x)}{h(x)}</math>,
 
<math>g(x)=\frac{f(x)}{h(x)}</math>,
  
where f(x) is a function composed of a language constant a and the diversifying effect of the speaker bx, i.e. f(x) = a+bx, while h(x) contains the controlling, regulating effect of the hearer (community) cx, i.e.
+
where <math>f(x)</math> is a function composed of a language constant <math>a</math> and the diversifying effect of the speaker <math>bx</math>, i.e. <math>f(x) = a+bx</math>, while <math>h(x)</math> contains the controlling, regulating effect of the hearer (community) <math>cx</math>, i.e.
  
<math>g(x)=\frac{a+bx}{cx}</math>, <math>\quad a+bx\le cx</math> (a, b, and c are assumed positive),
+
<math>g(x)=\frac{a+bx}{cx} \le 1</math>, (a, b, and c are assumed positive),
  
 
so that
 
so that
  
(2)<math>P_x=\frac{a+bx}{cx}p_{x-1}</math>.
+
(2) <math>P_x=\frac{a+bx}{cx}P_{x-1}</math>.
  
In order to obtain a known distribution, one can reparametrize (2) by writing a/b = k-1 and b/c = q, and solving (2) for Px. One obtains
+
In order to obtain a known distribution, one can reparametrize (2) by writing <math>a/b = k-1</math> and <math>b/c = q</math> (with <math>p = 1-q</math>), and solving (2) for <math>P_x</math>. One obtains
  
(3)<math>P_x=\begin{pmatrix}k&+&x&-&1\\&&x\end{pmatrix}\frac{p^kq^x}{1-p^k}, \quad x=1,2,3,...</math>  
+
(3) <math>P_x=\binom{k+x-1}{x}\frac{p^kq^x}{1-p^k}, \quad x=1,2,3,...</math>
  
 
yielding the zero-truncated (positive) negative binomial distribution. The condition <math>g(x)\le 1</math> is fulfilled if <math>kq\le 1</math>.  
 
yielding the zero-truncated (positive) negative binomial distribution. The condition <math>g(x)\le 1</math> is fulfilled if <math>kq\le 1</math>.  
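
The step from the recurrence (2) to the closed form (3) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the parameter values used in the check are illustrative assumptions. It relies only on the reparametrization given above, under which <math>g(x) = q(k-1+x)/x</math>, and on <math>p = 1-q</math>.

```python
from math import lgamma, log, exp


def ztnb_pmf_closed(x, k, q):
    """Closed form (3): C(k+x-1, x) p^k q^x / (1 - p^k) for x = 1, 2, ..."""
    p = 1.0 - q
    # log of the generalized binomial coefficient Gamma(k+x) / (x! Gamma(k))
    log_binom = lgamma(k + x) - lgamma(x + 1) - lgamma(k)
    return exp(log_binom + k * log(p) + x * log(q)) / (1.0 - p ** k)


def ztnb_pmf_recurrence(n_max, k, q):
    """Build P_1..P_n_max from P_x = q (k - 1 + x) / x * P_{x-1}, i.e. (2)."""
    p = 1.0 - q
    # P_1 is taken from the closed form; it plays the role of the norming constant
    probs = [k * q * p ** k / (1.0 - p ** k)]
    for x in range(2, n_max + 1):
        probs.append(probs[-1] * q * (k - 1 + x) / x)
    return probs


if __name__ == "__main__":
    k, q = 2.0, 0.3  # hypothetical parameter values, chosen only for illustration
    for x, prob in enumerate(ztnb_pmf_recurrence(5, k, q), start=1):
        print(x, round(prob, 6), round(ztnb_pmf_closed(x, k, q), 6))
```

For the illustrative values above the two columns agree, and the probabilities decrease with rank, as required by <math>g(x)\le 1</math>.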
Line 99: Line 102:
 
For the purposes of dialectal variation captured in terms of numbers of lexeme variants on maps of a dialect atlas, Altmann (1985) used the birth-and-death process based on the following assumptions:
 
For the purposes of dialectal variation captured in terms of numbers of lexeme variants on maps of a dialect atlas, Altmann (1985) used the birth-and-death process based on the following assumptions:
  
(a) In a time interval Δt the birth of a new variant is proportional to the length of the interval, i.e. aΔt.
+
(a) In a time interval Δ<math>t</math> the birth of a new variant is proportional to the length of the interval, i.e. <math>a</math>Δ<math>t</math>.
  
(b) The assertion of a variant against x rivals is propotional to the number of rivals and the length of the interval, i.e. bxΔt.
+
(b) The assertion of a variant against <math>x</math> rivals is proportional to the number of rivals and the length of the interval, i.e. <math>bx</math>Δ<math>t</math>.
  
(c) The death of a variant is proportional to the number of variants and the length of the interval, i.e. cxΔt.
+
(c) The death of a variant is proportional to the number of variants and the length of the interval, i.e. <math>cx</math>Δ<math>t</math>.
  
(d) No change (birth, death or assertion) in Δt is given as the complement to the above changes: 1 – [a+(b+c)x]Δt ignoring intervals smaller then Δt.
+
(d) No change (birth, death or assertion) in Δ<math>t</math> is given as the complement to the above changes: 1 – [<math>a+(b+c)x</math>]Δ<math>t</math>, ignoring intervals smaller than Δ<math>t</math>.
 
   
 
   
 
(e) The events are independent and the probability of more than one event in the interval is zero.

(e) The events are independent and the probability of more than one event in the interval is zero.
  
Thus the probability that there are x-1 variants and a new variant arises or asserts itself against x-1 rivals is
+
Thus the probability that there are <math>x-1</math> variants and a new variant arises or asserts itself against <math>x-1</math> rivals is
  
 
<math>a\triangle tP_{x-1}(t) + b(x-1)\triangle tP_{x-1}(t)</math>;
 
<math>a\triangle tP_{x-1}(t) + b(x-1)\triangle tP_{x-1}(t)</math>;
  
the probability that there are x+1 variants and one dies is
+
the probability that there are <math>x+1</math> variants and one dies is
  
 
<math>c(x+1)\triangle tP_{x+1}(t)</math>;
 
<math>c(x+1)\triangle tP_{x+1}(t)</math>;
  
the probability that nothing happens in Δt is
+
the probability that nothing happens in Δ<math>t</math> is
  
<math>{1-(a+(b+c)x]\triangle t}P_x(t)</math>.
+
<math>\{1-[a+(b+c)x]\triangle t\}P_x(t)</math>.
  
Putting these probabilities together we obtain the probability that in the interval (t, t+Δt) there will be exactly x variants as
+
Putting these probabilities together we obtain the probability that in the interval (<math>t</math>, <math>t</math>+Δ<math>t</math>) there will be exactly <math>x</math> variants as
  
 
<math>P_x(t+\triangle t) = [a+b(x-1)]\triangle tP_{x-1}(t) + c(x+1)\triangle tP_{x+1}(t) + \{1-[a+(b+c)x]\triangle t\}P_x(t)</math>.

<math>P_x(t+\triangle t) = [a+b(x-1)]\triangle tP_{x-1}(t) + c(x+1)\triangle tP_{x+1}(t) + \{1-[a+(b+c)x]\triangle t\}P_x(t)</math>.
  
Substracting Px from both sides and dividing them by Δt, we obtain
+
Subtracting <math>P_x(t)</math> from both sides and dividing them by Δ<math>t</math>, we obtain
  
 
<math>\frac{P_x(t+\triangle t)-P_x(t)}{\triangle t}= [a+b(x-1)]P_{x-1}(t)+c(x+1)P_{x+1}(t)-[a+(b+c)x]P_x(t)</math>.

<math>\frac{P_x(t+\triangle t)-P_x(t)}{\triangle t}= [a+b(x-1)]P_{x-1}(t)+c(x+1)P_{x+1}(t)-[a+(b+c)x]P_x(t)</math>.
  
Letting <math>\triangle t\rightarrow  0</math> we finally obtain
+
Letting <math>\triangle t\rightarrow  0</math> we finally get
  
 
<math>\frac{dP_x(t)}{dt}=[a+b(x-1)]P_{x-1}(t)+c(x+1)P_{x+1}(t)-[a+(b+c)x]P_x(t)</math>

<math>\frac{dP_x(t)}{dt}=[a+b(x-1)]P_{x-1}(t)+c(x+1)P_{x+1}(t)-[a+(b+c)x]P_x(t)</math>
Line 138: Line 141:
 
Solving the balancing equations holding for the steady state
 
Solving the balancing equations holding for the steady state
  
<math>-aP_0+cP_1=0,</math>
+
<math>-aP_0+cP_1=0,\quad</math>
  
 
<math>-[a+(b+c)x]P_x+[a+b(x-1)]P_{x-1}+c(x+1)P_{x+1}=0, \quad x\ge 1,</math>
 
<math>-[a+(b+c)x]P_x+[a+b(x-1)]P_{x-1}+c(x+1)P_{x+1}=0, \quad x\ge 1,</math>
Line 145: Line 148:
 
and setting b/c = q and a/b = k results again in the negative binomial distribution
 
and setting b/c = q and a/b = k results again in the negative binomial distribution
  
(4)<math>P_x=\begin{pmatrix}k&+&x&-&1\\&&x\end{pmatrix}p^kq^x, \quad x=0,1,2,...</math>
+
(4) <math>P_x=\binom{k+x-1}{x}p^kq^x, \quad x=0,1,2,...</math>
  
For dialect maps, (4) is to be understood as the probability that the basic lexeme has x variants, i.e. if on a map there is only one unique form, then x = 0.
+
For dialect maps, (4) is to be understood as the probability that the basic lexeme has <math>x</math> variants, i.e. if on a map there is only one unique form, then <math>x</math> = 0.
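
That the steady-state balancing equations lead to (4) can be verified numerically. The sketch below is an illustration under stated assumptions, not the procedure used by Altmann (1985): the function names and parameter values are hypothetical, and the infinite support is truncated at <code>n_max</code>. It uses the first-order relation <math>cxP_x = [a+b(x-1)]P_{x-1}</math>, which follows from the balancing equations by induction starting from <math>cP_1 = aP_0</math>, and it assumes <math>b < c</math> so that the probabilities can be normalized.

```python
from math import lgamma, log, exp


def steady_state(a, b, c, n_max=400):
    """Steady-state probabilities P_0..P_n_max of the birth-and-death process.

    Uses c * x * P_x = [a + b (x - 1)] * P_{x-1}; requires b < c.
    """
    weights = [1.0]  # unnormalized P_0
    for x in range(1, n_max + 1):
        weights.append(weights[-1] * (a + b * (x - 1)) / (c * x))
    total = sum(weights)
    return [w / total for w in weights]


def neg_binomial_pmf(x, k, q):
    """Negative binomial (4): C(k+x-1, x) p^k q^x for x = 0, 1, 2, ..."""
    p = 1.0 - q
    log_binom = lgamma(k + x) - lgamma(x + 1) - lgamma(k)
    return exp(log_binom + k * log(p) + x * log(q))


if __name__ == "__main__":
    a, b, c = 1.2, 0.4, 1.0  # hypothetical rates, chosen only for illustration
    probs = steady_state(a, b, c)
    for x in range(5):
        print(x, round(probs[x], 6), round(neg_binomial_pmf(x, a / b, b / c), 6))
```

With <math>k = a/b</math> and <math>q = b/c</math> the two columns coincide up to rounding, and the computed probabilities also satisfy the second-order balancing equation for <math>x\ge 1</math>.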
  
  
 
'''Example:  Goebl´s law (dialectal diversification)'''
 
'''Example:  Goebl´s law (dialectal diversification)'''
  
Goebl (1984) studied the dialect maps of North West France and Italy and brought the distribution of the numbers of variants in the atlases. Since dialectal variants of a concept arise by a birth-and-death process, the number of maps containing x variants must follow the negative binomial distribution. One of these distributions is shown in Table 1 (Fig. 1).
+
Goebl (1984) studied the dialect maps of North West France and Italy and presented the distribution of the numbers of variants in the atlases. Since dialectal variants of a concept arise by a birth-and-death process, the number of maps containing <math>x</math> variants must follow the negative binomial distribution. One of these distributions is shown in Table 1 (Fig. 1).
  
<div align="center">
 
Table 1
 
Fitting the negative binomial distribution (4) to Goebl´s data</div>
 
  
 +
<div align="center">[[Image:Tabelle111_Div.jpg]]</div>
 +
 +
<div align="center">[[Image:DivFig1.JPG]]</div>
 +
<div align="center">Fig. 1. Fitting the negative binomial distribution to Goebl´s data</div>
  
Fig. 1. Fitting the negative binomial distribution to Goebl´s data
 
  
'''[[[Variable x must start from 0]]] Draw it!!! I can´t.'''
 
  
  
 
'''Example: Beöthy´s law (semantic diversification)'''
 
'''Example: Beöthy´s law (semantic diversification)'''
  
According to this law ''the ranked frequencies of the elements of a semantic class are distributed according'' to (3) or (5) (see below). Rothe (1991c) brings a survey of semantic classes abiding by these laws. Testing has been perfomed for meanings of different Hungarian verbal prefixes (Beöthy, Altmann 1984a,b, 1991), Slovak verbal prefixes (Nemcová 1991), the Japanese postposition ni (Roos 1991), German compounds (Raether, Rothe 1991), the German particle ''von'' (Best 1991), the German preposition ''auf'' (Fuchs 1991), the English preposition ''in'' (Hennern 1991), the Polish preposition ''w'' (Hammerl, Sambor 1991), Russian conjunctions ''a'' and ''no'' (Kuße 1991), the French conjunction ''et'' (Rothe 1986), the German genitive (Rothe 1991b), word class distribution in Latin, German and Chinese (Schweers, Zhu 1991), in German (Best 1994, 1997b, 2000b, 2001e; Hammerl 1989; Judt 1995), in Arabic (Altmann 1991a), in Portuguese (Ziegler 1998, 2001), in French (Judt 1995), spelling errors by Japanese English-users (Rothe 1991), word building patterns in Early High German (Best 1990).  
+
According to this law ''the ranked frequencies of the elements of a semantic class are distributed according to'' (3) or (5) (see below). Rothe (1991c) gives a survey of semantic classes abiding by these laws. Testing has been performed for meanings of different Hungarian verbal prefixes (Beöthy, Altmann 1984a,b, 1991), Slovak verbal prefixes (Nemcová 1991), the Japanese postposition ''ni'' (Roos 1991), German compounds (Raether, Rothe 1991), the German particle ''von'' (Best 1991), the German preposition ''auf'' (Fuchs 1991), the English preposition ''in'' (Hennern 1991), the Polish preposition ''w'' (Hammerl, Sambor 1991), Russian conjunctions ''a'' and ''no'' (Kuße 1991), the French conjunction ''et'' (Rothe 1986), the German genitive (Rothe 1991b), word class distribution in Latin, German and Chinese (Schweers, Zhu 1991), in German (Best 1994, 1997b, 2000b, 2001e; Hammerl 1989; Judt 1995), in Arabic (Altmann 1991a), in Portuguese (Ziegler 1998, 2001), in French (Judt 1995), spelling errors by Japanese users of English (Rothe 1991c), word building patterns in Early High German (Best 1990).
 
In the example (Table 2, Fig. 2) one finds the ranked distribution of German neologisms of the type “Noun + Noun” categorized in 13 groups from Raether, Rothe (1991).
 
In the example (Table 2, Fig. 2) one finds the ranked distribution of German neologisms of the type “Noun + Noun” categorized in 13 groups from Raether, Rothe (1991).
  
<div align="center">Table 2</div>
 
  
<div align="center">Rank-frequency distribution of German neologisms “Noun+Noun” of different grammatical-semantical categories according to Raether, Rothe (1991)</div>
+
<div align="center">[[Image:Tabelle2_Div.jpg ]]</div>
 
 
 
 
<div align="center">[[Image:Tabelle_2_Div.jpg]]</div>
 
  
 
The result shows that nominal classifications of language entities abide by this type of diversification law.
 
The result shows that nominal classifications of language entities abide by this type of diversification law.
  
 
<div align="center">[[Image:Grafik_2_Div.jpg]]</div>
 
<div align="center">[[Image:Grafik_2_Div.jpg]]</div>
 +
<div align="center">Fig. 2. Fitting the positive negative binomial distribution (3) to Raether-Rothe data</div>
  
Fig. 2. Fitting the positive negative binomial distribution (3) to Raether-Rothe data
 
  
  
Line 186: Line 184:
  
 
Hřebíček used two assumptions:
 
Hřebíček used two assumptions:
 +
 
(i) The logarithm of the ratio of the probabilities <math>P_1</math> and <math>P_x</math> is proportional to the logarithm of the class size, i.e.

(i) The logarithm of the ratio of the probabilities <math>P_1</math> and <math>P_x</math> is proportional to the logarithm of the class size, i.e.
  
<math>\ln(P_1/P_x) \ln </math>
+
<math>\ln(P_1/P_x)\propto\ln x\quad</math>
 
   
 
   
 
(ii) the proportionality function is given by the logarithm of Menzerath´s law (<math>\rightarrow</math> Hierarchy), i.e.
 
(ii) the proportionality function is given by the logarithm of Menzerath´s law (<math>\rightarrow</math> Hierarchy), i.e.
  
<math>\ln(P_1/P_x)=\ln(AX^b)\ln x</math>,
+
<math>\ln(P_1/P_x)=\ln(Ax^b)\ln x\quad</math> (with <math>a=\ln A</math>),
  
 
yielding the solution
 
yielding the solution
  
(5)<math>P_x=P_1x^{-(a+b\ln x)}, \quad x=1,2,3,...</math>.
+
(5) <math>P_x=P_1x^{-(a+b\ln x)}, \quad x=1,2,3,...</math>.
  
If (5) is considered a probability distribution, then P1 is the norming constant, otherwise it is estimated as the size of the first class, x = 1. Since the frequency of the first class x = 1 is decisive for the form of the distribution, one usually ascribes it a special value α, modifying (5) as
+
If (5) is considered a probability distribution, then <math>P_1</math> is the norming constant, otherwise it is estimated as the size of the first class (<math>x</math> = 1). Since the frequency of the first class <math>x</math> = 1 is decisive for the form of the distribution, one usually ascribes it a special value α, modifying (5) as
  
(6)<math>P_x=\begin{cases}a, & x=1\\\frac{(1-a)x^{a+b\ln x}}{T}, & x=2,3,...,(n)\end{cases}</math>  
+
(6) <math>P_x=\begin{cases}\alpha, & x=1\\\frac{(1-\alpha)x^{-(a+b\ln x)}}{T}, & x=2,3,...,(n)\end{cases}</math>
  
where  <math>T=\sum_{j=2}^nj^{-(a+b\ln j)}</math>, 0 < α < 1,  <math>a,b\in\mathfrak{R}</math>  so that <math>P_x</math> converges for <math>n\rightarrow\infty</math>. This version corroborates again the relevance of Menzerath´s law (<math>\rightarrow</math>). Distributions (5) or (6) are called ''Zipf-Alekseev distributions''. If ''n'' is finite, (6) is called ''modified right truncated Zipf-Alekseev distribution'' (see Wimmer, Altmann 1999).
+
where  <math>T=\sum_{j=2}^nj^{-(a+b\ln j)}</math>, 0 < α < 1,  <math>a,b\in\mathfrak{R}</math>  so that <math>P_x</math> converges for <math>n\rightarrow\infty</math>. This version corroborates again the relevance of Menzerath´s law (<math>\rightarrow</math>). Distributions (5) or (6) are called ''Zipf-Alekseev distributions''. If <math>n</math> is finite, (6) is called ''modified right truncated Zipf-Alekseev distribution'' (see Wimmer, Altmann 1999).
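
As an illustration of how the modified right truncated Zipf-Alekseev distribution is evaluated, the following minimal sketch computes the probabilities for a finite number of classes <math>n</math>. It is not taken from the cited sources: the function name and the parameter values are hypothetical, and the weights are assumed to be <math>x^{-(a+b\ln x)}</math>, the same terms that are summed in the norming constant <math>T</math>.

```python
from math import log


def zipf_alekseev_modified(n, a, b, alpha):
    """Probabilities P_1..P_n of the modified right truncated Zipf-Alekseev model.

    P_1 = alpha; for x >= 2, P_x = (1 - alpha) * x**(-(a + b ln x)) / T,
    where T is the sum of the weights x**(-(a + b ln x)) over x = 2..n.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if n < 2:
        raise ValueError("n must be at least 2")
    weights = [x ** -(a + b * log(x)) for x in range(2, n + 1)]
    norming = sum(weights)  # the constant T
    return [alpha] + [(1.0 - alpha) * w / norming for w in weights]


if __name__ == "__main__":
    # hypothetical values: 13 classes, parameters chosen only for illustration
    probs = zipf_alekseev_modified(n=13, a=0.5, b=0.2, alpha=0.3)
    for rank, prob in enumerate(probs, start=1):
        print(rank, round(prob, 4))
```

Because the first class receives the fixed mass α and the remaining classes share <math>1-\alpha</math>, the probabilities sum to one for any admissible <math>a</math> and <math>b</math>.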
 
Even though (3) and (5) are quite different, it can be shown that they are special cases of the Siromoney-Dirichlet distribution
 
Even though (3) and (5) are quite different, it can be shown that they are special cases of the Siromoney-Dirichlet distribution
  
(7)<math>P_x=\frac{a_xe^{-\theta b_x}}{f(\theta)}, \quad x=1,2,3,...</math> <math>f(\theta)=\sum_{j=1}^\infty a_je^{-\theta b_j}<\infty</math>
+
(7) <math>P_x=\frac{a_xe^{-\theta b_x}}{f(\theta)}, \quad x=1,2,3,...</math> <math>f(\theta)=\sum_{j=1}^\infty a_je^{-\theta b_j}<\infty</math>
  
(i) If <math>a_x = k^{(x)}/x!, b_x = x, e^{-\theta} = q</math>, we obtain the positive negative binomial distribution with parameters (k, p) (q = 1-p);
+
(i) If <math>a_x = k^{(x)}/x!, b_x = x, e^{-\theta} = q\quad</math>, we obtain the positive negative binomial distribution with parameters (<math>k</math>,<math>p</math>) (<math>q = 1-p</math>);
  
(ii) if <math>\theta = 1, a_x = 1, b_x = (a+b \quad\ln \quad x)\ln x</math>, we obtain the Zipf-Alekseev distribution (a,b);
+
(ii) if <math>\theta = 1, a_x = 1, b_x = (a+b\ln x)\ln x</math>, we obtain the Zipf-Alekseev distribution (<math>a,b</math>);
  
(iii) the 1-displaced negative binomial distribution, which would be obtained with the conventional displacement of (4), would result if <math>a_x = k^{(x-1)}/(x-1)!, b_x = x-1, e^{-\theta} = q</math>.
+
(iii) the 1-displaced negative binomial distribution, which would be obtained with the conventional displacement of (4), would result if <math>a_x = k^{(x-1)}/(x-1)!, b_x = x-1, e^{-\theta} = q\quad</math>.
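
The special cases (i) and (ii) can be checked numerically. The sketch below is illustrative only: the function names and parameter values are assumptions, and the infinite sum <math>f(\theta)</math> is approximated by truncating after <code>n_terms</code> terms. Here <math>k^{(x)}</math> is read as the ascending factorial <math>k(k+1)\cdots(k+x-1)</math>.

```python
from math import exp, log


def siromoney_dirichlet(a_fn, b_fn, theta, n_terms):
    """P_x = a_x exp(-theta b_x) / f(theta) for x = 1..n_terms (truncated sum)."""
    weights = [a_fn(x) * exp(-theta * b_fn(x)) for x in range(1, n_terms + 1)]
    total = sum(weights)  # truncated approximation of f(theta)
    return [w / total for w in weights]


def ascending_factorial_ratio(k, x):
    """k^(x) / x! = k (k+1) ... (k+x-1) / x!"""
    value = 1.0
    for i in range(x):
        value *= (k + i) / (i + 1)
    return value


# Case (i): a_x = k^(x)/x!, b_x = x, exp(-theta) = q (hypothetical k and q)
k, q = 2.0, 0.3
case_i = siromoney_dirichlet(
    lambda x: ascending_factorial_ratio(k, x), lambda x: float(x), -log(q), 300
)

# Case (ii): theta = 1, a_x = 1, b_x = (a + b ln x) ln x (hypothetical a and b)
a, b = 1.0, 0.5
case_ii = siromoney_dirichlet(
    lambda x: 1.0, lambda x: (a + b * log(x)) * log(x), 1.0, 300
)

if __name__ == "__main__":
    p = 1.0 - q
    print(case_i[0], k * q * p ** k / (1.0 - p ** k))      # P_1 of (3)
    print(case_ii[2] / case_ii[0], 3 ** -(a + b * log(3)))  # P_3 / P_1 of (5)
```

In case (i) the first probability agrees with <math>P_1</math> from (3), and in case (ii) the ratio <math>P_x/P_1</math> reproduces <math>x^{-(a+b\ln x)}</math> from (5).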
  
  
Line 220: Line 219:
  
 
The connotations of a word diversify because everybody can have different associations. Nevertheless, within a community of speakers, they are distributed in a very regular way suggesting a background mechanism which can be captured as a law.
 
The connotations of a word diversify because everybody can have different associations. Nevertheless, within a community of speakers, they are distributed in a very regular way suggesting a background mechanism which can be captured as a law.
In the dictionaries of word associations (see e.g. Palermo, Jenkins 1964), the responses to a stimulus word are ordered according to the number of test persons that gave the same response, i.e. they are ranked according to their frequency of occurrence. The test persons are usually classified according to age, sex, education, occupation, social status etc. Quantitative modelling began most probably in Horvath (1963) and continued in  Haight (1966), Haight, Jones (1974), Lánský, Radil-Weiss (1980) who used the logarithmic, the Yule, the Borel and the Haight-zeta distributions, none of which gave satisfactory results. Dolinskij (1988, 1994) proposed the Zipf-Alekseev distribution, Altmann (1992) added the 1-displaced negative binomial and modified the Zipf-Alekseev distributions.  
+
In the dictionaries of word associations (see e.g. Palermo, Jenkins 1964), the responses to a stimulus word are ordered according to the number of test persons that gave the same response, i.e. they are ranked according to their frequency of occurrence. The persons tested are usually classified according to age, sex, education, occupation, social status etc. Quantitative modelling began most probably in Horvath (1963) and continued in  Haight (1966), Haight, Jones (1974), Lánský, Radil-Weiss (1980) who used the logarithmic, the Yule, the Borel and the Haight-zeta distributions, none of which gave satisfactory results. Dolinskij (1988, 1994) proposed the Zipf-Alekseev distribution, Altmann (1992) added the 1-displaced negative binomial and modified the Zipf-Alekseev distributions.  
 
In Table 3 (Figure 3) one finds the fitting of the Zipf-Alekseev distribution to the rank-frequency of associations of the word “high” (4th grade, male) as given by Palermo, Jenkins (1964).
 
In Table 3 (Figure 3) one finds the fitting of the Zipf-Alekseev distribution to the rank-frequency of associations of the word “high” (4th grade, male) as given by Palermo, Jenkins (1964).
  
 
<div align="center">Table 3</div>
 
<div align="center">Table 3</div>
  
<div align="center">Fitting model (5) to the associations of the word “high” (4th grade, male) given by Palermo, Jenkins (1964)</div>
+
<div align="center">Fitting model (5) to the associations of the word “high” (4th grade, male) </div>
 +
<div align="center">given by Palermo, Jenkins (1964)</div>
  
  
Line 239: Line 239:
  
 
<div align="left">
 
<div align="left">
'''4. Authors''': U. Strauss, G. Altmann
+
'''4. Author''': U. Strauss, G. Altmann
  
  
Line 266: Line 266:
  
 
'''Beöthy, E., Altmann, G.''' (1984a). The diversification of meaning of Hungarian verbal prefixes. II. ki-. ''Finnisch-Ugrische Mitteilungen 8, 29-37.''
 
'''Beöthy, E., Altmann, G.''' (1984a). The diversification of meaning of Hungarian verbal prefixes. II. ki-. ''Finnisch-Ugrische Mitteilungen 8, 29-37.''
 +
 
'''Beöthy, E., Altmann, G.'''  (1984b). Semantic diversification of Hungarian verbal prefixes. III.föl-, el-, be-. ''Glottometrika 7, 73-100''.
 
'''Beöthy, E., Altmann, G.'''  (1984b). Semantic diversification of Hungarian verbal prefixes. III.föl-, el-, be-. ''Glottometrika 7, 73-100''.
  
Line 283: Line 284:
  
 
'''Best, K.-H.''' (2001). Zur Gesetzmäßigkeit der Wortartenverteilungen in deutschen Pressetexten. ''Glottometrics 1, 1-26''.
 
'''Best, K.-H.''' (2001). Zur Gesetzmäßigkeit der Wortartenverteilungen in deutschen Pressetexten. ''Glottometrics 1, 1-26''.
 +
 +
'''Best, K.-H.''' (2003). ''Quantitative Linguistik: eine Annäherung.'' 2nd ed. Göttingen: Peust & Gutschmidt.
 +
 +
'''Best, K.-H.''' (2007). Diversifikation bei Eigennamen. In: Grzybek, P., Köhler, R. (eds.), ''Exact Methods in the Study of Language and Text: 21-31''. Berlin: de Gruyter.
 +
 +
'''Best, K.-H.''' (2008). Zur Diversifikation lateinischer und griechischer Hexameter. ''Glottometrics 17: 43-50.''
 +
 +
'''Best, K.-H.''' (2009). Diversifikation des Phonems /r/ im Deutschen. ''Glottometrics 18: 26-31.''
  
 
'''Brüers, N., Heeren, A.''' (2004). Plural-Allomorphe in Briefen Heinrich von Kleists. ''Glottometrics 7: 85-90.''
 
'''Brüers, N., Heeren, A.''' (2004). Plural-Allomorphe in Briefen Heinrich von Kleists. ''Glottometrics 7: 85-90.''
  
 
'''Dolinskij, V.A.''' (1988). Raspredelenie reakcij v eksperimentach po verbal´nym associacijam. ''Acta et Commentationes Universitatis Tartuensis 827, 80-101.''

'''Dolinskij, V.A.''' (1988). Raspredelenie reakcij v eksperimentach po verbal´nym associacijam. ''Acta et Commentationes Universitatis Tartuensis 827, 80-101.''
 +
 
'''Dolinskij, V.A.''' (1994). Moscow Student´s word associations. In: ''2nd International Conference on Quantitative Linguistics, September 20-24, 1994, Moscow: 66-68.'' Moscow: Lomonosov Moscow State University.

'''Dolinskij, V.A.''' (1994). Moscow Student´s word associations. In: ''2nd International Conference on Quantitative Linguistics, September 20-24, 1994, Moscow: 66-68.'' Moscow: Lomonosov Moscow State University.
  
Line 336: Line 346:
  
 
'''Kuße, H.''' (1991). A und no in N.M. Karamzins Pis´ma Russkogo Putesetvennika. In: Rothe, U. (ed.), ''Diversification processes in language: grammar: 173-182''. Hagen: Rottmann.
 
'''Kuße, H.''' (1991). A und no in N.M. Karamzins Pis´ma Russkogo Putesetvennika. In: Rothe, U. (ed.), ''Diversification processes in language: grammar: 173-182''. Hagen: Rottmann.
 +
 +
'''Laufer, J., Nemcová, E.''' (2009). Diversifikation deutscher morphologischer Klassen. ''Glottometrics 18, 13-25.''
  
 
'''Lánský, P., Radil-Weiss, T.''' (1980). A generalization of the Yule-Simon model, with special reference to word association tests and neural cell assembly formation. ''J. of Mathematical Psychology 21, 53-65''.
 
'''Lánský, P., Radil-Weiss, T.''' (1980). A generalization of the Yule-Simon model, with special reference to word association tests and neural cell assembly formation. ''J. of Mathematical Psychology 21, 53-65''.
  
 
'''Leopold, E.''' (1998). ''Stochastische Modellierung lexikalischer Evolutionsprozesse''. Hamburg: Kovač.
 
'''Leopold, E.''' (1998). ''Stochastische Modellierung lexikalischer Evolutionsprozesse''. Hamburg: Kovač.
 +
 +
'''Meuser, K., Schütte, J.M., Stremme, S.''' (2008). Pluralallomorphe in den Kurzgeschichten von Wolfdietrich Schnurre. ''Glottometrics 17, 12-17.''
  
 
'''Nemcová, E.''' (1991). Semantic diversification of Slovak verbal prefixes. In: Rothe, U. (ed.), ''Diversification processes in language: grammar: 67-74''. Hagen: Rottmann.
 
'''Nemcová, E.''' (1991). Semantic diversification of Slovak verbal prefixes. In: Rothe, U. (ed.), ''Diversification processes in language: grammar: 67-74''. Hagen: Rottmann.
Line 346: Line 360:
  
 
'''Pawlowski, A.''' (1999). The quantitative approach in cultural anthropology: Application of linguistic corpora in the analysis of basic colour terms. ''J. of Quantitative Linguistics 6, 222 234''.
 
'''Pawlowski, A.''' (1999). The quantitative approach in cultural anthropology: Application of linguistic corpora in the analysis of basic colour terms. ''J. of Quantitative Linguistics 6, 222 234''.
 +
 +
'''Popescu, I.-I., Altmann, G.''' (2008). On the regularity of diversification in language. ''Glottometrics 17, 94-108''.
 +
 +
'''Popescu, I.-I., Kelih, E., Best, K.-H., Altmann, G.''' (2009). Diversification of the case. ''Glottometrics 18, 32-39''.
  
 
'''Raether, A., Rothe, U.''' (1991). Diversifikation der deutschen Komposita. In: Rothe, U. (ed.) ''1991: 85-91''.
 
'''Raether, A., Rothe, U.''' (1991). Diversifikation der deutschen Komposita. In: Rothe, U. (ed.) ''1991: 85-91''.

Latest revision as of 13:23, 15 June 2009

1. Problem and history

Diversification is a process of enlarging the number of forms or meanings of any linguistic entity. It can be paradigmatic, e.g. the rise of cases, numbers, tenses, etc., phono-morphemic, e.g. the rise of allophones, allomorphs etc., geographical, e.g. the increase in the number of different expressions of a concept, social, e.g. the rise of different words or meanings of a word or different pronunciations, idiolectal within a community, semantic, e.g. the increase in synonymy and polysemy, contextual, e.g. the increase in the usage of a unit in different contexts. Diversification comprises a number of phenomena dispersed in this volume.

For the sake of illustration let us show some concrete examples:

1) The word can enlarge its class membership without any change, e.g. through conversion: “the hand”, “to hand”.

2) The stem enlarges its class membership through derivation, e.g. German "Bild", "bilden", "bildhaft", or vocalization in Semitic languages, etc.

3) The stem can enlarge its applicability within one class through derivation e.g. German "Blut", "Blutung", "Bluter", or through vocalization, etc.

4) The stem can enlarge its applicability within one class through compounding e.g. "Blut", "Blutdruck", "Blutdurst", etc.

5) If a language abandons isolating morphology, then morphemes diversify into several morphs because of agglutination or inflection (sequential or syntactic dependence).

6) The word can enlarge its applicability in the sentence by acquiring several functions, i.e. it enlarges its dispositional properties, which are different from the constant grammatical properties, e.g. practically every word can become the subject of a sentence.

7) Verbs can enlarge their valence, i.e. their combinability with different cases.

8) The word can enlarge its cotextuality (cf. Köhler 1986), i.e. its ability to occur in several contexts (where "context" can be defined in several ways). The reverse of this kind of diversification process is part of style formation, where a "position" diversifies, i.e., a position in a given context can be filled with different units (words, sentences, etc.).

9) A concept can be expressed by different forms, giving rise to dialects, sociolects, idiolects, or to synonymy.

10) A word can acquire different meanings (polysemy).

11) Every word can acquire different associations (connotations).

Diversified entities abide by a ranking law, i.e. if the members of the diversified entity are ordered according to their frequency, then the frequencies are “lawfully” connected.

The factors of diversification can be as follows (Altmann 2005):

a) Random fluctuation, which is omnipresent in all language phenomena.

b) Environmentally conditioned variation forcing an element to acquire different forms or meaning nuances in different environments.

c) Conscious change through conscious creation, borrowing, emotionality etc.

d) Self-organizing triggering of a process up to a limit, causing changes on other levels, too.

e) System modification joined with local or global modifications in a subsystem.

f) Köhlerian requirements (Köhler 1986, 1987, 1989, 1990, 1991), which force one to take into account collateral pressures from different sides. They are as follows: (i) the trend towards minimal coding and decoding effort, (ii) sufficient redundancy, (iii) the coding requirement in general, (iv) context economy vs. context specificity, and (v) invariance vs. flexibility of the relation between expression and meaning.

The concepts of diversification and unification go back to G.K. Zipf (1935, 1949). Together they are called “Zipfian processes”. The scope of the phenomena is enormous. Semantic phenomena have been examined by Beöthy and Altmann (1984a,b, 1991), Altmann (1985a), Altmann, Best, Kind (1987); grammatical phenomena were analyzed in the omnibus volume edited by Rothe (1991), where also a study on spelling errors in English can be found, and dialectal diversification was studied by Altmann (1985b).

The laws usually hold for ranked nominal classes of limited size.


2. Hypothesis

Every linguistic entity diversifies, i.e. it generates variants and secondary forms and acquires membership in different classes. The ranked frequencies of individual entities abide by a rank-frequency distribution (or a rank-frequency series).

A “rank-frequency distribution” (series) is a function expressing the decrease of frequencies ranked according to their magnitude. There are, eo ipso, no bell-shaped rank-frequency distributions.
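The definition can be made concrete with a minimal sketch. The function name and the toy observations below are hypothetical and serve only to show how a rank-frequency series is obtained from raw class labels: the labels are counted and the counts are sorted in non-increasing order, so the result can never be bell-shaped.

```python
from collections import Counter

def rank_frequency(labels):
    """Return the frequencies of the labels in non-increasing order.

    Rank 1 is the most frequent class, rank 2 the next one, and so on.
    """
    counts = Counter(labels)
    return sorted(counts.values(), reverse=True)

# Hypothetical observations: one class label per occurrence (not real data).
observations = ["noun", "verb", "noun", "adj", "noun", "verb", "adv"]
series = rank_frequency(observations)
print(series)  # [3, 2, 1, 1]
```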

“Variants” are all free or conditional “non-standard” forms of the entity, e.g. allophones, allomorphs, dialectal or sociolectal expressions of a concept, etc.

“Secondary forms” are in some way derived from the primary form, e.g. secondary meanings (polysemy), cases, tenses, moods, aspects, etc.

“Classes” are built by a class-building criterion, e.g. derivatives, compounds, declension classes, word classes (Wortarten), even semantic classes, etc.

Corollary: If the above hypothesis holds, then the frequencies of elements of a linguistic class are not distributed uniformly.

In a “uniform distribution” all frequencies are equal. The corollary is rather a well corroborated inductive generalization. Some theoretical rank-frequency distributions can reduce to the discrete uniform distribution for special parameter values, but these cases are not relevant in linguistics.


3. Derivation

3.1. Altmann´s approach A (1991).

Because the entities are ranked, and because of the corollary, the probabilities of the classes satisfy

P_x\le P_{x-1}

Since P_x and P_{x-1} (x = 2,3,…) are joined in a law-like manner, we can write

(1) P_x=g(x)P_{x-1}\quad, where g(x)\le 1\quad.

Furthermore, g(x) can be written as

g(x)=\frac{f(x)}{h(x)},

where f(x) is a function composed of a language constant a and the diversifying effect of the speaker bx, i.e. f(x) = a+bx, while h(x) contains the controlling, regulating effect of the hearer (community) cx, i.e.

g(x)=\frac{a+bx}{cx} \le 1, (a, b, and c are assumed positive),

so that

(2) P_x=\frac{a+bx}{cx}P_{x-1}.

In order to obtain a known distribution, one can reparametrize (2) by writing a/b = k-1 and b/c = q, and solving (2) for P_x. One obtains

(3) P_x=\binom{k+x-1}{x}\frac{p^kq^x}{1-p^k}, \quad x=1,2,3,...

yielding the zero-truncated (positive) negative binomial distribution, where p = 1 - q. The condition g(x)\le 1 is fulfilled if kq\le 1. Using (1), Altmann (1991) showed a number of other possibilities of obtaining a diversification distribution.
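As a rough numerical illustration (a sketch, not part of Altmann's text), the reparametrized recurrence (2) can be iterated and compared with the closed form (3). The function names and the parameter values k = 1.5, q = 0.4 are assumptions chosen only for this example, and p = 1 - q is assumed throughout.

```python
import math

def positive_neg_binomial(k, q, x_max):
    """Probabilities P_1..P_x_max of the zero-truncated negative binomial (3).

    Uses the recurrence P_x = q * (k + x - 1) / x * P_{x-1}, which is
    recurrence (2) after setting a/b = k - 1 and b/c = q.
    """
    p = 1.0 - q
    probs = [k * p**k * q / (1.0 - p**k)]  # P_1 taken from the closed form (3)
    for x in range(2, x_max + 1):
        probs.append(probs[-1] * q * (k + x - 1) / x)
    return probs

def closed_form(k, q, x):
    """Direct evaluation of (3); log-gamma is used so that k may be non-integer."""
    p = 1.0 - q
    log_binom = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_binom) * p**k * q**x / (1.0 - p**k)

# Illustrative parameter values (assumed, not estimated from any data).
probs = positive_neg_binomial(k=1.5, q=0.4, x_max=200)
print(round(sum(probs), 6))
```

With kq = 0.6 the condition g(x) ≤ 1 holds for every x, so the computed series decreases monotonically, as a rank-frequency distribution requires.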


3.2. Alternative derivation (Altmann 1985b)

For the purposes of dialectal variation captured in terms of numbers of lexeme variants on maps of a dialect atlas, Altmann (1985b) used a birth-and-death process based on the following assumptions:

(a) In a time interval Δt the birth of a new variant is proportional to the length of the interval, i.e. aΔt.

(b) The assertion of a variant against x rivals is proportional to the number of rivals and the length of the interval, i.e. bxΔt.

(c) The death of a variant is proportional to the number of variants and the length of the interval, i.e. cxΔt.

(d) No change (birth, death or assertion) in Δt is given as the complement to the above changes: 1 – [a+(b+c)x]Δt, ignoring intervals smaller than Δt.

(e) The events are independent and the probability of more than one event in the interval is zero.

Thus the probability that there are x-1 variants and a new variant arises or asserts itself against x-1 rivals is

a\triangle tP_{x-1}(t) + b(x-1)\triangle tP_{x-1}(t);

the probability that there are x+1 variants and one dies is

c(x+1)\triangle tP_{x+1}(t);

the probability that nothing happens in Δt is

\{1-[a+(b+c)x]\triangle t\}P_x(t).

Putting these probabilities together we obtain the probability that in the interval (t, t+Δt) there will be exactly x variants as

P_x(t+\triangle t) = [a+b(x-1)]\triangle tP_{x-1}(t) + c(x+1)\triangle tP_{x+1}(t) + \{1-[a+(b+c)x]\triangle t\}P_x(t).

Subtracting P_x(t) from both sides and dividing by Δt, we obtain

\frac{P_x(t+\triangle t)-P_x(t)}{\triangle t}= [a+b(x-1)]P_{x-1}(t)+c(x+1)P_{x+1}(t)-[a+(b+c)x]P_x(t).

Letting \triangle t\rightarrow  0 we finally get

\frac{dP_x(t)}{dt}=[a+b(x-1)]P_{x-1}(t)+c(x+1)P_{x+1}(t)-[a+(b+c)x]P_x(t), \quad x\ge 1,

\frac{dP_0(t)}{dt}=cP_1(t)-aP_0(t)


Solving the balancing equations holding for the steady state

-aP_0+cP_1=0,\quad

-[a+(b+c)x]P_x+[a+b(x-1)]P_{x-1}+c(x+1)P_{x+1}=0, \quad x\ge 1,


and setting b/c = q and a/b = k results again in the negative binomial distribution

(4) P_x=\binom{k+x-1}{x}p^kq^x, \quad x=0,1,2,...

For dialect maps, (4) is to be understood as the probability that the basic lexeme has x variants, i.e. if on a map there is only one unique form, then x = 0.
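The claim that (4) solves the steady-state balancing equations can be checked numerically. The following is a minimal sketch under assumed rate values (a = 1.2, b = 0.6, c = 1.5, chosen only for illustration, with b < c so that q < 1); the function names are hypothetical. It inserts the negative binomial probabilities with k = a/b and q = b/c into both balancing equations and returns the residuals, which should vanish up to rounding error.

```python
import math

def neg_binomial(k, q, x):
    """Negative binomial probability (4) with p = 1 - q, via log-gamma."""
    p = 1.0 - q
    log_binom = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_binom) * p**k * q**x

def balance_residuals(a, b, c, x_max):
    """Left-hand sides of the steady-state equations for x = 0..x_max."""
    k, q = a / b, b / c
    P = [neg_binomial(k, q, x) for x in range(x_max + 2)]
    residuals = [-a * P[0] + c * P[1]]                     # equation for x = 0
    for x in range(1, x_max + 1):                          # equations for x >= 1
        residuals.append(-(a + (b + c) * x) * P[x]
                         + (a + b * (x - 1)) * P[x - 1]
                         + c * (x + 1) * P[x + 1])
    return residuals

# Assumed illustrative rates: birth a, assertion b, death c.
residuals = balance_residuals(a=1.2, b=0.6, c=1.5, x_max=30)
print(max(abs(r) for r in residuals))
```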


Example: Goebl´s law (dialectal diversification)

Goebl (1984) studied the dialect maps of North West France and Italy and reported the distributions of the numbers of variants in the atlases. If dialectal variants of a concept arise by a birth-and-death process of the kind described above, the number of maps containing x variants should follow the negative binomial distribution. One of these distributions is shown in Table 1 (Fig. 1).


Tabelle111 Div.jpg
DivFig1.JPG
Fig. 1. Fitting the negative binomial distribution to Goebl´s data



Example: Beöthy´s law (semantic diversification)

According to this law the ranked frequencies of the elements of a semantic class are distributed according to (3) or (5) (see below). Rothe (1991c) gives a survey of semantic classes abiding by these laws. Testing has been performed for meanings of different Hungarian verbal prefixes (Beöthy, Altmann 1984a,b, 1991), Slovak verbal prefixes (Nemcová 1991), the Japanese postposition ni (Roos 1991), German compounds (Raether, Rothe 1991), the German particle von (Best 1991), the German preposition auf (Fuchs 1991), the English preposition in (Hennern 1991), the Polish preposition w (Hammerl, Sambor 1991), Russian conjunctions a and no (Kuße 1991), the French conjunction et (Rothe 1986), the German genitive (Rothe 1991b), word class distribution in Latin, German and Chinese (Schweers, Zhu 1991), in German (Best 1994, 1997b, 2000b, 2001e; Hammerl 1989; Judt 1995), in Arabic (Altmann 1991a), in Portuguese (Ziegler 1998, 2001), in French (Judt 1995), spelling errors by Japanese English-users (Rothe 1991c), word building patterns in Early New High German (Best 1990). In the example (Table 2, Fig. 2) one finds the ranked distribution of German neologisms of the type “Noun + Noun” categorized in 13 groups from Raether, Rothe (1991).


Tabelle2 Div.jpg

The result shows that nominal classifications of language entities abide by this type of diversification law.

Grafik 2 Div.jpg
Fig. 2. Fitting the positive negative binomial distribution (3) to Raether-Rothe data


3.3. Hřebíček´s approach (1996)

Hřebíček used two assumptions:

(i) The logarithm of the ratio of the probabilities P_1 and P_x is proportional to the logarithm of the class size, i.e.

\ln(P_1/P_x)\propto\ln x\quad

(ii) the proportionality function is given by the logarithm of Menzerath´s law (\rightarrow Hierarchy), i.e.

\ln(P_1/P_x)=\ln(Ax^b)\ln x,

yielding, with a = \ln A, the solution

(5) P_x=P_1x^{-(a+b\ln x)}, \quad x=1,2,3,....
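The step from assumption (ii) to (5) can be spelled out as follows. The identification a = ln A is implied by comparing (ii) with (5) but is not stated explicitly in the text, so it should be read as an assumption of this sketch.

```latex
\begin{align*}
\ln\frac{P_1}{P_x} &= \ln(Ax^b)\,\ln x = (\ln A + b\ln x)\ln x = (a + b\ln x)\ln x, \qquad a := \ln A,\\
\frac{P_1}{P_x} &= e^{(a+b\ln x)\ln x} = x^{\,a+b\ln x},\\
P_x &= P_1\,x^{-(a+b\ln x)}, \qquad x = 1,2,3,\ldots
\end{align*}
```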

If (5) is considered a probability distribution, then P_1 is the norming constant, otherwise it is estimated as the size of the first class (x = 1). Since the frequency of the first class x = 1 is decisive for the form of the distribution, one usually ascribes it a special value α, modifying (5) as

(6) P_x=\begin{cases}\alpha, & x=1\\\frac{(1-\alpha)x^{-(a+b\ln x)}}{T}, & x=2,3,...,(n)\end{cases}

where T=\sum_{j=2}^nj^{-(a+b\ln j)}, 0 < α < 1, a,b\in\mathfrak{R} so that P_x converges for n\rightarrow\infty. This version corroborates again the relevance of Menzerath´s law (\rightarrow). Distributions (5) or (6) are called Zipf-Alekseev distributions. If n is finite, (6) is called modified right truncated Zipf-Alekseev distribution (see Wimmer, Altmann 1999). Even though (3) and (5) are quite different, it can be shown that they are special cases of the Siromoney-Dirichlet distribution

(7) P_x=\frac{a_xe^{-\theta b_x}}{f(\theta)}, \quad x=1,2,3,... f(\theta)=\sum_{j=1}^\infty a_je^{-\theta b_j}<\infty

(i) If a_x = k^{(x)}/x!, b_x = x, e^{-\theta} = q\quad, we obtain the positive negative binomial distribution with parameters (k,p) (q = 1-p);

(ii) if \theta = 1, a_x = 1, b_x = (a+b\ln x)\ln x, we obtain the Zipf-Alekseev distribution (a,b);

(iii) the 1-displaced negative binomial distribution, which would be obtained with the conventional displacement of (4), would result if a_x = k^{(x-1)}/(x-1)!, b_x = x-1, e^{-\theta} = q\quad.


Formula (7) allows the development of further theoretical approaches (see Wimmer, Altmann 1999).
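The special cases (i) and (ii) can be verified numerically with a short sketch. Everything below is illustrative: the function names, the parameter values and the truncation point n = 100 are assumptions made for the check, and the infinite sum in (7) is replaced by a finite one, so case (ii) corresponds to the right truncated form of the Zipf-Alekseev distribution.

```python
import math

def siromoney_dirichlet(a_x, b_x, theta, n):
    """Formula (7) normalised over x = 1..n (finite truncation of the sum)."""
    weights = [a_x(x) * math.exp(-theta * b_x(x)) for x in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def rising_over_factorial(k, x):
    """k^{(x)} / x! = Gamma(k + x) / (Gamma(k) * x!), computed on the log scale."""
    return math.exp(math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1))

k, q = 1.5, 0.4   # assumed values for case (i)
a, b = 0.8, 0.3   # assumed values for case (ii)
n = 100           # truncation point used only for this numerical check

# Case (i): a_x = k^{(x)}/x!, b_x = x, e^{-theta} = q  ->  positive negative binomial (3)
pnb = siromoney_dirichlet(lambda x: rising_over_factorial(k, x),
                          lambda x: x, -math.log(q), n)
p = 1.0 - q
pnb_direct = [rising_over_factorial(k, x) * p**k * q**x / (1.0 - p**k)
              for x in range(1, n + 1)]

# Case (ii): theta = 1, a_x = 1, b_x = (a + b ln x) ln x  ->  Zipf-Alekseev (5)
za = siromoney_dirichlet(lambda x: 1.0,
                         lambda x: (a + b * math.log(x)) * math.log(x), 1.0, n)
raw = [x ** -(a + b * math.log(x)) for x in range(1, n + 1)]
raw_total = sum(raw)
za_direct = [w / raw_total for w in raw]

print(max(abs(u - v) for u, v in zip(pnb, pnb_direct)))
print(max(abs(u - v) for u, v in zip(za, za_direct)))
```

Both printed differences should be at the level of floating-point rounding, since the probability mass of (3) beyond x = 100 is negligible for q = 0.4.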


Example: Association law

The connotations of a word diversify because everybody can have different associations. Nevertheless, within a community of speakers, they are distributed in a very regular way suggesting a background mechanism which can be captured as a law. In the dictionaries of word associations (see e.g. Palermo, Jenkins 1964), the responses to a stimulus word are ordered according to the number of test persons that gave the same response, i.e. they are ranked according to their frequency of occurrence. The persons tested are usually classified according to age, sex, education, occupation, social status etc. Quantitative modelling most probably began with Horvath (1963) and continued in Haight (1966), Haight, Jones (1974), Lánský, Radil-Weiss (1980), who used the logarithmic, the Yule, the Borel and the Haight-zeta distributions, none of which gave satisfactory results. Dolinskij (1988, 1994) proposed the Zipf-Alekseev distribution, Altmann (1992) added the 1-displaced negative binomial and the modified Zipf-Alekseev distributions. In Table 3 (Figure 3) one finds the fitting of the Zipf-Alekseev distribution to the rank-frequency of associations of the word “high” (4th grade, male) as given by Palermo, Jenkins (1964).

Table 3
Fitting model (5) to the associations of the word “high” (4th grade, male)
given by Palermo, Jenkins (1964)


Tabelle 3 Divers.jpg

The result represents a perfect fit that has been found in all cases of associations.


Grafik 3 Div.jpg

Fig. 3. Fitting the Zipf-Alekseev distribution (5) to the word associations of “high”


4. Authors: U. Strauss, G. Altmann


5. References


Alekseev, P. M. (1978), O nelinejnych formulirovkach zakona Cipfa. In: Piotrovskij, R.G. (ed.), Statistika reči i avtomatičeskij analiz teksta: 53-65. Moskva/Leningrad: Naučnyj sovet po kompleksnoj probleme “Kibernetika” AN SSSR.

Altmann, G. (1985a). Semantische Diversifikation. Folia Linguistica 19, 177-200.

Altmann, G. (1985b). Die Entstehung diatopischer Varianten. Ein stochastisches Modell. Zs. für Sprachwissenschaft 4, 139-155.

Altmann, G. (1991). Modelling diversification phenomena in language. In: Rothe 1991: 33-46.

Altmann, G. (1991a). Word class diversification of Arabic verbal roots. In: Rothe 1991: 57-59.

Altmann, G. (1992). Two models for word association data. Glottometrika 13, 105-120.

Altmann, G. (1996). Diversification processes of the word. Glottometrika 15, 102-111.

Altmann, G. (2005). Diversification processes. In: Köhler, R., Altmann, G., Piotrowski, R.G. (eds.), Handbook of Quantitative Linguistics: 646-658. Berlin: de Gruyter.

Altmann, G., Best, K.H., Kind, B. (1987). Eine Verallgemeinerung des Gesetzes der semantischen Diversifikation. Glottometrika 8, 130-139.

Becker, H. (1995). Die Wirtschaft in der deutschsprachigen Presse. Frankfurt: Lang.

Beöthy, E., Altmann, G. (1984a). The diversification of meaning of Hungarian verbal prefixes. II. ki-. Finnisch-Ugrische Mitteilungen 8, 29-37.

Beöthy, E., Altmann, G. (1984b). Semantic diversification of Hungarian verbal prefixes. III.föl-, el-, be-. Glottometrika 7, 73-100.

Beöthy, E., Altmann, G. (1991). The diversification of meaning of Hungarian verbal prefixes. I.meg-. In: Rothe, U. (ed) 1991: 60-66.

Best, K.-H. (1990). Die semantische Diversifikation eines Wortbildungsmusters im Frühneuhochdeutschen. Glottometrika 11, 107-110.

Best, K.-H. (1991). Von: Zur Diversifikation einer Partikel des Deutschen. In: Rothe U. (ed) 1991: 94-104.

Best, K.H. (1993). Zur Wortartenhäufigkeit in Texten deutscher Kurzprosa der Gegenwart. Glottometrika 15, 1993, 1-11.

Best, K.-H. (1994). Word class frequencies in contemporary German short prose texts. J. of Quantitative Linguistics 1, 144-147.

Best, K.-H. (1997). Zur Wortartenhäufigkeit in Texten deutscher Kurzprosa. Glottometrika 16, 276-285.

Best, K.-H. (2000). Verteilung der Wortarten in Anzeigen. Göttinger Beiträge zur Sprachwissenschaft 4, 37-51

Best, K.-H. (2001). Zur Gesetzmäßigkeit der Wortartenverteilungen in deutschen Pressetexten. Glottometrics 1, 1-26.

Best, K.-H. (2003). Quantitative Linguistik: eine Annäherung. 2nd ed. Göttingen: Peust & Gutschmidt.

Best, K.-H. (2007). Diversifikation bei Eigennamen. In: Grzybek, P., Köhler, R. (eds.), Exact Methods in the Study of Language and Text: 21-31. Berlin: de Gruyter.

Best, K.-H. (2008). Zur Diversifikation lateinischer und griechischer Hexameter. Glottometrics 17: 43-50.

Best, K.-H. (2009). Diversifikation des Phonems /r/ im Deutschen. Glottometrics 18: 26-31.

Brüers, N., Heeren, A. (2004). Plural-Allomorphe in Briefen Heinrich von Kleists. Glottometrics 7: 85-90.

Dolinskij, V.A. (1988). Raspredelenie reakcij v eksperimentach po verbal´nym associacijam. Acta et Commentationes Universitatis Tartuensis 827, 80-101.

Dolinskij, V.A. (1994). Moscow Student´s word associations. In: 2nd International Conference on Quantitative Linguistics, September 20-24, 1994, Moscow: 66-68. Moscow: Lomonosov Moscow State University.

Fuchs, R. (1991). Semantische Diversifikation der deutschen Präposition auf. In: Rothe, U. (ed.) 1991: 105-115.

Goebl, H. (1984). Dialektometrische Studien I. Tübingen: Niemeyer.

Haight, F.A. (1966). Some statistical problems in connection with word association data. J. of Mathematical Psychology 3, 217-233.

Haight, F.A., Jones, R.B. (1974). A probabilistic treatment of qualitative data with special reference to word association tests. J. of Mathematical Psychology 11, 237-244.

Hammerl, R. (1989). Untersuchungen zur Verteilung der Wortarten im Text. Glottometrika 11, 142-156.

Hammerl, R. (1991). Untersuchungen zur Struktur der Lexik: Aufbau eines lexikalischen Basismodells. Trier, WVT.

Hammerl, R., Sambor, J. (1991). Untersuchungen zur Verteilung der Bedeutungen der polyfunktionalen polnischen Präposition ‘w’ im Text. In: Rothe, U. (ed.), 1991: 127-137.

Hammerl, R., Sambor, J. (1993a). O statystycznych prawach jezykowych. Warszawa: Polskie Towarzystwo Semiotyczne.

Hennern, A. (1991). Zur semantischen Diversifikation von „in“ im Englischen. In: Rothe, U. (ed.), Diversification processes in language: grammar: 116-126. Hagen: Rottmann.

Horvath, W.J. (1963). A stochastic model for word association tests. Psychological Review 70, 361-364.

Hřebíček, L. (1996). Word associations and text. Glottometrika 15, 12-17.

Jakubajtis, T.A. (1981). Časti reči i tipi tekstov. Riga: Zinatne.

Judt, B. (1995). Wortartenhäufigkeiten im Deutschen und Französischen. Göttingen: Staatsexamensarbeit.

Junger, J. (1989). Diversification in the modern Hebrew verbal system. Glottometrika 10, 71-99.

Kločkova, E.A. (1968). O raspredelenii klassov slov v nekotorych funkcional´nych stiljach russkogo jazyka. In: Voprosy slavjanskogo jazykoznanija: 109-118. Saratov.

Köhler, R. (1986). Zur linguistischen Synergetik. Struktur und Dynamik der Lexik. Bochum: Brockmeyer.

Köhler, R. (1987), Systems theoretical linguistics. Theoretical Linguistics 14, 241-57.

Köhler, R. (1989). Linguistische Analyseebenen, Hierarchisierung und Erklärung im Modell der sprachlichen Selbstregulation. Glottometrika 11, 1-18 (Ed. L. Hřebíček). Bochum: Brockmeyer.

Köhler, R. (1990). Elemente der synergetischen Linguistik. In: Glottometrika 12, 179-187. (Ed. R.Hammerl). Bochum: Brockmeyer.

Köhler, R. (1991). Diversification of coding methods in grammar. In: Rothe, U. (ed.), Diversification processes in language: grammar: 47-55. Hagen: Rottmann.

Krylov, Ju.K. (1982a). Ob odnoj paradigme lingvostatističeskich raspredelenij. Acta et Commentationes Universitatis Tartuensis 628, 80-102.

Krylov, Ju.K. (1982b). Eine Untersuchung statistischer Gesetzmäßigkeiten auf der paradigmatischen Ebene der Lexik natürlicher Sprachen. In: Guiter, H., Arapov, M.V. (eds.), Studies on Zipf´s law: 234-262. Bochum: Brockmeyer.

Kuße, H. (1991). A und no in N.M. Karamzins Pis´ma Russkogo Putesetvennika. In: Rothe, U. (ed.), Diversification processes in language: grammar: 173-182. Hagen: Rottmann.

Laufer, J., Nemcová, E. (2009). Diversifikation deutscher morphologischer Klassen. Glottometrics 18, 13-25.

Lánský, P., Radil-Weiss, T. (1980). A generalization of the Yule-Simon model, with special reference to word association tests and neural cell assembly formation. J. of Mathematical Psychology 21, 53-65.

Leopold, E. (1998). Stochastische Modellierung lexikalischer Evolutionsprozesse. Hamburg: Kovač.

Meuser, K., Schütte, J.M., Stremme, S. (2008). Pluralallomorphe in den Kurzgeschichten von Wolfdietrich Schnurre. Glottometrics 17, 12-17.

Nemcová, E. (1991). Semantic diversification of Slovak verbal prefixes. In: Rothe, U. (ed.), Diversification processes in language: grammar: 67-74. Hagen: Rottmann.

Palermo, D.S., Jenkins, J.J. (1964). Word association norms. Grade School through College. Minneapolis: University of Minnesota Press.

Pawlowski, A. (1999). The quantitative approach in cultural anthropology: Application of linguistic corpora in the analysis of basic colour terms. J. of Quantitative Linguistics 6, 222-234.

Popescu, I.-I., Altmann, G. (2008). On the regularity of diversification in language. Glottometrics 17, 94-108.

Popescu, I.-I., Kelih, E., Best, K.-H., Altmann, G. (2009). Diversification of the case. Glottometrics 18, 32-39.

Raether, A., Rothe, U. (1991). Diversifikation der deutschen Komposita. In: Rothe, U. (ed.) 1991: 85-91.

Roos, U. (1991). Diversifikation der japanischen Postposition “-ni”. In: Rothe, U. (ed.), Diversification processes in language: grammar: 75-82. Hagen: Rottmann.

Rothe, U. (1986). Die Semantik des kontextuellen et. Frankfurt: Lang.

Rothe, U. (1990). Verteilung der Suffixe denominaler Verben nach ihren semantischen Wortbildungsmustern. Glottometrika 12, 107-114.

Rothe, U. (1990a). Semantische Motivation der Genuszuweisung. Glottometrika 11, 95-106.

Rothe, U. (1990b). Semantische Beziehungen zwischen Präfixen deutscher denominaler Verben und der motivierenden Nomina. Glottometrika 11, 111-121.

Rothe, U. (ed.) (1991). Diversification processes in language: grammar. Hagen: Rottmann.

Rothe, U. (1991a). Diversification processes in grammar. An introduction. In: Rothe, U. (ed.), Diversification processes in language: grammar: 3-32. Hagen: Rottmann.

Rothe, U. (1991b). Diversification of the case in German: genitive. In: Rothe, U. (ed.), Diversification processes in language: grammar: 140-156. Hagen: Rottmann.

Rothe, U. (1991c). Distribution of spelling errors by Japanese English-users. In: Rothe, U. (ed.), Diversification processes in language: grammar: 168-171. Hagen: Rottmann.

Saukkonen, P., Haipus, M., Niemikorpi, A., Sulkala, H. (1979). Suomen kielen taajuussanasto. A frequency dictionary of Finnish. Porvoo-Helsinki: Juva.

Schweers, A., Zhu, J. (1991). Wortartenklassifikation im Lateinischen, Deutschen und Chinesischen. In: Rothe U. 1991: 157-167.

Schweiger, F. (1987). Zu den Modellen der semantischen Diversifikation von G. Altmann. Folia Linguistica 21, 191-194.

Tiščenko, V. (1970). Častota častii movi v riznich funkcional´nych stiljach sučasnoj ukrains´koj movi. In: Pitanija strukturnoi leksikologii. Kiiv.

Tuldava, J. (1998). Probleme und Methoden der quantitativ-systemischen Lexikologie. Trier: WVT.

Wimmer, G., Altmann, G. (1999). Thesaurus of univariate discrete probability distributions. Essen: Stamm.

Ziegler, A. (1998b). Word class frequencies in Brazilian-Portuguese texts. J. of Quantitative Linguistics 5, 269-280.

Ziegler, A. (2001). Word class frequencies in Portuguese press texts. In: Uhlířová, L., Wimmer, G., Altmann, G., Köhler, R. (Eds.), Text as a linguistic paradigm: levels, constituents, constructs. Festschrift in honour of Ludek Hřebíček: 295-312. Trier: WVT.

Ziegler, A., Best, K.-H., Altmann, G. (2001). A contribution to text spectra. Glottometrics 1, 97-108.

Zipf, G. K. (1935). The psycho-biology of language. An introduction to dynamic philology. Boston: Houghton Mifflin.

Zipf, G.K. (1949). Human behavior and the principle of least effort. Cambridge: Addison Wesley.

Zsilka, T. (1974). Stilisztika és statisztika. Budapest.