Word associations
Revision as of 11:38, 13 July 2006
1. Problem and history
Given a word as a stimulus, different persons respond with different words; e.g. “music” as a stimulus can evoke “violin”, “Chopin”, “love”, “melody”, etc. If many persons are asked, one observes that the particular responses (associations) are not equally frequent; rather, the response words can be ranked according to their frequency. The problem is to find an adequate ranking distribution. Associations are thus both a ranking and a diversification problem.
Apart from qualitative work and the compilation of frequency lists, mostly in the form of voluminous books, the first attempt at modelling was probably made by Horvath (1963), who used the Yule distribution inductively. Haight (1966) compared the Borel, the Yule and the logarithmic distributions with a distribution he derived for this purpose, now called the Haight-zeta distribution (cf. Wimmer, Altmann 1999), but none of them yielded adequate results. Haight and Jones (1974) as well as Lánský and Radil-Weiss (1980) tried another approach but attained good results in only about 50% of cases. Dolinskij (1988, 1994) used the Zipf-Alekseev distribution, a generalization of the Zipf distribution, for this purpose. Altmann (1992) showed that the deviations from this distribution are extremely small (P ≈ 1 in almost all cases), but the theoretical foundation of this distribution was provided by Hřebíček (1995, 1996, 1997). Altmann (1992) also used the usual proportionality approach with speaker-hearer balance and obtained the negative binomial distribution.
2. Hypothesis
The ranking of word associations abides by a regular ranking distribution.
3. Derivation
3.1. The Zipf-Alekseev model (Hřebíček 1997: 43)
Hřebíček starts from two assumptions:
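(i) The frequency of an association <math>f_x</math> at any rank x is proportional to the frequency at the first rank <math>f_1</math>.
(ii) The frequency <math>f_x</math> at rank x is inversely proportional to the rank x.
Putting this together, we obtain
<math>f_x \propto \frac{f_1}{x}</math>
or, in logarithmic form
<math>\ln \frac{f_1}{f_x} \propto \ln x</math>.
The proportionality is given by Menzerath's law, i.e.
(1)<math> \ln \frac{f_1}{f_x} = (a + b\ln x)\ln x</math>.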
Solving for <math>f_x</math> and passing from frequencies to probabilities yields
(2)<math> P_x = P_1 x^{-(a+b\ln x)}</math>.
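Usually <math>P_1</math> is so important that it is given a special value <math>\alpha</math>, i.e. one modifies the distribution, obtaining
(3)<math> P_x = \begin{cases} \alpha, & x = 1 \\ \frac{(1-\alpha)x^{-(a+b \ln x)}}{T}, & x = 2,3,...,n \end{cases}</math>
with <math>T = \sum_{j=2}^{n} j^{-(a+b \ln j)}</math>.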
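The modified Zipf-Alekseev distribution can be evaluated numerically as in the following minimal sketch. It assumes that T is the sum of the weights x^{-(a + b ln x)} over the ranks 2, ..., n, which is what the probabilities need in order to sum to 1. The function name and the parameter values are hypothetical and serve only as an illustration; they are not estimates from any data set.

```python
import math


def zipf_alekseev_modified(alpha, a, b, n):
    """Return [P_1, ..., P_n] of the modified Zipf-Alekseev distribution.

    P_1 is fixed at alpha; the ranks 2..n share the remaining mass
    (1 - alpha) in proportion to x ** -(a + b * ln x).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Unnormalized weights for the ranks 2..n
    weights = [x ** -(a + b * math.log(x)) for x in range(2, n + 1)]
    # Normalizing constant T (assumed: sum of the weights over ranks 2..n)
    T = sum(weights)
    return [alpha] + [(1.0 - alpha) * w / T for w in weights]


# Hypothetical parameter values, chosen only for illustration
probs = zipf_alekseev_modified(alpha=0.4, a=0.5, b=0.3, n=20)
print(sum(probs))  # should be 1 up to rounding error
```

In practice alpha, a and b would be estimated from the observed rank-frequency data, and n is the number of different responses.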
3.2. The negative binomial model (Altmann 1992)
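Assumption: The probability <math>P_x</math> at rank x is proportional to the probability <math>P_{x-1}</math> at rank x-1, the proportionality being g(x) = (a+bx)/(cx). If ranking begins with x = 1, one solves the equation for the displaced form, i.e.
(4)<math> P_{x+1} = \frac{a+bx}{cx}P_x</math>
and after reparametrization one obtains the 1-displaced negative binomial distribution
(5)<math> P_x = {k+x-2 \choose x-1}p^k q^{x-1}, \quad x=1,2,3,...</math>
where a/b = k-1, b/c = q and p = 1-q.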
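The link between the recurrence P_{x+1} = (a+bx)/(cx) P_x and the 1-displaced negative binomial distribution can be checked numerically. The sketch below assumes 0 < b/c < 1 and a/b > -1, sets P_1 = p^k with p = 1 - q, and evaluates the binomial coefficient for real k through the gamma function. The function names and parameter values are hypothetical.

```python
import math


def nb_closed_form(k, q, x):
    """1-displaced negative binomial probability P_x for x = 1, 2, 3, ..."""
    p = 1.0 - q
    # log of the generalized binomial coefficient C(k + x - 2, x - 1)
    log_binom = math.lgamma(k + x - 1) - math.lgamma(x) - math.lgamma(k)
    return math.exp(log_binom + k * math.log(p) + (x - 1) * math.log(q))


def nb_by_recurrence(a, b, c, n):
    """First n probabilities P_1..P_n built from P_{x+1} = (a + b*x)/(c*x) * P_x."""
    k = a / b + 1.0          # reparametrization a/b = k - 1
    q = b / c                # reparametrization b/c = q
    probs = [(1.0 - q) ** k]  # starting value P_1 = p^k
    for x in range(1, n):
        probs.append(probs[-1] * (a + b * x) / (c * x))
    return probs


# Hypothetical parameters: a = 1.5, b = 0.5, c = 1.0 give k = 4 and q = 0.5
a, b, c = 1.5, 0.5, 1.0
recurrence = nb_by_recurrence(a, b, c, 200)
closed = [nb_closed_form(a / b + 1.0, b / c, x) for x in range(1, 201)]
max_diff = max(abs(r - s) for r, s in zip(recurrence, closed))
print(max_diff)  # both routes should agree up to rounding error
```

The agreement follows because the ratio of two successive terms of the closed form is q(k+x-1)/x, which equals (a+bx)/(cx) under the reparametrization.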
Example: Associations of the word “high”
Altmann (1992) used the associations of the word “high” (4th grade, male) from Palermo and Jenkins (1964) and obtained the results presented in Table 1 and Fig. 1.
Both results are excellent.
4. Authors: U. Strauss, G. Altmann, J. Eom
5. References:
Altmann, G. (1992). Two models for word association data. Glottometrika 13, 105-120.
Altmann, G., Bagheri, D., Goebl, H., Köhler, R., Prün, C. (2002). Einführung in die quantitative Lexikologie. Göttingen: Peust & Gutschmidt.
Dolinskij, V.A. (1994). Moscow Student's word associations. In: 2nd International Conference on Quantitative Linguistics, September 20-24, 1994, Moscow: 66-68. Moscow: Lomonosov Moscow State University.
Dolinskij, V.A. (1988). Raspredelenie reakcij v eksperimentach po verbal'nym associacijam. Acta et Commentationes Universitatis Tartuensis 827, 80-101.
Haight, F.A. (1966). Some statistical problems in connection with word association data. J. of Mathematical Psychology 3, 217-233.
Haight, F.A., Jones, R.B. (1974). A probabilistic treatment of qualitative data with special reference to word association tests. J. of Mathematical Psychology 11, 237-244.
Horvath, W.J. (1963). A stochastic model for word association tests. Psychological Review 70, 361-364.
Hřebíček, L. (1995). Text levels. Language constructs, constituents and Menzerath-Altmann law. Trier: WVT.
Hřebíček, L. (1996). Word associations and text. Glottometrika 15, 12-17.
Hřebíček, L. (1997). Lectures on text theory. Prague: Oriental Institute.
Lánský, P., Radil-Weiss, T. (1980). A generalization of the Yule-Simon model, with special reference to word association tests and neural cell assembly formation. J. of Mathematical Psychology 21, 53-65.
Palermo, D.S., Jenkins, J.J. (1964). Word association norms. Grade School through College. Minneapolis: University of Minnesota Press.
Wimmer, G., Altmann, G. (1999). Thesaurus of univariate discrete probability distributions. Essen: Stamm.