Statistical tests of cross-language color naming
Paul Kay (linguistics, University of California at Berkeley)
Terry Regier (psychology, University of Chicago)
Richard Cook (linguistics, University of California at Berkeley)
John O'Leary (computer science, University of Chicago)
It is generally accepted that there are cross-linguistic universal
tendencies in the naming of colors. This is due in large part to
Berlin and Kay (1969), who found universal patterns in
color naming data collected from a variety of languages. Recently,
however, these well-known universalist findings have been challenged,
on both methodological (Lucy 1997, Saunders and van Brakel 1997) and
substantive (Roberson et al. 2000) grounds. Critically, the original
universalist findings are vulnerable on two key points:
- These findings have not yet been properly tested statistically.
In Berlin and Kay (1969) a small test of three languages was
performed. However, the larger claims of cross-language universals in
color naming rested primarily on the intuitively apparent clustering
of focal color choices for 20 languages on a discretized surface of
highly saturated Munsell colors, roughly approximated by the grid
shown above. Unfortunately, people sometimes perceive a seemingly
non-random clustering of "hits" in items that are actually distributed
randomly over a surface (Clarke 1946). Thus, it is possible that the
finding of color term universality, based as it is on subjective
perception of clustering, is without statistical foundation -- a point
that has not escaped critics of this work (e.g., Lucy 1997, Saunders
and van Brakel 1997).
- The language sample from which Berlin and Kay (1969) collected
data was strongly biased in favor of written languages from
industrialized societies. Thus, even if their findings of color
term universals had been supported by statistical test, it would still
not be clear how well these results could be expected to generalize to
unwritten languages from non-industrialized societies.
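The first point, that randomly placed hits can look clustered, can be
made concrete with a small simulation in the spirit of Clarke (1946).
The sketch below is illustrative only: the grid size, the number of
hits, and the function names simulate_hits and poisson_expected are
assumptions made for this example, not values or procedures taken from
Berlin and Kay (1969). It drops hits uniformly at random onto a grid
and compares the number of cells that receive exactly k hits with the
Poisson expectation.

```python
import math
import random
from collections import Counter

def simulate_hits(n_cells, n_hits, rng):
    """Drop n_hits uniformly at random onto n_cells; return hits per cell."""
    counts = [0] * n_cells
    for _ in range(n_hits):
        counts[rng.randrange(n_cells)] += 1
    return counts

def poisson_expected(n_cells, n_hits, k):
    """Expected number of cells with exactly k hits under a Poisson model."""
    lam = n_hits / n_cells  # mean hits per cell
    return n_cells * math.exp(-lam) * lam ** k / math.factorial(k)

rng = random.Random(0)       # fixed seed so the run is repeatable
n_cells, n_hits = 320, 220   # illustrative sizes, not taken from the study
counts = simulate_hits(n_cells, n_hits, rng)
observed = Counter(counts)

# Columns: hits per cell, observed number of cells, Poisson expectation.
for k in range(max(counts) + 1):
    print(k, observed.get(k, 0), round(poisson_expected(n_cells, n_hits, k), 1))
```

Even under pure randomness, many cells stay empty while a few typically
collect several hits, which is why a visual impression of clustering
needs a formal test before it can count as evidence.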
A study now underway at U.C. Berkeley and the University of Chicago
responds to these concerns. We are statistically testing
comprehensive color naming data, collected from 110 unwritten
languages from non-industrialized societies, through the
World Color Survey.
Through these tests, we seek to establish: (i) whether color
terms from different languages cluster together in perceptual color
space at rates greater than chance; (ii) whether these clusters are
located near the points where earlier studies of languages from
industrialized societies have placed universal focal or landmark
colors; and (iii) whether the color term systems of the languages
studied tend to fall into a small number of distinct types forming a
developmental sequence, as proposed by universally oriented work on
color naming, e.g., Kay and Maffi (1999).
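Question (i) is the kind of question a Monte Carlo test can address.
The following is a minimal sketch under stated assumptions, not the
procedure used in this study. It assumes each color term has been
reduced to a single point, uses the unit square as a stand-in for
perceptual color space, and takes uniformly scattered terms as the
chance model. The names dispersion, random_dataset, and monte_carlo_p
are hypothetical, and the data are made up.

```python
import math
import random

def dispersion(languages):
    """Sum, over every term, of the distance to the nearest term
    in each of the other languages (small means tightly clustered)."""
    total = 0.0
    for i, lang in enumerate(languages):
        for p in lang:
            for j, other in enumerate(languages):
                if i != j:
                    total += min(math.dist(p, q) for q in other)
    return total

def random_dataset(sizes, rng):
    """Chance model (an assumption): terms lie uniformly in the unit square."""
    return [[(rng.random(), rng.random()) for _ in range(n)] for n in sizes]

def monte_carlo_p(languages, n_sims, rng):
    """Share of chance datasets at least as clustered as the observed one."""
    observed = dispersion(languages)
    sizes = [len(lang) for lang in languages]
    hits = sum(
        dispersion(random_dataset(sizes, rng)) <= observed
        for _ in range(n_sims)
    )
    return (hits + 1) / (n_sims + 1)  # add-one correction keeps p above zero

# Toy data: three made-up "languages" whose terms sit near two shared points.
rng = random.Random(1)
anchors = [(0.2, 0.3), (0.7, 0.6)]
toy = [
    [(x + rng.gauss(0, 0.02), y + rng.gauss(0, 0.02)) for x, y in anchors]
    for _ in range(3)
]
p_value = monte_carlo_p(toy, 999, rng)
print(p_value)
```

A small p-value here says only that the toy terms lie closer together
across languages than uniformly scattered terms would. A uniform chance
model ignores the structure within each language's own term system, so
a real test would need a more careful null distribution.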
The color naming data on which these tests are based is
publicly available on this website.
- Berlin, Brent and Paul Kay (1969). Basic Color Terms. Berkeley
and Los Angeles: University of California Press.
- Clarke, R.D. (1946). An application of the Poisson distribution.
Journal of the Institute of Actuaries 72, 481.
- Kay, Paul and Luisa Maffi (1999). Color appearance and the emergence
and evolution of basic color lexicons. American Anthropologist
- Lucy, John (1997). The linguistics of "color". In C.L. Hardin and Luisa
Maffi (eds.) Color Categories in Thought and Language.
Cambridge: Cambridge University Press.
- MacLaury, Robert E. (1997). Color and Cognition in Mesoamerica.
Austin: University of Texas Press.
- Roberson, Debi; Davies, Ian; and Jules Davidoff (2000). Color categories
are not universal: Replications and new evidence from a stone-age
culture. Journal of Experimental Psychology: General,
- Saunders, B.A.C. and J. van Brakel (1997). Are there non-trivial
constraints on colour categorization? Behavioral and
Brain Sciences 20, 167-228.
Last updated: 20030603