August 18, 2015

At The Conglomerate, Gordon Smith: Corpus Linguistics in the Courts (Again) (discussing a recent case on interpretation from the Utah Supreme Court).  Quoting Justice Tom Lee (concurring):

In this age of information, we have ready access to means for testing our resolution of linguistic ambiguity. Instead of just relying on the limited capacities of the dictionary or our memory, we can access large bodies of real-world language to see how particular words or phrases are actually used in written or spoken English. Linguists have a name for this kind of analysis; it is known as corpus linguistics.

The fancy Latin name makes this enterprise seem esoteric and daunting. It is not. We all engage in it even if we don’t attach the technical label to it. A corpus is a body, and corpus linguistics analysis is no more than a study of language employing a body of language. When we communicate using words we naturally access a large corpus—the body of language we have been exposed to during our lifetimes—to decode the groups of letters or sounds we encounter. The most basic corpus linguistics analysis involves our split-second effort to access the body of language in our heads in our ongoing attempt to decode words or phrases we may be uncertain of. We all do that repeatedly every day.

It is a small step to utilize a tool to aid our linguistic memory. Judges do this with some frequency as well. Naturally. If judges are entitled to consult the corpus of language in our heads (and how could we not?), we must also be permitted to supplement and check our memory against publicly available sources of language.

As the post goes on to explain, the majority opinion has a sharply differing view.

(Via Eugene Volokh at Volokh Conspiracy).

Posted at 6:10 AM