The University of Chicago, Department of Linguistics
John A. Goldsmith

You can't understand the answer if you don't understand the question.

I think that the longer you spend thinking about language, the more you spend your time trying to understand what the question is.

I spend a lot of time working on algorithms that take relatively raw representations of natural language and infer structure in the device that generated the data. Why? Because there is no more direct a way to understand the idea of structure in the data. (If you don't think it's too metaphysical, we could even speak of structure that is immanent in the data.) We live in a cognitive age, and the conventional wisdom is that this structure exists only in a derivative sense: it depends on the existence of a psychological faculty that inheres in speakers and hearers. I don't think we need to go there. There is another way to understand the reality of structure in the data.

So no sooner is the question clearly posed than it twists and turns into a philosophical question before our very eyes.

And here's another sad fact: the only way you can understand the answers provided by a generation of scholars is to go back to what they were learning when they were students so that you can figure out what the question is that they are answering. We can trust scholars to give us their best understanding of the answers they have found, but not to give us an equally good understanding of the question. And if you don't understand the question, then you don't really understand the answer. So what's to be done?

One answer is to read the thoughts of the people who have created the thoughts of the preceding generations. It's not just one answer; it's by far the most important answer.

All of this is some kind of explanation for why my publications seem to be all over the place: in the last ten years, I've been writing about unsupervised computational learning of morphology, about the history of linguistics, about empiricism and what it might mean for linguistics, about maximum likelihood models in phonology, about tone in Tonga, Shi, and Kirundi. They are just ways of understanding the question, that's all.

Hmm. Here's how a U of C cartoonist saw me.

Here are slides from a talk at Ohio State University this coming week, entitled "Optimization is the answer. Now, what is the question?"



Recent papers

Syllables. A chapter in the forthcoming second volume of the Handbook of Phonological Theory.

Theory, kernels, data, methods. Not a formal paper, but a paper that I read at the 17th Manchester Phonology Meeting, May 2009.

Segmentation and morphology. To appear in The Handbook of Computational Linguistics and Natural Language Processing. September 2008. An overview of computational work on morphology and on the problem of learning to segment words and morphemes.

Information theoretic approaches to phonology: the case of Finnish vowel harmony. With Jason Riggle. November 2007.

Draft of: Your Turing Machine or Mine? June 2007.
A video presentation of this at UCL: meeting on Machine Learning and Cognitive Science of Language Acquisition

Towards a new empiricism. June 2007. This will appear as part of a book written with Alex Clark, Nick Chater, and Amy Perfors.

Generative phonology in the late 1940s. February 2007. This appeared more recently in Phonology.

Analogy in morphology. February 2007.

Learning phonological categories. December 2006. With Aris Xanthos. This paper appeared in Language, March 2009.

Probability for linguists. March 2007.

Generative phonology: its origins, its principles, its successors. With Bernard Laks.

An algorithm for the unsupervised learning of morphology


The Linguistica Project

See our Linguistica homepage at Linguistica.uchicago.edu for the executable, the source code, and research papers. The project is developing a computer program that automatically performs morphological analysis of a raw text corpus that you give it.



Courses: past, present, and future

Spring 2008: The history of linguistics: where we came from, and how we got here.

Introduction to linguistics 2 (winter 2008)

Introduction to computational linguistics

Materials from course on the assassination of John F. Kennedy

Seminar on the history of phonological theory, 1950-1990

The Zulu language (Fall 2005)

Class notes from a course on the history of the cognitive sciences -- an unusual perspective.


Bantu tone

I'm working on two papers on tone in Bantu languages, one on Shi, based on material in Louise Polak-Bynon's grammar (a brief handout on this), and one on KiHunde, based on material I have gathered.

Academia



John Komlos, Penny Gold, and I have written an introduction to academic life for people interested in getting into it, and for people already engaged in it. We try to explain the good and the bad sides of it as a career choice, and give some suggestions that people often don't hear till it's...well, till it's too late. It's called The Chicago Guide to your Academic Career: a Portable Mentor from Graduate School Through Tenure.



Observations



When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind. It may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science. William Thomson, Lord Kelvin.


The History of Linguistics

Towards a new empiricism has been a paper in progress since the fall of 2006. The point of the paper is to justify a view of linguistics that sees linguistics as a science and neither a part nor a satellite of psychology, and one that places equal emphasis on theoretical insight and on empirical coverage. Best of all, it provides an explicit means for testing and justifying hypotheses, through the good offices of Bayesian reasoning, and Minimum Description Length analysis in particular.

A paper (with Bernard Laks, Université de Paris X) on the history of generative phonology.

A recent review article on Bruce Nevin's book on Zellig Harris. (That's a photo of Zellig Harris, on the right.)

My 2004 CLS paper on the role of the algorithm in generative grammar.

See also the reference to a paper below on information theory, entropy, and phonology in the 20th century.

Geoff Huck and I wrote a book Ideology and Linguistic Theory (1995) dealing with generative semantics and interpretive semantics in the post-Aspects period, showing how many of the critical suggestions of generative semantics were integrated into mainstream linguistic thought in following generations, and that the professional battle waged during this period was often disconnected from the intellectual issues that were referenced. One of our earlier papers is available here: Distributionalist and Mediationalist Themes in the Development of Linguistic Theory.

A review article on Robert Barsky's Noam Chomsky: A Life of Dissent.



Miscellaneous

Nicolas Ruwet (1933-2001)

Dynamic computational networks

Using an HMM to learn sonority in French (new: thanks, Colin Sprague)

Links to a few PowerPoints

View Vectors

Spectral graph theory

Personal web page

Graphical images of units of Z/N, N prime or composite

Phonological complexity

Computational Linguistics at the University of Chicago

Former students

Looking for another John Goldsmith, perhaps?


Phonology

Along with Jason Riggle and Aris Xanthos, I have been working over the past couple of years on rethinking phonological theory from a Bayesian point of view, asking the question: Can phonology be understood as the search for models that are the most probable, given what we know of the phonological complexities of the world's languages? Out of this work have come two papers so far, one focusing on learning phonological categories and the other on treating vowel harmony in Finnish.

Alan Yu, Jason Riggle, and I are editing a second volume of the Handbook of Phonological Theory. I edited the first edition, which came out in 1995.

A book that I edited, Essential Readings in Phonological Theory, was published by Blackwell's in 1999. The second volume should come out in late 2009 or so.

What is phonology? First chapter of a book that I'm going to finish before too long, probably entitled What is Phonology? This chapter is primarily about flapping in American English.

Probabilistic models of grammar: phonology as information minimization.

First paper on autosegmental phonology: pretty rough. November 1973.

My second paper on autosegmental phonology, but the first one with a theory and a name. Marginal comments from Noam Chomsky. Spring 1974. Strange to look at it now.



Bioinformatics


Work with Ridg Scott, Terry Clark, and Jing Liu on extending algorithms that have been developed in computational linguistics to applications in bioinformatics.

Links
LingDept
CS Dept
Other

Hi-res photo