PhD Research School in Linguistics and Philology

PhD Course 31.08. - 02.09.15

Workshop on Statistics for Linguistics


Dates: 31 August - 2 September (10.00 - 15.00)

Venue: Studentsenteret, seminar room B (31 Aug.) https://www.google.com/maps/place/Parkveien+1,+5007+Bergen,+Norway/

Dragefjellet skole, seminar room E (1–2 Sep.) https://www.google.com/maps/place/Magnus+Lagab%C3%B8tes+Plass+1,+5010+Bergen,+Norway/

Course instructor: Melanie Bell, Anglia Ruskin University

Contents of the course

Melanie Bell, Senior Lecturer at Anglia Ruskin University, will give a workshop on statistics for linguistics from 31 August to 2 September 2015. The workshop is a follow-up to her introductory statistics course at the LingPhil summer school in Fevik and will cover more advanced statistical techniques. It will include both lectures on various topics and comments on the individual PhD projects of course participants. You can participate in the workshop either with or without a presentation of your project.

PhD candidates who want their projects to be commented on should send in a manuscript ahead of time. The manuscript should include a brief description of the PhD project, an explicit statement of the hypotheses to be tested, and as much methodological detail as possible. It should be accompanied by some data, which may be made up if real data are not yet available. Each candidate who has supplied data will give a brief oral presentation of their project, and the data will be used to exemplify the statistical techniques covered in the course.

There is some scope to adapt the course content according to the projects presented, but the following areas will provisionally be included:

Testing hypotheses
Comparing groups
Multiple Regression
Logistic regression
Mixed effects modelling


All participants will be expected to have a working knowledge of the topics covered in the introductory course: types of data, patterns of distribution and descriptive statistics.  If you have not taken an introductory course in statistics, you can still participate, but in that case you should prepare by reading (and doing the exercises for) Chapters 1–4 in Chris Butler’s book Statistics in Linguistics. This book is out of print but has been made available by the author online, see:


The author has corrected some errors in the printed version and therefore urges people to use the online version.

All participants should also bring their own laptop to the workshop and should have the following software installed before the workshop starts:

R: http://www.r-project.org/

RStudio: http://rstudio.org/

In addition, you should ensure that you know how to get data into R and how to manipulate dataframes in R by working through Chapter 1 of the following book:

R. H. Baayen, Analyzing Linguistic Data: A Practical Introduction to Statistics using R
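As a rough indication of the kind of operations meant by "getting data into R" and "manipulating dataframes", here is a minimal sketch in base R. It is not taken from Baayen's book: the file, the data frame `rt` and its columns (`subject`, `word_type`, `rt_ms`) are made up purely for illustration, and the book may use different functions and data sets.

```r
# Hypothetical data: reaction times for two made-up word types.
# The file is written to a temporary location only so that the
# example is self-contained; normally you would already have a file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "subject,word_type,rt_ms",
  "s1,compound,612",
  "s1,simplex,540",
  "s2,compound,655",
  "s2,simplex,571"
), csv_path)

# Getting data into R: read a comma-separated file into a data frame.
rt <- read.csv(csv_path, stringsAsFactors = TRUE)

# Inspecting the data frame: structure and first rows.
str(rt)
head(rt)

# Manipulating the data frame:
# select a subset of rows,
compounds <- rt[rt$word_type == "compound", ]
# add a new column computed from an existing one,
rt$log_rt <- log(rt$rt_ms)
# and compute the mean reaction time per word type.
mean_rt <- aggregate(rt_ms ~ word_type, data = rt, FUN = mean)
mean_rt
```

If you can read, run and modify a script of this kind in RStudio with your own data file, you are probably well enough prepared on the software side.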


Recommended reading:

The reading for the course comes from the book by Chris Butler, mentioned above, and Keith Johnson’s Quantitative Methods In Linguistics:


Everyone should read at least Chapter 1 of Johnson before the workshop starts. Based on the provisional list of topics, the other relevant sections of the books are as follows:

Testing hypotheses
Butler: Chapters 5 and 6
Johnson: Sections 2.1 to 2.3

Comparing groups
Butler: Chapters 7 and 8
Johnson: Section 3.1

Butler: Chapter 11
Johnson: Section 2.4

Multiple Regression
Johnson: Section 3.2

Logistic regression
Butler: Chapter 9
Johnson: Sections 5.1 to 5.4

Mixed effects modelling
Johnson: Sections 7.1 to 7.3

Credits: 3 ECTS credit points for participating with a presentation, 1 ECTS credit point for participating without one

Venue: University of Bergen

Program: Daily classes from 10:00 to 15:00 with a one-hour lunch break at 12:00