Statistical Methods for NLP

Course Plan

Please check this page often for updates

Class Topics Materials Code
Class 1

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Mon 16/1 13:15 – 15:00

Notes: See also the recent New York Times Magazine article: The Great AI Awakening.

Introduction to the class

Introduction to Machine Learning

Combinatorics

Computational Statistics using Python (& R)

Class 1 Presentation

Class 1 Presentation (printer friendly version¹)

Charalambos Themistocleous (2017). Introduction to R. Part A. Language Fundamentals (manuscript)

Introduction to Python: Python Programming Language; SciPy/NumPy Quickstart Tutorial (see also Class 5 notes)

¹ Please prefer the printer friendly version to save paper.

The code for the examples in the presentation (frequency lists, concordances, etc.) can be accessed here for those interested in seeing how they were created.
Class 2

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 19/1 10:15 – 12:00

 

Probability Theory: Introduction

Class 2 Presentation

Class 2 (printer friendly version)
 
Class 3

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 23/1 10:15 – 12:00

Notes: On using probabilities in everyday decision making and how to avoid biases:

Tversky, A., & Kahneman, D. (1974). Judgment under Uncertainty: Heuristics and Biases. Science, 185(4157), 1124-1131. doi:10.1126/science.185.4157.1124

 

Law of Total Probability

Independent vs. Dependent Events

Conditional Probability
Bayes' Theorem

 

Class 3 Presentation
Class 3 (printer friendly version)
 

 

 

Class 4

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 26/1 10:15 – 12:00

Discrete Variables

Continuous Variables

Distributions

Bernoulli Distribution

Binomial Distribution

Hypergeometric Distribution

Random Variables

Class 4 Presentation

Class 4 (printer friendly version)

The code of the examples in the presentation (frequency lists, concordances, etc.) can be accessed here. 

 

Assignment: Task 1
Class 5

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 30/1 10:15 – 12:00

Computer Exercise 1: Distributions and random number generation based on a distribution

Class 5 Presentation

Class 5 (printer friendly version)

Code

Data

Class 6

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 2/2 10:15 – 12:00

Continuous Variables

Hypothesis Testing

Statistical concepts

Linear Models

Linear Mixed-Effects Models

Class 6 Presentation

Class 6 (printer friendly version)

 
Assignment: Task 2
Class 7

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 6/2 10:15 – 12:00

Information Theory

Entropy

Class 7 Presentation

Class 7 (printer friendly version)

Class 8
 

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Thu 9/2 10:15 – 12:00

Machine learning
Classification
Basic Concepts
Class 8 Presentation

Class 8 (printer friendly version)

Class 9
 

Lecturers: Mehdi Ghanimifard

Room: Lab 4

Mon 13/2 10:15 – 12:00

FIRST ASSIGNMENT/DEADLINE 23/02/17
Naive Bayes
Class 10
 

Lecturers: Mehdi Ghanimifard

Room: Lab 4

Thu 16/2 10:15 – 12:00

 FIRST ASSIGNMENT/DEADLINE 23/02/17
Naive Bayes
Class 11

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Mon 20/2 10:15 – 12:00

Evaluation

Classification Models: LDA

Class 11 Presentation

Class 11 (printer friendly version)

Class 12

Lecturers: Mehdi Ghanimifard

Room: Lab 4

Thu 23/2 10:15 – 12:00

SECOND ASSIGNMENT

Deadline: 02/03/17

Class 13

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 27/2 10:15 – 12:00

Hidden Markov Models

Viterbi

Class 13 Presentation

Class 13 (printer friendly version)

Class 14

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 2/3 10:15 – 12:00

Hidden Markov Models
Training and Evaluating HMMs
Class 14 Presentation

Class 14 (printer friendly version)

Class 15

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Mon 6/3 10:15 – 12:00

Class 15 Presentation

Class 15 (printer friendly version)

Class 16

Lecturers: Charalambos (Haris) Themistocleous

Room: T346

Thu 9/3 10:15 – 12:00

Class 16 Presentation

Class 16 (printer friendly version)

Class 17

Lecturers: Mehdi Ghanimifard

Room: Lab 4

Mon 13/3 10:15 – 12:00

Class 18

Lecturers: Charalambos (Haris) Themistocleous

Room: Lab 4

Thu 16/3 10:15 – 12:00

Neural Networks
Deep Neural Networks
Class 18 Presentation

Class 18 (printer friendly version)

 

 

 

 

Course Literature

Course Books

  • Christopher Manning and Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press. Also see the book’s supplemental materials website at Stanford.
  • Joseph K. Blitzstein and Jessica Hwang (2014). Introduction to Probability. London: CRC Press, Taylor & Francis.
  • Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani (2013). An Introduction to Statistical Learning. Springer. Available online from the authors here. Slides and videos for the Statistical Learning MOOC by Hastie and Tibshirani are available separately here. Slides and video tutorials related to this book by Abass Al Sharif can be downloaded here.

Complementary Textbooks

  • Daniel Jurafsky and James H. Martin (2008). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd ed.). Prentice Hall.
  • Stuart J. Russell and Peter Norvig (2009). Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River, NJ: Prentice Hall. ISBN 0-13-604259-7.

Resources

Course Description

7.5 higher education credits (hec), 2nd semester, 1st study period

This course introduces probabilistic modeling and statistical methods and their use in language technology. It covers the following topics:

  • Probability theory
  • Information theory
  • Statistical theory (sampling, estimation, hypothesis testing)
  • Language modeling
  • Part-of-speech tagging
  • Syntactic parsing
  • Word sense disambiguation
  • Machine translation
  • Evaluation

Elective course offered by the programme for students taking the one-year degree: Degree of Master of Arts (60 credits) in Language Technology (Filosofie magisterexamen i språkteknologi).

Course Syllabus

The full course syllabus, as adopted by the head of department, can be downloaded as a PDF.

Course syllabus in English

Course syllabus in Swedish

Application

The course can be offered as a freestanding single-subject course for students not on the MLT programme. Information on application deadlines and admissions is available in the university course catalogue: