CSCI 544 - Applied Natural Language Processing

Spring 2015
Kenji Sagae



Time:

MW 3:30pm-4:50pm

Location:

SAL 101

Instructor:

Kenji Sagae
Office hours: Wednesday 2:30-3:30pm, PHE 514
or by appointment

Teaching Assistant:

Justin Garten
Office hours: Tuesday 5:00-7:00pm, Leavey (LVL17)



Goals:

This course covers both fundamental and cutting-edge topics in Natural Language Processing (NLP) and provides students with hands-on experience in NLP applications.

Audience:

This graduate course is intended for:

Prerequisites:

Proficiency in programming, algorithms, and data structures; basic knowledge of linear algebra and machine learning.

Related Courses:

This course is part of USC's curriculum in natural language processing. There is a sister course, CSCI 662 Advanced Natural Language Processing, offered in the Fall semester, which covers complementary (and advanced) material and is intended for PhD students (or students who want to continue to a PhD program).

Coursework:

Students will work with real datasets and will build their own language processing, text classification and sentiment analysis systems. Grades will be based on:

Homework and Project Guidelines

Homework 0
Homework 1
Homework 2
Homework 3
Project


Spring 2015 Schedule

Date Instructor Lecture
January 12 Sagae Introduction and basic concepts
January 14 Sagae Text Classification (Naive Bayes)

Reading: Manning, Raghavan and Schütze, Introduction to Information Retrieval, Chapter 13
January 19 MLK Holiday
January 21 Sagae More classification (Perceptron)

Reading: Hal Daume III, A course in machine learning, Chapter 3
January 26 Sagae Sequence labeling (perceptron, POS tagging)
January 28 Sagae Part-of-speech tagging

Reading: Ratnaparkhi, A maximum entropy model for part-of-speech tagging
February 2 Sagae Shallow Parsing, NER and NLP tools

Reading: Sha and Pereira, Shallow parsing with conditional random fields
Tjong Kim Sang and Buchholz, Introduction to the CoNLL-2000 Shared Task: Chunking
Tjong Kim Sang, Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition
February 4 Sagae Parsing
February 9 Sagae Parsing
February 11 Sagae PCFG

Reading: Charniak (1997), Statistical Techniques for Natural Language Parsing
Mark Johnson (1998), PCFG models of linguistic tree representations
February 16 Presidents' Day
February 18 Sagae Shift-Reduce Parsing, Dependency Parsing

Reading: Nivre (2008), Algorithms for deterministic incremental dependency parsing
Sagae and Lavie (2005), A classifier-based parser with linear run-time complexity
February 23 Sagae Semantic Role Labeling

Reading: Gildea and Palmer (2002), The necessity of parsing for predicate argument recognition
Propbank information
CoNLL shared tasks on dependency-based semantic role labeling: 2008 and 2009
February 25 Sagae Language Modeling

Reading: Chen and Goodman (1998), An Empirical Study of Smoothing Techniques for Language Modeling
Links to LM toolkits:
OpenGRM
SRI Language Modeling Toolkit
March 2 Sagae Speech Acts

Reading: Stolcke et al. (2000), Dialogue act modeling for automatic tagging and recognition of conversational speech
March 4 Sagae Named Entity Recognition revisited, Information Extraction
March 9 Sagae Named Entity Discrimination
March 11 Sagae Class project discussion, clustering

Reading: Manning, Raghavan and Schütze, Introduction to Information Retrieval, Chapter 15 (flat clustering) and Chapter 16 (hierarchical clustering)
March 16 Spring Break
March 18 Spring Break
March 23 Sagae Word classes, knowledge representation
March 25 Sagae Knowledge and information extraction
March 30 Sagae Discourse
April 1 Sagae Domain adaptation
April 6 Sagae Domain adaptation II

Reading: McClosky, Charniak and Johnson, Automatic Domain Adaptation for Parsing, NAACL 2010
Daume III, Frustratingly easy domain adaptation, ACL 2007
April 8 Garten Distributed word representations
April 13 Sagae NLP for Social Media, review
April 15 Sagae NLP applications
April 20 Georgila Speech Synthesis
April 22 Sagae Wrap up, the road ahead
April 27 Class Presentations (MPH 101)
April 29 Class Presentations (MPH 101)

Setup guides


Statement for Students with Disabilities:

Any student requesting academic accommodations based on a disability is required to register with Disability Services and Programs (DSP) each semester. A letter of verification for approved accommodations can be obtained from DSP. Please be sure the letter is delivered to me (or to the TA) as early in the semester as possible. DSP is located in STU 301 and is open 8:30 a.m.-5:00 p.m., Monday through Friday. The phone number for DSP is (213) 740-0776.

Statement on Academic Integrity:

USC seeks to maintain an optimal learning environment. General principles of academic honesty include the concept of respect for the intellectual property of others, the expectation that individual work will be submitted unless otherwise allowed by an instructor, and the obligations both to protect one's own academic work from misuse by others and to avoid using another's work as one's own. All students are expected to understand and abide by these principles. Scampus, the Student Guidebook, contains the Student Conduct Code in Section 11.00, while the recommended sanctions are located in Appendix A: http://www.usc.edu/dept/publications/SCAMPUS/gov/. Students will be referred to the Office of Student Judicial Affairs and Community Standards for further review, should there be any suspicion of academic dishonesty. The review process is described at: http://www.usc.edu/student-affairs/SJACS/.