Subject to change as the term progresses.

Lecture Topic Readings
August 24, 2016 Introduction
  • Introduction of social media and natural language processing research
  • Overview of the course
August 31, 2016 Twitter and Twitter API Tutorial
  • Brief history of Twitter
  • Key features of Twitter
  • Hands-on instructions on obtaining Twitter data via APIs
September 7, 2016 Language Identification and Naïve Bayes
  • Domain/Genre Difference
  • Language Identification
  • Supervised Learning and Classification
  • Naïve Bayes Algorithm + feature selection (Information Gain)
September 14, 2016 Paraphrase Data Sources
  • Overview of paraphrase research
  • WordNet, DIRT, MRPC (Microsoft Research Paraphrase Corpus), PPDB (Paraphrase Database), etc.
September 21, 2016 no class - due to Engineering Expo
September 26, 2016 (Monday) Invited Talk by Jiwei Li (Stanford)
  • 4:00 -- 5:00pm, Dreese 480
  • Teaching a Machine to Converse
September 28, 2016 Paraphrase Identification and Logistic Regression
  • Linear Regression
  • Cost Function
  • Gradient Descent
September 30, 2016 (Friday) AI Seminar Talk by Jeniya Tabassum (OSU)
  • 3:00 -- 4:00pm, Dreese 480
  • A Minimally Supervised Method for Recognizing and Normalizing Time Expressions in Twitter
October 5, 2016 Paraphrase Identification and Logistic Regression (cont'd)
  • Logistic Regression
  • Decision Boundary
October 12, 2016 Twitter Paraphrase and Latent Variable Model
  • Evaluation: Precision, Recall, F-measure
  • Multiple Instance Learning
  • Conditional Log-linear Model with Latent Variables
  • Crowdsourcing
October 18, 2016 (Tuesday) Invited Talk by Margaret Mitchell
  • 4:00 -- 5:00pm, Dreese 480
  • From Naming Concrete Objects to Sharing Abstract Thought: Vision-to-Language Begins to Grow Up
October 19, 2016 Twitter Paraphrase and Latent Variable Model (cont'd)
October 26, 2016 Automatic Summarization for Twitter and the PageRank Algorithm
  • SumBasic algorithm
  • PageRank algorithm
  • Graph visualization
  • Event-based summarization system
November 2, 2016 no class - The instructor will be attending EMNLP in Austin
November 9, 2016 Tokenization, Normalization, POS/NE Tagging for Twitter
  • Challenges in processing social media text
  • Tokenization, Emoticons
  • Noisy Text Normalization
  • Part-of-speech tagging
  • Named entity recognition
  • Sequential tagging methods (HMM and CRF)
November 16, 2016 Vector Semantics
  • Unsupervised Learning
  • Class-based Clustering: Brown Clusters
  • Soft Clustering: Singular Value Decomposition (SVD)
  • Neural Word Embeddings: Word2vec (CBOW and Skip-gram)
November 23, 2016 no class - Thanksgiving
November 30, 2016 Deep Learning for NLP
  • Neural Network Basics: Neuron, Activation Function, Non-linearity, Learning
  • Recurrent Neural Network
  • Long Short-Term Memory Networks
  • Neural Machine Translation
  • Neural Conversation Generation
December 7, 2016 Review and Q&A
  • I will go over course material (lecture slides or homework) that students have questions about or that we did not have time to cover in detail.
  • I will also have extended office hours that day for 1-on-1 meetings on research project review, Q&A, etc.