Subject to change as the 2020 Spring term progresses. ★ marks the required reading.

Lecture Topic Readings
January 10, 2020 Introduction
  • Introduction of social media and natural language processing research
  • Overview of the course
January 17, 2020 Twitter and Twitter API Tutorial [Reading 1 due]
  • Brief history of Twitter
  • Key features of Twitter
  • Hands-on instructions on obtaining Twitter data via APIs
January 24, 2020 Language Identification and Naïve Bayes [Reading 2 due]
  • Language Identification
  • Supervised Learning
  • Classification
  • Naïve Bayes Algorithm + feature selection (Information Gain)
January 31, 2020 Tokenization, Normalization, POS/NE Tagging for Twitter [Reading 3 due]
  • Tokenization, Emoticons
  • Noisy Text Normalization
  • Named Entity Recognition
  • Weakly Supervised Learning
February 14, 2020 Vector Semantics [Reading 4 due]
  • Unsupervised Learning
  • Class-based Clustering: Brown Clusters
  • Soft Clustering: Singular Value Decomposition (SVD)
  • Neural Word Embeddings: Word2vec (CBOW and Skip-gram)
February 21, 2020 Paraphrase Data Sources [Reading 5 due]
  • Overview of paraphrase research
  • WordNet, DIRT, MRPC (Microsoft Research Paraphrase Corpus), PPDB (Paraphrase Database), etc.
Fall 2019
August 22, 2019 Introduction
  • Introduction of social media and natural language processing research
  • Overview of the course
August 28, 2019 (Wednesday) AI Seminar by Mounica Maddela
  • 4:00 -- 5:00pm, Dreese 480
August 29, 2019 Twitter and Twitter API Tutorial
  • Brief history of Twitter
  • Key features of Twitter
  • Hands-on instructions on obtaining Twitter data via APIs
September 5, 2019 Language Identification and Naïve Bayes [Reading 1 due]
  • Language Identification
  • Supervised Learning
  • Classification
  • Naïve Bayes Algorithm + feature selection (Information Gain)
September 12, 2019 Tokenization, Normalization, POS/NE Tagging for Twitter [Reading 2 due]
  • Tokenization, Emoticons
  • Noisy Text Normalization
  • Named Entity Recognition
  • Weakly Supervised Learning
September 19, 2019 Vector Semantics [Quiz 1 due; Reading 3 due]
  • Unsupervised Learning
  • Class-based Clustering: Brown Clusters
  • Soft Clustering: Singular Value Decomposition (SVD)
  • Neural Word Embeddings: Word2vec (CBOW and Skip-gram)
September 26, 2019 Automatic Summarization [Quiz 2 & Reading 4 due]
  • PageRank algorithm
  • Event-based summarization
October 3, 2019 Recent Research Highlights [Reading 5 due]
  • Pairwise neural ranking model
  • Computational sociolinguistics
October 17, 2019 Paraphrase Data Sources [Reading 6 due]
  • Overview of paraphrase research
  • WordNet, DIRT, MRPC (Microsoft Research Paraphrase Corpus), PPDB (Paraphrase Database), etc.
October 24, 2019 Paraphrase Identification and Linear Regression [Reading 7 due]
  • Linear Regression
  • Cost Function
  • Gradient Descent
October 31, 2019 Paraphrase Identification and Logistic Regression [Reading 8 due]
  • Logistic Regression
  • Decision Boundary
November 7, 2019 Deep Learning for NLP (part 1) [Reading 9 due]
  • Neural Network Basics: Neuron, Activation Function, Non-linearity, Learning
  • Recurrent Neural Network
  • Long Short-Term Memory Networks
  • Neural Machine Translation
  • Neural Conversation Generation
November 14, 2019 Deep Learning for NLP (part 2) [Reading 10 due]
  • Attention Model
  • Convolutional Neural Networks
  • A Case Study: Neural Models for Twitter Paraphrase Identification
November 21, 2019 Sentiment Analysis [Reading 11 due]
  • Sentiment Analysis
  • A Case Study: Social Attention Model
Spring 2018
January 11, 2018 Introduction
January 18, 2018 Twitter and Twitter API Tutorial [Quiz1 due]
  • Brief history of Twitter
  • Key features of Twitter
  • Hands-on instructions on obtaining Twitter data via APIs
January 25, 2018 Language Identification and Naïve Bayes [Reading 1 due]
  • Domain/Genre Difference
  • Language Identification
  • Supervised Learning and Classification
  • Naïve Bayes Algorithm + feature selection (Information Gain)
February 1, 2018 Tokenization, Normalization, POS/NE Tagging for Twitter [Homework 1 due, Reading 2 due]
  • Tokenization, Emoticons
  • Noisy Text Normalization
February 8, 2018 POS/NE Tagging for Twitter [No Summary Due]
  • Part-of-speech tagging
  • Named entity recognition
February 15, 2018 Paraphrase Data Sources [Read3 due]
  • Overview of paraphrase research
  • WordNet, DIRT, MRPC (Microsoft Research Paraphrase Corpus), PPDB (Paraphrase Database), etc.
February 22, 2018 Paraphrase Identification and Linear Regression [No Summary Due]
  • Linear Regression
  • Cost Function
  • Gradient Descent
March 1, 2018 Paraphrase Identification and Logistic Regression [No Summary Due]
  • Logistic Regression
  • Decision Boundary
March 8, 2018 Vector Semantics [Read4 due]
  • Unsupervised Learning
  • Class-based Clustering: Brown Clusters
  • Soft Clustering: Singular Value Decomposition (SVD)
  • Neural Word Embeddings: Word2vec (CBOW and Skip-gram)
March 29, 2018 Predicting Things from Text
April 5, 2018 Discovering User Attribute Stylistic Differences via Paraphrasing (Guest Lecture by Wei Xu) [Read5 due]
April 12, 2018 Deep Learning for NLP
  • Neural Network Basics: Neuron, Activation Function, Non-linearity, Learning
  • Recurrent Neural Network
  • Long Short-Term Memory Networks
  • Neural Machine Translation
  • Neural Conversation Generation
April 19, 2018 Sentiment Analysis, Convolutional Neural Networks and Attention [Read6 due]
  • Sentiment Analysis
  • Attention Model
  • Convolutional Neural Networks
Fall 2017
August 22, 2017 Introduction
  • Introduction of social media and natural language processing research
  • Overview of the course
August 29, 2017 Twitter and Twitter API Tutorial [Quiz1 due]
  • Brief history of Twitter
  • Key features of Twitter
  • Hands-on instructions on obtaining Twitter data via APIs
September 5, 2017 no class [HW1 due]
  • Instructor traveling for conferences (EMNLP and W-NUT)
September 12, 2017 Guest Lecture by Pravar Mahajan [Read1 due]
  • Python Numpy Tutorial
September 15, 2017 drop deadline
September 19, 2017 Language Identification and Naïve Bayes [Read2 due]
  • Domain/Genre Difference
  • Language Identification
  • Supervised Learning and Classification
  • Naïve Bayes Algorithm + feature selection (Information Gain)
September 26, 2017 Guest Lecture by Alan Ritter [Read3 due]
October 3, 2017 Paraphrase Data Sources [Read4 & Quiz2 due]
  • Overview of paraphrase research
  • WordNet, DIRT, MRPC (Microsoft Research Paraphrase Corpus), PPDB (Paraphrase Database), etc.
October 10, 2017 Guest Lecture by Jeniya Tabassum [Read5 due]
October 17, 2017 Paraphrase Identification and Linear Regression
  • Linear Regression
  • Cost Function
  • Gradient Descent
October 24, 2017 Tokenization, Normalization, POS/NE Tagging for Twitter
  • Tokenization, Emoticons
  • Noisy Text Normalization
  • Part-of-speech tagging
  • Named entity recognition
October 27, 2017 withdraw deadline
October 31, 2017 Paraphrase Identification and Logistic Regression [Read6 & Quiz3 due Oct 30]
  • Logistic Regression
  • Decision Boundary
November 7, 2017 Brainstorm Research Ideas [Slides due Nov 6]
  • 2:20-2:50pm 1-minute student presentations
  • 2:50-3:10pm group discussions (round 1)
  • 3:10-3:30pm group discussions (round 2)
  • 3:30-3:40pm break and wrapping up
  • 3:40-4:10pm 5-minute group presentations
November 14, 2017 Vector Semantics [Quiz4 & Read7 due - last Quiz]
  • Unsupervised Learning
  • Class-based Clustering: Brown Clusters
  • Soft Clustering: Singular Value Decomposition (SVD)
  • Neural Word Embeddings: Word2vec (CBOW and Skip-gram)
November 17, 2017 (3:00-4:00pm) Distinguished Guest Speaker Talk by Dr. Lyle Ungar
  • Dreese Labs 480
  • Measuring Psychological Traits using Social Media
November 21, 2017 Deep Learning for NLP [HW2-part1 due Nov 20]
  • Neural Network Basics: Neuron, Activation Function, Non-linearity, Learning
  • Recurrent Neural Network
  • Long Short-Term Memory Networks
  • Neural Machine Translation
  • Neural Conversation Generation
November 28, 2017 (12:45-1:30pm) Clippers Talk by Wuwei Lan
  • Jennings Hall 140
  • Learning Large-scale Paraphrases Continuously from Twitter
November 28, 2017 Automatic Summarization for Twitter and the PageRank Algorithm [Read8 due Nov 30 - last Read]
  • SumBasic algorithm
  • PageRank algorithm
  • Graph visualization
  • Event-based summarization system
December 5, 2017 Sentiment Analysis, Convolutional Neural Networks and Attention [HW2-part2 due Dec 4]
  • Sentiment Analysis
  • Attention Model
  • Convolutional Neural Networks
Fall 2016
September 26, 2016 (Monday) Invited Talk by Jiwei Li (Stanford)
  • 4:00 -- 5:00pm, Dreese 480
  • Teaching a Machine to Converse
October 12, 2016 Twitter Paraphrase and Latent Variable Model
  • Evaluation: Precision, Recall, F-measure
  • Multiple Instance Learning
  • Conditional Log-linear Model with Latent Variables
  • Crowdsourcing
October 18, 2016 (Tuesday) Invited Talk by Margaret Mitchell (Microsoft Research)
  • 4:00 -- 5:00pm, Dreese 480
  • From Naming Concrete Objects to Sharing Abstract Thought: Vision-to-Language Begins to Grow Up