
Word2vec: Assignment 3 (Extracurricular)

This homework is optional. Submit on Carmen by April 20 (11:59pm); no late submissions or alternative submission methods will be accepted for this extracurricular assignment. You may complete this homework in a group of two students or on your own.

Due to the time constraints of this 2-credit special topics course, we only had time to walk quickly through some basic building blocks of deep learning in class. However, you can gain a much better understanding by following the provided readings and implementing the algorithms in this assignment:

  • A softmax function and a sigmoid function
  • A simple neural network with backpropagation
  • Word2vec models (Skip-gram, CBOW) with negative sampling (see the sketch after this list)
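
To make these pieces concrete, here is a minimal NumPy sketch of a numerically stable softmax, the sigmoid, and the Skip-gram negative-sampling loss for a single (center, outside) word pair. The function names and argument shapes below are illustrative assumptions; the starter code defines its own signatures, which you should follow.

    import numpy as np

    def softmax(x):
        # Numerically stable softmax: subtract the row-wise max before exponentiating.
        shifted = x - np.max(x, axis=-1, keepdims=True)
        exp = np.exp(shifted)
        return exp / np.sum(exp, axis=-1, keepdims=True)

    def sigmoid(x):
        # Element-wise logistic sigmoid.
        return 1.0 / (1.0 + np.exp(-x))

    def neg_sampling_loss(v_c, u_o, u_neg):
        # Skip-gram negative-sampling loss for one training pair (illustrative shapes):
        #   v_c:   center word vector, shape (d,)
        #   u_o:   outside (context) word vector, shape (d,)
        #   u_neg: K sampled negative word vectors, shape (K, d)
        pos = sigmoid(np.dot(u_o, v_c))       # score for the true context word
        neg = sigmoid(-u_neg.dot(v_c))        # scores for rejecting the negative samples
        return -np.log(pos) - np.sum(np.log(neg))

In the assignment you will also derive and implement the gradients of these quantities with respect to the word vectors, which is where backpropagation comes in.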

You will also apply your implemented Skip-gram model with negative sampling to train semantic word vectors on the Stanford Sentiment Treebank (SST), and visualize them.
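
For the visualization step, the starter notebook provides its own plotting code; the sketch below only illustrates one common approach, projecting the trained vectors to two dimensions with an SVD (i.e., PCA) and scatter-plotting them with matplotlib. It assumes word_vectors is an array with one trained vector per row and words is the list of corresponding word strings; both names are hypothetical.

    import numpy as np
    import matplotlib.pyplot as plt

    def plot_word_vectors(word_vectors, words, filename="word_vectors.png"):
        # Center the vectors and take the top two right singular directions (PCA).
        centered = word_vectors - word_vectors.mean(axis=0)
        _, _, Vt = np.linalg.svd(centered, full_matrices=False)
        coords = centered.dot(Vt[:2].T)       # (num_words, 2) coordinates

        plt.figure(figsize=(8, 8))
        plt.scatter(coords[:, 0], coords[:, 1], marker="x")
        for i, word in enumerate(words):
            plt.annotate(word, (coords[i, 0], coords[i, 1]))
        plt.savefig(filename)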

You will find this documentation very helpful.

Instructions

You will use Jupyter Notebook, an interactive Python programming and data visualization tool, for this homework. You can follow this guide to install Anaconda, which conveniently includes Python, Jupyter Notebook, and other commonly used packages for scientific computing and data science. The starter code is tested with Python 3.4, which you are recommended to use.

You can download the homework assignment zip file from Piazza; it contains the starter code, some visualizations, and explanatory text. Then open the included *.ipynb file in Jupyter Notebook and follow all the instructions to do your work there.