🔠 CS224n: Natural Language Processing with Deep Learning

A distilled compilation of our notes for Stanford's CS224n: Natural Language Processing with Deep Learning.
Natural language processing (NLP) is a crucial part of artificial intelligence (AI), modeling how people share information. In recent years, deep learning approaches have obtained very high performance on many NLP tasks. In this course, students gain a thorough introduction to cutting-edge neural networks for NLP.
The notes are divided by topic into bite-sized sections that are easy to understand!
📌 Table of Contents
Word Vectors
Word Embeddings; TF-IDF; BM25; WordNet; Word2Vec; FastText; GloVe
Parameter-Efficient Fine-Tuning
Adapters; LoRA; QLoRA; Surgical Fine-tuning
Word Vectors
word vectors; bag of words; contextualized embeddings; positional embeddings; Word2Vec (CBOW + Skip-gram)
Context Length in LLMs
RoPE scaling; LongLoRA
NLP Tasks
named entity recognition; dependency parsing
Neural Networks
RNNs; LSTMs; CNNs; regularization; dropout
Regularization
L1; L2; Dropout
Data Sampling
Hard Sampling; Negative Sampling
Language Models
n-gram language model; neural language model; ELMo; GPT-3
AI Text Detection Techniques
DetectGPT; GPTZero; Watermarking
Conversational AI
Applications; Future Direction
Machine Translation
seq2seq; neural machine translation
Hallucination
Prompting; Chain of Verification; Human-in-the-loop
Attention
the bottleneck problem; seq2seq with attention; self-attention; multi-head attention
Knowledge Graphs
knowledge graph overview; entity linking; ERNIE; KGLM
Architectures: Building Blocks
LSTM; RNN; Transformer
Mixture-of-Experts
Architecture; Benefits of MoE; Popular MoE models
Transformer
encoder; decoder; BERT; T5 language model
Preprocessing
Stopwords; Lemmatization
Fine-Tuning
Benefits; Models and how they finetune
Course Info
Course description:
  • Natural language processing (NLP) or computational linguistics is one of the most important technologies of the information age. Applications of NLP are everywhere because people communicate almost everything in language: web search, advertising, emails, customer service, language translation, virtual agents, medical reports, etc.
  • In recent years, deep learning (or neural network) approaches have obtained very high performance across many different NLP tasks, using single end-to-end neural models that do not require traditional, task-specific feature engineering.
  • In this course, students will gain a thorough introduction to cutting-edge research in Deep Learning for NLP. Through lectures, assignments and a final project, students will learn the necessary skills to design, implement, and understand their own neural network models. As piloted last year, CS224n will be taught using PyTorch this year.
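To give a concrete flavor of the kind of end-to-end PyTorch models the notes walk through (e.g., Word2Vec skip-gram with negative sampling from the Word Vectors and Data Sampling sections above), here is a minimal, illustrative sketch. The toy corpus, window size, embedding dimension, and number of negatives are hypothetical choices made only to keep the example self-contained; this is not the course's reference implementation.

```python
# Minimal sketch: Word2Vec skip-gram with negative sampling in PyTorch.
# Toy corpus and hyperparameters are illustrative only.
import torch
import torch.nn as nn

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
word2id = {w: i for i, w in enumerate(vocab)}

# Build (center, context) pairs within a window of 2 words.
window = 2
pairs = []
for i in range(len(corpus)):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            pairs.append((word2id[corpus[i]], word2id[corpus[j]]))

class SkipGramNS(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.center = nn.Embedding(vocab_size, dim)   # "input" vectors v_c
        self.context = nn.Embedding(vocab_size, dim)  # "output" vectors u_o

    def forward(self, center_ids, context_ids, negative_ids):
        v = self.center(center_ids)            # (B, d)
        u_pos = self.context(context_ids)      # (B, d)
        u_neg = self.context(negative_ids)     # (B, k, d)
        pos = torch.sigmoid((v * u_pos).sum(-1))                       # (B,)
        neg = torch.sigmoid(-(u_neg @ v.unsqueeze(-1)).squeeze(-1))    # (B, k)
        # Negative-sampling objective: maximize log sigma(u_o.v_c) + sum_k log sigma(-u_k.v_c)
        return -(torch.log(pos + 1e-9) + torch.log(neg + 1e-9).sum(-1)).mean()

model = SkipGramNS(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.01)
centers = torch.tensor([c for c, _ in pairs])
contexts = torch.tensor([o for _, o in pairs])

for epoch in range(100):
    # 5 negatives per pair, drawn uniformly here; Word2Vec uses a unigram^0.75 distribution.
    negatives = torch.randint(0, len(vocab), (len(pairs), 5))
    loss = model(centers, contexts, negatives)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(model.center.weight[word2id["fox"]])  # learned word vector for "fox"
```

This mirrors the split the notes discuss between center ("input") and context ("output") embedding matrices; after training, the center embeddings are typically kept as the word vectors.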
📸 Credits
Citation
If you found our work useful, please cite it as:
@misc{Chadha2020DistilledNotesCS224n,
  author        = {Jain, Vinija and Chadha, Aman},
  title         = {Distilled Notes for Stanford CS224n: Natural Language Processing with Deep Learning},
  howpublished  = {\url{https://www.aman.ai}},
  year          = {2020},
  note          = {Accessed: 2020-07-01},
  url           = {https://www.aman.ai}
} 

V. Jain and A. Chadha, Distilled Notes for Stanford CS224n: Natural Language Processing with Deep Learning, https://www.aman.ai, 2020, Accessed: July 1, 2020.