Slides: (slides)

Getting Started with Your Project

What is the CS 230 Project?

CS230’s main goal is to prepare you to apply deep learning algorithms to real-world tasks, and leave you well-qualified to start deep learning and AI research. The final project jumpstarts you in these directions, and is the main focus of the class.

Your project will be done in groups of 1 to 3 students (4 with TA approval), and has the following steps:

Proposal (Due October 8, Tuesday 11:59 PM)
Milestone (Due November 15, Friday 11:59 PM)
Final Video (Due December 3, Tuesday 11:59 PM)
Final Report (Due December 3, Tuesday 11:59 PM)

To help guide your project, TAs will host project office hours (15 mins per group, per week) with mandatory meetings for the first meeting, week after the proposal, week after the milestone, and week before the final submission.

The CS230 project can be combined with other classes’ final projects, but must be deep learning focused.

Types of Projects

Historically, students who have built successful CS230 projects have done one of the following:

Used popular network architectures to perform a novel task. Students have used the YOLO algorithm to detect humans on drone footage. They then cropped the humans out of the pictures and filled in the humans’ positions with generated background using a Generative Adversarial Network. Students have also fine-tuned existing networks to achieve state-of-the-art accuracy in Tree Species Identification.
Came up with custom neural network architectures to perform an existing (or novel) task. Students made changes to the U-Net neural network to improve performance on tasks such as brain tumor segmentation. Students have also improved task accuracy by adding and training an attention mechanism on top of an existing RNN architecture.
Re-implemented a famous research paper. For instance, students have reimplemented the popular WaveNet algorithm from scratch in a different deep learning framework.
Done a research project. A past group designed a neural network to debias word vectors using a novel algorithm. They then submitted their paper to a conference.

Note that the common denominator of these projects is that students have contributed something novel. Of course, the above categories are not the only ones. You’re encouraged to talk with your project mentor to figure out if your project idea is aligned with CS230 expectations.

Steps to Build a Deep Learning Project

Step 1: Choose a Project Idea

Look at past projects from CS230 and other Stanford machine learning classes (CS229, CS229A, CS221, CS224N, CS231N).
Look at recent research papers in deep learning using an academic search engine such as Google Scholar, searching through main machine learning conferences such as ICML and NeurIPS, or going through this blog.

Step 2: Literature Review

Find related research papers and Github repositories (We’ll look at this in section 2!).

For specific knowledge topics, check out these great GitHub repositories:

Awesome NLP
Awesome GAN
Awesome DL
Awesome CV (also has other awesome lists!)
Awesome Transfer Learning

Step 3: Find a Dataset

The focus of CS230’s project is on developing deep learning models. Finding the right application and dataset is very important, but it is only the first piece in the puzzle. Don’t spend too much time on data collection. In order to maximize your time spent on developing models, it might be a good idea to look at well defined tasks and datasets.

Build a new dataset

Use a web scraping library such as MechanicalSoup.
Generate synthetic data from a model currently being used in your research.
Collect data yourself (Kian once spent two days on campus collecting short recordings of students saying “activate” to build a speech recognition algorithm).

Use a pre-existing dataset

Use Google’s Dataset Search tool.
Look through Kaggle’s datasets and competitions (current and past).
Use a dataset from your own research.
Use a standard benchmarked dataset for a specific task.

Computer Vision

Examples of computer vision tasks (source)

Image Classification
- Imagenet: 14M images and more than 20k classes.
- MNIST: Dataset of handwritten digits with 10 classes. 70k low resolution images (50Mb)
- CIFAR 10/100: Dataset with 60k low resolution images (10 and 100 classes respectively)
Object Detection
- COCO: Dataset for object detection, image segmentation and image captioning. It has more than 200k images with 80 object categories.
- Pascal VOC: Dataset of 20k images labelled with bounding boxes and 20 classes.
- Open Images: 9M images that have been annotated with image-level labels and object bounding boxes.

Sequence Models

Examples of NLP and speech applications (source)

Natural Language Processing
- SQuAD: The Stanford Question Answering Dataset
- Google Books n-grams
- Yelp Open Dataset: A subset of Yelp’s businesses, reviews, and user data.
Time Series
- Yahoo Finance API
Video Classification
Music
- Million Song: A collection of audio features and metadata for a million contemporary popular music tracks.
Computational Biology
- Deep Bind

Miscellaneous

Results of the image-to-image translation model CycleGAN Zhu, Park et al. (source)

Generative Adversarial Networks
- CelebA: a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations.
Recommendation Systems and Autoencoders
- MovieLens: 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users.
- Netflix Prize Challenge:
Adversarial Attack and Defense
- CleverHans: A Python library to benchmark machine learning systems’ vulnerability to adversarial examples.
Deep Reinforcement Learning
- OpenAI Gym: A toolkit for developing and comparing reinforcement learning algorithms.
Style Transfer

Questions to ask when choosing a dataset

How much data do I need?

In general, the more data the better: tens of thousands is a good number for many tasks (exception in medical imaging).

Here’s some more intuition. The amount of data required depends on:

Complexity of the task (output, number of features…). For instance, the amount of data needed generally increases with the number of classes in classification.
- Binary classification cat/non cat (simplest)
- 10 classes classification (more complex)
- Image classification for ImageNet: 14 million images, more than 20k classes (most complex)
Complexity of the model (number of parameters): shallow / deep neural network.
Reference papers related to your algorithm and specific application (ex. Generative Adversarial Networks)

How diverse and balanced is the dataset?

For instance, for classification, we may want to collect more data for the class which has more errors (to help the network better understand that class).

What if there isn’t enough data?

Find or collect more data (or, use a different dataset)
Try data augmentation techniques
Transfer learning

We’ll cover many of these concepts later on in the course!

How clean is the dataset?

Ideally, you’re working with a dataset that requires minor cleaning and preprocessing, so that you can focus on the neural networks. However, if you need to spend significant time preparing the data, we will take it into account when evaluating your project!

Is the dataset academic, licensed, or proprietary?

Academic datasets are great for academic, non-commercial use
If you’re working with healthcare data, keep in mind privacy and legal concerns
Dataset licenses: CC, CC BY, CC BY-NC (non-commercial use)
For source code, open source licenses include: MIT, Apache, GNU GPL

Step 4: Choose an appropriate evaluation metric and build your first model

Step 5: Iterate on your model and evaluation to improve your results

We’ll talk more about these last two steps later in the course and sections!

Q&A

As you progress on your project, your project mentor will answer your specific questions and help guide you towards a successful project. Have fun getting started, we’re excited to see what you come up with!

Section 1 (Week 1)

Instructors

Time and Location