One of CS230's main goals is to prepare you to apply machine learning algorithms to real-world tasks, or to leave you well-qualified to start machine learning or AI research. The final project is intended to start you in these directions.
The teaching team has put together a github repository with project code examples, including a computer vision and a natural language processing example (both in Tensorflow and Pytorch). There is also a series of posts to help you familiarize yourself with the project code examples, get ideas on how to structure your deep learning project code, and to setup AWS. The code examples posted are optional and are only meant to help you with your final project. The code can be reused in your projects, but the examples presented are not complex enough to meet the expectations of a quarterly project. For specific questions about the code/posts, please drop by Guillaume Genthial (Tensorflow NLP), Olivier Moindrot (Tensorflow Vision), Surag Nair (PyTorch) or Russell Kaplan's (AWS setup) office hours.
This quarter in CS230, you will learn about a wide range of deep learning applications. Part of the learning will be online, during in-class lectures and when completing assignments, but you will really experience hands-on work in your final project. We would like you to choose wisely a project that fits your interests. One that would be both motivating and technically challenging.
Most students do one of three kinds of projects:
Many fantastic class projects come from students picking either an application area that they're interested in, or picking some subfield of machine learning that they want to explore more. So, pick something that you can get excited and passionate about! Be brave rather than timid, and do feel free to propose ambitious things that you're excited about. (Just be sure to ask us for help if you're uncertain how to best get started.) Alternatively, if you're already working on a research or industry project that deep learning might apply to, then you may already have a great project idea.
A very good CS230 project will be a publishable or nearly-publishable piece of work. Each year, some number of students continue working on their projects after completing CS230, submitting their work to a conferences or journals. Thus, for inspiration, you might also look at some recent deep learning research papers. Two of the main machine learning conferences are ICML and NIPS. You can find papers from recent ICML conferences online: (here). All NIPS papers are online: (here). Finally, looking at class projects from other machine learning/ deep learning classes is a good way to get ideas (CS229 Fall 2017, CS231N Spring 2017, CS224N Winter 2017).
Once you have identified a topic of interest, it can be useful to look up existing research on relevant topics by searching related keywords on an academic search engine such as: http://scholar.google.com. Another important aspect of designing your project is to identify one or several datasets suitable for your topic of interest. If that data needs considerable pre-processing to suit your task, or that you intend to collect the needed data yourself, keep in mind that this is only one part of the expected project work, but can often take considerable time. We still expect a solid methodology and discussion of results, so pace your project accordingly.
Notes on a few specific types of projects:
This section contains the detailed instructions for the different parts of your project.
Submission: We’ll be using Gradescope for submission of all four parts of the final project. We’ll announce when submissions are open for each part. You should submit on Gradescope as a group: that is, for each part, please make one submission for your entire project group and tag your team members.
We will not be disclosing the breakdown of the 40% that the final project is worth amongst the different parts, but the poster and final report will combine to be the majority of the grade. Projects will be evaluated based on:
In the proposal, below your project title, include the project category. The category can be one of:
|Project mentors||Based off of the topic you choose in your proposal, we’ll suggest a project mentor given the areas of expertise of the TAs. This is just a recommendation; feel free to speak with other TAs as well.|
Your proposal should be a PDF document, giving the title of the project, the project category, the full names of all of your team members, the SUNet ID of your team members, and a 300-500 word description of what you plan to do.Your project proposal should include the following information:
|Grading||The project proposal is mainly intended to make sure you decide on a project topic and get feedback from TAs early. As long as your proposal follows the instructions above and the project seems to have been thought out with a reasonable plan, you should do well on the proposal.|
|Submission||Submit on Gradescope (see description under deadline for instructions)|
The milestone will help you make sure you're on track, and should describe what you've accomplished so far, and very briefly say what else you plan to do. You should write it as if it's an “early draft" of what will turn into your final project. You can write it as if you're writing the first few pages of your final project report, so that you can re-use most of the milestone text in your final report. Please write the milestone (and final report) keeping in mind that the intended audience is Profs. Ng and Katanforoosh and the TAs. Thus, for example, you should not spend two pages explaining what logistic regression is. Your milestone should include the full names of all your team members and state the full title of your project. Note: We will expect your final writeup to be on the same topic as your milestone. In order to help you the most, we expect you to submit your running code. Your code should contain a baseline model for your application. Along with your baseline model, you are welcome to submit additional parts of your code such as data pre-processing, data augmentation, accuracy matric(s), and/or other models you have tried. Please clean your code before submitting, comment on it, and cite any resources you used. Please do not submit your dataset. However, you may include a few samples of your data in the report if you wish.
|Contributions||Please include a section that describes what each team member worked on and contributed to the project. This is to make sure team members are carrying a fair share of the work for projects.|
|Grading||The milestone is mostly intended to get feedback from TAs to make sure you’re making reasonable progress. As long as your milestone follows the instructions and you seem to have tested any assumptions which might prevent your team from completing the project, you should do well on the milestone.|
Your milestone should be at most 3 pages, excluding references. Similar to to the proposal, it should include
|Submission||Submit on Gradescope. Code can either be a link to a github repository, or the relevant files themselves.|
The class projects will be presented at a poster presentation, at a location and time that will be announced later. Each team should prepare a poster, and be prepared to give a very short explanation, in front of the poster, about their work. At the poster session, you'll also have an opportunity to see what everyone else did for their projects. We will supply poster-boards and easels for displaying the posters.
|Format||Here are some poster guidelines (please note that 36x24 in means 36 in wide by 24 in tall, i.e. it's better if your poster is formatted landscape). You can also look at posters from previous years in CS229/CS231N/CS224N. Note: Despite example given in guidelines, posters with nice, illustrative figures are preferred over posters with lots of text.|
|Grading||We will be grading posters on the poster quality and clarity, the technical content of the poster, as well as the knowledge demonstrated by the team when discussing their work with teaching staff at the poster session.|
Because the teaching staff will have only a few hours to see a large number of posters at the poster session, we'll only be able to get an overview of the work you did at the session. We know that most students work very hard on the final projects, and so we are extremely careful to give each writeup ample attention, and read and try very hard to understand everything you describe in it.
After the class, we will also post all the final writeups online so that you can read about each other's' work. If you do not want your write-up to be posted online, then please create a private Piazza post or contact us at email@example.com at least a week in advance of the final submission deadline.
Final project writeups can be at most 5 pages long (including appendices and figures). We will allow for extra pages containing only references. If you did this work in collaboration with someone else, or if someone else (such as another professor) had advised you on this work, your write-up must fully acknowledge their contributions. For shared projects, we also require that you submit the final report from the class you're sharing the project with. You are strongly encouraged to use this format (here's a link to the overleaf files). If you are not using this format, make sure to include all sections given in the format.
|Contributions||Please include a section that describes what each team member worked on and contributed to the project.|
|Code||Please include a zip file or peferably a link to a Github repository with the code for your final project. You do not have to include the data or additional libraries (so if you submit a zip file, it should not exceed 5MB).|
|Grading||The final report will be judged based off of the clarity of the report, the relevance of the project to topics taught in CS230, the novelty of the problem, and the technical quality and significance of the work.|
Deadlines are listed in the project page and on the schedule page of the website.
No, we don't restrict you to only use methods/topics/problems taught in class. That said, you can always consult a TA if you are unsure about any method or problem statement.
We don't mind you using a dataset that is not public, as long as you have the required permissions to use it. We don't require you to share the dataset either as long as you can accurately describe it in the Final Report.
No, but please explicitly state the work which was done by team members enrolled in CS230 in your proposal, milestone and final report. This extends to projects that were done in collaboration with research groups as well.
We recommend teams of 3 students, while teams sizes of 1 or 2 are also acceptable. The team size will be taken under consideration when evaluating the scope of the project in breadth and depth, meaning that a three-person team is expected to accomplish more than a one-person team would.
The reason we encourage students to form teams of 3 is that, in our experience, this size usually fits best the expectations for the CS230 projects. In particular, we expect the team to submit a completed project (even for team of 1 or 2), so keep in mind that all projects require to spend a decent minimum effort towards gathering data, and setting up the infrastructure to reach some form of result. In a three-person team this can be shared much better, allowing the team to focus a lot more on the interesting stuff, e.g. results and discussion.
In exceptional cases, we can allow a team of 4 people. If you plan to work on a project in a team of 4, please come talk to one of the TAs beforehand so we can ensure that the project has a large enough scope.
The term project is 40% of the final grade.