Final Project
For this final project, you will work in teams to (1) identify a problem to solve or question to answer, (2) analyze a relevant dataset to achieve that objective, (3) evaluate the performance of your approach, and (4) clearly communicate your findings to a wide audience. Your final video will be evaluated by a panel of judges, and bonus points will be awarded to the winning team.
Proposal
Your team will submit a short project proposal. Your project proposal will be a 1-2 page document that includes (a) the names of the members of your team, (b) a detailed description of the problem/question to solve/answer and why it is important, (c) a description of the dataset you propose using including a succinct visualization of one or more aspects of the data relevant to your proposal, and (d) your proposed machine learning approach and how you will evaluate your performance.
If you’re looking for inspiration on how to select a project idea, start by exploring the active competitions and datasets on kaggle.com. Please stop by office hours if you’d like to discuss specific project ideas or for any other help in selecting your project idea.
Report
Your final project submission will consist of two parts: (1) a written report and (2) a 1-3 minute video communicating the key takeaways from your project. The report should contain the following sections:
- Abstract. [150 words maximum] This should be the one paragraph that captures the significance of what you did and why you did it.
- Introduction. Describe the problem and the value in finding a solution, and motivate your reader as to why they should care about this question. The idea is to get your reader excited about the solution you are about to present.
- Background. This section should cite problems that have been previously addressed that relate to your work, and the key takeaways of the studies that explored that work. The idea here is to place the problem you’re working on in context and to let the reader know that you’re not working in a knowledge vacuum. For finding relevant literature, a good starting point is Google Scholar.
- Data. Describe and visualize your data. Make sure every caption fully describes the figure. You may want to visualize the raw data and/or extracted features. What challenges are inherent to this problem? How might they be overcome? What take away messages can you get simply from visualizing your data?
- Methods. Present your machine learning solution (a description of any preprocessing, feature extraction, classification/regression techniques) and why you made each of the choices you did. Discuss any methods that you didn’t create yourself and please cite relevant literature to support your claims. Also include a flow chart of your methodology so the reader can easily conceptualize your solution. Describe your approach to measuring generalization performance, including what metric(s) you used and why. Imagine that you are writing this section so that someone could recreate your results.
- Results. Include a complete performance assessment that includes your validation approach (cross validation, train/validate/test split, etc.) and the key metrics of performance for the problem (ROC curves, PR curves, confusion matrices if applicable, etc.). You should also compare your outcomes to at least one baseline model (a simple model that shows how much your approach improves on it), in addition to a comparison against random chance guessing in the classification setting. This section should be supported with visualizations, including examples where your method worked well, examples where it failed, and hypotheses supported by evidence as to why in each case.
- Conclusions. It’s critical to have a strong ending and not just let the energy fizzle out of the report. Many readers, if pressed for time, will simply read your abstract and your conclusions. In fact, you may want to start by writing your conclusions. Very succinctly recap the problem you were studying and your approach to the solution. Focus on explaining the key takeaways from your work. These should not be merely a set of bullet points, but fleshed-out conclusions. As you're writing your conclusions, ask yourself: if the reader took nothing else away from your report, what would you most want them to know? Did you identify one particular approach that worked well? Was there a challenge that you faced that opens the door to working on solving a new problem? What avenues of research would you pursue next?
- Roles. Since this is a team project, we want to know what each member's specific contribution was. Provide detail on each individual role and how it contributed to the project. Each team member should clearly articulate an individual role.
- References [no word limits]. An alphabetical list of references cited in this work. A minimum of 10 are required. Consider using the Zotero citation manager for collecting and compiling your references.
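As a rough illustration of the comparison asked for under Results, the sketch below runs k-fold cross validation on a made-up two-class dataset and reports accuracy for a majority-class baseline and a simple nearest-centroid model, next to the nominal random-chance level. Everything here is an assumption made for illustration: the toy data, the helper names (`make_toy_data`, `k_fold_indices`, `majority_baseline`, `centroid_model`, `cross_validated_accuracy`), and the choice of accuracy as the metric. Your project will likely use a real dataset, a library such as scikit-learn, and metrics suited to your problem.

```python
import random
from collections import Counter

def make_toy_data(n=200, seed=0):
    """Hypothetical two-class dataset: one noisy feature per example."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feature = rng.gauss(2.0 * label, 1.0)  # class 1 is shifted to the right
        data.append((feature, label))
    return data

def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def majority_baseline(train):
    """Baseline: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common

def centroid_model(train):
    """Simple model: predict the class whose mean feature value is closest."""
    centroids = {}
    for cls in {label for _, label in train}:
        values = [x for x, label in train if label == cls]
        centroids[cls] = sum(values) / len(values)
    return lambda x: min(centroids, key=lambda c: abs(x - centroids[c]))

def cross_validated_accuracy(data, fit, k=5):
    """Mean held-out accuracy over k folds; the model is refit on each fold."""
    scores = []
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        predict = fit(train)
        correct = sum(predict(x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

# The toy data is generated in random order, so contiguous folds are fine here;
# real data should usually be shuffled (and possibly stratified) first.
data = make_toy_data()
results = {
    "random chance (two classes, nominal)": 0.5,
    "majority-class baseline": cross_validated_accuracy(data, majority_baseline),
    "nearest-centroid model": cross_validated_accuracy(data, centroid_model),
}
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

Reporting all three numbers side by side makes it easy for a reader to see how much of the model's performance goes beyond what a trivial predictor would achieve.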
You will submit THREE printed PDF versions of this report that meet the following requirements:
- Word limit: your report should be no longer than 2,500 words, not including references and figure captions
- Figures are highly encouraged, and should each be referenced in the text (such that every figure has a clear point to the story that you tell). Every figure should have a caption, figure number, axis labels (with units if applicable), and legend, if applicable. If you use any figures that are not your own, they should be cited as well.
- While the specific citation format is not critical, it should be consistent and follow a known model (MLA, IEEE, Chicago, APA, etc.).
- Your report should have a title page and list the names of each of the team members as well as your team letter (A-H).
Video
The 1-3 minute video should summarize and advertise the problem that you worked on and your machine learning solution. It should be visually compelling and should not miss the “forest for the trees”: this video is NOT the place for extreme technical detail, but for CLEARLY conveying the big picture of your problem/question and your solution/answer. Imagine your aunt and uncle watching this video. Would they know what’s going on? Would they find it interesting / engaging / compelling? For inspiration for what makes a good explanatory video, watch videos from the following series:
- Two Minute Papers by Károly Zsolnai-Fehér. Concise 1-4 minute summaries of cutting edge research papers.
- 3Blue1Brown by Grant Sanderson. Mathematical concepts conveyed clearly, intuitively, and visually.
- Welch Labs by Stephen Welch. Series on machine learning, neural networks, and imaginary numbers.
Once you're working on producing your video, ask your friends (especially those who may not be as technically inclined) for feedback. Do they think it was engaging/easy to follow/enjoyable to watch? Ask them their takeaways: did they get the message you were trying to communicate? Address their feedback to help you ensure the quality of your video. You're encouraged to use the audio-visual medium to the fullest to clearly present your project.
You'll submit your video as a .mp4 file to the instructional team (please test your file to make sure it plays before submitting).
Peer Evaluation
Since this is a team project, you will also be evaluated by your teammates (and yourself). This is a chance to offer each other feedback and reflect on your own performance. You will be rating your fellow team members on the following criteria:
- Was dependable in attending meetings to work on the project
- Did work accurately and completely
- Completed work on time
- Contributed positively to team discussions
- Helped others when needed
- Responded to communications in a timely manner
- Treated other team members respectfully
- Demonstrated a positive attitude about the team and its work
You can download the peer evaluation form here.