Kaggle Competition

Competition Link: Solar PV in Aerial Imagery

Kaggle is a machine learning competition platform, and you will compete against each other in teams to build the best-performing machine learning algorithm. Your rank in the Kaggle competition is not all that matters, though: how well you and your team present your solution, and how clearly you communicate it, matters too.

Good luck! The competition ends February 27 at 11:59 Eastern Standard Time.

Report

Through this project you will access aerial imagery data, process the data into a form suitable for machine learning algorithms, train and test multiple supervised learning techniques, and evaluate the performance of your algorithms. Your report should tell the story of your work in this competition and should include the content below. The format below explicitly lays out the components that must be addressed, but your goal is both to explain clearly what you did and to get others excited about it, so you’re encouraged to be creative in how the material is presented. Unless otherwise stated, no section of this report should be fewer than 250 words. Figures should be used throughout to support your narrative.

  1. Abstract. [150 words maximum] This should be the one paragraph that captures the significance of what you did and why you did it.
  2. Introduction. Describe the problem and the value in finding a solution, and motivate your readers as to why they should care about this question. The idea is to get your reader excited about the solution you are about to present.
  3. Background. This section should cite problems that have been previously addressed that relate to your work, and the key takeaways of the studies that explored that work. The idea here is to place the problem you’re working on in context and to let the reader know that you’re not working in a knowledge vacuum. For finding relevant literature, a good starting point is Google Scholar.
  4. Data. Describe and visualize your data. Make sure every caption fully describes the figure. You may want to visualize the raw data and/or extracted features. What challenges are inherent to this problem? How might they be overcome? What takeaway messages can you get simply from visualizing your data?
  5. Methods. Present your machine learning solution (a description of any preprocessing, feature extraction, and classification/regression techniques) and explain why you made each of the choices you did. Discuss any methods that you didn’t create yourself, and cite relevant literature to support your claims. Also include a flow chart of your methodology so the reader can easily conceptualize your solution. Describe your approach to measuring generalization performance, which metric(s) you used, and why. Write this section in enough detail that someone could recreate your results.
  6. Results. Include a complete performance assessment that covers your validation approach (cross validation, train/validate/test split, etc.) and the key performance metrics for the problem (ROC curves, PR curves, confusion matrices if applicable, etc.). You should also compare your outcomes to at least one baseline model (a simple model against which to show your improvements), in addition to a comparison against random chance guessing in the classification setting. This section should be supported with visualizations, including examples where your method worked well, examples where it failed, and hypotheses, supported by evidence, as to why in each case.
  7. Conclusions. It’s critical to have a strong ending and not just let the energy fizzle out of the report. Many readers, if pressed for time, will simply read your abstract and your conclusions. In fact, you may want to start by writing your conclusions. Very succinctly recap the problem you were studying and your approach to the solution. Focus on explaining the key takeaways from your work; these should not be merely a set of bullet points, but fleshed-out conclusions. As you write your conclusions, ask yourself: if the reader took nothing else away from your report, what would you most want them to know? Did you identify one particular approach that worked well? Was there a challenge you faced that opens the door to solving a new problem? What avenues of research would you pursue next?
  8. Roles. Since this is a team project, we want to know what your specific contribution was to this project. Provide detail on your individual role and how it contributed to the competition. Each team member should clearly articulate an individual role.
  9. References [no word limits]. An alphabetical list of references cited in this work. A minimum of 10 are required. Consider using the Zotero citation manager for collecting and compiling your references.
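As one possible way to organize the comparison asked for in the Results item, the sketch below estimates a cross-validated ROC AUC for a chance-level reference and for a simple baseline. Everything in it is an illustrative assumption rather than a required method: the function names (`roc_auc`, `k_fold_indices`, `cross_validated_auc`, `fit_constant`, `fit_single_feature`) are hypothetical, the data are synthetic, and the single feature stands in for something like a mean pixel intensity. It uses only the Python standard library; in practice you would more likely rely on an established machine learning library and your own extracted features.

```python
import random


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and deal them into k validation folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(features, labels, fit, k=5, seed=0):
    """Mean validation AUC over k folds.

    `fit(train_X, train_y)` must return a function mapping one example to a score.
    """
    aucs = []
    for fold in k_fold_indices(len(labels), k, seed):
        held_out = set(fold)
        train_X = [x for i, x in enumerate(features) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        score = fit(train_X, train_y)
        val_y = [labels[i] for i in fold]
        val_scores = [score(features[i]) for i in fold]
        aucs.append(roc_auc(val_y, val_scores))
    return sum(aucs) / len(aucs)


def fit_constant(train_X, train_y):
    """Chance-level reference: every example gets the training positive rate."""
    rate = sum(train_y) / len(train_y)
    return lambda x: rate


def fit_single_feature(train_X, train_y):
    """Simple baseline: rank by the first feature, oriented using the training data."""
    pos = [x[0] for x, y in zip(train_X, train_y) if y == 1]
    neg = [x[0] for x, y in zip(train_X, train_y) if y == 0]
    sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: sign * x[0]


# Synthetic stand-in data: one noisy feature that is higher, on average, for class 1.
rng = random.Random(42)
labels = [i % 2 for i in range(200)]
features = [[rng.gauss(1.0 if y == 1 else 0.0, 1.0)] for y in labels]

chance_auc = cross_validated_auc(features, labels, fit_constant)
baseline_auc = cross_validated_auc(features, labels, fit_single_feature)
print(f"chance reference AUC: {chance_auc:.3f}")
print(f"single-feature baseline AUC: {baseline_auc:.3f}")
```

Reporting your final model's score next to both numbers, computed on the same folds, makes it clear how much of the performance comes from your method rather than from the structure of the data.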

Figures. In your report, figures should be used to convey your ideas and EVERY figure should have a caption that clearly explains the figure's purpose and how to read it, axis labels (with units if applicable), a legend if more than one quantity is plotted, and should be referenced in the text (every figure has a clear point to the story that you tell and there are no figures just thrown in without an explained reason). If you use any figures that are not your own, the source of the figure should be explicitly cited as well.

Submission. Your report will be submitted on GitLab, and you will ALSO submit THREE printed PDF versions of this report. Additionally, your team's final code (the code that corresponds to your highest Kaggle leaderboard submission) will be submitted on GitLab. Code should be clean, easy to reuse, and well commented.

Peer Evaluation

Since this is a team project, you will also be evaluated by your teammates (and yourself). This is a chance to offer each other feedback and reflect on your own performance. You will be rating your fellow team members on the following criteria:

  1. Was dependable in attending meetings to work on the project
  2. Did work accurately and completely
  3. Completed work on time
  4. Contributed positively to team discussions
  5. Helped others when needed
  6. Responded to communications in a timely manner
  7. Treated other team members respectfully
  8. Demonstrated a positive attitude about the team and its work

You can download the peer evaluation form here.