CS 100 Lab 6: Titanic & Data Science

For this lab, you'll start from the provided notebook and make a submission to the Titanic competition that scores at least 78%.

Step 1: Fork the template notebook

You'll need a Kaggle account for this step. You should be able to create one by clicking on "Register with Google" on the Kaggle registration page.

After creating an account, visit the CS 100 Data Science Notebook and click on the "Copy and Edit" button at the top right of the page. After agreeing to the rules of the competition, you should be dropped into the notebook editor.

Step 2: Evaluate the notebook and submit results

Once in the notebook, select the first cell and evaluate it with the key combination "shift-enter". Repeat this for every cell in the notebook, in order, so that you see the results of each one. For this step, don't change the contents of any cells just yet.

To submit the results computed by the notebook, first click on the "Commit" button at the top right of the page. In the dialog that pops up, click "Open Version" to open the notebook status page in a new tab/window. The "Output" tab may not appear right away; if it doesn't, wait a few seconds and refresh the status page. Once it appears, click on "Submit to Competition" to submit the notebook results and get your score.

The provided notebook (which uses a straightforward sex-based prediction model) should yield an accuracy score of 76.55%. In the next step, you will try to get this score past 78%.
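To make the idea of a sex-based model concrete, here is a minimal sketch of how such a rule and its accuracy might be computed. It is not the notebook's actual code: the rows are a made-up toy sample stored as plain dictionaries, and the names toy_rows, predict_by_sex, and accuracy are hypothetical. Only the column names Sex and Survived come from the Titanic dataset.

```python
# Hypothetical toy sample; the real notebook loads the competition data instead.
toy_rows = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 1},
    {"Sex": "female", "Survived": 1},
]


def predict_by_sex(row):
    """Predict 1 (survived) for women and 0 (did not survive) for men."""
    return 1 if row["Sex"] == "female" else 0


def accuracy(rows, predict):
    """Fraction of rows where the prediction matches the known outcome."""
    correct = sum(1 for row in rows if predict(row) == row["Survived"])
    return correct / len(rows)


baseline_accuracy = accuracy(toy_rows, predict_by_sex)
print(f"Baseline accuracy on the toy sample: {baseline_accuracy:.2%}")
```

Kaggle computes the competition score in the same spirit, but against outcomes for the test passengers that are hidden from you, so the number you see on the toy sample says nothing about the real score.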

Step 3: Improving the score

The only code you will need to touch in your fork of the notebook is in the cell just under the heading "Making predictions". Specifically, you'll be adding if-else clauses to the for loop that iterates over all the rows in the test data. Of course, you can also experiment with the cells in the "Basic analysis" and "Working with rows manually" sections to print out information that may help you form hypotheses about how to make better predictions.
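One way to form a hypothesis is to compare survival rates across groups of passengers in the training data. The sketch below does this with plain Python on a made-up sample; the helper survival_rate_by and the train_rows list are hypothetical, and the notebook's own analysis cells may do this differently. Sex, Pclass, and Survived are real column names in the Titanic dataset.

```python
from collections import defaultdict

# Hypothetical toy sample standing in for the training data.
train_rows = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 0},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "male", "Pclass": 1, "Survived": 1},
    {"Sex": "male", "Pclass": 1, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
]


def survival_rate_by(rows, *columns):
    """Return {group key: fraction of survivors} for the given columns."""
    totals = defaultdict(lambda: [0, 0])  # key -> [survivors, passengers]
    for row in rows:
        key = tuple(row[column] for column in columns)
        totals[key][0] += row["Survived"]
        totals[key][1] += 1
    return {key: survivors / count for key, (survivors, count) in totals.items()}


rates = survival_rate_by(train_rows, "Sex", "Pclass")
for key, rate in sorted(rates.items()):
    print(key, f"{rate:.0%}")
```

A group whose survival rate differs sharply from the rest of its sex is a candidate for an extra if-else clause. The toy numbers here are invented, so check any pattern against the real training data before relying on it.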

After changing the cell, make sure that the number of predictions you created matches the number of rows in test_data. The second cell in the "Making predictions" section contains an assertion that tests this for you.
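The sketch below shows the general shape of a prediction loop with one extra clause, followed by the kind of length check described above. It is an illustration under assumptions, not the notebook's code: test_data is a hypothetical list of dictionaries (the notebook most likely holds it in a different structure), the variable name predictions is assumed, and the extra Pclass rule is only an example of where a clause goes, not a rule known to reach 78%.

```python
# Hypothetical toy rows standing in for the competition's test data.
test_data = [
    {"PassengerId": 892, "Sex": "male", "Pclass": 3},
    {"PassengerId": 893, "Sex": "female", "Pclass": 1},
    {"PassengerId": 894, "Sex": "female", "Pclass": 3},
    {"PassengerId": 895, "Sex": "male", "Pclass": 1},
]

predictions = []
for row in test_data:
    if row["Sex"] == "female":
        if row["Pclass"] == 3:
            # Illustrative extra clause; replace with a rule your analysis supports.
            predictions.append(0)
        else:
            predictions.append(1)
    else:
        predictions.append(0)

# Every path through the loop must append exactly one prediction per row.
assert len(predictions) == len(test_data)
print(predictions)
```

If the assertion fails after you add clauses, look for a branch that appends twice or a combination of conditions that falls through without appending anything.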

When you're ready to test your updated results, make a new submission as explained in Step 2. Remember, your goal is to make a submission with a score of at least 78%!

Step 4: Submission

To submit your work, the first thing you'll need to do is make sure your notebook is public (so we can access it). You can update the access settings by going to the notebook status page and clicking on "Access" on the top right, and selecting "Public" in the drop-down menu labeled "Privacy". You'll also want to make sure you submitted the score for the notebook (the public score should be displayed on the notebook status page).

Submit your work using this form. You'll need to type in your team member names and paste your notebook page URL.