Introduction

Overview

Teaching: 5 min
Exercises: 0 min
Questions
  • How do you approach and evaluate data of unknown origin?

  • What are the pain points when inheriting a data project?

Objectives
  • Evaluate a project for reproducibility.

  • Identify assumptions and red flags.

  • Recognize documentation and structure gaps.

You have just started a new job and must take over the work of a previous employee who has left the lab and gone off the grid. You receive an Excel file containing this person’s life’s work. Your boss has instructed you to:

  1. Make sense of all the data this person collected.
  2. Write a report on the findings to share with others in the lab, so they may use the data and analyses in their own work.

The file you were sent can be downloaded here: gapminderDataFiveYear_superDirty.xlsx.

Download the file. With the goal of making sense of the data, what can you tell me about this data, and how do you know that?
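If you would rather take a first look with code than by eye, the following is one possible sketch, not the lesson's prescribed method. It assumes the spreadsheet has already been loaded as a list of dictionaries of strings (for example, by exporting a sheet to CSV and reading it with Python's `csv.DictReader`). The function names and the sample rows are hypothetical and invented for illustration; they are not taken from the downloaded file.

```python
from collections import Counter


def looks_numeric(value):
    """Return True if the text parses as a number.

    Note: float() also accepts strings such as "nan" and "inf".
    """
    try:
        float(value)
        return True
    except ValueError:
        return False


def profile_columns(rows):
    """Count missing, numeric, and text cells per column.

    `rows` is assumed to be a list of dicts mapping column name to a string.
    Text values are tallied after stripping surrounding whitespace, so
    near-duplicate spellings show up side by side.
    """
    summary = {}
    for row in rows:
        for column, raw in row.items():
            stats = summary.setdefault(
                column,
                {"missing": 0, "numeric": 0, "text": 0, "values": Counter()},
            )
            value = (raw or "").strip()
            if value == "":
                stats["missing"] += 1
            elif looks_numeric(value):
                stats["numeric"] += 1
            else:
                stats["text"] += 1
                stats["values"][value] += 1
    return summary


# Hypothetical rows, invented for illustration only.
sample_rows = [
    {"country": "Freedonia", "year": "1952", "pop": "1000"},
    {"country": "freedonia", "year": "1957", "pop": ""},
    {"country": "Freedonia ", "year": "1962", "pop": "about 1200"},
]

report = profile_columns(sample_rows)
for column, stats in report.items():
    print(column, stats["missing"], stats["numeric"], stats["text"],
          dict(stats["values"]))
```

A column that mixes numeric and text cells, or a text column whose tally contains several spellings of the same name, is the kind of red flag to note down and question before trusting any analysis.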

Key Points

  • Using disorganized data is time-consuming and error-prone.

  • Collaborators like your past self do not respond to email.