Section overview

    • Data Science Workflow

    • Guidelines exist for which steps to carry out in a data project and in which order. A popular approach is the Cross-Industry Standard Process for Data Mining (CRISP-DM), shown below.

      CRISP data mining process diagram
      Licensed under CC BY-SA 3.0, from https://commons.wikimedia.org/wiki/File:CRISP-DM_Process_Diagram.png

      Different approaches emphasize different parts of the process, but most involve the same core stages: defining or framing the problem or question to be answered; data-related steps such as data collection and data preprocessing; model-related steps such as modeling and model evaluation; and finally deployment of the model. It is advisable to also include an explicit step for verifying and updating your understanding of the problem and the data.

      I want to highlight that these steps are typically performed incrementally: a discovery during modeling, for example, may lead to updated data processing. The steps are therefore not followed in a strict sequence but repeated as needed.
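
The iterative nature of the workflow can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy data, the function names (`collect_data`, `preprocess`, `fit_line`, `evaluate`, `run_workflow`), the error threshold, and the outlier cutoff are hypothetical and not part of CRISP-DM itself. The point is only the loop: a poor evaluation result sends the process back to data preprocessing.

```python
from statistics import mean


def collect_data():
    # Hypothetical toy data: y is roughly 2 * x, with one missing
    # value and one obvious outlier.
    return [(1, 2.1), (2, 3.9), (3, None), (4, 8.2), (5, 50.0), (6, 11.8)]


def preprocess(rows, max_y=None):
    # Drop rows with a missing target; optionally drop rows whose
    # target exceeds max_y (a crude outlier filter).
    cleaned = [(x, y) for x, y in rows if y is not None]
    if max_y is not None:
        cleaned = [(x, y) for x, y in cleaned if y <= max_y]
    return cleaned


def fit_line(rows):
    # Ordinary least squares for y = slope * x + intercept.
    x_bar = mean(x for x, _ in rows)
    y_bar = mean(y for _, y in rows)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    s_xx = sum((x - x_bar) ** 2 for x, _ in rows)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def evaluate(model, rows):
    # Mean absolute error. For brevity this uses the training data;
    # a real project would evaluate on held-out data.
    slope, intercept = model
    return mean(abs(y - (slope * x + intercept)) for x, y in rows)


def run_workflow(max_error=0.5, max_rounds=3):
    raw = collect_data()
    max_y = None  # no outlier filter in the first round
    history = []
    for round_no in range(1, max_rounds + 1):
        data = preprocess(raw, max_y)
        model = fit_line(data)
        error = evaluate(model, data)
        history.append((round_no, max_y, error))
        if error <= max_error:
            break  # good enough; the model could now be deployed
        # Discovery during evaluation: the fit is poor, so go back
        # and update the preprocessing step before modeling again.
        max_y = 20.0
    return model, history
```

Calling `run_workflow()` returns the final model together with a history of rounds. With the toy data above, the first round fits poorly because of the outlier, and the second round succeeds after the preprocessing step has been updated.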