Section Overview

    • Welcome to our class Applied Data Science in Tourism!

      The course starts on the 31st March 2025 at 9:30am and finishes on the 11th April 2025. Normally, we start at 9:30am, followed by a lunch break from 12:30pm to 1:30pm. After 1:30pm we have our afternoon session until 3pm, except for Wednesdays, which are reserved for language classes etc. Classes typically take place in 01.011 or online if needed (see virtual classroom below).

    • Schedule

      First Week

      Date | Topic
      31st March 2025 | Organization, Exam, and Introduction (9:30 to 11:00); Presentation by Sophia Quint from visitBerlin (11:00 to 12:00); Researching Data (13:30 to 15:00)
      1st April 2025 | First Steps in Metabase (9:30 to 12:30 and 13:30 to 15:00)
      2nd April 2025 | More Steps in Metabase (9:30 to 12:30); Enriching Data
      3rd April 2025 | First Steps in Orange (9:30 to 12:30); Orange for Forecasting (13:30 to 15:00)
      4th April 2025 | More Forecasting in Orange (9:30 to 12:30); Summary of the week (13:30 to 15:00)

      Second Week

      Date | Topic
      7th April 2025 | Predicting Prices on Airbnb Data (9:30 to 12:30 and 13:30 to 15:00)
      8th April 2025 | Clustering and finding patterns in Airbnb Data (9:30 to 12:30 and 13:30 to 15:00)
      9th April 2025 | Word Cloud of Reviews in Airbnb Data (9:30 to 12:30)
      10th April 2025 (in 01.103) | GNTB Knowledge Graph (9:30 to 12:30); Exam Preparation and Questions (13:30 to 15:00)
      11th April 2025 (online in our virtual classroom) | Guest Card Oberstaufen (9:30 to 12:30); Course Summary (13:30 to 15:00); visitBerlin Dashboard/SARIMA in SPSS

    • In case of unforeseen circumstances, important information will be shared here.

    • In the class we will use the following tools. Our goal is to learn how to use these tools and apply them skillfully to our tourism data sets.

      • Metabase
      • Orange (KNIME/RapidMiner)
      • (Jupyter Notebooks)
      • (RStudio)

      There are many more tools. The choice above is primarily driven by the fact that these tools offer open-source versions. This means you can use them freely without needing to pay any money. The first two tools also require no programming skills.

    • During our first week, we will work together on a data project. Our goal is to forecast the Berlin visitors originating from the US. We will search for relevant data, create our first visualizations and try to forecast the development of visitors from the US.

    • Task
      • First we want to figure out who is visiting Berlin. What are the top residences of people visiting Berlin?
      • Try to find officially collected data covering who is visiting Berlin. Which time duration is covered by the data? How can you access this data? What is stored in these files?

      Since some German is helpful, break up into groups of two, making sure at least one person per group speaks German.

    • The data is linked from https://www.statistik-berlin-brandenburg.de/archiv/g-iv-1-m. The linked page lists Excel files for each month starting in 2009.

    • The files in this folder present the data available at https://www.statistischebibliothek.de/mir/receive/BBSerie_mods_00000050, but as a single CSV file of Berlin guests by residence for easy access. The data is also provided as SQLite files.

    • After watching the short introductory video at metabase.com, follow the installation instructions at https://www.metabase.com/docs/latest/installation-and-operation/running-the-metabase-jar-file#local-installation to install Metabase. Basically, you need to install Java and download the metabase.jar file. After starting Metabase, you can access it using your browser at http://localhost:3000.

    • Task
      • Familiarize yourself with the Metabase tool and the sample x-rays. You can find the documentation at https://www.metabase.com/learn/.
      • Add our two SQLite files as separate datasets. You can also upload CSVs to Metabase into the Sample Database.
      • Create a stacked bar chart (a bar for each type of domestic accommodation) over time. You can find documentation at https://www.metabase.com/learn/metabase-basics/querying-and-dashboards/visualization/bar-charts. It's fine to follow their example to learn about the bar chart and then translate your learning into your own bar chart.
        Did you notice the data problem for clinics/centers and clinics?
        Accommodation of Domestic Visitors
      • Now that you are familiar with the basics of Metabase, how about creating a new chart of your own using the data? Remember, you can select, filter, aggregate, and group the data prior to visualizing it. Present your created chart in class.
    • Visitor wheel

    • Task
      • Create a chart showing the top foreign residences of people visiting Berlin.
        It could look something like the chart below
        Top Residences
      • When do the two regularly occurring peaks happen over the year for visitors from the US? Can you make out differences within these two peaks?
    • Task
      • It is possible to predict the number of future visitors from historical data. We will do this starting tomorrow. What are downsides to predicting the number of visitors this way? What are the implicitly made assumptions?
      • What are indicators predictive of visitors? Which could be publicly accessed and downloaded and help us predict US visitors?
    • Task
      • With Google Trends you can get data from Google showing what people search for on Google within certain time spans and geographic locations. Play with Google Trends. Can you find search terms with a similar pattern to our Berlin visitors from the US?
      • What is the value reported by Google to indicate the volume of searches (the y-axis on the Google Trends charts)? Research the definition and draw conclusions when comparing different search terms using Google Trends.
    • "Indexing: Google Trends data is pulled from a random, unbiased sample of Google searches, which means we don’t have exact numbers for any terms or topics. In order to give a value to terms, we index data from 1-100, where 100 is the maximum search interest for the time and location selected."
      "Normalization: When we look at search interest in a topic or query, we are not looking at the total number of searches. Instead, we look at the percentage of searches for that topic, as a proportion of all searches at that time and location."

      from https://newsinitiative.withgoogle.com/resources/trainings/basics-of-google-trends/ and see also https://support.google.com/trends/answer/4365533
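
      To make the two quoted definitions concrete, here is a minimal Python sketch with made-up numbers (not real Google data) showing how normalization and indexing produce the 0-100 values on the y-axis:

          # Hypothetical monthly search counts (made-up numbers for illustration)
          term_searches = [120, 300, 180, 600]           # searches for one term
          total_searches = [10000, 12000, 9000, 15000]   # all searches at that time/place

          # Normalization: share of all searches at each point in time
          shares = [t / total for t, total in zip(term_searches, total_searches)]

          # Indexing: rescale so the maximum share becomes 100
          max_share = max(shares)
          index = [round(100 * s / max_share) for s in shares]

          print(index)  # [30, 62, 50, 100]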

    • Task
      • Building on yesterday's exploration of data from Google Trends, let's create a final chart contrasting the official data of US visitors with searches on Google. For this download the google_trends.sqlite file below.
      • This file contains two tables, which have to be joined by month_year to combine/enrich the data. You can read about joins at https://www.metabase.com/docs/latest/questions/query-builder/join. (A pandas sketch of the same join is shown after this task.)
      • As an inspiration, your final chart could look as follows:
        Google Trends and Actual Visitors from the US (note the two y-axes)
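
      If you want to double-check the Metabase join outside of the tool, a minimal pandas sketch could look like the following. The table and column names other than month_year are assumptions; list the actual table names first.

          import sqlite3
          import pandas as pd

          con = sqlite3.connect("google_trends.sqlite")
          # Find out the actual table names first (the names below are assumptions)
          print(pd.read_sql("SELECT name FROM sqlite_master WHERE type='table'", con))

          trends = pd.read_sql("SELECT * FROM google_trends", con)   # assumed table name
          visitors = pd.read_sql("SELECT * FROM visitors", con)      # assumed table name

          # Inner join on the shared month_year column, as in the Metabase join
          combined = trends.merge(visitors, on="month_year", how="inner")
          print(combined.head())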
    • This file contains Google Trends information and the data about Berlin visitors.

    • Task
      • Orange is a software tool for data mining. It does not require any textual programming. The data analysis is performed by arranging nodes and edges, where nodes represent the operations and edges represent the flow of the data. Install Orange from the official sources at https://orangedatamining.com/download/.
      • Familiarize yourself with Orange by loading the Berlin visitors data below and filter for the US residence. Look at the data using the Data Table.
    • Task
      • Based on the above video, create a forecast for the US visitors for our data. Play with the (hyper-)parameters similarly to how Nathan Humphrey does it. Can you make it work? If not, speculate why it might not work. Research ARIMA and SARIMA models. (A statsmodels sketch of the same steps follows after this task.)
      • In addition, try out the Seasonal Adjustment node and create a line chart with trend, seasonal and residual plots.
        Note the Edit Domain workaround needed for a bug in Orange (see https://github.com/biolab/orange3-timeseries/issues/281) to fix the error "variable month_year is not in domain" in the Seasonal Adjustment widget.
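
      If you want to cross-check Orange's results outside of the GUI (for example in a Jupyter Notebook), a minimal sketch with pandas and statsmodels could look like this. The file and column names are assumptions; adapt them to the actual CSV.

          import pandas as pd
          from statsmodels.tsa.arima.model import ARIMA
          from statsmodels.tsa.seasonal import seasonal_decompose

          # Assumed file and column names
          df = pd.read_csv("berlin_visitors.csv", parse_dates=["month_year"])
          series = df.set_index("month_year")["visitors"].asfreq("MS")

          # Decompose into trend, seasonal, and residual components (12-month season)
          decomposition = seasonal_decompose(series, model="additive", period=12)
          decomposition.plot()

          # Fit a simple ARIMA model and forecast 12 months ahead
          model = ARIMA(series, order=(1, 1, 1)).fit()
          print(model.forecast(steps=12))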
    • If you want to use Orange in a computer pool, you can follow these instructions for setting it up:

      1. Download the portable version of Orange for Windows from https://orangedatamining.com/download/ and extract the zip file into some folder.
        Be patient. It takes some time to extract the file since it contains lots of files and data.
      2. Navigate to the folder and double click on the link called Orange.
        Again, be patient. Starting Orange for the first time usually takes longer, but later starts are quicker.

      This setup was tested in the computer pool room 05.110 using Orange 3.38.1.

      These instructions might also work on other computers.

    • Forecast Pipeline in Orange

    • Trend, Season, and Residuals

    • Task
      • Yesterday, you used the Seasonal Adjustment node. How did you choose the seasonal period? Find a way to let Orange determine the seasonal period, so that the computer figures out a value itself instead of us guessing manually. It also helps to review and critically question our guessed value. (One approach based on autocorrelation is sketched after this task.)
      • Apply an ARIMA model to the trend of US visitors and create a line chart of the trend with forecasting and a line chart for the seasonal pattern.
        ARIMA trend forecast
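
      One way to let the computer estimate the seasonal period is to look at the autocorrelation of the series and pick the lag with the strongest repetition. A minimal statsmodels sketch, reusing the series variable from the sketch above:

          import numpy as np
          from statsmodels.tsa.stattools import acf

          # Autocorrelation for lags 0..24 of the monthly series
          autocorr = acf(series.dropna(), nlags=24)

          # Skip lags 0 and 1 and take the lag with the highest autocorrelation
          candidate_lags = np.arange(2, 25)
          estimated_period = candidate_lags[np.argmax(autocorr[2:])]
          print("Estimated seasonal period:", estimated_period)  # expect 12 for yearly seasonality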
    • Task
      • Orange does not provide a SARIMA model (an ARIMA model that also handles seasonality patterns; see https://datascience.stackexchange.com/questions/120136/seasonal-arima). Check out other models and try to use these for forecasting US visitors, forecasting 5 years ahead.
        Forecast

      • Perform a statistical hypothesis test in Orange to determine whether different time series from Google Trends predict the US visitors. What lag is reported for different search terms (see the google_trends.csv file below)? You can learn the very basics about a suitable hypothesis test at https://www.statology.org/granger-causality-test-in-r/. For this you have to merge the US visitor data with the Google Trends data, similarly to how you did it in Metabase, but now in Orange. (A statsmodels sketch of the test follows below.)
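
      The hypothesis test linked above is the Granger causality test. As a cross-check outside of Orange, a minimal statsmodels sketch could look like this, reusing the combined table from the earlier pandas join sketch; the column names are assumptions.

          from statsmodels.tsa.stattools import grangercausalitytests

          # First column: variable to be predicted; second column: candidate predictor
          data = combined[["visitors", "search_volume"]].dropna()  # assumed column names

          # Test whether the search term Granger-causes the visitor numbers for lags 1..6
          results = grangercausalitytests(data, maxlag=6)
          # Small p-values at a given lag suggest the search term is predictive at that lag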
    • Orange Workflow for US visitor data

    • Right click on the link to save this exemplary solution on your computer where you can open it using Orange.

    • What did we do during the week?

      • Sophia Quint brought us the case of forecasting US visitors visiting Berlin.
      • You researched and identified relevant data.
      • You installed Metabase on your computer and are able to use it from now on.
      • You explored the researched data in the form of SQLite files using Metabase and created different charts.
      • You identified the "peak behavior" of Berlin visitors from the US and visualized it.
      • You learned about Google Trends data and found search terms with similar patterns relating to our US visitor data.
      • You enriched the US visitor data with Google Trends data showing a peak in searches before actual US visitors arrive in Berlin.
      • You installed Orange on your computers and learned about its basic widgets/nodes and edges for modeling the flow of data.
      • You familiarized yourself with Orange and loaded the data in the form of CSV files.
      • You learned about different nodes in Orange, specifically you applied ARIMA (autoregressive (AR) integrated (I) moving average (MA)) and VAR (vector autoregression) models to forecast US visitors.
      • You performed your (first ever?) statistical hypothesis test to determine whether search terms on Google Trends are predictive for forecasting US visitors. You applied the so-called Granger causality test.

      In summary, you did a lot. But actually more importantly, I hope, you learned some meta lessons along the way.

      • Using a powerful but complicated machine like a computer is challenging but can be rewarding. You need to be exact when instructing a computer. It's expected to struggle and to run into problems. When you solve a problem, that's learning. If you run into a problem and solve it yourself, you are much more likely to remember the solution.
      • Things go wrong. That's expected and normal. This way you gain experience. The more you know, the easier it gets. Also, make sure to reflect on what you are doing and form hypotheses internally to verify your understanding. This way you get better and avoid dead-ends.
      • The more you do (in and across tools), reflecting when something does not work and trying to fix it in a principled, non-random manner, the more insights and confidence you will gain. Problems come in similar patterns with similar solutions.
      • With the Internet, it's very likely that another person ran into the very same problem you are experiencing right now (and was kind enough to document it and provide a solution). Use that to your advantage and search for similar problems online to find solutions.
      • It's normal to struggle. You saw me struggle myself (more than once in Metabase and more than once in Orange). The more you use a tool, the more you become an expert in it. If you are not using it regularly, you get the chance to re-learn things you have forgotten.

      The next week will basically be the same as this week with

      • new datasets to explore and
      • new methods to learn and apply

      to gain more experience and confidence. In the end you should be able to solve data science problems with the tools we covered in class yourself and transfer some of your knowledge and skills to other tools in the future.

    • Classification on Airbnb data

      In a classification, we try to predict a categorical variable from features. The model observes the relationship between input features and the output target in the data and tries to learn patterns for predicting the target label on unseen data.

      Task
      • Download the detailed listing data for Berlin (or any other place you find interesting) from https://insideairbnb.com/get-the-data/.
      • What columns/fields does this dataset contain? Familiarize yourself with the different columns. How are categorical columns vs numeric columns displayed in Orange?
      • Select neighbourhood_group_cleansed as the target and predict it using different classification models like kNN, Tree, and Random Forest. You can analyze the model results using a Confusion Matrix.
        You can use video tutorials like the following to learn about classification.
      • What are the most useful but obvious columns for predicting the neighbourhood?
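
      If you later want to reproduce such a pipeline in code (for example in a Jupyter Notebook), a minimal scikit-learn sketch could look like this. The file name and the feature choice are assumptions.

          import pandas as pd
          from sklearn.ensemble import RandomForestClassifier
          from sklearn.metrics import confusion_matrix
          from sklearn.model_selection import train_test_split

          df = pd.read_csv("listings.csv")  # detailed listings file from insideairbnb.com

          # Two numeric features as a starting point (add more as you explore)
          features = df[["latitude", "longitude"]]
          target = df["neighbourhood_group_cleansed"]

          X_train, X_test, y_train, y_test = train_test_split(
              features, target, test_size=0.3, random_state=42)

          model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
          print(confusion_matrix(y_test, model.predict(X_test)))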
    • Exemplary solution to the above task

    • Regression on Airbnb data

      In a regression, we try to predict a numeric variable from features. It's similar to classification except that the predicted value is continuous. 

      Task
      • Use the same data as for the previous task.
      • Predict the prices of offered accommodations on Airbnb. Use different models like Linear Regression, kNN, and Random Forest to predict the prices.
        The following video explains linear regression among other concepts.
      • Which model has the lowest mean absolute error (MAE) for you?
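
      As with classification, the same pipeline can be sketched in scikit-learn, here with the mean absolute error as the metric and the `$` cleanup done in pandas. The feature choice is an assumption.

          import pandas as pd
          from sklearn.linear_model import LinearRegression
          from sklearn.metrics import mean_absolute_error
          from sklearn.model_selection import train_test_split

          df = pd.read_csv("listings.csv")
          # Remove the "$" and the thousands separator, then convert to numbers
          df["price"] = pd.to_numeric(
              df["price"].str.replace("$", "", regex=False).str.replace(",", "", regex=False))
          df = df.dropna(subset=["price", "accommodates", "bedrooms"])

          X = df[["accommodates", "bedrooms"]]  # assumed feature choice
          y = df["price"]
          X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

          model = LinearRegression().fit(X_train, y_train)
          print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))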
    • You can use the following code in a Python Script widget within Orange to remove the `$` from the price column. Afterwards, you can turn this column into a numeric column using Edit Domain to use it as a target for prediction in a regression model.

      # Copy the input table so the original data stays untouched
      out_data = in_data.copy()

      # Strip the leading "$" from each price string, e.g. "$120.00" -> "120.00"
      for row in out_data:
          row['price'] = row['price'].value.replace("$", "")

    • Exemplary solution to the above task

    • In the last section we learned about classification and regression models. These are models that observe the input data (features) along with the output data (labels). Their purpose is to model the relationship between inputs and outputs for prediction. In this section we are looking at methods that focus on uncovering patterns in data. Their primary goal is not to make predictions but to help us get a better understanding of the data and its patterns.

    • Clustering and Dimensionality Reduction

      Task
      • Familiarize yourself with the k-means algorithm by using the Interactive k-Means widget. What are the two basic steps this algorithm performs? Note that the algorithm relies on calculating distances between points. What is the purpose of this algorithm?
      • Build a data pipeline in Orange to apply the k-means clustering algorithm on the Airbnb data. Try to uncover patterns within the data. Use the Data Sampler to reduce the size of the data.
        Try finding clusters using only two features first. This makes it easy to verify the identified clusters.
      • Usually, the data has more than two or three dimensions (columns/features). This makes it challenging to visually show patterns. Make use of dimensionality reduction methods like principal component analysis (PCA) to project the data into two dimensions. (A scikit-learn sketch of clustering and projection follows below.)
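
      For comparison, both steps can be sketched in scikit-learn; the feature choice is an assumption.

          import pandas as pd
          from sklearn.cluster import KMeans
          from sklearn.decomposition import PCA
          from sklearn.preprocessing import StandardScaler

          df = pd.read_csv("listings.csv")
          features = df[["latitude", "longitude", "accommodates", "bedrooms"]].dropna()

          # k-means relies on distances, so scale the features first
          X = StandardScaler().fit_transform(features)

          # k-means alternates two steps: assign points to the nearest center,
          # then recompute the centers, until the assignment stabilizes
          labels = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)

          # Project the four-dimensional data down to two dimensions for plotting
          projected = PCA(n_components=2).fit_transform(X)
          print(projected[:5], labels[:5])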
    • Right click on the link to save this exemplary solution on your computer where you can open it using Orange.

    • Embedding Images from Airbnb

      Task
      • Did you notice that the Airbnb data set contains URLs to images (see column picture_url)? Sample 20% of these images and download them using the Save Images widget from the Image Analytics add-on. (A plain-Python sketch of this step follows after this task.)
      • Use a (convolutional) neural network in Orange to embed the Airbnb accommodation images (Image Embedding). This basically turns any image into a point in a high-dimensional space. Then you can apply clustering methods and dimensionality reduction to gain insights about the images. You can view images using the Image Viewer widget.
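
      The sampling and downloading step could also be sketched in plain Python; picture_url is the column from the data set, everything else is an assumption.

          import os
          import pandas as pd
          import requests

          df = pd.read_csv("listings.csv")
          sample = df["picture_url"].dropna().sample(frac=0.2, random_state=42)

          # Download each image into a local folder; skip entries that fail
          os.makedirs("images", exist_ok=True)
          for i, url in enumerate(sample):
              try:
                  response = requests.get(url, timeout=10)
                  response.raise_for_status()
                  with open(f"images/listing_{i}.jpg", "wb") as f:
                      f.write(response.content)
              except requests.RequestException:
                  pass  # broken or unreachable URL; ignore and continue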
    • Text Analysis or Natural Language Processing (NLP)

      Task
      • The Airbnb data also provides unstructured texts like reviews and description texts. Examine the relevant columns to see what kind of texts these columns contain.
      • To provide widgets for text analysis, you need to install the Text add-on. Similar to the Form TimeSeries widget, you have to turn the data into a Corpus for doing text analysis in Orange. Make sure to only take a small sample; otherwise you might overload your computer. Build a word cloud from the review texts, using Preprocess Text to bring the text into shape where needed. (A wordcloud sketch in Python follows after this task.)
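
      Outside of Orange, the same idea can be sketched with the wordcloud package. Inside Airbnb provides the reviews as a separate file; the file and column names are assumptions.

          import pandas as pd
          import matplotlib.pyplot as plt
          from wordcloud import WordCloud, STOPWORDS

          # Small random sample of reviews to keep the computation light
          reviews = pd.read_csv("reviews.csv").dropna(subset=["comments"]).sample(1000, random_state=42)

          text = " ".join(reviews["comments"].astype(str))
          cloud = WordCloud(stopwords=STOPWORDS, background_color="white").generate(text)

          plt.imshow(cloud, interpolation="bilinear")
          plt.axis("off")
          plt.show()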
    • Word Cloud of Reviews

    • Task
      • Compare the review ratings with the reviews by doing a Sentiment Analysis. Try out different methods for doing sentiment analysis. (A VADER sketch in Python follows after this task.)
      • Try out other widgets from the Text Mining section like Topic Modeling.
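
      One common lexicon-based method for sentiment analysis is VADER; here is a minimal NLTK sketch, reusing the reviews sample from the word cloud sketch above.

          import nltk
          from nltk.sentiment import SentimentIntensityAnalyzer

          nltk.download("vader_lexicon")  # one-time download of the lexicon
          analyzer = SentimentIntensityAnalyzer()

          # The compound score ranges from -1 (very negative) to +1 (very positive)
          for text in reviews["comments"].astype(str).head(5):
              print(analyzer.polarity_scores(text)["compound"], text[:60])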
    • Data Science Workflow

    • For carrying out data projects, there exist guidelines on what to do and in which order. A popular approach is the Cross-industry standard process for data mining (CRISP-DM) shown below.

      CRISP data mining process diagram
      licensed under CC BY-SA 3.0 DEED from https://commons.wikimedia.org/wiki/File:CRISP-DM_Process_Diagram.png

      There exist different approaches highlighting different parts of the process. But usually it involves a definition or framing of the problem/question to be answered, various data-related steps like data collection and data preprocessing, and model-related steps like modeling and model evaluation, followed by the deployment of the model. It is advisable to also include a specific step to verify and update the understanding.

      I want to highlight that these steps are typically performed iteratively: a discovery during modeling may lead to updated data processing. So the steps are not strictly followed but are often revisited as needed.

    • Make sure you are registered for the exam in EMMA if you want to take the exam. You can cancel your registration until 7 days before the exam.

      The exam consists of a presentation which has to be given on the 20th June 2025 in 01.011 from 9:00 to 14:00. Upload your presentation below before the day of the examination. The exact schedule will be added below and finalized 7 days before the exam.

    • Time on 20th June 2025 | Examinee
      09:00 to 09:20 | (to be announced)
    • In your presentation you are expected to show your skills in working with data. The presentation is supposed to take 15 minutes, followed by some questions. In total the exam will take around 20 minutes. You can use the data sets and tools we covered in class. You can also work with other data sets and tools that you are already familiar with or would like to learn. When you are using another data set, make sure to also share that data set with me for verification purposes.
      For my grading of your work, keep in mind

      • I'd like you to show and apply what you have learned in class.
      • Prepare and present your slides/dashboard/documents in English.
      • Use the terminology that we covered in class (filter, select, aggregate, bar chart, scatter plot, bubble chart, categorical variable, metric feature, supervised/unsupervised learning, etc.) during your presentation.
      • Show that you can summarize visually in a concise and non-misleading way.
      • Present a more elaborate analysis to reveal some patterns in your data set.
      • Aim for a coherent presentation and do not jump too much between different data sets. This also allows you to go into some depth.
      • I'm most interested in your visuals and your explanation of those and the insights you got.
      • Don't forget to include definitions for data fields. Sometimes it is not obvious what is being counted or measured.
      • For a very good grade you are expected to do more than just replicate and rearrange the charts and analyses we did in class. Learn a new analysis, apply it and present its results.
      • Have a critical mind and reflect on the obtained results. Make sure that you applied the method correctly and check that the data is plausible. Report hyperparameters if they are important and non-obvious for the analysis.

      Good luck!

      You can find an example from the German Tourism Association (Deutscher Tourismusverband) below. It provides basic visualizations.

    • Hints

      • Some of the data sets we covered come with reports. You can use those reports for inspiration.
      • You are allowed to talk to your fellow students about methods and how to apply them. You can also brainstorm together. But make sure that you don't copy from each other: we want to avoid same/similar charts or analyses on the same/similar data.
    • Upload your presentation here. You can upload either slides as a PDF file or an exported Metabase dashboard that you are going to present. Make sure, though, that your presentation is in a single PDF file. Your submission will be provided on a device already connected for presentation. Don't forget to include any custom data that was not covered in class in a ZIP file. The latest submission date is the day before the exam.