In classification, we try to predict a categorical variable from features. The model observes the relationship between the input features and the output target in the data and tries to learn patterns for predicting the target label on unseen data.
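To make the idea of learning from labelled examples concrete, here is a minimal sketch of a k-nearest-neighbours classifier in plain Python. It is not how Orange implements kNN; the function name `knn_predict`, the two numeric features, and the labels "North" and "South" are all invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among the k closest training rows."""
    # Pair each training row's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Keep the labels of the k nearest rows and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training data: two numeric features per row, one label per row.
train_features = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
                  (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
train_labels = ["North", "North", "North", "South", "South", "South"]

# Unseen rows: the model only sees the features and must guess the label.
prediction_a = knn_predict(train_features, train_labels, (1.1, 1.0))
prediction_b = knn_predict(train_features, train_labels, (5.0, 5.1))
print(prediction_a, prediction_b)
```

The "pattern" this model learns is simply that rows with similar feature values tend to share a label, which is why informative features matter so much for the result.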
What columns/fields does this dataset contain? Familiarize yourself with the different columns. How are categorical columns vs numeric columns displayed in Orange?
Select neighbourhood_group_cleansed as the target and predict it using different classification models like kNN, Tree, and Random Forest. You can analyze the model results using a Confusion Matrix. You can use video tutorials like the following to learn about classification.
What are the most useful but obvious columns for predicting the neighbourhood?
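As a rough illustration of what a confusion matrix contains, the following sketch counts how often each actual class was predicted as each class. It uses plain Python rather than Orange's own API, and the class labels and predictions are invented placeholders, not values from the dataset.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return the sorted class labels and a matrix of (actual, predicted) counts."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    # Rows are actual classes, columns are predicted classes.
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


def accuracy(actual, predicted):
    """Fraction of rows where the predicted label equals the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical labels, invented for illustration only.
actual = ["North", "North", "South", "South", "West", "West"]
predicted = ["North", "South", "South", "South", "West", "North"]

labels, matrix = confusion_matrix(actual, predicted)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(actual, predicted))
```

The diagonal holds the correct predictions; off-diagonal cells show which classes the model confuses with each other, which is often more informative than a single accuracy number.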
In regression, we try to predict a numeric variable from features. It is similar to classification, except that the predicted value is continuous rather than a category.
Task
Use the same data as for the previous task.
Predict the prices of accommodations offered on Airbnb. Use different models like Linear Regression, kNN, and Random Forest to predict the prices. The following video explains linear regression among other concepts.
Which model has the lowest mean absolute error (MAE) for you?
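For reference, MAE is the average of the absolute differences between predicted and actual values. The sketch below computes it in plain Python for two sets of predictions; the prices and the model names used as dictionary keys are made up for illustration and are not results from the dataset.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical prices, invented for illustration only.
actual_prices = [50.0, 80.0, 120.0, 200.0]
predictions = {
    "Linear Regression": [60.0, 75.0, 100.0, 170.0],
    "Random Forest": [55.0, 85.0, 110.0, 190.0],
}

# Compute the MAE per model and pick the model with the lowest error.
errors = {name: mean_absolute_error(actual_prices, preds)
          for name, preds in predictions.items()}
best_model = min(errors, key=errors.get)

for name, error in errors.items():
    print(f"{name}: MAE = {error:.2f}")
print("lowest MAE:", best_model)
```

Because MAE is expressed in the same unit as the target, a value of 7.5 here would mean the predictions are off by 7.5 currency units on average.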
You can use the following code in a Python Script within Orange to remove the `$` from the price column. Afterwards, you can turn this column into a numeric column using Edit Domain so that it can be used as the target for prediction in a regression model.