We train a machine learning model on a large data set to predict property values in the Norwegian real estate market. Our model is a gradient boosted regression tree. The data set is the largest market data set of Norwegian properties considered in the research literature, and it covers most property types: freehold houses, apartments, rentals and cabins. We achieve state-of-the-art accuracy. The novelty of our work lies in combining this data set with a minimal feature set, using only freely and publicly accessible data that are simple to obtain. This shows that useful estimation models with high accuracy can be built with quite simple resources.
A large-scale market data set of real estate properties is collected from sales and rental ads on publicly accessible internet sites. The property advertisements show property features and appraisal values set by real estate brokers. We train a gradient boosted regression tree model on selected features of the data set. This is a multivariate regression model built with supervised learning. We use 5-fold cross-validation to assess the accuracy and robustness of the model.
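The procedure described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the data are synthetic, the two features (`area`, `rooms`), the price formula, the use of depth-one trees (stumps), the hyperparameters `n_rounds` and `learning_rate`, and the median absolute percentage error metric are all assumptions chosen only to keep the example small and dependency-free.

```python
import random
import statistics


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimising squared error."""
    mean = statistics.fmean(residuals)
    # Fallback when no split exists: threshold = inf sends every row to the left leaf.
    best = (float("inf"), 0, float("inf"), mean, mean)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = statistics.fmean(left), statistics.fmean(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def fit_gbrt(X, y, n_rounds=20, learning_rate=0.3):
    """Gradient boosting with squared loss: each stump is fitted to the residuals."""
    base = statistics.fmean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is the plain residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= t else rm)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps


def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)


def cross_validate(X, y, k=5, seed=0):
    """k-fold cross-validation; returns one held-out error score per fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all samples
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_gbrt([X[i] for i in train], [y[i] for i in train])
        errors = [abs(predict(model, X[i]) - y[i]) / y[i] for i in held_out]
        scores.append(statistics.median(errors))
    return scores


# Synthetic stand-in for advertisement data (hypothetical features and prices).
rng = random.Random(42)
X = [[rng.randint(30, 200), rng.randint(1, 5)] for _ in range(100)]
y = [30000 * area + 100000 * rooms + rng.gauss(0, 200000) for area, rooms in X]

scores = cross_validate(X, y, k=5)
print("median absolute percentage error per fold:", scores)
```

Each fold is held out exactly once while the model is fitted on the remaining four, so the spread of the five scores gives a rough indication of robustness in addition to the average accuracy.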
Gradient boosted regression tree models are already known to give the best prediction accuracy for real estate price valuation. We achieve state-of-the-art prediction accuracy using a minimal feature set and only publicly and freely available sales advertisement data.
The novelty of our work is twofold: we use a minimal feature set in our model, and we have the largest data set in the research literature. Moreover, we have used only freely and publicly accessible data, which are simple to obtain. This shows that useful estimation models with high accuracy can be built with quite simple resources.