Real Estate Supervised ML/AI Linear Regression Revisited - USA House Price Prediction
Linear regression is an algorithm of supervised Machine Learning (ML) in which the predicted output is continuous with having a constant slope . Consider a company of real estate with datasets containing the property prices of a specific region. The price of a property is based on essential factors like bedrooms, areas, and parking. Majorly, a real estate company requires :
So real estate experts assume that the trained, tested and deployed ML model would be capable of learning and predicting how far people would go in the bidding in order to buy a house, based on selected features. That’s the ML housing price prediction algorithm in a nutshell.
Step 1: Exploratory Data Analysis (EDA)
All attributes are numeric except for the "Address" field. Its type is an object, so it can contain any type of Python object. A quick way to get a feel for what kind of data you’re dealing with is to plot a histogram and a box/whisker plot for each numerical attribute, as shown below.
Figure 1: Histograms and box/whisker plots of raw input data .
We see that Avg. area Income, Area Population and Avg. Area House Age have the largest correlation with house prices.
In the above scatter X-plot, we see test versus predictions are of a line form, which means our model has done good predictions.
In the above histogram plot we see data is in bell shape(Normally Distributed), which means our model has done good predictions.