A machine learning project utilizing Gradient Boosting and Lasso Regression to estimate property values with high precision based on the Ames Housing Dataset.
From raw data to a deployable model, here is the technical pipeline used in this project.
Handled missing values using KNN imputation. Corrected skewness in target variables using Log Transformation (`np.log1p`) to normalize the distribution for regression models.
Created interaction features (e.g., `TotalSF` = Basement + 1stFlr + 2ndFlr). Encoded categorical variables using One-Hot Encoding and addressed high cardinality.
Stacked XGBoost, LightGBM, and Lasso Regression. Used GridSearchCV for hyperparameter tuning to minimize Root Mean Squared Error (RMSE).
Enter the details of a hypothetical property below. This form uses a JavaScript approximation of the trained model's coefficients to generate a real-time estimate.
Estimated Market Value
Fill out the form to see the prediction
Running XGBoost Model...
Visualizing feature importance and prediction accuracy on test data.
* "OverallQual" is the dominant predictor, followed by Living Area.
* High correlation along the diagonal indicates strong model accuracy.