<aside> 💡 Recommended reading on data exploration: Multivariate Data Analysis by Joseph F. Hair, chapter ‘Examining Your Data’. The data exploration below is based on the Kaggle house price prediction dataset, which calls for advanced regression models.

</aside>

  1. Understanding the problem: a philosophical analysis of the variables' meaning and importance
  2. Univariate study: focus on the dependent variable
  3. Multivariate study: understand how the dependent and independent variables relate
  4. Basic cleaning: handle missing data, outliers, and categorical variables
  5. Test assumptions: check whether our data meets the assumptions required by most multivariate techniques

Understanding the problem


The “Type” and “Segment” columns are for future reference, while “Expectation” helps us develop our sixth sense. Filling in the columns requires reading the description of every variable and asking a few questions about each one.
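As a rough illustration, the tracking table described above could be started in pandas. This is only a sketch under assumptions: the tiny `train` frame holds illustrative rows rather than the real Kaggle file, the helper name `build_worksheet` and the “Variable” column are hypothetical, and the automatic “Type” guess is just a starting point to correct by hand after reading the variable descriptions.

```python
import pandas as pd

# Illustrative rows only; the real training file has many more rows and columns.
train = pd.DataFrame({
    "SalePrice": [208500, 181500, 223500],
    "GrLivArea": [1710, 1262, 1786],
    "OverallQual": [7, 6, 7],
    "Neighborhood": ["CollgCr", "Veenker", "CollgCr"],
})


def build_worksheet(df: pd.DataFrame, target: str = "SalePrice") -> pd.DataFrame:
    """Return one row per independent variable, with empty columns to fill in by hand."""
    features = [col for col in df.columns if col != target]
    return pd.DataFrame({
        "Variable": features,
        # First guess from the dtype; numeric codes may still be categorical in meaning.
        "Type": [
            "numerical" if pd.api.types.is_numeric_dtype(df[col]) else "categorical"
            for col in features
        ],
        "Segment": "",      # filled in manually
        "Expectation": "",  # filled in manually
        "Conclusion": "",   # filled in after looking at the plots
    })


worksheet = build_worksheet(train)
print(worksheet)
```

Keeping the table as a DataFrame makes it easy to filter later, but a plain spreadsheet works just as well for this step.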


After this, we can gather these variables in one place and look carefully at those with a ‘High’ expectation. We then create scatter plots between each of those variables and the target variable “SalePrice”, and finally fill in the “Conclusion” column.
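The filtering and plotting step might look like the following minimal sketch. It assumes the notes are kept in a pandas DataFrame named `worksheet` with “Variable” and “Expectation” columns; the data rows, the “Low” label, and the output file name are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only; in practice this would be the Kaggle training data.
train = pd.DataFrame({
    "SalePrice": [208500, 181500, 223500, 140000],
    "GrLivArea": [1710, 1262, 1786, 1717],
    "OverallQual": [7, 6, 7, 7],
    "LotArea": [8450, 9600, 11250, 9550],
})

# Hypothetical worksheet with the expectations already filled in by hand.
worksheet = pd.DataFrame({
    "Variable": ["GrLivArea", "OverallQual", "LotArea"],
    "Expectation": ["High", "High", "Low"],
    "Conclusion": ["", "", ""],
})

# Keep only the variables we expect to matter most.
high = worksheet.loc[worksheet["Expectation"] == "High", "Variable"].tolist()

# One scatter plot per high-expectation variable against the target.
fig, axes = plt.subplots(1, len(high), figsize=(5 * len(high), 4), squeeze=False)
for ax, name in zip(axes[0], high):
    ax.scatter(train[name], train["SalePrice"])
    ax.set_xlabel(name)
    ax.set_ylabel("SalePrice")
fig.tight_layout()
fig.savefig("high_expectation_scatter.png")
```

After inspecting each plot, the “Conclusion” column can be updated by hand for the corresponding variable.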

Independent columns