In this session, we analyzed customer churn and lifetime value (LTV) using SQL and Python.
The approach we followed is described here:
- We began by creating the churn database from all the available CSV files.
- Next, Favio created an ERD using dbdiagram.io and then checked that the database followed relational design principles by validating it against the normalization rules.
- Then we opened the SQL browser and performed some data exploration and analysis on the churn data.
- We then moved to Python, where we removed missing values and joined all the tables into a single table containing all the available data.
- After generating the combined dataframe, we found a significant difference between the LTVs of the top 20% and the bottom 80% of customers.
- Finally, we performed a more detailed analysis to check which factors influence customer LTV and what hypotheses we can draw about moving more customers into the high-LTV group.
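
The Python steps above can be sketched roughly as follows. This is a minimal illustration and not the code used in the session: the table names (`customers`, `billing`), the column names (`customer_id`, `contract`, `ltv`), the toy values, and the helper functions are all hypothetical stand-ins for the real churn data. It joins two tables, drops rows with missing values, labels each customer as top 20% or bottom 80% by LTV, and compares the two groups.

```python
import pandas as pd

# Hypothetical toy tables standing in for the session's CSV files.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "contract": ["monthly", "yearly", "monthly", "yearly", "monthly",
                 "monthly", "yearly", "monthly", "monthly", "yearly"],
})
billing = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "ltv": [120.0, 950.0, 80.0, 1100.0, None, 60.0, 300.0, 150.0, 90.0, 200.0],
})


def combine_tables(customers: pd.DataFrame, billing: pd.DataFrame) -> pd.DataFrame:
    """Join the tables on customer_id and drop rows with missing values."""
    merged = customers.merge(billing, on="customer_id", how="inner")
    return merged.dropna().reset_index(drop=True)


def label_ltv_segment(df: pd.DataFrame, ltv_col: str = "ltv",
                      top_share: float = 0.20) -> pd.DataFrame:
    """Add a 'segment' column: 'top_20' at or above the LTV cutoff, else 'bottom_80'."""
    cutoff = df[ltv_col].quantile(1 - top_share)
    out = df.copy()
    out["segment"] = (out[ltv_col] >= cutoff).map(
        {True: "top_20", False: "bottom_80"}
    )
    return out


combined = combine_tables(customers, billing)
segmented = label_ltv_segment(combined)

# Compare group sizes and average LTV between the two segments.
summary = segmented.groupby("segment")["ltv"].agg(["count", "mean"])
print(summary)

# One illustrative factor check: share of each contract type within each segment.
contract_mix = pd.crosstab(segmented["segment"], segmented["contract"],
                           normalize="index")
print(contract_mix)
```

A quantile cutoff is only one way to define the top 20%; with many tied LTV values, ranking customers and taking the first fifth may give a cleaner split. The contract-type breakdown is likewise just one example of a factor comparison, assuming such a column exists in the combined table.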