Refresh

📌 Project: Customer Churn Prediction and Retention Strategy

As a Stragegy Analyst for Gym, I predicted customer churn for the gym’s chain and developed retention strategies.

Project Link

Cluster of Customers

Customers can be optimally classified into 5 clusters

Churn Prediction

Targeting the top 40% of the customers, we would capture about 95% of clients who would churn.

📊 General Findings

The total number of customers living near the gym is 5× higher than those living far away. >50% of distant customers churn, compared to <50% of nearby customers.
More than 50% of customers are employees of partner companies. ~50% of non-partner customers churn vs. ~20% of partner-company customers.
More customers join without promo friends, but those who join via promo friends churn significantly less.
Most customers sign 1-month contracts. >70% of 1-month contract customers churn, while <5% of 12-month contract customers churn. No customers choose 3-month contracts.
Younger customers churn more than older customers.
Longer-tenured customers churn less than new customers.
There are 5 optimal customer clusters.
Clusters 3 and 4 have the highest churn risk, while Clusters 1 and 2 show stronger loyalty.
Overall churn rate is about 27%.
About 95% of churners fall within the top 40% high-risk customers.

🎯 Recommendations

Focus retention campaigns on the top 40% of high-risk customers, potentially retaining up to 95% of churners.
Promote and incentivize 12-month contracts to reduce churn among short-term members.
Provide proactive engagement and personalized offers for Clusters 3 and 4.
Introduce structured loyalty programs, including rewards and referral incentives.
Develop initiatives targeting younger customers, such as flexible plans or youth-focused promotions.
Strengthen or expand partner company relationships to increase the base of low-churn customers.
Encourage friend-referral or promo-friend campaigns to reduce churn among new members.

Software and Tools

📌 Project: A/A/B Test to Inform Business Decisions

Investigated user behavior for a company’s app, and conducted an A/A/B test to assist managers to make an informed business decision.

Project Link

User Distribution by Group

All groups were present at all times for the test.

User Behaviour

The funnel shows stages of customers’ behavior on the app. The group sizes at each stage indicate the data was split approximately equally.

📊 General Findings

In preparing the data for analysis, missing values were checked, and about 0.17% of duplicates were found in the data and deleted.
I ensured no participant belong to more than one group, and columns were converted to the required data types.
There were 7551 unique users in the logs, 5 types of events, and 243713 events in the logs.
The data spanned a period of about two weeks. The minimum date was 2019-07-25 00:00:00, and the maximum date was 2019-08-07 00:00:00.
The data was comparatively incomplete from 2019-07-25 to 2019-07-31. From 2019-08-01, the data was comparatively complete. Hence, I chose to keep data from the period 2019-08-01 and ignored the earlier section, i.e. from 2019-07-25 to 2019-07-31.
The event funnel was studied. Users initially visit the main screen, followed by an offer screen, then a cart screen is offered, and payment is made. At the last stage, an optional tutorial is provided. Ignoring this order, and following the order based on how the data is sorted by the number of users and its’ importance; the stage with the most users is lost, which is from the main screen stage to the offer screen (about 38%).
The share of users that make the entire journey from their first event to payment is about 47% (high conversion rate). This can be boosted by:
- Conducting an A/B test for each event to reveal the stage of the event where conversion of the service can significantly be enhanced.
- Using a conversion rate optimization (CRO) planner, shortening forms, increasing trust and removing friction, etc. For instance, the conversion rate at the tutorial stage is not encouraging and should be considered for removal as it is contributing to a lower conversion.
There is no statistically significant difference between groups A1 and A2 which implies the groups were split properly.
There is also no statistically significant difference between groups A and B which implies the test was not successful.

🎯 Recommendation

There is no difference between the groups. Hence, do not change the fonts for the entire app.

Software and Tools

📌 Project: Business Review of Markets Across the World Economy

📌 Project: Business Metrics of Yandex Afisha

As a Junior Data Analyst in the analytical department at Yandex. I analyzed the business metrics of the Yandex Afisha app to help the marketing experts optimize marketing expenses.

The highest number of visits to the Yandex Afesha app was on Black Friday (24.11.2017). March 31, 2018, was a popular holiday plus observances Worldwide - a holiday can adversely impact visits to Yandex Afisha but black friday stimulated visits.

User Retention by Cohort

The June 2017 cohort had the highest retention rate as of month 11. By the first month (month 1), all cohorts had retention rates of less than 10%.

Lifetime Value (LTV) Cohort Analysis

The June 2017 cohort had the longest duration of LTV; contributed the longest time. However, the September 2017 cohort had the highest LTV. June 2018 cohort had the least LTV.

Customer Acquisition Cost (CAC) Cohort Analysis

CAC per cohort shows uniform but d for each cohort. The August 2017 cohort had the highest cost in a given month while the May 2018 cohort had the least.

Return on Marketing Investment (ROMI) Cohort Analysis

The September 2017 cohort had the highest return on investments, followed by the June 2017 cohort. May 2018 cohort had the lowest return on investments. No cohort has recouped 100% of investments.

📊 General Findings

On average, about 907 people use the Yandex Afisha app every day, about 5621 people use it every week, and about 23228 people use it every month.
The highest daily visits occur on a Black Friday and the lowest daily visits occur on a holiday
On average, there is about 1 session per day, and each session lasts about 60 seconds.
By the first month (month 1), all cohorts had retention rates of less than 10%. None of the May 2018 cohort came back after their first visit. Only June 2017 cohort was retained till month 11.
The December 2017 users made about 4400 orders (the highest), this is followed by October 2017, and November 2017. Users in the months of June, July, and August have the least number of orders.
On average, people start buying within 0 minutes (immediately), and the average purchase size is about 5.00 dollars.
The June 2017 cohort contributed the longest time but the September 2017 cohort had the highest LTV. May and June 2018 cohorts had the least LTVs.
CAC per month/cohorts shows uniform but different costs for each cohort. The August 2017 cohort had the highest cost in a given month while the May 2018 cohort had the least.
Users of platforms/source 1 and 2 bring in the highest revenue, and users of platform 7 bring in the least.
Platform 3 has the highest cost but it is amongst the least revenue generators.
Platforms 1 and 2 bring in the highest revenue, and amongst the least in cost. They are the most profitable platforms.
The investments in all the sources are not yet worthwhile as the highest (source 1) is yet to recoup 100% of the investment. Also, investments by cohorts are not yet worthwhile.

🎯 Recommendations

Invest more in sources 1 and 2, and cut costs on platform 3; without platform 3, revenues would exceed cost.
Introduce strategies to boost retention rate; improve user experience with the app.

Software and Tools

📌 Project: Product Range Analysis

As a junior analyst at an online store that sells household goods, I analyzed the store’s product range for the period 29/11/2018 to 07/12/2019.

Project Link

Product Categorization Model

A near-perfect model was built to categorize the products.

Products in Additional Assortment

About 99% of the products were sold together with others.

📊 General Findings

The highest unit price of a product cost £38,970.00.
The unit price had a mean of about £4.60.
Invoice number 573585 had the highest number of products ordered (1113 products). The top ten invoices show the customers of the store are mostly wholesalers.
Kitchenware is the most frequently purchased category, and plant and accessories are the least frequently purchased category.
The highest daily orders were on November 30th, 2018, followed by November 15th, 2019 (141 and 136 orders respectively). The lowest daily order was on 4th February 2019 - just 11 orders.
The number of total monthly orders from December 2018 to November 2019 increased by about 121%.
Revenues are comparatively lower from January to July and higher from August to November.
Regency Cakestand 3 tier and paper craft little birdie are the top two products in terms of revenue generation. Regency Cakestand 3 tier generated revenue amounting to about £174,200.00 - the highest.
The most canceled product order is Regency Cakestand 3 tier - canceled 180 times.
On average, paper craft little birdie generated the highest revenue - about £168,469.00.
On average, Kitchen ware generated the least revenue - about £18 while home decorations generated the highest- about £23.
There were 1501 products that were sold by themselves. The rabbit night light was sold alone 32 times- the most sold alone product.
About 99.7% of the products were sold together with others - additional assortment.
Jumbo bag and pink polka dot and jumbo bag red retro spot were the products most often sold together.
White hanging heart t-light holder was the product sold the most with others - about 2300 times. Kitchenware was most often included in additional assortment (about 152,610 times) and plant and accessories were the least - about 15,840 times.
Event and party category was mostly present in shopping carts with Kitchenware.
The difference between average revenue from home decorations and Kitchenware was statistically significant.
The average revenue generated by papercraft little birdie is not statistically and significantly different from the average revenue from all other products.

🎯 Recommendations

Since about 99.7% of products are included in the additional assortment, there is a need to create a product recommendation system.
Home decorations is the third most purchased category but has the highest average revenue. Hence, invest more in advertising home decorations to boost purchase rates and revenue.
As plant and accessories are the least frequently purchased category, increase advertising investment to enhance orders.
Regency Cakestand 3 tier is the leading revenue generator on aggregate but the most canceled product order. Pay much attention to this product. For instance, why does it often get canceled? If the cancellation rate is minimized, revenue would be maximized.

Software and Tools

📌 Project: A/B Test for an International Online Store

I have received an analytical task from an international online store. I have to launch an A/B test and give insights into changes related to the introduction of an improved recommendation system.

Project Link

Customer Journey

Revenue From Each Group

Cumulative revenue from Group A exceeds Group B

📊 General Findings

EU participants dominated all other regions in the sample.
The maximum order total for purchase events is about $500.00, and the minimum is about $5.00, with a mean of about $23.88 and a standard deviation of about 72.22.
About 66% of users proceed from login to the product page; about 50% of those at the product page proceed to purchase.
About 15% of customers proceed to purchase without putting the product in the cart.
About 33.33% of users convert.
In the initial test by my predecessor, 441 users belonged to both groups, and 446 users participated in both tests.
The highest number of events occurred on the 21st of December 2020 (14044 events).
8961 participants; representing about 23% of the new users on or before 21st December 2020, from the EU region were used for the test.
There is a statistically significant difference between groups A and B which implies the test was successful.

🎯 Recommendations

There is a significant difference between the groups. However, group A significantly exceeds Group B in number of customers and revenue.
Thus, there will be a reduction in purchases with the introduction of the improved recommendation system; do not introduce the recommendation system.

Software and Tools

📌 Project: Predicting Credit Card Approvals

Built an automatic credit card approval predictor using machine learning techniques.

Project Link

Decile Analysis

The decile analysis shows that the top 10% of customers have about an 80% probability that their credit cards would be approved. Customers in deciles 1-6 have more than a 50% chance that their credit cards would be approved. Customers in deciles 7-10 have less than a 50 % chance of getting their credit cards approved.

📊 General Findings

More than 50% of credit card applications get approved.
Males apply for credit cards more than females.
Most credit card applicants are between the ages of 20 and 40. This age group also has more credit card approvals than the other age groups.
Applicants aged above 60 years are the least to apply for credit cards and the ones who are less likely to have their credit cards approved.
The smaller a customer’s debt, the higher the chances of a credit card approval.
The model has about 88% accuracy in predicting credit card approval.
Top 10% of customers have about 80% probability that their credit cards would be approved.
Customers in deciles 1-6 have more than a 50% chance that their credit cards would be approved.
Customers in deciles 7-10 have less than a 50 % chance of getting their credit cards approved.

🎯 Recommendations

Approve credit cards of customers in deciles 1-6.
Do not approve the credit cards of customers in deciles 7-10.

Software and Tools

📌 Project: Video Games Sales Analysis

I analyzed video game sales data to identify patterns that determine whether a game succeeds or not.

Project Link

Number of Games Released in a Year

The number of games released in a year peaked in 2008 and significantly started falling in 2010.

Profitable and Non-profitable Platforms.

The PS2 platform is the most profitable platform, the PCFX platform is the least non-profitable platform.

📊 General Findings

Before 1994, there was no year that more than 100 games were released. However, from 1994 to 2016, more than 100 games have been released every year.
PS, PS2, and Nintendo DS used to be popular platforms but now have zero sales.
Generally, it takes about a year or less for new platforms to appear. On average, old platforms take about 8 years to fade.
Atari 2600 platform existed for the longest number of years (35 years)
Since 2010, the most profitable platforms are PS3, 3DS, DS, PC, PS4, Wii, X360, and XOne. In sum total, PS3 and X360 are leading in sales.
In 2016, all platforms had experienced their peak sales and started shrinking, but PS4, XOne, and 3DS were leading in sales.
Since 2010, average sales on various platforms differed but not significantly different especially among the highest performing platforms as average sales were all below $1 million. PS4 had the highest average sales of about $0.80 million, and the GBA platform had about $0.05 million in sales (the lowest).
There is a positive correlation between reviews of platforms and the sale of games.
Generally, since 2010, Action genres have been the most profitable considering the sum of total yearly sales. However, on average, the most profitable genre is Shooter. The least profitable game genre is Puzzle.
PS3 and X360 platforms dominate sales in Europe and North American markets.
Nintendo 3DS and PS3 dominate market share in Japan.
The PS3 platform is at least among the top two dominant market shares of games across Europe, North America, and Japan.
Action and Shooter genres are dominating the market share of sales in Europe and North America while Role Playing and Action dominate in Japan.
Action genre is at least among the top two dominating genres across the three regions.
ESRB ratings affect sales in individual regions.
Average user ratings of the Xbox One and PC platforms are the same.
Average user ratings for the Action and Sports genres are different.

📌 Project: Customer Churn Prediction and Retention Strategy

Cluster of Customers

Churn Prediction

📊 General Findings

🎯 Recommendations

Software and Tools

📌 Project: A/A/B Test to Inform Business Decisions

User Distribution by Group

User Behaviour

📊 General Findings

🎯 Recommendation

Software and Tools

📌 Project: Business Review of Markets Across the World Economy

Software and Tools

📌 Project: Business Metrics of Yandex Afisha

Daily Visits to Yandex Afisha

User Retention by Cohort

Lifetime Value (LTV) Cohort Analysis

Customer Acquisition Cost (CAC) Cohort Analysis

Return on Marketing Investment (ROMI) Cohort Analysis

📊 General Findings

🎯 Recommendations

Software and Tools

📌 Project: Product Range Analysis

Product Categorization Model

Products in Additional Assortment

📊 General Findings

🎯 Recommendations

Software and Tools

📌 Project: A/B Test for an International Online Store

Customer Journey

Revenue From Each Group

📊 General Findings

🎯 Recommendations

Software and Tools

📌 Project: Predicting Credit Card Approvals

Decile Analysis

📊 General Findings

🎯 Recommendations

Software and Tools

📌 Project: Video Games Sales Analysis

Number of Games Released in a Year

Profitable and Non-profitable Platforms.

📊 General Findings

Software and Tools