For companies like Facebook, one of the most important metrics by which one can measure user engagement and retention in a social media site is through its Monthly Active Users. It refers to users who have taken specified actions such as ‘sign-in’, ‘like’, or ‘comment’, in both the current and previous month.
This blog post will walk you through a set of SQL interview questions that can help measure and understand user engagement, using a Facebook dataset.
Table of Contents
ToggleThe Scenario
We have a table named user_actions which logs various actions performed by users on the platform.
The table has the following columns:
- user_id (integer)
- event_id (integer)
- event_type (string: “sign-in”, “like”, “comment”)
- event_date (datetime)
- month (integer: numerical format of the month)
Using this table, let’s explore three key SQL interview questions that can help in analyzing user engagement.
Analyzing User Engagement: Facebook SQL Interview Questions
-
Calculate Monthly Active Users (MAUs)
-
Find Users with the Most Actions in a Given Month
-
Calculate the Average Number of Actions Per User Per Month
First, let’s create the user_actions table with the necessary columns:
CREATE TABLE user_actions ( user_id INTEGER, event_id INTEGER, event_type VARCHAR(50), -- assuming a reasonable length for event_type event_date DATETIME, month INTEGER );
Inserting Sample Data
INSERT INTO user_actions (user_id, event_id, event_type, event_date, month) VALUES (445, 7765, 'sign-in', '2022-05-31 12:00:00', 7), (445, 3634, 'like', '2022-06-05 12:00:00', 7), (648, 3124, 'like', '2022-06-18 12:00:00', 7), (648, 2725, 'sign-in', '2022-06-22 12:00:00', 7), (648, 8568, 'comment', '2022-07-03 12:00:00', 7), (445, 4363, 'sign-in', '2022-07-05 12:00:00', 7), (445, 2425, 'like', '2022-07-06 12:00:00', 7), (445, 2484, 'like', '2022-07-22 12:00:00', 7), (648, 1423, 'sign-in', '2022-07-26 12:00:00', 7), (445, 5235, 'comment', '2022-07-29 12:00:00', 7), (742, 6458, 'sign-in', '2022-07-03 12:00:00', 7), (742, 1374, 'comment', '2022-07-19 12:00:00', 7);
User Engagement Analysis using SQL
1. Calculate Monthly Active Users (MAUs)
Write a query to find the number of Monthly Active Users (MAUs) in July 2022. An active user is someone who has performed actions like ‘sign-in’, ‘like’, or ‘comment’ in both June and July 2022.
WITH june_actions AS ( SELECT DISTINCT user_id FROM user_actions WHERE event_date >= '2022-06-01' AND event_date < '2022-07-01' ), july_actions AS ( SELECT DISTINCT user_id FROM user_actions WHERE event_date >= '2022-07-01' AND event_date < '2022-08-01' ) SELECT 7 AS month, COUNT(DISTINCT july_actions.user_id) AS MAUs FROM july_actions JOIN june_actions ON july_actions.user_id = june_actions.user_id;
Detailed Explanation:
- Identify June Actions: The first part of the query creates a list of users (june_actions) who performed any action in June 2022.
- Identify July Actions: The second part creates a list of users (july_actions) who performed any action in July 2022.
- Find Overlapping Users: The final query joins these two lists to find users who are in both june_actions and july_actions. These are the users who were active in both June and July.
- Count Active Users: The query then counts these users to determine the number of Monthly Active Users (MAUs) for July 2022.
2. Find Users with the Most Actions in a Given Month
Write a query to find the user(s) with the highest number of actions (events) in July 2022.
SELECT user_id, COUNT(*) AS action_count FROM user_actions WHERE event_date >= '2022-07-01' AND event_date < '2022-08-01' GROUP BY user_id ORDER BY action_count DESC LIMIT 1;
Detailed Explanation:
- Filter July Data: The query starts by selecting actions that occurred in July 2022.
- Count Actions per User: It counts the number of actions each user performed in this period.
- Group by User ID: The results are grouped by user_id to ensure we get a count for each user.
- Order by Action Count: The query sorts the results in descending order based on the number of actions.
- Return Top User: The LIMIT 1 clause ensures that only the user with the most actions is returned.
3. Calculate the Average Number of Actions Per User Per Month
Write a query to find the average number of actions each user performs per month in 2022.
SELECT MONTH(event_date) AS month, AVG(action_count) AS avg_actions_per_user FROM ( SELECT user_id, COUNT(*) AS action_count, MONTH(event_date) AS month FROM user_actions WHERE YEAR(event_date) = 2022 GROUP BY user_id, MONTH(event_date) ) AS user_monthly_actions GROUP BY month ORDER BY month;
Detailed Explanation:
- Subquery to Count Actions: The inner subquery counts the number of actions each user performs per month. It groups the results by user_id and month to get these counts.
- Filter by Year: The subquery filters actions to include only those from 2022.
- Calculate Averages: The outer query calculates the average number of actions per user for each month by averaging the action counts from the subquery.
- Group by Month: The results are grouped by month to get the monthly averages.
- Order by Month: The query sorts the results by month to present them in chronological order.
Summary
Analyzing user engagement with SQL is an essential skill for data analysts and engineers. The questions we discussed help measure how active users are and how many keep coming back, which are important for any platform. By learning these queries, you can understand user behavior better and make decisions based on data to improve user experience and engagement. Knowing these basic SQL concepts can greatly improve your ability to analyze and understand user data.
I hope you liked this article on Analyzing User Engagement using SQL. Feel free to ask valuable questions in the comments section below.
One Response
Great , thanks for sharing