Active User Retention: Facebook SQL Interview Question

For companies like Facebook, one of the most important metrics by which one can measure user engagement and retention in a social media site is through its Monthly Active Users. It refers to users who have taken specified actions such as ‘sign-in’, ‘like’, or ‘comment’, in both the current and previous month. 

This blog post will walk you through a set of SQL interview questions that can help measure and understand user engagement, using a Facebook dataset.

The Scenario

We have a table named user_actions which logs various actions performed by users on the platform. 

The table has the following columns:

  • user_id (integer)
  • event_id (integer)
  • event_type (string: “sign-in”, “like”, “comment”)
  • event_date (datetime)
  • month (integer: numerical format of the month)

Using this table, let’s explore three key SQL interview questions that can help in analyzing user engagement.

Analyzing User Engagement: Facebook SQL Interview Questions

  1. Calculate Monthly Active Users (MAUs)

  2. Find Users with the Most Actions in a Given Month

  3. Calculate the Average Number of Actions Per User Per Month

First, let’s create the user_actions table with the necessary columns:

CREATE TABLE user_actions (
    user_id INTEGER,
    event_id INTEGER,
    event_type VARCHAR(50), -- assuming a reasonable length for event_type
    event_date DATETIME,
    month INTEGER
);

Inserting Sample Data

INSERT INTO user_actions (user_id, event_id, event_type, event_date, month) VALUES
(445, 7765, 'sign-in', '2022-05-31 12:00:00', 7),
(445, 3634, 'like', '2022-06-05 12:00:00', 7),
(648, 3124, 'like', '2022-06-18 12:00:00', 7),
(648, 2725, 'sign-in', '2022-06-22 12:00:00', 7),
(648, 8568, 'comment', '2022-07-03 12:00:00', 7),
(445, 4363, 'sign-in', '2022-07-05 12:00:00', 7),
(445, 2425, 'like', '2022-07-06 12:00:00', 7),
(445, 2484, 'like', '2022-07-22 12:00:00', 7),
(648, 1423, 'sign-in', '2022-07-26 12:00:00', 7),
(445, 5235, 'comment', '2022-07-29 12:00:00', 7),
(742, 6458, 'sign-in', '2022-07-03 12:00:00', 7),
(742, 1374, 'comment', '2022-07-19 12:00:00', 7);

User Engagement Analysis using SQL

1. Calculate Monthly Active Users (MAUs)

Write a query to find the number of Monthly Active Users (MAUs) in July 2022. An active user is someone who has performed actions like ‘sign-in’, ‘like’, or ‘comment’ in both June and July 2022.

WITH june_actions AS (
    SELECT DISTINCT user_id
    FROM user_actions
    WHERE event_date >= '2022-06-01' AND event_date < '2022-07-01'
),
july_actions AS (
    SELECT DISTINCT user_id
    FROM user_actions
    WHERE event_date >= '2022-07-01' AND event_date < '2022-08-01'
)
SELECT 7 AS month, COUNT(DISTINCT july_actions.user_id) AS MAUs
FROM july_actions
JOIN june_actions
ON july_actions.user_id = june_actions.user_id;
  Detailed Explanation:
  • Identify June Actions: The first part of the query creates a list of users (june_actions) who performed any action in June 2022.
  • Identify July Actions: The second part creates a list of users (july_actions) who performed any action in July 2022.
  • Find Overlapping Users: The final query joins these two lists to find users who are in both june_actions and july_actions. These are the users who were active in both June and July.
  • Count Active Users: The query then counts these users to determine the number of Monthly Active Users (MAUs) for July 2022.

2. Find Users with the Most Actions in a Given Month

Write a query to find the user(s) with the highest number of actions (events) in July 2022.

SELECT user_id, COUNT(*) AS action_count
FROM user_actions
WHERE event_date >= '2022-07-01' AND event_date < '2022-08-01'
GROUP BY user_id
ORDER BY action_count DESC
LIMIT 1;
  Detailed Explanation:
  • Filter July Data: The query starts by selecting actions that occurred in July 2022.
  • Count Actions per User: It counts the number of actions each user performed in this period.
  • Group by User ID: The results are grouped by user_id to ensure we get a count for each user.
  • Order by Action Count: The query sorts the results in descending order based on the number of actions.
  • Return Top User: The LIMIT 1 clause ensures that only the user with the most actions is returned.

3. Calculate the Average Number of Actions Per User Per Month

Write a query to find the average number of actions each user performs per month in 2022.

SELECT
    MONTH(event_date) AS month,
    AVG(action_count) AS avg_actions_per_user
FROM (
    SELECT user_id, COUNT(*) AS action_count, MONTH(event_date) AS month
    FROM user_actions
    WHERE YEAR(event_date) = 2022
    GROUP BY user_id, MONTH(event_date)
) AS user_monthly_actions
GROUP BY month
ORDER BY month;
  Detailed Explanation:
  • Subquery to Count Actions: The inner subquery counts the number of actions each user performs per month. It groups the results by user_id and month to get these counts.
  • Filter by Year: The subquery filters actions to include only those from 2022.
  • Calculate Averages: The outer query calculates the average number of actions per user for each month by averaging the action counts from the subquery.
  • Group by Month: The results are grouped by month to get the monthly averages.
  • Order by Month: The query sorts the results by month to present them in chronological order.

Summary

Analyzing user engagement with SQL is an essential skill for data analysts and engineers. The questions we discussed help measure how active users are and how many keep coming back, which are important for any platform. By learning these queries, you can understand user behavior better and make decisions based on data to improve user experience and engagement. Knowing these basic SQL concepts can greatly improve your ability to analyze and understand user data.

I hope you liked this article on Analyzing User Engagement using SQL. Feel free to ask valuable questions in the comments section below.

Kaggle Master & Senior Data Scientist ( Ambitious, Adventurous, Attentive)

One Response

Leave a Reply

Your email address will not be published. Required fields are marked *

Share This Post
Latest Post
7 Powerful Steps to Build Successful Data Analytics Project
Master Real-Time Image Resizing with Python: A 3-Step Guide
Mastering SQL for Average Product Ratings: Amazon Interview Question
SQL Query to find Top 3 Departments with the Highest Average Salary
Master SQL Today: 10 Powerful Hands-On Beginner’s Guide
Linear Regression: A Comprehensive Guide with 7 Key Insights
Meta LLAMA 3.1 – The Most Capable Large Language Model
Understanding Dispersion of Data: 4 Key Metrics Explained
Find Employees Who Joined on Same Date & Month: 3 Powerful SQL Methods
Ultimate Guide to Mastering Prompt Engineering Techniques – Part 2

One Response

Leave a Reply

Your email address will not be published. Required fields are marked *