What is it like working as an Analytics Engineer?

Analytics Engineer refers to a Data Science professional focused on transforming data into information that is easy to access to the end-user. They provide static and dynamic reports that empower the business team without them needing to think about the complexity behind data analysis.

In this case study, I want to talk about what would be common tasks that an Analytics Engineer would need to perform and how I’d navigate them.

In this scenario, the Analytics Engineer works for Bankio a digital bank from Brazil. Like most digital banks in Brazil, Bankio offers free transfers for every bank account in the country. It also has many products like an investment account, a savings account, an individual bank account, a credit card without an annual fee, and many more.

Task 1: SQL Query

A Bussiness Analyst from Bankio asks for your assistance writing a SQL query to get all the account’s monthly balance between January 2020 to December 2020.

SQL Query Solution (click to expand)
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
SELECT
  a.*,
  -- Here I calculate the cumulative sum of total deposits for each customer ordering it 
  -- by month, if its null I change the value to 0 then I subtract the cumulative sum of
  -- total withdrawals for each customer ordering it by month, if its null I change the 
  -- value to 0 and I save this as ours account_monthly_balance
  NVL(SUM(total_transfer_in) OVER (PARTITION BY customer_id 
ORDER BY
  action_month), 0) - NVL(SUM(total_transfer_out) OVER (PARTITION BY customer_id 
ORDER BY
  action_month), 0) AS account_monthly_balance 
FROM
  -- total transactions in/out subquery
( 
  SELECT
    * 
  FROM
    -- total deposits subquery
( 
    SELECT
      action_month, customer_id, SUM(amount) AS total_transfer_in 
    FROM
      (
        SELECT
          * 
        FROM
          -- regular deposits subquery
( 
          SELECT
            d_month.action_month, accounts.customer_id, SUM(transfer_ins.amount) AS amount 
          FROM
            d_time 
            INNER JOIN
              transfer_ins 
              ON transfer_ins.transaction_completed_at = d_time.time_id 
            LEFT JOIN
              d_month USING(month_id) 
            LEFT JOIN
              accounts USING(account_id) 
          WHERE
            transfer_ins.status = 'completed' 
          GROUP BY
            d_month.action_month, accounts.customer_id ) transfer_in 
          UNION ALL
          SELECT
            * 
          FROM
            -- pix deposits subquery
( 
            SELECT
              d_month.action_month, accounts.customer_id, SUM(pix_movements.pix_amount) AS amount 
            FROM
              d_time 
              INNER JOIN
                pix_movements 
                ON pix_movements.pix_completed_at = d_time.time_id 
              LEFT JOIN
                d_month USING(month_id) 
              LEFT JOIN
                accounts USING(account_id) 
            WHERE
              pix_movements.status = 'completed' 
              AND pix_movements.in_or_out = 'pix_in' 
            GROUP BY
              d_month.action_month, accounts.customer_id ) 
      )
    GROUP BY
      action_month, customer_id ) 
      FULL JOIN
        (
          SELECT
            action_month,
            customer_id,
            SUM(amount) AS total_transfer_out 
          FROM
            -- total withdrawals subquery
( 
            SELECT
              * 
            FROM
              -- regular withdrawal subquery
( 
              SELECT
                d_month.action_month, accounts.customer_id, SUM(transfer_outs.amount) AS amount 
              FROM
                d_time 
                INNER JOIN
                  transfer_outs 
                  ON transfer_outs.transaction_completed_at = d_time.time_id 
                LEFT JOIN
                  d_month USING(month_id) 
                LEFT JOIN
                  accounts USING(account_id) 
              WHERE
                transfer_outs.status = 'completed' 
              GROUP BY
                d_month.action_month, accounts.customer_id ) 
              UNION ALL
              SELECT
                * 
              FROM
                -- pix withdrawal subquery
( 
                SELECT
                  d_month.action_month, accounts.customer_id, SUM(pix_movements.pix_amount) AS amount 
                FROM
                  d_time 
                  INNER JOIN
                    pix_movements 
                    ON pix_movements.pix_completed_at = d_time.time_id 
                  LEFT JOIN
                    d_month USING(month_id) 
                  LEFT JOIN
                    accounts USING(account_id) 
                WHERE
                  pix_movements.status = 'completed' 
                  AND pix_movements.in_or_out = 'pix_out' 
                GROUP BY
                  d_month.action_month, accounts.customer_id ) ) 
                GROUP BY
                  action_month,
                  customer_id 
        )
        USING (action_month, customer_id) ) a;

Task 2: Key Performance Indicators

computer screen with 8 rectangles filled with key indicators

Photo by Stephen Dawson on Unsplash

Another collegue from Bankio is interested in analysing the success of the company product PIX on a business and technical level. So they asked you to come up with some key indicators to measure this.

Mean processing time of PIX transactions (click to expand) This can be obtained using the time a customer requests a PIX transaction and when it is completed, we calculate this for all the PIX transactions then we take the mean value. PIX is supposed to be instantaneous, so this metric should be as small as possible.

The proportion of PIX failures (click to expand) This indicator is important because it’s inconvenient for the client to have their transaction fail. We can calculate this by dividing the sum of failed PIX transactions by the total PIX transactions. This measure should be minimized.

The proportion of transactions using PIX (click to expand)

PIX success can be measured by the proportion of transaction movements using PIX over normal transactions. So just count how many completed transactions using PIX were made and divide it by the total amount of completed transactions. A bigger measurement reflects PIX success over regular transactions.

Alternatively instead of just counting the transactions we can evaluate how much money each transaction type is moving.

The proportion of in/out of PIX (click to expand)

This measure is good to analyze if customers are using their PIX to receive more money or to send money. It would be better if more customers are receiving more money than they are sending. Because Bankio already had free transactions for any bank, before PIX came around, others still had to pay fees to send money to your Bankio account. For this reason, it is better to count how many transactions are coming in through PIX and divide it by all PIX transactions. The higher the better.

In this case we could also sum the balance of deposit and withdrawals from the Bankio account using PIX and compare it with regular transactions.

Task 3: Daily Investment Return

Bankio has a customer banking account that allows you to invest on a fixed rate income product. Consider that this product provides customers with a daily return of 0.01% according to their daily invested balance amount. Calculate how much each customer has on their bankio account during the year 2020.

This return is calculated daily after all withdrawals and/or deposits made on a given day. And every day, even weekends, generate some return.

The following example describes customer A who begins investing in this fixed income product on day 16 of the first month. The prior balance was zero since this consumer is making a first-time deposit into the investment. His initial deposit was 1,000, and at the end of the day, it produced a daily income rate of 0.01% of his balance. The same product is still being consumed by this client at different times throughout the month. Remember that this is just a dummy sample of the transaction log with daily calculations applied. Note that the income for that day should be set to zero in the event of negative Movements.

DayMonthAccount IDDepositWithdrawalEnd of Day IncomeAccount Daily Balance
161A100000.11000.10
201A50000.151500.55
22A02000.131302.48
192A10002000.212104.78

Movements = Previous Day Balance + Deposit - Withdrawal

End of Day Income = Movements * Income Rate

Account Daily Balance = Movements + End of Day Income


Glossary

Account Monthly Balance

It is the amount of money a customer had in their account at the end of a given month.

Account info (branch, number and check-digit)

In Brazil, a bank account can be uniquely identifiable by three numbers. The branch code, which indicates which bank branch the accounts were opened in, comes first. The second is the account number that a branch uses to identify accounts. The check-digit, which is only used for error detection, is the last.

CPF

It is the Brazilian individual taxpayer registry identification.

PIX

In Brazil, this is the most recent method of money transmission. It’s unpaid. It is immediate, and all that is required to complete a transaction is the Pix-Key associated with the account.

Non PIX transfers

These are the conventional methods for transferring money between bank accounts. This type of transaction requires the CPF, the branch code, the account number, and the check digit of the account that will receive the funds to be provided. Most banks charge a fee in these transactions, and the confirmation of the transaction typically takes several hours to days.