I have a table of call data and want to count the unique accounts called each day, as a cumulative sum per month - SQL

I have a table with two columns: one holds the account number and the other the date. Sample data is given below.
Date account
9/8/2020 555
9/8/2020 666
9/8/2020 777
9/8/2020 888
9/9/2020 555
9/9/2020 999
9/10/2020 555
9/10/2020 222
9/10/2020 333
9/11/2020 666
9/11/2020 111
I would like to calculate the number of unique accounts called every day and accumulate it over the month. For example, if account number 555 is called on 8 Sept, 9 Sept and 20 Sept, it should be counted only once in the cumulative sum. The result should look like this:
date Cumulative Unique Accounts Called So Far This Month
9/8/2020 4
9/9/2020 5
9/10/2020 7
9/11/2020 8
Thank you in advance for your help.

You can do this with aggregation and window functions. First, get the first date for each account, then aggregate and accumulate:
select min_date,
       count(*) as as_of_date,
       sum(count(*)) over (partition by year(min_date), month(min_date)
                           order by min_date
                          ) as cumulative_unique_count
from (select account, min(date) as min_date
      from t
      group by account, year(date), month(date)
     ) t
group by min_date;
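To check this approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (for window functions), dates stored as ISO strings, and `strftime('%Y-%m', ...)` standing in for `year()`/`month()`, which SQLite lacks. The table name `t` follows the answer; the column name `new_accounts` and the two-step CTE layout are choices made for this sketch only.

```python
import sqlite3

# Sample data from the question, with dates rewritten as ISO strings (an assumption).
rows = [
    ("2020-09-08", 555), ("2020-09-08", 666), ("2020-09-08", 777), ("2020-09-08", 888),
    ("2020-09-09", 555), ("2020-09-09", 999),
    ("2020-09-10", 555), ("2020-09-10", 222), ("2020-09-10", 333),
    ("2020-09-11", 666), ("2020-09-11", 111),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, account INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

query = """
WITH first_calls AS (      -- first call date of each account within each month
    SELECT account, MIN(date) AS min_date
    FROM t
    GROUP BY account, strftime('%Y-%m', date)
),
daily AS (                 -- accounts seen for the first time that month, per day
    SELECT min_date, COUNT(*) AS new_accounts
    FROM first_calls
    GROUP BY min_date
)
SELECT min_date,
       SUM(new_accounts) OVER (PARTITION BY strftime('%Y-%m', min_date)
                               ORDER BY min_date) AS cumulative_unique_count
FROM daily
ORDER BY min_date
"""
cumulative = conn.execute(query).fetchall()
for day, total in cumulative:
    print(day, total)
```

On the sample data this yields 4, 5, 7, 8 for 8 to 11 September, matching the desired output. A day on which no account is called for the first time that month would not appear in the result at all.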

You can try the below -
with cte as
(
    select date, count(*) as total from
    (
        select date, account, row_number() over(partition by account order by date) as rn
        from tablename
    ) A where rn = 1 group by date
)
select date, sum(total) over(order by date) as cum_sum
from cte
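The ROW_NUMBER() idea can also be made to restart each month by adding the month to both PARTITION BY clauses. The sketch below is one way to try that with Python's sqlite3 module; the month partitioning, the table name `calls`, and the three October rows are additions for illustration and are not part of the answer or the question's data. It assumes SQLite 3.25 or later and ISO date strings.

```python
import sqlite3

# Question's sample data plus three hypothetical October rows to show the monthly reset.
rows = [
    ("2020-09-08", 555), ("2020-09-08", 666), ("2020-09-08", 777), ("2020-09-08", 888),
    ("2020-09-09", 555), ("2020-09-09", 999),
    ("2020-09-10", 555), ("2020-09-10", 222), ("2020-09-10", 333),
    ("2020-09-11", 666), ("2020-09-11", 111),
    ("2020-10-01", 555), ("2020-10-02", 555), ("2020-10-02", 666),  # hypothetical
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (date TEXT, account INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?)", rows)

query = """
WITH ranked AS (           -- rn = 1 marks an account's first call in a given month
    SELECT date, account,
           ROW_NUMBER() OVER (PARTITION BY account, strftime('%Y-%m', date)
                              ORDER BY date) AS rn
    FROM calls
),
daily AS (
    SELECT date, COUNT(*) AS total
    FROM ranked
    WHERE rn = 1
    GROUP BY date
)
SELECT date,
       SUM(total) OVER (PARTITION BY strftime('%Y-%m', date)
                        ORDER BY date) AS cum_sum
FROM daily
ORDER BY date
"""
monthly_running = conn.execute(query).fetchall()
for day, total in monthly_running:
    print(day, total)
```

With these rows the running count reaches 8 on 11 September and starts again at 1 on 1 October.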

Related

Counting unique combinations up until a date - per month

I am looking into a table with transaction data of a two-sided platform, where you have buyers and sellers. I want to know the total amount of unique combinations of buyers and sellers. Let's say, Abe buys from Brandon in January, that's 1 combination. If Abe buys with Cece in February, that makes 2, but if Abe then buys from Brandon again, it's still 2.
My solution was to use the DENSE_RANK() function:
WITH
combos AS (
SELECT
t.buyerid, t.sellerid,
DENSE_RANK() OVER (ORDER BY t.buyerid, t.sellerid) AS combinations
FROM transactions t
WHERE t.transaction_date < '2018-05-01'
)
SELECT
MAX(combinations) AS total_combinations
FROM combos
This works fine. Each new combo gets a higher rank, and if you select the MAX of that result, you know the amount of unique combos.
However, I want to know this total amount of unique combos on a per month basis. The problem here is that if I group per transaction month, it only counts the unique combos in that month. In the example of Abe, it would be a unique combo in January, and then another combo in the next month, because that's how grouping works in SQL.
Example:
transaction_date buyerid sellerid
2018-01-03 3828 219
2018-01-08 2831 123
2018-02-10 3828 219
The output of DENSE_RANK() named combinations over all these rows is:
transaction_date buyerid sellerid combinations
2018-01-03 3828 219 1
2018-01-08 2831 123 2
2018-02-10 3828 219 2
And therefore, when selecting the MAX of combinations, you know the number of unique buyer/seller combos, which is 2 here.
However, I would like to see a running total of unique combos up until each start of the month, for all months until now. But, when we group on month, it would go like this:
transaction_date buyerid sellerid month combinations
2018-01-03 3828 219 jan 1
2018-01-08 2831 123 jan 2
2018-02-10 3828 219 feb 1
While I actually would want an output like:
month total_combinations_at_month_start
jan 0
feb 2
mar 2
How should I solve this? I've tried to find help on all kinds of window functions, but no luck until now. Thanks!
Here is one method:
WITH combos AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY sellerid, buyerid ORDER BY t.transaction_date) as combo_seqnum,
ROW_NUMBER() OVER (PARTITION BY sellerid, buyerid, date_trunc('month', t.transaction_date) ORDER BY t.transaction_date) as combo_month_seqnum
FROM transactions t
WHERE t.transaction_date < '2018-05-01'
)
SELECT 'Overall' as which, COUNT(*)
FROM combos
WHERE combo_seqnum = 1
UNION ALL
SELECT to_char(transaction_date, 'YYYY-MM'), COUNT(*)
FROM combos
WHERE combo_month_seqnum = 1
GROUP BY to_char(transaction_date, 'YYYY-MM');
This puts the results in separate rows. If you want a cumulative number and number per month:
SELECT to_char(transaction_date, 'YYYY-MM'),
SUM( (combo_month_seqnum = 1)::int ) as uniques_in_month,
SUM(SUM( (combo_seqnum = 1)::int )) OVER (ORDER BY to_char(transaction_date, 'YYYY-MM')) as uniques_through_month
FROM combos
GROUP BY to_char(transaction_date, 'YYYY-MM')
Here is a rextester illustrating the solution.
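For a quick local check of the two ROW_NUMBER() counters, the following sketch runs an adapted query on the three example rows from the question using Python's sqlite3 module. It is an adaptation rather than the answer's exact SQL: SQLite has no `date_trunc`, `to_char` or `::int`, so `strftime('%Y-%m', ...)` and SQLite's 0/1 comparison results are used instead, and the nested `SUM(SUM(...))` is split into a second CTE named `monthly` (a name chosen for this sketch). SQLite 3.25 or later is assumed.

```python
import sqlite3

# The three example transactions from the question.
rows = [
    ("2018-01-03", 3828, 219),
    ("2018-01-08", 2831, 123),
    ("2018-02-10", 3828, 219),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (transaction_date TEXT, buyerid INTEGER, sellerid INTEGER)")
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)

query = """
WITH combos AS (
    SELECT strftime('%Y-%m', transaction_date) AS month,
           -- 1 on the first ever transaction of a buyer/seller pair
           ROW_NUMBER() OVER (PARTITION BY sellerid, buyerid
                              ORDER BY transaction_date) AS combo_seqnum,
           -- 1 on the first transaction of a pair within each month
           ROW_NUMBER() OVER (PARTITION BY sellerid, buyerid,
                                           strftime('%Y-%m', transaction_date)
                              ORDER BY transaction_date) AS combo_month_seqnum
    FROM transactions
),
monthly AS (
    SELECT month,
           SUM(combo_month_seqnum = 1) AS uniques_in_month,
           SUM(combo_seqnum = 1)       AS new_combos
    FROM combos
    GROUP BY month
)
SELECT month,
       uniques_in_month,
       SUM(new_combos) OVER (ORDER BY month) AS uniques_through_month
FROM monthly
ORDER BY month
"""
per_month = conn.execute(query).fetchall()
for row in per_month:
    print(row)
```

January shows two unique combos, and February shows one combo in the month but still two overall, because the February pair was already seen in January.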

SQL: COUNT the number of purchases between the first purchase and the following 10 months

Every customer has a different first purchase date. I want to COUNT the number of purchases each customer makes in the 10 months following their first purchase.
sample table
TransactionID Client_name PurchaseDate Revenue
11 John Lee 10/13/2014 327
12 John Lee 9/15/2015 873
13 John Lee 11/29/2015 1,938
14 Rebort Jo 8/18/2013 722
15 Rebort Jo 5/21/2014 525
16 Rebort Jo 2/4/2015 455
17 Rebort Jo 3/20/2016 599
18 Tina Pe 10/8/2014 213
19 Tina Pe 6/10/2016 3,494
20 Tina Pe 8/9/2016 411
My code below just uses the ROW_NUMBER function to identify the first purchase, but I don't know how to do the calculation. Or is there a better way to do it?
SELECT client_name,
purchasedate,
Dateadd(month, 10, purchasedate) TenMonth,
Row_number()
OVER (
partition BY client_name
ORDER BY client_name) RM
FROM mytable
You might try something like this - I assume you're using SQL Server from the presence of DATEADD() and the fact that you're using a window function (ROW_NUMBER()):
WITH myCTE AS (
SELECT TransactionID, Client_name, PurchaseDate, Revenue
, MIN(PurchaseDate) OVER ( PARTITION BY Client_name ) AS min_PurchaseDate
FROM myTable
)
SELECT Client_name, COUNT(*)
FROM myCTE
WHERE PurchaseDate <= DATEADD(month, 10, min_PurchaseDate)
GROUP BY Client_name
Here I'm creating a common table expression (CTE) with all the data, including the date of first purchase, then I grab a count of all the purchases within a 10-month timeframe.
Hope this helps.
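The same idea can be tried out on the sample table with Python's sqlite3 module, as sketched below. This is an adaptation under stated assumptions: SQLite has no DATEADD(), so `date(x, '+10 months')` is used instead; dates are rewritten as ISO strings; the table name `purchases` and the snake_case column names are chosen for this sketch; and SQLite 3.25 or later is assumed for `MIN(...) OVER`.

```python
import sqlite3

# Sample rows from the question (revenue omitted, dates converted to ISO format).
rows = [
    (11, "John Lee", "2014-10-13"), (12, "John Lee", "2015-09-15"),
    (13, "John Lee", "2015-11-29"), (14, "Rebort Jo", "2013-08-18"),
    (15, "Rebort Jo", "2014-05-21"), (16, "Rebort Jo", "2015-02-04"),
    (17, "Rebort Jo", "2016-03-20"), (18, "Tina Pe", "2014-10-08"),
    (19, "Tina Pe", "2016-06-10"), (20, "Tina Pe", "2016-08-09"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (transaction_id INTEGER, client_name TEXT, purchase_date TEXT)")
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)

query = """
WITH first_purchase AS (   -- attach each client's first purchase date to every row
    SELECT client_name, purchase_date,
           MIN(purchase_date) OVER (PARTITION BY client_name) AS min_purchase_date
    FROM purchases
)
SELECT client_name, COUNT(*) AS purchases_in_first_10_months
FROM first_purchase
WHERE purchase_date <= date(min_purchase_date, '+10 months')
GROUP BY client_name
ORDER BY client_name
"""
counts = conn.execute(query).fetchall()
for name, n in counts:
    print(name, n)
```

Note that the count includes the first purchase itself, so every client has a count of at least 1; subtract 1 if only follow-up purchases are wanted.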
Give this a whirl ... Subquery to get the min purchase date, then LEFT JOIN to the main table to have a WHERE clause for the ten month date range, then count.
SELECT mt.Client_name, COUNT(mt.PurchaseDate) as PurchaseCountFirstTenMonths
FROM myTable mt
LEFT JOIN (
    SELECT Client_name, MIN(PurchaseDate) as MinPurchaseDate
    FROM myTable
    GROUP BY Client_name) mtmin
  ON mt.Client_name = mtmin.Client_name
WHERE mt.PurchaseDate >= mtmin.MinPurchaseDate AND mt.PurchaseDate <= DATEADD(month, 10, mtmin.MinPurchaseDate)
GROUP BY mt.Client_name
ORDER BY mt.Client_name
By the way, I'm guessing there's some kind of ClientID involved, as a nine-character full name runs the risk of duplicates.

Firebird Query- Return first row each group

In a firebird database with a table "Sales", I need to select the first sale of all customers. See below a sample that show the table and desired result of query.
---------------------------------------
SALES
---------------------------------------
ID CUSTOMERID DTHRSALE
1 25 01/04/16 09:32
2 30 02/04/16 11:22
3 25 05/04/16 08:10
4 31 07/03/16 10:22
5 22 01/02/16 12:30
6 22 10/01/16 08:45
Result: only first sale, based on sale date.
ID CUSTOMERID DTHRSALE
1 25 01/04/16 09:32
2 30 02/04/16 11:22
4 31 07/03/16 10:22
6 22 10/01/16 08:45
I've already tested following code "Select first row in each GROUP BY group?", but it did not work.
In Firebird 2.5 you can do this with the following query; this is a minor modification of the second part of the accepted answer of the question you linked to tailored to your schema and requirements:
select x.id,
x.customerid,
x.dthrsale
from sales x
join (select customerid,
min(dthrsale) as first_sale
from sales
group by customerid) p on p.customerid = x.customerid
and p.first_sale = x.dthrsale
order by x.id
The order by is not necessary, I just added it to make it give the order as shown in your question.
With Firebird 3 you can use the window function ROW_NUMBER which is also described in the linked answer. The linked answer incorrectly said the first solution would work on Firebird 2.1 and higher. I have now edited it.
Search for the sales with no earlier sales:
SELECT S1.*
FROM SALES S1
LEFT JOIN SALES S2 ON S2.CUSTOMERID = S1.CUSTOMERID AND S2.DTHRSALE < S1.DTHRSALE
WHERE S2.ID IS NULL
Define an index over (customerid, dthrsale) to make it fast.
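As a rough check of the "no earlier sale" anti-join, the sketch below runs it on the question's rows using Python's sqlite3 module rather than Firebird. The dates are assumed to be dd/mm/yy and are rewritten as ISO strings so that string comparison orders them correctly; the index name is hypothetical.

```python
import sqlite3

# Rows from the question; DTHRSALE assumed to be dd/mm/yy and converted to ISO format.
rows = [
    (1, 25, "2016-04-01 09:32"),
    (2, 30, "2016-04-02 11:22"),
    (3, 25, "2016-04-05 08:10"),
    (4, 31, "2016-03-07 10:22"),
    (5, 22, "2016-02-01 12:30"),
    (6, 22, "2016-01-10 08:45"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, customerid INTEGER, dthrsale TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
# Composite index suggested in the answer (index name is hypothetical).
conn.execute("CREATE INDEX idx_sales_customer_date ON sales (customerid, dthrsale)")

query = """
SELECT s1.id, s1.customerid, s1.dthrsale
FROM sales s1
LEFT JOIN sales s2
       ON s2.customerid = s1.customerid
      AND s2.dthrsale < s1.dthrsale   -- any earlier sale of the same customer
WHERE s2.id IS NULL                   -- keep rows that have no earlier sale
ORDER BY s1.id
"""
first_sales = conn.execute(query).fetchall()
for row in first_sales:
    print(row)
```

This returns IDs 1, 2, 4 and 6, as in the desired result. If a customer had two sales with exactly the same timestamp as their earliest, both rows would be returned.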
In Firebird 3, get the first row for each customer by minimum sales_date:
SELECT id, customer_id, total, sales_date
FROM (
SELECT id, customer_id, total, sales_date
, row_number() OVER(PARTITION BY customer_id ORDER BY sales_date ASC ) AS rn
FROM SALES
) sub
WHERE rn = 1;
If you want to get other related columns, this is where your self-answer fails:
select customer_id , min(sales_date)
, id, total --what about other colums
from SALES
group by customer_id
As simple as:
select CUSTOMERID, min(DTHRSALE) from SALES group by CUSTOMERID

SQL Server : count types with totals by date change

I need to count a value (M_Id) at each change of a date (RS_Date) and create a column grouped by the RS_Date that has an active total from that date.
So the table is:
Ep_Id Oa_Id M_Id M_StartDate RS_Date
--------------------------------------------
1 2001 5 1/1/2014 1/1/2014
1 2001 9 1/1/2014 1/1/2014
1 2001 3 1/1/2014 1/1/2014
1 2001 11 1/1/2014 1/1/2014
1 2001 2 1/1/2014 1/1/2014
1 2067 7 1/1/2014 1/5/2014
1 2067 1 1/1/2014 1/5/2014
1 3099 12 1/1/2014 3/2/2014
1 3099 14 2/14/2014 3/2/2014
1 3099 4 2/14/2014 3/2/2014
So my goal is like
RS_Date Active
-----------------
1/1/2014 5
1/5/2014 7
3/2/2014 10
If M_StartDate = RS_Date, I need to count the M_Id values. Then, for each RS_Date that is not equal to the start date, I need to count the M_Id values and add that to the M_StartDate count, then count the next RS_Date and add that to the last active count.
I can get the basic counts with something like
(Case when M_StartDate <= RS_Date
then [m_Id] end) as Test.
But I am stuck as how to get to the result I want.
Any help would be greatly appreciated.
Brian
-added in response to comments
I am using Server Ver 10
If using SQL Server 2012+, you can use ROWS with the analytic/window functions:
;with cte AS (SELECT RS_Date
,COUNT(DISTINCT M_ID) AS CT
FROM Table1
GROUP BY RS_Date
)
SELECT *,SUM(CT) OVER(ORDER BY RS_Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Run_CT
FROM cte
Demo: SQL Fiddle
If stuck using something prior to 2012 you can use:
;with cte AS (SELECT RS_Date
,COUNT(DISTINCT M_ID) AS CT
FROM Table1
GROUP BY RS_Date
)
SELECT a.RS_Date
,SUM(b.CT)
FROM cte a
LEFT JOIN cte b
ON a.RS_DAte >= b.RS_Date
GROUP BY a.RS_Date
Demo: SQL Fiddle
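The self-join version needs no window functions, so it is easy to try anywhere. Below is a minimal sketch on the question's rows using Python's sqlite3 module instead of SQL Server; the dates are rewritten as ISO strings so that `>=` compares them chronologically, and the alias `run_ct` is chosen for this sketch.

```python
import sqlite3

# (Ep_Id, Oa_Id, M_Id, M_StartDate, RS_Date) rows from the question, dates in ISO format.
rows = [
    (1, 2001, 5, "2014-01-01", "2014-01-01"),
    (1, 2001, 9, "2014-01-01", "2014-01-01"),
    (1, 2001, 3, "2014-01-01", "2014-01-01"),
    (1, 2001, 11, "2014-01-01", "2014-01-01"),
    (1, 2001, 2, "2014-01-01", "2014-01-01"),
    (1, 2067, 7, "2014-01-01", "2014-01-05"),
    (1, 2067, 1, "2014-01-01", "2014-01-05"),
    (1, 3099, 12, "2014-01-01", "2014-03-02"),
    (1, 3099, 14, "2014-02-14", "2014-03-02"),
    (1, 3099, 4, "2014-02-14", "2014-03-02"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (ep_id INTEGER, oa_id INTEGER, m_id INTEGER, "
    "m_startdate TEXT, rs_date TEXT)"
)
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", rows)

query = """
WITH cte AS (              -- distinct M_Id count per RS_Date
    SELECT rs_date, COUNT(DISTINCT m_id) AS ct
    FROM table1
    GROUP BY rs_date
)
SELECT a.rs_date, SUM(b.ct) AS run_ct
FROM cte a
JOIN cte b ON a.rs_date >= b.rs_date   -- every date up to and including a.rs_date
GROUP BY a.rs_date
ORDER BY a.rs_date
"""
running = conn.execute(query).fetchall()
for rs_date, active in running:
    print(rs_date, active)
```

On this data the running totals are 5, 7 and 10, which matches the goal stated in the question.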
You need a cumulative sum, easy in SQL Server 2012 using Windowed Aggregate Functions. Based on your description this will return the expected result
SELECT Ep_Id, RS_Date,
       SUM(COUNT(*))
       OVER (PARTITION BY Ep_Id
             ORDER BY RS_Date
             ROWS UNBOUNDED PRECEDING)
FROM tab
GROUP BY Ep_Id, RS_Date
It looks like you want something like this:
SELECT
RS_Date,
SUM(c) OVER (PARTITION BY M_StartDate ORDER BY RS_Date ROWS UNBOUNDED PRECEDING)
FROM
(
SELECT M_StartDate, RS_Date, COUNT(DISTINCT M_Id) AS c
FROM my_table
GROUP BY M_StartDate, RS_Date
) counts
The inline view computes the counts of distinct M_Id values within each (M_StartDate, RS_Date) group (distinctness enforced only within the group), and the outer query uses the analytic version of SUM() to add up the counts within each M_StartDate.
Note that this particular query will not exactly reproduce your example results. It will instead produce:
RS_Date Active
-----------------
1/1/2014 5
1/5/2014 7
3/2/2014 8
3/2/2014 2
This is on account of some rows in your example data with RS_Date 3/2/2014 having a later M_StartDate than others. If this is not what you want then you need to clarify the question, which currently seems a bit inconsistent.
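To see where the 8 and the extra 2 come from, the partitioned running sum can be reproduced on the question's rows. The sketch below uses Python's sqlite3 module rather than SQL Server and assumes SQLite 3.25 or later; dates are rewritten as ISO strings, and the table name `my_table` follows the query above.

```python
import sqlite3

# (M_Id, M_StartDate, RS_Date) taken from the question's table, dates in ISO format.
rows = [
    (5, "2014-01-01", "2014-01-01"), (9, "2014-01-01", "2014-01-01"),
    (3, "2014-01-01", "2014-01-01"), (11, "2014-01-01", "2014-01-01"),
    (2, "2014-01-01", "2014-01-01"),
    (7, "2014-01-01", "2014-01-05"), (1, "2014-01-01", "2014-01-05"),
    (12, "2014-01-01", "2014-03-02"),
    (14, "2014-02-14", "2014-03-02"), (4, "2014-02-14", "2014-03-02"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (m_id INTEGER, m_startdate TEXT, rs_date TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

query = """
WITH counts AS (           -- distinct M_Id count per (M_StartDate, RS_Date) group
    SELECT m_startdate, rs_date, COUNT(DISTINCT m_id) AS c
    FROM my_table
    GROUP BY m_startdate, rs_date
)
SELECT m_startdate, rs_date,
       SUM(c) OVER (PARTITION BY m_startdate
                    ORDER BY rs_date
                    ROWS UNBOUNDED PRECEDING) AS active
FROM counts
ORDER BY m_startdate, rs_date
"""
by_start = conn.execute(query).fetchall()
for row in by_start:
    print(row)
```

The rows that started on 2014-01-01 accumulate to 5, 7 and 8, while the two rows that started on 2014-02-14 form their own partition with a total of 2.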
Unfortunately, ordered window aggregates such as SUM() OVER (ORDER BY ...) are not available until SQL Server 2012. In SQL Server 2008 (version 10), the job is messier. It could be done like this:
WITH gc AS (
SELECT M_StartDate, RS_Date, COUNT(DISTINCT M_Id) AS c
FROM my_table
GROUP BY M_StartDate, RS_Date
)
SELECT
RS_Date,
(
SELECT SUM(c)
FROM gc gc2
WHERE gc2.M_StartDate = gc.M_StartDate AND gc2.RS_Date <= gc.RS_Date
) AS Active
FROM gc
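If you are using SQL Server 2012 or newer, you can also look at LAG, which reads a value from a previous row, although SUM() OVER (ORDER BY ...) is the more direct way to get a running total.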
https://msdn.microsoft.com/en-us/library/hh231256(v=sql.110).aspx

Hourly sum of values

I have a table with the following structure and sample data:
STORE_ID | INS_TIME | TOTAL_AMOUNT
2 07:46:01 20
3 19:20:05 100
4 12:40:21 87
5 09:05:08 5
6 11:30:00 12
6 14:22:07 100
I need to get the hourly sum of TOTAL_AMOUNT for each STORE_ID.
I tried the following query, but I don't know if it's correct.
SELECT STORE_ID, SUM(TOTAL_AMOUNT) , HOUR(INS_TIME) as HOUR FROM VENDAS201302
WHERE MINUTE(INS_TIME) <=59
GROUP BY HOUR,STORE_ID
ORDER BY INS_TIME;
Not sure why you are not considering different days here. You could get the hourly sum using Datepart() function as below in Sql-Server:
DEMO
SELECT STORE_ID,
       datepart(hour, convert(datetime, INS_TIME)) AS HOUR_OF_DAY,
       SUM(TOTAL_AMOUNT) AS HOURLY_SUM
FROM t1
GROUP BY STORE_ID, datepart(hour, convert(datetime, INS_TIME))
ORDER BY STORE_ID, HOUR_OF_DAY
SELECT STORE_ID,
       HOUR(INS_TIME) as HOUR_OF_TIME,
       SUM(TOTAL_AMOUNT) as AMOUNT_SUM
FROM VENDAS201302
GROUP BY STORE_ID, HOUR_OF_TIME
ORDER BY STORE_ID, HOUR_OF_TIME;
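Grouping by store and hour can be checked on the sample rows with Python's sqlite3 module, as in the sketch below. SQLite has neither HOUR() nor DATEPART(), so `strftime('%H', ins_time)` is used as a stand-in. The table name `vendas` is shortened for this sketch, and the last row is a hypothetical extra sale added so that one store has two sales in the same hour.

```python
import sqlite3

# Sample rows from the question, plus one hypothetical row (store 6 at 11:45:00)
# so that an hour with more than one sale is actually summed.
rows = [
    (2, "07:46:01", 20),
    (3, "19:20:05", 100),
    (4, "12:40:21", 87),
    (5, "09:05:08", 5),
    (6, "11:30:00", 12),
    (6, "14:22:07", 100),
    (6, "11:45:00", 8),   # hypothetical
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendas (store_id INTEGER, ins_time TEXT, total_amount INTEGER)")
conn.executemany("INSERT INTO vendas VALUES (?, ?, ?)", rows)

query = """
SELECT store_id,
       CAST(strftime('%H', ins_time) AS INTEGER) AS hour_of_time,
       SUM(total_amount) AS amount_sum
FROM vendas
GROUP BY store_id, strftime('%H', ins_time)
ORDER BY store_id, hour_of_time
"""
hourly = conn.execute(query).fetchall()
for row in hourly:
    print(row)
```

Store 6 ends up with 20 for hour 11 (12 plus the hypothetical 8) and 100 for hour 14. If the real table also stores a date, the date would need to be part of the grouping so that the same hour on different days is not merged.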