I have a sales table with these columns:
date, user_id, product
There are 26 products (a-z), and users who have purchased both product 'a' and product 'b' are classified as acquired customers.
What I want is a SQL query that returns the daily count of acquired customers.
For example, if user 'X' bought product 'a' on 1st Apr and product 'b' on 20th Apr, then he is deemed acquired on 20th Apr.
Sample data:
date user_id Product sale
01-04-2019 123 a 200
01-04-2019 234 b 300
01-04-2019 345 a 200
02-04-2019 123 b 300
03-04-2019 234 b 300
04-04-2019 555 g 400
05-04-2019 666 a 200
05-04-2019 666 b 300
Desired output from the SQL query:
date acquired_users
01-04-2019 0
02-04-2019 1
03-04-2019 0
04-04-2019 0
05-04-2019 1
Obviously, the real table will have a lot more data.
You can use window functions for this. First, get the "start" date for each user, that is, the earliest date by which the user has bought both products:
select user_id, min(date) as date
from (select t.*,
             sum(case when product = 'a' then 1 else 0 end) over (partition by user_id order by date) as cnt_a,
             sum(case when product = 'b' then 1 else 0 end) over (partition by user_id order by date) as cnt_b
      from t
     ) t
where cnt_a > 0 and cnt_b > 0
group by user_id;
Then aggregate this by date:
select date, count(*) as acquired_users
from (select user_id, min(date) as date
      from (select t.*,
                   sum(case when product = 'a' then 1 else 0 end) over (partition by user_id order by date) as cnt_a,
                   sum(case when product = 'b' then 1 else 0 end) over (partition by user_id order by date) as cnt_b
            from t
           ) t
      where cnt_a > 0 and cnt_b > 0
      group by user_id
     ) u
group by date
order by date;
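As a way to check the approach against the sample data, here is a minimal sketch that runs the same running-count logic through Python's built-in sqlite3 module. The table name `sales`, the ISO date strings (so that text ordering matches date ordering), and the use of SQLite (3.25 or later, for window functions) are assumptions made for the illustration. The LEFT JOIN from the list of distinct dates is also an addition: it produces the zero rows shown in the desired output, which a plain GROUP BY over the acquired users would leave out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (date TEXT, user_id INTEGER, product TEXT, sale INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2019-04-01", 123, "a", 200),
        ("2019-04-01", 234, "b", 300),
        ("2019-04-01", 345, "a", 200),
        ("2019-04-02", 123, "b", 300),
        ("2019-04-03", 234, "b", 300),
        ("2019-04-04", 555, "g", 400),
        ("2019-04-05", 666, "a", 200),
        ("2019-04-05", 666, "b", 300),
    ],
)

query = """
WITH running AS (
    -- running purchase counts of 'a' and 'b' per user, in date order
    SELECT s.*,
           SUM(CASE WHEN product = 'a' THEN 1 ELSE 0 END)
               OVER (PARTITION BY user_id ORDER BY date) AS cnt_a,
           SUM(CASE WHEN product = 'b' THEN 1 ELSE 0 END)
               OVER (PARTITION BY user_id ORDER BY date) AS cnt_b
    FROM sales s
),
acquired AS (
    -- first date on which both counts are positive
    SELECT user_id, MIN(date) AS acquired_date
    FROM running
    WHERE cnt_a > 0 AND cnt_b > 0
    GROUP BY user_id
)
SELECT d.date, COUNT(a.user_id) AS acquired_users
FROM (SELECT DISTINCT date FROM sales) d
LEFT JOIN acquired a ON a.acquired_date = d.date
GROUP BY d.date
ORDER BY d.date
"""
daily_counts = conn.execute(query).fetchall()
for row in daily_counts:
    print(row)
# ('2019-04-01', 0)
# ('2019-04-02', 1)
# ('2019-04-03', 0)
# ('2019-04-04', 0)
# ('2019-04-05', 1)
```

User 666 buys both products on the same day; because the default window frame includes all rows with the same date, both of that user's rows see cnt_a = 1 and cnt_b = 1, so the user is counted on that day.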
Let's say there's a table with data like below:
id  status  date
1   4       2022-05
2   3       2022-06
I want to find the count of ids for each month, by status. Something like this:
date     count(status1)=4  count(status2)=3
2022-05  1                 null
2022-06  null              1
I tried doing
-- select distinct (not working)
select date, status1, status2 from
(select date, count(id) as "status1" from myTable
where status = 4 group by date) as myTable1
join
(select date, count(id) as "status2" from myTable
where status = 3 group by date) as myTable2
on myTable1.date = myTable2.date;
-- group by (not working)
But this duplicates the data I need.
I am using SQL Server.
select d.date,
       sum(case when d.status = 4 then 1 else 0 end) as count_status_4,
       sum(case when d.status = 3 then 1 else 0 end) as count_status_3
from your_table as d
group by d.date;
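To illustrate the conditional aggregation on the question's two sample rows, here is a small sketch using Python's sqlite3 module. SQLite stands in for SQL Server here, which is an assumption of the example; the `CASE` and `NULLIF` expressions used are standard SQL. The `NULLIF(..., 0)` wrapper is an addition that turns a zero count into NULL, matching the nulls in the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, status INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, 4, "2022-05"), (2, 3, "2022-06")],
)

# One pass over the table: each SUM counts only the rows with its status,
# so no self-join is needed and no month is dropped or duplicated.
query = """
SELECT date,
       NULLIF(SUM(CASE WHEN status = 4 THEN 1 ELSE 0 END), 0) AS status1,
       NULLIF(SUM(CASE WHEN status = 3 THEN 1 ELSE 0 END), 0) AS status2
FROM myTable
GROUP BY date
ORDER BY date
"""
monthly_counts = conn.execute(query).fetchall()
print(monthly_counts)  # [('2022-05', 1, None), ('2022-06', None, 1)]
```

The inner join in the attempted query keeps only months that appear in both subqueries, which is why a single grouped scan is the simpler choice here.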
In the table below, I want to know how many customers ordered lunch without a coffee. The result would be 1, for sale ID 300, because two lunches were ordered but only one coffee.
It's been 8 years since I last used SQL! How do I say "group the records by sale ID and keep only the groups where there is at least one lunch and COUNT(coffee) < COUNT(lunch)"?
SALE ID  Product
100      coffee
100      lunch
200      coffee
300      lunch
300      lunch
300      coffee
Here is one way:
select count(*) from (
    select saleID
    from tablename
    group by saleID
    having sum(case when product = 'lunch' then 1 else 0 end)
         > sum(case when product = 'coffee' then 1 else 0 end)
) t
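You can do it with aggregation and the conditions in the HAVING clause. In databases such as MySQL, where a boolean expression evaluates to 1 or 0, the comparisons can be summed directly.
This query: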
SELECT sale_id
FROM tablename
GROUP BY sale_id
HAVING SUM(product = 'lunch') > SUM(product = 'coffee');
returns all the sale_ids that you want.
This query:
SELECT DISTINCT COUNT(*) OVER () counter
FROM tablename
GROUP BY sale_id
HAVING SUM(product = 'lunch') > SUM(product = 'coffee');
returns the number of sale_ids that you want.
select count(*) from (
    -- in this subquery, count coffees and lunches per sale
    select saleID,
           sum(case when product = 'coffee' then 1 else 0 end) as coffee,
           sum(case when product = 'lunch' then 1 else 0 end) as lunch
    from tablename
    group by saleID
    having sum(case when product = 'lunch' then 1 else 0 end) >= 1 -- ignore sales without any lunch
) t
where lunch > coffee -- keep only sales with more lunches than coffees
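The HAVING comparison can be checked against the question's six rows with a short sketch in Python's sqlite3 module. The table name `sale_items` and the column name `sale_id` are placeholders chosen for the example, and the portable `CASE` form is used rather than the MySQL-style boolean sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale_items (sale_id INTEGER, product TEXT)")
conn.executemany(
    "INSERT INTO sale_items VALUES (?, ?)",
    [
        (100, "coffee"),
        (100, "lunch"),
        (200, "coffee"),
        (300, "lunch"),
        (300, "lunch"),
        (300, "coffee"),
    ],
)

# Sales with more lunches than coffees. A sale with no lunch at all can
# never satisfy "lunch > coffee", so no separate "has lunch" test is needed.
per_sale = """
SELECT sale_id
FROM sale_items
GROUP BY sale_id
HAVING SUM(CASE WHEN product = 'lunch' THEN 1 ELSE 0 END)
     > SUM(CASE WHEN product = 'coffee' THEN 1 ELSE 0 END)
"""
matching_sales = conn.execute(per_sale).fetchall()
match_count = conn.execute(f"SELECT COUNT(*) FROM ({per_sale}) t").fetchone()[0]

print(matching_sales)  # [(300,)]
print(match_count)     # 1
```

Sale 100 has one lunch and one coffee, so it is excluded; sale 200 has no lunch; only sale 300 (two lunches, one coffee) remains.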
I need to make a pivot table from a source table like this:
FactID UserID Date Product QTY
1 11 01/01/2020 A 600
2 11 02/01/2020 A 400
3 11 03/01/2020 B 500
4 11 04/01/2020 B 200
6 22 06/01/2020 A 1000
7 22 07/01/2020 A 200
8 22 08/01/2020 B 300
9 22 09/01/2020 B 100
I need a pivot like this, where each product's QTY is the QTY on the last date:
UserID A B
11 400 200
22 200 100
My attempt (PostgreSQL):
SELECT UserID,
       MAX(CASE WHEN Product = 'A' THEN QTY END) AS "A",
       MAX(CASE WHEN Product = 'B' THEN QTY END) AS "B"
FROM tab
GROUP BY UserID;
And the result:
UserID A B
11 600 500
22 1000 300
That is, I get the maximum QTY, not the QTY on the maximum date!
What do I need to add to get the results by the maximum (last) date?
Postgres doesn't have "first" and "last" aggregation functions. One method for doing this (without a subquery) uses arrays:
select userid,
(array_agg(qty order by date desc) filter (where product = 'A'))[1] as a,
(array_agg(qty order by date desc) filter (where product = 'B'))[1] as b
from tab
group by userid;
Another method uses select distinct with first_value():
select distinct userid,
first_value(qty) over (partition by userid order by product = 'A' desc, date desc) as a,
first_value(qty) over (partition by userid order by product = 'B' desc, date desc) as b
from tab;
With the appropriate indexes, though, distinct on might be the fastest approach:
select userid,
max(qty) filter (where product = 'A') as a,
max(qty) filter (where product = 'B') as b
from (select distinct on (userid, product) t.*
from tab t
order by userid, product, date desc
) t
group by userid;
In particular, this can use an index on (userid, product, date desc). The improvement in performance will be most notable if there are many dates for a given user.
You can use the DENSE_RANK() window function to filter down to the last date for each Product and UserID before applying conditional aggregation, such as:
SELECT UserID,
MAX(CASE WHEN Product='A' THEN QTY END) AS "A",
MAX(CASE WHEN Product='B' THEN QTY END) AS "B"
FROM
(
SELECT t.*, DENSE_RANK() OVER (PARTITION BY Product,UserID ORDER BY Date DESC) AS rn
FROM tab t
) q
WHERE rn = 1
GROUP BY UserID
This presumes that all date values are distinct (no ties occur for dates).
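The rank-then-pivot approach can be tried on the question's sample rows with the sketch below, which uses Python's sqlite3 module in place of PostgreSQL (an assumption of the example; SQLite 3.25 or later is needed for window functions). The dates are written as ISO strings, reading the source dates as day/month/year, so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (factid INTEGER, userid INTEGER, date TEXT, "
    "product TEXT, qty INTEGER)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?, ?)",
    [
        (1, 11, "2020-01-01", "A", 600),
        (2, 11, "2020-01-02", "A", 400),
        (3, 11, "2020-01-03", "B", 500),
        (4, 11, "2020-01-04", "B", 200),
        (6, 22, "2020-01-06", "A", 1000),
        (7, 22, "2020-01-07", "A", 200),
        (8, 22, "2020-01-08", "B", 300),
        (9, 22, "2020-01-09", "B", 100),
    ],
)

# Step 1: rank rows per (userid, product), newest date first.
# Step 2: keep rank 1 only, then pivot with conditional aggregation.
query = """
SELECT userid,
       MAX(CASE WHEN product = 'A' THEN qty END) AS a,
       MAX(CASE WHEN product = 'B' THEN qty END) AS b
FROM (
    SELECT t.*,
           DENSE_RANK() OVER (PARTITION BY product, userid
                              ORDER BY date DESC) AS rn
    FROM tab t
) q
WHERE rn = 1
GROUP BY userid
ORDER BY userid
"""
pivot = conn.execute(query).fetchall()
print(pivot)  # [(11, 400, 200), (22, 200, 100)]
```

After the filter, each user has at most one row per product, so MAX() only serves to collapse the rows into one line per user; it no longer picks the largest quantity.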
I am trying to write a SQL statement that returns a distinct set of CompanyNames from one table, based on the most recent SaleDate within a specified date range from another table.
T01 = Account
T02 = TransHeader
The fields of importance are:
T01.ID, T01.CompanyName
T02.AccountID, T02.SaleDate
T01.ID = T02.AccountID
What I want to return is the maximum SaleDate for each CompanyName, as LastSale, without any duplicate CompanyNames. I will be using a WHERE clause to limit the SaleDate range.
I tried the following, but it returns a record for every SaleDate in the range, so the same company is listed multiple times.
Current MS-SQL Query
SELECT T01.CompanyName, T02.LastSale
FROM
(SELECT DISTINCT ID, IsActive, ClassTypeID, CompanyName FROM Account) T01
FULL OUTER JOIN
(SELECT DISTINCT AccountID, TransactionType, MAX(SaleDate) LastSale FROM TransHeader group by AccountID, TransactionType, SaleDate) T02
ON T01.ID = T02.AccountID
WHERE ( ( T01.IsActive = 1 )AND
( (Select Max(SaleDate)From TransHeader Where AccountID = T01.ID AND TransactionType in (1,6) AND SaleDate is NOT NULL)
BETWEEN '01/01/2016' AND '12/31/2018 23:59:00' AND (Select Max(SaleDate)From TransHeader Where AccountID = T01.ID AND TransactionType in (1,6) AND SaleDate is NOT NULL) IS NOT NULL
)
)
ORDER BY T01.CompanyName
I thought the FULL OUTER JOIN was the ticket but it did not work and I am stuck.
Sample data Account Table (T01)
ID CompanyName IsActive ClassTypeID
1 ABC123 1 1
2 CDE456 1 1
3 EFG789 1 1
4 Test123 0 1
5 Test456 1 1
6 Test789 0 1
Sample data Transheader table (T02)
AccountID TransactionType SaleDate
1 1 02/03/2012
2 1 03/04/2013
3 1 04/05/2014
4 1 05/06/2014
5 1 06/07/2014
6 1 07/08/2015
1 1 08/09/2016
1 1 01/15/2016
2 1 03/20/2017
2 1 03/21/2017
3 1 03/04/2017
3 1 04/05/2018
3 1 05/27/2018
4 1 06/01/2018
5 1 07/08/2018
5 1 08/01/2018
5 1 10/11/2018
6 1 11/30/2018
Desired Results
CompanyName LastSale (Notes are not returned in the result)
ABC123 08/09/2016 (Max(SaleDate) LastSale for ID=1)
CDE456 03/21/2017 (Max(SaleDate) LastSale for ID=2)
EFG789 05/27/2018 (Max(SaleDate) LastSale for ID=3)
Test456 10/11/2018 (Max(SaleDate) LastSale for ID=5)
ID=4 & ID=6 are not returned because IsActive = 0 for these records.
One option is to select the maximum date in the select clause.
select
a.*,
(
select max(th.saledate)
from transheader th
where th.accountid = a.id
and th.saledate >= '2016-01-01'
and th.saledate < '2019-01-01'
) as max_date
from account a
where a.isactive = 1
order by a.id;
If you only want to show accounts that have sales in the given date range, you can inner join the maximum dates with the accounts. To do so, you must aggregate the dates per account:
select a.*, th.max_date
from account a
join
(
  select accountid, max(saledate) as max_date
  from transheader
  where saledate >= '2016-01-01'
    and saledate < '2019-01-01'
  group by accountid
) th on th.accountid = a.id
where a.isactive = 1
order by a.id;
select a.CompanyName, max(b.SaleDate) as SaleDate
from Account a
inner join Transheader b on a.id = b.accountid
group by a.CompanyName
order by 1
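For a runnable check against the sample tables, the sketch below loads them into an in-memory database through Python's sqlite3 module and joins each active account to its latest in-range sale. SQLite in place of SQL Server, ISO date strings (reading the source dates as month/day/year), and the `TransactionType IN (1, 6)` filter taken from the question's own query are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (ID INTEGER, CompanyName TEXT, "
    "IsActive INTEGER, ClassTypeID INTEGER)"
)
conn.execute(
    "CREATE TABLE TransHeader (AccountID INTEGER, TransactionType INTEGER, "
    "SaleDate TEXT)"
)
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?, ?)",
    [
        (1, "ABC123", 1, 1), (2, "CDE456", 1, 1), (3, "EFG789", 1, 1),
        (4, "Test123", 0, 1), (5, "Test456", 1, 1), (6, "Test789", 0, 1),
    ],
)
conn.executemany(
    "INSERT INTO TransHeader VALUES (?, 1, ?)",
    [
        (1, "2012-02-03"), (2, "2013-03-04"), (3, "2014-04-05"),
        (4, "2014-05-06"), (5, "2014-06-07"), (6, "2015-07-08"),
        (1, "2016-08-09"), (1, "2016-01-15"), (2, "2017-03-20"),
        (2, "2017-03-21"), (3, "2017-03-04"), (3, "2018-04-05"),
        (3, "2018-05-27"), (4, "2018-06-01"), (5, "2018-07-08"),
        (5, "2018-08-01"), (5, "2018-10-11"), (6, "2018-11-30"),
    ],
)

# Aggregate to one row per account first, then join, so that each
# company appears at most once in the result.
query = """
SELECT a.CompanyName, th.LastSale
FROM Account a
JOIN (
    SELECT AccountID, MAX(SaleDate) AS LastSale
    FROM TransHeader
    WHERE TransactionType IN (1, 6)
      AND SaleDate >= '2016-01-01'
      AND SaleDate < '2019-01-01'
    GROUP BY AccountID
) th ON th.AccountID = a.ID
WHERE a.IsActive = 1
ORDER BY a.CompanyName
"""
last_sales = conn.execute(query).fetchall()
for row in last_sales:
    print(row)
# ('ABC123', '2016-08-09')
# ('CDE456', '2017-03-21')
# ('EFG789', '2018-05-27')
# ('Test456', '2018-10-11')
```

The subquery groups by AccountID only. Grouping by SaleDate as well, as in the question's query, yields one row per distinct sale date, which is what produces the repeated company names.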
Example:
Customer_ID Transaction_type
111 Payroll
111 Saving
112 payroll
113 Online
113 Payroll
114 Payroll
1) I want Customer_IDs 112 and 114, who have only a payroll account.
2) Separately, I want customers 111 and 113, who have other transaction types along with payroll.
The question is still not very clear, but a simple GROUP BY clause is enough for the above.
Customers who have only a payroll account:
select Customer_ID
from yourtable
group by Customer_ID
having count(Transaction_type) = 1 and
       sum(case when Transaction_type = 'payroll' then 1 else 0 end) = 1
Customers who have other transaction types along with payroll:
select Customer_ID
from yourtable
group by Customer_ID
having count(Transaction_type) > 1 and
       sum(case when Transaction_type = 'payroll' then 1 else 0 end) = 1
Assuming you want to query customers having only one transaction type
SELECT Customer_ID, COUNT(*)
FROM yourtable
GROUP BY Customer_ID
HAVING COUNT(*) = 1;
In the same way, to query the customers having more than one transaction type:
SELECT Customer_ID, COUNT(*)
FROM yourtable
GROUP BY Customer_ID
HAVING COUNT(*) > 1;
PS: This can be refined based on your question
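One possible refinement, sketched with Python's sqlite3 module, compares the number of payroll rows with the total row count per customer instead of testing for exactly one row. This is a variant of the answers above, not a restatement of them: it also works if a customer has several payroll rows. The table name `transactions` is a placeholder, and `LOWER()` is used because the sample data mixes 'Payroll' and 'payroll'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer_id INTEGER, transaction_type TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (111, "Payroll"),
        (111, "Saving"),
        (112, "payroll"),
        (113, "Online"),
        (113, "Payroll"),
        (114, "Payroll"),
    ],
)

# 1) Every row of the customer is a payroll row.
only_payroll_sql = """
SELECT customer_id
FROM transactions
GROUP BY customer_id
HAVING SUM(CASE WHEN LOWER(transaction_type) = 'payroll' THEN 1 ELSE 0 END)
     = COUNT(*)
ORDER BY customer_id
"""

# 2) At least one payroll row and at least one non-payroll row.
payroll_plus_other_sql = """
SELECT customer_id
FROM transactions
GROUP BY customer_id
HAVING SUM(CASE WHEN LOWER(transaction_type) = 'payroll' THEN 1 ELSE 0 END) >= 1
   AND SUM(CASE WHEN LOWER(transaction_type) = 'payroll' THEN 0 ELSE 1 END) >= 1
ORDER BY customer_id
"""

only_payroll = [r[0] for r in conn.execute(only_payroll_sql)]
payroll_plus_other = [r[0] for r in conn.execute(payroll_plus_other_sql)]

print(only_payroll)        # [112, 114]
print(payroll_plus_other)  # [111, 113]
```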