Number of users buying the exact same product from the same shop more than 2 times in 1 year - SQL

I have data like this:
date user prod shop cat1 cat2
2022-02-01 1 a a ah g
2022-02-02 1 a1 b ah g
2022-04-03 1 a a ah g
2022-04-19 1 a a ah g
2022-05-01 2 b c bg g
I want to know how many users bought the same product in the same shop more than 2 times within a 1-year period. The result I want looks like this:
table 1
cat1 number_of_user
ah 1
table 2
cat2 number_of_user
g 1
For the total number of users, my query looks like this:
WITH data_product AS(
SELECT DATE(payment_time) date,
user,
CONCAT(prod, "_", shop) product_shop,
cat1,
cat2
FROM
a
WHERE
DATE(payment_time) BETWEEN "2022-01-01" AND DATE_SUB(current_date, INTERVAL 1 day)
ORDER BY 1,2,3),
purchased AS (
SELECT user, product_shop, count(product_shop) tot_purchased
FROM data_product
GROUP BY 1,2
HAVING COUNT(product_shop) > 2
)
SELECT COUNT(user) number_of_user FROM purchased
Please help me get the number of users who bought the same product in the same shop more than 2 times in the period, broken down by cat1 and cat2.

Try this:
create temporary table table1 as(
select *,extract(YEAR from date) as year from `projectid.dataset.table`
);
create temporary table table2 as(
select * except(date,cat2) ,count(user) over(partition by cat1,year,user,prod,shop) tcount from table1
);
create temporary table table4 as(
select * except(date,cat1) ,count(user) over(partition by cat2,year,user,prod,shop) tcount from table1
);
select distinct year,cat1 ,count(distinct user) number_of_user from table2 where tcount>2 group by YEAR,cat1;
select distinct year,cat2 ,count(distinct user) number_of_user from table4 where tcount>2 group by YEAR,cat2;
If you want a single result set you can union both the select statements.

I think this query might work. The first part counts the customers who purchased the same product from the same shop within one year, per cat1. The second part does the same per cat2. The two result sets are then combined with a union:
with cte as
(select distinct
PDate,userID as userID,prod as prod,shop,cat1 as cat1,cat2,
count(userID) over (partition by UserID,prod,shop,year(Pdate),cat1) as cat1_count,
count(PDate) over (partition by UserID,prod,shop,year(Pdate),cat2) as cat2_count
from tbl1)
select
cte.cat1 as c1,'0' as c2,count(distinct cte.userID) as Num
from cte
-- the question asks for more than 2 purchases
where cte.cat1_count>2
group by cte.cat1
union
select
'0',cte.cat2,count(distinct cte.userID)
from cte
where cte.cat2_count>2
group by cte.cat2
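For comparison, here is a minimal sketch of the same idea written with GROUP BY and HAVING instead of window functions, run against the question's sample rows in an in-memory SQLite database. The table name `purchases`, the column name `user_id`, the helper `users_per_category`, and the fixed 2022 date range are assumptions made for illustration; the asker's real table is on BigQuery and may need different syntax.

```python
import sqlite3

# Sample rows copied from the question. The table name `purchases` and the
# column name `user_id` are assumptions for this sketch.
rows = [
    ("2022-02-01", 1, "a", "a", "ah", "g"),
    ("2022-02-02", 1, "a1", "b", "ah", "g"),
    ("2022-04-03", 1, "a", "a", "ah", "g"),
    ("2022-04-19", 1, "a", "a", "ah", "g"),
    ("2022-05-01", 2, "b", "c", "bg", "g"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases "
    "(date TEXT, user_id INTEGER, prod TEXT, shop TEXT, cat1 TEXT, cat2 TEXT)"
)
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?, ?, ?)", rows)


def users_per_category(conn, cat_col):
    """Count users with more than 2 purchases of one prod/shop pair, per category."""
    # cat_col is interpolated into the SQL text, so only allow known columns.
    if cat_col not in ("cat1", "cat2"):
        raise ValueError(cat_col)
    sql = f"""
        WITH repeat_buyers AS (
            -- one row per (category, user, product, shop) bought more than twice
            SELECT {cat_col} AS cat, user_id
            FROM purchases
            WHERE date BETWEEN '2022-01-01' AND '2022-12-31'
            GROUP BY {cat_col}, user_id, prod, shop
            HAVING COUNT(*) > 2
        )
        SELECT cat, COUNT(DISTINCT user_id) AS number_of_user
        FROM repeat_buyers
        GROUP BY cat
        ORDER BY cat
    """
    return conn.execute(sql).fetchall()


by_cat1 = users_per_category(conn, "cat1")
by_cat2 = users_per_category(conn, "cat2")
print(by_cat1)  # [('ah', 1)]
print(by_cat2)  # [('g', 1)]
```

On the sample data only user 1 qualifies (product a in shop a, three purchases), which matches table 1 and table 2 in the question.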

Related

multiple top n aggregates query defined as a view (or function)?

I couldn't find a past question exactly like this problem. I have an orders table containing a customer id, an order date, and several numeric columns (how many of a particular item were ordered on that date). Leaving out some of the numeric columns, it looks like this:
customer_id date a b c d
0001 07/01/22 0 3 3 5
0001 07/12/22 12 0 50 0
0002 06/30/22 5 65 0 30
0002 07/20/22 1 0 19 2
0003 08/01/22 0 0 99 0
I need to sum each numeric column by customer_id, then return the top n customers for each column. Obviously that means a single customer may appear multiple times, once for each column. Assuming top 2, the desired output would look something like this:
column_ranked customer_id sum rank
'a' 001 12 1
'a' 002 6 2
'b' 002 65 1
'b' 001 3 2
'c' 003 99 1
'c' 001 53 2
'd' 002 32 1
'd' 001 5 2
(this assumes no date range filter)
My first thought was a CTE to collapse the table into its per-customer sums, then a union from the CTE, with a limit n clause, once for each summed column. That works if the date range is hard-coded into the CTE .... but I want to define this as a view, so it can be called by users something like this:
SELECT * from top_customers_view WHERE date_range BETWEEN ( date1 and date2 )
How can I pass the date restriction down to the CTE? Or am I taking the wrong approach entirely? If a view isn't possible, can it be done as a function? (without using a costly cursor, that is.)
Since the date ranges clearly produce a massive number of combinations you cannot generate a view with them. You can write a query, however, as shown below:
with
p as (select cast ('2022-01-01' as date) as ds, cast ('2022-12-31' as date) as de),
a as (
select top 10 customer_id, 'a' as col, sum(a) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
),
b as (
select top 10 customer_id, 'b' as col, sum(b) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
),
c as (
select top 10 customer_id, 'c' as col, sum(c) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
),
d as (
select top 10 customer_id, 'd' as col, sum(d) as s
from t cross join p where date between ds and de
group by customer_id order by s desc
)
select * from a
union all select * from b
union all select * from c
union all select * from d
order by customer_id, col, s desc
The date range is in the second line.
See db<>fiddle.
Alternatively, you could create a data warehousing solution, but it would require much more effort to make it work.
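Since the question also asks whether this can be done as a function, here is a rough sketch of that idea: a parameterized query wrapped in a function that takes the date range and n. It uses Python's sqlite3 module and ROW_NUMBER rather than the TOP syntax above, so it is an alternative technique, not the answer's query. The table name `orders`, the function name `top_customers`, and the ISO date strings are assumptions for illustration.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings so that
# BETWEEN compares them correctly as text. The table name `orders` is assumed.
rows = [
    ("0001", "2022-07-01", 0, 3, 3, 5),
    ("0001", "2022-07-12", 12, 0, 50, 0),
    ("0002", "2022-06-30", 5, 65, 0, 30),
    ("0002", "2022-07-20", 1, 0, 19, 2),
    ("0003", "2022-08-01", 0, 0, 99, 0),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id TEXT, date TEXT, a INT, b INT, c INT, d INT)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)", rows)

TOP_N_SQL = """
WITH sums AS (
    -- collapse the date range into one row of sums per customer
    SELECT customer_id, SUM(a) AS a, SUM(b) AS b, SUM(c) AS c, SUM(d) AS d
    FROM orders
    WHERE date BETWEEN :ds AND :de
    GROUP BY customer_id
),
unpivoted AS (
    -- turn the four sum columns into (col, customer_id, s) rows
    SELECT 'a' AS col, customer_id, a AS s FROM sums
    UNION ALL SELECT 'b', customer_id, b FROM sums
    UNION ALL SELECT 'c', customer_id, c FROM sums
    UNION ALL SELECT 'd', customer_id, d FROM sums
),
ranked AS (
    SELECT col, customer_id, s,
           ROW_NUMBER() OVER (PARTITION BY col ORDER BY s DESC, customer_id) AS rnk
    FROM unpivoted
)
SELECT col, customer_id, s, rnk
FROM ranked
WHERE rnk <= :n
ORDER BY col, rnk
"""


def top_customers(conn, ds, de, n):
    """Return the top n customers per summed column within [ds, de]."""
    return conn.execute(TOP_N_SQL, {"ds": ds, "de": de, "n": n}).fetchall()


result = top_customers(conn, "2022-01-01", "2022-12-31", 2)
for row in result:
    print(row)
```

The date range is applied inside the first CTE, before the sums are computed, which is the part a plain view over pre-aggregated data cannot do. In a database that supports table-valued functions the same shape could probably be used, but that is not shown here.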

SQL Group By most recent date and sales value

I have the following sales table that displays the customer ID, their name, the order amount, and the order date.
ID Name Order Date
1 A 25 11/10/2006
1 A 10 5/25/2010
1 A 10 6/18/2018
2 B 20 3/31/2008
2 B 15 11/15/2010
3 C 35 1/1/2019
3 C 20 4/12/2007
3 C 10 3/20/2010
3 C 5 10/19/2012
4 D 15 12/12/2013
4 D 15 2/18/2010
5 E 25 12/11/2006
6 F 10 5/1/2016
I am trying to group the data so that for each customer it would only show me their most recent order and the amount, as per below:
ID Name Order Date
1 A 10 6/18/2018
2 B 15 11/15/2010
3 C 35 1/1/2019
4 D 15 12/12/2013
5 E 25 12/11/2006
6 F 10 5/1/2016
So far I've only been able to group by ID and Name, because adding the Order column would also group by that column as well.
SELECT
ID,
Name,
MAX(Date) 'Most recent date'
FROM Table
GROUP BY ID, Name
How can I also add the order amount for each Customer?
SELECT ID, Name, Order, Date FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS sn
FROM your_table_name
) A WHERE sn = 1;
You could use a subquery for the max date and join it back to the table:
select a.ID, a.Name, b.max_date
from Table a
inner join (
select name, max(Date) max_date
from Table
group by name
) b on a. name = b.name and a.date = b.max_date
You can use this query to get the expected result:
SELECT S.*
FROM Sales S
CROSS APPLY
(
SELECT ID, Max(Date) MaxDate
FROM Sales
GROUP BY ID
)T
WHERE S.ID = T.ID
AND S.Date = T.MaxDate
ORDER BY S.ID
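As a quick check of the ROW_NUMBER approach from the first answer, here is a small sketch that runs it on the question's data in an in-memory SQLite database. The table name `sales` and the column names `amount` and `order_date` are assumptions (chosen to avoid the reserved words Order and Date), and the dates are rewritten as ISO strings so they sort correctly as text.

```python
import sqlite3

# Rows from the question. Column names `amount` and `order_date` are assumed
# stand-ins for the original Order and Date columns.
rows = [
    (1, "A", 25, "2006-11-10"), (1, "A", 10, "2010-05-25"), (1, "A", 10, "2018-06-18"),
    (2, "B", 20, "2008-03-31"), (2, "B", 15, "2010-11-15"),
    (3, "C", 35, "2019-01-01"), (3, "C", 20, "2007-04-12"),
    (3, "C", 10, "2010-03-20"), (3, "C", 5, "2012-10-19"),
    (4, "D", 15, "2013-12-12"), (4, "D", 15, "2010-02-18"),
    (5, "E", 25, "2006-12-11"),
    (6, "F", 10, "2016-05-01"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, name TEXT, amount INTEGER, order_date TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

LATEST_SQL = """
SELECT id, name, amount, order_date
FROM (
    -- number each customer's orders from newest to oldest
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY order_date DESC) AS rn
    FROM sales
) AS ranked
WHERE rn = 1
ORDER BY id
"""

latest = conn.execute(LATEST_SQL).fetchall()
for row in latest:
    print(row)
```

Because the whole row is kept and only filtered on the row number, the order amount comes along without having to appear in a GROUP BY.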

How to pull unique SQL values from two different tables into one column?

I have two tables, each with non-distinct IDs, products, and sales dates. I want a combined table where the rows are the distinct IDs, and the columns are the earliest sales date of each product.
The tables look like the following:
Table_1
Member_ID Product Sales_Date
1 A 01/01/2021
1 A 02/01/2021
2 A 01/01/2021
3 A 02/02/2021
Table_2
Member_ID Product Sales_Date
1 B 04/01/2021
1 B 05/01/2021
2 B 04/01/2021
3 B 03/01/2021
And my desired end table would be:
Merged_table
Member_ID Product_A_earliest_date Product_B_earliest_date
1 01/01/2021 04/01/2021
2 01/01/2021 04/01/2021
3 02/02/2021 03/01/2021
I have tried the following code to merge the tables:
create table merged_table as
select member_id, min(a.sales_date) as Product_A_earliest_date, min(b.sales_date) as Product_B_earliest_date from(
select member_id from table_1 as a
UNION
select member_id from table_2 as b);
But this provides 'missing at EOF' errors. Am I incorrectly using the UNION function?
Assuming SQL Server or some other engine that supports CTEs:
The first challenge is getting all the IDs, since it can't be assumed that every salesperson will sell both products.
Once you have that, it's fairly straightforward.
I did not test this out, but this is the approach.
;with ids as ( select member_id from table_1
union -- union already removes duplicate ids
select member_id from table_2
)
, uniqueIds as ( select distinct member_id from ids)
-- earliest sale per member in each table
, t1 as ( select member_id, product, min(sales_date) [sales_date]
from Table_1
group by member_id,product)
, t2 as ( select member_id, product, min(sales_date) [sales_date]
from Table_2
group by member_id,product)
select ui.member_id, t1.sales_date [EarlyDate_a], t2.sales_date [earlydate_b]
from uniqueIds ui
left join t1 on ui.member_id = t1.member_id
left join t2 on ui.member_id = t2.member_id
We usually get missing at EOF when the statement is incomplete.
There are a few issues with your sql statement:
The alias a used for table_1 and the alias b used for table_2 do not exist outside of the subquery used to create the union
You are using the aggregate function min and a non-aggregate column member_id without having member_id in the group by clause.
Your subquery being used in your outermost from clause is missing an alias
You may try the following query
SELECT
Member_ID,
MIN(
CASE WHEN Product='A' THEN Sales_Date END
) as Product_A_earliest_date,
MIN(
CASE WHEN Product='B' THEN Sales_Date END
) as Product_B_earliest_date
FROM (
SELECT Member_ID, Product, Sales_Date FROM table_1
UNION ALL
SELECT Member_ID, Product, Sales_Date FROM table_2
) t
GROUP BY Member_ID
ORDER BY Member_ID
member_id product_a_earliest_date product_b_earliest_date
1 2021-01-01 2021-04-01
2 2021-01-01 2021-04-01
3 2021-02-02 2021-03-01
View working demo on DB Fiddle
Let me know if this works for you.

Return last amount for each element with same ref_id

I have 2 tables, one is credit and other one is creditdetails.
Creditdetails creates new row every day for each of credit.
ID Amount ref_id date
1 2 1 16.03
2 3 1 17.03
3 4 1 18.03
4 1 2 16.03
5 2 2 17.03
6 0 2 18.03
I want to sum up the amount of the row with the last date for each unique ref_id. So the output should be 4 + 0.
You can use ROW_NUMBER to filter on the latest amount per ref_id.
Then SUM it.
SELECT SUM(q.Amount) AS TotalLatestAmount
FROM
(
SELECT
cd.ref_id,
cd.Amount,
ROW_NUMBER() OVER (PARTITION BY cd.ref_id ORDER BY cd.date DESC) AS rn
FROM Creditdetails cd
) q
WHERE q.rn = 1;
A test on db<>fiddle here
With this query:
select ref_id, max(date) maxdate
from creditdetails
group by ref_id
you get all the last dates for each ref_id, so you can join it to the table creditdetails and sum over amount:
select sum(amount) total
from creditdetails c inner join (
select ref_id, max(date) maxdate
from creditdetails
group by ref_id
) g
on g.ref_id = c.ref_id and g.maxdate = c.date
I think you want something like this,
select sum(amount)
from table
where date = ( select max(date) from table);
with the understanding that your date column doesn't appear to be in a standard format so I can't tell if it needs to be formatted in the query to work properly.
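The per-ref_id approaches and the global max(date) approach agree on the sample data, but they can diverge when the ref_ids do not all share the same last date. Here is a small sketch of that difference using SQLite. The year 2023 in the ISO dates is an assumption (the question only shows day.month), and deleting row 6 is a hypothetical variation added for illustration.

```python
import sqlite3

# Sample rows from the question; the year is assumed so that dates sort as text.
rows = [
    (1, 2, 1, "2023-03-16"),
    (2, 3, 1, "2023-03-17"),
    (3, 4, 1, "2023-03-18"),
    (4, 1, 2, "2023-03-16"),
    (5, 2, 2, "2023-03-17"),
    (6, 0, 2, "2023-03-18"),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE creditdetails (id INTEGER, amount INTEGER, ref_id INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO creditdetails VALUES (?, ?, ?, ?)", rows)

# Latest row per ref_id, then sum (the ROW_NUMBER approach).
PER_REF_SQL = """
SELECT SUM(amount) FROM (
    SELECT amount,
           ROW_NUMBER() OVER (PARTITION BY ref_id ORDER BY date DESC) AS rn
    FROM creditdetails
) AS q
WHERE rn = 1
"""

# Rows on the single latest date in the whole table (the global max approach).
GLOBAL_MAX_SQL = """
SELECT SUM(amount) FROM creditdetails
WHERE date = (SELECT MAX(date) FROM creditdetails)
"""

per_ref_total = conn.execute(PER_REF_SQL).fetchone()[0]
global_total = conn.execute(GLOBAL_MAX_SQL).fetchone()[0]
print(per_ref_total, global_total)  # 4 4

# Hypothetical variation: ref_id 2 has no row yet for the latest date.
conn.execute("DELETE FROM creditdetails WHERE id = 6")
per_ref_after = conn.execute(PER_REF_SQL).fetchone()[0]
global_after = conn.execute(GLOBAL_MAX_SQL).fetchone()[0]
print(per_ref_after, global_after)  # 6 4
```

If every credit is guaranteed to get a row every day, the simpler global query is enough; otherwise the per-ref_id version is the safer reading of "last amount for each element".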

Find duplicate values with different year in SQL

How can I find duplicate records in a table that have different years?
My sample data:
Cus_No Item_No Ord_Dt Orders
1 A 2016 1
1 A 2017 2
1 B 2016 1
2 B 2015 1
2 B 2018 1
Output needed
Cus_No Item_No Ord_Dt Orders
1 A 2016 1
1 A 2017 2
2 B 2015 1
2 B 2018 1
I am trying to collect all records with the same Cus_No, the same Item_No that has any value in Orders and exist in any year in Ord_dt. The reason is, I need to find those items with the same customer number that has orders from all years.
I am using MS Query and this is the SQL statement I tried but still displays all records.
SELECT `'table'`.Cus_No, `'table'`.Item_No, `'table'`.Ord_Dt, `'table'`.Orders
FROM `'table'`
WHERE (`'table'`.Orders>=1) AND (`'table'`.Ord_Dt In ('2016','2017'))
Assuming you want to identify records corresponding to customer items which appeared across more than one year, we can try the following:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT Cus_No, Item_No
FROM yourTable
GROUP BY Cus_No, Item_No
HAVING COUNT(DISTINCT Ord_Dt) > 1
) t2
ON t1.Cus_No = t2.Cus_No AND
t1.Item_No = t2.Item_No
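The query above can be tried against the question's sample data with a short SQLite sketch like the one below. The table name `yourTable` is the placeholder from the answer, and the ORDER BY is added only to make the printed output deterministic.

```python
import sqlite3

# Sample rows from the question.
rows = [
    (1, "A", 2016, 1),
    (1, "A", 2017, 2),
    (1, "B", 2016, 1),
    (2, "B", 2015, 1),
    (2, "B", 2018, 1),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (Cus_No INTEGER, Item_No TEXT, Ord_Dt INTEGER, Orders INTEGER)"
)
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?)", rows)

MULTI_YEAR_SQL = """
SELECT t1.*
FROM yourTable t1
INNER JOIN (
    -- customer/item pairs that appear in more than one distinct year
    SELECT Cus_No, Item_No
    FROM yourTable
    GROUP BY Cus_No, Item_No
    HAVING COUNT(DISTINCT Ord_Dt) > 1
) t2
    ON t1.Cus_No = t2.Cus_No AND t1.Item_No = t2.Item_No
ORDER BY t1.Cus_No, t1.Item_No, t1.Ord_Dt
"""

multi_year = conn.execute(MULTI_YEAR_SQL).fetchall()
for row in multi_year:
    print(row)
```

The pair (1, B) is dropped because it only appears in 2016, which matches the "Output needed" table in the question.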
Below query returns duplicates -
select * from test where (cus_no, item_no) in (
select Cus_No, item_no from test group by Cus_No, item_no having count(*) > 1)
Stab in the dark:
with agg as (
select cus_no, item_no from T
where ord_dt in (...)
group by cus_no, item_no
-- count distinct years so repeated rows within one year are not double counted
having count(distinct ord_dt) = <num years in list>
)
select t.*
from T t inner join agg a
on a.cus_no = t.cus_no
and a.item_no = t.item_no;