Find duplicate values with different year in SQL

How can I find duplicate records in a table that have different years?
My sample data:
Cus_No Item_No Ord_Dt Orders
1 A 2016 1
1 A 2017 2
1 B 2016 1
2 B 2015 1
2 B 2018 1
Output needed
Cus_No Item_No Ord_Dt Orders
1 A 2016 1
1 A 2017 2
2 B 2015 1
2 B 2018 1
I am trying to collect all records that share the same Cus_No and the same Item_No, have any value in Orders, and exist in any year in Ord_Dt. The reason is that I need to find those items with the same customer number that have orders from all years.
I am using MS Query and this is the SQL statement I tried, but it still displays all records.
SELECT `'table'`.Cus_No, `'table'`.Item_No, `'table'`.Ord_Dt, `'table'`.Orders
FROM `'table'`
WHERE (`'table'`.Orders>=1) AND (`'table'`.Ord_Dt In ('2016','2017'))

Assuming you want to identify records corresponding to customer items which appeared across more than one year, we can try the following:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT Cus_No, Item_No
FROM yourTable
GROUP BY Cus_No, Item_No
HAVING COUNT(DISTINCT Ord_Dt) > 1
) t2
ON t1.Cus_No = t2.Cus_No AND
t1.Item_No = t2.Item_No
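To check the query above against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the column types are assumptions for illustration; the question itself uses MS Query, so quoting and syntax details may differ there.

```python
import sqlite3

# In-memory table loaded with the question's sample rows (column types assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (Cus_No INTEGER, Item_No TEXT, Ord_Dt INTEGER, Orders INTEGER)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [(1, "A", 2016, 1), (1, "A", 2017, 2), (1, "B", 2016, 1),
     (2, "B", 2015, 1), (2, "B", 2018, 1)],
)

# Keep only (Cus_No, Item_No) pairs that appear in more than one distinct year.
rows = conn.execute("""
    SELECT t1.Cus_No, t1.Item_No, t1.Ord_Dt, t1.Orders
    FROM yourTable t1
    INNER JOIN (
        SELECT Cus_No, Item_No
        FROM yourTable
        GROUP BY Cus_No, Item_No
        HAVING COUNT(DISTINCT Ord_Dt) > 1
    ) t2
      ON t1.Cus_No = t2.Cus_No AND t1.Item_No = t2.Item_No
    ORDER BY t1.Cus_No, t1.Item_No, t1.Ord_Dt
""").fetchall()

for row in rows:
    print(row)
```

The row (1, B, 2016, 1) drops out because customer 1 ordered item B in a single year only.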

The query below returns duplicates:
select * from test where (cus_no, item_no) in (
select Cus_No, item_no from test group by Cus_No, item_no having count(*) > 1)

Stab in the dark:
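with agg as (
select Cus_No, Item_No from T
-- WHERE has to come before GROUP BY
where Ord_Dt in (...)
group by Cus_No, Item_No
-- count distinct years so repeated rows within one year are not double counted
having count(distinct Ord_Dt) = <num years in list>
)
select t.*
from T t inner join agg a
on a.Cus_No = t.Cus_No
and a.Item_No = t.Item_No;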

Related

Number of users buying the exact same product from the same shop more than 2 times in 1 year

I have data like this:
date user prod shop cat1 cat2
2022-02-01 1 a a ah g
2022-02-02 1 a1 b ah g
2022-04-03 1 a a ah g
2022-04-19 1 a a ah g
2022-05-01 2 b c bg g
I want to know how many users buy the same product in the same shop more than 2 times within a period of 1 year. The result I want looks like:
table 1
cat1 number_of_user
ah 1
table 2
cat2 number_of_user
g 1
For the total number of users, my query looks like this:
WITH data_product AS(
SELECT DATE(payment_time) date,
user,
CONCAT(prod, "_", shop) product_shop,
cat1,
cat2
FROM
a
WHERE
DATE(payment_time) BETWEEN "2022-01-01" AND DATE_SUB(current_date, INTERVAL 1 day)
ORDER BY 1,2,3),
purchased AS (
SELECT user, product_shop, count(product_shop) tot_purchased
FROM data_product
GROUP BY 1,2
HAVING COUNT(product_shop) > 2
)
SELECT COUNT(user) number_of_user FROM purchased
Please help me get the number of users who buy the same product in the same shop more than 2 times in the period, broken down by cat1 and by cat2.
Try this:
create temporary table table1 as(
select *,extract(YEAR from date) as year from `projectid.dataset.table`
);
create temporary table table2 as(
select * except(date,cat2) ,count(user) over(partition by cat1,year,user,prod,shop) tcount from table1
);
create temporary table table4 as(
select * except(date,cat1) ,count(user) over(partition by cat2,year,user,prod,shop) tcount from table1
);
select distinct year,cat1 ,count(distinct user) number_of_user from table2 where tcount>2 group by YEAR,cat1;
select distinct year,cat2 ,count(distinct user) number_of_user from table4 where tcount>2 group by YEAR,cat2;
If you want a single result set you can union both the select statements.
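As a quick cross-check against the sample rows in the question, here is a small sketch that runs the same idea in SQLite through Python's sqlite3 module. It swaps the window count for a plain GROUP BY ... HAVING, and the table name purchases and the helper users_per_category are made up for this example; BigQuery syntax (temporary tables, EXTRACT) will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (date TEXT, user INTEGER, prod TEXT, shop TEXT, cat1 TEXT, cat2 TEXT)"
)
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?, ?, ?)", [
    ("2022-02-01", 1, "a", "a", "ah", "g"),
    ("2022-02-02", 1, "a1", "b", "ah", "g"),
    ("2022-04-03", 1, "a", "a", "ah", "g"),
    ("2022-04-19", 1, "a", "a", "ah", "g"),
    ("2022-05-01", 2, "b", "c", "bg", "g"),
])

def users_per_category(cat_col):
    """Count users with more than 2 purchases of one product in one shop per year."""
    # cat_col is interpolated into the SQL, so only allow known column names.
    if cat_col not in ("cat1", "cat2"):
        raise ValueError("unexpected category column: " + cat_col)
    return conn.execute(f"""
        SELECT {cat_col}, COUNT(DISTINCT user) AS number_of_user
        FROM (
            SELECT {cat_col}, user
            FROM purchases
            GROUP BY {cat_col}, strftime('%Y', date), user, prod, shop
            HAVING COUNT(*) > 2
        )
        GROUP BY {cat_col}
        ORDER BY {cat_col}
    """).fetchall()

table1 = users_per_category("cat1")
table2 = users_per_category("cat2")
print(table1)
print(table2)
```

Only user 1 qualifies, with three purchases of product a in shop a during 2022, which matches the two result tables the question asks for.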
I think this query might work. The first part counts the customers who purchased the same product from the same shop within one year, per cat1. The second part does the same for cat2, and then the two sets are combined with a UNION:
with cte as
(select distinct
PDate,userID as userID,prod as prod,shop,cat1 as cat1,cat2,
count(userID) over (partition by UserID,prod,shop,year(Pdate),cat1) as cat1_count,
count(PDate) over (partition by UserID,prod,shop,year(Pdate),cat2) as cat2_count
from tbl1)
select
cte.cat1 as c1,'0' as c2,count(distinct cte.userID) as Num
from cte
-- more than 2 purchases of the same product in the same shop and year
where cte.cat1_count>2
group by cte.cat1
union
select
'0',cte.cat2,count(distinct cte.userID)
from cte
where cte.cat2_count>2
group by cte.cat2

Sql query like pivot table from vertical list in postgresql effectively

I want to write an SQL query that works like a pivot table over a vertical list. Here is an example:
My table data:
Table1 -> Metadata:
id text
1 t1
2 t2
3 t3
Table2 -> Report:
date category revenue metadata_id
2020-01-01 TRIAL 1 1
2020-01-01 PURCHASE 1.2 2
2020-01-03 SUBSCRIPTION 1.4 3
2020-01-03 PURCHASE 1.1 3
...
I want to write an SQL query that returns the result for a specific date range and set of ids, like:
Request:
start-date: 2020-01-01
end-date: 2020-01-30
ids: 1,2....100
Expected result:
id text category_trial category_purchase category_SUBSCRIPTION
1 t1 1 0 0
2 t2 0 1.2 0
3 t3 0 1.1 1.4
I wrote the following SQL:
select
m.id,
m.text,
t1.rev as category_trial,
t2.rev as category_purchase,
t3.rev as category_SUBSCRIPTION
from metadata m
left join
(
select
metadata_id,
sum(revenue) as rev
from report where category = 'TRIAL' and report_date between '2020-01-01' and '2020-01-30'
group by metadata_id
) t1 on t1.metadata_id = m.id
left join
(
select
metadata_id,
sum(revenue) as rev
from report where category = 'PURCHASE' and report_date between '2020-01-01' and '2020-01-30'
group by metadata_id
) t2 on t2.metadata_id = m.id
left join
(
select
metadata_id,
sum(revenue) as rev
from report where category = 'SUBSCRIPTION' and report_date between '2020-01-01' and '2020-01-30'
group by metadata_id
) t3 on t3.metadata_id = m.id
...
I have more than 7 categories.
My problem is that this SQL works, but its performance is not good enough. Is there any way to improve it?
Note: I wrote it for PostgreSQL and I use indexes.
I do not use Postgres often, but this query should be faster:
select id, text,
sum(case category when 'TRIAL' then revenue else 0 end) cat_tri,
sum(case category when 'PURCHASE' then revenue else 0 end) cat_pur,
sum(case category when 'SUBSCRIPTION' then revenue else 0 end) cat_sub
from (
select id, text, category, revenue
from metadata m join report r on m.id = r.metadata_id
where date_ between '2020-01-01' and '2020-01-30'
and id between 1 and 100 ) t
group by id, text
order by id
Result as expected, only one join, filtering and grouping.
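For anyone who wants to try the conditional-aggregation idea locally, here is a minimal sketch in Python with the built-in sqlite3 module, using the sample rows from the question. SQLite stands in for PostgreSQL here, and the column name report_date is taken from the question's own query; both are assumptions for illustration and say nothing about performance on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE metadata (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE report (report_date TEXT, category TEXT, revenue REAL, metadata_id INTEGER);
    INSERT INTO metadata VALUES (1, 't1'), (2, 't2'), (3, 't3');
    INSERT INTO report VALUES
        ('2020-01-01', 'TRIAL', 1, 1),
        ('2020-01-01', 'PURCHASE', 1.2, 2),
        ('2020-01-03', 'SUBSCRIPTION', 1.4, 3),
        ('2020-01-03', 'PURCHASE', 1.1, 3);
""")

# One join and one GROUP BY; each CASE picks out the revenue of a single category.
rows = conn.execute("""
    SELECT m.id, m.text,
           SUM(CASE r.category WHEN 'TRIAL' THEN r.revenue ELSE 0 END) AS category_trial,
           SUM(CASE r.category WHEN 'PURCHASE' THEN r.revenue ELSE 0 END) AS category_purchase,
           SUM(CASE r.category WHEN 'SUBSCRIPTION' THEN r.revenue ELSE 0 END) AS category_subscription
    FROM metadata m
    JOIN report r ON r.metadata_id = m.id
    WHERE r.report_date BETWEEN ? AND ?
      AND m.id BETWEEN ? AND ?
    GROUP BY m.id, m.text
    ORDER BY m.id
""", ("2020-01-01", "2020-01-30", 1, 100)).fetchall()

for row in rows:
    print(row)
```

Because this is an inner join, a metadata row with no report in the date range would not appear at all; a LEFT JOIN with the date filter moved into the ON clause would keep it with zeros.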

Return last amount for each element with same ref_id

I have 2 tables, one is credit and other one is creditdetails.
Creditdetails gets a new row every day for each credit.
ID Amount ref_id date
1 2 1 16.03
2 3 1 17.03
3 4 1 18.03
4 1 2 16.03
5 2 2 17.03
6 0 2 18.03
I want to sum the amount from the latest row (by date) of each ref_id. So the output should be 4 + 0.
You can use ROW_NUMBER to filter on the latest amount per ref_id.
Then SUM it.
SELECT SUM(q.Amount) AS TotalLatestAmount
FROM
(
SELECT
cd.ref_id,
cd.Amount,
ROW_NUMBER() OVER (PARTITION BY cd.ref_id ORDER BY cd.date DESC) AS rn
FROM Creditdetails cd
) q
WHERE q.rn = 1;
With this query:
select ref_id, max(date) maxdate
from creditdetails
group by ref_id
you get all the last dates for each ref_id, so you can join it to the table creditdetails and sum over amount:
select sum(amount) total
from creditdetails c inner join (
select ref_id, max(date) maxdate
from creditdetails
group by ref_id
) g
on g.ref_id = c.ref_id and g.maxdate = c.date
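Here is a small sketch of that join on the sample rows, run through Python's sqlite3 module. The dates are stored as the text values shown in the question (an assumption; the real column type is unknown), so MAX(date) only picks the right rows here because all sample dates fall in the same month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE creditdetails (id INTEGER, amount INTEGER, ref_id INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO creditdetails VALUES (?, ?, ?, ?)", [
    (1, 2, 1, "16.03"), (2, 3, 1, "17.03"), (3, 4, 1, "18.03"),
    (4, 1, 2, "16.03"), (5, 2, 2, "17.03"), (6, 0, 2, "18.03"),
])

# Join each row to the latest date of its ref_id, then sum the matching amounts.
total = conn.execute("""
    SELECT SUM(c.amount) AS total
    FROM creditdetails c
    INNER JOIN (
        SELECT ref_id, MAX(date) AS maxdate
        FROM creditdetails
        GROUP BY ref_id
    ) g
      ON g.ref_id = c.ref_id AND g.maxdate = c.date
""").fetchone()[0]

print(total)
```

With a real date type the comparison would be chronological rather than text-based.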
I think you want something like this,
select sum(amount)
from table
where date = ( select max(date) from table);
with the understanding that your date column doesn't appear to be in a standard format so I can't tell if it needs to be formatted in the query to work properly.

using group by in subquery in sql

How can I get around this error?
Unable to use an aggregate or a subquery in an expression used in the
GROUP BY list of a GROUP BY clause.
Here is my query:
select Id, name,dayA,monthA,yearA,
sum(x) as x,
(select SUM(x) group by month) as total,
from table_A
group by Id,name,monthA,dAyA,yearA, SUM(x)
In other words:
Sample data:
id name dayA monthA yearA x
===========================
1 name1 2 3 2016 4
2 name2 2 3 2016 3
3 name1 2 3 2016 2
Expected result :
id name dayA monthA yearA x total
===================================
1 name1 2 3 2016 4 6
2 name2 2 3 2016 3 3
3 name1 2 3 2016 2 6
Thanks in advance
Your query has more than one problem.
Is (select SUM(x) group by month) as total meant to read from the same table? That seems unlikely, since the column month is not mentioned in your GROUP BY. When you use a subquery in the select list, you must guarantee that it returns only one record.
Based on your sample data and expected results...
create table table_A(
id int,
name varchar(25),
dayA int,
monthA int,
yearA int,
x int
)
insert into table_A
values (1,'name1',2,3,2016,4),
(2,'name2',2,3,2016,3),
(3,'name1',2,3,2016,2)
select ta.id, ta.name, ta.dayA, ta.monthA, ta.yearA, ta.x, total.Total from table_A as ta
left join
(select name, sum(x) as Total from table_A group by name) total on ta.name = total.name
group by
ta.id, ta.name, ta.dayA, ta.monthA, ta.yearA, ta.x, total.name, total.Total
Maybe this is what you want:
select table_A.*, TotalSums.total
from table_A
left join (select name, monthA, dayA, yearA, sum(x) as total from table_A group by name, monthA, dayA, yearA) as TotalSums
on table_A.name = TotalSums.name
and table_A.monthA = TotalSums.monthA
and table_A.dayA = TotalSums.dayA
and table_A.yearA = TotalSums.yearA
order by id
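To see this join produce the expected result from the question, here is a minimal sketch using Python's sqlite3 module with the three sample rows. SQLite is only a stand-in (the error message in the question comes from SQL Server), and the short aliases a and t are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_A (id INTEGER, name TEXT, dayA INTEGER, monthA INTEGER, yearA INTEGER, x INTEGER)"
)
conn.executemany("INSERT INTO table_A VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "name1", 2, 3, 2016, 4),
    (2, "name2", 2, 3, 2016, 3),
    (3, "name1", 2, 3, 2016, 2),
])

# Compute the per-group totals once in a derived table, then attach them to every row.
rows = conn.execute("""
    SELECT a.id, a.name, a.dayA, a.monthA, a.yearA, a.x, t.total
    FROM table_A a
    LEFT JOIN (
        SELECT name, monthA, dayA, yearA, SUM(x) AS total
        FROM table_A
        GROUP BY name, monthA, dayA, yearA
    ) t
      ON a.name = t.name
     AND a.monthA = t.monthA
     AND a.dayA = t.dayA
     AND a.yearA = t.yearA
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

The outer query needs no GROUP BY at all, which is what avoids the aggregate-in-GROUP-BY error.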
I think this is what you're looking for:
select main.Id, main.name, main.dayA, main.monthA, main.yearA,
sum(main.x) as x,
max(total.total) as total
from table_A as main
join (select SUM(x) total ,name ,monthA,yearA from table_A group by name,monthA,yearA) as total
on main.name = total.name
and main.monthA = total.monthA
and main.yearA = total.yearA
group by main.Id, main.name, main.monthA, main.dayA, main.yearA

left join without duplicate values using MIN()

I have a table_1:
id custno
1 1
2 2
3 3
and a table_2:
id custno qty descr
1 1 10 a
2 1 7 b
3 2 4 c
4 3 7 d
5 1 5 e
6 1 5 f
When I run this query to show the minimum order quantities from every customer:
SELECT DISTINCT table_1.custno,table_2.qty,table_2.descr
FROM table_1
LEFT OUTER JOIN table_2
ON table_1.custno = table_2.custno AND qty = (SELECT MIN(qty) FROM table_2
WHERE table_2.custno = table_1.custno )
Then I get this result:
custno qty descr
1 5 e
1 5 f
2 4 c
3 7 d
Customer 1 appears twice each time with the same minimum qty (& a different description) but I only want to see customer 1 appear once. I don't care if that is the record with 'e' as a description or 'f' as a description.
First of all... I'm not sure why you need to include table_1 in the queries to begin with:
select custno, min(qty) as min_qty
from table_2
group by custno;
But just in case there is other information that you need that wasn't included in the question:
select table_1.custno, ifnull(min(qty),0) as min_qty
from table_1
left outer join table_2
on table_1.custno = table_2.custno
group by table_1.custno;
"Generic" SQL way:
SELECT table_1.custno,table_2.qty,table_2.descr
FROM table_1, table_2
WHERE table_2.id = (SELECT TOP 1 id
FROM table_2
WHERE custno = table_1.custno
ORDER BY qty )
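Note that TOP 1 is SQL Server syntax; engines such as SQLite, MySQL and PostgreSQL use LIMIT 1 for the same thing. Here is a rough sketch of the same correlated-subquery idea in SQLite via Python's sqlite3 module, on the sample tables from the question. The extra ORDER BY on id is an assumption added to make the tie between 'e' and 'f' deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (id INTEGER, custno INTEGER)")
conn.execute("CREATE TABLE table_2 (id INTEGER, custno INTEGER, qty INTEGER, descr TEXT)")
conn.executemany("INSERT INTO table_1 VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])
conn.executemany("INSERT INTO table_2 VALUES (?, ?, ?, ?)", [
    (1, 1, 10, "a"), (2, 1, 7, "b"), (3, 2, 4, "c"),
    (4, 3, 7, "d"), (5, 1, 5, "e"), (6, 1, 5, "f"),
])

# For each customer, pick exactly one table_2 row: lowest qty, ties broken by lowest id.
rows = conn.execute("""
    SELECT t1.custno, t2.qty, t2.descr
    FROM table_1 t1
    JOIN table_2 t2
      ON t2.id = (SELECT id
                  FROM table_2
                  WHERE custno = t1.custno
                  ORDER BY qty, id
                  LIMIT 1)
    ORDER BY t1.custno
""").fetchall()

for row in rows:
    print(row)
```

Matching on the single id returned by the subquery is what guarantees one row per customer, even when several rows share the minimum qty.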
SQL Server 2008 way (probably faster):
SELECT custno, qty, descr
FROM
(SELECT
custno,
qty,
descr,
ROW_NUMBER() OVER (PARTITION BY custno ORDER BY qty) RowNum
FROM table_2
) A
WHERE RowNum = 1
If you use SQL-Server you could use ROW_NUMBER and a CTE:
WITH CTE AS
(
SELECT table_1.custno,table_2.qty,table_2.descr,
RN = ROW_NUMBER() OVER ( PARTITION BY table_1.custno
Order By table_2.qty ASC)
FROM table_1
LEFT OUTER JOIN table_2
ON table_1.custno = table_2.custno
)
SELECT custno, qty,descr
FROM CTE
WHERE RN = 1