Eliminate a duplicate result from a single column - sql

Let's say I have a result from a query that looks like this:
ContactID LeadSalePrice
---------------------------
45 19.90
45 18.00
32 17.50
But I want to eliminate duplicate ContactIDs, always taking the row with the higher price. So what I want is:
ContactID LeadSalePrice
---------------------------
45 19.90
32 17.50
Here's (a simplified version of) the query:
SELECT
sc.ContactID
, c.LeadSalePrice
FROM
LeadSalesCampaignCriterias c
JOIN LeadSalesCampaigns sc ON c.LeadSalesCampaignID = sc.LeadSalesCampaignID
WHERE
...
ORDER BY
LeadSalePrice DESC
I've been playing around with DISTINCT and GROUP BY, but I'm not getting it.

Just use GROUP BY:
SELECT sc.ContactID, MAX(c.LeadSalePrice) as LeadSalePrice
FROM LeadSalesCampaignCriterias c JOIN
LeadSalesCampaigns sc
ON c.LeadSalesCampaignID = sc.LeadSalesCampaignID
WHERE ...
GROUP BY sc.ContactID;
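To check the GROUP BY / MAX approach against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The single table `leads` is a hypothetical stand-in for the joined result, not the asker's real schema.

```python
import sqlite3

# Hypothetical single-table stand-in for the joined query result in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (ContactID INTEGER, LeadSalePrice REAL)")
conn.executemany(
    "INSERT INTO leads VALUES (?, ?)",
    [(45, 19.90), (45, 18.00), (32, 17.50)],
)

# One row per ContactID, keeping the highest price in each group.
rows = conn.execute(
    """
    SELECT ContactID, MAX(LeadSalePrice) AS LeadSalePrice
    FROM leads
    GROUP BY ContactID
    ORDER BY LeadSalePrice DESC
    """
).fetchall()

print(rows)  # [(45, 19.9), (32, 17.5)]
```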

Another option is TOP 1 WITH TIES combined with Row_Number() (SQL Server syntax):
Select Top 1 with Ties *
From YourTable
Order By Row_Number() over (Partition By ContactID Order By LeadSalePrice Desc)
Returns
ContactID LeadSalePrice
32 17.50
45 19.90

Related

SQL Query Problem Involving (SUM, Group By, Order by, I guess? and maybe total, or even count)

Using a SQL query, find the stores with the top 5 highest total Transaction Value. Which industries are they in, and how many of those stores are in each industry?
My SQL data looks like this:
Store Name  Industry  Transaction Value
---------------------------------------
Ace         A         196
Ace         A         193
Area        A         168
Apple       A         165
Boy         B         145
Boy         B         143
Bull        B         136
Bread       B         131
Cat         C         116
Cat         C         106
Cake        C         104
Candy       C         102
Dog         D         101
Dog         D         92
Door        D         80
Daddy       D         75
Egg         E         70
Egg         E         67
Earl        E         66
Eagle       E         61
This is just for your reference, Top 5 highest Transaction Value are:
No.  Store Name  Industry  Total Transaction Value
--------------------------------------------------
1    Ace         A         389
2    Boy         B         288
3    Cat         C         222
4    Dog         D         193
5    Area        A         168
SQL Query Results should look something like this:
Industry  No. of Stores
-----------------------
A         2
B         1
C         1
D         1
E         0
select a.industry, sum(case when b.name is null then 0 else 1 end) as no
from
(select distinct industry from transactions ) a
left join
(select name, industry
from transactions
group by name, industry
order by sum(transaction_value) desc limit 5) b
on a.industry = b.industry
group by a.industry
order by a.industry
I think I have a solution for you. Please check my code; I have used a Common Table Expression, CASE, SUM and GROUP BY:
WITH CTE AS
(
SELECT industry, SUM(TransactionValue) AS Transaction_Value,
COUNT(StoreName) AS StoreCount FROM MYTable
GROUP BY StoreName,industry
ORDER BY SUM(TransactionValue) DESC
Limit 5
)
SELECT T1.industry,
SUM((CASE WHEN c.industry IS NULL THEN 0
ELSE 1 END)) as CT
FROM
(SELECT DISTINCT Industry FROM MYTable) AS T1
LEFT JOIN CTE as c ON T1.industry=c.industry
GROUP BY T1.industry
Note: A subquery is not best practice, but in your case I think there will be no performance issue. Also, please check the code: I do not have a Snowflake database installed, so there might be some syntax errors.
To get a deterministic result, you must be aware of ties. Let's say the top 9 results are
Cat/A/600, Dog/A/500, Cat/B/500, Dog/B/400, Cat/C/300, Dog/C/300, Cat/D/300, Dog/D/200, Cat/E/100
Which row is the fifth? Cat/C/300 or Dog/C/300 or Cat/D/300? Or none of them? If we pick a row arbitrarily (by LIMIT 5 or FETCH FIRST 5 ROWS ONLY) we prefer one industry over another.
In standard SQL we have the clause FETCH FIRST 5 ROWS WITH TIES, but Snowflake doesn't feature this, unfortunately. It does however feature DENSE_RANK. It ranks my sample rows thus:
#1: Cat/A/600
#2: Dog/A/500
#2: Cat/B/500
#3: Dog/B/400
#4: Cat/C/300
#4: Dog/C/300
#4: Cat/D/300
#5: Dog/D/200
#6: Cat/E/100
because the five top values are 600, 500, 400, 300, and 200.
The query:
select industry, count(case when rnk <= 5 then 1 end) as stores
from
(
select industry, dense_rank() over (order by sum(transaction_value) desc) as rnk
from mytable
group by store_name, industry
) ranked
group by industry
order by industry;
If you only want to show top industries:
select industry, count(*) as stores
from
(
select industry, dense_rank() over (order by sum(transaction_value) desc) as rnk
from mytable
group by store_name, industry
) ranked
where rnk <= 5
group by industry
order by industry;
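As a rough check of the DENSE_RANK approach, the sketch below runs it on the sample data from the question using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The table and column names follow the answer above. Splitting the per-store totals into a CTE is my own restructuring for readability, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (store_name TEXT, industry TEXT, transaction_value INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("Ace", "A", 196), ("Ace", "A", 193), ("Area", "A", 168), ("Apple", "A", 165),
        ("Boy", "B", 145), ("Boy", "B", 143), ("Bull", "B", 136), ("Bread", "B", 131),
        ("Cat", "C", 116), ("Cat", "C", 106), ("Cake", "C", 104), ("Candy", "C", 102),
        ("Dog", "D", 101), ("Dog", "D", 92), ("Door", "D", 80), ("Daddy", "D", 75),
        ("Egg", "E", 70), ("Egg", "E", 67), ("Earl", "E", 66), ("Eagle", "E", 61),
    ],
)

rows = conn.execute(
    """
    WITH totals AS (
        -- total transaction value per store
        SELECT store_name, industry, SUM(transaction_value) AS total
        FROM mytable
        GROUP BY store_name, industry
    ),
    ranked AS (
        -- tied totals share a rank, so no store is dropped arbitrarily
        SELECT industry, DENSE_RANK() OVER (ORDER BY total DESC) AS rnk
        FROM totals
    )
    -- count only top-5 stores, but keep every industry in the output
    SELECT industry, COUNT(CASE WHEN rnk <= 5 THEN 1 END) AS stores
    FROM ranked
    GROUP BY industry
    ORDER BY industry
    """
).fetchall()

print(rows)  # [('A', 2), ('B', 1), ('C', 1), ('D', 1), ('E', 0)]
```

The sample data has no tied totals, so here the result is the same as a plain LIMIT 5 would give. The difference only shows up when several stores share a total.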

Select based on max date from another table

I'm trying to do a simple Select query by getting the country based on the MAX Last update from the other table.
Order#
1
2
3
4
The other table contains the country and the last update:
Order#  Cntry  Last Update
1              12/21/2019 9:19 PM
1       US     1/10/2020 1:07 AM
2       JP     7/29/2020 12:15 PM
3       CA     4/12/1992 2:04 PM
3       GB     11/6/2001 9:26 AM
3       DK     2/1/2005 3:04 AM
4       CN     8/20/2013 12:04 AM
4              10/1/2015 4:04 PM
My desired result:
Order# Country
1 US
2 JP
3 DK
4
I'm not sure what the right solution is. So far I'm stuck with this:
SELECT Main.[Order#], tempTable.Cntry
FROM Main
LEFT JOIN (
SELECT [Order#], Cntry, Max([Last Update]) as LatestDate FROM Country
GROUP BY [Order#], Cntry
) as tempTable ON Main.[Order#] = tempTable.[Order#];
Thanks in advance!
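If you only need the order number and country, you may not need two tables:
SELECT DISTINCT [Order#], Country
FROM
(
-- The explicit frame is needed; the default frame ends at the current row.
SELECT [Order#],
LAST_VALUE(Cntry) OVER (PARTITION BY [Order#] ORDER BY [Last Update]
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS Country
FROM Country
) X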
In SQL Server, you can use a correlated subquery:
update main
set country = (select top (1) s.country
from secondtable s
where s.order# = main.order#
order by s.lastupdate desc
);
EDIT:
A select would look quite similar:
select m.*,
(select top (1) s.country
from secondtable s
where s.order# = m.order#
order by s.lastupdate desc
) as country
from main m
I don't have time to try it with sample data, but is this what you are looking for?
select c.[Order#], c.Cntry
from Country c
where c.[Last Update] =
(select max(c2.[Last Update]) from Country c2 where c2.[Order#] = c.[Order#])
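To see the "latest row per order" idea run end to end, here is a minimal sketch with Python's built-in sqlite3 module. It uses the correlated-subquery form, with SQLite's LIMIT 1 in place of SQL Server's TOP (1). The names `order_no`, `cntry` and `last_update` are illustrative. The timestamps are stored as ISO strings and blank countries as NULL, both of which are assumptions about the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main (order_no INTEGER)")
conn.execute(
    "CREATE TABLE country (order_no INTEGER, cntry TEXT, last_update TEXT)"
)
conn.executemany("INSERT INTO main VALUES (?)", [(1,), (2,), (3,), (4,)])
# ISO-formatted timestamps sort correctly as plain strings (an assumption here).
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?)",
    [
        (1, None, "2019-12-21 21:19"),
        (1, "US", "2020-01-10 01:07"),
        (2, "JP", "2020-07-29 12:15"),
        (3, "CA", "1992-04-12 14:04"),
        (3, "GB", "2001-11-06 09:26"),
        (3, "DK", "2005-02-01 03:04"),
        (4, "CN", "2013-08-20 00:04"),
        (4, None, "2015-10-01 16:04"),
    ],
)

# For each order, pick the country from the row with the latest update.
rows = conn.execute(
    """
    SELECT m.order_no,
           (SELECT c.cntry
            FROM country c
            WHERE c.order_no = m.order_no
            ORDER BY c.last_update DESC
            LIMIT 1) AS cntry
    FROM main m
    ORDER BY m.order_no
    """
).fetchall()

print(rows)  # [(1, 'US'), (2, 'JP'), (3, 'DK'), (4, None)]
```

Order 4 comes back with no country because its most recent row has none, which matches the desired result in the question.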

Count double occurrences in order list

I have a list of orders, I need to find which ones occur with code 47 more than once with different users. For example:
ORDER_ID CODE USER
111 47 1
111 47 2
222 47 1
333 47 1
333 47 2
444 47 1
The expected result is 111 and 333.
How can I accomplish this?
Regards
I think you want aggregation and having:
select order_id
from orders o
where code = 47
group by order_id
having min(user) <> max(user);
You can also express the having as:
having count(distinct user) >= 2
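A quick way to try the aggregation on the sample rows is a small sketch with Python's built-in sqlite3 module. The column is named `user_id` here rather than `user`, purely to sidestep reserved-word issues in some databases. That rename is my assumption, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, code INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(111, 47, 1), (111, 47, 2), (222, 47, 1), (333, 47, 1), (333, 47, 2), (444, 47, 1)],
)

# Orders that have code 47 recorded by at least two different users.
result = [
    row[0]
    for row in conn.execute(
        """
        SELECT order_id
        FROM orders
        WHERE code = 47
        GROUP BY order_id
        HAVING COUNT(DISTINCT user_id) >= 2
        ORDER BY order_id
        """
    )
]

print(result)  # [111, 333]
```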
You can try the query below:
select order_id from tablename
where code = 47
group by order_id
having count(distinct user) >= 2
You can do it via row_number() as well, assuming each user appears at most once per order and code:
Select distinct order_id from
(select order_id, code, row_number()
over
( Partition by order_id, code
Order by user) rn
from
tablename
where code = 47
) t where rn >= 2
But since you already have a user column, I don't think you need the extra manipulation:
Select order_id, code from tablename
Group by order_id, code having
count(distinct user) >= 2

Top 2 Months of Sales by Customer - Oracle

I am trying to develop a query to pull out the top 2 months of sales by customer id. Here is a sample table:
Customer_ID Sales Amount Period
144567 40 2
234567 50 5
234567 40 7
144567 80 10
144567 48 2
234567 23 7
desired output would be
Customer_ID Sales Sum Period
144567 80 10
144567 48 2
234567 50 5
234567 40 7
I've tried
select sum(net_sales_usd_spot), valid_period, customer_id
from sales_trans_price_output
where valid_period in (select valid_period, sum(net_sales_usd_spot)
from sales_trans_price_output
where rank<=2)
group by valid_period, customer_id
The error is:
ORA-00913: too many values
I see why, but not sure how to rework it.
Try:
SELECT *
FROM (
SELECT t.*,
row_number() over (partition by customer_id order by sales_amount desc ) rn
FROM sales_trans_price t
)
WHERE rn <= 2
ORDER BY 1,2 desc
Demo: http://sqlfiddle.com/#!4/882888/3
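For readers without an Oracle instance, the same ROW_NUMBER pattern can be sketched with Python's built-in sqlite3 module (SQLite 3.25 or later). The table name `sales` and its columns are illustrative. Like the query above, this ranks individual rows rather than per-period sums, which is what the desired output in the question appears to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer_id INTEGER, sales_amount INTEGER, period INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (144567, 40, 2), (234567, 50, 5), (234567, 40, 7),
        (144567, 80, 10), (144567, 48, 2), (234567, 23, 7),
    ],
)

# Number each customer's rows from highest to lowest sale, then keep the first two.
rows = conn.execute(
    """
    SELECT customer_id, sales_amount, period
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id ORDER BY sales_amount DESC
               ) AS rn
        FROM sales t
    )
    WHERE rn <= 2
    ORDER BY customer_id, sales_amount DESC
    """
).fetchall()

print(rows)
# [(144567, 80, 10), (144567, 48, 2), (234567, 50, 5), (234567, 40, 7)]
```

If the top months should be judged by summed sales per period instead, the inner query would need to aggregate by customer and period before ranking.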
What if you change your where clause to:
where valid_period in
(
select p.valid_period from sales_trans_price_output p
join (select valid_period, sum(net_sales_usd_spot)
from sales_trans_price_output
where rank<=2) s on s.valid_period = p.valid_period
)
It might be ugly and need refactoring, but I think this is the logic you're after.
The error is because of this.
where valid_period in (select valid_period, sum(net_sales_usd_spot)
from sales_trans_price_output
where rank<=2)
Because valid_period is a single value, the subquery used with IN can return only one column.
You are on the right track using rank, but you might not be using it correctly. Google oracle rank to find the correct syntax.
Back to what you are looking to achieve, a derived table is the approach I would use. That's simply a subquery with an alias. Or, if you use the keyword with, it is called a CTE (Common Table Expression).
Try it
SELECT * FROM (
SELECT T.*,
RANK () OVER (PARTITION BY CUSTOMER_ID
ORDER BY VALID_PERIOD DESC) FN_RANK
FROM SALES_TRANS_PRICE_OUTPUT T
) A
WHERE A.FN_RANK <= 2
ORDER BY CUSTOMER_ID ASC, VALID_PERIOD DESC, FN_RANK DESC

SQL Query for avoiding any repetition for a specific column terms

I am looking to design a query in which I need DISTINCT terms in a column without repetition. I am using the SQL Server 2008 R2 edition.
Here is my sample table:
id bank_code bank_name interest_rate
----------------------------------------------------------
1 123 abc 3.5
2 456 xyz 3.7
3 123 abc 3.4
4 789 pqr 3.3
5 123 abc 3.6
6 456 xyz 3.1
What I want is to sort the table in descending order on the 'interest_rate' column, but without any repetition of the values in 'bank_code'.
Here is what I want:
id bank_code bank_name interest_rate
----------------------------------------------------------
2 456 xyz 3.7
5 123 abc 3.6
4 789 pqr 3.3
I have been trying the DISTINCT operator but it selects the unique combination of all the columns and not the single column for repetition.
Here is what I am doing, which clearly does not get me what I want:
SELECT DISTINCT TOP 5 [ID], [BANK_CODE]
,[BANK_NAME]
,[INTEREST_RATE]
FROM [SAMPLE]
ORDER BY [INTEREST_RATE] DESC
Is there a way to achieve this?
Any help is appreciated.
;WITH x AS
(
SELECT id,bank_code,bank_name,interest_rate,
rn = ROW_NUMBER() OVER (PARTITION BY bank_code ORDER BY interest_rate DESC)
FROM dbo.[SAMPLE]
)
SELECT id,bank_code,bank_name,interest_rate
FROM x WHERE rn = 1
ORDER BY interest_rate DESC;
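The ROW_NUMBER approach above can be tried on the sample table with a short sketch using Python's built-in sqlite3 module (SQLite 3.25 or later). The table name `sample` mirrors the question. The SQL Server-specific `rn = ...` alias form is replaced by the portable `AS rn`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample (id INTEGER, bank_code INTEGER, bank_name TEXT, interest_rate REAL)"
)
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?, ?)",
    [
        (1, 123, "abc", 3.5), (2, 456, "xyz", 3.7), (3, 123, "abc", 3.4),
        (4, 789, "pqr", 3.3), (5, 123, "abc", 3.6), (6, 456, "xyz", 3.1),
    ],
)

# rn = 1 marks the highest-rate row within each bank_code.
rows = conn.execute(
    """
    WITH x AS (
        SELECT id, bank_code, bank_name, interest_rate,
               ROW_NUMBER() OVER (
                   PARTITION BY bank_code ORDER BY interest_rate DESC
               ) AS rn
        FROM sample
    )
    SELECT id, bank_code, bank_name, interest_rate
    FROM x
    WHERE rn = 1
    ORDER BY interest_rate DESC
    """
).fetchall()

print(rows)
# [(2, 456, 'xyz', 3.7), (5, 123, 'abc', 3.6), (4, 789, 'pqr', 3.3)]
```

Unlike the MIN(id) / MAX(interest_rate) grouping, this keeps the id that actually belongs to the highest-rate row.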
Try using analytical functions:
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY bank_code ORDER BY interest_rate DESC) Corr
FROM [Sample]
)
SELECT id, bank_code, bank_name, interest_rate
FROM CTE
WHERE Corr = 1
ORDER BY interest_rate DESC
Not sure about the [] syntax, but you probably need something like this:
SELECT min([ID]), [BANK_CODE], [BANK_NAME], max([INTEREST_RATE])
FROM [SAMPLE]
GROUP BY [BANK_CODE], [BANK_NAME]
ORDER BY 4 DESC
How about something like this. It is simple, but will duplicate if you have interest rates that are the same.
select ID, #sample.Bank_code, bank_name, #sample.interest_Rate
from #sample
join
(
SELECT [BANK_CODE], MAX(interest_rate) as interest_Rate
FROM #sample
GROUP BY bank_code
) as groupingtable
on groupingtable.bank_code = #sample.bank_code
and groupingtable.interest_Rate = #sample.interest_rate