Select all data with no duplicate data

I have some data in database:
Name | Country | Status
Mary | USA | Pending
Jane | Japan | Pending
Jane | Korea | Pending
Adrian | China | Pending
Peter | Singapore | Pending
Jack | Malaysia | Pending
Adrian | China | Updated
Jane | Japan | Updated
How can I write a SELECT query that returns all the data with no duplicates? (If duplicate rows exist, keep only the one whose Status is Updated.)

Try:
SELECT Name, Country, MAX(Status) as Status FROM (
SELECT TOP 100 PERCENT *
FROM NameCountry
ORDER BY Name ASC, Country ASC, Status DESC
) G
GROUP BY G.Name, G.Country
ORDER BY G.Name, G.Country

From your comment, you seem to mean only data where the first two columns are duplicated. The easiest way, I think, is to use row_number(), which is available in most databases:
select name, country, status
from (select t.*,
             -- 'Updated' sorts first, so it gets seqnum = 1 when both statuses exist
             row_number() over (partition by name, country
                                order by (case when status = 'Updated' then 0 else 1 end)
                               ) as seqnum
      from t
     ) t
where seqnum = 1
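
The row_number() idea can be tried end to end with a small sketch, written so that 'Updated' rows sort first within each (name, country) group. It runs the query through Python's built-in sqlite3 module against an in-memory database, which is only an assumption made to keep the example runnable (the question does not name a database), and it needs SQLite 3.25 or later for window functions. The table name t is the placeholder from the answer.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient test bed (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, country TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("Mary", "USA", "Pending"),
        ("Jane", "Japan", "Pending"),
        ("Jane", "Korea", "Pending"),
        ("Adrian", "China", "Pending"),
        ("Peter", "Singapore", "Pending"),
        ("Jack", "Malaysia", "Pending"),
        ("Adrian", "China", "Updated"),
        ("Jane", "Japan", "Updated"),
    ],
)

# Number the rows inside each (name, country) group so that 'Updated'
# comes first, then keep only the first row of every group.
dedup_sql = """
SELECT name, country, status
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY name, country
               ORDER BY CASE WHEN status = 'Updated' THEN 0 ELSE 1 END
           ) AS seqnum
    FROM t
) AS numbered
WHERE seqnum = 1
ORDER BY name, country
"""
dedup_rows = conn.execute(dedup_sql).fetchall()
for row in dedup_rows:
    print(row)
```

Each (name, country) pair comes back once, and the pairs that had both statuses (Adrian/China and Jane/Japan) keep the Updated row.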

Related

Select multiple distinct rows

I have a table with following data.
id country serial other_column
1 us 123 1
2 us 456 1
3 gb 123 1
4 gb 456 1
5 jp 777 1
6 jp 888 1
7 us 123 2
8 us 456 3
9 gb 456 4
10 us 123 1
11 us 123 1
Is there a way to fetch 2 rows per unique country and unique serial?
For example, expecting following results from my query.
us,123,1 appears twice because there were 3 rows of the same kind and I want 2 rows per unique country and unique serial.
us,123,1
us,123,1
us,456,1
gb,123,1
gb,456,1
jp,777,1
jp,888,1
I can't use:
select distinct country, serial from my_table;
since I want 2 rows per distinct combination of country and serial. Please advise.

Assign DENSE_RANK and ROW_NUMBER to your data set using a CTE or subquery then return rows with a ROW_NUMBER less than 3 and a DENSE_RANK equal to 1. Also, since you did not specify the ORDERING, I've added a custom ORDER BY to handle your Country sorting to match your desired output above.
SELECT
ID,
Country,
Serial,
other_column
FROM
(SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Country, Serial ORDER BY Country, Serial, other_column) AS RN,
DENSE_RANK() OVER (PARTITION BY Country, Serial ORDER BY Country, Serial, other_column) AS DR
FROM my_table) A
WHERE RN < 3 AND DR = 1
ORDER BY CASE WHEN Country = 'us' THEN 1
WHEN Country = 'gb' THEN 2
WHEN Country = 'jp' THEN 3
ELSE 4
END ASC, Country, Serial, other_column, ID
Result:
| ID | Country | Serial | other_column |
|----|---------|--------|---------------|
| 1 | us | 123 | 1 |
| 10 | us | 123 | 1 |
| 2 | us | 456 | 1 |
| 3 | gb | 123 | 1 |
| 4 | gb | 456 | 1 |
| 5 | jp | 777 | 1 |
| 6 | jp | 888 | 1 |
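
As a rough check of the ROW_NUMBER plus DENSE_RANK filter, here is a minimal sketch that loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made only so the example runs anywhere, and window functions need SQLite 3.25 or later. The sketch also adds id as a tie-breaker inside ROW_NUMBER (not in the original answer) so that the two rows picked for us,123 are deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, country TEXT, serial INTEGER, other_column INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "us", 123, 1), (2, "us", 456, 1), (3, "gb", 123, 1),
        (4, "gb", 456, 1), (5, "jp", 777, 1), (6, "jp", 888, 1),
        (7, "us", 123, 2), (8, "us", 456, 3), (9, "gb", 456, 4),
        (10, "us", 123, 1), (11, "us", 123, 1),
    ],
)

# dr = 1 keeps only the rows with the lowest other_column in each
# (country, serial) group; rn < 3 caps each group at two rows.
two_per_group_sql = """
SELECT id, country, serial, other_column
FROM (
    SELECT m.*,
           ROW_NUMBER() OVER (PARTITION BY country, serial
                              ORDER BY other_column, id) AS rn,
           DENSE_RANK() OVER (PARTITION BY country, serial
                              ORDER BY other_column) AS dr
    FROM my_table AS m
) AS ranked
WHERE rn < 3 AND dr = 1
ORDER BY id
"""
two_per_group = conn.execute(two_per_group_sql).fetchall()
for row in two_per_group:
    print(row)
```

The output is ordered by id here rather than by the custom country order, purely to keep the sketch short.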

Since you didn't specify which rows should be picked based on the other_column value, I am assuming it doesn't matter to you.
Having said that, my code will always pick rows per unique country and serial in ascending order of other_column. For example, if you have 3 rows:
us, 123, 1
us, 123, 1
us, 123, 2
it will pick the first two, since other_column is ordered ASC. If you want it the other way around, change the ORDER BY inside the window's OVER clause to DESC.
If there are fewer than 3 rows for a country and serial, it will pick just 1 row.
For example:
us, 456, 1
us, 456, 1
us, 123, 1
us, 123, 2
us, 123, 3
would result in:
us, 456,1
us, 123,1
us, 123,2
with main as (
select
*,
count(*) over(partition by country, serial) as total_occurence,
row_number() over(partition by country, serial order by other_column) as rank_
from <table_name>
),
conditions as (
select *,
case when total_occurence < 3 and rank_ = 1 then true
when total_occurence >=3 and rank_ in (1,2) then true else
false end as is_relevant
from main
)
select * from conditions where is_relevant
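
The count-based rule above (one row when a group has fewer than 3 rows, two rows otherwise) can be checked with a small sketch. It assumes an in-memory SQLite database driven from Python's sqlite3 module (SQLite 3.25 or later for window functions), and the table name sample is hypothetical, standing in for the <table_name> placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'sample' is a hypothetical table name used only for this sketch.
conn.execute("CREATE TABLE sample (country TEXT, serial INTEGER, other_column INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?)",
    [("us", 456, 1), ("us", 456, 1), ("us", 123, 1), ("us", 123, 2), ("us", 123, 3)],
)

# Count the rows per (country, serial) group and number them by other_column,
# then keep 1 row for groups smaller than 3 and 2 rows for the rest.
pick_sql = """
WITH main AS (
    SELECT s.*,
           COUNT(*) OVER (PARTITION BY country, serial) AS total_occurrence,
           ROW_NUMBER() OVER (PARTITION BY country, serial
                              ORDER BY other_column) AS rank_
    FROM sample AS s
)
SELECT country, serial, other_column
FROM main
WHERE (total_occurrence < 3 AND rank_ = 1)
   OR (total_occurrence >= 3 AND rank_ IN (1, 2))
ORDER BY serial DESC, other_column
"""
picked = conn.execute(pick_sql).fetchall()
for row in picked:
    print(row)
```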

Row Number by Certain Column in SQL

I have a table that contains customer transactions. It looks like this:
The data is sorted by Total Transaction. I want to create a column that numbers the rows by City. For example, the first row's City is London, so its value is 1; the second row is also from London, so its value is also 1. When the next row is not London, the value is 2. So it looks like this:
Is there a way to create that row number in SQL Server?

You can try using dense_rank()
select *,dense_rank() over(order by city) as cityNumber
from tablename
order by total_transaction desc

I believe the question is valid and, as I understand the requirement, you need two levels of query to get to the final result.
Here I have used max because the cities first have to be ordered by their highest Total Transaction; dense_rank can then assign the city number using that max value and the city.
select t.city as "City"
,dense_rank() over (order by max_total_per_city desc,city) as "City Number"
,t.customer as "Customer"
,t.total_transaction as "Total Transaction"
from
(
select *
,max(total_transaction) over (partition by city) as max_total_per_city
from tableName t
) t
order by total_transaction desc

You can get the CityNumbers with ROW_NUMBER() window function:
select City, row_number() over (order by max(TotalTransaction) desc) CityNumber
from tablename
group by City
so you can join the above query to the table:
select t.City, c.CityNumber, t.Customer, t.Totaltransaction
from tablename t inner join (
select City, row_number() over (order by max(TotalTransaction) desc) CityNumber
from tablename
group by City
) c on c.City = t.City
order by t.TotalTransaction desc
Or with DENSE_RANK() window function:
select t.City,
dense_rank() over (order by (select max(TotalTransaction) from tablename where City = t.City) desc) as cityNumber,
t.Customer,
t.TotalTransaction
from tablename t
order by t.TotalTransaction desc
Results:
> City | CityNumber | Customer | Totaltransaction
> :--------- | ---------: | :------- | ---------------:
> London | 1 | Michael | 250
> London | 1 | Edward | 180
> Paris | 2 | Michael | 160
> Madrid | 3 | Luis | 153
> London | 1 | Serena | 146
> Madrid | 3 | Lionel | 133
> Manchester | 4 | Frank | 96
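
To see the numbering reproduce the table above, here is a minimal sketch of the max-per-city plus dense_rank() variant. It assumes an in-memory SQLite database run from Python's sqlite3 module (SQLite 3.25 or later), and the snake_case column names are an assumption, since the question's table is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumed for this sketch; the original table is not shown.
conn.execute("CREATE TABLE tablename (city TEXT, customer TEXT, total_transaction INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("London", "Michael", 250), ("London", "Edward", 180),
        ("Paris", "Michael", 160), ("Madrid", "Luis", 153),
        ("London", "Serena", 146), ("Madrid", "Lionel", 133),
        ("Manchester", "Frank", 96),
    ],
)

# Attach each city's highest transaction to every row, then rank the cities
# by that value so all rows of one city share the same number.
city_sql = """
SELECT t.city,
       DENSE_RANK() OVER (ORDER BY t.max_total_per_city DESC, t.city) AS city_number,
       t.customer,
       t.total_transaction
FROM (
    SELECT src.*,
           MAX(total_transaction) OVER (PARTITION BY city) AS max_total_per_city
    FROM tablename AS src
) AS t
ORDER BY t.total_transaction DESC
"""
city_rows = conn.execute(city_sql).fetchall()
for row in city_rows:
    print(row)
```

Ranking by the per-city maximum, rather than by city name, is what makes London number 1 and Paris number 2 here.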

There is a table having country and city columns as shown in the below table input. I need the output as mentioned below

I need a SQL query to get the desired output from the input table

You can do this with a UNION query, first selecting the distinct country names, and then each of the cities for that country. The output is then ordered by the country; whether the value is a country or a city; and then by the value:
SELECT DISTINCT country AS data, country, 1 AS ctry
FROM cities
UNION ALL
SELECT city, country, 0
FROM cities
ORDER BY country, ctry DESC, data
Output:
data country ctry
India India 1
BNG India 0
CHN India 0
HYD India 0
Sweden Sweden 1
GOTH Sweden 0
STOCK Sweden 0
VAXO Sweden 0

It looks like you want to interleave the records, with each country followed by its related cities.
The actual solution heavily depends on your database, so let me assume that yours supports window functions, the row constructor values() and lateral joins (SQL Server and Postgres are two candidates).
In SQL Server, you could do:
select distinct rn, idx, val
from (
select t.*, dense_rank() over(order by country) rn
from mytable t
) t
cross apply (values (t.country, 1), (t.city, 2)) as v(val, idx)
order by rn, idx, val
Demo on DB Fiddle:
rn | idx | val
-: | --: | :-----
1 | 1 | INDIA
1 | 2 | BNG
1 | 2 | CHN
1 | 2 | HYD
2 | 1 | SWEDEN
2 | 2 | STOCK
2 | 2 | VAXO
In Postgres, you would just replace cross apply with cross join lateral.

A running summary of totals in SQL Server

I've come up against an issue where I want to summarize results in a query.
Example as follows:
NAME | FRUIT | PRICE
-----+-------+------
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
DAVE | GRAPE | 3
DAVE | GRAPE | 3
DAVE | GRAPE | 3
This is my table at the moment. What I need, though, is a summary of each person's business, like below:
NAME | FRUIT | PRICE
-----+-------+------
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | TOTAL | 8
DAVE | GRAPE | 3
DAVE | GRAPE | 3
DAVE | GRAPE | 3
DAVE | TOTAL | 9
I have tried grouping the information, but the result does not reflect what I want. Also, if John had different fruit, each fruit would need to be summed up before moving on to the next part, and there needs to be a total for every value in the NAME field, as there will be a number of customers.
Any advice would be great
EDIT
I have tried using ROLLUP, but I keep getting totals of all values in a separate column, whereas I would like to see them formatted as shown above.

A solution with UNION and GROUP BY.
;WITH PricesWithTotals AS
(
SELECT
Name,
Fruit,
Price
FROM
YourTable
UNION ALL
SELECT
Name,
Fruit = 'TOTAL',
Price = SUM(Price)
FROM
YourTable
GROUP BY
Name
)
SELECT
Name,
Fruit,
Price
FROM
PricesWithTotals
ORDER BY
Name,
CASE WHEN Fruit <> 'TOTAL' THEN 1 ELSE 999 END ASC, -- same case as the label above, so it also works under a case-sensitive collation
Fruit
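
To check that the TOTAL rows land directly after each customer's detail rows, here is a minimal sketch of the same UNION ALL plus GROUP BY idea. It assumes an in-memory SQLite database run from Python's sqlite3 module, so the SQL Server style `Fruit = 'TOTAL'` aliases are written as `'TOTAL' AS Fruit`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Name TEXT, Fruit TEXT, Price INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [("JOHN", "APPLE", 2)] * 4 + [("DAVE", "GRAPE", 3)] * 3,
)

# Detail rows plus one synthetic 'TOTAL' row per customer; the CASE in the
# ORDER BY pushes the total below that customer's detail rows.
totals_sql = """
WITH PricesWithTotals AS (
    SELECT Name, Fruit, Price
    FROM YourTable
    UNION ALL
    SELECT Name, 'TOTAL' AS Fruit, SUM(Price) AS Price
    FROM YourTable
    GROUP BY Name
)
SELECT Name, Fruit, Price
FROM PricesWithTotals
ORDER BY Name,
         CASE WHEN Fruit <> 'TOTAL' THEN 1 ELSE 999 END,
         Fruit
"""
with_totals = conn.execute(totals_sql).fetchall()
for row in with_totals:
    print(row)
```

Because the outer query sorts by Name, DAVE's block comes before JOHN's, unlike the order shown in the question.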

This will get you a running total per customer per fruit:
create table #Sales([Name] varchar(20), Fruit varchar(20), Price int)
insert into #Sales([Name], Fruit, Price)
values
('JOHN','APPLE',2),
('JOHN','APPLE',2),
('JOHN','APPLE',2),
('JOHN','APPLE',2),
('DAVE','GRAPE',3),
('DAVE','GRAPE',3),
('DAVE','GRAPE',3)
Select c.*
, SUM(Price) OVER (PARTITION BY c.[Name], c.[Fruit] ORDER BY c.[Name], c.[Fruit] rows between unbounded preceding and current ROW ) as RunningTotal
from #Sales c
order by c.[Name], c.[Fruit] asc
drop table #Sales

The solution to your problem is GROUPING SETS. However, your rows are not unique, so this adds a unique value to each row, just so you can keep your original rows:
with numbered as (
      -- seqnum makes every original row distinct so grouping does not collapse them
      select t.*, row_number() over (order by (select null)) as seqnum
      from t
     )
select name,
       coalesce(fruit, 'Total') as fruit,
       sum(price) as price
from numbered
group by grouping sets ( (name, fruit, seqnum), (name) )
order by name,
         (case when fruit is not null then 1 else 2 end);

Sql to select Max event on two grouped by one column

I have the following table :
+----------+-----------+
| country  | event     |
+----------+-----------+
| usa      | running   |
| usa      | running   |
| usa      | running   |
| canada   | running   |
| Canada   | running   |
| usa      | javline   |
| canada   | javline   |
| canada   | javline   |
| canada   | javline   |
+----------+-----------+
I want to get the following out by sql query:
USA | Running | 3
Canada | Javline | 3
I tried using the following query on MS SQL Server:
select country, case when c > 1 then null else event end event
from (select country, [ModelName], recs, count(*) over (partition by event, recs ) c,
row_number() over (partition by country order by recs desc) rn
from (select country, event, count(*) recs
from table
group by country, event) )
where rn = 1
order by 1
But I get an error :
Msg 102, Level 15, State 1, Line 12
Incorrect syntax near ')'.
Any pointers to a correct solution are appreciated. Thanks.

You need to put an alias on your subquery:
select
country,
case when c > 1 then null else event end event
from (
select -- No event here
country,
[ModelName],
recs,
count(*) over (partition by event, recs ) c,
row_number() over (partition by country order by recs desc) rn
from (
select country, event, count(*) recs -- No ModelName here
from [table]
group by country, event
) x -- You need to put an alias here
)t -- and here
where rn = 1
order by 1
Note that the above query will still produce errors:
Invalid column name 'ModelName'.
Invalid column name 'event'.
This is because ModelName is not selected in your innermost subquery, and event is not selected by the middle subquery, so the outermost query cannot see it.
Based on your sample data, you can use this query to achieve the desired result:
;WITH Cte AS(
SELECT Country, Event, COUNT(*) AS CC
FROM [Table]
GROUP BY Country, Event
)
,CteRowNumber AS(
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY Country ORDER BY CC DESC)
FROM Cte
)
SELECT Country, Event, CC
FROM CteRowNumber
WHERE RN = 1
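
The two-CTE query can be verified with a short sketch. It assumes an in-memory SQLite database driven from Python's sqlite3 module (SQLite 3.25 or later), a hypothetical table name events, and country values written in lowercase throughout, because SQLite compares text case-sensitively by default whereas SQL Server's default collation does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'events' is a hypothetical table name; countries are lowercased on purpose.
conn.execute("CREATE TABLE events (country TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("usa", "running")] * 3
    + [("canada", "running")] * 2
    + [("usa", "javline")]
    + [("canada", "javline")] * 3,
)

# Count rows per (country, event), rank the events inside each country by
# that count, and keep the top-ranked event for every country.
top_event_sql = """
WITH counted AS (
    SELECT country, event, COUNT(*) AS cc
    FROM events
    GROUP BY country, event
),
ranked AS (
    SELECT counted.*,
           ROW_NUMBER() OVER (PARTITION BY country ORDER BY cc DESC) AS rn
    FROM counted
)
SELECT country, event, cc
FROM ranked
WHERE rn = 1
ORDER BY country
"""
top_events = conn.execute(top_event_sql).fetchall()
for row in top_events:
    print(row)
```

If two events tie for the highest count in a country, ROW_NUMBER keeps an arbitrary one of them; RANK would keep both.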

You could do it using a window function inside a CTE:
-- this counts a number per each country and event
with q as(
select country,event,
row_number() over(partition by country,event order by country) r
from your_table t
)
--this takes only the maximum of them
select *
from q
where r=(select max(r)
from q q2
where q2.country=q.country)
Result:
| country | event | r |
|---------|---------|---|
| canada | javline | 3 |
| usa | running | 3 |
