How to select distinct rows with a specified condition - sql

Suppose there is a table
_ _
a 1
a 2
b 2
c 3
c 4
c 1
d 2
e 5
e 6
How can I select distinct minimum value of all the rows of each group?
So the expected result here is:
_ _
a 1
b 2
c 1
d 2
e 5
EDIT
My actual table contains more columns and I want to select them all. The rows differ only in the last column (the second one in the example). I'm new to SQL and possibly my question is ill-formed in it initial view.
The actual schema is:
| day | currency ('EUR', 'USD') | diff (integer) | id (foreign key) |
The are duplicate pairs (day, currency) that differ by (diff, id). I want to see a table with uniquer pairs (day, currency) with a minimum diff from the original table.
Thanks!

in your case it's as simple as this:
select column1, min(column2) as column2
from table
group by column1
for more than two columns I can suggest this:
select top 1 with ties
t.column1, t.column2, t.column3
from table as t
order by row_number() over (partition by t.column1 order by t.column2)
take a look at this post https://stackoverflow.com/a/13652861/1744834

You can use the ranking function ROW_NUMBER() to do this with a CTE. Especially, if there are more column other than these two column, it will give the distict values like so:
;WITH RankedCTE
AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY column1 ORDER BY Colmn2 ) rownum
FROM Table
)
SELECT column1, column2
FROM RankedCTE
WHERE rownum = 1;
This will give you:
COLUMN1 COLUMN2
a 1
b 2
c 1
d 2
e 5
SQL Fiddle Demo

SELECT ColOne, Min(ColTwo)
FROM Table
GROUP BY ColOne
ORDER BY ColOne
PS: not front of a,machine, but give above a try please.

select MIN(col2),col1
from dbo.Table_1
group by col1

Related

How to display records in SQL Server that have same value in Column 2 but value in Column 1 should not be present in any other record?

I have a below table like this -
Policy Column1 Column2
A 4 100
B 4 100
C 3 100
D 3 100
E 2 100
F 5 100
The Output should be
Policy Column1 Column 2
E 2 100
F 5 100
Can someone please guide me.
Assuming you're after rows where the value in column1 is unique (i.e. no other row has that same value in column 1)
select max(Policy) Policy -- if we're just getting 1 row, the max is the only value
, Column1
, max(Column2) Column2 -- as above
from myTable
group by Column1 -- group by column 1 then
having count(1) = 1 -- count how many rows are in that group; if it's unique we get 1
Select * from policies
where Column1 in (select Column1
from policies
group by Column1
having count(*)=1);
DBFiddle demo is here
You want to divide the data into chunk via partition by. Then you can determine which of them has only a single row.
with data as (
select *, count(*) over (partition by Column1, Column2) as cnt
from T
)
select Policy, Column1, Column2 from data where cnt = 1;
This kind of query was harder to write twenty years ago. If you're learning SQL I'd encourage you to get a handle on some of the foundational concepts before diving into table expressions and analytic functions.

SQL Joining two tables and removing the duplicates from the two tables but without loosing any duplicates from the tables itslef

I want to join two tables and remove duplicates from both the tables but keeping any duplicate value found in the first table.
T1
Name
-----
A
A
B
C
T2
Name
----
A
D
E
Expected result
A - > FROM T1
A - > FROM T1
B
C
D
E
I tried union but removes all duplicates of 'A' from both tables.
How can I achieve this?
Filter T2 before UNION ALL
select col
from T1
union all
select col
from T2
where not exists (select 1 from T1 where T1.col = T2.col)
Assuming you want the number of duplicates from the table with the most repetitions for each value, you can do it with the ROW_NUMBER() windowing function, to eliminate duplicates by their sequence with the set of repetitions in each table.
SELECT Name FROM (
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
UNION
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
) x
ORDER BY Name
To see how this works out, we add two B rows to T2 then do this:
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
Name Row
A 1
A 2
B 1
C 1
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
Name Row
A 1
B 1
B 2
D 1
E 1
Now UNION them without ALL to combine and eliminate duplicates:
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T1
UNION
SELECT Name, ROW_NUMBER() OVER ( PARTITION BY Name ORDER BY Name ) AS Row
FROM T2
Name Row
A 1
A 2
B 1
B 2
C 1
D 1
E 1
The final query up top is then just eliminating the Row column and sorting the result, to ensure ascending order.
See SQL Fiddle for demo.
select * from T1
union all
select * from T2 where name not in (select distinct name from T1)
Sql Fiddle Demo
you should use "union all" instead of "union".
"union" remove other duplicated records while "union all" gives all of them.
for you result,because of we filtered intersects from table 2 in "where",we don't need "UNION ALL"
select col1 from t1
union
select col1 from t2 where t2.col1 not in(select t1.col1 from t1)
I D'not know the following code is good practice or not But it's working
select name from T1
UNION
select name from T2 Where name not in (select name from T1)
The Above Query Filter the value based on T1 value and then join two tables values and show the result.
I hope it's helps you thanks.
Note : It's not better way to get result it's affect your performance.
I sure i update the better solution after my research
You want all names from T1 and all names from T2 except the names that are in T1.
So you can use UNION ALL for the 2 cases and the operator EXCEPT to filter the rows of T2:
SELECT Name FROM T1
UNION ALL
(
SELECT Name FROM T2
EXCEPT
SELECT Name FROM T1
)
See the demo.
Results:
> | Name |
> | :--- |
> | A |
> | A |
> | B |
> | C |
> | D |
> | E |

Recursive Lag Column Calculation in SQL

I am trying to write a procedure that inserts calculated table data into another table.
The problem I have is that I need each row's calculated column to be influenced by the result of the previous row's calculated column. I tried to lag the calculation itself but this does not work!
Such as:
(Max is a function I created that returns the highest of two values)
Id Product Model Column1 Column2
1 A 1 5 =MAX(Column1*2, Lag(Column2))
2 A 2 2 =MAX(Column1*2, Lag(Column2))
3 B 1 3 =MAX(Column1*2, Lag(Column2))
If I try the above in SQL:
SELECT
Column1,
MyMAX(Column1,LAG(Column2, 1, 0) OVER (PARTITION BY Product ORDER BY Model ASC) As Column2
FROM Source
...it says column2 is unknown.
Output I get if I LAG the Column2 calculation:
Select Column1, MyMAX(Column1,LAG(Column1*2, 1, 0) OVER (PARTITION BY Product ORDER BY Model ASC) As Column2
Id Column1 Column2
1 5 10
2 2 10
3 3 6
Why 6 on row 3? Because 3*2 > 2*2.
Output that I want:
Id Column1 Column2
1 5 10
2 2 10
3 3 10
Why 10 on row 3? Because previous result of 10 > 3*2
The problem is I can't lag the result of Column2 - I can only lag other columns or calculations of them!
Is there a technique of achieving this with LAG or must I use Recursive CTE? I read that LAG succeeds CTE so I assumed it would be possible. If not, what would this 'CTE' look like?
Edit: Or alternatively - what else could I do to resolve this calculation?
Edit
In hindsight, this problem is a running partitioned maximum over Column1 * 2. It can be done as simply as
SELECT Id, Column1, Model, Product,
MAX(Column1 * 2) OVER (Partition BY Model, Product Order BY ID ASC) AS Column2
FROM Table1;
Fiddle
Original Answer
Here's a way to do this with a recursive CTE, without LAG at all, by joining on incrementing row numbers. I haven't assumed that your Id is contiguous, hence have added an additional ROW_NUMBER(). You haven't mentioned any partitioning, so haven't applied same. The query simply starts at the first row, and then projects the greater of the current Column1 * 2, or the preceding Column2
WITH IncrementingRowNums AS
(
SELECT Id, Column1, Column1 * 2 AS Column2,
ROW_NUMBER() OVER (Order BY ID ASC) AS RowNum
FROM Table1
),
lagged AS
(
SELECT Id, Column1, Column2, RowNum
FROM IncrementingRowNums
WHERE RowNum = 1
UNION ALL
SELECT i.Id, i.Column1,
CASE WHEN (i.Column2 > l.Column2)
THEN i.Column2
ELSE l.Column2
END,
i.RowNum
FROM IncrementingRowNums i
INNER JOIN lagged l
ON i.RowNum = l.RowNum + 1
)
SELECT Id, Column1, Column2
FROM lagged;
SqlFiddle here
Edit, Re Partitions
Partitioning is much the same, by just dragging the Model + Product columns through, then partitioning by these in the row numbering (i.e. starting back at 1 each time the Product or Model resets), including these in the CTE JOIN condition and also in the final ordering.
WITH IncrementingRowNums AS
(
SELECT Id, Column1, Column1 * 2 AS Column2, Model, Product,
ROW_NUMBER() OVER (Partition BY Model, Product Order BY ID ASC) AS RowNum
FROM Table1
),
lagged AS
(
SELECT Id, Column1, Column2, Model, Product, RowNum
FROM IncrementingRowNums
WHERE RowNum = 1
UNION ALL
SELECT i.Id, i.Column1,
CASE WHEN (i.Column2 > l.Column2)
THEN i.Column2
ELSE l.Column2
END,
i.Model, i.Product,
i.RowNum
FROM IncrementingRowNums i
INNER JOIN lagged l
ON i.RowNum = l.RowNum + 1
AND i.Model = l.Model AND i.Product = l.Product
)
SELECT Id, Column1, Column2, Model, Product
FROM lagged
ORDER BY Model, Product, Id;
Updated Fiddle

Omit duplicate rows then pick the surviving row base on certain criteria

So I have the following data in my table
id id2 flag
1 11 0 <- this row should not be part of the result
1 12 1 <- this row should survive the distinct operation
2 13 0
3 14 0
I want my result to be
id id2 flag
1 12 1
2 13 0
3 14 0
How would I construct a query like such?
Thanks
EDIT1: Sorry, using two column dummy data doesn't correctly reflect the problem I am facing. I added another column, which complicates the problem. As you can see I can't group on id2 because they are all unique. But the row with id2 = 11 should be omitted from the result.
EDIT2: Changed the question to use 'omit' instead of 'remove'
EDIT3:
select id, id2, max(flag)
from table
group by id, id2
This query returns all 4 rows because group by id2 includes all 4 rows.
When you want to apply additional criteria to the data, you typically use GROUP BY instead of DISTINCT. For example, if you would like to keep flag of 1 if it exists, or keep zero otherwise, you can do this:
SELECT id, MAX(flag) as flag -- Since 1 > 0, MAX() works fine
FROM myTable
GROUP BY id -- This keeps only distinct ids
EDIT : (in response to edits #2&3)
Another solution would be using NOT EXISTS in a subquery, like this:
SELECT id, id2, flag
FROM myTable o
WHERE NOT EXISTS (
SELECT * FROM myTable i WHERE o.id=i.id AND i.flag > o.flag
)
;with CTE as
(
select
row_number() over (partition by id order by flag desc) as rn,
id,
id2,
flag
from myTable
)
SELECT * from CTE where rn = 1

get subset of a table in SQL

I want to get a subset of a table, here's the example:
1 A
2 A
3 B
4 B
5 C
6 D
7 D
8 D
I want to get the unique record, but with the smallest id:
1 A
3 B
5 C
6 D
How can I write the SQL in SQL Server? Thanks!
Use a common-table expression like this:
;WITH DataCTE AS
(
SELECT ID, OtherCol,
ROW_NUM() OVER(PARTITION BY OtherCol ORDER BY ID) 'RowNum'
FROM dbo.YourTable
)
SELECT *
FROM DataCTE
WHERE RowNum = 1
This "partitions" your data by the second column you have (A, B, C) and orders by the ID (1, 2, 3) - smallest ID first.
Therefore, for each "partition" (i.e. each value of your second column), the entry with RowNum = 1 is the one with the smallest ID for each value of the second column.
select min(id), othercol
from thetable
group by othercol
and maybe with
order by othercol
... at the end if thats important
Try this:
SELECT MIN(Id) AS Id, Name
FROM MyTable
GROUP BY Name
select min(id), column2
from table
group by column2
It helps if you provide the table information in the question - I've just guessed at the column names...