get subset of a table in SQL - sql

I want to get a subset of a table, here's the example:
1 A
2 A
3 B
4 B
5 C
6 D
7 D
8 D
I want to get the unique record, but with the smallest id:
1 A
3 B
5 C
6 D
How can I write the SQL in SQL Server? Thanks!

Use a common-table expression like this:
;WITH DataCTE AS
(
SELECT ID, OtherCol,
ROW_NUM() OVER(PARTITION BY OtherCol ORDER BY ID) 'RowNum'
FROM dbo.YourTable
)
SELECT *
FROM DataCTE
WHERE RowNum = 1
This "partitions" your data by the second column you have (A, B, C) and orders by the ID (1, 2, 3) - smallest ID first.
Therefore, for each "partition" (i.e. each value of your second column), the entry with RowNum = 1 is the one with the smallest ID for each value of the second column.

select min(id), othercol
from thetable
group by othercol
and maybe with
order by othercol
... at the end if thats important

Try this:
SELECT MIN(Id) AS Id, Name
FROM MyTable
GROUP BY Name

select min(id), column2
from table
group by column2
It helps if you provide the table information in the question - I've just guessed at the column names...

Related

SQL script to identify row based on min value

How to write a SQL statement (in SQL Server) to get a row with minimum value based on two columns?
For example:
Type Rank Val1 val2
------------------------------
A 6 486.57 38847
B 6 430 56345
C 5 390 99120
D 5 329 12390
E 4 350 11109
E 4 320 11870
The SQL statement should return the last row in above table, because it has min value for Rank, and Val1.
Something like this:
select *
from Table1
where rank = (select min(rank) from Table1)
and Val1 = (select min(Val1)
from Table1
where rank = (select min(rank) from Table1))
Or this, if you like a simple life:
select top 1 *
from Table1
order by rank asc, Val1 asc
with cte as (
select *, row_number() over (order by rank, val1) as rn
from dbo.yourTable
)
select *
from cte
where rn = 1;
The idea here is that I'm assigning a 1..n enumeration to the rows based on rank and, in the case of ties, Val1. I return the row that takes the value of 1. If there is the possibility of a tie, use rank() instead of row_number().
I'm assuming that Type is the primary key for your table, and that you only want a row that has both the lowest Val1 and lowest Val2 (so if one row has the lowest Val1, but not the lowest Val2, this returns no data). I'm not sure about these assumptions, but your question could probably be clarified a bit.
Here's the code:
SELECT
*
FROM
Table1
WHERE
Type IN
(
SELECT
Type
FROM
Table1
GROUP BY
Type
HAVING
MIN(Val1) AND MIN(val2)
)

Keep Track of already summed tuples sql

If we have a table with values for a and b, is there a way to only add up the b's if its not a duplicate a? For example
a b
1 2
2 3
2 3
so we would get only 5 (instead of 8)
A sort of
select sum(b if unique a),
from table
where ...
The following query selects the lowest value of b for each group a
select min(b) min_b
from mytable
group by a
You can then sum those values by selecting the sum from a derived table
select sum(min_b) from (
select min(b) min_b
from mytable
group by a
) t
http://sqlfiddle.com/#!9/d82c5/1
You haven't specified your RDBMS, but if you are using a database which supporting window functions like SQL Server, you can query the unique rows first by using WITH clause and ROW_NUMBER() function and then get the SUM out of that.
;WITH C AS(
SELECT a, b,
ROW_NUMBER() OVER (PARTITION BY a ORDER BY a) AS Rn
FROM Table1
)
SELECT SUM(b) FROM C
WHERE Rn = 1
SQL Fiddle

How to create "subsets" as a result from a column in SQL

Let's suppose that I've got as a result from one query the next set of values of one column:
Value
1 A
2 B
3 C
4 D
5 E
6 F
7 G
8 H
9 I
10 J
Now, I would like to see this information with another order, establishing a limit to the number of values of every single subset. Now suppose that I choose 3 as a limit,the information will be given like this (one column for all the subsets):
Values
1 A, B, C
2 D, E, F
3 G, H, I
4 J,
Obviously, the last row will contain the remaining values when their number is smaller than the limit established.
Is it possible to perform a query like this in SQL?
What about if the limit is dynamic?. It can be chosen randomly.
create table dee_t (id int identity(1,1),name varchar(10))
insert into dee_t values ('A'),('B'),('c'),('D'),('E'),('F'),('g'),('H'),('I'),('J')
;with cte as
(
select (id-1)/3 +1 rno ,* from dee_t
) select rno ,
(select name+',' from cte where rno = c.rno for xml path('') )
from cte c group by rno
You can do this by using few calculations with row_number, like this:
select
GRP,
max(case when RN = 1 then Value end),
max(case when RN = 2 then Value end),
max(case when RN = 0 then Value end)
from (
select
row_number() over (order by Value) % 3 as RN,
(row_number() over (order by Value)+2) / 3 as GRP,
Value
from Table1
) X
group by GRP
The first row_number creates numbers for the columns (1,2,0,1,2,0...) and the second one creates numbers for the rows (1,1,1,2...). Those are then used to group the values into correct place using case, but you can also use pivot instead of it if you like it more.
If you want them into same column, of course just concatenate the cases instead of selecting them on different columns, but beware of nulls.
Example in SQL Fiddle
Thanks a lot for all your reply. Finally I've got a Solution with the help of Rajen Singh
This is the code than can be used:
WITH CTE_0 AS
(
SELECT DISTINCT column_A_VALUE AS id
FROM Table
WHERE column_A_VALUE IS NOT NULL
), CTE_1 AS
(
SELECT ROW_NUMBER() OVER (ORDER BY id) RN, id
FROM CTE_0
), CTE_2 AS
(
SELECT RN%30 GROUP, ID
FROM CTE_1
)SELECT STUFF(( SELECT ','''+CAST(ID AS NVARCHAR(20))+''''
FROM CTE_2
WHERE GROUP = A.GROUP
FOR XML PATH('')),1,1,'') IDS
FROM CTE_2 A
GROUP BY GROUP

Merge multiple rows with same ID into one row

How can I merge multiple rows with same ID into one row.
When value in first and second row in the same column is the same or when there is value in first row and NULL in second row.
I don't want to merge when value in first and second row in the same column is different.
I have table:
ID |A |B |C
1 NULL 31 NULL
1 412 NULL 1
2 567 38 4
2 567 NULL NULL
3 2 NULL NULL
3 5 NULL NULL
4 6 1 NULL
4 8 NULL 5
4 NULL NULL 5
I want to get table:
ID |A |B |C
1 412 31 1
2 567 38 4
3 2 NULL NULL
3 5 NULL NULL
4 6 1 NULL
4 8 NULL 5
4 NULL NULL 5
I think there's a simpler solution to the above answers (which is also correct). It basically gets the merged values that can be merged within a CTE, then merges that with the data not able to be merged.
WITH CTE AS (
SELECT
ID,
MAX(A) AS A,
MAX(B) AS B,
MAX(C) AS C
FROM dbo.Records
GROUP BY ID
HAVING MAX(A) = MIN(A)
AND MAX(B) = MIN(B)
AND MAX(C) = MIN(C)
)
SELECT *
FROM CTE
UNION ALL
SELECT *
FROM dbo.Records
WHERE ID NOT IN (SELECT ID FROM CTE)
SQL Fiddle: http://www.sqlfiddle.com/#!6/29407/1/0
WITH Collapsed AS (
SELECT
ID,
A = Min(A),
B = Min(B),
C = Min(C)
FROM
dbo.MyTable
GROUP BY
ID
HAVING
EXISTS (
SELECT Min(A), Min(B), Min(C)
INTERSECT
SELECT Max(A), Max(B), Max(C)
)
)
SELECT
*
FROM
Collapsed
UNION ALL
SELECT
*
FROM
dbo.MyTable T
WHERE
NOT EXISTS (
SELECT *
FROM Collapsed C
WHERE T.ID = C.ID
);
See this working in a SQL Fiddle
This works by creating all the mergeable rows through the use of Min and Max--which should be the same for each column within an ID and which usefully exclude NULLs--then appending to this list all the rows from the table that couldn't be merged. The special trick with EXISTS ... INTERSECT allows for the case when a column has all NULL values for an ID (and thus the Min and Max are NULL and can't equal each other). That is, it functions like Min(A) = Max(A) AND Min(B) = Max(B) AND Min(C) = Max(C) but allows for NULLs to compare as equal.
Here's a slightly different (earlier) solution I gave that may offer different performance characteristics, and being more complicated, I like less, but being a single flowing query (without a UNION) I kind of like more, too.
WITH Collapsible AS (
SELECT
ID
FROM
dbo.MyTable
GROUP BY
ID
HAVING
EXISTS (
SELECT Min(A), Min(B), Min(C)
INTERSECT
SELECT Max(A), Max(B), Max(C)
)
), Calc AS (
SELECT
T.*,
Grp = Coalesce(C.ID, Row_Number() OVER (PARTITION BY T.ID ORDER BY (SELECT 1)))
FROM
dbo.MyTable T
LEFT JOIN Collapsible C
ON T.ID = C.ID
)
SELECT
ID,
A = Min(A),
B = Min(B),
C = Min(C)
FROM
Calc
GROUP BY
ID,
Grp
;
This is also in the above SQL Fiddle.
This uses similar logic as the first query to calculate whether a group should be merged, then uses this to create a grouping key that is either the same for all rows within an ID or is different for all rows within an ID. With a final Min (Max would have worked just as well) the rows that should be merged are merged because they share a grouping key, and the rows that shouldn't be merged are not because they have distinct grouping keys over the ID.
Depending on your data set, indexes, table size, and other performance factors, either of these queries may perform better, though the second query has some work to do to catch up, with two sorts instead of one.
You can try something like this:
select
isnull(t1.A, t2.A) as A,
isnull(t1.B, t2.B) as B,
isnull(t1.C, t2.C) as C
from
table_name t1
join table_name t2 on t1.ID = t2.ID and .....
You mention the concepts of first and second. How do
you define this order? Place that order defining condition
in here: .....
Also, I assume you have exactly 2 rows for each ID value.

How to select distinct rows with a specified condition

Suppose there is a table
_ _
a 1
a 2
b 2
c 3
c 4
c 1
d 2
e 5
e 6
How can I select distinct minimum value of all the rows of each group?
So the expected result here is:
_ _
a 1
b 2
c 1
d 2
e 5
EDIT
My actual table contains more columns and I want to select them all. The rows differ only in the last column (the second one in the example). I'm new to SQL and possibly my question is ill-formed in it initial view.
The actual schema is:
| day | currency ('EUR', 'USD') | diff (integer) | id (foreign key) |
The are duplicate pairs (day, currency) that differ by (diff, id). I want to see a table with uniquer pairs (day, currency) with a minimum diff from the original table.
Thanks!
in your case it's as simple as this:
select column1, min(column2) as column2
from table
group by column1
for more than two columns I can suggest this:
select top 1 with ties
t.column1, t.column2, t.column3
from table as t
order by row_number() over (partition by t.column1 order by t.column2)
take a look at this post https://stackoverflow.com/a/13652861/1744834
You can use the ranking function ROW_NUMBER() to do this with a CTE. Especially, if there are more column other than these two column, it will give the distict values like so:
;WITH RankedCTE
AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY column1 ORDER BY Colmn2 ) rownum
FROM Table
)
SELECT column1, column2
FROM RankedCTE
WHERE rownum = 1;
This will give you:
COLUMN1 COLUMN2
a 1
b 2
c 1
d 2
e 5
SQL Fiddle Demo
SELECT ColOne, Min(ColTwo)
FROM Table
GROUP BY ColOne
ORDER BY ColOne
PS: not front of a,machine, but give above a try please.
select MIN(col2),col1
from dbo.Table_1
group by col1