sql group by only rows which are in sequence - sql

Say I have the following table:
MyTable
---------
| 1 | A |
| 2 | A |
| 3 | A |
| 4 | B |
| 5 | B |
| 6 | B |
| 7 | A |
| 8 | A |
---------
I need the sql query to output the following:
---------
| 3 | A |
| 3 | B |
| 2 | A |
---------
Basically I'm doing a group by but only for rows which are together in the sequence. Any ideas?
Note that the database is on sql server 2008. There is a post on this topic however it uses oracle's lag() function.

This is known as the "islands" problem. Using Itzik Ben Gan's approach:
;WITH YourTable AS
(
SELECT 1 AS N, 'A' AS C UNION ALL
SELECT 2 AS N, 'A' AS C UNION ALL
SELECT 3 AS N, 'A' AS C UNION ALL
SELECT 4 AS N, 'B' AS C UNION ALL
SELECT 5 AS N, 'B' AS C UNION ALL
SELECT 6 AS N, 'B' AS C UNION ALL
SELECT 7 AS N, 'A' AS C UNION ALL
SELECT 8 AS N, 'A' AS C
),
T
AS (SELECT N,
C,
DENSE_RANK() OVER (ORDER BY N) -
DENSE_RANK() OVER (PARTITION BY C ORDER BY N) AS Grp
FROM YourTable)
SELECT COUNT(*),
C
FROM T
GROUP BY C,
Grp
ORDER BY MIN(N)

this will work for you...
SELECT
Total=COUNT(*), C
FROM
(
SELECT
NGroup = ROW_NUMBER() OVER (ORDER BY N) - ROW_NUMBER() OVER (PARTITION BY C ORDER BY N),
N,
C
FROM MyTable
)RegroupedTable
GROUP BY C,NGroup

Just for fun, without any SQL-specific functions and NOT assuming that the ID column is monotonically increasing:
WITH starters(name, minid, maxid) AS (
SELECT
a.name, MIN(a.id), MAX(a.id)
FROM
mytable a RIGHT JOIN
mytable b ON
(a.name <> b.name AND a.id < b.id)
WHERE
a.id IS NOT NULL
GROUP BY
a.name
),
both(name, minid, maxid) AS (
SELECT
name, minid, maxid
FROM
starters
UNION ALL
SELECT
name, MIN(id), MAX(id)
FROM
mytable
WHERE
id > (SELECT MAX(maxid) from starters)
GROUP BY
name
)
SELECT
COUNT(*), m.name, minid
FROM
both INNER JOIN
mytable m ON
id BETWEEN minid AND maxid
GROUP BY
m.name, minid
Result (ignore the midid column):
(No column name) name minid
3 A 1
3 B 4
2 A 7

Related

Getting all the values in one query that aren't in another with a group by

Given that I am using Redshift, how would I get the counts for a query that asks:
Given table A and table B, give me all the count of values in Table A for that grouping that aren't in table B;
So if table A and B look like:
Table A
Id | Value
==========
1 | "A"
1 | "B"
2 | "C"
And table B:
Id | Value
==========
1 | "A"
1 | "D"
2 | "C"
I would want:
Id | Count
==========
1 | 1
2 | 0
You can use left join and group by:
select a.id, sum( (b.id is null)::int )
from a left join
b
on a.id = b.id and a.value = b.value
group by a.id;
Use except and subquery
with a as
(
select 1 as id, 'A' as v
union all
select 1,'B'
union all
select 2,'C'
),b as
(
select 1 as id, 'A' as v
union all
select 1,'D'
union all
select 2,'C'
), c as
(
select id,v from a except select id,v from b
)
select id,sum ( (select count(*) from c where c.id=a.id and c.v=a.v))
from a group by id
output
id cnt
1 1
2 0
online demo which will work in redshift

In Oracle, how to select specific row while aggregating all rows

I have a requirement that I need to both aggregate all rows by id, and find 1 specific row among the rows of the same id. It's like 2 SQL queries, but I want to make it in 1 SQL query. I'm using Oracle database.
for example,table t1 whose data looks like:
id | name | num
----- -------- -------
1 | 'a' | 1
2 | 'b' | 3
2 | 'c' | 6
2 | 'd' | 6
I want to aggregate the data by the id, find the 'name' with the highest 'count', and sum all count of the id to 'total_count'.
There are 2 rows with same num, pick up the first one.
id | highest_num | name_of_highest_num | total_num | avg_num
----- ------------- --------------------- ------------ -------------------
1 | 1 | 'a' | 1 | 1
2 | 6 | 'c' | 15 | 5
Can I get this result by 1 Oracle SQL query?
Thanks in advance for any replies.
Oracle Setup:
CREATE TABLE table_name ( id, name, num ) AS
SELECT 1, 'a', 1 FROM DUAL UNION ALL
SELECT 2, 'b', 3 FROM DUAL UNION ALL
SELECT 2, 'c', 6 FROM DUAL UNION ALL
SELECT 2, 'd', 6 FROM DUAL;
Query:
SELECT id,
MAX( num ) AS highest_num,
MAX( name ) KEEP ( DENSE_RANK LAST ORDER BY num ) AS name_of_highest_num,
SUM( num ) AS total_num,
AVG( num ) AS avg_num
FROM table_name
GROUP BY id
Output:
ID HIGHEST_NUM NAME_OF_HIGHEST_NUM TOTAL_NUM AVG_NUM
-- ----------- ------------------- --------- -------
1 1 a 1 1
2 6 d 15 5
Here's one option using row_number in a subquery with conditional aggregation:
select id,
max(num) as highest_num,
max(case when rn = 1 then name end) as name_of_highest_num,
sum(num) as total_num,
avg(num) as avg_num
from (
select id, name, num,
row_number() over (partition by id order by num desc) rn
from a
) t
group by id
SQL Fiddle Demo
Sounds like you want to use some analytic functions. Something like this should work
select id,
num highest_num,
name name_of_highest_num,
total total_num,
average avg_num
from (select id,
num,
name,
rank() over (partition by id
order by num desc, name asc) rnk,
sum(num) over (partition by id) total,
avg(num) over (partition by id) average
from table t1)
where rnk = 1

SQL - Left Join many-to-many only once

I have a two tables that are setup like the following examples
tablea
ID | Name
1 | val1
1 | val2
1 | val3
2 | other1
3 | other
tableb
ID | Amount
1 | $100
2 | $50
My desired output would be to left join tableb to tablea but only join tableb once on each value. ID is the only relationship
tablea.ID | tablea.Name | tableb.id | tableb.amount
1 | val1 | 1 | $100
1 | val2
1 | val3
2 | other1 | 2 | $50
3 | other
Microsoft SQL
You can do the following:
select ROW_NUMBER() OVER(ORDER BY RowID ASC) as RowNum, ID , Name
from tablea
which gives you :
RowNum | RowID | Name
1 | 1 | val1
2 |1 | val2
3 |1 | val3
4 |2 | other1
5 |3 | other
You then get the minimum row number for each RowID:
Select RowId, min(RowNum)
From (
select ROW_NUMBER() OVER(ORDER BY RowID ASC) as RowNum, ID , Name
from tablea )
Group By RowId
Once you have this you can then join tableb onto tablea only where the RowId is the minimum
WITH cteTableA As (
select ROW_NUMBER() OVER(ORDER BY RowID ASC) as RowNum, ID , Name
from tablea ),
cteTableAMin As (
Select RowId, min(RowNum) as RowNumMin
From cteTableA
Group By RowId
)
Select a.RowID, a.Name, b.Amount
From cteTableA a
Left join cteTableAMin amin on a.RowNum = amin.RowNumMin
and a.ID = amin.RowId
Left join tableb b on amin.ID = b.ID
This can be tidied up... but helps to show whats going on.
Then you MUST specify which row in tableA you wish to join to. If there are more than one row in the other table, How can the query processor know which one you want ?
If you want the one with the lowest value of name, then you might do this:
Select * from tableB b
join tableA a
on a.id = b.Id
and a.name =
(Select min(name) from tableA
where id = b.id)
but even that won't work if there multiple rows with the same values for both id AND name. What you might really need is a Primary Key on tableA.
Use:
select
a.id,
a.name,
b.amount
from
(select
id,
name,
row_number() over (partition by id order by name) as rn
from tablea) a
left join (
select
id,
amount,
row_number() over (partition by id order by amount) as rn
from tableb) b
on a.id = b.id
and a.rn = b.rn
order by a.id, a.name

How to comapre two columns of a table in sql?

In a table there are two columns:
-----------
| A | B |
-----------
| 1 | 5 |
| 2 | 1 |
| 3 | 2 |
| 4 | 1 |
-----------
Want a table where if A=B then
-------------------
|Match | notMatch|
-------------------
| 1 | 5 |
| 2 | 3 |
| Null | 4 |
-------------------
How can i do this?
I tried something which shows the Matched part
select distinct C.A as A from Table c inner join Table d on c.A=d.B
Try this:
;WITH TempTable(A, B) AS(
SELECT 1, 5 UNION ALL
SELECT 2, 1 UNION ALL
SELECT 3, 2 UNION ALL
SELECT 4, 1
)
,CTE(Val) AS(
SELECT A FROM TempTable UNION ALL
SELECT B FROM TempTable
)
,Match AS(
SELECT
Rn = ROW_NUMBER() OVER(ORDER BY Val),
Val
FROM CTE c
GROUP BY Val
HAVING COUNT(Val) > 1
)
,NotMatch AS(
SELECT
Rn = ROW_NUMBER() OVER(ORDER BY Val),
Val
FROM CTE c
GROUP BY Val
HAVING COUNT(Val) = 1
)
SELECT
Match = m.Val,
NotMatch= n.Val
FROM Match m
FULL JOIN NotMatch n
ON n.Rn = m.Rn
Try with EXCEPT, MINUS and INTERSECT Statements.
like this:
SELECT A FROM TABLE1 INTERSECT SELECT B FROM TABLE1;
You might want this:
SELECT DISTINCT
C.A as A
FROM
Table c
LEFT OUTER JOIN
Table d
ON
c.A=d.B
WHERE
d.ID IS NULL
Please Note that I use d.ID as an example because I don't see your schema. An alternate is to explicitly state all d.columns IS NULL in WHERE clause.
Your requirement is kind of - let's call it - interesting. Here is a way to solve it using pivot. Personally I would have chosen a different table structure and another way to select data:
Test data:
DECLARE #t table(A TINYINT, B TINYINT)
INSERT #t values
(1,5),(2,1),
(3,2),(4,1)
Query:
;WITH B AS
(
( SELECT A FROM #t
EXCEPT
SELECT B FROM #t)
UNION ALL
( SELECT B FROM #t
EXCEPT
SELECT A FROM #t)
), A AS
(
SELECT A val
FROM #t
INTERSECT
SELECT B
FROM #t
), combine as
(
SELECT val, 'A' col, row_number() over (order by (select 1)) rn FROM A
UNION ALL
SELECT A, 'B' col, row_number() over (order by (select 1)) rn
FROM B
)
SELECT [A], [B]
FROM combine
PIVOT (MAX(val) FOR [col] IN ([A], [B])) AS pvt
Result:
A B
1 3
2 4
NULL 5

Oracle SQL -- Combining two tables, but taking duplicates from one?

I have these tables:
Table A
Num Letter
1 A
2 B
3 C
Table B
Num Letter
2 C
3 D
4 E
I want to union these two tables, but I only want each number to appear once. If the same number appears in both tables, I want it from Table B instead of table A.
Result
Num Letter
1 A
2 C
3 D
4 E
How could I accomplish this? A union will keep duplicates and an intersect would only catch the same rows -- I consider a row a duplicate when it has the same number, regardless of the letter.
Try this: http://www.sqlfiddle.com/#!4/0b796/1
with a as
(
select Num, 'A' as src, Letter
from tblA
union
select Num, 'B' as src, Letter
from tblB
)
select
Num
,case when count(*) > 1 then
min(case when src = 'B' then Letter end)
else
min(Letter)
end as Letter
from a
group by Num
order by Num;
Output:
| NUM | LETTER |
----------------
| 1 | A |
| 2 | C |
| 3 | D |
| 4 | E |
And another one:
SELECT COALESCE(b.num, a.num) num, COALESCE(b.letter, a.letter) letter
FROM a FULL JOIN b ON a.num = b.num
ORDER BY 1;
With your data:
WITH a AS
(SELECT 1 num, 'A' letter FROM dual
UNION ALL SELECT 2, 'B' FROM dual
UNION ALL SELECT 3, 'C' FROM dual),
b AS
(SELECT 2 num, 'C' letter FROM dual
UNION ALL SELECT 3, 'D' FROM dual
UNION ALL SELECT 4, 'E' FROM dual)
SELECT COALESCE(b.num, a.num) num, COALESCE(b.letter, a.letter) letter
FROM a FULL JOIN b ON a.num = b.num
ORDER BY 1;
NUM L
---------- -
1 A
2 C
3 D
4 E
The efficiency might be lacking, but it produces the correct answer.
select nums.num, coalesce(b.letter, a.letter)
from
(select num from b
union
select num from a) nums
left outer join b
on (b.num = nums.num)
left outer join a
on (a.num = nums.num);
Or you can use Oracle-specific technique to make the code shorter: http://www.sqlfiddle.com/#!4/0b796/11
with a as
(
select Num, 'A' as src, Letter
from tblA
union
select Num, 'B' as src, Letter
from tblB
)
select Num, min(Letter) keep(dense_rank first order by src desc) as Letter
from a
group by Num
order by Num;
Output:
| NUM | LETTER |
----------------
| 1 | A |
| 2 | C |
| 3 | D |
| 4 | E |
The code works regardless of min(letter) or max(letter), it has the same output, it gives the same output. Important is you use keep dense_rank. Another important thing is, the order matter, we use order by src desc to give priority to source table B when keeping a row.
And to really make it shorter, use keep dense_rank last, and omit the desc on order by, asc is the default anyway http://www.sqlfiddle.com/#!4/0b796/12
with a as
(
select Num, 'A' as src, Letter
from tblA
union
select Num, 'B' as src, Letter
from tblB
)
select Num, min(Letter) keep(dense_rank last order by src) as Letter
from a
group by Num
order by Num;
Again, using min or max on Letter doesn't matter, as long as your keep dense_rank get the prioritized/preferred row
Another option is to combine the UNION and MINUS commands as follows:
SELECT
NUM, LETTER
FROM
TABLE B
UNION
( SELECT
NUM, LETTER
FROM
TABLE A
WHERE
NUM IN (SELECT
NUM
FROM
TABLE A
MINUS
SELECT
NUM
FROM
TABLE B ))
SELECT A.*
FROM A
WHERE A.NUM NOT IN
(SELECT A.NUM
FROM B
WHERE A.NUM=B.NUM
AND B.NUM IS NOT NULL
AND A.NUM IS NOT NULL
)
UNION
SELECT * FROM B;