Expanding information from one row to all similarly grouped rows in SQL


I am not sure of the logic required to accomplish this, but I want to take a table like this...
+----+------+
| Id | Type |
+----+------+
| 10 | A    |
| 10 | B    |
| 10 | C    |
| 20 | A    |
| 20 | C    |
+----+------+
...and end up with a table like this...
+----+------+---+---+---+
| Id | Type | A | B | C |
+----+------+---+---+---+
| 10 | A    | 1 | 1 | 1 |
| 10 | B    | 1 | 1 | 1 |
| 10 | C    | 1 | 1 | 1 |
| 20 | A    | 1 | 0 | 1 |
| 20 | C    | 1 | 0 | 1 |
+----+------+---+---+---+
...where new columns consolidate the Type information for each Id into every row of that Id. Since Id 10 has rows of types A, B, and C, all rows with an Id of 10 should have a 1/true in the new columns A, B, and C. Id 20 has no row of type B, so its rows get a 0 in column B.
I know how to do this on a per-row basis, but can't wrap my head around how to consolidate the information from multiple rows into each row of the same ID.

Try the logic below:
SELECT *,
(SELECT COUNT(DISTINCT Type) FROM your_table B WHERE B.ID = A.Id and B.Type = 'A') A,
(SELECT COUNT(DISTINCT Type) FROM your_table C WHERE C.ID = A.Id and C.Type = 'B') B,
(SELECT COUNT(DISTINCT Type) FROM your_table D WHERE D.ID = A.Id and D.Type = 'C') C
FROM your_table A
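Another option uses window functions. MAX keeps each flag at 0 or 1 even if an Id has duplicate Type rows (SUM would return the number of matching rows instead):
SELECT *,
MAX(CASE WHEN Type = 'A' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) A,
MAX(CASE WHEN Type = 'B' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) B,
MAX(CASE WHEN Type = 'C' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) C
FROM your_table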
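
To check the window-function approach against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. It assumes SQLite 3.25 or later (needed for window functions); the table name your_table comes from the answers, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory copy of the question's sample table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Id INTEGER, Type TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(10, "A"), (10, "B"), (10, "C"), (20, "A"), (20, "C")],
)

# Each flag is computed once per Id partition and repeated on every row of it.
query = """
SELECT Id, Type,
       MAX(CASE WHEN Type = 'A' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS A,
       MAX(CASE WHEN Type = 'B' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS B,
       MAX(CASE WHEN Type = 'C' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS C
FROM your_table
ORDER BY Id, Type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The rows for Id 20 should come back with 1, 0, 1 in the flag columns, matching the desired table in the question.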

Related

SQL to get count of distinct rows based on different rules

Say you have a table like:
| key | status |
| --- | ------ |
| 3 | A |
| 4 | A |
| 4 | C |
| 5 | B |
| 6 | B |
| 6 | C |
| 7 | A |
| 7 | B |
I want a query that returns, in a single row, the number of distinct keys that have a specific status, subject to some priority rules. The rule is different for each column, something like:
Column a_count = count of any distinct key that has a status of A
Column b_count = count of any distinct key that has a status of B, but where the same key does not also appear with a status of A
Column c_count = count of any distinct key that has a status of C, but where the same key does not also appear with a status of A or B
The point being that the total of all counts should equal the total number of distinct keys in the source table. In my sample data above, the results should be:
| a_count | b_count | c_count |
| ------- | ------- | ------- |
| 3 | 2 | 0 |

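You should be able to do your pivot with CASE expressions and NOT EXISTS. COUNT counts every non-NULL value (including 0), so each CASE returns the key for qualifying rows and NULL otherwise, and DISTINCT counts each key once. Note that some databases (SQL Server, for example) do not allow a subquery inside an aggregate.
SELECT Count (DISTINCT CASE
WHEN status = 'A' THEN a.KEY
END) AS a_count,
Count (DISTINCT CASE
WHEN status = 'B'
AND NOT EXISTS (SELECT 1
FROM mytable b
WHERE a.KEY = b.KEY
AND b.status = 'A') THEN a.KEY
END) AS b_count,
Count (DISTINCT CASE
WHEN status = 'C'
AND NOT EXISTS (SELECT 1
FROM mytable c
WHERE a.KEY = c.KEY
AND c.status IN ( 'A', 'B' )) THEN a.KEY
END) AS c_count
FROM mytable a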
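
As a cross-check on the expected counts, here is a sketch of a different, two-step formulation (not the NOT EXISTS query above): first reduce each key to has_a / has_b / has_c flags, then count keys by priority. It runs in in-memory SQLite from Python; the flag column names and the per_key alias are made up for illustration, and "key" is double-quoted because it is a keyword in several databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable ("key" INTEGER, status TEXT)')
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(3, "A"), (4, "A"), (4, "C"), (5, "B"),
     (6, "B"), (6, "C"), (7, "A"), (7, "B")],
)

# Inner query: one row per key with a 0/1 flag per status.
# Outer query: each key is counted in exactly one column, by priority A > B > C.
query = """
SELECT SUM(has_a) AS a_count,
       SUM(CASE WHEN has_a = 0 AND has_b = 1 THEN 1 ELSE 0 END) AS b_count,
       SUM(CASE WHEN has_a = 0 AND has_b = 0 AND has_c = 1 THEN 1 ELSE 0 END) AS c_count
FROM (SELECT "key",
             MAX(CASE WHEN status = 'A' THEN 1 ELSE 0 END) AS has_a,
             MAX(CASE WHEN status = 'B' THEN 1 ELSE 0 END) AS has_b,
             MAX(CASE WHEN status = 'C' THEN 1 ELSE 0 END) AS has_c
      FROM mytable
      GROUP BY "key") AS per_key
"""
counts = conn.execute(query).fetchone()
print(counts)
```

Because every key lands in exactly one column, the three counts add up to the number of distinct keys, which is the invariant the question asks for.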

For every value in row c in d, return rows with maximum value of a

I have a table with 4 columns: a, b, c, d.
Sample data:
a | b | c | d
1 | 1 | 101 | 0
2 | 1 | 101 | 0
3 | 1 | 101 | 1
4 | 1 | 102 | 0
5 | 1 | 102 | 0
1 | 2 | 101 | 0
2 | 2 | 101 | 1
I need a SQL query that, for every value of c within each b, returns the row with the maximum a,
i.e.
Expected output:
a | b | c | d
3 | 1 | 101 | 1
5 | 1 | 102 | 0
2 | 2 | 101 | 1

You can use a correlated subquery:
select t.*
from t
where t.a = (select max(t2.a) from t t2 where t2.b = t.b and t2.c = t.c);
With an index on t(b, c, a), this often has the best performance.
An alternative is window functions:
select t.*
from (select t.*, row_number() over (partition by b, c order by a desc) as seqnum
from t
) t
where seqnum = 1;

You don't mention the database you are using. In PostgreSQL you can do:
select distinct on (b, c) a, b, c, d
from t
order by b, c, a desc
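
For reference, the correlated-subquery version can be checked against the sample data with a small in-memory SQLite script. This is only a sketch: the table name t follows the answers, and the ORDER BY is added so the output order is stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 1, 101, 0), (2, 1, 101, 0), (3, 1, 101, 1), (4, 1, 102, 0),
     (5, 1, 102, 0), (1, 2, 101, 0), (2, 2, 101, 1)],
)

# Keep a row only if its a equals the maximum a within its (b, c) group.
query = """
SELECT t.a, t.b, t.c, t.d
FROM t
WHERE t.a = (SELECT MAX(t2.a) FROM t AS t2 WHERE t2.b = t.b AND t2.c = t.c)
ORDER BY t.b, t.c
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two rows in the same (b, c) group tie on the maximum a, this form returns both of them, whereas the row_number() and DISTINCT ON versions return exactly one.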

SQL select distinct when one column in and another column greater than

Consider the following dataset:
+---------------------+
| ID | NAME | VALUE |
+---------------------+
| 1 | a | 0.2 |
| 1 | b | 8 |
| 1 | c | 3.5 |
| 1 | d | 2.2 |
| 2 | b | 4 |
| 2 | c | 0.5 |
| 2 | d | 6 |
| 3 | a | 2 |
| 3 | b | 4 |
| 3 | c | 3.6 |
| 3 | d | 0.2 |
+---------------------+
I'm trying to develop a SQL select statement that returns the top or distinct ID where NAME 'a' and 'b' both exist and both of the corresponding VALUEs are >= 1. Thus, the desired output would be:
+---------------------+
| ID | NAME | VALUE |
+---------------------+
| 3 | a | 2 |
+---------------------+
Appreciate any assistance anyone can provide.

You can use the MIN window function with a conditional expression to pull the 'a' and 'b' values onto every row of the ID, then filter on them:
SELECT * FROM (
SELECT *,
MIN(CASE WHEN NAME = 'a' THEN [value] end) OVER(PARTITION BY ID) aVal,
MIN(CASE WHEN NAME = 'b' THEN [value] end) OVER(PARTITION BY ID) bVal
FROM T
) t1
WHERE aVal >= 1 and bVal >= 1 and NAME = 'a'

This seems like a group by and having query:
select id
from t
where name in ('a', 'b')
group by id
having count(*) = 2 and
min(value) >= 1;
No subqueries or joins are necessary.
The where clause filters the data to only look at the "a" and "b" records. The count(*) = 2 checks that both exist. If you can have duplicates, then use count(distinct name) = 2.
Then, you want the minimum value to be at least 1, so that is the final condition.
I am not sure why your desired results have the "a" row, but if you really want it, you can change the select to:
select id, 'a' as name,
max(case when name = 'a' then value end) as value
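
A quick way to confirm the group by / having logic is to run it on the sample data. The sketch below uses in-memory SQLite from Python, with the table name t taken from the answer and count(distinct name) used so duplicate rows would not matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, value REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", 0.2), (1, "b", 8), (1, "c", 3.5), (1, "d", 2.2),
     (2, "b", 4), (2, "c", 0.5), (2, "d", 6),
     (3, "a", 2), (3, "b", 4), (3, "c", 3.6), (3, "d", 0.2)],
)

# Only 'a' and 'b' rows survive the WHERE; an id qualifies when both names
# are present and the smaller of the two values is at least 1.
query = """
SELECT id
FROM t
WHERE name IN ('a', 'b')
GROUP BY id
HAVING COUNT(DISTINCT name) = 2 AND MIN(value) >= 1
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)
```

ID 1 fails because its 'a' value is 0.2, and ID 2 fails because it has no 'a' row, so only ID 3 remains.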

You can use IN with a subquery:
select top 1 * from t
where t.id in
(
select id from t
where name in ('a','b')
group by id
having sum(case when value >= 1 then 1 else 0 end) >= 2
)
order by id, name

Get ID if table has one or more row exist for a condition

Suppose I have a table as below:
ID | Account| Status
---+--------+-------
1 | acct1 | A
1 | acct2 | S
1 | acct3 | C
2 | acct4 | C
2 | acct5 | C
3 | acct6 | A
3 | acct7 | C
4 | acct8 | C
4 | acct9 | C
4 | acct10 | C
Condition: return an ID only if none of its accounts has an 'A' or 'S' status.
For this case, I only want ID '2' and '4' to be returned.

You could use HAVING and conditional SUM:
SELECT ID
FROM tab
GROUP BY ID
HAVING SUM(CASE WHEN Status IN ('A', 'S') THEN 1 ELSE 0 END) = 0

First find the IDs that have an 'A' or 'S' record, then select the distinct IDs that are not in that set:
Select distinct(ID) as ID
from table_name where id not in
(
select ID from table_name where status in('A', 'S')
)
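
The HAVING version can be verified against the sample data with a short in-memory SQLite script. This is a sketch only; the table name tab follows the first answer, and the ORDER BY is added for a stable result order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (ID INTEGER, Account TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [(1, "acct1", "A"), (1, "acct2", "S"), (1, "acct3", "C"),
     (2, "acct4", "C"), (2, "acct5", "C"),
     (3, "acct6", "A"), (3, "acct7", "C"),
     (4, "acct8", "C"), (4, "acct9", "C"), (4, "acct10", "C")],
)

# An ID is kept only when it has zero rows with status 'A' or 'S'.
query = """
SELECT ID
FROM tab
GROUP BY ID
HAVING SUM(CASE WHEN Status IN ('A', 'S') THEN 1 ELSE 0 END) = 0
ORDER BY ID
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)
```

One practical difference between the two answers: NOT IN returns no rows at all if the subquery produces a NULL ID, while the HAVING form is unaffected by that.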

Merge multiple rows in SQL with tie breaking on primary key

I have a table with data like the following
key | A | B | C
---------------------------
1 | x | 0 | 1
2 | x | 2 | 0
3 | x | NULL | 4
4 | y | 7 | 1
5 | y | 3 | NULL
6 | z | NULL | 4
I want to merge the rows together based on column A, with the largest primary key acting as the 'tie breaker' between values that are not NULL.
Result
key | A | B | C
---------------------------
1 | x | 2 | 4
2 | y | 3 | 1
3 | z | NULL | 4
What would be the best way to achieve this assuming my data is actually 40 columns and 1 million rows with an unknown level of duplications?

Using ROW_NUMBER and conditional aggregation:
WITH cte AS(
SELECT *,
rnB = ROW_NUMBER() OVER(PARTITION BY A ORDER BY CASE WHEN B IS NULL THEN 0 ELSE 1 END DESC, [key] DESC),
rnC = ROW_NUMBER() OVER(PARTITION BY A ORDER BY CASE WHEN C IS NULL THEN 0 ELSE 1 END DESC, [key] DESC)
FROM tbl
)
SELECT
[key] = ROW_NUMBER() OVER(ORDER BY A),
A,
B = MAX(CASE WHEN rnB = 1 THEN B END),
C = MAX(CASE WHEN rnC = 1 THEN C END)
FROM cte
GROUP BY A
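
The query above uses SQL Server syntax (alias = expression, [key]). As a portable cross-check of the expected result, here is a sketch of a different technique: one correlated subquery per column that picks the non-NULL value with the largest key. It runs in in-memory SQLite from Python; the alias g is made up for illustration, and the renumbered key column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tbl ("key" INTEGER PRIMARY KEY, A TEXT, B INTEGER, C INTEGER)')
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(1, "x", 0, 1), (2, "x", 2, 0), (3, "x", None, 4),
     (4, "y", 7, 1), (5, "y", 3, None), (6, "z", None, 4)],
)

# For each distinct A, take the latest (largest key) non-NULL B and C.
# A subquery that finds no non-NULL value yields NULL.
query = """
SELECT g.A,
       (SELECT t.B FROM tbl AS t
        WHERE t.A = g.A AND t.B IS NOT NULL
        ORDER BY t."key" DESC LIMIT 1) AS B,
       (SELECT t.C FROM tbl AS t
        WHERE t.A = g.A AND t.C IS NOT NULL
        ORDER BY t."key" DESC LIMIT 1) AS C
FROM (SELECT DISTINCT A FROM tbl) AS g
ORDER BY g.A
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With 40 columns this form needs one subquery per column, so on a million rows it would likely want an index on (A, key); whether it beats the ROW_NUMBER approach depends on the database and is worth measuring.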
