SQL Server Pivot Assistance - sql

I am using SQL Server. This can probably be solved with PIVOT.
I have the following data (all columns are strings):
X Y Z ---Heading
A a p
A b q
A c r
B a s
B b t
B c u
I want output:
a b c ---Heading
A p q r
B s t u

You can use conditional aggregation with CASE expressions:
select
X,
max(case when Y = 'a' then Z end) as a,
max(case when Y = 'b' then Z end) as b,
max(case when Y = 'c' then Z end) as c
from myTable
group by
X
order by
X;
output:
| x | a | b | c |
| --- | --- | --- | --- |
| A | p | q | r |
| B | s | t | u |
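If the Y values are not known in advance, the same CASE-based pivot can be generated from the data. The following is only a sketch under assumptions: it uses Python's sqlite3 module with an in-memory table named myTable as a stand-in for SQL Server, and the helper pivot_query is a hypothetical name, not something from the answer.

```python
import sqlite3

def pivot_query(conn):
    """Build the CASE-based pivot for whatever Y values myTable holds."""
    ys = [r[0] for r in conn.execute("SELECT DISTINCT Y FROM myTable ORDER BY Y")]
    # Each Y value is used as a string literal and as a column alias,
    # so single and double quotes are escaped accordingly.
    cols = ",\n  ".join(
        "MAX(CASE WHEN Y = '{lit}' THEN Z END) AS \"{name}\"".format(
            lit=y.replace("'", "''"), name=y.replace('"', '""'))
        for y in ys)
    return "SELECT X,\n  {}\nFROM myTable\nGROUP BY X\nORDER BY X".format(cols)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (X TEXT, Y TEXT, Z TEXT)")
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", [
    ("A", "a", "p"), ("A", "b", "q"), ("A", "c", "r"),
    ("B", "a", "s"), ("B", "b", "t"), ("B", "c", "u"),
])

pivot_rows = conn.execute(pivot_query(conn)).fetchall()
for row in pivot_rows:
    print(row)
```

On SQL Server the generated text would have to be run as dynamic SQL; that step is not shown here.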

Related

sql - Deletion in closure table with multiple same paths

I have the following hierarchical structure:
A -> E -> C -> D
|
|
|-> B -> D
Here is the closure table I've come up with:
| Ancestor | Descendant | Depth |
| A | A | 0 |
| B | B | 0 |
| C | C | 0 |
| D | D | 0 |
| E | E | 0 |
| A | E | 1 |
| A | B | 1 |
| A | C | 2 |
| E | C | 1 |
| A | D | 3 |
| E | D | 2 |
| C | D | 1 |
| A | D | 2 |
| B | D | 1 |
I want to remove the link between B and D, and therefore I want to delete the link between A and D (the one of depth 2). The problem is that I don't want to delete the link between A and D of depth 3 since I didn't delete the link between C and D.
For the moment, here is the SQL statement to list the links I want to delete:
SELECT link.ancestor, link.descendant, link.depth
FROM closure_table p,
closure_table link,
closure_table c
WHERE p.ancestor = link.ancestor
AND c.descendant = link.descendant
AND p.descendant = 'B'
AND c.ancestor = 'D';
but this statement gives me rows I don't want to delete:
| Ancestor | Descendant | Depth |
| A | D | 2 |
| A | D | 3 | <- As said before, I want to keep this one
| B | D | 1 |
You can select the ancestor-descendant pair that has the minimum depth of all of those same ancestor-descendant pairs:
with edges(s, e) as (
-- the pairs to be removed
select 'A', 'D'
union all
select 'B', 'D'
),
n_l as (
select c.* from closure_table c where c.ancestor != c.descendant
)
select c.* from n_l c where exists (select 1 from edges e where e.s = c.ancestor and e.e = c.descendant)
and c.depth = (select min(c1.depth) from n_l c1 where c1.ancestor = c.ancestor and c1.descendant = c.descendant);
Output:
| ancestor | descendant | depth |
| A | D | 2 |
| B | D | 1 |
I think I’ve found the solution, for those who are interested:
declare @Descendant nchar(10) = 'D';
declare @Ancestor nchar(10) = 'B';
with cte as
(
-- anchor: the direct link being removed
select Ancestor, Depth
from closure_table
where Descendant = @Descendant
and Ancestor = @Ancestor
and Depth = 1
union all
-- walk up one direct parent at a time, increasing the depth
select r.Ancestor, l.Depth + 1 as Depth
from cte as l
join closure_table as r on r.Descendant = l.Ancestor
where r.Depth = 1
)
delete closure_table
from closure_table
join cte on cte.Ancestor = closure_table.Ancestor and cte.Depth = closure_table.Depth
where closure_table.Descendant = @Descendant;
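To check that walking up from the removed link leaves the depth-3 row alone, the same idea can be run against the question's table. This is a sketch under assumptions, not the poster's exact statement: it uses Python's sqlite3 module with an in-memory database instead of SQL Server, and it splits the work into a recursive SELECT followed by a separate DELETE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE closure_table (ancestor TEXT, descendant TEXT, depth INTEGER)")
conn.executemany("INSERT INTO closure_table VALUES (?, ?, ?)", [
    ("A", "A", 0), ("B", "B", 0), ("C", "C", 0), ("D", "D", 0), ("E", "E", 0),
    ("A", "E", 1), ("A", "B", 1), ("A", "C", 2), ("E", "C", 1), ("A", "D", 3),
    ("E", "D", 2), ("C", "D", 1), ("A", "D", 2), ("B", "D", 1),
])

# Start at the direct link B -> D and walk up through direct parents,
# recording the depth at which each ancestor reaches D through that link.
doomed = conn.execute("""
    WITH RECURSIVE cte(ancestor, depth) AS (
        SELECT ancestor, depth
        FROM closure_table
        WHERE descendant = :d AND ancestor = :a AND depth = 1
        UNION ALL
        SELECT r.ancestor, l.depth + 1
        FROM cte AS l
        JOIN closure_table AS r ON r.descendant = l.ancestor
        WHERE r.depth = 1
    )
    SELECT ancestor, depth FROM cte
""", {"a": "B", "d": "D"}).fetchall()

# Delete only the (ancestor, D, depth) rows found by the walk.
conn.executemany(
    "DELETE FROM closure_table"
    " WHERE ancestor = ? AND descendant = 'D' AND depth = ?",
    doomed)

remaining_d = conn.execute(
    "SELECT ancestor, depth FROM closure_table"
    " WHERE descendant = 'D' ORDER BY depth, ancestor").fetchall()
print(sorted(doomed))
print(remaining_d)
```

Matching on depth only tells paths apart when they have different lengths. If an ancestor reached D by two different paths of the same length, the rows would be identical and both would be deleted.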

Expanding information from one row to all similarly grouped rows in SQL

I am not sure of the logic required to accomplish this, but I want to take a table like this...
+----+------+
| Id | Type |
+----+------+
| 10 | A |
| 10 | B |
| 10 | C |
| 20 | A |
| 20 | C |
+----+------+
...and end up with a table like this...
+----+------+---+---+---+
| Id | Type | A | B | C |
+----+------+---+---+---+
| 10 | A | 1 | 1 | 1 |
| 10 | B | 1 | 1 | 1 |
| 10 | C | 1 | 1 | 1 |
| 20 | A | 1 | 0 | 1 |
| 20 | C | 1 | 0 | 1 |
+----+------+---+---+---+
...where each Id will have new columns created to consolidate information about Type into every row of that Id. Since 10 has a row of types A, B, and C, then all rows that have an ID of 10 should have a 1/true in the new columns A, B and C.
I know how to do this on a per-row basis, but can't wrap my head around how to consolidate the information from multiple rows into each row of the same ID.
Try the logic below, which uses one correlated subquery per flag column:
SELECT *,
(SELECT COUNT(DISTINCT Type) FROM your_table B WHERE B.ID = A.Id and B.Type = 'A') A,
(SELECT COUNT(DISTINCT Type) FROM your_table C WHERE C.ID = A.Id and C.Type = 'B') B,
(SELECT COUNT(DISTINCT Type) FROM your_table D WHERE D.ID = A.Id and D.Type = 'C') C
FROM your_table A
Another option uses window functions:
SELECT *,
SUM(CASE WHEN Type= 'A' THEN 1 ELSE 0 END) OVER(PARTITION BY Id) A,
SUM(CASE WHEN Type= 'B' THEN 1 ELSE 0 END) OVER(PARTITION BY Id) B,
SUM(CASE WHEN Type= 'C' THEN 1 ELSE 0 END) OVER(PARTITION BY Id) C
FROM your_table
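One detail worth checking: a SUM over the partition counts rows, so a duplicated (Id, Type) pair would produce 2 rather than 1. The sketch below is a variant, not one of the answers above. It computes the flags once per Id with MAX and joins them back to every row. It assumes Python's sqlite3 module as a stand-in for SQL Server, and the extra (20, 'C') row is a hypothetical duplicate added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Id INTEGER, Type TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?)", [
    (10, "A"), (10, "B"), (10, "C"), (20, "A"), (20, "C"),
    (20, "C"),  # hypothetical duplicate, not in the question's data
])

# Aggregate the flags once per Id, then attach them to each original row.
flag_rows = conn.execute("""
    SELECT t.Id, t.Type, f.A, f.B, f.C
    FROM your_table AS t
    JOIN (
        SELECT Id,
               MAX(CASE WHEN Type = 'A' THEN 1 ELSE 0 END) AS A,
               MAX(CASE WHEN Type = 'B' THEN 1 ELSE 0 END) AS B,
               MAX(CASE WHEN Type = 'C' THEN 1 ELSE 0 END) AS C
        FROM your_table
        GROUP BY Id
    ) AS f ON f.Id = t.Id
    ORDER BY t.Id, t.Type
""").fetchall()

for row in flag_rows:
    print(row)
```

Because MAX is used, the C flag for Id 20 stays at 1 even though that type appears twice.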

Alternatives to a conditional full outer join

I need to compare the records from two tables: X and Y. Each record has two ids: ID1 and ID2. Either ID1 or ID2 can be null in either table, but both can’t be null at once. I need to produce a view with all the information from both tables:
- Rows where X.ID1 = Y.ID1 and X.ID2 = Y.ID2
- Rows where X.ID1 = Y.ID1 but X.ID2 <> Y.ID2
- Rows where X.ID1 <> Y.ID1 but X.ID2 = Y.ID2
- Rows where X.ID1 and Y.ID1 don’t have any matches at all
- Rows where X.ID2 and Y.ID2 don’t have any matches at all
Example:
X: Y:
|---------------| |---------------|
| ID1 | ID2 | | ID1 | ID2 |
|---------------| |---------------|
| 1 | A | | 1 | A |
| 2 | B | | 2 | C |
| 3 | NULL | | NULL | B |
| NULL | D | | 5 | NULL |
|---------------| |---------------|
Output:
|---------------------------------------|
| XID1 | YID1 | XID2 | YID2 | SRC |
|---------------------------------------|
| 1 | 1 | A | A | X+Y |
| 2 | 2 | B | C | X+Y |
| 3 | NULL | NULL | NULL | X |
| NULL | 5 | NULL | NULL | Y |
| 2 | NULL | B | B | X+Y |
| NULL | 2 | NULL | C | Y |
| NULL | NULL | D | NULL | X |
|---------------------------------------|
My first obvious solution was to do a FULL OUTER JOIN:
SELECT … FROM X FULL OUTER JOIN Y ON X.ID1 = Y.ID1 OR X.ID2 = Y.ID2
This works, but a conditional within a join has terrible performance, and this view would take up to a minute to run. Removing the conditional takes the execution time down to less than a second, but then I lose matching by one of the IDs.
How can I elegantly achieve the above without using a conditional join? I’ve tried:
- Joining by concatenation of the two IDs, but this only matches when both IDs match
- Doing a CROSS JOIN and filtering by X.ID1=Y.ID1 OR X.ID2=Y.ID2, but this loses the cases without any matches. This is the most promising approach.
- Doing a UNION ALL of X and Y and then grouping by ID1 and ID2, but this once again only matches when both IDs match
You can try decomposing this into multiple joins. I think the logic is:
SELECT …
FROM X JOIN
Y
ON X.ID1 = Y.ID1
UNION ALL
SELECT …
FROM X JOIN
Y
ON X.ID2 = Y.ID2 AND
(X.ID1 <> Y.ID1 OR X.ID1 IS NULL OR Y.ID1 IS NULL) -- a plain <> is never true when either ID1 is NULL
UNION ALL
SELECT ...
FROM X
WHERE NOT EXISTS (SELECT 1 FROM Y WHERE Y.ID1 = X.ID1) AND
NOT EXISTS (SELECT 1 FROM Y WHERE Y.ID2 = X.ID2)
UNION ALL
SELECT ...
FROM Y
WHERE NOT EXISTS (SELECT 1 FROM X WHERE Y.ID1 = X.ID1) AND
NOT EXISTS (SELECT 1 FROM X WHERE Y.ID2 = X.ID2) ;
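A decomposition like this can be tried on the question's sample tables. The sketch below assumes Python's sqlite3 module as a stand-in for SQL Server, fills the elided SELECT lists with an illustrative choice of columns (XID1, YID1, XID2, YID2, SRC), and writes the second branch so that a NULL ID1 on either side does not block a match on ID2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (ID1 INTEGER, ID2 TEXT);
    CREATE TABLE Y (ID1 INTEGER, ID2 TEXT);
    INSERT INTO X VALUES (1, 'A'), (2, 'B'), (3, NULL), (NULL, 'D');
    INSERT INTO Y VALUES (1, 'A'), (2, 'C'), (NULL, 'B'), (5, NULL);
""")

combined = conn.execute("""
    -- pairs that match on ID1
    SELECT X.ID1, Y.ID1, X.ID2, Y.ID2, 'X+Y' AS SRC
    FROM X JOIN Y ON X.ID1 = Y.ID1
    UNION ALL
    -- pairs that match on ID2 only (NULL-safe test on ID1)
    SELECT X.ID1, Y.ID1, X.ID2, Y.ID2, 'X+Y'
    FROM X JOIN Y
      ON X.ID2 = Y.ID2
     AND (X.ID1 <> Y.ID1 OR X.ID1 IS NULL OR Y.ID1 IS NULL)
    UNION ALL
    -- X rows with no partner on either ID
    SELECT X.ID1, NULL, X.ID2, NULL, 'X'
    FROM X
    WHERE NOT EXISTS (SELECT 1 FROM Y WHERE Y.ID1 = X.ID1)
      AND NOT EXISTS (SELECT 1 FROM Y WHERE Y.ID2 = X.ID2)
    UNION ALL
    -- Y rows with no partner on either ID
    SELECT NULL, Y.ID1, NULL, Y.ID2, 'Y'
    FROM Y
    WHERE NOT EXISTS (SELECT 1 FROM X WHERE X.ID1 = Y.ID1)
      AND NOT EXISTS (SELECT 1 FROM X WHERE X.ID2 = Y.ID2)
""").fetchall()

for row in combined:
    print(row)
```

For the sample tables this yields six rows: three matched pairs, two X-only rows and one Y-only row, which is what the FULL OUTER JOIN with the OR condition would return for the same data.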
If I read your conditions correctly, you could try something like this. Union the two left joins together and take a distinct of the two sets.
SELECT DISTINCT ... FROM (
SELECT … FROM X LEFT JOIN Y ON X.ID1 = Y.ID1
UNION ALL
SELECT … FROM X LEFT JOIN Y ON X.ID2 = Y.ID2
UNION ALL
SELECT … FROM Y LEFT JOIN X ON Y.ID1 = X.ID1 WHERE X.ID1 is null
UNION ALL
SELECT … FROM Y LEFT JOIN X ON Y.ID2 = X.ID2 WHERE X.ID2 is null
) AS combined -- SQL Server requires an alias on a derived table
In situations where I have to choose between doing an OR in the join, or a union of two left joins, I find the union to be faster.
EDIT: Updated to include Y on the left as well.

Find the difference for results from two select

I have two tables:
table_1:
A | B | C
z | x | 12
z | c | 13
z | c | 10
a | s | 14
a | d | 11
table_2:
A | B | C
z | c | 10
z | x | 15
z | x | 11
a | d | 14
a | s | 12
I want to:
- group the tables by A and B
- and find the difference for SUM of C for AB.
I started with:
SELECT A, B, SUM(C) from table_1 GROUP BY A, B;
SELECT A, B, SUM(C) from table_2 GROUP BY A, B;
but I don't know how to JOIN them with adding additional column that is equal
to table_1.sum(C) - table_2.sum(c)
Expected result like:
A | B | sum1 | sum2 | diff
z | x | 12 | 26 | -14
z | c | 23 | 10 | 13
a | s | 14 | 12 | 2
a | d | 11 | 14 | -3
Join the two aggregated subqueries:
select X.A,X.B, sum1, sum2, sum1-sum2 as diff from
(
SELECT A, B, SUM(C) sum1
from table_1 GROUP BY A, B
)X inner join
(
SELECT A, B, SUM(C) sum2
from table_2 GROUP BY A, B
)Y on X.A=Y.A and X.B=Y.B
What do you want to happen when the groups are not the same in the two tables? An inner join can be dangerous because any group that exists in only one table will disappear.
If you want to keep all groups, then one method is union all/group by:
select a, b, sum(c1) as sum1, sum(c2) as sum2,
(sum(c1) - sum(c2)) as diff
from ((select a, b, c as c1, 0 as c2
from table_1
) union all
(select a, b, 0 as c1, c as c2
from table_2
)
) t
group by a, b
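The effect of keeping all groups can be seen by running the union all/group by form on the question's data. This is a sketch that assumes Python's sqlite3 module as a stand-in for SQL Server. The ('b', 'q', 5) row is a hypothetical group that exists only in table_1, added to show what an inner join would lose. The difference is computed as sum1 - sum2 to match the expected result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (A TEXT, B TEXT, C INTEGER);
    CREATE TABLE table_2 (A TEXT, B TEXT, C INTEGER);
    INSERT INTO table_1 VALUES
        ('z', 'x', 12), ('z', 'c', 13), ('z', 'c', 10),
        ('a', 's', 14), ('a', 'd', 11),
        ('b', 'q', 5);  -- hypothetical group missing from table_2
    INSERT INTO table_2 VALUES
        ('z', 'c', 10), ('z', 'x', 15), ('z', 'x', 11),
        ('a', 'd', 14), ('a', 's', 12);
""")

# Stack both tables, putting each value in its own column, then aggregate.
diff_rows = conn.execute("""
    SELECT a, b,
           SUM(c1) AS sum1,
           SUM(c2) AS sum2,
           SUM(c1) - SUM(c2) AS diff
    FROM (SELECT a, b, c AS c1, 0 AS c2 FROM table_1
          UNION ALL
          SELECT a, b, 0 AS c1, c AS c2 FROM table_2) AS t
    GROUP BY a, b
    ORDER BY a, b
""").fetchall()

for row in diff_rows:
    print(row)
```

The group ('b', 'q') appears with sum2 = 0 instead of vanishing, because the missing side contributes zeros rather than no rows.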

SQL Server : most frequent value in each row

How can I find the most frequent value in each row in SQL Server?
Example:
1 a d a a c a b --> a
2 b a c b b b d --> b
3 h a h h b c d --> h
4 d d c h g p m --> d
5 e e g h d e h --> e
In the first row, 'a' is the most frequent value, and so on.
Assuming the values are stored in separate columns, an UNPIVOT query along these lines would work:
Test Data
Declare @T table (ID INT , Col1 varchar(1) , Col2 varchar(1) , Col3 varchar(1)
, Col4 varchar(1) , Col5 varchar(1) , Col6 varchar(1) , Col7 varchar(1))
Insert Into @T values
(1,'a','d','a','a','c','a','b'),
(2,'b','a','c','b','b','b','d'),
(3,'h','a','h','h','b','c','d'),
(4,'d','d','c','h','g','p','m'),
(5,'e','e','g','h','d','e','h');
Query
-- unpivot the seven columns into rows, count each value per ID,
-- and rank the values by that count
WITH X AS (
Select ID , Val, COUNT(*) total
,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY COUNT(*) DESC) rn
from @T
UNPIVOT (Val FOR N IN (Col1,Col2,Col3,Col4,Col5,Col6,Col7)) up
GROUP BY ID , Val
)
Select t.* , Val
FROM X
INNER JOIN @T t ON X.ID = t.ID
WHERE rn = 1
Result Set
+----+------+------+------+------+------+------+------+-----+
| ID | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | Col7 | Val |
+----+------+------+------+------+------+------+------+-----+
| 1 | a | d | a | a | c | a | b | a |
| 2 | b | a | c | b | b | b | d | b |
| 3 | h | a | h | h | b | c | d | h |
| 4 | d | d | c | h | g | p | m | d |
| 5 | e | e | g | h | d | e | h | e |
+----+------+------+------+------+------+------+------+-----+
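The same unpivot-and-count idea can be tried outside SQL Server. The sketch below is a variant under assumptions: it uses Python's sqlite3 module, replaces UNPIVOT (which SQLite lacks) with a UNION ALL over the seven columns, and uses a table named T in place of the table variable. It selects every value whose count equals the row's maximum instead of using ROW_NUMBER, so a tie returns all tied values rather than an arbitrary one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = ["Col%d" % i for i in range(1, 8)]
conn.execute(
    "CREATE TABLE T (ID INTEGER, %s)" % ", ".join(c + " TEXT" for c in cols))
conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    (1, "a", "d", "a", "a", "c", "a", "b"),
    (2, "b", "a", "c", "b", "b", "b", "d"),
    (3, "h", "a", "h", "h", "b", "c", "d"),
    (4, "d", "d", "c", "h", "g", "p", "m"),
    (5, "e", "e", "g", "h", "d", "e", "h"),
])

# Stand-in for UNPIVOT: one SELECT per column, stacked with UNION ALL.
unpivot = "\nUNION ALL\n".join(
    "SELECT ID, %s AS Val FROM T" % c for c in cols)

most_frequent = conn.execute("""
    WITH u AS (%s),
         counts AS (
             SELECT ID, Val, COUNT(*) AS total
             FROM u
             GROUP BY ID, Val
         )
    SELECT c.ID, c.Val, c.total
    FROM counts AS c
    WHERE c.total = (SELECT MAX(c2.total) FROM counts AS c2 WHERE c2.ID = c.ID)
    ORDER BY c.ID, c.Val
""" % unpivot).fetchall()

for row in most_frequent:
    print(row)
```

The sample data has no ties, so each ID yields exactly one value.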