SQL - GROUP BY - Dynamic Columns - sql

How can we achieve this?
Actual Table:
.-------.---------.-------.------.---------.
| EmpId | Project | Title | Role | Values |
|-------|---------|-------|----- |---------|
| 1 | aaa |xxx | A| 100|
| 1 | aaa |yyy | B| 120|
| 1 | aaa |zzz | C| 90|
.-------.---------.-------.------.---------.
Target 1:
.-------.---------.-------.----.----.----.
| EmpId | Project | Title | A | B | C |
|-------|---------|-------|--- |----|----|
| 1 | aaa |xxx | 100|null|null|
| 1 | aaa |yyy |null| 120|null|
| 1 | aaa |zzz |null|null| 90|
.-------.---------.-------.----.----.----.
Target 2:
.-------.---------.----.----.----.
| EmpId | Project | A | B | C |
|-------|---------|--- |----|----|
| 1 | aaa | 100| 120| 90|
.-------.---------.----.----.----.
Conditions:
In Target 1, columns A/B/C are generated dynamically (pivoted, so the column names change constantly).
The columns are not actually A/B/C. They are the result of a pivot table or stored procedure, so they could be A/B/C/D or M/N or X/Y/Z.
Column Title is not important at all in Target 2.

Just use aggregation:
select EmpId, Project, max(A) as a, max(B) as b, max(C) as c
from t
group by EmpId, Project;

Use aggregation. MAX() ignores NULL values:
SELECT empid, project, MAX(A) as A, MAX(B) as B, MAX(C) as C
FROM mytable
GROUP BY empid, project
Demo on DB Fiddle:
| empid | project | A | B | C |
| ----- | ------- | --- | --- | --- |
| 1 | aaa | 100 | 120 | 90 |
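Since the question says the pivoted column names change, here is a minimal sketch of building the same MAX() query without hard-coding A/B/C. It uses Python with SQLite only so that it is runnable. The table name target1 and the helpers quote_ident and collapse_pivot are hypothetical names made up for this illustration, and on another RDBMS the column list would come from its own catalog rather than PRAGMA table_info.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier so arbitrary column names are safe to splice in."""
    return '"' + name.replace('"', '""') + '"'


def collapse_pivot(conn, table, key_cols, skip_cols=()):
    """Group by key_cols and take MAX() of every other column except skip_cols."""
    # Read the column names at run time instead of hard-coding A/B/C.
    info = conn.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
    all_cols = [row[1] for row in info]  # field 1 of each row is the column name
    value_cols = [c for c in all_cols if c not in key_cols and c not in skip_cols]

    keys = ", ".join(quote_ident(c) for c in key_cols)
    aggs = ", ".join(f"MAX({quote_ident(c)}) AS {quote_ident(c)}" for c in value_cols)
    sql = f"SELECT {keys}, {aggs} FROM {quote_ident(table)} GROUP BY {keys}"
    return conn.execute(sql).fetchall()


# Hypothetical table holding the Target 1 rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target1 (EmpId INT, Project TEXT, Title TEXT, A INT, B INT, C INT)")
conn.executemany(
    "INSERT INTO target1 VALUES (?, ?, ?, ?, ?, ?)",
    [(1, "aaa", "xxx", 100, None, None),
     (1, "aaa", "yyy", None, 120, None),
     (1, "aaa", "zzz", None, None, 90)],
)

# Title is skipped because the question says it does not matter in Target 2.
target2 = collapse_pivot(conn, "target1", ["EmpId", "Project"], skip_cols=["Title"])
print(target2)  # [(1, 'aaa', 100, 120, 90)]
```

The same call works unchanged if the value columns are M/N or X/Y/Z, because only the key columns are named explicitly.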

-- DISTINCT is needed because T1 contributes one identical output row per source row
SELECT DISTINCT
T1.EmpId,
T1.Project,
T2.A,
T3.B,
T4.C
FROM
Table T1
LEFT JOIN
Table T2 ON
T2.EmpId=T1.EmpId
AND T2.Project=T1.Project
AND T2.A IS NOT NULL
LEFT JOIN
Table T3 ON
T3.EmpId=T1.EmpId
AND T3.Project=T1.Project
AND T3.B IS NOT NULL
LEFT JOIN
Table T4 ON
T4.EmpId=T1.EmpId
AND T4.Project=T1.Project
AND T4.C IS NOT NULL

with cte (id,pro,title,rol,val) as (
select 1,'aaa','xxx','A',100 union all
select 1,'aaa','yyy','B',120 union all
select 1,'aaa','zzz','C',90)
select id,pro,title,[a],[b],[c] from (
select * from cte ) a
pivot
(max(val) for rol in ([a],[b],[c])) aa;
-- Target 2: aggregate the pivoted columns per id and project
with cte (id,pro,title,rol,val) as (
select 1,'aaa','xxx','A',100 union all
select 1,'aaa','yyy','B',120 union all
select 1,'aaa','zzz','C',90)
select id,pro,max([a]) A,max([b]) B,max([c]) C from (
select * from cte ) a
pivot
(max(val) for rol in ([a],[b],[c])) aa
group by id,pro
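PIVOT still needs the role list written out, so as a rough sketch of the dynamic case, the query below goes from the actual table straight to Target 2 with conditional aggregation, generating one MAX(CASE ...) column per role found in the data. It is written against SQLite from Python so it can be run as-is. The table name emp, the column name Val (standing in for Values) and the function pivot_by_role are assumptions for illustration only.

```python
import sqlite3


def pivot_by_role(conn):
    """Return (header, rows) with one output column per distinct Role value."""
    # Discover the roles at run time; they could be A/B/C, M/N, X/Y/Z, ...
    roles = [r[0] for r in conn.execute("SELECT DISTINCT Role FROM emp ORDER BY Role")]

    # One conditional aggregate per role. The role value is bound as a
    # parameter; only the (quoted) alias is spliced into the SQL text.
    cases = ", ".join(
        'MAX(CASE WHEN Role = ? THEN Val END) AS "{}"'.format(role.replace('"', '""'))
        for role in roles
    )
    sql = f"SELECT EmpId, Project, {cases} FROM emp GROUP BY EmpId, Project"
    cur = conn.execute(sql, roles)
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()


# Hypothetical copy of the "Actual Table" from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (EmpId INT, Project TEXT, Title TEXT, Role TEXT, Val INT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?, ?)",
    [(1, "aaa", "xxx", "A", 100),
     (1, "aaa", "yyy", "B", 120),
     (1, "aaa", "zzz", "C", 90)],
)

header, rows = pivot_by_role(conn)
print(header)  # ['EmpId', 'Project', 'A', 'B', 'C']
print(rows)    # [(1, 'aaa', 100, 120, 90)]
```

In SQL Server the equivalent would be dynamic SQL that builds the IN (...) list for PIVOT; the idea of reading the role names first and generating the column list from them is the same.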

Related

Grouping over the subquery in SQL on unique id

I have a query that gets its results from a temp table. It has aggregate columns derived from the temp table:
SELECT
DISTINCT
SUM(a),
SUM(b),
c,
d,
id1
FROM
#tmpTable
.
.
.
join with many other tables
I now want to get the SUM of columns c and d returned from the query, along with all the other columns, grouped by id1. The current output looks something like this:
+--------+--------+---+---+-----+
| Sum(A) | Sum(B) | C | D | id1 |
+--------+--------+---+---+-----+
| 12     | 34     | 1 | 3 | 1   |
| 22     | 37     | 2 | 4 | 2   |
| 33     | 55     | 3 | 5 | 1   |
| 44     | 25     | 5 | 6 | 2   |
+--------+--------+---+---+-----+
Final result should be this:
+--------+--------+--------+--------+-----+
| Sum(A) | Sum(B) | Sum(C) | Sum(D) | id1 |
+--------+--------+--------+--------+-----+
| 12     | 34     | 4      | 8      | 1   |
| 22     | 37     | 7      | 10     | 2   |
| 33     | 55     | 4      | 8      | 1   |
| 44     | 25     | 7      | 10     | 2   |
+--------+--------+--------+--------+-----+
select
    x.sum_a,
    x.sum_b,
    x.sum_c,
    x.sum_d,
    t.id1
from
    #tmpTable t
join
    (
        -- one row of totals per id1
        select
            id1,
            sum(A) as sum_a,
            sum(B) as sum_b,
            sum(C) as sum_c,
            sum(D) as sum_d
        from
            #tmpTable
        group by
            id1
    ) x on t.id1 = x.id1
Seeing as you have different grouping criteria for A and B, you can group them separately from C and D. The query below (using a common table expression) might start you on the right track:
; with SummaryValues AS
(
select id1, sum(C) as SumC, SUM(D) as SumD
from #SourceTable
group by id1
)
select SUM(st.A), SUM(st.b), sv.SumC, sv.SumD, st.id1
from #SourceTable st
inner join SummaryValues sv
on st.id1 = sv.id1
group by <whatever grouping you are using>
If your current real query is summing up a and b the way you want and generating that first sample output, maybe something like:
SELECT DISTINCT
SUM(a),
SUM(b),
SUM(c) OVER (PARTITION BY id1),
SUM(d) OVER (PARTITION BY id1),
id1
FROM
#tmpTable
.
.
.
join with many other tables
to get the second one.
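As a small runnable check of the windowed SUM idea, the sketch below loads the four rows of the first sample output into a table and applies SUM(...) OVER (PARTITION BY id1) to c and d. It uses SQLite (3.25 or later, for window functions) from Python. The table name first_output and its column names are hypothetical stand-ins for the result of the real query, whose joins are not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table standing in for the first sample output of the question:
# sum_a and sum_b are already aggregated, c and d are still per-row values.
conn.execute("CREATE TABLE first_output (sum_a INT, sum_b INT, c INT, d INT, id1 INT)")
conn.executemany(
    "INSERT INTO first_output VALUES (?, ?, ?, ?, ?)",
    [(12, 34, 1, 3, 1),
     (22, 37, 2, 4, 2),
     (33, 55, 3, 5, 1),
     (44, 25, 5, 6, 2)],
)

# The window functions total c and d per id1 without collapsing the rows.
final_rows = conn.execute(
    """
    SELECT sum_a,
           sum_b,
           SUM(c) OVER (PARTITION BY id1) AS sum_c,
           SUM(d) OVER (PARTITION BY id1) AS sum_d,
           id1
    FROM first_output
    ORDER BY sum_a
    """
).fetchall()

for row in final_rows:
    print(row)
# (12, 34, 4, 8, 1)
# (22, 37, 7, 10, 2)
# (33, 55, 4, 8, 1)
# (44, 25, 7, 10, 2)
```

This matches the desired final result: every row keeps its own Sum(A) and Sum(B), while Sum(C) and Sum(D) repeat the per-id1 totals.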

Impala - Does impala allow multi GROUP_CONCAT in one query

For example, I have a table below
+-----------+-------+------------+
| Id | a| b|
+-----------+-------+------------+
| 1 | 6 | 20 |
| 1 | 4 | 55 |
| 1 | 9 | 56 |
| 1 | 2 | 67 |
| 1 | 7 | 80 |
| 1 | 5 | 66 |
| 1 | 3 | 33 |
| 1 | 8 | 34 |
| 1 | 1 | 52 |
+-----------+-------+------------+
I want the output to look like the table below, using Impala:
+-----------+-------------------+-----------------------------+
| Id | a | b |
+-----------+-------------------+-----------------------------+
| 1 | 6,4,9,2,7,5,3,8,1 | 20,55,56,67,80,66,33,34,52 |
+-----------+-------------------+-----------------------------+
In Impala, I have used
SELECT Id,
group_concat(DISTINCT a) AS a,
group_concat(DISTINCT b) AS b
FROM table GROUP BY Id
This always gives a syntax error. Is it not allowed to use multiple group_concat calls in one query in Impala, or is it multiple DISTINCTs in one query that is not allowed?
From the documentation for GROUP_CONCAT:
You cannot apply the DISTINCT operator to the argument of this function.
But, as workaround, we can use two separate subqueries to find the distinct values:
WITH cte1 AS (
SELECT Id, GROUP_CONCAT(a) AS a
FROM (SELECT DISTINCT Id, a FROM yourTable) t
GROUP BY Id
),
cte2 AS (
SELECT Id, GROUP_CONCAT(b) AS b
FROM (SELECT DISTINCT Id, b FROM yourTable) t
GROUP BY Id
)
SELECT
t1.Id,
t1.a,
t2.b
FROM cte1 t1
INNER JOIN cte2 t2
ON t1.Id = t2.Id;
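To see the shape of this workaround on the sample data, here is a sketch that runs the same two-CTE query. Note the assumption: it runs on SQLite from Python, not on Impala, so it only illustrates the de-duplicate first, concatenate second pattern and says nothing about Impala's own limits. The extra duplicate row is hypothetical and was added so that the de-duplication is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Id INT, a INT, b INT)")
sample = [
    (1, 6, 20), (1, 4, 55), (1, 9, 56), (1, 2, 67), (1, 7, 80),
    (1, 5, 66), (1, 3, 33), (1, 8, 34), (1, 1, 52),
    (1, 6, 20),  # hypothetical duplicate row, not in the question
]
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", sample)

# Each CTE de-duplicates one column in a subquery, then concatenates it.
result = conn.execute(
    """
    WITH cte1 AS (
        SELECT Id, GROUP_CONCAT(a) AS a
        FROM (SELECT DISTINCT Id, a FROM yourTable) t
        GROUP BY Id
    ),
    cte2 AS (
        SELECT Id, GROUP_CONCAT(b) AS b
        FROM (SELECT DISTINCT Id, b FROM yourTable) t
        GROUP BY Id
    )
    SELECT t1.Id, t1.a, t2.b
    FROM cte1 t1
    INNER JOIN cte2 t2 ON t1.Id = t2.Id
    """
).fetchone()

# The order inside each concatenated string is not guaranteed, so compare sorted values.
a_values = sorted(int(x) for x in result[1].split(","))
b_values = sorted(int(x) for x in result[2].split(","))
print(result[0], a_values, b_values)
```

Each value appears once in the output even though the duplicated row is in the table.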

Getting the last updated name

I have a table with records like this:
+------+------+
| ID | name |
+------+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
+------+------+
I need to get the row where the name last changed to A from a different value; in this example that is the row with ID 6.
Try this query (MySQL syntax):
select min(ID)
from records
where name = 'A'
and ID >=
(
select max(ID)
from records
where name <> 'A'
);
Illustration:
select * from records;
+------+------+
| ID | name |
+------+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A |
| 5 | B |
| 6 | A |
| 7 | A |
| 8 | A |
+------+------+
-- run query:
+---------+
| min(ID) |
+---------+
| 6 |
+---------+
Using the Lag function...
SELECT Max([ID])
FROM (SELECT [name], [ID],
Lag([name]) OVER (ORDER BY [ID]) AS PrvVal
FROM tablename) tbl
WHERE [name] = 'A'
AND prvval <> 'A'
Online Demo: http://www.sqlfiddle.com/#!18/a55eb/2/0
If you want to get the whole row, you can do this...
SELECT Top 1 *
FROM (SELECT [name], [ID],
Lag([name]) OVER (ORDER BY [ID]) AS PrvVal
FROM tablename) tbl
WHERE [name] = 'A' AND prvval <> 'A'
ORDER BY [ID] DESC
Online Demo: http://www.sqlfiddle.com/#!18/a55eb/22/0
The ANSI SQL below uses a self-join on the previous ID.
The WHERE clause keeps the rows whose name differs from the previous row's name.
select max(t1.ID) as ID
from YourTable as t1
left join YourTable as t2 on t1.ID = t2.ID+1
where (t1.name <> t2.name or t2.name is null)
and t1.name = 'A';
It should work on most RDBMSs, including MS SQL Server.
Note that the ID+1 join assumes there are no gaps between the IDs.
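If the IDs can have gaps, the LAG-based query does not depend on consecutive numbering. Below is a minimal sketch of it, run on SQLite (3.25 or later) from Python against the sample data and against a second, hypothetical table whose IDs are not consecutive. The function name last_switch_id and the table names are made up for this illustration. A first row with no predecessor is treated as a change, in the same way as the self-join answer does.

```python
import sqlite3


def last_switch_id(conn, table, name):
    """ID of the last row where `name` starts after a different (or no) previous name."""
    sql = f"""
        SELECT MAX(ID)
        FROM (SELECT ID, name,
                     LAG(name) OVER (ORDER BY ID) AS prev_name
              FROM {table}) AS t
        WHERE name = ?
          AND (prev_name <> ? OR prev_name IS NULL)
    """
    return conn.execute(sql, (name, name)).fetchone()[0]


conn = sqlite3.connect(":memory:")

# Sample data from the question.
conn.execute("CREATE TABLE records (ID INT, name TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C"), (4, "A"), (5, "B"), (6, "A"), (7, "A"), (8, "A")],
)

# Hypothetical data with gaps between the IDs.
conn.execute("CREATE TABLE gappy (ID INT, name TEXT)")
conn.executemany(
    "INSERT INTO gappy VALUES (?, ?)",
    [(10, "A"), (20, "B"), (30, "A"), (45, "A")],
)

print(last_switch_id(conn, "records", "A"))  # 6
print(last_switch_id(conn, "gappy", "A"))    # 30
```

The table name is spliced into the SQL text here, so it should only ever come from trusted code, never from user input.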

Access Queries comparing two tables

I have two tables in Access, Table A (MasterLockInsNew) and Table B (InitialPolData):
Table MasterLockInsNew:
+----+-------+----------+
| ID | Value | Date |
+----+-------+----------+
| 1 | 123 | 12/02/13 |
| 2 | 1231 | 11/02/13 |
| 4 | 1265 | 16/02/13 |
+----+-------+----------+
Table InitialPolData:
+----+-------+----------+------+
| ID | Value | Date     | Type |
+----+-------+----------+------+
| 1  | 123   | 12/02/13 | x    |
| 2  | 1231  | 11/02/13 | x    |
| 3  | 1238  | 10/02/13 | y    |
| 4  | 1265  | 16/02/13 | a    |
| 7  | 7649  | 18/02/13 | z    |
+----+-------+----------+------+
All I want are the rows from table B whose IDs are not contained in table A. My current code looks like this:
SELECT Distinct InitialPolData.*
FROM InitialPolData
WHERE InitialPolData.ID NOT IN (SELECT Distinct InitialPolData.ID
from InitialPolData INNER JOIN
MasterLockInsNew
ON InitialPolData.ID=MasterLockInsNew.ID);
But whenever I run this in Access it crashes!! The tables are fairly large but I don't think this is the reason.
Can anyone help?
Thanks
Or try a left outer join:
SELECT b.*
FROM InitialPolData b left outer join
MasterLockInsNew a on
b.id = a.id
where
a.id is null
A simple subquery will do.
select * from InitialPolData
where id not in (
select id from MasterLockInsNew
);
Try using NOT EXISTS:
SELECT Distinct i.*
FROM InitialPolData AS i
WHERE NOT EXISTS (SELECT 1
FROM MasterLockInsNew AS m
WHERE m.ID = i.ID)
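For comparison, here is a sketch that runs the three suggested forms (left join with IS NULL, NOT IN, NOT EXISTS) on the sample tables and checks that they agree. It uses SQLite from Python rather than Access, so it shows only that the queries are logically equivalent on this data and says nothing about Access performance or the crash. One general caveat: NOT IN returns no rows at all if the subquery produces a NULL, which the other two forms do not suffer from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE MasterLockInsNew (ID INT, "Value" INT, "Date" TEXT)')
conn.execute('CREATE TABLE InitialPolData (ID INT, "Value" INT, "Date" TEXT, "Type" TEXT)')
conn.executemany(
    "INSERT INTO MasterLockInsNew VALUES (?, ?, ?)",
    [(1, 123, "12/02/13"), (2, 1231, "11/02/13"), (4, 1265, "16/02/13")],
)
conn.executemany(
    "INSERT INTO InitialPolData VALUES (?, ?, ?, ?)",
    [(1, 123, "12/02/13", "x"), (2, 1231, "11/02/13", "x"), (3, 1238, "10/02/13", "y"),
     (4, 1265, "16/02/13", "a"), (7, 7649, "18/02/13", "z")],
)

# The three anti-join forms from the answers, reduced to the ID column.
queries = {
    "left_join": """
        SELECT b.ID
        FROM InitialPolData b
        LEFT OUTER JOIN MasterLockInsNew a ON b.ID = a.ID
        WHERE a.ID IS NULL
        ORDER BY b.ID
    """,
    "not_in": """
        SELECT ID FROM InitialPolData
        WHERE ID NOT IN (SELECT ID FROM MasterLockInsNew)
        ORDER BY ID
    """,
    "not_exists": """
        SELECT i.ID
        FROM InitialPolData AS i
        WHERE NOT EXISTS (SELECT 1 FROM MasterLockInsNew AS m WHERE m.ID = i.ID)
        ORDER BY i.ID
    """,
}

results = {label: [row[0] for row in conn.execute(sql)] for label, sql in queries.items()}
print(results)  # every entry is [3, 7]
```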

I need a specific output

I have to get a specific output format from my tables.
Let's say I have a simple table with 2 columns name and value.
table T1
+---------------+------------------+
| Name | Value |
+---------------+------------------+
| stuff1 | 1 |
| stuff1 | 1 |
| stuff2 | 2 |
| stuff3 | 1 |
| stuff2 | 4 |
| stuff2 | 2 |
| stuff3 | 4 |
+---------------+------------------+
I know the values are in the interval 1-4. I group it by name and value and count number of the same rows as Number and get the following table:
table T2
+---------------+------------------+--------+
| Name | Value | Number |
+---------------+------------------+--------+
| stuff1 | 1 | 2 |
| stuff2 | 2 | 2 |
| stuff3 | 1 | 1 |
| stuff3 | 4 | 1 |
+---------------+------------------+--------+
Here is the part where I need your help. What should I do if I want to get this format?
table T3
+---------------+------------------+--------+
| Name | Value | Number |
+---------------+------------------+--------+
| stuff1 | 1 | 2 |
| stuff1 | 2 | 0 |
| stuff1 | 3 | 0 |
| stuff1 | 4 | 0 |
| stuff2 | 1 | 0 |
| stuff2 | 2 | 2 |
| stuff2 | 3 | 0 |
| stuff2 | 4 | 0 |
| stuff3 | 1 | 1 |
| stuff3 | 2 | 0 |
| stuff3 | 3 | 0 |
| stuff3 | 4 | 1 |
+---------------+------------------+--------+
Thanks for any suggestions!
You start with a cross join to generate all possible combinations and then left-join in the results from your existing query:
select n.name, v.value, coalesce(nv.cnt, 0) as "Number"
from (select distinct name from table t) n cross join
(select distinct value from table t) v left outer join
(select name, value, count(*) as cnt
from table t
group by name, value
) nv
on nv.name = n.name and nv.value = v.value;
Variation on the theme.
Differences from the existing answers by Gordon Linoff and Owen:
I prefer GROUP BY to get the names rather than DISTINCT. This may perform better in a case like this. (See Rob Farley's still relevant article.)
I explode the subqueries into a series of CTEs for clarity.
I use table T2, as the question now labels the grouped result set, instead of showing it as a subquery.
WITH PossibleValue AS (
SELECT 1 Value
UNION ALL
SELECT Value + 1
FROM PossibleValue
WHERE Value < 4
),
Name AS (
SELECT Name
FROM T1
GROUP BY Name
),
NameValue AS (
SELECT Name
,Value
FROM Name
CROSS JOIN
PossibleValue
)
SELECT nv.Name
,nv.Value
,ISNULL(T2.Number,0) Number
FROM NameValue nv
LEFT JOIN
T2 ON nv.Name = T2.Name
AND nv.Value = T2.Value
Yet another solution, this time using a Table Value Constructor in a CTE to build a table of name value combinations.
WITH value AS
( SELECT DISTINCT t.name, v.value
FROM T1 AS t
CROSS JOIN (VALUES (1),(2),(3),(4)) AS v (value)
)
SELECT v.name AS 'Name', v.value AS 'Value', COUNT(t.name) AS 'Number'
FROM value AS v
LEFT JOIN T1 AS t ON t.value = v.value AND t.name = v.name
GROUP BY v.name, v.value, t.name;
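The cross join plus left join pattern from these answers can be tried end to end with the sketch below. It assumes SQLite driven from Python, with a recursive CTE generating the values 1 to 4, and the CTE names PossibleValue and Names are just illustrative. One thing to be aware of: with the T1 rows exactly as listed in the question, the pair stuff2/4 occurs once, so this query reports 1 for it, whereas the T2 and T3 tables in the question omit that row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE T1 (Name TEXT, "Value" INT)')
conn.executemany(
    "INSERT INTO T1 VALUES (?, ?)",
    [("stuff1", 1), ("stuff1", 1), ("stuff2", 2), ("stuff3", 1),
     ("stuff2", 4), ("stuff2", 2), ("stuff3", 4)],
)

rows = conn.execute(
    """
    WITH RECURSIVE PossibleValue(v) AS (
        SELECT 1
        UNION ALL
        SELECT v + 1 FROM PossibleValue WHERE v < 4   -- known interval 1-4
    ),
    Names AS (
        SELECT Name FROM T1 GROUP BY Name
    )
    SELECT n.Name,
           p.v,
           COUNT(t.Name) AS Number        -- counts 0 when the left join finds nothing
    FROM Names n
    CROSS JOIN PossibleValue p            -- every name paired with every value
    LEFT JOIN T1 t ON t.Name = n.Name AND t."Value" = p.v
    GROUP BY n.Name, p.v
    ORDER BY n.Name, p.v
    """
).fetchall()

for row in rows:
    print(row)
```

COUNT(t.Name) is used instead of COUNT(*) so that combinations with no matching row count as 0 rather than 1.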