How to use GROUP BY with UNION in T-SQL

How can I use GROUP BY with UNION in T-SQL? I want to group by the first column of the result of a UNION. I wrote the following SQL, but it doesn't work: I don't know how to reference a specific column (in this case, column 1) of the union result.
SELECT *
FROM ( SELECT a.id ,
a.time
FROM dbo.a
UNION
SELECT b.id ,
b.time
FROM dbo.b
)
GROUP BY 1

You need to alias the subquery. Thus, your statement should be:
Select Z.id
From (
Select id, time
From dbo.tablea
Union All
Select id, time
From dbo.tableb
) As Z
Group By Z.id

GROUP BY 1
As far as I know, T-SQL's GROUP BY does not accept ordinals; only ORDER BY does (MySQL and PostgreSQL do allow GROUP BY 1). Either way, SQL Server requires every selected column that is not inside an aggregate function to appear in the GROUP BY; MySQL is the notable exception, and only when ONLY_FULL_GROUP_BY is disabled. Ordinals aren't recommended practice anyway: they depend on the order of the SELECT list, so if that list changes, the meaning of your ORDER BY (or GROUP BY, where supported) silently changes with it.
There's no need to run GROUP BY on the contents when you're using UNION: UNION already removes duplicate rows. UNION ALL is faster because it doesn't, and in that case you would need the GROUP BY (or a DISTINCT) to remove duplicates yourself.
Your query only needs to be:
SELECT a.id,
a.time
FROM dbo.TABLE_A a
UNION
SELECT b.id,
b.time
FROM dbo.TABLE_B b
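A small sketch of the difference between the two operators, using inline VALUES lists as hypothetical stand-ins for the two tables (the names a, b, and id here are illustrative only, not the tables from the question):

```sql
-- UNION removes duplicate rows from the combined result: returns 1, 2, 3
SELECT id FROM (VALUES (1), (1), (2)) AS a(id)
UNION
SELECT id FROM (VALUES (2), (3)) AS b(id);

-- UNION ALL keeps every row: returns 1, 1, 2, 2, 3 (five rows, in no guaranteed order)
SELECT id FROM (VALUES (1), (1), (2)) AS a(id)
UNION ALL
SELECT id FROM (VALUES (2), (3)) AS b(id);
```

Note that UNION compares whole rows, so with two columns (id, time) it only removes rows where both values match.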

Identifying the column is easy:
SELECT *
FROM ( SELECT id,
time
FROM dbo.a
UNION
SELECT id,
time
FROM dbo.b
) AS u -- T-SQL requires an alias on a derived table
GROUP BY id
But it doesn't solve the main problem with this query: what should happen to the second column's values when you group by the first? Since (peculiarly!) you're using UNION rather than UNION ALL, the combined result contains no fully duplicated rows, but you may very well still have several values of time for one value of id, and you give no hint of what you want to do with them: MIN, MAX, AVG, SUM, or what?! The SQL engine should raise an error because of that (some engines, such as MySQL in its permissive mode, just pick an arbitrary value out of the several; SQL Server is stricter and rejects the query).
So, for example, change the first line to SELECT id, MAX(time) or the like!
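Putting those pieces together, one possible version of the query might look like the sketch below. It assumes the latest time per id is what you want; the alias u and the output name latest_time are made up for illustration. UNION ALL is used because the GROUP BY collapses duplicate ids anyway, and MAX is unaffected by duplicate rows.

```sql
SELECT u.id,
       MAX(u.[time]) AS latest_time   -- swap MAX for MIN, AVG, ... as needed
FROM ( SELECT id, [time] FROM dbo.a
       UNION ALL
       SELECT id, [time] FROM dbo.b
     ) AS u                           -- the derived table must have an alias
GROUP BY u.id;
```

The brackets around time are optional; they just make it obvious that the column name is not meant as the time data type.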

with UnionTable as
(
SELECT a.id, a.time FROM dbo.a
UNION
SELECT b.id, b.time FROM dbo.b
) SELECT id FROM UnionTable GROUP BY id

Related

Union of multiple queries using the count function

I'm working on learning more about how the UNION operator works in SQL Server.
I've got a query that is directed at a single table:
SELECT Category, COUNT(*) AS Number
FROM Table1
GROUP BY Category;
This returns the number of entries for each distinct line in the Category column.
I have multiple tables that are organized by this Category column and I'd like to be able to have the results for every table returned by one query.
It seems like UNION will accomplish what I want it to do but the way I've tried implementing the query doesn't work with COUNT(*).
SELECT *
FROM (SELECT Table1.Category
Table1.COUNT(*) AS Number
FROM dbo.Table1
UNION
SELECT Table2.Category
Table2.COUNT(*) AS Number
FROM dbo.Table2) AS a
GROUP BY a.Category
I'm sure there's an obvious reason why this doesn't work but can anyone point out what that is and how I could accomplish what I'm trying to do?
A single outer GROUP BY cannot supply the grouping that COUNT(*) needs inside each SELECT. Each SELECT needs its own GROUP BY clause:
SELECT TABLE1.Category, --missing comma here
COUNT(*) as Number -- Remove TABLE1. alias name
FROM dbo.TABLE1
GROUP BY Category
UNION ALL --UNION
SELECT TABLE2.Category, --missing comma here
COUNT(*) as Number -- Remove TABLE2. alias name
FROM dbo.TABLE2
GROUP BY Category
If you really want to remove duplicate rows from the result, change UNION ALL to UNION.
COUNT, like any aggregate function, needs a GROUP BY when it is selected alongside a non-aggregated column such as Category. You have to use GROUP BY in each subquery separately:
SELECT * FROM (
SELECT TABLE1.Category,
COUNT(*) as Number
FROM dbo.TABLE1
GROUP BY TABLE1.Category
UNION ALL
SELECT TABLE2.Category,
COUNT(*) as Number
FROM dbo.TABLE2
GROUP BY TABLE2.Category
) as a
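UNION ALL is the better choice here than UNION. UNION eliminates duplicate rows from the combined result, so if, say, both tables return the same Category with the same count, one of those rows would be dropped. Since you want to merge both results as they are, UNION ALL is safer.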
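If the goal is a single total per Category across both tables, which the outer GROUP BY in the question seems to be aiming for, one way to sketch it is to sum the per-table counts in the outer query. This assumes the totals should be added together; the table and column names are taken from the question.

```sql
SELECT a.Category,
       SUM(a.Number) AS Number        -- add up the per-table counts
FROM ( SELECT Category, COUNT(*) AS Number
       FROM dbo.Table1
       GROUP BY Category
       UNION ALL                      -- keep both rows even if the counts are equal
       SELECT Category, COUNT(*) AS Number
       FROM dbo.Table2
       GROUP BY Category
     ) AS a
GROUP BY a.Category;
```

With UNION instead of UNION ALL, a Category that has the same count in both tables would contribute only one row, and the sum would come out too low.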

SELECT statement on top of UNION statement

select *
from
{
SELECT
ID, CLASS, CHANGE_NUMBER AS OBJECT_NUMBER
FROM table_A
UNION
SELECT
ID, CLASS, CUST_NO AS OBJECT_NUMBER
FROM table_B
ORDER BY ID
} x where x.id ='5434';
Help me to run this query.
I am getting error "invalid table name"
I would suggest writing the query like this:
select x.*
from (SELECT ID, CLASS, CHANGE_NUMBER AS OBJECT_NUMBER FROM table_A
UNION ALL
SELECT ID, CLASS, CUST_NO AS OBJECT_NUMBER FROM table_B
) x
where x.id = '5434';
Notes:
The curly braces are probably your syntax problem.
Use UNION ALL instead of UNION, unless you really want to incur the overhead of removing duplicates.
The ORDER BY is not needed. After all, you are only choosing one id.
If you do have an ORDER BY, it is better practice to put it in the outer query than in the subquery.
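As a sketch of the last note: if the outer query returned more than one row and the order mattered, the ORDER BY would go at the very end of the outer query rather than inside the parentheses. The query below drops the filter on id purely for illustration; the table and column names come from the question.

```sql
select x.*
from (SELECT ID, CLASS, CHANGE_NUMBER AS OBJECT_NUMBER FROM table_A
      UNION ALL
      SELECT ID, CLASS, CUST_NO AS OBJECT_NUMBER FROM table_B
     ) x
order by x.ID;   -- sorts the final, combined result
```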
Use parentheses '(' and ')' instead of curly braces '{' and '}'.
select * from
(
SELECT ID,CLASS, CHANGE_NUMBER AS OBJECT_NUMBER FROM table_A
UNION
SELECT ID,CLASS,CUST_NO AS OBJECT_NUMBER FROM table_B
ORDER BY ID
) x where x.id ='5434';

Aggregate two columns and rows into one

I have the following table structure
start|end
09:00|11:00
13:00|14:00
I know
SELECT ARRAY_AGG(start), ARRAY_AGG(end)
Will result in
start|end
[09:00,13:00]|[11:00,14:00]
But how can I get the following result?
result
[09:00,11:00,13:00,14:00]
BTW, I'm using Postgres
You could do array concatenation (if order is not important):
-- "end" is quoted because END is a reserved word in PostgreSQL
SELECT ARRAY_AGG(start) || ARRAY_AGG("end") FROM TABLE1
If order is important, you could use Gordon's approach, with two changes:
add an ORDER BY inside the aggregate: array_agg(d ORDER BY d ASC)
use unnest instead of UNION ALL, because Gordon's solution (UNION ALL) performs two sequential scans. If the table is big, it could be better for performance to use:
SELECT array_agg(d ORDER BY d ASC) FROM (
SELECT unnest(ARRAY[start] || ARRAY["end"]) AS d FROM table1
) sub
which performs only one sequential scan on the table (and so should be faster).
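To try the single-scan idea without creating a table, here is a self-contained sketch for PostgreSQL. The CTE named table1 is a hypothetical stand-in holding the two sample rows from the question, the columns are assumed to be of type time, and "end" is quoted because END is a reserved word.

```sql
WITH table1(start, "end") AS (
    VALUES ('09:00'::time, '11:00'::time),
           ('13:00'::time, '14:00'::time)
)
SELECT array_agg(d ORDER BY d) AS result   -- sort inside the aggregate
FROM table1,
     unnest(ARRAY[start, "end"]) AS d;     -- one output row per array element
```

Each input row is expanded into two rows (its start and its end), and the aggregate then collects all four times into one array in ascending order.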
One method is to unpivot them and then aggregate:
select array_agg(d)
from (select start as d from t
union all
select "end" as d from t
) t;
A similar method uses a cross join:
select array_agg(case when n.n = 1 then t.start else t."end" end)
from t cross join
(select 1 as n union all select 2) n;
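Assuming start and end are character types (called strt and en here, in a table b), you can join them into one string and split it back into an array:
select string_to_array(string_agg(strt::text || ',' || en::text, ','), ',') as result
from b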

order by only one dataset of a union in a tsql union of datasets

I have the following problem.
Let TableA(Id int, Name nvarchar(200)) and TableB(Id int, Name nvarchar(200)).
If we run the following query:
SELECT *
FROM
(SELECT *
FROM TableA)
UNION
(SELECT *
FROM TableB)
we get the union of the two datasets.
My problem is that I want the results of the second dataset to be ordered by the Name column.
I need this because TableA is a temporary table in my query that will always hold exactly one record, and I want that record to come first in the result of the union. I also want the multiple records of TableB to be ordered by the Name column.
Unfortunately, when I try to execute the following query
SELECT *
FROM
(SELECT *
FROM TableA)
UNION
(SELECT *
FROM TableB
ORDER BY Name)
I get an ambiguous error message, that informs me that I have an incorrect syntax near the keyword order.
Thanks in advance for any help.
Try this:
select id
, name
from
(select 1 as ordercol
, a.id
, a.name
from tableA a
union
select 2 as ordercol
, b.id
, b.name
from tableB b) i
order by ordercol, name
The error message resulted from trying to UNION two subselects. Instead, put the UNION between two plain SELECTs and wrap the whole thing in a single subselect; a UNION (or UNION ALL) is always followed by a SELECT. I would also suggest using UNION ALL, which saves time because SQL Server will otherwise try to remove rows that appear in both SELECTs (which is impossible here because of the ordercol column).
I have included an extra column, ordercol, that sorts the rows of the first SELECT before those of the second. If you order by that first and then by name, you should get the desired result.
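A sketch of the same idea with UNION ALL, as suggested above. The table and column names come from the question; the aliases a, b, and i and the helper column ordercol are illustrative.

```sql
SELECT i.Id,
       i.Name
FROM ( SELECT 1 AS ordercol, a.Id, a.Name
       FROM TableA AS a
       UNION ALL                       -- no duplicate check needed: ordercol differs
       SELECT 2 AS ordercol, b.Id, b.Name
       FROM TableB AS b
     ) AS i
ORDER BY i.ordercol, i.Name;           -- TableA's row first, then TableB by Name
```

Because TableA holds a single row, sorting it by Name as well has no visible effect, so one ORDER BY on the outer query covers both requirements.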

Creating SQL UNION where second side of the union depends on first side

I would like to perform a union of two queries where the second query depends on the first:
SELECT * FROM company_res t1
UNION
SELECT * FROM company_res t2
WHERE t2.company_id IN (
SELECT c.id
FROM company c
WHERE c.parent_id = t1.company_id
)
ORDER BY company_id, year_code
However, when I run this query in psql, I get an error to the effect that t1 in the second query does not have a FROM-clause entry.
Is it possible to have a UNION of two queries that depend on each other?
From your partial example, I think you're trying to write a recursive query rather than a classical UNION query; it's an advanced form of UNION, in fact.
You need to select some starting rows from company_res and then add the rows for the companies whose parent is one of those companies.
The basic form is:
WITH RECURSIVE t(n) AS (
SELECT 1
UNION ALL
SELECT n+1 FROM t
)
SELECT n FROM t LIMIT 100;
In your case, maybe something like this:
WITH RECURSIVE rectable(
company_id,
field2,
field3,
parent_id) AS (
-- here the starting rows, t1 in your example
SELECT
company_res.company_id,
company_res.field2,
company_res.field3,
company.parent_id
FROM company_res
INNER JOIN company ON company_res.company_id=company.id
WHERE (here any condition on the starting points)
UNION ALL
-- here the recursive part
SELECT
orig.company_id,
orig.field2,
orig.field3,
company.parent_id
FROM rectable rec,company_res orig
INNER JOIN company ON orig.company_id=company.id
WHERE company.parent_id=rec.company_id
-- here you could add some AND sections if you want
)
SELECT company_id,field2, field3,parent_id
FROM rectable
ORDER BY parent_id;
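A more compact variant, as a sketch only: walk the company hierarchy first, then join the result to company_res. It assumes company has the columns id and parent_id and company_res has company_id and year_code, as in the question; the CTE name subtree and the starting id 1 are hypothetical.

```sql
WITH RECURSIVE subtree(id) AS (
    -- starting point: one company (hypothetical id)
    SELECT c.id
    FROM company c
    WHERE c.id = 1
    UNION ALL
    -- recursive step: children of the companies found so far
    SELECT c.id
    FROM company c
    INNER JOIN subtree s ON c.parent_id = s.id
)
SELECT r.*
FROM company_res r
INNER JOIN subtree s ON r.company_id = s.id
ORDER BY r.company_id, r.year_code;
```

This returns the rows for the starting company and all of its descendants. If the hierarchy could contain cycles, using UNION instead of UNION ALL in the CTE would stop the recursion from looping over the same ids.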
The SELECT * FROM company_res t1 in your query is going to provide you with everything from company_res, regardless of what else you UNION it with from company_res. I doubt that's what you're looking for. See the answer from shahkalpesh.