Group aggregation and descriptive columns

A group is defined by columns a, b and c. Columns x, y and z have the same values within each group. Sample:
a b c x y z ....
1 1 1 p r s
1 1 1 p r s
1 1 1 p r s
2 1 2 t u v
2 1 2 t u v
I would like to achieve the following, but without using an aggregate function (max(t.x), ...):
select t.a, t.b, t.c,count(*), t.x, t.y, t.z, ....
from t
group by t.a, t.b, t.c;
Is there any other function that can be used in the select statement to include columns x,y and z?
Or would you rather use another join to add the descriptive columns?

If the columns are the same within a group, just include them in the group by clause:
select t.a, t.b, t.c,count(*), t.x, t.y, t.z, ....
from t
group by t.a, t.b, t.c, t.x, t.y, t.z
If you want an arbitrary row from each group along with the count, use window functions:
select t.*
from (select t.*,
count(*) over (partition by a, b, c) as cnt,
row_number() over (partition by a, b, c order by (select NULL)) as seqnum
from t
) t
where seqnum = 1
The order by (select NULL) is used in SQL Server. I'm not sure if it will work in Netezza. Any expression will work for the order by.
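As a rough check of the two approaches above, here is a minimal sketch that runs them against the sample data. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions) rather than SQL Server or Netezza, and it orders by a column inside the partition instead of (select NULL).

```python
import sqlite3

# In-memory table holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INT, b INT, c INT, x TEXT, y TEXT, z TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 1, 1, "p", "r", "s")] * 3 + [(2, 1, 2, "t", "u", "v")] * 2,
)

# Approach 1: add the descriptive columns to the GROUP BY list.
grouped = conn.execute("""
    SELECT a, b, c, COUNT(*) AS cnt, x, y, z
    FROM t
    GROUP BY a, b, c, x, y, z
    ORDER BY a, b, c
""").fetchall()

# Approach 2: count per partition and keep one arbitrary row per group.
windowed = conn.execute("""
    SELECT a, b, c, cnt, x, y, z
    FROM (SELECT t.*,
                 COUNT(*) OVER (PARTITION BY a, b, c) AS cnt,
                 ROW_NUMBER() OVER (PARTITION BY a, b, c ORDER BY a) AS seqnum
          FROM t) AS s
    WHERE seqnum = 1
    ORDER BY a, b, c
""").fetchall()

print(grouped)
print(windowed)
```

Both queries return one row per (a, b, c) group here because x, y and z really are constant within each group. If they were not, the first query would split the group into several rows, while the second would still return exactly one.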

Related

SQL Server query for all columns with group by and having

Is there a way to query all columns with group by and having in SQL Server? For example, I have 6 columns, a, b,…,f, and this is something I want to get:
Select *
From table
Group by table.b, table.c
Having max(table.d)=table.d
This works in Sybase. I'm migrating from Sybase to SQL Server and I'm not sure what I can do in the new environment. Thanks.

Why do you want to group by when you don't use any aggregate functions in your select? Just use the following code to get all columns of the table:
select * from table
GROUP BY is only used when you have aggregate functions (e.g. max(), avg(), count(), ...) in your select.
HAVING filters on aggregated values, while WHERE filters on the ordinary columns of the table.
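To illustrate the difference between the two filters, here is a small sketch using Python's built-in sqlite3 module. The table tbl and its data are hypothetical and only serve the example: WHERE removes rows before grouping, and HAVING removes groups after aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, not taken from the question.
conn.execute("CREATE TABLE tbl (b TEXT, c INT, d INT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [("x", 1, 5), ("x", 1, 12), ("x", 0, 100), ("y", 1, 3), ("y", 2, 4)],
)

rows = conn.execute("""
    SELECT b, MAX(d) AS max_d
    FROM tbl
    WHERE c > 0            -- row filter: drops ('x', 0, 100) before grouping
    GROUP BY b
    HAVING MAX(d) >= 10    -- group filter: drops group 'y' (max is 4)
""").fetchall()

print(rows)
```

The row with d = 100 never reaches the aggregate because WHERE has already excluded it, so the maximum reported for group 'x' is 12.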

You can use the MIN, MAX, AVG and COUNT functions with an OVER clause to compute aggregated values alongside every row (imitating GROUP BY for each column), and a common table expression (CTE) to filter the results (imitating the HAVING clause):
;With CTE as
(
SELECT
MIN(a) OVER (PARTITION BY a) AS MinCol_a
, MAX(b) OVER (PARTITION BY b) AS MaxCol_b
, AVG(c) OVER (PARTITION BY c) AS AvgCol_c
, COUNT(e) OVER (PARTITION BY d) AS Counte_PerCol_d
FROM Tbl_Test
)
select MinCol_a, MaxCol_b, AvgCol_c, Counte_PerCol_d
from CTE
-- join ...   here you can join the CTE results with other tables
-- where ...  any filter condition, similar to a HAVING clause

If what you want is to get the rows with maximum d for each combination of b and c then use NOT EXISTS:
select t.* from tablename t
where not exists (
select 1 from tablename
where b = t.b and c = t.c and d > t.d
)
or with the rank() window function:
select t.a, t.b, t.c, t.d, t.e, t.f
from (
select *,
rank() over (partition by b, c order by d desc) rn
from tablename
) t
where t.rn = 1
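The following sketch runs both variants side by side to show that they return the same rows, including ties. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later for rank()) and a hypothetical four-column version of tablename with made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: rows 2 and 3 tie for the maximum d in group ('p', 'q').
conn.execute("CREATE TABLE tablename (a INT, b TEXT, c TEXT, d INT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [(1, "p", "q", 10), (2, "p", "q", 30), (3, "p", "q", 30), (4, "r", "s", 5)],
)

# Keep a row only if no other row in the same (b, c) group has a larger d.
not_exists_rows = conn.execute("""
    SELECT t.*
    FROM tablename t
    WHERE NOT EXISTS (
        SELECT 1 FROM tablename u
        WHERE u.b = t.b AND u.c = t.c AND u.d > t.d
    )
    ORDER BY t.a
""").fetchall()

# rank() gives every row that shares the top d in its group the rank 1.
rank_rows = conn.execute("""
    SELECT a, b, c, d
    FROM (SELECT *, RANK() OVER (PARTITION BY b, c ORDER BY d DESC) AS rn
          FROM tablename) t
    WHERE rn = 1
    ORDER BY a
""").fetchall()

print(not_exists_rows)
print(rank_rows)
```

Both queries keep tied rows. Swapping rank() for row_number() would return exactly one row per group instead.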

You can get the result you want without HAVING. Try the following (note that it returns only b, c and the maximum d, not the other columns):
Select table.b, table.c, max(table.d)
From table
Group by table.b, table.c

SQL SELECT DISTINCT CONCAT(ColumnA,'|',ColumnB)

So I have two tables and I want to use a concat on one of the columns in each table.
TableA
ColumnA
1
1
2
3
4
4
5
TableB
ColumnX
a
a
b
c
d
d
e
And I want to concatenate these two columns so they end up looking like the result below, WITHOUT duplicates:
Result
1|a
2|b
3|c
4|d
5|e
So I have tried to do the following:
SELECT DISTINCT CONCAT(ColumnA,'|',ColumnB) where tableA.Relation = TableB.Relation
But I am still getting duplicates. Why?

You could generate row numbers with the ROW_NUMBER() function and then concatenate the columns (SQL Server):
;with cte as
(
SELECT *, ROW_NUMBER() over (order by (select 1)) rn FROM <table>
),cte1 as
(
SELECT *, ROW_NUMBER() over (order by (select 1)) rn FROM <table>
)
select DISTINCT CONCAT(c.ColumnA, '|',c1.ColumnX) from cte c
join cte1 c1 on c1.rn = c.rn

This is similar, only without the CTE:
select distinct A.a + '|' + B.b
from
(select a, Row_Number() over (order by a) as rowNum
from TableA) as A,
(select b, Row_Number() over (order by b) as rowNum
from TableB) as B
where A.rowNum = b.rowNum
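Here is a minimal sketch of the row-number pairing on the sample data from the question. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later), so it concatenates with the || operator instead of CONCAT or +. It also assumes that sorting each column lines the values up positionally, which holds for this sample but not for arbitrary data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ColumnA INT)")
conn.execute("CREATE TABLE TableB (ColumnX TEXT)")
conn.executemany("INSERT INTO TableA VALUES (?)",
                 [(v,) for v in (1, 1, 2, 3, 4, 4, 5)])
conn.executemany("INSERT INTO TableB VALUES (?)",
                 [(v,) for v in ("a", "a", "b", "c", "d", "d", "e")])

# Number the rows of each table, pair them by position, then de-duplicate.
pairs = [row[0] for row in conn.execute("""
    SELECT DISTINCT a.ColumnA || '|' || b.ColumnX AS pair
    FROM (SELECT ColumnA, ROW_NUMBER() OVER (ORDER BY ColumnA) AS rn
          FROM TableA) AS a
    JOIN (SELECT ColumnX, ROW_NUMBER() OVER (ORDER BY ColumnX) AS rn
          FROM TableB) AS b
      ON a.rn = b.rn
    ORDER BY pair
""")]

print(pairs)
```

The seven positional pairs contain two repeats, (1, a) and (4, d), which DISTINCT collapses to the five rows the question asks for.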

Select MAX(DateTime) returning multiple lines

I'm trying to select the status with the latest DateTime from the table "Zee", but if two rows share the same DateTime the query returns both, and I would like to get only the last one (maybe the last inserted?).
Here is the query:
SELECT Z."ID" AS ID,Z."A" AS A,Z."B" AS B,Z."C" AS C,Z."D" AS D
FROM ZEE Z
INNER JOIN
(SELECT ID, A, B, MAX(C) AS C
FROM ZEE
GROUP BY A, B) groupedtt
ON Z.A = groupedtt.A
AND Z.B = groupedtt.B
AND Z.C = groupedtt.C
WHERE (
Z.B = 103
OR Z.B = 104
);

I usually use rank() for such things:
select Z."ID" AS ID, Z."A" AS A, Z."B" AS B, Z."C" AS C, Z."D" AS D
from (select Z.*, rank() over (partition by A, B order by C desc, rownum) r
from ZEE Z
) Z
where Z.r = 1

Use the ROW_NUMBER() analytic function (you will also eliminate the self-join):
SELECT ID, A, B, C, D
FROM (
SELECT ID,
A,
B,
C,
D,
ROW_NUMBER() OVER ( PARTITION BY A, B ORDER BY C DESC ) As rn
FROM ZEE
)
WHERE rn = 1;
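A small sketch of the ROW_NUMBER() approach with an explicit tie-breaker follows. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) rather than Oracle, hypothetical sample rows, and that a larger ID means a later insert, which the question does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ZEE (ID INT, A TEXT, B INT, C TEXT, D TEXT)")
# Hypothetical rows: IDs 2 and 3 share the same latest timestamp in C.
conn.executemany("INSERT INTO ZEE VALUES (?, ?, ?, ?, ?)", [
    (1, "k", 103, "2020-01-01 10:00", "old"),
    (2, "k", 103, "2020-01-02 10:00", "tie1"),
    (3, "k", 103, "2020-01-02 10:00", "tie2"),
    (4, "m", 104, "2020-01-01 09:00", "only"),
])

latest = conn.execute("""
    SELECT ID, A, B, C, D
    FROM (SELECT ZEE.*,
                 ROW_NUMBER() OVER (
                     PARTITION BY A, B
                     ORDER BY C DESC, ID DESC  -- ID breaks ties (assumption)
                 ) AS rn
          FROM ZEE) AS z
    WHERE rn = 1
    ORDER BY ID
""").fetchall()

print(latest)
```

Without the second sort key, ROW_NUMBER() still returns a single row per group, but which of the tied rows it picks is not defined.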

GROUP BY one column to find MAX, but keep value from another column - SQL

A | B | num
----------------------
123 1 2
123 10 5
Result:
A | B | max_num
-------------------------
123 10 5
Let's say the table name is tab, currently I have
SELECT T.A, MAX(T.num) AS max_num
FROM tab T
GROUP BY T.A
However, the result will not contain the column B.
SELECT T.A, T.B... GROUP BY T.A, T.B
Will also not give the desired result, since max is found based on the A,B pair.
How can I choose the max of num grouped by only A, but then keep the value of B for the max row that is chosen?

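1. Select the max num for each group.
2. Filter on it with an IN clause:
select * from tab where
num in (
select MAX(num)
from tab
group by A)
Note that this also matches a row whose num happens to equal the maximum of a different group.
Or, for SQL Server, you can use the ROW_NUMBER() window function to get the single row with the overall maximum (add PARTITION BY A inside OVER to get one row per A):
select * from (
select ROW_NUMBER () OVER (ORDER BY num desc) rn,*
from tab
)d where d.rn=1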

This should do the job:
Select t1.A, T1.B,T1.num from tab t1 where (T1.A,T1.num) in (
SELECT T.A, MAX(T.num) AS max_num
FROM tab T
GROUP BY T.A)
This selects the record where num equals max(num) for each A.

Do you mean you want the whole rows where c = the max(c) value for each a? This one will give both rows if it's a tie:
select a, b, c
from t as t1
where c = (select max(c) from t t2
where t1.a = t2.a)
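To make the per-group behaviour concrete, here is a sketch that compares a correlated subquery with an uncorrelated IN filter. It assumes Python's built-in sqlite3 module and the table tab from the question, extended with two hypothetical rows for a second value of A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (A INT, B INT, num INT)")
# The first two rows come from the question; the A = 456 rows are hypothetical.
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)",
                 [(123, 1, 2), (123, 10, 5), (456, 7, 5), (456, 8, 9)])

# Correlated: each row is compared with the maximum of its own A group.
correlated = conn.execute("""
    SELECT t1.A, t1.B, t1.num
    FROM tab t1
    WHERE t1.num = (SELECT MAX(t2.num) FROM tab t2 WHERE t2.A = t1.A)
    ORDER BY t1.A, t1.B
""").fetchall()

# Uncorrelated: a row matches if its num equals the maximum of ANY group.
uncorrelated = conn.execute("""
    SELECT A, B, num
    FROM tab
    WHERE num IN (SELECT MAX(num) FROM tab GROUP BY A)
    ORDER BY A, B
""").fetchall()

print(correlated)
print(uncorrelated)
```

The uncorrelated version also returns (456, 7, 5), because 5 is the maximum for A = 123 even though it is not the maximum for A = 456.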

Order by newly selected column

I have a query like:
SELECT
R.*
FROM
(SELECT A, B,
(SELECT smth from another table) as C,
ROW_NUMBER() OVER (ORDER BY C DESC) AS RowNumber
FROM SomeTable) R
WHERE
RowNumber BETWEEN 10 AND 20
This gives me an error on ORDER BY C DESC.
I understand why this error is caused, so I've thought of adding another SELECT with ORDER BY and only then selecting rows from 10 to 20. But I don't think it's good to have 3 nested SELECT commands.
How else is it possible to select these rows?

A column expression cannot refer to an alias defined at the same level; you have to put it in a derived table first, or use a CTE.
SELECT
R.* , ROW_NUMBER() OVER (ORDER BY C DESC) AS RowNumber
FROM
(SELECT A, B, (SELECT smth from another table) as C
FROM SomeTable) R
-- WHERE
-- but you still cannot do this
-- RowNumber BETWEEN 10 AND 20
You need to do this instead:
select S.*
from
(
SELECT
R.* , ROW_NUMBER() OVER (ORDER BY C DESC) AS RowNumber
FROM
(SELECT A, B,
(SELECT smth from another table) as C
FROM SomeTable) R
) as s
where s.RowNumber between 10 and 20
To avoid deep nesting and to make it at least look pleasant, use a CTE:
with R as
(
SELECT A, B, (SELECT smth from another table) as C
FROM SomeTable
)
,S AS
(
SELECT R.*, ROW_NUMBER() OVER (ORDER BY C DESC) AS RowNumber
FROM R
)
SELECT S.*
FROM S
WHERE S.RowNumber BETWEEN 10 AND 20

You cannot use aliased columns in the same SELECT, but you can wrap it into another select to make it work:
SELECT R.*
FROM (SELECT ABC.A, ABC.B, ABC.C, ROW_NUMBER() OVER (ORDER BY C DESC) AS RowNumber
FROM (SELECT A, B, (SELECT smth from another table) as C FROM SomeTable) ABC
) R
WHERE R.RowNumber BETWEEN 10 AND 20
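The layering can be tried out with the sketch below: compute C first, number the rows in a second step, and filter on the row number last. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later). The lookup table Other, its column smth, the sample data and the smaller page range 2 to 3 are all hypothetical stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeTable (A INT, B TEXT)")
conn.execute("CREATE TABLE Other (A INT, smth INT)")  # hypothetical lookup
conn.executemany("INSERT INTO SomeTable VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c"), (4, "d")])
conn.executemany("INSERT INTO Other VALUES (?, ?)",
                 [(1, 10), (2, 40), (3, 30), (4, 20)])

page = conn.execute("""
    WITH R AS (
        -- Step 1: materialise the computed column under the alias C.
        SELECT A, B,
               (SELECT smth FROM Other WHERE Other.A = SomeTable.A) AS C
        FROM SomeTable
    ),
    S AS (
        -- Step 2: C is now an ordinary column, so it can drive the ordering.
        SELECT R.*, ROW_NUMBER() OVER (ORDER BY C DESC) AS RowNumber
        FROM R
    )
    -- Step 3: RowNumber is an ordinary column here, so WHERE can use it.
    SELECT A, B, C, RowNumber
    FROM S
    WHERE RowNumber BETWEEN 2 AND 3
    ORDER BY RowNumber
""").fetchall()

print(page)
```

Each CTE plays the role of one nesting level in the derived-table version, so the logic is the same and only the layout changes.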
