SQLSERVER group by (aggregate column based on other column) - sql

I have a table which has 3 columns A, B, C
I want to do a query like this:
select A, Max(B), ( C in the row having max B ) from Table group by A.
Is there a way to write such a query?
Test Data:
A B C
2 5 3
2 6 1
4 5 1
4 7 9
6 5 0
the expected result would be:
2 6 1
4 7 9
6 5 0

;WITH CTE AS
(
SELECT A,
B,
C,
RN = ROW_NUMBER() OVER(PARTITION BY A ORDER BY B DESC)
FROM YourTable
)
SELECT A, B, C
FROM CTE
WHERE RN = 1
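To check the ROW_NUMBER approach against the test data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption made so the example is self-contained), so the T-SQL alias form `RN = ...` is written as `... AS RN`, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (A INTEGER, B INTEGER, C INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(2, 5, 3), (2, 6, 1), (4, 5, 1), (4, 7, 9), (6, 5, 0)],
)

# Number the rows of each A group from highest B to lowest, keep row 1.
query = """
WITH CTE AS (
    SELECT A, B, C,
           ROW_NUMBER() OVER (PARTITION BY A ORDER BY B DESC) AS RN
    FROM YourTable
)
SELECT A, B, C
FROM CTE
WHERE RN = 1
ORDER BY A
"""
rows = conn.execute(query).fetchall()
print(rows)
```

If two rows in the same group tie on the maximum B, ROW_NUMBER keeps just one of them, chosen arbitrarily, whereas the join on MAX(B) returns all tied rows.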

Try this
select t.*
from YourTable t
join (select A, max(B) B from YourTable group by A) c
on c.A = t.A
and c.B = t.B

Related

Find records on group level which are connected to all other record within the group

I have a scenario where I have to find IDs within each group which are connected to all other IDs in the same group. So basically we have to treat each group separately.
In the table below, the group A has 3 IDs 1, 2 and 3. 1 is connected to both 2 and 3, 2 is connected to both 1 and 3, but 3 is not connected to 1 and 2. So 1 and 2 should be output from group A. Similarly in group B only 5 is connected to all other IDs namely 4 and 6 within group B, so 5 should be output. Similarly from group C, that should be 8, and from group D no records should be output.
So the output of the select statement should be 1, 2, 5, 8.
GRP ID CONNECTED_TO
A 1 2
A 1 3
A 2 3
A 2 1
A 3 5
B 4 5
B 5 4
B 5 6
B 6 4
C 7 21
C 7 25
C 8 7
D 9 31
D 10 35
D 11 37
I was able to do this when grouping is not required, using the SQL below:
SELECT ID FROM <table>
where CONNECTED_TO in (select ID from <table>)
group by ID
having count(*) = <number of records - 1>
But I am not able to find the correct SQL for my scenario. Any help is appreciated.
You may use count and count(distinct) functions as the following:
select id
from tbl T
where connected_to in
(
select id from tbl T2
where T2.grp = T.grp
)
group by grp, id
having count(connected_to) =
(
select count(distinct D.id) - 1
from tbl D
where T.grp = D.grp
)
The where clause keeps only connections that point to an ID in the same group. If, for a given grp and id, count(connected_to) equals count(distinct id) - 1 for that grp, then the ID is connected to every other ID in its group.
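A minimal sketch that runs the query above on the sample data from the question, using Python's sqlite3 module as a stand-in database (an assumption; the question does not name a DBMS). The table name `tbl` follows the answer, and the `ORDER BY id` is added only to make the output order stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (grp TEXT, id INTEGER, connected_to INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("A", 1, 2), ("A", 1, 3), ("A", 2, 3), ("A", 2, 1), ("A", 3, 5),
        ("B", 4, 5), ("B", 5, 4), ("B", 5, 6), ("B", 6, 4),
        ("C", 7, 21), ("C", 7, 25), ("C", 8, 7),
        ("D", 9, 31), ("D", 10, 35), ("D", 11, 37),
    ],
)

# Keep only in-group connections, then compare each ID's connection count
# with the number of other IDs in its group.
query = """
SELECT id
FROM tbl T
WHERE connected_to IN (SELECT id FROM tbl T2 WHERE T2.grp = T.grp)
GROUP BY grp, id
HAVING COUNT(connected_to) = (
    SELECT COUNT(DISTINCT D.id) - 1
    FROM tbl D
    WHERE D.grp = T.grp
)
ORDER BY id
"""
connected_ids = [row[0] for row in conn.execute(query)]
print(connected_ids)
```

If the table could contain the same (id, connected_to) pair more than once, COUNT(DISTINCT connected_to) would be the safer comparison.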

Optimized SQL Query to retrieve a Particular set of records from Microsoft SQL Server DB

I want to retrieve a particular set of rows from a column in a table.
My scenario is:
I have a table named table1_data with a column named "clm_Name". The data in the column looks like this:
a
b
c
a
b
c
a
b
a
b
c
a
a
b
c
I want to retrieve the rows only where a, b, c appear consecutively in that order; if the order changes, those rows should not be retrieved. For the data above, the output should contain only the complete runs (a b c a b c a b c a b c); the incomplete a b and the lone a should be left out.
If you have a column to sort by, you can do the following:
-- Table variables are declared with @, not #
DECLARE @Tbl TABLE (OrderId INT, val NVARCHAR(1))
INSERT INTO @Tbl
VALUES
(1,'a'),
(2,'b'),
(3,'c'),
(4,'a'),
(5,'b'),
(6,'c'),
(7,'a'),
(8,'b'),
(9,'a'),
(10,'b'),
(11,'c'),
(12,'a'),
(13,'a'),
(14,'b'),
(15,'c')
;WITH CTE
AS
(
SELECT
*,
ROW_NUMBER() OVER (ORDER BY (SELECT OrderId)) RowId
FROM @Tbl
), Result
AS
(
SELECT
CurrRow.RowId
FROM
CTE CurrRow LEFT JOIN
(SELECT CTE.val , CTE.RowId - 1 RowId FROM CTE) NextRow ON CurrRow.RowId = NextRow.RowId LEFT JOIN
(SELECT CTE.val , CTE.RowId + 1 RowId FROM CTE) PrivRow ON CurrRow.RowId = PrivRow.RowId
WHERE
PrivRow.val = 'a' AND
CurrRow.val = 'b' AND
NextRow.val = 'c'
)
SELECT
*
FROM
CTE C
WHERE
C.RowId IN (
SELECT Result.RowId FROM Result
UNION ALL
SELECT Result.RowId - 1 FROM Result
UNION ALL
SELECT Result.RowId + 1 FROM Result
)
ORDER BY C.OrderId
Output:
OrderId val RowId
1 a 1
2 b 2
3 c 3
4 a 4
5 b 5
6 c 6
9 a 9
10 b 10
11 c 11
13 a 13
14 b 14
15 c 15
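As an alternative to the self-joins above, the same filter can be sketched with LAG and LEAD, which are available in SQL Server 2012 and later. This is a different technique from the answer, not a restatement of it. The sketch runs in SQLite through Python's sqlite3 module as a stand-in (an assumption), with `Tbl` as a regular table in place of the table variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl (OrderId INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO Tbl VALUES (?, ?)",
    list(enumerate("abcabcababcaabc", start=1)),
)

# Look two rows back and two rows ahead, then keep a row only when it sits
# in the right position of a consecutive a, b, c run.
query = """
WITH w AS (
    SELECT OrderId, val,
           LAG(val, 2)  OVER (ORDER BY OrderId) AS prev2,
           LAG(val, 1)  OVER (ORDER BY OrderId) AS prev1,
           LEAD(val, 1) OVER (ORDER BY OrderId) AS next1,
           LEAD(val, 2) OVER (ORDER BY OrderId) AS next2
    FROM Tbl
)
SELECT OrderId, val
FROM w
WHERE (val = 'a' AND next1 = 'b' AND next2 = 'c')
   OR (val = 'b' AND prev1 = 'a' AND next1 = 'c')
   OR (val = 'c' AND prev2 = 'a' AND prev1 = 'b')
ORDER BY OrderId
"""
kept = conn.execute(query).fetchall()
kept_ids = [order_id for order_id, _ in kept]
print(kept_ids)
```

On the sample data this keeps the same OrderId values as the output listed above, with rows 7, 8 and 12 dropped.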

Smarter GROUP BY

Consider Table like this.
I will call it Test
Id A B C D
1 1 1 8 25
2 1 2 5 35
3 1 3 2 75
4 2 2 2 45
5 3 2 5 26
Now I want rows with max 'Id' Grouped by 'A'
Id A B C D
3 1 3 2 75
4 2 2 2 45
5 3 2 5 26
-- Works, but is not what I want
SELECT MAX(Id), A FROM Test GROUP BY A
-- What I want, but it does not work
SELECT MAX(Id), A, B, C, D FROM Test GROUP BY A
-- Works, but is not what I want
SELECT MAX(Id), A, B, C, D FROM Test GROUP BY A, B, C, D
-- Works and is what I want
SELECT old.Id, old.A, new.B, new.C, new.D
FROM(
SELECT
MAX(Id) AS Id, A
FROM
Test GROUP BY A
)old
JOIN Test new
ON old.Id = new.Id
Is there a better way to write the last query, without the join?
Most databases support window functions:
select *
from (
select *, row_number() over (partition by a order by id desc) rn
from test
) t
where rn = 1
Most DBMS now support Common Table Expressions (CTE). You can use one.
;with maxa as (
select row_number() over(partition by a order by id desc) rn,
id,a,b,c,d from test
)
select id,a,b,c,d
from maxa
where rn=1
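A quick way to convince yourself that the window-function form is equivalent to the asker's join is to run both on the sample data. The sketch below does that in SQLite through Python's sqlite3 module (a stand-in chosen for self-containment, not the asker's DBMS); the aliases `m` and `t` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (Id INTEGER, A INTEGER, B INTEGER, C INTEGER, D INTEGER)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 8, 25), (2, 1, 2, 5, 35), (3, 1, 3, 2, 75),
     (4, 2, 2, 2, 45), (5, 3, 2, 5, 26)],
)

# The asker's working query: aggregate first, then join back for B, C, D.
join_query = """
SELECT t.Id, t.A, t.B, t.C, t.D
FROM (SELECT MAX(Id) AS Id, A FROM Test GROUP BY A) m
JOIN Test t ON m.Id = t.Id
ORDER BY t.Id
"""

# The window-function form: rank rows per A by Id descending, keep the first.
window_query = """
SELECT Id, A, B, C, D
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY A ORDER BY Id DESC) AS rn
    FROM Test
) ranked
WHERE rn = 1
ORDER BY Id
"""

join_rows = conn.execute(join_query).fetchall()
window_rows = conn.execute(window_query).fetchall()
print(window_rows)
```

Both queries return the three expected rows; which one performs better depends on the DBMS and the available indexes.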

Shuffle column in Google's BigQuery based on groupby

I want to randomly shuffle the values for one single column of a table based on a groupby. E.g., I have two columns A and B. Now, I want to randomly shuffle column B based on a groupby on A.
For an example, suppose that there are three distinct values in A. Now for each distinct value of A, I want to shuffle the values in B, but just with values having the same A.
Example input:
A B C
-------------------
1 1 x
1 3 a
2 4 c
3 6 d
1 2 a
3 5 v
Example output:
A B C
------------------
1 3 x
1 2 a
2 4 c
3 6 d
1 1 a
3 5 v
In this case, for A=1 the values for B got shuffled. The same happened for A=2, but as there is only one row it stayed like it was. For A=3 by chance the values for B also stayed like they were. The values for column C stay as they are.
Maybe this can be solved by using window functions, but I am unsure how exactly.
As a side note: This should be achieved in Google's BigQuery.
Is this what you're after ? (you tagged with both Mysql and Oracle .. so I answer here using Oracle)
[edit] corrected based on confirmed logic [/edit]
with w_data as (
select 1 a, 1 b from dual union all
select 1 a, 3 b from dual union all
select 2 a, 4 b from dual union all
select 3 a, 6 b from dual union all
select 1 a, 2 b from dual union all
select 3 a, 5 b from dual
),
w_suba as (
select a, row_number() over (partition by a order by dbms_random.value) aid
from w_data
),
w_subb as (
select a, b, row_number() over (partition by a order by dbms_random.value) bid
from w_data
)
select sa.a, sb.b
from w_suba sa,
w_subb sb
where sa.aid = sb.bid
and sa.a = sb.a
/
A B
---------- ----------
1 3
1 1
1 2
2 4
3 6
3 5
6 rows selected.
SQL> /
A B
---------- ----------
1 3
1 1
1 2
2 4
3 5
3 6
6 rows selected.
SQL>
Logic breakdown:
1) w_data is just your sample data set ...
2) randomize column a (not really needed, you could just rownum this, and let b randomize ... but I do so love (over)using dbms_random :) heh )
3) randomize column b - (using partition by analytic creates "groups" .. order by random randomizes the items within each group)
4) join them ... using both the group (a) and the randomized id to find a random item within each group.
by doing the randomize this way you can ensure that you get the same # .. ie you start with one "3" .. you end with one "3" .. etc.
I feel the query below should work in BigQuery:
SELECT
x.A as A, x.B as Old_B, x.c as C, y.B as New_B
FROM (
SELECT A, B, C,
ROW_NUMBER() OVER(PARTITION BY A ORDER BY B, C) as pos
FROM [your_table]
) as x
JOIN (
SELECT
A, B, ROW_NUMBER() OVER(PARTITION BY A ORDER BY rnd) as pos
FROM (
SELECT A, B, RAND() as rnd
FROM [your_table]
)
) as y
ON x.A = y.A AND x.pos = y.pos
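The pairing idea in this query (number the rows of each group once in a fixed order and once in random order, then join on the position) can be tried locally. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for BigQuery (an assumption), so RAND() becomes SQLite's random() and the table is named `your_table`.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (A INTEGER, B INTEGER, C TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [(1, 1, "x"), (1, 3, "a"), (2, 4, "c"), (3, 6, "d"), (1, 2, "a"), (3, 5, "v")],
)

# x fixes a position for every row inside its A group; y assigns the same
# positions in random order, so each row receives some B from its own group.
query = """
SELECT x.A, x.B AS Old_B, x.C, y.B AS New_B
FROM (
    SELECT A, B, C,
           ROW_NUMBER() OVER (PARTITION BY A ORDER BY B, C) AS pos
    FROM your_table
) AS x
JOIN (
    SELECT A, B,
           ROW_NUMBER() OVER (PARTITION BY A ORDER BY random()) AS pos
    FROM your_table
) AS y
  ON x.A = y.A AND x.pos = y.pos
"""
shuffled = conn.execute(query).fetchall()

# Each group must keep exactly the same multiset of B values.
old_b = Counter((a, old) for a, old, _, _ in shuffled)
new_b = Counter((a, new) for a, _, _, new in shuffled)
for row in shuffled:
    print(row)
```

The shuffled order differs from run to run, so the only stable property to check is that every A group ends up with the same B values it started with.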

SQL grouping

I have a table with the following columns:
A B C
---------
1 10 X
1 11 X
2 15 X
3 20 Y
4 15 Y
4 20 Y
I want to group the data based on the B and C columns and count the distinct values of the A column. But if there are two or more rows where the value in the A column is the same, I want to use the maximum value from the B column.
If I do a simple group by the result would be:
B C Count
--------------
10 X 1
11 X 1
15 X 1
20 Y 2
15 Y 1
What I want is this result:
B C Count
--------------
11 X 1
15 X 1
20 Y 2
Is there any query that can return this result? The server is SQL Server 2005.
I like to work in steps: first get rid of duplicate A records, then group. Not the most efficient, but it works on your example.
with t1 as (
select A, max(B) as B, C
from YourTable
group by A, C
)
select count(A) as CountA, B, C
from t1
group by B, C
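The two steps can be checked against the sample data. The sketch below runs the same query in SQLite through Python's sqlite3 module as a stand-in for SQL Server 2005 (an assumption); the output columns are reordered and an ORDER BY is added only to make the result easy to compare with the desired table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (A INTEGER, B INTEGER, C TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 10, "X"), (1, 11, "X"), (2, 15, "X"),
     (3, 20, "Y"), (4, 15, "Y"), (4, 20, "Y")],
)

# Step 1 (t1): collapse duplicate A values, keeping the maximum B per A and C.
# Step 2: group the collapsed rows by B and C and count the A values.
query = """
WITH t1 AS (
    SELECT A, MAX(B) AS B, C
    FROM YourTable
    GROUP BY A, C
)
SELECT B, C, COUNT(A) AS CountA
FROM t1
GROUP BY B, C
ORDER BY C, B
"""
result = conn.execute(query).fetchall()
print(result)
```

On this data the result matches the desired output from the question: (11, X, 1), (15, X, 1) and (20, Y, 2).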
I have actually tested this:
SELECT
MAX( B ) AS B,
C,
Count
FROM
(
SELECT
B, C, COUNT(DISTINCT A) AS Count
FROM
t
GROUP BY
B, C
) X
GROUP BY C, Count
and it gives me:
B C Count
---- ---- --------
15 X 1
15 y 1
20 y 2
WITH cteA AS
(
SELECT
A, C,
MAX(B) OVER(PARTITION BY A, C) [Max]
FROM T1
)
SELECT
[Max] AS B, C,
COUNT(DISTINCT A) AS [Count]
FROM cteA
GROUP BY C, [Max];
Check this out. This should work in Oracle, although I haven't tested it;
select count(a), BB, CC from
(
select a, max(B) BB, Max(C) CC
from yourtable
group by a
)
group by BB,CC