How to group with DISTINCT or MAX in SQL - sql

I have data like this in a table named Data:
id name period difference
6172 A 6 10
6172 A 3 10
10099 AB 12 24
10099 AB 6 24
10099 AB 3 24
10052 ABC 12 26
10052 ABC 6 26
10052 ABC 3 26
9014 ABCD 12 21
9014 ABCD 6 21
9014 ABCD 3 21
How can I get a result like this?
id name period difference
6172 A 6 10
10099 AB 12 24
10052 ABC 12 26
9014 ABCD 12 4
I tried DISTINCT ON (id), but the result looks like this:
id name period difference
6172 A 6 10
10099 AB 6 24
10052 ABC 6 26
9014 ABCD 6 4

The query you want looks something like:
SELECT DISTINCT ON (id) *
FROM Data
ORDER BY id, period DESC;
This is probably the most efficient way to write your query on Postgres. The ORDER BY must start with the DISTINCT ON expressions; the remaining sort key (period DESC) decides which row is kept for each id. DISTINCT ON also accepts more than one expression, so if id alone does not identify a group you can write DISTINCT ON (id, name) with ORDER BY id, name, period DESC, or use ROW_NUMBER with a partition over id and name.
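The ROW_NUMBER fallback mentioned above can be sketched as follows. This is only an illustration under stated assumptions: it runs the SQL through Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions) instead of Postgres, and the table name data and the helper top_period_per_group are made up for the example.

```python
import sqlite3

# Sample rows from the question; the table name "data" is an assumption.
ROWS = [
    (6172, "A", 6, 10), (6172, "A", 3, 10),
    (10099, "AB", 12, 24), (10099, "AB", 6, 24), (10099, "AB", 3, 24),
    (10052, "ABC", 12, 26), (10052, "ABC", 6, 26), (10052, "ABC", 3, 26),
    (9014, "ABCD", 12, 21), (9014, "ABCD", 6, 21), (9014, "ABCD", 3, 21),
]

def top_period_per_group():
    """Return the row with the highest period for each (id, name) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data (id INTEGER, name TEXT, period INTEGER, difference INTEGER)"
    )
    conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", ROWS)
    query = """
        SELECT id, name, period, difference
        FROM (
            SELECT d.*,
                   -- rn = 1 marks the row with the largest period in each group
                   ROW_NUMBER() OVER (PARTITION BY id, name
                                      ORDER BY period DESC) AS rn
            FROM data AS d
        ) AS ranked
        WHERE rn = 1
        ORDER BY id
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in top_period_per_group():
        print(row)
```

The same inner query should carry over to Postgres unchanged; only the sqlite3 wrapper is specific to this sketch.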

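using max()
select t1.id, t1.name, t1.period, t1.difference from tableA t1
inner join
(select id, max(period) as period from tableA
group by id) t2 on t2.id = t1.id and t2.period = t1.period
using distinct (this works only because name and difference are the same on every row of an id)
select distinct t1.id, t1.name, t2.period, t1.difference from tableA t1
inner join (select id, max(period) as period from tableA group by id) t2 on t2.id = t1.id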

It seems you just need max():
select id, name, max(period), max(difference)
from tablename group by id, name
I could not find difference=4 in your sample data, although it appears in your expected output, so I assume it is a typo.

Use max()
select id, name, max(period), difference from tablename
group by id, name,difference

You can try my code:
SELECT
id, name, max(period), difference
FROM
data_table
group by id, name,difference
order by name
This is a demo link http://sqlfiddle.com/#!17/9ab8d/2

Related

Counting SUM(VALUE) from previous cell

I have the following table:
A Sum(Tickets)
01-2022 5
02-2022 2
03-2022 8
04-2022 1
05-2022 3
06-2022 3
07-2022 4
08-2022 1
09-2022 5
10-2022 5
11-2022 3
I would like to create the following extra column 'TotalSum(Tickets)' but I am stuck....
Anyone who can help out?
A Sum(Tickets) TotalSum(Tickets)
01-2022 5 5
02-2022 2 7
03-2022 8 15
04-2022 1 16
05-2022 3 19
06-2022 3 22
07-2022 4 26
08-2022 1 27
09-2022 5 32
10-2022 5 37
11-2022 3 40
You may use SUM() as a window function here:
SELECT A, SumTickets, SUM(SumTickets) OVER (ORDER BY A) AS TotalSumTickets
FROM yourTable
ORDER BY A;
But this assumes that you actually have a bona-fide column SumTickets which contains the sums. Assuming you really showed us the intermediate result of some aggregation query, you should use:
SELECT A, SUM(Tickets) AS SumTickets,
SUM(SUM(Tickets)) OVER (ORDER BY A) AS TotalSumTickets
FROM yourTable
GROUP BY A
ORDER BY A;
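To check the window-function approach against the numbers in the question, here is a small sketch that runs it in SQLite through Python's sqlite3 module (SQLite 3.25 or later). The table name monthly and the helper running_totals are assumptions for the example, and the table is taken to already hold one row per month.

```python
import sqlite3

# Monthly ticket counts from the question, January to November 2022.
COUNTS = [5, 2, 8, 1, 3, 3, 4, 1, 5, 5, 3]

def running_totals():
    """Return (A, SumTickets, TotalSumTickets) rows ordered by A."""
    conn = sqlite3.connect(":memory:")
    # "monthly" is a hypothetical table name standing in for yourTable.
    conn.execute("CREATE TABLE monthly (A TEXT, SumTickets INTEGER)")
    conn.executemany(
        "INSERT INTO monthly VALUES (?, ?)",
        [(f"{month:02d}-2022", count) for month, count in enumerate(COUNTS, start=1)],
    )
    query = """
        SELECT A,
               SumTickets,
               -- with ORDER BY, the default frame runs from the first row
               -- up to the current row, which yields a running total
               SUM(SumTickets) OVER (ORDER BY A) AS TotalSumTickets
        FROM monthly
        ORDER BY A
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in running_totals():
        print(row)
```

One caveat: A is stored as MM-YYYY text here, so ORDER BY A only sorts chronologically within a single year. With data spanning several years, ordering by a real date column or a YYYY-MM string would be safer.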
Alternatively, left join the table to itself on all rows whose date is not later than the current row's date, then sum the joined tickets for every date:
select
table1.date,
sum(t.tickets)
from
table1
left join table1 t
on t.date<= table1.date
group by
table1.date;

SQL Row containing max value per another variable

I have a basic SQL query but laptop is about to go out the window lol
I have a table
ID StudentID Mark DateAdded
1 2 78 19/02/2020
2 4 43 19/02/2020
3 2 23 19/02/2020
4 5 91 20/03/2020
5 7 56 20/03/2020
6 9 24 20/03/2020
7 10 56 12/05/2020
8 10 23 12/05/2020
9 10 78 12/05/2020
10 9 23 12/05/2020
What I want to pull out is the entire row which has max score for each unique studentID, so for example
ID StudentID Mark DateAdded
1 2 78 19/02/2020
2 4 43 19/02/2020
4 5 91 20/03/2020
5 7 56 20/03/2020
6 9 24 20/03/2020
9 10 78 12/05/2020
Thanks
You can use the analytic function row_number as follows:
select *
from (select t.*,
row_number() over (partition by studentid order by mark desc) as rn
from t) ranked
where rn = 1
You can achieve your result with a group by, but you can't get the ID and DateAdded column too in this way.
SELECT StudentID, MAX(Mark)
FROM YourTable
GROUP BY StudentID
Simply use function MAX() and then a GROUP BY
SELECT StudentID, MAX(Mark) from tableOne GROUP BY studentID
Here is a fiddle link:
http://sqlfiddle.com/#!9/dd8e77/3
I am going to recommend a correlated subquery:
select t.*
from mytable t
where t.mark = (select max(t1.mark) from mytable t1 where t1.studentid = t.studentid)
Why:
Portability: this works in all versions of MySQL/MariaDB, whereas window functions were introduced in MySQL 8.0 / MariaDB 10.2.2
Efficiency: you want an index on (studentid, mark) (or better yet, (studentid, mark desc)), so the subquery can take advantage of it. In many situations, the correlated subquery is the most efficient approach to this kind of top-1-per-group problem.
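A quick way to try the correlated subquery on the sample data is the sketch below. It uses SQLite through Python's sqlite3 module instead of MySQL; the table name marks, the index name, and the helper best_rows are assumptions made for the example.

```python
import sqlite3

# Rows from the question: (ID, StudentID, Mark, DateAdded as DD/MM/YYYY text).
ROWS = [
    (1, 2, 78, "19/02/2020"), (2, 4, 43, "19/02/2020"), (3, 2, 23, "19/02/2020"),
    (4, 5, 91, "20/03/2020"), (5, 7, 56, "20/03/2020"), (6, 9, 24, "20/03/2020"),
    (7, 10, 56, "12/05/2020"), (8, 10, 23, "12/05/2020"), (9, 10, 78, "12/05/2020"),
    (10, 9, 23, "12/05/2020"),
]

def best_rows():
    """Return the full row holding the highest mark for each student."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE marks (id INTEGER, studentid INTEGER, mark INTEGER, dateadded TEXT)"
    )
    conn.executemany("INSERT INTO marks VALUES (?, ?, ?, ?)", ROWS)
    # Index suggested in the answer, so the subquery can look up each maximum.
    conn.execute("CREATE INDEX idx_marks_student_mark ON marks (studentid, mark)")
    query = """
        SELECT t.*
        FROM marks AS t
        WHERE t.mark = (SELECT MAX(t1.mark)
                        FROM marks AS t1
                        WHERE t1.studentid = t.studentid)
        ORDER BY t.id
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in best_rows():
        print(row)
```

If a student has the same top mark on two rows, this query returns both of them; the row_number approach would keep just one.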

Need to find the count of user who belongs to different depts

I have a table with dept, user and other columns. I need to count the users that belong to each combination of departments.
Let's say I have a table like this:
dept user
1 33
1 33
1 45
2 11
2 12
3 33
3 15
First I have to find the unique dept and user combinations, something like this:
select distinct dept,user from x;
Which gives me a result like:
Dept user
1 33
1 45
2 11
2 12
3 33
3 15
This removes the duplicate combinations.
Here is what I need to do.
My output should look like this:
dep_1_1 dep_1_2 dep_1_3 dep_2_2 dep_2_1 dep_2_3 Dep_3_1 Dep_3_2 Dep_3_3
2 0 1 2 0 0 1 0 2
So basically I need to find the count of common users between all the combinations of departments.
Thanks for the help
You can get a row for each department combination using a self-join of your Distinct Select:
with cte as
(
select distinct dept,user from x
)
select t1.dept, t2.dept, count(*)
from cte as t1 join cte as t2
on t1.user = t2.user -- same user
and t1.dept < t2.dept -- different department
group by t1.dept, t2.dept
order by t1.dept, t2.dept
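The self-join can be tried on the sample rows with the sketch below, which runs it in SQLite through Python's sqlite3 module. The column is renamed to user_id because user is a reserved word in some databases; that name and the helper shared_user_counts are assumptions for the example.

```python
import sqlite3

# Sample (dept, user) rows from the question, including the duplicate (1, 33).
ROWS = [(1, 33), (1, 33), (1, 45), (2, 11), (2, 12), (3, 33), (3, 15)]

def shared_user_counts():
    """Return (dept_a, dept_b, shared_users) for each pair with a common user."""
    conn = sqlite3.connect(":memory:")
    # "x" is the table name from the question; "user_id" is an assumed column name.
    conn.execute("CREATE TABLE x (dept INTEGER, user_id INTEGER)")
    conn.executemany("INSERT INTO x VALUES (?, ?)", ROWS)
    query = """
        WITH cte AS (
            SELECT DISTINCT dept, user_id FROM x
        )
        SELECT t1.dept, t2.dept, COUNT(*)
        FROM cte AS t1
        JOIN cte AS t2
          ON t1.user_id = t2.user_id   -- same user
         AND t1.dept < t2.dept         -- different department, each pair once
        GROUP BY t1.dept, t2.dept
        ORDER BY t1.dept, t2.dept
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(shared_user_counts())
```

On this data only departments 1 and 3 share a user (33). Pairs without any common user produce no row at all, and same-department counts such as dep_1_1 are not part of this result, so turning it into the single wide row from the question would need an extra pivot step.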

How to select group by 2 columns + id?

I have a problem with data selection using SQL in PostgreSQL database.
I have the following data in one table:
ID ID_X ID_Y
100 1 2
101 1 1
102 1 1
103 1 2
104 5 10
105 5 11
106 5 10
107 5 11
108 8 20
109 8 30
110 8 20
How to write select statement to get the following results?
ID ID_X ID_Y
100 1 2
101 1 1
104 5 10
105 5 11
108 8 20
109 8 30
I know that it is a kind of group by ID_X and ID_Y but how to select also "ID" column without grouping by it?
Maybe there is a way to select using distinct? or group by with subselect? Please help :)
You can use an aggregate function like MIN() or MAX(). In your case you want MIN() to get those specific results.
SELECT MIN(ID), ID_X, ID_Y
FROM tablename
GROUP BY ID_X, ID_Y
Try this using Distinct on
select *
from
(
select distinct on (id_x, id_y) ID, id_x, id_y
FROM t order by id_x, id_y,id
) q
order by id
Seems like you want a GROUP BY. Use MIN() to return each group's lowest ID:
select min(ID), ID_X, ID_Y
from tablename
group by ID_X, ID_Y
Alternatively, you can do a NOT EXISTS:
select *
from tablename t1
where not exists (select 1 from tablename t2
where t2.ID_X = t1.ID_X
and t2.ID_Y = t1.ID_Y
and t2.ID < t1.ID)
I.e. return a row as long as there is no other row with the same ID_X and ID_Y but a lower ID.
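The GROUP BY and NOT EXISTS variants can be compared side by side with the following sketch. It runs against SQLite through Python's sqlite3 module instead of PostgreSQL, and the helper names are made up for the example; tablename is the placeholder used in the answers.

```python
import sqlite3

# Rows from the question: (ID, ID_X, ID_Y).
ROWS = [
    (100, 1, 2), (101, 1, 1), (102, 1, 1), (103, 1, 2),
    (104, 5, 10), (105, 5, 11), (106, 5, 10), (107, 5, 11),
    (108, 8, 20), (109, 8, 30), (110, 8, 20),
]

def _connect():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (id INTEGER, id_x INTEGER, id_y INTEGER)")
    conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", ROWS)
    return conn

def lowest_id_group_by():
    """Lowest ID per (ID_X, ID_Y) using MIN() with GROUP BY."""
    conn = _connect()
    rows = conn.execute("""
        SELECT MIN(id) AS min_id, id_x, id_y
        FROM tablename
        GROUP BY id_x, id_y
        ORDER BY min_id
    """).fetchall()
    conn.close()
    return rows

def lowest_id_not_exists():
    """Same result: keep a row when no row in its group has a lower ID."""
    conn = _connect()
    rows = conn.execute("""
        SELECT t1.id, t1.id_x, t1.id_y
        FROM tablename AS t1
        WHERE NOT EXISTS (SELECT 1
                          FROM tablename AS t2
                          WHERE t2.id_x = t1.id_x
                            AND t2.id_y = t1.id_y
                            AND t2.id < t1.id)
        ORDER BY t1.id
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(lowest_id_group_by())
    print(lowest_id_not_exists())
```

The NOT EXISTS form returns whole rows, which helps when the table has more columns than the three shown here.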

How do I get the latest record?

This is my table:
create table test (
id string,
name string,
age string,
modified string)
and this is my data:
id name age modified
1 a 10 2011-11-11 11:11:11
1 a 11 2012-11-11 12:00:00
2 b 20 2012-12-10 10:11:12
2 b 20 2012-12-10 10:11:12
2 b 20 2012-12-12 10:11:12
2 b 20 2012-12-15 10:11:12
I want to get the latest record (including every column: id, name, age, modified) for each id. For the data above, the correct result is:
1 a 11 2012-11-11 12:00:00
2 b 20 2012-12-15 10:11:12
I am using the query below in Hive. It works fine in SQL (http://sqlfiddle.com/#!2/bfbd5/42) but it does not work in Hive:
select * from test where (id, modified) in(select id, max(modified) from test group by id)
I am using Hive 0.13.
Hive only allows one column in an IN subquery. Try a left semijoin:
SELECT *
FROM test a
LEFT SEMI JOIN
(select id, max(modified) as modified from test group by id) b
ON (a.modified = b.modified and a.id=b.id);
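The idea of joining each row to the maximum modified value of its id can be tried outside Hive with the sketch below. Note what is swapped in: it uses SQLite through Python's sqlite3 module and a plain INNER JOIN, because LEFT SEMI JOIN is Hive syntax that SQLite does not have. The helper name latest_records is made up for the example.

```python
import sqlite3

# Rows from the question; every column is stored as text, like the Hive table.
ROWS = [
    ("1", "a", "10", "2011-11-11 11:11:11"),
    ("1", "a", "11", "2012-11-11 12:00:00"),
    ("2", "b", "20", "2012-12-10 10:11:12"),
    ("2", "b", "20", "2012-12-10 10:11:12"),
    ("2", "b", "20", "2012-12-12 10:11:12"),
    ("2", "b", "20", "2012-12-15 10:11:12"),
]

def latest_records():
    """Return the row with the greatest modified value for each id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (id TEXT, name TEXT, age TEXT, modified TEXT)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", ROWS)
    query = """
        SELECT a.id, a.name, a.age, a.modified
        FROM test AS a
        JOIN (SELECT id, MAX(modified) AS modified
              FROM test
              GROUP BY id) AS b       -- one row per id with its latest timestamp
          ON a.id = b.id
         AND a.modified = b.modified
        ORDER BY a.id
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in latest_records():
        print(row)
```

Comparing modified as text works here because the timestamps use the fixed-width YYYY-MM-DD HH:MM:SS layout. With an inner join, an id whose latest timestamp appears on several rows would return all of those rows.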
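It sure seems like you could easily get the right answer using a straightforward query, though. Select the max of the two columns and be sure to group by the columns that don't have aggregate functions. Note that each max is computed independently, so this matches the latest record only when the highest age also belongs to the most recent row.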
select id
, name
, max(age) as age
, max(modified) as modified
from test
group by id, name;