SQL: Efficient way to get group by results including all table columns

Let's consider a simple table below.
id  code  marks  grade
1   X     100    A
2   Y     120    B
3   Z     130    A
4   X     120    C
5   Y     100    A
6   Z     110    B
7   X     150    A
8   X     140    C
Goal: for each grade, get the row with the maximum marks and return all of its columns.
id  code  marks  grade
7   X     150    A
2   Y     120    B
8   X     140    C
This is very simple if I don't want the id and code columns:
select grade, max(marks)
from table
group by grade;
What would be the most efficient query to also get the id and code columns in the above query?
I tried something like this, which didn't work:
select * from table t
inner join
(select grade, max(marks)
from table
group by grade) a
on a.grade=t.grade;

In Postgres, typically the most efficient way to write this kind of query is to use the (proprietary) distinct on ():
select distinct on (grade) *
from the_table t
order by grade, marks desc;

Are you looking for a correlated subquery?
select t.*
from t
where t.marks = (select max(t2.marks) from t t2 where t2.grade = t.grade);
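
For engines without distinct on, the join that the question attempted can be made to work by also matching on the maximum. The following is only a minimal sketch, assuming Python's built-in sqlite3 module as a throwaway engine and a hypothetical table name, results (the question's placeholder name, table, is a reserved word):

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real engine,
# and "results" is a hypothetical name for the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER, code TEXT, marks INTEGER, grade TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [(1, "X", 100, "A"), (2, "Y", 120, "B"), (3, "Z", 130, "A"), (4, "X", 120, "C"),
     (5, "Y", 100, "A"), (6, "Z", 110, "B"), (7, "X", 150, "A"), (8, "X", 140, "C")],
)

# Join every row to its grade's maximum. The extra condition on marks is
# what the attempted query in the question was missing.
top_rows = conn.execute("""
    SELECT t.id, t.code, t.marks, t.grade
    FROM results t
    JOIN (SELECT grade, MAX(marks) AS max_marks
          FROM results
          GROUP BY grade) a
      ON a.grade = t.grade AND a.max_marks = t.marks
    ORDER BY t.grade
""").fetchall()

for row in top_rows:
    print(row)
```

If two rows tie for a grade's maximum, this join (like the correlated subquery) returns both of them, whereas distinct on keeps only one row per grade.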

Related

How to get these rows as columns in an SQL query

I need some help writing an SQL query against a single table that looks something like this:
User ID  Category  Spend  Transactions  Country
1        Sport     30     2             USA
1        Bills     60     3             USA
2        Sport     10     1             MEX
3        Grocery   50     8             CAN
2        Grocery   70     4             MEX
3        Sport     20     5             CAN
3        Bills     30     2             CAN
1        Petrol    60     5             USA
I then want to group the rows by user ID, turn the spend and transactions into one column per category, and keep the country as a column by itself, like this:
User ID  Sport_Spend  Bills_Spend  Grocery_Spend  Petrol_Spend  Sport_Transactions  Bills_Transactions  Grocery_Transactions  Petrol_Transactions  Country
1        30           60           0              60            2                   3                   0                     5                    USA
2        10           0            70             0             1                   0                   4                     0                    MEX
3        20           30           50             0             5                   2                   8                     0                    CAN
It's stumping me a bit, so I would appreciate some help.

@jarlh's comments are most relevant and need to be addressed. But here is something to start with (MS SQL code; I left out the transactions columns to reduce the problem, but the coding is just the same): https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=25550539029ba1c4be0826725bf9e00a
with data (UserID,Category,Spend,Transactions,Country) as(
select 1,'Sport',30,2,'USA' union all
select 1,'Bills',60,3,'USA' union all
select 2,'Sport',10,1,'MEX' union all
select 3,'Grocery',50,8,'CAN' union all
select 2,'Grocery',70,4,'MEX' union all
select 3,'Sport',20,5,'CAN' union all
select 3,'Bills',30,2,'CAN' union all
select 1,'Petrol',60,5,'USA'
)
select UserID
,isnull(SUM([Sport]),0)as Sport
,isnull(SUM([Bills]),0)as Bills
,isnull(SUM([Grocery]),0)as Grocery
,isnull(SUM([Petrol]),0)as Petrol
,MAX(Country)as Country
from (
select UserID,Category,Spend,Transactions,Country
from data) p
PIVOT(
SUM(SPEND)
For CATEGORY in ([Sport] ,[Bills] ,[Grocery] ,[Petrol])
)as PivotTable
group by UserID

select
COALESCE(user_id,0) as user_id,
COALESCE(Sport_Spend,0) as Sport_Spend,
COALESCE(Bills_Spend,0) as Bills_Spend,
COALESCE(Grocery_Spend,0) as Grocery_Spend,
COALESCE(Petrol_Spend,0) as Petrol_Spend,
COALESCE(Sport_Transactions,0) as Sport_Transactions,
COALESCE(Bills_Transactions,0) as Bills_Transactions,
COALESCE(Grocery_Transactions,0) as Grocery_Transactions,
COALESCE(Petrol_Transactions,0) as Petrol_Transactions
,country from
(SELECT DISTINCT user_id,country from table_name) as A
LEFT JOIN
(select user_id, spend as Sport_Spend ,transactions as Sport_Transactions from table_name where category='Sport') as B using (user_id)
LEFT JOIN
(select user_id, spend as Bills_Spend ,transactions as Bills_Transactions from table_name where category='Bills') as C using (user_id)
LEFT JOIN
(select user_id, spend as Grocery_Spend ,transactions as Grocery_Transactions from table_name where category='Grocery') as D using (user_id)
LEFT JOIN
(select user_id, spend as Petrol_Spend ,transactions as Petrol_Transactions from table_name where category='Petrol') as E using (user_id)
ORDER BY user_id;
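
A more portable alternative to PIVOT is conditional aggregation, which also covers the transactions columns in a single pass. The following is a minimal sketch, assuming Python's built-in sqlite3 module as the engine and a hypothetical table named spending with snake_case column names:

```python
import sqlite3

# Assumption: in-memory SQLite; "spending" and its column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE spending (
    user_id INTEGER, category TEXT, spend INTEGER, transactions INTEGER, country TEXT)""")
conn.executemany("INSERT INTO spending VALUES (?, ?, ?, ?, ?)", [
    (1, "Sport", 30, 2, "USA"), (1, "Bills", 60, 3, "USA"), (2, "Sport", 10, 1, "MEX"),
    (3, "Grocery", 50, 8, "CAN"), (2, "Grocery", 70, 4, "MEX"), (3, "Sport", 20, 5, "CAN"),
    (3, "Bills", 30, 2, "CAN"), (1, "Petrol", 60, 5, "USA"),
])

# One SUM(CASE ...) per output column; ELSE 0 yields 0 for missing categories.
pivoted = conn.execute("""
    SELECT user_id,
           SUM(CASE WHEN category = 'Sport'   THEN spend ELSE 0 END) AS Sport_Spend,
           SUM(CASE WHEN category = 'Bills'   THEN spend ELSE 0 END) AS Bills_Spend,
           SUM(CASE WHEN category = 'Grocery' THEN spend ELSE 0 END) AS Grocery_Spend,
           SUM(CASE WHEN category = 'Petrol'  THEN spend ELSE 0 END) AS Petrol_Spend,
           SUM(CASE WHEN category = 'Sport'   THEN transactions ELSE 0 END) AS Sport_Transactions,
           SUM(CASE WHEN category = 'Bills'   THEN transactions ELSE 0 END) AS Bills_Transactions,
           SUM(CASE WHEN category = 'Grocery' THEN transactions ELSE 0 END) AS Grocery_Transactions,
           SUM(CASE WHEN category = 'Petrol'  THEN transactions ELSE 0 END) AS Petrol_Transactions,
           MAX(country) AS Country
    FROM spending
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

for row in pivoted:
    print(row)
```

Unlike the LEFT JOIN version, this still gives one row per user when a user has several rows in the same category, because the values are summed.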

Need to find the count of user who belongs to different depts

I have a table with dept, user, and so on, and I need to count the users that belong to each combination of departments.
Let's say I have a table like this:
dept user
1 33
1 33
1 45
2 11
2 12
3 33
3 15
Then I find the unique user and dept combinations, something like this:
select distinct dept,user from x;
which gives me a result like:
Dept user
1 33
1 45
2 11
2 12
3 33
3 15
which removes the duplicate combinations.
Here is what I need to do.
My output should look like this:
dep_1_1 dep_1_2 dep_1_3 dep_2_2 dep_2_1 dep_2_3 Dep_3_1 Dep_3_2 Dep_3_3
2 0 1 2 0 0 1 0 2
So, basically, I need to find the count of common users between all combinations of departments.
Thanks for the help.

You can get a row for each department combination using a self-join of your Distinct Select:
with cte as
(
select distinct dept,user from x
)
select t1.dept, t2.dept, count(*)
from cte as t1 join cte as t2
on t1.user = t2.user -- same user
and t1.dept < t2.dept -- different department
group by t1.dept, t2.dept
order by t1.dept, t2.dept
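
The self-join above returns only pairs of different departments that share at least one user. If the zero counts and the same-department pairs from the question are also wanted, one option is to cross join the department list with itself and left join the matching users. This is a sketch under assumptions: Python's built-in sqlite3 module as the engine, a hypothetical table named memberships, and the column renamed to user_id because user is a reserved word in some databases.

```python
import sqlite3

# Assumption: in-memory SQLite; "memberships" and "user_id" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memberships (dept INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO memberships VALUES (?, ?)",
    [(1, 33), (1, 33), (1, 45), (2, 11), (2, 12), (3, 33), (3, 15)],
)

# Every ordered pair of departments, with the number of users found in both.
# COUNT(p2.user_id) ignores the NULLs produced by the LEFT JOIN, so pairs
# without a common user are reported as 0.
pair_counts = conn.execute("""
    WITH pairs AS (SELECT DISTINCT dept, user_id FROM memberships),
         depts AS (SELECT DISTINCT dept FROM memberships)
    SELECT d1.dept, d2.dept, COUNT(p2.user_id)
    FROM depts d1
    CROSS JOIN depts d2
    JOIN pairs p1 ON p1.dept = d1.dept
    LEFT JOIN pairs p2 ON p2.dept = d2.dept AND p2.user_id = p1.user_id
    GROUP BY d1.dept, d2.dept
    ORDER BY d1.dept, d2.dept
""").fetchall()

for dept_a, dept_b, common_users in pair_counts:
    print(f"dep_{dept_a}_{dept_b} = {common_users}")
```

This produces one row per pair rather than one wide row; turning the pairs into columns would still need a pivot step.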

SQL query: same rows

I'm having trouble finding the right sql query. I want to select all the rows with a unique x value and if there are rows with the same x value, then I want to select the row with the greatest y value. As an example I've put a part of my database below.
ID x y
1 2 3
2 1 5
3 4 6
4 4 7
5 2 6
The selected rows should then be those with ID 2, 4 and 5.
This is what I've got so far
SELECT *
FROM base
WHERE x IN
(
SELECT x
FROM base
HAVING COUNT(*) > 1
)
But this only results in the rows that occur more than once. I've added the tags R, postgresql and sqldf because I'm working in R with those packages.

Here is a typical way to formulate the query in ANSI SQL:
select b.*
from base b
where not exists (select 1
from base b2
where b2.x = b.x and
b2.y > b.y
);
In Postgres, you would use distinct on for performance:
select distinct on (x) b.*
from base b
order by x, y desc;
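
To check the not exists formulation against the question's sample rows without a Postgres server, here is a small sketch using Python's built-in sqlite3 module (an assumption made for illustration; distinct on is Postgres-only and is not exercised here):

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE base (id INTEGER, x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO base VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 1, 5), (3, 4, 6), (4, 4, 7), (5, 2, 6)],
)

# Keep a row only if no other row with the same x has a greater y.
winners = conn.execute("""
    SELECT b.id, b.x, b.y
    FROM base b
    WHERE NOT EXISTS (SELECT 1
                      FROM base b2
                      WHERE b2.x = b.x AND b2.y > b.y)
    ORDER BY b.id
""").fetchall()

for row in winners:
    print(row)
```

Rows that tie on the greatest y for the same x would all be returned by this form.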

You could try this query:
select x, max(y) from base group by x;
And, if you'd also like the id column in the result:
select base.*
from base join (select x, max(y) as max_y from base group by x) as maxima
on (base.x = maxima.x and base.y = maxima.max_y);

Example:
CREATE TABLE tmp(id int, x int ,y int);
INSERT INTO .....
test=# SELECT x, max(y) AS y FROM tmp GROUP BY x;
x | y
---+---
4 | 7
1 | 5
2 | 6

How to declare a row as a Alternate Row

id Name claim priority
1 yatin 70 5
6 yatin 1 10
2 hiren 30 3
3 pankaj 40 2
4 kavin 50 1
5 jigo 10 4
7 jigo 1 10
This is my table, and I want to arrange it as shown below:
id Name claim priority AlternateFlag
1 yatin 70 5 0
6 yatin 1 10 0
2 hiren 30 3 1
3 pankaj 40 2 0
4 kavin 50 1 1
5 jigo 10 4 0
7 jigo 1 10 0
It is sorted into alternating groups of rows that share the same name.
I am using SQL Server 2005. The alternate flag starts at '0'. In my example, the first record has the name "yatin", so its AlternateFlag is '0'.
The second record has the same name, "yatin", so its AlternateFlag is also '0'.
The third record, with the name "hiren", is a single record, so it is assigned '1'.
In short, I want to identify alternating groups of rows with the same name.
I hope you understand my problem.
Thanks in advance.

Try
SELECT t.*, f.AlternateFlag
FROM tbl t
JOIN (
SELECT [name],
AlternateFlag = ~CAST(ROW_NUMBER() OVER(ORDER BY MIN(ID)) % 2 AS BIT)
FROM tbl
GROUP BY name
) f ON f.name = t.name

You could probably use the aggregate function COUNT() with HAVING and then UNION ALL the two result sets, like:
SELECT id, A.Name, Claim, Priority, 0 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) > 1 ) A
ON YourTable.Name = A.Name
UNION ALL
SELECT id, B.Name, Claim, Priority, 1 as AlternateFlag
FROM YourTable
INNER JOIN (
SELECT Name, COUNT(*) as NameCount
FROM YourTable
GROUP BY Name
HAVING COUNT(*) = 1 ) B
ON YourTable.Name = B.Name
Note that this assumes names are unique per person: a name such as Yatin, although it appears twice, belongs to only one person.

You can use the ROW_NUMBER() function with OVER to number the groups, then take the remainder of integer division by 2, so you get 1s and 0s in your SELECT or in a view.
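
That idea can be sketched as follows: number the name groups in order of their smallest id, then take that number modulo 2. The sketch assumes Python's built-in sqlite3 module as a stand-in engine (window functions need SQLite 3.25 or later) and a hypothetical table named claims; ROW_NUMBER() itself is also available in SQL Server 2005.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+); "claims" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (id INTEGER, name TEXT, claim INTEGER, priority INTEGER)")
conn.executemany("INSERT INTO claims VALUES (?, ?, ?, ?)", [
    (1, "yatin", 70, 5), (6, "yatin", 1, 10), (2, "hiren", 30, 3), (3, "pankaj", 40, 2),
    (4, "kavin", 50, 1), (5, "jigo", 10, 4), (7, "jigo", 1, 10),
])

# Inner query: one row per name with its smallest id.
# Middle query: number those groups 1, 2, 3, ... and map them to 0, 1, 0, ...
# Outer query: attach the flag to every row of the group.
flagged = conn.execute("""
    SELECT t.id, t.name, t.claim, t.priority, g.alternate_flag
    FROM claims t
    JOIN (SELECT name,
                 first_id,
                 (ROW_NUMBER() OVER (ORDER BY first_id) - 1) % 2 AS alternate_flag
          FROM (SELECT name, MIN(id) AS first_id
                FROM claims
                GROUP BY name)) g
      ON g.name = t.name
    ORDER BY g.first_id, t.id
""").fetchall()

for row in flagged:
    print(row)
```

Subtracting 1 before the modulo makes the first group's flag 0, as the question requires.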

PLSQL or SSRS, How to select having all values in a group?

I have a table like this.
ID NAME VALUE
______________
1 A X
2 A Y
3 A Z
4 B X
5 B Y
6 C X
7 C Z
8 D Z
9 E X
And the query:
SELECT * FROM TABLE1 T WHERE T.VALUE IN ('X','Z')
This query gives me
ID NAME VALUE
______________
1 A X
3 A Z
4 B X
6 C X
7 C Z
8 D Z
9 E X
But I want to see all rows for the names that have all of the parameter values. Only A and C have both X and Z values, so my desired result is:
ID NAME VALUE
______________
1 A X
2 A Y
3 A Z
6 C X
7 C Z
How can I get the desired result? It doesn't matter whether it is done in SQL or in Reporting Services. Maybe a "GROUP BY ... HAVING" clause will help, but I'm not sure.
By the way, I don't know how many params will be in the list.
I really appreciate any help.

The standard approach would be something like
SELECT id, name, value
FROM table1 a
WHERE name IN (SELECT name
FROM table1 b
WHERE b.value in ('X', 'Z')
GROUP BY name
HAVING COUNT(distinct value) = 2)
That would require that you determine how many values are in the list so that you can use a 2 in the HAVING clause if there are 2 elements, a 5 if there are 5 elements, etc. You could also use analytic functions:
SELECT id, name, value
FROM (SELECT id,
name,
value,
count(distinct value) over (partition by name) cnt
FROM table1 t1
WHERE t1.value in ('X', 'Z'))
WHERE cnt = 2

I prefer to structure these "sets-within-sets" queries as an aggregation. I find this the most flexible approach:
select t.*
from t
where t.name in (select name
from t
group by name
having sum(case when value = 'X' then 1 else 0 end) > 0 and
sum(case when value = 'Y' then 1 else 0 end) > 0
)
The subquery for the IN finds all names that have at least one X value and one Y value. Using the same logic, it is easy to adjust for other conditions (X and Y and Z; X and Y but not Z; and so on). The outer query just returns all the rows instead of only the names.
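
Both answers hard-code the list of values. When the number of parameters is not known in advance, the client can build the IN list and pass the list length as the count to compare against. The following is a sketch, assuming Python's built-in sqlite3 module as the engine and that the wanted values contain no duplicates:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT, value TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    (1, "A", "X"), (2, "A", "Y"), (3, "A", "Z"), (4, "B", "X"), (5, "B", "Y"),
    (6, "C", "X"), (7, "C", "Z"), (8, "D", "Z"), (9, "E", "X"),
])

# The parameter list can have any length; it must not contain duplicates,
# because its length is compared with COUNT(DISTINCT value).
wanted = ["X", "Z"]
placeholders = ", ".join("?" for _ in wanted)

# Only "?" markers are interpolated into the SQL text; the values themselves
# are bound as parameters.
query = f"""
    SELECT id, name, value
    FROM table1
    WHERE name IN (SELECT name
                   FROM table1
                   WHERE value IN ({placeholders})
                   GROUP BY name
                   HAVING COUNT(DISTINCT value) = ?)
    ORDER BY id
"""
rows = conn.execute(query, (*wanted, len(wanted))).fetchall()

for row in rows:
    print(row)
```

Because the filter on value sits only in the subquery, the outer query still returns every row of the matching names, including the Y row for A.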
