Sum numbers in group and find the maximum number per group - sql

For each individual 'id', I want to SUM(cc) grouped by 'e_sour'.
I then want to display one row per 'id', showing its maximum SUM(cc) and the matching 'e_sour'.
With the code I have written, I can't get a single row per 'id'; in many cases the result still contains several rows with the same 'id'.
Data I have:
id e_sour cc
1 win 400
1 win 400
1 elec 400
2 win 400
2 win 400
Output data:
id e_sour cc
1 win 800
2 win 800
WITH Input1 AS
(
select id, e_sour, sum(cc) AS total_cc, rm
from Prod
Group by id, e_sour, rm
Having rm = 'Latest'
), Input2 AS
(
Select id, MAX(total_cc) AS max_total_cc, e_sour
From Input1
GROUP BY id, e_sour
), Input3 AS
(
Select id, MAX(max_total_cc) AS max_total_cc2
From Input2
GROUP BY id
)
Select *
from Input3
Inner join
(Select * from Input2) In2 ON Input3.id = In2.id
ORDER BY Input3.id

If I understand the question correctly, you need a ROW_NUMBER():
SELECT id, e_sour, cc
FROM (
SELECT
id, e_sour, SUM(cc) AS cc,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY SUM(cc) DESC) AS Rn
FROM Prod
GROUP BY id, e_sour
) t
WHERE t.Rn = 1
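To try the ROW_NUMBER() idea on the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made only so the example is runnable (window functions need SQLite 3.25 or newer), and the sums are computed in a CTE named totals (a hypothetical name) before ranking, which should be equivalent to the single-level query above.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Prod (id INTEGER, e_sour TEXT, cc INTEGER)")
conn.executemany(
    "INSERT INTO Prod VALUES (?, ?, ?)",
    [(1, "win", 400), (1, "win", 400), (1, "elec", 400),
     (2, "win", 400), (2, "win", 400)],
)

query = """
WITH totals AS (
    -- one row per (id, e_sour) with the summed cc
    SELECT id, e_sour, SUM(cc) AS cc
    FROM Prod
    GROUP BY id, e_sour
)
SELECT id, e_sour, cc
FROM (
    -- rank the sums inside each id, largest first
    SELECT id, e_sour, cc,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY cc DESC) AS rn
    FROM totals
) t
WHERE rn = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Note that ROW_NUMBER() picks an arbitrary row when two sums tie within an id; adding a second ORDER BY key inside the OVER clause makes the choice deterministic.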

Related

select value based on max of other column

I have a few questions about a table I'm trying to make in Postgres.
The following table is my input:
id area count function
1 100 20 living
1 200 30 industry
2 400 10 living
2 400 10 industry
2 400 20 education
3 150 1 industry
3 150 1 education
I want to group by id, summing area and count over the rows of each id, and get the dominant function based on max area. When area is equal, it should be based on max count; when area and count are both equal, it should be based on a priority order of the functions (I still have to decide whether education takes priority over industry or vice versa). So the result should be:
id area count function
1 300 50 industry
2 1200 40 education
3 300 2 industry
I tried a lot of things and maybe it's easy, but I don't get it. Can someone help me get the right SQL?
One method uses row_number() and conditional aggregation:
select id, sum(area) as area, sum(count) as count,
max(function) filter (where seqnum = 1) as function
from (select t.*,
row_number() over (partition by id order by area desc, count desc) as seqnum
from t
) t
group by id;
Another method uses distinct on:
select distinct on (id) id, sum(area) over (partition by id) as area,
sum(count) over (partition by id) as count,
function
from t
order by id, t.area desc, t.count desc;
Use a scalar sub-query for "function".
select t.id, sum(t.area), sum(t.count),
(
select "function"
from the_table
where id = t.id
order by area desc, count desc, "function" desc
limit 1
) as "function"
from the_table as t
group by t.id order by t.id;
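The scalar sub-query approach can be checked against the sample data with a small sketch using Python's built-in sqlite3 module. SQLite stands in for Postgres here purely as an assumption to keep the example self-contained; the table name the_table comes from the answer, and the column names count and function are quoted because they collide with SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE the_table (id INTEGER, area INTEGER, "count" INTEGER, "function" TEXT)'
)
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [(1, 100, 20, "living"), (1, 200, 30, "industry"),
     (2, 400, 10, "living"), (2, 400, 10, "industry"), (2, 400, 20, "education"),
     (3, 150, 1, "industry"), (3, 150, 1, "education")],
)

query = """
SELECT t.id,
       SUM(t.area)    AS area,
       SUM(t."count") AS "count",
       -- dominant function: largest area, then largest count, then name (descending)
       (SELECT t2."function"
        FROM the_table AS t2
        WHERE t2.id = t.id
        ORDER BY t2.area DESC, t2."count" DESC, t2."function" DESC
        LIMIT 1) AS "function"
FROM the_table AS t
GROUP BY t.id
ORDER BY t.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The last ORDER BY key is where the priority between functions is decided; sorting the name in descending order happens to put industry before education, which matches the expected result in the question.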
You can use sum as a window function:
select distinct on (t.id)
id,
sum(area) over (partition by id) as area,
sum(count) over (partition by id) as count,
( select function from tbl_test where tbl_test.id = t.id order by area desc, count desc limit 1 ) as function
from tbl_test t
This is how you get the function for each group based on id:
select yt1.id, yt1.function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null;
(We ensure that no yt2 row exists with the same id but a higher area.)
This works, but several rows may tie for the maximum area while having different functions. To cope with this issue, let's ensure that exactly one is chosen:
select yt1.id, max(yt1.function) as function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null
group by yt1.id;
Now, let's join this to our main table:
select yourtable.id, sum(yourtable.area), sum(yourtable.count), t.function
from yourtable
join (
select yt1.id, max(yt1.function) as function
from yourtable yt1
left join yourtable yt2
on yt1.id = yt2.id and yt1.area < yt2.area
where yt2.id is null
group by yt1.id
) t
on yourtable.id = t.id
group by yourtable.id, t.function;

Group By only for some columns

How do I group only some of the top selected columns in my select query?
A naive but wrong attempt would be this code:
SELECT TOP 5 brand, name, delivered, count(*)
From myTB
Where type = 'jeans'
Group By brand, name
Order By Count(*) DESC
(The above code is wrong and returns an error.)
The result that I'm after is:
Brand name Delivered Count
-------------------------------------
Levis 304 Slim 9/24 44
Croccer 500 Lose 3/14 22
Croccer 400 Botcut 4/7 14
Lee Botcut 33 5/5 16
Lee Slim 44 10/7 12
In the above results, the rows of each brand appear one after another even though the count is then not strictly descending.
The closest I have got is with this code:
SELECT TOP 5 brand, name, delivered, count(*)
From myTB
Where type = 'jeans'
Group By brand, name, delivered
Order By Count(*) DESC
But that returns the data like this:
Brand name Delivered Count
-------------------------------------
Levis 304 Slim 9/24 44
Croccer 500 Lose 3/14 22
Lee Botcut 33 5/5 16
Croccer 400 Botcut 4/7 14
Lee Slim 44 10/7 12
If I try to use "order by count(*), brand", I get, for some reason, the brands in descending order regardless of the count value. It seems like it only orders by the brand column and not by both brand and count.
I also tried a left join on the same table so that I only needed to Group By in the primary table, but that's not right either, and the code I came up with was really confusing, so I'm going to leave it out of this thread.
It seems like you want to order by the maximum count per brand first and the brand second.
select top 5 t1.* from (
select brand, name, delivered, count(*) as cnt
from myTB
where type = 'jeans'
group by brand, name, delivered
) t1 join (
select brand, cnt
from (
select brand, cnt,
row_number() over (partition by brand order by cnt desc) rn
from (select brand, count(*) cnt from myTB where type = 'jeans' group by brand, name, delivered) t1
) t1
where rn = 1
) t2 on t1.brand = t2.brand
order by t2.cnt desc, t2.brand, t1.cnt desc
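The same ordering (largest count per brand first, then brand, then the row's own count) can be sketched with Python's built-in sqlite3 module. This is a variant under stated assumptions, not the query above: SQLite replaces SQL Server, so LIMIT is used instead of TOP; MAX(cnt) OVER (PARTITION BY brand) replaces the join on row_number(); and the input rows are generated only to reproduce the counts shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTB (brand TEXT, name TEXT, delivered TEXT, type TEXT)")

# Hypothetical input: repeat each (brand, name, delivered) so the counts match the question.
counts = [("Levis", "304 Slim", "9/24", 44), ("Croccer", "500 Lose", "3/14", 22),
          ("Croccer", "400 Botcut", "4/7", 14), ("Lee", "Botcut 33", "5/5", 16),
          ("Lee", "Slim 44", "10/7", 12)]
conn.executemany(
    "INSERT INTO myTB VALUES (?, ?, ?, 'jeans')",
    [(b, n, d) for b, n, d, c in counts for _ in range(c)],
)

query = """
SELECT brand, name, delivered, cnt
FROM (
    SELECT brand, name, delivered, cnt,
           MAX(cnt) OVER (PARTITION BY brand) AS brand_max  -- best count within the brand
    FROM (
        SELECT brand, name, delivered, COUNT(*) AS cnt
        FROM myTB
        WHERE type = 'jeans'
        GROUP BY brand, name, delivered
    ) AS g
) AS r
ORDER BY brand_max DESC, brand, cnt DESC
LIMIT 5
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Sorting on brand_max first is what keeps the rows of one brand together, while the final cnt DESC key orders the rows inside each brand.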
Try this:
select TOP 5 t1.* from (SELECT brand, name, delivered, count(*) as test
From myTB
Where type = 'jeans'
Group By brand, name,delivered
) as t1 order by t1.test desc

Which row has the highest value?

I have a table of election results for multiple nominees and polls. I need to determine which nominee had the most votes for each poll.
Here's a sample of the data in the table:
PollID NomineeID Votes
1 1 108
1 2 145
1 3 4
2 1 10
2 2 41
2 3 0
I'd appreciate any suggestions or help anyone can offer me.
This will match the highest, and will also bring back ties.
select sd.*
from sampleData sd
inner join (
select PollID, max(votes) as MaxVotes
from sampleData
group by PollID
) x on
sd.PollID = x.PollID and
sd.Votes = x.MaxVotes
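A quick way to see the join against the per-poll maximum in action is the sketch below, which loads the sample rows into an in-memory SQLite database via Python's sqlite3 module. SQLite and the added ORDER BY are assumptions made for a runnable, deterministic example; the table name sampleData is taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sampleData (PollID INTEGER, NomineeID INTEGER, Votes INTEGER)")
conn.executemany(
    "INSERT INTO sampleData VALUES (?, ?, ?)",
    [(1, 1, 108), (1, 2, 145), (1, 3, 4),
     (2, 1, 10), (2, 2, 41), (2, 3, 0)],
)

query = """
SELECT sd.PollID, sd.NomineeID, sd.Votes
FROM sampleData sd
INNER JOIN (
    -- highest vote count in each poll
    SELECT PollID, MAX(Votes) AS MaxVotes
    FROM sampleData
    GROUP BY PollID
) x ON sd.PollID = x.PollID AND sd.Votes = x.MaxVotes
ORDER BY sd.PollID, sd.NomineeID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

If two nominees shared the top vote count in a poll, both rows would satisfy the join condition and both would be returned.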
SELECT
t.NomineeID,
t.PollID
FROM
( SELECT
NomineeID,
PollID,
RANK() OVER (PARTITION BY i.PollID ORDER BY i.Votes DESC) AS Rank
FROM SampleData i) t
WHERE
t.Rank = 1
SELECT ABB2.PollID, ABB2.NomineeID, ABB2.Votes
FROM
table AS ABB2
JOIN
(SELECT PollID, MAX(Votes) AS most_votes
FROM table
GROUP BY PollID) AS ABB1 ON ABB1.PollID = ABB2.PollID AND ABB1.most_votes = ABB2.Votes
Please note that if 2 nominees have the same highest number of votes for the same poll, they'll both be returned by this query.
select Pollid, Nomineeid, Votes from Poll_table p
where Votes = (
select max(Votes) from Poll_table p2
where p2.Pollid = p.Pollid
);

Get the most frequently occurring value in multiple columns of a table

I have a table which contains three columns: Work, Cost, Duration. I need to get the most
frequently occurring value in each of the three columns. If two values occur the same number
of times, return the larger of the two. Please see the sample data and result below.
Work Cost Duration
5 2 6
5 8 7
6 8 7
2 2 2
6 2 6
I need to get the result as
Work Cost Duration
6 2 7
I tried the following query, but it returns the value for only one column, and it also returns the count for every value:
select Duration, count(*) as "DurationCount" from SimulationResult
group by Duration
order by count(*) desc,Duration desc
You can do something like this:
select * from
(select top 1 Work from SimulationResult
group by Work
order by count(*) desc, Work desc) w,
(select top 1 Cost from SimulationResult
group by Cost
order by count(*) desc, Cost desc) c,
(select top 1 Duration from SimulationResult
group by Duration
order by count(*) desc, Duration desc) d
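The "order by frequency, then by value, and keep the first row" idea can be tried on the sample data with the sketch below. It assumes SQLite through Python's sqlite3 module, so LIMIT 1 takes the place of TOP 1, and the three picks are written as scalar sub-queries rather than joined derived tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SimulationResult (Work INTEGER, Cost INTEGER, Duration INTEGER)")
conn.executemany(
    "INSERT INTO SimulationResult VALUES (?, ?, ?)",
    [(5, 2, 6), (5, 8, 7), (6, 8, 7), (2, 2, 2), (6, 2, 6)],
)

query = """
SELECT
    -- most frequent value per column; ties go to the larger value
    (SELECT Work FROM SimulationResult
     GROUP BY Work ORDER BY COUNT(*) DESC, Work DESC LIMIT 1) AS Work,
    (SELECT Cost FROM SimulationResult
     GROUP BY Cost ORDER BY COUNT(*) DESC, Cost DESC LIMIT 1) AS Cost,
    (SELECT Duration FROM SimulationResult
     GROUP BY Duration ORDER BY COUNT(*) DESC, Duration DESC LIMIT 1) AS Duration
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each column is resolved independently, so the three values in the result row do not have to come from the same input row.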
Try the following (a, b and c stand for the three columns, #tab for the table):
select max(t1.a), max(t2.b), max(t3.c)
from
(select a from (
select a, count(a) counta, max(count(a)) over () maxcounta
from #tab
group by a) tempa
where counta = maxcounta) t1,
(select b from (
select b, count(b) countb, max(count(b)) over () maxcountb
from #tab
group by b) tempb
where countb = maxcountb) t2,
(select c from (
select c, count(c) countc, max(count(c)) over () maxcountc
from #tab
group by c) tempc
where countc = maxcountc) t3

Group By Retrieve 4 Values

I have the following query
SELECT Cod ,
MIN(Id) AS id_Min,
-- retrieve value min in the middle as id_Min_Middle,
-- retrieve value max in the middle as id_Max_Middle,
MAX(Id) AS id_Max,
COUNT(*) AS Tot
FROM Table a ( NOLOCK )
GROUP BY Cod
HAVING COUNT(*)=4
How could I retrieve the values between min and max as I have done for min and max?
If I use SUM(Id) - (MIN(Id) + MAX(Id)) I get the sum of the two middle values, but not the individual values I want.
EXAMPLES
Cod | Id
Stack 10
Stack 15
Stack 11
Stack 40
Overflow 1
Overflow 120
Overflow 15
Overflow 100
Required output
Cod | Min | Min_In_The_Middle | Max_In_The_Middle | Max
Stack 10 11 15 40
Overflow 1 15 100 120
This needs just one table (or clustered index) scan:
SELECT pvt.Cod,
pvt.[1] AS MinValue,
pvt.[2] AS MinInterValue,
pvt.[3] AS MaxInterValue,
pvt.[4] AS MaxValue
FROM
(
SELECT x.Cod, x.ID, x.RowNumAsc
FROM
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY t.Cod ORDER BY t.ID ASC) RowNumAsc,
ROW_NUMBER() OVER(PARTITION BY t.Cod ORDER BY t.ID DESC) RowNumDesc
FROM MyTable t
) x
WHERE x.RowNumAsc = 1 AND x.RowNumDesc = 4
OR x.RowNumAsc = 2 AND x.RowNumDesc = 3
OR x.RowNumAsc = 3 AND x.RowNumDesc = 2
OR x.RowNumAsc = 4 AND x.RowNumDesc = 1
) y
PIVOT ( MAX(y.ID) FOR y.RowNumAsc IN ([1], [2], [3], [4]) ) pvt;
Try using this, best of luck
WITH temp AS
(SELECT cod, MIN (ID) min_id, MAX (ID) max_id
FROM tab
GROUP BY cod
HAVING COUNT (ID) = 4)
SELECT temp.cod, temp.min_id,
(SELECT MIN (ID)
FROM tab
WHERE cod = temp.cod AND ID NOT IN (temp.min_id)
GROUP BY cod) min_mid_id,
(SELECT MAX (ID)
FROM tab
WHERE cod = temp.cod AND ID NOT IN (temp.max_id)
GROUP BY cod) max_mid_id, temp.max_id
FROM temp;
I'm not sure what it means for your question to be tagged plsql and sql-server. But I'll assume you're working with a database system that supports CTEs and window functions.
To generalize what you've been trying to do, first assign row numbers to the rows, then use whatever technique you want to achieve the pivot:
;WITH OrderedValues as (
SELECT Cod,Id,ROW_NUMBER() OVER (PARTITION BY Cod ORDER BY Id) as rn,
COUNT(*) OVER (PARTITION BY Cod) as Cnt
FROM Table (NOLOCK)
), With4Values as (
SELECT * from OrderedValues where Cnt=4
)
SELECT Cod,
--However you want to do the pivot. Here I'll use MAX/CASE
MAX(CASE WHEN rn=1 THEN Id END) as Value1,
MAX(CASE WHEN rn=2 THEN Id END) as Value2,
MAX(CASE WHEN rn=3 THEN Id END) as Value3,
MAX(CASE WHEN rn=4 THEN Id END) as Value4
FROM
With4Values
GROUP BY
Cod
You can hopefully see that this is more easily extended to more columns than answering your overly specific questions about 3 rows, or 4 rows. But if you need to deal with an arbitrary number of columns, you'll have to switch to dynamic SQL.
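The row-number-then-pivot approach can be sketched against the example data as follows. The sketch assumes SQLite (3.25 or newer for window functions) driven from Python's sqlite3 module, drops the SQL Server NOLOCK hint, and uses MyTable as a placeholder table name; an extra cod with only two rows is added as a hypothetical check that groups without exactly four values are filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Cod TEXT, Id INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("Stack", 10), ("Stack", 15), ("Stack", 11), ("Stack", 40),
     ("Overflow", 1), ("Overflow", 120), ("Overflow", 15), ("Overflow", 100),
     ("Other", 7), ("Other", 8)],  # hypothetical group with only 2 rows
)

query = """
WITH OrderedValues AS (
    SELECT Cod, Id,
           ROW_NUMBER() OVER (PARTITION BY Cod ORDER BY Id) AS rn,  -- position within the cod
           COUNT(*)     OVER (PARTITION BY Cod)             AS Cnt  -- group size
    FROM MyTable
)
SELECT Cod,
       MAX(CASE WHEN rn = 1 THEN Id END) AS Value1,
       MAX(CASE WHEN rn = 2 THEN Id END) AS Value2,
       MAX(CASE WHEN rn = 3 THEN Id END) AS Value3,
       MAX(CASE WHEN rn = 4 THEN Id END) AS Value4
FROM OrderedValues
WHERE Cnt = 4
GROUP BY Cod
ORDER BY Cod
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Each CASE expression yields the Id only for the row at that position and NULL otherwise, so MAX collapses the group to the single non-NULL value per output column.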
I understand you want to exclude the extreme values and find min and max for the rest.
This is what I think of, but I had no chance to run and test it...
WITH Extremes AS ( SELECT Cod, MAX(ID) AS Id_Max, MIN(ID) AS Id_Min
FROM [Table] a GROUP BY Cod)
SELECT
e.Cod,
e.Id_Min,
MIN(a.Id) AS id_Min_Middle,
MAX(a.Id) AS id_Max_Middle,
e.Id_Max
FROM Extremes e
LEFT JOIN [Table] a ON a.Cod = e.Cod AND a.Id > e.Id_Min AND a.Id < e.Id_Max
GROUP BY e.Cod, e.Id_Min, e.Id_Max