Oracle SQL Group by and sum with multiple conditions

I attached a capture of two tables:
- the left table is the result of another SELECT query
- the right table is the result I want to derive from the left table
The right table is built according to the following conditions:
When all energy values for a Unit are positive, or all are negative,
its rows remain unchanged.
When a Unit has both positive and negative energy values:
Sum all Energy values for that Unit (-50 + 15 + 20 = -15), then find the row with the maximum absolute Energy value, e.g. max(abs(energy)) = 50, and take the price from that row.
I use Oracle SQL.
I really appreciate the help in this matter!
http://sqlfiddle.com/#!4/eb85a/12

This returns desired result:
signs CTE finds out whether there are positive/negative values, as well as maximum ABS energy value
then, there's union of two selects: one that returns "original" rows (if count of distinct signs is 1), and one that returns "calculated" values, as you described
with
signs as
  (select unit,
          count(distinct sign(energy)) cnt,
          max(abs(energy)) max_abs_ene
   from tab
   group by unit
  )
select t.unit, t.price, t.energy
from tab t join signs s on t.unit = s.unit
where s.cnt = 1
union all
select t.unit, t2.price, sum(t.energy)
from tab t join signs s on t.unit = s.unit
join tab t2 on t2.unit = s.unit and abs(t2.energy) = s.max_abs_ene
where s.cnt = 2
group by t.unit, t2.price
order by unit;

UNIT                      PRICE     ENERGY
-------------------- ---------- ----------
A                            20        -50
A                            50        -80
B                            13        -15
Though, what do you expect if there was yet another "B" unit row with energy = +50? Then two rows would have the same MAX(ABS(ENERGY)) value.
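The rule from the question can also be checked outside the database. Below is a minimal Python sketch of that rule, not the Oracle query itself. The function name collapse_units is hypothetical, and the prices 7 and 9 for unit B are made up so that the energies match the -50 + 15 + 20 example. Unlike the query above, which returns one row per tied price, this sketch silently keeps the first row when two rows tie on max(abs(energy)).

```python
from collections import defaultdict

def collapse_units(rows):
    """rows: iterable of (unit, price, energy) tuples.

    Units whose energies all share one sign are returned unchanged.
    Units with mixed signs collapse to one row: the summed energy,
    paired with the price of the row that has the largest abs(energy).
    """
    by_unit = defaultdict(list)
    for unit, price, energy in rows:
        by_unit[unit].append((price, energy))

    result = []
    for unit in sorted(by_unit):
        group = by_unit[unit]
        has_pos = any(e > 0 for _, e in group)
        has_neg = any(e < 0 for _, e in group)
        if has_pos and has_neg:
            # max() keeps the first row on a tie (an assumption of this sketch)
            price, _ = max(group, key=lambda pe: abs(pe[1]))
            result.append((unit, price, sum(e for _, e in group)))
        else:
            result.extend((unit, p, e) for p, e in group)
    return result

rows = [
    ("A", 20, -50), ("A", 50, -80),               # all negative: kept as-is
    ("B", 13, -50), ("B", 7, 15), ("B", 9, 20),   # mixed signs: collapsed
]
print(collapse_units(rows))
```

With this sample input the sketch yields the same three rows as the query output shown above.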

A union all might be the simplest solution:
with e as (
select t.*,
max(energy) over (partition by unit) as max_energy,
min(energy) over (partition by unit) as min_energy
from t
)
select unit, price, energy
from e
where (max_energy > 0 and min_energy > 0) or
(max_energy < 0 and min_energy < 0)
union all
select unit,
-- price of the row with the largest abs(energy)
max(price) keep (dense_rank first order by abs(energy) desc),
sum(energy)
from e
where max_energy > 0 and min_energy < 0
group by unit;

Related

Order competitors by multiple conditions

I use a concrete, but hypothetical, example.
Consider a database containing the results of a shooting competition, where each competitor made several series of shots. DB contains 3 tables: Competitors, Series and Shots.
Competitors:
id  name
--  ----
1   A
2   B
Series:
id  competitorId
--  ------------
1   1
2   1
3   1
4   2
5   2
6   2
Shots:
id  serieId  score
--  -------  -----
1   1        8
2   1        8
3   1        8
4   2        10
5   2        7
6   2        6
7   3        10
8   3        8
9   3        6
10  4        8
11  4        8
12  4        7
13  5        7
14  5        10
15  5        7
16  6        7
17  6        10
18  6        7
(DDL with the above statements: dbfiddle)
(DDL with the above statements: dbfiddle)
What I need is to order competitors by multiple conditions, which are:
Total score of all series
Number of center hits (center hit has 10 points score)
The next step to order by is:
Highest score on last serie
Highest score on next to last serie
Highest score on next to next to last serie
...
and so on for the number of series in the competition.
The query that uses the first two order conditions is quite straightforward:
SELECT comp.name,
SUM(shots.score) AS score,
SUM(IIF(shots.score = 10, 1, 0)) AS centerHits
FROM Shots shots
INNER JOIN Series series ON series.id = shots.serieId
INNER JOIN Competitors comp ON comp.id = series.competitorId
GROUP BY comp.name
ORDER BY score DESC, centerHits DESC
It produces following results:
name  score  centerHits
----  -----  ----------
A     71     2
B     71     2
With the 3rd order condition I expect B competitor to be above A, because both have the same total score, the same centerHits and the same score for the last serie (24), but the score of next to last serie of B is 24 while A's score is only 23.
I wonder if it's possible to make a query that uses the third and following order conditions.
You should be able to do this pretty simply, as your requirements can be done with normal aggregation and window functions.
For each level of ordering:
"Total score of all series" can be satisfied by summing all scores.
"Number of center hits (center hit has 10 points score)" can be satisfied with a conditional count.
To order by each series working backwards by date, we can aggregate the total score per series (which we calculate using a window function) using STRING_AGG, ordering the aggregation by date (or id). Then if we order the final query by that aggregation, the later series will be sorted first.
This method allows you to order by an arbitrary number of series, as opposed to the other answer.
It's unclear how you define "later" and "earlier" as you have no date column, but I've used series.id as a proxy for that.
SELECT
comp.name,
SUM(shots.score) as totalScore,
COUNT(CASE WHEN shots.score = 10 THEN 1 END) AS centerHits,
STRING_AGG(NCHAR(shots.MaxScore + 65), ',') WITHIN GROUP (ORDER BY series.id DESC) as AllShots
FROM (
SELECT *,
SUM(shots.score) OVER (PARTITION BY shots.serieID) MaxScore
FROM Shots shots
) shots
INNER JOIN Series series ON series.id = shots.serieId
INNER JOIN Competitors comp ON comp.id = series.competitorId
GROUP BY
comp.id,
comp.name
ORDER BY
totalScore DESC,
centerHits DESC,
AllShots DESC;
Note that when grouping by name, you should also add in the primary key to the GROUP BY as the name might not be unique.
A similar, but slightly more complex query, is to pre-aggregate shots in the derived table. This is likely to perform better than using a window function.
SELECT
comp.name,
SUM(shots.totalScore) as totalScore,
SUM(centerHits) AS centerHits,
STRING_AGG(NCHAR(shots.totalScore + 65), ',') WITHIN GROUP (ORDER BY series.id DESC) as AllShots
FROM (
SELECT
shots.serieId,
SUM(shots.score) as totalScore,
COUNT(CASE WHEN shots.score = 10 THEN 1 END) AS centerHits
FROM Shots shots
GROUP BY
shots.serieId
) shots
INNER JOIN Series series ON series.id = shots.serieId
INNER JOIN Competitors comp ON comp.id = series.competitorId
GROUP BY
comp.id,
comp.name
ORDER BY
totalScore DESC,
centerHits DESC,
AllShots DESC;
db<>fiddle
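To see why ordering by the aggregated per-series scores works, here is a minimal Python sketch of the same idea using the sample data from the question. It is not the SQL Server query: the function name ranking is hypothetical, and a list of series totals (latest first) stands in for the STRING_AGG string. As in the answer, the serie id is assumed to be a proxy for time.

```python
from collections import defaultdict

competitors = {1: "A", 2: "B"}                   # id -> name
series = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2}    # serie id -> competitor id
shots = [                                        # (serieId, score)
    (1, 8), (1, 8), (1, 8), (2, 10), (2, 7), (2, 6), (3, 10), (3, 8), (3, 6),
    (4, 8), (4, 8), (4, 7), (5, 7), (5, 10), (5, 7), (6, 7), (6, 10), (6, 7),
]

def ranking(competitors, series, shots):
    per_serie = defaultdict(int)     # serie id -> total score
    center_hits = defaultdict(int)   # competitor id -> number of 10s
    for serie_id, score in shots:
        per_serie[serie_id] += score
        if score == 10:
            center_hits[series[serie_id]] += 1

    def sort_key(comp_id):
        # Series totals for this competitor, latest serie first.
        totals = [per_serie[s] for s in sorted(series, reverse=True)
                  if series[s] == comp_id]
        # Tuples compare element by element, and lists compare
        # lexicographically, so any number of series is handled.
        return (sum(totals), center_hits[comp_id], totals)

    ordered = sorted(competitors, key=sort_key, reverse=True)
    return [competitors[c] for c in ordered]

print(ranking(competitors, series, shots))
```

Both competitors tie on total score (71) and center hits (2), so the order is decided by the series lists: [24, 24, 23] for B against [24, 23, 24] for A.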
It appears you need a multi-level query, each building on the one prior.
The INNER-MOST query with alias PQ is a simple sum on a per SerieID which gets the total Center Hits and total points for each respective set. Similar to what you had for the counting.
From that, you need to know which series is the latest (most recent) and work your way backwards to the prior and again prior to that. By using the OVER / PARTITION, I am joining to the series table to get the competitor ID and name while I'm at it.
By partitioning the data by competitor and ordering by SerieID descending, ROW_NUMBER() assigns 1 to the most recent serie, 2 to the one before it, and so on. For Competitor A, who had SerieIDs 1, 2 and 3, the "MostRecent" column is 3, 2 and 1 respectively: SerieID 3 gets 1 (the most recent) and SerieID 1 gets 3 (the oldest serie of the competitor).
Similarly for the second competitor B, SerieIDs 4, 5 and 6 become 3, 2, 1 respectively. So now, you have a basis to know what was the latest (1 = most recent), the next to last (2 = next most recent), and next to next to last (3...)
Now that these two parts are all set, I can sum the respective totals and center hits, and I explicitly know which serie is the most recent (1), the second latest (2) and the third from last (3) for sorting. These sums are added to the select list and the order by.
Now, if one competitor has 6 shooting series vs another having 4 series (not that it will happen in a real competition, but to understand the context), the 6 series will have their LATEST as the MostRecent = 1, similarly with 4 series, the 4th series will be MostRecent = 1.
So the final group by at the COMPETITOR level, you can assess all the parts in question.
select
F.Name,
F.CompetitorID,
sum( F.SeriesTotalScore ) TotalScore,
sum( F.CenterHits ) CenterHits,
sum( case when F.MostRecent = 1
then F.SeriesTotalScore else 0 end ) MostRecentScore,
sum( case when F.MostRecent = 2
then F.SeriesTotalScore else 0 end ) SecondToMostRecentScore,
sum( case when F.MostRecent = 3
then F.SeriesTotalScore else 0 end ) ThirdToMostRecentScore
from
( select
c.Name,
Se.CompetitorID,
PQ.SerieId,
PQ.CenterHits,
PQ.SeriesTotalScore,
ROW_NUMBER() OVER( PARTITION BY Se.CompetitorID
order by PQ.SerieId DESC) AS MostRecent
from
( select
s.serieId,
sum( case when s.score = 10 then 1 else 0 end ) as CenterHits,
sum( s.Score ) SeriesTotalScore
from
Shots s
group by
s.SerieID ) PQ
Join Series Se
on PQ.SerieID = se.id
JOIN Competitors c
on Se.CompetitorID = c.id
) F
group by
F.Name,
F.CompetitorID
order by
sum( F.SeriesTotalScore ) desc,
sum( F.CenterHits ) desc,
sum( case when F.MostRecent = 1
then F.SeriesTotalScore else 0 end ) desc,
sum( case when F.MostRecent = 2
then F.SeriesTotalScore else 0 end ) desc,
sum( case when F.MostRecent = 3
then F.SeriesTotalScore else 0 end ) desc

SQL Get closest value to a number

I need to find the closest value to each number in column Divide from the column Quantity and put the value found in the Value column for both Quantities.
Example:
In the column Divide the value of 5166 would be closest to Quantity column value 5000. To keep from using those two values more than once I need to place the value of 5000 in the value column for both numbers, like the example below. Also, is it possible to do this without a loop?
Quantity Divide Rank Value
15500 5166 5 5000
1250 416 5 0
5000 1666 5 5000
12500 4166 4 0
164250 54750 3 0
5250 1750 3 0
6250 2083 3 0
12250 4083 3 0
1750 583 2 0
17000 5666 2 0
2500 833 2 0
11500 3833 2 0
1250 416 1 0
There are a couple of answers here but they both use ctes/complex subqueries. There is a much simpler/faster way by just doing a couple of self joins and a group-by
https://www.db-fiddle.com/f/rM268EYMWuK7yQT3gwSbGE/0
select
min(min.quantity) as minQuantityOverDivide
, t1.divide
, max(max.quantity) as maxQuantityUnderDivide
, case
when
(abs(t1.divide - coalesce(min(min.quantity),0))
<
abs(t1.divide - coalesce(max(max.quantity),0)))
then max(max.quantity)
else min(min.quantity) end as cloestQuantity
from t1
left join (select quantity from t1) min on min.quantity >= t1.divide
left join (select quantity from t1) max on max.quantity < t1.divide
group by
t1.divide
If I understood the requirements, 5166 is not closest to 5000; it is closest to 5250 (a delta of 84, versus 166 for 5000).
The corresponding query, without loops, would be (fiddle here: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=be434e67ba73addba119894a98657f17).
(I added a Value_Rank, as it's not clear whether you want Rank to be kept or recomputed.)
select
Quantity, Divide, Rank, Value,
dense_rank() over(order by Value) as Value_Rank
from
(
select
Quantity, Divide, Rank,
--
case
when abs(Quantity_let_delta) < abs(Quantity_get_delta) then Divide + Quantity_let_delta
else Divide + Quantity_get_delta
end as Value
from
(
select
so.Quantity, so.Divide, so.Rank,
-- There is no LessEqualThan, assume GreaterEqualThan
max(isnull(so_let.Quantity, so_get.Quantity)) - so.Divide as Quantity_let_delta,
-- There is no GreaterEqualThan, assume LessEqualThan
min(isnull(so_get.Quantity, so_let.Quantity)) - so.Divide as Quantity_get_delta
from
SO so
left outer join SO so_let
on so_let.Quantity <= so.Divide
--
left outer join SO so_get
on so_get.Quantity >= so.Divide
group by so.Quantity, so.Divide, so.Rank
) so
) result
Or, if by closest you mean the previous closest (fiddle here: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=b41fb1a3fc11039c7f82926f8816e270).
select
Quantity, Divide, Rank, Value,
dense_rank() over(order by Value) as Value_Rank
from
(
select
so.Quantity, so.Divide, so.Rank,
-- There is no LessEqualThan, assume 0
max(isnull(so_let.Quantity, 0)) as Value
from
SO so
left outer join SO so_let
on so_let.Quantity <= so.Divide
group by so.Quantity, so.Divide, so.Rank
) result
You don't need a loop. Basically, you need to find the lowest difference between each Divide and all the Quantities (first CTE), then use this distance to find the corresponding record (second CTE), and finally join with your initial table to get the converted values (final select).
;with cte as (
select t.Divide, min(abs(t2.Quantity-t.Divide)) as ClosestQuantity
from #t1 as t
cross apply #t1 as t2
group by t.Divide
)
,cte2 as (
select distinct
t.Divide, t2.Quantity
from #t1 as t
cross apply #t1 as t2
where abs(t2.Quantity-t.Divide) = (select ClosestQuantity from cte as c where c.Divide = t.Divide)
)
select t.Quantity, cte2.Quantity as Divide, t.Rank, t.Value
from #t1 as t
left outer join cte2 on t.Divide = cte2.Divide
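The "smallest distance" idea behind these answers can be sketched in a few lines of Python, using the Quantity column from the question. This is only an illustration of the matching rule, not a replacement for the queries; the function name closest_quantity is hypothetical, and breaking ties toward the smaller quantity is an assumption the question does not specify.

```python
def closest_quantity(divide, quantities):
    """Return the quantity with the smallest absolute distance to divide.

    Ties are broken toward the smaller quantity (an assumption).
    """
    return min(quantities, key=lambda q: (abs(q - divide), q))

# Quantity column from the question's table
quantities = [15500, 1250, 5000, 12500, 164250, 5250, 6250,
              12250, 1750, 17000, 2500, 11500, 1250]

for divide in (5166, 1666, 416):
    print(divide, closest_quantity(divide, quantities))
```

For 5166 this picks 5250 (distance 84) rather than 5000 (distance 166), which is the point raised in the second answer.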

Joining next Sequential Row

I am planning an SQL statement right now and would like someone to look over my thoughts.
This is my Table:
id stat period
--- ------- --------
1 10 1/1/2008
2 25 2/1/2008
3 5 3/1/2008
4 15 4/1/2008
5 30 5/1/2008
6 9 6/1/2008
7 22 7/1/2008
8 29 8/1/2008
Create Table
CREATE TABLE tbstats
(
id INT IDENTITY(1, 1) PRIMARY KEY,
stat INT NOT NULL,
period DATETIME NOT NULL
)
go
INSERT INTO tbstats
(stat,period)
SELECT 10,CONVERT(DATETIME, '20080101')
UNION ALL
SELECT 25,CONVERT(DATETIME, '20080102')
UNION ALL
SELECT 5,CONVERT(DATETIME, '20080103')
UNION ALL
SELECT 15,CONVERT(DATETIME, '20080104')
UNION ALL
SELECT 30,CONVERT(DATETIME, '20080105')
UNION ALL
SELECT 9,CONVERT(DATETIME, '20080106')
UNION ALL
SELECT 22,CONVERT(DATETIME, '20080107')
UNION ALL
SELECT 29,CONVERT(DATETIME, '20080108')
go
I want to calculate the difference between each statistic and the next, and then calculate the mean value of the 'gaps.'
Thoughts:
I need to join each record with its subsequent row. I can do that using the ever flexible joining syntax, thanks to the fact that I know the id field is an integer sequence with no gaps.
By aliasing the table I could incorporate it into the SQL query twice, then join them together in a staggered fashion by adding 1 to the id of the first aliased table. The first record in the table has an id of 1. 1 + 1 = 2 so it should join on the row with id of 2 in the second aliased table. And so on.
Now I would simply subtract one from the other.
Then I would use the ABS function to ensure that I always get positive integers as a result of the subtraction regardless of which side of the expression is the higher figure.
Is there an easier way to achieve what I want?
The lead analytic function should do the trick:
SELECT period, stat, stat - LEAD(stat) OVER (ORDER BY period) AS gap
FROM tbstats
The average value of the gaps can be done by calculating the difference between the first value and the last value and dividing by one less than the number of elements:
-- 1.0 * avoids integer division
select 1.0 * sum(case when seqnum = num then stat else - stat end) / (max(num) - 1)
from (select period, stat, row_number() over (order by period) as seqnum,
count(*) over () as num
from tbstats
) t
where seqnum = num or seqnum = 1;
Of course, you can also do the calculation using lead(), but this will also work in SQL Server 2005 and 2008.
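One caveat is worth checking with the sample data: the first-minus-last shortcut gives the mean of the signed gaps, because the intermediate values cancel out, whereas the question asks for ABS to be applied to each gap first. A small Python sketch, with the sign convention "next minus current" chosen here as an assumption, shows that the two means differ:

```python
stats = [10, 25, 5, 15, 30, 9, 22, 29]   # stat column, ordered by period

def gaps(values):
    """Difference between each value and the previous one (next minus current)."""
    return [b - a for a, b in zip(values, values[1:])]

def mean(values):
    return sum(values) / len(values)

signed = gaps(stats)
mean_signed = mean(signed)                    # telescopes to (last - first) / (n - 1)
mean_abs = mean([abs(g) for g in signed])     # what ABS on every gap gives

print(signed)
print(mean_signed, (stats[-1] - stats[0]) / (len(stats) - 1))
print(mean_abs)
```

The signed gaps sum to 29 - 10 = 19, so their mean is 19/7, while the absolute gaps sum to 101, giving 101/7. If the absolute gaps are what you want, the per-row LEAD() or join approach is needed rather than the shortcut.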
You can also achieve this with a join:
SELECT t1.period,
t1.stat,
t1.stat - t2.stat gap
FROM #tbstats t1
LEFT JOIN #tbstats t2
ON t1.id + 1 = t2.id
To calculate the difference between each statistic and the next, LEAD() and LAG() may be the simplest option. You provide an ORDER BY, and LEAD(something) returns the next something and LAG(something) returns the previous something in the given order.
select
x.id thisStatId,
LAG(x.id) OVER (ORDER BY x.id) lastStatId,
x.stat thisStatValue,
LAG(x.stat) OVER (ORDER BY x.id) lastStatValue,
x.stat - LAG(x.stat) OVER (ORDER BY x.id) diff
from tbStats x

How do I aggregate numbers from a string column in SQL

I am dealing with a poorly designed database column which has values like this
ID cid Score
1 1 3 out of 3
2 1 1 out of 5
3 2 3 out of 6
4 3 7 out of 10
I want the aggregate sum and percentage of Score column grouped on cid like this
cid sum percentage
1 4 out of 8 50
2 3 out of 6 50
3 7 out of 10 70
How do I do this?
You can try this way :
select
t.cid
, cast(sum(s.a) as varchar(5)) +
' out of ' +
cast(sum(s.b) as varchar(5)) as sum
, ((cast(sum(s.a) as decimal))/sum(s.b))*100 as percentage
from MyTable t
inner join
(select
id
-- take everything before the first space, so multi-digit scores also work
, cast(left(score, charindex(' ', score) - 1) as int) a
, cast(substring(score,charindex('out of', score)+7,len(score)) as int) b
from MyTable
) s on s.id = t.id
group by t.cid
[SQLFiddle Demo]
Redesign the table, but on-the-fly as a CTE. Here's a solution that's not as short as you could make it, but that takes advantage of the handy SQL Server function PARSENAME. You may need to tweak the percentage calculation if you want to truncate rather than round, or if you want it to be a decimal value, not an int.
In this or most any solution, you have to count on the column values for Score to be in the very specific format you show. If you have the slightest doubt, you should run some other checks so you don't miss or misinterpret anything.
with
P(ID, cid, Score2Parse) as (
select
ID,
cid,
replace(Score,space(1),'.')
from scores
),
S(ID,cid,pts,tot) as (
select
ID,
cid,
cast(parsename(Score2Parse,4) as int),
cast(parsename(Score2Parse,1) as int)
from P
)
select
cid, cast(round(100e0*sum(pts)/sum(tot),0) as int) as percentage
from S
group by cid;
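For comparison, the parse-then-aggregate logic can be sketched in Python with the sample rows from the question. This is an illustration under the same assumption as the answers, namely that every Score value has exactly the form "N out of M"; the names SCORE_RE and aggregate_scores are hypothetical, and the sketch rejects any value that does not match instead of guessing.

```python
import re
from collections import defaultdict

# Strict pattern: "<points> out of <total>"
SCORE_RE = re.compile(r"^\s*(\d+)\s+out\s+of\s+(\d+)\s*$")

def aggregate_scores(rows):
    """rows: iterable of (cid, score_text). Returns (cid, sum_text, percentage)."""
    totals = defaultdict(lambda: [0, 0])   # cid -> [points, total]
    for cid, score in rows:
        match = SCORE_RE.match(score)
        if match is None:
            raise ValueError(f"unexpected Score format: {score!r}")
        totals[cid][0] += int(match.group(1))
        totals[cid][1] += int(match.group(2))
    return [
        (cid, f"{pts} out of {tot}", round(100 * pts / tot))
        for cid, (pts, tot) in sorted(totals.items())
    ]

rows = [(1, "3 out of 3"), (1, "1 out of 5"), (2, "3 out of 6"), (3, "7 out of 10")]
for line in aggregate_scores(rows):
    print(line)
```

Failing loudly on a malformed value is a deliberate choice here, in line with the advice above to validate the column before trusting the result.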

How to SELECT top N rows that sum to a certain amount?

Suppose:
MyTable
--
Amount
1
2
3
4
5
MyTable only has one column, Amount, with 5 rows. They are not necessarily in increasing order.
How can I create a function, which takes a @SUM INT, and returns the TOP N rows that sum to this amount?
So for input 6, I want
Amount
1
2
3
Since 1 + 2 + 3 = 6. 2 + 4 / 1 + 5 won't work since I want TOP N ROWS
For 7/8/9/10, I want
Amount
1
2
3
4
I'm using MS SQL Server 2008 R2, if this matters.
Saying "top N rows" is indeed ambiguous when it comes to relational databases.
I assume that you want to order by "amount" ascending.
I would add a second column (to a table or view) like "sum_up_to_here", and create something like that:
create view mytable_view as
select
mt1.amount,
coalesce(sum(mt2.amount), 0) as sum_up_to_here
from
mytable mt1
left join mytable mt2 on (mt2.amount < mt1.amount)
group by mt1.amount
or:
create view mytable_view as
select
mt1.amount,
(select coalesce(sum(amount), 0) from mytable where amount < mt1.amount) as sum_up_to_here
from mytable mt1
and then I would select the final rows:
select amount from mytable_view where sum_up_to_here < (some value)
If you are not concerned about performance, you may of course run it in one query:
select amount from
(
select
mt1.amount,
coalesce(sum(mt2.amount), 0) as sum_up_to_here
from
mytable mt1
left join mytable mt2 on (mt2.amount < mt1.amount)
group by mt1.amount
) t where sum_up_to_here < 20
One approach:
select t1.amount
from MyTable t1
left join MyTable t2 on t1.amount > t2.amount
group by t1.amount
having coalesce(sum(t2.amount),0) < 7
SQLFiddle here.
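The condition these joins express is "keep a row while the sum of the rows before it is still below the target". A minimal Python sketch of that rule, assuming ascending order by amount as in the answers (the function name top_rows_reaching is hypothetical):

```python
def top_rows_reaching(amounts, target):
    """Return the smallest amounts, in ascending order, taken one by one
    until their running sum reaches or exceeds target."""
    result = []
    running = 0
    for amount in sorted(amounts):
        if running >= target:
            break          # the rows kept so far already reach the target
        result.append(amount)
        running += amount
    return result

amounts = [3, 1, 5, 2, 4]   # not necessarily stored in increasing order
print(top_rows_reaching(amounts, 6))
print(top_rows_reaching(amounts, 7))
```

This reproduces the two cases from the question: 1, 2, 3 for an input of 6, and 1, 2, 3, 4 for inputs 7 through 10. Note that the sketch treats duplicate amounts as separate rows, while the self-join queries group them by amount.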
In SQL Server you can use CTEs (common table expressions) to make it pretty simple to read.
Here is a CTE I wrote to sum up totals used in sequence. The CTE is similar to the joins above, and holds the total up to any given index. Outside of the CTE I join it back to the original table so I can select it along with other fields.
;with summrp as (
select m1.idx, sum(m2.QtyReq) as sumUsed
from #mrpe m1
join #mrpe m2 on m2.idx <= m1.idx
group by m1.idx
)
select RefNum, RefLineSuf, QtyReq, ProjectedDate, sumUsed from #mrpe m
join summrp on summrp.idx=m.idx
In SQL Server 2012 you can use this shortcut to get a result like Grzegorz's.
SELECT amount
FROM (
SELECT * ,
SUM(amount) OVER (ORDER BY amount ASC) AS total
from demo
) T
WHERE total <= 6
A fiddle in the hand... http://sqlfiddle.com/#!6/b8506/6