SQL Create a column with conditions from a sub query

I have a table with ID, HEIGHT, LENGTH and WIDTH. I need to find the max measure of every row and then create a surcharge column: $5 if the biggest measure is between 22 and 30, and $8 if it is >30. The first part is working fine:
select id, max(measure) as max_measure
from (
select id, height as measure from table1
union
select id, length as measure from table1
union
select id, width as measure from table1
) m
group by id
But I can't get the second part to work. It should be a subquery that uses the results from the first part, looking roughly like this:
select surcharge where
m.max_measure >= 22 and m.max_measure <30
m.max_measure>= 30

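select id,
max(measure) as max_measure,
case when max(measure) >= 30 then 8
when max(measure) >= 22 then 5
else 0 end as surcharge
from (select id, height as measure from table1
union select id, length as measure from table1
union select id, width as measure from table1) m
group by id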
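If your database has a GREATEST function (Oracle, MySQL and PostgreSQL do; the question does not name a platform, so this is an assumption), the union can probably be skipped altogether. A minimal sketch against the same table1:

```sql
-- Sketch only: assumes GREATEST is available and that height, length and
-- width are never NULL (NULL handling of GREATEST differs between databases).
select id,
       greatest(height, length, width) as max_measure,
       case when greatest(height, length, width) >= 30 then 8
            when greatest(height, length, width) >= 22 then 5
            else 0
       end as surcharge
from table1;
```

This reads each row once instead of three times and needs no group by, because the maximum is taken across the columns of a single row.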

Related

Oracle SQL Group by and sum with multiple conditions

I attached a capture of two tables:
- the left table is the result of another SELECT query
- the right table is the result I want to derive from the left table
The right table can be created according to the following conditions:
When all rows for the same Unit have only positive or only negative energy values, the rows remain unchanged.
When the same Unit has both positive and negative energy values:
Sum all Energy values for that Unit (e.g. -50+15+20 = -15), then find the maximum absolute Energy value (e.g. max(abs(energy)) = 50) and take the price from that row.
I use Oracle SQL.
I really appreciate any help with this!
http://sqlfiddle.com/#!4/eb85a/12
This returns the desired result:
- the signs CTE finds out whether a unit has both positive and negative values, as well as its maximum ABS energy value
- then there's a union of two selects: one that returns the "original" rows (if the count of distinct signs is 1), and one that returns the "calculated" values, as you described
SQL> with
signs as
(select unit,
count(distinct sign(energy)) cnt,
max(abs(energy)) max_abs_ene
from tab
group by unit
)
select t.unit, t.price, t.energy
from tab t join signs s on t.unit = s.unit
where s.cnt = 1
union all
select t.unit, t2.price, sum(t.energy)
from tab t join signs s on t.unit = s.unit
join tab t2 on t2.unit = s.unit and abs(t2.energy) = s.max_abs_ene
where s.cnt = 2
group by t.unit, t2.price
order by unit;
UNIT PRICE ENERGY
-------------------- ---------- ----------
A 20 -50
A 50 -80
B 13 -15
SQL>
Though, what do you expect if there was yet another "B" unit row with energy = +50? Then two rows would have the same MAX(ABS(ENERGY)) value.
A union all might be the simplest solution:
with e as (
select t.*,
max(energy) over (partition by unit) as max_energy,
min(energy) over (partition by unit) as min_energy
from tab t
)
select unit, price, energy
from e
where (max_energy > 0 and min_energy > 0) or
(max_energy < 0 and min_energy < 0)
union all
select unit,
max(price) keep (dense_rank first order by abs(energy) desc),
sum(energy)
from e
where max_energy > 0 and min_energy < 0
group by unit;

SQL query Splitting a column into Multiple rows divide by percentage

How to get percentage of a column and then inserting it as rows
Col1 item TotalAmount
1 ABC 5558767.82
2 ABC 4747605.5
3 ABC 667377.69
4 ABC 3844204
6 CTB 100
7 CTB 500.52
I need to create a new percentage column for each item, which I have done as follows:
Select item, (totalAmount/select sum(totalAmount) from table1) as Percentage
From table1
Group by item
Col1 item TotalAmount percentage
1 ABC 5558767.82 38
2 ABC 4747605.5 32
3 ABC 667377.69 5
4 ABC 3844204 26
6 CTB 100 17
7 CTB 500.52 83
Now the complex part:
i) Calculate another amount by multiplying this percentage by an amount from another table, say table2:
Select t1.percentage * t2.new amount as [PledgeAmount]
From table1 t1 join table2 t2 on t1.item = t2.item
ii) Update the TotalAmount column by splitting each TotalAmount of table1 into two rows: the first row holds the newly calculated PledgeAmount and the second row holds (TotalAmount - PledgeAmount). E.g. the Col1 = 1 amount of 5558767.82 will split into two rows.
Final result sample:
Col1 item TotalAmount Type
1 ABC 363700.00 Pledge
1 ABC 5195067.82 Unpledge
....
I am using a temporary table to do the calculations.
One way I can think of is to calculate the pledged and unpledged amounts as new columns and then pivot them, but it is a huge table with hundreds of columns, so that will not perform well.
Is there a more efficient way?
You can use a windowing function to solve this problem -- first in a sub-query calculate the total and then in the main query the percent:
Select *, (totalAmount/total_for_item)*100 as percent_of_total
from (
SELECT t.*,
SUM(totalAmount) OVER (PARTITION BY item) as total_for_item
FROM table1 t
) sub
First, let's get the total amount per item:
SELECT item, SUM( totalAmount ) as sumTotal
INTO #totalperitem
FROM table1
GROUP BY item
Now it's easy to get to the percentages:
SELECT t1.Col1,
t1.item,
t1.totalAmount,
t1.totalAmount/tpi.sumTotal*100 AS percentage
FROM table1 t1
INNER JOIN #totalperitem tpi on ...
Tricky part: Separate rows with/without match in table2. Can be done with a WHERE NOT EXISTS, or, my preference, with a single outer join:
SELECT t1.item,
CASE WHEN tpledged.item IS NULL
THEN 'Unpledged'
ELSE 'Pledged'
END,
SUM( t1.totalAmount ) AS amount
FROM table1 t1
LEFT OUTER JOIN table2 tpledged ON t1. ... = tpledged. ...
GROUP BY t1.item,
CASE WHEN tpledged.item IS NULL
THEN 'Unpledged'
ELSE 'Pledged'
END
The basic trick is to create an artificial column from the presence/absence of records in table2 and to also group by that artificial column.
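To get the two rows per Col1 that the question's sample output shows, one option in SQL Server is to compute the pledged amount once and then fan each row out with CROSS APPLY (VALUES ...). This is only a sketch under assumptions: table2 is taken to have exactly one row per item with a column called newAmount (a hypothetical name), TotalAmount is taken to be a decimal type, and the share is used as a fraction rather than a percentage.

```sql
-- Sketch only: table2(item, newAmount) and its one-row-per-item shape are assumptions.
SELECT s.Col1, s.item, v.Amount AS TotalAmount, v.[Type]
FROM (
    SELECT t1.Col1, t1.item, t1.TotalAmount,
           -- this row's share of its item total, applied to the item's amount in table2
           t1.TotalAmount / SUM(t1.TotalAmount) OVER (PARTITION BY t1.item)
               * t2.newAmount AS PledgeAmount
    FROM table1 t1
    JOIN table2 t2 ON t2.item = t1.item
) s
CROSS APPLY (VALUES (s.PledgeAmount, 'Pledge'),
                    (s.TotalAmount - s.PledgeAmount, 'Unpledge')) v(Amount, [Type])
ORDER BY s.Col1, v.[Type];
```

Each source row is read once and turned into two output rows, so no pivot over the wide table should be needed.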

Joining next Sequential Row

I am planning an SQL statement right now and would like someone to look over my thoughts.
This is my Table:
id stat period
--- ------- --------
1 10 1/1/2008
2 25 2/1/2008
3 5 3/1/2008
4 15 4/1/2008
5 30 5/1/2008
6 9 6/1/2008
7 22 7/1/2008
8 29 8/1/2008
Create Table
CREATE TABLE tbstats
(
id INT IDENTITY(1, 1) PRIMARY KEY,
stat INT NOT NULL,
period DATETIME NOT NULL
)
go
INSERT INTO tbstats
(stat,period)
SELECT 10,CONVERT(DATETIME, '20080101')
UNION ALL
SELECT 25,CONVERT(DATETIME, '20080102')
UNION ALL
SELECT 5,CONVERT(DATETIME, '20080103')
UNION ALL
SELECT 15,CONVERT(DATETIME, '20080104')
UNION ALL
SELECT 30,CONVERT(DATETIME, '20080105')
UNION ALL
SELECT 9,CONVERT(DATETIME, '20080106')
UNION ALL
SELECT 22,CONVERT(DATETIME, '20080107')
UNION ALL
SELECT 29,CONVERT(DATETIME, '20080108')
go
I want to calculate the difference between each statistic and the next, and then calculate the mean value of the 'gaps.'
Thoughts:
I need to join each record with its subsequent row. I can do that using the ever flexible joining syntax, thanks to the fact that I know the id field is an integer sequence with no gaps.
By aliasing the table I could incorporate it into the SQL query twice, then join them together in a staggered fashion by adding 1 to the id of the first aliased table. The first record in the table has an id of 1. 1 + 1 = 2 so it should join on the row with id of 2 in the second aliased table. And so on.
Now I would simply subtract one from the other.
Then I would use the ABS function to ensure that I always get positive integers as a result of the subtraction regardless of which side of the expression is the higher figure.
Is there an easier way to achieve what I want?
The lead analytic function should do the trick:
SELECT period, stat, stat - LEAD(stat) OVER (ORDER BY period) AS gap
FROM tbstats
The average value of the gaps can be done by calculating the difference between the first value and the last value and dividing by one less than the number of elements:
select sum(case when seqnum = num then stat else -stat end) * 1.0 / (max(num) - 1) as avg_gap
from (select stat, period, row_number() over (order by period) as seqnum,
count(*) over () as num
from tbstats
) t
where seqnum = num or seqnum = 1;
Of course, you can also do the calculation using lead(), but this will also work in SQL Server 2005 and 2008.
You can also achieve this with a join:
SELECT t1.period,
t1.stat,
t1.stat - t2.stat gap
FROM tbstats t1
LEFT JOIN tbstats t2
ON t1.id + 1 = t2.id
To calculate the difference between each statistic and the next, LEAD() and LAG() may be the simplest option. You provide an ORDER BY, and LEAD(something) returns the next something and LAG(something) returns the previous something in the given order.
select
x.id thisStatId,
LAG(x.id) OVER (ORDER BY x.id) lastStatId,
x.stat thisStatValue,
LAG(x.stat) OVER (ORDER BY x.id) lastStatValue,
x.stat - LAG(x.stat) OVER (ORDER BY x.id) diff
from tbStats x
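To finish the original task (the mean of the absolute gaps), the LEAD() result can be wrapped in a derived table and averaged. A minimal sketch for SQL Server 2012 or later, where gap and avg_abs_gap are just illustrative names:

```sql
-- Sketch only: requires LEAD(), i.e. SQL Server 2012+.
SELECT AVG(ABS(gap) * 1.0) AS avg_abs_gap   -- * 1.0 avoids integer division
FROM (
    SELECT LEAD(stat) OVER (ORDER BY id) - stat AS gap
    FROM tbstats
) g
WHERE gap IS NOT NULL;                       -- the last row has no successor
```

For the sample data the absolute gaps are 15, 20, 10, 15, 21, 13 and 7, so the result is 101/7, about 14.43.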

How do I aggregate numbers from a string column in SQL

I am dealing with a poorly designed database column which has values like this
ID cid Score
1 1 3 out of 3
2 1 1 out of 5
3 2 3 out of 6
4 3 7 out of 10
I want the aggregate sum and percentage of Score column grouped on cid like this
cid sum percentage
1 4 out of 8 50
2 3 out of 6 50
3 7 out of 10 70
How do I do this?
You can try it this way:
select
t.cid
, cast(sum(s.a) as varchar(5)) +
' out of ' +
cast(sum(s.b) as varchar(5)) as sum
, ((cast(sum(s.a) as decimal))/sum(s.b))*100 as percentage
from MyTable t
inner join
(select
id
, cast(left(score, charindex(' ', score) - 1) as int) a
, cast(substring(score,charindex('out of', score)+7,len(score)) as int) b
from MyTable
) s on s.id = t.id
group by t.cid
Redesign the table, but on-the-fly as a CTE. Here's a solution that's not as short as you could make it, but that takes advantage of the handy SQL Server function PARSENAME. You may need to tweak the percentage calculation if you want to truncate rather than round, or if you want it to be a decimal value, not an int.
In this or most any solution, you have to count on the column values for Score to be in the very specific format you show. If you have the slightest doubt, you should run some other checks so you don't miss or misinterpret anything.
with
P(ID, cid, Score2Parse) as (
select
ID,
cid,
replace(Score,space(1),'.')
from scores
),
S(ID,cid,pts,tot) as (
select
ID,
cid,
cast(parsename(Score2Parse,4) as int),
cast(parsename(Score2Parse,1) as int)
from P
)
select
cid, cast(round(100e0*sum(pts)/sum(tot),0) as int) as percentage
from S
group by cid;

Oracle: how to "group by" over a range?

If I have a table like this:
pkey age
---- ---
1 8
2 5
3 12
4 12
5 22
I can "group by" to get a count of each age.
select age,count(*) n from tbl group by age;
age n
--- -
5 1
8 1
12 2
22 1
What query can I use to group by age ranges?
age n
----- -
1-10 2
11-20 2
20+ 1
I'm on 10gR2, but I'd be interested in any 11g-specific approaches as well.
SELECT CASE
WHEN age <= 10 THEN '1-10'
WHEN age <= 20 THEN '11-20'
ELSE '21+'
END AS age,
COUNT(*) AS n
FROM tbl
GROUP BY CASE
WHEN age <= 10 THEN '1-10'
WHEN age <= 20 THEN '11-20'
ELSE '21+'
END
Try:
select to_char(floor(age/10) * 10) || '-'
|| to_char(floor(age/10) * 10 + 9) as age,
count(*) as n from tbl group by floor(age/10);
What you are looking for, is basically the data for a histogram.
You would have the age (or age-range) on the x-axis and the count n (or frequency) on the y-axis.
In the simplest form, one could simply count the number of each distinct age value like you already described:
SELECT age, count(*)
FROM tbl
GROUP BY age
When there are too many different values for the x-axis however, one may want to create groups (or clusters or buckets). In your case, you group by a constant range of 10.
We can avoid writing a WHEN ... THEN line for each range - there could be hundreds if it were not about age. Instead, the approach by @MatthewFlaschen is preferable for the reasons mentioned by @NitinMidha.
Now let's build the SQL...
First, we need to split the ages into range-groups of 10 like so:
0-9
10-19
20-29
etc.
This can be achieved by dividing the age column by 10 and then calculating the result's FLOOR:
FLOOR(age/10)
"FLOOR returns the largest integer equal to or less than n"
http://docs.oracle.com/cd/E11882_01/server.112/e26088/functions067.htm#SQLRF00643
Then we take the original SQL and replace age with that expression:
SELECT FLOOR(age/10), count(*)
FROM tbl
GROUP BY FLOOR(age/10)
This is OK, but we cannot see the range, yet. Instead we only see the calculated floor values which are 0, 1, 2 ... n.
To get the actual lower bound, we need to multiply it with 10 again so we get 0, 10, 20 ... n:
FLOOR(age/10) * 10
We also need the upper bound of each range which is lower bound + 10 - 1 or
FLOOR(age/10) * 10 + 10 - 1
Finally, we concatenate both into a string like this:
TO_CHAR(FLOOR(age/10) * 10) || '-' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1)
This creates '0-9', '10-19', '20-29' etc.
Now our SQL looks like this:
SELECT
TO_CHAR(FLOOR(age/10) * 10) || ' - ' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1),
COUNT(*)
FROM tbl
GROUP BY FLOOR(age/10)
Finally, apply an order and nice column aliases:
SELECT
TO_CHAR(FLOOR(age/10) * 10) || ' - ' || TO_CHAR(FLOOR(age/10) * 10 + 10 - 1) AS range,
COUNT(*) AS frequency
FROM tbl
GROUP BY FLOOR(age/10)
ORDER BY FLOOR(age/10)
However, in more complex scenarios, these ranges might not be grouped into constant chunks of size 10, but need dynamical clustering.
Oracle has more advanced histogram functions included, see http://docs.oracle.com/cd/E16655_01/server.121/e15858/tgsql_histo.htm#TGSQL366
Credits to @MatthewFlaschen for his approach; I only explained the details.
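For equal-width buckets, Oracle's WIDTH_BUCKET function can replace the FLOOR arithmetic. The sketch below assumes ages fall in the range 0 to 99; the bounds 0 and 100 and the bucket count 10 are assumptions chosen to reproduce the ranges of 10 used above.

```sql
-- Sketch only: values below 0 land in bucket 0 and values of 100 or more in bucket 11.
SELECT (WIDTH_BUCKET(age, 0, 100, 10) - 1) * 10 AS range_start,
       COUNT(*) AS frequency
FROM tbl
GROUP BY WIDTH_BUCKET(age, 0, 100, 10)
ORDER BY range_start;
```

With the sample rows from the question (ages 8, 5, 12, 12, 22) this gives range_start 0 with 2 rows, 10 with 2 rows and 20 with 1 row.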
Here is a solution which creates a "range" table in a sub-query and then uses this to partition the data from the main table:
SELECT DISTINCT descr
, COUNT(*) OVER (PARTITION BY descr) n
FROM age_table INNER JOIN (
select '1-10' descr, 1 rng_start, 10 rng_stop from dual
union (
select '11-20', 11, 20 from dual
) union (
select '20+', 21, null from dual
)) ON age BETWEEN nvl(rng_start, age) AND nvl(rng_stop, age)
ORDER BY descr;
I had to group data by how many transactions appeared in an hour. I did this by extracting the hour from the timestamp:
select extract(hour from transaction_time) as hour
,count(*)
from table
where transaction_date='01-jan-2000'
group by
extract(hour from transaction_time)
order by
extract(hour from transaction_time) asc
;
Giving output:
HOUR COUNT(*)
---- --------
1 9199
2 9167
3 9997
4 7218
As you can see this gives a nice easy way of grouping the number of records per hour.
Add an age_range table and an age_range_id field to your table and group by that instead.
-- excuse the DDL but you should get the idea
create table age_range(
age_range_id tinyint unsigned not null primary key,
name varchar(255) not null);
insert into age_range values
(1, '18-24'),(2, '25-34'),(3, '35-44'),(4, '45-54'),(5, '55-64');
-- again excuse the DML but you should get the idea
select
count(*) as counter, p.age_range_id, ar.name
from
person p
inner join age_range ar on p.age_range_id = ar.age_range_id
group by
p.age_range_id, ar.name order by counter desc;
You can refine this idea if you like (for example, add from_age and to_age columns to the age_range table), but I'll leave that to you.
Hope this helps :)
If using Oracle 9i+, you might be able to use the NTILE analytic function:
WITH tiles AS (
SELECT t.age,
NTILE(3) OVER (ORDER BY t.age) AS tile
FROM tbl t)
SELECT MIN(t.age) AS min_age,
MAX(t.age) AS max_age,
COUNT(t.tile) As n
FROM tiles t
GROUP BY t.tile
The caveat to NTILE is that you can only specify the number of partitions, not the break points themselves, so you need to specify a number that is appropriate. For example, with 100 rows, NTILE(4) will allot 25 rows to each of the four buckets/partitions. You cannot nest analytic functions, so you'd have to layer them using subqueries/subquery factoring to get the desired granularity. Otherwise, use:
SELECT CASE
WHEN t.age BETWEEN 1 AND 10 THEN '1-10'
WHEN t.age BETWEEN 11 AND 20 THEN '11-20'
ELSE '21+'
END AS age,
COUNT(*) AS n
FROM tbl t
GROUP BY CASE
WHEN t.age BETWEEN 1 AND 10 THEN '1-10'
WHEN t.age BETWEEN 11 AND 20 THEN '11-20'
ELSE '21+'
END
I had to get a count of samples by day. Inspired by @Clarkey I used TO_CHAR to extract the date of sample from the timestamp to an ISO-8601 date format and used that in the GROUP BY and ORDER BY clauses. (Further inspired, I also post it here in case it is useful to others.)
SELECT
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD') AS TS_DAY,
COUNT(*)
FROM
TABLE X
GROUP BY
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD')
ORDER BY
TO_CHAR(X.TS_TIMESTAMP, 'YYYY-MM-DD') ASC
/
Can you try the below solution:
SELECT count (1), '1-10' from tbl where age between 1 and 10
union all
SELECT count (1), '11-20' from tbl where age between 11 and 20
union all
select count (1), '21+' from tbl where age > 20
My approach:
select range, count(1) from (
select case
when age < 5 then '0-4'
when age < 10 then '5-9'
when age < 15 then '10-14'
when age < 20 then '15-19'
when age < 30 then '20-29'
when age < 40 then '30-39'
when age < 50 then '40-49'
else '50+'
end
as range from
(select round(extract(day from feedback_update_time - feedback_time), 1) as age
from txn_history
) ) group by range
This gives me flexibility in defining the ranges,
and I do not repeat the ranges in the select and group by clauses.
But could someone please tell me how to order them by magnitude?
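One possible way to order such buckets by magnitude is to keep the raw age in the inline view and sort by the smallest age in each bucket. A minimal sketch against the tbl table from the question, with shortened ranges; age_range is an illustrative alias:

```sql
-- Sketch only: works because the buckets do not overlap,
-- so MIN(age) increases from one bucket to the next.
select age_range, count(*) as n
from (
  select age,
         case
           when age < 10 then '0-9'
           when age < 20 then '10-19'
           else '20+'
         end as age_range
  from tbl
)
group by age_range
order by min(age);
```

The labels are still defined only once, and the sort no longer depends on how the label strings happen to compare alphabetically.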