Select by increasing order SQL

Table:
id | year | score
-----+------+-----------
12 | 2011 | 0.929
12 | 2014 | 0.933
12 | 2010 | 0.937
12 | 2013 | 0.938
12 | 2009 | 0.97
13 | 2010 | 0.851
13 | 2014 | 0.881
13 | 2011 | 0.885
13 | 2013 | 0.895
13 | 2009 | 0.955
16 | 2009 | 0.867
16 | 2011 | 0.881
16 | 2012 | 0.886
16 | 2013 | 0.897
16 | 2014 | 0.953
Desired Output:
id | year | score
-----+------+-----------
16 | 2009 | 0.867
16 | 2011 | 0.881
16 | 2012 | 0.886
16 | 2013 | 0.897
16 | 2014 | 0.953
I'm having difficulty selecting only the rows whose scores increase with the year.
Any help would be greatly appreciated.

So you want to select id = 16 because it is the only one that has steadily increasing values.
Many versions of SQL support lag(), which can help solve this problem. You can determine, for a given id, whether all the values are increasing by doing:
select id,
(case when min(score - prev_score) > 0 then 'increasing' else 'nonincreasing' end) as grp
from (select t.*, lag(score) over (partition by id order by year) as prev_score
from table t
) t
group by id;
You can then select all "increasing" ids using a join:
select t.*
from table t join
(select id
from (select t.*, lag(score) over (partition by id order by year) as prev_score
from table t
) t
group by id
having min(score - prev_score) > 0
) inc
on t.id = inc.id;
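As a quick check of the lag() approach, here is a minimal sketch that runs the same query shape against the sample data using Python's built-in sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (the first version with window functions), and the table name `scores` is made up for the example, since the question does not name its table.

```python
import sqlite3

# Sample rows from the question: (id, year, score)
rows = [
    (12, 2011, 0.929), (12, 2014, 0.933), (12, 2010, 0.937),
    (12, 2013, 0.938), (12, 2009, 0.97),
    (13, 2010, 0.851), (13, 2014, 0.881), (13, 2011, 0.885),
    (13, 2013, 0.895), (13, 2009, 0.955),
    (16, 2009, 0.867), (16, 2011, 0.881), (16, 2012, 0.886),
    (16, 2013, 0.897), (16, 2014, 0.953),
]

conn = sqlite3.connect(":memory:")
# "scores" is a hypothetical table name used only for this sketch
conn.execute("CREATE TABLE scores (id INTEGER, year INTEGER, score REAL)")
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)

query = """
SELECT s.id, s.year, s.score
FROM scores s
JOIN (
    SELECT id
    FROM (
        SELECT id, score,
               LAG(score) OVER (PARTITION BY id ORDER BY year) AS prev_score
        FROM scores
    )
    GROUP BY id
    -- the first row per id has a NULL difference, which MIN ignores
    HAVING MIN(score - prev_score) > 0
) inc ON inc.id = s.id
ORDER BY s.id, s.year
"""
increasing = conn.execute(query).fetchall()
for row in increasing:
    print(row)
```

Only the rows for id 16 come back, because ids 12 and 13 each have at least one year-over-year drop.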

Related

Pivot table in SQL but keep measure names in column

I'm having trouble pivoting a table correctly.
My input is this raw data table:
+------+---------+------------+----------+
| YEAR | FACULTY | ADMISSIONS | DROPOUTS |
+------+---------+------------+----------+
| 2018 | LAW | 15 | 2 |
| 2019 | LAW | 18 | 4 |
| 2020 | LAW | 11 | 1 |
| 2018 | MATH | 19 | 1 |
| 2019 | MATH | 17 | 6 |
| 2020 | MATH | 24 | 5 |
+------+---------+------------+----------+
I want to pivot the years into columns, but I also want to keep the measure names (admissions and dropouts) as row labels. That is, I want a table like this:
+---------+------------+------+------+------+
| FACULTY | MEASURE | 2018 | 2019 | 2020 |
+---------+------------+------+------+------+
| LAW | ADMISSIONS | 15 | 18 | 11 |
| LAW | DROPOUTS | 2 | 4 | 1 |
| MATH | ADMISSIONS | 19 | 17 | 24 |
| MATH | DROPOUTS | 1 | 6 | 5 |
+---------+------------+------+------+------+
I can pivot years using:
SELECT *
FROM
(
SELECT FACULTY, YEAR, ADMISSIONS, DROPOUTS
FROM TABLE
)
PIVOT (SUM (ADMISSIONS)
FOR YEAR IN (2018,2019,2020)
)
But I need to pivot both measures and still get the measure names column. Any ideas?
That's unpivoting, then pivoting. If your database supports lateral joins and values(), you can do:
select
t.faculty,
x.measure,
sum(case when t.year = 2018 then x.value end) value_2018,
sum(case when t.year = 2019 then x.value end) value_2019,
sum(case when t.year = 2020 then x.value end) value_2020
from mytable t
cross apply (values ('admissions', t.admissions), ('dropouts', t.dropouts)) as x(measure, value)
group by t.faculty, x.measure
I would unpivot using apply (assuming you are using SQL Server) and reaggregate:
select t.faculty, v.measure,
max(case when year = 2018 then val end) as [2018],
max(case when year = 2019 then val end) as [2019],
max(case when year = 2020 then val end) as [2020]
from t cross apply
(values ('ADMISSIONS', ADMISSIONS), ('DROPOUTS', DROPOUTS)
) v(measure, val)
group by t.faculty, v.measure
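The unpivot-then-reaggregate idea can also be tried outside SQL Server. The sketch below uses Python's built-in sqlite3 module; SQLite has no CROSS APPLY, so the unpivot step is swapped for a UNION ALL, which is an assumption of this example rather than what either answer wrote. The table name `t` follows the second answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (year INTEGER, faculty TEXT, admissions INTEGER, dropouts INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    (2018, "LAW", 15, 2), (2019, "LAW", 18, 4), (2020, "LAW", 11, 1),
    (2018, "MATH", 19, 1), (2019, "MATH", 17, 6), (2020, "MATH", 24, 5),
])

query = """
SELECT faculty, measure,
       MAX(CASE WHEN year = 2018 THEN val END) AS "2018",
       MAX(CASE WHEN year = 2019 THEN val END) AS "2019",
       MAX(CASE WHEN year = 2020 THEN val END) AS "2020"
FROM (
    -- unpivot: one row per (faculty, year, measure)
    SELECT faculty, year, 'ADMISSIONS' AS measure, admissions AS val FROM t
    UNION ALL
    SELECT faculty, year, 'DROPOUTS', dropouts FROM t
) v
GROUP BY faculty, measure   -- pivot: one row per (faculty, measure)
ORDER BY faculty, measure
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Each source row is first split into one row per measure, and the conditional aggregation then spreads the years back out into columns.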

Rolling Average SQL

Hi, I have a dataset with Year, Month, and Output columns, with values as follows:
Year | Month | Output
2015 | 1 | 12
2015 | 2 | 24
2015 | 3 | 2
2015 | 4 | 3
2015 | 5 | 7
2015 | 6 | 3
2015 | 7 | 7
2015 | 8 | 6
2015 | 9 | 7
2015 | 10 | 8
2015 | 11 | 3
2015 | 12 | 6
2016 | 1 | 3
2016 | 2 | 6
2016 | 3 | 8
2016 | 4 | 9
2016 | 5 | 4
......... and so on...
I want to add a new column to the dataset called Rolling_Average:
Rolling_Average = (sum of Output over the previous 12 months) / (Output of this month)
For example:
Rolling_Average (for 2015-7) = (output (2015-01) + output (2015-02) + output (2015-03) + output (2015-04) + output (2015-05) + output (2015-06)) / output (2015-07)
I tried a couple of queries found online, but they didn't work for me. Can someone please help me?
Output Required is as follows:
Year | Month | Output | Rolling Average
2015 | 1 | 12 | 12
2015 | 2 | 24 | 0.5
2015 | 3 | 2 | 18
2015 | 4 | 3 | 38/3
2015 | 5 | 7 | 45/7
2015 | 6 | 3 | 48/3
2015 | 7 | 7 | 55/7
2015 | 8 | 6 | 61/6
2015 | 9 | 7 | 68/7
2015 | 10 | 8 | 74/8
2015 | 11 | 3 | 77/3
2015 | 12 | 6 | 83/6
2016 | 1 | 3 | 86/3
2016 | 2 | 6 | 92/6
2016 | 3 | 8 | 100/8
2016 | 4 | 9 | 109/9
2016 | 5 | 4 | 113/4
The Query I tried is :
SELECT DISTINCT
//CALCULATIONS
Year,
Month,
Output,
(sum(CAST(Output) AS DOUBLE)))
over(order by year,month rows between 12 preceding and 1 preceding )
as Rolling_Average
from my_table
group by Year,Month
order by Year,Month
It gives me this error:
Syntax error: OVER keyword must follow a function call
I have also tried other things.
Can someone please suggest a simple way to do this? I am using SQL Plx, which is similar to SQL.
Thank You!
You might have misplaced some parentheses:
(sum( CAST(Output) AS DOUBLE ))) over (order by year, month rows between 12 preceding and 1 preceding ) as Rolling_Average
Versus:
SUM( CAST(Output AS DOUBLE) ) OVER (order by year, month rows between 12 preceding and 1 preceding) as Rolling_Average
You can also ROUND that result.
And those records already seem to be unique by Year and Month.
So there's not really a need to group on those.
SELECT
t.Year, t.Month, t.Output,
ROUND(SUM(CAST(t.Output AS INT)) OVER (ORDER BY t.Year, t.Month ROWS BETWEEN 12 PRECEDING AND 1 PRECEDING)*1.0 / CAST(t.Output AS INT), 1) as Rolling_Average
FROM my_table t
ORDER BY t.Year, t.Month;
And if the window functions aren't supported, then this will work:
SELECT
t1.Year, t1.Month, t1.Output,
ROUND(SUM(CAST(t2.Output AS INT))*1.0 / CAST(t1.Output AS INT), 1) as Rolling_Average
FROM my_table t1
LEFT JOIN my_table t2 ON ((t2.Year = t1.Year AND t2.Month < t1.Month) OR
(t2.Year = t1.Year - 1 AND t2.Month >= t1.Month))
GROUP BY t1.Year, t1.Month, t1.Output
ORDER BY t1.Year, t1.Month;
Try this (if you use SQL Server):
Select *
from tableName T
outer apply (
select sum(output) Rolling_Average
from tableName T_in
where T_in.year = T.year and T_in.Month <= T.Month
)x
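The self-join fallback from the first answer needs no window functions, so it is easy to try on the sample data. The following is a minimal sketch using Python's built-in sqlite3 module; the CASTs are dropped because the hypothetical test table already stores Output as an integer.

```python
import sqlite3

# Output values from the question: 2015 months 1-12, then 2016 months 1-5
outputs_2015 = [12, 24, 2, 3, 7, 3, 7, 6, 7, 8, 3, 6]
outputs_2016 = [3, 6, 8, 9, 4]
rows = [(2015, m, v) for m, v in enumerate(outputs_2015, start=1)]
rows += [(2016, m, v) for m, v in enumerate(outputs_2016, start=1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (year INTEGER, month INTEGER, output INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

query = """
SELECT t1.year, t1.month, t1.output,
       ROUND(SUM(t2.output) * 1.0 / t1.output, 1) AS rolling_average
FROM my_table t1
LEFT JOIN my_table t2
       ON (t2.year = t1.year AND t2.month < t1.month)        -- earlier months, same year
       OR (t2.year = t1.year - 1 AND t2.month >= t1.month)   -- tail of the previous year
GROUP BY t1.year, t1.month, t1.output
ORDER BY t1.year, t1.month
"""
# Map (year, month) -> ratio of the previous 12 months' sum to this month's output
ratios = {(y, m): r for y, m, _, r in conn.execute(query)}
for key in sorted(ratios):
    print(key, ratios[key])
```

The first month has no preceding rows, so its ratio is NULL (None in Python); wrap the SUM in COALESCE if a different default is wanted.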

How to subtract previous value in a column with calculation of other column on SQL server

I have a requirement for a table as shown below, with the columns mgt_year, tot_dflt_mgt, and tot_accum_mgt. For year 2016, tot_dflt_mgt is 20 and tot_accum_mgt is 600. When I compute
(tot_accum_mgt - tot_dflt_mgt)
I want the result placed in the previous year's row, as shown in the table below. That result, 580, is then reduced by the 2015 value, (580 - 9), to fill the row before it, and so on for all earlier years. I have done this in Excel and also in Oracle thanks to #mathguy, but how can I achieve this result in SQL Server? I tried the Oracle query in SQL Server, but it does not work.
Please forgive my bad English and noob formatting.
My table t:
line_seg MGT_YEAR TOT_DFLT_MGT TOT_ACCUM_MGT
--------- -------- ------------ ------------
A 2013 10
A 2014 15
A 2015 9
A 2016 20 600
B 2013 10
B 2014 15
B 2015 8
B 2016 20 500
Oracle Solution:
select mgt_year, tot_dflt_mgt,
max(tot_accum_mgt) over () -
nvl( sum(tot_dflt_mgt) over
(order by mgt_year
rows between 1 following and unbounded following)
, 0 ) as tot_accum_mgt
from t;
But I am unable to use this in SQL Server.
Required output:
line_seg MGT_YEAR TOT_DFLT_MGT TOT_ACCUM_MGT
--------- -------- ------------ ------------
A 2013 10 556
A 2014 15 571
A 2015 9 580
A 2016 20 600
B 2013 12 457
B 2014 15 472
B 2015 8 480
B 2016 20 500
select *,
(sum(TOT_ACCUM_MGT) over()) -
(sum(TOT_DFLT_MGT ) over (order by TOT_DFLT_MGT )) as somecolname
from
table
Use ROW_NUMBER() and self-join each row to the previous one on (a.ID = b.ID) and (a.row_num = b.row_num - 1)
OR
use the lag() function.
Please try the following query. I assume you are using SQL Server 2012 or later. If not, change FIRST_VALUE to SUM:
SELECT t1.line_seg, t1.mgt_year, t1.[tot_dflt_mgt]
, FIRST_VALUE(t1.tot_accum_mgt) OVER(PARTITION BY t1.[line_seg] ORDER BY t1.mgt_year DESC)
- ISNULL(SUM(t2.[tot_dflt_mgt]) OVER(PARTITION BY t2.[line_seg] ORDER BY t2.mgt_year DESC), 0) AS tot_accum_mgt
FROM [dbo].[t] AS t1
LEFT JOIN [dbo].[t] AS t2 ON (t2.line_seg = t1.line_seg AND t2.mgt_year = t1.mgt_year + 1)
ORDER BY t1.line_seg, t1.mgt_year ASC;
To do this, first imagine the table sorted by year in descending order:
+------------+----------+--------------+---------------+
| line_seg | mgt_year | tot_dflt_mgt | tot_accum_mgt |
+------------+----------+--------------+---------------+
| A | 2016 | 20 | 600 |
| A | 2015 | 9 | NULL |
| A | 2014 | 15 | NULL |
| A | 2013 | 10 | NULL |
| B | 2016 | 20 | 500 |
| B | 2015 | 8 | NULL |
| B | 2014 | 15 | NULL |
| B | 2013 | 12 | NULL |
+------------+----------+--------------+---------------+
Then all I have to do is subtract the PREVIOUS running total of tot_dflt_mgt from the latest year's tot_accum_mgt. This is equivalent to subtracting the previous row's tot_dflt_mgt from that row's computed tot_accum_mgt. To access the previous row's fields (the following year, in this descending order), a LEFT JOIN self-joins the table, resulting in the following table:
+------------+----------+--------------+---------------+------------+----------+--------------+---------------+
| line_seg | mgt_year | tot_dflt_mgt | tot_accum_mgt | line_seg | mgt_year | tot_dflt_mgt | tot_accum_mgt |
+------------+----------+--------------+---------------+------------+----------+--------------+---------------+
| A | 2013 | 10 | NULL | A | 2014 | 15 | NULL |
| A | 2014 | 15 | NULL | A | 2015 | 9 | NULL |
| A | 2015 | 9 | NULL | A | 2016 | 20 | 600 |
| A | 2016 | 20 | 600 | NULL | NULL | NULL | NULL |
| B | 2013 | 12 | NULL | B | 2014 | 15 | NULL |
| B | 2014 | 15 | NULL | B | 2015 | 8 | NULL |
| B | 2015 | 8 | NULL | B | 2016 | 20 | 500 |
| B | 2016 | 20 | 500 | NULL | NULL | NULL | NULL |
+------------+----------+--------------+---------------+------------+----------+--------------+---------------+
The AND t2.mgt_year = t1.mgt_year + 1 condition in the LEFT JOIN clause does the trick of fetching the previous row's values. Now all I have to do is calculate the running total over these previous rows (t2). Also, subtracting NULL from anything results in NULL, so ISNULL replaces any NULL with zero.
ISNULL(SUM(t2.[tot_dflt_mgt]) OVER(PARTITION BY t2.[line_seg] ORDER BY t2.mgt_year DESC), 0) AS tot_accum_mgt
Now that we have the previous running total of tot_dflt_mgt, all we have to do is subtract it from the latest (largest mgt_year) tot_accum_mgt. We get that value with the FIRST_VALUE function. SUM could probably be used instead.
FIRST_VALUE(t1.tot_accum_mgt) OVER(PARTITION BY t1.[line_seg] ORDER BY t1.mgt_year DESC)
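The same arithmetic can be written without the self-join by summing the rows that come after the current one. The sketch below is an adaptation of the Oracle query from the question with a PARTITION BY line_seg added, run through Python's built-in sqlite3 module; it assumes SQLite 3.25 or newer for window functions and is not the answer's exact SQL Server query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (line_seg TEXT, mgt_year INTEGER, "
    "tot_dflt_mgt INTEGER, tot_accum_mgt INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("A", 2013, 10, None), ("A", 2014, 15, None),
    ("A", 2015, 9, None), ("A", 2016, 20, 600),
    # B/2013 appears as both 10 and 12 in the source; it does not affect the result
    ("B", 2013, 12, None), ("B", 2014, 15, None),
    ("B", 2015, 8, None), ("B", 2016, 20, 500),
])

query = """
SELECT line_seg, mgt_year, tot_dflt_mgt,
       -- the only non-NULL accumulated value in each line_seg
       MAX(tot_accum_mgt) OVER (PARTITION BY line_seg)
       -- minus everything defaulted in later years (empty frame -> NULL -> 0)
       - COALESCE(SUM(tot_dflt_mgt) OVER (
             PARTITION BY line_seg ORDER BY mgt_year
             ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING), 0) AS tot_accum_mgt
FROM t
ORDER BY line_seg, mgt_year
"""
accum = conn.execute(query).fetchall()
for row in accum:
    print(row)
```

For segment A this yields 556, 571, 580, 600, and for segment B 457, 472, 480, 500, each value being the next year's value minus the next year's tot_dflt_mgt.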

MSSQL Count Multiple Columns

Say I have a table like this in MS SQL 2008:
+------+--------+---------+
| year | JAN | FEB |
+------+--------+---------+
| 2016 | 5K2 | 5K2 |
| 2016 | 5K2 | 5K2 |
| 2016 | 5K2 | 5K2 |
| 2016 | 8Z | 8Z |
| 2016 | R5205 | R5205 |
| 2016 | 5K2 | 5K2 |
| 2016 | 5K2 | 5K2 |
| 2016 | NULL | NULL |
| 2016 | TE | NULL |
| 2016 | TE | NULL |
| 2016 | 8Z | 8Z |
+------+--------+---------+
And I want to get a count for each column, something like this:
+------+--------+---------+
| opt | JAN_cnt| FEB_cnt |
+------+--------+---------+
| 5K2 | 5 | 4 |
| 8Z | 2 | 2 |
| R5205| 1 | 1 |
| TE | 2 | 0 |
| NULL | 1 | 4 |
+------+--------+---------+
First, can this be done? Second, how? I have searched, but can't find exactly what I am looking for.
I think the simplest way is to use UNION ALL with conditional aggregation using a CASE expression:
SELECT s.opt,
COUNT(CASE WHEN s.ind_from = 1 THEN 1 END) as jan_cnt,
COUNT(CASE WHEN s.ind_from = 2 THEN 1 END) as feb_cnt
FROM (
SELECT t1.jan as opt,1 as ind_from FROM YourTable t1
UNION ALL
SELECT t2.feb,2 FROM YourTable t2) s
GROUP BY s.opt
I would advise putting the values into a different format:
opt
month
cnt
You can do this as:
select opt, mon, count(*) as cnt
from ((select jan as opt, 'jan' as mon from t) union all
(select feb as opt, 'feb' as mon from t)
) o
group by opt, mon;
It is easy enough to switch this to your format:
select opt, sum(jan) as jan, sum(feb) as feb
from ((select jan as opt, 1 as jan, 0 as feb from t) union all
(select feb as opt, 0, 1 from t)
) o
group by opt;
I just prefer the first format. It is easier to generalize to more columns.
SELECT COALESCE(t1.JAN, t2.FEB), t1.JAN_cnt, t2.FEB_cnt
FROM
(
SELECT JAN, COUNT(*) AS JAN_cnt
FROM yourTable
GROUP BY JAN
) t1
FULL OUTER JOIN
(
SELECT FEB, COUNT(*) AS FEB_cnt
FROM yourTable
GROUP BY FEB
) t2
ON t1.JAN = t2.FEB
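To see what the UNION ALL with conditional aggregation approach returns for the sample rows, here is a minimal sketch using Python's built-in sqlite3 module. The table name `YourTable` follows the first answer; the counts printed come from the rows as listed in the question.

```python
import sqlite3

# (JAN, FEB) pairs from the question; None stands for NULL
pairs = [
    ("5K2", "5K2"), ("5K2", "5K2"), ("5K2", "5K2"), ("8Z", "8Z"),
    ("R5205", "R5205"), ("5K2", "5K2"), ("5K2", "5K2"), (None, None),
    ("TE", None), ("TE", None), ("8Z", "8Z"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (year INTEGER, jan TEXT, feb TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (2016, ?, ?)", pairs)

query = """
SELECT s.opt,
       COUNT(CASE WHEN s.ind_from = 1 THEN 1 END) AS jan_cnt,
       COUNT(CASE WHEN s.ind_from = 2 THEN 1 END) AS feb_cnt
FROM (
    -- stack both month columns into one, tagging where each value came from
    SELECT jan AS opt, 1 AS ind_from FROM YourTable
    UNION ALL
    SELECT feb, 2 FROM YourTable
) s
GROUP BY s.opt
"""
counts = {opt: (jan_cnt, feb_cnt) for opt, jan_cnt, feb_cnt in conn.execute(query)}
for opt, pair in counts.items():
    print(opt, pair)
```

Because GROUP BY puts all NULLs in one group, the NULL row is counted too, which a join on t1.JAN = t2.FEB would not match.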

How to add zero-filled rows to a query result for rows that don't exist?

I have a table with "Month" and "Year" columns, plus other data.
Every row in the table has a distinct ("Month", "Year") combination.
But for some months and years, no row exists.
I want to write an SQL query (... where year in (2010, 2011, 2012) ...) whose result contains every month of each selected year; if a month has no row, it should be added to the result with 0 in the other data columns.
Example:
Input: Table
data / month / year
+-----+---+------+
| 3.0 | 1 | 2011 |
| 4.3 | 3 | 2011 |
| 5.7 | 4 | 2011 |
| 2.2 | 5 | 2011 |
| 5.4 | 7 | 2011 |
+-----+---+------+
Output: SELECT ... WHERE year IN (2011)
+-----+----+------+
| 3.0 | 1 | 2011 |
| 0 | 2 | 2011 |
| 4.3 | 3 | 2011 |
| 5.7 | 4 | 2011 |
| 2.2 | 5 | 2011 |
| 0 | 6 | 2011 |
| 5.4 | 7 | 2011 |
| 0 | 8 | 2011 |
| 0 | 9 | 2011 |
| 0 | 10 | 2011 |
| 0 | 11 | 2011 |
| 0 | 12 | 2011 |
+-----+----+------+
Try Partition Outer Join:
SELECT
NVL(T.DATA, 0) DATA,
F.MONTH,
T.YEAR
FROM <your_table> T
PARTITION BY(T.YEAR)
RIGHT JOIN (SELECT LEVEL MONTH FROM DUAL CONNECT BY LEVEL <= 12) F ON T.MONTH = F.MONTH
Add your WHERE clause at the end or create a view with that definition and query against it.
select d.date_col,
nvl(s.val,0),
to_char(d.date_col,'MM') month,
to_char(d.date_col,'yyyy') year
from(
select add_months(date '2011-01-01',level-1) as date_col
from dual connect by level <= 12
) d
left join(
select sum(val) as val, month, year
from your_table
group by month, year
) S
on (to_char(d.date_col,'MM') = s.month and to_char(d.date_col,'yyyy') = s.year)
select nvl(t.data, 0), x.month, nvl(t.year, <your_year>) as year
from <your_table> t,
(select rownum as month from dual connect by level < 13) x
where (t.year is null or t.year = <your_year>)
and t.month(+) = x.month
order by x.month
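The answers above are Oracle-specific (CONNECT BY, NVL, partitioned outer join). The underlying idea, generate all twelve months and left join the real rows onto them, can be sketched portably. The example below uses Python's built-in sqlite3 module with a recursive CTE as the month generator; the table name `readings` and the single-year parameter are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "readings" is a hypothetical name for the question's table
conn.execute("CREATE TABLE readings (data REAL, month INTEGER, year INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", [
    (3.0, 1, 2011), (4.3, 3, 2011), (5.7, 4, 2011),
    (2.2, 5, 2011), (5.4, 7, 2011),
])

query = """
WITH RECURSIVE months(month) AS (
    SELECT 1
    UNION ALL
    SELECT month + 1 FROM months WHERE month < 12
)
SELECT COALESCE(r.data, 0) AS data, m.month, ? AS year
FROM months m
-- the year filter lives in the ON clause so missing months are kept
LEFT JOIN readings r ON r.month = m.month AND r.year = ?
ORDER BY m.month
"""
year = 2011
filled = conn.execute(query, (year, year)).fetchall()
for row in filled:
    print(row)
```

Putting the year condition in the ON clause rather than in WHERE matters: a WHERE filter on r.year would discard the generated months that have no matching row.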