I'm having trouble pivoting a table correctly.
My input is this raw data table:
+------+---------+------------+----------+
| YEAR | FACULTY | ADMISSIONS | DROPOUTS |
+------+---------+------------+----------+
| 2018 | LAW | 15 | 2 |
| 2019 | LAW | 18 | 4 |
| 2020 | LAW | 11 | 1 |
| 2018 | MATH | 19 | 1 |
| 2019 | MATH | 17 | 6 |
| 2020 | MATH | 24 | 5 |
+------+---------+------------+----------+
I want to pivot the years into columns, but I also want to keep the measures (admissions and dropouts) as row labels. E.g., I want a table like this:
+---------+------------+------+------+------+
| FACULTY | MEASURE | 2018 | 2019 | 2020 |
+---------+------------+------+------+------+
| LAW | ADMISSIONS | 15 | 18 | 11 |
| LAW | DROPOUTS | 2 | 4 | 1 |
| MATH | ADMISSIONS | 19 | 17 | 24 |
| MATH | DROPOUTS | 1 | 6 | 5 |
+---------+------------+------+------+------+
I can pivot years for a single measure using:
SELECT *
FROM
(
SELECT FACULTY, YEAR, ADMISSIONS
FROM TABLE
) src
PIVOT (SUM (ADMISSIONS)
FOR YEAR IN ([2018],[2019],[2020])
) p
But I need to pivot both measures and still get the measure names column. Any ideas?
That's unpivoting, then pivoting. If your database supports lateral joins and values(), you can do:
select
t.faculty,
x.measure,
sum(case when t.year = 2018 then x.value end) value_2018,
sum(case when t.year = 2019 then x.value end) value_2019,
sum(case when t.year = 2020 then x.value end) value_2020
from mytable t
cross apply (values ('ADMISSIONS', t.admissions), ('DROPOUTS', t.dropouts)) as x(measure, value)
group by t.faculty, x.measure
I would unpivot using apply (assuming you are using SQL Server) and reaggregate:
select t.faculty, v.measure,
max(case when year = 2018 then val end) as [2018],
max(case when year = 2019 then val end) as [2019],
max(case when year = 2020 then val end) as [2020]
from t cross apply
(values ('ADMISSIONS', ADMISSIONS), ('DROPOUTS', DROPOUTS)
) v(measure, val)
group by t.faculty, v.measure
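For anyone who wants to try the unpivot-then-pivot idea without SQL Server, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from either answer: SQLite has no CROSS APPLY, so the unpivot step is swapped for a UNION ALL, and the table name t is an assumption.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (year INT, faculty TEXT, admissions INT, dropouts INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(2018, "LAW", 15, 2), (2019, "LAW", 18, 4), (2020, "LAW", 11, 1),
     (2018, "MATH", 19, 1), (2019, "MATH", 17, 6), (2020, "MATH", 24, 5)],
)

# Step 1 (inner query): unpivot the two measure columns into rows.
# Step 2 (outer query): pivot the years back out with conditional aggregation.
query = """
SELECT faculty, measure,
       MAX(CASE WHEN year = 2018 THEN val END) AS "2018",
       MAX(CASE WHEN year = 2019 THEN val END) AS "2019",
       MAX(CASE WHEN year = 2020 THEN val END) AS "2020"
FROM (
    SELECT faculty, year, 'ADMISSIONS' AS measure, admissions AS val FROM t
    UNION ALL
    SELECT faculty, year, 'DROPOUTS', dropouts FROM t
) AS u
GROUP BY faculty, measure
ORDER BY faculty, measure
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each (faculty, measure) pair ends up as one row with one column per year, which is the layout asked for in the question.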
I have a table of the form shown below. My intention is to create a monthly checklist.
+-------------+----------------+-----------------+-----------------+
| id | mem_id | month_code | type |
+-------------+----------------+-----------------+-----------------+
| 1 | 1 | Jan | to |
| 2 | 2 | Feb | t |
| 3 | 1 | Feb | to |
| 4 | 3 | Jan | o |
| 5 | 1 | Mar | o |
+-------------+----------------+-----------------+-----------------+
The query used is
SELECT distinct(mem_id) as Member,
(SELECT type FROM test where mem_id=Member and month_code='Jan') as Jan,
(SELECT type FROM test where mem_id=Member and month_code='Feb') as Feb,
(SELECT type FROM test where mem_id=Member and month_code='Mar') as Mar
FROM test
The desired output is
+-------------+----------------+-----------------+-----------------+
| mem_id | Jan | Feb | Mar |
+-------------+----------------+-----------------+-----------------+
| 1 | to | to | o |
| 2 | | t | |
| 3 | o | | |
+-------------+----------------+-----------------+-----------------+
My problem, however, is that the code works fine on MySQL, but in MS Access I get a pop-up input box asking me to enter the value of the parameter Member. How can I get the correct output in Access?
You can do conditional aggregation instead:
select mem_id,
max(iif(month_code = 'Jan', type)) as Jan,
max(iif(month_code = 'Feb', type)) as Feb,
max(iif(month_code = 'Mar', type)) as Mar
from test t
group by mem_id;
In your query, the name Member doesn't exist as a column in the test table, so Access treats it as a parameter.
So you probably need mem_id instead:
SELECT t.mem_id as Member,
(SELECT t1.type FROM test as t1 where t1.mem_id = t.mem_id and t1.month_code='Jan') as Jan,
(SELECT t1.type FROM test as t1 where t1.mem_id = t.mem_id and t1.month_code='Feb') as Feb,
(SELECT t1.type FROM test as t1 where t1.mem_id = t.mem_id and t1.month_code='Mar') as Mar
FROM test as t
GROUP BY t.mem_id;
The only problem with this version is that if one mem_id has more than one type for the same month_code, the subquery returns more than one row and throws a subquery error.
In that case you need a TOP clause in the subquery.
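To see the conditional aggregation outside Access, here is a small sketch using Python's built-in sqlite3 module. SQLite has no IIf with two arguments in older versions, so a CASE expression stands in for it here; that substitution is mine, the rest follows the sample data from the question.

```python
import sqlite3

# In-memory copy of the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INT, mem_id INT, month_code TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [(1, 1, "Jan", "to"), (2, 2, "Feb", "t"), (3, 1, "Feb", "to"),
     (4, 3, "Jan", "o"), (5, 1, "Mar", "o")],
)

# One output row per member; each CASE yields NULL for non-matching months,
# and MAX ignores NULLs, so only the matching month's type survives.
query = """
SELECT mem_id,
       MAX(CASE WHEN month_code = 'Jan' THEN type END) AS Jan,
       MAX(CASE WHEN month_code = 'Feb' THEN type END) AS Feb,
       MAX(CASE WHEN month_code = 'Mar' THEN type END) AS Mar
FROM test
GROUP BY mem_id
ORDER BY mem_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Months with no entry come back as NULL (None in Python), which corresponds to the empty cells in the desired output.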
Say I have a table like this in MS SQL Server 2008:
+------+--------+---------+
| year | JAN | FEB |
+------+--------+---------+
| 2016 | 5K2 | 5K2 |
| 2016 | 5K2 | 5K2 |
| 2016 | 5K2 | 5K2 |
| 2016 | 8Z | 8Z |
| 2016 | R5205 | R5205 |
| 2016 | 5K2 | 5K2 |
| 2016 | 5K2 | 5K2 |
| 2016 | NULL | NULL |
| 2016 | TE | NULL |
| 2016 | TE | NULL |
| 2016 | 8Z | 8Z |
+------+--------+---------+
And I want to get a count for each column, something like this
+------+--------+---------+
| opt | JAN_cnt| FEB_cnt |
+------+--------+---------+
| 5K2 | 5 | 4 |
| 8Z | 2 | 2 |
| R5205| 1 | 1 |
| TE | 2 | 0 |
| NULL | 1 | 4 |
+------+--------+---------+
First, can this be done? Second, how? I have searched, but can't find exactly what I am looking for.
I think the simplest way is to use UNION ALL with conditional aggregation using a CASE expression:
SELECT s.opt,
COUNT(CASE WHEN s.ind_from = 1 THEN 1 END) as jan_cnt,
COUNT(CASE WHEN s.ind_from = 2 THEN 1 END) as feb_cnt
FROM (
SELECT t1.jan as opt,1 as ind_from FROM YourTable t1
UNION ALL
SELECT t2.feb,2 FROM YourTable t2) s
GROUP BY s.opt
I would advise putting the values into a different format:
opt | month | cnt
You can do this as:
select opt, mon, count(*) as cnt
from ((select jan as opt, 'jan' as mon from t) union all
(select feb as opt, 'feb' as mon from t)
) o
group by opt, mon;
It is easy enough to switch this to your format:
select opt, sum(jan) as jan, sum(feb) as feb
from ((select jan as opt, 1 as jan, 0 as feb from t) union all
(select feb as opt, 0, 1 from t)
) o
group by opt;
I just prefer the first format. It is easier to generalize to more columns.
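Here is a hedged, runnable sketch of the second (wide) form using Python's built-in sqlite3 module, loaded with the eleven rows exactly as listed in the question. The table name t is an assumption. Note that GROUP BY puts all NULLs into a single group, so the NULL row is counted as well.

```python
import sqlite3

# The eleven (JAN, FEB) pairs exactly as listed in the question.
data = [
    ("5K2", "5K2"), ("5K2", "5K2"), ("5K2", "5K2"), ("8Z", "8Z"),
    ("R5205", "R5205"), ("5K2", "5K2"), ("5K2", "5K2"), (None, None),
    ("TE", None), ("TE", None), ("8Z", "8Z"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (year INT, jan TEXT, feb TEXT)")
conn.executemany("INSERT INTO t VALUES (2016, ?, ?)", data)

# Stack both columns into one 'opt' column with 0/1 flags, then sum the flags.
query = """
SELECT opt, SUM(jan) AS jan_cnt, SUM(feb) AS feb_cnt
FROM (
    SELECT jan AS opt, 1 AS jan, 0 AS feb FROM t
    UNION ALL
    SELECT feb, 0, 1 FROM t
) AS o
GROUP BY opt
"""
# Map each option (None stands for NULL) to its (jan_cnt, feb_cnt) pair.
counts = {opt: (jan_cnt, feb_cnt) for opt, jan_cnt, feb_cnt in conn.execute(query)}
print(counts)
```

The counts are computed from the rows as listed, so they may not match a hand-written expected table if that table was typed from different data.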
SELECT COALESCE(t1.JAN, t2.FEB), t1.JAN_cnt, t2.FEB_cnt
FROM
(
SELECT JAN, COUNT(*) AS JAN_cnt
FROM yourTable
GROUP BY JAN
) t1
FULL OUTER JOIN
(
SELECT FEB, COUNT(*) AS FEB_cnt
FROM yourTable
GROUP BY FEB
) t2
ON t1.JAN = t2.FEB
Table:
id | year | score
-----+------+-----------
12 | 2011 | 0.929
12 | 2014 | 0.933
12 | 2010 | 0.937
12 | 2013 | 0.938
12 | 2009 | 0.97
13 | 2010 | 0.851
13 | 2014 | 0.881
13 | 2011 | 0.885
13 | 2013 | 0.895
13 | 2009 | 0.955
16 | 2009 | 0.867
16 | 2011 | 0.881
16 | 2012 | 0.886
16 | 2013 | 0.897
16 | 2014 | 0.953
Desired Output:
id | year | score
-----+------+-----------
16 | 2009 | 0.867
16 | 2011 | 0.881
16 | 2012 | 0.886
16 | 2013 | 0.897
16 | 2014 | 0.953
I'm having difficulty trying to output the scores that are increasing with respect to the year.
Any help would be greatly appreciated.
So you want to select id = 16 because it is the only one that has steadily increasing values.
Many versions of SQL support lag(), which can help solve this problem. You can determine, for a given id, if all the values are increasing or decreasing by doing:
select id,
(case when min(score - prev_score) < 0 then 'nonincreasing' else 'increasing' end) as grp
from (select t.*, lag(score) over (partition by id order by year) as prev_score
from table t
) t
group by id;
You can then select all "increasing" ids using a join:
select t.*
from table t join
(select id
from (select t.*, lag(score) over (partition by id order by year) as prev_score
from table t
) t
group by id
having min(score - prev_score) > 0
) inc
on t.id = inc.id;
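As a quick check of the lag() approach, here is a sketch using Python's built-in sqlite3 module with the data from the question. It assumes a SQLite build with window function support (3.25 or later), and the table name scores is made up for the example, since "table" is a reserved word.

```python
import sqlite3

# Sample data from the question: (id, year, score).
data = [
    (12, 2011, 0.929), (12, 2014, 0.933), (12, 2010, 0.937), (12, 2013, 0.938),
    (12, 2009, 0.97), (13, 2010, 0.851), (13, 2014, 0.881), (13, 2011, 0.885),
    (13, 2013, 0.895), (13, 2009, 0.955), (16, 2009, 0.867), (16, 2011, 0.881),
    (16, 2012, 0.886), (16, 2013, 0.897), (16, 2014, 0.953),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INT, year INT, score REAL)")
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", data)

# diff is the change from the previous year's score within each id
# (NULL for the first year, which MIN ignores). An id qualifies only
# if its smallest year-over-year change is still positive.
query = """
SELECT s.id, s.year, s.score
FROM scores AS s
JOIN (
    SELECT id
    FROM (
        SELECT id,
               score - LAG(score) OVER (PARTITION BY id ORDER BY year) AS diff
        FROM scores
    ) AS d
    GROUP BY id
    HAVING MIN(diff) > 0
) AS inc ON s.id = inc.id
ORDER BY s.id, s.year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only id 16 survives the HAVING filter, because ids 12 and 13 both drop between 2009 and 2010.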
I have a table with the columns "Month" and "Year", plus other data.
Every row in the table has a different combination of "Month" and "Year".
But for some months and years no row exists.
I want to write an SQL query (... where year in (2010, 2011, 2012) ...) whose result contains every month of the selected years; if a month doesn't exist, it should be added to the result with 0 in the other data columns.
Example:
Input: Table
data / month / year
+-----+---+------+
| 3.0 | 1 | 2011 |
| 4.3 | 3 | 2011 |
| 5.7 | 4 | 2011 |
| 2.2 | 5 | 2011 |
| 5.4 | 7 | 2011 |
+-----+---+------+
Output: SELECT ... WHERE year IN (2011)
+-----+----+------+
| 3.0 | 1 | 2011 |
| 0 | 2 | 2011 |
| 4.3 | 3 | 2011 |
| 5.7 | 4 | 2011 |
| 2.2 | 5 | 2011 |
| 0 | 6 | 2011 |
| 5.4 | 7 | 2011 |
| 0 | 8 | 2011 |
| 0 | 9 | 2011 |
| 0 | 10 | 2011 |
| 0 | 11 | 2011 |
| 0 | 12 | 2011 |
+-----+----+------+
Try Partition Outer Join:
SELECT
NVL(T.DATA, 0) DATA,
F.MONTH,
T.YEAR
FROM <your_table> T
PARTITION BY(T.YEAR)
RIGHT JOIN (SELECT LEVEL MONTH FROM DUAL CONNECT BY LEVEL <= 12) F ON T.MONTH = F.MONTH
Add your WHERE clause at the end or create a view with that definition and query against it.
select d.datecol,
nvl(s.val,0) as val,
to_char(d.datecol,'MM') month,
to_char(d.datecol,'yyyy') year
from(
select add_months(date '2011-01-01',level-1) as datecol
from dual connect by level <= 12
) d
left join(
select sum(val) as val, month, year
from your_table
group by month, year
) S
on (to_char(d.datecol,'MM') = s.month and to_char(d.datecol,'yyyy') = s.year)
select nvl(t.data, 0), x.month, nvl(t.year, <your_year>) as year
from <your_table> t,
(select rownum as month from dual connect by level < 13) x
where t.year(+) = <your_year>
and t.month(+) = x.month
order by x.month
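The answers above are Oracle-specific (DUAL, CONNECT BY, NVL). As a portable illustration of the same idea, generating all twelve months and outer-joining the data onto them, here is a sketch using Python's built-in sqlite3 module. The recursive CTE replaces CONNECT BY, and the table name readings is an assumption.

```python
import sqlite3

# Sample rows from the question: (data, month, year).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (data REAL, month INT, year INT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(3.0, 1, 2011), (4.3, 3, 2011), (5.7, 4, 2011), (2.2, 5, 2011), (5.4, 7, 2011)],
)

# The CTE produces months 1..12; the LEFT JOIN keeps every month and
# COALESCE turns the missing data into 0. The year filter must live in
# the ON clause, otherwise it would discard the unmatched months.
query = """
WITH RECURSIVE months(month) AS (
    SELECT 1
    UNION ALL
    SELECT month + 1 FROM months WHERE month < 12
)
SELECT COALESCE(r.data, 0) AS data, m.month, ? AS year
FROM months AS m
LEFT JOIN readings AS r ON r.month = m.month AND r.year = ?
ORDER BY m.month
"""
wanted_year = 2011
rows = conn.execute(query, (wanted_year, wanted_year)).fetchall()
for row in rows:
    print(row)
```

For several years at once, the same pattern works if the month list is cross-joined with a list of the wanted years before the outer join.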
I have a table like this.
|-DT--------- |-ID------|
|5/30 12:00pm |10 |
|5/30 01:00pm |30 |
|5/30 02:30pm |30 |
|5/30 03:00pm |50 |
|5/30 04:30pm |10 |
|5/30 05:00pm |10 |
|5/30 06:30pm |10 |
|5/30 07:30pm |10 |
|5/30 08:00pm |50 |
|5/30 09:30pm |10 |
I want to remove any duplicate rows only if the previous row has the same ID as the following row. I want to keep the duplicate row with the datetime furthest in the future. For example the above table would look like this.
|-DT--------- |-ID------|
|5/30 12:00pm |10 |
|5/30 02:30pm |30 |
|5/30 03:00pm |50 |
|5/30 07:30pm |10 |
|5/30 08:00pm |50 |
|5/30 09:30pm |10 |
Can I get any tips on how this can be done?
with C as
(
select ID,
row_number() over(order by DT) as rn
from YourTable
)
delete C1
from C as C1
inner join C as C2
on C1.rn = C2.rn-1 and
C1.ID = C2.ID
Do these 3 steps: http://www.sqlfiddle.com/#!3/b58b9/19
First make the rows sequential:
with a as
(
select dt, id, row_number() over(order by dt) as rn
from tbl
)
select * from a;
Output:
| DT | ID | RN |
----------------------------------------
| May, 30 2012 12:00:00-0700 | 10 | 1 |
| May, 30 2012 13:00:00-0700 | 30 | 2 |
| May, 30 2012 14:30:00-0700 | 30 | 3 |
| May, 30 2012 15:00:00-0700 | 50 | 4 |
| May, 30 2012 16:30:00-0700 | 10 | 5 |
| May, 30 2012 17:00:00-0700 | 10 | 6 |
| May, 30 2012 18:30:00-0700 | 10 | 7 |
| May, 30 2012 19:30:00-0700 | 10 | 8 |
| May, 30 2012 20:00:00-0700 | 50 | 9 |
| May, 30 2012 21:30:00-0700 | 10 | 10 |
Second, using the sequential numbers, we can find which rows are at the bottom (and also those not at the bottom for that matter):
with a as
(
select dt, id, row_number() over(order by dt) as rn
from tbl
)
select below.*,
case when above.id <> below.id or above.id is null then
1
else
0
end as is_at_bottom
from a below
left join a above on above.rn + 1 = below.rn;
Output:
| DT | ID | RN | IS_AT_BOTTOM |
-------------------------------------------------------
| May, 30 2012 12:00:00-0700 | 10 | 1 | 1 |
| May, 30 2012 13:00:00-0700 | 30 | 2 | 1 |
| May, 30 2012 14:30:00-0700 | 30 | 3 | 0 |
| May, 30 2012 15:00:00-0700 | 50 | 4 | 1 |
| May, 30 2012 16:30:00-0700 | 10 | 5 | 1 |
| May, 30 2012 17:00:00-0700 | 10 | 6 | 0 |
| May, 30 2012 18:30:00-0700 | 10 | 7 | 0 |
| May, 30 2012 19:30:00-0700 | 10 | 8 | 0 |
| May, 30 2012 20:00:00-0700 | 50 | 9 | 1 |
| May, 30 2012 21:30:00-0700 | 10 | 10 | 1 |
Third, delete all rows not at the bottom:
with a as
(
select dt, id, row_number() over(order by dt) as rn
from tbl
)
,b as
(
select below.*,
case when above.id <> below.id or above.id is null then
1
else
0
end as is_at_bottom
from a below
left join a above on above.rn + 1 = below.rn
)
delete a
from a
inner join b on b.rn = a.rn
where b.is_at_bottom = 0;
To verify:
select * from tbl order by dt;
Output:
| DT | ID |
-----------------------------------
| May, 30 2012 12:00:00-0700 | 10 |
| May, 30 2012 13:00:00-0700 | 30 |
| May, 30 2012 15:00:00-0700 | 50 |
| May, 30 2012 16:30:00-0700 | 10 |
| May, 30 2012 20:00:00-0700 | 50 |
| May, 30 2012 21:30:00-0700 | 10 |
You can also simplify the deletion to this: http://www.sqlfiddle.com/#!3/b58b9/20
with a as
(
select dt, id, row_number() over(order by dt, id) as rn
from tbl
)
delete above
from a below
left join a above on above.rn + 1 = below.rn
where case when above.id <> below.id or above.id is null then 1 else 0 end = 0;
Mikael Eriksson's answer is the best, though; if I simplify my simplified query once more, it looks like his answer ツ For that, I +1'd his answer. I will just make his query a bit more readable by swapping the join order and giving the tables good aliases.
with a as
(
select *, row_number() over(order by dt, id) as rn
from tbl
)
delete above
from a below
join a above on above.rn + 1 = below.rn and above.id = below.id;
Live test: http://www.sqlfiddle.com/#!3/b58b9/24
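Deleting through a CTE as shown above is SQL Server syntax. For comparison, here is a sketch of the same "delete a row when the next row has the same id" rule using Python's built-in sqlite3 module. It swaps the self-join on row numbers for LEAD(), assumes SQLite 3.25 or later for window functions, and stores dt as ISO text so that it sorts chronologically; none of that comes from the answers themselves.

```python
import sqlite3

# Sample data from the question, with times written as sortable ISO text.
data = [
    ("2012-05-30 12:00", 10), ("2012-05-30 13:00", 30), ("2012-05-30 14:30", 30),
    ("2012-05-30 15:00", 50), ("2012-05-30 16:30", 10), ("2012-05-30 17:00", 10),
    ("2012-05-30 18:30", 10), ("2012-05-30 19:30", 10), ("2012-05-30 20:00", 50),
    ("2012-05-30 21:30", 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (dt TEXT, id INT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", data)

# A row is removed when the chronologically next row carries the same id,
# so the latest row of every consecutive run is the one that is kept.
conn.execute("""
DELETE FROM tbl
WHERE rowid IN (
    SELECT rid
    FROM (
        SELECT rowid AS rid, id,
               LEAD(id) OVER (ORDER BY dt) AS next_id
        FROM tbl
    ) AS x
    WHERE id = next_id
)
""")

remaining = conn.execute("SELECT dt, id FROM tbl ORDER BY dt").fetchall()
for row in remaining:
    print(row)
```

This keeps the 14:30 row for id 30 and the 19:30 row for id 10, matching the table the question asks for.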
Here you go, simply replace [Table] with the name of your table.
SELECT *
FROM [dbo].[Table]
WHERE [Ident] NOT IN
(
SELECT Extent.[Ident]
FROM
(
SELECT TOP 100 PERCENT T1.[DT],
T1.[ID],
T1.[Ident],
(
SELECT TOP 1 Previous.ID
FROM [dbo].[Table] AS Previous
WHERE Previous.[Ident] > T1.Ident -- this is where the identity seed is important
ORDER BY [Ident] ASC
) AS 'PreviousId'
FROM [dbo].[Table] AS T1
ORDER BY T1.[Ident] DESC
) AS Extent
WHERE [Id] = [PreviousId]
)
Note: You will need an identity column on the table. Use a CTE if you can't change the structure of the table.
You can try the following query:
select * from
(
select *,RANK() OVER (ORDER BY dt,id) AS Rank from test
) as a
where 0 = (
select count(id) from (
select id, RANK() OVER (ORDER BY dt,id) AS Rank from test
)as b where b.id = a.id and b.Rank = a.Rank + 1
) order by dt
Thanks,
Mahesh