SQL Pivot Table isn't working - sql

SQL 2005
I have a temp table:
Year  PercentMale  PercentFemale  PercentHmlss  PercentEmployed  TotalSrvd
2008  100          0              0             100              1
2009  55           40             0             80               20
2010  64           35             0             67               162
2011  69           27             0             34               285
2012  56           43             10            1                58
and I want to create a query to display the data like this:
                 2008  2009  2010  2011  2012
PercentMale      100   55    64    69    56
PercentFemale    -     40    35    27    43
PercentHmlss     -     -     -     -     10
PercentEmployed  100   80    67    34    1
TotalSrvd        1     20    162   285   58
Can I use a pivot table to accomplish this? If so, how? I've tried using a pivot but have found no success.
select PercentHmlss,PercentMale,Percentfemale,
PercentEmployed,[2008],[2009],[2010],[2011],[2012] from
(select PercentHmlss,PercentMale, Percentfemale, PercentEmployed,
TotalSrvd,year from #TempTable)as T
pivot (sum (TotalSrvd) for year
in ([2008],[2009],[2010],[2011],[2012])) as pvt
This is the result:
PercentHmlss PercentMale Percentfemale PercentEmployed [2008] [2009] [2010] [2011] [2012]
0 55 40 80 NULL 20 NULL NULL NULL
0 64 35 67 NULL NULL 162 NULL NULL
0 69 27 34 NULL NULL NULL 285 NULL
0 100 0 100 1 NULL NULL NULL NULL
10 56 43 1 NULL NULL NULL NULL 58
Thanks.

For this to work you will want to perform an UNPIVOT and then a PIVOT:
SELECT *
from
(
    select year, quantity, type
    from
    (
        select year, percentmale, percentfemale, percenthmlss, percentemployed, totalsrvd
        from t
    ) x
    UNPIVOT
    (
        quantity for type
        in
        ([percentmale]
        , [percentfemale]
        , [percenthmlss]
        , [percentemployed]
        , [totalsrvd])
    ) u
) x1
pivot
(
    sum(quantity)
    for Year in ([2008], [2009], [2010], [2011], [2012])
) p
Edit: further explanation
You were close with the PIVOT query you tried: it put the Year values into columns, as you wanted. However, you also want the values that were originally in the columns percentmale, percentfemale, etc. to appear as rows, so you need to unpivot the data first.
Basically, you take the original data and turn every column value into its own row, keyed by year. The UNPIVOT puts your data in this format:
Year Quantity Type
2008 100 percentmale
2008 0 percentfemale
etc
Once you have transformed the data into this format, then you can perform the PIVOT to get the result you want.
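The unpivot-then-pivot idea can also be sketched without the PIVOT and UNPIVOT keywords. The block below is only an illustration, not the SQL Server query from the answer: it assumes SQLite (through Python's sqlite3 module) as a stand-in engine, emulates UNPIVOT with UNION ALL and PIVOT with conditional aggregation, and introduces a hypothetical ord column just to keep the rows in the requested order. The y2008 to y2012 column aliases are also made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (year INTEGER, percentmale INTEGER, percentfemale INTEGER, "
    "percenthmlss INTEGER, percentemployed INTEGER, totalsrvd INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (2008, 100, 0, 0, 100, 1),
        (2009, 55, 40, 0, 80, 20),
        (2010, 64, 35, 0, 67, 162),
        (2011, 69, 27, 0, 34, 285),
        (2012, 56, 43, 10, 1, 58),
    ],
)

# Step 1 (unpivot): one row per (year, type, quantity), built with UNION ALL.
# Step 2 (pivot): one column per year, built with SUM(CASE ...).
# "ord" is a hypothetical helper column that fixes the output row order.
query = """
WITH unpivoted AS (
    SELECT 1 AS ord, year, 'percentmale' AS type, percentmale AS quantity FROM t
    UNION ALL SELECT 2, year, 'percentfemale', percentfemale FROM t
    UNION ALL SELECT 3, year, 'percenthmlss', percenthmlss FROM t
    UNION ALL SELECT 4, year, 'percentemployed', percentemployed FROM t
    UNION ALL SELECT 5, year, 'totalsrvd', totalsrvd FROM t
)
SELECT type,
       SUM(CASE WHEN year = 2008 THEN quantity END) AS y2008,
       SUM(CASE WHEN year = 2009 THEN quantity END) AS y2009,
       SUM(CASE WHEN year = 2010 THEN quantity END) AS y2010,
       SUM(CASE WHEN year = 2011 THEN quantity END) AS y2011,
       SUM(CASE WHEN year = 2012 THEN quantity END) AS y2012
FROM unpivoted
GROUP BY ord, type
ORDER BY ord
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Each printed tuple is one of the requested rows: the original column name followed by its value for 2008 through 2012.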

Related

Hive Summing up data in the table based on the date range

I have a table with the following schema, and the data in it looks like this:
ID HITS MISS DDATE
1 10 3 20180101
1 33 21 20180122
1 84 11 20180901
1 11 2 20180405
1 54 23 20190203
1 33 43 20190102
4 54 22 20170305
4 56 88 20180115
5 87 22 20180809
5 66 48 20180617
5 91 53 20170606
DataTypes:
ID INT
HITS INT
MISS INT
DDATE STRING
The requirement is to calculate the totals of HITS and MISS on a yearly basis, i.e. for 2017, 2018, 2019, ...
I wrote the following query:
SELECT ID,
SUM(HITS) AS HITS,SUM(MISS) AS MISS,
CASE
WHEN DDATE BETWEEN '201701' AND '201712' THEN '2017' ELSE
'NOTHING' END AS TTL_YR17_DATA
CASE
WHEN DDATE BETWEEN '201801' AND '201812' THEN '2018' ELSE
'NOTHING' END AS TTL_YR18_DATA
CASE
WHEN DDATE BETWEEN '201901' AND '201912' THEN '2019' ELSE
'NOTHING' END AS TTL_YR19_DATA
FROM
HST_TABLE
WHERE
DDATE BETWEEN '201801' AND '201812'
GROUP BY
ID,DDATE;
But, the query is not fetching the expected result.
Actual O/P:
1 10 3 2018
1 33 21 2018
1 84 11 2018
1 11 2 2018
1 54 23 2019
1 33 43 2019
4 54 22 2017
4 56 88 2018
5 87 22 2018
5 66 48 2018
5 91 53 2017
Expected O/P:
1 138 37 2018
4 56 88 2018
5 153 70 2018
1 87 66 2019
5 91 53 2017
Another related question:
Is there a way that I can avoid passing the DDATE range in the query? As this should be given by the user and shouldn't be hardcoded.
Any help/advice to achieve the above two requirements will be really helpful.
OK, it's easy to implement this with the substring function in Hive, as below:
select
substring(dddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(dddate,0,4),
id
order by
the_year,
id
The answer above by @Shawn.X has the right idea, but it misspells the column name as dddate; the column is ddate. Below is the corrected query:
select
substring(ddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(ddate,0,4),
id
order by
the_year,
id;
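To check the grouping against the sample data, and to show one way of taking the year range from the user instead of hardcoding it, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module rather than Hive, so substr starts at position 1, and the function yearly_totals and its parameters first_year and last_year are hypothetical names introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hst_table (id INTEGER, hits INTEGER, miss INTEGER, ddate TEXT)")
conn.executemany(
    "INSERT INTO hst_table VALUES (?, ?, ?, ?)",
    [
        (1, 10, 3, "20180101"), (1, 33, 21, "20180122"), (1, 84, 11, "20180901"),
        (1, 11, 2, "20180405"), (1, 54, 23, "20190203"), (1, 33, 43, "20190102"),
        (4, 54, 22, "20170305"), (4, 56, 88, "20180115"), (5, 87, 22, "20180809"),
        (5, 66, 48, "20180617"), (5, 91, 53, "20170606"),
    ],
)


def yearly_totals(conn, first_year=None, last_year=None):
    """Sum hits and miss per (year, id); the optional bounds are bind parameters."""
    sql = """
        SELECT substr(ddate, 1, 4) AS the_year, id,
               SUM(hits) AS hits_num, SUM(miss) AS miss_num
        FROM hst_table
        WHERE (:first_year IS NULL OR substr(ddate, 1, 4) >= :first_year)
          AND (:last_year IS NULL OR substr(ddate, 1, 4) <= :last_year)
        GROUP BY substr(ddate, 1, 4), id
        ORDER BY the_year, id
    """
    params = {"first_year": first_year, "last_year": last_year}
    return conn.execute(sql, params).fetchall()


all_years = yearly_totals(conn)                # no range given: every year
only_2018 = yearly_totals(conn, "2018", "2018")  # range supplied by the caller
print(all_years)
print(only_2018)
```

Grouping by the year prefix and id, rather than by id and the full ddate, is what collapses the daily rows into one row per year. On the sample data this also yields a row for id 4 in 2017, which the expected output in the question does not list.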

proc sql statement to sum on values/rows that match a condition

I have a data table like below:
Table 1:
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT
10 111 2009 . 100 .
110 120 2009 9 10 .
231 120 2009 0 20 10
222 120 2010 0 40 20
221 222 2009 102 10 30
321 222 2009 0 30 20
213 222 2009 0 10 20
432 321 2009 99 10 0
211 432 2009 111 20 10
212 432 2009 0 20 0
I want to sum over the DAYSBETVISIT column only when the pidDifference value is 0 for each PERSONID. So I wrote the following proc sql statement.
proc sql;
create table table5 as
(
select rowid, YEAR, PERSONID, pidDifference, TIMETOEVENT, DAYSBETVISIT,
SUM(CASE WHEN PIDDifference = 0 THEN DaysBetVisit ELSE 0 END)
from WORK.Table4_1
group by PERSONID,TIMETOEVENT, YEAR
);
quit;
However, the result I got was not summing the DAYSBETVISIT values in rows where PIDDifference = 0 within the same PERSONID. It just output the same value as was present in DAYSBETVISIT in that specific row.
The column that I need (sumdays) but don't get with the above statement is shown below, next to the column the statement actually produces (OUT):
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT sumdays OUT
10 111 2009 . 100 . 0 0
110 120 2009 9 10 . 0 0
231 120 2009 0 20 10 30 10
222 120 2010 0 40 20 30 20
221 222 2009 102 10 30 0 0
321 222 2009 0 30 20 40 20
213 222 2009 0 10 20 40 20
432 321 2009 99 10 0 0 0
211 432 2009 111 20 10 0 0
212 432 2009 0 20 0 0 0
I do not know what I am doing wrong.
I am using SAS EG Version 7.15, Base SAS version 9.4.
For your example data it looks like you just need to use two CASE expressions: one to define which values to SUM(), and another to define whether to report the SUM or not.
proc sql ;
select personid, piddifference, daysbetvisit, sumdays
, case when piddifference = 0
then sum(case when piddifference=0 then daysbetvisit else 0 end)
else 0 end as WANT
from expect
group by personid
;
quit;
Results
PERSONID  pidDifference  DAYSBETVISIT  sumdays  WANT
--------------------------------------------------------
111       .              .             0        0
120       0              10            30       30
120       0              20            30       30
120       9              .             0        0
222       0              20            40       40
222       0              20            40       40
222       102            30            0        0
321       99             0             0        0
432       0              0             0        0
432       111            10            0        0
SAS proc sql doesn't support window functions. I find the re-merging aggregations to be a bit difficult to use, except in the obvious cases. So, use a subquery or join and group by:
proc sql;
create table table5 as
select t.rowid, t.YEAR, t.PERSONID, t.pidDifference, t.TIMETOEVENT, t.DAYSBETVISIT,
tt.sum_DaysBetVisit
from WORK.Table4_1 t left join
(select personid, sum(DaysBetVisit) as sum_DaysBetVisit
from WORK.Table4_1
group by personid
having min(pidDifference) = max(pidDifference) and min(pidDifference) = 0
) tt
on tt.personid = t.personid;
quit;
Note: This doesn't handle NULL values for pidDifference. If that is a concern, you can add count(pidDifference) = count(*) to the having clause.
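As a cross-check of the join-to-an-aggregate approach against the sumdays column requested in the question, here is a small sketch. It is an assumption-laden variant, not the query above: it runs on SQLite through Python's sqlite3 module instead of SAS, names the table visits and the key column row_id (both hypothetical), and filters the subquery with WHERE piddifference = 0 instead of the HAVING condition, because the expected output includes persons such as 120 who also have non-zero pidDifference rows. Missing values (.) are loaded as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (row_id INTEGER, personid INTEGER, year INTEGER, "
    "piddifference INTEGER, timetoevent INTEGER, daysbetvisit INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?, ?)",
    [
        (10, 111, 2009, None, 100, None),
        (110, 120, 2009, 9, 10, None),
        (231, 120, 2009, 0, 20, 10),
        (222, 120, 2010, 0, 40, 20),
        (221, 222, 2009, 102, 10, 30),
        (321, 222, 2009, 0, 30, 20),
        (213, 222, 2009, 0, 10, 20),
        (432, 321, 2009, 99, 10, 0),
        (211, 432, 2009, 111, 20, 10),
        (212, 432, 2009, 0, 20, 0),
    ],
)

# The subquery sums daysbetvisit per person over the piddifference = 0 rows only.
# The outer CASE reports that sum on the piddifference = 0 rows and 0 elsewhere.
query = """
SELECT t.row_id,
       CASE WHEN t.piddifference = 0 THEN COALESCE(s.sum_days, 0) ELSE 0 END AS sumdays
FROM visits t
LEFT JOIN (
    SELECT personid, SUM(daysbetvisit) AS sum_days
    FROM visits
    WHERE piddifference = 0
    GROUP BY personid
) s ON s.personid = t.personid
ORDER BY t.personid, t.row_id
"""
sumdays = dict(conn.execute(query).fetchall())  # row_id -> sumdays
print(sumdays)
```

Rows whose pidDifference is missing fall into the ELSE branch, because a comparison with NULL is never true.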

Aggregate result from query by quarter SQL

Lets say I have a table which holds all exports for some time back in Microsoft SQL database:
Name:
ExportTable
Columns:
id - numeric(18)
exportdate - datetime
In order to get the number of exports per week I can run the following query:
SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate', count(exportdate) as 'totalExports'
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;
Returns:
exportdate totalExports
---------- ------------
27 13
28 12
29 15
30 8
31 17
32 10
33 7
34 15
35 4
36 18
37 10
38 14
39 14
40 21
41 19
Would it be possible to aggregate the week results by quarter so the output becomes something like the below?
UPDATE
Sorry for not being crystal clear: I would like each week's result to add up with the previous weeks' results until a new quarter starts.
Note that week 41 contains 21+19 = 40.
Week 39 contains 157 (13+12+15+8+17+10+7+15+4+18+10+14+14).
exportdate totalExports Quarter
---------- ------------ -------
27 13 3
28 25 3
29 40 3
30 48 3
31 65 3
32 75 3
33 82 3
34 97 3
35 101 3
36 119 3
37 129 3
38 143 3
39 157 3 -- Sum of the quarter 3 values
40 21 4 -- New quarter: shows the current week's value
41 40 4 -- (21+19)
You can use this.
SELECT
DATEPART(ISO_WEEK,[exportdate]) as 'exportdate'
, SUM( count(exportdate) ) OVER ( PARTITION BY DATEPART(QUARTER,MIN([exportdate])) ORDER BY DATEPART(ISO_WEEK,[exportdate]) ROWS UNBOUNDED PRECEDING ) as 'totalExports'
, DATEPART(QUARTER,MIN([exportdate])) [Quarter]
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;
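The running total that restarts at each quarter can be checked against the weekly counts from the question. This is a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), starts from the already aggregated weekly counts in a hypothetical weekly_exports table instead of the raw ExportTable rows, and assigns weeks 27 to 39 to quarter 3 and weeks 40 and 41 to quarter 4, as the expected output does.

```python
import sqlite3

# Weekly counts copied from the question's first result set.
weekly_counts = [
    (27, 13), (28, 12), (29, 15), (30, 8), (31, 17), (32, 10), (33, 7), (34, 15),
    (35, 4), (36, 18), (37, 10), (38, 14), (39, 14), (40, 21), (41, 19),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weekly_exports (week INTEGER, quarter INTEGER, cnt INTEGER)")
conn.executemany(
    "INSERT INTO weekly_exports VALUES (?, ?, ?)",
    # Assumption taken from the expected output: week 40 opens quarter 4.
    [(week, 3 if week <= 39 else 4, cnt) for week, cnt in weekly_counts],
)

# PARTITION BY quarter restarts the sum at each quarter;
# ORDER BY week with ROWS UNBOUNDED PRECEDING makes it cumulative.
running = conn.execute(
    """
    SELECT week,
           SUM(cnt) OVER (
               PARTITION BY quarter
               ORDER BY week
               ROWS UNBOUNDED PRECEDING
           ) AS total_exports,
           quarter
    FROM weekly_exports
    ORDER BY week
    """
).fetchall()
for row in running:
    print(row)
```

Week 39 comes out as 157 and week 41 as 40, matching the totals requested in the question.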
You could use a CASE expression to separate the dates into quarters, e.g.
CASE
WHEN EXPORT_DATE BETWEEN '1' AND '4' THEN 1
WHEN Export_Date BETWEEN '5' and '9' THEN 2
ELSE 0
END AS [Quarter]
It's just an example, but you get the idea.
You could then use the alias from the case
SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate',
count(exportdate) as 'totalExports',
DATEPART(quarter,[exportdate]) as quarter
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate]), DATEPART(quarter,[exportdate])
order by exportdate;

T-SQL: group by day, but show the full date in the query result

I want to show the full date field, but I can't get the grouping to work.
My Query:
SELECT DAY(T1.UI_CreateDate) AS DATEDAY, SUM(1) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1 WHERE T1.UI_BR_BO_ID = 45
GROUP BY DAY(T1.UI_CreateDate)
Result:
DATEDAY TOTALCOUNT
----------- -----------
15 186
9 1
3 2
26 481
21 297
27 342
18 18
30 14
4 183
25 553
13 8
22 469
16 1
17 28
20 331
28 90
14 33
8 1
But I want to show the full date...
Example result:
DATEDAY TOTALCOUNT
----------- -----------
15/06/2015 186
9/06/2015 1
3/06/2015 2
26/06/2015 481
21/06/2015 297
27/06/2015 342
18/06/2015 18
30/06/2015 14
4/06/2015 183
25/06/2015 553
13/06/2015 8
22/06/2015 469
16/06/2015 1
17/06/2015 28
20/06/2015 331
28/06/2015 90
14/06/2015 33
8/06/2015 1
I can't get this kind of result. How can I do it?
Thanks!
How about just casting to date to remove any time component:
SELECT CAST(T1.UI_CreateDate as DATE) AS DATEDAY, COUNT(*) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1
WHERE T1.UI_BR_BO_ID = 45
GROUP BY CAST(T1.UI_CreateDate as DATE)
ORDER BY DATEDAY;
SUM(1) for calculating the count does work. However, because SQL has the COUNT(*) function, it seems a bit awkward.
So you can group by DAY(T1.UI_CreateDate) or by the full date, but the two are different: the dates '2015-04-15' and '2015-12-15' both yield a DAY value of 15, so grouping by DAY merges them.
Assuming you want to group on DAY rather than on the date, try the version below:
SELECT DISTINCT
T1.UI_CreateDate as DATEDAY,
count(1) over (PARTITION BY DAY(T1.UI_CreateDate) ) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1 WHERE T1.UI_BR_BO_ID = 45
sql fiddle for demo: http://sqlfiddle.com/#!6/c3337/1
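The difference between grouping by the calendar date and grouping by the day of the month is easy to see on a few rows. The sketch below is illustrative only: it assumes SQLite through Python's sqlite3 module, where date(...) plays the role of CAST(... AS DATE) and strftime('%d', ...) the role of DAY(...), and the impressions table with its created column is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE impressions (created TEXT)")
conn.executemany(
    "INSERT INTO impressions VALUES (?)",
    [
        ("2015-04-15 09:00:00",),
        ("2015-04-15 17:30:00",),
        ("2015-12-15 08:00:00",),
        ("2015-06-03 12:00:00",),
    ],
)

# Grouping by the date part: one row per calendar date.
by_date = conn.execute(
    """
    SELECT date(created) AS dateday, COUNT(*) AS totalcount
    FROM impressions
    GROUP BY date(created)
    ORDER BY dateday
    """
).fetchall()

# Grouping by day of month: April 15 and December 15 land in the same group.
by_day = conn.execute(
    """
    SELECT CAST(strftime('%d', created) AS INTEGER) AS dateday, COUNT(*) AS totalcount
    FROM impressions
    GROUP BY strftime('%d', created)
    ORDER BY dateday
    """
).fetchall()

print(by_date)
print(by_day)
```

With the date grouping, the two rows from 15 April stay separate from the row on 15 December; with the day-of-month grouping, all three are counted together under 15.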

Get SUM for each combination of values from two tables

I have two tables:
1. #Forecast_Premiums
Syndicate_Key Durg_Key Currency_Key Year_Of_Account Forecast_Premium CUML_EPI_Amount
NULL NULL NULL UNKNOWN 0 6
3 54 46 2000 109105 0
3 54 46 2001 128645 128646
5 47 80 2002 117829 6333
6 47 80 2002 125471 NULL
6 60 80 2003 82371 82371
10 98 215 2006 2093825 77888
10 98 215 2007 11111938 4523645
2.#Forecast_Claims
Syndicate_Key Durg_Key Currency_Key Year_Of_Account Contract_Ref Forecast_Claims Ultimate_Profit_Comission
NULL NULL NULL UNKNOWN UNKNOWN 0 -45
5 47 80 2002 AB00ZZ021M12 -9991203 NULL
5 47 80 2002 AB00ZZ021M13 -4522 -74412
9 60 215 2006 AC04ZZ021M13 -2340299 -895562
10 98 46 2007 FAC0ZZ021M55 -2564123 -851298
The task:
Using the #Forecast_Premiums and #Forecast_Claims tables, write a query to find the total Pure Premium, Cumulative EPI Amount, Forecast_Claims and Ultimate_Profit_Comission received for each combination of Syndicate_Key, Durg_Key, Currency_Key and Year_Of_Account.
Note: if a key is NULL, set it to 'UNKNOWN'; if an amount is NULL, set it to 0.
My solution:
SELECT
ISNULL(CAST(FP.Syndicate_key AS VARCHAR(20)), 'UNKNOWN') AS 'Syndicate_key',
ISNULL(CAST(FP.Durg_Key AS VARCHAR(20)), 'UNKNOWN') AS 'Durg_Key',
ISNULL(CAST(FP.Currency_Key AS VARCHAR(20)), 'UNKNOWN') AS 'Currency_Key',
fp.Year_Of_Account,
SUM(ISNULL(FP.Forecast_Premium,0)) AS 'Pure_Premium',
SUM(ISNULL(FP.CUML_EPI_Amount,0)) AS 'Cuml_Amount',
SUM(ISNULL(dc.Forecast_Claims,0)) AS 'Total_Claims',
SUM(ISNULL(dc.Ultimate_Profit_Comission,0)) AS 'Total_Comission'
FROM #FORECAST_PREMIUMS fp
left join #FORECAST_Claims dc
ON
(FP.Year_Of_Account = dc.Year_Of_Account AND
FP.Syndicate_Key = dc.Syndicate_Key AND
FP.Currency_Key = dc.Currency_Key AND
FP.Year_Of_Account = dc.Year_Of_Account)
GROUP BY fp.Syndicate_Key, fp.Durg_Key,fp.Currency_Key,fp.Year_Of_Account
Issue:
It returns the Forecast_Claims SUM and Ultimate_Profit_Comission SUM only for one combination of keys and year: 5 47 80 2002.
Moreover, it returns 8 rows when it should have returned 10.
Eight result rows is correct, because there are eight distinct combinations of Syndicate_Key, Durg_Key, Currency_Key and Year_Of_Account in #Forecast_Premiums.
The Forecast_Claims SUM is also correct: 5 47 80 2002 is the only combination that has a match in #Forecast_Claims.
One open point: are you supposed to match the two NULL records with each other? Your join doesn't do this, because NULL = NULL is never true (only NULL IS NULL is true). You would have to do something like
(
(FP.Year_Of_Account = dc.Year_Of_Account)
OR
(FP.Year_Of_Account is null AND dc.Year_Of_Account is null)
) AND ...
to get these records match. Or:
ISNULL(FP.Year_Of_Account, -1) = ISNULL(dc.Year_Of_Account, -1) AND ...
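The effect of NULL keys on the join can be reproduced in a few lines. This is a reduced sketch, not the full query: it assumes SQLite through Python's sqlite3 module, keeps only one key column, and uses hypothetical table names premiums and claims with a single amount column whose values are borrowed from the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE premiums (syndicate_key INTEGER)")
conn.execute("CREATE TABLE claims (syndicate_key INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO premiums VALUES (?)", [(None,), (3,), (5,)])
conn.executemany("INSERT INTO claims VALUES (?, ?)", [(None, -45), (5, -74412)])

# Plain equality: the NULL key never matches, so its claim amount is lost.
plain = dict(
    conn.execute(
        """
        SELECT p.syndicate_key, COALESCE(SUM(c.amount), 0)
        FROM premiums p
        LEFT JOIN claims c ON p.syndicate_key = c.syndicate_key
        GROUP BY p.syndicate_key
        """
    ).fetchall()
)

# Equality OR both sides NULL: the two NULL-keyed rows are matched as well.
null_safe = dict(
    conn.execute(
        """
        SELECT p.syndicate_key, COALESCE(SUM(c.amount), 0)
        FROM premiums p
        LEFT JOIN claims c
          ON p.syndicate_key = c.syndicate_key
          OR (p.syndicate_key IS NULL AND c.syndicate_key IS NULL)
        GROUP BY p.syndicate_key
        """
    ).fetchall()
)

print(plain)
print(null_safe)
```

Only the NULL-keyed group changes between the two results: it reports 0 with the plain equality join and -45 once the join condition treats two NULLs as a match.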