Get SUM for each combination of values from two tables - sql

I have two tables:
1. #Forecast_Premiums
Syndicate_Key Durg_Key Currency_Key Year_Of_Account Forecast_Premium CUML_EPI_Amount
NULL NULL NULL UNKNOWN 0 6
3 54 46 2000 109105 0
3 54 46 2001 128645 128646
5 47 80 2002 117829 6333
6 47 80 2002 125471 NULL
6 60 80 2003 82371 82371
10 98 215 2006 2093825 77888
10 98 215 2007 11111938 4523645
2. #Forecast_Claims
Syndicate_Key Durg_Key Currency_Key Year_Of_Account Contract_Ref Forecast_Claims Ultimate_Profit_Comission
NULL NULL NULL UNKNOWN UNKNOWN 0 -45
5 47 80 2002 AB00ZZ021M12 -9991203 NULL
5 47 80 2002 AB00ZZ021M13 -4522 -74412
9 60 215 2006 AC04ZZ021M13 -2340299 -895562
10 98 46 2007 FAC0ZZ021M55 -2564123 -851298
The task:
Using the #Forecast_Premiums and #Forecast_Claims tables, write a query to find the total amount of Pure Premium, Cumulative EPI Amount, Forecast_Claims and Ultimate_Profit_Comission received for each combination of Syndicate_Key, Durg_Key, Currency_Key and Year_Of_Account.
Note: If a key is NULL, set it to 'UNKNOWN'; if an amount is NULL, set it to 0.
My solution:
SELECT
ISNULL(CAST(FP.Syndicate_key AS VARCHAR(20)), 'UNKNOWN') AS 'Syndicate_key',
ISNULL(CAST(FP.Durg_Key AS VARCHAR(20)), 'UNKNOWN') AS 'Durg_Key',
ISNULL(CAST(FP.Currency_Key AS VARCHAR(20)), 'UNKNOWN') AS 'Currency_Key',
fp.Year_Of_Account,
SUM(ISNULL(FP.Forecast_Premium,0)) AS 'Pure_Premium',
SUM(ISNULL(FP.CUML_EPI_Amount,0)) AS 'Cuml_Amount',
SUM(ISNULL(dc.Forecast_Claims,0)) AS 'Total_Claims',
SUM(ISNULL(dc.Ultimate_Profit_Comission,0)) AS 'Total_Comission'
FROM #FORECAST_PREMIUMS fp
left join #FORECAST_Claims dc
ON
(FP.Year_Of_Account = dc.Year_Of_Account AND
FP.Syndicate_Key = dc.Syndicate_Key AND
FP.Currency_Key = dc.Currency_Key AND
FP.Durg_Key = dc.Durg_Key)
GROUP BY fp.Syndicate_Key, fp.Durg_Key,fp.Currency_Key,fp.Year_Of_Account
Issue:
It returns the Forecast_Claims SUM and Ultimate_Profit_Comission SUM only for one combination of keys and year: 5 47 80 2002.
Moreover, it returns 8 rows when it should have returned 10.

Eight result rows is correct, because there are eight distinct combinations of Syndicate_Key, Durg_Key, Currency_Key and Year_Of_Account in #Forecast_Premiums, and a LEFT JOIN grouped by the premium keys cannot produce more groups than that.
The Forecast_Claims SUM is also correct: 5 47 80 2002 is the only combination that has a match in #Forecast_Claims.
The only open question: are you supposed to match the two NULL records? Your query doesn't, because NULL = NULL is never true (only NULL IS NULL is true). You would have to write something like
(
(FP.Year_Of_Account = dc.Year_Of_Account)
OR
(FP.Year_Of_Account IS NULL AND dc.Year_Of_Account IS NULL)
) AND ...
to get these records to match. Or:
ISNULL(FP.Year_Of_Account, -1) = ISNULL(dc.Year_Of_Account, -1) AND ...
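To see the NULL-matching point in action, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table names `premiums` and `claims`, the reduced column set and the three-row sample are assumptions made for illustration; only the join conditions mirror the ones discussed above, and COALESCE stands in for T-SQL's ISNULL.

```python
import sqlite3

# In-memory SQLite database with a cut-down, hypothetical version of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE premiums (syndicate_key INTEGER, year_of_account TEXT, forecast_premium INTEGER);
CREATE TABLE claims (syndicate_key INTEGER, year_of_account TEXT, forecast_claims INTEGER);
INSERT INTO premiums VALUES (NULL, 'UNKNOWN', 0), (5, '2002', 117829), (6, '2002', 125471);
INSERT INTO claims VALUES (NULL, 'UNKNOWN', 0), (5, '2002', -9991203), (5, '2002', -4522);
""")

TEMPLATE = """
SELECT COALESCE(CAST(p.syndicate_key AS TEXT), 'UNKNOWN') AS key_label,
       COUNT(c.year_of_account) AS matched_claim_rows,
       COALESCE(SUM(c.forecast_claims), 0) AS total_claims
FROM premiums p
LEFT JOIN claims c
  ON {key_condition}
 AND p.year_of_account = c.year_of_account
GROUP BY p.syndicate_key, p.year_of_account
ORDER BY p.syndicate_key
"""

# Plain equality: a NULL key never equals another NULL key.
plain = "p.syndicate_key = c.syndicate_key"
# Null-safe variant: treat two NULL keys as a match.
null_safe = ("(p.syndicate_key = c.syndicate_key "
             "OR (p.syndicate_key IS NULL AND c.syndicate_key IS NULL))")

rows_plain = conn.execute(TEMPLATE.format(key_condition=plain)).fetchall()
rows_null_safe = conn.execute(TEMPLATE.format(key_condition=null_safe)).fetchall()

print(rows_plain)      # the UNKNOWN row reports 0 matched claim rows
print(rows_null_safe)  # the UNKNOWN row reports 1 matched claim row
```

The number of result rows is the same in both runs; only the match count for the NULL-key row changes, which is the behaviour the answer describes.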

Related

SQL Query group by Postcode multiple Sums

I have the following data:
ID Weight Postcode Year
1 23 56222 2022
2 24 56332 2022
3 50 56442 2022
4 22 62331 2022
5 80 72130 2022
I want to query it so that the result is grouped by postcode (by its first two digits, as in the table below) and split into weight ranges, with each cell holding just the count of entries:
Postcode/Weight 0-20 21-40 41-60 61-80 81-100
56 0 2 1 0 0
62 0 1 0 0 0
72 0 0 0 1 0
Is there any way to query this in SQL?
Try this one.
Query:
-- COUNT(DISTINCT ...) is needed because every additional LEFT JOIN
-- multiplies the joined rows and would inflate a plain COUNT.
SELECT
p.postcode,
COUNT(DISTINCT p20.id) as "0-20",
COUNT(DISTINCT p40.id) as "21-40",
COUNT(DISTINCT p60.id) as "41-60",
COUNT(DISTINCT p80.id) as "61-80",
COUNT(DISTINCT p100.id) as "81-100"
FROM packs p
LEFT JOIN packs p20 ON p20.postcode=p.postcode AND p20.weight <= 20
LEFT JOIN packs p40 ON p40.postcode=p.postcode AND p40.weight >= 21 AND p40.weight <= 40
LEFT JOIN packs p60 ON p60.postcode=p.postcode AND p60.weight >= 41 AND p60.weight <= 60
LEFT JOIN packs p80 ON p80.postcode=p.postcode AND p80.weight >= 61 AND p80.weight <= 80
LEFT JOIN packs p100 ON p100.postcode=p.postcode AND p100.weight >= 81 AND p100.weight <= 100
GROUP BY p.postcode;
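An alternative that avoids the self-joins altogether is conditional aggregation: one pass over the table, with one SUM(CASE ...) per weight range. The sketch below runs it in SQLite through Python's sqlite3 module. It assumes the table is called `packs` (as in the query above), that the postcode is stored as text, and that grouping is meant to be on the first two digits, which is what the expected output in the question suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE packs (id INTEGER, weight INTEGER, postcode TEXT, year INTEGER);
INSERT INTO packs VALUES
  (1, 23, '56222', 2022),
  (2, 24, '56332', 2022),
  (3, 50, '56442', 2022),
  (4, 22, '62331', 2022),
  (5, 80, '72130', 2022);
""")

# One SUM(CASE ...) per bucket; each row contributes 1 to exactly one bucket.
query = """
SELECT substr(postcode, 1, 2) AS prefix,
       SUM(CASE WHEN weight BETWEEN 0  AND 20  THEN 1 ELSE 0 END) AS w_0_20,
       SUM(CASE WHEN weight BETWEEN 21 AND 40  THEN 1 ELSE 0 END) AS w_21_40,
       SUM(CASE WHEN weight BETWEEN 41 AND 60  THEN 1 ELSE 0 END) AS w_41_60,
       SUM(CASE WHEN weight BETWEEN 61 AND 80  THEN 1 ELSE 0 END) AS w_61_80,
       SUM(CASE WHEN weight BETWEEN 81 AND 100 THEN 1 ELSE 0 END) AS w_81_100
FROM packs
GROUP BY substr(postcode, 1, 2)
ORDER BY prefix
"""

buckets = conn.execute(query).fetchall()
for row in buckets:
    print(row)
```

On the five sample rows this reproduces the table from the question. In SQL Server the prefix would be taken with LEFT(postcode, 2) instead of substr.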

proc sql statement to sum on values/rows that match a condition

I have a data table like below:
Table 1:
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT
10 111 2009 . 100 .
110 120 2009 9 10 .
231 120 2009 0 20 10
222 120 2010 0 40 20
221 222 2009 102 10 30
321 222 2009 0 30 20
213 222 2009 0 10 20
432 321 2009 99 10 0
211 432 2009 111 20 10
212 432 2009 0 20 0
I want to sum over the DAYSBETVISIT column only when the pidDifference value is 0 for each PERSONID. So I wrote the following proc sql statement.
proc sql;
create table table5 as
(
select rowid, YEAR, PERSONID, pidDifference, TIMETOEVENT, DAYSBETVISIT,
SUM(CASE WHEN PIDDifference = 0 THEN DaysBetVisit ELSE 0 END)
from WORK.Table4_1
group by PERSONID,TIMETOEVENT, YEAR
);
quit;
However, the result did not sum the DAYSBETVISIT values over the rows where PIDDifference = 0 within the same PERSONID. It just output the value already present in DAYSBETVISIT for that specific row.
The column that I need (sumdays) but don't get with the above statement (the column the statement actually produces is shown as OUT):
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT sumdays OUT
10 111 2009 . 100 . 0 0
110 120 2009 9 10 . 0 0
231 120 2009 0 20 10 30 10
222 120 2010 0 40 20 30 20
221 222 2009 102 10 30 0 0
321 222 2009 0 30 20 40 20
213 222 2009 0 10 20 40 20
432 321 2009 99 10 0 0 0
211 432 2009 111 20 10 0 0
212 432 2009 0 20 0 0 0
I do not know what I am doing wrong.
I am using SAS EG Version 7.15, Base SAS version 9.4.
For your example data it looks like you just need two CASE expressions: one to define which values to SUM(), and another to define whether to report the SUM or not.
proc sql ;
select personid, piddifference, daysbetvisit, sumdays
, case when piddifference = 0
then sum(case when piddifference=0 then daysbetvisit else 0 end)
else 0 end as WANT
from expect
group by personid
;
quit;
Results
PERSONID pidDifference DAYSBETVISIT sumdays WANT
--------------------------------------------------------
111 . . 0 0
120 0 10 30 30
120 0 20 30 30
120 9 . 0 0
222 0 20 40 40
222 0 20 40 40
222 102 30 0 0
321 99 0 0 0
432 0 0 0 0
432 111 10 0 0
SAS proc sql doesn't support window functions. I find the re-merging aggregations to be a bit difficult to use, except in the obvious cases. So, use a subquery or join and group by:
proc sql;
create table table5 as
select t.rowid, t.YEAR, t.PERSONID, t.pidDifference, t.TIMETOEVENT, t.DAYSBETVISIT,
tt.sum_DaysBetVisit
from WORK.Table4_1 t left join
(select personid, sum(DaysBetVisit) as sum_DaysBetVisit
from WORK.Table4_1
group by personid
having min(pidDifference) = max(pidDifference) and min(pidDifference) = 0
) tt
on tt.personid = t.personid;
quit;
Note: This doesn't handle NULL values for pidDifference. If that is a concern, you can add count(pidDifference) = count(*) to the having clause.
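For anyone who wants to check the expected sumdays column outside SAS, here is a rough sketch of the join-to-a-grouped-subquery idea in SQLite via Python's sqlite3. It is not proc sql: the table name `visits` and the column `row_id` are made up for the example, the SAS missing value (.) is stored as NULL, and the subquery filters on pidDifference = 0 so that it follows the sumdays column from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE visits (row_id INTEGER, personid INTEGER, year INTEGER,
                     piddifference INTEGER, timetoevent INTEGER, daysbetvisit INTEGER)
""")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?, ?)",
    [
        (10, 111, 2009, None, 100, None),
        (110, 120, 2009, 9, 10, None),
        (231, 120, 2009, 0, 20, 10),
        (222, 120, 2010, 0, 40, 20),
        (221, 222, 2009, 102, 10, 30),
        (321, 222, 2009, 0, 30, 20),
        (213, 222, 2009, 0, 10, 20),
        (432, 321, 2009, 99, 10, 0),
        (211, 432, 2009, 111, 20, 10),
        (212, 432, 2009, 0, 20, 0),
    ],
)

# Per-person total over the rows with piddifference = 0, joined back to every row;
# the outer CASE reports the total only on rows that themselves have piddifference = 0.
query = """
SELECT t.row_id,
       CASE WHEN t.piddifference = 0 THEN COALESCE(s.sum_days, 0) ELSE 0 END AS sumdays
FROM visits t
LEFT JOIN (
    SELECT personid, SUM(daysbetvisit) AS sum_days
    FROM visits
    WHERE piddifference = 0
    GROUP BY personid
) s ON s.personid = t.personid
ORDER BY t.row_id
"""

sumdays_by_row = dict(conn.execute(query).fetchall())
print(sumdays_by_row)
```

Rows 231 and 222 come out as 30 and rows 321 and 213 as 40, with 0 everywhere else, which matches the sumdays column the question asks for.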

Identifying unicode character in nvarchar column in SQL Server

I have a table called airports in a SQL Server database, with a column declared as nvarchar(255). I had to declare it as nvarchar otherwise SSIS failed to import the data from a .csv file generated by an API.
I have approximately 25k records in this table, of which, from what I can tell, 763 have Unicode characters in them. I found these by running this query:
select cast(name as varchar), name
from airports
where cast(name as varchar) <> name
The first row shows the following two values returned in column 1 and 2
Harrisburg Capital City Airpor
Harrisburg Capital City Airport
The first value from column 1 has had the last t stripped off it, which I assume means there is one unicode character in the string. Please let me know if I am wrong, as I am a bit useless with unicode characters.
My question is: how can I find the unicode characters in the column, and is there a safe / recommended way to remove them?
I did try this to see if I could find it, but it didn't do what I thought it would do.
set nocount on
DECLARE @nstring NVARCHAR(100)
SET @nstring =(select name from airports where fs = 'HAR')
DECLARE @position INT
SET @position = 1
DECLARE @CharList TABLE (Position INT,UnicodeChar NVARCHAR(1),UnicodeValue INT)
WHILE @position <= DATALENGTH(@nstring)
BEGIN
INSERT @CharList
SELECT
@position as Position,
CONVERT(nchar(1),SUBSTRING(@nstring, @position, 1)) as UnicodeChar,
UNICODE(SUBSTRING(@nstring, @position, 1)) as UnicodeValue
SET @position = @position + 1
END
SELECT *
FROM @CharList
ORDER BY unicodevalue
The output is as follows
32 NULL
33 NULL
34 NULL
35 NULL
36 NULL
37 NULL
38 NULL
39 NULL
40 NULL
41 NULL
42 NULL
43 NULL
44 NULL
45 NULL
46 NULL
47 NULL
48 NULL
49 NULL
50 NULL
51 NULL
52 NULL
53 NULL
54 NULL
55 NULL
56 NULL
57 NULL
58 NULL
59 NULL
60 NULL
61 NULL
62 NULL
11 32
19 32
24 32
25 A 65
20 C 67
12 C 67
1 H 72
2 a 97
13 a 97
17 a 97
7 b 98
10 g 103
15 i 105
5 i 105
21 i 105
26 i 105
18 l 108
29 o 111
28 p 112
14 p 112
9 r 114
3 r 114
4 r 114
30 r 114
27 r 114
6 s 115
16 t 116
22 t 116
31 t 116
8 u 117
23 y 121
However, if you first want to find the records that contain Unicode characters, you can use the approach below, which relies on a CASE expression:
;WITH CTE
AS (
SELECT DATA,
CASE
WHEN(CAST(DATA AS VARCHAR(MAX)) COLLATE SQL_Latin1_General_Cp1251_CS_AS) = DATA
THEN 0
ELSE 1
END HasUnicodeChars,
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) RN
FROM <table_name>)
SELECT * FROM CTE where HasUnicodeChars = 1
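A side note on the Harrisburg row from the question: in SQL Server, CAST(x AS varchar) without an explicit length defaults to 30 characters, and "Harrisburg Capital City Airport" is 31 characters long, so the missing final t looks like plain truncation rather than evidence of a Unicode character. The loop output in the question points the same way: every character listed has a code below 128, and the NULL rows for positions 32 to 62 come from DATALENGTH counting bytes (two per nvarchar character) instead of characters. The small Python sketch below illustrates both checks outside the database. The helper name `non_ascii_chars` and the second sample string are made up for the example, and "non-ASCII" is a stricter test than "not representable in varchar", since a Latin1 code page can hold characters up to 255.

```python
def non_ascii_chars(text):
    """Return (position, character, code point) for each character outside 7-bit ASCII.

    Positions are 1-based to line up with T-SQL's SUBSTRING.
    """
    return [(i, ch, ord(ch)) for i, ch in enumerate(text, start=1) if ord(ch) > 127]


name = "Harrisburg Capital City Airport"

# 31 characters: one more than the default length of an unsized varchar cast.
name_length = len(name)
truncated = name[:30]

print(name_length)
print(truncated)
print(non_ascii_chars(name))              # no non-ASCII characters in this name
print(non_ascii_chars("Zürich Airport"))  # hypothetical sample with one non-ASCII character
```

If truncation is the cause, comparing against CAST(name AS varchar(255)) in the original query should shrink the list of 763 rows to the ones that really contain characters outside the varchar code page.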

group by column not having specific value

I am trying to obtain a list of Case_Id's where the case does not contain a specific RoleId using Microsoft Sql Server 2012.
For example, I would like to obtain a collection of Case_Id's that do not contain a RoleId of 4.
So from the data set below the query would exclude Case_Id's 49, 50, and 53.
Id RoleId Person_Id Case_Id
--------------------------------------
108 4 108 49
109 1 109 49
110 4 110 50
111 1 111 50
112 1 112 51
113 2 113 52
114 1 114 52
115 7 115 53
116 4 116 53
117 3 117 53
So far I have tried the following
SELECT Case_Id
FROM [dbo].[caseRole] cr
WHERE cr.RoleId!=4
GROUP BY Case_Id ORDER BY Case_Id
The not exists operator seems to fit your need exactly:
SELECT DISTINCT Case_Id
FROM [dbo].[caseRole] cr
WHERE NOT EXISTS (SELECT *
FROM [dbo].[caseRole] cr_inner
WHERE cr_inner.Case_Id = cr.case_id
AND cr_inner.RoleId = 4);
Just add a having clause instead of where:
SELECT Case_Id
FROM [dbo].[caseRole] cr
GROUP BY Case_Id
HAVING SUM(case when cr.RoleId = 4 then 1 else 0 end) = 0
ORDER BY Case_Id;
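To compare the approaches on the sample data, here is a small sketch in SQLite via Python's sqlite3. The table is named `case_role` with snake_case columns as an assumption for the example; the three queries otherwise follow the WHERE attempt from the question and the NOT EXISTS and HAVING answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE case_role (id INTEGER, role_id INTEGER, person_id INTEGER, case_id INTEGER)")
conn.executemany(
    "INSERT INTO case_role VALUES (?, ?, ?, ?)",
    [
        (108, 4, 108, 49), (109, 1, 109, 49),
        (110, 4, 110, 50), (111, 1, 111, 50),
        (112, 1, 112, 51),
        (113, 2, 113, 52), (114, 1, 114, 52),
        (115, 7, 115, 53), (116, 4, 116, 53), (117, 3, 117, 53),
    ],
)


def case_ids(sql):
    """Run a query that returns a single case_id column and flatten the result."""
    return [row[0] for row in conn.execute(sql)]


# The WHERE filter drops individual rows, not whole cases, so cases that
# also have other roles still show up.
where_only = case_ids("""
    SELECT case_id FROM case_role WHERE role_id != 4
    GROUP BY case_id ORDER BY case_id
""")

# NOT EXISTS: keep a case only if no row of that case has role 4.
not_exists = case_ids("""
    SELECT DISTINCT case_id FROM case_role cr
    WHERE NOT EXISTS (SELECT 1 FROM case_role cr_inner
                      WHERE cr_inner.case_id = cr.case_id AND cr_inner.role_id = 4)
    ORDER BY case_id
""")

# HAVING: count the role-4 rows per case and keep the cases with none.
having = case_ids("""
    SELECT case_id FROM case_role
    GROUP BY case_id
    HAVING SUM(CASE WHEN role_id = 4 THEN 1 ELSE 0 END) = 0
    ORDER BY case_id
""")

print(where_only, not_exists, having)
```

The WHERE version still lists 49, 50 and 53, while both answers return only 51 and 52.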

SQL Pivot Table isn't working

SQL Server 2005
I have a temp table:
Year PercentMale PercentFemale PercentHmlss PercentEmployed TotalSrvd
2008 100 0 0 100 1
2009 55 40 0 80 20
2010 64 35 0 67 162
2011 69 27 0 34 285
2012 56 43 10 1 58
and I want to create a query to display the data like this:
2008 2009 2010 2011 2012
PercentMale 100 55 64 69 56
PercentFemale - 40 35 27 43
PercentHmlss - - - - 10
PercentEmployed 100 80 67 34 1
TotalSrvd 1 20 162 285 58
Can I use a pivot table to accomplish this? If so, how? I've tried using a pivot but have found no success.
select PercentHmlss,PercentMale,Percentfemale,
PercentEmployed,[2008],[2009],[2010],[2011],[2012] from
(select PercentHmlss,PercentMale, Percentfemale, PercentEmployed,
TotalSrvd,year from #TempTable)as T
pivot (sum (TotalSrvd) for year
in ([2008],[2009],[2010],[2011],[2012])) as pvt
This is the result:
PercentHmlss PercentMale Percentfemale PercentEmployed [2008] [2009] [2010] [2011] [2012]
0 55 40 80 NULL 20 NULL NULL NULL
0 64 35 67 NULL NULL 162 NULL NULL
0 69 27 34 NULL NULL NULL 285 NULL
0 100 0 100 1 NULL NULL NULL NULL
10 56 43 1 NULL NULL NULL NULL 58
Thanks.
For this to work you will want to perform an UNPIVOT and then a PIVOT
SELECT *
from
(
select year, quantity, type
from
(
select year, percentmale, percentfemale, percenthmlss, percentemployed, totalsrvd
from t
) x
UNPIVOT
(
quantity for type
in
([percentmale]
, [percentfemale]
, [percenthmlss]
, [percentemployed]
, [totalsrvd])
) u
) x1
pivot
(
sum(quantity)
for Year in ([2008], [2009], [2010], [2011], [2012])
) p
Edit Further explanation:
You were close with the PIVOT query you tried, in that you got the data for Year into the column format you wanted. However, since you want the values that were originally in the columns percentmale, percentfemale, etc. to appear as rows, you need to unpivot the data first.
Basically, you take the original data and place it all in rows based on the year. The UNPIVOT puts your data into this format:
Year Quantity Type
2008 100 percentmale
2008 0 percentfemale
etc
Once you have transformed the data into this format, then you can perform the PIVOT to get the result you want.
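The same two-step idea can be tried on engines without PIVOT/UNPIVOT. The sketch below uses SQLite through Python's sqlite3 and swaps in generic SQL for both steps: a UNION ALL per source column plays the role of UNPIVOT, and SUM(CASE ...) per year plays the role of PIVOT. The table name `t` follows the answer; the helper columns `sort_key`, `metric` and `quantity` are names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE t (Year INTEGER, PercentMale INTEGER, PercentFemale INTEGER,
                PercentHmlss INTEGER, PercentEmployed INTEGER, TotalSrvd INTEGER)
""")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (2008, 100, 0, 0, 100, 1),
        (2009, 55, 40, 0, 80, 20),
        (2010, 64, 35, 0, 67, 162),
        (2011, 69, 27, 0, 34, 285),
        (2012, 56, 43, 10, 1, 58),
    ],
)

columns = ["PercentMale", "PercentFemale", "PercentHmlss", "PercentEmployed", "TotalSrvd"]
years = [2008, 2009, 2010, 2011, 2012]

# Step 1 (unpivot): one SELECT per source column, stacked into (Year, metric, quantity) rows.
unpivot_sql = " UNION ALL ".join(
    f"SELECT Year, {i} AS sort_key, '{c}' AS metric, {c} AS quantity FROM t"
    for i, c in enumerate(columns)
)

# Step 2 (pivot): one conditional SUM per year turns the Year values back into columns.
pivot_cols = ", ".join(
    f'SUM(CASE WHEN Year = {y} THEN quantity END) AS "{y}"' for y in years
)

query = f"""
SELECT metric, {pivot_cols}
FROM ({unpivot_sql}) AS u
GROUP BY sort_key, metric
ORDER BY sort_key
"""

pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Each output row is one of the original column names followed by its value for 2008 through 2012. The zeros from the source table stay zeros here; showing them as "-" as in the question would be a formatting step on top.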