Remove Duplicates while Merging values - sql

How can I remove duplicates and merge Account Types?
I have a call log that reports duplicate phones based on Account Type.
For example:
Telephone | Account Type
304-555-6666 | R
304-555-6666 | C
I know how to remove duplicate telephones using RANK/MAXCOUNT.
But before removing duplicates, I need to reset the Account Type to “B” if the duplicates have multiple account types.
In the example the surviving duplicate would be:
Telephone | Account Type
304-555-6666 | B
Note that duplicate phones are not guaranteed to have multiple Account Types.
Example:
Telephone | Account Type
999-888-6666 | R
999-888-6666 | R
Therefore the surviving duplicate should be:
Telephone | Account Type
999-888-6666 | R
How can I remove duplicates and reset the account type at the same time?
--
-- Remove Duplicate Recordings
--
SELECT * FROM (
SELECT i.dateofcall ,
i.recordingfile ,
i.telephone ,
s.accounttype ,
ROW_NUMBER() OVER (PARTITION BY i.telephone ORDER BY i.dateofcall DESC) AS 'RANK' ,
COUNT(i.telephone) OVER (PARTITION BY i.telephone) AS 'MAXCOUNT'
FROM #myactions i
LEFT JOIN #myphone s ON s.interactionID = i.Interactionid
) x
WHERE [RANK] = [MAXCOUNT]

SELECT * FROM (
SELECT i.dateofcall ,
i.recordingfile ,
i.telephone ,
s.accounttype ,
ROW_NUMBER() OVER (PARTITION BY i.telephone ORDER BY i.dateofcall DESC) AS 'RANK' ,
COUNT(i.telephone) OVER (PARTITION BY i.telephone) AS 'MAXCOUNT',
DENSE_RANK() OVER ( PARTITION BY i.telephone ORDER BY s.accounttype DESC ) AS 'ContPhone'
FROM #myactions i
LEFT JOIN #myphone s ON s.interactionID = i.Interactionid
) x
WHERE [RANK] = [MAXCOUNT]

Try this?
select
x.dateofcall
, x.recordingfile
, x.telephone
, case when count(*) > 1 then 'B' else max(x.accounttype) end accounttype
from
(
select
i.dateofcall
, i.recordingfile
, i.telephone
, s.accounttype
from
#myactions i
LEFT JOIN #myphone s ON s.interactionID = i.Interactionid
group by
i.dateofcall
, i.recordingfile
, i.telephone
, s.accounttype
) x
group by
x.dateofcall
, x.recordingfile
, x.telephone

Basically, you need to put your business check in a CASE expression in the outer query.
EDIT: I've also added the logic for B, R and C, and set up a SQL Fiddle: http://sqlfiddle.com/#!6/b5ef5/7
SELECT
x.dateofcall,
x.recordingfile,
x.telephone,
COALESCE(
CASE WHEN x.maxcount>1 AND value>x.maxcount AND value<(2*x.maxcount) THEN 'B' ELSE NULL END,
CASE WHEN x.maxcount>1 AND value= (2*x.maxcount) THEN 'C' ELSE NULL END,
CASE WHEN x.maxcount>1 AND value= x.maxcount THEN 'R' ELSE NULL END,
x.accounttype ) as accounttype,
x.rank,
x.maxcount
FROM (
SELECT i.dateofcall ,
i.recordingfile ,
i.telephone ,
s.accounttype ,
ROW_NUMBER() OVER (PARTITION BY i.telephone ORDER BY i.dateofcall DESC) AS 'RANK' ,
COUNT(i.telephone) OVER (PARTITION BY i.telephone) AS 'MAXCOUNT',
SUM(CASE WHEN s.accounttype LIKE 'R' THEN 1 ELSE 2 END) OVER (PARTITION BY i.telephone) as Value
FROM
myactions i LEFT JOIN myphone s
ON s.interactionID = i.Interactionid
) x
WHERE [RANK] = [MAXCOUNT]
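The merge-then-deduplicate idea above can be tried end to end with a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, a hypothetical single table named calls instead of the two temp tables, and made-up call dates. It also assumes the most recent call is the row to keep. Comparing MIN and MAX of the account type per phone detects whether more than one type is present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data modelled on the examples in the question.
    CREATE TABLE calls (telephone TEXT, accounttype TEXT, dateofcall TEXT);
    INSERT INTO calls VALUES
        ('304-555-6666', 'R', '2016-01-01'),
        ('304-555-6666', 'C', '2016-01-02'),
        ('999-888-6666', 'R', '2016-01-03'),
        ('999-888-6666', 'R', '2016-01-04');
""")

query = """
    SELECT telephone, merged_type, dateofcall
    FROM (
        SELECT telephone,
               dateofcall,
               -- Different MIN and MAX means several account types: use 'B'.
               CASE WHEN MIN(accounttype) OVER (PARTITION BY telephone)
                      <> MAX(accounttype) OVER (PARTITION BY telephone)
                    THEN 'B'
                    ELSE MAX(accounttype) OVER (PARTITION BY telephone)
               END AS merged_type,
               -- rn = 1 marks the most recent call for each phone.
               ROW_NUMBER() OVER (PARTITION BY telephone
                                  ORDER BY dateofcall DESC) AS rn
        FROM calls
    ) x
    WHERE rn = 1
    ORDER BY telephone
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the merged type is computed by window functions before the rn filter is applied, the surviving row already carries the merged value, so no second pass over the data is needed.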

Related

Check whether a column is null in a previous term, subject to some conditions

Say I have two terms: term A (previous) and term B (current). I need to check whether pol_cancel_dt is null in term A. Each row also has a transaction sequence number (trans_seq_num). I need to see whether pol_cancel_dt exists on the row with the greatest trans_seq_num in term A, and whether that greatest value is also greater than every trans_seq_num in term B. If it is, I want to check whether pol_cancel_dt exists and apply some logic.
WITH x AS (
SELECT * FROM (
SELECT
pol_num
,term_start_dt
,term_end_dt,pol_cancel_dt
,trans_seq_num
,future_cancel_dt
,DENSE_RANK() OVER (PARTITION BY pol_num ORDER BY term_end_dt DESC) AS flag
FROM `gcp-ent-datalake-preprod.trns_prop_pol_hs_horison.prop_cost`
--WHERE pol_num IN ('30766675','33896642')
-- pol_num = '33288489'
ORDER BY term_start_dt, term_end_dt DESC
)
)
SELECT
*
,CASE
WHEN prior_pol_cancel_dt IS NOT NULL AND current_trans_seq_num < prior_trans_seq_num THEN prior_pol_cancel_dt
ELSE current_pol_cancel_dt
END apply_cancelled_renewal_dt
FROM (
SELECT
MAX(a.pol_num) AS current_pol_num
,MAX(a.term_start_dt) AS current_term_start_dt
,a.term_end_dt AS current_term_ent_dt
,MAX(a.pol_cancel_dt) AS current_pol_cancel_dt
,MAX(a.trans_seq_num) AS current_trans_seq_num
,MAX(a.future_cancel_dt) AS current_future_cancel_dt
,MAX(a.flag) AS current_flag
,MAX(b.pol_num) AS prior_pol_num
,MAX(b.term_start_dt) AS prior_term_start_dt
,b.term_end_dt AS prior_term_end_dt
,MAX(b.pol_cancel_dt) AS prior_pol_cancel_dt
,MAX(b.trans_seq_num) AS prior_trans_seq_num
,MAX(b.future_cancel_dt) AS prior_future_cancel_dt
,MAX(b.flag) AS prior_flag
FROM (
SELECT * FROM x WHERE flag=1) a
INNER JOIN(
SELECT * FROM x WHERE flag = 2 ) b
ON a.pol_num = b.pol_num AND a.flag = b.flag - 1
WHERE a.pol_cancel_dt IS NOT NULL
AND b.pol_cancel_dt IS NOT NULL
AND greatest(a.trans_seq_num) < b.trans_seq_num
-- AND a.trans_seq_num = GREATEST(a.trans_seq_num)
-- AND b.trans_seq_num = GREATEST(b.trans_seq_num)
GROUP BY a.term_end_dt, b.term_end_dt
)
--WHERE a.term_start_dt < b.term_start_dt
--if prior term GREATEST (trans_sewq num
This logic is still not returning some of the results I expect. Note that trans_seq_num doesn't necessarily have to be exactly one less.
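One possible reading of the requirement can be sketched as follows. This is an illustration, not a confirmed fix: it assumes SQLite (via Python's sqlite3) instead of BigQuery, a hypothetical local table named prop_cost with made-up rows, and that "greatest trans_seq_num" means the latest transaction within each term. It picks that latest row per term with ROW_NUMBER, then compares the prior term's row with the current term's row using the same CASE logic as the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows: two policies, two terms each.
    CREATE TABLE prop_cost (
        pol_num TEXT, term_end_dt TEXT, trans_seq_num INTEGER, pol_cancel_dt TEXT
    );
    INSERT INTO prop_cost VALUES
        ('P1', '2020-12-31', 1, NULL),
        ('P1', '2020-12-31', 9, '2020-06-01'),  -- prior term, latest transaction
        ('P1', '2021-12-31', 3, NULL),
        ('P1', '2021-12-31', 5, '2021-03-01'),  -- current term, latest transaction
        ('P2', '2020-12-31', 2, '2020-05-05'),  -- prior term
        ('P2', '2021-12-31', 7, '2021-04-04');  -- current term
""")

query = """
    WITH ranked AS (
        SELECT pol_num, term_end_dt, trans_seq_num, pol_cancel_dt,
               -- flag 1 = current term, flag 2 = prior term
               DENSE_RANK() OVER (PARTITION BY pol_num
                                  ORDER BY term_end_dt DESC) AS flag,
               -- seq_rank 1 = row with the greatest trans_seq_num in its term
               ROW_NUMBER() OVER (PARTITION BY pol_num, term_end_dt
                                  ORDER BY trans_seq_num DESC) AS seq_rank
        FROM prop_cost
    )
    SELECT c.pol_num,
           CASE WHEN p.pol_cancel_dt IS NOT NULL
                 AND c.trans_seq_num < p.trans_seq_num
                THEN p.pol_cancel_dt
                ELSE c.pol_cancel_dt
           END AS apply_cancelled_renewal_dt
    FROM ranked AS c
    JOIN ranked AS p
      ON p.pol_num = c.pol_num
     AND p.flag = 2 AND p.seq_rank = 1
    WHERE c.flag = 1 AND c.seq_rank = 1
    ORDER BY c.pol_num
"""
result = conn.execute(query).fetchall()
print(result)
```

Selecting one row per term before joining avoids the GROUP BY with MAX() on every column, which can mix values from different rows of the same term.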

Why does a null value in the table cause an error when using the LEAD function?

I am getting this error: Error converting data type nvarchar to numeric.
I have data coming from a table and need only two values from it, keeping only the numeric rows (no alphanumeric values, hence ISNUMERIC(covrg_cd) = 1). The input data looks like the first picture. Row 1 will always be null; the other rows may or may not contain data. Because row 1 is always null, the LEAD function appears to throw the error above. The rate column is always nvarchar. I am using the LEAD function to get paybandfrom and paybandto from the Rate column of the input table, and ROW_NUMBER() to get the tier value.
Input table
Output must be like this:
My query looks like this:
SELECT a.payband , a.[from] as pybdnfrom, (RIGHT('00000000000000000000' + CAST(A.[TO] AS VARCHAR),20)) AS pybndto , a.tier
FROM (SELECT DISTINCT A.RATE as payband, A.RATE as [from], CASE WHEN TIER <> 4 THEN A.[TO] ELSE 100000000.000 END AS [to], ROW_NUMBER() OVER(ORDER BY RATE) AS TIER
FROM(SELECT DISTINCT A.RATE, LEAD(SUM((CONVERT(NUMERIC(20,3), (A.RATE)))-0.010)) OVER(ORDER BY A.RATE) AS [TO], ROW_NUMBER() OVER(ORDER BY A.RATE) AS TIER
FROM (SELECT DISTINCT BN_RATE_KEY02 as RATE, COVRG_CD AS COVERAGE
from #tmppsRateCost
WHERE ISNUMERIC(COVRG_CD) = 1 AND COVRG_CD = '1')A GROUP BY A.RATE)A)A
ORDER BY 1
Any help would be appreciated.
The error occurs because the empty string '' cannot be parsed as a number. It is not related to LEAD.
If you want to keep that approach, you can modify your query this way (I commented out the parts I replaced):
SELECT a.payband
,a.[from] AS pybdnfrom
--,(RIGHT('00000000000000000000' + CAST(A.[TO] AS VARCHAR), 20)) AS pybndto
,CASE WHEN payband = '' THEN '' ELSE (RIGHT('00000000000000000000' + CAST(A.[TO] AS VARCHAR), 20)) END AS pybndto
,a.tier
FROM (
SELECT DISTINCT A.RATE AS payband
,A.RATE AS [from]
,CASE
WHEN TIER <> 5
THEN A.[TO]
ELSE 100000000.000
END AS [to]
,ROW_NUMBER() OVER (
ORDER BY RATE
) AS TIER
FROM (
SELECT DISTINCT A.RATE
--,LEAD(SUM((CONVERT(NUMERIC(20, 3), (A.RATE))) - 0.010)) OVER (
,LEAD(SUM((CONVERT(NUMERIC(20, 3), (CASE WHEN A.RATE = '' THEN '0.010' ELSE A.RATE END))) - 0.010)) OVER (
ORDER BY A.RATE
) AS [TO]
,ROW_NUMBER() OVER (
ORDER BY A.RATE
) AS TIER
FROM (
SELECT DISTINCT BN_RATE_KEY02 AS RATE
,COVRG_CD AS COVERAGE
FROM #tmppsRateCost
WHERE ISNUMERIC(COVRG_CD) = 1
AND COVRG_CD = '1'
) A
GROUP BY A.RATE
) A
) A
ORDER BY 1
Anyway, I guess a cleaner approach would be simply to remove the empty row from the initial table.
Get rid of ISNUMERIC() and use TRY_CONVERT() instead of CONVERT(). TRY_CONVERT() returns NULL rather than raising an error when the conversion fails. In this condition:
WHERE ISNUMERIC(COVRG_CD) = 1 AND COVRG_CD = '1'
the ISNUMERIC() is unneeded because you already have an exact string comparison.
SELECT a.payband , a.[from] as pybdnfrom,
(RIGHT('00000000000000000000' + CAST(A.[TO] AS VARCHAR),20)) AS pybndto ,
a.tier
FROM (SELECT DISTINCT A.RATE as payband, A.RATE as [from],
CASE WHEN TIER <> 4 THEN A.[TO] ELSE 100000000.000 END AS [to],
ROW_NUMBER() OVER (ORDER BY RATE) AS TIER
FROM (SELECT DISTINCT A.RATE,
LEAD(SUM((TRY_CONVERT(NUMERIC(20,3), (A.RATE)))-0.010)) OVER (ORDER BY A.RATE) AS [TO],
ROW_NUMBER() OVER (ORDER BY A.RATE) AS TIER
FROM (SELECT DISTINCT BN_RATE_KEY02 as RATE, COVRG_CD AS COVERAGE
FROM #tmppsRateCost
WHERE COVRG_CD = '1'
)A
GROUP BY A.RATE
) A
) A
ORDER BY 1;
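Both answers rest on the same observation: the failure comes from converting the empty string, not from LEAD itself. The sketch below illustrates that in plain Python rather than T-SQL. The helper try_convert and the sample rates are hypothetical; the helper only mimics the "return NULL on failure" behaviour, and the list comprehension plays the role of LEAD by pairing each row with the next row's value.

```python
from decimal import Decimal, InvalidOperation


def try_convert(text):
    """Return Decimal(text), or None when text is not a valid number."""
    try:
        return Decimal(text)
    except (InvalidOperation, TypeError):
        return None


# Hypothetical rate values; the first row is empty, as in the question.
rates = ["", "10.000", "20.000", "30.000"]

# A strict conversion would raise on the empty string; this yields None instead.
converted = [try_convert(r) for r in rates]

# LEAD equivalent: each row sees the next row's value minus 0.010,
# and the last row has no successor, so it gets None.
band_to = [
    nxt - Decimal("0.010") if nxt is not None else None
    for nxt in converted[1:] + [None]
]

for rate, upper in zip(rates, band_to):
    print(repr(rate), upper)
```

The empty first row simply converts to None here and still receives an upper bound from the row after it, which is the behaviour the TRY_CONVERT() version of the query relies on.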

Finding Median in Sql Server

I want to get the median of UnitRate from the [dbo].[ReplaceCost_DirectCost_Details] view in Microsoft SQL Server Management Studio. I already have the min, max and average, but I do not know how to get the median. I tried the following code, but it did not return the median. Thanks in advance for your help.
select
JobName as JobName
,Client as Client
,AssetClass as AssetClass
,AssetType as AssetType
,AssetSubType as AssetSubType
,Component as Component
,ComponentType as ComponentType
,ComponentSubType as ComponentSubType
,UnitRate AS UnitRate
,Max(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MaxFinalUnitRate]
,Min(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MinFinalUnitRate]
,AVG(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MeanFinalUnitRate]
,AVG (UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType)as Median
from
(
Select top (10)
JobName as JobName
,Client as Client
,AssetClass as AssetClass
,AssetType as AssetType
,AssetSubType as AssetSubType
,Component as Component
,ComponentType as ComponentType
,ComponentSubType as ComponentSubType
,UnitRate AS UnitRate
,ROW_NUMBER () over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType order by UnitRate) as [RowNum]
,COUNT(*) OVER (PARTITION BY JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType ) AS RowCnt
from [dbo].[ReplaceCost_DirectCost_Details] rdd
where client = 'APV_Ballina_Shire_Council_Old' and UnitRate is not Null and UnitRate <> 0
) x
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
EDIT
SQL Fiddle
CREATE TABLE Table1
([somevalue] int)
;
INSERT INTO Table1
([somevalue])
VALUES
(141),
(325),
(325),
(353),
(3166),
(325),
(207),
(141),
(3166),
(161)
;
Query 1:
with cte as (
select *
, row_number() over(order by somevalue) as RowNum
, count(*) over() as RowCnt
from table1
)
select
*
from CTE
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
| somevalue | RowNum | RowCnt |
|-----------|--------|--------|
| 325 | 5 | 10 |
| 325 | 6 | 10 |
Please consider the following small example. There are 7 rows of data and the median is the "midpoint" of those, so the WHERE clause compares the row number with the row count and returns just that midpoint value. That value (67) represents the median of this small sample.
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Table1
([somevalue] int)
;
INSERT INTO Table1
([somevalue])
VALUES
(2),
(45),
(67),
(89),
(4567),
(6),
(1290)
;
Query 1:
with cte as (
select *
, row_number() over(order by somevalue) as RowNum
, count(*) over() as RowCnt
from table1
)
select
*
from CTE
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
Results:
| somevalue | RowNum | RowCnt |
|-----------|--------|--------|
| 67 | 4 | 7 |
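The two fiddles above can be reproduced locally with a short sketch. It assumes SQLite (via Python's sqlite3) rather than SQL Server, and wraps the CTE in a hypothetical helper named median. The only change to the query is the AVG() over the one or two middle rows, which turns the row filter into a single median value for both odd and even row counts.

```python
import sqlite3

MEDIAN_SQL = """
    WITH cte AS (
        SELECT somevalue,
               ROW_NUMBER() OVER (ORDER BY somevalue) AS RowNum,
               COUNT(*) OVER () AS RowCnt
        FROM Table1
    )
    -- Integer division picks one middle row for odd counts, two for even.
    SELECT AVG(somevalue)
    FROM cte
    WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
"""


def median(values):
    """Load values into a scratch table and return their median."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table1 (somevalue INTEGER)")
    conn.executemany("INSERT INTO Table1 VALUES (?)", [(v,) for v in values])
    return conn.execute(MEDIAN_SQL).fetchone()[0]


odd_median = median([2, 45, 67, 89, 4567, 6, 1290])
even_median = median([141, 325, 325, 353, 3166, 325, 207, 141, 3166, 161])
print(odd_median, even_median)
```

One difference to keep in mind: SQLite's AVG() returns a floating-point value, whereas SQL Server's AVG() over an int column returns an int, so on SQL Server the column may need a cast to a decimal type before averaging.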
(sorry for using a second answer, but it will get lost if just added to the earlier one)
I am really not certain what the expected output of your query is. But I note that you are using TOP(10), and for that to work you must have an ORDER BY; otherwise, which 10 rows are returned is indeterminate.
While the following may produce many more rows than you need, perhaps it will help lead to a solution.
WITH Basis as (
SELECT
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
, UnitRate
, ROW_NUMBER() OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType
ORDER BY UnitRate)
AS [rownum]
FROM [dbo].[ReplaceCost_DirectCost_Details] rdd
WHERE client = 'APV_Ballina_Shire_Council_Old'
AND UnitRate IS NOT NULL
AND UnitRate <> 0
)
, Top10s as (
SELECT
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
, UnitRate
, rownum
, COUNT(*) OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType)
AS rowcnt
FROM Basis
WHERE rownum <= 10
)
, Medians as (
SELECT
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
, AVG(UnitRate) AS Median
FROM Top10s
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
GROUP BY
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
)
SELECT
t.JobName
, t.Client
, t.AssetClass
, t.AssetType
, t.AssetSubType
, t.Component
, t.ComponentType
, t.ComponentSubType
, t.UnitRate
, t.rownum
, t.rowcnt
, MAX(t.UnitRate) OVER (PARTITION BY t.JobName, t.Client, t.AssetClass, t.AssetType, t.AssetSubType, t.Component, t.ComponentType, t.ComponentSubType) AS [maxfinalunitrate]
, MIN(t.UnitRate) OVER (PARTITION BY t.JobName, t.Client, t.AssetClass, t.AssetType, t.AssetSubType, t.Component, t.ComponentType, t.ComponentSubType) AS [minfinalunitrate]
, AVG(t.UnitRate) OVER (PARTITION BY t.JobName, t.Client, t.AssetClass, t.AssetType, t.AssetSubType, t.Component, t.ComponentType, t.ComponentSubType) AS [meanfinalunitrate]
, m.Median
FROM Top10s t
JOIN Medians m ON t.JobName = m.JobName
AND t.Client = m.Client
AND t.AssetClass = m.AssetClass
AND t.AssetType = m.AssetType
AND t.AssetSubType = m.AssetSubType
AND t.Component = m.Component
AND t.ComponentType = m.ComponentType
AND t.ComponentSubType = m.ComponentSubType
;

LAG within CASE giving false negative offset

TL;DR: scroll down to TASK 2.
I am dealing with the following data set:
email,createdby,createdon
a#b.c,jsmith,2016-10-10
a#b.c,nsmythe,2016-09-09
a#b.c,vstark,2016-11-11
b#x.y,ajohnson,2015-02-03
b#x.y,elear,2015-01-01
...
and so on. Each email is guaranteed to have at least one duplicate in the data set.
Now, there are two tasks to resolve; I resolved one of them but am struggling with the other one. I will now present both tasks for completeness.
TASK 1 (resolved):
For each row, for each email, return an additional column with the name of the user that created the first record with this email.
Expected result for the above sample data set:
email,createdby,createdon,original_createdby
a#b.c,jsmith,2016-10-10,nsmythe
a#b.c,nsmythe,2016-09-09,nsmythe
a#b.c,vstark,2016-11-11,nsmythe
b#x.y,ajohnson,2015-02-03,elear
b#x.y,elear,2015-01-01,elear
Code to get the above:
;WITH q0 -- this is just a security measure in case there are unique emails in the data set
AS ( SELECT t.email
FROM t
GROUP BY t.email
HAVING COUNT(*) > 1) ,
q1
AS ( SELECT q0.email
, createdon
, createdby
, ROW_NUMBER() OVER ( PARTITION BY q0.email ORDER BY createdon ) rn
FROM t
JOIN q0
ON t.email = q0.email)
SELECT q1.email
, q1.createdon
, q1.createdby
, LAG(q1.createdby, q1.rn - 1) OVER ( ORDER BY q1.email, q1.createdon ) original_createdby
FROM q1
ORDER BY q1.email
, q1.rn
Brief explanation: I partition data set by email, then I number rows in each partition ordered by creation date, finally I return [createdby] value from (rn-1)th record. Works exactly as expected.
Now, similar to the above, there is TASK 2:
TASK 2:
For each row, for each email, return name of the user that created the first duplicate. I.e. name of a user where rn=2.
Expected result:
email,createdby,createdon,first_dupl_createdby
a#b.c,jsmith,2016-10-10,jsmith
a#b.c,nsmythe,2016-09-09,jsmith
a#b.c,vstark,2016-11-11,jsmith
b#x.y,ajohnson,2015-02-03,ajohnson
b#x.y,elear,2015-01-01,ajohnson
I want to keep things performant so trying to employ LEAD-LAG functions:
WITH q0
AS ( SELECT t.email
FROM t
GROUP BY t.email
HAVING COUNT(*) > 1) ,
q1
AS ( SELECT q0.email
, createdon
, createdby
, ROW_NUMBER() OVER ( PARTITION BY q0.email ORDER BY createdon ) rn
FROM t
JOIN q0
ON t.email = q0.email)
SELECT q1.email
, q1.createdon
, q1.createdby
, q1.rn
, CASE q1.rn
WHEN 1 THEN LEAD(q1.createdby, 1) OVER ( ORDER BY q1.email, q1.createdon )
ELSE LAG(q1.createdby, q1.rn - 2) OVER ( ORDER BY q1.email, q1.createdon )
END AS first_dupl_createdby
FROM q1
ORDER BY q1.email
, q1.rn
Explanation: for the first record in each partition, return [createdby] from the following record (i.e. from the record containing the first duplicate). For all other records in the same partition return [createdby] from (rn-2) records ago (i.e. for rn = 2 we're staying on the same record, for rn = 3 we're going 1 record back, for rn = 4 - 2 records back and so on).
An issue comes up on the
ELSE LAG(q1.createdby, q1.rn - 2)
operation. Apparently, against any logic, despite the existence of the preceding line (WHEN 1 THEN...), the ELSE block is also evaluated for rn = 1, resulting in a negative offset value passed to the LAG function:
Msg 8730, Level 16, State 2, Line 37
Offset parameter for Lag and Lead functions cannot be a negative value.
When I comment out that ELSE line, the whole thing works fine but obviously I am not getting any results in the first_dupl_createdby column for rn > 1.
QUESTION:
Is there any way of rewriting the above CASE expression (in TASK 2) so that it always returns the value from the record where rn = 2 within each partition, but (and this is the important bit) without doing a self-JOIN? I know I could prepare the rows where rn = 2 in a separate sub-query, but this would mean extra scans of the whole table and an unnecessary self-JOIN.
I think you can simply use the max window function as you are trying to get the value from rownumber = 2 for each partition.
SELECT q1.email
, q1.createdon
, q1.createdby
, q1.rn
, max(case when rn=2 then q1.createdby end) over(partition by q1.email) first_dup_created_by
FROM q1
ORDER BY q1.email, q1.rn
You can use a similar query to get the results for rownumber=1 for the 1st scenario as well.
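The windowed MAX(CASE ...) suggestion can be checked against the sample data from the question with a small sketch. It assumes SQLite (via Python's sqlite3) in place of SQL Server and leaves out the q0 filter, since every email in the sample already has a duplicate. It computes both the TASK 1 and TASK 2 columns in one pass, without LAG and without a self-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (email TEXT, createdby TEXT, createdon TEXT);
    INSERT INTO t VALUES
        ('a#b.c', 'jsmith',   '2016-10-10'),
        ('a#b.c', 'nsmythe',  '2016-09-09'),
        ('a#b.c', 'vstark',   '2016-11-11'),
        ('b#x.y', 'ajohnson', '2015-02-03'),
        ('b#x.y', 'elear',    '2015-01-01');
""")

query = """
    WITH q1 AS (
        SELECT email, createdby, createdon,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY createdon) AS rn
        FROM t
    )
    SELECT email, createdby, createdon,
           -- Only the rn = 1 row contributes a non-NULL value per email.
           MAX(CASE WHEN rn = 1 THEN createdby END)
               OVER (PARTITION BY email) AS original_createdby,
           -- Only the rn = 2 row contributes a non-NULL value per email.
           MAX(CASE WHEN rn = 2 THEN createdby END)
               OVER (PARTITION BY email) AS first_dupl_createdby
    FROM q1
    ORDER BY email, rn
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Since the CASE expression yields NULL for every row except the one with the requested rn, MAX() has exactly one candidate per partition and no offset arithmetic is involved, so the negative-offset error cannot occur.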
You can get the information for each email using row_number() and conditional aggregation:
select email,
max(case when seqnum = 1 then createdby end) as createdby_first,
max(case when seqnum = 2 then createdby end) as createdby_second
from (select t.*,
row_number() over (partition by email order by createdon) as seqnum
from t
) t
group by email;
You can join this information back to the original data to get the information you want. I don't see how lag() naturally would be used to solve this problem.
/shrug
; WITH duplicate_email_addresses AS (
SELECT email
FROM t
GROUP
BY email
HAVING Count(*) > 1
)
, records_with_duplicate_email_addresses AS (
SELECT email
, createdon
, createdby
, Row_Number() OVER (PARTITION BY email ORDER BY createdon) AS sequencer
FROM t
WHERE EXISTS (
SELECT *
FROM duplicate_email_addresses
WHERE email = t.email
)
)
, second_duplicate_record AS ( -- Why do you need any more than this?
SELECT email
, createdon
, createdby
FROM records_with_duplicate_email_addresses
WHERE sequencer = 2
)
SELECT records_with_duplicate_email_addresses.email
, records_with_duplicate_email_addresses.createdon
, records_with_duplicate_email_addresses.createdby
, second_duplicate_record.createdby AS first_duplicate_createdby
FROM records_with_duplicate_email_addresses
INNER
JOIN second_duplicate_record
ON second_duplicate_record.email = records_with_duplicate_email_addresses.email
;

How to add a count/sum and group by in a CTE

I have a question about displaying one row per flight together with a count of the crew members on that flight.
I want to change the output so that it displays a single record at flight level, with two additional columns. One column (CabinCrew) is the count of crew members with CREWTYPE = 'F', and the other column (CockpitCrew) is the count of crew members with CREWTYPE = 'C'.
So the query result should look like:
Flight | DepartureDate | DepartureAirport | CREWBASE | CockpitCrew | CabinCrew
LS361 | 2016-05-19 | BFS | BFS | 0 | 3
Can I have a little help tweaking the below query please:
WITH CTE AS (
SELECT cd.*, c.*, l.Carrier, l.FlightNumber, l.Suffix, l.ScheduledDepartureDate, l.ScheduledDepartureAirport
FROM
(SELECT *, ROW_NUMBER() OVER(PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber FROM Data.Crew) c
INNER JOIN
Data.CrewDetail cd
ON c.UpdateID = cd.CrewUpdateID
AND cd.IsPassive = 0
AND RowNumber = 1
INNER JOIN
Data.Leg l
ON c.LegKey = l.LegKey
)
SELECT
sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix AS Flight
, sac.DepartureDate
, sac.DepartureAirport
, sac.CREWBASE
, sac.CREWTYPE
, sac.EMPNO
, sac.FIRSTNAME
, sac.LASTNAME
, sac.SEX
FROM
Staging.SabreAssignedCrew sac
LEFT JOIN CTE cte
ON sac.Airline + CAST(sac.FlightNumber AS VARCHAR) + sac.Suffix = cte.Carrier + CAST(cte.FlightNumber AS VARCHAR) + cte.Suffix
AND sac.DepartureDate = cte.ScheduledDepartureDate
Please try this:
SELECT Flight,
DepartureDate,
DepartureAirport,
CREWBASE,
SUM(CASE WHEN CREWTYPE = 'F' THEN 1 ELSE 0 END) AS CabinCrew ,
SUM(CASE WHEN CREWTYPE = 'C' THEN 1 ELSE 0 END) AS CockpitCrew
FROM #Table
GROUP BY Flight, DepartureDate, DepartureAirport, CREWBASE
Please Try This:
select Flight, DepartureDate, DepartureAirport,CREWBASE,
count(case when CREWTYPE='F' then 1 end ) as CabinCrew,count(case when CREWTYPE='C' then 1 end ) as CockpitCrew
from Staging.SabreAssignedCrew
group by Flight, DepartureDate, DepartureAirport,CREWBASE
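Both answers use conditional aggregation, which can be tried on a few rows with the sketch below. It assumes SQLite (via Python's sqlite3) instead of SQL Server and a hypothetical table named crew that already has a Flight column; in the original query Flight is built by concatenating Airline, FlightNumber and Suffix. The LS361 rows follow the expected output in the question, while the LS362 rows and all EMPNO values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE crew (
        Flight TEXT, DepartureDate TEXT, DepartureAirport TEXT,
        CREWBASE TEXT, CREWTYPE TEXT, EMPNO INTEGER
    );
    INSERT INTO crew VALUES
        ('LS361', '2016-05-19', 'BFS', 'BFS', 'F', 101),
        ('LS361', '2016-05-19', 'BFS', 'BFS', 'F', 102),
        ('LS361', '2016-05-19', 'BFS', 'BFS', 'F', 103),
        ('LS362', '2016-05-19', 'LBA', 'LBA', 'C', 201),  -- hypothetical flight
        ('LS362', '2016-05-19', 'LBA', 'LBA', 'C', 202),
        ('LS362', '2016-05-19', 'LBA', 'LBA', 'F', 203);
""")

query = """
    SELECT Flight, DepartureDate, DepartureAirport, CREWBASE,
           -- Each CASE contributes 1 only for rows of the matching crew type.
           SUM(CASE WHEN CREWTYPE = 'C' THEN 1 ELSE 0 END) AS CockpitCrew,
           SUM(CASE WHEN CREWTYPE = 'F' THEN 1 ELSE 0 END) AS CabinCrew
    FROM crew
    GROUP BY Flight, DepartureDate, DepartureAirport, CREWBASE
    ORDER BY Flight
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The ELSE 0 branch matters for flights such as LS361 that have no cockpit crew rows: SUM then returns 0 rather than NULL, matching the expected output. The COUNT(CASE ... THEN 1 END) form in the second answer gives the same zero because COUNT ignores NULLs.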