Finding the median in SQL Server

I want to get the median of UnitRate from the [dbo].[ReplaceCost_DirectCost_Details] view in Microsoft SQL Server Management Studio. I already have the Min, Max and Avg, but I do not know how to get the median. I tried the following code, but it does not return the median. Thanks in advance for your help.
select
JobName as JobName
,Client as Client
,AssetClass as AssetClass
,AssetType as AssetType
,AssetSubType as AssetSubType
,Component as Component
,ComponentType as ComponentType
,ComponentSubType as ComponentSubType
,UnitRate AS UnitRate
,Max(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MaxFinalUnitRate]
,Min(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MinFinalUnitRate]
,AVG(UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType) as [MeanFinalUnitRate]
,AVG (UnitRate) over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType)as Median
from
(
Select top (10)
JobName as JobName
,Client as Client
,AssetClass as AssetClass
,AssetType as AssetType
,AssetSubType as AssetSubType
,Component as Component
,ComponentType as ComponentType
,ComponentSubType as ComponentSubType
,UnitRate AS UnitRate
,ROW_NUMBER () over (partition by JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType order by UnitRate) as [RowNum]
,COUNT(*) OVER (PARTITION BY JobName,Client,AssetClass,AssetType,AssetSubType,Component,ComponentType,ComponentSubType ) AS RowCnt
from [dbo].[ReplaceCost_DirectCost_Details] rdd
where client = 'APV_Ballina_Shire_Council_Old' and UnitRate is not Null and UnitRate <> 0
) x
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)

EDIT
SQL Fiddle
CREATE TABLE Table1
([somevalue] int)
;
INSERT INTO Table1
([somevalue])
VALUES
(141),
(325),
(325),
(353),
(3166),
(325),
(207),
(141),
(3166),
(161)
;
Query 1:
with cte as (
select *
, row_number() over(order by somevalue) as RowNum
, count(*) over() as RowCnt
from table1
)
select
*
from CTE
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
Results:
| somevalue | RowNum | RowCnt |
|-----------|--------|--------|
| 325 | 5 | 10 |
| 325 | 6 | 10 |
Please consider the following small example. There are 7 rows of data and the median is the "midpoint" of those, so the WHERE clause compares the row number to the row count and returns just that midpoint value. That value (67) represents the median of that small sample.
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Table1
([somevalue] int)
;
INSERT INTO Table1
([somevalue])
VALUES
(2),
(45),
(67),
(89),
(4567),
(6),
(1290)
;
Query 1:
with cte as (
select *
, row_number() over(order by somevalue) as RowNum
, count(*) over() as RowCnt
from table1
)
select
*
from CTE
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
Results:
| somevalue | RowNum | RowCnt |
|-----------|--------|--------|
| 67 | 4 | 7 |
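The two fiddles above return the middle row (odd count) or the two middle rows (even count); averaging those rows gives a single median figure. The sketch below replays both samples using Python's built-in sqlite3 module as a stand-in for SQL Server. That substitution is an assumption made only so the example runs without a server (it needs SQLite 3.25 or later for window functions), and the helper name median_of is illustrative.

```python
import sqlite3

# Same CTE as in the fiddles, with AVG over the one or two middle rows.
MEDIAN_SQL = """
WITH cte AS (
    SELECT somevalue,
           ROW_NUMBER() OVER (ORDER BY somevalue) AS RowNum,
           COUNT(*)     OVER ()                   AS RowCnt
    FROM Table1
)
SELECT AVG(1.0 * somevalue)
FROM cte
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
"""

def median_of(values):
    """Load values into an in-memory table and return their median."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Table1 (somevalue INTEGER)")
    con.executemany("INSERT INTO Table1 VALUES (?)", [(v,) for v in values])
    (median,) = con.execute(MEDIAN_SQL).fetchone()
    con.close()
    return median

odd_sample = [2, 45, 67, 89, 4567, 6, 1290]                            # 7 rows
even_sample = [141, 325, 325, 353, 3166, 325, 207, 141, 3166, 161]     # 10 rows
print(median_of(odd_sample))
print(median_of(even_sample))
```

The `1.0 *` matters in T-SQL, where AVG over an int column returns an int. On SQL Server 2012 and later, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY ...) OVER (PARTITION BY ...) is a built-in alternative worth checking.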

(sorry for using a second answer, but it will get lost if just added to the earlier one)
I am really not certain what the expected output of your query is. But I note that you are using TOP(10), and for that to work you must have an ORDER BY; otherwise it is indeterminate which 10 rows are returned.
While the following may produce many more rows than you need, perhaps it will help lead to a solution.
WITH Basis as (
SELECT
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
, UnitRate
, ROW_NUMBER() OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType
ORDER BY UnitRate)
AS [rownum]
FROM [dbo].[ReplaceCost_DirectCost_Details] rdd
WHERE client = 'APV_Ballina_Shire_Council_Old'
AND UnitRate IS NOT NULL
AND UnitRate <> 0
)
, Top10s as (
SELECT
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
, UnitRate
, rownum
, COUNT(*) OVER (PARTITION BY JobName, Client, AssetClass, AssetType, AssetSubType, Component, ComponentType, ComponentSubType)
AS rowcnt
FROM Basis
WHERE rownum <= 10
)
, Medians as (
SELECT
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
, AVG(UnitRate) AS Median
FROM Top10s
WHERE RowNum IN ((RowCnt + 1) / 2, (RowCnt + 2) / 2)
GROUP BY
JobName
, Client
, AssetClass
, AssetType
, AssetSubType
, Component
, ComponentType
, ComponentSubType
)
SELECT
t.JobName
, t.Client
, t.AssetClass
, t.AssetType
, t.AssetSubType
, t.Component
, t.ComponentType
, t.ComponentSubType
, t.UnitRate
, t.rownum
, t.rowcnt
, MAX(t.UnitRate) OVER (PARTITION BY t.JobName, t.Client, t.AssetClass, t.AssetType, t.AssetSubType, t.Component, t.ComponentType, t.ComponentSubType) AS [maxfinalunitrate]
, MIN(t.UnitRate) OVER (PARTITION BY t.JobName, t.Client, t.AssetClass, t.AssetType, t.AssetSubType, t.Component, t.ComponentType, t.ComponentSubType) AS [minfinalunitrate]
, AVG(t.UnitRate) OVER (PARTITION BY t.JobName, t.Client, t.AssetClass, t.AssetType, t.AssetSubType, t.Component, t.ComponentType, t.ComponentSubType) AS [meanfinalunitrate]
, m.Median -- the Medians CTE is aliased as m in the JOIN below
FROM Top10s t
JOIN Medians m ON t.JobName = m.JobName
AND t.Client = m.Client
AND t.AssetClass = m.AssetClass
AND t.AssetType = m.AssetType
AND t.AssetSubType = m.AssetSubType
AND t.Component = m.Component
AND t.ComponentType = m.ComponentType
AND t.ComponentSubType = m.ComponentSubType
;
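The shape of that query (number the rows per group, average the middle rows into a per-group median, then join the median back onto the detail rows) can be tried on a toy data set. This is a minimal sketch, assuming a hypothetical table rates(grp, unit_rate) in place of the view and using Python's sqlite3 module rather than SQL Server so that it is runnable as-is.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE rates (grp TEXT, unit_rate REAL)")  # hypothetical table
con.executemany(
    "INSERT INTO rates VALUES (?, ?)",
    [("A", 10), ("A", 20), ("A", 90),            # odd row count
     ("B", 5), ("B", 7), ("B", 9), ("B", 100)],  # even row count
)

PER_GROUP_SQL = """
WITH basis AS (
    SELECT grp, unit_rate,
           ROW_NUMBER() OVER (PARTITION BY grp ORDER BY unit_rate) AS rownum,
           COUNT(*)     OVER (PARTITION BY grp)                    AS rowcnt
    FROM rates
),
medians AS (
    -- one row per group: the average of its one or two middle rows
    SELECT grp, AVG(unit_rate) AS median
    FROM basis
    WHERE rownum IN ((rowcnt + 1) / 2, (rowcnt + 2) / 2)
    GROUP BY grp
)
SELECT b.grp, b.unit_rate,
       MIN(b.unit_rate) OVER (PARTITION BY b.grp) AS min_rate,
       MAX(b.unit_rate) OVER (PARTITION BY b.grp) AS max_rate,
       AVG(b.unit_rate) OVER (PARTITION BY b.grp) AS mean_rate,
       m.median
FROM basis AS b
JOIN medians AS m ON m.grp = b.grp
ORDER BY b.grp, b.unit_rate
"""
rows = con.execute(PER_GROUP_SQL).fetchall()
# Collapse the detail rows to one median per group for easy inspection.
medians_by_group = {grp: median for grp, *_rest, median in rows}
print(medians_by_group)
```

Every detail row carries its group's min, max, mean and median, which is the output layout the original question appears to be after.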

Related

Where should I put the AS clause for tax rate (VAT_RATE)

I want to put an AS VAT_RATE in this SELECT statement, but I don't know where.
SELECT ROW_NUMBER() OVER(ORDER BY QD.DETAIL_ID) AS No,
QD.PRODUCT_ID AS PROD_ID,PM.'+@ProdCode+' AS PROD_CODE,pm.DESCRIPTION AS SHORT_DESC,
QD.CORPORATE_PRICE AS Corpo_Price,CONVERT(DECIMAL(18,2),QD.RETAIL_PRICE) AS UNIT_SP,QD.COST_PRICE AS COST_SP,
QD.GM,QD.DETAIL_ID,QD.DISC AS Discount,QD.NOTE,
VAT_RATE=(SELECT VAT_RATE/100 FROM dbo.vat
WHERE VAT_ID=(SELECT TOP 1 VAT_ID FROM dbo.product_detail(NOLOCK) WHERE PRODUCT_ID=PM.PROD_ID))
,
Img=(SELECT TOP 1 IMAGE_DATA FROM dbo.PRODUCT_IMAGE WHERE PRODUCT_ID=PM.PROD_ID), QD.CostPrice_Percentage
FROM dbo.CUSTOMER_QUOTATION_DETAIL(NOLOCK) QD
JOIN dbo.product_master(NOLOCK) PM ON PM.PROD_ID=QD.PRODUCT_ID
In T-SQL there are three ways to name your columns:
1) With the AS (optional in tsql)
SELECT QD.PRODUCT_ID AS PROD_ID
FROM dbo.CUSTOMER_QUOTATION_DETAIL(NOLOCK) QD
2) Without the AS (since it is optional)
SELECT QD.PRODUCT_ID PROD_ID
FROM dbo.CUSTOMER_QUOTATION_DETAIL(NOLOCK) QD
3) with an equal sign as if it is a formula
SELECT PROD_ID = QD.PRODUCT_ID
FROM dbo.CUSTOMER_QUOTATION_DETAIL(NOLOCK) QD
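A quick way to confirm that the alias ends up as the result-set column name is to inspect the cursor metadata. The sketch below does that with Python's sqlite3 module and a hypothetical one-column table; it covers only forms 1 and 2 plus an aliased scalar sub-query, because the `alias = expression` form is T-SQL specific and SQLite would read it as a comparison.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE quotation_detail (PRODUCT_ID INTEGER)")  # hypothetical table
con.execute("INSERT INTO quotation_detail VALUES (42)")

def column_names(sql):
    """Return the result-set column names produced by a query."""
    cur = con.execute(sql)
    return [col[0] for col in cur.description]

# Form 1: explicit AS.
with_as = column_names("SELECT QD.PRODUCT_ID AS PROD_ID FROM quotation_detail QD")
# Form 2: AS omitted.
without_as = column_names("SELECT QD.PRODUCT_ID PROD_ID FROM quotation_detail QD")
# A scalar sub-query takes its alias after the closing parenthesis.
subquery = column_names(
    "SELECT (SELECT MAX(PRODUCT_ID) FROM quotation_detail) AS MAX_ID "
    "FROM quotation_detail QD"
)
print(with_as, without_as, subquery)
```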
Specifically for your query, this is where the AS should go.
You would have to remove the equals sign and put the AS at the end of the sub-query.
Please note that your query has various other issues that are beyond the scope of your original question. If you run into performance issues, look into CROSS APPLY / OUTER APPLY and/or CTEs (Common Table Expressions).
SELECT ROW_NUMBER() OVER (
ORDER BY QD.DETAIL_ID
) AS No
, QD.PRODUCT_ID AS PROD_ID
--, PM.'+@ProdCode+' AS PROD_CODE
, @ProdCode AS PROD_CODE
, pm.DESCRIPTION AS SHORT_DESC
, QD.CORPORATE_PRICE AS Corpo_Price
, CONVERT(DECIMAL(18, 2), QD.RETAIL_PRICE) AS UNIT_SP
, QD.COST_PRICE AS COST_SP
, QD.GM
, QD.DETAIL_ID
, QD.DISC AS Discount
, QD.NOTE
, (
SELECT TOP 1 (VAT_RATE / 100)
FROM dbo.vat
WHERE VAT_ID = (
SELECT TOP 1 VAT_ID
FROM dbo.product_detail(NOLOCK)
WHERE PRODUCT_ID = PM.PROD_ID
)
) AS VAT_RATE
, (
SELECT TOP 1 IMAGE_DATA
FROM dbo.PRODUCT_IMAGE
WHERE PRODUCT_ID = PM.PROD_ID
) AS Img
, QD.CostPrice_Percentage
FROM dbo.CUSTOMER_QUOTATION_DETAIL(NOLOCK) QD
JOIN dbo.product_master(NOLOCK) PM
ON PM.PROD_ID = QD.PRODUCT_ID

always get a "not a GROUP BY expression" exception

The SELECT part is fine: I can run it and get a result. But if I insert the query result into a table, a "not a GROUP BY expression" exception is thrown. See my SQL statement below:
INSERT INTO RAWDATA_FACT
(
LINEITEMID
,CALENDARYEAR
,CALENDARQUARTER
,CALENDARMONTH
,DEPARTMENTID
,PRODUCTID
,DEALERID
,ACTUALVALUE
,TARGETVALUE
,AGGREGATION
,TODATE
,CREATEDATE
,BATCH_ID
)
select parentid
,calendaryear
,calendarquarter
,calendarmonth
,departmentid
,productid
,dealerid
,sum(case when unaryoperator = '-' then actualvalue * (-1)
when unaryoperator = '~' then actualvalue * 0
else actualvalue
end
) actualvalue
,sum(case when unaryoperator = '-' then targetvalue * (-1)
when unaryoperator = '~' then targetvalue * 0
else targetvalue
end
) targetvalue
,aggregation
,todate
,sysdate createdate
,'201808' batch_id--v_batch_ID
from
(
select
x.parentid
, x.unaryoperator
, y.calendaryear
, y.calendarquarter
, y.calendarmonth
, y.departmentid
, y.productid
, y.dealerid
, y.actualvalue
, y.targetvalue
, y.aggregation
, y.todate
from
(select substr(lineitemid,instr(lineitemid,'_')+1) lineitemid,
parentid, unaryoperator from lineitem_temp where levelid = 14 /*v_cur_level*/) x
inner join
(select lineitemid,calendaryear, calendarquarter, calendarmonth, departmentid, productid, dealerid,
coalesce(actualvalue,0) actualvalue,coalesce(targetvalue,0) targetvalue,
aggregation, todate
from RAWDATA_FACT where BATCH_ID = '201808'/*v_batch_ID*/) y --eg. 201809
on x.lineitemid = y.lineitemid
--parent node's id contains "_" will not take part in calculation from current level to parent level
where regexp_like(x.parentid , '^\d+$') and not exists
--parent node contains formula will not participate calculation from current level to parent level
(select 1 from LINEITEM_TEMP where lineitemid = x.parentid and custommember is not null)
) t
GROUP BY
t.parentid
, t.calendaryear
, t.calendarquarter
, t.calendarmonth
, t.departmentid
, t.productid
, t.dealerid
, t.aggregation
, t.todate
;
Who can tell me why? Is there a way to insert the result into that table? If I run the SELECT part on its own, it works fine, but when I try to insert the result into that table, it reports that exception.

Convert Oracle SQL statement into SQL Server statement

A bookings project now requires the same data extract - but from an SQL Server database - instead of Oracle. Can anyone assist converting the following into SQL Server syntax?
SELECT *
FROM (
SELECT o.ot_outlet_code
,v.lab_site_code ot_outlet_code
,v.brand
,v.region
, bd.cd_day_date booking_date, dd.cd_day_date dining_date
, f.last_change_date, f.created_date
, f.modified_date, t15.ts_timeslot_desc
, t.TIME, s.session_type
, tbs.booking_status, f.ADDED_BY_USER
, bp.product, bs.booking_source
, f.SPECIAL_OFFER, f.SEATING_PREFERENCE
, f.Tables_guest_id, covers
, booking_occurrence, breakfast_flag
, row_number() OVER (PARTITION BY f.Tables_guest_id ORDER BY f.last_change_date DESC, f.last_change_time DESC) rank_latest_record
, f.title, f.emailoptout
, f.MOBILE_OPT_IN, f.HIGH_CHAIR_COVERS
, f.GUEST_TYPE, f.Booking_ID
FROM owbi.whs_fact_rest_booking f
, owbi.whs_dim_cal_date bd
, owbi.whs_dim_cal_date dd
, owbi.whs_dim_bat_booking_source bs
, owbi.whs_dim_time_of_day t
, owbi.whs_dim_bat_product bp
, owbi.whs_dim_15_timeslot t15
, owbi.whs_dim_bat_booking_status tbs
, owbi.whs_dim_bat_session s
, owbi.bat_restaurants_v v
WHERE f.whs_dim_outlet = v.outlet
AND f.whs_dim_booking_date = bd.dimension_Key
AND f.whs_dim_dining_date = dd.dimension_key
AND f.whs_dim_bat_session = s.dimension_key
AND f.whs_dim_bat_booking_status = tbs.dimension_key
AND f.whs_dim_bat_product = bp.dimension_Key
AND f.whs_dim_bat_booking_source = bs.dimension_key
AND f.whs_dim_booking_time = t.dimension_Key
AND f.whs_dim_dining_15_timeslot = t15.dimension_key
AND dd.ey_year_code in ('2018')
AND f.whs_dim_dining_date >= 20170303
)
WHERE rank_latest_record = 1
ORDER BY BOOKING_DATE DESC;
The derived table must have an alias, e.g.:
SELECT *
FROM (
SELECT o.ot_outlet_code
,v.lab_site_code ot_outlet_code
,v.brand
,v.region
, bd.cd_day_date booking_date, dd.cd_day_date dining_date
, f.last_change_date, f.created_date
, f.modified_date, t15.ts_timeslot_desc
, t.TIME, s.session_type
, tbs.booking_status, f.ADDED_BY_USER
, bp.product, bs.booking_source
, f.SPECIAL_OFFER, f.SEATING_PREFERENCE
, f.Tables_guest_id, covers
, booking_occurrence, breakfast_flag
, row_number() OVER (PARTITION BY f.Tables_guest_id ORDER BY f.last_change_date DESC, f.last_change_time DESC) rank_latest_record
, f.title, f.emailoptout
, f.MOBILE_OPT_IN, f.HIGH_CHAIR_COVERS
, f.GUEST_TYPE, f.Booking_ID
FROM owbi.whs_fact_rest_booking f
, owbi.whs_dim_cal_date bd
, owbi.whs_dim_cal_date dd
, owbi.whs_dim_bat_booking_source bs
, owbi.whs_dim_time_of_day t
, owbi.whs_dim_bat_product bp
, owbi.whs_dim_15_timeslot t15
, owbi.whs_dim_bat_booking_status tbs
, owbi.whs_dim_bat_session s
, owbi.bat_restaurants_v v
WHERE f.whs_dim_outlet = v.outlet
AND f.whs_dim_booking_date = bd.dimension_Key
AND f.whs_dim_dining_date = dd.dimension_key
AND f.whs_dim_bat_session = s.dimension_key
AND f.whs_dim_bat_booking_status = tbs.dimension_key
AND f.whs_dim_bat_product = bp.dimension_Key
AND f.whs_dim_bat_booking_source = bs.dimension_key
AND f.whs_dim_booking_time = t.dimension_Key
AND f.whs_dim_dining_15_timeslot = t15.dimension_key
AND dd.ey_year_code in ('2018')
AND f.whs_dim_dining_date >= 20170303
) dt
WHERE rank_latest_record = 1
ORDER BY BOOKING_DATE DESC;
In SQL Server it's considered poor form not to use ANSI-style JOINs, although it's perfectly legal to write inner joins as a cross join with the join criteria in the WHERE clause.
And it's generally better to use CTEs instead of subqueries/inline views/derived tables in the FROM clause.
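Both suggestions can be seen side by side on a cut-down model of the bookings query. This is a minimal sketch, assuming two hypothetical tables (booking and booking_status) and Python's sqlite3 module in place of SQL Server: the comma join and the ANSI join return the same rows, and the "latest record per guest" filter moves from an aliased derived table into a CTE.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE booking (guest_id INTEGER, status_key INTEGER, last_change TEXT);
CREATE TABLE booking_status (dimension_key INTEGER, booking_status TEXT);
INSERT INTO booking_status VALUES (1, 'Booked'), (2, 'Cancelled');
INSERT INTO booking VALUES
    (100, 1, '2018-01-01'), (100, 2, '2018-01-05'),
    (200, 1, '2018-02-01');
""")

# Old style: comma-separated tables, join criteria in WHERE.
COMMA_JOIN = """
SELECT f.guest_id, s.booking_status
FROM booking f, booking_status s
WHERE f.status_key = s.dimension_key
ORDER BY f.guest_id, f.last_change
"""
# ANSI style: the join criteria live in the ON clause.
ANSI_JOIN = """
SELECT f.guest_id, s.booking_status
FROM booking f
INNER JOIN booking_status s ON f.status_key = s.dimension_key
ORDER BY f.guest_id, f.last_change
"""
# CTE instead of a derived table; keeps the newest row per guest.
LATEST_PER_GUEST = """
WITH ranked AS (
    SELECT f.guest_id, s.booking_status,
           ROW_NUMBER() OVER (PARTITION BY f.guest_id
                              ORDER BY f.last_change DESC) AS rank_latest_record
    FROM booking f
    INNER JOIN booking_status s ON f.status_key = s.dimension_key
)
SELECT guest_id, booking_status
FROM ranked
WHERE rank_latest_record = 1
ORDER BY guest_id
"""
comma_rows = con.execute(COMMA_JOIN).fetchall()
ansi_rows = con.execute(ANSI_JOIN).fetchall()
latest_rows = con.execute(LATEST_PER_GUEST).fetchall()
print(latest_rows)
```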

Order by logic for a forum thread

I have four columns:
ThreadID
DateTime
CommentID
ReplyCommentID
Query
WITH CTE AS ( SELECT CommentID ,
CommentUserName,
ReplyCommentID ,
CommentID AS ThreadID ,
CAST( CommentID AS VARCHAR( MAX ) ) AS PathStr,
HtmlComment ,
CommentPostDocumentID ,
CommentIsApproved,
CommentDate
FROM Blog_CommentDetails AS T WITH(NOLOCK)
WHERE ReplyCommentID IS NULL
UNION ALL
SELECT T.CommentID ,
T.CommentUserName,
T.ReplyCommentID ,
CTE.ThreadID ,
PathStr + '-'+ CAST( T.ReplyCommentID AS VARCHAR( MAX ) ) AS PathStr,
T.HtmlComment ,
t.CommentPostDocumentID ,
t.CommentIsApproved,
T.CommentDate
FROM Blog_CommentDetails AS T WITH(NOLOCK)
JOIN CTE
ON T.ReplyCommentID = CTE.CommentID
WHERE T.ReplyCommentID IS NOT NULL)
SELECT *
FROM CTE
WHERE CommentPostDocumentID = 15 AND CommentIsApproved=1
ORDER BY ThreadID, PathStr ,
CommentDate DESC;
I need to order by ThreadID ascending first,
then by CommentID ascending second,
then by date descending third.
But there is one condition: when the CommentID of one row matches the ReplyCommentID of another row, the row with that CommentID should come first.
How can I write an ORDER BY for this?
ORDER BY ThreadID,CommentID,DateTime desc,
IF(ReplyCommentID == CommentID)
then
rows with commentid should be first
Current result:
But the expected result is:
You can use a calculated value for ordering. However, as the calculation requires attributes from separate rows, these rows first have to be combined.
Without demo data in textual form that I could load into my environment, it's a bit hard to test.
The following schema is very close to your requirements, if we consider a as representing CommentID and b as standing for ReplyCommentID:
create table test (
a int,
b int
);
insert into test (a,b) values (1,null), (2,1), (3,2), (4,1);
select distinct test.a as a, test.b as b, case when (test.b=test2.a or test.b is null) then 0 else 1 end as priority
from test left join test test2 on test.b = test2.a and test2.b is null
order by priority,a,b
Note that the order compared to an order by a,b changes analogously to what you expect in your sample data.
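To see the difference concretely, the same four rows and the same query can be run next to a plain ORDER BY a, b. The sketch below uses Python's sqlite3 module purely as a convenient way to execute the SQL (an assumption; the answer itself targets SQL Server).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (a INTEGER, b INTEGER)")
con.executemany("INSERT INTO test (a, b) VALUES (?, ?)",
                [(1, None), (2, 1), (3, 2), (4, 1)])

# Baseline: plain ordering by a, b.
plain = con.execute("SELECT a, b FROM test ORDER BY a, b").fetchall()

# Priority 0 for top-level rows and direct replies to a top-level row,
# priority 1 for everything deeper in the thread.
prioritised = con.execute("""
    SELECT DISTINCT test.a AS a, test.b AS b,
           CASE WHEN (test.b = test2.a OR test.b IS NULL) THEN 0 ELSE 1 END AS priority
    FROM test
    LEFT JOIN test test2 ON test.b = test2.a AND test2.b IS NULL
    ORDER BY priority, a, b
""").fetchall()
print(plain)
print(prioritised)
```

Row (3, 2) replies to a row that is itself a reply, so the left join finds no top-level parent for it and it moves to the end.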
Given that, when applied to your query (after the WITH CTE ... part), it should look as follows. As mentioned above, I cannot test it, so please do not throw stones at me if it does not work immediately:
SELECT DISTINCT cte.*, CASE WHEN (cte.ReplyCommentID = cte2.CommentID OR cte.ReplyCommentID IS NULL) THEN 0 ELSE 1 END AS priority
FROM CTE LEFT JOIN CTE cte2 ON cte.ReplyCommentID = cte2.CommentID AND cte2.ReplyCommentID IS NULL
-- columns are qualified because both sides of the self-join expose the same names
WHERE cte.CommentPostDocumentID = 15 AND cte.CommentIsApproved = 1
ORDER BY cte.ThreadID, priority, cte.CommentID, cte.PathStr, cte.CommentDate DESC;
The following extensive example will walk through a tree of comments based on CommentID and ReplyCommentID (which serves as the ParentID). Children of the same ReplyCommentID are ordered by CommentDate DESC:
DECLARE @Table TABLE (
CommentID INT,
ReplyCommentID INT,
CommentDate DATETIME
);
INSERT INTO @Table VALUES
(140,NULL, CAST('20170109' AS DATETIME))
,(141,NULL, CAST('20170110' AS DATETIME))
,(142,141, CAST('20170111' AS DATETIME))
,(143,141, CAST('20170112' AS DATETIME))
,(144,141, CAST('20170113' AS DATETIME))
,(145,144, CAST('20170114' AS DATETIME))
,(146,NULL, CAST('20170115' AS DATETIME));
WITH [Statistics] AS (
SELECT Records.CommentID, Records.ReplyCommentID
, COUNT(Children.CommentID) AS NrOfChildren
, ROW_NUMBER() OVER (PARTITION BY Records.ReplyCommentID ORDER BY Records.CommentDate DESC) AS NthChild
, COUNT(Records.CommentID) OVER (PARTITION BY Records.ReplyCommentID) AS NrOfSiblings
FROM @Table Records
LEFT JOIN @Table Children ON Records.CommentID = Children.ReplyCommentID
GROUP BY Records.CommentID, Records.ReplyCommentID, Records.CommentDate
)
, Tree AS (
SELECT *
, 1 AS [Order]
, 0 AS [Rerouting]
, CAST(-1 AS INT) AS ReroutedFromNthChild
FROM [Statistics] AS TreeNode
WHERE ReplyCommentID IS NULL AND NthChild = 1
UNION ALL
SELECT NextNode.*
, TreeNode.[Order] + 1 AS [Order]
, CASE
WHEN (TreeNode.NrOfChildren = 0 AND TreeNode.NthChild = TreeNode.NrOfSiblings AND TreeNode.ReplyCommentID = NextNode.CommentID)
OR (TreeNode.Rerouting = 1 AND TreeNode.ReroutedFromNthChild = TreeNode.NrOfChildren AND TreeNode.ReplyCommentID = NextNode.CommentID)
THEN 1
ELSE 0
END AS [Rerouting]
, CAST(TreeNode.NthChild AS INT) AS ReroutedFromNthChild
FROM Tree AS TreeNode
JOIN [Statistics] AS NextNode
--Has children, so select first child
ON (TreeNode.Rerouting = 0 AND TreeNode.NrOfChildren > 0 AND NextNode.NthChild = 1 AND TreeNode.CommentID = NextNode.ReplyCommentID)
--Has no children, so select next sibling
OR (TreeNode.Rerouting = 0 AND TreeNode.NrOfChildren = 0 AND TreeNode.NthChild + 1 = NextNode.NthChild AND (TreeNode.ReplyCommentID = NextNode.ReplyCommentID OR (TreeNode.ReplyCommentID IS NULL AND NextNode.ReplyCommentID IS NULL)))
--Has no children or following siblings, so retrace the step (reroute)
OR (TreeNode.Rerouting = 0 AND TreeNode.NrOfChildren = 0 AND TreeNode.NthChild = TreeNode.NrOfSiblings AND TreeNode.ReplyCommentID = NextNode.CommentID)
--Was rerouting but has children, so back on track and follow the next child
OR (TreeNode.Rerouting = 1 AND TreeNode.ReroutedFromNthChild < TreeNode.NrOfChildren AND TreeNode.ReroutedFromNthChild + 1 = NextNode.NthChild AND TreeNode.CommentID = NextNode.ReplyCommentID)
--Was rerouting and has no other children, so continue rerouting
OR (TreeNode.Rerouting = 1 AND TreeNode.ReroutedFromNthChild = TreeNode.NrOfChildren AND TreeNode.ReplyCommentID = NextNode.CommentID)
--Rerouted to the top without children left, jumping to the next sibling
OR (TreeNode.Rerouting = 1 AND (TreeNode.ReroutedFromNthChild = TreeNode.NrOfChildren OR TreeNode.NrOfChildren = 0) AND TreeNode.NthChild + 1 = NextNode.NthChild AND TreeNode.ReplyCommentID IS NULL AND NextNode.ReplyCommentID IS NULL)
)
select *
from Tree
where Rerouting = 0
ORDER BY [Order]
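For comparison, here is a shorter way to get a depth-first ordering with siblings sorted by CommentDate DESC. It is a different technique from the step-by-step walk above, not a rewrite of it: each row gets a zero-padded sibling rank, a recursive CTE concatenates the ranks into a sort path, and the result is ordered by that path. The sketch runs on Python's sqlite3 module with the same seven sample rows; the table name comments, the SortPath column and the five-digit padding are assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE comments (CommentID INTEGER, ReplyCommentID INTEGER, CommentDate TEXT)"
)
con.executemany("INSERT INTO comments VALUES (?, ?, ?)", [
    (140, None, "2017-01-09"), (141, None, "2017-01-10"),
    (142, 141, "2017-01-11"), (143, 141, "2017-01-12"),
    (144, 141, "2017-01-13"), (145, 144, "2017-01-14"),
    (146, None, "2017-01-15"),
])

TREE_SQL = """
WITH RECURSIVE ranked AS (
    -- rank each comment among its siblings, newest first
    SELECT CommentID, ReplyCommentID,
           ROW_NUMBER() OVER (PARTITION BY ReplyCommentID
                              ORDER BY CommentDate DESC) AS NthChild
    FROM comments
),
tree AS (
    -- anchor: top-level comments
    SELECT CommentID, printf('%05d', NthChild) AS SortPath
    FROM ranked
    WHERE ReplyCommentID IS NULL
    UNION ALL
    -- recursive step: append the child's rank to its parent's path
    SELECT r.CommentID, t.SortPath || '/' || printf('%05d', r.NthChild)
    FROM ranked r
    JOIN tree t ON r.ReplyCommentID = t.CommentID
)
SELECT CommentID FROM tree ORDER BY SortPath
"""
ordered_ids = [row[0] for row in con.execute(TREE_SQL)]
print(ordered_ids)
```

A parent's path is a prefix of its children's paths, so every comment sorts directly before its own replies. The fixed-width padding caps the number of siblings at 99999, which is the trade-off of this approach.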

Remove Duplicates while Merging values

How can I remove duplicates and merge Account Types?
I have a call log that reports duplicate phones based on Account Type.
For example:
Telephone | Account Type
304-555-6666 | R
304-555-6666 | C
I know how to remove duplicate Telephones using RANK\MAXCOUNT
But before removing duplicates I need to reset the Account Type to “B” if the duplicates have multiple account types.
In the example the surviving duplicate would be:
Telephone | Account Type
304-555-6666 | B
Warning, it is not guaranteed that duplicate phones have multiple Account Types.
Example:
Telephone | Account Type
999-888-6666 | R
999-888-6666 | R
Therefore the surviving duplicate should be:
Telephone | Account Type
999-888-6666 | R
How can I remove duplicates and reset the account type at the same time?
--
-- Remove Duplicate Recordings
--
SELECT * FROM (
SELECT i.dateofcall ,
i.recordingfile ,
i.telephone ,
s.accounttype ,
ROW_NUMBER() OVER (PARTITION BY i.telephone ORDER BY i.dateofcall DESC) AS 'RANK' ,
COUNT(i.telephone) OVER (PARTITION BY i.telephone) AS 'MAXCOUNT'
FROM #myactions i
LEFT JOIN #myphone s ON s.interactionID = i.Interactionid
) x
WHERE [RANK] = [MAXCOUNT]
SELECT * FROM (
SELECT i.dateofcall ,
i.recordingfile ,
i.telephone ,
s.accounttype ,
ROW_NUMBER() OVER (PARTITION BY i.telephone ORDER BY i.dateofcall DESC) AS 'RANK' ,
COUNT(i.telephone) OVER (PARTITION BY i.telephone) AS 'MAXCOUNT',
DENSE_RANK() OVER ( PARTITION BY i.telephone ORDER BY s.accounttype DESC ) AS 'ContPhone'
FROM #myactions i
LEFT JOIN #myphone s ON s.interactionID = i.Interactionid
) x
WHERE [RANK] = [MAXCOUNT]
Try this?
select
x.dateofcall
, x.recordingfile
, x.telephone
, case when count(*) > 1 then 'B' else max(x.accounttype) end accounttype -- more than one distinct type => 'B'
from
(
select
i.dateofcall
, i.recordingfile
, i.telephone
, s.accounttype
from
#myactions i
LEFT JOIN #myphone s ON s.interactionID = i.Interactionid
group by
i.dateofcall
, i.recordingfile
, i.telephone
, s.accounttype
) x
group by
x.dateofcall
, x.recordingfile
, x.telephone
Basically you need to put your business check in a case statement outside.
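One way to apply that idea per telephone number is to compute the merged type in one grouped query and pick the surviving row in another. The following is a minimal sketch on Python's sqlite3 module with a single hypothetical calls table instead of the two temp tables; it keeps the most recent call per number, which is an assumption (the question's RANK = MAXCOUNT filter keeps the oldest).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE calls (dateofcall TEXT, telephone TEXT, accounttype TEXT)")
con.executemany("INSERT INTO calls VALUES (?, ?, ?)", [
    ("2018-01-01", "304-555-6666", "R"),
    ("2018-01-02", "304-555-6666", "C"),   # mixed types for this number
    ("2018-01-01", "999-888-6666", "R"),
    ("2018-01-03", "999-888-6666", "R"),   # same type twice
])

DEDUPE_SQL = """
WITH merged AS (
    -- business rule: more than one distinct type per number becomes 'B'
    SELECT telephone,
           CASE WHEN COUNT(DISTINCT accounttype) > 1 THEN 'B'
                ELSE MAX(accounttype) END AS accounttype
    FROM calls
    GROUP BY telephone
),
ranked AS (
    SELECT dateofcall, telephone,
           ROW_NUMBER() OVER (PARTITION BY telephone
                              ORDER BY dateofcall DESC) AS rn
    FROM calls
)
SELECT r.telephone, r.dateofcall, m.accounttype
FROM ranked r
JOIN merged m ON m.telephone = r.telephone
WHERE r.rn = 1
ORDER BY r.telephone
"""
survivors = con.execute(DEDUPE_SQL).fetchall()
print(survivors)
```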
EDIT: I've also added the logic for B, R and C, and set up a SQL Fiddle: http://sqlfiddle.com/#!6/b5ef5/7
SELECT
x.dateofcall,
x.recordingfile,
x.telephone,
COALESCE(
CASE WHEN x.maxcount>1 AND value>x.maxcount AND value<(2*x.maxcount) THEN 'B' ELSE NULL END,
CASE WHEN x.maxcount>1 AND value= (2*x.maxcount) THEN 'C' ELSE NULL END,
CASE WHEN x.maxcount>1 AND value= x.maxcount THEN 'R' ELSE NULL END,
x.accounttype ) as accounttype,
x.rank,
x.maxcount
FROM (
SELECT i.dateofcall ,
i.recordingfile ,
i.telephone ,
s.accounttype ,
ROW_NUMBER() OVER (PARTITION BY i.telephone ORDER BY i.dateofcall DESC) AS 'RANK' ,
COUNT(i.telephone) OVER (PARTITION BY i.telephone) AS 'MAXCOUNT',
SUM(CASE WHEN s.accounttype LIKE 'R' THEN 1 ELSE 2 END) OVER (PARTITION BY i.telephone) as Value
FROM
myactions i LEFT JOIN myphone s
ON s.interactionID = i.Interactionid
) x
WHERE [RANK] = [MAXCOUNT]
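The arithmetic behind the Value column is easier to follow outside SQL: each 'R' row scores 1 and every other row scores 2, so the sum equals MAXCOUNT when all rows are 'R', equals 2 * MAXCOUNT when none are, and falls strictly between the two for a mix. The function below is a plain Python restatement of that test; the name merged_account_type is illustrative and the codes are assumed to be upper case.

```python
def merged_account_type(account_types):
    """Classify one telephone's rows the way the Value/MAXCOUNT test does."""
    maxcount = len(account_types)
    # 'R' scores 1 and anything else scores 2, as in the SUM(CASE ...) column.
    value = sum(1 if t == "R" else 2 for t in account_types)
    if maxcount > 1 and maxcount < value < 2 * maxcount:
        return "B"           # a mix of 'R' and non-'R' rows
    if maxcount > 1 and value == 2 * maxcount:
        return "C"           # every row scored 2
    if maxcount > 1 and value == maxcount:
        return "R"           # every row scored 1
    return account_types[0]  # single row: keep its own type

print(merged_account_type(["R", "C"]))
print(merged_account_type(["R", "R"]))
```

Because anything other than 'R' scores 2, a number whose rows carry some third code, or no account type at all after the LEFT JOIN, would also be reported as 'C'. That may or may not be what is wanted.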