First Contact Resolution - sql

I am using Microsoft SQL Server Management Studio
I have a table which contains a unique customerId, date when the contact was made and reason why the contact was made.
customerId
,DateOfContact
,ContactReason
I need to create two Yes (Y) or No (N) columns.
Column 1 named 7DayContact
Column 2 named 7DaySameContact
Column 1 should provide me with a Y or N if the same customerId had another contact in the previous 7 days (interval of 7 days)
Column 2 should provide me with a Y or N if the same customerId had another contact in the previous 7 days with the same contactReason.
How should I go about it?
I didn't manage to do anything.

What I did for the first scenario is the below query, however, all results are showing as 'N' whilst there are results that should show as 'Y'.
Basically, I want to look at the con.CustomerId, go back 7 days and check whether the same con.CustomerId shows up. If the same con.CustomerID shows up, then give 'Y' else 'N'.
SELECT
con.[CustomerID]
,con.[ContactDate]
,con.[ContactReason1]
,con.[ContactReason2]
,con.[ContactReason3]
,CASE
WHEN LAG(con.[ContactDate], 1) OVER (PARTITION BY con.[CustomerID] ORDER BY con.[ContactDate]) BETWEEN con.[ContactDate] AND DATEADD(DAY, -7, con.[ContactDate])
THEN 'Y' ELSE 'N'
END AS '7DayContact'
FROM [DWH_Unknown1].[unknown2].[unknown3] con
Order By disp.DispositionDate DESC
For the second scenario I want to look at the con.CustomerId and con.ContactReason2, go back 7 days and check whether the same con.CustomerId having the same con.ContactReason2 shows up. If the same con.CustomerID having the same con.ContactReason2 shows up, then give 'Y' else 'N'.

Please note that you have two different tags mysql <> sql-server.
I used SQL Server syntax; the LAG function is the best fit here:
SELECT customerId, DateOfContact, ContactReason,
-- previous contact of the same customer, flagged if it falls within the last 7 days
CASE WHEN LAG(DateOfContact, 1) OVER (PARTITION BY customerId ORDER BY DateOfContact) >= DATEADD(DAY, -7, DateOfContact) THEN 'Y' ELSE 'N' END AS [7DayContact],
-- partition by reason as well, so the previous row is the previous contact with the same reason
CASE WHEN LAG(DateOfContact, 1) OVER (PARTITION BY customerId, ContactReason ORDER BY DateOfContact) >= DATEADD(DAY, -7, DateOfContact) THEN 'Y' ELSE 'N' END AS [7DaySameContact]
FROM yourTable
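To see the LAG approach on a few rows, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite is swapped in only to make it runnable anywhere, so julianday() differences stand in for DATEADD; the table name contacts and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (customerId INTEGER, DateOfContact TEXT, ContactReason TEXT)"
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        (1, "2023-01-01", "billing"),   # first contact for customer 1
        (1, "2023-01-05", "delivery"),  # 4 days after a contact, new reason
        (1, "2023-01-06", "billing"),   # 5 days after the previous billing contact
        (1, "2023-02-01", "billing"),   # more than 7 days after anything
        (2, "2023-01-03", "billing"),   # first contact for customer 2
    ],
)

query = """
SELECT customerId, DateOfContact, ContactReason,
       -- days since the customer's previous contact (NULL for the first one)
       CASE WHEN julianday(DateOfContact)
                 - julianday(LAG(DateOfContact) OVER (
                       PARTITION BY customerId ORDER BY DateOfContact)) <= 7
            THEN 'Y' ELSE 'N' END AS "7DayContact",
       -- same check, but only against contacts with the same reason
       CASE WHEN julianday(DateOfContact)
                 - julianday(LAG(DateOfContact) OVER (
                       PARTITION BY customerId, ContactReason ORDER BY DateOfContact)) <= 7
            THEN 'Y' ELSE 'N' END AS "7DaySameContact"
FROM contacts
ORDER BY customerId, DateOfContact
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Checking only the immediately preceding row is enough here: if any contact falls in the previous 7 days, the closest earlier one does too.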

Related

Improve CASE WHEN Performance

I want to calculate customer retention week over week. My sales_orders table has columns order_date, and customer_name. Basically I want to check if a customer in this week also had an order the previous week. To do this, I have used CASE WHEN and subquery as follows (I have extracted order_week in a cte I've called weekly_customers and gotten distinct customer names within each week):
SELECT wc.order_week,
wc.customer,
CASE
WHEN wc.customer IN (
SELECT sq.customer
FROM weekly_customers sq
WHERE sq.order_week = (wc.order_week - 1))
THEN 'YES'
ELSE 'NO'
END AS present_in_previous_week
from weekly_customers wc
The query returns the correct data. My issue, the table is really huge with about 15000 distinct weekly values. This obviously leads to very long execution time. Is there a way I can improve this loop or even an alternative to the loop altogether?
Something like this:
SELECT
wca.order_week,
wca.customer,
CASE WHEN wcb.customer IS NOT NULL THEN 'YES' ELSE 'NO' END AS present_in_previous_week
FROM weekly_customers AS wca
LEFT JOIN
weekly_customers AS wcb
ON
wca.customer = wcb.customer
AND wca.order_week - 1 = wcb.order_week
This joins all of the customer data onto the customer data from a week ago. If there is a record for a week ago then wcb.customer will not be null, and we can set the flag to "YES". Otherwise, we set the flag to "NO".
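A small runnable sketch of that self-join, using Python's sqlite3 module with made-up rows. It assumes, as the question states, that weekly_customers already holds one row per customer per week; with duplicates the join would multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weekly_customers (order_week INTEGER, customer TEXT)")
# Hypothetical data: alice orders in weeks 1, 2 and 3; bob only in weeks 1 and 3.
conn.executemany(
    "INSERT INTO weekly_customers VALUES (?, ?)",
    [(1, "alice"), (1, "bob"), (2, "alice"), (3, "bob"), (3, "alice")],
)

query = """
SELECT wca.order_week,
       wca.customer,
       CASE WHEN wcb.customer IS NOT NULL THEN 'YES' ELSE 'NO' END
           AS present_in_previous_week
FROM weekly_customers AS wca
LEFT JOIN weekly_customers AS wcb
       ON wca.customer = wcb.customer
      AND wca.order_week - 1 = wcb.order_week   -- same customer, one week earlier
ORDER BY wca.order_week, wca.customer
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

An index on (customer, order_week) would likely help the join on a large table, though that depends on the engine and the data.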

Find duplicates using parcel number and updated their status. SQL Server 2014

I have a problem with duplicate records in a SQL Server 2014 database.
Users get a small postcard with a parcel number printed on them.
The postcard also shows a link to a simple form that they can use, to register their parcel.
The form unfortunately does not have any type of validation, to ensure that the same parcel does not get submitted more than once.
I currently have no control on the web form, and I am not sure how long will take for the responsible team to implement validation on it.
So I have to come up with a routine to deactivate the duplicate records, and keep only one.
This has to be a query that processes records in bulk; no tokens are passed to the routine.
When the web form gets submitted, it creates a record id in sequential order and assigns an application status of 'Registered'.
I think the way to correct this is to keep the record with the highest record id per parcel and deactivate the rest:
Deactivate all but the most recent record by setting RECORD_STATUS to 'I'
Set APPLICATION_STATUS to 'Closed' on all but the most recent record
The query I use, returns 4 columns: Record Id, Parcel Number, Record Status, and Application Status
SELECT
B.[RECORD_ID],
B.[PARCEL_NBR],
B.[RECORD_STATUS], -- The value of this column would be "I" for the duplicate records.
B.[APPLICATION_STATUS]
FROM
A_TABLE A
INNER JOIN B_TABLE B
ON A.PARCEL_NBR = B.PARCEL_NBR
AND (A.APPLICATION_STATUS IS NULL
OR B.APPLICATION_STATUS = 'Registered');
Initial Output:
RECORD_ID PARCEL_NBR RECORD_STATUS APPLICATION_STATUS
REC-00081 0608012098 A Registered
REC-00082 0608012098 A Registered
REC-00083 0608012098 A Registered
Expected Output:
RECORD_ID PARCEL_NBR RECORD_STATUS APPLICATION_STATUS
REC-00081 0608012098 I Closed - this record got updated
REC-00082 0608012098 I Closed - this record got updated
REC-00083 0608012098 A Registered
I think that perhaps a cursor might be part of the solution? Honestly I am not sure. I kindly ask for your help.
You can use window functions and case logic:
SELECT B.[RECORD_ID], B.[PARCEL_NBR],
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY B.PARCEL_NBR ORDER BY B.RECORD_ID DESC) > 1
THEN 'I' ELSE B.[RECORD_STATUS]
END) as RECORD_STATUS,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY B.PARCEL_NBR ORDER BY B.RECORD_ID DESC) > 1
THEN 'Closed' ELSE B.APPLICATION_STATUS
END) as APPLICATION_STATUS
FROM A_TABLE A JOIN
B_TABLE B
ON A.PARCEL_NBR = B.PARCEL_NBR AND
(A.APPLICATION_STATUS IS NULL OR B.APPLICATION_STATUS = 'Registered');
I'm not sure what role A_TABLE plays in this, but this may give you what you want:
update B_TABLE
set record_Status = 'I'
, application_status = 'Closed'
where record_status = 'A'
and application_status = 'Registered'
and record_id <> (select max(record_id)
from B_TABLE b
where b.parcel_nbr = B_TABLE.parcel_nbr
and b.record_status = 'A'
and b.application_status = 'Registered');
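Here is a minimal sketch of that UPDATE run against the question's sample rows, using an in-memory SQLite database from Python rather than SQL Server 2014. The extra parcel 0608012555 is made up to show that a parcel with a single record is left alone, and A_TABLE is left out because its role is unclear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE B_TABLE (RECORD_ID TEXT, PARCEL_NBR TEXT, "
    "RECORD_STATUS TEXT, APPLICATION_STATUS TEXT)"
)
conn.executemany(
    "INSERT INTO B_TABLE VALUES (?, ?, 'A', 'Registered')",
    [
        ("REC-00081", "0608012098"),
        ("REC-00082", "0608012098"),
        ("REC-00083", "0608012098"),
        ("REC-00090", "0608012555"),  # hypothetical parcel with no duplicates
    ],
)

# Close every active, registered record except the highest RECORD_ID per parcel.
# MAX() compares the ids as text, which works here because they are fixed width.
conn.execute("""
UPDATE B_TABLE
SET RECORD_STATUS = 'I',
    APPLICATION_STATUS = 'Closed'
WHERE RECORD_STATUS = 'A'
  AND APPLICATION_STATUS = 'Registered'
  AND RECORD_ID <> (SELECT MAX(b.RECORD_ID)
                    FROM B_TABLE AS b
                    WHERE b.PARCEL_NBR = B_TABLE.PARCEL_NBR
                      AND b.RECORD_STATUS = 'A'
                      AND b.APPLICATION_STATUS = 'Registered')
""")

rows = conn.execute(
    "SELECT RECORD_ID, RECORD_STATUS, APPLICATION_STATUS FROM B_TABLE ORDER BY RECORD_ID"
).fetchall()
for row in rows:
    print(row)
```

No cursor is needed; the set-based UPDATE handles all parcels in one statement.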

JOIN other table only if condition is true for ALL joined rows

I have two tables I'm trying to conditionally JOIN.
dbo.Users looks like this:
UserID
------
24525
5425
7676
dbo.TelemarketingCallAudits looks like this (date format dd/mm/yyyy):
UserID Date CampaignID
------ ---------- ----------
24525 21/01/2018 1
24525 26/08/2018 1
24525 17/02/2018 1
24525 12/01/2017 2
5425 22/01/2018 1
7676 16/11/2017 2
I'd like to return a table that contains ONLY users that I called at least 30 days ago (if CampaignID=1) and at least 70 days ago (if CampaignID=2).
The end result should look like this (today is 02/09/18):
UserID Date CampaignID
------ ---------- ----------
5425 22/01/2018 1
7676 16/11/2017 2
Note that because I called user 24525 with Campaign 1 only 7 days ago, that user should not appear at all.
I tried a simple AND/OR condition, but it still returns users I shouldn't see: they have rows for other, older calls that satisfy the condition, so the recent call is simply ignored, which defeats the purpose.
I have no idea on how to condition the overall appearance of the user if ANY of his associated rows in the second table did not meet the condition.
AND
(
internal_TelemarketingCallAudits.CallAuditID IS NULL --No telemarketing calls is fine
OR
(
internal_TelemarketingCallAudits.CampaignID = 1 --Campaign 1
AND
DATEADD(dd, 75, MAX(internal_TelemarketingCallAudits.Date)) < GETDATE() --Last call occurred at least 75 days ago
)
OR
(
internal_TelemarketingCallAudits.CampaignID != 1 --Other campaigns
AND
DATEADD(dd, 10, MAX(internal_TelemarketingCallAudits.Date)) < GETDATE() --Last call occurred at least 10 days ago
)
)
I really appreciate your help.
Try this: SQL Fiddle
select *
from dbo.Users u
inner join ( --get the most recent call per user (taking into account different campaign timescales)
select tca.UserId
, tca.CampaignId
, tca.[Date]
, case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end LastCalledInWindow
, row_number() over (partition by tca.UserId order by case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r
from dbo.TelemarketingCallAudits tca
inner join (
values (1, 60)
, (2, 70)
) c (CampaignId, DaysSinceLastCall)
on tca.CampaignId = c.CampaignId
) mrc
on mrc.UserId = u.UserId
and mrc.r = 1 --only accept the most recent call
and mrc.LastCalledInWindow = 0 --only include if they haven't been contacted in the last x days
I'm not comparing all rows here. Rather, I saw that you're interested in when the most recent call was, and then only care whether that call is in the X day window. There's a bit of additional complexity because X varies by campaign, so it's not the most recent call you care about so much as the call most likely to fall within its window. To get around that, I sort each user's calls so that those in the window come first, followed by those which aren't, then sort by most recent first within those 2 groups. This gives me the field r.
By filtering on r = 1 for each user, we only get the most recent call (adjusted for campaign windows). By filtering on LastCalledInWindow = 0 we exclude those who have been called within the campaign's window.
NB: I've used an inner query (aliased c) to hold the campaign ids and their corresponding windows. In reality you'd probably want a campaigns table holding that same information instead of coding inside the query itself.
Hopefully everything else is self-explanatory; but give me a nudge in the comments if you need any further information.
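For anyone who wants to try the query without the fiddle, this is a rough sketch of the same logic against the question's sample data, run on SQLite from Python. Assumptions: "today" is pinned to 02/09/2018 instead of calling getutcdate(), the date column is renamed CallDate and stored as ISO text, julianday() arithmetic replaces DateAdd, and the windows are the 60 and 70 days used in the query above.

```python
import sqlite3

TODAY = "2018-09-02"  # fixed reference date from the question, for reproducibility

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (UserID INTEGER);
CREATE TABLE TelemarketingCallAudits (UserID INTEGER, CallDate TEXT, CampaignID INTEGER);
INSERT INTO Users VALUES (24525), (5425), (7676);
INSERT INTO TelemarketingCallAudits VALUES
    (24525, '2018-01-21', 1),
    (24525, '2018-08-26', 1),
    (24525, '2018-02-17', 1),
    (24525, '2017-01-12', 2),
    (5425,  '2018-01-22', 1),
    (7676,  '2017-11-16', 2);
""")

query = """
WITH campaign_windows(CampaignID, DaysSinceLastCall) AS (VALUES (1, 60), (2, 70)),
ranked AS (
    SELECT tca.UserID, tca.CampaignID, tca.CallDate,
           -- 1 if the call is still inside its campaign's window, else 0
           CASE WHEN julianday(tca.CallDate) + w.DaysSinceLastCall > julianday(:today)
                THEN 1 ELSE 0 END AS LastCalledInWindow,
           -- in-window calls first, then most recent first
           ROW_NUMBER() OVER (
               PARTITION BY tca.UserID
               ORDER BY CASE WHEN julianday(tca.CallDate) + w.DaysSinceLastCall
                                  > julianday(:today)
                             THEN 1 ELSE 0 END DESC,
                        tca.CallDate DESC) AS r
    FROM TelemarketingCallAudits AS tca
    JOIN campaign_windows AS w ON w.CampaignID = tca.CampaignID
)
SELECT u.UserID, ranked.CallDate, ranked.CampaignID
FROM Users AS u
JOIN ranked ON ranked.UserID = u.UserID
WHERE ranked.r = 1
  AND ranked.LastCalledInWindow = 0
ORDER BY u.UserID
"""
rows = conn.execute(query, {"today": TODAY}).fetchall()
for row in rows:
    print(row)
```

User 24525 drops out because the call on 26/08/2018 is ranked first and is still inside its window.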
UPDATE
Just realised you'd also said "no calls is fine"... Here's a tweaked version to allow for scenarios where the person has not been called.
SQL Fiddle Example.
select *
from dbo.Users u
left outer join ( --get the most recent call per user (taking into account different campaign timescales)
select tca.UserId
, tca.CampaignId
, tca.[Date]
, case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end LastCalledInWindow
, row_number() over (partition by tca.UserId order by case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r
from dbo.TelemarketingCallAudits tca
inner join (
values (1, 60)
, (2, 70)
) c (CampaignId, DaysSinceLastCall)
on tca.CampaignId = c.CampaignId
) mrc
on mrc.UserId = u.UserId
where
(
mrc.r = 1 --only accept the most recent call
and mrc.LastCalledInWindow = 0 --only include if they haven't been contacted in the last x days
)
or mrc.r is null --no calls at all
Update: Including a default campaign offset
To include a default, you could do something like the code below (SQL Fiddle Example). Here, I've put each campaign's offset value in the Campaigns table, but created a default campaign with ID = -1 to handle anything for which there is no offset defined. I use a left join between the audit table and the campaigns table so that we get all records from the audit table, regardless of whether there's a campaign defined, then a cross join to get the default campaign. Finally, I use a coalesce to say "if the campaign isn't defined, use the default campaign".
select *
from dbo.Users u
left outer join ( --get the most recent call per user (taking into account different campaign timescales)
select tca.UserId
, tca.CampaignId
, tca.[Date]
, case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end LastCalledInWindow
, row_number() over (partition by tca.UserId order by case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r
from dbo.TelemarketingCallAudits tca
left outer join Campaigns c
on tca.CampaignId = c.CampaignId
cross join Campaigns dflt
where dflt.CampaignId = -1
) mrc
on mrc.UserId = u.UserId
where
(
mrc.r = 1 --only accept the most recent call
and mrc.LastCalledInWindow = 0 --only include if they haven't been contacted in the last x days
)
or mrc.r is null --no calls at all
That said, I'd recommend not using a default, but rather ensuring that every campaign has an offset defined. i.e. Presumably you already have a campaigns table; and since this offset value is defined per campaign, you can include a field in that table for holding this offset. Rather than leaving this as null for some records, you could set it to your default value; thus simplifying the logic / avoiding potential issues elsewhere where that value may subsequently be used.
You'd also asked about the order by clause. There is no order by 1/0; so I assume that's a typo. Rather the full statement is row_number() over (partition by tca.UserId order by case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r.
The purpose of this piece is to find the "most important" call for each user. By "most important" I basically mean the most recent, since that's generally what we're after; though there's one caveat. If a user is part of 2 campaigns, one with an offset of 30 days and one with an offset of 60 days, they may have had 2 calls, one 32 days ago and one 38 days ago. Though the call from 32 days ago is more recent, if that's on the campaign with the 30 day offset it's outside the window, whilst the older call from 38 days ago may be on the campaign with an offset of 60 days, meaning that it's within the window, so is more of interest (i.e. this user has been called within a campaign window).
Given the above requirement, here's how this code meets it:
row_number() produces a number from 1, counting up, for each row in the (sub)query's results. The counter is reset to 1 for each partition
partition by tca.UserId says that we're partitioning by the user id; so for each user there will be 1 row for which row_number() returns 1, then for each additional row for that user there will be a consecutive number returned.
The order by part of this statement defines which of each user's rows gets #1, then how the numbers progress thereafter; i.e. the first row according to the order by gets number 1, the next number 2, etc.
case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end returns 1 for calls within their campaign's window, and 0 for those outside of the window. Since we're ordering by this result in descending order, any records within their campaign's window are returned before any outside of it.
we then order by tca.[Date] desc; i.e. the more recent calls are returned before the earlier calls.
finally, we name the output of this row number as r and in the outer query filter on r = 1; meaning that for each user we only take one row, and that's the first row according to the order criteria above; i.e. if there's a row in its campaign's window we take that, after which it's whichever call was most recent (within those in the window if there were any; then outside that window if there weren't).
Take a look at the output of the subquery to get a better idea of exactly how this works: SQL Fiddle
I hope that explanation makes some sense / helps you to understand the code? Sadly I can't find a way to explain it more concisely than the code itself does; so if it doesn't make sense try playing with the code and seeing how that affects the output to see if that helps your understanding.
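If it helps, here is the two-campaign example from above (calls 32 and 38 days ago, offsets of 30 and 60 days) as a small sketch you can run. It uses SQLite from Python, a hypothetical Campaigns table, a column named CallDate, and a fixed reference date in place of getutcdate(); it only shows the output of the inner query.

```python
import sqlite3
from datetime import date, timedelta

today = date(2018, 9, 2)  # arbitrary fixed "now" so the result is reproducible
call_32 = (today - timedelta(days=32)).isoformat()  # campaign 1, offset 30: outside
call_38 = (today - timedelta(days=38)).isoformat()  # campaign 2, offset 60: inside

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Campaigns (CampaignId INTEGER, DaysSinceLastCall INTEGER);
CREATE TABLE TelemarketingCallAudits (UserId INTEGER, CallDate TEXT, CampaignId INTEGER);
INSERT INTO Campaigns VALUES (1, 30), (2, 60);
""")
conn.executemany(
    "INSERT INTO TelemarketingCallAudits VALUES (?, ?, ?)",
    [(1, call_32, 1), (1, call_38, 2)],
)

query = """
SELECT tca.UserId, tca.CampaignId, tca.CallDate,
       CASE WHEN julianday(tca.CallDate) + c.DaysSinceLastCall > julianday(:today)
            THEN 1 ELSE 0 END AS LastCalledInWindow,
       ROW_NUMBER() OVER (
           PARTITION BY tca.UserId
           ORDER BY CASE WHEN julianday(tca.CallDate) + c.DaysSinceLastCall
                              > julianday(:today)
                         THEN 1 ELSE 0 END DESC,   -- in-window calls first
                    tca.CallDate DESC              -- then most recent first
       ) AS r
FROM TelemarketingCallAudits AS tca
JOIN Campaigns AS c ON c.CampaignId = tca.CampaignId
ORDER BY tca.UserId, r
"""
rows = conn.execute(query, {"today": today.isoformat()}).fetchall()
for row in rows:
    print(row)
```

The older call gets r = 1 because it is still inside its 60 day window, so the outer filter on LastCalledInWindow = 0 would exclude this user.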

I want to update values in column, based on condition that need to compare data from another table

I need help with one case in SQL: I have to fill the column DIFFERENCE in table CLIENTS with 'Above' or 'Below', depending on whether the date in a column of another table, DOCUMENT, is more or less than 4 months from now. I tried this:
UPDATE CLIENTS
SET DIFFERENCE = CASE WHEN MONTHS_BETWEEN(TO_DATE((SELECT DATA FROM DOCUMENT, CLIENTS WHERE DOCUMENT.ID_CLIENT=CLIENTS.ID_CLIENT ),'DD.MM.YYYY'),TO_DATE(SYSDATE,'DD.MM.YYYY')) < 4 THEN 'Below' ELSE 'Above' END
but it returns lot of values, so I tried to JOIN the tables and
UPDATE CLIENTS
SET DIFFERENCE = CASE WHEN MONTHS_BETWEEN(TO_DATE(DATA,'DD.MM.YYYY'),TO_DATE(SYSDATE,'DD.MM.YYYY')) < 4 THEN 'Below' ELSE 'Above' END
FROM CLIENTS JOIN DOCUMENT
ON DOCUMENT.ID_CLIENT=CLIENTS.ID_CLIENT
but this time says Not properly ended.
I'm working with Oracle db.
Please if you see the answer, write me!
Thank you in advance!
SELECT CLIENTS.ID_CLIENT,MIN(DOCUMENT.DATA) AS "DATA"
FROM DOCUMENT,CLIENTS
WHERE CLIENTS.ID_CLIENT=DOCUMENT.ID_CLIENT
GROUP BY CLIENTS.ID_CLIENT
and some of the results:
ID_CLIENT DATA
54 01/23/2014
57 01/23/2014
78 01/23/2014
87 01/24/2014
91 01/24/2014
I found the solution,
UPDATE CLIENTS
SET DIFFERENCE = CASE WHEN MONTHS_BETWEEN(TO_DATE((SELECT MIN(DATA) FROM DOCUMENT, CLIENTS WHERE DOCUMENT.ID_CLIENT=CLIENTS.ID_CLIENT),'MM.DD.YYYY'),TO_DATE(SYSDATE,'MM.DD.YYYY')) < 4 THEN 'Below' ELSE 'Above' END
The mistake was 'MM.DD.YYYY' ... first I used 'DD.MM.YYYY' - very stupid mistake!
Thanks for all the answers! ekad YOU really helped me!!!
Instead of joining the tables, you need to check whether there's any related documents with DATA more than 4 months from now using EXISTS. It also seems that DOCUMENT.DATA is a varchar and the value is set using mm/dd/yyyy format, so you need to change the second parameter of TO_DATE function to MM/DD/YYYY
UPDATE CLIENTS
SET DIFFERENCE = CASE WHEN EXISTS
(SELECT 1 FROM DOCUMENT
WHERE ID_CLIENT = CLIENTS.ID_CLIENT
AND MONTHS_BETWEEN(TO_DATE(DATA,'MM/DD/YYYY'),SYSDATE) > 4)
THEN 'Above'
ELSE 'Below' END
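The shape of that correlated EXISTS update can be tried out with the sketch below. It is not Oracle: it runs on SQLite from Python, stores DATA as ISO text instead of MM/DD/YYYY strings, pins "now" to a made-up date, and approximates MONTHS_BETWEEN(...) > 4 by comparing against that date plus 4 months. The sample rows are invented.

```python
import sqlite3

TODAY = "2014-01-01"  # hypothetical stand-in for SYSDATE

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CLIENTS (ID_CLIENT INTEGER, DIFFERENCE TEXT);
CREATE TABLE DOCUMENT (ID_CLIENT INTEGER, DATA TEXT);
INSERT INTO CLIENTS (ID_CLIENT) VALUES (54), (57), (78);
INSERT INTO DOCUMENT VALUES
    (54, '2014-01-23'),
    (54, '2014-09-30'),   -- more than 4 months after TODAY
    (57, '2014-01-23');   -- client 78 has no documents at all
""")

# One pass over CLIENTS; the subquery is evaluated per client row.
conn.execute(
    """
    UPDATE CLIENTS
    SET DIFFERENCE = CASE WHEN EXISTS (
            SELECT 1
            FROM DOCUMENT
            WHERE DOCUMENT.ID_CLIENT = CLIENTS.ID_CLIENT
              AND DOCUMENT.DATA > date(:today, '+4 months'))
        THEN 'Above' ELSE 'Below' END
    """,
    {"today": TODAY},
)

rows = conn.execute(
    "SELECT ID_CLIENT, DIFFERENCE FROM CLIENTS ORDER BY ID_CLIENT"
).fetchall()
for row in rows:
    print(row)
```

Because EXISTS returns a single true/false per client, it avoids the "returns a lot of values" problem of the scalar subquery in the question.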

SQL - 2 table values to be grouped by third unconnected value

I want to create a graph that pulls data from 2 user questions generated from within an SQL database.
The issue is that the user questions are stored in the same table, as are the answers. The only connection is that the question string includes a year value, which I extract using the LEFT command so that I output a column called 'YEAR' with a list of integer values running from 2013 to 2038 (25 year period).
I then want to pull the corresponding answers ('forecast' and 'actual') from each 'YEAR' so that I can plot a graph with a couple of values from each year (sorry if this isn't making any sense). The graph should show a forecast line covering the 25 year period with a second line (or column) showing the actual value as it gets populated over the years. I'll then be able to visualise if our actual value is close to our original forecast figures (long term goal!)
CODE BELOW
SELECT CAST((LEFT(F_TASK_ANS.TA_ANS_QUESTION,4)) AS INTEGER) AS YEAR,
-- first select takes the left 4 characters of the question as a string, then converts the value to a whole number.
CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%forecast' THEN F_TASK_ANS.TA_ANS_ANSWER END) AS NUMERIC(9,2)) AS 'FORECAST',
CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%actual' THEN ISNULL(F_TASK_ANS.TA_ANS_ANSWER,0) END) AS NUMERIC(9,2)) AS 'ACTUAL'
-- actual value will be null until filled in each year therefore ISNULL added to replace null with 0.00.
FROM F_TASK_ANS INNER JOIN F_TASKS ON F_TASK_ANS.TA_ANS_FKEY_TA_SEQ = F_TASKS.TA_SEQ
WHERE TA_ANS_ANSWER <> ''
AND (TA_TASK_ID LIKE '%6051' OR TA_TASK_ID LIKE '%6052')
-- The two numbers above refer to separate PPM questions that the user enters a value into
I tried GROUP BY 'YEAR' but I get an
Error: Each GROUP BY expression must contain at least one column that
is not an outer reference - which I assume is because I haven't linked
the 2 tables in any way...
Should I be adding a UNION so the tables are joined?
What I want to see is something like the following output (which I'll graph up later)
YEAR FORECAST ACTUAL
2013 135000 127331
2014 143000 145102
2015 149000 0
2016 158000 0
2017 161000 0
2018... etc
Any help or guidance would be hugely appreciated.
Thanks
Although the syntax is pretty hairy, this seems like a fairly simple query. You are in fact linking your two tables (with the JOIN statement) and you don't need a UNION.
Try something like this (using a common table expression, or CTE, to make the grouping clearer, and changing the syntax for slightly greater clarity):
WITH data
AS (
SELECT YEAR = CAST((LEFT(A.TA_ANS_QUESTION,4)) AS INTEGER)
, FORECAST = CASE WHEN A.TA_ANS_QUESTION LIKE '%forecast'
THEN CONVERT(NUMERIC(9,2), A.TA_ANS_ANSWER)
ELSE CONVERT(NUMERIC(9,2), 0)
END
, ACTUAL = CASE WHEN A.TA_ANS_QUESTION LIKE '%actual'
THEN CONVERT(NUMERIC(9,2), ISNULL(A.TA_ANS_ANSWER,0) )
ELSE CONVERT(NUMERIC(9,2), 0)
END
FROM F_TASK_ANS A
INNER JOIN F_TASKS T
ON A.TA_ANS_FKEY_TA_SEQ = T.TA_SEQ
-- It sounded like you wanted to include the ones where the answer was null. If
-- that's wrong, get rid of the test for NULL.
WHERE (A.TA_ANS_ANSWER <> '' OR A.TA_ANS_ANSWER IS NULL)
AND (TA_TASK_ID LIKE '%6051' OR TA_TASK_ID LIKE '%6052')
)
SELECT YEAR
, FORECAST = SUM(data.Forecast)
, ACTUAL = SUM(data.Actual)
FROM data
GROUP BY YEAR
ORDER BY YEAR
Try something like this ...
SELECT CAST((LEFT(F_TASK_ANS.TA_ANS_QUESTION,4)) AS INT) AS [YEAR]
,SUM(CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%forecast'
THEN F_TASK_ANS.TA_ANS_ANSWER ELSE 0 END) AS NUMERIC(9,2))) AS [FORECAST]
,SUM(CAST((CASE WHEN F_TASK_ANS.TA_ANS_QUESTION LIKE '%actual'
THEN F_TASK_ANS.TA_ANS_ANSWER ELSE 0 END) AS NUMERIC(9,2))) AS [ACTUAL]
FROM F_TASK_ANS INNER JOIN F_TASKS
ON F_TASK_ANS.TA_ANS_FKEY_TA_SEQ = F_TASKS.TA_SEQ
WHERE TA_ANS_ANSWER <> ''
AND (TA_TASK_ID LIKE '%6051' OR TA_TASK_ID LIKE '%6052')
GROUP BY CAST((LEFT(F_TASK_ANS.TA_ANS_QUESTION,4)) AS INT)
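Both answers rely on the same conditional aggregation idea: turn each row into a forecast value or an actual value (0 otherwise), then SUM per year. A minimal sketch on SQLite from Python, with a single made-up table answers standing in for the F_TASK_ANS / F_TASKS join, and substr() in place of LEFT:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE answers (question TEXT, answer TEXT)")
# Hypothetical rows; the question text starts with the year and ends with the kind.
conn.executemany(
    "INSERT INTO answers VALUES (?, ?)",
    [
        ("2013 forecast", "135000"), ("2013 actual", "127331"),
        ("2014 forecast", "143000"), ("2014 actual", "145102"),
        ("2015 forecast", "149000"), ("2015 actual", None),  # not filled in yet
    ],
)

query = """
SELECT CAST(substr(question, 1, 4) AS INTEGER) AS year,
       SUM(CASE WHEN question LIKE '%forecast'
                THEN CAST(answer AS REAL) ELSE 0 END) AS forecast,
       SUM(CASE WHEN question LIKE '%actual'
                THEN COALESCE(CAST(answer AS REAL), 0) ELSE 0 END) AS actual
FROM answers
GROUP BY CAST(substr(question, 1, 4) AS INTEGER)   -- group by the expression, not the alias
ORDER BY year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Grouping by the expression rather than by the string literal 'YEAR' is what avoids the "outer reference" error from the question.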