Need some help in creating a query in SQL? - sql

I have the following two tables:
Ticket(ID, Problem, Status,Priority, LoggedTime,CustomerID*, ProductID*);
TicketUpdate(ID,Message, UpdateTime,TickedID*,StaffID*);
Here is a question to be answered:
Close all support tickets which have not been updated for at least 24 hours. This will be records that have received at least one update from a staff member and no further updates from the customer (or staff member) for at least 24 hours.
My query is:
UPDATE Ticket SET Status = 'closed' FROM TicketUpdate
WHERE(LoggedTime - MAX(UpdateTime))> 24
AND Ticket.ID = TicketUpdate.TicketID;
When I run this query on mysql it says that "<" does not exist.
Can you tell me whether my query is right for finding the records that have not been updated for at least 24 hours and, if it is, what I should use instead of "<"?

... records that have received at least one update from a staff member and
no further updates from the customer (or staff member) for at least 24
hours.
So, effectively, the last update must have been done by a staff member and be older than 24 hours. That covers it all.
(BTW, you have a typo: TickedID -> I use ticketid here.)
UPDATE ticket t
SET status = 'closed'
FROM (
SELECT DISTINCT ON (1)
ticketid
,first_value(updatetime) OVER w AS last_up
,first_value(staffid) OVER w AS staffid
FROM ticketupdate
-- you could join back to ticket here and eliminate 'closed' ids right away
WINDOW w AS (PARTITION BY ticketid ORDER BY updateTime DESC)
) tu
WHERE tu.ticketid = t.id
AND tu.last_up < (now()::timestamp - interval '24 hours')
AND tu.staffid > 1 -- whatever signifies "update from a staff member"
AND t.status IS DISTINCT FROM 'closed'; -- to avoid pointless updates
Note that PostgreSQL folds identifiers to lower case if they are not double-quoted. I advise staying away from mixed-case identifiers to begin with.
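The rule above (the last update came from staff and is older than 24 hours) can be tried out in a small sketch. It uses Python's built-in sqlite3 instead of PostgreSQL, so the query is rewritten with EXISTS / NOT EXISTS rather than UPDATE ... FROM and DISTINCT ON. The table layout, the sample rows, and the convention that a NULL staffid marks a customer update are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE ticketupdate (
        id INTEGER PRIMARY KEY,
        ticketid INTEGER,
        staffid INTEGER,      -- assumption: NULL means the customer wrote the update
        updatetime TEXT       -- 'YYYY-MM-DD HH:MM:SS', so string order = time order
    );
    INSERT INTO ticket VALUES (1, 'open'), (2, 'open'), (3, 'open');
    INSERT INTO ticketupdate (ticketid, staffid, updatetime) VALUES
        (1, 7,    '2024-01-08 09:00:00'),  -- staff, old, nothing after it
        (2, 7,    '2024-01-08 09:00:00'),  -- staff, old ...
        (2, NULL, '2024-01-08 10:00:00'),  -- ... but the customer replied later
        (3, 7,    '2024-01-10 08:00:00');  -- staff, but less than 24 hours ago
""")

# A fixed "now" keeps the example deterministic.
now = datetime(2024, 1, 10, 12, 0, 0)
cutoff = (now - timedelta(hours=24)).strftime("%Y-%m-%d %H:%M:%S")

cur = conn.execute(
    """
    UPDATE ticket
    SET status = 'closed'
    WHERE status <> 'closed'
      AND EXISTS (
          SELECT 1
          FROM ticketupdate u
          WHERE u.ticketid = ticket.id
            AND u.staffid IS NOT NULL        -- written by staff
            AND u.updatetime <= ?            -- at least 24 hours old
            AND NOT EXISTS (                 -- and it is the last update
                SELECT 1
                FROM ticketupdate later
                WHERE later.ticketid = u.ticketid
                  AND later.updatetime > u.updatetime))
    """,
    (cutoff,),
)
closed = cur.rowcount
statuses = dict(conn.execute("SELECT id, status FROM ticket ORDER BY id"))
print(closed, statuses)
```

Only ticket 1 is closed here: ticket 2 ends with a customer update and ticket 3 was updated too recently.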

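If you are working with PostgreSQL then this should work (the aggregate has to be computed in a subquery, because MAX() is not allowed directly in WHERE):
UPDATE Ticket SET Status = 'closed'
FROM (SELECT TicketID, MAX(UpdateTime) AS LastUpdate
FROM TicketUpdate GROUP BY TicketID) tu
WHERE Ticket.ID = tu.TicketID
AND extract(epoch from now() - tu.LastUpdate) > 24 * 3600;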

Related

Improve CASE WHEN Performance

I want to calculate customer retention week over week. My sales_orders table has columns order_date, and customer_name. Basically I want to check if a customer in this week also had an order the previous week. To do this, I have used CASE WHEN and subquery as follows (I have extracted order_week in a cte I've called weekly_customers and gotten distinct customer names within each week):
SELECT wc.order_week,
wc.customer,
CASE
WHEN wc.customer IN (
SELECT sq.customer
FROM weekly_customers sq
WHERE sq.order_week = (wc.order_week - 1))
THEN 'YES'
ELSE 'NO'
END AS present_in_previous_week
from weekly_customers wc
The query returns the correct data. My issue is that the table is really huge, with about 15000 distinct weekly values, which obviously leads to a very long execution time. Is there a way I can improve this loop, or even an alternative to the loop altogether?
Something like this:
SELECT
wca.order_week,
wca.customer,
CASE WHEN wcb.customer IS NOT NULL THEN 'YES' ELSE 'NO' END AS present_in_previous_week
FROM weekly_customers AS wca
LEFT JOIN
weekly_customers AS wcb
ON
wca.customer = wcb.customer
AND wca.order_week - 1 = wcb.order_week
This joins all of the customer data onto the customer data from a week ago. If there is a record for a week ago then wcb.customer will not be null, and we can set the flag to "YES". Otherwise, we set the flag to "NO".
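As a quick check of the self-join idea, here is a minimal sketch run through Python's sqlite3. The weekly_customers table is created directly with made-up rows (in the question it is a CTE), and order_week is assumed to be a plain integer week number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE weekly_customers (order_week INTEGER, customer TEXT);
    INSERT INTO weekly_customers VALUES
        (1, 'ann'), (2, 'ann'), (2, 'bob'), (3, 'bob'), (4, 'ann');
""")

retention = conn.execute("""
    SELECT wca.order_week,
           wca.customer,
           CASE WHEN wcb.customer IS NOT NULL THEN 'YES' ELSE 'NO' END
               AS present_in_previous_week
    FROM weekly_customers AS wca
    LEFT JOIN weekly_customers AS wcb
           ON wca.customer = wcb.customer
          AND wca.order_week - 1 = wcb.order_week
    ORDER BY wca.order_week, wca.customer
""").fetchall()

for row in retention:
    print(row)
```

Because the CTE already holds one row per customer per week, the join cannot multiply rows; each left-hand row matches at most one row from the previous week.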

Calculated column syntax when using a group by function Teradata

I'm trying to include a column calculated as a % of OTYPE.
IE
Order type | Status | volume of orders at each status | % of all orders at this status
SELECT
T.OTYPE,
STATUS_CD,
COUNT(STATUS_CD) AS STATVOL,
(STATVOL / COUNT(ROW_ID)) * 100
FROM Database.S_ORDER O
LEFT JOIN /* Finding definitions for status codes & attaching */
(
SELECT
ROW_ID AS TYPEJOIN,
"NAME" AS OTYPE
FROM database.S_ORDER_TYPE
) T
ON T.TYPEJOIN = ORDER_TYPE_ID
GROUP BY (T.OTYPE, STATUS_CD)
/*Excludes pending and pending online orders */
WHERE CAST(CREATED AS DATE) = '2018/09/21' AND STATUS_CD <> 'Pending'
AND STATUS_CD <> 'Pending-Online'
ORDER BY T.OTYPE, STATUS_CD DESC
OTYPE STATUS_CD STATVOL TOTALPERC
Add New Service Provisioning 2,740 100
Add New Service In-transit 13 100
Add New Service Error - Provisioning 568 100
Add New Service Error - Integration 1 100
Add New Service Complete 14,387 100
The current output just puts 100 on every line; I need it to be a % of total orders.
Could anyone help out a Teradata & SQL student?
The complication is that my understanding of the GROUP BY and COUNT syntax is tenuous. It took some fiddling to get it displayed as I have it, and I'm not sure how to introduce a calculated column within this combination.
Thanks in advance
There are a couple of places the total could be calculated, but this is the way I would do it. I also removed your other subquery, which was not required, put the WHERE clause before the GROUP BY, and changed the date to an unambiguous format (change it back if it causes an issue in Teradata).
SELECT
T."NAME" as OTYPE,
STATUS_CD,
COUNT(STATUS_CD) AS STATVOL,
COUNT(STATUS_CD)*100/TotalVol as Pct
FROM database.S_ORDER O
LEFT JOIN EDWPRDR_VW40_SBLCPY.S_ORDER_TYPE T on T.ROW_ID = ORDER_TYPE_ID
cross join (select count(*) as TotalVol from database.S_ORDER) Tot
WHERE CAST(CREATED AS DATE) = '2018-09-21' AND STATUS_CD <> 'Pending' AND STATUS_CD <> 'Pending-Online'
GROUP BY T."NAME", STATUS_CD, TotalVol
ORDER BY T."NAME", STATUS_CD DESC
A where clause comes before a group by clause, so the query shown in the question isn't valid.
Always prefix every column reference with the relevant table alias, below I have assumed that where you did not use the alias that it belongs to the orders table.
You probably do not need a subquery for this left join. While there are times when a subquery is needed or good for performance, this does not appear to be the case here.
Most modern SQL compliant databases provide "window functions", and Teradata does do this. They are extremely useful, and here when you combine count() with an over clause you can get the total of all rows without needing another subquery or join.
Because there is neither sample data nor expected result provided with the question I do not actually know which numbers you really need for your percentage calculation. Instead I have opted to show you different ways to count so that you can choose the right ones. I suspect you are getting 100 for each row because the count(status_cd) is equal to the count(row_id). You need to count status_cd differently to how you count row_id. nb: The count() function increases by 1 for every non-null value
I changed the way your date filter is applied. It is not efficient to change data on every row to suit constants in a where clause. Leave the data untouched and alter the way you apply the filter to suit the data, this is almost always more efficient (search sargable)
SELECT
t.OTYPE
, o.STATUS_CD
, COUNT(o.STATUS_CD) count_status
, COUNT(t.ROW_ID) count_row_id
, count(t.row_id) over() count_row_id_over
FROM dbo.S_ORDER o
LEFT JOIN dbo.S_ORDER_TYPE t ON t.TYPEJOIN = o.ORDER_TYPE_ID
/*Excludes pending and pending online orders */
WHERE o.CREATED >= '2018-09-21' AND o.CREATED < '2018-09-22'
AND o.STATUS_CD <> 'Pending'
AND o.STATUS_CD <> 'Pending-Online'
GROUP BY
t.OTYPE
, o.STATUS_CD
ORDER BY
t.OTYPE
, o.STATUS_CD DESC
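Two of the points above are easy to check on a toy data set: COUNT(column) skips NULLs (so it can differ from the row count after a LEFT JOIN), and a half-open range on the raw column selects exactly one day without casting. The sketch below uses Python's sqlite3 with hypothetical lowercase table names and invented rows; timestamps are stored as ISO strings, which is an assumption of this example and not how Teradata stores them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s_order_type (row_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE s_order (
        row_id INTEGER PRIMARY KEY,
        status_cd TEXT,
        order_type_id INTEGER,   -- NULL: no matching order type
        created TEXT             -- ISO timestamp string
    );
    INSERT INTO s_order_type VALUES (10, 'Add New Service');
    INSERT INTO s_order VALUES
        (1, 'Complete', 10,   '2018-09-21 08:00:00'),
        (2, 'Complete', NULL, '2018-09-21 23:59:59'),
        (3, 'Complete', 10,   '2018-09-22 00:00:00');  -- next day, filtered out
""")

counts = conn.execute("""
    SELECT COUNT(o.status_cd) AS count_status,  -- non-NULL status values
           COUNT(t.row_id)    AS count_row_id   -- only rows that found a type
    FROM s_order o
    LEFT JOIN s_order_type t ON t.row_id = o.order_type_id
    WHERE o.created >= '2018-09-21' AND o.created < '2018-09-22'
""").fetchone()

print(counts)
```

Two orders fall on 2018-09-21, but only one of them has a matching order type, so the two counts differ.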
As @TomC already noted, there's no need for the join to a derived table. The simplest way to get the percentage is based on a group sum. I also changed the date to a Standard SQL date literal and moved the WHERE before the GROUP BY.
SELECT
t."NAME",
o.STATUS_CD,
Count(o.STATUS_CD) AS STATVOL,
-- rule of thumb: multiply first then divide, otherwise you will get unexpected results
-- (Teradata rounds after each calculation)
100.00 * STATVOL / Sum(STATVOL) Over ()
FROM database.S_ORDER AS O
/* Finding definitions for status codes & attaching */
LEFT JOIN database.S_ORDER_TYPE AS t
ON t.ROW_ID = o.ORDER_TYPE_ID
/*Excludes pending and pending online orders */
-- if o.CREATED is a Timestamp there's no need to apply the CAST
WHERE Cast(o.CREATED AS DATE) = DATE '2018-09-21'
AND o.STATUS_CD NOT IN ('Pending', 'Pending-Online')
GROUP BY t."NAME", o.STATUS_CD
ORDER BY t."NAME", o.STATUS_CD DESC
Btw, you probably don't need an Outer Join, Inner should return the same result.
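The group-sum approach and the "multiply first, then divide" rule can be sketched with Python's sqlite3 (window functions need SQLite 3.25 or newer). The flattened s_order table and its rows are invented for this example. SQLite cannot reuse the STATVOL alias the way Teradata does, so the aggregate is repeated inside the window function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s_order (otype TEXT, status_cd TEXT);
    INSERT INTO s_order VALUES
        ('Add New Service', 'Complete'),
        ('Add New Service', 'Complete'),
        ('Add New Service', 'Complete'),
        ('Add New Service', 'Provisioning');
""")

# SUM(COUNT(*)) OVER () adds up the per-group counts across all groups,
# so every row is divided by the grand total rather than by its own count.
pct_rows = conn.execute("""
    SELECT otype,
           status_cd,
           COUNT(*) AS statvol,
           100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS pct
    FROM s_order
    GROUP BY otype, status_cd
    ORDER BY status_cd
""").fetchall()

# Integer arithmetic shows why the order of operations matters.
divide_first, multiply_first = conn.execute(
    "SELECT (3 / 4) * 100, 100 * 3 / 4"
).fetchone()

print(pct_rows)
print(divide_first, multiply_first)
```

Dividing first truncates 3 / 4 to 0 before the multiplication, which is the same kind of surprise the comment in the query warns about.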

JOIN other table only if condition is true for ALL joined rows

I have two tables I'm trying to conditionally JOIN.
dbo.Users looks like this:
UserID
------
24525
5425
7676
dbo.TelemarketingCallAudits looks like this (date format dd/mm/yyyy):
UserID Date CampaignID
------ ---------- ----------
24525 21/01/2018 1
24525 26/08/2018 1
24525 17/02/2018 1
24525 12/01/2017 2
5425 22/01/2018 1
7676 16/11/2017 2
I'd like to return a table that contains ONLY users that I called at least 30 days ago (if CampaignID=1) and at least 70 days ago (if CampaignID=2).
The end result should look like this (today is 02/09/18):
UserID Date CampaignID
------ ---------- ----------
5425 22/01/2018 1
7676 16/11/2017 2
Note that because I called user 24525 with Campaign 1 only 7 days ago, that user should not appear at all.
I tried the simple AND/OR condition below, but it still returns users I shouldn't see: they have other rows for older calls, and the condition simply ignores the recent calls, which obviously misses the goal.
I have no idea how to exclude a user entirely if ANY of their associated rows in the second table fails the condition.
AND
(
internal_TelemarketingCallAudits.CallAuditID IS NULL --No telemarketing calls is fine
OR
(
internal_TelemarketingCallAudits.CampaignID = 1 --Campaign 1
AND
DATEADD(dd, 75, MAX(internal_TelemarketingCallAudits.Date)) < GETDATE() --Last call occurred at least 75 days ago
)
OR
(
internal_TelemarketingCallAudits.CampaignID != 1 --Other campaigns
AND
DATEADD(dd, 10, MAX(internal_TelemarketingCallAudits.Date)) < GETDATE() --Last call occurred at least 10 days ago
)
)
I really appreciate your help.
Try this: SQL Fiddle
select *
from dbo.Users u
inner join ( --get the most recent call per user (taking into account different campaign timescales)
select tca.UserId
, tca.CampaignId
, tca.[Date]
, case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end LastCalledInWindow
, row_number() over (partition by tca.UserId order by case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r
from dbo.TelemarketingCallAudits tca
inner join (
values (1, 60)
, (2, 70)
) c (CampaignId, DaysSinceLastCall)
on tca.CampaignId = c.CampaignId
) mrc
on mrc.UserId = u.UserId
and mrc.r = 1 --only accept the most recent call
and mrc.LastCalledInWindow = 0 --only include if they haven't been contacted in the last x days
I'm not comparing all rows here. Rather, I saw that you're interested in when the most recent call was, and then you only care whether that call is in the X-day window. There's a bit of additional complexity because X varies by campaign, so it's not the most recent call you care about so much as the call most likely to fall within its window. To get around that, I sort each user's calls so that those inside their window come first, followed by those that aren't, and then sort by most recent first within those 2 groups. This gives me the field r.
By filtering on r = 1 for each user, we only get the most recent call (adjusted for campaign windows). By filtering on LastCalledInWindow = 0 we exclude those who have been called within the campaign's window.
NB: I've used an inner query (aliased c) to hold the campaign ids and their corresponding windows. In reality you'd probably want a campaigns table holding that same information instead of coding inside the query itself.
Hopefully everything else is self-explanatory; but give me a nudge in the comments if you need any further information.
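To see the ranking at work on the sample data from the question, here is a sketch using Python's sqlite3 (SQLite 3.25+ for ROW_NUMBER). Several things are assumptions of this example: the lowercase table names, ISO date strings, julianday() in place of DateAdd, a fixed "today" of 2018-09-02, and the 30/70-day windows stated in the question (the query above uses 60 for campaign 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY);
    CREATE TABLE call_audits (userid INTEGER, call_date TEXT, campaignid INTEGER);
    INSERT INTO users VALUES (24525), (5425), (7676);
    INSERT INTO call_audits VALUES
        (24525, '2018-01-21', 1),
        (24525, '2018-08-26', 1),
        (24525, '2018-02-17', 1),
        (24525, '2017-01-12', 2),
        (5425,  '2018-01-22', 1),
        (7676,  '2017-11-16', 2);
""")

callable_users = conn.execute("""
    WITH c(campaignid, days) AS (VALUES (1, 30), (2, 70)),
    ranked AS (
        SELECT tca.userid, tca.call_date, tca.campaignid,
               CASE WHEN julianday(tca.call_date) + c.days > julianday(:today)
                    THEN 1 ELSE 0 END AS in_window,
               ROW_NUMBER() OVER (
                   PARTITION BY tca.userid
                   -- calls still inside their window first, then newest first
                   ORDER BY CASE WHEN julianday(tca.call_date) + c.days
                                      > julianday(:today)
                                 THEN 1 ELSE 0 END DESC,
                            tca.call_date DESC) AS r
        FROM call_audits tca
        JOIN c ON c.campaignid = tca.campaignid
    )
    SELECT u.userid, ranked.call_date, ranked.campaignid
    FROM users u
    JOIN ranked ON ranked.userid = u.userid
    WHERE ranked.r = 1 AND ranked.in_window = 0
    ORDER BY u.userid
""", {"today": "2018-09-02"}).fetchall()

print(callable_users)
```

User 24525 drops out because the call on 2018-08-26 is still inside campaign 1's window and therefore ranks first for that user.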
UPDATE
Just realised you'd also said "no calls is fine"... Here's a tweaked version to allow for scenarios where the person has not been called.
SQL Fiddle Example.
select *
from dbo.Users u
left outer join ( --get the most recent call per user (taking into account different campaign timescales)
select tca.UserId
, tca.CampaignId
, tca.[Date]
, case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end LastCalledInWindow
, row_number() over (partition by tca.UserId order by case when DateAdd(Day,c.DaysSinceLastCall, tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r
from dbo.TelemarketingCallAudits tca
inner join (
values (1, 60)
, (2, 70)
) c (CampaignId, DaysSinceLastCall)
on tca.CampaignId = c.CampaignId
) mrc
on mrc.UserId = u.UserId
where
(
mrc.r = 1 --only accept the most recent call
and mrc.LastCalledInWindow = 0 --only include if they haven't been contacted in the last x days
)
or mrc.r is null --no calls at all
Update: Including a default campaign offset
To include a default, you could do something like the code below (SQL Fiddle Example). Here, I've put each campaign's offset value in the Campaigns table, but created a default campaign with ID = -1 to handle anything for which there is no offset defined. I use a left join between the audit table and the campaigns table so that we get all records from the audit table, regardless of whether there's a campaign defined, then a cross join to get the default campaign. Finally, I use a coalesce to say "if the campaign isn't defined, use the default campaign".
select *
from dbo.Users u
left outer join ( --get the most recent call per user (taking into account different campaign timescales)
select tca.UserId
, tca.CampaignId
, tca.[Date]
, case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end LastCalledInWindow
, row_number() over (partition by tca.UserId order by case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r
from dbo.TelemarketingCallAudits tca
left outer join Campaigns c
on tca.CampaignId = c.CampaignId
cross join Campaigns dflt
where dflt.CampaignId = -1
) mrc
on mrc.UserId = u.UserId
where
(
mrc.r = 1 --only accept the most recent call
and mrc.LastCalledInWindow = 0 --only include if they haven't been contacted in the last x days
)
or mrc.r is null --no calls at all
That said, I'd recommend not using a default, but rather ensuring that every campaign has an offset defined. i.e. Presumably you already have a campaigns table; and since this offset value is defined per campaign, you can include a field in that table for holding this offset. Rather than leaving this as null for some records, you could set it to your default value; thus simplifying the logic / avoiding potential issues elsewhere where that value may subsequently be used.
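The fallback mechanism on its own (left join to the real campaign, cross join to the default row, then coalesce) is small enough to try in isolation. This sketch uses Python's sqlite3; the column name days, the default of 45, and the audit rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE campaigns (campaignid INTEGER PRIMARY KEY, days INTEGER);
    CREATE TABLE call_audits (userid INTEGER, campaignid INTEGER);
    INSERT INTO campaigns VALUES (-1, 45), (1, 30), (2, 70);  -- -1 is the default
    INSERT INTO call_audits VALUES (100, 1), (101, 3);        -- campaign 3 is undefined
""")

offsets = conn.execute("""
    SELECT tca.userid,
           tca.campaignid,
           COALESCE(c.days, dflt.days) AS days   -- fall back to the default row
    FROM call_audits tca
    LEFT JOIN campaigns c ON c.campaignid = tca.campaignid
    CROSS JOIN campaigns dflt
    WHERE dflt.campaignid = -1
    ORDER BY tca.userid
""").fetchall()

print(offsets)
```

The call on the undefined campaign 3 keeps its row and picks up the default offset, while campaign 1 keeps its own value.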
You'd also asked about the order by clause. There is no order by 1/0; so I assume that's a typo. Rather the full statement is row_number() over (partition by tca.UserId order by case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end desc, tca.[Date] desc) r.
The purpose of this piece is to find the "most important" call for each user. By "most important" I basically mean the most recent, since that's generally what we're after; though there's one caveat. If a user is part of 2 campaigns, one with an offset of 30 days and one with an offset of 60 days, they may have had 2 calls, one 32 days ago and one 38 days ago. Though the call from 32 days ago is more recent, if that's on the campaign with the 30 day offset it's outside the window, whilst the older call from 38 days ago may be on the campaign with an offset of 60 days, meaning that it's within the window, so is more of interest (i.e. this user has been called within a campaign window).
Given the above requirement, here's how this code meets it:
row_number() produces a number from 1, counting up, for each row in the (sub)query's results. The counter is reset to 1 for each partition
partition by tca.UserId says that we're partitioning by the user id; so for each user there will be 1 row for which row_number() returns 1, then for each additional row for that user there will be a consecutive number returned.
The order by part of this statement defines which of each users' rows gets #1, then how the numbers progress thereafter; i.e. the first row according to the order by gets number 1, the next number 2, etc.
case when DateAdd(Day,coalesce(c.DaysSinceLastCall,dflt.DaysSinceLastCall), tca.[Date]) > getutcdate() then 1 else 0 end returns 1 for calls within their campaign's window, and 0 for those outside of the window. Since we're ordering by this result in descending order, any records within their campaign's window are returned before any outside of their campaign's window.
we then order by tca.[Date] desc; i.e. the more recent calls are returned before the earlier calls.
finally, we name the output of this row number as r and in the outer query filter on r = 1; meaning that for each user we only take one row, and that's the first row according to the order criteria above; i.e. if there's a row in its campaign's window we take that, after which it's whichever call was most recent (within those in the window if there were any; then outside that window if there weren't).
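The same ordering can be replayed outside the database on the two-call scenario described earlier (one call 32 days ago on a 30-day campaign, one 38 days ago on a 60-day campaign). This is a plain Python sketch of the sort key only, not of the SQL; the campaign labels "A" and "B" and the fixed date are hypothetical.

```python
from datetime import date, timedelta

today = date(2018, 9, 2)          # fixed so the example is reproducible
offsets = {"A": 30, "B": 60}      # hypothetical campaigns and their windows in days

calls = [
    {"campaign": "A", "date": today - timedelta(days=32)},
    {"campaign": "B", "date": today - timedelta(days=38)},
]


def in_window(call):
    """Return 1 if the call is still inside its campaign's window, else 0."""
    window_end = call["date"] + timedelta(days=offsets[call["campaign"]])
    return 1 if window_end > today else 0


# Same ordering as the row_number(): in-window flag descending, then date descending.
ranked = sorted(calls, key=lambda c: (in_window(c), c["date"]), reverse=True)
top = ranked[0]

print(top["campaign"], in_window(top))
```

The older call on campaign B ranks first because it is still inside its window, so this user would be filtered out by LastCalledInWindow = 0.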
Take a look at the output of the subquery to get a better idea of exactly how this works: SQL Fiddle
I hope that explanation makes some sense / helps you to understand the code? Sadly I can't find a way to explain it more concisely than the code itself does; so if it doesn't make sense try playing with the code and seeing how that affects the output to see if that helps your understanding.

Sql, how to get data in difference of 24 hours

Select p.uhid,p.inpatientno,p.dateofadmission
from adt.inpatientmaster p
where p.uhid='apd1' and status <>0
Here uhid is unique per patient. I want to check whether a patient is admitted again within 24 hours. If a patient is admitted again, uhid remains the same but inpatientno always changes.
Ex:
Registrationno inpatientno dateofadmission
Apd1 xy1 18/01/15
Apd1 ab2 19/01/15
We can do arithmetic on Oracle dates. So yesterday is sysdate - 1.
You need to query the table twice. Once to find the patient records, and once to find any previous matches. Use a self-join to achieve this:
select p1.uhid,
p1.inpatientno as current_inpatientno,
p1.dateofadmission as current_dateofadmission,
p2.inpatientno as previous_inpatientno,
p2.dateofadmission as previous_dateofadmission
from adt.inpatientmaster p1
join adt.inpatientmaster p2
on p2.uhid = p1.uhid
where p1.uhid='apd1'
and p1.status <> 0
and p2.dateofadmission >= p1.dateofadmission-1
and p2.inpatientno != p1.inpatientno
/
You may need to restrict on p2.status <> 0 as well: not sure what your business rules are.
This query will return one row for each match. If there are several admissions within the same 24 hours the result set will have one row for each combination.
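Here is a small runnable sketch of the self-join using Python's sqlite3 in place of Oracle. It makes a few assumptions: timestamps are stored as ISO strings, julianday() stands in for Oracle date arithmetic, the sample rows are invented, and p2 is additionally required to be earlier than p1 so that each pair is reported only once (the query above does not include that bound).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inpatientmaster (
        uhid TEXT, inpatientno TEXT, dateofadmission TEXT, status INTEGER);
    INSERT INTO inpatientmaster VALUES
        ('apd1', 'xy1', '2015-01-18 10:00:00', 1),
        ('apd1', 'ab2', '2015-01-19 08:00:00', 1),   -- 22 hours after xy1
        ('apd1', 'cd3', '2015-02-01 09:00:00', 1);   -- weeks later
""")

readmissions = conn.execute("""
    SELECT p1.inpatientno AS current_inpatientno,
           p2.inpatientno AS previous_inpatientno
    FROM inpatientmaster p1
    JOIN inpatientmaster p2 ON p2.uhid = p1.uhid
    WHERE p1.uhid = 'apd1'
      AND p1.status <> 0
      AND p2.inpatientno <> p1.inpatientno
      AND p2.dateofadmission < p1.dateofadmission          -- p2 is the earlier stay
      AND julianday(p1.dateofadmission)
          - julianday(p2.dateofadmission) <= 1             -- at most 24 hours apart
""").fetchall()

print(readmissions)
```

Only the ab2 admission is flagged, paired with the xy1 admission that preceded it by 22 hours.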
SELECT p.uhid,
p.inpatientno,
p.dateofadmission
FROM adt.inpatientmaster p
WHERE p.status<>0
AND p.dateofadmission <= p.dateofadmission +1
AND p.uhid='APD1'

SQL query determine stopped time in a range

I have to determine the stopped time of a vehicle that sends its status data back to a server every 30 seconds; this data is stored in a database table.
The fields of a status record are (vehicleID, ReceiveDate, ReceiveTime, Speed, Location).
What I want to do is determine each stop period, from the point at which the vehicle's speed drops to zero until the status at which the vehicle moves again, and so on for the next stop.
For example, on a given day a given vehicle may have 10 stops, and I must determine the duration of each with a query.
The result can be like this:
id Recvdate Rtime Duration
1 2010-05-01 8:30 45min
1 2010-05-01 12:21 3hour
This is an application of window functions (called analytic functions in Oracle).
Your goal is to assign a "block number" to each sequence of stops. That is, all stops in a sequence (for a vehicle) will have the same block number, and this will be different from all other sequences of stops.
Here is a way to assign the block number:
Create a speed flag that says 1 when speed > 0 and 0 when speed = 0.
Enumerate all the records where the speed flag = 1. These are "blocks".
Do a self join to put each flag = 0 in a block (this requires grouping and taking the max blocknum).
Summarize by duration or however you want.
The following code is a sketch of what I mean. It won't solve your problem, because you are not clear about how to handle day breaks, what information you want to summarize, and it has an off-by-1 error (in each sequence of stops it includes the previous non-stop, if any).
with vd as
(
select vd.*,
(case when SpeedFlag = 1
then ROW_NUMBER() over (partition by id, SpeedFlag order by rtime) end) as blocknum
from
(
select vd.*, (case when speed = 0 then 0 else 1 end) as SpeedFlag
from vehicaldata vd
) vd
)
select id, blocknum, COUNT(*) as numrecs, SUM(duration) as duration
from
(
select vd.id, vd.rtime, vd.duration, MAX(vdprev.blocknum) as blocknum
from vd
left outer join vd vdprev
on vd.id = vdprev.id
and vd.rtime > vdprev.rtime
group by vd.id, vd.rtime, vd.duration
) vd
group by id, blocknum
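A variant of the block-number idea that avoids the self join is a running count of moving records: every stopped reading inherits the number of moving readings seen so far, so consecutive stops share a block. The sketch below runs it through Python's sqlite3 (SQLite 3.25+). The vehicle_status table, the integer rtime in seconds, and the sample readings are all hypothetical, and the duration is measured from the first to the last stopped reading in each block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle_status (id INTEGER, rtime INTEGER, speed INTEGER);
    INSERT INTO vehicle_status VALUES
        (1,   0, 10),
        (1,  30,  0), (1,  60, 0), (1, 90, 0),   -- first stop
        (1, 120,  5),
        (1, 150,  0), (1, 180, 0),               -- second stop
        (1, 210,  7);
""")

stops = conn.execute("""
    WITH flagged AS (
        SELECT id, rtime, speed,
               -- running count of moving readings = block number
               SUM(CASE WHEN speed > 0 THEN 1 ELSE 0 END)
                   OVER (PARTITION BY id ORDER BY rtime) AS blocknum
        FROM vehicle_status
    )
    SELECT id,
           MIN(rtime)              AS stop_start,
           MAX(rtime) - MIN(rtime) AS stopped_seconds
    FROM flagged
    WHERE speed = 0                -- keep only the stopped readings
    GROUP BY id, blocknum
    ORDER BY id, stop_start
""").fetchall()

print(stops)
```

Filtering on speed = 0 before grouping also sidesteps the off-by-1 issue mentioned above, since the preceding moving record never enters a block.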