Creating a calculated column using SQL COUNT

Let's say I work for a call center and I closed 10 calls but opened 20 calls in the day. The "real" figure is actually -10. Even though the target is 10 calls to close, the worker failed because 20 calls were opened.
I would like to write an SQL report to reflect this, but my problem seems to be that I cannot calculate figures from aggregate counts.
SELECT workername AS Name,
(SELECT Count(closeddate)
FROM mybanksupport
WHERE closeddate = NULL) OPENCALLS,
(SELECT Count(closeddate)
FROM mybanksupport
WHERE closeddate = NOT NULL) CLOSEDCALLS,
(SELECT opencalls - closedcalls) REALCALLS
FROM mybanksupport
In short, I want to calculate 2 column count values and then use that value to produce another calculated column called Real Calls

COUNT only counts values, i.e., it ignores NULLs. This property can be used to simplify your expression:
SELECT workername, closedCalls, totalCalls - closedCalls AS openCalls
FROM (SELECT workername, COUNT(closeddate) AS closedCalls, COUNT(*) AS totalCalls
      FROM mybanksupport
      GROUP BY workername) t
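To see the NULL-skipping behaviour in action, here is a minimal sketch that runs the same idea against an in-memory SQLite database through Python's sqlite3 module. The sample rows are invented for illustration, and the extra realcalls column (closed minus open, as the question defines it) is an addition that is not part of the answer above. SQLite stands in for whatever engine you use; the COUNT semantics shown are standard SQL.

```python
import sqlite3

# Hypothetical sample data: ann has 1 closed and 2 open calls, bob has 1 closed call.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mybanksupport (workername TEXT, closeddate TEXT)")
conn.executemany(
    "INSERT INTO mybanksupport VALUES (?, ?)",
    [("ann", "2024-01-02"), ("ann", None), ("ann", None), ("bob", "2024-01-03")],
)

# "= NULL" is never true, so this filter matches no rows at all.
eq_null_matches = conn.execute(
    "SELECT COUNT(*) FROM mybanksupport WHERE closeddate = NULL"
).fetchone()[0]

per_worker = conn.execute(
    """
    SELECT workername,
           closedcalls,
           totalcalls - closedcalls                 AS opencalls,
           closedcalls - (totalcalls - closedcalls) AS realcalls
    FROM (SELECT workername,
                 COUNT(closeddate) AS closedcalls,  -- skips rows where closeddate is NULL
                 COUNT(*)          AS totalcalls    -- counts every row
          FROM mybanksupport
          GROUP BY workername) t
    ORDER BY workername
    """
).fetchall()

print(eq_null_matches)  # 0
print(per_worker)       # [('ann', 1, 2, -1), ('bob', 1, 0, 1)]
```

The first query also shows why the original attempt returned nothing useful: a comparison with NULL has to be written as IS NULL or IS NOT NULL.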

Write it as a subquery; then you can use its fields however you want:
SELECT
    Name
  , OPENCALLS
  , CLOSEDCALLS
  , (OPENCALLS - CLOSEDCALLS) AS REALCALLS
FROM
(
    SELECT workername AS Name,
           -- COUNT(*) here: COUNT(closeddate) is always 0 when closeddate IS NULL
           (SELECT COUNT(*)
            FROM mybanksupport
            WHERE closeddate IS NULL) AS OPENCALLS,
           (SELECT COUNT(closeddate)
            FROM mybanksupport
            WHERE closeddate IS NOT NULL) AS CLOSEDCALLS
    FROM mybanksupport
) T1

Related

Count rows in SQL Server table using GROUP BY GroupValue and filter by condition

I think my question is similar to this one:
How to use last_value with group by with count in SQL Server? However, I can't seem to adapt the small change to answer my question.
I have a table of colleague contracts:
ContractId int PK
ColleagueId int [not null]
ContractStart datetime2 [not null]
ContractEnd datetime2 [null]
BranchId int [not null]
SaturdayOnly bit
isActive bit
What I need to do is get a count, per BranchId of the number of active contracts that are SaturdayOnly i.e. bit 1, and the number of active contracts not SaturdayOnly i.e. bit 0. A colleague can have multiple contracts in the same branch but only one will be active. The final condition is that for the contract to be considered it must start before 2022-12-01 and if there is an end date it must be after 2022-12-01.
I attempted it with this but the 2 cte counts give the same result and the count is incorrect for the branch anyway.
WITH cte AS
(
SELECT
co.BranchId, co.ContractId, co.ColleagueId,
ROW_NUMBER() OVER (PARTITION BY co.ColleagueId ORDER BY co.ContractStart DESC) AS row_number
FROM
hr.Contract co
WHERE
co.SaturdayOnly = 0
AND (co.ContractEnd IS NULL OR co.ContractEnd > '2022-12-01')
),
cte_sat AS
(
SELECT
co.BranchId, co.ContractId, co.ColleagueId,
ROW_NUMBER() OVER (PARTITION BY co.ColleagueId ORDER BY co.ContractStart DESC) AS row_number
FROM
hr.Contract co
WHERE
co.SaturdayOnly = 1
AND (co.ContractEnd IS NULL OR co.ContractEnd > '2022-12-01')
)
SELECT
b.BranchName,
COUNT(cte.ContractId), COUNT(cte_sat.ContractId)
FROM
hr.Branch b
JOIN
cte ON b.ContractorCode = cte.BranchId
JOIN
cte_sat ON b.ContractorCode = cte_sat.BranchId
WHERE
cte.row_number = 1
GROUP BY
b.BranchNumber, b.BranchName
ORDER BY
b.BranchNumber
There is no need for a CTE. Try:
SELECT BranchId, SaturdayOnly, COUNT(*) FROM hr.Contract
WHERE IsActive = 1 AND ContractStart < ... AND (ContractEnd IS NULL OR ContractEnd > ...)
GROUP BY BranchId, SaturdayOnly
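As a rough illustration of the single-pass approach, the sketch below builds a small in-memory SQLite table and counts both contract types per branch with conditional aggregation, so each branch comes back as one row with two counts. The sample rows, the unqualified table name Contract (SQLite has no hr schema here), and the column aliases RegularCount and SaturdayCount are assumptions made for the example. The filter follows the conditions stated in the question, with 2022-12-01 as the cutoff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Contract (
        ContractId    INTEGER PRIMARY KEY,
        ColleagueId   INTEGER NOT NULL,
        ContractStart TEXT NOT NULL,
        ContractEnd   TEXT,
        BranchId      INTEGER NOT NULL,
        SaturdayOnly  INTEGER,
        IsActive      INTEGER
    )
    """
)
# Hypothetical rows; ISO date strings compare correctly as text.
conn.executemany(
    "INSERT INTO Contract VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 100, "2022-01-10", None,         1, 0, 1),
        (2, 101, "2022-03-01", "2023-06-30", 1, 1, 1),
        (3, 102, "2021-05-01", "2022-10-31", 1, 0, 0),  # inactive, already ended
        (4, 103, "2022-12-15", None,         1, 0, 1),  # starts after the cutoff
        (5, 104, "2020-02-01", None,         2, 1, 1),
        (6, 105, "2022-06-01", None,         2, 1, 1),
    ],
)

counts = conn.execute(
    """
    SELECT BranchId,
           SUM(CASE WHEN SaturdayOnly = 0 THEN 1 ELSE 0 END) AS RegularCount,
           SUM(CASE WHEN SaturdayOnly = 1 THEN 1 ELSE 0 END) AS SaturdayCount
    FROM Contract
    WHERE IsActive = 1
      AND ContractStart < :cutoff
      AND (ContractEnd IS NULL OR ContractEnd > :cutoff)
    GROUP BY BranchId
    ORDER BY BranchId
    """,
    {"cutoff": "2022-12-01"},
).fetchall()

print(counts)  # [(1, 1, 1), (2, 0, 2)]
```

Grouping by BranchId, SaturdayOnly instead gives the same numbers as one row per branch and contract type; which shape is more convenient depends on the report.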

Select latest and 2nd latest date rows per user

I have the following query, which selects the rows whose LAST_UPDATE_DATE falls within the last 7 days. It works great.
SELECT 'NEW ROW' AS 'ROW_TYPE', A.EMPLID, B.FIRST_NAME, B.LAST_NAME,
A.BANK_CD, A.ACCOUNT_NUM, ACCOUNT_TYPE, PRIORITY, A.LAST_UPDATE_DATE
FROM PS_DIRECT_DEPOSIT D
INNER JOIN PS_DIR_DEP_DISTRIB A ON A.EMPLID = D.EMPLID AND A.EFFDT = D.EFFDT
INNER JOIN PS_EMPLOYEES B ON B.EMPLID = A.EMPLID
WHERE
B.EMPL_STATUS NOT IN ('T','R','D')
AND ((A.DEPOSIT_TYPE = 'P' AND A.AMOUNT_PCT = 100)
OR A.PRIORITY = 999
OR A.DEPOSIT_TYPE = 'B')
AND A.EFFDT = (SELECT MAX(A1.EFFDT)
FROM PS_DIR_DEP_DISTRIB A1
WHERE A1.EMPLID = A.EMPLID
AND A1.EFFDT <= GETDATE())
AND D.EFF_STATUS = 'A'
AND D.EFFDT = (SELECT MAX(D1.EFFDT)
FROM PS_DIRECT_DEPOSIT D1
WHERE D1.EMPLID = D.EMPLID
AND D1.EFFDT <= GETDATE())
AND A.LAST_UPDATE_DATE >= GETDATE() - 7
What I would like to add to this is the previous (2nd MAX) row per EMPLID, so that I can output the 'old' row (the one that was in place before the latest update meeting the above criteria) along with the new row that the query already outputs.
ROW_TYPE  EMPLID  FIRST_NAME  LAST_NAME  BANK_CD    ACCOUNT_NUM  ACCOUNT_TYPE  PRIORITY  LAST_UPDATE_DATE
NEW ROW   12345   JOHN        SMITH      123548999  45234879     C             999       2019-03-06 00:00:00.000
OLD ROW   12345   JOHN        SMITH      214080046  92178616     C             999       2018-10-24 00:00:00.000
NEW ROW   56399   CHARLES     MASTER     785816167  84314314     C             999       2019-03-07 00:00:00.000
OLD ROW   56399   CHARLES     MASTER     345761227  547352       C             999       2017-05-16 00:00:00.000
So the EMPLID would be ordered by NEW ROW, followed by OLD ROW as shown above. In this example the 'NEW ROW' is getting the record that is within the past 7 days, as indicated by the LAST_UPDATE_DATE.
I would like to get feedback on how to modify the query so I can also get the 'old' row (which is the max row that is less than the 'NEW' row retrieved above).
It was a slow day for crime in Gotham, so I gave this a whirl. Might work.
It is unlikely to work right out of the box, but it should get you started.
Your LAST_UPDATE_DATE column is on the table PS_DIR_DEP_DISTRIB, so we'll start there. First, you want to identify all of the records that were updated in the last 7 days because those are the only ones you're interested in. Throughout this, I'm assuming, and I'm probably wrong, that the natural key for the table consists of EMPLID, BANK_CD, and ACCOUNT_NUM. You'll want to sub in the actual natural key for those columns in a few places. That said, the date limiter looks something like this:
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND
limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
Now we'll use that as a correlated sub-query in a WHERE EXISTS clause that we'll correlate back to the base table to limit ourselves to records with natural key values that were updated in the last week. I altered the SELECT list to just SELECT 1, which is typical verbiage for a correlated sub, since it stops looking for a match when it finds one (1), and doesn't actually return any values at all.
Additionally, since we're filtering this record set anyway, I moved all the other WHERE clause filters for this table into this (soon to be) sub-query.
Finally, in the SELECT portion, I added a DENSE_RANK to force an order on the records. We'll use the DENSE_RANK value later to keep only the first (N) records of interest.
So that leaves us with this:
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
--,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
,PRIORITY
,EFFDT
,LAST_UPDATE_DATE
,DEPOSIT_TYPE
,AMOUNT_PCT
,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
EMPLID
ORDER BY
LAST_UPDATE_DATE DESC
) AS RowNum
FROM
PS_DIR_DEP_DISTRIB AS sdist
WHERE
EXISTS
(
-- Get the set of records that were last updated in the last 7 days.
-- Correlate to the outer query so it only returns records related to this subset.
-- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
-- Something like this, using the actual natural key columns in the WHERE
SELECT
1
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
--The first two define the date range.
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
AND
--And these are the correlations to the outer query.
limit.EMPLID = sdist.EMPLID
AND limit.BANK_CD = sdist.BANK_CD
AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
)
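AND
(
    -- Parenthesize the whole OR group so it cannot bypass the EXISTS filter,
    -- and use the alias defined in this query (sdist).
    (
        sdist.DEPOSIT_TYPE = 'P'
        AND sdist.AMOUNT_PCT = 100
    )
    OR sdist.PRIORITY = 999
    OR sdist.DEPOSIT_TYPE = 'B'
)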
Replace the original INNER JOIN to PS_DIR_DEP_DISTRIB with that query. In the SELECT list, the first hard-coded value is now dependent on the RowNum value, so that's a CASE expression now. In the WHERE clause, the dates are all driven by the subquery, so they're gone, several were folded into the subquery, and we're adding WHERE dist.RowNum <= 2 to bring back the top 2 records.
(I also replaced all the table aliases so I could keep track of what I was looking at.)
SELECT
CASE dist.RowNum
WHEN 1 THEN 'NEW ROW'
ELSE 'OLD ROW'
END AS ROW_TYPE
,dist.EMPLID
,emp.FIRST_NAME
,emp.LAST_NAME
,dist.BANK_CD
,dist.ACCOUNT_NUM
,ACCOUNT_TYPE
,dist.PRIORITY
,dist.LAST_UPDATE_DATE
FROM
PS_DIRECT_DEPOSIT AS dd
INNER JOIN
(
SELECT
EMPLID
,BANK_CD
,ACCOUNT_NUM
--,ACCOUNT_TYPE --Might belong here. Can't tell without table alias in original SELECT
,PRIORITY
,EFFDT
,LAST_UPDATE_DATE
,DEPOSIT_TYPE
,AMOUNT_PCT
,DENSE_RANK() OVER (PARTITION BY --Add actual natural key columns here...
EMPLID
ORDER BY
LAST_UPDATE_DATE DESC
) AS RowNum
FROM
PS_DIR_DEP_DISTRIB AS sdist
WHERE
EXISTS
(
-- Get the set of records that were last updated in the last 7 days.
-- Correlate to the outer query so it only returns records related to this subset.
-- This uses a correlated subquery. A JOIN will work, too. Try both, pick the faster one.
-- Something like this, using the actual natural key columns in the WHERE
SELECT
1
FROM
PS_DIR_DEP_DISTRIB AS limit
WHERE
--The first two define the date range.
limit.LAST_UPDATE_DATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE))
AND limit.LAST_UPDATE_DATE <= CAST(GETDATE() AS DATE)
AND
--And these are the correlations to the outer query.
limit.EMPLID = sdist.EMPLID
AND limit.BANK_CD = sdist.BANK_CD
AND limit.ACCOUNT_NUM = sdist.ACCOUNT_NUM
)
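AND
(
    -- Parenthesize the whole OR group so it cannot bypass the EXISTS filter,
    -- and use the inner alias (sdist); dist only exists outside this subquery.
    (
        sdist.DEPOSIT_TYPE = 'P'
        AND sdist.AMOUNT_PCT = 100
    )
    OR sdist.PRIORITY = 999
    OR sdist.DEPOSIT_TYPE = 'B'
)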
) AS dist
ON
dist.EMPLID = dd.EMPLID
AND dist.EFFDT = dd.EFFDT
INNER JOIN
PS_EMPLOYEES AS emp
ON
emp.EMPLID = dist.EMPLID
WHERE
dist.RowNum <= 2
AND
emp.EMPL_STATUS NOT IN ('T', 'R', 'D')
AND
dd.EFF_STATUS = 'A';
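The ranking idea can be tried in isolation before wiring it into the full PeopleSoft query. The following is only a sketch under simplifying assumptions: a single hypothetical table named distrib with three columns, EMPLID alone as the partition key, a fixed "today" passed in as a parameter instead of GETDATE(), and SQLite (3.25 or later, for window functions) instead of SQL Server. The first four data rows reuse values from the question's sample output; the other two are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distrib (EMPLID TEXT, BANK_CD TEXT, LAST_UPDATE_DATE TEXT)")
conn.executemany(
    "INSERT INTO distrib VALUES (?, ?, ?)",
    [
        ("12345", "123548999", "2019-03-06"),
        ("12345", "214080046", "2018-10-24"),
        ("12345", "000000001", "2017-01-01"),  # third-newest, dropped by rn <= 2
        ("56399", "785816167", "2019-03-07"),
        ("56399", "345761227", "2017-05-16"),
        ("77777", "000000002", "2018-01-01"),  # no update in the last 7 days
    ],
)

rows = conn.execute(
    """
    SELECT CASE rn WHEN 1 THEN 'NEW ROW' ELSE 'OLD ROW' END AS ROW_TYPE,
           EMPLID, BANK_CD, LAST_UPDATE_DATE
    FROM (SELECT d.EMPLID, d.BANK_CD, d.LAST_UPDATE_DATE,
                 DENSE_RANK() OVER (PARTITION BY d.EMPLID
                                    ORDER BY d.LAST_UPDATE_DATE DESC) AS rn
          FROM distrib d
          -- Keep only employees that have at least one row updated in the last 7 days.
          WHERE EXISTS (SELECT 1
                        FROM distrib recent
                        WHERE recent.EMPLID = d.EMPLID
                          AND recent.LAST_UPDATE_DATE >= date(:today, '-7 days'))
         ) ranked
    WHERE rn <= 2
    ORDER BY EMPLID, rn
    """,
    {"today": "2019-03-08"},
).fetchall()

for row in rows:
    print(row)
```

With this data the query returns a NEW ROW / OLD ROW pair for 12345 and for 56399 and nothing for 77777. Note that DENSE_RANK gives tied dates the same rank, so two rows sharing the newest LAST_UPDATE_DATE would both be labelled NEW ROW; ROW_NUMBER with a tie-breaking column avoids that if it matters.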

sql count/sum the number of calls until a specific date in another column

I have data that shows customer calls. I have columns for customer number, phone number (one customer can have many), the date of each voice call, and the duration of the call. The table looks like the example below.
CusID | PhoneNum | Date             | Duration
20111 | 43576233 | 20.01.2016-14:00 | 00:10:12
20111 | 44498228 | 14.01.2016-15:30 | 00:05:12
20112 | 43898983 | 14.01.2016-15:30 |
What I want is to count the number of call attempts for each number before it is answered (Duration > 0), so that I can estimate how many times I should call on average to reach a customer or phone number. It should basically count the rows per phone number before MIN(Date) where Duration > 0.
SELECT Phone, Min(Date) FROM XX WHERE Duration IS NOT NULL GROUP BY Phone
I think this should give me the cutoff date up to which I should count the calls. I could not figure out how to finish the rest of the job.
EDIT- I will add an example
And the result should only count row number 5 since it is the call before the customer is reached for the first time. So resulted table should be like :
Your first step is valid:
SELECT
CusID
,PhoneNum
,MIN(Date) AS MinDate
FROM XX
WHERE Duration IS NOT NULL
GROUP BY CusID, PhoneNum
This gives you one row per PhoneNum with the date of the first successful call.
Now join this to original table and leave only those rows that have a prior date (per PhoneNum). Group it by PhoneNum again and count. The join should be LEFT JOIN to have a row with zero count for numbers that were answered on the first attempt.
WITH
CTE
AS
(
SELECT
CusID
,PhoneNum
,MIN(Date) AS MinDate
FROM XX
WHERE Duration IS NOT NULL
GROUP BY CusID, PhoneNum
)
SELECT
CTE.CusID
,CTE.PhoneNum
,COUNT(XX.PhoneNum) AS Count -- counts only matched rows, so unmatched numbers get 0
FROM
CTE
LEFT JOIN XX
ON XX.PhoneNum = CTE.PhoneNum
AND XX.Date < CTE.MinDate
GROUP BY CTE.CusID, CTE.PhoneNum
;
If a number was never answered, it will not be included in the result set at all.
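A small runnable check of this approach, written against SQLite through Python's sqlite3 module, might look like the sketch below. The table name calls, the column name CallDate (standing in for Date), the CTE name first_answer and all sample rows are assumptions for the example; the join also matches on CusID, which is an extra safeguard beyond the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (CusID INTEGER, PhoneNum TEXT, CallDate TEXT, Duration TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?)",
    [
        (20111, "43576233", "2016-01-10 09:00", None),        # unanswered attempt
        (20111, "43576233", "2016-01-12 10:00", None),        # unanswered attempt
        (20111, "43576233", "2016-01-20 14:00", "00:10:12"),  # first answered call
        (20111, "43576233", "2016-01-21 14:00", None),        # after the first answer
        (20111, "44498228", "2016-01-14 15:30", "00:05:12"),  # answered on first try
        (20112, "43898983", "2016-01-14 15:30", None),        # never answered
    ],
)

attempts = conn.execute(
    """
    WITH first_answer AS (
        SELECT CusID, PhoneNum, MIN(CallDate) AS MinDate
        FROM calls
        WHERE Duration IS NOT NULL
        GROUP BY CusID, PhoneNum
    )
    SELECT f.CusID, f.PhoneNum, COUNT(c.PhoneNum) AS AttemptsBeforeAnswer
    FROM first_answer f
    LEFT JOIN calls c
           ON c.CusID = f.CusID
          AND c.PhoneNum = f.PhoneNum
          AND c.CallDate < f.MinDate
    GROUP BY f.CusID, f.PhoneNum
    ORDER BY f.CusID, f.PhoneNum
    """
).fetchall()

print(attempts)  # [(20111, '43576233', 2), (20111, '44498228', 0)]
```

The number answered on the first attempt shows up with a count of 0 because COUNT(c.PhoneNum) ignores the NULLs produced by the LEFT JOIN, and the never-answered number 43898983 is absent, as described.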
Please try this query:
SELECT phonecalls.CusID, COUNT(0) AS failedcalls, phonecalls.phonenumber, success.firstsuccess FROM phonecalls,
(SELECT min(Date) AS firstsuccess, CusID, phonenumber FROM phonecalls WHERE Duration IS NOT NULL GROUP BY CusID, phonenumber) success
WHERE phonecalls.CusID = success.CusID AND phonecalls.phonenumber = success.phonenumber AND phonecalls.Date < success.firstsuccess
GROUP BY phonecalls.CusID, phonecalls.phonenumber, success.firstsuccess;
I've not tested it...
Note: users who have never had a successful call are not listed. Is this OK, or do you need them listed as well? If so, you need a LEFT JOIN:
SELECT phonecalls.CusID, COUNT(0) AS failedcalls, phonecalls.phonenumber, success.firstsuccess FROM phonecalls LEFT JOIN
(SELECT min(Date) AS firstsuccess, CusID, phonenumber FROM phonecalls WHERE Duration IS NOT NULL GROUP BY CusID, phonenumber) success ON
phonecalls.CusID = success.CusID AND phonecalls.phonenumber = success.phonenumber AND phonecalls.Date < success.firstsuccess
GROUP BY phonecalls.CusID, phonecalls.phonenumber, success.firstsuccess;
In SQL Server 2012+, you can use the following logic:
Assign the number of "unanswered" calls to each row in the data. This uses conditional aggregation with a window function.
Then, take the maximum of the count for answered calls for each user.
Count the number of answered calls.
The ratio is the average.
This ignores strings of unanswered calls not followed by an answered call.
The resulting query:
select phone, max(cume_unanswered), count(*) as num_answered,
max(cume_unanswered) * 1.0 / count(*) as ratio
from (select t.*,
sum(case when duration is null then 1 else 0 end) over (partition by phone order by date) as cume_unanswered
from t
) t
where duration is not null
group by phone;
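To make the windowed running count concrete, here is a hedged sketch of the same query shape run on SQLite (3.25 or later) via Python's sqlite3 module. The table name calls, the column names PhoneNum and CallDate, and the sample rows are invented for the example; the logic mirrors the steps listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (PhoneNum TEXT, CallDate TEXT, Duration TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [
        ("43576233", "2016-01-10 09:00", None),
        ("43576233", "2016-01-12 10:00", None),
        ("43576233", "2016-01-20 14:00", "00:10:12"),  # running unanswered count is 2 here
        ("43576233", "2016-01-21 14:00", None),
        ("43576233", "2016-01-22 14:00", "00:03:00"),  # running unanswered count is 3 here
        ("44498228", "2016-01-14 15:30", "00:05:12"),
        ("43898983", "2016-01-14 15:30", None),        # never answered, so it drops out
    ],
)

ratios = conn.execute(
    """
    SELECT PhoneNum,
           MAX(cume_unanswered)                  AS unanswered,
           COUNT(*)                              AS num_answered,
           MAX(cume_unanswered) * 1.0 / COUNT(*) AS ratio
    FROM (SELECT c.*,
                 SUM(CASE WHEN Duration IS NULL THEN 1 ELSE 0 END)
                     OVER (PARTITION BY PhoneNum ORDER BY CallDate) AS cume_unanswered
          FROM calls c) t
    WHERE Duration IS NOT NULL
    GROUP BY PhoneNum
    ORDER BY PhoneNum
    """
).fetchall()

print(ratios)  # [('43576233', 3, 2, 1.5), ('44498228', 0, 1, 0.0)]
```

For 43576233 there are three unanswered calls before the last answered one and two answered calls, giving 1.5 failed attempts per successful contact.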

Datediff between two tables

I have these two tables:
1-Add to queue table
TransID , ADD date
10 , 10/10/2012
11 , 14/10/2012
11 , 18/11/2012
11 , 25/12/2012
12 , 1/1/2013
2-Removed from queue table
TransID , Removed Date
10 , 15/1/2013
11 , 12/12/2012
11 , 13/1/2013
11 , 20/1/2013
The TransID is the key between the two tables, and I can't modify those tables. What I want is to query the amount of time each transaction spent in the queue.
It's easy when there is one item in each table, but when an item gets queued more than once, how do I calculate that?
Assuming the order TransIDs are entered into the Add table is the same order they are removed, you can use the following:
WITH OrderedAdds AS
( SELECT TransID,
AddDate,
[RowNumber] = ROW_NUMBER() OVER(PARTITION BY TransID ORDER BY AddDate)
FROM AddTable
), OrderedRemoves AS
( SELECT TransID,
RemovedDate,
[RowNumber] = ROW_NUMBER() OVER(PARTITION BY TransID ORDER BY RemovedDate)
FROM RemoveTable
)
SELECT OrderedAdds.TransID,
OrderedAdds.AddDate,
OrderedRemoves.RemovedDate,
[DaysInQueue] = DATEDIFF(DAY, OrderedAdds.AddDate, ISNULL(OrderedRemoves.RemovedDate, CURRENT_TIMESTAMP))
FROM OrderedAdds
LEFT JOIN OrderedRemoves
ON OrderedAdds.TransID = OrderedRemoves.TransID
AND OrderedAdds.RowNumber = OrderedRemoves.RowNumber;
The key part is that each record gets a row number based on the transaction ID and the date it was entered; you can then join on both RowNumber and TransID to stop any cross joining.
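For a quick check of the pairing logic, the sketch below loads the dates from the question into an in-memory SQLite database (3.25 or later) and runs an equivalent query. SQLite has no DATEDIFF or ISNULL, so julianday arithmetic and COALESCE are used instead, and a fixed "now" of 2013-02-01 is passed as a parameter so the result is reproducible; both substitutions are assumptions of this example, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AddTable (TransID INTEGER, AddDate TEXT)")
conn.execute("CREATE TABLE RemoveTable (TransID INTEGER, RemovedDate TEXT)")
conn.executemany(
    "INSERT INTO AddTable VALUES (?, ?)",
    [(10, "2012-10-10"), (11, "2012-10-14"), (11, "2012-11-18"),
     (11, "2012-12-25"), (12, "2013-01-01")],
)
conn.executemany(
    "INSERT INTO RemoveTable VALUES (?, ?)",
    [(10, "2013-01-15"), (11, "2012-12-12"), (11, "2013-01-13"), (11, "2013-01-20")],
)

queue_times = conn.execute(
    """
    WITH OrderedAdds AS (
        SELECT TransID, AddDate,
               ROW_NUMBER() OVER (PARTITION BY TransID ORDER BY AddDate) AS RowNumber
        FROM AddTable
    ), OrderedRemoves AS (
        SELECT TransID, RemovedDate,
               ROW_NUMBER() OVER (PARTITION BY TransID ORDER BY RemovedDate) AS RowNumber
        FROM RemoveTable
    )
    SELECT a.TransID, a.AddDate, r.RemovedDate,
           -- Whole days between add and removal; unremoved items run up to :now.
           CAST(julianday(COALESCE(r.RemovedDate, :now)) - julianday(a.AddDate) AS INTEGER)
               AS DaysInQueue
    FROM OrderedAdds a
    LEFT JOIN OrderedRemoves r
           ON a.TransID = r.TransID
          AND a.RowNumber = r.RowNumber
    ORDER BY a.TransID, a.AddDate
    """,
    {"now": "2013-02-01"},
).fetchall()

for row in queue_times:
    print(row)
```

Each add is matched with the removal that has the same position within its TransID, so transaction 11 yields three separate rows, and transaction 12, which has no removal yet, is measured up to the supplied "now".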
DISCLAIMER: There are probably problems with this, but I hope it sends you in one possible direction. Expect to have to debug it.
You can try something along the following lines (which might work in some way, depending on your system, version, etc.):
SELECT transId, (SUM(remove_date_sum) - SUM(add_date_sum)) / (60*60*24) AS days_in_queue
FROM
(
SELECT transId, SUM(UNIX_TIMESTAMP(add_date)) AS add_date_sum, 0 AS remove_date_sum
FROM add_to_queue
GROUP BY transId
UNION ALL
SELECT transId, 0 AS add_date_sum, SUM(UNIX_TIMESTAMP(remove_date)) AS remove_date_sum
FROM remove_from_queue
GROUP BY transId
) AS q
GROUP BY transId;
A bit of explanation: as far as I know, you cannot sum dates, but you can convert them to some sort of timestamp. Check whether UNIX_TIMESTAMP works for you, or figure out something else. Then you can sum in each table, build a union by conveniently leaving the other column as zero, and subtract one sum from the other in the outer query.
As for the division at the end of the first SELECT: UNIX_TIMESTAMP returns seconds, so you divide by the number of seconds in a day to get days, or by whatever unit you want.
This all said, I would probably solve this using a stored procedure or some client script. SQL is not a weapon for every battle. Making two separate queries can be much simpler.
Answer 2: after your comments. (As a side note, some of your dates, such as 15/1/2013 and 13/1/2013, do not represent proper date formats.)
select transId, sum(numberOfDays) as totalQueueTime
from (
select a.transId,
datediff(day, a.addDate, isnull(r.removeDate, a.addDate)) as numberOfDays
from AddTable a left join RemoveTable r on a.transId = r.transId
-- no ORDER BY here: SQL Server does not allow it in a derived table without TOP
) X
group by transId
Answer 1: before your comments
This assumes that a new record won't be added until the previous one has been removed. Also note that the following query returns numberOfDays as zero for records that have not been removed:
select a.transId, a.addDate, r.removeDate,
datediff(day,a.addDate,isnull(r.removeDate,a.addDate)) numberOfDays
from AddTable a left join RemoveTable r on a.transId = r.transId
order by a.transId, a.addDate, r.removeDate

How to return a COUNT(*) of 0 instead of NULL

I have this bit of code:
SELECT Project, Financial_Year, COUNT(*) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year
where it's not returning any rows when the count is zero. How do I make these rows appear with the HighRiskCount set as 0?
You can't select the values from the table when the row count is 0. Where would it get the values for the nonexistent rows?
To do this, you'll have to have another table that defines your list of valid Project and Financial_Year values. You'll then select from this table, perform a left join on your existing table, then do the grouping.
Something like this:
SELECT l.Project, l.Financial_Year, COUNT(t.Project) AS HighRiskCount
INTO #HighRisk
FROM MasterRiskList l
LEFT JOIN #TempRisk1 t ON t.Project = l.Project AND t.Financial_Year = l.Financial_Year
AND t.Risk_1 = 3 -- filter in the ON clause; a WHERE on t would discard the unmatched rows again
GROUP BY l.Project, l.Financial_Year
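Where the Risk_1 = 3 condition is placed decides whether the zero rows survive. The sketch below, using SQLite through Python's sqlite3 module, runs the join both ways on a few invented rows. MasterRiskList is the hypothetical lookup table from this answer, and TempRisk1 stands in for the #TempRisk1 temp table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MasterRiskList (Project TEXT, Financial_Year INTEGER)")
conn.execute("CREATE TABLE TempRisk1 (Project TEXT, Financial_Year INTEGER, Risk_1 INTEGER)")
conn.executemany(
    "INSERT INTO MasterRiskList VALUES (?, ?)",
    [("A", 2020), ("A", 2021), ("B", 2020)],
)
conn.executemany(
    "INSERT INTO TempRisk1 VALUES (?, ?, ?)",
    [("A", 2020, 3), ("A", 2020, 3), ("A", 2021, 1), ("B", 2020, 2)],
)

# Condition in the ON clause: unmatched master rows stay and are counted as 0.
filter_in_on = conn.execute(
    """
    SELECT l.Project, l.Financial_Year, COUNT(t.Project) AS HighRiskCount
    FROM MasterRiskList l
    LEFT JOIN TempRisk1 t
           ON t.Project = l.Project
          AND t.Financial_Year = l.Financial_Year
          AND t.Risk_1 = 3
    GROUP BY l.Project, l.Financial_Year
    ORDER BY l.Project, l.Financial_Year
    """
).fetchall()

# Condition in WHERE: rows where t.Risk_1 is NULL or not 3 are removed after the join.
filter_in_where = conn.execute(
    """
    SELECT l.Project, l.Financial_Year, COUNT(t.Project) AS HighRiskCount
    FROM MasterRiskList l
    LEFT JOIN TempRisk1 t
           ON t.Project = l.Project
          AND t.Financial_Year = l.Financial_Year
    WHERE t.Risk_1 = 3
    GROUP BY l.Project, l.Financial_Year
    ORDER BY l.Project, l.Financial_Year
    """
).fetchall()

print(filter_in_on)     # [('A', 2020, 2), ('A', 2021, 0), ('B', 2020, 0)]
print(filter_in_where)  # [('A', 2020, 2)]
```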
Wrap your SELECT Query in an ISNULL:
SELECT ISNULL((SELECT Project, Financial_Year, COUNT(*) AS hrc
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year),0) AS HighRiskCount
If your SELECT returns a number, it will pass through. If it returns NULL, the 0 will pass through.
Assuming you have your 'Project' and 'Financial_Year' where Risk_1 is different than 3, and those are the ones you intend to include.
SELECT Project, Financial_Year, SUM(CASE WHEN RISK_1 = 3 THEN 1 ELSE 0 END) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
GROUP BY Project, Financial_Year
Notice that I removed the WHERE clause.
By the way, your current query is not returning null, it is returning no rows.
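A minimal sketch of the difference, again on SQLite via Python's sqlite3 module with invented rows (TempRisk1 stands in for #TempRisk1): the conditional sum keeps every Project / Financial_Year group that exists in the table, while the original WHERE-filtered count drops the groups with no matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TempRisk1 (Project TEXT, Financial_Year INTEGER, Risk_1 INTEGER)")
conn.executemany(
    "INSERT INTO TempRisk1 VALUES (?, ?, ?)",
    [("A", 2020, 3), ("A", 2020, 3), ("A", 2021, 1), ("B", 2020, 2)],
)

# Every group is kept; rows with Risk_1 <> 3 contribute 0 to the sum.
conditional_sum = conn.execute(
    """
    SELECT Project, Financial_Year,
           SUM(CASE WHEN Risk_1 = 3 THEN 1 ELSE 0 END) AS HighRiskCount
    FROM TempRisk1
    GROUP BY Project, Financial_Year
    ORDER BY Project, Financial_Year
    """
).fetchall()

# The original form: groups without a Risk_1 = 3 row vanish before grouping.
where_filtered = conn.execute(
    """
    SELECT Project, Financial_Year, COUNT(*) AS HighRiskCount
    FROM TempRisk1
    WHERE Risk_1 = 3
    GROUP BY Project, Financial_Year
    ORDER BY Project, Financial_Year
    """
).fetchall()

print(conditional_sum)  # [('A', 2020, 2), ('A', 2021, 0), ('B', 2020, 0)]
print(where_filtered)   # [('A', 2020, 2)]
```

This only produces zero rows for combinations that appear somewhere in the table; combinations that never appear at all still need a separate list to join against.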
Use:
SELECT x.Project, x.Financial_Year,
COUNT(y.Project) AS HighRiskCount
INTO #HighRisk
FROM (SELECT DISTINCT t.Project, t.Financial_Year
FROM #TempRisk1 t) x
LEFT JOIN #TempRisk1 y ON y.Project = x.Project
AND y.Financial_Year = x.Financial_Year
AND y.Risk_1 = 3
GROUP BY x.Project, x.Financial_Year
The only way to get zero counts is to use an OUTER join against a list of the distinct values you want to see zero counts for.
SQL generally has a problem returning the values that aren't in a table. To accomplish this (without a stored procedure, in any event), you'll need another table that contains the missing values.
Assuming you want one row per project / financial year combination, you'll need a table that contains each valid Project, Financial_Year combination:
SELECT PY.Project, PY.Financial_Year, COUNT(HR.Risk_1) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1 HR RIGHT OUTER JOIN ProjectYears PY
ON HR.Project = PY.Project AND HR.Financial_Year = PY.Financial_Year
AND HR.Risk_1 = 3
GROUP BY PY.Project, PY.Financial_Year
Note that we're taking advantage of the fact that COUNT() will only count non-NULL values to get a 0 COUNT result for those result set records that are made up only of data from the new ProjectYears table.
Alternatively, you might want only one 0-count record to be returned per project (or maybe one per financial_year). You would modify the above solution so that the JOINed table has only that one column.
Little longer, but what about this as a solution?
IF EXISTS (
SELECT *
FROM #TempRisk1
WHERE Risk_1 = 3
)
BEGIN
SELECT Project, Financial_Year, COUNT(*) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year
END
ELSE
BEGIN
INSERT INTO #HighRisk
SELECT 'Project', 'Financial_Year', 0
END
MSDN - ISNULL function
SELECT Project, Financial_Year, ISNULL(COUNT(*), 0) AS HighRiskCount
INTO #HighRisk
FROM #TempRisk1
WHERE Risk_1 = 3
GROUP BY Project, Financial_Year