Aggregate SQL Query, GROUP BY Causing Issues

Aggregate SQL Query, GROUP BY Causing Issues - sql

Everything in this query works except for the second LEFT JOIN, where BEGIN_DATE and END_DATE are. Because I have to group by the additional columns, so they can be used in the "on join", I am getting false numbers. Is there any way to do this without having to group by. I hope this makes sense. Basically because I have to include BEGIN_DATE AND END_DATE in the group by, everything gets lost.
SELECT
to_char(T1.CALL_TIMESTAMP,'YYYY-IW') AS OMONTH
,COUNT(T1.HOUSE) AS NODECALLS
,T3.NODE_CODE
,T5.NODECUSTCOUNT
,T1.CALL_CATEGORY_LVL_3
,sum((CASE WHEN T1.TC_WIP_TRANSACTION_ID IS NOT NULL THEN 1 ELSE 0 END )) AS TC
,sum((CASE WHEN T1.TC_WIP_TRANSACTION_ID IS NOT NULL THEN 1 ELSE 0 END ))/nullif(COUNT(T1.HOUSE), 0) AS SVRATEPERCALL
,COUNT(T1.HOUSE)/ nullif(T5.NODECUSTCOUNT, 0) AS CALLRATE
FROM CVKOMNZP.NZKOMUSER.NFOV_INBD_REMEDY_CALL_DETAILS T1
LEFT JOIN
(
SELECT T2.NODE_CODE,T2.BEGIN_DATE,T2.END_DATE,T2.HOUSE,T2.CORP
FROM CVKOMNZP.NZKOMUSER.D_HOUSEHOLD_CH_HIST T2
) T3
ON T1.CORP = T3.CORP AND T1.HOUSE = T3.HOUSE AND (T1.CALL_TIMESTAMP BETWEEN T3.BEGIN_DATE AND T3.END_DATE)
LEFT JOIN
(
SELECT count(ADM_HOUSEHOLD_ID) AS NODECUSTCOUNT,NODE_CODE,BEGIN_DATE, END_DATE
FROM CVKOMNZP.NZKOMUSER.D_HOUSEHOLD_CH_HIST
WHERE HOUSE_STATUS_CODE = 2
AND END_DATE = '2999-12-31 00:00:00'
AND T1.CALL_TIMESTAMP BETWEEN BEGIN_DATE AND END_DATE
GROUP BY NODE_CODE,BEGIN_DATE,END_DATE
) T5
ON T5.NODE_CODE = T3.NODE_CODE AND T1.CALL_TIMESTAMP BETWEEN T5.BEGIN_DATE AND T5.END_DATE
WHERE T1.EXCLUSION_FLAG = 'N'
AND T1.CALL_TIMESTAMP >= To_Date ('07-29-2017', 'MM-DD-YYYY' ) AND T1.CALL_TIMESTAMP <= To_Date ('07-31-2017', 'MM-DD-YYYY' )
GROUP BY
to_char(T1.CALL_TIMESTAMP,'YYYY-IW')
,T3.NODE_CODE
,T5.NODECUSTCOUNT
,T1.CALL_CATEGORY_LVL_3

If I am understanding this right, you want to get a COUNT without grouping by BEGIN and END DATE. However, because your Subquery (2nd LEFT JOIN) needs to include the BEGIN and NEED, you do not know how to group without it.
If this is the case, you'll need a subquery for your count and JOIN it back to the same table.
FYI: Your T1.CALL_TIMESTAMP does not make sense in this subquery since you don't have a table called T1. I renamed it to "a". Feel free to change it to what you want.
See if this make sense
LEFT JOIN
(
SELECT a.BEGIN_DATE,
a.END_DATE,
node.NODECUSTCOUNT,
a.node_code
FROM CVKOMNZP.NZKOMUSER.D_HOUSEHOLD_CH_HIST a
/**Subquery to get a COUNT of all the Node based on NODE_CODE.
You link this back to your query above using the NODE CODE**/
JOIN ( SELECT count(ADM_HOUSEHOLD_ID) AS NODECUSTCOUNT,
NODE_CODE
FROM CVKOMNZP.NZKOMUSER.D_HOUSEHOLD_CH_HIST
GROUP BY NODE_CODE ) node on node.node_code = a.node_code
WHERE a.HOUSE_STATUS_CODE = 2
AND a.END_DATE = '2999-12-31 00:00:00'
AND a.CALL_TIMESTAMP BETWEEN BEGIN_DATE AND END_DATE
) ..JOIN THIS BACK TO YOUR MAIN TABLE

Related

SQL Rowwise comparison between groups

Question
The following is a snippet of my data:
Create Table Emps(person VARCHAR(50), started DATE, stopped DATE);
Insert Into Emps Values
('p1','2015-10-10','2016-10-10'),
('p1','2016-10-11','2017-10-11'),
('p1','2017-10-12','2018-10-13'),
('p2','2019-11-13','2019-11-13'),
('p2','2019-11-14','2020-10-14'),
('p3','2020-07-15','2021-08-15'),
('p3','2021-08-16','2022-08-16');
db<>fiddle.
I want to use T-SQL to get a count of how many persons fulfil the following criteria at least once - multiples should also count as one:
For a person:
One of the dates in 'started' (say s1) is larger than at least one of the dates in 'ended' (say e1)
s1 and e1 are in the same year, to be set manually - e.g. '2021-01-01' until '2022-01-01'
Example expected response
If I put the date range '2016-01-01' until '2017-01-01' somewhere in a WHERE / HAVING clause, the output should be 1 as only p1 has both a start date and an end date that fall in 2016 where the start date is larger than the end date:
s1 = '2016-10-11', and e1 = '2016-10-10'.
Why can't I do this myself
The reason I'm stuck is that I don't know how to do this rowwise comparison between groups. The question requires comparing values across columns (start with end) across rows, within a person ID.

Use conditional aggregation to get the maximum start date and the minimum stop date in the given range.
select person
from emps
group by person
having max(case when started >= '2016-01-01' and started < '2017-01-01'
then started end) >
min(case when stopped >= '2016-01-01' and stopped < '2017-01-01'
then stopped end);
Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=45adb153fcac9ce72708f1283cac7833

I would choose to use a self-outer-join with an exists correlation, it should be pretty much the most performant, all things being equal.
select Count(*)
from emps e
where exists (
select * from emps e2
where e2.person = e.person
and e2.stopped > e.started
and e.started between '20160101' and '20170101'
and e2.started between '20160101' and '20170101'
);

You said you plan to set the dates manually, so this works where we set the start date in one CTE, and the end date in another CTE. Then we calculate the min/max for each, and use that criteria in the query where statement.
with min_max_start as (
select person,
min(started) as min_start, --obsolete
max(started) as max_start
from emps
where started >= '2016-01-01'
group by person
),
min_max_end as (
select person,
min(stopped) as min_stop,
max(stopped) as max_stop --obsolete
from emps
where stopped < '2017-01-01'
group by person
)
select count(distinct e.person)
from emps e
join min_max_start mms
on e.person = mms.person
join min_max_end mme
on e.person = mme.person
where mms.max_start> mme.min_stop
Output: 1

Try the following:
With CTE as
(
Select D.person, D.started, T.stopped,
case
when Year(D.started) = Year(T.stopped) and D.started > T.stopped
then 1
else 0
end as chk
From
(Select person, started From Emps Where started >= '2016-01-01') D
Join
(Select person, stopped From Emps Where stopped <= '2017-01-01') T
On D.person = T.person
)
Select Count(Distinct person) as CNT
From CTE
Where chk = 1;
To get the employee list who met the criteria use the following on the CTE instead of the above Select Count... query:
Select person, started, stopped
From CTE
Where chk = 1;
See a demo from db<>fiddle.

How to select data without using group?

My base data based on dealer code only but in one condition we need to select other field as well to matching the condition in other temp table how can i retrieve data only based on dealercode ith matching the condition on chassis no.
Below is the sample data:
This is how we have selected the data for the requirement:
---------------lastyrRenewalpolicy------------------
IF OBJECT_ID('TEMPDB..#LASTYRETEN') IS NOT NULL DROP TABLE #LASTYRETEN
select DEALERMASTERCODE , count(*) RENEWALEXPRPOLICY,SUM(NETOD_YEAR_PREM_PART_A) AS 'ACHIEVED-ODPREMIUM_RENEWAL' into #LASTYRETEN
from [dbo].[T_RE_POLICY_TRANSACTION]
where cast (InsPolicyCreatedDate as date) between #FirstDayC and #LastDayC
AND PolicyStatus= 'Renewal' AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE
-----------------lastrollower------------------------
IF OBJECT_ID('TEMPDB..#LASTYROLWR') IS NOT NULL DROP TABLE #LASTYROLWR
select DEALERMASTERCODE , count(*) ROLLOWEEXPRPOLICY ,SUM(NETOD_YEAR_PREM_PART_A) AS 'ACHIEVED-ODPREMIUM_ROLLOVER'
into #LASTYROLWR from [dbo].[T_RE_POLICY_TRANSACTION] where cast (InsPolicyCreatedDate as date) between #FirstDayC and #LastDayC
AND PolicyStatus= 'ROLLOVER' AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE
And continue with above flow Below is the other select statement which creating issue at the end due to grouping
:
-------------OTHERYRBASE(EXPIRYRENEWAL)--------------
IF OBJECT_ID('TEMPDB..#OTHERYRBASEEXPIRY') IS NOT NULL DROP TABLE #OTHERYRBASEEXPIRY
select DEALERMASTERCODE ,ChassisNo , count(*) RENEWALPOLICYEXPIRY
into #OTHERYRBASEEXPIRY
from [dbo].[T_RE_POLICY_TRANSACTION] where cast (PolicyExpiryDate as date) between '2020-08-01' and '2020-08-31'
and BASIC_PREM_TOTAL <> 0 AND PolicyStatus in ('Renewal','rollover') and BusinessType='jcb'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by DEALERMASTERCODE,ChassisNo
-------------OTHERYRBASE(EXPIRYRENEWAL)--------------
IF OBJECT_ID('TEMPDB..#OTHERYRCON') IS NOT NULL DROP TABLE #OTHERYRCON
select OTE.DEALERMASTERCODE ,OTE.ChassisNo , count(*) OTHERYRCON into #OTHERYRCON
from [dbo].[T_RE_POLICY_TRANSACTION] OTE INNER JOIN #OTHERYRBASEEXPIRY EXP
ON OTE.ChassisNo=EXP.ChassisNo
where cast(CREATED_DATE as date) between '2020-06-01' and '2020-12-31' and BusinessType='jcb'
and OTE.BASIC_PREM_TOTAL <> 0 AND OTE.PolicyStatus = 'Renewal'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 ) group by OTE.DEALERMASTERCODE,OTE.ChassisNo
Thanks a lot in advance for helping and giving a solution very quickly ///

After taking a look at this code it seems possible there was an omitted JOIN condition in the last SELECT statement. In the code provided the JOIN condition is only on ChassisNo. The GROUP BY in the prior queries which populates the temporary table also included the DEALERMASTERCODE column. I'm thinking DEALERMASTERCODE should be added to the JOIN condition. Something like this
select OTE.DEALERMASTERCODE ,OTE.ChassisNo , count(*) OTHERYRCON
into #OTHERYRCON
from [dbo].[T_RE_POLICY_TRANSACTION] OTE
INNER JOIN #OTHERYRBASEEXPIRY EXP ON OTE.DEALERMASTERCODE=EXP.DEALERMASTERCODE
and OTE.ChassisNo=EXP.ChassisNo
where cast(CREATED_DATE as date) between '2020-06-01' and '2020-12-31'
and BusinessType='jcb'
and OTE.BASIC_PREM_TOTAL <> 0
AND OTE.PolicyStatus = 'Renewal'
AND (ltrim(rtrim(ISCANCELLEDSTATUS)) = 0 )
group by OTE.DEALERMASTERCODE,OTE.ChassisNo;

SQL query for extracting accounts whose last load date is not equal to today

I seem to have some issue in the query and need your help.
I have 2 tables:
1st table contains Bank account details - account number, status etc - bankacc
2nd table stores name of the statement and the load date on which the statement is imported - bankstm
I am trying to write a query that will populate only those bank accounts whose statement was not imported as of today date.
Date format in database - 2020-01-17 00:00:00.000
Code that i have tried:
SELECT b.bank_acc as Bank_Account, max(b.date_ld) as Load_Date from bankstm b
where b.date_ld < CAST(GETDATE() AS DATE) and
b.bank_acc in (select a.acc_no from bankacc a where a.in_use = 'Y' and a.analyse03 = '1517')
group by b.bank_acc
This code populates all the records from previous date whereas most of them statements loaded today.
I also attempted the code with '=' or '<>' or '>' based on the queries raised previously in stack overflow. But nothing seems to be giving me the correct result.
So finally i am raising it for experts to help me out.

You need to apply the date filter on the max.
I cast the max(b.date_ld) to date in case its datetime format
SELECT b.bank_acc as Bank_Account, max(b.date_ld) as Load_Date from bankstm b
where
b.bank_acc in (select a.acc_no from bankacc a where a.in_use = 'Y' and a.analyse03 = '1517')
group by b.bank_acc
having cast(max(b.date_ld) as date) < CAST(GETDATE() AS DATE)

You can modify your statement to use a not exists if your only criteria is that the record doesn't have a corresponding entry for today's date as a load date.
If the criteria is different, may require modification.
SELECT [b].[bank_acc] AS [bank_account]
, MAX([b].[date_ld]) AS [load_date]
FROM bankstm AS b
WHERE NOT EXISTS
(
SELECT 1
FROM [bankstm] AS [bb]
WHERE [b].[bank_acc] = [bb].[bank_acc] AND
TRY_CONVERT(DATE, [bb].[date_ld]) = TRY_CONVERT(DATE, GETDATE())
)
AND EXISTS
(
SELECT 1
FROM [bankacct] a
WHERE b.bank_acc = a.bank_acc and a.in_use = 'Y' and a.analyse03 = '1517'
)
GROUP BY b.bank_acc
;

first of all you can improve your query with join and avoid using sub query.
SELECT b.bank_acc as Bank_Account, max(b.date_ld) as Load_Date
FROM bankstm AS b
LEFT JOIN bankacc AS ba ON b.bank_acc = ba.acc_no
WHERE ba.in_use = 'Y'
AND ba.analyse03 = '1517'
GROUP BY b.bank_acc
HAVING CAST(MAX(b.date_ld) AS DATE) < CAST(GETDATE() AS DATE)

I would use not exists:
select ba.*
from bankacc ba
where ba.in_use = 'Y' and
ba.analyse03 = '1517' and
not exists (select 1
from bankstm bs
where bs.bank_acc = ba.acc_no and
bs.date_ld = convert(date, getdate())
);
For performance, you want indexes on bankacc(in_use, analyse03, acc_no) and bankstm(bank_acc, date_ld).

Calculating difference in rows for many columns in SQL (Access)

What's up guys. I have an other question regarding using SQL to analyze. I have a table build like this.
ID Date Value
1 31.01.2019 10
1 30.01.2019 5
2 31.01.2019 20
2 30.01.2019 10
3 31.01.2019 30
3 30.01.2019 20
With many different IDs and many different Dates. What I would like to have as an output is an additional column, that gives me the difference to the previous date for each ID. So that I can then analyze the change of values between days for each Category (ID). To do that I would need to avoid that the command computes the difference of Last Day WHERE ID = 1 - First Day WHERE ID = 2.
Desired Output:
ID Date Difference to previous Days
1 31.01.2019 5
2 31.01.2019 10
3 31.01.2019 10
In the end I want to find outlier, so days where the difference in value between two days is very large. Does anyone have a solution? If it is not possible with Access, I am open to solutions with Excel, but Access should be the first choice as it is more scaleable.
Greetings and thanks in advance!!

With a self join:
select t1.ID, t1.[Date],
t1.[Value] - t2.[Value] as [Difference to previous Day]
from tablename t1 inner join tablename t2
on t2.[ID] = t1.[ID] and t2.[Date] = t1.[Date] - 1
Results:
ID Date Difference to previous Day
1 31/1/2019 5
2 31/1/2019 10
3 31/1/2019 10
Edit.
For the case that there are gaps between your dates:
select
t1.ID, t1.[Date], t1.[Value] - t2.[Value] as [Difference to previous Day]
from (
select t.ID, t.[Date], t.[Value],
(select max(tt.[Date]) from tablename as tt where ID = t.ID and tt.[Date] < t.[Date]) as prevdate
from tablename as t
) as t1 inner join tablename as t2
on t2.ID = t1.ID and t2.[Date] = t1.prevdate

In your example data, each id has the same two rows and the values are increasing. If this is generally true, then you can simply use aggregation:
select id, max(date), max(value) - min(value)
from t
group by id;
If the values might not be increasing, but the dates are the same, then you can use conditional aggregation:
select id,
max(date),
(max(iif(date = "31.01.2019", value, null)) -
max(iif(date = "30.01.2019", value, null))
) as diff
from t
group by id;
Note: Your date looks like it is using a bespoke format, so I am just doing the comparison as a string.
If previous date is exactly one day before, you can use a join:
select t.*,
(t.value - tprev.value) as diff
from t left join
t as tprev
on t.id = tprev.di and t.date = dateadd("d", 1, tprev.date);
If date is arbitrarily the previous date in the table, then you can use a correlated subquery
select t.*,
(t.value -
(select top (1) tprev.value
from t as tprev
where tprev.id = t.id and tprev.date < t.date
order by tprev.date desc
)
) as diff
(t.value - tprev.value) as diff
from t;

You can use a self join with an additional condition using a sub-query to determine the previous date
SELECT t.ID, t.Date, t.Value - prev.Value AS Diff
FROM
dtvalues AS t
INNER JOIN dtvalues AS prev
ON t.ID = prev.ID
WHERE
prev.[Date] = (SELECT MAX(x.[Date]) FROM dtvalues x WHERE x.ID=t.ID AND x.[Date]<t.[Date])
ORDER BY t.ID, t.[Date];
You could also include the where condition into the join condition, but the query designer would not be able to handle the query anymore. Like this, you can still edit the query in the query designer.

SQL Oracle Query - Left Outer Join on null Field

I have this query
SELECT *
FROM (SELECT mi.visit_id, mi.event_id, mi.patient_id, mi.mrn, mi.reg_date,
mi.d_date, mi.bml_count, mi.TYPE, mblp.baby_patient_id,
mblp.baby_birthdate
FROM ajmid.km0076_motherinfo_test mi LEFT JOIN alfayezb2.mbl_patients mblp
ON mblp.mother_patient_id = mi.patient_id
--works here
AND ( TO_CHAR (mblp.baby_birthdate, 'mm/dd/YYYY') =
TO_CHAR (mi.reg_date, 'mm/dd/YYYY')
OR TO_CHAR (mblp.baby_birthdate, 'mm/dd/YYYY') =
TO_CHAR (mi.reg_date - 1, 'mm/dd/YYYY')
OR TO_CHAR (mblp.baby_birthdate, 'mm/dd/YYYY') =
TO_CHAR (mi.reg_date + 1, 'mm/dd/YYYY')
)
) bml
LEFT OUTER JOIN --doesn't work here
(SELECT ROW_NUMBER () OVER (PARTITION BY vis.patient_id ORDER BY vis.admission_date_time)
num,
vis.admission_date_time, vis.visit_id, vis.patient_id,
vis.facility_id
FROM visit vis) v ON bml.baby_patient_id = v.patient_id
WHERE v.num = 1
ORDER BY bml.reg_date
bml by itself returns 118 rows while the whole query returns 117, the reason is bml returns 1 row with baby_patient_id as null, so I used left outer join to show it, but it's still not showing !!
what can I do to show all rows of bml ?
I'm using Toad 9.6
Thank you

check the query:
SELECT ROW_NUMBER () OVER (PARTITION BY vis.patient_id ORDER BY vis.admission_date_time)
num,
vis.admission_date_time, vis.visit_id, vis.patient_id,
vis.facility_id
FROM visit vis
does it return 118 not null patient_id's?
if it returns 117, that might be the reason.(LEFT OUTER JOIN doesnot pick the records which are null on both tables)
Also, are you sure the null value of baby_patient_id in bml table is actually a NULL value and not a empty charater?(' ').

The probable cause is your filter / criteria (where clause) is eliminating a row with a null value for v.num. The WHERE filters the results after the join.
WHERE v.num = 1 -- Are all v.num equal to 1 ?
The mere action of using a criteria against a field, by definition of what NULL means, eliminates that row from consideration because NULL cannot be evaluated. You can say "WHERE id != 1" and expect to get rows where id is null because null != 1 right? Wrong. id != NULL is not defined logically. It is why we say "IS or IS NOT NULL" when dealing with NULL.

it's working finally !
I added
OR bml.baby_patient_id IS NULL
to the where clause, so the final script is
SELECT *
FROM (SELECT mi.visit_id, mi.event_id, mi.patient_id, mi.mrn, mi.reg_date,
mi.d_date, mi.bml_count, mi.TYPE, mblp.baby_patient_id,
mblp.baby_birthdate
FROM ajmid.km0076_motherinfo_test mi LEFT JOIN alfayezb2.mbl_patients mblp
ON mblp.mother_patient_id = mi.patient_id
AND ( TO_CHAR (mblp.baby_birthdate, 'mm/dd/YYYY') =
TO_CHAR (mi.reg_date, 'mm/dd/YYYY')
OR TO_CHAR (mblp.baby_birthdate, 'mm/dd/YYYY') =
TO_CHAR (mi.reg_date - 1, 'mm/dd/YYYY')
OR TO_CHAR (mblp.baby_birthdate, 'mm/dd/YYYY') =
TO_CHAR (mi.reg_date + 1, 'mm/dd/YYYY')
)
) bml
LEFT OUTER JOIN
(SELECT ROW_NUMBER () OVER (PARTITION BY vis.patient_id ORDER BY vis.admission_date_time)
num,
vis.admission_date_time, vis.visit_id, vis.patient_id,
vis.facility_id
FROM visit vis) v ON bml.baby_patient_id = v.patient_id
WHERE v.num = 1
OR bml.baby_patient_id IS NULL
ORDER BY bml.reg_date
I don't know how this was helpful, I wish someone would explain for me !
Thanks all

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Aggregate SQL Query, GROUP BY Causing Issues - sql

Related

SQL Rowwise comparison between groups

How to select data without using group?

SQL query for extracting accounts whose last load date is not equal to today

Calculating difference in rows for many columns in SQL (Access)

SQL Oracle Query - Left Outer Join on null Field

Categories

Resources