Having clause being ignored - sql

In this query, I am trying to count the patients for each practice who meet certain conditions.
The issue is that I have to show only patients who have had >=3 office visits in the past year.
Count(D.PID) in the select list seems to ignore HAVING count(admitdatetime)>=3.
Here is my query:
select distinct D.PracticeAbbrevName, D.ProviderLastName, count(D.pid) AS Count
from PersonDetail AS D
left join Visit AS V on D.PID = V.PID
where D.A1C >=7.5 and V.admitdatetime >= (getdate()-365) and D.A1CDays <180 and D.Diabetes = 1
group by D.PracticeAbbrevName, D.ProviderLastName
having count(admitdatetime)>=3
order by PracticeAbbrevName
If I get rid of the count function for D.pid and just display each PID individually, my HAVING clause works properly.
There is something about count and HAVING that does not work properly together.

Revised answer:
SELECT DISTINCT
D.PracticeAbbrevName,
D.ProviderLastName,
COUNT(D.pid) AS PIDCount,
COUNT(admitdatetime) AS AdmitCount
FROM
PersonDetail AS D
LEFT JOIN Visit AS V
ON D.PID = V.PID
WHERE
D.A1C >= 7.5
AND V.admitdatetime >= ( GETDATE() - 365 )
AND D.A1CDays < 180
AND D.Diabetes = 1
GROUP BY
D.PracticeAbbrevName,
D.ProviderLastName
HAVING
COUNT(admitdatetime) >= 3
ORDER BY
PracticeAbbrevName

You're trying to do too much at once. Split the logic into two steps:
1. A query grouping by PID to filter out patients that don't meet your criteria.
2. A query grouping by practice to get a patient count.
Your query would look like this:
;with EligiblePatients as (
select d.pid,
d.PracticeAbbrevName,
d.ProviderLastName
from PersonDetail d
left join Visit v
on v.pid = d.pid
and v.admitdatetime >= (getdate()-365)
where d.A1C >= 7.5
and d.A1CDays < 180
and d.Diabetes = 1
group by d.pid,
d.PracticeAbbrevName,
d.ProviderLastName
having count(v.pid) >= 3
)
select PracticeAbbrevName,
ProviderLastName,
COUNT(*) as PatientCount
from EligiblePatients
group by PracticeAbbrevName,
ProviderLastName
order by PracticeAbbrevName
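
To see why the one-step query appears to ignore HAVING, here is a minimal sketch using Python's built-in sqlite3 module. The tables are cut down to the columns that matter and every row is made up for illustration; the A1C and date filters are left out, and SQLite stands in for SQL Server. In the one-step query, HAVING is evaluated per practice over all joined visit rows, so the count is of visits, not patients.

```python
import sqlite3

# Hypothetical miniature of PersonDetail / Visit; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PersonDetail (PID INTEGER, PracticeAbbrevName TEXT);
CREATE TABLE Visit (PID INTEGER, admitdatetime TEXT);
INSERT INTO PersonDetail VALUES (1, 'North'), (2, 'North'), (3, 'North');
-- Patient 1 has three visits; patients 2 and 3 have one each.
INSERT INTO Visit VALUES
  (1, '2024-01-05'), (1, '2024-02-05'), (1, '2024-03-05'),
  (2, '2024-01-10'), (3, '2024-01-11');
""")

# One step: HAVING is applied to each practice group, which holds
# every joined visit row for every patient of that practice.
one_step = conn.execute("""
    SELECT D.PracticeAbbrevName, COUNT(D.PID)
    FROM PersonDetail AS D
    JOIN Visit AS V ON V.PID = D.PID
    GROUP BY D.PracticeAbbrevName
    HAVING COUNT(V.admitdatetime) >= 3
""").fetchall()

# Two steps: filter per patient first, then count patients per practice.
two_step = conn.execute("""
    WITH EligiblePatients AS (
        SELECT D.PID, D.PracticeAbbrevName
        FROM PersonDetail AS D
        JOIN Visit AS V ON V.PID = D.PID
        GROUP BY D.PID, D.PracticeAbbrevName
        HAVING COUNT(V.PID) >= 3
    )
    SELECT PracticeAbbrevName, COUNT(*)
    FROM EligiblePatients
    GROUP BY PracticeAbbrevName
""").fetchall()

print(one_step)  # [('North', 5)]  five joined rows, not patients
print(two_step)  # [('North', 1)]  only patient 1 has >= 3 visits
```

The practice as a whole has five visits, so it passes the HAVING test even though only one patient qualifies. Grouping by PID first is what makes the threshold apply to each patient.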

Related

SELECT list expression references column integration_start_date which is neither grouped nor aggregated at

I'm facing an issue with the following query. It gives me this error: [SELECT list expression references column integration_start_date which is neither grouped nor aggregated at [34:63]]. In particular, it points to the first 'when' in the result table, and I don't know how to fix it. This is on BigQuery, if that helps. As far as I can tell everything is written correctly, but I could be wrong. Any help is appreciated.
with plan_data as (
select format_date("%Y-%m-%d",last_day(date(a.basis_date))) as invoice_date,
a.sponsor_id as sponsor_id,
b.company_name as sponsor_name,
REPLACE(SUBSTR(d.meta,STRPOS(d.meta,'merchant_id')+12,13),'"','') as merchant_id,
a.state as plan_state,
date(c.start_date) as plan_start_date,
a.employee_id as square_employee_id,
date(
(select min(date)
from glproductionview.stats_sponsors
where sponsor_id = a.sponsor_id and sponsor_payroll_provider_identifier = 'square' and date >= c.start_date) )
as integration_start_date,
count(distinct a.employee_id) as eligible_pts_count, --pts that are in active plan and have payroll activities (payroll deductions) in the reporting month
from glproductionview.payroll_activities as a
left join glproductionview.sponsors as b
on a.sponsor_id = b.id
left join glproductionview.dc_plans as c
on a.plan_id = c.id
left join glproductionview.payroll_connections as d
on a.sponsor_id = d.sponsor_id and d.provider_identifier = 'rocket' and a.company_id = d.payroll_id
where a.payroll_provider_identifier = 'rocket'
and format_date("%Y-%m",date(a.basis_date)) = '2021-07'
and a.amount_cents > 0
group by 1,2,3,4,5,6,7,8
order by 2 asc
)
select invoice_date,
sponsor_id,
sponsor_name,
eligible_pts_count,
case
when eligible_pts_count <= 5 and date_diff(current_date(),integration_start_date, month) <= 12 then 20
when eligible_pts_count <= 5 and date_diff(current_date(),integration_start_date, month) > 12 then 15
when eligible_pts_count > 5 and date_diff(current_date(),integration_start_date, month) <= 12 then count(distinct square_employee_id)*4
when eligible_pts_count > 5 and date_diff(current_date(),integration_start_date, month) > 12 then count(distinct square_employee_id)*3
else 0
end as fees
from plan_data
group by 1,2,3,4;

Fill in blank dates for rolling average - CTE in Snowflake

I have two tables – activity and purchase
Activity table:
user_id date videos_watched
1 2020-01-02 3
1 2020-01-04 5
1 2020-01-07 5
Purchase table:
user_id purchase_date
1 2020-01-01
2 2020-02-02
What I would like to do is get a 30-day rolling average, counted from the purchase date, of how many videos have been watched.
The base query is like this:
SELECT
DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
AVG(A.VIDEOS_VIEWED)
FROM PURCHASE P
LEFT OUTER JOIN ACTIVITY A ON P.USER_ID = A.USER_ID AND
A.DATE >= P.PURCHASE_DATE AND A.DATE <= DATEADD(DAY, 30, P.PURCHASE_DATE)
GROUP BY 1;
However, the Activity table only has records for each day a video has been logged. I would like to fill in the blanks for days a video has not been viewed.
I have started to look into using a CTE like this:
WITH cte AS (
SELECT date('2020-01-01') as fdate
UNION ALL
SELECT CAST(DATEADD(day,1,fdate) as date)
FROM cte
WHERE fdate < date('2020-04-01')
) select * from cte
cross join purchases p
left outer join activity a
on p.user_id = a.user_id
and a.fdate = p.purchase_date
and a.date >= p.purchase_date and a.date <= dateadd(day, 30, p.purchase_date)
The end goal is to have something like this:
days_since_purchase videos_watched
1 3
2 0 --CTE coalesce inserted value
3 0
4 5
Been trying for the last couple of hours to get it right, but still can't really get the hang of it.
If you want to fill in the gaps in the result set, then I think you should be generating integers rather than dates:
WITH cte AS (
SELECT 1 as day_since_purchase
UNION ALL
SELECT 1 + day_since_purchase
FROM cte
WHERE day_since_purchase < 4
)
SELECT cte.day_since_purchase, COALESCE(avg_videos_viewed, 0)
FROM cte LEFT JOIN
(SELECT DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
AVG(A.VIDEOS_VIEWED) as avg_videos_viewed
FROM purchases p JOIN
activity a
ON p.user_id = a.user_id AND
a.fdate = p.purchase_date AND
a.date >= p.purchase_date AND
a.date <= dateadd(day, 30, p.purchase_date)
GROUP BY 1
) pa
ON pa.day_since_purchase = cte.day_since_purchase;
You can use a recursive query to generate the 30 days following each purchase, then bring the activity table:
with cte as (
select
purchase_date,
user_id,
0 days_since_purchase,
purchase_date dt
from purchases
union all
select
purchase_date,
user_id,
days_since_purchase + 1,
dateadd(day, days_since_purchase + 1, purchase_date)
from cte
where days_since_purchase < 30
)
select
c.days_since_purchase,
avg(coalesce(a.videos_watched, 0)) avg_videos_watched
from cte c
left join activity a
on a.user_id = c.user_id
and a.fdate = c.purchase_date
and a.date = c.dt
group by c.days_since_purchase
Your question is unclear on whether you have a column in the activity table that stores the purchase date each row relates to. Your query has column fdate but not your sample data. I used that column in the query (without such column, you might end up counting the same activity in different purchases).
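
The gap-filling idea in these answers can be tried out locally. Below is a rough sketch with Python's sqlite3 module, using the sample activity rows from the question and only the first purchase (user 1). It is SQLite rather than Snowflake, so date(..., '+N days') stands in for DATEADD, the fdate column is left out, and the six-day horizon is an arbitrary choice for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (user_id INTEGER, purchase_date TEXT);
CREATE TABLE activity (user_id INTEGER, date TEXT, videos_watched INTEGER);
INSERT INTO purchases VALUES (1, '2020-01-01');
INSERT INTO activity VALUES
  (1, '2020-01-02', 3), (1, '2020-01-04', 5), (1, '2020-01-07', 5);
""")

rows = conn.execute("""
    WITH RECURSIVE days(n) AS (
        -- generate the integers 1..6: one row per day since purchase
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM days WHERE n < 6
    )
    SELECT d.n AS days_since_purchase,
           -- days without a matching activity row count as 0 videos
           AVG(COALESCE(a.videos_watched, 0)) AS avg_videos_watched
    FROM purchases AS p
    CROSS JOIN days AS d
    LEFT JOIN activity AS a
           ON a.user_id = p.user_id
          AND a.date = date(p.purchase_date, '+' || d.n || ' days')
    GROUP BY d.n
    ORDER BY d.n
""").fetchall()

print(rows)
# [(1, 3.0), (2, 0.0), (3, 5.0), (4, 0.0), (5, 0.0), (6, 5.0)]
```

The COALESCE has to sit inside AVG: a missing day then contributes a zero to the average instead of being skipped as NULL.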

Oracle query to get the top 20 agencies by passes issued

I have a query which shows the passes issued by each agency, by date. I want to get the top 20 agencies that have issued the most passes. Here is my query
Nothing in your data identifies "agency". If I assume you mean "agent", you can get the top 20 by aggregating and then limiting the result. In Oracle 12C+, you can use:
SELECT gp.agent_id, a.agent_name, COUNT(*)
FROM eofficeuat.gatepass gp INNER JOIN
eofficeuat.cnf_agents a
ON gp.agent_id = a.agent_id INNER JOIN
eofficeuat.cardprintlog_user u
ON gp.agent_id = u.agent_id
WHERE gp.issuedatetime BETWEEN DATE '2019-09-28' AND DATE '2019-09-29'
GROUP BY gp.agent_id, a.agent_name
ORDER BY COUNT(*) DESC
FETCH FIRST 20 ROWS ONLY;
In earlier versions, a subquery is needed:
SELECT *
FROM (SELECT gp.agent_id, a.agent_name, COUNT(*)
FROM eofficeuat.gatepass gp INNER JOIN
eofficeuat.cnf_agents a
ON gp.agent_id = a.agent_id INNER JOIN
eofficeuat.cardprintlog_user u
ON gp.agent_id = u.agent_id
WHERE gp.issuedatetime BETWEEN DATE '2019-09-28' AND DATE '2019-09-29'
GROUP BY gp.agent_id, a.agent_name
ORDER BY COUNT(*) DESC
) a
WHERE rownum <= 20;
Obviously, if you do mean "agency" and that is identified by different columns, you would just adjust the SELECT and GROUP BY clauses.
Also, I would advise you never to use BETWEEN on dates in Oracle. There is a time component that might cause issues.
If you intend only times on '2019-09-28', then:
gp.issuedatetime >= DATE '2019-09-28' AND
gp.issuedatetime < DATE '2019-09-29'
If you intend both the 28th and 29th:
gp.issuedatetime >= DATE '2019-09-28' AND
gp.issuedatetime < DATE '2019-09-30'
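
To illustrate the point about the time component, here is a small sketch with Python's sqlite3 module. The gatepass rows are invented, and SQLite stores the timestamps as ISO text rather than as Oracle DATE values, but for this comparison the outcome is the same: an upper bound of midnight on the 29th silently drops everything later that day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gatepass (id INTEGER, issuedatetime TEXT);
INSERT INTO gatepass VALUES
  (1, '2019-09-28 09:15:00'),
  (2, '2019-09-29 00:00:00'),
  (3, '2019-09-29 14:30:00');
""")

# BETWEEN with a midnight upper bound: the 14:30 pass on the 29th is lost.
between_ids = [r[0] for r in conn.execute("""
    SELECT id FROM gatepass
    WHERE issuedatetime BETWEEN '2019-09-28 00:00:00' AND '2019-09-29 00:00:00'
    ORDER BY id
""")]

# Half-open range: >= first day, < the day after the last day.
half_open_ids = [r[0] for r in conn.execute("""
    SELECT id FROM gatepass
    WHERE issuedatetime >= '2019-09-28 00:00:00'
      AND issuedatetime <  '2019-09-30 00:00:00'
    ORDER BY id
""")]

print(between_ids)    # [1, 2]
print(half_open_ids)  # [1, 2, 3]
```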
Oracle has no LIMIT clause, but in 12c or later you can use the row-limiting clause (FETCH FIRST) to get the top 20 records, as follows:
SELECT eofficeuat.gatepass.agent_id, eofficeuat.cnf_agents.agent_name, COUNT(1) as cnt
FROM eofficeuat.gatepass INNER JOIN
eofficeuat.cnf_agents
ON eofficeuat.gatepass.agent_id = eofficeuat.cnf_agents.agent_id INNER JOIN
eofficeuat.cardprintlog_user
ON eofficeuat.gatepass.agent_id = eofficeuat.cardprintlog_user.agent_id
WHERE eofficeuat.gatepass.issuedatetime BETWEEN DATE '2019-09-28' AND DATE '2019-09-29'
GROUP BY eofficeuat.gatepass.agent_id, eofficeuat.cnf_agents.agent_name
ORDER BY cnt DESC
FETCH FIRST 20 ROWS ONLY; -- this will fetch top 20 agents
Cheers!!

Other alternatives to achieve LIMIT in SQL

I have created an SQL query to get certain data with LIMIT so I can use it in datatable. It has 76288 rows.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber,
ContainerNumber, BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber,
b.SealNumber, b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 100
ORDER BY TransDate ASC;
The bounds 0 and 100 come from variables that change when a pagination link is clicked.
The first hundred pages load fine, but when I click the last page the query fails with a timeout. The datatable's search function also doesn't work the way it should.
Example: when I search for dino in the datatable, it says there are 95 records but shows only 1, because the query still only looks at rows 0 to 10.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber, ContainerNumber,
BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber, b.SealNumber,
b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 10 AND AgentName LIKE '%dino%'
ORDER BY TransDate ASC;
I also tried TOP with EXCEPT (SELECT TOP 0... EXCEPT SELECT TOP 100...), but it only shows 9 rows.
UPDATE:
I was able to make it work by including the WHERE clause in the subquery. My only problem now is the ORDER BY. It only works in the current page shown which is data 1 - 10 but not for all the data.
Any alternatives? Your help is highly appreciated. Thanks!
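
The search behaviour described above comes down to where the filter sits relative to ROW_NUMBER(). The sketch below reproduces it with Python's sqlite3 module on 30 made-up rows; it assumes a SQLite build with window functions (3.25 or later), and the page() helper is hypothetical, written only for this demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cargo_Transactions (TransDate TEXT, AgentName TEXT)")
# 30 hypothetical rows; every third agent is named 'dino' (10 in total).
conn.executemany(
    "INSERT INTO Cargo_Transactions VALUES (?, ?)",
    [(f"2018-01-{d:02d}", "dino" if d % 3 == 0 else "other") for d in range(1, 31)],
)

def page(filter_inside, first, last):
    """Return one page of TransDate values (hypothetical helper).

    filter_inside=True applies the search before rows are numbered;
    False applies it after, as in the original query.
    """
    inner = "WHERE AgentName LIKE '%dino%'" if filter_inside else ""
    outer = "" if filter_inside else "AND AgentName LIKE '%dino%'"
    sql = f"""
        SELECT TransDate FROM (
            SELECT TransDate, AgentName,
                   ROW_NUMBER() OVER (ORDER BY TransDate) AS R
            FROM Cargo_Transactions
            {inner}
        ) AS f
        WHERE R BETWEEN ? AND ? {outer}
        ORDER BY TransDate
    """
    return [r[0] for r in conn.execute(sql, (first, last))]

outside = page(False, 1, 10)  # numbers all rows, then searches rows 1-10 only
inside = page(True, 1, 10)    # searches first, then numbers the matches

print(len(outside))  # 3
print(len(inside))   # 10
```

The same reasoning applies to sorting: whatever order the user picks has to go into the OVER (ORDER BY ...) clause, otherwise only the rows already on the current page get reordered.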

Trying to create a SQL query

I am trying to create a query that retrieves only the ten companies with the highest number of pickups over the six-month period. This means pickup occasions, not the number of items picked up.
I have done this
SELECT *
FROM customer
JOIN (SELECT manifest.pickup_customer_ref reference,
DENSE_RANK() OVER (PARTITION BY manifest.pickup_customer_ref ORDER BY COUNT(manifest.trip_id) DESC) rnk
FROM manifest
INNER JOIN trip ON manifest.trip_id = trip.trip_id
WHERE trip.departure_date > TRUNC(SYSDATE) - interval '6' month
GROUP BY manifest.pickup_customer_ref) cm ON customer.reference = cm.reference
WHERE cm.rnk < 11;
This uses dense_rank to order customers with the highest number of trips first.
Hmm, well, I don't have Oracle so I can't test it 100%, but I believe you're looking for something like the following:
Keep in mind that when you use GROUP BY, every non-aggregated column in the SELECT list must also appear in the GROUP BY. Hope this at least gives you an idea of what to look at.
select TOP 10
c.company_name,
m.pickup_customer_ref,
count(*) as pickup_count
from customer c
inner join manifest m on m.pickup_customer_ref = c.reference
inner join trip t on t.trip_id = m.trip_id
-- keep only trips from the last six months
where t.departure_date >= DATEADD(month, -6, GETDATE())
group by c.company_name, m.pickup_customer_ref
-- highest counts first, so TOP 10 returns the busiest companies
order by pickup_count desc, c.company_name, m.pickup_customer_ref
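
One thing worth checking in the DENSE_RANK version from the question: partitioning by pickup_customer_ref puts each customer in its own partition, so every customer ends up with rank 1. The sketch below shows the difference with Python's sqlite3 module on a made-up manifest table (SQLite 3.25 or later is assumed for window functions; the ranks() helper is hypothetical and the six-month filter is left out).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manifest (pickup_customer_ref TEXT, trip_id INTEGER);
-- Customer A has 3 pickups, B has 2, C has 1 (invented data).
INSERT INTO manifest VALUES
  ('A', 1), ('A', 2), ('A', 3),
  ('B', 4), ('B', 5),
  ('C', 6);
""")

def ranks(window):
    """Rank customers by pickup count using the given window spec."""
    sql = f"""
        SELECT reference, DENSE_RANK() OVER ({window}) AS rnk
        FROM (
            SELECT pickup_customer_ref AS reference,
                   COUNT(DISTINCT trip_id) AS pickups
            FROM manifest
            GROUP BY pickup_customer_ref
        )
        ORDER BY reference
    """
    return conn.execute(sql).fetchall()

partitioned = ranks("PARTITION BY reference ORDER BY pickups DESC")
global_rank = ranks("ORDER BY pickups DESC")

print(partitioned)  # [('A', 1), ('B', 1), ('C', 1)]
print(global_rank)  # [('A', 1), ('B', 2), ('C', 3)]
```

With the PARTITION BY removed, a filter such as rnk < 11 keeps the ten highest ranks. Note that DENSE_RANK gives tied customers the same rank, so more than ten rows can come back.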