I have an OLTP database that contains 400 million rows. I am trying to write a SQL query that produces results similar to this:
Count(*)      DateRange (using DateDiff)
1 million     > 10 yrs
2 million     > 20 yrs
10 million    > 50 yrs
And so on.
I wrote a query something like this:
select count(*) , DateDiff( year , Start_date , End_Date)
group by column
having DateDiff > 10
Union
select count(*) , DateDiff( year , Start_date , End_Date)
group by column
having DateDiff > 20
I believe there is a Cube function that I can use but I cannot seem to get that query right. Any help would be appreciated.
A query with aggregates over a full table takes a while. You are running n such queries, which takes roughly n times as long as a single query would. So the goal is to convert your union-based concatenation of queries into a single query. Luckily, that is achievable (I hope this is legal syntax in SQL Server, which I haven't worked with for a few years, but I'm sure the idea can be used):
select yourcolumn,
sum(
case
when DateDiff( year , Start_date , End_Date) > 10 then 1
else 0
end) as yrs10,
sum(
case
when DateDiff( year , Start_date , End_Date) > 20 then 1
else 0
end) as yrs20,
sum(
case
when DateDiff( year , Start_date , End_Date) > 50 then 1
else 0
end) as yrs50
from yourtable
group by yourcolumn;
So, this will create a single record for each possible value of yourcolumn, and each record will contain the yourcolumn value plus one count field for each of your time-interval-based aggregations.
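As a runnable illustration of the single-pass idea, here is a minimal sketch using Python's sqlite3 module. The table name events, the column category, and the sample rows are made up for the example. SQLite has no DATEDIFF, so the sketch assumes that subtracting the strftime('%Y') year parts is an acceptable stand-in; that matches how SQL Server's DATEDIFF(year, ...) counts year boundaries crossed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for "yourtable".
conn.execute("CREATE TABLE events (category TEXT, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("a", "1990-01-01", "2005-06-01"),  # 15 year boundaries
        ("a", "1980-01-01", "2005-06-01"),  # 25 year boundaries
        ("a", "2000-01-01", "2005-06-01"),  # 5 year boundaries
        ("b", "1950-01-01", "2010-06-01"),  # 60 year boundaries
    ],
)

# One scan of the table; each SUM(CASE ...) is an independent counter.
bucket_counts = conn.execute(
    """
    SELECT category,
           SUM(CASE WHEN yrs > 10 THEN 1 ELSE 0 END) AS yrs10,
           SUM(CASE WHEN yrs > 20 THEN 1 ELSE 0 END) AS yrs20,
           SUM(CASE WHEN yrs > 50 THEN 1 ELSE 0 END) AS yrs50
    FROM (SELECT category,
                 CAST(strftime('%Y', end_date) AS INTEGER)
               - CAST(strftime('%Y', start_date) AS INTEGER) AS yrs
          FROM events)
    GROUP BY category
    ORDER BY category
    """
).fetchall()

for row in bucket_counts:
    print(row)
```

Note that the buckets overlap, as in the original question: a row that is more than 20 years old is counted in both yrs10 and yrs20.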
Related
Using Microsoft SQL Server, I'm trying to get the average days it took someone to complete a transaction in a given month.
Each user has hundreds of transactions, so I'm looking for a way to get the number of transactions for each person and then their average for the month. I also need to make sure that I remove any NULL results and convert any negatives to zero while still counting them.
Example would look like (Max | 300 | 12.5) for (Person | Transactions | Average).
I've been able to get as far as:
SELECT
[Transaction],
[NAME],
DATEDIFF (d, [Startdate], [Closedate]) AS Days
FROM
[Table]
WHERE
YEAR ([Startdate]) = 2021
AND MONTH ([Closedate]) = 11
AND Closedate IS NOT NULL
I've tried to figure out how to incorporate a CASE expression, but it didn't work when I tried to put it before the DATEDIFF.
Looks like you can just do a simple GROUP BY with conditional aggregation.
To avoid repeating the DATEDIFF calculation, you can put it into a CROSS APPLY (VALUES ...) clause.
Always filter with date intervals such as >= AND < rather than applying functions to date columns.
SELECT
t.NAME,
SUM(CASE WHEN v.Days > 0 THEN v.Days ELSE 0 END) AS TotalDays
FROM
[Table] t
CROSS APPLY (VALUES(
DATEDIFF(day, t.Startdate, t.Closedate)
)) v(Days)
WHERE
t.Startdate >= '20211101'
AND t.Startdate < '20211201'
AND t.Closedate IS NOT NULL
GROUP BY
t.NAME;
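The question also asked for a transaction count and an average per person. The sketch below shows one way that could look, using Python's sqlite3 module so it can be run as is. The table tx, its columns, and the sample rows are invented for the example, and the derived table plays the role of CROSS APPLY (which SQLite lacks). Negative durations are clamped to zero but still counted, and rows with a NULL close date are dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for [Table].
conn.execute("CREATE TABLE tx (name TEXT, start_date TEXT, close_date TEXT)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?)",
    [
        ("Max", "2021-11-01", "2021-11-11"),  # 10 days
        ("Max", "2021-11-05", "2021-11-20"),  # 15 days
        ("Max", "2021-11-10", "2021-11-08"),  # -2 days, treated as 0
        ("Max", "2021-11-12", None),          # still open, excluded
        ("Ana", "2021-11-02", "2021-11-06"),  # 4 days
    ],
)

per_person = conn.execute(
    """
    SELECT name,
           COUNT(*) AS tx_count,
           AVG(CASE WHEN days > 0 THEN days ELSE 0 END) AS avg_days
    FROM (SELECT name,
                 -- day difference computed once, then reused above
                 CAST(julianday(close_date) - julianday(start_date) AS INTEGER) AS days
          FROM tx
          WHERE start_date >= '2021-11-01'
            AND start_date <  '2021-12-01'
            AND close_date IS NOT NULL)
    GROUP BY name
    ORDER BY name
    """
).fetchall()

for name, tx_count, avg_days in per_person:
    print(name, tx_count, round(avg_days, 2))
```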
I'm looking for help with a query in SQL Developer that flags any 2 temperature readings within a rolling 12-hour window if they are greater than the acceptable benchmark of 101 deg F.
The given data fields are:
Temp Recorded (DT/TM data type ; down to seconds)
Reading Value (number data type)
Patient ID
There are multiple readings taken throughout a patient's stay, at random times.
Logically, we can check whether two adjacent times total 12 hrs or more and EACH of their temp readings is > 101, but I'm not sure how to put that into a CASE statement (unless there's a better SQL syntax).
I would really appreciate a SQL-only solution.
Many Thanks
Below is my code, including the CASE statement provided by @Gordon Linoff underneath. This sample is part of a bigger query joining multiple tables:
SELECT CE.PatientID, CE.ReadingValue, CE.TempRecorded_DT_TM,
(case when sum(case when readingvalue > 101 then 1 else 0 end) over (
partition by patientid
order by CE.TempRecorded_DT_TM
range between interval '12' hour preceding and current row
) >= 2
then 'Y' else 'N'
end) as temp_flag
FROM
edw.se_clinical_event CE
WHERE
CE.PatientID = '176660214'
AND
CE.TempRecorded_DT_TM >= '01-JAN-20'
ORDER BY
TempRecorded_DT_TM
If you want two readings in 12 hours that are greater than 101, then you can use a rolling sum with a window frame:
select t.*,
(case when sum(case when readingvalue > 101 then 1 else 0 end) over (
partition by patientid
order by dt
range between interval '12' hour preceding and current row
) >= 2
then 'Y' else 'N'
end) as temp_flag
from t;
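To see the rolling frame in action, here is a small sketch using Python's sqlite3 module. The readings table and its rows are invented. SQLite has no INTERVAL literal, so the sketch assumes the timestamp can be ordered by its Unix epoch seconds, with 43200 seconds standing in for 12 hours. RANGE frames with a numeric offset need SQLite 3.28 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; recorded_at is stored as ISO-8601 text.
conn.execute(
    "CREATE TABLE readings (patient_id TEXT, recorded_at TEXT, reading_value REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("p1", "2020-01-01 08:00:00", 101.5),
        ("p1", "2020-01-01 15:00:00", 99.0),
        ("p1", "2020-01-01 19:30:00", 102.1),
        ("p1", "2020-01-02 09:00:00", 101.2),
    ],
)

flagged = conn.execute(
    """
    SELECT recorded_at,
           reading_value,
           CASE WHEN SUM(CASE WHEN reading_value > 101 THEN 1 ELSE 0 END) OVER (
                    PARTITION BY patient_id
                    ORDER BY CAST(strftime('%s', recorded_at) AS INTEGER)
                    RANGE BETWEEN 43200 PRECEDING AND CURRENT ROW  -- 12 hours
                ) >= 2
                THEN 'Y' ELSE 'N'
           END AS temp_flag
    FROM readings
    ORDER BY recorded_at
    """
).fetchall()

for row in flagged:
    print(row)
```

Only the 19:30 reading is flagged here: it is the second reading above 101 within the preceding 12 hours. The flag marks the row that completes the pair, not both rows.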
I have two fields in a table, modified date and created date; each either holds a date or is null. What is the best way to count the number of occurrences and group them into ranges such as 0-7 days, 8-14 days, 15-30 days, etc.?
I was thinking about using
sum(case when modifieddate between getdate()-7 and getdate() then 1 else 0 end)
Is this the best way to do it, or is there a better way to handle each of the date ranges specified above? The same would go for the created date.
Build a table containing the ranges on the fly, then access your table and count:
select
mindays,
maxdays,
(
select count(*)
from mytable t
where datediff(day, coalesce(t.modifieddate, t.createddate), getdate() )
between ranges.mindays and ranges.maxdays
) as cnt
from
(
select 0 as mindays, 7 as maxdays
union all
select 8 as mindays, 14 as maxdays
union all
select 15 as mindays, 30 as maxdays
) ranges;
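The same ranges-table idea can be tried out with Python's sqlite3 module. This is a sketch under a few assumptions: the table and sample rows are made up, a fixed reference date is passed in as :today instead of calling getdate() so the output is repeatable, and the day difference is taken from julianday() because SQLite has no DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (modifieddate TEXT, createddate TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("2024-03-29", "2024-03-01"),  # 2 days old
        ("2024-03-24", "2024-03-01"),  # 7 days old
        (None, "2024-03-20"),          # falls back to createddate: 11 days
        ("2024-03-10", "2024-02-01"),  # 21 days old
        ("2024-01-01", "2023-12-01"),  # 90 days old, outside every range
        (None, None),                  # no date at all, never counted
    ],
)

range_counts = conn.execute(
    """
    WITH ranges(mindays, maxdays) AS (VALUES (0, 7), (8, 14), (15, 30))
    SELECT r.mindays,
           r.maxdays,
           (SELECT COUNT(*)
            FROM mytable t
            WHERE CAST(julianday(:today)
                     - julianday(COALESCE(t.modifieddate, t.createddate)) AS INTEGER)
                  BETWEEN r.mindays AND r.maxdays) AS cnt
    FROM ranges r
    ORDER BY r.mindays
    """,
    {"today": "2024-03-31"},
).fetchall()

for row in range_counts:
    print(row)
```

Adding a bucket only means adding a row to ranges; the counting query itself does not change.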
I am trying to sum in SQL the cases where a date is between a max date minus 7 and a max date, but I am unable to get it right.
Example:
sum(case when date between max(date from field)-7 and max(date from field) then column to sum else 0 end) as '0-7 Days'
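This may work, though I may be getting the DATEDIFFs a little bit confused; it should be something along these lines. SQL Server does not allow an aggregate such as max() nested inside SUM(), so the maximum date is assumed to have been computed beforehand into a variable, here called @maxdate:
SUM(CASE
WHEN DATEDIFF(DAY,datecolumn,DATEADD(DAY,-7,@maxdate))<=0 AND DATEDIFF(DAY,datecolumn,@maxdate)>=0
THEN columntosum
ELSE 0
END) AS '0-7 Days'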
In most databases, you can do something like this:
select sum(case when datecol >= maxdatecol - 7 and datecol <= maxdatecol then columntosum else 0
end) as days_0_7
from (select t.*,
max(datecol) over () as maxdatecol
from table t
) t;
Note that date arithmetic varies between databases, so this exact syntax may not work (small modifications should fix the problems for most databases).
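As one concrete instance of those small modifications, here is a sketch for SQLite, run through Python's sqlite3 module. The sales table and its rows are invented, and date(maxdatecol, '-7 days') is SQLite's way of writing "max date minus 7". The window function requires SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; datecol is stored as ISO-8601 text so it sorts correctly.
conn.execute("CREATE TABLE sales (datecol TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2024-05-20", 10),   # the max date itself
        ("2024-05-15", 5),    # 5 days before the max
        ("2024-05-13", 7),    # exactly 7 days before, still included
        ("2024-05-12", 100),  # 8 days before, excluded
        ("2024-04-01", 50),   # far outside the window
    ],
)

(days_0_7,) = conn.execute(
    """
    SELECT SUM(CASE WHEN datecol >= date(maxdatecol, '-7 days')
                     AND datecol <= maxdatecol
                    THEN amount ELSE 0 END) AS days_0_7
    FROM (SELECT s.*,
                 MAX(datecol) OVER () AS maxdatecol  -- same value on every row
          FROM sales s)
    """
).fetchone()

print(days_0_7)
```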
I want to query statistics using SQL from 3 different days (in a row). The display would be something like:
15 users created today, 10 yesterday, 12 two days ago
The SQL would be something like (for today):
SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11'
And then I would do 2 more queries for yesterday and the day before.
So in total I'm doing 3 queries against the entire database. The format for created_date is 2012-05-11 05:24:11 (date & time).
Is there a more efficient SQL way to do this, say in one query?
For specifics, I'm using PHP and SQLite (so the PDO extension).
The result should be 3 different numbers (one for each day).
Any chance someone could show some performance numbers in comparison?
You can use GROUP BY:
SELECT Count(*), created_date FROM Users GROUP BY created_date
That will give you a list of dates with the number of records found on that date. You can add criteria for created_date using a normal WHERE clause.
Edit: based on your edit:
SELECT Count(*), date(created_date) FROM Users WHERE created_date>='2012-05-09' GROUP BY date(created_date)
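Since the question mentions SQLite, the grouping by calendar day can be checked with a short script. This sketch uses Python's sqlite3 module rather than PHP's PDO, purely for illustration, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (created_date TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?)",
    [
        ("2012-05-11 05:24:11",),
        ("2012-05-11 17:00:00",),
        ("2012-05-10 09:00:00",),
        ("2012-05-09 00:00:00",),
        ("2012-05-09 12:00:00",),
        ("2012-05-09 23:59:59",),
        ("2012-05-01 10:00:00",),  # older than the cutoff, filtered out
    ],
)

# date() strips the time part, so all rows from one day share a group.
daily_counts = conn.execute(
    """
    SELECT date(created_date) AS day, COUNT(*) AS n
    FROM Users
    WHERE created_date >= '2012-05-09'
    GROUP BY date(created_date)
    ORDER BY day
    """
).fetchall()

for day, n in daily_counts:
    print(day, n)
```

A day with no new users produces no row at all, so the caller has to treat a missing date as zero.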
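Another option is to group by the calendar date. Note that DAY(created_date) would group by the day of the month only, and that the INTERVAL syntax below is MySQL's rather than SQLite's:
SELECT DATE(created_date), count(*)
FROM users
WHERE created_date > CURRENT_DATE - INTERVAL 3 DAY
GROUP BY DATE(created_date)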
This would work I believe though I have no way to test it:
SELECT
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11') as today,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-10' AND created_date < '2012-05-11') as yesterday,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-09' AND created_date < '2012-05-10') as day_before
;
Use GROUP BY like jeroen suggested, but if you're planning for other periods you can also set ranges like this:
SELECT SUM(IF(created_date BETWEEN '2012-05-01' AND NOW(), 1, 0)) AS `this_month`,
SUM(IF(DATE(created_date) = '2012-05-09', 1, 0)) AS `2_days_ago`
FROM ...
As noted below, SQLite doesn't have an IF function, but it has CASE instead, and datetime('now') takes the place of NOW(). So this should work:
SELECT SUM(CASE WHEN created_date BETWEEN '2012-05-01' AND datetime('now') THEN 1 ELSE 0 END) AS `this_month`,
SUM(CASE date(created_date) WHEN '2012-05-09' THEN 1 ELSE 0 END) AS `2_days_ago`
FROM ...
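Applied to the original three-day question, the CASE approach returns all three counts as a single row. The following is a sketch with Python's sqlite3 module and invented sample rows; the three dates are hard-coded to match the question rather than derived from the current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (created_date TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?)",
    [
        ("2012-05-11 05:24:11",),
        ("2012-05-11 17:00:00",),
        ("2012-05-10 09:00:00",),
        ("2012-05-09 00:00:00",),
        ("2012-05-09 12:00:00",),
        ("2012-05-09 23:59:59",),
        ("2012-05-01 10:00:00",),
    ],
)

# One pass over the table; the WHERE clause limits the scan to the
# three days of interest, and each SUM counts one of them.
today, yesterday, day_before = conn.execute(
    """
    SELECT SUM(CASE date(created_date) WHEN '2012-05-11' THEN 1 ELSE 0 END),
           SUM(CASE date(created_date) WHEN '2012-05-10' THEN 1 ELSE 0 END),
           SUM(CASE date(created_date) WHEN '2012-05-09' THEN 1 ELSE 0 END)
    FROM Users
    WHERE created_date >= '2012-05-09' AND created_date < '2012-05-12'
    """
).fetchone()

print(f"{today} users created today, {yesterday} yesterday, {day_before} two days ago")
```

Unlike the GROUP BY version, this always yields exactly one row with three columns, although a SUM comes back as NULL rather than 0 if the WHERE clause matches no rows at all.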