Trying to create a SQL cube query, not having any luck

I have an OLTP database that contains 400 million rows. I am trying to create a SQL query that produces results something similar to this:
Count(*) DateRange Using DateDiff
1 million > 10 yrs
2 million > 20 yrs
10 Million > 50 yrs
And so on.
I create a query something like this:
select count(*) , DateDiff( year , Start_date , End_Date)
group by column
having DateDiff > 10
Union
select count(*) , DateDiff( year , Start_date , End_Date)
group by column
having DateDiff > 20
I believe there is a Cube function that I can use but I cannot seem to get that query right. Any help would be appreciated.

Running an aggregate query over a full table takes a while. You are running n such queries, which takes roughly n times as long as a single query. So the goal is to convert your UNION-based concatenation into a single query. Luckily, that is achievable (I hope this is legal syntax in SQL Server, which I haven't worked with for a few years, but I'm sure the idea can be used):
select yourcolumn,
sum(
case
when DateDiff( year , Start_date , End_Date) > 10 then 1
else 0
end) as yrs10,
sum(
case
when DateDiff( year , Start_date , End_Date) > 20 then 1
else 0
end) as yrs20,
sum(
case
when DateDiff( year , Start_date , End_Date) > 50 then 1
else 0
end) as yrs50
from yourtable
group by yourcolumn;
This produces a single record for each distinct value of yourcolumn. Each record contains the yourcolumn value and one field for each of your time-interval aggregates.
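To try the single-pass idea without a SQL Server instance, here is a minimal sketch that runs the same kind of conditional aggregation against an in-memory SQLite database from Python. The table `spans`, its columns, and the helper `count_by_threshold` are hypothetical names made up for illustration. SQLite has no DATEDIFF, so the sketch subtracts the year parts, which is how SQL Server's DATEDIFF(year, ...) counts year boundaries.

```python
import sqlite3

def count_by_threshold(rows):
    """Return (over_10, over_20, over_50) counts computed in one table scan.

    rows: iterable of (start_date, end_date) ISO strings (hypothetical layout).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE spans (start_date TEXT, end_date TEXT)")
    conn.executemany("INSERT INTO spans VALUES (?, ?)", rows)
    query = """
        SELECT
            SUM(CASE WHEN yrs > 10 THEN 1 ELSE 0 END),
            SUM(CASE WHEN yrs > 20 THEN 1 ELSE 0 END),
            SUM(CASE WHEN yrs > 50 THEN 1 ELSE 0 END)
        FROM (
            -- Assumption: year difference = difference of the year parts,
            -- standing in for DATEDIFF(year, start_date, end_date).
            SELECT CAST(strftime('%Y', end_date) AS INTEGER)
                 - CAST(strftime('%Y', start_date) AS INTEGER) AS yrs
            FROM spans
        )
    """
    result = conn.execute(query).fetchone()
    conn.close()
    return result

sample = [
    ("2000-01-01", "2005-01-01"),  # 5 years
    ("1990-01-01", "2005-01-01"),  # 15 years
    ("1970-01-01", "2005-01-01"),  # 35 years
    ("1940-01-01", "2005-01-01"),  # 65 years
]
print(count_by_threshold(sample))
```

Each threshold gets its own SUM(CASE ...) column, so adding another bucket means adding one expression rather than another full scan.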

Related

Get a count and average from a couple rows in the same table?

Using Microsoft SQL Server, I'm trying to get the average days it took someone to complete a transaction in a given month.
Each user has hundreds of transactions so I'm looking for a way to get the count on how many transactions for each person and then their average for the month. I also need to make sure that I remove any NULL returns and convert any negatives to a Zero but keep it accounted for.
Example would look like (Max | 300 | 12.5) for (Person | Transactions | Average).
I've been able to get as far as:
SELECT
[Transaction],
[NAME],
DATEDIFF (d, [Startdate], [Closedate]) AS Days
FROM
[Table]
WHERE
YEAR ([Startdate]) = 2021
AND MONTH ([Closedate]) = 11
AND Closedate IS NOT NULL
I've tried to figure out how to incorporate a CASE statement but it's not working when I tried to do it before the DATEDIFF.
Looks like you can just do a simple GROUP BY with conditional aggregation.
To avoid repeating the DATEDIFF calculation, you can put it into a CROSS APPLY (VALUES ...) clause.
Always filter date columns with half-open intervals (>= AND <) rather than applying functions to them.
SELECT
t.NAME,
SUM(CASE WHEN v.Days > 0 THEN v.Days ELSE 0 END) AS TotalDays
FROM
[Table] t
CROSS APPLY (VALUES(
DATEDIFF(day, t.Startdate, t.Closedate)
)) v(Days)
WHERE
t.Startdate >= '20211101'
AND t.Startdate < '20211201'
AND t.Closedate IS NOT NULL
GROUP BY
t.NAME;
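The question also asks for a transaction count and an average per person. As a rough sketch of how the same conditional aggregation extends to COUNT and AVG, here is a version run against in-memory SQLite from Python. The table `tx`, the helper `monthly_stats`, and the sample rows are hypothetical, and the day difference uses SQLite's julianday() in place of DATEDIFF, assuming date-only values.

```python
import sqlite3

def monthly_stats(rows):
    """Return [(name, transactions, avg_days)] for November 2021.

    Negative durations are counted as 0 but still included in the average;
    rows without a close date are excluded.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tx (name TEXT, startdate TEXT, closedate TEXT)")
    conn.executemany("INSERT INTO tx VALUES (?, ?, ?)", rows)
    query = """
        SELECT name,
               COUNT(*) AS transactions,
               AVG(CASE WHEN days > 0 THEN days ELSE 0 END) AS avg_days
        FROM (
            -- Assumption: date-only ISO strings, so the julianday
            -- difference is a whole number of days.
            SELECT name,
                   julianday(closedate) - julianday(startdate) AS days
            FROM tx
            WHERE startdate >= '2021-11-01'
              AND startdate <  '2021-12-01'
              AND closedate IS NOT NULL
        )
        GROUP BY name
        ORDER BY name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

sample = [
    ("Max", "2021-11-01", "2021-11-11"),  # 10 days
    ("Max", "2021-11-05", "2021-11-03"),  # negative, counted as 0
    ("Max", "2021-11-10", None),          # still open, excluded
    ("Ann", "2021-11-02", "2021-11-06"),  # 4 days
    ("Ann", "2021-10-30", "2021-11-02"),  # started in October, excluded
]
print(monthly_stats(sample))
```

In SQL Server itself, AVG over an integer expression returns an integer, so the value would need a cast to a decimal type to keep the fractional part.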

Flag 2 actual vs benchmark readings every rolling 12 hours in SQL Developer Query

Looking for some help with code in SQL Developer query to flag any 2 temperature readings - every rolling 12 hours - if they are greater than the acceptable benchmark of 101 deg F.
The given data fields are:
Temp Recorded (DT/TM data type ; down to seconds)
Reading Value (number data type)
Patient ID
There are multiple readings taken throughout a patient's stay, at random times.
Logically, we can check if two adjacent times total 12 hrs or more & EACH of their temp readings are >101 but not sure how to put it into a CASE statement (unless there's a better SQL syntax).
Will really appreciate if a SQL only solution can be recommended.
Many Thanks
Here is the code, including the CASE statement provided by @Gordon Linoff below. This sample is part of a bigger query joining multiple tables:
SELECT CE.PatientID, CE.ReadingValue, CE.TempRecorded_DT_TM,
(case when sum(case when readingvalue > 101 then 1 else 0 end) over (
partition by patientid
order by dt
range between '12' hour preceding and current row
) >= 2
then 'Y' else 'N'
end) as temp_flag
FROM
edw.se_clinical_event CE
WHERE
CE.PatientID = '176660214'
AND
CE.TempRecorded_DT_TM >= '01-JAN-20'
ORDER BY
TempRecorded_DT_TM
If you want two readings in 12 hours that are greater than 101, then you can use a rolling sum with a window frame:
select t.*,
(case when sum(case when readingvalue > 101 then 1 else 0 end) over (
partition by patientid
order by dt
range between interval '12' hour preceding and current row
) >= 2
then 'Y' else 'N'
end) as temp_flag
from t;
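To check the window logic outside the database, here is a minimal Python sketch of the same rolling count. It is not Oracle code: the function `flag_readings` and the tuple layout are hypothetical, and it assumes no two readings for one patient share a timestamp (a RANGE frame would also include such peers).

```python
from collections import deque
from datetime import datetime, timedelta

def flag_readings(readings, threshold=101.0,
                  window=timedelta(hours=12), needed=2):
    """Flag each reading 'Y' when at least `needed` readings above
    `threshold` fall within the preceding `window`, current row included.

    readings: iterable of (patient_id, taken_at, value) tuples.
    Returns (patient_id, taken_at, value, flag) sorted by patient and time.
    """
    flagged = []
    recent_high = {}  # patient_id -> timestamps of high readings in window
    for patient_id, taken_at, value in sorted(readings,
                                              key=lambda r: (r[0], r[1])):
        highs = recent_high.setdefault(patient_id, deque())
        if value > threshold:
            highs.append(taken_at)
        # Drop high readings older than the window; exactly 12 hours
        # old still counts, matching an inclusive PRECEDING bound.
        while highs and taken_at - highs[0] > window:
            highs.popleft()
        flag = "Y" if len(highs) >= needed else "N"
        flagged.append((patient_id, taken_at, value, flag))
    return flagged

start = datetime(2020, 1, 1, 8, 0, 0)
sample = [
    ("p1", start, 101.5),
    ("p1", start + timedelta(hours=6), 99.0),
    ("p1", start + timedelta(hours=11), 102.0),
    ("p1", start + timedelta(hours=25), 103.0),
]
for row in flag_readings(sample):
    print(row)
```

Only the third reading is flagged here: it is the second reading above 101 within 12 hours, while the fourth has no other high reading in its window.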

best way to group date in a range

I have two fields in a table, modified date and created date; each either has a date populated or is null. What I would like to know is the best way to count the number of occurrences and group them into ranges such as 0-7 days, 8-14 days, 15-30 days, etc.
I was thinking about using
sum(case when modifieddate between getdate()-7 and getdate() then 1 else 0 end)
Is this the best way to do it or is there a better way for each date range specified above. Same would go for the created date
Build a table containing the ranges on the fly, then access your table and count:
select
mindays,
maxdays,
(
select count(*)
from mytable t
where datediff(day, coalesce(t.modifieddate, t.createddate), getdate() )
between ranges.mindays and ranges.maxdays
) as cnt
from
(
select 0 as mindays, 7 as maxdays
union all
select 8 as mindays, 14 as maxdays
union all
select 15 as mindays, 30 as maxdays
) ranges;
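The ranges-table approach can be tried out with an in-memory SQLite database from Python, as in the sketch below. The helper `count_by_age_bucket` is a hypothetical name, the ranges come from a VALUES list instead of UNION ALL, and a fixed `today` parameter stands in for getdate() so the result is reproducible. It assumes date-only ISO strings.

```python
import sqlite3

def count_by_age_bucket(rows, today):
    """Return [(mindays, maxdays, cnt)] for the 0-7, 8-14, 15-30 day ranges.

    rows: iterable of (createddate, modifieddate); either may be None.
    today: ISO date string used in place of getdate() (an assumption).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (createddate TEXT, modifieddate TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    query = """
        WITH ranges(mindays, maxdays) AS (
            VALUES (0, 7), (8, 14), (15, 30)
        )
        SELECT mindays, maxdays,
               (SELECT COUNT(*)
                FROM mytable t
                -- julianday difference stands in for datediff(day, ...)
                WHERE CAST(julianday(:today)
                           - julianday(COALESCE(t.modifieddate, t.createddate))
                           AS INTEGER)
                      BETWEEN ranges.mindays AND ranges.maxdays) AS cnt
        FROM ranges
        ORDER BY mindays
    """
    result = conn.execute(query, {"today": today}).fetchall()
    conn.close()
    return result

sample = [
    ("2024-03-01", "2024-03-29"),  # modified 2 days ago
    ("2024-03-20", None),          # created 11 days ago
    ("2024-03-05", None),          # created 26 days ago
    ("2024-03-10", "2024-03-30"),  # modified 1 day ago
    (None, None),                  # no dates, never counted
    ("2024-01-01", None),          # 90 days ago, outside every range
]
print(count_by_age_bucket(sample, "2024-03-31"))
```

Keeping the ranges as data means a new bucket is one more row rather than one more SUM(CASE ...) expression.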

SQL sum case when between max date from table column

I am trying to sum in SQL cases when a date is between a max date minus 7 and a max date, but unable to get it right.
Example:
sum(case when date between max(date from field)-7 and max(date from field) then column to sum else 0 end) as '0-7 Days'
This may work. I may be getting the DATEDIFFs a little bit confused, but it should be something along these lines:
SUM(CASE
WHEN DATEDIFF(DAY,datecolumn,DATEADD(DAY,-7,max(datecolumn)))<0 AND DATEDIFF(DAY,datecolumn,max(datecolumn))>0
THEN columntosum
ELSE 0
END) AS '0-7 Days'
In most databases, you can do something like this:
select sum(case when datecol >= maxdatecol - 7 and datecol <= maxdatecol then columntosum else 0
end) as days_0_7
from (select t.*,
max(datecol) over () as maxdatecol
from table t
) t;
Note that date arithmetic varies between databases, so this exact syntax may not work (small modifications should fix the problems for most databases).
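As one concrete variant, here is a minimal sketch in SQLite syntax, run from Python. It swaps the window function for a one-row subquery that computes the maximum date first, which avoids depending on window-function support. The table `t`, its columns, and the helper `sum_last_7_days` are hypothetical, and dates are assumed to be date-only ISO strings so that string comparison matches date order.

```python
import sqlite3

def sum_last_7_days(rows):
    """Sum `amount` for rows dated within 7 days of the latest date.

    rows: iterable of (datecol, amount); both bounds are inclusive.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (datecol TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    query = """
        SELECT SUM(CASE WHEN datecol >= date(m.maxdate, '-7 days')
                         AND datecol <= m.maxdate
                        THEN amount ELSE 0 END) AS days_0_7
        FROM t
        -- One-row subquery holding the latest date, computed once.
        CROSS JOIN (SELECT MAX(datecol) AS maxdate FROM t) m
    """
    (total,) = conn.execute(query).fetchone()
    conn.close()
    return total

sample = [
    ("2024-05-20", 10),   # the max date itself
    ("2024-05-15", 5),
    ("2024-05-13", 2),    # exactly max - 7 days, included
    ("2024-05-12", 100),  # one day too old
    ("2024-04-01", 50),
]
print(sum_last_7_days(sample))
```

Computing the maximum in its own subquery also sidesteps the restriction that an aggregate such as MAX cannot be nested directly inside SUM.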

Efficient way to query separate days of data?

I want to query statistics using SQL from 3 different days (in a row). The display would be something like:
15 users created today, 10 yesterday, 12 two days ago
The SQL would be something like (for today):
SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11'
And then I would do 2 more queries for yesterday and the day before.
So in total I'm doing 3 queries against the entire database. The format for created_date is 2012-05-11 05:24:11 (date & time).
Is there a more efficient SQL way to do this, say in one query?
For specifics, I'm using PHP and SQLite (so the PDO extension).
The result should be 3 different numbers (one for each day).
Any chance someone could show some performance numbers in comparison?
You can use GROUP BY:
SELECT Count(*), created_date FROM Users GROUP BY created_date
That will give you a list of dates with the number of records found on that date. You can add criteria for created_date using a normal WHERE clause.
Edit: based on your edit:
SELECT Count(*), date(created_date) FROM Users WHERE created_date>='2012-05-09' GROUP BY date(created_date)
The best solution is to use GROUP BY DATE(created_date), so that the grouping key matches the selected expression. Here is your query:
SELECT DATE(created_date), count(*)
FROM users
WHERE created_date > CURRENT_DATE - INTERVAL 3 DAY
GROUP BY DATE(created_date)
This would work I believe though I have no way to test it:
SELECT
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11') as today,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-10' AND created_date < '2012-05-11') as yesterday,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-09' AND created_date < '2012-05-10') as day_before
;
Use GROUP BY like jeroen suggested, but if you're planning for other periods you can also set ranges like this:
SELECT SUM(IF(created_date BETWEEN '2012-05-01' AND NOW(), 1, 0)) AS `this_month`,
SUM(IF(DATE(created_date) = '2012-05-09', 1, 0)) AS `2_days_ago`
FROM ...
As noted below, SQLite doesn't have an IF function, but it does have CASE, so this should work:
SELECT SUM(CASE WHEN created_date BETWEEN '2012-05-01' AND datetime('now') THEN 1 ELSE 0 END) AS `this_month`,
SUM(CASE date(created_date) WHEN '2012-05-09' THEN 1 ELSE 0 END) AS `2_days_ago`
FROM ...
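Since the question targets SQLite, here is a minimal sketch of the one-query GROUP BY approach against an in-memory SQLite database. It uses Python's sqlite3 module rather than PHP/PDO purely for illustration; the helper `signups_per_day` and the sample rows are hypothetical. It assumes created_date is stored as 'YYYY-MM-DD HH:MM:SS' text, so plain string comparison against a date works.

```python
import sqlite3

def signups_per_day(timestamps, since):
    """Return [(day, count)] for rows created on or after `since`,
    newest day first, using a single query.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (created_date TEXT)")
    conn.executemany("INSERT INTO Users VALUES (?)",
                     [(ts,) for ts in timestamps])
    query = """
        SELECT date(created_date) AS day, COUNT(*) AS n
        FROM Users
        WHERE created_date >= ?          -- keep the column bare in the filter
        GROUP BY date(created_date)      -- one group per calendar day
        ORDER BY day DESC
    """
    result = conn.execute(query, (since,)).fetchall()
    conn.close()
    return result

sample = [
    "2012-05-11 05:24:11",
    "2012-05-11 09:00:00",
    "2012-05-10 23:59:59",
    "2012-05-09 00:00:00",
    "2012-05-08 12:00:00",  # before the cutoff, excluded
]
print(signups_per_day(sample, "2012-05-09"))
```

A day with no rows produces no group, so the calling code should treat a missing date as a count of zero.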