Query for Age calculation and Date comparison inconsistent - sql

I'm doing some counts to validate a table XXX, and I wrote two queries to count people younger than 18 years.
The first query I'm using is the following:
select count(distinct user.id) from user
left join sometable on sometable.id = user.someTableId
left join anotherTable on sometable.anotherTableId = anotherTable.id
where (sometable.id = 'x' or user.public = true)
AND (DATE_PART('year', age(current_date, user.birthdate)) >= 0 and DATE_PART('year', age(current_date, user.birthdate)) <= 18);
This query returns a count of 5000 (a made-up figure),
but this query, which is supposed to do the same:
select count(distinct user.id) from user
left join sometable on sometable.id = user.someTableId
left join anotherTable on sometable.anotherTableId = anotherTable.id
where (sometable.id = 'x' or user.public = true)
and (user.birthdate between '2002-08-26' and current_date)
gives me a different count from the first one. (This last one gives the correct count, since it matches what I have in another NoSQL database.)
SIDE NOTE: the date '2002-08-26' is there because today is 2020-08-26, so I subtracted 18 years from today's date.
I would like to know what the difference between the queries is, or why the counts differ.
Thanks in advance.

In your first query, you are including everyone who has not yet turned 19.
In your second query, you are excluding a bunch of 18-year-olds who were born before 2002-08-26. For example, someone born on 2002-04-12 is still 18 years old. She won't turn 19 until 2021-04-12.
The easiest way to write this in Postgres is the following, which gives the same results as your first query:
where extract(year from age(now(), birthdate)) <= 18
If you really want to use the format of your 2nd query, then change your line to:
where (birthdate between '2001-08-27' and current_date)
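The off-by-one-year difference between the two filters can be checked outside the database. The following is a minimal sketch in Python, not Postgres; `age_in_years` is a hypothetical helper that mimics the year part of an age for ordinary dates, and the dates are the ones used in the question and answer.

```python
from datetime import date

def age_in_years(birthdate: date, today: date) -> int:
    """Completed years between birthdate and today (hypothetical helper)."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)

today = date(2020, 8, 26)          # "today" from the question
naive_cutoff = date(2002, 8, 26)   # today minus 18 years (second query)
fixed_cutoff = date(2001, 8, 27)   # one day after today minus 19 years

for born in (date(2002, 4, 12), date(2001, 8, 27), date(2001, 8, 26)):
    print(
        born,
        age_in_years(born, today),          # age in completed years
        naive_cutoff <= born <= today,      # kept by the original BETWEEN?
        fixed_cutoff <= born <= today,      # kept by the corrected BETWEEN?
    )
```

Someone born on 2002-04-12 is 18 but falls outside the original BETWEEN range, while the corrected range keeps every birthdate whose age is still 18 and drops 2001-08-26, who turns 19 today.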

Related

How to calculate working hours over a 17-week period

I am trying to work out how many field engineers work more than 48 hours a week, averaged over a 17-week period. (By law you cannot average more than 48 hours a week over a 17-week period.)
I managed to run the query for one engineer, but when I run it without an engineer filter the query is very slow.
I need the count of engineers working over 48 hours, the count under 48 hours, and the average time worked per week.
Note: I am doing a Union on SPICEMEISTER & SMARTMEISTER because they are our old and new databases.
• How many field engineers go over the 48 hours
• How many field engineers are under the 48 hours
• What is the average time worked per week for engineers
SELECT DS_Date
,TechPersNo
FROM
(
SELECT DISTINCT
SMDS.EPL_DAT as DS_Date
,EN.pers_no as TechPersNo
FROM
[SpiceMeister].[FS_OTBE].[EngPayrollNumbers] EN
INNER JOIN
[SmartMeister].[Main].[PlusDailyKopf] SMDS
ON RIGHT(CAST(EN.[TechnicianID] AS CHAR(10)),5) = SMDS.PRPO_TECHNIKERNR
WHERE
SMDS.EPL_DAT >= '2017-01-01'
and
SMDS.EPL_DAT < '2018-03-01'
UNION ALL
SELECT DISTINCT
SPDS.DailySummaryDate as DS_Date
,EN.pers_no as TechPersNo
FROM
[SpiceMeister].[FS_OTBE].[EngPayrollNumbers] EN
INNER JOIN
[SpiceMeister].[FS_DS_BO].[DailySummaryHeader] SPDS
ON EN.TechnicianID = SPDS.TechnicianID
WHERE
SPDS.DailySummaryDate >= '2018-03-01'
) as Techa
where TechPersNo = 850009
) Tech
cross APPLY
Fast results
The slowness is almost certainly due to the cross apply with a correlated subquery. This forces the computation to run once per row and leaves SQL Server very little room to optimize.
This looks more like a 'group by' query, but I can see why it was hard to write: the cumulative calculation needs output per person and per date, while the average covers not just that date but a date range ending on it.
What I would do first is build one common query that captures the base data from both datasets. That is what the 'dailySummaries' common table expression below does. Then I would join dailySummaries to itself, matching on the employee and selecting the required date range. From that, I would group by employee and date, aggregating over the date range.
with
dailySummaries as (
    -- SmartMeister rows (before 2018-03-01): minutes between departure and return
    select techPersNo = en.pers_no,
           ds_date = smds.epl_dat,
           dtDif = datediff(minute, smds.abfahrt_zeit, smds.rueck_zeit)
    from spiceMeister.fs_otbe.engPayrollNumbers en
    join smartMeister.main.plusDailyKopf smds
      on right(cast(en.technicianid as char(10)),5) = smds.prpo_technikernr
    where smds.epl_dat < '2018-03-01'
    union all
    -- SpiceMeister rows (from 2018-03-01): earliest start to latest end of the day
    select techPersNo = en.pers_no,
           dailySummaryDate,
           datediff(minute,
               iif(spds.leaveHome < spds.workStart, spds.leaveHome, spds.workStart),
               iif(spds.arrivehome > spds.workEnd, spds.arrivehome, spds.workEnd)
           )
    from spiceMeister.fs_otbe.engPayrollNumbers en
    join spiceMeister.fs_ds_bo.dailySummaryHeader spds
      on en.TechnicianID = spds.TechnicianID
    where spds.DailySummaryDate >= '2018-03-01'
)
-- for each engineer and date, sum the 119 days (17 weeks) ending on that date
select ds.ds_date,
       ds.techPersNo,
       AvgHr = convert(real, sum(dsPrev.dtDif)) / (60*17)
from dailySummaries ds
left join dailySummaries dsPrev
  on ds.techPersNo = dsPrev.techPersNo
 and dsPrev.ds_date between dateadd(day,-118, ds.ds_date) and ds.ds_date
where ds.ds_date >= '2017-01-01'
group by ds.ds_date,
         ds.techPersNo
order by ds.ds_date
I may have gotten a thing or two wrong in translation but you get the idea.
In the future, post a more minimal example. The union of the two datasets from the separate databases is not central to the problem you were asking about. Much of the date filtering isn't core to the question. The special casting in your join match logic is not important. These things cloud the real issue.
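The self-join over a trailing 17-week window can be tried on a toy dataset. This is only a sketch: it uses Python's sqlite3 rather than SQL Server, the table `daily` and its three rows are invented for illustration, and it assumes one row per engineer per day (duplicate days would be double counted by the join).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the dailySummaries CTE: one row per engineer per day.
conn.execute("CREATE TABLE daily (tech INTEGER, ds_date TEXT, minutes INTEGER)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?, ?)",
    [
        (1, "2018-01-01", 600),
        (1, "2018-03-01", 480),
        (1, "2018-05-01", 540),
    ],
)

rows = conn.execute(
    """
    SELECT ds.tech,
           ds.ds_date,
           -- total minutes in the 119 days ending on ds_date, as hours per week
           SUM(prev.minutes) / (60.0 * 17) AS avg_hours_per_week
    FROM daily ds
    JOIN daily prev
      ON prev.tech = ds.tech
     AND prev.ds_date BETWEEN date(ds.ds_date, '-118 days') AND ds.ds_date
    GROUP BY ds.tech, ds.ds_date
    ORDER BY ds.ds_date
    """
).fetchall()

for row in rows:
    print(row)
```

The window ending on 2018-05-01 starts on 2018-01-03, so it picks up the March and May rows (1020 minutes, or 1.0 hour per week) and leaves out the January row. Counting engineers over or under 48 hours would then be a filter on `avg_hours_per_week`.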

Progress date comparison

I am trying to write a query in Progress. I need to select all records older than exactly one year, that is, older than the current date minus 1 year. I have tried several possibilities but got an error every time. The query is part of a join and should take every record of the previous year up to the current date minus one year:
left outer join data.pub."vc-669" as det2
on deb.cddeb = det2.cddeb
and det2.jaar = year(curdate()) - 1
and det2."sys-date" < date(month(curdate()), day(curdate()), year(curdate()) - 1)
That should simply be:
and det2."sys-date" < add-interval( curdate(), - 1, 'year' )
(As this already deals with the year, there is no need to look at det2.jaar, too.)
https://documentation.progress.com/output/ua/OpenEdge_latest/index.html#page/dvref/add-interval-function.html

SQL query getting multiple where-claused aliases

Hoping you can help with this issue.
I have energy management software running on a system. It logs the running total in the column Value once every hour, along with some other data, including a boolean called Active and an integer called Day.
What I'm after is one query that returns a list of days in order, the total power usage for each day, and the peak power usage for each day.
The peak power usage is calculated as Max minus Min of Value over the rows where Active is set. On some days, however, the Active bit is never set, and that query on its own would yield NULL.
This is my query:
SELECT
A.Day, A.Forbrug, B.Peak
FROM
(SELECT
Day, Max(Value) - Min(Value) AS Forbrug
FROM
EL_HT1_K
WHERE
MONTH = 8 AND YEAR = 2016
GROUP By Day) A,
(SELECT
Day, Max(Value) - Min(Value) AS Peak
FROM
EL_HT1_K
WHERE
Month = 8 AND Year = 2016 AND Active = 1
GROUP BY Day) B
WHERE
A.Day = B.Day
This only returns rows for days where query B (peak usage) yields a result.
What I want is for the remaining results from inner query A to still be shown, even when query B yields 0/NULL for that day.
Is this possible, and how?
FYI: the reason I need this in one query is that the SCADA system has difficulty handling multiple queries.
I think you just want conditional aggregation. Based on your description, this seems to be the query you want:
SELECT Day, SUM(Value) as total,
MAX(CASE WHEN Active = 1 THEN Value END) as Peak
FROM EL_HT1_K
WHERE Month = 8 AND Year = 2016
GROUP BY Day;
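As a hedged variation on the same conditional-aggregation idea, the sketch below keeps the question's own Max(Value) - Min(Value) definitions for both columns, since Value is described as a running total. It runs in Python's sqlite3 with invented sample rows, and the Month/Year filter is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EL_HT1_K (Day INTEGER, Value REAL, Active INTEGER)")
conn.executemany(
    "INSERT INTO EL_HT1_K VALUES (?, ?, ?)",
    [
        # Day 1 has active hours, day 2 has none (sample data, not from the question).
        (1, 100.0, 0), (1, 110.0, 1), (1, 125.0, 1), (1, 130.0, 0),
        (2, 130.0, 0), (2, 150.0, 0),
    ],
)

rows = conn.execute(
    """
    SELECT Day,
           MAX(Value) - MIN(Value) AS Forbrug,
           -- CASE yields NULL for inactive rows, which MAX/MIN ignore
           MAX(CASE WHEN Active = 1 THEN Value END)
             - MIN(CASE WHEN Active = 1 THEN Value END) AS Peak
    FROM EL_HT1_K
    GROUP BY Day
    ORDER BY Day
    """
).fetchall()

print(rows)  # [(1, 30.0, 15.0), (2, 20.0, None)]
```

Every day appears once, and a day without active rows gets NULL (None) for Peak; wrapping the Peak expression in COALESCE(..., 0) would turn that into 0 if the SCADA system prefers a number.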

How to have GROUP BY and COUNT include zero sums?

I have SQL like this (where $ytoday is 5 days ago):
$sql = 'SELECT Count(*), created_at FROM People WHERE created_at >= "'. $ytoday .'" GROUP BY DATE(created_at)';
I want this to return a value for every day, so it would return 5 results in this case (5 days ago until today).
But say Count(*) is 0 for yesterday, instead of returning a zero it doesn't return any data at all for that date.
How can I change that SQLite query so it also returns data that has a count of 0?
Without convoluted (in my opinion) queries, your output data-set won't include dates that don't exist in your input data-set. This means that you need a data-set with the 5 days to join on to.
The simple version would be to create a table with the 5 dates, and join on that. I typically create and keep (effectively caching) a calendar table with every date I could ever need. (Such as from 1900-01-01 to 2099-12-31.)
SELECT
calendar.calendar_date,
Count(People.created_at)
FROM
Calendar
LEFT JOIN
People
ON Calendar.calendar_date = DATE(People.created_at)
WHERE
Calendar.calendar_date >= '2012-05-01'
GROUP BY
Calendar.calendar_date
You'll need to left join against a list of dates. You can either create a table with the dates you need in it, or you can take the dynamic approach I outlined here:
generate days from date range
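Since the question is about SQLite, one way to sketch the "list of dates" approach without a permanent calendar table is a recursive common table expression. The example below runs through Python's sqlite3; the People rows and the five-day range are invented for illustration, and it assumes created_at is stored as ISO-8601 text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO People (created_at) VALUES (?)",
    [("2012-05-01 09:15:00",), ("2012-05-01 17:40:00",), ("2012-05-03 08:00:00",)],
)

rows = conn.execute(
    """
    WITH RECURSIVE calendar(calendar_date) AS (
        SELECT date(:start)
        UNION ALL
        SELECT date(calendar_date, '+1 day')
        FROM calendar
        WHERE calendar_date < date(:end)
    )
    SELECT calendar.calendar_date,
           COUNT(People.id) AS n   -- counts only matched rows, so empty days give 0
    FROM calendar
    LEFT JOIN People
           ON date(People.created_at) = calendar.calendar_date
    GROUP BY calendar.calendar_date
    ORDER BY calendar.calendar_date
    """,
    {"start": "2012-05-01", "end": "2012-05-05"},
).fetchall()

for day, n in rows:
    print(day, n)
```

Counting People.id rather than `*` matters here: COUNT(*) would report 1 for a day that only exists because of the left join.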

Confused on count(*) and self joins

I want to return all application dates for the current month and for the current year. This must be simple, however I can not figure it out. I know I have 2 dates for the current month and 90 dates for the current year. Right, Left, Outer, Inner I have tried them all, just throwing code at the wall trying to see what will stick and none of it works. I either get 2 for both columns or 180 for both columns. Here is my latest select statement.
SELECT count(a.evdtApplication) AS monthApplicationEntered,
count (b.evdtApplication) AS yearApplicationEntered
FROM tblEventDates a
RIGHT OUTER JOIN tblEventDates b ON a.LOANid = b.loanid
WHERE datediff(mm,a.evdtApplication,getdate()) = 0
AND datediff(yy,a.evdtApplication, getdate()) = 0
AND datediff(yy,b.evdtApplication,getdate()) = 0
You don't need any joins at all.
You want to count the loanID column from tblEventDates, and you want to do it conditionally based on the date matching the current month or the current year.
So:
SELECT SUM(CASE WHEN MONTH(a.evdtApplication) = MONTH(GETDATE()) THEN 1 END) AS monthTotal,
COUNT(*) AS yearTotal
FROM tblEventDates a
WHERE a.evdtApplication BETWEEN '2008-01-01' AND '2008-12-31'
What that does is select all the event dates for this year and add up the ones that match your condition. If a date doesn't match the current month, it doesn't add 1. You don't even need a condition for the year, because the WHERE clause already restricts the query to that year.
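The single-pass count can be checked on a few rows. This is a sketch in Python's sqlite3 rather than SQL Server, so strftime and date modifiers stand in for MONTH() and GETDATE(); the rows and the fixed "today" value are invented, and a half-open date range replaces the hard-coded BETWEEN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblEventDates (loanId INTEGER, evdtApplication TEXT)")
conn.executemany(
    "INSERT INTO tblEventDates VALUES (?, ?)",
    [(1, "2008-02-10"), (2, "2008-06-03"), (3, "2008-06-21"), (4, "2007-06-15")],
)

today = "2008-06-30"  # fixed reference date so the result is reproducible

row = conn.execute(
    """
    SELECT COALESCE(SUM(CASE WHEN strftime('%m', evdtApplication) = strftime('%m', :today)
                             THEN 1 ELSE 0 END), 0) AS monthTotal,
           COUNT(*) AS yearTotal
    FROM tblEventDates
    WHERE evdtApplication >= date(:today, 'start of year')
      AND evdtApplication <  date(:today, 'start of year', '+1 year')
    """,
    {"today": today},
).fetchone()

print(row)  # (2, 3)
```

The 2007 row shares the month but is removed by the WHERE clause, which is why the month test inside the CASE needs no separate year check.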