Selecting employees with birthdays in given range using Oracle SQL

For selecting birthdays between two months where FROMDATE and TODATE are some parameters in a prepared statement I figured something like this:
select
p.id as person_id,
...
...
where e.active = 1
and extract(month from TODATE) >= extract(month from e.dateOfBirth)
and extract(month from e.dateOfBirth) >= extract(month from FROMDATE)
order by extract(month from e.dateOfBirth) DESC,
extract(day from e.dateOfBirth) DESC
How can this be improved to work with days as well?

There's more than one way to search date ranges in Oracle. For your scenario I suggest you turn the month and day elements of all the dates involved into numbers.
select
p.id as person_id,
...
...
where e.active = 1
and to_number (to_char( e.dateOfBirth, 'MMDD'))
between to_number (to_char( FROMDATE, 'MMDD'))
and to_number (to_char( TODATE, 'MMDD'))
order by extract(month from e.dateOfBirth) DESC,
extract(day from e.dateOfBirth) DESC
This won't use any index on the e.dateOfBirth column. Whether that matters depends on how often you want to run the query.
@AdeelAnsari comments:
" I don't like to to_char the predicate, in order to make use of
index"
What index? A normal index on dateOfBirth isn't going to be of any use, because the index entries will lead with the year element. So it won't help you find all the records of people born on any 23rd December.
A function-based index - or in 11g, a virtual column with an index (basically the same thing) - is the only way of indexing parts of a date column.
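The MMDD comparison can be sketched outside the database to check the logic. This is only an illustration in Python, not part of the original answer: the function names are hypothetical, and the wrap-around branch (a range such as 25 December to 10 January) is an assumed extension that the BETWEEN query above does not handle.

```python
from datetime import date


def mmdd(d: date) -> int:
    """Month-and-day key, the same number TO_NUMBER(TO_CHAR(d, 'MMDD')) yields."""
    return d.month * 100 + d.day


def birthday_in_range(dob: date, from_date: date, to_date: date) -> bool:
    """True if the birthday (ignoring the year) falls in the inclusive range."""
    key, lo, hi = mmdd(dob), mmdd(from_date), mmdd(to_date)
    if lo <= hi:
        return lo <= key <= hi
    # Assumed extension: the range wraps past 31 December.
    return key >= lo or key <= hi


print(birthday_in_range(date(1980, 12, 23), date(2024, 12, 1), date(2024, 12, 31)))  # True
print(birthday_in_range(date(1980, 1, 5), date(2024, 12, 25), date(2025, 1, 10)))    # True
print(birthday_in_range(date(1980, 6, 1), date(2024, 12, 25), date(2025, 1, 10)))    # False
```

In SQL the wrap-around case would need the same OR branch whenever the MMDD value of FROMDATE is greater than that of TODATE.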

Do you need maximum performance, and so are willing to make a change to the schema? Or are the number of records so small, and performance relatively un-important, that you want a query that will work with the data as-is?
The simplest and fastest way to do this is to store a second date-of-birth field, but 'without' the year. I put quotes around 'without' because a date can't actually not have a year. So, instead, you just base it on another year.
Re-dating every DoB to the year 2000 is a good choice in my experience, because 2000 is a leap year (so 29 February exists) and is a nice round number. Every DoB, FromDate and ToDate will work in the year 2000...
WHERE
DoB2000 >= FromDate
AND DoB2000 <= ToDate
(This assumes you also index the new field to make the search quick, otherwise you still get a scan, though it MAY be faster than the following alternative anyway.)
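As a rough illustration of the re-dating idea (in Python rather than SQL, and with a hypothetical helper name), every date is moved to the year 2000 before comparing. In the database, DoB2000 would be a stored and indexed column rather than something computed per query.

```python
from datetime import date


def to_year_2000(d: date) -> date:
    """Re-date to the year 2000; safe for 29 February because 2000 is a leap year."""
    return d.replace(year=2000)


dob2000 = to_year_2000(date(1988, 2, 29))
from_date = to_year_2000(date(2023, 2, 1))
to_date = to_year_2000(date(2023, 3, 15))

print(dob2000)                           # 2000-02-29
print(from_date <= dob2000 <= to_date)   # True
```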
Alternatively, you can keep using the EXTRACT pattern. But that will have an unfortunate consequence; it's extremely messy and you will never get an Index Seek, you will always get an Index Scan. This is because of the fact that the searched field is wrapped in a function call.
WHERE
( EXTRACT(month FROM e.DateOfBirth) > EXTRACT(month FROM FromDate)
OR ( EXTRACT(month FROM e.DateOfBirth) = EXTRACT(month FROM FromDate)
AND EXTRACT(day FROM e.DateOfBirth) >= EXTRACT(day FROM FromDate)
)
)
AND
( EXTRACT(month FROM e.DateOfBirth) < EXTRACT(month FROM ToDate)
OR ( EXTRACT(month FROM e.DateOfBirth) = EXTRACT(month FROM ToDate)
AND EXTRACT(day FROM e.DateOfBirth) <= EXTRACT(day FROM ToDate)
)
)

You should be able to use:
SELECT * FROM mytable
where dateofbirth between start_dt and end_dt
Alternatively, you can convert the dates to the day of the year using:
to_char( thedate, 'ddd' )
then check the range. (Note that this has the same issue as @Dems' answer: the range must not span the end of the year, as in Dec 25th through Jan 10th.)

This is the query that I am using for birthdates in the next 20 days:
SELECT A.BIRTHDATE,
CEIL(ABS (MONTHS_BETWEEN(A.BIRTHDATE, SYSDATE) / 12)) AGE_NOW,
CEIL(ABS (MONTHS_BETWEEN(A.BIRTHDATE, SYSDATE + 20) / 12)) AGE_IN_20_DAYS
FROM USERS A
WHERE
CEIL(ABS (MONTHS_BETWEEN(A.BIRTHDATE, SYSDATE) / 12)) <> CEIL(ABS (MONTHS_BETWEEN(A.BIRTHDATE, SYSDATE + 20) / 12));
ABS(MONTHS_BETWEEN(A.BIRTHDATE, SYSDATE) / 12) returns the age as a decimal, such as 38.9 or 27.2.
Applying CEIL() to the age now and to the age in 20 days, and comparing the two, tells us whether the person has a birthday within that window. E.g.:
Age: 29.97
29.97 + (20 days, about 0.05 years) = 30.02
ceil(30.02) - ceil(29.97) = 31 - 30 = 1
This is the result of querying on December 16:
BIRTHDATE    AGE_NOW  AGE_IN_20_DAYS
12/29/1981   35       36
12/29/1967   49       50
1/3/1973     44       45
1/4/1968     49       50

I don't know how this behaves in terms of performance, but I'd try using regular date subtraction. Say, for example, you want birth days between January 1st and July 1st. Anyone who is between 25 and 25.5 would qualify. That is, anyone whose partial age is less than 1/2 year would qualify. The math is easy at 1/2, but it is the same regardless of the window. In other words, "within one month" means +/- 1/12 of a year, within 1 day means +/- 1/365 of a year, and so on.
The year does not matter in this method, so I'll use current year when creating the variable.
Also, I would think of the center of the range, and then do absolute values (either that, or always use a future date).
For example
select personid
from mytable
where abs(mod(dob - target_birth_date)) < 1/52
would give you everyone with a birthday within a week.
I realize this code is barely pseudocode, but it seems like it might allow you to do it, and you might still be able to use indices if you tweaked it a little. Just trying to think outside the box.

In the end we picked a slightly different solution, where we first create the anniversary date:
where
...
and (to_char(sysdate,'yyyy') - to_char(e.dateofbirth,'yyyy')) > 0
and add_months(e.dateofbirth,
(to_char(sysdate,'yyyy') - to_char(e.dateofbirth,'yyyy')) * 12)
>:fromDate:
and :toDate: > add_months(e.dateofbirth,
(to_char(sysdate,'yyyy') - to_char(e.dateofbirth,'yyyy')) * 12)
order by extract(month from e.dateofbirth) DESC,
extract(day from e.dateofbirth) DESC)

That's the solution!!!
SELECT
*
FROM
PROFILE prof
where
to_date(to_char(prof.BIRTHDATE, 'DDMM'), 'DDMM') BETWEEN sysdate AND sysdate+5
or
add_months(to_date(to_char(prof.BIRTHDATE, 'DDMM'), 'DDMM'),12) BETWEEN sysdate AND sysdate+5

Guys I have a simpler solution to this problem
step 1. convert the month into number,
step 2. concatenate the day(in two digits) to the month
step 3. then convert the result to number by doing to_number(result)
step 4. Once you have this for the start date and end date, apply BETWEEN to the three numbers and you are done.
sample code:
SELECT date_of_birth
FROM xxxtable
where to_number(to_number(to_char(to_date(substr(date_of_birth,4,3),'mon'),'mm'))||substr(date_of_birth,1,2)) between
to_number(to_number(to_char(to_date(substr(:start_date_variable,4,3),'mon'),'mm'))||(substr(:start_date_variable,1,2))) and
to_number(to_number(to_char(to_date(substr(:end_date_variable,4,3),'mon'),'mm'))||(substr(:end_date_variable,1,2)));

Related

SQL : retrieve top 7 of entries added in the past week

I’m new to SQL and I need some help for a query that should return a top of occurrences sorted by date.
Actually I have a table to which are added searches done by users alongside the date of search (column is in the DATETIME format).
What I would like to do is to create a list of « Trending searches » that would show the top 7 searches done over the past week (Sunday to Sunday for example) but I’m unsure where to start.
I’ve heard of the DATEPART function but I don’t know how to use it alongside a top 7 occurrences.
Thanks in advance for your help and have a nice day !
Does this work?
declare @lastweek datetime
declare @now datetime
set @now = getdate()
set @lastweek = dateadd(day,-7,@now)
SELECT COUNT(g.searchTerm) AS appearanceCount, g.searchTerm FROM DesiredTable AS g
WHERE g.DateSearched BETWEEN @lastweek AND @now
GROUP BY g.searchTerm
ORDER BY appearanceCount DESC
The mention of datepart() suggests SQL Server. If so, you can do:
select top (7) s.searchTerm, count(*)
from searches s
where s.searchTime >= dateadd(day, -7, getdate())
group by s.searchTerm
order by count(*) desc;
This gets the last 7 days to the current time.
If you want the last week, a pretty simple where is:
where datediff(week, s.searchTime, getdate()) = 1
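The filter, group, count and top-7 steps can be sketched in Python to make the logic explicit. This is only an illustration: the function name and the (term, timestamp) row shape are assumptions, not part of the schema in the question.

```python
from collections import Counter
from datetime import datetime, timedelta


def trending(searches, now, days=7, top=7):
    """searches: iterable of (search_term, searched_at) pairs (hypothetical shape)."""
    cutoff = now - timedelta(days=days)
    # Keep rows from the last `days` days, then count occurrences per term.
    counts = Counter(term for term, at in searches if cutoff <= at <= now)
    # most_common sorts by count, highest first, like ORDER BY count(*) DESC with TOP.
    return counts.most_common(top)


now = datetime(2024, 5, 12, 12, 0)
rows = [
    ("oracle", datetime(2024, 5, 11, 9, 0)),
    ("oracle", datetime(2024, 5, 10, 9, 0)),
    ("mysql", datetime(2024, 5, 9, 9, 0)),
    ("oracle", datetime(2024, 4, 1, 9, 0)),  # older than 7 days, ignored
]
print(trending(rows, now))  # [('oracle', 2), ('mysql', 1)]
```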
This problem can be solved pretty easily with MySQL. As far as I understand, you are trying to do two things:
1. Get searches from last week
With DATETIME fields there are pretty easy ways to extract the week of the year:
SELECT id FROM searches
WHERE searchDate >= curdate() - INTERVAL DAYOFWEEK(curdate())+6 DAY
AND searchDate < curdate() - INTERVAL DAYOFWEEK(curdate())-1 DAY
As suggested here.
2. Get the top 7 most frequent ones
Secondly you said that you want to have the top 7 searches, which translate into most frequent occurring search terms. In other words: You need to group identical search terms together and count them:
SELECT count(id), searchTerm FROM searches
WHERE searchDate >= curdate() - INTERVAL DAYOFWEEK(curdate())+6 DAY
AND searchDate < curdate() - INTERVAL DAYOFWEEK(curdate())-1 DAY
GROUP BY searchTerm
In addition:
To get the first n (here 7) rows in MySQL, order by the count and add LIMIT 7 (rownum <= 7 is the Oracle idiom and does not exist in MySQL).
Like this (added to Gegenwind's solution):
SELECT count(id) AS cnt, searchTerm FROM searches
WHERE searchDate >= curdate() - INTERVAL DAYOFWEEK(curdate())+6 DAY
AND searchDate < curdate() - INTERVAL DAYOFWEEK(curdate())-1 DAY
GROUP BY searchTerm
ORDER BY cnt DESC
LIMIT 7

Retrieve upcoming birthdays in Postgres

I have a users table with a dob (date of birth) field, in a postgres database.
I want to write a query that will retrieve the next five upcoming birthdays. I think the following needs to be considered -
Sorting by date of birth won't work because the years can be different.
You want the result to be sorted by date/month, but starting from today. So, for example, yesterday's date would be the last row.
Ideally, I would like to do this without functions. Not a deal breaker though.
Similar questions have been asked on SO, but most don't even have an accepted answer. Thank you.
Well, this got downvoted a lot. But I'll post my answer anyway. One of the answers helped me arrive at the final solution, and that answer has been deleted by its owner for some reason.
Anyway, this query works perfectly. It gives the next 5 upcoming birthdays, along with the dates.
SELECT id, name,
CASE
WHEN dob2 < current_date THEN dob2 + interval '1 year'
ELSE dob2
END
AS birthday
FROM people,
make_date(extract(year from current_date)::int, extract(month from dob)::int, extract(day from dob)::int) as dob2
WHERE is_active = true
ORDER BY birthday
LIMIT 5;
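The CASE logic (move the birthday into the current year, then push it to next year if it has already passed) can be sketched in Python as follows. The function names are hypothetical, and the handling of 29 February in a non-leap year (falling back to 28 February) is an assumption that the query above does not address.

```python
from datetime import date


def next_birthday(dob: date, today: date) -> date:
    """Next occurrence of dob's month and day on or after today."""
    def in_year(year: int) -> date:
        try:
            return dob.replace(year=year)
        except ValueError:
            # 29 February in a non-leap year: assumed fallback to 28 February.
            return date(year, 2, 28)

    candidate = in_year(today.year)
    return candidate if candidate >= today else in_year(today.year + 1)


def upcoming(people, today, limit=5):
    """people: iterable of (name, dob) pairs; returns (birthday, name), soonest first."""
    return sorted((next_birthday(dob, today), name) for name, dob in people)[:limit]


today = date(2024, 12, 20)
people = [("Ann", date(1990, 1, 3)), ("Bob", date(1985, 12, 19)), ("Cy", date(1979, 12, 25))]
for birthday, name in upcoming(people, today):
    print(birthday, name)
# 2024-12-25 Cy
# 2025-01-03 Ann
# 2025-12-19 Bob
```

Yesterday's birthday (Bob) sorts last, which is the ordering the question asks for.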
You can look at day of year of dob and compare against current date's doy:
SELECT dob
, extract(doy from dob) - extract(doy from current_date) as upcoming_bday
FROM users
WHERE extract(doy from dob) - extract(doy from current_date) >= 0
order by 2
limit 5

SQL computing and reusing fiscal year calculation in sql query

I have a condition in my SQL query, using an Oracle 11g database, that depends on a plan starting or ending within a fiscal year:
(BUSPLAN.START_DATE BETWEEN (:YEAR || '-04-01') AND (:YEAR+1 || '-03-31')) OR
(BUSPLAN.END_DATE BETWEEN (:YEAR || '-04-01') AND (:YEAR+1 || '-03-31'))
For now, I am passing in YEAR as a parameter. It can be computed as (pseudocode):
IF CURRENT MONTH IN (JAN, FEB, MAR):
USE CURRENT YEAR // e.g. 2015
ELSE:
USE CURRENT YEAR + 1 // e.g. 2016
Is there a way I could compute the year within the SQL query and reuse it in place of the :YEAR parameter?
CTEs are easy, you can make little tables on the fly. With a 1 row table you just cross join it and then you have that value available every row:
WITH getyear as
(
SELECT
CASE WHEN to_char(sysdate,'mm') in ('01','02','03') THEN
EXTRACT(YEAR FROM sysdate)
ELSE
EXTRACT(YEAR FROM sysdate) + 1
END as ynum from dual
), mydates as
(
SELECT getyear.ynum || '-04-01' as startdate,
getyear.ynum+1 || '-03-31' as enddate
from getyear
)
select
-- your code here
from BUSPLAN, mydates -- this is a cross join
where
(BUSPLAN.START_DATE BETWEEN mydates.startdate AND mydates.enddate) OR
(BUSPLAN.END_DATE BETWEEN mydates.startdate AND mydates.enddate)
Note: a VALUES statement is probably better, if Oracle supports one. The first CTE would then look like this:
VALUES(CASE WHEN to_char(sysdate,'mm') in ('01','02','03') THEN
EXTRACT(YEAR FROM sysdate)
ELSE
EXTRACT(YEAR FROM sysdate) + 1
END)
I don't have access to Oracle, so there may be bugs or typos, since I didn't test this.
In the code you shared there is a problem and a potential problem.
Problem, implicit conversion to date without format string.
In (BUSPLAN.START_DATE BETWEEN (:YEAR || '-04-01') AND (:YEAR+1 || '-03-31')) two strings are being formed and then implicitly converted to dates. The result of that conversion depends on the value of NLS_DATE_FORMAT. To ensure that the string is converted correctly, use an explicit format: to_date(:YEAR || '-04-01', 'YYYY-MM-DD').
Potential problem, boundary at the end of the year when time <> midnight.
Oracle's date type holds both date and time. A test like someDate between startDate and endDate will miss all records that happened after midnight on endDate. One simple fix that precludes use of indexes on someDate is trunc(someDate) between startDate and endDate.
A more general approach is to define date ranges as closed-open intervals: lowerBound <= aDate < upperBound, where lowerBound is the same as startDate above and upperBound is endDate plus one day.
Note: Some applications use Oracle date columns purely as dates and always store midnight. If your application is of that sort, then this is not a problem. Check constraints like check (trunc(dateColumn) = dateColumn) would make sure it stays that way.
And now, to answer the question actually asked.
Using subquery factoring (Oracle's terminology) / common table expression (SQL Server's terminology) one can avoid repetition within a query.
Instead of figuring out the proper year and then using strings to put together dates, the code below starts by getting January 1 at midnight of the current calendar year, trunc(sysdate, 'YEAR'). Then it adds an offset in months. When the month is Jan, Feb, or Mar, the current fiscal year started last year on 4/1, nine months before the start of this year, so the offset is -9. Otherwise the current fiscal year started on 4/1 of this calendar year, the start of this year plus three months.
Instead of end date, an upper bound is calculated, similar to lower bound, but with the offsets being 12 greater than lower bound to get 4/1 the following year.
with current_fiscal_year as (select add_months(trunc(sysdate, 'YEAR')
, case when extract(month from sysdate) <= 3 then -9 else 3 end) as LowerBound
, add_months(trunc(sysdate, 'YEAR')
, case when extract(month from sysdate) <= 3 then 3 else 15 end) as UpperBound
from dual)
select *
from busplan
cross join current_fiscal_year CFY
where (CFY.LowerBound <= busplan.start_date and busplan.start_date < CFY.UpperBound)
or (CFY.LowerBound <= busplan.end_date and busplan.end_date < CFY.UpperBound)
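To sanity-check the offsets, the same bounds can be computed in a few lines of Python. This is only a sketch of the logic with hypothetical function names; it assumes a fiscal year starting on 1 April, as in the question, and returns a closed-open interval like the query above.

```python
from datetime import date


def fiscal_year_bounds(today: date):
    """Closed-open [lower, upper) bounds of the fiscal year (1 April start) containing today."""
    # Jan-Mar belong to the fiscal year that started the previous 1 April.
    start_year = today.year - 1 if today.month <= 3 else today.year
    return date(start_year, 4, 1), date(start_year + 1, 4, 1)


def in_fiscal_year(d: date, today: date) -> bool:
    """True if d falls in the fiscal year containing today."""
    lower, upper = fiscal_year_bounds(today)
    return lower <= d < upper


print(fiscal_year_bounds(date(2015, 2, 10)))  # (datetime.date(2014, 4, 1), datetime.date(2015, 4, 1))
print(fiscal_year_bounds(date(2015, 6, 1)))   # (datetime.date(2015, 4, 1), datetime.date(2016, 4, 1))
print(in_fiscal_year(date(2015, 3, 31), date(2015, 2, 10)))  # True
```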
And yet more unsolicited advice.
The times I've had to deal with fiscal year stuff, avoiding repetition within a query was the low-hanging fruit. Keeping the fiscal year calculations consistent and correct across many queries was the essence of the work. So I'd recommend developing a PL/SQL package that centralizes fiscal calculations. It might include a function like:
create or replace function GetFiscalYearStart(v_Date in date default sysdate)
return date
as begin
return add_months(trunc(v_Date, 'YEAR')
, case when extract(month from v_Date) <= 3 then -9 else 3 end);
end GetFiscalYearStart;
Then the query above becomes:
select *
from busplan
where (GetFiscalYearStart() <= busplan.start_date
and busplan.start_date < add_months(GetFiscalYearStart(), 12))
or (GetFiscalYearStart() <= busplan.end_date
and busplan.end_date < add_months(GetFiscalYearStart(), 12))

How to calculate ages in BigQuery?

I have two TIMESTAMP columns in my table: customer_birthday and purchase_date. I want to create a query to show the number of purchases by customer age, to create a chart.
But how do I calculate ages, in years, using BigQuery? In other words, how do I get the difference in years between two TIMESTAMPs? The age calculation cannot be made using days or hours, because of leap years, so the function DATEDIFF(<timestamp1>,<timestamp2>) is not appropriate.
Thanks.
First of all, I'd really love BigQuery to have a function which calculates current age based on a date. That seems to be like a very common use case and it's not really easy due to the whole leap year thing.
I found a great article about this issue: https://towardsdatascience.com/how-to-accurately-calculate-age-in-bigquery-999a8417e973
Their final approach is similar to Lars Haugseth's and Saad's answer, but they do not use the DAYOFYEAR part in order to avoid issues with leap years. It also gives you the flexibility not only to calculate the current age, but also the age at a particular date that you pass to the function as argument:
CREATE OR REPLACE FUNCTION workspace.age_calculation(as_of_date DATE, date_of_birth DATE)
AS (
DATE_DIFF(as_of_date,date_of_birth, YEAR) -
IF(EXTRACT(MONTH FROM date_of_birth)*100 + EXTRACT(DAY FROM date_of_birth) >
EXTRACT(MONTH FROM as_of_date)*100 + EXTRACT(DAY FROM as_of_date)
,1,0)
)
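The same month-and-day comparison can be checked quickly in Python. This is a sketch of the logic only (the function name is hypothetical); it mirrors the MONTH*100 + DAY trick from the function above rather than using day-of-year.

```python
from datetime import date


def age(as_of: date, dob: date) -> int:
    """Age in whole years on as_of, using a month*100+day comparison."""
    years = as_of.year - dob.year
    # Subtract one year if the birthday has not yet occurred in as_of's year.
    if dob.month * 100 + dob.day > as_of.month * 100 + as_of.day:
        years -= 1
    return years


print(age(date(2022, 12, 14), date(2000, 12, 30)))  # 21
print(age(date(2022, 12, 30), date(2000, 12, 30)))  # 22
print(age(date(2023, 2, 28), date(2000, 2, 29)))    # 22
```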
Regarding the difference between dates - you could consider user-defined functions (https://cloud.google.com/bigquery/user-defined-functions) with a JavaScript date library, such as Datejs or Moment.js
You can use DATE_DIFF to get the difference in years, but you need to subtract one if the birthday has not yet occurred this year:
IF(EXTRACT(DAYOFYEAR FROM CURRENT_DATE) < EXTRACT(DAYOFYEAR FROM birthdate),
DATE_DIFF(CURRENT_DATE, birthdate, YEAR) - 1,
DATE_DIFF(CURRENT_DATE, birthdate, YEAR)) AS age
Here it is in a user defined function:
CREATE TEMP FUNCTION calculateAge(birthdate DATE) AS (
DATE_DIFF(CURRENT_DATE, birthdate, YEAR) +
IF(EXTRACT(DAYOFYEAR FROM CURRENT_DATE) < EXTRACT(DAYOFYEAR FROM birthdate), -1, 0) -- subtract 1 if birthdate has not yet occurred this year
);
You can compute the number of days it would be if all years were 365 days long, take the difference, and divide by 365. For example:
SELECT (day2-day1)/365
FROM (
SELECT YEAR(t1) * 365 + DAYOFYEAR(t1) as day1,
YEAR(t2) * 365 + DAYOFYEAR(t2) as day2
FROM (
SELECT TIMESTAMP('20000201') as t1,
TIMESTAMP('20140201') as t2))
This returns 14.0, even though there are intervening leap years. If you want the final result as an integer instead of floating point, you can use the INTEGER() function to cast the result.
Note that if one of the dates is a leap day (feb 29) it will appear to be one year away from march 1, but I think this sounds like the intended behavior.
Another way to calculate age that takes leap years into account is to:
1. Calculate a simple age based on the difference in years.
2. Decide whether to subtract 1, as follows:
   - Add the difference in years to the birthday (e.g. if today is 2022-12-14 and the birthday is 2000-12-30, then the "new" birthday becomes 2022-12-30).
   - Do a DAY-based difference between today and the "new" birthday, which gives you either a non-negative number (birthday already passed this year) or a negative number (birthday still to come this year).
   - Subtract 1 year from the simple age if the number is negative.
In BigQuery SQL code this looks like:
SELECT
bd AS birthday
,today
,DATE_DIFF(today, bd, YEAR) AS simpleAge
,DATE_DIFF(today, bd, YEAR) +
(CASE
WHEN DATE_DIFF(today, DATE_ADD(bd, INTERVAL DATE_DIFF(today, bd, YEAR) YEAR), DAY) >= 0
THEN 0
ELSE -1
END) AS age
FROM
(SELECT
PARSE_DATE("%Y-%m-%d", "2000-12-30") AS bd
,CURRENT_DATE("Asia/Tokyo") AS today
)
Outputs:
birthday    today       simpleAge  age
2000-12-30  2022-12-14  22         21

Timestamps and Intervals: NUMTOYMINTERVAL SYSDATE CALCULATION SQL QUERY

I am working on a homework problem, I'm close but need some help with a data conversion I think. Or sysdate - start_date calculation
The question is:
Using the EX schema, write a SELECT statement that retrieves the date_id and start_date from the Date_Sample table (format below), followed by a column named Years_and_Months_Since_Start that uses an interval function to retrieve the number of years and months that have elapsed between the start_date and the sysdate. (Your values will vary based on the date you do this lab.) Display only the records with start dates having the month and day equal to Feb 28 (of any year).
DATE_ID START_DATE YEARS_AND_MONTHS_SINCE_START
2 Sunday , February 28, 1999 13-8
4 Monday , February 28, 2005 7-8
5 Tuesday , February 28, 2006 6-8
Our EX schema that refers to this question is simply a Date_Sample Table with two columns:
DATE_ID NUMBER NOT Null
START_DATE DATE
I Have written this code:
SELECT date_id, TO_CHAR(start_date, 'Day, MONTH DD, YYYY') AS start_date ,
NUMTOYMINTERVAL((SYSDATE - start_date), 'YEAR') AS years_and_months_since_start
FROM date_sample
WHERE TO_CHAR(start_date, 'MM/DD') = '02/28';
But my Years and months since start column is not working properly. It's getting very high numbers for years and months when the date calculated is from 1999-ish. ie, it should be 13-8 and I'm getting 5027-2 so I know it's not correct. I used NUMTOYMINTERVAL, which should be correct, but don't think the sysdate-start_date is working. Data Type for start_date is simply date. I tried ROUND but maybe need some help to get it right.
Something is wrong with my calculation and trying to figure out how to get the correct interval there. Not sure if I have provided enough information to everyone but I will let you know if I figure it out before you do.
It's a question from Murach's Oracle and SQL/PL book, chapter 17 if anyone else is trying to learn that chapter. Page 559.
You'll want MONTHS_BETWEEN inside that NUMTOYMINTERVAL: subtracting two dates gives the answer in days, which isn't usable here, and the reason the result is so high is that you've told Oracle that number of days was in years! Also use the fm modifier in the to_char to prevent excess whitespace.
select date_id,
to_char(start_date, 'fmDay, Month DD, YYYY') as start_date,
extract(year from numtoyminterval(months_between(trunc(sysdate), start_date), 'month') )
|| '-' ||
extract(month from numtoyminterval(months_between(trunc(sysdate), start_date), 'month') )
as years_and_months_since_start
from your_table
where to_char(start_date, 'MM/DD') = '02/28';
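To see why months are the right unit, here is a small Python sketch that counts whole months between two dates and splits them into years and months. It is a simplification (Oracle's MONTHS_BETWEEN also returns a fractional part and has a special case for month-end dates), and the function names and the run date are hypothetical.

```python
from datetime import date


def whole_months_between(later: date, earlier: date) -> int:
    """Whole calendar months elapsed from earlier to later (simplified)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the current month is not complete yet
    return months


def years_and_months(start: date, as_of: date) -> str:
    """Format the elapsed time as 'years-months', like the expected output."""
    years, months = divmod(whole_months_between(as_of, start), 12)
    return f"{years}-{months}"


as_of = date(2012, 11, 15)  # hypothetical run date
print(years_and_months(date(1999, 2, 28), as_of))  # 13-8
print(years_and_months(date(2005, 2, 28), as_of))  # 7-8
print(years_and_months(date(2006, 2, 28), as_of))  # 6-8
```

Feeding a day count into the 'YEAR' unit instead is what produces values in the thousands.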
You can simplify the answer like this
SELECT date_id, start_date, numtoyminterval(months_between(sysdate, start_date), 'month') as "Years and Months Since Start"
FROM date_sample
WHERE EXTRACT (MONTH FROM start_date) = 2 AND EXTRACT (DAY FROM start_date) = 28;