What is the best way to store year, month, and day in a MySQL database so that rows can be easily retrieved by year, by year-month, or by year-month-day?
Let's say you have a table tbl with a column d of type DATE.
All records in 1997:
SELECT * FROM tbl WHERE YEAR(d) = 1997
SELECT * FROM tbl WHERE d BETWEEN '1997-01-01' AND '1997-12-31'
All records in March of 1997:
SELECT * FROM tbl WHERE YEAR(d) = 1997 AND MONTH(d) = 3
SELECT * FROM tbl WHERE d BETWEEN '1997-03-01' AND '1997-03-31'
All records on March 10, 1997:
SELECT * FROM tbl WHERE d = '1997-03-10'
Unless a time of day will ever be involved, use the DATE data type. You can then use the date functions to select portions of the date.
I'd recommend the obvious: use a DATE.
It stores year-month-day with no time (hour-minutes-seconds-etc) component.
Store it as a DATE and use the built-in functions DAY(), MONTH(), or YEAR() to return the combination you wish.
What's wrong with DATE? As long as you need Y, Y-M, or Y-M-D searches, they should be indexable. The problem with DATE would be if you want all December records across several years, for instance.
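As a rough illustration of why Y, Y-M, and Y-M-D searches stay indexable, the sketch below turns each kind of search into a half-open range on the date column, so the column itself is never wrapped in a function. It uses an in-memory SQLite database only as a stand-in for MySQL; the helper names date_range and count_rows are made up for this example.

```python
import sqlite3
from datetime import date


def date_range(year, month=None, day=None):
    """Return (start, end) ISO strings for a year, a month, or one day; end is exclusive."""
    if month is None:
        start, end = date(year, 1, 1), date(year + 1, 1, 1)
    elif day is None:
        start = date(year, month, 1)
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    else:
        start = date(year, month, day)
        end = date.fromordinal(start.toordinal() + 1)
    return start.isoformat(), end.isoformat()


def count_rows(conn, year, month=None, day=None):
    """Count rows in tbl whose d falls in the requested range, using a plain range predicate."""
    start, end = date_range(year, month, day)
    row = conn.execute(
        "SELECT COUNT(*) FROM tbl WHERE d >= ? AND d < ?", (start, end)
    ).fetchone()
    return row[0]


# Stand-in table: SQLite stores the dates as ISO text, which sorts chronologically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (d TEXT)")
conn.execute("CREATE INDEX idx_tbl_d ON tbl (d)")
conn.executemany(
    "INSERT INTO tbl VALUES (?)",
    [("1997-03-10",), ("1997-03-31",), ("1997-12-31",), ("1998-01-01",)],
)

print(count_rows(conn, 1997), count_rows(conn, 1997, 3), count_rows(conn, 1997, 3, 10))
```

The half-open form (d >= start AND d < end) avoids having to know the last day of each month and keeps working if the column later gains a time component.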
This may be related to the problem that archivists have with common date datatypes. Often, you want to be able to encode just the year, or just the year and the month, depending on what information is available, but you want to be able to encode this information in just one datatype. This is a problem which doesn't apply in very many other situations. (In answer to this question in the past, I've had techie types dismiss it as a problem with the data: your data is faulty!)
e.g., in a composer catalogue you are recording the fact that the composer dated a manuscript "January 1951". What can you put in a MySQL DATE field to represent this? "1951-01"? "1951-01-00"? Neither is really valid. Normally you end up encoding years, months and days in separate fields and then having to implement the semantics at application level. This is far from ideal.
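To make the "separate fields plus application-level semantics" approach concrete, here is a minimal sketch of a partial-date value. The class name PartialDate and its matches method are invented for this illustration; it only shows one possible semantics (an unknown month or day never matches a more specific search).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    """A date known to year, year-month, or year-month-day precision."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def __post_init__(self):
        # A day without a month has no meaning in this model.
        if self.day is not None and self.month is None:
            raise ValueError("a day requires a month")

    def __str__(self):
        parts = [f"{self.year:04d}"]
        if self.month is not None:
            parts.append(f"{self.month:02d}")
        if self.day is not None:
            parts.append(f"{self.day:02d}")
        return "-".join(parts)

    def matches(self, year, month=None, day=None):
        """True if this date is known to fall in the given year, year-month, or exact day."""
        if self.year != year:
            return False
        if month is not None and self.month != month:
            return False
        if day is not None and self.day != day:
            return False
        return True


manuscript = PartialDate(1951, 1)  # "January 1951", day unknown
print(manuscript, manuscript.matches(1951), manuscript.matches(1951, 1, 15))
```

In a table this would map to three columns (year, nullable month, nullable day), with the matching rules living in the application as described above.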
If you're doing analytics against a fixed range of dates consider using a date dimension (fancy name for table) and use a foreign key into the date dimension. Check out this link:
http://www.ipcdesigns.com/dim_date/
If you use this date dimension, consider how easy it becomes to construct queries against any kind of date you can think of.
SELECT * FROM my_table
JOIN DATE_DIM date on date.PK = my_table.date_FK
WHERE date.day = 30 AND
date.month = 1 AND
date.year = 2010
Or
SELECT * FROM my_table
JOIN DATE_DIM date on date.PK = my_table.date_FK
WHERE date.day_of_week = 1 AND
date.month = 1 AND
date.year = 2010
Or
SELECT *, date.day_of_week_name FROM my_table
JOIN DATE_DIM date on date.PK = my_table.date_FK
WHERE date.is_US_civil_holiday = 1
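For readers who have not built a date dimension before, the following sketch generates a small one. It is only an approximation of the idea: the column set is a subset of what the queries above use, the table lives in an in-memory SQLite database, and the choices of a YYYYMMDD integer key and Monday = 1 for day_of_week are assumptions of this example, not something the linked design prescribes.

```python
import sqlite3
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def build_date_dim(conn, start, end):
    """Create and fill a date_dim table with one row per day from start to end inclusive."""
    conn.execute(
        "CREATE TABLE date_dim ("
        "pk INTEGER PRIMARY KEY, year INTEGER, month INTEGER, day INTEGER, "
        "day_of_week INTEGER, day_of_week_name TEXT)"
    )
    rows = []
    current = start
    while current <= end:
        rows.append((
            current.year * 10000 + current.month * 100 + current.day,  # e.g. 20100130
            current.year,
            current.month,
            current.day,
            current.isoweekday(),          # assumption: 1 = Monday ... 7 = Sunday
            DAY_NAMES[current.weekday()],  # fixed English names, independent of locale
        ))
        current += timedelta(days=1)
    conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?, ?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
build_date_dim(conn, date(2010, 1, 1), date(2010, 12, 31))

# Same shape as the day_of_week query above, run directly against the dimension.
mondays = conn.execute(
    "SELECT COUNT(*) FROM date_dim WHERE day_of_week = 1 AND month = 1 AND year = 2010"
).fetchone()[0]
print(mondays)
```

Extra attributes such as holiday flags would be added as further columns and filled once when the dimension is generated.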
I am merging 2 huge tables in Snowflake and I have 2 columns (one on each table):
"Year_birth" and "Exam_date" and the info inside looks like this respectively:
"1918" and "2007-03-13" (NUMBER(38,0) and VARCHAR(256))
I only want to merge the rows where the difference (i.e., age when the exam was made) is ">18" and "<60"
I was playing around with SELECT DATEDIFF(year,Exam_date, Year_birth) with no success.
Any ideas on how would I do it in Snowflake?
Cheers!
You only have a year, so there is not much you can do about the specific day of the year -- you need to deal with approximations.
So, extract the year from the date string (arggh! it should really be a date) and just compare them:
where (left(datestr, 4)::int - yearnum) > 18 and (left(datestr, 4)::int - yearnum) < 60
I would strongly advise you to fix the database and store these values using a proper date datatype.
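The year-only comparison described above can be checked outside the database with a few lines. This is just a sketch of the arithmetic, assuming the exam date is always a string starting with a four-digit year; the function names are illustrative.

```python
def age_at_exam(year_birth, exam_date):
    """Approximate age in years: exam year minus birth year (the day of birth is unknown)."""
    return int(exam_date[:4]) - year_birth


def eligible(year_birth, exam_date):
    """True when the approximate age is strictly greater than 18 and strictly less than 60."""
    return 18 < age_at_exam(year_birth, exam_date) < 60


print(age_at_exam(1918, "2007-03-13"), eligible(1918, "2007-03-13"))
```

Because only the birth year is known, the result can be off by one year for any given person, which is the approximation mentioned above.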
You will need to convert the integer year into a date before doing a datediff
example:
set YearOfBirth = 1918;
set ExamDate = '2007-03-03'::DATE;
-- select $YearofBirth as YearofBirth, $ExamDate as ExamDate;
select $YearofBirth as YearofBirth,
       ($YearofBirth::TEXT || '-01-01')::DATE as YearofBirthDate,
       $ExamDate as ExamDate,
       datediff(year, ($YearofBirth::TEXT || '-01-01')::DATE, $ExamDate) as AgeAtExam;
Use YEARS_DIFF in the WHERE clause to filter for a difference between 18 and 60:
SELECT DATEDIFF(YEAR, TO_DATE(1918::CHAR(4), 'YYYY'), '2007-03-03') YEARS_DIFF;
I have a table storing a datetime column, which is indexed. I'm trying to find a way to compare ONLY the month and day (ignores the year totally).
Just for the record, I would like to say that I'm already using MONTH() and DAY(). But I'm encountering the issue that my current implementation uses Index Scan instead of Index Seek, due to the column being used directly in both functions to get the month and day.
There could be 2 types of references for comparison: a fixed given date and today (GETDATE()). The date will be converted based on time zone, and then have its month and day extracted, e.g.
DECLARE @monthValue INT = MONTH(@ConvertDateTimeFromServer_TimeZone);
DECLARE @dayValue INT = DAY(@ConvertDateTimeFromServer_TimeZone);
Another point is that the column stores datetime with different years, e.g.
1989-06-21 00:00:00.000
1965-10-04 00:00:00.000
1958-09-15 00:00:00.000
1965-10-08 00:00:00.000
1942-01-30 00:00:00.000
Now here comes the problem. How do I create a SARGable query to get the rows in the table that match the given month and day regardless of the year but also not involving the column in any functions? Existing examples on the web utilise years and/or date ranges, which for my case is not helping at all.
A sample query:
Select t0.pk_id
From dob t0 WITH(NOLOCK)
WHERE ((MONTH(t0.date_of_birth) = @monthValue AND DAY(t0.date_of_birth) = @dayValue))
I've also tried DATEDIFF() and DATEADD(), but they all end up with an Index Scan.
Adding to the comment I made, on a Calendar Table.
This will, probably, be the easiest way to get a SARGable query. As you've discovered, MONTH([YourColumn]) and DATEPART(MONTH,[YourColumn]) both cause your query to become non-SARGable.
Considering that all your columns, at least in your sample data, have a time of 00:00:00 this "works" to our advantage, as they are effectively just dates. This means we can easily JOIN onto a Calendar Table using something like:
SELECT dob.[YourColumn]
FROM dob
JOIN CalendarTable CT ON dob.DateOfBirth = CT.CalendarDate;
Now, if we're using the table from the above article, you will have created some extra columns (MonthNo and CDay, however, you can call them whatever you want really). You can then add those columns to your query:
SELECT dob.[YourColumn]
FROM dob
JOIN CalendarTable CT ON dob.DateOfBirth = CT.CalendarDate
WHERE CT.MonthNo = @MonthValue
AND CT.CDay = @DayValue;
This, as you can see, is a more SARGable query.
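If a calendar table is not available, a related idea (named plainly: this is a different technique from the join above) is to expand the month and day into a list of exact dates in application code and compare the indexed column against those values, for example with an IN list. The sketch below only builds that list; the function name and the year bounds are assumptions for illustration.

```python
from datetime import date


def candidate_dates(month, day, first_year, last_year):
    """List every valid date with the given month and day between first_year and last_year."""
    dates = []
    for year in range(first_year, last_year + 1):
        try:
            dates.append(date(year, month, day))
        except ValueError:
            # 29 February does not exist in this year; skip it.
            continue
    return dates


birthdays = candidate_dates(6, 21, 1940, 1990)
print(len(birthdays), birthdays[0], birthdays[-1])
```

Each value in the list is an exact date, so a predicate built from it compares the column directly and leaves it free of MONTH() and DAY().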
If you want to deal with Leap Years, you could add a little more logic using a CASE expression:
SELECT dob.[YourColumn]
FROM dob
JOIN CalendarTable CT ON dob.DateOfBirth = CT.CalendarDate
WHERE CT.MonthNo = @MonthValue
AND CASE WHEN DATEPART(YEAR, GETDATE()) % 4 != 0 AND CT.CDay = 29 AND CT.MonthNo = 2 THEN 28 ELSE CT.CDay END = @DayValue;
This treats someone's birthday on 29 February as 28 February on years that aren't leap years (when DATEPART(YEAR, GETDATE()) % 4 != 0).
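The same 29 February rule can be written out in a few lines to make the intent explicit. Note that the % 4 test is a simplification that holds for the years 1901 to 2099; the sketch below uses calendar.isleap, which also applies the century rule. The function name observed_birthday is made up for this example.

```python
import calendar
from datetime import date


def observed_birthday(dob, year):
    """Return the date on which a birthday is observed in the given year.

    A 29 February birthday is moved to 28 February in years that are not leap years.
    """
    if dob.month == 2 and dob.day == 29 and not calendar.isleap(year):
        return date(year, 2, 28)
    return date(year, dob.month, dob.day)


print(observed_birthday(date(1988, 2, 29), 2023), observed_birthday(date(1988, 2, 29), 2024))
```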
It's also probably worthwhile changing your DateOfBirth column to a date. Dates of birth aren't at a given time, only on a given date; with a date column there is no implicit conversion from datetime to date when joining to your Calendar Table.
Edit: Also, just noticed, why are you using NOLOCK? You do know what that does, right..? Unless you're happy with dirty reads and ghost data?
I have a table with a date column where date is stored in this format:
2012-08-01 16:39:17.601455+0530
How do I group or group_and_count on this column by month?
Your biggest problem is that SQLite won't directly recognize your dates as dates.
CREATE TABLE YOURTABLE (DateColumn date);
INSERT INTO "YOURTABLE" VALUES('2012-01-01');
INSERT INTO "YOURTABLE" VALUES('2012-08-01 16:39:17.601455+0530');
If you try to use strftime() to get the month . . .
sqlite> select strftime('%m', DateColumn) from yourtable;
01
. . . it picks up the month from the first row, but not from the second.
If you can reformat your existing data as valid timestamps (as far as SQLite is concerned), you can use this relatively simple query to group by year and month. (You almost certainly don't want to group by month alone.)
select strftime('%Y-%m', DateColumn) yr_mon, count(*) num_dates
from yourtable
group by yr_mon;
If you can't do that, you'll need to do some string parsing. Here's the simplest expression of this idea.
select substr(DateColumn, 1, 7) yr_mon, count(*) num_dates
from yourtable
group by yr_mon;
But that might not quite work for you. Since you have timezone information, it's sure to change the month for some values. To get a fully general solution, I think you'll need to correct for timezone, extract the year and month, and so on. The simpler approach would be to look hard at this data, declare "I'm not interested in accounting for those edge cases", and use the simpler query immediately above.
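For the fully general route, one option is to do the timezone correction in application code before grouping. The sketch below assumes every value has exactly the format shown in the question (microseconds followed by a +HHMM offset) and that months should be counted in UTC; both are assumptions, and the function names are illustrative.

```python
from collections import Counter
from datetime import datetime, timezone

# Matches values such as '2012-08-01 16:39:17.601455+0530'.
FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"


def utc_year_month(text):
    """Parse a timestamp with a UTC offset and return its 'YYYY-MM' in UTC."""
    parsed = datetime.strptime(text, FORMAT)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m")


def count_by_month(values):
    """Count timestamps per UTC year-month."""
    return Counter(utc_year_month(v) for v in values)


counts = count_by_month([
    "2012-08-01 16:39:17.601455+0530",
    "2012-08-01 02:00:00.000000+0530",  # still 31 July in UTC
])
print(dict(counts))
```

The second value shows the edge case mentioned above: its local month is August, but after correcting for the offset it falls in July.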
It took me a while to find the correct expression using Sequel. What I did was this:
Assuming a table like:
CREATE TABLE acct (date_time datetime, reward integer)
Then you can access the aggregated data as follows:
ds = DS[:acct]
ds.select_group(Sequel.function(:strftime, '%Y-%m', :date_time))
.select_append{sum(:reward)}.each do |row|
p row
end
I have data like this:
For example, today is in April 2012. Referring to the data above, I want to get the row with M_PER = 03-2012 because this month is in the range 03-2012 to 06-2012.
Edited: In this case, I want to get the rate for the currency code in use. Because today is still in April, and I want to know the rate from US Dollar (USD) to Indonesian Rupiah (IDR), I must get the row with M_PER = 03-2012 and CRR_CURRENCY_CODE = USD.
The question is what query can retrieve data like that?
Since you seem to be using quarterly values, I would use the TRUNC function with the 'Q' format model. This truncates a date to 1/1/YYYY, 1/4/YYYY, 1/7/YYYY and 1/10/YYYY, i.e. the first day of the quarter.
To fit your model which is the month at the end of the quarter, you would then have to add two months. This assumes that the MONTH_PERIOD column is a SQL date and not some other data type.
Included below is an example, using SYSDATE as the input date.
select *
from your_table
where add_months(trunc(sysdate, 'Q'),2) = month_period;
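To see what that expression evaluates to for a given day, here is the same arithmetic as a small sketch: truncate to the first month of the quarter, then move forward two months. The function name is made up for this illustration.

```python
from datetime import date


def quarter_end_month_start(d):
    """First day of the last month of d's quarter, like add_months(trunc(d, 'Q'), 2)."""
    quarter_first_month = 3 * ((d.month - 1) // 3) + 1  # 1, 4, 7, or 10
    return date(d.year, quarter_first_month + 2, 1)     # 3, 6, 9, or 12


print(quarter_end_month_start(date(2012, 4, 15)))
```

For any day in April, May, or June 2012 this gives 1 June 2012, which is the value the query compares against month_period.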
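I use ROWNUM and ORDER BY to get the value. ROWNUM is assigned before ORDER BY is applied, so the ordering has to go in a subquery.
SELECT * FROM (SELECT * FROM tables WHERE m_per > '04-2012' ORDER BY month_period ASC) WHERE ROWNUM = 1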
How would I go about doing a query that returns results of all rows that contain dates for current year and month at the time of query.
Timestamps for each row are formatted as such: yyyy-mm-dd
I know it probably has something to do with the date function and that I must somehow set a special parameter to make it spit out like such: yyyy-mm-%%.
Setting days to be wild card character would do the trick but I can't seem to figure it out how to do it.
Here is a link for quick reference to the date and time functions in MySQL:
http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html
Thanks
I think EXTRACT is the function you are looking for:
SELECT * FROM table
WHERE EXTRACT(YEAR_MONTH FROM timestamp_field) = EXTRACT(YEAR_MONTH FROM NOW())
You could extract the year and month using a function, but such a query will not be able to use an index.
If you want scalable performance, you need to do this:
SELECT *
FROM myTable
WHERE some_date_column BETWEEN '2009-01-01' AND '2009-01-31'
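The two literals in that query would normally be computed for the current month rather than typed in. A minimal sketch of that calculation, assuming the column holds plain dates so that an inclusive BETWEEN is safe (month_bounds is an illustrative name):

```python
import calendar
from datetime import date


def month_bounds(today):
    """Return the first and last day of today's month as ISO strings, both inclusive."""
    last_day = calendar.monthrange(today.year, today.month)[1]
    return today.replace(day=1).isoformat(), today.replace(day=last_day).isoformat()


first, last = month_bounds(date.today())
print(first, last)
```

The two strings can then be passed as parameters to the BETWEEN predicate.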
select * from someTable where year(myDt) = 2009 and month(myDt) = 9 and day(myDt) = 12