I am designing a database for use with a Ruby on Rails application. For a given object, I need to access the date of an event in both the Gregorian format and the Hebrew calendar equivalent. I can easily convert between the two formats, but the issue is that in the Hebrew calendar, the date changes at sunset, not midnight. Therefore, I'll need to either store two separate dates, or store a Gregorian date and a separate boolean field, after_sunset. Then, whenever I need to access the Hebrew date, I'll need to query for both fields, convert the date, and if after_sunset==true, increment the date.
Which of these options is considered "better"?
And, if I store the Hebrew date separately, is it best to store it as a String, an Integer, or can I use a regular Date?
With an after_sunset flag you store a Gregorian date and add all the additional information needed to know the Hebrew date.
With two dates you would store the two dates explicitly. However, to keep the data consistent you would need a check constraint ensuring that the dates match. This is because the two dates share part of their information (redundancy), which means the data is not normalized.
For this reason, to keep the data in your database normalized (and thus not have to install a check constraint to keep it consistent), the first approach is better: store the date plus an after_sunset flag.
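A minimal sketch of the flag approach, written in Python rather than Ruby for brevity. The names `civil_date_for_hebrew_conversion` and `to_hebrew` are hypothetical; the actual Gregorian-to-Hebrew conversion is assumed to come from whatever routine the application already has.

```python
from datetime import date, timedelta

def civil_date_for_hebrew_conversion(gregorian: date, after_sunset: bool) -> date:
    """Return the Gregorian date whose daytime falls on the same Hebrew day.

    After sunset the Hebrew date has already rolled over, so the event
    belongs to the Hebrew day that overlaps the next civil day.
    """
    return gregorian + timedelta(days=1) if after_sunset else gregorian

# Hypothetical usage; to_hebrew() stands in for a real conversion routine:
# hebrew = to_hebrew(civil_date_for_hebrew_conversion(event_date, after_sunset))
```

Only the Gregorian date and the boolean are stored; the Hebrew date is always derived, so the two can never disagree.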
Store the date in UTC, and also store it in Unix format.
You can use a conversion function based on the type.
This will allow your database to support other date-time formats easily in the future.
Unless you are going back to the dawn of time, I think I would simply have a many-year lookup table of UTC datetimes and Hebrew dates where the UTC column is the first second of the Hebrew day in a specific time zone (Greenwich?).
Conversions are a quick binary search:
SELECT hebrew_date FROM hebrew_gregorian_lookup
WHERE some_input_time >= gregorian_cutoff
ORDER BY gregorian_cutoff DESC LIMIT 1;
If you index and cluster the lookup table on gregorian_cutoff, it should be very quick, even for 100 years. (If your RDBMS has a way to force a table into RAM, even better.) Also depending on your RDBMS, you may be able to wrap this in a function/procedure with no loss of efficiency.
I suggest storing the Hebrew date not as a string but as a record of three shorts, day, month, year. You can have a tiny lookup table for month to string, or perhaps use an enumeration. That will give you some flexibility in formatting, e.g., Hebrew characters vs. Latin in the output.
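The lookup-table idea can be sketched in memory as follows. This is only an illustration, assuming Python: the cutoff times are placeholders rather than real sunset times for any location, the month numbers assume Nisan-based numbering (Elul = 6, Tishrei = 7), and all names are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

# (cutoff_utc, (day, month, year)) rows sorted by cutoff.
# The 15:00 cutoffs are illustrative placeholders, not real sunset times.
LOOKUP = [
    (datetime(2024, 10, 1, 15, 0), (29, 6, 5784)),  # 29 Elul 5784
    (datetime(2024, 10, 2, 15, 0), (1, 7, 5785)),   # 1 Tishrei 5785
    (datetime(2024, 10, 3, 15, 0), (2, 7, 5785)),   # 2 Tishrei 5785
]
CUTOFFS = [cutoff for cutoff, _ in LOOKUP]
MONTH_NAMES = {6: "Elul", 7: "Tishrei"}  # tiny month-to-string lookup

def hebrew_date_at(moment_utc: datetime) -> tuple[int, int, int]:
    """Binary-search for the last row whose cutoff is <= moment_utc."""
    index = bisect_right(CUTOFFS, moment_utc) - 1
    if index < 0:
        raise ValueError("moment is before the first row of the lookup table")
    return LOOKUP[index][1]

def format_hebrew(day_month_year: tuple[int, int, int]) -> str:
    """Render the three stored integers using the month lookup."""
    day, month, year = day_month_year
    return f"{day} {MONTH_NAMES[month]} {year}"
```

Keeping day, month, and year as separate integers leaves the output formatting (Latin or Hebrew characters) to the lookup used at display time.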
Related
Let's say I have a table Student with columns Name, DOJ, TOJ.
In order to enter the date in mm/dd/yyyy format and the timestamp in hh24:mi:ss format, I used ALTER SESSION SET NLS_DATE_FORMAT='MM/DD/YYYY' and ALTER SESSION SET NLS_TIMESTAMP_FORMAT='HH24:MI:SS', but I want to know an alternative way to enter values in this format without involving the session. Please guide me through this.
We store dates/times either in a DATE column (which is Oracle's inappropriate name for a datetime) or a TIMESTAMP column (which has more precision and can handle timezones, too). These types have no format. This is important, because thus comparing and sorting them in the database works fine, because the DBMS knows how to handle datetimes, while the users see the date in their format. I, for instance, have set my Windows to German, so I will see the datetimes in some German format, while you will see the same date in a format you are used to.
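To illustrate the point that a stored datetime has no format of its own, here is a small Python sketch (not Oracle code) in which a single value is rendered in two display styles. The variable names and the two format strings are just examples.

```python
from datetime import datetime

stored = datetime(2021, 2, 25, 13, 28, 56)  # one value, no inherent format

# Formatting happens only at display time; the stored value is unchanged.
german_style = stored.strftime("%d.%m.%Y %H:%M:%S")
us_style = stored.strftime("%m/%d/%Y %H:%M:%S")

print(german_style)
print(us_style)
```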
There are few situations where you want to store date and time separately. The reason is typically that you can set them null. A date without a time means "the whole day". A time without a date means "every day this time". But then you often want this even more advanced anyway ("every Tuesday and Wednesday", "every December 24", etc.), for which you need something more sophisticated than just date and time split into two.
So, usually we just store date and time combined. For a precision down to seconds we use DATE. If we wanted them separately we'd have to think of an appropriate data type, because Oracle has no real date type and no time type either. So we'd either use DATE and ignore the date part or the time part or we use strings. The former solution is usually preferred, because you cannot mistakenly enter invalid data like February 30 or 23:66:00.
If you want to store formatted dates and times, you are talking about strings. I don't recommend this. Strings are not the appropriate data types for dates and times. And a format '01/02/2000' is ambiguous. Some people will read this as February 1, others as January 2.
If you wanted to do this, you would have to change the column types to VARCHAR2 and simply store values like '02/25/2021' and '13:28:56'.
But if you wanted to sort your data, compare dates/times, or just show them to a foreign user in a format they are used to, you would always have to convert them. E.g.:
select *
from mytable
order by to_date(doj || ' ' || toj, 'mm/dd/yyyy hh24:mi:ss');
I am afraid that to change the default format you have no other option but to change NLS_DATE_FORMAT, either at the database level or at the session level.
But if your purpose is to show the date in a specific format in the front end, then you can use the to_char() function as below:
SELECT NAME, to_CHAR(DOJ,'dd/mm/yyyy'),to_CHAR(TOJ,'HH24:MI:SS') FROM table
To change the default date format at the system level:
ALTER SYSTEM SET NLS_DATE_FORMAT='DD/MM/YYYY' scope=both;
You can also change the default date format at startup time with a login trigger to change the nls_date_format:
CREATE OR REPLACE TRIGGER CHANGE_DATE_FORMAT
AFTER LOGON ON DATABASE
BEGIN
  DBMS_SESSION.SET_NLS('NLS_DATE_FORMAT', '''YYYYMMDD'''); -- the mask needs its own embedded quotes
END;
/
I need to verify that all cells in a column contain data only in date format. How is it possible to verify this?
*I don't think the LIKE function is the way to do it.
DATE doesn't have any format. What you see is for display purpose so that it could be easily interpreted.
The DATE data type is stored internally in a proprietary 7-byte format. It makes no sense to verify the format of a value that is stored in an internal format. As I said, format is only for display.
If the date column is not a DATE data type, then it is a design flaw, and any application based on such a flawed database design is liable to break at any time.
Storing date values in anything other than a DATE data type shows a misunderstanding of the basics.
You should first fix the design to get a permanent solution. Any solution to your question is just another workaround.
Let me show a small example how it creates even more confusion.
The following date :
01/02/2015
Is it:
1st Feb 2015 or,
2nd Jan 2015
There is no way to tell. The first field could be either DD or MM. This is just one of many problems caused by the incorrect data type.
Store date values as DATE data type only, period.
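The ambiguity described above is easy to reproduce. The following Python sketch (an illustration, not Oracle code) parses the same string with both readings and gets two different dates; the variable names are arbitrary.

```python
from datetime import datetime

raw = "01/02/2015"

# The same text is a valid date under both interpretations.
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()    # 1st Feb 2015
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # 2nd Jan 2015

print(as_day_first.isoformat(), as_month_first.isoformat())
```

Neither parse fails, so nothing in the stored string can tell you which reading was intended.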
Based on your last question, I think you are looking for something like this:
SELECT COUNT(*) FROM ...
WHERE NOT REGEXP_LIKE (A, '^XXX/MOSCOW/XXXMSX/[0-9]{4}-[0-9]{2}-[0-9]{2}$')
If count is greater than zero, something doesn't match. If you want more detail on what doesn't match, change your SELECT clause appropriately.
If you are looking for multiple date formats, you can change your regular expression appropriately. The | operator in most flavors of regular expression, including Oracle's, lets you define multiple patterns in the same space. You might use something like
SELECT COUNT(*) FROM ...
WHERE NOT
REGEXP_LIKE (A,
'^XXX/MOSCOW/XXXMSX/[0-9]{4}-[0-9]{2}-[0-9]{2}$|^[0-9]{4}-[0-9]{2}-[0-9]{2}$')
adding as many different matching patterns as you need.
Try
SELECT *
FROM POL
WHERE NOT REGEXP_LIKE(TR_KRY, '^(0[1-9]|([1-2][0-9])|30|31)-(([0][1-9])|10|11|12)-[0-9]{4}$')
This will return you all rows where TR_KRY is not formatted as 'DD-MM-YYYY', where DD is '01'-'31', MM is '01'-'12', and YYYY is any four numeric digits.
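One caveat with a pattern like this is that it checks only the shape of the string, so an impossible date such as 31-02-2015 still matches. The Python sketch below mirrors the same pattern and adds a calendar check on top; the function names are hypothetical and this is not a drop-in replacement for the Oracle query.

```python
import re
from datetime import datetime

# Same shape as the pattern above: DD in 01-31, MM in 01-12, four-digit year.
DD_MM_YYYY = re.compile(r"(0[1-9]|[12][0-9]|30|31)-(0[1-9]|10|11|12)-[0-9]{4}")

def shape_ok(text: str) -> bool:
    """True when the text looks like DD-MM-YYYY."""
    return DD_MM_YYYY.fullmatch(text) is not None

def is_real_date(text: str) -> bool:
    """True when the text has the right shape and names a real calendar day."""
    if not shape_ok(text):
        return False
    try:
        datetime.strptime(text, "%d-%m-%Y")
    except ValueError:
        return False
    return True
```

For example, `shape_ok("31-02-2015")` passes while `is_real_date("31-02-2015")` does not, which is one more argument for a proper DATE column.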
As others have said, storing dates as character strings is not a good idea. In the field you're looking at, it might be that the date is stored as DD-MM-YYYY (day-month-year - the usual case in Europe and perhaps elsewhere), or it might be that the date is stored as MM-DD-YYYY (month-day-year - a common practice in the US). If possible, I suggest you should convert this field to the DATE data type so that the TO_CHAR function can be used to produce a text version of the date in whatever format is desired.
Given the example data you've shown in comments (and that's also not good practice - you should go back and edit the question when you want to include additional information) it appears the dates are formatted as DD-MM-YYYY and I've set up the regular expression above to deal with this as best as possible.
Does SQL's built-in DateTime type have any merits over the nvarchar type?
If it were you, which one would you use?
I need to store dates in my SQL Server database and I'm curious to know which one is better and why.
I also want to know what happens if, for example, I store dates as string literals (I mean nvarchar). Do they take longer to search? Or are they the same in terms of performance?
And one last question: how can I send a date from my C# application to a SQL field of type DateTime? Is it any different from C#'s DateTime?
You're given a date data type for a reason, so why would you not use it?
What happens when you store "3/2/2012" in a text field? Is it March 2nd? Is it February 3rd?
Store the date in a date or datetime field, and do any formatting of the date after the fact.
EDIT
If you have to store dates like 1391/7/1, your choices are:
Assuming you're using SQL Server 2008 or greater, use the datetime2 data type; it allows dates earlier than 1753/01/01 (which is what datetime stops at).
Assuming you're using SQL Server 2005 or earlier, store the dates as Gregorian calendar dates, and then in your application, use date/time functions to convert the date and time to the Farsi calendar.
Use the correct data type (date/datetime/datetime2, dependent on version and requirement for a time component).
Advantages are more compact storage than storing as a string (especially nvarchar as this is double byte). Built in validation against invalid dates such as 30 February. Sorts correctly. Avoids the need to cast it back to the correct datatype anyway when using date functions on it.
If I'm storing a DateTime value, and I expect to perform date-based calculations on it, I'll use a DateTime.
Storing Dates as strings (varchars) introduces a variety of logistical issues, not the least of which is rendering the date in a proper format. Again, that bows in favor of DateTime.
I would go with the DateTime since you can use various functions on it directly.
A string wouldn't be too much of a hassle, but you would have to cast the data each time you want to do something with it.
There is no real performance difference when searching on either type of field, so going with DateTime is better than strings when working with date values.
You must realise that the datetime data type, like other data types, is provided for a reason, and you should use the data type that represents your data clearly. Besides this, you gain all the functionality and operations that are specific to the datetime data type.
One of the biggest gains is correct sorting of data, which is not directly possible if you use nvarchar as your data type. Even if you think you don't need sorting right now, there will be a time in the future when it will be useful.
Date validation is also something you will benefit from. There is no confusion about the stored date format, i.e. dd/mm or mm/dd.
A lot has been discussed on this subject. There is a good post on the SQLCentral forum about this particular subject, DateTime or nvarchar.
In short, nvarchar is twice as long as datetime, so it takes more space and, over the long run, any action affecting it will be slower. You will also have some validation issues, among others.
How would you store a time or time range in SQL?
It won't be a datetime because it will just be let's say 4:30PM (not, January 3rd, 4:30pm).
Those would be weekly, or daily meetings.
The type of queries that I need are of course be for display, but also later will include complex queries such as avoiding conflicts in schedule.
I'd rather pick the best datatype for that now.
I'm using MS SQL Server Express 2005.
Thanks!
Nathan
Personally I would find this a reason to upgrade to 2008 which has a separate time datatype.
I would recommend still using a DateTime data type and ignoring the date values--ideally using the static MinDate for SQL (Google it). This will give you the benefits of working with a strongly typed field and the only cost will be a few extra bytes.
As for ranges, store them in two separate columns. Then you can subtract one from the other to determine the difference.
Edit: did some Googling.
SQL Server 2008 adds a Time data type, so you might want to consider that.
You can use SQL 2005's DateTime type and combine it with the CONVERT function to extract just the HH:MM:SS.MMM:
SELECT CONVERT(VARCHAR(12), GETDATE(), 114) AS [HH:MI:SS(24H)]
(Found on this handy-dandy page)
Different SQL versions support different minimum dates. You could use a static date that will be supported by all, such as 1/1/2000, or you could use SQL 2005's minimum value of 1/1/1753 and append the time values to that static day.
So if you stick with 2005, pick your static date, like 1/1/2000, and store your times on it. So 1m:30s would be 2000-1-1 00:01:30.000, and 1h:15m would be 2000-1-1 01:15:00.000
You can then do Date2 - Date1 and get your result of (1h:15m - 1m:30s) as a datetime whose time part is 01:13:30.000. CONVERT it and you'll have 1:13:30.
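The subtraction can be checked with a short Python sketch (not T-SQL): both times are pinned to the same arbitrary anchor date, so the difference depends only on the time parts. The names `ANCHOR_DAY`, `start`, and `end` are illustrative.

```python
from datetime import datetime

ANCHOR_DAY = (2000, 1, 1)  # arbitrary fixed date; only the time part carries meaning

start = datetime(*ANCHOR_DAY, 0, 1, 30)  # 1m:30s stored on the anchor day
end = datetime(*ANCHOR_DAY, 1, 15, 0)    # 1h:15m stored on the anchor day

elapsed = end - start  # a timedelta, independent of the anchor date
print(elapsed)
```

One hour fifteen minutes minus one minute thirty seconds is 1:13:30, i.e. 4410 seconds.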
You could store it as an int as 24 hour time and format as needed.
Or store it as a datetime with some fixed date and remove it as needed for display:
Jan 1 2000 4:30PM
I would go with datetime field as it gives you the power of all the datetime related functionality.
You might want to consider storing it as an int column representing the number of minutes since midnight. In your entity you could expose this as a TimeSpan (or int) representing the same thing. You'd only need to convert between your display values (time format) and the database value (minutes) in order to perform your queries and this could easily be done in your entity (TimeSpan.TotalMinutes, for example).
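A minimal sketch of the minutes-since-midnight idea, in Python rather than C#, including the kind of conflict check mentioned in the question. All function names are hypothetical, and ranges are assumed to be half-open (a meeting ending at 17:30 does not conflict with one starting at 17:30).

```python
def to_minutes(hour: int, minute: int) -> int:
    """Convert a wall-clock time to minutes since midnight (the stored value)."""
    return hour * 60 + minute

def as_clock(minutes: int) -> str:
    """Format stored minutes back to a 24-hour HH:MM string for display."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

def overlaps(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """Two half-open ranges conflict when each starts before the other ends."""
    return start_a < end_b and start_b < end_a

standup = (to_minutes(16, 30), to_minutes(17, 30))
review = (to_minutes(17, 0), to_minutes(18, 0))
print(overlaps(*standup, *review))
```

Because the stored values are plain integers, the same comparison translates directly into a WHERE clause on the two columns.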
To me it sounds like you're developing a type of meeting scheduler, or something to display the meetings.
I think I would set it up with two columns, MeetingStart and MeetingEnd, both as datetime fields. This way, you can determine the length of the meeting, and since you already have the date you can easily use it to display the meeting on a calendar or something.
I'm supporting an existing application written by another developer, and I have a question as to whether the data type the developer chose to store dates is affecting the performance of certain queries.
Relevant information: The application makes heavy use of a "Business Date" field in one of our tables. The data type for this business date is nvarchar(10) rather than a datetime data type. The format of the dates is "MM/DD/YYYY", so Christmas 2007 is stored as "12/25/2007".
Long story short, we have some heavy duty queries that run once a week and are taking a very long time to execute.
I'm re-writing this application from the ground up, but since I'm looking at this, I want to know if there is a performance difference between using the datetime data type compared to storing dates as they are in the current database.
You will both save disk-space and increase performance if you use datetime instead of nvarchar(10).
If you use the date-fields to do date-calculation (DATEADD etc) you will see a massive increase in query-execution-speed, because the fields do not need to be converted to datetime at runtime.
Operations over DATETIMEs are faster than over VARCHARs converted to DATETIMEs.
If your dates appear anywhere but in SELECT clause (like, you add them, DATEDIFF them, search for them in WHERE clause etc), then you should keep them in internal format.
There are a lot of reasons you should actually use DateTime rather than a varchar to store a date. Performance is one, but I would be concerned about queries like this:
SELECT *
FROM Table
WHERE DateField > '12/25/2007'
giving you the wrong results.
I cannot back this up with numbers, but the datetime-type should be a lot faster, since it can easily be compared, unlike the varchar. In my opinion, it is also worth a shot to look into UNIX timestamps as your data type.
I believe from an architectural perspective a Datetime would be a more efficient data type, as it is stored as two 4-byte integers, whereas your nvarchar(10) will be stored as up to 22 bytes (two times the number of characters entered, plus 2 bytes). Therefore potentially more than double the amount of storage space is required now in comparison to using a Datetime.
This of course has possible implications for indexing, as the smaller the data item, the more records you can fit on an index data page. This in turn produces a smaller index which is of course quicker to traverse and therefore will return results faster.
In summary, Datetime is the way to go.
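The storage arithmetic quoted above can be written out as a tiny Python sketch. The byte counts are the ones given in the answer (two 4-byte integers for datetime; two bytes per character plus two bytes for nvarchar); the names are illustrative.

```python
DATETIME_BYTES = 4 + 4  # two 4-byte integers, per the answer above

def nvarchar_bytes(text: str) -> int:
    """Two bytes per character entered, plus 2 bytes of overhead."""
    return 2 * len(text) + 2

string_size = nvarchar_bytes("12/25/2007")
print(string_size, DATETIME_BYTES, string_size / DATETIME_BYTES)
```

For a full ten-character date string this comes to 22 bytes against 8.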
Date filtering on the nvarchar field is not easily possible, as the data in the index is sorted lexicographically, which doesn't match the ordering you would expect for dates. The problem is the date format "mm/dd/yyyy": "12/25/2007" will come after "12/01/2008" in an nvarchar index, but that's not what you want. "yyyy/mm/dd" would have been fine.
So, you should use a date field and convert the string values to date. You will surely get a big performance boost. That's if you can change the table schema.
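The ordering problem is easy to demonstrate. This Python sketch (an illustration of string versus date comparison, not SQL Server code) sorts the same "mm/dd/yyyy" values as text and as parsed dates; the sample values are made up for the example.

```python
from datetime import datetime

stored = ["12/25/2007", "01/15/2008", "12/01/2008"]

# Text comparison goes character by character, so the month dominates the year.
text_order = sorted(stored)

# Parsing first gives the chronological order a date column would give.
date_order = sorted(stored, key=lambda s: datetime.strptime(s, "%m/%d/%Y"))

# A filter like DateField > '12/25/2007' misses January 2008 when compared as text.
later_as_text = "01/15/2008" > "12/25/2007"

print(text_order, date_order, later_as_text)
```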
Yes. datetime will be far more efficient for date calculations than varchar or nvarchar (why nvarchar - there's no way you've got real unicode in there, right?). Plus strings can be invalid and misinterpreted.
If you are only using the date part, your system may have a smaller date-only version of datetime.
In addition, if you are just doing joins and certain types of operations (>/</= comparisons but not datediff), a date "id" column which is actually an int of the form yyyymmdd is commonly used in data warehouses. This does allow "invalid" dates, unfortunately, but it also allows more obvious reserved, "special" dates, whereas in datetime, you might use NULL or 1/1/1900 or something. Integrity is usually enforced through a foreign key constraint to a date "dimension."
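A small Python sketch of the yyyymmdd integer key, assuming hypothetical helper names. It shows that the keys compare in date order and that an "invalid" key is only detected when it is converted back to a real date.

```python
from datetime import date

UNKNOWN_DATE_KEY = 0  # hypothetical reserved "special" value

def to_date_key(d: date) -> int:
    """Pack a date into an int of the form yyyymmdd."""
    return d.year * 10000 + d.month * 100 + d.day

def from_date_key(key: int) -> date:
    """Unpack a yyyymmdd key; raises ValueError for keys like 20090230."""
    year, rest = divmod(key, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)

christmas = to_date_key(date(2007, 12, 25))
new_year = to_date_key(date(2008, 1, 1))
print(christmas < new_year)  # integer comparison follows date order
```

Nothing stops a value such as 20090230 from being stored in an int column, which is why the foreign key to a date dimension matters.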
Seeing that you tagged the question as "sql server", I'm assuming you are using some version of SQL Server, so I recommend that you look at either using datetime or smalldatetime. In addition, in SQL Server 2008, you have a date type as well as a datetime2 with a much larger range. Check out this link which gives some details
One other problem with using varchar (or any other string data type) is that the data likely contains invalid dates, as they are not automatically validated on entry. If you try to change the field to a datetime field, you may have conversion problems where people have added dates such as ASAP, Unknown, 1/32/2009, etc. You will need to check for dates that won't convert using the handy isdate function and either fix or null them out before you try to change the data type.
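The clean-up step can be sketched outside the database as well. The Python below is an analogue of the check described above, not SQL Server's isdate function: it splits raw strings into those that parse as "mm/dd/yyyy" and those that need fixing or nulling. The function name and sample values are illustrative.

```python
from datetime import datetime

def split_convertible(values, fmt="%m/%d/%Y"):
    """Return (parsed_dates, rejected_strings) for the given raw values."""
    good, bad = [], []
    for value in values:
        try:
            good.append(datetime.strptime(value, fmt).date())
        except ValueError:
            bad.append(value)  # needs manual repair or NULL before migrating
    return good, bad

good, bad = split_convertible(["12/25/2007", "ASAP", "Unknown", "1/32/2009"])
print(good, bad)
```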
Likely you also have a lot of code that converts the varchar type to date datatype on the fly so that you can do date math as well. All that code will also need to be fixed.
Chances are the datetime type is both more compact and faster, but more importantly, using DATETIMEs to store a date and time is a better architectural choice. You're less likely to run into weird problems looking for records within a certain date range, and most database libraries will map them to your language's Date type, so the code is much cleaner, which is really much more important in the long run.
Even if it were slower, you'd spend more time debugging the strings-as-dates than all your users will ever see in savings combined.