CONVERT various date-like strings (varchar) to one DATE field - sql

Consider a varchar field (ShipDate) that gets date-like strings written to it. These strings come from multiple third-party systems in various formats (over which I, apparently, have no control =/).
I decided to create a view that converts this varchar field to DATE so that I can query it easily (and filter out some other records / fields that I don't care about).
So far I see two formats coming in: YYYYMMDD (which is fine, I can just do a straight CONVERT) and MM/DD/YYYY, which causes an error:
Conversion failed when converting date and/or time from character string.
This changes my conversion from a simple CONVERT(DATE, ShipDate, 1) to:
CONVERT (DATE,
(CASE
WHEN ShipDate LIKE '_/__/____' THEN SUBSTRING(ShipDate, 6, 4) + '0' + SUBSTRING(ShipDate, 1, 1) + SUBSTRING(ShipDate, 3, 2)--M/DD/YYYY
WHEN ShipDate LIKE '__/_/____' THEN SUBSTRING(ShipDate, 6, 4) + SUBSTRING(ShipDate, 1, 2) + '0' + SUBSTRING(ShipDate, 4, 1)--MM/D/YYYY
WHEN ShipDate LIKE '_/_/____' THEN SUBSTRING(ShipDate, 5, 4) + '0' + SUBSTRING(ShipDate, 1, 1) + '0' + SUBSTRING(ShipDate, 3, 1)--M/D/YYYY
ELSE ShipDate --For the YYYYMMDD dates
END), 1) --End of CONVERT
Is there a better way to do the above SQL statement? I could potentially get even more date-like string formats as time goes on, so the above example could get pretty awful (I tagged this question with regex in case that could reduce the size of the case statement).
Or, is there a way to handle this problem as the records come in, avoiding the view altogether? I'm not too familiar with Triggers / SP's, but if that's a good option I'm willing to go that route =)
Or, some other method that is commonly used to solve this problem? Just curious at this point. I'm a .NET programmer, but end up helping out with SQL work because I have some experience, so I'm pretty new to anything even kind of advanced in SQL.

Don't use the date_style parameter for CONVERT. That's really for converting in the other direction. You should be able to just use: CAST(some_string AS DATE).
You might have some problems if you start getting dates in the DD/MM/YYYY format though. Of course, if they're being all mixed together then there's no way to solve that issue anyway, since even you can't know whether 4/1/2011 is April 1st or January 4th.
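If the incoming formats are known, one alternative is to state the intended format explicitly, so the result does not depend on the session's language or DATEFORMAT settings. The following is only a sketch: it assumes SQL Server 2012 or later (for TRY_CONVERT), assumes that slash-separated values are always month-first, and uses made-up sample rows in place of the real table.

```sql
-- TRY_CONVERT returns NULL instead of raising an error (SQL Server 2012+).
SELECT x.ShipDate,
       CASE
           WHEN x.ShipDate LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
               THEN TRY_CONVERT(date, x.ShipDate, 112)  -- YYYYMMDD
           WHEN x.ShipDate LIKE '%/%/%'
               THEN TRY_CONVERT(date, x.ShipDate, 101)  -- month first: M/D/YYYY, MM/DD/YYYY
       END AS ShipDateParsed                            -- NULL for anything unrecognized
FROM (VALUES ('20110512'), ('5/12/2011'), ('05/5/2012'), ('garbage')) AS x(ShipDate);
```

Rows that come back NULL are the ones that need a new WHEN branch or manual attention, which makes new formats easy to spot as they appear.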

If the known formats are always M then D, and the separators are always /, why not just parse for the slashes? Also, why are you using ,1) in your CONVERT? All of the above formats seemed to convert fine for me without it:
WITH x(ShipDate) AS
(
SELECT '5/12/2011'
UNION ALL SELECT '05/5/2012'
UNION ALL SELECT '05/05/2012'
)
SELECT CONVERT (DATE, ShipDate) FROM x;

You say you can work with YYYYMMDD?
But MM/DD/YYYY is giving you problems. Then perhaps you can do this:
CONVERT(varchar(8),CAST('MM/DD/YYYY' as datetime),112) = YYYYMMDD

My reaction would be to add a proper date column, then implement a trigger that does the conversion into that date column.
You could then manually fix up any that failed to convert, and those records would still have values, unlike the view solution.
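A rough sketch of that trigger idea follows. The table name dbo.Shipments, the key column ShipmentId, and the new column ShipDateParsed are all hypothetical, and TRY_CONVERT assumes SQL Server 2012 or later. GO is the batch separator used by SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Hypothetical table dbo.Shipments with key ShipmentId and varchar column ShipDate.
ALTER TABLE dbo.Shipments ADD ShipDateParsed date NULL;
GO

CREATE TRIGGER dbo.trg_Shipments_ParseShipDate
ON dbo.Shipments
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Skip the work when ShipDate was not part of the statement.
    IF NOT UPDATE(ShipDate) RETURN;

    UPDATE s
    SET    ShipDateParsed =
           CASE
               WHEN i.ShipDate LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
                   THEN TRY_CONVERT(date, i.ShipDate, 112)  -- YYYYMMDD
               WHEN i.ShipDate LIKE '%/%/%'
                   THEN TRY_CONVERT(date, i.ShipDate, 101)  -- month-first slash formats
           END                                              -- anything else stays NULL
    FROM   dbo.Shipments AS s
    JOIN   inserted      AS i ON i.ShipmentId = s.ShipmentId;
END;
GO
```

Rows where ShipDate is not NULL but ShipDateParsed is NULL are then the ones to fix by hand.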

Related

Compare Date saved as varchar with DateTime

I have a table with a column jsonStr of type varchar.
This is an example of an element in this column
{"Date":"/Date(1602846000000)/","person":"Laura"}
I want to compare this date with a static date. This is my query:
select *
from mytable
where json_value(jsonStr, '$.Date') >= '2020-10-01T00:00:00'
I expected one element to be displayed but no result so how can I convert this date to compare it with DateTime
I tried to remove /Date and / with substring and then Convert / Parse the result which is 1602846000000 but no result
Extracted unixtime value might be converted to datetime format through use of
DATEADD(S, CONVERT(int,LEFT(1602846000000, 10)), '1970-01-01') such as :
WITH t AS
(
SELECT *, JSON_VALUE(jsonStr, '$.Date') AS str
FROM mytable
), t2 AS
(
SELECT t.*,
SUBSTRING(str, PATINDEX('%[0-9]%', str), PATINDEX('%[0-9][^0-9]%', str + 't')
- PATINDEX('%[0-9]%', str) + 1) AS nr
FROM t
)
SELECT t2.jsonStr
FROM t2
WHERE DATEADD(S, CONVERT(int,LEFT(nr, 10)), '1970-01-01') >= '2020-10-01T00:00:00'
I would reverse this as much as possible. Every bit of work you do for this comparison must be done for every row in your table, because we don't know which rows will match until after we do the work. The more we can do to the constant value, rather than all the stored values, the more efficient the query becomes.
Parsing dates out of JSON is stupid expensive to do in the database. We can't get rid of that work completely, but we can at least convert the initial date string into the unix time format before including in the SQL. So this:
'2020-10-01T00:00:00'
becomes this:
1601510400
Now you can do some simpler string manipulation and compare the numbers, without needing to convert the unix time into a date value for every single row.
What that string manipulation will look like varies greatly depending on what version of SQL Server you have. SQL Server 2016 and later include native JSON functions such as JSON_VALUE, which can make this much easier.
But either way, you're still better off taking the time to understand the data you're storing. Even when keeping the raw JSON makes sense, you should have a schema that at least supports basic metadata on top of it. It's the difference between using an index or not, which can make a difference of multiple orders of magnitude for performance.
For example, as previously mentioned the query in this question must extract the date value for every row in your table... even the rows that won't match. If you build a schema where the date was identified as meta and extracted during the initial insert, an index could let you seek to just the rows you need. If at this point you still need to extract a value from JSON records, at least it's just for the relevant rows.
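As a sketch of the "convert the constant, not the rows" idea, the query below turns the cutoff into Unix milliseconds once and then compares numbers. It assumes every value has the exact form /Date(<13 digits>)/ as in the sample, and that JSON_VALUE and TRY_CAST are available (SQL Server 2016 or later). The variable names are made up for illustration.

```sql
DECLARE @cutoff   datetime2(0) = '2020-10-01T00:00:00';
-- Seconds since 1970-01-01, widened to bigint before scaling to milliseconds.
DECLARE @cutoffMs bigint = CAST(DATEDIFF(SECOND, '19700101', @cutoff) AS bigint) * 1000;

SELECT jsonStr
FROM   mytable
-- The digits start at position 7 of '/Date(1602846000000)/' and run for 13 characters.
WHERE  TRY_CAST(SUBSTRING(JSON_VALUE(jsonStr, '$.Date'), 7, 13) AS bigint) >= @cutoffMs;
```

This still has to parse the JSON in every row, so it only trims the per-row cost; seeking with an index still requires the date to be stored in its own column.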
I solved the problem using
DATEADD(SECOND, CONVERT(INT, LEFT(SUBSTRING(JSON_VALUE(jsonStr, '$.EndDate'), 7, 13), 10)), '19700101')

Formatting TSQL date to mm-dd-yyyy

I have a date in Datetime2 format and it is coming up as yyyy-mm-dd. Is there a way to reformat it so it is mm-dd-yyyy?
CASE
WHEN CAST(ai.[Due Date] AS DATETIME2) < GETDATE() THEN '[due]' + LEFT((ai.[Due Date]),10)
WHEN CAST(ai.[Due Date] AS DATETIME2) IS NULL THEN ' '
ELSE LEFT((ai.[Due Date]),10)
END AS [TD]
The traditional way is to use convert():
convert(varchar(10), ai.[Due Date], 110)
A more versatile method uses format():
select format(ai.[Due Date], 'MM-dd-yyyy')
You misunderstand how date values work in a database. There is no human-readable format. When you see DateTime or DateTime2 values formatted as yyyy-mm-dd what you're seeing is something shown by your debugger or query tool for convenience; the actual value used in the database is binary, not human readable, and is intended to be efficient for storage, indexing, and date arithmetic.
If you need to see a value in a specific format, you must convert() or format() it to a type in the varchar family as part of the query. Or, even better, let your application code/reporting tool do this for you, and just return the original value.
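To make that concrete, here is a small self-contained sketch using a made-up datetime2 variable rather than the real column. Both expressions return a string; the underlying value is unchanged. FORMAT requires SQL Server 2012 or later.

```sql
DECLARE @d datetime2 = '2023-03-07T14:30:00';   -- illustrative value only

SELECT CONVERT(varchar(10), @d, 110) AS via_convert,  -- 03-07-2023
       FORMAT(@d, 'MM-dd-yyyy')      AS via_format;   -- 03-07-2023
```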
I also see indication these dates are potentially stored originally in a varchar, or nvarchar column. If so, it is a major flaw in the schema design. You will get significant performance benefits and save yourself some big headaches down the road if you can start storing these values using a type from the DateTime family in the first place.
With this in mind, and because it's not clear what you're starting from, let's look at five scenarios, in order of preference:
The column already uses a type from the DateTime family, and you can let your application/reporting tool handle the format
Good for you using a real DateTime value in the schema. That's what we expect to see. Even better, suddenly everything gets really simple in your SQL and the entire snippet in the question reduces to just this:
ai.[Due Date] AS [TD]
The column already uses a type from the DateTime family, but the client system can't format
This is still pretty good. The schema is still okay, and in this case we can still simplify the original code somewhat:
COALESCE(
CASE WHEN ai.[Due Date] < GETDATE() THEN '[due] ' ELSE '' END
+ FORMAT(ai.[Due Date], 'MM-dd-yyyy')
, ' ') AS [TD]
The column uses a type from the varchar family, but you can fix it to use DateTime2
I say "fix" here, because now the schema really is broken as is. But that's okay: you can fix it. Do that. Then use the code from a previous scenario.
The column uses a type from the varchar family and you can't fix it, but at least the raw data always uses a semantic 'yyyy-MM-dd' format
Bummer. You're stuck with a broken schema. But we can at least take advantage of the well-formatted data to make things much more efficient by using cast/convert on the GETDATE() expression to match the column, rather than vice versa as it is now, like this:
WHEN ai.[Due Date] < CONVERT(varchar, GETDATE(), 120)
Now we're doing a string comparison instead of a date comparison, which is generally slower and, well, just wrong. But we can get away with it because of the nice format in the data, and the saving grace is we only need to convert the one GETDATE() value, rather than every single row we have. Moreover, this way any index on the column would still be usable. The code snippet in the question would be unable to use any index on the [Due Date] column. I know this is a SELECT clause, but this is worth remembering for the general case.
The full solution for this scenario now looks like this:
COALESCE(
CASE WHEN ai.[Due Date] < CONVERT(varchar, GETDATE(), 120) THEN '[due] ' ELSE '' END
+ FORMAT(CAST(ai.[Due Date] AS Date), 'MM-dd-yyyy')
, ' ') AS [TD]
Again, only do this if you can't get your raw column data into a DateTime format. That is what you really want here.
The column uses a type from the varchar family, you can't fix it, and the format is not semantic or not consistent
Oh boy. This is where you really don't want to be. If you can do nothing else, at least see if you can start getting consistent and semantic values into your column. At this point, we are stuck with doing extra work on every row we have (possibly more than once) for pretty much every query. Here we go:
COALESCE(
CASE WHEN CAST(ai.[Due Date] AS DATETIME2) < GETDATE() THEN '[due] ' ELSE '' END
+ FORMAT(CAST(ai.[Due Date] AS DATETIME2), 'MM-dd-yyyy')
, ' ') AS [TD]
The code doesn't look much different than other options, but the performance characteristics will be extremely different... potentially multiple orders of magnitude worse.
Remember: because of internationalization and time zone issues, converting between strings and dates is surprisingly slow and expensive. Avoid doing that whenever possible in all your queries.

Converting from mmddyyyy to yyyymmdd with SQL

I should preface my question by saying I am very new to SQL (or any programming involving databases). I started learning SQL a couple of weeks ago when I decided to take on a data project.
I have been using SSMS in wrangling large tables in comma-separated text file format. I need to be able to sort by dates, and the current format is mmddyyyy, but when I try to sort by date it does not work.
I know that converting dates is something that gets asked a lot around here, but I haven't found any solutions that explain things for a newb like myself.
So far my guesses for a solution are to use the CONVERT or CAST solutions, but I'm not sure if that is the right approach. I have found several CAST/CONVERT posts but none have really applied to my situation.
I hate to have this as my first question, but I'd thought I'd take some down vote bullets if it means I could figure this out. Thank you.
Sample of what I'm trying to do:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE [ column1] > 01012017;
I get the entire table back, unsorted.
Since your:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE [ column1] > 01012017;
does not error, we could say that the [column1]'s datatype is either a character type (like VARCHAR, char), or datetime.
Since you are getting back all the data and I would think you don't have data in the future, it should be a varchar (or char) - with datetime that means 1900-01-01 + 1012017 days.
To make it a datetime you need to 'cast' your column1 which is in mmddyyyy form by first converting it to yyyymmdd style (which would work under any date and language setting):
cast(right([column1],4)+substring([column1],1,2)+substring([column1],3,2) as datetime)
and you would write that 01012017 as a string (with quotes around it) and again in yyyymmdd format (it would be implicitly cast to datetime):
'20170101'
So your SQL becomes:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE cast(right([column1],4) +
substring([column1],1,2) +
substring([column1],3,2) as datetime) > '20170101';
Storing a date/datetime value in a varchar column and converting it like this removes the ability to use simple indexes on that column, but that is another matter. For now, this would return the data you want.
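The rearrangement can be tried on a few made-up mmddyyyy strings without touching the real table. This is only a sketch; the sample values and the aliases raw and parsed are invented for illustration.

```sql
SELECT t.raw,
       CAST(RIGHT(t.raw, 4) + LEFT(t.raw, 4) AS date) AS parsed   -- yyyy + mmdd = yyyymmdd
FROM (VALUES ('12312016'), ('01022017'), ('03152017')) AS t(raw)
WHERE CAST(RIGHT(t.raw, 4) + LEFT(t.raw, 4) AS date) > '20170101'
ORDER BY parsed;
-- Returns the rows for 01022017 and 03152017, in date order.
```

Because the strings are rebuilt as yyyymmdd before the cast, the result does not depend on the server's language or date format settings.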
Assuming your column's datatype is [Date], try something similar to:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE [column1] > '20170101'
If it's string format, you'll have to use CONVERT() to convert the column to Date with a query like
SELECT *
FROM [databasename].[dbo].[table1]
WHERE CONVERT(DATE, RIGHT([column1], 4) + LEFT([column1], 4), 112) > '20170101'
Refer to this W3Schools article if you need more help with the CONVERT clause

How do I display varchar date string with mixed format to another format?

I am using SQL server 2008 R2. I know I can use CONVERT with different format code as the third parameter to do the conversion to DATETIME first and CONVERT again to VARCHAR with another format code to change the display format.
The real problem now is I have mixed raw data in a single column. So my question is how do you write a single SELECT statement to display from mixed YYYY/MM/DD, DD/MM/YYYY all to DD/MM/YYYY?
I tried to use ISDATE(), but it thinks 31/01/2013 is not a date while 01/01/2013 is a date. Now the only thing I can think of is to check whether the YYYY is on the left or on the right to determine the correct input format, but I don't know how to write that in a single SELECT statement.
Any procedure to change the format first then do a simple SELECT is not an option. I am not allowed to change the source.
Thank you
Why not just use string manipulations? Something like:
select (case when substring(d, 5, 1) = '/' -- YYYY/MM/DD
then right(d, 2)+'/'+substring(d, 6, 2)+'/'+left(d, 4)
else d
end)
By the way, if you are choosing formats for dates when represented as strings, I highly recommend YYYY-MM-DD (or YYYY/MM/DD) because comparison operators work on them.
If you are sure that only those 2 formats (yyyy/mm/dd and dd/mm/yyyy) exist in the data, then you could probably get away with a CASE statement along the lines of:
CASE
WHEN (SUBSTRING(dateColumn, 5, 1) = '/') THEN CONVERT(datetime, dateColumn, 111)
ELSE CONVERT(datetime, dateColumn, 103)
END
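Putting that CASE inside an outer CONVERT gives the DD/MM/YYYY display string in a single SELECT. The sketch below assumes only those two formats occur and uses made-up sample rows; the column name d and alias display_value are placeholders.

```sql
SELECT t.d,
       CONVERT(varchar(10),
               CASE
                   WHEN SUBSTRING(t.d, 5, 1) = '/' THEN CONVERT(datetime, t.d, 111)  -- YYYY/MM/DD
                   ELSE CONVERT(datetime, t.d, 103)                                  -- DD/MM/YYYY
               END,
               103) AS display_value                                                 -- DD/MM/YYYY
FROM (VALUES ('2013/01/31'), ('31/01/2013'), ('01/01/2013')) AS t(d);
-- display_value: 31/01/2013, 31/01/2013, 01/01/2013
```

Going through datetime rather than pure string manipulation also means an invalid date raises an error instead of being passed through silently.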

Insert only Month and Year date to SQL table

I am using MS SQLServer and trying to insert a month/year combination to a table like this:
INSERT INTO MyTable VALUES (1111, 'item_name', '9/1998')
apparently, the above command cannot work since
Conversion failed when converting date and/or time from character string.
Because 9/1998 is a bad format. I want to fix this and this column of the table will show something like:
9/1998
12/1998
(other records with month/year format)
...
Can someone help me with this?
thank you
SQL Server only supports full dates (day, month, and year) or datetimes, as you can see over on the MSDN data type list: http://msdn.microsoft.com/en-us/library/ff848733(v=sql.105).aspx
You can use a date value with a bogus day or store the value as a string, but there's no native type that just stores month/year pairs.
I see this is an old post but my recent tests confirm that storing Date or splitting the year and month to two columns (year smallint, month tinyint) results in the overall same size.
The difference will be visible when you actually need to parse the date to the filter you need (year/month).
Let me know what do you think of this solution! :)
Kind regards
You can just use "01" for the day:
INSERT INTO MyTable VALUES (1111, 'item_name', '19980901')
You can:
1) Change the column type to varchar
2) Take the supplied value and convert it to a proper format that SQL Server will accept before inserting, and format it back to 'M/YYYY' format when you pull the data: SELECT CAST(MONTH([myDate]) AS varchar(2)) + '/' + CAST(YEAR([myDate]) AS varchar(4)) ...
You may want to consider what use you will have for your data. At the moment, you're only concerned with capturing and displaying the data. However, going forward, you may need to perform date calculations on it (ie, compare the difference between two records). Because of this and also since you're about two-thirds of the way there, you might as well convert this field to a Date type. Your presentation layer can then be delegated with the task of displaying it appropriately as "MM/yyyy", a function which is available to just about any programming language or reporting platform you may be using.
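As a sketch of that round trip, the snippet below turns a supplied 'M/YYYY' string into a real date on the first of the month and formats it back for display. The variable names are made up, and it assumes the input always has a one- or two-digit month, a slash, and a four-digit year.

```sql
DECLARE @input varchar(7) = '9/1998';                  -- month/year as supplied
DECLARE @slash int = CHARINDEX('/', @input);

-- Build an unambiguous yyyymmdd string with day 01, then convert it.
DECLARE @stored date = CAST(
    RIGHT(@input, 4)                                   -- yyyy
    + RIGHT('0' + LEFT(@input, @slash - 1), 2)         -- mm, zero-padded
    + '01' AS date);                                   -- dd

SELECT @stored AS stored_value,                        -- 1998-09-01
       CAST(MONTH(@stored) AS varchar(2)) + '/'
       + CAST(YEAR(@stored) AS varchar(4)) AS display_value;  -- 9/1998
```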
If you want to use the date type, you should format the value when you select it:
declare @a date
SELECT @a='2000-01-01'
select RIGHT( convert (varchar , @a, 103), 7) AS 'mm/yyyy'
If you want to make a query like SELECT * FROM...
you should use varchar instead of the date type.