How to find the value that makes a SELECT conversion fail - sql

I have a query on a core data column which is nvarchar, and all values are in '00:00:00' format. I want to convert it into a long (bigint). When I convert only the top 1000 rows it works fine, but it fails when I convert all values. The query is shown below:
SELECT DATEDIFF(second, '00:00', CAST(TimeSpent AS time(7)))* cast(1000 as bigint) + RIGHT(CAST(TimeSpent AS time(7)),7) FROM [mtr].[MatterDocument]
The error statement is
Conversion failed when converting date and/or time from character string
How can I find which value failed to convert?

I suggest that there is some bad data in your MatterDocument table. SQL Server does not support regex searches, but fortunately its LIKE operator does support simple character-class patterns, which we can use:
SELECT *
FROM [mtr].[MatterDocument]
WHERE TimeSpent NOT LIKE '[01][0-9]:[0-5][0-9]:[0-5][0-9]' AND
TimeSpent NOT LIKE '2[0-3]:[0-5][0-9]:[0-5][0-9]';
The above query flushes out bad, unacceptable time strings. It should also flush out strings which aren't time values at all and somehow made it into your table.
The best long term fix would be to correct your data at its source, and then bring the data into SQL Server as a bona fide date/time type.
Edit: TRY_CAST, as described by @Denis in his answer, might be another approach. But this would require SQL Server 2012 or later. The above query should still work in earlier versions.
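To see what the two LIKE patterns above catch, here is a minimal sketch against a handful of made-up rows. The sample values and the alias s are assumptions for illustration; only the column name TimeSpent comes from the question.

```sql
-- Hypothetical sample rows standing in for [mtr].[MatterDocument].
SELECT s.TimeSpent
FROM (VALUES (N'00:00:00'),
             (N'23:59:59'),
             (N'24:00:00'),   -- hour out of range
             (N'12:60:00'),   -- minute out of range
             (N'n/a')         -- not a time at all
     ) AS s(TimeSpent)
WHERE s.TimeSpent NOT LIKE '[01][0-9]:[0-5][0-9]:[0-5][0-9]'
  AND s.TimeSpent NOT LIKE '2[0-3]:[0-5][0-9]:[0-5][0-9]';
-- Rows returned: 24:00:00, 12:60:00, n/a
```

Note that NULL values are not returned by this filter, since NOT LIKE against NULL is unknown. That does not matter here because casting NULL does not raise a conversion error.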

Try using the TRY_CAST function to find the bad rows (it returns NULL if it cannot convert the value):
SELECT c.TimeSpent /*, any columns to identify rows */
FROM (
SELECT TimeSpent, /*Any columns to identify rows */
DATEDIFF(second, '00:00', TRY_CAST(TimeSpent AS time(7)))* cast(1000 as bigint)
+ RIGHT(TRY_CAST(TimeSpent AS time(7)),7) AS Converted
FROM [mtr].[MatterDocument]
) c
WHERE Converted IS NULL

You should find the bad values:
select timespent
from t
where try_cast(TimeSpent AS time(7)) is null;
This will enable you to find the bad values. They are probably times whose hour part exceeds 23.
I would suggest doing the conversion more simply:
select (left(TimeSpent, 2) * 60 * 60 +
substring(TimeSpent, 4, 2) * 60 +
right(TimeSpent, 2)
) as seconds
from t
This will do the conversion without the limitations of the SQL Server time data type.
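As a rough illustration of both points above, the sketch below runs TRY_CAST and the arithmetic conversion side by side on two made-up values. The sample rows and the aliases are assumptions. The explicit CAST to bigint is my addition, so that multiplying by 1000 for milliseconds later cannot overflow.

```sql
-- Hypothetical sample rows; TimeSpent is the only name taken from the question.
SELECT s.TimeSpent,
       TRY_CAST(s.TimeSpent AS time(7)) AS AsTime,   -- NULL when the string is not a valid time
       CAST(LEFT(s.TimeSpent, 2) AS bigint) * 3600
         + CAST(SUBSTRING(s.TimeSpent, 4, 2) AS bigint) * 60
         + CAST(RIGHT(s.TimeSpent, 2) AS bigint) AS Seconds
FROM (VALUES (N'01:30:15'), (N'25:00:00')) AS s(TimeSpent);
-- 01:30:15 -> AsTime is a valid time, Seconds = 5415
-- 25:00:00 -> AsTime is NULL,         Seconds = 90000
```

The arithmetic version still fails on strings that are not digits at all, so it is worth running the TRY_CAST or LIKE check first to see what kind of bad data you actually have.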

Related

Compare Date saved as varchar with DateTime

I have a table with a column jsonStr of type varchar.
This is an example of an element in this column
{"Date":"/Date(1602846000000)/","person":"Laura"}
I want to compare this date with a static date. This is my query:
select *
from mytable
where json_value(jsonStr, '$.Date') >= '2020-10-01T00:00:00'
I expected one element to be displayed, but I get no result. How can I convert this date so I can compare it with a DateTime?
I tried removing /Date( and )/ with substring and then using Convert / Parse on the result, which is 1602846000000, but that returned nothing either.
The extracted unix time value can be converted to datetime with
DATEADD(S, CONVERT(int,LEFT(1602846000000, 10)), '1970-01-01'), for example:
WITH t AS
(
SELECT *, JSON_VALUE(jsonStr, '$.Date') AS str
FROM mytable
), t2 AS
(
SELECT t.*,
SUBSTRING(str, PATINDEX('%[0-9]%', str), PATINDEX('%[0-9][^0-9]%', str + 't')
- PATINDEX('%[0-9]%', str) + 1) AS nr
FROM t
)
SELECT t2.jsonStr
FROM t2
WHERE DATEADD(S, CONVERT(int,LEFT(nr, 10)), '1970-01-01') >= '2020-10-01T00:00:00'
I would reverse this as much as possible. Every bit of work you do for this comparison must be done for every row in your table, because we don't know which rows will match until after we do the work. The more we can do to the constant value, rather than all the stored values, the more efficient the query becomes.
Parsing dates out of JSON is stupid expensive to do in the database. We can't get rid of that work completely, but we can at least convert the initial date string into the unix time format before including it in the SQL. So this:
'2020-10-01T00:00:00'
becomes this:
1601510400
Now you can do some simpler string manipulation and compare the numbers, without needing to convert the unix time into a date value for every single row.
What that string manipulation will look like varies greatly depending on what version of Sql Server you have. Sql Server 2016 and later have native JSON functions such as JSON_VALUE, which make this much easier.
But either way, you're still better off taking the time to understand the data you're storing. Even when keeping the raw json makes sense, you should have a schema that at least supports basic metadata on top of it. It's the difference between using an index or not, which can make multiple orders of magnitude difference for performance.
For example, as previously mentioned, the query in this question must extract the date value for every row in your table... even the rows that won't match. If you build a schema where the date is identified as metadata and extracted during the initial insert, an index could let you seek to just the rows you need. If at that point you still need to extract a value from JSON records, at least it's just for the relevant rows.
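The idea of converting the constant rather than every stored value could look roughly like the sketch below. It assumes the stored value is always a 13-digit millisecond count in the /Date(...)/ wrapper and that both sides are in UTC. The variable names and the one-row sample are made up for illustration.

```sql
-- Convert the constant once: seconds since 1970-01-01, then to milliseconds.
DECLARE @cutoff datetime2(0) = '2020-10-01T00:00:00';
DECLARE @cutoffMs bigint = CAST(DATEDIFF(SECOND, '19700101', @cutoff) AS bigint) * 1000;
-- @cutoffMs is now 1601510400000

-- Hypothetical one-row sample; jsonStr and the JSON shape come from the question.
SELECT s.jsonStr
FROM (VALUES ('{"Date":"/Date(1602846000000)/","person":"Laura"}')) AS s(jsonStr)
WHERE CAST(SUBSTRING(JSON_VALUE(s.jsonStr, '$.Date'), 7, 13) AS bigint) >= @cutoffMs;
-- The sample row is returned, since 1602846000000 >= 1601510400000
```

Each row still pays for JSON_VALUE and SUBSTRING, but no row needs a DATEADD any more, and the comparison is a plain integer comparison.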
I solved the problem using
DATEADD(SECOND, CONVERT(INT, Left(SUBSTRING(JSON_VALUE(jsonStr, '$.EndDate'), 7, 13), 10)), '19700101')

Converting from mmddyyyy to yyyymmdd with SQL

I should preface my question by saying I am very new to SQL (or any programming involving databases). I started learning SQL a couple of weeks ago when I decided to take on a data project.
I have been using SSMS in wrangling large tables in comma-separated text file format. I need to be able to sort by dates, and the current format is mmddyyyy, but when I try to sort by date it does not work.
I know that converting dates is something that gets asked a lot around here, but I haven't found any solutions that explain things for a newb like myself.
So far my guesses for a solution are to use the CONVERT or CAST solutions, but I'm not sure if that is the right approach. I have found several CAST/CONVERT posts but none have really applied to my situation.
I hate to have this as my first question, but I thought I'd take some down vote bullets if it means I could figure this out. Thank you.
Sample of what I'm trying to do:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE [ column1] > 01012017;
I get the entire table back, unsorted.
Since your:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE [ column1] > 01012017;
does not error, we can say that the datatype of [column1] is either a character type (like varchar or char) or datetime.
Since you are getting back all the data, and I would think you don't have data in the future, it should be varchar (or char). With datetime, 1012017 would mean 1900-01-01 + 1012017 days.
To make it a datetime you need to cast your column1, which is in mmddyyyy form, by first converting it to yyyymmdd style (which works under any date and language setting):
cast(right([column1],4)+substring([column1],1,2)+substring([column1],3,2) as datetime)
and you would write that 01012017 as a string (with quotes around it), again in yyyymmdd format (it will be implicitly cast to datetime):
'20170101'
So your SQL becomes:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE cast(right([column1],4) +
substring([column1],1,2) +
substring([column1],3,2) as datetime) > '20170101';
Storing a date\datetime column as varchar and querying it like this prevents the use of simple indexes, but that is another matter. For now, this will return the data you want.
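Here is a small self-contained sketch of the same conversion, with an ORDER BY added since the question also asks about sorting. The three sample values and the aliases s and AsDate are made up for illustration.

```sql
-- Hypothetical sample values in mmddyyyy form.
SELECT s.column1,
       CAST(RIGHT(s.column1, 4) + SUBSTRING(s.column1, 1, 2) + SUBSTRING(s.column1, 3, 2) AS datetime) AS AsDate
FROM (VALUES ('12312016'), ('01022017'), ('03152017')) AS s(column1)
WHERE CAST(RIGHT(s.column1, 4) + SUBSTRING(s.column1, 1, 2) + SUBSTRING(s.column1, 3, 2) AS datetime) > '20170101'
ORDER BY AsDate;
-- Rows returned, in order: 01022017 (2017-01-02), 03152017 (2017-03-15)
```

If some rows might not be valid mmddyyyy strings, TRY_CAST in place of CAST (SQL Server 2012 or later) would return NULL for them instead of raising an error.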
Assuming your column's datatype is [Date], compare it directly with a date literal in yyyymmdd format:
SELECT *
FROM [databasename].[dbo].[table1]
WHERE [column1] > '20170101'
If it's in string format (mmddyyyy), you'll have to rearrange it to yyyymmdd and use CONVERT() to convert the column to Date, with a query like
SELECT *
FROM [databasename].[dbo].[table1]
WHERE CONVERT(date, RIGHT([Column1], 4) + LEFT([Column1], 4), 112) > '20170101'
Refer to this W3Schools article if you need more help with the CONVERT clause

How can I use data from a string in SQL in a numeric comparison?

I'm a B-grade SQL user, so bear with me. I have a field ("Year") that is in the NVARCHAR format, but only about 1 in 1000 records is something other than a number. Yes, this is a ridiculous way to do this, but we receive this database from a customer, and we can't change it.
I want to pull records from the database where the year field is greater than something (say, 2006 or later). I can ignore any record whose year doesn't evaluate to an actual year. We are using SQL server 2014.
I have created an embedded query to convert the data to a "float" field, but for whatever reason, I can't add a where clause with this new floating-point field. I originally tried using a "case-if" but I got the same result.
I'm pulling my hair out, as I'm either missing something really silly, or there's a bug in SQL server. When I look at the field in the little hint, it's showing as a float. When I run this, I get "Error converting data type nvarchar to float."
SELECT VL.Field_A,
VL.FLYear,
VL.Field_B
FROM
(select
Field_A,
cast ([Year] as float) as FLYear,
/* didn't work either*/
/*Convert(float, [Year]) as FLYear, */
Field_B
from CustomerProvidedDatabaseTable
where (Field_A like 'E-%' OR
Field_A like 'F-%')
and
(isnumeric(year)=1)
and
year is not null
) VL
/* this statement is the one it chokes on */
where
VL.FLYear >= 2006.0
If I remove the last "where" clause, it works fine, and the field looks like a number. If I change the last where clause to:
where VL.FLYear like '%2006%'
SQL Server accepts it, though of course it doesn't return me all the records I want.
Try to simplify it and just use TRY_CONVERT(DATETIME, aYearvalue) or TRY_PARSE, which return NULL for values they can't convert and continue to process valid rows. I think you can do away with the inner where clause and just work directly against the column, like this (substitute your column for the literal string after datetime):
SET DATEFORMAT mdy;
SELECT v.value1
FROM (SELECT YEAR(TRY_CONVERT(datetime, '08/01/2017')) AS value1) AS v
WHERE v.value1 >= 2016;
Try cast/convert to a numeric data type. I have modified the last line of your query to do just that. Take a peek.
SELECT
VL.Field_A,
VL.FLYear,
VL.Field_B
FROM
(select
Field_A,
cast ([Year] as float) as FLYear,
/* didn't work either*/
/*Convert(float, [Year]) as FLYear, */
Field_B
from CustomerProvidedDatabaseTable
where (Field_A like 'E-%' OR
Field_A like 'F-%')
and
(isnumeric(year)=1)
and
year is not null
) VL
/* this statement is the one it chokes on */
where
ISNUMERIC(VL.FLYear) = 1
and
CAST(VL.FLYear AS INT) >= 2006
Check out the following link for cast and convert documentation:
https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql
NOTE: ISNUMERIC will return true (a false positive) for a value written in scientific notation, e.g. 1E10, though I don't see this happening in your data.
Another option is TRY_CONVERT.
Documentation on TRY_CONVERT: https://learn.microsoft.com/en-us/sql/t-sql/functions/try-convert-transact-sql
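A minimal sketch of the TRY_CONVERT option, using made-up rows in place of CustomerProvidedDatabaseTable (the sample values and the aliases s and YearNumber are assumptions). SQL Server is free to evaluate the CAST before the ISNUMERIC filter, which would explain the original error. TRY_CONVERT avoids the issue because it never raises a conversion error.

```sql
-- Hypothetical sample rows standing in for CustomerProvidedDatabaseTable.
SELECT s.Field_A,
       TRY_CONVERT(int, s.[Year]) AS YearNumber   -- NULL when [Year] is not an integer string
FROM (VALUES (N'E-1', N'2005'),
             (N'E-2', N'2010'),
             (N'F-3', N'unknown'),
             (N'F-4', NULL),
             (N'G-5', N'2012')
     ) AS s(Field_A, [Year])
WHERE (s.Field_A LIKE 'E-%' OR s.Field_A LIKE 'F-%')
  AND TRY_CONVERT(int, s.[Year]) >= 2006;
-- Rows returned: E-2 with YearNumber 2010
```

Rows whose year cannot be converted yield NULL, and NULL >= 2006 is not true, so they drop out without an explicit ISNUMERIC check. If the data contains values such as '2006.0', converting to float instead of int may be the better fit.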
Try using CAST. Use the link below for more detail about casting.
https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql

Select data in date format

I have a query in which I want to select data from a column where the data is a date. The problem is that the data is a mix of text and dates.
This bit of SQL only returns the longest text field:
SELECT MAX(field_value)
Where the date does occur, it is always in the format xx/xx/xxxx
I'm trying to select the most recent date.
I'm using MS SQL.
Can anyone help?
Try this using ISDATE and CONVERT:
SELECT MAX(CONVERT(DateTime, MaybeDate))
FROM (
SELECT MaybeDate
FROM MyTable
WHERE ISDATE(MaybeDate) = 1) T
You could also use MAX(CAST(MaybeDate AS DateTime)). I got in the (maybe bad?) habit of using CONVERT years ago and have stuck with it.
To do this without a conversion error:
select max(case when isdate(col) = 1 then cast(col as date) end) -- or use convert()
from . . .
The SQL statement does not specify the order of operations. So, even including a where clause in a subquery will not guarantee that only dates get converted. In fact, the SQL Server optimizer is "smart" enough to do the conversion when the data is brought in and then do the filtering afterwards.
The only construct that guarantees sequencing of operations is the case expression, and there are even exceptions to that.
Another solution would be to use PATINDEX in the WHERE clause.
SELECT PATINDEX('[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]', field_value)
The problem with this approach is that you cannot be sure a match is really a date (e.g. 99/99/9999 is not a date).
And the problem with ISDATE is that it depends on configuration (e.g. DATEFORMAT).
So, use whichever option is appropriate.
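One way around both caveats, on SQL Server 2012 or later, is TRY_CONVERT with an explicit style, which is a different technique from the ISDATE and PATINDEX options above. The sketch below assumes the xx/xx/xxxx values are mm/dd/yyyy (style 101); for dd/mm/yyyy, style 103 would be the one to use. The sample rows are made up.

```sql
-- Hypothetical sample rows; field_value is the column name from the question.
SELECT MAX(TRY_CONVERT(date, s.field_value, 101)) AS MostRecent   -- style 101 = mm/dd/yyyy (assumed)
FROM (VALUES ('some text'),
             ('01/15/2013'),
             ('12/31/2012'),
             ('99/99/9999')   -- matches the digit pattern but is not a date
     ) AS s(field_value);
-- MostRecent: 2013-01-15
```

Text rows and impossible dates come back as NULL, which MAX ignores, so no separate filtering step is needed.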

Date comparison in Hive

I'm working with Hive and I have a table structured as follows:
CREATE TABLE t1 (
id INT,
created TIMESTAMP,
some_value BIGINT
);
I need to find every row in t1 that is less than 180 days old. The following query yields no rows even though there is data present in the table that matches the search predicate.
select *
from t1
where created > date_sub(from_unixtime(unix_timestamp()), 180);
What is the appropriate way to perform a date comparison in Hive?
How about:
where unix_timestamp() - created < 180 * 24 * 60 * 60
Date math is usually simplest if you can just do it with the actual timestamp values.
Or do you want it to only cut off on whole days? Then I think the problem is with how you are converting back and forth between ints and strings. Try:
where created > unix_timestamp(date_sub(from_unixtime(unix_timestamp(),'yyyy-MM-dd'),180),'yyyy-MM-dd')
Walking through each UDF:
unix_timestamp() returns an int: current time in seconds since epoch
from_unixtime(,'yyyy-MM-dd') converts to a string of the given format, e.g. '2012-12-28'
date_sub(,180) subtracts 180 days from that string, and returns a new string in the same format.
unix_timestamp(,'yyyy-MM-dd') converts that string back to an int
If that's all getting too hairy, you can always write a UDF to do it yourself.
Alternatively you may also use datediff. Then the where clause would be
in case of String timestamp (jdbc format) :
datediff(from_unixtime(unix_timestamp()), created) < 180;
in case of Unix epoch time:
datediff(from_unixtime(unix_timestamp()), from_unixtime(created)) < 180;
I think maybe it's a Hive bug dealing with the timestamp type. I've been trying to use it recently and getting incorrect results.
If I change your schema to use a string instead of timestamp, and supply values in the
yyyy-MM-dd HH:mm:ss
format, then the select query worked for me.
According to the documentation, Hive should be able to convert a BIGINT representing epoch seconds to a timestamp, and all existing datetime UDFs should work with the timestamp data type.
With this simple query:
select from_unixtime(unix_timestamp()), cast(unix_timestamp() as
timestamp) from test_tt limit 1;
I would expect both fields to be the same, but I get:
2012-12-29 00:47:43 1970-01-16 16:52:22.063
I'm seeing other weirdness as well.
TIMESTAMP is milliseconds
unix_timestamp is in seconds
You need to multiply the RHS by 1000.
where created > 1000 * date_sub(from_unixtime(unix_timestamp()), 180);
After reviewing this and referring to Date Difference less than 15 minutes in Hive I came up with a solution. While I'm not sure why Hive doesn't perform the comparison effectively on dates as strings (they should sort and compare lexicographically), the following solution works:
FROM (
SELECT id, value,
unix_timestamp(created) c_ts,
unix_timestamp(date_sub(from_unixtime(unix_timestamp()), 180), 'yyyy-MM-dd') c180_ts
FROM t1
) x
JOIN t1 t ON x.id = t.id
SELECT to_date(t.Created),
x.id, AVG(COALESCE(x.HighestPrice, 0)), AVG(COALESCE(x.LowestPrice, 0))
WHERE unix_timestamp(t.Created) > x.c180_ts
GROUP BY to_date(t.Created), x.id ;