Getting date from convoluted text field in SQL Server

I need to extract the date/time from a text field that looks like this, into a date/time column:
some text - 29th Jul 2021 16:44
some different text - 2nd Jul 2021 12:31
Example code to reproduce:
select 'some text - 29th Jul 2021 16:44' as textfield
union
select 'some different text - 2nd Jul 2021 12:31'
This is a vendor supplied database I'm querying - there's no option to change the format.
I need to extract the date & time into a datetime field (the purpose is to do a comparison to a different date time field).
Are there any shortcuts for doing this? I've begun attempting lots of manual substring functions to extract individual parts and piece them back together again, but it's very cumbersome, and I feel like there must be a better way.
The dash (-) is always going to be in the same position (relative to the date aspects), which has been helpful, but I still feel like I'm taking the wrong approach.
Is there a way I can substring after the dash, and for SQL Server to recognise the format?
A challenge here is the 'day' aspect will be single digit for 1-9, but double digit for 10-31.

If you like concise:
convert(datetime, stuff(right(txt, 19), 3, 2, ''), 106)
https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=c8720885ef6239187b2c220d0dfa9ae2
This conversion relies on there being a space following the hyphen so that when picking up the rightmost 19 characters you'll either end up with a digit or a space character in the initial position. This then allows you to strip out the two characters of ordinal text from a known location. (I had initially put an ltrim() in before the conversion but the possibly leading space doesn't seem to break the conversion anyway.)
One advantage is that it avoids the potential of other hyphens in the lead portion of the text interfering with the search. The whole issue of a marker/separator is eliminated completely.
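To see what each step of that expression produces, here is a small T-SQL sketch against the two sample strings. The table variable @t and its column txt are made up for illustration, and the conversion assumes a session language in which 'Jul' is a recognised month abbreviation.

```sql
-- Hypothetical sample data: @t and txt are illustrative names only
declare @t table (txt varchar(100));
insert into @t (txt) values
    ('some text - 29th Jul 2021 16:44'),
    ('some different text - 2nd Jul 2021 12:31');

select
    -- last 19 characters: '29th Jul 2021 16:44' or ' 2nd Jul 2021 12:31'
    right(txt, 19) as tail,
    -- characters 3 and 4 are always the ordinal suffix, so remove them:
    -- '29 Jul 2021 16:44' or ' 2 Jul 2021 12:31'
    stuff(right(txt, 19), 3, 2, '') as no_ordinal,
    -- style 106 is 'dd mon yyyy'
    convert(datetime, stuff(right(txt, 19), 3, 2, ''), 106) as extracted
from @t;
```

The fixed width of 19 only holds while the time is always hh:mm and the month is always a three-letter abbreviation, as in the sample.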

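For your sample data you can do this with a few uses of replace and a substring. This relies on the months being abbreviated: with full month names, replacing 'st' would also mangle August.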
with t as (
select 'some text - 29th Jul 2021 16:44' as textfield
union
select
'some different text - 2nd Jul 2021 12:31'
)
select Try_parse(y.d as datetime using 'en-GB') as ExtractedDate
from t
cross apply(values(Substring(t.textfield, CharIndex('-',t.textfield)+2 ,100)))x(v)
cross apply(values(Replace(Replace(Replace(Replace(x.v,'st',''),'nd',''),'rd',''),'th','')))y(d)

Related

how to search for a date row in sql?

I create column in this way:
ALTER TABLE cages
ADD test date;
I add a date to this column, for instance
'01/07/21'
and when I run my usual select:
select *
from cages
where test = '01/07/21';
I get nothing, which is weird, because it works in a different table... Could it be connected with the PK or FK, or what is the reason for this?
edit:
I use Oracle SQL Developer.
edit:
Thanks everyone for the help. The problem was that I used the calendar to put the date into the column, and it added the date with a time.
Why is that possible when the column has type date, not datetime?

Not everyone formats date values the same way. When looking at a date like 01/07/21, most of the people on this site will naturally read January 7, 2021*. The group that reads this as July 1, 2021 (today) is significant, but still slightly smaller. A few people come from cultures where July 21, 2001 is the natural interpretation.
To avoid this kind of ambiguity, when writing date literals for SQL you should always format them using the ISO-8601 formats, which always use four digit years, go in sequence from most significant term on the left to least significant term on the right, and always use leading zeroes to fill out the full width of a term:
yyyy-MM-dd HH:mm:ss.fff
yyyy-MM-dd HH:mm:ss
yyyy-MM-dd
yyyyMMdd (unseparated version of the format preferred on SQL Server for date-only values for historical reasons)
Anything else is wrong for SQL.
For completeness, I also want to key in on the word "literals" from the beginning of the second paragraph. We should always use parameterized queries/prepared statements when putting date values into a query from a client code language, rather than using string manipulation to substitute a literal into the SQL command. On strongly typed platforms this usually means using the DateTime type provided by the language to set the value. If you find yourself converting a datetime variable to a string for inclusion in an SQL query, you're making a mistake.
* This isn't just a blind assertion. A few years back I did a basic review of the public portion of the Stack Overflow developer survey, where I first looked up which countries/languages default to which date formats, and then grouped countries together based on their format. I wish I had saved the results :/. I forget how I treated places with mixed heritage like Canada.

Your root problem, and I'm amazed no one seems to have picked up on this, is that your column is a DATE but in your query you are comparing it to a STRING. This may or may not work, depending on your NLS_DATE_FORMAT setting. You need to compare like data types:
select *
from cages
where test = to_date('01/07/21','dd/mm/yy');
I leave it as an exercise for the student to go to the SQL Reference manual and read up on the TO_DATE function.
I also beg and plead with you to not be trying to use 2-digit years. As an industry we were supposed to have solved that problem over 20 years ago. Does the term "Y2k bug" not mean anything to you?
As it is, the date that is represented by the string '01/07/21' could be understood to be any of the following
Jan 7
Jan 21
Jul 1
Jul 21
And who knows the year? 2021? 1921? 1821? 2001? 1901? 2007? 1907? 1807?

If you insert a date into a table through the SQL Developer user interface, it automatically adds hours and minutes, as if it were a timestamp. If the format you are using is correct, you should be able to retrieve your rows using the like operator instead of equals.
select * from cages where test like '01/07/21%';
The only way to retrieve your rows using the equals operator is when the time portion is 00:00:00.
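If the column does carry a time portion, one common alternative to like is a half-open range, which keeps both sides of the comparison as dates and does not depend on the session's date format. A minimal Oracle sketch, assuming the cages table and test column from the question and that 1 July 2021 is the day wanted:

```sql
-- Oracle SQL sketch: match every row on 1 July 2021, whatever its time portion
select *
from cages
where test >= date '2021-07-01'        -- midnight at the start of the day
  and test <  date '2021-07-01' + 1;   -- midnight at the start of the next day
```

Writing the filter as a range rather than wrapping the column in a function also leaves an ordinary index on test usable.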

1st July 2021 is written date '2021-07-01' in Oracle SQL.
You can read more about literals in the Oracle SQL Reference.
You can also use a to_date expression like
to_date('2021-07-01','YYYY-MM-DD')
or
to_date('1-Jul-21', 'DD-Mon-YY', 'nls_date_language = English')
or indeed
to_date('1-Lug-21', 'DD-Mon-YY', 'nls_date_language = Italian')
but frankly, why would you?
Bear in mind that the person running the query/report/procedure, or the application server in use, may not have the same territory and language settings as you, so it is dangerous to assume that the century for a 2-digit year will always be what you expect (what year is '50'?), or that the language will always be English, or that the week always starts on a Monday. I worked on a system once where we deployed some code that used 'DD-MON-YYYY' to offices in London and Paris. We deployed it in September, and we had a production issue in Paris in February, because Sep, Oct, Nov, Dec and Jan still worked, but French has no Feb.

Convert two fields (Month and Year) into a YYYY-MM-DD Field

I am using SAP HANA SQL (Through Alteryx) via an in-DB formula.
I have two fields (Month and YEAR) and I need to convert/combine these into one field shown as YYYY-MM-DD. I am able to do this successfully locally in Alteryx but I need to make this happen within the DB via SQL.
See image for successful local conversion in Alteryx:

There seem to be two goals here:
construct a valid date from year and month information.
represent this date in a specific format, ie. YYYY-MM-DD
The first part can be done in HANA like this:
to_date( "<year_column>" || "<month_column>", 'YYYYMM') as newDate
The double-pipe || operator concatenates strings, which means, that <year_column> and <month_column> data will be first converted into strings if these are not already string-values.
The concatenated string is then turned into a date data type. The to_date conversion function takes the pattern string YYYYMM and since the day information is missing, it makes it up on the fly and sets the day to the first day of the month.
This to_date conversion also ensures that only valid dates are created.
If, for example, the MM would not be a value between 01 and 12 then the conversion would fail with an error.
This brings me to the next potential obstacle to look out for: the conversion string pattern YYYYMM requires that there will be exactly four digits denoting the year and exactly two digits for the month.
While this may be fine for the existing year data as most dates are denoted with four digits nowadays, there is a good chance that the month data does not have a leading zero (e.g. when the data is currently stored in a numeric field).
To "fix" this issue, we can just add the leading zero for all values that only have a single digit so far. There's a couple of ways to do this in HANA, and as this does not seem to be in an ABAP context, I'd go with a way that works on most SQL databases:
LPAD ("<month_column>", 2, '0')
This gets us to the following expression for step 1:
to_date( "<year_column>" || LPAD ("<month_column>", 2, '0'), 'YYYYMM') as newDate
Step 2 now is relatively easy: turn the date-data that we constructed in step 1 and represent it in a specific format.
Since date-data per se does not have a specific output format (i.e. you can display or print the same date any way you like - it doesn't change the data), it needs to be converted to a string for that.
The conversion function for that is called TO_NVARCHAR() and can also take a conversion pattern:
to_nvarchar( "<date_data>", 'YYYY-MM-DD') as fixedFormatDate
is what we're looking for in this question.
Putting it all together into a single expression:
to_nvarchar(to_date( "<year_column>"
|| LPAD ("<month_column>", 2, '0')
, 'YYYYMM')
, 'YYYY-MM-DD') as fixedFormatDate
While this is a long answer to a seemingly simple question, I believe it is important to understand all the involved steps that are necessary for this conversion.
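As a quick sanity check, the same expression can be tried with literal values in place of the columns. This is only a sketch: the year 2021 and month 7 are made-up inputs, and DUMMY is HANA's built-in one-row table.

```sql
-- SAP HANA sketch with hypothetical literal inputs instead of real columns
select to_nvarchar(
         -- lpad('7', 2, '0') gives '07', so to_date receives '202107'
         to_date('2021' || lpad('7', 2, '0'), 'YYYYMM'),
         'YYYY-MM-DD') as fixedFormatDate
from dummy;
```

Going by the explanation above, this should come back as the first day of July 2021 in YYYY-MM-DD form.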

sql date format patindex regex replace recall

Using SQL, is there an elegant way to remove the 'th' 'rd' and 'st' after a date? Here is how I'm doing it, but I'd rather do something like regex recall (example below).
Please no functions with while loops. I've already seen those solutions and I'm looking for something less ugly.
select Replace('september 8th, 2016', Substring('september 8th, 2016', PatIndex('%[a-z,A-Z][a-z,A-Z],%', 'september 8th, 2016'), 2), '') statdate2
if using recall I could do something like below
/(\s[0-9]+)[a-z,A-Z]{2}\,/
and recall what the date number is with the replacement
$1,
As-is, my pattern in the SQL example may pull in any two characters followed by a comma, which is good 99% of the time, but it also matches some unwanted date misformats where users put the comma after the month.

Well you could write a CLR function that uses regex recall.
By the way, your current query does a replace on the entire date string.
This becomes a problem if the suffix you want to replace is present in the month name.
For instance when I run your logic on August 1st, like this:
select Replace('august 1st, 2016', Substring('august 1st, 2016', PatIndex('%[a-z,A-Z][a-z,A-Z],%', 'august 1st, 2016'), 2), '') statdate2
I get:
augu 1, 2016
If you want a non-CLR solution that works, you probably need to eliminate the characters by their position in the string rather than using REPLACE(). Meaning get all the characters before the suffix, and concatenate it with all the characters after the suffix. It won't be pretty, but like I said, if you want pretty, you can go with CLR.
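A rough T-SQL sketch of that position-based idea: PATINDEX finds a digit followed by two letters and a comma, and STUFF cuts the two letters out at that position. The variable names are made up, and it assumes the suffix is always exactly two letters directly after the day number and directly before the comma.

```sql
-- T-SQL sketch; @d and @p are illustrative names only
declare @d varchar(50) = 'august 1st, 2016';

-- position of the digit that sits right before the two-letter suffix (0 if no match)
declare @p int = patindex('%[0-9][a-z][a-z],%', @d);

select case
         when @p > 0 then stuff(@d, @p + 1, 2, '')  -- remove only the two suffix letters
         else @d                                    -- no suffix found: leave unchanged
       end as cleaned;
-- cleaned: 'august 1, 2016'
```

Because the removal is by position, the 'st' inside the month name is left alone, and a string with the comma after the month simply falls through unchanged.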

Format date where the position of the parts is variable

We have a file that needs to be imported that has dates in it. The dates are in a format that I have not seen before, and the day part can vary in length (but seemingly not the month or year) and position, based on whether the number is double digit or not, i.e.
Dates:
13082014 is 13th August 2014
9092013 is 9th September 2013
The current script tries to substring the parts out, but fails on the second one as there is not enough data. I could write an if or case to check the length, but is there a SQL format that can be used to reliably import this data?
To clarify this is MSSQL and the date format is ddmmyyyy or dmmyyyy

One simple way is to use STUFF.
Example:
select STUFF(STUFF('13082014',3,0,'/'),6,0,'/');
-- result: 13/08/2014
Good luck.
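The STUFF call above assumes eight characters, so a seven-character value such as 9092013 would get its slashes in the wrong place. One way around that in T-SQL, sketched here with a made-up table variable, is to left-pad to eight characters first and then convert with style 103 (dd/mm/yyyy):

```sql
-- T-SQL sketch; @raw and d are illustrative names only
declare @raw table (d varchar(8));
insert into @raw (d) values ('13082014'), ('9092013');

select
    d,
    -- right('0' + d, 8) pads 7-character values to 8: '9092013' -> '09092013'
    -- the two STUFF calls then insert the slashes: '09/09/2013'
    convert(date,
            stuff(stuff(right('0' + d, 8), 3, 0, '/'), 6, 0, '/'),
            103) as parsed
from @raw;
-- parsed: 2014-08-13 and 2013-09-09
```

This assumes the values have no leading or trailing spaces; if they might, trim them before padding.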

LPAD a zero when it is missing so to always get an eight character date string. Here is an example with Oracle, other DBMS may have other string and date functions to achieve the same.
select to_date(datestring, 'ddmmyyyy')
from
(
select lpad('13082014', 8, '0') as datestring from dual
union all
select lpad('9092013', 8, '0') as datestring from dual
);
Result:
13.08.2014
09.09.2013

You can convert the dates to a suitable date format and then import the data (adjust the logic to the date format).
Something like this:
select Convert(varchar(10),CONVERT(date,YourDateColumn,106),103)

2005 SQL Server Management Studio

Can someone interpret this query? I am aware that an alias, a conversion, and linked databases are involved, but I am not familiar with the placement of the numbers "4,2".
set birthday = convert(datetime, left(a.dob,2)+ '/'+
right(left(a.dob,4),2)+'/'+
right(a.dob,2)+left(right(a.dob,4),2))
from
[plkfsql2k5\prod2k5].st_data.dbo.st_patient a

This left(a.dob,2)+ '/'+right(left(a.dob,4),2)+'/'+right(a.dob,2)+left(right(a.dob,4),2) nightmare of a substring concatenation is taking a date, probably stored as text (ugh), and returning it in a format that can be converted to a date with the convert(datetime, <text date>) formula it's wrapped in.
The text date appears to be in a format where the month is in the first 2 characters, the day is in the 3rd and 4th characters, and the year... well, the year 1999 is stored as 9919, because some psychopath designed this field...
So it takes this mmddyyYY format (that only a crazy person would use), converts it to mm/dd/YYyy, and then uses the convert(datetime, <text date>) function to make it into an actual date.
Note that if you are in a country that uses dd/mm/yyyy format, then my explanation may need to be tweaked, as the incoming text value may be ddmmyyYY, which is converted to dd/mm/YYyy. Still a crazy-bananas starting point though.
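To make the slicing concrete, here is a small T-SQL sketch run against a made-up value, '07049919' (4 July 1999 in the mmddyyYY layout described above). The explicit style 101 (mm/dd/yyyy) is an addition so the result does not depend on the session's date settings; the original query omits it.

```sql
-- T-SQL sketch (SQL Server 2005 compatible); @dob is a hypothetical sample value
declare @dob char(8);
set @dob = '07049919';

select
    left(@dob, 2)           as mm,       -- '07'  characters 1-2
    right(left(@dob, 4), 2) as dd,       -- '04'  characters 3-4
    right(@dob, 2)          as century,  -- '19'  characters 7-8
    left(right(@dob, 4), 2) as yy,       -- '99'  characters 5-6
    convert(datetime,
            left(@dob, 2) + '/' + right(left(@dob, 4), 2) + '/'
            + right(@dob, 2) + left(right(@dob, 4), 2),   -- '07/04/1999'
            101) as birthday;
```

So the "4,2" pairs are just a way of picking two characters out of the middle of the string: take four characters from one end, then two from the other end of that piece.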
