Checking a DateTime which is part of a larger nvarchar - sql

I have a table, let's call it 'tblData', which has a field called 'details' of type nvarchar(4000). This field might contain text such as:
'blah blah ## 12/08/1982 ## blah blah'. The part that starts and ends with '##' might or might not exist.
I need to write a select statement with a where clause that does not return rows where the '## [date] ##' is in the future. If the '## [date] ##' part does not exist, the row should be returned.
Is this possible to do without using a function?
If I have to use a function, some sample code would come in handy, as I barely know any T-SQL...
Thanks a lot!

This is quite difficult to do in SQL. You can do it by extracting the date and converting it, but SQL Server does not have great string manipulation functions.
The following extracts the date:
(case when details like '% [0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %'
then substring(details,
patindex('% [0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %',
details) + 1, 10)
end)
So, you can put this into a where clause as:
where details not like '% [0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %' or
convert(date,
(case when details like '% [0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %'
then substring(details,
patindex('% [0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] %',
details) + 1, 10)
end), 101) <= getdate();
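As a rough end-to-end sketch of the approach above: the table variable @tblData and its three rows are made up for illustration, and the pattern here also anchors on the '##' markers, which is an assumption about the data rather than part of the original answer. TRY_CONVERT (SQL Server 2012 and later) is swapped in for CONVERT so that a malformed date yields NULL instead of an error; such a row is then filtered out. Style 101 reads the text as mm/dd/yyyy; use 103 if the dates are dd/mm/yyyy.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @tblData TABLE (id int, details nvarchar(4000));
INSERT INTO @tblData (id, details) VALUES
    (1, N'blah blah ## 12/08/1982 ## blah blah'),  -- past date: kept
    (2, N'blah blah ## 12/08/2999 ## blah blah'),  -- future date: filtered out
    (3, N'no date marker here');                   -- no marker: kept

SELECT t.id, t.details
FROM @tblData AS t
CROSS APPLY (
    -- Position of the first '#' of the marker, or 0 when there is no marker.
    SELECT PATINDEX('%## [0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] ##%', t.details) AS pos
) AS p
WHERE p.pos = 0
   -- '## ' is three characters long, so the date starts at pos + 3.
   OR TRY_CONVERT(date, SUBSTRING(t.details, p.pos + 3, 10), 101) <= CAST(GETDATE() AS date);
```

Computing the position once in CROSS APPLY avoids repeating the long pattern in both the test and the extraction.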

select * from tblData where details not like '%## __/__/____ ##%'

Related

Highlighting rows with trailing and leading space/s

I want to highlight rows with trailing or leading spaces. I have the query below, but I'd like to know whether there is a better, more efficient way to achieve this.
SELECT *
FROM DummyTable lc
WHERE
(lc.Code LIKE '% ' OR lc.Code LIKE ' %' or lc.Code like '% % %' OR lc.Code like '% % %')
AND (lc.StartDate <= getdate() AND lc.EndDate > getdate())
AND (lc.CodeTypeID <> 27)
ORDER BY 4 DESC
Please note that I don't want to remove spaces from the field "Code", just highlight the affected rows in my result set.
There is not a more efficient method, but you can reduce the number of comparisons:
WHERE CONCAT(' ', lc.Code, ' ') LIKE '%  %' AND
lc.StartDate <= getdate() AND
lc.EndDate > getdate() AND
lc.CodeTypeID <> 27
This adds a space to the beginning and end and then looks for two spaces in a row (which seems to be your intention despite how the question is phrased).
Unfortunately, there is little you can do to improve performance because all the comparisons are inequalities.
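To see what the padded comparison flags, here is a small sketch. The table variable and the sample codes are invented for illustration; note that the pattern contains two consecutive spaces.

```sql
-- Hypothetical sample values; only the Code column matters here.
DECLARE @DummyTable TABLE (Code varchar(20));
INSERT INTO @DummyTable (Code) VALUES
    ('ABC'),    -- clean
    (' ABC'),   -- leading space
    ('ABC '),   -- trailing space
    ('A  BC'),  -- two spaces in the middle
    ('A BC');   -- single inner space only

SELECT Code,
       -- Padding turns a leading or trailing space into a double space,
       -- so one LIKE covers leading, trailing and doubled spaces.
       CASE WHEN CONCAT(' ', Code, ' ') LIKE '%  %' THEN 1 ELSE 0 END AS HasSuspectSpace
FROM @DummyTable;
```

With these rows, 'ABC' and 'A BC' should come back with 0 and the other three with 1.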
You could just use string functions LEFT() and RIGHT() to check if the string starts or ends with a space, like:
SELECT *
FROM DummyTable
WHERE
(LEFT(Code, 1) = ' ' OR RIGHT(Code, 1) = ' ')
AND StartDate <= getdate()
AND EndDate > getdate()
AND CodeTypeID <> 27
ORDER BY 4 desc
As commented by Martin Smith, (LEFT(Code, 1) = ' ' OR RIGHT(Code, 1) = ' ') can be simplified as ' ' IN (LEFT(Code, 1), RIGHT(Code, 1)).
NB: a few simplifications for your query:
you don't need to prefix the columns with the table alias, since only one table is involved in the query
you don't need to surround individual conditions with parentheses (just make sure to surround the ORed conditions with parentheses to separate them from the ANDed conditions)
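Putting the IN simplification and the notes above together might look like the following sketch. The table variable, its columns and its rows are stand-ins for DummyTable, and the ORDER BY uses a column name instead of a position purely for readability.

```sql
-- Hypothetical sample table standing in for DummyTable.
DECLARE @DummyTable TABLE (Code varchar(20), StartDate date, EndDate date, CodeTypeID int);
INSERT INTO @DummyTable (Code, StartDate, EndDate, CodeTypeID) VALUES
    (' ABC', '2000-01-01', '2999-12-31', 1),   -- leading space: returned
    ('ABC ', '2000-01-01', '2999-12-31', 1),   -- trailing space: returned
    ('ABC',  '2000-01-01', '2999-12-31', 1),   -- clean: not returned
    (' XYZ', '2000-01-01', '2999-12-31', 27);  -- excluded by CodeTypeID

SELECT Code, StartDate, EndDate, CodeTypeID
FROM @DummyTable
WHERE ' ' IN (LEFT(Code, 1), RIGHT(Code, 1))  -- first or last character is a space
  AND StartDate <= GETDATE()
  AND EndDate > GETDATE()
  AND CodeTypeID <> 27
ORDER BY Code DESC;
```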

SQL Server | Look for specific keywords in strings

I need your help.
I try to match a manually created lookup of specific keywords with a fact comment table. Purpose: an attempt to categorize these comments.
Example
comment: A lot more power than the equivalent from Audi.
keyword from keyword-list: Audi
category from keyword-list: competitor
I tried something like
SELECT
FC.comment_id, KWM.keyword, KWM.category
FROM
dbo.factcomments FC
INNER JOIN
(SELECT
keywordmatcher = '%[,. ]' + keyword + '[ .,]%',
keyword,
category
FROM
dbo.keywordlist) KWM ON FC.comment LIKE KWM.keywordmatcher
Maybe a bad example, but I only want specific matches --> no matches if the keyword is part of another word in the fact comments (e.g. 'part' but not 'apart').
Because my first try didn't match keywords at the beginning/end of strings I did something really nasty:
SELECT
FC.comment_id, KWM.keyword, KWM.category
FROM
dbo.factcomments FC
INNER JOIN
(SELECT
keyword,
category
FROM
dbo.keywordlist) KWM ON FC.comment LIKE '%[,. ]' + KWM.keyword + '[ .,]%'
OR FC.comment LIKE KWM.keyword + '[ .,]%'
OR FC.comment LIKE '%[,. ]' + KWM.keyword
I know...
Besides the fact that I also want to detect those comments where there are '!', '?', ''', '-' or '_' before or after these keywords - is there any clever way to do so?
In fact I want any comments where there are no word characters before or after the keyword, any other character is OK.
In the JOIN condition, REPLACE() all non-alphanumeric characters in FC.Comment with a space character, and surround it with spaces. Something like this:
' '+REPLACE(FC.Comment, ...)+' '
Then do your LIKE Comparison like this:
LIKE '% '+KWM.Keyword+' %'
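A minimal sketch of that idea, using nested REPLACE calls for a handful of punctuation characters. The table variables and their rows are invented for illustration, and the list of replaced characters is only a starting point to extend as needed.

```sql
-- Hypothetical stand-ins for dbo.factcomments and dbo.keywordlist.
DECLARE @factcomments TABLE (comment_id int, comment varchar(255));
DECLARE @keywordlist  TABLE (keyword varchar(50), category varchar(50));

INSERT INTO @factcomments (comment_id, comment) VALUES
    (1, 'A lot more power than the equivalent from Audi.'),
    (2, 'It fell apart quickly!'),
    (3, 'Missing part, still waiting.');
INSERT INTO @keywordlist (keyword, category) VALUES
    ('Audi', 'competitor'),
    ('part', 'component');

SELECT FC.comment_id, KWM.keyword, KWM.category
FROM @factcomments AS FC
INNER JOIN @keywordlist AS KWM
    -- Turn punctuation into spaces and pad both ends, so every word in the
    -- comment is surrounded by spaces before the LIKE comparison.
    ON ' ' + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
             FC.comment, '.', ' '), ',', ' '), '!', ' '), '?', ' '), '-', ' ') + ' '
       LIKE '% ' + KWM.keyword + ' %';
```

Comment 2 should not match 'part', because 'apart' has a letter directly before the keyword. Keywords that themselves contain LIKE wildcards (%, _ or [) would need escaping first.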
A different approach might be:
declare @comment varchar(255)=concat(' ','A lot more power than the equivalent from Audi.',' ')
declare @keyword varchar(50)='Audi'
DECLARE @allowedStrings VARCHAR(100)
DECLARE @teststring VARCHAR(100)
SET @allowedStrings = '><()!?#_-.\/?!*&^%$#()~'
;WITH CTE AS
(
SELECT SUBSTRING(@allowedStrings, 1, 1) AS [String], 1 AS [Start], 1 AS [Counter]
UNION ALL
SELECT SUBSTRING(@allowedStrings, [Start] + 1, 1) AS [String], [Start] + 1, [Counter] + 1
FROM CTE
WHERE [Counter] < LEN(@allowedStrings)
)
SELECT @comment = REPLACE(@comment, CTE.[String], '') FROM CTE
Change the @comment variable however you like and check the result:
SELECT
@comment as Comment , @keyword as KeyWord,
iif(substring(@comment,PATINDEX(concat('%',@keyword,'%'),@comment)-1,len(@keyword)+2)=concat(' ',@keyword,' '),1,0) as isMatch
This is a borrowed idea from https://stackoverflow.com/a/29162400/10735793

Looping a CASE WHEN and REPLACE statement in SQL

Apologies for the multiple basic questions - I am very new to SQL and still trying to work things out.
I would like to insert records from my staging table to another table in my database, both removing the double quotes in the source file with a 'replace' function and converting the data from nvarchar (staging table) to datetime2. I can't quite work out how to do this: if I loop the 'case when' within 'replace', as below, then SQL doesn't recognise my data and nulls it out:
CASE WHEN ISDATE (REPLACE([Column1], '"', '')) = 1
THEN CONVERT(datetime2, Column1, 103)
ELSE null END
However if I loop my 'replace' within my 'case when', as below, SQL gives me an error message saying that it is unable to convert nvarchar into datetime2:
LTRIM(REPLACE([Column1], '"', '')
,CASE WHEN ISDATE(Column1) = 1 THEN CONVERT(datetime2, Column1, 103)
ELSE null END
What order / syntax do I need to be using to achieve this? An example of the data field would be:
"16/10/2017"
It uploads to my staging table as nvarchar
"16/10/2017"
and I would like to move it into my table2 as datetime2:
16/10/2017
Instead of isdate(), use try_convert():
TRY_CONVERT(datetime2, LTRIM(REPLACE([Column1], '"', '')), 103)
I think your confusion is that you need to do the string manipulation before the conversion. To do this, the string manipulation needs to be an argument to the conversion.
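A short sketch of that nesting on some made-up staging rows (the table variable and its values are illustrative only). Style 103 tells SQL Server to read the text as dd/mm/yyyy, and TRY_CONVERT returns NULL instead of raising an error when the cleaned text is not a valid date.

```sql
-- Hypothetical staging rows; the quotes are part of the stored text.
DECLARE @staging TABLE (Column1 nvarchar(50));
INSERT INTO @staging (Column1) VALUES
    (N'"16/10/2017"'),   -- converts to 16 October 2017
    (N' "01/02/2018"'),  -- leading space is trimmed, converts to 1 February 2018
    (N'"not a date"');   -- becomes NULL

SELECT Column1,
       -- Innermost first: strip the quotes, trim, then convert.
       TRY_CONVERT(datetime2, LTRIM(REPLACE(Column1, '"', '')), 103) AS Converted
FROM @staging;
```

The same expression can be used directly in the SELECT list of an INSERT ... SELECT from the staging table.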
You are on the right track. The problem is that CONVERT needs the value without the double quotes, which is why your conversion was failing.
Just try this:
select
CASE WHEN ISDATE (REPLACE([Column1], '"', '')) = 1
THEN CONVERT(datetime2, (REPLACE([Column1], '"', '')), 103)
ELSE null END
from #tbl
More details: see the CAST and CONVERT documentation.

Is it possible to Compare two columns in Microsoft SQL server so that the comparison skips punctuation marks and other character like %, ' etc?

I have two columns having data like below.
Column1
AMC Standard, School
Column2
AMC Standard School.
I need to compare these two columns such that only the words are compared and not any additional characters. In the example above, Column1 and Column2 should match, but because of the comma "," and the period ".", a simple comparison of Column1 and Column2 reports a mismatch.
You can replace the characters that should not be compared (in your case , and .) with an empty string and then compare the results. Something like this:
SELECT 1 WHERE REPLACE('AMC Standard, School',',','') = REPLACE('AMC Standard School.','.','')
Based on jarlh's comments, you should (if possible) update the columns and remove the punctuation marks if they are not used in any comparison or display.
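Applied to table columns, the same idea might look like this sketch, which strips both characters from both sides so it does not matter which column carries which punctuation. The table variable and its rows are made up for illustration.

```sql
-- Hypothetical sample rows.
DECLARE @t TABLE (Column1 varchar(100), Column2 varchar(100));
INSERT INTO @t (Column1, Column2) VALUES
    ('AMC Standard, School', 'AMC Standard School.'),  -- differs only in punctuation
    ('AMC Standard, School', 'AMC Standard College');  -- differs in a word

SELECT Column1, Column2,
       CASE WHEN REPLACE(REPLACE(Column1, ',', ''), '.', '')
               = REPLACE(REPLACE(Column2, ',', ''), '.', '')
            THEN 'MATCHED' ELSE 'MISMATCH' END AS ColumnCompare
FROM @t;
```

The first row should report MATCHED and the second MISMATCH. Further characters such as % or ' can be handled by adding more REPLACE calls on both sides.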
One option is to use SQL Server's SoundEx() and Difference() functions (https://msdn.microsoft.com/en-us/library/ms187384.aspx and https://msdn.microsoft.com/en-us/library/ms188753.aspx respectively)
DECLARE @val1 varchar(50) = 'AMC Standard, School'
, @val2 varchar(50) = 'AMC Standard School.'
;
SELECT @val1
, @val2
, SoundEx(@val1)
, SoundEx(@val2)
, Difference(SoundEx(@val1), SoundEx(@val2))
;
The return value of Difference() is between 0 and 4, with a higher number signifying a closer match.
IMPORTANT NOTE: This type of comparison is not as exacting as a method that cleans up your data beforehand as in those scenarios you can use an exact (a=a) comparison, whereas this method looks for similar values.
Try like this
DECLARE @column1 VARCHAR(100)='AMC Standard, School (Near to ABC Building)'
DECLARE @column2 VARCHAR(100)='AMC Standard, School (Opposite KFC)'
SELECT 'MATCHED' AS COLUMN_COMPARE
WHERE replace(replace(replace(@column1, ',', ''), '.', ''), substring(@column1, CHARINDEX('(', @column1), CHARINDEX(')', @column1) - 1), '') = replace(replace(replace(@column2, ',', ''), '.', ''), substring(@column2, CHARINDEX('(', @column2), CHARINDEX(')', @column2) - 1), '')

Running SQL query to remove all trailing and beginning double quotes only affects first record in result set

I'm having a problem when running the below query - it seems to ONLY affect the very first record. The query removes all trailing and beginning double quotes. The first query is the one that does this; the second query is just to demonstrate that there are multiple records that have beginning double quotes that I need removed.
QUESTION: As you can see, the first record resulting from the top query is fine: its double quotes have been removed from the beginning. But all subsequent records appear to be untouched. Why?
If quotes are always assumed to exist at both the beginning and the end, adjust your CASE statement to look for instances where both cases exist:
CASE
WHEN ([Message] LIKE '"%' AND [Message] LIKE '%"') THEN LEFT(RIGHT([Message], LEN([Message])-1), LEN([Message])-2)
ELSE [Message]
END
EDIT
If assumption is not valid, combine above syntax with your existing CASE logic:
CASE
WHEN ([Message] LIKE '"%' AND [Message] LIKE '%"') THEN LEFT(RIGHT([Message], LEN([Message])-1), LEN([Message])-2)
WHEN ([Message] LIKE '"%') THEN RIGHT([Message], LEN([Message])-1)
WHEN ([Message] LIKE '%"') THEN LEFT([Message], LEN([Message])-1)
ELSE [Message]
END
Because your CASE statement only evaluates the first condition that is met, it will only ever remove one of the quotes.
Try something like the following:
SELECT REPLACE(SUBSTRING(Message, 1, 1), '"', '') + SUBSTRING(Message, 2, LEN(Message) - 2) + REPLACE(SUBSTRING(Message, LEN(Message), 1), '"', '')
EDIT: As Martin Smith pointed out, my original code wouldn't work if a string was under two characters, so ...
CREATE TABLE #Message (Message VARCHAR(20))
INSERT INTO #Message (Message)
SELECT '"SomeText"'
UNION
SELECT '"SomeText'
UNION
SELECT 'SomeText"'
UNION
SELECT 'S'
SELECT
CASE
WHEN LEN(Message) >=2
THEN REPLACE(SUBSTRING(Message, 1, 1), '"', '') + SUBSTRING(Message, 2, LEN(Message) - 2) + REPLACE(SUBSTRING(Message, LEN(Message), 1), '"', '')
ELSE Message
END AS Message
FROM #Message
DROP TABLE #Message
Try this:
SELECT REPLACE([Message], '"', '') AS [Message] FROM SomeTable
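Note that REPLACE removes every double quote, including ones inside the text. If only the outer quotes should go, one possible sketch is to work out separately whether a leading and a trailing quote are present and cut the string accordingly. The table variable, the sample rows and the LeadQ/TrailQ names are all hypothetical.

```sql
-- Hypothetical sample rows, including short and empty strings.
DECLARE @Message TABLE (Message varchar(30));
INSERT INTO @Message (Message) VALUES
    ('"SomeText"'), ('"SomeText'), ('SomeText"'), ('Some"Text'), ('"'), ('');

SELECT m.Message,
       -- Skip the leading quote if present and shorten the length by
       -- however many outer quotes were found.
       SUBSTRING(m.Message, 1 + f.LeadQ, LEN(m.Message) - f.LeadQ - f.TrailQ) AS Stripped
FROM @Message AS m
CROSS APPLY (
    SELECT CASE WHEN m.Message LIKE '"%' THEN 1 ELSE 0 END AS LeadQ,
           -- A lone '"' counts as a leading quote only, which keeps the
           -- SUBSTRING length from going negative.
           CASE WHEN LEN(m.Message) > 1 AND m.Message LIKE '%"' THEN 1 ELSE 0 END AS TrailQ
) AS f;
```

With these rows, the first three should all come back as SomeText, 'Some"Text' should be unchanged, and the last two should come back empty. LEN ignores trailing spaces, so values that end in spaces would lose them here.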