SQL look up abbreviation value - sql

So I have declared a table variable:
DECLARE @VTable TABLE (ColtoAdj VARCHAR(50), HeadlineDescription VARCHAR(50), HeadlineDescriptionGroup VARCHAR(50));
INSERT INTO @VTable VALUES ('Neopla','Neoplasia','Cancer');
INSERT INTO @VTable VALUES ('Metasta','Metastases','Cancer');
INSERT INTO @VTable VALUES ('CES','CES','Cauda Equina');
INSERT INTO @VTable VALUES ('CES-I','CES-I','Cauda Equina');
which is then used in an OUTER APPLY on the main table to find the HeadlineDescription and HeadlineDescriptionGroup.
OUTER APPLY (
SELECT TOP 1
vt.ColtoAdj,
vt.HeadlineDescription,
vt.HeadlineDescriptionGroup
FROM @VTable AS vt
WHERE cmd.Headline LIKE ('%' + vt.ColtoAdj + '%')
For most of my data this works fine, because a lot of the vt.ColtoAdj lookup words are distinctive (they contain a hyphen, for example), so only the intended rows are matched. The problem is the CES value. It could be anywhere in a 255-character headline, so it currently picks up things like "BraCES" and matches them incorrectly. I tried inserting two values for it, ' CES' and 'CES ' (with a leading or trailing space), to match just the abbreviation, but this still grabs words that start or end with it. Is there any way, for this value in the table, to find just the three letters by themselves while still searching for full words for the other values?
The answer may simply be no, because it's a LIKE match and a word lookup can't always guarantee 100% accuracy, but I thought I would check.

If you are looking for word boundaries and words are bounded by a space, you can use:
WHERE ' ' + cmd.Headline + ' ' LIKE '% ' + vt.ColtoAdj + ' %'
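To see how the space-padding behaves on the "BraCES" case, here is a minimal sketch that runs the same predicate through Python's built-in sqlite3 module. It is only an approximation of the SQL Server query: SQLite concatenates with || instead of +, its LIKE is case-insensitive for ASCII by default, and the helper name matches_whole_word is made up for illustration.

```python
import sqlite3


def matches_whole_word(headline: str, term: str) -> bool:
    """Return True when `term` occurs in `headline` with a space (or the
    start/end of the string) on both sides."""
    conn = sqlite3.connect(":memory:")
    try:
        # Pad both sides with a space so a term at the very start or end
        # of the headline is still surrounded by spaces.
        row = conn.execute(
            "SELECT (' ' || ? || ' ') LIKE ('% ' || ? || ' %')",
            (headline, term),
        ).fetchone()
        return bool(row[0])
    finally:
        conn.close()


if __name__ == "__main__":
    for text in ("Suspected CES in patient", "Patient wears BraCES daily", "CES-I confirmed"):
        print(text, "->", matches_whole_word(text, "CES"))
```

Note that this only treats spaces as boundaries, so a headline such as "CES, suspected" would not match 'CES' either; punctuation would need extra handling.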

I can only think of a per-case solution:
WHERE
(
(cmd.Headline LIKE ('%' + vt.ColtoAdj + '%') and vt.ColtoAdj<>'CES')
OR
(cmd.Headline='CES' and vt.ColtoAdj='CES')
)

Related

In SQL Server, how can I search for column with 1 or 2 whitespace characters?

I need to filter a column that contains one, two, or three whitespace characters.
CREATE TABLE a
(
[col] [char](3) NULL
)
and some inserts like
INSERT INTO a VALUES (' '), ('  '), ('   ')
How do I get only the row with one whitespace character?
Simply writing
SELECT *
FROM a
WHERE col = ' '
returns all rows, irrespective of how many whitespace characters they contain.
Is there a way to escape the space? Or to search for a specific number of whitespace characters in the column? Regex?
Use a LIKE clause, e.g. WHERE col LIKE '%[ ]%'
The brackets are important; LIKE clauses provide a very limited subset of regex-style pattern matching. If that's not enough, you can add a regex function written in C# to the database and use it to check each row, but it can't use an index and will therefore be very slow.
The other alternative, if you need speed, is to look into full text search indexes.
Here is one approach you can take:
DECLARE @data table ( txt varchar(50), val varchar(50) );
INSERT INTO @data VALUES ( 'One Space', ' ' ), ( 'Two Spaces', '  ' ), ( 'Three Spaces', '   ' );
;WITH cte AS (
SELECT
txt,
DATALENGTH ( val ) - ( DATALENGTH ( REPLACE ( val, ' ', '' ) ) ) AS CharCount
FROM @data
)
SELECT * FROM cte WHERE CharCount = 1;
RETURNS
+-----------+-----------+
| txt | CharCount |
+-----------+-----------+
| One Space | 1 |
+-----------+-----------+
You need to use DATALENGTH as LEN ignores trailing blank spaces, but this is a method I have used before.
NOTE:
This example assumes the use of a varchar column.
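The counting idea itself (length before minus length after removing the spaces) is easy to check outside the database. A small Python sketch, where count_spaces is a hypothetical helper name and Python's len counts trailing spaces the way DATALENGTH does for a single-byte varchar:

```python
def count_spaces(val: str) -> int:
    """Number of space characters in val: total length minus the length
    once every space has been removed."""
    return len(val) - len(val.replace(" ", ""))


if __name__ == "__main__":
    for label, val in (("One Space", " "), ("Two Spaces", "  "), ("Three Spaces", "   ")):
        print(label, count_spaces(val))
```

Filtering on count_spaces(val) == 1 corresponds to the CharCount = 1 condition in the query above.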
Trailing spaces are often ignored in string comparisons in SQL Server. They are treated as significant on the LHS of the LIKE though.
To search for values that are exactly one space you can use
select *
from a
where ' ' LIKE col AND col = ' '
/*The second predicate is required in case col contains % or _ and for index seek*/
Note with your example table all the values will be padded out to three characters with trailing spaces anyway though. You would need a variable length datatype (varchar/nvarchar) to avoid this.
The advantage this has over checking the value plus DATALENGTH is that it is agnostic to how many bytes per character the string uses (which depends on data type and collation).
How to get only rows with one space?
SELECT *
FROM a
WHERE col LIKE SPACE(1) AND col NOT LIKE SPACE(2)
;
Though this will only work for variable length datatypes.
Thanks guys for answering.
So I converted the char(3) column to varchar(3).
This seemed to work for me. It seems SQL Server has ANSI padding that pads a char(3) column to three whitespace characters for any empty or single-space input, so any search, LEN, or REPLACE operates on the padded value.

Search for specific string inside column field

In my table, the values for a specific column are stored in three different ways, as shown below. A field holds one, two, or three items, separated by commas when there is more than one: minimum one value, maximum three. To be clear, I know this is a bad approach (it wasn't done by me), but I have to work with it and may only change this one query. Example showing the three ways of storing values:
MaterialAttributes (column name)
----------------------------------
1,12,32
3,1
9
I have a specific SQL query that checks whether some value exists within the field. It is universal and covers all three ways:
somevalue1
or:
somevalue1,somevalue2
or:
somevalue1,somevalue2,somevalue3
Therefore, if I search that column across the entire table for records where, for instance, somevalue2 appears, this query gives me the correct result.
This is the query:
";WITH spacesdeleted (vater, matatt) as (SELECT vater, REPLACE(MaterialAttributes, ' ', '') MaterialAttributes FROM myTable),
matattrfiltered (vat) as (SELECT vater FROM spacesdeleted WHERE matatt = @matAttrId
OR matatt LIKE @matAttrId +',%'
OR matatt LIKE '%,'+@matAttrId
OR matatt LIKE '%,'+@matAttrId+',%' ),
dictinctVaters (disc_vats) as (SELECT distinct(vat) FROM matattrfiltered)
SELECT ID from T_Artikel WHERE Vater IN (SELECT disc_vats FROM dictinctVaters)"
Note: For safety, any spaces next to the commas are removed (just information from another developer).
What is the question:
The problem now is that the logic has changed: instead of a maximum of 3 values, up to 12 can now be stored in that column.
OR matatt LIKE @matAttrId +',%'
OR matatt LIKE @matAttrId +',%'
These are the same; should one be '%,' + @matAttrId?
Regardless, I think there are only 4 cases you need:
= @matAttrId
LIKE @matAttrId + ',%'
LIKE '%,' + @matAttrId
LIKE '%,' + @matAttrId + ',%'
Covering
single value, equals
value is start of list
value is end of list
value is in middle of list
Which is what your original query already has.
You can search with
WHERE ',' + matatt + ',' LIKE '%,' + @matAttrId + ',%'
It works like this: matatt is extended to look like this
,1,12,32,
,3,1,
,9,
Now you can always search for an id looking like ,id, by using the search pattern %,id,%, where id is the real id.
This works for any number of values per column.
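To check that the wrapped pattern really does not confuse 1 with 11 or 21, here is a rough sketch using Python's built-in sqlite3 module. It assumes SQLite syntax (|| instead of +), reuses the myTable and MaterialAttributes names from the question, and rows_containing_id is a made-up helper name.

```python
import sqlite3


def rows_containing_id(values, wanted_id):
    """Return the stored lists that contain wanted_id as a whole element."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE myTable (MaterialAttributes TEXT)")
        conn.executemany("INSERT INTO myTable VALUES (?)", [(v,) for v in values])
        # Strip spaces, wrap the list in commas, then look for ",id,".
        cur = conn.execute(
            "SELECT MaterialAttributes FROM myTable "
            "WHERE (',' || REPLACE(MaterialAttributes, ' ', '') || ',') "
            "LIKE ('%,' || ? || ',%') "
            "ORDER BY rowid",
            (str(wanted_id),),
        )
        return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()


if __name__ == "__main__":
    data = ["1,12,32", "3,1", "9", "11,21", "2, 1 ,5"]
    print(rows_containing_id(data, 1))
```

Because the pattern never depends on the position of the id, the same predicate keeps working when the list grows from 3 to 12 entries.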

SQL Server check if value is substring inside isnull

I have a field in the UI that passes a stored procedure either a null value (when the field is left empty) or a contract number (when it is filled in). Substrings of the contract number are accepted as input.
Inside the procedure, I need to filter the results by this parameter.
I need something similar to this:
SELECT * FROM tableName tn
WHERE
tn.ContractNumber LIKE ISNULL('%' + @contractNumber + '%', tn.ContractNumber)
What do you think is the best approach? The problem is that a condition like this does not return any rows.
Simply:
SELECT *
FROM tableName tn
WHERE tn.ContractNumber LIKE '%' + @contractNumber + '%'
OR @contractNumber IS NULL
You are really checking multiple conditions, so keeping them separate reads more intuitively (for most people, anyway).
I assume this is just a sample query, and you are not selecting * in reality...
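A quick way to try the "filter or NULL" pattern without a SQL Server instance is Python's built-in sqlite3 module. This is only a sketch: it uses SQLite's || concatenation and a named parameter :cn in place of @contractNumber, and the sample contract numbers and the helper names make_demo_db and find_contracts are invented for the demo.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table with a few made-up contract numbers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableName (ContractNumber TEXT)")
    conn.executemany(
        "INSERT INTO tableName VALUES (?)",
        [("AB-1001",), ("AB-2002",), ("CD-1001",)],
    )
    return conn


def find_contracts(conn, contract_number=None):
    """Return every row when contract_number is None, otherwise only the
    rows whose ContractNumber contains it as a substring."""
    cur = conn.execute(
        "SELECT ContractNumber FROM tableName "
        "WHERE :cn IS NULL OR ContractNumber LIKE '%' || :cn || '%' "
        "ORDER BY ContractNumber",
        {"cn": contract_number},
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    db = make_demo_db()
    print(find_contracts(db))          # no filter supplied
    print(find_contracts(db, "1001"))  # substring filter
```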
Another one:
SELECT *
FROM tableName tn
WHERE tn.ContractNumber LIKE '%' + ISNULL(@contractNumber, '%') + '%'

SQL Server : find percentage match of LIKE string

I'm trying to write a query to find the percentage match of a search string in a notes or TEXT column.
This is what I'm starting with:
SELECT *
FROM NOTES
WHERE UPPER(NARRATIVE) LIKE 'PAID CALLED RECEIVED'
Ultimately, what I want to do is:
Split the search string by spaces and search individually for all words in the string
Order the results descending based on percentage match
For example, in the above scenario, each word in the search string would constitute 33.333% of the total. A NARRATIVE with 3 matches (100%) should be at the top of the results, while a match containing 2 of the keywords (66.666%) would be lower, and a match containing 1 of the keywords (33.333%) would be even lower.
I then want to display the resulting percentage match for that row in a column, along with all the other columns from that table (*).
Hopefully, this makes sense and can be done. Any thoughts on how to proceed? This MUST all be done in SQL Server, and I would prefer not to write any CTEs.
Thank you in advance for any guidance.
Here is what I came up with:
DECLARE @VISIT VARCHAR(25) = '999232'
DECLARE @KEYWORD VARCHAR(100) = 'PAID,CALLED,RECEIVED'
-- STATIC so that @@CURSOR_ROWS reports the real row count (a dynamic cursor reports -1)
DECLARE SPLIT_CURSOR CURSOR STATIC FOR
SELECT RTRIM(LTRIM(VALUE)) FROM Rpt_Split(@KEYWORD, ',')
IF OBJECT_ID('tempdb..#NOTES_FF_SEARCH') IS NOT NULL DROP TABLE #NOTES_FF_SEARCH
SELECT N.VISIT_NO
,N.CREATE_DATE
,N.CREATE_BY
,N.NARRATIVE
,0E8 AS PERCENTAGE
INTO #NOTES_FF_SEARCH
FROM NOTES_FF AS N
WHERE N.VISIT_NO = @VISIT
DECLARE @KEYWORD_VALUE AS VARCHAR(255)
OPEN SPLIT_CURSOR
FETCH NEXT FROM SPLIT_CURSOR INTO @KEYWORD_VALUE
WHILE @@FETCH_STATUS = 0
BEGIN
-- 100.0 avoids integer division (100 / 3 would add 33, not 33.33)
UPDATE #NOTES_FF_SEARCH
SET PERCENTAGE = PERCENTAGE + ( 100.0 / @@CURSOR_ROWS )
WHERE UPPER(NARRATIVE) LIKE '%' + UPPER(@KEYWORD_VALUE) + '%'
FETCH NEXT FROM SPLIT_CURSOR INTO @KEYWORD_VALUE
END
CLOSE SPLIT_CURSOR
DEALLOCATE SPLIT_CURSOR
-- Best matches first, newest first within the same percentage
SELECT * FROM #NOTES_FF_SEARCH
WHERE PERCENTAGE > 0
ORDER BY PERCENTAGE DESC, CREATE_DATE DESC
There may be a more efficient way to do this but every other road I started down ended in a dead-end. Thanks for your help
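Stripped of the cursor, the scoring above comes down to "fraction of keywords found in the narrative". A minimal Python sketch of that logic, useful for sanity-checking the expected percentages; the function names and sample notes are made up, and the match is a plain case-insensitive substring test, like the LIKE '%keyword%' predicate:

```python
def keyword_match_percentage(narrative: str, search: str) -> float:
    """Percentage of the space-separated keywords in `search` that occur
    (case-insensitively, as substrings) in `narrative`."""
    keywords = [word.upper() for word in search.split()]
    if not keywords:
        return 0.0
    text = narrative.upper()
    hits = sum(1 for keyword in keywords if keyword in text)
    return 100.0 * hits / len(keywords)


def rank_notes(notes, search):
    """Return (percentage, note) pairs for notes matching at least one
    keyword, best match first."""
    scored = [(keyword_match_percentage(note, search), note) for note in notes]
    matched = [pair for pair in scored if pair[0] > 0]
    return sorted(matched, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    sample = [
        "Patient called",
        "Paid in full; called and received receipt",
        "No contact",
    ]
    for pct, note in rank_notes(sample, "PAID CALLED RECEIVED"):
        print(round(pct, 3), note)
```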
If you want to do a "percentage" match, you need to do two things: calculate the number of words in the string and calculate the number of words you care about. Before giving some guidance, I will say that full text search probably does everything you want and much more efficiently.
Assuming the search string has space delimited words, you can count the words with the expression:
(len(narrative) - len(replace(narrative, ' ', '')) + 1) as NumWords
You can count the matching words with successive replaces. So, for keywords, it would be something like removing each keyword, fixing the spaces, and counting the words that remain.
The overall code is best represented with subqueries. The resulting query is something like:
select n.*
from (select n.*,
(len(narrative) - len(replace(narrative, ' ', '')) + 1.0) as NumWords,
ltrim(rtrim(replace(replace(replace(narrative + ' ', @keyword1 + ' ', ''),
@keyword2 + ' ', ''),
@keyword3 + ' ', ''))) as NoKeywords
from notes n
) n
order by 1 - (len(NoKeywords) - len(replace(NoKeywords, ' ', '')) + 1.0) / NumWords desc;
SQL Server -- as with many databases -- is not particularly good at parsing strings. You can do that outside the query and assign the @keyword variables accordingly.

Is it possible to search for multiple terms in a column by using a LIKE statement?

I'm trying to understand if the above question is possible. I've been conceptually thinking about it, and basically what I'm looking to do is:
Specify keywords that may appear in a title. Let's use the two terms "Portfolio" and "Mike".
I'm hoping to write a query that finds titles containing either Portfolio or Mike. The two terms need not appear together.
For instance, if I have a title dubbed: "Portfolio A" and another title "Mike's favorite" I'd like both of these titles to be returned.
The issue I've encountered with using a LIKE statement is the following:
WHERE 1=1
and rpt_title LIKE ''%'+@report_title+'%'''
If I were to input: 'Portfolio,Mike' it would search for the occurrence of just that within a title.
EDIT: I should have been a bit more clear. I believe it's necessary for me to input my variable as 'Portfolio, Mike' in order for it to find the multiple values. Is this possible?
I'm assuming you could maybe use a charindex with a substring and a replace?
Yep, multiple Like statements with OR will work just fine -- just make sure you use the correct parentheses:
SELECT ...
FROM ...
WHERE 1=1
and (rpt_title LIKE '%Portfolio%'
or rpt_title LIKE '%Mike%')
However, I might suggest you look into using a full-text search.
http://msdn.microsoft.com/en-us/library/ms142571.aspx
I can propose a solution where you can specify any number of masks without using multiple LIKE predicates:
DECLARE @temp TABLE (st VARCHAR(100))
INSERT INTO @temp (st)
VALUES ('Portfolio photo'),('- Mike'),('blank'),('else'),('est')
DECLARE @delims VARCHAR(30)
SELECT @delims = '|Portfolio|Mike|' -- %Portfolio% OR %Mike% OR etc.
SELECT t.st
FROM @temp t
CROSS JOIN (
SELECT substr =
SUBSTRING(
@delims,
number + 1,
CHARINDEX('|', @delims, number + 1) - number - 1)
FROM [master].dbo.spt_values n
WHERE [type] = N'P'
AND number <= LEN(@delims) - 1
AND SUBSTRING(@delims, number, 1) = '|'
) s
WHERE t.st LIKE '%' + s.substr + '%'
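What the query above does, in effect, is split the delimited string into masks and keep every title that contains at least one of them. A small Python sketch of that logic, with made-up helper names and assuming a case-insensitive comparison (in SQL Server that depends on the collation):

```python
def split_masks(delims: str, sep: str = "|"):
    """Split '|Portfolio|Mike|' into ['Portfolio', 'Mike'], dropping the
    empty pieces produced by the leading and trailing separator."""
    return [part for part in delims.split(sep) if part]


def matching_titles(titles, delims):
    """Return the titles that contain at least one of the masks."""
    masks = [mask.lower() for mask in split_masks(delims)]
    return [title for title in titles if any(mask in title.lower() for mask in masks)]


if __name__ == "__main__":
    rows = ["Portfolio photo", "- Mike", "blank", "else", "est"]
    print(matching_titles(rows, "|Portfolio|Mike|"))
```

One difference worth knowing: the CROSS JOIN version returns a title once per matching mask, so a title containing both terms would appear twice unless you add DISTINCT, whereas this sketch lists each title once.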