SQL Server: display whole column only if substring found - sql

Working with SQL Sever 2016. I am constrained by the fact we cannot create functions or stored procedures. I am trying to find %word% in many columns across a table (75). Right now, I have a very large clump of
and (fieldname1 like %word%
or fieldname2 like %word%
or fieldname3 like %word%) etc.
While cumbersome, this does provide me the correct results. However:
I am looking to simplify this and
in the select, I want to display the whole column if and only if it finds %word% (or even just the column name would work)
Thank you in advance for any thoughts.

--...slow...
declare #searchfor varchar(100) = '23';
select #searchfor as [thevalue],
thexml.query('for $a in (/*[contains(upper-case(.), upper-case(sql:variable("#searchfor")))])
return concat(local-name($a[1]), ",")').value('.', 'nvarchar(max)') as [appears_in_columns],
*
from
(
select *, (select o.* for xml path(''), type) as thexml
from sys.all_objects as o --table goes here
) as src
where thexml.exist('/*[contains(upper-case(.), upper-case(sql:variable("#searchfor")))]') = 1;

One option uses cross apply to unpivot the table and then search:
select v.*
from mytable t
cross apply (values
('fieldname1', fieldname1),
('fieldname2', fieldname2),
('fieldname3', fieldname3)
) v(fieldname, fieldvalue)
where v.fieldvalue like '%word%'
Note that if more than one column contains the search word, you will get several rows in the resultset. I am unsure how you want to handle this use case (there are options).

SELECT OBJECT_NAME(id) ObjectName , [Text]
FROM syscomments
WHERE TEXT LIKE '%word%'

Related

data type of each characters in a varchar T-sql

I'm curious on the data I get from someone. Most of the time I need to get 3 integers then a space then eight integers.
And The integration created a column varchar(20) ... Don't doubt it works, but that gives me some matching errors.
Because of this, I'd like to know what is the data type of the characters on each row.
For exemple : 0 is for integer, s for space, a for char, * for specific
AWB | data type
---------------------------------
012 12345678 | 000s00000000
9/5 ab0534 | 0*0saa0000
I'd like to know if there is a function or a formula to get this kind of results.
Right after I'll be able to group by this column and finally be able to check how good is the data quality.
I don't know if there is a specific word for what I tried to explain, so excuse me if this is a duplicate of a post, I didn't find it.
Thank you for your feedback.
There's nothing built-in, but you might use an approach like this:
DECLARE #tbl TABLE(ID INT IDENTITY,AWB VARCHAR(100));
INSERT INTO #tbl VALUES
('012 12345678')
,('9/5 ab0534');
WITH cte AS
(
SELECT t.ID
,t.AWB
,A.Nmbr
,C.YourMask
FROM #tbl t
CROSS APPLY (SELECT TOP (DATALENGTH(t.AWB)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY (SELECT SUBSTRING(t.AWB,A.Nmbr,1)) B(SingleCharacter)
CROSS APPLY (SELECT CASE WHEN B.SingleCharacter LIKE '[0-9]' THEN '0'
WHEN B.SingleCharacter LIKE '[a-z]' THEN 'a'
WHEN B.SingleCharacter = ' ' THEN 's'
ELSE '*' END) C(YourMask)
)
SELECT ID
,AWB
,(
SELECT YourMask
FROM cte cte2
WHERE cte2.ID=cte.ID
ORDER BY cte2.Nmbr
FOR XML PATH(''),TYPE
).value('.','nvarchar(max)') YourMaskConcatenated
FROM cte
GROUP BY ID,AWB;
The idea in short:
The cte will create a derived set of your table.
The first CROSS APPLY will create a list of numbers as long as the current AWB value.
The second CROSS APPLY will read each character separately.
The third CROSS APPLY will finally use some rather simple logic to translate your values to the mask you expect.
The final SELECT will then use GROUP BY and a correlated sub-query with FOR XML to get the mask characters re-concatenated (With version v2017+ this would be easier calling STRING_AGG()).

compare some lists in where condition sql

I have some question in Sqlserver2012. I have a table that contains a filed that save who System Used from this information and separated by ',', I want to set into parameter the name of Systems and query the related rows:
declare #System nvarchar(50)
set #System ='BPM,SEM'
SELECT *
FROM dbo.tblMeasureCatalog t1
where ( ( select Upper(value) from dbo.split(t1.System,','))
= any( select Upper(value) from dbo.split(#System,',')))
dbo.split is a function to return systems in separated rows
Forgetting for a second that storing delimited lists in a relational database is abhorrent, you can do it using a combination of INTERSECT and EXISTS, for example:
DECLARE #System NVARCHAR(50) = 'BPM,SEM';
DECLARE #tblMeasureCatalog TABLE (System VARCHAR(MAX));
INSERT #tblMeasureCatalog VALUES ('BPM,XXX'), ('BPM,SEM'), ('XXX,SEM'), ('XXX,YYY');
SELECT mc.System
FROM #tblMeasureCatalog AS mc
WHERE EXISTS
( SELECT Value
FROM dbo.Split(mc.System, ',')
INTERSECT
SELECT Value
FROM dbo.Split(#System, ',')
);
Returns
System
---------
BPM,XXX
BPM,SEM
XXX,SEM
EDIT
Based on your question stating "Any" I assumed that you wanted rows where the terms matched any of those provided, based on your comment I now assume you want records where the terms match all. This is a fairly similar approach but you need to use NOT EXISTS and EXCEPT instead:
Now all is still quite ambiguous, for example if you search for "BMP,SEM" should it return a record that is "BPM,SEM,YYY", it does contain all of the searched terms, but it does contain additional terms too. So the approach you need depends on your requirements:
DECLARE #System NVARCHAR(50) = 'BPM,SEM,XXX';
DECLARE #tblMeasureCatalog TABLE (System VARCHAR(MAX));
INSERT #tblMeasureCatalog
VALUES
('BPM,XXX'), ('BPM,SEM'), ('XXX,SEM'), ('XXX,YYY'),
('SEM,BPM'), ('SEM,BPM,XXX'), ('SEM,BPM,XXX,YYY');
-- METHOD 1 - CONTAINS ALL SEARCHED TERMS BUT CAN CONTAIN ADDITIONAL TERMS
SELECT mc.System
FROM #tblMeasureCatalog AS mc
WHERE NOT EXISTS
(
SELECT Value
FROM dbo.Split(#System, ',')
EXCEPT
SELECT Value
FROM dbo.Split(mc.System, ',')
);
-- METHOD 2 - ONLY CONTAINS ITEMS WITHIN THE SEARCHED TERMS, BUT NOT
-- NECESSARILY ALL OF THEM
SELECT mc.System
FROM #tblMeasureCatalog AS mc
WHERE NOT EXISTS
( SELECT Value
FROM dbo.Split(mc.System, ',')
EXCEPT
SELECT Value
FROM dbo.Split(#System, ',')
);
-- METHOD 3 - CONTAINS ALL ITEMS IN THE SEARCHED TERMS, AND NO ADDITIONAL ITEMS
SELECT mc.System
FROM #tblMeasureCatalog AS mc
WHERE NOT EXISTS
( SELECT Value
FROM dbo.Split(#System, ',')
EXCEPT
SELECT Value
FROM dbo.Split(mc.System, ',')
)
AND LEN(mc.System) = LEN(#System);
You have a problem with your data structure because you are storing lists of things in a comma-delimited list. SQL has a great data structure for storing lists. It goes by the name "table". You should have a junction table with one row per "measure catalog" and "system".
Sometimes, you are stuck with other people's really bad design decisions. One solution is to use split(). Here is one method:
select mc.*
from dbo.tblMeasureCatalog mc
where exists (select 1
from dbo.split(t1.System, ',') t1s join
dbo.split(#System, ',') ss
on upper(t1s.value) = upper(ss.value)
);
you can try this :
declare #System nvarchar(50)
set #System ='BPM,SEM'
SELECT * from dbo.tblMeasureCatalog t1 inner join dbo.Split (#System ,',') B on t1.it=B.items

Solution to avoid non-sargable argument in where clause

In the code_list CTE in this query I have a row constructor that will eventually take any number of arguments. The column icd in the patient_codes CTE is a five digit identifier that is most descriptive that the three digit codes that the row constructor has. The table icd_patient has a 100 million rows so for performance's sake, I would like to filer the rows on this table before I do any further work. I have
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select distinct icd,pat_id,id
from icd_patient
where icd in (select icd from code_list)
)
select distinct pat_id from patient_codes
The problem is, however, is that in the icd_patient table all of the icd columns are five digit and more descriptive. If I look at the execution plan of this query it's pretty streamlined. If I do
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select substring(icd,1,3) as icd,pat_id
from icd_patient2
where substring(icd,1,3) in (select * from code_list)
)
select * from patient_codes
this if course has a large performance impact because of the substring expression in the where clause. Does something akin to like in exist so I can take advantage of my indexes?
Index on icd_patient
CREATE NONCLUSTERED INDEX [ix_icd_patient] ON [dbo].[icd_patient2]
(
[pat_id] ASC
)
INCLUDE ( [id],
This much simpler query should be better than (or, at worst, the same as) your existing query.
select pat_id
FROM dbo.icd_patient
where icd LIKE '707%'
OR icd LIKE '250%'
GROUP BY pat_id;
Note that sargability only matters if there is actually an index on this column.
An alternative (since OR can sometimes give the optimizer fits):
SELECT pat_id FROM
(
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '707%'
UNION ALL
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '250%'
) AS x
GROUP BY pat_id;
To make this extensible beyond a handful of OR conditions, I would use a table-valued parameter (TVP).
CREATE TYPE dbo.StringPatterns AS TABLE(s VARCHAR(3) PRIMARY KEY);
Then your stored procedure could say:
CREATE PROCEDURE dbo.whatever
#sp dbo.StringPatterns READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT p.pat_id
FROM dbo.icd_patient AS p
INNER JOIN #sp AS sp
ON p.pat_id LIKE sp.s + '%'
GROUP BY p.pat_id;
END
Then you can pass in your set of three-character substrings from a DataTable or other collection in C#. From T-SQL just as an example:
DECLARE #p dbo.StringPatterns;
INSERT #p VALUES('707'),('250');
EXEC dbo.whatever #sp = #p;
Something like like in does not exist. The following is sargable:
select *
from icd_patient
where icd like '70700%' or
icd like '25002%'
Because like with a constant initial substring is a special case for SQL Server. This does not work when the strings on the right are variables.
One solution is to create an indexed view on the icd_patient table with an index on the first five characters of the icd code.
Using "IN" makes that part of a command non-sargable on both sides. End of discussion.
Saying he fixes it using substring, completely changes what it would return while it remains non sarged.
Any "fix" should exactly match results. The actual fix is to join the cte so the five characters match or put three characters in the cte and match that in a join or put 4 characters in the cte where the fourth is "%" and join matching by using LIKE
Using a "like" that starts with "%" increases the complexity of the search, but it would still use the index to find the value because parsing the index should use less reading by only getting the full table row when a search is successful.

Compare Xml data in SQL

I have two tables with same NVARCHAR field that really contains XML data.
in some cases this really-XML-field is really same as one row in other table but differs in attributes order and therefor string comparison does not return the correct result!!!
and to determining the same XML fields ,I need to have a comparison like:
cast('<root><book b="" c="" a=""/></root>' as XML)
= cast('<root><book a="" b="" c=""/></root>' as XML)
but I get this Err Msg:
The XML data type cannot be compared or sorted, except when using the
IS NULL operator.
then what is the best solution to determine the same XML without re-casting them to NVARCHAR?
Why cast it at all? Just plug them into an XML column in a temp table and run Xquery to compare them to the other table. EDIT: Included example of the comparison. There are many, many ways to run the query against the XML to get the rows that are the same - exactly how that query is written is going to depend on preference, requirements, etc. I went with a simple group by/count, but a self join could be used, WHERE EXISTS against the columns that are being searched for duplicates, you name it.
CREATE TABLE #Test (SomeXML NVARCHAR(MAX))
CREATE TABLE #XML (SomeXML XML)
INSERT #Test (SomeXML)
VALUES('<root><book b="b" c="c" a="a"/></root>')
,('<root><book a="a" b="b" c="c"/></root>')
INSERT #XML (SomeXML)
SELECT SomeXML FROM #Test;
WITH XMLCompare (a,b,c)
AS
(
SELECT
x.c.value('#a[1]','char(1)') AS a
,x.c.value('#b[1]','char(1)') AS b
,x.c.value('#c[1]','char(1)') AS c
FROM #XML
CROSS APPLY SomeXMl.nodes('/root/book') X(C)
)
SELECT
a
,b
,c
FROM XMLCompare as a
GROUP BY
a
,b
,c
HAVING COUNT(*) >1

SQL Server query brings unmatched data with BETWEEN filter

I'm querying on my products table for all products with code between a range of codes, and the result brings a row that should't be there.
This is my SQL query:
select prdcod
from products
where prdcod between 'F-DH1' and 'F-FMS'
order by prdcod
and the results of this query are:
F-DH1
F-DH2
F-DH3
FET-RAZ <-- What is this value doing here!?
F-FMC
F-FML
F-FMS
How can this odd value make it's way into the query results?
PS: I get the same results if I use <= and >= instead of between.
According to OP request promoted next comment to answer:
Seems like your collation excludes '-' sign - this way results make sense, FE is between FD and FM.
:)
between and >= and <= are primarily used for numeric operations (including dates). You're trying to use this for strings, which are difficult at best to determine how those operators will interpret the each string.
Now, while I think I understand your goal here, I'm not entirely sure it's possible using SQL Server queries. This may be some business logic (thanks to the product codes) that needs implemented in code. Something like the Entity Framework or Linq-to-SQL may be better suited to get you the data you're looking for.
How about adding AND LEFT(prdcod, 2) = 'F-'?
Try replacing the "-" with a space so the order is what you would expect:
DECLARE #list table(word varchar(50))
--create list
INSERT INTO #list
SELECT 'F-DH1'
UNION ALL
SELECT 'F-DH2'
UNION ALL
SELECT 'F-DH3'
UNION ALL
SELECT 'FET-RAZ'
UNION ALL
SELECT 'F-FMC'
UNION ALL
SELECT 'F-FML'
UNION ALL
SELECT 'F-FMS'
--original order
SELECT * FROM #list order by word
--show how order changes
SELECT *,replace(word,'-',' ') FROM #list order by replace(word,'-',' ')
--show between condition
SELECT * FROM #list where replace(word,'-',' ') between 'F DH1' and 'F FMS'