Detect (find) string in another string ( nvarchar (MAX) ) - sql

I have an nvarchar(max) column with values like 'A2',
and a column in another table with values like '(A2 AND A3) OR A4'.
I need to detect whether a string in the second column contains a string from the first column,
and then select all rows of the second table whose string contains a string from the first table's column.
Something like this, although I know it is wrong:
SELECT * Cols FROM T2
WHERE (SELECT T1.StringCol FROM T1) IN T2.StringCol
but I think of it more like this (in F#-like syntax):
for t1.date, t1.StringCol from t1
for t2.StringCol from t2
if t2.StringCol.Contains( t1.StringCol )
yield t2.StringCol, t1.date

This should get what you want...
select t2.*
from t1 cross join t2
where patindex('%' + t1.StringCol + '%', t2.StringCol) > 0
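For anyone who finds the nested-loop reading in the question easier to follow than the cross join, here is a minimal Python sketch of the same idea. The sample rows and the name `find_matches` are made up for illustration; the containment test is a plain substring check, which is roughly what the `'%' + t1.StringCol + '%'` pattern does as long as the value contains no wildcard characters.

```python
from datetime import date

def find_matches(t1_rows, t2_rows):
    """Yield (t2_string, t1_date) for every t2 string that contains a t1 string."""
    for t1_date, t1_string in t1_rows:      # outer loop over t1
        for t2_string in t2_rows:           # inner loop over t2 (the cross join)
            if t1_string in t2_string:      # the WHERE condition
                yield t2_string, t1_date

# Hypothetical sample data, not taken from the question
t1 = [(date(2020, 1, 1), "A2"), (date(2020, 1, 2), "A5")]
t2 = ["(A2 AND A3) OR A4", "A7 OR A8"]

matches = list(find_matches(t1, t2))
print(matches)
```

Note that a plain substring test also matches 'A2' inside 'A22', so stricter token matching would need extra delimiters.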


Why does SQL Server not behave consistently when dealing with half-emojis?

When an NVARCHAR attribute in a SQL Server database contains an emoji, string functions and operators behave in different ways (default database collation is SQL_Latin1_General_CP1_CI_AS).
Behavior of string functions
Functions like LEFT() and LEN() see the emoji as multiple, separate characters (LEFT will cut the emoji in half and return a partial value).
DECLARE @YourString NVARCHAR(12) = N'Thank you 😃'
SELECT @YourString, LEFT(@YourString, 11), LEN(@YourString)
The return values are "Thank you 😃", "Thank you �", and 12.
Behavior of operators
Operators such as UNION, INTERSECT, and EXCEPT treat both the emoji and the half-emoji as a single, identical value, though they return different results depending on the order the values arrive in.
Behavior of the UNION operator
UNION treats these as identical records (since it only returns one value), but will nevertheless yield a different result depending on the order, with the bottom record being returned.
SELECT @YourString UNION SELECT LEFT(@YourString, 11) returns "Thank you �" (the value from the bottom half)
SELECT LEFT(@YourString, 11) UNION SELECT @YourString returns "Thank you 😃" (the value from the top half)
Behavior of the INTERSECT and EXCEPT operators
INTERSECT and EXCEPT also treat these as identical values, and INTERSECT returns the value from the top record (which makes sense given the purpose of the operator, but nonetheless feels weird after seeing UNION do the opposite).
SELECT @YourString INTERSECT SELECT LEFT(@YourString, 11) returns "Thank you 😃".
SELECT LEFT(@YourString, 11) INTERSECT SELECT @YourString returns "Thank you �".
SELECT @YourString EXCEPT SELECT LEFT(@YourString, 11) returns no result.
Summary
String functions such as LEFT() and LEN() treat these characters as separate values.
The UNION operator treats these as identical values, but will nevertheless return varying results (with preference given to the value from the bottom half of the operator).
The INTERSECT and EXCEPT operators treat these as identical values, and INTERSECT returns the value from the top half.
Question
Why would two different values be treated as a single, identical character by some operators (UNION, INTERSECT, EXCEPT), but be interpreted as two different values by string functions (LEFT, LEN)?
Bonus Question
Why would the UNION operator be returning the second value it sees?
I'm afraid I don't see any inconsistency here: using SQL_Latin1_General_CP1_CI_AS, all set and comparison operations treat the two values as equal (fiddle):
CREATE TABLE t1 (s1 NVARCHAR(12) COLLATE SQL_Latin1_General_CP1_CI_AS);
CREATE TABLE t2 (s2 NVARCHAR(12) COLLATE SQL_Latin1_General_CP1_CI_AS);
INSERT INTO t1 VALUES (N'Thank you 😃');
INSERT INTO t2 SELECT LEFT(s1, 11) FROM t1;
SELECT * FROM t1;
SELECT * FROM t2;
SELECT s1 FROM t1 UNION SELECT s2 FROM t2; -- only s1
SELECT s1 FROM t1 EXCEPT SELECT s2 FROM t2; -- none
SELECT s2 FROM t2 EXCEPT SELECT s1 FROM t1; -- none
SELECT s1 FROM t1 INTERSECT SELECT s2 FROM t2; -- only s1
SELECT * FROM t1 INNER JOIN t2 ON s1 = s2; -- one row with s1 and s2
As soon as we change the collation to Latin1_General_100_CI_AS, all operations treat the two values as not equal (fiddle):
CREATE TABLE t1 (s1 NVARCHAR(12) COLLATE Latin1_General_100_CI_AS);
CREATE TABLE t2 (s2 NVARCHAR(12) COLLATE Latin1_General_100_CI_AS);
INSERT INTO t1 VALUES (N'Thank you 😃');
INSERT INTO t2 SELECT LEFT(s1, 11) FROM t1;
SELECT * FROM t1;
SELECT * FROM t2;
SELECT s1 FROM t1 UNION SELECT s2 FROM t2; -- both
SELECT s1 FROM t1 EXCEPT SELECT s2 FROM t2; -- only s1
SELECT s2 FROM t2 EXCEPT SELECT s1 FROM t1; -- only s2
SELECT s1 FROM t1 INTERSECT SELECT s2 FROM t2; -- none
SELECT * FROM t1 INNER JOIN t2 ON s1 = s2; -- no result
The reason why the legacy encoding considers those two values equal can be found in this question:
SQL Query Where Column = '' returning Emoji characters 🎃 and 🍰
You probably already know this, but I'd like to explicitly point out that the fact that a = b evaluates to true for two values with different byte contents and different lengths is in itself not exceptional: All case-insensitive collations treat 'A' and 'a' as equal, and even case-sensitive collations treat 'A' (length 1) and 'A ' (length 2) as equal, since SQL Server ignores trailing spaces.
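As for why LEN() returns 12 and LEFT(..., 11) produces a half-emoji in the question: NVARCHAR stores UTF-16, where 😃 takes two code units (a surrogate pair), and under a collation without supplementary-character support the string functions count code units. The Python snippet below is only an illustration of that encoding, not of SQL Server itself; the helper name `utf16_units` is made up.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]

s = "Thank you \U0001F603"   # "Thank you 😃"
units = utf16_units(s)

print(len(s))                # 11 code points
print(len(units))            # 12 UTF-16 code units, matching LEN() in the question

# Keeping only the first 11 code units cuts the surrogate pair in half,
# leaving a lone high surrogate at the end of the string.
truncated = units[:11]
print(hex(truncated[-1]))    # 0xd83d
```

The lone 0xD83D is not a valid character on its own, which is why it displays as �.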

tsql - extract only a portion of a dynamic string

I have a table with a column that holds a string such as:
xx-xx-xxx-84
xx-25-xxx-xx
xx-xx-123-xx
I want to query out the numbers only, but as you can see, the numbers are at a different position in the string every time. Is there a way to query for only the numbers in the string?
Thank you,
I appreciate your time!
This requires repeated application of string functions. One method that helps with all the nesting is using OUTER APPLY. Something like this:
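select t3.col3
from t outer apply
-- position of the first digit (pattern first, column second)
(select patindex('%[0-9]%', t.col) as numpos) t1 outer apply
-- everything from the first digit to the end of the string
(select substring(t.col, t1.numpos, len(t.col)) as col2) t2 outer apply
-- cut off whatever follows the next '-', if there is one
(select (case when t2.col2 like '%-%'
then left(t2.col2, charindex('-', t2.col2) - 1)
else t2.col2
end) as col3
) t3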
The easy way (safe only if the strings contain nothing but 'x', '-' and digits):
SELECT REPLACE(REPLACE(s,'x',''),'-','') FROM T
Or, if x can be any non-digit character, use the PATINDEX() function:
SELECT S, SUBSTRING(S,
PATINDEX('%[0-9]%',s),
PATINDEX('%[0-9][^0-9]%',s)
+PATINDEX('%[0-9]',s)
+1
-PATINDEX('%[0-9]%',s)) as digit
FROM T
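To check what the PATINDEX arithmetic is expected to return, the same "first run of digits" rule can be sketched in Python with a regular expression. This is just a reference for the intended result on the sample strings from the question, not a replacement for the T-SQL; the function name `first_number` is made up.

```python
import re

def first_number(text):
    """Return the first run of consecutive digits in text, or None if there is none."""
    match = re.search(r"[0-9]+", text)
    return match.group(0) if match else None

for value in ["xx-xx-xxx-84", "xx-25-xxx-xx", "xx-xx-123-xx"]:
    print(value, "->", first_number(value))
```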

Cast varchar that holds some strings to integer field in informix

I have 2 columns from 2 tables in a database that I want to compare.
Column1 is on table1 and is an Integer field with entries like the following
column1
147518
187146
169592
Column2 is on table2 and is a Varchar(15) field with various entries, but for this example let's use these 3:
column2
169592
00010000089
DummyId
For my query part of it relies on checking if rows from table1 are linked to the rows in table2, but to do this, I need to compare column1 and column2.
SELECT * FROM table1 WHERE column1 IN (SELECT column2 FROM table2)
The result of this using the data above should be 1 row - 169592
Obviously this won't work (A character to numeric conversion process failed), as the two columns cannot be compared as is, but how do I get the comparison to work?
I have tried
SELECT * FROM table1 WHERE column1 IN (SELECT CAST(column2 AS INTEGER) FROM table2)
and
SELECT * FROM table1 WHERE column1 IN (SELECT (column2::INTEGER) column2 FROM table2)
Using Server Studio 9.1 if that helps.
Try casting the int to a string:
SELECT * FROM table1 WHERE cast(column1 as varchar(15)) IN (SELECT column2 FROM table2)
You can try to use ISNUMERIC as follows:
SELECT * FROM table1 WHERE column1 IN (SELECT CASE WHEN ISNUMERIC(column2) = 1 THEN CAST(column2 AS INT) END FROM table2)
For this purpose there is no need to create a special function that you won't find in other environments.
Let's create a test case for your example:
CREATE TABLE tab1 (
col1 INT,
col2 INT
);
CREATE TABLE tab2 (
col1 VARCHAR(15)
);
INSERT INTO tab1 VALUES(147518,1);
INSERT INTO tab1 VALUES(187146,2);
INSERT INTO tab1 VALUES(169592,3);
INSERT INTO tab2 VALUES('169592');
INSERT INTO tab2 VALUES('00010000089');
INSERT INTO tab2 VALUES('DummyId');
The first query you ran was like:
SELECT t1.*
FROM tab1 AS t1
WHERE t1.col1 IN (SELECT t2.col1 FROM tab2 AS t2);
This will raise an error because it tries to compare an INT with a VARCHAR
[infx1210@tardis ~]$ finderr 1213
-1213 A character to numeric conversion process failed.
A character value is being converted to numeric form for storage in a
numeric column or variable. However, the character string cannot be
interpreted as a number. It contains some characters other than white
space, digits, a sign, a decimal, or the letter e; or the parts are in
the wrong order, so the number cannot be deciphered.
If you are using NLS, the decimal character or thousands separator
might be wrong for your locale.
[infx1210@tardis ~]$
Then you tried to cast the VARCHAR to an INT, which resulted in the same error. You should try it the other way around:
> SELECT t1.*
> FROM tab1 AS t1
> WHERE t1.col1::CHAR(11) IN (SELECT t2.col1 FROM tab2 AS t2);
>
col1 col2
169592 3
1 row(s) retrieved.
>
Also check whether you get faster results using EXISTS:
> SELECT t1.*
> FROM tab1 AS t1
> WHERE EXISTS (
> SELECT 1
> FROM tab2 AS t2
> WHERE t1.col1::CHAR(11) = t2.col1
> );
col1 col2
169592 3
1 row(s) retrieved.
>
Another way possible is to just join the tables:
> SELECT t1.*
> FROM tab1 AS t1
> INNER JOIN tab2 AS t2
> ON (t1.col1::CHAR(11) = t2.col1);
col1 col2
169592 3
1 row(s) retrieved.
>
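The idea behind all three queries is the same: turn the integer into text and compare text with text, so the non-numeric entries never need to be converted. A small Python sketch of that logic, using the sample values from the question (the function name `linked_rows` is made up):

```python
def linked_rows(int_values, text_values):
    """Return the integers whose text form appears among the text values."""
    text_set = set(text_values)
    # Convert the number to text, never the text to a number,
    # so entries like 'DummyId' cannot cause a conversion error.
    return [value for value in int_values if str(value) in text_set]

column1 = [147518, 187146, 169592]
column2 = ["169592", "00010000089", "DummyId"]

print(linked_rows(column1, column2))  # [169592]
```

One consequence of comparing as text is that leading zeros are significant: '00010000089' would not match an integer 10000089.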
Part of this question was answered by @Stanislovas Kalašnikovas, who suggested the following:
SELECT * FROM table1 WHERE column1 IN (SELECT CASE WHEN ISNUMERIC(column2) = 1 THEN CAST(column2 AS INT) END FROM table2)
But Informix does not have a built-in ISNUMERIC function, so I created one as follows:
create function isnumeric2(inputstr varchar(15)) returning integer;
define numeric_var decimal(15,0);
define function_rtn integer;
on exception in (-1213)
let function_rtn = 0;
end exception with resume
let function_rtn = 1;
let numeric_var = inputstr;
return function_rtn;
end function;
And then the first query above worked for me.
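The function works by attempting the conversion and catching the failure. For comparison, the same pattern in Python looks roughly like this; it is only a sketch of the control flow, and Python's `int()` parsing rules are not identical to Informix's (the names `is_numeric` and `matches` are made up).

```python
def is_numeric(text):
    """Return 1 if text converts to an integer, else 0 (mirrors the exception handler)."""
    try:
        int(text)
    except ValueError:
        return 0
    return 1

column1 = [147518, 187146, 169592]
column2 = ["169592", "00010000089", "DummyId"]

# Convert only the values that pass the check, then compare as numbers.
numeric_values = {int(text) for text in column2 if is_numeric(text) == 1}
matches = [value for value in column1 if value in numeric_values]
print(matches)  # [169592]
```

Unlike the cast-to-text approach, comparing as numbers ignores leading zeros, so '00010000089' would match 10000089.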

T-SQL Comma delimited value from resultset to in clause in Subquery

I have an issue where in my data I will have a record returned where a column value will look like
-- query
Select col1 from myTable where id = 23
-- result of col1
111, 104, 34, 45
I want to feed these values to an in clause. So far I have tried:
-- Query 2 -- try 1
Select * from mytableTwo
where myfield in (
SELECT col1
from myTable where id = 23)
-- Query 2 -- try 2
Select * from mytableTwo
where myfield in (
SELECT '''' +
Replace(col1, ',', ''',''') + ''''
from myTable where id = 23)
-- query 2 test -- This works and will return data, so I verify here that data exists
Select * from mytableTwo
where myfield in ('111', '104', '34', '45')
Why aren't query 2 try 1 or 2 working?
You don't want an in clause. You want to use like:
select *
from myTableTwo t2
where exists (select 1
from myTable t
where id = 23 and
', '+t.col1+', ' like '%, '+t2.myfield+', %'
);
This uses like for the comparison in the list. It uses a subquery for the value. You could also phrase this as a join by doing:
select t2.*
from myTableTwo t2 join
myTable t
on t.id = 23 and
', '+t.col1+', ' like '%, '+t2.myfield+', %';
However, this could multiply the number of rows in the output if there is more than one row with id = 23 in myTable.
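The reason for wrapping both sides in ', ' is to match whole list items only. A quick Python sketch of the same trick, assuming the list is always separated by a comma followed by a space as in the question (the function name `in_delimited_list` is made up):

```python
def in_delimited_list(value, csv_text, sep=", "):
    """Return True if value is a whole item of the delimited string csv_text."""
    # Padding both strings with the separator means '4' cannot match inside '34' or '45'.
    return f"{sep}{value}{sep}" in f"{sep}{csv_text}{sep}"

col1 = "111, 104, 34, 45"
print(in_delimited_list("34", col1))   # True
print(in_delimited_list("4", col1))    # False
print(in_delimited_list("11", col1))   # False
```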
If you look closely, in both Query 2 -- try 1 and Query 2 -- try 2 the subquery returns a single string value,
like this:
WHERE myfield in ('111, 104, 34, 45')
which is not the same as:
WHERE myfield in ('111', '104', '34', '45')
So, if you intend to filter mytableTwo rows by the values stored in myTable, you need to extract the values from the column into a table variable or table-valued function and filter on that.
I have created a table-valued function which takes a comma-separated string and returns a table.
You can refer to it here: T-SQL : Comma separated values to table
Final code to filter the data :
DECLARE @filteredIds VARCHAR(100)
-- Get the filter data
SELECT @filteredIds = col1
FROM myTable WHERE id = 23
-- TODO : Get the script for [dbo].[GetDelimitedStringToTable]
-- from the given link and execute before this
SELECT *
FROM mytableTwo T
CROSS APPLY [dbo].[GetDelimitedStringToTable] ( @filteredIds, ',') F
-- LTRIM removes the space that follows each comma in the stored list
WHERE T.myfield = LTRIM(F.Value)
Please let me know If this helps you!
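In case the linked function is unavailable, the split-then-filter idea itself is simple. Here is a rough Python sketch of it; the rows in `mytable_two` are hypothetical sample data and `split_ids` is a made-up name, not the linked function.

```python
def split_ids(csv_text, sep=","):
    """Split a delimited string into trimmed, non-empty items."""
    return [part.strip() for part in csv_text.split(sep) if part.strip()]

# Hypothetical rows standing in for mytableTwo
mytable_two = [{"myfield": "111"}, {"myfield": "4"}, {"myfield": "45"}]

ids = set(split_ids("111, 104, 34, 45"))
filtered = [row for row in mytable_two if row["myfield"] in ids]
print(filtered)
```

Trimming each item matters here because the stored list has a space after every comma.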
I suppose col1 is a character type whose value looks like '111, 104, 34, 45'. If that is your situation, it's not ideal (a denormalized database), but you can still relate these tables by using character operators like LIKE or CHARINDEX. The only gotcha is that you must convert the numeric column to character explicitly: when a character value is compared with a numeric one, the character value is implicitly converted to numeric, which causes a conversion error here.
Since @Gordon responded using LIKE, I present a solution using CHARINDEX:
SELECT *
FROM mytableTwo tb2
WHERE EXISTS (
SELECT 'x'
FROM myTable tb1
WHERE tb1.id = 23
AND CHARINDEX(', ' + CONVERT(VARCHAR(20), tb2.myfield) + ', ', ', ' + tb1.col1 + ', ') > 0
)

sql zero leading in comparing 2 record set

I have a problem comparing 2 sets of data with and without leading zeros.
In Inventory_NBP, the value is stored as 000000000000909120.
However, in PDM_Analysis, only 909120 exists, without the leading zeros.
The current query below fails to retrieve either 000000000000909120 or 909120 because the "IN" condition is not met.
How do I modify the query below to fulfill my requirement?
sel * FROM Inventory_NBP.v_dmnd_rsrv_dpnd_rqr_mrp
WHERE plnt_id ='WA01'
and mtrl_id in('G29329-001', '000000000000909120', '13-0006-001')
and
(mtrl_id, plnt_id)
IN
( SELECT itm_cd, sap_plnt_cd
FROM PDM_Analysis.v_itm_plnt_extn
)
Try casting. This may solve your problem:
SELECT * FROM Inventory_NBP.v_dmnd_rsrv_dpnd_rqr_mrp
WHERE plnt_id ='WA01'
and mtrl_id in('G29329-001',CAST(CAST('000000000000909120' AS INT) AS VARCHAR(10)), '13-0006-001')
and
(mtrl_id, plnt_id)
IN
( SELECT itm_cd, sap_plnt_cd
FROM PDM_Analysis.v_itm_plnt_extn
)
Here's an example of how to match values regardless of leading zeros by padding (SQL Server 2008 syntax):
WITH T1
AS
(
SELECT *
FROM (
VALUES ('000000000000909120'),
('00000000099'),
('000000055'),
('22'),
('152')
) AS T (data_col)
),
T2
AS
(
SELECT *
FROM (
VALUES ('909120'),
('99'),
('00055'),
('0000000022'),
('152')
) AS T (data_col)
)
SELECT *
FROM T1 INNER JOIN T2
ON T1.data_col
= REPLICATE('0', LEN(T1.data_col) - LEN(T2.data_col))
+ T2.data_col
UNION
SELECT *
FROM T1 INNER JOIN T2
ON T2.data_col
= REPLICATE('0', LEN(T2.data_col) - LEN(T1.data_col))
+ T1.data_col;
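The padding logic in the query above can be sketched in Python to make the expected matches easy to verify. This uses the same sample values; the function name `same_ignoring_leading_zeros` is made up, and the sketch pads both values to the longer length in one step instead of using two joins and a UNION.

```python
def same_ignoring_leading_zeros(a, b):
    """Compare two digit strings after left-padding the shorter one with zeros."""
    width = max(len(a), len(b))
    return a.rjust(width, "0") == b.rjust(width, "0")

t1 = ["000000000000909120", "00000000099", "000000055", "22", "152"]
t2 = ["909120", "99", "00055", "0000000022", "152"]

pairs = [(a, b) for a in t1 for b in t2 if same_ignoring_leading_zeros(a, b)]
for pair in pairs:
    print(pair)
```

Each value in t1 pairs with exactly one value in t2, five pairs in total.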