Search for a particular value in a string with commas - sql

I have a TEXT column in my Table T and contains some values separated by Commas.
Example
Columns BNFT has text values such as
B20,B30,B3,B13,B31,B14,B25,B29,B1,B2,B4,B5
OR
B1,B2,B34,B31,B8,B4,B5,B33,B30,B20,B3
I want to return result in my query only if B3 is present.
It should not consider B30-B39 or B[1-9]3 (i.e. B13, B23 .... B93).
I tried with below query, but want to implement REGEXP or REGEXP_LIKE/INSTR etc. Haven't used them before and unable to understand also.
Select *
FROM T
Where BNFT LIKE '%B3,%' or BNFT LIKE '%B3'
Pls advise
Procedures will not work. Query must start with Select as 1st statement.

The first advice is to fix your data structure. Storing lists of ids in strings is a bad idea:
You are storing numbers as strings. That is the wrong representation.
You are storing multiple values in a string column. That is not using SQL correctly.
These values are probably ids. You cannot declare proper foreign key relationships.
SQL does not have particularly strong string functions.
The resulting query cannot take advantage of indexes.
That said, sometimes we are stuck with other people's bad design decisions.
In SQL Server, you would do:
where ',' + BNFT + ',' LIKE '%,33,%'
This question was originally tagged MySQL, which offers find_in_set() for this purpose:
Where find_in_set(33, BNFT) > 0

Select *
FROM T
Where ',' + BNFT + ',' LIKE '%,B3,%';
or
Select *
FROM T
Where CHARINDEX (',B3,',',' + BNFT + ',') > 0;

This can be easily achieve by CTE, REGEXP/REGEXP_Like/INSTR works better with oracle, for MS SQL Server you can try this
DECLARE #CSV VARCHAR(100) ='B2,B34,B31,B8,B4,B5,B33,B30,B20,B3';
SET #CSV = #CSV+',';
WITH CTE AS
(
SELECT SUBSTRING(#CSV,1,CHARINDEX(',',#CSV,1)-1) AS VAL, SUBSTRING(#CSV,CHARINDEX(',',#CSV,1)+1,LEN(#CSV)) AS REM
UNION ALL
SELECT SUBSTRING(A.REM,1,CHARINDEX(',',A.REM,1)-1)AS VAL, SUBSTRING(A.REM,CHARINDEX(',',A.REM,1)+1,LEN(A.REM))
FROM CTE A WHERE LEN(A.REM)>=1
) SELECT VAL FROM CTE
WHERE VAL='B3'

Related

STRING_AGG substitute for SQL Server 2008 [duplicate]

Can anyone help me make this query work for SQL Server 2014?
This is working on Postgresql and probably on SQL Server 2017. On Oracle it is listagg instead of string_agg.
Here is the SQL:
select
string_agg(t.id,',') AS id
from
Table t
I checked on the site some xml option should be used but I could not understand it.
In SQL Server pre-2017, you can do:
select stuff( (select ',' + cast(t.id as varchar(max))
from tabel t
for xml path ('')
), 1, 1, ''
);
The only purpose of stuff() is to remove the initial comma. The work is being done by for xml path.
Note that for some characters, the values will be escaped when using FOR XML PATH, for example:
SELECT STUFF((SELECT ',' + V.String
FROM (VALUES('7 > 5'),('Salt & pepper'),('2
lines'))V(String)
FOR XML PATH('')),1,1,'');
This returns the string below:
7 > 5,Salt & pepper,2
lines'
This is unlikely desired. You can get around this using TYPE and then getting the value of the XML:
SELECT STUFF((SELECT ',' + V.String
FROM (VALUES('7 > 5'),('Salt & pepper'),('2
lines'))V(String)
FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'');
This returns the string below:
7 > 5,Salt & pepper,2
lines
This would replicate the behaviour of the following:
SELECT STRING_AGG(V.String,',')
FROM VALUES('7 > 5'),('Salt & pepper'),('2
lines'))V(String);
Of course, there might be times where you want to group the data, which the above doesn't demonstrate. To achieve this you would need to use a correlated subquery. Take the following sample data:
CREATE TABLE dbo.MyTable (ID int IDENTITY(1,1),
GroupID int,
SomeCharacter char(1));
INSERT INTO dbo.MyTable (GroupID, SomeCharacter)
VALUES (1,'A'), (1,'B'), (1,'D'),
(2,'C'), (2,NULL), (2,'Z');
From this wanted the below results:
GroupID
Characters
1
A,B,D
2
C,Z
To achieve this you would need to do something like this:
SELECT MT.GroupID,
STUFF((SELECT ',' + sq.SomeCharacter
FROM dbo.MyTable sq
WHERE sq.GroupID = MT.GroupID --This is your correlated join and should be on the same columns as your GROUP BY
--You "JOIN" on the columns that would have been in the PARTITION BY
FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'')
FROM dbo.MyTable MT
GROUP BY MT.GroupID; --I use GROUP BY rather than DISTINCT as we are technically aggregating here
So, if you were grouping on 2 columns, then you would have 2 clauses your sub query's WHERE: WHERE MT.SomeColumn = sq.SomeColumn AND MT.AnotherColumn = sq.AnotherColumn, and your outer GROUP BY would be GROUP BY MT.SomeColumn, MT.AnotherColumn.
Finally, let's add an ORDER BY into this, which you also define in the subquery. Let's, for example, assume you wanted to sort the data by the value of the ID descending in the string aggregation:
SELECT MT.GroupID,
STUFF((SELECT ',' + sq.SomeCharacter
FROM dbo.MyTable sq
WHERE sq.GroupID = MT.GroupID
ORDER BY sq.ID DESC --This is identical to the ORDER BY you would have in your OVER clause
FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'')
FROM dbo.MyTable MT
GROUP BY MT.GroupID;
For would produce the following results:
GroupID
Characters
1
D,B,A
2
Z,C
Unsurprisingly, this will never be as efficient as a STRING_AGG, due to having the reference the table multiple times (if you need to perform multiple aggregations, then you need multiple sub queries), but a well indexed table will greatly help the RDBMS. If performance really is a problem, because you're doing multiple string aggregations in a single query, then I would either suggest you need to reconsider if you need the aggregation, or it's about time you conisidered upgrading.

DB2 efficient select query with like operator for many values (~200)

I have written the following query:
SELECT TBSPACE FROM SYSCAT.TABLES WHERE TYPE='T' AND (TABNAME LIKE '%_ABS_%' OR TABNAME LIKE '%_ACCT_%')
This gives me a certain amount of results. Now the problem is that I have multiple TABNAME to select using the LIKE operator (~200). Is there an efficient way to write the query for the 200 values without repeating the TABNAME LIKE part (because there are 200 such values which would result in a really huge query) ?
(If it helps, I have stored all required TABNAME values in a table TS to retrieve from)
If you are just looking for substrings, you could use LOCATE. E.g.
WITH SS(S) AS (
VALUES
('_ABS_')
, ('_ACCT_')
)
SELECT DISTINCT
TABNAME
FROM
SYSCAT.TABLES, SS
WHERE
TYPE='T'
AND LOCATE(S,TABNAME) > 0
or if your substrings are in table CREATE TABLE TS(S VARCHAR(64))
SELECT DISTINCT
TABNAME
FROM
SYSCAT.TABLES, TS
WHERE
TYPE='T'
AND LOCATE(S,TABNAME) > 0
You could try REGEXP_LIKE. E.g.
SELECT DISTINCT
TABNAME
FROM
SYSCAT.TABLES
WHERE
TYPE='T'
AND REGEXP_LIKE(TABNAME,'.*_((ABS)|(ACCT))_.*')
Just in case.
Note, that the '_' character has special meaning in a pattern-expression of the LIKE predicate:
The underscore character (_) represents any single character.
The percent sign (%) represents a string of zero or more characters.
Any other character represents itself.
So, if you really need to find _ABS_ substring, you should use something like below.
You get both rows in the result, if you use the commented out pattern instead, which may not be desired.
with
pattern (str) as (values
'%\_ABS\_%'
--'%_ABS_%'
)
, tables (tabname) as (values
'A*ABS*A'
, 'A_ABS_A'
)
select tabname
from tables t
where exists (
select 1
from pattern p
where t.tabname like p.str escape '\'
);

Finding max value for a column containing hierarchical decimals

I have a table where the column values are like '1.2.4.5', '3.11.0.6',
'3.9.3.14','1.4.5.6.7', N/A, etc.. I want to find the max of that particular column. However when i use this query i am not getting the max value.
(SELECT max (CASE WHEN mycolumn = 'N/A'
THEN '-1000'
ELSE mycolumn
END )
FROM mytable
WHERE column like 'abc')
I am getting 3.9.3.14 as max value instead of 3.11....
Can someone help me?
Those aren't really decimals - they're strings containing multiple dots, so it's unhelpful to think of them as being "decimals".
We can accomplish your query with a bit of manipulation. There is a type build into SQL Server that more naturally represents this type of structure - hierarchyid. If we convert your values to this type then we can find the MAX fairly easily:
declare #t table (val varchar(93) not null)
insert into #t(val) values
('1.2.4.5'),
('3.11.0.6'),
('3.9.3.14'),
('1.4.5.6.7')
select MAX(CONVERT(hierarchyid,'/' + REPLACE(val,'.','/') + '/')).ToString()
from #t
Result:
/3/11/0/6/
I leave the exercise of fully converting this string representation back into the original form as an exercise for the reader. Alternatively, I'd suggest that you may want to start storing your data using this datatype anyway.
MAX() on values stored as text performs an alphabetic sort.
Use FIRST_VALUE and HIERARCHYID:
SELECT DISTINCT FIRST_VALUE(t.mycolumn) OVER(
ORDER BY CONVERT(HIERARCHYID, '/' + REPLACE(NULLIF(t.mycolumn,'N/A'), '.', '/') + '/') DESC) AS [Max]
FROM #mytable t

How do I return rows if concatenated value is within another set of values?

This may be a dumb question, but I am trying to concatenate values from two columns and then return rows where those concatenated values are in another set.
My question is, what is the best method to do this? I've done this in the past using an IN statement, but if I have a lot of values this time around.
Would I be better off creating a new table with my comparison values and get my results using some kind of relationship statement? Something like that below?
SELECT Users.ID + Users.SubID AS UniqueID, TempVals.ID
FROM Users, TempVals
WHERE Users.ID + Users.SubID = TempVals.ID
Thanks!
What you have should work fine assuming it's MySQL and the data types of the columns being concatenated are strings/character-types - no worries, you should be able to cast them.
Another way:
SELECT DISTINCT U.ID + U.SubID AS UniqueID
FROM Users as U
WHERE (U.ID + U.SubID) IN (
SELECT DISTINCT T.ID
FROM TempVals as T
)
Also, the type of database will determine the method of concatenation:
SQL Server: +
Oracle: CONCAT()
MySQL: CONCAT()
Reference: Using IN Statement - SQL

SQL Server query brings unmatched data with BETWEEN filter

I'm querying on my products table for all products with code between a range of codes, and the result brings a row that should't be there.
This is my SQL query:
select prdcod
from products
where prdcod between 'F-DH1' and 'F-FMS'
order by prdcod
and the results of this query are:
F-DH1
F-DH2
F-DH3
FET-RAZ <-- What is this value doing here!?
F-FMC
F-FML
F-FMS
How can this odd value make it's way into the query results?
PS: I get the same results if I use <= and >= instead of between.
According to OP request promoted next comment to answer:
Seems like your collation excludes '-' sign - this way results make sense, FE is between FD and FM.
:)
between and >= and <= are primarily used for numeric operations (including dates). You're trying to use this for strings, which are difficult at best to determine how those operators will interpret the each string.
Now, while I think I understand your goal here, I'm not entirely sure it's possible using SQL Server queries. This may be some business logic (thanks to the product codes) that needs implemented in code. Something like the Entity Framework or Linq-to-SQL may be better suited to get you the data you're looking for.
How about adding AND LEFT(prdcod, 2) = 'F-'?
Try replacing the "-" with a space so the order is what you would expect:
DECLARE #list table(word varchar(50))
--create list
INSERT INTO #list
SELECT 'F-DH1'
UNION ALL
SELECT 'F-DH2'
UNION ALL
SELECT 'F-DH3'
UNION ALL
SELECT 'FET-RAZ'
UNION ALL
SELECT 'F-FMC'
UNION ALL
SELECT 'F-FML'
UNION ALL
SELECT 'F-FMS'
--original order
SELECT * FROM #list order by word
--show how order changes
SELECT *,replace(word,'-',' ') FROM #list order by replace(word,'-',' ')
--show between condition
SELECT * FROM #list where replace(word,'-',' ') between 'F DH1' and 'F FMS'