Compare lists in a SQL WHERE condition

I have a question about SQL Server 2012. I have a table containing a field that stores which systems use each piece of information, separated by ','. I want to pass the system names in a parameter and query the related rows:
declare @System nvarchar(50)
set @System = 'BPM,SEM'
SELECT *
FROM dbo.tblMeasureCatalog t1
where ( ( select Upper(value) from dbo.split(t1.System,','))
= any( select Upper(value) from dbo.split(@System,',')))
dbo.split is a function that returns the systems as separate rows.

Forgetting for a second that storing delimited lists in a relational database is abhorrent, you can do it using a combination of INTERSECT and EXISTS, for example:
DECLARE @System NVARCHAR(50) = 'BPM,SEM';
DECLARE @tblMeasureCatalog TABLE (System VARCHAR(MAX));
INSERT @tblMeasureCatalog VALUES ('BPM,XXX'), ('BPM,SEM'), ('XXX,SEM'), ('XXX,YYY');
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE EXISTS
    (   SELECT Value
        FROM dbo.Split(mc.System, ',')
        INTERSECT
        SELECT Value
        FROM dbo.Split(@System, ',')
    );
Returns
System
---------
BPM,XXX
BPM,SEM
XXX,SEM
EDIT
Based on your question stating "Any", I assumed that you wanted rows where the terms matched any of those provided. Based on your comment, I now assume you want records where the terms match all. This is a fairly similar approach, but you need to use NOT EXISTS and EXCEPT instead.
Now, "all" is still quite ambiguous. For example, if you search for "BPM,SEM", should it return a record that is "BPM,SEM,YYY"? It does contain all of the searched terms, but it contains additional terms too. So the approach you need depends on your requirements:
DECLARE @System NVARCHAR(50) = 'BPM,SEM,XXX';
DECLARE @tblMeasureCatalog TABLE (System VARCHAR(MAX));
INSERT @tblMeasureCatalog
VALUES
    ('BPM,XXX'), ('BPM,SEM'), ('XXX,SEM'), ('XXX,YYY'),
    ('SEM,BPM'), ('SEM,BPM,XXX'), ('SEM,BPM,XXX,YYY');
-- METHOD 1 - CONTAINS ALL SEARCHED TERMS BUT CAN CONTAIN ADDITIONAL TERMS
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE NOT EXISTS
    (   SELECT Value
        FROM dbo.Split(@System, ',')
        EXCEPT
        SELECT Value
        FROM dbo.Split(mc.System, ',')
    );
-- METHOD 2 - ONLY CONTAINS ITEMS WITHIN THE SEARCHED TERMS, BUT NOT
-- NECESSARILY ALL OF THEM
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE NOT EXISTS
    (   SELECT Value
        FROM dbo.Split(mc.System, ',')
        EXCEPT
        SELECT Value
        FROM dbo.Split(@System, ',')
    );
-- METHOD 3 - CONTAINS ALL ITEMS IN THE SEARCHED TERMS, AND NO ADDITIONAL ITEMS
SELECT mc.System
FROM @tblMeasureCatalog AS mc
WHERE NOT EXISTS
    (   SELECT Value
        FROM dbo.Split(@System, ',')
        EXCEPT
        SELECT Value
        FROM dbo.Split(mc.System, ',')
    )
AND LEN(mc.System) = LEN(@System);
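The set logic behind the ANY query and the three methods can be checked outside the database. Below is a minimal Python sketch, assuming terms are compared case-sensitively and carry no surrounding whitespace. `split_terms` and `classify` are hypothetical helpers standing in for dbo.Split and the WHERE clauses. Note that it tests Method 3 with plain set equality rather than the LEN comparison used in the SQL.

```python
def split_terms(csv):
    """Split a comma-delimited list into a set of terms (stand-in for dbo.Split)."""
    return {term for term in csv.split(",") if term}


def classify(row, search):
    """Evaluate each matching rule for one stored list against the search list."""
    row_terms, search_terms = split_terms(row), split_terms(search)
    return {
        # EXISTS (row INTERSECT search): at least one shared term
        "any": bool(row_terms & search_terms),
        # Method 1: NOT EXISTS (search EXCEPT row): row has every searched term
        "contains_all": search_terms <= row_terms,
        # Method 2: NOT EXISTS (row EXCEPT search): row has nothing outside the search
        "only_searched": row_terms <= search_terms,
        # Method 3 (assumption: set equality instead of the LEN check)
        "exact": row_terms == search_terms,
    }


rows = ["BPM,XXX", "BPM,SEM", "XXX,SEM", "XXX,YYY",
        "SEM,BPM", "SEM,BPM,XXX", "SEM,BPM,XXX,YYY"]
search = "BPM,SEM,XXX"

method1 = [r for r in rows if classify(r, search)["contains_all"]]
method2 = [r for r in rows if classify(r, search)["only_searched"]]
method3 = [r for r in rows if classify(r, search)["exact"]]
```

With the sample rows, Method 1 keeps the two rows that include BPM, SEM and XXX, Method 2 keeps every row except those containing YYY, and Method 3 keeps only 'SEM,BPM,XXX'.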

You have a problem with your data structure because you are storing lists of things in a comma-delimited list. SQL has a great data structure for storing lists. It goes by the name "table". You should have a junction table with one row per "measure catalog" and "system".
Sometimes, you are stuck with other people's really bad design decisions. One solution is to use split(). Here is one method:
select mc.*
from dbo.tblMeasureCatalog mc
where exists (select 1
              from dbo.split(mc.System, ',') mcs join
                   dbo.split(@System, ',') ss
                   on upper(mcs.value) = upper(ss.value)
             );

You can try this:
declare @System nvarchar(50)
set @System = 'BPM,SEM'
SELECT * from dbo.tblMeasureCatalog t1 inner join dbo.Split (@System ,',') B on t1.it=B.items


Redshift - Extract value matching a condition in Array

I have a Redshift table with the following column
How can I extract the value starting with cat_ from this column, please? (There is only one per row, at a different position in the array each time.)
I want to get those results:
cat_incident
cat_feature_missing
cat_duplicated_request
Thanks!
There is no easy way to extract multiple values from within one column in SQL (or at least not in the SQL used by Redshift).
You could write a User-Defined Function (UDF) that returns a string containing those values, separated by newlines. Whether this is acceptable depends on what you wish to do with the output (eg JOIN against it).
Another option is to pre-process the data before it is loaded into Redshift, to put this information in a separate one-to-many table, with each value in its own row. It would then be trivial to return this information.
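The pre-processing idea could look roughly like the following Python sketch, run before the load. It assumes each source record has an identifier and a plain comma-separated tag string. The sample rows, the row ids and the helper name `explode_tags` are made up for illustration.

```python
def explode_tags(rows):
    """Turn (row_id, 'a,b,c') pairs into one (row_id, tag) pair per tag."""
    return [
        (row_id, tag.strip())
        for row_id, tags in rows
        for tag in tags.split(",")
        if tag.strip()
    ]


# Hypothetical source records: (row_id, comma-separated tags)
source = [
    (1, "blah,cat_incident,mcr_close_ticket"),
    (2, "blah-blah,cat_feature_missing"),
]

one_to_many = explode_tags(source)  # rows for the separate one-to-many table
cat_tags = [(row_id, tag) for row_id, tag in one_to_many if tag.startswith("cat_")]
```

Once the tags are stored one per row, the lookup in the database reduces to a simple filter on the tag column.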
You can do this using a tally table (a table of numbers). See this link for information on how to create one: http://www.sqlservercentral.com/articles/T-SQL/62867/
Here is an example of how you would use it. In real life you should replace the temporary #tally table with a permanent one.
--create sample table with data
create table #a (tags varchar(500));
insert into #a
select 'blah,cat_incident,mcr_close_ticket'
union
select 'blah-blah,cat_feature_missing,cat_duplicated_request';
--create tally table
create table #tally(n int);
insert into #tally
select 1
union select 2
union select 3
union select 4
union select 5
;
--get tags
select * from
(
select TRIM(SPLIT_PART(a.tags, ',', t.n)) AS single_tag
from #tally t
inner join #a a ON t.n <= REGEXP_COUNT(a.tags, ',') + 1 and n<1000
)
where single_tag like 'cat%'
;
Thanks!
In the end I managed to do it with the following query:
SELECT SUBSTRING(SUBSTRING(tags, charindex('cat_', tags), len(tags)), 0, charindex(',', SUBSTRING(tags, charindex('cat_', tags), len(tags)))) tags
FROM table

How to detect numbers in string in many observations?

I would like to look up IDs by the numbers that appear in Name. I use this code:
select Name, ID
from [my_table]
where [Name] like ('%000234%')
But I need this to work for many names. I tried Name like in (000234, 000235, ...), but it doesn't work. Is it possible to get the whole list of IDs by searching on Names?
If you want to select names for a specific range of numbers:
select Name, ID
from [my_table]
where [Name] like '%00023[4-9]%'
This matches names containing 000234 through 000239.
For other wild card reference
Insert the values into a table variable. Then you can join it with the main table.
DECLARE @TempTable TABLE (TempName NVARCHAR(10))
INSERT INTO @TempTable
VALUES
    ('000234'),
    ('000235')
SELECT
    A.Name,
    A.ID
FROM
    [my_table] A INNER JOIN
    @TempTable B ON A.Name LIKE '%' + B.TempName + '%'
Added test table for clarity.
CREATE TABLE #Test (
NameString varchar(25) )
INSERT INTO #Test (NameString)
SELECT 'Mike387592'
UNION ALL
SELECT 'Nancy2387'
UNION ALL
SELECT 'Tim0088297234';
WITH CTE AS (
SELECT
NameString
,PATINDEX('%[0-9]%', NameString) AS [Start]
,LEN(NameString) AS [End]
FROM #Test )
SELECT
NameString
,SUBSTRING(NameString, [Start], [End])
FROM CTE
If you have a slew of names in a table that have nonsequential numbers that don't have a pattern shared amongst them, one way to gather that data is to use an IN predicate and to dynamically construct your query.
For example, say you want names with the numbers 000412, 001523 and 001687.
You would dynamically generate a query like this:
SELECT Name, ID
FROM [my_table]
WHERE Name IN ( '000412', '001523', '001687' )
Dynamic queries can be generated in the database or by the software calling the database, but they are generally discouraged because they can pose a security risk (SQL injection) and aren't reusable. Nevertheless, this is an option.
If you must use the LIKE predicate because there are other characters surrounding your numeric string, something like this would be the generated query:
SELECT Name, ID
FROM [my_table]
WHERE Name LIKE '%000412%' OR
Name LIKE '%001523%' OR
Name LIKE '%001687%'
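One way to keep a generated query like this safe is to generate only the placeholders and pass the values as bound parameters. The sketch below uses Python's sqlite3 module as a stand-in database. The table contents are made up, `find_ids` is a hypothetical helper, and the `?` placeholder style is what sqlite3 uses (other drivers may differ).

```python
import sqlite3


def find_ids(conn, numbers):
    """Return (Name, ID) rows whose Name contains any of the given digit strings."""
    if not numbers:
        return []
    # Only the number of placeholders depends on the input; the values are bound.
    where = " OR ".join("Name LIKE ?" for _ in numbers)
    sql = f"SELECT Name, ID FROM my_table WHERE {where} ORDER BY ID"
    params = [f"%{n}%" for n in numbers]
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (Name TEXT, ID INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("Mike000412", 1), ("Nancy001523x", 2), ("Tim009999", 3)],  # made-up rows
)

matches = find_ids(conn, ["000412", "001523", "001687"])
```

The search strings never become part of the SQL text, so a value containing a quote cannot change the statement.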

Check if a list of items already exists in a SQL database

I want to create a group of users only if the same group does not exist already in the database.
I have a GroupUser table with three columns: a primary key, a GroupId, and a UserId. A group of users is described as several lines in this table sharing a same GroupId.
Given a list of UserId, I would like to find a matching GroupId, if it exists.
What is the most efficient way to do that in SQL?
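Let's say your UserId list is stored in a table called 'MyUserIDList'. The following query returns the GroupIds whose members are exactly your user list (SQL Server syntax). The second condition in the WHERE clause ensures that every user in the list is in the group, assuming the UserIDs in MyUserIDList are unique:
Select GroupId
From (
    Select GroupId
         , count(*) as GroupMemberCount
         , Sum(case when MyUserIDList.UserID is null then 0 else 1 End) as GroupMemberCountInMyList
    from GroupUser
    left outer join MyUserIDList on GroupUser.UserID = MyUserIDList.UserID
    group by GroupId
) As MySubQuery
Where GroupMemberCount = GroupMemberCountInMyList             -- no member outside the list
  and GroupMemberCount = (Select count(*) From MyUserIDList)  -- no list entry missing from the group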
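The exact-match rule (every group member is in the list, and the group has as many members as the list) can be tried out with Python's built-in sqlite3 module. This is only a sketch: the sample rows are made up, and it assumes each (GroupId, UserId) pair and each list entry appears once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GroupUser (Id INTEGER PRIMARY KEY, GroupId INTEGER, UserId INTEGER);
    CREATE TABLE MyUserIDList (UserID INTEGER PRIMARY KEY);
    INSERT INTO GroupUser (GroupId, UserId) VALUES
        (10, 1), (10, 2),           -- group 10: users 1, 2
        (20, 1), (20, 2), (20, 3),  -- group 20: users 1, 2, 3
        (30, 1);                    -- group 30: user 1 only
    INSERT INTO MyUserIDList (UserID) VALUES (1), (2);
""")

exact_match_sql = """
    SELECT gu.GroupId
    FROM GroupUser AS gu
    LEFT JOIN MyUserIDList AS l ON l.UserID = gu.UserId
    GROUP BY gu.GroupId
    HAVING COUNT(*) = COUNT(l.UserID)                      -- no member outside the list
       AND COUNT(*) = (SELECT COUNT(*) FROM MyUserIDList)  -- no list entry missing
    ORDER BY gu.GroupId
"""
matching_groups = [row[0] for row in conn.execute(exact_match_sql)]
```

Only group 10 matches the list (1, 2): group 20 has an extra member and group 30 is missing user 2.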
There are a couple of ways of doing this. This answer is for SQL Server only (as you have not mentioned a DBMS in your tags).
Pass the list of user IDs as a comma-separated string to a stored procedure, build a dynamic query from it in the SP, and use the EXEC command to execute it. This link will guide you in this regard
Use a table-valued parameter in an SP. This is available in SQL Server 2008 and higher only.
The following link will help you get started.
http://www.codeproject.com/Articles/113458/TSQL-Passing-array-list-set-to-stored-procedure-MS
Hope this helps.
One other solution is to convert the input list into a table. This can be done in various ways: unions, temporary tables, and others. A neat solution adapts user1461607's answer to another question here on SO, using a comma-separated string:
WITH split(word, csv) AS (
-- 'initial query' (see SQLite docs linked above)
SELECT
'', -- place holder for each word we are looking for
'Auto,A,1234444,' -- items you are looking for
-- make sure the list ends with a comma !!
UNION ALL SELECT
substr(csv, 0, instr(csv, ',')), -- each word contains text up to next ','
substr(csv, instr(csv, ',') + 1) -- next recursion parses csv after this ','
FROM split -- recurse
WHERE csv != '' -- break recursion once no more csv words exist
) SELECT word, existing_data
FROM split s
-- now join the key you want to check for existence!
-- for demonstration purpose, I use an outer join
LEFT OUTER JOIN (select 'A' as existing_data) as t on t.existing_data = s.word
WHERE s.word != '' -- make sure we clamp the empty strings from the split function
;
Results in:
Auto,null
A,A
1234444,null
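The recursive split can be exercised from Python's sqlite3 module, with the comma-separated string bound as a parameter instead of written into the query. In this sketch the `existing` table and the helper `check_existing` are made up for illustration, and it assumes no item contains a comma.

```python
import sqlite3

SPLIT_AND_CHECK = """
WITH RECURSIVE split(word, csv) AS (
    SELECT '', :csv                                  -- start with the full list
    UNION ALL
    SELECT substr(csv, 0, instr(csv, ',')),          -- text up to the next ','
           substr(csv, instr(csv, ',') + 1)          -- remainder after that ','
    FROM split
    WHERE csv != ''                                  -- stop when nothing is left
)
SELECT s.word, t.name
FROM split AS s
LEFT JOIN existing AS t ON t.name = s.word
WHERE s.word != ''
"""


def check_existing(conn, items):
    """Map each requested item to its match in `existing`, or None if absent."""
    csv = ",".join(items) + ","  # the CTE expects a trailing comma
    return dict(conn.execute(SPLIT_AND_CHECK, {"csv": csv}).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE existing (name TEXT)")  # hypothetical lookup table
conn.execute("INSERT INTO existing VALUES ('A')")

result = check_existing(conn, ["Auto", "A", "1234444"])
```

Here `result` maps 'A' to 'A' and the other two items to None, matching the output listed above.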

Multiple replacements in string in single Update Statement in SQL server 2005

I've a table 'tblRandomString' with following data:
ID ItemValue
1 *Test"
2 ?Test*
I've another table 'tblSearchCharReplacement' with following data
Original Replacement
* `star`
? `quest`
" `quot`
; `semi`
Now, I want to make a replacement in the ItemValues using these replacement.
I tried this:
Update T1
SET ItemValue = select REPLACE(ItemValue,[Original],[Replacement])
FROM dbo.tblRandomString T1
JOIN
dbo.tblSpecialCharReplacement T2
ON T2.Original IN ('"',';','*','?')
But it doesn't help me because only one replacement is done per update.
One solution is to use a CTE to perform multiple replacements if they exist.
Is there a simpler way?
Sample data:
declare @RandomString table (ID int not null,ItemValue varchar(500) not null)
insert into @RandomString(ID,ItemValue) values
(1,'*Test"'),
(2,'?Test*')
declare @SearchCharReplacement table (Original varchar(500) not null,Replacement varchar(500) not null)
insert into @SearchCharReplacement(Original,Replacement) values
('*','`star`'),
('?','`quest`'),
('"','`quot`'),
(';','`semi`')
And the UPDATE:
;With Replacements as (
    select
        ID,ItemValue,0 as RepCount
    from
        @RandomString
    union all
    select
        ID,SUBSTRING(REPLACE(ItemValue,Original,Replacement),1,500),rs.RepCount+1
    from
        Replacements rs
            inner join
        @SearchCharReplacement scr
            on
                CHARINDEX(scr.Original,rs.ItemValue) > 0
), FinalReplacements as (
    select
        ID,ItemValue,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RepCount desc) as rn
    from
        Replacements
)
update rs
    set ItemValue = fr.ItemValue
from
    @RandomString rs
        inner join
    FinalReplacements fr
        on
            rs.ID = fr.ID and
            rn = 1
Which produces:
select * from @RandomString
ID ItemValue
----------- -----------------------
1 `star`Test`quot`
2 `quest`Test`star`
What this does is it starts with the unaltered texts (the top select in Replacements), then it attempts to apply any valid replacements (the second select in Replacements). What it will do is to continue applying this second select, based on any results it produces, until no new rows are produced. This is called a Recursive Common Table Expression (CTE).
We then use a second CTE (a non-recursive one this time) FinalReplacements to number all of the rows produced by the first CTE, assigning lower row numbers to rows which were produced last. Logically, these are the rows which were the result of applying the last applicable transform, and so will no longer contain any of the original characters to be replaced. So we can use the row number 1 to perform the update back against the original table.
This query does more work than strictly necessary, but for a small number of replacement rows it's not likely to be too inefficient. We could clean it up by defining a single order in which to apply the replacements.
Will skipping the join table and nesting REPLACE functions work?
Or do you need to actually get the data from the other table?
-- perform 4 replaces in a single update statement
UPDATE dbo.tblRandomString
SET ItemValue = REPLACE(
                    REPLACE(
                        REPLACE(
                            REPLACE(ItemValue, '*', 'star'),
                        '?', 'quest'),
                    '"', 'quot'),
                ';', 'semi')
Note: I'm not sure if you need to escape any of the characters you're replacing
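Nested REPLACE calls amount to applying each substitution in turn to the result of the previous one. A small Python sketch of that idea follows, using the replacement pairs from the question (with backticks). `replace_all` is a hypothetical helper, not something from either answer.

```python
from functools import reduce

# (original, replacement) pairs taken from the question's lookup table
REPLACEMENTS = [("*", "`star`"), ("?", "`quest`"), ('"', "`quot`"), (";", "`semi`")]


def replace_all(text, replacements=REPLACEMENTS):
    """Apply each (original, replacement) pair in order, like nested REPLACE calls."""
    return reduce(lambda acc, pair: acc.replace(pair[0], pair[1]), replacements, text)


cleaned = [replace_all(value) for value in ['*Test"', "?Test*"]]
```

The order of the pairs only matters when one replacement text contains a character that another pair replaces, which is not the case here.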

Alternative SQL ways of looking up multiple items of known IDs?

Is there a better solution to the problem of looking up multiple known IDs in a table:
SELECT * FROM some_table WHERE id='1001' OR id='2002' OR id='3003' OR ...
I can have several hundreds of known items. Ideas?
SELECT * FROM some_table WHERE ID IN ('1001', '1002', '1003')
and if your known IDs are coming from another table
SELECT * FROM some_table WHERE ID IN (
SELECT KnownID FROM some_other_table WHERE someCondition
)
The first (naive) option:
SELECT * FROM some_table WHERE id IN ('1001', '2002', '3003' ... )
However, we should be able to do better. IN is very bad when you have a lot of items, and you mentioned hundreds of these ids. What creates them? Where do they come from? Can you write a query that returns this list? If so:
SELECT *
FROM some_table
INNER JOIN ( your query here) filter ON some_table.id=filter.id
See Arrays and Lists in SQL Server 2005
ORs are notoriously slow in SQL.
Your question is short on specifics, but depending on your requirements and constraints I would build a look-up table with your IDs and use the EXISTS predicate:
select t.id from some_table t
where EXISTS (select * from lookup_table l where t.id = l.id)
For a fixed set of IDs you can do:
SELECT * FROM some_table WHERE id IN (1001, 2002, 3003);
For a set that changes each time, you might want to create a table to hold them and then query:
SELECT * FROM some_table WHERE id IN
(SELECT id FROM selected_ids WHERE key=123);
Another approach is to use collections - the syntax for this will depend on your DBMS.
Finally, there is always this "kludgy" approach:
SELECT * FROM some_table WHERE '|1001|2002|3003|' LIKE '%|' || id || '|%';
In Oracle, I always put the IDs into a TEMPORARY TABLE to perform massive SELECTs and DML operations:
CREATE GLOBAL TEMPORARY TABLE t_temp (id INT);
SELECT *
FROM mytable
WHERE mytable.id IN
(
SELECT id
FROM t_temp
)
You can fill the temporary table in a single client-server roundtrip using Oracle collection types.
We have a similar issue in an application written for MS SQL Server 7. Although I dislike the solution used, we're not aware of anything better...
'Better' solutions exist in 2008 as far as I know, but we have Zero clients using that :)
We created a table valued user defined function that takes a comma delimited string of IDs, and returns a table of IDs. The SQL then reads reasonably well, and none of it is dynamic, but there is still the annoying double overhead:
1. Client concatenates the IDs into the string
2. SQL Server parses the string to create a table of IDs
There are lots of ways of turning '1,2,3,4,5' into a table of IDs, but the Stored Procedure which uses the function ends up looking like...
CREATE PROCEDURE my_road_to_hell @IDs AS VARCHAR(8000)
AS
BEGIN
    SELECT
        *
    FROM
        myTable
    INNER JOIN
        dbo.fn_split_list(@IDs) AS [IDs]
            ON [IDs].id = myTable.id
END
The fastest is to put the ids in another table and JOIN
SELECT some_table.*
FROM some_table INNER JOIN some_other_table ON some_table.id = some_other_table.id
where some_other_table would have just one field (id) and all values would be unique
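The load-then-join pattern can be sketched with Python's sqlite3 module standing in for the real database. The names `wanted_ids` and `fetch_by_ids`, the `label` column and the sample rows are all assumptions for illustration.

```python
import sqlite3


def fetch_by_ids(conn, ids):
    """Load the IDs into a temporary table, then join against it."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted_ids (id INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM wanted_ids")  # start from an empty lookup table
    conn.executemany(
        "INSERT OR IGNORE INTO wanted_ids (id) VALUES (?)",  # duplicates are skipped
        [(i,) for i in ids],
    )
    return conn.execute(
        "SELECT t.id, t.label FROM some_table AS t "
        "INNER JOIN wanted_ids AS w ON w.id = t.id ORDER BY t.id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1001, "a"), (2002, "b"), (3003, "c"), (4004, "d")],  # made-up rows
)

rows = fetch_by_ids(conn, [1001, 3003, 9999, 1001])
```

The primary key on the lookup table keeps its values unique, so repeated IDs in the input do not produce duplicate result rows.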