How can I run a query on IDs in a string? - sql

I have a table A with this column:
IDS(VARCHAR)
1|56|23
I need to run this query:
select TEST from TEXTS where ID in ( select IDS from A where A.ID = xxx )
TEXTS.ID is an INTEGER. How can I split the string A.IDS into several ints for the join?
Must work on MySQL and Oracle. SQL99 preferred.

First of all, you should not store data like this in a column. You should split that out into a separate table, then you would have a normal join, and not this problem.
Having said that, what you have to do is the following:
1. Convert the number to a string.
2. Pad it with the | character (your separator) before and after it (I'll tell you why below).
3. Pad the text you're searching in with the same separator, before and after.
4. Do a LIKE on it.
This will run slow!
Here's SQL that does what you want. The string operators and conversion functions vary by dialect, so adjust them for your engine (for example, Oracle concatenates with || and MySQL with CONCAT()):
SELECT
TEXT -- assuming this was misspelt?
FROM
TEXTS -- and this as well?
JOIN A ON
'|' + A.IDS + '|' LIKE '%|' + CAST(TEXTS.ID AS VARCHAR(20)) + '|%'
The reason you need to pad both sides with the separator is this: what if you're looking for the number 5? Without the padding, '%5%' would also match 56 just because it contains the digit 5. With the padding, '%|5|%' can only match a complete value.
Basically, we will do this:
... '|1|56|23|' LIKE '%|56|%'
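As a rough check of the padding idea, here is a minimal sketch that runs the padded and unpadded comparisons side by side. It assumes SQLite (via Python's sqlite3 module) as a stand-in engine, so it uses || for concatenation; the sample rows and the TEST column values are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for the real database here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, IDS VARCHAR(100));
    CREATE TABLE TEXTS (ID INTEGER PRIMARY KEY, TEST VARCHAR(100));
    INSERT INTO A VALUES (1, '1|56|23');
    INSERT INTO TEXTS VALUES (1, 'one'), (5, 'five'),
                             (23, 'twenty-three'), (56, 'fifty-six');
""")

# Padded comparison: both the list and the searched value are wrapped in '|'.
# SQLite converts the integer ID to text implicitly when concatenating.
padded = conn.execute("""
    SELECT TEXTS.ID FROM TEXTS
    JOIN A ON '|' || A.IDS || '|' LIKE '%|' || TEXTS.ID || '|%'
    WHERE A.ID = 1
    ORDER BY TEXTS.ID
""").fetchall()

# Unpadded comparison, shown only to illustrate the false match on 5.
unpadded = conn.execute("""
    SELECT TEXTS.ID FROM TEXTS
    JOIN A ON A.IDS LIKE '%' || TEXTS.ID || '%'
    WHERE A.ID = 1
    ORDER BY TEXTS.ID
""").fetchall()

matched_padded = [row[0] for row in padded]
matched_unpadded = [row[0] for row in unpadded]
print(matched_padded, matched_unpadded)
```

The unpadded version picks up ID 5 because the digit appears inside 56, which is exactly the case the separators guard against.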
If there is ever only going to be 1 row in A, it might run faster if you do this (but I am not sure, you would need to measure it):
SELECT
TEXT -- assuming this was misspelt?
FROM
TEXTS -- and this as well?
WHERE
(SELECT '|' + IDS + '|' FROM A) LIKE '%|' + CAST(TEXTS.ID AS VARCHAR(20)) + '|%'
If there are many rows in your TEXTS table, it will be worth the effort to generate the SQL in application code: first retrieve the values from the A table, then construct a query with IN and use that instead:
SELECT
TEXT -- assuming this was misspelt?
FROM
TEXTS -- and this as well?
WHERE
ID IN (1, 56, 23)
This will run much faster, since the query can now use an index on TEXTS.ID.
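One way to generate that IN query might look like the sketch below. It assumes Python with sqlite3 as the client, and the helper name texts_for and the sample rows are hypothetical. The IDs are passed as bound parameters rather than pasted into the SQL text.

```python
import sqlite3

def texts_for(conn, a_id):
    """Fetch A.IDS, split it in application code, then query TEXTS with IN."""
    row = conn.execute("SELECT IDS FROM A WHERE ID = ?", (a_id,)).fetchone()
    if row is None or not row[0]:
        return []
    ids = [int(part) for part in row[0].split("|")]
    # One placeholder per ID, so the values are bound rather than concatenated.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT TEST FROM TEXTS WHERE ID IN ({placeholders}) ORDER BY ID"
    return [r[0] for r in conn.execute(sql, ids)]

# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, IDS VARCHAR(100));
    CREATE TABLE TEXTS (ID INTEGER PRIMARY KEY, TEST VARCHAR(100));
    INSERT INTO A VALUES (1, '1|56|23');
    INSERT INTO TEXTS VALUES (1, 'one'), (5, 'five'),
                             (23, 'twenty-three'), (56, 'fifty-six');
""")

result = texts_for(conn, 1)
missing = texts_for(conn, 99)
print(result, missing)
```

Converting each piece with int() also rejects anything in IDS that is not a number before it gets near the query.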
If you had A.ID as a column, and the values as separate rows, here's how you would do the query:
SELECT
TEXT -- assuming this was misspelt?
FROM
TEXTS -- and this as well?
INNER JOIN A ON TEXTS.ID = A.ID
This will run slightly slower than the previous one, but in the previous one you have overhead in having to first retrieve A.IDS, build the query, and risk producing a new execution plan that has to be compiled.
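If you do get the chance to move the values into separate rows, a one-off migration could be sketched as follows. The child table A_ITEMS and its columns are hypothetical names, and SQLite is again only a stand-in for the real engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, IDS VARCHAR(100));
    -- Hypothetical child table: one row per (A row, referenced TEXTS.ID).
    CREATE TABLE A_ITEMS (
        A_ID INTEGER NOT NULL,
        TEXT_ID INTEGER NOT NULL,
        PRIMARY KEY (A_ID, TEXT_ID)
    );
    INSERT INTO A VALUES (1, '1|56|23'), (2, '56');
""")

# Split every delimited string into (A_ID, TEXT_ID) pairs in application code.
pairs = [
    (a_id, int(part))
    for a_id, ids in conn.execute("SELECT ID, IDS FROM A").fetchall()
    for part in ids.split("|")
]
conn.executemany("INSERT INTO A_ITEMS (A_ID, TEXT_ID) VALUES (?, ?)", pairs)

items = conn.execute(
    "SELECT A_ID, TEXT_ID FROM A_ITEMS ORDER BY A_ID, TEXT_ID"
).fetchall()
print(items)
```

After that, the lookup becomes an ordinary join between TEXTS and the child table.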

Related

postgresql - check if a row contains a string without considering spaces

Is it possible to check if a row contains a string without considering spaces?
Suppose I have a table like the one above. I want to know whether the query column contains a given string, where the string may have a different number of consecutive spaces than the stored one, or vice versa.
For example: the first row's query is select id, username from postgresql, and the one I want to know if stored in the table is:
select id, username
from postgresql
That is to say, the string I want to look up is indented differently and hence has a different number of spaces.
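You can use REGEXP_REPLACE to collapse every run of whitespace to a single space on both sides before comparing; this will likely be very slow on a large data set.
SELECT * from table
where TRIM(REGEXP_REPLACE('select id, username from postgresql ', '\s+', ' ', 'g')) = TRIM(REGEXP_REPLACE(query, '\s+', ' ', 'g'))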
I think you would phrase this as:
where $str ~ replace('select id, username from postgresql', ' ', '[\s]+')
Note: This assumes that your string does not have other regular expression special characters.
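The same two ideas can be sketched outside the database. This assumes Python's re module and hypothetical helper names; the second helper escapes each word first, which sidesteps the caveat about other regular expression special characters.

```python
import re

def normalize_whitespace(text):
    """Collapse every run of whitespace to one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()

def whitespace_tolerant_pattern(text):
    """Build a regex that matches text with any amount of whitespace between words."""
    return r"\s+".join(re.escape(word) for word in text.split())

stored = "select id, username from postgresql"
candidate = "select id, username\n    from   postgresql "

same_after_normalizing = normalize_whitespace(stored) == normalize_whitespace(candidate)
pattern = whitespace_tolerant_pattern(stored)
matches_pattern = re.fullmatch(pattern, candidate.strip()) is not None
print(same_after_normalizing, matches_pattern)
```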

Oracle SQL - Joining list of values to a field with those values concatenated

The title is a bit confusing, so I'll explain with an example what I'm trying to do.
I have a field called "modifier". This is a field with concatenated values for each individual. For example, the value in one row could be:
*26,50,4 *
and the value in the next row
*4 *
And the table (Table A) would look something like this:
Key Modifier
1 *26,50,4 *
2 *4 *
3 *1,2,3,4 *
The asterisks are always going to be in the same position (here, 1 and 26) with an uncertain number of numbers in between, separated by commas.
What I'd like to do is "join" this "modifier" field to another table (Table B) with a list of possible values for that modifier. e.g., that table could look like this:
ID MOD
1 26
2 3
3 50
4 78
If a value in A.modifier appears in B.mod, I want to keep that row in Table A. Otherwise, leave it out. (I use the term "join" loosely because I'm not sure that's what I need here.)
Is this possible? How would I do it?
Thanks in advance!
edit 1: I realize I can use regular expressions and do a bunch of or statements that search for the comma-separated values in the MOD list, but is there a better way?
One way to do it is using TRIM, string concatenations and LIKE.
SELECT *
FROM tableA a
WHERE EXISTS(
SELECT 1 FROM tableB b
WHERE
','|| trim( trim( BOTH '*' FROM a.Modifier )) ||','
LIKE '%,'|| b.mod || ',%'
);
Demo --> http://www.sqlfiddle.com/#!4/1caa8/10
This query might still be slow for huge tables (it always performs full scans of tables or indexes); however, it should be faster than using regular expressions or parsing comma-separated lists into individual values.
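To see which rows the query keeps on the sample data from the question, here is a small sketch using SQLite through Python's sqlite3 module. That is an assumption made for the sake of a runnable example: SQLite spells the asterisk trim as trim(x, '*') instead of Oracle's TRIM(BOTH '*' FROM x), and the column name row_key is a hypothetical replacement for Key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (row_key INTEGER PRIMARY KEY, modifier VARCHAR(30));
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, mod INTEGER);
    INSERT INTO tableA VALUES (1, '*26,50,4 *'), (2, '*4 *'), (3, '*1,2,3,4 *');
    INSERT INTO tableB VALUES (1, 26), (2, 3), (3, 50), (4, 78);
""")

# Strip the asterisks, then the remaining blanks, then wrap in commas
# so that each value in the list is delimited on both sides.
kept = [
    row[0]
    for row in conn.execute("""
        SELECT a.row_key
        FROM tableA a
        WHERE EXISTS (
            SELECT 1 FROM tableB b
            WHERE ',' || trim(trim(a.modifier, '*')) || ','
                  LIKE '%,' || b.mod || ',%'
        )
        ORDER BY a.row_key
    """)
]
print(kept)
```

Row 2 is dropped because 4 is not in Table B, while row 3 is kept through the value 3.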

How to delete a common word from large number of datas in a Postgres table

I have a table in Postgres with more than 1000 names in it. Most of the names start with SHRI or SMT. I want to delete SHRI and SMT from the names and keep only the original name. How can I do that without any database function?
I'll step you through the logic:
Select left(name,3) from table
This select statement will bring back the first 3 chars of a column (the 'left' three). If we are looking for SMT in the first three chars, we can move it to the where statement
select * from table where left(name,3) = 'SMT'
Now from here you have a few choices that can be used. I'm going to keep to the left/right style, though replace could likely be used. We want the chars to the right of the SMT, but we don't know how long each string is to pick out those chars. So we use length() to determine that.
select right(name,length(name)-3) from table where left(name,3) = 'SMT'
I hope my syntax is right there; I don't have a Postgres environment to test it. The logic is 'all the chars on the right of the string except the first 3' (the minus 3 excludes the 3 chars on the left; change this to 4 to drop the first 4 chars, as for SHRI).
You can then change this to an update statement (set name = right(name,length(name)-3) ) to update the table, or you can just use the select statement when you need the name without the SMT, but leave the SMT in the actual data.
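The left/right logic can be sketched in application code as well. The helper below is hypothetical and assumes the title is always followed by a space, which is a guess about the data; that check keeps a name such as SHRIKANT from being cut down to KANT.

```python
def strip_title(name, titles=("SHRI", "SMT")):
    """Remove a leading title followed by a space; leave other names untouched."""
    for title in titles:
        prefix = title + " "
        if name.startswith(prefix):
            # Equivalent to right(name, length(name) - length(prefix)).
            return name[len(prefix):].lstrip()
    return name

cleaned = [strip_title(n) for n in ["SMT RADHA", "SHRI RAM", "ANIL", "SHRIKANT"]]
print(cleaned)
```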

Matching sub string in a column

First I apologize for the poor formatting here.
Second I should say up front that changing the table schema is not an option.
So I have a table defined as follows:
Pin varchar
OfferCode varchar
Pin will contain data such as:
abc,
abc123
OfferCode will contain data such as:
123
123~124~125
I need a query to check for a count of a Pin/OfferCode combination and when I say OfferCode, I mean an individual item delimited by the tilde.
For example, if there is one row that looks like abc, 123 and another that looks like abc, 123~124, and I search for a count of Pin=abc, OfferCode=123, I want to get a count of 2.
Obviously I can do a query similar to this:
SELECT count(1) from MyTable (nolock) where OfferCode like '%' + @OfferCode + '%' and Pin = @Pin
Using like here is very expensive and I'm hoping there may be a more efficient way.
I'm also looking into using a split string solution. I have a table-valued function SplitString(string,delim) that will return table OutParam, but I'm not quite sure how to apply this to a table column vs a string. Would this even be worthwhile pursuing? It seems like it would be much more expensive, but I'm unable to get a working solution to compare to the like solution.
Your like/% solution is open to a bug if you had offer codes other than 3 digits (if there was offer code 123 and 1234, searching for like '%123%' would return both, which is wrong). You can use your string function this way:
SELECT Pin, count(1)
FROM MyTable (nolock)
CROSS APPLY SplitString(OfferCode,'~') OutParam
WHERE OutParam.Value = @OfferCode and Pin = @Pin
GROUP BY Pin
If you have a relatively small table you can probably get away with this. If you are working with a large number of rows or encountering performance problems, it would be more effective to normalize it as RedFilter suggested.
using like here is very expensive and I'm hoping there may be a more efficient way
The efficient way is to normalize the schema and put each OfferCode in its own row.
Then your query is more like (although you may need to use an intersection table depending on your schema):
select count(*)
from MyTable
where OfferCode = @OfferCode
and Pin = @Pin
Here is one way to use like for this problem, which is standard for getting exact matches when searching delimited strings while avoiding the '%123%' matches '123' and '1234' problem:
-- Create some test data
declare @table table (
Pin varchar(10) not null
, OfferCode varchar(100) not null
)
insert into @table select 'abc', '123'
insert into @table select 'abc', '123~124'
-- Mock some proc params
declare @Pin varchar(10) = 'abc'
declare @OfferCode varchar(10) = '123'
-- Run the actual query
select count(*) as Matches
from @table
where Pin = @Pin
-- Append delimiters to find exact matches
and '~' + OfferCode + '~' like '%~' + @OfferCode + '~%'
As you can see, we're adding the delimiters to the searched string, and also the search string in order to find matches, thus avoiding the bugs mentioned by other answers.
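To make the difference between the approaches in these answers concrete, here is a small sketch in plain Python. It is only an illustration of the matching logic, not of database performance, and the extra rows (offer code 1234, pin xyz) are made up to trigger the false match.

```python
# Hypothetical rows: (Pin, OfferCode)
rows = [("abc", "123"), ("abc", "123~124"), ("abc", "1234"), ("xyz", "123")]

def count_substring(rows, pin, code):
    """Mirrors LIKE '%code%': also counts codes that merely contain the digits."""
    return sum(1 for p, codes in rows if p == pin and code in codes)

def count_split(rows, pin, code):
    """Mirrors the split-string approach: compare against each individual item."""
    return sum(1 for p, codes in rows if p == pin and code in codes.split("~"))

def count_delimited(rows, pin, code):
    """Mirrors '~' + OfferCode + '~' LIKE '%~code~%'."""
    return sum(1 for p, codes in rows if p == pin and f"~{code}~" in f"~{codes}~")

naive = count_substring(rows, "abc", "123")
split = count_split(rows, "abc", "123")
delimited = count_delimited(rows, "abc", "123")
print(naive, split, delimited)
```

The plain substring test over-counts because of the 1234 row; the split and delimited versions agree with each other.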
I highly doubt that a string splitting function will yield better performance over like, but it may be worth a test or two using some of the more recently suggested methods. If you still have unacceptable performance, you have a few options:
Updated:
Try an index on OfferCode (or on a computed persisted column of '~' + OfferCode + '~'). Contrary to the myth that SQL Server won't use an index with like and wildcards, this might actually help.
Check out full text search.
Create a normalized version of this table using a string splitter. Use this table to run your counts. Update this table according to some schedule or event (trigger, etc.).
If you have some standard search terms, pre-calculate the counts for these and store them on some regular basis.
Actually, the LIKE condition is going to have much less cost than doing any sort of string manipulation and comparison.
http://www.simple-talk.com/sql/performance/the-seven-sins-against-tsql-performance/

Easy way to transfer/update a list of numbers to be used in the SQL 'in' command?

I'm always being given a large list of, say, IDs which I need to search for in our database. I have to put them into a SQL statement manually, like the following, which can take a while: putting single quotes around each ID, followed by a comma. I was hoping someone has an easy way of doing this for me? Or am I just being a bit lazy...
select * from blah where idblah in ('1234-A', '1235-A', '1236-A' ................)
You can use the world's simplest code generator.
Just paste in the list of values, setup the pattern and voila... you have a set of quoted values.
I have also used Excel in the past, using the CONCAT function with smart paste.
I would set aside a table to hold the values and have my queries JOIN against that table. Set up a simple import script (don't forget to clear out the table at the start) and something like this is a breeze. Run the import, run the query. You never have to touch the query again or regenerate any code.
As an example:
CREATE TABLE Search_ID_List (
id VARCHAR(20) NOT NULL,
CONSTRAINT PK_Search_ID_List PRIMARY KEY CLUSTERED (id)
)
and:
SELECT
<column list>
FROM
Search_ID_List SIL
INNER JOIN Blah B ON
B.id = SIL.id
If you want to be able to save past search criteria or have multiple searches available to you at the same time then you can just add an identifying column which gets filled in by your import. It can be the file from where the ids came, some descriptive code/name, or whatever. Then just add that to the WHERE clause of your query and you're all set.
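A simple import script along those lines might look like this sketch. It assumes Python with sqlite3 as a stand-in database, and the helper load_search_ids, the note column, and the sample IDs are all hypothetical.

```python
import sqlite3

def load_search_ids(conn, raw_lines):
    """Replace the contents of Search_ID_List with the de-duplicated, non-empty IDs."""
    ids = sorted({line.strip() for line in raw_lines if line.strip()})
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM Search_ID_List")  # clear out the previous search
        conn.executemany(
            "INSERT INTO Search_ID_List (id) VALUES (?)", [(i,) for i in ids]
        )
    return len(ids)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Search_ID_List (id VARCHAR(20) NOT NULL PRIMARY KEY);
    CREATE TABLE Blah (id VARCHAR(20) NOT NULL PRIMARY KEY, note VARCHAR(50));
    INSERT INTO Blah VALUES ('1234-A', 'first'), ('1236-A', 'third'), ('9999-Z', 'other');
""")

# The pasted list, one ID per line, possibly with blanks and duplicates.
pasted = ["1234-A", " 1235-A ", "", "1236-A", "1234-A"]
loaded = load_search_ids(conn, pasted)

found = [
    row[0]
    for row in conn.execute(
        "SELECT B.id FROM Search_ID_List SIL "
        "INNER JOIN Blah B ON B.id = SIL.id ORDER BY B.id"
    )
]
print(loaded, found)
```

No quoting or comma handling is needed, since the IDs travel as bound parameters and the query text never changes.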
You could do something like this.
select * from blah where ',' + '1234-A,1235-A,1236-A' + ',' LIKE '%,' + idblah + ',%'
This pattern is super useful when you're being passed a comma delimited list of values to filter by, but I think would be applicable here as well.