SQL Joining with Imperfect Match


Is it possible to match the following combinations of keys in SQL?
Each key value behaves like an array, with '/' as the delimiter.
Key
-----
A/B/C
A
B
C
A/B
A/C
B/C

My first thought to you is that you need to redesign. You should not store data that way. You should have a related table instead. Then you can do ordinary joins to get what you want. Rule 1 of database design is to store only one piece of information per field. If you are finding you need to break this down into smaller chunks than you are storing, you are storing incorrectly.
Some of the proposed solutions will work (depending on what you are really asking, which is not clear), but most if not all of them will be slow because they rely on syntax that will not allow you to use indexes. This is one major reason why a redesign is indicated. You do not want a system where the indexes can't be used.
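As a rough illustration of that redesign, here is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module. The table and column names (key_item, key_id, item) are hypothetical and not from the question. Each delimited key is stored as one row per element, so "contains" and "subset" checks become ordinary joins that can use an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized layout: one row per (key, element) pair.
    CREATE TABLE key_item (
        key_id TEXT NOT NULL,
        item   TEXT NOT NULL,
        PRIMARY KEY (key_id, item)
    );
    CREATE INDEX idx_key_item_item ON key_item (item);
""")

keys = ["A/B/C", "A", "B", "C", "A/B", "A/C", "B/C"]
conn.executemany(
    "INSERT INTO key_item (key_id, item) VALUES (?, ?)",
    [(k, part) for k in keys for part in k.split("/")],
)

def keys_containing(item):
    """Return every key that has `item` as one of its elements."""
    rows = conn.execute(
        "SELECT key_id FROM key_item WHERE item = ? ORDER BY key_id", (item,)
    )
    return [r[0] for r in rows]

def supersets_of(key_id):
    """Return every key whose elements include all elements of `key_id`."""
    rows = conn.execute(
        """
        SELECT sup.key_id
        FROM key_item AS sub
        JOIN key_item AS sup ON sup.item = sub.item
        WHERE sub.key_id = ?
        GROUP BY sup.key_id
        HAVING COUNT(*) = (SELECT COUNT(*) FROM key_item WHERE key_id = ?)
        ORDER BY sup.key_id
        """,
        (key_id, key_id),
    )
    return [r[0] for r in rows]

print(keys_containing("A"))   # keys that include element A
print(supersets_of("A/C"))    # keys that include both A and C
```

The second query counts how many elements of the chosen key each candidate shares and keeps only candidates that share all of them, which is one common way to express a subset test with plain joins.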

Assuming here your keys are single letters as in your example, you could use LIKE:
SELECT * FROM table WHERE key LIKE '%A%'
Would give you all the values where "key" contains "A".

SELECT *
FROM mytable
WHERE `key` REGEXP '^[ABC](/[ABC])*$'
This matches every key listed above, but not keys containing other letters (such as D/B/C or AA/BB/CC), empty strings, or keys with a trailing slash.

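You will be better off using REGEXP:
SELECT * FROM table WHERE key REGEXP '[[:<:]]A[[:>:]]'
[[:<:]] and [[:>:]] mark word boundaries, which stops 'A' from matching AB, AC, or AD.
These markers are not standard SQL. MySQL accepts them up to version 8.0.3; from 8.0.4 onward, which uses the ICU regex library, write '\\bA\\b' instead.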

What database?
If dealing with Oracle, I suggest using INSTR.
For SQL Server: CHARINDEX (note that it takes the search value first, then the column).
MySQL also uses INSTR.
Use any of those to test for a value greater than zero:
SELECT t.*
FROM TABLE t
WHERE INSTR(t.column, expectedValue) > 0

You haven't stated your problem very clearly. I'm taking it that this list of keys conceptually represents ordered sets, and that you want to find all possible subset / superset combinations (e.g. I think you want 'A/C' to "match" 'A/B/C').
This seems to work but I'd be hard pressed to prove that the logic is right:
SELECT a.key subset, b.key superset
FROM key_list a, key_list b
WHERE '/' || REPLACE( b.key, '/', '//') || '/'
LIKE '/' || REPLACE( a.key, '/', '/%/' ) || '/'
OR b.key LIKE '%' || a.key || '%'
ORDER BY length(a.key), a.key, length(b.key),b.key

Related

If trim is used in Select, does it have to be used in Where?

This should be an easy question.
If the trim function has been used on an ID in a Select statement, does it have to be used on the ID in a Where clause? Or can the trim function be left out in the Where clause?
SELECT (TRIM(a.T$ID)) as "ID"
FROM SCHEMA.DDiitm0011 a
WHERE TRIM(a.T$ID) LIKE '4U%'

If the trim function has been used on an ID in a Select statement, does it have to be used on the ID in a Where clause?
There is no general requirement for the function to be applied in both places. It depends on the data and the logic you need to apply to your exclusion filter, and - separately - how you want to return the matching values. You won't get a syntax error if you trim in the select list and not the where clause, or vice versa; but you might not get the result you want if you use the wrong expression(s).
I actually want to exclude ID's that start with 4U. Would this WHERE clause suffice? WHERE (a.T$ID) NOT LIKE '4U%'
Yes, though you don't need the parentheses either:
SELECT (TRIM(a.T$ID)) as "ID"
FROM SCHEMA.DDiitm0011 a
WHERE a.T$ID NOT LIKE '4U%'
That will exclude values starting with 4U, such as '4U', '4U ', '4UP', '4UNDER ' etc.
It will not exclude any that have spaces before that, such as ' 4U' or ' 4UP'.
If you wanted to exclude those as well then you could use TRIM(a.T$ID) or LTRIM(a.T$ID) (to only remove leading spaces, not trailing ones - which are covered by the wildcard % anyway). Or you could use a regular expression, but those tend to be significantly more expensive. Either way, applying a function to the column value would prevent a simple index on that column from being used, if it otherwise would be, but you could add a function-based index if that was an issue.
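The difference between filtering on the raw column and on the trimmed column can be checked with a small sketch. It uses an in-memory SQLite database through Python's sqlite3 module rather than Oracle, and the items table and its sample values are made up for illustration; the LIKE and LTRIM behaviour shown is the same in principle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the real table, with an ID column that may be padded.
conn.execute("CREATE TABLE items (id TEXT)")
conn.executemany(
    "INSERT INTO items (id) VALUES (?)",
    [("4U",), ("4UP ",), (" 4U",), ("9X",)],
)

def ids_where(condition):
    """Return trimmed IDs of rows satisfying `condition` (a trusted SQL literal)."""
    sql = f"SELECT TRIM(id) FROM items WHERE {condition} ORDER BY rowid"
    return [r[0] for r in conn.execute(sql)]

# Filtering on the raw column keeps the row with a leading space.
untrimmed = ids_where("id NOT LIKE '4U%'")
# Trimming leading spaces in the WHERE clause excludes it as well.
trimmed = ids_where("LTRIM(id) NOT LIKE '4U%'")

print(untrimmed)
print(trimmed)
```

Both queries trim in the SELECT list, so the returned values look clean either way; only the WHERE expression decides which rows survive.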

You can use TRIM in a WHERE clause, for example:
DECLARE @ExampleVarTable TABLE(ID INT IDENTITY, Name1 VARCHAR(100))
INSERT INTO @ExampleVarTable (Name1)
VALUES(' Toto '), ('Toto'), ('Titi'), (' Titi '), (' Toto')
SELECT ID, TRIM(Name1) AS Name1
FROM @ExampleVarTable
WHERE TRIM(Name1) LIKE 'Toto'
Result:
ID Name1
1 Toto
2 Toto
5 Toto
But you can also run this kind of query, where TRIM appears only in the SELECT list:
SELECT ID, TRIM(Name1) AS Name1
FROM @ExampleVarTable
WHERE ID = 3
Result:
ID Name1
3 Titi

SQL - Turn relationship IDs into a delimited list

Say I have a table with the following data:
You can see columns a, b, & c have a lot of redundancies. I would like those redundancies removed while preserving the site_id info. If I exclude the site_id column from the query, I can get part of the way there by doing SELECT DISTINCT a, b, c from my_table.
What would be ideal is a SQL query that could turn the site IDs relevant to a permutation of a/b/c into a delimited list, and output something like the following:
Is it possible to do that with a SQL query? Or will I have to export everything and use a different tool to remove the redundancies?
The data is in a SQL Server DB, though I'd also be curious how to do the same thing with postgres, if the process is different.

For SQL Server, you can use the FOR XML trick as found in the accepted answer in this post.
For your scenario it would look something like this:
SELECT a, b, c, SiteIds =
STUFF((SELECT ', ' + SiteId
FROM your_table t2
WHERE t2.a = t1.a AND t2.b = t1.b AND t2.c = t1.c
FOR XML PATH('')), 1, 2, '')
FROM your_table t1
GROUP BY a, b, c

For Postgres:
select a, b, c, string_agg(site_id::varchar, ',')
from my_table
group by a, b, c;
I assume site_id is a number. Because string_agg() only accepts character values, it needs to be cast to a character string for the aggregation. This is what site_id::varchar does. Alternatively you can use the cast() operator: string_agg(cast(site_id as varchar), ',')

This is generally known as string aggregation. Many RDBMSs have the ability built in, and many others don't.
In Postgres you just use the STRING_AGG(<field>, <delimiter>) function, and make sure to add a GROUP BY for your non-aggregated fields. Simple stuff.
In SQL Server it is not so pretty (versions before 2017 have no built-in STRING_AGG), but folks have written functions and whatnot that will allow you to do this (like in this Q/A).
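For a quick way to try the idea, here is a minimal sketch using SQLite through Python's sqlite3 module. SQLite's GROUP_CONCAT plays the role that STRING_AGG plays in Postgres. The sample rows are invented, since the question's original table did not survive, and only the table and column names (my_table, a, b, c, site_id) follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (a TEXT, b TEXT, c TEXT, site_id INTEGER)")
# Invented sample data: three rows share one a/b/c combination.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [("x", "y", "z", 1), ("x", "y", "z", 2), ("x", "y", "w", 3), ("x", "y", "z", 5)],
)

rows = conn.execute(
    """
    SELECT a, b, c, GROUP_CONCAT(site_id, ',') AS site_ids
    FROM my_table
    GROUP BY a, b, c
    ORDER BY a, b, c
    """
).fetchall()

# SQLite does not guarantee the order of values inside each list,
# so split and sort them in Python before displaying.
result = {
    (a, b, c): sorted(int(s) for s in site_ids.split(","))
    for a, b, c, site_ids in rows
}
print(result)
```

Each distinct a/b/c combination comes back once, with its site IDs collapsed into a single delimited value.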

Compare strings in SQL

I am in a situation where I need to return results if some conditions on the string/character are met.
For example: to return only the names that contain 'F' character from the Person table.
How to create an SQL query based on such conditions? Is there any link to a documentation that explains how can SQL perform such queries?
Thanks in advance

The most basic approach is to use LIKE operator:
-- name starts with 'F'
SELECT * FROM person WHERE name LIKE 'F%'
-- name contains 'F'
SELECT * FROM person WHERE name LIKE '%F%'
(% is a wildcard)

Most RDBMS offer string operations which are able to perform that required task in one way or the other.
In MySQL you might use INSTR:
SELECT *
FROM yourtable
WHERE INSTR(Person, 'F') > 0;
In Oracle, this can be done, too.
In PostgreSQL, you can use STRPOS:
SELECT *
FROM yourtable
WHERE strpos(Person, 'F') > 0;
Usually there are several approaches to solve this, many would choose the LIKE operator. For more details, please refer to the documentation of the RDBMS of your choice.
Update
As requested by the questioner, a few words about the LIKE operator, which is used not only in MySQL or Oracle, but in other RDBMSs, too.
In some cases the use of LIKE lets your RDBMS use an index (typically when the pattern does not start with a wildcard); it usually does not try to do so if you use a string function.
Example:
SELECT *
FROM yourtable
WHERE Person LIKE 'F%';

The query may look like this:
SELECT * FROM Person WHERE FirstName LIKE '%F%' OR LastName LIKE '%F%'

"NOT IN" subquery with a leading wildcard

I have two tables:
Table tablefoo contains a column fulldata.
Table tablebar contains a column partialdata.
I want to find a list of tablefoo.fulldata values that do NOT have partial matches in tablebar.partialdata.
The following provides a list of tablefoo.fulldata with partial matches in tablebar, but I want the negative of this.
select fulldata from tablefoo
where fulldata like any (select '%' || partialdata from tablebar);
This lists every record in partialdata:
select fulldata from tablefoo
where partialdata not in (select '%' || partialdata from tablebar);
Any idea how to get only the results tablefoo.fulldata that do not contain matches to a leading wildcarded tablebar.partialdata?
I found this link: PostgreSQL 'NOT IN' and subquery which seems like it's headed down the right path, but I'm not getting it to work with the wildcard.
Sure, I could write a script to pull this out of psql and do the comparisons, but it would be much nicer to handle this all as part of the query.

SELECT fulldata
FROM tablefoo f
WHERE NOT EXISTS (
SELECT 1
FROM tablebar b
WHERE f.fulldata LIKE ('%' || b.partialdata)
);
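To see why NOT EXISTS works where NOT IN does not, the query above can be run against a few made-up rows. This is a sketch using SQLite through Python's sqlite3 module rather than PostgreSQL, and the host names are invented sample data. NOT IN compares values for equality, so the '%' is treated as a literal character, while LIKE inside the correlated subquery treats it as a wildcard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablefoo (fulldata TEXT);
    CREATE TABLE tablebar (partialdata TEXT);
    -- Invented sample rows for illustration only.
    INSERT INTO tablefoo VALUES
        ('host1.example.com'), ('host2.example.org'), ('localhost');
    INSERT INTO tablebar VALUES ('example.com'), ('example.net');
""")

# NOT IN tests equality against the literal string '%example.com',
# so nothing is filtered out.
not_in = [
    r[0]
    for r in conn.execute(
        """
        SELECT fulldata FROM tablefoo
        WHERE fulldata NOT IN (SELECT '%' || partialdata FROM tablebar)
        ORDER BY fulldata
        """
    )
]

# NOT EXISTS with LIKE applies the wildcard row by row.
unmatched = [
    r[0]
    for r in conn.execute(
        """
        SELECT fulldata
        FROM tablefoo f
        WHERE NOT EXISTS (
            SELECT 1
            FROM tablebar b
            WHERE f.fulldata LIKE ('%' || b.partialdata)
        )
        ORDER BY fulldata
        """
    )
]

print(not_in)
print(unmatched)
```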

Use string contains function in oracle SQL query

I'm using an Oracle database and I want to know how I can find rows in a varchar column whose value contains a given character.
I'm trying something like this (that's a simple example of what I want), but it doesn't work:
select p.name
from person p
where p.name contains the character 'A';
I also want to know if I can use a function like chr(1234) where 1234 is an ASCII code instead of the 'A' character in my example query, because in my case I want to search in my database values where the name of a person contains the character with 8211 as ASCII code.
With the query select CHR(8211) from dual; I get the special character that I want.
Example:
select p.name
from person p
where p.name contains the character chr(8211);

By lines I assume you mean rows in the table person. What you're looking for is:
select p.name
from person p
where p.name LIKE '%A%'; --contains the character 'A'
The above is case sensitive. For a case insensitive search, you can do:
select p.name
from person p
where UPPER(p.name) LIKE '%A%'; --contains the character 'A' or 'a'
For the special character, you can do:
select p.name
from person p
where p.name LIKE '%'||chr(8211)||'%'; --contains the character chr(8211)
The LIKE operator matches a pattern. The syntax of this command is described in detail in the Oracle documentation. You will mostly use the % sign as it means match zero or more characters.

ADTC's answer works fine, but I've found another solution, so I'm posting it here in case someone wants something different.
I think ADTC's solution is better, but mine also works.
Here is the other solution I found:
select p.name
from person p
where instr(p.name,chr(8211)) > 0; --contains the character chr(8211)
--at least 1 time
Thank you.
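The LIKE and INSTR approaches can be compared side by side with a small sketch. It runs on SQLite through Python's sqlite3 module, where char(8211) stands in for Oracle's chr(8211); the sample names are invented. Code point 8211 is the en dash (U+2013), which is easy to confuse with an ordinary hyphen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
# Invented sample names: the first uses an en dash, the second a plain hyphen.
conn.executemany(
    "INSERT INTO person (name) VALUES (?)",
    [("Anne\u2013Marie",), ("Anne-Marie",), ("Bob",)],
)

# Pattern match: the character may appear anywhere in the name.
with_like = conn.execute(
    "SELECT name FROM person WHERE name LIKE '%' || char(8211) || '%'"
).fetchall()

# Position lookup: instr returns 0 when the character is absent.
with_instr = conn.execute(
    "SELECT name FROM person WHERE instr(name, char(8211)) > 0"
).fetchall()

matches = [name for (name,) in with_like]
print(with_like == with_instr)
print(matches)
```

Both queries select the same single row here, the name containing the en dash.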

You used the keyword CONTAINS in your sample queries and question. CONTAINS lets you search against columns that have been indexed with an Oracle Text full-text index.
Because these columns are full-text indexed, you can efficiently query them to search for words and phrases anywhere within the text columns without triggering a full table scan. Depending on how they are used, LIKE or INSTR will almost always result in a full table scan.
CONTAINS is used to search for words and phrases. Although there are many options it is not appropriate if you are looking for embedded characters such as 'A' or chr(8211).
The following query will return all rows that contain the word "smith" anywhere in their text.
SELECT score(1), p.name
FROM person p
WHERE CONTAINS(p.name, 'smith', 1) > 0;
For more details see:
How does contains() in PL-SQL work?
Oracle SQL "contains" clause tips
Oracle: Contains Documentation
Oracle: Contains Operators

Just in case you need to find out whether a column has any values that contain a non-digit character, you can use regexp_like. The first parameter is the column to be checked and the second parameter is the regular expression.
If the SQL below returns a count greater than zero, there is at least one row whose value contains one or more non-digit characters.
SELECT COUNT(*)
FROM TABLE_NAME
WHERE regexp_like (COLUMN_NAME, '[^0-9]')
