SQL query equality comparison with special character that is equal to everything

I am writing a Python script that gets information from a database through SQL queries. Let's say we have an SQL table with information about some people. I have one query that retrieves this information for a specific person, whose name I pass as an argument to the query.
(" SELECT telephone FROM People_info WHERE name=%s " % (name))
Is it possible to pass as an argument a special character, or something similar, that will return the telephone for all the names? I mean a value that compares as equal to every name. I want to use only one query for all the cases (whether I want the info about one person or all of them).

You can change the SQL code to
SELECT telephone FROM People_info WHERE name=nvl(%s, name)
and pass null if you want to get all the records. (nvl is the Oracle function; COALESCE is the standard SQL equivalent.)
Note that this will never return the records where name is null, but I suppose this is not a problem.
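A minimal sketch of the same idea from Python, assuming SQLite through the standard sqlite3 module and using COALESCE in place of nvl. The sample rows and the helper name phones_for are made up for illustration, and the placeholder is ? here (other drivers use %s). The value is passed as a bound parameter instead of being formatted into the SQL string.

```python
import sqlite3

def phones_for(conn, name=None):
    """Return telephones for one name, or for every non-NULL name when name is None."""
    # COALESCE(?, name) evaluates to the row's own name when the parameter is NULL,
    # so the comparison is true for every row whose name is not NULL.
    sql = (
        "SELECT telephone FROM People_info "
        "WHERE name = COALESCE(?, name) ORDER BY telephone"
    )
    return [row[0] for row in conn.execute(sql, (name,))]

# Hypothetical sample data, only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People_info (name TEXT, telephone TEXT)")
conn.executemany(
    "INSERT INTO People_info VALUES (?, ?)",
    [("alice", "111"), ("bob", "222"), (None, "333")],
)

one = phones_for(conn, "bob")   # a single person
everyone = phones_for(conn)     # None is bound as NULL, so all named rows match
```

The row with a NULL name is skipped in both calls, which is the limitation mentioned above.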

You can use LIKE and the wildcard %, which matches zero, one, or any number of arbitrary characters.
SELECT telephone
FROM people_info
WHERE name LIKE '%';
However, it won't show records where name IS NULL.
Maybe the optimizer is smart enough to see that this is actually equivalent to WHERE name IS NOT NULL and uses an index, if there is one. But if it doesn't see it, this may cost more than necessary. So I'd rather change the WHERE clause in the application to what I actually want (or omit it completely, if I want all records) than use such tricks.
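Changing the WHERE clause in the application could look roughly like this. It is only a sketch: build_phone_query is a hypothetical helper, the sample rows are invented, and SQLite via sqlite3 stands in for whatever database is actually used.

```python
import sqlite3

def build_phone_query(name=None):
    """Return (sql, params); the WHERE clause is added only when a name is given."""
    sql = "SELECT telephone FROM People_info"
    params = ()
    if name is not None:
        sql += " WHERE name = ?"
        params = (name,)
    return sql, params

# Hypothetical sample data, only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People_info (name TEXT, telephone TEXT)")
conn.executemany(
    "INSERT INTO People_info VALUES (?, ?)",
    [("alice", "111"), ("bob", "222"), (None, "333")],
)

sql_all, params_all = build_phone_query()
sql_one, params_one = build_phone_query("bob")
all_phones = sorted(row[0] for row in conn.execute(sql_all, params_all))
bob_phones = [row[0] for row in conn.execute(sql_one, params_one)]
```

Unlike the wildcard or NULL tricks, the query without a WHERE clause also returns the row whose name is NULL.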

Related

How to search for string in SQL treating apostrophe and single quote as equal

We have a database where our customer has typed "Bob's" one time and "Bob’s" another time. (Note the slight difference between the single-quote and apostrophe.)
When someone searches for "Bob's" or "Bob’s", I want to find all cases regardless of what they used for the apostrophe.
The only thing I can come up with is looking at people's queries and replacing every occurrence of one or the other with (’|'') (Note the escaped single quote) and using SIMILAR TO.
SELECT * from users WHERE last_name SIMILAR TO 'O(’|'')Dell'
Is there a better way, ideally some kind of setting that allows these to be interchangeable?
You can use regexp matching
with a_table(str) as (
values
('Bob''s'),
('Bob’s'),
('Bobs')
)
select *
from a_table
where str ~ 'Bob[''’]s';
str
-------
Bob's
Bob’s
(2 rows)
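If the search term comes from user input, the character class can be generated instead of typed by hand. The sketch below does that in Python and checks the result with Python's own re module. The helper name apostrophe_pattern is hypothetical, and I am assuming the generated pattern is also acceptable to the Postgres ~ operator, which I have not verified for every input.

```python
import re

# The two characters to treat as interchangeable: ASCII apostrophe and U+2019.
APOSTROPHES = "'\u2019"

def apostrophe_pattern(term):
    """Turn a search term into a regex where either apostrophe form matches."""
    parts = []
    for ch in term:
        if ch in APOSTROPHES:
            parts.append("['\u2019]")   # accept either form at this position
        else:
            parts.append(re.escape(ch))  # everything else is matched literally
    return "".join(parts)

pattern = apostrophe_pattern("Bob's")
names = ["Bob's", "Bob\u2019s", "Bobs"]
matches = [n for n in names if re.fullmatch(pattern, n)]
```

Both spellings of the search term produce the same pattern, so it does not matter which one the user typed.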
Personally, I would replace all the apostrophes in the table with one query (I had the same problem in one of my projects).
If both forms are valid and carry the same information, you might consider cleaning up your data before it reaches the database. That means you could replace one character with the other in your application code or in a BEFORE INSERT trigger.
If you have more cases like the one you've mentioned, then writing LIKE queries would be the way to go, unfortunately.
You could also show your customer hints while they create a new user: fetch records from the database and display the closest matches, if there are any, to avoid such problems.
I'm afraid there is no setting that makes these two symbols equivalent in the DQL of Postgres. At least I'm not aware of one.
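Normalizing in application code, as suggested above, could be as small as the following sketch. The function name normalize_apostrophes is made up, and it only handles the one typographic apostrophe (U+2019) from the question. The same function would be applied both to values before they are inserted and to search terms before they are compared.

```python
def normalize_apostrophes(text):
    """Replace the typographic apostrophe (U+2019) with the ASCII one."""
    return text.replace("\u2019", "'")

stored = normalize_apostrophes("Bob\u2019s")    # value on its way into the table
searched = normalize_apostrophes("Bob's")        # term typed into the search box
same = stored == searched
```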

SQL wildcard match part of value

In a table I store the names of people, and I want to use a wildcard to check whether part of a student's name is found. I tried using wildcards, and this works if the search value is shorter than a value already in the database, e.g.
WHERE name LIKE '%Stu%'
If I have a person called 'Stuart', this will return a row. However, if I mistype the student's name so that it is longer than the stored one, i.e.:
WHERE name LIKE '%Stuarfd%'
then I get no results returned. Is there any way to match only part of the string?
Have you tried using SOUNDEX?
WHERE SOUNDEX(name) = SOUNDEX('Stuarf')
More info in:
https://www.sqlite.org/lang_corefunc.html#soundex
What you might want is Levenshtein distance, which is also called edit distance. This is the number of edits needed to change one string into another.
SQLite implements this through an extension called spellfix1. The particular function is spellfix1_editdist(). Levenshtein distance is not really useful for partial matches; the idea is to be able to type in "Stuarf" and find "Stuart", because they are close.
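To make the edit-distance idea concrete, here is a plain Levenshtein implementation in Python. It is not the spellfix1 function, just the textbook dynamic-programming version. The helper closest, the max_distance threshold of 2, and the sample names are assumptions for illustration.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,          # delete ca
                current[j - 1] + 1,       # insert cb
                previous[j - 1] + cost,   # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]

def closest(query, names, max_distance=2):
    """Return the names within max_distance edits of query, nearest first."""
    scored = [(levenshtein(query.lower(), n.lower()), n) for n in names]
    return [n for d, n in sorted(scored) if d <= max_distance]

names = ["Stuart", "Stella", "Steven"]
result = closest("Stuarfd", names)
```

Computing this in the application means reading every candidate name, so for a large table it is probably better to let the database narrow the candidates first.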

Wildcards in database

Does anyone have any pointers on how I can store wildcards in a database and then see which row(s) a string matches? Can it be done?
e.g.
DB contains a table like:
So john3136 should get 10 times his regular pay. fred3136 would get half his regular pay.
harry3136 probably crashes the app since there is no matching data ;-)
The code needs to do something like:
foreach(Employee e in all_employees) {
SELECT Multiplier FROM PayScales WHERE
//??? e.Name matches the PayScales.Name wildcard
}
Thanks!
Edit
This is a real world issue: I've got a parameter file that contains wildcards. The code currently iterates through employees, iterates through the param file looking for a match - you can see why I'd like to "databaserize" it ;-)
Wildcards are optional. The row could have said "john3136" to only match one employee. (The real app isn't actually employees, so it does make sense even if it looks like overkill in this simple example)
One option open: I do know all the employee names before I start, so I could iterate through them and effectively expand the wildcards in a temporary table. (so if I have john3136* in the starting table, it might expand to john3136, john31366 etc based on the list of employees). I was hoping to find a better way than this since it requires more maintenance (e.g. if we add functionality to add an employee we need to maintain the expanded wildcards table).
SELECT * FROM payscales
WHERE e.Name
LIKE regexp_replace(name, E'^\\*|\\*$', '%', 'g');
I don't know which database you're using. The query above works on PostgreSQL: it simply replaces your leading and trailing wildcard with %, which is the LIKE wildcard.
If no wildcard is present, it must match the full string.
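The same trick, with the stored pattern on the right-hand side of LIKE, can be tried out from Python. This sketch assumes SQLite instead of PostgreSQL, so it uses replace() rather than regexp_replace() and therefore converts every asterisk, not only leading and trailing ones. The table rows are invented, since the original table is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PayScales (Name TEXT, Multiplier REAL)")
conn.executemany(
    "INSERT INTO PayScales VALUES (?, ?)",
    # Hypothetical rows: two wildcard patterns and one exact name.
    [("john*", 10.0), ("fred*", 0.5), ("mary3136", 2.0)],
)

def multipliers_for(conn, employee_name):
    """Return the multipliers of all stored patterns that match employee_name."""
    # The employee name is the value; the stored Name column is the pattern.
    sql = "SELECT Multiplier FROM PayScales WHERE ? LIKE replace(Name, '*', '%')"
    return [row[0] for row in conn.execute(sql, (employee_name,))]

john = multipliers_for(conn, "john3136")
harry = multipliers_for(conn, "harry3136")
```

An empty list for harry3136 is how the "no matching data" case shows up. Note that an underscore in a stored name would also act as a single-character wildcard in LIKE.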

Best SQL query for list of records containing certain characters?

I'm working with a relatively large SQL Server 2000 DB at the moment. It's 80 GB in size and has millions and millions of records.
I currently need to return a list of names that contain at least one of a series of illegal characters. By illegal characters I just mean an arbitrary list of characters defined by the customer. In the example below I use question mark, semicolon, period and comma as the illegal character list.
I was initially thinking of writing a CLR function that worked with regular expressions, but as it's SQL Server 2000, I guess that's out of the question.
At the moment I've done like this:
select x from users
where
columnToBeSearched like '%?%' OR
columnToBeSearched like '%;%' OR
columnToBeSearched like '%.%' OR
columnToBeSearched like '%,%' OR
otherColumnToBeSearched like '%?%' OR
otherColumnToBeSearched like '%;%' OR
otherColumnToBeSearched like '%.%' OR
otherColumnToBeSearched like '%,%'
Now, I'm not a SQL expert by any means, but I get the feeling that the above query will be very inefficient. Doing 8 multiple wildcard searches in a table with millions of records, seems like it could slow the system down rather seriously. While it seems to work fine on test servers, I am getting the "this has to be completely wrong" vibe.
As I need to execute this script on a live production server eventually, I hope to achieve good performance, so as not to clog the system. The script might need to be expanded later on to include more illegal characters, but this is very unlikely.
To sum up: My aim is to get a list of records where either of two columns contain a customer-defined "illegal character". The database is live and massive, so I want a somewhat efficient approach, as I believe the above queries will be very slow.
Can anyone tell me the best way for achieving my result? Thanks!
/Morten
It doesn't get used much, but the LIKE operator accepts patterns in a similar (but much simplified) way to regex. The MSDN page for LIKE describes them.
In your case you could simplify to (untested):
select x from users
where
columnToBeSearched like '%[?;.,]%' OR
otherColumnToBeSearched like '%[?;.,]%'
Also note that you can create the LIKE pattern as a variable, allowing for the customer defined part of your requirements.
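Building the pattern from the customer's list could be done in the calling code, for example like this. The function like_char_class is a hypothetical helper. To keep the sketch simple it refuses characters that have a special meaning inside the brackets rather than trying to escape them. The resulting string would then be passed to the query as a parameter for the LIKE comparison.

```python
def like_char_class(chars):
    """Build a LIKE pattern that matches any value containing one of chars."""
    # '[', ']', '^' and '-' are special inside a bracket expression, so this
    # sketch rejects them instead of escaping them.
    unsupported = set(chars) & set("[]^-")
    if not chars or unsupported:
        raise ValueError("unsupported character list: %r" % (chars,))
    return "%[" + chars + "]%"

pattern = like_char_class("?;.,")
```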
One other major optimization: If you've got an updated date (or timestamp) on the user row (for any audit history type of thing), then you can always just query rows updated since the last time you checked.
If this is a query that will be run repeatedly, you are probably better off creating an index for it. The syntax escapes me at the moment, but you could probably create a computed column (edit: probably a PERSISTED computed column) which is 1 if columnToBeSearched or otherColumnToBeSearched contain illegal characters, and 0 otherwise. Create an index on that column and simply select all rows where the column is 1. This assumes that the set of illegal characters is fixed for that database installation (I assume that that's what you mean by "specified by the customer"). If, on the other hand, each query might specify a different set of illegal characters, this won't work.
By the way, if you don't mind the risk of reading uncommitted rows, you can run the query in a transaction with the isolation level READ UNCOMMITTED, so that you won't block other transactions.
You can try to partition your data horizontally and "split" your query in a number of smaller queries. For instance you can do
SELECT x FROM users
WHERE users.ID BETWEEN 1 AND 5000
AND -- your filters on columnToBeSearched
Putting your results back together in one list may be a little inconvenient, but if it's a report you're only extracting once (or once in a while) it may be feasible.
I'm assuming ID is the primary key of users or a column that has an index defined, which means SQL Server should be able to create an efficient execution plan, where it evaluates users.ID BETWEEN 1 AND 5000 (fast) before trying to check the filters (which may be slow).
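A rough sketch of driving such range queries from a script and collecting the results into one list. It runs against an in-memory SQLite table so that it can be tried anywhere. The batch size, the sample rows, and the helper names id_ranges and find_illegal are assumptions, and only two of the illegal characters are checked to keep it short.

```python
import sqlite3

def id_ranges(min_id, max_id, batch_size):
    """Yield inclusive (low, high) ID ranges that cover min_id..max_id."""
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1

def find_illegal(conn, batch_size=5000):
    """Collect IDs of rows containing ';' or ',' one ID range at a time."""
    min_id, max_id = conn.execute("SELECT MIN(ID), MAX(ID) FROM users").fetchone()
    if min_id is None:  # empty table
        return []
    sql = (
        "SELECT ID FROM users WHERE ID BETWEEN ? AND ? "
        "AND (columnToBeSearched LIKE '%;%' OR columnToBeSearched LIKE '%,%') "
        "ORDER BY ID"
    )
    found = []
    for low, high in id_ranges(min_id, max_id, batch_size):
        found.extend(row[0] for row in conn.execute(sql, (low, high)))
    return found

# Hypothetical sample data: ten rows, two of them with an illegal character.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (ID INTEGER PRIMARY KEY, columnToBeSearched TEXT)")
values = {3: "a;b", 8: "c,d"}
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(i, values.get(i, "clean")) for i in range(1, 11)],
)

flagged = find_illegal(conn, batch_size=4)
```

Each batch is a separate short query, so a pause between batches could be added if the live server needs breathing room.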
Look up PATINDEX. It accepts a set of characters in its pattern: PATINDEX('%[._]%', ColumnName) returns 0 if none of them is found, or otherwise the position of the first occurrence of an illegal character in the value. Hope this helps.

How do I perform a simple one-statement SQL search across tables?

Suppose that two tables exist: users and groups.
How does one provide "simple search" in which a user enters text and results contain both users and groups whose names contain the text?
The result of the search must distinguish between the two types.
The trick is to combine a UNION with a literal string to determine the type of 'object' returned. In most (?) cases, UNION ALL will be more efficient, and should be used unless duplicates have to be removed from the combined result. The following pattern should suffice:
SELECT 'group' AS type, name
FROM groups
WHERE name LIKE '%$text%'
UNION ALL
SELECT 'user' AS type, name
FROM users
WHERE name LIKE '%$text%'
NOTE: I've added the answer myself, because I came across this problem yesterday, couldn't find a good solution, and used this method. If someone has a better approach, please feel free to add it.
If you use "UNION ALL" then the db doesn't try to remove duplicates - you won't have duplicates between the two queries anyway (since the first column is different), so UNION ALL will be faster.
(I assume that you don't have duplicates inside each query that you want to remove)
Using LIKE will cause a number of problems, as it requires a table scan every single time the LIKE pattern starts with a %. This forces SQL to check every single row and work its way, byte by byte, through the string you are using for comparison. While this may be fine when you start, it quickly causes scaling issues.
A better way to handle this is using Full Text Search. While this would be a more complex option, it will provide you with better results for very large databases. Then you can use a functioning version of the example Bobby Jack gave you to UNION ALL your two result sets together and display the results.
I would suggest another addition
SELECT 'group' AS type, name
FROM groups
WHERE UPPER(name) LIKE UPPER('%$text%')
UNION ALL
SELECT 'user' AS type, name
FROM users
WHERE UPPER(name) LIKE UPPER('%$text%')
You could convert $text to upper case first or just do it in the query. This way you get a case-insensitive search.
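For reference, a runnable version of the UNION ALL search with the text passed as a bound parameter rather than interpolated into the SQL. This is a sketch against SQLite via sqlite3. The sample rows and the function name simple_search are made up, and % or _ typed by the user are not escaped here, so they would still act as wildcards.

```python
import sqlite3

def simple_search(conn, text):
    """Return (type, name) rows from both tables whose name contains text."""
    pattern = "%" + text.upper() + "%"   # upper-cased once, in the application
    sql = """
        SELECT 'group' AS type, name FROM "groups" WHERE UPPER(name) LIKE ?
        UNION ALL
        SELECT 'user' AS type, name FROM users WHERE UPPER(name) LIKE ?
        ORDER BY type, name
    """
    return conn.execute(sql, (pattern, pattern)).fetchall()

# Hypothetical sample data, only for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "groups" (name TEXT)')
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany('INSERT INTO "groups" VALUES (?)', [("Admins",), ("Guests",)])
conn.executemany("INSERT INTO users VALUES (?)", [("adam",), ("bob",)])

hits = simple_search(conn, "ad")
```

The literal first column is what lets the caller tell a group result from a user result.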