In a Rails WHERE LIKE query, what does the percent sign mean? - sql

In a simple search like this:
find.where('name LIKE ?', "%#{search}%")
I understand that #{search} is just string interpolation. What do the % symbols do?

The percent sign % is a wildcard in SQL that matches zero or more characters. Thus, if search is "hello", it would match strings in the database such as "hello", "hello world", "well hello world", etc.
Note that this is a part of SQL and is not specific to Rails/ActiveRecord. The queries it can be used with, and the precise behavior of LIKE, differ based on SQL dialect (MySQL, PostgreSQL, etc.).
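For instance, against a hypothetical users table the wildcard behaves like this:

SELECT name FROM users WHERE name LIKE '%hello%';
-- matches:        'hello', 'hello world', 'well hello world'
-- does not match: 'hell no' (the characters h-e-l-l-o must appear contiguously)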

search = 'something'
find.where('name LIKE ?', "%#{search}%")
In your DB it will be interpreted as
SELECT <fields> FROM finds WHERE name LIKE '%something%';

The percent sign in a like query is a wildcard. So, your query is saying "anything, followed by whatever is in the search variable, followed by anything".
Note that this use of the percent sign is part of the SQL standard and not specific to Rails or ActiveRecord. Also be aware that this kind of search does not scale well -- with a leading wildcard, your SQL database will be forced to scan through every row in the table looking for matches rather than being able to rely on an index.
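As a rough illustration (PostgreSQL syntax; the table and index names are made up for the sketch), a leading wildcard defeats a plain B-tree index, while a prefix-only pattern can still use one:

CREATE INDEX finds_name_idx ON finds (name text_pattern_ops);
EXPLAIN SELECT * FROM finds WHERE name LIKE '%something%'; -- planner falls back to a sequential scan
EXPLAIN SELECT * FROM finds WHERE name LIKE 'something%';  -- can use the index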

Related

DB2 complex like

I have to write a select statement that matches the following pattern:
[A-Z][0-9][0-9][0-9][0-9][A-Z][0-9][0-9][0-9][0-9][0-9]
The only thing I'm sure of is that the first A-Z WILL be there. All the rest is optional and the optional part is the problem. I don't really know how I could do that.
Some example data:
B/0765/E 3
B/0765/E3
B/0764/A /02
B/0749/K
B/0768/
B/0784//02
B/0807/
My guess is that it's best to remove all the white space and the / in the data and then execute the select statement. But I'm having some problems actually writing the LIKE pattern. Anyone who could help me out?
The underlying reason for this is that I'm migrating a database. In the old database the values are just in 1 field but in the new one they are split into several fields, and I first have to write a "control script" to know which records in the old database are not correct.
Even the following isn't working:
where someColumn LIKE '[a-zA-Z]%';
You can use a regular expression via xQuery to define this pattern. There are many questions on Stack Overflow that talk about patterns in DB2, and they have been solved with regular expressions.
DB2: find field value where first character is a lower case letter
Emulate REGEXP like behaviour in SQL
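As a sketch only: on a DB2 version that has REGEXP_LIKE (11.1 and later; on older releases the xQuery fn:matches route from the linked questions applies), the control query could look roughly like this. The table and column names are placeholders and the pattern is just a loose reading of "first letter mandatory, everything after it optional"; note that plain LIKE in DB2 only understands % and _, which is why the [a-zA-Z] attempt fails.

select *
from   old_table
where  not regexp_like(
         replace(replace(some_column, ' ', ''), '/', ''),
         '^[A-Z][0-9]{0,4}([A-Z][0-9]{0,5})?$')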

String matching in Peewee (SQL)

I am trying to query in Peewee with results that should have a specific substring in them.
For instance, if I want only activities with "Physics" in the name:
schedule = Session.select().join(Activity).where(Activity.name % "%Physics%").join(Course).join(StuCouRel).join(Student).where(Student.id == current_user.id)
The above example doesn't give any errors, but doesn't work correctly.
In python, I would just do if "Physics" in Activity.name, so I'm looking for an equivalent which I can use in a query.
You could also use these query methods: .contains(substring), .startswith(prefix), .endswith(suffix).
For example, your where clause could be:
.where(Activity.name.contains("Physics"))
I believe this is case-insensitive and behaves the same as LIKE '%Physics%'.
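Under the hood those helpers just turn into LIKE-style comparisons; roughly (whether it ends up as LIKE or the case-insensitive ILIKE depends on the backend and the peewee version, and the activity table name here is an assumption):

SELECT * FROM activity WHERE name LIKE '%Physics%'; -- Activity.name.contains('Physics')
SELECT * FROM activity WHERE name LIKE 'Physics%';  -- Activity.name.startswith('Physics')
SELECT * FROM activity WHERE name LIKE '%Physics';  -- Activity.name.endswith('Physics')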
Quick answer:
just use Activity.name.contains('Physics')
Depending on the database backend you're using, you'll want to pick the right "wildcard". PostgreSQL and MySQL use "%", but for SQLite, if you're performing a LIKE query you will actually want to use "*" (although for ILIKE it is "%", confusingly).
I'm going to guess you're using SQLite since the above query is failing, so to recap, with SQLite if you want case-sensitive partial-string matching: Activity.name % "*Physics*", and for case-insensitive: Activity.name ** "%Physics%".
http://www.sqlite.org/lang_expr.html#like
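Part of the confusion is that SQLite itself has two pattern operators: LIKE, which is case-insensitive for ASCII and uses %, and GLOB, which is case-sensitive and uses *. For example:

SELECT * FROM activity WHERE name LIKE '%physics%'; -- matches 'Physics 101' (case-insensitive)
SELECT * FROM activity WHERE name GLOB '*Physics*'; -- case-sensitive match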

In rails, is it possible to do a query that returns all the records with a minimum number of characters in a text or string column?

I ask because a thorough Google search returns no clue as to how to do this.
I am trying to pull an example of a column field which is rarely used and is unfortunately littered with newlines and dashes even in empty ones, so I can't just ask for ones that have data. I need to ask for a column that has at least 10-15 characters or something like this. I can also imagine this query being useful for validating pre-existing data. I know about the validator that does this, but I'm not trying to validate, I'm trying to search.
Thanks!
It seems ActiveRecord does not support this directly, but you can do it anyway like this (MySQL example):
Model.where("CHAR_LENGTH(text_field) >= ?", 10)
In Postgres the same should work; the documentation there also uses char_length().
Another option is to store the size of the field in its own column with a callback when saving the record:
before_save { |r| r.text_field_size = r.text_field.to_s.size }
With this in place you can query on that column, which will be DB agnostic:
Model.where(text_field_size: 10)
I think you'll have to write that part of the query in SQL.
For MySQL, use something like:
Model.where("CHAR_LENGTH(field_name) >= ?", min_length)

like sql freetext search

Maybe this is a dumb question.
How do I achieve the SQL LIKE '%test%' behavior using full-text search?
I thought CONTAINS and FREETEXT were equivalent to LIKE '%test%', plus one gets grammar handling and better performance.
In my case I have :
select * from ServiceOfferResultIndexed where contains(FirstName,'test')
which gives me 18 rows.
select * from ServiceOfferResultIndexed where FirstName like '%test%'
which gives me 229 rows.
thanks for the help.
The DB is MS SQL 2005. I think it does support the * as a postfix. It seems that if I provide '"*test"' the * is considered a word rather than a wildcard; test becomes the following word.
CONTAINS will only support "test*", where it looks for all the terms starting with 'test' followed by any other characters.
Those are two different expressions. LIKE '%test%' is going to find any row where those four characters appear together (e.g. testimonial or contestant), whereas CONTAINS is going to match on words.
I don't know what full text engine you're using, but usually wildcards are supported. For example, what happens if you use where contains(FirstName, '*test*')? You'll have to consult your specific DBMS's documentation for wildcards in full-text search.
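For SQL Server full text specifically, a prefix search has to be written as a quoted prefix term, and a leading wildcard is not supported at all -- something along these lines:

SELECT * FROM ServiceOfferResultIndexed WHERE CONTAINS(FirstName, '"test*"');
-- matches rows whose FirstName contains a word starting with 'test' (test, testing, tester),
-- but not words that merely contain it, such as 'attest' -- which is what LIKE '%test%' also finds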

How to create simple fuzzy search with PostgreSQL only?

I have a little problem with the search functionality on my RoR based site. I have many Products with some codes. A code can be any string like "AB-123-lHdfj". Now I use the ILIKE operator to find products:
Product.where("code ILIKE ?", "%" + params[:search] + "%")
It works fine, but it can't find products with codes like "AB123-lHdfj" or "AB123lHdfj".
What should I do about this? Maybe Postgres has some string normalization function, or some other method to help me?
Postgres provides a module with several string comparison functions such as soundex and metaphone. But you will want to use the levenshtein edit distance function.
Example:
test=# SELECT levenshtein('GUMBO', 'GAMBOL');
 levenshtein
-------------
           2
(1 row)
The 2 is the edit distance between the two words. When you apply this against a number of words and sort by the edit distance result you will have the type of fuzzy matches that you're looking for.
Try this query sample: (with your own object names and data of course)
SELECT *
FROM some_table
WHERE levenshtein(code, 'AB123-lHdfj') <= 3
ORDER BY levenshtein(code, 'AB123-lHdfj')
LIMIT 10
This query says:
Give me the top 10 results of all data from some_table where the edit distance between the code value and the input 'AB123-lHdfj' is 3 or less. You will get back all rows where the value of code is within 3 characters difference to 'AB123-lHdfj'...
Note: if you get an error like:
function levenshtein(character varying, unknown) does not exist
Install the fuzzystrmatch extension using:
test=# CREATE EXTENSION fuzzystrmatch;
Paul told you about levenshtein(). That's a very useful tool, but it's also very slow with big tables. It has to calculate the Levenshtein distance from the search term for every single row. That's expensive and cannot use an index. The "accelerated" variant levenshtein_less_equal() is faster for long strings, but still slow without index support.
If your requirements are as simple as the example suggests, you can still use LIKE. Just replace any - in your search term with % in the WHERE clause. So instead of:
WHERE code ILIKE '%AB-123-lHdfj%'
Use:
WHERE code ILIKE '%AB%123%lHdfj%'
Or, dynamically:
WHERE code ILIKE '%' || replace('AB-123-lHdfj', '-', '%') || '%'
% in LIKE patterns stands for 0-n characters. Or use _ for exactly one character. Or use regular expressions for a smarter match:
WHERE code ~* 'AB.?123.?lHdfj'
.? ... 0 or 1 characters
Or:
WHERE code ~* 'AB\-?123\-?lHdfj'
\-? ... 0 or 1 dashes
You may want to escape special characters in LIKE or regexp patterns. See:
Escape function for regular expression or LIKE patterns
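A minimal illustration of why: if the search term itself contains % or _, those characters act as wildcards unless escaped:

WHERE code LIKE '%AB\%123%' ESCAPE '\'  -- finds codes containing the literal string 'AB%123'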
If your actual problem is more complex and you need something faster then there are various options, depending on your requirements:
There is full text search, of course. But this may be overkill in your case.
A more likely candidate is trigram-matching with the additional module pg_trgm. See:
Using Levenshtein function on each element in a tsvector?
PostgreSQL LIKE query performance variations
Related blog post by Depesz
It can be combined with LIKE, ILIKE, ~, or ~* since PostgreSQL 9.1.
Also interesting in this context: the similarity() function or % operator of that module.
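A minimal pg_trgm sketch (the products table and code column are assumptions based on the question):

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX products_code_trgm_idx ON products USING gin (code gin_trgm_ops);

SELECT code, similarity(code, 'AB123lHdfj') AS sml
FROM   products
WHERE  code % 'AB123lHdfj'  -- % here is the trigram similarity operator, not the LIKE wildcard
ORDER  BY sml DESC
LIMIT  10;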
Last but not least you can implement a hand-knit solution with a function to normalize the strings to be searched. For instance, you could transform AB1-23-lHdfj --> ab123lhdfj, save it in an additional column and search with terms transformed the same way.
Or use an index on the expression instead of the redundant column. (Involved functions must be IMMUTABLE.) Possibly combine that with pg_trgm from above.
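A sketch of that last approach with an expression index (names invented; lower() and replace() qualify as IMMUTABLE):

CREATE INDEX products_code_norm_idx ON products (lower(replace(code, '-', '')));

SELECT *
FROM   products
WHERE  lower(replace(code, '-', '')) = lower(replace('AB-123-lHdfj', '-', ''));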
Overview of pattern-matching techniques:
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL