Create custom lookup in Django similar to startswith - sql

I need to create a custom field lookup in Django that is similar to startswith.
The built in startswith lookup works like this:
id__startswith='abc'
So if the ID field is 'abc1' then it would match.
However I need create a 'contained from start' (cfs) lookup that looks this:
id__cfs='abc12'
So if the ID field is 'abc' then it would match.
Basically __startswith but the other way around.
How should I implement this?

It turns out that creating this custom lookup would be inefficient. For example, if the custom query looked like this:
id__custom='abc12'
Then the sql query would look like this:
WHERE 'abc12' LIKE id||%
However, this LIKE query would be very slow because it can't use any indexes. Every single record would have to be compared to 'abc12'. So the best way to achieve the same goal is to divide the string into substrings and store them in a list. Then you can search for those using the __in lookup.
For instance, if you wanted to find matches that contained at least half of the string you could create three substrings and search for those. By using the __in lookup your query would be much faster because you could use existing indexes. The result would look like this:
substrings = ['abc', 'abc1', 'abc12']
query = Item.objects.filter(id__in=substrings).values('id', 'manufacturer')

Related

Full Text Search query contains column

Is it possible to do the opposite of a CONTAINS in SQL Server? I want my query to match the column, not the opposite. For example:
Not this:
WHERE CONTAINS(Description, '"Just A Test"')
But this:
WHERE CONTAINS('"Just A Test"', Description)
It's of course possible with a LIKE but that's rather slow and I'd like to use FTS for this.

Can you use DOES NOT CONTAIN in SQL to replace not like?

I have a table called logs.lognotes, and I want to find a faster way to search for customers who do not have a specific word or phrase in the note. I know I can use "not like", but my question is, can you use DOES NOT CONTAINS to replace not like, in the same way you can use:
SELECT *
FROM table
WHERE CONTAINS (column, ‘searchword’)
Yes, you should be able to use NOT on any boolean expression, as mentioned in the SQL Server Docs here. And, it would look something like this:
SELECT *
FROM table
WHERE NOT CONTAINS (column, ‘searchword’)
To search for records that do not contain the 'searchword' in the column. And, according to
Performance of like '%Query%' vs full text search CONTAINS query
this method should be faster than using LIKE with wildcards.
You can also simply use this:
select * from tablename where not(columnname like '%value%')

How to overcome ( SQL Like operator ) performance issues?

I have Users table contains about 500,000 rows of users data
The Full Name of the User is stored in 4 columns each have type nvarchar(50)
I have a computed column called UserFullName that's equal to the combination of the 4 columns
I have a Stored Procedure searching in Users table by name using like operatior as below
Select *
From Users
Where UserFullName like N'%'+#FullName+'%'
I have a performance issue while executing this SP .. it take long time :(
Is there any way to overcome the lack of performance of using Like operator ?
Not while still using the like operator in that fashion. The % at the start means your search needs to read every row and look for a match. If you really need that kind of search you should look into using a full text index.
Make sure your computed column is indexed, that way it won't have to compute the values each time you SELECT
Also, depending on your indexing, using PATINDEX might be quicker, but really you should use a fulltext index for this kind of thing:
http://msdn.microsoft.com/en-us/library/ms187317.aspx
If you are using index it will be good.
So you can give the column an id or something, like this:
Alter tablename add unique index(id)
Take a look at that article http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/like-performance-tuning .
It easily describes how LIKE works in terms of performance.
Likely, you are facing such issues because your whole table shoould be traversed because of first % symbol.
You should try creating a list of substrings(in a separate table for example) representing k-mers and search for them without preceeding %. Also, an index for such column would help. Please read more about KMer here https://en.m.wikipedia.org/wiki/K-mer .
This will not break an index and will be more efficient to search.

MySQL: select the closest match?

I want to show the closest related item for a product. So say I am showing a product and the style number is SG-sfs35s. Is there a way to select whatever product's style number is closest to that?
Thanks.
EDIT: to answer your questions. Well I definitely want to keep the first 2 letters as that is the manufacturer code but as for the part after the first dash, just whatever matches closest. so for example SG-sfs35s would match SG-shs35s much more than SG-sht64s. I hope this makes sense whenever I do LIKE product_style_number it only pulls the exact match.
There normally isn't a simple way to match product codes that are roughly similar.
A more SQL friendly solution is to create a new table that maps each product to all the products it is similar to.
This table would either need to be maintained manually, or a more sophisticated script can be executed periodically to update it.
If your product codes follow a consistent pattern (all the letters are the same for similar products, with only the numbers changing), then you should be able to use a regular expression to match the similar items. There are docs on this here...
It sounds like what you want is levenshtein distance .
Unfortunately, there isn't a built-in levenshtein function for mysql, but some folks have come up with a user-defined function that does it(deadlink).
You will probably want to do it as a stored procedure, as I expect that the algorithm may not be trivial.
For example, you may split the term at the -, so you have two parts. You do a LIKE query on each part and use that to make a decision.
You could just loop though, replacing the last character with "%" until you get at least one result, in your stored procedure.
Sounds like you need something like Lucene, though i'm not sure if that would be overkill for your situation. But it certainly would be able to do text searches and return the ones most similar first.
If you need something more simple I would try to start by searching with the full product code, then if that doesn't work try to use wildcards/remove some characters until you return a result.
JD Isaacks.
This situation of yours is very simple to solve.
It`s not like you need to use Artificial Intelligence like the Google.
http://www.w3schools.com/sql/sql_wildcards.asp
Take a look at this manual at w3schools about wildcards to use with your SELECT code.
But also you will need to create a new table with 3 columns: LeftCode, RightCode and WildCard.
Example:
Rows on Table:
LeftCode = SG | RightCode = 35s | WildCard = SG-s_s35s
LeftCode = SG | RightCode = 64s | WildCard = SG-s_t64s
SQL Code
If the user typed the code that matches the row1 of the table:
SELECT * FROM PRODUCTS WHERE CODE LIKE "$WildCard";
Where $WildCard is the PHP variable containing the column 3 of the new table.
I hope I helped, even 4 years late...

What is the best way to do a wildcard search in sql server 2005?

So I have a stored procedure that accepts a product code like 1234567890. I want to facilitate a wildcard search option for those products. (i.e. 123456*) and have it return all those products that match. What is the best way to do this?
I have in the past used something like below:
SELECT #product_code = REPLACE(#product_code, '*', '%')
and then do a LIKE search on the product_code field, but i feel like it can be improved.
What your doing already is about the best you can do.
One optimization you might try is to ensure there's an index on the columns you're allowing this on. SQL Server will still need to do a full scan for the wildcard search, but it'll be only over the specific index rather than the full table.
As always, checking the query plan before and after any changes is a great idea.
A couple of random ideas
It depends, but you might like to consider:
Always look for a substring by default. e.g. if the user enters "1234", you search for:
WHERE product like "%1234%"
Allow users full control. i.e. simply take their input and pass it to the LIKE clause. This means that they can come up with their own custom searches. This will only be useful if your users are interested in learning.
WHERE product like #input