The best way to use a LIKE query in WebSQL - sql

I have two tables, old_test and new_test (a Bible database).
The old_test table has 7959 rows.
The new_test table has 23145 rows.
I want to use a LIKE query to search for verses in both tables.
For example:
SELECT *
FROM old_test
where text like "%'+searchword+'%"
union all
SELECT *
FROM new_test
where text like "%'+searchword+'%"
It works well, but it takes a long time to show the results.
What is the best way to make this search much faster?
Thanks

Your %searchword% query causes a table scan, so it will get slower as the number of records increases. Use a searchword% query instead to get a fast, index-based lookup.
What you need is full-text search, which is not available in websql.
I suggest my own open source library, https://github.com/yathit/ydn-db-fulltext, for a full-text search implementation. It works with the newer IndexedDB API as well.
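The difference between the two patterns can be checked with SQLite's EXPLAIN QUERY PLAN. The following is a minimal sketch that uses Python's sqlite3 module as a stand-in for WebSQL; the sample verses and the index name idx_old_test_text are assumptions made for illustration. It also assumes the column is declared COLLATE NOCASE, because with SQLite's default case-insensitive LIKE the prefix optimization only applies to an index that uses NOCASE collation.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for one statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
# Assumption: NOCASE collation, so the default case-insensitive LIKE
# is allowed to use the index for prefix patterns.
conn.execute(
    "CREATE TABLE old_test (id INTEGER PRIMARY KEY, text TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX idx_old_test_text ON old_test (text)")
conn.executemany(
    "INSERT INTO old_test (text) VALUES (?)",
    [
        ("In the beginning God created the heaven and the earth.",),
        ("And God said, Let there be light: and there was light.",),
    ],
)

# Prefix pattern: SQLite can turn this into a range search on the index.
prefix_plan = query_plan(
    conn, "SELECT * FROM old_test WHERE text LIKE 'In the%'"
)
# Leading wildcard: every row has to be examined.
contains_plan = query_plan(
    conn, "SELECT * FROM old_test WHERE text LIKE '%light%'"
)

print(prefix_plan)
print(contains_plan)
```

The exact wording of the plan differs between SQLite versions, but the prefix query should be reported as a SEARCH and the leading-wildcard query as a SCAN.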

The main problem with your query is that it has to search every field, segment by segment, to find the string using LIKE. Building an index that can be queried instead should alleviate the problem.
Web SQL uses the SQLite engine:
User agents must implement the SQL dialect supported by Sqlite 3.6.19.
http://www.w3.org/TR/webdatabase/#parsing-and-processing-sql-statements
Based on that, I would recommend trying to build a full-text index over the table to make these searches run quickly: http://www.sqlite.org/fts3.html
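A full-text version of the two-table search might look like the sketch below. It uses Python's sqlite3 module as a stand-in for WebSQL and assumes the SQLite build includes FTS3; the table names old_test_fts and new_test_fts, the column name verse, and the sample rows are hypothetical. Note that MATCH compares whole tokens, so it is not a drop-in replacement for a substring LIKE: 'light' will not match 'lightning' unless a prefix query such as 'light*' is used.

```python
import sqlite3


def search_verses(conn, word):
    """Return verses from both FTS tables that contain `word` as a token."""
    sql = (
        "SELECT verse FROM old_test_fts WHERE verse MATCH ? "
        "UNION ALL "
        "SELECT verse FROM new_test_fts WHERE verse MATCH ?"
    )
    # Bound parameters avoid building the SQL string by concatenation.
    return [row[0] for row in conn.execute(sql, (word, word))]


conn = sqlite3.connect(":memory:")
# Hypothetical FTS3 copies of the two tables from the question.
conn.execute("CREATE VIRTUAL TABLE old_test_fts USING fts3(verse)")
conn.execute("CREATE VIRTUAL TABLE new_test_fts USING fts3(verse)")
conn.executemany(
    "INSERT INTO old_test_fts (verse) VALUES (?)",
    [
        ("In the beginning God created the heaven and the earth.",),
        ("And God said, Let there be light: and there was light.",),
    ],
)
conn.executemany(
    "INSERT INTO new_test_fts (verse) VALUES (?)",
    [
        ("I am the light of the world.",),
        ("Jesus wept.",),
    ],
)

results = search_verses(conn, "light")
for verse in results:
    print(verse)
```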

Related

How to speed up this LIKE query?

I have a like query that’s processing millions of rows:
SELECT
sample_id,
REPLACE( sample_id, '*', '') AS term
FROM
sample.table
WHERE
sample_id LIKE '%*%'
ORDER BY
sample_id ASC;
I tried batching the queries, but it's still too slow to process. Has anyone experienced this in the past and successfully solved it? I'm basically open to any ideas at this point. Thanks!
You did not mention which RDBMS you are using, but you can speed up processing by using a properly designed index.
Index properties (based on the Microsoft SQL Server RDBMS):
filtered index:
you can implement a filtered index. The filter corresponds to the WHERE clause of your query. You can add "sample_id LIKE '%*%'" as a filter condition.
covering index:
your query is not complicated, so it should be easy to create a covering index for it. By covering index I mean a structure that contains all the columns mentioned in your query. This helps the RDBMS engine decide to use it during execution, because it contains all the needed columns, and the filter as well, as mentioned in the first point.
So the syntax could look like this (Microsoft SQL Server pseudo code):
CREATE INDEX idx1 ON your_table_name (sample_id) WHERE sample_id LIKE '%*%'
If you built it, you would have a DEDICATED structure for your query. You can think of it as a subset of the data from your table, but physically present in your database, written to disk and constantly updated as the data changes. As long as this index has the filter, it contains only the rows needed by your query. So you can imagine that if the RDBMS engine chose it, after parsing and analyzing your code, the WHERE clause would not have to be evaluated.
Unfortunately, I do not know whether RDBMSes other than Microsoft SQL Server offer filtered indexes.
If your RDBMS doesn't allow filtered indexes, you can at least create a covering one. It might still be a lighter structure than your table; however, you didn't show the structure of your table.
An index doesn't come without a cost, but that is another story. Just remember that it takes up space on disk and is updated along with the data in your table.

Oracle Index query is not working

I want to improve the performance of a simple query with a typical structure like this:
SELECT title,datetime
FROM LICENSE_MOVIES
WHERE client='Alex'
As you can read on various websites, like this, you should create an index like this:
CREATE INDEX INDEX_LICENSE_MOVIES
ON LICENSE_MOVIES(client);
But there is no performance improvement in the query; it is as if it were "ignoring" the index.
I have tried to use hints, as this webpage says.
The query then looks like this:
SELECT /*+ INDEX(LICENSE_MOVIES INDEX_LICENSE_MOVIES) */ title, datetime
FROM LICENSE_MOVIES
WHERE client='Alex'
Is there any error in this syntax? Why don't I see any improvement?
Oracle has a smart optimizer. It does not always use indexes -- in fact, you might be surprised to learn that sometimes using an index is exactly the wrong thing to do.
In your case, your data fits on a handful of data pages (well, dozens). The question is: how many "Alex"s are in the data? If there is just one, then Oracle should use the index, as follows:
Oracle looks up the row containing "Alex" in the index.
Oracle identifies the data page where the row is located.
Oracle loads the data page.
Oracle processes the query and returns the results.
If lots of rows (say more than a few dozen) are for "Alex", then the optimizer is going to "think" . . . "Gosh, I need to read every data page anyway. Let me avoid using the index and just scan all the data."
Of course, this decision is based on the available statistics (which might be inaccurate or out-of-date). But there are definitely circumstances where a full table scan is the right approach, even when an index is available.

SQLite FTS3 Query Slower than Standard Table

I built sqlite3 from source to include the FTS3 support and then created a new table in an existing sqlite database containing 1.5million rows of data, using
CREATE VIRTUAL TABLE data USING FTS3(codes text);
Then used
INSERT INTO data(codes) SELECT originalcodes FROM original_data;
Then queried each table with
SELECT * FROM original_data WHERE originalcodes='RH12';
This comes back instantly, as I have an index on that column.
The query on the FTS3 table
SELECT * FROM data WHERE codes='RH12';
takes almost 28 seconds.
Can someone help explain what I have done wrong? I expected this to be significantly quicker.
The documentation explains:
FTS tables can be queried efficiently using SELECT statements of two different forms:
Query by rowid. If the WHERE clause of the SELECT statement contains a sub-clause of the form "rowid = ?", where ? is an SQL expression, FTS is able to retrieve the requested row directly using the equivalent of an SQLite INTEGER PRIMARY KEY index.
Full-text query. If the WHERE clause of the SELECT statement contains a sub-clause of the form "<column> MATCH ?", FTS is able to use the built-in full-text index to restrict the search to those documents that match the full-text query string specified as the right-hand operand of the MATCH clause.
If neither of these two query strategies can be used, all queries on FTS tables are implemented using a linear scan of the entire table.
For an efficient query, you should use
SELECT * FROM data WHERE codes MATCH 'RH12'
but this will find all records that contain the search string.
To do 'normal' queries efficiently, you have to keep a copy of the data in a normal table.
(If you want to save space, you can use a contentless or external content table.)
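The difference in results between the two forms can be seen on a tiny table. This is a minimal sketch using Python's sqlite3 module, assuming the SQLite build includes FTS3; the three sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE data USING fts3(codes text)")
conn.executemany(
    "INSERT INTO data (codes) VALUES (?)",
    [("RH12",), ("RH12 RH13",), ("RH1",)],
)

# Equality cannot use the full-text index (linear scan) and only
# returns rows whose whole value is exactly 'RH12'.
exact = sorted(
    row[0] for row in conn.execute("SELECT codes FROM data WHERE codes = 'RH12'")
)

# MATCH uses the full-text index and returns every row that
# contains the token RH12 anywhere in the column.
matched = sorted(
    row[0]
    for row in conn.execute("SELECT codes FROM data WHERE codes MATCH 'RH12'")
)

print(exact)
print(matched)
```

Here the equality query returns only the row holding exactly RH12, while MATCH also returns the row 'RH12 RH13'. Neither returns 'RH1', because MATCH compares whole tokens.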
You should read the documentation more carefully.
Any query against a virtual FTS table using WHERE col = 'value' will be slow (except for a query against ROWID), but a query using WHERE col MATCH 'value' will use FTS and be fast.
I'm not an expert on this, but here are a few things to think about.
Your test is flawed (I think). You are contrasting a scenario where you have an exact text match (the index can be used on original_data - nothing is going to outperform this scenario) with an equality on the fts3 table (I'm not sure that FTS3 would even come into play in this type of query). If you want to compare apples to apples (to see the benefit of FTS3), you're going to want to compare a "like" operation on original_data against the FTS3 "match" operation on data.

How can I improve this endless query?

I've got a table with close to 5 million rows. Each of them has one text column where I store my XML logs.
I am trying to find out whether any log contains
<node>value</node>
I've tried with
SELECT top 1 id_log FROM Table_Log WHERE log_text LIKE '%<node>value</node>%'
but it never finishes.
Is there any way to improve this search?
PS: I can't drop any log
A wildcarded query such as '%<node>value</node>%' will result in a full table scan (ignoring indexes) as it can't determine where within the field it'll find the match. The only real way I know of to improve this query as it stands (without things like partitioning the table etc which should be considered if the table is logging constantly) would be to add a Full-Text catalog & index to the table in order to provide a more efficient search over that field.
Here is a good reference that should walk you through it. Once this has been completed you can use things like the CONTAINS and FREETEXT operators that are optimised for this type of retrieval.
Apart from implementing full-text search on that column and indexing the table, maybe you can narrow the results by other parameters (date, etc.).
Also, you could add a table field (varchar type) called "Tags" that you populate when inserting a row. This field would record keywords or tags for the log. You could then change your query to use this field as the condition.
Unfortunately, about the only way I can see to optimize that is to implement full-text search on that column, but even that will be hard to construct to where it only returns a particular value within a particular element.
I'm currently doing some work where I'm also storing XML within one of the columns. But I'm assuming any queries needed on that data will take a long time, which is okay for our needs.
Another option has to do with storing the data in a binary column, and then SQL Server has options for specifying what type of document is stored in that field. This allows you to, for example, implement more meaningful full-text searching on that field. But it's hard for me to imagine this will efficiently do what you are asking for.
You are using a like query.
No index involved = no good
There is nothing you can do with what you have currently to speed this up unfortunately.
I don't think it will help but try using the FAST x query hint like so:
SELECT id_log
FROM Table_Log
WHERE log_text LIKE '%<node>value</node>%'
OPTION(FAST 1)
This should optimise the query to return the first row.

Search in huge table

I have a table with over 1 million rows.
This table holds user information, e.g. userName, email, gender, marital status, etc.
I'm going to write a search over all rows in this table, with some conditions applied.
In the simplest case, when the search is performed only on userName, it takes 4-7 seconds to find the result.
select * from u where u.name ilike ' ... '
Yes, I have indexes on some fields. I checked that they are used with the explain analyse command.
How can the search be made faster?
I heard something about Lucene; can it help?
I'm wondering how Facebook search works: they have billions of users and their search is much faster.
There is great difference between these three queries:
a) SELECT * FROM u WHERE u.name LIKE "George%"
b) SELECT * FROM u WHERE u.name LIKE "%George"
c) SELECT * FROM u WHERE u.name LIKE "%George%"
a) The first will use the index on u.name (if there is one) and will be very fast.
b) The second will not be able to use any index on u.name but there are ways to circumvent that rather easily.
For example, you could add another field nameReversed in the table where REVERSE(name) is stored. With an index on that field, the query will be rewritten as (and will be as fast as the first one):
b2) SELECT * FROM u WHERE u.nameReversed LIKE REVERSE("%George")
c) The third query poses the greatest difficulty as neither of the two previous indexes will be of any help and the query will scan the whole table. Alternatives are:
Using a solution dedicated to such problems (search for "full text search"), like Sphinx. See this question on SO for more details: which-is-best-search-technique-to-search-records
If your field has names only (or another limited set of words, say a few hundred different words), you could create another auxiliary table with those names (words) and store only a foreign key in table u.
If, of course, that is not the case and you have tens of thousands or millions of different words, or the field contains whole phrases, then solving the problem with many auxiliary tables is like building a full-text search tool yourself. It's a nice exercise and you won't need Sphinx (or another tool) besides the RDBMS, but it's not trivial.
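The reversed-column idea from (b) can be sketched as follows. This is only an illustration: it uses SQLite through Python's sqlite3 module rather than the database from the question, the column name_reversed and the index name are hypothetical, and the string is reversed in Python because SQLite has no built-in REVERSE function. The column is declared COLLATE NOCASE on the assumption that SQLite's default case-insensitive LIKE should be able to use the index. Wildcard characters inside the user's input are not escaped here.

```python
import sqlite3


def find_by_suffix(conn, suffix):
    """Find names ending in `suffix` via a prefix search on the reversed column."""
    # '%George' on name becomes 'egroeG%' on name_reversed.
    pattern = suffix[::-1] + "%"
    rows = conn.execute(
        "SELECT name FROM u WHERE name_reversed LIKE ? ORDER BY name",
        (pattern,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE u ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT, "
    "name_reversed TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX idx_u_name_reversed ON u (name_reversed)")

for name in ("George", "Boy George", "George Michael", "Georgia"):
    # Keep the reversed copy in sync whenever a row is written.
    conn.execute(
        "INSERT INTO u (name, name_reversed) VALUES (?, ?)",
        (name, name[::-1]),
    )

suffix_hits = find_by_suffix(conn, "George")
print(suffix_hits)
```

The cost of this design is the extra column and index, plus the need to keep name_reversed up to date on every insert and update.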
Take a look at Hibernate Search.
It uses Lucene but is a lot easier to implement.
Google and Facebook use different approaches. They have distributed systems. Google's BigTable is a good keyword, and the "Map and Reduce" concept (Apache Hadoop) is a good starting point for more research.
Try using table partitioning.
In large-table scenarios it can be helpful to partition a table.
For PostgreSQL, see PostgreSQL Partitioning.
For highly scalable, fast searches, it may sometimes be useful to adopt a NoSQL database (like Facebook does).
I heard something about Lucene; can it help?
Yes, it can. I'm sure you will love it!
I had the same problem: a table with roughly 1.2 million messages. Searching through these messages takes several seconds. A full-text search on the "message" column takes about 10 seconds.
On the same server hardware, Lucene returns the result in about 200-400 ms.
That's very fast.
Cached results return in roughly 5-10 ms.
Lucene is able to connect to your SQL database (for example MySQL): it scans your database and builds a searchable index.
How you search this index depends on the kind of application.
In my case, my PHP web application uses Solr for searching inside Lucene.
http://lucene.apache.org/solr/