Index on part of a column in SQLite - sql

Is doing something like the following possible in SQLite:
create INDEX idx on mytable (synopsis(20));
In other words, indexing by something less than the full text field? This is useful on long-text fields where I don't want to index everything into memory (the index itself could take up more space than the entire table).

You seem to be looking for an index on expression:
Use a CREATE INDEX statement to create a new index on one or more expressions just like you would to create an index on columns. The only difference is that expressions are listed as the elements to be indexed rather than column names.
Consider:
CREATE INDEX idx ON mytable(SUBSTR(synopsis, 1, 20));
Please note that, as explained in the documentation, for this index to be considered by the sqlite query planner, you need to use the exact same expression that was given when creating the index.
So this query would use the index:
SELECT * FROM mytable WHERE SUBSTR(synopsis, 1, 20) = 'a text with 20 chars';
While, typically, this would not:
SELECT * FROM mytable WHERE synopsis LIKE 'a text with 20 chars%';
Note: yes, 'a text with 20 chars' is 20 chars long...

Related

Oracle REGEXP_LIKE function to use index table scan?

I am using regexp_like function to search specific patterns on a column. But, I see this query is not taking the index created on this column instead going for full table scan. Is there any option to create function based index for regexp_like so that my query will use that index? Here, the pattern SV4889 is not constant expression but it will vary every time.
select * from test where regexp_like(id,'SV4889')
Yup. Regular expressions do not use indexes. What can you do?
Well, if you are just looking for equality, then use equality:
where id = 'SV4889'
This will use an index on (id).
If you are looking for a leading value, then use like:
where id like 'SV4889%'
This will use an index because the wildcard is at the end of the pattern.
If you are storing multiple values in the column, say 'SV4889,SV4890' then fix your data model. It is broken! You should have another table with one row per id.
Finally, if you really need more sophisticated full text capabilities, then look into Oracle's support for full text indexes. However, such capabilities are usually not needed on a column called id.
You can add a virtual column to your table to determine if the substring you're interested in exists in the field, then index the virtual column. For example:
ALTER TABLE TEST
ADD SV4889_FLAG CHAR(1)
GENERATED ALWAYS AS (CASE
WHEN REGEXP_LIKE(ID,'SV4889') THEN 'Y'
ELSE 'N'
END) VIRTUAL;
This adds a field named SV4889_FLAG to your table which will contain Y if the text SV4889 exists in the ID field, and N if it doesn't. Then you can create an index on the new field:
CREATE INDEX IDX_TEST_SV4889_FLAG
ON TEST (SV4889_FLAG);
So to determine if a row has 'SV4889' in it you can use a query such as:
SELECT *
FROM TEST
WHERE SV4889_FLAG = 'Y'
db<>fiddle here

Need help in optimizing an Oracle query

I came up with a query, to fetch data from a table, which contains 93781665 entries, to display the results as suggestions in an autocomplete text box.
But it takes more than 300 Sec to fetch results.
The query is given below.
select * from table
where upper(column1||' '||column2||' '||column3) like upper('searchstring%')
and rownum <= 10;
Kindly help me to optimize it.
The WHERE clause in your query is not sargable, meaning that no index can be used there. This rules out most of the methods you might use here to optimize the query. Here is one suggestion:
SELECT *
FROM yourTable
WHERE column4 LIKE 'SEARCHSTRING%';
Here, column4 is a new column in your table, which contains the concatenation of the first three columns. Furthermore, all text in column4 will always be uppercase, and the search string you pass into the query will also always be uppercase. Given these assumptions, the following index might help the query:
CREATE INDEX idx ON yourTable (column4);
In Oracle, you can index an expression:
create index idx_t_columns on t(upper(column1||' '||column2||' '||column3))
Then, this condition can use the index:
where upper(column1||' '||column2||' '||column3) like 'searchstring%'
If the search string is constant, then this should also work:
where upper(column1||' '||column2||' '||column3) like upper('searchstring%')
Note that a wildcard at the beginning of the like pattern would preclude the use of an index.

Query JSONB column for any value where =?

I have a jsonb column which has the unfortunate case of being very unpredictable, in some cases its value may be an array with nested values:
["UserMailer", "applicant_setup_3", ["5cbffeb7-8d5e-4b52-a475-3cf320b2cee9"]]
Sometimes it will be something with key/values like this:
[{"reference_id": "5cbffeb7-8d5e-4b52-a475-3cf320b2cee9", "job_dictionary": ["StatusUpdater", "FollowTwitterUsersJob"]}]
Is there a way to write a query which just treats the whole column like text and does a like to see if I can find the uuid in the big text blob? I want to find all the records where a particular uuid string is present in the jsonb column.
The query doesn't need to be fast or efficient.
Postgres has search operator ? for jsonb, but that would require you to search the json content recursively.
A possible, although not very efficient method, would to stringify the object and use LIKE to search it:
myjsonb::text LIKE '%"5cbffeb7-8d5e-4b52-a475-3cf320b2cee9"%'
myjsonb::text LIKE '%"' || myuuid || '"%'
Demo on DB Fiddle:
The problem with the jsonb operator ? is that it only considers top-level keys (including array elements), not values, and no nested objects.
You seem to be looking for values and array elements (not keys) on any level. You can get that with a full text search on top of your json(b) column:
SELECT * FROM tbl
WHERE to_tsvector('simple', jsonb_column)
## tsquery '5cbffeb7-8d5e-4b52-a475-3cf320b2cee9';
db<>fiddle here
to_tsvector() extracts values and array elements on all levels - just what you need.
Requires Postgres 10 or later. json(b)_to_tsvector() in Postgres 11 offers more flexibility.
That's attractive for tables of non-trivial size as it can be supported with a full text index very efficiently:
CREATE INDEX tbl_jsonb_column_fts_gin_idx ON tbl USING GIN (to_tsvector('simple', jsonb_column));
I use the 'simple' text search configuration in the example. You might want a language-specific one, like 'english'. Doesn't matter much while you only look for UUID strings, but stemming for a particular language might make the index a bit smaller ...
Related:
LIKE query on elements of flat jsonb array
Does the phrase search operator <-> work with JSONB documents or only relational tables?
While you are only looking for UUIDs, you might optimize further with a custom (IMMUTABLE) function to extract UUIDs from the JSON document as array (uuid[]) and build a functional GIN index on top of it. (Considerably smaller index, yet.) Then:
SELECT * FROM tbl
WHERE my_uuid_extractor(jsonb_column) #> '{5cbffeb7-8d5e-4b52-a475-3cf320b2cee9}';
Such a function can be expensive, but does not matter much with a functional index that stores and operates on pre-computed values.
You can split the array elements first by using jsonb_array_elements(json), and then filter the casted string from those elements by like operator
select q.elm
from
(
select jsonb_array_elements(js) as elm
from tab
) q
where elm::varchar like '%User%'
elm
----------------------------------------------------------------------------------------------------------------------
"UserMailer"
{"reference_id": "5cbffeb7-8d5e-4b52-a475-3cf320b2cee9", "job_dictionary": ["StatusUpdater", "FollowTwitterUsersJob"]}
Demo

Indexing a varchar2 column in oracle with a pattern?

I have a problem on indexing oracle
Following SQL query is given (means, it is automatically generated, I can't adapt it):
select * from t where f like 'AA.%.123456.%'
On column f, a unique index exists, but it is not beeing used by the oracle optimizer (it's doing a full table scan)
I think this because of the data
Data in column f allways begin with AA
Please note following data example:
AA.248117.123456.UAAA
AA.741447.123456.UAAA
AA.741449.123456.UAAA
AA.741450.123456.UAAA
AA.322812.123456.UAAA
AA.741363.123456.UZZZ
AA.741365.123456.UZZZ
AA.356110.123456.UZZZ
AA.356333.456789.UBBB
AA.256321.456789.UBBB
AA.333219.456789.UBBX
My question:
Is it possible to create an oracle index with a string-pattern?
Meaning a index as some kind of substring (but not using a function based index Substr(), as mentioned above, I can't change the query)
Index should look like this
'AA.<someNumberNotRelevant>.<indexedValue>.<someStringNotRelevant>'
Thanks

How to filter a value of any key of json in postgres

I have a table users with a jsonb field called data. I have to retrieve all the users that have a value in that data column matching a given string. For example:
user1 = data: {"property_a": "a1", "property_b": "b1"}
user2 = data: {"property_a": "a2", "property_b": "b2"}
I want to retrieve any user that has a value data matching 'b2', in this case that will be 'user2'.
Any idea how to do this in an elegant way? I can retrieve all keys from data of all users and create a query manually but that will be neither fast nor elegant.
In addition, I have to retrieve the key and value matched, but first things first.
There is no easy way. Per documentation:
GIN indexes can be used to efficiently search for keys or key/value
pairs occurring within a large number of jsonb documents (datums)
Bold emphasis mine. There is no index over all values. (Those can have non-compatible data types!) If you do not know the name(s) of all key(s) you have to inspect all JSON values in every row.
If there are just two keys like you demonstrate (or just a few well-kown keys), it's still easy enough:
SELECT *
FROM users
WHERE data->>'property_a' = 'b2' OR
data->>'property_b' = 'b2';
Can be supported with a simple expression index:
CREATE INDEX foo_idx ON users ((data->>'property_a'), (data->>'property_b'))
Or with a GIN index:
SELECT *
FROM users
WHERE data #> '{"property_a": "b2"}' OR
data #> '{"property_b": "b2"}'
CREATE INDEX bar_idx ON users USING gin (data jsonb_path_ops);
If you don't know all key names, things get more complicated ...
You could use jsonb_each() or jsonb_each_text() to unnest all values into a set and then check with an ANY construct:
SELECT *
FROM users
WHERE jsonb '"b2"' = ANY (SELECT (jsonb_each(data)).value);
Or
...
WHERE 'b2' = ANY (SELECT (jsonb_each_text(data)).value);
db<>fiddle here
But there is no index support for the last one. You could instead extract all values into and array and create an expression index on that, and match that expression in queries with array operators ...
Related:
How do I query using fields inside the new PostgreSQL JSON datatype?
Index for finding an element in a JSON array
Can PostgreSQL index array columns?
Try this query.
SELECT * FROM users
WHERE data::text LIKE '%b2%'
Of course it won't work if your key will contain such string too.