How to create GIN index with LOWER in PostgreSQL? - sql

First of all, I use a JPA ORM (EclipseLink) which doesn't support ILIKE, so I am looking for a way to do a case-insensitive search. I did the following:
CREATE TABLE IF NOT EXISTS users (
id SERIAL NOT NULL,
name VARCHAR(512) NOT NULL,
PRIMARY KEY (id));
CREATE INDEX users_name_idx ON users USING gin (LOWER(name) gin_trgm_ops);
INSERT INTO users (name) VALUES ('User Full Name');
However, this query returns user:
SELECT * FROM users WHERE name ILIKE '%full%';
But this one doesn't:
SELECT * FROM users WHERE name LIKE '%full%';
So, how to create GIN index with LOWER in PostgreSQL?

I'm not sure I understand the question, because you mention GIN, insert one row, and expect it to be returned by a case-insensitive comparison. But a wild guess: maybe you are looking for citext?
t=# create extension citext;
CREATE EXTENSION
t=# CREATE TABLE IF NOT EXISTS users (
id SERIAL NOT NULL,
name citext NOT NULL,
PRIMARY KEY (id));
CREATE TABLE
t=# INSERT INTO users (name) VALUES ('User Full Name');
INSERT 0 1
t=# SELECT * FROM users WHERE name LIKE '%full%';
id | name
----+----------------
1 | User Full Name
(1 row)
Update: an expression-based index requires the query to use the same expression, i.e. you have to filter on LOWER(name), not on name.
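For example, the query below repeats the indexed expression, so the planner can consider the trigram GIN index (a sketch, assuming the pg_trgm extension is installed):

```sql
-- Matches the index definition LOWER(name) gin_trgm_ops
SELECT * FROM users WHERE LOWER(name) LIKE '%full%';

-- EXPLAIN should show a bitmap scan on users_name_idx once the table is
-- large enough for an index scan to pay off
EXPLAIN SELECT * FROM users WHERE LOWER(name) LIKE '%full%';
```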

Related

Populate virtual SQLite FTS5 (full text search) table from content table

I've followed https://kimsereylam.com/sqlite/2020/03/06/full-text-search-with-sqlite.html to set up SQLite's virtual table extension FTS5 for full text search on an external content table.
While the blog shows how to set up triggers to keep the virtual FTS table updated with the data:
CREATE TABLE user (
id INTEGER PRIMARY KEY,
username TEXT NOT NULL UNIQUE,
email TEXT NOT NULL UNIQUE,
short_description TEXT
)
CREATE VIRTUAL TABLE user_fts USING fts5(
username,
short_description,
email UNINDEXED,
content='user',
content_rowid='id'
)
CREATE TRIGGER user_ai AFTER INSERT ON user
BEGIN
INSERT INTO user_fts (rowid, username, short_description)
VALUES (new.id, new.username, new.short_description);
END;
...
I am failing to populate the FTS table from all previous data in an analogous fashion.
I'll stick to the example from the blog:
INSERT INTO user_fts (rowid, username, short_description) SELECT (id, username, short_description) FROM user;
However, SQLite (3.37.2) fails with "row value misused".
Please explain how id, content_rowid, rowid and new.id are related and how to modify the query to update the FTS table properly.
INSERT INTO user_fts (rowid, username, short_description) SELECT id, username, short_description FROM user;
(no parentheses) works.
rowid is a unique 64-bit signed integer row id.
If the table has an INTEGER PRIMARY KEY column (like id in user), that column is an alias for the rowid, i.e. user.rowid == user.id == user_fts.rowid.
Doc: https://www.sqlite.org/lang_createtable.html#rowid
The new refers to the element being inserted.
Doc: https://www.sqlite.org/lang_createtrigger.html
content_rowid links the virtual FTS table to the external data table row id column (it defaults to rowid).
Doc: https://www.sqlite.org/fts5.html#external_content_tables
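The failing and working forms can be reproduced without FTS5 at all; a minimal Python sketch (the user_copy table is made up for illustration, standing in for user_fts):

```python
import sqlite3

# Demonstrate the "row value misused" error against a plain target table
# (user_copy is illustrative and FTS5 is omitted, so this runs anywhere).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE,
        short_description TEXT
    );
    CREATE TABLE user_copy (
        id INTEGER PRIMARY KEY,
        username TEXT,
        short_description TEXT
    );
    INSERT INTO user VALUES (1, 'alice', 'hello');
""")

# Parenthesizing the SELECT list builds a single row value, which SQLite only
# accepts in comparisons -- this is why the original INSERT ... SELECT failed.
try:
    conn.execute("SELECT (id, username, short_description) FROM user")
except sqlite3.OperationalError as e:
    print(e)  # row value misused

# Without the parentheses the SELECT yields three scalar columns and works.
conn.execute(
    "INSERT INTO user_copy (id, username, short_description) "
    "SELECT id, username, short_description FROM user"
)
print(conn.execute("SELECT COUNT(*) FROM user_copy").fetchone()[0])  # 1
```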

Use few analyzers in GIN index in Postgres

I want to create a GIN index for Postgres full text search, and I would like to ask: if I store the analyzer name for each row in a separate column called lang, is it possible to use that field to build the GIN index with a different analyzer for each row?
This is what I use now. The analyzer is 'english' and it is the same for every row in the indexed table.
CREATE INDEX CONCURRENTLY IF NOT EXISTS
decription_fts_gin_idx ON api_product
USING GIN(to_tsvector('english', description))
I want to do something like this:
CREATE INDEX CONCURRENTLY IF NOT EXISTS
decription_fts_gin_idx ON api_product
USING GIN(to_tsvector(api_product.lang, description))
(it doesn't work)
in order to retrieve the analyzer configuration from the lang field and use it to populate the index.
Is it possible to do this somehow, or can only one analyzer be used for the whole index?
DDL, just in case..
-- auto-generated definition
create table api_product
(
id serial not null
constraint api_product_pkey
primary key,
name varchar(100) not null,
standard varchar(40) not null,
weight integer not null
constraint api_product_weight_check
check (weight >= 0),
dimensions varchar(30) not null,
description text not null,
textsearchable_index_col tsvector,
department varchar(30) not null,
lang varchar(25) not null
);
alter table api_product
owner to postgres;
create index textsearch_idx
on api_product (textsearchable_index_col);
Query to run for search:
SELECT *,
ts_rank_cd(to_tsvector('english', description),
to_tsquery('english', %(keyword)s), 32) as rnk
FROM api_product
WHERE to_tsvector('english', description) @@ to_tsquery('english', %(keyword)s)
ORDER BY rnk DESC, id
where 'english' would be changed to 'lang' field analyzer name (english, french, etc)
If you know ahead of time the language you are querying against, you could create a series of partial indexes:
CREATE INDEX CONCURRENTLY ON api_product
USING GIN(to_tsvector('english', description)) where lang='english';
Then in your query you would add the language you are searching in:
SELECT *,
ts_rank_cd(to_tsvector('english', description),
to_tsquery('english', %(keyword)s), 32) as rnk
FROM api_product
WHERE to_tsvector('english', description) @@ to_tsquery('english', %(keyword)s)
and lang='english'
ORDER BY rnk DESC, id
What you asked about is definitely possible, but you have the wrong type for the lang column:
create table api_product(description text, lang regconfig);
create index on api_product using gin (to_tsvector(lang, description));
insert into api_product VALUES ('the description', 'english');
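With lang of type regconfig, the search query can then pass the column itself as the configuration, mirroring the index expression. A sketch based on the query above (%(keyword)s is the original parameter placeholder):

```sql
SELECT *,
       ts_rank_cd(to_tsvector(lang, description),
                  to_tsquery(lang, %(keyword)s), 32) AS rnk
FROM api_product
WHERE to_tsvector(lang, description) @@ to_tsquery(lang, %(keyword)s)
ORDER BY rnk DESC, id;
```

This works for indexing because to_tsvector(regconfig, text) is immutable, unlike the single-argument form that depends on the session's default configuration.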

Generating sql file with uuids and referring those ids further postgres9.5

I am creating a sql file which has uuids as primary keys. Here is what my create table definition looks like, using the pgcrypto extension:
CREATE EXTENSION pgcrypto;
CREATE TABLE snw.contacts(
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
email TEXT
);
Now I add a record in this table using
INSERT INTO snw.contacts (name,email) VALUES('Dr Nic Williams','drnic');
postgres=# select * from snw.contacts;
id | name | email
--------------------------------------+-----------------+-------
7c627ee0-ac94-40ee-b39d-071299a55c13 | Dr Nic Williams | drnic
Now going ahead in the same file I want to insert a row in one of tables which looks like
CREATE TABLE snw.address(
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
street TEXT,
contact UUID
);
where contact UUID refers to the id in the snw.contacts table. How can I fetch the uuid that was generated by the first insert and use it in another insert into the snw.address table? Something like:
INSERT INTO snw.address(street,contact) values('ABC', (select id from snw.contacts where email='drnic'));
I can use a WHERE clause because I am using this script to generate some test data, so I know which email to use for fetching the id.
Use a data modifying CTE:
with new_contact as (
INSERT INTO snw.contacts (name,email)
VALUES('Dr Nic Williams','drnic')
returning id
)
INSERT INTO snw.address(street,contact)
select 'ABC', id
from new_contact;
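The same pattern extends to inserting several dependent rows in one statement; for instance (the street values here are made up for illustration):

```sql
WITH new_contact AS (
    INSERT INTO snw.contacts (name, email)
    VALUES ('Dr Nic Williams', 'drnic')
    RETURNING id
)
INSERT INTO snw.address (street, contact)
SELECT v.street, c.id
FROM new_contact c,
     (VALUES ('ABC'), ('DEF')) AS v(street);
```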

postgresql retype in index

How can I create index in PostgreSQL like:
CREATE INDEX i_users_user_id
ON users
USING btree (user_id::character varying);
I want the integer column to behave like a string column:
SELECT * FROM vw_users WHERE user_id='string'
'string' is some value, and I don't know whether it is a user_id or a session_id, and I want only one query :)
vw_users is:
SELECT user_id::character varying FROM users
UNION
SELECT session_id as user_id FROM temp_users
Tables are:
CREATE TABLE users (user_id integer)
CREATE TABLE temp_users (session_id character varying)
Regards
An index on an expression requires an extra set of parentheses:
CREATE INDEX i_users_user_id
ON users
USING btree ((user_id::character varying));
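As with any expression index, it is only considered when the query repeats the indexed expression; for example:

```sql
-- The predicate uses the same cast as the index definition
SELECT * FROM users WHERE user_id::character varying = 'string';
```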

Storing a database reference within the database

I want to be able to label the database with a single value, i.e. its name, from within the database instead of my application, since it will always be one ID per database. For example, something like this:
DATABASE_A.sql
-- Database Name Table
CREATE TABLE database (
name VARCHAR(10) NOT NULL UNIQUE
);
CREATE TABLE item (
id SERIAL PRIMARY KEY,
name VARCHAR(10) NOT NULL UNIQUE
);
Insert Into database (name) values ('A');
DATABASE_B.sql
-- Database Name Table
CREATE TABLE database (
name VARCHAR(10) NOT NULL UNIQUE
);
CREATE TABLE item (
id SERIAL PRIMARY KEY,
name VARCHAR(10) NOT NULL UNIQUE
);
Insert Into database (name) values ('B');
This is because when they are combined and stored on a SOLR search server their ID is a combination of their database name and their item ID, such as this:
SOLR ITEM ID's
A1
A2
A3
B1
Is it ok to have a single table to define the prefix so that when I do the look up from my SQL website to SOLR I can just do the following query:
database (name) + item (id) = SolrID
I'd be more inclined to build a procedure in each database that contained the database ID, for example:
CREATE OR REPLACE FUNCTION solrid(IN local_id INTEGER, OUT result TEXT) AS $$
DECLARE
database_id TEXT := 'A';
BEGIN
result := database_id || local_id::TEXT;
END;
$$ LANGUAGE PLPGSQL;
Then you could write your select statement like:
SELECT solrid(id), name FROM item;
which seems to be a cleaner solution.
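If you also filter by SolrID, the function could be declared IMMUTABLE (its result depends only on the argument and the hard-coded database_id), which lets Postgres use it in an expression index; a sketch (the index name is made up):

```sql
CREATE OR REPLACE FUNCTION solrid(IN local_id INTEGER, OUT result TEXT) AS $$
DECLARE
    database_id TEXT := 'A';
BEGIN
    result := database_id || local_id::TEXT;
END;
$$ LANGUAGE PLPGSQL IMMUTABLE;

-- Lookups by SolrID can now use an expression index:
CREATE INDEX item_solrid_idx ON item (solrid(id));
SELECT * FROM item WHERE solrid(id) = 'A1';
```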