PostgreSQL: Pattern matching only whole words

I have a "queries" table that holds hundreds of SQL queries and I am trying to filter out queries that can only be executed on the DB I am using. Because some of these queries refer to tables that exists only in another DB, so only a fraction of them can be executed successfully.
My query so far looks like this:
SELECT rr.name AS query_name,
  (
    SELECT string_agg(it.table_name::character varying, ', ' ORDER BY it.table_name)
    FROM information_schema.tables it
    WHERE rr.config ->> 'queries' SIMILAR TO ('%' || it.table_name || '%')
  ) AS related_tables
FROM queries rr
and it works fine, except that the pattern I provided does not handle edge cases well.
Let's say that I have a table called "customers_archived" in the old DB that does not exist in the new one, and a table called "customers" that exists in both the old and the new DB.
Now, with the query I wrote, the engine thinks, "Well, I have a table called customers, so any query that includes the word customers must be valid." But the engine is wrong, because it also picks up queries that use the "customers_archived" table, which does not exist in that DB.
So I tried to match whole words only, but I could not get it to work, because the \b word-boundary escape from other regex flavors does not work in PostgreSQL, as far as I know. How can I get this query to do what I am trying to achieve?

There is no totally reliable way of finding the tables referenced by a query short of building a full PostgreSQL SQL parser. For starters, the name could occur in a string literal, or the query could be
DO $$BEGIN EXECUTE 'SELECT * FROM my' || 'table'; END;$$;
But I think you would be better off if you make sure that there are non-word characters around the name in the match. Note the parentheses around the concatenation; they are needed because ~ and || have the same operator precedence in PostgreSQL:
WHERE rr.config ->> 'queries' ~ ('\y' || it.table_name || '\y')
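A quick way to check the word-boundary behavior (the sample strings here are just illustrative):
SELECT 'SELECT * FROM customers_archived' ~ ('\y' || 'customers' || '\y');  -- false: _ counts as a word character
SELECT 'SELECT * FROM customers' ~ ('\y' || 'customers' || '\y');  -- true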

Related

How to use a table's content for querying other tables in BigQuery

My team and I use a query on a daily basis to receive specific results from a large dataset. This query is constantly updated with different terms that I would like to match in the dataset.
To make this job more scalable, I built a table of arrays, each containing the terms and conditions for the query. That way the query can lean on the table, and changes that I make in the table will affect the query without the need to change it.
The thing is, I can't seem to find a way to reference the table in the actual query without selecting from it. I want to use the content of the table as a WHERE condition. For example:
table1:
terms
[term1, term2, term3]
query:
select * from dataset
where dataset.column like '%term1'
or dataset.column like '%term2'
or dataset.column like '%term3'
etc.
If you have any ideas please let me know (if the solution involves Python or JS this is also great)
thanks!
You can "build" the syntax you want using Procedural Language in BigQuery and then execute it. Here is a way of doing it without "leaving" BQ (meaning, without using external code):
BEGIN
  DECLARE statement STRING DEFAULT 'SELECT col FROM dataset.table WHERE';
  FOR record IN (SELECT * FROM UNNEST(['term1', 'term2', 'term3']) AS term)
  DO
    SET statement = CONCAT(statement, ' col LIKE "%', record.term, '" OR');
  END FOR;
  -- append an always-false predicate to close the trailing OR
  SET statement = CONCAT(statement, ' 1=2');
  EXECUTE IMMEDIATE statement;
END;
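To have the loop pull the terms from the table itself rather than a hardcoded array, the FOR header can point at the table instead. A sketch, assuming table1 lives in the same dataset and stores the terms in an ARRAY<STRING> column named terms, as in the question:
FOR record IN (SELECT term FROM dataset.table1, UNNEST(terms) AS term)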

How to search for values of several columns at once that start with some string

I am completely new to the world of databases and querying.
I have a db of books for which I want to construct a "smart" search query that fetches relevant entries based on input matched against the isbn, name and author columns. The catch is that the input doesn't necessarily have to be a whole keyword; it can be part of one. I want to be able to enter part of an ISBN number, part of a book's name, or part of an author's name and get back relevant results.
Say I query for "97182", I want all the books whose isbn begins with that sequence.
Or if I query "ma", I want all the entries whose name or author begins with "ma".
For example, I recently learned that one could do the following (I believe the name is "full text search") using PostgreSQL:
SELECT *
FROM books
WHERE to_tsvector(isbn || ' ' || name || ' ' || author) @@ to_tsquery('proust')
The output will be the rows that have "proust" in one of the searched columns.
The same PostgreSQL query will not work for what I described, due to the nature of how full text search works (vectorizing columns, reducing words to lexemes, etc.). I have not fully understood full text search yet, though.
Is it possible to create a "smarter" query according to my description?
Use LIKE:
WHERE LOWER(isbn || ' ' || name || ' ' || author) LIKE '%proust%'
Unlike the proprietary functions you used, this code will run on any RDBMS.
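If you specifically want the "begins with" behavior from the question, a minimal sketch that tests each column's prefix separately (with :q standing in for the user's input) could look like:
SELECT *
FROM books
WHERE isbn LIKE :q || '%'
   OR LOWER(name) LIKE LOWER(:q) || '%'
   OR LOWER(author) LIKE LOWER(:q) || '%';
Because these patterns have no leading wildcard, left-anchored matches like this can also make use of ordinary indexes (e.g. a text_pattern_ops index in PostgreSQL), which '%proust%' cannot.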

Oracle 19c Database generate JSON from rows without passing field names

I am trying to find a way to query relational tables, without prior knowledge of the table, and have each returned row generated as a JSON object, without having to execute two queries to achieve this outcome.
For example, if I have a Table_1 with columns a, b and c, I am seeking a way to write a single query that returns JSON representing each row:
{
  "a": "value of a",
  "b": "value of b",
  "c": "value of c"
}
At this time, I have to execute the following query to find the column names:
SELECT LISTAGG(column_name, ', ') FROM ALL_TAB_COLUMNS WHERE table_name = 'Table_1';
I then formulate a second query (on the client), based on the column names returned by the previous query:
SELECT JSON_OBJECT(colnames) from Table_1;
I am trying to avoid two trips to the database by combining the two queries, but to no avail. Any idea how to make this work? I have tried so many permutations and combinations.
Tried this: won't work.
begin
SELECT INTO colnames LISTAGG(column_name,', ') from ALL_TAB_COLUMNS where TABLE_NAME='Table_1' and owner ='owner1'
select JSON_OBJECT(colnames) from Table_1 where a LIKE '%';
END;
Another failed attempt:
select JSON_OBJECT(SELECT LISTAGG(column_name,', ') from ALL_TAB_COLUMNS where TABLE_NAME='Table_1' and owner ='owner1') from Table_1 where a LIKE '%'
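One way to get this down to a single round trip is to do the dictionary lookup and open a cursor over the dynamically built query inside one PL/SQL block. A sketch, reusing the names from the question; the caller has to be able to consume a SYS_REFCURSOR (e.g. via an OUT parameter), and keep in mind that the data dictionary stores unquoted identifiers in uppercase:
DECLARE
  colnames VARCHAR2(4000);
  results  SYS_REFCURSOR;
BEGIN
  -- both steps in one block: build the column list,
  -- then open a cursor over the assembled JSON_OBJECT query
  SELECT LISTAGG(column_name, ', ')
    INTO colnames
    FROM all_tab_columns
   WHERE table_name = 'Table_1'
     AND owner = 'owner1';
  OPEN results FOR 'SELECT JSON_OBJECT(' || colnames || ') FROM Table_1';
  -- hand "results" back to the client here
END;
Separately, Oracle 19c also documents a wildcard form, SELECT JSON_OBJECT(*) FROM Table_1, which may make the dictionary lookup unnecessary; worth verifying on your patch level.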

Row-Level Security Predicate Filter

On Oracle 19c.
We have users whose accounts are provisioned by specifying a comma-separated list of department_code values. Each department_code value is a string of five alphanumeric [A-Z0-9] characters. This comma-separated list of five-character department_codes is what we call the user's security_string. We use the security_string to limit which rows the user may retrieve from a table, Restricted, by applying the following predicate.
select *
from Restricted R
where :security_string like '%' || R.department_code || '%';
A given department_code can appear in Restricted many times, and a given user can have many department_codes in their comma-separated :security_string.
This predicate approach to applying row-level security is inefficient. No index can be used, and it requires a full table scan on Restricted.
An alternative is to use dynamic SQL to do something like the following.
execute immediate 'select *
from Restricted R
where R.department_code in(' || udf_quoted_values(:security_string) || ')';
Where udf_quoted_values is a user-defined function (UDF) that wraps each department_code value within :security_string in single quotes.
However, this alternative approach also seems unsatisfactory, as it requires a UDF and dynamic SQL, and a full table scan is still likely.
I've considered bit-masking, but the number of bits needed is large, about 60 million (36^5), and it would still require a UDF, dynamic SQL, and a full table scan (a function-based index doesn't seem to be a candidate here). Also, bit-masking doesn't make much sense here, as there is no nesting/hierarchy of department_codes.
execute immediate 'select *
from Restricted R
where BITAND(R.department_code_encoded,' || udf_encoded_number(:security_string) || ') > 0';
Where Restricted.department_code_encoded is a numeric encoded value of Restricted.department_code and udf_encoded_number is a user-defined function (UDF) that returns a number encoding the department_codes in the :security_string.
I've considered creating a separate table of just department codes, Department, and joining that to the Restricted table.
select *
from Restricted R
join Department D
on R.department_code = D.department_code
where :security_string like '%' || D.department_code || '%';
We still have the same problems as before, but now they fall on the smaller Department table (Department.department_code is unique, whereas Restricted.department_code is not). This gives a smaller full table scan on Department than on Restricted, but now we have a join.
It is possible for us to change the security_string or add additional user-specific security values when an account is provisioned. We can also change the Oracle objects and queries. Note that the department_codes are not static, but they don't change all that regularly either.
Any recommendations? Thank you in advance.
Why not convert the string to a table, as suggested here, and then do a join?
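A sketch of that idea in Oracle, using the common CONNECT BY split idiom to turn :security_string into rows (since the codes are fixed five-character tokens, no extra trimming is handled):
SELECT R.*
FROM Restricted R
WHERE R.department_code IN (
  SELECT REGEXP_SUBSTR(:security_string, '[^,]+', 1, LEVEL)
  FROM dual
  CONNECT BY REGEXP_SUBSTR(:security_string, '[^,]+', 1, LEVEL) IS NOT NULL
);
Once the codes are individual values, an index on Restricted.department_code becomes usable, avoiding both the UDF and the LIKE-driven full table scan.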

Copy many tables in MySQL

I want to copy many tables with similar names but different prefixes. I want the tables with the wp_ prefix to go into their corresponding tables with the shop_ prefix.
In other words, I want to do something like this:
insert into shop_wpsc_*
select * from wp_wpsc_*
How would you do this?
SQL doesn't allow wildcarding table names - the only way to do this is to loop through a list of tables (via the ANSI INFORMATION_SCHEMA views) while using dynamic SQL.
Dynamic SQL is different for every database vendor...
Update
MySQL? Why didn't you say so in the first place...
MySQL's dynamic SQL is called "Prepared Statements" - this is my favorite link for it, besides the documentation. There are numerous questions on SO about operations on all the tables in a MySQL database - you just need to tweak the WHERE clause to get the table names you want.
You'll want to do this from within a MySQL stored procedure...
You can do this by combining multiple statements into a single prepared statement -- try doing this:
SELECT @sql_text := GROUP_CONCAT(
         CONCAT('insert into shop_wpsc_',
                SUBSTRING(table_name, 9),
                ' select * from ', table_name, ';'), ' ')
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'example'
  AND table_name LIKE 'wp_wpsc_%';
PREPARE stmt FROM @sql_text;
EXECUTE stmt;
Expanding on OMG Ponies' answer a bit, you can use the data dictionary and write SQL that writes the SQL. For example, in Oracle, you could do something like this:
SELECT 'insert into shop_wpsc_' || SUBSTR(table_name, 9) || ' select * from ' || table_name || ';'
FROM all_tables
WHERE table_name LIKE 'WP_WPSC%'
This will generate a series of SQL statements you can run as a single script. As OMG Ponies pointed out, though, the syntax will vary depending on which DB vendor you are using (e.g. all_tables is Oracle-specific).
First I would select all tables that start with wp_wpsc_ from the catalog views (the names of those may depend on your DBMS, though if it is ANSI compatible it should support INFORMATION_SCHEMA).
(For instance, for DB2:
SELECT NAME FROM SYSIBM.SYSTABLES WHERE NAME LIKE 'wp_wpsc_%'
)
Then iterate through that result set, and create a dynamic statement in the form you have given to read from the current table and insert into the corresponding new one.
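A sketch of that loop as a MySQL stored procedure, which also sidesteps packing several statements into one PREPARE; the schema name 'example' is carried over from the earlier answer as an assumption:
DELIMITER //
CREATE PROCEDURE copy_wpsc_tables()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE tname VARCHAR(64);
  -- every source table with the wp_wpsc_ prefix
  DECLARE cur CURSOR FOR
    SELECT table_name FROM INFORMATION_SCHEMA.TABLES
    WHERE table_schema = 'example' AND table_name LIKE 'wp\_wpsc\_%';
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
  OPEN cur;
  copy_loop: LOOP
    FETCH cur INTO tname;
    IF done THEN
      LEAVE copy_loop;
    END IF;
    -- one INSERT ... SELECT per table, run as its own prepared statement
    SET @sql_text = CONCAT('INSERT INTO shop_wpsc_', SUBSTRING(tname, 9),
                           ' SELECT * FROM ', tname);
    PREPARE stmt FROM @sql_text;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
  END LOOP;
  CLOSE cur;
END//
DELIMITER ;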