Check if CSV string column contains desired values - sql

I am new to PostgreSQL and I want to split a string of the following format:
0:1:19
using : as the delimiter. After the split, I need to check whether the resulting values contain either 0 or 1 as a whole number and select only those rows.
For example:
Table A
Customer   role
-----------------
A          0:1:2
B          19
C          2:1
I want to select rows which satisfy the criteria of having whole numbers 0 or 1 in role.
Desired Output:
Customer   role
-----------------
A          0:1:2
C          2:1

Convert to an array, and use the overlap operator &&:
SELECT *
FROM tbl
WHERE string_to_array(role, ':') && '{0,1}'::text[];
To make this fast, you could support it with a GIN index on the same expression:
CREATE INDEX ON tbl USING GIN (string_to_array(role, ':'));
See:
Can PostgreSQL index array columns?
Check if value exists in Postgres array
Alternatively, consider a proper one-to-many relational design, or at least an actual array column instead of the string. That would make both the index and the query cheaper.
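A minimal sketch of the array-column variant, assuming the table is named tbl and every role entry is numeric (the column name role_arr is hypothetical):
-- one-time migration to a proper integer array column
ALTER TABLE tbl ADD COLUMN role_arr int[];
UPDATE tbl SET role_arr = string_to_array(role, ':')::int[];
CREATE INDEX ON tbl USING GIN (role_arr);
-- the query then becomes a plain array overlap served by the GIN index
SELECT * FROM tbl WHERE role_arr && '{0,1}'::int[];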

We can use LIKE here:
SELECT Customer, role
FROM TableA
WHERE ':' || role || ':' LIKE '%:0:%' OR ':' || role || ':' LIKE '%:1:%';
But you should generally avoid storing CSV in your SQL tables if your design would allow for that.

Row-Level Security Predicate Filter

On Oracle 19c.
We have users whose accounts are provisioned by specifying a comma-separated list of department_code values. Each department_code value is a string of five alphanumeric [A-Z0-9] characters. This comma-separated list of five-character department_codes is what we call the user's security_string. We use this security_string to limit which rows the user may retrieve from a table, Restricted, by applying the following predicate.
select *
from Restricted R
where :security_string like '%' || R.department_code || '%';
A given department_code can be in Restricted many times and a given user can have many department_codes in their comma-separated value :security_string.
This predicate approach to applying row-level security is inefficient. No index can be used and it requires a full table scan on Restricted.
An alternative is to use dynamic SQL to do something like the following.
execute immediate 'select *
from Restricted R
where R.department_code in(' || udf_quoted_values(:security_string) || ')';
Where udf_quoted_values is a user-defined function (UDF) that wraps in single quotes each department_code value within the :security_string.
However, this alternative approach also seems unsatisfactory, as it requires a UDF and dynamic SQL, and a full table scan is still likely.
I've considered bit-masking, but the number of bits needed is large, 60 million (= 36^5), and it would still require a UDF, dynamic SQL, and a full table scan (a function-based index doesn't seem to be a candidate here). Also, bit-masking doesn't make much sense here, as there is no nesting/hierarchy of department_codes.
execute immediate 'select *
from Restricted R
where BITAND(R.department_code_encoded,' || udf_encoded_number(:security_string) || ') > 0';
Where Restricted.department_code_encoded is a numeric encoded value of Restricted.department_code and udf_encoded_number is a user-defined function (UDF) that returns a number encoding the department_codes in the :security_string.
I've considered creating a separate table of just department codes, Department, and joining that to the Restricted table.
select *
from Restricted R
join Department D
on R.department_code = D.department_code
where :security_string like '%' || D.department_code || '%';
We still have the same problems as before, but now they are on the smaller (by table cardinality) Department table (Department.department_code is unique, whereas Restricted.department_code is not). This provides a smaller full table scan on Department than on Restricted, but now we have a join.
It is possible for us to change security_string or add additional user specific security values when the account is provisioned. We can also change the Oracle objects and queries. Note, the department_codes are not static, but don't change all that regularly either.
Any recommendations? Thank you in advance.
Why not convert the string to a table, as suggested here, and then do a join?
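A minimal sketch of that idea on Oracle, splitting :security_string into rows with REGEXP_SUBSTR and CONNECT BY (one of several ways to split a delimited string; table and column names as in the question):
-- split the comma-separated :security_string into one row per department_code,
-- then join; an index on Restricted.department_code can now be used
SELECT R.*
FROM Restricted R
JOIN (
  SELECT TRIM(REGEXP_SUBSTR(:security_string, '[^,]+', 1, LEVEL)) AS department_code
  FROM dual
  CONNECT BY LEVEL <= REGEXP_COUNT(:security_string, ',') + 1
) S ON R.department_code = S.department_code;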

Concatenated index in postgresql

So basically I'm matching addresses by matching strings within 2 tables.
Table B has 5m rows, so I really don't want to create new columns for it every time I want to match the addresses.
So I thought about creating indexes instead, my current index to match addresses would look like:
CREATE INDEX matchingcol_idx ON tableB USING btree (sub_building_name || ', ' || building_name )
However, this does not work; it doesn't accept the concatenation operator (||).
My update query would then match on = b.sub_building_name || ', ' || b.building_name.
Without a new column and an index, this would take multiple hours.
Is there a way to achieve this without creating new concatenation columns?
For an expression based index, you need to put the expression between parentheses:
CREATE INDEX matchingcol_idx
ON tableB USING btree ( (sub_building_name || ', ' || building_name) );
But that index will only be used if you use exactly the same condition in your where clause. Any condition only referencing one of the columns will not use that index.
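For example, a query of this shape can use the expression index (the literal address is just a placeholder), while a condition on sub_building_name alone cannot:
SELECT *
FROM tableB b
WHERE (b.sub_building_name || ', ' || b.building_name) = 'FLAT 1, ROSE COURT';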

Select if comma separated string contains a value

I have a table:
raw TABLE
=========
id class_ids
------------------------
1 1234,12334,12341,1228
2 12281,12341,12283
3 1234,34221,31233,43434,1123
How do I define a regex to select rows if class_ids contains a specific id?
If we select rows with '1234' in class_ids, the result list should not contain rows with '12341' in class_ids.
The IDs in the class_ids column are separated with ,
SELECT * FROM raw re WHERE re.class_ids LIKE (regex)
You shouldn't be storing comma separated values in a single column.
However, this is better done using string_to_array() in Postgres instead of a regex:
SELECT *
FROM raw
WHERE '1234'= any(string_to_array(class_ids, ','));
If you really want to de-normalize your data, it's better to store those numbers in a proper integer array instead of a comma-separated list of strings.
A simple way uses LIKE:
where ',' || re.class_ids || ',' like '%,1234,%'
However, this is not the real issue. You should not be storing lists of ids in a string. The SQLish way of storing them would be a table with one row per id/class_id pair. This is called a junction table.
Even if you don't use a separate table, you should at least use Postgres's built-in mechanisms, such as an array. However, a separate table is much the preferred method, because you can explicitly declare foreign key relationships.
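A minimal sketch of that junction-table design (table and column names are hypothetical, assuming raw.id is the primary key):
-- one row per (raw id, class id) pair, with a proper foreign key
CREATE TABLE raw_class (
    raw_id   integer NOT NULL REFERENCES raw (id),
    class_id integer NOT NULL,
    PRIMARY KEY (raw_id, class_id)
);
-- the original question then becomes an ordinary indexed lookup
SELECT raw_id FROM raw_class WHERE class_id = 1234;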
If you really want to do this with regular expressions, you can use the ~ operator:
SELECT * FROM raw re WHERE re.class_ids ~ '(^|,)1234(,|$)';
But I prefer a_horse_with_no_name's answer that uses arrays.

Multiple columns in one SQL

Can I insert multiple values from different columns into one?
I have:
ref | alt | couple_refalt
--------------------------
A   | C   | AC    (what I want)
A   | G   | AG    etc.
Is there a simple way?
I tried with:
INSERT INTO refalt(couple_refalt)
SELECT ref||alt
FROM refalt
WHERE ref='A';
Is it correct?
But it gives me the error:
null value in column violates not-null constraint
Postgres wants a value for each column; why can't I update or insert into a specific column?
Storing comma-separated values is not the SQLish way to store values. What you seem to want is a computed column. Postgres does not support that directly. One method is to declare a view:
create view v_refalt as
select r.*, r.ref || r.alt as couple_refalt
from refalt r;
Other possibilities are:
Define a trigger to maintain the value (see the sketch after this list).
Concatenate the values at the application level.
Use a function-based method to emulate a computed column.
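A minimal sketch of the trigger option in PostgreSQL (function and trigger names are hypothetical; on versions before 11, use EXECUTE PROCEDURE instead of EXECUTE FUNCTION):
CREATE OR REPLACE FUNCTION set_couple_refalt() RETURNS trigger AS $$
BEGIN
    -- keep the concatenated column in sync on every insert or update
    NEW.couple_refalt := NEW.ref || NEW.alt;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER refalt_couple
BEFORE INSERT OR UPDATE ON refalt
FOR EACH ROW EXECUTE FUNCTION set_couple_refalt();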
In order to insert two values into one column, you need to concatenate them. In PostgreSQL the syntax is the following:
SELECT ref::text || alt::text AS couple_refalt FROM refalt;
If you want more details, here is the string documentation

Oracle SQL - Joining list of values to a field with those values concatenated

The title is a bit confusing, so I'll explain with an example what I'm trying to do.
I have a field called "modifier". This is a field with concatenated values for each individual. For example, the value in one row could be:
*26,50,4 *
and the value in the next row
*4 *
And the table (Table A) would look something like this:
Key   Modifier
1     *26,50,4 *
2     *4 *
3     *1,2,3,4 *
The asterisks are always going to be in the same position (here, 1 and 26) with an uncertain number of numbers in between, separated by commas.
What I'd like to do is "join" this "modifier" field to another table (Table B) with a list of possible values for that modifier. e.g., that table could look like this:
ID   MOD
1    26
2    3
3    50
4    78
If a value in A.modifier appears in B.mod, I want to keep that row in Table A. Otherwise, leave it out. (I use the term "join" loosely because I'm not sure that's what I need here.)
Is this possible? How would I do it?
Thanks in advance!
Edit 1: I realize I can use regular expressions and do a bunch of OR statements that search for the comma-separated values in the MOD list, but is there a better way?
One way to do it is using TRIM, string concatenations and LIKE.
SELECT *
FROM tableA a
WHERE EXISTS (
    SELECT 1
    FROM tableB b
    WHERE ',' || trim( trim( BOTH '*' FROM a.Modifier ) ) || ','
          LIKE '%,' || b.mod || ',%'
);
Demo --> http://www.sqlfiddle.com/#!4/1caa8/10
This query might still be slow for huge tables (it always performs full scans of tables or indexes); however, it should be faster than using regular expressions or parsing comma-separated lists into individual values.
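With the sample data above, this should return:
Key   Modifier
1     *26,50,4 *
3     *1,2,3,4 *
Row 1 is kept because 26 and 50 appear in Table B's MOD list, row 3 because 3 does, while row 2 is dropped because 4 does not appear there.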