Looping through a PostgreSQL table in bash - sql

I have a PostgreSQL table of the following format:
uid | defaults | settings
----+----------+---------
abc | ab, bc   | -
pqr | pq, ab   | -
xyz | xy, pq   | -
I am trying to list all the uids which contain ab in the defaults column. In the above case, abc and pqr must be listed.
How do I form the query and loop it around the table to check each row in bash?

#user000001 already provided the bash part. And the query could be:
SELECT uid
FROM tbl1
WHERE defaults LIKE '%ab%'
But this is inherently unreliable, since it would also find 'fab' or 'abz'. It is also hard to index: a pattern with a leading wildcard cannot use a plain B-tree index.
Consider normalizing your schema. Meaning you would have another 1:n table tbl2 with entries for individual defaults and a foreign key to tbl1. Then your query could be:
SELECT uid
FROM tbl1 t1
WHERE EXISTS
(SELECT 1
FROM tbl2 t2
WHERE t2.def = 'ab' -- "default" = reserved word in SQL, so I use "def"
AND t2.tbl1_uid = t1.uid);
Or at least use an array for defaults. Then your query would be:
SELECT uid
FROM tbl1
WHERE 'ab' = ANY (defaults);

It's not really about bash but you can call your query command using psql. You can try this format:
psql -U username -d database_name -c "SELECT uid FROM table_name WHERE defaults LIKE 'ab, %' OR defaults LIKE '%, ab'"
Or maybe simply
psql -U username -d database_name -c "SELECT uid FROM table_name WHERE defaults LIKE '%ab%'"
-U username is optional.
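To actually loop over the rows in bash, one option is to have psql print one bare value per line and read those lines in a while loop. The following is a minimal sketch, not a definitive implementation: it assumes psql's -A (unaligned) and -t (tuples only) options are used so that no header or footer is printed, and handle_uid and for_each_uid are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical per-row action: replace the body with whatever each uid needs.
handle_uid() {
    printf 'found uid: %s\n' "$1"
}

# Read one uid per line from stdin and call handle_uid for each non-empty line.
# The "|| [ -n ... ]" part also picks up a final line without a trailing newline.
for_each_uid() {
    local uid
    while IFS= read -r uid || [ -n "$uid" ]; do
        [ -n "$uid" ] && handle_uid "$uid"
    done
    return 0
}

# Intended use against a real database (placeholder names, not run here):
#   psql -At -d database_name \
#        -c "SELECT uid FROM table_name WHERE defaults LIKE '%ab%'" | for_each_uid
```

Keeping the loop in a function that reads stdin means it can be tried with printf before pointing it at a real server.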

Use awk:
awk -F\| '$2~/ab/{print $1}' file
Explanation:
The -F\| sets the field separator to the | character
With $2~/ab/ we filter the lines that contain "ab" in the second column.
With print $1 we print the first column for the lines matched.
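Note that the pattern match above also matches values such as fab or abz. A stricter variant, sketched here under the assumption that the file holds |-separated rows with a comma-separated defaults column, splits the second field on commas and compares each item exactly. The function name list_uids is hypothetical.

```shell
# Print the first column of every row whose second column contains
# the exact item given as $1 (comma-separated list, surrounding blanks ignored).
list_uids() {
    awk -F'|' -v want="$1" '
        {
            n = split($2, items, ",")
            for (i = 1; i <= n; i++) {
                gsub(/^[ \t]+|[ \t]+$/, "", items[i])   # trim the item
                if (items[i] == want) {
                    gsub(/^[ \t]+|[ \t]+$/, "", $1)     # trim the uid
                    print $1
                    break
                }
            }
        }'
}

# Usage: list_uids ab < file
```

Header and separator lines fall through naturally, because none of their items equals the wanted value.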

Related

Use SQL to get the same list of functions that are part of a `pg_dump`

Using pg_dump, we get a dump that contains function definitions (amongst much other data). Here is a simple bash script that outputs the names of the functions:
pg_dump --no-owner mydatabase | grep ^"CREATE FUNCTION" | cut -f 3 -d " " | cut -f 1 -d "("
How can I get the equivalent list with pure postgres SQL? I have tried many other answers here on stackoverflow to list functions in postgres, and I usually get way more functions.
EDIT:
If this is important, I'm on postgres 10.14. This is a postgis-ready database: mydatabase is created empty, then a few extensions are installed (postgis among others), and then it is initialized by replaying a dump containing (amongst regular table schema and data) a list of CREATE FUNCTION .... I suspect the application adds a few more functions once it is running, and the list of functions I get from pg_dump is close to the one in the initial dump that was replayed.
The bash script doesn't output postgis function names.
If that helps:
$ echo "\df" | psql -qAt mydatabase | wc -l
756
$ pg_dump --no-owner mydatabase | grep ^"CREATE FUNCTION" | cut -f 3 -d " " | cut -f 1 -d "(" | wc -l
30
EDIT2:
Main issue seems to be in 2 parts:
selecting the right catalog (seems to be at least 'public', and not 'information_schema' nor 'pg')
and managing to have a SQL equivalent to the check findOwningExtension as this is the obvious main way to prevent dumping all the functions that are coming from an extension.
Here is the relevant code:
https://github.com/postgres/postgres/blob/65aaed22a849c0763f38f81338a1cad04ffc0e2c/src/bin/pg_dump/pg_dump.c#L6087
See also the detailed explanatory comment ~30 lines earlier. I copy/pasted the string into a python shell to concatenate the lines, resulting in:
SELECT p.tableoid, p.oid, p.proname, p.prolang, p.pronargs, p.proargtypes, p.prorettype, p.proacl, acldefault('f', p.proowner) AS acldefault, p.pronamespace, (%s p.proowner) AS rolname FROM pg_proc p LEFT JOIN pg_init_privs pip ON (p.oid = pip.objoid AND pip.classoid = 'pg_proc'::regclass AND pip.objsubid = 0) WHERE %s
AND NOT EXISTS (SELECT 1 FROM pg_depend WHERE classid = 'pg_proc'::regclass AND objid = p.oid AND deptype = 'i')
AND (
pronamespace != (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog')
OR EXISTS (SELECT 1 FROM pg_cast
WHERE pg_cast.oid > %u
AND p.oid = pg_cast.castfunc)
OR EXISTS (SELECT 1 FROM pg_transform
WHERE pg_transform.oid > %u AND
(p.oid = pg_transform.trffromsql
OR p.oid = pg_transform.trftosql)))
From looking at the code, we can see that the first %s should be username_subquery, the second should be not_agg_check, and the two %u placeholders are g_last_builtin_oid. I did have to dig around the file a bit to find those values (and, for g_last_builtin_oid, do a google search for the FirstNormalObjectId that it's defined in terms of). The final result, leaving only the human-relevant columns, for PG 10 (differs a bit for newer versions):
testdb=# SELECT p.proname, p.pronamespace, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid =p.proowner) AS rolname FROM pg_proc p LEFT JOIN pg_init_privs pip ON (p.oid = pip.objoid AND pip.classoid = 'pg_proc'::regclass AND pip.objsubid = 0) WHERE NOT p.proisagg
AND NOT EXISTS (SELECT 1 FROM pg_depend WHERE classid = 'pg_proc'::regclass AND objid = p.oid AND deptype = 'i')
AND (
pronamespace != (SELECT oid FROM pg_namespace WHERE nspname = 'pg_catalog')
OR EXISTS (SELECT 1 FROM pg_cast
WHERE pg_cast.oid > 16384
AND p.oid = pg_cast.castfunc)
OR EXISTS (SELECT 1 FROM pg_transform
WHERE pg_transform.oid > 16384 AND
(p.oid = pg_transform.trffromsql
OR p.oid = pg_transform.trftosql)));
proname | pronamespace | rolname
-----------------------------+--------------+----------
blhandler | 2200 | myrole
bar | 2200 | myrole
foo | 2200 | myrole
_pg_expandarray | 13327 | postgres
_pg_keysequal | 13327 | postgres
_pg_index_position | 13327 | postgres
_pg_truetypid | 13327 | postgres
_pg_truetypmod | 13327 | postgres
_pg_char_max_length | 13327 | postgres
_pg_char_octet_length | 13327 | postgres
_pg_numeric_precision | 13327 | postgres
_pg_numeric_precision_radix | 13327 | postgres
_pg_numeric_scale | 13327 | postgres
_pg_datetime_precision | 13327 | postgres
_pg_interval_type | 13327 | postgres
(15 rows)
(this is in a test db I had lying around, not sure what those _pg_* funcs are, but the namespace is information_schema so whatever)

select query for sql

Table description
Table Name:- Name condition
Name | Pattern
A | %A% or Name like %a%
B | %B% or Name like %b%
C | %C% or Name like %c%
D | %D% or Name like %d%
E | %E% or Name like %e%
F | %F% or Name like %f%
G | %G% or Name like %g%
Table name:- Employees
Emp_ID | EMP_NAME
1 | Akshay
2 | Akhil
3 | Gautam
4 | Esha
5 | bhavish
6 | Chetan
7 | Arun
[Table description] [1]: https://i.stack.imgur.com/wvOgr.png
Above are my two tables. My query (also shown in the image) is:
Select * from Employees,Name_condition where EMP_NAME like Pattern
The query is syntactically correct but produces the wrong output.
It treats the value of the Pattern column as one literal string and searches for that in EMP_NAME, so it finds nothing.
My question is: how can the value stored in the Pattern column be used as a condition rather than as a string, so that the query effectively becomes this:
Select * from Employees,Name_condition where EMP_NAME like '%A%' or Name like '%a%'
In other words, when I pass the column name (Pattern) in the where condition, it takes %A% or Name like %a% as a whole string. What I want is that in
select * from Employees,Name_condition where EMP_NAME like Pattern
the column name Pattern is internally replaced by the value stored in the column, so that the query behaves like the expanded one above.
Desired result: all matching rows, including bhavish. As shown, the like condition is stored in the column itself, e.g. %B% or Name like %b%. When the query evaluates
where EMP_NAME like Pattern
the value of Pattern must be internally replaced by
%B% or Name like %b%
so that the output includes bhavish, which starts with b.
try:
select *
from employees
where emp_name like '%oh%'
or emp_name like '%a%';
Good luck.
Try this from orafaq:
SELECT * FROM employees
WHERE emp_name LIKE '\%a\%' ESCAPE '\';
It's not that simple (and it shouldn't be). If you really have to use such tables you have to write a piece of PL/SQL to handle your conditions.
Two things you have to read about:
dynamic sql
sql injection (because you want to prevent it)
Try to Put && instead of 'And' in condition

Postgres matching against an array of regular expressions

My client wants the possibility to match a set of data against an array of regular expressions, meaning:
table:
name | officeId (foreignkey)
--------
bob | 1
alice | 1
alicia | 2
walter | 2
and he wants to do something along those lines:
get me all records of offices (officeId) where there is a member with
ANY name ~ ANY[.*ob, ali.*]
meaning
ANY of[alicia, walter] ~ ANY of [.*ob, ali.*] results in true
I could not figure it out by myself sadly :/.
Edit
The real problem was missing from the original description:
I cannot use select distinct officeId .. where name ~ ANY[.*ob, ali.*], because:
This application stores data in postgres xml columns, which means I do in fact have (after evaluating (xpath('/data/clients/name/text()'))::text[]):
table:
name | officeId (foreignkey)
-----------------------------------------
[bob, alice] | 1
[anthony, walter] | 2
[alicia, walter] | 3
That is the problem. And "you don't do that, that is horrible, why would you do it like this, store it the way it is meant to be stored in a relational database, use a NoSQL database for document-based storage, use json" are not options.
I am stuck with this datamodel.
This looks pretty horrific, but the only way I can think of doing such a thing would be a hybrid of a cross-join and a semi join. On small data sets this would probably work pretty well. On large datasets, I imagine the cross-join component could hit you pretty hard.
Check it out and let me know if it works against your real data:
with patterns as (
select unnest(array['.*ob', 'ali.*']) as pattern
)
select
o.name, o.officeid
from
office o
where exists (
select null
from patterns p
where o.name ~ p.pattern
)
The semi-join helps protect you from cases where a name like "alicia nob" meets multiple search patterns and would otherwise come back once for every match.
You could cast the array to text.
SELECT * FROM workers WHERE (xpath('/data/clients/name/text()', xml_field))::text ~ ANY(ARRAY['wal','ant']);
When casting a string array into text, strings containing special characters or consisting of keywords are enclosed in double quotes kind of like {jimmy,"walter, james"} being two entries. Also when matching with ~ it is matched against any part of the string, not the same as LIKE where it's matched against the whole string.
Here is what I did in my test database:
test=# select id, (xpath('/data/clients/name/text()', name))::text[] as xss, officeid from workers WHERE (xpath('/data/clients/name/text()', name))::text ~ ANY(ARRAY['wal','ant']);
id | xss | officeid
----+-------------------------+----------
2 | {anthony,walter} | 2
3 | {alicia,walter} | 3
4 | {"walter, james"} | 5
5 | {jimmy,"walter, james"} | 4
(4 rows)

Postgres DB Size Command

What is the command to find the size of all the databases?
I am able to find the size of a specific database by using following command:
select pg_database_size('databaseName');
You can enter the following psql meta-command to get some details about a specified database, including its size:
\l+ <database_name>
And to get sizes of all databases (that you can connect to):
\l+
You can get the names of all the databases that you can connect to from the "pg_database" system table. Just apply the function to the names, as below.
select t1.datname AS db_name,
pg_size_pretty(pg_database_size(t1.datname)) as db_size
from pg_database t1
order by pg_database_size(t1.datname) desc;
If you intend the output to be consumed by a machine instead of a human, you can cut the pg_size_pretty() function.
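For machine consumption, one possibility is to ask psql for unaligned, tuples-only output (-At), which prints one datname|bytes pair per line, and post-process that in the shell. The sketch below sums the raw byte counts. The function name sum_db_bytes is hypothetical, and the psql call in the comment assumes a reachable server.

```shell
# Sum the second "|"-separated field (size in bytes) over all input lines.
# %.0f is used instead of %d so that totals above 2^31 print correctly
# in awk implementations with 32-bit integer formatting.
sum_db_bytes() {
    awk -F'|' '{ total += $2 } END { printf "%.0f\n", total }'
}

# Intended use (not run here; requires a reachable server):
#   psql -At -c "SELECT datname, pg_database_size(datname) FROM pg_database" | sum_db_bytes
```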
-- Database Size
SELECT pg_size_pretty(pg_database_size('Database Name'));
-- Table Size
SELECT pg_size_pretty(pg_relation_size('table_name'));
Based on the answer here by #Hendy Irawan
Show database sizes:
\l+
e.g.
=> \l+
berbatik_prd_commerce | berbatik_prd | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 19 MB | pg_default |
berbatik_stg_commerce | berbatik_stg | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 8633 kB | pg_default |
bursasajadah_prd | bursasajadah_prd | UTF8 | en_US.UTF-8 | en_US.UTF-8 | | 1122 MB | pg_default |
Show table sizes:
\d+
e.g.
=> \d+
public | tuneeca_prd | table | tomcat | 8192 bytes |
public | tuneeca_stg | table | tomcat | 1464 kB |
Only works in psql.
Yes, there is a command to find the size of a database in Postgres. It's the following:
SELECT pg_database.datname AS "database_name", pg_size_pretty(pg_database_size(pg_database.datname)) AS size FROM pg_database ORDER BY pg_database_size(pg_database.datname) DESC;
SELECT pg_size_pretty(pg_database_size('name of database'));
This gives you the total size of a particular database; however, I don't think a single call can cover all databases on a server.
However you could do this...
DO
$$
DECLARE
    r RECORD;
    db_size TEXT;
BEGIN
    FOR r IN
        SELECT datname FROM pg_database
        WHERE datistemplate = false
    LOOP
        db_size := pg_size_pretty(pg_database_size(r.datname));
        RAISE NOTICE 'Database:% , Size:%', r.datname, db_size;
    END LOOP;
END;
$$;
From the PostgreSQL wiki.
NOTE: Databases to which the user cannot connect are sorted as if they were infinite size.
SELECT d.datname AS Name, pg_catalog.pg_get_userbyid(d.datdba) AS Owner,
CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(d.datname))
ELSE 'No Access'
END AS Size
FROM pg_catalog.pg_database d
ORDER BY
CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
THEN pg_catalog.pg_database_size(d.datname)
ELSE NULL
END DESC -- nulls first
LIMIT 20
The page also has snippets for finding the size of your biggest relations and largest tables.
Start pgAdmin, connect to the server, click on the database name, and select the statistics tab. You will see the size of the database at the bottom of the list.
Then if you click on another database, it stays on the statistics tab so you can easily see many database sizes without much effort. If you open the table list, it shows all tables and their sizes.
You can use below query to find the size of all databases of PostgreSQL.
Reference is taken from this blog.
SELECT
datname AS DatabaseName
,pg_catalog.pg_get_userbyid(datdba) AS OwnerName
,CASE
WHEN pg_catalog.has_database_privilege(datname, 'CONNECT')
THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(datname))
ELSE 'No Access For You'
END AS DatabaseSize
FROM pg_catalog.pg_database
ORDER BY
CASE
WHEN pg_catalog.has_database_privilege(datname, 'CONNECT')
THEN pg_catalog.pg_database_size(datname)
ELSE NULL
END DESC;
du -k /var/lib/postgresql/ |sort -n |tail

Postgres concurrent copy without ID value?

I am performing concurrent copy commands but am not specifying a value for a serial ID field. As far as I know this is ok if I have just one copy command since Postgres will generate an ID.
But would this cause conflicts with more than 1 copy command running since the sequence is never updated by a copy command?
The copy command advances the serial id's sequence automatically, so it works fine without id conflicts.
I tested concurrent copy commands in postgresql 9.24.
I created a table like this:
create table tbl_test (id serial primary key, name varchar(16), age integer);
I also made two CSV files with 1,000,000 rows each.
file1.csv
"1", 1
"2", 2
...
"1000000", 1000000
file2.csv
"n1", 1
"n2", 2
...
"n1000000", 1000000
When I copy from file1 and file2 simultaneously, I get a result like the one below:
...
1000245 | n453649 | 453649
1000246 | 546595 | 546595
1000247 | n453650 | 453650
1000248 | 546596 | 546596
1000249 | n453651 | 453651
1000250 | 546597 | 546597
...
all data copied well.
postgres=# select count(*) from tbl_test;
count
---------
2000000
(1 row)
As long as the column has a sequence as its default (or is declared SERIAL/BIGSERIAL) and you do not supply that column directly in the COPY command, you will never have conflicts on that id.
Sequences are designed to be atomic and are not rolled back with a transaction, so concurrent sessions never receive the same value. That also prompts another common question: "How do I get gapless sequences?"