How to perform the same aggregation on every column, without listing the columns? - sql

I have a table with N columns. Let's call them c1, c2, c3, c4, ..., cN. Among multiple rows, I want to get a single row with COUNT(DISTINCT cX) for each X in [1, N].
c1 | c2 | ... | cN
---+----+-----+---
 0 |  4 | ... |  1
Is there a way I can do this (in a stored procedure) without writing every column name into the query manually?
Why?
We've had a problem where bugs in application servers overwrite good column values with garbage inserted later. To solve this, I'm storing the information log-structured, where each row represents a logical UPDATE query. Then, when given a signal that the record is complete, I can determine whether any values were (erroneously) overwritten.
An example of a single correct record spread over multiple rows: there is at most one distinct non-NULL value for each column.
| id | initialize_time | start_time | end_time |
| 1 | 12:00am | NULL | NULL |
| 1 | 12:00am | 1:00pm | NULL |
| 1 | 12:00am | NULL | 2:00pm |
Reconciled row:
| 1 | 12:00am | 1:00pm | 2:00pm |
An example of an irreconcilable record that I want to detect:
| id | initialize_time | start_time | end_time |
| 1 | 12:00am | NULL | NULL |
| 1 | 12:00am | 1:00pm | NULL |
| 1 | 9:00am | 1:00pm | 2:00pm | -- New initialize time => irreconcilable!

You need dynamic SQL for that, which means you have to create a function or run a DO command. Since you cannot return values directly from the latter, a plpgsql function it is:
CREATE OR REPLACE FUNCTION f_count_all(_tbl text
                                     , OUT columns text[]
                                     , OUT counts bigint[])
  RETURNS record
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE (
   SELECT 'SELECT
    ARRAY[' || string_agg('''' || quote_ident(attname) || '''', ', ' ORDER BY attnum) || ']
  , ARRAY[' || string_agg('count(' || quote_ident(attname) || ')', ', ' ORDER BY attnum) || ']
   FROM ' || attrelid::regclass   -- safely quoted, schema-qualified table name
   FROM   pg_attribute
   WHERE  attrelid = _tbl::regclass
   AND    attnum >= 1          -- exclude tableoid & friends (neg. attnum)
   AND    NOT attisdropped     -- exclude dropped columns
   GROUP  BY attrelid
   )
   INTO columns, counts;
END
$func$;
Call:
SELECT * FROM f_count_all('myschema.mytable');
Returns:
columns | counts
--------------+--------
{c1, c2, c3} | {17, 1, 0}
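Note that count(col) counts non-NULL values, not distinct ones. For the reconciliation check described in the question, you would swap in count(DISTINCT ...) in the generated query. Written out by hand for the example columns, the check boils down to something like this (a sketch, assuming the log table is named tbl):
SELECT id
FROM   tbl
GROUP  BY id
HAVING count(DISTINCT initialize_time) > 1
    OR count(DISTINCT start_time) > 1
    OR count(DISTINCT end_time) > 1;  -- any column with more than one distinct non-NULL value => irreconcilable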
More explanation and links about dynamic SQL and EXECUTE in this related question - or in a couple more here on SO.
Related:
Count values for every column in a table
You could even try to return a polymorphic record type to get individual columns dynamically, but that's rather complex and advanced. Probably too much effort for your case. More in this related answer.


Check string for substring existence

How can I check whether a certain substring (for instance 18UT) is part of a string in a column?
Redshift's SUBSTRING function allows me to "cut" a substring based on a start index plus the length of the substring, but not to check whether a specific substring exists in the column's value.
Example:
+------------------+
| col |
+------------------+
| 14TH, 14KL, 18AB |
| 14LK, 18UT, 15AK |
| 14AB, 08ZT, 18ZH |
| 14GD, 52HG, 18UT |
+------------------+
Desired result:
+------------------+------+
| col | 18UT |
+------------------+------+
| 14TH, 14KL, 18AB | No |
| 14LK, 18UT, 15AK | Yes |
| 14AB, 08ZT, 18ZH | No |
| 14GD, 52HG, 18UT | Yes |
+------------------+------+
Here is one option:
select col,
       case when ', ' || col || ', ' like '%, 18UT, %' then 'Yes' else 'No' end as has_18ut
from mytable
While this will solve your immediate problem, it should be noted that storing delimited lists in a database table is bad practice and should be avoided. Each value should go in a separate row instead.
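If an exact token match is not required, Redshift's CHARINDEX (or the equivalent POSITION) expresses the check more directly - but note it also matches longer codes such as 18UTX, which the delimiter-padding trick above deliberately excludes. A sketch against the same mytable:
select col,
       case when charindex('18UT', col) > 0 then 'Yes' else 'No' end as has_18ut
from mytable;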

Big Query - Transpose arrays into columns

We have a table in Big Query like below.
Input table:
Name | Interests
-----+----------
Bob | ["a"]
Sue | ["a","b"]
Joe | ["b","c"]
We want to convert the above table to below format to make it BI/Visualisation friendly.
Target/Required table:
+------------------+
| Name | a | b | c |
+------------------+
| Bob | 1 | 0 | 0 |
| Sue | 1 | 1 | 0 |
| Joe | 0 | 1 | 1 |
+------------------+
Note: The Interests column is an array datatype. Is this sort of transformation possible in BigQuery? If yes, any reference query?
Thanks in advance!
Below is for BigQuery Standard SQL and uses scripting features of BQ
#standardSQL
CREATE TEMP TABLE ttt AS (
  SELECT name, interest
  FROM `project.dataset.table`,
  UNNEST(interests) interest
);
EXECUTE IMMEDIATE (
  SELECT """
    SELECT name, """ ||
    STRING_AGG("""MAX(IF(interest = '""" || interest || """', 1, 0)) AS """ || interest, ', ')
    || """
    FROM ttt
    GROUP BY name
  """
  FROM (
    SELECT DISTINCT interest
    FROM ttt
    ORDER BY interest
  )
);
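For the sample data, the statement that EXECUTE IMMEDIATE ends up running is equivalent to:
SELECT name,
       MAX(IF(interest = 'a', 1, 0)) AS a,
       MAX(IF(interest = 'b', 1, 0)) AS b,
       MAX(IF(interest = 'c', 1, 0)) AS c
FROM ttt
GROUP BY name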
Applied to the sample data from your question, this produces the desired output.

How to concat data columns using loop?

How can I concatenate column data using a loop in Postgres?
I have this table:
+------+------+------+--------+--------+--------+
| col1 | col2 | col3 | other1 | other2 | other3 |
+------+------+------+--------+--------+--------+
| 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 2 | 2 | 2 | 2 | 2 |
+------+------+------+--------+--------+--------+
and want to concat columns (col*).
Expected output:
+----------------+--------+--------+--------+
| concatedcolumn | other1 | other2 | other3 |
+----------------+--------+--------+--------+
| **1**1**1** | 1 | 1 | 1 |
| **2**2**2** | 2 | 2 | 2 |
+----------------+--------+--------+--------+
I can concat using:
select concat('**', col1, '**', col2, '**', col3, '**') as concatedcolumn
     , other1, other2, other3
from   sample_table
I have some 200 columns with the prefix "col" and don't want to spell out all the columns in SQL. How could I achieve this with a loop?
Questionable database design aside, you can generate the SELECT statement dynamically:
SELECT 'SELECT ''**'' || concat_ws(''**'', '
    || string_agg(quote_ident(attname), ', ' ORDER BY attnum) FILTER (WHERE attname LIKE 'col%')
    || ') || ''**'' AS concatedcolumn, '
    || string_agg(quote_ident(attname), ', ' ORDER BY attnum) FILTER (WHERE attname NOT LIKE 'col%')
    || ' FROM public.tbl;'  -- your table name here
FROM   pg_attribute
WHERE  attrelid = 'public.tbl'::regclass  -- ... and here
AND    attnum > 0
AND    NOT attisdropped;
Query the system catalog pg_attribute or, alternatively, the information schema table columns. I prefer the system catalog.
Related answer on dba.SE discussing "information schema vs. system catalogs"
Execute in a second step (after verifying it's what you want).
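For the sample table, the generated statement would come out roughly like this (assuming the table is public.tbl with columns col1..col3 and other1..other3):
SELECT '**' || concat_ws('**', col1, col2, col3) || '**' AS concatedcolumn,
       other1, other2, other3
FROM public.tbl;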
No loop involved. You can build the statement dynamically, but you cannot (easily) return the result dynamically, because SQL demands to know the return type at execution time.
concat_ws() is convenient, but it ignores NULL values. I didn't deal with those specially. You may or may not want to do that. Related:
Combine two columns and add into one new column
How to concatenate columns in a Postgres SELECT?

Mutating error on an AFTER insert trigger

CREATE OR REPLACE TRIGGER TRG_INVOICE
AFTER INSERT ON INVOICE
FOR EACH ROW
DECLARE
    V_SERVICE_COST    FLOAT;
    V_SPARE_PART_COST FLOAT;
    V_TOTAL_COST      FLOAT;
    V_INVOICE_DATE    DATE;
    V_DUEDATE         DATE;
    V_REQ_ID          INVOICE.SERVICE_REQ_ID%TYPE;
    V_INV_ID          INVOICE.INVOICE_ID%TYPE;
BEGIN
    V_REQ_ID := :NEW.SERVICE_REQ_ID;
    V_INV_ID := :NEW.INVOICE_ID;

    SELECT SUM(S.SERVICE_COST)
    INTO   V_SERVICE_COST
    FROM   INVOICE I, SERVICE_REQUEST SR, SERVICE S, SERVICE_REQUEST_TYPE SRT
    WHERE  I.SERVICE_REQ_ID = SR.SERVICE_REQ_ID
    AND    SR.SERVICE_REQ_ID = SRT.SERVICE_REQ_ID
    AND    SRT.SERVICE_ID = S.SERVICE_ID
    AND    I.SERVICE_REQ_ID = V_REQ_ID;

    SELECT SUM(SP.PRICE)
    INTO   V_SPARE_PART_COST
    FROM   INVOICE I, SERVICE_REQUEST SR, SERVICE S, SERVICE_REQUEST_TYPE SRT,
           SPARE_PART_SERVICE SRP, SPARE_PART SP
    WHERE  I.SERVICE_REQ_ID = SR.SERVICE_REQ_ID
    AND    SR.SERVICE_REQ_ID = SRT.SERVICE_REQ_ID
    AND    SRT.SERVICE_ID = S.SERVICE_ID
    AND    S.SERVICE_ID = SRP.SERVICE_ID
    AND    SRP.SPARE_PART_ID = SP.SPARE_PART_ID
    AND    I.SERVICE_REQ_ID = V_REQ_ID;

    V_TOTAL_COST := V_SERVICE_COST + V_SPARE_PART_COST;
    V_INVOICE_DATE := SYSDATE;
    V_DUEDATE := ADD_MONTHS(SYSDATE, 1);

    UPDATE INVOICE
    SET    COST_SERVICE_REQ = V_SERVICE_COST,
           COST_SPARE_PART  = V_SPARE_PART_COST,
           TOTAL_BALANCE    = V_TOTAL_COST,
           PAYMENT_DUEDATE  = V_DUEDATE,
           INVOICE_DATE     = V_INVOICE_DATE
    WHERE  INVOICE_ID = V_INV_ID;
END;
I'm trying to calculate some columns after the user inserts a row.
Using the service_request_id I want to calculate the service/parts/total cost. Also, I would like to generate the creation and due dates. But, I keep getting
INVOICE is mutating, trigger/function may not see it
Not sure how the table is mutating after the insert statement.
Not sure how the table is mutating after the insert statement.
Imagine a simple table:
create table x(
    x int,
    my_sum int
);
and an AFTER INSERT FOR EACH ROW trigger, similar to yours, which calculates the sum of all values in the table and updates the my_sum column.
Now imagine this insert statement:
insert into x( x )
select 1 as x from dual
connect by level <= 1000;
This single statement basically inserts 1000 records, each one with the value 1; see this demo: http://sqlfiddle.com/#!4/0f211/7
Since in SQL each individual statement must be ATOMIC (more on this here: Statement-Level Read Consistency), Oracle is free to perform this query in any way as long as the final result is correct (consistent). It can save records in the order of execution, maybe in reverse order, or it can divide the batch into 10 threads and do it in parallel.
Since the trigger is fired individually after inserting each row, and it cannot know in advance the "final" result, then considering the above, all the results below are possible depending on the "internal" method chosen by Oracle to execute this query. As you see, these results do not meet the definition of consistency, and Oracle prevents this by issuing the mutating table error.
In other words - your assumptions are wrong and your design is flawed; you need to change it.
| X | MY_SUM |
|---|--------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
...
...
or maybe:
| X | MY_SUM |
|---|--------|
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
...
or maybe:
| X | MY_SUM |
|---|--------|
| 1 | 4 |
| 1 | 8 |
| 1 | 12 |
| 1 | 16 |
| 1 | 20 |
| 1 | 24 |
| 1 | 28 |
...
...
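If you really need to maintain such a derived value on the same table, one common workaround is a compound trigger (Oracle 11g+) that defers all the work to its AFTER STATEMENT section, where querying and updating the firing table no longer raises ORA-04091. A minimal sketch against the toy x table above:
CREATE OR REPLACE TRIGGER trg_x_sum
FOR INSERT ON x
COMPOUND TRIGGER
    -- no row-level section needed: all work happens once, after the statement
    AFTER STATEMENT IS
    BEGIN
        UPDATE x
        SET my_sum = (SELECT SUM(t.x) FROM x t);  -- consistent: sees all inserted rows
    END AFTER STATEMENT;
END trg_x_sum;
This yields the second result above (every row gets my_sum = 1000), which is the only consistent outcome for that statement.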

Concatenate three columns data into one column in Postgres

Can anyone tell me which command is used to concatenate data from three columns into one column in a PostgreSQL database?
e.g.
If the columns are
begin | Month | Year
12 | 1 | 1988
13 | 3 | 1900
14 | 4 | 2000
15 | 5 | 2012
result like
Begin
12-1-1988
13-3-1900
14-4-2000
15-5-2012
Just use the concatenation operator ||: http://www.sqlfiddle.com/#!1/d66bb/2
select begin || '-' || month || '-' || year as begin
from t;
Output:
| BEGIN |
-------------
| 12-1-1988 |
| 13-3-1900 |
| 14-4-2000 |
| 15-5-2012 |
If you want to change the begin column itself, the begin column must be of a string type first, then do this: http://www.sqlfiddle.com/#!1/13210/2
update t set begin = begin || '-' || month || '-' || year ;
Output:
| BEGIN |
-------------
| 12-1-1988 |
| 13-3-1900 |
| 14-4-2000 |
| 15-5-2012 |
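If the begin column is not of a string type yet, you could convert it first and then rewrite it in place (a sketch, assuming the same table t):
alter table t alter column begin type text using begin::text;
update t set begin = begin || '-' || month || '-' || year;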
UPDATE
About this comment:
"but I'm not getting null value column date"
Use this:
select (begin || '-' || month || '-' || year)::date as begin
from t
Have a look at 9.4. String Functions and Operators
This is an old post, but I just stumbled upon it. Doesn't it make more sense to produce an actual date value? You can do that using:
select make_date(year, month, begin) as begin
from t;
A date seems more useful than a string (and you can even format it however you like using to_char()).
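For instance, to reproduce the dashed format from the question while keeping a real date underneath (a sketch against the same table t):
select to_char(make_date(year, month, begin), 'FMDD-FMMM-YYYY') as begin
from t;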