SQL: Problems translating recursive CONNECT BY statement for Postgres

I am trying to translate the following Oracle SQL, which inserts 1000 rows with incremental values into a table:
insert into tableName (col1, col2, col3)
select 'AN' || (1000000 + ROWNUM), 'EXT' || (9000000 + ROWNUM), ROWNUM
from dual
Connect By ROWNUM <= 1000 ;
For Postgres support, I know I can substitute ROWNUM with ROW_NUMBER() OVER (), but I'm really struggling to translate the CONNECT BY clause. I have read about CTEs, but I don't see how to use one with an INSERT statement.
Does anyone know how to write this statement for PostgreSQL? Thanks.

You can generate a series and just use that:
insert into tableName (col1, col2, col3)
select 'AN' || (1000000 + g.n), 'EXT' || (9000000 + g.n), g.n
from generate_series(1, 1000) g(n);
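If you specifically want the CTE route mentioned in the question, a recursive CTE also works together with INSERT ... SELECT in Postgres. A minimal sketch, reusing the table and column names from the question:
-- seq counts 1..1000, standing in for CONNECT BY ROWNUM <= 1000
WITH RECURSIVE seq(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 1000
)
INSERT INTO tableName (col1, col2, col3)
SELECT 'AN' || (1000000 + n), 'EXT' || (9000000 + n), n
FROM seq;
That said, generate_series() is the simpler and usually faster option here; the recursive CTE is mainly worth knowing when the original CONNECT BY expresses genuinely hierarchical logic.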

Try generate_series. Aliasing the function call as rownum also names its single output column rownum, so the expressions can stay as they are:
select 'AN' || (1000000 + rownum), 'EXT' || (9000000 + rownum), rownum
from generate_series(1, 1000) as rownum;

Related

Count number of null values for every column on a table

I would like to calculate, for each column in a table, the percent of rows that are null.
For one column, I was using:
SELECT ((SELECT COUNT(Col1)
FROM Table1)
/
(SELECT COUNT(*)
FROM Table1)) AS Table1Stats
Works great and is fast.
However, I want to do this for all ~50 columns of the table, and my environment does not allow me to use dynamic SQL.
Any recommendations? I am using Snowflake on AWS, but as an end user I am working in the Snowflake browser interface.
You can combine this as:
SELECT COUNT(Col1) * 1.0 / COUNT(*)
FROM Table1;
Or, if you prefer:
SELECT AVG( (Col1 IS NOT NULL)::INT )
FROM Table1;
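Without dynamic SQL, covering all ~50 columns means writing each expression out once, but everything stays in a single scan of the table. A sketch for the first few columns, using Snowflake's COUNT_IF to get the null percentage directly (col1 ... col3 are placeholders for the real column names):
SELECT COUNT_IF(col1 IS NULL) / COUNT(*) AS col1_null_ratio,
       COUNT_IF(col2 IS NULL) / COUNT(*) AS col2_null_ratio,
       COUNT_IF(col3 IS NULL) / COUNT(*) AS col3_null_ratio
FROM Table1;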
You can use a mix of object_construct() and flatten() to move the column names into rows. Because object_construct() omits keys whose value is NULL, the row count per key is the non-null count, and the null ratio follows from that:
create or replace temp table many_cols as
select 1 a, 2 b, 3 c, 4 d
union all select 1, null, 3, 4
union all select 8, 8, null, null
union all select 8, 8, 7, null
union all select null, null, null, null;
select key as column_name
     , 1 - count(*) / (select count(*) from many_cols) as ratio_null
from (
    select object_construct(a.*) x
    from many_cols a
), lateral flatten(x)
group by key;
You can do this using a SQL generator if you don't mind copying the text and running it once it's done.
-- SQL generator option:
select 'select' || listagg(' ((select count(' || COLUMN_NAME || ') from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS") / ' ||
'(select count(*) from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS")) as ' || COLUMN_NAME, ',') as SQL_STATEMENT
from "SNOWFLAKE_SAMPLE_DATA"."INFORMATION_SCHEMA"."COLUMNS"
where TABLE_CATALOG = 'SNOWFLAKE_SAMPLE_DATA' and TABLE_SCHEMA = 'TPCH_SF10000' and TABLE_NAME = 'ORDERS'
;
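The generator emits one count pair per column of ORDERS, so the generated statement (shown here reformatted and truncated) comes out roughly like:
select ((select count(O_ORDERKEY) from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS") /
        (select count(*) from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS")) as O_ORDERKEY,
       ((select count(O_CUSTKEY) from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS") /
        (select count(*) from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS")) as O_CUSTKEY,
       ...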
If copy and paste is not practical because you need to script it, you can feed the result of the SQL generator into a stored procedure I wrote that executes a single line of dynamic SQL:
call run_dynamic_sql(
select 'select' || listagg(' ((select count(' || COLUMN_NAME || ') from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS") / ' ||
'(select count(*) from "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF10000"."ORDERS")) as ' || COLUMN_NAME, ',') as SQL_STATEMENT
from "SNOWFLAKE_SAMPLE_DATA"."INFORMATION_SCHEMA"."COLUMNS"
where TABLE_CATALOG = 'SNOWFLAKE_SAMPLE_DATA' and TABLE_SCHEMA = 'TPCH_SF10000' and TABLE_NAME = 'ORDERS'
);
If you want the stored procedure, until it's published on Snowflake's blog it's available here: https://snowflake.pavlik.us/index.php/2021/01/22/running-dynamic-sql-in-snowflake/

Wrong symbol inside replace function (PL/SQL, ORACLE)

I have the below procedure inside a package:
PROCEDURE test1
IS
InsertST varchar2(32000) : = 'INSERT INTO tableA (col1, col2)
(select cola,
INITCAP(REPLACE(colX, '_', ''))
from tableB))';
Begin
execute immediate InsertST;
END
During compilation I get this error:
Error(1177,45): PLS-00103: Encountered the symbol "_" when expecting one of the following: * & = - + ; < / > at in is mod remainder not rem <> or != or ~= >= <= <> and or like like2 like4 likec between || member submultiset
Something is wrong with the "_" inside the function call INITCAP(REPLACE(colX, '_', '')).
How can I fix it? Or is there another way?
The quoted string starting at 'INSERT ends at colX, ', so the underscore that follows sits outside any string literal, which is what PLS-00103 is complaining about. To put a quote inside a quoted string you need to either double up the quotes:
'INSERT INTO tableA (col1, col2)
(select cola,
INITCAP(REPLACE(colX, ''_'', ''''))
from tableB))'
or else use q-quoting syntax:
q'[INSERT INTO tableA (col1, col2)
(select cola,
INITCAP(REPLACE(colX, '_', ''))
from tableB))]';
Also, the assignment operator is := not : =.
It looks like you want to generate a statement like this:
insert into tablea ( col1, col2 )
select cola, initcap(replace(colx, '_', ''))
from tableb
which has a couple fewer brackets.
It doesn't look like it needs to be dynamic at all, but I'm assuming this is a simplified version of something that does.
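Putting the fixes together (the := operator, q-quoting so the REPLACE keeps its quoted underscore, and the simplified statement), the procedure might look like this sketch:
PROCEDURE test1
IS
  InsertST VARCHAR2(32000) := q'[INSERT INTO tableA (col1, col2)
    SELECT cola, INITCAP(REPLACE(colX, '_', ''))
    FROM tableB]';
BEGIN
  EXECUTE IMMEDIATE InsertST;  -- no trailing semicolon inside the dynamic SQL string
END test1;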

Count distinct values for every column individually

Can I count the distinct values of every column without enumerating the columns?
Say I have a table with col1, col2, col3, and no other column. Without mentioning these columns explicitly, I would like to have the same result as:
SELECT
count(distinct col1) as col1,
count(distinct col2) as col2,
count(distinct col3) as col3
FROM mytable;
How can I do this?
I think the best you could easily do with plain SQL is to run a query like this to generate the query you want, and then run that.
select 'select count(distinct '
|| listagg(column_name || ') as ' || column_name, ', count(distinct ') within group (order by column_id)
|| ' from ' || max(table_name) || ';' as script
from all_tab_cols
where table_name = 'MYTABLE';
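For the three-column example in the question, the script produced by that generator comes out as (assuming the table is visible in ALL_TAB_COLS as MYTABLE):
select count(distinct COL1) as COL1, count(distinct COL2) as COL2, count(distinct COL3) as COL3 from MYTABLE;
Run that second statement to get the distinct counts.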

How to find the 2nd highest value in a row with Oracle SQL?

I have a row in my database like this:
1|abc|10|30|12
The biggest value is 30 and the second highest is 12. How can I get the second-highest value for each row in my table?
This will work with any number of columns; just make sure you add them all to the concatenated list labelled merge_col in my query below:
select col1, col2, col3, col4, col5, second_highest
from (select x.*,
             regexp_substr(merge_col, '[^|]+', 1, levels.column_value) as second_highest,
             row_number() over (partition by x.col1
                                order by to_number(regexp_substr(merge_col, '[^|]+', 1, levels.column_value)) desc) as rn
      from (select t.*, col3 || '|' || col4 || '|' || col5 as merge_col
            from tbl t) x,
           table(cast(multiset(select level
                               from dual
                               connect by level <= length(regexp_replace(merge_col, '[^|]+')) + 1)
                      as sys.OdciNumberList)) levels)
where rn = 2
Fiddle test: http://sqlfiddle.com/#!4/b446f/2/0
In other words, for additional columns, change:
col3 || '|' || col4 || '|' || col5 as merge_col
to:
col3 || '|' || col4 || '|' || col5 || '|' || col6 ......... as merge_col
with however many columns there are in place of the ......
Assuming the values are all different:
select t.*,
(case when col1 <> greatest(col1, col2, col3) and
col1 <> least(col1, col2, col3)
then col1
when col2 <> greatest(col1, col2, col3) and
col2 <> least(col1, col2, col3)
then col2
else col3
end) as secondgreatest
from table t;
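A quick sanity check against the numeric part of the sample row in the question (10, 30, 12), with literals standing in for col1, col2, col3:
select (case when 10 <> greatest(10, 30, 12) and 10 <> least(10, 30, 12) then 10
             when 30 <> greatest(10, 30, 12) and 30 <> least(10, 30, 12) then 30
             else 12
        end) as secondgreatest
from dual;
-- returns 12, the second-highest of the three values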
There are different ways to achieve that. Try this one, for instance (column_name and table_name are placeholders; note that this finds the second-highest value within a single column rather than across the columns of a row):
SELECT MAX(column_name) FROM table_name
WHERE column_name NOT IN (SELECT MAX(column_name) FROM table_name);
You basically exclude the highest value from your query and then select the highest of the rest.
If you can concatenate all the columns as a comma-delimited list, then you can do the following (assuming you're looking for the second highest numerical value):
WITH d1 AS (
SELECT keycolumn1, keycolumn2, col1 || ',' || col2 || ',' || ... || ',' || colx AS other_columns
FROM mytable
), d2 AS (
SELECT keycolumn1, keycolumn2, LEVEL AS pos
, TO_NUMBER(REGEXP_SUBSTR(other_columns, '[^,]+', 1, LEVEL)) AS cvalue
FROM d1
CONNECT BY REGEXP_SUBSTR(other_columns, '[^,]+', 1, LEVEL) IS NOT NULL
AND PRIOR keycolumn1 = keycolumn1
AND PRIOR keycolumn2 = keycolumn2
AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
)
SELECT keycolumn1, keycolumn2, pos, cvalue FROM (
SELECT keycolumn1, keycolumn2, pos, cvalue
, ROW_NUMBER() OVER ( PARTITION BY keycolumn1, keycolumn2 ORDER BY cvalue DESC ) AS rn
FROM d2
) WHERE rn = 2;
The above will return the second-highest value for a given set of "key" column values, along with the position of the column in which that value is found. I use concatenation and CONNECT BY to unpivot the table; the PRIOR DBMS_RANDOM.VALUE IS NOT NULL condition makes the PRIOR expression non-deterministic, which stops Oracle from reporting a CONNECT BY loop when each row connects back to itself.

DB2 Dynamic View CREATE statement from SYS tables

Uncertain if this is even possible, but here goes...
In DB2 v9.7, I'm trying to create a view for every table in a schema.
Each view should contain all of the table's columns, plus one additional timestamp column.
i.e.
TABLE1 (COL1, COL2, COL3)
V_TABLE1 (COL1, COL2, COL3, INS_DTTM)
TABLE2 (COL1, COL4)
V_TABLE2 (COL1, COL4, INS_DTTM)
Here's what I've got so far:
SELECT 'CREATE VIEW V_' || NAME || ' (' ||
(SELECT LISTAGG(NAME, ', ') FROM SYSIBM.SYSCOLUMNS WHERE TBCREATOR=CREATOR) || ')
AS SELECT ' || (SELECT LISTAGG(NAME, ', ')
FROM SYSIBM.SYSCOLUMNS WHERE TBCREATOR='SCHEMA') || ', CAST(NULL AS TIMESTAMP)
AS INS_DTTM FROM ' || CREATOR || '.' || NAME
FROM SYSIBM.SYSTABLES
WHERE CREATOR = 'SCHEMA' AND NAME LIKE 'T%' ORDER BY NAME;
It's my subselect that is causing me grief...
I get the following:
CREATE VIEW V_TABLE1 (COL1, COL2, COL3) AS SELECT COL1, COL2, COL3, CAST(NULL AS TIMESTAMP) AS INS_DTTM FROM SCHEMA.TABLE1
CREATE VIEW V_TABLE2 (COL1, COL2, COL3) AS SELECT COL1, COL2, COL3, CAST(NULL AS TIMESTAMP) AS INS_DTTM FROM SCHEMA.TABLE2
I'll apologize in advance if my syntax is off... I'm retyping across development environments and the internet, so I'm certain I've made mistakes...
I want the column list to relate to the table name of that row, but the same LISTAGG result appears for every row...
Hope this makes sense...and if anyone has any suggestions, please let me know.
Thanks
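One possible fix, sketched here and untested: correlate each subselect on both the creator and the table name of the outer SYSTABLES row, and order the LISTAGG by column number (TBNAME and COLNO are the usual SYSIBM.SYSCOLUMNS columns, but check your catalog). This version also appends INS_DTTM to the view's column list so the column counts match:
SELECT 'CREATE VIEW V_' || T.NAME || ' (' ||
       (SELECT LISTAGG(C.NAME, ', ') WITHIN GROUP (ORDER BY C.COLNO)
        FROM SYSIBM.SYSCOLUMNS C
        WHERE C.TBCREATOR = T.CREATOR AND C.TBNAME = T.NAME) || ', INS_DTTM) AS SELECT ' ||
       (SELECT LISTAGG(C.NAME, ', ') WITHIN GROUP (ORDER BY C.COLNO)
        FROM SYSIBM.SYSCOLUMNS C
        WHERE C.TBCREATOR = T.CREATOR AND C.TBNAME = T.NAME) ||
       ', CAST(NULL AS TIMESTAMP) AS INS_DTTM FROM ' || T.CREATOR || '.' || T.NAME
FROM SYSIBM.SYSTABLES T
WHERE T.CREATOR = 'SCHEMA' AND T.NAME LIKE 'T%'
ORDER BY T.NAME;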