Query in PostgreSQL with a large quantity of Squid access requests - sql

Hello people, I'm using a log daemon (https://github.com/paranormal/blooper) in Squid Proxy to put the access log into PostgreSQL, and I made a trigger function:
DECLARE
newtime varchar := EXTRACT (MONTH FROM NEW."time")::varchar;
newyear varchar := EXTRACT (YEAR FROM NEW."time")::varchar;
user_name varchar := REPLACE (NEW.user_name, '.', '_');
partname varchar := newtime || '_' || newyear;
tablename varchar := user_name || '.accesses_' || partname;
BEGIN
IF NEW.user_name IS NOT NULL THEN
EXECUTE 'CREATE SCHEMA IF NOT EXISTS ' || user_name;
EXECUTE 'CREATE TABLE IF NOT EXISTS '
|| tablename
|| '('
|| 'CHECK (user_name = ''' || NEW.user_name || ''' AND EXTRACT(MONTH FROM "time") = ' || newtime || ' AND EXTRACT (YEAR FROM "time") = ' || newyear || ')'
|| ') INHERITS (public.accesses)';
EXECUTE 'CREATE INDEX IF NOT EXISTS access_index_' || partname || '_user_name ON ' || tablename || ' (user_name)';
EXECUTE 'CREATE INDEX IF NOT EXISTS access_index_' || partname || '_time ON ' || tablename || ' ("time")';
EXECUTE 'INSERT INTO ' || tablename || ' SELECT $1.*' USING NEW;
END IF;
RETURN NULL;
END;
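An aside: building DDL by plain string concatenation is fragile and injection-prone if user_name ever contains unexpected characters. The same statements can be written with format(), which quote-safes identifiers (%I) and literals (%L). A sketch of the first two EXECUTEs rewritten that way:
EXECUTE format('CREATE SCHEMA IF NOT EXISTS %I', user_name);
EXECUTE format(
  'CREATE TABLE IF NOT EXISTS %I.%I (
     CHECK (user_name = %L
            AND EXTRACT(MONTH FROM "time") = %s
            AND EXTRACT(YEAR FROM "time") = %s)
   ) INHERITS (public.accesses)',
  user_name, 'accesses_' || partname, NEW.user_name, newtime, newyear);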
Its main purpose is to create a table partition per user_name and per month-year of the access, inheriting from a clean master table:
CREATE TABLE public.accesses
(
id integer NOT NULL DEFAULT nextval('accesses_id_seq'::regclass),
"time" timestamp with time zone NOT NULL,
time_response integer,
mac_source macaddr,
ip_source inet NOT NULL,
ip_destination inet,
user_name character varying(40),
http_status_code numeric(3,0) NOT NULL,
http_reply_size bigint NOT NULL,
http_request_method character varying(15) NOT NULL,
http_request_url character varying(4166) NOT NULL,
http_content_type character varying(100),
squid_hier_code character varying(20),
squid_request_status character varying(50),
user_id integer,
CONSTRAINT accesses_http_request_method_fkey FOREIGN KEY (http_request_method)
REFERENCES public.http_requests (method) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT accesses_http_status_code_fkey FOREIGN KEY (http_status_code)
REFERENCES public.http_statuses (code) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT accesses_user_id_fkey FOREIGN KEY (user_id)
REFERENCES public.users (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
The main problem is getting the sum of http_reply_size grouped by user_name and time. My query is:
SELECT
"time",
user_name,
sum(http_reply_size)
FROM
accesses
WHERE
extract(epoch from "time") BETWEEN 1516975122 AND 1516996722
GROUP BY
"time",
user_name
But this query is very slow on the server (3,237,976 rows currently, from only 2 days). Does PostgreSQL have a way to optimize a query like this, or do I need to use another SQL or NoSQL system?

Try to include a CHECK condition on each partition so the planner doesn't have to scan all the tables.
In my case it looks like this:
CREATE TABLE IF NOT EXISTS ' || table_name || '(
CONSTRAINT ' || pk || ' PRIMARY KEY (avl_id),
CHECK ( event_time >= ''' || begin_time || ''' AND event_time < ''' || end_time || ''' )
) INHERITS (avl_db.avl);
Also, don't use extract(epoch from "time"): it has to calculate the value for each row and can't use the index you created on "time".
Use something like this instead, to take advantage of the index:
WHERE "time" >= '2018-01-01'::timestamp with time zone
and "time" < '2018-02-01'::timestamp with time zone
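Applied to the query from the question, the sargable version would look like this (a sketch; to_timestamp() converts the two epoch bounds once, instead of converting "time" for every row):
SELECT
"time",
user_name,
sum(http_reply_size)
FROM
accesses
WHERE
"time" >= to_timestamp(1516975122)
AND "time" < to_timestamp(1516996722)
GROUP BY
"time",
user_name;
With constraint_exclusion set to partition (the default), a range CHECK like the one above also lets the planner skip partitions that cannot match.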

Related

Create Primary Constraint from table columns

I have a table named PRIMARY_CONSTRAINTS in the DB; it holds the table name, the constraint name and the column name of the primary key constraints I want to add.
It is an Oracle DB.
I have tried something like this, but it does not run:
ALTER TABLE TABLE_NAME
ADD CONSTRAINT CONSTRAINT_NAME PRIMARY KEY (COLUMN_NAME)
WHERE TABLE_NAME, CONSTRAINT_NAME, COLUMN_NAME IN
( SELECT TABLE_NAME,
CONSTRAINT_NAME,
COLUMN_NAME
FROM PRIMARY_CONSTRAINTS );
How can I reference it so I can alter the tables?
Thanks
You can use the following dynamic PL/SQL block to achieve this:
BEGIN
FOR Q IN (
SELECT
'ALTER TABLE '
|| TABLE_NAME
|| ' ADD CONSTRAINT '
|| CONSTRAINT_NAME
|| ' PRIMARY KEY ( '
|| COLUMN_NAME
|| ')' QRY
FROM
PRIMARY_CONSTRAINTS
) LOOP
EXECUTE IMMEDIATE Q.QRY;
END LOOP;
END;
/
I have created a small demo for you --> db<>fiddle demo
UPDATED ACCORDING TO COMMENT
If there are two entries for a single table, then you can use the following block:
BEGIN
FOR Q IN (
SELECT
'ALTER TABLE '
|| TABLE_NAME
|| ' ADD CONSTRAINT '
|| MAX(CONSTRAINT_NAME)
|| ' PRIMARY KEY ( '
|| LISTAGG(COLUMN_NAME,',') WITHIN GROUP (ORDER BY NULL)
|| ')' QRY
FROM
PRIMARY_CONSTRAINTS
GROUP BY TABLE_NAME
) LOOP
EXECUTE IMMEDIATE Q.QRY;
END LOOP;
END;
/
db<>fiddle demo updated
With regard to your comment about multiple columns in the constraints, we could set up the data in the PRIMARY_CONSTRAINTS table in the same way it would be used in the ALTER TABLE statement.
For example: suppose we have to use the following statement to create a constraint: ALTER TABLE table_name ADD CONSTRAINT Constraint_Name PRIMARY KEY (columnA, columnB);
Then the entry for this particular constraint in your PRIMARY_CONSTRAINTS table will have the value columnA, columnB for COLUMN_NAME, as illustrated below.
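For illustration, a hypothetical row for such a two-column key (the table and column names here are made up):
INSERT INTO PRIMARY_CONSTRAINTS (TABLE_NAME, CONSTRAINT_NAME, COLUMN_NAME)
VALUES ('ORDERS', 'ORDERS_PK', 'ORDER_ID, ORDER_LINE');
The first block would then generate ALTER TABLE ORDERS ADD CONSTRAINT ORDERS_PK PRIMARY KEY ( ORDER_ID, ORDER_LINE ).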
Hope this helps.
Regards
Akash

How to create dynamic partition table in postgres with bigint column?

I have a master table such as
CREATE TABLE public.user_event_firebase
(
user_id character varying(32) COLLATE pg_catalog."default" NOT NULL,
event_name character varying(255) COLLATE pg_catalog."default" NOT NULL,
"timestamp" bigint NOT NULL,
platform character varying(255) COLLATE pg_catalog."default" NOT NULL,
created_at timestamp without time zone DEFAULT now()
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
GOAL
I want to partition this table by year and month on the "timestamp" column, into tables such as user_event_firebase_2018_04, user_event_firebase_2018_05, user_event_firebase_2018_06. Rows will be automatically redirected into the matching partition based on a timestamp condition.
I created a partition-creation function like this:
CREATE OR REPLACE FUNCTION partition_uef_table( bigint, bigint )
returns void AS $$
DECLARE
create_query text;
index_query text;
BEGIN
FOR create_query, index_query IN SELECT
'create table user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| ' ( check( timestamp >= bigint '''
|| TO_CHAR( d, 'YYYY-MM-DD' )
|| ''' and timestamp < bigint '''
|| TO_CHAR( d + INTERVAL '1 month', 'YYYY-MM-DD' )
|| ''' ) ) inherits ( user_event_firebase );',
'create index user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| '_time on user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| ' ( timestamp );'
FROM generate_series( $1, $2, '1 month' ) AS d
LOOP
EXECUTE create_query;
EXECUTE index_query;
END LOOP;
END;
$$
language plpgsql;
CREATE OR REPLACE FUNCTION test_partition_function_uef()
RETURNS TRIGGER AS $$
BEGIN
EXECUTE 'insert into user_event_firebase_'
|| to_char( NEW.timestamp, 'YYYY_MM' )
|| ' values ( $1, $2, $3, $4 )' USING NEW.user_id, NEW.event_name, NEW.timestamp, NEW.platform;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
with this trigger:
CREATE TRIGGER test_partition_trigger_uef
BEFORE INSERT
ON user_event_firebase
FOR each ROW
EXECUTE PROCEDURE test_partition_function_uef() ;
I tried it with an example:
SELECT partition_uef_table(1518164237,1520583437) ;
PROBLEM:
ERROR: invalid input syntax for integer: "1 month"
LINE 14: FROM generate_series( $1, $2, '1 month' ) AS d
^
QUERY: SELECT
'create table user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| ' ( check( timestamp >= bigint '''
|| TO_CHAR( d, 'YYYY-MM-DD' )
|| ''' and timestamp < bigint '''
|| TO_CHAR( d + INTERVAL '1 month', 'YYYY-MM-DD' )
|| ''' ) ) inherits ( user_event_firebase );',
'create index user_event_firebase_'
QUESTION:
How can I create a range for the generate_series function with a step of '1 month'? Setting the step as an int or bigint is painful, because the number of days differs from month to month (e.g. February has 28 days, April has 30).
Thank you.
The answer to your second question would be opinion-based (so I skip it), but the answer to the first is this:
with args(a1,a2) as (values(1518164237,1520583437))
select d,to_char(d,'YYYY_MM') from args, generate_series(to_timestamp(a1),to_timestamp(a2),'1 month'::interval) d;
gives the result:
d | to_char
------------------------+---------
2018-02-09 08:17:17+00 | 2018_02
2018-03-09 08:17:17+00 | 2018_03
(2 rows)
Use the timestamp variant of the function, which accepts an interval step:
generate_series(start, stop, step interval) returns timestamp or timestamp with time zone
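Putting this together, a sketch of a fixed creation function (assuming the bigint arguments are Unix epoch seconds, as in the example call; the CHECK bounds are converted back to epoch seconds so they compare against the bigint "timestamp" column):
CREATE OR REPLACE FUNCTION partition_uef_table( bigint, bigint )
RETURNS void AS $$
DECLARE
  d timestamp with time zone;
BEGIN
  FOR d IN SELECT generate_series( date_trunc( 'month', to_timestamp( $1 ) ),
                                   to_timestamp( $2 ), '1 month'::interval )
  LOOP
    -- %s interpolates the month label and the epoch bounds into the DDL
    EXECUTE format(
      'CREATE TABLE IF NOT EXISTS user_event_firebase_%s
       ( CHECK ( "timestamp" >= %s AND "timestamp" < %s ) )
       INHERITS ( user_event_firebase )',
      to_char( d, 'YYYY_MM' ),
      extract( epoch FROM d )::bigint,
      extract( epoch FROM d + interval '1 month' )::bigint );
    EXECUTE format(
      'CREATE INDEX IF NOT EXISTS user_event_firebase_%s_time
       ON user_event_firebase_%s ( "timestamp" )',
      to_char( d, 'YYYY_MM' ), to_char( d, 'YYYY_MM' ) );
  END LOOP;
END;
$$ LANGUAGE plpgsql;
Note that the trigger function needs the same conversion: to_char( NEW."timestamp", 'YYYY_MM' ) does not work on a bigint, so use to_char( to_timestamp( NEW."timestamp" ), 'YYYY_MM' ) there.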

How to get access to procedure parameter DB2

I have the following procedure for adding partitions to a table:
CREATE PROCEDURE GBC_CR20877_CDWH.ADD_NEW_PARTITIONS (
IN TBL_NAME VARCHAR(32)
,IN DATE_FROM DATE
,IN DATE_TO DATE
,IN TBL_SPC_NM VARCHAR(32)
,IN SRC_STM VARCHAR(8)
,IN NCI_SCHEME VARCHAR(32)
) LANGUAGE SQL
BEGIN
DECLARE counter INTEGER DEFAULT 0;
DECLARE stmt2 VARCHAR(1024);
DECLARE ALTER_DDL_SQL VARCHAR(1024);
DECLARE S1 STATEMENT;
DECLARE ALTER_PARTITIONS CURSOR
FOR
SELECT 'ALTER TABLE ' || TBL_NAME || ' ADD PARTITION PART' || MSR_PRD_ID || '_' || SRC_STM || ' STARTING FROM (' || MSR_PRD_ID || ', ' || SRC_STM || ') ENDING AT (' || MSR_PRD_ID || ', ' || SRC_STM || ') IN ' || TBL_SPC_NM || ' INDEX IN ' || TBL_SPC_NM || ' LONG IN ' || TBL_SPC_NM || ';'
FROM NCI_SCHEME
WHERE EFF_DT BETWEEN DATE_FROM
AND DATE_TO
ORDER BY MSR_PRD_ID;
END
I am trying to build a cursor containing many ALTER TABLE ... ADD PARTITION DDL commands. Later I would like to execute these commands in a loop over the cursor.
I have a problem with the parameter NCI_SCHEME. I want to use it as the source table in the select query, but used like that I get an error:
Run: GBC_CR20877_CDWH.ADD_NEW_PARTITIONS(VARCHAR(32), DATE, DATE,
VARCHAR(32), VARCHAR(8), VARCHAR(32))
{call GBC_CR20877_CDWH.ADD_NEW_PARTITIONS(?,?,?,?,?,?)}
An error occurred during implicit system action type "3". Information
returned for the error includes SQLCODE "-204", SQLSTATE "42704" and
message tokens "MY_USERNAME.NCI_SCHEME".. SQLCODE=-727,
SQLSTATE=56098, DRIVER=3.66.46
Run of routine failed.
Roll back completed successfully.
However, if I hardcode the table name instead of using the parameter, it works fine! I don't understand why the other parameters in the select query work, but NCI_SCHEME doesn't. Any help, please?
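For reference: in a DB2 SQL procedure, a parameter can only supply a value, not an object name, so the FROM clause of a static cursor is resolved at compile time and NCI_SCHEME is taken literally as a table name (hence the -204 "undefined name" with token MY_USERNAME.NCI_SCHEME). A sketch of the usual workaround, building the statement text before PREPARE and reusing the stmt2 and S1 the procedure already declares:
DECLARE ALTER_PARTITIONS CURSOR FOR S1;
SET stmt2 = 'SELECT ''ALTER TABLE ' || TBL_NAME || ' ADD PARTITION PART'' || MSR_PRD_ID || ''_' || SRC_STM
  || ' STARTING FROM ('' || MSR_PRD_ID || '', ' || SRC_STM || ') ENDING AT ('' || MSR_PRD_ID || '', ' || SRC_STM
  || ') IN ' || TBL_SPC_NM || ' INDEX IN ' || TBL_SPC_NM || ' LONG IN ' || TBL_SPC_NM
  || ''' FROM ' || NCI_SCHEME || ' WHERE EFF_DT BETWEEN ? AND ? ORDER BY MSR_PRD_ID';
PREPARE S1 FROM stmt2;
OPEN ALTER_PARTITIONS USING DATE_FROM, DATE_TO;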

PostgreSQL 9.5 Insert/Update while partitioning table

I need to achieve updating rows (via ON CONFLICT) in a partitioned table.
So far, my attempts:
Table creation:
CREATE TABLE public.my_tbl
(
goid character varying(255) NOT NULL,
timestamps timestamp without time zone[],
somenumber numeric[],
CONSTRAINT my_tbl_pkey PRIMARY KEY (goid)
)
WITH (
OIDS=FALSE
);
ALTER TABLE public.my_tbl
OWNER TO postgres;
Table Sequence:
CREATE SEQUENCE public.fixations_data_pkey_seq
INCREMENT 1
MINVALUE 1
MAXVALUE 9223372036854775807
START 1
CACHE 1;
ALTER TABLE public.fixations_data_pkey_seq
OWNER TO postgres;
Table partition trigger, which creates a new table named "table_YYYY_MM_DD", where "YYYY_MM_DD" is the current date (the query execution date):
CREATE OR REPLACE FUNCTION public.my_tbl_insert_trigger()
RETURNS trigger AS
$BODY$
DECLARE
table_master varchar(255) := 'my_tbl';
table_part varchar(255) := '';
BEGIN
-- Partition table name --------------------------------------------------
table_part := table_master
|| '_' || DATE_PART( 'year', NOW() )::TEXT
|| '_' || DATE_PART( 'month', NOW() )::TEXT
|| '_' || DATE_PART( 'day', NOW() )::TEXT;
-- Check if partition exists --------------------------------
PERFORM
1
FROM
pg_class
WHERE
relname = table_part
LIMIT
1;
-- If not exist, create new one --------------------------------------------
IF NOT FOUND
THEN
-- Create parition, which inherits master table --------------------------
EXECUTE '
CREATE TABLE ' || table_part || '
(
goid character varying(255) NOT NULL DEFAULT nextval(''' || table_master || '_pkey_seq''::regclass),
CONSTRAINT ' || table_part || '_pkey PRIMARY KEY (goid)
)
INHERITS ( ' || table_master || ' )
WITH ( OIDS=FALSE )';
-- Create indices for current table-------------------------------
EXECUTE '
CREATE INDEX ' || table_part || '_adid_date_index
ON ' || table_part || '
USING btree
(goid)';
END IF;
-- Insert row into table (without ON CONFLICT)--------------------------------------------
EXECUTE '
INSERT INTO ' || table_part || '
SELECT ( (' || QUOTE_LITERAL(NEW) || ')::' || TG_RELNAME || ' ).*';
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION public.my_tbl_insert_trigger()
OWNER TO postgres;
CREATE TRIGGER my_tbl_insert_trigger
BEFORE INSERT
ON my_tbl
FOR EACH ROW
EXECUTE PROCEDURE my_tbl_insert_trigger();
After this I can insert new rows into the table:
INSERT INTO my_tbl (goid, timestamps, somenumber)
VALUES ('qwe123SSsssa3', '{"2016-11-16 00:00:00", "2016-11-16 01:00:00"}', '{3, 12333}')
But when I try to do an UPSERT:
INSERT INTO my_tbl (goid, timestamps, somenumber)
VALUES ('qwe123SSsssa3', '{"2016-11-16 02:00:00"}', '{999}')
ON CONFLICT (goid)
DO UPDATE
SET timestamps=array_append(my_tbl.timestamps::timestamp[], '2016-11-16 02:00:00'),
somenumber=array_append(my_tbl.somenumber,'999');
I'm getting a DUPLICATE PKEY error.
I guess that I have to add ON CONFLICT to the third EXECUTE in the trigger function. But how should I do this?
Well, I've changed my third EXECUTE to:
-- Insert row into table (with ON CONFLICT)--------------------------------------------
EXECUTE '
INSERT INTO ' || table_part || '
SELECT ( (' || QUOTE_LITERAL(NEW) || ')::' || TG_RELNAME || ' ).*
ON CONFLICT (goid)
DO UPDATE
SET timestamps=' || table_part || '.timestamps::timestamp[] || ' || QUOTE_LITERAL(NEW.timestamps) || ',
somenumber=' || table_part || '.somenumber::numeric[] || ' || QUOTE_LITERAL(NEW.somenumber) || '
';
RETURN NULL;
Now, when I execute query:
INSERT INTO my_tbl (goid, timestamps, somenumber)
VALUES ('potato_1', ARRAY['2016-11-16 12:00:00', '2016-11-16 15:00:00']::timestamp[], ARRAY[223, 211]::numeric[]);
there are no errors, and it extends the array-type columns as I expected.
I admit that this is a dirty solution, but it seems to work.
If someone has a better solution, I'll be glad to look at it.
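For comparison, a sketch of a slightly tidier third EXECUTE, using format() placeholders and the EXCLUDED pseudo-row instead of quoting by hand (same table layout assumed; %I quote-safes the partition name):
EXECUTE format(
  'INSERT INTO %I SELECT ($1).*
   ON CONFLICT (goid) DO UPDATE
   SET timestamps = %I.timestamps || EXCLUDED.timestamps,
       somenumber = %I.somenumber || EXCLUDED.somenumber',
  table_part, table_part, table_part)
USING NEW;
EXCLUDED carries the row proposed for insertion, so the arrays are appended without re-quoting NEW.timestamps and NEW.somenumber into the SQL string.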

Postgres dynamic column headers (from another table)

Just a silly example:
Table A:
eggs
bread
cheese
Table B (when they are eaten):
eggs | date
bread | date
eggs | date
cheese | date
bread | date
For statistics purposes, I need counts per date per food type in a layout like this:
Table Statistics:
      | eggs | bread | cheese
date1 |    2 |     1 |      0
date2 |    6 |     4 |      2
date3 |    2 |     0 |      0
I need the column headers to be dynamic in the report (if a new food type is added, it should automatically appear).
Any idea how to do this in Postgres?
Thanks.
Based on the answer to Postgres dynamic column headers (from another table) (the work of Eric Vallabh Minikel), I improved the function to be more flexible and convenient. I think it might be useful for others too, especially as it relies only on PL/pgSQL and does not need the installation of extensions, as other derivatives of Eric's work (e.g. plpython) do. Tested with 9.3.5, but it should work at least down to 9.2.
Improvements:
deal with pivoted column names containing spaces
deal with multiple row-header columns
deal with an aggregate function in the pivot cell as well as a non-aggregated pivot cell (the last parameter might be 'sum(cellval)' as well as plain 'cellval', in case the underlying table/view already does the aggregation)
auto-detect the data type of the pivot cell (no need to pass it to the function any more)
Usage:
SELECT get_crosstab_statement('table_to_pivot', ARRAY['rowname' [, <other_row_header_columns_as_well>], 'colname', 'max(cellval)');
Code:
CREATE OR REPLACE FUNCTION get_crosstab_statement(tablename character varying, row_header_columns character varying[], pivot_headers_column character varying, pivot_values character varying)
RETURNS character varying AS
$BODY$
--returns the sql statement to use for pivoting the table
--based on: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
--based on: https://stackoverflow.com/questions/4104508/postgres-dynamic-column-headers-from-another-table
--based on: http://www.postgresonline.com/journal/categories/24-tablefunc
DECLARE
arrayname CONSTANT character varying := 'r';
row_headers_simple character varying;
row_headers_quoted character varying;
row_headers_castdown character varying;
row_headers_castup character varying;
row_header_count smallint;
row_header record;
pivot_values_columnname character varying;
pivot_values_datatype character varying;
pivot_headers_definition character varying;
pivot_headers_simple character varying;
sql_row_headers character varying;
sql_pivot_headers character varying;
sql_crosstab_result character varying;
BEGIN
-- 1. create row header definitions
row_headers_simple := array_to_string(row_header_columns, ', ');
row_headers_quoted := '''' || array_to_string(row_header_columns, ''', ''') || '''';
row_headers_castdown := array_to_string(row_header_columns, '::text, ') || '::text';
row_header_count := 0;
sql_row_headers := 'SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = ''' || tablename || ''' AND column_name IN (' || row_headers_quoted || ')';
FOR row_header IN EXECUTE sql_row_headers LOOP
row_header_count := row_header_count + 1;
row_headers_castup := COALESCE(row_headers_castup || ', ', '') || arrayname || '[' || row_header_count || ']::' || row_header.data_type || ' AS ' || row_header.column_name;
END LOOP;
-- 2. retrieve basic column name in case an aggregate function is used
SELECT coalesce(substring(pivot_values FROM '.*\((.*)\)'), pivot_values)
INTO pivot_values_columnname;
-- 3. retrieve pivot values datatype
SELECT data_type
FROM information_schema.columns
WHERE table_name = tablename AND column_name = pivot_values_columnname
INTO pivot_values_datatype;
-- 4. retrieve list of pivot column names.
sql_pivot_headers := 'SELECT string_agg(DISTINCT quote_ident(' || pivot_headers_column || '), '', '' ORDER BY quote_ident(' || pivot_headers_column || ')) as names, string_agg(DISTINCT quote_ident(' || pivot_headers_column || ') || '' ' || pivot_values_datatype || ''', '', '' ORDER BY quote_ident(' || pivot_headers_column || ') || '' ' || pivot_values_datatype || ''') as definitions FROM ' || tablename || ';';
EXECUTE sql_pivot_headers INTO pivot_headers_simple, pivot_headers_definition;
-- 5. set up the crosstab query
sql_crosstab_result := 'SELECT ' || replace (row_headers_castup || ', ' || pivot_headers_simple, ', ', ',
') || '
FROM crosstab (
''SELECT ARRAY[' || row_headers_castdown || '] AS ' || arrayname || ', ' || pivot_headers_column || ', ' || pivot_values || '
FROM ' || tablename || '
GROUP BY ' || row_headers_simple || ', ' || pivot_headers_column || (CASE pivot_values_columnname=pivot_values WHEN true THEN ', ' || pivot_values ELSE '' END) || '
ORDER BY ' || row_headers_simple || '''
,
''SELECT DISTINCT ' || pivot_headers_column || '
FROM ' || tablename || '
ORDER BY ' || pivot_headers_column || '''
) AS newtable (
' || arrayname || ' varchar[]' || ',
' || replace(pivot_headers_definition, ', ', ',
') || '
);';
RETURN sql_crosstab_result;
END
$BODY$
LANGUAGE plpgsql STABLE
COST 100;
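Since the function returns the crosstab statement rather than the result (the output column list isn't known in advance), one way to run it is to wrap it in a view from a DO block. A sketch, assuming the tablefunc extension is installed and a table named table_to_pivot shaped as in the usage line above:
CREATE EXTENSION IF NOT EXISTS tablefunc;
DO $$
BEGIN
  EXECUTE 'CREATE OR REPLACE TEMP VIEW pivoted AS '
    || get_crosstab_statement('table_to_pivot', ARRAY['rowname'], 'colname', 'max(cellval)');
END $$;
SELECT * FROM pivoted;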
I came across the same problem, and found an alternative solution. I'd like to ask for comments here.
The idea is to create the "output type" string for crosstab dynamically. The end result can't be returned by a plpgsql function, because that function would either need to have a static return type (which we don't have) or return setof record, thus having no advantage over the original crosstab function. Therefore my function saves the output cross-table in a view. Likewise, the input table of "pivot" data not yet in cross-table format is taken from an existing view or table.
The usage would be like this, using your example (I added the "fat" field to illustrate the sorting feature):
CREATE TABLE food_fat (name character varying(20) NOT NULL, fat integer);
CREATE TABLE eaten (food character varying(20) NOT NULL, day date NOT NULL);
-- This view will be formatted as cross-table.
-- ORDER BY is important, otherwise crosstab won't merge rows
CREATE TEMPORARY VIEW mymeals AS
SELECT day,food,COUNT(*) AS meals FROM eaten
GROUP BY day, food ORDER BY day;
SELECT auto_crosstab_ordered('mymeals_cross',
'mymeals', 'day', 'food', 'meals', -- what table to convert to cross-table
'food_fat', 'name', 'fat'); -- where to take the columns from
The last statement creates a view mymeals_cross that looks like this:
SELECT * FROM mymeals_cross;
day | bread | cheese | eggs
------------+-------+--------+------
2012-06-01 | 3 | 3 | 2
2012-06-02 | 2 | 1 | 3
(2 rows)
Here comes my implementation:
-- FUNCTION get_col_type(tab, col)
-- returns the data type of column <col> in table <tab> as string
DROP FUNCTION get_col_type(TEXT, TEXT);
CREATE FUNCTION get_col_type(tab TEXT, col TEXT, OUT ret TEXT)
AS $BODY$ BEGIN
EXECUTE $f$
SELECT atttypid::regtype::text
FROM pg_catalog.pg_attribute
WHERE attrelid='$f$||quote_ident(tab)||$f$'::regclass
AND attname='$f$||quote_ident(col)||$f$'
$f$ INTO ret;
END;
$BODY$ LANGUAGE plpgsql;
-- FUNCTION get_crosstab_type(tab, row, val, cattab, catcol, catord)
--
-- This function generates the output type expression for the crosstab(text, text)
-- function from the PostgreSQL tablefunc module when the "categories"
-- (cross table column labels) can be looked up in some view or table.
--
-- See auto_crosstab below for parameters
DROP FUNCTION get_crosstab_type(TEXT, TEXT, TEXT, TEXT, TEXT, TEXT);
CREATE FUNCTION get_crosstab_type(tab TEXT, rw TEXT, val TEXT, cattab TEXT,
catcol TEXT, catord TEXT, OUT ret TEXT)
AS $BODY$ BEGIN
EXECUTE $f$
SELECT '"$f$||quote_ident(rw)||$f$" $f$
||get_col_type(quote_ident(tab), quote_ident(rw))||$f$'
|| string_agg(',"'||_values._v||'" $f$
||get_col_type(quote_ident(tab), quote_ident(val))||$f$')
FROM (SELECT DISTINCT ON(_t.$f$||quote_ident(catord)||$f$) _t.$f$||quote_ident(catcol)||$f$ AS _v
FROM $f$||quote_ident(cattab)||$f$ _t
ORDER BY _t.$f$||quote_ident(catord)||$f$) _values
$f$ INTO ret;
END; $BODY$ LANGUAGE plpgsql;
-- FUNCTION auto_crosstab_ordered(view_name, tab, row, cat, val, cattab, catcol, catord)
--
-- This function creates a VIEW containing a cross-table of input table.
-- It fetches the column names of the cross table ("categories") from
-- another table.
--
-- view_name - name of VIEW to be created
-- tab - input table. This table / view must have 3 columns:
-- "row", "category", "value".
-- row - column name of the "row" column
-- cat - column name of the "category" column
-- val - column name of the "value" column
-- cattab - another table holding the possible categories
-- catcol - column name in cattab to use as column label in the cross table
-- catord - column name in cattab to sort columns in the cross table
DROP FUNCTION auto_crosstab_ordered(TEXT, TEXT, TEXT, TEXT, TEXT, TEXT, TEXT, TEXT);
CREATE FUNCTION auto_crosstab_ordered(view_name TEXT, tab TEXT, rw TEXT,
cat TEXT, val TEXT, cattab TEXT, catcol TEXT, catord TEXT) RETURNS void
AS $BODY$ BEGIN
EXECUTE $f$
CREATE VIEW $f$||quote_ident(view_name)||$f$ AS
SELECT * FROM crosstab(
'SELECT $f$||quote_ident(rw)||$f$,
$f$||quote_ident(cat)||$f$,
$f$||quote_ident(val)||$f$
FROM $f$||quote_ident(tab)||$f$',
'SELECT DISTINCT ON($f$||quote_ident(catord)||$f$) $f$||quote_ident(catcol)||$f$
FROM $f$||quote_ident(cattab)||$f$ m
ORDER BY $f$||quote_ident(catord)||$f$'
) AS ($f$||get_crosstab_type(tab, rw, val, cattab, catcol, catord)||$f$)$f$;
END; $BODY$ LANGUAGE plpgsql;
-- FUNCTION auto_crosstab(view_name, tab, row, cat, val)
--
-- This function creates a VIEW containing a cross-table of input table.
-- It fetches the column names of the cross table ("categories") from
-- DISTINCT values of the 2nd column of the input table.
--
-- view_name - name of VIEW to be created
-- tab - input table. This table / view must have 3 columns:
-- "row", "category", "value".
-- row - column name of the "row" column
-- cat - column name of the "category" column
-- val - column name of the "value" column
DROP FUNCTION auto_crosstab(TEXT, TEXT, TEXT, TEXT, TEXT);
CREATE FUNCTION auto_crosstab(view_name TEXT, tab TEXT, rw TEXT, cat TEXT, val TEXT) RETURNS void
AS $$ BEGIN
PERFORM auto_crosstab_ordered(view_name, tab, rw, cat, val, tab, cat, cat);
END; $$ LANGUAGE plpgsql;
The code works fine; you will receive the output as a dynamic query.
CREATE OR REPLACE FUNCTION public.pivotcode(tablename character varying, rowc character varying, colc character varying, cellc character varying, celldatatype character varying)
RETURNS character varying AS
$BODY$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct ''_''||'||colc||'||'' '||celldatatype||''','','' order by ''_''||'||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
--create temp table temp as
dynsql2 = 'select * from crosstab ( ''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'', ''select distinct '||colc||' from '||tablename||' order by 1'' ) as newtable ( '||rowc||' varchar,'||columnlist||' );';
return dynsql2;
end
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
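A hypothetical call (the table and column names here are invented); the function returns the crosstab statement as text, which you then execute yourself:
SELECT pivotcode('eaten_counts', 'day', 'food', 'sum(meals)', 'int');
would return something like:
select * from crosstab ( 'select day,food,sum(meals) from eaten_counts group by 1,2 order by 1,2', 'select distinct food from eaten_counts order by 1' ) as newtable ( day varchar,_bread int,_cheese int,_eggs int );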
This is an example of pivoting data - in PostgreSQL 9 there is a crosstab function (in the tablefunc module) to do this: http://www.postgresql.org/docs/current/static/tablefunc.html
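For a static example with the food tables from this question (a sketch; assumes the tablefunc extension is installed and the eaten table defined in the answer above), note how the output column list must be hard-coded, which is exactly what the dynamic approaches above work around:
SELECT * FROM crosstab(
  'SELECT day, food, count(*)::int FROM eaten GROUP BY 1, 2 ORDER BY 1, 2',
  'SELECT DISTINCT food FROM eaten ORDER BY 1'
) AS ct (day date, bread int, cheese int, eggs int);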
I solved it following this article:
http://okbob.blogspot.com/2008/08/using-cursors-for-generating-cross.html
In short, I used a function that, for every result in a query, dynamically creates the values for the next query and returns the needed result as a refcursor. This solved the SQL side of it; now I need to figure out the Java part, but that isn't closely connected to the question :)
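A minimal sketch of that refcursor pattern (the function and cursor names here are hypothetical; any of the generated crosstab statements above could be passed as qry):
CREATE OR REPLACE FUNCTION pivot_cursor(c refcursor, qry text)
RETURNS refcursor AS $$
BEGIN
  OPEN c FOR EXECUTE qry;  -- open the named cursor over the dynamically built query
  RETURN c;
END;
$$ LANGUAGE plpgsql;
-- usage: refcursors must be fetched inside a transaction
BEGIN;
SELECT pivot_cursor('stats', pivotcode('eaten_counts', 'day', 'food', 'sum(meals)', 'int'));
FETCH ALL IN stats;
COMMIT;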