How to address a computed column value in an SQLite query? - sql

Given the following table:
CREATE TABLE `foo`(
`year` INT NOT NULL,
`month` INT NOT NULL,
`day` INT NOT NULL,
`hour` INT NOT NULL,
`minute` INT NOT NULL,
`value` INT NOT NULL,
PRIMARY KEY(`year`, `month`, `day`, `hour`, `minute`)
);
I want to write a query that adds two columns to every record: date, a standard single-column representation of the date, and weekday, the day-of-week number.
I have tried
SELECT
`year`, `month`, `day`, `hour`, `minute`, `value`,
substr('000' || year, -4) || '-' || substr('0' || month, -2) || '-' || substr('0' || day, -2) AS `date`,
strftime('%w', `date`) AS `weekday`
FROM
`foo`;
But this says
Error: no such column: date
This is only an illustration. In reality the logic for calculating the additional columns is much more complex, and I need to reuse those calculated columns to calculate other columns. Copying the whole expression into every place where I need its value looks quite scary.
Is there a way to refer to a computed column's value when computing another one?

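Use a subquery. A column alias cannot be referenced by other expressions in the same SELECT list, but it is visible to an outer query:
select t.*, strftime('%w', `date`) AS `weekday`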
from (SELECT `year`, `month`, `day`, `hour`, `minute`, `value`,
substr('000' || year, -4) || '-' || substr('0' || month, -2) || '-' || substr('0' || day, -2) AS `date`
from foo
) t
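To try the subquery form end to end, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The trimmed-down table and the sample rows are made up for illustration; only the shape of the query comes from the answer above.

```python
import sqlite3


def dates_with_weekday(rows):
    """Return (date, weekday) pairs, computing weekday from the aliased date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE foo (year INT NOT NULL, month INT NOT NULL, "
        "day INT NOT NULL, value INT NOT NULL)"
    )
    conn.executemany("INSERT INTO foo VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT t."date", strftime('%w', t."date") AS weekday
        FROM (
            -- the inner query defines the alias ...
            SELECT substr('000' || year, -4) || '-' ||
                   substr('0' || month, -2) || '-' ||
                   substr('0' || day, -2) AS "date"
            FROM foo
        ) AS t              -- ... and the outer query is free to reuse it
        ORDER BY t."date"
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # hypothetical sample rows: (year, month, day, value)
    for row in dates_with_weekday([(2018, 2, 9, 1), (2018, 3, 1, 2)]):
        print(row)
```

The same nesting can be repeated when a third column depends on weekday, at the cost of one more level of subquery per dependency.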

Related

Separate one column to some column in bigquery

I have a table with one column called cols. I want to split cols into new columns based on type (SKU and MARK) using BigQuery.
cols
"dsc":[{"amount":30000,"c_amount":0,"d_amount":0,"d_id":"","scope":"CART","title":"Promo","type":"SKU"},{"amount":7000,"c_amount":0,"d_amount":7000,"d_id":"x","scope":"CART_D","title":"","type":"MARK"}]
The result I want looks like this:
sku_amount | sku_c_amount | sku_d_amount | sku_d_id | sku_scope | sku_title | mark_amount | mark_c_amount | mark_d_amount | mark_d_id | mark_scope
30000      | 0            | 0            |          | CART      | Promo     | 7000        | 0             | 0             | x         | CART_D
Does anyone know how to write this query? Thank you.
Consider the approach below:
create temp table temp_table as
select id1,
max(if(key = 'type', value, null)) over (partition by id1, id2) || '_' || key as key,
value, offset
from (
select md5(cols) as id1, md5(json) id2, arr[offset(0)] as key, arr[offset(1)] as value, offset
from your_table, unnest(json_extract_array('{' || cols || '}', '$.dsc')) json with offset,
unnest(split(translate(json, '{}"', ''))) kv,
unnest([struct(split(kv, ':') as arr)])
);
execute immediate (select '''
select * except(id1) from (select * except(offset) from temp_table)
pivot (any_value(value) for key in ("''' || string_agg(key, '","' order by offset, key) || '''"))
'''
from (select distinct key, offset from temp_table where not ends_with(key, '_type'))
);
When applied to the sample data in your question, the output is

How to create dynamic partition table in postgres with bigint column?

I have a master table defined as:
CREATE TABLE public.user_event_firebase
(
user_id character varying(32) COLLATE pg_catalog."default" NOT NULL,
event_name character varying(255) COLLATE pg_catalog."default" NOT NULL,
"timestamp" bigint NOT NULL,
platform character varying(255) COLLATE pg_catalog."default" NOT NULL,
created_at timestamp without time zone DEFAULT now()
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
GOAL
I want to partition this table by year and month using the "timestamp" column, producing tables such as user_event_firebase_2018_04, user_event_firebase_2018_05 and user_event_firebase_2018_06. Inserted rows should be redirected automatically to the matching partition based on their timestamp.
I created a function to create the partitions:
CREATE OR REPLACE FUNCTION partition_uef_table( bigint, bigint )
returns void AS $$
DECLARE
create_query text;
index_query text;
BEGIN
FOR create_query, index_query IN SELECT
'create table user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| ' ( check( timestamp >= bigint '''
|| TO_CHAR( d, 'YYYY-MM-DD' )
|| ''' and timestamp < bigint '''
|| TO_CHAR( d + INTERVAL '1 month', 'YYYY-MM-DD' )
|| ''' ) ) inherits ( user_event_firebase );',
'create index user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| '_time on user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| ' ( timestamp );'
FROM generate_series( $1, $2, '1 month' ) AS d
LOOP
EXECUTE create_query;
EXECUTE index_query;
END LOOP;
END;
$$
language plpgsql;
CREATE OR REPLACE FUNCTION test_partition_function_uef()
RETURNS TRIGGER AS $$
BEGIN
EXECUTE 'insert into user_event_firebase_'
|| to_char( NEW.timestamp, 'YYYY_MM' )
|| ' values ( $1, $2, $3, $4 )' USING NEW.user_id, NEW.event_name, NEW.timestamp, NEW.platform;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
with the trigger:
CREATE TRIGGER test_partition_trigger_uef
BEFORE INSERT
ON user_event_firebase
FOR each ROW
EXECUTE PROCEDURE test_partition_function_uef() ;
I tried it with this example:
SELECT partition_uef_table(1518164237,1520583437) ;
PROBLEM :
ERROR: invalid input syntax for integer: "1 month"
LINE 14: FROM generate_series( $1, $2, '1 month' ) AS d
^
QUERY: SELECT
'create table user_event_firebase_'
|| TO_CHAR( d, 'YYYY_MM' )
|| ' ( check( timestamp >= bigint '''
|| TO_CHAR( d, 'YYYY-MM-DD' )
|| ''' and timestamp < bigint '''
|| TO_CHAR( d + INTERVAL '1 month', 'YYYY-MM-DD' )
|| ''' ) ) inherits ( user_event_firebase );',
'create index user_event_firebase_'
QUESTION:
How can I make generate_series step by '1 month'? Using an int or bigint step does not work well, because months differ in length (February has 28 days, March has 31).
Thank you.
An answer to your second question would be opinion-based, so I will skip it. For the first one:
with args(a1,a2) as (values(1518164237,1520583437))
select d,to_char(d,'YYYY_MM') from args, generate_series(to_timestamp(a1),to_timestamp(a2),'1 month'::interval) d;
gives the result:
d | to_char
------------------------+---------
2018-02-09 08:17:17+00 | 2018_02
2018-03-09 08:17:17+00 | 2018_03
(2 rows)
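Use the interval-step form, which takes and returns timestamp or timestamp with time zone values (hence the to_timestamp() conversion of the bigint arguments):
generate_series(start, stop, step interval)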
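Because "timestamp" is a bigint, the CHECK constraint of each partition needs numeric bounds rather than date strings. The following Python sketch computes, for each calendar month touched by an epoch range, the partition name and its half-open epoch bounds. It assumes the bigint values are epoch seconds in UTC; the function name month_partitions is hypothetical and not part of the question's code.

```python
from datetime import datetime, timezone


def month_partitions(start_epoch, end_epoch, prefix="user_event_firebase_"):
    """Return (table_name, lower_epoch, upper_epoch) for every calendar month
    touched by [start_epoch, end_epoch]. Assumes epoch seconds, UTC."""
    start = datetime.fromtimestamp(start_epoch, tz=timezone.utc)
    end = datetime.fromtimestamp(end_epoch, tz=timezone.utc)
    year, month = start.year, start.month
    parts = []
    while (year, month) <= (end.year, end.month):
        lower = datetime(year, month, 1, tzinfo=timezone.utc)
        # stepping by calendar month avoids any assumption about month length
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        upper = datetime(next_year, next_month, 1, tzinfo=timezone.utc)
        name = f"{prefix}{year:04d}_{month:02d}"
        parts.append((name, int(lower.timestamp()), int(upper.timestamp())))
        year, month = next_year, next_month
    return parts


if __name__ == "__main__":
    for name, lower, upper in month_partitions(1518164237, 1520583437):
        # each tuple maps onto: check ( "timestamp" >= lower and "timestamp" < upper )
        print(name, lower, upper)
```

Unlike the generate_series output above, which steps from the exact start instant, this snaps every bound to the first day of the month, which is what a monthly CHECK constraint would normally need.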

Query in PostgreSQL with large quantity of squid access requests

Hello people, I'm using a log daemon (https://github.com/paranormal/blooper) with Squid Proxy to write the access log into PostgreSQL, and I wrote this trigger function:
DECLARE
newtime varchar := EXTRACT (MONTH FROM NEW."time")::varchar;
newyear varchar := EXTRACT (YEAR FROM NEW."time")::varchar;
user_name varchar := REPLACE (NEW.user_name, '.', '_');
partname varchar := newtime || '_' || newyear;
tablename varchar := user_name || '.accesses_' || partname;
BEGIN
IF NEW.user_name IS NOT NULL THEN
EXECUTE 'CREATE SCHEMA IF NOT EXISTS ' || user_name;
EXECUTE 'CREATE TABLE IF NOT EXISTS '
|| tablename
|| '('
|| 'CHECK (user_name = ''' || NEW.user_name || ''' AND EXTRACT(MONTH FROM "time") = ' || newtime || ' AND EXTRACT (YEAR FROM "time") = ' || newyear || ')'
|| ') INHERITS (public.accesses)';
EXECUTE 'CREATE INDEX IF NOT EXISTS access_index_' || partname || '_user_name ON ' || tablename || ' (user_name)';
EXECUTE 'CREATE INDEX IF NOT EXISTS access_index_' || partname || '_time ON ' || tablename || ' ("time")';
EXECUTE 'INSERT INTO ' || tablename || ' SELECT $1.*' USING NEW;
END IF;
RETURN NULL;
END;
Its main purpose is to partition the data by user_name and by month and year of the access, with each partition inheriting from an empty master table:
CREATE TABLE public.accesses
(
id integer NOT NULL DEFAULT nextval('accesses_id_seq'::regclass),
"time" timestamp with time zone NOT NULL,
time_response integer,
mac_source macaddr,
ip_source inet NOT NULL,
ip_destination inet,
user_name character varying(40),
http_status_code numeric(3,0) NOT NULL,
http_reply_size bigint NOT NULL,
http_request_method character varying(15) NOT NULL,
http_request_url character varying(4166) NOT NULL,
http_content_type character varying(100),
squid_hier_code character varying(20),
squid_request_status character varying(50),
user_id integer,
CONSTRAINT accesses_http_request_method_fkey FOREIGN KEY (http_request_method)
REFERENCES public.http_requests (method) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT accesses_http_status_code_fkey FOREIGN KEY (http_status_code)
REFERENCES public.http_statuses (code) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT accesses_user_id_fkey FOREIGN KEY (user_id)
REFERENCES public.users (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
The main problem is getting the sum of http_reply_size grouped by user_name and time. My query is:
SELECT
"time",
user_name,
sum(http_reply_size)
FROM
accesses
WHERE
extract(epoch from "time") BETWEEN 1516975122 AND 1516996722
GROUP BY
"time",
user_name
But this query is very slow on the server (3'237'976 rows so far, after only 2 days). Does PostgreSQL offer a way to optimize this kind of query, or do I need to use another SQL or NoSQL system?
Try to include a CHECK constraint on each partition so that the planner does not have to scan all tables.
In my case it looks like this:
CREATE TABLE IF NOT EXISTS ' || table_name || '(
CONSTRAINT ' || pk || ' PRIMARY KEY (avl_id),
CHECK ( event_time >= ''' || begin_time || ''' AND event_time < ''' || end_time || ''' )
) INHERITS (avl_db.avl);
Also, don't use extract(epoch from "time"): it has to be calculated for every row and cannot use the index you created on "time".
Compare the column directly instead, so the index can be used:
WHERE "time" >= '2018-01-01'::timestamp with time zone
and "time" < '2018-02-01'::timestamp with time zone
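The effect of wrapping an indexed column in a function can be observed on a small scale. The sketch below uses SQLite through Python's sqlite3 module, not PostgreSQL, so the planner and the plan text differ from what EXPLAIN shows on the real server; the table is a cut-down, made-up stand-in for accesses. It only illustrates the general principle that a plain range comparison on the column lets the planner use the index.

```python
import sqlite3


def query_plans():
    """Return SQLite's query plan text for a function-wrapped filter and a
    plain range filter on the same indexed column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accesses (t TEXT NOT NULL, user_name TEXT, "
        "http_reply_size INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX accesses_t ON accesses (t)")
    queries = {
        # the column is hidden inside an expression, so the index on t does not apply
        "wrapped": "SELECT sum(http_reply_size) FROM accesses "
                   "WHERE CAST(strftime('%s', t) AS INTEGER) "
                   "BETWEEN 1516975122 AND 1516996722",
        # the bare column is compared against constants, so the index can be searched
        "plain": "SELECT sum(http_reply_size) FROM accesses "
                 "WHERE t >= '2018-01-26 13:58:42' AND t < '2018-01-26 19:58:42'",
    }
    plans = {}
    for name, sql in queries.items():
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        plans[name] = " | ".join(row[-1] for row in rows)  # last column is the detail text
    conn.close()
    return plans


if __name__ == "__main__":
    for name, plan in query_plans().items():
        print(name, "->", plan)
```

On PostgreSQL the equivalent check is to run EXPLAIN on both forms of the query and compare whether an index scan on "time" appears.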

PostgreSQL: Function with multiple date parameter

I'm trying to create a function with multiple parameters, as below:
CREATE OR REPLACE FUNCTION select_name_and_date (
IN f_name character,
IN m_name character,
IN l_name character,
IN start_date date,
IN end_date date )
RETURNS TABLE (
start_date date ,first_name character, middle_name character,last_name character ) AS $BODY$
BEGIN RETURN QUERY
select a.start_date, a.first_name, a.middle_name, a.last_name
FROM table1 a
where code in ('NEW', 'OLD')
and ( (a.first_name like '%' || f_name || '%' and a.middle_name like '%' || m_name || '%' and a.last_name like '%' || l_name || '%'))
or ((a.date_applied) between start_date and end_date );
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
When I execute it with dates, it shows the correct result.
select * from select_name_and_date ('Firstname','','','2016-06-27','2016-06-28');
When I pass empty strings for the dates, it shows:
ERROR: invalid input syntax for type date: ""
select * from select_name_and_date ('Firstname','','','','');
When I pass NULL for the dates instead, it shows 0 rows retrieved (when it should return rows):
select * from select_name_and_date ('Firstname','','',NULL,NULL);
I want each parameter to work independently of the others.
The BETWEEN operator does not handle nulls: if either bound is NULL, the comparison yields NULL and the row is filtered out. If you want to allow null bounds, you'll need to treat them explicitly. E.g., you could rewrite the part of the condition that applies to a.date_applied as follows:
((a.date_applied BETWEEN start_date AND end_date) OR
(start_date IS NULL AND a.date_applied <= end_date) OR
(end_date IS NULL AND a.date_applied >= start_date) OR
(start_date IS NULL AND end_date IS NULL))
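The behaviour of optional bounds is easy to check on a toy table. This sketch uses SQLite through Python's sqlite3 module rather than PostgreSQL, with made-up rows, and writes the condition in a compact form (each bound is either NULL or enforced) that should be equivalent to spelling out the four cases; NULL comparison semantics are the same in both databases.

```python
import sqlite3

# hypothetical sample data: (first_name, date_applied)
SAMPLE = [("Ann", "2016-06-26"), ("Bob", "2016-06-27"), ("Cid", "2016-06-29")]


def applied_between(start_date, end_date):
    """Return first names whose date_applied lies within the given bounds.
    A bound passed as None is treated as 'no limit on that side'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (first_name TEXT, date_applied TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", SAMPLE)
    query = """
        SELECT first_name
        FROM table1
        WHERE (:start IS NULL OR date_applied >= :start)
          AND (:end   IS NULL OR date_applied <= :end)
        ORDER BY date_applied
    """
    rows = conn.execute(query, {"start": start_date, "end": end_date}).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    print(applied_between("2016-06-27", "2016-06-28"))  # both bounds given
    print(applied_between(None, "2016-06-27"))          # open start
    print(applied_between(None, None))                  # no date filter at all
```

Passing NULL rather than an empty string matters here: '' is not a valid date, which is what caused the "invalid input syntax" error in the question.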

Transposing a table through select query

I have a table like:
Key type value
---------------------
40 A 12.34
41 A 10.24
41 B 12.89
I want it in the format:
Types 40 41 42 (keys)
---------------------------------
A 12.34 10.24 XXX
B YYY 12.89 ZZZ
How can this be done with an SQL query? With CASE statements, or DECODE?
What you're looking for is called a "pivot" (see also "Pivoting Operations" in the Oracle Database Data Warehousing Guide):
SELECT *
FROM tbl
PIVOT(SUM(value) FOR Key IN (40, 41, 42))
It was added to Oracle in 11g. Note that you need to specify the result columns (the values from the unpivoted column that become the pivoted column names) in the pivot clause. Any columns not specified in the pivot are implicitly grouped by. If you have columns in the original table that you don't wish to group by, select from a view or subquery, rather than from the table.
You can engage in a bit of wizardry and get Oracle to create the statement for you, so that you don't need to figure out what column values to pivot on. In 11g, when you know the column values are numeric:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN ('
|| LISTAGG(Key, ',') WITHIN GROUP (ORDER BY Key)
|| '));'
FROM tbl;
If the column values might not be numeric:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN ('''
|| LISTAGG(Key, ''',''') WITHIN GROUP (ORDER BY Key)
|| '''));'
FROM tbl;
LISTAGG repeats duplicate values, so select from a DISTINCT subquery:
SELECT
'SELECT * FROM tbl PIVOT(SUM(value) FOR Key IN ('''
|| LISTAGG(Key, ''',''') WITHIN GROUP (ORDER BY Key)
|| '''));'
FROM (SELECT DISTINCT Key FROM tbl);
You could go further, defining a function that takes a table name, aggregate expression and pivot column name that returns a pivot statement by first producing then evaluating the above statement. You could then define a procedure that takes the same arguments and produces the pivoted result. I don't have access to Oracle 11g to test it, but I believe it would look something like:
CREATE PACKAGE dynamic_pivot AS
-- creates a PIVOT statement dynamically
FUNCTION pivot_stmt (tbl_name IN VARCHAR2,
pivot_col IN VARCHAR2,
aggr_expr IN VARCHAR2,
quote_values IN BOOLEAN DEFAULT TRUE)
RETURN VARCHAR2;
PRAGMA RESTRICT_REFERENCES (pivot_stmt, WNDS, RNPS);
-- creates a PIVOT & opens a cursor over its result
PROCEDURE pivot_table (tbl_name IN VARCHAR2,
pivot_col IN VARCHAR2,
aggr_expr IN VARCHAR2,
pivot_cur OUT SYS_REFCURSOR,
quote_values IN BOOLEAN DEFAULT TRUE);
END dynamic_pivot;
/
CREATE PACKAGE BODY dynamic_pivot AS
FUNCTION pivot_stmt (
tbl_name IN VARCHAR2,
pivot_col IN VARCHAR2,
aggr_expr IN VARCHAR2,
quote_values IN BOOLEAN DEFAULT TRUE
) RETURN VARCHAR2
IS
stmt VARCHAR2(4000);
quote VARCHAR2(2) DEFAULT '';
BEGIN
IF quote_values THEN
-- two quote characters: an escaped quote inside the generated string literal
quote := '''''';
END IF;
-- Single quotes are escaped by doubling them; Oracle has no backslash escape
-- The input fields aren't sanitized, so this is vulnerable to injection
EXECUTE IMMEDIATE 'SELECT ''SELECT * FROM ' || tbl_name
|| ' PIVOT(' || aggr_expr || ' FOR ' || pivot_col
|| ' IN (' || quote || ''' || LISTAGG(' || pivot_col
|| ', ''' || quote || ',' || quote
|| ''') WITHIN GROUP (ORDER BY ' || pivot_col || ') || ''' || quote
|| '))'' FROM (SELECT DISTINCT ' || pivot_col || ' FROM ' || tbl_name || ')'
INTO stmt;
RETURN stmt;
END pivot_stmt;
PROCEDURE pivot_table (tbl_name IN VARCHAR2, pivot_col IN VARCHAR2, aggr_expr IN VARCHAR2, pivot_cur OUT SYS_REFCURSOR, quote_values IN BOOLEAN DEFAULT TRUE) IS
BEGIN
-- EXECUTE IMMEDIATE does not run a SELECT that has no INTO clause, so open a cursor instead
OPEN pivot_cur FOR pivot_stmt(tbl_name, pivot_col, aggr_expr, quote_values);
END pivot_table;
END dynamic_pivot;
/
Note: PL/SQL parameter and return types cannot carry a length, so they are plain VARCHAR2 (table and column names are limited to 30 characters in 11g in any case). Note also that the function is vulnerable to SQL injection.
Before 11g, you can apply MySQL pivot statement generation techniques (which produce the type of query others have posted, based on explicitly defining a separate column for each pivot value).
Pivot does simplify things greatly. Before 11g however, you need to do this manually.
select
type,
sum(case when key = 40 then value end) as val_40,
sum(case when key = 41 then value end) as val_41,
sum(case when key = 42 then value end) as val_42
from my_table
group by type;
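The manual pivot is plain SQL, so it can be tried on any engine. Here is a small sketch that runs it on the question's sample rows using SQLite through Python's sqlite3 module (not Oracle); the helper name pivot_by_type is made up, and the list of keys still has to be supplied up front, just as in the hand-written query.

```python
import sqlite3


def pivot_by_type(rows, keys=(40, 41, 42)):
    """Pivot (key, type, value) rows into one row per type with one
    column per key, using SUM(CASE ...) instead of a PIVOT clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE my_table ("key" INTEGER, type TEXT, value REAL)')
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    # one conditional aggregate per pivot value; int() keeps the generated SQL numeric-only
    columns = ", ".join(
        f'sum(CASE WHEN "key" = {int(k)} THEN value END) AS val_{int(k)}'
        for k in keys
    )
    query = f"SELECT type, {columns} FROM my_table GROUP BY type ORDER BY type"
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(40, "A", 12.34), (41, "A", 10.24), (41, "B", 12.89)]
    for row in pivot_by_type(sample):
        print(row)  # cells with no matching row come back as None (NULL)
```

Cells with no source row are NULL rather than a placeholder such as XXX; wrap each sum in COALESCE if a default value is wanted.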
I have never tried it, but it seems that at least Oracle 11 has a PIVOT clause.
If you do not have access to 11g, you can use string aggregation with grouping to approximate what you are looking for, such as
with data as(
SELECT 40 KEY , 'A' TYPE , 12.34 VALUE FROM DUAL UNION
SELECT 41 KEY , 'A' TYPE , 10.24 VALUE FROM DUAL UNION
SELECT 41 KEY , 'B' TYPE , 12.89 VALUE FROM DUAL
)
select
TYPE ,
wm_concat(KEY) KEY ,
wm_concat(VALUE) VALUE
from data
GROUP BY TYPE;
type KEY VALUE
------ ------- -----------
A 40,41 12.34,10.24
B 41 12.89
This is based on wm_concat as shown here: http://www.oracle-base.com/articles/misc/StringAggregationTechniques.php
I'm going to leave this here just in case it helps, but I think PIVOT or MikeyByCrikey's answers would best suit your needs after re-looking at your sample results.