What I want to do is change my date column type from varchar to timestamp without time zone and migrate all my data. This is what I'm doing:
ALTER TABLE mytable ALTER COLUMN "datecol"
TYPE timestamp without time zone USING(to_timestamp("datecol", 'YYYY-MM-DD')::timestamp without time zone);
This works fine if the data in datecol is in date format. But if some of the data is invalid, like random strings (e.g. "abc"), how can I validate it and check that the date format is good? I want to set a default value for those invalid fields.
EDIT: Thanks Ludvig, I solved my problem:
create or replace function is_date(s varchar) returns boolean as $$
begin
perform s::date;
return true;
exception when others then
return false;
end;
$$ language plpgsql;
ALTER TABLE mytable ALTER COLUMN "datecol"
TYPE timestamp without time zone USING(
CASE WHEN is_date("datecol") = true
THEN to_timestamp("datecol", 'YYYY-MM-DD')::timestamp without time zone
ELSE '1970-01-01 00:00:00.000'
END
);
You can't put try/catch logic directly into the USING expression of ALTER COLUMN ... TYPE, so you either have to regexp-match all possible formats, or you can:
add column "ts" of timestamp data type
add column "exc" boolean
do $$
declare _r record;
begin
for _r in (select * from mytable) loop
begin -- sub-block so a failed conversion is caught per row
update mytable set "ts" = to_timestamp("datecol", 'YYYY-MM-DD')
where ...PK..=_r.PK;
exception when others
then update mytable set "exc" = true where ...PK..=_r.PK;
end;
end loop; end; $$;
drop datecol, rename "ts" to "datecol"
deal with values where "exc"
The big drawback is that the order of columns in mytable changes. As a poor workaround you can afterwards run create table tt as select NEEDED order from mytable; drop table mytable; alter table tt rename to mytable, but then you will have to rebuild all references and dependent objects as well, of course. An even more exotic way is to keep adding columns in the needed order, dropping and renaming, until you get the old layout back.
Instead of testing for a valid date I would create a function that tries casting using different formats and returns the default date in case none of them work:
create or replace function convert_date(s varchar)
returns timestamp
as
$$
begin
-- test standard ISO format
begin
return to_timestamp(s, 'yyyy-mm-dd');
exception when others then
-- ignore
end;
begin
return to_timestamp(s, 'yyyy.mm.dd');
exception when others then
-- ignore
end;
begin
return to_timestamp(s, 'yyyy/mm/dd');
exception when others then
-- ignore
end;
begin
return to_timestamp(s, 'yyyy-mm');
exception when others then
-- ignore
end;
begin
return to_timestamp(s, 'yyyy');
exception when others then
-- ignore
end;
return timestamp '1970-01-01 00:00:00';
end
$$ language plpgsql;
Then use:
ALTER TABLE mytable
ALTER COLUMN "datecol" TYPE timestamp without time zone
USING (convert_date(datecol));
This won't be very efficient, but for a one-time job it should work.
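The same "try each format, then fall back to a default" idea can also be sketched outside the database, for example to pre-check exported values before running the migration. The following is only a minimal Python illustration: the function name mirrors convert_date above, and the format list and default are assumptions copied from that function. It assumes the input is a string, and Python's strptime is stricter than to_timestamp, so edge cases may differ.

```python
from datetime import datetime

# Formats tried in order; mirrors the list in convert_date above.
FORMATS = ("%Y-%m-%d", "%Y.%m.%d", "%Y/%m/%d", "%Y-%m", "%Y")
# Assumed default, same value as in the SQL function.
DEFAULT = datetime(1970, 1, 1)


def convert_date(s: str) -> datetime:
    """Return the first successful parse of s, or DEFAULT if none match."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(s, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return DEFAULT
```

Calling convert_date("abc") returns the 1970-01-01 default, while convert_date("2020-08") parses as the first day of that month.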
I want to convert this Postgres code into something shorter that does the same thing. I read about upsert, but I couldn't work out a good way to apply it to my code.
What I wrote works fine, but I want to find a more elegant way to write it.
Hope someone here can help me! This is the function:
CREATE OR REPLACE FUNCTION insert_table(
in_guid character varying,
in_x_value character varying,
in_y_value character varying
)
RETURNS TABLE(response boolean) LANGUAGE 'plpgsql'
DECLARE _id integer;
BEGIN
-- guid exists and it's been 10 minutes from created_date:
IF ((SELECT COUNT (*) FROM public.tbl_client_location WHERE guid = in_guid AND created_date < NOW() - INTERVAL '10 MINUTE') > 0) THEN
RETURN QUERY (SELECT FALSE);
-- guid exists but 10 minutes haven't passed yet:
ELSEIF ((SELECT COUNT (*) FROM public.tbl_client_location WHERE guid = in_guid) > 0) THEN
UPDATE
public.tbl_client_location
SET
x_value = in_x_value,
y_value = in_y_value,
updated_date = now()
WHERE
guid = in_guid;
RETURN QUERY (SELECT TRUE);
-- guid not exist:
ELSE
INSERT INTO public.tbl_client_location
( guid , x_value , y_value )
VALUES
( in_guid, in_x_value, in_y_value )
RETURNING id INTO _id;
RETURN QUERY (SELECT TRUE);
END IF;
END
This can indeed be a lot simpler:
CREATE OR REPLACE FUNCTION insert_table(in_guid text
, in_x_value text
, in_y_value text
, OUT response bool) -- ④
-- RETURNS record -- optional noise -- ④
LANGUAGE plpgsql AS -- ①
$func$ -- ②
-- DECLARE
-- _id integer; -- what for?
BEGIN
INSERT INTO tbl AS t
( guid, x_value, y_value)
VALUES (in_guid, in_x_value, in_y_value)
ON CONFLICT (guid) DO UPDATE -- guid exists
SET ( x_value, y_value, updated_date)
= (EXCLUDED.x_value, EXCLUDED.y_value, now()) -- ⑤
WHERE t.created_date >= now() - interval '10 minutes' -- ③ have not passed yet
-- RETURNING id INTO _id -- what for?
;
response := FOUND; -- ⑥
END
$func$;
Assuming guid is defined UNIQUE or PRIMARY KEY, and created_date is defined NOT NULL DEFAULT now().
① Language name is an identifier - better without quotes.
② Quotes around function body were missing (invalid command). See:
What are '$$' used for in PL/pgSQL
③ UPDATE only if 10 min have not passed yet. Keep in mind that timestamps are those from the beginning of the respective transactions. So keep transactions short and simple. See:
Difference between now() and current_timestamp
④ A function with OUT parameter(s) and no RETURNS clause returns a single row (record) automatically. Your original was declared as set-returning function (0-n returned rows), which didn't make sense. See:
Return multiple fields as a record in PostgreSQL with PL/pgSQL
⑤ It's generally better to use the special EXCLUDED row than to spell out values again. See:
How could this UPSERT query be made shorter?
⑤ Also using short syntax for updating multiple columns. See:
Update multiple columns that start with a specific string
⑥ To see whether a row was written use the special variable FOUND. Subtle difference: different from your original, you get true or false after the fact, saying that a row has actually been written (or not). In your original, the INSERT or UPDATE might still be skipped (without raising an exception) by a trigger or rule, and the function result would be misleading in this case. See:
IS NOT NULL test for a record does not return TRUE when variable is set
Further reading:
Postgres ON CONFLICT ON CONSTRAINT triggering errors in the error log
How to use RETURNING with ON CONFLICT in PostgreSQL?
You might just run the single SQL statement instead, providing your values once:
INSERT INTO tbl AS t(guid, x_value,y_value)
VALUES ($in_guid, $in_x_value, $in_y_value) -- your values here, once
ON CONFLICT (guid) DO UPDATE
SET (x_value,y_value, updated_date)
= (EXCLUDED.x_value, EXCLUDED.y_value, now())
WHERE t.created_date >= now() - interval '10 minutes';
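To make the three possible outcomes of that statement explicit, here is a small in-memory model in Python. It is only a sketch of the decision logic, not of how PostgreSQL executes the statement: the dict stands in for tbl keyed by guid, upsert_location is a hypothetical name, and created_date is assumed to be set at insert time as described above.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)


def upsert_location(table: dict, guid: str, x: str, y: str, now: datetime) -> bool:
    """Insert a new row, or update an existing one created within WINDOW.

    Returns True if a row was written and False if the update was skipped,
    which is the role FOUND plays in the function above.
    """
    row = table.get(guid)
    if row is None:
        # No conflict: plain INSERT, created_date defaults to "now".
        table[guid] = {"x_value": x, "y_value": y,
                       "created_date": now, "updated_date": None}
        return True
    if row["created_date"] >= now - WINDOW:
        # Conflict, but still inside the window: DO UPDATE applies.
        row.update(x_value=x, y_value=y, updated_date=now)
        return True
    # Conflict and the row is older than the window: nothing is written.
    return False
```

A second call for the same guid five minutes after the first returns True and overwrites the values; a call eleven minutes after creation returns False and leaves the row untouched.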
I finally solved it. I made another function that checks whether the guid already exists and how old the row is; if that check passes, I can do the upsert without any problems.
This is what I did in the end:
CREATE OR REPLACE FUNCTION fnc_check_table(
in_guid character varying)
RETURNS TABLE(response boolean)
LANGUAGE 'plpgsql'
COST 100
VOLATILE
ROWS 1000
AS $BODY$
BEGIN
IF EXISTS (SELECT FROM tbl WHERE guid = in_guid AND created_date < NOW() - INTERVAL '10 MINUTE' ) THEN
RETURN QUERY (SELECT FALSE);
ELSEIF EXISTS (SELECT FROM tbl WHERE guid = in_guid AND created_date > NOW() - INTERVAL '10 MINUTE') THEN
RETURN QUERY (SELECT TRUE);
ELSE
RETURN QUERY (SELECT TRUE);
END IF;
END
$BODY$;
CREATE OR REPLACE FUNCTION fnc_insert_table(
in_guid character varying,
in_x_value character varying,
in_y_value character varying)
RETURNS TABLE(response boolean)
LANGUAGE 'plpgsql'
COST 100
VOLATILE
ROWS 1000
AS $BODY$
BEGIN
IF (fnc_check_table(in_guid)) THEN
INSERT INTO tbl (guid, x_value, y_value)
VALUES (in_guid,in_x_value,in_y_value)
ON CONFLICT (guid)
DO UPDATE SET x_value=in_x_value, y_value=in_y_value, updated_date=now();
RETURN QUERY (SELECT TRUE);
ELSE
RETURN QUERY (SELECT FALSE);
END IF;
END
$BODY$;
The data length of the DATE data type in our production DB is 7, but the SYSDATE function returns 8 characters (dd/mm/yy).
Is there any way to eliminate the '/' and populate only 'ddmmyy'?
I tried the statements below, but no luck.
INSERT INTO TESTER(tablename,columnname,defaultdate,prime_number) VALUES ('tabL7845','field894',REPLACE(SYSDATE,'/',''),105);
INSERT INTO TESTER(tablename,columnname,defaultdate,prime_number) VALUES ('ta68888','fiG987',TO_CHAR(sysdate,'MMDDYY'),180);
INSERT INTO TESTER(tablename,columnname,defaultdate,prime_number) VALUES ('tab345','field464',TRIM(BOTH '/' FROM SYSDATE),65);
The row gets inserted, but in the table I still find the same SYSDATE format,
e.g. 07/08/20.
How can I populate it as 070820?
Trigger code :
SET SERVEROUTPUT ON;
CREATE OR REPLACE TRIGGER def_trig
BEFORE INSERT OR UPDATE OF datedef ON Test
REFERENCING OLD AS O NEW AS N
FOR EACH ROW
DECLARE
V_DATA_LENGTH NUMBER;
V_DATA_TYPE VARCHAR2(15);
BEGIN
SELECT DATA_LENGTH,DATA_TYPE
INTO V_DATA_LENGTH,V_DATA_TYPE
FROM all_tab_columns
WHERE table_name = :n.tablename
AND column_name =:n.columnname;
IF INSERTING THEN
IF v_data_type = 'DATE' THEN
IF length(:n.datedef) > V_DATA_LENGTH THEN
RAISE_APPLICATION_ERROR(-20001,'DATE FIELD LENGTH IS MORE THAN THE CORESPONDING COLUMN DATA LENGTH');
END IF;
ELSE
DBMS_OUTPUT.PUT_LINE('INSERT IS SUCCESSFUL');
END IF;
END if;
END;
/
Dates in Oracle do not have any sort of internal text formatting; they are stored as binary. If you want to view a SYSDATE value in a certain text format, then use TO_CHAR with the appropriate format mask:
INSERT INTO TESTER (tablename, columnname, defaultdate, prime_number)
VALUES ('ta68888', 'fiG987', sysdate, 180); -- just insert plain SYSDATE here
SELECT
tablename,
columnname,
TO_CHAR(defaultdate, 'DDMMYY'), -- then view SYSDATE however you want
prime_number
FROM TESTER;
Internally, dates are stored as a structure of seven 1-byte integers; that is where the length 7 comes from. From that structure any valid format can be displayed without changing the data. To see this, select the same column under multiple formats. For example:
alter session set nls_date_format = 'hh24:mi:ss dd-Mon-yyyy';
select defaultdate
from tester
where defaultdate is not null
and rownum<2;
alter session set nls_date_format = 'Dy dd-Mon-yyyy # hh24:mi:ss' ;
select defaultdate
from tester
where defaultdate is not null
and rownum<2;
To get a glimpse at the internal format run:
select defaultdate, dump(defaultdate) from tester;
This will show you the default date (interpreted as directed by nls_date_format) and a glimpse of the internal structure.
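As a rough illustration of why the stored value and the displayed text are independent, the following Python sketch decodes seven bytes of the kind DUMP lists and then renders them in two formats. The layout used here (century + 100, year within century + 100, month, day, hour + 1, minute + 1, second + 1) is the commonly described one for the DATE type and is an assumption of this sketch; decode_oracle_date is a hypothetical helper, not an Oracle API.

```python
from datetime import datetime


def decode_oracle_date(raw: bytes) -> datetime:
    """Decode a 7-byte internal DATE value as listed by DUMP().

    Assumed layout: century+100, year-in-century+100, month, day,
    hour+1, minute+1, second+1.
    """
    if len(raw) != 7:
        raise ValueError("an Oracle DATE is exactly 7 bytes")
    century, year, month, day, hour, minute, second = raw
    return datetime((century - 100) * 100 + (year - 100),
                    month, day, hour - 1, minute - 1, second - 1)


# Bytes as DUMP would list them for 7 August 2020 at midnight.
value = decode_oracle_date(bytes([120, 120, 8, 7, 1, 1, 1]))
print(value.strftime("%d%m%y"))    # 070820
print(value.strftime("%d/%m/%y"))  # 07/08/20
```

The same seven bytes produce both strings; only the format applied at display time differs, which is what TO_CHAR and nls_date_format control.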
I am working on a function that will delete one month's worth of records from a table. The table is in Postgres. As Postgres does not have stored procedures, I am trying to declare a function that inserts the one-month records into a history table and then deletes them from the live table. I have the following code:
CREATE FUNCTION DeleteAndInsertTransaction(Integer)
RETURNS Void
AS $Body$
SELECT now() into saveTime;
SELECT * INTO public.hist_table
FROM (select * from public.live_table
WHERE update < ((SELECT * FROM saveTime) - ($1::text || ' months')::interval)) as sub;
delete from public.live_table
where update < ((SELECT * FROM saveTime) - ($1::text || ' months')::interval);
DROP TABLE saveTime;
$Body$
Language 'sql';
The above code compiles fine, but when I run it by invoking DeleteAndInsertTransaction(27) it gives me:
Error: relation "savetime" does not exist
and I have no clue what is going on here.
If I take SELECT now() into saveTime; out of the function body and run it before invoking the function, it works. But I need to store the current date in a variable and use it as a constant for both the insert and the delete: this runs against a huge table, and there could be a significant time difference between the insert and the delete. Any pointers as to what is going on here?
select .. into .. is the deprecated syntax for create table ... as select ... which creates a new table.
So, SELECT now() into saveTime; actually creates a new table (named savetime), and is equivalent to: create table savetime as select now(); - it's not storing something in a variable.
To store a value in a variable, you need to first declare the variable, then you can assign the value. But you can only do that in PL/pgSQL, not SQL:
CREATE FUNCTION DeleteAndInsertTransaction(p_num_months integer)
returns void
as
$Body$
declare
l_now timestamp;
begin
l_now := now();
...
end;
$Body$
language plpgsql;
To insert into an existing table you need
insert into public.hist_table
select *
from public.live_table;
To select the rows older than x months, there is no need to store the current date and time in a variable to begin with. It's also easier to use make_interval() to generate an interval based on a specified unit.
You can simply use
select *
from live_table
where updated_at <= current_date - make_interval(months => p_num_months);
And as you don't need a variable, you can actually do all that with a language sql function.
So the function would look something like this:
CREATE FUNCTION DeleteAndInsertTransaction(p_num_months integer)
RETURNS Void
AS
$Body$
insert into public.hist_table
select *
from live_table
where updated_at < current_date - make_interval(months => p_num_months);
delete from public.live_table
where updated_at < current_date - make_interval(months => p_num_months);
$Body$
Language sql;
Note that the language name is an identifier and should not be quoted.
You can actually do the DELETE and INSERT in a single statement:
with deleted as (
delete from public.live_table
where updated_at <= current_date - make_interval(months => p_num_months)
returning *
)
insert into hist_table
select *
from deleted;
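If the work is driven from client code instead, the concern about the insert and the delete seeing different cutoffs can be handled by computing the cutoff once and passing it to both statements inside one transaction. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; the table and column names follow the answer above, months_before and archive_old_rows are hypothetical helpers, and dates are assumed to be stored as ISO-8601 text so that string comparison matches date order.

```python
import calendar
import sqlite3
from datetime import date


def months_before(day: date, months: int) -> date:
    """Subtract whole months, clamping to the last day of the target month."""
    total = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(total, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(day.day, last_day))


def archive_old_rows(conn: sqlite3.Connection, months: int, today: date) -> int:
    """Copy rows older than the cutoff to hist_table, then delete them."""
    cutoff = months_before(today, months).isoformat()  # computed once
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "INSERT INTO hist_table "
            "SELECT * FROM live_table WHERE updated_at < ?", (cutoff,))
        cur = conn.execute(
            "DELETE FROM live_table WHERE updated_at < ?", (cutoff,))
    return cur.rowcount  # number of rows removed from live_table
```

Because both statements receive the same cutoff value, every row that is deleted has also been copied, regardless of how long the insert takes.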
I have a Postgres table with millions of records in it. Now I want to add a new column called "time_modified" to that table, filled with the value of another column, "last_event_time". Running a migration script takes a long time, so I need a simple solution that can run in production.
Assuming that the columns are timestamps you can try:
alter table my_table add time_modified text;
alter table my_table alter time_modified type timestamp using last_event_time;
I suggest using a function with pg_sleep, which waits between iterations of the loop.
This way you avoid taking an exclusive lock (and other heavy locks) on your_table.
SELECT pg_sleep(seconds);
But the execution time is long.
alter table your_table add time_modified timestamp;
CREATE OR REPLACE FUNCTION update_mew_column()
RETURNS void AS
$BODY$
DECLARE
rec record;
BEGIN
for rec in (select id,last_event_time from your_table) loop
update your_table set time_modified = rec.last_event_time where id = rec.id;
PERFORM pg_sleep(0.01);
end loop;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Then execute the function:
select update_mew_column();
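Keep in mind that a function call runs inside a single transaction, so the row locks taken by the loop are held until the function returns. One alternative, sketched here under assumptions, is to drive the backfill from a client in small batches and commit after each one. The sketch uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; with PostgreSQL you would issue the same statements through your usual driver. backfill_time_modified, batch_size and pause are hypothetical names, and ids are assumed to be non-negative integers.

```python
import sqlite3
import time


def backfill_time_modified(conn: sqlite3.Connection,
                           batch_size: int = 1000,
                           pause: float = 0.01) -> int:
    """Copy last_event_time into time_modified in id-ordered batches."""
    total = 0
    last_id = -1  # assumes ids are non-negative integers
    while True:
        ids = [r[0] for r in conn.execute(
            "SELECT id FROM your_table WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size))]
        if not ids:
            return total
        with conn:  # commit each batch so its locks are released early
            conn.execute(
                "UPDATE your_table SET time_modified = last_event_time "
                "WHERE id > ? AND id <= ?", (last_id, ids[-1]))
        total += len(ids)
        last_id = ids[-1]
        time.sleep(pause)  # give other sessions room between batches
```

Each batch touches at most batch_size rows and is committed on its own, so an interruption loses only the batch in progress and the job can be restarted.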
I have this procedure:
PROCEDURE insertSample
(
return_code_out OUT VARCHAR2,
return_msg_out OUT VARCHAR2,
sample_id_in IN table1.sample_id%TYPE,
name_in IN table1.name%TYPE,
address_in IN table1.address%TYPE
)
IS
BEGIN
return_code_out := '0000';
return_msg_out := 'OK';
INSERT INTO table1
(sample_id, name, address)
VALUES
(sample_id_in, name_in, address_in);
EXCEPTION
WHEN OTHERS
THEN
return_code_out := SQLCODE;
return_msg_out := SQLERRM;
END insertSample;
I want to add a fourth column to table1, something like day_time, and store the current timestamp in it. How can I do that in this procedure? Thank you.
Assuming you have (or add) a column to the table outside of the procedure, i.e.
ALTER TABLE table1
ADD( insert_timestamp TIMESTAMP );
you could modify your INSERT statement to be
INSERT INTO table1
(sample_id, name, address, insert_timestamp)
VALUES
(sample_id_in, name_in, address_in, systimestamp);
In general, however, I would strongly suggest that you not return error codes and error messages from procedures. If you cannot handle the error in your procedure, you should let the exception propagate up to the caller. That is a much more sustainable method of writing code than trying to ensure that every caller to every procedure always correctly checks the return code.
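The difference between the two styles is easiest to see from the caller's side. The following Python sketch uses the built-in sqlite3 module only as a stand-in database; insert_sample here is a hypothetical counterpart of the procedure above, and the sample rows are made up for illustration. The function does not catch anything, so a failed insert cannot be silently ignored by a caller that forgot to check a return code.

```python
import sqlite3


def insert_sample(conn: sqlite3.Connection, sample_id: int,
                  name: str, address: str) -> None:
    """Insert one row; any database error propagates to the caller."""
    with conn:  # commit on success, roll back if the insert raises
        conn.execute(
            "INSERT INTO table1 (sample_id, name, address) VALUES (?, ?, ?)",
            (sample_id, name, address))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 "
             "(sample_id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
insert_sample(conn, 1, "Ann", "1 Main St")
try:
    insert_sample(conn, 1, "Bob", "2 Side St")  # duplicate key
except sqlite3.IntegrityError as exc:
    # Only a caller that knows how to react needs a handler.
    print("caller decides what to do:", exc)
```

A caller with no sensible way to handle the failure simply omits the try block and lets the error travel further up.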
SYSDATE supports all sorts of date manipulation, including the current date as well as future and past dates.
http://edwardawebb.com/database-tips/sysdate-determine-start-previous-month-year-oracle-sql
SYSDATE will give you the current date and time.
If you add the column with a default value, you can leave your procedure as it is:
ALTER TABLE table1 ADD when_created DATE DEFAULT SYSDATE;