I have timestamp data that is still STRING type, like below:
+-----------------------------+
|         created_at          |
+-----------------------------+
| 2019-09-05T07:44:32.117283Z |
| 2019-09-05T08:44:32.117213D |
| 2019-09-06T08:44:32.117283A |
| 2019-09-21T09:42:32.117223T |
| 2019-10-21T10:21:14.1174dwC |
+-----------------------------+
How can I change it to ISO format, like "2019-09-05 07:44:32 UTC"?
Thanks in advance
You can use PARSE_TIMESTAMP('%FT%T', SPLIT(created_at, '.')[OFFSET(0)]) or PARSE_TIMESTAMP('%FT%T', SUBSTR(created_at, 1, 19)) - whichever you like better.
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT '2019-09-05T07:44:32.117283Z' created_at UNION ALL
SELECT '2019-09-05T08:44:32.117213D' UNION ALL
SELECT '2019-09-06T08:44:32.117283A' UNION ALL
SELECT '2019-09-21T09:42:32.117223T' UNION ALL
SELECT '2019-10-21T10:21:14.1174dwC'
)
SELECT PARSE_TIMESTAMP('%FT%T', SPLIT(created_at, '.')[OFFSET(0)])
FROM `project.dataset.table`
with output
Row f0_
1 2019-09-05 07:44:32 UTC
2 2019-09-05 08:44:32 UTC
3 2019-09-06 08:44:32 UTC
4 2019-09-21 09:42:32 UTC
5 2019-10-21 10:21:14 UTC
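If some rows might not contain even a valid date prefix, a defensive variant is BigQuery's SAFE. function prefix, which returns NULL instead of raising an error - a minimal sketch against the same sample table:
#standardSQL
SELECT
  created_at,
  -- NULL instead of a query error when the first 19 characters are not a valid timestamp
  SAFE.PARSE_TIMESTAMP('%FT%T', SUBSTR(created_at, 1, 19)) AS parsed_at
FROM `project.dataset.table`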
I am just learning database syntax, so I'm sorry if this is not a relevant question.
I'm trying to change the text of a column when a condition is met. I have tried many things but have not achieved anything.
|---------|---------|--------------|
| Some PK | Some FK | someDatetime |
|---------|---------|--------------|
| 12      | 34      | 1900/01/01   |
| 13      | 54      | 2018/05/32   |
| 15      | 60      | 2000/01/01   |
|---------|---------|--------------|
What I need is to display this same table, but when the date is earlier than 2018 (I know those rows can be found with a WHERE), the query brings this back:
|---------|---------|-------------------------------|
| Some PK | Some FK | someDatetime                  |
|---------|---------|-------------------------------|
| 12      | 34      | ---------- (or my own string) |
| 13      | 54      | 2018/05/32                    |
| 15      | 60      | ---------- (or my own string) |
|---------|---------|-------------------------------|
You could use the YEAR function to check the date:
SELECT
PK,
FK,
CASE WHEN YEAR(someDatetime) < 2018
THEN 'my own string'
ELSE CONVERT(VARCHAR, someDatetime, 120) END AS someDatetime
FROM yourTable;
Note that if you want the column to show your message when the year is earlier than 2018, then every branch of the CASE expression must return text. So we use CONVERT on the datetime column to generate a text version of the date.
Use CASE WHEN and the YEAR function to compare the year of the date:
select some_PK, some_FK,
    case when year(someDatetime) < 2018 then 'My own string'
         else convert(varchar(19), someDatetime, 120)  -- both branches must return text
    end as someDatetime
from yourtable
Why wouldn't you just do:
select . . . ,
(case when someDateTime < '2018-01-01'
then 'my own string'
else convert(varchar(255), someDateTime) -- might want to include a format
end)
No function is needed for the date comparison, just a date comparison.
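For completeness, a hedged, runnable version of this approach (table and column names are assumptions based on the question's sample data):
-- Assumed names: yourTable, SomePK, SomeFK, someDatetime
SELECT SomePK,
       SomeFK,
       CASE WHEN someDatetime < '20180101'               -- language-independent date literal
            THEN 'my own string'
            ELSE CONVERT(varchar(19), someDatetime, 120) -- style 120: yyyy-mm-dd hh:mi:ss
       END AS someDatetime
FROM yourTable;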
I have a table in Postgres as follows:
| id | start_time | end_time | duration |
|----|--------------------------|--------------------------|----------|
| 1 | 2018-05-11T00:00:20.631Z | 2018-05-11T01:03:14.496Z | 1:02:54 |
| 2 | 2018-05-11T00:00:04.877Z | 2018-05-11T00:00:14.641Z | 0:00:10 |
| 3 | 2018-05-11T01:03:28.063Z | 2018-05-11T01:04:36.410Z | 0:01:08 |
| 4 | 2018-05-11T00:00:20.631Z | 2018-05-11T02:03:14.496Z | 2:02:54 |
start_time and end_time are stored as varchar in ISO 8601 format ('yyyy-mm-dd"T"hh24:mi:ss.ms"Z"', e.g. 2018-05-11T00:00:20.631Z).
duration has been calculated as end_time - start_time. Format is hh:mi:ss.
I need result table output as follows:
| id | start_time | end_time | duration | start | end | duration_minutes |
|----|--------------------------|--------------------------|----------|-----------|-----------|------------------|
| 1 | 2018-05-11T00:00:20.631Z | 2018-05-11T01:03:14.496Z | 1:02:54 | 5/11/2018 | 5/11/2018 | 62 | -- (60+2)
| 2 | 2018-05-11T00:00:04.877Z | 2018-05-11T00:00:14.641Z | 0:00:10 | 5/11/2018 | 5/11/2018 | 0 |
| 3 | 2018-05-11T01:03:28.063Z | 2018-05-11T01:04:36.410Z | 0:01:08 | 5/11/2018 | 5/11/2018 | 1 |
| 4 | 2018-05-11T00:00:20.631Z | 2018-05-11T02:03:14.496Z | 2:02:54 | 5/11/2018 | 5/11/2018 | 122 | -- (2X60 +2)
start and end need to contain only the mm/dd/yyyy portion of start_time and end_time respectively.
duration_minutes should calculate total duration in minutes (eg, if duration is 1:02:54, duration in minutes should be 62 which is 60+2)
How can I do this using SQL?
Based on varchar input, this query produces your desired result, exactly:
SELECT *
, to_char(start_time::timestamp, 'FMMM/DD/YYYY') AS start
, to_char(end_time::timestamp, 'FMMM/DD/YYYY') AS "end"  -- "end" is a reserved word and must be quoted
, extract(epoch FROM duration::interval)::int / 60 AS duration_minutes
FROM tbl;
Major points:
Use timestamp and interval instead of varchar to begin with.
Or do not store the functionally dependent column duration at all. It can cheaply be computed on the fly.
For display / a particular text representation use to_char().
Be explicit and do not rely on locale settings that may change from session to session.
The FM pattern modifier is for (quoting the manual):
fill mode (suppress leading zeroes and padding blanks)
extract(epoch FROM some_interval) produces the number of seconds contained in the interval. You want to truncate fractional minutes? Integer division does just that, so cast to int as demonstrated. Related:
Get difference in minutes between times with timezone
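To illustrate the second point above - not storing duration at all - a minimal sketch that derives both the interval and the minutes on the fly, assuming the same table and varchar columns:
SELECT id
     , end_time::timestamp - start_time::timestamp AS duration_computed
     , extract(epoch FROM end_time::timestamp - start_time::timestamp)::int / 60 AS duration_minutes
FROM tbl;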
The following appears to do what you want:
select v.starttime::timestamp::date, v.endtime::date,
extract(epoch from v.endtime::timestamp - v.starttime::timestamp)/60
from (values ('2018-05-11T00:00:20.631Z', '2018-05-11T01:03:14.496Z')) v(starttime, endtime)
If you want the dates in a particular format, then use to_char().
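For example, a one-liner matching the mm/dd/yyyy rendering from the desired output (format string as in the answer above):
SELECT to_char('2018-05-11T00:00:20.631Z'::timestamp, 'FMMM/DD/YYYY');  -- 5/11/2018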
I have a hive table with two rows like this:
0: jdbc:hive2://localhost:10000/default> select * from t2;
+-----+--------+
| id | value |
+-----+--------+
| 10 | 100 |
| 11 | 101 |
+-----+--------+
2 rows selected (1.116 seconds)
but when I issue a query:
select cast(1 as timestamp) from t2;
it gives inconsistent results; can anyone tell me the reason?
0: jdbc:hive2://localhost:10000/default> select cast(1 as timestamp) from t2;
+--------------------------+
| _c0 |
+--------------------------+
| 1970-01-01 07:00:00.001 |
| 1970-01-01 07:00:00.001 |
+--------------------------+
2 rows selected (0.913 seconds)
0: jdbc:hive2://localhost:10000/default> select cast(1 as timestamp) from t2;
+--------------------------+
| _c0 |
+--------------------------+
| 1970-01-01 08:00:00.001 |
| 1970-01-01 07:00:00.001 |
+--------------------------+
2 rows selected (1.637 seconds)
I can't reproduce your problem; which Hive version are you using? Hive had a bug with timestamp and bigint (see https://issues.apache.org/jira/browse/HIVE-3454), but it doesn't explain your problem. For example, Hive 0.14 gives different results for
SELECT cast(1 as timestamp), cast(cast(1 as double) as timestamp) from my_table limit 5;
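As an aside, if you only need an epoch value rendered as a string, from_unixtime() is worth knowing: it takes seconds since the epoch (the implicit cast above treats the integer as milliseconds) and formats using the server's timezone, so it is subject to the same timezone dependence:
-- from_unixtime() expects seconds, not milliseconds
SELECT from_unixtime(1) FROM t2;  -- e.g. 1970-01-01 07:00:01 on a server at UTC+7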
I'm new to PostgreSQL and am using version 9.4. I have a table with collected measurements stored as strings and need to convert it to a kind of PIVOT table, using something which is always up-to-date, like a VIEW.
Furthermore, some values need to be converted, e.g. multiplied by 1000, as you can see in the example below for "sensor3".
Source Table:
CREATE TABLE source (
id bigint NOT NULL,
name character varying(255),
"timestamp" timestamp without time zone,
value character varying(32672),
CONSTRAINT source_pkey PRIMARY KEY (id)
);
INSERT INTO source VALUES
(15,'sensor2','2015-01-03 22:02:05.872','88.4')
, (16,'foo27' ,'2015-01-03 22:02:10.887','-3.755')
, (17,'sensor1','2015-01-03 22:02:10.887','1.1704')
, (18,'foo27' ,'2015-01-03 22:02:50.825','-1.4')
, (19,'bar_18' ,'2015-01-03 22:02:50.833','545.43')
, (20,'foo27' ,'2015-01-03 22:02:50.935','-2.87')
, (21,'sensor3','2015-01-03 22:02:51.044','6.56');
Source Table Result:
| id | name | timestamp | value |
|----+-----------+---------------------------+----------|
| 15 | "sensor2" | "2015-01-03 22:02:05.872" | "88.4" |
| 16 | "foo27" | "2015-01-03 22:02:10.887" | "-3.755" |
| 17 | "sensor1" | "2015-01-03 22:02:10.887" | "1.1704" |
| 18 | "foo27" | "2015-01-03 22:02:50.825" | "-1.4" |
| 19 | "bar_18" | "2015-01-03 22:02:50.833" | "545.43" |
| 20 | "foo27" | "2015-01-03 22:02:50.935" | "-2.87" |
| 21 | "sensor3" | "2015-01-03 22:02:51.044" | "6.56" |
Desired Final Result:
| timestamp | sensor1 | sensor2 | sensor3 | foo27 | bar_18 |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" | | 88.4 | | | |
| "2015-01-03 22:02:10.887" | 1.1704 | | | -3.755 | |
| "2015-01-03 22:02:50.825" | | | | -1.4 | |
| "2015-01-03 22:02:50.833" | | | | | 545.43 |
| "2015-01-03 22:02:50.935" | | | | -2.87 | |
| "2015-01-03 22:02:51.044" | | | 6560.00 | | |
Using this:
-- CREATE EXTENSION tablefunc;
SELECT *
FROM
crosstab(
'SELECT
source."timestamp",
source.name,
source.value
FROM
public.source
ORDER BY
1'
,
'SELECT
DISTINCT
source.name
FROM
public.source
ORDER BY
1'
)
AS
(
"timestamp" timestamp without time zone,
"sensor1" character varying(32672),
"sensor2" character varying(32672),
"sensor3" character varying(32672),
"foo27" character varying(32672),
"bar_18" character varying(32672)
)
;
I got the result:
| timestamp | sensor1 | sensor2 | sensor3 | foo27 | bar_18 |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:02:05.872" | | | | 88.4 | |
| "2015-01-03 22:02:10.887" | | -3.755 | 1.1704 | | |
| "2015-01-03 22:02:50.825" | | -1.4 | | | |
| "2015-01-03 22:02:50.833" | 545.43 | | | | |
| "2015-01-03 22:02:50.935" | | -2.87 | | | |
| "2015-01-03 22:02:51.044" | | | | | 6.56 |
Unfortunately:
the values aren't assigned to the correct columns;
the columns aren't dynamic, which means the query fails when there is an additional entry in the name column like 'sensor4'; and
I don't know how to change (multiply) the values of some columns.
Your query works like this:
SELECT * FROM crosstab(
$$SELECT "timestamp", name
, CASE name
WHEN 'sensor3' THEN value::numeric * 1000
-- WHEN 'sensor9' THEN value::numeric * 9000 -- add more ...
ELSE value::numeric END AS value
FROM source
ORDER BY 1, 2$$
,$$SELECT unnest('{bar_18,foo27,sensor1,sensor2,sensor3}'::text[])$$
) AS (
"timestamp" timestamp
, bar_18 numeric
, foo27 numeric
, sensor1 numeric
, sensor2 numeric
, sensor3 numeric);
To multiply the value for selected columns, use a "simple" CASE expression. But you need to cast to a numeric type first; the example uses value::numeric.
Which begs the question: why not store value as a numeric type to begin with?
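If the data allows it, the conversion can be done in place - a minimal sketch, assuming every stored value is in fact a valid numeric literal:
ALTER TABLE source
  ALTER COLUMN value TYPE numeric USING value::numeric;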
You need to use the version with two parameters. Detailed explanation:
PostgreSQL Crosstab Query
Truly dynamic cross tabulation tables is next to impossible, since SQL demands to know the result type in advance - at call time at the latest. But you can do something with polymorphic types:
Dynamic alternative to pivot with CASE and GROUP BY
@Erwin: It said "too long by 7128 characters" for a comment! Anyway:
Your post gave me the hints for the right direction, so thank you very much,
but in my case in particular I need it to be truly dynamic. Currently I've got
38886 rows with 49 different items (= columns to be pivoted).
To first answer yours and @Jasen's urgent question:
the source table layout is not up to me; I'm already very happy to get this
data into an RDBMS at all. If it were up to me, I'd always save UTC timestamps!
But there's also a reason for having the data saved as strings: it may contain
various data types, like boolean, integer, float, string, etc.
To avoid confusing myself further, I created a new demo dataset, prefixing the
column names (I know some hate this!) to avoid problems with keywords, and
changing the timestamps (--> minutes) for a better overview:
-- --------------------------------------------------------------------------
-- Create demo table of given schema and insert arbitrary data
-- --------------------------------------------------------------------------
DROP TABLE IF EXISTS table_source;
CREATE TABLE table_source
(
column_id BIGINT NOT NULL,
column_name CHARACTER VARYING(255),
column_timestamp TIMESTAMP WITHOUT TIME ZONE,
column_value CHARACTER VARYING(32672),
CONSTRAINT table_source_pkey PRIMARY KEY (column_id)
);
INSERT INTO table_source VALUES ( 15,'sensor2','2015-01-03 22:01:05.872','88.4');
INSERT INTO table_source VALUES ( 16,'foo27' ,'2015-01-03 22:02:10.887','-3.755');
INSERT INTO table_source VALUES ( 17,'sensor1','2015-01-03 22:02:10.887','1.1704');
INSERT INTO table_source VALUES ( 18,'foo27' ,'2015-01-03 22:03:50.825','-1.4');
INSERT INTO table_source VALUES ( 19,'bar_18','2015-01-03 22:04:50.833','545.43');
INSERT INTO table_source VALUES ( 20,'foo27' ,'2015-01-03 22:05:50.935','-2.87');
INSERT INTO table_source VALUES ( 21,'seNSor3','2015-01-03 22:06:51.044','6.56');
SELECT * FROM table_source;
Furthermore, based on @Erwin's suggestions, I created a view which already
converts the data type. This has the nice feature, besides being fast, of only
adding the required transformations for known items without impacting other
(new) items.
-- --------------------------------------------------------------------------
-- Create view to process source data
-- --------------------------------------------------------------------------
DROP VIEW IF EXISTS view_source_processed;
CREATE VIEW
view_source_processed
AS
SELECT
column_timestamp,
column_name,
CASE LOWER( column_name)
WHEN LOWER( 'sensor3') THEN CAST( column_value AS DOUBLE PRECISION) * 1000.0
ELSE CAST( column_value AS DOUBLE PRECISION)
END AS column_value
FROM
table_source
;
SELECT * FROM view_source_processed ORDER BY column_timestamp DESC LIMIT 100;
This is the desired result of the whole question:
-- --------------------------------------------------------------------------
-- Desired result:
-- --------------------------------------------------------------------------
/*
| column_timestamp | bar_18 | foo27 | sensor1 | sensor2 | seNSor3 |
|---------------------------+---------+---------+---------+---------+---------|
| "2015-01-03 22:01:05.872" | | | | 88.4 | |
| "2015-01-03 22:02:10.887" | | -3.755 | 1.1704 | | |
| "2015-01-03 22:03:50.825" | | -1.4 | | | |
| "2015-01-03 22:04:50.833" | 545.43 | | | | |
| "2015-01-03 22:05:50.935" | | -2.87 | | | |
| "2015-01-03 22:06:51.044" | | | | | 6560 |
*/
This is @Erwin's solution, adapted to the new demo source data. It's perfect
as long as the items (= columns to be pivoted) don't change:
-- --------------------------------------------------------------------------
-- Solution by Erwin, modified for changed demo dataset:
-- http://stackoverflow.com/a/27773730
-- --------------------------------------------------------------------------
SELECT *
FROM
crosstab(
$$
SELECT
column_timestamp,
column_name,
column_value
FROM
view_source_processed
ORDER BY
1, 2
$$
,
$$
SELECT
UNNEST( '{bar_18,foo27,sensor1,sensor2,seNSor3}'::text[])
$$
)
AS
(
column_timestamp timestamp,
bar_18 DOUBLE PRECISION,
foo27 DOUBLE PRECISION,
sensor1 DOUBLE PRECISION,
sensor2 DOUBLE PRECISION,
seNSor3 DOUBLE PRECISION
)
;
When reading through the links @Erwin provided, I found a dynamic SQL example
by @Clodoaldo Neto and remembered that I had already done it this way in
Transact-SQL; this is my attempt:
-- --------------------------------------------------------------------------
-- Dynamic attempt based on:
-- http://stackoverflow.com/a/12989297/131874
-- --------------------------------------------------------------------------
DO $DO$
DECLARE
list_columns TEXT;
BEGIN
DROP TABLE IF EXISTS temp_table_pivot;
list_columns := (
SELECT
string_agg( DISTINCT column_name, ' ' ORDER BY column_name)
FROM
view_source_processed
);
EXECUTE(
FORMAT(
$format_1$
CREATE TEMP TABLE
temp_table_pivot(
column_timestamp TIMESTAMP,
%1$s
)
$format_1$
,
(
REPLACE(
list_columns,
' ',
' DOUBLE PRECISION, '
) || ' DOUBLE PRECISION'
)
)
);
EXECUTE(
FORMAT(
$format_2$
INSERT INTO temp_table_pivot
SELECT
*
FROM crosstab(
$crosstab_1$
SELECT
column_timestamp,
column_name,
column_value
FROM
view_source_processed
ORDER BY
column_timestamp, column_name
$crosstab_1$
,
$crosstab_2$
SELECT DISTINCT
column_name
FROM
view_source_processed
ORDER BY
column_name
$crosstab_2$
)
AS
(
column_timestamp TIMESTAMP,
%1$s
);
$format_2$
,
REPLACE( list_columns, ' ', ' DOUBLE PRECISION, ')
||
' DOUBLE PRECISION'
)
);
END;
$DO$;
SELECT * FROM temp_table_pivot ORDER BY column_timestamp DESC LIMIT 100;
Besides getting this into a stored procedure, I will, for performance reasons,
try to adapt this to an intermediate table where only new values are inserted
(a rough sketch of the idea is below). I'll keep you up-to-date!
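Here is that rough sketch, with a hypothetical permanent table table_pivot assumed to have the same columns as temp_table_pivot above; only rows newer than the latest already-pivoted timestamp get processed (the column list is still static here, so it would have to be combined with the dynamic DO block):
INSERT INTO table_pivot
SELECT *
FROM crosstab(
    $crosstab_1$
    SELECT
        column_timestamp,
        column_name,
        column_value
    FROM
        view_source_processed
    WHERE
        column_timestamp >
            (SELECT COALESCE( MAX( column_timestamp), '-infinity') FROM table_pivot)
    ORDER BY
        1, 2
    $crosstab_1$
    ,
    $crosstab_2$
    SELECT
        UNNEST( '{bar_18,foo27,sensor1,sensor2,seNSor3}'::text[])
    $crosstab_2$
)
AS
(
    column_timestamp TIMESTAMP,
    bar_18 DOUBLE PRECISION,
    foo27 DOUBLE PRECISION,
    sensor1 DOUBLE PRECISION,
    sensor2 DOUBLE PRECISION,
    seNSor3 DOUBLE PRECISION
);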
Thanks!!!
L.
PS: NO, I don't want to answer my own question, but the "comment"-field is too small!