Group by day from nanosecond timestamp - sql

I have a table column transaction_timestamp storing timestamps as epochs with nanosecond resolution.
How do I group and/or count by day? I guess I have to convert the nanosecond timestamp to milliseconds first. How can I do that?
I tried:
SELECT DATE_TRUNC('day', CAST((transaction_timestamp /pow(10,6))as bigint)), COUNT(*)
FROM transaction
GROUP BY DATE_TRUNC('day', transaction_timestamp)
which does not work:
error: function date_trunc(unknown, bigint) does not exist
I also tried this:
SELECT DATE_TRUNC('day', to_timestamp(transaction_timestamp / 1000000000.0)),
COUNT(*)
FROM transaction
GROUP BY DATE_TRUNC('day', transaction_timestamp)

Basic conversion as instructed here:
What kind of datestyle can this be?
Repeat the same expression in GROUP BY, or use a simple positional reference, like:
SELECT date_trunc('day', to_timestamp(transaction_timestamp / 1000000000.0))
, count(*)
FROM transaction
GROUP BY 1;
Be aware that to_timestamp() assumes UTC time zone for the given epoch to produce a timestamp with time zone (timestamptz). The following date_trunc() then uses the timezone setting of your current session to determine where to truncate "days". You may want to define a certain time zone explicitly ...
Basics:
Ignoring time zones altogether in Rails and PostgreSQL
Typically, it's best to work with a proper timestamptz to begin with. Unfortunately, Postgres timestamps only offer microsecond resolution. Since you need nanoseconds, your approach seems justified.
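The day-bucketing logic can be cross-checked outside the database. What follows is a minimal Python sketch, not part of the original answer: the names count_by_day and to_nanos, the sample values, and the fixed UTC+5 offset are all illustrative assumptions. It drops the sub-second part with integer division (the counterpart of dividing by 10^9 above) and then picks the calendar day in an explicitly chosen time zone, which shows why the session time zone matters for where "days" are cut.

```python
from collections import Counter
from datetime import date, datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000

def count_by_day(epoch_nanos, tz=timezone.utc):
    """Count nanosecond epoch values per calendar day in time zone tz."""
    counts = Counter()
    for ns in epoch_nanos:
        # Integer division avoids float rounding; the nanosecond remainder
        # cannot change which day a value falls on.
        seconds = ns // NANOS_PER_SECOND
        counts[datetime.fromtimestamp(seconds, tz).date()] += 1
    return dict(counts)

def to_nanos(dt):
    """Hypothetical helper: build a nanosecond epoch from an aware datetime."""
    return int(dt.timestamp()) * NANOS_PER_SECOND

# Two made-up instants, two hours apart, on either side of midnight UTC.
sample = [
    to_nanos(datetime(2021, 1, 1, 23, 0, tzinfo=timezone.utc)) + 123,
    to_nanos(datetime(2021, 1, 2, 1, 0, tzinfo=timezone.utc)) + 456,
]

utc_counts = count_by_day(sample)                                      # two separate days
plus5_counts = count_by_day(sample, timezone(timedelta(hours=5)))      # one shared day
print(utc_counts)
print(plus5_counts)
```

The same two rows land in two buckets when days are cut in UTC and in one bucket when days are cut at UTC+5, so it is worth fixing the time zone explicitly before grouping.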

Related

NUMTODSINTERVAL in Redshift. Convert a number to hours

My goal is to offset timestamps in table Date_times to reflect local timezones. I have a Timezone_lookup table that I use for that, which has a column utc_convert and its values are (2, -1, 5, etc.) depending on the timezone.
I used to use NUMTODSINTERVAL in Oracle to be able to convert the utc_convert values to hours so I can add/subtract from the datetimes in the Date_times table.
For Redshift I found INTERVAL, but that's only hardcoding the offset with a specific number.
I also tried:
SELECT CAST(utc as TIME)
FROM(
SELECT *
,to_char(cast(utc_convert as int)||':00:00', 'HH24') as utc
from Timezone_lookup
)
But this doesn't work, as some numbers in the utc_convert column are negative. Any ideas?
Have you tried multiplying the offset by an interval:
select current_timestamp + utc_convert * interval '1 hour'
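The same idea, scaling a one-hour unit by a signed number, can be sketched in Python to see that negative offsets need no special handling. The function name shift_by_hours and the sample values are assumptions for illustration only.

```python
from datetime import datetime, timedelta

def shift_by_hours(ts, utc_convert):
    """Shift ts by utc_convert hours; negative values move it backwards."""
    # Multiplying a one-hour timedelta by the offset mirrors
    # `utc_convert * interval '1 hour'` in the query above.
    return ts + utc_convert * timedelta(hours=1)

base = datetime(2020, 1, 1, 0, 0)
print(shift_by_hours(base, 2))    # two hours later
print(shift_by_hours(base, -1))   # one hour earlier, crossing into the previous day
```

A fixed numeric offset ignores daylight saving time; this only reproduces the arithmetic, not time zone rules.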
In Oracle, you can use the time zone of the user's session (which means you do not need to maintain a table of time zone look-ups or compensate for daylight saving time).
SELECT FROM_TZ( your_timestamp_column, 'UTC' ) AT LOCAL
FROM Date_times
SQLFIDDLE
In Redshift you should be able to use the CONVERT_TIMEZONE( ['source_timezone',] 'target_timezone', 'timestamp') function rather than adding a number of intervals. This would allow you to specify the target_timezone as a numeric offset from UTC or as a time zone name (which would automatically compensate for DST).

google bigquery select from a timestamp column between now and n days ago

I have a dataset in bigquery with a TIMESTAMP column "register_date" (sample value "2017-11-19 22:45:05.000 UTC" ).
I need to filter records based on x days or weeks before today criteria.
Example query
select all records which are 2 weeks old.
Currently I have this query (which feels like a kind of hack) that works and returns the correct results:
SELECT * FROM `my-pj.my_dataset.sample_table`
WHERE
(SELECT
CAST(DATE(register_date) AS DATE)) BETWEEN DATE_ADD(CURRENT_DATE(), INTERVAL -150 DAY)
AND CURRENT_DATE()
LIMIT 10
My question is: do I have to use all that CASTing on a TIMESTAMP column (which seems to overcomplicate an otherwise simple query)?
If I remove the CASTing part, my query doesn't run and returns an error.
Here is my simplified query
SELECT
*
FROM
`my-pj.my_dataset.sample_table`
WHERE
register_date BETWEEN DATE_ADD(CURRENT_DATE(), INTERVAL -150 DAY)
AND CURRENT_DATE()
LIMIT
10
that results in an error
Query Failed
Error: No matching signature for operator BETWEEN for argument types: TIMESTAMP, DATE, DATE. Supported signature: (ANY) BETWEEN (ANY) AND (ANY) at [6:17]
Any insight is highly appreciated.
Use timestamp functions:
SELECT t.*
FROM `my-pj.my_dataset.sample_table` t
WHERE register_date BETWEEN TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -150 DAY) AND CURRENT_TIMESTAMP()
LIMIT 10;
BigQuery has three data types for date/time values: date, datetime, and timestamp. These are not mutually interchangeable. The basic idea is:
Dates have no time component and no timezone.
Datetimes have a time component and no timezone.
Timestamp has both a time component and a timezone. In fact, it represents the value in UTC.
INTERVAL values are defined in gcp documentation
Conversion between the different values is not automatic. Your error message suggests that register_date is really stored as a Timestamp.
One caveat (from personal experience): the definition of day is based on UTC. This is not much of an issue if you are in London. It can be a bigger issue if you are in another time zone and you want the definition of "day" to be based on the local time zone. If that is an issue for you, ask another question.
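The point about keeping both sides of the comparison in the same type can be illustrated outside BigQuery. The sketch below is a Python analogy, not BigQuery code: the function name registered_within and the fixed "now" are assumptions chosen so the example is repeatable. The cutoff is computed as a UTC timestamp, so it is compared against a timestamp and never against a plain date.

```python
from datetime import datetime, timedelta, timezone

def registered_within(register_date, days, now):
    """True if the aware timestamp register_date lies in [now - days, now]."""
    # Both bounds are timestamps, like TIMESTAMP_ADD(...) and
    # CURRENT_TIMESTAMP() in the query above.
    cutoff = now - timedelta(days=days)
    return cutoff <= register_date <= now

# A fixed "now" (hypothetical) instead of the real clock.
now = datetime(2018, 1, 1, tzinfo=timezone.utc)
recent = datetime(2017, 11, 19, 22, 45, 5, tzinfo=timezone.utc)  # sample value from the question
old = datetime(2017, 1, 1, tzinfo=timezone.utc)

print(registered_within(recent, 150, now))
print(registered_within(old, 150, now))
```

As in the query, the window here is 150 times 24 hours counted back from the current instant in UTC, not 150 local calendar days.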

In Postgres, how do you extract the month (according to specific timezone) from a given TIMESTAMP WITH TIME ZONE column?

I have a column called login_timestamp, which is of type TIMESTAMP WITH TIME ZONE.
To retrieve the month for this timestamp, I would do: EXTRACT(MONTH FROM login_timestamp).
However, I would like to retrieve the month for a specific time zone (in my case, Pakistan), but can't figure out how to do that.
Documentation for this is under Date/Time Functions and Operators. Search that page for "at time zone".
select extract(month from login_timestamp at time zone 'Asia/Karachi');
You can change the time zone for a single session or for a single transaction with set session... or set local.... For example, this changes the time zone for the current session.
set session time zone 'Asia/Karachi';
Use the AT TIME ZONE construct:
SELECT EXTRACT(MONTH FROM login_timestamp AT TIME ZONE '-5');
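-5 is the constant offset for Pakistan. Note the sign: Postgres reads a bare offset string like this POSIX-style, where negative values lie east of Greenwich, so '-5' means UTC+5.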
Details:
Ignoring timezones altogether in Rails and PostgreSQL
Try applying AT TIME ZONE. Demo
select extract(month from cast ('2017-07-01 01:00+03' as TIMESTAMP WITH TIME ZONE) AT TIME ZONE '+08') as monthNo
returns
monthno
6
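The demo value can be cross-checked with a short Python sketch. The function name month_at_offset is an assumption for illustration. Python's timezone uses the ISO sign convention (positive means east of Greenwich), which is the opposite of the POSIX-style offset strings passed to AT TIME ZONE above, so '+08' in the demo corresponds to -8 here and Pakistan corresponds to +5.

```python
from datetime import datetime, timedelta, timezone

def month_at_offset(ts, utc_offset_hours):
    """Month of the aware timestamp ts as seen at a fixed UTC offset.

    utc_offset_hours follows the ISO convention: positive is east of Greenwich.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    return ts.astimezone(tz).month

# Same instant as the demo: 2017-07-01 01:00 at UTC+3, i.e. 2017-06-30 22:00 UTC.
ts = datetime(2017, 7, 1, 1, 0, tzinfo=timezone(timedelta(hours=3)))

print(month_at_offset(ts, -8))  # 6, matching the demo result
print(month_at_offset(ts, 5))   # 7, the month in Pakistan (UTC+5)
```

A fixed offset is enough for Pakistan, which currently has no daylight saving time; for zones with DST a named zone such as 'Asia/Karachi' is the safer choice.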

How to get the date and time from timestamp in PostgreSQL select query?

How to get the date and time only up to minutes, not seconds, from timestamp in PostgreSQL. I need date as well as time.
For example:
2000-12-16 12:21:13-05
From this I need
2000-12-16 12:21 (no seconds or milliseconds, only the date and the time in hours and minutes)
From a timestamp with time zone field, say update_time, how do I get date as well as time like above using PostgreSQL select query.
Please help me.
There are plenty of date-time functions available with postgresql:
See the list here
http://www.postgresql.org/docs/9.1/static/functions-datetime.html
e.g.
SELECT EXTRACT(DAY FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 16
For formatting you can use these:
http://www.postgresql.org/docs/9.1/static/functions-formatting.html
e.g.
select to_char(current_timestamp, 'YYYY-MM-DD HH24:MI') ...
To get the date from a timestamp (or timestamptz) a simple cast is fastest:
SELECT now()::date
You get the date according to your local time zone either way.
If you want text in a certain format, go with to_char() like @davek provided.
If you want to truncate (round down) the value of a timestamp to a unit of time, use date_trunc():
SELECT date_trunc('minute', now());
This should be enough:
select now()::date, now()::time
, pg_typeof(now()), pg_typeof(now()::date), pg_typeof(now()::time)
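The difference between truncating a timestamp and formatting it as text, which the answers above cover with date_trunc() and to_char(), can be sketched in Python. The variable names are assumptions; the sample value is the one from the question.

```python
from datetime import datetime, timedelta, timezone

# 2000-12-16 12:21:13-05 from the question.
ts = datetime(2000, 12, 16, 12, 21, 13, tzinfo=timezone(timedelta(hours=-5)))

# Truncation keeps a timestamp value, like date_trunc('minute', ...).
truncated = ts.replace(second=0, microsecond=0)

# Formatting produces text, like to_char(..., 'YYYY-MM-DD HH24:MI').
formatted = ts.strftime("%Y-%m-%d %H:%M")

print(truncated.isoformat())
print(formatted)  # 2000-12-16 12:21
```

Truncating is the better fit when the value is still needed for sorting or arithmetic; formatting is for display only.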

Generating time series between two dates in PostgreSQL

I have a query like this that nicely generates a series of dates between 2 given dates:
select date '2004-03-07' + j - i as AllDate
from generate_series(0, extract(doy from date '2004-03-07')::int - 1) as i,
generate_series(0, extract(doy from date '2004-08-16')::int - 1) as j
It generates 162 dates between 2004-03-07 and 2004-08-16, and this is what I want. The problem with this code is that it doesn't give the right answer when the two dates are from different years, for example when I try 2007-02-01 and 2008-04-01.
Is there a better solution?
Can be done without conversion to/from int (but to/from timestamp instead)
SELECT date_trunc('day', dd):: date
FROM generate_series
( '2007-02-01'::timestamp
, '2008-04-01'::timestamp
, '1 day'::interval) dd
;
To generate a series of dates this is the optimal way:
SELECT t.day::date
FROM generate_series(timestamp '2004-03-07'
, timestamp '2004-08-16'
, interval '1 day') AS t(day);
Additional date_trunc() is not needed. The cast to date (day::date) does that implicitly.
But there is also no point in casting date literals to date as input parameter. Au contraire, timestamp is the best choice. The advantage in performance is small, but there is no reason not to take it. And you do not needlessly involve DST (daylight saving time) rules coupled with the conversion from date to timestamp with time zone and back. See below.
Equivalent, less explicit short syntax:
SELECT day::date
FROM generate_series(timestamp '2004-03-07', '2004-08-16', '1 day') day;
Or with the set-returning function in the SELECT list:
SELECT generate_series(timestamp '2004-03-07', '2004-08-16', '1 day')::date AS day;
The AS keyword is required in the last variant; Postgres would misinterpret the column alias day otherwise. And I would not advise that variant before Postgres 10, at least not with more than one set-returning function in the same SELECT list:
What is the expected behaviour for multiple set-returning functions in SELECT clause?
(That aside, the last variant is typically fastest by a tiny margin.)
Why timestamp [without time zone]?
There are a number of overloaded variants of generate_series(). Currently (Postgres 11):
SELECT oid::regprocedure AS function_signature
, prorettype::regtype AS return_type
FROM pg_proc
where proname = 'generate_series';
function_signature | return_type
:-------------------------------------------------------------------------------- | :--------------------------
generate_series(integer,integer,integer) | integer
generate_series(integer,integer) | integer
generate_series(bigint,bigint,bigint) | bigint
generate_series(bigint,bigint) | bigint
generate_series(numeric,numeric,numeric) | numeric
generate_series(numeric,numeric) | numeric
generate_series(timestamp without time zone,timestamp without time zone,interval) | timestamp without time zone
generate_series(timestamp with time zone,timestamp with time zone,interval) | timestamp with time zone
(The numeric variants were added with Postgres 9.5.) The relevant ones are the last two, taking and returning timestamp / timestamptz.
There is no variant taking or returning date. An explicit cast is needed to return date. The call with timestamp arguments resolves to the best variant directly without descending into function type resolution rules and without additional cast for the input.
timestamp '2004-03-07' is perfectly valid, btw. The omitted time part defaults to 00:00 with ISO format.
Thanks to function type resolution we can still pass date. But that requires more work from Postgres. There is an implicit cast from date to timestamp as well as one from date to timestamptz. Would be ambiguous, but timestamptz is "preferred" among "date/time types". So the match is decided at step 4d.:
Run through all candidates and keep those that accept preferred types
(of the input data type's type category) at the most positions where
type conversion will be required. Keep all candidates if none accept
preferred types. If only one candidate remains, use it; else continue
to the next step.
In addition to the extra work in function type resolution this adds an extra cast to timestamptz - which not only adds more cost, it can also introduce problems with DST leading to unexpected results in rare cases. (DST is a moronic concept, btw, can't stress this enough.) Related:
How do I generate a date series in PostgreSQL?
How do I generate a time series in PostgreSQL?
I added demos to the fiddle showing the more expensive query plan:
db<>fiddle here
Related:
Is there a way to disable function overloading in Postgres
Generate series of dates - using date type as input
Postgres data type cast
You can generate series directly with dates. No need to use ints or timestamps:
select date::date
from generate_series(
'2004-03-07'::date,
'2004-08-16'::date,
'1 day'::interval
) date;
You can also use this.
select generate_series ( '2012-12-31'::timestamp , '2018-10-31'::timestamp , '1 day'::interval) :: date
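For comparison, the same inclusive day-by-day series can be sketched in plain Python. The generator name date_series is an assumption for illustration, and step_days is assumed to be positive. Because it works on calendar dates only, no time zone or DST rule is involved, and year boundaries (the case that broke the query in the question) need no special handling.

```python
from datetime import date, timedelta

def date_series(start, stop, step_days=1):
    """Yield dates from start to stop inclusive, step_days apart (step_days > 0)."""
    current = start
    while current <= stop:
        yield current
        current += timedelta(days=step_days)

# The cross-year range from the question.
days = list(date_series(date(2007, 2, 1), date(2008, 4, 1)))
print(days[0], days[-1], len(days))
```

Both endpoints are included, which matches the behaviour of generate_series() when the stop value falls exactly on a step.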