How is deprecated.legacy-timestamp supposed to work in Presto 0.220?

I'm having trouble reading timestamps as stored, without any automatic conversion, in Presto on EMR.
Example: in the AWS Glue catalog, I have a table with timestamp columns in UTC (data type timestamp). When querying in Athena, they return as expected. When querying in Presto on EMR (EMR 5.26, Presto 0.220), an automatic conversion to a different time zone happens.
The Presto docs describe a way of disabling this behavior here: https://prestosql.io/docs/current/language/timestamp.
The legacy semantics can be enabled using the deprecated.legacy-timestamp config property. Setting it to true (the default) enables the legacy semantics, whereas setting it to false enables the new semantics.
The docs illustrate the difference in results with this option set to true vs. false at the bottom of the page:
Query: SELECT TIME '10:00:00 Asia/Kathmandu' AT TIME ZONE 'UTC'
Legacy result: 04:30:00.000 UTC
New result: 04:15:00.000 UTC
After setting deprecated.legacy-timestamp to true in my EMR config (within presto-config), I'm still getting the new result for this test query (and my UTC timestamps are still being auto-converted).
Any suggestions on what else I need to do to enable the legacy timestamp behavior?

Legacy timestamp behavior is still the default; you can track the current state at https://github.com/prestosql/presto/issues/37. Apparently Athena evaluates timestamps the way Presto does when run with a UTC session zone.
Since Presto 317 you can force client session zone with a config property:
sql.forced-session-time-zone=UTC
For all Presto versions, you can set the client session zone. How to do this depends on the particular client in use. For example, with presto-cli you would typically do
java -Duser.timezone=UTC -jar presto-cli.jar
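The docs' example doubles as a quick check of which semantics are active; a minimal sketch, assuming the coordinator was restarted after changing presto-config:
-- Legacy semantics should return 04:30:00.000 UTC; new semantics 04:15:00.000 UTC
SELECT TIME '10:00:00 Asia/Kathmandu' AT TIME ZONE 'UTC';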

Related

Why does time_bucket_gapfill consistently throw a PSQLException at the 10th request?

I have a Spring Boot application with a REST service where I use JdbcTemplate, and a PostgreSQL database with TimescaleDB (version 2.3.0) where the data is stored.
In one of my endpoints, I run the following query to get some timestamps from the database in the client's local time zone:
SELECT time_bucket_gapfill(CAST(:numOfHours * INTERVAL '1 hour' AS INTERVAL),
timestamp AT TIME ZONE 'UTC' AT TIME ZONE :timezone,
:start AT TIME ZONE :timezone, :end AT TIME ZONE :timezone) AS time
FROM info
WHERE timestamp >= :start AND timestamp < :end
GROUP BY time
When I call that specific endpoint, it returns the data perfectly the first 9 times; on the 10th call, it throws the following SQL error:
ERROR: invalid time_bucket_gapfill argument: start must be a simple expression
The TimescaleDB manual states:
Note that explicitly provided start and stop or derived from WHERE clause values need to be simple expressions. Such expressions should be evaluated to constants at the query planning. For example, simple expressions can contain constants or call to now(), but cannot reference to columns of a table.
In other words, these arguments must be constants, so you cannot use a bind parameter here.
The reason this works for the first several executions lies in the way the JDBC driver and PostgreSQL handle these parameters:
for the first 5 executions of the JDBC java.sql.PreparedStatement, the PostgreSQL driver interpolates the parameters into the query and sends a simple query string
from the sixth execution on, the JDBC driver deems it worth creating a named prepared statement in PostgreSQL
during the first five executions of that prepared statement, PostgreSQL generates a custom plan that uses the actual parameter values
only from the sixth execution on, PostgreSQL will consider a generic plan, where the parameters are placeholders
Executing this generic plan causes the TimescaleDB error.
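You can reproduce this directly in psql with a server-side prepared statement; a minimal sketch, assuming an info table with a timestamp column (the exact execution count at which the planner switches may vary):
PREPARE gapfill (timestamp, timestamp) AS
SELECT time_bucket_gapfill(INTERVAL '1 hour', timestamp, $1, $2) AS time
FROM info
WHERE timestamp >= $1 AND timestamp < $2
GROUP BY time;
-- The first five executions get custom plans and succeed:
EXECUTE gapfill('2021-06-01 00:00:00', '2021-06-02 00:00:00');
-- From the sixth execution on, PostgreSQL may choose a generic plan, and
-- time_bucket_gapfill fails with "start must be a simple expression".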
So there are several remedies:
Don't use parameters there. That is the best and most reliable solution. However, then you cannot use a prepared statement, but you have to construct the query string each time (dynamic SQL).
See the JDBC driver documentation:
The driver uses server side prepared statements by default when PreparedStatement API is used. In order to get to server-side prepare, you need to execute the query 5 times (that can be configured via prepareThreshold connection property). An internal counter keeps track of how many times the statement has been executed and when it reaches the threshold it will start to use server side prepared statements.
That allows you to work around the problem by setting prepareThreshold to 0, at the price of worse performance.
Set the PostgreSQL parameter plan_cache_mode to force_custom_plan to avoid the use of generic plans. This could affect your overall performance negatively.
All three solutions reduce the effectiveness of prepared statements, but that's the price you have to pay to work around this limitation of TimescaleDB.
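A minimal sketch of the last two remedies (host and database name are placeholders; plan_cache_mode requires PostgreSQL 12 or later):
jdbc:postgresql://localhost:5432/mydb?prepareThreshold=0
-- or, per session (can also be set with ALTER DATABASE / ALTER ROLE):
SET plan_cache_mode = force_custom_plan;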

PostgreSQL TIMEZONE configuration

I'm having the following problem:
I need to configure the TIMEZONE of my PostgreSQL installation, because from different terminals I'm obtaining different results when converting timestamps to dates.
I have read that the command to change the time zone is: SET TIMEZONE = 'xxx'.
However, from one terminal I can set the parameter without problems, but on the production server, whenever I set the time zone and then query SELECT current_setting('TIMEZONE'); I get UTC (which is not the time zone I set).
It seems to ignore the command and keep the value that was already configured.
Any reason why such behaviour could be occurring? Am I operating under some false assumption?
You must be doing something wrong, like querying from different connections. The SET command changes the setting only for the current database session. Perhaps you are using a connection pool; in that case you will have to set the parameter every time you get a connection from the pool.
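For completeness, a sketch of the session-level command versus persistent alternatives (database and role names are placeholders; the ALTER variants affect new sessions only and require the corresponding privileges):
SET TIMEZONE = 'Europe/Madrid';          -- current session only
SELECT current_setting('TIMEZONE');      -- now reports Europe/Madrid
ALTER DATABASE mydb SET timezone = 'Europe/Madrid';
ALTER ROLE myuser SET timezone = 'Europe/Madrid';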

SQL GetDate() returns wrong time

I am having an issue with GETDATE(): for some reason it is not returning the right time (it is 7 hours ahead of the actual time). I am using Azure, and the database is configured with the right location (West US). I will appreciate any help!
I tried to run this script:
SELECT id,
       status,
       AcceptedDate,
       GETDATE(),
       DATEDIFF(hour, AcceptedDate, GETDATE())
FROM orderoffers
WHERE status = 'Accepted'
Azure SQL Databases are always UTC, regardless of the data center. You'll want to handle time zone conversion at your application.
In this scenario, since you want to compare "now" to a data column, make sure AcceptedDate is also stored in UTC.
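If you only need "now" displayed in local time, AT TIME ZONE (available in Azure SQL Database and SQL Server 2016+) can convert on the fly; a sketch, assuming Pacific time is the zone you want and AcceptedDate is stored in UTC:
SELECT id,
       status,
       AcceptedDate,
       GETUTCDATE() AT TIME ZONE 'UTC' AT TIME ZONE 'Pacific Standard Time' AS NowPacific,
       DATEDIFF(hour, AcceptedDate, GETUTCDATE()) AS HoursSinceAccepted
FROM orderoffers
WHERE status = 'Accepted'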
Reference
The SQL databases on the Azure cloud are pegged to Greenwich Mean Time (GMT), or Coordinated Universal Time (UTC); however, many applications use DateTime.Now, which is the time according to the regional settings specified on the host machine.
This is not an issue when the DateTime is used for display only rather than for time spans or comparisons. However, if you migrate an existing database to SQL Azure with dates populated via GETDATE() or DateTime.Now, you will have an offset; in your case it's 7 hours during Daylight Saving Time or 8 hours during Standard Time.
I created a simple function that returns the correct UK time whether in DST or not.
It can be adapted for other time zones where DST kicks in.
CREATE FUNCTION [dbo].[f_CurrentDateTime]() RETURNS DATETIME AS
BEGIN
    RETURN DATEADD(HOUR,
                   CONVERT(INT, (SELECT is_currently_dst
                                 FROM sys.time_zone_info
                                 WHERE name = 'GMT Standard Time')),
                   GETDATE())
END
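Note that sys.time_zone_info requires SQL Server 2016 or Azure SQL Database. Usage is then simply:
SELECT dbo.f_CurrentDateTime();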
In these modern times, where infrastructure is scaled globally, it is a good idea to store data in UTC and convert it to a time zone based on the user's location preference.
Please refer to: https://learn.microsoft.com/en-us/dotnet/api/system.datetime.utcnow?view=netframework-4.7.2

SQL may UTC +8 hour issue

........
Where
(microsdb.MENU_ITEM_DETAIL.CheckDetailID = microsdb.CHECK_DETAIL.CheckDetailID Or
microsdb.DISCOUNT_DETAIL.CheckDetailID = microsdb.CHECK_DETAIL.CheckDetailID) And
microsdb.CHECKS.CheckOpen = CONVERT(CHAR(23), CURRENT_TIMESTAMP, 25)
This returns no result.
Field data type: microsdb.CHECKS.CheckOpen (datetime, not null)
Sample value: CheckOpen = 2013-04-08 06:29:26.000
I wonder why my CheckOpen time is always 8 hours earlier than my server time.
Please advise.
Thanks
More than likely, when you stored data into the CheckOpen column of your CHECKS table, you parsed it (or read it) directly from a client machine or client interface using its US/Pacific time zone.
Later, when you read CURRENT_TIMESTAMP from your DB server, you got the system time for that machine in UTC (since the machine was set up to use UTC by your server admin).
So, the two times are 8 hours off. UTC (GMT) is 8 hours ahead of US/Pacific.
Generally, if a client machine gives you data, you need to parse it, validate it, and sometimes translate it to valid server values, or at least be aware that what's stored is only a "client" value. For date/time values, either convert to UTC or store the offset along with the time. (Actually, it can be good to store the offset even after converting to UTC.)
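For example, if CheckOpen stays in local time, you can shift the server's UTC clock to US/Pacific before comparing; a sketch assuming SQL Server 2016+ (the time zone name is an assumption):
-- Convert the server's UTC "now" to US/Pacific (DST handled automatically):
SELECT CONVERT(datetime,
               SYSUTCDATETIME() AT TIME ZONE 'UTC' AT TIME ZONE 'Pacific Standard Time') AS NowPacific
-- ...which could then replace CURRENT_TIMESTAMP in the WHERE clause above.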

Having an issue with timestamps from my Oracle server being in a different format on different clients despite identical software

Having an issue with VB.NET clients interpreting timestamps from my Oracle 11g server differently. I am using VS2010, VB.NET, the 11.2.0.30 driver, and Windows XP on the client machines.
The software is identical on both machines, and the users have the same permissions on the server.
When I log in through Toad with each user's credentials and run SELECT SYSDATE FROM DUAL; I get the YYYY-MM-DD HH24:MI:SS format.
Through my application, when I run SELECT SYSDATE FROM DUAL; I get:
PC1: DD/MM/YYYY HH24:MI:SS
PC2: YYYY-MM-DD HH24:MI:SS
I did change my server-side settings to use the YYYY-MM-DD HH24:MI:SS format.
Why is the setting being overridden on only some of the PCs, and how can I get it consistent for all clients?
Thanks in advance.
Each client has its own NLS_DATE_FORMAT setting, which might be set somewhere visible (e.g. in Toad), might be defaulted by the platform or driver, or might be inherited indirectly through locale settings. Having NLS_TIMESTAMP_FORMAT overridden by default is less common, I think, but I'm not sure whether you're referring to Oracle TIMESTAMP or to DATE, which also has a time component.
The client setting always overrides the server setting, so changing the server is unlikely to help, and you might just have some clients that are already set with a matching format. The client setting can itself be explicitly overridden after you connect at session level with an alter session command, or individually in SQL statements. The precedence is shown in the globalization support guide.
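For example, the application could pin the format it expects immediately after connecting (a sketch; the timestamp mask is only needed if TIMESTAMP columns are involved):
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';
ALTER SESSION SET NLS_TIMESTAMP_FORMAT = 'YYYY-MM-DD HH24:MI:SS.FF';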
In general you should never rely on implicit data conversions. Other than in simple ad hoc queries, dates are normally selected for display using an explicit date format mask, e.g. select to_char(sysdate, 'YYYY-MM-DD HH24:MI:SS') from dual. If you have an application that expects a date to be returned (as a string) in a particular format, then it's safer to do that... though normally you'd pull a DATE back and let the client do the formatting anyway.
It's arguably even more important to always specify a format mask when inserting data from a string with a to_date() call, but again your client would normally handle that and pass a DATE.
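For instance (the events table and its event_date column are placeholders):
-- Explicit mask on the way in, explicit mask on the way out:
INSERT INTO events (event_date)
VALUES (to_date('2013-04-08 06:29:26', 'YYYY-MM-DD HH24:MI:SS'));
SELECT to_char(event_date, 'YYYY-MM-DD HH24:MI:SS') FROM events;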