Standard SQL - not able to pull actual UTC time in days/hours - sql

I am having trouble pulling the below data in actual UTC time. I would like the day, hour, and ticket count, but for some reason the results do not appear to be in UTC (the data doesn't make sense). The output also includes a few hours from 7-15 and only goes as far as 8-25.
Does it have to do with a setting that I'm not aware of? Any ideas would be greatly appreciated!
Query below:
SET timezone "UTC";
SELECT
day,
CONCAT(cast(hour as STRING),':00') as hour,
COUNT(DISTINCT units) as count
FROM
tableABC
WHERE
created.timestamp BETWEEN TIMESTAMP("2018-07-16 00:00:00 UTC")
AND TIMESTAMP("2018-08-26 00:00:00 UTC")

I figured it out:
FORMAT_TIMESTAMP('%Y-%m-%d',timestamp, "UTC") as day,
CONCAT(FORMAT_TIMESTAMP('%H',timestamp, "UTC"),':00') as hour,
Thank you Zaynul.
The DBMS was BigQuery.

Related

Date_diff with specific condition time start and time end

Is it possible to use date_diff with a specific start and end time?
Let's say my store is open from 8AM to 10PM, which is 14 hours.
I have a lot of stock to sell during that time. One of the SKUs is out of stock from 2022-11-01 06.00 PM until the next day, 2022-11-02 11.00 AM.
Instead of counting all 24 hours of the day, I want to count only the time the store is open, from opening until closing or until the restock. That means 6PM to 11AM is 7 hours (6PM-10PM plus 8AM-11AM).
My query:
select date_diff('2022-11-02 11.00 AM', '2022-11-01 06.00 PM', hour) from table
The result is 17 hours instead of 7 hours.
There isn't a way to configure DATE_DIFF to do this for you, but it's possible to do what you want, with some effort.
You should convert your dates to timestamps (TIMESTAMP(yourdate) or CAST(yourdate AS TIMESTAMP)) and use TIMESTAMP_DIFF instead.
This will allow you to work with smaller intervals than days.
For your calculation, you ultimately need to find the total time difference between the two timestamps and then subtract the out-of-hours timeframe.
However, calculating the latter is not as simple as taking the difference in days and multiplying by 10 hours (10pm-8am), because your out-of-hours calculation has to account for weekends and possibly holidays etc. Hence it can get quite complex, which is where the solution in my first link might come in.
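An equivalent way to look at the calculation is to add up only the part of each day that overlaps the opening window, instead of subtracting the closed hours. The following is a minimal Python sketch of that logic, not BigQuery SQL. The function name open_seconds and its parameters are hypothetical, and it assumes the same 8AM-10PM hours every day with no weekends or holidays excluded.

```python
from datetime import datetime, time, timedelta

def open_seconds(start, end, opens=time(8, 0), closes=time(22, 0)):
    """Seconds between start and end that fall inside the daily opening window.

    Assumption: the same opening hours apply every day (no weekends/holidays).
    """
    total = timedelta(0)
    day = start.date()
    while day <= end.date():
        window_start = datetime.combine(day, opens)
        window_end = datetime.combine(day, closes)
        # Clip the out-of-stock interval to this day's opening window.
        overlap_start = max(start, window_start)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            total += overlap_end - overlap_start
        day += timedelta(days=1)
    return total.total_seconds()

out_of_stock = datetime(2022, 11, 1, 18, 0)  # 2022-11-01 06.00 PM
restocked = datetime(2022, 11, 2, 11, 0)     # 2022-11-02 11.00 AM
hours = open_seconds(out_of_stock, restocked) / 3600
print(hours)  # 7.0 (4 hours before closing plus 3 hours after opening)
```

Skipping weekends or holidays would mean leaving those dates out of the loop, which is the part that tends to make the SQL version complex.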

Counting variable by day and time with day and time in column header

I am looking to write a query that gives a count of purchases grouped by day and hour (from variable of date and time of purchase).
However, column headers should contain the date and hour as such:
ID      Tuesday, 11-12   Tuesday, 12-13
Xxxxx   4                6
Xxxxx   1                8
Variables include ID, Date of purchase (DD-MM-YY timestamp), QTY
Having done some reading, I am not entirely convinced this is possible, but I am unsure and could be misinformed.
Thanks for your help in advance. Any suggestions would be greatly appreciated.
If any more information is needed, please let me know.
If you're using SQL Server, you can solve this with the PIVOT clause. Derive the unique day-and-hour values you want in the query, then use PIVOT to spread them across columns. I'd provide an example but I don't have the time at the moment.
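To make the target shape concrete, here is a small Python sketch of the same two steps (derive the distinct day/hour buckets, then spread the counts across columns). It is not SQL Server PIVOT syntax. The sample rows, the IDs, and the helper bucket_label are hypothetical, and the timestamp format "DD-MM-YY HH:MM" is an assumption based on the question.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: (ID, purchase timestamp as "DD-MM-YY HH:MM").
purchases = [
    ("A1", "12-11-19 11:05"),
    ("A1", "12-11-19 11:40"),
    ("A1", "12-11-19 12:15"),
    ("B2", "12-11-19 11:30"),
    ("B2", "12-11-19 12:01"),
    ("B2", "12-11-19 12:45"),
]

def bucket_label(ts):
    """Column header such as 'Tuesday, 11-12' for the hour containing ts."""
    return f"{ts.strftime('%A')}, {ts.hour}-{ts.hour + 1}"

counts = Counter()
first_seen = {}  # label -> earliest timestamp, used to keep columns chronological
for row_id, raw in purchases:
    ts = datetime.strptime(raw, "%d-%m-%y %H:%M")
    label = bucket_label(ts)
    counts[(row_id, label)] += 1
    first_seen[label] = min(ts, first_seen.get(label, ts))

# Step 1: the distinct buckets become the column headers.
columns = sorted(first_seen, key=first_seen.get)
# Step 2: one row per ID, one count per bucket (0 when there were no purchases).
ids = sorted({row_id for row_id, _ in counts})
header = ["ID"] + columns
table = [[row_id] + [counts[(row_id, col)] for col in columns] for row_id in ids]

print(header)
for row in table:
    print(row)
```

Because the column list depends on the data, a pure SQL version generally has to build the column list first and then run the pivot as a second, generated statement.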

Incorrect date difference in seconds in Hive

I am trying to calculate the difference in seconds between 2 dates in Hive. I found that one of the records is being calculated incorrectly, and I can't understand why or how to fix it.
The example is as follows:
select '2020-03-08 03:00:48' as stop_time,
UNIX_TIMESTAMP('2020-03-08 03:00:48') as stop_timestamp,
'2020-03-08 02:45:03' as start_time,
UNIX_TIMESTAMP('2020-03-08 02:45:03') as start_timestamp,
UNIX_TIMESTAMP('2020-03-08 03:00:48') - UNIX_TIMESTAMP('2020-03-08 02:45:03') as difference
I am getting a result of -2,655 instead of +945.
Any advice?
Thank you!
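This happens because of the daylight saving time change. The time zone you're running in switched to daylight saving time on 8 March 2020 during that same hour, so the local time 02:45:03 falls inside the hour that was skipped. As a result, a different timestamp is calculated for start_timestamp than you expect: the result is off by exactly one hour (945 - 3,600 = -2,655).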
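The arithmetic can be checked with a short Python sketch. It reads both strings as UTC, where no clock change applies, and then models the skipped hour by hand. The helper epoch_utc is hypothetical, and the one-hour forward shift of the nonexistent local time is an assumption that happens to reproduce the number in the question; it is not a statement about how Hive resolves such times internally.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"

def epoch_utc(text):
    """Epoch seconds when the string is interpreted as UTC (no DST rules)."""
    parsed = datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

stop = epoch_utc("2020-03-08 03:00:48")
start = epoch_utc("2020-03-08 02:45:03")

# Interpreted as UTC, the difference is the expected 15 minutes 45 seconds.
expected = stop - start

# Assumption: in the local zone 02:00-03:00 does not exist that night, and the
# nonexistent 02:45:03 is treated as one hour later (03:45:03).
shifted_start = start + 3600
observed = stop - shifted_start

print(expected)  # 945
print(observed)  # -2655
```

Doing the subtraction on values that are interpreted in UTC (or in any zone without daylight saving time) avoids the skipped hour entirely.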

Simple Averaging Algorithm is Slightly Off. Why? Active Record/PostgreSQL issue?

In my Rails app, I have two custom Rake Tasks running every 30 minutes. Task A scrapes hourly prices from the internet and saves them to a database as HourlyPrice. Task B goes into the db, takes hourly prices from each day for the last seven days, and averages them to create a new DailyAveragePrice record in a separate DB Table.
However, when running Task B, the last day's (of the seven) average price is incorrect.
After fiddling with the hourly prices of that day in an Excel spreadsheet, I see that the average price Task B is generating is the result of taking only the last three hours and averaging them.
Task B is mostly done with this single query:
averages = HourlyPrice.where('date >= ?', 7.days.ago).average(:price, :group => "DATE_TRUNC('day', date - INTERVAL '1 hour')")
I can't figure out why this is happening.
Clues
1. HourlyPrice has two attributes (datetime,price). Each HourlyPrice actually represents a price for the previous hour. So, source data lists a 24:00:00 price for each day which PostgreSQL does not want to import as is into a datetime column. Instead, it converts all 24:00:00 prices to 00:00:00 of the next day. To make up for this, I've tried to subtract an hour interval, as you can see in the query. Is this causing the problem?
2. My ActiveRecord's time zone is currently set to 'Mountain Time (US & Canada)'. That is where the price exchange is located. I have not adjusted my PostgreSQL DB's timezone, and I believe it defaults to UTC. When running Task B, I noticed that it was 9:20PM UTC, leaving three hours left in the UTC day, which might explain the averaging of only three HourlyPrices of the last of the seven days. I'll try running Task B again in the next hour to see if it will average only two hours. Update to come... Is this timezone conflict causing a problem, or is what I am doing insulated from timezones since I have my own date columns?
UPDATE - Problem identified, but how to fix?
Clue #2 is correct. It is a timezone issue. I just ran Task B again (an hour later, with 2 hours left until UTC day change), and it only averages two HourlyPrices now for the last of the seven days.
How can I fix my query above to average ONLY if there are 24 HourlyPrice records available?
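One way to think about the "only if there are 24 records" rule is as a filter applied after grouping. The sketch below shows that logic in plain Python, not ActiveRecord. The function daily_averages and the sample data are hypothetical, and it assumes the timestamps are already expressed in the exchange's local time. In SQL the analogous filter on a grouped query would be a HAVING COUNT(*) = 24 clause.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

def daily_averages(hourly_prices, required=24):
    """Average prices per day, keeping only days with `required` records.

    hourly_prices: iterable of (hour-ending local datetime, price).
    """
    by_day = defaultdict(list)
    for stamp, price in hourly_prices:
        # Each price covers the previous hour, so 00:00 belongs to the prior day.
        day = (stamp - timedelta(hours=1)).date()
        by_day[day].append(price)
    return {
        day: sum(prices) / len(prices)
        for day, prices in by_day.items()
        if len(prices) == required
    }

# Hypothetical data: one complete day (24 prices) and one partial day (3 prices).
complete_day = [
    (datetime(2010, 7, 20) + timedelta(hours=h), float(h)) for h in range(1, 25)
]
partial_day = [
    (datetime(2010, 7, 21) + timedelta(hours=h), 100.0) for h in range(1, 4)
]
averages = daily_averages(complete_day + partial_day)
print(averages)  # only 2010-07-20 appears, with average 12.5
```

The partial day is dropped rather than averaged, so a run in the middle of the day never produces a misleading figure for the current day.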

A Database DateTime Value Conflict

I have hourly price data for 10 years. Meaning, 24 prices for each day.
The problem is, the price is from the previous hour of trading. So, the source of my data has listed a 24th hour for each day, and there is no 0 hour.
Example (for further clarity):
The records for a day start at: 07/20/2010 01:00:00
The records for a day end at: 07/20/2010 24:00:00
This conflicts with the way my Rails app's PostgreSQL DB wants to save DateTime values. When I imported this data from CSV into my DB and saved the dates into a DateTime column, it changed all of the 24:00:00 values into 00:00:00 of the following day. This throws off the accuracy of my various end uses.
Is there any way I can modify my Postgres DB's behavior to not do this? Any other suggestions?
You could always subtract an hour after you perform the import.
I don't know your database schema so to do this in a general fashion you'd have to execute this SQL on each column that has a date.
UPDATE table SET date_field = date_field - INTERVAL '1 hour'
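The effect of that one-hour shift can be illustrated with a short Python sketch. The parser parse_hour_ending is hypothetical and only exists to accept the source's "24:00:00" notation; the date format is taken from the example in the question. After subtracting an hour, every record of a day lands on that same calendar day, at hours 00 through 23.

```python
from datetime import datetime, timedelta

def parse_hour_ending(text):
    """Parse 'MM/DD/YYYY HH:MM:SS' where HH may be 24 (end of the day)."""
    date_part, time_part = text.split(" ")
    hour, minute, second = (int(part) for part in time_part.split(":"))
    day = datetime.strptime(date_part, "%m/%d/%Y")
    # Adding the time as an offset lets hour 24 roll over to the next midnight.
    return day + timedelta(hours=hour, minutes=minute, seconds=second)

first = parse_hour_ending("07/20/2010 01:00:00")
last = parse_hour_ending("07/20/2010 24:00:00")

# Shift from "hour ending" to "hour starting", as the UPDATE statement does.
first_start = first - timedelta(hours=1)
last_start = last - timedelta(hours=1)

print(last)         # 2010-07-21 00:00:00 (what the import stored)
print(first_start)  # 2010-07-20 00:00:00
print(last_start)   # 2010-07-20 23:00:00
```

The stored value then means "start of the trading hour" rather than "end of the trading hour", so any code that reads the column needs to use the same convention.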