A Database DateTime Value Conflict - sql

I have hourly price data for 10 years. Meaning, 24 prices for each day.
The problem is, the price is from the previous hour of trading. So, the source of my data has listed a 24th hour for each day, and there is no 0 hour.
Example (for further clarity):
The records for a day start at: 07/20/2010 01:00:00
The records for a day end at: 07/20/2010 24:00:00
This conflicts with the way my Rails app's PostgreSQL DB wants to save DateTime values. When I imported this data from CSV into my DB and saved the dates into a DateTime column, it changed every 24:00:00 into 00:00:00 of the following day. This throws off the accuracy of my various end uses.
Is there any way I can modify my Postgres DB's behavior to not do this? Any other suggestions?

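You could always subtract an hour after you perform the import. Every timestamp then marks the start of the hour it prices, so the row that was imported as 00:00:00 of the next day becomes 23:00:00 of the day it belongs to.
I don't know your database schema, so to do this in a general fashion you'd have to execute this SQL on each column that has a date.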
UPDATE table SET date_field = date_field - INTERVAL '1 hour'
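To see why the one-hour shift puts the 24:00:00 record back on the right day, here is a minimal Python sketch of the same arithmetic. The helper names `parse_hour_ending` and `hour_beginning` are made up for illustration, and the input format is assumed to match the example in the question.

```python
from datetime import datetime, timedelta

def parse_hour_ending(text):
    """Parse 'MM/DD/YYYY HH:MM:SS' where HH may be 24 (end of day)."""
    date_part, time_part = text.split(" ")
    if time_part.startswith("24:"):
        # strptime rejects hour 24, so parse it as hour 00 and roll over
        # to the next day, which is the same instant the database stored.
        base = datetime.strptime(date_part + " 00" + time_part[2:],
                                 "%m/%d/%Y %H:%M:%S")
        return base + timedelta(days=1)
    return datetime.strptime(text, "%m/%d/%Y %H:%M:%S")

def hour_beginning(stamp):
    """Shift an hour-ending timestamp back to the start of the hour it covers."""
    return stamp - timedelta(hours=1)

first = parse_hour_ending("07/20/2010 01:00:00")
last = parse_hour_ending("07/20/2010 24:00:00")

print(last)                   # stored as midnight of the following day
print(hour_beginning(first))  # start of the first trading hour
print(hour_beginning(last))   # start of the last trading hour, same day
```

After the shift, both the first and the last record of the day carry the date 07/20/2010.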

Related

Hive - How to query a unix timestamp to identify yesterday's values?

I have the following problem to solve. I have a Hive table that stores events, and each event timestamp is stored as a unix timestamp (e.g. 1484336244).
Every day I want to run a query that fetches yesterday's events.
How could I form this query in Hive?
So for example, today is the 9th February, I want to get only the events that occurred on the 8th February.
Subtract one day from current_date and compare it with the column converted to yyyy-MM-dd format.
date_add(current_date,-1) = from_unixtime(colName,'yyyy-MM-dd')
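An alternative way to think about the same filter is as a half-open range of epoch seconds, so the stored values can be compared directly. The sketch below shows that idea in Python rather than HiveQL. The function name `yesterday_epoch_bounds` is hypothetical, and it assumes UTC day boundaries, which may not match the time zone your Hive session uses for from_unixtime.

```python
from datetime import date, datetime, time, timedelta, timezone

def yesterday_epoch_bounds(today):
    """Return (start, end) epoch seconds for the day before `today`, in UTC."""
    start_day = today - timedelta(days=1)
    start = datetime.combine(start_day, time.min, tzinfo=timezone.utc)
    end = datetime.combine(today, time.min, tzinfo=timezone.utc)
    return int(start.timestamp()), int(end.timestamp())

# Using the date from the question: today is 9 February (2017 assumed here).
start, end = yesterday_epoch_bounds(date(2017, 2, 9))

# Hypothetical event timestamps around the two midnight boundaries.
events = [start - 1, start, end - 1, end]
yesterdays_events = [ts for ts in events if start <= ts < end]
print(start, end, yesterdays_events)
```

The range is inclusive at the start and exclusive at the end, so an event at exactly midnight today is not counted as yesterday's.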

Get timestamp of one month ago in PostgreSQL

I have a PostgreSQL database in which one table rapidly grows very large (several million rows every month or so) so I'd like to periodically archive the contents of that table into a separate table.
I'm intending to use a cron job to execute a .sql file nightly to archive all rows that are older than one month into the other table.
I have the query working fine, but I need to know how to dynamically create a timestamp of one month prior.
The time column is stored in the format 2013-10-27 06:53:12 and I need to know what to use in an SQL query to build a timestamp of exactly one month prior. For example, if today is October 27, 2013, I want the query to match all rows where time < 2013-09-27 00:00:00
Question was answered by a friend in IRC:
'now'::timestamp - '1 month'::interval
Having the timestamp return 00:00:00 wasn't terribly important, so this works for my intentions.
select date_trunc('day', NOW() - interval '1 month')
This query returns the date one month before now, with the time truncated to 00:00:00.
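For anyone who wants to sanity-check the cutoff outside the database, here is a rough Python sketch of "one month ago, truncated to midnight". The function names are made up for illustration. Clamping the day to the end of a shorter month (31 March becomes 28 February) is my assumption about how to mirror the interval arithmetic, so verify it against your own PostgreSQL version.

```python
import calendar
from datetime import datetime

def one_month_before(moment):
    """Step back one calendar month, clamping the day to the month's length."""
    if moment.month > 1:
        year, month = moment.year, moment.month - 1
    else:
        year, month = moment.year - 1, 12
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))

def archive_cutoff(now):
    """One month before `now`, with the time portion set to 00:00:00."""
    return one_month_before(now).replace(hour=0, minute=0, second=0, microsecond=0)

# The example from the question: run on October 27, 2013.
print(archive_cutoff(datetime(2013, 10, 27, 6, 53, 12)))
```

Rows with a time earlier than the printed cutoff would be the ones to archive.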
If you need the rows from the previous calendar month, filter the date column to the range from the first day of the previous month up to, but not including, the first day of the current month:
SELECT *
FROM {table_name}
WHERE {column_name} >= date_trunc('month', current_date-interval '1' month)
AND {column_name} < date_trunc('month', current_date)
The first condition of the WHERE clause matches dates on or after the first day of the previous month (00:00:00 on day 1 of the previous month), and the second condition matches dates before the first day of the current month (00:00:00 on day 1 of the current month).
Together they include every row whose date lies in the previous month.
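The same half-open range can be sketched in Python to check the boundaries, including the January case where the previous month falls in the prior year. The function name `previous_month_bounds` and the sample dates are made up for illustration.

```python
from datetime import date

def previous_month_bounds(today):
    """Return (first day of previous month, first day of current month)."""
    first_of_current = today.replace(day=1)
    if first_of_current.month == 1:
        first_of_previous = first_of_current.replace(
            year=first_of_current.year - 1, month=12)
    else:
        first_of_previous = first_of_current.replace(
            month=first_of_current.month - 1)
    return first_of_previous, first_of_current

lower, upper = previous_month_bounds(date(2013, 10, 27))

# Hypothetical row dates on either side of both boundaries.
rows = [date(2013, 8, 31), date(2013, 9, 1), date(2013, 9, 30), date(2013, 10, 1)]
in_previous_month = [d for d in rows if lower <= d < upper]
print(lower, upper, in_previous_month)
```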

An Advanced Query Date Grouping Dilemma

In my Rails app's PostgreSQL DB are records containing hourly prices for the last 10 years:
Roughly 10 × (24 × 365) records like this: "12/31/2012 01:00:00", "11.99"
The following query groups prices by day, averages the prices in each daily group, and returns a ("day", "daily average") pair for each day:
HourlyPrice.average(:price, :group => "DATE_TRUNC('day', date)")
The problem is, the hourly prices in my source data actually reflect the price for the previous hour. So, in my data source .CSV, the day starts at the time 01:00:00 and ends at the time 24:00:00.
This conflicts with how PostgreSQL likes to save records in its DateTime column. Upon importing the CSV data, PostgreSQL converts my records containing the time 24:00:00 to 00:00:00 of the next day.
This throws off the accuracy of my averaging query above. To fix the query, I still want to group by day, but offset by 1 hour, so that the range averaged starts at 01:00:00 and ends with the 00:00:00 value of the next day.
Is it possible to adjust the above query to reflect this?
You could subtract one hour from date before applying the DATE_TRUNC function to it, like this:
HourlyPrice.average(:price, :group => "DATE_TRUNC('day', date - INTERVAL '1 hour')")
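To illustrate what the offset does to the grouping, here is a small Python sketch of the same idea: shift each timestamp back one hour, then bucket by calendar date. The function name `daily_averages` and the sample prices are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_averages(rows):
    """Average prices per day after shifting each timestamp back one hour."""
    buckets = defaultdict(list)
    for stamp, price in rows:
        trading_day = (stamp - timedelta(hours=1)).date()
        buckets[trading_day].append(price)
    return {day: sum(prices) / len(prices) for day, prices in buckets.items()}

rows = [
    (datetime(2012, 12, 31, 1), 10.0),
    (datetime(2012, 12, 31, 23), 12.0),
    (datetime(2013, 1, 1, 0), 14.0),   # was 12/31/2012 24:00:00 in the CSV
    (datetime(2013, 1, 1, 1), 20.0),
]
averages = daily_averages(rows)
print(averages)
```

The record stored at 00:00:00 on January 1 is averaged with December 31, which is the day it was reported for.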

Simple Averaging Algorithm is Slightly Off. Why? Active Record/PostgreSQL issue?

In my Rails app, I have two custom Rake Tasks running every 30 minutes. Task A scrapes hourly prices from the internet and saves them to a database as HourlyPrice. Task B goes into the db, takes hourly prices from each day for the last seven days, and averages them to create a new DailyAveragePrice record in a separate DB Table.
However, when running Task B, the last day's (of the seven) average price is incorrect.
After fiddling with the hourly prices of that day in an Excel spreadsheet, I see that the average price Task B is generating is the result of taking only the last three hours and averaging them.
Task B is mostly done with this single query:
averages = HourlyPrice.where('date >= ?', 7.days.ago).average(:price, :group => "DATE_TRUNC('day', date - INTERVAL '1 hour')")
I can't figure out why this is happening.
Clues:
1. HourlyPrice has two attributes (datetime, price). Each HourlyPrice actually represents a price for the previous hour. So, source data lists a 24:00:00 price for each day which PostgreSQL does not want to import as is into a datetime column. Instead, it converts all 24:00:00 prices to 00:00:00 of the next day. To make up for this, I've tried to subtract an hour interval, as you can see in the query. Is this causing the problem?
2. My ActiveRecord's time zone is currently set to 'Mountain Time (US & Canada)'. That is where the price exchange is located. I have not adjusted my PostgreSQL DB's timezone, and I believe it defaults to UTC. When running Task B, I noticed that it was 9:20PM UTC, leaving three hours left in the UTC day, which might explain the averaging of only three HourlyPrices of the last of the seven days. I'll try running Task B again in the next hour to see if it will average only two hours. Update to come... Is this timezone conflict causing a problem, or is what I am doing insulated from timezones since I have my own date columns?
UPDATE - Problem identified, but how to fix?
Clue #2 is correct. It is a timezone issue. I just ran Task B again (an hour later, with 2 hours left until UTC day change), and it only averages two HourlyPrices now for the last of the seven days.
How can I fix my query above to average ONLY if there are 24 HourlyPrice records available?
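One way to think about "average only complete days" is to count the records in each daily group and drop any group that does not have all 24. In SQL that kind of filter would typically be a HAVING COUNT(*) = 24 clause on the grouped query. The Python sketch below shows just the logic. The function name `complete_day_averages` and the sample data are made up for illustration, and it assumes every day has exactly 24 hourly records, which may not hold on daylight saving changeover days.

```python
from collections import defaultdict
from datetime import datetime, timedelta

HOURS_PER_DAY = 24  # assumption: a complete day has exactly 24 hourly prices

def complete_day_averages(rows, expected=HOURS_PER_DAY):
    """Average per day (one-hour offset), keeping only fully populated days."""
    buckets = defaultdict(list)
    for stamp, price in rows:
        buckets[(stamp - timedelta(hours=1)).date()].append(price)
    return {
        day: sum(prices) / len(prices)
        for day, prices in buckets.items()
        if len(prices) == expected
    }

# A full day: stamps 01:00 through 00:00 of the next day, priced 1.0 to 24.0.
full_day = [(datetime(2013, 1, 1) + timedelta(hours=h), float(h))
            for h in range(1, 25)]
# A partial day with only three hourly prices so far.
partial_day = [(datetime(2013, 1, 2) + timedelta(hours=h), 99.0)
               for h in range(1, 4)]

result = complete_day_averages(full_day + partial_day)
print(result)
```

The partial day is left out entirely instead of producing an average of three prices.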

time stamp field to output records from last 24 hours

I need to make an Access query output only records from the last 24 hours. The time-stamp field is called "SYSADM_CUSTOMER_ORDER.CREATE_DATE". I can't use the criterion ">date()-1", because that would give me records from after 12AM the previous day, and I need to run the query at 4PM every day and only output records from after 4PM the previous day. Please give me the proper SQL for me to copy and paste, based on my SQL below. Thank you very much, Nathaniel
SELECT , SYSADM_CUSTOMER_ORDER.ID
FROM SYSADM_CUSTOMER_ORDER;
I think you should probably be using now() - 1, something like:
select * from sysadm_customer_order where create_date > now() - 1;
The date function returns the date with an implicit time of 00:00:00. You want now() which gives you both current date and time.
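The difference between the two cutoffs can be shown with plain date arithmetic. This Python sketch is not Access code; the function names and the 4PM run time are made up for illustration.

```python
from datetime import datetime, time, timedelta

def date_based_cutoff(now):
    """Like date() - 1: midnight at the start of the previous day."""
    return datetime.combine(now.date(), time.min) - timedelta(days=1)

def now_based_cutoff(now):
    """Like now() - 1: exactly 24 hours before the current moment."""
    return now - timedelta(days=1)

run_time = datetime(2024, 5, 2, 16, 0, 0)  # hypothetical 4PM run
print(date_based_cutoff(run_time))  # midnight at the start of the previous day
print(now_based_cutoff(run_time))   # 4PM on the previous day
```

With the date-based cutoff the query would also pick up the 16 hours before 4PM on the previous day, which is what the question wants to exclude.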