Is 'YYYYQ' a valid DATETIME format for SQL? And if so, how do I make it with my data?

I have some tables in a Postgres database that have a column for year and a column for quarter (both stored as bigint). I need to combine them in the output of a query in the form 'YYYYQ' (not the hard part) AND have the datatype of that field be datetime (<-- the hard part).
The only query I have attempted that didn't fail was -
SELECT to_date((year::VARCHAR + quarter::VARCHAR),'YYYYQ') AS Stuff
FROM company.products
And while the output is in DATETIME format, there is no Quarter info in it.
Sample -
stuff
2011-01-01
2011-01-01
2012-01-01
2012-01-01
2012-01-01
Is it even possible to create output that has the format 'YYYYQ' AND is in DATETIME format? And if so, how?

From the PostgreSQL docs (emphasis mine):
In to_timestamp and to_date, weekday names or numbers (DAY, D, and related field types) are accepted but are ignored for purposes of computing the result. The same is true for quarter (Q) fields.

You can store the date of the first day of the quarter instead. To get it, multiply (quarter - 1) by 3 and add that many months to January 1st of the year.
SELECT to_date('2021','YYYY') + interval '6 month';
?column?
---------------------
2021-07-01 00:00:00
SELECT to_char(to_date('2021','YYYY') + interval '6 month','YYYYQ');
to_char
---------
20213
SELECT q,
to_char(to_date('2021','YYYY') + interval '3 month'*(q-1),'YYYYQ') as YYYYQ,
to_date('2021','YYYY') + interval '3 month'*(q-1) as d
FROM generate_series(1,4) f(q);
q | yyyyq | d
---+-------+---------------------
1 | 20211 | 2021-01-01 00:00:00
2 | 20212 | 2021-04-01 00:00:00
3 | 20213 | 2021-07-01 00:00:00
4 | 20214 | 2021-10-01 00:00:00
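The same arithmetic can be sketched outside the database. This is a minimal Python illustration, not part of the original answer; the function names `quarter_start` and `yyyyq_label` are made up for the example.

```python
from datetime import date


def quarter_start(year: int, quarter: int) -> date:
    """Return the first day of the given quarter (1-4) of the given year."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be between 1 and 4")
    # Quarter q starts 3 * (q - 1) months after January.
    return date(year, 3 * (quarter - 1) + 1, 1)


def yyyyq_label(d: date) -> str:
    """Format a date as 'YYYYQ', mirroring to_char(d, 'YYYYQ')."""
    return f"{d.year}{(d.month - 1) // 3 + 1}"


for q in range(1, 5):
    d = quarter_start(2021, q)
    print(q, yyyyq_label(d), d.isoformat())
```

The date is what gets stored; the 'YYYYQ' string is only a display format derived from it, which is the same split the query above makes between the `d` and `yyyyq` columns.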

Related

How to bin timestamp data into buckets of custom width of n hours in vertica

I have a table which contains a column Start_Timestamp which has timestamp values like 2020-06-02 21:08:37. I would like to create a new column which classifies these timestamps into bins of 6 hours.
Eg.
Input :
Start_Timestamp
2020-06-02 21:08:37
2020-07-19 01:23:40
2021-11-13 12:08:37
Expected Output ( Here each bin is of 6hours width) :
Start_Timestamp
Bin
2020-06-02 21:08:37
18H - 24H
2020-07-19 01:23:40
00H - 06H
2021-11-13 12:08:37
12H - 18H
I have tried using TIMESERIES, but can anyone help me generate output in the format shown above?
It's Vertica. Use the TIME_SLICE() function. Then, combine it with the TO_CHAR() function that Vertica shares with Oracle.
You can always add a CASE WHEN expression to change 00:00 to 24:00, but as that is not the standard, I wouldn't even bother.
WITH
indata(start_ts) AS (
SELECT TIMESTAMP '2020-06-02 21:08:37'
UNION ALL SELECT TIMESTAMP '2020-07-19 01:23:40'
UNION ALL SELECT TIMESTAMP '2021-11-13 12:08:37'
)
SELECT
TIME_SLICE(start_ts,6,'HOUR')
AS tm_slice
, TO_CHAR(TIME_SLICE(start_ts,6,'HOUR'),'HH24:MIH - ')
||TO_CHAR(TIME_SLICE(start_ts,6,'HOUR','END'),'HH24:MIH')
AS caption
, start_ts
FROM indata;
-- out tm_slice | caption | start_ts
-- out ---------------------+-----------------+---------------------
-- out 2020-06-02 18:00:00 | 18:00H - 00:00H | 2020-06-02 21:08:37
-- out 2020-07-19 00:00:00 | 00:00H - 06:00H | 2020-07-19 01:23:40
-- out 2021-11-13 12:00:00 | 12:00H - 18:00H | 2021-11-13 12:08:37
You can simply extract the hour and do some arithmetic:
select t.*,
floor(extract(hour from start_timestamp) / 6) * 6 as bin
from t;
Note: This characterizes the bin by the earliest hour. That seems more useful than a string representation, but you can construct a string if you really want.
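If you do want the string form from the question, the hour arithmetic above extends naturally. Here is a small Python sketch of the idea, assuming the "18H - 24H" label style from the expected output; the function name `six_hour_bin` is hypothetical.

```python
from datetime import datetime


def six_hour_bin(ts: datetime, width: int = 6) -> str:
    """Label a timestamp with the hour bin it falls into, e.g. '18H - 24H'."""
    # Integer division drops the remainder, so this is floor(hour / width) * width.
    start = (ts.hour // width) * width
    return f"{start:02d}H - {start + width:02d}H"


samples = [
    datetime(2020, 6, 2, 21, 8, 37),
    datetime(2020, 7, 19, 1, 23, 40),
    datetime(2021, 11, 13, 12, 8, 37),
]
for ts in samples:
    print(ts, six_hour_bin(ts))
```

Because the upper bound is computed as start + width, the last bin is labelled 24H rather than 00H, so no extra CASE expression is needed for it.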

incorrect date time format in Oracle DB, convert to hours and minutes

Don't ask me why but for some reason we have a date time column that is in the wrong format that I need help converting.
Example timestamp from DB: 01-OCT-20 12.18.44.000000000 AM
In the example above the hours is actually 18 and the minutes is 44.
Not sure how this happened, but 12 is the default for everything. All I want to do is get the difference in HH:MM between two timestamps, but I don't know how to convert this properly with the hours being in the minutes section and the minutes being in the seconds section.
Example of what I'm looking for:
01-OCT-20 12.18.44.000000000 AM - 01-OCT-20 12.12.42.000000000 AM
Output: 06:02 . so the timespan would be 6 hours and 2 minutes in this case.
Thanks,
The stored values are off by a factor of 60: what is recorded as minutes should be hours, and what is recorded as seconds should be minutes. Multiplying the time since midnight by 60 fixes both fields at once.
So, to get the time part of the correct value, take the stored time since midnight and multiply it by 60.
If you want the difference between the stored value and the correct value, subtract the stored time from that result, which simplifies to multiplying the time since midnight by 59.
So to get the time difference you can use:
SELECT (value - TRUNC(value))*59 AS difference,
value + (value - TRUNC(value))*59 AS updated_value
FROM table_name;
So, for your sample data:
CREATE TABLE table_name ( value ) AS
SELECT TO_TIMESTAMP( '01-OCT-20 12.18.44.000000000 AM', 'DD-MON-RR HH12.MI.SS.FF9 AM' ) FROM DUAL
Then the output is:
DIFFERENCE | UPDATED_VALUE
:---------------------------- | :-------------------------
+000000000 18:25:16.000000000 | 2020-10-01 18:44:00.000000
If you want to compare two wrong values just subtract one timestamp from the other and multiply by 60 (assuming that the hour will always be 12 AM or 00 in the 24 hour clock):
SELECT (value1 - value2) * 60 AS difference,
value1,
value1 + (value1 - TRUNC(value1))*59 AS updated_value1,
value2,
value2 + (value2 - TRUNC(value2))*59 AS updated_value2
FROM table_name;
So, for the sample data:
CREATE TABLE table_name ( value1, value2 ) AS
SELECT TO_TIMESTAMP( '01-OCT-20 12.18.44.000000000 AM', 'DD-MON-RR HH12.MI.SS.FF9 AM' ),
TO_TIMESTAMP( '01-OCT-20 12.12.42.000000000 AM', 'DD-MON-RR HH12.MI.SS.FF9 AM' )
FROM DUAL
The output is:
DIFFERENCE | VALUE1 | UPDATED_VALUE1 | VALUE2 | UPDATED_VALUE2
:---------------------------- | :------------------------- | :------------------------- | :------------------------- | :-------------------------
+000000000 06:02:00.000000000 | 2020-10-01 00:18:44.000000 | 2020-10-01 18:44:00.000000 | 2020-10-01 00:12:42.000000 | 2020-10-01 12:42:00.000000
Which gives the difference as 6 hours and 2 minutes.
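The scaling can be checked by hand with a short Python sketch. It is only an illustration of the arithmetic, assuming (as the answer does) that the stored hour is always midnight; the names `corrected` and `as_hh_mm` are made up for the example.

```python
from datetime import datetime, timedelta


def corrected(ts: datetime) -> datetime:
    """Scale the time since midnight by 60: minutes become hours, seconds become minutes."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + (ts - midnight) * 60


def as_hh_mm(delta: timedelta) -> str:
    """Format a non-negative timedelta as HH:MM."""
    total_minutes = delta // timedelta(minutes=1)
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


value1 = datetime(2020, 10, 1, 0, 18, 44)  # stored as 12.18.44 AM
value2 = datetime(2020, 10, 1, 0, 12, 42)  # stored as 12.12.42 AM

print(corrected(value1))                  # corrected first timestamp
print(corrected(value2))                  # corrected second timestamp
print(as_hh_mm((value1 - value2) * 60))   # difference between the corrected values
```

Subtracting the two stored values first and scaling afterwards gives the same result as correcting each value and then subtracting, because the scaling is linear.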

How to go between a set of dates and times

I have a set of data where one column is date and time. I have been asked for all the data in the table between two dates and, within those dates, only within a certain time window. For example, I want data between 01/02/2019 and 10/02/2019 and between the times 12:00 AM and 07:00 AM. (My real date ranges span a number of months; I am just using these dates as an example.)
I can cast the date and time into two different columns to separate them out as shown below:
select
name
,dateandtimetest
,cast(dateandtimetest as date) as JustDate
,cast(dateandtimetest as time) as JustTime
INTO #Test01
from [dbo].[TestTable]
I put this into a test table so that I could see if I could use a between function on the JustTime column, because I know I can do the between on the dates no problem. My idea was to get them done in two separate tables and perform an inner join to get the results I need
select *
from #Test01
WHERE justtime between '00:00' and '05:00'
The above code will not give me the data I need. I have been racking my brain for this so any help would be much appreciated!
The test table I am using to try and get the correct code is shown below:
|Name    | DateAndTimeTest     |
|--------|---------------------|
|Lauren  | 2019-02-01 04:14:00 |
|Paul    | 2019-02-02 08:20:00 |
|Bill    | 2019-02-03 12:00:00 |
|Graham  | 2019-02-05 16:15:00 |
|Amy     | 2019-02-06 02:43:00 |
|Jordan  | 2019-02-06 03:00:00 |
|Sid     | 2019-02-07 15:45:00 |
|Wes     | 2019-02-18 01:11:00 |
|Adam    | 2019-02-11 11:11:00 |
|Rhodesy | 2019-02-11 15:16:00 |
I have now tried and got the data to show me information between the times on one date using the below code, but now I would need to make this piece of code run for every date over a 3 month period
select *
from dbo.TestTable
where DateAndTimeTest between '2019-02-11 00:00:00' and '2019-02-11 08:30:00'
You can use SQL similar to the following:
select *
from dbo.TestTable
where (CAST(DateAndTimeTest as date) between '2019-02-11' AND '2019-02-11') AND
(CAST(DateAndTimeTest as time) between '00:00:00' and '08:30:00')
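The above query returns all records where the DateAndTimeTest value falls in the date range 2019-02-11 to 2019-02-11 and has a time between 12:00 AM and 8:30 AM. To cover a longer period, widen the date range and leave the time condition unchanged.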
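To see the two independent conditions at work on the sample table, here is a small Python sketch. It assumes the example range from the question means 1 to 10 February 2019 (day first) with a 00:00 to 07:00 window; the helper name `in_window` is hypothetical.

```python
from datetime import date, datetime, time

# Sample rows copied from the question's test table.
rows = [
    ("Lauren", datetime(2019, 2, 1, 4, 14)),
    ("Paul", datetime(2019, 2, 2, 8, 20)),
    ("Bill", datetime(2019, 2, 3, 12, 0)),
    ("Graham", datetime(2019, 2, 5, 16, 15)),
    ("Amy", datetime(2019, 2, 6, 2, 43)),
    ("Jordan", datetime(2019, 2, 6, 3, 0)),
    ("Sid", datetime(2019, 2, 7, 15, 45)),
    ("Wes", datetime(2019, 2, 18, 1, 11)),
    ("Adam", datetime(2019, 2, 11, 11, 11)),
    ("Rhodesy", datetime(2019, 2, 11, 15, 16)),
]


def in_window(ts: datetime, first_day: date, last_day: date,
              start: time, end: time) -> bool:
    """True when the date part is in the date range AND the time part is in the time range."""
    return first_day <= ts.date() <= last_day and start <= ts.time() <= end


matches = [
    name for name, ts in rows
    if in_window(ts, date(2019, 2, 1), date(2019, 2, 10), time(0, 0), time(7, 0))
]
print(matches)
```

Wes is excluded by the date condition even though his time is inside the window, and Paul is excluded by the time condition even though his date is inside the range, which is why a single BETWEEN on the combined datetime cannot express this filter.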

Oracle query returns no result if extending time on condition

If I do:
SELECT count(*) FROM XX where "date" >= '8-APR-2015' and "date" <= '8-APR-2016'
It would return many rows, but if I do:
SELECT count(*) FROM XX where "date" >= '8-APR-2010' and "date" <= '8-APR-2016'
It returns 0. How is that possible? If anything I would get more rows because I'm increasing the range that is valid for retrieval. Any ideas?
EDIT:
NLS_TIMESTAMP_FORMAT DD-MON-RR HH.MI.SSXFF
NLS_DATE_FORMAT DD-MON-RR
If you look at the execution plans for the two queries, particularly the predicate information, you'll see that the first one does:
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | TABLE ACCESS FULL| XX | 1 | 13 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
2 - filter("date">=TO_TIMESTAMP('8-APR-2015') AND
"date"<=TO_TIMESTAMP('8-APR-2016'))
while the second does:
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 0 (0)| |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| XX | 1 | 13 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(NULL IS NOT NULL)
3 - filter("date">=TO_TIMESTAMP('8-APR-2010') AND
"date"<=TO_TIMESTAMP('8-APR-2016'))
And since NULL IS NOT NULL is never true, that gets zero rows. But that's down to your NLS settings. With other format masks it does not have that filter step.
You can get a sense of what's happening if you look at how those to_timestamp() calls are being evaluated with your format NLS settings:
alter session set nls_timestamp_format = 'DD-MON-RR HH.MI.SSXFF';
select to_char(to_timestamp('8-APR-2015'), 'YYYY-MM-DD') as from_1,
to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_1,
to_char(to_timestamp('8-APR-2010'), 'YYYY-MM-DD') as from_2,
to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_2
from dual;
FROM_1 TO_1 FROM_2 TO_2
---------- ---------- ---------- ----------
2015-04-08 2016-04-08 2020-04-08 2016-04-08
The first pair of dates look OK - 2015 is before 2016. But the second 'from' has come out as 2020, not 2010; and since Oracle is smart enough to realise that 2020 is later than 2016, it knows there can be no data that matches, and adds the impossible condition to short circuit and avoid redundant data access.
Compare that with a mask that handles four-digit years properly:
alter session set nls_timestamp_format = 'DD-MON-RRRR HH.MI.SSXFF';
select to_char(to_timestamp('8-APR-2015'), 'YYYY-MM-DD') as from_1,
to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_1,
to_char(to_timestamp('8-APR-2010'), 'YYYY-MM-DD') as from_2,
to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_2
from dual;
FROM_1 TO_1 FROM_2 TO_2
---------- ---------- ---------- ----------
2015-04-08 2016-04-08 2010-04-08 2016-04-08
Now the second 'from' date is correct.
The difference is down to how the RR format mask behaves, though this specific behaviour isn't really documented.
What's actually happening is down to Oracle's helpfulness in trying to be flexible in interpreting format masks. As it says in the docs, just under the table of datetime format elements, "Oracle Database converts strings to dates with some flexibility" - but the effects of that are sometimes a bit unexpected.
It's actually the bit after RR that's throwing it out. You can see that with this little demo:
with t as (
select 1998 + level as year from dual connect by level < 16
)
select year, to_char(to_timestamp(to_char(year), 'RR HH'), 'YYYY-MM-DD HH24:MI:SS')
from t;
YEAR TO_CHAR(TO_TIMESTAM
---------- -------------------
1999 1999-04-01 00:00:00
2000 2000-04-01 00:00:00
2001 2020-04-01 01:00:00
2002 2020-04-01 02:00:00
2003 2020-04-01 03:00:00
2004 2020-04-01 04:00:00
2005 2020-04-01 05:00:00
2006 2020-04-01 06:00:00
2007 2020-04-01 07:00:00
2008 2020-04-01 08:00:00
2009 2020-04-01 09:00:00
2010 2020-04-01 10:00:00
2011 2020-04-01 11:00:00
2012 2020-04-01 12:00:00
2013 2013-04-01 00:00:00
The RR model only seems to look at the first two digits of the year, but when being helpful it also tries to handle four-digit years for you, and that is working for 2015 and 2016. And it would work for other years if the mask didn't have a time component. But it does, and it's preferring to interpret the third and fourth characters of your four-digit year using the HH part of the mask.
So for 2010, it's seeing the '10', decides it can interpret that as an HH value, does that, and then only converts the remaining two digits '20' using the RR mask - which it treats as 2020. So you end up with 10am on April 8th 2020. The same thing happens for 2000 (though you can't tell the difference) through to 2012. When you get to 2013, '13' is no longer valid for the HH mask, so it goes back to treating all four digits as the year. If the NLS format mask had HH24 then it would 'break' for 2013-2023 as well.
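The last step, where the leftover '20' becomes 2020, follows the documented RR century rule. Below is a rough Python sketch of that rule as I understand it from the Oracle documentation; it does not model the undocumented HH behaviour described above, and the function name `rr_year` and the session year 2016 are assumptions for the example.

```python
def rr_year(two_digit: int, current_year: int) -> int:
    """Expand a two-digit year the way the RR format element is documented to."""
    if not 0 <= two_digit <= 99:
        raise ValueError("expected a two-digit year")
    century = current_year - current_year % 100
    current_in_low_half = current_year % 100 < 50
    if two_digit < 50:
        # 00-49: same century as now, or the next one if we are in the 50-99 half.
        base = century if current_in_low_half else century + 100
    else:
        # 50-99: previous century if we are in the 00-49 half, else the same century.
        base = century - 100 if current_in_low_half else century
    return base + two_digit


print(rr_year(20, 2016))  # the leftover '20' from '2010'
print(rr_year(98, 2016))  # a two-digit year in the upper half
```

With a session date in 2016, any two-digit value from 00 to 49 lands in the 2000s, so once only '20' is left for the RR element, 2020 is the expected result.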
The moral is to never rely on NLS settings. (And never use 2-digit years, or 2-digit year masks). Convert strings to dates/timestamp explicitly:
where "date" >= to_timestamp('8-APR-2015', 'DD-MON-YYYY')
and "date" <= to_timestamp('8-APR-2016', 'DD-MON-YYYY');
... though preferably not with month names as they are also NLS-dependent, though you can specify you want English translation:
where "date" >= to_timestamp('8-APR-2015', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE=ENGLISH')
and "date" <= to_timestamp('8-APR-2016', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE=ENGLISH');
Or even better for fixed values, use ANSI date/timestamp literals:
where "date" >= timestamp '2010-04-08 00:00:00'
and "date" <= timestamp '2016-04-08 00:00:00';

Group records by time

I have a table containing a datetime column and some miscellaneous other columns. The datetime column represents an event happening. It can contain either a time (the event happened at that time) or NULL (the event didn't happen).
I now want to count the number of records happening in specific intervals (15 minutes), but do not know how to do that.
example:
id | time | foreign_key
1 | 2012-01-01 00:00:01 | 2
2 | 2012-01-01 00:02:01 | 4
3 | 2012-01-01 00:16:00 | 1
4 | 2012-01-01 00:17:00 | 9
5 | 2012-01-01 00:31:00 | 6
I now want to create a query that creates a result set similar to:
interval | COUNT(id)
2012-01-01 00:00:00 | 2
2012-01-01 00:15:00 | 2
2012-01-01 00:30:00 | 1
Is this possible in SQL or can anyone advise what other tools I could use? (e.g. exporting the data to a spreadsheet program would not be a problem)
Give this a try:
select datetime((strftime('%s', time) / 900) * 900, 'unixepoch') interval,
count(*) cnt
from t
group by interval
order by interval
I have limited SQLite background (and no practice instance), but I'd try grabbing the minutes using
strftime( FORMAT, TIMESTRING, MOD, MOD, ...)
with the %M format specifier, which returns the minute (http://souptonuts.sourceforge.net/readme_sqlite_tutorial.html)
Then divide that by 15 and get the FLOOR of your quotient to figure out which quarter-hour you're in (e.g., 0, 1, 2, or 3)
cast(x as int)
Getting the floor value of a number in SQLite?
Strung together it might look something like:
select cast(strftime('%M', your_time_field) as int) / 15 as quarter_hour from your_table
(the cast comes before the division because strftime returns a string; dividing two integers in SQLite already truncates the result)
Then group by the quarter-hour.
Sorry I don't have exact syntax for you, but that approach should enable you to get the functional groupings, after which you can massage the output to make it look how you want.
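The epoch-division query from the first answer can be tried without a SQLite shell by using Python's built-in sqlite3 module. This is a minimal sketch on the question's sample rows; the table name `t` comes from that answer, while the alias `bucket`, the explicit CAST, the WHERE clause, and the extra NULL row (id 6) are additions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, time TEXT, foreign_key INTEGER)")
conn.executemany(
    "INSERT INTO t (id, time, foreign_key) VALUES (?, ?, ?)",
    [
        (1, "2012-01-01 00:00:01", 2),
        (2, "2012-01-01 00:02:01", 4),
        (3, "2012-01-01 00:16:00", 1),
        (4, "2012-01-01 00:17:00", 9),
        (5, "2012-01-01 00:31:00", 6),
        (6, None, 3),  # hypothetical row: the event did not happen
    ],
)

# 900 seconds = 15 minutes. Integer division followed by multiplication
# rounds each epoch value down to the start of its 15-minute bucket.
rows = conn.execute(
    """
    SELECT datetime((CAST(strftime('%s', time) AS INTEGER) / 900) * 900, 'unixepoch') AS bucket,
           count(*) AS cnt
    FROM t
    WHERE time IS NOT NULL
    GROUP BY bucket
    ORDER BY bucket
    """
).fetchall()

for bucket, cnt in rows:
    print(bucket, cnt)
```

Unlike grouping on the quarter-hour number alone, bucketing on the full epoch value keeps 00:15 on one day separate from 00:15 on another, and changing 900 to another number of seconds changes the interval width.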