I have data in this format:
date                       | Col 1
----------------------------------------
2014-07-07 00:02:15.089-07 | 10
2014-07-07 00:08:15.069-08 | 20
2014-07-10 00:04:17.079-09 | 40
2014-07-08 00:07:15.089-06 | 30
The -07/-08/-09/-06 at the end of each date string is the time zone offset. I am trying to get an average of Col 1, but first I need to convert all the different time zones into a single time zone. I need to convert all the date strings to UTC and then average Col 1 per day or per hour. I thought of using substrings, but that doesn't help. Any help would be appreciated, thanks a lot.
I think you can use a UDF (user-defined function) for this purpose.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
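The logic such a UDF would need can be sketched in plain Python. This is only an illustration of the conversion and grouping, not Hive code; the function names `to_utc` and `average_by_utc_day` are made up for the example, and it assumes every offset is a whole number of hours, as in the sample rows.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Sample rows from the question: (timestamp with hour-only offset, Col 1)
rows = [
    ("2014-07-07 00:02:15.089-07", 10),
    ("2014-07-07 00:08:15.069-08", 20),
    ("2014-07-10 00:04:17.079-09", 40),
    ("2014-07-08 00:07:15.089-06", 30),
]

def to_utc(text):
    # %z expects an offset such as -0700, so pad the hour-only offset with "00" minutes.
    parsed = datetime.strptime(text + "00", "%Y-%m-%d %H:%M:%S.%f%z")
    return parsed.astimezone(timezone.utc)

def average_by_utc_day(rows):
    # Group values by the UTC calendar day, then average each group.
    buckets = defaultdict(list)
    for text, value in rows:
        buckets[to_utc(text).date().isoformat()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}

print(average_by_utc_day(rows))
```

Grouping by hour instead would only change the bucket key, for example to `to_utc(text).strftime("%Y-%m-%d %H")`.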
I want to filter SignIn-Logs with Kusto whose timestamps are only between 6pm and 6am.
Something like this:
SignInLogs
| where TimeGenerated between(dateStart .. dateEnd)
All examples I have found are always based on a full timestamp with exact date, like (2014-05-25T08:20:03.123456Z). But I am only interested in the time.
Any idea how to solve this?
Kusto: How to filter logs in a certain time period?
between operator: filters a record set for data that falls within an inclusive range of values.
between allows a certain range, but you can also use !between to exclude a range.
Here I am excluding 6 am to 6 pm, which leaves the remaining time range, i.e. 6 pm to 6 am. Because the range is inclusive, the upper bound is hour 17 (which covers up to 17:59:59); an upper bound of 18 would also drop sign-ins between 6 pm and 7 pm.
Try the query below:
SignInLogs
| where TimeGenerated > ago(1d)
| extend hour = datetime_part("hour", TimeGenerated)
| where hour !between (6 .. 17)
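The hour-of-day test for a 6 pm to 6 am window can be checked outside Kusto with a small Python sketch. The helper name `is_overnight` is hypothetical; the point is only to show how an inclusive range on the hour behaves at the boundaries.

```python
from datetime import datetime

def is_overnight(ts: datetime) -> bool:
    # Hours 6..17 (inclusive) cover 06:00:00 through 17:59:59.
    # Everything outside that range falls between 6 pm and 6 am.
    return not (6 <= ts.hour <= 17)

for stamp in ("2014-05-25 05:59:00", "2014-05-25 06:00:00",
              "2014-05-25 17:59:00", "2014-05-25 18:00:00"):
    ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    print(stamp, is_overnight(ts))
```

With an inclusive upper bound of 18 instead of 17, the 18:00 row would be excluded, which is not what a 6 pm start asks for.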
I have a dataset where certain operations occur during the overnight hours which I'd like to attribute to the day before.
For example, anything happening between 2/23 8pm and 2/24 6am should be included in 2/23's metrics rather than 2/24. Anything from 6:01 am to 7:59pm should be counted in 2/24's metrics.
I've seen a few posts about decrementing time by 6 hours but that doesn't work in this case.
Is there a way to use an If function to specify that midnight-6am should be counted as date-1 rather than date without affecting the metrics for the 6am - 7:59pm hours?
Thanks in advance! Also, a SQL newbie here so apologies if I have lots of followup questions.
You can use date_add with -6 hours and then optionally cast the timestamp as a date.
create table t (dcol datetime);
insert into t values
('2022-02-25 06:01:00'),
('2022-02-25 06:00:00'),
('2022-02-25 05:59:00');
SELECT CAST(DATE_ADD(dcol, INTERVAL -6 HOUR)AS DATE) FROM t;
| CAST(DATE_ADD(dcol, INTERVAL -6 HOUR)AS DATE) |
| :-------------------------------------------- |
| 2022-02-25 |
| 2022-02-25 |
| 2022-02-24 |
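The same shift can be tried in Python to see how the boundary rows behave. This is a sketch of the idea rather than the SQL itself; `reporting_date` is a name invented for the example.

```python
from datetime import datetime, timedelta

def reporting_date(ts: datetime):
    # Move the timestamp back 6 hours, then keep only the date part,
    # so that 00:00-05:59 lands on the previous calendar day.
    return (ts - timedelta(hours=6)).date()

for stamp in ("2022-02-25 06:01:00", "2022-02-25 06:00:00", "2022-02-25 05:59:00"):
    ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    print(stamp, "->", reporting_date(ts))
```

Evening timestamps are unaffected by the shift: 8 pm minus 6 hours is still the same calendar day.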
As said in the comments, your requirement is to count occurrences in a 6 AM to 6 AM day instead of a midnight-to-midnight day. You can achieve this by decreasing the time by 6 hours as shown in @Kendle's answer. Another way is to use an IF condition as shown below. Here, the date is decremented when the time of day is 6 AM or earlier, and the resulting date is put in a new column.
Query:
SELECT
  IF(
    TIME(eventTime) <= "06:00:00",
    DATE_ADD(DATE(eventTime), INTERVAL -1 DAY),
    DATE(eventTime)
  ) AS newEventTime
FROM
  `project.dataset.table`
ORDER BY
  eventTime;
Output from sample data:
As seen in the output, timestamps at or before 6 AM are counted in the previous day, while later ones are counted in the current day.
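For comparison, the conditional version can be sketched in Python as well. The names `CUTOFF` and `reporting_date_conditional` are illustrative, and the `<=` comparison mirrors the query above.

```python
from datetime import datetime, time, timedelta

CUTOFF = time(6, 0)  # 06:00:00, the end of the "overnight" period

def reporting_date_conditional(ts: datetime):
    # At or before the cutoff: attribute the event to the previous day.
    if ts.time() <= CUTOFF:
        return ts.date() - timedelta(days=1)
    # After the cutoff: keep the calendar date.
    return ts.date()

for stamp in ("2022-02-25 05:59:00", "2022-02-25 06:00:00", "2022-02-25 06:01:00"):
    ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    print(stamp, "->", reporting_date_conditional(ts))
```

The only difference from subtracting 6 hours is the row at exactly 06:00:00: the conditional with `<=` puts it in the previous day, while the 6-hour shift keeps it in the current day. Which one is right depends on where the 6 AM boundary should belong.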
I have a spreadsheet with Datetimes as follows:
I am importing this file into an application so in Javascript I see the date being brought through as the normal 5 digit datetime code:
So far as I expect... However, when I then try getting this datetime readable in SQL Server, I run the following scripts:
select
CONVERT(varchar(25),cast(28540 as datetime),121),
dateadd(D,28540,0)
And the dates all return PLUS 2 days!
The same happens for all dates I pass through. I could easily just remove 2 from the 5 digit number but I don't want to just do that if there is a rule or reason for this?
Any advice on this is greatly appreciated!
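Excel dates are tricky. Excel numbers days so that serial 1 is Jan 1st, 1900, and it also counts a nonexistent Feb 29th, 1900, so for any date after February 1900 the serial is effectively the number of days since Dec 30th, 1899 (which also means the earliest dates are not entirely accurate). SQL Server's day 0 is Jan 1st, 1900, which is why your results come out two days late.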
One option is:
dateadd(d, 28540, '1899-12-30')
Demo on DB Fiddle:
select dateadd(d, 28540, '1899-12-30') new_dt
| new_dt |
| :---------------------- |
| 1978-02-19 00:00:00.000 |
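The two-day gap is easy to reproduce in Python. This sketch assumes a whole-day serial for a date after February 1900; `excel_serial_to_date` is a hypothetical helper name.

```python
from datetime import date, timedelta

EXCEL_EPOCH = date(1899, 12, 30)      # effective day 0 for Excel serials after Feb 1900
SQL_SERVER_EPOCH = date(1900, 1, 1)   # day 0 of SQL Server's datetime type

def excel_serial_to_date(serial: int) -> date:
    # Count the serial as days from the Excel epoch.
    return EXCEL_EPOCH + timedelta(days=serial)

serial = 28540
print(excel_serial_to_date(serial))                  # date Excel means
print(SQL_SERVER_EPOCH + timedelta(days=serial))     # date when counted from 1900-01-01
```

Counting from 1900-01-01 gives a date two days later than counting from 1899-12-30, matching what the question observed.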
First off I apologize I do not even know where to start and haven't been able to find anything specific to this particular question.
I have a table with datetimes (start and end) and I need to find a way to get the minutes/hours between those dates. It could either be a sum of the time on weekdays or some kind of pivot on each day, grouped by the ID number. I had thought to assign a value to the number of days, but the times are random and do not start/end at midnight, so I am at a loss as to how to approach this.
Here are some examples of the date/time format if that helps.
startdate 2018-12-14 10:53:01
enddate 2018-12-27 11:50:00
Any help or hints would be greatly appreciated!
Edit
forgot to include I am working in SQL Server (SSMS)
Editing For Additional Clarification
Here is a sample date range with an ID number, I wanted to keep it simple.
ID number | start time       | end time
1         | 12/14/2018 10:53 | 12/17/2018 12:00
here is what I'm trying to achieve (the separation of each date range/ID #)
ID number | start time       | end time         | mins
1         | 12/14/2018 10:53 | 12/14/2018 23:59 | 786
1         | 12/15/2018 0:00  | 12/15/2018 23:59 | 1439
1         | 12/16/2018 0:00  | 12/16/2018 23:59 | 1439
1         | 12/17/2018 0:00  | 12/17/2018 12:00 | 720
The MINUTE parameter of the DATEDIFF function can be used to determine the difference in minutes between two datetime columns. As below, the second parameter is the start date and the third parameter is the end date, with the result being the amount of time in the specified interval (days, minutes, etc.) from the start to the end date. If you need to find the number of hours between these two columns the HOUR parameter can be used for this. Grouping can be performed as well, as in the second example.
DATEDIFF:
SELECT DATEDIFF(MINUTE, StartDateColumn, EndDateColumn) as DifferenceInMinutes
FROM YourSchema.YourTable
DATEDIFF with Grouping and Aggregation:
SELECT ColumnA, SUM(DATEDIFF(MINUTE, StartDateColumn, EndDateColumn)) as DifferenceInMinutes
FROM YourSchema.YourTable
GROUP BY ColumnA
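DATEDIFF gives the total for the whole range; the per-day breakdown asked for in the question needs the range cut at each midnight first. A minimal Python sketch of that splitting logic is below. It is not SQL Server code, `split_by_day` is an invented name, and it uses half-open segments that end at midnight rather than at 23:59.

```python
from datetime import datetime, timedelta

def split_by_day(start: datetime, end: datetime):
    # Walk from start to end, cutting the range at every midnight.
    segments = []
    cursor = start
    while cursor < end:
        next_midnight = datetime.combine(cursor.date() + timedelta(days=1),
                                         datetime.min.time())
        segment_end = min(next_midnight, end)
        minutes = int((segment_end - cursor).total_seconds() // 60)
        segments.append((cursor, segment_end, minutes))
        cursor = segment_end
    return segments

for seg_start, seg_end, minutes in split_by_day(datetime(2018, 12, 14, 10, 53),
                                                datetime(2018, 12, 17, 12, 0)):
    print(seg_start, seg_end, minutes)
```

Because each segment runs up to midnight, full days come out as 1440 minutes and the first day as 787, one minute more than the 23:59-based figures in the question, and the segments add up exactly to the total difference.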
In my application, I have a database storing a calendar of events:
id | name | date
----+--------------------+--------------------
1 | Birthday Party | 2013-04-27 16:30:00
2 | Dinner Reservation | 2013-03-20 17:00:00
3 | Sunday Brunch | 2013-03-31 11:15:00
When viewing events in the application, users should be able to configure how far in advance from the present moment they wish to view events, stored as a value in the database:
username | datediff
----------+------------------------------------------
user123 | 2 days in advance
goodguy | 93 days in advance
spudly | 365 days in advance
aaaaaa | 17 days, 3 hours, 30 seconds in advance
My question is: what is the best (i.e., most SQL-idiomatic) way to store such a date differential? I could store the time difference as a number in milliseconds, but is there some built-in SQL datatype that is suitable for date differentials, rather than just particular points in time? Is something like DATETIME or TIMESTAMP appropriate for this task?
It must be a relative difference -- for example, for "2 days in advance" I'm not interested in storing a particular date two days in the future, because I'd like the user to see events for the next two days every time he looks at the application.
I'm using Microsoft SQL Server 2008, if it makes any difference.
(This may be a duplicate, but all my search attempts have turned up results about datediff -- which is used to calculate time differences -- but nothing about how best to store time differences.)
Standard SQL has a specific data type for date and time durations: interval. SQL Server doesn't support the interval data type.
DateDiff() returns a signed integer. If you need to store the SQL Server equivalent to a SQL interval, you'll need to store an integer. The integer is a count of the number of datepart boundaries, so you also need to store what kind of datepart boundary the integer refers to. Without the datepart, the signed integer 3 could just as easily mean 3 years or 3 seconds.
As a practical matter, I think I'd rather calculate a timestamp for the reminder, and store that instead of the integer and datepart that define an interval. A timestamp can be indexed and queried much more simply than the integer and datepart. And without the need to support recurring events, I don't see a compelling reason to build a solution more complicated than that.
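The integer-plus-unit idea can be illustrated with a short Python sketch. It assumes the unit is fixed to seconds so that only one integer per user has to be stored; the names `lookahead_seconds` and `upcoming` and the chosen "now" are made up for the example.

```python
from datetime import datetime, timedelta

# Events from the question: (name, date)
events = [
    ("Birthday Party", datetime(2013, 4, 27, 16, 30)),
    ("Dinner Reservation", datetime(2013, 3, 20, 17, 0)),
    ("Sunday Brunch", datetime(2013, 3, 31, 11, 15)),
]

# Per-user look-ahead stored as a plain integer; the unit (seconds) is an assumption.
lookahead_seconds = {
    "user123": 2 * 86400,                    # 2 days
    "aaaaaa": 17 * 86400 + 3 * 3600 + 30,    # 17 days, 3 hours, 30 seconds
}

def upcoming(username: str, now: datetime):
    # The window is recomputed from "now" on every call, so it stays relative.
    horizon = now + timedelta(seconds=lookahead_seconds[username])
    return [name for name, when in events if now <= when <= horizon]

now = datetime(2013, 3, 19)
print(upcoming("user123", now))
print(upcoming("aaaaaa", now))
```

Fixing a single unit avoids storing the datepart alongside the integer, at the cost of not being able to express calendar-dependent spans such as "one month".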