Convert date to int with Hive - sql

I'm trying to port some T-SQL to Hive SQL and am running into trouble with the following statement:
create table param as
select convert( int, CONVERT( char(8), convert( date, begin_date ), 112 ) ) as begin_dtkey
, convert( int, CONVERT( char(8), convert( date, end_date ), 112 ) ) as end_dtkey
, convert( int, CONVERT( char(8), convert( date, cutoff_date ), 112 ) ) as cutoff_dtkey
, *
from tmp_param;
The idea is to convert dates to their integer formats. Is there a way to do this in Hive (v0.13)?
In the case of T-SQL this was a select...into statement, but I went ahead and made it create table...as select for Hive.

Probably the easiest way -- and one that should work in both databases -- is to use the date part extraction functions:
select (year(begin_date) * 10000 + month(begin_date) * 100 + day(begin_date)) as begin_dtkey,
(year(end_date) * 10000 + month(end_date) * 100 + day(end_date)) as end_dtkey,
(year(cutoff_date) * 10000 + month(cutoff_date) * 100 + day(cutoff_date)) as cutoff_dtkey,
p.*
from tmp_param p;
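The arithmetic behind this answer can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the helper name `date_key` is hypothetical and only mirrors the `year * 10000 + month * 100 + day` expression used in the query.

```python
from datetime import date


def date_key(d: date) -> int:
    """Pack a calendar date into a yyyymmdd integer (hypothetical helper)."""
    # Same expression as in the SQL: year*10000 + month*100 + day
    return d.year * 10000 + d.month * 100 + d.day


print(date_key(date(2014, 7, 4)))  # 20140704
```

Because each component occupies its own pair of decimal digits, the resulting integers sort in the same order as the dates themselves.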

Related

Which datetime rounding is more efficient in SQL

I'm writing code that rounds a datetime value, but I can't tell which of these approaches is more efficient:
DECLARE @DateValue DATETIME = '2021-01-13 11:59:59'
---- FIRST SOLUTION:
SELECT CAST(@DateValue AS smalldatetime) AS DateRoundS1
---- SECOND SOLUTION:
SELECT CONVERT(smalldatetime, @DateValue) AS DateRoundS2
---- THIRD SOLUTION:
SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, @DateValue), 0) AS DateRoundS3
---- FOURTH SOLUTION:
DECLARE @DateValue DATETIME = '2021-01-13 11:59:59'
DECLARE @DiffMinsTime INT = DATEPART(MINUTE,@DateValue)
DECLARE @DiffSecsTime INT = DATEPART(SECOND,@DateValue)
DECLARE @DiffMSTime INT = DATEPART(MILLISECOND,@DateValue)
IF( @DiffMinsTime > 0 )
BEGIN
SELECT DATEADD(MINUTE,DATEDIFF(MINUTE,@DiffMinsTime,@DateValue),0)
END
IF(@DiffSecsTime > 0)
BEGIN
SELECT DATEADD(SECOND,DATEDIFF(SECOND,@DiffSecsTime,@DateValue),0)
END
IF(@DiffMSTime > 0)
BEGIN
SELECT DATEADD(MILLISECOND,DATEDIFF(MILLISECOND,@DiffMSTime,@DateValue),0)
END
PS: I know the last one overflows!
Is there a more efficient way to do this?
With conversions and transformations it always depends on lots of factors.
Best way: Build a test rig and set aside a few minutes or better hours.
The test rig creates numbers from 0 to 9 999 999 - about 10 million rows. Kudos to Jeff Moden for the SQL spackle that creates the number list.
DROP TABLE IF EXISTS #numbers
;
DROP TABLE IF EXISTS #dates
;
CREATE TABLE
#numbers
(
number integer NOT NULL
)
;
CREATE TABLE
#dates
(
dated datetime2(7) NOT NULL
)
;
WITH
cteNum
(
smallnum
)
AS
(
SELECT Cast( 1 AS integer )
UNION ALL
SELECT Cast( 2 AS integer )
UNION ALL
SELECT Cast( 3 AS integer )
UNION ALL
SELECT Cast( 4 AS integer )
UNION ALL
SELECT Cast( 5 AS integer )
UNION ALL
SELECT Cast( 6 AS integer )
UNION ALL
SELECT Cast( 7 AS integer )
UNION ALL
SELECT Cast( 8 AS integer )
UNION ALL
SELECT Cast( 9 AS integer )
UNION ALL
SELECT Cast( 0 AS integer )
)
INSERT INTO
#numbers
(
number
)
SELECT
num1.smallnum * 1000000 + num2.smallnum * 100000 + num3.smallnum * 10000 + num4.smallnum * 1000
+ num5.smallnum * 100 + num6.smallnum * 10 + num7.smallnum
FROM
cteNum AS num1
CROSS JOIN cteNum AS num2
CROSS JOIN cteNum AS num3
CROSS JOIN cteNum AS num4
CROSS JOIN cteNum AS num5
CROSS JOIN cteNum AS num6
CROSS JOIN cteNum AS num7
;
INSERT INTO
#dates
(
dated
)
SELECT
Cast( DateAdd( ms, nums.number, dated.basedate) AS datetime2(7) ) AS dated
FROM
#numbers AS nums
CROSS JOIN
(
SELECT Cast(GetDate() AS datetime2(7) ) AS basedate
) dated
;
DROP TABLE IF EXISTS #numbers
;
This will take from a couple of seconds to a couple of minutes to create.
Reading from a prepared table keeps the work done by the query optimizer to a minimum.
The conversion times can then be tested in a controlled environment. Tip: capturing a start and end time avoids the overhead of a SQL trace.
SET NOCOUNT ON
;
DECLARE
@datetimestarted datetime2(7),
@datetimeended datetime2(7),
@result varchar(200),
@crlf char(2) = char(13) + char(10)
;
SET @datetimestarted = Getdate()
;
SELECT
dated,
datedrounded = Cast( dated AS datetime2(0) )
FROM
#dates
;
SET @datetimeended = Getdate()
;
SET @result = 'Processing time = ' + Cast( Datediff( ms, @datetimestarted, @datetimeended ) / 1000 AS varchar(12) ) + ' seconds' + @crlf
+ ' > Start time: ' + Convert( varchar(20), @datetimestarted, 126 ) + @crlf
+ ' > End time: ' + Convert( varchar(20), @datetimeended, 126 ) + @crlf
;
PRINT @result
;
SET @datetimestarted = Getdate()
;
SELECT
dated,
datedrounded = Convert( datetime2(0), dated )
FROM
#dates
;
SET @datetimeended = Getdate()
;
SET @result = 'Processing time = ' + Cast( Datediff( ms, @datetimestarted, @datetimeended ) / 1000 AS varchar(12) ) + ' seconds' + @crlf
+ ' > Start time: ' + Convert( varchar(20), @datetimestarted, 126 ) + @crlf
+ ' > End time: ' + Convert( varchar(20), @datetimeended, 126 ) + @crlf
;
PRINT @result
;
Run time on my VM is about 3 minutes for each option with Cast taking a tad bit longer than Convert. Now this depends on what one wants to achieve so feel free to change the transformations as required.
And finally the clean up.
DROP TABLE IF EXISTS #dates
;
Final tip: datetime and smalldatetime are the older date/time types in SQL Server. Calculating with them is actually easier than with the newer datetime2 type, because they allow direct arithmetic such as adding a number of days with the + operator.
datetime2 requires the Dateadd function for this.
However, datetime2 is easier to exchange between different types of RDBMS. It also works more naturally with standard ISO 8601 formats, which makes transferring data between different parts of the world a whole lot easier.

Calculate amount of days worked by employee from 1 column to another with a function

This is the function I have so far. I need to be able to call it with an employee number from the Employee table, and it has to calculate the days between the two columns.
CREATE FUNCTION getDaysWorked (@Employee_No int)
Returns Datetime
as
Begin
declare @DayStart datetime
declare @DayResigned datetime
declare @DaysWorked int
set @DayStart = (Select e.Group_Start_Date) from Employee e
set @DayResigned =(Select e.ResignDate) from Employee e
set @DaysWorked = (@DayStart - @DayResigned)
Return(@DaysWorked)
end
GO
If there is a better way please let me know, this is what I have...
Presumably, you want something related to the employee being passed in. I would surmise:
create function getDaysWorked (@Employee_No int)
returns int as
begin
declare @DaysWorked int;
select @DaysWorked = datediff(day, e.Group_Start_Date, e.ResignDate)
from Employee e
where e.Employee_No = @Employee_No;
return(@DaysWorked)
end;
Note that the function returns an integer not a date.
You can use this:
CREATE FUNCTION getDaysWorked (@Employee_No int)
Returns int
as
Begin
declare @DaysWorked int
SELECT @DaysWorked = DATEDIFF(day,e.Group_Start_Date,e.ResignDate) FROM Employee e WHERE Emp_ID=@Employee_No
Return(@DaysWorked)
end
GO
Datediff is indeed the function you want. However, it comes with a small caveat that the business logic has to settle:
Does the first day of employment count towards the days worked, or is it considered day 0 and thus not counted?
Concept code (without function):
WITH
cte_employee
(
code_employee,
dtt_start,
dtt_resigned
)
AS
( -- result expected as a lot of days
SELECT
Cast( 1 AS smallint ) AS code_employee,
Convert( datetime2(0), '1988-01-01', 126 ) AS dtt_start,
Convert( datetime2(0), '1995-03-06', 126 ) AS dtt_resigned
UNION
-- no result possible
SELECT
Cast( 2 AS smallint ) AS code_employee,
Convert( datetime2(0), '2000-01-01', 126 ) AS dtt_start,
Convert( datetime2(0), NULL, 126 ) AS dtt_resigned
UNION
-- result expected as 365 days (no leap year)
SELECT
Cast( 3 AS smallint ) AS code_employee,
Convert( datetime2(0), '2005-04-01', 126 ) AS dtt_start,
Convert( datetime2(0), '2006-03-31', 126 ) AS dtt_resigned
UNION
-- result expected as 366 days (leap year)
SELECT
Cast( 4 AS smallint ) AS code_employee,
Convert( datetime2(0), '1999-04-01', 126 ) AS dtt_start,
Convert( datetime2(0), '2000-03-31', 126 ) AS dtt_resigned
)
SELECT
code_employee,
dtt_start,
dtt_resigned,
-- first day is considered day 0 so add 1
Datediff( dd, dtt_start, dtt_resigned ) +1 AS days_worked
FROM
cte_employee
;
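The question of whether the first day counts can be made concrete with a small sketch. This is an illustration in Python rather than T-SQL, under the assumption that "days worked" means whole calendar days; the function name `days_worked` and the `count_first_day` flag are hypothetical and do not appear in the original code.

```python
from datetime import date
from typing import Optional


def days_worked(start: date, resigned: Optional[date],
                count_first_day: bool = True) -> Optional[int]:
    """Days between start and resignation; None while still employed."""
    if resigned is None:
        # Mirrors the NULL case in the test data: no result is possible.
        return None
    diff = (resigned - start).days  # same value as Datediff(dd, start, resigned)
    return diff + 1 if count_first_day else diff


print(days_worked(date(2005, 4, 1), date(2006, 3, 31)))  # 365
print(days_worked(date(1999, 4, 1), date(2000, 3, 31)))  # 366 (spans 29 Feb 2000)
print(days_worked(date(2000, 1, 1), None))               # None
```

Passing `count_first_day=False` gives the plain difference, which is one day less in every case.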
Now that the date arithmetic is out of the way, let us look at two ways to wrap it in a function: a scalar UDF (user-defined function) and an ITVF (inline table-valued function).
The ITVF is very primitive and just demonstrates the concept for OUTER APPLY.
Please do not under any circumstances use the following code in your production environment!
First there must be a table to work with:
DROP TABLE IF EXISTS dbo.TestEmployee
;
CREATE TABLE dbo.TestEmployee
(
code_employee smallint,
dtt_start datetime2(0),
dtt_resigned datetime2(0)
)
;
WITH
cte_employee
(
code_employee,
dtt_start,
dtt_resigned
)
AS
( -- result expected as a lot of days
SELECT
Cast( 1 AS smallint ) AS code_employee,
Convert( datetime2(0), '1988-01-01', 126 ) AS dtt_start,
Convert( datetime2(0), '1995-03-06', 126 ) AS dtt_resigned
UNION
-- no result possible
SELECT
Cast( 2 AS smallint ) AS code_employee,
Convert( datetime2(0), '2000-01-01', 126 ) AS dtt_start,
Convert( datetime2(0), NULL, 126 ) AS dtt_resigned
UNION
-- result expected as 365 days (no leap year)
SELECT
Cast( 3 AS smallint ) AS code_employee,
Convert( datetime2(0), '2005-04-01', 126 ) AS dtt_start,
Convert( datetime2(0), '2006-03-31', 126 ) AS dtt_resigned
UNION
-- result expected as 366 days (leap year)
SELECT
Cast( 4 AS smallint ) AS code_employee,
Convert( datetime2(0), '1999-04-01', 126 ) AS dtt_start,
Convert( datetime2(0), '2000-03-31', 126 ) AS dtt_resigned
)
INSERT INTO dbo.TestEmployee
(
code_employee,
dtt_start,
dtt_resigned
)
SELECT
code_employee,
dtt_start,
dtt_resigned
FROM
cte_employee
;
GO
And with a few records to work with here comes the scalar function:
CREATE OR ALTER FUNCTION udf_count_days_worked
( -- what is required?
@code_employee smallint
)
-- what is delivered?
RETURNS int
WITH
SCHEMABINDING, -- important if this function shall continue to work
RETURNS NULL ON NULL INPUT -- just a fail safe
AS
BEGIN
DECLARE
@num_days_worked int
;
SELECT
@num_days_worked = Datediff( dd, dtt_start, dtt_resigned ) +1
FROM
dbo.TestEmployee
WHERE
code_employee = @code_employee
;
RETURN
@num_days_worked
;
END
;
GO
SELECT
emp.code_employee,
emp.dtt_start,
emp.dtt_resigned,
dbo.udf_count_days_worked( emp.code_employee ) AS num_days_worked
FROM
dbo.TestEmployee AS emp
;
The scalar function is executed once PER ROW, with its parameter taken from the current row of the table.
On to ITVF. The main benefit over a scalar function is the single execution instead of row-by-agonizing-row execution of a scalar function.
CREATE OR ALTER FUNCTION itvf_count_days_worked
(
@code_employee smallint
)
RETURNS TABLE
WITH
SCHEMABINDING
AS
RETURN
(
SELECT
--code_employee,
Datediff( dd, dtt_start, dtt_resigned ) +1 AS num_days_worked
FROM
dbo.TestEmployee
WHERE
code_employee = @code_employee
)
;
GO
SELECT
emp.code_employee,
emp.dtt_start,
emp.dtt_resigned,
func.num_days_worked
FROM
dbo.TestEmployee AS emp
OUTER APPLY dbo.itvf_count_days_worked( emp.code_employee ) AS func
;
As mentioned, this ITVF is primitive. The main benefit is speed, as long as the return values are simple calculations. These calculations can still take a lot of code to reach the correct result. Using either a scalar function or an ITVF lets you "hide" that code in a function and apply the same calculation in different places.
And finally some clean up code:
-- clean up
DROP FUNCTION udf_count_days_worked
;
DROP FUNCTION itvf_count_days_worked
;
DROP TABLE dbo.TestEmployee
;
As I mentioned in the comments, I would use an inline table-valued function. It seems odd, however, that you return a datetime, but if that's what you want, at least make it clearer that you want this odd logic in your SQL by explicitly using DATEADD to add the difference in days to the "date" 0 (1900-01-01):
CREATE FUNCTION dbo.getDaysWorked (@Employee_No int)
RETURNS table
AS RETURN
SELECT DATEADD(DAY,DATEDIFF(DAY,e.Group_Start_Date,e.ResignDate),0) AS DaysWorked
FROM dbo.Employee e;
--WHERE e.EmployeeNo = @Employee_No --This wasn't in your attempt either, oddly.
GO
So, if someone joined on 2017-01-01 and left on 2021-02-01, it would return the datetime value 1904-02-02T00:00:00.000.

How to convert round number to date and time format

Two columns in table tblpress:
Date Time
20160307 120949
20160307 133427
I need to select them in the format below:
07-03-2016 12:09:49
07-03-2016 13:34:27
or
07-March-2016 12:09:49 PM
07-March-2016 01:34:27 PM
You can try the query below:
select format(cast([Date] as date),'dd-MMMM-yyyy') as [Date],
TIMEFROMPARTS(LEFT([Time],2), SUBSTRING([Time],3,2), RIGHT([Time],2), 0,0) as [Time]
from tblpress
I think CAST/CONVERT will help you:
SELECT
CAST('20160307' AS date),
CAST(STUFF(STUFF('120949',3,0,':'),6,0,':') AS time)
And convert for output:
SELECT
CONVERT(varchar(20),NormalDate,105) OutDate, -- Italian style
CONVERT(varchar(20),NormalTime,108) OutTime -- hh:mi:ss
FROM
(
SELECT
CAST([Date] AS date) NormalDate,
CAST(STUFF(STUFF([Time],3,0,':'),6,0,':') AS time) NormalTime
FROM YourTable
) q
CAST and CONVERT (Transact-SQL)
And you can use FORMAT (Transact-SQL)
SELECT
FORMAT(GETDATE(),'dd-MM-yyyy'),
FORMAT(GETDATE(),'HH:mm:ss')
The best way to do it is to create a function:
create FUNCTION [dbo].[udfGetDateTimeFromInteger]
(
@intDate int,
@intTime int
)
RETURNS datetime
AS BEGIN
-- Declare the return variable here
DECLARE @DT_datetime datetime = NULL,
@str_date varchar(11),
@str_time varchar(8)
if(@intDate is not null and @intDate > 0)
begin
select @str_date = CAST( cast(@intDate as varchar(8)) AS date)
if @intTime=0
select @str_time ='000000'
else
-- left-pad to six digits so that short values such as 949 (00:09:49) still parse
select @str_time = right('00000'+CONVERT(varchar(11),@intTime),6)
select @str_time =
SUBSTRING(@str_time,1,2)+':'+SUBSTRING(@str_time,3,2)+':'+SUBSTRING(@str_time,5,2)
select @DT_datetime = CAST(@str_date+' '+@str_time as datetime)
end
-- Return the result of the function
RETURN @DT_datetime
END
and then call it in a select like this:
declare @next_run_date int, @next_run_time int
select @next_run_date = 20160307
select @next_run_time = 130949
SELECT @next_run_date inputdate,
@next_run_time inputtime,
dbo.udfGetDateTimeFromInteger(@next_run_date, @next_run_time) outputdatetime
The output will look like this:
inputdate inputtime outputdatetime
20160307 130949 2016-03-07 13:09:49.000
You said those are numbers, right? You can use datetimefromparts (or datetime2fromparts), i.e.:
select
datetimefromparts(
[date]/10000,
[date]%10000/100,
[date]%100,
[time]/10000,
[time]%10000/100,
[time]%100,0)
from tblpress;
Note that naming fields like that and also storing date and time like that is a bad idea.
I later noticed it was char fields:
select
cast([date] as datetime) +
cast(stuff(stuff([time],5,0,':'),3,0,':') as datetime)
from tblpress;
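The integer division and modulo steps used with datetimefromparts can be traced in a short sketch. This is Python rather than T-SQL, offered only to show the digit arithmetic; the function name `from_int_parts` is hypothetical, and it assumes the date and time are stored as integers in yyyymmdd and hhmmss form.

```python
from datetime import datetime


def from_int_parts(date_int: int, time_int: int) -> datetime:
    """Split yyyymmdd and hhmmss integers into parts (hypothetical helper)."""
    year = date_int // 10000           # leading four digits
    month = date_int % 10000 // 100    # middle two digits
    day = date_int % 100               # last two digits
    hour = time_int // 10000
    minute = time_int % 10000 // 100
    second = time_int % 100
    return datetime(year, month, day, hour, minute, second)


stamp = from_int_parts(20160307, 133427)
print(stamp.strftime("%d-%m-%Y %H:%M:%S"))  # 07-03-2016 13:34:27
```

Working on the numeric value avoids any padding problem: a time such as 90949 still yields hour 9 because the missing leading zero does not affect the division.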

Convert int 20171116101856 to yyyymmddhhmmss format to be used in datediff function

So I have this table that has date columns in int type.
last_run_date | last_run_time
20171116 | 100234
I'm trying to convert these two values into a datetime to be used in a DATEDIFF statement.
This is my statement:
SELECT 1
FROM V_Jobs_All_Servers vjas
WHERE JobName='DailyReports_xxxx' and Step_Name='xxxx'
and DATEDIFF(hour, Convert(varchar,STUFF(STUFF(STUFF(STUFF(STUFF(cast(
Convert(varchar(100),vjas.last_run_date) + Convert(varchar(100),vjas.last_run_time) as varchar)
,5,0,'-'),8,0,'-'),11,0,' '),14,0,':'),17,0,':')), Getdate()) <3
This works, but only when the last_run_time value has a two-digit hour, such as
101216. Whenever it has a one-digit hour, such as 91316, it fails with the following error:
The conversion of a char data type to a datetime data type resulted in an out-of-range datetime value.
I am on SQL Server 2005
If you're getting this value from msdb.dbo.sysjobsteps, there's a built-in function, msdb.dbo.agent_datetime(), to convert last_run_date and last_run_time to a datetime already:
select job_id,
step_id,
step_name,
msdb.dbo.agent_datetime(nullif(last_run_date,0),nullif(last_run_time,0)) as last_run_datetime
from msdb.dbo.sysjobsteps
It is an undocumented function. However, at least in my version of SQL Server (2012), that function has this definition:
CREATE FUNCTION agent_datetime(@date int, @time int)
RETURNS DATETIME
AS
BEGIN
RETURN
(
CONVERT(DATETIME,
CONVERT(NVARCHAR(4),@date / 10000) + N'-' +
CONVERT(NVARCHAR(2),(@date % 10000)/100) + N'-' +
CONVERT(NVARCHAR(2),@date % 100) + N' ' +
CONVERT(NVARCHAR(2),@time / 10000) + N':' +
CONVERT(NVARCHAR(2),(@time % 10000)/100) + N':' +
CONVERT(NVARCHAR(2),@time % 100),
120)
)
END
You are massively overcomplicating this. Just pad your time value with a leading 0 and convert from there:
declare @t table(last_run_date int, last_run_time int);
insert into @t values(20171116,90234),(20171116,100234);
select last_run_date
,last_run_time
,convert(datetime,cast(last_run_date as nvarchar(8))
+ ' '
+ stuff(stuff(right('0' + cast(last_run_time as nvarchar(6))
,6)
,5,0,':')
,3,0,':')
,112) as DateTimeData
from @t
Output:
+---------------+---------------+-------------------------+
| last_run_date | last_run_time | DateTimeData |
+---------------+---------------+-------------------------+
| 20171116 | 90234 | 2017-11-16 09:02:34.000 |
| 20171116 | 100234 | 2017-11-16 10:02:34.000 |
+---------------+---------------+-------------------------+
Here's an ugly way...
declare @table table (last_run_date int, last_run_time int)
insert into @table
values
(20171116,100234),
(20171116,91316)
select
cast(cast(cast(last_run_date as varchar) as datetime) + ' ' + stuff(stuff(last_run_time,len(last_run_time) - 1,0,':'),len(stuff(last_run_time,len(last_run_time) - 1,0,':')) - 4,0,':') as datetime)
from @table
DECLARE @temp TABLE (last_run_date int, last_run_time int)
INSERT INTO @temp VALUES (20171116, 100234)
SELECT convert(datetime,CAST(last_run_date as varchar))
+ Convert(time, Dateadd(SECOND, Right(last_run_time,2)/1
,Dateadd(MINUTE, Right(last_run_time,4)/100
,Dateadd(hour, Right(last_run_time,6)/10000
,'1900-01-01'
)
)
)
) [DateConverted]
FROM @temp
Produces Output:
DateConverted
2017-11-16 10:02:34.000
You can see how this works by doing each part individually.
SELECT Dateadd(hour, Right(last_run_time,6)/10000
,'1900-01-01')
FROM @temp
Gives the hours position.
SELECT Dateadd(MINUTE, Right(last_run_time,4)/100
,Dateadd(hour, Right(last_run_time,6)/10000
,'1900-01-01'))
FROM @temp
Gives the hours plus minutes position.
Etc.
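The failure with a one-digit hour comes down to a missing leading zero, which the answers above fix by padding the time to six digits. A minimal Python sketch of that padding idea follows; it is not taken from any of the answers, and the function name `parse_run_datetime` is hypothetical.

```python
from datetime import datetime


def parse_run_datetime(run_date: int, run_time: int) -> datetime:
    """Combine yyyymmdd and hhmmss integers after zero-padding the time."""
    # 91316 becomes '091316', so the hour is always two digits wide.
    text = f"{run_date:08d}{run_time:06d}"
    return datetime.strptime(text, "%Y%m%d%H%M%S")


print(parse_run_datetime(20171116, 91316))   # 2017-11-16 09:13:16
print(parse_run_datetime(20171116, 100234))  # 2017-11-16 10:02:34
```

Without the padding, the 13 digit string would shift every field after the date by one position, which is what produces the out-of-range error in the original query.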

Add or subtract time from datetime in SQL server

I have a column offset in db as varchar(50) which contains a value such as 05:30:00 or -2:15:00.
I need to add or subtract this value from another column which is a DATETIME datatype as 2011-07-22 14:51:00.
Try something like:
select convert(datetime, '05:30:00') + GETDATE()
What's your database platform?
On MS SQL you'd do it like this...
-- Create some test data
create table dbo.MyData (
Adjustment varchar(50) NOT NULL,
BaseDate datetime NOT NULL
) on [primary]
go
insert into dbo.MyData ( Adjustment, BaseDate ) values ( '05:30:00', cast('2011-07-22 14:51:00' as datetime) )
insert into dbo.MyData ( Adjustment, BaseDate ) values ( '-2:15:00', cast('2011-06-12 10:27:30' as datetime) )
go
-- Perform the adjustment
select
c.Adjustment,
c.BaseDate,
c.AdjSecs,
dateadd(s, c.AdjSecs, c.BaseDate ) as AdjustedDate
from (
select
case
when left( Adjustment, 1 ) = '-' then -1 * datediff(s, 0, right( Adjustment, len(Adjustment) - 1 ))
-- positive values are used as-is; stripping the first character would turn '12:30:00' into '2:30:00'
else datediff(s, 0, Adjustment)
end as AdjSecs,
Adjustment,
BaseDate
from dbo.MyData
) as c
Note, this takes account of negative adjustment periods too.
Replace the getdate() function with your date column:
DECLARE @mytime AS VARCHAR(10)
SET @mytime = '2:15:00'
SELECT DATEADD(
s
,CASE
WHEN SUBSTRING(@mytime,1,1)='-'
THEN -DATEDIFF(s,0, SUBSTRING(@mytime,2,LEN(@mytime)-1))
ELSE DATEDIFF(s,0, @mytime)
END
,GETDATE()
)
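The sign handling in these answers can be summarised in a small sketch: strip the sign, turn the h:mm:ss text into a duration, then reapply the sign. The Python below is only an illustration of that logic, assuming offsets always have the form [-]h:mm:ss; the function name `parse_offset` is hypothetical.

```python
from datetime import datetime, timedelta


def parse_offset(text: str) -> timedelta:
    """Convert '05:30:00' or '-2:15:00' into a signed timedelta."""
    text = text.strip()
    sign = -1 if text.startswith("-") else 1
    hours, minutes, seconds = (int(part) for part in text.lstrip("+-").split(":"))
    return sign * timedelta(hours=hours, minutes=minutes, seconds=seconds)


base = datetime(2011, 7, 22, 14, 51, 0)
print(base + parse_offset("05:30:00"))  # 2011-07-22 20:21:00
print(base + parse_offset("-2:15:00"))  # 2011-07-22 12:36:00
```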