Converting Time into Minutes in Pentaho (PDI Script) - pentaho

I want to calculate the sum of activity times (HH:mm:ss) for various transactions in PDI. For example, consider 3 activity times: 1) Activity 1 - 01:22:03,
2) Activity 2 - 01:10:11 and 3) Activity 3 - 00:22:20. The sum of all these times should be 02:54:34, but the result displays a negative value. How can I fix it?

I ran into your issue (and the solution) almost by accident. It's worth explaining in a bit of detail.
Date fields are not meant for durations. They define instants. If you define a date field without its date part you're actually defining it as an instant on 1 Jan 1970, which is Unix time's start.
So, if you take your first timestamp, when you set 01:22:03 as a date field you're actually setting it to be "1 Jan 1970 01:22:03". You'd expect this field to return 4923 as its value in seconds (e.g., using getTime() in JavaScript, which returns milliseconds, and dividing by 1000). This would work out nicely for your calculations; you could then add them up, and re-format to display.
However, if you don't specify the timezone when setting the date value, then the date field will use your LOCAL timezone settings to define that time.
So, if you're in the NY timezone (UTC-5), defining 01:22:03 as a Date field with format HH:mm:ss returns "1 Jan 1970 06:22:03 UTC", which is 22923 in seconds.
If you're in Paris or another city which was ahead of UTC on 1 Jan 1970, some or all of your durations may return negative values.
The reason I say I ran into it by accident is because I'm based in London, which should be on UTC in the winter. However, oddly, that was not the case in 1970 (see UNIX timestamp(0): Europe/London returns UTC+1)
So, when calculating the timestamps in London timezone, I got:
Local time, seconds in Unix time
1:22:03, 1323
1:10:11, 611
0:22:20, -2260
These numbers add up to -326.
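The figures in this table can be reproduced with plain arithmetic. The following is a minimal sketch in JavaScript, assuming a fixed offset of +3600 seconds for London and -18000 seconds for New York on 1 Jan 1970; the helper name epochSecondsAtOffset is made up for illustration.

```javascript
// Seconds since the Unix epoch for a wall-clock time on 1 Jan 1970,
// read in a zone that is offsetSeconds ahead of UTC (negative if behind).
function epochSecondsAtOffset(h, m, s, offsetSeconds) {
  // Date.UTC interprets the fields as UTC; subtracting the offset
  // gives the instant that the local wall-clock time really denotes.
  return Date.UTC(1970, 0, 1, h, m, s) / 1000 - offsetSeconds;
}

var LONDON_1970 = 3600;     // assumption: UTC+1
var NEW_YORK_1970 = -18000; // assumption: UTC-5

var london = [
  epochSecondsAtOffset(1, 22, 3, LONDON_1970),  // 1323
  epochSecondsAtOffset(1, 10, 11, LONDON_1970), // 611
  epochSecondsAtOffset(0, 22, 20, LONDON_1970)  // -2260
];
var londonSum = london[0] + london[1] + london[2]; // -326
var newYorkFirst = epochSecondsAtOffset(1, 22, 3, NEW_YORK_1970); // 22923
```

Each value is off from the true duration by exactly the zone offset, which is why the error grows with the number of rows summed.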
My suggestion:
Don't define durations as dates, or timestamps; that's not what they are. Durations are time intervals. A 1h duration is worth the same regardless of the year, day or timezone you measure it in.
Instead, just parse the values in a javascript step and do the math without resorting to date parsing.
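As a rough sketch of that suggestion, the durations can be split on ":" and summed as plain seconds. The function names here are illustrative, not part of PDI, and the code sticks to older JavaScript syntax on the assumption that the script step's engine may not support newer features. Wiring it to the step's input and output fields is left out.

```javascript
// Convert an "HH:mm:ss" string to a plain count of seconds (no Date objects).
function durationToSeconds(text) {
  var parts = text.split(":");
  return parseInt(parts[0], 10) * 3600 +
         parseInt(parts[1], 10) * 60 +
         parseInt(parts[2], 10);
}

// Left-pad a non-negative integer to two digits.
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Format a count of seconds back into "HH:mm:ss" (hours may exceed 23).
function secondsToDuration(total) {
  var h = Math.floor(total / 3600);
  var m = Math.floor((total % 3600) / 60);
  var s = total % 60;
  return pad2(h) + ":" + pad2(m) + ":" + pad2(s);
}

var activities = ["01:22:03", "01:10:11", "00:22:20"];
var totalSeconds = 0;
for (var i = 0; i < activities.length; i++) {
  totalSeconds += durationToSeconds(activities[i]);
}
var totalMinutes = totalSeconds / 60;              // duration in minutes
var totalText = secondsToDuration(totalSeconds);   // "02:54:34"
```

Because no date parsing is involved, the result is the same in every timezone.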
Hacks to get around the problem (which I don't recommend):
explicitly set the timezone as +0000 when converting the fields to dates;
change your computer's timezone to UTC.

Related

Time offset at specific date

I was wondering if anybody knows a function to convert UTC to local time but at the time the record was saved, not the time the script is being run.
Example:
I have a record saved in February and another one in August. Both are in UTC.
When I look at them through the application the February one shows the time -5 hrs and the August one shows the time -4 hours.
When I run a SQL script I need to see the same but using the T-SQL functions both show their time -5 hours (or -4 hours) depending when I run the script.
Analyzing the situation a bit further, I realized it's more complicated than I thought.
There are 4 possibilities: the record was saved during DST or ST, and the SQL script is run during DST or ST.
If both are in DST or both in ST, then I just have to subtract the offset; if the record was saved in ST and the query is run in DST, I'll have to subtract the offset - 1, but if it's the other way around (record in DST and script run in ST), I should subtract the offset + 1.
Let's assume the UTC date was 2017-08-01 13:30:00 and I'm on the East Coast (US)
Select DateAdd(MINUTE, DateDiff(MINUTE, GetUTCDate(), GetDate()), '2017-08-01 13:30:00')
Returns
2017-08-01 09:30:00.000
Also, take a peek at SYSDATETIMEOFFSET()
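Note that the query above applies the offset in effect at the moment the script runs. Getting the offset that applied on the record's own date requires a lookup against time zone rules for that date; in SQL Server 2016 and later, AT TIME ZONE performs a date-aware conversion of that kind. Outside SQL, the same idea can be sketched in JavaScript. This assumes a runtime that ships time zone data (for example a recent Node.js), uses the zone name America/New_York for the US East Coast, and the helper localHourMinute is hypothetical.

```javascript
// Return the local hour and minute of a UTC instant in the given IANA zone,
// using the zone's rules for that instant's own date (DST or standard time).
function localHourMinute(utcIso, timeZone) {
  var parts = new Intl.DateTimeFormat("en-US", {
    timeZone: timeZone,
    hourCycle: "h23",
    hour: "numeric",
    minute: "numeric"
  }).formatToParts(new Date(utcIso));
  var out = {};
  for (var i = 0; i < parts.length; i++) {
    out[parts[i].type] = parts[i].value;
  }
  return { hour: Number(out.hour), minute: Number(out.minute) };
}

// A record saved in August falls in DST (UTC-4); one saved in February does not (UTC-5).
var august = localHourMinute("2017-08-01T13:30:00Z", "America/New_York");
var february = localHourMinute("2017-02-01T13:30:00Z", "America/New_York");
```

The result depends only on the stored date, not on when the code is executed.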

sql Sum time that occurs in an interval

I have a lot of data that has a start time and a finish time. These are stored in datetime format.
I want to sum the time that falls within a given time interval.
If I specify the time interval 08-11,
I only want to get the time between these two, even if the event runs from 06 to 12.
If you are using SQL Server you could do it like that:
SELECT SUM(DATEDIFF(HOUR,StartTimeColumn,EndTimeColumn)) AS ElapsedHoursTotal,
SUM(DATEDIFF(MINUTE,StartTimeColumn,EndTimeColumn)) AS ElapsedMinutesTotal,
SUM(DATEDIFF(SECOND,StartTimeColumn,EndTimeColumn)) AS ElapsedSecondsTotal
FROM dbo.YourTable
You will have to find the right interval (the first parameter of the DATEDIFF function) for your requirements: hours, minutes, seconds, nanoseconds, ...
I'm using MSSQL 2012.
The problem is that I can get the full elapsed time from start to finish, but I only want the part that matches my search.
If the interval I match for is 8-11:
thing one, 08-12, should produce 3 hours
thing two, 10-11, should produce 1 hour
thing three, 9.30-14, should produce 1.5 hours
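The expected results above follow from clipping each event to the search window: the overlap is the smaller of the two end times minus the larger of the two start times, or zero if that is negative. A minimal sketch of that arithmetic in JavaScript, with times expressed as decimal hours within one day and a made-up function name:

```javascript
// Hours of the event [start, end] that fall inside the window [winStart, winEnd].
// All arguments are decimal hours on the same day, e.g. 9.5 for 09:30.
function overlapHours(start, end, winStart, winEnd) {
  var from = Math.max(start, winStart);
  var to = Math.min(end, winEnd);
  return Math.max(0, to - from);
}

var thingOne = overlapHours(8, 12, 8, 11);     // 3
var thingTwo = overlapHours(10, 11, 8, 11);    // 1
var thingThree = overlapHours(9.5, 14, 8, 11); // 1.5
var total = thingOne + thingTwo + thingThree;  // 5.5
```

The same max/min clipping can be written inside the DATEDIFF arguments with CASE expressions before summing.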

How is the date argument formatted in the euronext data download url?

I'd like to download historical stock prices from nyx.com with a script. The download URL has the following form:
https://europeanequities.nyx.com/nyx_eu_listings/price_chart/download_historical?typefile=csv&layout=vertical&typedate=dmy&separator=point&mic=XPAR&isin=FR0010557264&name=AB%20SCIENCE&namefile=Price_Data_Historical&from=1356998400000&to=1386115200000&adjusted=1&base=0
The format of most arguments is obvious, except for the "from" and "to" arguments, which determine the begin and end date of the historical prices. In this example, the begin date is January 1, 2013 and the end date is December 4, 2013. How are these dates transformed into numbers like 1356998400000 and 1386115200000?
P.S. I'd rather not use Yahoo finance due to the large amount of errors in the data, especially for the European Markets.
In case you are still interested in the answer: the numbers used for the "from" and "to" fields are timestamps in milliseconds, not in seconds as usual.
Basically, this is the number of milliseconds elapsed since 1 January 1970 00:00:00.000 UTC.
For the first number, 1356998400000, you can drop the last three zeros (that is, divide by 1000) to convert it to a classic timestamp in seconds. Any timestamp converter will map 1356998400 to 1/1/2013 00:00:00 UTC, which displays as 1:00:00 in a UTC+1 timezone such as Paris.
For the second number, once truncated, 1386115200 converts to 12/04/2013 00:00:00 UTC (again 1:00:00 at UTC+1).
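For building these arguments in a script, here is a small sketch in JavaScript, where Date.UTC already returns milliseconds since the epoch. It assumes the site expects midnight UTC for each date, which matches the two numbers in the example URL; the variable names are illustrative.

```javascript
// Months are zero-based in Date.UTC: 0 = January, 11 = December.
var fromMs = Date.UTC(2013, 0, 1);  // 1 January 2013, 00:00:00 UTC
var toMs = Date.UTC(2013, 11, 4);   // 4 December 2013, 00:00:00 UTC

// Going the other way: decode a value taken from the URL.
var decoded = new Date(1356998400000).toISOString();

var query = "from=" + fromMs + "&to=" + toMs;
```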
Hope it helps, and thanks for the link to nyx.com! I had been looking for something like that for a long time!

datetime manipulation: replace all dates with 00:00 time with 24:00 the previous day

I have a table described here: http://sqlfiddle.com/#!3/f8852/3
The date_time field for when the time is 00:00 is wrong. For example:
5/24/2013 00:00
This should really be:
5/23/2013 24:00
So hour 00:00 corresponds to the last hour of the previous day (I didn't create this table but have to work with it). Is there a quick way, when I do a select, to replace all dates that have 00:00 as the time with 24:00 the previous day? I can do it easily in Python in a for loop but am not quite sure how to structure it in SQL. Appreciate the help.
All datetimes are instants in time, not spans of finite length, and each can exist in only one day. The instant that represents midnight is, by definition, in the next day, the day of which it is the start. In other words, a day is closed at its beginning and open at its end: valid time values within a single calendar date run from 00:00:00.000 up to, but not including, 24:00:00.
This would be analogous to asking that the minute value within an hour be allowed to vary from 1 to 60, instead of from 0 to 59, and that the value of 60 was the last minute of the previous hour.
What you are talking about is only a display issue. Even if you could enter a date as 1 Jan 2013 24:00, (24:00:00 is not a legal time of day) it would be entered as a datetime at the start of the date 2 Jan, not at the end of 1 Jan.
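Since this is a display issue, one option is to leave the stored values alone and relabel midnight only when formatting. A minimal sketch in JavaScript, reading the UTC fields of the date so the output does not depend on the machine's timezone; the function names and the M/D/YYYY HH:mm layout are assumptions for illustration.

```javascript
// Left-pad a non-negative integer to two digits.
function twoDigits(n) {
  return (n < 10 ? "0" : "") + n;
}

// Format a Date as "M/D/YYYY HH:mm", showing exact midnight
// as 24:00 on the previous calendar day.
function formatWithHour24(date) {
  var isMidnight = date.getUTCHours() === 0 && date.getUTCMinutes() === 0 &&
                   date.getUTCSeconds() === 0 && date.getUTCMilliseconds() === 0;
  // For midnight, take the calendar date from one day earlier.
  var shown = isMidnight ? new Date(date.getTime() - 86400000) : date;
  var time = isMidnight
    ? "24:00"
    : twoDigits(date.getUTCHours()) + ":" + twoDigits(date.getUTCMinutes());
  return (shown.getUTCMonth() + 1) + "/" + shown.getUTCDate() + "/" +
         shown.getUTCFullYear() + " " + time;
}

var relabeled = formatWithHour24(new Date(Date.UTC(2013, 4, 24, 0, 0)));  // "5/23/2013 24:00"
var unchanged = formatWithHour24(new Date(Date.UTC(2013, 4, 24, 13, 5))); // "5/24/2013 13:05"
```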
One thing that illustrates this is that, because of rounding (SQL Server can only resolve datetimes to within about 3 milliseconds), if you create a datetime that is only a millisecond before midnight, it will round up to midnight and move to the next day, as can be seen by running the following in Enterprise Manager...
Select cast ('1 Jan 2013 23:59:59.999' as datetime)
SQL Server stores all datetimes as two integers, one that represents the number of days since 1 Jan 1900, and the other the number of ticks (1 tick is 1/300th of a second, about 3.33 ms) since midnight. If zero time has elapsed since midnight, it is still the same day, not the previous day.
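The rollover at 23:59:59.999 can be checked with the tick arithmetic alone. The sketch below, in JavaScript, is only a model of rounding to the nearest 1/300 s, not SQL Server's actual implementation; the names are made up for illustration.

```javascript
var TICKS_PER_SECOND = 300;
var TICKS_PER_DAY = 86400 * TICKS_PER_SECOND; // 25920000

// Round a time of day, given in whole milliseconds since midnight,
// to the nearest tick, carrying into the next day when the ticks fill a day.
function toDayAndTicks(msSinceMidnight) {
  var ticks = Math.round(msSinceMidnight * TICKS_PER_SECOND / 1000);
  return {
    dayCarry: Math.floor(ticks / TICKS_PER_DAY),
    ticks: ticks % TICKS_PER_DAY
  };
}

var ms999 = (23 * 3600 + 59 * 60 + 59) * 1000 + 999; // 23:59:59.999
var rolled = toDayAndTicks(ms999);     // 25919999.7 ticks rounds up to a full day
var kept = toDayAndTicks(ms999 - 2);   // 23:59:59.997 stays in the same day
```

With .999 the rounded tick count equals a whole day, so the value becomes midnight of the next date.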
If you have been inserting data assuming that midnight 00:00:00 means the end of the day, you need to fix that.
If you need to correct your existing data, you need to add one day to every date in your database that has midnight as its time component (i.e., has a zero time component). Replace your_table below with the actual table name.
Update your_table set
date_time = dateAdd(day, 1, date_time)
Where date_time = dateadd(day, datediff(day, 0, date_time), 0)

Bucketize CFAbsoluteTimes into round day NSDates

I have an NSArray of CFAbsoluteTimes. They should be sorted from earliest to latest, but if not I can sort them.
What I need to do is find the min and max date (e.g. Jan 1 to Jan 5) and create a bucketization that shows the count for each day between, e.g.:
Jan 1 - 1
Jan 2 - 0
Jan 3 - 4
Jan 4 - 0
Jan 5 - 3
Something like that. What is the simplest way to turn the absolute times into a rounded NSDate of some sort I can count? Intermediate forms don't really matter to me. I just need to write a function that returns a count when given a date.
You'll probably be happier working completely in Cocoa for this problem, using NSDate instead of CFAbsoluteTime.
The applicable document is Apple's Date and Time Programming Guide.
The general approach you'll need is to work within an NSCalendar, which encapsulates all of the information about days per month, months per year, leap years, and DST changes, all for a particular time zone. Convert your NSDate instances to NSDateComponents and you'll be able to extract the day and month numbers, then bucketize from there.
Remember that the function you write (return a count when given a date) will implicitly or explicitly have to handle NSCalendar and NSTimeZone values. The answer will vary by year (is it a leap year? does the interval include a leap day) and locale/date (are we observing Daylight Saving Time in this location right now?).
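To show just the counting step, here is a language-neutral sketch written in JavaScript rather than Cocoa. It deliberately sidesteps the calendar and time zone handling described above by treating every day as a fixed 86400-second UTC day, so it is not a substitute for NSCalendar. It relies on CFAbsoluteTime counting seconds from 1 Jan 2001 00:00:00 GMT, which is 978307200 seconds after the Unix epoch; the function name is hypothetical.

```javascript
var CF_EPOCH_OFFSET = 978307200; // seconds from 1 Jan 1970 to 1 Jan 2001 (UTC)
var SECONDS_PER_DAY = 86400;

// Count absolute times per UTC day, including empty days between min and max.
function bucketizeByUtcDay(cfTimes) {
  var days = [];
  for (var i = 0; i < cfTimes.length; i++) {
    days.push(Math.floor((cfTimes[i] + CF_EPOCH_OFFSET) / SECONDS_PER_DAY));
  }
  if (days.length === 0) return [];
  var min = Math.min.apply(null, days);
  var max = Math.max.apply(null, days);
  var buckets = [];
  for (var d = min; d <= max; d++) {
    var label = new Date(d * SECONDS_PER_DAY * 1000).toISOString().slice(0, 10);
    buckets.push({ day: label, count: 0 });
  }
  for (var j = 0; j < days.length; j++) {
    buckets[days[j] - min].count++;
  }
  return buckets;
}
```

Given one time on 1 Jan, four on 3 Jan and three on 5 Jan, this returns five buckets with counts 1, 0, 4, 0 and 3; the input does not need to be sorted.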