Create Excel DateTime Serial/Decimal Fraction values in SQL - sql

I am trying to recreate a staff member's Excel work in SQL to save time and also drive reporting.
In their spreadsheet, they take two time values, subtract the smaller from the larger to get a difference, and convert that difference to a serialised time value.
They then sum those serial values to drive performance calculations.
Is there a conversion or similar process in SQL that can return the same/similar serial time value so I can perform equivalent calculations (or has anyone experience with a function that achieves this)?
I have tried the following line in the code (based on the Excel DateTime explanation here), but the result does not match Excel's:
datediff(MINUTE,cf_pick_pack.date_start, cf_pick_pack.date_end) * (convert(float,1.00000000/1440)) as 'duration_serial'
SQL returns 0.00902777^, which is short of the 0.00923611 that Excel returns.

Ok, my bad. SQL was only resolving the difference to whole minutes because the datediff was set to MINUTE, so the leftover seconds were dropped.
The following works...
datediff(SECOND,cf_pick_pack.date_start, cf_pick_pack.date_end) * (convert(float,1.00000000/86400)) as 'duration_serial'
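To see why the unit matters, here is a minimal Python sketch of the same arithmetic. The two timestamps are made up; they are chosen to be 798 seconds apart because that gap reproduces the figures quoted in the question, not because they come from the real table.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400
MINUTES_PER_DAY = 1440


def duration_serial(start: datetime, end: datetime) -> float:
    """Return the elapsed time as a fraction of a 24-hour day (Excel-style serial)."""
    return abs((end - start).total_seconds()) / SECONDS_PER_DAY


# Hypothetical timestamps, 13 minutes 18 seconds (798 seconds) apart.
start = datetime(2020, 1, 1, 9, 0, 0)
end = datetime(2020, 1, 1, 9, 13, 18)

# Second resolution keeps the 18 leftover seconds.
by_seconds = duration_serial(start, end)

# Minute resolution throws the leftover seconds away before dividing.
whole_minutes = int((end - start).total_seconds()) // 60
by_minutes = whole_minutes / MINUTES_PER_DAY

print(round(by_seconds, 8))  # 0.00923611
print(round(by_minutes, 8))  # 0.00902778
```

The 18 dropped seconds account for the whole gap between the two results.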

Related

Invalid Time String Error when trying to change type of data from string to time

I am very new to data analytics and I need some help troubleshooting a SQL error I got. I have a column in this table which came over from Excel to SQL as a string type rather than a time type. I want to convert it to a time type so I can analyze it further.
So, I ran the attached query to try to change the data type using the CAST function. However, the query could not complete because of an outlier in the data set. I have yet to clean the data, and this was one of my first steps in doing so. How do I remove the particular row that contains the invalid time string so the query can actually work? Or is there a better way to convert this entire column from text string to time?
BigQuery Time types adjust values outside the 24 hour boundary - 00:00:00 to 24:00:00; for example, if you subtract an hour from 00:30:00, the returned value is 23:30:00.
Based on your screenshot it looks like you are storing a duration? So 330 hours, 25 minutes and 55 seconds?
You would probably be best using timestamp, converting the hours to days and adding the remainder to your minutes and seconds.
You can then cast the resulting string to timestamp.
Edit
A much simpler solution is just cast('330:25:55' as interval) - thanks to @MatBailie
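For anyone checking the data outside the database first, the duration idea can be sketched in Python. This is only an illustration of treating 'HHH:MM:SS' as an elapsed duration rather than a time of day; the helper names parse_duration and try_parse_duration are made up for this example.

```python
from datetime import timedelta
from typing import Optional


def parse_duration(text: str) -> timedelta:
    """Parse 'H:MM:SS' where the hour part may exceed 24 (a duration, not a clock time)."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def try_parse_duration(text: str) -> Optional[timedelta]:
    """Return None instead of failing, so bad rows can be found and cleaned later."""
    try:
        return parse_duration(text)
    except ValueError:
        return None


d = parse_duration("330:25:55")
print(d)                       # 13 days, 18:25:55
print(try_parse_duration("not a time"))  # None
```

The 330 hours become 13 days plus 18 hours, which is the "convert the hours to days" step described above.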

Latest date from a date list >180 days in past from a given date in same list

I have an "Appeared date" column A and next to it I have a ">180" date column B. There is also a "CONCAT" column C and an "ATTR" column D.
What I want to do is find the latest date that is 180 or more days in the past, and write it in the ">180" column, for each date in the "Appeared date" column, where the CONCAT column values are the same.
The date in the ">180" column should be more than 180 days before the date in the "Appeared date" column, and it should only ever be a date taken from the "Appeared date" column itself.
Based on this I would like to check whether a particular product also had "ATTR" = 'NEW' more than 180 days earlier, i.e. was it launched 180 days or more ago and is it appearing again recently?
Is there an Excel formula which can pick the nearest qualifying date (>180) from the "Appeared date" column and show it in the ">180" column?
Will it involve a mix of SMALL(), FREQUENCY(), MATCH(), INDEX() etc?
Or is a VBA procedure required?
To do this efficiently with formulas, you can use something called Range Slicing to reduce the size of the arrays to be processed, by efficiently truncating them so that they contain just the subset of those 3,000 to 50,000 rows that could possibly hold the correct answer, and THEN doing the actual equality check. (As opposed to your MAX/Array approach, which does computationally expensive array operations on all the rows, even though most of the rows have no relationship with the current row that you seek an answer for.)
Here's my approach. First, here's my table layout:
...and here's my formulas:
180: =[@Appeared]-180
Start: =MATCH([@CONCAT],[CONCAT],0)
End: =MATCH([@CONCAT],[CONCAT],1)
LastRow: =MATCH(1,--(OFFSET([Appeared],[@Start],,[@End]-[@Start])>[@180]),0)+[@Start]-1
LastItem: =INDEX([Appeared],[@LastRow])
LastDate > 180: =IF([@Appeared]-[@LastItem]>180,[@LastItem],"")
Days: =IFERROR([@Appeared]-[@[LastDate > 180]],"")
Even with this small data set, my approach is around twice as fast as your MAX approach. And as the size of the data grows, your approach is going to slow down much faster than linearly (roughly with the square of the row count, since every row scans every other row), as more and more processing power is wasted on crunching rows that can't possibly contain the answer. Whereas mine will get slower in a roughly linear fashion. We're probably talking a difference of minutes, or perhaps even an hour or so at the extremes.
Note that while you could do my approach with a single mega-formula, you would be wise not to: it won't be anywhere near as efficient. Splitting your mega-formulas into separate cells is a good idea in any case because it may help speed up calculation due to something called multithreading. Here's what Diego Oppenheimer, a former program manager for Microsoft Excel, had to say on the subject back in 2005:
Multithreading enables Excel to spot formulas that can be calculated concurrently, and then run those formulas on multiple processors simultaneously. The net effect is that a given spreadsheet finishes calculating in less time, improving Excel’s overall calculation performance. Excel can take advantage of as many processors (or cores, which to Excel appear as processors) as there are on a machine—when Excel loads a workbook, it asks the operating system how many processors are available, and it creates a thread for each processor. In general, the more processors, the better the performance improvement.
Diego went on to outline how spreadsheet design has a direct impact on any performance increase:
A spreadsheet that has a lot of completely independent calculations should see enormous benefit. People who care about performance can tweak their spreadsheets to take advantage of this capability.
The bottom line: Splitting formulas into separate cells increases the chances of calculating formulas in parallel, as further outlined by Excel MVP and calculation expert Charles Williams at the following links:
Decision Models: Excel Calculation Process
Excel 2010 Performance: Performance and Limit Improvements
I think I found the answer. Earlier I was using the MIN function, though incorrectly, as the dates in the array formula (when you select it and hit the F9 key) were coming in descending order. So I finally used the MAX function to find the most recent date that was at least 180 days in the past.
=IF(MAX(IF(--(A2-$A$2:$A$33>=180)*(--(C2=$C$2:$C$33))*(--($D$2:$D$33="NEW")),$A$2:$A$33))=0,"",MAX(IF(--(A2-$A$2:$A$33>=180)*(--(C2=$C$2:$C$33))*(--($D$2:$D$33="NEW")),$A$2:$A$33)))
Check the revised Sample.xlsx, which is self-explanatory. I have added the Attr='NEW' criterion to the formula for the final workaround, to find whether there were any new items that appeared 180 days or more earlier.
An ADO query alternative may still be required to process large amounts of data, though.
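For comparison outside Excel, the same lookup can be sketched in a few lines of Python. This is a plain restatement of what the MAX array formula computes (same CONCAT value, ATTR = "NEW", at least 180 days earlier, take the latest), not a translation of the range-slicing approach. The function name and the sample rows are made up for illustration.

```python
from datetime import date, timedelta


def latest_prior_new(rows, min_gap_days=180):
    """For each (appeared, concat, attr) row, return the latest 'NEW' appeared date
    with the same concat value that is at least min_gap_days earlier, else None."""
    results = []
    for appeared, concat, _attr in rows:
        cutoff = appeared - timedelta(days=min_gap_days)
        candidates = [
            d for d, c, a in rows
            if c == concat and a == "NEW" and d <= cutoff
        ]
        results.append(max(candidates) if candidates else None)
    return results


# Hypothetical sample data: (Appeared date, CONCAT, ATTR)
rows = [
    (date(2016, 1, 1), "A", "NEW"),
    (date(2016, 3, 1), "A", "NEW"),
    (date(2016, 9, 1), "A", "OLD"),
    (date(2016, 9, 1), "B", "NEW"),
]

for row, found in zip(rows, latest_prior_new(rows)):
    print(row[0], row[1], found)
```

Like the array formula, this scans every row for every row, so it is meant to show the logic rather than to be fast on 50,000 rows.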

Store datetime -time only in Access database

I have a VB.NET program that updates a time value in an Access database. The database is connected using OleDb.
Basically this is what is happening:
Dim commandBuilder As New OleDb.OleDbCommandBuilder(dataEventAdapter)
eventDataset.Tables("EventList").Rows(selectedEvent)("EventTime") = Format(dateTimePick.Value, "hh:mm tt")
dataEventAdapter.Update(eventDataset, "EventList")
The time is taken from a datetime picker, and it should store only the time value.
The problem is that the database already has values in it which only have the time, like 9:00 AM, but when I update with this, it gets the date as well. And honestly I don't know where it gets the date from. If I
MsgBox(Format(dateTimePick.Value, "hh:mm tt"))
I get only the time, and nothing else.
How can I store the time only?
If you look at the data types available in MS Access, you will find that there is no type just for time values, only a Date/Time type. This means that Access always stores both the date AND the time for the values you supply. What you see in the MS Access grid is controlled by the Format setting in the table's design view, and there you can change it to show just the time part of your data.
That said, the problem is that you supply a string rather than a DateTime value. Access is gracious(?) enough not to raise an exception for this, but compensates by adding a date itself, so you should see the current day for every value you supply.
So you should be less concerned about how your value is displayed and more about how you pass that value to the database. If only the time part is meaningful for your program, then letting the database engine convert your string back to a datetime value is not an option (not to mention the localization issues that this automatic conversion involves).
I suggest passing a constant value for the date part (like DateTime.MinValue or 1/1/1) and adding your time to this value. That way you can easily ignore the date part if you eventually need to run queries on this data.
Dim dt As DateTime = New DateTime(1, 1, 1, dateTimePick.Value.Hour, _
                                  dateTimePick.Value.Minute, _
                                  dateTimePick.Value.Second)
eventDataset.Tables("EventList").Rows(selectedEvent)("EventTime") = dt
You can make a simple experiment in Access. Open the Immediate window with Ctrl-G and enter
?Format(#00:00:00#,"yyyy/mm/dd hh:nn:ss") [Enter]
1899/12/30 00:00:00
?Format(#08:31:57#,"yyyy/mm/dd hh:nn:ss") [Enter]
1899/12/30 08:31:57
The result shows you the origin Access uses for its time axis.
Another experiment shows this:
?#1899/12/30 08:31:57# [Enter]
08:31:57
Access automatically displays only the time part for the date 1899/12/30.
Therefore I suggest using this date as the base for time-only data.
Access uses Double values to store dates internally, where the integer part represents the number of days elapsed since 1899/12/30 and the decimal fraction represents the time as fraction of 24h (i.e. 0.25 is 06:00 am and 0.75 is 18:00).
?CDbl(#1899/12/30 08:00:00#) [Enter]
0.333333333333333
?CDbl(#1899/12/30#) [Enter]
0
?CDate(0) [Enter]
00:00:00
?CDate(0.25) [Enter]
06:00:00
In .NET you can use the System.DateTime.FromOADate(d As Double) As Date method for the conversion of Access Dates given as Double to .NET Dates (VB Date = System.DateTime).
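The same day-count arithmetic is easy to reproduce by hand. Below is a minimal Python sketch of the conversion, assuming non-negative values only (dates on or after 1899/12/30). The function names from_oadate and to_oadate are made up here and are not part of any library.

```python
from datetime import datetime, timedelta

# Day zero of the Access / OLE Automation date axis, as shown in the experiments above.
OA_EPOCH = datetime(1899, 12, 30)


def from_oadate(value: float) -> datetime:
    """Convert a non-negative Access-style Double to a datetime."""
    return OA_EPOCH + timedelta(days=value)


def to_oadate(moment: datetime) -> float:
    """Convert a datetime on or after 1899/12/30 to an Access-style Double."""
    return (moment - OA_EPOCH) / timedelta(days=1)


print(from_oadate(0.25))                          # 1899-12-30 06:00:00
print(from_oadate(0.75))                          # 1899-12-30 18:00:00
print(to_oadate(datetime(1899, 12, 30, 8, 0)))    # about 0.3333
```

The integer part moves the date forward from the epoch and the fraction is the time of day, matching the CDbl and CDate results above.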
You are confusing data types with formatting. In your database, the column you are inserting into has a datetime data type (Access has no data type for just time). This means that it stores everything that goes in there as a date + a time.
If in Access you are seeing values with only a time, it's likely that Access decided the date is not worth showing (possibly because it was stored with the base date of 12/30/1899).
The thing to remember is that the date is still being stored. When you re-display the data, just format it to display only the time. Judging from your code example, you already know how to do that.

Converting from excel formula for Using forecast with times

When using forecast, you input a number and it should return a value based on the known X data and Known Y data.
However if you put in a time this does not work.
I need two things.
First of all I need the VBA equivalent of forecast. I suspect this to be application.forecast
Then how to use the date as a value for the forecast to work as it should
The formula is as follows:
=FORECAST(15:00:00,A10:A33,B10:B33)
Currently this equation flags up an error.
Any ideas to get this to work for time values?
I see two potential problem areas. The first is the time. Use the TIME function to get a precise time. Second, in D9:D12, the values are left-aligned. Typically, this means they are text, not true numbers. If you absolutely require the m suffix, use a Custom number Format of General\m in order that they retain their numeric status while displaying an m as an increment suffix. If you type the m in, they become text-that-look-like-numbers and are useless for any maths.
=FORECAST(TIME(15, 0, 0), B10:B33, A10:A33)
That returns 3.401666667 which is either 09:38 AM or 3.4 m (it's been a while since I played with the FORECAST function).
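To make it clearer what the TIME function contributes, here is a rough Python sketch of a FORECAST-style calculation (a simple least-squares line through the known points, evaluated at x). The sample data is invented, and forecast and time_serial are illustrative helpers written for this example, not Excel or VBA functions.

```python
def time_serial(hours: int, minutes: int, seconds: int) -> float:
    """Fraction of a day, the numeric value Excel's TIME(h, m, s) produces."""
    return (hours * 3600 + minutes * 60 + seconds) / 86400


def forecast(x: float, known_ys, known_xs) -> float:
    """Fit y = a + b*x by least squares and evaluate it at x."""
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(known_xs, known_ys))
    sxx = sum((xi - mean_x) ** 2 for xi in known_xs)
    slope = sxy / sxx
    return mean_y + slope * (x - mean_x)


# Hypothetical readings taken at 09:00, 12:00 and 18:00.
known_xs = [time_serial(9, 0, 0), time_serial(12, 0, 0), time_serial(18, 0, 0)]
known_ys = [1.0, 2.0, 4.0]

print(time_serial(15, 0, 0))                               # 0.625
print(forecast(time_serial(15, 0, 0), known_ys, known_xs))
```

The point is that the time has to reach the calculation as a number (15:00 is 0.625 of a day); a bare 15:00:00 typed into the formula is not a valid numeric argument.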

How do I get the time period start time using SQL over PI-ODBC?

I'm using the most recent PI-OLEDB library to read data from aggregate views in OSIsoft PI Historian into SQL Server. Example:
SELECT time, value
FROM piavg
WHERE
timestep = RELDATE('1h')
AND tag = TAGNAME('mytag')
AND time > DATE('4-Mar-12 00:00:00');
Unfortunately, the aggregate views (PIavg, etc.) only provide a single time column, which represents the end of the period specified by the timestep column.
How can I retrieve the start time as well for the same period? I know PI-SQL supports some funky date math literals, but I can't quite figure out the syntax for time - RELDATE('1h') or whatever that can then be aliased as the starttime.
(Caveat: I don't use PI, so I'm flying blind and can't just trial-and-error this. I have the PI OLEDB Data Provider Manual, but it's pretty sparse on details.)
I realize I could cobble something together in SQL Server, but I'd rather use PI date functions so when SQL Server gets the data back there's no additional work needed. I'm working with a number of timestep values, so it's not just a static DATEADD() in SQL Server.
It turns out that time - RELDATE('1h') works perfectly.