Doing math with sprintf - printf

When creating a label for gnuplot from data read from text files, how would I get the difference in hours:minutes between two columns that each contain an H:M timestamp (e.g. 23:42)?
For example, this creates a concatenation of two columns for an existing label:
myDate(col1,col3)=sprintf("%s-%s",strcol(1),strcol(3))
Is it possible to modify it to do that date math, to get something like:
timeDiffLabel(col5,col6)=sprintf(do-some-math-here,strcol(5),strcol(6))

To parse a time, use the strptime function:
print strptime('%H:%M', '12:34')
This prints 45240.0, which is the number of seconds parsed from the time string.
If you parse the strings from the two columns like this, you can subtract the values and reformat the result with strftime:
timeDiff(c1, c2) = strftime('%k:%M', strptime('%H:%M', strcol(c2)) - strptime('%H:%M', strcol(c1)))
plot 'test.dat' using 0:0:(timeDiff(1,2)) with labels
This works in principle, but only for positive differences. If the difference is e.g. -1 hour, you'll get 23:00, because the str*time functions work on datetimes and a negative number of seconds wraps around to the previous day.
A more sophisticated solution uses only the absolute value of the difference for the actual formatting, and prepends an optional - to the result:
timeDiff(c1, c2) = (diff = strptime('%H:%M', strcol(c2)) - strptime('%H:%M', strcol(c1)), (diff < 0 ? '-' : '').strftime('%k:%M', abs(diff)))
plot 'test.dat' using 0:0:(timeDiff(1,2)) with labels
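If you want to cross-check the labels outside gnuplot, here is a minimal Python sketch of the same idea: format the absolute value of the difference and prepend the sign. The helper name time_diff_label is made up for this illustration, and it does the arithmetic in whole minutes instead of calling strptime/strftime, so the exact spacing of gnuplot's output may differ.

```python
def time_diff_label(start, end):
    """Return the signed difference end - start as 'H:MM' (hypothetical helper)."""
    def to_minutes(hm):
        # 'H:M' -> minutes since midnight
        hours, minutes = hm.split(":")
        return int(hours) * 60 + int(minutes)

    diff = to_minutes(end) - to_minutes(start)
    sign = "-" if diff < 0 else ""
    # Format only the absolute value, then prepend the sign.
    hours, minutes = divmod(abs(diff), 60)
    return f"{sign}{hours}:{minutes:02d}"


for start, end in [("12:34", "23:54"), ("13:45", "11:44"), ("2:33", "1:11")]:
    print(start, end, time_diff_label(start, end))
```

For the three sample rows this gives 11:20, -2:01 and -1:22.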
So, with a test file like
12:34 23:54
13:45 11:44
2:33 1:11
you get

Convert float64 to a string with thousand separators

I have a Population Estimate series with numbers as float64, and I need to convert them to strings with a thousands separator (using commas), keeping all significant digits (no rounding).
e.g. 12345678.90345 -> 12,345,678.90345
Try applying a comma-float string formatter.
population = population.apply('{:,.5f}'.format)
To achieve the desired formatting, you could use '{:,}'.format.
This will use commas as thousands separator and only output the values that are in your data and not clip or fill to a certain number of digits.
data = data.apply('{:,}'.format)
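To see the difference between the two format strings without a DataFrame, here is a small plain-Python sketch. The function names with_commas and with_commas_fixed are just illustrative; either function could be passed to apply in the same way as the snippets above.

```python
def with_commas(value):
    """Thousands separators, printing only the digits the value actually has."""
    return "{:,}".format(value)


def with_commas_fixed(value, decimals=5):
    """Thousands separators, always padded or rounded to `decimals` places."""
    return "{:,.{d}f}".format(value, d=decimals)


print(with_commas(12345678.90345))        # 12,345,678.90345
print(with_commas(1234.5))                # 1,234.5
print(with_commas_fixed(1234.5))          # 1,234.50000
```

So the fixed-precision variant fills shorter values with zeros, while the plain '{:,}' variant does not.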

Converting Negative Number in String Format to Numeric when the Sign is at the End

I have certain numbers within a column of my dataframe that have negative numbers in a string format like this: "500.00-" I need to convert every negative number within the column to numeric format. I'm sure there's an easy way to do this, but I have struggled finding one specific to pandas dataframe. Any help would be greatly appreciated.
I have tried the basic to_numeric function as shown below, but it doesn't read the values in correctly. Also, only some of the numbers within the column are negative, so I can't simply remove all the negative signs and multiply the column by -1.
Q1['Credit'] = pd.to_numeric(Q1['Credit'])
Sample data:
df:
num
0 50.00
1 60.00-
2 70.00+
3 -80.00
Use the Series str accessor to check the last character. If it is '-' or '+', move it to the front. Use mask to apply this only to rows that have -/+ as a suffix. Finally, convert the column to float with astype:
df.num.mask(df.num.str[-1].isin(['-','+']), df.num.str[-1].str.cat(df.num.str[:-1])).astype('float')
Out[1941]:
0 50.0
1 -60.0
2 70.0
3 -80.0
Name: num, dtype: float64
This is possibly a bit verbose, but it works:
# build a mask of negative numbers
m_neg = Q1["Credit"].str.endswith("-")
# remove - signs
Q1["Credit"] = Q1["Credit"].str.rstrip("-")
# convert to number
Q1["Credit"] = pd.to_numeric(Q1["Credit"])
# Apply the mask to create the negatives
Q1.loc[m_neg, "Credit"] *= -1
Let us consider the following example dataframe:
Q1 = pd.DataFrame({'Credit':['500.00-', '100.00', '300.00-']})
Credit
0 500.00-
1 100.00
2 300.00-
We can use str.endswith to create a mask which indicates the negative numbers. Then we use np.where to conditionally convert the numbers to negative:
m1 = Q1['Credit'].str.endswith('-')
m2 = Q1['Credit'].str.rstrip('-').astype(float)
Q1['Credit'] = np.where(m1, -m2, m2)
Output
Credit
0 -500.0
1 100.0
2 -300.0
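The "move a trailing sign to the front" idea can also be written as a small plain-Python function. This is only a sketch: parse_signed is a made-up name, and it assumes every value is a string with at most one sign character, either leading or trailing.

```python
def parse_signed(text):
    """Convert strings like '60.00-' or '70.00+' to float (hypothetical helper)."""
    text = text.strip()
    # If the sign is the last character, move it to the front before parsing.
    if text and text[-1] in "+-":
        text = text[-1] + text[:-1]
    return float(text)


values = ["50.00", "60.00-", "70.00+", "-80.00", "100.05"]
print([parse_signed(v) for v in values])
# [50.0, -60.0, 70.0, -80.0, 100.05]
```

Note that only a trailing sign is removed, so a value such as 100.05 without a sign keeps its last digit.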

Calling preprocessing.scale on a heterogeneous array

I get the TypeError shown below. I have checked my df and it contains only numbers. Could this be caused by the conversion to a numpy array? After the conversion the array has items like
[Timestamp('1993-02-11 00:00:00') 28.1216 28.3374 ...]
Any suggestion how to solve this, please?
df:
Date Open High Low Close Volume
9 1993-02-11 28.1216 28.3374 28.1216 28.2197 19500
10 1993-02-12 28.1804 28.1804 28.0038 28.0038 42500
11 1993-02-16 27.9253 27.9253 27.2581 27.2974 374800
12 1993-02-17 27.2974 27.3366 27.1796 27.2777 210900
X = np.array(df.drop(['High'], 1))
X = preprocessing.scale(X)
TypeError: float() argument must be a string or a number
While you're saying that your dataframe "all contains numbers only", you also note that the first column consists of datetime objects. The error is telling you that preprocessing.scale only wants to work with float values.
The real question, however, is what you expect to happen to begin with. preprocessing.scale centers values on the mean and normalizes the variance. This is such that measured quantities are all represented on roughly the same footing. Now, your first column tells you what dates your data correspond to, while the rest of the columns are numeric data themselves. Why would you want to normalize the dates? How would you normalize the dates?
Semantically speaking, I believe you should leave your dates alone. Whatever post-processing you're planning to perform on your numerical data, the normalized data should still be parameterized by the original dates. If you want to process your dates too, you need to come up with an explicit way to convert them to something numeric (say, elapsed time from a given date in given units).
So I believe you should drop your dates from your processing round altogether, and start with
X = df.drop(columns=['Date', 'High']).to_numpy()
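To make "centers values on the mean and normalizes the variance" concrete, here is a rough standard-library sketch of that operation on one column, plus one possible way to turn dates into something numeric. The helper names standardize and days_since are made up, and I am assuming the population standard deviation, which as far as I know is also what preprocessing.scale uses by default.

```python
from datetime import date
from statistics import fmean, pstdev


def standardize(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = fmean(values)
    spread = pstdev(values)
    return [(v - mean) / spread for v in values]


def days_since(dates, origin):
    """Express each date as elapsed whole days from `origin`."""
    return [(d - origin).days for d in dates]


# The Close column and the dates from the sample rows in the question
close = [28.2197, 28.0038, 27.2974, 27.2777]
dates = [date(1993, 2, 11), date(1993, 2, 12), date(1993, 2, 16), date(1993, 2, 17)]

scaled_close = standardize(close)
elapsed = days_since(dates, dates[0])
print(elapsed)  # [0, 1, 5, 6]
```

Whether scaling the elapsed days makes sense is still a modelling decision; the point is only that the conversion has to be chosen explicitly.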

Graph to show departure and arrival times between stations

I have the start and end times of trips made by a bus, with the times in an Excel sheet. I want to make the graph as below :
I tried with Matlab nodes and graphs but did not get the exact figure. Below is the Matlab code I tried as an example:
A = [1 4]
B = [2 3]
weights = [5 5];
G = digraph(A,B,weights,4)
plot(G)
And the figure it generates:
I have got many more than 4 points in the Excel sheet, and I want them to all be displayed as in the first image.
Overview
You don't need any sort of complicated graph package for this, just use normal line plots! Here are methods in Excel and Matlab.
Excel
Give each bus stop a number, and list the bus stop number by the time it arrives/leaves there. I'll use stops number 0 and 1 for this example.
0 04:41
1 05:35
1 05:40
0 06:34
0 06:51
1 07:45
1 15:21
0 16:15
Then simply highlight the data and insert a "scatter with straight lines"
The rest is formatting. You can format the y-axis and tick "values in reverse order" to get the time increasing as in your desired plot. You can change the x-axis tick marks to just show integer stop numbers, get rid of the legend etc.
Final output:
Matlab
Here is the Matlab documentation for converting Excel formatted dates into Matlab datetime arrays: Convert Excel Date Number to Datetime.
Once you have the datetime objects, you can do this easily with the standard plot function.
% Set times up as a datetime array, could do this any number of ways
times = datetime(strcat({'1/1/2000 '}, {'04:41', '05:35', '05:40', '06:34', '06:51', '07:45', '15:21', '16:15'}, ':00'), 'format', 'dd/MM/yyyy HH:mm:ss');
% Set up the location of the bus at each of the above times
station = [0,1,1,0,0,1,1,0];
% Plot
plot(station, times) % Create plot
set(gca, 'xtick', [0,1]) % Limit to just ticks at the 2 stops
set(gca, 'ydir', 'reverse') % Reverse y axis to have earlier at top
set(gca,'XTickLabel',{'R', 'L'}) % Name the stops
Output:

What is the keyword to get time in milliseconds in robot framework?

Currently I am getting the time with the keyword Get time epoch, which returns the time in seconds. But I need the time in milliseconds, so that I can get the time span for a particular event.
Or is there any other way to get the time span for a particular event or a test scenario?
Check the new test library DateTime, which contains the keyword Get Current Date, which also returns milliseconds. It also has the keyword Subtract Date From Date to calculate the difference between two timestamps.
One of the more powerful features of robot is that you can directly call python code from a test script using the Evaluate keyword. For example, you can call the time.time() function, and do a little math:
*** Test cases
| Example getting the time in milliseconds
| | ${ms}= | Evaluate | int(round(time.time() * 1000)) | time
| | log | time in ms: ${ms}
Note that even though time.time returns a floating point value, not all systems will return a value more precise than one second.
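If what you need is a time span rather than a wall-clock timestamp, Python's time.perf_counter() is intended for measuring short durations. Here is a minimal sketch of timing one call in milliseconds; elapsed_ms is a made-up helper name, and wiring it into a Robot library is left out.

```python
import time


def elapsed_ms(func, *args, **kwargs):
    """Call func and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = (time.perf_counter() - start) * 1000.0
    return result, elapsed


result, ms = elapsed_ms(sum, range(1000))
print(result, ms)
```

The measured value depends on the machine, so treat it as an indication rather than an exact figure.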
Using the DateTime library, as suggested by janne:
*** Settings ***
Library    DateTime
*** Test Cases ***
Performance Test
    ${timeAvgMs} =    Test wall clock time    100    MyKeywordToPerformanceTest    and optional arguments
    Should be true    ${timeAvgMs} < 50
*** Keywords ***
MyKeywordToPerformanceTest
    # Do something here
Test wall clock time
    [Arguments]    ${iterations}    @{commandAndArgs}
    ${timeBefore} =    Get Current Date
    :FOR    ${it}    IN RANGE    ${iterations}
    \    @{commandAndArgs}
    ${timeAfter} =    Get Current Date
    ${timeTotalMs} =    Subtract Date From Date    ${timeAfter}    ${timeBefore}    result_format=number
    ${timeAvgMs} =    Evaluate    int(${timeTotalMs} / ${iterations} * 1000)
    Return from keyword    ${timeAvgMs}
In the report, for each suite, test and keyword, you have the information about start, end and length with millisecond details. Something like:
Start / End / Elapsed: 20140602 10:57:15.948 / 20140602 10:57:16.985 / 00:00:01.037
I don't see a way to do it using Builtin, look:
def get_time(format='timestamp', time_=None):
    """Return the given or current time in requested format.
    If time is not given, current time is used. How time is returned is
    determined based on the given 'format' string as follows. Note that all
    checks are case insensitive.
    - If 'format' contains word 'epoch' the time is returned in seconds after
      the unix epoch.
    - If 'format' contains any of the words 'year', 'month', 'day', 'hour',
      'min' or 'sec' only selected parts are returned. The order of the returned
      parts is always the one in previous sentence and order of words in
      'format' is not significant. Parts are returned as zero padded strings
      (e.g. May -> '05').
    - Otherwise (and by default) the time is returned as a timestamp string in
      format '2006-02-24 15:08:31'
    """
    time_ = int(time_ or time.time())
    format = format.lower()
    # 1) Return time in seconds since epoch
    if 'epoch' in format:
        return time_
    timetuple = time.localtime(time_)
    parts = []
    for i, match in enumerate('year month day hour min sec'.split()):
        if match in format:
            parts.append('%.2d' % timetuple[i])
    # 2) Return time as timestamp
    if not parts:
        return format_time(timetuple, daysep='-')
    # Return requested parts of the time
    elif len(parts) == 1:
        return parts[0]
    else:
        return parts
You have to write your own module, you need something like:
import time
def get_time_in_millies():
    # Return the current time in milliseconds since the epoch as an int
    return int(round(time.time() * 1000))
Then import this library in RIDE for the suite and you can use the function name as a keyword; in my case it would be Get Time In Millies. More info here.