I am running JMeter scripts from the command line. While the test runs, I get this summary after every request. I understood from the documentation that I need to either comment out summariser.name=summary or set it to none. I don't want to see this summary. Please let me know how to disable it.
00:44:10.785 summary + 6 in 00:00:32 = 0.2/s Avg: 241 Min: 2 Max: 1239 Err: 1 (16.67%) Active: 1 Started: 1 Finished: 0
00:44:10.785 summary = 498 in 00:39:27 = 0.2/s Avg: 126 Min: 0 Max: 2851 Err: 32 (6.43%)
00:44:42.892 summary + 7 in 00:00:31 = 0.2/s Avg: 88 Min: 0 Max: 418 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
00:44:42.892 summary = 505 in 00:39:57 = 0.2/s Avg: 126 Min: 0 Max: 2851 Err: 32 (6.34%)
00:45:14.999 summary + 6 in 00:00:31 = 0.2/s Avg: 73 Min: 2 Max: 216 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
00:45:14.999 summary = 511 in 00:40:28 = 0.2/s Avg: 125 Min: 0 Max: 2851 Err: 32 (6.26%)
00:45:41.565 summary + 6 in 00:00:31 = 0.2/s Avg: 68 Min: 2 Max: 205 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
00:45:41.565 summary = 517 in 00:40:58 = 0.2/s Avg: 125 Min: 0 Max: 2851 Err: 32 (6.19%)
00:46:13.681 summary + 6 in 00:00:31 = 0.2/s Avg: 103 Min: 2 Max: 384 Err: 0 (0.00%) Active: 1 Started: 1 Finished: 0
00:46:13.681 summary = 523 in 00:41:29 = 0.2/s Avg: 124 Min: 0 Max: 2851 Err: 32 (6.12%)
If you don't want to see the summariser output in the console, you can change your command to:
jmeter -Jsummariser.out=false -n -t test.jmx -l result.jtl
To make the change permanent, add this line to the user.properties file: summariser.out=false
If you want to turn off the summariser completely:
Open jmeter.properties file with your favourite text editor
Locate this line
summariser.name=summary
and either comment it out by putting a # character in front of it:
#summariser.name=summary
or simply delete it.
That's it. You won't see the summariser output on the next execution.
More information:
Summariser - Generate Summary Results - configuration
Configuring JMeter
Apache JMeter Properties Customization Guide
I have this matrix df.head():
0 1 2 3 4 5 6 7 8 9 ... 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857
0 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 0.00000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 0.00000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 30.88689 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.00000 0.0 0.0 0.0 0.00000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0 42.43819 0.0 0.0 0.0 0.00000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 rows × 1858 columns
And I need to apply a transformation to it every time a value other than 0.0 is found, dividing the value by 0.32
So far I have the mask, like so:
normalize = 0.32
mask = (df>=0.0)
df = df.where(mask)
How do I apply such a transformation on a very large dataframe, after masking it?
You don't need a mask. Just divide your dataframe by 0.32: the zeros stay zero, because 0 / 0.32 is still 0.
df / 0.32
>>> df
A B
0 0 3
1 5 0
>>> df / 0.32
A B
0 0.000 9.375
1 15.625 0.000
If you do need to use a mask, try:
mask = (df.eq(0))
df.where(mask, df/0.32)
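As a quick check of the two approaches above, here is a minimal sketch on a small made-up frame (the column names A and B and the values are illustrative, not the asker's data). It suggests that plain division and the masked version give the same result:

```python
import pandas as pd

# Small illustrative frame; the real data has 1858 columns
df = pd.DataFrame({"A": [0.0, 5.0], "B": [3.0, 0.0]})
normalize = 0.32

# Plain division: zeros stay zero
divided = df / normalize

# Masked version: keep cells where the mask is True (the zeros),
# replace all other cells with the divided value
mask = df.eq(0)
masked = df.where(mask, df / normalize)

print(divided)
print(divided.equals(masked))
```

Neither call modifies df in place, so assign the result back (df = df / normalize) if you want to keep it.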
I have three lists as follows.
mylist = [["sensor9", [[0.5, 0.3, 0.8, 0.9, 0.8], [0.5, 0.6, 0.8, 0.9, 0.9]]],
["sensor12", [[10.6, 0.5, 0.9, 1.0, 0.9], [10.6, 0.9, 0.8, 0.8, 0.8]]]]
columns = ['score_1', 'score_2']
years = [2001, 2002, 2003, 2004, 2005]
I want to change the orientation of mylist as follows using columns as the headings and years for each element in mylist. More specifically, my final output should look as follows.
id, sensor, time, score_1, score_2
0, sensor9, 2001, 0.5, 0.5
0, sensor9, 2002, 0.3, 0.6
0, sensor9, 2003, 0.8, 0.8
0, sensor9, 2004, 0.9, 0.9
0, sensor9, 2005, 0.8, 0.9
1, sensor12, 2001, 0.6, 0.6
1, sensor12, 2002, 0.5, 0.9
1, sensor12, 2003, 0.9, 0.8
1, sensor12, 2004, 1.0, 0.8
1, sensor12, 2005, 0.9, 0.8
Dataframe that describes the id of the above dataframe
id, sensor
0, sensor9
1, sensor12
I was trying to do this with DataFrame.from_dict in pandas. However, I am not sure how to change the orientation of the mylist and align it with the years in pandas. Is it possible to do this?
I am happy to provide more details if needed.
Use a list comprehension to build a DataFrame from the second element (the nested lists) of each item, transposing it with DataFrame.T. Then join the pieces with concat, and finally create the new id column with Series.map and put it in the first position with DataFrame.insert:
import pandas as pd

df1 = pd.DataFrame({'id':[0,1],
                    'sensor':['sensor9','sensor12']})
mylist = [["sensor9", [[0.5, 0.3, 0.8, 0.9, 0.8], [0.5, 0.6, 0.8, 0.9, 0.9]]],
["sensor12", [[10.6, 0.5, 0.9, 1.0, 0.9], [10.6, 0.9, 0.8, 0.8, 0.8]]]]
columns = ['score_1', 'score_2']
years = [2001, 2002, 2003, 2004, 2005]
L = [pd.DataFrame(x[1], index=columns, columns=years).T for x in mylist]
df = pd.concat(L, keys=[x[0] for x in mylist]).rename_axis(('sensor','time')).reset_index()
df.insert(0, 'id', df['sensor'].map(df1.set_index('sensor')['id']))
print (df)
id sensor time score_1 score_2
0 0 sensor9 2001 0.5 0.5
1 0 sensor9 2002 0.3 0.6
2 0 sensor9 2003 0.8 0.8
3 0 sensor9 2004 0.9 0.9
4 0 sensor9 2005 0.8 0.9
5 1 sensor12 2001 10.6 10.6
6 1 sensor12 2002 0.5 0.9
7 1 sensor12 2003 0.9 0.8
8 1 sensor12 2004 1.0 0.8
9 1 sensor12 2005 0.9 0.8
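EDIT: If the DataFrame with the ids is not available, the id column can be created with pd.factorize instead: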
mylist = [["sensor9", [[0.5, 0.3, 0.8, 0.9, 0.8], [0.5, 0.6, 0.8, 0.9, 0.9]]],
["sensor12", [[10.6, 0.5, 0.9, 1.0, 0.9], [10.6, 0.9, 0.8, 0.8, 0.8]]]]
columns = ['score_1', 'score_2']
years = [2001, 2002, 2003, 2004, 2005]
L = [pd.DataFrame(x[1], index=columns, columns=years).T for x in mylist]
df = pd.concat(L, keys=[x[0] for x in mylist]).rename_axis(('sensor','time')).reset_index()
df.insert(0, 'id', pd.factorize(df['sensor'])[0])
print (df)
id sensor time score_1 score_2
0 0 sensor9 2001 0.5 0.5
1 0 sensor9 2002 0.3 0.6
2 0 sensor9 2003 0.8 0.8
3 0 sensor9 2004 0.9 0.9
4 0 sensor9 2005 0.8 0.9
5 1 sensor12 2001 10.6 10.6
6 1 sensor12 2002 0.5 0.9
7 1 sensor12 2003 0.9 0.8
8 1 sensor12 2004 1.0 0.8
9 1 sensor12 2005 0.9 0.8
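For reference, here is a small sketch of what pd.factorize does on its own (the sample values are made up for illustration). It returns integer codes in order of first appearance together with the unique values, which is why it can stand in for a lookup table of ids:

```python
import pandas as pd

# Illustrative values; each distinct label gets the next integer code
sensors = pd.Series(["sensor9", "sensor9", "sensor12", "sensor9"])

codes, uniques = pd.factorize(sensors)

print(codes.tolist())    # [0, 0, 1, 0]
print(list(uniques))     # ['sensor9', 'sensor12']
```

The codes depend only on the order in which labels first appear, so they match an existing id table only if that table was numbered the same way.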
Suppose I have sparse data in a dataframe. How can I create a sparse matrix from it, and in which models can I use it for predictions?
Consider the dataframe df
df = pd.DataFrame(np.zeros((10, 10)))
df.iloc[5, 5] = 1
df
0 1 2 3 4 5 6 7 8 9
0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Memory usage: 880
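You can make it sparse with to_sparse(0).
The first argument is the fill value, i.e. the value that is assumed everywhere and therefore not stored. Note that DataFrame.to_sparse was deprecated in pandas 0.25 and removed in pandas 1.0; on newer versions, df.astype(pd.SparseDtype("float", 0)) does the same job.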
d1 = df.to_sparse(0)
d1
0 1 2 3 4 5 6 7 8 9
0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0
6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Memory usage: 88
The memory footprint is a tenth of the size.
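On pandas 1.0 and later, where to_sparse is no longer available, a roughly equivalent conversion can be sketched with a sparse dtype. This is a minimal sketch assuming a float frame with 0 as the fill value; the variable name sdf is illustrative, and the exact byte counts depend on the pandas version, so none are shown:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((10, 10)))
df.iloc[5, 5] = 1

# Store only values that differ from the fill value 0.0
sdf = df.astype(pd.SparseDtype("float", 0.0))

# Fraction of cells that are actually stored (1 of 100 here)
print(sdf.sparse.density)

# Compare data memory of the dense and sparse frames, ignoring the index
print(df.memory_usage(index=False).sum(), sdf.memory_usage(index=False).sum())
```

The sparse frame still prints like a normal DataFrame; only the internal storage changes.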
This answer will keep the data as sparse as possible and avoids memory issues. The csr_matrix is a standard sparse matrix format that can be used with scipy and sklearn for modeling.
import pandas as pd
from scipy import sparse

df = pd.DataFrame({'rowid': [1, 2, 3, 4, 5], 'val1': [1, 1, 0, 0, 0], 'val2': [1, 0, 0, 1, 0]})
print('Input data frame\n{0}'.format(df))
print('DataFrame to a sparse matrix')
# as_matrix() was removed in pandas 1.0; to_numpy() returns the same NumPy array
df_as_sparse_matrix = sparse.csr_matrix(df.to_numpy())
print(df_as_sparse_matrix.todense())
I have a data file which contains two columns. The second one varies periodically, and its max and min are different in each period:
a 3
b 4
c 5
d 4
e 3
f 2
g 1
h 2
i 3
j 4
k 5
l 6
m 5
n 4
o 3
p 2
q 1
r 0
s 1
t 2
u 3
We can see that in the 1st period (from a to i): max = 5, min = 1. In the 2nd period (from i to u): max = 6, min = 0.
Using awk, I can only print the max and min of the whole second column, but I cannot print the min and max of each period. I would like to obtain results like this:
period min max
1 1 5
2 0 6
Here is what I did:
{
nb_lignes = 21
period = 9
nb_periodes = int(nb_lignes/period)
}
{
for (j = 0; j <= nb_periodes; j++)
{ if (NR == (1 + period*j)) {{max=$2 ; min=$2}}
for (i = (period*j); i <= (period*(j+1)); i++)
{
if (NR == i)
{
if ($2 >= max) {max = $2}
if ($2 <= min) {min = $2}
{print "Min: "min,"Max: "max,"Ligne: " NR}
}
}
}
}
#END { print "Min: "min,"Max: "max }
However, the result is far from what I am looking for:
Min: 3 Max: 3 Ligne: 1
Min: 3 Max: 4 Ligne: 2
Min: 3 Max: 5 Ligne: 3
Min: 3 Max: 5 Ligne: 4
Min: 3 Max: 5 Ligne: 5
Min: 2 Max: 5 Ligne: 6
Min: 1 Max: 5 Ligne: 7
Min: 1 Max: 5 Ligne: 8
Min: 1 Max: 5 Ligne: 9
Min: 1 Max: 5 Ligne: 9
Min: 4 Max: 4 Ligne: 10
Min: 4 Max: 5 Ligne: 11
Min: 4 Max: 6 Ligne: 12
Min: 4 Max: 6 Ligne: 13
Min: 4 Max: 6 Ligne: 14
Min: 3 Max: 6 Ligne: 15
Min: 2 Max: 6 Ligne: 16
Min: 1 Max: 6 Ligne: 17
Min: 0 Max: 6 Ligne: 18
Min: 0 Max: 6 Ligne: 18
Min: 1 Max: 1 Ligne: 19
Min: 1 Max: 2 Ligne: 20
Min: 1 Max: 3 Ligne: 21
Thank you in advance for your help.
Try something like:
$ awk '
BEGIN{print "period", "min", "max"}
!f{min=$2; max=$2; ++f; next}
{max = ($2>max)?$2:max; min = ($2<min)?$2:min; f++}
f==9{print ++a, min, max; f=0}' file
period min max
1 1 5
2 0 6
When the counter f is zero (not set), assign the second column to both the max and min variables, start incrementing the counter and skip to the next line.
For every other line, check the second column. If it is bigger than max, assign it to max. Likewise, if it is smaller than min, assign it to min. Keep incrementing the counter.
Once the counter reaches 9, print the period number, min and max. Reset the counter to 0 so that the next line starts a fresh period. A trailing group of fewer than 9 lines (s to u in the sample) is never printed.
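The same counter-based logic can be sketched outside awk as well. The following is a minimal Python version under the same assumption of a fixed period of 9 lines; the function name period_min_max and the hard-coded sample values are illustrative only. Like the awk script, it ignores a trailing incomplete period:

```python
def period_min_max(values, period=9):
    """Return (period_number, min, max) for each complete block of `period` values."""
    results = []
    count = 0  # plays the role of the awk counter f
    for value in values:
        if count == 0:
            # First value of a period initialises both extremes
            lo = hi = value
        else:
            lo = min(lo, value)
            hi = max(hi, value)
        count += 1
        if count == period:
            results.append((len(results) + 1, lo, hi))
            count = 0
    return results


# Second column of the sample file, rows a to u
column = [3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1, 0, 1, 2, 3]
for row in period_min_max(column):
    print(*row)
```

For the sample column this prints the two complete periods, matching the awk output.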
I've started, so I'll finish. I chose to create an array which contains the minimum and maximum for each period:
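awk -v period=9 '
BEGIN { print "period", "min", "max" }
NR % period == 1 { ++i }
!(i in min) || $2 < min[i] { min[i] = $2 }
!(i in max) || $2 > max[i] { max[i] = $2 }
END { for (j = 1; j <= i; j++) print j, min[j], max[j] }' input
The index i increases every period lines (in this case 9). If no value has been set yet for the current period, or a new minimum/maximum has been found, update the array.
edit: the checks use (i in min) and (i in max) rather than !min[i], because !min[i] is also true when the stored minimum is 0, and an unset max[i] compares as 0, which would miss values that are not positive. The END loop counts from 1 to i because for (i in min) visits the keys in an unspecified order. A trailing partial period (s to u in the sample) is printed as well.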
awk 'BEGIN{print "Period","min","max"}
NR%9==1{mi=ma=$2}
{$2<mi?mi=$2:0;$2>ma?ma=$2:0}
NR%9==0{print ++i,mi,ma}' your_file