SQL Server Reporting services chart report - sql

I currently have a dataset with the columns API50, CounterValue and DBName. Example values are shown below.
API50 CounterValue DBName
34.5 1 Test1
44.5 25 Test1
34.5 42 Test1
54.5 67 Test1
34.5 76 Test1
94.5 88 Test1
14.5 99 Test1
I have created a chart report and selected CounterValue as my X axis and API50 as my Y axis.
The report is plotted correctly, but my X axis only goes up to 99.
Is there any way I can have my X axis in 50-point increments up to 600 (e.g., 0, 50, 100, 150... and so on until 600) and plot the counter value against it?
Any help is greatly appreciated.

Creating a basic chart with your sample data gets the same result you have described, i.e. the X axis just goes up to the maximum CounterValue value or so.
You need to update the X axis properties. Here I've updated:
Maximum
Interval
Interval Type
For the axis you describe, that means a Maximum of 600 and an Interval of 50.
I've also checked the Scalar axis option. This is the most important step; otherwise the above values won't work properly.
Now you can see the change in the designer, and the end result has your axis requirements.

Changing column name and its values at the same time

Pandas help!
I have a specific column like this,
Mpg
0 18
1 17
2 19
3 21
4 16
5 15
Mpg is miles per gallon.
Now I need to rename that 'Mpg' column to 'litre per 100 km' and convert its values to litres per 100 km at the same time. Any help? Thanks beforehand.
-Tom
I changed the name of the column, but I could not do both at the same time.
Use pop to return and delete the column at the same time, and rdiv to perform the conversion (L/100 km = 235.15 / mpg; a more precise constant for US gallons is about 235.215):
df['litre per 100 km'] = df.pop('Mpg').rdiv(235.15)
If you want to insert the column in the same position:
df.insert(df.columns.get_loc('Mpg'), 'litre per 100 km',
          df.pop('Mpg').rdiv(235.15))
Output:
litre per 100 km
0 13.063889
1 13.832353
2 12.376316
3 11.197619
4 14.696875
5 15.676667
An alternative to pop would be to store the result in another dataframe. This way you can perform the two steps at the same time. In my code below, I first reproduce your dataframe, then store the constant for conversion and perform it on all entries using the apply method.
import pandas as pd
df = pd.DataFrame({'Mpg':[18,17,19,21,16,15]})
cc = 235.214583 # constant for conversion from mpg to L/100km
df2 = pd.DataFrame()
df2['litre per 100 km'] = df['Mpg'].apply(lambda x: cc/x)
print(df2)
The output of this code is:
litre per 100 km
0 13.067477
1 13.836152
2 12.379715
3 11.200694
4 14.700911
5 15.680972
as expected.
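The two answers use slightly different constants (235.15 and 235.214583). As a rough check, the constant can be derived from the unit definitions. The sketch below assumes US gallons and uses hypothetical names (LITRES_PER_US_GALLON, mpg_to_l_per_100km) that do not appear in either answer:

```python
# Unit definitions (exact by convention); US gallon assumed, not imperial
LITRES_PER_US_GALLON = 3.785411784
KM_PER_MILE = 1.609344

# 1 mile per gallon = KM_PER_MILE km per LITRES_PER_US_GALLON litres,
# so litres per 100 km = 100 * LITRES_PER_US_GALLON / (KM_PER_MILE * mpg)
MPG_TO_L_PER_100KM = 100 * LITRES_PER_US_GALLON / KM_PER_MILE


def mpg_to_l_per_100km(mpg):
    """Convert miles per US gallon to litres per 100 km."""
    return MPG_TO_L_PER_100KM / mpg


mpg_values = [18, 17, 19, 21, 16, 15]
converted = [round(mpg_to_l_per_100km(v), 6) for v in mpg_values]
print(round(MPG_TO_L_PER_100KM, 6))
print(converted)
```

The derived constant agrees with the 235.214583 used in the second answer, so that answer's output is the more accurate of the two.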

How do you iterate through a data frame based on the value in a row

I have a data frame which I am trying to iterate through, not based on time but on an increase in a column's value, for example an increase of 10.
Column A Column B
12:05 1
13:05 6
14:05 11
15:05 16
So in this case it would return a new data frame containing the rows with 1 and 11. How can I do this? The methods I have tried, such as asfreq and resample, don't seem to work; they report an invalid frequency, which I think is because my criterion is not time based. Which function lets me select rows based on a numerical value such as 10 or 7 rather than on time? I don't want every nth row, but a row every time the column value has changed by 10 from the last selected value. For example, 1 to 11; then, if the next values were 12, 15, 17, 21, the next selected value would be 21.
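Here is one way to do it, which works when the wanted values sit at an exact multiple of 10 above the first value (as 1 and 11 do in your sample):
# do a remainder division, and choose rows where remainder is zero
# offset by the first value, to make calculation simpler
first_val = df.loc[0, 'Column B']
df.loc[((df['Column B'] - first_val) % 10).eq(0)]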
Column A Column B
0 12:05 1
2 14:05 11
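The remainder approach above would miss a value such as 22 that passes the threshold without landing on an exact multiple. A minimal sketch of the "at least 10 above the last selected value" rule from the question, in plain Python, could look like this. The helper name select_by_step is hypothetical, and it assumes only increases count:

```python
def select_by_step(values, step):
    """Return positions of values at least `step` above the last selected value.

    The first value is always selected and becomes the initial reference.
    """
    selected = []
    last = None
    for pos, value in enumerate(values):
        if last is None or value - last >= step:
            selected.append(pos)
            last = value  # this value is the new reference point
    return selected


# Values taken from the examples in the question
column_b = [1, 6, 11, 12, 15, 17, 21]
positions = select_by_step(column_b, 10)
print(positions)                         # [0, 2, 6]
print([column_b[p] for p in positions])  # [1, 11, 21]
```

With a data frame, the positions could be computed from df['Column B'].tolist() and passed to df.iloc[...] to get the matching rows.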

How to plot a chart so it adds to the value of previous value instead of plotting it over a zero line

In this code I have plotted pct_day. Since the value does not increase the way a stock price would, is it possible to plot this data so that each value is added to the previous ones and the running total is plotted? That way the line graph would increase over time, as opposed to the image below, where the chart is plotted around a zero line.
High Low Open Close Volume Adj Close year pct_day
month day
1 2 794.913004 779.509998 788.783002 789.163007 6.372860e+08 789.163007 1997.400000 0.002211
3 833.470005 818.124662 823.937345 828.889339 9.985193e+08 828.889339 1997.866667 0.004160
4 863.153573 849.154299 858.737861 853.571429 1.042729e+09 853.571429 1997.714286 -0.003345
5 900.455715 888.571429 895.716426 894.472137 1.022023e+09 894.472137 1998.357143 -0.001216
6 847.453076 837.161537 840.123847 844.383843 8.889831e+08 844.383843 1998.076923 0.003679
... ... ... ... ... ... ... ... ... ...
12 27 909.735997 900.942000 905.528664 904.734009 7.485793e+08 904.734009 1998.133333 -0.000308
28 946.635010 940.440016 942.995721 944.127147 7.552150e+08 944.127147 1998.071429 0.001251
29 950.723837 941.625390 944.760775 947.200773 6.830400e+08 947.200773 1998.076923 0.002899
30 891.501671 883.954989 887.031665 887.819181 6.010675e+08 887.819181 1997.833333 0.001844
31 918.943857 910.320763 916.251549 913.786154 6.879523e+08 913.786154 1997.923077 -0.002772
363 rows × 8 columns
This is how it is shown in the Jupyter notebook.
You need the cumulative sum of the column pct_day. First, create a new column where you compute that value by means of numpy cumsum:
import numpy as np
pct_day_list = df['pct_day'].tolist()
pct_day_cumsum = list(np.cumsum(pct_day_list))
df['pct_day_cumsum'] = pct_day_cumsum
After that you can plot it with df.plot(y='pct_day_cumsum')
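To show what the cumulative sum does to the series, here is a small standard-library sketch using made-up percentage changes (the numbers are illustrative, not taken from the data frame above). It also shows a compounded variant, which is an assumption about what "like a stock value" might mean rather than something the answer states:

```python
from itertools import accumulate

# Hypothetical daily changes expressed as fractions (0.002 means 0.2 %)
pct_day = [0.002, 0.004, -0.003, -0.001, 0.004]

# Additive running total: each point is the previous total plus today's change
running_sum = list(accumulate(pct_day))

# Compounded growth: multiply (1 + change) day by day, then subtract 1
growth = list(accumulate((1 + r for r in pct_day), lambda a, b: a * b))
compounded = [g - 1 for g in growth]

for day, (s, c) in enumerate(zip(running_sum, compounded), start=1):
    print(f"day {day}: sum={s:+.6f} compounded={c:+.6f}")
```

In pandas the same two series would be df['pct_day'].cumsum() and (1 + df['pct_day']).cumprod() - 1. For small daily changes the two curves stay close together.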

Gnuplot: How to load and display single numeric value from data file

My data file has this content
# data file for use with gnuplot
# Report 001
# Data as of Tuesday 03-Sep-2013
total 1976
case1 522 278 146 65 26 7
case2 120 105 15 0 0 0
case3 660 288 202 106 63 1
I am making a histogram from the case... lines using the script below - and that works. My question is: how can I load the grand total value 1976 (next to the word 'total') from the data file and either (a) store it into a variable or (b) use it directly in the title of the plot?
This is my gnuplot script:
reset
set term png truecolor
set terminal pngcairo size 1024,768 enhanced font 'Segoe UI,10'
set output "output.png"
set style fill solid 1.00
set style histogram rowstacked
set style data histograms
set xlabel "Case"
set ylabel "Frequency"
set boxwidth 0.8
plot for [i=3:7] 'mydata.dat' every ::1 using i:xticlabels(1) with histogram \
notitle, '' every ::1 using 0:2:2 \
with labels \
title "My Title"
For the benefit of others trying to label histograms, in my data file, the column after the case label represents the total of the rest of the values on that row. Those total numbers are displayed at the top of each histogram bar. For example for case1, 522 is the total of (278 + 146 + 65 + 26 + 7).
I want to display the grand total somewhere on my chart, say as the second line of the title or in a label. I can get a variable into sprintf into the title, but I have not figured out syntax to load a "cell" value ("cell" meaning row column intersection) into a variable.
Alternatively, if someone can tell me how to use the sum function to total up 522+120+660 (read from the data file, not as constants!) and store that total in a variable, that would obviate the need to have the grand total in the data file, and that would also make me very happy.
Many thanks.
Let's start with extracting a single cell at (row,col). If it is a single value, you can use the stats command to extract it. The row and col are specified with every and using, as in a plot command. In your case, to extract the total value, use:
# extract the 'total' cell
stats 'mydata.dat' every ::::0 using 2 nooutput
total = int(STATS_min)
To sum up all values in the second column, use:
stats 'mydata.dat' every ::1 using 2 nooutput
total2 = int(STATS_sum)
And finally, to sum up all values in columns 3:7 in all rows (i.e. the same as the previous command, but without using the saved totals), use:
# sum all values from columns 3:7 from all rows
stats 'mydata.dat' every ::1 using (sum[i=3:7] column(i)) nooutput
total3 = int(STATS_sum)
These commands require gnuplot 4.6 to work.
So, your plotting script could look like the following:
reset
set terminal pngcairo size 1024,768 enhanced
set output "output.png"
set style fill solid 1.00
set style histogram rowstacked
set style data histograms
set xlabel "Case"
set ylabel "Frequency"
set boxwidth 0.8
# extract the 'total' cell
stats 'mydata.dat' every ::::0 using 2 nooutput
total = int(STATS_min)
plot for [i=3:7] 'mydata.dat' every ::1 using i:xtic(1) notitle, \
'' every ::1 using 0:(s = sum [i=3:7] column(i), s):(sprintf('%d', s)) \
with labels offset 0,1 title sprintf('total %d', total)
which gives the following output:
For Linux and similar systems:
If you don't know the row number where your data is located, but you know it is in the n-th column of a row where the value of the m-th column is x, you can define a function
get_data(m,x,n,filename)=system('awk "\$'.m.'==\"'.x.'\"{print \$'.n.'}" '.filename)
and then use it, for example, as
y = get_data(1,"case2",4,"datafile.txt")
Using the data provided by user424855,
print y
should return 15.
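Where awk is not available, the same lookup could be done in a small pre-processing script. The sketch below mirrors get_data in Python as an illustration only: the function name get_cell is hypothetical, it returns only the first match (the awk version prints every matching row), and the data is embedded as a string instead of being read from a file:

```python
def get_cell(lines, m, x, n):
    """Return column n (1-based) of the first row whose column m equals x.

    Returns None if no row matches. Blank lines and '#' comments are skipped.
    """
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        if len(fields) >= max(m, n) and fields[m - 1] == x:
            return fields[n - 1]
    return None


data = """\
# data file for use with gnuplot
total 1976
case1 522 278 146 65 26 7
case2 120 105 15 0 0 0
case3 660 288 202 106 63 1
""".splitlines()

y = get_cell(data, 1, "case2", 4)
grand_total = get_cell(data, 1, "total", 2)
# Sum of the per-row totals stored in column 2 of the case lines
row_totals = sum(int(line.split()[1]) for line in data if line.startswith("case"))
print(y, grand_total, row_totals)
```

The values come back as strings, as they do from the awk call, so convert with int() before doing arithmetic.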
It's not clear to me where your "grand total" of 1976 comes from. If I calculate 522+120+660 I get 1302 not 1976.
Anyway, here is a solution which works even without stats and sum which were not available in gnuplot 4.4.0.
In the data you don't necessarily need the "grand total" or the sum of each row, because gnuplot can calculate this for you. This is done by (not) plotting the file as a matrix, and at the same time summing up the rows in the string variable S0 and the total sum in the variable Total. There will be a warning, "warning: matrix contains missing or undefined values", which you can ignore. The labels are added by plotting '+' ... with labels, extracting the desired values from the S0 string.
Data: SO18583180.dat
So, the reduced input data looks like this:
# data file for use with gnuplot
# Report 001
# Data as of Tuesday 03-Sep-2013
case1 278 146 65 26 7
case2 105 15 0 0 0
case3 288 202 106 63 1
Script: (works for gnuplot>=4.4.0, March 2010 and gnuplot 5.x)
### histogram with sums and total sum
reset
FILE = "SO18583180.dat"
set style histogram rowstacked
set style data histograms
set style fill solid 0.8
set xlabel "Case"
set ylabel "Frequency"
set boxwidth 0.8
set key top left noautotitle
set grid y
set xrange [0:2]
set offsets 0.5,0.5,0,0
Total = 0
S0 = ''
addSums(v) = S0.sprintf(" %g",(M=$2,(N=$1+1)==1?S1=0:0,S1=S1+v))
plot for [i=2:6] FILE u i:xtic(1) notitle, \
'' matrix u (S0=addSums($3),Total=Total+$3,NaN) w p, \
'+' u 0:(real(S2=word(S0,int($0*N+N)))):(S2) every ::::M w labels offset 0,0.7 title sprintf("Total: %g",Total)
### end of script
Result: (created with gnuplot 4.4.0, Windows terminal)

How to Resize using Lanczos

I can easily calculate the values for sinc(x) curve used in Lanczos, and I have read the previous explanations about Lanczos resize, but being new to this area I do not understand how to actually apply them.
"To resample with lanczos imagine you overlay the output and input over each other, with points signifying where the pixel locations are. For each output pixel location you take a box +- 3 output pixels from that point. For every input pixel that lies in that box, calculate the value of the lanczos function at that location with the distance from the output location in output pixel coordinates as the parameter. You then need to normalize the calculated values by scaling them so that they add up to 1. After that multiply each input pixel value with the corresponding scaling value and add the results together to get the value of the output pixel."
For example, what does "overlay the input and output" actually mean in programming terms?
In the equation given
lanczos(x) = {
0 if abs(x) > 3,
1 if x == 0,
else sin(x*pi)/x
}
what is x?
As a simple example, suppose I have an input image with 14 values (i.e. in addresses In0-In13):
20 25 30 35 40 45 50 45 40 35 30 25 20 15
and I want to scale this up by 2, i.e. to an image with 28 values (i.e. in addresses Out0-Out27).
Clearly, the value in address Out13 is going to be similar to the value in address In7, but which values do I actually multiply to calculate the correct value for Out13?
What is x in the algorithm?
If the values in your input data are at t coordinates [0 1 2 3 ...], then your output (which is scaled up by 2) has t coordinates at [0 .5 1 1.5 2 2.5 3 ...]. So x is the distance between the point where the filter is centred and the t coordinate of each input value. To get the first output value, you center your filter at 0 and multiply by all of the input values. Then to get the second output, you center your filter at 1/2 and multiply by all of the input values, and so on.
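The procedure can be sketched in one dimension as follows. This is a minimal illustration, not a reference implementation. It assumes the commonly used kernel sinc(x) * sinc(x / a) with a = 3 (the formula quoted in the question is a simplified form), measures x in input coordinates as in the answer above, and simply drops taps that fall outside the signal at the edges. The function names are hypothetical:

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, otherwise 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def resample_1d(values, scale=2, a=3):
    """Upsample a 1-D signal; output sample j sits at input coordinate j / scale."""
    out = []
    for j in range(len(values) * scale):
        t = j / scale  # centre of the filter, in input coordinates
        lo = max(0, math.floor(t) - a + 1)
        hi = min(len(values) - 1, math.floor(t) + a)
        taps = range(lo, hi + 1)
        weights = [lanczos_kernel(t - i, a) for i in taps]  # x = t - i
        total = sum(weights)  # normalise so the weights add up to 1
        out.append(sum(w * values[i] for w, i in zip(weights, taps)) / total)
    return out


signal = [20, 25, 30, 35, 40, 45, 50, 45, 40, 35, 30, 25, 20, 15]
upsampled = resample_1d(signal)
print(len(upsampled))
print(round(upsampled[13], 2))
```

With this coordinate convention Out13 sits at t = 6.5, halfway between In6 and In7, so it is computed from In4 through In9 with x values of 2.5, 1.5, 0.5, -0.5, -1.5 and -2.5. Output samples that land exactly on an input coordinate (Out0, Out2, ...) reproduce the input value.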