I have a set of data files and want to produce one box plot per file so that I can compare them. I can get the data into gnuplot, but I don't know the correct syntax to give each file its own plot.
I have tried reading all the required files into a variable, which does work. However, when the plot is produced, all the boxplots are drawn on top of each other. I need gnuplot to shift each plot along by one position for each new data file.
For example, this produces the output with overlaying plots:
FILES = system("ls -1 /path/to/files/*")
plot for [data in FILES] data using (1):($4) with boxplot notitle
I know the X position is being stated explicitly there with the (1), but I'm not sure what to replace it with to get the position to move for each plot. This isn't a problem with other chart types, since they don't have the same field locating them.
You can try the following.
You can access the file in your file list by index via word(FILES,i). Check help word and help words. The code below assumes that you have some datafiles Data0*.dat in your directory. Maybe there is a smarter/shorter way to implement the xtic labels.
Code:
### boxplots from a list of files
reset session
# get a list of files (Windows)
FILES = system('dir /B "C:\Data\Data0*.dat"')
# set tics as filenames
set xtics () # remove xtics
set yrange [-2:27]
do for [i=1:words(FILES)] {
set xtics add (word(FILES,i) i) rotate by 45 right
}
plot for [i=1:words(FILES)] word(FILES,i) u (i):2 w boxplot notitle
### end of code
Result:
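For the Unix-style file listing used in the question, the same idea might be sketched roughly as below. This is only a sketch under assumptions: the path is the placeholder from the question, the values are taken from column 4 as in the question, and no file name contains whitespace (word() splits the list on whitespace). The xrange line is my own addition to leave some margin around the first and last box.

```gnuplot
# Sketch for Unix-like systems; /path/to/files/* is the placeholder from the question
FILES = system("ls -1 /path/to/files/*")

set xtics ()                            # start with an empty tic list
do for [i=1:words(FILES)] {
    set xtics add (word(FILES,i) i)     # label x position i with the i-th file name
}
set xtics rotate by 45 right
set xrange [0.5:words(FILES)+0.5]       # assumption: half a unit of margin on each side

# (i) places the i-th boxplot at x = i instead of stacking everything at x = 1
plot for [i=1:words(FILES)] word(FILES,i) using (i):4 with boxplot notitle
```

The only real change compared to the question is that the fixed (1) becomes the loop index (i).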
This only works for me if I do not set a yrange.
Let's say I have a sample-time-overview.csv like
,avg,std,,,TProc,2267.5202096317,4573.0532262204
TParse,4.9922379603,138.6595434301,,,,,
THash,86.4020623229,548.8593468508,,,,,
TEnq,1.1181869688,2.0684998031,,,,,
TInQ,1482.2243626062,4257.8024051927,,,,,
TSend,2253.1871161473,4514.2823125251,,,,,
TWait,1.7578696884,43.1050730747,,,,,
TAnsw,14.3452407932,201.9216484892,,,,,
TProcAll,2269.2780793201,4573.3927526674,,,,,
TTotal,3853.3679320114,7095.0740689587,,,,,
where I am not interested in the first or last two lines.
I basically copy-pasted the code from the link above with minor adjustments:
#!/usr/bin/gnuplot
reset
filename = "sample-time-overview"
set terminal pngcairo size 500,500 enhanced font 'Verdana,10'
set output filename."_piechart.png"
#set title ""
unset border
unset tics
set xrange[-1:1.5]
# uncomment yrange and the plot disappears
#set yrange[-1.25:1.25]
centerX=0
centerY=0
radius=1
set datafile separator ','
set key off
set style fill solid 1
stats filename.".csv" u 2 every ::1::7 noout prefix "A"
angle(x)=x*360/A_sum
percentage(x)=x*100/A_sum
pos=0.0
colour=0
yi=0
plot filename.".csv" u (centerX):(centerY):(radius):(pos):(pos=pos+angle($2)):(colour=colour+1) every::1::7 w circle lc var
system(sprintf("display %s_piechart.png", filename))
this ends up looking like
I uncomment the yrange and comment the unset border and it looks like this:
which is very annoying because when I then try to add labels ...
plot filename.".csv" u (centerX):(centerY):(radius):(pos):(pos=pos+angle($2)):(colour=colour+1) every::1::7 w circle lc var,\
"" u (1.5):(yi=yi+0.5/A_records):($1) every::1::7 w labels
this will happen:
Which I suspect is due to the missing yrange (because other than that, the code doesn't differ much from what was posted in the linked answer).
How do I get the bloody thing working?
It is better to configure graph properties just before the plot command. Other routines (e.g. stats and thus A_sum) will be affected by these properties (e.g. set yrange).
This is why the pie chart disappears.
Also, be sure to have equal unit lengths for the x and y axes (use set size ratio -1). If not, the circumference will be drawn with respect to the canvas size, and not with respect to the axes. The pie chart will appear cut otherwise (unless an appropriate yrange is given).
With some modifications, I obtain this chart:
This is the code:
filename = 'sample-time-overview'
rowi = 1
rowf = 7
# obtain sum(column(2)) from rows 1 to 7
set datafile separator ','
stats filename.'.csv' u 2 every ::rowi::rowf noout prefix "A"
angle(x)=x*360/A_sum
percentage(x)=x*100/A_sum
# circumference dimensions for pie-chart
centerX=0
centerY=0
radius=1
# label positions
yposmin = 0.0
yposmax = 0.95*radius
xpos = 1.5*radius
ypos(i) = yposmax - i*(yposmax-yposmin)/(1.0*rowf-rowi)
#-------------------------------------------------------------------
# now we can configure the canvas
set style fill solid 1 # filled pie-chart
unset key # no automatic labels
unset tics # remove tics
unset border # remove borders; if some label is missing, comment to see what is happening
set size ratio -1 # equal scale length
set xrange [-radius:2*radius] # [-1:2] leaves place for labels
set yrange [-radius:radius] # [-1:1]
#-------------------------------------------------------------------
pos = 0 # init angle
colour = 0 # init colour
# 1st line: plot pie-chart
# 2nd line: draw colored boxes at (xpos):(ypos)
# 3rd line: place labels at (xpos+offset):(ypos)
plot filename.'.csv' u (centerX):(centerY):(radius):(pos):(pos=pos+angle($2)):(colour=colour+1) every ::rowi::rowf w circle lc var,\
for [i=0:rowf-rowi] '+' u (xpos):(ypos(i)) w p pt 5 ps 4 lc i+1,\
for [i=0:rowf-rowi] filename.'.csv' u (xpos):(ypos(i)):(sprintf('%05.2f%% %s', percentage($2), stringcolumn(1))) every ::i+1::i+1 w labels left offset 3,0
Setting yrange also influences the execution of the stats command. Therefore you should try to set yrange[-1.25:1.25] after the stats command, not before.
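As a minimal sketch of that ordering, reusing the file name, rows and range from the question (nothing else is assumed):

```gnuplot
set datafile separator ','

# run stats while the y axis is still autoscaled, so no row is filtered out
stats 'sample-time-overview.csv' using 2 every ::1::7 nooutput prefix "A"

# restrict the range only afterwards; A_sum has already been computed
set yrange [-1.25:1.25]
```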
PS:
Plotting the labels with
plot filename.'.csv' u (1.5):(yi=yi+0.5/A_records):($1) every::1::7 w labels
does not work for me. I have to remove the dollar sign:
plot filename.'.csv' u (1.5):(yi=yi+0.5/A_records):1 every::1::7 w labels
And I have to adjust the values 1.5 and 0.5 a little bit.
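If you prefer to keep the parenthesised form, stringcolumn(1) should also work, since it returns the field as text whereas $1 tries to read it as a number. A sketch, assuming the file and rows from the question and the same starting value yi=0:

```gnuplot
filename = 'sample-time-overview'
set datafile separator ','
stats filename.'.csv' u 2 every ::1::7 noout prefix "A"   # defines A_records

yi = 0
# stringcolumn(1) passes the first field to "with labels" as a string
plot filename.'.csv' u (1.5):(yi=yi+0.5/A_records):(stringcolumn(1)) every ::1::7 w labels
```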
Gnuplot reads weather data from a huge file called file.dat and plots the data for a given date and time range.
But if there is no data for the given range (xrange), gnuplot aborts with an error.
How can I tell gnuplot to display a text in the output image if there is no data for the given date and time?
("There is no data available, I am sorry")
The error, if there is no data available:
line 0: all points y2 value undefined!
The script.dem file, which is loaded by gnuplot:
reset
#SET TERMINAL
set term svg
set output 'temp-verlauf.svg'
set title "Temperaturverlauf"
#Axes label
set xlabel "Messzeitpunkt"
set ylabel "Luftfeuchte/Temperatur"
set y2label "Luftdruck"
#Axis setup
set xdata time # x axis is scaled in date/time format
set timefmt "%d.%m.%Y\t%H:%M:%S" # time format of the input: dd.mm.yyyy<TAB>hh:mm:ss
set format x "%H:%M" # format of the axis tic labels
#Axis ranges
set yrange [0:60] # y axis runs from:to
#Tics
set ytics nomirror
set y2tics nomirror
#OTHER
set datafile separator "\t"
set xrange ["06.11.2014 14:00:00":"07.11.2014 21:00:00"]
plot \
"file.dat" every 10 using 1:5 title "Luftfeuchte" with lines, \
"file.dat" every 10 using 1:6 title "Temperatur" with lines, \
"file.dat" every 10 using 1:7 title "Luftdruck" with lines axes x1y2, \
"file.dat" every 10 using 1:17 title "Niederschlagsintensitaet Synop (4677)" with lines
EDIT
Thanks to user "bibi".
He had the good idea of letting gnuplot plot -1, so that there is always something to plot even if nothing is available in file.dat.
The script now looks like this:
reset
#SET TERMINAL
set term svg
set output 'temp-verlauf.svg'
set title "Temperaturverlauf"
#Axes label
set xlabel "Messzeitpunkt"
set ylabel "Luftfeuchte/Temperatur"
set y2label "Luftdruck"
#Axis setup
set xdata time # x axis is scaled in date/time format
set timefmt "%d.%m.%Y\t%H:%M:%S" # time format of the input: dd.mm.yyyy<TAB>hh:mm:ss
set format x "%H:%M" # format of the axis tic labels
#Axis ranges
set yrange [0:60] # y axis runs from:to
#Tics
set ytics nomirror
set y2tics nomirror
#OTHER
set datafile separator "\t"
set xrange ["06.11.2014 14:00:00":"07.11.2014 21:00:00"]
plot \
-1 axes x1y2, \
-1 axes x1y1, \
"file.dat" every 10 using 1:5 title "Luftfeuchte" with lines, \
"file.dat" every 10 using 1:6 title "Temperatur" with lines, \
"file.dat" every 10 using 1:7 title "Luftdruck" with lines axes x1y2, \
"file.dat" every 10 using 1:17 title "Niederschlagsintensitaet Synop (4677)" with lines
The simplest solution that came to my mind is to draw a horizontal line outside the plotting region (-1 is fine since you have set yrange [0:60]):
plot \
-1, \
"file.dat" every 10 using 1:5 title "Luftfeuchte" with lines, \
"file.dat" every 10 using 1:6 title "Temperatur" with lines, \
"file.dat" every 10 using 1:7 title "Luftdruck" with lines axes x1y2, \
"file.dat" every 10 using 1:17 title "Niederschlagsintensitaet Synop (4677)" with lines
Moreover, the gnuplot internal variable GPVAL_ERRNO will be non-zero if something went wrong; you could check it and print a banner on screen.
The data in your file is duplicated, so after plotting it once gnuplot jumps back to the start and plots everything a second time on top of the first plot.
Took me a couple of hours to figure out. :)
I am trying to give the error bars in a bar plot different colors using gnuplot, but I couldn't manage it. Some of the default color combinations of a bar and the error line that sits on it are hard to read. I tried it in Python and got what I wanted, but I would like to get a similar output using gnuplot.
The top picture is produced using the following MWE:
reset
set term postscript eps size 5.5,4.5 enhanced color font 'Arial-Bold' 25
set output 'check.eps'
set style fill solid 0.3 noborder
set bars front
set key horizontal Left reverse noenhanced autotitles nobox
set style histogram errorbars linewidth 9
set style data histograms
set xlabel " "
set xtics rotate by -45
set xlabel offset character 0, -1, 0
set yrange [0:100]
set ylabel "%"
plot \
newhistogram "label 1",'check.mat' \
using 2:3:4:xtic(1) t "M1", \
'' u 6:7:8 t "M2",\
newhistogram "label 2", '' u 10:11:12:xtic(1) t "M1",\
'' u 14:15:16:xtic(1) t "M2",\
newhistogram "label 3", '' u 14:15:16:xtic(1) t "M1",\
'' u 18:19:20:xtic(1) t "M2" lc rgb "black"
quit
I produced the bottom picture following the example given here: Python matplotlib example. The colors of the error lines can be controlled with the variable ecolor in Python. Is there something similar in gnuplot?
I also don't understand why gnuplot gives me dotted error lines in some of the cases. Is it possible to make them all solid?
I am a beginner in gnuplot, so any help is greatly appreciated!
I'm trying to plot a histogram for the following data:
<text>,<percentage>
--------------------
"Statement A",50%
"Statement B",20%
"Statement C",30%
I used set datafile separator "," to obtain the corresponding columns. The plot should have the percentage on the X-axis and the statements (the full character strings) on the Y-axis, so each bar of the histogram is horizontal.
How can I do this in gnuplot?
Or are there other tools for plotting good vector images?
The gnuplot histogram and boxes plotting styles are for vertical boxes. To get horizontal boxes, you can use boxxyerrorbars.
For the strings as y-labels, I use yticlabels and place the boxes at the y-values 0, 1 and 2 (according to the row in the data file, which is accessed with $0).
I let gnuplot treat the second column as numerical value, which strips the % off. It is added later in the formatting of the xtics:
set datafile separator ','
set format x '%g%%'
set style fill solid
plot 'data.txt' using ($2*0.5):0:($2*0.5):(0.4):yticlabels(1) with boxxyerrorbars t ''
The result with version 4.6.4 is:
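To make the meaning of the four columns explicit, here is a self-contained sketch of the same plot. It is only an illustration under assumptions: the inline datablock (named $Data here, a name of my choosing) needs gnuplot 5.0 or newer, unlike the 4.6.4 run above, and the two range settings are my own additions.

```gnuplot
# inline copy of the question's data (header lines left out)
$Data << EOD
"Statement A",50%
"Statement B",20%
"Statement C",30%
EOD

set datafile separator ','
set format x '%g%%'          # append the percent sign to the x tic labels
set style fill solid
set xrange [0:*]             # assumption: bars start at the left border
set yrange [-0.5:2.5]        # assumption: half a row of margin above and below

# boxxyerrorbars with four columns: x centre : y centre : x half-width : y half-height
# a bar from 0 to $2 is therefore centred at $2/2 with half-width $2/2
plot $Data using ($2*0.5):0:($2*0.5):(0.4):yticlabels(1) with boxxyerrorbars notitle
```

Placing the centre at half the value is what makes every bar start at zero.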
@Christoph Thank you. Your answer helped me.
@Slayer Regarding your question about adding labels: the following uses gnuplot v5.2 patchlevel 6 and builds on the sample provided by @Christoph.
Sample Code:
# set the data file delimiter
set datafile separator ','
# set the x-axis labels to show percentage
set format x '%g%%'
# set the x-axis min and max range
set xrange [ 0 : 100]
# set the style of the bars
set style fill solid
# set the textbox style with a blue line colour
set style textbox opaque border lc "blue"
# plot the data graph and place the labels on the bars
plot 'plotv.txt' using ($2*0.5):0:($2*0.5):(0.3):yticlabels(1) with boxxyerrorbars t '', \
'' using 2:0:2 with labels center boxed notitle column
Sample Data Provided:(plotv.txt)
<text>,<percentage>
--------------------
"Statement A",50%
"Statement B",20%
"Statement C",30%
Reference(s):
gnuplot 5.2 demo sample - textbox and the related sample data