How to position the color bar in GrADS?

I am looking for a script that positions the color bar by itself. Here is the graph:
I used the set_parea.gs script to arrange the plots in columns and the color.gs script to set the shading. The color bar script is xcbar.gs. Here are the command lines:
c
set_parea 1 3 1 1 -margin 0.8
color 0 12 1.2 -kind red->orange->yellow->dodgerblue->blue
d var1
set_parea 1 3 1 2 -margin 0.8
color 0 12 1.2 -kind red->orange->yellow->dodgerblue->blue
d var2
set_parea 1 3 1 3 -margin 0.8
color -12 12 2.4 -kind blue->white->red
d var1-var2
I would like the blue->white->red color bar to stay just below the differences map, and the red->orange->yellow->dodgerblue->blue color bar to stay just below the orange maps.

You can adjust the position of the color bar through the arguments of the xcbar command.
The document below should have the details you need:
http://kodama.fubuki.info/wiki/wiki.cgi/GrADS/script/xcbar.gs?lang=en
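According to the linked documentation, xcbar takes the bar's corner coordinates (in page units) as its first four positional arguments, so you can draw a separate bar under each panel right after displaying it. A rough sketch with illustrative coordinates only; check the documentation above for the exact coordinates and option list for your page layout:

```
* below the blue->white->red difference panel
xcbar 4.5 7.5 0.3 0.6
* below the red->orange->yellow->dodgerblue->blue panels
xcbar 0.8 3.8 0.3 0.6
```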

Related

Scatter plot derived from two pandas dataframes with multiple columns in plotly [express]

I want to create a scatter plot that derives its x values from one dataframe and its y values from another dataframe, each having multiple columns.
x_df :
red blue
0 1 2
1 2 3
2 3 4
y_df:
red blue
0 1 2
1 2 3
2 3 4
I want the scatter plot to have two traces, red and blue, such that the x values come from x_df and the y values come from y_df.
At some layer you need to do the data integration; IMHO this is better done at the data layer, i.e. in pandas. I have modified your sample data so the two traces do not overlap, and used join(), assuming the index of the data frames is the join key. The dataframe could be structured further; instead, multiple traces are generated with plotly express and modified as required so that colors and legends are created. Axis labels are not considered.
import io
import pandas as pd
import plotly.express as px

x_df = pd.read_csv(io.StringIO(""" red blue
0 1 2
1 2 3
2 3 4"""), sep=r"\s+")
y_df = pd.read_csv(io.StringIO(""" red blue
0 1.1 2.2
1 2.1 3.2
2 3.1 4.2"""), sep=r"\s+")

# join on the index, suffixing the overlapping column names
df = x_df.join(y_df, lsuffix="_x", rsuffix="_y")

# one px.scatter() call per trace; the second figure's traces are
# appended to the first so both appear in one plot with a legend
px.scatter(df, x="red_x", y="red_y").update_traces(
    marker={"color": "red"}, name="red", showlegend=True
).add_traces(
    px.scatter(df, x="blue_x", y="blue_y")
    .update_traces(marker={"color": "blue"}, name="blue", showlegend=True)
    .data
)
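An alternative sketch (same assumption that the index is the join key) is to reshape the two frames into long form first; plotly express then creates both traces, colors, and the legend from a single call. The column names "row", "color", "x", and "y" here are my own choices:

```python
import io
import pandas as pd

x_df = pd.read_csv(io.StringIO("""red blue
1 2
2 3
3 4"""), sep=r"\s+")
y_df = pd.read_csv(io.StringIO("""red blue
1.1 2.2
2.1 3.2
3.1 4.2"""), sep=r"\s+")

# stack both frames into long form: one row per (row, color) pair
long_df = (
    pd.concat({"x": x_df.stack(), "y": y_df.stack()}, axis=1)
    .rename_axis(["row", "color"])
    .reset_index()
)

# a single call now draws both traces with a proper legend:
# px.scatter(long_df, x="x", y="y", color="color",
#            color_discrete_map={"red": "red", "blue": "blue"})
```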

groupby .sum() returns wrong value in pandas

I have a data frame as follows,
Category Feature valueCount
A color 153
A color 7
A color 48
A color 16
B length 5
C height 1
C height 16
I want to get the sum of valueCount by Category and Feature
I am using the following code;
DF['valueSum'] = DF.groupby(['Category','Feature'])['valueCount'].transform('sum')
I am getting the output as;
Category Feature valueCount valueSum
A color 153 26018
A color 7 26018
A color 48 26018
A color 16 26018
B length 5 25
C height 1 257
C height 16 257
which is really weird: it seems to be squaring valueCount and then adding up. Does anyone know what is going wrong here?
According to the docs, GroupBy objects provide a sum method that does what you need:
In [12]: grouped.sum()
Alternatively, aggregate into a new frame (this example groups by Category only; pass ['Category','Feature'] to group by both):
In [4]: df
Out[4]:
Category Feature valueCount
0 A color 153
1 A color 7
2 A color 48
3 A color 16
4 B length 5
5 C height 1
6 C height 16
In [5]: df.groupby(df['Category']).sum()
Out[5]:
valueCount
Category
A 224
B 5
C 17
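For what it's worth, the transform('sum') line from the question behaves correctly on a clean numeric frame, as this sketch shows. The inflated numbers in the question (which happen to equal the sums of squares: 153² + 7² + 48² + 16² = 26018) therefore point at something odd in the actual data or environment rather than at groupby itself:

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "A", "A", "B", "C", "C"],
    "Feature": ["color", "color", "color", "color", "length", "height", "height"],
    "valueCount": [153, 7, 48, 16, 5, 1, 16],
})

# same code as in the question
df["valueSum"] = df.groupby(["Category", "Feature"])["valueCount"].transform("sum")
print(df)
# valueSum comes out as 224, 224, 224, 224, 5, 17, 17 -- the plain group sums
```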

Pandas: expanding_apply with groupby for unique counts of string type

I have the dataframe:
import pandas as pd
id = [0, 0, 0, 0, 1, 1, 1, 1]
color = ['red', 'blue', 'red', 'black', 'blue', 'red', 'black', 'black']
test = pd.DataFrame(list(zip(id, color)), columns=['id', 'color'])
and would like to create a column of the running count of the unique colors grouped by id so that the final dataframe looks like this:
id color expanding_unique_count
0 0 red 1
1 0 blue 2
2 0 red 2
3 0 black 3
4 1 blue 1
5 1 red 2
6 1 black 3
7 1 black 3
I tried this simple way:
import numpy as np

def len_unique(x):
    return len(np.unique(x))

test['expanding_unique_count'] = test.groupby('id')['color'].apply(lambda x: pd.expanding_apply(x, len_unique))
And got ValueError: could not convert string to float: black
If I change the colors to integers:
color = [1,2,1,3,2,1,3,3]
test = pd.DataFrame(zip(id, color), columns = ['id', 'color'])
Then running the same code above produces the desired result. Is there a way for this to work while maintaining the string type for the column color?
It looks like expanding_apply and rolling_apply mainly work on numeric values. Maybe try creating a numeric column that encodes the color strings as numeric values (this can be done by making the color column categorical), and then use expanding_apply.
# create a numeric label for each color
test['numeric_label'] = pd.Categorical(test['color']).codes
# codes: array([2, 1, 2, 0, 1, 2, 0, 0], dtype=int8)

# your expanding function now works on the numeric column
test['expanding_unique_count'] = test.groupby('id')['numeric_label'].apply(lambda x: pd.expanding_apply(x, len_unique))

# drop the auxiliary column
test.drop('numeric_label', axis=1)
id color expanding_unique_count
0 0 red 1
1 0 blue 2
2 0 red 2
3 0 black 3
4 1 blue 1
5 1 red 2
6 1 black 3
7 1 black 3
Edit:
def func(group):
    return pd.Series(1, index=group.groupby('color').head(1).index).reindex(group.index).fillna(0).cumsum()

test['expanding_unique_count'] = test.groupby('id', group_keys=False).apply(func)
print(test)
id color expanding_unique_count
0 0 red 1
1 0 blue 2
2 0 red 2
3 0 black 3
4 1 blue 1
5 1 red 2
6 1 black 3
7 1 black 3
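pd.expanding_apply has since been removed from pandas. A sketch that produces the same running unique count with the current API only, keeping the color column as strings, uses a duplicated() mask and a per-group cumulative sum:

```python
import pandas as pd

test = pd.DataFrame({
    "id": [0, 0, 0, 0, 1, 1, 1, 1],
    "color": ["red", "blue", "red", "black", "blue", "red", "black", "black"],
})

# a row increases the unique count only the first time its (id, color)
# pair appears; cumsum of that mask within each id is the running count
first_seen = ~test.duplicated(["id", "color"])
test["expanding_unique_count"] = first_seen.groupby(test["id"]).cumsum()
print(test)
# expanding_unique_count: 1 2 2 3 1 2 3 3
```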

Gnuplot smooth confidence interval lines as opposed to error bars

I'd like a 95% confidence interval line above and below my data line - as opposed to vertical bars at each point.
Is there a way that I can do this in gnuplot without plotting another line? Or do I need to plot another line and then label it appropriately?
You can use the filledcurves style to fill the region of 95% confidence. Consider the example data file data.dat with the content:
# x y ylow yhigh
1 3 2.6 3.5
2 5 4 6
3 4 3.2 4.3
4 3.5 3.3 3.7
and plot this with the script
set style fill transparent solid 0.2 noborder
plot 'data.dat' using 1:3:4 with filledcurves title '95% confidence', \
'' using 1:2 with lp lt 1 pt 7 ps 1.5 lw 3 title 'mean value'
to get
Alternatively, to plot the data with the mean and the standard deviation as error bars, save the code below as example.gnuplot:
set terminal pdf size 6, 4.5 enhanced font "Times-New-Roman,20"
set output 'out.pdf'
red = "#CC0000"; green = "#4C9900"; blue = "#6A5ACD"; skyblue = "#87CEEB"; violet = "#FF00FF"; brown = "#D2691E";
set xrange [0:*] nowriteback;
set yrange [0.0: 10.0]
set title "Line graph with confidence interval"
set xlabel "X Axis"
set ylabel "Y Axis"
plot "data.dat" using 1:2:3 title "l1" with yerrorlines lw 3 lc rgb red,\
'' using 1:4:5 title "l2" with yerrorlines lw 3 lc rgb brown
Create a new file called "data.dat" with some sample values (columns: x, y, y's stddev, y1, y1's stddev), such as:
X Y Stddev y1 std
1 2 0.5 3 0.25
2 4 0.2 5 0.3
3 3 0.3 4 0.35
4 5 0.1 6 0.3
5 6 0.2 7 0.25
Run the script using the command gnuplot example.gnuplot

pandas matplotlib barh bars print bottom to top but label print top to bottom

I have a dataframe which has been sorted into ascending order by value. It looks like this.
Name Count
19 PAGEBGFX 1
18 CODE 1
17 .orpc 2
16 .sdata 3
15 PAGE 4
14 PAGELK 4
13 .data1 4
12 data 6
11 .tls 6
10 text 6
9 .ndata 8
8 minATL 13
7 .imrsiv 41
6 .rdata 209
5 .pdata 501
4 .idata 660
3 .reloc 896
2 .data 930
1 .rsrc 962
0 .text 998
When I plot this using a horizontal bar chart, the bars seem to fill from the bottom to the top. So the first line in my data frame is the bottom bar in my chart. Here is the plotting code.
ypos = np.arange(len(section_names_df)) + .1
plt.barh(ypos, section_names_df['Count'])
plt.yticks(ypos +.4, section_names_df['Name'])
plt.show()
So the labels appear to have been populated from the top to the bottom of the data frame, but the lengths of the bars were populated from the bottom to the top. Is that how it is supposed to be, or did I do something wrong? Any pointers on how to fix this?
EDIT I created an iPython notebook to demonstrate the problem with a full code reproducing the problem. http://nbviewer.ipython.org/gist/blackfist/dd0941f3ddbbc0f724a1
For the first part of your question: barh plots in the reverse of the order you might expect, so use this to reverse your DataFrame before plotting:
df = df.iloc[::-1]
Try calling
section_names_df.reset_index(inplace=True)
before plotting. The problem is that dataframes are iterated in the order of the index, so it needs to be reset to match the sorted order.
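To illustrate why reset_index helps (a pandas-only sketch, no plotting): after sort_values the positional order of the rows no longer matches the index, so anything that consumes the frame by index sees the pre-sort order:

```python
import pandas as pd

df = pd.DataFrame({"Name": [".text", ".rsrc", ".reloc"],
                   "Count": [998, 962, 896]}).sort_values("Count")

print(df.index.tolist())        # [2, 1, 0]: index still reflects pre-sort order
df = df.reset_index(drop=True)  # positions and index now agree
print(df["Name"].tolist())      # ['.reloc', '.rsrc', '.text']
```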