Altering the X-axis in Altair - facet

I'd like to fill the charts with selectors like the example below. Any tips on how to get this to work in a faceted chart?
import numpy as np
import pandas as pd
import altair as alt
np.random.seed(42)
source = pd.DataFrame(np.cumsum(np.random.rand(8, 4), 0).round(2),
columns=['A', 'B', 'C', 'D'], index=pd.RangeIndex(8, name='x'))
source = source.reset_index().melt('x', var_name='category', value_name='y')
xRange= pd.DataFrame(np.linspace(min(source['x']), max(source['x']), num=100), columns=['x'])
pts = alt.selection_multi(fields=['x'], nearest=True, on='click',empty='none')
# The basic line
main = alt.Chart(source).mark_line(interpolate='basis').encode(
x='x:Q',
y='y:Q',
).transform_filter(
alt.FieldEqualPredicate(field='category', equal='A')
)
line = alt.Chart(source).mark_line(color='Maroon').encode(
x='x:Q',
y='y:Q',
).transform_filter(
alt.FieldEqualPredicate(field='category', equal='B')
)
# Transparent selectors across the chart. This is what tells us
# the x-value of the cursor
selectors = alt.Chart(xRange).mark_rule(size=2).encode(
x='x:Q',
#y='y:Q',
#opacity=alt.value(0.4),
opacity = alt.condition(pts, alt.value(1.0), alt.value(0.2))
).add_selection(pts)
position = alt.Chart(xRange).mark_text(
align='right', dy=140, dx=-8, fontSize=14).encode(
x=alt.X('x'),
text=alt.Text('x',format='.1f')
).transform_filter(pts)
alt.vconcat(
main + selectors + position,
line + selectors + position
)
Ideally I would do this with facet, but I have not found a way around the limitation that a faceted chart can only use a single DataFrame/source. Is there a way to use alt.sequence or impute to generate additional points on the x-axis?
pts = alt.selection_multi(fields=['x'], nearest=True, on='click',empty='none')
# The basic line
line = alt.Chart().mark_line(interpolate='basis').encode(
x='x:Q',
y='y:Q',
)
# Transparent rules across the chart.
rules = alt.Chart().mark_rule(size=2).encode(
x='x:Q',
opacity = alt.condition(pts, alt.value(1.0), alt.value(0.3))
).add_selection(pts)
text = alt.Chart().mark_text(
align='right', dy=140, dx=-8, fontSize=14).encode(
x=alt.X('x'),
text=alt.Text('x',format='.1f')
).transform_filter(pts)
alt.layer(line, rules, text, data=source).facet(
'category:N',
columns=2
)

You can use the sequence generator. It is almost the same as what you already had:
import numpy as np
import pandas as pd
import altair as alt
np.random.seed(42)
source = pd.DataFrame(np.cumsum(np.random.rand(8, 4), 0).round(2),
columns=['A', 'B', 'C', 'D'], index=pd.RangeIndex(8, name='x'))
source = source.reset_index().melt('x', var_name='category', value_name='y')
# xRange= pd.DataFrame(np.linspace(min(source['x']), max(source['x']), num=100), columns=['x'])
xRange = alt.sequence(0, 7.1, 0.1, as_='x')
pts = alt.selection_multi(fields=['x'], nearest=True, on='mouseover',empty='none')
# The basic line
line = alt.Chart().mark_line(interpolate='linear').encode(
x='x:Q',
y='y:Q',
)
# Transparent rules across the chart.
rules = alt.Chart(xRange).mark_rule(size=2).encode(
x='x:Q',
opacity = alt.condition(pts, alt.value(1.0), alt.value(0.3))
).add_selection(pts)
text = alt.Chart(xRange).mark_text(
align='right', dy=140, dx=-8, fontSize=14).encode(
x=alt.X('x:Q'),
text=alt.Text('x:Q',format='.1f')
).transform_filter(pts)
alt.layer(line, rules, text, data=source).facet(
'category:N',
columns=2
)
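The bounds passed to alt.sequence above are hard-coded for this data set. A minimal sketch of how they could be derived from the data instead is shown below; the helper name sequence_bounds is hypothetical (it is not part of Altair), and it assumes the generator's stop value is exclusive, so the stop is nudged half a step past the maximum to keep the last point.

```python
import math


def sequence_bounds(x_values, num):
    """Return (start, stop, step) giving `num` evenly spaced points over x_values.

    Hypothetical helper, not an Altair function. The stop value is pushed
    half a step past the maximum because the sequence stop is exclusive.
    """
    if num < 2:
        raise ValueError("num must be at least 2")
    lo, hi = min(x_values), max(x_values)
    step = (hi - lo) / (num - 1)
    return lo, hi + step / 2, step


# x runs from 0 to 7 in the example data; 71 points gives a step of about 0.1
start, stop, step = sequence_bounds(range(8), num=71)

# The result could then be handed to the generator, for example:
# xRange = alt.sequence(start, stop, step, as_='x')
```

Computing the bounds this way keeps the rules aligned with the data if the x range changes later.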

ggplot2: add title changes point colors <-> scale_color_manual removes ggtitle

I am facing a silly point-color problem in a dot plot with ggplot2. I have a whole table of data from which I take the relevant rows to make a dot plot. With scale_color_manual my points get colored according to the named palette and the factor Geno specified in aes(), but when I simply want to add a title specifying the cell line used, the points go back to the automatic yellow and purple. Adding the title first and setting scale_color_manual as the last layer changes the point colors and removes the title.
What is wrong here? I don't get it and it is a bit frustrating.
Thanks for your help!
Here's reproducible code to get my whole df and the subset for the plots:
# df of data to plot
exp <- c(rep(284, times = 6), rep(285, times = 12))
geno <- c(rep(rep(c("WT", "KO"), each =3), times = 6))
line <- c(rep(5, times = 6),rep(8, times= 12), rep(5, times =12), rep(8, times = 6))
ttt <- c(rep(c(0, 10, 60), times = 10), rep(c("ZAc60", "Cu60", "Cu200"), times = 2))
rep <- c(rep(1, times = 12), rep(2, times = 6), rep(c(1,2), times = 6), rep(1, times = 6))
rel_expr <- c(0.20688185, 0.21576131, 0.94046028, 0.30327675, 0.22865200,
0.92941881, 0.13787508, 0.13325281, 0.22114990, 0.95591724,
1.03239718, 0.83339248, 0.15332420, 0.17558160, 0.22475604,
1.02356351, 0.77882000, 0.69214403, 0.16874097, 0.15548158,
0.45207943, 0.28123760, 0.23500083, 0.51588856, 0.1399634,
0.14610184, 1.06716713, 0.16517801, 0.34736164, 0.64773650,
0.18334429, 0.05924757, 0.01803593, 0.86685230, 0.39554685,
0.25764805)
df_all <- data.frame(exp, geno, line, ttt, rep, rel_expr)
names(df_all) <- c("EXP", "Geno", "Line", "TTT", "Rep", "Rel_Expr")
str(df_all)
# make Geno an ordered factor
df_all$Geno <- ordered(df_all$Geno, levels = c("WT", "KO"))
# select set of whole dataset for current plot
df_ions <- df_all[df_all$Line == 8 & !df_all$TTT %in% c(10, 60),]
# add a treatment as factor columns fTTT
df_ions$fTTT <- ordered(df_ions$TTT, levels = c("0", "ZAc60", "Cu60", "Cu200"))
str(df_ions)
# plot rel_exp vs factor treatment, color points by geno
# with named color palette
library(ggplot2)
col_palette <- c("#000000", "#1356BC")
names(col_palette) <- c("WT", "KO")
plt <- ggplot(df_ions, aes(x = fTTT, y = Rel_Expr, color = Geno)) +
geom_jitter(width = 0.1)
plt # intermediate_plt_1.png
plt + scale_color_manual(values = col_palette) # intermediate_plt_2.png
plt + ggtitle("mRPTEC8") # final_plot.png

Multiple grouped charts with altair

My data has 4 attributes: dataset (D1/D2), model (M1/M2), layer (L1/L2), scene (S1/S2). I can make a chart grouped by scenes and then merge plots horizontally and vertically (pic above).
However, I would like to have 'double grouping' by scene and dataset, that is, merging the D1 and D2 plots by placing the blue/orange bars from each dataset next to each other but with different opacity or pattern/hatch.
Basically something like this (pretend that the black strokes are a hatch pattern).
Here is the code to reproduce the first plot
import numpy as np
import itertools
import argparse
import pandas as pd
import matplotlib.pyplot as plt
import os
import altair as alt
alt.renderers.enable('altair_viewer')
np.random.seed(0)
################################################################################
model_keys = ['M1', 'M2']
data_keys = ['D1', 'D2']
scene_keys = ['S1', 'S2']
layer_keys = ['L1', 'L2']
ys = []
models = []
dataset = []
layers = []
scenes = []
for sc in scene_keys:
    for m in model_keys:
        for d in data_keys:
            for l in layer_keys:
                for s in range(10):
                    data_y = list(np.random.rand(10) / 10)
                    ys += data_y
                    scenes += [sc] * len(data_y)
                    models += [m] * len(data_y)
                    dataset += [d] * len(data_y)
                    layers += [l] * len(data_y)
# ------------------------------------------------------------------------------
df = pd.DataFrame({'Y': ys,
'Model': models,
'Dataset': dataset,
'Layer': layers,
'Scenes': scenes})
bars = alt.Chart(df, width=100, height=90).mark_bar().encode(
# field to group columns on
x=alt.X('Scenes:N',
title=None,
axis=alt.Axis(
grid=False,
title=None,
labels=False,
),
),
# field to use as Y values and how to calculate
y=alt.Y('Y:Q',
aggregate='mean',
axis=alt.Axis(
grid=True,
title='Y',
titleFontWeight='normal',
),
),
# field to use for sorting
order=alt.Order('Scenes',
sort='ascending',
),
# field to use for color segmentation
color=alt.Color('Scenes',
legend=alt.Legend(orient='bottom',
padding=-10,
),
title=None,
),
)
error_bars = alt.Chart(df).mark_errorbar(extent='ci').encode(
x=alt.X('Scenes:N'),
y=alt.Y('Y:Q'),
)
text = alt.Chart(df).mark_text(align='center',
baseline='line-bottom',
color='black',
dy=-5 # y-shift
).encode(
x=alt.X('Scenes:N'),
y=alt.Y('mean(Y):Q'),
text=alt.Text('mean(Y):Q', format='.1f'),
)
chart_base = bars + error_bars + text
chart_base = chart_base.facet(
# field to use to use as the set of columns to be represented in each group
column=alt.Column('Layer:N',
# header=alt.Header(
# labelFontStyle='bold',
# ),
title=None,
sort=list(set(models)), # get unique indices
),
spacing={"row": 0, "column": 15},
)
def unique(sequence):
    seen = set()
    return [x for x in sequence if not (x in seen or seen.add(x))]
for i, m in enumerate(unique(models)):
    chart_imnet = chart_base.transform_filter(
        alt.FieldEqualPredicate(field='Dataset', equal='D1'),
    ).transform_filter(
        alt.FieldEqualPredicate(field='Model', equal=m)
    )
    chart_places = chart_base.transform_filter(
        alt.FieldEqualPredicate(field='Dataset', equal='D2')
    ).transform_filter(
        alt.FieldEqualPredicate(field='Model', equal=m)
    )
    if i == 0:
        title_params = dict({'align': 'center', 'anchor': 'middle', 'dy': -10})
        chart_imnet = chart_imnet.properties(title=alt.TitleParams('D1', **title_params))
        chart_places = chart_places.properties(title=alt.TitleParams('D2', **title_params))
    chart_places = alt.concat(chart_places,
        title=alt.TitleParams(
            m,
            baseline='middle',
            orient='right',
            anchor='middle',
            angle=90,
            # dy=10,
            dx=30 if i == 0 else 0,
        ),
    )
    if i == 0:
        chart = (chart_imnet | chart_places).resolve_scale(x='shared')
    else:
        chart = (chart & (chart_imnet | chart_places).resolve_scale(x='shared'))
chart.save('test.html')
For now, I don't know a good answer, but once https://github.com/altair-viz/altair/pull/2528 is accepted you can use the xOffset encoding channel as such:
alt.Chart(df, height=90).mark_bar(tooltip=True).encode(
x=alt.X("Scenes:N"),
y=alt.Y("mean(Y):Q"),
color=alt.Color("Scenes:N"),
opacity=alt.Opacity("Dataset:N"),
xOffset=alt.XOffset("Dataset:N"),
column=alt.Column('Layer:N'),
row=alt.Row("Model:N")
).resolve_scale(x='independent')
Which will result in:
See Colab Notebook or Vega Editor
EDIT
To control the opacity and the legend names, you can do the following:
alt.Chart(df, height=90).mark_bar(tooltip=True).encode(
x=alt.X("Scenes:N"),
y=alt.Y("mean(Y):Q"),
color=alt.Color("Scenes:N"),
opacity=alt.Opacity("Dataset:N",
scale=alt.Scale(domain=['D1', 'D2'],
range=[0.2, 1.0]),
legend=alt.Legend(labelExpr="datum.label == 'D1' ? 'D1 - transparent' : 'D2 - full'")),
xOffset=alt.XOffset("Dataset:N"),
column=alt.Column('Layer:N'),
row=alt.Row("Model:N")
).resolve_scale(x='independent')
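As a quick sanity check on what the offset bars should display, the per-group means that the chart aggregates with mean(Y) can be tabulated directly in pandas. This is only a sketch: it rebuilds a data frame with the same column names as the question (using a different random generator, so the numbers will not match the question's), and the variable names keys and means are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# One row per combination of the four attributes, then 100 samples each
keys = pd.MultiIndex.from_product(
    [['S1', 'S2'], ['M1', 'M2'], ['D1', 'D2'], ['L1', 'L2']],
    names=['Scenes', 'Model', 'Dataset', 'Layer'],
)
df = keys.to_frame(index=False)
df = df.loc[df.index.repeat(100)].reset_index(drop=True)
df['Y'] = rng.random(len(df)) / 10

# Rows follow the facet/x grouping; columns are the two datasets that
# end up side by side once xOffset is applied
means = (
    df.groupby(['Model', 'Layer', 'Scenes', 'Dataset'])['Y']
    .mean()
    .unstack('Dataset')
)
print(means.round(3))
```

Each row of the resulting table corresponds to one pair of adjacent D1/D2 bars in the faceted chart.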

How to create textbox on figure using first row in geodataframe?

I am looking to plot a textbox on a figure displaying the 5-Day NHC forecast cone for a tropical cyclone, in this case Hurricane Dorian. I have the four shapefiles (track line, cone, points, and watches/warnings). On the figure I want to display the following from the first row of points_gdf (yellow circles in the image; the two commented out lines near the bottom of the code is what I tried initially):
Latest Tracking Information: (regular string; below are variables from points_gdf)
LAT LON
MAXWIND
GUST
MSLP
TCSPD
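import geopandas
import matplotlib.pyplot as plt
# map_crs and data_crs are assumed to be cartopy CRS objects defined earlier (not shown)
track_line_gdf = geopandas.read_file('nhc/al052019_5day_037/al052019-037_5day_lin.shp')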
cone_gdf = geopandas.read_file('nhc/al052019_5day_037/al052019-037_5day_pgn.shp')
points_gdf = geopandas.read_file('nhc/al052019_5day_037/al052019-037_5day_pts.shp')
ww_gdf = geopandas.read_file('nhc/al052019_5day_037/al052019-037_ww_wwlin.shp')
fig = plt.figure(figsize=(14,12))
fig.set_facecolor('white')
ax = plt.subplot(1,1,1, projection=map_crs)
ax.set_extent([-88,-70,25,50])
ax.add_geometries(cone_gdf['geometry'], crs=data_crs, facecolor='white',
edgecolor='black', linewidth=0.25, alpha=0.4)
ax.add_geometries(track_line_gdf['geometry'], crs=data_crs, facecolor='none',
edgecolor='black', linewidth=2)
sc = ax.scatter(points_gdf['LON'], points_gdf['LAT'], transform=data_crs,
zorder=10, c=points_gdf['MAXWIND'], cmap='jet')
ww_colors = {'Tropical Storm Watch': 'gold',
'Hurricane Watch': 'pink',
'Tropical Storm Warning': 'tab:blue',
'Hurricane Warning': 'tab:red'}
for ww_type in ww_colors.keys():
ww_subset = ww_gdf[ww_gdf['TCWW']==ww_type]
ax.add_geometries(ww_subset['geometry'], facecolor='none',
edgecolor=ww_colors[ww_type], crs=data_crs,
linewidth=5)
markers = [plt.Line2D([0,0],[0,0],color=color, marker='o', linestyle='') for color in ww_colors.values()]
Name = ww_gdf['STORMNAME'][0]
Storm = ww_gdf['STORMTYPE'][0]
AdvDate = ww_gdf['ADVDATE'][0]
AdvNum = ww_gdf['ADVISNUM'][0]
props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
plt.colorbar(sc, label='Wind Speed (mph)')
plt.title(Storm + ' ' + Name + ' - ' + AdvDate + ' Advisory', fontsize=14, fontweight='bold')
plt.legend(markers, ww_colors.keys())
plt.text(0.05, 0.95, 'Testing', transform=ax.transAxes, va='top', bbox=props)
It would help to know either what error you're running into, or what exactly isn't behaving how you want. I can slightly tweak your code to make this:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import pandas as pd
fig = plt.figure(figsize=(14,12))
fig.set_facecolor('white')
ax = plt.subplot(1,1,1, projection=ccrs.LambertConformal())
plt.title('Storm Advisory', fontsize=14, fontweight='bold')
points_gds = pd.DataFrame(dict(GUST=[165.0], LAT=[26.8],
LON=[-78.3], MSLP=[930.2]))
storminfo = f'''Max Wind Gusts: {points_gds.iloc[0]['GUST']:.0f} mph
Current Latitude: {points_gds.iloc[0]['LAT']:.1f}
Current Longitude: {points_gds.iloc[0]['LON']:.1f}
Central Pressure: {points_gds.iloc[0]['MSLP']:.2f} mb'''
props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
plt.text(0.05, 0.95, storminfo, transform=ax.transAxes, va='top', bbox=props)
ax.coastlines()
ax.set_extent([-88,-70,25,50])
which produces this image:
To make that work I needed to change round (which is a Python built-in function) to the string 'round'. The text is formatted using f-strings ("formatted string literals"), and enclosed as a triple-quoted string to avoid needing to manually put in the newline ('\n') characters. Python's docs can tell you more about how to control the formatting of individual items.
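The formatting part can be tried on its own, without a map or a GeoDataFrame. In the sketch below, a plain dict named first_row stands in for the first row of points_gdf (the values are the same made-up numbers used above), and the format specifiers after the colon control the number of decimals.

```python
# Stand-in for points_gdf.iloc[0]; the numbers are illustrative only
first_row = {'GUST': 165.0, 'LAT': 26.8, 'LON': -78.3, 'MSLP': 930.2}

# Triple quotes keep the line breaks, so no explicit '\n' is needed;
# ':.0f', ':.1f' and ':.2f' set the number of decimal places
storminfo = f'''Max Wind Gusts: {first_row['GUST']:.0f} mph
Current Latitude: {first_row['LAT']:.1f}
Current Longitude: {first_row['LON']:.1f}
Central Pressure: {first_row['MSLP']:.2f} mb'''

print(storminfo)
```

The resulting string can be passed as the third argument to plt.text in place of a literal.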

How to concat two bar charts in Altair with space between series?

I have the following code to generate two bar charts. The first one is a "Central" scenario that always needs to be visible. The second represents multiple stress scenarios, with values depending on two sliders.
My problem is how to combine the two charts while leaving space between the two series, so that both stay visible in all cases (like a grouped bar chart).
Here is my code :
import altair as alt
from vega_datasets import data
pvfp=Res.loc[(Res.Item=="PVFP")&(Res.annee>0)]
base = alt.Chart(pvfp, width=500, height=300).mark_bar(color="Green").encode(
x=alt.X('annee:Q'),
y='valeur:Q',
tooltip="valeur:Q"
)
central = alt.Chart(pvfp.loc[(Res.TS=='Central')&(Res.TRA=='Central')], width=500, height=300).mark_bar().encode(
x=alt.X('annee:Q'),
y='valeur:Q',
tooltip="valeur:Q"
)
# A slider filter
TRA_slider = alt.binding_range(min=-40, max=20, step=10,name="Sensi TRA :")
TS_slider = alt.binding_range(min=-20, max=20, step=5,name="Sensi TS : ")
slider1 = alt.selection_single(bind=TRA_slider, fields=['TRA2'],init={'TRA2': 0})
slider2 = alt.selection_single(bind=TS_slider, fields=['TS2'],init={'TS2': 0})
filter_TRA = base.add_selection(
slider1,slider2
).transform_filter(
slider1&slider2
).properties(title="Sensi_TRA")
central + filter_TRA
And a view of the chart I obtain for now :
If you have any idea of a way to do that, I would be very grateful.
UPDATE :
Here is a reproducible example of the same problem.
import altair as alt
import pandas as pd
from vega_datasets import data
dataset = data.population.url
source=pd.read_json(dataset)
source2=source.loc[source.year==1900]
pink_blue = alt.Scale(domain=('Male', 'Female'),
range=["steelblue", "salmon"])
slider = alt.binding_range(min=1900, max=2000, step=10)
select_year = alt.selection_single(name="year", fields=['year'],
bind=slider, init={'year': 2000})
chart1 = alt.Chart(source).mark_bar().encode(
x=alt.X('age:O', title=None),
y=alt.Y('people:Q', scale=alt.Scale(domain=(0, 12000000))),
).properties(
width=300
).add_selection(
select_year
).transform_filter(
select_year
)
chart2 = alt.Chart(source2).mark_bar(color="green").encode(
x=alt.X('age:O', title=None),
y=alt.Y('people:Q', scale=alt.Scale(domain=(0, 12000000))),
)
chart1+chart2
As described, what I would like is a way to separate the two series and obtain an output like the example mentioned by @joelostblom.
I hope this is clearer.
You can do this with a combination of bandPaddingInner and xOffset. For example:
import altair as alt
import pandas as pd
from vega_datasets import data
dataset = data.population.url
source=pd.read_json(dataset)
source2=source.loc[source.year==1900]
pink_blue = alt.Scale(domain=('Male', 'Female'),
range=["steelblue", "salmon"])
slider = alt.binding_range(min=1900, max=2000, step=10)
select_year = alt.selection_single(name="year", fields=['year'],
bind=slider, init={'year': 2000})
chart1 = alt.Chart(source).mark_bar(
xOffset=-3
).encode(
x=alt.X('age:O', title=None),
y=alt.Y('people:Q', scale=alt.Scale(domain=(0, 12000000))),
).properties(
width=300
).add_selection(
select_year
).transform_filter(
select_year
)
chart2 = alt.Chart(source2).mark_bar(
xOffset=5,
color="green",
).encode(
x=alt.X('age:O', title=None),
y=alt.Y('people:Q', scale=alt.Scale(domain=(0, 12000000))),
)
(chart1+chart2).configure_scale(bandPaddingInner=0.6)

When creating a table underneath the axis of a plot, is there a way to create some whitespace between the axis and the table using matplotlib?

I am trying to create a table inside a plot, right underneath the plot's axis, using matplotlib.
I am using the plt.table function to do this.
However, the table is drawn directly against the axis, so it overlaps the axis labels.
Is there a way to add some white space in between?
The code looks something like this:
for key, value in arrayToPlot.iteritems():
    ax1 = fig.add_subplot(2, 2, 1)
    if value["HorErr"]:
        cdf = []
        #calculate percentile stats for the value["HorErr"]
        cdfArrayPointer[key]["HorErr"]["percentileStats"]=libMath.percentileForListofPercentiles( value["HorErr"], PERCENTILE, validPointsOnly = True )
        # now calculate the cdf values
        cdfArrayPointer[key]["HorErr"]["cdf"] = libMath.cdf( value["HorErr"], 2, 400, validPointsOnly = True)
        for k, v in cdfArrayPointer[key]["HorErr"]["cdf"].iteritems():
            cdf.append( v )
        #plot the cdf value
        ax1.plot(cdf, 'o-', label = ('HorErr for ' + str( key) ), color = getColour(key), markersize=3)
        plt.title("CDF Plot of 2D-Horizontal Error", size = 8)
        plt.ylabel('Percentile %', size = 7)
        plt.xlabel('Horizontal Error [m]', size = 6)
        plt.axis([0, 150, 0, 110])
        leg = plt.legend(loc = 4)
        setLegendSize( leg, 7)
        # creating the table to be drawn on the axis
        tableTexts["rows"].append(key)
        tableTexts["rowColour"].append(getColour(key))
        if (len(tableTexts["col"]) == 0):
            tableTexts["col"] = tuple(cdfArrayPointer[key]["HorErr"]["percentileStats"].keys())
        tableTexts["values"].append(cdfArrayPointer[key]["HorErr"]["percentileStats"].values())
the_table = plt.table(cellText=tableTexts["values"], rowLabels= tableTexts["rows"], rowColours= tableTexts["rowColour"] ,colLabels= tableTexts["col"], loc="bottom")
What about breaking your figure up using subplot?
You could have the axis in one subplot, and the table in another. (See example near bottom of page here)
I can illustrate further if you can't follow.
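A minimal sketch of that idea is below, assuming a recent matplotlib. The data, labels and percentile values are made up for illustration and do not come from the question's code; the plot goes in the upper axes and the table in a separate lower axes with its frame and ticks turned off, so the x-axis label has its own room.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Illustrative data only
errors = [0, 10, 25, 45, 70, 100]
percentiles = [0, 30, 55, 75, 90, 100]

# Two stacked axes: the plot gets three quarters of the height, the table the rest
fig, (ax_plot, ax_table) = plt.subplots(
    2, 1, figsize=(6, 5), gridspec_kw={"height_ratios": [3, 1]}
)

ax_plot.plot(errors, percentiles, 'o-', markersize=3, label='HorErr for A')
ax_plot.set_title("CDF Plot of 2D-Horizontal Error", size=8)
ax_plot.set_xlabel('Horizontal Error [m]', size=6)
ax_plot.set_ylabel('Percentile %', size=7)
ax_plot.legend(loc=4)

# The lower axes only hosts the table, so hide its frame and ticks
ax_table.axis("off")
the_table = ax_table.table(
    cellText=[["12.0", "48.5", "91.3"]],
    rowLabels=["A"],
    colLabels=["50%", "67%", "95%"],
    loc="center",
)

fig.tight_layout()
# fig.savefig("cdf_with_table.png")  # or plt.show() in an interactive session
```

The gap between plot and table can then be tuned with the height ratios or with fig.subplots_adjust(hspace=...).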