How to create a Correlation Heat Map of All Measures in Tableau? - pandas

I have a query with 10 measures. I am able to draw a correlation heat map in Python using the code below:
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt
df = pd.read_sql('select statement', con)  # con: an open database connection (placeholder)
sn.heatmap(df.corr(), annot=True)
plt.show()
How can I make a similar correlation heat map in Tableau?

The general way to make a heatmap in Tableau is to put one discrete field on rows and another discrete field on columns, then select the square mark type. Under Format, choose a square cell size, and enlarge the cells as much as you prefer.
Then put a continuous field on the color shelf. Click the color button to choose the color palette you like and, if you want, turn on a border. Click the size button to adjust the mark size so that it matches the cell size.
There are a lot of good examples on Tableau Public.
https://public.tableau.com/app/search/vizzes/correlation%20matrix
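If the correlations are computed in pandas first, the layout described above needs a long table: one row per pair of measures, with two discrete name fields (for rows and columns) and one continuous field (for color). The following is a minimal sketch under that assumption. The measures m1 to m4 are made-up data, and the column names measure_row, measure_col and correlation are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the query result; replace with your own DataFrame.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 4)), columns=["m1", "m2", "m3", "m4"])

# Wide correlation matrix: one row and one column per measure.
corr = df.corr()

# Long form: one row per (measure_row, measure_col) pair.
long_corr = (
    corr.rename_axis("measure_row")
        .reset_index()
        .melt(id_vars="measure_row", var_name="measure_col",
              value_name="correlation")
)

# Uncomment to write a file Tableau can connect to (the file name is arbitrary).
# long_corr.to_csv("correlation_long.csv", index=False)
```

In Tableau, measure_row would then go on rows, measure_col on columns, and correlation on color.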

Related

How to use the parameter "annot_kws" of the function "sns.heatmap" to revise the annotation text?

How can I draw such a heatmap using the "seaborn.heatmap" function?
The color shades are determined by matrix A and the annotation of each grid is determined by matrix B.
For example, if I get a matrix, I want its color to be displayed according to the z-score of this matrix, but the annotation remains the matrix itself.
I know I should resort to the parameter 'annot_kws', but how exactly should I write the code?
Instead of simply setting annot=True, annot= can be set to a dataframe (or 2D numpy array, or a list of lists) with the same number of rows and columns as the data. That way, the coloring will be applied using the data, and the annotation will come from annot. Seaborn will still take care to use white text for the dark cells and black text for the light ones.
annot_kws= is used to change the text properties, typically the fontsize. But you could also change the font itself, or the alignment if you use multiline text.
Here is an example using the numbers 1 to 36 as annotations, but the numbers modulo 10 for the coloring. The annot_kws are used to enlarge and rotate the text. (Note that when the annotations are strings, you also need to set the format, e.g. fmt='').
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
matrix_B = np.arange(1, 37).reshape(6, 6)  # used for annotations
matrix_A = matrix_B % 10  # used for coloring
sns.heatmap(data=matrix_A, annot=matrix_B,
            annot_kws={'size': 20, 'rotation': 45},
            square=True, cbar_kws={'label': 'last digit'})
plt.show()
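For the z-score case in the question, the same idea could look like the sketch below: the colors come from the z-scores and the annotations from the raw matrix. The matrix is made-up data, and computing the z-score over the whole matrix (rather than per row or per column) is an assumption.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Made-up integer matrix; replace with your own data.
rng = np.random.default_rng(0)
matrix = rng.integers(0, 100, size=(5, 5))

# z-score over the whole matrix (population standard deviation).
z = (matrix - matrix.mean()) / matrix.std()

# Colors follow z, annotations show the original values.
# fmt="d" prints the integer annotations without decimals.
ax = sns.heatmap(data=z, annot=matrix, fmt="d",
                 cmap="coolwarm", center=0,
                 cbar_kws={"label": "z-score"})
```

Call plt.show() afterwards to display the figure.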

Is there a way to plot multiple lines using hvplot.line from an xarray array?

I have data for multiple y traces in an xarray array.
A subset of traces can be selected with:
t=s_xr_all.sel(trace_index=slice(0,2,1),xy='y')
# trace_index and xy are dimension names and above selects subset of 3 traces (lines) into t
t.name='t'
t.hvplot.line(x='point_index',y='t')
The above creates a line plot with a slider widget that scrolls through the lines, displaying a single line at a time.
I would like to plot all lines at once without creating the slider widget. The hvplot documentation is sparse on how to do that.
t.hvplot.line(x='point_index',y='t').overlay()
Chaining .overlay() removes the slider, and all the lines in the xarray array are displayed in one plot.

Visualizing longitudinal patient data: adding specific icons or symbols to certain cells in a time series heatmap to indicate events/outcomes

I am currently involved in a clinical study. We are trying to visualize patients' blood work over time using seaborn cluster maps (for example patient CPR levels). For reference: we have some 200 patients and up to 60 days of observed data, so the cells in the plot are pretty small.
During the observation period, some patients either died or developed an outcome of interest. We would love to visualize these key events with some form of symbol or icon. I am imagining something like this:
In addition to its color coding, the cell at the date of death gets a big dot right in the middle, or a symbolic cross or some other symbol.
Things that might work, but I do not know how to do:
I am using lines to separate cells. Changing the width and color of the cell borders at the date an event occurred might work.
Things that don't work:
The cells in my heatmap are too small for custom annotations.
import pandas as pd
import seaborn as sns
df = pd.read_excel('data.xlsx')
# mask=(df == 0) hides the cells whose value is 0
heatmap = sns.clustermap(df, col_cluster=False, row_cluster=False, cmap='YlOrRd',
                         mask=(df == 0), vmax=10, vmin=0,
                         linewidths=1, linecolor='black',
                         figsize=(20, 16), cbar_pos=(0.1, 0.2, .02, .6))
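One approach that may work even with small cells is to draw markers on top of the heatmap instead of annotating the cells. The sketch below is only an illustration under assumptions: the data and the events list are made up, and it uses sns.heatmap for brevity. A heatmap cell in row r and column c is centered at (c + 0.5, r + 0.5) in axes coordinates, so a scatter call can place a symbol there.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data: 6 patients (rows) by 10 observation days (columns).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(1, 10, size=(6, 10)),
                  index=[f"patient_{i}" for i in range(6)],
                  columns=range(1, 11))

# Hypothetical events as (row position, column position) pairs.
events = [(1, 7), (4, 3)]

ax = sns.heatmap(df, cmap="YlOrRd", vmin=0, vmax=10,
                 linewidths=1, linecolor="black")

# Cell (r, c) is centered at x = c + 0.5, y = r + 0.5.
rows, cols = zip(*events)
ax.scatter(np.array(cols) + 0.5, np.array(rows) + 0.5,
           marker="X", s=40, color="black", zorder=3)
```

With sns.clustermap and clustering switched off, the row and column order is unchanged, so the same scatter call should work on heatmap.ax_heatmap. The marker size s would need tuning for 200 rows by 60 columns.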

Seaborn: how do I create a box plot of only particular attributes in a DataFrame?

I would like to create two boxplots to visualize different attributes within my data, splitting the attributes up based on their scale. I currently have this:
# box plots to show the distributions of attributes
sns.boxplot(data=df)
[Image: box plot with all attributes included]
I would like it to look like the images below, with the attributes in different box plots based on their scale, but with the attribute names as labels below each boxplot (not the current integers).
# box plots to show the distributions of attributes
sns.boxplot(data=[df['mi'],df['steps'],df['Standing time'],df['lying time']])
[Image: box plot by scale 1]
You can subset a pandas DataFrame by indexing with a list of column names. Because the result is still a DataFrame, seaborn labels each box with its column name:
sns.boxplot(data=df[['mi', 'steps', 'Standing time', 'lying time']])

hist() - how to force equal bin widths?

Assuming I have the following array: [1,1,1,2,2,40,60,70,75,80,85,87,95] and I want to create a histogram out of it based on the following bins - x<=2, [3<=x<=80], [x>=81].
If I do the following: arr.hist(bins=(0,2,80,100)), the bins are drawn with different widths (based on their x range). I want them to represent ranges of different sizes but appear in the histogram with the same width. Is there an elegant way to do this?
I can think of adding a new column for this (holding a bin id calculated from the boundaries I want), but I don't really like that solution.
Thanks!
Sounds like you want a bar graph. You could compute the counts with np.histogram and draw them with plt.bar:
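import numpy as np
import matplotlib.pyplot as plt
arr = np.array([1,1,1,2,2,40,60,70,75,80,85,87,95])
# np.histogram bins are half-open [left, right) except the last one, so for
# integer data the edges 3 and 81 give x<=2, 3<=x<=80 and x>=81
counts, edges = np.histogram(arr, bins=(0, 3, 81, 100))
# one bar per bin at positions 0, 1, 2; width=1 makes the bars touch
plt.bar(range(3), counts, width=1)
xlab = ['x<=2', '3<=x<=80', 'x>=81']
# bars are centered on their positions, so the ticks go at 0, 1, 2
plt.xticks(range(3), xlab)
plt.show()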
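One detail worth checking is where values that sit exactly on a bin edge end up. np.histogram treats every bin except the last as half-open, [left, right), so with the edges (0, 2, 80, 100) the values 2 and 80 fall into the bin to their right. A small sketch with the array from the question; the alternative edges (0, 3, 81, 100) are an assumption that only works because the data are integers.

```python
import numpy as np

arr = np.array([1, 1, 1, 2, 2, 40, 60, 70, 75, 80, 85, 87, 95])

# Edges (0, 2, 80, 100) give the bins [0, 2), [2, 80), [80, 100].
counts_a, _ = np.histogram(arr, bins=(0, 2, 80, 100))
print(counts_a)  # [3 6 4]

# For integer data, edges (0, 3, 81, 100) give x<=2, 3<=x<=80, x>=81.
counts_b, _ = np.histogram(arr, bins=(0, 3, 81, 100))
print(counts_b)  # [5 5 3]
```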