How can I plot a Chord diagram from my pandas df? - pandas

I'm trying to create a chord diagram that represents the relation between two different users, as shown in the image below.
I read the Plotly tutorial https://plotly.com/python/v3/filled-chord-diagram/ but unfortunately I could not get it to work.
I would like to know if someone has experience with this chart or has a good reference that I can use to understand how to implement it.
Below is the sample data:
import pandas as pd
data_dict=[{'users_relation': '5ddd6939<->f3c525e3',
'user_0_idx': '5ddd6939',
'user_1_idx': 'f3c525e3',
'score': 0.8205884909272926,
'time_order': 0},
{'users_relation': '5ddd6939<->825b50dc',
'user_0_idx': '5ddd6939',
'user_1_idx': '825b50dc',
'score': 0.7253207649551456,
'time_order': 0},
{'users_relation': 'f3c525e3<->825b50dc',
'user_0_idx': 'f3c525e3',
'user_1_idx': '825b50dc',
'score': 0.7933140542847614,
'time_order': 0}]
chord_diag_data=pd.DataFrame.from_dict(data_dict)
Any help or suggestion will be really appreciated.
**Please note, you can ignore the "time" part, as my focus is on building the chart.
Regards,
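One possible starting point, not taken from the thread: chord layouts generally work from a square matrix of pairwise weights, so the pair-per-row frame above first has to be reshaped. The sketch below builds a symmetric user-by-user matrix from the sample data. The names users and matrix are illustrative, and treating score as an undirected weight is an assumption.

```python
import pandas as pd

# Sample data from the question (users_relation and time_order omitted).
data_dict = [
    {"user_0_idx": "5ddd6939", "user_1_idx": "f3c525e3", "score": 0.8205884909272926},
    {"user_0_idx": "5ddd6939", "user_1_idx": "825b50dc", "score": 0.7253207649551456},
    {"user_0_idx": "f3c525e3", "user_1_idx": "825b50dc", "score": 0.7933140542847614},
]
chord_diag_data = pd.DataFrame(data_dict)

# Every user that appears on either side of a relation becomes a row and a column.
users = sorted(set(chord_diag_data["user_0_idx"]) | set(chord_diag_data["user_1_idx"]))
matrix = pd.DataFrame(0.0, index=users, columns=users)

# Assumption: the score is undirected, so mirror it across the diagonal.
for row in chord_diag_data.itertuples(index=False):
    matrix.loc[row.user_0_idx, row.user_1_idx] = row.score
    matrix.loc[row.user_1_idx, row.user_0_idx] = row.score

print(matrix)
```

The resulting matrix (or matrix.to_numpy()) is the kind of input that matrix-based chord diagram code expects, with users supplying the labels.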

Related

Text gets truncated while using streamlit.write(df)

I'm using the function streamlit.write(df) to display a df, but the text is not fully displayed, here is a short example of the situation.
import pandas as pd
import streamlit as st
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['This is some text large text that will not be '
'completely displayed, need to add break lines or something.', 'short text',
'another piece of text.']})
st.write(df)
This is the output. Ideally I would add line breaks, but that did not work for me.
You can use st.table instead:
https://docs.streamlit.io/library/api-reference/data/st.table
st.table(df)

matplotlib: horizontal labels as style

Is there a parameter to force horizontal labels in an mplstyle file? and/or using rcParams?
I'm currently using ax.xaxis.set_tick_params(rotation=0) at plot construction. I'd like a permanent style or setting. Thanks!
Default look (with x_compat=True in a pandas dataframes):
Desired look:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Date': {0: '1950-01-01', 1: '1960-01-02', 2: '1970-01-03', 3: '1980-01-04', 4: '1990-01-05'}, 'Value': {0 : 0, 1: 1, 2: 0, 3: 1, 4: 0}})
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')
df = df.set_index('Date', drop=False)
f, ax = plt.subplots()
df.plot(ax=ax, x='Date', x_compat=True)
#ax.xaxis.set_tick_params(rotation=0)
plt.show()
I looked in there, but may have missed it:
customizing-with-matplotlibrc-files
matplotlib_configuration_api.html
Use parameter rot from df.plot
df.plot(ax=ax, x='Date', x_compat=True, rot=0)
I'll answer my own question to put the matter to rest.
No, there isn't.
[as of January 2022] There is no way to control tick label rotation via a style. This is because the pandas plot wrapper resets the rotation parameter. To quote from pandas/doc/source/user_guide/visualization.rst,
pandas includes automatic tick resolution adjustment for regular
frequency time-series data. For limited cases where pandas cannot
infer the frequency information (e.g., in an externally created
twinx), you can choose to suppress this behavior for alignment
purposes.
[...]
Using the x_compat parameter, you can suppress this behavior
Despite the wording here (namely "alignment purposes"), setting x_compat=True does not reset the rotation parameter back to its matplotlib default of 0, as I'd incorrectly expected.
There seem to be mainly two ways around this:
Use matplotlib directly without pandas.
Reset the rotation inside the pandas plot call. This may be done the pandas way [see Vishnudev's answer] with df.plot(..., rot=0, ...) or the matplotlib way [see my OP] by calling ax.xaxis.set_tick_params(rotation=0) on the axis object.
Source and Thanks to: Jody Klymak in comments and Marco Gorelli at Github.
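The two workarounds above can be compared side by side. This is a minimal sketch using the data from the question; the Agg backend and the two-panel layout are assumptions made only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["1950-01-01", "1960-01-02", "1970-01-03",
                            "1980-01-04", "1990-01-05"]),
    "Value": [0, 1, 0, 1, 0],
})

fig, (ax_rot, ax_params) = plt.subplots(1, 2, figsize=(10, 4))

# pandas way: pass rot=0 so the plot wrapper does not apply its own rotation.
df.plot(ax=ax_rot, x="Date", y="Value", x_compat=True, rot=0)

# matplotlib way: let pandas rotate the labels, then reset them on the axis.
df.plot(ax=ax_params, x="Date", y="Value", x_compat=True)
ax_params.xaxis.set_tick_params(rotation=0)

# Collect the rotation of the major x tick labels on each axis.
rot_angles = [label.get_rotation() for label in ax_rot.get_xticklabels()]
params_angles = [label.get_rotation() for label in ax_params.get_xticklabels()]
print(rot_angles, params_angles)
```

Either call has to be repeated for each plot, which is the limitation the answer describes.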

Can't visualize plotted Confusion Matrix

I am new to ML and learning the fundamentals.
I am working on the Dog-vision dataset (https://www.kaggle.com/c/dog-breed-identification) and I am trying to plot a confusion matrix, but I can't figure out what I am doing wrong. I need help!
My true_label looks like this
true_label[:10]
array([26, 96, 8, 15, 3, 10, 62, 82, 92, 16])
And predicted_label looks like this
predicted_l[:10]
array([26, 96, 8, 15, 3, 10, 62, 82, 92, 16])
They are almost the same, but not every element in the two arrays matches.
Then I converted them into a pandas DataFrame with code like this:
import pandas as pd
from sklearn.metrics import confusion_matrix
classes=[]
for i in range(0, 99):
    classes.append(i)
cf_matrix = confusion_matrix(true_l, predicted_l)
cf_matrix_df = pd.DataFrame(cf_matrix, index=classes,columns=classes)
cf_matrix_df
And then the output looks like this:
Then I tried to plot the confusion matrix with this dataframe,
but it's not being plotted correctly. Here is the code and the output of my confusion matrix:
import matplotlib.pyplot as plt
import seaborn as sns
figure = plt.figure(figsize=(8, 8))
sns.heatmap(cf_matrix_df, annot=True,cmap=plt.cm.Blues)
plt.tight_layout()
plt.ylabel('True label')
plt.xlabel('Predicted label')
plt.show()
Output
If you need more info, please have a look at my notebook here.
https://colab.research.google.com/drive/1SoXJJNTnGx39uZHizAut-HuMtKhQQolk?usp=sharing
You can make your plot more readable by removing the annot=True argument, since it writes the data value into every cell and the numbers overlap when there are many classes:
sns.heatmap(cf_matrix_df, cmap=plt.cm.Blues)
UPDATE: Increasing the figure size via figsize will also make the visualization clearer.
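Both suggestions can be combined in a short sketch. The labels below are synthetic stand-ins for the poster's arrays, n_classes = 30 is an arbitrary illustrative value, and passing labels=classes to confusion_matrix is an addition not in the original answer; it keeps the matrix shape equal to the class list even if some class never occurs. It assumes seaborn and scikit-learn are installed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Synthetic labels (hypothetical): mostly correct predictions with some noise.
rng = np.random.default_rng(0)
n_classes = 30
true_l = rng.integers(0, n_classes, size=500)
predicted_l = true_l.copy()
noisy = rng.random(500) < 0.2
predicted_l[noisy] = rng.integers(0, n_classes, size=int(noisy.sum()))

classes = list(range(n_classes))
# labels=classes fixes the row/column order and the matrix size.
cf_matrix = confusion_matrix(true_l, predicted_l, labels=classes)
cf_matrix_df = pd.DataFrame(cf_matrix, index=classes, columns=classes)

# Larger figure and no per-cell annotations.
fig, ax = plt.subplots(figsize=(12, 10))
sns.heatmap(cf_matrix_df, cmap="Blues", ax=ax)
ax.set_ylabel("True label")
ax.set_xlabel("Predicted label")
fig.tight_layout()
```

Calling fig.savefig or plt.show (with an interactive backend) then renders the heatmap.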

How to prevent 1e9 from being shown to exponential form in Python matplotlib figure

I've seen this question: How to prevent numbers being changed to exponential form in Python matplotlib figure
However, I have some custom annotations to put in, and I'd like matplotlib to simply not show the 1e9 marker. Example code below:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
sns.set() # not necessary, but just to reproduce the photo below
f, a = plt.subplots() # I use the oop interface
pd.DataFrame({'y': [1e9, 2e9, 3e9], 'x': [1, 2, 3]}).set_index('x').plot(ax=a)
Yields:
How do I just not show the 1e9? I have a custom annotation there which says 'billions' and it overlaps.
My thanks to ImportanceOfBeingErnest above; the following line solves the problem:
a.yaxis.offsetText.set_visible(False)
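For completeness, a minimal self-contained sketch of that fix applied to the example from the question. The Agg backend and the axis label text are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

f, a = plt.subplots()
pd.DataFrame({"y": [1e9, 2e9, 3e9], "x": [1, 2, 3]}).set_index("x").plot(ax=a)

# Hide the "1e9" offset text; the tick labels themselves are left unchanged.
a.yaxis.offsetText.set_visible(False)

# Hypothetical replacement for the custom 'billions' annotation.
a.set_ylabel("y (billions)")

f.canvas.draw()
offset_visible = a.yaxis.offsetText.get_visible()
print(offset_visible)
```

Because the ticks still show the scaled values, some label or annotation stating the unit is needed once the offset text is hidden.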

Adding Arbitrary points on pandas time series using Dataframe.plot function

I have been trying to plot some time series graphs using the pandas DataFrame plot function. I want to add markers at some arbitrary points on the plot to show anomalous points. The code I used:
df1 = pd.DataFrame({'Entropy Values' : MeanValues}, index=DateRange)
df1.plot(linestyle = '-')
I have a list of dates on which I need to add markers, such as:
Dates = ['15:45:00', '15:50:00', '15:55:00', '16:00:00']
I had a look at this link matplotlib: Set markers for individual points on a line. Does DF.plot have a similar functionality?
I really appreciate the help. Thanks!
DataFrame.plot passes all keyword arguments it does not recognize to the matplotlib plotting method. To put markers at a few points in the plot you can use the markevery argument. Here is an example:
import pandas as pd
df = pd.DataFrame({'A': range(10), 'B': range(10)}).set_index('A')
df.plot(linestyle='-', markevery=[1, 5, 7, 8], marker='o', markerfacecolor='r')
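In your case, note that markevery takes integer positions rather than date strings, so you would first look up where the entries of Dates fall in df1.index (positions below is a placeholder name for that list) and then do something like
df1.plot(linestyle='-', markevery=positions, marker='o', markerfacecolor='r')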
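A list passed to markevery is read as integer positions into the plotted data, so a list of time strings has to be translated into positions first. Below is a minimal sketch of one way to do that. The 5-minute index and the values are made-up stand-ins for DateRange and MeanValues, which are not shown in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical stand-ins for DateRange and MeanValues.
index = pd.date_range("2022-01-01 15:30:00", periods=10, freq="5min")
df1 = pd.DataFrame({"Entropy Values": list(range(10))}, index=index)

dates = ["15:45:00", "15:50:00", "15:55:00", "16:00:00"]

# Compare on the time of day and keep the integer positions that match.
time_strings = df1.index.strftime("%H:%M:%S")
positions = [i for i, t in enumerate(time_strings) if t in dates]

ax = df1.plot(linestyle="-", marker="o", markerfacecolor="r", markevery=positions)
print(positions)
```

If the index spans several days, matching on the time of day alone marks every occurrence of those times, so matching on full timestamps would be the safer choice in that case.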