What is the difference between `matplotlib.rc` and `matplotlib.rcParams`? And which one to use? - matplotlib

I have been using matplotlib.rc in my scripts to preprocess my plots. But recently I realized that using matplotlib.rcParams is much easier before doing a quick plot interactively (e.g. via IPython). This got me thinking about what the difference between the two is.
I searched the matplotlib documentation, but it gave no clear answer in this regard. Moreover, when I issue type(matplotlib.rc), the interpreter says that it is a function. On the other hand, when I issue type(matplotlib.rcParams), I am told that it is a class object. These two answers are not at all helpful, so I would appreciate some help differentiating the two.
Additionally, I would like to know which one to prefer over the other.
Thanks in advance.
P.S. I went through this question: What's the difference between matplotlib.rc and matplotlib.pyplot.rc? but the answers there are specific to the difference between the matplotlib and pyplot versions of the two objects I am enquiring about and, hence, are also not that helpful.

matplotlib.rc is a function that updates matplotlib.rcParams.
matplotlib.rcParams is a dict subclass that provides a validated key-value map for Matplotlib configuration.
The docs for mpl.rc are at https://matplotlib.org/stable/api/matplotlib_configuration_api.html?highlight=rc#matplotlib.rc and the code is here.
The class definition of RcParams is here and the instance is created here.
If we look at the guts of matplotlib.rc we see:
for g in group:
    for k, v in kwargs.items():
        name = aliases.get(k) or k
        key = '%s.%s' % (g, name)
        try:
            rcParams[key] = v
        except KeyError as err:
            raise KeyError(('Unrecognized key "%s" for group "%s" and '
                            'name "%s"') % (key, g, name)) from err
where we see that matplotlib.rc does indeed update matplotlib.rcParams (after doing some string formatting).
You should use whichever one is more convenient for you. If you know exactly which key you want to update, then interacting with the dict-like object is better; if you want to set a whole bunch of values in a group, then mpl.rc is likely better!
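To make the equivalence concrete, here is a minimal sketch that sets the same two values both ways and compares the results. The chosen keys and values are only examples, and rc_context is used purely so that the changes are undone afterwards.

```python
import matplotlib as mpl

# Set two values in the "lines" group through the function.
with mpl.rc_context():  # settings are restored when the block exits
    mpl.rc("lines", linewidth=3, linestyle="--")
    via_rc = (mpl.rcParams["lines.linewidth"], mpl.rcParams["lines.linestyle"])

# Set the same two values directly on the dict-like object.
with mpl.rc_context():
    mpl.rcParams["lines.linewidth"] = 3
    mpl.rcParams["lines.linestyle"] = "--"
    via_dict = (mpl.rcParams["lines.linewidth"], mpl.rcParams["lines.linestyle"])

# Unknown keys are rejected because rcParams validates what it stores.
try:
    mpl.rcParams["lines.not_a_real_key"] = 1
    rejected = False
except KeyError:
    rejected = True
```

Both routes end up with the same stored values, so the choice really is about convenience: one call per group versus one assignment per key.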

Related

seaborn from distplot to displot new input parameters

Since Seaborn warns that 'distplot' is deprecated and recommends 'displot' instead, I'm trying to update my old code. Unfortunately, I'm having a hard time finding the corresponding parameters for several inputs. Just an example: below I start with the old, working 'distplot' code:
import numpy as np
import seaborn as sns
c = np.random.normal(5, 2, 100)
sns.distplot(c, hist=True, kde=True, color='g',
             kde_kws={'color': 'b', 'lw': 2, 'label': 'Kde'},
             hist_kws={'color': 'purple', 'alpha': 0.8, 'histtype': 'bar', 'edgecolor': 'k'})
Now, I want to show the same result with 'displot' but I don't know how to set 'alpha' for the histogram, or any of the other 'hist_kws' options. Below is how I started:
sns.displot(data=c,kind='hist',kde=True,facecolor='purple',edgecolor='k',color='b',
alpha=1,line_kws={'lw':2})
I've been looking for better documentation, but I haven't had any luck so far.

difference between pandas methods, data frame methods and how to distinguish between them

I have been confused about these for a while, and I would like to know if there is a practical and fast way to distinguish between them.
Assuming df is a pandas data frame object, please see below:
While using pandas, this is what I noticed. To access some methods, you have to use pd.method(df, *args). To access others, you need to use df.method(*args). Interestingly, there are some methods that work either way ...
Let's clarify this a bit more with some examples: while it totally makes sense to me to use pd.read_csv(), not df.read_csv, since there is no df created yet, I have a hard time making sense of the following examples:
1- correct: pd.get_dummies(df,*args) --- incorrect: df.get_dummies(*args)
2- correct: df.groupby(*args) --- incorrect: pd.groupby(df,*args)
3- correct: df.isnull() AND pd.isnull(df)
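The three cases can be checked directly. The snippet below is only a small sketch on a made-up data frame (the column names and values are hypothetical), and the remark about pd.groupby assumes a recent pandas version.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({"color": ["red", "blue", None], "n": [1, 2, 3]})

# 1- get_dummies is a top-level function, not a DataFrame method.
dummies = pd.get_dummies(df, columns=["color"])
has_df_get_dummies = hasattr(df, "get_dummies")

# 2- groupby is a DataFrame method (pd.groupby is not available in
#    recent pandas versions; that part is an assumption about your install).
totals = df.groupby("color")["n"].sum()

# 3- isnull exists in both forms and gives the same result.
same_result = df.isnull().equals(pd.isnull(df))
```

When in doubt, hasattr(df, "name") or tab completion on df. and pd. in IPython answers the question faster than guessing.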
I am pretty sure you can also come up with many other examples like the above. I personally find it challenging to keep in mind which one is which, and I find myself wasting a lot of time across the code development/analysis cycle trying to guess whether I should use pd.method(df) or df.method() for different things.
My main question is: how do you guys handle this? Did you also find this issue challenging? Is there any way to quickly understand which one to use? Am I missing something here?
Thanks

Unable to use pickAFile in TigerJython

In JES, I am able to use:
file=pickAFile()
In TigerJython, however, I get the following error
NameError: name 'pickAFile' is not defined
What am I doing wrong here?
You are not doing anything wrong at all. The thing is that pickAFile() is not a standard function in Python. It is a function that JES has added for convenience, but which you probably will not find in any other environment.
Since TigerJython and JES are both based on Jython, you can easily write a pickAFile() function on your own that uses Java's Swing. Here is a possible simple implementation (the pickAFile() found in JES might be a bit more complex, but this should get you started):
def pickAFile():
    from javax.swing import JFileChooser
    fc = JFileChooser()
    retVal = fc.showOpenDialog(None)
    if retVal == JFileChooser.APPROVE_OPTION:
        return fc.getSelectedFile()
    else:
        return None
Given that it is certainly a useful function, we might have to consider including it into our next update of TigerJython.
P.S. I would like to apologise for answering so late, I have just joined SO recently and was not aware of your question (I am one of the original authors of TigerJython).

Is there a way to assign an internal string or identifier or tag to a matplotlib artist?

Sometimes it is useful to assign a 'tag', which can be a simple string, to a matplotlib artist in order to later find it easily.
If we imagine a scenario where say plt.Line2D had a property called tag which can be retrieved using plt.Line2D.get_tag() it would be very easy to find it later in a complicated plot.
The only thing I can find that looks remotely similar is the group ID: for example line.set_gid() and line.get_gid(). I haven't found any good documentation on this. The only reference is this. Is this meant for such use as described above? Is it reserved for other operations in matplotlib?
This would be very useful for grouping different artists and then performing operations on them later, for example:
for line in ax.get_lines():
    if line.get_tag() == 'group A':
        line.set_color('red')
        # or whatever other operation
Does such a thing exist?
You can use the gid for such purposes. The only side-effect is that those names will appear in a saved svg file as the gid tag.
Alternatively, you can assign any attribute to a Python object.
line, = plt.plot(...)
line.myid = "group A"
Just make sure not to use the name of any existing attribute in that case.
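For the gid route, a minimal sketch could look like the following. The data, the group names and the Agg backend are assumptions made only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line_a, = ax.plot([0, 1], [0, 1])
line_b, = ax.plot([0, 1], [1, 0])

# Tag each artist with a group id.
line_a.set_gid("group A")
line_b.set_gid("group B")

# Later: find the tagged artists again and operate on them.
for line in ax.get_lines():
    if line.get_gid() == "group A":
        line.set_color("red")

tagged = [line for line in ax.get_lines() if line.get_gid() == "group A"]
```

Artists that never had a gid set return None from get_gid(), so the comparison simply skips them.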

@NLconstraint with vectorized constraint JuMP/Julia

I am trying to solve a problem involving the equating of sums of exponentials.
This is how I would do it hardcoded:
@NLconstraint(m, exp(x[25])==exp(x[14])+exp(x[18]))
This works fine with the rest of the code. However, when I try to do it for an arbitrary set of equations like the above I get an error. Here's my code:
@NLconstraint(m,[k=1:length(LHSSum)],sum(exp.(LHSSum[k][i]) for i=1:length(LHSSum[k]))==sum(exp.(RHSSum[k][i]) for i=1:length(RHSSum[k])))
where LHSSum and RHSSum are arrays containing arrays of the elements that need to be exponentiated and then summed over. That is LHSSum[1]=[x[1],x[2],x[3],...,x[n]]. Where x[i] are variables of type JuMP.Variable. Note that length(LHSSum)=length(RHSSum).
The error returned is:
LoadError: exp is not defined for type Variable. Are you trying to build a nonlinear problem? Make sure you use @NLconstraint/@NLobjective.
So a simple solution would be to do all the exponentiating and summing outside of the @NLconstraint macro, so the input would be a scalar. However, this too presents a problem: exp(x) is not defined because x is of type JuMP.Variable, whereas exp expects something of type Real. This is strange, since I am able to calculate exponentials just fine when the function is called within an @NLconstraint(). I.e. when I code the line @NLconstraint(m,exp(x)==exp(z)+exp(y)) instead of the earlier line, no errors are thrown.
Another thing I thought to do would be a Taylor series expansion, but this too presents a problem since it goes into @NLconstraint land for powers greater than 2, and then I get stuck with the same vectorization problem.
So I feel stuck. I feel like if JuMP allowed the vectorized evaluation of @NLconstraint like it does for @constraint, this would not even be an issue. Another fix would be for JuMP to implement its own exp function to allow the exponentiation of the JuMP.Variable type. However, as it is, I don't see a way to solve this problem in general using the JuMP framework. Do any of you have any solutions to this problem? Any clever workarounds that I am missing?
I'm confused why i isn't used in the expressions you wrote. Do you mean:
@NLconstraint(m, [k = 1:length(LHSSum)],
    sum(exp(LHSSum[k][i]) for i in 1:length(LHSSum[k]))
    ==
    sum(exp(RHSSum[k][i]) for i in 1:length(RHSSum[k])))