Origins of the distribute_setup.py file in Matplotlib

What are the origins of the distribute_setup.py file in Matplotlib? Is the file originally from some other source or is it from the Matplotlib project?
I am interested in using setuptools in my package's setup.py, and I want to know the best approach to this.

distribute was a fork of setuptools that added Python 3.x support and a number of bug fixes at a time when setuptools itself was unmaintained.
ez_setup.py is a Python module that fetches and installs setuptools automatically on an as-needed basis; distribute_setup.py provided the same functionality but installed the distribute package instead.
As of June 2013, most of the changes made in the distribute fork have been merged back, and packages should switch over to using setuptools again.
The latest version of ez_setup.py can be found in the setuptools repository at https://bitbucket.org/pypa/setuptools/src/tip/ez_setup.py?at=default
Usage is documented at http://setuptools.readthedocs.org/en/latest/using.html
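For reference, a minimal sketch of the usual bootstrap flow, assuming you have already downloaded ez_setup.py from the repository linked above into the directory that holds your setup.py:
# running the script standalone downloads and installs setuptools if it is missing
python ez_setup.py
# alternatively, setup.py can import ez_setup and call its use_setuptools() helper
# before importing setuptools, so a plain "python setup.py install" bootstraps itself
python setup.py install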

Need to use Pandas in Airflow V2: pandas vs apache-airflow[pandas]

I need to use Pandas in an Airflow job. Even though I am an experienced programmer, I am relatively new to Python. In my requirements.txt, should I install pandas from PyPI or apache-airflow[pandas]?
Also, I am not entirely sure what apache-airflow[pandas] does, or how pip resolves it (it does not seem to be in PyPI).
Thank you in advance for the answers.
I tried searching in PyPI for apache-airflow[pandas]
I also tried searching in SO for related questions
apache-airflow[pandas] only installs pandas>=0.17.1: https://github.com/apache/airflow/blob/0d2555b318d0eb4ed5f2d410eccf20e26ad004ad/setup.py#L308-L310. For context, this was the PR that originally added it: https://github.com/apache/airflow/pull/17575.
Since >=0.17.1 is quite broad, I suggest pinning Pandas to a more specific version in your requirements.txt. That gives you control over the exact Pandas version, rather than accepting whichever of the many allowed versions Airflow happens to resolve.
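For example, a requirements.txt along these lines pins both explicitly (the versions shown are illustrative, taken from the constraint examples below, not a recommendation):
apache-airflow[pandas]==2.5.1
pandas==1.5.2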
I suggest installing Airflow with constraints, as explained in the docs:
pip install "apache-airflow[pandas]==2.5.1" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.5.1/constraints-3.7.txt"
This guarantees a stable installation of Airflow without conflicts. Airflow also updates the constraints whenever a release is cut, so when you upgrade Airflow you will get the latest possible version that "agrees" with all other Airflow dependencies.
For example:
For Airflow 2.5.1 with Python 3.7, the constrained version is:
pandas==1.3.5
For Airflow 2.5.1 with Python 3.9, the constrained version is:
pandas==1.5.2
Personally, I don't recommend overriding the versions in the constraints. Doing so carries the risk that your production environment will not be stable/consistent (unless you implement your own mechanism to generate constraints). Should a specific task require another version of a library (pandas or any other), I suggest using PythonVirtualenvOperator, DockerOperator or any other alternative that lets you set specific library versions for that task. This also gives the DAG author the freedom to set whatever library version they need without depending on other teams that share the same Airflow instance and need other versions of the same library, or even on the same team working on another project that needs different versions (think of it the same way as you manage virtual environments in your IDE).
As for your question about apache-airflow[pandas]: note that this is an extra (optional dependency), not an Airflow provider as you mentioned. It exists because Airflow used to depend on pandas as part of its core, but pandas is a heavy library and not everyone needs it, so moving it to an optional dependency makes sense. That way only users who need pandas in their Airflow environment will install it.
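In other words, pip resolves the bracketed part itself: apache-airflow[pandas] is not a separate PyPI project, but the regular apache-airflow package plus whatever optional dependencies its pandas extra declares. It is requested with the usual quoting:
pip install "apache-airflow[pandas]"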

Best way to write setup script for multi-language project package that includes anaconda, atom, node.js etc.?

I am designing an environment for productive research, i.e. writing, data-analysis, publication, etc.
In order to share the final results with others, I need to find a way to package this and to set up the local installation.
The project depends on Anaconda, so conda as a package manager is available.
It also includes
Pandoc and some Pandoc packages; some will have to be fetched from GitHub directly because certain versions are not available via conda-forge (doable in conda)
Atom and Atom packages; they should be installed and configured by my script (this works on the CLI via the apm package manager)
Node.js and Mermaid and a few other JS packages, which require npm calls
Some file-system-level operations, like deleting parts of packages where I only need a portion, creating symlinks and aliases, etc.
Maybe some Python code for modifying or reading yaml/json/ini files.
The main project will reside in a Github repository. It will be fine for users to clone it from there and start a build script locally.
My idea is to write a Bash shell script (sketched below) that
creates a conda environment based on requirements.yaml for everything that can be done this way
installs other parts using CLI commands (wget/curl etc.)
does all necessary modifications using CLI commands, maybe using a few short Python scripts (e.g. for changing or reading JSON or yaml files).
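A minimal sketch of such a build script, assuming the repository ships a requirements.yaml for conda; the environment name, editor packages and paths are placeholders:
#!/usr/bin/env bash
set -euo pipefail
# 1. create the conda environment from the spec shipped in the repository
#    (fall back to updating it if it already exists)
conda env create -f requirements.yaml -n research-env || conda env update -f requirements.yaml -n research-env
# 2. install Atom packages through apm (package names are illustrative)
apm install markdown-preview-enhanced
# 3. install JS tooling through npm (again, illustrative)
npm install -g @mermaid-js/mermaid-cli
# 4. file-system tweaks: symlinks, removing unneeded parts, etc.
ln -sf "$PWD/templates" "$HOME/.pandoc/templates"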
My local usage will be on OSX Big Sur, Linux should be supported, Windows compatibility would be nice-to-have.
Before I start:
Is this approach viable? I think it will be pretty transparent, but of course also a bit proprietary.
Docker is likely overkill for my purpose, and I also read that the execution will be slow on OSX.
The same environment will likely be installed multiple times on the same user's machine, so it is important that I can control e.g. the reuse of existing packages and files via aliases or symlinks. It is not important that the multiple installations are decoupled for the non-Python/non-conda parts (e.g. Atom, Node.js, Mermaid could be the same binaries for all installations; just the set of Python packages might vary by installation).
Thanks for your expertise!

How to install a different package version than the default in Buildroot?

I need to use python3.7m instead of the default python3.8 in my Buildroot project. Is there a solution for that, or is the only way to get an older Buildroot version?
You do not have to downgrade Buildroot. You can always add a custom python3 package based on an older version, e.g.: https://git.busybox.net/buildroot/tree/package/python3?id=b3424c8fc9d1199f5836483f15af48b56373609e
Here is the documentation on how to add a package:
https://buildroot.org/downloads/manual/manual.html#adding-packages
One possibility is to add a custom package with a different Python version, as @Ezra Bühler explained. However, that custom package must have a different name than python3, which means you also need to make custom versions of all packages that use python3.
Therefore, a simpler possibility is to modify the Buildroot code itself. There's normally no need to go back to an older Buildroot version; instead, you can simply copy the python3 package from an older version that still has Python 3.7 and overwrite the current one. Of course, this is a completely untested configuration, so you might encounter some breakage, but it's usually doable. If you take that route, you'll also want to update Python to the latest 3.7.x release (3.7.11 at the time of writing).
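A minimal sketch of that route, assuming you work inside a git checkout of the Buildroot tree (the commit hash is the one from the link above; clean up any leftover files from the newer package by hand):
# restore the python3 package as it was when Python 3.7 was still packaged
git checkout b3424c8fc9d1199f5836483f15af48b56373609e -- package/python3
# then bump the version in package/python3/python3.mk to the latest 3.7.x
# (the relevant variables should be PYTHON3_VERSION_MAJOR and PYTHON3_VERSION)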

How can I simplify my stack of package managers?

I don't know how it got this bad. I'm a web developer, and I use Ubuntu, and here are just some of the package managers I'm using.
apt-get for system-wide packages
npm for node packages
pip for python packages
pip3 for python 3 packages
cabal for haskell packages
composer for php packages
bower for front-end packages
gem for ruby packages
git for other things
When I start a new project on a new VM, I have to install seemingly a dozen package managers from a dozen different places, and use them all to create a development environment. This is just getting out of control.
I've discovered that I can basically avoid installing and using pip/pip3 just by installing Python packages from apt, like sudo apt-get install python3-some-library. That saves me from having to use one more package manager, which is awesome. But then I'm stuck with the Ubuntu versions of those packages, which are often really old.
What I'm wondering is, is there a meta-package manager that can help me to replace a few of these parts, so my dev environment is not so tricky to replicate?
I had a thought to make a package manager to rule them all for that very reason. I never finished it though; too much effort was required to stay compatible. Each package manager has a huge community supporting its upkeep.
Best advice I have is to reduce your toolchain for each type of project. Ideally you shouldn't need to work in every language you know on every project. How many of your projects actually use both Python 2 and Python 3 simultaneously?
Keep using apt for your system packages and install git with it. From there try to stick to one language per project. AFAIK all of the package managers you listed support installing packages from git. The languages you mentioned all have comparable sets of tooling, so use the toolchain available for the target language.
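For instance, at least pip and npm can install straight from a git URL (the repository below is a placeholder):
pip install "git+https://github.com/example/project.git"
npm install "git+https://github.com/example/project.git"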
I worked with a team that was using composer, npm, bower, bundler, maven, and a tar.gz file for frontend SPAs, because those were the tools they knew. On top of all that, they were using Vagrant simply as a deployer. We considered our toolchain, described our needs, and realized it could all be expressed in a single language once we adopted appropriate tooling for the task at hand.

How to develop and package with ActivePython?

I have been developing a (somewhat complicated) app in core Python (2.6) and have also managed to use pyinstaller to create an executable for deployment in measurements, or for distribution to my colleagues. I work on Ubuntu.
What has troubled me is upgrading numpy or scipy. Some features I need are in 0.9 and I'm still on 0.7. The process of upgrading them, or matplotlib for that matter, is not elegant. The way I've upgraded on my local machine was to delete the folders of these libraries and then manually install the newer versions.
However, this does not work for machines where I do not have root access. While trying to find a workaround, I found ActivePython. I gave it a quick try and it seems to use PyPM to download the newest scipy and numpy to its custom install location. Excellent! I don't need root access and can use the latest version of the libraries.
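(For reference, the PyPM invocation looks roughly like this; the package names are just examples:)
pypm install numpy
pypm install scipy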
QUESTION:
If there are libraries not available in the PyPM index, how can I build them directly from source (for example, wxPython) and include them in this ActivePython installation?
How can I use pyinstaller to build an executable using only the libraries in the ActivePython installation?