What are prevalent techniques for enabling user code extensions in Python?

I'm looking for techniques that allow users to override modules in an application or extend an application with new modules.
Imagine an application called pydraw. It currently provides a Circle class, which inherits Shape. The package tree might look like:
/usr/lib/python/
└── pydraw
    ├── __init__.py
    ├── shape.py
    └── shapes
        ├── circle.py
        └── __init__.py
Now suppose I'd like to enable dynamic discovery and loading of user modules that implement a new shape, or perhaps even the Shape class itself. It seems most straightforward for a user's tree to have the same structure as the application tree, such as:
/home/someuser/python/
└── pydraw
    ├── __init__.py
    ├── shape.py      <-- new superclass
    └── shapes
        ├── __init__.py
        └── square.py <-- new user class
In other words, I'd like to overlay and mask an application tree with same-named files from the user's tree, or at least get that apparent structure from a Python point of view.
Then, by configuring sys.path or PYTHONPATH, pydraw.shapes.square might be discoverable. However, Python's module path search doesn't find modules such as square.py. I presume this is because the package's __path__ already points at the parent package in its original location.
How would you accomplish this task with Python?

Discovery of extensions can be a bit brittle and complex, and also requires you to look through all of PYTHONPATH which can be very big.
Instead, have a configuration file that lists the plugins that should be loaded. This can be done by listing them as module names, and also requiring that they are located on the PYTHONPATH, or by simply listing the full paths.
If you want per-user configuration, I'd have both a global configuration file that lists modules and a per-user one, and just read those config files instead of trying some discovery mechanism.
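As a minimal sketch of that approach (the plugins.conf name, its one-module-per-line format, and the load_plugins helper are all invented for illustration), the application could read module names from the config file and import each one with importlib:
import importlib

# Hypothetical plugins.conf: one module name per line, e.g. "pydraw.shapes.square".
# Each listed module is assumed to be importable, i.e. located on PYTHONPATH.
def load_plugins(config_path='plugins.conf'):
    plugins = []
    with open(config_path) as f:
        for line in f:
            name = line.strip()
            if not name or name.startswith('#'):
                continue
            plugins.append(importlib.import_module(name))
    return plugins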
You are also trying not only to add plugins but to override components of your application. For that I would use the Zope Component Architecture. It is not yet fully ported to Python 3, but it's designed to be used in exactly these kinds of cases. From your description this seems to be a simple case, so the ZCA might be overkill, but look at it anyway.
http://www.muthukadan.net/docs/zca.html
http://pypi.python.org/pypi/zope.component
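As a rough sketch of the ZCA style (hedged: IShape, Square and the 'square' name are invented here, not pydraw code), components are registered against an interface and looked up by name, so a later registration under the same name overrides an earlier one:
from zope.interface import Interface, implementer
from zope.component import getGlobalSiteManager, getUtility

class IShape(Interface):
    """Marker interface for drawable shapes."""

@implementer(IShape)
class Square:
    pass

# Register the user's class as a named utility; registering another
# component under the same interface and name later would override this one.
getGlobalSiteManager().registerUtility(Square(), IShape, name='square')

# Elsewhere in the application:
square = getUtility(IShape, name='square')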

If you want to load Python code dynamically from different locations, you can extend a package's __path__ search attribute using the pkgutil module. Place these lines in both pydraw/__init__.py and pydraw/shapes/__init__.py:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
You will then be able to write import statements as if you had a single package:
>>> import pydraw.shapes
>>> pydraw.shapes.__path__
['/usr/lib/python/pydraw/shapes', '/home/someuser/python/pydraw/shapes']
>>> from pydraw.shapes import circle, square
>>>
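Note that this relies on both trees being on sys.path (or PYTHONPATH) before pydraw is first imported; for example, with the user directory from the question:
import sys
sys.path.insert(0, '/home/someuser/python')  # user tree first, so same-named user modules take priority
import pydraw.shapes                         # its __path__ now spans both trees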
You may think about auto-registration of your plugins. You can still use plain Python code for that by setting a module-level variable (which acts as a kind of singleton registry).
Add the last line shown below to every pydraw/shapes/__init__.py file (the first two lines are the extend_path lines from above):
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
# your shape registry
__shapes__ = []
You can now register a shape at the top of its module (circle.py or square.py here):
from pydraw.shapes import __shapes__
__shapes__.append(__name__)
Last check:
>>> from pydraw.shapes import circle, square, __shapes__
>>> __shapes__
['pydraw.shapes.circle', 'pydraw.shapes.square']
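To fill the registry without importing every shape by hand, one possible sketch (assuming the __init__.py files above are in place) is to walk the extended __path__ with pkgutil.iter_modules and import whatever is found, so each module's registration line runs:
import importlib
import pkgutil

import pydraw.shapes

# Import every module found on the package's (extended) __path__ so that
# its "__shapes__.append(__name__)" line executes.
for _finder, name, _ispkg in pkgutil.iter_modules(pydraw.shapes.__path__,
                                                  pydraw.shapes.__name__ + '.'):
    importlib.import_module(name)

print(pydraw.shapes.__shapes__)
# ['pydraw.shapes.circle', 'pydraw.shapes.square']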

A method I've used to handle such a problem is a provider pattern.
In your shape.py module:
class BaseShape:
    def __init__(self):
        pass

provider("BaseShape", BaseShape)
and in the user's shape.py module:
class UserBaseShape:
    def __init__(self):
        pass

provider("BaseShape", UserBaseShape)
with the provider function doing something like this:
global_providers = {}  # shared registry, e.g. kept in a common module

def provider(provide_key, provider_class):
    global_providers[provide_key] = provider_class
And when you need to instantiate an object, use a ProvideFactory like this:
class ProvideFactory:
    def get(self, provide_key, *args, **kwargs):
        return global_providers[provide_key](*args, **kwargs)
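Client code then never names a concrete shape class; it just asks the factory for whatever is registered under the key, so the user's registration (executed last) wins:
factory = ProvideFactory()
shape = factory.get("BaseShape")  # UserBaseShape if the user's module registered last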

You can detect file changes within a certain directory with OS-specific methods. For all operating systems, there exist file monitoring tools that produce events when a file or directory is changed. Alternatively, you can continuously search for files newer than the time of the last search. There are multiple possible solutions, but in any case:
configuring a plugins directory makes things a lot easier than monitoring your complete file system.
looking for file changes in a separate thread is probably the best solution
if a new .py file is found, you can import it using the __import__ built-in function (or importlib)
if a .py file is changed, you can re-import it with the reload function (importlib.reload in Python 3); see the sketch after this list
If a class is changed, the instances of that class will still behave like instances of the old class, so make sure to recreate instances when necessary
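As a rough sketch of that approach (the plugin directory path and the poll_plugins helper are assumptions, and importlib is used here in place of the raw __import__/reload built-ins), something like this could run in a background thread:
import importlib
import importlib.util
import os
import sys

PLUGIN_DIR = '/home/someuser/python/pydraw/plugins'  # assumed location
_mtimes = {}

def poll_plugins():
    """Import new plugin modules and reload modules whose file has changed."""
    for fname in os.listdir(PLUGIN_DIR):
        if not fname.endswith('.py'):
            continue
        path = os.path.join(PLUGIN_DIR, fname)
        mtime = os.path.getmtime(path)
        name = fname[:-3]
        if name not in _mtimes:
            # new file: load it as a module
            spec = importlib.util.spec_from_file_location(name, path)
            module = importlib.util.module_from_spec(spec)
            sys.modules[name] = module
            spec.loader.exec_module(module)
        elif mtime > _mtimes[name]:
            # changed file: re-import it
            importlib.reload(sys.modules[name])
        _mtimes[name] = mtime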
EDIT:
If you add your plugins directory as the first entry in PYTHONPATH, that directory will have priority over the other directories on the path, e.g.:
import sys
sys.path.insert(0, 'PLUGIN_DIR')

Related

How does import find the file path in Kotlin?

I've read the Kotlin doc (https://kotlinlang.org/docs/packages.html), and I understood that, when importing a package, the package name does not need to match the folder's path that stores the package (unlike what happens in Java).
I don't have issues creating a package and importing it into other classes.
What I'd like to understand is how the compiler can find the file to import.
For example:
if a file imports animals.mammals.cats.*:
import animals.mammals.cats.*
...
the entities to import do not need to be stored in the file /animals/mammals/cats.kt, as long as the package name is "animals.mammals.cats":
package animals.mammals.cats
...
This Kotlin file could be stored in src/animals/kittens for example.
In other words, how can import locate the file(s) to load, since the package name does not help?
Thanks!
TL;DR: the compiler is given the paths to all files to compile and to all dependencies, and therefore knows about all available packages and the declarations they contain.
First, note that the import statement itself is not really the important part. It's just a convenient syntax to avoid having to specify the package everywhere throughout the file. But technically you don't need to import anything to be able to use declarations from outside the current file; you can just use their fully qualified name (a.k.a. FQN), which is the package name + . + the name of the declaration.
Now, on to your question. When you run the compiler, you provide the paths to the complete set of files to be compiled at the same time: you compile a module, not a single file. Therefore, it has access to all declarations in all those files and maintains its own data structures about the available classes and top-level functions, and all symbols in general. So it can store and find declarations using just their FQN. (DISCLAIMER: I'm no expert and I don't actually know how it's done internally, but I'm just guessing that conceptually it's like storing a big mapping between FQN and the information about the corresponding declaration.)
If the declaration you use is not in the set of files being compiled, it must be in one of your dependencies. You can tell the compiler about the available dependencies by specifying the list of jars containing their already compiled classes. This list of all classes available at compile time is called the compile classpath. This is why the tool you use to build your project (for instance, Gradle or your IDE) needs to know about those dependencies, so it can put their declarations on the compile classpath when calling the compiler for you. Then, just like declarations that are being compiled, the ones from the compile classpath can be easily looked up by the compiler (the path has been given to the compiler as an argument).
Now, when you actually run your compiled program, at least on the JVM, the classes required must be placed on the runtime classpath - a set of classes given to the java program. Finding those declarations while the program runs is done by classloaders. There are multiple classloaders, organized in a hierarchy, but there is no need to go into details here. Basically, each time a class is used for the first time while your program is running, one classloader will be asked to load that class into memory. There are different implementations of classloaders, but one of the most common is the URLClassLoader which is given the URLs of some jars that contain classes, and knows how to read classes from these jars into memory on-demand.

How can I scope a CMake function so it can't be accessed from outside a file?

I'm trying to write some CMake code in a relatively complex project, and I have a module that internally includes another module. The problem is, whenever I include my module, all of the functions defined in the module it internally includes become available at the global level! This effectively pollutes my global namespace with a bunch of functions I didn't explicitly ask for.
For example:
# CMakeLists.txt
# Include my module
include(MyModule)
# Call a function from my module
my_module_function()
# HERE IS THE PROBLEM -- functions from "AnotherModule" are visible here!
# This call works
another_module_function()
Inside my module:
# MyModule.cmake
# Include another module
# - This other module is written and supported by someone else so I can't modify it
# - No functions from "AnotherModule" will be used outside of "MyModule"
include(AnotherModule)
# Define my function
function(my_module_function)
    # Call a function from the other module
    another_module_function()
endfunction()
Is there any way inside MyModule.cmake that I can import the functions from AnotherModule.cmake without having them be visible outside of my own module? This other module is written by someone else so I don't have control over it and it includes other functions with very generic names like one called parse_arguments that could potentially cause naming conflicts later on.
Making the functions from AnotherModule.cmake fully invisible outside of MyModule.cmake would be ideal, but even if there were a simple way to just simulate a namespace for the imported functions to be in that would be better than nothing.
In CMake, macros and functions have global visibility, and nothing can change that.
Often a function "internal" to some module is defined with an underscore (_) prefix. Such a prefix plays the role of a signal to outer code "do not use me", but this is only a convention; CMake doesn't enforce anything about underscore-prefixed names.
If including the module has only immediate effects, that is, it defines custom commands/targets but does not export functions/macros/variables for outer code, you may consider wrapping it in an external project (ExternalProject_Add). An external project is a separate CMake project, and none of its CMake constructs, such as variables or functions, are visible outside it.

Backbone with or without RequireJS: What is better for data encapsulation?

I am in the process of transitioning my 'regular' Backbone projects into a combination of Backbone and RequireJS. While this process works pretty flawless, I still have one question.
Previously I declared a global namespace for my app, to which I then bound all my models, views and collections. This is a tip I actually got from the Backbone TodoMVC project.
So for example, the initialize method of a view could look like this:
initialize: function () {
    app.employees = new app.EmployeeCollection();
    app.employees.fetch();
}
This works because at the beginning of every file, I've done this:
var app = app || {};
Now when defining my files as AMD modules, the app namespace doesn't exist anymore, which means everything is much more encapsulated:
initialize: function () {
    var employees = new EmployeeCollection();
    employees.fetch();
}
The EmployeeCollection is loaded with RequireJS:
var EmployeeCollection = require('collections/EmployeeCollection');
Unfortunately I am still very new to Backbone and MVC in general, so I am unsure if this is a good or a bad thing.
What impact will this have on my project – is it okay to use an app namespace like I did previously or does this break any MVC/OOP 'rule'? Are there any Backbone specific consequences I need to be aware of?
Yes, loading the EmployeeCollection via requirejs is a good thing. This explicitly lists each module's dependencies and lets requirejs help you with loading modules in the proper order.
The app namespace approach and the requirejs approach are both valid. Backbone won't care which approach you take, since with either you have the necessary View/Collection/Model constructor available to use. Personally I like the requirejs benefits mentioned above, but it's a personal preference you'll have to decide on.
However, you shouldn't use requirejs and an all-knowing app namespace together. If you're committed to requirejs then you should only use the app namespace sparingly with top-level data that most of your app will need, rather than attaching all of your requirejs modules to it.
For example, you might use it for a global UserModel that contains information about the current user. To do this you'd create an app object as a requirejs module just like you did with your EmployeeCollection, and then whatever module constructs the UserModel would require 'app' and do a simple assignment: app.user = user.
I said do this sparingly because using a global app namespaces for all your modules would sacrifice much of the benefit of requirejs and would cause you some sequencing pain. Namely:
You can no longer see the actual dependencies for each module declaratively and visualize easily how all your modules fit together. Instead of having the initialize function of your view (or whatever that is) require in 'collections/EmployeeCollection' you'd be requiring 'app'; not a lot of context there.
Requirejs will take care of loading required modules first before allowing your defining function to run. But if everything just requires 'app' then requirejs will only ensure 'app' is defined first and you're on your own for everything else. If app.Bar requires app.Foo, you have to do something to make sure app.Foo gets loaded and defined first.
On a similar note, if requirejs can't figure out all your dependencies because everything just requires 'app' then requirejs's javascript concatenator and optimizer tool (called r.js) will be either useless to you or require a lot of maintenance to add all your modules to a list that it should compile.
If you decide to use requirejs, embrace what it can do for you and just require in the modules you want instead of relying heavily on a global namespace. But there's not a right or wrong way choosing between these two approaches; each is used by lots of smart people.

Defining global variable for Browserify

I'm using SpineJS (which exports a CommonJS module) and it needs to be available globally because I use it everywhere, but it seems like I have to do Spine = require('spine') in every file that uses Spine for things to work.
Is there any way to define Spine once to make it globally available?
PS: I'm using Spine as an example, but I'm in general wondering about how to do this with any other library.
Writing Spine = require('spine') in each file is the right way to do it.
Still, there are several possibilities using the global or window object (Browserify sets the global object to window, which is the global namespace):
in spine.js: global.Spine = module.exports
in any other .js file bundled by browserify: global.Spine = require('spine')
in a script tag or an .js file referenced by the .html file, after the spine.js file: window.Spine = require('spine')
First of all, for your example David is correct. Include all dependencies in every module that needs them. It's very verbose, but there is no compile-time magic going on, which avoids all sorts of anti-patterns and potential future problems.
The real answer.
This isn't always practical. Browserify accepts an option called insertGlobalVars. On build, each streamed file is scanned for identifiers matching the key names provided and wraps the module in an IIFE containing arguments that resolve each identifier that is not assigned within the module. This all happens before the dependency tree is finalized, which allows you to use require to resolve a dependency.
TLDR
Use insertGlobalVars option in Browserify.
browserify({
  insertGlobalVars: {
    spine: function(file, dir) {
      return 'require("spine")';
    }
  }
});
For every file scanned, if an identifier spine exists that is not assigned, it resolves to require("spine").

Encapsulating sub-namespaces in typescript

The project I'm working on, being rather large, consists of one master module, which I'd like to be the API interface, with a number of sub-modules defined within it. This is being done as follows:
<Library.ts>
module Library { }
<Core/Core.ts>
module Library.Core {}
Often the submodules will span a number of files. The problem I'm having is that in such situations, one file cannot use non-exported properties defined within the same sub-module but in another file.
Is there any way I can use these properties, or failing that, any way I can prevent the entirety of a sub-module's exports being exposed within its parent module?
Is there any way I can use these properties, or failing that, any way I can prevent the entirety of a sub-module's exports being exposed within its parent module?
No. You need to export from module Foo for it to be available to module Foo in another file. The same applies to submodules.