How to know which module exports a certain function - module

I was going through this Flux.jl tutorial and came across something called Chain.
m = Chain(Dense(10, 5, relu), Dense(5, 2), softmax)
It was not imported from any of the used modules and no namespace was used, so I did not know which module it belonged to. Although I managed to find that I belongs to Flux package, I wonder if there is a general way of figuring this out within the script.

To figure out where a specific function comes from, you can use the parentmodule function:
julia> parentmodule(Chain)
Flux
Read more in the Julia docs: https://docs.julialang.org/en/v1/base/base/#Base.parentmodule

Note that, as the docs mention, parentmodule(<function>) only returns the module containing the first definition of the function, which may not necessarily be the one we care about. In this case it doesn't matter since Chain has definitions only in Flux, but sometimes a function has definitions in multiple packages, and which one applies depends on the type of the arguments.
parentmodule could help you there too, if you pass it the types of your arguments (for eg. parentmodule(Chain, (Dense, Dense, Function)) here), but I generally just tend to use the #which macro:
julia> #which Chain(Dense(10, 5, relu), Dense(5, 2), softmax)
Chain(xs...) in Flux at /home/Sundar/.julia/packages/Flux/BPPNj/src/layers/basic.jl:33
The output is slightly noisier, but it tells you which particular method of the function was applied, which module it is from, and exactly where you can find the method definition.

Related

Understanding how Julia modules can be extended

I'm struggling to understand how exactly modules can be extended in Julia. Specifically, I'd like to create my own LinearAlgebra matrix whose parent class is AbstractMatrix{T} and implement its functionality similar to how the Diagonal or UpperTriangular matrices are implemented in the actual LA package. If I could literally add my matrix to the original package, then I would, but for now I am content creating my own MyLinearAlgebra package that simply imports the original and extends it. Here's what I've got so far in MyLinearAlgebra.jl:
module MyLinearAlgebra
import LinearAlgebra
import Base: getindex, setindex!, size
export
# Types
LocalMatrix,
SolutionVector,
# Functions
issymmetric,
isdiag
# Operators
# Constants
include("SolutionVector.jl")
include("LocalMatrix.jl")
end
Focusing solely on LocalMatrix.jl now, I have:
"""
struct LocalMatrix{T} <: AbstractMatrix{T}
Block diagonal structure for local matrix. `A[:,:,s,iK]` is a block matrix for
state s and element iK
"""
struct LocalMatrix{T} <: AbstractMatrix{T}
data::Array{T,4}
function LocalMatrix{T}(data) where {T}
new{T}(data)
end
end
[... implement size, getindex, setindex! ... all working perfectly]
"""
issymmetric(A::LocalMatrix)
Tests whether a LocalMatrix is symmetric
"""
function issymmetric(A::LocalMatrix)
println("my issymmetric")
all(LinearAlgebra.issymmetric, [#view A.data[:,:,i,j] for i=1:size(A.data,3), j=1:size(A.data,4)])
end
"""
isdiag(A::LocalMatrix)
Tests whether a LocalMatrix is diagonal
"""
function isdiag(A::LocalMatrix)
println("my isdiag")
all(LinearAlgebra.isdiag, [#view A.data[:,:,i,j] for i=1:size(A.data,3), j=1:size(A.data,4)])
end
When I try and run this however, I get
error in method definition: function LinearAlgebra.isdiag must be explicitly imported to be extended
OK not a problem, I can change the definition to function LinearAlgebra.isdiag() instead and it works. But if I also change the definition of the other function to function LinearAlgebra.issymmetric() and run a simple test I now get the error
ERROR: MethodError: no method matching issymmetric(::MyLinearAlgebra.LocalMatrix{Float64})
So I'm stumped. Obviously I have a workaround that lets me continue working for now, but I must be simply misunderstanding how Julia modules work because I can't seem to distinguish between the two functions. Why does one needs to be explicitly extended? Why can the other not? What even is the difference between them in this situation? What is the correct way here to extend a package's module? Thanks for any help.
You need to explicitly state that you are adding new methods to existing functions so it should be:
function LinearAlgebra.issymmetric(A::LocalMatrix)
...
end
function LinearAlgebra.isdiag(A::LocalMatrix)
...
end
The reason that you are getting the error most likely is because you forgot to import LinearAlgebra in the code that is testing your package.
Note that your constructor also should be corrected:
struct LocalMatrix{T} <: AbstractMatrix{T}
data::Array{T,4}
function LocalMatrix(data::Array{T,4}) where {T}
new{T}(data)
end
end
With the current constructor you need to write LocalMatrix{Float64}(some_arr) instead of simply LocalMatrix(some_arr). Even worse, if you provide your constructor with a 3-d array you will get a type conversion error, while when you used the syntax that I am proposing one gets no method matching LocalMatrix(::Array{Int64,3}) which is much more readable for users of your library.

Serializing the current state of "Module Random" in OCaml's StdLib

I must have read the OCaml manual page on the standard library modules Random and Random.State half a dozen times (probably even more often) but I can't figure out how to serialize the current internal state of the PRNG.
Here's what I learned so far:
The modules Random and Random.State both operate on a state that is abstract / opaque from the outside.
Both modules offer two / three initializers, but functions exporting the current state ... I can't see them :(
What can I do ? Help please !
You can serialize (and de-serialize) the state using the Marshal module, e.g.,
let save_random_state out =
Marshal.to_channel out (Random.get_state ()) []
let load_random_state inp =
Random.set_state (Marshal.from_channel inp)
But if you just want the Random module to generate the same sequences of pseudo-random numbers than it is better just to initialize with the same state, i.e., use the same seed, e.g., if you will start your program with,
let () = Random.set_state (Random.State.make [|42|])
you will get the deterministic behavior of your program, as the Random module will always generate the same numbers.

values argument of the variable_scope

Could someone shed some light on the
values
argument of the variable_scope Class
in tensorflow. The official documentation is a little bit confusing.
I m quoting from the doc:
This context manager validates that the (optional) values are from the same graph, ensures that graph is the default graph, and pushes a name scope and a variable scope.
and
values: The list of Tensor arguments that are passed to the op function.
Can anyone present a use case of such an argument ?
Looking at source code in the current 1.7.0 version (python/ops/variable_scope.py), its use seems somewhat niche. The only thing that parameter appears to be used for is to call the internal function _get_graph_from_inputs (defined in python/framework/ops.py), which returns the graph on which the operations are to be constructed (the graph for which you are creating the scope, I understand). When you don't pass anything, it will be the current default graph, when you give some tensors, the graph in which these live is used. I find hard to imagine a case in which one would prefer to pass some values instead of setting a default graph context, but there it is... Maybe it is used internally or something else, but I would not be surprised if they decide to deprecate the parameter at some point, since I have not been able to find a single usage example.

Two Modules, both exporting the same name

There are two packages I want to use: CorpusLoaders.jl, and WordNet.jl
CorpusLoaders.SemCor exports sensekey(::SenseTaggedWord)
WordNet exports sensekey(::DB, ::Synset, ::Lemma)
I want to use both sensekey methods.
Eg
for some mixed list of items: mixedlist::Vector{Union{Tuple{SenseTaggedWord},Tuple{DB, Synset,Lemma}}.
Ie the items in the list are a mixture of 1-tuples of SenseTaggedWord, and3 tuples of DB, Synset, and Lemma.
for item in mixedlist
println(sensekey(item...)
end
should work.
This example is a little facetious, since why would I be mixing them like this.
But, hopefully it serves for illustrating the problem in the general case.
Trying to using CorpusLoaders.SemCor, WordNet to bring in both results in WARNING: both WordNet and Semcor export "sensekey"; uses of it in module Main must be qualified.
Manually importing both: import CorpusLoaders.SemCor.sensekey; import WordNet.sensekey results in WARNING: ignoring conflicting import of Semcor.sensekey into Main
What can be done? I want them both, and they don't really conflict, due to multiple-dispatch.
Given that CorpusLoaders.jl is a package I am writing I do have a few more options, since I could make my CorpusLoaders.jl depend on WordNet.jl.
If I did do than then I could say in CorpusLoaders.jl
import WordNet
function WordNet.sensekey(s::SenseTaggedWord)...
and that would make them both work.
But it would mean requiring WordNet as a dependency of CorpusLoaders.
And I want to know how to solve the problem for a consumer of the packages -- not as the creator of the packages.
tl;dr qualify the functions when using them in your script via their module namespace, i.e. CorpusLoader.sensekey() and WordNet.sensekey()
Explanation
My understanding of your question after the edits (thank you for clarifying) is that:
You have written a package called CorpusLoaders.jl, which exports the function sensekey(::SenseTaggedWord)
There is an external package called WordNet.jl, which exports the function sensekey(::DB, ::Synset, ::Lemma)
You have a script that makes use of both modules.
and you are worried that using the modules or "importing" the functions directly could potentially create ambiguity and / or errors in your script, asking
how can I write my CorpusLoaders package to prevent potential clashes with other packages, and
how can I write my script to clearly disambiguate between the two functions while still allowing their use?
I think this stems from a slight confusion how using and import are different from each other, and how modules create a namespace. This is very nicely explained in the docs here.
In essence, the answers are:
You should not worry about exporting things from your module that will clash with other modules. This is what modules are for: you're creating a namespace, which will "qualify" all exported variables, e.g. CorpusLoaders.sensekey(::SenseTaggedWord).
When you type using CorpusLoaders, what you're saying to julia is "import the module itself, and all the exported variables stripped from their namespace qualifier, and bring them into Main". Note that this means you now have access to sensekey as a function directly from Main without a namespace qualifier, and as CorpusLoaders.sensekey(), since you've also imported the module as a variable you can use.
If you then try using the module WordNet as well, julia very reasonably issues a warning, which essentially says:
"You've imported two functions that have the same name. I can't just strip their namespace off because that could create problems in some scenarios (even though in your case it wouldn't because they have different signatures, but I couldn't possibly know this in general). If you want to use either of these functions, please do so using their appropriate namespace qualifier".
So, the solution for 2. is:
you either do
using CorpusLoaders;
using WordNet;
, disregarding the warning, to import all other exported variables as usual in your Main namespace, and access those particular functions directly via their modules as CorpusLoaders.sensekey() and WordNet.sensekey() each time you need to use them in your script, or
you keep both modules clearly disambiguated at all times by doing
import CorpusLoaders;
import WordNet;
and qualify all variables appropriately, or
in this particular case where the function signatures don't clash, if you'd really like to be able to use the function without a namespace qualifier, relying on multiple dispatch instead, you can do something like what FengYang suggested:
import CorpusLoaders;
import WordNet;
sensekey(a::SenseTaggedWord) = CorpusLoader.sensekey(a);
sensekey(a::DB, b::Synset, c::Lemma) = WordNet.sensekey(a, b, c);
which is essentially a new function, defined on module Main, acting as a wrapper for the two namespace-qualified functions.
In the end, it all comes down to using using vs import and namespaces appropriately for your particular code. :)
As an addendum, code can get very unwieldy with long namespace qualifiers like CorpusLoader and WordNet. julia doesn't have something like python's import numpy as np, but at the same time modules become simple variables on your workspace, so it's trivial to create an alias for them. So you can do:
import CorpusLoaders; const cl = CorpusLoaders;
import Wordnet; const wn = WordNet;
# ... code using both cl.sensekey() and wn.sensekey()
In this case, the functions do not conflict, but in general that is impossible to guarantee. It could be the case that a package loaded later will add methods to one of the functions that will conflict. So to be able to use the sensekey for both packages requires some additional guarantees and restrictions.
One way to do this is to ignore both package's sensekey, and instead provide your own, dispatching to the correct package:
sensekey(x) = CorpusLoaders.sensekey(x)
sensekey(x, y, z) = WordNet.sensekey(x,y,z)
I implemented what #Fengyang Wang said,
as a function:
function importfrom(moduleinstance::Module, functionname::Symbol, argtypes::Tuple)
meths = methods(moduleinstance.(functionname), argtypes)
importfrom(moduleinstance, functionname, meths)
end
function importfrom(moduleinstance::Module, functionname::Symbol)
meths = methods(moduleinstance.(functionname))
importfrom(moduleinstance, functionname, meths)
end
function importfrom(moduleinstance::Module, functionname::Symbol, meths::Base.MethodList)
for mt in meths
paramnames = collect(mt.lambda_template.slotnames[2:end])
paramtypes = collect(mt.sig.parameters[2:end])
paramsig = ((n,t)->Expr(:(::),n,t)).(paramnames, paramtypes)
funcdec = Expr(:(=),
Expr(:call, functionname, paramsig...),
Expr(:call, :($moduleinstance.$functionname), paramnames...)
)
current_module().eval(funcdec) #Runs at global scope, from calling module
end
end
Call with:
using WordNet
using CorpusLoaders.Semcor
importfrom(CorpusLoaders.Semcor, :sensekey)
importfrom(WordNet, :sensekey)
methods(sensekey)
2 methods for generic function sensekey:
sensekey(db::WordNet.DB, ss::WordNet.Synset, lem::WordNet.Lemma)
sensekey(saword::CorpusLoaders.Semcor.SenseAnnotatedWord
If you wanted to get really flash you could reexport the DocString too.

How do you USE Fortran 90 module data

Let's say you have a Fortran 90 module containing lots of variables, functions and subroutines. In your USE statement, which convention do you follow:
explicitly declare which variables/functions/subroutines you're using with the , only : syntax, such as USE [module_name], only : variable1, variable2, ...?
Insert a blanket USE [module_name]?
On the one hand, the only clause makes the code a bit more verbose. However, it forces you to repeat yourself in the code and if your module contains lots of variables/functions/subroutines, things begin to look unruly.
Here's an example:
module constants
implicit none
real, parameter :: PI=3.14
real, parameter :: E=2.71828183
integer, parameter :: answer=42
real, parameter :: earthRadiusMeters=6.38e6
end module constants
program test
! Option #1: blanket "use constants"
! use constants
! Option #2: Specify EACH variable you wish to use.
use constants, only : PI,E,answer,earthRadiusMeters
implicit none
write(6,*) "Hello world. Here are some constants:"
write(6,*) PI, &
E, &
answer, &
earthRadiusInMeters
end program test
Update
Hopefully someone says something like "Fortran? Just recode it in C#!" so I can down vote you.
Update
I like Tim Whitcomb's answer, which compares Fortran's USE modulename with Python's from modulename import *. A topic which has been on Stack Overflow before:
‘import module’ or ‘from module import’
In an answer, Mark Roddy mentioned:
don't use 'from module import *'. For
any reasonable large set of code, if
you 'import *' your will likely be
cementing it into the module, unable
to be removed. This is because it is
difficult to determine what items used
in the code are coming from 'module',
making it east to get to the point
where you think you don't use the
import anymore but its extremely
difficult to be sure.
What are good rules of thumb for python imports?
dbr's answer contains
don't do from x import * - it makes
your code very hard to understand, as
you cannot easily see where a method
came from (from x import *; from y
import *; my_func() - where is my_func
defined?)
So, I'm leaning towards a consensus of explicitly stating all the items I'm using in a module via
USE modulename, only : var1, var2, ...
And as Stefano Borini mentions,
[if] you have a module so large that you
feel compelled to add ONLY, it means
that your module is too big. Split it.
I used to just do use modulename - then, as my application grew, I found it more and more difficult to find the source to functions (without turning to grep) - some of the other code floating around the office still uses a one-subroutine-per-file, which has its own set of problems, but it makes it much easier to use a text editor to move through the code and quickly track down what you need.
After experiencing this, I've become a convert to using use...only whenever possible. I've also started picking up Python, and view it the same way as from modulename import *. There's a lot of great things that modules give you, but I prefer to keep my global namespace tightly controlled.
It's a matter of balance.
If you use only a few stuff from the module, it makes sense if you add ONLY, to clearly specify what you are using.
If you use a lot of stuff from the module, specifying ONLY will be followed by a lot of stuff, so it makes less sense. You are basically cherry-picking what you use, but the true fact is that you are dependent on that module as a whole.
However, in the end the best philosophy is this one: if you are concerned about namespace pollution, and you have a module so large that you feel compelled to add ONLY, it means that your module is too big. Split it.
Update: Fortran? just recode it in python ;)
Not exactly answering the question here, just throwing in another solution that I have found useful in some circumstances, if for whatever reason you don't want to split your module and start to get namespace clashes. You can use derived types to store several namespaces in one module.
If there is some logical grouping of the variables, you can create your own derived type for each group, store an instance of this type in the module and then you can import just the group that you happen to need.
Small example: We have a lot of data some of which is user input and some that is the result of miscellaneous initializations.
module basicdata
implicit none
! First the data types...
type input_data
integer :: a, b
end type input_data
type init_data
integer :: b, c
end type init_data
! ... then declare the data
type(input_data) :: input
type(init_data) :: init
end module basicdata
Now if a subroutine only uses data from init, you import just that:
subroutine doesstuff
use basicdata, only : init
...
q = init%b
end subroutine doesstuff
This is definitely not a universally applicable solution, you get some extra verbosity from the derived type syntax and then it will of course barely help if your module is not the basicdata sort above, but instead more of a allthestuffivebeenmeaningtosortoutvariety. Anyway, I have had some luck in getting code that fits easier into the brain this way.
The main advantage of USE, ONLY for me is that it avoids polluting my global namespace with stuff I don't need.
Agreed with most answers previously given, use ..., only: ... is the way to go, use types when it makes sense, apply python thinking as much as possible. Another suggestion is to use appropriate naming conventions in your imported module, along with private / public statements.
For instance, the netcdf library uses nf90_<some name>, which limits the namespace pollution on the importer side.
use netcdf ! imported names are prefixed with "nf90_"
nf90_open(...)
nf90_create(...)
nf90_get_var(...)
nf90_close(...)
similarly, the ncio wrapper to this library uses nc_<some name> (nc_read, nc_write...).
Importantly, with such designs where use: ..., only: ... is made less relevant, you'd better control the namespace of the imported module by setting appropriate private / public attributes in the header, so that a quick look at it will be sufficient for readers to assess which level of "pollution" they are facing. This is basically the same as use ..., only: ..., but on the imported module side - thus to be written only once, not at each import).
One more thing: as far as object-orientation and python are concerned, a difference in my view is that fortran does not really encourage type-bound procedures, in part because it is a relatively new standard (e.g. not compatible with a number of tools, and less rationally, it is just unusual) and because it breaks handy behavior such as procedure-free derived type copy (type(mytype) :: t1, t2 and t2 = t1). That means you often have to import the type and all would-be type-bound procedures, instead of just the class. This alone makes fortran code more verbose compared to python, and practical solutions like a prefix naming convention may come in handy.
IMO, the bottom line is: choose your coding style for people who will read it (this includes your later self), as taught by python. The best is the more verbose use ..., only: ... at each import, but in some cases a simple naming convention will do it (if you are disciplined enough...).
Yes, please use use module, only: .... For large code bases with multiple programmers, it makes the code easier to follow by everyone (or just use grep).
Please do not use include, use a smaller module for that instead. Include is a text insert of source code which is not checked by the compiler at the same level as use module, see: FORTRAN: Difference between INCLUDE and modules. Include generally makes it harder for both humans and computer to use the code which means it should not be used. Ex. from mpi-forum: "The use of the mpif.h include file is strongly discouraged and may be deprecated in a future version of MPI." (http://mpi-forum.org/docs/mpi-3.1/mpi31-report/node411.htm).
I know I'm a little late to the party, but if you're only after a set of constants and not necessarily computed values, you could do like C and create an include file:
inside a file,
e.g., constants.for
real, parameter :: pi = 3.14
real, parameter :: g = 6.67384e-11
...
program main
use module1, only : func1, subroutine1, func2
implicit none
include 'constants.for'
...
end program main
Edited to remove "real(4)" as some think it is bad practice.