Hi I am an old Java/Haskell/Cobol programmer. And Julia has many fantastic features but I am struggling with what I consider the basics. Particularly how to find what is in the many packages (Flux, Plots, DifferentialEquations, etc).
I have tried both VSCode and Jupiter notebook and the Julia REPL but have found no way to extract this essential information while working on a program.
I have also failed to find the information on line without it being spread very thinly over tutorial for each package.
Is there and "cheetsheet" or "summary" or java like API?
For the basics, I would recommend two "cheatsheets" in particular
"The fast track to Julia" https://juliadocs.github.io/Julia-Cheat-Sheet/ -- covers the language itself, the stdlibs, and a couple useful packages
If you have previous experience in Python or Matlab, https://cheatsheets.quantecon.org
However, the package ecosystem is too diverse for any cheatsheet to cover all the useful packages out there, so for individual packages I would recommend a few tricks built in to the language:
Tab-complete in the REPL. Typing PackageName.<tab><tab> will print everything in the namespace of PackageName, e.g.:
julia> using Statistics
julia> Statistics.
_conj _vmean covm quantile!
_getnobs centralize_sumabs2 covzm range_varm
_mean centralize_sumabs2! eval realXcY
_mean_promote centralizedabs2fun include sqrt!
_median clampcor mean std
_quantile cor mean! stdm
_quantilesort! corm median unscaled_covzm
_std corzm median! var
_var cov middle varm
_varm cov2cor! quantile varm!
Note that since this prints everything in the package, this includes functions that may not be intended for general use. By general convention, any function beginning with an underscore _ is not intended for public use.
The names function -- will give you a list of all the functions exported by a given package (and thus is a bit more selective than just typing PackageName.<tab><tab>). This is effectively the public API of any given Julia package:
julia> using Statistics
julia> names(Statistics)
14-element Vector{Symbol}:
:Statistics
:cor
:cov
:mean
:mean!
:median
:median!
:middle
:quantile
:quantile!
:std
:stdm
:var
:varm
Once you find a function that looks promising, the built-in help, accessed by just typing ? at the REPL prompt:
help?> cov
search: cov convert StackOverflowError CUSOLVER has_cusolvermg code_llvm #code_llvm cudaconvert
cov(x::AbstractVector; corrected::Bool=true)
Compute the variance of the vector x. If corrected is true (the default) then the sum is scaled with
n-1, whereas the sum is scaled with n if corrected is false where n = length(x).
────────────────────────────────────────────────────────────────────────────────────────────────────
cov(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)
Compute the covariance matrix of the matrix X along the dimension dims. If corrected is true (the
default) then the sum is scaled with n-1, whereas the sum is scaled with n if corrected is false
where n = size(X, dims).
────────────────────────────────────────────────────────────────────────────────────────────────────
...
...
...
The methodswith function -- will give you a list of all functions that can operate on objects of a given Type:
julia> using LinearAlgebra, SparseArrays
julia> methodswith(SparseMatrixCSC)
[1] sizehint!(S::SparseMatrixCSC, n::Integer) in SparseArrays
[2] cov(X::SparseMatrixCSC; dims, corrected) in Statistics
[3] \(L::SuiteSparse.CHOLMOD.Factor, B::SparseMatrixCSC) in SuiteSparse.CHOLMOD
[4] lu(A::SparseMatrixCSC; check) in SuiteSparse.UMFPACK a
[5] lu!(F::SuiteSparse.UMFPACK.UmfpackLU, A::SparseMatrixCSC; check) in SuiteSparse.UMFPACK
[6] qr(A::SparseMatrixCSC; tol) in SuiteSparse.SPQR
[7] rank(S::SparseMatrixCSC) in SuiteSparse.SPQR
...
...
...
On #OscarSmith's suggestion -- you can search for functions, types, and other symbols across multiple packages, as well as searching the full text of all Julia packages, on JuliaHub. For example: sprandn
Finally, for getting a feel for what sort of packages are out there in the package ecosystem beyond the stdlib, I can recommend joining one of the several Julia community fora, such as the discourse, the slack, or the zulip.
Related
I'm relatively new to Yosys. I've been tinkering with it with some proprietary standard cell libraries and am trying to extract some QoR/PPA metrics, similar to those you can get from DC.
Minimum slack (including worst-case negative slack/WNS)
Max logic depth [0]
Cell area [1]
For [0], I know there's the ltp command, but it only reports topological paths per module. I tried flattening the design using flatten, but there still seems to be a hierarchy in the netlist. Where should I insert the flatten command to actually flatten the netlist?
For [1], I know you can get the number of cells in the netlist using the stat command, but this doesn't tell me the equivalent of DC's CellArea metric (since each cell has a different area). I could just build a library of cell areas for each cell type based on the cell library datasheet, but that's rather laborious.
Also, is it possible to specify a target clock rate for synthesis? I think for abc there was a -D flag for delay, but this sounds to me more like input delay rather than clock period.
Thanks!
-D passed to abc is indeed clock period, not input delay. When specified this should also cause abc to print slack information.
Have you tried stat -liberty file.lib to use a liberty file for cell areas? If this isn't calculating areas as expected (I didn't quite understand your issue) then please create a feature request on GitHub with the difference.
flatten should be run after hierarchy -top top_module_name to do hierarchical elaboration and set the top module.
Coming from python3 to Julia one would love to be able to write fast iterators as a function with produce/yield syntax or something like that.
Julia's macros seem to suggest that one could build a macro which transforms such a "generator" function into an julia iterator.
[It even seems like you could easily inline iterators written in function style, which is a feature the Iterators.jl package also tries to provide for its specific iterators https://github.com/JuliaCollections/Iterators.jl#the-itr-macro-for-automatic-inlining-in-for-loops ]
Just to give an example of what I have in mind:
#asiterator function myiterator(as::Array)
b = 1
for (a1, a2) in zip(as, as[2:end])
try
#produce a1[1] + a2[2] + b
catch exc
end
end
end
for i in myiterator([(1,2), (3,1), 3, 4, (1,1)])
#show i
end
where myiterator should ideally create a fast iterator with as low overhead as possible. And of course this is only one specific example. I ideally would like to have something which works with all or almost all generator functions.
The currently recommended way to transform a generator function into an iterator is via Julia's Tasks, at least to my knowledge. However they also seem to be way slower then pure iterators. For instance if you can express your function with the simple iterators like imap, chain and so on (provided by Iterators.jl package) this seems to be highly preferable.
Is it theoretically possible in julia to build a macro converting generator-style functions into flexible fast iterators?
Extra-Point-Question: If this is possible, could there be a generic macro which inlines such iterators?
Some iterators of this form can be written like this:
myiterator(as) = (a1[1] + a2[2] + 1 for (a1, a2) in zip(as, as[2:end]))
This code can (potentially) be inlined.
To fully generalize this, it is in theory possible to write a macro that converts its argument to continuation-passing style (CPS), making it possible to suspend and restart execution, giving something like an iterator. Delimited continuations are especially appropriate for this (https://en.wikipedia.org/wiki/Delimited_continuation). The result is a big nest of anonymous functions, which might be faster than Task switching, but not necessarily, since at the end of the day it needs to heap-allocate a similar amount of state.
I happen to have an example of such a transformation here (in femtolisp though, not Julia): https://github.com/JeffBezanson/femtolisp/blob/master/examples/cps.lsp
This ends with a define-generator macro that does what you describe. But I'm not sure it's worth the effort to do this for Julia.
Python-style generators – which in Julia would be closest to yielding from tasks – involve a fair amount of inherent overhead. You have to switch tasks, which is non-trivial and cannot straightforwardly be eliminated by a compiler. That's why Julia's iterators are based on functions that transform one typically immutable, simple state value, and another. Long story short: no, I do not believe that this transformation can be done automatically.
After thinking a lot how to translate python generators to Julia without loosing much performance, I implemented and tested a library of higher level functions which implement Python-like/Task-like generators in a continuation-style. https://github.com/schlichtanders/Continuables.jl
Essentially, the idea is to regard Python's yield / Julia's produce as a function which we take from the outside as an extra parameter. I called it cont for continuation. Look for instance on this reimplementation of a range
crange(n::Integer) = cont -> begin
for i in 1:n
cont(i)
end
end
You can simply sum up all integers by the following code
function sum_continuable(continuable)
a = Ref(0)
continuable() do i
a.x += i
end
a.x
end
# which simplifies with the macro Continuables.#Ref to
#Ref function sum_continuable(continuable)
a = Ref(0)
continuable() do i
a += i
end
a
end
sum_continuable(crange(4)) # 10
As you hopefully agree, you can work with continuables almost like you would have worked with generators in python or tasks in julia. Using do notation instead of for loops is kind of the one thing you have to get used to.
This idea takes you really really far. The only standard method which is not purely implementable using this idea is zip. All the other standard higher-level tools work just like you would hope.
The performance is unbelievably faster than Tasks and even faster than Iterators in some cases (notably the naive implementation of Continuables.cmap is orders of magnitude faster than Iterators.imap). Check out the Readme.md of the github repository https://github.com/schlichtanders/Continuables.jl for more details.
EDIT: To answer my own question more directly, there is no need for a macro #asiterator, just use continuation style directly.
mycontinuable(as::Array) = cont -> begin
b = 1
for (a1, a2) in zip(as, as[2:end])
try
cont(a1[1] + a2[2] + b)
catch exc
end
end
end
mycontinuable([(1,2), (3,1), 3, 4, (1,1)]) do i
#show i
end
In[11]:= $Version
Out[11]= 9.0 for Linux x86 (32-bit) (November 20, 2012)
In[12]:= DSolve[{f[0] == d, f'[0] == v0, f''[t] == g*m2/f[t]^2}, f, t]
DSolve::bvimp: General solution contains implicit solutions. In the boundary
value problem these solutions will be ignored, so some of the solutions will
be lost.
Out[12]= {}
The code above pretty much says it all. I get the same error if I replace g*m2 with 1.
This seems like a really simple DFQ to solve. I'd like to tell DSolve to assume all variables are real and that d, g, and m2 are all greater than 0, but there's unfortunately no way to do that.
Thoughts?
You are trying for a symbolic solution. And unfortunately, symbolic integration is hard (while symbolic differentiation is easy).
The way this integration works is to obtain the energy functional by integrating once
E = 1/2*f'[t]^2 + C/f[t]
and then to isolate f'[t]. The resulting integral is not easy to solve and leads to the mentioned implicit solutions.
Did you really want to get the symbolic solution or only some function table to plot the solutions or compute other related quantities?
Since it was clarified that the requested quantity is the maximum of certain solutions: This can be computed by setting v=0 in the energy equation
C/x = E = 1/2*v0^2 + C/x0
or
x = C*x0/(C + 1/2*v0^2*x0 )
One would have to analyze the time aspect to make sure that this extremum is reached before passing again at the initial point x0.
I was profiling an application that does a lot of math operations on NMatrix matrices.
The application spends most of it's time in in the code below.
{add: :+, sub: :-, mul: :*, div: :/, pow: :**, mod: :%}.each_pair do |ewop, op|
define_method("__list_elementwise_#{ewop}__") do |rhs|
self.__list_map_merged_stored__(rhs, nil) { |l,r| l.send(op,r) }.cast(stype, NMatrix.upcast(dtype, rhs.dtype))
end
define_method("__dense_elementwise_#{ewop}__") do |rhs|
self.__dense_map_pair__(rhs) { |l,r| l.send(op,r) }.cast(stype, NMatrix.upcast(dtype, rhs.dtype))
end
define_method("__yale_elementwise_#{ewop}__") do |rhs|
self.__yale_map_merged_stored__(rhs, nil) { |l,r| l.send(op,r) }.cast(stype, NMatrix.upcast(dtype, rhs.dtype))
end
end
In the commets above the code it says:
# Define the element-wise operations for lists. Note that the __list_map_merged_stored__ iterator returns a Ruby Object
# matrix, which we then cast back to the appropriate type. If you don't want that, you can redefine these functions in
# your own code.
I am not that familiar with the internals of NMatrix but it seems as though the math operations are being executed in Ruby. Is there anyway to speed up these methods?
We had written them in C/C++ originally, but it required some really complicated macros which were basically unmaintainable and buggy, and substantially increased compile time.
If you look in History.txt, you'll be able to find at what version we started writing the math operations in Ruby. You could use the prior code to override and put the element-wise operations (where you need speed) exclusively in C/C++.
However, you may run into problems getting those to work properly (without a crash) on matrices of dtype :object.
As a side note, the sciruby-dev Google Group (or the nmatrix issue tracker) might be a more appropriate place for a question like this one.
I am in the midst of trying to make the leap from Matlab to numpy, but I desperately need speed in my fft's. Now I know of pyfftw, but I don't know that I am using it properly. My approach is going something like
import numpy as np
import pyfftw
import timeit
pyfftw.interfaces.cache.enable()
def wrapper(func, *args):
def wrapped():
return func(*args)
return wrapped
def my_fft(v):
global a
global fft_object
a[:] = v
return fft_object()
def init_cond(X):
return my_fft(2.*np.cosh(X)**(-2))
def init_cond_py(X):
return np.fft.fft(2.*np.cosh(X)**(-2))
K = 2**16
Llx = 10.
KT = 2*K
dx = Llx/np.float64(K)
X = np.arange(-Llx,Llx,dx)
global a
global b
global fft_object
a = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
b = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
fft_object = pyfftw.FFTW(a,b)
wrapped = wrapper(init_cond, X)
print min(timeit.repeat(wrapped,repeat=100,number=1))
wrapped_two = wrapper(init_cond_py, X)
print min(timeit.repeat(wrapped_two,repeat=100,number=1))
I appreciate that there are builder functions and also standard interfaces to the scipy and numpy fft calls through pyfftw. These have all behaved very slowly though. By first creating an instance of the fft_object and then using it globally, I have been able to get speeds as fast or slightly faster than numpy's fft call.
That being said, I am working under the assumption that wisdom is implicitly being stored. Is that true? Do I need to make that explicit? If so, what is the best way to do that?
Also, I think timeit is completely opaque. Am I using it properly? Is it storing wisdom as I call repeat? Thanks in advance for any help you might be able to give.
In an interactive (ipython) session, I think the following is what you want to do (timeit is very nicely handled by ipython):
In [1]: import numpy as np
In [2]: import pyfftw
In [3]: K = 2**16
In [4]: Llx = 10.
In [5]: KT = 2*K
In [6]: dx = Llx/np.float64(K)
In [7]: X = np.arange(-Llx,Llx,dx)
In [8]: a = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
In [9]: b = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
In [10]: fft_object = pyfftw.FFTW(a,b)
In [11]: a[:] = 2.*np.cosh(X)**(-2)
In [12]: timeit np.fft.fft(a)
100 loops, best of 3: 4.96 ms per loop
In [13]: timeit fft_object(a)
100 loops, best of 3: 1.56 ms per loop
In [14]: np.allclose(fft_object(a), np.fft.fft(a))
Out[14]: True
Have you read the tutorial? What don't you understand?
I would recommend using the builders interface to construct the FFTW object. Have a play with the various settings, most importantly the number of threads.
The wisdom is not stored by default. You need to extract it yourself.
All your globals are unnecessary - the objects you want to change are mutable, so you can handle them just fine. fft_object always points to the same thing, so no problem with that not being a global. Ideally, you simply don't want that loop over ii. I suggest working out how to structure your arrays in order that you can do all your operations in a single call
Edit:
[edit edit: I wrote the following paragraph with only a cursory glance at your code, and clearly with it being a recursive update, vectorising is not an obvious approach without some serious cunning. I have a few comments on your implementation at the bottom though]
I suspect your problem is a more fundamental misunderstanding of how to best use a language like Python (or indeed Matlab) for numerical processing. The core tenet is vectorise as much as possible. By this, I mean roll up your python calls to be as few as possible. I can't see how to do that with your example unfortunately (though I've only thought about it for 2 mins). If that's still failing, think about cython - though make sure you really want to go down that route (i.e. you've exhausted the other options).
Regarding the globals: Don't do it that way. If you want to create an object with state, use a class (that is what they are for) or perhaps a closure in your case. The global is almost never what you want (I think I have one at least vaguely legit use for it in all my writing of python, and that's in the cache code in pyfftw). I suggest reading this nice SO question. Matlab is a crappy language - one of the many reasons for this is its crap scoping facilities which tend to lead to bad habits.
You only need global if you want to modify a reference globally. I suggest reading a bit more about the Python scoping rules and what variables really are in python.
FFTW objects carry with them all the arrays you need so you don't need to pass them around separately. Using the call interface carries almost no overhead (particularly if you disable the normalisation) either for setting or returning the values - if you're at that level of optimisation, I strongly suspect you've hit the limit (I'd caveat this that this may not quite be true for many many very small FFTs, but at this point you want to rethink your algorithm to vectorise the calls to FFTW). If you find a substantial overhead in updating the arrays every time (using the call interface), this is a bug and you should submit it as such (and I'd be pretty surprised).
Bottom line, don't worry about updating the arrays on every call. This is almost certainly not your bottleneck, though make sure you're aware of the normalisation and disable it if you wish (it might slow things down slightly compared to raw accessing of the update_arrays() and execute() methods).
Your code makes no use of the cache. The cache is only used when you're using the interfaces code, and reduces the Python overhead in creating new FFTW objects internally. Since you're handling the FFTW object yourself, there is no reason for a cache.
The builders code is a less constrained interface to get an FFTW object. I almost always use the builders now (it's much more convenient that creating a FFTW object from scratch). The cases in which you want to create an FFTW object directly are pretty rare and I'd be interested to know what they are.
Comments on the algorithm implementation:
I'm not familiar with the algorithm you're implementing. However, I have a few comments on how you've written it at the moment.
You're computing nl_eval(wp) on every loop, but as far as I can tell that's just the same as nl_eval(w) from the previous loop, so you don't need to compute it twice (but this comes with the caveat that it's pretty hard to see what's going on when you have globals everywhere, so I might be missing something).
Don't bother with the copies in my_fft or my_ifft. Simply do fft_object(u) (2.29 ms versus 1.67 ms on my machine for the forward case). The internal array update routine makes the copy unnecessary. Also, as you've written it, you're copying twice: c[:] means "copy into the array c", and the array you're copying into c is v.copy(), i.e. a copy of v (so two copies in total).
More sensible (and probably necessary) is copying the output into holding arrays (since that avoids clobbering interim results on calls to the FFTW object), though make sure your holding arrays are properly aligned. I'm sure you've noted this is important but it's rather more understandable to copy the output.
You can move all your scalings together. The 3 in the computation of wn can be be moved inside my_fft in nl_eval. You can also combine this with the normalisation constant from the ifft (and turn it off in pyfftw).
Take a look at numexpr for the basic array operations. It can offer quite a bit of speed-up over vanilla numpy.
Anyway take what you will from all that. No doubt I've missed something or said something incorrect, so please accept it with as much humility as I can offer. It's worth spending a little time working out how Python ticks compared to Matlab (in fact, just forget the latter).