Reverse iteration in Julia

Yesterday I had occasion to want to iterate over a collection in reverse order. I found the reverse function, but it does not return an iterator; it actually creates a reversed copy of the collection.
Apparently, there used to be a Reverse iterator, which was removed several years ago. I can also find reference to something (a type?) called Order.Reverse, but that does not seem to apply to my question.
The Iterators.jl package has many interesting iteration patterns, but apparently not reverse iteration.
I can of course use the reverse function, and in some cases, for example reverse(eachindex(c)), which returns a reversed iterator, but I would prefer a general reverse iterator.
Is there such a thing?

UPDATE FOR JULIA 1.0+
You can now use Iterators.reverse to reverse a limited subset of types. For example:
julia> Iterators.reverse(1:10)
10:-1:1
julia> Iterators.reverse(CartesianIndices(zeros(2,3))) |> collect
2×3 Array{CartesianIndex{2},2}:
CartesianIndex(2, 3) CartesianIndex(2, 2) CartesianIndex(2, 1)
CartesianIndex(1, 3) CartesianIndex(1, 2) CartesianIndex(1, 1)
julia> Iterators.reverse([1,1,2,3,5]) |> collect
5-element Array{Int64,1}:
5
3
2
1
1
# But not all iterables support it (and new iterables don't support it by default):
julia> Iterators.reverse(Set(1:2)) |> collect
ERROR: MethodError: no method matching iterate(::Base.Iterators.Reverse{Set{Int64}})
Note that this only works for types that have specifically defined reverse iteration. That is, they have specifically defined Base.iterate(::Iterators.Reverse{T}, ...) where T is the custom type. So it's not fully general purpose (for the reasons documented below), but it does work for any type that supports it.
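To opt a custom type in, you add exactly such a method. A minimal sketch (the Countup type here is made up for illustration):

# Hypothetical iterable that yields 1, 2, ..., n
struct Countup
    n::Int
end

Base.iterate(c::Countup, state=1) = state > c.n ? nothing : (state, state + 1)
Base.length(c::Countup) = c.n
Base.eltype(::Type{Countup}) = Int

# Opt in to Iterators.reverse: define iteration on the Reverse wrapper,
# which stores the underlying iterable in its itr field.
Base.iterate(r::Iterators.Reverse{Countup}, state=r.itr.n) =
    state < 1 ? nothing : (state, state - 1)

collect(Countup(3))                    # [1, 2, 3]
collect(Iterators.reverse(Countup(3))) # [3, 2, 1]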
ORIGINAL ANSWER
Jeff's comment when he removed the reverse iterator three years ago (in the issue you linked) is just as relevant today:
I am highly in favor of deleting this since it simply does not work. Unlike everything else in iterator.jl it depends on indexing, not iteration, and doesn't even work on everything that's indexable (for example UTF8String). I hate having landmines like this in Base.
At the most basic level, iterators only know how to do three things: start iteration, get the next element, and check if the iteration is done. In order to create an iterator that doesn't allocate using these primitives, you'd need an O(n^2) algorithm: walk through the whole iterator, counting as you go, until you find the last element. Then walk the iterator again, only this time stopping at the penultimate element. Sure it doesn't allocate, but it'll be way slower than just collecting the iterator into an array and then indexing backwards. And it'll be completely broken for one-shot iterators (like eachline). So it is simply not possible to create an efficient general reverse iterator.
Note that reverse(eachindex(c)) does not work in general:
julia> reverse(eachindex(sprand(5,5,.2)))
ERROR: MethodError: no method matching reverse(::CartesianRange{CartesianIndex{2}})
One alternative that will still work with offset arrays is reverse(linearindices(c)). (In Julia 1.0+, linearindices has been replaced by the LinearIndices type.)

Related

Fortran arrays argument

I'm going through a Fortran code, and one bit has me a little puzzled.
There is a subroutine, say
SUBROUTINE SSUB(X,...)
REAL*8 X(0:N1,1:N2,0:N3-1),...
...
RETURN
END
Which is called in another subroutine by:
CALL SSUB(W(0,1,0,1),...)
where W is a 'working array'. It appears that a specific value from W is passed to X; however, X is dimensioned as an array. What's going on?
This is a not-uncommon idiom for getting the subroutine to work on a (rectangular in N dimensions) subset of the original array.
All parameters in Fortran (at least before Fortran 90) are passed by reference, so the actual array argument is resolved as a location in memory. Choose a location inside the space allocated for the whole array, and the subroutine manipulates only part of the array.
Biggest issue: you have to be aware of how the array is laid out in memory and how Fortran's array indexing scheme works. Fortran uses column-major array ordering, which is the opposite convention from c. Consider an array that is 5x5 in size (and index both directions from 0 to make the comparison with c easier). In both languages 0,0 is the first element in memory. In c the next element in memory is [0][1], but in Fortran it is (1,0). This affects which indexes you drop when choosing a subspace: if the original array is A(i,j,k,l), and the subroutine works on a three dimensional subspace (as in your example), in c it works on Aprime[i=constant][j][k][l], but in Fortran it works on Aprime(i,j,k,l=constant).
The other risk is wrap-around. The dimensions of the (sub)array in the subroutine have to match those in the calling routine, or strange, strange things will happen (think about it). So if A is declared of size (0:4,0:5,0:6,0:7), and we call with element A(0,1,0,1), the receiving routine is free to start the index of each dimension wherever it likes, but must make the sizes (4,5,6) or else; but that means that the last element in the j direction actually wraps around! The thing to do about this is to not use the last element. Making sure that that happens is the programmer's job, and is a pain in the butt. Take care. Lots of care.
In Fortran, variables are passed by address. So W(0,1,0,1) is both a value and an address; in effect, you pass the subarray starting at W(0,1,0,1).
This is called "sequence association". In this case, what appears to be a scalar, an element of an array (the actual argument in the caller), is associated with an array (implicitly its first element), the dummy argument in the subroutine. Thereafter the elements of the arrays are associated by storage order, known as "sequence". This was done in Fortran 77 and earlier for various reasons, here apparently for a workspace array -- perhaps the programmer was doing their own memory management. This is retained in Fortran >= 90 for backwards compatibility, but IMO it doesn't belong in new code.
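To make the storage sharing concrete, here is a minimal sketch (all bounds are invented for illustration) of a 3-D dummy array associated with one slab of a 4-D workspace:

C     Sketch only: SSUB's dummy X shares storage with the L=1 slab
C     of the 4-D workspace W.
      PROGRAM DEMO
      REAL*8 W(0:4,1:5,0:6,1:8)
      CALL SSUB(W(0,1,0,1))
      END

      SUBROUTINE SSUB(X)
      REAL*8 X(0:4,1:5,0:6)
C     By sequence association X(I,J,K) occupies the same storage as
C     W(I,J,K,1): the first three extents match (5,5,7) and Fortran
C     lays arrays out column-major.
      X(0,1,0) = 1.0D0
      RETURN
      END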

How to use hyperoperators with Scalars that aren't really scalar?

I want to make a hash of sets. Well, SetHashes, since they need to be mutable.
In fact, I would like to initialize my Hash with multiple identical copies of the same SetHash.
I have an array containing the keys for the new hash: @keys
And I have my SetHash already initialized in a scalar variable: $set
I'm looking for a clean way to initialize the hash.
This works:
my %hash = ({ $_ => $set.clone } for @keys);
(The parens are needed for precedence; without them, the assignment to %hash is part of the body of the for loop. I could change it to a non-postfix for loop or make any of several other minor changes to get the same result in a slightly different way, but that's not what I'm interested in here.)
Instead, I was kind of hoping I could use one of Raku's nifty hyper-operators, maybe like this:
my %hash = @keys »=>» $set;
That expression works a treat when $set is a simple string or number, but a SetHash?
Array >>=>>> SetHash can never work reliably: order of keys in SetHash is indeterminate
Good to know, but I don't want it to hyper over the RHS, in any order. That's why I used the right-pointing version of the hyperop: so it would instead replicate the RHS as needed to match it up to the LHS. In this sort of expression, is there any way to say "Yo, Raku, treat this as a scalar. No, really."?
I tried an explicit Scalar wrapper (which would make the values harder to get at, but it was an experiment):
my %map = @keys »=>» $($set,)
And that got me this message:
Lists on either side of non-dwimmy hyperop of infix:«=>» are not of the same length while recursing
left: 1 elements, right: 4 elements
So it has apparently recursed into the list on the left and found a single key and is trying to map it to a set on the right which has 4 elements. Which is what I want - the key mapped to the set. But instead it's mapping it to the elements of the set, and the hyperoperator is pointing the wrong way for that combination of sizes.
So why is it recursing on the right at all? I thought a Scalar container would prevent that. The documentation says it prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
The error message says the version of the hyperoperator I'm using is "non-dwimmy", which may explain why it's not in fact doing what I mean, but is there maybe an even-less-dwimmy version that lets me be even more explicit? I still haven't gotten my brain aligned well enough with the way Raku works for it to be able to tell WIM reliably.
I'm looking for a clean way to initialize the hash.
One idiomatic option:
my %hash = @keys X=> $set;
See X metaoperator.
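One hedged caveat: X=> pairs every key with the very same $set object, whereas the loop in the question cloned it per key. If independent copies matter, a map keeps the clone:

my %same   = @keys X=> $set;                  # every value is one shared SetHash
my %cloned = @keys.map: { $_ => $set.clone }; # per-key copies, as in the original loop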
The documentation says ... a Scalar container ... prevents flattening; how is this recursion not flattening? What's the distinction being drawn?
A cat is an animal, but an animal is not necessarily a cat. Flattening may act recursively, but some operations that act recursively don't flatten. Recursive flattening stops if it sees a Scalar. But hyperoperation isn't flattening. I get where you're coming from, but this is not the real problem, or at least not a solution.
I had thought that hyperoperation had two tests controlling recursing:
Is it hyperoperating a nodal operation (eg .elems)? If so, just apply it like a parallel shallow map (so don't recurse). (The current doc quite strongly implies that nodal can only be usefully applied to a method, and only a List one (or augmentation thereof) rather than any routine that might get hyperoperated. That is much more restrictive than I was expecting, and I'm sceptical of its truth.)
Otherwise, is a value Iterable? If so, then recurse into that value. In general the value of a Scalar automatically behaves as the value it contains, and that applies here. So Scalars won't help.
A SetHash doesn't do the Iterable role. So I think this refusal to hyperoperate with it is something else.
I just searched the source and that yields two matches in the current Rakudo source, both in the Hyper module, with this one being the specific one we're dealing with:
multi method infix(List:D \left, Associative:D \right) {
    die "{left.^name} $.name {right.^name} can never work reliably..."
}
For some reason hyperoperation explicitly rejects use of Associatives on either the right or left when coupled with the other side being a List value.
Having pursued the "blame" (tracking who made what changes) I arrived at the commit "Die on Associative <<op>> Iterable" which says:
This can never work due to the random order of keys in the Associative.
This used to die before, but with a very LTA error about a Pair.new()
not finding a suitable candidate.
Perhaps this behaviour could be refined so that the determining factor is, first, whether an operand does the Iterable role, and then if it does, and is Associative, it dies, but if it isn't, it's accepted as a single item?
A search for "can never work reliably" in GH/rakudo/rakudo issues yields zero matches.
Maybe file an issue? (Update I filed "RFC: Allow use of hyperoperators with an Associative that does not do Iterable role instead of dying with "can never work reliably".)
For now we need to find some other technique to stop a non-Iterable Associative being rejected. Here I use a Capture literal:
my %hash = @keys »=>» \($set);
This yields: {a => \(SetHash.new("b","a","c")), b => \(SetHash.new("b","a","c")), ....
Adding a custom op unwraps en passant:
sub infix:« my=> » ($lhs, $rhs) { $lhs => $rhs[0] }
my %hash = @keys »my=>» \($set);
This yields the desired outcome: {a => SetHash(a b c), b => SetHash(a b c), ....
my %hash = ({ $_ => $set.clone } for @keys);
(The parens seem to be needed so it can tell that the curlies are a block instead of a Hash literal...)
No. That particular code in curlies is a Block regardless of whether it's in parens or not.
More generally, Raku code of the form {...} in term position is almost always a Block.
For an explanation of when a {...} sequence is a Hash, and how to force it to be one, see my answer to the Raku SO Is that a Hash or a Block?.
Without the parens you've written this:
my %hash = { block of code } for @keys
which attempts to iterate #keys, running the code my %hash = { block of code } for each iteration. The code fails because you can't assign a block of code to a hash.
Putting parens around the ({ block of code } for @keys) part completely alters the meaning of the code.
Now it runs the block of code for each iteration. And it concatenates the result of each run into a list of results, each of which is a Pair generated by the code $_ => $set.clone. Then, when the for iteration has completed, that resulting list of pairs is assigned, once, to my %hash.

Is there a macro for creating fast Iterators from generator-like functions in julia?

Coming from python3 to Julia one would love to be able to write fast iterators as a function with produce/yield syntax or something like that.
Julia's macros seem to suggest that one could build a macro which transforms such a "generator" function into a Julia iterator.
[It even seems like you could easily inline iterators written in function style, which is a feature the Iterators.jl package also tries to provide for its specific iterators https://github.com/JuliaCollections/Iterators.jl#the-itr-macro-for-automatic-inlining-in-for-loops ]
Just to give an example of what I have in mind:
@asiterator function myiterator(as::Array)
    b = 1
    for (a1, a2) in zip(as, as[2:end])
        try
            @produce a1[1] + a2[2] + b
        catch exc
        end
    end
end

for i in myiterator([(1,2), (3,1), 3, 4, (1,1)])
    @show i
end
where myiterator should ideally create a fast iterator with as low overhead as possible. And of course this is only one specific example. I ideally would like to have something which works with all or almost all generator functions.
The currently recommended way to transform a generator function into an iterator is via Julia's Tasks, at least to my knowledge. However, they also seem to be way slower than pure iterators. For instance, if you can express your function with simple iterators like imap, chain and so on (provided by the Iterators.jl package), this seems to be highly preferable.
Is it theoretically possible in julia to build a macro converting generator-style functions into flexible fast iterators?
Extra-Point-Question: If this is possible, could there be a generic macro which inlines such iterators?
Some iterators of this form can be written like this:
myiterator(as) = (a1[1] + a2[2] + 1 for (a1, a2) in zip(as, as[2:end]))
This code can (potentially) be inlined.
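A quick usage sketch of that definition (input values made up):

julia> collect(myiterator([(1,2), (3,4), (5,6)]))
2-element Array{Int64,1}:
  6
 10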
To fully generalize this, it is in theory possible to write a macro that converts its argument to continuation-passing style (CPS), making it possible to suspend and restart execution, giving something like an iterator. Delimited continuations are especially appropriate for this (https://en.wikipedia.org/wiki/Delimited_continuation). The result is a big nest of anonymous functions, which might be faster than Task switching, but not necessarily, since at the end of the day it needs to heap-allocate a similar amount of state.
I happen to have an example of such a transformation here (in femtolisp though, not Julia): https://github.com/JeffBezanson/femtolisp/blob/master/examples/cps.lsp
This ends with a define-generator macro that does what you describe. But I'm not sure it's worth the effort to do this for Julia.
Python-style generators – which in Julia would be closest to yielding from tasks – involve a fair amount of inherent overhead. You have to switch tasks, which is non-trivial and cannot straightforwardly be eliminated by a compiler. That's why Julia's iterators are based on functions that transform one typically immutable, simple state value into another. Long story short: no, I do not believe that this transformation can be done automatically.
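For contrast, here is a minimal sketch of that state-value style, written against the iteration protocol of Julia 1.0+ (the Fib type is made up for illustration):

struct Fib end

# Each step maps one immutable state tuple to the next; no task switch.
Base.iterate(::Fib) = (1, (1, 1))             # first element, initial state
Base.iterate(::Fib, (a, b)) = (b, (b, a + b)) # next element, next state
Base.IteratorSize(::Type{Fib}) = Base.IsInfinite()
Base.eltype(::Type{Fib}) = Int

collect(Iterators.take(Fib(), 8)) # [1, 1, 2, 3, 5, 8, 13, 21]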
After thinking a lot about how to translate Python generators to Julia without losing much performance, I implemented and tested a library of higher-level functions which implement Python-like/Task-like generators in a continuation style. https://github.com/schlichtanders/Continuables.jl
Essentially, the idea is to regard Python's yield / Julia's produce as a function which we take from the outside as an extra parameter. I called it cont, for continuation. Look, for instance, at this reimplementation of a range:
crange(n::Integer) = cont -> begin
    for i in 1:n
        cont(i)
    end
end
You can simply sum up all integers by the following code
function sum_continuable(continuable)
    a = Ref(0)
    continuable() do i
        a.x += i
    end
    a.x
end

# which simplifies with the macro Continuables.@Ref to
@Ref function sum_continuable(continuable)
    a = Ref(0)
    continuable() do i
        a += i
    end
    a
end

sum_continuable(crange(4)) # 10
As you hopefully agree, you can work with continuables almost like you would have worked with generators in python or tasks in julia. Using do notation instead of for loops is kind of the one thing you have to get used to.
This idea takes you really really far. The only standard method which is not purely implementable using this idea is zip. All the other standard higher-level tools work just like you would hope.
The performance is unbelievably better than with Tasks, and in some cases even faster than with Iterators (notably, the naive implementation of Continuables.cmap is orders of magnitude faster than Iterators.imap). Check out the Readme.md of the github repository https://github.com/schlichtanders/Continuables.jl for more details.
EDIT: To answer my own question more directly, there is no need for a macro @asiterator; just use continuation style directly.
mycontinuable(as::Array) = cont -> begin
    b = 1
    for (a1, a2) in zip(as, as[2:end])
        try
            cont(a1[1] + a2[2] + b)
        catch exc
        end
    end
end

mycontinuable([(1,2), (3,1), 3, 4, (1,1)]) do i
    @show i
end

Using pyfftw properly for speed up over numpy

I am in the midst of trying to make the leap from Matlab to numpy, but I desperately need speed in my fft's. Now I know of pyfftw, but I don't know that I am using it properly. My approach goes something like this:
import numpy as np
import pyfftw
import timeit

pyfftw.interfaces.cache.enable()

def wrapper(func, *args):
    def wrapped():
        return func(*args)
    return wrapped

def my_fft(v):
    global a
    global fft_object
    a[:] = v
    return fft_object()

def init_cond(X):
    return my_fft(2.*np.cosh(X)**(-2))

def init_cond_py(X):
    return np.fft.fft(2.*np.cosh(X)**(-2))

K = 2**16
Llx = 10.
KT = 2*K
dx = Llx/np.float64(K)
X = np.arange(-Llx,Llx,dx)

global a
global b
global fft_object
a = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
b = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
fft_object = pyfftw.FFTW(a,b)

wrapped = wrapper(init_cond, X)
print min(timeit.repeat(wrapped, repeat=100, number=1))
wrapped_two = wrapper(init_cond_py, X)
print min(timeit.repeat(wrapped_two, repeat=100, number=1))
I appreciate that there are builder functions and also standard interfaces to the scipy and numpy fft calls through pyfftw. These have all behaved very slowly though. By first creating an instance of the fft_object and then using it globally, I have been able to get speeds as fast or slightly faster than numpy's fft call.
That being said, I am working under the assumption that wisdom is implicitly being stored. Is that true? Do I need to make that explicit? If so, what is the best way to do that?
Also, I think timeit is completely opaque. Am I using it properly? Is it storing wisdom as I call repeat? Thanks in advance for any help you might be able to give.
In an interactive (ipython) session, I think the following is what you want to do (timeit is very nicely handled by ipython):
In [1]: import numpy as np
In [2]: import pyfftw
In [3]: K = 2**16
In [4]: Llx = 10.
In [5]: KT = 2*K
In [6]: dx = Llx/np.float64(K)
In [7]: X = np.arange(-Llx,Llx,dx)
In [8]: a = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
In [9]: b = pyfftw.n_byte_align_empty(KT, 16, 'complex128')
In [10]: fft_object = pyfftw.FFTW(a,b)
In [11]: a[:] = 2.*np.cosh(X)**(-2)
In [12]: timeit np.fft.fft(a)
100 loops, best of 3: 4.96 ms per loop
In [13]: timeit fft_object(a)
100 loops, best of 3: 1.56 ms per loop
In [14]: np.allclose(fft_object(a), np.fft.fft(a))
Out[14]: True
Have you read the tutorial? What don't you understand?
I would recommend using the builders interface to construct the FFTW object. Have a play with the various settings, most importantly the number of threads.
The wisdom is not stored by default. You need to extract it yourself.
All your globals are unnecessary - the objects you want to change are mutable, so you can handle them just fine. fft_object always points to the same thing, so no problem with that not being a global. Ideally, you simply don't want that loop over ii. I suggest working out how to structure your arrays in order that you can do all your operations in a single call
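To illustrate the builders and wisdom points above, a minimal sketch (the file name is made up; array sizes follow the question):

import pickle
import numpy as np
import pyfftw

a = pyfftw.n_byte_align_empty(2**17, 16, 'complex128')
a[:] = np.random.randn(2**17) + 1j*np.random.randn(2**17)

# The builders interface hands back a ready-made FFTW object;
# experiment with its settings, most importantly threads.
fft_object = pyfftw.builders.fft(a, threads=4)
result = fft_object()

# Wisdom is not stored automatically: export and reload it yourself.
with open('wisdom.pickle', 'wb') as f:
    pickle.dump(pyfftw.export_wisdom(), f)
with open('wisdom.pickle', 'rb') as f:
    pyfftw.import_wisdom(pickle.load(f))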
Edit:
[edit edit: I wrote the following paragraph with only a cursory glance at your code, and clearly with it being a recursive update, vectorising is not an obvious approach without some serious cunning. I have a few comments on your implementation at the bottom though]
I suspect your problem is a more fundamental misunderstanding of how to best use a language like Python (or indeed Matlab) for numerical processing. The core tenet is vectorise as much as possible. By this, I mean roll up your python calls to be as few as possible. I can't see how to do that with your example unfortunately (though I've only thought about it for 2 mins). If that's still failing, think about cython - though make sure you really want to go down that route (i.e. you've exhausted the other options).
Regarding the globals: Don't do it that way. If you want to create an object with state, use a class (that is what they are for) or perhaps a closure in your case. The global is almost never what you want (I think I have one at least vaguely legit use for it in all my writing of python, and that's in the cache code in pyfftw). I suggest reading this nice SO question. Matlab is a crappy language - one of the many reasons for this is its crap scoping facilities which tend to lead to bad habits.
You only need global if you want to modify a reference globally. I suggest reading a bit more about the Python scoping rules and what variables really are in python.
FFTW objects carry with them all the arrays you need, so you don't need to pass them around separately. Using the call interface carries almost no overhead (particularly if you disable the normalisation) either for setting or returning the values - if you're at that level of optimisation, I strongly suspect you've hit the limit (I'd caveat that this may not quite be true for many many very small FFTs, but at this point you want to rethink your algorithm to vectorise the calls to FFTW). If you find a substantial overhead in updating the arrays every time (using the call interface), this is a bug and you should submit it as such (and I'd be pretty surprised).
Bottom line, don't worry about updating the arrays on every call. This is almost certainly not your bottleneck, though make sure you're aware of the normalisation and disable it if you wish (it might slow things down slightly compared to raw accessing of the update_arrays() and execute() methods).
Your code makes no use of the cache. The cache is only used when you're using the interfaces code, and reduces the Python overhead in creating new FFTW objects internally. Since you're handling the FFTW object yourself, there is no reason for a cache.
The builders code is a less constrained interface to get an FFTW object. I almost always use the builders now (it's much more convenient than creating an FFTW object from scratch). The cases in which you want to create an FFTW object directly are pretty rare and I'd be interested to know what they are.
Comments on the algorithm implementation:
I'm not familiar with the algorithm you're implementing. However, I have a few comments on how you've written it at the moment.
You're computing nl_eval(wp) on every loop, but as far as I can tell that's just the same as nl_eval(w) from the previous loop, so you don't need to compute it twice (but this comes with the caveat that it's pretty hard to see what's going on when you have globals everywhere, so I might be missing something).
Don't bother with the copies in my_fft or my_ifft. Simply do fft_object(u) (2.29 ms versus 1.67 ms on my machine for the forward case). The internal array update routine makes the copy unnecessary. Also, as you've written it, you're copying twice: c[:] means "copy into the array c", and the array you're copying into c is v.copy(), i.e. a copy of v (so two copies in total).
More sensible (and probably necessary) is copying the output into holding arrays (since that avoids clobbering interim results on calls to the FFTW object), though make sure your holding arrays are properly aligned. I'm sure you've noted this is important but it's rather more understandable to copy the output.
You can move all your scalings together. The 3 in the computation of wn can be moved inside my_fft in nl_eval. You can also combine this with the normalisation constant from the ifft (and turn it off in pyfftw).
Take a look at numexpr for the basic array operations. It can offer quite a bit of speed-up over vanilla numpy.
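As a hedged illustration, numexpr applied to the kind of expression in the question:

import numpy as np
import numexpr as ne

X = np.arange(-10., 10., 10./2**16)
# Same values as 2.*np.cosh(X)**(-2), evaluated in one multithreaded pass
u = ne.evaluate("2.*cosh(X)**(-2)")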
Anyway take what you will from all that. No doubt I've missed something or said something incorrect, so please accept it with as much humility as I can offer. It's worth spending a little time working out how Python ticks compared to Matlab (in fact, just forget the latter).

If I come from an imperative programming background, how do I wrap my head around the idea of no dynamic variables to keep track of things in Haskell?

So I'm trying to teach myself Haskell. I am currently on the 11th chapter of Learn You a Haskell for Great Good and am doing the 99 Haskell Problems as well as the Project Euler Problems.
Things are going alright, but I find myself constantly doing something whenever I need to keep track of "variables". I just create another function that accepts those "variables" as parameters and recursively feed it different values depending on the situation. To illustrate with an example, here's my solution to Problem 7 of Project Euler, Find the 10001st prime:
answer :: Integer
answer = nthPrime 10001

nthPrime :: Integer -> Integer
nthPrime n
  | n < 1     = -1
  | otherwise = nthPrime' n 1 2 []

nthPrime' :: Integer -> Integer -> Integer -> [Integer] -> Integer
nthPrime' n currentIndex possiblePrime previousPrimes
  | isFactorOfAnyInThisList possiblePrime previousPrimes = nthPrime' n currentIndex theNextPossiblePrime previousPrimes
  | otherwise =
      if currentIndex == n
        then possiblePrime
        else nthPrime' n currentIndexPlusOne theNextPossiblePrime previousPrimesPlusCurrentPrime
  where currentIndexPlusOne = currentIndex + 1
        theNextPossiblePrime = nextPossiblePrime possiblePrime
        previousPrimesPlusCurrentPrime = possiblePrime : previousPrimes
I think you get the idea. Let's also just ignore the fact that this solution can be made to be more efficient, I'm aware of this.
So my question is kind of a two-part question. First, am I going about Haskell all wrong? Am I stuck in the imperative programming mindset and not embracing Haskell as I should? And if so, as I feel I am, how do I avoid this? Is there a book or source you can point me to that might help me think more Haskell-like?
Your help is much appreciated,
-Asaf
Am I stuck in the imperative programming mindset and not embracing Haskell as I should?
You are not stuck, or at least I hope not. What you are experiencing is absolutely normal. While you were working with imperative languages you learned (maybe without knowing it) to see programming problems from a very specific perspective - namely in terms of the von Neumann machine.
If you have the problem of, say, making a list that contains some sequence of numbers (let's say we want the first 1000 even numbers), you immediately think of: a linked list implementation (perhaps from the standard library of your programming language), a loop, and a variable that you'd set to a starting value; then you would loop for a while, updating the variable by adding 2 and appending its value to the end of the list.
See how you mostly think to serve the machine? Memory locations, loops, etc.!
In imperative programming, one thinks all the time about how to manipulate certain memory cells in a certain order to arrive at the solution. (This is, btw, one reason why beginners find learning (imperative) programming hard. Non-programmers are simply not used to solving problems by reducing them to a sequence of memory operations. Why should they be? But once you've learned that, you have the power - in the imperative world. For functional programming you need to unlearn that.)
In functional programming, and especially in Haskell, you merely state the construction law of the list. Because a list is a recursive data structure, this law is of course also recursive. In our case, we could, for example say the following:
constructStartingWith n = n : constructStartingWith (n+2)
And almost done! To arrive at our final list we only have to say where to start and how many we want:
result = take 1000 (constructStartingWith 0)
Note that a more general version of constructStartingWith is available in the library, it is called iterate and it takes not only the starting value but also the function that makes the next list element from the current one:
iterate f n = n : iterate f (f n)
constructStartingWith = iterate (2+) -- defined in terms of iterate
Another approach is to assume that we had another list our list could be made from easily. For example, if we had the list of the first n integers we could make it easily into the list of even integers by multiplying each element with 2. Now, the list of the first 1000 (non-negative) integers in Haskell is simply
[0..999]
And there is a function map that transforms lists by applying a given function to each argument. The function we want is to double the elements:
double n = 2*n
Hence:
result = map double [0..999]
Later you'll learn more shortcuts. For example, we don't need to define double, but can use a section: (2*). Or we could write our list directly as an arithmetic sequence: [0,2..1998].
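Collected in one place (all three produce the same list; double as defined above):

result   = map double [0..999]   -- named helper
result'  = map (2*) [0..999]     -- operator section instead of double
result'' = [0,2..1998]           -- arithmetic sequence literal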
But not knowing these tricks yet should not make you feel bad! The main challenge you are facing now is to develop a mentality where you see that the problem of constructing the list of the first 1000 even numbers is a two staged one: a) define how the list of all even numbers looks like and b) take a certain portion of that list. Once you start thinking that way you're done even if you still use hand written versions of iterate and take.
Back to the Euler problem: here we can use the top-down method (and a few basic list manipulation functions one should indeed know about: head, drop, filter, any). First, if we had the list of primes already, we could just drop the first 1000 and take the head of the rest to get the 1001st one:
result = head (drop 1000 primes)
We know that after dropping any number of elements from an infinite list, there will still remain a nonempty list to pick the head from; hence, the use of head is justified here. When you're unsure if there are more than 1000 primes, you should write something like:
result = case drop 1000 primes of
    []     -> error "The ancient greeks were wrong! There are less than 1001 primes!"
    (r:_)  -> r
Now for the hard part. Not knowing how to proceed, we could write some pseudo code:
primes = 2 : {-an infinite list of numbers that are prime-}
We know for sure that 2 is the first prime, the base case, so to speak, thus we can write it down. The unfilled part gives us something to think about. For example, the list should start at some value that is greater than 2, for obvious reasons. Hence, refined:
primes = 2 : {- something like [3..] but only the ones that are prime -}
Now, this is the point where a pattern emerges that one needs to learn to recognize. This is surely a list filtered by a predicate, namely prime-ness. (It does not matter that we don't yet know how to check prime-ness; the logical structure is the important point. And we can be sure that a test for prime-ness is possible!) This allows us to write more code:
primes = 2 : filter isPrime [3..]
See? We are almost done. In 3 steps, we have reduced a fairly complex problem in such a way that all that is left to write is a quite simple predicate.
Again, we can write in pseudocode:
isPrime n = {- false if any number in 2..n-1 divides n, otherwise true -}
and can refine that. Since this is almost Haskell already, it is too easy:
isPrime n = not (any (divides n) [2..n-1])
divides n p = n `rem` p == 0
Note that we did not do any optimization yet. For example, we can construct the list to be filtered right away to contain only odd numbers, since we know that even ones are not prime. More importantly, we want to reduce the number of candidates we have to try in isPrime. And here some mathematical knowledge is needed (the same would be true if you programmed this in C++ or Java, of course), which tells us that it suffices to check whether the n we are testing is divisible by any prime number, and that we do not need to check divisibility by prime numbers whose square is greater than n. Fortunately, we have already defined the list of prime numbers and can pick the set of candidates from there! I leave this as an exercise.
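For the record, here is one possible shape of that refinement (a sketch that spoils the exercise, so do try it yourself first): test each candidate only against earlier primes whose square does not exceed it.

primes :: [Integer]
primes = 2 : filter isPrime [3,5..] -- even numbers skipped entirely

isPrime :: Integer -> Bool
isPrime n = not (any (\p -> n `rem` p == 0) (takeWhile (\p -> p*p <= n) primes))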
You'll learn later how to use the standard library and syntactic sugar like sections, list comprehensions, etc., and you will gradually stop writing your own versions of basic functions.
Even later, when you have to do something in an imperative programming language again, you'll find it very hard to live without infinite lists, higher order functions, immutable data etc.
This will be as hard as going back from C to Assembler.
Have fun!
It's ok to have an imperative mindset at first. With time you will get more used to things and start seeing the places where you can have more functional programs. Practice makes perfect.
As for working with mutable variables, you can kind of keep them for now if you follow the rule of thumb of converting variables into function parameters and iteration into tail recursion.
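A minimal sketch of that rule of thumb (names invented for illustration): the mutable accumulator of an imperative loop becomes a parameter of a helper function.

-- Imperative pseudocode: total = 0; for x in xs: total += x
sumLoop :: [Integer] -> Integer
sumLoop = go 0
  where
    go acc []     = acc              -- loop exit: return the accumulator
    go acc (x:xs) = go (acc + x) xs  -- loop body: update and tail-recurse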
Off the top of my head:
Typeclassopedia. The official v1 of the document is a pdf, but the author has moved his v2 efforts to the Haskell wiki.
What is a monad? This SO Q&A is the best reference I can find.
What is a Monad Transformer? Monad Transformers Step by Step.
Learn from masters: Good Haskell source to read and learn from.
More advanced topics such as GADTs. There's a video, which does a great job explaining it.
And last but not least, the #haskell IRC channel. Nothing comes close to talking to real people.
I think the big change from your code to more Haskell-like code is using higher order functions, pattern matching and laziness better. For example, you could write the nthPrime function like this (using a similar algorithm to what you did, again ignoring efficiency):
nthPrime n = primes !! (n - 1)
  where
    primes = filter isPrime [2..]
    isPrime p = isPrime' p [2..p - 1]
    isPrime' p [] = True
    isPrime' p (x:xs)
      | (p `mod` x == 0) = False
      | otherwise        = isPrime' p xs
E.g. nthPrime 4 returns 7. A few things to note:
The isPrime' function uses pattern matching to implement the function, rather than relying on if statements.
the primes value is an infinite list of all primes. Since Haskell is lazy, this is perfectly acceptable.
filter is used rather than reimplementing that behaviour using recursion.
With more experience you will find yourself writing more idiomatic Haskell code - it sort of happens automatically with experience. So don't worry about it; just keep practicing and reading other people's code.
Another approach, just for variety! Strong use of laziness...
module Main where

nonmults :: Int -> Int -> [Int] -> [Int]
nonmults n next [] = []
nonmults n next l@(x:xs)
  | x < next  = x : nonmults n next xs
  | x == next = nonmults n (next + n) xs
  | otherwise = nonmults n (next + n) l

select_primes :: [Int] -> [Int]
select_primes [] = []
select_primes (x:xs) =
  x : (select_primes $ nonmults x (x + x) xs)

main :: IO ()
main = do
  let primes = select_primes [2 ..]
  putStrLn $ show $ primes !! 10000 -- the first prime is index 0 ...
I want to try to answer your question without using ANY functional programming or math, not because I don't think you will understand it, but because your question is very common and maybe others will benefit from the mindset I will try to describe. I'll preface this by saying I am not a Haskell expert by any means, but I have gotten past the mental block you have described by realizing the following:
1. Haskell is simple
Haskell, and other functional languages that I'm not so familiar with, are certainly very different from your 'normal' languages, like C, Java, Python, etc. Unfortunately, the way our psyche works, humans prematurely conclude that if something is different, then A) they don't understand it, and B) it's more complicated than what they already know. If we look at Haskell very objectively, we will see that these two conjectures are totally false:
"But I don't understand it :("
Actually you do. Everything in Haskell and other functional languages is defined in terms of logic and patterns. If you can answer a question as simple as "If all Meeps are Moops, and all Moops are Moors, are all Meeps Moors?", then you could probably write the Haskell Prelude yourself. To further support this point, consider that Haskell lists are defined in Haskell terms, and are not special voodoo magic.
"But it's complicated"
It's actually the opposite. Its simplicity is so naked and bare that our brains have trouble figuring out what to do with it at first. Compared to other languages, Haskell actually has considerably fewer "features" and much less syntax. When you read through Haskell code, you'll notice that almost all the function definitions look the same stylistically. This is very different from, say, Java, which has constructs like classes, interfaces, for loops, try/catch blocks, anonymous functions, etc., each with their own syntax and idioms.
You mentioned $ and ., again, just remember they are defined just like any other Haskell function and don't necessarily ever need to be used. However, if you didn't have these available to you, over time, you would likely implement these functions yourself when you notice how convenient they can be.
2. There is no Haskell version of anything
This is actually a great thing, because in Haskell, we have the freedom to define things exactly how we want them. Most other languages provide building blocks that people string together into a program. Haskell leaves it up to you to first define what a building block is, before building with it.
Many beginners ask questions like "How do I do a for loop in Haskell?" and innocent people who are just trying to help will give an unfortunate answer, probably involving a helper function, an extra Int parameter, and tail recursion until you get to 0. Sure, this construct can compute something like a for loop, but in no way is it a for loop, it's not a replacement for a for loop, and in no way is it really even similar to a for loop if you consider the flow of execution. Similar is the State monad for simulating state. It can be used to accomplish similar things as static variables do in other languages, but in no way is it the same thing. Most people leave off the last tidbit about it not being the same when they answer these kinds of questions, and I think that only confuses people more until they realize it on their own.
3. Haskell is a logic engine, not a programming language
This is probably the least true of the points I'm trying to make, but hear me out. In imperative programming languages, we are concerned with making our machines do stuff, perform actions, change state, and so on. In Haskell, we try to define what things are and how they are supposed to behave. We are usually not concerned with what something is doing at any particular time. This certainly has benefits and drawbacks, but that's just how it is. This is very different from what most people think of when you say "programming language".
So that's my take on how to leave an imperative mindset and move to a more functional one. Realizing how sensible Haskell is will help you not look at your own code funny anymore. Hopefully thinking about Haskell in these ways will help you become a more productive Haskeller.