Pascal-style arrays, built-in len() function vs .length? What are the pros/cons - language-design

What are the differences between these two? Why would you pick one over the other, is it just personal preference, or is there an actual reason behind why you would use either a built-in function or whatever .length is.

I think using *.length over *.length() or len(*) is kind of a historical artifact, which was probably done to make getting the length of an array as fast as possible. Arrays after all, are a very basic data structure in many languages, and getting the length of one is an extremely common operation. And accessing a property is much faster than calling a method.
Nowadays a compiler could probably optimize that kind of thing out, but back then I think there was a pull towards ease-of-implementation which guided many languages to simply have *.length as a property.
However, in any OOP language at least, it's more consistent to have *.length(), because while arrays have immutable lengths, and can afford to have *.length exposed as a constant value, other data structures which you can add or remove values would not be able to do this.

Related

Whats the purpose of using method overloading?

I want to know the exact reason why the method overloading is done in OOP without using different method names to every variation as it was asked at an interview. Please help me to understand this concept.
Without using any fancy terms, let's say you're building an API, and there's a method called crush which let's say crushes or destroys whatever parameter is given to it. If you follow your way, you'll have to use atleast three different methods, each for an int, float and char (I'm using the most general types as an example). Now the more types there are, the more methods you'll have to create with that so many different names. Therefore the developer using your API, is going to have to remember so many different names for something as simple as a method that destroys its parameter. As much as it's difficult, it's also much less readable because again, remembering too many names for a singular function (function as in job).
Method overloading isn't used for everything, it's intended to be used for methods or functions that might take different types of data at different points, but internally follow a constant procedure or does a singular thing no matter what type of data it's passed.
You won't be writing one version of print that takes an int as a parameter, and returns the modulus of that, and another version of print that takes a string as an argument, and prints that to stdout. You can, but that's not how it's meant to be used.
It is mainly so as to be able to follow a relatively well-known software design principle called "Syntactic Consistency" from the book "Principles of Programming Languages" by Bruce J. MacLennan, which says the following:
Similar things should look similar, whereas
different things should look different.
When you see two functions with different names, you might be tempted to believe that they do different things. If they do in fact do different things, it is okay, but what if they do the same thing? In that case, it would be nice if the functions have the exact same name, so as to indicate that they do, in fact, do the same thing.
Of course you can misuse overloading. If you go around writing functions that do different things, taking advantage of overloading to give them the same name, then you are shooting yourself in the foot.

What is the exhaustive list of guidelines/practices/rules to fully conform with functional paradigm?

I've started playing around with Kotlin, but I sense my own limitation in the way I program. My problem is that I still think Java therefore the style is still imperative, my question is to all functional programming zealots , which I believe would be very useful to all people who at the very beginning stage and also need to 'brake' their brain to start building it again; to leave comfort zone and start thinking pseudo and not in "whatever is your first language". I believe it is possible for highly experienced polyglot developers to chew the concepts down to plain advices of what makes your program being written in entirely functional way and what violates the paradigm. I don't know all the quirks but please don't hesitate to include universally accepted terms which might be unknown to me(I can always lookup). At this point I need this set of rules to make myself suffer at first and not break them but then I know I will feel it, analyze guidelines and understand how they are worse/better which of course is my own homework.
So example of these guidelines, would be something like:
Never change state, this can be avoided by using x, y, z
Operate using higher order functions only (I maybe wrong, just example)
I hope the answer will give me long term reference to put myself in extreme conditions where I stop escaping to OOP whenever I feel uncomfortable. And now when I look at Kotlin I understand how I've should've been thinking about problems, it is about intention not about the structure imposed by one language or another. Intention can always be converted to a language of your choice and backed up by design patterns applicable to the language, but to find that middle ground I need to jail myself first from the comfort zone.
Avoid mutable state like the plague.
One of the main points of using functional programming, possibly the main one, is to avoid all the little pitfalls, bugs, issues one needs to deal with when using mutable state. You should do everything you can in order to avoid mutating state. For instance, instead of using C-style for-loops where you need to keep a counter variable updated, use map and other higher-order functions in order to abstract away your iteration patterns. This also means that you should never change the value of a variable if you can avoid that. Instead, you should be defining almost all of your variables, preferrably all of them, as constants, and using functions to compute new values from them instead of mutating them.
Avoid side-effects like the plague.
Mutable state's ugly cousin, side-effects. Side effects mean anything other than taking a value and returning a value in a function. If that function prints data, mutates global variables, sends messages to threads, or anything, anything other than simply taking its parameters, computing a value from them, and returning a value, that function has side-effects. Side-effects are important (see next bullet point), but if you use them a lot, they get impossible to track. Just think of how everyone tells you to avoid global variables in imperative programming. Functional programming goes a step further and tries to avoid all side-effects. The bulk of your program should be made of pure functions. (See ahead)
When you need to use side-effects, keep them contained.
Yes, I just told you to run away from side-effects. However, no program is useful without side-effects of some kind. Graphical User Interface? Side-effect. Audio output? Side-effect. Printing to a shell? Side-effect. So you can't really get rid of side-effects if you want to build useful stuff.
What you should do instead is write your code so that all your side-effecting code lives in a thin layer which mostly calls pure functions and then does the required side-effects using the result of these pure function calls.
Use pure functions for everything you can.
This is sort of the flipside of the previous point. A pure function is a function which has no side-effects and does not mutate anything. It can only take in parameters and return a value. You should use these a lot. For instance, instead of doing your logging within functions which are computing stuff, you should be constructing your log strings using pure functions, and then letting your side-effects layer call these pure functions, call more pure functions in order to format the log strings into a full log, and then output the log itself from your side-effects layer.
Use higher-order functions to structure your code.
Higher-order functions are, in a way, the glue that makes functional programming work. A higher-order function is a function which takes one or more functions as parameters and/or returns a function. The power of higher-order functions is that they can encapsulate many of the patterns which you would use in an imperative-style program in a declarative manner. For instance, let's take a look at the three most common higher-order functions:
map is a function which takes a function and a list of values, applies its function argument to each of those values, and returns a new list with the results. map encapsulates the whole pattern of iterating over a list doing an operation on each value in a declarative manner.
filter is a function which takes a function which returns a boolean and a list of values, applies its function argument to each of those values and returns a list containing only those values for which its function argument returns true. It encapsulates the whole pattern of selecting results from a list in a declarative manner.
reduce, also known as fold, takes an initial value, a binary function and a list of values. It uses its function argument to combine the initial value with the first value of the list, then combines the result with the next value of the list and keeps on doing this until it has reduced the list to just one single value. It encapsulates the entire pattern of obtaining an aggregate value from a list of values.
This is in no way an exhaustive list of higher-order functions, but these three are the most common ones. I hope this has been enough to show how you can structure code which would require a lot of tracking variables using only functions in a declarative manner. If you use these higher-order functions well, it's likely you won't ever need a for or while loop again.
This is definitely not an exhaustive list of functional programming practices, but I think most functional programmers would agree these five guidelines form the core of what functional programming is about. If you want to really learn how to apply these, my advice would be to learn a pure functional programming language such as Haskell, so you are forced to abandon the imperative paradigm and to learn how to structure things functionally instead. I would recommend the fantastic Haskell Programming from First Principles as a starting resource if you choose to go this way. In case you don't want to/can't put down the cash, Brent Yorgey's Haskell course at UPenn is also a great free resource.

Scala immutable vs mutable. What is the way one should go?

I'm just learning to program in scala.
I have some experience in functional programming, as I have in object oriented programming.
My question is kind of simple, yet tricky:
Which structures should be used in Scala? Should we only stick to immutables, eg. modifing lists by iterating through it and stick a new one together, or go for mutables? What is your opinion on that, what are the performance aspects, memory related aspects, ...
I'm likely to program in a functional style, but it often expands to an insane amount of effort to do things which are easily done by using mutables. Is it situation dependent, what to use?
Prefer immutable to mutable state. Use mutable state only where it is absolutely necessary. Some notable reasons include:
Performance. The standard libraries make wide use of vars and while loops, even though this is not idiomatic Scala. This should not be emulated, however, except for cases where you have profiled to determine that modifying the code to be more imperative will bring a significant performance gain.
I/O. I/O, or interacting with the outside world is inherently state dependent, and thus must be dealt with in a mutable manner.
This is no different than the recommended coding style found in all major languages, imperative or functional. For example, in Java it is preferable to use data objects with only private final fields. Code written in an immutable (and functional) way is inherently easier to understand because when one sees a val, they know it will never change, reducing the possible number of states any particular object or function can be in.
In many cases, it also allows automatic parallel execution, for example, collection classes in Scala all have a par function, which will return a parallel collection that automatically run the calls to functions like map or reduce in parallel.
(I thought this must be a duplicate but couldn't easily find an earlier similar one, so I venture to answer...)
There is no general answer to this question. The rule of thumb suggested by the creators of Scala is to start with immutable vals and structures and stick to them as long as it makes sense. You can almost always create a workable solution to your problem this way. But if not, of course be pragmatic and use mutability.
Once you have a solution, you can tweak it, test it, measure its performance etc. If you find that e.g. it is too slow or overly complex, identify the critical part of it, understand what makes it problematic and - if needed - reimplement it using mutable variables, ideally keeping it isolated from the rest of the program. Note though that in many cases, a better solution can be found from within the immutable realm as well, so try looking there first. Especially for a beginner like myself, it still happens regularly that the best solution I could come up with looked contorted and complex with no apparent way to improve it - until seeing a simple and elegant solution to the same problem in a few lines of code, created by an experienced Scala developer who controls more of the power of the language and its libraries.
I usually obey the following rules:
Never use static mutable vars
Keep all user defined data types (typically case classes) immutable unless they are very expensive to copy. This will simplify a lot of the application logic.
If a data structure/collection is inherently mutable (i.e. it's designed to change over time), using a mutable data structure/collection might be appropriate. An example might be a large game world that is updated when players move. Remember to (almost) never share these data structures between threads though.
It's fine to use mutable local vars in methods
Use immutable collections for function results. These can be strictly or lazily evaluated depending on what gives best performance in the used context. Be careful if you use a lazily evaluated result which depends on a mutable collection though.

Does over-using function calls affect performance? Specifically in Fortran

I habitually write code with lots of functions, I find it makes it clearer. But now I'm writing some code in Fortran which needs to be very efficient, and I'm wondering whether over-using functions will slow it down, or whether the compiler will work out what's going on and optimise?
I know in Java/Python etc each function is an object, and so creating lots of functions would require them to be created in memory. I also know that in Haskell the functions are reduced into each other, so it makes little difference there.
Does anyone know about the case with Fortran? Is there a difference with using intent/pure functions/declaring fewer local variables/anything else?
Function calls carry a performance cost for stack based languages, like Fortran. They have to add on to the stack and such.
Because of this, most compilers will try to inline function calls aggressively, if it is possible. Most of the time the compiler will make the right choice on whether or not to inline certain functions in your program.
This automatic inlining process will mean that there is no extra cost for writing your function (at all).
This means that you should write your code as cleanly and organized as possible, and it is likely that the compiler will do these optimizations for you. It is more important that your overall strategy for solving the problem is the most efficient than worry about performance of function calls.
Just write the code in the simplest and most well-structured way you can, then when you have it written and tested you can profile it to see if there are any hotspots which require optimisation. Only at that point should you concern yourself with micro-optimisations, and this may not even be necessary if your compiler is doing its job.
I've just spent all morning tuning an app consisting of mixed C and Fortran, and of course it uses a lot of functions. What I found (and what I usually find) is not that functions are slow, but that certain function calls (and very few of them) don't really have to be done at all. For example, clearing memory blocks, just to be neat, but doing it at high frequency.
This is not a function of language, and not really a function of inlining either. Function calls could be free and you would still have the problem that the call tree tends to be more bushy than necessary. You need to find out where to prune it. This is the "profiling" method I rely on.
Whatever you do, find out what needs to be fixed. Don't Guess. Many people don't think of this kind of question as guessing, but when they find themselves asking "Will this work, will that help?", they're poking in the dark, rather than finding out where the problems are. Once they know where the problems are, the fix is obvious.
Typically subroutine / function calls in Fortran will have very little overhead. While the language standard doesn't specify argument passing mechanisms, the typical implementation is "by reference" so no copying is involved, only setting up a new procedure. On most modern architectures this has little overhead. Selecting good algorithms is generally far more important than micro-optimizations.
An exception about calling be quick could be case in which the compiler has to create temporary arrays, for example, if the actual argument is a non-contiguous array subsection and the called procedure argument is a plain contiguous array. Suppose that the dummy argument is dimension (:). Calling it with an array of dimension (:) is simple. If you request a non-unit stride in the call, e.g., array (1:12:3), then the array is non-contiguous and the compiler may need to create a temporary copy. Suppose that the actual argument is dimension (:,:). If the call has array (:,j), the sub-array is contiguous since in Fortran the first index varies fastest in memory and shouldn't need a copy. But array (i,:) is non-contiguous and might require a temporary copy.
Some compilers have options to warn you when temporary array copies are needed so that you can change your code, if you wish.

methods: multiple parameters or structure?

I noticed by looking at sample code from Apple, that they tend to design methods that receive structures instead of multiple parameters. Why is that? As far as ease of use, I personally prefer the latter, but as far as performance goes, is there one better choice than the other?
[pencil drawPoint:Point3Make(20,40,60)]
[pencil drawPointAtX:20 Y:50 Z:60]
Don't muddle this question with concerns of performance. Don't make premature optimizations (until you know you have a problem) and when thinking about performance hot spots in your code, its almost always in areas dealing with I/O (eg, database, files). So, separate your question on message passing style with performance. You want to make the best design decision first, then optimize for performance only if needed.
With that being said, Apple does not recommend or prefer passing multiple parameters vs a structure/object. Generalizing this outside of the scope of Objective-C, use individuals parameters or objects when it makes sense in the particular scenario. In other words, there isn't a black and white answer that you can follow. Instead, use the following guidelines when deciding:
Pass objects/structures when it makes sense for the method to understand many/all members of the object
Pass objects/structures when you want to validate some rules on the relationship between the various members of the object. This allows you to ensure the consumer of your method constructs a valid object prior to calling your method (thus eliminating the need of the method to validate these conditions).
Pass individual arguments when it is clear the method makes sense and only needs certain elements rather than the entire object
Using a variation on your example, a paint method that takes two coordinates (X and Y) would benefit from taking a Point object rather than two variables, X and Y.
A method retrieveOrderByIdAndName would best be designed by taking the single id and name parameter rather than some container object.
Now, if there was some method to retrieve orders by many different criterion, it would make more send to create a retrieveOrderByCriteria and pass it some criteria structure.
If you are passing the same set of parameters around it is useful to pass them in a structure because they belong together semantically.
The performance hit is probably negligible for such a simple structure as 3 points. Use the readable/reusable solution and then profile your code if you think it is slow :)