modular block design - module

I have two separate functions, bar() and foo(). The execution flow of my program is supposed to be as follows:
input -> bar() -> foo() -> output
Currently, a teammate of mine made a foo() call inside the bar() function, which breaks the modular design. While it's better from a modular-design perspective to wrap the bar() and foo() calls in a wrapper function called, for example, procedure(), would it cost any performance by adding the overhead of an extra function call on the program stack? I plan to encapsulate the calls as follows:
procedure(inputs)
{
    bar();
    foo();
}
Thanks in advance for the advice.

The overhead of a function call (especially with few local variables) is so small that it isn't even worth considering. Plus, a good compiler would inline calls where appropriate. I have always believed that good engineering should be the highest priority, since the overall efficiency gained from good design is often better than optimizing small things and leaving the overall design a spaghetti mess.
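As a rough sketch of what that wrapper might look like (Kotlin here, with made-up bar()/foo() bodies, since the question doesn't show them):

// Hypothetical stage implementations; the real bar()/foo() will differ.
fun bar(input: Int): Int = input * 2                  // first stage: input -> intermediate
fun foo(intermediate: Int): Int = intermediate + 1    // second stage: intermediate -> output

// Thin wrapper that preserves the input -> bar() -> foo() -> output flow.
// A JIT or optimizing compiler is free to inline this call, so the extra
// stack frame is effectively free.
fun procedure(input: Int): Int = foo(bar(input))

fun main() {
    println(procedure(20))  // prints 41
}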

Related

What is the usage scenario for overloading the invoke operator in Kotlin?

Why not use top-level functions but overloaded invoke operators?
Is there any advantage to overloading the invoke operator?
class GetFollowableTopicsUseCase @Inject constructor(
    private val topicsRepository: TopicsRepository,
    private val userDataRepository: UserDataRepository
) {
    operator fun invoke(sortBy: TopicSortField = NONE): Flow<List<FollowableTopic>>
    ...
}
There's an old pair of sayings that floats around programming communities like these.
Closures are a poor man's objects, and objects are a poor man's closures.
The fact of the matter is that, in a sufficiently modern language like Kotlin (or like most languages that we use nowadays), objects and closures are pretty similar. You could replace every class in your Kotlin program with a mess of functions and mutable variables they close over. Likewise, you could replace every function in your program with an object that has an invoke function. But the former would be a constant wrestling match with the type system, and the latter would be absurdly verbose.
So Kotlin lets us do both. Which should you use? The advantage of functions is that they're short and snappy and to-the-point. And, to a functional programmer at least, functions should generally be free of side-effects. Objects, on the other hand, are loud and verbose. That's a bad thing, in that it takes longer to read and comprehend when skimming the code. But it's also a good thing, because it stops you from hiding complexity.
So if your function is simple, use a function. If it's complicated or stateful, use a named object and document it like any public class. As a few examples, here's how I would handle some different situations.
A function to add two numbers together is simple, side-effect-free, and referentially transparent. Use a function.
A function to add a number to a local val is still very simple. It's a closure, but the val is immutable, so the function's behavior is predictable. Using an object would be overkill, so make it a function.
A function that keeps track of how many times it's been called and prints out that number each time has side effects and local state. While it could be written as a fancy closure around a var, it would be better to make this a real object, whose counter variable is a genuine instance variable, so that anyone reading the code can see at a glance what's happening.
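To make that last example concrete, here is a rough Kotlin sketch (names are illustrative) of the closure version next to the object version:

// Closure version: the counter lives in a captured var, invisible to readers of the type.
fun makeCounter(): () -> Unit {
    var count = 0
    return {
        count++
        println("Called $count times")
    }
}

// Object version: the same state, but as a named instance variable anyone can see.
class CallCounter {
    private var count = 0
    operator fun invoke() {
        count++
        println("Called $count times")
    }
}

fun main() {
    val counter = CallCounter()
    counter()  // Called 1 times
    counter()  // Called 2 times
}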
In addition to Silvio's general answer, one specific case is for factory methods.
If you define a factory method like this:
class MyClass(…) {
    …
    companion object {
        operator fun invoke(…): MyClass = …
    }
}
Then calling the factory method looks exactly like calling a constructor: MyClass(…). This makes factory methods, with all their advantages, easier to use and hence more likely to be adopted.
(Obviously, this only makes sense when the parameter type(s) clearly distinguish the factory method from any public constructors, and also clearly indicate its purpose. In other cases, named factory methods are preferable.)
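As a small, hypothetical illustration (the Point class and its parsing factory are made up, not taken from the answer above):

class Point private constructor(val x: Int, val y: Int) {
    companion object {
        // Factory method: the call site reads exactly like a constructor call.
        operator fun invoke(text: String): Point {
            val (x, y) = text.split(",").map { it.trim().toInt() }
            return Point(x, y)
        }
    }
}

fun main() {
    val p = Point("3, 4")   // looks like a constructor, but goes through the factory
    println("x=${p.x}, y=${p.y}")
}

Here the String parameter clearly distinguishes the factory from the constructor, in line with the caveat above.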

Why are CoroutineScope.launch and CoroutineScope.async extension functions instead of member functions of CoroutineScope?

The title states my question.
What exactly is the reason why CoroutineScope.launch and CoroutineScope.async are just extension functions of CoroutineScope instead of member functions?
What benefits does it provide?
I am asking because maybe the reason behind this design could be helpful in designing things in the future too.
Thanks in advance.
Mostly because with extension functions it is easier to structure your code across multiple modules even though it is presented as one class.
CoroutineScope is actually a really good example of this design pattern. Take a look at CoroutineScope.kt, where the interface is declared. There is only basic functionality there (the plus operator and cancel()).
The two functions you mentioned are defined in Builders.common.kt. If you take a look at the contents of this file, you can see that there are multiple classes which are private, meaning they can only be used in this file. This tells you right away that you don't need these classes for the basic functionality defined in CoroutineScope.kt; they are only there for launch {...} and async {...}.
So if you have a large class with a lot of functionality, it makes sense to break it up into multiple files (= modules).
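For example, nothing stops your own module from adding a builder in exactly the same way; this is a hypothetical sketch (launchLogged is made up), using only the public CoroutineScope API:

import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// A made-up builder living in our own file/module: like launch and async,
// it only needs the public CoroutineScope API, not the interface itself.
fun CoroutineScope.launchLogged(name: String, block: suspend CoroutineScope.() -> Unit): Job =
    launch {
        println("[$name] started")
        block()
        println("[$name] finished")
    }

fun main() = runBlocking<Unit> {
    launchLogged("demo") { println("doing work") }
}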
launch and async are coroutine builders, but they aren't the only ones: look in integration modules for future (and another future), publish, the RxJava 2 builders etc. Obviously those can't be members of CoroutineScope, so why should launch and async be?
In addition, by being extension functions you know they don't rely on any CoroutineScope privates (well, they could rely on internals since they are in the same module).
kotlinx.coroutines uses the structured concurrency approach to make sure all errors are propagated to the parent coroutine. Similarly, a parent coroutine will by default wait for all its child coroutines to complete.
There is a Job object associated with every coroutine created by launch or async. It is simply easier to make that design work implicitly with extension functions, without requiring explicit attention from the person writing the code.
You may have a look at the more detailed explanations:
https://kotlinlang.org/docs/reference/coroutines/basics.html#structured-concurrency
https://medium.com/@elizarov/structured-concurrency-722d765aa952
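A minimal sketch of that behaviour (standard kotlinx.coroutines API, nothing project-specific): the parent scope does not complete until its child does.

import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    launch {            // the child's Job implicitly becomes a child of runBlocking's Job
        delay(100)
        println("child done")
    }
    println("parent: will not exit until the child finishes")
}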

Frege pure functions and performance optimizations

My understanding of Haskell's pure functions is that they enable performance optimizations like caching (because a pure function returns the same result for the same input every time). What performance optimizations occur for Frege's pure functions?
Certainly not caching. I'm not aware of any language that would do this automatically, and for good reasons.
What we do currently is inlining, beta-reduction and elimination of certain value constructions and deconstructions. For example, when you have:
case (\a -> (Just a, Just a)) 42 of (Just b, Just c) -> [c,b]
the compiler just generates code to construct the list
[ 42, 42 ]
This may not look very useful at first sight, since certainly nobody would write such bloated code. However, consider that the lambda expression may be the result of inlining some other function. In fact, in highly abstract code like monadic code, the expansion of the (>>=) operator often leads to code that can be optimized in this way.
While inlining and beta-reduction are good in some cases, one must take care not to overdo them, lest one get code bloat. Especially in the JVM environment, it is a disadvantage to have huge functions (that is, methods). The JIT can and will do a great job for small methods.

Alternative to polymorphism in non-OOP programming?

Assume we have a drawing program with different elements, such as circles, rectangles, triangles and so on: different kinds of objects that all need a similar function, such as draw(), to display themselves.
I wonder how a programmer would approach the problem that is nowadays typically solved by polymorphism, i.e. going through a collection of non-identical elements and invoking common functionality across the different objects.
One way that comes to mind is to have a struct with a function pointer to the appropriate function (or an index into a function pointer array) as well as a void pointer to the actual instance, and to cast the pointer to the proper type inside the function. But that is just how I, a guy who is clueless on the subject, would do it.
I do realize this might be a noobish question, but since I haven't been around in the "olden" days, I really wonder how this problem was tackled. What kind of approach was used in procedural programming, and did it have a performance benefit? After all, we know polymorphism has an overhead even in fast languages like C++, due to the virtual method lookup.
A really simple example.
If this interests you, you can find more of this in the Linux kernel.
#include <stdio.h>

struct shape {
    void (*say_hello)(void);
};

void circle_say_hello(void)
{
    printf("Hi I am circle!\n");
}

void square_say_hello(void)
{
    printf("Meh I am square.\n");
}

#define ARRAY_SIZE(a) (sizeof(a)/sizeof(a[0]))

int main(int argc, char *argv[])
{
    struct shape circle = { .say_hello = circle_say_hello, };
    struct shape square = { .say_hello = square_say_hello, };
    struct shape *shapes[] = { &circle, &square };
    int i;

    for (i = 0; i < ARRAY_SIZE(shapes); i++) {
        if (shapes[i] && shapes[i]->say_hello)
            shapes[i]->say_hello();
    }

    return 0;
}
In procedural languages such as C, this would be tackled by defining separate implementations of the draw() function for each custom data type (probably represented as a struct). Any common functionality would be factored out into a separate function which operated on shared elements of each struct (such as the x and y coordinate of the center of the object, which would appear in each one). From a code and functionality perspective, this isn't much different from the OOP layout utilizing polymorphism, where you still have to implement a shared draw() method in a base class and override it in the specific sub-class. In the case of a procedural language, we just wouldn't split these function definitions out into separate "objects".
There are some fancy ways to get object-like behavior out of a procedural language, such as a union type or a single monolithic type with extra booleans to determine if a particular element is in use. That would allow you to write a single draw() function that could perform logic switching based on which elements were enabled. In practice, the only place I have seen much of that is in CORBA-based systems where a program written in C had to mimic some of the behavior of an OOP language which was propagated through the IDL (i.e. translation of Java objects to constructs which could be decoded into C-style structs).
As for the overhead of virtual method lookup in languages such as C++ and Java, that is something that cannot be entirely avoided in an object-oriented language. It can be pretty well mitigated with proper use of the final keyword (which allows the compiler / JVM to optimize the method lookup tables).
This is not a direct answer to your example, but it addresses your comment, which shows a wrong perspective IMHO:
I was just wondering about that particular problem, mostly interested
if there is a more efficient way that avoids the performance overhead
of virtual methods
There is something to understand here. Everything has a tradeoff. Design patterns and OO have all the known advantages we have come to love, but have disadvantages as well e.g. too many classes, memory overhead, performance overhead due to many method calls etc.
On the other hand, to be objective, the old "procedural" way had some advantages as well: it was "simple" to code (no need to think about how to design a system, just put everything in main) and had less overhead in many respects: less memory overhead, since fewer classes are needed and objects are more compact (no need for virtual tables etc.), and fewer method calls, so perhaps better performance and no overhead for dynamic binding (whatever that overhead amounts to nowadays anyway...).
But what matters is not the trade-offs of a particular problem instance; it is what experience has shown to be the proper way to build software. Code that is reusable, modular, separately testable (quality assurance), readable, maintainable, and flexible to extend has the attributes that are well understood to be the main drivers in software development.
So there are certain cases where a really good C/C++ programmer could do things the "old way", as you say, but is the performance benefit gained for this particular program worth the fact that no one would be able to maintain or extend it afterwards?
To give another similar example, you could ask in the same fashion:
Why multi-tier architectures in web development? Just put everything into one server and it will be A LOT FASTER since there will be no latency in querying the back-end and all the layers for the data of the UI or the network latency for a query of a remote database etc.
Sure, you have a point. But then ask yourself: can this scale as the load increases? The answer is no. So is scalability important to you, or do you want to keep the "put everything in one server" idea? If your income comes from e-commerce sites, the fact that you cannot serve more customers would not make your client happy just because you served the first 100 really fast... Anyway, this is my opinion.

What is high cohesion and how do I use / achieve it?

I'm learning computer programming, and in several places I've stumbled upon the concept of cohesion. I understand that it is desirable for software to have "high cohesion", but what does it mean? I'm a Java, C, and Python programmer learning C++ from the book C++ Primer, which mentions cohesion without having it in the index. Could you point me to some links about this topic? I did not find the Wikipedia page about cohesion in computer science informative, since it just says it's a qualitative measure and doesn't give real code examples.
High cohesion is when you have a class that does a well defined job. Low cohesion is when a class does a lot of jobs that don't have much in common.
Let's take this example:
You have a class that adds two numbers, but the same class creates a window displaying the result. This is a class with low cohesion, because the window and the adding operation don't have much in common. The window is the visual part of the program and the adding function is the logic behind it.
To create a highly cohesive solution, you would have to create a class Window and a class Sum. The window will call Sum's method to get the result and display it. This way you develop the logic and the GUI of your application separately.
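A rough Kotlin sketch of that split (class and method names are illustrative):

// Logic: knows only how to add two numbers.
class Sum {
    fun add(a: Int, b: Int): Int = a + b
}

// Presentation: knows only how to display; it asks Sum for the result.
class Window(private val sum: Sum) {
    fun showResult(a: Int, b: Int) {
        println("Result: ${sum.add(a, b)}")  // stand-in for real GUI code
    }
}

fun main() {
    Window(Sum()).showResult(2, 3)  // Result: 5
}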
An explanation of what it is from Steve McConnell's Code Complete:
Cohesion refers to how closely all the routines in a class or all the
code in a routine support a central purpose. Classes that contain
strongly related functionality are described as having strong
cohesion, and the heuristic goal is to make cohesion as strong as
possible. Cohesion is a useful tool for managing complexity because
the more code in a class supports a central purpose, the more easily
your brain can remember everything the code does.
Some way of achieving it from Uncle Bob's Clean Code:
Classes should have a small number of instance variables. Each of the
methods of a class should manipulate one or more of those variables.
In general the more variables a method manipulates the more cohesive
that method is to its class. A class in which each variable is used by
each method is maximally cohesive.
In general it is neither advisable
nor possible to create such maximally cohesive classes; on the other
hand, we would like cohesion to be high. When cohesion is high, it
means that the methods and variables of the class are co-dependent and
hang together as a logical whole.
The notion of cohesion is strongly related to the notion of coupling; there is also a principle based on the heuristic of high cohesion, named the Single Responsibility Principle (the S in SOLID).
High cohesion is a software engineering concept. Basically, it says a class should only do what it is supposed to do, and do it fully. Do not overload it with functions that it is not supposed to do, and whatever is directly related to it should not appear in the code of some other class either.
Examples are quite subjective, since we also have to consider the scale. A simple program should not be too modularized or it will be fragmented, while a complex program may need more levels of abstraction to take care of the complexity.
E.g. an Email class. It should contain data members such as to, from, cc, bcc, subject, and body, and may contain the methods saveAsDraft(), send(), and discardDraft(). But login() should not be here, since there are a number of email protocols, and it should be implemented separately.
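A rough Kotlin sketch of that Email example (the field and method names follow the answer; the bodies are placeholders):

// Cohesive: everything in this class is about composing and handling one email.
class Email(
    val to: List<String>,
    val from: String,
    val cc: List<String> = emptyList(),
    val bcc: List<String> = emptyList(),
    val subject: String = "",
    val body: String = ""
) {
    fun saveAsDraft() { /* persist the draft somewhere */ }
    fun send() { /* hand the message to a mail transport */ }
    fun discardDraft() { /* delete the saved draft */ }

    // No login() here: authenticating against a particular protocol
    // (SMTP, IMAP, POP3, ...) belongs in a separate class.
}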
Cohesion is usually measured using one of the LCOM (Lack of Cohesion) metrics; the original LCOM metric came from Chidamber and Kemerer. See for example:
http://www.computing.dcu.ie/~renaat/ca421/LCOM.html
A more concrete example:
If a class has, for example, one private field and three methods, and all three methods use this field to perform an operation, then the class is very cohesive.
Pseudo code of a cohesive class:
class FooBar {
    private SomeObject _bla = new SomeObject();

    public void FirstMethod() {
        _bla.FirstCall();
    }

    public void SecondMethod() {
        _bla.SecondCall();
    }

    public void ThirdMethod() {
        _bla.ThirdCall();
    }
}
If a class has, for example, three private fields and three methods, and each method uses just one of the three fields, then the class is poorly cohesive.
Pseudo code of a poorly cohesive class:
class FooBar {
    private SomeObject _bla = new SomeObject();
    private SomeObject _foo = new SomeObject();
    private SomeObject _bar = new SomeObject();

    public void FirstMethod() {
        _bla.Call();
    }

    public void SecondMethod() {
        _foo.Call();
    }

    public void ThirdMethod() {
        _bar.Call();
    }
}
The "a class should do one thing" principle is the Single Responsibility Principle, which comes from Robert C. Martin and is one of the SOLID principles. The principle prescribes that a class should have only one reason to change.
Staying close to the Single Responsibility Principle could possibly result in more cohesive code, but in my opinion these are two different things.
Most of the answers don't explain what cohesion is. It is well defined in Uncle Bob's book Clean Code:
Classes should have a small number of instance variables. Each of the
methods of a class should manipulate one or more of those variables.
In general the more variables a method manipulates the more cohesive
that method is to its class. A class in which each variable is used by
each method is maximally cohesive. In general it is neither advisable
nor possible to create such maximally cohesive classes; on the other
hand, we would like cohesion to be high. When cohesion is high, it
means that the methods and variables of the class are co-dependent and
hang together as a logical whole.
Let me explain it with a class definition:
class FooBar {
    private _bla;
    private _foo;
    private _bar;

    function doStuff() {
        if (this._bla > 10) {
            this._foo = 10;
            this._bar = 20;
        }
    }

    function doOtherStuff() {
        if (this._foo == 10) {
            this._bar = 100;
            this._bla = 200;
        }
    }
}
As you can see in the example above, the class is cohesive: the variables are shared across the methods so that they work together. The more the variables are shared, the more cohesive the class is, and the more it works as a single unit.
This is an example of low cohesion:
class Calculator
{
    public static void main(String args[])
    {
        int a = 5, b = 7;
        int result;
        // calculating sum here
        result = a + b;
        // calculating difference here
        result = a - b;
        // same for multiplication and division
    }
}
But high cohesion implies that the functions in the classes do what they are supposed to do (as their names suggest), and not some function doing the job of some other function. So, the following can be an example of high cohesion:
class Calculator
{
    public static void main(String args[])
    {
        Calculator myObj = new Calculator();
        System.out.println(myObj.sumOfTwoNumbers(5, 7));
    }

    public int sumOfTwoNumbers(int a, int b)
    {
        return (a + b);
    }

    // similarly for other operations
}
The term cohesion was originally used to describe modules of source code as a qualitative measure of how well the source code of the module was related to each other. The idea of cohesion is used in a variety of fields. For instance a group of people such as a military unit may be cohesive, meaning the people in the unit work together towards a common goal.
The essence of source code cohesion is that the source code in a module works together towards a common, well-defined goal. The minimum amount of source code needed to create the module's outputs is in the module and no more. The interface is well defined, and the inputs flow in through the interface and the outputs flow back out through it. There are no side effects and the emphasis is on minimalism.
A benefit of functionally cohesive modules is that developing and automating unit tests is straightforward. In fact a good measure of the cohesion of a module is how easy it is to create a full set of exhaustive unit tests for the module.
A module may be a class in an object-oriented language, or a function in a functional language or a non-object-oriented language such as C. Much of the original work in this area of measuring cohesion involved COBOL programs at IBM back in the 1970s, so cohesion is definitely not just an object-oriented concept.
The original intent of the research from which the concept of cohesion and the associated concept of coupling came was to find out what the characteristics were of programs that were easy to understand, maintain, and extend. The goal was to learn the best practices of programming, codify those best practices, and then teach them to other programmers.
The goal of good programmers is to write source code whose cohesion is as high as possible given the environment and the problem being solved. This implies that in a large application some parts of the source code will vary from other parts in the level of cohesion of the code in that module or class. Sometimes, about the best you can get is temporal or sequential cohesion, due to the problem you are trying to solve.
The best level of cohesion is functional cohesion. A module with functional cohesion is similar to a mathematical function in that you provide a set of inputs and you get a specific output. A truly functional module will not have side effects in addition to the output nor will it maintain any kind of state. It will instead have a well defined interface which encapsulates the functionality of the module without exposing any of the internals of the module and the person using the module will provide a particular set of inputs and get a particular output in return. A truly functional module should be thread safe as well.
Many programming language libraries contain a number of examples of functional modules whether classes, templates, or functions. The most functional cohesive examples would be mathematical functions such as sin, cosine, square root, etc.
Other functions may have side effects or maintain state of some kind resulting in making the use of those functions more complicated.
For instance, a function which throws an exception, sets a global error variable (errno in C), must be used in a sequence (the strtok() function from the Standard C library is an example, as it maintains internal state), provides a pointer which must then be managed, or issues a log message to some logging utility is no longer functionally cohesive.
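To illustrate that contrast with a couple of made-up Kotlin examples (not from the answer above):

import kotlin.math.sqrt

// Functionally cohesive: the same inputs always give the same output,
// with no state and no side effects.
fun hypotenuse(a: Double, b: Double): Double = sqrt(a * a + b * b)

// Not functionally cohesive: like strtok(), it keeps internal state between
// calls, so the result depends on the call history, not just the arguments.
class Tokenizer(text: String) {
    private val tokens = text.split(" ").iterator()
    fun nextToken(): String? = if (tokens.hasNext()) tokens.next() else null
}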
I have read both Yourdon and Constantine's original book, Structured Design, where I first came across the idea of cohesion in the 1980s, and Meilir Page-Jones' book The Practical Guide to Structured Systems Design, and Page-Jones did a much better job of describing both coupling and cohesion. The Yourdon and Constantine book seems a bit more academic. Steve McConnell's book Code Complete is quite good and practical, and the revised edition has quite a bit to say about good programming practice.
A general way to think of the principle of cohesion is that you should locate code along with other code that either depends on it, or upon which it depends. Cohesion can and should be applied to levels of composition above the class level. For instance, a package or namespace should ideally contain classes that relate to some common theme and that are more heavily interdependent than dependent on other packages/namespaces. I.e. keep dependencies local.
Cohesion means that a class or a method does just one well-defined job. The name of the method or class should also be self-explanatory. For example, if you write a calculator, you should name the class "Calculator" and not "asdfghj". You should also consider creating a method for each task, e.g. subtract(), add(), etc.
A programmer who might use your code in the future will then know exactly what your methods do. Good naming can reduce the commenting effort.
A related principle is DRY: don't repeat yourself.
MSDN's article on it is probably more informative than Wikipedia in this case.