Global state vs global variable - variables

I'm curious to know whether these are the same thing.
I understand a global variable is a variable declared outside a function which can be used by any function, and a local variable is a variable which can only be used in and by a particular function.
What is global state? What is local state? Are they just terms used to describe the effects of using global/local variables?
Also what is the difference between global and local states and how can they affect a program?
Thank you.

Basically you are assuming correctly. The set of all global variables is known as the global environment or global state. It is a way of affecting the execution of code, hidden from sight - and should be avoided, except in special circumstances (see below). It becomes a nightmare really fast.
Local state is the opposite and is preferable. Use local variables to have complete control over your local state. It makes it easier to read your code, change it, and much easier to find errors in it. Also you do not affect other parts of your code.
It boils down to a matter of having much more control over your code, when you use encapsulation of your variables, functions/methods, and so on.
Globals can be useful, e.g. when your code needs to run in different environments (e.g. dev/staging/integration/production). Configurations are usually global. Other than that -> use locals.
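To make the difference concrete, here is a minimal Python sketch. The names (counter, next_id_global, next_id_local) are made up for illustration and do not come from the question. The first function reads and writes global state, so calling it twice with the same arguments gives different results; the second depends only on what is passed in.

```python
# Hypothetical example: all names here are illustrative assumptions.

counter = 0  # global state: any function in this module can change it


def next_id_global():
    """The result depends on hidden state that the signature does not show."""
    global counter
    counter += 1
    return counter


def next_id_local(current):
    """Everything the function needs arrives as an argument."""
    return current + 1


first = next_id_global()     # uses and changes the global
second = next_id_global()    # same call, different result
local_a = next_id_local(0)   # same input...
local_b = next_id_local(0)   # ...same result
```

The second style is easier to test and to reason about, because nothing outside the call can change its outcome.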
Hope that helps

Related

Can using a constant global variable stop the issue of 'side effects' completely?

I am aware that the purpose of Functional Programming (FP) is to disallow 'side effects', that traditionally appear in object-oriented, imperative languages due to the use of global variables (for example).
However, in OOP (non-FP) languages, can 'side effects' disappear if one uses a global variable that is constant (so its value will never change)?
I'm not sure what you mean by "global variable", but it seems like the answer is no.
What is more important is whether variables are mutable or immutable. If an object is immutable, you can pass it to a function and be sure that it will not be changed.
It also depends on what you count as "side effects", which are not only about mutability. For example, you can pass an immutable instance to a method; the method will not change the instance, but it can still perform some other operation, such as adding or deleting records based on that instance, or creating or deleting files on the file system.
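A small Python sketch of that last point, under assumed names (Record, audit_log and register are hypothetical). The argument is immutable, yet the function still has a side effect: it appends to a list that stands in for a file or a database table.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Record:
    """Immutable value object (hypothetical)."""
    name: str


audit_log = []  # stands in for a file or a database table


def register(record):
    """Does not modify `record`, but still changes the outside world."""
    audit_log.append(f"registered {record.name}")
    return len(audit_log)


rec = Record("alice")
register(rec)
register(rec)

# Trying to change the frozen instance raises FrozenInstanceError.
try:
    rec.name = "bob"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

After the two calls rec is unchanged, but audit_log has grown by two entries, so the function is not free of side effects.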

Why certain block closure optimization is good and valid?

In a very interesting post from 2001 Allen Wirfs-Brock explains how to implement block closures without reifying the (native) stack.
From the many ideas he exposes there is one that I don't quite understand and I thought it would be a good idea to ask it here. He says:
Any variable that can never be assigned during the lifetime of a block (e.g., arguments of enclosing methods and blocks) need not be placed in the environment if instead a copy of the variable is placed in the closure when it is created
There are two things I'm not sure I understand well enough:
Why is using two copies of the read-only variable faster than moving the variable to the environment? Is it because it would be faster for the enclosing context to access the (original) variable in the stack?
How can we ensure that the two variables remain synchronized?
In question 1 there must be another reason. Otherwise I don't see the gain (when compared with the cost of implementing the optimization.)
For question 2, take a variable that is not an argument and that is assigned in the method but not in the block. Why would the oop stored in the stack remain unchanged during the life of the block?
I think I know the answer to Q2: Because the execution of the block cannot be intertwined with the execution of the method, i.e., while the block lives, the enclosing context does not run. But isn't there any way to modify the stack temporary while the block is alive?
Thanks to the comment of #aka.nice I found the answers to the two questions in Clement Bera's post, whose reading is both pleasant and clarifying.
For Q1 let's first say that Allen's remark means that the copy of the read-only variable can be placed in the block's stack, as if it were a local temporary of the block. The advantage of doing this only materializes if all variables defined outside the block and used inside it are never written in the block. Under these circumstances there would be no need to create the environment array and to emit any prolog or epilog to take care of it.
The machine code that accesses a stack variable is equivalent to the code required to access an environment variable, because the first addresses the location using [ebp + offset] while the second uses [edi + offset], once edi has been set to point to the environment array (tempVector in Clement's notation). So there is no gain if some, but not all, of the environment variables are read-only.
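The two strategies can be modelled with a toy Python sketch. This is not how a Smalltalk VM is implemented, and every class and function name here is hypothetical. It only illustrates that a variable written after the block is created must live in a shared environment array, while a variable that is never assigned can simply be copied into the closure, with no environment array at all.

```python
# Toy model, not a real VM. All names are illustrative assumptions.

class SharedEnvClosure:
    """Captured variables live in an array shared with the enclosing method."""

    def __init__(self, env, code):
        self.env = env      # reference to the shared array (the "tempVector")
        self.code = code

    def value(self):
        return self.code(self.env)


class CopyingClosure:
    """Captured read-only values are copied when the closure is created."""

    def __init__(self, copied, code):
        self.copied = tuple(copied)  # private snapshot, no shared array
        self.code = code

    def value(self):
        return self.code(self.copied)


def method_with_written_variable():
    env = [0]                                          # must be shared
    block = SharedEnvClosure(env, lambda e: e[0] + 1)
    env[0] = 41               # assigned after the block was created
    return block.value()      # the block sees the new value


def method_with_read_only_argument(arg):
    # `arg` is never assigned, so a copy can never go stale.
    block = CopyingClosure([arg], lambda c: c[0] + 1)
    return block.value()
```

In the second method no environment array is allocated, which is the saving the answer describes when every captured variable is read-only.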
The second question is also answered in Clement's excellent blog. Yes, there is another way to break the synchrony between the original variable and its copy in the block's stack: the debugger (as aka.nice would have told us!) If the programmer modifies the variable in the enclosing context, the debugger will need to detect the action and update the copy as well. Same if the programmer modifies the copy held in the block's stack.
I'm glad I decided to post the question here. The help I received from aka.nice and Clement Bera, plus the comments some people sent me by email helped a lot in augmenting my understanding.
One final remark. Wirfs-Brock claims that avoiding the reification of method contexts is mandatory. I tend to agree. However, many important operations on these data structures can be better implemented if the reification follows the lightweight pattern. More precisely, when debugging you can model these contexts with "viewers" that point to the native stack and use two indexes to delimit the portion that corresponds to the activation under analysis. This is both efficient and clean and the combination of both techniques leads to the best of the worlds because you can have speed and expressiveness at once. Smalltalk is amazing.

Fields vs Local variables? When to use one or the other?

I have a few questions regarding the use of class fields and local variables.
When should a variable be declared as a field or a local variable? Of course, it's pretty obvious that if a variable only lives in the scope of a block or a function, a variable should only be local.
What if, after refactoring a function, the large function gets split up into several private functions? Would this be enough of a reason to promote a local variable into a field? How about readability?
Would it be better to pass around the local variables among the private functions?
Instead of promoting into a field, would it be viable to extract a class among functions that use the same local variables?
Anything you could expound on related topics to this would be nice as well.
Declare a variable as a field when it represents the *state* of the instance.
A large function that's been split up isn't enough reason to promote local variables into fields. The impact on readability and maintainability is too significant:
programmers will always have to work out whether a field is part of the state or just a temporary calculation helper;
it is much, much harder to maintain thread safety, since the same fields are shared by all concurrent method invocations.
Passing the variables from one inner method to another helps you:
understand the exact functionality of each method independently;
re-use the inner methods;
unit-test the inner methods.
Yes, pass around the local variables.
In case there are just too many such variables, it's typical to group them in a convenient helper class that functions as a struct. It improves readability and eases usage.
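As a rough sketch of this advice in Python (InvoicePrinter, InvoiceTotals and the method names are invented for the example): the only field is real instance state, the intermediate values travel between the private helpers as arguments, and a small struct-like class groups them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvoiceTotals:
    """Hypothetical helper 'struct' grouping values passed between helpers."""
    net: float
    tax: float


class InvoicePrinter:
    def __init__(self, currency):
        self.currency = currency  # genuine state of the instance

    def render(self, prices, tax_rate):
        # Intermediate results stay local and are passed along explicitly.
        totals = self._compute_totals(prices, tax_rate)
        return self._format(totals)

    def _compute_totals(self, prices, tax_rate):
        net = sum(prices)
        return InvoiceTotals(net=net, tax=net * tax_rate)

    def _format(self, totals):
        gross = totals.net + totals.tax
        return f"{gross:.2f} {self.currency}"


line = InvoicePrinter("EUR").render([10.0, 20.0], 0.5)
```

Because _compute_totals and _format take everything they need as parameters, each can be unit-tested on its own, and concurrent calls to render do not share any temporary data.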

What are pros & cons of Passed Arrays vs Global Arrays in Excel VBA

Ok, 2nd attempt at writing a Stack Overflow Question, so forgive me if this seems familiar.
I am rewriting an Excel Macro that was built over a 2 1/2 year period, frankenstein style (added to piecemeal). One of the things I need to do is load the data into an array once and only once for data accuracy and speed. For my skill level I am going to stick with the Array methodology.
My two approaches are:
Use Global dimmed dynamic Arrays
Dim the dynamic arrays in my Main procedure and pass them to the called procedures
So, what is Stack Overflow's take on the Pros vs Cons of these two methods?
Thanks,
Craig...
First, to answer the question you specifically didn't ask: Set up a custom class and load the data in that. Seriously, you'll thank me later.
OK, on to your question. I start by limiting the scope as much as possible. That means that I'm passing variables between procedures. When all your variables have the most restrictive scope possible, you run into the fewest problems down the line.
Once a variable passes two levels deep (calling procedure to 1st tier, 1st tier to 2nd tier), then I start taking a critical look at my structure. Usually (but not always) if all three procedures are in the same module, I'll create a module-level variable (use the Private keyword instead of Dim). If you separate your modules correctly (not arbitrarily) you can have module-level variables without much risk.
There are some variables that are always global right from the start: the variable that holds the app name and app version; the top-level class module that should never lose scope as long as the app is running; the constants (I know they're not variables) that hold things like commandbar names. I know I want these global, so they start that way.
I'm going to go out on a limb and say that module-level variables never migrate to global variables. Global variables start out that way because of their nature. If using a module-level variable seems cumbersome, it's probably because I've split a module up for no good reason or I need to rethink my whole framework.
That's not to say I've never cheated and used a global when I shouldn't have. We've all done it and you shouldn't lose any sleep if you do it too.
So to properly book-end this post: I quit using arrays unless I'm forced to. I use custom classes because
ActiveCell.Value = Invoice.LocalSalesTaxAmount
is so much nicer to debug than
ActiveCell.Value = aInvoice(35,2)
Just in case you think you need more skill to work with custom classes - so did I. I bit the bullet and so can anyone else.
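The same contrast can be shown outside VBA. Here is a minimal Python analogue, with a made-up Invoice class and made-up sample values, comparing positional access with a named field.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    """Hypothetical record type; the field names are illustrative."""
    number: str
    local_sales_tax_amount: float


# Array style: the reader has to remember what position 1 means.
invoice_row = ["INV-001", 4.25]
by_index = invoice_row[1]

# Class style: the name documents the value at the point of use.
invoice = Invoice(number="INV-001", local_sales_tax_amount=4.25)
by_name = invoice.local_sales_tax_amount
```

Both reads return the same value, but only the second tells you what you are looking at when you step through the code.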
You need to be careful with globals in Excel VBA, because if your application hits any kind of bug, and does some kind of soft reset (but the app still functions), then the globals will have been erased.
I had to give up on globals, since I don't write perfect apps.

When to use an object instance variable versus passing an argument to the method

How do you decide between passing arguments to a method versus simply declaring them as object instance variables that are visible to all of the object's methods?
I prefer keeping instance variables in a list at the end of the Class, but this list gets longer as my program grows. I figure if a variable is passed often enough it should just be visible to all methods that need it, but then I wonder, "if everything is public there will be no need for passing anything at all!"
Since you're referring to instance variables, I'm assuming that you're working in an object-oriented language. To some degree, when to use instance variables, how to define their scope, and when to use local variables is subjective, but there are a couple of rules of thumb you can follow whenever creating your classes.
Instance variables are typically considered to be attributes of a class. Think of these as adjectives of the object that will be created from your class. If your instance data can be used to help describe the object, then it's probably safe to bet it's a good choice for instance data.
Local variables are used within the scope of methods to help them complete their work. Usually, a method should have a purpose of getting some data, returning some data, and/or processing/running an algorithm on some data. Sometimes, it helps to think of local variables as ways of helping a method get from beginning to end.
Instance variable scope is not just for security, but for encapsulation as well. Don't assume that the "goal should be to keep all variables private." In cases of inheritance, making variables protected is usually a good alternative. Rather than marking all instance data public, you create getters/setters for those that need to be accessed from the outside world. Don't make them all available, only the ones you need. Which ones those are will become clear over the development lifecycle; it's hard to guess from the get-go.
When it comes to passing data around a class, it's difficult to say whether what you're doing is good practice without seeing some code. Sometimes, operating directly on the instance data is fine; other times, it's not. In my opinion, this is something that comes with experience: you'll develop some intuition as your object-oriented thinking skills improve.
Mainly this depends on the lifetime of the data you store in the variable. If the data is only used during a computation, pass it as a parameter.
If the data is bound to the lifetime of the object use an instance variable.
When your list of variables gets too long, maybe it's a good point to think about refactoring some parts of the class into a new class.
In my opinion, instance variables are only necessary when the data will be used across calls.
Here's an example:
myCircle = myDrawing.drawCircle(center, radius);
Now let's imagine the myDrawing class uses 15 helper functions to create the myCircle object, and each of those functions needs the center and the radius. They should still not be set as instance variables of the myDrawing class, because they will never be needed again.
On the other hand, the myCircle class will need to store both the center and radius as instance variables.
myCircle.move(newCenter);
myCircle.resize(newRadius);
In order for the myCircle object to know what its radius and center are when these new calls are made, they need to be stored as instance variables, not just passed to the functions that need them.
So basically, instance variables are a way to save the "state" of an object. If a variable is not necessary to know the state of an object, then it shouldn't be an instance variable.
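A minimal Python sketch of the circle example, assuming simple class shapes that the answer does not spell out (the helper _check_radius is hypothetical). Drawing passes center and radius through as parameters and keeps neither; Circle stores both because later calls need them.

```python
class Circle:
    def __init__(self, center, radius):
        # Needed across calls, so these are instance variables.
        self.center = center
        self.radius = radius

    def move(self, new_center):
        self.center = new_center

    def resize(self, new_radius):
        self.radius = new_radius


class Drawing:
    def draw_circle(self, center, radius):
        # center and radius are only needed during this call,
        # so they are passed to helpers rather than stored on self.
        self._check_radius(radius)
        return Circle(center, radius)

    def _check_radius(self, radius):
        # Hypothetical helper standing in for the "15 helper functions".
        if radius <= 0:
            raise ValueError("radius must be positive")


my_drawing = Drawing()
my_circle = my_drawing.draw_circle((0, 0), 5)
my_circle.move((1, 2))
my_circle.resize(3)
```

After these calls the circle remembers its new center and radius, while the drawing object holds no trace of either value.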
As for making everything public: it might make your life easier in the moment, but it will come back to haunt you. Please don't.
IMHO:
If the variable forms part of the state of the instance, then it should be an instance variable - classinstance HAS-A instancevariable.
If I found myself passing something repeatedly into an instance's methods, or I found that I had a large number of instance variables I'd probably try and look at my design in case I'd missed something or made a bad abstraction somewhere.
Hope it helps
Of course it is easy to keep one big list of public variables in the class. But even intuitively, you can tell that this is not the way to go.
Define each variable right before you are going to use it. If a variable supports the function of a specific method, use it only in the scope of the method.
Also think about security: a public class variable is susceptible to unwanted changes from "outside" code. Your main goal should be to keep all variables private, and any variable which is not should have a very good reason to be so.
As for passing parameters all the way up the stack, this can get ugly very fast. A rule of thumb is to keep your method signatures clean and elegant. If you see many methods using the same data, decide whether it's important enough to be a class member, and if it's not, refactor your code so that it makes more sense.
It boils down to common sense. Think exactly where and why you are declaring each new variable and what its function should be, and from there decide which scope it should live in.