When coding in a strongly typed language, does static checking offer any unique benefits over dynamic code analysis? - code-analysis

When working in a language which is considered strongly typed, does static code analysis offer anything that dynamic code analysis cannot?

To answer your question, yes, a strongly typed language which does static checking offers benefits.
Why?
As an example, consider a programming language that does static type checking (a functional language like OCaml), versus a language like Python that does dynamic type checking.
Static type checking verifies type safety before the code is executed, whereas dynamic type checking can only verify it while the program is running.
What this means is that in a statically checked language, using the wrong types is caught at compile time: the type errors are reported before the code is ever run, and the program executes only once no problems are found.
In a dynamically typed language, on the other hand, the program will happily start executing even with latent type errors in it; only when execution actually reaches a type error it cannot resolve does it throw an exception and quit.
On small programs this doesn't look like a big difference, but think about it at scale: if your program spends a long time computing something, and the dynamically typed language only hits the error near the end of the execution, you have just wasted a lot of time and resources. (At least this was the example that helped me understand what static checking offers that dynamic code analysis does not.)
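A minimal Python sketch of that scenario (the function is a made-up stand-in for real work): the type error only surfaces after the expensive computation has already finished, whereas OCaml's compiler would reject the equivalent program before it ever ran.

def slow_computation():
    # stands in for hours of real work
    return sum(range(10_000_000))

result = slow_computation()
print("result: " + result)  # TypeError raised only here, after all the work is done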
In case you're wondering, Java is a strongly typed language; see this SO Q&A on whether C is strongly typed or not.
There are subtle differences between a strongly/weakly typed language and a statically/dynamically typed one. Strong/weak refers to how strict a language is about its types, whereas static/dynamic refers to when types are checked (at compile time or at runtime).
Source
Hopefully that answers your question!
Some references:
(yes, Wikipedia is not a reference, but it gives the best examples for this case, IMO)
Static type checking
Dynamic Programming Languages
Why Python is dynamic but also strongly typed

Related

Why is adding methods to a type different than adding a sub or an operator in perl6?

Making subs/procedures available for reuse is one core function of modules, and I would argue that it is the fundamental way a language can be composable and therefore efficient with programmer time:
if you create a type in your module, I can create my own module that adds a sub that operates on your type. I do not have to extend your module to do that.
# your module
class Foo {
    has $.id;
    has $.name;
}

# my module
sub foo-str(Foo:D $f) is export {
    return "[{$f.id}-{$f.name}]";
}

# someone else using yours and mine together for profit
my $f = Foo.new(:id(1234), :name("brclge"));
say foo-str($f);
As seen in Overloading operators for a class, this composability of modules works equally well for operators, which makes sense to me, since operators are just a kind of syntactic sugar for subs anyway (in my head at least). Note that the definition of such an operator does not cause any surprising change in the behavior of existing code: you need to import it into your code explicitly to get access to it, just like the sub above.
Given this, I find it very odd that we do not have a similar mechanism for methods (see e.g. the discussion at How do you add a method to an existing class in Perl 6?), especially since Perl 6 is such a method-happy language. If I want to extend the usage of an existing type, I would want to do that in the same style the original module was written in. If there is a .is-prime on Int, it must be possible for me to add a .is-semi-prime as well, right?
I read the discussion at the link above, but don't quite buy the "action at a distance" argument: how is that different from me exporting another multi sub from a module? For example, the Rust way of making this a lexical change (Trait + impl ... for) seems quite hygienic to me, and would be very much in line with the operator approach above.
More interesting (to me at least) than the technicalities is the question of language design: isn't the ability to provide new verbs (subs, operators, methods) for existing nouns (types) a core design goal for a language like Perl 6? If it is, why would it treat methods differently? And if it does treat them differently for a good reason, does that not mean we are using way too many non-composable methods as nouns where we should be using subs instead?
From a language design perspective, it all comes down to a simple question: which language are we speaking? In Perl 6, this is a question about which we always try to be very clear.
The notion of one's current language in Perl 6 is defined entirely in terms of lexical scope. Sub declarations are lexically scoped. When we import symbols from a module, including extra multi candidates, those are lexically scoped. When we perform language tweaks - such as introducing new operators - those are lexically scoped. Verbs in our current language - that is, subroutine calls - are those with a lexical definition. (Operators are simply sub calls with more interesting parsing.) Since lexical scopes are closed at the end of compile time, the compiler has a complete view of the current language. That's why sub calls to non-existent subs, or references to undeclared variables, are detected and reported at compile time, as well as some basic compile-time type checking; future Perl 6 versions are likely to extend the set of compile-time checks that can be expected. The current language is the static, early-bound, part of Perl 6.
By contrast, a method call is a verb to be interpreted in the target object's language. This is the dynamic, late-bound, part of Perl 6. While the most immediate result of that is the typical polymorphism found in various forms in implementations of OO, thanks to meta-programming even the manner in which a verb is interpreted is up for grabs. For example, a monitor will acquire a lock while it interprets the verb and release it afterwards. Other objects might have been constructed based on things other than Perl 6 code, and so the interpretation of a verb doesn't mean invoking code written as a Perl 6 method. Or the code might be somewhere over the network. Who knows? Well, certainly not the caller, and that's the point, and the power, and the risk, of late binding.
The Perl 6 answer to "I want to extend the range of verbs I can use with this object in my current language" is very simple: use language features that relate to extending the current language! There's even a special syntax, $obj.&foo, that allows for a verb foo to be defined in the current language - by writing a sub - and then invoked much as if it's a method on the object. However, the small syntactic distinction makes it clear to the reader - and to the compiler - what is going on, and which language is getting to define that verb.
Through the use of augment it is possible to extend the language defined by some type of objects. However, it's rarely the best way to do things, given that it has global effect and scatters the definition of the object's language.
Much of what we do in programming is about building languages. By that I don't mean new syntax; most of our new languages - even in a language as open to mutation as Perl 6 - are just nouns and verbs defined using standard language features. However, in any non-trivial program, we can't keep every detail of every language in mind at once. When I go to the restaurant and order a schnitzel, I don't know how the order will be transported to the kitchen, what the kitchen looks like, whether the schnitzel is hammered out, breaded, and cooked on demand, or just served from a (hopefully not too stale) cache of prepared schnitzels. The kitchen and I have just enough shared meaning to make the right kind of thing happen, but I don't know how they'll precisely react to my request and they need not know what I'll do in the meantime. This kind of thinking is acknowledged by OO itself - at least when we fully embrace it - and at a larger scale by concepts such as bounded contexts, as found in Domain Driven Design.
In summary, Perl 6 tries to help us keep our languages straight: to know what is in our current language, and what we express with only limited understanding. That distinction is encoded by the sub/method distinction, which also turns out to be a sensible place to hang a static/dynamic distinction too.

why are languages generally either statically typed or dynamically typed (not both)?

I don't understand this. I understand the pros and cons of each, but why don't languages like Python let you specify a variable's type at initialization, and function argument and return types, when you wish, so the interpreter doesn't waste time checking them at runtime in the programs, or just the parts of your code, where speed is important, while leaving them out where it isn't?
It just seems a waste of time for users to switch between languages somewhat needlessly in these situations, and for the developers of a language to lose users, or not have them use the language for all of their projects, because of this.
First, initializing a variable with a specific type in a dynamically typed language would be pointless, because the variable could be reassigned a value of a different type later on, and the type of a variable is determined by the value assigned to it anyway. So making static type declarations optional wouldn't actually provide any extra functionality.
Second, compile-time checking of function arguments wouldn't work either, because the types of the values passed in can't be determined until runtime. And functions can be coded to check the types of their own arguments in a dynamically typed language, so there's no need to implement another system for this.
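For what it's worth, Python has since gained optional type annotations (PEP 484). Consistent with the answer above, the interpreter ignores them at runtime; they only pay off when an external static checker such as mypy is run before execution. A minimal sketch:

def add(x: int, y: int) -> int:
    return x + y

add(2, 3)       # fine
add("a", "b")   # still runs (returns "ab"), but mypy rejects it before execution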

Is static typing a subset of dynamic typing?

I was going to add this as a comment to my previous question about type theory, but I felt it probably deserved its own exposition:
If you have a dynamic typing system and you add a "type" member to each object and verify that this "type" is a specific value before executing a function on the object, how is this different than static typing? (Other than the fact that it is run-time instead of compile-time).
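To make the premise concrete, here is a minimal Python sketch (all names hypothetical): every value carries a type tag, and the tag is verified before a function is executed on the object:

class Tagged:
    def __init__(self, type_name, value):
        self.type = type_name   # the "type" member the question describes
        self.value = value

def apply_checked(fn, obj, expected):
    # verify the tag before executing the function on the object
    if obj.type != expected:
        raise TypeError(f"expected {expected}, got {obj.type}")
    return fn(obj.value)

n = Tagged("int", 41)
print(apply_checked(lambda v: v + 1, n, "int"))   # 42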
Technically, it actually is the other way round: a "dynamically typed" language is a special case of a statically typed language, namely one with only a single type (in the mathematical sense). That, at least, is the viewpoint of many in the type systems community.
Edit regarding static vs dynamic checking: only local properties can be checked dynamically, whereas properties that require some kind of global knowledge cannot. Think of properties such as something being unique, something not being aliased, a computation being free of race conditions. A suitable static type system can verify such properties, because it has the ability to establish certain invariants on the context of the expression that is being checked.
Static typing happens at compile time, not at run time, and that difference is essential!
See Benjamin Pierce's book Types and Programming Languages for more.

How is a dynamically-typed language implemented on top of a statically-typed language?

I've only recently come to really grasp the difference between static and dynamic typing, by starting off with C++, and moving into Python and JavaScript. What I don't understand is how a dynamically-typed language (e.g. Python) can be implemented on top of a statically-typed language (e.g. C). I seem to remember reading something about void pointers once, but I didn't really get it.
Every variable in the dynamically typed language is represented as a struct { type, value }, where the value is a union, another struct, a pointer, etc.
In C++ you can get a similar ("similar") result if you, for example, create an abstract base class MyVariable and derived classes MyInt, MyString, etc. With some more work, you can then use these variables much like in a dynamically typed language. (I don't know C++ very well, but I think you'd need friend operator functions to change a variable's type at runtime, or maybe not, whatever.)
This result is achieved by the same thing: runtime type information, which stores the actual type inside the object.
I wouldn't recommend it, though :)
Basically, each "variable" of your dynamically typed language is represented by a structure in the statically typed language, with the data type being one of the fields. The operations on these dynamic data types (add, subtract, compare) are usually implemented via a virtual method table: for each data type, a set of pointers to functions that implement the desired functionality in a type-specific way.
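A minimal sketch of that representation, written in Python for brevity rather than C (all names are hypothetical): each value is a tag/payload pair, and a per-type table of functions plays the role of the virtual method table:

class DynValue:
    # the struct { type, value } described above
    def __init__(self, tag, payload):
        self.tag = tag
        self.payload = payload

# per-type operation tables, standing in for the virtual method table
OPS = {
    "int": {"add": lambda a, b: DynValue("int", a + b)},
    "str": {"add": lambda a, b: DynValue("str", a + b)},
}

def dyn_add(x, y):
    # dispatch on the runtime tag, raising if the types don't line up
    if x.tag != y.tag:
        raise TypeError(f"cannot add {x.tag} and {y.tag}")
    return OPS[x.tag]["add"](x.payload, y.payload)

print(dyn_add(DynValue("int", 2), DynValue("int", 3)).payload)          # 5
print(dyn_add(DynValue("str", "foo"), DynValue("str", "bar")).payload)  # foobar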
It's not. The dynamically typed language is implemented on top of a CPU architecture. As long as the CPU architecture is Turing complete, you can implement a static language on it, or a dynamic language, or something hybrid like the CLR/DLR of .NET. The important thing is that the Turing completeness of the CPU architecture is what enables or disables things, not the static nature of a programming language like C or C++.
In general, programming languages maintain Turing completeness, and therefore you can implement anything in any programming language. Of course, some things are easier if the underlying tools support them; it is not easy to implement an application that relies on a dynamic underpinning in C or C++. That's why people put the effort into making a dynamic system that is programmable, like Python: so that you can implement the dynamic machinery once, suffer through that extra effort only one time, and then reuse it from the dynamic language layer.

When should weak types be discouraged?

When should weak types be discouraged? Are weak types discouraged in big projects? If the left side is strongly typed like the following would that be an exception to the rule?
int i = 5
string sz = i
sz = sz + "1"
i = sz
Do any languages support syntax similar to the above? Tell me more about the pros and cons of weak typing and related situations.
I think you are confusing "weak typing" with "dynamic typing".
The term "weak typing" means "not strongly typed", which means that the value of a memory location is allowed to vary from what its type indicates it should be.
C is an example of a weakly typed language. It allows code like this to be written:
typedef struct
{
    int x;
    int y;
} FooBar;

FooBar foo;
char *pStr = (char *)&foo;  /* reinterpret the struct as raw characters */

pStr[0] = 'H';
pStr[1] = 'i';
pStr[2] = '\0';
That is, it allows a FooBar instance to be treated as if it was an array of characters.
In a strongly typed language, that would not be allowed. Either a compiler error would be generated, or a run time exception would be thrown, but never, at any time, would a FooBar memory address contain data that was not a valid FooBar.
C#, Java, Lisp, JavaScript, and Ruby are examples of languages where this type of thing would not be allowed. They are strongly typed.
Some of those languages are "statically typed", which means that variable types are assigned at compile time, and some are "dynamically typed", which means that variable types are not known until runtime. "Static vs Dynamic" and "Weak vs Strong" are orthogonal issues. For example, Lisp is a "strong dynamically typed" language, whereas "C" is a "weak statically typed language".
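Python itself demonstrates that the two axes are independent: it is dynamically typed (a variable can be rebound to a value of any type) yet strongly typed (values are never silently reinterpreted):

x = 5        # x currently holds an int
x = "five"   # dynamic typing: rebinding x to a str is fine
"1" + 1      # strong typing: raises TypeError instead of silently coercing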
Also, as others have pointed out, there is a distinction between "inferred types" and types specified by the programmer. The "var" keyword in C# is an example of type inference. However, it's still a statically typed construct because the compiler infers the type of a variable at compile time, rather than at runtime.
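The same kind of inference is available to Python code when an external static checker is used; mypy, for instance, infers a variable's type from its initializer, much like C#'s var does. A small sketch (assuming mypy as the checker):

x = 3          # mypy infers x: int from the initializer
x = "three"    # runs fine in the interpreter, but mypy flags this assignment as a type error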
So, what your question really is asking is:
What are the relative merits and drawbacks of static typing, dynamic typing, weak typing, strong typing, inferred static types, and user-specified static types?
I provide answers to all of these below:
Static typing
Static typing has 3 primary benefits:
Better tooling support
A reduced likelihood of certain types of bugs
Performance
The user experience and accuracy of things like IntelliSense and refactoring are greatly improved in a statically typed language because of the extra information that the static types provide. If you type "a." in a code editor and "a" has a static type, then the compiler knows everything that could legally come after the "." and can thus show you an accurate completion list. It's possible to support some such scenarios in a dynamically typed language, but they are much more limited.
Also, in a program without compiler errors a refactoring tool can identify every place a particular method, variable, or type is used. It's not possible to do that in a dynamically typed language.
The second benefit is somewhat controversial. Proponents of statically typed languages like to make that claim. Opponents contend that the bugs static typing catches are trivial, and that they would get caught by testing anyway. But you do get notification of things like misspelled variable or method names up front, which can be helpful.
Statically typed languages also enable better "data flow analysis", which when combined with things like Microsoft's SAL (or similar tools) can help find potential security problems.
Finally, with static typing, compilers can do a lot more optimization, and so can produce faster code.
Drawbacks:
The main drawback for static typing is that it restricts the things you can do. You can write programs in dynamically typed languages that you can't write in statically typed languages. Ruby on Rails is a good example of this.
Dynamic Typing
The big advantage of dynamic typing is that it's much more powerful than static typing. You can do a lot of really cool stuff with it.
Another one is that it requires less typing. You don't have to specify types all over the place.
Drawbacks:
Dynamic typing has 2 main drawbacks:
You don't get as much "hand holding" from the compiler or IDE
It's not suitable for critical performance scenarios. For example, no one writes OS Kernels in Ruby.
Strong typing:
The biggest benefit of strong typing is security. Enforcing strong typing usually requires some type of runtime support. If a program can prove type safety, then a lot of security issues, such as buffer overruns, just go away.
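As a small illustration in a strongly typed, memory-safe runtime, a would-be buffer overrun becomes a clean error instead of silently reading past the end of the buffer:

buf = [1, 2, 3]
buf[10]   # IndexError: list index out of range - no overrun, no corrupted memory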
Weak typing:
The big drawback of strong typing, and the big benefit of weak typing, is performance.
When you can access memory any way you like, you can write faster code. For example a database can swap objects out to disk just by writing out their raw bytes, and not needing to resort to things like "ISerializable" interfaces. A video game can throw away all the data associated with one level by just running a single free on a large buffer, rather than running destructors for many small objects.
Being able to do those things requires weak typing.
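You can see what the raw-byte trick buys even from a strongly typed language, by deliberately dropping down to a weakly typed view of memory. A minimal sketch using Python's ctypes (the Point layout is made up for illustration): the whole record is dumped and restored as raw bytes in one shot, with no per-field serialization:

import ctypes

class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

p = Point(3, 4)
raw = bytes(p)                    # the struct's raw bytes, ready to write to disk as-is
q = Point.from_buffer_copy(raw)   # reinterpret those bytes as a Point again
print(q.x, q.y)                   # 3 4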
Type inference
Type inference allows a lot of the benefits of static typing without requiring as much typing.
User specified types
Some people just don't like type inference because they like to be explicit. This is more of a style thing.
Weak typing is an attempt at language simplification. While this is a worthy goal, weak typing is a poor solution.
Weak typing such as is used in COM Variants was an early attempt to solve this problem, but it is fraught with peril and frankly causes more trouble than it's worth. Even Visual Basic programmers, who will put up with all sorts of rubbish, correctly pegged this as a bad idea and backronymed Microsoft's ETC (Extended Type Conversion) to Evil Type Cast.
Do not confuse inferred typing with weak typing. Inferred typing is strong typing inferred from context at compile time. A good example is the var keyword, used in C# to declare a variable suitable to receive the value of a LINQ expression.
By contrast, weak typing is inferred each and every time an expression is evaluated. This is illustrated in the question's sample code. Another example would be use of untyped pointers in C. Very handy yet begging for trouble.
Inferred typing addresses the same issue as weak typing, without introducing the problems associated with weak typing. It is therefore a preferred alternative whenever the host language makes it available.
They should almost always be discouraged. The only type of code that I can think of where it would be required is low-level code that requires some pointer voodoo.
And to answer your question, C supports code like that (except of course for not having a string type), and that sounds like something PHP or Perl would have (but I could be totally wrong on that).
"
When should weak types be discouraged? Are weak types discouraged in
big projects? If the left side is strongly typed like the following
would that be an exception to the rule?
int i = 5 string sz = i sz = sz + "1" i = sz
Does any languages support similar syntax to the above? Tell me more
about pros and cons to weak types and situations related.
"
Perhaps you could program your own library to do that.
In C++ you can use something called an "operator overload", which means that you can declare a variable of one type to be initialized from a value of another type. That is what makes the statement:
std::string str = "Hello World";
work, even though any text between quotes is interpreted as an array of chars.
Specifically, you would define a function like this (where T is the variable's type and B is the type you want to assign from):
T& T::operator= ( const B& s );
Please note that this is a class member function.
Also note that you will probably want to have some sort of function that reverses this manipulation if you want to use it liberally - something like:
B& B::operator= ( const T& s );
C++ is powerful enough to allow you to make an object generally weakly typed, but if you want to treat it as purely weakly typed, you will want to make just a single variable type that can be used as any primitive, and use only functions that take a pointer to void.
Believe me, it is a lot easier to use strongly typed programming when it is available.
I personally prefer strong typing, because I don't need to worry about the errors that come when I don't know what a variable is meant to do. For example, if I wanted to write a function to talk to a person - and that function used the person's height, weight, name, number of children, etc. - but you gave me a color, I would get an error, because you can't really determine most of those things for a color with any simple algorithm.
As far as the pros of weak typing go, you might want to get used to loosely typed programming if you are writing something to be run within another program (e.g. a web browser or a UNIX shell). JavaScript and shell script are weakly typed.
I would suggest that assembly language is one of the only hardware-level weakly typed languages, but the flavors of assembly I've seen attach a type to each variable depending on the allocated size, i.e. word, dword, qword.
I hope I gave you a good explanation and did not put any words in your mouth.
Weak types are by their very nature less robust than strong types, because you don't tell the machine exactly what to do - instead the machine has to figure out what you meant. This often works quite adequately, but in general it is not clear what the result should be. What is, for example, a string multiplied by a float?
Do any languages support syntax similar to the above?
Perl allows you to treat some numbers and strings interchangeably. For example, "5" + "1" will give you 6. The problem with this sort of thing in general is that it can be hard to avoid ambiguity: should "5" + 1 be "51" or "6"? Perl gets around this by having a separate operator for string concatenation, and reserving + for numeric addition.
Other languages would have to sort out whether you mean to do a concatenation or an addition, and (if relevant) what type or representation the result will be.
I did ASP/VBScript coding and worked with legacy code without "option strict", which allows weak typing.
It was hell many times, especially in the hands of less experienced programmers. We got all sorts of stupid errors that took ages to diagnose.
One of the stupid examples was like this:
'Config
Dim pass
pass = "asdasd"

If NOT pass = Request("p") Then
    Response.Write "login failed"
    Response.End()
End If
So far so good, but guess what: if the user changed pass to a numeric password, it would not work anymore, because the integer pass is not equal to the string pass coming from the querystring. I thought it was supposed to work, but it didn't; I can't remember the exact piece of code.
I hate weak typing; instead of a stupid debugging session, I can spend a few extra seconds typing the exact type of a variable.
Simply put, in my experience, especially in big projects and especially with inexperienced developers, it's just trouble.