determine what vars are constant in what situations - optimization

The idea is somewhat similar to what Apple has done in the OpenGL stack. I want to have that a bit more general.
Basically, I want to have specialised and optimised variants of some code for some specific cases.
In other words: I have given an algorithm/code for a function (let B = {0,1})
f : B^n -> B^m
Now, I specify a special case by a function (which predefines part of the input of f)
preset : {1..n} -> {0,1,unset}
The amount of predefinitions (∈ {0..n}) is then given by
pn := |preset⁻¹({0,1})|
Canonically, we now get a specialised function
f_preset : B^(n-pn) -> B^m
Also canonically, we get the code/algorithm for this specialised function. Naturally, the code for f_preset will be somewhat faster than f when pn > 0. You can then optimise this code further (there might be some dead code now, some loops can be unrolled, some calculations can be precomputed, etc.). In some cases, this can yield notable improvements.
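To make the definitions above concrete, here is a minimal Python sketch of the "canonical" specialisation step only. The names specialise, f and f_fixed are hypothetical and chosen for illustration; the sketch merely fixes the preset inputs and forwards the free ones, and it does not perform any of the further optimisation (dead code removal, unrolling) discussed above.

```python
def specialise(f, n, preset):
    """Build f_preset from f : B^n -> B^m.

    preset maps 1-based input positions to 0 or 1; positions that are
    missing from the dict are treated as "unset".
    """
    free = [i for i in range(1, n + 1) if i not in preset]

    def f_preset(*free_bits):
        if len(free_bits) != len(free):
            raise TypeError(f"expected {len(free)} bits, got {len(free_bits)}")
        bits = dict(preset)                 # the pn predefined inputs
        bits.update(zip(free, free_bits))   # the n - pn remaining inputs
        return f(*(bits[i] for i in range(1, n + 1)))

    return f_preset


def f(a, b, c):
    """Illustrative f : B^3 -> B^2."""
    return (a & b, b ^ c)


# Preset the second input to 1, so pn = 1 and f_fixed : B^2 -> B^2.
f_fixed = specialise(f, 3, {2: 1})
```

With b fixed to 1, an optimiser could reduce the body to (a, 1 ^ c); producing that reduced code automatically is the hard part the question is asking about.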
Apple does roughly this for their OpenGL stack (from what I have read / know): They try to find a good preset at runtime after everything is setup for variables which will not change anymore, then make an optimised version of the specialised function and only use that one instead of the original function.
Initially, I thought about a way to optimise the physics simulation of a game of my own. There I have a lot of particle objects and a set of particle types (which is unknown at compile time). A particle type is a set of attributes. The particle types are fixed and constant once they are loaded. Each particle object is of one of these particle types. The physics simulation for a particle object is a very heavy piece of code with many branches, and it depends heavily on the particle type. My idea was to have an optimised physics simulation function for each particle type.
After thinking a bit about this, I wanted to go a bit further:
I want to automatically calculate a set of such presets at runtime and maintain the optimised code for each. And I want to automatically add or remove presets when the circumstances change.
There are several questions now:
Is there an easy way to calculate a good preset? How do I know what variables are constant for a given situation?
Is there an easy way to check how good a preset is? 'Good' refers to the performance of the resulting optimised code.
How to compare two algorithms/codes for performance? Via some heuristic? Or by testing with random input?
How many presets (and optimised code variants) should there be for a function? A fixed limit for all functions? Or is this different for every function? Is it maybe even depending on the current computer state?
How to maintain the different optimised code variants? A wrapper function around f which automatically chooses the best optimised variant doesn't seem very nice, as this check, which may not be cheap, would be needed for every single call. A solution to this problem might also be closely related to the question of how to find the set/number of good presets. (In the particle type case, the optimised code would be attached to / saved together with the particle type. The number of particle types also defines the number of presets.)
For my initial case, most of these questions are more or less moot, but I am now really interested in how to do this in a more general way. Of course, most/all of these questions are also uncomputable in general, but I wonder to what degree you may still get good results.
This whole topic is also very important for optimisations in JIT compilers. Are they already doing these kinds of optimisations? To what degree?
Are there good recent research works which answer some of my questions? Or maybe also some results which say that it is just too hard to do this in such a general way?

It seems to me you are asking about partial evaluation.
I actually have a bit of a problem with that concept, because it is usually couched in terms that are over-academic and over-difficult.
The way it is usually expressed is that you have some general function F(Islow, Ifast) having arguments that can take different values at different times. The Islow arguments change seldom, and the Ifast arguments can be different every time it is called.
Then the problem is to write some kind of partial-evaluator function G(F, Islow) -> F1(Ifast) that takes function F and the Islow arguments, and generates a new (simpler) function F1 that only takes the Ifast arguments.
The problem with this is 1) somebody has to write the general function F, and 2) somebody has to write the general partial evaluator G.
What makes more sense to me is to write from scratch a function H(Islow) -> F1(Ifast), that is, write a code-generator specifically for F1, rather than writing two functions F and G, especially where G is very difficult to write.
H is usually much easier to write than F, and G need not be written at all! The result function F1 usually is smaller and has much higher performance than F, so it's a win-win situation.
When people write code generators, that is what they are doing, and it is a very effective programming technique.
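As a rough illustration of the H(Islow) -> F1(Ifast) idea, here is a minimal Python sketch using integer exponentiation, where the exponent plays the role of Islow and the base the role of Ifast. The names power, make_power and power_n are hypothetical, and generating Python source with exec is just one convenient way to build F1; it is not meant as the only or best way.

```python
def power(x, n):
    """General function F(Islow=n, Ifast=x): square-and-multiply."""
    result = 1
    while n > 0:
        if n & 1:
            result *= x
        x *= x
        n >>= 1
    return result


def make_power(n):
    """Code generator H(Islow) -> F1(Ifast).

    All decisions that depend on n are taken here, once, so the
    generated function is straight-line code with no loop or branch.
    """
    lines = ["def power_n(x):", "    result = 1"]
    while n > 0:
        if n & 1:
            lines.append("    result *= x")
        n >>= 1
        if n > 0:
            lines.append("    x *= x")
    lines.append("    return result")
    namespace = {}
    exec("\n".join(lines), namespace)   # compile the generated source
    return namespace["power_n"]


power_5 = make_power(5)   # F1 for Islow = 5
```

Note that make_power never needed a general partial evaluator G; it only knows how to emit code for this one family of functions, which is the point the answer is making.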

Related

Godot code optimization - calling a value from another node

Got a question regarding how it will be better to get values from another node within one function.
I have this piece of code, which gets the same value from another node multiple times:
func locrestrict(transformX, transformY):
    self.transform.origin.x = clamp(transformX, -cam_node.cam_bounds.x - 1, cam_node.cam_bounds.x + 1)
    self.transform.origin.y = clamp(transformY, cam_node.cam_bounds.y - 1, -cam_node.cam_bounds.y * 2)
    if cam_node.translation != cam_node.cam_init:
        self.transform.origin.x = clamp(transformX,
                -cam_node.cam_bounds.x - 1 + (cam_node.translation.x - cam_node.cam_init.x),
                cam_node.cam_bounds.x + 1 + (cam_node.translation.x - cam_node.cam_init.x))
        self.transform.origin.y = clamp(transformY,
                cam_node.cam_bounds.y - 1 + (cam_node.translation.y - cam_node.cam_init.y),
                -cam_node.cam_bounds.y * 2 + (cam_node.translation.y - cam_node.cam_init.y))
Will Godot automatically reuse the same value from memory for all these operations, or would it be better to store it in a temporary variable at the start of the function to prevent multiple node accesses?

Godot 3.x does not do any such optimizations. However, expect very little performance improvement from doing it by hand. The reason is that in Godot 3.x GDScript is an interpreted language, and Godot still has to figure out what you are accessing (local variable or not).
I suggest you try the profiler to find out what functions are actually expensive in runtime and begin there. And while we are at it, don't take my word for it. If this function performance is important, try the optimization you have in mind, and see if it made any difference in the profiler.
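If you do want to try the temporary-variable version, the restructuring could look roughly like the sketch below. It is written in Python rather than GDScript, purely to show the logic; Vec2, clamp and the parameter names are stand-ins I made up for the node properties in the question. It relies on one observation: when translation equals cam_init the offset is zero, so the branch in the original function is not needed.

```python
from dataclasses import dataclass


@dataclass
class Vec2:
    """Stand-in for a vector with x and y components."""
    x: float
    y: float


def clamp(value, lo, hi):
    """Same result as a built-in clamp whenever lo <= hi."""
    return max(lo, min(value, hi))


def locrestrict(x, y, bounds, translation, init):
    # Read each camera value once and keep it in a local.
    offset_x = translation.x - init.x
    offset_y = translation.y - init.y
    # A zero offset reproduces the unshifted limits, so no "if" is needed.
    new_x = clamp(x, -bounds.x - 1 + offset_x, bounds.x + 1 + offset_x)
    new_y = clamp(y, bounds.y - 1 + offset_y, -bounds.y * 2 + offset_y)
    return new_x, new_y
```

Whether this is measurably faster in GDScript is exactly what the profiler should tell you; the sketch only shows that the code gets shorter.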
In Godot 4.0 GDScript is compiled to a virtual machine. I don't know how smart the compiler is, or will be in the stable release of Godot 4.0.
I also want to say that optimization isn't always about doing things faster. Sometimes it is about doing less, or doing less frequently.
For instance, this function as is, probably isn't a performance issue, but if it has to run tens of thousands of times per frame, it might be.
It appears - I'm not sure I fully understand the code - you are implementing the functionality of drag_margin_* and limit_* properties of a Camera2D. If you can have the Camera2D do this for you, instead of writing it in GDScript, you will have better performance. This is an optimization by doing less.
Anyway, it appears the function at hand depends on a particular transform changing. If you have reason to believe it might not change every frame, then you might benefit from only calling this code when it does (e.g. using NOTIFICATION_TRANSFORM_CHANGED). See also How to invoke a function when changing position?. This is an optimization by doing less frequently.
By the way, speaking of optimizations: typing your variables in Godot 3.x has virtually no impact on performance, but it does in Godot 4.0. I suggest adopting it as good practice. In this case it appears the parameters should be float, and the method is void (does not return a value):
func locrestrict(transformX: float, transformY: float) -> void:
    # …
Otherwise you are defining them to be Variant, and Godot has to figure out the type of what is stored in the Variant every time.
Something that will have a more noticeable impact on performance in Godot 3.x is taking advantage of some types. In particular, consider using Vector2. If you can take advantage of vectors to operate on multiple values in fewer instructions, Godot needs to interpret fewer instructions to achieve the same result, and thus you get better performance. This is an optimization by doing less.
GDQuest has a few articles about GDScript that you may find useful (they did spend some time figuring out what actually works):
Measuring code performances
Making the most of Godot's speed
Optimizing GDScript code
Using type hints with Godot's GDScript
Note: Even though the term "type hints" stuck, they are actual type declarations.
You may also consider using C# or even C++ as avenues of optimization for the critical parts of your code. GDScript is a great glue language, and also a great prototyping language. Yet there is value in porting expensive computations to another language. This would be an optimization by doing things faster. Although I wouldn't consider the code in the question to be an expensive computation.

Complex Boolean expression optimization, normal forms?

I'm working on a streaming rules engine, and some of my customers have a few hundred rules they'd like to evaluate on every event that arrives at the system. The rules are pure (i.e. non-side-effecting) Boolean expressions, and they can be nested arbitrarily deeply.
Customers are creating, updating and deleting rules at runtime, and I need to detect and adapt to the population of rules dynamically. At the moment, the expression evaluation uses an interpreter over the internal AST, and I haven't started thinking about codegen yet.
As always, some of the predicates in the tree are MUCH cheaper to evaluate than others, and I've been looking for an algorithm or data structure that makes it easier to find the predicates that are cheap, and that are validly interpretable as controlling the entire expression. My mental headline for this pattern is "ANDs all the way to the root", i.e. any predicate for which all ancestors are ANDs can be interpreted as controlling.
Despite several days of literature search, reading about ROBDDs, CNF, DNF, etc., I haven't been able to close the loop from what might be common practice in the industry to my particular use case. One thing I've found that seems related is Analysis and optimization for boolean expression indexing
but it's not clear how I could apply it without implementing the BE-Tree data structure myself, as there doesn't seem to be an open source implementation.
I keep half-jokingly mentioning to my team that we're going to need a SAT solver one of these days. 😅 I guess it would probably suffice to write a recursive algorithm that traverses the tree and keeps track of whether every ancestor is an AND or an OR, but I keep getting the "surely this is a solved problem" feeling. :)
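For reference, the recursive traversal I have in mind would look something like this minimal Python sketch. The tuple-based AST shape ("and"/"or"/"not"/"pred") and the function name are made up for illustration and are not our real internal AST.

```python
def controlling_predicates(node):
    """Yield names of predicates whose ancestors are all ANDs.

    If any such predicate is false, the whole expression is false.
    Nodes are tuples: ("pred", name), ("not", child),
    ("and", child, ...) or ("or", child, ...).
    """
    kind = node[0]
    if kind == "pred":
        yield node[1]
    elif kind == "and":
        for child in node[1:]:
            yield from controlling_predicates(child)
    # Anything below an "or" or a "not" is not controlling, so stop there.


rule = ("and",
        ("pred", "a"),
        ("or", ("pred", "b"), ("pred", "c")),
        ("and", ("pred", "d"), ("not", ("pred", "e"))))
```

For the rule above this yields a and d: b and c sit under an OR, and e sits under a NOT.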
Edit: After talking to a couple of friends, I think I may have a sketch of a solution!
Transform the expressions into Conjunctive Normal Form, in which, by definition, every node is in a valid short-circuit position.
Use the Tseitin algorithm to try to avoid exponential blowups in expression size as a result of the CNF transform
For each AND in the tree, sort it in ascending order of cost (i.e. cheapest to the left)
???
Profit!^Weval as usual :)

You should seriously consider compiling the rules (and the predicates). An interpreter is 10-50x slower than machine code for the same thing. This is a good idea if the rule set doesn't change very often. It's even a good idea if the rules can change dynamically, because in practice they still don't change very fast, although now your rule compiler has to be online. Eh, that just makes for a bigger application program, and memory isn't much of an issue anymore.
A Boolean expression evaluation using individual machine instructions is even better. Any complex boolean equation can be compiled in branchless sequences of individual machine instructions over the leaf values. No branches, no cache misses; stuff runs pretty damn fast. Now, if you have expensive predicates, you probably want to compile code with branches to skip subtrees that don't affect the result of the expression, if they contain expensive predicates.
Within reason, you can generate any equivalent form (I'd run screaming into the night over the idea of using CNF because it always blows up on you). What you really want is the shortest boolean equation (deepest expression tree) equivalent to what the clients provided because that will take the fewest machine instructions to execute. This may sound crazy, but you might consider exhaustive search code generation, e.g., literally try every combination that has a chance of working, especially if the number of operators in the equation is relatively small. The VLSI world has been working hard on doing various optimizations when synthesizing boolean equations into gates. You should look into the Espresso heuristic logic minimizer (https://en.wikipedia.org/wiki/Espresso_heuristic_logic_minimizer).
One thing that might drive your expression evaluation is literally the cost of the predicates. If I have the formula A and B, and I know that A is expensive to evaluate and usually returns true, then clearly I want to evaluate B and A instead.
You should consider common sub expression evaluation, so that any common subterm is only computed once. This is especially important when one has expensive predicates; you never want to evaluate the same expensive predicate twice.
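The last two points (cheap predicates first, each predicate computed at most once per event) can be sketched in Python as follows. This is an interpreter-level illustration, not the compiled approach recommended above; the tuple AST, the static cost table and all function names are assumptions made for the example, and the ordering ignores how likely each predicate is to be true.

```python
def cost(node, costs):
    """Estimated cost of a subtree: sum of the costs of its predicates."""
    if node[0] == "pred":
        return costs[node[1]]
    return sum(cost(child, costs) for child in node[1:])


def sort_by_cost(node, costs):
    """Reorder AND/OR operands so the cheapest subtree comes first."""
    if node[0] == "pred":
        return node
    children = [sort_by_cost(child, costs) for child in node[1:]]
    if node[0] in ("and", "or"):
        children.sort(key=lambda child: cost(child, costs))
    return (node[0], *children)


def evaluate(node, predicates, event, cache):
    """Short-circuit evaluation; cache holds predicate results for one event."""
    kind = node[0]
    if kind == "pred":
        name = node[1]
        if name not in cache:               # shared subterm: compute once
            cache[name] = predicates[name](event)
        return cache[name]
    if kind == "not":
        return not evaluate(node[1], predicates, event, cache)
    results = (evaluate(child, predicates, event, cache) for child in node[1:])
    return all(results) if kind == "and" else any(results)
```

Passing the same cache dict to every rule evaluated for one event means an expensive predicate shared between rules runs once; a fresh dict is needed for the next event.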
I implemented these tricks in a PLC emulator (these are basically machines that evaluate buckets [like hundreds of thousands] of boolean equations telling factory actuators when to move) using x86 machine instructions for AND/OR/NOT for Rockwell Automation some 20 years ago. It outran Rockwell's "premier" PLC which had custom hardware but was essentially an interpreter.
You might also consider incremental evaluation of the equations. The basic idea is not to re-evaluate all the equations over and over, but rather to re-evaluate only those equations whose input changed. Details are too long to include here, but a patent I did back then explains how to do it. See https://patents.google.com/patent/US5623401A/en?inventor=Ira+D+Baxter&oq=Ira+D+Baxter
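To give a flavour of incremental evaluation, here is a minimal Python sketch of the general idea: keep a map from each input to the equations that read it, and recompute only those. This is a generic dirty-set scheme written for illustration; it is not the method described in the patent, and the class name, the equation format and the example signals are all made up.

```python
class IncrementalEvaluator:
    """Re-evaluates only the equations whose inputs actually changed."""

    def __init__(self, equations, inputs):
        # equations: name -> (tuple of input names, function of those inputs)
        self.equations = equations
        self.inputs = dict(inputs)
        self.dependents = {}
        for name, (input_names, _) in equations.items():
            for input_name in input_names:
                self.dependents.setdefault(input_name, set()).add(name)
        # Evaluate everything once to establish a baseline.
        self.results = {name: self._run(name) for name in equations}

    def _run(self, name):
        input_names, func = self.equations[name]
        return func(*(self.inputs[n] for n in input_names))

    def update(self, changes):
        """Apply new input values; return the set of equations recomputed."""
        dirty = set()
        for input_name, value in changes.items():
            if self.inputs[input_name] != value:
                self.inputs[input_name] = value
                dirty |= self.dependents.get(input_name, set())
        for name in dirty:
            self.results[name] = self._run(name)
        return dirty


equations = {
    "motor": (("start", "stop"), lambda start, stop: start and not stop),
    "lamp": (("door",), lambda door: door),
}
plc = IncrementalEvaluator(equations, {"start": False, "stop": False, "door": False})
```

This sketch assumes equations only read raw inputs; if equations can feed other equations, the dirty set has to be propagated in dependency order.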

Is it better to use a boolean variable to replace an if condition for readability or not?

I am in the second year of my bachelor's degree in information technology. Last year, one of my courses taught me to write clean code so that other programmers have an easier time working with it. I learned a lot about writing clean code from a video ("clean code") on Pluralsight (a paid learning website which my school uses). It included an example of assigning if conditions to boolean variables and using them to enhance readability. In my course today, my teacher told me this is very bad code because it decreases performance (in bigger programs) due to the additional tests being executed. I am now wondering whether I should continue using boolean variables for readability or avoid them for performance. I will illustrate with an example (I am using Python code for this example):
example boolean variable
Let's say we need to check whether somebody is legally allowed to drink alcohol. We get the person's age, and we know the legal drinking age is 21.
is_old_enough = persons_age >= legal_drinking_age
if is_old_enough:
    pass  # do something
My teacher told me today that this would be very bad for performance since 2 tests are performed: first persons_age >= legal_drinking_age is tested, and then in the if another test occurs on whether the person is_old_enough.
My teacher told me that I should just put the condition in the if, but in the video they said that code should be read like natural language to make it clear for other programmers. I was wondering now which would be the better coding practice.
example condition in if:
if persons_age >= legal_drinking_age:
    pass  # do something
In this example only 1 test is performed: whether persons_age >= legal_drinking_age. According to my teacher this is better code.
Thank you in advance!
yours faithfully
Jonas

I was wondering now which would be the better coding practice.
The really safe answer is: it depends.
I hate to use this answer, but you wouldn't be asking unless you had genuine doubt. (:
IMHO:
If the code is meant for long-term use, where maintainability is important, then clearly readable code is preferred.
If execution speed is crucial, then any code that uses fewer resources (smaller data sizes/types, fewer loop iterations to achieve the same thing, optimized task sequencing, more work per clock cycle, fewer data reloads) is better. (Example keyword: space-for-time trade-off.)
If minimizing memory usage is crucial, then any code that uses less storage and memory to complete its operation (which may take more CPU cycles/loops for the same task) is better. (Example: small devices that have limited data storage/RAM.)
If you are in a race, then you may want to write code that is as short as possible (even if it takes slightly more CPU time later). Example: a hackathon.
If you are programming to teach a team of students/friends something, then readable code plus a lot of comments is definitely preferred.
If it were me, I'd stick to anything as close to assembly language as possible (as much control over bit manipulation as possible) for backend development, and anything closest to Mathematica-like code (less code, max output, without really caring how much CPU/memory is needed) for frontend development. ( :
So, if it is you, you may have your own requirements/preferences. From the user's/outsider's/customer's point of view, it is just a working or non-working program. Your definition of a good program may differ from others', but this shouldn't stop us from being flexible in coding style/method.
Happy exploring. Hope it helps.. in any way possible.

Performance
Performance is one of the least interesting concerns for this question, and I say this as one working in very performance-critical areas like image processing and raytracing who believes in effective micro-optimizations (but my ideas of effective micro-optimization would be things like improving memory access patterns and memory layouts for cache efficiency, not eliminating temporary variables out of fear that your compiler or interpreter might allocate additional registers and/or utilize additional instructions).
The reason it's not so interesting is that, as pointed out in the comments, any decent optimizing compiler is going to treat the two versions you wrote as equivalent by the time it finishes optimizing the intermediate representation and performs instruction selection and register allocation to produce the final output (machine code). And if you aren't using a decent optimizing compiler, then this sort of microscopic efficiency is probably the last thing you should be worrying about either way.
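Since the question uses Python, one way to check the "two tests" claim is to look at the bytecode. The sketch below uses the standard dis module to count comparison instructions in both variants; the function names and return values are invented for the example. Keep in mind that CPython is not an optimizing compiler in the sense described above: the named variable costs an extra store and load of a local, but as far as I can tell it does not add a second comparison.

```python
import dis


def with_variable(persons_age, legal_drinking_age=21):
    is_old_enough = persons_age >= legal_drinking_age
    if is_old_enough:
        return "serve"
    return "refuse"


def inline_condition(persons_age, legal_drinking_age=21):
    if persons_age >= legal_drinking_age:
        return "serve"
    return "refuse"


def count_comparisons(func):
    """Number of COMPARE_OP instructions in the function's bytecode."""
    return sum(1 for instruction in dis.get_instructions(func)
               if instruction.opname == "COMPARE_OP")
```

Calling dis.dis(with_variable) prints the full listing if you want to see the extra store and load for yourself.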
Variable Scopes
With performance aside, the only concern I'd have with this convention, and I think it's generally a good one to apply liberally, is for languages that don't have a concept of a named constant to distinguish it from a variable.
In those cases, the more variables you introduce to a meaty function, the more intellectual overhead it can have as the number of variables with a relatively wide scope increases, and that can translate to practical burdens in maintenance and debugging in extreme cases. If you imagine a case like this:
some_variable = ...
...
some_other_variable = ...
...
yet_another_variable = ...
(300 lines more code to the function)
... in some function, and you're trying to debug it, then those variables combined with the monstrous size of the function start to multiply the difficulty of trying to figure out what went wrong. That's a practical concern I've encountered when debugging codebases spanning millions of lines of code written by all sorts of people (including those no longer on the team) where it's not so fun to look at the locals watch window in a debugger and see two pages worth of variables in some monstrous function that appears to be doing something incorrectly (or in one of the functions it calls).
But that's only an issue when it's combined with questionable programming practices like writing functions that span hundreds or thousands of lines of code. In those cases it will often improve everything just focusing on making reasonable-sized functions that perform one clear logical operation and don't have more than one side effect (or none ideally if the function can be programmed as a pure function). If you design your functions reasonably then I wouldn't worry about this at all and favor whatever is readable and easiest to comprehend at a glance and maybe even what is most writable and "pliable" (to make changes to the function easier if you anticipate a future need).
A Pragmatic View on Variable Scopes
So I think a lot of programming concepts can be understood to some degree by just understanding the need to narrow variable scopes. People say avoid global variables like the plague. We can go into issues with how that shared state can interfere with multithreading and how it makes programs difficult to change and debug, but you can understand a lot of the problems just through the desire to narrow variable scopes. If you have a codebase which spans a hundred thousand lines of code, then a global variable is going to have a scope of a hundred thousand lines of code for both access and modification, and crudely speaking a hundred thousand ways to go wrong.
At the same time that pragmatic sort of view will find it pointless to make a one-shot program which only spans 100 lines of code with no future need for extension avoid global variables like the plague, since a global here is only going to have 100 lines worth of scope, so to speak. Meanwhile even someone who avoids those like the plague in all contexts might still write a class with member variables (including some superfluous ones for "convenience") whose implementation spans 8,000 lines of code, at which point those variables have a much wider scope than even the global variable in the former example, and this realization could drive someone to design smaller classes and/or reduce the number of superfluous member variables to include as part of the state management for the class (which can also translate to simplified multithreading and all the similar types of benefits of avoiding global variables in some non-trivial codebase).
And finally it'll tend to tempt you to write smaller functions as well, since a variable towards the top of some function spanning 500 lines of code is going to also have a fairly wide scope. So anyway, my only concern when you do this is to not let the scope of those temporary, local variables get too wide. And if they do, then the general answer is not necessarily to avoid those variables but to narrow their scope.

What techniques are available for memory optimizing in 8051 assembly language?

I need to optimize code to get room for some new code. I do not have the space for all the changes. I can not use code bank switching (80c31 with 64k).

You haven't really given a lot to go on here, but there are two main levels of optimizations you can consider:
Micro-Optimizations:
eg. XOR A instead of MOV A,0
Adam has covered some of these nicely earlier.
Macro-Optimizations:
Look at the structure of your program, the data structures and algorithms used, the tasks performed, and think VERY hard about how these could be rearranged or even removed. Are there whole chunks of code that actually aren't used? Is your code full of debug output statements that the user never sees? Are there functions specific to a single customer that you could leave out of a general release?
To get a good handle on that, you'll need to work out WHERE your memory is being used up. The Linker map is a good place to start with this. Macro-optimizations are where the BIG wins can be made.
As an aside, you could - seriously- try rewriting parts of your code with a good optimizing C compiler. You may be amazed at how tight the code can be. A true assembler hotshot may be able to improve on it, but it can easily be better than most coders. I used the IAR one about 20 years ago, and it blew my socks off.

With assembly language, you'll have to optimize by hand. Here are a few techniques:
Note: IANA8051P (I am not an 8051 programmer, but I have done lots of assembly on other 8-bit chips).
Go through the code looking for any duplicated bits, no matter how small and make them functions.
Learn some of the more unusual instructions and see if you can use them to optimize, e.g. a nice trick is to use XOR A to clear the accumulator instead of MOV A,0 - it saves a byte. (In 8051 mnemonics, the one-byte instruction for this is CLR A, versus the two-byte MOV A,#0.)
Another neat trick is if you call a function before returning, just jump to it eg, instead of:
CALL otherfunc
RET
Just do:
JMP otherfunc
Always make sure you are doing relative jumps and branches wherever possible; they use less memory than absolute jumps.
That's all I can think of off the top of my head for the moment.

Sorry I am coming to this late, but I once had exactly the same problem, and it became a repeated problem that kept coming back to me. In my case the project was a telephone, on an 8051 family processor, and I had totally maxed out the ROM (code) memory. It kept coming back to me because management kept requesting new features, so each new feature became a two step process. 1) Optimize old stuff to make room 2) Implement the new feature, using up the room I just made.
There are two approaches to optimization: tactical and strategic. Tactical optimizations save a few bytes at a time with a micro-optimization idea. I think you need strategic optimizations, which involve a more radical rethinking of how you are doing things.
Something I remember worked for me and could work for you:
Look at the essence of what your code has to do and try to distill out some really strong flexible primitive operations. Then rebuild your top level code so that it does nothing low level at all except call on the primitives. Ideally use a table based approach, your table contains stuff like; Input state, event, output state, primitives.... In other words when an event happens, look up a cell in the table for that event in the current state. That cell tells you what new state to change to (optionally) and what primitive(s) (if any) to execute. You might need multiple sets of states/events/tables/primitives for different layers/subsystems.
One of the many benefits of this approach is that you can think of it as building a custom language for your particular problem, in which you can very efficiently (i.e. with minimal extra code) create new functionality simply by modifying the table.
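The shape of such a table is easier to see in a high-level language than in 8051 assembly, so here is a minimal Python sketch. The states, events and primitives are invented for a toy telephone and are not from my original project; on the real target the table would be a block of bytes in code memory and the primitives would be subroutine addresses.

```python
# Primitive operations: small, reusable, and the only place low-level work happens.
def play_dial_tone(log):
    log.append("dial tone on")


def stop_dial_tone(log):
    log.append("dial tone off")


def collect_digit(log):
    log.append("digit stored")


# (current state, event) -> (next state, primitives to execute)
TABLE = {
    ("idle", "lift"): ("dialing", (play_dial_tone,)),
    ("dialing", "digit"): ("dialing", (stop_dial_tone, collect_digit)),
    ("dialing", "hang_up"): ("idle", (stop_dial_tone,)),
}


def dispatch(state, event, log):
    """Look up the table cell, run its primitives, return the new state."""
    next_state, primitives = TABLE.get((state, event), (state, ()))
    for primitive in primitives:
        primitive(log)
    return next_state
```

Adding a feature then mostly means adding rows to TABLE, which costs a few bytes of data rather than new control-flow code.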
Sorry I am months late and you probably didn't have time to do something this radical anyway. For all I know you were already using a similar approach! But my answer might help someone else someday, who knows.

In the whacked-out department, you could also consider compressing part of your code and only keeping some part that is actively used decompressed at any particular point in time. I have a hard time believing that the code required for the compress/decompress system would be a small enough portion of the tiny memory of the 8051 to make this worthwhile, but it has worked wonders on slightly larger systems.
Yet another approach is to turn to a byte-code format or the kind of table-driven code that some state machine tools output -- having a machine understand what your app is doing and generating a completely incomprehensible implementation can be a great way to save room :)
Finally, if the code is indeed compiled in C, I would suggest compiling with a range of different options to see what happens. Also, I wrote a piece on compact C coding for the ESC back in 2001 that is still pretty current. See that text for other tricks for small machines.

1) Where possible, save your variables in IDATA, not in XDATA
2) Look at your jump statements – make use of SJMP and AJMP

I assume you know it won't fit because you wrote/compiled it and got the "out of memory" error. :) It appears the answers address your question pretty accurately, short of giving code examples.
I would, however, recommend a few additional thoughts;
1) Make sure all the code is really being used -- code coverage test? An unused sub is a big win -- this is a tough step -- if you're the original author, it may be easier -- (well, maybe) :)
2) Ensure the level of "verification" and initialization -- sometimes we have a tendency to be overzealous in ensuring we have initialized variables/memory, and sure enough rightly so; how many times have we been bitten by it. Not saying don't initialize (duh), but if we're doing a memory move, the destination doesn't need to be zeroed first -- this dovetails with 1 --
3) Eval the new features -- can an existing sub be enhanced to cover both functions, or perhaps an existing feature replaced?
4) Break up big code if a piece of the big code can save creating a new little code.
or perhaps there's an argument for hardware version 2.0 on the table now ... :)
regards

Besides the already mentioned (more or less) obvious optimizations, here is a really weird (and almost impossible to achieve) one: code reuse. And by code reuse I don't mean the normal kind of reuse, but a) reusing your code as data or b) reusing your code as other code. Maybe you can create a LUT (or whatever static data) that can be represented by the hex opcodes of your asm (here you have to consider Harvard vs. von Neumann architecture).
The other would reuse code by giving it a different meaning when you address it differently. Here is an example to make clear what I mean. If the bytes for your code look like this: AABCCCDDEEFFGGHH at address X, where each letter stands for one opcode, imagine you now jump to X+1. Maybe you get completely different functionality, where the bytes, now separated by spaces, form the new opcodes: ABC CCD DE EF GH.
But beware: this is not only tricky to achieve (maybe it's impossible), but it's also a horror to maintain. So if you are not a demo coder (or something similarly exotic), I would recommend using the other ways already mentioned to save memory.

Metrics & Object-oriented programming

I would like to know if anybody regularly uses metrics to validate their code/design.
As an example, I think I will use:
number of lines per method (< 20)
number of variables per method (< 7)
number of parameters per method (< 8)
number of methods per class (< 20)
number of fields per class (< 20)
inheritance tree depth (< 6).
Lack of Cohesion in Methods
Most of these metrics are very simple.
What is your policy on this kind of measure? Do you use a tool to check them (e.g. NDepend)?

Imposing numerical limits on those values (as you seem to imply with the numbers) is, in my opinion, not a very good idea. The number of lines in a method could be very large if there is a significant switch statement, and yet the method is still simple and proper. The number of fields in a class can be appropriately very large if the fields are simple. And five levels of inheritance could be way too many, sometimes.
I think it is better to analyze the class cohesion (more is better) and coupling (less is better), but even then I am doubtful of the utility of such metrics. Experience is usually a better guide (though that is, admittedly, expensive).

A metric I didn't see in your list is McCabe's Cyclomatic Complexity. It measures the complexity of a given function, and has a correlation with bugginess. That is, a high complexity score for a function indicates: 1) it is likely to be a buggy function and 2) it is likely to be hard to fix properly (e.g. fixes will introduce their own bugs).
Ultimately, metrics are best used at a gross level -- like control charts. You look for points above and below the control limits to identify likely special cases, then you look at the details. For example, a function with a high cyclomatic complexity may cause you to look at it, only to discover that the score is appropriate because it is a dispatcher method with a number of cases.
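As a rough sketch of what such a score counts, the snippet below approximates cyclomatic complexity for Python functions as 1 plus the number of decision points. Which constructs count as a decision point is an assumption of this example (real tools differ in the details), and the sample functions are invented. Note how the plain dispatcher still gets a fairly high score for its size.

```python
import ast

# Node types that each add one decision point. This is a simplification;
# real tools differ in exactly which constructs they count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity per function: 1 + number of decision points."""
    scores = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        score = 1
        for node in ast.walk(func):  # nested functions are counted into the outer one
            if isinstance(node, _BRANCH_NODES):
                score += 1
            elif isinstance(node, ast.BoolOp):
                score += len(node.values) - 1  # each and/or adds a path
        scores[func.name] = score
    return scores


SAMPLE = '''
def classify(n):
    if n < 0 and n != -1:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"

def dispatch(kind):
    if kind == "a":
        return 1
    elif kind == "b":
        return 2
    elif kind == "c":
        return 3
    return 0
'''

print(cyclomatic_complexity(SAMPLE))
```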

Management by metrics does not work for people or for code; no metric or absolute value will work in every case. Please don't let a fascination with metrics distract from truly evaluating the quality of the code. Metrics may appear to tell you important things about the code, but the best they can do is hint at areas to investigate.
That is not to say that metrics are not useful. Metrics are most useful when they are changing, to look for areas that may be changing in unexpected ways. For example, if you suddenly go from 3 levels of inheritance to 15, or from 4 parameters per method to 12, dig in and figure out why.
Example: a stored procedure to update a database table may have as many parameters as the table has columns; an object interface to this procedure may have the same, or it may have one if there is an object to represent the data entity. But the constructor for the data entity may have all of those parameters. So what would the metrics for this tell you? Not much! And if you have enough situations like this in the code base, the target averages will be blown out of the water.
So don't rely on metrics as absolute indicators of anything; there is no substitute for reading/reviewing the code.
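Watching for sudden jumps, as opposed to enforcing absolute limits, could look roughly like the sketch below. The function name, the growth factor of 2 and the `methods_per_class` values are assumptions for illustration; the inheritance and parameter numbers are the ones from the example above.

```python
def sudden_changes(previous, current, factor=2.0):
    """Return metrics whose value grew by at least `factor` between two snapshots."""
    flagged = {}
    for name, new_value in current.items():
        old_value = previous.get(name)
        # Skip metrics that are new or were zero before; there is no ratio to compare.
        if old_value and new_value >= old_value * factor:
            flagged[name] = (old_value, new_value)
    return flagged


last_release = {"inheritance_depth": 3, "params_per_method": 4, "methods_per_class": 12}
this_release = {"inheritance_depth": 15, "params_per_method": 12, "methods_per_class": 13}

# Flags the two jumps worth digging into; the small drift is ignored.
print(sudden_changes(last_release, this_release))
```

The output is only a list of places to go and read the code, not a verdict on it.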

Personally I think it's very difficult to adhere to these types of requirements (i.e. sometimes you just really need a method with more than 20 lines), but in the spirit of your question I'll mention some of the guidelines used in an essay called Object Calisthenics (part of the Thoughtworks Anthology if you're interested).
Levels of indentation per method (<2)
Number of 'dots' per line (<2)
Number of lines per class (<50)
Number of classes per package (<10)
Number of instance variables per class (<3)
He also advocates not using the 'else' keyword nor any getters or setters, but I think that's a bit overboard.
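If you want to try the indentation guideline on real code, something like the following sketch could measure it for Python source. It counts how deeply block statements are nested inside each function; the set of node types treated as blocks is my own choice, and an `elif` chain is counted as nested `if` statements here, which overstates the visual indentation.

```python
import ast

# Statements that open a new indented block (an assumption of this sketch).
_BLOCKS = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting(node, depth=0):
    """Deepest level of nested block statements below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, _BLOCKS) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def nesting_per_function(source):
    """Map each function name to its maximum block nesting depth."""
    return {
        func.name: max_nesting(func)
        for func in ast.walk(ast.parse(source))
        if isinstance(func, ast.FunctionDef)
    }


SAMPLE = '''
def flat(items):
    for item in items:
        print(item)

def nested(grid):
    for row in grid:
        for cell in row:
            if cell:
                print(cell)
'''

print(nesting_per_function(SAMPLE))
```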

Hard numbers don't work for every solution. Some solutions are more complex than others. I would start with these as your guidelines and see where your project(s) end up.
But, regarding these numbers specifically, they seem pretty high. In my particular coding style I usually have:
no more than 3 parameters per method signature
about 5-10 lines per method
no more than 3 levels of inheritance
That isn't to say I never go over these generalities, but I usually think more about the code when I do because most of the time I can break things down.

As others have said, keeping to a strict standard is going to be tough. I think one of the most valuable uses of these metrics is to watch how they change as the application evolves. This helps to give you an idea how good a job you're doing on getting the necessary refactoring done as functionality is added, and helps prevent making a big mess :)

OO metrics are a bit of a pet project for me (they were the subject of my master's thesis). So yes, I'm using these, and I use a tool of my own.
For years the book "Object Oriented Software Metrics" by Mark Lorenz was the best resource for OO metrics. But recently I have seen more resources.
Unfortunately I have other deadlines so no time to work on the tool. But eventually I will be adding new metrics (and new language constructs).
Update
We are using the tool now to detect possible problems in the source. Several metrics we added (not all pure OO):
use of assert
use of magic constants
use of comments, in relation to the complexity of methods
statement nesting level
class dependency
number of public fields in a class
relative number of overridden methods
use of goto statements
There are still more. We keep the ones that give a good image of the pain spots in the code. So we have direct feedback if these are corrected.
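To give an idea of how simple some of these checks can be, here is a sketch of a "magic constants" detector for Python source. This is not the tool described above; the rules (0 and 1 are allowed, literals assigned directly to an ALL_CAPS name count as named constants) are assumptions chosen for this example.

```python
import ast

ALLOWED = {0, 1}  # assumption: these literals are not considered "magic"


def magic_constants(source):
    """Return sorted (line, value) pairs for numeric literals that are not named."""
    tree = ast.parse(source)

    # Literals assigned directly to ALL_CAPS names are treated as named constants.
    named = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            if all(isinstance(t, ast.Name) and t.id.isupper() for t in node.targets):
                named.add(id(node.value))

    found = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Constant) or id(node) in named:
            continue
        value = node.value
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if value not in ALLOWED:
            found.append((node.lineno, value))
    return sorted(found)


SAMPLE = '''
TIMEOUT_SECONDS = 30

def retry_delay(attempt):
    if attempt > 5:
        return TIMEOUT_SECONDS
    return attempt * 1.5 + 1
'''

print(magic_constants(SAMPLE))
```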
