Why does COBOL have both `SECTION` and `PARAGRAPH`? - language-design

Can anybody explain why the designers of COBOL created both SECTIONs and PARAGRAPHs? These have been around since the initial release of COBOL so I suspect the real reason for their existence has long since gone away (similar to things like NEXT SENTENCE which are still in the language specification for backward compatibility but no longer required since the introduction of explicit scope terminators).
My guess is that SECTION may have been introduced to support program overlays. SECTION has an optional PRIORITY number associated with it to identify the program overlay it is part of. However, most modern implementations of COBOL ignore or have dropped PRIORITY numbers (and overlays).
Currently, I see that SECTIONs are still required in the DECLARATIVE part of the PROCEDURE DIVISION, but can find no justification for this. I see no semantic difference between SECTION and PARAGRAPH other than PARAGRAPH is subordinate to SECTION.
Some COBOL shops ban the use of SECTION in favour of PARAGRAPH (seems common in North America). Others ban PARAGRAPH in favour of SECTION (seems common in Europe). Still others have guidelines as to when each is appropriate. All of this seems highly arbitrary to me - which begs the question: Why were they put into the language specification in the first place? And, do they have any relevance today?
If you answer this question, it would be great if you could also point to a reference to support your answer.
Thanks

No references on this, since it was passed on to me by one of the old-timers in my shop, but...
In the old COBOL compilers, at least for IBM and Unisys, sections were able to be loaded into memory one at a time. Back in the good old days when memory was scarce, a program that was too large to be loaded into memory all at once was able to be modularized for memory usage using sections. Having both sections and paragraphs allowed the programmer to decide which code parts were loaded into memory together if they couldn't all be loaded at once - you'd want two parts of the same perform loop loaded together for efficiency's sake. Nowadays it's more or less moot.
My shop uses paragraphs only, prohibits GO TO and requires exit paragraphs, so all our PERFORMs are PERFORM 100-PARAGRAPH THRU 100-EXIT or something similar - which seems to make the paragraphs more like sections to me. But I don't think there's really much of a difference now.
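For anyone who hasn't seen that convention, here is a minimal sketch of the exit-paragraph pattern being described; the program and paragraph names are invented for illustration:

IDENTIFICATION DIVISION.
PROGRAM-ID. PERFORM-THRU-DEMO.
PROCEDURE DIVISION.
0000-MAINLINE.
* Perform the whole range from 100-PRINT-GREETING down to 100-EXIT.
    PERFORM 100-PRINT-GREETING THRU 100-EXIT.
    GOBACK.
100-PRINT-GREETING.
    DISPLAY 'HELLO FROM THE PERFORMED RANGE'.
* The exit paragraph is just a fixed landing point at the end of the
* range; with GO TO banned it simply marks where the PERFORM returns.
100-EXIT.
    EXIT.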

I learned COBOL around 1978, on an ICL 2903. I have a vague memory that SECTIONs could be assigned a number range, which meant that those sections could be swapped in and out of memory when the program was too large to fit.

I know this is an old question, but the OP requested about documentation on the original justification of the use of SECTION as well as PARAGRAPH in COBOL.
You can't get much more "original" than the CODASYL Journal documentation.
In section 8 of the Journal's specification for the language:
"COBOL segmentation is a facility that provides a means by which the
user may communicate with the compiler to specify object program
overlay requirements"
( page 331, section 8.1 "Segmentation - General Description")
"Although it is not mandatory, the Procedure Division for a source
program is usually written as a consecutive group of sections, each of
which is composed of a series of closely related operations that are
designed to collectively perform a particular function. However, when
segmentation is used, the entire Procedure Division must be in
sections. In addition, each section must be classified as belonging
either to the fixed portion or to one of the independent segments of
the object program. Segmentation in no way affects the need for
qualification of procedure-names to insure uniqueness."
(p 331, section 8.1.2.1 "Program Segments")
In her book on comparative programming languages ("Programming Languages: History and Fundamentals", 1969) Jean Sammet (who sat on the CODASYL committee, representing Sylvania Electric) states:
".. The storage allocation is handled automatically by the compiler.
The prime unit for allocating executable code is a group of sections
called a segment. The programmer combines sections by specifying a
priority number with each section's name. ... The compiler is required
to see that the proper control transfers are provided so that control
among segments which are not stored simultaneously can take place.
..."
(p 369 - 371 V.3 COBOL)
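To make the quoted segmentation text concrete, here is a hedged sketch of roughly what a segmented PROCEDURE DIVISION looked like; the section names and priority numbers are invented, and most modern compilers ignore (or merely tolerate) the numbers:

PROCEDURE DIVISION.
* Priority numbers 0-49 marked the fixed (permanently resident)
* portion; 50-99 marked independent segments eligible for overlay.
MAIN-CONTROL SECTION 00.
MAIN-PARA.
    PERFORM INITIALISATION.
    PERFORM YEAR-END-RUN.
    GOBACK.
INITIALISATION SECTION 10.
INIT-PARA.
    DISPLAY 'SET UP FILES AND TABLES'.
* This section could be brought into memory only when PERFORMed,
* then overlaid by another independent segment.
YEAR-END-RUN SECTION 60.
YEAR-END-PARA.
    DISPLAY 'RARELY USED YEAR-END PROCESSING'.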

Well, the simplest of the reasons is that SECTIONs give you "modularity" -- just as functions do in C -- a necessity in "structured" programs. You will notice that code written using SECTIONs reads far more cleanly than code written only in paragraphs, because every section has an "EXIT" -- a single, very explicit exit point from the SECTION (the exit point of a paragraph is far more vague and implicit, i.e. wherever the next paragraph declaration happens to appear). Consider this example and you may be tempted to use sections in your code:
*==================
MAINLINE SECTION.
*==================
PERFORM SEC-A
PERFORM SEC-B
PERFORM SEC-C
GOBACK.
*==================
MAINLINE-EXIT.
*==================
EXIT.
*==================
SEC-A SECTION.
*==================
.....
.....
.....
.....
IF <cond>
go to A-EXIT
end-if
.....
.....
.....
.....
.
*==================
A-EXIT.
*==================
EXIT.
You don't have this sort of privilege when writing your code in paragraphs. You might instead have to write a huge ELSE branch to skip the statements you don't want to execute when a certain condition is reached (consider that set of statements running across 2-3 pages... a further set of IF / ELSE blocks would cramp your indentation). Of course, you'll have to use "GO TO" to achieve this, but you can always direct your professionals not to use GO TO except when exiting, which is a fair deal, I think.
So, whilst I also agree that anything that can be written using SECTIONs can also be written using paragraphs (with little or no tweaking), my personal choice would be to go for the approach that makes the job of my developers a little easier in the future!

Cobol was developed in the late 1950s. As the full name alludes, it was developed for business programming, as a language more relevant to business purposes than the existing "scientific" or "technical" languages. There were very few "languages" at all then: machine code (specific, of course, to a particular architecture -- I nearly said "specific chip" before remembering vacuum tubes), which on some machines had to be set through physical switches or dials, and, if you were lucky, an "Assembler". Cobol was very advanced for its day, for its purpose.
The intention was for programs written in Cobol to be much more like English-language than just a set of "codes" which mean something to the initiated.
If you look at some of the nomenclature relating to the language - paragraph, sentence, verb, clause - it is deliberately following the patterns ascribed to the English language.
SECTION doesn't quite fit into this, until you relate things to a formal business document.
Both SECTIONs and paragraphs also appear outside the PROCEDURE DIVISION. As in written English, paragraphs can exist on their own, or can be a part of a SECTION.
SECTIONs may have a priority-number which relates to the "segmentation feature". This used to include "overlaying" of SECTIONs to afford a primitive level of memory management. This is a "computing feature" rather than an English-language one :-) The "segmentation feature" does have something of a remaining effect, but I've never seen it actually used.
Without DECLARATIVES (which I don't use, and have just noticed the manual is unclear on), it is a matter of "choice" whether SECTIONs or paragraphs are used for PERFORM.
If GO TO is used, rationally, "equivalence" can be achieved with PERFORM ... THRU .... If not, and there is no gratuitous use of PERFORM ... THRU ..., then there is equivalence already.
Comparisons to "structured" code and modern languages are "reading history backwards" or just outlining a particular "practice". From the reputation attained by "spaghetti code" and ALTER ... TO PROCEED TO ... it may well be that for 20 years it was "common" to not do much with PERFORM unless you needed the "memory management", but I have no references or knowledge to back this up.
SECTIONs allow duplicate paragraph-names, otherwise paragraph-names must be unique.
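As a hedged illustration of that "namespacing", with invented names: the paragraph-name INITIALISE appears in two SECTIONs, and each PERFORM qualifies it with the section it belongs to.

IDENTIFICATION DIVISION.
PROGRAM-ID. QUALIFIED-PARA-DEMO.
PROCEDURE DIVISION.
MAIN SECTION.
MAIN-PARA.
* The same paragraph-name, disambiguated by its owning SECTION.
    PERFORM INITIALISE IN CUSTOMER-LOAD.
    PERFORM INITIALISE IN PRODUCT-LOAD.
    GOBACK.
CUSTOMER-LOAD SECTION.
INITIALISE.
    DISPLAY 'OPEN CUSTOMER FILE'.
PRODUCT-LOAD SECTION.
INITIALISE.
    DISPLAY 'OPEN PRODUCT FILE'.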
I can't put a specific finger on one over the other all the time.
If using GO TO, I'd use SECTIONs. If not, paragraphs. With DECLARATIVES I'd use SECTIONs. If using SECTIONs I'd start PROCEDURE DIVISION with a SECTION to avoid a diagnostic message.
Local standards may dictate, but not necessarily on a "modern" (or even "rational") basis. Much is "known" but actually misunderstood about SECTIONs and paragraphs, in my experience.
For performance (where masses of data are being processed, and I mean masses) a PERFORM of one SECTION rather than multiple individual paragraphs would see improvements. The effect would be the same with PERFORM ... THRU ..., but I prefer not to recommend that.
GO TO outside the range of a PERFORM is 1) bad and 2) can lose out on "optimization". It shouldn't be a problem *except* when the GO TO is to an abend/exception routine and no logical return is expected. If that is felt to be necessary "immediately", it is better done with a PERFORM despite the "counter-intuitive" aspect (so document it).

For one thing, paragraph names must be unique unless they are in separate sections, so sections allow for "namespacing" of paragraphs.
If I recall correctly, the only reason you must use a SECTION is for DECLARATIVES. Aside from that they are optional and primarily useful for grouping paragraphs. I think it's common (relatively speaking, anyway) to require that PERFORM be used on paragraphs only when they are in the same section.
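For reference, a rough sketch of the one case where a SECTION is unavoidable, a DECLARATIVES error procedure. It assumes CUSTOMER-FILE and its FILE STATUS field CUST-STATUS are declared in the usual ENVIRONMENT and DATA DIVISION entries, which are omitted here:

PROCEDURE DIVISION.
DECLARATIVES.
* The USE sentence must belong to a SECTION; a bare paragraph won't do.
CUSTOMER-FILE-ERROR SECTION.
    USE AFTER STANDARD ERROR PROCEDURE ON CUSTOMER-FILE.
CUSTOMER-FILE-ERROR-PARA.
    DISPLAY 'I-O ERROR ON CUSTOMER-FILE, STATUS ' CUST-STATUS.
END DECLARATIVES.
* Once DECLARATIVES are used, the rest must also be in SECTIONs.
MAIN SECTION.
MAIN-PARA.
    OPEN INPUT CUSTOMER-FILE.
    CLOSE CUSTOMER-FILE.
    GOBACK.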

A section can have several paragraphs in it. When you PERFORM a section, it executes all the paragraphs in the section. Within the section you can use PERFORM or GO TO to branch to the paragraphs it contains.
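A small sketch of that behaviour, with invented names: PERFORM REPORT-BODY executes both paragraphs of the section, while PERFORM PRINT-DETAIL executes only that single paragraph.

IDENTIFICATION DIVISION.
PROGRAM-ID. SECTION-PERFORM-DEMO.
PROCEDURE DIVISION.
MAIN SECTION.
MAIN-PARA.
* Runs PRINT-DETAIL and PRINT-TOTALS, then returns here.
    PERFORM REPORT-BODY.
* Runs only PRINT-DETAIL.
    PERFORM PRINT-DETAIL.
    GOBACK.
REPORT-BODY SECTION.
PRINT-DETAIL.
    DISPLAY 'DETAIL LINE'.
PRINT-TOTALS.
    DISPLAY 'TOTAL LINE'.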

I will do the best I can to answer this. If your only coding exposure is x86 or ARM then you will have significant difficulty. Yes, those chips sell a lot, but that doesn't mean they are good, just cheap enough that people don't mind throwing them away.
Much of this information can be found in "The Minimum You Need to Know to Be an OpenVMS Application Developer." You will find it is one of the scant few titles on Dr. Dobb's recommended reading list for all developers. Yes, I wrote it. It is also the book recommended by HP OpenVMS Engineering group for developers looking to learn the platform.
My COBOL on that platform mostly happened during the 1980s when it was VAX/VMS. Then it became OpenVMS; Alpha/OpenVMS; Itanium/OpenVMS; and soon to be x86/OpenVMS. On a real computer with a real operating system, sections have meaning. Every section created a PSECT. In linker terms that was short for Program SECtion. Based on what the section was, various load attributes were set. Each PSECT would be loaded into one or more 512-byte memory pages. Memory pages were designed to be the exact same size as a disk block. VMS stood for Virtual Memory System. IBM had several of their own operating systems which, under the hood, were different, but they too were true virtual memory systems. This wasn't "overlay linking." That's an x86 term and came about due to severe architectural flaws. Read up on Compact, Small, Medium, and Large "memory models" from the 286 days onward. Also read up on EMS and XMS memory paging. Oiy was THAT fun!
Here is one of the numerous programs found in that book.
IDENTIFICATION DIVISION.
PROGRAM-ID. COB_ZILL_DUE_REPORT_SUB.
AUTHOR. Roland Hughes.
DATE-WRITTEN. 2005-02-08.
DATE-COMPILED. TODAY.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT DRAW-STATS
ASSIGN TO 'DRAWING_STATS'
ORGANIZATION IS INDEXED
ACCESS MODE IS SEQUENTIAL
RECORD KEY IS ELM_NO IN DSTATS-REC
LOCK MODE IS AUTOMATIC
FILE STATUS IS D-STAT.
SELECT MEGA-STATS
ASSIGN TO 'MEGA_STATS'
ORGANIZATION IS INDEXED
ACCESS MODE IS SEQUENTIAL
RECORD KEY IS ELM_NO IN MSTATS-REC
LOCK MODE IS AUTOMATIC
FILE STATUS IS M-STAT.
SELECT SORT-FILE ASSIGN TO 'TMP.SRT'.
SELECT SORTED-FILE ASSIGN TO DISK.
SELECT RPT-FILE ASSIGN TO 'ZILL_DUE.RPT'.
DATA DIVISION.
FILE SECTION.
FD DRAW-STATS
IS GLOBAL
LABEL RECORDS ARE STANDARD.
COPY 'CDD_RECORDS.ZILLIONARE_STATS_RECORD' FROM DICTIONARY
REPLACING ZILLIONARE_STATS_RECORD BY DSTATS-REC.
FD MEGA-STATS
IS GLOBAL
LABEL RECORDS ARE STANDARD.
COPY 'CDD_RECORDS.ZILLIONARE_STATS_RECORD' FROM DICTIONARY
REPLACING ZILLIONARE_STATS_RECORD BY MSTATS-REC.
FD RPT-FILE
LABEL RECORDS ARE OMITTED.
01 RPT-DTL PIC X(80).
SD SORT-FILE.
COPY 'CDD_RECORDS.ZILLIONARE_STATS_RECORD' FROM DICTIONARY
REPLACING ZILLIONARE_STATS_RECORD BY SORT-REC.
FD SORTED-FILE
VALUE OF ID IS SORTED-FILE-NAME.
COPY 'CDD_RECORDS.ZILLIONARE_STATS_RECORD' FROM DICTIONARY
REPLACING ZILLIONARE_STATS_RECORD BY SORTED-REC.
* Data declarations
WORKING-STORAGE SECTION.
01 CONSTANTS.
05 SORT-FILE-NAME PIC X(7) VALUE 'TMP.SRT'.
05 SORTED-FILE-NAME PIC X(8) VALUE 'STAT.SRT'.
01 STATUS-VARIABLES.
05 M-STAT PIC X(2).
05 D-STAT PIC X(2).
05 EOF-FLAG PIC X.
88 IT-IS-END-OF-FILE VALUE 'Y'.
01 STUFF.
05 TODAYS-DATE.
10 TODAY_YYYY PIC X(4).
10 TODAY_MM PIC X(2).
10 TODAY_DD PIC X(2).
05 TODAYS-DATE-FORMATTED.
10 FMT_MM PIC Z9.
10 FILLER PIC X VALUE '/'.
10 FMT_DD PIC 99.
10 FILLER PIC X VALUE '/'.
10 FMT_YYYY PIC 9(4).
05 FLT-1 COMP-2.
05 WORK-STR PIC X(65).
01 REPORT-DETAIL.
05 ELM-NO-DTL PIC Z9.
05 FILLER PIC X(3).
05 HIT-COUNT-DTL PIC ZZZ9.
05 FILLER PIC X(3).
05 SINCE-LAST-DTL PIC ZZZ9.
05 FILLER PIC X(5).
05 PCT-HITS-DTL PIC Z9.999.
05 FILLER PIC X(4).
05 AVE-BTWN-DTL PIC ZZ9.999.
01 REPORT-HDR1.
05 THE-DATE PIC X(12).
05 FILLER PIC X(20).
05 PAGE-TITLE PIC X(17).
01 REPORT-HDR2.
05 FILLER PIC X(33).
05 GROUP-TITLE PIC X(20).
01 REPORT-HDR3.
05 HDR3-TXT PIC X(40) VALUE
'No Hits Since Pct_hits Ave_btwn'.
01 REPORT-HDR4.
05 HDR4-TXT PIC X(40) VALUE
'-- ---- ----- -------- --------'.
PROCEDURE DIVISION.
A000-MAIN.
PERFORM B000-HSK.
SORT SORT-FILE
ON DESCENDING KEY SINCE_LAST IN SORT-REC
INPUT PROCEDURE IS S000-DSTAT-INPUT
GIVING SORTED-FILE.
PERFORM B010-REPORT-DRAWING-NUMBERS.
STRING SORT-FILE-NAME, ';*' DELIMITED BY SIZE INTO WORK-STR.
CALL 'LIB$DELETE_FILE' USING BY DESCRIPTOR WORK-STR.
STRING SORTED-FILE-NAME, ';*' DELIMITED BY SIZE INTO WORK-STR.
CALL 'LIB$DELETE_FILE' USING BY DESCRIPTOR WORK-STR.
*
* Set up for second part of report
*
MOVE SPACES TO RPT-DTL.
WRITE RPT-DTL BEFORE ADVANCING PAGE.
MOVE SPACES TO EOF-FLAG.
MOVE ' Mega Drawing Numbers' TO GROUP-TITLE.
SORT SORT-FILE
ON DESCENDING KEY SINCE_LAST IN SORT-REC
INPUT PROCEDURE IS S001-MSTAT-INPUT
GIVING SORTED-FILE.
PERFORM B010-REPORT-DRAWING-NUMBERS.
STRING SORT-FILE-NAME, ';*' DELIMITED BY SIZE INTO WORK-STR.
CALL 'LIB$DELETE_FILE' USING BY DESCRIPTOR WORK-STR.
STRING SORTED-FILE-NAME, ';*' DELIMITED BY SIZE INTO WORK-STR.
CALL 'LIB$DELETE_FILE' USING BY DESCRIPTOR WORK-STR.
CLOSE RPT-FILE.
CALL 'LIB$SPAWN' USING BY DESCRIPTOR 'EDIT/READ ZILL_DUE.RPT'.
EXIT PROGRAM.
* Paragraph to initialize our data and files.
B000-HSK.
CALL 'COB_FILL_IN_LOGICALS'.
MOVE SPACES TO STATUS-VARIABLES.
ACCEPT TODAYS-DATE FROM DATE YYYYMMDD.
MOVE TODAY_YYYY TO FMT_YYYY.
MOVE TODAY_DD TO FMT_DD.
MOVE TODAY_MM TO FMT_MM.
OPEN OUTPUT RPT-FILE.
MOVE SPACES TO REPORT-HDR1.
MOVE TODAYS-DATE-FORMATTED TO THE-DATE.
MOVE 'Due Number Report' to PAGE-TITLE.
MOVE SPACES TO REPORT-HDR2.
MOVE 'Drawing Numbers' TO GROUP-TITLE.
* Paragraph to process the sorted selection file and
* create the portion of the report relating to drawing
* numbers.
B010-REPORT-DRAWING-NUMBERS.
MOVE SPACES TO EOF-FLAG.
OPEN INPUT SORTED-FILE.
READ SORTED-FILE
AT END SET IT-IS-END-OF-FILE TO TRUE.
PERFORM C010-DRAWING-HEADINGS.
PERFORM UNTIL IT-IS-END-OF-FILE
MOVE SPACES TO REPORT-DETAIL
MOVE ELM_NO IN SORTED-REC TO ELM-NO-DTL
MOVE HIT_COUNT IN SORTED-REC TO HIT-COUNT-DTL
MOVE SINCE_LAST IN SORTED-REC TO SINCE-LAST-DTL
MOVE PCT_HITS IN SORTED-REC TO PCT-HITS-DTL
MOVE AVE_BTWN IN SORTED-REC TO AVE-BTWN-DTL
MOVE REPORT-DETAIL TO RPT-DTL
WRITE RPT-DTL BEFORE ADVANCING 1 LINE
READ SORTED-FILE
AT END SET IT-IS-END-OF-FILE TO TRUE
END-READ
END-PERFORM.
CLOSE SORTED-FILE.
* Paragraph to print headings for the main drawing numbers
* which are due.
C010-DRAWING-HEADINGS.
MOVE SPACES TO RPT-DTL.
MOVE REPORT-HDR1 TO RPT-DTL.
WRITE RPT-DTL BEFORE ADVANCING 2 LINES.
MOVE SPACES TO RPT-DTL.
MOVE REPORT-HDR2 TO RPT-DTL.
WRITE RPT-DTL BEFORE ADVANCING 1 LINE.
MOVE SPACES TO RPT-DTL.
MOVE REPORT-HDR3 TO RPT-DTL.
WRITE RPT-DTL BEFORE ADVANCING 1 LINE.
MOVE SPACES TO RPT-DTL.
MOVE REPORT-HDR4 TO RPT-DTL.
WRITE RPT-DTL BEFORE ADVANCING 1 LINE.
* Paragraph to filter due numbers into sort file.
* Creates a floating point temporary to compare against
* floating point value from input file. When greater
* record is released to the sort file.
S000-DSTAT-INPUT.
OPEN INPUT DRAW-STATS.
READ DRAW-STATS NEXT
AT END SET IT-IS-END-OF-FILE TO TRUE.
PERFORM UNTIL IT-IS-END-OF-FILE
MOVE SINCE_LAST IN DSTATS-REC TO FLT-1
IF FLT-1 >= AVE_BTWN IN DSTATS-REC
MOVE DSTATS-REC TO SORT-REC
RELEASE SORT-REC
END-IF
READ DRAW-STATS
AT END SET IT-IS-END-OF-FILE TO TRUE
END-READ
END-PERFORM.
CLOSE DRAW-STATS.
* Paragraph to filter due numbers into sort file.
* Creates a floating point temporary to compare against
* floating point value from input file. When greater
* record is released to the sort file.
S001-MSTAT-INPUT.
OPEN INPUT MEGA-STATS.
READ MEGA-STATS NEXT
AT END SET IT-IS-END-OF-FILE TO TRUE.
PERFORM UNTIL IT-IS-END-OF-FILE
MOVE SINCE_LAST IN MSTATS-REC TO FLT-1
IF FLT-1 >= AVE_BTWN IN MSTATS-REC
MOVE MSTATS-REC TO SORT-REC
RELEASE SORT-REC
END-IF
READ MEGA-STATS
AT END SET IT-IS-END-OF-FILE TO TRUE
END-READ
END-PERFORM.
CLOSE MEGA-STATS.
END PROGRAM COB_ZILL_DUE_REPORT_SUB.
Sorry for the way the "code" feature works in this editor.
Certain sections have to exist. Your program cannot do I-O without an INPUT-OUTPUT SECTION. This is where you map names to physical storage.
If you have an INPUT-OUTPUT SECTION then you have to have a FILE SECTION. This is where you define the record layout(s) of each named file. LABEL RECORDS are always STANDARD when dealing with disk data files and OMITTED when writing report text files. There are a few other clauses I don't remember. Please note the SD in among those FD statements: FD is a File Definition and SD is a Sort Definition.
If you are going to have any local variables you have to have a WORKING-STORAGE SECTION. You cannot declare variables on the fly; they all have to be declared here. This PSECT gets a DATA segment attribute, among other things. If you call some service with a bad address and it attempts to execute code within this PSECT, the operating system will shoot your application out of the saddle.
All PSECTs created after PROCEDURE DIVISION are flagged EXEC, write protected. If you try to overwrite anything in here during execution the operating system will shoot your program out of the saddle. Any other program attempting to write here will also be shot out of the saddle.
Scan down to the SORT SORT-FILE in A000-MAIN. The COBOL sort routine is amazing. Notice that I provided an INPUT PROCEDURE and it is a paragraph. On IBM mainframes running ROSCOE back in the day this had to be an INPUT SECTION. They needed different attributes on the PSECT so the system sort routine could read/write.
Here is a snippet from another program in that book.
*
* FMS definitions
*
COPY 'COBFDVDEF' OF 'MEGA_TEXT_LIB'.
LINKAGE SECTION.
01 FMS-STUFF.
05 FMSSTATUS PIC S9(9) COMP.
05 RMSSTATUS PIC S9(9) COMP.
05 TCA PIC X(12).
05 WORKSPACE PIC X(12).
PROCEDURE DIVISION USING FMS-STUFF.
The linkage section creates a PSECT of sharable memory. When you call external routines which return values, they need to be here. You must also grant your PROCEDURE DIVISION access to the various things it needs in the linkage section.
As you can see from this snippet later in the code
B010-USER-INPUT.
PERFORM C000-FORWARD-LOAD
CALL 'FDV$PUTAL' USING BY DESCRIPTOR SCREEN-REC.
MOVE SPACES TO WORK-STR.
CALL 'FDV$GETAL' USING BY DESCRIPTOR WORK-STR
BY REFERENCE TERMINATOR.
EVALUATE TERMINATOR
WHEN FDV$K_FK_E6 SET LOAD-FORWARD TO TRUE
WHEN FDV$K_FK_E5 SET LOAD-REVERSE TO TRUE
WHEN FDV$K_FK_F10 SET WE-ARE-DONE TO TRUE
END-EVALUATE.
you can pass any local variable you wish as long as you pass it correctly. It's the writing which needs special PSECT attributes.
It's late and I'm tired, but I seem to remember you could have USING clauses on SECTION declarations in the PROCEDURE DIVISION. The on-line documentation available for COBOL, at least what is indexed by Google, really is quite worthless. If you want more detailed information, search for a circa-1980s COBOL textbook. It won't have any of the new stuff but it will answer many questions.
Here's a kind of bad tutorial on COBOL structure.

We use COBOL SECTION coding in all of our 37K MVS batch COBOL programs. We use this technique to get much faster run times and significantly reduced CPU overhead. This COBOL coding technique is very similar to high performance batch assembler.
Call it High Performance Functionally Structured COBOL programming
Once a SECTION is defined, a PERFORM xxxxx of it returns at the next coded SECTION, not at the next paragraph in the SECTION. If paragraphs are coded ahead of the first SECTION then they can be executed normally. (But we don't allow this.)
Using a SECTION has higher overhead than PERFORMing only paragraphs - UNLESS - you use GO TO logic to bypass code that should only be conditionally executed. Our rule is that a GO TO can only point to a Tag-Line (a paragraph) in the same SECTION. All paragraphs in a SECTION must be a sub-function of the SECTION's function. The EXIT instruction is an assembler NOP instruction. It allows a Tag-Line to be placed just before the next SECTION - a fast exit/return.
Executing a PERFORM xxxx THRU yyyy has more CPU overhead than executing a SECTION without the GO TOs.
WARNING: Executing a PERFORM of a Tag-Line in a SECTION will fall thru all the code in the SECTION until the next SECTION is encountered. A GO TO to a Tag-Line outside of the current SECTION will fall thru all the code in the landing SECTION until the next SECTION is encountered. (But we don't allow this.)
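A hedged sketch of the style described above, with invented section, tag-line and data names: the PERFORM of the SECTION returns only when the next SECTION header is reached, and the single GO TO jumps forward to the EXIT tag-line within the same SECTION.

IDENTIFICATION DIVISION.
PROGRAM-ID. SECTION-STYLE-DEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  CUST-REC-FLAG           PIC X VALUE 'Y'.
    88  CUST-REC-EMPTY      VALUE 'Y'.
PROCEDURE DIVISION.
MAINLINE SECTION.
    PERFORM S100-EDIT-CUSTOMER.
    GOBACK.
* PERFORM S100-EDIT-CUSTOMER runs everything from the header below
* down to the next SECTION header, i.e. both tag-lines and the EXIT.
S100-EDIT-CUSTOMER SECTION.
    IF CUST-REC-EMPTY
* The only permitted GO TO: forward, to the EXIT tag-line of this SECTION.
        GO TO S100-EXIT
    END-IF
    DISPLAY 'EDITING CUSTOMER RECORD'.
S100-VALIDATE-NAME.
    DISPLAY 'VALIDATING NAME'.
S100-EXIT.
    EXIT.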

Related

STM32CubeMX I2C code writing to reserved register bits

I'm developing an I2C driver on the STM32F74 family processors. I'm using the STM32CubeMX Low Level drivers and I can't make sense of the generated defines for I2C start and stop register values (CR2).
The code is generated in stm32f7xx_ll_i2c.h and is as follows.
/** @defgroup I2C_LL_EC_GENERATE Start And Stop Generation
  * @{
  */
#define LL_I2C_GENERATE_NOSTARTSTOP 0x00000000U
/*!< Don't Generate Stop and Start condition. */
#define LL_I2C_GENERATE_STOP (uint32_t)(0x80000000U | I2C_CR2_STOP)
/*!< Generate Stop condition (Size should be set to 0). */
#define LL_I2C_GENERATE_START_READ (uint32_t)(0x80000000U | I2C_CR2_START | I2C_CR2_RD_WRN)
/*!< Generate Start for read request. */
My question is why is bit 31 included in these defines? (0x80000000U). The reference manual (RM0385) states "Bits 31:27 Reserved, must be kept at reset value." I can't decide between modifying the generated code or keeping bit 31. I'll happily take recommendations, simply whether it's more likely that this is something needed or that I'm going to break things by writing to a reserved bit.
Thanks in advance!
I am guessing here because who knows what was on the minds of the library authors? (Not a lot if you look at the source code!). But I would guess that it is a "dirty-trick" to check that when calling LL functions you are using the specified macros.
However it is severely flawed because the "trick" is only valid for Cortex-M3/4 STM32 variants (e.g. F1xx, F2xx, F4xx) where the I2C peripheral is very different and registers such as I2C_CR2 are only 15 bits wide.
The trick is that the library functions have parameter checking asserts such as:
assert_param(IS_TRANSFER_REQUEST(Request));
Where the IS_TRANSFER_REQUEST is defined thus:
#define IS_TRANSFER_REQUEST(REQUEST) (((REQUEST) == I2C_GENERATE_STOP) || \
((REQUEST) == I2C_GENERATE_START_READ) || \
((REQUEST) == I2C_GENERATE_START_WRITE) || \
((REQUEST) == I2C_NO_STARTSTOP))
This forces you to use the LL defined macros as parameters and not some self-defined or calculated mask because they all have that "unused" check bit in them.
If that truly is the reason, it is an ill-advised practice that did not envisage the newer I2C peripheral. You might think that the bit was stripped from the parameter before being written to the register. I have checked; it is not. And if it were, you would be paying for that overhead on every call, which is also undesirable.
As an error detection technique, if that is what it is, it is not even applied consistently; for example, all the GPIO_PIN_xx macros are 16 bits wide, and since they are masks, not pin numbers, using bit 31 could guard against passing a literal pin number 10 where the mask 1<<10 is in fact required. Passing 10 would refer to pins 3 and 1, not pin 10. And to be honest, that mistake is far more likely than passing an incorrect I2C transfer request type.
In the end however "Reserved" generally means "unused but may be used in future implementations", and requiring you to use the "reset value" is a way of ensuring forward binary compatibility. If you had such a device no doubt there would be a corresponding library update to support it - but it would require re-compilation of the code. The risk is low and probably only a problem if you attempt to run this binary on a newer incompatible part that used this bits.
I agree with Clifford; the ST CubeMX / HAL / LL library code is, in places, some of the worst written code imaginable. I have a particular issue with lines such as "TIMx->CCER &= ~TIM_CCER_CC1E" where TIM_CCER_CC1E is defined as 0x0001 and the CCER register contains reserved bits that should remain at 0. There are hundreds of such examples throughout the library code. ST remain silent to my request for advice.

What is the write SPI command for MX25R device

I want to write data to this device and read from it, using the manual linked below.
For writing, at first I thought I should issue these two commands:
1st command {0x06};//write enable command
2nd command {0x01,0x2F,0xEF,0xD8}; //write status register based on the table below
But then I saw the PP command in Fig. 30 shown below, which starts with 0x02.
So I assume that in order to store data on this device I need to add 0x02 to my sequence as follows (send MSB first):
1st command {0x06};//write enable command
2nd command {0x02,0x01,0x2F,0xEF,0xD8} // PP sequence and Write STATUS register the data 0x2F,0xEF,0xD8.
Have I assembled the sequence correctly for this command?
Thanks.
https://www.macronix.com/Lists/Datasheet/Attachments/7461/MX25R8035F,%20Wide%20Range,%208Mb,%20v1.6.pdf
Page programming (the PP command, 0x02) is not the same as Write Status Register (the WRSR command, 0x01), so no, clearly you don't prepend the sequence with 0x02: it would then be a PP command and would write data to the device's flash memory rather than to the status register.
WRSR timing diagram is Fig. 15 of the data sheet you linked. PP has no relevance here if WRSR is what you want to do. Conversely if you want to program the flash memory, that is not what WRSR does.
The device has registers for controlling its operation and checking its status, and it has flash memory for storing data - and different commands for accessing these.
Your sequence 0x02,0x01,0x2F,0xEF,0xD8 will write a single byte 0xD8 to address 0x012FEF. The data sheet says that the LSB of the address should be zero, but does explain what happens when it is not, so it is well defined, if ill-advised, and unlikely to be what you intended. Then again, it seems likely that writing 0x2FEFD8 to the Status Register was also not what you intended.
The datasheet does have some language issues to hinder you perhaps. For example the PP section uses the word "effort" where I think it intended "effect".

How to access net displacements in pyiron

Using pyiron, I want to calculate the mean square displacement of the ions in my system. How do I see the total displacement (i.e. not folded back by periodic boundary conditions) without dumping very frequently and checking when an atom passes over the boundary and gets wrapped?
Try to compare job['output/generic/unwrapped_positions'][-1] and job.structure.positions+job.output.total_displacements[-1]. If they deliver the same values, it's definitely fine both ways. If not, you can post the relevant lines in your notebook here.
I'd like to add a few comments to Jan's answer:
While job['output/generic/unwrapped_positions'] returns the unwrapped positions parsed from the output files, job.output.total_displacements returns the displacement of atoms calculated from each pair of consecutive snapshots. So if an atom moves more than half the box length in any direction, job.output.total_displacements will give wrong coordinates. Therefore, job['output/generic/unwrapped_positions'] is generally more trustworthy, but it is not available in all the codes (since some codes simply do not provide an output for unwrapped positions).
Moreover, if an interactive job is used, it is possible that job.structure.positions does not return the initial positions, i.e. job.structure.positions+job.output.total_displacements won't be initial positions + displacements.
So, in short, my answer to your question would be rather "Use job['output/generic/unwrapped_positions'] and if it's not available, use job.structure.positions+job.output.total_displacements but be aware of potential problems you might be running into."

Assembly code for optimized bitshifting of a vector

I'm trying to write a routine that will logically bitshift by n positions to the right all elements of a vector, in the most efficient way possible, for the following vector types: BYTE->BYTE, WORD->WORD, DWORD->DWORD and WORD->BYTE (assuming that only 8 bits are present in the result). I would like to have three routines for each type depending on the kind of processor (SSE2 supported, only MMX supported, only the standard instruction set supported). Therefore I need 12 functions in total.
I have already found by myself how to back up and restore the registers that I need, how to make a loop, how to copy data into regular registers or MMX registers and how to shift by 1 position logically.
Because I'm not familiar with assembly language, that's about it.
Which registers should I use for each instruction set?
How will the availability of the large vector (an image) in L1 cache be optimized?
How do I find the next element of the vector (a pointer kind of thing)? I know I can do a mov by address and I assume I have to increment the address by 1, 2 or 4 depending on my type of data.
Although I have all the ideas, writing the code is a bit difficult at this point.
Thank you.
Arnaud.
Edit:
Here is what I'm trying to do for MMX for a shift by 1 on a DWORD:
__asm("push mm"); // backup register
__asm("push cx"); // backup register
__asm("mov %cx, length"); // initialize loop
__asm("loopstart_shift1:"); // start label
__asm("movd %xmm0, r/m32"); // get 32 bits data
__asm("psrlq %xmm0, 1"); // right shift 32 bits data logically (stuffs 0 on the left) by 1
__asm("mov r/m32,%xmm0"); // set 32 bits data
__asm("dec %cx"); // decrement index
__asm("cmp %cx,0");
__asm("jnz loopstart_shift1");
__asm("pop cx"); // restore register
__asm("pop mm"); // restore register
__asm("emms"); // leave MMX state
I strongly suggest you pause and take a look at using intrinsics with C or C++ instead of trying to write raw asm - that way the C/C++ compiler will take care of all the register allocation, instruction scheduling and general housekeeping tasks and you can just focus on the important parts, e.g. instead of using psrlq see _m_psrlq in mmintrin.h. (Better yet, look at using 128 bit SSE intrinsics.)
Sounds like you'd benefit from either using or looking into BitMagic's source. It's entirely intrinsics-based too, which makes it far more portable (though from the looks of it you're using GCC, so you might need an MSVC-to-GCC intrinsics mapping).

What are the most hardcore optimisations you've seen?

I'm not talking about algorithmic stuff (eg use quicksort instead of bubblesort), and I'm not talking about simple things like loop unrolling.
I'm talking about the hardcore stuff. Like Tiny Teensy ELF, The Story of Mel; practically everything in the demoscene, and so on.
I once wrote a brute force RC5 key search that processed two keys at a time: the first key used the integer pipeline, the second key used the SSE pipelines, and the two were interleaved at the instruction level. This was then coupled with a supervisor program that ran an instance of the code on each core in the system. In total, the code ran about 25 times faster than a naive C version.
In one (here unnamed) video game engine I worked with, they had rewritten the model-export tool (the thing that turns a Maya mesh into something the game loads) so that instead of just emitting data, it would actually emit the exact stream of microinstructions that would be necessary to render that particular model. It used a genetic algorithm to find the one that would run in the minimum number of cycles. That is to say, the data format for a given model was actually a perfectly-optimized subroutine for rendering just that model. So, drawing a mesh to the screen meant loading it into memory and branching into it.
(This wasn't for a PC, but for a console that had a vector unit separate and parallel to the CPU.)
In the early days of DOS, when we used floppy discs for all data transport, there were viruses as well. One common way for viruses to infect different computers was to copy a virus bootloader into the bootsector of an inserted floppydisc. When the user inserted the floppydisc into another computer and rebooted without remembering to remove the floppy, the virus ran and infected the harddrive bootsector, thus permanently infecting the host PC. A particularly annoying virus I was infected by was called "Form"; to battle this I wrote a custom floppy bootsector that had the following features:
Validate the bootsector of the host harddrive and make sure it was not infected.
Validate the floppy bootsector and make sure that it was not infected.
Code to remove the virus from the harddrive if it was infected.
Code to duplicate the antivirus bootsector to another floppy if a special key was pressed.
Code to boot the harddrive if all was well and no infections were found.
This was done in the program space of a bootsector, about 440 bytes :)
The biggest problem for my mates was the very cryptic messages displayed because I needed all the space for code. It was like "FFVD RM?", which meant "FindForm Virus Detected, Remove?"
I was quite happy with that piece of code. The optimization was program size, not speed. Two quite different optimizations in assembly.
My favorite is the floating point inverse square root via integer operations. This is a cool little hack on how floating point values are stored and can execute faster (even doing a 1/result is faster than the stock-standard square root function) or produce more accurate results than the standard methods.
In C/C++ the code is (sourced from Wikipedia):
float InvSqrt (float x)
{
    float xhalf = 0.5f*x;
    int i = *(int*)&x;
    i = 0x5f3759df - (i>>1); // Now this is what you call a real magic number
    x = *(float*)&i;
    x = x*(1.5f - xhalf*x*x);
    return x;
}
A Very Biological Optimisation
Quick background: Triplets of DNA nucleotides (A, C, G and T) encode amino acids, which are joined into proteins, which are what make up most of most living things.
Ordinarily, each different protein requires a separate sequence of DNA triplets (its "gene") to encode its amino acids -- so e.g. 3 proteins of lengths 30, 40, and 50 would require 90 + 120 + 150 = 360 nucleotides in total. However, in viruses, space is at a premium -- so some viruses overlap the DNA sequences for different genes, using the fact that there are 6 possible "reading frames" for DNA-to-protein translation (namely starting from a position divisible by 3, from a position that leaves remainder 1 when divided by 3, or from a position that leaves remainder 2; and the same three again, but reading the sequence in reverse).
For comparison: Try writing an x86 assembly language program where the 300-byte function doFoo() begins at offset 0x1000... and another 200-byte function doBar() starts at offset 0x1001! (I propose a name for this competition: Are you smarter than Hepatitis B?)
That's hardcore space optimisation!
UPDATE: Links to further info:
Reading Frames on Wikipedia suggests Hepatitis B and "Barley Yellow Dwarf" virus (a plant virus) both overlap reading frames.
Hepatitis B genome info on Wikipedia. Seems that different reading-frame subunits produce different variations of a surface protein.
Or you could google for "overlapping reading frames"
Seems this can even happen in mammals! Extensively overlapping reading frames in a second mammalian gene is a 2001 scientific paper by Marilyn Kozak that talks about a "second" gene in rat with "extensive overlapping reading frames". (This is quite surprising as mammals have a genome structure that provides ample room for separate genes for separate proteins.) Haven't read beyond the abstract myself.
I wrote a tile-based game engine for the Apple IIgs in 65816 assembly language a few years ago. This was a fairly slow machine and programming "on the metal" is a virtual requirement for coaxing out acceptable performance.
In order to quickly update the graphics screen one has to map the stack to the screen in order to use some special instructions that allow one to update 4 screen pixels in only 5 machine cycles. This is nothing particularly fantastic and is described in detail in IIgs Tech Note #70. The hard-core bit was how I had to organize the code to make it flexible enough to be a general-purpose library while still maintaining maximum speed.
I decomposed the graphics screen into scan lines and created a 246 byte code buffer to insert the specialized 65816 opcodes. The 246 bytes are needed because each scan line of the graphics screen is 80 words wide and 1 additional word is required on each end for smooth scrolling. The Push Effective Address (PEA) instruction takes up 3 bytes, so 3 * (80 + 1 + 1) = 246 bytes.
The graphics screen is rendered by jumping to an address within the 246 byte code buffer that corresponds to the right edge of the screen and patching in a BRanch Always (BRA) instruction into the code at the word immediately following the left-most word. The BRA instruction takes a signed 8-bit offset as its argument, so it just barely has the range to jump out of the code buffer.
Even this isn't too terribly difficult, but the real hard-core optimization comes in here. My graphics engine actually supported two independent background layers and animated tiles by using different 3-byte code sequences depending on the mode:
Background 1 uses a Push Effective Address (PEA) instruction
Background 2 uses a Load Indirect Indexed (LDA ($00),y) instruction followed by a push (PHA)
Animated tiles use a Load Direct Page Indexed (LDA $00,x) instruction followed by a push (PHA)
The critical restriction is that both of the 65816 registers (X and Y) are used to reference data and cannot be modified. Further the direct page register (D) is set based on the origin of the second background and cannot be changed; the data bank register is set to the data bank that holds pixel data for the second background and cannot be changed; the stack pointer (S) is mapped to graphics screen, so there is no possibility of jumping to a subroutine and returning.
Given these restrictions, I had the need to quickly handle cases where a word that is about to be pushed onto the stack is mixed, i.e. half comes from Background 1 and half from Background 2. My solution was to trade memory for speed. Because all of the normal registers were in use, I only had the Program Counter (PC) register to work with. My solution was the following:
Define a code fragment to do the blend in the same 64K program bank as the code buffer
Create a copy of this code for each of the 82 words
There is a 1-1 correspondence, so the return from the code fragment can be a hard-coded address
Done! We have a hard-coded subroutine that does not affect the CPU registers.
Here are the actual code fragments:
code_buff: PEA $0000 ; rightmost word (16-bits = 4 pixels)
PEA $0000 ; background 1
PEA $0000 ; background 1
PEA $0000 ; background 1
LDA (72),y ; background 2
PHA
LDA (70),y ; background 2
PHA
JMP word_68 ; mix the data
word_68_rtn: PEA $0000 ; more background 1
...
PEA $0000
BRA *+40 ; patched exit code
...
word_68: LDA (68),y ; load data for background 2
AND #$00FF ; mask
ORA #$AB00 ; blend with data from background 1
PHA
JMP word_68_rtn ; jump back
word_66: LDA (66),y
...
The end result was a near-optimal blitter that has minimal overhead and cranks out more than 15 frames per second at 320x200 on a 2.5 MHz CPU with a 1 MB/s memory bus.
Michael Abrash's "Zen of Assembly Language" had some nifty stuff, though I admit I don't recall specifics off the top of my head.
Actually it seems like everything Abrash wrote had some nifty optimization stuff in it.
The Stalin Scheme compiler is pretty crazy in that aspect.
I once saw a switch statement with a lot of empty cases; a comment at the head of the switch said something along the lines of:
Added case statements that are never hit because the compiler only turns the switch into a jump-table if there are more than N cases
I forget what N was. This was in the source code for Windows that was leaked in 2004.
I've gone to the Intel (or AMD) architecture references to see what instructions there are. movsx - move with sign extension is awesome for moving little signed values into big spaces, for example, in one instruction.
Likewise, if you know you only use 16-bit values but you can access all of EAX, EBX, ECX, EDX, etc., then you have 8 very fast locations for values - just rotate the registers by 16 bits to access the other values.
The EFF DES cracker, which used custom-built hardware to generate candidate keys (the hardware they made could prove a key wasn't the solution, but could not prove a key was the solution), which were then tested with more conventional code.
The FSG 2.0 packer made by a Polish team, specifically made for packing executables written in assembly. If packing assembly isn't impressive enough (it's supposed to be about as small as possible already), the loader it comes with is 158 bytes and fully functional. If you try packing any assembly-made .exe with something like UPX, it will throw a NotCompressableException at you ;)