What ELF section headers typically have no associated section?

For the Executable and Linkable Format, sections describe information in an object file. The ELF specification mentions that there may be section headers that do not have an associated section:
Every section in an object file has exactly one section header describing it. Section headers may
exist that do not have a section.
Furthermore, section headers of type SHT_NULL are considered 'inactive', and thus do not have an associated section:
This value marks the section header as inactive; it does not have an associated section.
Other members of the section header have undefined values.
I'm curious as to what sort of section headers typically do not have associated sections. The way it is phrased in the specification, there are reserved indexes in the section header table for which no sections exist:
Some section header table indexes are reserved; an object file will not have sections for these special
indexes.
However, the phrasing leads me to believe that there may be other section headers that usually have no associated sections. Am I mistaken? If not, what are they?

By the statement "section headers may exist that do not have a section", the specification means that a header may not be accompanied by any section data in the file. SHT_NOBITS, for example, is a section header type whose section occupies zero bytes in the file (such as .bss).
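If it helps to see this concretely, here is a minimal sketch (my own example, assuming a 64-bit ELF file and a Linux-style <elf.h>; validation and error handling are mostly omitted) that walks the section header table and points out the headers that carry no data in the file: the inactive SHT_NULL entry at index 0 and any SHT_NOBITS sections such as .bss:

#include <elf.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    if (argc != 2) { std::fprintf(stderr, "usage: %s <elf-file>\n", argv[0]); return 1; }
    std::FILE* f = std::fopen(argv[1], "rb");
    if (!f) { std::perror("fopen"); return 1; }

    // Read the ELF header, then the whole section header table it points to.
    Elf64_Ehdr ehdr;
    if (std::fread(&ehdr, sizeof ehdr, 1, f) != 1) return 1;
    std::vector<Elf64_Shdr> shdrs(ehdr.e_shnum);
    std::fseek(f, static_cast<long>(ehdr.e_shoff), SEEK_SET);
    if (std::fread(shdrs.data(), sizeof(Elf64_Shdr), shdrs.size(), f) != shdrs.size()) return 1;

    for (std::size_t i = 0; i < shdrs.size(); ++i) {
        if (shdrs[i].sh_type == SHT_NULL)
            std::printf("[%2zu] SHT_NULL   - inactive header, no associated section\n", i);
        else if (shdrs[i].sh_type == SHT_NOBITS)
            std::printf("[%2zu] SHT_NOBITS - occupies no bytes in the file (e.g. .bss)\n", i);
    }
    std::fclose(f);
    return 0;
}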

Related

Can we set multiple DCM_SpecificCharacterSet while importing records using DICOM?

Currently, I am using the below code to set parameters to retrieve data from PACS.
DcmDataset findParams = DcmDataset();
findParams.putAndInsertString(DCM_QueryRetrieveLevel, "SERIES");
findParams.putAndInsertString(DCM_SpecificCharacterSet, "ISO_IR 192");
However, I wanted to check whether we can support multiple character sets when importing data at the same time. The code would look something like the line below; I am trying to find out whether this is possible or not, as I don't have the facilities to verify it myself.
findParams.putAndInsertString(DCM_SpecificCharacterSet, "ISO_IR 192" ,"ISO_IR 100");
I think that what you want to express is that "this Query SCU can accept responses in the following character sets". This is plainly not possible. See a discussion in the DICOM newsgroup for reference. It ends with a proposal to add character set negotiation to the association negotiation. But such a supplement has not been submitted yet, and I am not aware of anyone working on it currently.
The semantics of the attribute Specific Character Set (0008,0005) in the context of the Query Retrieve Service Class:
PS3.4, C.4.1.1.3.1 Request Identifier Structure
Conditionally, the Attribute Specific Character Set (0008,0005). This Attribute shall be included if expanded or replacement character sets may be used in any of the Attributes in the Request Identifier. It shall not be included otherwise
I.e. it describes nothing but the character encoding of your request dataset.
and
C.4.1.1.3.2 Response Identifier Structure
Conditionally, the Attribute Specific Character Set (0008,0005). This Attribute shall be included if expanded or replacement character sets may be used in any of the Attributes in the Response Identifier. It shall not be included otherwise. The C-FIND SCP is not required to return responses in the Specific Character Set requested by the SCU if that character set is not supported by the SCP. The SCP may return responses with a different Specific Character Set.
I.e. you cannot control the character set in which the SCP will send you the responses. Surprising but a matter of fact.
Sending multiple values for the attribute is possible but has different semantics. It means that the request contains characters from different character sets which are switched using Code Extension Techniques as defined in ISO 2022. An illustrative example of how this looks and what it means is found in PS3.5, H.3.2.
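Purely as an illustration of that multi-valued form (I have no PACS at hand to verify this against, so treat it as a sketch): with DCMTK the values are passed in one backslash-separated string, and with multiple values the ISO 2022 code-extension names must be used. Note again that this describes the character repertoires used inside the request identifier itself; it does not ask the SCP to respond in any particular character set.

// Sketch only: multi-valued Specific Character Set (ISO 2022 code extension),
// not a list of acceptable response character sets.
#include "dcmtk/dcmdata/dctk.h"

void buildFindParams(DcmDataset& findParams) {
    findParams.putAndInsertString(DCM_QueryRetrieveLevel, "SERIES");
    // The backslash is the DICOM value delimiter (escaped for C++):
    findParams.putAndInsertString(DCM_SpecificCharacterSet,
                                  "ISO 2022 IR 100\\ISO 2022 IR 144");
}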
What implementors usually do to avoid character set compatibility issues is configure "the one and only" character set for a particular installation (i.e. hospital) in a locale setting defined at system setup. This works pretty well; e.g. an installation in Russia will very likely support Cyrillic (ISO_IR 144) or Unicode (ISO_IR 192) or both. In the case of "both", you can select the character set you prefer when configuring your system.

PDFBox - is the reading order guaranteed with PDFTextStripper's processTextPosition?

I am using PDFTextStripper (PDFBox 1.8.2) to process every TextPosition in a PDF file. I have tested with a lot of files and noticed that it processes text in reading order. However, this does not hold if the PDF has footers (a docx that I exported as PDF). PDFTextStripper processes the footer first and then the body of the file.
Is this expected behavior? Is there a way I can specify the order? Or is there a way I can identify that it's a footer, so I can make the adjustment in my code?
PDFTextStripper has an attribute SortByPosition (getSortByPosition & setSortByPosition). It is false by default.
If this attribute is false, PDFTextStripper essentially extracts the text in the order in which it appears in the PDF page content stream.
This order can be totally mangled (because in the content stream you use operators which can position the next printed text anywhere on the page) but often text sections belonging together are kept together (because the operations required for such sections often are inserted in that stream as a block).
Headers and footers, though, often are added at the same time and, therefore, appear together before or after the main body text.
If this attribute is true, though, PDFTextStripper essentially extracts the text from top to bottom, left to right (unless the reading order is defined to be right to left). (OK, OK, it also respects article beads, but you can hardly count on them being used in general.)
This order is good in the case of one-column text, where headers come first and footers last; but unless proper article beads are used, multi-column pages get mangled.
BTW, you can switch off the use of article beads using the attribute ShouldSeparateByBeads (getSeparateByBeads & setShouldSeparateByBeads).

How do [0] and [3] work in ASN.1?

I'm decoding ASN.1 (as used in X.509 for HTTPS certificates). I'm doing pretty well, but there is one thing that I just cannot find any understandable documentation for.
In this JS ASN1 parser you see a [0] and a [3] under a SEQUENCE element, the first looking like this in data: A0 03 02 01 02 .... I want to know what this means and how to decode it.
Another example is Anatomy of an X.509 v3 Certificate, there is a [0] right after the first two SEQUENCE elements.
What I don't understand is how A0 fits with the scheme where the first 2 bits of the tag byte are a class, the next a primitive/constructed bit and the remaining 5 are supposed to be the tag type. A0 is 10100000 which means that the tag type value would be zero.
It sounds like you need an introduction to ASN.1 tagging. There are two angles to approach this from. X.690 defines BER/CER/DER encoding rules. As such, it answers the question of how tags are encoded. X.680 defines ASN.1 itself. As such, it defines the syntax and rules for tagging. Both specifications can be found on the ITU-T website. I'll give you a quick overview.
Tags are used in BER/DER/CER to identify types. They are especially useful for distinguishing the components of a SEQUENCE and the alternatives of a CHOICE.
A tag combines a tag class and a tag number. The tag classes are UNIVERSAL, APPLICATION, PRIVATE, and CONTEXT-SPECIFIC. The UNIVERSAL class is basically used for the built-in types. APPLICATION is typically used for user-defined types. CONTEXT-SPECIFIC is typically used for the components inside constructed types (SEQUENCE, CHOICE, SEQUENCE OF). Syntactically, when tags are specified in an ASN.1 module, they are written inside brackets: [ tag_class tag_number ]; for CONTEXT-SPECIFIC, the tag_class is omitted. Thus, [APPLICATION 10] or [0].
While every ASN.1 type has an associated tag, syntactically, there is also the "TaggedType", which is used by an ASN.1 author to specify the tag to encode a type with. Basically, a TaggedType puts a tag prefix ahead of a type. For example:
MyType ::= SEQUENCE {
    field_with_tagged_type [0] UTF8String
}
The tag in a TaggedType is either explicit or implicit. If explicit, this means that I want the original tag to be explicitly encoded. If implicit, this means I am happy to have only the tag that I specified be encoded. In the explicit case, the BER encoding results in a nested TLV (tag-length-value): the outer tag ([0] in the example above), the length, and then another TLV as the value. In the example, this inner TLV would have a tag of [UNIVERSAL 12] for the UTF8String.
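For instance (my own worked example, reusing the [0] from MyType above and the two-byte UTF8String value "Hi", i.e. 48 69), the DER encodings would be:

explicit tag:  A0 04 0C 02 48 69   (outer TLV [0] wrapping the inner TLV 0C 02 48 69, where 0C is UNIVERSAL 12 = UTF8String)
implicit tag:  80 02 48 69         (the UTF8String's own tag is replaced by the context-specific tag [0])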
Whether the tag is explicit or implicit depends upon how you write the tag and the tagging environment. For example:
MyType2 ::= SEQUENCE {
    field_with_explicit_tag [0] EXPLICIT UTF8String OPTIONAL,
    field_with_implicit_tag [1] IMPLICIT UTF8String OPTIONAL,
    field_with_tag [2] UTF8String OPTIONAL
}
If you specify neither IMPLICIT nor EXPLICIT, there are some rules that define whether the tag is explicit or implicit (see X.680 31). These rules take into consideration the tagging environment defined for the ASN.1 module. The ASN.1 module may specify the tagging environment as IMPLICIT TAGS, EXPLICIT TAGS, or AUTOMATIC TAGS. Roughly speaking, if you don't specify IMPLICIT or EXPLICIT for a tag, the tag will be explicit if the tagging environment is EXPLICIT and implicit if the tagging environment is IMPLICIT or AUTOMATIC. An automatic tagging environment is basically the same as an IMPLICIT tagging environment, except that unique tags are automatically assigned for members of SEQUENCE and CHOICE types.
Note that in the above example, the three components of MyType2 are all optional. In BER/CER/DER, a decoder will know what component is present based on the encoded tag (which obviously better be unique).
ASN.1 BER and DER use ASN.1 tags to unambiguously identify certain components in an encoded stream. There are 4 classes of ASN.1 tags: UNIVERSAL, APPLICATION, PRIVATE, and context-specific. The [0] is a context-specific tag since there is no tag class keyword in front of it. UNIVERSAL is reserved for built-in types in ASN.1. Most often you see context-specific tags used to eliminate potential ambiguity in a SEQUENCE which contains OPTIONAL elements.
If you know you are receiving two items that are not optional, one after the other, you know which is which even if their tags are the same. However, if the first one is optional, the two must have different tags, or you would not be able to tell which one you had received if only one was present in the encoding.
Most often today, ASN.1 specifications use "AUTOMATIC TAGS" so that you don't have to worry about this kind of disambiguation in messages, since components of SEQUENCE, SET and CHOICE will automatically get context-specific tags starting with [0], [1], [2], etc. for each component.
You can find more information on ASN.1 tags at http://www.oss.com/asn1/resources/books-whitepapers-pubs/asn1-books.html where two free downloadable books are available.
Another excellent resource is http://asn1-playground.oss.com where you can try variations of ASN.1 specifications with different tags in an online compiler and encoder/decoder. There you can see the effects of tag changes on encodings.
I finally worked through this and thought that I would provide some insight for anyone still trying to understand this. In my example, as in the one above, I was using an X.509 certificate in DER format. I came across the "A0 03 02 01 02" sequence and could not figure out how that translated to a version number of 2. So if you are having the same problem, here is how that works.
The A0 tells you it is a "Context-Specific", "Constructed" tag with a tag number of 0x00. Immediately, "context-specific" tells you not to interpret it using the normal universal types of DER/BER. Instead, given this is an X.509 certificate, the meaning of the tag number is given in RFC 5280, p. 116. There you will see four fields with markers [0], [1], [2], and [3], standing for "version", "issuerUniqueID", "subjectUniqueID", and "extensions", respectively. So in this case, a value of A0 tells you that this is one of the X.509 context-specific fields, specifically the "version" field. That takes care of the "A0" value.
The "03" value is just your length, as you might expect.
Since this was identified as "Constructed", the data should represent a normal DER/BER object. The "02 01 02" is the actual version number you are looking for, expressed as an Integer. "02" is the standard BER encoding of Integer, "01" is your length, and "02" is your value, or in this case, your version number.
So given that X.509 defines 4 context-specific types, you should expect to see "A0", "A1", "A2", and "A3" anywhere in the certificate. Hopefully the information provided above will now make more sense and help you better understand what those markers represent.
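If you want to experiment with this yourself, here is a small standalone sketch (my own illustration, no ASN.1 library; it only handles the low-tag-number form, where the tag number fits in the lower 5 bits of the identifier octet) that splits an identifier octet into the three parts discussed above. Fed 0xA0, it reports context-specific, constructed, tag number 0:

#include <cstdint>
#include <cstdio>

// Decode a single BER/DER identifier octet (low-tag-number form only).
void describeTag(std::uint8_t id) {
    static const char* classNames[] = {
        "UNIVERSAL", "APPLICATION", "CONTEXT-SPECIFIC", "PRIVATE"
    };
    unsigned tagClass    = id >> 6;          // bits 8-7: tag class
    bool     constructed = (id >> 5) & 0x01; // bit 6: constructed flag
    unsigned tagNumber   = id & 0x1F;        // bits 5-1: tag number

    std::printf("0x%02X -> %s, %s, tag number %u\n",
                id, classNames[tagClass],
                constructed ? "constructed" : "primitive", tagNumber);
}

int main() {
    describeTag(0xA0); // the [0] wrapper around the X.509 version
    describeTag(0x02); // UNIVERSAL 2 = INTEGER (the inner "02 01 02")
    describeTag(0xA3); // the [3] wrapper around the X.509 extensions
    return 0;
}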
[0] is a context-specific tagged type, meaning that to figure out what meaning it gives to the fields (if the "Constructed" flag is set) or the data value (if the "Constructed" flag is not set) it wraps, you have to know the context in which it appears.
In addition, you also need to know what kind of object the sender and receiver are exchanging in the DER stream, i.e. the "ASN.1 module".
Let's say they're exchanging a Certificate Signing Request, and [0] appears as the 4th field inside a SEQUENCE inside the root SEQUENCE:
SEQUENCE {
  SEQUENCE {
    INTEGER 0
    SEQUENCE { ... }
    SEQUENCE { ... }
    [0] { ... }
  }
}
Then, according to RFC 2986, which defines the DER contents of a Certificate Signing Request (its Appendix A defines the ASN.1 module), the meaning of that particular field is sneakily defined as "Attributes", which should have the Constructed flag set:
attributes [0] Attributes{{ CRIAttributes }}
You can also go the other way and see that "attributes" must be the 4th field inside the first sequence inside the root sequence, tagged as [0], by looking at the root sequence definition (Section 4, the top-level type CertificationRequest), finding where CertificationRequestInfo sits inside it, finding where the "attributes" item is located inside CertificationRequestInfo, and finally seeing how it is tagged.

Can you store a theorem number in a variable?

I use \newtheorem and the numbering is done automatically. Sometimes in the text I'll refer to a theorem by this number. I'd like to have a variable equal to this number, so if the theorem number changes, the references will change also.
Yes, it works through the usual \label/\ref-mechanism:
\begin{theorem}\label{thm:foo} ...
That was Theorem~\ref{thm:foo}
(You'll need two runs of LaTeX for the number to settle; after the first run you'll get a message about changed references.) Label commands "tack onto" certain things like section headers, captions, items of enumerations and, indeed, theorems and friends.
There are also extensions that can automatically distinguish sections from subsections or figures, for that, see hyperref's \autoref or the cleveref package, but don't worry about it at this point.
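For completeness, a minimal compilable example putting the pieces together (using amsthm here, although a plain \newtheorem in the preamble works the same way):

\documentclass{article}
\usepackage{amsthm}
\newtheorem{theorem}{Theorem}

\begin{document}

\begin{theorem}\label{thm:foo}
Every natural number is interesting.
\end{theorem}

As we saw in Theorem~\ref{thm:foo}, \ldots
% If the theorem gets renumbered, \ref{thm:foo} updates automatically
% (after two LaTeX runs).

\end{document}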
You need to put a \label between \begin{yourtheorem} and \end{yourtheorem} and use \ref to refer to it as usual.
You can check this link for explanations with some broader context about theorems

How to escape a line break literal in the HTTP header?

In the HTTP header, line breaks are tokens to separate fields in the header.
But if I want to send a line break literal in a custom field, how should I escape it?
If you are designing your own custom extension field, you may use Base64 or quoted-printable to escape (and unescape) the value.
The actual answer to this question is that there is no standard for encoding line breaks.
You can use any Binary-to-text encoding such as URL-Encoding or Base64, but obviously that's only going to work if both sender and receiver implement the same method.
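As an illustration of the URL-encoding route (a sketch under my own conventions: the header name X-My-Note is made up, and the receiver must apply the matching decode step), it is enough to percent-encode CR, LF and the % character itself before putting the value on one header line:

#include <cstdio>
#include <string>

// Percent-encode CR, LF and '%' so a multi-line value fits on one header line.
std::string encodeHeaderValue(const std::string& raw) {
    std::string out;
    for (unsigned char c : raw) {
        if (c == '\r' || c == '\n' || c == '%') {
            char buf[4];
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}

int main() {
    std::string note = "first line\nsecond line";
    // Prints: X-My-Note: first line%0Asecond line
    std::printf("X-My-Note: %s\r\n", encodeHeaderValue(note).c_str());
    return 0;
}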
RFC 2616 did allow to 'fold' (i.e. wrap) header values over multiple lines, but the line breaks were treated as a single space character and not part of the parsed field value.
However, that specification has been obsoleted by RFC 7230 which forbids folding:
Historically, HTTP header field values could be extended over multiple lines by preceding each extra line with at least one space or horizontal tab (obs-fold).
This specification deprecates such line folding except within the message/http media type (Section 8.3.1).
A sender MUST NOT generate a message that includes line folding
A standard for line breaks in HTTP Header field values is not – and never was – established.
According to RFC2616 4.2 Message Headers:
Header fields can be extended over
multiple lines by preceding each extra
line with at least one SP or HT.
where SP means a space character (0x20) and HT means a horizontal tab character (0x09).
The idea is that HTTP is ASCII-only and newlines and such are not allowed. If both sender and receiver can interpret YOUR encoding, then you can encode whatever you want, however you want. That's how internationalized domain names are handled with the Host header (it's called Punycode).
Short answer is: You don't, unless you control both sender and receiver.
If it's a custom field, how you escape it depends entirely on how the targeted application is going to parse it. If this is some add-on you created, you could stick with URL encoding since it's pretty tried and true, and lots of languages have encoding/decoding methods built in, so your web app would encode the value and your plug-in (or whatever you're working on) would decode it.