What do the symbols mean in the second line - pdf

My teacher posed the following problem: take some PDF, open it in WordPad (for example), and look at the first lines:
%PDF-1.3
%·ѕ­Є
Question: What do the symbols mean in the second line and is there some vulnerability in these codes?

These four characters have no specific meaning. They are there so that programs which inspect the start of a file to decide whether it is text or binary treat the file as binary.

Related

Where are named pdf characters defined like "f_f", "uni00D0" and "a204"?

I'm trying to read the official pdf specification "Document management — Portable document format — Part 1: PDF 1.7" (PDF32000_2008.pdf) as bytes and then interpret them according to that specification.
In Annex D, Character Sets and Encodings, there is a list of all named characters.
When I parse PDF32000_2008.pdf, there are also named characters like "f_f", "uni00D0" and "a204", which are missing in that specification.
My guess is that "f_f" is a symbol for two 'f' characters, which might get printed with a special glyph. There is a Unicode character "Latin Small Ligature Ff" for 'ff'.
For example, there is also "f_i" in that file, which I expect to mean 'fi', one glyph showing the two characters 'f' and 'i'. However, the PDF specification already has 'fi' as the named character "fi", so what is the point of having two named characters pointing to the same symbol?
I can imagine that "uni00D0" means the Unicode character 'Ð'. However, PDF already defines it as the named character "Eth".
What could "a204" be? Maybe ANSI 204 'Ì', which also already has a named character, "Igrave"?
Why do they also use "a62", which would be just a '>'?
However, my main question is: where can I find a specification for these additional named characters?
Of course, Adobe Acrobat understands them, but Gmail also seems to have no problem with them. So I guess their meaning must be specified somewhere.

Does Xcode build comments code into its binary?

I have commented out some code in my project and don't want it to be built into my app's binary.
Does Xcode compile commented-out code into the binary?
//Obj-C
//- (void)functionName {
//
//}
//Swift
//func functionName() {
//
//}
For Swift: From The Basics in the “The Swift Programming Language” (emphasis mine):
Use comments to include nonexecutable text in your code, as a note or reminder to yourself. Comments are ignored by the Swift compiler when your code is compiled.
For Objective-C: Objective-C is an extension of C, and the C99 standard specifies in “5.1.1.2 Translation phases” (emphasis added):
3 The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.
and in “6.4.9 Comments”:
1 Except within a character constant, a string literal, or a comment, the characters /* introduce a comment. The contents of such a comment are examined only to identify multibyte characters and to find the characters */ that terminate it.
2 Except within a character constant, a string literal, or a comment, the characters // introduce a comment that includes all multibyte characters up to, but not including, the next new-line character. The contents of such a comment are examined only to identify multibyte characters and to find the terminating new-line character.
Short answer: No.
Long answer:
Every SDK includes a compiler that translates source code into machine code. Compilers discard comments before generating any code, so nothing from a comment ends up in the binary.
An Apple app is a bundle that contains all of its assets (images, sounds, plists), which anybody with the .app file can view. This is how hackers have been able to create an otherwise identical app with slightly different graphics or sounds and resubmit it as their own.
Alongside those assets is the Unix executable binary. If you open it in a text editor you will see gibberish, because machine code is not readable text.

Informix 11.5 SQL Select Carriage Return and Line Feed

Informix 11.5
I am trying to search for carriage returns and line feeds that may exist in a VARCHAR field. First, I need a SELECT statement to show that they exist. Second, I need to REPLACE them with a space or other character.
I've tried all kinds of variations:
CHR(10) + CHR(13)
CHR(10) || CHR(13)
CHAR(13) + CHAR(10)
CHAR(13) || CHAR(10)
SELECT CHR(10) from systables;
Everything gives an error: Routine (chr) can not be resolved.
I've been searching all over and just can't find anything that works, and I'm sure this is crazy stupid easy.
Get the ASCII package from the IIUG
The CHR() function was added to IDS 11.70; it isn't in IDS 11.50.
The good news is you can add the function because IDS is an extensible server. The better news for you is that you can obtain the relevant code from the IIUG web site in the Software Archive under the Miscellaneous section as ascii.
That should allow you to do what you need. (Note: I wrote the code way back when — before there was support built into any of the servers.)
Windows makes things more complicated
I was uploading the ascii.unl file and I get an error that the number of columns do not match on line 13. Have you seen this before? I'm on Windows 2008. The errors are:
846: Number of values in load file is not equal to number of columns.
847: Error in load file line 13.
I hadn't seen it before, but I've not tried the file on Windows and … well, let's say life gets trickier on Windows than it is on Unix (and this bit isn't all that simple on Unix).
First of all, the data file needs to have CRLF line endings instead of the NL-only line endings that are standard on Unix. (Note that NL, newline, is another name for LF, line feed — aka '\n'.) For most lines in the unload file, that isn't a problem.
The two entries for which it might be (is) a problem are for CR and LF — entries 13 and 10 respectively. In theory, if the entry for line 10 contains (in C string notation) "10|\\\n\r\n" (that is, 10, pipe, backslash, newline, CRLF), all should be OK; the absence of an error message for line 10 suggests that it is OK.
Similarly, the entry for line 13 is "13|\r\r\n", which apparently causes grief. The simplest trial fix is to add a backslash here too: "13|\\\r\r\n". The backslash says "the next character doesn't have a special meaning". If that doesn't work, we'll probably have to try hex-escape notation, "13|\\0d\r\n", and use dbaccess -X to enable it.
With luck, one of those two (or both) will work. If neither works, come back and we'll try to think of something else.
As per my above comment:
I was uploading the ascii.unl file and I get an error that the number of columns do not match on line 13. Have you seen this before? I'm on Windows 2008. 846: Number of values in load file is not equal to number of columns. 847: Error in load file line 13.
Here is what I see in the ascii.unl file.
If I put this into MS Word and turn on Show Formatting/Paragraph marks, it shows this:

Minimal PDF size according to specs

I'm reading PDF specs and I have a few questions about the structure it has.
First of all, the file signature is %PDF-n.m (8 bytes).
After that, the docs say there might be at least 4 bytes of binary data (but there also might not be any). The docs don't say how many binary bytes there could be, so that is my first question. If I were trying to parse a PDF file, how should I parse that part? How would I know how many binary bytes (if any) were placed in there? Where should I stop parsing?
After that, there should be a body, an xref table, a trailer and an %%EOF.
Second question: what could be the minimal file size of a PDF, assuming there isn't anything at all (no objects whatsoever) in the PDF file and assuming the file doesn't contain the optional binary bytes section at the beginning?
Third and last question: if there were more than one body+xref+trailer section, where would the offset just before the %%EOF point to? The first or the last xref table?
First of all, the file signature is %PDF-n.m (8 bytes). After that, the docs say there might be at least 4 bytes of binary data (but there also might not be any). The docs don't say how many binary bytes there could be, so that is my first question. If I were trying to parse a PDF file, how should I parse that part? How would I know how many binary bytes (if any) were placed in there? Where should I stop parsing?
Which docs do you have? The PDF specification ISO 32000-1 says:
If a PDF file contains binary data, as most do (see 7.2, "Lexical Conventions"), the header line shall be immediately followed by a comment line containing at least four binary characters—that is, characters whose codes are 128 or greater.
Thus, those at least 4 bytes of binary data do not follow the file signature without any structure; they are on a comment line! This implies that they are preceded by a % (which starts a comment, i.e. data you have to ignore while parsing anyway) and followed by an end-of-line, i.e. CR, LF, or CR LF.
So the line is easy to recognize while parsing. In particular, it is merely a special case of a comment line and needs no special treatment.
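To make the point about the comment line concrete, here is a minimal Python sketch. The function name parse_pdf_header is made up for illustration, and the sketch assumes the header sits at the very start of the data with no leading junk and no trailing whitespace, which real-world files do not always respect.

```python
import re

def parse_pdf_header(data: bytes):
    """Return ((major, minor), has_binary_marker) for the start of a PDF file."""
    # Split off the first two lines; PDF accepts CR, LF, or CR LF as end-of-line.
    lines = re.split(rb"\r\n|\r|\n", data, maxsplit=2)
    m = re.fullmatch(rb"%PDF-(\d)\.(\d)", lines[0])
    if m is None:
        raise ValueError("not a PDF header")
    version = (int(m.group(1)), int(m.group(2)))
    second = lines[1] if len(lines) > 1 else b""
    # The marker is an ordinary comment: '%' plus at least four bytes >= 128.
    high_bytes = sum(1 for b in second[1:] if b >= 128)
    has_marker = second.startswith(b"%") and high_bytes >= 4
    return version, has_marker
```

A parser that already skips comments does not need the second half at all; the check is only useful if you want to report whether the marker is present.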
(sigh, I just saw you and @Jongware cleared that up in the comments while I wrote this...)
What could be the minimal file size of a PDF, assuming there isn't anything at all (no objects, whatsoever) in the PDF file and assuming the file doesn't contain the optional binary bytes section at the beginning?
If there are no objects, you don't have a PDF file as certain objects are required in a PDF file, in particular the catalog. So do you mean a minimal valid PDF file?
As you commented you indeed mean a minimal valid PDF.
Please have a look at the question "What is the smallest possible valid PDF?" on Stack Overflow; there are some attempts to create minimal PDFs adhering more or less strictly to the specification. Reading e.g. @plinth's answer, you will see stuff that is not PDF anymore but is still accepted by Adobe Reader.
Third and last question: if there were more than one body+xref+trailer section, where would the offset just before the %%EOF point to?
Normally it would be the last cross reference table/stream, as the usual use case is:
1. you start with a PDF which has but one cross reference section;
2. you append an incremental update with a cross reference section pointing to the original as previous, and the new offset before %%EOF points to that new cross reference;
3. you append yet another incremental update with a cross reference section pointing to the cross references from the first update as previous, and the new offset before %%EOF points to that newest cross reference;
etc...
The exception is the case of linearized documents in which the offset before the %%EOF points to the initial cross references which in turn point to the section at the end of the file as previous. For details cf. Annex F of ISO 32000-1.
And as you can of course apply incremental updates to a linearized document, you can have mixed forms.
In general it is best for a parser to be able to parse any order of partial cross references. And don't forget, there are not only cross reference sections but also alternatively cross reference streams.
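As a rough illustration of where a parser starts, the following Python sketch reads the offset that follows the final startxref keyword. The function name last_startxref is hypothetical, and the sketch only finds the entry point: it does not follow the chain of previous cross reference sections, and it does not distinguish cross reference tables from cross reference streams.

```python
def last_startxref(data: bytes) -> int:
    """Return the byte offset recorded after the final 'startxref' keyword."""
    keyword = b"startxref"
    pos = data.rfind(keyword)
    if pos == -1:
        raise ValueError("no startxref keyword found")
    # The offset is the first whitespace-delimited token after the keyword.
    tokens = data[pos + len(keyword):].split()
    if not tokens or not tokens[0].isdigit():
        raise ValueError("startxref is not followed by an offset")
    return int(tokens[0].decode("ascii"))
```

For an incrementally updated file this returns the offset of the newest cross reference section; for a linearized file it returns the offset of the initial one, as described above.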

reading unformatted data, intel ifort vs Ibm xlf

I'm trying to move from Intel ifort to IBM xlf, but there is a problem when reading "unformatted" data (by unformatted I mean that the values do not all have the same length). Here is an example:
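program main
  implicit none
  real(8) :: a, b
  ! List-directed reads: each read consumes one record of 1.txt
  open(unit=10, file='1.txt')
  read(10,*) a
  read(10,*) b
  ! No comma after the format specification (standard-conforming form)
  write(*,'(E20.14E2)') a, b
  close(10)
end program main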
1.txt:
0.10640229631236
8.5122792850319D-02
using ifort I get output:
0.10640229631236E+00
0.85122792850319E-01
using xlf I get output:
' in the input file. The program will recover by assuming a zero in its place.e invalid digit '
0.10640229631236E+00
0.85122792850319E-01
Since the data in 1.txt is unformatted, I can't use a fixed format to read it. Does anyone know how to get rid of this warning?
(Question answered in the comments. See Question with no answers, but issue solved in the comments (or extended in chat) )
@M.S.B wrote:
Is there an apostrophe in the input file? Or any character besides digits, decimal point and "D"? Your reads are "list directed".
The OP wrote:
Yes, there seems to be some character after 0.10640229631236 that causes this warning. When I write those numbers to a new file by hand (breaking the line after 0.10640229631236 with the Enter key), the warning goes away. Running cat -v on the two files, the file with the warning gives 0.10640229631236^M 8.5122792850319D-02, while the file without the warning gives 0.10640229631236 8.5122792850319D-02. Do you know what that ^M stands for and where it comes from?
@agentp gave the link:
'^M' character at end of lines
It explains that ^M is how a carriage return (CR, ASCII 13) is displayed. Windows ends lines with CR LF, so the CR is left over at the end of each line when the file is read on Unix.
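One way to clean such a file before feeding it to the Fortran program is to rewrite its line endings. The Python sketch below is only an illustration (the function name normalize_newlines is made up here); tools such as dos2unix do the same job.

```python
def normalize_newlines(raw: bytes) -> bytes:
    """Convert CR LF (and any stray CR) line endings to plain LF."""
    return raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

To apply it to 1.txt, read the file in binary mode ('rb'), pass the bytes through the function, and write the result back in binary mode ('wb') so that Python does not translate the line endings again.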