Maybe I'm blind, but I can't find, in the S3 documentation, the maximum file name length that can be uploaded to S3.
According to the Amazon documentation:
These names are the object keys. The name for a key is a sequence of
Unicode characters whose UTF-8 encoding is at most 1024 bytes long.
The maximum key length is therefore 1024 bytes, not characters. If the characters in the name require more than one byte in their UTF-8 representation, the number of available characters is reduced accordingly.
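A small Python sketch makes the byte-versus-character distinction concrete (the 1024-byte limit comes from the quoted docs; the helper itself is just illustrative):

    # Illustrative helper: does a candidate S3 key fit the 1024-byte limit?
    MAX_KEY_BYTES = 1024

    def key_fits(key: str) -> bool:
        """True if the key's UTF-8 encoding is at most 1024 bytes."""
        return len(key.encode("utf-8")) <= MAX_KEY_BYTES

    print(key_fits("a" * 1024))  # True: ASCII characters are 1 byte each
    print(key_fits("é" * 1024))  # False: 'é' is 2 bytes in UTF-8, so 2048 bytes total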
RFC 1952 (GZIP File Format Specification) section 2.3.1.1 reads:
2.3.1.1. Extra field
If the FLG.FEXTRA bit is set, an "extra field" is present in
the header, with total length XLEN bytes. It consists of a
series of subfields, each of the form:
+---+---+---+---+==================================+
|SI1|SI2| LEN |... LEN bytes of subfield data ...|
+---+---+---+---+==================================+
SI1 and SI2 provide a subfield ID, typically two ASCII letters
with some mnemonic value. Jean-Loup Gailly
<email#hidden> is maintaining a registry of subfield
IDs; please send him any subfield ID you wish to use. Subfield
IDs with SI2 = 0 are reserved for future use. The following
IDs are currently defined:
SI1 SI2 Data
---------- ---------- ----
0x41 ('A') 0x70 ('P') Apollo file type information
LEN gives the length of the subfield data, excluding the 4
initial bytes.
Do any subfield types exist beyond the AP subfield given in the RFC? A web search doesn't find a list; neither is there any mention on GZip's Wikipedia page, the GNU homepage, in the gzip source code, or on Stack Overflow.
As far as I know, there is no such registry being maintained. Jean-loup no longer works on gzip.
Here is one more subfield in use:
The BGZF format (which is gzip-conformant), developed for use in bioinformatics, uses the subfield ID "BC" to indicate the size of the current block. This makes parallel decompression easy.
From the specification at http://samtools.github.io/hts-specs/SAMv1.pdf :
Each BGZF block contains a standard gzip file header with the following standard-compliant extensions:
1. The F.EXTRA bit in the header is set to indicate that extra fields are present.
2. The extra field used by BGZF uses the two subfield ID values 66 and 67 (ASCII 'BC').
3. The length of the BGZF extra field payload (field LEN in the gzip specification) is 2 (two bytes of payload).
4. The payload of the BGZF extra field is a 16-bit unsigned integer in little endian format. This integer gives the size of the containing BGZF block minus one.
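Putting the two quotes together, here is a rough Python sketch that walks an FEXTRA payload and picks out a BGZF "BC" subfield; the layout follows RFC 1952 (multi-byte values are stored least significant byte first), and the sample data is made up:

    import struct

    def parse_extra_subfields(extra: bytes):
        """Split a gzip FEXTRA payload into (SI1, SI2, data) subfields per RFC 1952."""
        subfields = []
        pos = 0
        while pos + 4 <= len(extra):
            si1, si2, length = struct.unpack_from("<BBH", extra, pos)
            data = extra[pos + 4 : pos + 4 + length]
            subfields.append((chr(si1), chr(si2), data))
            pos += 4 + length
        return subfields

    # Made-up example: a BGZF-style "BC" subfield whose 2-byte payload is 0x00FF
    extra = b"BC" + struct.pack("<H", 2) + struct.pack("<H", 0x00FF)
    for si1, si2, data in parse_extra_subfields(extra):
        if (si1, si2) == ("B", "C"):
            bsize = struct.unpack("<H", data)[0]
            print("BGZF block size:", bsize + 1)  # stored value is the size minus one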
I need to print an encrypted string as-is in an RDLC report. My problem is that if the string contains a plus sign, it creates a new line in the Textbox. How can I avoid this?
Encryption produces output that is binary and contains many bytes that have no displayable representation.
Because of this, if encrypted data needs to be displayed, it is generally encoded as either Base64 (best for computers) or hexadecimal (best for people).
It seems that you may have Base64-encoded encrypted data, which is generally composed of uppercase and lowercase letters, the ten digits, "+", "/", and "=". You cannot simply delete these characters and expect to recover the encrypted data.
If these characters present a problem, they can often be escaped in some manner, or another encoding can be chosen, such as hexadecimal or an alternate Base64 alphabet; see Base64. If you choose an alternate Base64 alphabet, interoperability will most likely be impaired.
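For instance, if the string is standard Base64, one option along those lines is the URL-safe Base64 alphabet, which swaps "+" and "/" for "-" and "_". A minimal Python sketch (the ciphertext bytes here are made up; the RDLC side would just display the re-encoded string):

    import base64

    ciphertext = b"\xfa\xde\x3e\xbe\xef\x01"  # made-up stand-in for encrypted bytes

    standard = base64.b64encode(ciphertext).decode("ascii")
    urlsafe = base64.urlsafe_b64encode(ciphertext).decode("ascii")

    print(standard)  # '+t4+vu8B': may contain '+' and '/'
    print(urlsafe)   # same data with '+' -> '-' and '/' -> '_'

    # Round-tripping with the matching alphabet recovers the original bytes
    assert base64.urlsafe_b64decode(urlsafe) == ciphertext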
Note: More information would produce a better answer.
I had to replace the "+" with "÷".
Users don't notice it, since the PDF is just a visual representation of the CFDI; I haven't had any issues with it.
Microsoft's documentation on the varchar(max) data type:
"Variable-length, non-Unicode string data. . . max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size is the actual length of the data entered + 2 bytes"
http://technet.microsoft.com/en-us/library/ms176089.aspx
I thought 2^31 bytes = 2 GB, not that 2^31-1 bytes = 2 GB.
Am I wrong on this point?
Two of the bytes are reserved for column overhead, so the question becomes:
How many characters will the data type store?
a) 2^31-3 = 2,147,483,645 bytes = 2,147,483,645 Characters
b) 2^31-2 = 2,147,483,646 bytes = 2,147,483,646 Characters
The number 2^31-1 is 0x7FFFFFFF in hex. It's the largest possible positive 32-bit number on a two's-complement machine (like the x86 and just about everything else).
The documentation is telling you that this is the maximum storage size, which has to hold the length of the data plus 2 bytes. This means that the maximum data size is 2^31-1-2, or 2,147,483,645 (0x7FFFFFFD).
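A quick check of the arithmetic in Python, for concreteness:

    # 2^31 bytes would be exactly 2 GiB; the documented maximum is one byte less.
    max_storage = 2**31 - 1      # 2,147,483,647 bytes (0x7FFFFFFF)
    overhead = 2                 # bytes used for the length information
    max_data = max_storage - overhead

    print(f"{max_storage:,}")    # 2,147,483,647
    print(f"{max_data:,}")       # 2,147,483,645
    print(hex(max_data))         # 0x7ffffffd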
Encoding with hexadecimal numbers seems to be different from using hexadecimals to represent numbers. For example, the hex number 0x40 to me should be equal to 64, or BA_{64}, but when I put it through this hex to base64 converter, I get the output QA==, which to me is equal to some number times 64. Why is this?
Also, when I check the integer value of the hex string deadbeef I get 3735928559, but when I check it in other places I get: 222 173 190 239. Why is this?
Addendum: So I guess it is because it is easier to break the number into bit chunks than to treat it as a whole number when encoding? That is pretty confusing to me, but I guess I get it.
You may wish to read this:
http://en.wikipedia.org/wiki/Base64
In summary, Base64 specifies a particular encoding, one that assigns letters different values than their ASCII codes.
For the second part, one source is treating the entire string as a 32-bit integer, and the other is dividing it into bytes and giving the value of each byte.
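Both points are easy to verify in Python: Base64 consumes the raw bytes six bits at a time (hence 0x40 becoming QA==), and deadbeef can be read either as one 32-bit integer or byte by byte:

    import base64

    # 0x40 is the byte 0b01000000; Base64 reads 6 bits at a time, so the
    # first group (010000 = 16) maps to 'Q', and the rest is zero bits and padding.
    print(base64.b64encode(bytes.fromhex("40")))   # b'QA=='

    raw = bytes.fromhex("deadbeef")
    print(int.from_bytes(raw, "big"))              # 3735928559 (one 32-bit integer)
    print(list(raw))                               # [222, 173, 190, 239] (byte by byte)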
I'm not very experienced with lower-level things such as how many bytes a character is. I tried to find out whether one character equals one byte, but without success.
I need to set a delimiter used for socket connections between a server and clients. This delimiter has to be as small (in bytes) as possible, to minimize bandwidth.
The current delimiter is "#". Would choosing another delimiter decrease my bandwidth?
It depends on what character encoding you use to translate between characters and bytes (which are not at all the same thing):
In ASCII or ISO 8859, each character is represented by one byte
In UTF-32, each character is represented by 4 bytes
In UTF-8, each character uses between 1 and 4 bytes
In ISO 2022, it's much more complicated
US-ASCII characters (of which # is one) take only 1 byte in UTF-8, which is the most popular encoding that allows multibyte characters.
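A quick Python sketch makes the differences visible ('#', 'é', and '€' are just sample characters):

    # Byte counts for the same characters under different encodings
    for ch in ("#", "é", "€"):
        utf8 = len(ch.encode("utf-8"))
        utf32 = len(ch.encode("utf-32-le"))  # the -le variant has no byte-order mark
        print(f"{ch!r}: UTF-8 = {utf8} byte(s), UTF-32 = {utf32} bytes")

    # '#' is US-ASCII, so it is one byte in ASCII, ISO 8859, and UTF-8 alike
    assert len("#".encode("ascii")) == 1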
It depends on the encoding. In single-byte character sets, such as ANSI and the various ISO 8859 character sets, it is one byte per character. Some encodings, such as UTF-8, are variable-width, where the number of bytes needed to encode a character depends on the character being encoded.
The answer, of course, is that it depends. If you are in a pure ASCII environment, then yes, every character takes 1 byte, but if you are in a Unicode environment (all of Windows, for example), then characters can range from 1 to 4 bytes in size.
If you choose a character from the ASCII set, then yes, your delimiter is as small as possible.
No, all characters are 1 byte, unless you're using Unicode or wide characters (for accents and other symbols for example).
A character is 1 byte (8 bits) long, which gives 256 possible combinations to form characters with. One-byte characters are called ASCII characters. They use only 7 bits (even though 8 are available, the 8th bit isn't used) to form the standard alphabet and the various symbols used when teletypes and typewriters were still common.
You can find an ASCII chart and what numbers correspond to what characters here.