Encoding numbers - vb.net

I am a developer who works in high-level languages, so I usually take the lower-level details for granted.
I have read that standards such as ASCII and Unicode are character encodings: a character has to be stored as a number. Is the same true for numbers? For example, if I declare a variable in .NET like this:
Dim test As Integer = 5
In this case, will the value of test (5) be represented as decimal 53 according to this table? Is that correct?

If you code Dim test As String = "5", the value will be stored using the Unicode encoding of the character "5". However, Integers (and other numeric types) are not strings and are not encoded that way; they are represented internally by their numeric value. An Integer is stored as a 32-bit value.
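To illustrate the distinction, here is a small sketch (Kotlin rather than VB.NET, but the idea is the same): the character '5' has the code 53, while the integer 5 is simply the binary value 101 held in a 32-bit word.
fun main() {
    val asChar = '5'
    val asInt = 5
    println(asChar.code)                    // 53: the Unicode/ASCII code of the character '5'
    println(Integer.toBinaryString(asInt))  // 101: the integer 5 as raw binary (stored in 32 bits)
}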

What you are asking about is data representation in memory. The way integers are represented depends on whether they are signed or unsigned. If they are signed (usually the case, unless you specify the type as an unsigned integer or something equivalent), they are represented in binary in two's-complement form: http://en.wikipedia.org/wiki/Two%27s_complement
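As a quick illustration of two's complement (Kotlin here, but the 32-bit pattern is the same as .NET's Integer):
fun main() {
    println(Integer.toBinaryString(5))    // 101
    println(Integer.toBinaryString(-5))   // 11111111111111111111111111111011 (two's-complement form)
}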

Related

Kotlin: Convert Hex String to signed integer via signed 2's complement?

Long story short, I am trying to convert strings of hex values to signed 2's-complement integers. I was able to do this in a single line of code in Swift, but for some reason I can't find anything analogous in Kotlin. String.toInt(16) or String.toUInt(16) just give the straight base-16 to base-10 conversion. That works for some positive values, but not for any negative numbers.
How do I know I want the signed 2's complement? I've used this online converter and according to its output, what I want is the decimal from signed 2's complement, not the straight base 16 to base 10 conversion that's easy to do by hand.
So, "FFD6" should go to -42 (correct, confirmed in Swift and C#), and "002A" should convert to 42.
I would appreciate any help or even any leads on where to look. Because yes I've searched, I've googled the problem a bunch and, no I haven't found a good answer.
I actually tried writing my own code to do the signed 2's complement, but so far it's not giving me the right answers and I'm pretty much at a loss. I'd really prefer a built-in function that does it instead; I feel like if other languages have that capability, Kotlin should too.
For 2's complement, you need to know how big the type is.
Your examples of "FFD6" and "002A" both have 4 hex digits (i.e. 2 bytes).  That's the same size as a Kotlin Short.  So a simple solution in this case is to parse the hex to an Int and then convert that to a Short.  (You can't convert it directly to a Short, as that would give an out-of-range error for the negative numbers.)
"FFD6".toInt(16).toShort() // gives -42
"002A".toInt(16).toShort() // gives 42
(You can then convert back to an Int if needed.)
You could similarly handle 8-digit (4-byte) values as Ints, and 2-digit (1-byte) values as Bytes.
For other sizes, you'd need to do some bit operations.  Based on this answer for Java, if you have e.g. a 3-digit hex number, you can do:
("FD6".toInt(16) xor 0x800) - 0x800 // gives -42
(Here 0x800 is the three-digit number with the top bit (i.e. sign bit) set.  You'd use 0x80000 for a five-digit number, and so on.  Also, for 9–16 digits, you'd need to start with a Long instead of an Int.  And if you need >16 digits, it won't fit into a Long either, so you'd need an arbitrary-precision library that handled hex…)
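Putting that together, here is a rough sketch of a general helper (the name hexToSigned is made up) that sign-extends a hex string of any width up to 16 digits using the same xor/subtract trick:
fun hexToSigned(hex: String): Long {
    require(hex.length in 1..16) { "1 to 16 hex digits supported" }
    val value = hex.toULong(16).toLong()           // plain base-16 parse, no sign yet
    val signBit = 1L shl (hex.length * 4 - 1)      // the top (sign) bit for this width
    return (value xor signBit) - signBit           // sign-extend: same xor/subtract trick as above
}

fun main() {
    println(hexToSigned("FFD6"))   // -42
    println(hexToSigned("002A"))   // 42
    println(hexToSigned("FD6"))    // -42
}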

Objective C: Parsing JSON string

I have a string data which I need to parse into a dictionary object. Here is my code:
NSString *barcode = [NSString stringWithString:@"{\"OTP\": 24923313, \"Person\": 100000000000112, \"Coupons\": [ 54900012445, 499030000003, 00000005662 ] }"];
NSLog(@"%@", [barcode objectFromJSONString]);
In the log, I get a NULL result. But if I pass only one value in Coupons, I get the results. How do I get all three values?
00000005662 might not be a proper integer number as it's prefixed by zeroes (which means it's octal, IIRC). Try removing them.
Cyrille is right; here is the authoritative answer:
The application/json Media Type for JavaScript Object Notation (JSON): 2.4 Numbers
The representation of numbers is similar to that used in most programming languages. A number contains an integer component that may be prefixed with an optional minus sign, which may be followed by a fraction part and/or an exponent part.
Octal and hex forms are not allowed. Leading zeros are not allowed.
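For example, dropping the leading zeros from the last coupon (or quoting the coupon codes as strings, if the zeros are significant) makes the payload valid JSON:
{ "OTP": 24923313, "Person": 100000000000112, "Coupons": [ 54900012445, 499030000003, 5662 ] }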

Use of byte arrays and hex values in Cryptography

When we use cryptography, we always see byte arrays being used instead of String values. But when we look at the techniques behind most cryptographic algorithms, they use hex values in their operations. E.g. in AES, MixColumns, SubBytes and the like (I suppose) work with hex values.
Can you explain how these byte arrays are used as hex values in these operations?
I have an assignment to develop an encryption algorithm, so any related sample code would be much appreciated.
Every four binary digits make one hexadecimal digit, so you can convert back and forth quite easily (see: http://en.wikipedia.org/wiki/Hexadecimal#Binary_conversion).
I don't think I fully understand what you're asking, though.
The most important thing to understand about hexadecimal is that it is a system for representing numeric values, just like binary or decimal. It is nothing more than notation. As you may know, many computer languages allow you to specify numeric literals in a few different ways:
int a = 42;
int a = 0x2A;
These store the same value into the variable 'a', and a compiler should generate identical code for them. The difference between these two lines will be lost very early in the compilation process, because the compiler cares about the value you specified, and not so much about the representation you used to encode it in your source file.
Main takeaway: there is no such thing as "hex values" - there are just hex representations of values.
That all said, you also talk about string values. Obviously 42 != "42" != "2A" != 0x2A. If you have a string, you'll need to parse it to a numeric value before you do any computation with it.
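A small Kotlin illustration of both points (notation vs. value, and parsing a string before computing with it):
fun main() {
    val a = 42
    val b = 0x2A
    println(a == b)            // true: the same value, just written in two different notations
    println("2A".toInt(16))    // 42: a string has to be parsed before you can do arithmetic with it
}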
Bytes, byte arrays and/or memory areas are normally displayed in an IDE (integrated development environment) and debugger as hexadecimal. This is because it is the most efficient and clear representation of a byte. It is pretty easy for an experienced programmer to convert it to bits in their head. You can also clearly see how XOR and shifts work, for instance. Those (and addition) are the most common operations when doing symmetric encryption/hashing.
So it's unlikely that the program itself performs this kind of conversion; it's probably the environment you are in. That, and the source code (which is converted to bytes at compile time) probably uses a lot of literals in hexadecimal notation as well.
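For instance, this tiny Kotlin sketch mimics what a debugger's hex view shows for a small byte array:
fun main() {
    val bytes = byteArrayOf(0x48, 0x65, 0x6C, 0x6C, 0x6F)       // the ASCII bytes of "Hello"
    println(bytes.joinToString(" ") { "%02X".format(it) })      // 48 65 6C 6C 6F
}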
Cryptography in general (hash functions aside) is a method of converting data from one form, referred to as plain text, into another, mostly referred to as cipher text, using a secret key. The secret key can then be applied to the cipher text to get the original plain text back. In this process, data is handled at the byte level, though it can be at the bit level as well. The point here is that the text or strings we refer to cover only a limited range of byte values; ASCII, for example, is defined over the byte values 0 - 127. In practice, when a crypto operation is performed, each character is converted to its equivalent byte and the process is carried out using the key. The resulting bytes will most probably fall outside the range of human-readable text such as ASCII. For this reason, any data to which a crypto function is applied is first converted to a byte array. For example, say the text to be enciphered is "Hello how are you doing?". The following steps would be followed:
1. byte[] data = "Hello how are you doing?".getBytes()
2. Encipher data using the key, which is also a byte[]
3. The output blob is referred to as cipherTextBytes[]
4. Encryption is complete
5. Using the key, a process is performed over cipherTextBytes[] which returns the data bytes
6. A simple new String(data) will return the string value "Hello how are you doing?"
This is a simple overview which might help you understand reference code and manuals better. I am in no way trying to explain the core of cryptography here.
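As a rough Kotlin sketch of that round trip (AES/CBC with a throwaway key, chosen purely for illustration; real code needs proper key and IV management):
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.spec.IvParameterSpec

fun main() {
    // Throwaway 128-bit AES key, for demonstration only
    val key = KeyGenerator.getInstance("AES").apply { init(128) }.generateKey()

    val data = "Hello how are you doing?".toByteArray(Charsets.UTF_8)   // step 1: string -> bytes

    val cipher = Cipher.getInstance("AES/CBC/PKCS5Padding")
    cipher.init(Cipher.ENCRYPT_MODE, key)
    val cipherTextBytes = cipher.doFinal(data)                          // steps 2-4: encrypt
    val iv = cipher.iv                                                  // needed again for decryption

    // The cipher text is arbitrary bytes, so show it as hex rather than as text
    println(cipherTextBytes.joinToString("") { "%02X".format(it) })

    cipher.init(Cipher.DECRYPT_MODE, key, IvParameterSpec(iv))
    val decrypted = cipher.doFinal(cipherTextBytes)                     // step 5: decrypt
    println(String(decrypted, Charsets.UTF_8))                          // step 6: bytes -> string
}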

Objective-C How to get unicode character

I want to get the Unicode code point for a given Unicode character in Objective-C. The NSString documentation says it uses UTF-16 internally, and says:
The NSString class has two primitive methods—length and characterAtIndex:—that provide the basis for all other methods in its interface. The length method returns the total number of Unicode characters in the string. characterAtIndex: gives access to each character in the string by index, with index values starting at 0.
That seems to assume the characterAtIndex: method is Unicode aware. However, it returns a unichar, which is a 16-bit unsigned integer type.
- (unichar)characterAtIndex:(NSUInteger)index
The questions are:
Q1: How does it represent Unicode code points above U+FFFF?
Q2: If Q1 makes sense, is there a method to get the Unicode code point for a given Unicode character in Objective-C?
Thx.
The short answer to "Q1: How does it represent Unicode code points above U+FFFF?" is: you need to be UTF-16 aware and correctly handle surrogate code points. The info and links below should give you pointers and example code that let you do this.
The NSString documentation is correct. However, while you said that NSString uses UTF-16 internally, it's more accurate to say that the public / abstract interface for NSString is UTF-16 based. The difference is that this leaves the internal representation of a string a private implementation detail, while the public methods such as characterAtIndex: and length are always in terms of UTF-16.
The reason for this is it tends to strike the best balance between older ASCII-centric and Unicode aware strings, largely due to the fact that Unicode is a strict superset of ASCII (ASCII uses 7 bits, for 128 characters, which are mapped to the first 128 Unicode Code Points).
To represent Unicode Code Points that are > U+FFFF, which obviously exceeds what can be represented in a single UTF16 Code Unit, UTF16 uses special Surrogate Code Points to form a Surrogate Pair, which when combined together form a Unicode Code Point > U+FFFF. You can find details about this at:
Unicode UTF FAQ - What are surrogates?
Unicode UTF FAQ - What’s the algorithm to convert from UTF-16 to character codes?
Although the official Unicode UTF FAQ - How do I write a UTF converter? now recommends the use of International Components for Unicode, it used to recommend some code officially sanctioned and maintained by Unicode. Although no longer directly available from Unicode.org, you can still find copies of the "no longer official" example code in various open-source projects: ConvertUTF.c and ConvertUTF.h. If you need to roll your own, I'd strongly recommend examining this code first, as it is well tested.
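For comparison, here is a small Kotlin/JVM sketch (Kotlin strings are UTF-16 as well) showing how a code point above U+FFFF occupies two code units that combine as a surrogate pair:
fun main() {
    val s = "\uD83D\uDE00"                     // U+1F600 written as a surrogate pair
    println(s.length)                          // 2: the string holds two UTF-16 code units
    println(Character.codePointAt(s, 0))       // 128512 (0x1F600): the combined code point
    println(Character.isHighSurrogate(s[0]))   // true
    println(Character.isLowSurrogate(s[1]))    // true
}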
From the documentation of length:
The number returned includes the individual characters of composed character sequences, so you cannot use this method to determine if a string will be visible when printed or how long it will appear.
From this, I would infer that any characters above U+FFFF would be counted as two characters and would be encoded as a Surrogate Pair (see the relevant entry at http://unicode.org/glossary/).
If you have a UTF-32 encoded string with the character you wish to convert, you could create a new NSString with initWithBytesNoCopy:length:encoding:freeWhenDone: and use the result of that to determine how the character is encoded in UTF-16, but if you're going to be doing much heavy Unicode processing, your best bet is probably to get familiar with ICU (http://site.icu-project.org/).

String type versus char in abap

What are the drawbacks of the String type in ABAP? When should it be used, and when not?
An example: I have a text field that should hold values ranging from 0 to 12 characters. Is it better to use a STRING or a CHAR(12)?
Thanks!
A string is stored as a dynamic array of characters while a char is statically allocated.
Some of the downsides of strings include:
Overhead - because they are dynamic the length must be stored in addition to the actual string.
The substring and offset operators don't work with strings.
Strings cannot be turned into translatable text elements.
So to answer your question, strings should only be used for fairly long values with a wide range of lengths where the additional overhead is negligible relative to the potential wasted space of a static char(x) variable.
I think CHAR is best because you are 100% sure that the field only has to hold 0-12 characters.
STRING is a variable-length data type, while for CHAR you have to define the length.
Fixed-length type C fields (text fields, alphanumeric characters) and type X fields (hexadecimal) have an initial value that fills their whole declared length (X'0 … 0' for type X).
To avoid that initial padding and work with the actual content length, the variable-length types are used.
Strings are good when:
The text length will be variable.
Spaces are part of the string (trailing spaces in CHAR fields are lost)
You pass them around a lot (when STRING variable metadata is less than char field size)
You need to get the STRING length often. It is more optimal than with CHAR fields.
CHAR fields are good:
If they are small, they are fast (less than around 32 chars on unicode systems)
CHAR field literals using (') quotes instead of (`) can be made into translatable texts.
Things to remember:
All variables have metadata, but strings also have an internal pointer to the string data, which can add up to 64 bytes to the memory consumption. Something to keep in mind.
When assigning a literal text to a variable, try to match the literal type to the variable type. Use 'test' for CHAR and `test` for STRING. This is usually slightly faster.
String Variable:
A String is a variable-length data type which can store data of any length. Variable-length fields are used because they save space.
A String can store any number of characters. Memory is allocated at runtime (dynamic memory allocation), according to the actual size of the string. Strings cannot be declared using PARAMETERS because their memory allocation is dynamic.
But in your case you already know the maximum length of the field (0 - 12 characters), so the CHAR type is the best choice here. A STRING is generally used for variable-length data or long values.