SBLineEntry is a proxy object in the LLDB Python interface. SBLineEntry.GetColumn() returns a position within a line, but I am not sure what unit it is measured in.
In the C++ source, it resolves to the LineEntry.column value, but that also lacks any description of how it is measured.
At first I thought it was a UTF-8 code unit offset, but when I measure it, it looks like a UTF-16 code unit offset. I still couldn't find any definition for this value.
What is this value?
Raw byte offset in source code file?
UTF-8 code unit offset?
UTF-16 code unit offset?
Something else?
That's a good question! If the debug information is DWARF (which it is on everything except Windows systems), lldb returns the DW_LNS_set_column data from the DWARF line table as the result of SBLineEntry::GetColumn(). The DWARF5 specification doesn't say what this integer is counting -- it says only,
The DW_LNS_set_column opcode takes a single unsigned LEB128 operand and stores it in the column register of the state machine.
You're probably seeing that clang puts the UTF-16 code unit offset in the DWARF, but the standard doesn't require that. This would be a reasonable clarification request to file with the DWARF standards committee, http://dwarfstd.org
For the case of Rust programs, I think it's Unicode Scalar value offset.
Here's an open issue about the column number. It says the span_start function produces the column number.
span_start calls lookup_char_pos.
lookup_char_pos calls bytepos_to_file_charpos.
bytepos_to_file_charpos
These names all repeat the word "char", and in Rust, "char" means a Unicode Scalar Value.
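To check which unit a given compiler emits, it can help to compute all the candidates for a known position and compare them with what GetColumn() reports. The following is a minimal sketch in Python; the helper name column_candidates and the sample line are made up for illustration, and it assumes the source file is UTF-8 encoded. Whether the reported column is 0-based or 1-based has to be checked separately.

```python
def column_candidates(line: str, char_index: int) -> dict:
    """Measure the text that precedes line[char_index] in several units."""
    prefix = line[:char_index]
    return {
        # In a UTF-8 file this is also the raw byte offset within the line.
        "utf8_code_units": len(prefix.encode("utf-8")),
        # Each UTF-16 code unit is 2 bytes; non-BMP characters take two units.
        "utf16_code_units": len(prefix.encode("utf-16-le")) // 2,
        # Python str indexes by code point, i.e. Unicode scalar values here.
        "unicode_scalars": len(prefix),
    }


# Hypothetical source line: the call to f() follows non-ASCII text.
line = "é😀 = f()"
print(column_candidates(line, line.index("f")))
```

If the three numbers differ for the chosen line, only one of them can match the debugger's value (after allowing for a possible offset of one), which identifies the unit for that compiler.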
I know that using 0:3 in this Pascal code prints the result with 3 decimal places:
var a,b:real;
begin
a:=23;
b:=7;
writeln(a/b:0:3);
readln;
end.
What I would like to know is if anyone has a source to learn what this : will do with other variables or if adding for example 0:3:4 will make a difference. Basically what : can do to a variable
For the exact definition of write parameters take a look at ISO standards 7185 and 10206, “Standard Pascal” and “Extended Pascal” respectively. These references are useless though if your compiler’s documentation does not make a statement regarding compliance with them. Other compilers have their own non-standard extensions, so the only reliable source of reference is your compiler’s documentation or even its source code if available.
[…] what this : will do with other variables […] Basically what : can do to a variable
As MartynA already noted, this wording is imprecise: write/writeLn/writeStr only read the variables’ values, leaving them unmodified.
[…] if adding for example 0:3:4 will make a difference.
To my knowledge a third write parameter is/was only allowed in PXSC, Pascal eXtensions for Scientific Computing. In this case the third parameter would indicate the rounding mode (absent or 0: closest printable number; greater than zero: round up; less than zero: round down).
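For readers more familiar with other languages, the two standard write parameters for a real value (minimum field width, then number of fraction digits) behave much like a width and precision in a format string. The sketch below is a Python analogy, not Pascal, and the helper name pascal_like_write is invented for illustration.

```python
def pascal_like_write(value: float, width: int, decimals: int) -> str:
    """Mimic Pascal's value:width:decimals for real numbers.

    The result is right-justified in at least `width` characters and
    shows exactly `decimals` digits after the decimal point.
    """
    return f"{value:{width}.{decimals}f}"


a, b = 23.0, 7.0
print(pascal_like_write(a / b, 0, 3))    # like writeln(a/b:0:3)
print(pascal_like_write(a / b, 10, 3))   # like writeln(a/b:10:3), padded on the left
```

A width smaller than the text needed never truncates the number; it only stops padding from being added.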
I am a beginner to Unix programming and C and I have two questions regarding the stat struct and its field st_mode:
When accessing the st_mode field as below, what type of number is returned (octal, decimal, etc.)?
struct stat file;
stat( someFilePath, &file);
printf("%d", file.st_mode );
I thought the number was in octal, but when I ran this code I got the value 33188. What is the base?
I found out that st_mode encodes a 16-bit binary number that represents the file type and file permissions. How do I get the 16-bit number from the above output (especially when it doesn't seem to be in octal)? And which parts of the 16-bit number encode which information?
Thanks for any help.
The actual type behind mode_t and how it encodes information is implementation defined. The only thing that's certain is that it's a bitmask.
To work with st_mode, use the flags and macros defined in the sys/stat.h header. For a list of those defines, consult:
man 2 stat
If you truly need to know what each bit represents, or are simply curious, read the header or use printf to inspect the flags.
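On the question of base: the value itself has no base, and %d simply prints it in decimal. As a rough illustration, the sketch below uses Python's stat module, which mirrors the sys/stat.h macros, to inspect the 33188 from the question. The exact bit layout is implementation-defined, so the comments describe what a typical POSIX system shows.

```python
import stat

mode = 33188  # the value printed with %d in the question

# The same bits shown in octal: on typical systems 0o100644.
print(oct(mode))

# File-type test, equivalent to the S_ISREG() macro in C.
print(stat.S_ISREG(mode))

# Permission bits only (file-type bits masked off).
print(oct(stat.S_IMODE(mode)))

# Human-readable form, as shown by ls -l.
print(stat.filemode(mode))
```

In C, printing with "%o" instead of "%d" gives the octal view directly, and the S_ISREG(), S_ISDIR() and related macros do the same masking without relying on a particular bit layout.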
Why does Microsoft tend to report "error codes" as hexadecimal values?
Error codes are 32-bit double word values (4 byte values.) This is likely the raw integer return code of whatever C-style function has reported an error.
However, why report the error to a user in hexadecimal? The "0x" prefix is worthless, and the savings in character length is minimal. These errors end up displayed to end users in Microsoft software and even on Microsoft websites.
For example:
0x80302010 is 10 characters long, and very cryptic.
2150637584 is the decimal equivalent, and much more user friendly.
Is there any description of the "standard" use of a 32-bit field as an error code mechanism (possibly dividing the field into multiple fields for developer interpretation) or of the logic behind presenting a hexadecimal code to end users?
We can only guess about the reason, so this question cannot be answered for sure. But let's guess:
One reason might be that with hex numbers, you know the number will have 8 digits. If it has more or fewer digits, the number is "corrupt" (for example, the customer mistyped it). With decimal numbers the number of digits for the same range of values varies.
Also, to a developer, hex numbers are more convenient and natural than decimal numbers. For example, if some info is coded as bit flags you can decipher them manually easily in hex numbers but not in decimal numbers.
It is a little bit subjective as to whether hexadecimal or decimal error codes are more user friendly. Here is a scenario where the hexadecimal error codes are significantly more convenient, which could be part of the reason that hexadecimal error codes are used in the first place.
Consider the documentation for Win32 Error Codes for Active Directory Service Interfaces: ADSI uses error codes with the format 0x8007XXXX, where the XXXX corresponds to a DWORD value that maps to a Win32 error code.
This makes it extremely easy to get the corresponding Win32 error code, because you can just read off the last 4 hex digits. This would not be possible with a decimal error code representation.
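To make the point concrete, here is a small Python sketch that splits an HRESULT-style value into its fields: the top bit marks failure, bits 16 to 26 hold the facility, and the low 16 bits hold the code. The function name split_hresult is made up for this example; 0x80070005 is used as the sample because it wraps Win32 error 5 (access denied).

```python
def split_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT-style value into its bit fields."""
    return {
        "failure": bool((value >> 31) & 1),   # severity bit
        "facility": (value >> 16) & 0x7FF,    # 11-bit facility, 7 = Win32
        "code": value & 0xFFFF,               # low 16 bits, the wrapped error code
    }


hr = 0x80070005
print(f"{hr:#010x}", hr)   # hex and decimal views of the same value
print(split_hresult(hr))
```

In the hex view the fields line up with digit boundaries (8, 007, 0005), whereas the decimal view 2147942405 gives no visual hint of them.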
The middle ground answer to this would be that formatting the number like an IPv4 address would be more luser-friendly while preserving some sort of formatting that helps the dev guys.
Although TBH I think hex is fine, the hypothetical non-technical user has no more idea what 0x1234ABCD means than 1234101112 or "Cracked gangle pin on fwip valve".
When we use cryptography, we always see byte arrays being used instead of String values. But when we look at how most cryptographic algorithms work, they use hex values for their operations. E.g. AES: MixColumns and SubBytes (I suppose) all use hex values to do those operations.
Can you explain how these byte arrays are used in these operations as hex values?
I have an assignment to develop an encryption algorithm, therefore any related sample code would be much appreciated.
Every four digits of binary makes a hexadecimal digit, so, you can convert back and forth quite easily (see: http://en.wikipedia.org/wiki/Hexadecimal#Binary_conversion).
I don't think I fully understand what you're asking, though.
The most important thing to understand about hexadecimal is that it is a system for representing numeric values, just like binary or decimal. It is nothing more than notation. As you may know, many computer languages allow you to specify numeric literals in a few different ways:
int a = 42;
int a = 0x2A;
These store the same value into the variable 'a', and a compiler should generate identical code for them. The difference between these two lines will be lost very early in the compilation process, because the compiler cares about the value you specified, and not so much about the representation you used to encode it in your source file.
Main takeaway: there is no such thing as "hex values" - there are just hex representations of values.
That all said, you also talk about string values. Obviously 42 != "42" != "2A" != 0x2A. If you have a string, you'll need to parse it to a numeric value before you do any computation with it.
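The same distinction can be checked interactively. This short Python sketch shows that a decimal and a hex literal produce the same value, that hex is just one way of printing it, and that a string has to be parsed before it can be compared with a number.

```python
a = 42
b = 0x2A

# Two notations, one value.
print(a == b)

# The representation is chosen only when the value is printed.
print(hex(a), bin(a), a)

# A string is not a number until it is parsed with the right base.
print("2A" == a)
print(int("2A", 16) == a)
print(int("42", 10) == a)
```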
Bytes, byte arrays and/or memory areas are normally displayed within an IDE (integrated development environment) and debugger as hexadecimals. This is because it is the most efficient and clear representation of a byte. An experienced programmer can pretty easily convert them into bits in their head. You can clearly see how XOR and shift work as well, for instance. Those (and addition) are the most common operations when doing symmetric encryption/hashing.
So it's unlikely that the program performs this kind of conversion; it's probably the environment you are in. That, and source code (which is converted to bytes at compile time) probably uses a lot of literals in hexadecimal notation as well.
Cryptography in general (hash functions aside) is a method of converting data from one form to another, usually called cipher text, using a secret key. The secret key can be applied to the cipher text to get back the original data, also called plain text. In this process data is handled at the byte level, though it can be at the bit level as well. The point is that the text or strings we are talking about occupy only a limited part of the range of a byte: ASCII, for example, defines values 0 to 127, while a byte can hold 0 to 255. In practice, when a crypto operation is performed, each character is converted to its equivalent byte(s) and the operation is then performed with the key. The resulting byte or bytes will most probably fall outside the range of human-readable text such as ASCII. For this reason, any data that a crypto function is to be applied to is converted to a byte array first. For example, say the text to be enciphered is "Hello how are you doing?". The following steps are followed:
1. byte[] data = "Hello how are you doing?".getBytes()
2. Process encipher on data using key which is also byte[]
3. The output blob is referred as cipherTextBytes[]
4. Encryption is complete
5. Using Key[], a process is performed over cipherTextBytes[] which returns data bytes
6. A simple new String(data) will return the string value "Hello how are you doing?".
This is simplified information that might help you understand reference code and manuals better. In no way am I trying to explain the core of cryptography here.
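The steps above can be sketched in Python with a toy repeating-key XOR standing in for the real cipher. This is only meant to show the string-to-bytes-to-string round trip; XOR with a short repeating key is not secure, and the key value and the helper name xor_bytes are made up for illustration.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each data byte with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


plain_text = "Hello how are you doing?"
data = plain_text.encode("utf-8")        # step 1: string -> bytes
key = bytes.fromhex("2a5c91")            # hypothetical key, also just bytes

cipher_text_bytes = xor_bytes(data, key)  # steps 2-4: encipher
print(cipher_text_bytes.hex())            # hex is only how we choose to display the bytes

recovered = xor_bytes(cipher_text_bytes, key)  # step 5: same key reverses XOR
print(recovered.decode("utf-8"))               # step 6: bytes -> string
```

Note that the cipher bytes are printed as hex only because many of them are not printable characters; the operations themselves work on plain integer byte values.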
I'm converting a WinForms app to Silverlight (VB.NET). What should I use instead of Char.ConvertFromUtf32 as it's not available to use in Silverlight?
UTF-32 is currently not part of Silverlight, so you have to find a way around the limitation. I think you should stop a moment and think exactly why you need to read UTF32-encoded text.
If you are reading such text from a database or a file on the server, I would perform the conversion server-side (if possible I would convert everything to UTF-8 and get rid of the UTF-32 data in one shot).
If you are parsing a user-provided file on the client side, I would detect the UTF-32 encoding and gently tell the user that the file encoding is not supported. UTF32 is pretty rare nowadays, so I guess it should not be a very common case (but I could be wrong not knowing your exact situation).
In order to detect the file encoding you have to look at the first few bytes (the byte order mark). If they are not present, the task becomes much harder and involves some kind of heuristics based on character frequency.
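As a rough sketch of the byte order mark check, here it is in Python (the same comparison of leading bytes can be written in VB.NET). The function name detect_bom is invented for this example. The UTF-32 marks are tested before the UTF-16 ones because the UTF-32 little-endian mark begins with the same two bytes as the UTF-16 little-endian mark.

```python
import codecs

# Order matters: BOM_UTF32_LE (FF FE 00 00) starts with BOM_UTF16_LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head: bytes):
    """Return the encoding named by a leading byte order mark, or None."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


print(detect_bom("A".encode("utf-32")))  # Python's utf-32 codec writes a BOM first
print(detect_bom(b"plain ascii"))
```

A result of None only means no mark was found; the data may still be in any encoding, which is where the heuristics come in.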
From: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/types/how-to-convert-between-hexadecimal-strings-and-numeric-types
You can use a direct cast, like:
// Get the character corresponding to the integral value.
string stringValue = Char.ConvertFromUtf32(value);
char charValue = (char)value;
Small warning: the cast will only work up to 0xffff. It will not work for the high range of Unicode, from 0x10000 to 0x10ffff, because those code points need two UTF-16 code units (a surrogate pair).
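For the high range, the surrogate pair can be computed by hand with simple arithmetic, which is what a replacement for Char.ConvertFromUtf32 would need to do. The sketch below shows the calculation in Python; the function name to_utf16_units is made up, and the same shifts and masks carry over directly to VB.NET.

```python
def to_utf16_units(code_point: int) -> list:
    """Return the UTF-16 code units (one or two) for a Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if code_point <= 0xFFFF:
        return [code_point]                  # fits in a single unit, a cast is enough
    offset = code_point - 0x10000            # 20 bits remain
    high = 0xD800 + (offset >> 10)           # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)          # bottom 10 bits -> low surrogate
    return [high, low]


print([hex(u) for u in to_utf16_units(0x41)])
print([hex(u) for u in to_utf16_units(0x1F600)])
```

In .NET the two resulting values would each be cast to Char and concatenated into a String.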
Also, if you need to parse \uXXXX, try this other question: How do I convert Unicode escape sequences to Unicode characters in a .NET string?