varchar(255) v tinyblob v tinytext - sql

My side question: is there really any difference between tinyblob and tinytext?
But my real question is: what reason, if any, would I have to choose varchar(255) over tinyblob or tinytext?

Primarily storage requirements and memory handling/speed:
In the following table, M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value.
VARCHAR(M), VARBINARY(M):
L + 1 bytes if column values require 0 – 255 bytes,
L + 2 bytes if values may require more than 255 bytes
TINYBLOB, TINYTEXT:
L + 1 bytes, where L < 2^8
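For example, a 5-byte value such as 'hello' occupies 6 bytes (L + 1) whether the column is VARCHAR(255), TINYTEXT, or TINYBLOB; the overhead only differs for a VARCHAR whose values may exceed 255 bytes (e.g. with a multi-byte character set), where a 2-byte length prefix is needed.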
Additionally, see this post:
For each table in use, MySQL allocates memory for 4 rows. For each of these rows, a CHAR(X)/VARCHAR(X) column takes up the full X characters.
A TEXT/BLOB on the other hand is represented by an 8-byte pointer plus a 1–4 byte length (depending on the BLOB/TEXT type). The BLOB/TEXT is allocated dynamically on use. This will use less memory, but in some cases it may fragment your memory in the long run.
Edit: As an aside, BLOB columns store binary data while TEXT columns store character data (with a character set); that's the main difference between TINYBLOB and TINYTEXT.

VARCHAR(255) is closer to standard SQL than TINYBLOB or TINYTEXT, so your scripts and application would be more portable across database vendors.

You can't apply a CHARACTER SET to TINYBLOB (it stores raw bytes), but you can to TINYTEXT and VARCHAR(255).

Related

Summation of 2 floating point values gives incorrect result with higher precision

The sum of two floating-point values in Postgres gives a result with more decimal digits than expected.
Expected:
669.05 + 1.64 = 670.69
Actual:
SELECT CAST(669.05 AS FLOAT) + CAST(1.64 AS FLOAT)
------------------
670.6899999999999
The result has more decimal digits than expected.
The same operation with a different set of inputs behaves differently:
SELECT CAST(669.05 AS FLOAT) + CAST(1.63 AS FLOAT)
------------------
670.68
Here I have reduced the problem to two specific numbers for which the issue occurs.
The actual problem is that when I do this over a whole table, some results come back with many extra digits (depending on the values; I do not have an explanation for exactly which values trigger it), and we had to handle the scale at the application level.
Example query:
SELECT numeric_column_1 / CAST(numeric_column_2 AS FLOAT) FROM input_table;
Note: the behaviour is the same for FLOAT(53) as well.
As per the PostgreSQL documentation, float is an inexact, variable-precision type. It is better to use DECIMAL or NUMERIC, which support exact, user-specified precision:
SELECT CAST(669.05 AS numeric) + CAST(1.64 AS numeric)
Floating-Point Types in PostgreSQL
The data types real and double precision are inexact,
variable-precision numeric types. On all currently supported
platforms, these types are implementations of IEEE Standard 754 for
Binary Floating-Point Arithmetic (single and double precision,
respectively), to the extent that the underlying processor, operating
system, and compiler support it.
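Because real and double precision are IEEE 754 binary floating-point types, the same rounding shows up in any language that uses native doubles. Here is a minimal sketch in C (assuming a typical IEEE 754 platform; the exact digits printed can vary slightly with the formatting routine):
#include <stdio.h>

int main(void) {
    double a = 669.05;  /* stored as the nearest representable double, not exactly 669.05 */
    double b = 1.64;    /* likewise inexact */
    printf("%.16g\n", a + b);   /* on a typical IEEE 754 system prints 670.6899999999999 */
    return 0;
}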
Numeric Types

Name               Storage Size   Description                       Range
smallint           2 bytes        small-range integer               -32768 to +32767
integer            4 bytes        typical choice for integer        -2147483648 to +2147483647
bigint             8 bytes        large-range integer               -9223372036854775808 to +9223372036854775807
decimal            variable       user-specified precision, exact   up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
numeric            variable       user-specified precision, exact   up to 131072 digits before the decimal point; up to 16383 digits after the decimal point
real               4 bytes        variable-precision, inexact       6 decimal digits precision
double precision   8 bytes        variable-precision, inexact       15 decimal digits precision
smallserial        2 bytes        small autoincrementing integer    1 to 32767
serial             4 bytes        autoincrementing integer          1 to 2147483647
bigserial          8 bytes        large autoincrementing integer    1 to 9223372036854775807

SAS TO COBOL conversion variable declaration

Friends,
I am doing a SAS to COBOL conversion. I am stuck with the declaration and conversion below, and I am getting an S0C7 abend in the COBOL run. Please provide some solution.
IP in SAS - PD3.5
OP in SAS - z6.5
My COBOL declaration below.
IP s9.9(5);
OP .9(5);
Please suggest a solution.
Thanks a lot!!
Packed Decimal is stored one digit per nibble, which is two digits per byte, with the last nibble storing the sign. The sign nibbles C, A, F, and E are treated as positive; the sign nibbles B and D are treated as negative. Sign nibbles C and D are referred to as "preferred sign". A sign nibble of F is considered "unsigned," meaning it is neither positive nor negative, though pragmatically you can think of it as positive for arithmetic purposes. +123 is stored in two bytes as x'123C', -456 is stored as x'456D'.
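As a purely illustrative sketch (in C, not part of the SAS or COBOL involved), this is how +123 ends up as the two bytes x'123C': one digit per nibble, with the sign in the last nibble:
#include <stdio.h>

int main(void) {
    int value = 123;
    unsigned char pd[2];
    pd[1] = (unsigned char)(((value % 10) << 4) | 0x0C);         /* low digit 3 plus positive sign nibble C -> 0x3C */
    value /= 10;
    pd[0] = (unsigned char)(((value / 10) << 4) | (value % 10)); /* digits 1 and 2 -> 0x12 */
    printf("x'%02X%02X'\n", pd[0], pd[1]);                       /* prints x'123C' */
    return 0;
}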
The SAS PD informat specifies PDw.d where w is the width of the field in bytes and d is the number of decimal places to the right within the field. So PD3.5 is a 3 byte field (which would store 5 digits and a sign) with all 5 digits to the right of the decimal point.
To obtain the COBOL declaration for a SAS PDw.d declaration:
a = (w * 2) - 1
b = a - d
if b = 0
    PIC SV9(d) Packed-Decimal
else
    PIC S9(b)V9(d) Packed-Decimal
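Applied to the input field here: PD3.5 gives a = (3 * 2) - 1 = 5 and b = 5 - 5 = 0, so the picture is PIC SV9(5) Packed-Decimal, which is the declaration used below.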
The SAS Z format specifies Zw.d where w is the width of the field in characters and d is the number of decimal places to the right within the field. The field will be padded with zeroes on the left to make it w characters wide. So Z6.5 specifies a 6-character output field with 5 digits to the right of the decimal point. One character is taken by the decimal point itself, and unfortunately there is no room for the sign, which may be a bug or may be intentional (perhaps all the data is known to be positive).
IP PIC Sv99999 Packed-Decimal.
OP PIC .99999.
When you MOVE IP TO OP the conversion from Packed Decimal to Zoned Decimal will be done for you by COBOL.

how many bits represent the value 2G

When we say 4K in hardware it is equal to the value 4096 which is 11 bits. What would be the value for 2G and how many bits represent this value?
Thanks
Often in CS we deal with numbers that are necessarily powers of two (all addressable quantities, for example).
In this context it is more useful to have prefixes that, instead of being powers of ten like the decimal K = 10^3, M = 10^6, G = 10^9, are powers of two.
Since the power of two closest to 1000 (the decimal K) is 1024 = 2^10, we can make the analogy that in CS K means 1024 instead of 1000.
This is rather confusing, as some quantities (like disk sizes or transmission channel parameters) are not bound to be powers of two and can be given with either the decimal K or the CS K.
To avoid further confusion, CS now uses dedicated binary prefixes; for example, the CS K is now written Ki.
So, just as the decimal G is 10^9 = (10^3)^3, which you can think of as K^3, the binary G (better called Gi) is Ki^3 = (2^10)^3 = 2^30.
To represent 4Ki quantities you need 12 bits, as log2(4Ki) = log2(2^2 * 2^10) = 12.
To represent 2Gi quantities you need log2(2Gi) = log2(2 * 2^30) = 31 bits.
Note that I used the phrase "to represent 4Ki quantities" rather than "to represent the value 4Ki"; the latter is different and needs one more bit. This is analogous to saying that to represent 1000 quantities we need 3 decimal digits (from 000 to 999), but to represent the number 1000 itself we need 4 digits (1, 0, 0 and 0).
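The same computation as a small C sketch (2ULL << 30 just builds the count 2Gi; nothing here assumes any particular hardware):
#include <stdio.h>

int main(void) {
    unsigned long long count = 2ULL << 30;  /* 2Gi = 2 * 2^30 = 2^31 distinct values */
    int bits = 0;
    while ((1ULL << bits) < count)          /* smallest n such that 2^n >= count */
        bits++;
    printf("%llu values need %d bits\n", count, bits);  /* prints: 2147483648 values need 31 bits */
    return 0;
}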

What do the operators '<<' and '>>' do?

I was following 'A Tour of Go' on http://tour.golang.org.
Table 15 has some code that I cannot understand. It defines two constants with the following syntax:
const (
Big = 1<<100
Small = Big>>99
)
And it's not clear at all to me what it means. I tried to modify the code and run it with different values, to record the change, but I was not able to understand what is going on there.
Then, it uses that operator again on table 24. It defines a variable with the following syntax:
MaxInt uint64 = 1<<64 - 1
And when it prints the variable, it prints:
uint64(18446744073709551615)
Where uint64 is the type. But I can't understand where 18446744073709551615 comes from.
They are Go's bitwise shift operators.
Here's a good explanation of how they work for C (they work in the same way in several languages).
Basically, 1<<64 - 1 corresponds to 2^64 - 1 = 18446744073709551615.
Think of it this way: in decimal, if you start from 001 (which is 10^0) and then shift the 1 to the left, you end up with 010, which is 10^1. If you shift it again you end up with 100, which is 10^2. So shifting to the left is equivalent to multiplying by 10 as many times as you shift.
In binary it's the same thing, but in base 2, so 1<<64 means multiplying by 2 64 times (i.e. 2 ^ 64).
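Since the operators behave the same way in C (per the link above), here is a minimal C sketch. Note that C cannot legally shift a 64-bit 1 by 64, so the 2^64 - 1 value is built as ~0ULL (all bits set) instead:
#include <stdio.h>

int main(void) {
    printf("%d\n", 1 << 3);     /* 8: shifting left 3 places multiplies by 2^3 */
    printf("%d\n", 1024 >> 9);  /* 2: shifting right 9 places divides by 2^9 */
    printf("%llu\n", ~0ULL);    /* 18446744073709551615, i.e. 2^64 - 1, the MaxInt from the question */
    return 0;
}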
That's the same as in all languages of the C family: a bit shift.
See http://en.wikipedia.org/wiki/Bitwise_operation#Bit_shifts
This operation is commonly used to multiply or divide an unsigned integer by powers of 2:
b := a >> 1 // divides by 2
1<<100 is simply 2^100 (that's Big).
1<<64 - 1 is 2^64 - 1, and that's the biggest integer you can represent in 64 bits (by the way, you can't represent 1<<64 as a 64-bit int; the point of table 15 is to demonstrate that you can still have it in numeric constants in Go).
The >> and << are shift operators (logical shifts for the unsigned values shown here). You can see more about those here:
http://en.wikipedia.org/wiki/Logical_shift
Also, you can check all the Go operators on the Go website.
It's a logical shift:
every bit in the operand is simply moved a given number of bit
positions, and the vacant bit-positions are filled in, usually with
zeros
Go Operators:
<< left shift integer << unsigned integer
>> right shift integer >> unsigned integer

How structure padding works?

My question is regarding structure padding. Can anyone tell me the logic behind structure padding?
Example:
struct Node {
    char  c1;
    short s1;
    char  c2;
    int   i1;
};
Can anyone tell me how padding will be applied to this structure?
Assumption: int takes 4 bytes.
Waiting for the answer.
How padding works depends entirely on the implementation.
For implementations where you have a two-byte short and four-byte int and types have to be aligned to a multiple of their size, you will have:
Offset Var Size
------ ---- ----
0 c1 1
1 ?? 1
2 s1 2
4 c2 1
5 ?? 3
8 i1 4
12 next
An implementation is free to insert padding between fields of a structure and following the last field (but not before the first field) for any reason whatsoever. The ability to pad after a structure is important for aligning subsequent elements in an array. For example:
struct { int i1; char c1; };
may give you:
Offset Var Size
------ ---- ----
0 i1 4
4 c1 1
5 ?? 3
8 next
Padding is usually done because either aligned data works faster, or misaligned data is illegal (some CPU architectures disallow misaligned access).
There is no simple answer to this, except "It depends".
It could be as little as 8 bytes, assuming two byte shorts, or it could take 12 bytes, or it could take 42 bytes on a suitably bizarre implementation. It depends on at least the underlying architecture, the compiler and the compiler flags. Check your tool's manual for information.
Inside a struct, each member's offset in memory is based on its size and alignment. Note that this is implementation-specific.
E.g. if char takes 1 byte, short takes 2 bytes and int takes 4 bytes:
struct Node {
    char  c1;   // 1 byte
                // 1 byte padding (next member requires 2-byte alignment)
    short s1;   // 2 bytes
    char  c2;   // 1 byte
                // 3 bytes padding (next member requires 4-byte alignment)
    int   i1;   // 4 bytes
};
This also depends on your compiler settings and architecture, and can also be modified.
If you packed this structure properly (by rearranging the order of members), you could fit it into 8 bytes, not 12 bytes (by switching c2 with s1).
The reason for alignment enforcement is that the hardware can do certain operations faster with data that have a natural alignment; otherwise it would have to perform some bitmasking, shifting and ORing to construct the data before operating on it.
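As a sketch, you can inspect the offsets and total size yourself with offsetof and sizeof. The numbers in the comments assume a typical implementation with a 2-byte short, a 4-byte int, and natural alignment; other implementations may differ:
#include <stdio.h>
#include <stddef.h>

struct Node {         /* the layout discussed above */
    char  c1;
    short s1;
    char  c2;
    int   i1;
};

struct Reordered {    /* same members, reordered to minimize padding */
    char  c1;
    char  c2;
    short s1;
    int   i1;
};

int main(void) {
    printf("c1=%zu s1=%zu c2=%zu i1=%zu sizeof=%zu\n",
           offsetof(struct Node, c1), offsetof(struct Node, s1),
           offsetof(struct Node, c2), offsetof(struct Node, i1),
           sizeof(struct Node));                                         /* typically: c1=0 s1=2 c2=4 i1=8 sizeof=12 */
    printf("sizeof(struct Reordered)=%zu\n", sizeof(struct Reordered));  /* typically 8 */
    return 0;
}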