How to specify min and max digits for a bank account number? - sql

Is it possible to constrain the number of digits allowed in a column of integer data type in PostgreSQL? I have the following example:
CREATE TABLE bank_accounts (
id SERIAL PRIMARY KEY
, number_account INTEGER(26) NOT NULL
);
We can enter something like:
1 -- one digit
23 -- two digits
444 -- three digits
5555 -- four digits
Etc. ... up to 26 digits.
But I want to constrain my column to store exactly 26 digits, not less and not more. How to achieve that?

A bank account number is not an integer by nature. 26 decimal digits are too much for integer or bigint anyway.
A bank account number is not a numeric value at all, really, even if we could use the type numeric for storage. It can handle 26 decimal digits easily. But it also allows fractional digits (and other decorators, like @klin commented). You can restrict to numeric(26), which is short for numeric(26,0), to remove fractional digits from storage. But that still allows fractional digits on input, which are then rounded off. And other decorators. All of these seem undesirable for a bank account number:
SELECT numeric(26) '12345678901234567890123456'
, numeric(26) '12345678901234567890123456.4' -- rounded down
, numeric(26) '12345678901234567890123456.5' -- rounded up
, numeric(26) '1e25'
, numeric(26) '1.2345e25'
, numeric(26) '+12345678901234567890123456.5';
SELECT numeric(26) '99999999999999999999999999.5'; -- error after rounding up
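The rounding behaviour shown in the queries above can be imitated outside the database. This is only a rough Python sketch, not Postgres itself: the helper name to_numeric_26 is made up for illustration, and it assumes that numeric(26) rounds halves away from zero and rejects any result with more than 26 digits, as the comments in the queries suggest.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext

MAX_DIGITS = 26  # precision of numeric(26)


def to_numeric_26(text: str) -> Decimal:
    """Imitate casting text input to numeric(26, 0) (illustrative helper)."""
    with localcontext() as ctx:
        ctx.prec = 60  # generous working precision for 26-digit values
        # Scale 0: drop fractional digits, rounding halves away from zero.
        rounded = Decimal(text).quantize(Decimal(1), rounding=ROUND_HALF_UP)
    if len(rounded.as_tuple().digits) > MAX_DIGITS:
        raise OverflowError("numeric field overflow")
    return rounded


for sample in ("12345678901234567890123456.4",
               "12345678901234567890123456.5",
               "1e25",
               "+12345678901234567890123456.5"):
    print(sample, "->", to_numeric_26(sample))
```

Every sample is accepted even though none of them looks like a clean account number on input, which is the point being made about numeric.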
A bank account number is more like text by nature, so data type text seems more appropriate (like @klin provided), even if that occupies a bit more space on disk (like @a_horse mentioned): 27 bytes vs. 17 bytes for numeric, or 30 vs. 20 bytes in RAM. See:
What is the overhead for varchar(n)?
However, you would not want to apply collation rules to bank account numbers. That happens with collatable types like text or varchar if your DB cluster runs with a non-C locale. Collation is pointless for digit-only strings to begin with, but you still get slower sorting, slower indexes etc. Notably, the "abbreviated keys" feature in Postgres 9.5 or later is currently (incl. Postgres 10) disabled for non-C locales.
Putting everything together, I suggest:
CREATE TABLE bank_account (
bank_account_id serial PRIMARY KEY
-- bank_account_id integer PRIMARY KEY GENERATED ALWAYS AS IDENTITY -- in Postgres 10+
, number_account text COLLATE "C" NOT NULL -- disable collation rules
, CONSTRAINT number_account_has_26_digits CHECK (number_account ~ '^\d{26}$')
);
Asides:
Consider an IDENTITY column instead of the serial in Postgres 10+. Details:
https://blog.2ndquadrant.com/postgresql-10-identity-columns/
INTEGER(26) is not valid syntax in Postgres, where the integer data type has no modifiers. You can choose from int2, int4 (default integer) and int8, though. The trailing number signifies occupied bytes, not the number of digits allowed.

The maximum integer value is 2147483647, maximum bigint is 9223372036854775807. You cannot use integer types for the column.
It seems that the simplest way is to define the column as text with a check constraint:
CREATE TABLE bank_accounts (
id serial primary key,
number_account text not null check (number_account ~ '^\d{26}$')
);
The regular expression used in the check constraint means a string with exactly 26 digits.
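If you also want to validate the value in application code before it reaches the database, the same rule is easy to mirror. A minimal Python sketch follows; the function name is hypothetical and this does not replace the check constraint. Note two differences from the Postgres pattern: Python's \d would also match non-ASCII digits, so [0-9] is used, and fullmatch takes the place of the ^ and $ anchors.

```python
import re

# Same intent as the CHECK constraint: exactly 26 digits, nothing else.
ACCOUNT_RE = re.compile(r"[0-9]{26}")


def is_valid_account_number(value: str) -> bool:
    """Return True if value consists of exactly 26 ASCII digits."""
    # fullmatch anchors at both ends, so no ^ or $ is needed.
    return ACCOUNT_RE.fullmatch(value) is not None


print(is_valid_account_number("12345678901234567890123456"))
print(is_valid_account_number("1234"))
print(is_valid_account_number("1234567890123456789012345X"))
```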

Related

I have created CHECK constraint but receiving error message

create sequence student_studentid_seq
increment by 10
start with 100
nocycle;
create table student
(studentid number(10),
name varchar2(30) not null,
ss# number(9) unique,
gpa number(2,3) not null,
constraint student_studentid_pk PRIMARY KEY (studentid),
constraint student_gpa_ck CHECK (GPA >= 0) );
insert into student (studentid, name, ss#, gpa)
values(student_studentid_seq.NEXTVAL,'Draze Katan', 323456789,1);
receiving error message:
Error starting at line 29 in command:
insert into student (studentid, name, ss#, gpa)
values(student_studentid_seq.NEXTVAL,'Draze Katan', 323456789,1)
Error report:
SQL Error: ORA-01438: value larger than specified precision allowed for this column
01438. 00000 - "value larger than specified precision allowed for this column"
*Cause: When inserting or updating records, a numeric value was entered
that exceeded the precision defined for the column.
*Action: Enter a value that complies with the numeric column's precision,
or use the MODIFY option with the ALTER TABLE command to expand
the precision.
So it appears the error message relates to this constraint: constraint student_gpa_ck CHECK (GPA >= 0)
In the insert statement, if I enter 0 for GPA the row is inserted, but for anything larger I receive the error message.
This is one of my exercise questions and I can't figure it out. I just need a hint about where the mistake is, not a full solution. I would appreciate any help.
The issue is in the way you create the table, in particular in the column GPA.
You are using number(2, 3), which reads like "build a number with 2 total digits and 3 decimal digits".
The Oracle documentation explains the NUMBER data type, its attributes, and what declarations like number(2,3) mean:
Specify a fixed-point number using the following form:
NUMBER(p,s) where:
p is the precision, or the maximum number of significant decimal
digits, where the most significant digit is the left-most nonzero
digit, and the least significant digit is the right-most known digit.
Oracle guarantees the portability of numbers with precision of up to
20 base-100 digits, which is equivalent to 39 or 40 decimal digits
depending on the position of the decimal point.
s is the scale, or the number of digits from the decimal point to the
least significant digit. The scale can range from -84 to 127.
Positive scale is the number of significant digits to the right of the
decimal point to and including the least significant digit.
Negative scale is the number of significant digits to the left of the
decimal point, to but not including the least significant digit. For
negative scale the least significant digit is on the left side of the
decimal point, because the actual data is rounded to the specified
number of places to the left of the decimal point. For example, a
specification of (10,-2) means to round to hundreds.
Scale can be greater than precision, most commonly when e notation is
used. When scale is greater than precision, the precision specifies
the maximum number of significant digits to the right of the decimal
point. For example, a column defined as NUMBER(4,5) requires a zero
for the first digit after the decimal point and rounds all values past
the fifth digit after the decimal point.
For example:
SQL> create table tabError( a number (2, 3));
Table created.
SQL> insert into tabError values (1);
insert into tabError values (1)
*
ERROR at line 1:
ORA-01438: value larger than specified precision allowed for this column
SQL> insert into tabError values (0.1);
insert into tabError values (0.1)
*
ERROR at line 1:
ORA-01438: value larger than specified precision allowed for this column
SQL> insert into tabError values (0.01);
1 row created.
If you need 2 digits for the integer part and 3 for decimals, you need number(5, 3) or, according to Mathguy's comment, if you need numbers with one integer digit and 2 decimals, you need number(3,2).
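To get a feeling for which values a given NUMBER(p,s) accepts, the rule quoted above can be approximated in a few lines. This is a Python sketch under the assumption that Oracle first rounds to s decimal places and then requires the result to be smaller than 10^(p-s) in absolute value; fits_number is a made-up helper, not an Oracle function.

```python
from decimal import Decimal, ROUND_HALF_UP


def fits_number(value: str, precision: int, scale: int) -> bool:
    """Approximate whether value can be stored in NUMBER(precision, scale)."""
    # Round to `scale` digits after the decimal point (a negative scale
    # rounds to the left of the decimal point).
    step = Decimal(1).scaleb(-scale)
    rounded = Decimal(value).quantize(step, rounding=ROUND_HALF_UP)
    # At most `precision` significant digits may remain after rounding,
    # so the absolute value must stay below 10 ** (precision - scale).
    return abs(rounded) < Decimal(1).scaleb(precision - scale)


# The three inserts from the tabError example, against NUMBER(2,3).
for value in ("1", "0.1", "0.01"):
    print(value, fits_number(value, 2, 3))

# The suggested replacement types.
print(fits_number("1", 3, 2), fits_number("99.999", 5, 3))
```

With NUMBER(2,3) the limit is 10^(-1) = 0.1, which is why only values below 0.1 are accepted.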

Is "NUMBER" and "NUMBER(*,0)" the same in Oracle?

In Oracle documentation it is mentioned that
NUMBER (precision, scale)
If a precision is not specified, the column stores values as given. If
no scale is specified, the scale is zero.
But NUMBER (without precision and scale) also accepts floating point numbers (34.30). According to the documentation, if scale is not specified it should be zero by default, so it should allow only integers. Am I wrong?
And in another questions it is mentioned that
default precision is 38, default scale is zero
So NUMBER and NUMBER(*,0) should be equal but they are not.
Where am I wrong?
I think the sentence in the documentation
If a precision is not specified, the column stores values as given. If no scale is specified, the scale is zero.
is a bit confusing. The scale is zero if a precision is specified and a scale is not specified. So, for example, NUMBER(19) is equivalent to NUMBER(19,0). NUMBER, by itself, will have 38 digits of precision but no defined scale. So a column defined as a NUMBER can accept values of any scale, as long as their precision is 38 digits or less (basically, 38 numerical digits with a decimal point in any place).
You can also specify a scale without a precision: NUMBER(*, <scale>), but that just creates the column with 38 digits of precision so I'm not sure it's particularly useful.
The table How Scale Factors Affect Numeric Data Storage on this page might be helpful.
The default scale is not zero; by default it has no value at all. Hence a plain NUMBER column can accept a value of any scale between -84 and 127. If you limit the scale to zero, the column will not keep any fractional digits, even if the inserted value contains them.
create table aaaaa
(
sno number(*,0),
sno1 number
);
The user_tab_columns will give you the value of your precision and scale
SQL> select column_name,data_precision,data_scale from user_tab_columns where table_name = 'AAAAA';
COLUMN_NAME                    DATA_PRECISION DATA_SCALE
------------------------------ -------------- ----------
SNO                                                    0
SNO1
SQL>
Please find the below workings
SQL> select * from aaaaa;
no rows selected
SQL> insert into aaaaa values (123.123123,123123.21344);
1 row created.
SQL> commit;
Commit complete.
SQL> select * from aaaaa;
SNO SNO1
---------- ----------
123 123123.213
SQL>
These parts of the Oracle documentation make it absolutely clear:
Specify an integer using the following form:
NUMBER(p)
This represents a fixed-point number with precision p and scale 0 and is equivalent to NUMBER(p,0).
and
Specify a floating-point number using the following form:
NUMBER
The absence of precision and scale designators specifies the maximum range and precision for an Oracle number.
The meaning of the star (*) precision is documented here: it means a precision of 38.
Another squirrelly case but one I faced... if the table you're inserting into contains a trigger, you should probably examine if any of its procedural flow includes attempting to convert something into a NUMBER...

which datatype to use to store a mobile number

Which datatype shall I use to store mobile numbers of 10 digits (e.g. 9932234242)? Shall I go for varchar(10) or for the big one, "bigint"?
If the number is of type- '0021-23141231' , then which datatype to use?
varchar/char long enough for all expected (eg UK numbers are 11 long)
check constraint to allow only digits (expression = NOT LIKE '%[^0-9]%')
format in the client per locale (UK = 07123 456 789 , Switzerland = 071 234 56 78)
As others have answered, use varchar for data that happens to be composed of numeric digits, but for which mathematical operations make no sense.
In addition, in your example number, did you consider what would happen if you stored 002123141231 into a bigint column? Upon retrieval, it would be 2123141231, i.e. there's no way for a numeric column to store leading 0 digits...
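The loss of leading zeros is easy to reproduce in any language with an integer type. A small Python illustration of the same effect (the variable names are just for this example):

```python
raw = "002123141231"

as_integer = int(raw)  # numeric storage: leading zeros carry no value
as_text = raw          # character storage: every digit is kept

print(as_integer)
print(as_text)
print(str(as_integer) == raw)
```

Converting the integer back to text cannot restore the original string, because the information about the two leading zeros is gone.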
Use varchar with check constraint to make sure that only digits are allowed.
Something like this:
create table MyTable
(
PhoneNumber varchar(10)
constraint CK_MyTable_PhoneNumber check (PhoneNumber like '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]')
)
if it is always the same length you might want to use char instead.
varchar(50) is a good data type for mobile numbers, because they may sometimes contain a country code (for example +91) or spaces. For comparison purposes we can remove all special characters from both sides in the expression.
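Stripping the special characters before comparing could look roughly like this. It is a Python sketch with hypothetical helper names, assuming that only ASCII digits matter for the comparison:

```python
import re


def normalize_phone(value: str) -> str:
    """Strip everything except ASCII digits (hypothetical helper)."""
    return re.sub(r"[^0-9]", "", value)


def same_phone(a: str, b: str) -> bool:
    """Compare two phone numbers, ignoring spaces, '+', '-' and similar."""
    return normalize_phone(a) == normalize_phone(b)


print(normalize_phone("+91 99322 34242"))
print(same_phone("0021-23141231", "0021 2314 1231"))
```

Be aware that this does not reconcile a number written with a country code against the same number written without one; that needs locale-specific rules.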

What does the specified number mean in a VARCHAR() clause?

Just to clarify, does specifying something like VARCHAR(45) mean it can take up to 45 characters at most? I remember hearing from someone a few years ago that the number in the parentheses doesn't refer to the number of characters. The person then tried to explain something quite complicated to me, which I didn't understand and have already forgotten.
And what is the difference between CHAR and VARCHAR? I did search around a bit and see that CHAR gives you the max of the size of the column and it is better to use it if your data has a fixed size and use VARCHAR if your data size varies.
But if it gives you the max of the size of the column of all the data of this column, isn't it better to use it when your data size varies? Especially if you don't know how big your data size is going to be. VARCHAR needs to specify the size (CHAR don't really need right?), isn't it more troublesome?
You also have to specify the size with CHAR. With CHAR, column values are padded with spaces to fill the size you specified, whereas with VARCHAR, only the actual value you specified is stored.
For example:
CREATE TABLE test (
char_value CHAR(10),
varchar_value VARCHAR(10)
);
INSERT INTO test VALUES ('a', 'b');
SELECT * FROM test;
The above will select "a " for char_value and "b" for varchar_value
If all your values are about the same size, the CHAR is possibly a better choice because it will often require less storage space than VARCHAR. This is because VARCHAR stores both the length of the value and the value itself, whereas CHAR can just store the (fixed-size) value.
The MySQL documentation gives a good explanation of the storage requirements of the various data types.
In particular, for a string of length L, a CHAR(M) datatype will take up (M x c) bytes (where c is the number of bytes required to store a character... this depends on the character set in use).
A VARCHAR(M) will take up (L + 1) or (L + 2) depending on whether M is <=255 or >255.
So, it really depends on how long you expect your strings to be, what the variation in length will be.
NB: The documentation doesn't discuss the impact of character sets on the storage requirements of a VARCHAR type. I've tried to quote it accurately, but my guess is that you would need to multiply the string length by the character byte-width as well to get the storage requirement.
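The figures above can be turned into a small calculator. This is a rough Python sketch, not MySQL's actual accounting: the function names are invented, L is measured as the encoded byte length of the value, and the choice between a 1-byte and a 2-byte length prefix is assumed to depend on whether the column could ever need more than 255 bytes.

```python
def char_storage_bytes(m: int, bytes_per_char: int) -> int:
    """CHAR(M): M x c bytes, per the figures quoted above."""
    return m * bytes_per_char


def varchar_storage_bytes(value: str, m: int, max_bytes_per_char: int,
                          encoding: str = "utf-8") -> int:
    """VARCHAR(M): L bytes of data plus a 1- or 2-byte length prefix."""
    data_len = len(value.encode(encoding))  # L, measured in bytes
    # Assumption: the prefix size depends on whether the column could ever
    # need more than 255 bytes, i.e. M times the widest character.
    prefix = 1 if m * max_bytes_per_char <= 255 else 2
    return data_len + prefix


print(char_storage_bytes(10, 1))
print(varchar_storage_bytes("ab", 10, 1))
print(varchar_storage_bytes("ab", 255, 4))
```

For short values in a wide column, VARCHAR is smaller; CHAR only catches up when the values are close to the declared length.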
The complicated stuff you don't remember is probably that, depending on the database, the 45 can refer to bytes rather than characters. The two are not the same if you are using a multibyte character encoding. In Oracle you can specify bytes or chars explicitly.
varchar2(45 BYTE)
or
varchar2(45 CHAR)
See Difference between BYTE and CHAR in column datatypes
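The difference between the two length semantics shows up as soon as a value contains a character that needs more than one byte. A short Python illustration, assuming UTF-8 as the encoding (the helper names are made up):

```python
def fits_byte_limit(value: str, limit: int, encoding: str = "utf-8") -> bool:
    """Length check under BYTE semantics: count encoded bytes."""
    return len(value.encode(encoding)) <= limit


def fits_char_limit(value: str, limit: int) -> bool:
    """Length check under CHAR semantics: count characters."""
    return len(value) <= limit


city = "Z\u00fcrich"  # "Zurich" with u-umlaut, which takes 2 bytes in UTF-8
print(len(city), len(city.encode("utf-8")))
print(fits_char_limit(city, 6), fits_byte_limit(city, 6))
```

The same string passes a limit of 6 characters but fails a limit of 6 bytes.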
The choice between char and varchar actually becomes irrelevant if you have just one variable-length field in your table, like a varchar or text. MySQL will automatically change all char to varchar.
Fixed-length records can give you extra performance, but then you can't use any variable-length field types. The reason is that it is quicker and easier for MySQL to find the next record.
For example, if you do a SELECT * FROM table LIMIT 10, MySQL has to scan the table file for the tenth record. This means finding the end of each record until you find the end of the 10th record. But if your table has fixed-length records, MySQL just needs to know the record size and then skip 10 x #bytes.
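The offset arithmetic behind that argument can be shown with a toy example. This Python sketch does not reflect MySQL's real file format; the record size and the contents are arbitrary values chosen for illustration.

```python
RECORD_SIZE = 16  # bytes per record; arbitrary value for this illustration

# Build a fake table file of 20 fixed-size records: b"row00", b"row01", ...
records = [f"row{i:02d}".encode("ascii").ljust(RECORD_SIZE, b" ")
           for i in range(20)]
table_file = b"".join(records)


def read_record(data: bytes, index: int) -> bytes:
    """Jump straight to record `index` (0-based) without scanning earlier rows."""
    offset = index * RECORD_SIZE
    return data[offset:offset + RECORD_SIZE].rstrip(b" ")


print(read_record(table_file, 9))  # the tenth record
```

With variable-length records there is no such formula, so each preceding record has to be walked to find where the next one starts.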
If you know a column will contain a small, fixed number of chars use a CHAR, otherwise use a varchar. A CHAR column is padded to the max length.
VARCHAR has a small overhead (4-8 bytes depending on RDBMS), but only uses the overhead + the actual number of chars stored.
For values you know will have a constant length, for example phone numbers or zip codes, "char" is the better choice.

Which data type saves more space TINYTEXT or VARCHAR for variable data length in MySQL?

I need to store a data into MySQL. Its length is not fixed, it could be 255 or 2 characters long. Should I use TINYTEXT or VARCHAR in order to save space (speed is irrelevant)?
When using VARCHAR, you need to specify maximum number of characters that will be stored in that column. So, if you declare a column to be VARCHAR(255), it means that you can store text with up to 255 characters. The important thing here is that if you insert two characters, only those two characters will be stored, i.e. allocated space will be 2 not 255.
TINYTEXT is one of four TEXT types. They are very similar to VARCHAR, but there are a few differences (depending on the MySQL version you are using). In version 5.5, there are some limitations when it comes to TEXT types. The first is that you have to specify an index prefix length for indexes on TEXT. The other is that TEXT columns can't have default values.
In general, TEXT should be used for (extremely) long values. If you will be using string that will have up to 255 characters, then you should use VARCHAR.
Hope that this helps.
As for data storage space, VARCHAR(255) and TINYTEXT are equivalent:
VARCHAR(M): L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes.
TINYTEXT: L + 1 bytes, where L < 2^8.
Source: MySQL Reference Manual: Data Storage Requirements.
Storage space being equal, you may want to check out the following Stack Overflow posts for further reading on when you should use one or the other:
What’s the difference between VARCHAR(255) and TINYTEXT string types in MySQL?
varchar(255) v tinyblob v tinytext