I ran the 'Analyze Performance' feature in Access, and it had an "idea" to improve performance: Access said I should convert items that are alphanumeric mixes looking like 12BB1-DF740§ from the Text data type into Long Integer (the specific name from the idea). Whether Access is right that this would improve performance is secondary to whether Long Integer can store letters at all.
[§ About the data - the hyphen in the data provided to me is always present at that location; the letters are always A-F]
From what I can tell, w3schools indicates that Long will only store numbers:
Long - Allows whole numbers between -2,147,483,648 and 2,147,483,647
Am I conflating data types? (Further, when I pull up the design view, it only offers number as a data type; there is no long or long integer)
Can Long Integer store letters?
If my column is already populated, and I convert the data type, will I lose data?
You could store those values by splitting them into 2 Long Integer columns. Then when you need the original text form, concatenate their Hex() values with a dash between.
? Hex(76721) & "-" & Hex(915264)
12BB1-DF740
However I don't see why that would be worth doing. Occasionally a performance analyzer suggestion just doesn't make sense to me; this is such a case.
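The split-and-Hex() round trip above can be sketched like this (Python here, purely for illustration; in Access you would use Hex() and CLng with an "&H" prefix):

```python
# Sketch: split "12BB1-DF740" at the dash into two integers, each of which
# fits comfortably in a 32-bit signed Long, then rebuild the original text.
def to_longs(code):
    left, right = code.split("-")          # "12BB1", "DF740"
    return int(left, 16), int(right, 16)   # (76721, 915264)

def to_text(left, right):
    # Like Access's Hex(), this drops leading zeros; pad with zfill()
    # if your codes can start with 0.
    return f"{left:X}-{right:X}"

pair = to_longs("12BB1-DF740")
assert pair == (76721, 915264)
assert to_text(*pair) == "12BB1-DF740"
```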
I've never run into this, but it looks like it thinks your strings are hexadecimal numbers.
If you never have letters other than A-F, then you could store them as Longs and convert back using the Hex() function, but that seems mighty kludgy and something I'd avoid unless you're really desperate to eke out some performance.
If it is in fact hexadecimal data, and it always has the same format so that the dash could just be added at the same place, then it would be possible to store the data numeric, and convert it into the hexadecimal notation when needed.
Ten hexadecimal digits represent 40 bits of data, so the Long type described at the w3schools page wouldn't do, as it's only 32 bits. You would need a 64-bit data type, like a double or bigint. (The latter might not be available in Access.)
However, that would only be any real gain if you actually do any processing of the data in the numeric form. Otherwise you would only save a few bytes per record, and you would need extra processing to convert to and from the numeric format.
If your table is already populated, you would have to read out the values, convert them, and store them back in the numeric form.
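The single-value variant of the conversion can be sketched as follows (Python for illustration; the same arithmetic works in any language with 64-bit integers):

```python
# Sketch: store the whole 10-digit hex string as one integer. The value
# needs 40 bits, so it requires a 64-bit type (bigint/double), not a
# 32-bit Long. The dash is re-inserted after the fifth hex digit.
def to_number(code):
    return int(code.replace("-", ""), 16)

def to_code(n):
    h = f"{n:010X}"              # pad to a fixed width of 10 hex digits
    return h[:5] + "-" + h[5:]   # dash always sits at the same place

n = to_number("12BB1-DF740")
assert to_code(n) == "12BB1-DF740"
```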
Related
I am new to SQL and I need to create a table to accommodate a bunch of data that is in this format:
1.33E+09 -1.8E+09 58 -1.9E+09 2.35E+10 2.49E+10 2.49E+10 3.35E+08 etc.
How should I deal with it? I am not sure whether to populate the table with this as-is, or whether I need to convert it in order to work with it. Any suggestions?
is that a BIGINT?
The correct data type for data like this is double precision or float.
It is obvious from the data that high precision is not necessary, so numeric would not be appropriate (it takes more storage and makes computations much slower).
float is only a good choice if you really need as few significant digits as your examples suggest and storage space is at a premium.
Your data are already in a format that PostgreSQL can accept for these data types, so there is no need for conversion.
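As a quick illustration (Python here rather than PostgreSQL), E-notation text parses directly as floating point, which is why no conversion step is needed:

```python
# Sketch: E-notation literals like the ones in the question are valid
# floating-point input as-is; PostgreSQL's double precision input
# accepts the same textual form.
values = ["1.33E+09", "-1.8E+09", "58", "-1.9E+09", "2.35E+10"]
parsed = [float(v) for v in values]
assert parsed[0] == 1.33e9
assert parsed[2] == 58.0   # plain integers parse too
```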
The answer depends on whether you know the maximum range of data values your application might create.
A PostgreSQL numeric column will certainly hold any of the values you list above. Per the docs: "up to 131072 digits before the decimal point; up to 16383 digits after the decimal point".
See the data type reference here.
I have an Excel CSV file with a Barcode column whose data looks like this: 5.06E+12. It has a decimal number (5.06), a letter (E), and a symbol (+).
When I try to edit this in Excel, the number changes to 5060190000000.
When storing this type of data to my SQL Server database, what should the data type be of my Model's Barcode property?
Try to pick the most appropriate data type. If you're using actual product barcodes, a little research indicates that they're likely International Article Numbers.
Since they're really strings of 13 digits, a char(13) would probably be the most appropriate data type to store this data. Don't just default to varchar(50) because that's "big enough" - think of the length specification as free validation.
This is called E notation, which is a variation of scientific notation. The number in question is an integer, but its display is abbreviated: 5.06E+12 is just a rounded rendering of the full value 5060190000000 (≈ 5.06 × 10^12).
Thus, your value should be stored as an integer large enough to store your number.
Your value should be stored as a varchar long enough to fit the length of potential values.
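A small sketch (Python, for illustration) of the rounding at work: the E-notation string is just an abbreviated display of the full integer, so both a big-enough integer type and a text type can hold the real value.

```python
# Sketch: the underlying 13-digit barcode and its rounded E-notation display.
barcode = 5060190000000
assert f"{barcode:.2E}" == "5.06E+12"   # what Excel shows
assert len(str(barcode)) == 13          # what a char(13) column would store
```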
I'm importing data from one system to another. The former keys off an alphanumeric field whereas the latter requires a numeric integer field. I'd like to find or write a function that I can feed the alphanumeric value to and have it return a number that would be unique to the value passed in.
My first thought was to do a hash, but of course the result of any built-in hash is going to contain letters, and it's technically possible (however unlikely) that a hash may not be unique.
My first question is whether there is anything built in to sql that I'm overlooking, and short of that I'd like to hear suggestions on the easiest way to implement such a function.
Here is a function which should convert from base 10 (integer) to base 36 (alphanumeric) and back again:
https://www.simple-talk.com/sql/t-sql-programming/numeral-systems-and-numbers-conversion-in-sql/
You might find the resultant number is too big to be held in an integer though.
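As a rough sketch of the idea (in Python rather than the article's T-SQL), base 36 treats the digits 0-9 and letters A-Z as a single numbering system:

```python
# Sketch: base-10 <-> base-36 conversion for alphanumeric keys.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def from_base36(s):
    n = 0
    for ch in s.upper():
        n = n * 36 + DIGITS.index(ch)
    return n

def to_base36(n):
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))

# A six-character key already exceeds a 32-bit signed integer:
assert from_base36("ZZZZZZ") == 36**6 - 1        # 2176782335 > 2147483647
assert to_base36(from_base36("AB123")) == "AB123"
```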
You could concatenate the ascii values of each character of your string and cast the result as a bigint.
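A sketch of this idea in Python. Zero-padding each code to three digits is my addition to keep the result unambiguously reversible; note that only about six characters fit before the result outgrows a bigint.

```python
# Sketch: concatenate the (zero-padded) ASCII codes of each character
# into one number. "AB1" -> "065" + "066" + "049" -> 65066049.
def ascii_key(s):
    return int("".join(f"{ord(c):03d}" for c in s))

assert ascii_key("AB1") == 65066049
```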
If the original data is known to be integers you can use cast:
SELECT CAST(varcharcol AS INT) FROM Table
Is there any bad effect if I use the TEXT data type to store an ID number in a database?
I do something like:
CREATE TABLE GenData ( EmpName TEXT NOT NULL, ID TEXT PRIMARY KEY);
And actually, if I want to store a date value I usually use TEXT data-type. If this is a wrong way, what is its disadvantage?
I am using PostgreSQL.
Storing numbers in a text column is a very bad idea. You lose a lot of advantages when you do that:
you can't prevent storing invalid numbers (e.g. 'foo')
Sorting will not work the way you want to ('10' is "smaller" than '2')
it confuses everybody looking at your data model.
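The sorting pitfall in particular is easy to demonstrate (Python for illustration):

```python
# Sketch: text sorting compares character by character, so '10' sorts
# before '2'; numeric sorting behaves the way you actually want.
ids = ["10", "2", "9"]
assert sorted(ids) == ["10", "2", "9"]            # lexicographic order
assert sorted(ids, key=int) == ["2", "9", "10"]   # numeric order
```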
I want to store a date value I usually use TEXT
That is another very bad idea. Mainly because of the same reasons you shouldn't be storing a number in a text column. In addition to completely wrong dates ('foo') you can't prevent "invalid" dates either (e.g. February, 31st). And then there is the sorting thing, and the comparison with > and <, and the date arithmetic....
I really don't recommend using text for dates.
Look at all the functions you are missing with text
If you want to use them, you have to cast, and that causes nothing but problems if the stored dates happen to be invalid, because with text there's no validation.
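A quick illustration (Python) of the validation a real date type gives you for free:

```python
# Sketch: a date parser rejects impossible dates that a text column
# would happily accept.
from datetime import datetime

datetime.strptime("2024-02-29", "%Y-%m-%d")   # valid leap-year date
try:
    datetime.strptime("2024-02-31", "%Y-%m-%d")
    raise AssertionError("should have been rejected")
except ValueError:
    pass   # February 31st is rejected, just as a date column would do
```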
In addition to what the other answers already provided:
text is also subject to COLLATION and encoding, which may complicate portability and data interchange between platforms. It also slows down sorting operations.
Concerning storage size of text: an integer occupies 4 bytes (and is subject to padding for data alignment). text or varchar occupies 1 byte plus the actual string, which is 1 byte per ASCII character in UTF-8, or more for special characters. Most likely, text will be much bigger.
It depends on what operations you are going to do on the data.
If you are going to be doing a lot of arithmetic on numeric data, it makes more sense to store it as some kind of numeric data type. Also, if you plan on sorting the data in numerical order, it really helps if the data is stored as a number.
When stored as text, "11" comes ahead of "9" because "1" comes ahead of "9". If this isn't what you want, don't use text.
On the other hand, it often makes sense to store strings of digits, such as zipcodes or social security number or phone numbers as text.
The title pretty much frames the question. I have not used CHAR in years. Right now, I am reverse-engineering a database that has CHAR all over it, for primary keys, codes, etc.
How about a CHAR(30) column?
Edit:
So the general opinion seems to be that CHAR is perfectly fine for certain things. I, however, think that you can design a database schema that does not have a need for "these certain things", thus not requiring fixed-length strings. With the bit, uniqueidentifier, varchar, and text types, it seems that in a well-normalized schema you get a certain elegance that you don't get when you use encoded string values. Thinking in fixed lengths, no offense meant, seems to be a relic of the mainframe days (I learned RPG II once myself). I believe it is obsolete, and I did not hear a convincing argument from you claiming otherwise.
I use char(n) for codes, varchar(m) for descriptions. Char(n) seems to result in better performance because the data doesn't need to move around when the size of the contents changes.
Where the nature of the data dictates the length of the field, I use CHAR. Otherwise VARCHAR.
CHARs are still faster for processing than VARCHARs in the DBMS I know well. Their fixed size allows for optimizations that aren't possible with VARCHARs. In addition, the storage requirements are slightly less for CHARs since no length has to be stored, assuming most of the rows need to fully, or near-fully, populate the CHAR column.
This is less of an impact (in terms of percentage) with a CHAR(30) than a CHAR(4).
As to usage, I tend to use CHARs when either:
the fields will generally always be close to or at their maximum length (stock codes, employee IDs, etc); or
the lengths are short (less than 10).
Anywhere else, I use VARCHARs.
I use CHAR when the length of the value is fixed. For example, we might generate a code based on some algorithm which always returns a code of a specific fixed length, let's say 13.
Otherwise, I find VARCHAR better. One more reason to use VARCHAR is that when you get the value back in your application you don't need to trim it. In the case of CHAR you will get the full length of the column whether the value fills it or not: it gets padded with spaces, so you end up trimming every value, and forgetting to do so leads to errors.
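A small Python sketch of the padding behavior described above:

```python
# Sketch: a CHAR(10) column pads values with trailing spaces, so they
# come back padded and fail naive equality checks until trimmed.
stored = "ABC".ljust(10)      # what a CHAR(10) column hands back
assert len(stored) == 10
assert stored != "ABC"        # naive comparison fails
assert stored.rstrip() == "ABC"   # must trim on every read
```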
For PostgreSQL, the documentation states that char() has no advantage in storage space over varchar(); the only difference is that it's blank-padded to the specified length.
Having said that, I still use char(1) or char(3) for one-character or three-character codes. I think that the clarity due to the type specifying what the column should contain provides value, even if there are no storage or performance advantages. And yes, I typically use check constraints or foreign key constraints as well. Apart from those cases, I generally just stick with text rather than using varchar(). Again, this is informed by the database implementation, which automatically switches from inline to out-of-line storage if the value is large enough, which some other database implementations don't do.
Char isn't obsolete; it just should only be used if the length of the field never varies. In the average database this applies to very few fields, mostly code fields like state abbreviations, which are a standard 2 characters if you use the postal codes. Using Char where the field length is variable means a lot of trimming going on, which is extra, unnecessary work, and the database should be refactored.