Why does this SQL query return an extra space? - sql

I am trying to understand why the query below returns an extra space after '3'. This is a screenshot from one of the SQL quizzes available online. I would expect 'C' to be the correct answer. Is there anything that causes the extra space, or might it be an error in the task?

The char type always stores data for all of the allocated space. Therefore a char(5), if it's not null, will always have 5 characters. If you store one character in the field the remaining four will be spaces.
Therefore the actual result will look like this:
King Kong (3    )
But in the context of HTML, multiple whitespace characters in sequence render by default as a single space, so you see this on the screen:
King Kong (3 )
To fix this to get the expected King Kong (3), you could use varchar(5) instead of char(5) or alternatively call rtrim() before the final concatenation.

This is misleading because the answers are not displayed in such a way that whitespace is preserved, probably unintentionally. (Right-click the answer and use "inspect element" to see what it is actually supposed to be.)
The answer should have been King Kong (3    ) - that is, four spaces and not one - but because it was rendered in HTML (without white-space: pre-wrap; or a similar CSS rule), the whitespace was collapsed to one space.
The four spaces come from casting the number to char(5), i.e. a five-character-long string. Since the number 3 needs only one character to be displayed, the remaining four characters in the string are filled with spaces, so that the total length of the string is still five characters, as specified.
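The two effects described above (blank-padding by char(5), then whitespace collapsing in HTML) can be imitated outside the database. This is only a rough Python sketch, not SQL: as_char and html_collapse are hypothetical helper names, and ljust only pads (it does not reject or truncate values longer than the width, as a real cast might).

```python
import re

def as_char(value, width):
    """Imitate CHAR(width): right-pad with spaces up to width characters."""
    return str(value).ljust(width)

def html_collapse(text):
    """Imitate default HTML rendering: each run of whitespace becomes one space."""
    return re.sub(r"\s+", " ", text)

actual = "King Kong (" + as_char(3, 5) + ")"             # what the query returns
rendered = html_collapse(actual)                         # what the quiz page shows
trimmed = "King Kong (" + as_char(3, 5).rstrip() + ")"   # effect of rtrim()

print(repr(actual))    # 'King Kong (3    )'
print(repr(rendered))  # 'King Kong (3 )'
print(repr(trimmed))   # 'King Kong (3)'
```

Printing with repr() keeps the padding visible, which is exactly what the quiz page failed to do.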

Related

Field Name including a period gives me error (using brackets)

I put together an Access Database for a department. They've been using it frequently for the past few months with no hiccups.
However, they changed one of the field names of a linked Excel File, which forces me to go into Access and update the query a bit.
The field name has gone from "PacU" to "Mr. Cooper"
Original:
SELECT Round(BidTemplate.[PacU],6) AS PacU
New:
SELECT Round(BidTemplate.[Mr. Cooper],6) AS [Mr. Cooper]
I am receiving the following error: "Invalid bracketing of the name 'BidTeample.[Mr.Cooper]'".
I'm sure the issue is driven off of the period that is now included in the field. But shouldn't the brackets take care of this?
What am I missing?
Field names cannot contain a period.
From the MS Access Documentation:
Names of fields, controls, and objects in Microsoft Access desktop databases:
Can be up to 64 characters long.
Can include any combination of letters, numbers, spaces, and special characters except a period (.), an exclamation point (!), an accent grave (`), and brackets ([ ]).
Can't begin with leading spaces.
Can't include control characters (ASCII values 0 through 31).
Can't include a double quotation mark (") in table, view, or stored procedure names in a Microsoft Access project.
Remove the period from the field name, for example:
SELECT Round(BidTemplate.[Mr Cooper],6) AS [Mr Cooper]

How to get rid of special character in Netezza columns

I am transferring data from one Netezza database to another using Talend, an ETL tool. When I pull data from a varchar(30) field and try to put it in the new database's varchar(30) field, it gives an error saying it's too long. Logs show the field has whitespace at the end followed by a square, representing some character I can't figure out. I attached a screenshot of the logs below. I have tried writing SQL to pull this field and replace what I thought was a CRLF, but no luck. When I do a select on the field and get the length, it reports a few more characters than what you see, so something is there and I want to get rid of it. Trimming does not do anything.
This SQL does not return a length shorter than simply doing length() on the column itself. Does anyone know what else it could be?
SELECT LENGTH(trim(translate(TRANSLATE(<column>, chr(13), ''), chr(10), ''))) as len_modified
Note that the last column in the logs, where you see a square in brackets, is supposed to show the last character examined.
First, load the data into a wider target column that works: if the data is supposed to be 30 characters, put it in a 500-character column. Then go through the longest values character by character to determine which character is being added. Use functions like ascii() to get the ASCII value of individual characters at the beginning and end of the value; most likely the extra character sits at one of those ends. Once you know what the extra character is, write code to remove it, or to never load it, so that the data fits in the 30-character column. Alternatively, just make the target column longer and keep the additional characters, for example varchar(30) becomes varchar(32) (this wastes a little space but does not alter the data as it comes in to you).
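If the value can be pulled out of the database (for instance into a script next to the ETL job), the character-by-character inspection described above might look like the following Python sketch. The sample value, with a trailing space plus a NUL character, is made up for illustration, and describe_tail and strip_invisible are hypothetical helper names; the real culprit in the logs may be a different character.

```python
import unicodedata

def describe_tail(value, count=5):
    """Return (repr, code point, Unicode name) for the last `count` characters."""
    return [(repr(ch), ord(ch), unicodedata.name(ch, "UNNAMED"))
            for ch in value[-count:]]

def is_invisible(ch):
    """True for whitespace and for non-printable (e.g. control) characters."""
    return ch.isspace() or not ch.isprintable()

def strip_invisible(value):
    """Drop invisible characters from both ends, leaving the middle untouched."""
    start, end = 0, len(value)
    while start < end and is_invisible(value[start]):
        start += 1
    while end > start and is_invisible(value[end - 1]):
        end -= 1
    return value[start:end]

# Hypothetical value: looks like 9 characters, but the length is 11.
sample = "ACME CORP \x00"

for item in describe_tail(sample, 3):
    print(item)                       # shows the code point of each trailing character
print(len(sample), len(strip_invisible(sample)))
```

A plain trim usually removes only spaces, which would explain why trimming appears to do nothing when the last character is a control character.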

Quick way to space fill column 256 chars SQL-Server 2012

So I have a file I'm creating using SQL Server 2012.
Many of the columns are optional or unused, and in place of the characters that would normally be there we are asked to zero-fill numeric columns, and space-fill alphanumeric columns.
Now I have a column called CDD and it's 256 characters long.
Is there a simpler way I can fill this column other than pressing the space bar 256 times in single quotes?
The file is Fixed Width so I have to have 256 spaces in this column for it to import correctly. I was looking at replicate and stuff, but they don't make sense being that the column doesn't have an original string to replace.
Replicate works with zeros but how can I validate it with spaces? The column doesn't expand like it would if there was an actual character in it...Does SQL-Server do any collapsing of white space in this way?
You're going to want to use the replicate function.
SELECT REPLICATE(' ',256)
This function will repeat space (or whatever string you put in the first parameter) 256 (or however many in the second parameter) times.
In addition to REPLICATE you can also use
SELECT SPACE(256);
As far as "the column expanding", the column will not appear expanded in SSMS unless you click on 'Results in Text' (instead of grid). If you use the LEN function it will return 0, but DATALENGTH will return either the actual number of spaces requested for a varchar column, or the defined length of a char column. Either way, if you copy the output into a text editor, you will see that it is indeed a string of empty spaces.
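The difference between the two length functions can be imitated in a few lines of Python. This is only an illustration of the behaviour described above, not SQL Server code: sql_len and sql_datalength are hypothetical stand-ins for LEN and DATALENGTH, and the sketch assumes a single-byte (non-Unicode) varchar value.

```python
def sql_len(value):
    """Imitate LEN: number of characters, ignoring trailing spaces."""
    return len(value.rstrip(" "))

def sql_datalength(value):
    """Imitate DATALENGTH for a single-byte varchar: every byte counts."""
    return len(value.encode("latin-1"))

cdd = " " * 256  # same idea as REPLICATE(' ', 256) or SPACE(256)

print(sql_len(cdd))         # 0, which is why the column looks empty
print(sql_datalength(cdd))  # 256, the spaces are really there
```

So a length of 0 from LEN does not mean the spaces were collapsed; checking with DATALENGTH or pasting the output into a text editor shows them.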

What is the major difference between Varchar2 and char

Creating Table:
CREATE TABLE test (
charcol CHAR(10),
varcharcol VARCHAR2(10));
SELECT LENGTH(charcol), LENGTH(varcharcol) FROM test;
Result:
LENGTH(CHARCOL) LENGTH(VARCHARCOL)
--------------- ------------------
10 1
Please let me know what the difference is between VARCHAR2 and CHAR.
At what times we use both?
Although there are already several answers correctly describing the behaviour of char, I think it needs to be said that you should not use it except in three specific situations:
You are building a fixed-length file or report, and assigning a non-null value to a char avoids the need to code an rpad() expression. For example, if firstname and lastname are both defined as char(20), then firstname||lastname is a shorter way of writing rpad(firstname,20)||rpad(lastname,20) to create
Chuck               Norris              
You need to distinguish between the explicit empty string '' and null. Normally they are the same thing in Oracle, but assigning '' to a char value will trigger its blank-padding behaviour while null will not, so if it's important to tell the difference, and I can't really think of a reason why it would be, then you have a way to do that.
Your code is ported from (or needs to be compatible with) some other system that requires blank-padding for legacy reasons. In that case you are stuck with it and you have my sympathy.
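The first situation above (fixed-length output) comes down to padding every field to its declared width before concatenating. As a rough illustration outside Oracle, here is a Python sketch in which ljust plays the role of rpad(); the function name fixed_width and the 20-character widths are just the example's assumptions.

```python
def fixed_width(fields):
    """Concatenate (value, width) pairs, padding each value to its width."""
    return "".join(value.ljust(width) for value, width in fields)

record = fixed_width([("Chuck", 20), ("Norris", 20)])

print(repr(record))
print(len(record))  # 40
```

With char(20) columns the database does this padding implicitly, which is the only convenience the type offers here.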
There is really no reason to use char just because some length is fixed (e.g. a Y/N flag or an ISO currency code such as 'USD'). It's not more efficient, it doesn't save space (there's no mythical length indicator for a varchar2, there's just a blank padding overhead for char), and it doesn't stop anyone entering shorter values. (If you enter 'ZZ' in your char(3) currency column, it will just get stored as 'ZZ '.) It's not even backward-compatible with some ancient version of Oracle that once relied on it, because there never was one.
And the contagion can spread, as (following best practice) you might anchor a variable declaration using something like sales.currency%type. Now your l_sale_currency variable is a stealth char which will get invisibly blank-padded for shorter values (or ''), opening the door to obscure bugs where l_sale_currency does not equal l_refund_currency even though you assigned 'ZZ' to both of them.
Some argue that char(n) (where n is some character length) indicates that values are expected to be n characters long, and this is a form of self-documentation. But surely if you are serious about a 3-character format (ISO-Alpha-3 country codes rather than ISO-Alpha-2, for example), wouldn't you define a constraint to enforce the rule, rather than letting developers glance at a char(3) datatype and draw their own conclusions?
CHAR was introduced in Oracle 6 for, I'm sure, ANSI compatibility reasons. Probably there are potential customers deciding which database product to purchase and ANSI compatibility is on their checklist (or used to be back then), and CHAR with blank-padding is defined in the ANSI standard, so Oracle needs to provide it. You are not supposed to actually use it.
Simple example to show the difference:
SELECT
'"'||CAST('abc' AS VARCHAR2(10))||'"',
'"'||CAST('abc' AS CHAR(10))||'"'
FROM dual;
'"'||CAST('ABC'ASVARCHAR2(10))||'"' '"'||CAST('ABC'ASCHAR(10))||'"'
----------------------------------- -------------------------------
"abc"                               "abc       "
1 row selected.
CHAR is useful for values whose length is always fixed, e.g. two-letter postal abbreviations for US states such as CA, NY, FL, TX.
Just to avoid confusion, since there is a lot of wrong information around, here is some information about the difference, including performance.
Reference: https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2668391900346844476
Since a char is nothing more than a VARCHAR2 that is blank padded out
to the maximum length - that is, the difference between the columns X
and Y below:
create table t ( x varchar2(30), y char(30) ); insert into t (x,y)
values ( rpad('a',30,' '), 'a' );
IS ABSOLUTELY NOTHING, and given that the difference between columns X
and Y below:
insert into t (x,y) values ('a','a')
is that X consumes 3 bytes (null indicator, leading byte length, 1
byte for 'a') and Y consumes 32 bytes (null indicator, leading byte
length, 30 bytes for 'a ' )
Umm, varchar2 is going to be somewhat "at an advantage performance
wise". It helps us NOT AT ALL that char(30) is always 30 bytes - to
us, it is simply a varchar2 that is blank padded out to the maximum
length. It helps us in processing - ZERO, zilch, zippo.
Anytime you see anyone say "it is up to 50% faster", and that is it -
no example, no science, no facts, no story to back it up - just laugh
out loud at them and keep on moving along.
There are other "made up things" on that page as well, for example:
"Searching is faster in CHAR as all the strings are stored at a
specified position from the each other, the system doesnot have to
search for the end of string. Whereas in VARCHAR the system has to
first find the end of string and then go for searching."
FALSE: a char is just a varchar2 blank padded - we do not store
strings "at a specified position from each other". We do search for
the end of the string - we use a leading byte length to figure things
out.
CHAR
CHAR should be used for storing fixed-length character strings. String values will be space/blank padded before being stored on disk. If this type is used to store variable-length strings, it will waste a lot of disk space.
VARCHAR2
VARCHAR2 is used to store variable length character strings. The string value's length will be stored on disk with the value itself.
And
At what times we use both?
It all depends on your requirements.
CHAR type has fixed size, so if you say it is 10 bytes, then it always stores 10 bytes in the database and it doesn't matter whether you store any text or just empty 10 bytes
VARCHAR2 size depends on how many bytes you are actually going to store in the database. The number you specify is just the maximum number of bytes that can be stored (although 1 byte is minimum)
You should use CHAR when dealing with fixed-length strings (you know in advance the exact length of the string you will be storing) - the database can then manipulate it better and faster since it knows the exact length
You should use VARCHAR2 when you don't know the exact length of the stored strings.
Situation you would use both may be:
name VARCHAR2(255),
zip_code CHAR(5) --if your users have only 5 place zip codes
When stored in a database, varchar2 uses only the space the value actually needs. E.g. if you have a varchar2(1999) and put 50 bytes in the table, it will use 52 bytes.
But when stored in a database, char always uses the maximum length and is blank-padded. E.g. if you have char(1999) and put 50 bytes in the table, it will consume 2000 bytes.
CHAR is used for storing fixed-length character strings. It will waste a lot of disk space if this type is used to store variable-length strings.
VARCHAR2 is used to store variable length character strings.
At what times we use both?
This may vary and depend on your requirement.
EDIT:-
Let's understand this with an example. Suppose you have a student name column of size 10, sname CHAR(10). If the value 'RAMA' is inserted, 6 spaces are added to the right of the value. If this were a VARCHAR2 column, sname VARCHAR2(10), it would use only 4 of the 10 possible characters and leave the other 6 unused.

How do I escape an enclosure character in a SQL Loader data file?

I have a SQL*Loader control file that has a line something like this:
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '#'
Normally, I'd use a quotation mark, but that seems to destroy emacs's python syntax highlighting if used inside a multi-line string. The problem is that we are loading an ADDRESS_LINE_2 column where only 7,000 out of a million records are loading because they have lines like this:
...(other columns),Apt #2,(other columns)...
Which is of course causing errors. Is there any way to escape the enclosing character so this doesn't happen? Or do I just need to choose a better enclosing character?
I've looked through the documentation, but don't seem to have found an answer to this.
I found it...
If two delimiter characters are encountered next to each other, a single occurrence of the delimiter character is used in the data value. For example, 'DON''T' is stored as DON'T. However, if the field consists of just two delimiter characters, its value is null.
Field List Reference
Unfortunately, SQL*Loader counts both occurrences of the delimiter when checking the maximum length of the field. For instance, DON''T will be rejected in a CHAR(5) field, with ORA-12899: value too large for column blah.blah2 (actual: 6, maximum: 5).
At least that is the case in my 11gR2. I haven't tried other versions.
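Since the question mentions generating the data from Python, the doubling rule quoted above could be applied when writing the data file. This is a minimal sketch, not part of SQL*Loader itself: enclose and counted_length are hypothetical helper names, and '#' is the enclosure character from the control file in the question.

```python
def enclose(value, enclosure="#"):
    """Wrap a field in the enclosure character, doubling any embedded occurrence."""
    return enclosure + value.replace(enclosure, enclosure * 2) + enclosure

def counted_length(value, enclosure="#"):
    """Length as reported in the answer above: each doubled enclosure counts twice."""
    return len(value.replace(enclosure, enclosure * 2))

fields = ["12 Main St", "Apt #2", "Springfield"]  # made-up sample row
line = ",".join(enclose(f) for f in fields)

print(line)                         # #12 Main St#,#Apt ##2#,#Springfield#
print(counted_length("DON'T", "'")) # 6, too long for a CHAR(5) field
```

Given the length check described above, it may be simpler to pick an enclosure character that never occurs in the data.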