I am looking for a cut-and-dried answer. I have seen posts saying a varchar length doesn't count trailing white space, and others saying it doesn't count leading white space.
For example, would "if this was the example" be a varchar(19) or a varchar(23)?
VARCHAR(23), since space also needs to be stored for the sentence to stay the same.
The simple answer is yes: white space counts as characters, so this needs varchar(23).
A varchar counts spaces like any other character, so yes, it needs varchar(23).
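To see where the two numbers in the question come from, here is a minimal sketch in Python (not SQL) that counts the characters of the example string with and without its spaces. The variable names are only illustrative.

```python
# The example sentence from the question.
text = "if this was the example"

# Every character counts toward the length, spaces included.
total_chars = len(text)

# Length if the four spaces were (incorrectly) ignored.
non_space_chars = len(text.replace(" ", ""))

print(total_chars)      # 23
print(non_space_chars)  # 19
```

The 19 only appears if the spaces are dropped, which would change the sentence.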
I have a table in SQL that contains text data, and sometimes that text data contains emojis. See below for sample table output:
CommentID | Comment_Text
1 | A walk in the park.
2 | A lovely day in the park
3 | A sunny day in the park š
What I'd like to do is separate out the emoji as a column. Ultimately what I'd like to end up with is a binary column indicating whether the comment text contains an emoji.
After some research, I was able to find the following solution:
REGEXP_SUBSTR(Comment_Text, '[^\x00-\x7F]+', 1, 1)
This separates out the first emoji the code finds. Strictly speaking, this regex doesn't find emojis; it finds non-ASCII characters, and emojis just happen to fall in that category. While this solution works sometimes, it fails when a comment contains both emojis and other non-ASCII characters. For example, if the comment was 'A lovely walk in the ĻaĆk š', the code would output not only the emoji but also the Ļ and the Ć.
What I need is a way to separate out the emojis from the other non-ASCII characters.
Good day sir.
Can you try this pattern on the Regex101 site?
I think it will work.
[^\x00-\x7F]+[^x00-\x7F]
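One way to separate emojis from other non-ASCII characters is to match explicit emoji code point ranges instead of "anything outside ASCII". The following is a rough sketch in Python, not in the asker's SQL dialect. The ranges are an assumption that covers many common emojis but not every one, and `contains_emoji` is a hypothetical helper name. Whether a given database's REGEXP_SUBSTR accepts such ranges is not something this sketch assumes.

```python
import re

# Assumption: these ranges approximate "emoji". They are not a complete
# or official definition, only a common starting point.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicator symbols (used for flags)
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]"
)


def contains_emoji(comment_text: str) -> int:
    """Return 1 if the text contains a character in the ranges above, else 0."""
    return 1 if EMOJI_PATTERN.search(comment_text) else 0


# Accented letters alone are not flagged; an emoji is.
print(contains_emoji("A lovely walk in the \u013Ba\u0106k"))            # 0
print(contains_emoji("A lovely walk in the \u013Ba\u0106k \U0001F60A"))  # 1
```

The return value is already the binary indicator column the question asks for.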
I am trying to understand why the query below returns an extra space after '3'. This is a screenshot from one of the SQL quizzes available online. I would expect 'C' to be the correct answer. Is there anything that causes the extra space, or might it be an error in the task?
The char type always stores data for all of the allocated space. Therefore a char(5), if it's not null, will always have 5 characters. If you store one character in the field the remaining four will be spaces.
Therefore the actual result will look like this:
King Kong (3    )
But in the context of HTML, multiple whitespace characters in sequence render by default as a single space, so you see this on the screen:
King Kong (3 )
To fix this to get the expected King Kong (3), you could use varchar(5) instead of char(5) or alternatively call rtrim() before the final concatenation.
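The padding, the HTML collapsing, and the rtrim() fix can be imitated in a few lines of Python. This is only a sketch of the behaviour described above, not SQL: `cast_to_char` is a hypothetical stand-in for casting to char(n), and the regex substitution merely mimics how a browser collapses runs of spaces.

```python
import re


def cast_to_char(value, width):
    """Mimic a cast to char(width): truncate, then pad with spaces on the right."""
    return str(value)[:width].ljust(width)


padded = cast_to_char(3, 5)                           # '3' followed by four spaces
raw = "King Kong (" + padded + ")"                    # what the query returns
as_html = re.sub(r" +", " ", raw)                     # how default HTML rendering shows it
trimmed = "King Kong (" + padded.rstrip() + ")"       # with rtrim() applied first

print(repr(raw))      # 'King Kong (3    )'
print(repr(as_html))  # 'King Kong (3 )'
print(repr(trimmed))  # 'King Kong (3)'
```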
This is misleading because the answers are not displayed in such a way that whitespace is preserved, probably unintentionally. (Right-click the answer and use "inspect element" to see what it is actually supposed to be.)
The answer should have been King Kong (3    ) - that is, four spaces and not one - but by rendering this in HTML (without white-space: pre-wrap; or a similar CSS rule), the whitespace was collapsed to one space.
The four spaces come from casting the number to char(5), i.e. a five-character-long string. Since the number 3 needs only one character to be displayed, the remaining four characters in the string are filled with spaces, so that the total length of the string is still five characters, as specified.
The SQL comment character consists of two hyphens, thusly:
-- cannot create table if one already exists
drop table if exists mytable;
When using lstlisting from the listings package for source code, the comment characters are converted to en dashes. If I insert a space between the hyphens, it looks like [hyphen][space][hyphen] instead of two hyphens next to each other. So, using lstlisting from the listings package for SQL source code, how do I specify the comment characters?
Actually, someone pointed out that the hyphens as comment characters displayed properly. The issue is that the space between them is so small as to be indistinguishable. With a high quality printed document and a magnifying glass, you can see the space. On my display, a rather small one, the two hyphens bleed together.
Thanks for looking at this.
At work our project is indented with 2 spaces. Somehow they don't want to use tabs for indenting.
I personally can't read code indented with 2 spaces very well and would prefer 4 spaces.
Is there anything I can do without changing the code, to improve the readability for me?
Thanks for your answers!
No.
A tab is something that can be openly and freely defined to be as many spaces as you want it to be, such that one tab could equal 3 spaces if you really wanted to (although I think both factions would be justified in their outrage over such an abomination).
A space is a hard-coded, explicitly defined value, and if two spaces are used, then it can only ever be two spaces.
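The difference between the two can be shown with Python's built-in `str.expandtabs`, which here stands in for whatever an editor does when it renders a tab. It is a small sketch, and the sample lines are made up.

```python
# One line indented with a tab, one indented with two literal spaces.
tab_line = "\treturn value"
space_line = "  return value"

# A tab can be displayed at any width the reader chooses.
tab_as_two = tab_line.expandtabs(2)
tab_as_three = tab_line.expandtabs(3)
tab_as_four = tab_line.expandtabs(4)

# Two spaces stay two spaces, whatever the tab width setting is.
space_as_four = space_line.expandtabs(4)

print(repr(tab_as_two))     # '  return value'
print(repr(tab_as_three))   # '   return value'
print(repr(tab_as_four))    # '    return value'
print(repr(space_as_four))  # '  return value'
```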
Your only hope is to advocate - strongly advocate - for better code standards. You'll have to either extol the benefits of tabs since they can be used for variable space, or you'll have to demonstrate why having four spaces is preferable to two for readability's sake.
We need to create descriptive aliases for fields. Ideally we would like to create views whose aliases contain a space. Is this possible? How can we do this?
Example:
SELECT word, word_count "Word Count" FROM [publicdata:samples.shakespeare] LIMIT 1000
No, the rules for field names (and aliases) in BigQuery are quite simple, and I quote:
Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long.
As you see, spaces, quote characters, and other punctuation, are not allowed. Feel free to open a feature request at https://code.google.com/p/google-bigquery/ (explaining your use case, esp. why using underscores in lieu of spaces is not acceptable) -- or star an existing FR at https://code.google.com/p/google-bigquery/issues/list if it coincides with your requirements.
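The quoted rule is simple enough to check before sending a query. Below is a minimal sketch in Python that encodes only the rule as quoted above. It assumes "letters" means ASCII letters, and both function names are hypothetical helpers, not part of any BigQuery client library.

```python
import re

# Letter or underscore first, then letters, digits, or underscores,
# for a total length of at most 128 characters.
# Assumption: "letters" is read as ASCII letters only.
FIELD_NAME_RULE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,127}")


def is_valid_field_name(name: str) -> bool:
    """Check a field name or alias against the quoted rule."""
    return FIELD_NAME_RULE.fullmatch(name) is not None


def underscore_alias(label: str) -> str:
    """Replace spaces with underscores, the workaround mentioned above."""
    return label.replace(" ", "_")


print(is_valid_field_name("Word Count"))                    # False
print(underscore_alias("Word Count"))                       # Word_Count
print(is_valid_field_name(underscore_alias("Word Count")))  # True
```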