Databricks "extraneous input expecting EOF" error - apache-spark-sql

This took me a while to figure out so I thought I'd share to save someone else the pain. This is obviously dummy code to illustrate the issue.
This doesn't work:
%sql
Select 'A' as A -- I won't need this
, '1' as B;
Select 'Magic';
Error message:
Error in SQL statement: ParseException:
extraneous input 'Select' expecting {<EOF>, ';'}(line 4, pos 0)
== SQL ==
Select 'A' as A -- I won't need this
, '1' as B;
Select 'Magic';
^^^
This does work:
%sql
Select 'A' as A -- I wont need this
, '1' as B;
Select 'Magic';
The only difference is the single quote in the comment on the first line of the query (won't versus wont).

When the comment contains a single quote, wrap the comment text in double quotes.
Example: instead of -- I won't need this, write -- "I won't need this"
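One plausible reading of this behaviour is that the cell is split into statements on ';' before parsing, by a splitter that tracks quotes but knows nothing about '--' comments. The sketch below is a toy splitter written only to illustrate that idea; naive_split is a hypothetical name and this is not Databricks code.

```python
def naive_split(sql: str) -> list[str]:
    """Split on ';' outside quoted text. Deliberately ignores '--' comments."""
    statements, current, quote = [], [], None
    for ch in sql:
        if quote is None and ch in ("'", '"'):
            quote = ch          # a quote opens
        elif ch == quote:
            quote = None        # the matching quote closes
        if ch == ";" and quote is None:
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


broken = "Select 'A' as A -- I won't need this\n, '1' as B;\nSelect 'Magic';"
working = "Select 'A' as A -- I wont need this\n, '1' as B;\nSelect 'Magic';"
quoted = "Select 'A' as A -- \"I won't need this\"\n, '1' as B;\nSelect 'Magic';"

for label, text in [("broken", broken), ("working", working), ("quoted", quoted)]:
    print(label, len(naive_split(text)))
```

Under this assumption the apostrophe in the comment leaves the quote tracking out of step, so neither ';' is seen as a separator and both statements reach the parser as one, which would explain the complaint about the second Select.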

Related

SQL find '%' between %s

I need to find (in fact, exclude) any results that contain a '%' sign anywhere in a string field. That would mean ... WHERE string LIKE '%%%'. Googling about escaping gave me the following ideas. The first throws a syntax error; the second returns rows, but some of them actually contain '%'.
1st:
SELECT * FROM table
WHERE string NOT LIKE '%!%%' ESCAPE '!'
///tried with different escape characters
2nd:
SELECT * FROM table
WHERE string NOT LIKE '%[%]%'
Trying on GCP BigQuery.
Try:
SELECT *
FROM table
WHERE string NOT LIKE '%!%%' {ESCAPE '!'}
with curly braces, as shown in the Microsoft SQL Server docs.
Or also:
WITH indata(s) AS (
SELECT 'not excluded'
UNION ALL SELECT '%excluded'
UNION ALL SELECT 'Ex%cluded'
UNION ALL SELECT 'Excluded%'
)
SELECT * FROM indata WHERE INSTR(s,'%') = 0;
-- out s
-- out --------------
-- out not excluded
find (exclude in fact) any results that contain '%'
Consider the simple approach below:
select *
from your_table
where not regexp_contains(string , '%')
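For comparison, here is a small sketch of the ESCAPE and INSTR ideas run against SQLite through Python's sqlite3 module. SQLite is only a stand-in chosen because it runs locally; it says nothing about what BigQuery accepts. The table name indata and column s are taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE indata (s TEXT)")
conn.executemany(
    "INSERT INTO indata VALUES (?)",
    [("not excluded",), ("%excluded",), ("Ex%cluded",), ("Excluded%",)],
)

# LIKE with an explicit escape character: '!%' means a literal percent sign.
like_rows = [
    row[0]
    for row in conn.execute("SELECT s FROM indata WHERE s NOT LIKE '%!%%' ESCAPE '!'")
]

# instr() returns 0 when the substring is absent, so no escaping is needed.
instr_rows = [
    row[0] for row in conn.execute("SELECT s FROM indata WHERE instr(s, '%') = 0")
]

print(like_rows, instr_rows)
```

Substring or regex functions avoid the escaping question entirely, which is why they tend to be the more portable choice.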

mismatched input ';' expecting <EOF>(line 1, pos 90)

I am trying to fetch multiple rows in zeppelin using spark SQL.
Here's my SQL statement:
select id, name from target where updated_at = "val1", "val2","val3"
This is the error message I'm getting:
mismatched input ';' expecting <EOF>(line 1, pos 90)
I'm not sure what your exact requirement is, but your match condition isn't valid SQL syntax. The statement below will work if you want rows whose updated_at equals any of the listed values:
select id, name from target where updated_at in ('val1', 'val2','val3')
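The same contrast can be tried locally. This is a minimal sketch using SQLite through Python's sqlite3 module rather than Spark SQL, so the error text differs; the column types and sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, name TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO target VALUES (?, ?, ?)",
    [(1, "a", "val1"), (2, "b", "val3"), (3, "c", "other")],
)

# A comma-separated list after '=' is rejected as a syntax error.
try:
    conn.execute("select id, name from target where updated_at = 'val1', 'val2', 'val3'")
    comma_list_ok = True
except sqlite3.OperationalError:
    comma_list_ok = False

# IN (...) expresses "equals any of these values".
rows = conn.execute(
    "select id, name from target "
    "where updated_at in ('val1', 'val2', 'val3') order by id"
).fetchall()

print(comma_list_ok, rows)
```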
In case someone gets this error from Spark's selectExpr function like I did: selectExpr does not take a single string containing a comma-separated list of column names; it takes each column name (or expression) as a separate string argument:
spark_df.selectExpr("id, name") # ouch, wrong
> mismatched input ';' expecting <EOF>(line 1, pos xy)
spark_df.selectExpr("id", "name") # right!
> DataFrame[..]
spark_df.selectExpr(*["id", "name"]) # correct too!
> DataFrame[..]

Unable to write case statement in Spark SQL

I wrote the query below in Spark SQL using spark-shell, and I am getting the error message shown below:
spark.sql(""" select case when Treatment == 'Yes' then 1 else 0 end AS 'All-Yes' from person """)
Error message-
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input ''All-Yes'' expecting <EOF>(line 1, pos 58).
Can someone please help me with this?
The alias should be enclosed in backticks:
select case when Treatment == 'Yes' then 1 else 0 end AS `All-Yes` from person
though in general you shouldn't use non-standard, incompatible names.

Escaping single quotes in REDSHIFT SQL

I have lots of string values containing single quotes that I need to insert into a column in a REDSHIFT table.
I tried both \' and '' to escape the single quote in the INSERT statement.
e.g.
INSERT INTO table_Temp
VALUES ('1234', 'O\'Niel'), ('3456', 'O\'Brien')
I also used '' instead of \', but it keeps giving me the error "VALUES list must of same length", i.e. the number of arguments for each record is greater than 2.
Can you let me know how to resolve this issue?
The standard way in SQL is to double the single quote:
INSERT INTO table_Temp (col1, col2) -- include the column names
VALUES ('1234', 'O''Niel'), ('3456', 'O''Brien');
You should also include the column names corresponding to the values being inserted. That is probably the cause of your second error.
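A short sketch of the doubled-quote rule, run against SQLite through Python's sqlite3 module as a local stand-in for Redshift. The helper sql_string_literal, the TEXT column types and the third sample row are assumptions added for illustration. When the values come from a program, bound parameters sidestep manual escaping altogether.

```python
import sqlite3


def sql_string_literal(value: str) -> str:
    """Build a SQL string literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_Temp (col1 TEXT, col2 TEXT)")

# Literal SQL with doubled single quotes, as in the answer above.
conn.execute(
    "INSERT INTO table_Temp (col1, col2) "
    "VALUES ('1234', 'O''Niel'), ('3456', 'O''Brien')"
)

# Bound parameters: the driver handles quoting, nothing to escape by hand.
conn.execute("INSERT INTO table_Temp (col1, col2) VALUES (?, ?)", ("5678", "O'Hara"))

names = [row[0] for row in conn.execute("SELECT col2 FROM table_Temp ORDER BY col1")]
print(names)
print(sql_string_literal("O'Brien"))
```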
You could use CHR(39) and concatenate the strings. Your name would look like this:
('O' || CHR(39) || 'Brien')
I think it may depend on your environment. I'm using Periscope Data's redshift SQL editor, and \ worked as an escape character. '' and \\ did not work.
I was facing a similar problem. I needed to send a kind of JSON structure and then decode it in my query, but a program receiving my string was escaping my escapes, so the query failed. Finally I found this:
Put $$ in dollar-quoted string in PostgreSQL
mentioning quote_literal(42.5)
https://www.postgresql.org/docs/current/functions-string.html#FUNCTIONS-STRING-OTHER
This resolved my issue. An example:
The string is:
'LocalTime={US/Central}; NumDays={1}; NumRows={3}; F_ID={[Apple, Orange, Bannana]}'
Select
    Param, value , replace(quote_literal(replace(replace(Value,'[',''),']','')),',',quote_literal(',')) ValueList
FROM (
    select
        SPLIT_PART(split,'=',1) as Param,
        replace( replace(SPLIT_PART(split,'=',2),'{',''),'}','') as Value
    FROM
    (
        select
            trim(split_part(freeform.txt, ';', number.n)) as split
        from
            ( select
                'LocalTime={US/Central}; NumDays={1}; NumRows={3}; F_ID={[Apple, Orange, Bannana]}' as txt
            ) freeform,
            ( select 1 as n union all
              select 2 union all
              select 3 union all
              select 4 union all
              select 5 union all
              select 6 union all
              select 7 union all
              select 8 union all
              select 9 union all
              select 10
            ) number
        where split <> ''
    ) as MY_TMP
) as valuePart
use \\' to escape '
s = s.replace("'", "\\'")

PL SQL replace conditionally suggestion

I need to replace the entire word with 0 if the word has any non-digit character. For example, if digital_word='22B4' then replace with 0, else if digital_word='224' then do not replace.
SELECT replace_funtion(digital_word,'has non numeric character pattern',0,digital_word)
FROM dual;
I tried decode, regexp_instr, regexp_replace but could not come up with the right solution.
Please advise.
Thank you.
The idea is simple: you need to check whether the value is numeric or not.
Script:
with nums as
(
select '123' as num from dual union all
select '456' as num from dual union all
select '7A9' as num from dual union all
select '098' as num from dual
)
select n.*
,nvl2(LENGTH(TRIM(TRANSLATE(num, ' +-.0123456789', ' '))),'0',num)
from nums n
result
1 123 123
2 456 456
3 7A9 0
4 098 098
See the articles below to decide which approach works best for you:
How can I determine if a string is numeric in SQL?
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:15321803936685
How to tell if a value is not numeric in Oracle?
You might try the following:
SELECT CASE WHEN REGEXP_LIKE(digital_word, '\D') THEN '0' ELSE digital_word END
FROM dual;
The regular expression class \D matches any non-digit character. You could also use [^0-9] to the same effect:
SELECT CASE WHEN REGEXP_LIKE(digital_word, '[^0-9]') THEN '0' ELSE digital_word END
FROM dual;
Alternately you could see if the value of digital_word is made up of nothing but digits:
SELECT CASE WHEN REGEXP_LIKE(digital_word, '^\d+$') THEN digital_word ELSE '0' END
FROM dual;
Hope this helps.
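The two regex formulations above (look for any non-digit, or require digits only) can be sketched in Python for quick experimentation. The function names are made up for this example, and [0-9] is used instead of \d because Python's \d also matches non-ASCII digits.

```python
import re


def zero_if_has_non_digit(word: str) -> str:
    """Return '0' if any character is not an ASCII digit, else the word."""
    return "0" if re.search(r"[^0-9]", word) else word


def keep_if_all_digits(word: str) -> str:
    """Return the word if it consists only of ASCII digits, else '0'."""
    return word if re.fullmatch(r"[0-9]+", word) else "0"


samples = ["22B4", "224", "098", "-56"]
by_search = [zero_if_has_non_digit(w) for w in samples]
by_fullmatch = [keep_if_all_digits(w) for w in samples]
print(by_search, by_fullmatch)
```

The two forms agree on non-empty input. They differ on the empty string: the first leaves it unchanged, the second returns '0'.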
The fastest way is to replace all digits with null (to simply delete them) and see if anything is left. You don't need regular expressions (slow!) for this, you just need the standard string function TRANSLATE().
Unfortunately, Oracle has to work around their own inconsistent treatment of NULL - sometimes as empty string, sometimes not. In the case of the TRANSLATE() function, you can't simply translate every digit to nothing; you must also translate a non-digit character to itself, so that the third argument is not an empty string (which is treated as a real NULL, as in relational theory). See the Oracle documentation for the TRANSLATE() function. https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions216.htm#SQLRF06145
Then, the result can be obtained with a CASE expression (or various forms of NULL handling functions; I prefer CASE, which is SQL Standard):
with
nums ( num ) as (
select '123' from dual union all
select '-56' from dual union all
select '7A9' from dual union all
select '0.9' from dual
)
-- End of simulated inputs (for testing only, not part of the solution).
-- SQL query begins BELOW THIS LINE. Use your own table and column names.
select num,
case when translate(num, 'z0123456789', 'z') is null
then num
else '0'
end as result
from nums
;
NUM RESULT
--- ------
123 123
-56 0
7A9 0
0.9 0
Note: everything here is in varchar2 data type (or some other kind of string data type). If the results should be converted to number, wrap the entire case expression within TO_NUMBER(). Note also that the strings '-56' and '0.9' are not all-digits (they contain non-digits), so the result is '0' for both. If this is not what you needed, you must correct the problem statement in the original post.
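The "delete every digit and see if anything is left" idea can be sketched in Python with str.translate. This is an illustration of the technique only, not Oracle code; the names below are made up, and Python has no empty-string/NULL quirk, so no placeholder character like 'z' is needed.

```python
# Translation table that deletes the ten ASCII digits.
DELETE_DIGITS = str.maketrans("", "", "0123456789")


def zero_unless_all_digits(num: str) -> str:
    """Keep num if nothing remains after removing digits, else return '0'."""
    leftover = num.translate(DELETE_DIGITS)
    return num if leftover == "" else "0"


inputs = ["123", "-56", "7A9", "0.9"]
results = {n: zero_unless_all_digits(n) for n in inputs}
print(results)
```

As in the SQL version, '-56' and '0.9' map to '0' because the sign and the decimal point are not digits.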
Something like the following update query will help you:
update [table] set [col] = '0'
where REGEXP_LIKE([col], '.*\D.*', 'i')