How to bring in BQ Big Numeric comparison in SQL queries? - google-bigquery

I am trying to query the following in google BigQuery:
SELECT role FROM `<PROJECT>.<DATABASE>.<TABLE>` WHERE googleId = 109024200300000000000
Though this isn't working as I am getting this error:
Invalid integer literal: 109024000000000000000 at [1:85]
I am not sure how to tell BigQuery that this is a Big Numeric and not Integer.
One way I found is:
SELECT role FROM `<PROJECT>.<DATASET>.<TABLE>` WHERE googleId = CAST('109024002200000000000' as BIGNUMERIC) ORDER BY timeStamp DESC LIMIT 1;
but not sure if this would be the most efficient way.

As suggested by @Mikhail Berlyant in the comments, you can use a shorter version (without CAST) by prefixing the value with the data type ‘BIGNUMERIC’, which turns it into a typed literal:
SELECT role FROM `<PROJECT>.<DATABASE>.<TABLE>` WHERE googleId = BIGNUMERIC '109024002200000000000'
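To see why the bare number is rejected as an integer literal, here is a minimal Python sketch of the range check. It assumes only that INT64 is a signed 64-bit integer; the helper name fits_int64 is made up for illustration and is not part of BigQuery.

```python
# Bounds of a signed 64-bit integer, which is what an INT64 can hold.
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1  # 9223372036854775807 (19 digits)


def fits_int64(value: int) -> bool:
    """Hypothetical helper: True if value fits in a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


google_id = 109024200300000000000  # the 21-digit value from the question
print(fits_int64(google_id))       # too large for INT64
print(fits_int64(1517360483))      # an ordinary integer fits easily
```

Because the value has 21 digits and the INT64 maximum has 19, it has to be written as a BIGNUMERIC (or cast from a string) rather than as a plain integer literal.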

Related

SQL query problem in WHERE clause, this returns all that start with

I've written the following SQL query to return all sites having "id" equal to 2.
SELECT * FROM `sites` WHERE id = '2'
And it works well. The problem is that even if I add some characters after "2" like this :
SELECT * FROM `sites` WHERE id = '2etyupp-7852-trG78'
It returns the same results as above.
How can I avoid this? That is to say, how can I make the second query return nothing?
Thanks
The reason is that you are mixing types:
where id = '2'
------^ number
-----------^ string
What is a SQL engine supposed to do? Well, the standard approach is to convert the string to a number. So this is run as:
where id = 2
What happens when the string is not a number? In most databases, you get a type conversion error. However, MySQL does implicit conversion, converting the leading digits to a number. Hence, your second string just becomes 2.
From this, I hope you learn not to mix data types. Compare numbers to numbers. Compare strings to strings.
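As a rough illustration of the behavior described above, here is a Python sketch that imitates a lenient "leading digits" conversion and contrasts it with a strict comparison. This is a simplified model, not MySQL's actual conversion code; the function names are hypothetical.

```python
import re


def leading_number(text: str) -> int:
    """Hypothetical lenient cast: keep the leading digits, ignore the rest."""
    match = re.match(r"\s*([+-]?[0-9]+)", text)
    return int(match.group(1)) if match else 0


def lenient_equals(column_value: int, literal: str) -> bool:
    """Compare the way the answer describes: string is converted first."""
    return column_value == leading_number(literal)


def strict_equals(column_value: int, literal: str) -> bool:
    """Only match when the whole literal is made of digits."""
    return re.fullmatch(r"[0-9]+", literal) is not None and column_value == int(literal)


print(lenient_equals(2, "2etyupp-7852-trG78"))  # the surprising match
print(strict_equals(2, "2etyupp-7852-trG78"))   # no match when compared strictly
print(strict_equals(2, "2"))                    # a clean value still matches
```

In the query itself, the simplest way to get the strict behavior is to compare the numeric column with a number (id = 2) and validate user input before it reaches the query.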

Cast in Google BigQuery not appropriate?

I have a #StandardSQL query
SELECT
CAST(created_utc AS STRING),
author,
FROM
`table`
WHERE
something = "Something"
which gives me the following error,
Error: Cannot read field 'created_utc' of type STRING as INT64
An example of created_utc is 1517360483
If I understand that error (which I clearly don't), created_utc is stored as a string, but the query is trying, unsuccessfully, to convert it to an INT64. I would have hoped the CAST function would force it to be kept as a string.
What have I done wrong?
The problem is that you don't actually have a single table. In your question, you wrote table, but I suspect that you are querying table*, which matches multiple tables where one of them happens to have a different type for that column. Instead of using table*, your options are to:
Use UNION ALL with the individual tables, performing casts as appropriate in the SELECT lists.
If you know which table(s) have that column as an INT64 instead of a STRING, and you are okay with excluding them, you can use a filter on _TABLE_SUFFIX to skip reading from certain tables.
As Elliott has already pointed out, some of your values cannot be cast to INT64 because they are not integers and contain characters other than digits.
Using the SELECT below you can identify such values, which will help you locate the problematic entries and then decide on next actions.
#standardSQL
SELECT created_utc, author
FROM `table`
WHERE something = "Something"
AND NOT REGEXP_CONTAINS(created_utc, r'^[0-9]+$')
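One detail worth checking in a filter like this is whether the pattern is anchored. REGEXP_CONTAINS looks for a partial match, so a bare r'[0-9]' matches any string that contains at least one digit, and negating it only flags strings with no digits at all. The Python sketch below shows the difference using re.search, which also does partial matching; the sample values are invented for illustration.

```python
import re

# Invented sample values: one clean integer and three problem cases.
values = ["1517360483", "15173x0483", "abc", ""]

# Unanchored pattern: flags only strings that contain no digit at all.
no_digit_at_all = [v for v in values if not re.search(r"[0-9]", v)]

# Anchored pattern: flags every string that is not made up purely of digits.
not_all_digits = [v for v in values if not re.search(r"^[0-9]+$", v)]

print(no_digit_at_all)
print(not_all_digits)
```

The value "15173x0483" is caught only by the anchored pattern, which is the case that matters when hunting for entries that cannot be cast to INT64.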

Using period "." in Standard SQL in BigQuery

BigQuery Standard SQL does not seem to allow a period "." in the select statement. Even a simple query (see below) seems to fail. This is a big problem for datasets with field names that contain ".". Is there an easy way to avoid this issue?
select id, time_ts as time.ts
from `bigquery-public-data.hacker_news.comments`
LIMIT 10
Returns error...
Error: Syntax error: Unexpected "." at [1:27]
This also fails...
select * except(detected_circle.center_x )
from [bigquery-public-data:eclipse_megamovie.photos_v_0_2]
LIMIT 10
It depends on what you are trying to accomplish. One interpretation is that you want to return a STRUCT named time with a single field named ts inside of it. If that's the case, you can use the STRUCT operator to build the result:
SELECT
id,
STRUCT(time_ts AS ts) AS time
FROM `bigquery-public-data.hacker_news.comments`
LIMIT 10;
In the BigQuery UI, it will display the result as id and time.ts, where the latter indicates that ts is inside a STRUCT named time.
BigQuery disallows columns in the result whose names include periods, so you'll get an error if you run the following query:
SELECT
id,
time_ts AS `time.ts`
FROM `bigquery-public-data.hacker_news.comments`
LIMIT 10;
Invalid field name "time.ts". Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long.
Elliott's answer is great and addresses the first part of your question, so let me address the second part of it (as it is quite different).
First, I wanted to mention that select modifiers like SELECT * EXCEPT are supported in BigQuery Standard SQL (not in Legacy SQL), so instead of
SELECT * EXCEPT(detected_circle.center_x )
FROM [bigquery-public-data:eclipse_megamovie.photos_v_0_2]
LIMIT 10
you should rather try
#standardSQL
SELECT * EXCEPT(detected_circle.center_x )
FROM `bigquery-public-data.eclipse_megamovie.photos_v_0_2`
LIMIT 10
and of course now we are back to the issue with using a period in Standard SQL.
So, the above code can only be interpreted as an attempt to eliminate the center_x field from the detected_circle STRUCT (nullable record). Technically speaking, this makes sense and can be done using the code below:
SELECT *
REPLACE(STRUCT(detected_circle.radius, detected_circle.center_y ) AS detected_circle)
FROM `bigquery-public-data.eclipse_megamovie.photos_v_0_2`
LIMIT 10
... still not clear to me how to use your recommendation to remove the entire detected_circle.*
SELECT * EXCEPT(detected_circle)
FROM `bigquery-public-data.eclipse_megamovie.photos_v_0_2`
LIMIT 10

Power function Sql only 3dp

select power(1.005,4) [Power]
gives 1.020
select 1.005*1.005*1.005*1.005 [Manual]
gives 1.020150500625
I need the latter result but don't want to do it manually. It is the 4th power in this case, but the exponent will be variable.
Please advise. Thanks.
Based on your syntax, I assume you are using SQL Server. As explained in the documentation for power():
Returns the same type as submitted in float_expression. For example,
if a decimal(2,0) is submitted as float_expression, the result
returned is decimal(2,0).
SQL Server interprets numeric inputs as decimals, not floats. So, if you want the full value, convert the value before calling the function:
select power(convert(float, 1.005), 4) as [Power]
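The effect can be imitated outside SQL Server. The Python sketch below models the decimal case by forcing the result back to three decimal places (the scale of the literal 1.005), and the float case by using an ordinary floating-point power. It is only an analogy under that assumption; it does not call SQL Server, and modelling the type coercion as rounding is my simplification.

```python
from decimal import Decimal

base = Decimal("1.005")  # three decimal places, like the literal in the question
exponent = 4

# Exact decimal result, matching the manual multiplication.
exact = base ** exponent

# Model "result has the same type as the input" by keeping only 3 decimal places.
same_scale_as_input = exact.quantize(Decimal("0.001"))

# Model convert(float, 1.005): compute with floating point instead.
as_float = float(base) ** exponent

print(exact)                # 1.020150500625
print(same_scale_as_input)  # 1.020
print(as_float)             # close to 1.020150500625, subject to float rounding
```

Since the exponent is just a variable here, the same approach works for any power without writing out the multiplication by hand.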
Here is a Rextester comparing the different approaches.

SQL Query - Greater Than with Text Data Type

I've searched around and couldn't find an answer anywhere.
I'm querying a database that has stored numbers as a VARCHAR2 data type. I'm trying to find numbers that are greater than 1450000 (where BI_SO_NBR > '1450000'), but this doesn't bring back the results I'm expecting.
I'm assuming it's because the value is stored as text and I don't know any way to get around it.
Is there some way to convert the field to a number in my query or some other trick that would work? Hopefully this makes sense.
I'm fairly new to SQL.
Thanks in advance.
If the number is too long to be converted correctly to a number, and it is always an integer with no left padding of zeroes, then you can also do:
where length(BI_SO_NBR) > length('1450000') or
(length(BI_SO_NBR) = length('1450000') and
BI_SO_NBR > '1450000'
)
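The length-then-text comparison above can be sketched in Python to show why it works where a plain string comparison does not. It relies on the same assumptions stated in the answer (non-negative integers, no leading zeros); the function name numeric_string_gt is made up for illustration.

```python
def numeric_string_gt(left: str, right: str) -> bool:
    """Compare two digit strings as numbers without converting them.

    Assumes both are non-negative integers with no leading zeros.
    """
    if len(left) != len(right):
        # A longer digit string is always the larger number.
        return len(left) > len(right)
    # Same length: character order agrees with numeric order.
    return left > right


threshold = "1450000"
print("999999" > threshold)                      # plain text comparison is misleading
print(numeric_string_gt("999999", threshold))    # 6 digits, so smaller
print(numeric_string_gt("1450001", threshold))   # same length, larger
print(numeric_string_gt("20000000", threshold))  # 8 digits, so larger
```

The first line shows the problem from the question: as text, '999999' sorts after '1450000' because '9' comes after '1'.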
You can try using it like this:
where to_number(BI_SO_NBR) > 1450000
This assumes you are using an Oracle database. Also check the TO_NUMBER function.
EDIT:-
You can try this (added after the OP commented that it worked):
where COALESCE(TO_NUMBER(REGEXP_SUBSTR(BI_SO_NBR, '^\d+(\.\d+)?')), 0) > 1450000
If you are talking about Oracle, then:
where to_number(bi_so_nbr) > 1450000
However, there are 2 issues with this:
1. If there is any value in bi_so_nbr that cannot be converted to a number, this can result in an error.
2. The query will not use an index on bi_so_nbr, if there is one. You could solve this by creating a function-based index, but converting the VARCHAR2 column to NUMBER would be a better solution.