Getting an error when using CONCAT in BigQuery - sql

I'm trying to run a query where I combine two columns and separate them with an x in between.
I'm also trying to get some other columns from the same table. However, I get the following error.
Error: No matching signature for function CONCAT for argument types: FLOAT64, FLOAT64. Supported signatures: CONCAT(STRING, [STRING, ...]); CONCAT(BYTES, [BYTES, ...]).
Here is my code:
SELECT
CONCAT(right,'x',left),
position,
numbercreated,
Madefrom
FROM
table
WHERE
Date = "2018-10-07%"
I have also tried adding a cast, but that did not work either:
SELECT CONCAT(cast(right,'x',left)), position,...
SELECT CONCAT(cast(right,'x',left)as STRING), position,...
Why am I getting this error?
Are there any fixes?
Thanks for the help.

You need to cast each value before the concat():
SELECT CONCAT(CAST(right as string), 'x', CAST(left as string)),
position, numbercreated, Madefrom
FROM table
WHERE Date = '2018-10-07%';
If you want a particular format, then use the FORMAT() function.
I also doubt that your WHERE will match anything. If Date is a string, then you probably want LIKE:
WHERE Date LIKE '2018-10-07%';
More likely, you should use the DATE function or direct comparison:
WHERE DATE(Date) = '2018-10-07'
or:
WHERE Date >= '2018-10-07' AND
Date < '2018-10-08'
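Putting the cast and the range filter together, a minimal sketch might look like the following. The inline sample data, the alias `dimensions`, and the column name `created_at` (standing in for `Date`, assumed here to be a TIMESTAMP) are illustrative assumptions, not part of the original table. The backticks are there because `right` and `left` are keywords.

```sql
#standardSQL
-- Hypothetical inline sample data standing in for the real table.
WITH sample AS (
  SELECT 1.5 AS `right`, 2.25 AS `left`, TIMESTAMP '2018-10-07 09:30:00' AS created_at
  UNION ALL
  SELECT 3.0, 4.0, TIMESTAMP '2018-10-08 00:00:00'
)
SELECT
  -- CONCAT only accepts STRING (or BYTES), so each FLOAT64 is cast individually.
  CONCAT(CAST(t.`right` AS STRING), 'x', CAST(t.`left` AS STRING)) AS dimensions
FROM sample AS t
-- Half-open range: keeps every timestamp on 2018-10-07 and excludes midnight of the 8th.
WHERE t.created_at >= TIMESTAMP '2018-10-07'
  AND t.created_at < TIMESTAMP '2018-10-08'
```

Only the first sample row falls inside the range, so only it is returned.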

Another option to fix your issue with CONCAT is to use the FORMAT function, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1.01 AS `right`, 2.0 AS `left`
)
SELECT FORMAT('%g%s%g', t.right, 'x', t.left)
FROM `project.dataset.table` t
The result will be:
Row f0_
1 1.01x2
Note: in this specific example, you could use an even simpler statement:
FORMAT('%gx%g', t.right, t.left)
See the FORMAT documentation for more on the supported format specifiers.
A few recommendations: try not to use keywords as column names or aliases. If for some reason you do, wrap them in backticks or prefix them with the table name or alias.
One more comment: it looks like you switched the positions of your values, so that right is on the left side and left is on the right. That might be exactly what you need, but I wanted to mention it.

Try using SAFE_CAST, as below:
SELECT
CONCAT(SAFE_CAST( right as string ),'x',SAFE_CAST(left as string)),
position,
numbercreated,
Madefrom
FROM
table
WHERE
Date = '2018-10-07'

Related

ignoring rows where a column has non-numeric characters in its value

I have a table with a column that has some variable data. I would like to select only the rows that have values with numerical characters [0-9]
The column would look something like this:
time
1545123
none
1565543
1903-294
I would want the rows with the first and third values only (1545123 and 1565543). None of my approaches have worked.
I've tried:
WHERE time NOT LIKE '%[^0-9]+%'
WHERE NOT regexp_like(time, '%[^0-9]+%')
WHERE regexp_like(time, '[0-9]+')
I've also tried these expressions in a CASE statement, but that was also a no go. Am I missing something here?
This is on Amazon Athena, which uses an older version of Presto.
Thanks in advance.
You can use a regular expression that matches only digits, such as '^[0-9]+$' or '^\d+$':
-- sample data
WITH dataset (time) AS (
VALUES
('1545123'),
('none'),
('1565543'),
('1903-294')
)
--query
select *
from dataset
WHERE regexp_like(time, '^[0-9]+$')
Output:
time
1545123
1565543
Another option, which I would not recommend in this case but which can be helpful in others, is to use try with cast:
--query
select *
from (
select try(cast(time as INTEGER)) time
from dataset
)
where time is not null
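The anchors matter because regexp_like only needs the pattern to occur somewhere in the string, which is why the unanchored '[0-9]+' attempt from the question also matches '1903-294'. A small sketch comparing the two patterns on the same sample data (the column aliases `contains_digit` and `only_digits` are made up for illustration):

```sql
-- sample data from the question
WITH dataset (time) AS (
    VALUES
        ('1545123'),
        ('none'),
        ('1565543'),
        ('1903-294')
)
SELECT
    time,
    -- true whenever at least one digit appears anywhere in the value
    regexp_like(time, '[0-9]+')   AS contains_digit,
    -- true only when the whole value consists of digits
    regexp_like(time, '^[0-9]+$') AS only_digits
FROM dataset
```

For '1903-294' the first column is true and the second is false; for 'none' both are false.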

How to extract numeric values from a column in SQL

I am trying to extract only the numeric values from a column that contains cells that are exclusively numbers, and cells that are exclusively letter values, so that I can multiply the column with another that contains only numeric values. I have tried
SELECT trim(INTENT_VOLUME)
from A
WHERE ISNUMERIC(INTENTVOLUME)
and also
SELECT trim(INTENT_VOLUME)
from A
WHERE ISNUMERIC(INTENTVOLUME) = 1
and neither works. I get the error Function ISNUMERIC(VARCHAR) does not exist. Can someone advise? Thank you!
It depends heavily on the DBMS.
SQL Server has only limited built-in features for this, so the following query may not work with every variant of your data:
select CAST(INTENT_VOLUME AS DECIMAL(10, 4))
from A
where INTENT_VOLUME LIKE '%[0-9.-]%'
and INTENT_VOLUME NOT LIKE '%[^0-9.-]%';
In Oracle you can use regex in a normal way:
select to_number(INTENT_VOLUME)
from A
where REGEXP_LIKE(INTENT_VOLUME,'^[-+]?[0-9]+(\.[0-9]+)?$');
MySQL also has built-in regex support.
Try this, which tests if that text value can be cast as numeric...
select intent_volume
from a
where (intent_volume ~ '^([0-9]+[.]?[0-9]*|[.][0-9]+)$') = 't'

Cast in Google BigQuery not appropriate?

I have a #StandardSQL query
SELECT
CAST(created_utc AS STRING),
author,
FROM
`table`
WHERE
something = "Something"
which gives me the following error,
Error: Cannot read field 'created_utc' of type STRING as INT64
An example of created_utc is 1517360483
If I understand that error (which I clearly don't), created_utc is stored as a string, but the query is trying unsuccessfully to convert it to an INT64. I would have hoped the CAST function would force it to be kept as a string.
What have I done wrong?
The problem is that you don't actually have a single table. In your question, you wrote table, but I suspect that you are querying table*, which matches multiple tables where one of them happens to have a different type for that column. Instead of using table*, your options are to:
Use UNION ALL with the individual tables, performing casts as appropriate in the SELECT lists.
If you know which table(s) have that column as an INT64 instead of a STRING, and you are okay with excluding them, you can use a filter on _TABLE_SUFFIX to skip reading from certain tables.
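As a rough sketch of both options: the names `project.dataset.events_*`, `events_2018_01` and `events_2018_02` are hypothetical, and the example assumes `events_2018_01` is the one table that stores created_utc as an INT64.

```sql
#standardSQL
-- Option 1: UNION ALL over the individual tables, casting only where needed.
SELECT CAST(created_utc AS STRING) AS created_utc, author
FROM `project.dataset.events_2018_01`   -- hypothetical table with INT64 created_utc
WHERE something = "Something"
UNION ALL
SELECT created_utc, author
FROM `project.dataset.events_2018_02`   -- hypothetical table with STRING created_utc
WHERE something = "Something";

-- Option 2: keep the wildcard, but skip the table with the mismatched type.
SELECT created_utc, author
FROM `project.dataset.events_*`
WHERE _TABLE_SUFFIX != '2018_01'
  AND something = "Something";
```

The first form keeps all rows at the cost of listing each table; the second is shorter but drops the excluded table's data.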
As Elliott has already pointed out, some of your values cannot be cast to INT64 because they do not represent integers and contain characters other than digits.
Using the SELECT below, you can identify such values. That will help you locate the problematic entries and then decide on next actions.
#standardSQL
SELECT created_utc, author
FROM `table`
WHERE something = "Something"
AND NOT REGEXP_CONTAINS(created_utc , r'^[0-9]+$')

remove first two digits of customer_number in impala sql

In Cloudera / Impala SQL I need to remove the first two digits of a customer_number.
I tried the following, but it does not work. Can you please help?
Many thanks.
CREATE TABLE new
STORED AS PARQUET AS
SELECT DISTINCT
CASE t1.customer_number = RIGHT(t1.customer_number, LEN(t1.customer_number) - 2)
from Old;
customer_number should become short_cust_no
33764703 764703
36764624 764624
36763795 763795
37764829 764829
39766002 766002
Impala supports substr() with two arguments. You can simply do:
SELECT DISTINCT SUBSTR(t1.customer_number, 3)
FROM Old t1;
EDIT:
I had assumed customer_number was a string, because the OP uses string functions.
If it is a number, use mod():
SELECT DISTINCT MOD(t1.customer_number, 1000000)
FROM Old t1;
Note: The types for the arguments to mod() need to be compatible so this might require a cast() of some sort.
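The two approaches are not fully interchangeable. A quick check with literals (the second value, 33004703, is a made-up example and not from the question) shows that MOD assumes exactly eight digits and drops leading zeros, while SUBSTR on the string form keeps them:

```sql
SELECT
  SUBSTR(CAST(33764703 AS STRING), 3) AS via_substr,  -- '764703'
  MOD(33764703, 1000000)              AS via_mod,     -- 764703
  SUBSTR(CAST(33004703 AS STRING), 3) AS zeros_kept,  -- '004703'
  MOD(33004703, 1000000)              AS zeros_lost;  -- 4703
```

If short_cust_no must always have six characters, the string version is probably the safer choice.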
If all your customer numbers are 14 characters then I think you should be able to do that with
RIGHT(t1.customer_number, 12)
This addresses the DOUBLE, TINYINT mistake
SELECT DISTINCT
SUBSTR(cast(t1.customer_number as string), 3,10)
FROM old;

Translate function not returning relevant string in amazon redshift

I am trying to use a simple Translate function to replace "-" in a 23 digit string. The example of one such string is "1049477-1623095-2412303" The expected outcome of my query should be 104947716230952412303
The list of all "1049477-1623095-2412303" is present in a single column "table1". The name of the column is "data"
My query is
Select TRANSLATE(t.data, '-', '')
from table1 as t
However, it is returning 104947716230952000000 as the output.
At first, I thought it is an overflow error since the resulting integer is 20 digit so I also tried to use following
SELECT CAST(TRANSLATE(t.data,'-','') AS VARCHAR)
from table1 as t
but this does not work either.
Please suggest a way to get my desired output.
This is too long for a comment.
This code:
select translate('1049477-1623095-2412303', '-', '')
is going to return:
'104947716230952412303'
The return value is a string, not a number.
There is no way that it can return '104947716230952000000'. I could only imagine that happening if somehow the value is being converted to a numeric or bigint type.
Try regexp_replace()
Taking your own example, execute:
select regexp_replace('[string / column_name]','-');
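For a quick sanity check on the literal from the question, a sketch along these lines can be run on its own. It uses REGEXP_REPLACE with an explicit empty replacement and the plain REPLACE function as an alternative; the column aliases are made up for illustration.

```sql
SELECT
  REGEXP_REPLACE('1049477-1623095-2412303', '-', '') AS via_regexp,  -- '104947716230952412303'
  REPLACE('1049477-1623095-2412303', '-', '')        AS via_replace; -- '104947716230952412303'
```

Both return a string. If trailing zeros still appear, the value is likely being converted to a numeric type somewhere after the query, for example by the client displaying it.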
This can be achieved with RPAD; try the code below.
SELECT RPAD(TRANSLATE(CAST(t.data as VARCHAR),'-','') ,20,'00000000000000000000')