Remove first two digits of customer_number in Impala SQL

In Cloudera / Impala SQL I need to remove the first two digits of a customer_number.
I tried the following, but it does not work. Can you please help?
Many thanks.
CREATE TABLE new
STORED AS PARQUET AS
SELECT DISTINCT
CASE t1.customer_number = RIGHT(t1.customer_number, LEN(t1.customer_number) - 2)
from Old;
customer_number should become short_cust_no:
customer_number  short_cust_no
33764703         764703
36764624         764624
36763795         763795
37764829         764829
39766002         766002

Impala supports substr() with two arguments. You can simply do:
SELECT DISTINCT SUBSTR(t1.customer_number, 3)
FROM Old t1;
EDIT:
I had assumed customer_number was a string, because the OP uses string functions.
If it is a number, use mod():
SELECT DISTINCT MOD(t1.customer_number, 1000000)
FROM Old t1;
Note: The types for the arguments to mod() need to be compatible so this might require a cast() of some sort.
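To compare the two approaches, here is a minimal Python sketch (not Impala itself) that mirrors SUBSTR(..., 3) with string slicing and MOD(..., 1000000) with the % operator. The helper names are hypothetical, and it assumes 8-digit customer numbers as in the question's sample data.

```python
def drop_prefix_as_string(customer_number: int) -> str:
    # Mirrors SUBSTR(CAST(n AS STRING), 3): drop the first two characters.
    return str(customer_number)[2:]


def drop_prefix_as_number(customer_number: int) -> int:
    # Mirrors MOD(n, 1000000): keep the last six digits of an 8-digit number.
    return customer_number % 1_000_000


samples = [33764703, 36764624, 36763795, 37764829, 39766002]
for n in samples:
    print(n, drop_prefix_as_string(n), drop_prefix_as_number(n))

# The two results differ when the remaining digits start with a zero:
print(drop_prefix_as_string(33004703))  # '004703' (leading zeros kept)
print(drop_prefix_as_number(33004703))  # 4703 (leading zeros lost)
```

The string version keeps leading zeros and works for any length, while the numeric version only works when every number has exactly eight digits.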

If all your customer numbers are 14 characters then I think you should be able to do that with
RIGHT(t1.customer_number, 12)

This addresses the DOUBLE, TINYINT mistake
SELECT DISTINCT
SUBSTR(cast(t1.customer_number as string), 3,10)
FROM old t1;

Related

How to select rows that have numbers as a value?

I have a table with a column of type VARCHAR2(255 BYTE). I would like to select only the rows whose value is a number, discarding any other values such as "lala" or "1z". I just want pure numbers from 1 to 999999999 (digits only, in other words) :P
Could you tell me how to do it?
If you're using Oracle 12c R2 or later, use the built-in validate_conversion() function:
select *
from your_table
where validate_conversion(your_column as number) = 1
validate_conversion() returns 1 when the proposed conversion would succeed and 0 when it wouldn't. It also supports date and timestamp conversions.
Something like this is the usual option. You could use regexp, but it's usually a bit slower.
select column1
from tableA
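where translate(column1, 'x1234567890', 'x') is null;
The leading 'x' is needed because Oracle treats an empty third argument as NULL, which would make translate() return NULL for every row.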
Here's the regexp version kfinity referred to. The regex matches a line consisting of 1 or more digits.
select column1
from tableA
where regexp_like(column1, '^\d+$');
You don't want zero to start a number. So it seems like regular expressions are the way to go:
where regexp_like(column1, '^[1-9][0-9]*$');
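The difference between the two patterns above can be checked outside the database. This is a small Python sketch using the standard re module (not Oracle's regexp_like); the function names and sample values are made up for illustration, and [0-9] is used instead of \d so that only ASCII digits match.

```python
import re

ANY_DIGITS = re.compile(r"[0-9]+")            # counterpart of '^\d+$'
NO_LEADING_ZERO = re.compile(r"[1-9][0-9]*")  # counterpart of '^[1-9][0-9]*$'


def is_digits(value: str) -> bool:
    # fullmatch() anchors the pattern at both ends of the string.
    return ANY_DIGITS.fullmatch(value) is not None


def is_positive_integer(value: str) -> bool:
    return NO_LEADING_ZERO.fullmatch(value) is not None


for value in ["123", "0123", "0", "lala", "1z", ""]:
    print(repr(value), is_digits(value), is_positive_integer(value))
```

Only the second pattern rejects "0123" and "0", which matters if the values must be numbers starting from 1.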

Getting an error when using CONCAT in BigQuery

I'm trying to run a query where I combine two columns and separate them with an x in between.
I'm also trying to get some other columns from the same table. However, I get the following error.
Error: No matching signature for function CONCAT for argument types: FLOAT64, FLOAT64. Supported signatures: CONCAT(STRING, [STRING, ...]); CONCAT(BYTES, [BYTES, ...]).
Here is my code:
SELECT
CONCAT(right,'x',left),
position,
numbercreated,
Madefrom
FROM
table
WHERE
Date = "2018-10-07%"
I have also tried putting a cast before it, but that did not work.
SELECT Concast(cast(right,'x',left)), position,...
SELECT Concast(cast(right,'x',left)as STRING), position,...
Why am I getting this error?
Are there any fixes?
Thanks for the help.
You need to cast each value before the concat():
SELECT CONCAT(CAST(right as string), 'x', CAST(left as string)),
position, numbercreated, Madefrom
FROM table
WHERE Date = '2018-10-07%';
If you want a particular format, then use the FORMAT() function.
I also doubt that your WHERE will match anything. If Date is a string, then you probably want LIKE:
WHERE Date LIKE '2018-10-07%';
More likely, you should use the DATE function or direct comparison:
WHERE DATE(Date) = '2018-10-07'
or:
WHERE Date >= '2018-10-07' AND
Date < '2018-10-08'
Another option to fix your issue with CONCAT is to use the FORMAT function, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1.01 AS `right`, 2.0 AS `left`
)
SELECT FORMAT('%g%s%g', t.right, 'x', t.left)
FROM `project.dataset.table` t
result will be
Row f0_
1 1.01x2
Note: in this specific example you could use an even simpler statement:
FORMAT('%gx%g', t.right, t.left)
See the documentation for the other supported format specifiers.
A recommendation: try not to use keywords as column names or aliases. If for some reason you do, wrap them in backticks or prefix them with the table name or alias.
One more comment: it looks like you switched the positions of your values, so that the right one is on the left side and the left one is on the right. That might be exactly what you need, but I wanted to mention it.
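As a rough analogy only (this is Python, not BigQuery), Python's printf-style % operator also has a %g conversion, so it can illustrate the difference between casting each number to text and formatting with %g. The variable names follow the sample values in the answer above.

```python
right, left = 1.01, 2.0  # sample values from the answer above

# Cast each number to text before joining, similar in spirit to
# CONCAT(CAST(right AS STRING), 'x', CAST(left AS STRING)).
joined = str(right) + "x" + str(left)

# printf-style formatting with %g, which drops insignificant trailing zeros.
formatted = "%gx%g" % (right, left)

print(joined)     # 1.01x2.0
print(formatted)  # 1.01x2
```

How a cast renders 2.0 as text depends on the engine, so a format string is the safer choice when a particular appearance is required.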
Try like below by using safe_cast:
SELECT
CONCAT(SAFE_CAST( right as string ),'x',SAFE_CAST(left as string)),
position,
numbercreated,
Madefrom
FROM
table
WHERE
Date = '2018-10-07'

How to convert this SAS code to SQL Server code?

SAS CODE:
data table1;
set table2;
_sep1 = findc(policynum,'/&,');
_count1 = countc(policynum,'/&,');
_sep2 = findc(policynum,'-');
_count2 = countc(policynum,'-');
_sep3 = findc(policynum,'_*');
_count3 = countc(policynum,'_*');
How can I convert this into a select statement like below:
select
*,
/*Code converted to SQL from above*/
from table2
For example I tried the below code:
select
*,
charindex('/&,',policynum) as _sep1,
LEN(policynum) - LEN(REPLACE(policynum,'/&,','')) as _count1
from table2
But I got an error: ERROR 42S02: Function 'CHARINDEX(UNKNOWN, VARCHAR)' does not exist. Unable to identify a function that satisfies that given argument types. You may need to add explicit typecasts.
Please note that the variable pol_no is: 'character varying(50) not null'.
I am running this using Aginity Workbench for Netezza. I believe this is IBM.
Assuming SQL Server, based on CHARINDEX(), this may work:
You need to apply it once for each character and take the minimum to find the first occurrence.
There may be a better-suited function in your database, but I don't know enough to suggest one.
select
*,
min(charindex('/',policynum), charindex('&', policynum)) as _sep1
from table2
EDIT: based on OP notes.
Netezza is an IBM product, so use the INSTR function rather than CHARINDEX.
select
*,
min(instr(policynum, '/'), instr(policynum, '&')) as _sep1
from table2
https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.sqls.doc/ids_sqs_2336.htm
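For reference, the behaviour the SQL needs to reproduce can be sketched in Python. This is an approximation of SAS FINDC and COUNTC with their default arguments only (no modifiers or start position), and the sample policy number is made up.

```python
def findc(text: str, chars: str) -> int:
    """1-based position of the first character of `text` found in `chars`, or 0."""
    for position, ch in enumerate(text, start=1):
        if ch in chars:
            return position
    return 0


def countc(text: str, chars: str) -> int:
    """Number of characters of `text` that appear in `chars`."""
    return sum(1 for ch in text if ch in chars)


policynum = "AB/123&45-6"  # hypothetical sample value
print(findc(policynum, "/&,"), countc(policynum, "/&,"))
print(findc(policynum, "-"), countc(policynum, "-"))
print(findc(policynum, "_*"), countc(policynum, "_*"))
```

Note that each character in the second argument is searched for separately, which is why a single CHARINDEX('/&,', ...) call does not do the same thing. A plain minimum over per-character positions would also return 0 whenever one of the characters is absent, so a SQL translation has to skip the zero results.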
The FINDC and COUNTC functions are used for searching for characters and counting them.
You can use the LIKE operator in SQL to find characters, with the '%' and '_' wildcards,
e.g.:
SELECT * FROM <table_name> WHERE <column_name> LIKE '%-%';
and
SELECT COUNT(*) FROM <table_name> WHERE <column_name> LIKE '%-%';
You can use regular expressions in the LIKE operator as well

How to find values with certain number of decimal places using SQL?

I'm trying to figure out a way, using SQL, to query for values that go out to, say, 5 or more decimal places. In other words, I want to see only results that have 5+ decimal places (e.g. 45.324754). The numbers before the decimal point are irrelevant; however, I still need to see the full number. Is this possible? Any help is appreciated.
Assuming your DBMS supports FLOOR and your datatype conversion model supports this multiplication, you can do this:
SELECT *
FROM Table
WHERE FLOOR(Num*10000)!=Num*10000
This has the advantage of not requiring a conversion to a string datatype.
On SQL Server, you can specify:
SELECT *
FROM Table
WHERE Value <> ROUND(Value,4,1);
For an ANSI method, you can use:
SELECT *
FROM Table
WHERE Value <> CAST(Value*10000.0 AS INT) / 10000.0;
Although this method might cause an overflow if you're working with large numbers.
I imagine most DBMSs have a round function
SELECT *
FROM YourTable
WHERE YourCol <> ROUND(YourCol,4)
This worked for me in SQL Server:
SELECT *
FROM YourTable
WHERE YourValue LIKE '%._____%';
select val
from tablename
where length(substr(val,instr(val, '.')+1)) >= 5
This is a way to do it in Oracle using substr and instr.
You can use the decode statement below to find the maximum number of decimal places present in the table:
SELECT max(decode(INSTR(val,'.'), 0, 0, LENGTH(SUBSTR(val,INSTR(val,'.')+1)))) max_decimal
FROM tablename A;
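The two families of answers above (shift-and-compare arithmetic versus counting characters after the decimal point) can be sketched in Python with the decimal module, which avoids binary floating-point surprises. The function names are hypothetical, and values are passed as strings so their digits are preserved exactly.

```python
from decimal import Decimal, ROUND_FLOOR


def has_at_least_decimals(value: str, places: int) -> bool:
    """True if `value` has a nonzero digit at decimal position `places` or beyond."""
    scaled = Decimal(value) * (10 ** (places - 1))
    # A fractional remainder after shifting (places - 1) digits means
    # the number needs at least `places` decimal digits.
    return scaled != scaled.to_integral_value(rounding=ROUND_FLOOR)


def decimal_places(value: str) -> int:
    """Count the characters after the decimal point, or 0 if there is none."""
    _, dot, fraction = value.partition(".")
    return len(fraction) if dot else 0


for v in ["45.324754", "45.32475", "45.3247", "45"]:
    print(v, has_at_least_decimals(v, 5), decimal_places(v))
```

To find values with 5 or more decimal places, the number is shifted by 4 digits, not 5. The two approaches also disagree on trailing zeros: "45.32470" has five characters after the point but only four significant decimal digits.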

SQL for extract portion of a string

I have a zipcode stored in a text field (string) and would like to select only the last 3 digits of the value in my select statement. Is this possible? Is there a standard way of doing this so that the SQL is interchangeable across databases? I will be using it in production on Oracle, but I test on Interbase (yes, yes, I know, two totally different DBs, but that's what I am doing).
Thanks for any help you can offer.
Assuming the zipcodes all have the same length, you can use substr.
If they don't have the same length, you have to do similar things with the strlen function.
Interbase does not have a built-in substring function, but it does have a UDF (user defined function) called SUBSTR in lib_udf.dll that works like this:
select substr(clients.lastname, 1, 10)
from clients
You declare the UDF like this:
DECLARE EXTERNAL FUNCTION SUBSTR
CSTRING(80),
SMALLINT,
SMALLINT
RETURNS CSTRING(80) FREE_IT
ENTRY_POINT 'IB_UDF_substr' MODULE_NAME 'ib_udf';
Oracle does have a built-in substr function that you use like this:
select substr(clients.lastname, 1, 10)
from clients
--jeroen
This depends on how you're storing the zip code. If you are using 5 digits only,
then this should work for Oracle and may work for Interbase.
select * from table where substr(zip,3,3) = '014'
If you store Zip + 4 and you want the last 3 digits, and some are 5 digits and some are 9 digits, you would have to do the following.
select * from table where substr(zip,length(zip) -2,3) = '014'
and one option that may work better in both databases is
select * from table where zip like '%014'
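The length-based substr expression can be tried out with Python's built-in sqlite3 module. SQLite is only a stand-in here (it is neither Oracle nor Interbase), but its substr() is also 1-based; the addresses table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (zip TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO addresses VALUES (?)",
    [("90014",), ("900141234",), ("10001",)],
)

# Select the last three characters regardless of the zip's length.
last3 = conn.execute(
    "SELECT zip, substr(zip, length(zip) - 2, 3) FROM addresses ORDER BY zip"
).fetchall()
print(last3)

# Filter on the last three characters with LIKE instead.
matches = conn.execute(
    "SELECT zip FROM addresses WHERE zip LIKE '%014'"
).fetchall()
print(matches)
```

The LIKE form only filters rows; to return the last three digits as a column, the substr form is still needed.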