SQL padding 0 to the left of a number in string

I am a beginner in SQL. I am using PostgreSQL and doing small exercises to learn. I have a column of strings named acronym in a destination table:
DO1
ES1
ES2
FR1
FR10
FR2
FR3
FR4
FR5
FR6
FR7
FR8
FR9
GP1
GP2
IN1
IN2
MU1
RU1
TR1
UA1
I would like to add a padding zero to acronyms whose number has only one digit. Desired output:
DO01
ES01
ES02
FR01
FR02
FR03
FR04
FR05
FR06
FR07
FR08
FR09
FR10
GP01
GP02
IN01
IN02
MU01
RU01
TR01
UA01
How can I insert a zero to the left of the first digit in the string? I think a regex could do it, but I did not figure it out.

You can use the rpad() function to add characters to the end of the value:
select rpad(col, 4, '0')
In your case, though, you want the character inserted in the middle of the value. One simple method, assuming that the first two characters are letters, is:
(case when length(col) = 3
then left(col, 2) || '0' || right(col, 1)
else col
end)
Another possibility is using regexp_replace():
regexp_replace(col, '^([^0-9]{2})([0-9])$', '\10\2')
Both of these assume that the strings to be padded are three characters, which is consistent with your data. It is unclear what you want for other lengths.
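For readers who want to try the pattern outside the database, here is a minimal Python sketch of the same regular expression. The function name pad_acronym is hypothetical, and Python's re module stands in for PostgreSQL's regexp_replace(), so the replacement syntax differs slightly.

```python
import re

# Same pattern as the regexp_replace() call above:
# exactly two non-digit characters followed by exactly one digit.
PATTERN = re.compile(r'^([^0-9]{2})([0-9])$')


def pad_acronym(value: str) -> str:
    """Insert a zero before a single trailing digit; leave other values alone."""
    # \g<1> is needed here because Python would read "\10" as group 10.
    return PATTERN.sub(r'\g<1>0\2', value)


if __name__ == "__main__":
    for sample in ["DO1", "FR9", "FR10"]:
        print(sample, "->", pad_acronym(sample))
```

Values that do not match the three-character shape, such as FR10, pass through unchanged, which matches the assumption stated above.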

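Try the to_char() function:
select to_char(column1, 'fm000') as column2
from Test_table;
The fm ("fill mode") prefix avoids leading spaces in the resulting varchar.
000 defines the number of digits you want to have (three here; use 00 for two).
Note that to_char() formats a numeric value, so column1 has to be the numeric part of the acronym, not the full string.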

You can use string functions like lpad(), substr(), left():
select
concat(left(columnname, 2), lpad(substr(columnname, 3), 2, '0')) result
from tablename
Results:
| result |
| ------ |
| DO01 |
| ES01 |
| ES02 |
| FR01 |
| FR10 |
| FR02 |
| FR03 |
| FR04 |
| FR05 |
| FR06 |
| FR07 |
| FR08 |
| FR09 |
| GP01 |
| GP02 |
| IN01 |
| IN02 |
| MU01 |
| RU01 |
| TR01 |
| UA01 |
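The split-and-pad idea can be sketched in Python as follows. This is only an illustration of the logic, not the SQL itself: pad_suffix, prefix_len and width are hypothetical names, and the two-letter prefix is the same assumption the query makes.

```python
def pad_suffix(value: str, prefix_len: int = 2, width: int = 2) -> str:
    """Keep the first prefix_len characters and left-pad the rest with zeros."""
    prefix, number = value[:prefix_len], value[prefix_len:]
    # rjust() pads on the left. Unlike PostgreSQL's lpad(), it never
    # truncates a number that is already longer than width.
    return prefix + number.rjust(width, "0")


if __name__ == "__main__":
    for sample in ["FR1", "FR10", "GP2"]:
        print(sample, "->", pad_suffix(sample))
```

One difference worth keeping in mind: lpad() in PostgreSQL truncates input longer than the requested length, so a value like FR100 would lose a digit in the SQL version but not in this sketch.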

Related

Check string for substring existence

How can I check whether a certain substring (for instance 18UT) is part of a string in a column?
Redshift's SUBSTRING function allows me to "cut" a certain substring based on a starting index and the length of the substring, but not to check whether a specific substring exists in the column's value.
Example:
+------------------+
| col |
+------------------+
| 14TH, 14KL, 18AB |
| 14LK, 18UT, 15AK |
| 14AB, 08ZT, 18ZH |
| 14GD, 52HG, 18UT |
+------------------+
Desired result:
+------------------+------+
| col | 18UT |
+------------------+------+
| 14TH, 14KL, 18AB | No |
| 14LK, 18UT, 15AK | Yes |
| 14AB, 08ZT, 18ZH | No |
| 14GD, 52HG, 18UT | Yes |
+------------------+------+
Here is one option:
select col,
case when ', ' || col || ', ' like '%, 18UT, %' then 'Yes' else 'No' end has_18ut
from mytable
While this will solve your immediate problem, it should be noted that storing delimited lists in a database table is bad practice and should be avoided. Each value should go in a separate row instead.
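The reason for wrapping the column in delimiters is that every element, including the first and last, then has a separator on both sides, so a partial match such as 118UT cannot slip through. A small Python sketch of that trick, with has_code as a hypothetical helper name:

```python
def has_code(col: str, code: str, sep: str = ", ") -> str:
    """Return 'Yes' if code is a whole element of the delimited list in col."""
    # Wrap both the list and the search term in separators so that only
    # complete elements can match, mirroring the LIKE '%, 18UT, %' test.
    wrapped = sep + col + sep
    return "Yes" if (sep + code + sep) in wrapped else "No"


if __name__ == "__main__":
    for row in ["14TH, 14KL, 18AB", "14LK, 18UT, 15AK", "14GD, 52HG, 18UT"]:
        print(row, "|", has_code(row, "18UT"))
```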

Redshift skip the first character of split_part()

I have a table column like below:
| column_a |
| ------------------ |
| Alpha_Black_1 |
| Alpha_Black_2323 |
| Alpha_Red_100 |
| Alpha_Blue_2344 |
| Alpha_Orange_33333 |
| Alpha_White_2 |
| |
Usually, when I want to split on a symbol or character, I use split_part(text, text, integer), for example split_part(column_a, '_', 1).
I need to remove the numeric part of each value and keep only the text part, like Alpha_Black.
I cannot use the trim function because the numeric part can change.
How can I skip the first underscore and split from the second one?
I would suggest using REGEXP_REPLACE here:
SELECT
column_a,
REGEXP_REPLACE(column_a, '_\\d+$', '') AS column_a_out
FROM yourTable;
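The pattern removes a final underscore followed by digits, anchored at the end of the string. The following Python sketch shows the same idea; strip_numeric_suffix is a hypothetical name, and Python needs only a single backslash in a raw string where the Redshift literal uses two.

```python
import re


def strip_numeric_suffix(value: str) -> str:
    """Drop a trailing '_<digits>' part, if there is one."""
    # '$' anchors the match at the end, so underscores earlier in the
    # string (for example the one after 'Alpha') are left untouched.
    return re.sub(r'_\d+$', '', value)


if __name__ == "__main__":
    for sample in ["Alpha_Black_2323", "Alpha_Orange_33333", "Alpha_White"]:
        print(sample, "->", strip_numeric_suffix(sample))
```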

How To Check Numerical Format in SQL Server 2008

I am converting some existing Oracle queries to MSSQL Server (2008) and can't figure out how to replicate the following Regex check:
SELECT SomeField
FROM SomeTable
WHERE NOT REGEXP_LIKE(TO_CHAR(SomeField), '^[0-9]{2}[.][0-9]{7}$');
That finds all results where the format of the number starts with 2 positive digits, followed by a decimal point, and 7 decimal places of data: 12.3456789
I've tried using STR, CAST, CONVERT, but they all seem to truncate the decimal to 4 decimal places for some reason. The truncating has prevented me from getting reliable results using LEN and CHARINDEX. Manually adding size parameters to STR gets slightly closer, but I still don't know how to compare the original numerical representation to the converted value.
SELECT SomeField
, STR(SomeField, 10, 7)
, CAST(SomeField AS VARCHAR)
, LEN(SomeField )
, CHARINDEX(STR(SomeField ), '.')
FROM SomeTable
+------------------+------------+---------+-----+-----------+
| Orig | STR | Cast | LEN | CHARINDEX |
+------------------+------------+---------+-----+-----------+
| 31.44650944 | 31.4465094 | 31.4465 | 7 | 0 |
| 35.85609 | 35.8560900 | 35.8561 | 7 | 0 |
| 54.589623 | 54.5896230 | 54.5896 | 7 | 0 |
| 31.92653899 | 31.9265390 | 31.9265 | 7 | 0 |
| 31.4523333333333 | 31.4523333 | 31.4523 | 7 | 0 |
| 31.40208955 | 31.4020895 | 31.4021 | 7 | 0 |
| 51.3047869443893 | 51.3047869 | 51.3048 | 7 | 0 |
| 51 | 51.0000000 | 51 | 2 | 0 |
| 32.220633 | 32.2206330 | 32.2206 | 7 | 0 |
| 35.769247 | 35.7692470 | 35.7692 | 7 | 0 |
| 35.071022 | 35.0710220 | 35.071 | 6 | 0 |
+------------------+------------+---------+-----+-----------+
What you want to do does not make sense in SQL Server.
Oracle supports a number data type that has a variable precision:
if a precision is not specified, the column stores values as given.
There is no corresponding data type in SQL Server. You have either a floating-point number (float/real) or a fixed-precision number (decimal/numeric). However, both apply to ALL values in a column, not to individual values within a row.
The closest you could do is:
where somefield >= 0 and somefield < 100
Or if you wanted to insist that there is a decimal component:
where somefield >= 0 and somefield < 100 and floor(somefield) <> somefield
However, you might have valid integer values that this would filter out.
This answer gave me an option that works in conjunction with checking the decimal position first.
SELECT SomeField
FROM SomeTable
WHERE SomeField IS NOT NULL
AND CHARINDEX('.', SomeField ) = 3
AND LEN(CAST(CAST(REVERSE(CONVERT(VARCHAR(50), SomeField , 128)) AS FLOAT) AS BIGINT)) = 7
While I understand this is terrible by nearly all metrics, it satisfies the requirements.
The basis of checking formatting on this data type is inherently flawed, as pointed out by several posters; however, for this very isolated use case I wanted to document the workaround.

SQL - left pad with the zero after symbol '-'

I am trying to left pad with a single zero after the '-'.
I did check the other answers here, but they didn't help me.
Here is the table :
+---------+
| Job |
+---------+
| 3254-1 |
| 3254-25 |
| 3254-6 |
+---------+
I need to left pad with a single zero after the '-' if the value at the end is between 1 and 9.
I want the results to be:
+---------+
| Job |
+---------+
| 3254-01 |
| 3254-25 |
| 3254-06 |
+---------+
You can use CHARINDEX(), LEN() and REPLACE() as:
CREATE TABLE Jobs(
Job VARCHAR(45)
);
INSERT INTO Jobs VALUES
('3254-1'),
('3254-25'),
('3254-6');
SELECT CASE
WHEN CHARINDEX('-', Job, 1)+1 < LEN(Job) THEN Job
ELSE
REPLACE(Job, '-', '-0')
END AS Job
FROM Jobs;
Results:
+----+---------+
| | Job |
+----+---------+
| 1 | 3254-01 |
| 2 | 3254-25 |
| 3 | 3254-06 |
+----+---------+
If you want an update, I think this is the simplest method:
update t
set job = replace(job, '-', '-0')
where job like '%-_';
This problem is simplified greatly because you are only adding a single padding character.
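To see why the LIKE '%-_' filter is enough, note that it matches only strings whose second-to-last character is the dash, that is, a dash followed by exactly one character. A Python sketch of the same rule, with pad_job as a hypothetical name:

```python
def pad_job(job: str) -> str:
    """Insert a zero after the dash when exactly one character follows it."""
    # Mirrors WHERE job LIKE '%-_': a dash followed by one final character.
    if len(job) >= 2 and job[-2] == "-":
        # Like REPLACE() in the UPDATE, this rewrites every dash in the
        # string, so it assumes there is only one dash per value.
        return job.replace("-", "-0")
    return job


if __name__ == "__main__":
    for sample in ["3254-1", "3254-25", "3254-6"]:
        print(sample, "->", pad_job(sample))
```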
If you have SQL Server 2012 or later, the format() function may be used:
select concat(nr1, '-', format( cast ( q2.nr2 as int ), '00')) as result
from
(
select substring(q1.str,1,charindex('-',q1.str,1)-1) as nr1,
substring(q1.str,charindex('-',q1.str,1)+1,len(q1.str)) as nr2
from
(
select '3254-1' as str union all
select '3254-25' as str union all
select '3254-6' as str
) q1
) q2;
result
------
3254-01
3254-25
3254-06

Get row with max value in Hive/SQL?

I'm new to Hive/SQL, and I'm stuck on a fairly simple problem. My data looks like:
+------------+--------------------+-----------------------+
| carrier_iD | meandelay | meancanceled |
+------------+--------------------+-----------------------+
| EV | 13.795802119653473 | 0.028584251044292006 |
| VX | 0.450591016548463 | 2.364066193853424E-4 |
| F9 | 10.898001378359766 | 0.00206753962784287 |
| AS | 0.5071547420965062 | 0.0057404326123128135 |
| HA | 1.2031093279839498 | 5.015045135406214E-4 |
| 9E | 8.147899230704216 | 0.03876067292247866 |
| B6 | 9.45383857757506 | 0.003162096314343487 |
| UA | 8.101511665305816 | 0.005467725574605967 |
| FL | 0.7265068895709532 | 0.0041141513746490044 |
| WN | 7.156119279121648 | 0.0057419058192869415 |
| DL | 4.206288692245839 | 0.005123990066804269 |
| YV | 6.316802855264404 | 0.029304029304029346 |
| US | 3.2221527095063736 | 0.007984031936127766 |
| OO | 6.954715814690328 | 0.02596499362466706 |
| MQ | 9.74568222216328 | 0.025628100708354324 |
| AA | 8.720522654298968 | 0.019242775597574157 |
+------------+--------------------+-----------------------+
I want Hive to return the row with the meanDelay max value. I have:
SELECT CAST(MAX(meandelay) as FLOAT) FROM flightinfo;
which indeed returns the max (I use cast because my values are saved as STRING). So then:
SELECT * FROM flightinfo WHERE meandelay = (SELECT CAST(MAX(meandelay) AS FLOAT) FROM flightinfo);
I get the following error:
FAILED: ParseException line 1:44 cannot recognize input near 'select' 'cast' '(' in expression specification
Use the windowing and analytics functions:
SELECT carrier_id, meandelay, meancanceled
FROM
(SELECT carrier_id, meandelay, meancanceled,
rank() over (order by cast(meandelay as float) desc) as r
FROM flightinfo) S
WHERE S.r = 1;
This also handles the case where more than one row has the same max value: you'll get all of those rows in the result. If you just want a single row, change rank() to row_number() or add another term to the order by.
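Because meandelay is stored as a STRING, the cast has to happen before the values are compared, which is what the ORDER BY in the query does. The Python sketch below illustrates the difference; rows_with_max_delay is a hypothetical helper, and the claim that an uncast string comparison is lexicographic is an assumption about how the engine orders strings, shown here with Python's own string ordering.

```python
def rows_with_max_delay(rows):
    """Return every (carrier_id, meandelay) row tied for the largest delay.

    meandelay is expected as a string, as in the question's table.
    """
    rows = list(rows)
    if not rows:
        return []
    # Convert to float before comparing, like cast(meandelay as float).
    top = max(float(delay) for _, delay in rows)
    return [row for row in rows if float(row[1]) == top]


if __name__ == "__main__":
    sample = [
        ("EV", "13.795802119653473"),
        ("MQ", "9.74568222216328"),
        ("VX", "0.450591016548463"),
    ]
    print(rows_with_max_delay(sample))
    # Comparing the raw strings picks a different "max", since '9' > '1'.
    print(max(delay for _, delay in sample))
```

Like rank(), this returns all rows that share the maximum rather than an arbitrary one.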
Use a join instead:
SELECT a.* FROM flightinfo a LEFT SEMI JOIN
(SELECT CAST(MAX(meandelay) AS FLOAT) AS maxdelay
FROM flightinfo) b ON (a.meandelay = b.maxdelay)
You can use the collect_max UDF from Brickhouse ( http://github.com/klout/brickhouse ) to solve this problem, passing in a value of 1, meaning that you only want the single max value.
select array_index( map_keys( collect_max( carrier_id, meandelay, 1) ), 0 ) from flightinfo;
Also, I've read somewhere that the Hive max UDF does allow you to access other fields on the row, but I think it's easier just to use collect_max.
I don't think your sub-query is allowed ...
A quick look here:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries
states:
As of Hive 0.13 some types of subqueries are supported in the WHERE
clause. Those are queries where the result of the query can be treated
as a constant for IN and NOT IN statements (called uncorrelated
subqueries because the subquery does not reference columns from the
parent query):