Extract string before certain character or without that character present - sql

I am using SQL Server 2016 and I am trying to extract the first set of numbers of a certain string. Here are some examples
12345
123456
12345-ks
12345-12
123456-ks
I want:
12345
123456
12345
12345
123456
I have tried SUBSTRING(@str, 0, CHARINDEX('-', @str, 0)), but that returns an empty string for any value without a '-'.
I have also tried using a CASE expression to include those strings, but I can't group by a CASE expression. Any thoughts?

I've used a CASE expression to return the whole value when the column does not contain a '-'.
create table strings ( s varchar(10));
insert into strings values
('12345'),
('123456'),
('12345-ks'),
('12345-12'),
('123456-ks');
select
charindex('-', s, 0) as ci,
case when charindex('-',s,0) = 0
then s
else SUBSTRING(s,0,
charindex('-', s, 0)
)
end as first_group
from strings;
GO
ci | first_group
-: | :----------
0 | 12345
0 | 123456
6 | 12345
6 | 12345
7 | 123456
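On the GROUP BY part of the question, here is a minimal sketch of one way to do it. It assumes the data sits in a table variable named @strings (a made-up name for illustration). Appending a '-' before searching means CHARINDEX always finds a delimiter, so no CASE is needed, and CROSS APPLY gives the expression a name that GROUP BY can use.

```sql
-- Hypothetical sample data mirroring the question.
DECLARE @strings TABLE (s varchar(10));
INSERT INTO @strings VALUES
('12345'), ('123456'), ('12345-ks'), ('12345-12'), ('123456-ks');

SELECT x.first_group, COUNT(*) AS cnt
FROM @strings AS t
CROSS APPLY (
    -- s + '-' guarantees a '-' exists, so values without one are returned whole
    SELECT LEFT(t.s, CHARINDEX('-', t.s + '-') - 1) AS first_group
) AS x
GROUP BY x.first_group;
```

With this sample data, 12345 should come back with a count of 3 and 123456 with a count of 2.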

Related

Replace values in a column for all rows

I have a column with entries like:
column:
156781
234762
780417
and would like to have the following:
column:
0000156781
0000234762
0000780417
For this I use the following query:
Select isnull(replicate('0', 10 - len(column)),'') + rtrim(column) as a from table
However, I don't know how to replace the values in the whole column.
I already tried with:
UPDATE table
SET column = (
Select isnull(replicate('0', 10 - len(column)),'') + rtrim(column) as column from table)
But I get the following error.
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
The answer depends on the data type of your column. If it is a text column, for example VARCHAR, you can store the padded value in the table. If it is a numeric type such as INT, the column stores the value rather than its characters, so leading zeros are not kept.
Put another way, '0' + '1' = '01', whilst 0 + 1 = 1.
In either case we can format the value in a query.
create table numberz(
val1 int,
val2 varchar(10));
insert into numberz values
(156781,'156781'),
(234762,'234762'),
(780417,'780417');
/* required format
0000156781
0000234762
0000780417
*/
select * from numberz;
GO
val1 | val2
-----: | :-----
156781 | 156781
234762 | 234762
780417 | 780417
UPDATE numberz
SET val1 = isnull(
replicate('0',
10 - len(val1)),'')
+ rtrim(val1),
val2 = isnull(
replicate('0',
10 - len(val2)),'')
+ rtrim(val2);
GO
3 rows affected
select * from numberz;
GO
val1 | val2
-----: | :---------
156781 | 0000156781
234762 | 0000234762
780417 | 0000780417
select isnull(
replicate('0',
10 - len(val1)),'')
+ rtrim(val1)
from numberz
GO
| (No column name) |
| :--------------- |
| 0000156781 |
| 0000234762 |
| 0000780417 |
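As a small sketch of the same idea, the padding can also be written with RIGHT, which avoids the length arithmetic. The variable @n is only there for illustration. The second column shows why an INT column cannot keep the zeros: converting the padded text back to INT gives the plain number again.

```sql
DECLARE @n int = 156781;

SELECT
    -- prepend ten zeros, then keep the rightmost ten characters
    RIGHT(REPLICATE('0', 10) + CAST(@n AS varchar(10)), 10) AS padded,
    -- casting padded text back to int drops the leading zeros
    CAST('0000156781' AS int) AS stored_as_int;
```

This assumes the value never has more than ten digits; a longer value would be cut off on the left.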
Usually, when values need to be shown in a specific format, the formatting is done with CASE or other functions in the select list, that is, without updating the stored data. That way the format can be changed at any time just by changing the functions, much like a computed field.
For example (PostgreSQL syntax):
select id, lpad(id::text, 6, '0') as format_id from test.test_table1
order by id
Result:
id format_id
-------------
1 000001
2 000002
3 000003
4 000004
5 000005
Maybe you really need an UPDATE, so I wrote a sample query for an UPDATE command too.
update test.test_table1
set
id = lpad(id::text, 6, '0');

SQL Server - Ordering Combined Number Strings Prior To Column Insert

I have 2 string columns (thousands of rows) with ordered numbers in each string (there can be zero to ten numbers in each string). Example:
+------------------+------------+
| ColString1 | ColString2 |
+------------------+------------+
| 1;3;5;12; | 4;6; |
+------------------+------------+
| 1;5;10 | 2;26; |
+------------------+------------+
| 4;7; | 3; |
+------------------+------------+
The end result is to combine these 2 columns, sort the numbers in
ascending order and then put each number into individual columns (smallest, 2nd smallest etc).
e.g. ColString1 is 1;3;5;12; and ColString2 is 4;6; which needs to return 1;3;4;5;6;12; which I then use XML to allocate into columns.
Everything works fine using XML apart from the step to order the numbers (i.e. I'm getting 1;3;5;12;4;6; when I combine the strings, which is not in ascending order).
I've tried putting them into a JSON array first to order them, thinking I could do a top[1] etc., but that did not work.
Any help on how to combine the 2 columns and order them before inserting into columns would be appreciated.
Steps so far:
Example data:
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, ColString1 VARCHAR(50), ColString2 VARCHAR(50));
INSERT INTO @tbl (ColString1, ColString2)
VALUES
('1;3;5;12;', '4;6;'),
('1;5;10;', '2;26;'),
('14;', '3;8;');
XML Approach (Combines strings and puts into columns but not in the correct order):
;WITH Split_Numbers (xmlname)
AS
(
SELECT
CONVERT(XML,'<Names><name>'
+ REPLACE ( LEFT(ColString1+ColString2,LEN(ColString1+ColString2) - 1),';', '</name><name>') + '</name></Names>') AS xmlname
FROM @tbl
)
SELECT
xmlname.value('/Names[1]/name[1]','int') AS Number1,
xmlname.value('/Names[1]/name[2]','int') AS Number2,
xmlname.value('/Names[1]/name[3]','int') AS Number3,
xmlname.value('/Names[1]/name[4]','int') AS Number4,
xmlname.value('/Names[1]/name[5]','int') AS Number5
--etc for additional columns
FROM Split_Numbers
Current Output: numbers not in correct order,
+---------+---------+---------+---------+---------+
| Number1 | Number2 | Number3 | Number4 | Number5 |
+---------+---------+---------+---------+---------+
| 1 | 3 | 5 | 12 | 4 |
| 1 | 5 | 10 | 2 | 26 |
| 14 | 3 | 8 | NULL | NULL |
+---------+---------+---------+---------+---------+
Desired Output: numbers in ascending order.
+---------+---------+---------+---------+---------+
| Number1 | Number2 | Number3 | Number4 | Number5 |
+---------+---------+---------+---------+---------+
| 1 | 3 | 4 | 5 | 6 |
| 1 | 2 | 5 | 10 | 26 |
| 3 | 8 | 14 | NULL | NULL |
+---------+---------+---------+---------+---------+
JSON Approach: combines the columns into a JSON array but I still can't order correctly when in JSON format.
REPLACE ( CONCAT('[', LEFT(ColString1+ColString2,LEN(ColString1+ColString2) - 1), ']') ,';',',')
Any help will be greatly appreciated whether there is a way to order the xml or JSON string prior to entry. Happy to consider an alternative way if there is an easier solution.
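You can use string_agg() and string_split() (string_agg() needs SQL Server 2017 or later):
select t.*, agg.newstring
from t cross apply
(select string_agg(s.value, ';') within group (order by cast(s.value as int)) as newstring
from (select s1.value
from string_split(t.colstring1, ';') s1
union all
select s2.value
from string_split(t.colstring2, ';') s2
) s
where s.value <> '' -- skip the empty piece after the trailing ';'
) agg;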
That said, you should probably put your effort into fixing the data model. Storing numbers in strings is bad. Storing multiple values in a string is bad, bad. If the numbers are foreign references to other tables, that is bad, bad, bad, bad, bad.
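If the goal is separate columns rather than one sorted string, here is a rough sketch that skips the string aggregation altogether. It assumes SQL Server 2016 or later (compatibility level 130) for STRING_SPLIT, reuses the question's sample data, and assumes every ColString1 value ends with ';' so that concatenating the two columns does not glue two numbers together. The CTE name nums and the column names n and rn are made up for illustration.

```sql
DECLARE @tbl TABLE (ID int IDENTITY PRIMARY KEY, ColString1 varchar(50), ColString2 varchar(50));
INSERT INTO @tbl (ColString1, ColString2)
VALUES ('1;3;5;12;', '4;6;'), ('1;5;10;', '2;26;'), ('14;', '3;8;');

WITH nums AS (
    SELECT t.ID,
           CAST(s.value AS int) AS n,
           -- rank each number within its row, smallest first
           ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY CAST(s.value AS int)) AS rn
    FROM @tbl AS t
    CROSS APPLY STRING_SPLIT(CONCAT(t.ColString1, t.ColString2), ';') AS s
    WHERE s.value <> ''   -- drop the empty piece after the trailing ';'
)
SELECT ID,
       MAX(CASE WHEN rn = 1 THEN n END) AS Number1,
       MAX(CASE WHEN rn = 2 THEN n END) AS Number2,
       MAX(CASE WHEN rn = 3 THEN n END) AS Number3,
       MAX(CASE WHEN rn = 4 THEN n END) AS Number4,
       MAX(CASE WHEN rn = 5 THEN n END) AS Number5
       -- etc. for additional columns
FROM nums
GROUP BY ID
ORDER BY ID;
```

Rows with fewer than five numbers get NULL in the remaining columns, which matches the desired output in the question.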
While waiting for a DDL and sample data population, etc., here is a conceptual example for you. It is using XQuery and its FLWOR expression.
CTE does most of the heavy lifting:
Concatenates both columns values into one string. CONCAT() function protects against NULL values.
Converts it into XML data type.
Sorts XML elements by converting their values to int data type in the FLWOR expression.
Filters out XML elements with no legit values.
The rest is trivial.
SQL
-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, col1 VARCHAR(100), col2 VARCHAR(100));
INSERT INTO @tbl (col1, col2)
VALUES
('1;3;5;12;', '4;6;'),
('1;5;10;', '2;26;');
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = ';';
;WITH rs AS
(
SELECT *
, CAST('<root><r><![CDATA[' +
REPLACE(CONCAT(col1, col2), @separator, ']]></r><r><![CDATA[') +
']]></r></root>' AS XML).query('<root>
{
for $x in /root/r[text()]
order by xs:int($x)
return $x
}
</root>') AS sortedXML
FROM @tbl
)
SELECT ID
, c.value('(r[1]/text())[1]','INT') AS Number1
, c.value('(r[2]/text())[1]','INT') AS Number2
, c.value('(r[3]/text())[1]','INT') AS Number3
-- continue with the rest of the columns
FROM rs CROSS APPLY sortedXML.nodes('/root') AS t(c);
Output
+----+---------+---------+---------+
| ID | Number1 | Number2 | Number3 |
+----+---------+---------+---------+
| 1 | 1 | 3 | 4 |
| 2 | 1 | 2 | 5 |
+----+---------+---------+---------+

sql-remove dashes from string column

In a stored procedure, I have this field:
LTRIM(ISNULL(O.Column1, ''))
If there is a dash (-) at the end of the value, I want to remove it, but only when the dash is at the start or end of the value.
Any suggestions
EDIT:
Microsoft SQL Server 2014 12.0.5546.0
Expected output:
1)input: "abc-abc" //output: "abc-abc"
2)input: "abc-" //output: "abc"
3)input: "abc" //output: "abc"
I think you might be stuck with string manipulation here.
The CASE expression here takes the LTRIM/RTRIM result from your column and checks both ends for a dash, and then each end for a dash. If dashes exist, it strips them out. It's not pretty, and won't perform well on a mountain of data, but will do what you need.
Data setup:
create table trim (col1 varchar(10));
insert trim (col1)
values
('abc'),
(' abc-'),
('abc- '),
('abc-abc '),
(' -abc'),
('-abc '),
(NULL),
(''),
(' -abc- ');
The query:
select
case
when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
and left(ltrim(rtrim(isnull(col1,''))),1) = '-'
then substring(ltrim(rtrim(isnull(col1,''))),2,len(ltrim(rtrim(isnull(col1,''))))-2)
when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
then left(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
when left(ltrim(rtrim(isnull(col1,''))),1) = '-'
then right(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
else ltrim(rtrim(isnull(col1,'')))
end as trimmed
from trim;
Results:
+---------+
| trimmed |
+---------+
| abc |
| abc |
| abc |
| abc-abc |
| abc |
| abc |
| |
| |
| abc |
+---------+
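The repeated LTRIM/RTRIM/ISNULL calls in the CASE above can be computed once with CROSS APPLY. The following is only a sketch of that restructuring, not a tested replacement. It should run on SQL Server 2014, and the aliases c, l, t, lead_dash and trail_dash are made up for illustration.

```sql
SELECT v.col1,
       -- skip one character at the start and/or end when a dash sits there
       SUBSTRING(c.s, 1 + l.lead_dash, LEN(c.s) - l.lead_dash - t.trail_dash) AS trimmed
FROM (VALUES ('abc'), (' abc-'), ('abc-abc '), (' -abc- '), (NULL), ('')) AS v(col1)
-- clean the value once: NULL becomes '', surrounding spaces are removed
CROSS APPLY (SELECT LTRIM(RTRIM(ISNULL(v.col1, ''))) AS s) AS c
-- 1 when the cleaned value starts with a dash, otherwise 0
CROSS APPLY (SELECT CASE WHEN LEFT(c.s, 1) = '-' THEN 1 ELSE 0 END AS lead_dash) AS l
-- 1 when it ends with a dash; LEN > 1 stops a lone '-' being counted twice
CROSS APPLY (SELECT CASE WHEN RIGHT(c.s, 1) = '-' AND LEN(c.s) > 1 THEN 1 ELSE 0 END AS trail_dash) AS t;
```

Like the CASE version, this removes at most one dash from each end and leaves dashes in the middle alone.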
Since the database is not mentioned, here is where to find how to do it for each one:
SQL Server
Remove the last character in a string in T-SQL?
Oracle
Remove last character from string in sql plus
Postgresql
Postgresql: Remove last char in text-field if the column ends with minus sign
MySQL
Strip last two characters of a column in MySQL
You can use the RIGHT function, along with SUBSTRING, to achieve the result.
SELECT CASE WHEN RIGHT(stringVal,1)= '-' THEN SUBSTRING(stringVal,1,LEN(stringVal)-1)
ELSE stringVal END AS ModifiedString
from
( VALUES ('abc-abc'), ('abc-'),('abc')) as t(stringVal)
+----------------+
| ModifiedString |
+----------------+
| abc-abc |
| abc |
| abc |
+----------------+

Teradata SQL select string of multiple capital letters in a text field

Any help would be much appreciated on figuring out how to identify Acronyms within a text field that has mixed upper and lower case letters.
For example, we might have
"we used the BBQ sauce on the Chicken"
I need my query to SELECT "BBQ" and nothing else in the cell.
There could be multiple capitalized strings per row.
The output should include the uppercase string.
Any ideas are much appreciated!!
This is going to be kind of ugly. I tried to use REGEXP_SPLIT_TO_TABLE to just pull out the all caps words, but couldn't make it work.
I would do it by first using strtok_split_to_table, so each word will end up in its own row.
First, some dummy data:
create volatile table vt
(id integer,
col1 varchar(20))
on commit preserve rows;
insert into vt
values (1,'foo BAR');
insert into vt
values (2,'fooBAR');
insert into vt
values(3,'blah FOO FOO blah');
We can use strtok_split_to_table on this:
select
t.*
from table
(strtok_split_to_table(vt.id ,vt.col1,' ')
returns
(tok_key integer
,tok_num INTEGER
,tok_value VARCHAR(30)
)) AS t
That will split each value into separate rows, using a space as a delimiter.
Finally, we can compare each of those values to that value in upper case:
select
vt.id,
vt.col1,
tok_key,
tok_num,
tok_value,
case when upper(t.tok_value) = t.tok_value (CASESPECIFIC) then tok_value else '0' end as test_output
from
(
select
t.*
from table
(strtok_split_to_table(vt.id ,vt.col1,' ')
returns
(tok_key integer
,tok_num INTEGER
,tok_value VARCHAR(30)
)) AS t
) t
inner join vt
on t.tok_key = vt.id
order by id,tok_num
Taking our lovely sample data, you'll get:
+----+-------------------+---------+---------+-----------+-------------+
| id | col1 | tok_key | tok_num | tok_value | TEST_OUTPUT |
+----+-------------------+---------+---------+-----------+-------------+
| 1 | foo BAR | 1 | 1 | foo | 0 |
| 1 | foo BAR | 1 | 2 | BAR | BAR |
| 2 | fooBAR | 2 | 1 | fooBAR | 0 |
| 3 | blah FOO FOO blah | 3 | 1 | blah | 0 |
| 3 | blah FOO FOO blah | 3 | 2 | FOO | FOO |
| 3 | blah FOO FOO blah | 3 | 3 | FOO | FOO |
| 3 | blah FOO FOO blah | 3 | 4 | blah | 0 |
+----+-------------------+---------+---------+-----------+-------------+
Defining acronyms as all uppercase words with 2 to 5 characters with a '\b[A-Z]{2,5}\b' regex:
WITH cte AS
( -- using @Andrew's Volatile Table
SELECT *
FROM vt
-- only rows containing acronyms
WHERE RegExp_Similar(col1, '.*\b[A-Z]{2,5}\b.*') = 1
)
SELECT
outkey,
tokenNum,
CAST(RegExp_Substr(Token, '[A-Z]*') AS VARCHAR(5)) AS acronym -- 1st uppercase word
--,token
FROM TABLE
( RegExp_Split_To_Table
( cte.id,
cte.col1,
-- split before an acronym, might include additional characters after
-- [^A-Z]*? = any number of non uppercase letters (removed)
-- (?= ) = positive lookahead, i.e. check, but don't remove
'[^A-Z]*?(?=\b[A-Z]{2,5}\b)',
'' -- defaults to case sensitive
) RETURNS
( outKey INT,
TokenNum INT,
Token VARCHAR(30000) -- adjust to match the size of your input column
)
) AS t
WHERE acronym <> ''
I am not 100% sure what you are trying to do, but I think you have several options, for example:
Option 1) check if the acronym (like BBQ) exists in the string (basic syntax)
SELECT CHARINDEX ('BBQ',@string)
In this case you would need a table of all known acronyms you want to check for, and then test each of them against your string and return the ones that match.
DECLARE @string VARCHAR(100)
SET @string = 'we used the BBQ sauce on the Chicken'
-- assumes a lookup table [acrs] like this:
--+--- acronym-----+
--+ BBQ +
--+ IBM +
--+ AMD +
--+ ETC +
--+----------------+
SELECT acronym FROM [acrs] WHERE CHARINDEX ([acronym], @string ) > 0
This should return: 'BBQ'
Option 2) load all the uppercase characters into a temp table or similar for further logic and processing. I think you could use something like this:
DECLARE @string VARCHAR(100)
SET @string = 'we used the BBQ sauce on the Chicken'
-- make a table of all uppercase letters and process them individually
;WITH cte_loop(position, acrn)
AS (
SELECT 1, SUBSTRING(@string, 1, 1)
UNION ALL
SELECT position + 1, SUBSTRING(@string, position + 1, 1)
FROM cte_loop
WHERE position < LEN(@string)
)
SELECT position, acrn, ascii(acrn) AS [ascii]
FROM cte_loop
WHERE ascii(acrn) > 64 AND ascii(acrn) < 91 -- see the ASCII table for all codes
This returns one row per uppercase letter, with its position in the string and its ASCII code.

IF statement within a formula in a SQL query

Let's say I have a table with two numeric columns: NUM and DEN.
I need to extract the ratio NUM/DEN only if DEN isn't 0: otherwise the ratio should be 0.
Something like this:
select ID, [...] AS RATIO
from Table
where [...] is some kind of equivalent of the excel formula IF(DEN=0;0;NUM/DEN).
Is there a way to perform this kind of query?
Many thanks!
This should work:
case
when DEN = 0 then 0
else NUM/DEN
end
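One thing to watch, shown here as a sketch with made-up inline data: if NUM and DEN are integer columns, SQL Server does integer division, so 1/2 gives 0. Multiplying by 1.0 first forces a decimal result. The second column does the same thing with NULLIF and COALESCE instead of CASE. The column names RATIO_CASE and RATIO_NULLIF are only for illustration.

```sql
SELECT ID,
       -- CASE version; 1.0 * NUM avoids integer division
       CASE WHEN DEN = 0 THEN 0 ELSE 1.0 * NUM / DEN END AS RATIO_CASE,
       -- NULLIF turns a zero denominator into NULL, COALESCE maps that NULL back to 0
       COALESCE(1.0 * NUM / NULLIF(DEN, 0), 0) AS RATIO_NULLIF
FROM (VALUES (1, 1, 2), (2, 5, 0), (3, 6, 3)) AS T(ID, NUM, DEN);
```

Both columns should give 0.5 for ID 1, 0 for ID 2 and 2 for ID 3 (shown with trailing decimal places).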
Yes, what you are looking for is case. It has two versions:
case [variable]
when [value1] then [output1]
when [value2] then [output2]
when [value3] then [output3]
...
else [outputdefault] end
and
case when [Boolean(True/false) expression 1] then [output1]
when [Boolean(True/false) expression 2] then [output2]
when [Boolean(True/false) expression 3] then [output3]
...
else [outputdefault] end
In SQL Server 2012 and later you can use IIF:
IIF ( boolean_expression, true_value, false_value )
If you are using SQL Server you could use a CASE expression like w0lf mentioned, or you could use IIF like so:
select fullname, iif(age > 21, 'Allowed', 'Not Allowed') as status
from test;
Example:
create table test (
fullname varchar (20),
age int
);
insert into test values
('John', 10),
('Matt', 90),
('Jane', 25),
('Ruby', 80),
('Randy', null);
Result
| fullname | status |
|----------|-------------|
| John | Not Allowed |
| Matt | Allowed |
| Jane | Allowed |
| Ruby | Allowed |
| Randy | Not Allowed |
The same thing can be written as
select fullname, case when age > 21 then 'Allowed' else 'Not Allowed' end as status
from test;
The CASE expression is supported by most database engines.
If you are dealing with null and not null values, you could also use coalesce like so:
select fullname, coalesce(age, 999) as status
from test;
The result will be:
| fullname | status |
|----------|--------|
| John | 10 |
| Matt | 90 |
| Jane | 25 |
| Ruby | 80 |
| Randy | 999 |
At first you may think that coalesce means "if age is null then 999 else age". It does, but more generally coalesce returns the first non-null value in its argument list. So coalesce(null, null, 45) returns 45, and coalesce(null, 33, 45) returns 33.
Feel free to play around with SQL Fiddle: http://sqlfiddle.com/#!6/a41a7/6