Conditional extraction of fixed width data using SQL - sql

I have a scenario where I pull out data from multiple tables and the output is fixed width format. The fixed width output will look like:
Current output:
1001RJOHNKEITH25 20181017 NA
1002CDWANEKANE36 20181010 RR
1003CMIKAYLAGN44 20181011 RR
Desired output:
1001RJOHNKEITH25 20181017 NA
1002CDWANEKANE36 NA
1003RMIKAYLAGN44 20181010 RR
In this output, 1001 is the Person ID, R/C is the hard-coded indicator, then comes the name, age and registration date, record type. There is a condition for Registration date. If the record indicator is R, the registration date will show up. Otherwise, it should be null. I am not sure how to write a condition based on the fixed width field.
Rextester demo attached : https://rextester.com/MKESI50760
Any help?!

OK, this is a little messy. Because your output is fixed width, you can wrap the query in a view or a CTE (shown below) and then access specific positions in the string with the SUBSTRING function.
There are a lot of drawbacks to doing this. If anybody changes the order or size of the fields being concatenated, it all breaks. So, in the spirit of answering your question, this is a way to do it, but I don't think it's a good way.
WITH BaseQuery as
(
select
t.Cid,
cast
(
concat(
LEFT(CONCAT(isnull(t.Cid,''),space(5)),5), -- PersonID
LEFT(CONCAT(isnull
((case when t.registeredonline = '1' and t.recordtype = 'NA' then 'R'
else 'C' end),''),space(10)),10),-- Record Indicator
LEFT(CONCAT(isnull(t.name,''),space(14)),14), --name
LEFT(CONCAT(isnull(t.age,''),space(5)),5), --age
LEFT(CONCAT(isnull(t.registrationdate,''),space(14)),14), -- Registration date should show up when record indicator is 'R'
LEFT(CONCAT(isnull(t.recordtype,''),space(3)),3) --Record type
) as nvarchar(max)
) result
from #temp t
)
SELECT
CONCAT(
SUBSTRING(result, 1, 34) -- portion before the 'registration date' region
, CASE WHEN SUBSTRING (RESULT, 6, 1) = 'R' THEN SUBSTRING (RESULT, 35, 10) ELSE SPACE(10) END
, SUBSTRING (RESULT, 46, 5)
)
FROM
BaseQuery
This gives the result:
1001 R JOHNKEITH 25 2018-10-17 NA
1002 C DWANEKANE 36 RR
1003 C JOHNKEITH 44 RR

The line
LEFT(CONCAT(isnull(t.registrationdate,''),space(14)),14)
becomes
CASE WHEN t.registeredonline = '1' and t.recordtype = 'NA' THEN LEFT(CONCAT(isnull(t.registrationdate,''),space(14)),14) ELSE SPACE(14) END, -- Registration date should show up when record indicator is 'R'
This just wraps the original line in a condition that checks whether the record indicator is 'R'.
The condition is the same one used for the record indicator in the query from your link.

You just need to update one line in your query:
LEFT(CONCAT(isnull(t.registrationdate,''),space(14)),14), -- Registration date should show up when record indicator is 'R'
becomes
LEFT(CONCAT(isnull(CASE WHEN t.registeredonline = '1' and t.recordtype = 'NA' THEN CONVERT(char(10), t.registrationdate,126) ELSE NULL END,''),space(14)),14), -- Registration date should show up when record indicator is 'R'
This checks the same logic as the record indicator and puts in spaces instead of a date when that logic does not evaluate to 'R'.
The CONVERT is needed because otherwise the NULL date ends up showing as 1900-01-01.
Hope it helps.
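A minimal, self-contained sketch of the idea shared by the two answers above: decide on the source columns, before concatenating, rather than on positions in the finished string. The table variable, its data types and the sample rows are assumptions made for illustration; only the column names and the indicator rule come from the query above. The field widths here are also chosen just for the demo.

```sql
-- Hypothetical sample data; column names follow the query above, types are assumed.
DECLARE @People TABLE (
    Cid              INT,
    name             VARCHAR(14),
    age              INT,
    registrationdate DATE,
    recordtype       CHAR(2),
    registeredonline CHAR(1)
);
INSERT INTO @People VALUES
    (1001, 'JOHNKEITH', 25, '20181017', 'NA', '1'),
    (1002, 'DWANEKANE', 36, '20181010', 'RR', '0');

SELECT CONCAT(
           LEFT(CONCAT(p.Cid, SPACE(5)), 5),          -- Person ID
           i.Indicator,                               -- Record indicator
           LEFT(CONCAT(p.name, SPACE(14)), 14),       -- Name
           LEFT(CONCAT(p.age, SPACE(5)), 5),          -- Age
           CASE WHEN i.Indicator = 'R'                -- Registration date only for 'R'
                THEN CONVERT(CHAR(8), p.registrationdate, 112)
                ELSE SPACE(8)
           END,
           SPACE(1),
           p.recordtype                               -- Record type
       ) AS result
FROM @People AS p
-- Compute the indicator once so the same rule drives both fields.
CROSS APPLY (SELECT CASE WHEN p.registeredonline = '1' AND p.recordtype = 'NA'
                         THEN 'R' ELSE 'C' END AS Indicator) AS i;
```

Computing the indicator in a CROSS APPLY keeps the rule in one place, so the indicator and the date cannot drift apart if the rule changes.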

Dealing with fixed width data:
Data in a fixed-width text file or string is arranged in rows and
columns, with one entry per row. Each column has a fixed width,
specified in characters, which determines the maximum amount of data
it can contain. No delimiters are used to separate the fields in the
file.
To parse that data in T-SQL you can use SUBSTRING:
https://learn.microsoft.com/en-us/sql/t-sql/functions/substring-transact-sql?view=sql-server-2017
SUBSTRING ( expression ,start , length )
Here's an example:
DECLARE @SampleData TABLE
(
[LineData] NVARCHAR(255)
);
INSERT INTO @SampleData (
[LineData]
)
VALUES ( '1001RJOHNKEITH25 20181017 NA' )
, ( '1002CDWANEKANE36 20181010 RR' )
, ( '1003CMIKAYLAGN44 20181011 RR' );
SELECT SUBSTRING([LineData], 1, 4) AS [PersonId]
, SUBSTRING([LineData], 5, 1) AS [Indicator]
, SUBSTRING([LineData], 6, 9) AS [Name]
, SUBSTRING([LineData], 15, 2) AS [Age]
, SUBSTRING([LineData], 18, 8) AS [RegDate]
, SUBSTRING([LineData], 27, 2) AS [RecordType]
, *
FROM @SampleData;
So in your example, where you want to evaluate whether or not the "Indicator" is 'R', you can get to that value with:
SUBSTRING([LineData], 5, 1)
I'm not sure how that fits into what you have been tasked with. Based on other comments, there's more to how this "Indicator" is determined.
It's not ideal, but you could parse out all the fields and put them back together, evaluating the indicator field as you go. Alternatively, use STUFF in a CASE expression to replace the date with blanks when the indicator in the string is not 'R'.
DECLARE @SampleData TABLE
(
[LineData] NVARCHAR(255)
);
INSERT INTO @SampleData (
[LineData]
)
VALUES ( '1001RJOHNKEITH25 20181017 NA' )
, ( '1002CDWANEKANE36 20181010 RR' )
, ( '1003CMIKAYLAGN44 20181011 RR' );
--We check for R using SUBSTRING.
--When the indicator is not R, we overwrite the 8 characters of the registration date with 8 blanks,
--so the line keeps its fixed width.
SELECT CASE WHEN SUBSTRING([LineData], 5, 1) = 'R' THEN [LineData]
ELSE STUFF([LineData], 18, 8, SPACE(8))
END AS [LineData]
FROM @SampleData;

Select ColA, CASE WHEN ColB (Criteria here) THEN NULL ELSE ColB END AS ColB, ColC

Related

SQL : extract next character from string where multiple separators exist

Azure MSSQL Database
I have a column that contains values stored per transaction. The string can contain up to 7 values, separated by a '-'.
I need to be able to extract the value that is stored after the 3rd '-'. The issue is that the length of this column (and the characters that come before the 3rd '-') can vary.
For example:
DIM VALUE
1. NHL--WA-S-MOSG-SER-
2. VDS----HAST-SER-
3. ---D---SER
Row 1 needs to return 'S'
Row 2 needs to return '-'
Row 3 needs to return 'D'
This is by no means an optimal solution, but it works in SQL Server. 😊
TempTable added for testing purposes. Maybe it gives you a hint as of where to start.
Edit: added reference for string_split function (works from SQL Server 2016 up).
CREATE TABLE #tempStrings (
VAL VARCHAR(30)
);
INSERT INTO #tempStrings VALUES ('NHL--WA-S-MOSG-SER-');
INSERT INTO #tempStrings VALUES ('VDS----HAST-SER-');
INSERT INTO #tempStrings VALUES ('---D---SER');
INSERT INTO #tempStrings VALUES ('A-V-D-C--SER');
SELECT
t.VAL,
CASE t.PART WHEN '' THEN '-' ELSE t.PART END AS PART
FROM
(SELECT
t.VAL,
ROW_NUMBER() OVER (PARTITION BY VAL ORDER BY (SELECT NULL)) AS IX,
value AS PART
FROM #tempStrings t
CROSS APPLY string_split(VAL, '-')) t
WHERE t.IX = 4; --DASH COUNT + 1
DROP TABLE #tempStrings;
Output is...
VAL PART
---D---SER D
A-V-D-C--SER C
NHL--WA-S-MOSG-SER- S
VDS----HAST-SER- -
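One caveat on the query above: the two-argument form of STRING_SPLIT does not guarantee that rows come back in input order, so numbering them with ROW_NUMBER() OVER (... ORDER BY (SELECT NULL)) is not strictly safe. Since the question targets Azure SQL Database, a hedged alternative is the optional third argument (enable_ordinal), which is available there and in SQL Server 2022 but not in earlier versions. The table variable below is just a stand-in for the temp table used above.

```sql
-- Sketch only: enable_ordinal requires Azure SQL Database or SQL Server 2022+.
DECLARE @Strings TABLE (VAL VARCHAR(30));
INSERT INTO @Strings (VAL) VALUES
    ('NHL--WA-S-MOSG-SER-'),
    ('VDS----HAST-SER-'),
    ('---D---SER'),
    ('A-V-D-C--SER');

SELECT s.VAL,
       -- An empty element means two dashes were adjacent; report it as '-'.
       CASE p.value WHEN '' THEN '-' ELSE p.value END AS PART
FROM @Strings AS s
CROSS APPLY STRING_SPLIT(s.VAL, '-', 1) AS p   -- 1 = also return the ordinal column
WHERE p.ordinal = 4;                           -- dash count + 1
```

The ordinal column gives each element its 1-based position in the input, so the filter no longer depends on the order in which rows happen to be produced.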
If you always want the fourth element then using CHARINDEX is relatively straightforward:
DROP TABLE IF EXISTS #tmp;
CREATE TABLE #tmp (
rowId INT IDENTITY PRIMARY KEY,
xval VARCHAR(30) NOT NULL
);
INSERT INTO #tmp
VALUES
( 'NHL--WA-S-MOSG-SER-' ),
( 'VDS----HAST-SER-' ),
( '---D---SER' ),
( 'A-V-D-C--SER' );
;WITH cte AS
( -- Work out the position of the 3rd dash
SELECT
rowId,
xval,
CHARINDEX( '-', xval, CHARINDEX( '-', xval, CHARINDEX( '-', xval ) + 1 ) + 1 ) + 1 xstart
FROM #tmp t
), cte2 AS
( -- Work out the length for the substring function
SELECT rowId, xval, xstart, CHARINDEX( '-', xval, xstart) - (xstart) AS xlen
FROM cte
)
SELECT rowId, ISNULL( NULLIF( SUBSTRING( xval, xstart, xlen ), '' ), '-' ) xpart
FROM cte2
I also did a volume test at 1 million rows and this was by far the fastest method compared with STRING_SPLIT, OPENJSON and a recursive CTE (the worst at high volume). The downside is that this method is less extensible: the nested CHARINDEX calls have to be rewritten if you want, say, the second or fifth item instead.

Query SQL with similar values

I have to query a table using a string like 12345678 for the comparison, but the stored value looks like 12.345.678. The following query does not return anything:
SELECT * FROM TABLA WHERE CAMPO = '12345678'
Here CAMPO holds the value 12.345.678. If I replace = with LIKE, it does not return the data either:
SELECT * FROM TABLA WHERE CAMPO like '12345678%'
SELECT * FROM TABLA WHERE CAMPO like '%12345678'
SELECT * FROM TABLA WHERE CAMPO like '%12345678%'
None of the 3 previous queries works for me. How can I write this query?
The value can have 7, 8 or 9 digits, and the dots separate every 3 digits counting from the end.
Use the REPLACE() function to remove all the dots '.', as in:
SELECT *
FROM(
VALUES ('12.345.678'),
('23.456.789')
) T(CAMPO)
WHERE REPLACE(CAMPO, '.', '') = '12345678';
Your query should be
SELECT * FROM TABLA WHERE REPLACE(CAMPO, '.', '') = '12345678';
You can compare the string without the dots to REPLACE(StringWithDots, '.', '').
I recommend converting the value to a numeric type,
so you can use the < and > operators and all functions that require a number.
The best way to achieve this is to remove any unnecessary dots and convert the commas to dots, like this:
CONVERT(NUMERIC(10, 2),
REPLACE(
REPLACE('7.000,45', '.', ''),
',', '.'
)
)
I hope this will help you out.
A SARGABLE solution would be to write a function that takes your target value ('12345678') and inserts the separators ('.') every third character from right to left. The result ('12.345.678') can then be used in a where clause and benefit from an index on CAMPO.
The following code demonstrates an approach without creating a user-defined function (UDF). Instead, a recursive common table expression (CTE) is used to process the input string three characters at a time to build the dotted target string. The result is used in a query against a sample table.
To see the results from the recursive CTE replace the final select statement with the commented select immediately above it.
-- Sample data.
declare @Samples as Table ( SampleId Int Identity, DottedDigits VarChar(20) );
insert into @Samples ( DottedDigits ) values
( '1' ), ( '12' ), ( '123' ), ( '1.234' ), ( '12.345' ),
( '123.456' ), ( '1.234.567' ), ( '12.345.678' ), ( '123.456.789' );
select * from @Samples;
-- Query the data.
declare @Target as VarChar(15) = '12345678';
with
Target as (
-- Get the first group of up to three characters from the tail of the string ...
select
Cast( Right( @Target, 3 ) as VarChar(20) ) as TargetString,
Cast( Left( @Target, case when Len( @Target ) > 3 then Len( @Target ) - 3 else 0 end ) as VarChar(20) ) as Remainder
union all
-- ... and concatenate the next group with a dot in between.
select
Cast( Right( Remainder, 3 ) + '.' + TargetString as VarChar(20) ),
Cast( Left( Remainder, case when Len( Remainder ) > 3 then Len( Remainder ) - 3 else 0 end ) as VarChar(20) )
from Target
where Remainder != ''
)
-- To see the intermediate results replace the final select with the line commented out below:
--select TargetString from Target;
select SampleId, DottedDigits
from @Samples
where DottedDigits = ( select TargetString from Target where Remainder = '' );
An alternative approach would be to add an indexed computed column to the table that contains Replace( CAMPO, '.', '' ).
If the table containing IDs like 12.345.678 is big (contains many records), I would add a computed column that removes the dots, persist it and put an index on it. If the ID never contains any characters other than digits and dots and has no significant leading zeros, you can also cast it to INT or BIGINT. That way you lose a little time when inserting a record, but queries run at maximum speed and save processor power.
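A rough sketch of the persisted computed column idea, using a temp table as a stand-in for TABLA. The computed column name and the index name are hypothetical, and the VARCHAR(20) size is an assumption about how long CAMPO can get.

```sql
-- Hypothetical demo table standing in for TABLA; names other than CAMPO are assumptions.
CREATE TABLE #Tabla (
    CAMPO VARCHAR(20) NOT NULL,
    -- The CAST keeps the computed column narrow; without it the REPLACE result
    -- is typed as a much wider varchar than an index key should hold.
    CampoSinPuntos AS CAST(REPLACE(CAMPO, '.', '') AS VARCHAR(20)) PERSISTED
);
CREATE INDEX IX_Tabla_CampoSinPuntos ON #Tabla (CampoSinPuntos);

INSERT INTO #Tabla (CAMPO) VALUES ('1.234.567'), ('12.345.678'), ('123.456.789');

-- The predicate is now a plain equality on an indexed column.
SELECT CAMPO FROM #Tabla WHERE CampoSinPuntos = '12345678';

DROP TABLE #Tabla;
```

Indexes on computed columns require certain session SET options (for example QUOTED_IDENTIFIER and ANSI_NULLS ON). The usual client defaults satisfy them, but it is worth checking if the CREATE INDEX fails.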

Order string alpha numerically A1-1-1, A1-2-1, A1-10-1, A1-2-2, A1-2-3 etc

I have a column with different length strings which has dashes (-) that separates alphanumeric strings.
The string could look like "A1-2-3".
I need to order by first "A1" then "2" then "3"
I want to achieve the following order for the column:
A1
A1-1-1
A1-1-2
A1-1-3
A1-2-1
A1-2-2
A1-2-3
A1-7
A2-1-1
A2-1-2
A2-1-3
A2-2-1
A2-2-2
A2-2-3
A2-10-1
A2-10-2
A2-10-3
A10-1-1
A10-1-2
A10-1-3
A10-2-1
A10-2-2
A10-2-3
I can separate the string with the following code:
declare @string varchar(max) = 'A1-2-3'
declare @first varchar(max) = SUBSTRING(@string,1,charindex('-',@string)-1)
declare @second varchar(max) = substring(@string, charindex('-',@string) + 1, charindex('-',reverse(@string))-1)
declare @third varchar(max) = right(@string,charindex('-',reverse(@string))-1)
select @first, @second, @third
With the above logic I thought that I could use the following:
Note this only regards strings with 2 dashes
select barcode from tabelWithBarcodes
order by
case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(SUBSTRING(barcode,1,charindex('-',barcode)-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
SUBSTRING(barcode,1,(charindex('-',barcode)-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(substring(barcode, charindex('-',barcode) + 1, charindex('-',reverse(barcode))-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
substring(barcode, charindex('-',barcode) + 1, charindex('-',reverse(barcode))-1)
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
len(right(barcode,charindex('-',reverse(barcode))-1))
end
, case when len(barcode) - len(replace(barcode,'-','')) = 2 then
right(barcode,charindex('-',reverse(barcode))-1)
end
But the sorting is not working for the second and third section of the string.
(I haven't added the code for checking if the string has only 1 or no dash in it for simplicity)
Not sure if I'm on the right path here.
Is anybody able to solve this?
This is not pretty, however...
USE Sandbox;
GO
WITH VTE AS(
SELECT V.SomeString
--Randomised order
FROM (VALUES ('A1-1-1'),
('A10-1-3'),
('A10-2-2'),
('A1-1-3'),
('A10-2-1'),
('A2-2-2'),
('A1-2-1'),
('A1-2-2'),
('A2-1-1'),
('A10-1-2'),
('B2-1-2'),
('A1'),
('A2-2-1'),
('A2-10-3'),
('A10-2-3'),
('A2-1-2'),
('B1-4'),
('A2-10-2'),
('A2-2-3'),
('A10-1-1'),
('A1-A1-3'),
('A1-7'),
('A2-10-1'),
('A2-1-3'),
('A1-1-2'),
('A1-2-3')) V(SomeString)),
Splits AS(
SELECT V.SomeString,
DS.Item,
DS.ItemNumber,
CONVERT(int,STUFF((SELECT '' + NG.token
FROM dbo.NGrams8k(DS.item,1) NG
WHERE TRY_CONVERT(int, NG.Token) IS NOT NULL
ORDER BY NG.position
FOR XML PATH('')),1,0,'')) AS NumericPortion
FROM VTE V
CROSS APPLY dbo.DelimitedSplit8K(V.SomeString,'-') DS),
Pivoted AS(
SELECT S.SomeString,
MIN(CASE V.P1 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P1Alpha,
MIN(CASE V.P1 WHEN S.Itemnumber THEN S.NumericPortion END) AS P1Numeric,
MIN(CASE V.P2 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P2Alpha,
MIN(CASE V.P2 WHEN S.Itemnumber THEN S.NumericPortion END) AS P2Numeric,
MIN(CASE V.P3 WHEN S.Itemnumber THEN REPLACE(S.Item, S.NumericPortion,'') END) AS P3Alpha,
MIN(CASE V.P3 WHEN S.Itemnumber THEN S.NumericPortion END) AS P3Numeric
FROM Splits S
CROSS APPLY (VALUES(1,2,3)) AS V(P1,P2,P3)
GROUP BY S.SomeString)
SELECT P.SomeString
FROM Pivoted P
ORDER BY P.P1Alpha,
P.P1Numeric,
P.P2Alpha,
P.P2Numeric,
P.P3Alpha,
P.P3Numeric;
This outputs:
A1
A1-1-1
A1-1-2
A1-1-3
A1-2-1
A1-2-2
A1-2-3
A1-7
A1-A1-3
A2-1-1
A2-1-2
A2-1-3
A2-2-1
A2-2-2
A2-2-3
A2-10-1
A2-10-2
A2-10-3
A10-1-1
A10-1-2
A10-1-3
A10-2-1
A10-2-2
A10-2-3
B1-4
B2-1-2
This makes use of 2 user-defined functions. The first is DelimitedSplit8K or DelimitedSplit8k_Lead (I used DelimitedSplit8K as I don't have the other on my sandbox at the moment). The second is NGrams8k.
I really should explain how this works, but yuck... (edit coming).
OK... (/sigh) What it does. Firstly, we split the data into its relevant parts using DelimitedSplit8K(_Lead). Then, within the SELECT, we use FOR XML PATH to get (only) the numerical part of that string (for example, for 'A10' we get '10') and we convert it to a numerical value (an int).
Then we pivot that data out into its respective parts: the alphabetical part and the numerical part of each item. So, for the value 'A10-A1-12' we end up with the row:
'A', 10, 'A', 1, '', 12
Then, now that we've pivoted the data, we sort it by each column individually. And voila.
This will fall over if you have a value like 'A1A' or '1B1', and honestly, I'm not changing it to cater for that. This was messy, and really isn't what the RDBMS should be doing.
Up to 3 dashes can be covered by fiddling with replace & parsename & patindex:
declare @TabelWithBarcodes table (id int primary key identity(1,1), barcode varchar(20) not null, unique (barcode));
insert into @TabelWithBarcodes (barcode) values
('2-2-3'),('A2-2-2'),('A2-2-1'),('A2-10-3'),('A2-10-2'),('A2-10-1'),('A2-1-3'),('A2-1-2'),('A2-1-1'),
('A10-2-3'),('A10-2-2'),('A10-2-10'),('A10-1-3'),('AA10-A111-2'),('A10-1-1'),
('A1-7'),('A1-2-3'),('A1-2-12'),('A1-2-1'),('A1-1-3'),('B1-1-2'),('A1-1-1'),('A1'),('A10-10-1'),('A12-10-1'), ('AB1-2-E1') ;
with cte as
(
select barcode,
replace(BarCode, '-', '.')
+ replicate('.0', 3 - (len(BarCode)-len(replace(BarCode, '-', '')))) as x
from @TabelWithBarcodes
)
select *
, substring(parsename(x,4), 1, patindex('%[0-9]%',parsename(x,4))-1)
,cast(substring(parsename(x,4), patindex('%[0-9]%',parsename(x,4)), 10) as int)
,substring(parsename(x,3), 1, patindex('%[0-9]%',parsename(x,3))-1)
,cast(substring(parsename(x,3), patindex('%[0-9]%',parsename(x,3)), 10) as int)
,substring(parsename(x,2), 1, patindex('%[0-9]%',parsename(x,2))-1)
,cast(substring(parsename(x,2), patindex('%[0-9]%',parsename(x,2)), 10) as int)
,substring(parsename(x,1), 1, patindex('%[0-9]%',parsename(x,1))-1)
,cast(substring(parsename(x,1), patindex('%[0-9]%',parsename(x,1)), 10) as int)
from cte
order by
substring(parsename(x,4), 1, patindex('%[0-9]%',parsename(x,4))-1)
,cast(substring(parsename(x,4), patindex('%[0-9]%',parsename(x,4)), 10) as int)
,substring(parsename(x,3), 1, patindex('%[0-9]%',parsename(x,3))-1)
,cast(substring(parsename(x,3), patindex('%[0-9]%',parsename(x,3)), 10) as int)
,substring(parsename(x,2), 1, patindex('%[0-9]%',parsename(x,2))-1)
,cast(substring(parsename(x,2), patindex('%[0-9]%',parsename(x,2)), 10) as int)
,substring(parsename(x,1), 1, patindex('%[0-9]%',parsename(x,1))-1)
,cast(substring(parsename(x,1), patindex('%[0-9]%',parsename(x,1)), 10) as int)
How it works:
extend each barcode to 4 groups by adding a trailing .0 for each missing group
split each barcode into 4 groups
split each group into leading characters and trailing digits
sort by the leading characters first
then by the digits cast as int
An alternative approach would be to use your technique to split the string into its 3 component parts, then left pad those strings with leading zeros (or characters of your choice). That avoids any issues where a part contains letters rather than just digits. However, it does mean that parts whose alphabetic prefixes differ in length may not sort as you expect. Here's the code to play with (using the definitions from @dnoeth's excellent answer):
;with cte as
(
select barcode
, case
when barcode like '%-%' then
substring(barcode,1,charindex('-',barcode)-1)
else
barcode
end part1
, case
when barcode like '%-%' then
substring(barcode, charindex('-',barcode) + 1, case
when barcode like '%-%-%' then
(charindex('-',barcode,charindex('-',barcode) + 1)) - 1
else
len(barcode)
end
- charindex('-',barcode))
else
''
end part2
, case
when barcode like '%-%-%' then
right(barcode,charindex('-',reverse(barcode))-1) --note: assumes you don't have %-%-%-%
else
''
end part3
from @TabelWithBarcodes
)
select barcode
, part1, part2, part3
, right('0000000000' + coalesce(part1,''), 10) lpad1
, right('0000000000' + coalesce(part2,''), 10) lpad2
, right('0000000000' + coalesce(part3,''), 10) lpad3
from cte
order by lpad1, lpad2, lpad3

Date format query in CASE expression

I am trying to output a date format within the below SQL
CASE
WHEN E.A_EXTTRNDTETME IS NOT NULL then LEFT(E.A_EXTTRNDTETME, 4)
+SUBSTRING(E.A_EXTTRNDTETME, 5, 2)+SUBSTRING(E.A_EXTTRNDTETME, 7, 2)
+'-'+SUBSTRING(E.A_EXTTRNDTETME, 9, 2)+':'+SUBSTRING(E.A_EXTTRNDTETME, 11, 2)
+':'+SUBSTRING(E.A_EXTTRNDTETME, 13,2)+'.'+SUBSTRING(E.A_EXTTRNDTETME, 15,3)
WHEN E.A_EXTTRNDTETME IS NULL then
(
SELECT TOP 1 A_EXTTRNDTETME FROM T_ATH_EXE
WHERE A_PAREXEID = E.A_EXEID ORDER BY A_ADDDTETME
)
ELSE ' text'
END as [TransactTime],
The second WHEN statement is returning 20180322141422883 but I would like this to be in the following format, like the values from the first branch:
20180322-14:14:22.883
But don't know how to do it inside the SELECT statement, please help.
You can put your entire query into a subquery, so you only have to apply the formatting once. You can also use COALESCE, which is a shorter way to write CASE WHEN x IS NOT NULL THEN x ELSE y END. Finally, for datetime values CONVERT has built-in style options that avoid all that messy string manipulation (the CAST and CONVERT documentation lists all the options).
SELECT *, TransactTime = COALESCE
(
CONVERT(char(8), dt, 112) + '-' + CONVERT(char(12), dt, 108),
' text'
)
FROM -- your larger query here
(
SELECT
dt = COALESCE
(
E.A_EXTTRNDTETME,
(
SELECT TOP (1) A_EXTTRNDTETME
FROM T_ATH_EXE
WHERE A_PAREXEID = E.A_EXEID
ORDER BY A_ADDDTETME
)
), -- other columns...
FROM -- table...
) AS sub;
I got it working by using the following:-
CASE
WHEN E.A_EXTTRNDTETME IS NULL then (SELECT top 1 LEFT(A_EXTTRNDTETME, 4)+SUBSTRING(A_EXTTRNDTETME, 5, 2)+SUBSTRING(A_EXTTRNDTETME, 7, 2)+'-'+SUBSTRING(A_EXTTRNDTETME, 9, 2)+':'+SUBSTRING(A_EXTTRNDTETME, 11, 2)+':'+SUBSTRING(A_EXTTRNDTETME, 13,2)+'.'+SUBSTRING(A_EXTTRNDTETME, 15,3) FROM T_ATH_EXE WHERE A_PAREXEID = E.A_EXEID ORDER BY A_ADDDTETME)
WHEN E.A_EXTTRNDTETME IS NOT NULL then LEFT(E.A_EXTTRNDTETME, 4)+SUBSTRING(E.A_EXTTRNDTETME, 5, 2)+SUBSTRING(E.A_EXTTRNDTETME, 7, 2)+'-'+SUBSTRING(E.A_EXTTRNDTETME, 9, 2)+':'+SUBSTRING(E.A_EXTTRNDTETME, 11, 2)+':'+SUBSTRING(E.A_EXTTRNDTETME, 13,2)+'.'+SUBSTRING(E.A_EXTTRNDTETME, 15,3)
ELSE ' text'
END as [TransactTime],
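The repeated SUBSTRING chain can be shortened. As a sketch, assuming the value is always a 17-character string in yyyymmddhhmmssfff form (as in the sample 20180322141422883), nested STUFF calls can insert the separators, working from right to left so earlier positions are not shifted. The inline VALUES rows are made up for the demo.

```sql
-- Sketch: insert '.', ':', ':' and '-' into a 17-character timestamp string.
SELECT v.raw,
       STUFF(STUFF(STUFF(STUFF(v.raw, 15, 0, '.'),   -- before the milliseconds
                               13, 0, ':'),          -- before the seconds
                         11, 0, ':'),                -- before the minutes
                   9, 0, '-') AS TransactTime        -- between date and time
FROM (VALUES ('20180322141422883'), (NULL)) AS v(raw);
-- First row:  20180322-14:14:22.883
-- Second row: NULL (STUFF returns NULL for a NULL input)
```

Because the expression returns NULL for a NULL input, it can be applied once to a COALESCE of the two candidate values instead of being written out in each CASE branch.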

SQL Server REPLACE AND CHECK IF EXISTS

I have to check the string with the following scenarios in WHERE condition.
The data ProductId stored in the database can be like
7314-3337 sometimes with - symbol and not prefixed with 19
73143337 sometimes without symbol and not prefixed with 19
1973143337 correct format
197314-3337 sometimes with - symbol
I need to filter the records by ProductId, and the input is in the correct format, i.e. 1973143337
WHERE P.ProductId=@ProductId
How can I filter it if the data is stored in the other 3 formats?
How do I strip the - with REPLACE and prefix 19 if it does not exist in SQL Server?
Please check these 2 approaches.
The first is very simple and the second uses a small trick. (I think you should go with the second option, which covers everything.)
declare @t table (ProductId varchar(100))
insert into @t
values
('7314-3337')
,('73143337')
,('1973143337')
,('197314-3337')
,('73683337')
,('73143338')
declare @valuetosearch varchar(100) = '1973143337'
-- The first approach is very simple, but it does not work in every scenario because it strips every '19', not just the prefix. The second approach is fine.
--select CHARINDEX ( '19','1973143337'), SUBSTRING('1973143337',3,len('1973143337'))
--select * from
--@t
--where
--replace(REPLACE(ProductId ,'-','') ,'19','') = replace(REPLACE(@valuetosearch ,'-','') ,'19','')
select * from
@t
where
REPLACE( case when CHARINDEX ( '19',ProductId) = 1
then SUBSTRING( ProductId ,3,LEN(ProductId))
else ProductId
end ,'-','')
=
REPLACE ( case when CHARINDEX ( '19',@valuetosearch) = 1
then SUBSTRING( @valuetosearch ,3,LEN(@valuetosearch))
else @valuetosearch
end ,'-','')
You should first sanitize your data; if it is not consistent, you won't be able to get the correct results.
For prefixing with 19:
UPDATE foo
SET ProductId = '19' + ProductId
WHERE Left(ProductID, 2) <> '19'
For removing the '-':
UPDATE foo
SET ProductId = REPLACE(ProductId, '-', '')
Then you should be able to get the results you want.
UPDATE:
You could construct a CTE with the results in a single format, and then, filter that CTE:
WITH cte (
FormattedPID
,ProductId
)
AS (
SELECT CASE
WHEN LEFT(ProductId, 2) = '19'
THEN REPLACE(ProductId, '-', '')
ELSE '19' + REPLACE(ProductId, '-', '')
END
,ProductId
FROM foo
)
SELECT FormattedPID
,ProductId
FROM cte
WHERE FormattedPID = @ProductID
You could make sure the column is in the correct format like this:
Remove the - by replacing it with an empty string (197314-3337 -> 1973143337, 7314-3337 -> 73143337).
Add 19 at the beginning (1973143337 -> 191973143337, 73143337 -> 1973143337).
Take the 10 rightmost characters of the result and compare them to the input (191973143337 -> 1973143337, 1973143337 -> 1973143337).
In Transact-SQL:
WHERE RIGHT('19' + REPLACE(P.ProductId, '-', ''), 10) = @ProductId
Of course, this means no index seek for you, because we are applying functions to the column.
An alternative to that would be to produce the three non-standard formats out of the input:
cut off the initial 19 (1973143337 -> 73143337);
insert the - (1973143337 -> 197314-3337);
insert the - and cut off the 19 (1973143337 -> 197314-3337 -> 7314-3337).
In Transact-SQL:
WHERE P.ProductId IN (
@ProductId,
SUBSTRING(@ProductId, 3, 999999999),
STUFF(@ProductId, 7, 0, '-'),
SUBSTRING(STUFF(@ProductId, 7, 0, '-'), 3, 999999999)
)
This way if there is an index on P.ProductId, it will be used efficiently.
Both approaches assume that the length of the correct format is fixed.
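To try the two predicates side by side, here is a small self-contained sketch. The table variable and its rows are made up for the demo (the four formats from the question plus one non-matching value), and the fixed 10-character length of the correct format is assumed, as noted above.

```sql
-- Hypothetical sample data: the four stored formats plus one value that must not match.
DECLARE @Products TABLE (ProductId VARCHAR(20));
INSERT INTO @Products (ProductId) VALUES
    ('7314-3337'), ('73143337'), ('1973143337'), ('197314-3337'), ('73143338');

DECLARE @ProductId VARCHAR(20) = '1973143337';

-- Approach 1: normalise the column (simple, but prevents an index seek).
SELECT ProductId
FROM @Products
WHERE RIGHT('19' + REPLACE(ProductId, '-', ''), 10) = @ProductId;

-- Approach 2: derive the other three formats from the input (index-friendly).
SELECT ProductId
FROM @Products
WHERE ProductId IN (
    @ProductId,                                                     -- 1973143337
    SUBSTRING(@ProductId, 3, LEN(@ProductId)),                      -- 73143337
    STUFF(@ProductId, 7, 0, '-'),                                   -- 197314-3337
    SUBSTRING(STUFF(@ProductId, 7, 0, '-'), 3, LEN(@ProductId))     -- 7314-3337
);
```

Both queries should return the first four rows and skip 73143338.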