How to check and remove numbers from a string? - sql

I'm using an SSIS package to bring data from one table to another. However, a field in the table (GroupName) brings through data with numbers at the end. This comes in two forms. Either the string is a name followed by fewer than four numeric characters (e.g. Group Name 22),
or it is a name followed by four numeric characters (e.g. Group Name 2012). I'd like to check the data in SQL to see whether the number of numeric characters at the end of the string is less than 4 and, if so, remove the numbers.
Can anyone help?

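You can use PATINDEX to extract the numeric part and measure its length. Note that this finds the first digit anywhere in the string, so it assumes the name itself contains no digits: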
SELECT
SUBSTRING('Group Name 2012'
,PATINDEX('%[0-9]%'
,'Group Name 2012')
,LEN('Group Name 2012')) as NumberOnly
,LEN( SUBSTRING('Group Name 2012'
,PATINDEX('%[0-9]%'
,'Group Name 2012')
,LEN('Group Name 2012'))) as Numberlength

Alternatively, add a Derived Column transformation with
NumericCheck = RIGHT(stringvariable,4)
and then, in a separate Derived Column transformation,
(DT_I4)NumericCheck == (DT_I4)NumericCheck ? 1 : 0
Note: You will need to configure the error output to "Ignore Failure" for this check. Then add a Conditional Split which sends the zero values to be updated via an OLE DB Command.
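For comparison, the rule stated in the question (drop a trailing run of digits only when it is shorter than four characters) can be sketched outside the database. The Python below is only an illustration of that logic, not part of either answer above; the function name is hypothetical.

```python
import re


def strip_short_numeric_suffix(name):
    """Remove trailing digits only when there are fewer than four of them."""
    # Find the run of ASCII digits at the very end of the string, if any.
    match = re.search(r"[0-9]+$", name)
    if match and len(match.group()) < 4:
        # Drop the digits and the whitespace that separated them from the name.
        return name[:match.start()].rstrip()
    # No trailing digits, or four or more of them: leave the value alone.
    return name


print(strip_short_numeric_suffix("Group Name 22"))    # Group Name
print(strip_short_numeric_suffix("Group Name 2012"))  # Group Name 2012
```

The key point is anchoring the digit search to the end of the string, so digits inside the name itself are never touched.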

Related

How to retrieve the required string in SQL having a variable length parameter

Here is my problem statement:
I have a single-column table containing data like this:
ROW-1>> 7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX
ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX
Here I want to split these values on '-' and load them into a new table. There are 11 segments in this string separated by '-', and therefore 11 columns. The problem is:
A. The lengths of these values vary; however, I have to keep each value at its standard length or at whatever length it actually has,
e.g. 7302- should have four characters, but if the value is shorter than that, keep it as it is (e.g. if it is 73, it should populate 73).
Therefore, I have to split the values as well as maintain their integrity. The code I am writing is:
select
SUBSTR(PROFILE_ID,1,(case when length(instr(PROFILE_ID,'-')<>4) THEN (instr(PROFILE_ID,'-') else SUBSTR(PROFILE_ID,1,4) end)
)AS [RQUIRED_COLUMN_NAME]
from [TABLE_NAME];
I am getting a right parenthesis error.
Please help.
I used the REGEXP_SUBSTR SQL function to solve the above issue. Here is an example:
select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX','[^-]+',1,1);
Output is: 7302 -- which is the 1st segment of the string
Similarly, the second segment of the string can be obtained by replacing the final 1 with 2 in the above query.
Example: select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX','[^-]+',1,2);
Output: 2210177000, which is the 2nd segment of the string
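As a cross-check outside the database, the same split can be sketched in Python. This is only an illustration: the function name is hypothetical, and the expected count of 11 segments is taken from the question.

```python
def split_segments(profile_id, expected=11):
    """Split a '-'-separated profile id into its segments, keeping each as-is."""
    segments = profile_id.split("-")
    if len(segments) != expected:
        # Fail loudly rather than load a misaligned row into the target table.
        raise ValueError(f"expected {expected} segments, got {len(segments)}")
    return segments


row = "7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX"
segments = split_segments(row)
print(segments[0])  # 7302
print(segments[1])  # 2210177000
```

Because the split is driven by the delimiter rather than by fixed positions, a shorter first segment such as 73 is returned unchanged.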

Count of Character mismatch between two values in SQL Server stored procedure

How to find the count of character mismatch between two values in a SQL Server stored procedure? It's not about the length difference.
For example, there are two values,
Reference value ='Visual'
Test value ='Visvolc'
Mismatch = 3 chars (4th, 5th and 7th position)
There are 3 character mismatches based on position. Please help.
Sounds like what you are looking for is “edit distance”: the number of add/remove/replace operations needed to convert one string into another. Check this post: Levenshtein distance in T-SQL
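The position-by-position count described in the question and the edit distance suggested in the answer are different measures that happen to agree on this example. A minimal Python sketch of both, with hypothetical function names, assuming that extra characters in the longer string count as mismatches:

```python
from itertools import zip_longest


def positional_mismatches(reference, test):
    """Return the 1-based positions at which the two strings differ."""
    # zip_longest pads the shorter string with None, so extra characters
    # in the longer string are reported as mismatches too.
    pairs = zip_longest(reference, test)
    return [i for i, (a, b) in enumerate(pairs, start=1) if a != b]


def levenshtein(a, b):
    """Edit distance: minimum number of inserts, deletes and replacements."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # replace or keep
        prev = curr
    return prev[-1]


print(positional_mismatches("Visual", "Visvolc"))  # [4, 5, 7]
print(levenshtein("Visual", "Visvolc"))            # 3
```

The two diverge as soon as characters shift: comparing "abc" with "xabc" gives four positional mismatches but an edit distance of 1, so which one to port to T-SQL depends on what "mismatch" should mean.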

Manipulating a record data

I am looking for a way to take data from one table and manipulate it and bring it to another table using an SQL query.
I have a Column called NumberStuff that has data like this in it:
INC000000315482
I need to cut off the INC portion of the number and convert it into an integer and store it into a Column in another table so that it ends up looking like this:
315482
Any help would be much appreciated!
Another approach is to use the REPLACE function, either in T-SQL or as a Derived Column expression in SSIS.
TSQL
SELECT REPLACE(T.MyColumn, 'INC', '') AS ReplacedINC
SSIS
REPLACE([MyColumn], "INC", "")
This removes the character-based data. Converting the result to a numeric type before storing it in the target table is then optional; otherwise you can let the implicit conversion happen.
Simplest version of what you need:
select cast(right(column,6) as int) from table
Are you doing this in an SSIS expression, or somewhere else? Is it always the last 6 characters?
This is a little less dependent on your formatting. It trims the first 3 characters, the cast removes the leading 0s, and the number can be any length.
select cast(SUBSTRING('INC000000315482',4,LEN('INC000000315482') - 3) as int)
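The strip-then-convert idea shared by these answers can be sketched in a few lines of Python. This is an illustration only; the function name and the check on the prefix are assumptions added here, not something the answers do.

```python
def ticket_number(raw, prefix="INC"):
    """Strip a known prefix and convert the remaining digits to an integer."""
    if not raw.startswith(prefix):
        # Hypothetical safeguard: reject rows that do not follow the pattern.
        raise ValueError(f"value does not start with {prefix!r}: {raw!r}")
    # int() discards the leading zeros, as the cast does in the SQL versions.
    return int(raw[len(prefix):])


print(ticket_number("INC000000315482"))  # 315482
```

As in the SUBSTRING version, nothing here depends on the number being exactly six digits long.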

SQL Server comma delimiter for money datatype

I import Excel files via SSIS to SQL-Server. I have a temp table to get everything in nvarchar. For four columns I then cast the string to money type and put in my target table.
In my temp table one of those four columns (let me call it X) has a comma as the decimal separator; the rest have a dot. Don't ask me why, I have everything in my SSIS set the same.
In my Excel the delimiter is a comma as well.
So now in my target table I have everything in comma values but the X column now moves the comma two places to the right and looks like this:
537013,00 instead of 5370,13 which was the original cell value in the temp and excel column.
I was thinking this is a culture setup problem, but then it should either work or fail on all of these columns.
a) Why do I receive dot values in my temp table when my Excel displays comma?
b) how can I fix this? Can I replace the "," in the temp table with a dot?
UPDATE
I think I found the reason but not the solution:
In this X column in excel the first three cells are empty - the other three columns all start with 0. If I fill these three cells of X with 0s then I also get the dot in my temp table and the right value in my target table. But of course I have to use the Excel file as is.
Any ideas on that?
Try the code below. It checks whether the string value being converted to money is numeric. If it is, the value is converted to the money data type; otherwise, NULL is returned. It also replaces the decimal symbol and the digit grouping symbol of the string value to match those SQL Server expects.
DECLARE @MoneyString VARCHAR(20)
SET @MoneyString = '$ 1.000,00'
SET @MoneyString = REPLACE(REPLACE(@MoneyString, '.', ''), ',', '.')
SELECT CAST(CASE WHEN ISNUMERIC(@MoneyString) = 1
THEN @MoneyString
ELSE NULL END AS MONEY)
As for the reason why you get a comma instead of a dot, I have no clue. My first guess would be culture settings, but you already checked that. Did a web search turn up anything?
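The replace-then-convert step in the answer above can be sketched in Python to show what it does to different inputs. This is an illustration under the assumption that the input uses a comma as the decimal symbol and a dot for digit grouping; the function name is hypothetical.

```python
from decimal import Decimal, InvalidOperation


def parse_comma_decimal(text):
    """Convert a string such as '1.000,00' to a Decimal, or None if invalid."""
    # Drop the grouping dots first, then turn the decimal comma into a dot.
    cleaned = text.strip().replace(".", "").replace(",", ".")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        # Mirrors the ELSE NULL branch of the T-SQL version.
        return None


print(parse_comma_decimal("1.000,00"))  # 1000.00
print(parse_comma_decimal("5370,13"))   # 5370.13
print(parse_comma_decimal("5370.13"))   # 537013
```

The last line shows the pitfall: a value that already uses a dot as its decimal symbol loses it and becomes 100 times larger, so the replacement should be applied only to the column that actually arrives with commas.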
First, the decimal separator in SQL is the decimal point; it's only Excel that is using the comma. You can change the formatting in Excel: format the Excel column as money and specify a decimal point as the separator. Then, in the SSIS import wizard, split out the transformation of the column so it imports to a money data type. It's a culture thing, but "delimiter" tends to be used for whatever signifies the end of one column and the start of the next (as in CSV).
HTH
Well, that's a longstanding problem with Excel. The driver samples only the first few rows (8 by default) to infer the data type. It can lead to endless issues. I think your solution has to be to process everything as a string in the way Yaroslav suggested, or to supply an Excel template with predefined, formatted data type columns into which the values are then inserted. It's a PITA.

Problem with MySQL Select query with "IN" condition

I found a weird problem with a MySQL select statement that has "IN" in the where clause:
I am trying this query:
SELECT ads.*
FROM advertisement_urls ads
WHERE ad_pool_id = 5
AND status = 1
AND ads.id = 23
AND 3 NOT IN (hide_from_publishers)
ORDER BY rank desc
In the above SQL, hide_from_publishers is a column of the advertisement_urls table, with values stored as comma-separated integers, e.g. 4,2 or 2,7,3.
As a result, if two records have the above two values in hide_from_publishers, the query should return only the record for "4,2", but it returns both records.
Now, if I change the value of hide_from_publishers for the second record to 3,2,7 and run the query again, it returns a single record, which is the correct output.
If I use literal values there instead of hide_from_publishers, i.e. (2,7,3), it does recognize them and returns a single record.
Any thoughts about this strange problem or am I doing something wrong?
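There is a difference between the tuple (1, 2, 3) and the string "1, 2, 3". The former is three values, the latter is a single string value that just happens to look like three values to human eyes. As far as the DBMS is concerned, it's still a single value. When MySQL compares the number 3 with that string, it converts the string to a number using its leading digits, so "2,7,3" is treated as 2 and "3,2,7" as 3, which explains the results you are seeing.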
If you want more than one value associated with a record, you shouldn't be storing it as a comma-separated value within a single field, you should store it in another table and join it. That way the data remains structured and you can use it as part of a query.
You need to treat the comma-delimited hide_from_publishers column as a string. You can use the LOCATE function to determine if your value exists in the string.
Note that I've added leading and trailing commas to both strings so that a search for "3" doesn't accidentally match "13".
select ads.*
from advertisement_urls ads
where ad_pool_id = 5
and status = 1
and ads.id = 23
and locate(',3,', concat(',', hide_from_publishers, ',')) = 0
order by rank desc
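The reason for wrapping both strings in commas can be seen in a small Python sketch of the same test. It is an illustration only; the function name is hypothetical, and it assumes the stored list contains no spaces.

```python
def in_csv_list(value, csv_text):
    """Check whether value appears as a whole item in a comma-separated string."""
    # Wrapping both sides in commas forces a match on complete items only.
    return f",{value}," in f",{csv_text},"


print("3" in "13,2")           # True: the naive substring test matches inside 13
print(in_csv_list(3, "13,2"))  # False
print(in_csv_list(3, "2,7,3")) # True
print(in_csv_list(3, "4,2"))   # False
```

The query above applies the same idea and keeps a row only when the wrapped search string is not found.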
You need to split the string of values into separate values. See this SO question...
Can Mysql Split a column?
As well as the supplied example...
http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/
Here is another SO question:
MySQL query finding values in a comma separated string
And the suggested solution:
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_find-in-set