I am inserting data from a text file into a table,
and the stored procedure is as follows:
ALTER PROCEDURE [dbo].[SPROC]
-- Add the parameters for the stored procedure here
@DBName VARCHAR(30),
@FirstLine VARCHAR(MAX)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @InsertQuery NVARCHAR(MAX), @Summary_ID INT
SET @InsertQuery = 'INSERT INTO ' + @DBName + '.[dbo].[Summary] VALUES ( ' + @FirstLine + '); SELECT SCOPE_IDENTITY()'
EXEC SP_EXECUTESQL @InsertQuery
END
The problem is this: if the line has 8 values, it gets inserted. If the 8th value is missing from the text file, it throws the error "Column name or number of supplied values does not match table definition."
How can I handle this error so that, when there is no value for the 8th column, NULL is inserted instead?
Reading File code:
public static String[] ReadSummary_Into_Array(string filepath)
{
StreamReader sreader = null;
int counter = 0;
try
{
sreader = new StreamReader(filepath);
string line = sreader.ReadLine();
// Handle an empty file
if (line == null) return null;
// Handle a file whose first line is empty
if (line == "") return new String[0];
FirstLine = line;
string cleaned_line = line.Replace("''", "'-'").Replace("','", "''");
string word = "";
List<string> data = new List<string>();
MatchCollection matches = Regex.Matches(cleaned_line, @"'([^']*)");
//String[] words = null;
foreach (Match match in matches)
{
word = match.ToString();
string word_edited = word.Replace("\'", "");
if (word_edited != string.Empty)
{
data.Add(word_edited);
counter++;
}
}
// The summary line is reconstructed into a string array
Summary = data.ToArray();
return Summary;
}
finally
{
// Release the file handle even if reading fails
if (sreader != null) sreader.Close();
}
}
If a value is missing, you must specify the column names explicitly.
Try it like this:
ALTER PROCEDURE [dbo].[SPROC]
-- Add the parameters for the stored procedure here
@DBName VARCHAR(30),
@FirstLine VARCHAR(MAX)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @InsertQuery NVARCHAR(MAX), @Summary_ID INT, @CountS INT
-- Count the commas in the line: 7 commas means all 8 values are present
SET @CountS = LEN(@FirstLine) - LEN(REPLACE(@FirstLine, ',', ''))
-- T-SQL has no THEN / END IF; a single statement follows IF and ELSE directly
IF @CountS >= 7
SET @InsertQuery = 'INSERT INTO ' + @DBName + '.[dbo].[Summary] VALUES ( ' + @FirstLine + '); SELECT SCOPE_IDENTITY()'
ELSE
SET @InsertQuery = 'INSERT INTO ' + @DBName + '.[dbo].[Summary](SerialNumber,AssetNumber,SoftwareRev,TechName,StartTime,StopTime,Status) VALUES ( ' + @FirstLine + '); SELECT SCOPE_IDENTITY()'
EXEC SP_EXECUTESQL @InsertQuery
END
Similar to Vignesh's answer, but in your C# code:
// Seven values are separated by six commas, so the eighth value is missing
if (Regex.Matches(strLine, ",").Count == 6)
{
strLine = strLine + ", null";
}
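Another way to sidestep the column-count error, sketched in Python rather than C#: pad the parsed values with None up to the expected width and pass them as query parameters, so a missing 8th value arrives as NULL and the line is never concatenated into the SQL text. This is only an illustration under assumptions: sqlite3 stands in for SQL Server, the eighth column name `Notes` is hypothetical (the question never names it), and the line is assumed to hold single-quoted values separated by commas.

```python
import csv
import io
import sqlite3

EXPECTED_COLUMNS = 8  # the table in the question takes eight values


def parse_line(line):
    """Split a line of single-quoted, comma-separated values and pad it with None."""
    reader = csv.reader(io.StringIO(line), quotechar="'", skipinitialspace=True)
    values = next(reader)
    if len(values) > EXPECTED_COLUMNS:
        raise ValueError("more values than columns")
    # Missing trailing values become None, which the driver sends as NULL
    return values + [None] * (EXPECTED_COLUMNS - len(values))


def insert_summary(conn, line):
    """Insert one parsed line with bound parameters and return the new row id."""
    placeholders = ", ".join("?" * EXPECTED_COLUMNS)
    cursor = conn.execute(f"INSERT INTO Summary VALUES ({placeholders})", parse_line(line))
    return cursor.lastrowid


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # "Notes" is a made-up name for the unnamed eighth column
    conn.execute(
        "CREATE TABLE Summary (SerialNumber, AssetNumber, SoftwareRev, TechName,"
        " StartTime, StopTime, Status, Notes)"
    )
    insert_summary(conn, "'S1','A1','1.0','Bob','10:00','10:05','PASS'")
    print(conn.execute("SELECT * FROM Summary").fetchall())
```

Binding parameters also avoids the quoting and injection problems that come with building the INSERT from the raw line.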
So, I'm looking at implementing fuzzy logic matching at my company and am having trouble getting good results. For starters, I'm trying to match company names against those on a list supplied by other companies.
My first attempt was to use Soundex, but it looks like Soundex only compares the first few sounds in the company name, so longer company names were too easily confused for one another.
I'm now working on my second attempt using the Levenshtein distance comparison. It looks promising, especially if I remove the punctuation first. However, I'm still having trouble finding duplicates without too many false positives.
One of the issues I have is companies such as widgetsco vs widgets inc. If I compare only a substring the length of the shorter name, I also pick up things like BBC University and CBC University campus. I suspect that a score using a combination of distance and longest common substring may be the solution.
Has anyone managed to build an algorithm that does this kind of matching with limited false positives?
We have had good results on name and address matching using a Metaphone function created by Lawrence Philips. It works in a similar way to Soundex, but creates a sound/consonant pattern for the whole value. You may find this useful in conjunction with some other techniques, especially if you can strip some of the fluff like 'co.' and 'inc.' as mentioned in other comments:
create function [dbo].[Metaphone](@str as nvarchar(70), @KeepNumeric as bit = 0)
returns nvarchar(25)
/*
Metaphone Algorithm
Created by Lawrence Philips.
Metaphone presented in article in "Computer Language" December 1990 issue.
*********** BEGIN METAPHONE RULES ***********
Lawrence Philips' RULES follow:
The 16 consonant sounds:
|--- ZERO represents "th"
|
B X S K J T F H L M N P R 0 W Y
Drop vowels
Exceptions:
Beginning of word: "ae-", "gn", "kn-", "pn-", "wr-" ----> drop first letter
Beginning of word: "wh-" ----> change to "w"
Beginning of word: "x" ----> change to "s"
Beginning of word: vowel or "H" + vowel ----> Keep it
Transformations:
B ----> B unless at the end of word after "m", as in "dumb", "McComb"
C ----> X (sh) if "-cia-" or "-ch-"
S if "-ci-", "-ce-", or "-cy-"
SILENT if "-sci-", "-sce-", or "-scy-"
K otherwise
K "-sch-"
D ----> J if in "-dge-", "-dgy-", or "-dgi-"
T otherwise
F ----> F
G ----> SILENT if "-gh-" and not at end or before a vowel
"-gn" or "-gned"
"-dge-" etc., as in above rule
J if "gi", "ge", "gy" if not double "gg"
K otherwise
H ----> SILENT if after vowel and no vowel follows
or "-ch-", "-sh-", "-ph-", "-th-", "-gh-"
H otherwise
J ----> J
K ----> SILENT if after "c"
K otherwise
L ----> L
M ----> M
N ----> N
P ----> F if before "h"
P otherwise
Q ----> K
R ----> R
S ----> X (sh) if "sh" or "-sio-" or "-sia-"
S otherwise
T ----> X (sh) if "-tia-" or "-tio-"
0 (th) if "th"
SILENT if "-tch-"
T otherwise
V ----> F
W ----> SILENT if not followed by a vowel
W if followed by a vowel
X ----> KS
Y ----> SILENT if not followed by a vowel
Y if followed by a vowel
Z ----> S
*/
as
begin
declare @Result varchar(25)
,@str3 char(3)
,@str2 char(2)
,@str1 char(1)
,@strp char(1)
,@strLen tinyint
,@cnt tinyint
set @strLen = len(@str)
set @cnt = 0
set @Result = ''
-- Preserve first 5 numeric values when required
if @KeepNumeric = 1
begin
set @Result = case when isnumeric(substring(@str,1,1)) = 1
then case when isnumeric(substring(@str,2,1)) = 1
then case when isnumeric(substring(@str,3,1)) = 1
then case when isnumeric(substring(@str,4,1)) = 1
then case when isnumeric(substring(@str,5,1)) = 1
then left(@str,5)
else left(@str,4)
end
else left(@str,3)
end
else left(@str,2)
end
else left(@str,1)
end
else ''
end
set @str = right(@str,len(@str)-len(@Result))
end
--Process beginning exceptions
set @str2 = left(@str,2)
if @str2 = 'wh'
begin
set @str = 'w' + right(@str , @strLen - 2)
set @strLen = @strLen - 1
end
else
if @str2 in('ae', 'gn', 'kn', 'pn', 'wr')
begin
set @str = right(@str , @strLen - 1)
set @strLen = @strLen - 1
end
set @str1 = left(@str,1)
if @str1 = 'x'
set @str = 's' + right(@str , @strLen - 1)
else
if @str1 in ('a','e','i','o','u')
begin
set @str = right(@str, @strLen - 1)
set @strLen = @strLen - 1
set @Result = @Result + @str1
end
while @cnt <= @strLen
begin
set @cnt = @cnt + 1
set @str1 = substring(@str,@cnt,1)
set @strp = case when @cnt <> 0
then substring(@str,(@cnt-1),1)
else ' '
end
-- Check if the current character is the same as the previous character.
-- If we are keeping numbers, only compare non-numeric characters.
if case when @KeepNumeric = 1 and @strp = @str1 and isnumeric(@str1) = 0 then 1
when @KeepNumeric = 0 and @strp = @str1 then 1
else 0
end = 1
continue -- Skip this loop
set @str2 = substring(@str,@cnt,2)
set @Result = case when @KeepNumeric = 1 and isnumeric(@str1) = 1
then @Result + @str1
when @str1 in('f','j','l','m','n','r')
then @Result + @str1
when @str1 = 'q'
then @Result + 'k'
when @str1 = 'v'
then @Result + 'f'
when @str1 = 'x'
then @Result + 'ks'
when @str1 = 'z'
then @Result + 's'
when @str1 = 'b'
then case when @cnt = @strLen
then case when substring(@str,(@cnt - 1),1) <> 'm'
then @Result + 'b'
else @Result
end
else @Result + 'b'
end
when @str1 = 'c'
then case when @str2 = 'ch' or substring(@str,@cnt,3) = 'cia'
then @Result + 'x'
else case when @str2 in('ci','ce','cy') and @strp <> 's'
then @Result + 's'
else @Result + 'k'
end
end
when @str1 = 'd'
then case when substring(@str,@cnt,3) in ('dge','dgy','dgi')
then @Result + 'j'
else @Result + 't'
end
when @str1 = 'g'
then case when substring(@str,(@cnt - 1),3) not in ('dge','dgy','dgi','dha','dhe','dhi','dho','dhu')
then case when @str2 in('gi', 'ge','gy')
then @Result + 'j'
else case when @str2 <> 'gn' or (@str2 <> 'gh' and @cnt+1 <> @strLen)
then @Result + 'k'
else @Result
end
end
else @Result
end
when @str1 = 'h'
then case when @strp not in ('a','e','i','o','u') and @str2 not in ('ha','he','hi','ho','hu')
then case when @strp not in ('c','s','p','t','g')
then @Result + 'h'
else @Result
end
else @Result
end
when @str1 = 'k'
then case when @strp <> 'c'
then @Result + 'k'
else @Result
end
when @str1 = 'p'
then case when @str2 = 'ph'
then @Result + 'f'
else @Result + 'p'
end
when @str1 = 's'
then case when substring(@str,@cnt,3) in ('sia','sio') or @str2 = 'sh'
then @Result + 'x'
else @Result + 's'
end
when @str1 = 't'
then case when substring(@str,@cnt,3) in ('tia','tio')
then @Result + 'x'
else case when @str2 = 'th'
then @Result + '0'
else case when substring(@str,@cnt,3) <> 'tch'
then @Result + 't'
else @Result
end
end
end
when @str1 = 'w'
then case when @str2 not in('wa','we','wi','wo','wu')
then @Result + 'w'
else @Result
end
when @str1 = 'y'
then case when @str2 not in('ya','ye','yi','yo','yu')
then @Result + 'y'
else @Result
end
else @Result
end
end
return @Result
end
You want to use something like Levenshtein Distance or another string comparison algorithm. You may want to take a look at this project on Codeplex.
http://fuzzystring.codeplex.com/
Are you using Access? If so, consider the '*' character, without the quotes. If you're using SQL Server, use the '%' character. However, this really isn't fuzzy logic, it's really the Like operator. If you really need fuzzy logic, export your data-set to Excel and load the AddIn from the URL below.
https://www.microsoft.com/en-us/download/details.aspx?id=15011
Read the instructions very carefully. It definitely works, and it works great, but you need to follow the instructions, and it's not completely intuitive. The first time I tried it, I didn't follow the instructions, and I wasted a lot of time trying to get it to work. Eventually I figured it out, and it worked great!!
I found success implementing a function I found here on Stack Overflow that would find the percentage of strings that match. You can then adjust tolerance till you get an appropriate amount of matches/mismatches. The function implementation will be listed below, but the gist is including something like this in your query.
DECLARE @tolerance DEC(18, 2) = 50;
WHERE dbo.GetPercentageOfTwoStringMatching(first_table.name, second_table.name) > @tolerance
Credit for the following percent matching function goes to Dragos Durlut, Dec 15 '11.
The credit for the LEVENSHTEIN function was included in the code by Dragos Durlut.
T-SQL Get percentage of character match of 2 strings
CREATE FUNCTION [dbo].[GetPercentageOfTwoStringMatching]
(
@string1 NVARCHAR(100)
,@string2 NVARCHAR(100)
)
RETURNS INT
AS
BEGIN
DECLARE @levenShteinNumber INT
DECLARE @string1Length INT = LEN(@string1)
, @string2Length INT = LEN(@string2)
DECLARE @maxLengthNumber INT = CASE WHEN @string1Length > @string2Length THEN @string1Length ELSE @string2Length END
SELECT @levenShteinNumber = [dbo].[LEVENSHTEIN] ( @string1 ,@string2)
DECLARE @percentageOfBadCharacters INT = @levenShteinNumber * 100 / @maxLengthNumber
DECLARE @percentageOfGoodCharacters INT = 100 - @percentageOfBadCharacters
-- Return the result of the function
RETURN @percentageOfGoodCharacters
END
-- =============================================
-- Create date: 2011.12.14
-- Description: http://blog.sendreallybigfiles.com/2009/06/improved-t-sql-levenshtein-distance.html
-- =============================================
CREATE FUNCTION [dbo].[LEVENSHTEIN](@left VARCHAR(100),
@right VARCHAR(100))
returns INT
AS
BEGIN
DECLARE @difference INT,
@lenRight INT,
@lenLeft INT,
@leftIndex INT,
@rightIndex INT,
@left_char CHAR(1),
@right_char CHAR(1),
@compareLength INT
SET @lenLeft = LEN(@left)
SET @lenRight = LEN(@right)
SET @difference = 0
IF @lenLeft = 0
BEGIN
SET @difference = @lenRight
GOTO done
END
IF @lenRight = 0
BEGIN
SET @difference = @lenLeft
GOTO done
END
GOTO comparison
COMPARISON:
IF ( @lenLeft >= @lenRight )
SET @compareLength = @lenLeft
ELSE
SET @compareLength = @lenRight
SET @rightIndex = 1
SET @leftIndex = 1
WHILE @leftIndex <= @compareLength
BEGIN
SET @left_char = substring(@left, @leftIndex, 1)
SET @right_char = substring(@right, @rightIndex, 1)
IF @left_char <> @right_char
BEGIN -- Would an insertion make them re-align?
IF( @left_char = substring(@right, @rightIndex + 1, 1) )
SET @rightIndex = @rightIndex + 1
-- Would a deletion make them re-align?
ELSE IF( substring(@left, @leftIndex + 1, 1) = @right_char )
SET @leftIndex = @leftIndex + 1
SET @difference = @difference + 1
END
SET @leftIndex = @leftIndex + 1
SET @rightIndex = @rightIndex + 1
END
GOTO done
DONE:
RETURN @difference
END
Note: If you need to compare two or more fields (which I don't think you do) you can add another call to the function in the WHERE clause with a minimum tolerance. I also found success averaging the percentMatching and comparing it against a tolerance.
DECLARE @tolerance DEC(18, 2) = 25;
--could have multiple different tolerances for each field (weighting some fields as more important to be matching)
DECLARE @avg_tolerance DEC(18, 2) = 50;
WHERE dbo.GetPercentageOfTwoStringMatching(first_table.name, second_table.name) > @tolerance
AND dbo.GetPercentageOfTwoStringMatching(first_table.address, second_table.address) > @tolerance
AND (dbo.GetPercentageOfTwoStringMatching(first_table.name, second_table.name)
+ dbo.GetPercentageOfTwoStringMatching(first_table.address, second_table.address)
) / 2 > @avg_tolerance
The benefit of this solution is that the tolerance variables can be specific to each field (weighting the importance of certain fields matching), and the average can ensure general matching across all fields.
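For anyone who wants to try the percentage idea outside the database, here is a rough Python sketch of the same scoring. It is not a port of the T-SQL above: it swaps in the textbook dynamic-programming Levenshtein distance instead of the single-pass function, and the names `percent_match` and `fields_match` are made up for illustration. The tolerances of 25 and 50 are simply the ones from the query above.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def percent_match(a: str, b: str) -> int:
    """100 minus the share of 'bad' characters, using integer division."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100  # two empty strings: avoid dividing by zero
    return 100 - levenshtein(a, b) * 100 // longest


def fields_match(pairs, tolerance=25, avg_tolerance=50) -> bool:
    """Every field must clear its tolerance and the average must clear avg_tolerance."""
    scores = [percent_match(a, b) for a, b in pairs]
    return all(s > tolerance for s in scores) and sum(scores) / len(scores) > avg_tolerance


if __name__ == "__main__":
    print(percent_match("kitten", "sitting"))
    print(fields_match([("Widgets Inc", "Widgets Inc."), ("1 Main St", "1 Main Street")]))
```

The guard for two empty strings is an addition of this sketch; the scoring itself follows the same "distance over longest length" formula.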
First, I suggest you make sure that you can't match on any other attribute and that company names are all you have (because fuzzy matching is bound to give you some false positives). If you want to go ahead with fuzzy matching, you could use the following steps:
Remove all stop words from the text, for example Co, Inc, etc.
If your database is very large, make use of an indexing method such as blocking or sorted neighbourhood indexing.
Finally, compute the fuzzy score using the Levenshtein distance. You could use the token_set_ratio or partial_ratio functions in Fuzzywuzzy.
Also, I found the following video which aims to solve the same problem: https://www.youtube.com/watch?v=NRAqIjXaZvw
The Nanonets blog also contains several resources on the subject that could potentially be helpful.
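The three steps above (strip stop words, block, then score) can be sketched roughly like this in Python. Several things here are assumptions rather than anything from the answer: the stop-word list is a hypothetical starting point, the blocking key is simply the first character of the cleaned name, and `difflib.SequenceMatcher` from the standard library stands in for a Levenshtein-based score such as the Fuzzywuzzy ratios.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical stop-word list; extend it for your own data
STOP_WORDS = {"co", "inc", "ltd", "llc", "corp"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop stop words."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in STOP_WORDS)


def block_key(normalized: str) -> str:
    """Very coarse blocking: only names sharing a first character are compared."""
    return normalized[:1]


def find_matches(ours, theirs, threshold=0.85):
    """Return (our_name, their_name, score) for pairs scoring at or above threshold."""
    blocks = defaultdict(list)
    for name in theirs:
        cleaned = normalize(name)
        blocks[block_key(cleaned)].append((name, cleaned))
    matches = []
    for name in ours:
        cleaned = normalize(name)
        for other, other_cleaned in blocks[block_key(cleaned)]:
            score = SequenceMatcher(None, cleaned, other_cleaned).ratio()
            if score >= threshold:
                matches.append((name, other, round(score, 2)))
    return matches


if __name__ == "__main__":
    print(find_matches(["Widgets Inc."], ["Widgets Co", "BBC University"]))
```

Blocking on the first character keeps BBC University and CBC University Campus from ever being compared, but it also hides genuine matches whose first letter is mistyped, so the key is a trade-off to tune. A fused name like widgetsco is not helped by token-based stop words either.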
How do I split a comma-separated string into strings inside a stored procedure and insert them into a table field?
Using Firebird 2.5
I am posting a modified version of Michael's procedure; maybe it will be useful for someone.
The changes are:
SPLIT_STRING is a selectable procedure.
A custom delimiter is possible.
It also handles cases where the delimiter is the first character of P_STRING.
set term ^ ;
create procedure split_string (
p_string varchar(32000),
p_splitter char(1) )
returns (
part varchar(32000)
)
as
declare variable lastpos integer;
declare variable nextpos integer;
begin
p_string = :p_string || :p_splitter;
lastpos = 1;
nextpos = position(:p_splitter, :p_string, lastpos);
if (lastpos = nextpos) then
begin
part = substring(:p_string from :lastpos for :nextpos - :lastpos);
suspend;
lastpos = :nextpos + 1;
nextpos = position(:p_splitter, :p_string, lastpos);
end
while (:nextpos > 1) do
begin
part = substring(:p_string from :lastpos for :nextpos - :lastpos);
lastpos = :nextpos + 1;
nextpos = position(:p_splitter, :p_string, lastpos);
suspend;
end
end^
set term ; ^
Here is a sample showing how to split the string and write the substrings into a table:
create procedure SPLIT_STRING (
AINPUT varchar(8192))
as
declare variable LASTPOS integer;
declare variable NEXTPOS integer;
declare variable TEMPSTR varchar(8192);
begin
AINPUT = :AINPUT || ',';
LASTPOS = 1;
NEXTPOS = position(',', :AINPUT, LASTPOS);
while (:NEXTPOS > 1) do
begin
TEMPSTR = substring(:AINPUT from :LASTPOS for :NEXTPOS - :LASTPOS);
insert into new_table("VALUE") values(:TEMPSTR);
LASTPOS = :NEXTPOS + 1;
NEXTPOS = position(',', :AINPUT, LASTPOS);
end
suspend;
end
Use the POSITION (http://www.firebirdsql.org/refdocs/langrefupd21-intfunc-position.html) and SUBSTRING (http://www.firebirdsql.org/refdocs/langrefupd21-intfunc-substring.html) functions in a WHILE ... DO statement.
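To make the shape of that loop concrete, here is the same find-then-cut idea sketched in Python, with `str.find` playing the role of POSITION and slicing playing the role of SUBSTRING. It is only an illustration of the technique, not Firebird code, and the function name is made up.

```python
def split_string(text: str, delimiter: str = ","):
    """Split text by repeatedly locating the next delimiter and slicing up to it."""
    parts = []
    last_pos = 0
    next_pos = text.find(delimiter, last_pos)      # POSITION(delimiter, text, last_pos)
    while next_pos != -1:
        parts.append(text[last_pos:next_pos])      # SUBSTRING(text FROM last_pos FOR next_pos - last_pos)
        last_pos = next_pos + len(delimiter)
        next_pos = text.find(delimiter, last_pos)
    parts.append(text[last_pos:])                  # whatever follows the final delimiter
    return parts


if __name__ == "__main__":
    print(split_string("a,b,,c"))
```

Note that Python positions are 0-based and `find` returns -1 when nothing is found, whereas POSITION in Firebird is 1-based and returns 0, so the loop condition and offsets shift by one when translating back.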
I use a similar solution, published a while ago by Jiri Cincura:
http://blog.cincura.net/232347-tokenize-string-in-sql-firebird-syntax/
recreate procedure Tokenize(input varchar(1024), token char(1))
returns (result varchar(255))
as
declare newpos int;
declare oldpos int;
begin
oldpos = 1;
newpos = 1;
while (1 = 1) do
begin
newpos = position(token, input, oldpos);
if (newpos > 0) then
begin
result = substring(input from oldpos for newpos - oldpos);
suspend;
oldpos = newpos + 1;
end
else if (oldpos - 1 < char_length(input)) then
begin
result = substring(input from oldpos);
suspend;
break;
end
else
begin
break;
end
end
end
It looks good except for one thing: on my Firebird server, declaring a VARCHAR size of 32000 causes an "Implementation limit exceeded" exception, so be careful. I suggest using BLOB SUB_TYPE TEXT instead :)
This works for me on an Informix database:
DROP FUNCTION rrhh:fnc_StringList_To_Table;
CREATE FUNCTION rrhh:fnc_StringList_To_Table (pStringList varchar(250))
RETURNING INT as NUMERO;
/* You can pass this function a CSV string containing a list of numbers
* Example: EXECUTE FUNCTION fnc_StringList_To_Table('1,2,3,4');
* and it will return a table with those numbers, one per row
* Author: Jhollman Chacon @Cutcsa - 2019 */
DEFINE _STRING VARCHAR(255);
DEFINE _LEN INT;
DEFINE _POS INT;
DEFINE _START INT;
DEFINE _CHAR VARCHAR(1);
DEFINE _VAL INT;
LET _STRING = REPLACE(pStringList, ' ', '');
LET _START = 0;
LET _POS = 0;
LET _LEN = LENGTH(_STRING);
FOR _POS = _START TO _LEN
LET _CHAR = SUBSTRING(pStringList FROM _POS FOR 1);
IF _CHAR <> ',' THEN
LET _VAL = _CHAR::INT;
ELSE
LET _VAL = NULL;
END IF;
IF _VAL IS NOT NULL THEN
RETURN _VAL WITH RESUME;
END IF;
END FOR;
END FUNCTION;
EXECUTE FUNCTION fnc_StringList_To_Table('1,2,3,4');
SELECT * FROM TABLE (fnc_StringList_To_Table('1,2,3,4'));
I need to write this query in SQL Server:
IF isFloat(@value) = 1
BEGIN
PRINT 'this is float number'
END
ELSE
BEGIN
PRINT 'this is integer number'
END
Please help me out with this, thanks.
declare @value float = 1
IF FLOOR(@value) <> CEILING(@value)
BEGIN
PRINT 'this is float number'
END
ELSE
BEGIN
PRINT 'this is integer number'
END
Martin, under certain circumstances your solution gives an incorrect result if you encounter a value of 1234.0, for example. Your code determines that 1234.0 is an integer, which is incorrect.
This is a more accurate snippet:
if cast(cast(123456.0 as integer) as varchar(255)) <> cast(123456.0 as varchar(255))
begin
print 'non integer'
end
else
begin
print 'integer'
end
Regards,
Nico
DECLARE @value FLOAT = 1.50
IF CONVERT(int, @value) - @value <> 0
BEGIN
PRINT 'this is float number'
END
ELSE
BEGIN
PRINT 'this is integer number'
END
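The value-based checks in the answers above all come down to asking whether the number has a fractional part. A minimal Python sketch of the FLOOR/CEILING comparison, with made-up function names, purely to show the logic:

```python
import math


def is_whole_number(value: float) -> bool:
    """True when the value has no fractional part, i.e. floor equals ceiling."""
    return math.floor(value) == math.ceil(value)


def describe(value: float) -> str:
    """Mirror the two PRINT messages used in the T-SQL answers."""
    if is_whole_number(value):
        return "this is integer number"
    return "this is float number"


if __name__ == "__main__":
    for sample in (1.0, 1.5, 1234.0):
        print(sample, "->", describe(sample))
```

A check like this treats 1234.0 as a whole number, which is exactly the case Nico objects to. If the written form matters rather than the value, the text of the number has to be inspected instead.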
See whether the code below helps. Of the values below, only 9,
2147483647, and 1234567 qualify as integers. This can be wrapped in a
function for reuse.
CREATE TABLE MY_TABLE(MY_FIELD VARCHAR(50))
INSERT INTO MY_TABLE
VALUES('9.123'),('1234567'),('9'),('2147483647'),('2147483647.01'),('2147483648'), ('2147483648ABCD'),('214,7483,648')
SELECT *
FROM MY_TABLE
WHERE CHARINDEX('.',MY_FIELD) = 0 AND CHARINDEX(',',MY_FIELD) = 0
AND ISNUMERIC(MY_FIELD) = 1 AND CONVERT(FLOAT,MY_FIELD) / 2147483647 <= 1
DROP TABLE MY_TABLE
OR
DECLARE @num VARCHAR(100)
SET @num = '2147483648AS'
IF ISNUMERIC(@num) = 1 AND @num NOT LIKE '%.%' AND @num NOT LIKE '%,%'
BEGIN
IF CONVERT(FLOAT,@num) / 2147483647 <= 1
PRINT 'INTEGER'
ELSE
PRINT 'NOT-INTEGER'
END
ELSE
PRINT 'NOT-INTEGER'
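The same "is this text a valid 32-bit integer" test can be sketched in Python to make the rules explicit: digits only (with an optional sign), then a range check. This is a sketch with a made-up function name, not a translation of the T-SQL; it accepts a leading minus sign and checks the lower bound, both of which are assumptions about what should count as an integer here.

```python
import re

INT_MAX = 2147483647    # upper bound of SQL Server's INT type
INT_MIN = -2147483648   # lower bound of SQL Server's INT type
_DIGITS = re.compile(r"[+-]?[0-9]+")


def is_int32(text: str) -> bool:
    """True when text is an optionally signed run of digits within the INT range."""
    text = text.strip()
    if not _DIGITS.fullmatch(text):
        return False    # rejects '.', ',', letters, and empty strings
    return INT_MIN <= int(text) <= INT_MAX


if __name__ == "__main__":
    samples = ["9.123", "1234567", "9", "2147483647", "2147483647.01",
               "2147483648", "2147483648ABCD", "214,7483,648"]
    for sample in samples:
        print(sample, "->", "INTEGER" if is_int32(sample) else "NOT-INTEGER")
```

Matching the text against an explicit pattern avoids leaning on ISNUMERIC, which also accepts things like currency symbols and exponent notation.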
Our stored procedures have developer comments and headers, and as part of our deployment process we would like to remove these from the customer copy. Is there a method of achieving this within SQL Server 2005 or with another tool?
Don't know if it would suit, but you can use the WITH ENCRYPTION option to hide the entire contents. Do your end users need to see/modify any of the procedures?
I use an SQL tool called WinSQL (very handy, highly recommended) that has an option to "Parse Comments Locally".
I don't use it much personally, but I have had it on accidentally when running my scripts that build my stored procs and it does clean them out of the proc source in the database. :-)
Even the free version has that option.
This option wasn't available when the question was asked, but in SQL 2012, we can now use SQL Server's own parser to help us out.
Removing Comments From SQL
A little late to the party but in case someone else stumbles across...
CREATE FUNCTION [usf_StripSQLComments] ( @CommentedSQLCode VARCHAR(max) )
RETURNS Varchar(max)
/****************************************************************************************
--######################################################################################
-- Mjwheele@yahoo.com -- Some count sheep. Some code. Some write code to count sheep.
--######################################################################################
--#############################################################################
-- Sample Call Script
--#############################################################################
Declare @SqlCode Varchar(Max)
Declare @objname varchar(max) = 'sp_myproc'
select @Sqlcode = OBJECT_DEFINITION(t.OBJECT_ID)
from sys.objects t
where t.name = @objname
select dbo.usf_StripSQLComments( @Sqlcode )
****************************************************************************************/
AS
BEGIN
DECLARE @Sqlcode VARCHAR(MAX) =@CommentedSQLCode
Declare @i integer = 0
Declare @Char1 Char(1)
Declare @Char2 Char(1)
Declare @TrailingComment Char(1) = 'N'
Declare @UncommentedSQLCode varchar(Max)=''
Declare @Whackcounter Integer = 0
Declare @max Integer = DATALENGTH(@sqlcode)
While @i < @max
Begin
Select @Char1 = Substring(@Sqlcode,@i,1)
if @Char1 not in ('-', '/','''','*')
begin
if @Char1 = CHAR(13) or @Char1 = CHAR(10)
Select @TrailingComment = 'N'
Else if not (@Char1 = CHAR(32) or @Char1 = CHAR(9)) and @TrailingComment = 'N' -- Not Space or Tab
Select @TrailingComment = 'Y'
if @Whackcounter = 0
Select @UncommentedSQLCode += @Char1
select @i+=1
end
else
begin
Select @Char2 = @Char1
, @Char1 = Substring(@Sqlcode,@i+1,1)
If @Char1 = '-' and @Char2 = '-' and @Whackcounter = 0
Begin
While @i < @Max and Substring(@Sqlcode,@i,1) not in (char(13), char(10))
Select @i+=1
if Substring(@Sqlcode,@i,1) = char(13) and @TrailingComment = 'N'
Select @i+=1
if Substring(@Sqlcode,@i,1) = char(10) and @TrailingComment = 'N'
Select @i+=1
End
else If @Char1 = '*' and @Char2 = '/'
Begin
Select @Whackcounter += 1
, @i += 2
End
else If @Char1 = '/' and @Char2 = '*'
Begin
Select @Whackcounter -= 1
, @i += 2
End
else if @char2 = '''' and @Whackcounter = 0
begin
Select @UncommentedSQLCode += @char2
while Substring(@Sqlcode,@i,1) <> ''''
Begin
Select @UncommentedSQLCode += Substring(@Sqlcode,@i,1)
, @i +=1
end
Select @i +=1
, @Char1 = Substring(@Sqlcode,@i,1)
end
else
Begin
if @Whackcounter = 0
Select @UncommentedSQLCode += @Char2
Select @i+=1
end
end
End
Return @UncommentedSQLCode
END
You may want to check this out:
Remove Comments from SQL Server Stored Procedures.
Note: this doesn't handle comments that start with --, which SQL Server allows. Otherwise I would inquire into having a developer write a short filter app that reads the text in via a stream, and then remove the comments that way. Or write it yourself.
I ended up writing my own SQL comment remover in C#
I assume you save your procedure definitions to a text or .sql file that you then version control. You could always use something like Notepad++ to find/replace the strings you want, then commit them as a production/customer tag. This is not elegant, but it is an option. I don't know of any third-party tools, and my Google searches returned the same results the other posters gave.
You can remove comments using regular expressions in C# like described here. It works for line comments, block comments, even when block comments are nested, and it can correctly identify and ignore comments delimiters when they are inside literals or bracketed named identifiers.
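To show what such a remover has to keep track of, here is a small character-scanning sketch in Python. It is not the regex solution linked above, just one way to get the same behaviour: it drops `--` line comments and nested `/* */` block comments while copying string literals, double-quoted identifiers, and bracketed identifiers through untouched. The function name is made up, and escaped `]]` inside brackets is deliberately not handled.

```python
def strip_sql_comments(sql: str) -> str:
    """Remove -- and nested /* */ comments, leaving quoted and bracketed text intact."""
    out = []
    i, n = 0, len(sql)
    depth = 0  # current nesting level of block comments
    while i < n:
        two = sql[i:i + 2]
        if depth > 0:
            # Inside a block comment: only track nesting, emit nothing
            if two == "/*":
                depth += 1
                i += 2
            elif two == "*/":
                depth -= 1
                i += 2
            else:
                i += 1
        elif two == "/*":
            depth = 1
            i += 2
        elif two == "--":
            # Skip to the end of the line but keep the newline itself
            end = sql.find("\n", i)
            i = n if end == -1 else end
        elif sql[i] in ("'", '"'):
            quote = sql[i]
            j = i + 1
            while j < n:
                if sql[j] == quote:
                    if sql[j + 1:j + 2] == quote:  # doubled quote is an escape
                        j += 2
                        continue
                    break
                j += 1
            out.append(sql[i:j + 1])
            i = j + 1
        elif sql[i] == "[":
            end = sql.find("]", i)
            end = n - 1 if end == -1 else end
            out.append(sql[i:end + 1])
            i = end + 1
        else:
            out.append(sql[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(strip_sql_comments("select 1 -- note\nfrom t /* a /* nested */ b */ where x = '--'"))
```

Whitespace around removed comments is left as it was, so the output may contain trailing spaces or blank lines where comments used to be.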
This is VB.NET code for removing SQL comments.
It assumes the script is syntactically well formed, as it would be in SQL Server Management Studio.
Imports System.IO
Module Module1
Const TagBeginMultiComent = "/*"
Const TagEndMultiComent = "*/"
Const TagMonoComent = "--"
Public Fail As Integer
Function IsQuoteOpened(ByVal Value As String) As Boolean
Dim V As String = Replace(Value, "'", "")
If V Is Nothing Then Return 0
Return ((Value.Length - V.Length) / "'".Length) Mod 2 > 0
End Function
Function RemoveComents(ByVal Value As String) As String
Dim RetVal As String = ""
Dim Block As String
Dim Tampon As String
Dim NbComentIncluded As Integer = 0
Dim QuoteOpened As Boolean
Dim CommentOpen As Boolean
While Value.Length > 0
Tampon = ""
Block = ""
Dim P1 As Integer = InStr(Value, TagBeginMultiComent)
Dim P2 As Integer = InStr(Value, TagEndMultiComent)
Dim P3 As Integer = InStr(Value, TagMonoComent)
Dim Min As Integer
If P1 = 0 Then P1 = Value.Length + 1
If P2 = 0 Then P2 = Value.Length + 1
If P3 = 0 Then P3 = Value.Length + 1
Tampon = ""
If P1 + P2 + P3 > 0 Then
Min = Math.Min(P1, Math.Min(P2, P3))
Tampon = Left(Value, Min - 1)
Block = Mid(Value, Min, 2)
Value = Mid(Value, Min)
End If
If NbComentIncluded = 0 Then QuoteOpened = IsQuoteOpened(RetVal & Tampon)
If Not QuoteOpened Then
NbComentIncluded += -(Block = TagBeginMultiComent) + (Block = TagEndMultiComent)
If Block = TagMonoComent Then
Dim Ploc As Integer = InStr(Value, vbCrLf)
If Ploc = 0 Then
Value = ""
Else
Value = Mid(Value, Ploc - 2)
End If
End If
End If
If Not CommentOpen And NbComentIncluded = 0 Then
RetVal += Tampon
If ({TagBeginMultiComent, TagEndMultiComent, TagMonoComent}.Contains(Block) And QuoteOpened) Or
(Not {TagBeginMultiComent, TagEndMultiComent, TagMonoComent}.Contains(Block) And Not QuoteOpened) Then RetVal += Block
End If
CommentOpen = (NbComentIncluded > 0)
Value = Mid(Value, 3)
End While
Fail = -1 * (IsQuoteOpened(RetVal)) - 2 * (NbComentIncluded > 0)
If Fail <> 0 Then RetVal = ""
Return RetVal
End Function
Sub Main()
Dim InputFileName = "C:\Users\godef\OneDrive - sacd.fr\DEV\DelComentsSql\test.txt" '"C:\Users\sapgy01\OneDrive - sacd.fr\DEV\DelComentsSql\test.txt"
Dim Script As String = File.ReadAllText(InputFileName)
Dim InputDataArray As String() = Split(Script, vbCrLf)
Script = RemoveComents(Script)
If Fail Then
Console.WriteLine($"Fail : {Fail}")
If (Fail And 1) = 1 Then Console.WriteLine("Not all quotes are closed")
If (Fail And 2) = 2 Then Console.WriteLine("Not all multiline comments are closed")
Else
Console.WriteLine(Script)
End If
Console.ReadKey()
End Sub
End Module
Add-on: a check is made for unclosed multiline comments and/or unclosed apostrophes.
Example:
/* Commentaire principal
Suite du commentaire principal
/* Inclusion de commentaire
Suite du commentaire inclu
*/ suite commentaire principal
continuation commentaire principal
/* mono comentaire tagué multi lignes */
*/ select * from ref
-- mono commentaire
select ref.ref_lbl_code as 'code
de
la
-- ref --
' -- from ref as 'references' -- Fin de séquence
from ref as reference -- Mono commentaire fin de ligne
go -- lance l'exécution
select dbo.ref.REF_LBL_CODE as 'commentaire
/* Mulitlignes sur une ligne dans litteral */'
from ref as 'table_ref'
select ref.ref_lbl_code as 'commentaire
/* Mulitlignes
sur plusieurs lignes
dans litteral */'
from ref as '-- ref_table --'
-- Fin de l'exécution du ' -- script -- '