How to delete part of an entry - SQL

Say I have a column containing
hello;world;how;are;you;
How do I write a SQL command to delete just "hello world" or "how are you"?
Thanks

trojanfoe described what you should do with a correct database design. If you can't change the design, here is a possible way to do it.
I doubt it is the fastest way, but you can fetch the data, split it into a table, remove what you don't need, and write the result back to your column.
There is no standard split function in SQL, so here is one (MS SQL; it may need modifying for MySQL):
CREATE FUNCTION dbo.fn_Split(@text nvarchar(4000), @delimiter char(1) = ',')
RETURNS @Strings TABLE (
    position int IDENTITY PRIMARY KEY,
    value nvarchar(4000)
)
AS
BEGIN
    DECLARE @index int
    SET @index = -1
    SET @text = RTRIM(LTRIM(@text))
    WHILE (LEN(@text) > 0)
    BEGIN
        -- Find the next delimiter (0 means none is left)
        SET @index = CHARINDEX(@delimiter, @text)
        IF (@index = 0) AND (LEN(@text) > 0)
        BEGIN
            -- No delimiter left: the rest of the text is the last item
            INSERT INTO @Strings VALUES (@text)
            BREAK
        END
        IF (@index > 1)
        BEGIN
            INSERT INTO @Strings VALUES (LEFT(@text, @index - 1))
            SET @text = RIGHT(@text, (LEN(@text) - @index))
        END
        ELSE
            -- Delimiter at position 1: skip the empty item
            SET @text = RIGHT(@text, (LEN(@text) - @index))
    END
    RETURN
END
With this you can do things like:
SELECT * FROM dbo.fn_Split(<yourcolumn>,';')
This returns your content as single rows.
You can then drop items with a WHERE clause, for example
SELECT * FROM dbo.fn_Split(<yourcolumn>,';') WHERE [value] NOT LIKE '%hello%'
which keeps every item except those containing "hello".
You might need to write a PROCEDURE or a TABLE/SCALAR FUNCTION to handle your data with this.
It isn't a good way to do it, but it is the only way I have found so far :) Hope it helps a bit.
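If the value can be fetched into application code, the same split, filter and rejoin idea is short to write there. This is only a sketch in Python, not part of the SQL answer above; the helper name remove_items is made up for illustration, and it assumes items are separated by ';' with a trailing delimiter, as in the question.

```python
def remove_items(value, unwanted, delimiter=";"):
    """Split a delimited string, drop unwanted items, and join the rest back."""
    # A trailing delimiter produces an empty last piece; skip empty pieces.
    items = [item for item in value.split(delimiter) if item]
    kept = [item for item in items if item not in unwanted]
    # Put the delimiter back after every item so the stored format is unchanged.
    return "".join(item + delimiter for item in kept)


print(remove_items("hello;world;how;are;you;", {"hello", "world"}))
```

The result would then be written back to the column with an ordinary parameterized UPDATE.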

If this column contains multiple items that you want to manipulate then putting them into a single column is a schema design flaw.
You are not taking advantage of the relational nature of the database and you should store these items in a child table of the main table. You will then be able to define ordering in this child table (with a separate column) and will be able to manipulate (i.e. SELECT, DELETE, etc) individual elements far easier.
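To make the child-table idea concrete, here is a small sketch using Python's built-in sqlite3 module. The table and column names (entry, entry_item, position, value) are invented for the example and are not from the question; the point is only that each item becomes its own row, with a separate column for ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (id INTEGER PRIMARY KEY);
    CREATE TABLE entry_item (
        entry_id INTEGER NOT NULL REFERENCES entry(id),
        position INTEGER NOT NULL,   -- separate column that defines the ordering
        value    TEXT    NOT NULL,
        PRIMARY KEY (entry_id, position)
    );
    INSERT INTO entry (id) VALUES (1);
""")

words = ["hello", "world", "how", "are", "you"]
conn.executemany(
    "INSERT INTO entry_item (entry_id, position, value) VALUES (1, ?, ?)",
    list(enumerate(words, start=1)),
)

# Deleting "hello" and "world" is now an ordinary DELETE on two rows.
conn.execute(
    "DELETE FROM entry_item WHERE entry_id = 1 AND value IN (?, ?)",
    ("hello", "world"),
)

remaining = [row[0] for row in conn.execute(
    "SELECT value FROM entry_item WHERE entry_id = 1 ORDER BY position"
)]
print(remaining)
```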

I guess that if it's a legacy database structure, it cannot be changed easily.
By the way, the answer is to update the field. Since almost every database engine nowadays allows you to write a stored function, I think that's a good solution.
Using Oracle's syntax, something like this would work:
UPDATE TABLE_NAME SET FIELD_NAME = STORED_FUNCTION_MANIPULATING_THE_FIELD(FIELD_NAME) WHERE [where condition]
But before executing the query, you have to define the stored function:
CREATE OR REPLACE FUNCTION STORED_FUNCTION_MANIPULATING_THE_FIELD(FIELD_VALUE VARCHAR2) RETURN VARCHAR2 IS
BEGIN
  -- Do whatever you want with the value, then return it.
  RETURN FIELD_VALUE;
END STORED_FUNCTION_MANIPULATING_THE_FIELD;
However, this is conceptually wrong: multiple data values should be stored separately in order to take full advantage of relational engines (as stated in the previous answers).
Regards
M.

UPDATE tablename SET field=replace(replace(field,'hello;world;',''),';',' ') [WHERE condition]
UPDATE tablename SET field=replace(mid(field,locate('how',field)),';',' ') [WHERE condition]
UPDATE tablename SET field=replace(mid(field,1,locate(';how',field)),';',' ') [WHERE condition]
UPDATE tablename SET field=replace(substring_index(field,';',2),';',' ') [WHERE condition]
(With a negative count, substring_index counts delimiters from the right. Because the sample value ends with a ';', a count of -4 gives you the other part of the text, 'how;are;you;'.)
One of those will give you what you want. The use of substring_index is probably the most flexible way.

Related

Constructing SQL Server stored procedure for array Input

I am struggling with this. I have looked at table variables, but I think this is way beyond my simple understanding of SQL at this stage.
My problem is that I have an array of ID values that I generate inside MS Access as a result of some other tasks there. I want to send these over to SQL Server to grab the jobs whose ID numbers match.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[get_Job]
    @jobID VARCHAR,
    @JobIDs id_List READONLY
AS
BEGIN
    SELECT @JobID AS JobID;
    SELECT *
    FROM Job
END;
This is my current stored procedure. While I have been able to get it to return the JobID variable, any list I add generates an error. If I insert only one ID into JobIDs, this doesn't generate a result either.
As I said, I think I am punching well above my weight and am getting a bit lost in all this. Perhaps I can be directed to a better training resource, a site that explains this in baby steps, or a book I can purchase to help me understand this? I would appreciate help with fixing the errors above, but being taught to fish is probably better.
Thanks in advance
The issue comes down to how long the list of IDs you are going to pass to T-SQL will be.
You could pass the list as a string, say like this from an Access pass-through (PT) query:
exec GetHotels '1,2,3,4,5,6,7,10,20,30'
So, the above is the PT query you could send to SQL Server from Access.
We then want to return the records whose IDs are in that list.
The T-SQL would thus become:
CREATE PROCEDURE GetHotels
    @IdList nvarchar(max)
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @MySQL nvarchar(max)
    -- The list is concatenated into the statement as-is, so it must come from a trusted source
    SET @MySQL = 'select * from tblHotels where ID in (' + @IdList + ')'
    EXECUTE sp_executesql @MySQL
END
GO
Now, in Access, say you have that array of IDs. Your code will look like this:
Sub MyListQuery(MyList() As String)
    ' above assumes an array of IDs
    ' take the array and convert it to a quoted, comma-separated string
    Dim strMyList As String
    strMyList = "'" & Join(MyList, ",") & "'"
    Dim rst As DAO.Recordset
    With CurrentDb.QueryDefs("qryPassR")
        .SQL = "GetHotels " & strMyList
        Set rst = .OpenRecordset
    End With
    rst.MoveLast
    Debug.Print rst.RecordCount
End Sub
Unfortunately, creating T-SQL on the fly is a less than ideal approach. In most cases, because the table is not known until runtime, you have to specifically add EXEC permissions for the user.
For example:
GRANT EXECUTE ON dbo.GetHotels TO USERTEST3
You will find that such users can execute most stored procedures, but in this case you have to add specific rights with the above grant because the table is not known or resolved until runtime.
So, the above is a way to send a given array that you have, but from a general permissions point of view, and because it creates T-SQL on the fly, I can't recommend this approach unless you are stuck and have no other choice.
Edit
Here is a solution that works the same as above, but without having to build a SQL statement as a string.
CREATE PROCEDURE [dbo].[GetHotels2]
    @IdList nvarchar(max)
AS
BEGIN
    SET NOCOUNT ON;
    -- create a table from the passed list
    DECLARE @List TABLE (ID int)
    WHILE CHARINDEX(',', @IdList) > 0
    BEGIN
        INSERT INTO @List (ID) VALUES (LEFT(@IdList, CHARINDEX(',', @IdList) - 1))
        SET @IdList = RIGHT(@IdList, LEN(@IdList) - CHARINDEX(',', @IdList))
    END
    -- the last (or only) value has no trailing comma
    INSERT INTO @List (ID) VALUES (@IdList)
    SELECT * FROM tblHotels WHERE ID IN (SELECT ID FROM @List)
END
You didn't show us what that table-valued parameter looks like - but assuming id_List contains a column called Id, then you need to join this TVP to your base table something like this:
ALTER PROCEDURE [dbo].[get_Job]
    @jobID VARCHAR,
    @JobIDs id_List READONLY
AS
BEGIN
    SELECT (list of columns)
    FROM Job j
    INNER JOIN @JobIDs l ON j.JobId = l.Id;
END;
Seems pretty easy to me - and not really all that difficult to handle! Agree?
Also, check out "Bad habits to kick: declaring VARCHAR without (length)". You should always provide a length for any varchar variables and parameters that you use. Otherwise, as in your case, that @jobID VARCHAR parameter will be exactly ONE character long, and this is typically not what you expect or want.

Generate tables with unique names

I need to create non-temporary tables in a MariaDB 10.3 database using Node. I therefore need a way of generating a table name that is guaranteed to be unique.
The Node function cannot access information regarding any unique feature about what or when the tables are made, so I cannot build the name from a timestamp or connection ID. I can only verify the name's uniqueness using the current database.
This question had a PostgreSQL answer suggesting the following:
SET @name = GetBigRandomNumber();
WHILE TableExists(@name)
BEGIN
    SET @name = GetBigRandomNumber();
END
I attempted a MariaDB implementation using @name = CONCAT(MD5(RAND()),MD5(RAND())) to generate a random 64-character string, and (COUNT(*) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME LIKE @name) >0 to check whether the name was already in use:
SET @name = CONCAT(MD5(RAND()),MD5(RAND()));
WHILE ((COUNT(*) FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME LIKE @name) >0) DO
    SET @name = CONCAT(MD5(RAND()),MD5(RAND()));
END WHILE;
CREATE TABLE @name ( ... );
However I get a syntax error when I try to run the above query. My SQL knowledge isn't that great so I'm at a loss as to what the problem might be.
Furthermore, is this approach efficient? The randomly generated name is long enough that it is very unlikely to have any clashes with any current table in the database, so the WHILE loop will very rarely need to run, but is there some sort of built in function to auto increment table names, or something similar?
SET @name := UUID();
If the dashes in that cause trouble, then
SET @name := REPLACE(UUID(), '-', '');
It will be safer (with respect to uniqueness) than RAND(). And, in theory, there will be no need to verify its uniqueness. After all, that's the purpose of UUIDs.
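The name can also be generated in application code before the CREATE TABLE is sent. The sketch below uses Python and sqlite3 purely as a stand-in for the Node and MariaDB setup in the question, and it uses a random (version 4) UUID rather than the server's UUID() function. The helper name unique_table_name and the "t_" prefix are assumptions made for the example.

```python
import sqlite3
import uuid


def unique_table_name(prefix="t_"):
    """Return an identifier-safe table name built from a random UUID."""
    # uuid4().hex is 32 hexadecimal characters with the dashes already removed.
    return prefix + uuid.uuid4().hex


conn = sqlite3.connect(":memory:")
name = unique_table_name()

# Identifiers cannot be bound as query parameters, so the name is interpolated.
# That is acceptable here only because the name is generated locally and
# contains nothing but the prefix and hexadecimal characters.
conn.execute(f'CREATE TABLE "{name}" (id INTEGER PRIMARY KEY)')
```

The letter prefix is there so the name never starts with a digit, which some tools handle awkwardly for unquoted identifiers.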

IN-clause with optional parameter SQL

I have a stored procedure that returns a set of data based on 2 input parameters. One of the parameters is optional, so I am using
WHERE
    (tbl_Process.ProjectID = @ProjectID)
AND
    (tbl_AnalysisLookup.AnalysisCodeID = 7)
AND
    (tbl_ProcessSubStep.ProcessID = ISNULL(@ProcessID,tbl_ProcessSubStep.ProcessID))
@ProcessID is an optional parameter, so the user may or may not provide it.
Now I need to change my stored procedure to accommodate multiple ProcessIDs, i.e. the user can now select a list of multiple ProcessIDs, a single ProcessID or no ProcessID, and the stored proc should handle all these scenarios. What is the best way to achieve this without using dynamic queries unless absolutely required?
In a nutshell, I wanted my stored proc to handle optional parameters with multiple values (WHERE IN clause). The solution, and a link to the web page I got it from, are provided below. It's a very good article and will help you choose the right solution based on your requirements.
I have finally figured out how to achieve this. There are a couple of ways to do it. What I am using now is a function that splits a string of ProcessIDs on a delimiter and inserts them into a table, which I then use in my stored proc. Here is the code and the link to the web page.
http://www.codeproject.com/Articles/58780/Techniques-for-In-Clause-and-SQL-Server
CREATE FUNCTION [dbo].[ufnDelimitedBigIntToTable]
(
    @List varchar(max), @Delimiter varchar(10)
)
RETURNS @Ids TABLE
(Id bigint) AS
BEGIN
    DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
    SET @List = LTRIM(RTRIM(@List)) + @Delimiter
    SET @Pos = CHARINDEX(@Delimiter, @List, 1)
    WHILE @Pos > 0
    BEGIN
        SET @list1 = LTRIM(RTRIM(LEFT(@List, @Pos - 1)))
        IF @list1 <> ''
            INSERT INTO @Ids(Id) VALUES (CAST(@list1 AS bigint))
        SET @List = SUBSTRING(@List, @Pos+1, LEN(@List))
        SET @Pos = CHARINDEX(@Delimiter, @List, 1)
    END
    RETURN
END
Once made, the table-function can be used in a query:
CREATE PROCEDURE [dbo].[GetUsingDelimitedFunctionTable]
    @Ids varchar(max)
AS
BEGIN
    SET NOCOUNT ON
    SELECT s.Id,s.SomeString
    FROM SomeString s (NOLOCK)
    WHERE EXISTS ( SELECT *
                   FROM ufnDelimitedBigIntToTable(@Ids,',') Ids
                   WHERE s.Id = Ids.id )
END
The Link also provides more ways to achieve this.
Not the best, but one way is to convert both sides to "varchar" and use the "Like" operator to compare them. It doesn't need any huge modifications; just change the datatype of your parameter to "varchar" and pass the IDs as a comma-separated list. Something like the code below:
',' + @ProcessIDs + ',' Like '%,' + Convert(varchar(10), tbl_ProcessSubStep.ProcessID) + ',%'
Hope it helps.
You didn't specify your database product in your question, but I'm going to guess from the @Parameter naming style that you're using SQL Server.
Except for the unusual requirement of interpreting empty input to mean 'all', this is a restatement of the problem of Arrays in SQL, explored thoroughly by Erland Sommarskog. Read all his articles on the subject for a good analysis of all the techniques you can use.
Here I'll explain how to use a table-valued parameter to solve your problem.
Execute the following scripts all together to set up the test environment in an idempotent way.
Creating a sample solution
First create a new empty test database StackOverFlow13556628:
USE master;
GO
IF DB_ID('StackOverFlow13556628') IS NOT NULL
BEGIN
ALTER DATABASE StackOverFlow13556628 SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE StackOverFlow13556628;
END;
GO
CREATE DATABASE StackOverFlow13556628;
GO
USE StackOverFlow13556628;
GO
Next, create a user-defined table type PrincipalList with one column principal_id. This type contains the input values with which to query the system table sys.database_principals.
CREATE TYPE PrincipalList AS TABLE (
principal_id INT NOT NULL PRIMARY KEY
);
GO
After that, create the stored procedure GetPrincipals which takes a PrincipalList table-valued parameter as input, and returns a result set from sys.database_principals.
CREATE PROCEDURE GetPrincipals (
    @principal_ids PrincipalList READONLY
)
AS
BEGIN
    IF EXISTS(SELECT * FROM @principal_ids)
    BEGIN
        SELECT *
        FROM sys.database_principals
        WHERE principal_id IN (
            SELECT principal_id
            FROM @principal_ids
        );
    END
    ELSE
    BEGIN
        SELECT *
        FROM sys.database_principals;
    END;
END;
GO
If the table-valued parameter contains rows, then the procedure returns all the rows in sys.database_principals that have a matching principal_id value. If the table-valued parameter is empty, it returns all the rows.
Testing the solution
You can query multiple principals like this:
DECLARE @principals PrincipalList;
INSERT INTO @principals (principal_id) VALUES (1);
INSERT INTO @principals (principal_id) VALUES (2);
INSERT INTO @principals (principal_id) VALUES (3);
EXECUTE GetPrincipals
    @principal_ids = @principals;
GO
Result:
principal_id name
1 dbo
2 guest
3 INFORMATION_SCHEMA
You can query a single principal like this:
DECLARE @principals PrincipalList;
INSERT INTO @principals (principal_id) VALUES (1);
EXECUTE GetPrincipals
    @principal_ids = @principals;
GO
Result:
principal_id name
1 dbo
You can query all principals like this:
EXECUTE GetPrincipals;
Result:
principal_id name
0 public
1 dbo
2 guest
3 INFORMATION_SCHEMA
4 sys
16384 db_owner
16385 db_accessadmin
16386 db_securityadmin
16387 db_ddladmin
16389 db_backupoperator
16390 db_datareader
16391 db_datawriter
16392 db_denydatareader
16393 db_denydatawriter
Remarks
This solution is inefficient because you always have to read from the table-valued parameter twice. In practice, unless your table-valued parameter has millions of rows, it will probably not be the major bottleneck.
Using an empty table-valued parameter in this way feels unintuitive. A more obvious design might simply be to have two stored procedures - one that returns all the rows, and one that returns only rows with matching ids. It would be up to the calling application to choose which one to call.
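For comparison, the same "empty list means all rows" rule can be applied on the calling side. This is a rough sketch in Python with sqlite3, not SQL Server: the principals table is a made-up stand-in for sys.database_principals, and get_principals is a hypothetical helper. It builds one placeholder per ID, so the values themselves are always bound as parameters.

```python
import sqlite3


def get_principals(conn, ids=None):
    """Return every row when ids is empty or None, otherwise only matching rows."""
    if not ids:
        sql = "SELECT principal_id, name FROM principals ORDER BY principal_id"
        params = []
    else:
        # One "?" per value; only placeholders are interpolated, never the values.
        placeholders = ", ".join("?" for _ in ids)
        sql = (
            "SELECT principal_id, name FROM principals "
            f"WHERE principal_id IN ({placeholders}) ORDER BY principal_id"
        )
        params = list(ids)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE principals (principal_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO principals (principal_id, name) VALUES (?, ?)",
    [(0, "public"), (1, "dbo"), (2, "guest"), (3, "INFORMATION_SCHEMA"), (4, "sys")],
)

print(get_principals(conn, [1, 2]))
print(len(get_principals(conn)))
```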

Using an arbitrary number of parameters in T-SQL

Is it possible to create a parameterized SQL statement that will take an arbitrary number of parameters? I'm trying to allow users to filter a list based on multiple keywords, each separated by a semicolon. So the input would be something like "Oakland;City;Planning" and the WHERE clause would come out as something equivalent to the below:
WHERE ProjectName LIKE '%Oakland%' AND ProjectName Like '%City%' AND ProjectName Like '%Planning%'
It's really easy to create such a list with concatenation, but I don't want to take that approach because of the SQL injection vulnerabilities. What are my options? Do I create a bunch of parameters and hope that users never try to use more parameters than I've defined? Or is there a way to create parameterized SQL on the fly safely?
Performance isn't much of an issue because the table is only about 900 rows right now, and won't be growing very quickly, maybe 50 to 100 rows per year.
A basic proof-of-concept... Actual code would be less, but since I don't know your table/field names, this is the full code, so anyone can verify it works, tweak it, etc.
--Search Parameters
DECLARE @SearchString VARCHAR(MAX)
SET @SearchString='Oakland;City;Planning' --Using your example search
DECLARE @Delim CHAR(1)
SET @Delim=';' --Using your delimiter from the example
--I didn't know your table name, so I'm making it... along with a few extra rows...
DECLARE @Projects TABLE (ProjectID INT, ProjectName VARCHAR(200))
INSERT INTO @Projects (ProjectID, ProjectName) SELECT 1, 'Oakland City Planning'
INSERT INTO @Projects (ProjectID, ProjectName) SELECT 2, 'Oakland City Construction'
INSERT INTO @Projects (ProjectID, ProjectName) SELECT 3, 'Skunk Works'
INSERT INTO @Projects (ProjectID, ProjectName) SELECT 4, 'Oakland Town Hall'
INSERT INTO @Projects (ProjectID, ProjectName) SELECT 5, 'Oakland Mall'
INSERT INTO @Projects (ProjectID, ProjectName) SELECT 6, 'StackOverflow Answer Planning'
--*** MAIN PROGRAM CODE STARTS HERE ***
DECLARE @Keywords TABLE (Keyword VARCHAR(MAX))
DECLARE @index int
SET @index = -1
--Each keyword gets inserted into the table
--Single keywords are handled, but I did not add code to remove duplicates
--since that affects performance only, not the result.
WHILE (LEN(@SearchString) > 0)
BEGIN
    SET @index = CHARINDEX(@Delim , @SearchString)
    IF (@index = 0) AND (LEN(@SearchString) > 0)
    BEGIN
        INSERT INTO @Keywords VALUES (@SearchString)
        BREAK
    END
    IF (@index > 1)
    BEGIN
        INSERT INTO @Keywords VALUES (LEFT(@SearchString, @index - 1))
        SET @SearchString = RIGHT(@SearchString, (LEN(@SearchString) - @index))
    END
    ELSE
        SET @SearchString = RIGHT(@SearchString, (LEN(@SearchString) - @index))
END
--This way, only a project with all of our keywords will be shown...
SELECT *
FROM @Projects
WHERE ProjectID NOT IN (SELECT ProjectID FROM @Projects Projects INNER JOIN @Keywords Keywords ON CHARINDEX(Keywords.Keyword,Projects.ProjectName)=0)
I decided to mix a few different answers together into one :-P
This assumes you'll pass in a delimited string list of search keywords (via @SearchString) as a VARCHAR(MAX), whose limit you realistically won't hit with keyword searches.
Each keyword is split off the list and added to a keyword table. You'd probably want to add code to remove duplicate keywords, but they won't hurt in my example. It is just slightly less efficient, since ideally we only need to evaluate each keyword once.
From there, any keyword that isn't part of the project name removes that project from the list...
So searching for "Oakland" gives 4 results but "Oakland;City;Planning" gives only 1 result.
You can also change the delimiter, so instead of a semicolon, it can use a space. Or whatever floats your boat...
Also, because it uses joins rather than dynamic SQL, it doesn't run the risk of SQL injection that you were worried about.
You might also want to consider Full Text Search and using CONTAINS or CONTAINSTABLE for a more "natural" search capability.
May be overkill for 1K rows, but it is written and is not easily subverted by injection.
The usual trick is simply to pass the list as a string separated by commas (CSV style), parse that string in a loop and dynamically build the query.
Above post is also right that maybe the best approach is not T-SQL but the business/application layer.
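If the query is built in the application layer, one way to stay parameterized is to generate one LIKE placeholder per keyword and bind the keywords as values. The sketch below uses Python with sqlite3 as a stand-in for the real data access code; search_projects is a hypothetical helper, and the sample rows are the ones from the proof-of-concept above. Note that '%' and '_' inside a keyword would still act as LIKE wildcards here.

```python
import sqlite3


def search_projects(conn, search_string, delimiter=";"):
    """Return projects whose name contains every keyword in search_string."""
    keywords = [k.strip() for k in search_string.split(delimiter) if k.strip()]
    sql = "SELECT ProjectID, ProjectName FROM Projects"
    if keywords:
        # Only the fixed text "ProjectName LIKE ?" is repeated; user input is bound.
        sql += " WHERE " + " AND ".join("ProjectName LIKE ?" for _ in keywords)
    sql += " ORDER BY ProjectID"
    params = [f"%{k}%" for k in keywords]
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Projects (ProjectID INTEGER PRIMARY KEY, ProjectName TEXT)")
conn.executemany(
    "INSERT INTO Projects (ProjectID, ProjectName) VALUES (?, ?)",
    [
        (1, "Oakland City Planning"),
        (2, "Oakland City Construction"),
        (3, "Skunk Works"),
        (4, "Oakland Town Hall"),
        (5, "Oakland Mall"),
        (6, "StackOverflow Answer Planning"),
    ],
)

print(search_projects(conn, "Oakland;City;Planning"))
```

The number of placeholders grows with the number of keywords, so the database's own parameter limit still applies.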
What about using an XML data type to contain the parameters?
It can be unbounded and assembled at run time...
I pass in an unknown number of PKs for a table update then pump them into a temp table.
It is easy to then update where PK in PKTempTable.
Here is the code to parse the XML data type...
INSERT INTO #ERXMLRead (ExpenseReportID)
SELECT ParamValues.ID.value('.','VARCHAR(20)')
FROM @ExpenseReportIDs.nodes('/Root/ExpenseReportID') as ParamValues(ID)
Using a tool like NHibernate will allow you to construct your queries dynamically and safely without the need for stored procedures.
Frans Bouma has an excellent article about stored procs vs dynamic SQL and some of the benefits of using a SQL generator over hand-written statements.
If you use stored procs, you can include a default value for the parameters, and then elect to pass them or not in client code, but you still have to declare them individually in the stored procedure... Also, only if you're using a stored proc, you can pass a single parameter as a delimited string of values and parse out the individual values inside the sproc. (There are some "standard" T-SQL functions available that will split the values into a table variable for you.)
If you're using SQL Server 2008, check out this article on passing a table-valued parameter.
Whichever way you go, watch out for SQL Server's parameter limit: 2,100 parameters.
Similar to some of the other answers, you can parse out a delimited string or an XML document. See this excellent link which demonstrates both methods with SQL Server.

Handling the data in an IN clause, with SQL parameters?

We all know that prepared statements are one of the best ways of fending off SQL injection attacks. What is the best way of creating a prepared statement with an "IN" clause? Is there an easy way to do this with an unspecified number of values? Take the following query for example.
SELECT ID,Column1,Column2 FROM MyTable WHERE ID IN (1,2,3)
Currently I'm using a loop over my possible values to build up a string such as
SELECT ID,Column1,Column2 FROM MyTable WHERE ID IN (@IDVAL_1,@IDVAL_2,@IDVAL_3)
Is it possible to just pass an array as the value of the query parameter and use a query as follows?
SELECT ID,Column1,Column2 FROM MyTable WHERE ID IN (@IDArray)
In case it's important I'm working with SQL Server 2000, in VB.Net
Here you go - first create the following function...
Create Function [dbo].[SeparateValues]
(
    @data VARCHAR(MAX),
    @delimiter VARCHAR(10)
)
RETURNS @tbldata TABLE(col VARCHAR(10))
As
Begin
    DECLARE @pos INT
    DECLARE @prevpos INT
    SET @pos = 1
    SET @prevpos = 0
    WHILE @pos > 0
    BEGIN
        SET @pos = CHARINDEX(@delimiter, @data, @prevpos+1)
        if @pos > 0
            -- item between the previous delimiter and this one
            INSERT INTO @tbldata(col) VALUES(LTRIM(RTRIM(SUBSTRING(@data, @prevpos+1, @pos-@prevpos-1))))
        else
            -- no delimiter left: take the rest of the string
            INSERT INTO @tbldata(col) VALUES(LTRIM(RTRIM(SUBSTRING(@data, @prevpos+1, len(@data)-@prevpos))))
        SET @prevpos = @pos
    End
    RETURN
END
then use the following...
Declare @CommaSeparated varchar(50)
Set @CommaSeparated = '112,112,122'
SELECT ID,Column1,Column2 FROM MyTable WHERE ID IN (select col FROM [SeparateValues](@CommaSeparated, ','))
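Table-valued functions like this one are available from SQL Server 2000 onward. Note that VARCHAR(MAX) needs SQL Server 2005 or later, so on SQL Server 2000 declare the parameter as VARCHAR(8000) instead.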
UPDATE
You'll squeeze out some extra performance using the following syntax...
SELECT ID,Column1,Column2 FROM MyTable
Cross Apply [SeparateValues](@CommaSeparated, ',') s
Where MyTable.id = s.col
This is because the previous syntax causes SQL Server to run an extra "Sort" operation for the "IN" clause. Plus, in my opinion, it looks nicer :D!
If you would like to pass an array, you will need a function in sql that can turn that array into a sub-select.
These functions are very common, and most home grown systems take advantage of them.
Most commercial, or rather professional, ORMs handle IN clauses by generating a series of parameters, so if you have that working, I think that is the standard method.
You could create a temporary table TempTable with a single column VALUE and insert all IDs. Then you could do it with a subselect:
SELECT ID,Column1,Column2 FROM MyTable WHERE ID IN (SELECT VALUE FROM TempTable)
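Here is roughly what the temporary-table approach looks like from client code. It is a sketch in Python with sqlite3 rather than VB.Net and SQL Server, so the temp table syntax differs from T-SQL. select_by_ids is a hypothetical helper, and the MyTable rows are invented sample data.

```python
import sqlite3


def select_by_ids(conn, ids):
    """Load the IDs into a temp table, then filter MyTable with a subselect."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS TempTable (VALUE INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM TempTable")  # clear IDs left over from an earlier call
    # Each ID is bound as a parameter; duplicates are ignored by the primary key.
    conn.executemany(
        "INSERT OR IGNORE INTO TempTable (VALUE) VALUES (?)",
        [(i,) for i in ids],
    )
    return conn.execute(
        "SELECT ID, Column1, Column2 FROM MyTable "
        "WHERE ID IN (SELECT VALUE FROM TempTable) ORDER BY ID"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO MyTable (ID, Column1, Column2) VALUES (?, ?, ?)",
    [(112, "a", "b"), (122, "c", "d"), (130, "e", "f")],
)

print(select_by_ids(conn, [112, 112, 122]))
```

The statement text never changes, whatever the number of IDs, which is the main attraction of this variant.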
Go with the solution posted by digiguru. It's a great reusable solution and we use the same technique as well. New team members love it, as it saves time and keeps our stored procedures consistent. The solution also works well with SQL Reports, as the parameters passed to stored procedures to create the recordsets pass in varchar(8000). You just hook it up and go.
In SQL Server 2008, they finally got around to addressing this classic problem by adding a new "table" datatype. Apparently, that lets you pass in an array of values, which can be used in a sub-select to accomplish the same as an IN statement.
If you're using SQL Server 2008, then you might look into that.
Here's one technique I use
ALTER Procedure GetProductsBySearchString
    @SearchString varchar(1000)
as
set nocount on
declare @sqlstring varchar(6000)
-- Double any single quotes first so a search term cannot break out of the string literal
select @sqlstring = 'set nocount on
select a.productid, count(a.productid) as SumOf, sum(a.relevence) as CountOf
from productkeywords a
where rtrim(ltrim(a.term)) in (''' + Replace(Replace(@SearchString, '''', ''''''), ' ', ''',''') + ''')
group by a.productid order by SumOf desc, CountOf desc'
exec(@sqlstring)