How does one automatically insert the results of several function calls into a table? - sql

Wasn't sure how to title the question but hopefully this makes sense :)
I have a table (OldTable) with an index and a column of comma separated lists. I'm trying to split the strings in the list column and create a new table with the indexes coupled with each of the sub strings of the string it was connected to in the old table.
Example:
OldTable
index | list
1 | 'a,b,c'
2 | 'd,e,f'
NewTable
index | letter
1 | 'a'
1 | 'b'
1 | 'c'
2 | 'd'
2 | 'e'
2 | 'f'
I have created a function that will split the string and return each sub string as a record in a 1 column table as so:
SELECT * FROM Split('a,b,c', ',', 1)
Which will result in:
Result
index | string
1 | 'a'
1 | 'b'
1 | 'c'
I was hoping that I could use this function as so:
SELECT * FROM Split((SELECT * FROM OldTable), ',')
And then use the id and string columns from OldTable in my function (by re-writing it slightly) to create NewTable. But as far as I understand, sending tables into the function doesn't work, as I get: "Subquery returned more than 1 value. ... not permitted ... when the subquery is used as an expression."
One solution I was thinking of would be to run the function, as is, on all the rows of OldTable and insert the result of each call into NewTable. But I'm not sure how to iterate over each row without a function. And I can't send tables into a function to iterate, so I'm back at square one.
I could do it manually but OldTable contains a few records (1000 or so) so it seems like automation would be preferable.
Is there a way to either:
Iterate over OldTable row by row, run the row through Split(), add the result to NewTable for all rows in OldTable. Either by a function or through regular sql-transactions
Re-write Split() to take a table variable after all
Get rid of the function altogether and just do it in sql transactions?
I'd prefer not to use procedures (I don't know if there is a solution with them either), mostly because I don't want the functionality inside of the DB to be exposed to the outside. If, however, that is the "best"/only way to go, I'll have to consider it. I'm quite (read: very) new to SQL, so it might be a needless worry.
Here is my Split() function if it is needed:
CREATE FUNCTION Split (
    @string nvarchar(4000),
    @delimitor nvarchar(10),
    @index int = 0
)
RETURNS @splitTable TABLE (id int, string nvarchar(4000) NOT NULL) AS
BEGIN
    DECLARE @startOfSubString smallint;
    DECLARE @endOfSubString smallint;
    SET @startOfSubString = 1;
    SET @endOfSubString = CHARINDEX(@delimitor, @string, @startOfSubString);
    IF (@endOfSubString <> 0)
        WHILE @endOfSubString > 0
        BEGIN
            INSERT INTO @splitTable
            SELECT @index, SUBSTRING(@string, @startOfSubString, @endOfSubString - @startOfSubString);
            SET @startOfSubString = @endOfSubString+1;
            SET @endOfSubString = CHARINDEX(@delimitor, @string, @startOfSubString);
        END;
    INSERT INTO @splitTable
    SELECT @index, SUBSTRING(@string, @startOfSubString, LEN(@string)-@startOfSubString+1);
    RETURN;
END
Hope my problem and attempt was explained and possible to understand.

You are looking for cross apply:
SELECT t.[index], s.item
FROM OldTable t CROSS APPLY
     dbo.Split(t.list, ',', t.[index]) s(id, item);
Inserting into the new table just requires an INSERT ... SELECT or a SELECT ... INTO clause.
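For readers without a SQL Server instance at hand, the expansion that CROSS APPLY performs (one output row per substring, tagged with the source row's index) can be modelled with a minimal Python sketch using the standard sqlite3 module. SQLite has no CROSS APPLY or table-valued functions, so the split happens in Python here; the column name idx (used instead of the reserved word index), the helper name expand_lists and the in-memory database are assumptions made only for this illustration.

```python
import sqlite3

def expand_lists(conn):
    """Split each comma-separated list in OldTable into one NewTable row per item."""
    rows = conn.execute("SELECT idx, list FROM OldTable ORDER BY idx").fetchall()
    # One (idx, item) pair per substring: the same shape CROSS APPLY would produce.
    pairs = [(idx, item) for idx, items in rows for item in items.split(",")]
    conn.executemany("INSERT INTO NewTable (idx, letter) VALUES (?, ?)", pairs)
    conn.commit()
    return conn.execute(
        "SELECT idx, letter FROM NewTable ORDER BY idx, rowid"
    ).fetchall()

# Hypothetical in-memory copy of the example tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OldTable (idx INTEGER, list TEXT);
    CREATE TABLE NewTable (idx INTEGER, letter TEXT);
    INSERT INTO OldTable (idx, list) VALUES (1, 'a,b,c'), (2, 'd,e,f');
""")
new_rows = expand_lists(conn)
print(new_rows)
```

On SQL Server itself the set-based CROSS APPLY form is preferable, since it avoids moving the rows out of the database.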

Related

Azure Synapse Analytics SQL Database function to get match between two delimited lists

I'm using Azure Synapse Analytics SQL Database. I'm aware I can't use selects in a scalar function (hence the error "The SELECT statement is not allowed in user-defined functions"). I'm looking for a workaround, since this function does not rely on any tables. The goal is a scalar function that takes two delimited-list parameters and a delimiter parameter, and returns 1 if the lists have one or more matching items and 0 if no matches are found.
--The SELECT statement is not allowed in user-defined functions
CREATE FUNCTION util.get_lsts_have_mtch
(
    @p_lst_1 VARCHAR(8000),
    @p_lst_2 VARCHAR(8000),
    @p_dlmtr CHAR(1)
)
RETURNS BIT
/***********************************************************************************************************
Description: This function returns 1 if two delimited lists have an item that exists in both lists.
--Example run:
SELECT util.get_lsts_have_mtch('AB|CD|EF|GH|IJ','UV|WX|CD|IJ|YZ','|') -- returns 1, there's a match
SELECT util.get_lsts_have_mtch('AB|CD|EF|GH|IJ','ST|UV|WX|YZ','|') -- returns 0, there's no match
**********************************************************************************************************/
AS
BEGIN
    DECLARE @v_result BIT;
    -- *** CAN THIS BE ACCOMPLISHED EFFICIENTLY WITHOUT ANY SELECTS? ***
    SET @v_result = (SELECT CAST(CASE WHEN EXISTS (SELECT 1
                                                   FROM STRING_SPLIT(@p_lst_1, @p_dlmtr) AS tokens_1
                                                   INNER JOIN STRING_SPLIT(@p_lst_2, @p_dlmtr) AS tokens_2
                                                   ON tokens_1.value = tokens_2.value)
                                      THEN 1
                                      ELSE 0
                                 END AS BIT));
    RETURN @v_result;
END;
I ditched the function and used this CASE statement. I wanted a function to join on that would be reusable. If anyone can find a function to do this, I will make that the accepted answer.
SELECT ...
FROM tbl_1
JOIN tbl_2
ON
-- wanted: util.get_lsts_have_mtch(tbl_1.my_lst, tbl_2.my_lst, '|') = 1
-- but settled for:
CASE WHEN EXISTS
(SELECT [value]
FROM STRING_SPLIT(tbl_1.my_lst, '|')
INTERSECT
SELECT [value]
FROM STRING_SPLIT(tbl_2.my_lst, '|'))
THEN 1
ELSE 0
END = 1
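The check that both the function and the INTERSECT workaround express is simply "do the two token sets overlap?". As a minimal sketch of that logic outside the database (for example, if the comparison can be moved into application code), here it is in Python. The function name lists_have_match is hypothetical, and this is not a Synapse workaround, only an illustration of the set logic.

```python
def lists_have_match(list_1, list_2, delimiter):
    """Return 1 if the two delimited lists share at least one item, else 0."""
    items_1 = set(list_1.split(delimiter))
    items_2 = set(list_2.split(delimiter))
    # isdisjoint() is True when the two sets have no common item.
    return 0 if items_1.isdisjoint(items_2) else 1

# The two example calls from the function's header comment.
print(lists_have_match("AB|CD|EF|GH|IJ", "UV|WX|CD|IJ|YZ", "|"))  # 1
print(lists_have_match("AB|CD|EF|GH|IJ", "ST|UV|WX|YZ", "|"))     # 0
```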

Checking if field contains multiple string in sql server

I am working on a SQL database that will supply data to a grid. The grid will support filtering, sorting and paging, but there is also a strict requirement that users can enter free text into a text input above the grid, for example 'Engine 1001 Requi', and that the result will contain only rows whose columns, taken together, contain all the pieces of the text. So one column may contain Engine, another column may contain 1001, and yet another may contain Requi.
I created a technical column (let's call it myTechnicalColumn) in the table (let's call it myTable), which is updated each time someone inserts or updates a row and contains the values of all the columns combined and separated by spaces.
Now, to use it with Entity Framework, I decided to use a table-valued function that accepts one parameter, @searchText, and handles it like this:
CREATE FUNCTION myFunctionName(@searchText NVARCHAR(MAX))
RETURNS @Result TABLE
( ... here come columns )
AS
BEGIN
    DECLARE @searchToken TokenType
    INSERT INTO @searchToken(token) SELECT value FROM STRING_SPLIT(@searchText,' ')
    DECLARE @searchTextLength INT
    SET @searchTextLength = (SELECT COUNT(*) FROM @searchToken)
    INSERT INTO @Result
    SELECT
    ... here come columns
    FROM myTable
    WHERE (SELECT COUNT(*) FROM @searchToken WHERE CHARINDEX(token, myTechnicalColumn) > 0) = @searchTextLength
    RETURN;
END
Of course the solution works fine but it's kinda slow. Any hints how to improve its efficiency?
You can use an inline Table Valued Function, which should be quite a lot faster.
This would be a direct translation of your current code
CREATE FUNCTION myFunctionName(@searchText NVARCHAR(MAX))
RETURNS TABLE
AS RETURN
(
    WITH searchText AS (
        SELECT s.value AS token
        FROM STRING_SPLIT(@searchText, ' ') s
    )
    SELECT
    ... here come columns
    FROM myTable t
    WHERE (
        SELECT COUNT(*)
        FROM searchText s
        WHERE CHARINDEX(s.token, t.myTechnicalColumn) > 0
    ) = (SELECT COUNT(*) FROM searchText)
);
GO
GO
You are using a form of query called Relational Division Without Remainder and there are other ways to cut this cake:
CREATE FUNCTION myFunctionName(@searchText NVARCHAR(MAX))
RETURNS TABLE
AS RETURN
(
    WITH searchText AS (
        SELECT s.value AS token
        FROM STRING_SPLIT(@searchText, ' ') s
    )
    SELECT
    ... here come columns
    FROM myTable t
    WHERE NOT EXISTS (
        SELECT 1
        FROM searchText s
        WHERE CHARINDEX(s.token, t.myTechnicalColumn) = 0
    )
);
GO
GO
This may be faster or slower depending on a number of factors, so you need to test.
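To see that the two formulations of the division (count of matching tokens equals the token count, versus no token that fails to match) select the same rows, here is a minimal sketch using Python's standard sqlite3 module. The sample rows, the id column and the searchToken table are hypothetical; SQLite's instr(haystack, needle) stands in for CHARINDEX(needle, haystack), and it is case-sensitive, whereas CHARINDEX follows the column's collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (id INTEGER, myTechnicalColumn TEXT);
    INSERT INTO myTable VALUES
        (1, 'Engine 1001 Requires service'),
        (2, 'Engine 2002 Requires service'),
        (3, 'Pump 1001 OK');
    CREATE TABLE searchToken (token TEXT);
""")

def load_tokens(search_text):
    """Replace the token table with the space-separated pieces of search_text."""
    conn.execute("DELETE FROM searchToken")
    conn.executemany(
        "INSERT INTO searchToken (token) VALUES (?)",
        [(tok,) for tok in search_text.split(" ") if tok],
    )

# Form 1: the number of tokens found in the row equals the number of tokens.
COUNT_FORM = """
    SELECT t.id FROM myTable t
    WHERE (SELECT COUNT(*) FROM searchToken s
           WHERE instr(t.myTechnicalColumn, s.token) > 0)
          = (SELECT COUNT(*) FROM searchToken)
    ORDER BY t.id
"""
# Form 2: there is no token that is missing from the row.
NOT_EXISTS_FORM = """
    SELECT t.id FROM myTable t
    WHERE NOT EXISTS (SELECT 1 FROM searchToken s
                      WHERE instr(t.myTechnicalColumn, s.token) = 0)
    ORDER BY t.id
"""

load_tokens("Engine 1001 Requi")
by_count = [r[0] for r in conn.execute(COUNT_FORM)]
by_not_exists = [r[0] for r in conn.execute(NOT_EXISTS_FORM)]
print(by_count, by_not_exists)
```

Only row 1 contains all three tokens, so both queries return it. Which form is faster on SQL Server still has to be measured against real data.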
Since there is no data to test, I am not sure if the following will solve your issue:
-- Replace the last INSERT portion
INSERT INTO @Result
SELECT
... here come columns
FROM myTable T
JOIN @searchToken S ON CHARINDEX(S.token, T.myTechnicalColumn) > 0
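One thing worth checking with a join like this is that, on its own, it returns a row for every token that matches, so a row matching any token appears (possibly several times) rather than only rows matching all tokens. The following minimal sqlite3 sketch in Python illustrates the difference and one possible way to restore the "all tokens" requirement with GROUP BY / HAVING. The sample rows and the id column are hypothetical, instr() stands in for CHARINDEX, and the HAVING comparison assumes the tokens are distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (id INTEGER, myTechnicalColumn TEXT);
    INSERT INTO myTable VALUES
        (1, 'Engine 1001 Requires service'),
        (2, 'Engine 2002 Requires service'),
        (3, 'Pump 1001 OK');
    CREATE TABLE searchToken (token TEXT);
    INSERT INTO searchToken VALUES ('Engine'), ('1001'), ('Requi');
""")

# Plain join: one output row per (row, matching token) pair.
plain_join = [r[0] for r in conn.execute("""
    SELECT t.id FROM myTable t
    JOIN searchToken s ON instr(t.myTechnicalColumn, s.token) > 0
    ORDER BY t.id
""")]

# Grouped join: keep only rows whose match count equals the token count.
all_tokens = [r[0] for r in conn.execute("""
    SELECT t.id FROM myTable t
    JOIN searchToken s ON instr(t.myTechnicalColumn, s.token) > 0
    GROUP BY t.id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM searchToken)
    ORDER BY t.id
""")]
print(plain_join, all_tokens)
```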

creating a SQL table with multiple columns automatically

I must create an SQL table with 90+ fields, the majority of them bit fields such as N01, N02, N03 ... N89, N90. Is there a fast way of creating multiple fields, or is it possible to have one single field contain an array of true/false values? I need a solution that can also be queried easily.
There is no easy way to do this, and it will be very challenging to write queries against such a table. Create a table with three columns: item number, bit field number and a value field. Then you will be able to write 'good', succinct T-SQL queries against the table.
At least you can generate ALTER TABLE scripts for bit fields, and then run those scripts.
DECLARE @COUNTER INT = 1
WHILE @COUNTER < 10
BEGIN
    PRINT 'ALTER TABLE table_name ADD N' + RIGHT('00' + CONVERT(NVARCHAR(4), @COUNTER), 2) + ' bit'
    SET @COUNTER += 1
END
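The same script generation can be done outside the database. As a minimal sketch in Python, the helper below builds one ALTER TABLE statement per column for the full N01 to N90 range; the helper name alter_statements is hypothetical, and table_name is the same placeholder as in the loop above.

```python
def alter_statements(table_name, first, last):
    """Build one ALTER TABLE statement per bit column N<first>..N<last>."""
    return [
        f"ALTER TABLE {table_name} ADD N{n:02d} bit"  # zero-padded to two digits
        for n in range(first, last + 1)
    ]

statements = alter_statements("table_name", 1, 90)
print(statements[0])    # ALTER TABLE table_name ADD N01 bit
print(statements[-1])   # ALTER TABLE table_name ADD N90 bit
print(len(statements))  # 90
```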
TLDR: Use binary arithmetic.
For a structure like this
==============
Table_Original
==============
Id | N01| N02 |...
I would recommend an alternate table structure like this
==============
Table_Alternate
==============
Id | One_Col
This One_Col is of varchar type which will have value set as
cast(n01 as nvarchar(1)) + cast(n02 as nvarchar(1))+ cast(n03 as nvarchar(1)) as One_Col
I however feel that you'd use C# or some other programming language to set value into column. You can also use bit and bit-shift operations.
Whenever you need to get a value, you can use SQL or C# syntax(treating as string)
In SQL query terms you can use a query like
SELECT SUBSTRING(one_col, @pos, 1)
and @pos can be set like
DECLARE @colname nvarchar(4)
DECLARE @pos int
SET @colname = N'N32'
-- ....
SET @pos = CAST(REPLACE(@colname, 'N', '') AS INT)
You can also use binary arithmetic with ease in any programming language.
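As a rough illustration of the binary-arithmetic variant, here is a minimal Python sketch that packs the flags into a single integer and reads them back by position. The helper names are hypothetical, and positions are 1-based to match N01..N90. Python integers have arbitrary precision; in a database, 90 flags would not fit into a single 64-bit integer column, so the mask would have to be split or stored in a binary type.

```python
def position(colname):
    """Map a column name such as 'N32' to its 1-based flag position."""
    return int(colname.replace("N", ""))

def set_flag(mask, pos, value):
    """Return a copy of mask with the flag at pos set to value."""
    bit = 1 << (pos - 1)
    return (mask | bit) if value else (mask & ~bit)

def get_flag(mask, pos):
    """Return 1 if the flag at pos is set, else 0."""
    return (mask >> (pos - 1)) & 1

mask = 0
mask = set_flag(mask, position("N01"), True)
mask = set_flag(mask, position("N32"), True)
mask = set_flag(mask, position("N90"), True)
mask = set_flag(mask, position("N01"), False)  # clear N01 again
print(get_flag(mask, 32), get_flag(mask, 1), get_flag(mask, 90))  # 1 0 1
```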
Use three columns.
Table
ID NUMBER,
FIELD_NAME VARCHAR2(10),
VALUE NUMBER(1)
Example
ID FIELD VALUE
1 N01 1
1 N02 0
.
1 N90 1
.
2 N01 0
2 N02 1
.
2 N90 1
.
You can also OR an entire column for a fieldname (or fieldnameS):
select DECODE(SUM(VALUE), 0, 0, 1) from table where field_name = 'N01';
And even perform an AND
select EXP(SUM(LN(VALUE))) from table where field_name = 'N01';
(see http://viralpatel.net/blogs/row-data-multiplication-in-oracle/)
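To try the three-column layout without an Oracle instance, here is a minimal sketch using Python's standard sqlite3 module. SQLite has no DECODE, and the EXP(SUM(LN(...))) product trick cannot handle a 0 value because LN(0) is undefined, so this sketch swaps in a different technique: for 0/1 values, MAX(value) behaves as OR and MIN(value) as AND. The table name flags and the helper names are assumptions for the example; the rows mirror the example data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flags (id INTEGER, field_name TEXT, value INTEGER);
    INSERT INTO flags VALUES
        (1, 'N01', 1), (1, 'N02', 0), (1, 'N90', 1),
        (2, 'N01', 0), (2, 'N02', 1), (2, 'N90', 1);
""")

def column_or(field_name):
    """1 if any row has the flag set: MAX over 0/1 values acts as OR."""
    return conn.execute(
        "SELECT MAX(value) FROM flags WHERE field_name = ?", (field_name,)
    ).fetchone()[0]

def column_and(field_name):
    """1 only if every row has the flag set: MIN over 0/1 values acts as AND."""
    return conn.execute(
        "SELECT MIN(value) FROM flags WHERE field_name = ?", (field_name,)
    ).fetchone()[0]

print(column_or("N01"), column_and("N01"))  # 1 0
print(column_or("N90"), column_and("N90"))  # 1 1
```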

Firebird how to select ids that match all items in a set

I'm using Firebird 2.1.
There is a table: IDs, Labels
There can be multiple labels for the same ID:
10 Peach
10 Pear
10 Apple
11 Apple
12 Pear
13 Peach
13 Apple
Let's say I have a set of labels, e.g. (Apple, Pear, Peach).
How can I write a single select to return all IDs that have all labels associated in a given set? Preferably I'd like to specify the set in a string separated with commas, like: ('Apple', 'Pear', 'Peach') -> this should return ID = 10.
Thanks!
As asked, I'm posting my simpler version of pilcrow's answer. I have tested this on my Firebird, which is version 2.5, but the OP (Steve) has tested it on 2.1 and it works as well.
SELECT id
FROM table
WHERE label IN ('Apple', 'Pear', 'Peach')
GROUP BY id
HAVING COUNT(DISTINCT label)=3
This solution has the same disadvantage as pilcrow's... you need to know how many values you are looking for, as the HAVING = condition must match the WHERE IN condition. In this respect, Ed's answer is more flexible, as it splits the concatenated value string parameter and counts the values. So you just have to change the one parameter, instead of the 2 conditions I and pilcrow use.
OTOH, if efficiency is of concern, I would rather think (but I am absolutely not sure) that Ed's CTE approach might be less optimizable by the Firebird engine than the one I suggest. Firebird is very good at optimizing queries, but I don't really know if it is able to do so when you use a CTE this way. But the WHERE + GROUP BY + HAVING should be optimizable by simply having an index on (id,label).
In conclusion, if execution times are of concern in your case, then you probably need some explain plans to see what is happening, whichever solution you choose ;)
It's easiest to split the string in code and then query
SQL> select ID
CON> from (select ID, count(DISTINCT LABEL) as N_LABELS
CON> from T
CON> where LABEL in ('Apple', 'Pear', 'Peach')
CON> group by 1) D
CON> where D.N_LABELS >= 3; -- We know a priori we have 3 LABELs
ID
============
10
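Splitting the string in application code also removes the need to hard-code the label count, because the count can be derived from the same list that feeds the IN clause. Here is a minimal sketch using Python's standard sqlite3 module as a stand-in for Firebird (the GROUP BY / HAVING query itself is the one shown above); the helper name ids_with_all_labels and the in-memory table are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (id INTEGER, label TEXT);
    INSERT INTO T VALUES
        (10, 'Peach'), (10, 'Pear'), (10, 'Apple'),
        (11, 'Apple'), (12, 'Pear'),
        (13, 'Peach'), (13, 'Apple');
""")

def ids_with_all_labels(label_string):
    """Return the ids that carry every label in a comma-separated string."""
    # De-duplicate so the expected count matches COUNT(DISTINCT label).
    labels = sorted({s.strip() for s in label_string.split(",") if s.strip()})
    placeholders = ", ".join("?" for _ in labels)
    sql = f"""
        SELECT id FROM T
        WHERE label IN ({placeholders})
        GROUP BY id
        HAVING COUNT(DISTINCT label) = ?
        ORDER BY id
    """
    # The label values and their count are bound as parameters, not concatenated.
    return [r[0] for r in conn.execute(sql, (*labels, len(labels)))]

print(ids_with_all_labels("Apple,Pear,Peach"))  # [10]
print(ids_with_all_labels("Apple,Peach"))       # [10, 13]
```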
If it is acceptable to create a helper stored procedure that will be called from the primary select then consider the following.
The Helper stored procedure takes in a delimited string along with the delimiter and returns a row for each delimited string
CREATE OR ALTER PROCEDURE SPLIT_BY_DELIMITER (
WHOLESTRING VARCHAR(10000),
SEPARATOR VARCHAR(10))
RETURNS (
ROWID INTEGER,
DATA VARCHAR(10000))
AS
DECLARE VARIABLE I INTEGER;
BEGIN
I = 1;
WHILE (POSITION(:SEPARATOR IN WHOLESTRING) > 0) DO
BEGIN
ROWID = I;
DATA = TRIM(SUBSTRING(WHOLESTRING FROM 1 FOR POSITION(TRIM(SEPARATOR) IN WHOLESTRING) - 1));
SUSPEND;
I = I + 1;
WHOLESTRING = TRIM(SUBSTRING(WHOLESTRING FROM POSITION(TRIM(SEPARATOR) IN WHOLESTRING) + 1));
END
IF (CHAR_LENGTH(WHOLESTRING) > 0) THEN
BEGIN
ROWID = I;
DATA = WHOLESTRING;
SUSPEND;
END
END
Below is the code to call it; I am using EXECUTE BLOCK to demonstrate passing in the delimited string.
EXECUTE BLOCK
RETURNS (
LABEL_ID INTEGER)
AS
DECLARE VARIABLE PARAMETERS VARCHAR(50);
BEGIN
PARAMETERS = 'Apple,Peach,Pear';
FOR WITH CTE
AS (SELECT ROWID,
DATA
FROM SPLIT_BY_DELIMITER(:PARAMETERS, ','))
SELECT ID
FROM TABLE1
WHERE LABELS IN (SELECT DATA
FROM CTE)
GROUP BY ID
HAVING COUNT(*) = (SELECT COUNT(*)
FROM CTE)
INTO :LABEL_ID
DO
SUSPEND;
END

SQL select multiple rows of data then compare

What would be the best approach in SQL Server 2008 to select a result set that can contain up to 10 rows of data, and then compare that data with a specific value in one of its columns?
So something like this:
SELECT bType FROM WORK_STATION WHERE nFileId = 123456789
This could return between 1 and 10 values (it will always return at least one). Then I want to compare the data selected by that statement to a specific value, something like:
if bType = 1
--DO something
What is the best approach of doing something like this?
declare @table as table(btype int)
declare @btype int
insert into @table
SELECT bType FROM WORK_STATION WHERE nFileId = 123456789
while(exists(select top 1 'x' from @table)) --as long as @table contains records continue
begin
    select top 1 @btype = btype from @table
    if(@btype = 10)
        print 'something'
    delete top (1) from @table --remove the previously processed row. also ensures no infinite loop
end
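If the goal is only to find out whether any of the returned rows has a particular bType, a loop may not be needed at all: a single existence check answers the question (in T-SQL that would be an IF EXISTS (...) around the filtered SELECT). As a minimal sketch of that alternative, using Python's standard sqlite3 module with hypothetical sample rows and a hypothetical helper name file_has_btype:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE WORK_STATION (nFileId INTEGER, bType INTEGER);
    INSERT INTO WORK_STATION VALUES
        (123456789, 1), (123456789, 4), (987654321, 10);
""")

def file_has_btype(file_id, btype):
    """True if at least one row for file_id has the given bType."""
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM WORK_STATION"
        " WHERE nFileId = ? AND bType = ?)",
        (file_id, btype),
    ).fetchone()
    return bool(row[0])

if file_has_btype(123456789, 1):
    print("do something")
```

A row-by-row loop like the one above is still the way to go when each row needs its own side effect rather than a single yes/no answer.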
I think you can use a stored procedure to declare variables and then compare them with the result set. If you know that you have at most 10 values, you can use a temp table and insert the 10 values.
I hope this is helpful.