SQL query--String Permutations


I am trying to create a query using a db on OpenOffice where a string is entered in the query, and all permutations of the string are searched in the database and the matches are displayed. My database has fields for a word and its definition, so if I am looking for GOOD I will get its definition as well as the definition for DOG.

You'll need a third column as well. This column holds the word with its letters sorted in alphabetical order. For example, next to the word APPLE you would store AELPP.
You would then sort the letters of the word you're looking for and run some SQL code like
WHERE sorted_words = 'my_sorted_word'
For the word APPLE, you would get something like this:
sorted unsorted
AELPP APPLE
AELPP PEPLA
AELPP APPEL
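As a rough sketch of the sorted-letters idea outside the database (Python here, purely for illustration; the function names and the sample word list are made up), the key can be computed and used for lookup like this:

```python
from collections import defaultdict

def sorted_key(word):
    """Return the word's letters in alphabetical order, upper-cased."""
    return "".join(sorted(word.upper()))

def build_index(words):
    """Map each sorted key to the words that share it (the 'third column')."""
    index = defaultdict(list)
    for w in words:
        index[sorted_key(w)].append(w.upper())
    return index

# Hypothetical sample data standing in for the word table.
words = ["GOOD", "DOG", "GOD", "APPLE", "LEAP", "PEA"]
index = build_index(words)

print(sorted_key("APPLE"))        # AELPP
print(index[sorted_key("odg")])   # ['DOG', 'GOD']
```

The same key would be stored in the extra column when a word is inserted, so the lookup stays a plain equality comparison.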
Now, correct me if I'm wrong, but you also want all the words that can be made with **any combination** of the letters, meaning APPLE also returns words like LEAP and PEA.
To do this, you would have to use some programming language. You would write a function that performs the above lookup recursively on every subset of the letters. For example, for the sorted word AELPP you have
ELPP
ALPP
AEPP
AELP
and so forth (each time removing one letter in every possible combination, then two letters in every possible combination, etc.).
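One possible way to enumerate those letter subsets, sketched in Python under the assumption that each subset is then looked up against the sorted column (the name sub_keys and the min_len parameter are illustrative only):

```python
from itertools import combinations

def sub_keys(word, min_len=2):
    """Return every sorted-letter key obtainable by dropping letters from word."""
    letters = sorted(word.upper())
    keys = set()
    for size in range(min_len, len(letters) + 1):
        # combinations() preserves input order, so each result is already sorted.
        for combo in combinations(letters, size):
            keys.add("".join(combo))
    return keys

keys = sub_keys("APPLE")
print("AELP" in keys)  # True, the key for LEAP
print("AEP" in keys)   # True, the key for PEA
```

Each key could then go into a WHERE sorted_words IN (...) list, which keeps the database side a simple indexed lookup.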

Basically, you can't easily do permutations in a single SQL statement. You can easily do them in another language though; for example, here's how to do it in C#: http://msdn.microsoft.com/en-us/magazine/cc163513.aspx

Ok, corrected version that I think handles all situations. This will work in MS SQL Server, so you may need to adjust it for your RDBMS as far as using the local table and the REPLICATE function. It assumes a passed parameter called @search_string. Also, since it's using VARCHAR instead of NVARCHAR, if you're using extended characters be sure to change that.
One last point that I'm just thinking of now... it will allow duplication of letters. For example, "GOOD" would find "DODO" even though there is only one "D" in "GOOD". It will NOT find words of greater length than your original word though. In other words, while it would find "DODO", it wouldn't find "DODODO". Maybe this will give you a starting point to work from though depending on your exact requirements.
DECLARE @search_table TABLE (search_string VARCHAR(4000))
DECLARE @i INT
SET @i = 1
-- Build one pattern per word length: '[GOOD]', '[GOOD][GOOD]', ...
WHILE (@i <= LEN(@search_string))
BEGIN
    INSERT INTO @search_table (search_string)
    VALUES (REPLICATE('[' + @search_string + ']', @i))
    SET @i = @i + 1
END
SELECT
    word,
    definition
FROM
    My_Words W
INNER JOIN @search_table ST ON W.word LIKE ST.search_string
The original query before my edit, just to have it here:
SELECT
word,
definition
FROM
My_Words
WHERE
word LIKE REPLICATE('[' + @search_string + ']', LEN(@search_string))
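To make the duplicate-letter caveat mentioned above concrete, here is a small Python sketch. It mimics the bracket patterns with a regular expression and then shows one possible post-filter based on letter counts. The post-filter is an addition for illustration, not part of the SQL above, and both function names are made up.

```python
import re
from collections import Counter

def like_style_match(search, word):
    """Mimic the LIKE patterns: 1..len(search) characters, each drawn from search."""
    search, word = search.upper(), word.upper()
    pattern = "[" + re.escape(search) + "]{1," + str(len(search)) + "}"
    return re.fullmatch(pattern, word) is not None

def respects_letter_counts(search, word):
    """Stricter check: word may not use a letter more often than search has it."""
    available = Counter(search.upper())
    needed = Counter(word.upper())
    return all(available[ch] >= n for ch, n in needed.items())

print(like_style_match("GOOD", "DODO"))        # True  (the caveat)
print(like_style_match("GOOD", "DODODO"))      # False (too long)
print(respects_letter_counts("GOOD", "DODO"))  # False (needs two D's)
print(respects_letter_counts("GOOD", "DOG"))   # True
```

Running the stricter check on the rows returned by the query would drop words such as DODO.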

Maybe this can help. Suppose you have an auxiliary Numbers table containing the integers.
DECLARE @s VARCHAR(5);
SET @s = 'ABCDE';
WITH Subsets AS (
    SELECT CAST(SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
           CAST('.' + CAST(Number AS CHAR(1)) + '.' AS VARCHAR(11)) AS Permutation,
           CAST(1 AS INT) AS Iteration
    FROM dbo.Numbers
    WHERE Number BETWEEN 1 AND 5
    UNION ALL
    SELECT CAST(Token + SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
           CAST(Permutation + CAST(Number AS CHAR(1)) + '.' AS VARCHAR(11)) AS Permutation,
           s.Iteration + 1 AS Iteration
    FROM Subsets s
    JOIN dbo.Numbers n
      ON s.Permutation NOT LIKE '%.' + CAST(Number AS CHAR(1)) + '.%'
     AND s.Iteration < 5
     AND Number BETWEEN 1 AND 5
    --AND s.Iteration = (SELECT MAX(Iteration) FROM Subsets)
)
SELECT * FROM Subsets
WHERE Iteration = 5
ORDER BY Permutation
Token Permutation Iteration
----- ----------- -----------
ABCDE .1.2.3.4.5. 5
ABCED .1.2.3.5.4. 5
ABDCE .1.2.4.3.5. 5
(snip)
EDBCA .5.4.2.3.1. 5
EDCAB .5.4.3.1.2. 5
EDCBA .5.4.3.2.1. 5
(120 row(s) affected)
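For comparison, the same list can be produced outside the database in a few lines. This is only a sketch in Python using the standard itertools module; the set() call is an extra step that removes duplicates when the input has repeated letters.

```python
from itertools import permutations

def all_permutations(s):
    """Return the distinct permutations of s in sorted order."""
    return sorted(set("".join(p) for p in permutations(s)))

perms = all_permutations("ABCDE")
print(len(perms))             # 120
print(perms[0], perms[-1])    # ABCDE EDCBA
print(len(all_permutations("GOOD")))  # 12, since the two O's are interchangeable
```

The resulting strings could then be matched against the word column with an IN list or a join on a temporary table.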

Related

How to generate 5 digit increment number based on Zone?

My output should look like this. If the zone is North:
N00001
N00002
If the zone is South:
S00001
S00002
Here is the code I tried:
Declare @Zone varchar(20), @ZoneID int, @id varchar(10)
Set @Zone = 'S'
Select @ZoneID = cast(isnull(max(cast(replace(Idno,@Zone,'') as numeric))+1,'00000') as varchar) from memberprofiles Where left(idno,1) = @Zone
Set @id = @Zone+cast(@ZoneID as varchar)
select @id
But every time I get "S1", whereas I need "S00001".
How can I generate zone-wise numbers?

This question is a "closed as duplicate" candidate. But, as there are several flaws, I think it's worth an answer:
1. In your database you seem to have a column "Idno" with a leading character marking the zone. If this is true, you should, if at all possible, change the design. The number and the zone mark should reside in two columns. Any combination of them is a presentation issue.
2. Your LEFT(Idno,1) will perform badly (read about "sargability"). If there's an index on Idno, you would be better off with Idno LIKE @Zone + '%'.
3. Are you sure that in "memberprofiles" there's only one row where your WHERE clause is true? If not, which number would you expect in "@ZoneID" after your SELECT?
4. Your cast(isnull(max(cast(replace(Idno,@Zone,'') as numeric))+1,'00000') ... replaces the leading "S" with nothing, hoping there is a number left. You'll get the highest number (OK, this answers point 3, but is still very, uhm, hacky), yet you expect a "NULL" where you'd return a "00000". This cries loudly for a better design :-)
You should try to get into "set-based" thinking, rather than "procedural" thinking...
Try this:
CREATE TABLE #memberprofile(Idno VARCHAR(100), OtherColumn VARCHAR(100));
INSERT INTO #memberprofile VALUES('N3','Test North 3'),('S24','Test South 24'),('N14','Test North 14');
DECLARE @Zone VARCHAR(20)='N';
SELECT *
      ,ZoneCode + REPLACE(STR(Number,5),' ','0') AS YourNewPaddedCode
FROM #memberprofile
CROSS APPLY
(
    SELECT LEFT(Idno,1) AS ZoneCode
          ,CAST(SUBSTRING(Idno,2,1000) AS INT) AS Number
) AS Idno_in_parts
WHERE ZoneCode=@Zone;
GO
--Clean up
--DROP TABLE #memberprofile
The result:
Idno  OtherColumn    ZoneCode  Number  YourNewPaddedCode
N3    Test North 3   N         3       N00003
N14   Test North 14  N         14      N00014
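The split-then-pad idea, together with the "highest number plus one" step from the question, might look like this in application code. It is a minimal Python sketch that assumes well-formed IDs (a zone prefix followed by digits); next_zone_id is a made-up name.

```python
def next_zone_id(existing_ids, zone, width=5):
    """Return the next zero-padded ID for zone, e.g. 'N00015' after 'N14'."""
    # Keep only IDs of this zone and strip the prefix to get the numeric part.
    numbers = [int(i[len(zone):]) for i in existing_ids if i.startswith(zone)]
    # default=0 covers a zone that has no rows yet, so numbering starts at 1.
    next_number = max(numbers, default=0) + 1
    return f"{zone}{next_number:0{width}d}"

ids = ["N3", "S24", "N14"]
print(next_zone_id(ids, "N"))  # N00015
print(next_zone_id(ids, "S"))  # S00025
print(next_zone_id([], "S"))   # S00001
```

Note that computing "max plus one" outside a transaction can hand out the same number twice under concurrent inserts, which is one more argument for storing the number in its own column.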

SQL Real number convert to Ft & In

I asked a question back in May about how to convert a number from a table that stores inches, such as 300.9, to a Ft' In" display. I got two very good answers...
CONVERT(VARCHAR(20),finlength /12) + '''' + CONVERT(VARCHAR(20),finlength %12)+'"' as FinishLen
replace(replace('<feet>'' <inches>', '<feet>', FinLength / 12), '<inches>', FinLength % 12) as FinishLen
Both worked well until I ran into a table that the inches are declared as "REAL" numbers. Now I ran into this error...
"The data types real and int are incompatible in the modulo operator."
How can I display that? I can't change the table declarations. Other users need that data as well.
Thanks and kudos for the great site.
Guess the full query might help, sorry.
SELECT TOP 1000 ProdWkYr
,Product
,Grade
,CONVERT(VARCHAR(20),finlength /12) + '''' + CONVERT(VARCHAR(20),finlength %12)+'"' as FinishLen
,BlmWeight
,BlmsNeeded
,BlmFootWgt
FROM NYS2MiscOrderInfo
where ProdWkYr = 3215
order by product, Grade

Just include a floor() in your expression like
-- -------------------------------------------------------------------
-- set-up some test data using a CTE:
WITH tst as ( SELECT 13.7 finlength UNION ALL SELECT 123 )
-- alternatively: generate a table [tst] with a single column [finlength]
-- -------------------------------------------------------------------
SELECT CONVERT(VARCHAR(20),FLOOR(finlength / 12)) + ''''
+ CONVERT(VARCHAR(20),finlength % 12)+'"' as FinishLen
FROM tst
-- results:
FinishLen
1'1.70"
10'3."
This will turn the first (ft) value into an integer while the second one (in) will still show all the digits after the decimal point.
UPDATE
When I ran the select against a #tmp table, I got the same error as the OP. I then modified the expression and ended up with the following. It is ugly as hell, but at least it works now:
create table #tst (finlength float);
INSERT INTO #tst VALUES (13.7),(123.),(300.9);
SELECT CONVERT(VARCHAR(20),FLOOR(finlength / 12)) + '''' -- ft
+CONVERT(VARCHAR(20),finlength-FLOOR(finlength) -- in: fractional part
+CAST(FLOOR(finlength) as int) %12)+'"' -- in: integer part
as FinishLen
FROM #tst
Please note: The formula will return reasonable results for positive values. For "negative distances" further changes are necessary. If similar output is required in different places then a UDF makes sense here. Something like:
CREATE FUNCTION ftinstr(@v float) RETURNS varchar(32) BEGIN
    DECLARE @l int;
    SELECT @l=FLOOR(ABS(@v));
    RETURN CAST(SIGN(@v)*(@l/12) AS varchar(6))+''''
          +CAST( ABS(@v)-@l+@l%12 AS varchar(20))+'"'
END
would do the trick. It is called like dbo.ftinstr( floatval ).
Maybe I can beautify it a little still ...
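If the formatting can be done in the client instead, the arithmetic is easier to read there. Below is a hedged Python sketch; rounding to one decimal place is an assumption based on the sample value 300.9, and feet_inches is an illustrative name.

```python
def feet_inches(total_inches):
    """Format a length in inches as feet'inches", keeping one decimal of inches."""
    sign = "-" if total_inches < 0 else ""
    # Round first so that 11.96 becomes 1'0.0" rather than 0'12.0".
    magnitude = round(abs(total_inches), 1)
    feet, inches = divmod(magnitude, 12)
    return f"{sign}{int(feet)}'{inches:.1f}" + '"'

print(feet_inches(300.9))  # 25'0.9"
print(feet_inches(13.7))   # 1'1.7"
print(feet_inches(-13.7))  # -1'1.7"
```

The sign is handled separately so that negative lengths do not hit the floor-versus-truncate difference mentioned above.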

SQL Server : find percentage match of LIKE string

I'm trying to write a query to find the percentage match of a search string in a notes or TEXT column.
This is what I'm starting with:
SELECT *
FROM NOTES
WHERE UPPER(NARRATIVE) LIKE 'PAID CALLED RECEIVED'
Ultimately, what I want to do is:
Split the search string by spaces and search individually for all words in the string
Order the results descending based on percentage match
For example, in the above scenario, each word in the search string would constitute 33.333% of the total. A NARRATIVE with 3 matches (100%) should be at the top of the results, while a match containing 2 of the keywords (66.666%) would be lower, and a match containing 1 of the keywords (33.333%) would be even lower.
I then want to display the resulting percentage match for that row in a column, along with all the other columns from that table (*).
Hopefully, this makes sense and can be done. Any thoughts on how to proceed? This MUST all be done in SQL Server, and I would prefer not to write any CTEs.
Thank you in advance for any guidance.

Here is what I came up with:
DECLARE @VISIT VARCHAR(25) = '999232'
DECLARE @KEYWORD VARCHAR(100) = 'PAID,CALLED,RECEIVED'
DECLARE SPLIT_CURSOR CURSOR FOR
    SELECT RTRIM(LTRIM(VALUE)) FROM Rpt_Split(@KEYWORD, ',')
IF OBJECT_ID('tempdb..#NOTES_FF_SEARCH') IS NOT NULL DROP TABLE #NOTES_FF_SEARCH
SELECT N.VISIT_NO
      ,N.CREATE_DATE
      ,N.CREATE_BY
      ,N.NARRATIVE
      ,0E8 AS PERCENTAGE
INTO #NOTES_FF_SEARCH
FROM NOTES_FF AS N
WHERE N.VISIT_NO = @VISIT
DECLARE @KEYWORD_VALUE AS VARCHAR(255)
OPEN SPLIT_CURSOR
FETCH NEXT FROM SPLIT_CURSOR INTO @KEYWORD_VALUE
WHILE @@FETCH_STATUS = 0
BEGIN
    -- 100.0 avoids integer division (100 / 3 would add 33, not 33.33)
    UPDATE #NOTES_FF_SEARCH
    SET PERCENTAGE = PERCENTAGE + ( 100.0 / @@CURSOR_ROWS )
    WHERE UPPER(NARRATIVE) LIKE '%' + UPPER(@KEYWORD_VALUE) + '%'
    FETCH NEXT FROM SPLIT_CURSOR INTO @KEYWORD_VALUE
END
CLOSE SPLIT_CURSOR
DEALLOCATE SPLIT_CURSOR
SELECT * FROM #NOTES_FF_SEARCH
WHERE PERCENTAGE > 0
ORDER BY PERCENTAGE DESC, CREATE_DATE DESC
There may be a more efficient way to do this but every other road I started down ended in a dead-end. Thanks for your help
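For readers who want to check the scoring logic independently of the cursor, here is a small Python sketch of the same idea: each keyword contributes an equal share, and rows are ordered by descending percentage. The sample notes and function names are made up for illustration.

```python
def match_percentage(narrative, keywords):
    """Percentage of keywords that occur (as substrings) in narrative."""
    if not keywords:
        return 0.0
    text = narrative.upper()
    hits = sum(1 for k in keywords if k.upper() in text)
    return 100.0 * hits / len(keywords)

def rank_notes(notes, search):
    """Return (percentage, note) pairs with at least one hit, best match first."""
    keywords = search.split()
    scored = [(match_percentage(n, keywords), n) for n in notes]
    scored = [pair for pair in scored if pair[0] > 0]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

notes = [
    "Patient called",
    "Paid; called back, payment received",
    "No contact",
]
for pct, note in rank_notes(notes, "PAID CALLED RECEIVED"):
    print(f"{pct:6.2f}  {note}")
```

Like the LIKE '%...%' test, this is substring matching, so PAID would also count inside UNPAID.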

If you want to do a "percentage" match, you need to do two things: calculate the number of words in the string and calculate the number of words you care about. Before giving some guidance, I will say that full text search probably does everything you want and much more efficiently.
Assuming the search string has space-delimited words, you can count the words with the expression:
(len(narrative) - len(replace(narrative, ' ', '')) + 1) as NumWords
You can count the matching words with successive replaces. So, for keywords, it would be something like removing each keyword, fixing the spaces, and counting the words that remain.
The overall code is best represented with subqueries. The resulting query is something like:
select n.*
from (select n.*,
             (len(narrative) - len(replace(narrative, ' ', '')) + 1.0) as NumWords,
             ltrim(rtrim(replace(replace(replace(narrative + ' ', @keyword1 + ' ', ''),
                                         @keyword2 + ' ', ''),
                                 @keyword3 + ' ', ''))) as NoKeywords
      from notes n
     ) n
order by 1 - (len(NoKeywords) - len(replace(NoKeywords, ' ', '')) + 1.0) / NumWords desc;
SQL Server, as with many databases, is not particularly good at parsing strings. You can do that outside the query and assign the @keyword variables accordingly.

Selecting Strings With Alphabetized Characters - In SQL Server 2008 R2

This is a recreational pursuit, and is not homework. If you value academic challenges, please read on.
A radio quiz show had a segment requesting listeners to call in with words that have their characters in alphabetical order, e.g. "aim", "abbot", "celt", "deft", etc. I got these few examples by a quick Notepad++ (NPP) inspection of a Scrabble dictionary word list.
I'm looking for an elegant way in T-SQL to determine if a word qualifies for the list, i.e. all its letters are in alphabetical order, case insensitive.
It seemed to me that there should be some kind of T-SQL algorithm that will do a SELECT on a table of English words and return the complete list of all words in the Scrabble dictionary that meet the spec. I've spent considerable time looking at regex strings, but haven't hit on anything that comes even remotely close. I've thought about the obvious looping scenario, but abandoned it for now as "inelegant". I'm looking for your ideas that will obtain the qualifying word list,
preferably using
- a REGEX expression
- a tally-table-based approach
- a scalar UDF that returns 1 if the input word meets the requirement, else 0.
- Other, only limited by your creativity.
But preferably NOT using
- a looping structure
- a recursive solution
- a CLR solution
Assumptions/observations:
1. A "word" is defined here as two or more characters. My dictionary shows 55 2-character words, of which only 28 qualify.
2. No word will have more than two consecutive characters that are identical. (If you find one, please point it out.)
3. At 21 characters, "electroencephalograms" is the longest word in my Scrabble dictionary
(though why that word is in the Scrabble dictionary escapes me--the board is only a 15-by-15 grid.)
Consider 21 as the upper limit on word length.
4. All words LIKE 'Z%' can be dismissed because all you can create is {'Z','ZZ', ... , 'ZZZ...Z'}.
5. As the dictionary's words' initial character proceeds through the alphabet, fewer words will qualify.
6. As the word lengths get longer, fewer words will qualify.
7. I suspect that there will be less than 0.2% of my dictionary's 60,387 words that will qualify.
For example, I've tried NPP regex searches like "^a[a-z][b-z][b-z][c-z][c-z][d-z][d-z][e-z]" for 9-letter words starting with "a", but the character-by-character alphabetic enforcement is not handled properly. This search will return "abilities" which fails the test with the "i" that follows the "l".
There are several free Scrabble word lists available on the web, but Phil Factor gives a really interesting treatment of T-SQL/Scrabble considerations at https://www.simple-talk.com/sql/t-sql-programming/the-sql-of-scrabble-and-rapping/ which is where I got my word list.
Care to give it a shot?

Split the word into individual characters using a numbers table. Use the numbers as one set of indices. Use ROW_NUMBER to create another set. Compare the two sets of indices to see if they match for every character. If they do, the letters in the word are in alphabetical order.
DECLARE @Word varchar(100) = 'abbot';
WITH indexed AS (
    SELECT
        Index1 = n.Number,
        Index2 = ROW_NUMBER() OVER (ORDER BY x.Letter, n.Number),
        x.Letter
    FROM
        dbo.Numbers AS n
    CROSS APPLY
        (SELECT SUBSTRING(@Word, n.Number, 1)) AS x (Letter)
    WHERE
        n.Number BETWEEN 1 AND LEN(@Word)
)
SELECT
    Conclusion = CASE COUNT(NULLIF(Index1, Index2))
                     WHEN 0 THEN 'Alphabetical'
                     ELSE 'Not alphabetical'
                 END
FROM
    indexed
;
The NULLIF(Index1, Index2) expression does the comparison: it returns NULL if the arguments are equal, otherwise it returns the value of Index1. If all indices match, all the results will be NULL and COUNT will return 0, which means the order of letters in the word was alphabetical.
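As a quick cross-check of the query's results, the same property can be tested outside SQL. This Python sketch compares each letter with its successor, which is equivalent to checking that the word equals its own sorted form; is_alphabetical is an illustrative name.

```python
def is_alphabetical(word):
    """Return True if the letters of word never decrease, ignoring case."""
    letters = word.lower()
    # zip pairs each letter with the next one; repeated letters are allowed.
    return all(a <= b for a, b in zip(letters, letters[1:]))

for w in ["aim", "abbot", "celt", "deft", "david", "abilities"]:
    print(w, is_alphabetical(w))
```

Running it over a word list gives a reference set to compare against the rows returned by the T-SQL solutions.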

I did something similar to Andriy. I created a numbers table with values 1-21. I use it to create one set of data with the individual letters ordered by index and a second set ordered alphabetically. I join the sets to each other on the letter and the number, then count the nulls. Anything over 0 means the word is not in order.
DECLARE @word VARCHAR(21)
SET @word = 'abbot'
SELECT Count(1)
FROM (SELECT Substring(@word, number, 1) AS Letter,
             Row_number() OVER ( ORDER BY number) AS letterNum
      FROM numbers
      WHERE number <= CONVERT(INT, Len(@word))) a
LEFT OUTER JOIN (SELECT Substring(@word, number, 1) AS letter,
                        Row_number() OVER ( ORDER BY Substring(@word, number, 1)) AS letterNum
                 FROM numbers
                 WHERE number <= CONVERT(INT, Len(@word))) b
  ON a.letternum = b.letternum
 AND a.letter = b.letter
WHERE b.letter IS NULL

Interesting idea...
Here's my take on it. This returns a list of words that are in order, but you could easily return 1 instead.
DECLARE @WORDS TABLE (VAL VARCHAR(MAX))
INSERT INTO @WORDS (VAL)
VALUES ('AIM'), ('ABBOT'), ('CELT'), ('DAVID')
;WITH CHARS
AS
(
SELECT VAL AS SOURCEWORD, UPPER(VAL) AS EVALWORD, ASCII(LEFT(UPPER(VAL),1)) AS ASCIICODE, RIGHT(UPPER(VAL),LEN(VAL)-1) AS REMAINS, 1 AS ROWID, 1 AS INORDER, LEN(VAL) AS WORDLENGTH
FROM @WORDS
UNION ALL
SELECT SOURCEWORD, REMAINS, ASCII(LEFT(REMAINS,1)), RIGHT(REMAINS,LEN(REMAINS)-1), ROWID+1, INORDER+CASE WHEN ASCII(LEFT(REMAINS,1)) >= ASCIICODE THEN 1 ELSE 0 END AS INORDER, WORDLENGTH
FROM CHARS
WHERE LEN(REMAINS)>=1
),
ONLYINORDER
AS
(
SELECT *
FROM CHARS
WHERE ROWID=WORDLENGTH AND INORDER=WORDLENGTH
)
SELECT SOURCEWORD
FROM ONLYINORDER
Here it is as a UDF:
CREATE FUNCTION dbo.AlphabetSoup (@Word VARCHAR(MAX))
RETURNS BIT
AS
BEGIN
    SET @WORD = UPPER(@WORD)
    DECLARE @RESULT INT
    ;WITH CHARS
    AS
    (
        SELECT @WORD AS SOURCEWORD,
               @WORD AS EVALWORD,
               ASCII(LEFT(@WORD,1)) AS ASCIICODE,
               RIGHT(@WORD,LEN(@WORD)-1) AS REMAINS,
               1 AS ROWID,
               1 AS INORDER,
               LEN(@WORD) AS WORDLENGTH
        UNION ALL
        SELECT SOURCEWORD,
               REMAINS,
               ASCII(LEFT(REMAINS,1)),
               RIGHT(REMAINS,LEN(REMAINS)-1),
               ROWID+1,
               INORDER+CASE WHEN ASCII(LEFT(REMAINS,1)) >= ASCIICODE THEN 1 ELSE 0 END AS INORDER,
               WORDLENGTH
        FROM CHARS
        WHERE LEN(REMAINS)>=1
    ),
    ONLYINORDER
    AS
    (
        SELECT 1 AS RESULT
        FROM CHARS
        WHERE ROWID=WORDLENGTH AND INORDER=WORDLENGTH
        UNION
        SELECT 0
        FROM CHARS
        WHERE NOT (ROWID=WORDLENGTH AND INORDER=WORDLENGTH)
    )
    -- MAX makes the result deterministic: 1 if any row qualifies, else 0
    SELECT @RESULT = MAX(RESULT) FROM ONLYINORDER
    RETURN @RESULT
END

Is it possible to search for multiple terms in a column by using a LIKE statement?

I'm trying to understand if the above question is possible. I've been conceptually thinking about it, and basically what I'm looking to do is:
Specify keywords that may appear in a title. Let's use the two terms "Portfolio" and "Mike".
I'm hoping to generate a query that will allow for me to search for when Portfolio is contained within a title, or Mike. These two titles need not to be together.
For instance, if I have a title dubbed: "Portfolio A" and another title "Mike's favorite" I'd like both of these titles to be returned.
The issue I've encountered with using a LIKE statement is the following:
WHERE 1=1
and rpt_title LIKE ''%'+@report_title+'%'''
If I were to input: 'Portfolio,Mike' it would search for the occurrence of just that within a title.
EDIT: I should have been a bit more clear. I believe it's necessary for me to input my variable as 'Portfolio, Mike' in order for it to find the multiple values. Is this possible?
I'm assuming you could maybe use a charindex with a substring and a replace?

Yep, multiple Like statements with OR will work just fine -- just make sure you use the correct parentheses:
SELECT ...
FROM ...
WHERE 1=1
and (rpt_title LIKE '%Portfolio%'
or rpt_title LIKE '%Mike%')
However, I might suggest you look into using a full-text search.
http://msdn.microsoft.com/en-us/library/ms142571.aspx

I can propose a solution where you can specify any number of masks without using multiple LIKE predicates:
DECLARE @temp TABLE (st VARCHAR(100))
INSERT INTO @temp (st)
VALUES ('Portfolio photo'),('- Mike'),('blank'),('else'),('est')
DECLARE @delims VARCHAR(30)
SELECT @delims = '|Portfolio|Mike|' -- %Portfolio% OR %Mike% OR etc.
SELECT t.st
FROM @temp t
CROSS JOIN (
    SELECT substr =
        SUBSTRING(
            @delims,
            number + 1,
            CHARINDEX('|', @delims, number + 1) - number - 1)
    FROM [master].dbo.spt_values n
    WHERE [type] = N'P'
        AND number <= LEN(@delims) - 1
        AND SUBSTRING(@delims, number, 1) = '|'
) s
WHERE t.st LIKE '%' + s.substr + '%'
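The split-then-match logic is perhaps easier to see outside SQL. Here is a rough Python sketch that takes the comma-separated input from the question ('Portfolio, Mike') and returns every title containing at least one term. The lower-casing stands in for a case-insensitive collation, and the function names and sample titles are made up.

```python
def split_terms(raw, sep=","):
    """Split a delimited string into trimmed, non-empty search terms."""
    return [t.strip() for t in raw.split(sep) if t.strip()]

def titles_matching_any(titles, raw_terms):
    """Return the titles that contain at least one of the terms."""
    terms = [t.lower() for t in split_terms(raw_terms)]
    return [title for title in titles
            if any(term in title.lower() for term in terms)]

titles = ["Portfolio A", "Mike's favorite", "Quarterly summary"]
print(titles_matching_any(titles, "Portfolio, Mike"))
# ['Portfolio A', "Mike's favorite"]
```

Because any() stops at the first hit, a title that contains both terms is still returned only once.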
