SQL Server Best match query with update (T-SQL) - sql

I am trying to find out what's the most optimized SQL Query to achieve the following.
I have a table containing ZipCodes/PostalCodes, let's assume the following structure:
table_codes:
ID | ZipCode
---------------
1 1234
2 1235
3 456
and so on.
The users of my application fill out a profile where they are required to enter their ZipCode (PostalCode).
Assuming that sometimes, the user will enter a ZipCode not defined in my table, I am trying to suggest a Best Match based on the zip entered by the user.
I am using the following query:
Declare @entered_zipcode varchar(10)
set @entered_zipcode = '23456'
SELECT TOP 1 table_codes.ZipCode
FROM table_codes
where @entered_zipcode LIKE table_codes.ZipCode + '%'
or table_codes.ZipCode + '%' like @entered_zipcode + '%'
ORDER BY table_codes.ZipCode, LEN(table_codes.ZipCode) DESC
Basically, I am trying the following:
if @entered_zipcode is longer than any zip code in the table, I am trying to get the best prefix in the zip table matching @entered_zipcode
if @entered_zipcode is shorter than any existing code in the table, I am trying to use it as a prefix and get the best match in the table
Moreover, I am building a temp table with the following structure:
#tmpTable
------------------------------------------------------------------------------------
ID | user1_enteredzip | user1_bestmatchzip | user2_enteredzip | user2_bestmatchzip |
------------------------------------------------------------------------------------
1 | 12 | *1234* | 4567 | **456** |
2 |
3 |
4 |
Entered zip is the one the user enters and the code between * .. * is the best matching code from my lookup table, that I am trying to get using the query below.
The query seems to take a little too long, and this is why I am asking for help in optimizing it:
update #tmpTable
set user1_bestmatchzip = ( SELECT TOP 1
zipcode
FROM table_codes
where #tmpTable.user1_enteredzip LIKE table_codes.zipcode + '%'
or table_codes.zipcode + '%' like #tmpTable.user1_enteredzip + '%'
ORDER BY table_codes.zipcode, LEN(table_codes.zipcode) DESC
),
user2_bestmatchzip = ( SELECT TOP 1
zipcode
FROM table_codes
where #tmpTable.user2_enteredzip LIKE table_codes.zipcode + '%'
or table_codes.zipcode + '%' like #tmpTable.user2_enteredzip + '%'
ORDER BY table_codes.zipcode, LEN(table_codes.zipcode) DESC
)
from #tmpTable

What if you change your temp table to be like:
id | user | enteredzip | bestmatchzip
10 | 1 | 12345 | 12345
20 | 2 | 12 | 12345
That is: use a column to save the user number (1 or 2). This way you will update one row at a time.
Also, the ORDER BY takes time; did you set indices on the zipcode? Couldn't you add a "length" column to the zipcodes table to pre-compute the zipcode lengths?
EDIT:
I was thinking that ordering by LEN makes no sense; you could remove that! If the zipcodes cannot have duplicates, then ordering by the zipcode is enough. If they can, though, the LEN will always be equal!

You are comparing the first characters of both strings - what if you compare substrings of the minimal length?
select top 1 zipcode
from table_codes
where substring(zipcode, 1, case when len(zipcode) > len(@entered_zipcode) then len(@entered_zipcode) else len(zipcode) end)
= substring(@entered_zipcode, 1, case when len(zipcode) > len(@entered_zipcode) then len(@entered_zipcode) else len(zipcode) end)
order by len(zipcode) desc
This will remove the OR and allow for usage of an index in *@entered_zipcode LIKE table_codes.ZipCode + '%'*. Also, it seems to me that the ordering of results in the question is wrong - shorter zipcodes go first.
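To make the comparison above concrete outside of T-SQL, here is a minimal Python sketch of the same idea. The function name `best_match` and the sample list are hypothetical, and "best" is assumed to mean "the longest code that agrees with the entered value over the shorter of the two lengths"; it is an illustration of the logic, not a replacement for the query.

```python
def best_match(entered, codes):
    """Return the longest code that agrees with `entered` over the shorter length."""
    candidates = []
    for code in codes:
        n = min(len(code), len(entered))  # compare only the overlapping prefix
        if code[:n] == entered[:n]:
            candidates.append(code)
    # Longest code first; ties are broken alphabetically so the result is deterministic.
    candidates.sort(key=lambda c: (-len(c), c))
    return candidates[0] if candidates else None


codes = ["1234", "1235", "456"]
print(best_match("12", codes))     # "12" is a prefix of both "1234" and "1235"
print(best_match("4567", codes))   # "456" is a prefix of "4567"
print(best_match("23456", codes))  # no code shares a prefix with "23456"
```

With the sample table from the question, "12" resolves to 1234 and "4567" resolves to 456, which matches the temp table example.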

Related

Case statement logic and substring

Say I have the following data:
Passes
ID | Pass_code
-----------------
100 | 2xBronze
101 | 1xGold
102 | 1xSilver
103 | 2xSteel
Passengers
ID | Passengers
-----------------
100 | 2
101 | 5
102 | 1
103 | 3
I want to count then create a ticket in the output of:
ID 100 | 2 pass (bronze)
ID 101 | 5 pass (because it is gold, we count all passengers)
ID 102 | 1 pass (silver)
ID 103 | 2 pass (steel)
I was thinking of something like the code below; however, I am unsure how to finish my case statement. I want to substring pass_code so that we get the pass number, e.g. '2xBronze' should give me 2. Then for ID 103, we have 2 passes and 3 customers, so we should output 2.
Also, is there a way to first find '2xbronze' if the pass_code contained lots of other things, such as '101001, 1xbronze, FirstClass'? This may change, so I don't want to substring by position; could we search for '2xbronze' and then pull out the 2?
SELECT
CASE
WHEN Passes.pass_code like '%gold%' THEN Passengers.passengers
WHEN Passes.pass_code like '%steel%' THEN SUBSTRING(passes.pass_code, 1,1)
WHEN Passes.pass_code like '%bronze%' THEN SUBSTRING(passes.pass_code, 1,1)
WHEN Passes.pass_code like '%silver%' THEN SUBSTRING(passes.pass_code, 1,1)
else 0 end as no,
Passes.ID,
Passes.Pass_code,
Passengers.Passengers
FROM Passes
JOIN Passengers ON Passes.ID = Passengers.ID
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=db698e8562546ae7658270e0ec26ca54
So assuming you are indeed using Oracle (as your DB fiddle implies).
You can do some string magic by finding the position of a splitter character (in your case the x), then substringing based on that. Obviously this has its problems, and x is a bad separator character as well, but it works for your current set.
WITH PASSCODESPLIT AS
(
SELECT PASSES.ID,
TO_Number(SUBSTR(PASSES.PASS_CODE, 0, (INSTR(PASSES.PASS_CODE, 'x')) - 1)) AS NrOfPasses,
LOWER(SUBSTR(PASSES.PASS_CODE, (INSTR(PASSES.PASS_CODE, 'x')) + 1)) AS PassType -- lower-cased so 'Gold' also matches 'gold' below
FROM Passes
)
SELECT
PASSCODESPLIT.ID,
CASE
WHEN PASSCODESPLIT.PassType = 'gold' THEN Passengers.Passengers
ELSE PASSCODESPLIT.NrOfPasses
END AS NrOfPasses,
PASSCODESPLIT.PassType,
Passengers.Passengers
FROM PASSCODESPLIT
INNER JOIN Passengers ON PASSCODESPLIT.ID = Passengers.ID
ORDER BY PASSCODESPLIT.ID ASC
Gives the result of:
ID NROFPASSES PASSTYPE PASSENGERS
100 2 bronze 2
101 5 gold 5
102 1 silver 1
103 2 steel 3
As can also be seen in this fiddle
But I would strongly advise you to fix your table design. Having multiple attributes in the same column leads to troubles like these. And the more variables/variations you start storing, the more 'magic' you need to keep doing.
In this particular example I see no reason why you don't simply have 3 columns in Passes, which would also give you the opportunity to add new columns going forward, e.g. to keep track of first class.
You can extract the numbers using regexp_substr(). So I think this does what you want:
SELECT (CASE WHEN p.pass_code LIKE '%gold%'
THEN pp.passengers -- gold: count all passengers
ELSE TO_NUMBER(REGEXP_SUBSTR(p.pass_code, '^[0-9]+')) -- otherwise: leading number of passes
END) as num,
p.ID, p.Pass_code, pp.Passengers
FROM Passes p JOIN
Passengers pp
ON p.ID = pp.ID;
Here is a db<>fiddle.
This converts the leading digits in the code to a number. Also note the use of table aliases to simplify the query.
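For the second part of the question (finding something like '1xbronze' inside a longer string), the same rule can be sketched in Python. This is only an illustration under assumptions: the function name `passes_to_issue` is hypothetical, the set of pass types is taken from the sample data, and the fallback of 0 mirrors the `else 0` in the question's CASE expression.

```python
import re


def passes_to_issue(pass_code, passengers):
    """Gold covers every passenger; otherwise use the number in front of the pass type."""
    if "gold" in pass_code.lower():
        return passengers
    # Search anywhere in the string, so extra items around the pass code do not matter.
    match = re.search(r"(\d+)x(?:bronze|silver|steel)", pass_code, re.IGNORECASE)
    return int(match.group(1)) if match else 0


print(passes_to_issue("2xBronze", 2))
print(passes_to_issue("1xGold", 5))
print(passes_to_issue("2xSteel", 3))
print(passes_to_issue("101001, 1xbronze, FirstClass", 4))
```

Because the pattern is searched rather than anchored at the start, it does not depend on the pass code being the first item in the column.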

SQL LIKE using the same row value

I'm wondering how can I use a row value as a variable for my like statement? For example
ID | PID | DESCRIPTION
1 | 4124 | Hi4124
2 | 2451 | Test
3 | 1467 | Hello
4 | 9642 | Me9642
I have a table above, I want to return IDs 1 and 4 since DESCRIPTION contains PID.
I'm thinking it would be SELECT * from TABLE WHERE DESCRIPTION LIKE '%PID%' but I can't get it.
You can use CONCAT() to assemble the matching pattern, as in:
select *
from t
where description like concat('%', PID, '%')
We could also try using CHARINDEX here:
SELECT ID, PID, DESCRIPTION
FROM yourTable
WHERE CHARINDEX(PID, DESCRIPTION) > 0;
Demo
Note that I assume in the demo that the PID column is actually text, and not a numeric column. If PID is numeric, we might have to first use a cast in order to use CHARINDEX (or any of the methods given in the other answers).
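The cast mentioned above is the only subtle part. As a small Python sketch of the same check (the `rows` list is just the sample data from the question, with PID assumed to be numeric), the number is converted to text before testing whether the description contains it:

```python
rows = [
    (1, 4124, "Hi4124"),
    (2, 2451, "Test"),
    (3, 1467, "Hello"),
    (4, 9642, "Me9642"),
]

# str(pid) plays the role of the cast; `in` plays the role of LIKE '%...%' / CHARINDEX > 0.
matching_ids = [row_id for row_id, pid, description in rows if str(pid) in description]
print(matching_ids)
```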
Use the CONCAT SQL function
SELECT *
FROM TABLE
WHERE DESCRIPTION LIKE CONCAT('%', PID, '%')

find value in comma separated list for each record of a different table

We have a table that holds our customers, the product they have and the cities they subscribe to. The cities field is a comma separated list of numbers.
They want a stored procedure that will find all the customers that have the specific city ids they pass in, but they want to be able to pass in more than one city id.
Example
They want to find all customers that are subscribed to 22 and/or 900 (both are separate cities).
I need all customers that have one or the other or both of these cities in their comma separated list.
So I need a way to search that list for the first value and then search the list for the second value. I was thinking of using a recursive CTE but I need to join the City table (temp table I created to separate the list of city ids they pass) and I cant.
When they put the city_ids in, I separate that list into a table that has each city id as its own record in the temporary table. Help?
And please don't say stop storing comma separated lists... I can't change that part of how our system functions and that is not helpful. (I've seen it many times while searching for an answer to this question).
Customer |Product_ID |Cities
6 | 49 |ALL
9 | 2760 |ALL
9 | 3618 |ALL
9 | 3981 |ALL
10 | 2760 |ALL
10 | 3618 |ALL
10 | 3981 |ALL
11 | 3981 |ALL
12 | 3981 |ALL
20 | 2894 |10,12,14,16,18,20,22,26,32,085,615,34,38,39,46,620,50,60,65,365,70,73,680,375,405,77,80,90,435,705
91 | 501 |510,515,520,521,522,523,525,526,527,530,535,540,542,545,550,553,555,560,563,565,566,567,569,570,571,572,573,574,575,20,22,576,580,581,582,585,587,590,591,593,595,598,600,610,612,614,615,617,618,619,620,621,623,625
I would be searching through something like above, looking for 20 or 900 and this is a very small sample. The ALL is easy, I know what to do there. It's searching through the lists that is the issue when trying to look for more than one city id. I've been able to do it while looking for only one, it's doing more than one that's the killer.
I am doing this in the beginning:
CREATE TABLE #City
( Order_ID INT
, City_ID VARCHAR(100)
, FirstCitySearchText VARCHAR(100) NULL
, LastCitySearchText VARCHAR(100) NULL
, OnlyCitySearchText VARCHAR(100) NULL
PRIMARY KEY (Order_ID, City_ID)
)
INSERT INTO #City
SELECT i.listpos
, i.quotedtext
, NULL
, NULL
, NULL
FROM opiscommon.dbo.SplitCommaSeparatedList(@City_IDs) i
UPDATE c
SET c.FirstCitySearchText = '%' + CAST(c.City_ID AS VARCHAR)+ ',%'
, c.LastCitySearchText = '%,' + CAST(c.City_ID AS VARCHAR) + '%'
, c.OnlyCitySearchText = '%,' + CAST(c.City_ID AS VARCHAR) + ',%'
FROM #City c
Sometimes you are stuck with other people's bad design decisions. You should understand how bad it is to store ids in lists, numeric values as strings, and foreign keys with no declared foreign key relationship.
But, when you are stuck, there are methods. Here is a method using like:
where ',' + list + ',' like '%,22,%' or
',' + list + ',' like '%,900,%'
Here is the logic for 22. Repeat for all numbers
where city = '22' -- exact match
or city like '22,%' -- first item in list
or city like '%,22,%' -- middle of list
or city like '%,22' -- last item in list
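The first form (wrapping the list in commas) is worth spelling out, because it collapses the four cases into one. Below is a minimal Python sketch of that test; the function name `list_contains_any` is hypothetical, and treating 'ALL' as matching every city is an assumption based on the sample data.

```python
def list_contains_any(city_list, wanted_ids):
    """True if any wanted id appears as a whole item in the comma separated list."""
    if city_list == "ALL":  # assumption: 'ALL' subscribes the customer to every city
        return True
    wrapped = "," + city_list + ","  # now every item has a comma on both sides
    return any("," + city_id + "," in wrapped for city_id in wanted_ids)


print(list_contains_any("510,515,20,22,576", ["22", "900"]))  # 22 is an item
print(list_contains_any("122,220,9000", ["22", "900"]))       # only partial matches
print(list_contains_any("ALL", ["900"]))
```

Wrapping both sides in delimiters is what prevents 22 from matching 122 or 220.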
So what I ended up doing instead that solved my issue was joining the table in like this:
JOIN #CityIDs c
ON cpcpp.CUST_PROD_PARM_VAL LIKE '' + c.FirstCitySearchText + ''
OR cpcpp.CUST_PROD_PARM_VAL LIKE '' + c.LastCitySearchText + ''
OR cpcpp.CUST_PROD_PARM_VAL LIKE '' + c.OnlyCitySearchText + ''
OR cpcpp.CUST_PROD_PARM_VAL LIKE '' + c.City_ID + ''
OR cpcpp.CUST_PROD_PARM_VAL LIKE 'ALL'
And this is what those search text fields are:
UPDATE c
SET c.FirstCitySearchText = CAST(c.City_ID AS VARCHAR(100))+ ',%'
, c.LastCitySearchText = '%,' + CAST(c.City_ID AS VARCHAR(100))
, c.OnlyCitySearchText = '%,' + CAST(c.City_ID AS VARCHAR(100)) + ',%'
FROM #CityIDs c
I couldn't use the where clauses that were posted because I wasn't sure of how to join the table in to get the city ids, so instead I joined them using the where clause as the join.
This isn't a good solution but it's the only one that has worked so far. I'm getting back the results that I expect and the stored procedure is only taking about 20 seconds to run, which is considered a win.

Counting occurrences in a table

Let's say I want to count the total number of occurrences of a name contained within a string in a column and display that total next to all occurrences of that name in a new column beside it. For example, if I have:
Name | Home Address | Special ID
==================================
Frank | 152414 | aTRF342
Jane | 4342342 | rRFC432
Mary | 423432 | xTRF353
James | 32111111 | tLZQ399
May | 4302443 | 3TRF322
How would I count the occurrences of special tags like 'TRF', 'RFC', or 'LZQ' so the table looks like this:
Name | Home Address | Special ID | Occurrences
================================================
Frank | 152414 | aTRF342 | 3
Jane | 4342342 | rRFC432 | 1
Mary | 423432 | xTRF353 | 3
James | 32111111 | tLZQ399 | 1
May | 4302443 | 3TRF322 | 3
Currently using Access 2007. Is this even possible using a SQL query?
Using Access 2007, I stored your sample data in a table named tblUser1384831. The query below returns this result set.
Name Home Address Special ID special_tag Occurrences
---- ------------ ---------- ----------- -----------
Frank 152414 aTRF342 TRF 3
Jane 4342342 rRFC432 RFC 1
Mary 423432 xTRF353 TRF 3
James 32111111 tLZQ399 LZQ 1
May 4302443 3TRF322 TRF 3
Although your question has a vba tag, you don't need to use a VBA procedure for this. You can do it with SQL and the Mid() function.
SELECT
base.[Name],
base.[Home Address],
base.[Special ID],
base.special_tag,
tag_count.Occurrences
FROM
(
SELECT
[Name],
[Home Address],
[Special ID],
Mid([Special ID],2,3) AS special_tag
FROM tblUser1384831
) AS base
INNER JOIN
(
SELECT
Mid([Special ID],2,3) AS special_tag,
Count(*) AS Occurrences
FROM tblUser1384831
GROUP BY Mid([Special ID],2,3)
) AS tag_count
ON base.special_tag = tag_count.special_tag;
You would have to GROUP BY the substring of Special ID. In MS Access, you can read about how to compute substrings here.
The problem in your case is that the data in your Special ID column does not follow a standard pattern, one which is easy to extract via the substring function. You might need to use regular expressions to extract such values, and later apply the GROUP BY to them.
With MSSQL, Oracle, PostgreSQL you would be able to declare a stored procedure (example CLR function in MS SQL Server) that would do this for you. Not sure with MS Access.
you can do something like this:
select Name, [Home Address], [Special ID],
(select count(*) from [your table] where [Special ID] = RemoveNonAlphaCharacters([Special ID]) ) as Occurrences
from [your table]
Auxiliary function (taken from this link):
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
While PatIndex('%[^a-z]%', @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex('%[^a-z]%', @Temp), 1, '')
Return @Temp
End
Let's say your first table is called 'table_with_string'.
The following code will show the occurrences based on the first 3 characters of the string in the Special ID column, since it is not clear how exactly you are passing the string to match.
select tws.Name,tws.HomeAddress,tws.SpecialID,str_count.Occurrences from
table_with_string tws
left join
(select SpecialID,count(*) from table_with_string where specialID like(substring
(specialid,0,3))
group by specialId) as str_count(id,Occurrences)
on str_count.id=tws.SpecialID
I would suggest doing this explicitly as a join, so you are clear on how it works:
select tws.Name, tws.HomeAddress, tws.SpecialID, str_count.Occurrences
from table_with_string tws
join
(
select substring(specialid, 2, 3) as code, count(*) as Occurrences
from table_with_string
group by substring(specialid, 2, 3)
) str_count
on str_count.code = substring(tws.specialid, 2, 3)
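The group-then-join-back pattern used in these answers can be sketched in a few lines of Python. This assumes, as the Mid()/substring answers do, that the tag is always characters 2 to 4 of the Special ID; the helper name `tag_of` is hypothetical.

```python
from collections import Counter

rows = [
    ("Frank", "152414", "aTRF342"),
    ("Jane", "4342342", "rRFC432"),
    ("Mary", "423432", "xTRF353"),
    ("James", "32111111", "tLZQ399"),
    ("May", "4302443", "3TRF322"),
]


def tag_of(special_id):
    # Characters 2-4, the equivalent of Mid([Special ID], 2, 3).
    return special_id[1:4]


# Step 1: count each tag (the GROUP BY subquery).
counts = Counter(tag_of(special_id) for _, _, special_id in rows)

# Step 2: attach the count to every original row (the join back to the base table).
result = [(name, address, sid, counts[tag_of(sid)]) for name, address, sid in rows]
for row in result:
    print(row)
```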

How do you concat multiple rows into one column in SQL Server?

I've searched high and low for the answer to this, but I can't figure it out. I'm relatively new to SQL Server and don't quite have the syntax down yet. I have this datastructure (simplified):
Table "Users" | Table "Tags":
UserID UserName | TagID UserID PhotoID
1 Bob | 1 1 1
2 Bill | 2 2 1
3 Jane | 3 3 1
4 Sam | 4 2 2
-----------------------------------------------------
Table "Photos": | Table "Albums":
PhotoID UserID AlbumID | AlbumID UserID
1 1 1 | 1 1
2 1 1 | 2 3
3 1 1 | 3 2
4 3 2 |
5 3 2 |
I'm looking for a way to get the all the photo info (easy) plus all the tags for that photo concatenated like CONCAT(username, ', ') AS Tags of course with the last comma removed. I'm having a bear of a time trying to do this. I've tried the method in this article but I get an error when I try to run the query saying that I can't use DECLARE statements... do you guys have any idea how this can be done? I'm using VS08 and whatever DB is installed in it (I normally use MySQL so I don't know what flavor of DB this really is... it's an .mdf file?)
Ok, I feel like I need to jump in to comment about How do you concat multiple rows into one column in SQL Server? and provide a more preferred answer.
I'm really sorry, but using scalar-valued functions like this will kill performance. Just open SQL Profiler and have a look at what's going on when you use a scalar-function that calls a table.
Also, the "update a variable" technique for concatenation is not encouraged, as that functionality might not continue in future versions.
The preferred way of doing string concatenation is to use FOR XML PATH instead.
select
stuff((select ', ' + t.tag from tags t where t.photoid = p.photoid order by tag for xml path('')),1,2,'') as taglist
,*
from photos p
order by photoid;
For examples of how FOR XML PATH works, consider the following, imagining that you have a table with two fields called 'id' and 'name'
SELECT id, name
FROM table
order by name
FOR XML PATH('item'),root('itemlist')
;
Gives:
<itemlist><item><id>2</id><name>Aardvark</name></item><item><id>1</id><name>Zebra</name></item></itemlist>
But if you leave out the ROOT, you get something slightly different:
SELECT id, name
FROM table
order by name
FOR XML PATH('item')
;
<item><id>2</id><name>Aardvark</name></item><item><id>1</id><name>Zebra</name></item>
And if you put an empty PATH string, you get even closer to ordinary string concatenation:
SELECT id, name
FROM table
order by name
FOR XML PATH('')
;
<id>2</id><name>Aardvark</name><id>1</id><name>Zebra</name>
Now comes the really tricky bit... If you name a column starting with an @ sign, it becomes an attribute, and if a column doesn't have a name (or you call it [*]), then it leaves out that tag too:
SELECT ',' + name
FROM table
order by name
FOR XML PATH('')
;
,Aardvark,Zebra
Now finally, to strip the leading comma, the STUFF command comes in. STUFF(s,x,n,s2) pulls out n characters of s, starting at position x. In their place, it puts s2. So:
SELECT STUFF('abcde',2,3,'123456');
gives:
a123456e
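If the STUFF arguments are hard to keep straight, the following Python sketch mimics the behaviour described above for in-range arguments (the real T-SQL function also returns NULL for out-of-range positions, which this sketch does not model):

```python
def stuff(s, start, length, replacement):
    """Delete `length` characters from `s` at 1-based `start`, then insert `replacement`."""
    return s[:start - 1] + replacement + s[start - 1 + length:]


print(stuff("abcde", 2, 3, "123456"))        # a123456e
print(stuff(", Bill, Bob, Jane", 1, 2, ""))  # Bill, Bob, Jane
```

The second call is the case that matters here: deleting 2 characters at position 1 and inserting nothing strips the leading comma and space from a concatenated list.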
So now have a look at my query above for your taglist.
select
stuff((select ', ' + t.tag from tags t where t.photoid = p.photoid order by tag for xml path('')),1,2,'') as taglist
,*
from photos p
order by photoid;
For each photo, I have a subquery which grabs the tags and concatenates them (in order) with a comma and a space. Then I wrap that subquery in a STUFF command to strip the leading comma and space.
I apologise for any typos - I haven't actually created the tables on my own machine to test this.
Rob
I'd create a UDF:
create function dbo.GetTags(@PhotoID int) returns varchar(max)
as
begin
declare @mytags varchar(max)
set @mytags = ''
select @mytags = @mytags + ', ' + tag from tags where photoid = @PhotoID
return substring(@mytags, 3, 8000)
end
Then, all you have to do is:
select dbo.GetTags(photoID) as tagList from photos
Street_Name | Street_Code
west | 14
east | 7
west+east | 714
If I want to show two different rows concatenated into one, how can I do it?
(I mean the last row is what I want the select to return. My table only has the first and second records.)