I'm having an issue in selecting specific post codes.
These are all valid UK post code formats:
WV11JX
WV1 1JX
WV102QK
WV10 2QK
WV113KQ
WV11 3KQ
Now, say I had a mix of the above formats in the data table; I'm trying to select only post codes that conform to the WV1 prefix (in this example).
In the above 6 item list, I'd want to return:
WV11JX
WV1 1JX
I would want to exclude:
WV102QK
WV10 2QK
WV113KQ
WV11 3KQ
If I execute the following query this will bring back both the WV11's and the WV1's:
SELECT ad.PostCode,*
FROM Staff st
INNER JOIN Address ad on ad.AddressID = st.Address
WHERE
ad.PostCode like 'WV1%'
Changing the condition in the WHERE to cater for length like this doesn't really work either:
SELECT ad.PostCode,*
FROM Staff st
INNER JOIN Address ad on ad.AddressID = st.Address
WHERE
(
ad.PostCode like 'WV1%'
OR
(ad.PostCode like 'WV1%' and LEN(ad.PostCode) = 6)
The above will just filter out any of the formats with a space so if we cater for those by doing the below:
SELECT ad.PostCode,*
FROM Staff st
INNER JOIN Address ad on ad.AddressID = st.Address
WHERE
(ad.PostCode like 'WV1%' and LEN(ad.PostCode) = 6)
or
(ad.PostCode like 'WV1 %' and LEN(ad.PostCode) = 7)
That fixes the issue but the problem is that we want to check more than just the 'WV1' prefix in this manner so having a growing list of 'OR' comparisons isn't viable.
How do we isolate the above post codes in a scalable way?
I'm using Microsoft SQL Server 2016.
I think the logic you want is:
WHERE ad.PostCode like 'WV1 [0-9][A-Z][A-Z]' OR
ad.PostCode like 'WV1[0-9][A-Z][A-Z]'
I'm not sure if numbers are allowed for the last three characters.
Related
I am trying to make a filter to find all the stuffs made of various substances.
In the database, there is:
a stuffs table
a substances table
a stuffs_substances join table.
Now, I want to find only all the stuffs that are made of gold AND silver (not all the stuffs that contain gold and all stuffs that contain silver).
One last thing: the end user can type only a part of the substance name in the filter form field. For example he will type silv and it will show up all the stuffs made of silver.
So I made this query (not working):
select "stuffs".*
from "stuffs"
inner join "stuffs_substances" as "substances_join"
on "substances_join"."stuff_id" = "stuffs"."id"
inner join "substances"
on "substances_join"."substance_id" = "substances"."id"
where ("substances"."name" like '%silv%')
and ("substances"."name" like '%gold%')
It returns an empty array. What am I doing wrong here?
Basically, you just want aggregation:
select st.*
from "stuffs" st join
"stuffs_substances" ss join
on ss."stuff_id" = st."id" join
"substances" s
on ss."substance_id" = s."id"
where s."name" like '%silv%' or
s."name" like '%gold%'
group by st.id
having count(*) filter (where s."name" like '%silv%') > 0 and
count(*) filter (where s."name" like '%gold%') > 0;
Note that this works, assuming that stuff.id is the primary key in stuffs.
I don't understand your fascination with double quotes and long table aliases. To me, those things just make the query harder to write and to read.
if you want to do search by part of word then do action to re-run query each time user write a letter of word , and the filter part in query in case of oracle sql way
in case to search with start part only
where name like :what_user_write || '%'
or in case any part of word
where name like '%' || :what_user_write || '%'
you can also use CAB, to be sure user can search by capital or small whatever
ok, you ask about join, I test this in mysql , it work find to get stuff made from gold and silver or silver and so on, hope this help
select sf.id, ss.code, sf.stuff_name from stuffs sf, stuffs_substances ss , substances s
where sf.id = ss.id
and s.code = ss.code
and s.sub_name like 'gol%'
I have the following sql data:
ID Company Name Customer Address 1 City State Zip Date
0108500 AAA Test Mish~Sara Newa Claims Chtiana CO 123 06FE0046
0108500 AAA.Test Mish~Sara Newa Claims Chtiana CO 123 06FE0046
1802600 AAA Test Company Ban, Adj.~Gorge PO Box 83 MouLaurel CA 153 09JS0025
1210600 AAA Test Company Biwel~Brce 97kehst ve Jacn CA 153 04JS0190
AAA Test, AAA.Test and AAA Test Company are considered as one company.
Since their data is messy I'm thinking either to do this:
Is there a way to search all the records in the DB wherein it will search the company name with almost the same name then re-name it to the longest name?
In this case, the AAA Test and AAA.Test will be AAA Test Company.
OR Is there a way to filter only record with company name that are almost the same then they can have option to change it?
If there's no way to do it via sql query, what are your suggestions so that we can clean-up the records? There are almost 1 million records in the database and it's hard to clean it up manually.
Thank you in advance.
You could use String matching algorithm like Jaro-Winkler. I've written an SQL version that is used daily to deduplicate People's names that have been typed in differently. It can take awhile but it does work well for the fuzzy match you're looking for.
Something like a self join? || is ANSI SQL concat, some products have a concat function instead.
select *
from tablename t1
join tablename t2 on t1.companyname like '%' || t2.companyname || '%'
Depending on datatype you may have to remove blanks from the t2.companyname, use TRIM(t2.companyname) in that case.
And, as Miguel suggests, use REPLACE to remove commas and dots etc.
Use case-insensitive collation. SOUNDEX can be used etc etc.
I think most Database Servers support Full-Text search ability, and if so there are some functions related to Full-Text search that support Proximity.
for example there is a Near function in SqlServer and here is its documentation https://msdn.microsoft.com/en-us/library/ms142568.aspx
You can do the clean-up in several stages.
Create new columns
Convert everything to upper case, remove punctuation & whitespace, then match on the first 6 to 10 characters (using self join). Assuming your table is called "vendor": add two columns, "status", "dupstr", then update as follows
/** Populate dupstr column for fuzzy match **/
update vendor v
set v.dupstr = left(upper(regex_replace(regex_replace(v.companyname,'.',''),' ','')),6)
;
Identify duplicate records
Add an index on the dupstr column, then do an update like this to identify "good" records:
/** Mark the good duplicates **/
update vendor v
set v.status = 'keep' --indicate keeper record
where
--dupes to clean up
exists ( select 1 from vendor v1 where v.dupstr = v1.dupstr
and v.id != v1.id )
and
( --keeper has longest name
length(v.companyname) =
( select max(length(v2.companyname)) from vendor v2
where v.dupstr = v2.dupstr
)
or
--keeper has latest record (assuming ID is sequential)
v.id =
( select max(v3.id) from vendor v3
where v.dupstr = v3.dupstr
)
)
group by v.dupstr
;
The above SQL can be refined to add "dupe" status to other records , or you can do a separate update.
Clean Up Stragglers
Report any remaining partial matches to be reviewed by a human (i.e. dupe records without a keeper record)
You can use SQL query with SOUDEX of DIFFRENCE
For example:
SELECT DIFFERENCE ('AAA Test','AAA Test Company')
DIFFERENCE returns 0 - 4 ( 4 = almost the same, 0 - totally diffrent)
See also: https://learn.microsoft.com/en-us/sql/t-sql/functions/difference-transact-sql?view=sql-server-2017
I have a situation where I need to pull existing products out of a MSSQL database to CSV for ingestion into another database, each row will need to contain as much information as possible from the existing database. I think i've gotten a majority of what I need with my query so far, but I am stuck on figuring out how to merge the multiple categories per item down into one row.
Whats happening is i'll have a duplicate row for each category listed. So if its assigned to the category Glass and Glass and Glass Connectors, i'll have a row for each.
I'd like there to be a single field named Categories that's just comma separated like this: "Glass,Glass and Glass Connectors"
I read that STUFF() can do this, but I can't seem to get the syntax right. Other examples on Stack didn't seem to work for my situation or I just don't know exactly how to apply it to my query, the mass amount of JOINs needed hasn't helped either.
Here's my query:
SELECT
tblCatalog_SKUs.InternalSKU,
tblCatalog_Products.Name AS ParentProd,
tblCatalog_Categories.Name AS Category,
tblCatalog_SKUs_Images.Image1,
tblCatalog_SKUs_Images.Image2,
tblCatalog_SKUs_Images.Image3,
tblCatalog_Products.Summary,
tblCatalog_SKUs.Name AS optName,
tblCatalog_SKUs.Description AS optDesc,
tblCatalog_SKUs.Price,
tblCatalog_SKUs.Inventory,
tblCatalog_SKUs.Sale
FROM tblCatalog_Products_Categories
INNER JOIN tblCatalog_Categories
ON tblCatalog_Products_Categories.CategoryID = tblCatalog_Categories.CategoryID
INNER JOIN tblCatalog_SKUs
ON tblCatalog_Products_Categories.ProductID = tblCatalog_SKUs.ProductID
INNER JOIN tblCatalog_SKUs_Images
ON tblCatalog_SKUs.SKUID = tblCatalog_SKUs_Images.SKUID
INNER JOIN tblCatalog_Products
ON tblCatalog_SKUs.ProductID = tblCatalog_Products.ProductID
Sample results:
http://i.stack.imgur.com/Y7AIt.png
I was hoping there might be something like group_concat in MySQL.
Thanks for your help!
You can do this with STUFF and FOR XML PATH like this:
SELECT
tblCatalog_SKUs.InternalSKU,
tblCatalog_Products.Name AS ParentProd,
STUFF((
SELECT ',' + Name
FROM tblCatalog_Categories
INNER JOIN tblCatalog_Products_Categories ON tblCatalog_Products_Categories.CategoryID = tblCatalog_Categories.CategoryID
WHERE tblCatalog_Products_Categories.ProductID = tblCatalog_SKUs.ProductID
FOR XML PATH('')
), 1, 1, '') AS Category,
tblCatalog_SKUs_Images.Image1,
tblCatalog_SKUs_Images.Image2,
tblCatalog_SKUs_Images.Image3,
tblCatalog_Products.Summary,
tblCatalog_SKUs.Name AS optName,
tblCatalog_SKUs.Description AS optDesc,
tblCatalog_SKUs.Price,
tblCatalog_SKUs.Inventory,
tblCatalog_SKUs.Sale
FROM tblCatalog_SKUs
INNER JOIN tblCatalog_SKUs_Images
ON tblCatalog_SKUs.SKUID = tblCatalog_SKUs_Images.SKUID
INNER JOIN tblCatalog_Products
ON tblCatalog_SKUs.ProductID = tblCatalog_Products.ProductID
STUFF is just used to remove the leading ',', key part is FOR XML PATH concatenating the strings.
It's been a while since I've done SQL, but I have a rather pressing issue.
My db-layout is as following:
Now, starting from a Users.ID, I want to get all the Rounds the user has played. The user could be Hosts.HostID, Host.GuestID, or even both. Where he is both, it should not show op in the results.
The results I need from the Query are the Hosts.Name and all the fields of Rounds. In general what I want to do is display a list of all the Hosts (actually these are the Games) in which the user has participated, as a Host or as a Guest, along with perhaps a total score. When clicking on this, some dropdown will appear showing the individual round scores, words, ...
Now I was wondering whether this was possible in a single query. Of course I could do a query getting all the Hosts and then per Host a query for each Round, but that doesn't seem that performant. This is what I've come up with so far:
SELECT Rounds.ID, Rounds.GameID, Rounds.Round, Rounds.Score, Rounds.Word
, Hosts.ID, Hosts.HostID, Hosts.GuestID
FROM Rounds INNER JOIN Hosts
ON Rounds.GameID = Hosts.ID
INNER JOIN Users
ON Hosts.hostID = Users.ID
WHERE Users.ID = 5
The issue is however that it doesn't filter out where the user is both host AND guest, and I can't seem to Group it by Hosts.ID either.
Add Hosts.hostID <> Hosts.guestID to the where clause.
If you are using SQL Server 2005 or later version, you could modify your present query like this:
SELECT Rounds.ID, Rounds.GameID, Rounds.Round, Rounds.Score, Rounds.Word
, Hosts.ID, Hosts.HostID, Hosts.GuestID
FROM Rounds INNER JOIN Hosts
ON Rounds.GameID = Hosts.ID
CROSS APPLY
(
SELECT Hosts.HostID
UNION
SELECT Hosts.GuestID
) AS u (UserID)
WHERE u.UserID = 5
;
The CROSS APPLY clause would produce either one or two rows, depending on whether HostID and GuestID are equal. In either event, the WHERE condition would ultimately leave at most one. Thus, the above query would give you only the games (with all their rounds) where the specified user participated.
And from that query you could easily get to something like this:
SELECT Hosts.ID,
TotalScore = SUM(Rounds.Score)
FROM
...
GROUP BY
Hosts.ID
;
I have a many to many relationship between people and some electronic codes. The table with the codes has the code itself, and a text description of the code. A typical result set from a query might be (there are many codes that contain "broken" so I feel like it's better to search the text description rather than add a bunch of ORs for every code.)
id# text of code
1234 broken laptop
1234 broken mouse
Currently the best way for me to get a result set like this is to use the LIKE%broken% filter. Without changing the text description, is there any way I can return only one instance of a code with broken? So in the example above the query would only return 1234 and broken mouse OR broken laptop. In this scenario it doesn't matter which is returned, all I'm looking for is the presence of "broken" in any of the text descriptions of that person's codes.
My solution at the moment is to create a view that would return
`id# text of code
1234 broken laptop
1234 broken mouse`
and using SELECT DISTINCT ID# while querying the view to get only one instance of each.
EDIT ACTUALLY QUERY
SELECT tblVisits.kha_id, tblICD.descrip, min(tblICD.Descrip) as expr1
FROM tblVisits inner join
icd_jxn on tblVisits.kha_id = icd_jxn.kha)id inner join tblICD.icd_fk=tblICD.ICD_ID
group by tblVisits.kha_id, tblicd.descrip
having (tblICD.descrip like n'%broken%')
You could use the below query to SELECT the MIN code. This will ensure only text per id.
SELECT t.id, MIN(t.textofcode) as textofcode
FROM table t
WHERE t.textofcode LIKE '%broken%'
GROUP BY t.id
Updated Actual Query:
SELECT tblVisits.kha_id,
MIN(tblICD.Descrip)
FROM tblVisits
INNER JOIN icd_jxn ON tblVisits.kha_id = icd_jxn.kha)id
INNER JOIN tblicd ON icd_jxn.icd_fk = tbl.icd_id
WHERE tblICD.descrip like n'%broken%'
GROUP BY tblVisits.kha_id