I'm looking for answers for the following 2 questions (SQL Server 2000). I have an order info table that is indexed so that I may search data on particular columns. So, an example query I might run is:
SELECT top 50 ft_tbl.*, key_tbl.Rank
from OrderInfo as ft_tbl
INNER JOIN FREETEXTTABLE(OrderInfo, Address1, 'W Main St') as key_tbl
ON ft_tbl.OrderInfoID = key_tbl.[KEY]
order by key_tbl.Rank desc
What I'd expect is that SQL would pull everything that matches "W Main St" first since that would have the highest rank, then variations following. However, my results aren't exactly what I'm expecting. Here are the Top 8 results ordered by Rank:
258 W Main St
4322 N Marshall St
221 Main St
320 Broad St
7 S 3rd St
510 Bauerlein St
175 Main Street
108 Maywood St
(I know why this happens now, and am assuming I can fix it with the answer below)
Question: Is there any way to pass in variations where St could be:
St
St.
Street
And W could be
W
W.
West
Thanks in advance!
Not sure if you have come across an answer to this question yet, but I think it would be possible to use the CONTAINS predicate to account for these variations. Both of these are pretty good resources:
http://www.eggheadcafe.com/articles/20010422.asp
http://en.wikipedia.org/wiki/SQL_Server_Full_Text_Search#Inflectional_Searches
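As a rough illustration of the CONTAINS idea: one option is to OR the spellings of each word together and AND the groups. The helper name `contains_condition` and the variant lists are assumptions made for this sketch, and Python is used only to show how the search string could be assembled. Note that this matches the words anywhere in the column rather than as an ordered phrase, and that very short terms such as "W" may be treated as noise words by the full-text engine, so test it against your own catalog.

```python
def contains_condition(groups):
    """OR the spellings inside each group, then AND the groups together."""
    parts = []
    for variants in groups:
        # Wrap each term in double quotes; embedded double quotes are dropped.
        quoted = ['"' + v.replace('"', '') + '"' for v in variants]
        if len(quoted) == 1:
            parts.append(quoted[0])
        else:
            parts.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(parts)


condition = contains_condition([
    ["W", "W.", "West"],
    ["Main"],
    ["St", "St.", "Street"],
])
print(condition)

# The resulting string would be passed as the search condition, for example:
#   SELECT * FROM OrderInfo WHERE CONTAINS(Address1, @condition)
```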
I'm trying to find the match no in which Germany played against Poland. This is from https://www.w3resource.com/sql-exercises/soccer-database-exercise/sql-subqueries-exercise-soccer-database-4.php. There are two tables : match_details and soccer_country. I don't understand how the count(distinct) works in this case. Can someone please clarify? Thanks!
SELECT match_no
FROM match_details
WHERE team_id = (
SELECT country_id
FROM soccer_country
WHERE country_name = 'Germany')
OR team_id = (
SELECT country_id
FROM soccer_country
WHERE country_name = 'Poland')
GROUP BY match_no
HAVING COUNT(DISTINCT team_id) = 2;
As Lamak mentioned, this is an ugly way to write the query, but there are many ways to approach it.
As mentioned, the WHERE clause keeps only the Germany and Poland rows, and COUNT(DISTINCT team_id) = 2 makes sure there are exactly 2 unique teams among them. If there is ever a Cartesian result, you could get repeated rows showing more than one instance of the same team, and a plain COUNT would be fooled by the repetition. Counting DISTINCT team_id eliminates that.
Now, that said, other "team" data structures I have seen have a single record for the match and a column for EACH TEAM playing the match. That is easier by a long shot, but this is still a relatively easy query.
Break the query down a little, and consider a large-scale set of data (not that this, or even any professional league, would have record counts large enough to slow down a SQL engine).
Your first criterion is games with Germany. So let's start with that.
SELECT
md1.match_no
FROM
match_details md1
JOIN soccer_country sc1
on md1.team_id = sc1.country_id
AND sc1.country_name = 'Germany'
So, why even look at any other record/match if Germany is not part of the match on either side? This in itself would return 6 matches from the sample data of 51 matches. So now, all you need to do is join to the match details table a second time for only those matches, where the second team is Poland:
SELECT
md1.match_no
FROM
match_details md1
JOIN soccer_country sc1
on md1.team_id = sc1.country_id
AND sc1.country_name = 'Germany'
-- joining again for the same match Germany was already qualified
JOIN match_details md2
on md1.match_no = md2.match_no
-- but we want the OTHER team record since Germany was first team
and md1.team_id != md2.team_id
-- and on to the second country table based on the SECOND team ID
JOIN soccer_country sc2
on md2.team_id = sc2.country_id
-- and the second team was Poland
AND sc2.country_name = 'Poland'
Yes, it may be a longer query, but by eliminating 45 other matches up front (again, thinking of a LARGE database), you avoid blowing through tons of data and get down to a very finite set, which is then narrowed to only the Germany / Poland matches. No aggregates, counts, distincts, just direct joins.
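If you want to try the join version without the full exercise database, here is a minimal sketch using Python's built-in sqlite3. The three countries, their ids and the three matches are made-up sample rows (an assumption, not the w3resource data), and only the two columns the query needs are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE soccer_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE match_details (match_no INTEGER, team_id INTEGER);
INSERT INTO soccer_country VALUES (1, 'Germany'), (2, 'Poland'), (3, 'France');
-- match 1: Germany v France, match 2: Germany v Poland, match 3: Poland v France
INSERT INTO match_details VALUES (1, 1), (1, 3), (2, 1), (2, 2), (3, 2), (3, 3);
""")

self_join = """
SELECT md1.match_no
FROM match_details md1
JOIN soccer_country sc1
  ON md1.team_id = sc1.country_id AND sc1.country_name = 'Germany'
JOIN match_details md2
  ON md1.match_no = md2.match_no AND md1.team_id != md2.team_id
JOIN soccer_country sc2
  ON md2.team_id = sc2.country_id AND sc2.country_name = 'Poland'
"""
matches = [row[0] for row in conn.execute(self_join)]
print(matches)
```

Only match 2 has Germany on one side and Poland on the other, so that is the only match_no that comes back.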
FEEDBACK
Let's take a look at some BAD sample data... which, as all programmers know, does not exist (NOT). Anyhow, let's take a look at these few matches.
Match  Team ID  blah
52     Poland   (just put the names here for simplicity)
52     Poland
53     Germany
53     Germany
If you were to run the query without DISTINCT, both match 52 and match 53 would show up, as Poland appears 2 times for match 52 and, similarly, Germany 2 times for match 53. By counting DISTINCT teams, you can see that each of those matches has only 1 team, and is thus excluded. Does that help? Again, no such thing as bad data :)
And yet another set of sample matches, including one where more than 2 teams were entered:
Match  Team ID
54     France
54     Poland
54     England
55     Hungary
56     Austria
NONE of these matches would be returned. Match 54 has 3 distinct teams, only one of which (Poland) survives the WHERE filter, and matches 55 and 56 have only a single entry each, thus no opponent to compete against.
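Here is a small sketch of that bad-data scenario, again in Python with sqlite3, so you can see the difference between COUNT and COUNT(DISTINCT). The country ids are invented, and match 57 (a genuine Germany v Poland game) is an extra row I added as an assumption so that something legitimately qualifies. The two subqueries from the original are folded into one IN list, which should behave the same here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE soccer_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE match_details (match_no INTEGER, team_id INTEGER);
INSERT INTO soccer_country VALUES
    (1, 'Germany'), (2, 'Poland'), (3, 'France'),
    (4, 'England'), (5, 'Hungary'), (6, 'Austria');
INSERT INTO match_details VALUES
    (52, 2), (52, 2),           -- Poland entered twice
    (53, 1), (53, 1),           -- Germany entered twice
    (54, 3), (54, 2), (54, 4),  -- three teams in one match
    (55, 5), (56, 6),           -- single entries
    (57, 1), (57, 2);           -- assumed valid Germany v Poland match
""")

query = """
SELECT match_no
FROM match_details
WHERE team_id IN (SELECT country_id FROM soccer_country
                  WHERE country_name IN ('Germany', 'Poland'))
GROUP BY match_no
HAVING COUNT({counted}) = 2
ORDER BY match_no
"""
plain = [r[0] for r in conn.execute(query.format(counted="team_id"))]
distinct = [r[0] for r in conn.execute(query.format(counted="DISTINCT team_id"))]
print(plain)     # matches that pass with a plain COUNT
print(distinct)  # matches that pass with COUNT(DISTINCT)
```

With the plain COUNT, matches 52 and 53 sneak in alongside 57; with COUNT(DISTINCT team_id), only 57 remains.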
2nd FEEDBACK
To clarify the query: if you look at the short query for just Germany, the aliased instance "md1" is already sitting on a Germany record for any given match. For the second join, to "md2", I only care about the same match, so I join on the same match_no. In that join, "!=" means NOT EQUAL ("!" is logical NOT). So the join is saying: from md1, join to md2 on the same match_no, but only give me rows where the teams are NOT the same. The first instance holds Germany's team ID (already qualified), so this gives me the other team's ID. Now I can use the md2 team ID to join to the country table and confirm it is Poland.
Does this now clarify things for you?
I have 3 tables in Access: tblUsers, tblAssignment and tblJob. tblJob is where, using a series of calculations in VBA, I arrive at the data dump which is to be worked by my Quality (QC) team. tblUsers has a list of all our staff and where they are located globally. tblAssignment defines which QC team analyst will work on which case processed by our staff globally. For example, QCID 123 needs to work on all Level 3 cases worked by our staff in China. Accordingly, VBA must allocate QCID 123 to all those rows where individuals in China have worked at Level 3. We have about 20 such QC IDs and an average of 1000 cases worked daily.
Again, the catch here is that tblUsers defines the name and location of each staff member, tblAssignment defines the location and level along with the QC ID expected to work those cases, and tblJob has the name of the staff member and the level. Check out the snapshot below:
tblUsers
NAME    LOCATION
Mathew  Shanghai
John    New York
Peter   Dubai
tblAssignment
QCID  LEVEL  LOCATION
123   L3     Shanghai
135   L1     New York
tblJob
QCID  LEVEL  NAME    CASEID
      L3     Mathew  001283526
      L1     John    827271729
So basically I need QCID in tblJob to be updated with 123 and 135 using VBA. I attempted an INNER JOIN within a recordset but I kept getting errors. From searching, it appears a VBA recordset may not be able to hold complex statements. Forgive my poor formatting, as I only have access to my phone right now. All my attempts at this code have failed, and I would be much obliged for any help.
I remain at your disposal for further clarification.
I think I may have misread this earlier. If you only need the QCID added to tblJob, it's really simple:
UPDATE (tblJob
INNER JOIN tblAssignment ON tblJob.LEVEL = tblAssignment.LEVEL)
INNER JOIN tblUsers ON (tblUsers.Location = tblAssignment.Location)
AND (tblJob.[NAME] = tblUsers.[NAME])
SET tblJob.QCID = tblAssignment.QCID
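If you want to sanity-check the logic outside Access, here is a hedged sketch in Python with sqlite3, loaded with the rows from the snapshot in the question. SQLite does not accept the Access-style UPDATE ... INNER JOIN form, so this version uses a correlated subquery instead; that substitution is my assumption about an equivalent, not the Access statement itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblUsers (NAME TEXT, LOCATION TEXT);
CREATE TABLE tblAssignment (QCID INTEGER, LEVEL TEXT, LOCATION TEXT);
CREATE TABLE tblJob (QCID INTEGER, LEVEL TEXT, NAME TEXT, CASEID TEXT);
INSERT INTO tblUsers VALUES
    ('Mathew', 'Shanghai'), ('John', 'New York'), ('Peter', 'Dubai');
INSERT INTO tblAssignment VALUES (123, 'L3', 'Shanghai'), (135, 'L1', 'New York');
INSERT INTO tblJob VALUES
    (NULL, 'L3', 'Mathew', '001283526'), (NULL, 'L1', 'John', '827271729');
""")

# For each job row: find the staff member's location, then the QCID assigned
# to that location and level.
conn.execute("""
UPDATE tblJob
SET QCID = (
    SELECT a.QCID
    FROM tblAssignment a
    JOIN tblUsers u ON u.LOCATION = a.LOCATION
    WHERE u.NAME = tblJob.NAME AND a.LEVEL = tblJob.LEVEL
)
""")
jobs = conn.execute(
    "SELECT NAME, LEVEL, QCID FROM tblJob ORDER BY CASEID").fetchall()
print(jobs)
```

Rows with no matching assignment would end up with a NULL QCID in this form, so add a WHERE clause if existing values must be preserved.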
I'm trying to format a select statement. The assignment specifies that it has to be formatted this way.
I have a database regarding a taxi service. I have to put together a view with the company name, passenger name, and taxi number. Easy. However, the output specifies that the company name should only appear once in the output, at the top of its own group. So I have:
CREATE VIEW TAXITRIPS(COMPANYNAME, PASSENGERNAME, TAXI#) AS
(SELECT COMPANY.NAME, BOOKING.NAME, VEHICLES.TAXI#
FROM BOOKING JOIN VEHICLES ON BOOKING.TAXI# = VEHICLES.TAXI#
RIGHT OUTER JOIN COMPANY ON VEHICLES.NAME = COMPANY.NAME);
The right outer join is so that companies with no booking recorded are still displayed. If I now run:
SELECT * FROM TAXITRIPS ORDER BY COMPANYNAME ASC;
It will give me something like
COMPANYNAME     PASSENGERNAME  TAXI#
---------------------------------------------
ABC TAXIS       DAVE           192
LEGION CABS
PREMIER CABS    SHANE          2154
PREMIER CABS    TIM            2169
SILVER SERVICE  DAVE           18579
SILVER SERVICE  TIM            18124
SILVER SERVICE  AARON          18917
No result for Legion Cabs, every field displayed on every row, et cetera. The assignment specification says it has to look like this:
COMPANYNAME     PASSENGERNAME  TAXI#
---------------------------------------------
ABC TAXIS       DAVE           192
LEGION CABS
PREMIER CABS    SHANE          2154
                TIM            2169
SILVER SERVICE  DAVE           18579
                TIM            18124
                AARON          18917
The company name should only be displayed on its first row. DISTINCT is not helping. Any advice?
Normally, you would do this at the application layer, because the result set relies on the ordering of the rows -- a bad thing in SQL.
But you can do it as:
SELECT (CASE WHEN ROW_NUMBER() OVER (PARTITION BY c.NAME ORDER BY v.TAXI#) = 1
THEN c.NAME
END) as CompanyName, b.NAME, v.TAXI#
FROM COMPANY c LEFT JOIN
VEHICLES v
ON v.NAME = c.NAME LEFT JOIN
BOOKING b
ON b.TAXI# = v.TAXI#
ORDER BY c.name, v.taxi#;
Note: I rearranged the joins to be LEFT JOINs. Most people find that easier to follow than RIGHT JOINs.
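For the application-layer route mentioned above, a minimal sketch might look like the following. It assumes the rows have already been fetched from the view ordered by company name; the hard-coded list stands in for that result set, and `blank_repeats` is a hypothetical helper name.

```python
# Stand-in for: SELECT * FROM TAXITRIPS ORDER BY COMPANYNAME ASC
rows = [
    ("ABC TAXIS", "DAVE", 192),
    ("LEGION CABS", None, None),
    ("PREMIER CABS", "SHANE", 2154),
    ("PREMIER CABS", "TIM", 2169),
    ("SILVER SERVICE", "DAVE", 18579),
    ("SILVER SERVICE", "TIM", 18124),
    ("SILVER SERVICE", "AARON", 18917),
]


def blank_repeats(ordered_rows):
    """Show each company name only on the first row of its group."""
    previous = object()  # sentinel that never equals a real company name
    result = []
    for company, passenger, taxi in ordered_rows:
        shown = company if company != previous else ""
        result.append((shown, passenger or "", "" if taxi is None else str(taxi)))
        previous = company
    return result


report = blank_repeats(rows)
for company, passenger, taxi in report:
    print(f"{company:<16}{passenger:<15}{taxi}")
```

This keeps the view itself free of presentation logic, and the blanking only depends on the order the rows arrive in.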
I have 2 tables joined with political results and I need to have the votes SUM per county, and then the MAX of the vote counts per county, with the Party that relates to the MAX in another column. I'm having trouble getting the Party into the Query results without messing up the SUM and MAX columns.
This table I can get with the following SQL:
County Name  SumOfVoteCount  MaxOfVoteCount  OfficeID
Baker        7253            4008            S
SELECT NY_Race.[County Name], Sum(NY_Results.VoteCount) AS SumOfVoteCount, Max(NY_Results.VoteCount) AS MaxOfVoteCount
FROM NY_Race INNER JOIN NY_Results ON NY_Race.RaceCountyID = NY_Results.RaceCountyID
GROUP BY NY_Race.[County Name], NY_Race.OfficeID
HAVING (((NY_Race.OfficeID)="S"));
What I need is for the Party that has that 4008 vote total to be included in the query results, but when I try to select Party to be added, it shows all of them and messes up the SUM of the vote count, and I end up with this:
County Name  SumOfVoteCount  MaxOfVoteCount1  Party  OfficeID
Baker        2927            2927             Dem    S
Baker        4008            4008             GOP    S
Baker        101             101              Lib    S
Baker        53              53               Prg    S
Baker        164             164              WF     S
This is the SQL code I am using that gets the above Table:
SELECT NY_Race.[County Name], Sum(NY_Results.VoteCount) AS SumOfVoteCount, Max(NY_Results.VoteCount) AS MaxOfVoteCount, NY_Results.Party
FROM NY_Race INNER JOIN NY_Results ON NY_Race.RaceCountyID = NY_Results.RaceCountyID
GROUP BY NY_Race.[County Name], NY_Race.OfficeID, NY_Results.Party
HAVING (((NY_Race.OfficeID)="S"));
How can I get this table in the query results?
County Name SumOfVoteCount MaxOfVoteCount Party OfficeID
Baker 7253 4008 GOP S
I can't help but think I'm missing a WHERE clause somewhere that compares Party to MaxOfVoteCount.
One way to approach these is to have a nested subquery that gets the MAX() for the field of interest. Then, only select the record with that MAX(). Here's the structure:
select COUNTY_NAME, R1.*
, (select sum(votecount) from results R2 where R1.COUNTY_ID=R2.COUNTY_ID and R1.OFFICE_ID=R2.OFFICE_ID)
from RESULTS R1
join RACE on R1.COUNTY_ID=RACE.COUNTY_ID and R1.OFFICE_ID=RACE.OFFICE_ID
where R1.office_id = 'S'
and voteCount =
(select max(votecount) from results R3 where R1.COUNTY_ID=R3.COUNTY_ID and R1.OFFICE_ID=R3.OFFICE_ID)
I created a demo on SQLFiddle.
One issue: what if two parties get exactly the same number of votes? That's a functional issue you will have to resolve.
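Here is a hedged sketch of the same structure against the table and column names from the question, run through Python's sqlite3 with the Baker figures posted above. The RaceCountyID value of 1 is invented for the sample, and I am assuming each RaceCountyID identifies one county/office race.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE NY_Race (RaceCountyID INTEGER, [County Name] TEXT, OfficeID TEXT);
CREATE TABLE NY_Results (RaceCountyID INTEGER, Party TEXT, VoteCount INTEGER);
INSERT INTO NY_Race VALUES (1, 'Baker', 'S');
INSERT INTO NY_Results VALUES
    (1, 'Dem', 2927), (1, 'GOP', 4008), (1, 'Lib', 101), (1, 'Prg', 53), (1, 'WF', 164);
""")

winner_query = """
SELECT race.[County Name],
       (SELECT SUM(t.VoteCount) FROM NY_Results t
         WHERE t.RaceCountyID = race.RaceCountyID) AS SumOfVoteCount,
       res.VoteCount AS MaxOfVoteCount,
       res.Party,
       race.OfficeID
FROM NY_Race race
JOIN NY_Results res ON res.RaceCountyID = race.RaceCountyID
WHERE race.OfficeID = 'S'
  -- keep only the result row that holds the maximum for its race
  AND res.VoteCount = (SELECT MAX(m.VoteCount) FROM NY_Results m
                        WHERE m.RaceCountyID = race.RaceCountyID)
"""
winners = conn.execute(winner_query).fetchall()
print(winners)
```

With a tie for first place, this form returns one row per tied party, which is the functional issue noted above.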
I have a tricky problem that I wouldn't mind a bit of help on. I've made some progress using queries that I've found here and elsewhere, but am getting seriously stumped now.
I have a mailing list that has numerous near duplications that I'm trying to combine into one meaningful row, taking data such as this.
Title  Forename  Surname  Address1       Postcode  Phone   Age    Income  Ownership  Gas
Mrs    D         Andrews  122 Somewhere  BH10      123456  66-70          Homeowner
Ms     Diane     Andrews  122 Somewhere  BH10      123456         £25-40             EDF
and making one row along the lines of
Title  Forename  Surname  Address1       Postcode  Phone   Age    Income  Ownership  Gas
Mrs    Diane     Andrews  122 Somewhere  BH10      123456  66-70  £25-40  Homeowner  EDF
I have over 127 million records, most duplicated with a similar pattern, but no clear logic as was proven when I added an identity field. I also have over 90 columns to consider, so it's a bit of work!
There isn't a clear pattern to the data, so I'm thinking I may have a huge case statement to try to climb over.
Using the following code I can get a decent start on returning only the fullest name, but given the pattern of the data, I'm stuck on comparing the other fields across rows. The code is as follows:
SELECT c1.*
FROM
Mailing c1
JOIN
Mailing c2 ON c1.Telephone1 = c2.Telephone1 AND c1.surname = c2.surname
WHERE
len(c1.Forename) > len(c2.Forename)
AND c2.over_18 <> ''
AND c1.Telephone1 = '123456'
Has anyone got any pointers as to how I should progress please? I'm open to discussion and ideas...
I'm using SQL 2005 and apologies in advance if the tagging is all over the place!
Cheers,
Jon
Would it work by assuming that all persons with the same surname and phone number (Do all persons have a phone?) were the same person?
INSERT INTO newtable <fieldnames>
SELECT lastname,phone,max(field3),max(field4)....
FROM oldtable
GROUP BY lastname,phone
But that would collapse John Smith and Jack Smith living together into one person.
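To make the GROUP BY / MAX idea concrete, here is a small sketch using Python's sqlite3 and the two Andrews rows from the question. It assumes the empty cells are stored as NULL (aggregates skip NULLs); if they are empty strings, MAX still prefers the non-empty value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Mailing (Title, Forename, Surname, Address1, Postcode,
                 Phone, Age, Income, Ownership, Gas)""")
conn.executemany("INSERT INTO Mailing VALUES (?,?,?,?,?,?,?,?,?,?)", [
    ("Mrs", "D", "Andrews", "122 Somewhere", "BH10", "123456",
     "66-70", None, "Homeowner", None),
    ("Ms", "Diane", "Andrews", "122 Somewhere", "BH10", "123456",
     None, "£25-40", None, "EDF"),
])

# One output row per surname + phone; every other column takes its MAX.
merged = conn.execute("""
SELECT MAX(Title), MAX(Forename), Surname, MAX(Address1), MAX(Postcode), Phone,
       MAX(Age), MAX(Income), MAX(Ownership), MAX(Gas)
FROM Mailing
GROUP BY Surname, Phone
""").fetchall()
print(merged)
```

Note that MAX compares strings, so the merged Title comes out as "Ms" rather than "Mrs", and each column is picked independently of the others. Columns where the "best" value is not simply the largest would need their own rule.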
Perhaps you should consider outsourcing it to a data-entry sweatshop somewhere, after you have preprocessed the data. :-)
And/or be prepared to take the flak for mistaken bundling.
Perhaps add something like "To improve our green footprint, we have merged x listings at your address together. If you would like separate mailings, please contact us."