How to match entries in SQL based on their ending letter? - sql

I'm trying to match entries from two tables so that each row of the result pairs two words that end in the same letter. Each table has a single column named word. Table 1 contains, in order: Dog, High, It, Weeks. Table 2 contains: Bat, Is, Laugh, Sing. I need to select from both tables and match the words so that the rows are: Dog | Sing, High | Laugh, It | Bat, Weeks | Is
The screenshot is what I have so far for my SQL statement. I'm still early on in learning SQL so any info to help on this would be appreciated.

I recommend reading up on SUBSTR() to see why the code below works: https://docs.oracle.com/cd/B28359_01/olap.111/b28126/dml_functions_2101.htm#OLADM679
SELECT
a.word
, b.word
FROM sec1313_words1 a
JOIN sec1313_words2 b
ON SUBSTR(b.word, -1) = SUBSTR(a.word, -1)
ORDER BY a.word
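To try the join without an Oracle instance, here is a minimal sketch that runs the same query in SQLite through Python's sqlite3 module. This is an assumption for illustration only: SQLite's SUBSTR also accepts a negative start position, so SUBSTR(word, -1) returns the last character there too. The table names and sample words are taken from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sec1313_words1 (word TEXT);
    CREATE TABLE sec1313_words2 (word TEXT);
    INSERT INTO sec1313_words1 VALUES ('Dog'), ('High'), ('It'), ('Weeks');
    INSERT INTO sec1313_words2 VALUES ('Bat'), ('Is'), ('Laugh'), ('Sing');
""")

# Join the two tables on the last character of each word.
rows = conn.execute("""
    SELECT a.word, b.word
    FROM sec1313_words1 a
    JOIN sec1313_words2 b
      ON SUBSTR(b.word, -1) = SUBSTR(a.word, -1)
    ORDER BY a.word
""").fetchall()

for left, right in rows:
    print(left, "|", right)
```

The comparison is case-sensitive, so if the data mixed cases in the final letter you would probably want to wrap both sides in LOWER().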

Related

How do you 'join' multiple SQL data sets side by side (that don't link to each other)?

How would I go about joining results from multiple SQL queries so that they are side by side (but unrelated)?
The reason I am thinking of this is so that I can run 1 query in Google BigQuery and it will return 1 single table which I can import into Excel and do some charts.
e.g. Query 1 looks at dataset TableA and returns:
**Metric:** Sales
**Value:** 3,402
And then Query 2 looks at dataset TableB and returns:
**Name:** John
**DOB:** 13 March
They would both use different tables and different filters, etc.
What would I do to make it look like:
---Sales----------John----
---3,402-------13 March----
Or alternatively:
-----Sales--------3,402-----
-----John-------13 March----
Or is there a totally different way to do this?
I can see the use case for the above, I've used something similar to create a single table from multiple tables with different metrics to query in Data Studio so that filters apply to all data in the dataset for example. However in that case, the data did share some dimensions that made it worthwhile doing.
If you are going to put those together with no relationship between the tables, I'd have 4 columns with TYPE describing the data in that row to make for easier filtering.
Type | Sales | Name | DOB
Use UNION ALL to put the rows together so you have something like
"Sales" | 3402 | null | null
"Customer Details" | null | John | 13 March
However, like the others said, make sure you have a good reason to do that otherwise you're just creating a bigger table to query for no reason.
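The Type-column layout described above can be sketched as follows. This is a hedged illustration that uses SQLite through Python's sqlite3 module rather than BigQuery; the table names table_a and table_b and their columns are hypothetical stand-ins for the two unrelated data sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the two unrelated data sets.
    CREATE TABLE table_a (metric TEXT, amount INTEGER);
    CREATE TABLE table_b (name TEXT, dob TEXT);
    INSERT INTO table_a VALUES ('Sales', 3402);
    INSERT INTO table_b VALUES ('John', '13 March');
""")

# Each branch fills only its own columns and pads the rest with NULL,
# so both branches share the same four-column shape.
rows = conn.execute("""
    SELECT metric AS type, amount AS sales, NULL AS name, NULL AS dob
    FROM table_a
    UNION ALL
    SELECT 'Customer Details', NULL, name, dob
    FROM table_b
""").fetchall()

for row in rows:
    print(row)
```

Each branch keeps its own filters in its own WHERE clause; the only requirement is that every branch returns the same number of columns with compatible types.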

Partitioning join to limit records in SQL

I have 2 tables:
- first one containing spatial data - geometry of circles
- second contains geometries of lines.
I want to find all lines which are inside each circle. I have a query which can do that, however there are millions of records so it is unusably slow.
There is a column in both tables which is area_id and essentially all circles are assigned to particular area and all lines as well, so if I can do the intersect of the circles only with the lines in the matching area this will reduce the load a lot. The problem is I can't think of solution e.g. using windowing function. The query I am using is:
Select ct.AREA_ID, ct.Circle_descr, lt.Line_descr from circles_table as ct
JOIN lines_table as lt
ON
ct.Circle_location.STIntersects(lt.Line_location)=1
*using a where clause at the end makes no difference as it is essentially part of the slow join...
+---------------+-----------------------+---------------------------+
| AREA_ID (int) | Circle_descr(varchar) | Circle_location(geometry) |
+---------------+-----------------------+---------------------------+
+---------------+---------------------+-------------------------+
| AREA_ID (int) | Line_descr(varchar) | Line_location(geometry) |
+---------------+---------------------+-------------------------+
Add an additional join criterion to partition the rows by area_id before comparing them. Something like
Select ct.AREA_ID, ct.Circle_descr, lt.Line_descr
from circles_table as ct
JOIN lines_table as lt
ON ct.Circle_location.STIntersects(lt.Line_location)=1
AND ct.area_id = lt.area_id

SQL Server Multiple Likes

I have an unusual question that seems simple but has me stumped in a SQL Server stored procedure.
I have 2 tables as described below.
tblMaster
ID, CommitDate, SubUser, OrigFileName
Sample data
ID CommitDate SubUser OrigFileName
----------------------------------------
1 2014-10-07 Test1 Test1.pdf
2 2014-10-08 Test2 Test2.pdf
3 2014-10-09 Test3 Test3.pdf
The above table is basically the header table that tracks the committed files. In addition to this, we have a details table with the following structure.
tblIndex
ID, FileID (Linking column to the header row described above), Word
Sample data:
ID FileID Word
------------------
1 1 Oil
2 1 oil
3 2 oil
4 2 tank
5 3 tank
The above rows represent the words that we want to search on; if the search criteria match, we return the corresponding filename/header row ID. What I would love to figure out is how to handle a search for:
One word (i.e. "oil"): the system should respond with all the files that meet the criteria (the easiest case, already figured out).
More than one word (i.e. "oil" and "tank"): we should only see the second file, since it is the only one that has both oil and tank as its keywords.
I tried LIKE '%oil%' AND LIKE '%tank%', which returned no rows since one value can't be both oil and tank.
I tried LIKE '%oil%' OR LIKE '%tank%', but I get files 1, 2, and 3 since the OR matches any row containing either word.
One last thing, I recognize I could just do a search for the first term and then save the results into a temp table and then search for the second term in that second table and I will get what I am looking for. The problem with that is that I don't exactly know how many items will be searched for. I don't want to have to create a structure where I am constantly having to store data into another temp table if someone does a search for 6 "keywords".
Any help/ideas will be much appreciated.
Try this; it differs slightly from the other answer:
SELECT FileID, COUNT(DISTINCT t.Word)
FROM tblIndex t
WHERE t.Word LIKE '%oil%' OR t.Word LIKE '%tank%'
GROUP BY FileID
HAVING COUNT(DISTINCT t.Word) > 1
One simple option would be to do something like this :
SELECT FileID
FROM tblIndex t
WHERE t.Word LIKE '%oil%' OR t.Word LIKE '%tank%'
GROUP BY FileID
HAVING COUNT(*) > 1
This assumes you do not have duplicates in your tblIndex.
I'm also unsure whether you really need the LIKE. According to your sample data you don't; a basic equality comparison would be far more efficient and would avoid accidental substring matches.
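Both answers hard-code two keywords. A minimal sketch of the same GROUP BY / HAVING idea for any number of keywords follows, assuming SQLite through Python's sqlite3 module instead of SQL Server. The helper name files_with_all is hypothetical. The sketch uses exact matching rather than LIKE, and applies LOWER() because SQLite compares text case-sensitively by default, unlike a typical case-insensitive SQL Server collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblIndex (ID INTEGER PRIMARY KEY, FileID INTEGER, Word TEXT);
    INSERT INTO tblIndex VALUES
        (1, 1, 'Oil'), (2, 1, 'oil'), (3, 2, 'oil'), (4, 2, 'tank'), (5, 3, 'tank');
""")

def files_with_all(keywords):
    """Return the FileIDs indexed under every keyword (hypothetical helper)."""
    wanted = sorted({k.lower() for k in keywords})
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT FileID
        FROM tblIndex
        WHERE LOWER(Word) IN ({placeholders})
        GROUP BY FileID
        -- A file qualifies only if it matched as many distinct words as were asked for.
        HAVING COUNT(DISTINCT LOWER(Word)) = ?
        ORDER BY FileID
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

print(files_with_all(["oil"]))
print(files_with_all(["oil", "tank"]))
```

Comparing the distinct count against the number of keywords, rather than against 1, is what lets the same query serve a search for 2 or 6 keywords without temp tables.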

Microsoft Access 2010 - Updating Multiple Rows with Different values in ONE query

I have a question about updating multiple rows with different values in MS Access 2010.
Table 1: Food
ID | Favourite Food
1 | Apple
2 | Orange
3 | Pear
Table 2: New
ID | Favourite Food
1 | Watermelon
3 | Cherries
Right now, it looks deceptively simple to execute them separately (because this is just an example). But how would I execute a whole lot of them at the same time if I had, say, 500 rows to update out of 1000 records.
So what I want to do is to update the "Food" table based on the new values from the "New" table.
Would appreciate if anyone could give me some direction / syntax so that I can test it out on MS Access 2010. If this requires VBA, do provide some samples of how I should carry this out programmatically, not manually statement-by-statement.
Thank you!
ADDENDUM (REAL DATA)
Table: Competitors
Columns: CompetitorNo (PK), FirstName, LastName, Score, Ranking
query: FinalScore
Columns: CompetitorNo, Score, Ranking
Note - this query is a query of another query, which in turn, is a query of another query (could there be a potential problem here? There are at least 4 queries before this FinalScore query is derived. Should I post them?)
In the competitors table, all the columns except "Score" and "Ranking" are filled. We would need to take the values from the FinalScore query and insert them into the relevant competitor columns.
Addendum (Brief Explanation of Query)
Table: Competitors
Columns: CompetitorNo (PK), FirstName, LastName, Score, Ranking
Sample Data: AX1234, Simpson, Danny, <blank initially>, <blank initially>
Table: CompetitionRecord
Columns: EventNo (PK composite), CompetitorNo (PK composite), Timing, Bonus
Sample Data1: E01, AX1234, 14.4, 1
Sample Data2: E01, AB1938, 12.5, 0
Sample Data3: E01, BB1919, 13.0, 2
Event No specifies unique event ID
Timing measures the time taken to run 200 metres. Lower is better.
Bonus can be given in 3 values (0 - Disqualified, 1 - Normal, 2 - Exceptional). Competitors with Exceptional are given bonus points (5% off their timing).
Query: FinalScore
Columns: CompetitorNo (PK), Score, Ranking
Score is calculated by wins. For example, in the above event (E01), there are three competitors. The winner of the event is BB1919. Winners get 1 point. Losers don't get any points. Those that are disqualified do not receive any points either.
This query lists the competitors and their cumulative scores (from a list of many events - E01, E02, E03 etc.) and calculates their ranking in the ranking column every time the query is executed. (For example, a person who wins the most 200m events would be at the top of this list).
Now, I am required to update the Competitors table with this information. The query is rather complex - with all the grouping, summations, rankings and whatnots. Thus, I had to create multiple queries to achieve the end result.
How about:
UPDATE Food
INNER JOIN [New]
ON Food.ID = [New].ID
SET Food.[Favourite Food] = [New].[Favourite Food]
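The UPDATE ... INNER JOIN form above is Access syntax. For trying the idea outside Access, here is a rough equivalent in SQLite through Python's sqlite3 module. Note that it swaps in a different technique, a correlated subquery guarded by EXISTS, and renames the tables and columns (food, new_food, favourite_food) purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Renamed stand-ins for the Food and New tables (assumption).
    CREATE TABLE food (id INTEGER PRIMARY KEY, favourite_food TEXT);
    CREATE TABLE new_food (id INTEGER PRIMARY KEY, favourite_food TEXT);
    INSERT INTO food VALUES (1, 'Apple'), (2, 'Orange'), (3, 'Pear');
    INSERT INTO new_food VALUES (1, 'Watermelon'), (3, 'Cherries');
""")

# Update every row that has a replacement in one statement.
# The EXISTS guard leaves rows without a replacement (id 2) untouched.
conn.execute("""
    UPDATE food
    SET favourite_food = (SELECT n.favourite_food
                          FROM new_food n
                          WHERE n.id = food.id)
    WHERE EXISTS (SELECT 1 FROM new_food n WHERE n.id = food.id)
""")

rows = conn.execute("SELECT id, favourite_food FROM food ORDER BY id").fetchall()
print(rows)
```

Either way, one statement handles all matching rows, whether there are 2 or 500 of them.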

SQL Query with multiple values in one column

I've been beating my head on the desk trying to figure this one out. I have a table that stores job information, and reasons for a job not being completed. The reasons are numeric codes: 01, 02, 03, etc. You can have two reasons for a pending job. If you select two reasons, they are stored in the same column, separated by a comma. This is an example from the JOBID table:
Job_Number User_Assigned PendingInfo
1 user1 01,02
There is another table named Pending, that stores what those values actually represent. 01=Not enough info, 02=Not enough time, 03=Waiting Review. Example:
Pending_Num PendingWord
01 Not Enough Info
02 Not Enough Time
What I'm trying to do is query the database to give me all the job numbers, users, pendinginfo, and pending reason. I can break out the first value, but can't figure out how to do the second. What my limited skills have so far:
select Job_number,user_assigned,SUBSTRING(pendinginfo,0,3),pendingword
from jobid,pending
where
SUBSTRING(pendinginfo,0,3)=pending.pending_num and
pendinginfo!='00,00' and
pendinginfo!='NULL'
What I would like to see for this example would be:
Job_Number User_Assigned PendingInfo PendingWord PendingInfo PendingWord
1 User1 01 Not Enough Info 02 Not Enough Time
Thanks in advance
You really shouldn't store multiple items in one column if your SQL is ever going to want to process them individually. The "SQL gymnastics" you have to perform in those cases are both ugly hacks and performance degraders.
The ideal solution is to split the individual items into separate columns and, for 3NF, move those columns to a separate table as rows if you really want to do it properly (but baby steps are probably okay if you're sure there will never be more than two reasons in the short-medium term).
Then your queries will be both simpler and faster.
However, if that's not an option, you can use the afore-mentioned SQL gymnastics to do something like:
where find ( ',' || fld || ',', ',02,' ) > 0
assuming your SQL dialect has a string search function (find in this case, but I think charindex for SQLServer).
This will ensure all sub-columns begin and end with a comma (comma plus field plus comma) and look for a specific desired value (with the commas on either side to ensure it's a full sub-column match).
If you can't control what the application puts in that column, I would opt for the DBA solution - DBA solutions are defined as those a DBA has to do to work around the inadequacies of their users :-).
Create two new columns in that table and make an insert/update trigger which will populate them with the two reasons that a user puts into the original column.
Then query those two new columns for specific values rather than trying to split apart the old column.
This means that the cost of splitting is only on row insert/update, not on every single select, amortising that cost efficiently.
Still, my answer is to re-do the schema. That will be the best way in the long term in terms of speed, readable queries and maintainability.
I hope you are just maintaining the code and it's not a brand new implementation.
Please consider using a different approach with a support table like this:
JOBS TABLE
jobID | userID
--------------
1 | user13
2 | user32
3 | user44
--------------
PENDING TABLE
pendingID | pendingText
---------------------------
01 | Not Enough Info
02 | Not Enough Time
---------------------------
JOB_PENDING TABLE
jobID | pendingID
-----------------
1 | 01
1 | 02
2 | 01
3 | 03
3 | 01
-----------------
You can easily query these tables using JOINs or subqueries.
If you need backward compatibility in your software, you can add a view to achieve this.
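The support-table layout above can be queried with two joins. A minimal sketch follows, assuming SQLite through Python's sqlite3 module. The table and column names follow the layout shown; the text for code 03 ("Waiting Review") comes from the question, and the ids are stored as text to keep their leading zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (jobID INTEGER PRIMARY KEY, userID TEXT);
    CREATE TABLE pending (pendingID TEXT PRIMARY KEY, pendingText TEXT);
    -- Bridging table: one row per (job, reason) pair.
    CREATE TABLE job_pending (
        jobID INTEGER REFERENCES jobs (jobID),
        pendingID TEXT REFERENCES pending (pendingID),
        PRIMARY KEY (jobID, pendingID)
    );
    INSERT INTO jobs VALUES (1, 'user13'), (2, 'user32'), (3, 'user44');
    INSERT INTO pending VALUES
        ('01', 'Not Enough Info'), ('02', 'Not Enough Time'), ('03', 'Waiting Review');
    INSERT INTO job_pending VALUES (1, '01'), (1, '02'), (2, '01'), (3, '03'), (3, '01');
""")

# One output row per job and pending reason; no string splitting needed.
rows = conn.execute("""
    SELECT j.jobID, j.userID, p.pendingID, p.pendingText
    FROM jobs j
    JOIN job_pending jp ON jp.jobID = j.jobID
    JOIN pending p ON p.pendingID = jp.pendingID
    ORDER BY j.jobID, p.pendingID
""").fetchall()

for row in rows:
    print(row)
```

Because each reason is its own row, a job can carry any number of reasons without a schema change.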
I have tables like:
Events
---------
eventId int
eventTypeIds nvarchar(50)
...
EventTypes
--------------
eventTypeId
Description
...
Each Event can have multiple eventtypes specified.
All I do is write 2 procedures in my site code, not SQL code.
One procedure converts the table field (eventTypeIds) value like "3,4,15,6" into a ViewState array, so I can use it anywhere in code.
The other procedure does the opposite: it collects any options you checked and converts it in
If changing the schema is an option (which it probably should be) shouldn't you implement a many-to-many relationship here so that you have a bridging table between the two items? That way, you would store the number and its wording in one table, jobs in another, and "failure reasons for jobs" in the bridging table...
Have a look at a similar question I answered here
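;WITH Numbers AS
(
-- Candidate character positions 1..N. This yields only as many numbers as JobId has rows,
-- so JobId needs enough rows to cover every character position in the longest PendingInfo value.
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 0)) AS N
FROM JobId
),
Split AS
(
-- One output row per comma-separated value in PendingInfo
SELECT Job_Number, User_Assigned, SUBSTRING(PendingInfo, Numbers.N, CHARINDEX(',', PendingInfo + ',', Numbers.N) - Numbers.N) AS Pending_Num
FROM JobId
JOIN Numbers ON Numbers.N <= DATALENGTH(PendingInfo) + 1
AND SUBSTRING(',' + PendingInfo, Numbers.N, 1) = ','
)
SELECT *
FROM Split JOIN Pending ON Split.Pending_Num = Pending.Pending_Num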
The basic idea is that you have to multiply each row as many times as there are Pending_Num values in it, then extract the appropriate part of the string for each copy.
While I agree with the DBA perspective that you should not store multiple values in a single field, it is doable, as shown below, and can be practical for application logic and for some performance concerns. Let's say you have 10000 user groups, each with 1000 members on average. You may want a table user_groups with columns such as groupID and membersID. Your membersID column could be populated like this:
(',10,2001,20003,333,4520,'), each number being a memberID, all separated by commas, with an extra comma at the start and end of the data. Then your select would use LIKE '%,someID,%'.
If you cannot change your data ('01,02,03' or similar) and want, say, the rows containing 01, you can still use "select ... WHERE col = '01' OR col LIKE '01,%' OR col LIKE '%,01' OR col LIKE '%,01,%'" (col standing for the column name), which ensures a match whether the value is the only entry or sits at the start, end or inside, while avoiding similar numbers (i.e. 101).
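The comma-wrapping idea can be sketched as a single predicate instead of several LIKE branches. This is a hedged illustration in SQLite through Python's sqlite3 module, where || is string concatenation (SQL Server would use + instead). The helper name jobs_with_reason is hypothetical, and only the first row comes from the question; the other rows are made up to show the edge cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobid (Job_Number INTEGER, User_Assigned TEXT, PendingInfo TEXT);
    INSERT INTO jobid VALUES
        (1, 'user1', '01,02'),  -- row from the question
        (2, 'user2', '101'),    -- hypothetical: must not match a search for 01
        (3, 'user3', '02'),     -- hypothetical: single value
        (4, 'user4', '03,01');  -- hypothetical: value at the end
""")

def jobs_with_reason(code):
    """Return job numbers whose PendingInfo list contains code (hypothetical helper)."""
    sql = """
        SELECT Job_Number
        FROM jobid
        -- Wrap the stored list in commas so every entry is delimited on both sides.
        WHERE (',' || PendingInfo || ',') LIKE ('%,' || ? || ',%')
        ORDER BY Job_Number
    """
    return [row[0] for row in conn.execute(sql, (code,))]

print(jobs_with_reason("01"))
print(jobs_with_reason("02"))
```

A leading-wildcard LIKE generally cannot use an ordinary index, so this stays a workaround for data you cannot restructure rather than a replacement for a proper bridging table.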