difficulty with an sql query using Oracle database - sql

the question:
Find the title of the books whose keyword contains the last 3 characters of the bookgroup of the book which was booked by "Mr. Karim".
I am trying to do it like this:
SELECT Title
FROM Lib_Book
WHERE BookKeywords LIKE '%(SELECT BookGroup FROM Lib_Book WHERE BookId=(SELECT BookId from Lib_Booking, Lib_Borrower WHERE Lib_Booking.BId=Lib_Borrower.BId AND Lib_Borrower.BName = 'Mr. Karim'))%';
from the part after % upto the end returns me an answer which is 'programming'. so i need to indicate the BookKeyword as '%ing%'. How can i do that?
**the tables are huge so i hvnt written those here..if anyone need to those plz lemme know...thnx

You have the basic concept down, although it's not possible to process a SELECT statement inside a LIKE clause (and I'd probably shoot the RDBMS developer who allowed that - it's a GAPING HUGE HOLE for SQL Injection). Also, you're likely to have problems with multiple results, as your 'query' would fail the moment Mr. Karim had borrowed more than one book.
I'd probably start by attempting to make things run off of joins (oh, and never use implicit join syntax):
SELECT d.title
FROM Lib_Borrower as a
JOIN Lib_Booking as b
ON b.bId = a.bId
JOIN Lib_Book as c
ON c.bookId = b.bookId
JOIN Lib_Book as d
ON d.bookKeywords LIKE '%' + SUBSTRING(c.bookGroup, LENGTH(c.bookGroup) - 3) + '%'
WHERE a.bName = 'Mr. Karim'
Please note the following caveats:
This will get you all titles for all books with similar keywords to all borrowed books (from all 'Mr. Karim's). You may need to include some sort of restrictive criteria while joining to Lib_Booking.
The column bookKeywords seems potentially like a multi-value column (would need example data). If so, your table structure needs to be revised.
The use of SUBSTRING() or RIGHT() will invalidate the use of inidicies in joining to the bookGroup column. There isn't much you can necessarily do about that, given your requirements...
This table is not internationalization safe (because the bookGroup column is language-dependant parsed text). You may find yourself better served by creating a Book_Group table, a Keywords table, and a cross-reference Book_Keywords table, and joining on numerical ids. You may also want language-keyed Book_Group_Description and Keyword_Description tables. This will take more space, and probably take more processing time (increased number of joins, although potentially less textual processing), but give you increased flexibility and 'safety'.

Related

cannot implement this inner join in oracle

I have recently started to learn oracle, and I am having difficulty understanding this inner join on the tables.
INSERT INTO temp_bill_pay_ft
SELECT DISTINCT
ft.ft_id,
ft.ft_credit_acct_no,
ft.ft_debit_acct_no,
ft.ft_stmt_nos,
ft.ft_debit_their_ref,
ft.ft_date_time
FROM
funds_transfer_his ft
INNER JOIN temp_bill_pay_lwday_pl dt
ON ft.ft_id = dt.ac_ste_trans_reference || ';1'
AND ft.ft_credit_acct_no = dt.ac_id;
It is this line specifically which I dont understand, why do we use || here, I suppose it is for concatenation.
ON ft.ft_id = dt.ac_ste_trans_reference||';1'
Can somebody please explain to me this sql query. I would really appreciate it. Thank you.
This is string concatenation. The need is because there is a design error in the database and join keys are not the same in the two tables. So the data might look something like this:
ft_id ac_ste_trans_reference
123;1 123
abc;1 abc
In order for the join to work, the keys need to match. One possibility is to remove the last two characters from ft_id, but I'm guessing those are meaningful.
I can speculate on why this is so. One possibility is that ft_id is really a compound key combined into a single column -- and the 1 is used to indicate the "type" of key. If so, then there are possibly other values after this:
ft_id
123;1
garbled;2
special;3
The "2" and "3" would refer to other reference tables.
If this is the situation, then it would be cleaner to have a separate column with the correct ac_ste_trans_reference. However that occupies additional space, and can require multiple additional columns for each type. So hacks like the one you see are sometimes implemented.
Yes it is used for concatenation.
But only somebody having worked on this database model can explain what table data represent and why this concatenation is needed for this joining condition.

SQL Server More Efficient Substring

I am running a query in SQL Server where I need to join two tables into one where the full name field matches the partial name field in another after apostrophes have been removed. For a code example the join is happening like this:
from [Data1]
right join [Data2]
on replace([Data2].[PartialName], '''','')=Substring([Data1].[FullName],1,1+LEN(replace([Data2].[PartialName], '''','')))
And it works. But it takes what would be a 10 second execution if we just used where name=name and makes it take around 20 minutes. This is rather unacceptable in terms of run time so I was wondering if anyone had any more efficient alternatives to consider.
Btw Data 1 has about 800 lines and Data2 has about 1.6 million if it's relevant.
Edit: I've been told I need to give a bit more descriptive information. Basically in this example Data1 is a table from an outside source that contains a name field [FullName] which contains people's full names in the form of 'Last-Name , First-Name Middle-Name(s)' with any apostrophes removed (for example in the name O'Neil it would just be ONeil).
So an example would be 'ONeil , Sarah Conner'
Data2 contains a name field that has names in the form 'Last-Name , First-Name' Middle names are omitted and apostrophes are intact. So for example 'O'Neil , Sarah'
These tables need to be merged together on their name fields, hence the logic above.
DavidG is right, a PERSISTED column is the way to go here. After drinking a little more coffee, I think you need a computed column and then LIKE in your JOIN. The PERSISTED column's SQL would be something like:
ALTER TABLE [Data2] ADD PartialName_na AS REPLACE(PartialName,'''','') PERSISTED;
You may want to add that to an index. Then your new (pseudo) SQL query would be:
SELECT ...
FROM Data2 D2
LEFT JOIN Data1 D1 ON D1.FullName = D2.PartialName_na + '%';
There's no need to use SUBSTRING. LIKE will maintain SARGability here, it doesn't use a leading wildcard.
Edit: Couple of notes. I used the _na suffix to stand for "No Apostrophe"; you can call the column whatever you want. I also changed the query from a RIGHT JOIN to a LEFT JOIN. Personally I feel that LEFT JOINs are much easier to read, however, if you want to swap it back round, feel free.

Difference between SQL Server codes?

For a school assignment, I have to create a database and run reports. I've created code and a classmate has also created code and it runs the same thing, but his is in a format I've not seen and don't quite understand.
Here is mine:
SELECT
Course.Name AS 'Course Name',
Program.Name AS 'Program Name'
FROM
Course, Program, ProgramCourse
WHERE
ProgramCourse.CourseID = Course.ID
AND
ProgramCourse.ProgramID = Program.ID
GO
And here's his:
CREATE VIEW NumberOfCoursePerProgram AS
SELECT
p.name AS ProgramName,
c.name AS CourseName
FROM
Program p
JOIN
ProgramCourse pc ON pc.ProgramID = p.ID
JOIN
Course c ON c.ID = pc.CourseID
GO
I ran both queries using the data in the tables I've created. They return practically the same results, just in a slightly different order but it fulfills the assignment question. Anyway, if I delete the p from Program p from his code, it returns an error
The multi-part identifier "p.name" could not be bound.
So how is SQL Server able to accept p.name and p.ID, etc. when I haven't ever established these variables? I don't quite understand how the code is working on his. Mine seems simple and straightforward, and I definitely understand what's going on there. So can someone explain his?
Thanks
There's a few differences. First off, he's creating a VIEW rather than just a select statement:
CREATE VIEW NumberOfCoursePerProgram AS
Once the view is created, you can query the view just as you would a table:
SELECT * FROM NumberOfCoursePerProgram;
Second, he's using an ANSI JOIN rather than an implicit JOIN. His method is more modern and most likely considered more correct by today's standards:
JOIN ProgramCourse pc ON pc.ProgramID = p.ID
JOIN Course c ON c.ID= pc.CourseID
Rather than:
FROM Course, Program, ProgramCourse
Also, note he's assigning table aliases when he refers to a table:
FROM Program p
The p at the end allows you to substitute p rather than specify the entire table name of Program elsewhere in the query. For example, you can now say WHERE p.Foo > 5 rather than WHERE Program.Foo > 5. In this case, it's just a shortcut and saves a few characters. However, suppose you were referring to the same table twice (for example, JOINing in two different rows on the same table). In that case, you might have to provide aliases for each table to disambiguate which one is which.
These are called alias in SQL. Alias is basically created to give more readability and for better ease of writing code.
The readability of a SELECT statement can be improved by giving a
table an alias, also known as a correlation name or range variable. A
table alias can be assigned either with or without the AS keyword:
table_name AS table alias
table_name table_alias
So in your query p is an alias to Program so that means now you can refer your table Program by the name of p instead of writing the whole name Program everywhere.
Similarly you can access the names of the columns of your table Program by simply writing p with a dot and then the column name. Something like p.column. This technique is very useful when you using JOINS and some your tables have similar names of the columns.
EDIT:-
Although most of the points are covered in other's answer. I am just adding a point that you should avoid the habit of JOINING table the way you are doing it right now.
You may check Bad habits to kick : using old-style JOINs by Aaron Bertrand for reference.
CREATE VIEW NumberOfCoursePerProgram AS
BEGIN
SELECT
p.name AS ProgramName,
c.name AS CourseName
FROM
Program p
JOIN
ProgramCourse pc ON pc.ProgramID= p.ID
JOIN Course c ON c.ID= pc.CourseID
END
GO
Observe that both tables Program and Course have a table alias defined.
The select part must specify the table from which the column name comes from. Which is exactly what you did. Your partner just added aliases to the tables names. These aliases are shorter, and makes the query look a bit less like a big wall of text.
The other difference is the use of joints. The joins are usually used to link results from two tables that has a corresponding column.
The columns are usually the primary key, and the foreign key in the second table.
Your query is fine, but the join syntax is preferred.
Edit : Once the view is created, it is now compiled and can be used in a select just like a table.
SELECT * FROM NumberOfCoursePerProgram
If you need to modify the view, you can use
ALTER VIEW NumberOfCoursePerProgram AS
......
......
There are a few things going on here:
if I delete the "p" from "Program p" from his code, it returns "The multi-part identifier "p.name" could not be bound."
The P is an alias for program. By doing it that way. If you ever need to use Program you can just call it P.
As far as what he is doing: it's better to do a JOIN in this case as opposed to doing a multi-table SELECT that you are doing. In simple cases it doesn't really matter but what if course and program both had millions of rows. In your way you are basically selecting everything. By doing the join on ProgramID you are only getting those items in ProgramCourse that correspond to an entry in Program as well (tied together by ID and CourseID).
One more important note. You are doing a simple SELECT statement. In SQL there are objects called VIEWS that act as a virtual table. He can now do any time a SELECT * FROM NumberOfCoursePerProgram and he will never have to do any of the joins, and selects again.
Hope that helps...
In SQL Server you can give a table an alias without AS, its optional.
If you were to add the AS it would will work just like you have it.
It may also be worthy to note that he is using a join as opposed to a where clause join, which is my preference because it's easier to read and more update code ethic.
Sometimes you need to join on a WHERE clause because of multiple conditions on join, but that's pretty rare in your case.

is there a way to search within multiple tables in SQL

Right now i have 100 tables in SQL and i am looking for a specific string value in all tables, and i do not know which column it is in.
select * from table1, table2 where column1 = 'MyLostString' will not work because i do not know which column it has to be in.
Is there a SQL query for that, must i brute force search every table for every column for that 'MyLostString'
If I were to brute-force search across all tables, is there an efficient query for that?
For instance:
select * from table3 where allcolumns = MyLostString
It is the defining feature of a RDBMS (or at least one of them), that the meaning of a value depends on the column it is in. E.g.: The value 17 will have quite different meanings, if it stands in a customer_id column, than in the product_id of a fictional orders table.
This leads to the fact, that RDBMS are not well equipped to search for a value, no matter in which column of which tables it might be used.
My recommendation is to first study the data model to try and find out, which column of which table should be holding the value. If this really fails, you have a problem much worse than a "lost string".
The last ressort is to transform the DB into something better suited for fulltext search ... such as a flat file. You might want to try mydbexportcommand --options | grep -C10 'My lost string' or friends.

Use a LIKE clause in part of an INNER JOIN

Can/Should I use a LIKE criteria as part of an INNER JOIN when building a stored procedure/query? I'm not sure I'm asking the right thing, so let me explain.
I'm creating a procedure that is going to take a list of keywords to be searched for in a column that contains text. If I was sitting at the console, I'd execute it as such:
SELECT Id, Name, Description
FROM dbo.Card
WHERE Description LIKE '%warrior%'
OR
Description LIKE '%fiend%'
OR
Description LIKE '%damage%'
But a trick I picked up a little while go to do "strongly typed" list parsing in a stored procedure is to parse the list into a table variable/temporary table, converting it to the proper type and then doing an INNER JOIN against that table in my final result set. This works great when sending say a list of integer IDs to the procedure. I wind up having a final query that looks like this:
SELECT Id, Name, Description
FROM dbo.Card
INNER JOIN #tblExclusiveCard ON dbo.Card.Id = #tblExclusiveCard.CardId
I want to use this trick with a list of strings. But since I'm looking for a particular keyword, I am going to use the LIKE clause. So ideally I'm thinking I'd have my final query look like this:
SELECT Id, Name, Description
FROM dbo.Card
INNER JOIN #tblKeyword ON dbo.Card.Description LIKE '%' + #tblKeyword.Value + '%'
Is this possible/recommended?
Is there a better way to do something like this?
The reason I'm putting wildcards on both ends of the clause is because there are "archfiend", "beast-warrior", "direct-damage" and "battle-damage" terms that are used in the card texts.
I'm getting the impression that depending on the performance, I can either use the query I specified or use a full-text keyword search to accomplish the same task?
Other than having the server do a text index on the fields I want to text search, is there anything else I need to do?
Try this
select * from Table_1 a
left join Table_2 b on b.type LIKE '%' + a.type + '%'
This practice is not ideal. Use with caution.
Your first query will work but will require a full table scan because any index on that column will be ignored. You will also have to do some dynamic SQL to generate all your LIKE clauses.
Try a full text search if your using SQL Server or check out one of the Lucene implementations. Joel talked about his success with it recently.
try it...
select * from table11 a inner join table2 b on b.id like (select '%'+a.id+'%') where a.city='abc'.
Its works for me.:-)
It seems like you are looking for full-text search. Because you want to query a set of keywords against the card description and find any hits? Correct?
Personally, I have done it before, and it has worked out well for me. The only issues i could see is possibly issues with an unindexed column, but i think you would have the same issue with a where clause.
My advice to you is just look at the execution plans between the two. I'm sure that it will differ which one is better depending on the situation, just like all good programming problems.
#Dillie-O
How big is this table?
What is the data type of Description field?
If either are small a full text search will be overkill.
#Dillie-O
Maybe not the answer you where looking for but I would advocate a schema change...
proposed schema:
create table name(
nameID identity / int
,name varchar(50))
create table description(
descID identity / int
,desc varchar(50)) --something reasonable and to make the most of it alwase lower case your values
create table nameDescJunc(
nameID int
,descID int)
This will let you use index's without have to implement a bolt on solution, and keeps your data atomic.
related: Recommended SQL database design for tags or tagging
a trick I picked up a little while go
to do "strongly typed" list parsing in
a stored procedure is to parse the
list into a table variable/temporary
table
I think what you might be alluding to here is to put the keywords to include into a table then use relational division to find matches (could also use another table for words to exclude). For a worked example in SQL see Keyword Searches by Joe Celko.
Performance will be depend on the actual server than you use, and on the schema of the data, and the amount of data. With current versions of MS SQL Server, that query should run just fine (MS SQL Server 7.0 had issues with that syntax, but it was addressed in SP2).
Have you run that code through a profiler? If the performance is fast enough and the data has the appropriate indexes in place, you should be all set.
LIKE '%fiend%' will never use an seek, LIKE 'fiend%' will. Simply a wildcard search is not sargable
Try this;
SELECT Id, Name, Description
FROM dbo.Card
INNER JOIN #tblKeyword ON dbo.Card.Description LIKE '%' +
CONCAT(CONCAT('%',#tblKeyword.Value),'%') + '%'