I have two tables, Author_Dim and Author_Fact as below:
CREATE TABLE Author_Dim
(
TitleAuthor_ID INT IDENTITY(1,1),
Title_ID CHAR(20),
Title VARCHAR(80),
Type_Title CHAR(12),
Author_ID CHAR(20),
Last_Name VARCHAR(40),
First_Name VARCHAR(20),
Contract_Author BIT,
Author_Order INT,
PRIMARY KEY (TitleAuthor_ID),
);
GO
CREATE TABLE Author_Fact
(
Fact_ID INT IDENTITY(1,1),
TitleAuthor_ID INT,
Author_ID CHAR (20),
Price DEC(10,2),
YTD_Sales INT,
Advance DEC(10,2),
Royalty INT,
Royalty_Perc INT,
Total_Sales DEC(10,2),
Total_Advance DEC(10,2),
Total_Royalty DEC(10,2),
PRIMARY KEY(Fact_ID),
FOREIGN KEY (TitleAuthor_ID) REFERENCES Author_Dim(TitleAuthor_ID),
);
go
I wish to create a view that gives the total royalties paid per author and then sorts it with the highest paid author shown first, i.e. it sums the Total_Royalty column, groups it by the Author_ID and then sorts the Total_Royalty in descending order.
I have the below but I'm not sure how to add the sum/group/sort functions to the view:
Create view [Total_Royalty_View] As (
Select Author_Dim.Author_ID, Author_Dim.Last_Name, Author_Dim.First_Name, Author_Fact.Total_Royalty
From Author_Dim
Join Author_Fact
On Author_Fact.TitleAuthor_ID = Author_Dim.TitleAuthor_ID
);
Go
In SQL, everything is a table. You select from tables, and the result is again a table (consisting of columns and rows). You can store a select statement for re-use, and this is called a view. You could just as well write an ad-hoc view (a subquery in a FROM clause). Either way, the result is again a table.
And tables are considered unordered sets of data.
So, you cannot write a view that produces an ordered set of rows.
Here is the view (unordered):
create view total_royalty_view as
select
a.author_id,
a.last_name,
a.first_name,
coalesce(r.sum_total_royalty, 0) as total_royalty
from author_dim a
left join
(
select titleauthor_id, sum(total_royalty) as sum_total_royalty
from author_fact
group by titleauthor_id
) r on r.titleauthor_id = a.titleauthor_id;
And here is how to select from it:
select *
from total_royalty_view
order by total_royalty desc;
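Note that the view above returns one row per Author_Dim row, so an author with several titles appears several times. If the goal is one total per author, as the question describes, a minimal sketch could group by the author columns instead. The view name Total_Royalty_By_Author_View is made up for illustration, and the sketch assumes Last_Name and First_Name are identical on every Author_Dim row belonging to the same Author_ID:

```sql
-- Sketch: one row per author rather than one row per title/author pair.
-- Assumes name columns are consistent across all rows of the same Author_ID.
CREATE VIEW Total_Royalty_By_Author_View AS
SELECT
    d.Author_ID,
    d.Last_Name,
    d.First_Name,
    COALESCE(SUM(f.Total_Royalty), 0) AS Total_Royalty  -- 0 for authors with no fact rows
FROM Author_Dim AS d
LEFT JOIN Author_Fact AS f
    ON f.TitleAuthor_ID = d.TitleAuthor_ID
GROUP BY d.Author_ID, d.Last_Name, d.First_Name;
GO

-- The ordering still belongs in the query that reads the view.
SELECT Author_ID, Last_Name, First_Name, Total_Royalty
FROM Total_Royalty_By_Author_View
ORDER BY Total_Royalty DESC;
```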
I can do this question with three tables. However, the extra table confuses me a lot. Here is the schema:
COURSE(CourseID, CourseName)
COURSEMODULES(CourseID, ModuleID)
MODULE (ModuleID, ModuleName, LecturerID)
LECTURER(LecturerID, FirstName, Surname, Email)
How would you ensure that more than one lecturer can teach on a module?
I've made some assumptions about the data types contained in your schema. The following query will allow you to select only lecturers that are teaching more than 1 module:
SELECT
COUNT(ModuleName) AS number_of_modules_taught
, FirstName
, Surname
, Email
, CourseName
FROM
(SELECT
l.FirstName
, l.Surname
, l.Email
, c.CourseName
, m.ModuleName
FROM course c
JOIN coursemodules cm ON c.CourseID = cm.CourseID
JOIN module m ON cm.ModuleID = m.ModuleID
JOIN lecturer l ON m.LecturerID = l.lecturerID
) AS modules_by_lecturer
GROUP BY 2, 3, 4, 5
HAVING COUNT(ModuleName) > 1
I've also made this available on SQL Fiddle for you to play around with: http://sqlfiddle.com/#!9/9f25d5/4
Edit: After the fact it occurred to me that you might actually care about loading the data into tables such that one lecturer can be mapped to more than one module. This is accomplished by allowing the ModuleID field to be unique, and mapping it to the same lecturer:
CREATE TABLE course (CourseID VARCHAR(20), CourseName VARCHAR(100));
CREATE TABLE coursemodules (CourseID VARCHAR(20), ModuleID VARCHAR(20));
CREATE TABLE module (ModuleID VARCHAR(20), ModuleName VARCHAR(100), LecturerID VARCHAR(20));
CREATE TABLE lecturer (LecturerID VARCHAR(20), FirstName VARCHAR(50), Surname VARCHAR(50), Email VARCHAR(100));
INSERT INTO course (CourseID, CourseName) VALUES(12345, 'Physics 101'),(23456, 'English 102'),(34567, 'Computer Science 306');
INSERT INTO coursemodules (CourseID, ModuleID) VALUES(12345, 13579),(12345, 79135),(23456, 35791),(34567, 57913);
INSERT INTO module (ModuleID, ModuleName, LecturerID) VALUES(13579, 'Newton\'s Laws', 24680),(79135, 'Thermodynamics', 24680),(35791, 'Chaucer', 46802),(57913, 'Lambda Functions in Java', 68024);
INSERT INTO lecturer (LecturerID, FirstName, Surname, Email) VALUES(24680, 'Stephen', 'Hawking', 'shawking@cambridge.com'), (80246, 'Neil', 'Tyson', 'ndtyson@amnh.org'),(46802, 'George', 'Martin', 'grrmartin@westeros.com'),(68024, 'Linus', 'Torvalds', 'lt@linux.org');
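The question as literally asked (more than one lecturer on a module) is a many-to-many relationship, which a single LecturerID column on MODULE cannot hold. A minimal sketch, assuming a new junction table: the name modulelecturers is hypothetical and not part of the original schema, and the LecturerID column on module would then become redundant.

```sql
-- Hypothetical junction table: one row per (module, lecturer) pairing.
CREATE TABLE modulelecturers (
    ModuleID   VARCHAR(20) NOT NULL,
    LecturerID VARCHAR(20) NOT NULL,
    PRIMARY KEY (ModuleID, LecturerID)  -- prevents the same pairing twice
);

-- Two lecturers sharing the module with ModuleID 13579.
INSERT INTO modulelecturers (ModuleID, LecturerID) VALUES
    ('13579', '24680'),
    ('13579', '80246');

-- Modules taught by more than one lecturer.
SELECT m.ModuleName, COUNT(*) AS number_of_lecturers
FROM module m
JOIN modulelecturers ml ON ml.ModuleID = m.ModuleID
GROUP BY m.ModuleID, m.ModuleName
HAVING COUNT(*) > 1;
```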
If I have two full text indexes on tables such as Contacts and Companies, how can I write a query that ensures ALL the words of the search phrase exist within either of the two indexes?
For example, if I'm searching for contacts where all the keywords exist in either the contact record or the company, how would I write the query?
I've tried doing CONTAINSTABLE on both the contact and company tables and then joining the tables together, but if I pass the search phrase in to each as '"searchTerm1*" AND "searchTerm2*"' then it only matches when all the search words are on both indexes and returns too few records. If I pass it in like '"searchTerm1*" OR "searchTerm2*"' then it matches where any (instead of all) of the search words are in either of the indexes and returns too many records.
I also tried creating an indexed view that joins contacts to companies so I could search across all the columns in one shot, but unfortunately a contact can belong to more than one company and so the ContactKey that I was going to use as the key for the view is no longer unique and so it fails to be created.
It seems like maybe I need to break the phrase apart and query for each word separately and then join the results back together to be able to ensure all the words were matched on, but I can't think of how I'd write that query.
Here's an example of what the model could look like:
Contact          CompanyContact    Company
--------------   --------------    ------------
ContactKey       ContactKey        CompanyKey
FirstName        CompanyKey        CompanyName
LastName
I have a Full Text index on FirstName,LastName and another on CompanyName.
This answer is rebuilt to address your issue such that multiple strings must exist ACROSS the fields. Note the single key in the CompanyContactLink linking table:
CREATE FULLTEXT CATALOG CompanyContact WITH ACCENT_SENSITIVITY = OFF
GO
CREATE TABLE Contact ( ContactKey INT IDENTITY, FirstName VARCHAR(20) NOT NULL, LastName VARCHAR(20) NOT NULL )
ALTER TABLE Contact ADD CONSTRAINT PK_Contact PRIMARY KEY NONCLUSTERED ( ContactKey )
CREATE TABLE Company ( CompanyKey INT IDENTITY, CompanyName VARCHAR(50) NOT NULL )
ALTER TABLE Company ADD CONSTRAINT PK_Company PRIMARY KEY NONCLUSTERED ( CompanyKey )
GO
CREATE TABLE CompanyContactLink ( CompanyContactKey INT IDENTITY NOT NULL, CompanyKey INT NOT NULL, ContactKey INT NOT NULL )
GO
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Dipper', 'Pines' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Mabel', 'Pines' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Stanley', 'Pines' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Soos', 'Ramirez' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Wendy', 'Corduroy' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Sheriff', 'Blubs' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Bill', 'Cipher' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Pine Dip', 'Nobody' )
INSERT INTO Contact ( FirstName, LastName ) VALUES ( 'Nobody', 'Pine Dip' )
INSERT INTO Company ( CompanyName ) VALUES ( 'Mystery Shack' )
INSERT INTO Company ( CompanyName ) VALUES ( 'Greesy Diner' )
INSERT INTO Company ( CompanyName ) VALUES ( 'Watertower' )
INSERT INTO Company ( CompanyName ) VALUES ( 'Manotaur Cave' )
INSERT INTO Company ( CompanyName ) VALUES ( 'Big Dipper Watering Hole' )
INSERT INTO Company ( CompanyName ) VALUES ( 'Lost Pines Dipping Pool' )
GO
INSERT INTO CompanyContactLink Values (3, 5), (1, 1), (1, 2), (1, 3), (1, 4), (1,5), (5,1), (3,1), (4,1)
GO
CREATE FULLTEXT INDEX ON Contact (LastName, FirstName)
KEY INDEX PK_Contact
ON CompanyContact
WITH STOPLIST = SYSTEM
CREATE FULLTEXT INDEX ON Company (CompanyName)
KEY INDEX PK_Company
ON CompanyContact
WITH STOPLIST = SYSTEM
GO
CREATE VIEW CompanyContactView
WITH SCHEMABINDING
AS
SELECT
CompanyContactKey,
CompanyName,
FirstName,
LastName
FROM
dbo.CompanyContactLink
INNER JOIN dbo.Company ON Company.CompanyKey = CompanyContactLink.CompanyKey
INNER JOIN dbo.Contact ON Contact.ContactKey = CompanyContactLink.ContactKey
GO
CREATE UNIQUE CLUSTERED INDEX idx_CompanyContactView ON CompanyContactView (CompanyContactKey);
GO
CREATE FULLTEXT INDEX ON CompanyContactView (CompanyName, LastName, FirstName)
KEY INDEX idx_CompanyContactView
ON CompanyContact
WITH STOPLIST = SYSTEM
GO
-- Wait a few moments for the FULLTEXT INDEXing to take place.
-- Check to see how the index is doing ... repeat the following line until you get a zero back.
DECLARE @ReadyStatus INT
SET @ReadyStatus = 1
WHILE (@ReadyStatus != 0)
BEGIN
SELECT @ReadyStatus = FULLTEXTCATALOGPROPERTY('CompanyContact', 'PopulateStatus')
END
SELECT
CompanyContactView.*
FROM
CompanyContactView
WHERE
FREETEXT((FirstName,LastName,CompanyName), 'Dipper') AND
FREETEXT((FirstName,LastName,CompanyName), 'Shack')
GO
And for the sake of your example with Wendy at the Watertower:
SELECT
CompanyContactView.*
FROM
CompanyContactView
WHERE
FREETEXT((FirstName,LastName,CompanyName), 'Wendy') AND
FREETEXT((FirstName,LastName,CompanyName), 'Watertower')
GO
I created a method that works with any number of full-text indexes and columns. Using this method, it is very easy to add additional facets to search for.
1. Split the search phrase into rows in a temp table.
2. Join to this temp table to search for each search term using CONTAINSTABLE on each applicable full-text index.
3. Union the results together and get the distinct count of the search terms found.
4. Filter out results where the number of search terms specified does not match the number of search terms found.
Example:
DECLARE @SearchPhrase nvarchar(255) = 'John Doe'
DECLARE @Matches Table(
MentionedKey int,
CoreType char(1),
Label nvarchar(1000),
Ranking int
)
-- Split the search phrase into separate words.
DECLARE @SearchTerms TABLE (Term NVARCHAR(100), Position INT)
INSERT INTO @SearchTerms (Term)
SELECT dbo.ScrubSearchTerm(Term) -- Removes invalid characters and converts the words into search tokens for Full Text searching such as '"word*"'.
FROM dbo.SplitSearchTerms(@SearchPhrase)
-- Count the distinct search words, so a repeated word is not counted twice.
DECLARE @numSearchTerms int = (SELECT COUNT(DISTINCT Term) FROM @SearchTerms)
-- Find the matching contacts.
;WITH MatchingContacts AS
(
SELECT
[ContactKey] = sc.[KEY],
[Ranking] = sc.[RANK],
[Term] = st.Term
FROM @SearchTerms st
CROSS APPLY dbo.SearchContacts(st.Term) sc -- I wrap my CONTAINSTABLE query in a Sql Function for convenience
)
-- Find the matching companies
,MatchingContactCompanies AS
(
SELECT
c.ContactKey,
Ranking = sc.[RANK],
st.Term
FROM @SearchTerms st
CROSS APPLY dbo.SearchCompanies(st.Term) sc
JOIN dbo.CompanyContact cc ON sc.CompanyKey = cc.CompanyKey
JOIN dbo.Contact c ON c.ContactKey = cc.ContactKey
)
-- Find the matches where ALL search words were found.
,ContactsWithAllTerms AS
(
SELECT
x.ContactKey,
Ranking = SUM(x.Ranking)
FROM (
SELECT ContactKey, Ranking, Term FROM MatchingContacts UNION ALL
SELECT ContactKey, Ranking, Term FROM MatchingContactCompanies
) x
GROUP BY x.ContactKey
HAVING COUNT(DISTINCT x.Term) = @numSearchTerms
)
SELECT
*
FROM ContactsWithAllTerms c
Update
Per the comments, here's an example of my SearchContacts function. It's just a simple wrapper function because I was using it in multiple procedures.
CREATE FUNCTION [dbo].[SearchContacts]
(
@contactsKeyword nvarchar(4000)
)
RETURNS @returntable TABLE
(
[KEY] int,
[RANK] int
)
AS
BEGIN
INSERT @returntable
SELECT [KEY],[RANK] FROM CONTAINSTABLE(dbo.Contact, ([FullName],[LastName],[FirstName]), @contactsKeyword)
RETURN
END
GO
GO
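The SearchCompanies function used in the query above is not shown in the answer. A minimal sketch of what it might look like, mirroring SearchContacts: the parameter name and the decision to expose the full-text key as CompanyKey are assumptions, chosen only so that the `sc.CompanyKey` reference in the query resolves.

```sql
-- Hypothetical counterpart to SearchContacts; not the author's actual function.
CREATE FUNCTION [dbo].[SearchCompanies]
(
    @companiesKeyword nvarchar(4000)
)
RETURNS @returntable TABLE
(
    [CompanyKey] int,  -- full-text KEY of the Company row, renamed for the join
    [RANK] int
)
AS
BEGIN
    INSERT @returntable
    SELECT [KEY], [RANK]
    FROM CONTAINSTABLE(dbo.Company, ([CompanyName]), @companiesKeyword)
    RETURN
END
GO
```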
I have two tables; one has foreign keys to the other. I want to delete duplicates from Table 1 while updating the keys in Table 2 at the same time. That is: find the duplicates in Table 1, keep one key from each set of duplicates, and replace every reference to the other duplicate keys in Table 2 with the key I'm keeping. Soundex would be the best option because not all the names are spelled right in Table 1. I have the basic algorithm but I'm not sure how to implement it. Help?
So far this is what I have:
declare @Duplicate int
declare @OriginalKey int
create table #tempTable1
(
CourseID int, -- <--- The key I want to keep or delete
SchoolID int,
CourseName nvarchar(100),
Category nvarchar(100),
IsReqThisYear bit,
yearrequired int
);
create table #tempTable2
(
CertID int,
UserID int,
CourseID int, -- <--- Must stay updated with Table 1
SchoolID int,
StartDateOfCourse datetime,
EndDateOfCourse datetime,
Type nvarchar(100),
HrsOfClass float,
Category nvarchar(100),
Cost money,
PassFail varchar(20),
Comments nvarchar(1024),
ExpiryDate datetime,
Instructor nvarchar(200),
Level nchar(10)
)
--Deletes records from Table 1 not used in Table 2--
delete from Table1
where CourseID not in (select CourseID from Table2 where CourseID is not null)
insert into #tempTable1(CourseID, SchoolID, CourseName, Category, IsReqThisYear, yearrequired)
select CourseID, SchoolID, CourseName, Category, IsReqThisYear, yearrequired from Table1
insert into #tempTable2(CertID, UserID, CourseID, SchoolID, StartDateOfCourse, EndDateOfCourse, Type, HrsOfClass,Category, Cost, PassFail, Comments, ExpiryDate, Instructor, Level)
select CertID, UserID, CourseID, SchoolID, StartDateOfCourse, EndDateOfCourse, Type, HrsOfClass,Category, Cost, PassFail, Comments, ExpiryDate, Instructor, Level from Table2
select cour.CourseName, Count(cour.CourseName) cnt from Table1 as cour
join #tempTable1 as temp on cour.CourseID = temp.CourseID
where SOUNDEX(temp.CourseName) = SOUNDEX(cour.CourseName) -- <---
The last part does not exactly work, gives me an error
Error: Column 'Table1.CourseName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
UPDATE: Some of the names in CourseName contain numbers too, some as Roman numerals and some as digits. I need to find those as well, but Soundex ignores numbers.
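The error itself comes from selecting CourseName next to COUNT without a GROUP BY. For the wider goal, one possible approach is to build a mapping from every CourseID to a single surviving key, repoint Table2, and only then delete from Table1. The following is a rough sketch, not a tested solution: the #CourseMap name is made up, keeping the lowest CourseID is an arbitrary choice, and partitioning by SchoolID assumes the same course name at two schools is not a duplicate. Because Soundex ignores numbers, courses such as "Algebra 1" and "Algebra 2" would be merged, so the mapping should be reviewed by hand before the update runs.

```sql
-- Map every CourseID to the lowest CourseID with the same school and SOUNDEX code.
SELECT
    CourseID,
    MIN(CourseID) OVER (PARTITION BY SchoolID, SOUNDEX(CourseName)) AS KeepID
INTO #CourseMap
FROM Table1;

BEGIN TRANSACTION;

-- Repoint Table2 rows from duplicate keys to the surviving key.
UPDATE t2
SET t2.CourseID = m.KeepID
FROM Table2 AS t2
JOIN #CourseMap AS m ON m.CourseID = t2.CourseID
WHERE m.CourseID <> m.KeepID;

-- Delete the duplicates, which are no longer referenced.
DELETE t1
FROM Table1 AS t1
JOIN #CourseMap AS m ON m.CourseID = t1.CourseID
WHERE m.CourseID <> m.KeepID;

COMMIT TRANSACTION;

DROP TABLE #CourseMap;
```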
I am moving an old project that used single table inheritance into a new database, which is more structured. How would I write a SQL script to port this?
Old structure
I've simplified the SQL for legibility.
CREATE TABLE customers (
id int(11),
...
firstName varchar(50),
surname varchar(50),
address1 varchar(50),
address2 varchar(50),
town varchar(50),
county varchar(50),
postcode varchar(50),
country varchar(50),
delAddress1 varchar(50),
delAddress2 varchar(50),
delTown varchar(50),
delCounty varchar(50),
delPostcode varchar(50),
delCountry varchar(50),
tel varchar(50),
mobile varchar(50),
workTel varchar(50),
);
New structure
CREATE TABLE users (
id int(11),
firstName varchar(50),
surname varchar(50),
...
);
CREATE TABLE addresses (
id int(11),
ForeignKey(user),
street1 varchar(50),
street2 varchar(50),
town varchar(50),
county varchar(50),
postcode varchar(50),
country varchar(50),
type ...,
);
CREATE TABLE phone_numbers (
id int(11),
ForeignKey(user),
number varchar(50),
type ...,
);
Using cross-database notation for the table references where needed:
INSERT INTO Users(id, firstname, surname, ...)
SELECT id, firstname, surname, ...
FROM Customers;
INSERT INTO Addresses(id, street1, street2, ...)
SELECT id, address1, address2, ...
FROM Customers;
INSERT INTO Phone_Numbers(id, number, type, ...)
SELECT id, tel, 'tel', ... -- 'tel' is only a placeholder tag for the elided type column
FROM Customers;
If you want both the new and the old address (del* version), then repeat the address operation on the two sets of source columns with appropriate tagging. Similarly, for the three phone numbers, repeat the phone number operation. Or use a UNION in each case.
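The UNION variant for the two address sets might look like the sketch below. The question elides the foreign key column and the type column, so user_id and the 'billing'/'delivery' tags are placeholders chosen for illustration, and addresses.id is assumed to be generated automatically.

```sql
-- Sketch: copy both address sets in one statement, tagging each row's origin.
-- user_id, 'billing' and 'delivery' are hypothetical; the question elides them.
INSERT INTO addresses (user_id, street1, street2, town, county, postcode, country, type)
SELECT id, address1, address2, town, county, postcode, country, 'billing'
FROM customers
UNION ALL
SELECT id, delAddress1, delAddress2, delTown, delCounty, delPostcode, delCountry, 'delivery'
FROM customers
WHERE delAddress1 IS NOT NULL AND delAddress1 <> '';  -- skip customers without a delivery address
```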
First make sure to back up your existing data!
The process is different depending on whether you are going to use the original id field or generate a new one.
Assuming you are going to use the original, make sure that you have the ability to insert id fields into the table before you start (the SQL Server equivalent, if you are autogenerating the number, is SET IDENTITY_INSERT <table> ON; not sure what MySQL would use). Write an insert from the old table to the parent table:
insert into newparenttable (idfield, field1, field2)
select idfield, field1, field2 from oldparenttable
Then write similar inserts for all the child tables, depending on what fields you need. Where you have multiple phone numbers in different fields, for instance, you would use a UNION ALL statement as your insert select.
Insert newphone (phonenumber, userid, phonetype)
select home_phone, id, 100 from oldparenttable
union all
select work_phone, id, 101 from oldparenttable
Union all
select cell_phone, id, 102 from oldparenttable
If you are going to have a new id generated, then create the table with a field for the old id. You can drop this at the end (although I'd keep it for about six months). Then you can join from the new parent table to the old parent table on the old id and grab the new id from the new parent table when you do your inserts to child tables. Something like:
Insert newphone (phonenumber, userid, phonetype)
select home_phone, n.id, 100 from oldparenttable o
join newparenttable n on n.oldid = o.id
union all
select work_phone, n.id, 101 from oldparenttable o
join newparenttable n on n.oldid = o.id
Union all
select cell_phone, n.id, 102 from oldparenttable o
join newparenttable n on n.oldid = o.id
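The parent-table side of that approach is not shown above. Applied to the tables from the question, it could look roughly like this, where old_id is a hypothetical column name and users.id is assumed to be generated automatically:

```sql
-- Sketch: carry the old key along so child rows can look up the new id.
ALTER TABLE users ADD COLUMN old_id INT NULL;

INSERT INTO users (firstName, surname, old_id)
SELECT firstName, surname, id
FROM customers;

-- Once every child table is populated and verified, the helper column can go.
ALTER TABLE users DROP COLUMN old_id;
```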