SQL Query to search records in multiple tables - sql

I'm trying to implement a search feature. I need to search multiple tables in a SQL database for a text string. Currently, I'm only searching 3 tables:
Table Items:
[dbo].[Items]
(
[ItemID] INT IDENTITY (1, 1) NOT NULL,
[CategoryID] INT NOT NULL,
[BrandID] INT NOT NULL,
[ItemName] NVARCHAR(MAX) NOT NULL,
[ItemPrice] DECIMAL(18, 2) NOT NULL,
[imageUrl] NVARCHAR(MAX) NULL,
CONSTRAINT [PK_dbo.Items]
PRIMARY KEY CLUSTERED ([ItemID] ASC),
CONSTRAINT [FK_dbo.Items_dbo.Brands_BrandID]
FOREIGN KEY ([BrandID]) REFERENCES [dbo].[Brands] ([BrandID]),
CONSTRAINT [FK_dbo.Items_dbo.Categories_CategoryID]
FOREIGN KEY ([CategoryID]) REFERENCES [dbo].[Categories] ([CategoryID])
)
Table Categories:
[dbo].[Categories]
(
[CategoryID] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (MAX) NULL,
CONSTRAINT [PK_dbo.Categories]
PRIMARY KEY CLUSTERED ([CategoryID] ASC)
)
Table Brands:
[dbo].[Brands]
(
[BrandID] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (MAX) NULL,
CONSTRAINT [PK_dbo.Brands]
PRIMARY KEY CLUSTERED ([BrandID] ASC)
)
Any records that contain the supplied text string must be returned. I'm new to SQL. This is my implementation:
SELECT *
FROM Items
WHERE ItemName LIKE 'cocacola'
SELECT *
FROM Categories
WHERE Name LIKE 'cocacola'
SELECT *
FROM Brands
WHERE Name LIKE 'cocacola'
which is obviously incorrect. Can someone please guide me?
Thanks.

If you want a substring search, it might be slow depending on how much data you have.
If you are able to pre-specify the tables, and want a single search that searches all and returns matches across all tables, you will want something like this:
SELECT
'Items' as table_name,
ItemID AS record_id,
ItemName AS found
FROM
Items
WHERE
ItemName LIKE '%cocacola%'
UNION
SELECT
'Categories' as table_name,
CategoryID AS record_id,
Name AS found
FROM
Categories
WHERE
Name LIKE '%cocacola%'
UNION
SELECT
'Brands' as table_name,
BrandID AS record_id,
Name AS found
FROM
Brands
WHERE
Name LIKE '%cocacola%'
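The UNION appends the results of one query to another and also removes duplicate rows; use UNION ALL if you want to skip that de-duplication step.
A pattern with a leading wildcard such as '%cocacola%' cannot use an ordinary index seek, so the search will be slow if you have a lot of data.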
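As a rough, runnable sketch of the combined search, the snippet below uses Python's built-in sqlite3 module as a stand-in for SQL Server. The trimmed schema, the sample rows, and the helper name search_all are assumptions made for illustration only. It binds the search term as a parameter rather than concatenating it into the SQL text.

```python
import sqlite3

# SQLite stands in for SQL Server here; the schema is trimmed to the
# columns the search needs, and the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Brands     (BrandID    INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Items (
        ItemID     INTEGER PRIMARY KEY,
        CategoryID INTEGER NOT NULL REFERENCES Categories (CategoryID),
        BrandID    INTEGER NOT NULL REFERENCES Brands (BrandID),
        ItemName   TEXT NOT NULL
    );
    INSERT INTO Categories VALUES (1, 'soft drinks'), (2, 'snacks');
    INSERT INTO Brands     VALUES (1, 'cocacola'), (2, 'acme');
    INSERT INTO Items      VALUES (1, 1, 1, 'cocacola zero 330ml'),
                                  (2, 2, 2, 'salted crisps');
""")

# One statement, three branches; every branch returns the same three columns.
SEARCH_SQL = """
    SELECT 'Items' AS table_name, ItemID AS record_id, ItemName AS found
    FROM Items WHERE ItemName LIKE :pattern
    UNION ALL
    SELECT 'Categories', CategoryID, Name
    FROM Categories WHERE Name LIKE :pattern
    UNION ALL
    SELECT 'Brands', BrandID, Name
    FROM Brands WHERE Name LIKE :pattern
"""

def search_all(term):
    """Return (table_name, record_id, found) for every substring match."""
    # The wildcards are added to the bound value, not to the SQL text.
    return conn.execute(SEARCH_SQL, {"pattern": f"%{term}%"}).fetchall()

results = search_all("cocacola")
for row in sorted(results):
    print(row)
```

The table_name column makes each record_id unambiguous, since an ItemID and a BrandID can share the same number.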

Your solution is not incorrect. You run three queries, each against a different table. Depending on your use case, this is probably fine.
You can join the tables if you want to search all tables with only one query. This is probably slower than running three queries because the database has to match the values together.
SELECT *
FROM Items
FULL OUTER JOIN Categories ON Categories.CategoryID = Items.CategoryID
FULL OUTER JOIN Brands ON Brands.BrandID = Items.BrandID
WHERE Items.ItemName LIKE 'cocacola'
OR Categories.Name LIKE 'cocacola'
OR Brands.Name LIKE 'cocacola'
If you get a hit in the category name with this query, the category will be listed for every item that's associated with this category.

It sounds like you might want to try using a union to join together the results of all three queries.
For example:
SELECT ItemID, ItemName
FROM Items
WHERE ItemName = 'cocacola'
UNION
SELECT CategoryID, Name
FROM Categories
WHERE Name = 'cocacola'
UNION
SELECT BrandID, Name
FROM Brands
WHERE Name = 'cocacola'
One note about UNION is that you have to make sure that each part of the query returns the same number of columns, with compatible data types, in the same order.

Related

Redshift create list and search different table with it

I think there are a few ways to tackle this, but I'm not sure how to do any of them.
I have two tables, the first has ID's and Numbers. The ID's and numbers can potentially be listed more than once, so I create a result table that lists the unique numbers grouped by ID.
My second table has rows (100 million) with the ID and Numbers again. I need to search that table for any ID that has a Number not in the list of Numbers from the result table.
Can redshift do a query based on if the ID matches and the Number exists in the list from the table? Can this all be done in memory/one statement?
DROP TABLE IF EXISTS `myTable`;
CREATE TABLE `myTable` (
`id` mediumint(8) unsigned NOT NULL auto_increment,
`ID` varchar(255),
`Numbers` mediumint default NULL,
PRIMARY KEY (`id`)
) AUTO_INCREMENT=1;
INSERT INTO `myTable` (`ID`,`Numbers`)
VALUES
("CRQ44MPX1SZ",1890),
("UHO21QQY3TW",4370),
("JTQ62CBP6ER",1825),
("RFD95MLC2MI",5014),
("URZ04HGG2YQ",2859),
("CRQ44MPX1SZ",1891),
("UHO21QQY3TW",4371),
("JTQ62CBP6ER",1826),
("RFD95MLC2MI",5015),
("URZ04HGG2YQ",2860),
("CRQ44MPX1SZ",1892),
("UHO21QQY3TW",4372),
("JTQ62CBP6ER",1827),
("RFD95MLC2MI",5016),
("URZ04HGG2YQ",2861);
SELECT ID, listagg(distinct Numbers,',') as Number_List, count(Numbers) as Numbers_Count
FROM myTable
GROUP BY ID
AS result
DROP TABLE IF EXISTS `myTable2`;
CREATE TABLE `myTable2` (
`id` mediumint(8) unsigned NOT NULL auto_increment,
`ID` varchar(255),
`Numbers` mediumint default NULL,
PRIMARY KEY (`id`)
) AUTO_INCREMENT=1;
INSERT INTO `myTable2` (`ID`,`Numbers`)
VALUES
("CRQ44MPX1SZ",1870),
("UHO21QQY3TW",4350),
("JTQ62CBP6ER",1825),
("RFD95MLC2MI",5014),
("URZ04HGG2YQ",2859),
("CRQ44MPX1SZ",1891),
("UHO21QQY3TW",4371),
("JTQ62CBP6ER",1826),
("RFD95MLC2MI",5015),
("URZ04HGG2YQ",2860),
("CRQ44MPX1SZ",1882),
("UHO21QQY3TW",4372),
("JTQ62CBP6ER",1827),
("RFD95MLC2MI",5016),
("URZ04HGG2YQ",2861);
Pseudo Code
Select ID, listagg(distinct Numbers) as Violation
Where Numbers NOT IN result.Numbers_List
or possibly: WHERE Numbers NOT LIKE '%' || result.Numbers_List|| '%'
Desired Output
("CRQ44MPX1SZ", "1870,1882")
("UHO21QQY3TW", "4350")
EDIT
Going the JOIN route, I am not getting the right results...but I'm pretty sure my WHERE implementation is wrong.
SELECT mytable1.ID, listagg(distinct mytable2.Numbers, ',') as unauth_list, count(mytable2.Numbers) as unauth_count
FROM mytable1
LEFT JOIN mytable2 on mytable1.id = mytable2.id
WHERE (mytable1.id = mytable2.id)
AND (mytable1.Numbers <> mytable2.Numbers)
GROUP BY mytable1.id
Expected output:
("CRQ44MPX1SZ", "1870,1882", 2)
("UHO21QQY3TW", "4350", 1)
Just left join the two tables on ID and numbers and check for (where clause) to see if the match wasn't found. Shouldn't be a need for listagg() and complex comparing. Or did I miss part of the question?
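A minimal sketch of that anti-join idea, run against SQLite through Python's sqlite3 module rather than Redshift (an assumption made so the example is runnable anywhere). It loads only the two IDs from the question's sample data that have violations, and the grouping into lists is done in Python here; in Redshift the same grouping could presumably be done with LISTAGG around the anti-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable  (ID TEXT, Numbers INTEGER);
    CREATE TABLE myTable2 (ID TEXT, Numbers INTEGER);
    -- Subset of the question's sample rows (two of the five IDs).
    INSERT INTO myTable VALUES
        ('CRQ44MPX1SZ', 1890), ('CRQ44MPX1SZ', 1891), ('CRQ44MPX1SZ', 1892),
        ('UHO21QQY3TW', 4370), ('UHO21QQY3TW', 4371), ('UHO21QQY3TW', 4372);
    INSERT INTO myTable2 VALUES
        ('CRQ44MPX1SZ', 1870), ('CRQ44MPX1SZ', 1891), ('CRQ44MPX1SZ', 1882),
        ('UHO21QQY3TW', 4350), ('UHO21QQY3TW', 4371), ('UHO21QQY3TW', 4372);
""")

# Join on BOTH columns, then keep only the myTable2 rows with no partner.
rows = conn.execute("""
    SELECT t2.ID, t2.Numbers
    FROM myTable2 AS t2
    LEFT JOIN myTable AS t1
           ON t1.ID = t2.ID
          AND t1.Numbers = t2.Numbers
    WHERE t1.ID IS NULL
    ORDER BY t2.ID, t2.Numbers
""").fetchall()

# Collect the unmatched numbers per ID.
violations = {}
for id_, number in rows:
    violations.setdefault(id_, []).append(number)

print(violations)
```

The difference from the attempt in the question's EDIT is that the Numbers comparison sits in the ON clause as an equality, and the WHERE clause only tests for the missing match.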

Group by count multiple tables

Need to find out why my group by count query is not working. I am using Microsoft SQL Server and there are 2 tables I am trying to join.
My query needs to bring up the number of transactions made for each type of vehicle. The output of the query needs to have a separate row for each type of vehicle such as ute, hatch, sedan, etc.
CREATE TABLE vehicle
(
vid INT PRIMARY KEY,
type VARCHAR(30) NOT NULL,
year SMALLINT NOT NULL,
price DECIMAL(10, 2) NOT NULL,
);
INSERT INTO vehicle
VALUES (1, 'Sedan', 2020, 240)
CREATE TABLE purchase
(
pid INT PRIMARY KEY,
vid INT REFERENCES vehicle(vid),
pdate DATE NOT NULL,
datepickup DATE NOT NULL,
datereturn DATE NOT NULL,
);
INSERT INTO purchase
VALUES (1, 1, '2020-07-12', '2020-08-21', '2020-08-23')
I have about 10 rows of information in each table; I just haven't written them out.
This is what I wrote but it doesn't return the correct number of transactions for each type of car.
SELECT
vehicle.vid,
COUNT(purchase.pid) AS NumberOfTransactions
FROM
purchase
JOIN
vehicle ON vehicle.vid = purchase.pid
GROUP BY
vehicle.type;
Any help would be appreciated. Thanks.
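Your GROUP BY and SELECT columns are inconsistent: you select vehicle.vid but group by vehicle.type. The join condition also compares vehicle.vid with purchase.pid, which is the purchase's own key, rather than with the foreign key purchase.vid. You should write the query like this:
SELECT v.type, COUNT(*) AS NumPurchases
FROM purchase p JOIN
vehicle v
ON v.vid = p.vid
GROUP BY v.type;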
Note the use of table aliases so the query is easier to write and read.
If this doesn't produce the expected values, you will need to provide sample data and desired results to make it clear what the data really looks like and what you expect.
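To see how much the join column matters, here is a small sketch using Python's sqlite3 module in place of SQL Server. The sample rows are invented, and it assumes purchase.vid is the column that links to vehicle.vid, as the REFERENCES clause in the question suggests. It runs the count once joined on the foreign key and once joined on purchase.pid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle  (vid INTEGER PRIMARY KEY, type TEXT NOT NULL);
    CREATE TABLE purchase (pid INTEGER PRIMARY KEY,
                           vid INTEGER REFERENCES vehicle (vid));
    -- Invented sample data: two sedans, one ute, one hatch that was never bought.
    INSERT INTO vehicle  VALUES (1, 'Sedan'), (2, 'Ute'), (3, 'Sedan'), (4, 'Hatch');
    INSERT INTO purchase VALUES (1, 1), (2, 3), (3, 2), (4, 1);
""")

COUNT_SQL = """
    SELECT v.type, COUNT(*) AS NumPurchases
    FROM purchase AS p
    JOIN vehicle AS v ON v.vid = p.{join_column}
    GROUP BY v.type
    ORDER BY v.type
"""

# Joined on the foreign key: each purchase is matched to the vehicle it refers to.
counts_by_vid = conn.execute(COUNT_SQL.format(join_column="vid")).fetchall()

# Joined on the purchase's own key: rows pair up by coincidence of numbering.
counts_by_pid = conn.execute(COUNT_SQL.format(join_column="pid")).fetchall()

print(counts_by_vid)  # [('Sedan', 3), ('Ute', 1)]
print(counts_by_pid)  # [('Hatch', 1), ('Sedan', 2), ('Ute', 1)]
```

With an inner join, a type that has no purchases (Hatch here) simply does not appear; a LEFT JOIN from vehicle with COUNT(p.pid) would be one way to list it with a count of 0.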

Ambiguous column name SQL

I get the following error when I want to execute a SQL query:
"Msg 209, Level 16, State 1, Line 9
Ambiguous column name 'i_id'."
This is the SQL query I want to execute:
SELECT DISTINCT x.*
FROM items x LEFT JOIN items y
ON y.i_id = x.i_id
AND x.last_seen < y.last_seen
WHERE x.last_seen > '4-4-2017 10:54:11'
AND x.spot = 'spot773'
AND (x.technology = 'Bluetooth LE' OR x.technology = 'EPC Gen2')
AND y.id IS NULL
GROUP BY i_id
This is what my table looks like:
CREATE TABLE [dbo].[items] (
[id] INT IDENTITY (1, 1) NOT NULL,
[i_id] VARCHAR (100) NOT NULL,
[last_seen] DATETIME2 (0) NOT NULL,
[location] VARCHAR (200) NOT NULL,
[code_hex] VARCHAR (100) NOT NULL,
[technology] VARCHAR (100) NOT NULL,
[url] VARCHAR (100) NOT NULL,
[spot] VARCHAR (200) NOT NULL,
PRIMARY KEY CLUSTERED ([id] ASC));
I've tried a couple of things but I'm not an SQL expert:)
Any help would be appreciated
EDIT:
I do get duplicate rows when I remove the GROUP BY line as you can see:
I'm adding another answer in order to show how you'd typically select the latest record per group without getting duplicates. You'd use ROW_NUMBER for this, marking every last record per i_id with row number 1.
SELECT *
FROM
(
SELECT
i.*,
ROW_NUMBER() over (PARTITION BY i_id ORDER BY last_seen DESC) as rn
FROM items i
WHERE last_seen > '2017-04-04 10:54:11'
AND spot = 'spot773'
AND technology IN ('Bluetooth LE', 'EPC Gen2')
) ranked
WHERE rn = 1;
(You'd use RANK or DENSE_RANK instead of ROW_NUMBER if you wanted to keep ties, i.e. several rows per i_id that share the same latest last_seen.)
You forgot the table alias in GROUP BY i_id.
Anyway, why are you writing an anti join query where you are trying to get rid of duplicates with both DISTINCT and GROUP BY? Did you have issues with a straight-forward NOT EXISTS query? You are making things way more complicated than they actually are.
SELECT *
FROM items i
WHERE last_seen > '2017-04-04 10:54:11'
AND spot = 'spot773'
AND technology IN ('Bluetooth LE', 'EPC Gen2')
AND NOT EXISTS
(
SELECT *
FROM items other
WHERE i.i_id = other.i_id
AND i.last_seen < other.last_seen
);
(There are other techniques of course to get the last seen record per i_id. This is one; another is to compare with MAX(last_seen); another is to use ROW_NUMBER.)
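As a quick check that the NOT EXISTS form and the MAX(last_seen) form agree, here is a sketch using Python's sqlite3 module instead of SQL Server. The table is cut down to three columns and the rows are invented; timestamps are stored as ISO-8601 text, which compares correctly as plain strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id        INTEGER PRIMARY KEY,
        i_id      TEXT NOT NULL,
        last_seen TEXT NOT NULL   -- ISO-8601 text, so string order = time order
    );
    INSERT INTO items (i_id, last_seen) VALUES
        ('tag-a', '2017-04-04 11:00:00'),
        ('tag-a', '2017-04-04 12:00:00'),
        ('tag-b', '2017-04-04 11:30:00');
""")

# Technique 1: keep a row only if no later row exists for the same i_id.
via_not_exists = conn.execute("""
    SELECT i.i_id, i.last_seen
    FROM items AS i
    WHERE NOT EXISTS (
        SELECT 1
        FROM items AS other
        WHERE other.i_id = i.i_id
          AND other.last_seen > i.last_seen
    )
    ORDER BY i.i_id
""").fetchall()

# Technique 2: compute MAX(last_seen) per i_id and join back to the rows.
via_max = conn.execute("""
    SELECT i.i_id, i.last_seen
    FROM items AS i
    JOIN (
        SELECT i_id, MAX(last_seen) AS max_seen
        FROM items
        GROUP BY i_id
    ) AS latest
      ON latest.i_id = i.i_id
     AND latest.max_seen = i.last_seen
    ORDER BY i.i_id
""").fetchall()

print(via_not_exists)
print(via_max)
```

Both forms return every row tied for the latest last_seen within an i_id, so they behave like RANK rather than ROW_NUMBER when ties exist.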

How to join two tables together and return all rows from both tables, and to merge some of their columns into a single column

I'm working with SQL Server 2012 and wish to query the following:
I've got 2 tables with mostly different columns (one table has 10 columns, the other has 6).
However, they both contain a column with an ID number and another column for the category name.
The ID numbers may overlap between the tables (e.g. one table may have 200 distinct IDs and the other 900, but only 120 of the IDs are in both).
The category names are different and unique for each table.
Now I wish to have a single table that will include all the rows of both tables, with a single ID column and a single Category_name column (total of 14 columns).
So in case the same ID has 3 records in table 1 and another 5 records in table 2 I wish to have all 8 records (8 rows)
The complex thing here I believe is to have a single "Category_name" column.
I tried the following but when there is no null in both of the tables I'm getting only one record instead of both:
SELECT isnull(t1.id, t2.id) AS [id]
,isnull(t1.[category], t2.[category_name]) AS [category name]
FROM t1
FULL JOIN t2
ON t1.id = t2.id;
Any suggestions on the correct way to have it done?
Make your FULL JOIN ON 1=0
This will prevent rows from combining and ensure that you always get 1 copy of each row from each table.
Further explanation:
A FULL JOIN gets rows from both tables, whether they have a match or not, but when they do match, it combines them on one row.
You wanted a full join where you never combine the rows, because you wanted every row in both tables to appear one time, no matter what. 1 can never equal 0, so doing a FULL JOIN on 1=0 will give you a full join where none of the rows match each other.
And of course you're already doing the ISNULL to make sure the ID and Name columns always have a value.
SELECT ID, Category_name, (then the other 8 columns), NULL, NULL, NULL, NULL
FROM t1
UNION ALL
SELECT ID, Category_name, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, (then the other 4 columns)
FROM t2
This demonstrates how you can use a UNION ALL to combine the row sets from two tables, TableA and TableB, and insert the set into TableC.
Create two source tables with some data:
CREATE TABLE dbo.TableA
(
id int NOT NULL,
category_name nvarchar(50) NOT NULL,
other_a nvarchar(20) NOT NULL
);
CREATE TABLE dbo.TableB
(
id int NOT NULL,
category_name nvarchar(50) NOT NULL,
other_b nvarchar(20) NOT NULL
);
INSERT INTO dbo.TableA (id, category_name, other_a)
VALUES (1, N'Alpha', N'ppp'),
(2, N'Bravo', N'qqq'),
(3, N'Charlie', N'rrr');
INSERT INTO dbo.TableB (id, category_name, other_b)
VALUES (4, N'Delta', N'sss'),
(5, N'Echo', N'ttt'),
(6, N'Foxtrot', N'uuu');
Create TableC to receive the result set. Note that columns other_a and other_b allow null values.
CREATE TABLE dbo.TableC
(
id int NOT NULL,
category_name nvarchar(50) NOT NULL,
other_a nvarchar(20) NULL,
other_b nvarchar(20) NULL
);
Insert the combined set of rows into TableC:
INSERT INTO dbo.TableC (id, category_name, other_a, other_b)
SELECT id, category_name, other_a, NULL AS 'other_b'
FROM dbo.TableA
UNION ALL
SELECT id, category_name, NULL, other_b
FROM dbo.TableB;
Display the results:
SELECT *
FROM dbo.TableC;

SQL Server 2005 query optimization with Max subquery

I've got a table that looks like this (I wasn't sure what all might be relevant, so I had Toad dump the whole structure)
CREATE TABLE [dbo].[TScore] (
[CustomerID] int NOT NULL,
[ApplNo] numeric(18, 0) NOT NULL,
[BScore] int NULL,
[OrigAmt] money NULL,
[MaxAmt] money NULL,
[DateCreated] datetime NULL,
[UserCreated] char(8) NULL,
[DateModified] datetime NULL,
[UserModified] char(8) NULL,
CONSTRAINT [PK_TScore]
PRIMARY KEY CLUSTERED ([CustomerID] ASC, [ApplNo] ASC)
);
When I run the following query (on a database with 3 million records in the TScore table), it takes about a second, even though Select BScore from CustomerDB..TScore WHERE CustomerID = 12345 on its own is instant (and only returns 10 records). It seems like there should be some efficient way to get the Max(ApplNo) effect in a single query, but I'm a relative noob to SQL Server. I'm thinking I may need a separate key for ApplNo, but I'm not sure how clustered keys work.
SELECT BScore
FROM CustomerDB..TScore (NOLOCK)
WHERE ApplNo = (SELECT Max(ApplNo)
FROM CustomerDB..TScore sc2 (NOLOCK)
WHERE sc2.CustomerID = 12345)
Thanks much for any tips (pointers on where to look for optimization of sql server stuff appreciated as well)
When you filter by ApplNo alone, you are using only part of the key, and not its leftmost column. This means the index has to be scanned (look at all rows) rather than seeked (drill to a row) to find the values.
If you are looking for ApplNo values for the same CustomerID:
Quick way. Use the full clustered index:
SELECT BScore
FROM CustomerDB..TScore
WHERE ApplNo = (SELECT Max(ApplNo)
FROM CustomerDB..TScore sc2
WHERE sc2.CustomerID = 12345)
AND CustomerID = 12345
This can be changed into a JOIN
SELECT BScore
FROM
CustomerDB..TScore T1
JOIN
(SELECT Max(ApplNo) AS MaxApplNo, CustomerID
FROM CustomerDB..TScore sc2
WHERE sc2.CustomerID = 12345
GROUP BY CustomerID
) T2 ON T1.CustomerID = T2.CustomerID AND T1.ApplNo= T2.MaxApplNo
If you are looking for ApplNo values independent of CustomerID, then I'd look at a separate index. This matches the intent of your current code.
CREATE INDEX IX_ApplNo ON TScore (ApplNo) INCLUDE (BScore);
Reversing the key order won't help because then your WHERE sc2.CustomerID = 12345 will scan, not seek
Note: using NOLOCK everywhere is a bad practice
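The point about using only the right-hand part of a composite key can be illustrated with a loose analogy in SQLite, via Python's sqlite3 module. This is SQLite's planner, not SQL Server's, so treat it only as a sketch of the general leftmost-column rule; the trimmed table, the generated rows, and the helper name plan are assumptions. SQLite reports SCAN where SQL Server would scan and SEARCH where SQL Server would seek.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TScore (
        CustomerID INTEGER NOT NULL,
        ApplNo     INTEGER NOT NULL,
        BScore     INTEGER,
        PRIMARY KEY (CustomerID, ApplNo)   -- same column order as the question
    );
""")
# Invented data: 50 customers with 10 applications each.
conn.executemany(
    "INSERT INTO TScore VALUES (?, ?, ?)",
    [(c, c * 100 + a, 600 + a) for c in range(1, 51) for a in range(10)],
)

def plan(sql):
    """Return the planner's description of how it would run the statement."""
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Only the second key column is constrained: the key cannot be used to seek.
partial_key_plan = plan("SELECT BScore FROM TScore WHERE ApplNo = 1209")

# Both key columns are constrained: the planner can go straight to the row.
full_key_plan = plan(
    "SELECT BScore FROM TScore WHERE CustomerID = 12 AND ApplNo = 1209"
)

print(partial_key_plan)
print(full_key_plan)
```

On SQL Server itself, comparing the actual execution plans of the original and rewritten queries would be the way to confirm the same effect.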