MySQL selecting rows from multiple tables - sql

I'm working on a catalog site where users can browse categories. Categories can contain other categories and products, and products can belong to more than one category. The relevant database schema looks something like this:
CREATE TABLE products (
product_id INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
product_title VARCHAR(100) NOT NULL,
product_status TINYINT UNSIGNED NOT NULL
);
CREATE TABLE product_categories (
category_id INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
parent_category_id INT UNSIGNED NOT NULL,
category_title VARCHAR(100) NOT NULL,
category_status TINYINT UNSIGNED NOT NULL,
category_order INT UNSIGNED NOT NULL
);
CREATE TABLE products_categories (
product_id INT UNSIGNED NOT NULL,
category_id INT UNSIGNED NOT NULL,
product_order INT UNSIGNED NOT NULL,
PRIMARY KEY(product_id, category_id)
);
The issue I'm having is that I need to paginate the results using LIMIT offset, count:
$perpage = 20;
$start = (isset($_GET['page'])) ? (int)$_GET['page'] * $perpage : 0; // offset is zero-based
$limitsql = "LIMIT $start, $perpage";
But I can't figure out how to select both distinct categories and products without joining and merging the results. Ideally I would like results like this:
product_id | product_title | category_id | category_title
NULL | NULL | 32 | category foo
NULL | NULL | 239 | category bar
9391 | product foo | NULL | NULL
325 | product bar | NULL | NULL
The best I've been able to do is get something like this, which doesn't really help:
product_id | product_title | category_id | category_title
9391 | product foo | 32 | category foo
325 | product bar | 239 | category bar
239 | product foo2 | 32 | category foo
115 | product bar2 | 239 | category bar
The only other solutions that I can think of would be to query all subcategories and products within the category, stick them in a php array and extract the current page with array_slice. Considering the volume of products (several thousand) this isn't a very appealing option.
Otherwise I could query the number of categories, and offset the $start in the LIMIT clause by the number of categories. This gets messy, though, if there is more than a full page of categories.
Here is my current working query which gives me the results above:
SELECT
p.product_id, p.product_title,
c.category_id, c.category_title
FROM products AS p
JOIN product_categories AS c
ON c.parent_category_id='20'
INNER JOIN products_categories AS pc
ON p.product_id=pc.product_id
WHERE p.product_status='1' AND pc.category_id='20'
ORDER BY pc.product_order ASC
Edit
I think I've got it working with UNION, which I completely forgot about:
SELECT
c.category_id AS row_id, c.category_title AS row_title, 1 AS is_category
FROM product_categories AS c
WHERE c.parent_category_id='20'
UNION
SELECT
p.product_id AS row_id, p.product_title AS row_title, 0 AS is_category
FROM products AS p
INNER JOIN products_categories AS pc
ON p.product_id=pc.product_id
Edit 2
I guess UNION isn't going to work as I thought. Since both are treated as separate queries, I can't apply LIMIT to the entire result, only to each individual SELECT. Also, it seems each column selected in one statement must have the same type as the corresponding column in the other statement.

Use:
SELECT *
FROM (SELECT c.category_id AS row_id, c.category_title AS row_title, 1 AS is_category
FROM product_categories AS c
WHERE c.parent_category_id='20'
UNION
SELECT p.product_id AS row_id, p.product_title AS row_title, 0 AS is_category
FROM products AS p
JOIN products_categories AS pc ON p.product_id=pc.product_id) x
ORDER BY is_category DESC
LIMIT $start, $perpage
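
As a rough, runnable sketch of the derived-table approach above, the following Python script uses the standard-library sqlite3 module as a stand-in for MySQL. The sample rows, the sort_key column, the fetch_page helper, and the filters on category 20 and product_status are assumptions added for illustration, not part of the original answer. SQLite's LIMIT ... OFFSET form is used in place of MySQL's LIMIT offset, count.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; tables are trimmed versions of the question's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_title TEXT, product_status INTEGER);
CREATE TABLE product_categories (category_id INTEGER PRIMARY KEY, parent_category_id INTEGER,
                                 category_title TEXT, category_order INTEGER);
CREATE TABLE products_categories (product_id INTEGER, category_id INTEGER, product_order INTEGER,
                                  PRIMARY KEY (product_id, category_id));
-- Made-up sample rows.
INSERT INTO product_categories VALUES (32, 20, 'category foo', 1), (239, 20, 'category bar', 2);
INSERT INTO products VALUES (9391, 'product foo', 1), (325, 'product bar', 1), (115, 'product baz', 1);
INSERT INTO products_categories VALUES (9391, 20, 1), (325, 20, 2), (115, 20, 3);
""")

# The UNION is wrapped in a derived table so ORDER BY and LIMIT apply to the combined result.
PAGE_SQL = """
SELECT row_id, row_title, is_category
FROM (SELECT c.category_id AS row_id, c.category_title AS row_title,
             1 AS is_category, c.category_order AS sort_key
      FROM product_categories AS c
      WHERE c.parent_category_id = :cat
      UNION
      SELECT p.product_id, p.product_title, 0, pc.product_order
      FROM products AS p
      JOIN products_categories AS pc ON p.product_id = pc.product_id
      WHERE pc.category_id = :cat AND p.product_status = 1) AS x
ORDER BY is_category DESC, sort_key
LIMIT :perpage OFFSET :start
"""

def fetch_page(page, perpage=2, cat=20):
    """Return one page of mixed category/product rows (page is zero-based)."""
    params = {"cat": cat, "perpage": perpage, "start": page * perpage}
    return conn.execute(PAGE_SQL, params).fetchall()

page0 = fetch_page(0)  # categories come first
page1 = fetch_page(1)  # then products, in product_order
page2 = fetch_page(2)
print(page0, page1, page2)
```

Sorting on is_category first keeps the page boundaries stable even when the categories fill more than one page, which is the case the question was worried about.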

Another way you could approach this would be changing your schema to make categories and products the same thing essentially.
CREATE TABLE items (
item_id INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
item_title VARCHAR(100) NOT NULL,
item_status TINYINT UNSIGNED NOT NULL,
category_or_item TINYINT UNSIGNED NOT NULL
);
CREATE TABLE items_parents (
item_id INT UNSIGNED NOT NULL,
parent_id INT UNSIGNED NOT NULL, # points to items.item_id
item_order INT UNSIGNED NOT NULL,
PRIMARY KEY(item_id, parent_id)
);
Your query then is flat and you can sort it by category_or_item so categories appear first.
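
A minimal sketch of what that flat query might look like, again using Python's sqlite3 module in place of MySQL. It assumes category_or_item = 1 marks a category and 0 marks a product (the answer does not say which way round the flag goes); the sample rows and the children helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_title TEXT NOT NULL,
                    item_status INTEGER NOT NULL, category_or_item INTEGER NOT NULL);
CREATE TABLE items_parents (item_id INTEGER NOT NULL, parent_id INTEGER NOT NULL,
                            item_order INTEGER NOT NULL, PRIMARY KEY (item_id, parent_id));
-- Assumption: category_or_item = 1 is a category, 0 is a product. Rows are made up.
INSERT INTO items VALUES (20, 'root category', 1, 1), (32, 'category foo', 1, 1),
                         (9391, 'product foo', 1, 0), (325, 'product bar', 1, 0);
INSERT INTO items_parents VALUES (32, 20, 1), (9391, 20, 1), (325, 20, 2);
""")

def children(parent_id, start, perpage):
    """One page of a parent's children, categories first, then products."""
    sql = """
    SELECT i.item_id, i.item_title, i.category_or_item
    FROM items AS i
    JOIN items_parents AS ip ON ip.item_id = i.item_id
    WHERE ip.parent_id = ? AND i.item_status = 1
    ORDER BY i.category_or_item DESC, ip.item_order
    LIMIT ? OFFSET ?
    """
    return conn.execute(sql, (parent_id, perpage, start)).fetchall()

first_page = children(20, 0, 2)
second_page = children(20, 2, 2)
print(first_page, second_page)
```

Because every child lives in one table, a single LIMIT covers categories and products alike and no UNION is needed.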

Related

SQL comparison report on cartesian product using subquery

I'm a student building a comparison report query in MySQL on a database that tracks customers, products, and purchases in separate tables. I have to create a report that shows how many products were sold every month for each province using a subquery. I was told to use a cross join between product and customer, however, my query runs into a problem when I try to group them as the records all collapse into each other and I don't understand why this is happening. I'm not sure if this is the correct way to approach this problem since my customer and product table don't have any values that intersect with each other except through the purchase table.
These are my create table scripts
CREATE TABLE `customer` (
`CustomerID` INT NOT NULL,
`City` VARCHAR(100) NOT NULL,
`Province` CHAR(2) NOT NULL,
PRIMARY KEY (`CustomerID`));
CREATE TABLE `product` (
`ProductID` INT NOT NULL,
`ProductName` VARCHAR(100) NOT NULL,
`Price` DECIMAL(5,2) NOT NULL,
PRIMARY KEY (`ProductID`));
CREATE TABLE `purchase` (
`PurchaseID` INT NOT NULL,
`PurchaseDate` DATE NOT NULL,
`customer_CustomerID` INT NOT NULL,
`product_ProductID` INT NOT NULL,
PRIMARY KEY (`PurchaseID`),
CONSTRAINT `fk_purchase_customer`
FOREIGN KEY (`customer_CustomerID`)
REFERENCES `customer` (`CustomerID`),
CONSTRAINT `fk_purchase_product`
FOREIGN KEY (`product_ProductID`)
REFERENCES `product` (`ProductID`));
This is the query that I have written as I have understood the instructions.
SELECT DISTINCT province, productName AS Product, JanTotalSales
FROM PRODUCT cross join CUSTOMER
LEFT JOIN
(
SELECT purchaseID, product_productID, customer_customerID, COUNT(purchaseDate) AS JanTotalSales
FROM PURCHASE
WHERE MONTH(purchaseDate) = 01
)JAN ON PRODUCT.productID = JAN.product_productID
GROUP BY province, productID;
I should be getting results like this
Province | Product | JanTotalSales | FebTotalSales | ... | TotalSales
QC | Paper | 1 | NULL | ... | 1
ON | Paper | 1 | 2 | ... | 3
AB | Paper | 1 | NULL | ... | 1
AB | Wire | 2 | 2 | ... | 4
ON | Wire | 2 | 1 | ... | 3
NULL | Kit | NULL | NULL | ... | NULL
SK | Gummy | 1 | 1 | ... | 2
NULL | Bag | NULL | NULL | ... | NULL
However, I receive results like this when I do it on the January subquery.
Province | Product | JanTotalSales
AB | Paper | NULL
AB | Wire | NULL
AB | Kit | NULL
AB | Kit | 13
ON | Paper | NULL
ON | Wire | NULL
ON | Kit | NULL
ON | Kit | 13
I appreciate whatever help you can give to show me where I'm going wrong. From what I understand it's something to do with how the grouping occurs but I can't figure out why.
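
One plausible reading of the result above, offered as a hedged sketch rather than a definitive answer: the JAN subquery calls COUNT without a GROUP BY, so it collapses to a single row holding the grand total, and that one row can match only one product. The join also compares only the product ID, so the same total repeats for every province. The Python script below uses sqlite3 as a stand-in for MySQL, with made-up sample rows, and groups the subquery by province and product before joining. SQLite's strftime replaces MySQL's MONTH function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (CustomerID INTEGER PRIMARY KEY, City TEXT, Province TEXT);
CREATE TABLE product (ProductID INTEGER PRIMARY KEY, ProductName TEXT, Price REAL);
CREATE TABLE purchase (PurchaseID INTEGER PRIMARY KEY, PurchaseDate TEXT,
                       customer_CustomerID INTEGER, product_ProductID INTEGER);
-- Made-up sample rows, not the asker's data.
INSERT INTO customer VALUES (1, 'Toronto', 'ON'), (2, 'Calgary', 'AB');
INSERT INTO product VALUES (1, 'Paper', 1.50), (2, 'Wire', 3.25);
INSERT INTO purchase VALUES (1, '2021-01-05', 1, 1), (2, '2021-01-09', 2, 2),
                            (3, '2021-01-20', 2, 2), (4, '2021-02-02', 1, 2);
""")

# Every (province, product) pair comes from the cross join; the grouped subquery
# supplies a January count only where purchases exist, so the rest stay NULL.
REPORT_SQL = """
SELECT pr.Province, p.ProductName, jan.JanTotalSales
FROM product AS p
CROSS JOIN (SELECT DISTINCT Province FROM customer) AS pr
LEFT JOIN (SELECT c.Province, pu.product_ProductID, COUNT(*) AS JanTotalSales
           FROM purchase AS pu
           JOIN customer AS c ON c.CustomerID = pu.customer_CustomerID
           WHERE CAST(strftime('%m', pu.PurchaseDate) AS INTEGER) = 1
           GROUP BY c.Province, pu.product_ProductID) AS jan
       ON jan.Province = pr.Province AND jan.product_ProductID = p.ProductID
ORDER BY pr.Province, p.ProductName
"""
report = conn.execute(REPORT_SQL).fetchall()
for row in report:
    print(row)
```

The other months would follow the same pattern, one grouped subquery per month joined on both province and product.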

Joining column for multiple tables

I am trying to extract two columns of the same data type from two different tables using one query. NOTE: The length of the account attribute differs between the two tables. A plain UNION can't work here because the two tables (in reality) have different numbers of columns.
CREATE TABLE IF NOT EXISTS `mydb`.`TABLE_A` (
`ID_TABLE_A` INT NOT NULL AUTO_INCREMENT,
`ACCOUNT` VARCHAR(5) NULL,
`SALES` INT NULL,
PRIMARY KEY (`ID_TABLE_A`))
ENGINE = InnoDB;
CREATE TABLE IF NOT EXISTS `mydb`.`TABLE_B` (
`ID_TABLE_B` INT NOT NULL AUTO_INCREMENT,
`ACOUNT` VARCHAR(9) NULL,
`SALES` INT NULL,
PRIMARY KEY (`ID_TABLE_B`))
ENGINE = InnoDB;
Requirement:(I know this can't be right but just to demonstrate a partial picture)
SELECT
ACCOUNTS,
SALES
FROM
TABLE_A, TABLE_B
Result:
| accounts | sales |
|----------|-------|
| 2854     | 52500 |
| 6584     | 54645 |
| 54782    | 5624  |
| 58496    | 46259 |
| 56958    | 6528  |
If you want the union of two tables that are not union-compatible, then make them union-compatible:
(SELECT
ACCOUNT AS ACCOUNTS,
SALES
FROM
TABLE_A) UNION ALL
(SELECT
ACOUNT,
SALES
FROM TABLE_B)
I put the UNION ALL based on the assumption that you would like to keep duplicates. If you would like the output to be duplicate-free, replace it with UNION.
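
To make the idea concrete, here is a small sketch in Python with sqlite3 standing in for MySQL. The extra REGION column on TABLE_B is hypothetical and only there to show tables with different column counts; the sample rows are taken from the expected result in the question. Each SELECT lists just the two shared columns, which is what makes the two sides union-compatible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLE_A (ID_TABLE_A INTEGER PRIMARY KEY, ACCOUNT VARCHAR(5), SALES INT);
-- REGION is a hypothetical extra column; ACOUNT is spelled as in the question's schema.
CREATE TABLE TABLE_B (ID_TABLE_B INTEGER PRIMARY KEY, ACOUNT VARCHAR(9), SALES INT, REGION TEXT);
INSERT INTO TABLE_A (ACCOUNT, SALES) VALUES ('2854', 52500), ('6584', 54645);
INSERT INTO TABLE_B (ACOUNT, SALES, REGION) VALUES ('54782', 5624, 'north'), ('58496', 46259, 'south');
""")

# Column names in the result come from the first SELECT, so only that side needs aliases.
rows = conn.execute("""
SELECT ACCOUNT AS accounts, SALES AS sales FROM TABLE_A
UNION ALL
SELECT ACOUNT, SALES FROM TABLE_B
""").fetchall()
print(rows)
```

The differing VARCHAR lengths do not get in the way here; the columns only need compatible types, not identical declarations.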

returning array of rows from psql in window function

I am trying to return an array of names as a row in PostgreSQL so that I don't return duplicate entries of data. This is my current query:
SELECT DISTINCT
thread_account.*,
thread.*,
MAX(message.Created) OVER (PARTITION BY thread.id) as Last_Message_Date,
MAX(message.content) OVER (PARTITION BY thread.id) as Last_Message_Sent,
ARRAY_AGG((account.first_name, account.last_name)) OVER (PARTITION BY thread.id) as user
FROM thread_account
JOIN thread on thread.id = thread_account.thread
JOIN message on message.thread = thread_account.thread
JOIN account on account.id = message.account
WHERE thread_account.account = 299
ORDER BY MAX(message.Created) OVER (PARTITION BY thread.id) desc;
any thoughts?
I would like to be able to do something like:
ARRAY_AGG(distinct (account.first_name, account.last_name))
OVER (PARTITION BY thread.id) as user
but PostgreSQL doesn't allow DISTINCT inside a window function.
Here are the table definitions:
create table thread (
id bigserial primary key,
subject text not null,
created timestamp with time zone not null default current_timestamp
);
create table thread_account (
account bigint not null references account(id) on delete cascade,
thread bigint not null references thread(id) on delete cascade
);
create index thread_account_account on thread_account(account);
create index thread_account_thread on thread_account(thread);
create table message (
id bigserial primary key,
thread bigint not null references thread(id) on delete cascade,
content text not null,
account bigint not null references account(id) on delete cascade,
created timestamp with time zone not null default current_timestamp
);
create index message_account on message(account);
create index message_thread on message(thread);
create table account (
id bigint primary key,
first_name text,
last_name text,
email text
);
I don't know why you need the relation thread_account, because the involved accounts are already referenced through the messages.
A possible Query could be:
SELECT DISTINCT
Thread_id,
Thread_Subject,
Thread_Created,
ARRAY_AGG(Message_Account) OVER (PARTITION BY Thread_Id) AS Involed_Accounts,
Last_Message_Date,
Last_Message_Sent
FROM (
SELECT DISTINCT ON (thread.id, message.account)
thread.id AS Thread_Id,
thread.subject AS Thread_Subject,
thread.created AS Thread_Created,
message.account AS Message_Account,
MAX(message.Created)
OVER (PARTITION BY thread.id) AS Last_Message_Date,
MAX(message.content)
OVER (PARTITION BY thread.id) AS Last_Message_Sent
FROM
thread
INNER JOIN message ON (message.thread = thread.id)
INNER JOIN account ON (message.account = account.id)
) as threads
ORDER BY Last_Message_Date desc;
Result:
thread_id | thread_subject | thread_created | Involed_Accounts | last_message_date | last_message_sent
-----------+----------------+-------------------------------+---------------+-------------------------------+-------------------
1 | Thread 1 | 2016-02-17 19:42:58.630795+01 | {1,2,3,4,5,6} | 2016-02-17 19:56:35.749875+01 | R
3 | Thread 3 | 2016-02-17 19:42:58.630795+01 | {1,4,5,8} | 2016-02-17 19:47:27.952065+01 | N
2 | Thread 2 | 2016-02-17 19:42:58.630795+01 | {7,8,9,10} | 2016-02-17 19:47:27.952065+01 | J
You should check the query plan to make sure it performs well on your database.

How to Verify Multiple Foreign Key Combination in a table with a Primary Key in another table

I Have a Specification Master Table with three columns
(
ID BIGINT PRIMARY KEY,
Name varchar,
Value varchar
)
a Product Master Table with two Columns
(
ID BIGINT PRIMARY KEY,
Name varchar
)
a Stock table with
(
StockID BIGINT PRIMARY KEY,
ProductID BIGINT FOREIGN KEY (Product.ID),
SpecGroupID BIGINT UNIQUE KEY,
Stock INT
)
a Specification Grouping Table with
(
GroupID BIGINT FOREIGN KEY (Stock.SpecGroupID),
SPecificationID BIGINT FOREIGN KEY (Specification.ID),
PRIMARY KEY (Composite)
)
Now I want to check whether a given combination of specifications has any stock,
but I could not find logic that matches the exact combination.
The problem arises when a stock.SpecGroupID has n specifications associated with it in the Specification Grouping table.
When I search with fewer than those n specifications, the query still returns the SpecGroupID of the n-specification group.
Imagine I have an apple (Color: Red; Size: 5; Weight: 10) in stock
and someone orders an apple (Color: Red; Size: 5).
I need to give the result: Not Available
Will this work for you?
SELECT Specification.ID SpecId
, Specification.Name SpecificationName
, Value SpecificationValue
, MAX(ISNULL(Stock.Stock,0)) HasStock
FROM
Specification
LEFT JOIN SpecificationGrouping sg
ON sg.SpecificationID = Specification.ID
LEFT JOIN Stock
ON GroupID = SpecGroupID
GROUP BY Specification.ID
, Specification.Name
, Value
The first CTE, Groups, lists the specifications of each group.
The second CTE, GroupDetail, joins all properties of a group into a single string like Color|Red,Size|5,Weight|10.
The third CTE, OrderDetail, is just the test input; in this sample, groups 6 and 9 are orders.
The last query tries to find one item in stock with exactly the same detail as the order.
If a specification group isn't in Stock, I assume it is an order. Here 6 has no matching item in stock, so it produces all NULLs, while 9 does have stock and returns the stock value.
SQL Fiddle Demo
WITH Groups as
(
SELECT *
FROM SpecificationGrouping SG
INNER JOIN Specification S
ON Sg.SPecificationID = S.ID
),
GroupDetail as
(
Select distinct ST2.GroupID,
substring(
(
Select ','+ ST1.Name + '|' + ST1.Value AS [text()]
From Groups ST1
Where ST1.GroupID = ST2.GroupID
ORDER BY ST1.Name
For XML PATH ('')
), 2, 1000) [Detail]
From Groups ST2
),
OrderDetail as
(
SELECT Detail
FROM GroupDetail
WHERE GroupID = 9
)
SELECT S.SpecGroupID, S.Stock, G.[Detail]
FROM Stock S
INNER JOIN GroupDetail G
ON S.SpecGroupID = G.GroupID
RIGHT JOIN OrderDetail O
ON O.Detail = G.Detail
Output for 9
| SpecGroupID | Stock | Detail |
|-------------|-------|----------------------------|
| 3 | 5 | Color|Red,Size|5,Weight|10 |
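
For comparison, here is a sketch of a different technique from the string-concatenation approach above: matching by counts, sometimes called relational division. A stock group matches only if it has exactly as many specifications as the order and all of them are among the ordered ones. The script uses Python's sqlite3 module rather than SQL Server; the table contents and the find_stock helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Specification (ID INTEGER PRIMARY KEY, Name TEXT, Value TEXT);
CREATE TABLE Stock (StockID INTEGER PRIMARY KEY, ProductID INTEGER,
                    SpecGroupID INTEGER UNIQUE, Stock INTEGER);
CREATE TABLE SpecificationGrouping (GroupID INTEGER, SpecificationID INTEGER,
                                    PRIMARY KEY (GroupID, SpecificationID));
-- Made-up data: one apple in stock with Color, Size and Weight.
INSERT INTO Specification VALUES (1, 'Color', 'Red'), (2, 'Size', '5'), (3, 'Weight', '10');
INSERT INTO Stock VALUES (1, 100, 3, 5);
INSERT INTO SpecificationGrouping VALUES (3, 1), (3, 2), (3, 3);
""")

def find_stock(spec_ids):
    """Return (SpecGroupID, Stock) rows whose specification set equals spec_ids exactly."""
    wanted = sorted(set(spec_ids))
    placeholders = ",".join("?" * len(wanted))
    sql = f"""
    SELECT s.SpecGroupID, s.Stock
    FROM Stock AS s
    JOIN SpecificationGrouping AS sg ON sg.GroupID = s.SpecGroupID
    GROUP BY s.SpecGroupID, s.Stock
    HAVING COUNT(*) = ?                                   -- same number of specs
       AND SUM(sg.SpecificationID IN ({placeholders})) = ? -- and every one is wanted
    """
    return conn.execute(sql, [len(wanted), *wanted, len(wanted)]).fetchall()

exact = find_stock([1, 2, 3])   # Color, Size, Weight
partial = find_stock([1, 2])    # Color, Size only
print(exact, partial)
```

With the partial order the group's count (3) differs from the order's count (2), so nothing is returned, which corresponds to the "Not Available" case in the question.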

whats wrong with this query?

I'm trying to write a query that selects from four tables
campaignSentParent csp
campaignSentEmail cse
campaignSentFax csf
campaignSentSms css
Each of the cse, csf, and css tables is linked to the csp table by csp.id = (cse/csf/css).parentId
The csp table has a column called campaignId.
What I want to do is end up with rows that look like:
| id | dateSent | emailsSent | faxsSent | smssSent |
| 1 | 2011-02-04 | 139 | 129 | 140 |
But instead I end up with a row that looks like:
| 1 | 2011-02-03 | 2510340 | 2510340 | 2510340 |
Here is the query I am trying
SELECT csp.id id, csp.dateSent dateSent,
COUNT(cse.parentId) emailsSent,
COUNT(csf.parentId) faxsSent,
COUNT(css.parentId) smsSent
FROM campaignSentParent csp,
campaignSentEmail cse,
campaignSentFax csf,
campaignSentSms css
WHERE csp.campaignId = 1
AND csf.parentId = csp.id
AND cse.parentId = csp.id
AND css.parentId = csp.id;
Adding GROUP BY did not help, so I am posting the create statements.
csp
CREATE TABLE `campaignsentparent` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`campaignId` int(11) NOT NULL,
`dateSent` datetime NOT NULL,
`account` int(11) NOT NULL,
`status` varchar(15) NOT NULL DEFAULT 'Creating',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=latin1
cse/csf (same structure, different names)
CREATE TABLE `campaignsentemail` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`parentId` int(11) NOT NULL,
`contactId` int(11) NOT NULL,
`content` text,
`subject` text,
`status` varchar(15) DEFAULT 'Pending',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=140 DEFAULT CHARSET=latin1
css
CREATE TABLE `campaignsentsms` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`parentId` int(11) NOT NULL,
`contactId` int(11) NOT NULL,
`content` text,
`status` varchar(15) DEFAULT 'Pending',
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=141 DEFAULT CHARSET=latin1
You need to aggregate the counts separately, one per child table, not in a single join as shown in the question.
SELECT csp.id, csp.dateSent dateSent,
e.email_count, f.fax_count, s.sms_count
FROM campaignSentParent AS csp
JOIN (SELECT cse.ParentId, COUNT(*) AS email_count
FROM campaignSentEmail cse
GROUP BY cse.ParentID) AS e ON e.parentID = csp.id
JOIN (SELECT csf.ParentId, COUNT(*) AS fax_count
FROM campaignSentFax csf
GROUP BY csf.ParentID) AS f ON f.ParentID = csp.id
JOIN (SELECT css.ParentID, COUNT(*) AS sms_count
FROM campaignSentSms css
GROUP BY css.ParentId) AS s ON s.ParentID = csp.id
WHERE csp.campaignId = 1
To do this, you pretty much have to use the JOIN notation as shown.
Depending on the quality of your optimizer, the cardinalities of the various tables, and the available indexes, you might find it effective to include a join with CampaignSentParent in each of the sub-queries with the csp.CampaignID = 1 condition, so as to limit the data aggregated by the sub-queries.
You might notice that the result count you get is 2510340. The prime factorization of 2510340 is 2 × 2 × 3 × 5 × 7 × 43 × 139, and your expected answer is 139, 129, and 140. You can get 3 × 43 = 129; 2 × 2 × 5 × 7 = 140; and 139 = 139. In other words, the original query is generating the Cartesian product of all the rows in the three dependent tables and counting the product, rather than counting the relevant rows from each dependent table separately.
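
The multiplication effect can be reproduced on a small scale. The sketch below uses Python's sqlite3 module with simplified, hypothetical table names (parent, email, fax, sms) and 3, 2, and 4 child rows instead of 139, 129, and 140. It also checks the arithmetic from the paragraph above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY);
CREATE TABLE email (id INTEGER PRIMARY KEY, parentId INTEGER);
CREATE TABLE fax (id INTEGER PRIMARY KEY, parentId INTEGER);
CREATE TABLE sms (id INTEGER PRIMARY KEY, parentId INTEGER);
INSERT INTO parent VALUES (1);
""")
# 3 emails, 2 faxes and 4 SMS messages, all belonging to parent 1.
for table, n in (("email", 3), ("fax", 2), ("sms", 4)):
    conn.executemany(f"INSERT INTO {table} (parentId) VALUES (?)", [(1,)] * n)

# Joining all three child tables at once pairs every email with every fax and SMS.
joined = conn.execute("""
SELECT COUNT(e.parentId), COUNT(f.parentId), COUNT(s.parentId)
FROM parent p, email e, fax f, sms s
WHERE e.parentId = p.id AND f.parentId = p.id AND s.parentId = p.id
""").fetchone()

# Counting each child table in its own grouped subquery keeps the counts independent.
separate = conn.execute("""
SELECT e.n, f.n, s.n
FROM parent p
JOIN (SELECT parentId, COUNT(*) AS n FROM email GROUP BY parentId) e ON e.parentId = p.id
JOIN (SELECT parentId, COUNT(*) AS n FROM fax GROUP BY parentId) f ON f.parentId = p.id
JOIN (SELECT parentId, COUNT(*) AS n FROM sms GROUP BY parentId) s ON s.parentId = p.id
""").fetchone()

product_check = 139 * 129 * 140
print(joined, separate, product_check)
```

Every column of the joined query reports 3 × 2 × 4 = 24, just as every column in the question reports 139 × 129 × 140.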
You're missing a GROUP BY statement at the end. I can't tell from your example what you want them to be grouped by to actually give you the code.
Add GROUP BY dateSent to the end of your query.
Try adding a group by clause.
SELECT csp.id id, csp.dateSent dateSent,
COUNT('cse.parentId') emailsSent,
COUNT('csf.parentId') faxsSent,
COUNT('css.parentId') smsSent
FROM campaignSentParent csp,
campaignSentEmail cse,
campaignSentFax csf,
campaignSentSms css
WHERE csp.campaignId = 1
AND csf.parentId = csp.id
AND cse.parentId = csp.id
AND css.parentId = csp.id
GROUP BY csp.id, csp.dateSent
When you use an aggregate function, you normally need to include a group by.