SQL multiple inclusive joins - sql

Below is the schema description. I would like to construct a query that, for a given user, returns all the cases that are shared directly via the case_users table OR indirectly via the case_groups table. Here is my attempt, where I pull the groups the user belongs to up front:
SELECT * FROM `cases`
INNER JOIN `case_users` ON `cases`.`id` = `case_users`.`case_id`
INNER JOIN `case_groups` ON `cases`.`id` = `case_groups`.`case_id`
WHERE `case_users`.`user_id` = '<USER_ID>'
OR `case_groups`.`group_id` IN (<USER_GROUP_LIST>)
EXPLAIN returns the following: Impossible WHERE noticed after reading const table...
How can I get it done? Ideally I would like to retrieve all the cases in a single shot, without first pulling USER_GROUP_LIST (the groups that the user belongs to).
mysql> describe users;
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
+-------------+--------------+------+-----+---------+----------------+
mysql> describe cases;
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
+-------------+--------------+------+-----+---------+----------------+
mysql> describe case_users;
+-------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+-------+
| user_id | int(11) | NO | PRI | NULL | |
| case_id | int(11) | NO | PRI | NULL | |
+-------------+---------+------+-----+---------+-------+
mysql> describe case_groups;
+-------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+-------+
| case_id | int(11) | NO | PRI | NULL | |
| group_id | int(11) | NO | PRI | NULL | |
+-------------+---------+------+-----+---------+-------+
mysql> describe group_users;
+-------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+-------+
| group_id | int(11) | NO | PRI | NULL | |
| user_id | int(11) | NO | PRI | NULL | |
+-------------+---------+------+-----+---------+-------+

Your joins will only return cases whose id is in both case_users and case_groups.
If it's one or the other, then you need two queries, which you can UNION to get all the results in a single result set:
SELECT `cases`.* FROM `cases`
INNER JOIN `case_users` ON `cases`.`id` = `case_users`.`case_id`
WHERE `case_users`.`user_id` = '<USER_ID>'
UNION
SELECT `cases`.* FROM `cases`
INNER JOIN `case_groups` ON `cases`.`id` = `case_groups`.`case_id`
WHERE `case_groups`.`group_id` IN (SELECT `group_users`.`group_id`
FROM `group_users`
WHERE `group_users`.`user_id` = '<USER_ID>')
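
To see the difference between the two-join attempt and the UNION, here is a minimal sketch that runs both against an in-memory SQLite database. SQLite stands in for MySQL here, and the sample rows, the user id 7, and the group ids 10 and 99 are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases       (id INTEGER PRIMARY KEY);
    CREATE TABLE case_users  (user_id INT, case_id INT, PRIMARY KEY (user_id, case_id));
    CREATE TABLE case_groups (case_id INT, group_id INT, PRIMARY KEY (case_id, group_id));
    CREATE TABLE group_users (group_id INT, user_id INT, PRIMARY KEY (group_id, user_id));

    -- Made-up sample data: user 7 is in group 10.
    INSERT INTO cases       VALUES (1), (2), (3), (4);
    INSERT INTO case_users  VALUES (7, 1), (7, 2);            -- cases 1 and 2 shared directly
    INSERT INTO group_users VALUES (10, 7);
    INSERT INTO case_groups VALUES (2, 10), (3, 10), (4, 99); -- cases 2 and 3 shared via group 10
""")

# The attempt from the question: both INNER JOINs must match,
# so only cases present in case_users AND case_groups survive.
ATTEMPT = """
    SELECT DISTINCT cases.id FROM cases
    INNER JOIN case_users  ON cases.id = case_users.case_id
    INNER JOIN case_groups ON cases.id = case_groups.case_id
    WHERE case_users.user_id = :uid
       OR case_groups.group_id IN (SELECT group_id FROM group_users WHERE user_id = :uid)
"""

# The UNION from the answer: each branch joins only the table it needs.
UNION_QUERY = """
    SELECT cases.* FROM cases
    INNER JOIN case_users ON cases.id = case_users.case_id
    WHERE case_users.user_id = :uid
    UNION
    SELECT cases.* FROM cases
    INNER JOIN case_groups ON cases.id = case_groups.case_id
    WHERE case_groups.group_id IN (SELECT group_id FROM group_users WHERE user_id = :uid)
"""

def attempted_case_ids(user_id):
    rows = conn.execute(ATTEMPT, {"uid": user_id}).fetchall()
    return sorted(row[0] for row in rows)

def shared_case_ids(user_id):
    rows = conn.execute(UNION_QUERY, {"uid": user_id}).fetchall()
    return sorted(row[0] for row in rows)

print(attempted_case_ids(7))  # only the case that appears in both link tables
print(shared_case_ids(7))     # direct and group-shared cases, de-duplicated by UNION
```

UNION (as opposed to UNION ALL) removes the duplicate row for a case that is shared both directly and through a group.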

Related

Restoring SQL table into new table with more columns

I'm trying to salvage a Gitorious installation that has gone bad. I've dumped the SQL table using mysqldump, but now I'm running into the problem that the new version of Gitorious changed its SQL schema in a few places.
In particular, the old version has a table taggings, which looks like
mysql> describe taggings;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| tag_id | int(11) | YES | MUL | NULL | |
| taggable_id | int(11) | YES | MUL | NULL | |
| taggable_type | varchar(255) | YES | | NULL | |
| created_at | datetime | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)
In the new version, this table has gotten three extra columns:
mysql> describe taggings;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| tag_id | int(11) | YES | MUL | NULL | |
| taggable_id | int(11) | YES | MUL | NULL | |
| taggable_type | varchar(255) | YES | | NULL | |
| created_at | datetime | YES | | NULL | |
| tagger_id | int(11) | YES | | NULL | |
| tagger_type | varchar(255) | YES | | NULL | |
| context | varchar(255) | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
8 rows in set (0.00 sec)
so that
grep 'INSERT INTO `taggings`' inuse.sql | mysql -uroot gitorious_production
fails with
ERROR 1136 (21S01) at line 1: Column count doesn't match value count at row 1
Is there an easy way to tell MySQL that the final three fields should be left at their default value, NULL?
(The new Gitorious' taggings table starts out empty.)
As a general best practice, you should name the columns you are inserting into:
INSERT INTO taggings (id, tag_id, taggable_id, taggable_type, created_at) VALUES (...your values...)
Alternatively:
1. Rename your new taggings table to taggings_old
2. Create a table named taggings with your old schema
3. Insert your data
4. Add the new columns to your taggings table
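
Since the dump already exists, one way to apply the explicit-column advice is to rewrite the INSERT lines before feeding them to the database. The sketch below assumes the dump uses the default mysqldump form INSERT INTO `taggings` VALUES (...); the helper name add_column_list and the sample row values are hypothetical, and SQLite stands in for MySQL to check that the unlisted columns end up NULL.

```python
import re
import sqlite3

# Column order of the OLD taggings table, taken from the first DESCRIBE output.
OLD_COLUMNS = ["id", "tag_id", "taggable_id", "taggable_type", "created_at"]

def add_column_list(statement, table="taggings", columns=OLD_COLUMNS):
    """Rewrite INSERT INTO `t` VALUES ... as INSERT INTO `t` (`a`, `b`, ...) VALUES ..."""
    column_list = ", ".join(f"`{name}`" for name in columns)
    pattern = rf"^INSERT INTO `{re.escape(table)}` VALUES"
    return re.sub(pattern, f"INSERT INTO `{table}` ({column_list}) VALUES", statement)

# A made-up line in the shape mysqldump typically emits.
dump_line = ("INSERT INTO `taggings` VALUES "
             "(1,10,100,'Project','2010-01-01 00:00:00'),"
             "(2,11,100,'Project','2010-01-02 00:00:00');")
rewritten = add_column_list(dump_line)

# New-style table with the three extra columns (SQLite stand-in for MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE taggings (
        id INTEGER PRIMARY KEY, tag_id INT, taggable_id INT,
        taggable_type TEXT, created_at TEXT,
        tagger_id INT, tagger_type TEXT, context TEXT
    )
""")
conn.execute(rewritten)

# The columns that were not named in the INSERT keep their default, NULL.
rows = conn.execute(
    "SELECT id, tagger_id, tagger_type, context FROM taggings ORDER BY id"
).fetchall()
print(rewritten)
print(rows)
```

Against the real dump, the same substitution could be applied to each line matched by the grep in the question before piping the result to mysql.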

Storing government forms

I want to store a large number of filled-out government forms, like the Application for Federal Assistance. The forms are varied and change yearly. Field types vary, and can be: boolean, string, date, int, among others.
Is the best way to store these forms to completely normalize data?
À la:
form
+-----------------+-----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+-----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| govt_identifier | char(40) | YES | | NULL | |
| description | char(100) | YES | | NULL | |
+-----------------+-----------+------+-----+---------+----------------+
filled_form (a form a person has actually filled out)
+-----------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| form_id | int(11) | NO | | NULL | |
| person_id | int(11) | NO | | NULL | |
+-----------+---------+------+-----+---------+----------------+
text_field (a class of input; belongs to a form)
+---------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+----------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | char(40) | YES | | NULL | |
| form_id | int(11) | NO | | NULL | |
+---------+----------+------+-----+---------+----------------+
text_value (a particular input record; belongs to a class and filled_form)
+----------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| value | text | YES | | NULL | |
| text_field_id | int(11) | NO | | NULL | |
| filled_form_id | int(11) | NO | | NULL | |
+----------------+---------+------+-----+---------+----------------+
... continue for all input types
While this would work, your SQL will be slightly awkward and quite non-intuitive. Have you considered creating a data model for each form individually and then using those to populate your forms? It may seem like more work up front, but the development of your data capture will potentially be simpler.
I would have a look at single table inheritance.
Model each field as a base class Field with subclasses IntField, BoolField, etc.
The Field class will have a member Name (string), IntField will have IntValue (int), BoolField will have BoolValue (bit), etc.
This requires one column for each possible type in your Field table, which costs a bit of space, but on the other hand it gives you type safety. If you model it as single table inheritance, you can probably hook up your favorite O/R mapper without problems.
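
The single table inheritance idea can be sketched roughly as follows. This is only an illustration under assumptions: the table name field, the type discriminator column, and the save/load helpers are hypothetical, a TextField subclass is added alongside the IntField and BoolField named above, and SQLite stands in for whatever database sits behind the forms.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class Field:
    name: str

@dataclass
class IntField(Field):
    int_value: int

@dataclass
class BoolField(Field):
    bool_value: bool

@dataclass
class TextField(Field):
    text_value: str

conn = sqlite3.connect(":memory:")
# One table for every subclass: a discriminator column plus one value column per type.
conn.execute("""
    CREATE TABLE field (
        id INTEGER PRIMARY KEY,
        filled_form_id INT NOT NULL,
        type TEXT NOT NULL,        -- which subclass the row represents
        name TEXT NOT NULL,
        int_value INT,
        bool_value INT,
        text_value TEXT
    )
""")

def save(filled_form_id, field):
    # Start with every typed column empty, then fill the one this subclass owns.
    values = {"int_value": None, "bool_value": None, "text_value": None}
    values.update({k: v for k, v in vars(field).items() if k != "name"})
    conn.execute(
        "INSERT INTO field (filled_form_id, type, name, int_value, bool_value, text_value) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (filled_form_id, type(field).__name__, field.name,
         values["int_value"], values["bool_value"], values["text_value"]),
    )

def load(filled_form_id):
    rows = conn.execute(
        "SELECT type, name, int_value, bool_value, text_value "
        "FROM field WHERE filled_form_id = ? ORDER BY id",
        (filled_form_id,),
    )
    fields = []
    for type_name, name, int_value, bool_value, text_value in rows:
        if type_name == "IntField":
            fields.append(IntField(name, int_value))
        elif type_name == "BoolField":
            fields.append(BoolField(name, bool(bool_value)))
        else:
            fields.append(TextField(name, text_value))
    return fields

original = [IntField("employees", 12), BoolField("is_nonprofit", True),
            TextField("applicant", "Example Org")]
for item in original:
    save(1, item)
print(load(1))
```

Each row leaves the value columns of the other types NULL, which is the space overhead mentioned above; in exchange every value is stored in a column of its own type.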

Return null on condition in sub-table

Say I have two tables, person and job.
+--------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(255) | NO | | NULL | |
| job_id | int(11) | NO | | NULL | |
+--------+--------------+------+-----+---------+----------------+
+----------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+--------------+------+-----+---------+----------------+
| j_id | int(11) | NO | PRI | NULL | auto_increment |
| j_name | varchar(255) | NO | | NULL | |
| j_active | tinyint(1) | NO | | 0 | |
+----------+--------------+------+-----+---------+----------------+
How would I do a select that returns job_id only where j_active = 1, and otherwise returns 0 or NULL? I always want to return all persons, but when their job isn't active I don't want to return their job id.
select * from person p left join job j on p.job_id=j.j_id and j.j_active=1
A case statement should work. Something like:
select name, case when j_active=1 then job_id else null end as job_id
from person join job on (person.job_id=job.j_id)
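
The two suggestions behave the same as long as every person.job_id has a matching job row, but they can differ otherwise. The sketch below runs both against made-up sample data in an in-memory SQLite database (a stand-in for MySQL); the person Carol, whose job_id points at no job row, is an assumption added to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL, job_id INT NOT NULL);
    CREATE TABLE job (j_id INTEGER PRIMARY KEY, j_name TEXT NOT NULL,
                      j_active INT NOT NULL DEFAULT 0);

    -- Made-up sample data.
    INSERT INTO job VALUES (1, 'Baker', 1), (2, 'Miller', 0);
    INSERT INTO person VALUES (1, 'Alice', 1),   -- active job
                              (2, 'Bob',   2),   -- inactive job
                              (3, 'Carol', 3);   -- job_id with no matching job row
""")

# LEFT JOIN with the activity check in the ON clause: every person is kept,
# and the job columns are NULL unless an active job matched.
left_join_rows = conn.execute("""
    SELECT p.name, j.j_id AS job_id
    FROM person p
    LEFT JOIN job j ON p.job_id = j.j_id AND j.j_active = 1
    ORDER BY p.id
""").fetchall()

# CASE over an inner join: inactive jobs become NULL,
# but a person without any job row is dropped by the join.
case_rows = conn.execute("""
    SELECT name, CASE WHEN j_active = 1 THEN job_id ELSE NULL END AS job_id
    FROM person JOIN job ON person.job_id = job.j_id
    ORDER BY person.id
""").fetchall()

print(left_join_rows)
print(case_rows)
```

If a foreign key guarantees that every job_id exists in job, the two queries return the same rows; the LEFT JOIN form keeps all persons either way. Note that the first query as originally written uses select *, which still returns person.job_id itself, so the job id has to be read from the job side (j.j_id) as done here.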

Why won't MySQL use a reference index on the JOIN?

In the following example, MySQL fails to find a ref for the JOIN clause (or so it appears). Can anyone explain why?
mysql> explain SELECT 1
FROM `businesses`
INNER JOIN `categories`
ON (`businesses`.`id` = `categories`.`business_id`)
WHERE (`categories`.`category_id` IN (1321, 7304, 9189, 4736, 4737, 1322, 8554, 1323, 1324, 9459, 1325, 1326, 4738, 1327, 1328, 1329, 1330, 1331, 1332, 1333, 1334, 8031, 8387)
AND `businesses`.`id` <= 170261
AND `businesses`.`id` >= 160262 ) ;
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
+----+-------------+-------------------------------------+-------+--------------------------+-------------+---------+------+-------+-
| 1 | SIMPLE | businesses | range | PRIMARY | PRIMARY | 4 | NULL | 20492 | Using where
| 1 | SIMPLE | categories | range | business_id,idx_category | business_id | 10 | NULL | 20584 | Using where; Using index
+----+-------------+-------------------------------------+-------+--------------------------+-------------+---------+------+-------+-
categories table:
| categories | CREATE TABLE `categories` (
`id` int(11) NOT NULL auto_increment,
`business_id` int(10) unsigned default NULL,
`category_id` int(10) unsigned default NULL,
`country_id` char(2) default NULL,
`state_id` int(10) unsigned default NULL,
`city_id` int(10) unsigned default NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `business_id` (`business_id`,`category_id`),
KEY `idx_category2` (`country_id`,`state_id`,`city_id`,`category_id`),
KEY `idx_category` (`category_id`)
) ENGINE=InnoDB AUTO_INCREMENT=13155275 DEFAULT CHARSET=latin1 |
Index info on categories:
+-------------------------------------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------------------------------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| categories | 0 | PRIMARY | 1 | id | A | 13154781 | NULL | NULL | | BTREE | |
| categories | 0 | business_id | 1 | business_id | A | 13154781 | NULL | NULL | YES | BTREE | |
| categories | 0 | business_id | 2 | category_id | A | 13154781 | NULL | NULL | YES | BTREE | |
| categories | 1 | idx_category2 | 1 | country_id | A | 17 | NULL | NULL | YES | BTREE | |
| categories | 1 | idx_category2 | 2 | state_id | A | 17 | NULL | NULL | YES | BTREE | |
| categories | 1 | idx_category2 | 3 | city_id | A | 53913 | NULL | NULL | YES | BTREE | |
| categories | 1 | idx_category2 | 4 | category_id | A | 13154781 | NULL | NULL | YES | BTREE | |
| categories | 1 | idx_category | 1 | category_id | A | 51995 | NULL | NULL | YES | BTREE | |
+-------------------------------------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
Maybe it's because you're not looking for all categories with that business_id, but further limiting the categories, like:
WHERE (`categories`.`category_id` IN (1321, 7304, 9189, etc)
The MySQL guide has an article on the range join type that might be relevant.

MySQL: Removing duplicate columns on Left Join, 3 tables

I have a table that uses 3 foreign keys into other tables. When I perform a left join, I get duplicate columns. MySQL says that the USING syntax will reduce the duplicate columns, but there aren't examples for multiple keys.
Given:
mysql> describe recipes;
+------------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+------------------+------+-----+---------+-------+
| ID_Recipe | int(11) | NO | PRI | NULL | |
| Recipe_Title | char(64) | NO | | NULL | |
| Difficulty | int(10) unsigned | NO | | NULL | |
| Elegance | int(10) unsigned | NO | | NULL | |
| Quality | int(10) unsigned | NO | | NULL | |
| Kitchen_Hours | int(10) unsigned | NO | | NULL | |
| Kitchen_Minutes | int(10) unsigned | NO | | NULL | |
| Total_Hours | int(10) unsigned | NO | | NULL | |
| Total_Minutes | int(10) unsigned | NO | | NULL | |
| Serving_Quantity | int(10) unsigned | NO | | NULL | |
| Description | varchar(128) | NO | | NULL | |
| ID_Prep_Text | int(11) | YES | | NULL | |
| ID_Picture | int(11) | YES | | NULL | |
| Category | int(10) unsigned | NO | | NULL | |
| ID_Reference | int(11) | YES | | NULL | |
+------------------+------------------+------+-----+---------+-------+
15 rows in set (0.06 sec)
mysql> describe recipe_prep_texts;
+------------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+---------------+------+-----+---------+-------+
| ID_Prep_Text | int(11) | NO | PRI | NULL | |
| Preparation_Text | varchar(2048) | NO | | NULL | |
+------------------+---------------+------+-----+---------+-------+
2 rows in set (0.02 sec)
mysql> describe mp_references;
+--------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+-------+
| ID_Reference | int(11) | NO | PRI | NULL | |
| ID_Title | int(11) | YES | | NULL | |
| ID_Category | int(11) | YES | | NULL | |
+--------------+---------+------+-----+---------+-------+
3 rows in set (0.00 sec)
My query statement:
SELECT *
FROM Recipes
LEFT JOIN (Recipe_Prep_Texts, Recipe_Pictures, mp_References)
ON (
Recipe_Prep_Texts.ID_Prep_Text = Recipes.ID_Prep_Text AND
Recipe_Pictures.ID_Picture = Recipes.ID_Picture AND
mp_References.ID_Reference = Recipes.ID_Reference
);
My objective is to get one row of all the columns from the join without duplicate columns. I'm using MySQL C++ Connector to send the SQL statements and retrieve result sets. I believe that the C++ Connector is having issues with duplicate column names.
So what is the SQL statement syntax that I should use?
Reference to MySQL JOIN syntax
I believe the following should work:
SELECT *
FROM Recipes
LEFT JOIN Recipe_Prep_Texts USING (ID_Prep_Text)
LEFT JOIN Recipe_Pictures USING (ID_Picture)
LEFT JOIN mp_References USING (ID_Reference)
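
A quick way to check the effect of USING on SELECT * is to compare the result column names of both join styles. The sketch below does that in an in-memory SQLite database, which stands in for MySQL; the Recipes table is trimmed to a few columns, and Picture_Path is a hypothetical column because the question never shows the Recipe_Pictures table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Trimmed-down versions of the tables from the question.
    CREATE TABLE Recipes (ID_Recipe INT PRIMARY KEY, Recipe_Title TEXT,
                          ID_Prep_Text INT, ID_Picture INT, ID_Reference INT);
    CREATE TABLE Recipe_Prep_Texts (ID_Prep_Text INT PRIMARY KEY, Preparation_Text TEXT);
    -- Picture_Path is hypothetical; the question does not describe this table.
    CREATE TABLE Recipe_Pictures (ID_Picture INT PRIMARY KEY, Picture_Path TEXT);
    CREATE TABLE mp_References (ID_Reference INT PRIMARY KEY, ID_Title INT, ID_Category INT);

    INSERT INTO Recipe_Prep_Texts VALUES (1, 'Mix and bake.');
    INSERT INTO Recipe_Pictures   VALUES (1, 'cake.png');
    INSERT INTO mp_References     VALUES (1, 5, 6);
    INSERT INTO Recipes           VALUES (1, 'Cake', 1, 1, 1);
""")

def column_names(sql):
    """Return the names of the result columns produced by a query."""
    return [entry[0] for entry in conn.execute(sql).description]

# Joining with ON keeps the key column from both sides of every join.
on_columns = column_names("""
    SELECT * FROM Recipes
    LEFT JOIN Recipe_Prep_Texts ON Recipe_Prep_Texts.ID_Prep_Text = Recipes.ID_Prep_Text
    LEFT JOIN Recipe_Pictures   ON Recipe_Pictures.ID_Picture     = Recipes.ID_Picture
    LEFT JOIN mp_References     ON mp_References.ID_Reference     = Recipes.ID_Reference
""")

# Joining with USING reports each shared key column only once.
using_columns = column_names("""
    SELECT * FROM Recipes
    LEFT JOIN Recipe_Prep_Texts USING (ID_Prep_Text)
    LEFT JOIN Recipe_Pictures   USING (ID_Picture)
    LEFT JOIN mp_References     USING (ID_Reference)
""")

print(len(on_columns), len(using_columns))
print(using_columns)
```

The order in which the merged columns appear can differ between database engines, so code that reads the result set is safer looking columns up by name than by position.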
Since it looks like most of the tables you are joining have only a few columns, except for the first one, how about:
SELECT Recipes.*,
Recipe_Prep_Texts.Preparation_Text,
Recipe_Pictures.Foo, -- describe is missing in OP
mp_References.ID_Title,
mp_References.ID_Category
FROM Recipes
LEFT JOIN (Recipe_Prep_Texts, Recipe_Pictures, mp_References)
ON (
Recipe_Prep_Texts.ID_Prep_Text = Recipes.ID_Prep_Text AND
Recipe_Pictures.ID_Picture = Recipes.ID_Picture AND
mp_References.ID_Reference = Recipes.ID_Reference
);
I can't tell you how many times I wished I had
SELECT (* - foo) FROM table
especially in cases where foo is some huge field like a BLOB and I just want to see everything else without breaking the formatting.
You are selecting * from the combined resulting table. Limit that * to whatever columns you want to keep.
Try the following query:
SELECT name,ac,relation_name
FROM table1
LEFT JOIN table2 USING (ID_Prep_Text)
LEFT JOIN table3 USING (ID_Picture);