Parent/Child Tables Query Pattern - sql

Suppose I have the following parent/child table relationship in my database:
TABLE offer_master( offer_id int primary key, ..., scope varchar )
TABLE offer_detail( offer_detail_id int primary key, offer_id int foreign key, customer_id int, ... )
where offer_master.scope can take one of two values:
INDIVIDUAL: when the offer is made to particular customers. In this case,
whenever a row is inserted into offer_master, a corresponding row is
added to offer_detail for each customer to which the offer has been extended.
e.g.
INSERT INTO offer_master VALUES ( 1, ..., 'INDIVIDUAL' );
INSERT INTO offer_detail( offer_detail_id, offer_id, customer_id, ... )
VALUES ( 1, 1, 100, ... );
INSERT INTO offer_detail( offer_detail_id, offer_id, customer_id, ... )
VALUES ( 2, 1, 101, ... );
GLOBAL: when the offer is made to all customers. In this case,
new offers can be added to the parent table as follows:
INSERT INTO offer_master VALUES ( 2, ..., 'GLOBAL' );
INSERT INTO offer_master VALUES ( 3, ..., 'GLOBAL' );
but a child row is added to offer_detail only
when a customer indicates some interest in the offer. So
it may be the case that, at some later point we will have
INSERT INTO offer_detail( offer_detail_id, offer_id, customer_id, ... )
VALUES ( 4, 3, 100, ... )
Given this situation, suppose we would like to query the database
to obtain all offers which have been extended to customer 100;
this includes 3 types of offers:
offers which have been extended specifically to customer 100.
global offers which customer 100 showed no interest in.
global offers which customer 100 did show interest in.
I see two approaches:
Using a Subquery:
SELECT *
FROM offer_master
WHERE offer_id in (
SELECT offer_id
FROM offer_detail
WHERE customer_id = 100 )
OR scope = 'GLOBAL'
Using a UNION
SELECT om.*
FROM offer_master om INNER JOIN
offer_detail od
ON om.offer_id = od.offer_id
WHERE od.customer_id = 100
UNION
SELECT *
FROM offer_master
WHERE scope = 'GLOBAL'
Note: a UNION ALL cannot be used since a global offer
which a customer has shown interest in would be duplicated.
My question is:
Does this query pattern have a name?
Which of the two query methods is preferable?
Should the database design be improved in some way?

I'm not aware of a pattern name.
To me, the second query is clearer but I think either is OK.
offer_detail seems to be a dual-purpose table, which is a bit of a red flag to me. You might instead have separate tables for the customers included in an individual offer and for the customers who have expressed interest.
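To check that the two queries from the question really return the same offers, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The elided columns are left out, and offer 4 (an individual offer for customer 101 only) is a hypothetical row added just to show that it is excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offer_master (offer_id INTEGER PRIMARY KEY, scope TEXT);
    CREATE TABLE offer_detail (
        offer_detail_id INTEGER PRIMARY KEY,
        offer_id INTEGER REFERENCES offer_master(offer_id),
        customer_id INTEGER
    );
    -- Offers 1-3 follow the question; offer 4 is a hypothetical extra row.
    INSERT INTO offer_master VALUES
        (1, 'INDIVIDUAL'), (2, 'GLOBAL'), (3, 'GLOBAL'), (4, 'INDIVIDUAL');
    INSERT INTO offer_detail VALUES
        (1, 1, 100), (2, 1, 101), (3, 4, 101), (4, 3, 100);
""")

SUBQUERY = """
    SELECT offer_id FROM offer_master
    WHERE offer_id IN (SELECT offer_id FROM offer_detail WHERE customer_id = ?)
       OR scope = 'GLOBAL'
    ORDER BY 1
"""

UNION = """
    SELECT om.offer_id
    FROM offer_master om
    JOIN offer_detail od ON om.offer_id = od.offer_id
    WHERE od.customer_id = ?
    UNION
    SELECT offer_id FROM offer_master WHERE scope = 'GLOBAL'
    ORDER BY 1
"""

# Same as UNION but keeping duplicates, to show why UNION ALL is not suitable.
UNION_ALL = UNION.replace("UNION", "UNION ALL")

def offer_ids(sql, customer_id):
    """Run one of the queries above and return the offer ids as a list."""
    return [row[0] for row in conn.execute(sql, (customer_id,))]

subquery_ids = offer_ids(SUBQUERY, 100)
union_ids = offer_ids(UNION, 100)
union_all_ids = offer_ids(UNION_ALL, 100)
print(subquery_ids, union_ids, union_all_ids)
```

With this data both forms return offers 1, 2 and 3, while the UNION ALL variant lists offer 3 twice because it is global and customer 100 also has a detail row for it.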

Related

Ambiguity with column reference [duplicate]

I am trying to run some simple code, as follows:
Create Table weather (
city varchar(80),
temp_lo int,
temp_hi int,
prcp real,
date date
);
Insert Into weather Values ('A', -5, 40, 25, '2018-01-10');
Insert Into weather Values ('B', 5, 45, 15, '2018-02-10');
Create Table cities (
city varchar(80),
location point
);
Insert Into cities Values ('A', '(12,10)');
Insert Into cities Values ('B', '(6,4)');
Insert Into cities Values ('C', '(18,13)');
Select * From cities, weather Where city = 'A'
But what I get is
ERROR: column reference "city" is ambiguous.
What is wrong with my code?
If I were you I'd model things slightly differently.
To normalise things a little, we'll start with the cities table and make a few changes:
create table cities (
city_id integer primary key,
city_name varchar(100),
location point
);
Note that I've used an integer as the ID and primary key of the table, and stored the name of the city separately. This gives you a nice, easy-to-maintain lookup table. By using an integer as the primary key, we'll also use less space in the weather table when we're storing data.
Create Table weather (
city_id integer,
temp_lo int,
temp_hi int,
prcp real,
record_date date
);
Note that I'm storing the id of the city rather than the name. Also, I've renamed date as it's not a good idea to name columns after SQL reserved words.
Ensure that we use IDs in the test data:
Insert Into weather Values (1, -5, 40, 25, '2018-01-10');
Insert Into weather Values (2, 5, 45, 15, '2018-02-10');
Insert Into cities Values (1,'A', '(12,10)');
Insert Into cities Values (2,'B', '(6,4)');
Insert Into cities Values (3,'C', '(18,13)');
Your old query:
Select * From cities, weather Where city = 'A'
The name was ambiguous because both tables have a city column, and the database engine doesn't know which city you mean (it doesn't automatically know if it needs to use cities.city or weather.city). The query also performs a cartesian product, as you have not joined the tables together.
Using the changes I have made above, you'd require something like:
Select *
From cities, weather
Where cities.city_id = weather.city_id
and city_name = 'A';
or, using newer join syntax:
Select *
From cities
join weather on cities.city_id = weather.city_id
Where city_name = 'A';
The two queries are functionally equivalent. These days most people prefer the second query, as it can prevent mistakes (e.g. forgetting to put the join condition in the WHERE clause).
Both tables, cities and weather, have a column called city. In your WHERE clause you filter on city = 'A'; which table's city is it referring to?
You can tell the engine which one you want to filter by preceding the column with its table name:
Select * From cities, weather Where cities.city = 'A'
You can also refer to tables by alias:
Select *
From cities AS C, weather AS W
Where C.city = 'A'
Most importantly, make sure that you join the tables together, unless you want every record from one table to be matched with every record from the other without any criterion (a Cartesian product). You can join them with an explicit INNER JOIN:
Select
*
From
cities AS C
INNER JOIN weather AS W ON C.city = W.city
Where
C.city = 'A'
In the example you mention, this query is used:
SELECT *
FROM weather, cities
WHERE city = name;
But there, the cities table has a name column (instead of the city column that you used). So that WHERE clause links the weather and cities tables together: city is a weather column and name is a cities column, and there is no ambiguity because the two columns have different names.
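The three cases discussed above (unqualified column, qualified column without a join, qualified column with a join) can be reproduced with a small sketch. It uses Python's sqlite3 module rather than PostgreSQL, so the error text differs slightly and the point column is stored as plain text; those substitutions are assumptions made only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE weather (city TEXT, temp_lo INTEGER, temp_hi INTEGER, prcp REAL, date TEXT);
    CREATE TABLE cities (city TEXT, location TEXT);
    INSERT INTO weather VALUES ('A', -5, 40, 25, '2018-01-10'), ('B', 5, 45, 15, '2018-02-10');
    INSERT INTO cities VALUES ('A', '(12,10)'), ('B', '(6,4)'), ('C', '(18,13)');
""")

# 1. Unqualified column: the engine cannot tell which table's city is meant.
try:
    conn.execute("SELECT * FROM cities, weather WHERE city = 'A'")
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# 2. Qualified column but no join condition: city A is paired with every weather row.
cross_rows = conn.execute(
    "SELECT cities.city, weather.city FROM cities, weather WHERE cities.city = 'A'"
).fetchall()

# 3. Qualified column plus an explicit join: only the matching weather row remains.
joined_rows = conn.execute(
    """
    SELECT c.city, w.temp_lo, w.temp_hi
    FROM cities AS c
    INNER JOIN weather AS w ON c.city = w.city
    WHERE c.city = 'A'
    """
).fetchall()

print(ambiguous_error)
print(len(cross_rows), joined_rows)
```

The second query is the one to watch out for: it runs without error, but silently pairs city A with the weather of city B as well.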

DB: How to set up a many to many table(s) to handle multiple selectable conditions

I am working on a search filter for a website that will help users find a venue (for get-togethers and ceremonies) that meets their needs. Filters would include things such as style, amenities, event type, etc. Multiple options in a category can apply to a venue, so a user can select multiple options from the style, amenities and event type categories when searching.
My issue is in how I should approach the table design in the database. Currently I have a Venue table with a unique id and basic information, and a number of tables representing each category (style, amenities, etc) where they contain an id and name field.
I know that I need an intermediary table to hold foreign keys, so each option applicable to a category is associated to the venue.
Option 1: Create for each category table a many to many intermediary table with foreign keys to that category and the venue.
Option 2: Create one large intermediary table with foreign keys for every category, as well as the Venue
i.e.
fk_venue
fk_style
fk_amenities
...
I am trying to decide which is more efficient and less of a problem to code for. Option 1 would require a query to each table, which may become complicated to work with, whereas option 2 seems easier to query but might need a much larger number of records to handle a venue with many amenities AND event types, for example.
This doesn't seem like a new problem but I have had trouble finding resources that detail how best to approach this. We are currently using MSSQL for the DB and are building the site using .net core.
Go with option one. Create a join table to record the many-to-many relationships of each available feature of a venue. Option 2 is very wasteful in terms of storage. Consider a case where you have a venue with only one amenity, when 50 amenity types are available. Also, as I understand what you are suggesting for option 2, you would have to update your database design each time you add an amenity, event_type, or style. That would be very difficult to support.
In the case of Option 1, some of the tables would be:
Table Name: venue_amenities
Columns: venue_id, amenity_id
Table Name: venue_event_types
Columns: venue_id, event_type_id
Table Name: venue_styles
Columns: venue_id, style_id
When you query everything with a filter, you could query it like:
select distinct
v.venue_id
from venues v
inner join venue_amenities va on v.venue_id = va.venue_id
inner join venue_event_types vet on v.venue_id = vet.venue_id
inner join venue_styles vs on v.venue_id = vs.venue_id
where va.amenity_id in ([selected amenities])
and vet.event_type_id in ([selected event types])
and vs.style_id in ([selected styles])
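As a rough illustration of how that filter query could be built from the user's selections, here is a sketch in Python with sqlite3 standing in for MSSQL. Only two of the join tables are included, the sample rows are made up, and the function name find_venues is hypothetical. The IN lists are filled with bound parameters rather than by pasting values into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE venues (venue_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE venue_amenities (venue_id INTEGER, amenity_id INTEGER,
                                  PRIMARY KEY (venue_id, amenity_id));
    CREATE TABLE venue_styles (venue_id INTEGER, style_id INTEGER,
                               PRIMARY KEY (venue_id, style_id));
    -- Made-up sample data.
    INSERT INTO venues VALUES (1, 'Barn'), (2, 'Loft'), (3, 'Garden');
    INSERT INTO venue_amenities VALUES (1, 10), (1, 11), (2, 10), (3, 12);
    INSERT INTO venue_styles VALUES (1, 20), (2, 21), (3, 20);
""")

def find_venues(conn, amenity_ids, style_ids):
    """Return ids of venues with at least one selected amenity and one selected style.

    Assumes both lists are non-empty.
    """
    amenity_marks = ", ".join("?" for _ in amenity_ids)
    style_marks = ", ".join("?" for _ in style_ids)
    sql = f"""
        SELECT DISTINCT v.venue_id
        FROM venues v
        INNER JOIN venue_amenities va ON v.venue_id = va.venue_id
        INNER JOIN venue_styles vs ON v.venue_id = vs.venue_id
        WHERE va.amenity_id IN ({amenity_marks})
          AND vs.style_id IN ({style_marks})
        ORDER BY v.venue_id
    """
    return [row[0] for row in conn.execute(sql, [*amenity_ids, *style_ids])]

matches = find_venues(conn, [10, 11], [20])
print(matches)
```

Note that IN matches a venue that has any one of the selected options in a category. If a venue must have all selected amenities, the query would need a different shape (for example grouping by venue and counting matches).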
Option 3: You could start out with a meta data design. This would allow you to have multiple records per item or entity.
Often these things evolve as the tasks are developed, as the process matures, and as you learn the data or the customer comes to understand some of the finer details that are drawn out over time.
I've seen similar designs for hashtags or white lists; searching for those might get you closer to what you are looking for. Here is a working example to get you started.
-- Table variables in T-SQL are declared with an @ prefix.
declare @venue as table (
VenueID int identity(1,1) not null primary key clustered
, Name_ nvarchar(255) not null
, Address_ nvarchar(255) null
);
declare @venueType as table (
VenueTypeID int identity(1,1) not null primary key clustered
, VenueType nvarchar(255) not null
);
-- One row per (venue, attribute type, value): the "meta data" table.
declare @venueStuff as table (
VenueStuffID int identity(1,1) not null primary key clustered
, VenueID int not null -- constraint back to venueid
, VenueTypeID int not null -- constraint to dim or lookup table for ... attribute types
, AttributeValue nvarchar(255) not null
);
insert into @venue (Name_)
select 'Bob''s Funhouse';
insert into @venueStuff (VenueID, VenueTypeID, AttributeValue)
select 1, 1, 'Scarrrrry' union all
select 1, 2, 'Food Available' union all
select 1, 3, 'Game tables provided' union all
select 1, 4, 'Creepy';
insert into @venueType (VenueType)
select 'Haunted House Theme' union all
select 'Gaming' union all
select 'Concessions' union all
select 'post apocalyptic';
-- List every attribute of every venue together with its attribute type.
select a.Name_
, b.AttributeValue
, c.VenueType
from @venue a
join @venueStuff b
on a.VenueID = b.VenueID
join @venueType c
on c.VenueTypeID = b.VenueTypeID;

Inserting multiple records in database table using PK from another table

I have DB2 table "organization" which holds organizations data including the following columns
organization_id (PK), name, description
Some organizations have been deleted, so many organization_id values (i.e. rows) no longer exist; the IDs are not continuous like 1, 2, 3, 4, 5... but more like 1, 2, 5, 7, 11, 12, 21...
Then there is another table, "title", with some other data, which contains organization_id from the organization table as an FK.
Now there is some data that I have to insert for all organizations: a title that is going to be shown for all of them in the web app.
In total there is approximately 3000 records to be added.
If I would do it one by one it would look like this:
INSERT INTO title
(
name,
organization_id,
datetime_added,
added_by,
special_fl,
title_type_id
)
VALUES
(
'This is new title',
XXXX,
CURRENT TIMESTAMP,
1,
1,
1
);
where XXXX represents the organization_id, which I should get from the table "organization" so that the insert is done only for existing organization_id values.
So only organization_id changes, matching the organization_id values in the table "organization".
What would be the best way to do it?
I checked several similar questions, but none of them seems to match this one:
SQL Server 2008 Insert with WHILE LOOP
The WHILE loop answer iterates over continuous IDs; the other answer also assumes that the ID is auto-incremented.
Same here:
How to use a SQL for loop to insert rows into database?
Not sure about this one (as question itself is not quite clear)
Inserting a multiple records in a table with while loop
Any advice on this? How should I do it?
If you seriously want a row for every organization record in Title with the exact same data something like this should work:
INSERT INTO title
(
name,
organization_id,
datetime_added,
added_by,
special_fl,
title_type_id
)
SELECT
'This is new title' as name,
o.organization_id,
CURRENT TIMESTAMP as datetime_added,
1 as added_by,
1 as special_fl,
1 as title_type_id
FROM
organization o
;
You shouldn't need the column aliases in the SELECT, but I am including them for readability.
https://www.ibm.com/support/knowledgecenter/ssw_i5_54/sqlp/rbafymultrow.htm
And for good measure, in case your process errors out or whatever, you can also do something like this to insert a record in title only if that combination of organization_id and title does not already exist.
INSERT INTO title
(
name,
organization_id,
datetime_added,
added_by,
special_fl,
title_type_id
)
SELECT
'This is new title' as name,
o.organization_id,
CURRENT TIMESTAMP as datetime_added,
1 as added_by,
1 as special_fl,
1 as title_type_id
FROM
organization o
LEFT JOIN Title t
ON o.organization_id = t.organization_id
AND t.name = 'This is new title'
WHERE
t.organization_id IS NULL
;
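To see that the LEFT JOIN version is safe to re-run, here is a small sketch using Python's sqlite3 module instead of DB2. The tables are cut down to the columns that matter, the sample organization ids mimic the gaps described in the question, and SQLite spells the timestamp keyword CURRENT_TIMESTAMP; all of that is assumed for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organization (organization_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE title (
        title_id INTEGER PRIMARY KEY,
        name TEXT,
        organization_id INTEGER REFERENCES organization(organization_id),
        datetime_added TEXT
    );
    -- Non-continuous ids, as in the question.
    INSERT INTO organization VALUES (1, 'a'), (2, 'b'), (5, 'c'), (7, 'd');
""")

# Insert one title row per organization that does not have this title yet.
INSERT_MISSING = """
    INSERT INTO title (name, organization_id, datetime_added)
    SELECT ?, o.organization_id, CURRENT_TIMESTAMP
    FROM organization o
    LEFT JOIN title t
           ON t.organization_id = o.organization_id
          AND t.name = ?
    WHERE t.organization_id IS NULL
"""

def title_count(conn):
    return conn.execute("SELECT COUNT(*) FROM title").fetchone()[0]

new_title = "This is new title"

conn.execute(INSERT_MISSING, (new_title, new_title))
after_first = title_count(conn)

# Running the same statement again finds nothing left to insert.
conn.execute(INSERT_MISSING, (new_title, new_title))
after_second = title_count(conn)

org_ids = [row[0] for row in conn.execute(
    "SELECT organization_id FROM title ORDER BY organization_id")]
print(after_first, after_second, org_ids)
```

No loop over ids is needed: the SELECT produces exactly one row per existing organization, whatever gaps the id sequence has.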

How to store cascade categories in DB

I have a problem with storing multi-level categorical data. A category can have any cascade depth. I don't think it is a good idea to create more tables with relationships. What is the best way of storing this kind of categorical data?
ex categories:
-MainCategory1
-subcategory1
-subcategory11
-subcategory12
-subcategory13
--subcategory131
-subcategory2
-subcategory21
-subcategory22
-subcategory221
-subcategory23
-subcategory231
-subcategory2311
-MainCategory2
-subcategory21
-subcategory211
-subcategory2131
-subcategory2131
-subcategory212
-subcategory213
-subcategory2131
One common practice would be to create a single table where each category has an id, a name and a parent id (with top categories having parent id of null):
CREATE TABLE categories (
id NUMERIC PRIMARY KEY,
name VARCHAR(100),
parent_id NUMERIC FOREIGN KEY REFERENCES categories(Id)
)
Some of your data, e.g., would look like this:
INSERT INTO categories VALUES (1, 'MainCategory1', null);
INSERT INTO categories VALUES (2, 'subcategory1', 1);
You need to define a parent/child structure:
CREATE TABLE CATEGORIES (ID INT, PARENT_ID INT, NAME VARCHAR(100))
Then you select the categories that have no PARENT_ID:
SELECT * FROM CATEGORIES WHERE PARENT_ID IS NULL
These are the top-level (master) categories. Then, on each layer, you select
SELECT C1.* FROM CATEGORIES C
INNER JOIN CATEGORIES C1 ON C1.PARENT_ID = C.ID
to get the children (C1) of each record C; add a filter on C.ID to restrict the result to the current record.
And then insert into categories
INSERT INTO CATEGORIES
SELECT 1, NULL, 'MainCategory1'
UNION ALL SELECT 10, 1, 'subcategory1'
UNION ALL SELECT 11, 10, 'subcategory11'
UNION ALL SELECT 12, 10, 'subcategory12'
UNION ALL SELECT 13, 10, 'subcategory13'
UNION ALL SELECT 131, 13, 'subcategory131'
UNION ALL SELECT 2, 1, 'subcategory2'
-- ...AND SO ON
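The layer-by-layer walk described above can also be done in a single statement with a recursive common table expression, where the DBMS supports one. This is a different technique from the per-layer join in the answer, shown here only as a sketch; it uses Python's sqlite3 module, the sample rows from the INSERT above, and a hypothetical helper named subtree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES categories(id),
        name TEXT
    );
    INSERT INTO categories VALUES
        (1, NULL, 'MainCategory1'),
        (10, 1, 'subcategory1'),
        (11, 10, 'subcategory11'),
        (12, 10, 'subcategory12'),
        (13, 10, 'subcategory13'),
        (131, 13, 'subcategory131'),
        (2, 1, 'subcategory2');
""")

# Start from one category, then repeatedly add the children of rows found so far.
SUBTREE = """
    WITH RECURSIVE subtree(id, name, depth) AS (
        SELECT id, name, 0 FROM categories WHERE id = ?
        UNION ALL
        SELECT c.id, c.name, s.depth + 1
        FROM categories c
        JOIN subtree s ON c.parent_id = s.id
    )
    SELECT id, name, depth FROM subtree ORDER BY depth, id
"""

def subtree(conn, root_id):
    """Return (id, name, depth) for root_id and all of its descendants."""
    return conn.execute(SUBTREE, (root_id,)).fetchall()

rows = subtree(conn, 10)
for category_id, name, depth in rows:
    print("  " * depth + name)
```

The depth column is computed during the walk, so the table itself still only stores id, parent_id and name.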
SQL Server implements the hierarchyid data type.
You should consider using that.
http://msdn.microsoft.com/en-us/library/bb677173.aspx
This is a long-standing issue with using a relational database to contain hierarchical data. Generally it was done using self-referencing tables but that has always been an absolute bear to work with. I always find it somewhat fitting that the designer of Java noticed that the best hierarchical database in common use was the directory structure of the disks. So that's what Java uses!
Many DBMSs have enhancements to make working with hierarchical data easier. They work (or not) with varying degrees of success and difficulty for the developer. One method I recently worked out as a response to a question here at SO can be seen here. It's not complete. But actions such as moving a node or entire subtree from one place to another would just be a mathematical operation, not readjusting FK pointers.

Unique constraint on Distinct select in Oracle database

I have a data processor that would create a table from a select query.
<_config:table definition="CREATE TABLE TEMP_TABLE (PRODUCT_ID NUMBER NOT NULL, STORE NUMBER NOT NULL, USD NUMBER(20, 5),
CAD NUMBER(20, 5), Description varchar(5), ITEM_ID VARCHAR(256), PRIMARY KEY (ITEM_ID))" name="TEMP_TABLE"/>
and the select query is
<_config:query sql="SELECT DISTINCT ce.PRODUCT_ID, ce.STORE, op.USD ,op.CAD, o.Description, ce.ITEM_ID
FROM PRICE op, PRODUCT ce, STORE ex, OFFER o, SALE t
where op.ITEM_ID = ce.ITEM_ID and ce.STORE = ex.STORE
and ce.PRODUCT_ID = o.PRODUCT_ID and o.SALE_ID IN (2345,1234,3456) and t.MEMBER = ce.MEMBER"/>
When I run that processor, I get a unique constraint error, even though I have a DISTINCT in my SELECT statement.
I tried CREATE TABLE AS (SELECT .....) and it creates the table fine.
Is it possible to get that error? I'm doing a batch execute, so I'm not able to find the individual record.
The select distinct applies to the entire row, not to each column individually. So, two rows could have the same value of item_id but be different in the other columns.
The ultimate fix might be to have a group by item_id in the query, instead of select distinct. That would require other changes to the logic. Another possibility would be to use row_number() in a subquery and select the first row.
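A small sketch of the effect, using Python's sqlite3 module rather than Oracle and a made-up three-column source table: DISTINCT removes the exact duplicate row but keeps two rows that share an item_id, so the primary key is still violated. The GROUP BY variant picks one row per item_id; using MIN to choose the values is an arbitrary assumption, and which row should win is a decision about the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_rows (item_id TEXT, product_id INTEGER, usd REAL);
    INSERT INTO source_rows VALUES
        ('I1', 1, 9.99),
        ('I1', 1, 9.99),   -- exact duplicate: DISTINCT removes it
        ('I1', 1, 12.50),  -- same item_id, different price: DISTINCT keeps it
        ('I2', 2, 5.00);
    CREATE TABLE temp_table (item_id TEXT PRIMARY KEY, product_id INTEGER, usd REAL);
""")

distinct_rows = conn.execute(
    "SELECT DISTINCT item_id, product_id, usd FROM source_rows ORDER BY item_id, usd"
).fetchall()

# Loading the DISTINCT result fails because item_id 'I1' still appears twice.
try:
    conn.execute(
        "INSERT INTO temp_table SELECT DISTINCT item_id, product_id, usd FROM source_rows"
    )
    distinct_error = None
except sqlite3.IntegrityError as exc:
    distinct_error = str(exc)

# One row per item_id: grouping on the key column satisfies the primary key.
conn.execute("""
    INSERT INTO temp_table
    SELECT item_id, MIN(product_id), MIN(usd)
    FROM source_rows
    GROUP BY item_id
""")
loaded = conn.execute(
    "SELECT item_id, usd FROM temp_table ORDER BY item_id"
).fetchall()
print(distinct_rows, distinct_error, loaded)
```

A query like the first SELECT, grouped by item_id with HAVING COUNT(*) > 1, is also a quick way to find the offending records when the batch insert does not report them.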