Not sure if this is the right way to phrase the question, but I have a database table like the one shown below:
ProductA ProgramA
ProductB ProgramB
ProductC ProgramBoth
One issue I'm facing is that when I put this into a dashboard and use the dashboard to filter on only ProgramA, I want to see both Product A and Product C. And when I filter on ProgramB, I would like to see both Product B and Product C. Technically the user can select two of the programs in the dashboard dropdown ("ProgramA" + "ProgramBoth"), but they don't.
Am I pushing the limits of SQL? Is there a way around? As a note, I'm importing this from a Google Sheet, so I can change the underlying values if that's easier. In Google Sheets, I have a dropdown so only one value can be put in at a time (can be changed).
You are not pushing the limits of SQL :-)
What you are describing is a many-to-many relationship. Instead of thinking about a relationship of "both", think about it as one row per relationship: product C gets two rows, one for program A and one for program B. Something along the lines of:
CREATE TABLE Product_Programs
(
Product VARCHAR(10) NOT NULL REFERENCES Products(Product),
Program VARCHAR(10) NOT NULL REFERENCES Programs(Program),
PRIMARY KEY (Product, Program)
);
INSERT INTO Product_Programs (Product, Program)
VALUES ('ProductA', 'ProgramA'),
('ProductB', 'ProgramB'),
('ProductC', 'ProgramA'),
('ProductC', 'ProgramB');
Now you can easily query for any product participating in a program with
SELECT Product
FROM Product_Programs
WHERE Program = 'ProgramA';
Which will return both product A and product C.
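Since the data is imported from a Google Sheet that uses a single ProgramBoth value, one option is to expand that value at query time instead of changing the sheet. A minimal sketch, assuming the import lands in a table called Sheet_Import with columns Product and Program (the table name is hypothetical):

```sql
-- Sheet_Import is a hypothetical name for the table loaded from the Google Sheet.
-- Each 'ProgramBoth' row is expanded into one row per program.
SELECT Product, 'ProgramA' AS Program
FROM Sheet_Import
WHERE Program IN ('ProgramA', 'ProgramBoth')
UNION ALL
SELECT Product, 'ProgramB' AS Program
FROM Sheet_Import
WHERE Program IN ('ProgramB', 'ProgramBoth');
```

The result has the same shape as Product_Programs, so it could feed the dashboard directly or be used to populate that table.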
HTH
I've got a problem with an Asset Database that I have been developing for a customer in MSSQL. It entails capturing Required Actions, for example Lifting Equipment at a specific location needs to be inspected 6 months after purchase. The Due Dates for these required actions can be calculated in different ways but to simplify here will be calculated based on their Purchase Date.
So to that end I have a table called tblActionsRequired that contains the following relevant fields:
ActionID - for the action required
EquipmentCategoryID or EquipmentTypeID or EquipmentID - one of these three fields is required. With this they specify that an action is required for either a category of equipment, an equipment type, or a specific piece of equipment. So an example would be that a 2kg Powder Fire Hydrant would be an equipment type, it would fall into the category Fire Safety Equipment, and there might be a specific 2kg Powder Fire Hydrant with an asset number of, say, PFH2KG001.
BasedAtID - the company's branches or sites
Ideally what I'd like to do is keep as much as possible in one query as opposed to creating separate queries or views for every combination and then adding them together using UNIONs. I have several other similar fields by which these required actions can be segmented so it may seem simple enough here to just use unions but I've calculated I would need to cater for 48 different combinations and probably create a View for each and then UNION them together!
So next I have tblEquipment that contains the following relevant keys:
EquipmentID - the primary key
EquipmentTypeID - foreign key, which Equipment Type this asset is a member of
BasedAtID - foreign key, which site the asset is located at
The Equipment Types then belong to Equipment Categories and the Categories then allow building a tree structure with parent-child relationships, but these I think I have sufficiently taken care of in creating a view called vwCategoryTree with the following fields:
ParentCategoryID
EquipmentTypeID
This view has been tested and checks out fine, it cuts through the tree structure and allows you to perform joins between EquipmentTypeID and their ultimate parents with EquipmentCategoryID.
What I need help with is how to do some sort of conditional join between tblActionsRequired and tblEquipment based on which of the fields EquipmentCategoryID, EquipmentTypeID, or EquipmentID have a value. If only EquipmentID or EquipmentTypeID could be specified then I think this would work:
ON (tblActionsRequired.EquipmentID IS NOT NULL AND tblEquipment.EquipmentID = tblActionsRequired.EquipmentID) OR (tblActionsRequired.EquipmentTypeID IS NOT NULL AND tblActionsRequired.EquipmentTypeID = tblEquipment.EquipmentTypeID)
But how do I bring a third table into this join to cater for EquipmentCategoryID or at least avoid having to use a UNION?
Sorry if something doesn't make sense, please just ask! Thank you very much!
One possible approach:
select ...
from tblEquipment e
left join vwCategoryTree c on e.EquipmentTypeID = c.EquipmentTypeID
join tblActionsRequired r
on (e.EquipmentID = r.EquipmentID or
e.EquipmentTypeID = r.EquipmentTypeID or
c.ParentCategoryID = r.EquipmentCategoryID)
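To make the approach above a little more concrete, here is a hedged sketch with a select list filled in. The selected columns are illustrative, and the DISTINCT is an assumption: it guards against duplicate rows if vwCategoryTree returns more than one row per equipment type (for example, one per ancestor category).

```sql
-- Sketch only: the selected columns are illustrative.
SELECT DISTINCT
       r.ActionID,
       e.EquipmentID,
       e.BasedAtID
FROM tblEquipment AS e
LEFT JOIN vwCategoryTree AS c
       ON c.EquipmentTypeID = e.EquipmentTypeID
JOIN tblActionsRequired AS r
       ON r.EquipmentID = e.EquipmentID
       OR r.EquipmentTypeID = e.EquipmentTypeID
       OR r.EquipmentCategoryID = c.ParentCategoryID;
```

No explicit IS NOT NULL checks are needed, because a comparison with NULL never evaluates to true; an unset field in tblActionsRequired simply does not match.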
Before I ask more questions about the coding, I'd like to first figure out the best method to follow for building my database. I'm running into a problem with how to structure it to keep everything minimal, and due to its nature I have lots of recurring data that I have to represent.
I design custom shirts and have a variety of different types of shirts for people to choose from that are available in both adult and child sizes of both genders. For example, I have crewneck shirts, raglan sleeves, ringer sleeves and hoodies which are available for men, women, boys, girls and toddlers. The prices are the same for each shirt from the toddler sizes up to 1x in the adult sizes, then 2x, 3x, 4x and 5x are each different prices. Then there's the color options for each kind of shirt which varies, some may have 4 color options, some have 32.
So let's take just the crewneck shirts as an example. Men s-1x, Women s-1x, Boys xs-1x, girls xs-1x and toddlers NB-18months is a total of 22 rows that will be represented in the table and are all the same price. 2X and up only apply to men and women, so that's 8 more rows, which makes 30 rows total for just the crewneck shirts. When it gets into the color options, there are 32 different colors available for them. If I were to do each and every size for all of them, that would be 960 total rows just for the crewneck shirts alone, with mainly HIGHLY repeated data for just one minor change.
I thought about it and figured it's best to treat these items in the table as actual items in a stock room, because THEY'RE REALLY THERE in the stock room... you don't have just one box of shirts where you can punch a button on the side to turn it to any size or color; you have to deal with the actual shirt and the tedious task of putting it somewhere. So I decided against trying to get outrageous with a bunch of foreign keys and indexes. Besides, that gets just as tedious, and you wind up having to represent just as much but with a lot more tables, when you could've just put the data it's linking to in the first table.
If we take just the other 3 kinds of shirts and apply that same logic with all the colors and sizes, just for those 4 shirts alone there will be 3,840 rows; with the other shirts I'm not counting, you could say I'm looking at roughly 10,000 rows of data all in one table. This data will be growing over time and I'm wondering what it might turn into trying to keep it all organized. So I figured maybe the best logic to go with would be to break it down like they do in an actual retail store, which is to separate the departments into men, women, boys, girls and babies. That way I have 5 separate tables that are only called when the user decides to "go to that department", so if there's a man who wants the men's shirts, he doesn't have 7,000+ rows of extra data present that doesn't even apply to what he's looking for.
Would this be a better way of setting it up? Or would it be better to keep it all as one gigantic table and just query the "men" shirts in the PHP from the table in the section for men, and the same with women and kids?
My next issue is all the color options that may be available. As I said before, some shirts will have as few as 4 and some will have as many as 32, so some of those are enough data to form a table all on their own, and I could really have a separate table for every kind of shirt. I'll be using a query in PHP to populate my items from the tables so I don't have to code so much in the HTML and JavaScript. That way I can set it to SELECT * FROM table WHERE type = 'men' and it will take all the men's shirts and auto-populate the coding for each one. That way, as I add things to and remove things from the tables, it'll automatically be updated. I already have an idea for HOW I'm going to do that, but I can only think so far into it because I haven't decided on a good way to set the tables up, which is what I'd have to structure it to call from.
For example, having all the color options of each shirt on the same table versus having them broken down with foreign keys linking to other tables to represent them would be two totally different ways of having to call it forth, so I'm stuck on this and don't really know where to go with it. Any suggestions?
Typically retail organization is by SKU (stock keeping unit). Department and color are attributes of a garment, not the way you identify the garment for the purpose of accounting or stocking.
CREATE TABLE Skus (
sku BIGINT UNSIGNED PRIMARY KEY,
description TEXT,
department VARCHAR(10) NOT NULL,
color VARCHAR(10) NOT NULL,
qty_in_stock INT UNSIGNED NOT NULL DEFAULT 0,
unit_price NUMERIC(9,2) NOT NULL,
FOREIGN KEY (department) REFERENCES Departments(department),
FOREIGN KEY (color) REFERENCES Colors(color)
);
This is better than splitting into five tables, because:
- You can quickly get a sum of the total value of all your stock.
- You can switch the department of a given SKU easily.
- When someone buys a few garments, their order lineitems reference a single table instead of five different tables (that would be invalid for a relational database).
- There are lots of other examples of tasks that are easier if similar entities are stored in one table.
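The first two points can be sketched against the Skus table above. The SKU number and the department value are made-up examples, and the department is assumed to already exist in Departments:

```sql
-- Total value of everything in stock, computed from the single Skus table.
SELECT SUM(qty_in_stock * unit_price) AS total_stock_value
FROM Skus;

-- Move one SKU to another department.
-- 100234 and 'Boys' are made-up values for illustration.
UPDATE Skus
SET department = 'Boys'
WHERE sku = 100234;
```

With five per-department tables, the first query would need a UNION over all of them, and the second would become a delete from one table plus an insert into another.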
I know you don't want to break it out into separate tables, but I think going the multiple table route would be the best. However, I don't think it is as bad as you think. My suggestion would be the following. Obviously, you want to change the names of the fields, but this is a quick representation:
Shirts
- id (primary key)
- description
- men (Y/N)
- women (Y/N)
- boy (Y/N)
- girl (Y/N)
- toddlers (Y/N)
Sizes
- id (primary key)
- shirt_id (foreign key)
- Size
Colors
- id (primary key)
- shirt_id (foreign key)
- Color
Price
- id (primary key)
- shirt_id (foreign key)
- size_id (foreign key)
- price
Having these four tables means you won't have to store and maintain all 10,000 rows in one single table, but the data is still all there. Keeping your data separated into its proper places avoids needlessly duplicating information.
Want to pull all men's shirts?
SELECT * FROM shirts WHERE men = 'Y'
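Going one step further, the size and price of each men's shirt could be pulled with two joins. This is a sketch that assumes the table and column names from the outline above and that the Y/N flags are stored as 'Y' and 'N':

```sql
-- One row per men's shirt and size, with its price.
-- Table and column names follow the outline above; the 'Y' flag value is an assumption.
SELECT s.description,
       z.Size,
       p.price
FROM Shirts AS s
JOIN Sizes  AS z ON z.shirt_id = s.id
JOIN Price  AS p ON p.shirt_id = s.id
                AND p.size_id  = z.id
WHERE s.men = 'Y';
```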
To be honest, you should really have at least 5 or 6 tables. One/two containing the labels for sizes and colors (either one table containing all, or one for each one) and the other 4 containing the actual data. This will keep your data uniform across everything (example: Blue vs blue). You know what they say, there is more than one way to skin a cat.
You need to think about a database term called 'normalization'. Normalization means that everything has its place in the database and should not be listed twice but reused as needed. The most common mistake people make is to not ask or think about what will happen down the road: they put up a database that has next to no normalization, consumes massive amounts of memory due to large datatypes, has no seeding done, is completely inflexible, and comes at a great cost to change later because it was made without thinking of the future.
There are many levels of normalization, but rather than list them, here is a simple example that explains some basic concepts you can apply to larger things later. This assumes you have access to SQL Server Management Studio (SSMS); however, if you are using MySQL or Oracle the principles are still very similar, and the comments show what I am getting at. If you have SSMS, you can run this example yourself: just paste it in and hit F5. If you don't, just read the comments, although these concepts are easier to see in action than to envision.
Declare @Everything table (PersonID int, OrderID int, PersonName varchar(8), OrderName varchar(8) );
insert into @Everything values (1, 1, 'Brett', 'Hat'),(1, 2, 'Brett', 'Shirt'),(1, 3, 'Brett', 'Shoes'),(2,1,'John','Shirt'),(2,2,'John','Shoes');
-- very basic normalization level in that I did not even ATTEMPT to separate entities into different tables for reuse.
-- I just insert EVERYTHING as I get it in one place. This is great for just getting off the ground or testing things,
-- but in the future you won't be able to change this easily as everything is here and if there is a lot of data it is hard
-- to move it. If you keep adding more and more columns, inserts will get slower as they require memory
-- for the rows and the columns
Select Top 10 * from @Everything
declare @Person table ( PersonID int identity, PersonName varchar(8));
insert into @Person values ('Brett'),('John');
declare @Orders table ( OrderID int identity, PersonID int, OrderName varchar(8));
insert into @Orders values (1, 'Hat'),(1,'Shirt'),(1, 'Shoes'),(2,'Shirt'),(2, 'Shoes');
-- I now have tables storing two logical things in two logical places. If I want to relate them I can use the TSQL language
-- to do so. I am now using less memory for storage of the individual tables and if one or another becomes too large I can
-- deal with them in isolation. I also have a seeding record (an ever increasing number) that I could use as a primary key to
-- relate row position and for faster indexing
Select *
from @Person p
join @Orders o on p.PersonID = o.PersonID
declare @TypeOfOrder table ( OrderTypeID int identity, OrderType varchar(8));
insert into @TypeOfOrder values ('Hat'),('Shirt'),('Shoes')
declare @OrderBridge table ( OrderID int identity, PersonID int, OrderType int)
insert into @OrderBridge values (1, 1),(1,2),(1,3),(2,2),(2,3);
-- Wow, I have a lot more columns, but my ability to expand is now pretty flexible. I could add even MORE products to the bridge table
-- or other tables I have not even thought of yet. Now that I have a bridge table I have to list a product type ONLY once ever and
-- then when someone orders it again I just add a bridge row to relate a person to an order, hence the name bridge, as on its own
-- it serves no purpose but relating two different things to each other. This method takes more time to set up but in the end you need
-- fewer rows in your database overall as you are REUSING data efficiently and effectively.
Select Top 10 *
from @Person p
join @OrderBridge o on p.PersonID = o.PersonID
join @TypeOfOrder t on o.OrderType = t.OrderTypeID
I'm developing a program in Visual Studio. It's a fairly simple menu system for a hypothetical restaurant: it presents users with a series of forms with questions and eventually gives them options; they can then build up a list of meal items etc. and be given the cost and similar.
Information on the products, such as the name, price, calories etc., is stored in an SQL database.
I've had a look around, but I'm struggling to find a way (if it's even possible) of referencing a specific field or row within the database, either by a name or a key.
Is this possible?
Regards.
I think all you are asking about is a Primary Key?
Example Table
Product
------------
ProductId|Name |Cost|
1 |Apple |1.50|
2 |Orange|2.00|
Example Query To Pull Only the Apple
SELECT Name, Cost
FROM Product
WHERE ProductId = 1
How to create a table that will automatically generate these unique keys in SQL Server
CREATE TABLE Product (ProductId INT IDENTITY(1,1) PRIMARY KEY, Name VARCHAR(50),
Cost DECIMAL(9,2))
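How data would be inserted to make use of the automatic key generation, followed by a way to get the new identity value (SCOPE_IDENTITY() is generally safer than @@IDENTITY, which can also pick up identities generated by triggers)
INSERT INTO Product (Name, Cost) VALUES ('Apple', 1.50)
SELECT SCOPE_IDENTITY() AS NewProductId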
Hopefully that gives you a push in the right direction?
I am wondering whether it is more useful and practical (in terms of DB size) to create multiple tables in SQL with two columns each (one column containing a foreign key and one column containing some data), or to merge them and create one table containing multiple columns. I am asking this because in my scenario one product holding a primary key could have sufficient/applicable data for only one column while the other columns would be empty.
example a. one table
productID productname weight no_of_pages
1 book 130 500
2 watch 50 null
3 ring null null
example b. three tables
productID productname
1 book
2 watch
3 ring
productID weight
1 130
2 50
productID no_of_pages
1 500
The multi-table approach is more "normal" (in database terms) because it avoids columns that commonly store NULLs. It's also something of a pain in programming terms because you have to JOIN a bunch of tables to get your original entity back.
I suggest adopting a middle way. Weight seems to be a property of most products, if not all (indeed, a ring has a weight even if small and you'll probably want to know it for shipping purposes), so I'd leave that in the Products table. But number of pages applies only to a book, as do a slew of other unmentioned properties (author, ISBN, etc). In this example, I'd use a Products table and a Books table. The Books table would extend the Products table in a fashion similar to class inheritance in object-oriented programming.
All book-specific properties go into the Books table, and you join only Products and Books to get a complete description of a book.
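A minimal sketch of that Products/Books split follows. The column types are assumptions, and author and isbn are illustrative columns taken from the properties mentioned above, not from the question's tables:

```sql
-- Shared properties live in Products.
CREATE TABLE Products (
    productID   INT PRIMARY KEY,
    productname VARCHAR(50) NOT NULL,
    weight      INT                      -- NULL only while the weight is unknown
);

-- Book-only properties live in Books, keyed by the same productID.
CREATE TABLE Books (
    productID   INT PRIMARY KEY REFERENCES Products(productID),
    no_of_pages INT NOT NULL,
    author      VARCHAR(100),            -- illustrative column
    isbn        VARCHAR(20)              -- illustrative column
);

-- A complete description of every book needs only one join.
SELECT p.productID, p.productname, p.weight, b.no_of_pages
FROM Products AS p
JOIN Books AS b ON b.productID = p.productID;
```

Because Books shares its primary key with Products, each product can have at most one Books row, which mirrors the "is-a" relationship of a subclass.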
I think this all depends on how the tables will be used. Maybe your examples are oversimplifying things too much but it seems to me that the first option should be good enough.
You'd really use the second example if you're going to be doing extremely CPU intensive stuff with the first table and will only need the second and third tables when more information about a product is needed.
If you're going to need the information in the second and third tables most times you query the table, then there's no reason to join over every time and you should just keep it in one table.
I would suggest example a if there is a defined set of attributes for a product, and an example c if you need a variable number of attributes (new attributes keep coming every now and then) -
example c
productID productName
1 book
2 watch
3 ring
attrID productID attrType attrValue
1 1 weight 130
2 1 no_of_pages 500
3 2 weight 50
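Reading example c back as one row per product takes a pivot. A sketch, assuming the two tables above are named Products and ProductAttributes (both names are assumptions):

```sql
-- Pivot the attribute rows back into one row per product.
-- Products and ProductAttributes are assumed names for the two tables above.
SELECT p.productID,
       p.productName,
       MAX(CASE WHEN a.attrType = 'weight'      THEN a.attrValue END) AS weight,
       MAX(CASE WHEN a.attrType = 'no_of_pages' THEN a.attrValue END) AS no_of_pages
FROM Products AS p
LEFT JOIN ProductAttributes AS a ON a.productID = p.productID
GROUP BY p.productID, p.productName;
```

The LEFT JOIN keeps products such as the ring that have no attribute rows at all; their pivoted columns come back as NULL. Each new attribute type needs another CASE expression, which is the price of the flexible layout.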
The table structure you have shown in example b is not normalized - there will be separate id columns required in second and third tables, since productId will be an fk and not a pk.
It depends on how many rows you are expecting in your PRODUCTS table. I would say that it would not make sense to normalize your tables to 3NF in this case because product name, weight, and no_of_pages each describe the products. If you had repeating data such as manufacturers, it would make more sense to normalize your tables at that point.
Without knowing the background (data model), there is no way to tell which variant is more "correct". Both are fine in certain scenarios.
You want three tables, full stop. That's best because there's no chance of watches winding up with pages (no pun intended) and some books without. If you normalize, the server works for you. If you don't, you do the work instead, just not as well. Up to you.
"I am asking this because in my scenario one product holding primary key could have sufficient/applicable data for only one column while other columns would be empty."
That's always true of nullable columns. Here's the rule: a nullable column has an optional relationship to the key. A nullable column can always be, and usually should be, in a separate table where it can be non-null.
If I were to have an online shopping website that sold apples and monitors and these were stored in different tables because the distinguishing property of apples is colour and that of monitors is resolution how would I add these both to an invoice table whilst still retaining referential integrity and not unioning these tables?
Invoices(InvoiceId)
|
InvoiceItems(ItemId, ProductId)
|
Products(ProductId)
| |
Apples(AppleId, ProductId, Colour) Monitors(MonitorId, ProductId, Resolution)
In the first place, I would store them in a single Products table, not in two different tables.
In the second place, (unless each invoice was for only one product) I would not add them to a single invoice table - instead, I would set up an Invoice_Products table, to link between the tables.
I suggest you look into Database Normalisation.
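The linking table might look something like the sketch below. It assumes Invoices(InvoiceId) and a single Products(ProductId) table already exist; the Quantity column and the invoice number are illustrative additions:

```sql
-- One row per product on an invoice.
-- Quantity is an assumed extra column, not part of the question.
CREATE TABLE Invoice_Products (
    InvoiceId INT NOT NULL REFERENCES Invoices(InvoiceId),
    ProductId INT NOT NULL REFERENCES Products(ProductId),
    Quantity  INT NOT NULL DEFAULT 1,
    PRIMARY KEY (InvoiceId, ProductId)
);

-- All products on one invoice (42 is a made-up invoice number).
SELECT ip.ProductId, ip.Quantity
FROM Invoice_Products AS ip
WHERE ip.InvoiceId = 42;
```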
A question for your data model is: what reference scheme will you use to identify products? Maybe SKU?
Then identify each apple as a product by assigning an SKU. Likewise for monitors. Then use the SKU in the invoice item. Something like this:
product {sku}
key {sku};
invoice_item {invoice_id, sku}
key {invoice_id, sku} ;
apple {color, sku}
key {color}
key {sku};
monitor {size, sku}
key {size}
key {sku};
with appropriate constraints... in particular, the union of apple {sku} and monitor {sku} == product {sku}.
So Invoice table has a ProductID FK, and a ProductID can be either an AppleID (PK color) or MonitorID (PK resolution)?
If so, you can introduce a ProductTypeID with values like 0=apple, 1=monitor, or a isProductTypeApple boolean if there's only ever going to be 2 product types, and include that in the ProductID table PK.
You also need to include the ProductTypeID field in the Apple table and Monitor table PK.
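A sketch of how that type column could be wired up is shown below. The names, types and type codes are assumptions, and the UNIQUE constraint on ProductId is an addition of this sketch (it stops the same product from being registered under both types):

```sql
-- Parent table: the type is part of the key so children can reference it.
CREATE TABLE Products (
    ProductId     INT NOT NULL UNIQUE,     -- UNIQUE is this sketch's addition
    ProductTypeId INT NOT NULL,            -- 0 = apple, 1 = monitor
    PRIMARY KEY (ProductId, ProductTypeId)
);

-- An Apples row can only point at a product whose type is 0.
CREATE TABLE Apples (
    ProductId     INT NOT NULL,
    ProductTypeId INT NOT NULL CHECK (ProductTypeId = 0),
    Colour        VARCHAR(20) NOT NULL,
    PRIMARY KEY (ProductId, ProductTypeId),
    FOREIGN KEY (ProductId, ProductTypeId)
        REFERENCES Products (ProductId, ProductTypeId)
);

-- A Monitors row can only point at a product whose type is 1.
CREATE TABLE Monitors (
    ProductId     INT NOT NULL,
    ProductTypeId INT NOT NULL CHECK (ProductTypeId = 1),
    Resolution    VARCHAR(20) NOT NULL,
    PRIMARY KEY (ProductId, ProductTypeId),
    FOREIGN KEY (ProductId, ProductTypeId)
        REFERENCES Products (ProductId, ProductTypeId)
);
```

The invoice item table then references Products only, so referential integrity holds without a UNION over Apples and Monitors.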
I like name-value tables for these... It might be easier to redesign so it goes 'Product' and then 'product details'. Product details holds the product id, the detail type and then the value. This would allow you to hold apples and monitors in the same table regardless of identifying attribute (and leave it open for other products to be added later on).
A similar approach can be taken in the invoice table: have a 'product_type' column that tells you which table to look into (apple or monitor) and then a 'product_id' that references whatever ID column is in the apple/monitor table. Querying a setup like this is a bit difficult and may force you to use dynamic SQL... I'd only take this route if you have no control over doing the redesign above (which other answers posted here also refer to).
The first solution is preferable, I would think... change the design of this db to name-value pairs for the products and you'll save headaches writing your queries later.