Normalizing Data and Insert Into SQL - sql

I have what may be a stupid question but here it goes.
I have an ORDER_T table, a CUSTOMER_T table, and a ORDERLINE_T table.
I also have a set of data I need to normalize. Each record in this "bad data" has up to 3 items stored in it in attributes called Item1, Item2, and Item3. I thought I was normalizing it correctly by separating each item into its own record. For example:
ORDER_T
OrderID ItemID ItemDescription CustomerID
1 1001 Apple 100
1 1002 Grape 100
1 1003 Pear 100
OrderID is the PK and CustomerID is the FK. As I tried to INSERT INTO my DB, though, I realized that it complained of duplicate records via the PK. Duh, that makes sense. Now my question is:
I believe I am wrong but what would be the correct way to normalize data (to the third form) where each OrderID consists of multiple items? Is having attributes such as Item1, Item2, Item3, etc. "bad form" where it is not scalable and statically set like that? Am I overthinking it and should have simply left it alone?
I just believe I need some direction and I'll be good to go.

You need the following tables:
all unique customers
customers:
CustomerId (PK)
Name
all unique items
Items:
ItemId (PK)
ItemName
all unique orders:
Orders:
OrderId (PK)
CustomerID (FK)
OrderDate
and then you need a many-to-many relationship table:
OrderItems:
OrderId (FK)
ItemId (FK)
count
primary key (OrderId, ItemId)
Then you will be able to insert an order (which can be empty) and add or remove items from it via the OrderItems table.
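As a rough sketch of how the Item1/Item2/Item3 records could be moved into these tables, the following uses Python's built-in sqlite3 module. The `wide_rows` sample data, the second order, and the `Quantity` column name (the answer calls it count) are assumptions for illustration, not part of the original question.

```python
import sqlite3

# Hypothetical wide records: (order_id, customer_id, Item1, Item2, Item3).
# Each item slot is (item_id, description), or None when the slot is unused.
wide_rows = [
    (1, 100, (1001, "Apple"), (1002, "Grape"), (1003, "Pear")),
    (2, 100, (1001, "Apple"), None, None),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customers (CustomerId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Items     (ItemId INTEGER PRIMARY KEY, ItemName TEXT NOT NULL);
CREATE TABLE Orders (
    OrderId    INTEGER PRIMARY KEY,
    CustomerId INTEGER NOT NULL REFERENCES Customers(CustomerId),
    OrderDate  TEXT
);
CREATE TABLE OrderItems (
    OrderId  INTEGER NOT NULL REFERENCES Orders(OrderId),
    ItemId   INTEGER NOT NULL REFERENCES Items(ItemId),
    Quantity INTEGER NOT NULL DEFAULT 1,   -- "count" in the answer above
    PRIMARY KEY (OrderId, ItemId)
);
""")

for order_id, customer_id, *slots in wide_rows:
    # Customers and items are inserted once; repeats are ignored.
    conn.execute("INSERT OR IGNORE INTO Customers (CustomerId) VALUES (?)",
                 (customer_id,))
    conn.execute("INSERT INTO Orders (OrderId, CustomerId) VALUES (?, ?)",
                 (order_id, customer_id))
    for slot in slots:
        if slot is None:
            continue
        item_id, item_name = slot
        conn.execute("INSERT OR IGNORE INTO Items (ItemId, ItemName) VALUES (?, ?)",
                     (item_id, item_name))
        # One row per (order, item) pair instead of one column per item slot.
        conn.execute("INSERT INTO OrderItems (OrderId, ItemId) VALUES (?, ?)",
                     (order_id, item_id))
conn.commit()

items_in_order_1 = [row[0] for row in conn.execute(
    "SELECT i.ItemName FROM OrderItems oi JOIN Items i ON i.ItemId = oi.ItemId "
    "WHERE oi.OrderId = 1 ORDER BY i.ItemId")]
```

Because the primary key of OrderItems is the pair (OrderId, ItemId), order 1 can hold all three items without the duplicate-key error from the question.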

Related

SQL ecommerce database. how to relate foreign keys in same row?

I am trying to create a SQL table for orders. We have another table that has a primary key of productID. When a customer creates an order, it should list the orderID as well as the total and a foreign key of productID. The issue I have is that it only allows 1 productID.
Is there a way for sql to add multiple foreign keys to the same row for the same item? If that makes sense?
I placed both tables here to try and show what I meant.
Your table structure only allows one product per order, because you've got a single productId column on the orders table.
To allow multiple products per order, I would create an orderItems table. Each orderItem has a different productId, and links back to the orders table via an orderId. Like this:
------------------------
orders table
------------------------
orderId (primary key)
orderDate
orderTotal
customerId (foreign key)
specialInstructions
-------------------------
orderItems table
-------------------------
orderItemId (primary key)
orderId (foreign key)
productId (foreign key)
quantity
-------------------------
products table
-------------------------
productId (primary key)
productTitle
productDescription
productPrice
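A minimal sketch of this three-table layout, using Python's sqlite3 module. The sample products and the choice to store prices as integer cents are assumptions, and the order total is computed from the order items here rather than stored, which is one possible reading of the orderTotal column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE products (
    productId    INTEGER PRIMARY KEY,
    productTitle TEXT NOT NULL,
    productPrice INTEGER NOT NULL          -- assumed to be in cents
);
CREATE TABLE orders (
    orderId   INTEGER PRIMARY KEY,
    orderDate TEXT
);
CREATE TABLE orderItems (
    orderItemId INTEGER PRIMARY KEY,
    orderId     INTEGER NOT NULL REFERENCES orders(orderId),
    productId   INTEGER NOT NULL REFERENCES products(productId),
    quantity    INTEGER NOT NULL
);
-- Hypothetical sample data.
INSERT INTO products VALUES (1, 'Widget', 500), (2, 'Gadget', 2000);
INSERT INTO orders VALUES (10, '2020-01-01');
INSERT INTO orderItems (orderId, productId, quantity)
VALUES (10, 1, 3), (10, 2, 1);
""")

# One order, several products: the total is derived by joining the lines to products.
order_total = conn.execute("""
    SELECT SUM(oi.quantity * p.productPrice)
    FROM orderItems AS oi
    JOIN products AS p ON p.productId = oi.productId
    WHERE oi.orderId = 10
""").fetchone()[0]

# The foreign key stops an order item from pointing at a product that does not exist.
try:
    conn.execute("INSERT INTO orderItems (orderId, productId, quantity) VALUES (10, 99, 1)")
    unknown_product_rejected = False
except sqlite3.IntegrityError:
    unknown_product_rejected = True
```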

How to choose primary key and normalize this relation schema?

Suppose I have a table with following attributes
order id
item id
item quantity
item unit price
item payment
where "item payment = item unit price x item quantity".
Let us simplify the situation, and assume each order has any quantity of just one item id, and different orders may have the same item id.
What is the primary key, "order id", "order id" and "item id", or
something else?
How can it be normalized into 3NF?
Here is a solution that I am thinking:
a table with order id (primary key), item id, item quantity, and item payment
a table with item id (primary key, and foreign key to the previous table), and item unit price.
Continue with the tentative solution I gave in part 2. In the first table, for each item id, item payment is
proportional to item quantity. If the primary key of the first table
is order id, item payment depends on non-primary-key attribute item quantity, which
violates 3NF requirements of no transitivity.
Shall I split the first table into:
a table with order id (primary key), and item id
a table with item id (primary key, and foreign key to the table before), item quantity, and item payment
or into:
a table with order id (primary key) and item id
a table with item unit price (primary key, and foreign key to the original second table), item quantity, and item payment?
Thanks.
You should probably have a table with OrderID as the primary key. This is because you're likely to have attributes that have non-trivial Functional Dependencies on the order (eg. order_date, order_status, CustomerID) that are not dependent on a line in the order.
You should also have a table where ItemID is the primary key. Again, it will have attributes that are functionally dependent on ItemID (e.g. description, price, etc.).
Finally you'd have a third table. This table would have foreign keys to Order and Item. Together these keys form a candidate compound key. You could either use this or create a surrogate primary key OrderItemID. If you do create a surrogate key, I would still be sure to create a unique key on (OrderID, ItemID).
+----------------+                          +----------------+
| OrderID        |                          | ItemID         |
+----------------+                          +----------------+
| CustomerID     |                          | Description    |
| OrderDate      |                          | Price          |
| Status         |                          +----------------+
| Payment        |                                  |
+----------------+                                  |
        |                                           |
        |                                           |
        |       +---------------------+             |
        |       | OrderItemID         |             |
        |       +---------------------+             |
        +-------+ OrderID   FK  U1    +-------------+
                | ItemID    FK  U1    |
                | Quantity            |
                +---------------------+
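To illustrate the surrogate key combined with the unique key (U1 in the diagram), here is a small sketch using Python's sqlite3 module. The plural table names and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                     OrderDate TEXT, Status TEXT);
CREATE TABLE Items  (ItemID INTEGER PRIMARY KEY, Description TEXT, Price INTEGER);
CREATE TABLE OrderItems (
    OrderItemID INTEGER PRIMARY KEY,                 -- surrogate key
    OrderID  INTEGER NOT NULL REFERENCES Orders(OrderID),
    ItemID   INTEGER NOT NULL REFERENCES Items(ItemID),
    Quantity INTEGER NOT NULL,
    UNIQUE (OrderID, ItemID)                         -- natural compound key (U1)
);
-- Hypothetical sample data.
INSERT INTO Orders (OrderID) VALUES (1);
INSERT INTO Items (ItemID, Description, Price) VALUES (1001, 'Apple', 50);
INSERT INTO OrderItems (OrderID, ItemID, Quantity) VALUES (1, 1001, 2);
""")

# Without the UNIQUE constraint the surrogate key alone would accept this row.
try:
    conn.execute("INSERT INTO OrderItems (OrderID, ItemID, Quantity) VALUES (1, 1001, 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Changing the quantity is an UPDATE of the existing line, not a second row.
conn.execute("UPDATE OrderItems SET Quantity = 5 WHERE OrderID = 1 AND ItemID = 1001")
line_count, quantity = conn.execute(
    "SELECT COUNT(*), SUM(Quantity) FROM OrderItems WHERE OrderID = 1").fetchone()
```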
Let's not talk of IDs at the start...
There are orders. Orders usually have an order number that you can have printed on the invoice etc. An order has an order date, and a supplier when this is about orders you place with your suppliers or a client when this is about orders your clients place with you.
There are items that can be ordered. Items have an item number, e.g. a GTIN (Global Trade Item Number). Items have a name and a price or even a price list for different dates, different customers, whatever.
An order can contain several items usually, e.g. 5 pieces of item A and 10 pieces of item B. These are order positions containing item, amount and price.
That could be the tables (primary key bold, unique columns italic):
client (client_number, client_name)
item (item_number, item_name, price)
order (order_number, order_date, client_number)
order_position (order_number, item_number, amount, price)
You would not store single price and amount and total price, as this would be redundant. Avoid redundancy in a database, for this can result in a lot of problems.
You can use technical IDs in your tables and make them the tables' primary keys, but you still have to store all the data mentioned above. What was a primary key before then becomes a column (or set of columns) declared non-nullable and unique, which is effectively the same as a primary key:
Tables (primary key bold, unique columns italic):
client (client_id, client_number, client_name)
item (item_id, item_number, item_name, price)
order (order_id, order_number, order_date, client_number)
order_position (order_position_id, order_id + item_id, amount, price)
This looks like an order line. Typically you'd have a primary key of order id and order line number. item id should be a foreign key from your items table, which should have a price, but unless you never give discounts, your order line should have a price, too.
Having the amount paid against an order line is OK, if you allow partial payments and want to track it at that level.
Guessing via your names & common sense, a 3NF decomposition of your table is
-- order order_id requests item item_id in quantity item_quantity
order_has_item_in_quantity(order_id, item_id, item_quantity)
-- item item_id has unit price item_unit_price
item_has_unit_price(item_id, item_unit_price)
-- some order requests some item in quantity item_quantity
-- and that item has unit price item_unit_price
-- and item_unit_price * item_quantity = item_payment
unit_price_and_quantity_has_payment(item_unit_price, item_quantity, item_payment)
However, if you already have access to a multiplication table (which is a constant), which you do in an SQL query (via operator *), then your design doesn't need column item_payment in the original and consequently its decomposition doesn't have table unit_price_and_quantity_has_payment--it is a certain restriction of the multiplication table; it is a certain function of the multiplication table & the first two tables.
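As a sketch of that last point, item_payment can be derived in a view with the * operator instead of being stored. The example uses Python's sqlite3 module; the view name `order_item_payment`, the sample rows, and prices in integer cents are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_has_unit_price (
    item_id         INTEGER PRIMARY KEY,
    item_unit_price INTEGER NOT NULL       -- assumed to be stored in cents
);
CREATE TABLE order_has_item_in_quantity (
    order_id      INTEGER PRIMARY KEY,     -- one item per order, as the question assumes
    item_id       INTEGER NOT NULL REFERENCES item_has_unit_price(item_id),
    item_quantity INTEGER NOT NULL
);
-- item_payment is computed on demand, so it can never disagree with its inputs.
CREATE VIEW order_item_payment AS
SELECT o.order_id, o.item_id, o.item_quantity, i.item_unit_price,
       o.item_quantity * i.item_unit_price AS item_payment
FROM order_has_item_in_quantity AS o
JOIN item_has_unit_price AS i ON i.item_id = o.item_id;

-- Hypothetical sample data.
INSERT INTO item_has_unit_price VALUES (1, 250);
INSERT INTO order_has_item_in_quantity VALUES (1, 1, 4), (2, 1, 2);
""")

payments = dict(conn.execute(
    "SELECT order_id, item_payment FROM order_item_payment ORDER BY order_id"))
```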
As to my guesses, and relevant CKs (candidate keys), and justification: Normalization uses FDs (functional dependencies), and you haven't mentioned them, so it doesn't seem like you have even a basic idea of what you are doing. So right now your question is just asking for some chapter(s) of some textbook(s). That's too broad--read some. None of the answers here correctly explain or reference how to do this--they are useless for the next case & unjustifiable for this case. Moreover they are all guessing at your specification but should be asking you for appropriate info.

Adding a unique constraint for a relationship 2 degrees away

Consider this database with 4 tables (primary-keys in asterisks):
Products( *ProductId*, SkuText, ... )
ProductRevisions( ProductId, *RevisionId*, ... )
Orders( *OrderId*, ... )
OrderItems( *OrderId*, *ProductRevisionId*, Quantity, ... )
The idea being that a Product SKU can have multiple revisions (e.g. a 2016 version of a product compared to its 2015 version). The business rules are such that an Order for a Product can only have a single ProductRevision, e.g. an order cannot request both the 2014 and 2016 versions of the same product, they can only have the 2014 or 2016 version.
Ordinarily this wouldn't be a problem: the OrderItems table would have a ProductId column with a UNIQUE constraint on OrderId and ProductId. However, because OrderItems references ProductRevisionId (so the reference to the ProductId is indirect), a simple UNIQUE constraint fails and the schema would accept the following data, even though it is invalid as per the business rules:
Products
ProductId, SkuText
1, 'Kingston USB Stick'
ProductRevisions
ProductId, RevisionId, ...
1, 1, '2014 model'
1, 2, '2016 model'
Orders
OrderId
1
OrderItems
OrderId, ProductRevisionId, Quantity
1, 1, 100
1, 2, 50 -- Invalid data! Two revisions of the same Product should not be in the same order.
What I need is something like this:
ALTER TABLE OrderItems
ADD CONSTRAINT UNIQUE ( OrderId, SELECT ProductId FROM ProductRevisions WHERE RevisionId = OrderItems.ProductRevisionId )
I don't want to denormalize my OrderItems table by adding an explicit ProductId column, because that adds a potential point of failure: if the parent/child relationship between a given ProductId and ProductRevisionId were to change, the data would become invalid.
What are my options?
This is really more of a comment, but it is too long.
One option is to create a trigger. This allows you to validate the data using any rules that you want. However, triggers are cumbersome and unnecessary.
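For reference, a trigger for this rule might look roughly like the sketch below. It is written for SQLite through Python's sqlite3 module, so the syntax differs from other engines, and it only guards INSERT (an UPDATE trigger would be needed as well). The trigger name, the `add_item` helper, and the second product are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, SkuText TEXT);
CREATE TABLE ProductRevisions (
    ProductId  INTEGER NOT NULL REFERENCES Products(ProductId),
    RevisionId INTEGER PRIMARY KEY,
    Label      TEXT
);
CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY);
CREATE TABLE OrderItems (
    OrderId           INTEGER NOT NULL REFERENCES Orders(OrderId),
    ProductRevisionId INTEGER NOT NULL REFERENCES ProductRevisions(RevisionId),
    Quantity          INTEGER NOT NULL,
    PRIMARY KEY (OrderId, ProductRevisionId)
);

-- Reject a new row when the order already holds any revision of the same product.
CREATE TRIGGER OrderItems_one_revision_per_product
BEFORE INSERT ON OrderItems
FOR EACH ROW
WHEN EXISTS (
    SELECT 1
    FROM OrderItems AS oi
    JOIN ProductRevisions AS old_rev ON old_rev.RevisionId = oi.ProductRevisionId
    JOIN ProductRevisions AS new_rev ON new_rev.RevisionId = NEW.ProductRevisionId
    WHERE oi.OrderId = NEW.OrderId
      AND old_rev.ProductId = new_rev.ProductId
)
BEGIN
    SELECT RAISE(ABORT, 'order already contains a revision of this product');
END;

INSERT INTO Products VALUES (1, 'Kingston USB Stick'), (2, 'Hypothetical other product');
INSERT INTO ProductRevisions VALUES (1, 1, '2014 model'), (1, 2, '2016 model'),
                                    (2, 3, 'only model');
INSERT INTO Orders VALUES (1);
""")

def add_item(order_id, revision_id, quantity):
    """Return True if the row was inserted, False if the database rejected it."""
    try:
        conn.execute("INSERT INTO OrderItems VALUES (?, ?, ?)",
                     (order_id, revision_id, quantity))
        return True
    except sqlite3.DatabaseError:
        return False

# 2014 model accepted, 2016 model of the same product rejected, other product accepted.
results = [add_item(1, 1, 100), add_item(1, 2, 50), add_item(1, 3, 5)]
```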
Another option is essentially what you say: include both Product and ProductRevision in OrderLines. However, this doesn't quite solve the problem. You need to ensure that the product actually matches the product on the revision.
I am thinking that the best option might be to have a Revision column in ProductRevisions. So, this table would have:
ProductRevisionId -- primary key for the table
ProductId
RevisionId
unique constraint on (ProductId, RevisionId)
The foreign key constraint in OrderLines can then have two columns in it -- (ProductId, RevisionId). Then a unique constraint on (OrderId, ProductId) ensures only one revision.
The downside to this method is that a product can appear on only one line in each order. However, you don't need triggers.
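The composite foreign key idea can be sketched as follows, again with SQLite via Python's sqlite3 module. The sample rows and the `add_line` helper are assumptions; the table and column names follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, SkuText TEXT);
CREATE TABLE ProductRevisions (
    ProductRevisionId INTEGER PRIMARY KEY,
    ProductId  INTEGER NOT NULL REFERENCES Products(ProductId),
    RevisionId INTEGER NOT NULL,
    UNIQUE (ProductId, RevisionId)          -- target of the composite foreign key
);
CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY);
CREATE TABLE OrderLines (
    OrderId    INTEGER NOT NULL REFERENCES Orders(OrderId),
    ProductId  INTEGER NOT NULL,
    RevisionId INTEGER NOT NULL,
    Quantity   INTEGER NOT NULL,
    -- The pair must exist in ProductRevisions, so ProductId cannot drift from its revision.
    FOREIGN KEY (ProductId, RevisionId)
        REFERENCES ProductRevisions (ProductId, RevisionId),
    -- At most one revision of a product per order.
    UNIQUE (OrderId, ProductId)
);
-- Hypothetical sample data.
INSERT INTO Products VALUES (1, 'Kingston USB Stick'), (2, 'Other product');
INSERT INTO ProductRevisions VALUES (1, 1, 1), (2, 1, 2), (3, 2, 7);
INSERT INTO Orders VALUES (1);
""")

def add_line(order_id, product_id, revision_id, quantity):
    """Return True if the line was inserted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO OrderLines VALUES (?, ?, ?, ?)",
                     (order_id, product_id, revision_id, quantity))
        return True
    except sqlite3.IntegrityError:
        return False

first_revision = add_line(1, 1, 1, 100)   # accepted
second_revision = add_line(1, 1, 2, 50)   # rejected by UNIQUE (OrderId, ProductId)
mismatched_pair = add_line(1, 2, 1, 10)   # rejected: product 2 has no revision 1
```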
You can create an indexed view to enforce the constraint. In your case, it'd be something like:
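-- Indexed view exposing one row per (order, product) pair.
-- Indexed views in SQL Server require SCHEMABINDING and two-part table names.
create view [OrderItemProductRevisions]
with schemabinding
as
select oi.OrderId, pr.ProductId
from dbo.OrderItems as oi
join dbo.ProductRevisions as pr
    on oi.ProductRevisionId = pr.RevisionId -- RevisionId is the key of ProductRevisions in the question's schema
go
-- The unique clustered index rejects a second revision of the same product in one order.
create unique clustered index [CUIX_OrderItemProductRevisions]
on [OrderItemProductRevisions] (OrderId, ProductId)
go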
Now, if you try to add two revisions of the same product to the same order, you should violate the unique index on the view and it will be disallowed.

Many-to-Many relationship with same table and with relationship constraint

I have a SellerProduct table. Each row within the table represents product information as offered by a seller. The SellerProduct table has the following columns:
id (serial, pk)
productName (nvarchar(50))
productDescription (ntext)
productPrice (decimal(10,2))
sellerId (int, fk to Seller table)
A product may be the same across sellers, but the productName, productDescription and productPrice can vary per seller.
For example, consider the product TI-89. Seller A may have the following information for the product:
productName = TI-89 Graphing Calc
productDescription = A graphing calculator that...
productPrice 65.12
Seller B may have the following information for the product:
productName = Texas Instrument's 89 Calculator
productDescription = Feature graphing capabilities...
productPrice 66.50
Admin users will be required to identify that products are the same across various sellers.
I need a way to capture this information (i.e. products are the same across sellers). I could create another table called SellerProductMapper as follows:
sellerProductId1 (int, pk, fk to SellerProduct table)
sellerProductId2 (int, pk, fk to SellerProduct table)
The problem with this approach is that it permits sellerProductId1 and sellerProductId2 to be from the same seller for a given row. That should not be allowed.
How can I capture this many-to-many relationship while enforcing this constraint?
You need something that you don't currently have: a "Product Identity" table. If I were designing it, it would have a product ID, Manufacturer's product code, and manufacturer's description. Then the entries in SellerProduct would reference the seller and the product, and you could enforce the constraint with a unique index on the combination of seller and product.
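A minimal sketch of that design, using Python's sqlite3 module. The table name `Product`, its column names, and prices stored as integer cents are assumptions; the seller data comes from the example in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Seller (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- The "product identity" table: one row per real-world product.
CREATE TABLE Product (
    id                      INTEGER PRIMARY KEY,
    manufacturerProductCode TEXT NOT NULL UNIQUE,
    manufacturerDescription TEXT
);
CREATE TABLE SellerProduct (
    id           INTEGER PRIMARY KEY,
    sellerId     INTEGER NOT NULL REFERENCES Seller(id),
    productId    INTEGER NOT NULL REFERENCES Product(id),
    productName  TEXT NOT NULL,
    productPrice INTEGER NOT NULL,           -- assumed to be in cents
    UNIQUE (sellerId, productId)             -- a seller lists a given product once
);
INSERT INTO Seller VALUES (1, 'Seller A'), (2, 'Seller B');
INSERT INTO Product VALUES (1, 'TI-89', 'Graphing calculator');
INSERT INTO SellerProduct (sellerId, productId, productName, productPrice) VALUES
    (1, 1, 'TI-89 Graphing Calc', 6512),
    (2, 1, 'Texas Instrument''s 89 Calculator', 6650);
""")

# "Same product across sellers" is now a plain lookup on productId.
sellers_of_ti89 = [row[0] for row in conn.execute(
    "SELECT s.name FROM SellerProduct AS sp JOIN Seller AS s ON s.id = sp.sellerId "
    "WHERE sp.productId = 1 ORDER BY s.id")]

# The same seller cannot be mapped to the same product twice.
try:
    conn.execute("INSERT INTO SellerProduct (sellerId, productId, productName, productPrice) "
                 "VALUES (1, 1, 'TI-89 duplicate listing', 6000)")
    same_seller_rejected = False
except sqlite3.IntegrityError:
    same_seller_rejected = True
```

With this layout an admin identifies equivalent listings by assigning them the same productId, and no pairwise mapping table is needed.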
You are coming across your issue because you actually have a more serious data problem with how your table design is laid out.
Your id field does not uniquely identify your data. Making sure every column is dependent on this field is paramount to proper normalization. You should never be in the situation where you need a human pair of eyes to identify two different pieces of data which actually represent the same thing. If I had to guess, that id field is probably just an incremented key. Ditch this for a truly unique identifier, such as a composite key of the manufacturer and the manufacturer's serial number, so you know you cannot have two of the same product.
Your sellerID field belongs in a different table entirely. A product is just that... a single entity which represents an object. A seller on the other hand is a separate entity that provides a product for sale. Since a seller can have many products and a product can be sold by many sellers, you need a bridge entity (aka a composite entity) to eliminate the many-to-many relationship. If you split the SellerID info from your product table you will have something like this:
Product Table
serialnumber pk
manufacturer pk
productName
productDescription
SellerProducts Table (bridge entity between product and seller)
sellerID pk
manufacturer pk
serialnumber pk
Price
Seller Table
sellerID pk
Name
Location
Other seller based info, etc...
This information is more normalized with productName and productDescription dependent on the primary key of the Product table and price dependent on the primary key of the SellerProducts table.
Unfortunately, cleaning up your data will most likely prove to be tedious... but unless you address this normalization issue now, your problems will only keep compounding until the database is impossible to maintain.

Storing invoices in a database

I am making a piece of invoicing software and I want it to save each individual invoice.
The user creates invoices by selecting a customer, as well as however many items are being billed to the customer. Seeing as most invoices will have multiple items, what is the best way to save them to the database without being incredibly redundant? I'm willing to rearrange my entire database if need be.
My tables look like this:
Customers Table:
Id / Primary key
FullName
Address
Phone
Items Table (a table of products offered):
Id / Primary key
ItemName
Price
Description
Invoices Table (saved invoices):
Id / Primary key
CustId / Foreign key is Id in Customer table
ItemId / Foreign key is Id in Item table
Notes
You need another table to store invoices (what you call Invoices now actually stores invoice items).
Customer
id
name
etc.
Item
id
name
etc.
Invoice
id
cust_id
date
InvoiceItem
id
inv_id
item_id
This is the classic way of modeling a many to many relationship using a junction table (i.e. InvoiceItem).
It looks like you will actually want a 4th table to join them. To normalize your data, only keep on each line things that are specific to that invoice
Invoices table
Id / Primary key
CustId / Foreign key is Id in Customer table
Notes
Invoice Items table
InvoiceId
ItemId
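One way to save an invoice together with its items is to write both in a single transaction, so a failure leaves no half-saved invoice behind. The sketch below uses Python's sqlite3 module; the `save_invoice` helper, the table names `Invoices` and `InvoiceItems`, and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customers (Id INTEGER PRIMARY KEY, FullName TEXT NOT NULL);
CREATE TABLE Items     (Id INTEGER PRIMARY KEY, ItemName TEXT NOT NULL, Price INTEGER);
CREATE TABLE Invoices (
    Id     INTEGER PRIMARY KEY,
    CustId INTEGER NOT NULL REFERENCES Customers(Id),
    Notes  TEXT
);
CREATE TABLE InvoiceItems (
    InvoiceId INTEGER NOT NULL REFERENCES Invoices(Id),
    ItemId    INTEGER NOT NULL REFERENCES Items(Id),
    PRIMARY KEY (InvoiceId, ItemId)
);
-- Hypothetical sample data.
INSERT INTO Customers VALUES (1, 'Jane Doe');
INSERT INTO Items VALUES (1, 'Consulting', 10000), (2, 'Travel', 2500);
""")

def save_invoice(conn, cust_id, item_ids, notes=""):
    """Insert one invoice header plus one InvoiceItems row per item, atomically."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute("INSERT INTO Invoices (CustId, Notes) VALUES (?, ?)",
                           (cust_id, notes))
        invoice_id = cur.lastrowid
        conn.executemany("INSERT INTO InvoiceItems (InvoiceId, ItemId) VALUES (?, ?)",
                         [(invoice_id, item_id) for item_id in item_ids])
    return invoice_id

invoice_id = save_invoice(conn, 1, [1, 2], notes="Net 30")

# Item 999 does not exist, so the whole invoice (header included) is rolled back.
try:
    save_invoice(conn, 1, [1, 999])
    rolled_back = False
except sqlite3.IntegrityError:
    rolled_back = True

invoice_count = conn.execute("SELECT COUNT(*) FROM Invoices").fetchone()[0]
item_count = conn.execute(
    "SELECT COUNT(*) FROM InvoiceItems WHERE InvoiceId = ?", (invoice_id,)).fetchone()[0]
```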