SQL Server constraints about Relations between tables

I have been working on a SQL Server project that allows the users of a shopping website to insert their reviews for the product they bought.
Basically, I have 4 tables:
Customer: (Customer_ID, Username, Telephone_number, Grade)
Product: (Product_ID, Product_code, Name)
Review: (Review_ID, Title, Content, Product_ID, Customer_ID)
Bill: (Bill_ID, Date, Product_ID, Customer_ID)
I've got two problems:
Firstly, I don't know how to enforce that only people who bought a product can review it.
Secondly, I don't know how to increase the grade in the Customer table by a certain number of points (bonus points) after they review a product.
Can anyone tell me how to solve these problems, especially in SQL Server code?

There are several ways to keep such records out of your Review table, and it is best to handle this in your application code (server-side or client-side). As a matter of database design, though, I think the best option is to design your Review table like this:
(Review_ID, Title, Content, Bill_ID)
and set the Bill_ID column to NOT NULL with a foreign key to Bill, so that every review record must relate to a bill (purchase) record. Your code can then catch the resulting error and warn the user.
Also, if the grade depends only on reviewing, you can compute the bonus (grade) from the number of reviews, so the grade would be something like:
SELECT Customer_ID, 5*COUNT(*) AS Grade -- for example, two reviews = 10 bonus points
FROM Review
GROUP BY Customer_ID
(With the Bill_ID-only design above, join Review to Bill to get Customer_ID.)
Once more, I suggest handling all of this in your code, not in your DB.
Another suggestion (if the logic and business rules of your application live in the database, which for a shopping website they usually do not) is to create a stored procedure for the INSERT operation, such as usp_ReviewInsert, and call it from your code when a user wants to post a review. The stored procedure then handles all the validation (such as the relation between Bill and Review), all the updates (such as raising the grade), and the insert itself.
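The design above can be sketched in runnable form. This is only an illustration under stated assumptions: SQLite (through Python's sqlite3 module) stands in for SQL Server, the 5-point bonus comes from the example query, and the helper names make_db, add_review and grade are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server here; table and column names follow the
# question, while the 5-point bonus and the helper names are assumptions.
def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript("""
        CREATE TABLE Customer (Customer_ID INTEGER PRIMARY KEY, Username TEXT);
        CREATE TABLE Product  (Product_ID  INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Bill (
            Bill_ID     INTEGER PRIMARY KEY,
            Product_ID  INTEGER NOT NULL REFERENCES Product(Product_ID),
            Customer_ID INTEGER NOT NULL REFERENCES Customer(Customer_ID)
        );
        CREATE TABLE Review (
            Review_ID INTEGER PRIMARY KEY,
            Title     TEXT,
            Content   TEXT,
            Bill_ID   INTEGER NOT NULL REFERENCES Bill(Bill_ID)
        );
        INSERT INTO Customer VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO Product  VALUES (10, 'Kettle');
        INSERT INTO Bill     VALUES (100, 10, 1);  -- only alice bought the kettle
    """)
    return conn

def add_review(conn, title, content, bill_id):
    """Insert a review; return False if it does not point at a real bill."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Review (Title, Content, Bill_ID) VALUES (?, ?, ?)",
                (title, content, bill_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def grade(conn, customer_id):
    """Grade derived from reviews: 5 points per review, found via Bill."""
    row = conn.execute(
        """SELECT 5 * COUNT(*)
           FROM Review r JOIN Bill b ON b.Bill_ID = r.Bill_ID
           WHERE b.Customer_ID = ?""",
        (customer_id,),
    ).fetchone()
    return row[0]
```

Note that the constraint only guarantees that a review points at some existing bill; checking that the bill belongs to the logged-in user is still up to the application code or the stored procedure.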

Related

Visual Studio reference specific (SQL) database item

I'm developing a program in Visual Studio. It's a fairly simple menu system for a hypothetical restaurant: it provides the users with a series of forms with questions and eventually gives them options; they can then build up a list of meal items etc. and be given the cost and similar.
Information on the products, such as the name, price and calories, is stored in an SQL database.
I've had a look around, but I'm struggling to find a way (if it's even possible) of referencing a specific field or row within the database, either by a name or a key.
Is this possible?
Regards.

I think all you are asking about is a Primary Key?
Example Table
Product
---------------------
ProductId|Name  |Cost
1        |Apple |1.50
2        |Orange|2.00
Example Query To Pull Only the Apple
SELECT Name, Cost
FROM Product
WHERE ProductId = 1
How to create a table that will automatically generate these unique keys in SQL Server
CREATE TABLE Product (ProductId INT IDENTITY(1,1) PRIMARY KEY, Name VARCHAR(50),
Cost DECIMAL(10,2))
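How data would be inserted to use the automatic key generation, followed by a way to get the new identity value:
INSERT INTO Product (Name, Cost) VALUES ('Apple', 1.50)
SELECT @@IDENTITY -- SCOPE_IDENTITY() is usually safer, since @@IDENTITY can pick up identities generated by triggers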
Hopefully that gives you a push in the right direction?
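As a runnable illustration of the same idea, here is a minimal sketch that uses SQLite through Python's sqlite3 module as a stand-in for SQL Server. INTEGER PRIMARY KEY plays the role of IDENTITY(1,1) and cursor.lastrowid the role of the identity lookup; the helper names add_product and get_product are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server: INTEGER PRIMARY KEY auto-generates keys
# much like IDENTITY(1,1), and cursor.lastrowid returns the generated key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Product (ProductId INTEGER PRIMARY KEY, Name TEXT, Cost REAL)"
)

def add_product(name, cost):
    """Insert a product and return the key the database generated for it."""
    cur = conn.execute(
        "INSERT INTO Product (Name, Cost) VALUES (?, ?)", (name, cost)
    )
    return cur.lastrowid

def get_product(product_id):
    """Fetch exactly one row by its primary key (None if it does not exist)."""
    return conn.execute(
        "SELECT Name, Cost FROM Product WHERE ProductId = ?", (product_id,)
    ).fetchone()

apple_id = add_product("Apple", 1.50)
orange_id = add_product("Orange", 2.00)
```

Calling get_product(apple_id) then returns the Apple row only, which is the "reference a specific row by key" behaviour the question asks about.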

How do I tie a table to a value (for example an ID) in SQL?

These are actually two questions relating to an online bookstore.
I have a table for customers in SQL, and it has all this customer info, including a customer ID. Then I have a table full of books on sale. Lastly, I have a table for a shopping cart.
Now the shopping cart is going to be a table very similar to the books table, only it will have fewer items since it'll contain some subset of the book table's contents.
I want the entire cart to be tied to a single customer's ID, and I want every entry in that cart to come directly from the Books table.
How do I go about defining such a table? I mean, what statements do I need?
As it is, I'm confused about the issue, because the entire cart table is essentially an attribute of a single customer, but I have no idea how to represent that in SQL. I want to be able to look up the cart table using the customer ID, basically.
Any help would be greatly appreciated.

You basically need these four tables:
create table books (book_id int, name varchar(200), author varchar(200));
create table customers (customer_id int, name varchar(200));
create table carts (cart_id int, customer_id int);
create table cart_details (cart_id int, line_number int, book_id int, qty int, price numeric(18,2));
In essence, you will store objects like this:
Books into books table
Customers into customers table
Carts into carts and cart_details table. carts will represent a cart for a certain customer, and cart_details will represent cart's content.
Whenever you want to retrieve a cart corresponding to a certain customer you can just do:
select * from carts inner join customers using (customer_id);
If you also want cart's detail you can do:
select * from carts
inner join cart_details using (cart_id)
inner join customers using (customer_id)
;
Note: I tried to write the examples in SQL syntax that is as general as possible, since you didn't say which RDBMS you are using. I also left out all details related to primary and foreign keys on purpose, so you can understand the tables and their attributes first.
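With keys added back in, the four tables above can be sketched in runnable form as follows. This is only an illustration: SQLite (through Python's sqlite3 module) stands in for whatever RDBMS is actually used, the sample rows are made up, and the helper name cart_for is hypothetical.

```python
import sqlite3

# SQLite is used only so the sketch runs as-is; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE books     (book_id INTEGER PRIMARY KEY, name TEXT, author TEXT);
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE carts (
        cart_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
    );
    CREATE TABLE cart_details (
        cart_id     INTEGER NOT NULL REFERENCES carts(cart_id),
        line_number INTEGER NOT NULL,
        book_id     INTEGER NOT NULL REFERENCES books(book_id),
        qty         INTEGER NOT NULL,
        price       NUMERIC NOT NULL,
        PRIMARY KEY (cart_id, line_number)
    );
    INSERT INTO books VALUES (1, 'Dune', 'Frank Herbert'), (2, 'Emma', 'Jane Austen');
    INSERT INTO customers VALUES (7, 'Ann'), (8, 'Ben');
    INSERT INTO carts VALUES (70, 7), (80, 8);
    INSERT INTO cart_details VALUES (70, 1, 1, 2, 9.99), (70, 2, 2, 1, 5.50),
                                    (80, 1, 2, 3, 5.50);
""")

def cart_for(customer_id):
    """Return (book name, qty, price) rows for one customer's cart."""
    return conn.execute(
        """SELECT b.name, d.qty, d.price
           FROM carts c
           JOIN cart_details d ON d.cart_id = c.cart_id
           JOIN books b        ON b.book_id = d.book_id
           WHERE c.customer_id = ?
           ORDER BY d.line_number""",
        (customer_id,),
    ).fetchall()
```

The foreign key on cart_details.book_id is what guarantees that every cart entry comes from the books table, and the WHERE clause on customer_id is the "look up the cart by customer" step.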

You could address this by creating a CustomerCart table that references the Customer and Book tables. It could have a format like the following (I'm making this as non-implementation-specific as possible, partly because you haven't indicated what RDBMS you're using):
Composite primary key consisting of CustomerId + a CartId (allows a Customer to have multiple / saved / historical carts)
BookId (each record then represents an instance of a book from the catalog being added to a customer's cart)
Measures, such as Quantity, that relate to the instance of the given book in the given customer's cart

Think of your problem like this... one customer can have many books in their shopping cart, and one book can exist in many customers' shopping carts, so you have a many-to-many relationship.
With this in mind, the way to handle this problem is create a table that tracks the associations between customers and books, including any other information that is relevant.
create table shopping_cart (
customerId int,
bookId int,
quantity int
-- plus any other relevant columns
)

It depends on the functionality you want to support with a cart.
The easiest would be to create a cart table with BookID and CustomerID. However, if a customer or book is ever removed, cascade deletes (if allowed) will automatically remove that book from everyone's cart, or remove all of a person's cart items if the person is removed. Do you want that to happen? If not, you need to keep all book and customer information in the cart.
Secondly, do you want to keep a record of everything they've purchased over time in a "Cart"? Do you want them to be able to check out with a subset of the items in their cart? Is the life of the cart their visit to the site, or would the items remain months or years later when they come back?
The answers to these types of questions determine the appropriate design.

Basic SQL Insert statement approach

Given that I have two tables
Customers (id int, username varchar)
Orders (customer_id int, order_date datetime)
Now I want to insert into the Orders table based on customer information which is available in the Customers table.
There are a couple of ways I can approach this problem.
First - I can query the customer information into a variable and then use it in an INSERT statement.
DECLARE @Customer_ID int
SELECT @Customer_ID = id FROM Customers WHERE username = 'john.smith'
INSERT INTO Orders (customer_id, order_date) VALUES (@Customer_ID, GETDATE())
The second approach is to combine INSERT and SELECT in a single statement.
INSERT INTO Orders (customer_id, order_date)
SELECT id, GETDATE() FROM Customers
WHERE username = 'john.smith'
So my question is: which is the better way to proceed in terms of speed and overhead, and why? I know that if a lot of information is being queried from the Customers table, the second approach is much better.
p.s. I was asked this question in one of the technical interviews.

The second approach is better.
The first approach misbehaves if the customer is not found: no check is made that a customer id was actually returned, so the variable stays NULL and the INSERT either fails (if customer_id is NOT NULL) or stores an order with no customer.
The second approach will do nothing if the customer is not found.
From an overhead standpoint, why create variables if they are not needed? Set-based SQL is usually the better approach.
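The difference in behaviour when no customer matches can be seen in a small runnable sketch. It is an illustration only: SQLite through Python's sqlite3 module stands in for SQL Server, a Python variable mimics the T-SQL variable, and the function names are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server; datetime('now') plays the role of GETDATE().
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, username TEXT UNIQUE);
    CREATE TABLE Orders (customer_id INTEGER, order_date TEXT);
    INSERT INTO Customers VALUES (1, 'john.smith');
""")

def order_via_variable(username):
    """Approach 1: look the id up first, then insert whatever was found."""
    row = conn.execute(
        "SELECT id FROM Customers WHERE username = ?", (username,)
    ).fetchone()
    customer_id = row[0] if row else None  # stays NULL when nobody matches
    conn.execute(
        "INSERT INTO Orders (customer_id, order_date) VALUES (?, datetime('now'))",
        (customer_id,),
    )

def order_via_insert_select(username):
    """Approach 2: one set-based statement; no match means no row inserted."""
    conn.execute(
        """INSERT INTO Orders (customer_id, order_date)
           SELECT id, datetime('now') FROM Customers WHERE username = ?""",
        (username,),
    )

def count_orders():
    """Total number of rows currently in Orders."""
    return conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
```

With an unknown username, order_via_insert_select leaves Orders untouched, while order_via_variable adds a row whose customer_id is NULL because this Orders table has no NOT NULL constraint.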

In a typical real-world order-entry system, the user has already looked the Customer up via a Search interface, or has chosen the customer from a list of customers displayed alphabetically; so your client program, when it goes to insert an order for that customer, already knows the CustomerID.
Furthermore, the order date is typically defaulted to getdate() as part of the ORDERS table definition, and your query can usually ignore that column.
But to handle multiple line items on an order, your insert into ORDER_HEADER needs to return the order header id so that it can be inserted into the ORDER DETAIL line item(s) child rows.

I don't recommend either approach. Why do you have the customer name and not the id in the first place? Don't you have a user interface that maintains a reference to the current customer by holding the ID in its state? Doing the lookup by name exposes you to potentially selecting the wrong customer.
If you must do this for reasons unknown to me, the 2nd approach is certainly more efficient because it only contains one statement.

Make the customer id in order table a foreign key which refers to customer table.

How do I optimise my voting application to produce monthly charts?

I'd appreciate any help you can offer - I'm currently trying to decide on a schema for a voting app I'm building with PHP / MySQL, but I'm completely stuck on how to optimise it. The key elements are to allow only one vote per user per item, and be able to build a chart detailing the top items of the month – based on votes received that month.
So far the initial schema is:
Items_table
item_id
total_points
(lots of other fields unrelated to voting)
Voting_table
voting_id
item_id
user_id
vote (1 = up; 0 = down)
month_cast
year_cast
So I'm wondering if it's going to be a case of selecting all information from voting table where month = currentMonth & year = currentYear, somehow running a count and grouping by item_id; if so, how would I go about doing so? Or would I be better off creating a separate table for monthly charts which is updated with each vote, but then should I be concerned with the requirement to update 3 database tables per vote?
I'm not particularly competent – if it shows – so would really love any help / guidance someone could provide.
Thanks,
_just_me

I wouldn't add separate tables for monthly charts; to prevent users from casting more than one vote per item, you could use a unique key on voting_table(item_id, user_id).
As for the summary, you should be able to use a simple query like
select item_id, vote, count(*), month_cast, year_cast
from voting_table
group by item_id, vote, month_cast, year_cast

I would use a voting table similar to this:
create table votes(
item_id int not null
,user_id int not null
,vote_dtm datetime not null
,vote tinyint not null
,primary key(item_id, user_id)
,foreign key(item_id) references item(item_id)
,foreign key(user_id) references users(user_id)
)Engine=InnoDB;
Using a composite primary key on an InnoDB table will cluster the data around the items, making it much faster to find the votes related to an item. I added a column vote_dtm which would hold the timestamp for when the user voted.
Then I would create one or several views, used for reporting purposes.
create view votes_monthly as
select item_id
,year(vote_dtm) as year
,month(vote_dtm) as month
,sum(vote) as score
,count(*) as num_votes
from votes
group
by item_id
,year(vote_dtm)
,month(vote_dtm);
If you start having performance issues, you can replace the view with a table containing pre-computed values without even touching the reporting code.
Note that I used both count(*) and sum(vote). The count(*) would return the number of votes cast, whereas the sum would return the number of up-votes. However, if you changed the vote column to use +1 for upvotes and -1 for downvotes, sum(vote) would return a score much like the one Stack Overflow calculates for votes.
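The table and view above can be tried out with a small runnable sketch. It is only an illustration under assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, so strftime replaces year() and month(), votes use the +1/-1 encoding mentioned above, the foreign keys are left out, and the helper names cast_vote and chart are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL/InnoDB; strftime() replaces YEAR()/MONTH().
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (
        item_id  INTEGER NOT NULL,
        user_id  INTEGER NOT NULL,
        vote_dtm TEXT    NOT NULL,               -- ISO-8601 timestamp
        vote     INTEGER NOT NULL CHECK (vote IN (-1, 1)),
        PRIMARY KEY (item_id, user_id)           -- one vote per user per item
    );
    CREATE VIEW votes_monthly AS
        SELECT item_id,
               strftime('%Y', vote_dtm) AS year,
               strftime('%m', vote_dtm) AS month,
               SUM(vote)                AS score,
               COUNT(*)                 AS num_votes
        FROM votes
        GROUP BY item_id, strftime('%Y', vote_dtm), strftime('%m', vote_dtm);
""")

def cast_vote(item_id, user_id, when, vote):
    """Record a vote; return False if this user already voted on this item."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO votes VALUES (?, ?, ?, ?)",
                (item_id, user_id, when, vote),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def chart(year, month):
    """Items for one month as (item_id, score, num_votes), best score first."""
    return conn.execute(
        """SELECT item_id, score, num_votes FROM votes_monthly
           WHERE year = ? AND month = ?
           ORDER BY score DESC, item_id""",
        (year, month),
    ).fetchall()
```

For example, chart("2011", "03") returns the March 2011 ranking; year and month are passed as zero-padded strings because that is what strftime produces.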

Database structure for storing historical data

Preface:
I was thinking the other day about a new database structure for a new application and realized that we needed a way to store historical data in an efficient way. I was wanting someone else to take a look and see if there are any problems with this structure. I realize that this method of storing data may very well have been invented before (I am almost certain it has) but I have no idea if it has a name and some google searches that I tried didn't yield anything.
Problem:
Let's say you have a table for orders, and orders are related to a customer table for the customer that placed the order. In a normal database structure you might expect something like this:
orders
------
orderID
customerID
customers
---------
customerID
address
address2
city
state
zip
Pretty straightforward: the orders table has a foreign key, customerID, which is the primary key of the customers table. But if we run a report over the orders table, we are going to join the customers table to the orders table, which will bring back the current record for that customer ID. What if, when the order was placed, the customer's address was different and it has subsequently been changed? Now our order no longer reflects the customer's address at the time the order was placed. Basically, by changing the customer record, we just changed all history for that customer.
Now there are several ways around this, one of which would be to copy the record when an order was created. What I have come up with though is what I think would be an easier way to do this that is perhaps a little more elegant, and has the added bonus of logging anytime a change is made.
What if I did a structure like this instead:
orders
------
orderID
customerID
customerHistoryID
customers
---------
customerID
customerHistoryID
customerHistory
--------
customerHistoryID
customerID
address
address2
city
state
zip
updatedBy
updatedOn
Please forgive the formatting, but I think you can see the idea. Basically, any time a customer is changed (insert or update), a new customerHistory row is written with the next customerHistoryID, and the customers table is updated with the latest customerHistoryID. The orders table now points not only to the customerID (which allows you to see all revisions of the customer record), but also to the customerHistoryID, which points to a specific revision of the record. Now the order reflects the state of the data at the time the order was created.
By adding an updatedby and updatedon column to the customerHistory table, you can also see an "audit log" of the data, so you could see who made the changes and when.
One potential downside could be deletes, but I am not really worried about that for this need as nothing should ever be deleted. But even still, the same effect could be achieved by using an activeFlag or something like it depending on the domain of the data.
My thought is that all tables would use this structure. Anytime historical data is being retrieved, it would be joined against the history table using the customerHistoryID to show the state of data for that particular order.
Retrieving a list of customers is easy, it just takes a join to the customer table on the customerHistoryID.
Can anyone see any problems with this approach, either from a design standpoint or for performance reasons? Remember, no matter what I do I need to make sure that the historical data is preserved, so that subsequent updates to records do not change history. Is there a better way? Is this a known idea that has a name, or is there any documentation on it?
Thanks for any help.
Update:
This is a very simple example of what I am really going to have. My real application will have "orders" with several foreign keys to other tables. Origin/destination location information, customer information, facility information, user information, etc. It has been suggested a couple of times that I could copy the information into the order record at that point, and I have seen it done this way many times, but this would result in a record with hundreds of columns, which really isn't feasible in this case.
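The versioning scheme described in the question can be sketched in runnable form as follows. This is only an illustration under assumptions: SQLite (through Python's sqlite3 module) stands in for the real RDBMS, only the address column is kept, and the helper names save_customer, place_order and order_address are hypothetical.

```python
import sqlite3

# SQLite stands in for the real RDBMS; column names follow the question,
# the helper function names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerHistory (
        customerHistoryID INTEGER PRIMARY KEY,
        customerID INTEGER NOT NULL,
        address    TEXT    NOT NULL,
        updatedBy  TEXT    NOT NULL,
        updatedOn  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TABLE customers (
        customerID        INTEGER PRIMARY KEY,
        customerHistoryID INTEGER NOT NULL
    );
    CREATE TABLE orders (
        orderID           INTEGER PRIMARY KEY,
        customerID        INTEGER NOT NULL,
        customerHistoryID INTEGER NOT NULL
    );
""")

def save_customer(customer_id, address, user):
    """Write a new revision and point the customer row at it."""
    with conn:  # both statements commit together
        hist_id = conn.execute(
            "INSERT INTO customerHistory (customerID, address, updatedBy) "
            "VALUES (?, ?, ?)",
            (customer_id, address, user),
        ).lastrowid
        updated = conn.execute(
            "UPDATE customers SET customerHistoryID = ? WHERE customerID = ?",
            (hist_id, customer_id),
        ).rowcount
        if updated == 0:  # first revision: the customer row does not exist yet
            conn.execute(
                "INSERT INTO customers VALUES (?, ?)", (customer_id, hist_id)
            )
    return hist_id

def place_order(customer_id):
    """Create an order pinned to the customer's current revision."""
    row = conn.execute(
        "SELECT customerHistoryID FROM customers WHERE customerID = ?",
        (customer_id,),
    ).fetchone()
    if row is None:
        raise ValueError("unknown customer")
    with conn:
        return conn.execute(
            "INSERT INTO orders (customerID, customerHistoryID) VALUES (?, ?)",
            (customer_id, row[0]),
        ).lastrowid

def order_address(order_id):
    """Address as it was when the order was placed."""
    return conn.execute(
        """SELECT h.address FROM orders o
           JOIN customerHistory h ON h.customerHistoryID = o.customerHistoryID
           WHERE o.orderID = ?""",
        (order_id,),
    ).fetchone()[0]
```

An order placed before an address change keeps reporting the old address, because it joins to its own customerHistoryID rather than to the customer's current one.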

When I've encountered such problems, one alternative is to make the order the history table. It functions the same, but it's a little easier to follow.
orders
------
orderID
customerID
address
City
state
zip
customers
---------
customerID
address
City
state
zip
EDIT: if the number of columns gets too high for your liking, you can separate it out however you like.
If you do go with the other option and use history tables, you should consider using bitemporal data, since you may have to deal with the possibility that historical data needs to be corrected. For example, a customer changes his current address from A to B, but you also have to correct the address on an existing order that is currently being fulfilled.
Also if you are using MS SQL Server you might want to consider using indexed views. That will allow you to trade a small incremental insert/update perf decrease for a large select perf increase. If you're not using MS SQL server you can replicate this using triggers and tables.

When you are designing your data structures, be very careful to store the correct relationships, not something that is merely similar to the correct relationships. If the address for an order needs to be maintained, then that is because the address is part of the order, not the customer. Likewise, unit prices are part of the order, not the product, etc.
Try an arrangement like this:
Customer
--------
CustomerId (PK)
Name
AddressId (FK)
PhoneNumber
Email
Order
-----
OrderId (PK)
CustomerId (FK)
ShippingAddressId (FK)
BillingAddressId (FK)
TotalAmount
Address
-------
AddressId (PK)
AddressLine1
AddressLine2
City
Region
Country
PostalCode
OrderLineItem
-------------
OrderId (PK) (FK)
OrderItemSequence (PK)
ProductId (FK)
UnitPrice
Quantity
Product
-------
ProductId (PK)
Price
etc.
If you truly need to store history for something, like tracking changes to an order over time, then you should do that with a log or audit table, not with your transaction tables.

Normally, orders simply store the information as it is at the time of the order. This is especially true of things like part numbers, part names and prices, as well as customer address and name. Then you don't have to join to five or six tables to get information that can be stored in one. This is not denormalization, as you actually need to have the information as it existed at the time of the order. I think having this information in the order and order detail (which stores the individual items ordered) tables is also less risky in terms of accidental change to the data.
Your order table would not have hundreds of columns. You would have an order table and an order detail table, due to the one-to-many relationship. The order table would include order number, customer id (so you can search for everything this customer has ever ordered even if the name changed), customer name, customer address (note you don't need city, state, zip etc.; put the address in one field), order date and possibly a few other fields that relate directly to the order at a top level. Then you have an order detail table that has order number, detail_id, part number, part description (this can be a consolidation of a bunch of fields like size, color etc., or you can separate out the most common), number of items, unit type, price per unit, taxes, total price, ship date and status. You put in one entry for each item ordered.

If you are genuinely interested in such problems, I can only suggest you take a serious look at "Temporal Data and the Relational Model".
Warning1 : there is no SQL in there and almost anything you think you know about the relational model will be claimed a falsehood. With good reason.
Warning2 : you are expected to think, and think hard.
Warning3 : the book is about what the solution for this particular family of problems ought to look like, but as the introduction says, it is not about any technology available today.
That said, the book is genuine enlightenment. At the very least, it helps to make it clear that the solution for such problems will not be found in SQL as it stands today, or in ORMs as those stand today, for that matter.

What you want is called a data warehouse. Since data warehouses are OLAP and not OLTP, it is recommended to have as many columns as you need in order to achieve your goals. In your case the orders table in the data warehouse will have 11 fields, holding a 'snapshot' of each order as it comes in, regardless of later updates to user accounts.
The Data Warehouse Toolkit, Second Edition (Wiley) is a good start.

Our payroll system uses effective dates in many tables. The ADDRESSES table is keyed on EMPLID and EFFDT. This allows us to track every time an employee's address changes. You could use the same logic to track historical addresses for customers. Your queries would simply need to include a clause that compares the order date to the customer address date that was in effect at the time of the order. For example
select o.orderID, c.customerID, c.address, c.city, c.state, c.zip
from orders o, customers c
where c.customerID = o.customerID
and c.effdt = (
select max(c1.effdt) from customers c1
where c1.customerID = c.customerID and c1.effdt <= o.orderdt
)
The objective is to select the most recent row in customers having an effective date that is on or before the date of the order. This same strategy could be used to keep historical information on product prices.
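The effective-date lookup can be tried out with a small runnable sketch. It is an illustration only: SQLite (through Python's sqlite3 module) stands in for the payroll system's database, the sample rows are invented, dates are stored as ISO-8601 text so they compare correctly, and the helper name address_at_order is hypothetical.

```python
import sqlite3

# SQLite is used only so the sketch runs as-is; the sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customerID INTEGER NOT NULL,
        effdt      TEXT    NOT NULL,   -- date this address took effect
        address    TEXT    NOT NULL,
        PRIMARY KEY (customerID, effdt)
    );
    CREATE TABLE orders (
        orderID    INTEGER PRIMARY KEY,
        customerID INTEGER NOT NULL,
        orderdt    TEXT    NOT NULL
    );
    INSERT INTO customers VALUES (1, '2020-01-01', '1 Old Road'),
                                 (1, '2021-06-01', '2 New Street');
    INSERT INTO orders VALUES (500, 1, '2020-09-15'),
                              (501, 1, '2021-06-01'),
                              (502, 1, '2022-02-02');
""")

def address_at_order(order_id):
    """Address in effect on the order date (None if the order is unknown)."""
    row = conn.execute(
        """SELECT c.address
           FROM orders o
           JOIN customers c ON c.customerID = o.customerID
           WHERE o.orderID = ?
             AND c.effdt = (SELECT MAX(c1.effdt) FROM customers c1
                            WHERE c1.customerID = c.customerID
                              AND c1.effdt <= o.orderdt)""",
        (order_id,),
    ).fetchone()
    return row[0] if row else None
```

Order 500 predates the address change and resolves to the old address, while orders placed on or after the new effective date resolve to the new one.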

I myself like to keep it simple. I would use two tables: a customer table and a customer history table. If you have the key (e.g. CustomerID) in the history table, there is no reason to make a joining table; a select on that key will give you all records.
You also don't have audit information (e.g. date modified, who modified) in the history table as you show it; I expect you want this.
So mine would look something like this:
CustomerTable (this contains current customer information)
CustomerID (distinct non null)
...all customer information fields
CustomerHistoryTable
CustomerID (not distinct non null)
...all customer information fields
DateOfChange
WhoChanged
The DateOfChange field is the date the customer record was changed from the values in this history record to the values in a more recent history record or in the CustomerTable.
Your orders table just needs a CustomerID; if you need to find the customer information as of the time of the order, it is a simple select.
