Adding Records and Decreasing Quantity - sql

I'm just starting to program in PL/SQL and I have some problems. I'm building a database that includes tables like:
TRAINS: ID_TRAIN, NAME, TYPE
CARS: ID_CAR, TYPE, SEATS, QUANTITY
TRAINS_W_CARS: ID_TRAIN (FK), ID_CAR (FK), QUANTITY
I want to write a procedure or function that decreases the number of available cars when I add some to a train.
For example: I have 50 available 2nd-class cars, and when I add 5 of them to a train, that number should drop to 45.
To be honest, I don't really know how to go about it, because I haven't done anything this complicated so far.

As an alternative, consider modifying the nature of what you're storing in the first place.
Currently it seems like you're storing one record which is a count of 50 "cars", and one record which is a count of 0 "train cars". So you have to manually keep track of moving these counts from one record to another. This is not only error-prone, but can get confusing fast. Mostly because there's no actual record of what cars are where, there are just counts.
To update, you have to do the following:
Decrement the car count by 1.
Increment the train car count by 1.
Instead, consider something like the following...
The CARS table has 50 records, each representing an actual car.
The TRAINS table has 1 record, representing an actual train.
The CARS table has a FK to the TRAINS table. This identifies to which train that car belongs. This value is nullable for cars which aren't currently connected to a train.
In this scenario, you have to do the following:
Update the FK of the CARS record.
It's a single atomic operation which either succeeds or fails. No half-committed data, no missing cars, etc.
Each record in your data should represent "a thing". Not counts and descriptions of things, but actual things. And in this scenario you also don't need that linking table, because "trains" and "cars" are inherently not many-to-many things. (How can a car simultaneously be connected to two trains?)
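A minimal sketch of the one-row-per-car design described above, run through Python's built-in sqlite3 module rather than Oracle. The column layout (a nullable id_train on CARS) and the sample values are assumptions for illustration; the point is only that "available" becomes a query instead of a stored count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.executescript("""
    CREATE TABLE trains (id_train INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE cars (
        id_car   INTEGER PRIMARY KEY,
        type     TEXT,
        seats    INTEGER,
        id_train INTEGER REFERENCES trains(id_train)  -- NULL = not attached
    );
""")

# One train and 50 physical cars, none attached yet (made-up sample data).
conn.execute("INSERT INTO trains VALUES (1, 'Express', 'passenger')")
conn.executemany(
    "INSERT INTO cars (type, seats) VALUES (?, ?)",
    [("2nd class", 60)] * 50,
)

# Attach five free 2nd-class cars to train 1 in a single UPDATE statement.
conn.execute("""
    UPDATE cars SET id_train = 1
    WHERE id_car IN (
        SELECT id_car FROM cars
        WHERE id_train IS NULL AND type = '2nd class'
        ORDER BY id_car LIMIT 5
    )
""")

# The counts are derived from the rows, never stored.
available = conn.execute(
    "SELECT COUNT(*) FROM cars WHERE id_train IS NULL").fetchone()[0]
attached = conn.execute(
    "SELECT COUNT(*) FROM cars WHERE id_train = 1").fetchone()[0]
print(available, attached)  # 45 5
```

Because the counts come from COUNT(*), there is no second record to keep in sync.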
Always consider the real-world concept being modeled in your data. You have:
A train, which can have many cars
A car, which can have many seats
A seat
A passenger
etc.
Build the relationships based on the actual real-world objects, not on the screens and reports of data that you want to see. Those reports can be easily generated from the real-world data, but real-world data can't always be generated from flat reports. (For example, in your current setup you can look at a report of how many cars are in a train, but you don't know which ones.)

I don't know whether you are inserting rows into your table manually or from a UI button click, but you can do the following:
Add a before/after insert trigger on your CARS table, and in that trigger write whatever you need to modify in your TRAINS table.
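A rough sketch of the trigger idea, using SQLite trigger syntax through Python's sqlite3 module (Oracle PL/SQL trigger syntax differs). It assumes the schema from the question, where rows are inserted into TRAINS_W_CARS and the stock count lives in CARS.QUANTITY, so the trigger is placed on TRAINS_W_CARS. The trigger name trg_take_cars is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars (
        id_car INTEGER PRIMARY KEY, type TEXT, seats INTEGER, quantity INTEGER
    );
    CREATE TABLE trains_w_cars (id_train INTEGER, id_car INTEGER, quantity INTEGER);

    -- Hypothetical trigger: every row added to a train reduces the stock.
    CREATE TRIGGER trg_take_cars AFTER INSERT ON trains_w_cars
    BEGIN
        UPDATE cars SET quantity = quantity - NEW.quantity
        WHERE id_car = NEW.id_car;
    END;
""")

conn.execute("INSERT INTO cars VALUES (1, '2nd class', 60, 50)")
conn.execute("INSERT INTO trains_w_cars VALUES (10, 1, 5)")  # add 5 cars to train 10

remaining = conn.execute(
    "SELECT quantity FROM cars WHERE id_car = 1").fetchone()[0]
print(remaining)  # 45
```

This sketch does not guard against taking more cars than are in stock; a real version would need a check for that.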

Related

SQL schema design: two tables vs adding columns to the same table

This question is about design decision, hence might be a bit opinionated:
Imagine you are designing database for a car dealer where they ONLY auction cars. Some cars are for display only, and some cars are to be sold in auction.
I have a Car entity with 10 attributes: ID, Model, Mode, YearMade, IsDisplayOnly....
Now, I want to add selling price and selling notes to those cars that are for sale (i.e. IsDisplayOnly = false)
I imagine there are two ways this can be done:
Add Price and PriceNotes columns into the Car table, knowing that they are always null for IsDisplayOnly = true cars, and those that haven't been sold at auction yet.
Add a new table SaleInfo with 3 columns: CarID, Price, PriceNotes where CarID is the PK and also FK pointing to the ID column in the Car table.
Which option would align most with the best schema design practice? Why?
You should have one table for cars and the attributes of cars. You should have a separate table for the cars for auction.
Why? These are different entities. Your problem definition suggests an auction table. That auction table should have a foreign key reference to the cars that are available for auction. A separate table ensures that the foreign key reference is valid.
There are some other reasons that are not apparent in your simplified example. Notes and prices might change over time, so they should be going into a history table. Display cars have other attributes, like the period of time when they are on display and how they are ultimately disposed of. This suggests that they too have particular attributes.
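As a sketch of the separate-table option, here is the SaleInfo layout from the question (CarID as both PK and FK), run in SQLite via Python's sqlite3 module. The sample cars and prices are invented. It shows that the database itself rejects sale data for a car that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.executescript("""
    CREATE TABLE car (id INTEGER PRIMARY KEY, model TEXT, year_made INTEGER);
    CREATE TABLE sale_info (
        car_id      INTEGER PRIMARY KEY REFERENCES car(id),
        price       NUMERIC,
        price_notes TEXT
    );
""")

conn.execute("INSERT INTO car VALUES (1, 'Roadster', 1965)")  # display only
conn.execute("INSERT INTO car VALUES (2, 'Coupe', 1972)")     # for auction
conn.execute("INSERT INTO sale_info VALUES (2, 18500, 'sold at spring auction')")

# Sale data for a car that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO sale_info VALUES (99, 1000, 'no such car')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Display-only cars simply have no sale_info row; a LEFT JOIN shows both kinds.
rows = conn.execute("""
    SELECT car.id, car.model, sale_info.price
    FROM car LEFT JOIN sale_info ON sale_info.car_id = car.id
    ORDER BY car.id
""").fetchall()
print(rejected, rows)
```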
My advice would be to use three tables:
-The first to store all the makes and models of the cars, as well as their costs (e.g. a Honda something-or-other selling for X amount of money).
-The second to store the details of the individual vehicles, containing a foreign key to the primary key of one of the makes/models stored in the first table, as well as individual details such as the color, VIN, etc., and whether they can be sold or not.
-The third to store the details of each individual purchase, linked to the table of individual vehicles, with each purchase connected to a single vehicle.
The advantage of this layout is that you will likely end up using less storage space in the long run: instead of repeating the same three fields (the make, model and year) for every vehicle, you have a single key field to represent that data. Another advantage is searching: if you are looking for individual vehicles of the same brand/type, you can search on just one field, the key linked to the table containing the make and model. This can decrease search times and improve the effectiveness of the system overall.
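A possible shape for that three-table layout, sketched in SQLite through Python's sqlite3 module. All table names, column names and sample rows here are assumptions; the index on the model key is an addition that the answer does not mention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model (
        model_id INTEGER PRIMARY KEY, make TEXT, model TEXT,
        year INTEGER, list_price NUMERIC
    );
    CREATE TABLE vehicle (
        vin TEXT PRIMARY KEY,
        model_id INTEGER REFERENCES model(model_id),
        color TEXT,
        for_sale INTEGER
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        vin TEXT UNIQUE REFERENCES vehicle(vin),  -- one purchase per vehicle
        sold_price NUMERIC
    );
    CREATE INDEX idx_vehicle_model ON vehicle(model_id);
""")

conn.executemany("INSERT INTO model VALUES (?, ?, ?, ?, ?)", [
    (1, "Honda", "Civic", 2015, 9000),
    (2, "Audi", "A8", 2018, 30000),
])
conn.executemany("INSERT INTO vehicle VALUES (?, ?, ?, ?)", [
    ("VIN001", 1, "red", 1),
    ("VIN002", 1, "blue", 0),
    ("VIN003", 2, "black", 1),
])

# Find every individual vehicle of one make/model using only the key column.
civics = conn.execute(
    "SELECT vin, color FROM vehicle WHERE model_id = ? ORDER BY vin", (1,)
).fetchall()
print(civics)  # [('VIN001', 'red'), ('VIN002', 'blue')]
```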

Designing a cube

I’ve been asked to create our analysis cube and have a design question.
We sell ‘widgets’ and ‘parts’ to go with those widgets. Each order has many widgets and sometimes a few parts.
What I’m stuck on is this: to me, an order is a fact in a measure. But what are the widgets? Are they a dimension, so that each fact in the measure is an entry for every part and widget in the order?
So, if order 123 had widget 1 and widget 2 and part 5, then there will be 3 facts in the measure for the same order? Is that correct?
At its basic level you can consider most facts to be transactions or transaction line items. So, for example, you may have a 'sales' fact table in which each record represents one line item from that sale. Each fact record would have numeric columns representing metrics and other columns joining to dimension tables. The combination of those dimensions would describe that line item. So, in your case, you likely have something like:
1) A 'date' dimension detailing the date of the transaction
2) A 'widget' dimension detailing the widget sold on that transaction
3) A 'customer' dimension detailing the customer who bought that item (almost certainly the same customer would appear on every line item for this transaction)
4) ... determined by what information you have and what business problem you're trying to solve.
Now, the dimension tables contain further details. For example, your widget dimension table likely contains things like the name of the widget, the color, the manufacturer, etc. Every time your company sells one of these widgets, the record in the fact table links to that same dimension record for that name, color, manufacturer, etc. combination (i.e. you don't create a new dimension record every time you sell the same item - this is a one-to-many relationship - each dimension record may have many related fact records).
Your other dimension tables would similarly describe their dimensions. For example, the customer dimension might give the customer's name, their address, ...
So, the short answer to your question is that widget likely is a dimension, items and widgets may (or may not) actually be the same dimension (in a school class I suspect that they are), and that you would have 3 fact records for that one transaction.
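To make the "3 fact records" concrete, here is a small sketch using Python's sqlite3 module. The table names fact_sales and dim_product, the choice to keep widgets and parts in one product dimension, and the amounts are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed: widgets and parts share one product dimension.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, kind TEXT);
    -- One fact row per order line item.
    CREATE TABLE fact_sales (
        order_number INTEGER,
        product_key  INTEGER REFERENCES dim_product(product_key),
        quantity     INTEGER,
        amount       NUMERIC
    );
""")

conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "widget 1", "widget"),
    (2, "widget 2", "widget"),
    (5, "part 5", "part"),
])

# Order 123 with widget 1, widget 2 and part 5 becomes three fact rows.
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (123, 1, 1, 100),
    (123, 2, 1, 150),
    (123, 5, 2, 20),
])

fact_rows = conn.execute(
    "SELECT COUNT(*) FROM fact_sales WHERE order_number = 123").fetchone()[0]

# Measures roll up along any dimension attribute, here the product kind.
by_kind = conn.execute("""
    SELECT d.kind, SUM(f.amount)
    FROM fact_sales f JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.kind ORDER BY d.kind
""").fetchall()
print(fact_rows, by_kind)  # 3 [('part', 20), ('widget', 250)]
```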
This is probably along the same lines as the prior answer but....
If you try to model "many widgets per order" you'll have issues, because you end up with a many (order fact) to many (widgets) relationship. In a cube / star schema design, many-to-many relationships usually need to be modeled out to be many-to-one in some way.
So what you do is try to identify what special thing identifies an "order" (as opposed to a bunch of widgets in an order). Usually that is simply stuff like order date, customer, order number, tax.
An example way to model this is:
If you have a single order with five widgets, you model that as a fact table with five records that happens to have a repeating widget, customer, date etc. in it
Then you have to work out how you spread an order header tax amount over five records. The two obvious solutions are:
Create a widget that represents tax and add that as another record
Spread the tax over five records, either evenly or weighted by something
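The second option (spreading the tax, weighted by something) could look roughly like the following. This sketch assumes the weight is each line's amount and that any rounding remainder goes on the last line; both are arbitrary choices, and the function name spread_tax is made up. It also assumes the line amounts do not sum to zero.

```python
from decimal import Decimal, ROUND_HALF_UP

def spread_tax(line_amounts, order_tax):
    """Allocate an order-level tax to line items in proportion to their amounts."""
    total = sum(line_amounts)
    cent = Decimal("0.01")
    shares = [
        (order_tax * amount / total).quantize(cent, rounding=ROUND_HALF_UP)
        for amount in line_amounts
    ]
    # Put any rounding remainder on the last line so the shares add up exactly.
    shares[-1] += order_tax - sum(shares)
    return shares

# Five line items (made-up amounts) sharing an order-level tax of 20.00.
lines = [Decimal("100.00"), Decimal("50.00"), Decimal("25.00"),
         Decimal("15.00"), Decimal("10.00")]
shares = spread_tax(lines, Decimal("20.00"))
print(shares)  # shares are 10.00, 5.00, 2.50, 1.50 and 1.00
```

Each fact record then carries its own tax share, so summing the tax measure over any slice of the cube still adds up to the order totals.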
Modelling "parts" just takes these concepts further.
It is important to understand what the end user wants to see and why they want to see parts. What do they want to measure by parts? How do you assign higher-level values (like tax) down to lower levels like parts?

Database model with historical data

Firstly, I am going to explain my problem by using example from real life.
Let’s say that we are company and we are selling different means of transport, e.g. cars, buses, trucks, trains, planes, etc.
Let's say that we have around 10.000.000 different items with daily changes.
For each item we have a unique name (e.g. car Audi A8 X or plane Boeing 747-200B Y) where X and Y are unique values.
Don’t worry about naming because it works just fine.
For each item we also have some special data. Data depends on type, e.g. for car: dimensions (length, width, height …), powertrain, etc. For planes we have e.g. length, interior width, wingspan, wing area, wing sweep, etc.
And now the problem … I would like to put all this data from different Excel files and paper into a database.
Question 1: Which database model is better?
Idea #1: I create one table, called items, where I store only the name of each product we sell (e.g. car Audi A8 X, plane Boeing 747-200B Y, etc.). Then in other tables (car, plane, train …) I store the extra data for cars / planes / trains.
So to get all the data for, e.g., a car, I would have to check the car table; for a train, the train table.
Idea #2: I create one table where I store all item names (just like items in Idea #1), plus an additional pivot table (e.g. data, with fields: item, key, value) where all the other information can be found.
Question 2: I need a history of all the data. In the first case I would have to duplicate a row in, e.g., the car table just because one field changed. With Idea #2, every row in the pivot table data would need to record whether it is valid (or when it was valid).
Can you please help me? I have no idea which model is better or what is actually used in production. Also, is there a good book about storing historical data in a database?
Thanks!
You present two problems to us. The first is organizing specialized data about subtypes (cars, buses, trucks, etc.). The second is dealing with temporal (historical) data.
Your idea #1 resembles a design pattern known as "Class Table Inheritance". If you search for this phrase, you will find many articles outlining exactly how it works. These will pretty much confirm your initial reaction, but they will add lots more helpful detail. You will also find numerous references to previous Q&A entries on this site and on the DBA site.
For an alternate design, look up "Single Table Inheritance". This stores everything in a single fat table, with NULLS in spaces that don't pertain to the case at hand.
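For orientation, a Class Table Inheritance layout for idea #1 might look something like this sketch, written with Python's sqlite3 module. The column names are assumptions and the measurements are placeholder numbers, not real specifications. Each subtype table shares its primary key with the items table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Supertype: one row per item, whatever its kind.
    CREATE TABLE items (
        item_id INTEGER PRIMARY KEY,
        name    TEXT UNIQUE NOT NULL,
        kind    TEXT NOT NULL
    );
    -- Subtypes: the primary key is also a foreign key to items.
    CREATE TABLE car (
        item_id    INTEGER PRIMARY KEY REFERENCES items(item_id),
        length_mm  INTEGER,
        width_mm   INTEGER,
        powertrain TEXT
    );
    CREATE TABLE plane (
        item_id    INTEGER PRIMARY KEY REFERENCES items(item_id),
        length_m   REAL,
        wingspan_m REAL
    );
""")

# Placeholder values for illustration only.
conn.execute("INSERT INTO items VALUES (1, 'car Audi A8 X', 'car')")
conn.execute("INSERT INTO car VALUES (1, 5000, 1900, 'petrol')")
conn.execute("INSERT INTO items VALUES (2, 'plane Boeing 747-200B Y', 'plane')")
conn.execute("INSERT INTO plane VALUES (2, 70.0, 60.0)")

# All data for cars: join the supertype to the car subtype.
car_rows = conn.execute("""
    SELECT i.name, c.length_mm, c.powertrain
    FROM items i JOIN car c ON c.item_id = i.item_id
""").fetchall()
print(car_rows)  # [('car Audi A8 X', 5000, 'petrol')]
```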
I am not sure what you mean by storing something in a pivot table. I'm familiar with pivot tables in Excel, but I have always used them as results calculated from ordinary tables where the data is stored.
How to deal with historical data is a separate issue.

I need help normalizing this table to 3NF in the E-R Model

I need help getting this table "Route Sheet" to 3rd Normal Form. I was doing it in class but my teacher keeps telling me that some things are wrong, and I want to see how this really should be normalized in the E-R Model. Any help with just the final process would be appreciated; I really need to learn this before my exam. (I need to use foreign keys and primary keys too.)
Note: There can be more vehicles, more than one type of vehicles and more drivers.
Here is a pic: http://i60.tinypic.com/so0chy.png
Here is what I have done so far of the E-R model in PowerDesigner: http://i62.tinypic.com/2zoyse0.png
The arrow means many (no arrow means one), so it would be one to many.
Let's look at what you have. Think about which pieces of data are in a one to many relationship or a many to many relationship. Those will indicate the need for more tables. First you have vehicles; that would be one table. You have vehicle types; that would be a lookup table that is a parent to the vehicle table as there will be many vehicles for each type.
Now you have drivers. Depending on the information you have about drivers, you may need separate tables for things like addresses or phone numbers if a driver can have more than one.
Then you have trip data which would include the pk for both the vehicle and driver (Assumption one vehicle, one driver per trip; if that is not true, then another table will be needed) as well as the start and stop times.
Then you have the details about the trips, which go into the trip detail table, but some of those things are going to be lookups from other tables, such as customers (there is another set of tables there, as customers are likely to have more than one possible address, so customers and customer addresses).
Also you probably need a product table that defines the products you have available for delivery. In the detail table, though, you need the product_id and then add the quantity and price if you need those. This is because those are historical values and you can't use the price in the product table as that changes over time, but you need a record of what it actually was at that point in time.
It also appeared there were some other values that would make sense as lookup tables that are parents of the trip detail table.
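The point about storing the price in the detail row can be shown with a short sketch using Python's sqlite3 module. The table and column names are guesses, since the route sheet itself is only available as an image, and the values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price NUMERIC);
    CREATE TABLE trip_detail (
        trip_id    INTEGER,
        product_id INTEGER REFERENCES product(product_id),
        quantity   INTEGER,
        price      NUMERIC  -- price at the time of the trip, copied from product
    );
""")

conn.execute("INSERT INTO product VALUES (1, 'crate', 10)")

# At delivery time, copy the current price into the detail row.
conn.execute("""
    INSERT INTO trip_detail (trip_id, product_id, quantity, price)
    SELECT 500, product_id, 3, price FROM product WHERE product_id = 1
""")

# Later the catalogue price changes; the historical row must not.
conn.execute("UPDATE product SET price = 12 WHERE product_id = 1")

billed = conn.execute(
    "SELECT quantity * price FROM trip_detail WHERE trip_id = 500").fetchone()[0]
current = conn.execute(
    "SELECT price FROM product WHERE product_id = 1").fetchone()[0]
print(billed, current)  # 30 12
```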

Normalization in database with countries as columns

This has been bugging me for a while, consider a table with attributes like this: {ID, Value, Australia, India, France, Germany}, where ID is the primary key, Value is some text, say car-model and under each attribute like Australia, India is the number of cars manufactured corresponding to that value.
Intuitively I know that the correct way to put this by {ID, Value, Cars-Manufactured, Country} , but can someone tell me why this is correct in terms of database normalization? Which normalization does the first table not meet. Or is the first table correct too?
The rule it violates is "no repeating groups". This is one of the rules for first normal form.
A column for each country is a repeating group. The data under each column is the same data, just applicable in a different context. When there is only one value there -- like number of cars made in that country -- this may not be obvious, maybe it's even debatable. But suppose we need two pieces of information for each country, like number manufactured and number sold. Now the table has a set of paired columns: Australia_manufactured, Australia_sold, India_manufactured, India_sold, France_manufactured, France_sold, etc. You have a set of two columns repeated multiple times.
Someone could ask: what is the difference between multiple distinct fields and a repeating group? How is "India_manufactured, Australia_manufactured, France_manufactured" different from "number_manufactured, price, description"? The difference is that in the first case, the semantic meaning of the value is the same; all that differs is a context, an application. In the second case, the semantic meaning is different. That is, it is hard to imagine a query or program that processes the data beyond a trivial "find the biggest value" or some such in which we would run it today processing number_manufactured, and then run it tomorrow doing exactly the same processing but on price. But we could easily imagine running today for India and tomorrow for Germany.
Of course there are times when it can be ambiguous. That's why database designers get paid the big bucks. :-)
Okay, that's the rule. Does the rule have practical value?
Let's consider scenario A, one table:
model (model_id, description, india_manufactured, australia_manufactured, france_manufactured)
Scenario B, two tables:
model (model_id, description)
production (model_id, country_code, manufactured)
There are a number of reasons why scenario A sucks. Here's the biggest:
Queries are much simpler with Scenario B. We do not have to hard-code countries into our program or query. Write a query to accept a country code as a parameter and return the number of each model manufactured in that country. In scenario B, simple:
select description, manufactured
from model join production on model.model_id = production.model_id
where production.country_code = :country_code
Easy. Now do it with scenario A. Something like:
select description,
  case when :country_code = 'IN' then india_manufactured
       when :country_code = 'AU' then australia_manufactured
       when :country_code = 'FR' then france_manufactured
       else null
  end as manufactured
from model
Or suppose we want the total produced in all countries. Scenario B:
select description, sum(manufactured)
from model
join production on model.model_id = production.model_id
group by description
Scenario A:
select description, india_manufactured+australia_manufactured+france_manufactured
from model
(Might be more complex if we have to allow for nulls.)
We'd likely have many, many such queries throughout the system. In real life, many would be much more complex than this, with multiple such messy case clauses or juggling multiple columns. Now suppose we add another country. In scenario B, this is zero effort. We can add and delete countries all we like and the queries don't change. But in scenario A, we would have to find every query and change it. If we miss one, we won't get any compile errors or anything like that. We'll just mysteriously get incorrect results.
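Here is a runnable sketch of scenario B using Python's sqlite3 module, with made-up sample data. The helper names made_in and totals are hypothetical. It shows that adding a country is just another row and that neither query has to change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model (model_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE production (
        model_id     INTEGER REFERENCES model(model_id),
        country_code TEXT,
        manufactured INTEGER,
        PRIMARY KEY (model_id, country_code)
    );
""")
conn.executemany("INSERT INTO model VALUES (?, ?)",
                 [(1, "hatchback"), (2, "sedan")])
conn.executemany("INSERT INTO production VALUES (?, ?, ?)",
                 [(1, "IN", 100), (1, "AU", 40), (2, "IN", 70), (2, "FR", 30)])

def made_in(country_code):
    """Number of each model manufactured in one country; the country is a parameter."""
    return conn.execute("""
        SELECT description, manufactured
        FROM model JOIN production ON model.model_id = production.model_id
        WHERE production.country_code = :country_code
        ORDER BY description
    """, {"country_code": country_code}).fetchall()

def totals():
    """Total manufactured per model across all countries."""
    return conn.execute("""
        SELECT description, SUM(manufactured)
        FROM model JOIN production ON model.model_id = production.model_id
        GROUP BY description ORDER BY description
    """).fetchall()

india = made_in("IN")    # [('hatchback', 100), ('sedan', 70)]
before = totals()        # [('hatchback', 140), ('sedan', 100)]

# Adding a country is a data change only; the queries stay as they are.
conn.execute("INSERT INTO production VALUES (2, 'DE', 25)")
after = totals()         # [('hatchback', 140), ('sedan', 125)]
germany = made_in("DE")  # [('sedan', 25)]
print(india, before, after, germany)
```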
Oh, and by the way, it's likely that there will be times when we only want to process some of the countries. Like, say some of the countries have a VAT and some don't, or whatever. In scenario B, we add a column for this fact and test on it. That's just "join country on country.country_code=production.country_code and country.vat=1". In scenario A the programmer would almost surely end up hard-coding the list of specific countries in each query. Then someone comes along later and sees that query X processes India and France and query Y processes France and Germany and query Z processes Germany and Singapore and he might well have no idea why. Even if he knows, the list is hard-coded in every query so every update requires updating every query, changing code rather than changing data.
Also, suppose we come across a query that only processes three of the four countries. How do we know whether this is a mistake (someone forgot one of the countries when writing the query, or missed this query when a new country was added) or whether there is some reason why this country was excluded?
The second approach is better for you, as you will have better clarity in the data and can also avoid INSERT, DELETE and UPDATE anomalies.
Yes, with the second approach you will have more rows.
Basically, when you design a DB the normal approach is to go for 3NF.
Table COUNTRYANDCARS [MODEL (PK), AUSTRALIA, INDIA, FRANCE, GERMANY]
The above approach is only reasonable when the set of countries is fixed.
Table CARPRODUCTION [MODEL (PK), COUNTRY (PK), COUNT]
This would work for any number of countries.