Which columns should be indexed for the query below in MS SQL Server?

I have a table with employee_name, state, city and zip columns. I have to join it to separate city, state and zip tables, and to one table that holds all of the city, state and zip columns.
So which columns should I index: city, state and zip separately, or one combined index on city, state and zip?
This is a theoretical question so please help me understand the indexes and the performance issues.

If I understand your question, what you need is a numeric primary key column on the City/State/Zip table. The table with employee_name then has a column holding the appropriate key from that other table. The City/State/Zip table of course has an index on its primary key column. You may also want an index on the employee_name column.
Although it's possible, for efficiency one normally would not put an index across three text columns; that would make for a rather large index. Instead we create what is called a surrogate key. In this case, as I suggested, the surrogate key should be a simple integer. The three text columns City, State and Zip together form what is called the natural key. But this natural key contains redundant information that increases its size. For example, there is a dependency between zip code and state.
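A minimal sketch of the layout described above. It uses SQLite through Python's sqlite3 module only so that it runs anywhere; SQL Server syntax differs in places. The table names location and employee, the index name and the sample rows are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,   -- integer surrogate key, indexed as PK
    city        TEXT NOT NULL,
    state       TEXT NOT NULL,
    zip         TEXT NOT NULL
);
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    employee_name TEXT NOT NULL,
    location_id   INTEGER NOT NULL REFERENCES location (location_id)
);
-- Optional index on the name column (hypothetical index name).
CREATE INDEX ix_employee_name ON employee (employee_name);
""")

# Illustrative sample rows.
conn.execute("INSERT INTO location VALUES (1, 'Austin', 'TX', '78701')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 1)")
conn.execute("INSERT INTO employee VALUES (2, 'Bob', 1)")

# The join compares one integer column instead of three text columns.
rows = conn.execute("""
    SELECT e.employee_name, l.city, l.state, l.zip
    FROM employee AS e
    JOIN location AS l ON l.location_id = e.location_id
    ORDER BY e.employee_name
""").fetchall()
print(rows)
```

Both employees point at the same location row, so the city, state and zip text is stored once and the join key stays small.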

Related

Postgresql: Primary key for table with one column

Sometimes, there are certain tables in an application with only one column in each of them. Data of records within the respective columns are unique. Examples are: a table for country names, a table for product names (up to 60 characters long, say), a table for company codes (3 characters long and determined by the user), a table for address types (say, billing, delivery), etc.
For tables like these, as the records are unique and not null, the only column can be used as the primary key, technically speaking.
So my question is, is it good enough to use that column as the primary key for the table? Or, is it still desirable to add another column (country_id, product_id, company_id, addresstype_id) as the primary key for the table? Why?
Thanks in advance for any advice.
There is always a debate between using surrogate keys and composite keys as the primary key. Composite primary keys always introduce some complexity into your database design, and therefore into your application.
Suppose another table needs a direct relationship to your resulting table (say, a billing table). In the composite key scenario, the related table needs 4 columns in order to connect to the billing table. If you use a surrogate key, on the other hand, you have a single identity column (simplicity), and you can still create a unique constraint on (country_id, product_id, company_id, addresstype_id).
But it is hard to say that one approach is better than the other, because both have pros and cons.
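The two options from the question can be put side by side in a small sketch. It assumes SQLite via Python's sqlite3 module as a runnable stand-in for PostgreSQL; the table names and the sample value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Option A: the only column is itself the (natural) primary key.
CREATE TABLE country_natural (
    country_name TEXT PRIMARY KEY
);
-- Option B: surrogate key, with UNIQUE still guarding the natural value.
CREATE TABLE country_surrogate (
    country_id   INTEGER PRIMARY KEY,
    country_name TEXT NOT NULL UNIQUE
);
""")
conn.execute("INSERT INTO country_natural VALUES ('France')")
conn.execute("INSERT INTO country_surrogate (country_name) VALUES ('France')")


def is_rejected(sql):
    """Return True if running the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


dup_natural = is_rejected("INSERT INTO country_natural VALUES ('France')")
dup_surrogate = is_rejected(
    "INSERT INTO country_surrogate (country_name) VALUES ('France')"
)
print(dup_natural, dup_surrogate)
```

Either design keeps the values unique. The practical difference is what referencing tables have to store: the text value itself in option A, or a small integer in option B.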

One Primary Key Value in many tables

This may seem like a simple question, but I am stumped:
I have created a database about cars (in Oracle SQL developer). I have amongst other tables a table called: Manufacturer and a table called Parentcompany.
Since some manufacturers are owned by bigger corporations, I will also show them in my database.
The parentcompany table is the "parent table" and the Manufacturer table the "child table".
For both tables I have created columns, and each table has its own primary key.
For some reason, when I inserted the values for my columns, I was able to use the same value for the primary key of Manufacturer and of Parentcompany:
The column: ManufacturerID is primary Key of Manufacturer. The value for this is: 'MBE'
The column: ParentcompanyID is primary key of Parentcompany. The value for this is 'MBE'
Both have the same value. Do I have a problem with the thinking logic?
Or do I just not understand how primary keys work?
Does a primary key only need to be unique in a table, and not the database?
I would appreciate it if someone shed light on the situation.
A primary key only needs to be unique within its own table.
Have a look at this tutorial: SQL - Primary key
A primary key is a field in a table which uniquely identifies each row/record in a database table. Primary keys must contain unique values. A primary key column cannot have NULL values.
A table can have only one primary key, which may consist of single or multiple fields. When multiple fields are used as a primary key, they are called a composite key.
If a table has a primary key defined on any field(s), then you cannot have two records having the same value of that field(s).
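The behaviour from the question can be reproduced in a few lines. This sketch assumes SQLite via Python's sqlite3 module rather than Oracle, and the column names are simplified guesses at the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parentcompany (parentcompany_id TEXT PRIMARY KEY);
CREATE TABLE manufacturer  (manufacturer_id  TEXT PRIMARY KEY);
""")

# The same key value in two different tables is accepted.
conn.execute("INSERT INTO parentcompany VALUES ('MBE')")
conn.execute("INSERT INTO manufacturer VALUES ('MBE')")

# Repeating the value inside one table is what the primary key forbids.
try:
    conn.execute("INSERT INTO manufacturer VALUES ('MBE')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```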
A primary key is unique per table. You can use the same PK value in every separate table in the DB. In fact that often happens, because the PK is often an incremental number representing the ID of a row: 1, 2, 3, 4...
For your case, a more common implementation would be a hierarchical table called Company with the fields company_name and parent_company_name. If a company has a parent, its parent_company_name field holds a value from the company_name field.
There are several reasons why the same value in two different PKs might work out with no problems. In your case, it seems to flow naturally from the semantics of the data.
A row in the Manufacturers table and a row in the ParentCompany table both appear to refer to the same thing, namely a company. In that case, giving a company the same id in both tables is not only possible, but actually useful. It represents a 1 to 1 correspondence between manufacturers and parent companies without adding extra columns to serve as FKs.
Thanks for the quick answers!
I think I know what to do now. I will create a general company table, in which all companies will be stored. Then, as I go along, I will create specific company tables such as Manufacturer and Parentcompany, each of which references a certain company in the company table.
To clarify, the only column I would put into the sub-company tables is a column with a foreign key referencing a column of the company table, yes?
For the primary key, I was just confused, because I hear so much about the key needing to be unique, and can't have the same value as another. So then this condition only goes for tables, not the whole database. Thanks for the clarification!
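One possible shape for the design described in this follow-up, as a sketch only: a general company table plus sub-tables whose single column is both primary key and foreign key. SQLite via Python's sqlite3 module stands in for Oracle, and the company_name column and its value are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE company (
    company_id   TEXT PRIMARY KEY,
    company_name TEXT NOT NULL
);
-- Each sub-table holds one column: a key that must exist in company.
CREATE TABLE manufacturer (
    company_id TEXT PRIMARY KEY REFERENCES company (company_id)
);
CREATE TABLE parentcompany (
    company_id TEXT PRIMARY KEY REFERENCES company (company_id)
);
""")
conn.execute("INSERT INTO company VALUES ('MBE', 'Example company')")
conn.execute("INSERT INTO manufacturer VALUES ('MBE')")
conn.execute("INSERT INTO parentcompany VALUES ('MBE')")

# A sub-table row for a company that does not exist is refused.
try:
    conn.execute("INSERT INTO manufacturer VALUES ('XYZ')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

Here the shared value 'MBE' is deliberate: it ties the manufacturer row and the parent company row back to the same company.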

Why do we require secondary indices in DBMS?

I get the point that primary indices are unique to each record, and hence retrieving a record gets faster using primary indexing. What happens when we use secondary indexing?
From what I can think of:
ID  Name   School
1   John   XYZ
2   Roger  XYZ
3   Ray    ABC
4   Matt   KJL
5   Roger  ABC
If we have a secondary index on Name, it will help me retrieve records by name rather than by id. It would not restrict me to one record: if I query for Roger, I get the results for both Rogers. Hence, if the table is queried extensively on the Name column, a secondary index on it should be used.
Am I right?
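The scenario in the question can be tried out directly. This is a sketch that assumes SQLite via Python's sqlite3 module; the table name student and the index name are hypothetical, and the rows are the ones listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (            -- hypothetical table name
    id     INTEGER PRIMARY KEY,   -- primary index: one row per id
    name   TEXT NOT NULL,
    school TEXT NOT NULL
);
CREATE INDEX ix_student_name ON student (name);  -- secondary, non-unique
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "John", "XYZ"), (2, "Roger", "XYZ"), (3, "Ray", "ABC"),
     (4, "Matt", "KJL"), (5, "Roger", "ABC")],
)

# A lookup on the secondary key can return several rows.
rogers = conn.execute(
    "SELECT id, school FROM student WHERE name = ? ORDER BY id", ("Roger",)
).fetchall()
print(rogers)

# Ask SQLite how it intends to run the lookup.
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT id, school FROM student WHERE name = 'Roger'"
):
    print(row)
```

The index does not change the result of the query, only how the rows are found, which is why it pays off mainly for columns that are searched often.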
Apart from speeding up specific queries, perhaps the most common case for secondary indexes is to speed up checking of UNIQUE constraints. Consider e.g. a table
CREATE TABLE Person (
    id int primary key,
    fname text not null,
    lname text not null,
    date_of_birth date not null,
    ...
    UNIQUE (fname, lname, date_of_birth)
)
Here we want to enforce the UNIQUE constraint to ensure the same person doesn't appear in the table multiple times under different ids. But at the same time we wouldn't want to make (fname, lname, date_of_birth) the primary key, because a person's name could potentially change, and because using 3 attributes as reference can be cumbersome.
Now, when inserting a new record into the table, the DBMS needs to check whether it already contains another tuple with the same (fname, lname, date_of_birth), and a secondary index on these attributes can help speed this check up.
Note that in most DBMSs a UNIQUE constraint automatically creates its own index, so there is no need to create it explicitly.
Another common case where secondary indexes are required (and must be created explicitly) is a foreign key constraint that targets attributes that do not make up the primary key of the target table.
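The point about UNIQUE constraints bringing their own index can be checked in SQLite, used here through Python's sqlite3 module as an assumption; other DBMSs expose the same information through their own catalog views. The elided columns from the Person example are simply left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE person (
    id            INTEGER PRIMARY KEY,
    fname         TEXT NOT NULL,
    lname         TEXT NOT NULL,
    date_of_birth TEXT NOT NULL,
    UNIQUE (fname, lname, date_of_birth)
)
""")

# No CREATE INDEX was issued, yet the constraint is backed by an index.
indexes = conn.execute("PRAGMA index_list(person)").fetchall()
index_name = indexes[0][1]

# Third field of each index_info row is the indexed column's name.
index_columns = [
    row[2] for row in conn.execute(f"PRAGMA index_info({index_name})")
]
print(index_name, index_columns)
```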

Table design, composite key

I have a table with summary data which consists of client_id, location_id, category_id and summary columns. The values in each of the three id columns are not unique on their own.
At the moment I have created a composite primary key from client_id, location_id and category_id. Together, those three columns uniquely identify rows.
My question is whether I should still include a separate unique primary key for that table, for example an auto-increment id column?
That depends completely on how you use the table. If you never need to refer to a given row from elsewhere (for example, from a dependent table), a separate PK is unnecessary (e.g. if you always ask for statistics for a given client, a given location and a given category). However, if you do have dependent tables, you probably want a separate PK as well.
If your composite key is the primary clustered index then I would say it's not necessary.
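Both variants discussed in the answers, sketched under assumptions: SQLite via Python's sqlite3 module, hypothetical table names, made-up sample values, and a hypothetical dependent table summary_note to show where the separate id helps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Variant 1: the three ids together are the primary key.
CREATE TABLE summary_composite (
    client_id   INTEGER NOT NULL,
    location_id INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    summary     TEXT,
    PRIMARY KEY (client_id, location_id, category_id)
);
-- Variant 2: auto-increment id, with UNIQUE keeping the triple unique.
CREATE TABLE summary_surrogate (
    summary_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    client_id   INTEGER NOT NULL,
    location_id INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    summary     TEXT,
    UNIQUE (client_id, location_id, category_id)
);
-- A dependent table then needs one reference column instead of three.
CREATE TABLE summary_note (
    note_id    INTEGER PRIMARY KEY,
    summary_id INTEGER NOT NULL REFERENCES summary_surrogate (summary_id),
    note       TEXT NOT NULL
);
""")
conn.execute("INSERT INTO summary_composite VALUES (1, 10, 100, 'first')")
# Same client and location, different category: still a distinct key.
conn.execute("INSERT INTO summary_composite VALUES (1, 10, 200, 'second')")
try:
    conn.execute("INSERT INTO summary_composite VALUES (1, 10, 100, 'again')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

cur = conn.execute(
    "INSERT INTO summary_surrogate (client_id, location_id, category_id, summary)"
    " VALUES (1, 10, 100, 'first')"
)
conn.execute(
    "INSERT INTO summary_note (summary_id, note) VALUES (?, ?)",
    (cur.lastrowid, "references the row through a single column"),
)
print(duplicate_rejected, cur.lastrowid)
```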

SQL Server 2008 - Table - Clarifications

I am new to SQL Server 2008 database development.
Here I have a master table named ‘Student’ and a child table named ‘Address’. The common column between these tables is ‘Student ID’.
My questions are:
1) Do we need to put ‘Address Id’ in the ‘Address’ table and make it the primary key? Is it mandatory? (I won’t be using this ‘Address Id’ in any of my reports.)
2) Is a primary key column a must in any table?
Would you please help me with these?
Could you also recommend good links/tutorials on SQL Server 2008 database design practices (if you are aware of any), covering naming conventions, best practices, SQL optimizations, etc.?
1) Yes, having an ADDRESS_ID column as the primary key of the ADDRESS table is a good idea.
But having the STUDENT_ID as a foreign key in the ADDRESS table is not a good idea. It means that an address record can only be associated with one student. Students can have roommates, so they'd have identical addresses. Which comes back to why it's a good idea to have the ADDRESS_ID column as a primary key, as it will indicate a unique address record.
Rather than have the STUDENT_ID column in the ADDRESS table, I'd have a corollary/xref/lookup table between the STUDENT and ADDRESS tables:
STUDENT_ADDRESSES_XREF
STUDENT_ID, pk, fk to STUDENTS table
ADDRESS_ID, pk, fk to ADDRESS table
EFFECTIVE_DATE, date, not null
EXPIRY_DATE, date, not null
This uses a composite primary key, so that only one combination of a given student and address exists. I added the dates in case there is a need to know exactly when an address applied, because someone could move back home, etc., after all.
Most importantly, this works off the ADDRESS_ID column to allow a single address to be associated with multiple people.
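The xref design above might look roughly like this. It is a sketch assuming SQLite via Python's sqlite3 module instead of SQL Server 2008; the student_name and street columns and all sample values, including the dates, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE students (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL            -- hypothetical column
);
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    street     TEXT NOT NULL              -- hypothetical column
);
CREATE TABLE student_addresses_xref (
    student_id     INTEGER NOT NULL REFERENCES students (student_id),
    address_id     INTEGER NOT NULL REFERENCES address (address_id),
    effective_date TEXT NOT NULL,
    expiry_date    TEXT NOT NULL,
    PRIMARY KEY (student_id, address_id)  -- one row per student/address pair
);
""")
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.execute("INSERT INTO address VALUES (1, '1 Main St')")

# Two roommates share a single address record.
conn.executemany(
    "INSERT INTO student_addresses_xref VALUES (?, ?, ?, ?)",
    [(1, 1, "2010-09-01", "2011-06-30"), (2, 1, "2010-09-01", "2011-06-30")],
)
residents = conn.execute("""
    SELECT s.student_name
    FROM student_addresses_xref AS x
    JOIN students AS s ON s.student_id = x.student_id
    WHERE x.address_id = 1
    ORDER BY s.student_name
""").fetchall()
print(residents)
```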
2) Yes, defining a primary key is frankly a must for any table.
In most databases, the act also creates an index - making searching more efficient. That's on top of the usual things like making sure a record is a unique entry...
Every table should have a way to uniquely and unambiguously identify a record. Make AddressID the primary key for the address table.
Without a primary key, the database will allow duplicate records; possibly creating join problems or trigger problems (if you implement them) down the road.
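A small illustration of the join problem mentioned above, assuming SQLite via Python's sqlite3 module. The table address_no_pk and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- No primary key here, so nothing stops identical rows.
CREATE TABLE address_no_pk (
    student_id INTEGER NOT NULL,
    street     TEXT NOT NULL
);
""")
conn.execute("INSERT INTO student VALUES (1, 'Ann')")
conn.execute("INSERT INTO address_no_pk VALUES (1, '1 Main St')")
conn.execute("INSERT INTO address_no_pk VALUES (1, '1 Main St')")  # accepted

# One student now shows up twice in the join result.
joined = conn.execute("""
    SELECT s.name, a.street
    FROM student AS s
    JOIN address_no_pk AS a ON a.student_id = s.student_id
""").fetchall()
print(joined)
```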