PowerPivot relationship based on two columns per table - powerpivot

I have a table geoLocations which holds, among others, the two columns latitude and longitude. There is a second table (let's call it cities), which holds the corresponding city for each unique pair of latitude and longitude.
How can I model this relationship using PowerPivot? Creating two separate relationships will fail, as one coordinate may occur several times in the lookup-table (only the combination of both is unique).
Thanks in advance :)

You should create a key in each table that concatenates the two fields. This can be done in a calculated column, e.g.:
= LATITUDE & "-" & LONGITUDE
Alternatively, build the key on import, which would yield better performance on a large data set.
Jacob
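The same idea can be sketched outside PowerPivot. The snippet below is a minimal illustration in Python, not PowerPivot code: the sample rows, the `make_key` helper, and the `|` separator are all assumptions made for the example. It builds one concatenated key per row and uses it as a single-column lookup.

```python
# Hypothetical sample data; only the column names come from the question.
geo_locations = [
    {"latitude": 48.137, "longitude": 11.575},
    {"latitude": 48.137, "longitude": 2.352},   # latitude repeats, pair does not match
    {"latitude": 48.857, "longitude": 2.352},
]
cities = [
    {"latitude": 48.137, "longitude": 11.575, "city": "Munich"},
    {"latitude": 48.857, "longitude": 2.352, "city": "Paris"},
]


def make_key(row):
    """Concatenate both coordinates into one key (hypothetical helper)."""
    return f"{row['latitude']}|{row['longitude']}"


# The lookup table is keyed on the combined value, which is unique per pair.
lookup = {make_key(c): c["city"] for c in cities}

# Each geoLocations row is matched through the single combined key.
matched = [lookup.get(make_key(g)) for g in geo_locations]
print(matched)
```

The second row shares a latitude with one city and a longitude with another, yet matches neither, which is the behaviour two separate single-column relationships could not give.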

Related

SQL - Selecting columns based on attributes of the column

I am currently designing a SQL database to house a large amount of biological data. The main table has over 100 columns, where each row is a particular sampling event and each column is a species name. Values are the number of individuals found of that species for that sampling event.
Often, I would like to aggregate species together based on their taxonomy. For example: suppose Sp1, Sp2, and Sp3 belong to Family1; Sp4, Sp5, and Sp6 belong to Family2; and Family1 and Family2 belong to Class1. How do I structure the database so I can simply query a particular Family or Class, instead of listing 100+ columns each time?
My first thought was to create a second table that lists the attributes of each column from the first table. Such that the primary key in the second table corresponded to the column headers in table 1, and the columns in table 2 are the categories I would want to select by (such as Family, Feeding type, life stage, etc.). However, I'm not sure how to write a query that can join tables in such a way.
I'm a newbie to SQL, and am not sure if I'm going about this in completely the wrong way. How can I structure my data/write queries to accomplish my goal?
Thanks in advance.
No, no, no. Don't make species columns in the table.
Instead, where you have one row now, you want multiple rows. It would have columns such as:
id: auto generated sequential number
sampleId: whatever each row in the current table belongs to
speciesId: reference to the species table
columns of data for that species on that sampling
The species table could then have a hierarchy, the entire hierarchy with genus, family, order, and so on.
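A minimal sketch of that layout, using Python's built-in `sqlite3` module so it can be run as-is. The table names (`species`, `observation`), the column names, and the sample counts are assumptions for illustration; only the Sp/Family/Class labels come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    family    TEXT NOT NULL,
    tax_class TEXT NOT NULL
);
-- One row per (sampling event, species) instead of one column per species.
CREATE TABLE observation (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    sample_id   INTEGER NOT NULL,
    species_id  INTEGER NOT NULL REFERENCES species(id),
    individuals INTEGER NOT NULL
);
""")

conn.executemany(
    "INSERT INTO species (id, name, family, tax_class) VALUES (?, ?, ?, ?)",
    [(1, "Sp1", "Family1", "Class1"),
     (2, "Sp2", "Family1", "Class1"),
     (3, "Sp4", "Family2", "Class1")],
)
conn.executemany(
    "INSERT INTO observation (sample_id, species_id, individuals) VALUES (?, ?, ?)",
    [(1, 1, 3), (1, 2, 5), (1, 3, 2), (2, 1, 1)],
)

# Aggregate by family without naming any individual species.
family_totals = conn.execute("""
    SELECT o.sample_id, s.family, SUM(o.individuals)
    FROM observation AS o
    JOIN species AS s ON s.id = o.species_id
    GROUP BY o.sample_id, s.family
    ORDER BY o.sample_id, s.family
""").fetchall()
print(family_totals)
```

Grouping by `s.tax_class` instead of `s.family` would give the class-level totals with the same query shape.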

How to deal with a single cell containing multiple values?

I have an exercise that requires creating two tables for a travel business:
Activity
Booking
It turns out that the activities column in the Booking table references the Activity table. However, it contains multiple values. How do I sort this out? If I insert multiple rows, the Booking primary key will probably be duplicated.
As Gordon mentioned, you should refactor your tables for better normalization. If I interpret your intent correctly, this is more like what your schema should look like. Booking should only contain an ID for the adventure and an ID for the customer. You then add a row to [AdventureActivity] for each activity booked on a [Booking]. With this design you can JOIN the tables and get all the data you require without having to parse multiple values out of one column.
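A simplified sketch of the one-row-per-booked-activity idea, again with Python's `sqlite3`. The table and column names here (`activity`, `booking`, `booking_activity`, `customer_name`) are hypothetical and do not reproduce the schema the answer refers to; they only show how a junction table avoids both multi-valued cells and duplicate booking keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE activity (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE booking (
    id            INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
-- Junction table: one row per activity on a booking.
CREATE TABLE booking_activity (
    booking_id  INTEGER NOT NULL REFERENCES booking(id),
    activity_id INTEGER NOT NULL REFERENCES activity(id),
    PRIMARY KEY (booking_id, activity_id)
);
""")

conn.executemany("INSERT INTO activity (id, name) VALUES (?, ?)",
                 [(1, "Hiking"), (2, "Kayaking"), (3, "Climbing")])
conn.execute("INSERT INTO booking (id, customer_name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO booking_activity (booking_id, activity_id) VALUES (?, ?)",
    [(1, 1), (1, 2)],
)

# All activities for booking 1; the booking itself still has exactly one row.
booked = [row[0] for row in conn.execute("""
    SELECT a.name
    FROM booking_activity AS ba
    JOIN activity AS a ON a.id = ba.activity_id
    WHERE ba.booking_id = 1
    ORDER BY a.name
""")]
print(booked)
```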

Design Sql Tables with common columns

I have 5 tables that have the same structure and the same columns: id, name, description. I wonder what the best design is to avoid having 5 tables with the same columns:
1. Create a category table that includes my three common columns plus another "enum" column that differentiates my categories (e.g. city, country, continent, etc.).
2. Create a category table that includes my three common columns, and create the other five tables so that they just include an id.
Note that I will have an association table that includes the ids of cities, countries, continents, etc., so I can display them in a report.
Thank you for your advice.
The decision on how many tables to have under these circumstances simply depends.
The most important factor is whether the five things are independent entities or whether they are related. A simple way to understand this is by understanding foreign key relationships: Will other tables have a column that could refer to any of the five (say "geoid")? Or will other tables have a column that generally refers to one of the five ("cityid", "countryid")? The ability to define helpful foreign key constraints often drives the table structure.
There are other considerations. If your data is at the geographic level, then it might represent hierarchies: cities are in countries, countries are on continents. Some databases (such as MySQL before version 8.0) do not support hierarchical queries at all. Under these circumstances, you might consider denormalizing the data for querying purposes.
Other considerations can also come into play. If your application is going to be internationalized, then having all the reference tables in a single place is handy -- for providing cultural-specific information (language, currency symbol, and so on). One way of handling this situation is to put all such references in a single table (and perhaps using more sophisticated foreign key relationships).
The column names are not important, just the data in the columns. If the city description, country description and continent description hold different information, then you are already doing this the right way. You would only aim to reduce this data if you were repeating information; repeated column titles are fine.
In fact, you are doing this correctly. Country will have different values from city for every field mentioned. Id is just an id; every table should have one. Name and description won't be the same across country and city.
Also, this way, if you want a country's name you don't have to go through every country, continent and city. You only have 192 or so entries to go through. If you had all of that in one massive table, you would have to use it for everything and go through every result every time you want data. You would also have to distinguish between cities, countries and continents in some way other than separate tables.
Eg:
method 1, with 5 tables:
SELECT * FROM country
does the same as
method 2, 1 table:
SELECT * FROM table WHERE enumvalue = 'country';
If you have tables representing city, country and continent, and they all have exactly the same fields, you have a fundamental problem. In real life, every city is in a country and every country is in at least one continent (more or less) but your data structure does not reflect that. Your city table should look something like this:
id (primary key)
countryId (foreign key to country)
name
other fields
You'll need a similar relationship between countries and continents. However, before you do so, you have to decide what to do about countries like Russia which is in two continents and Palau which isn't really in any.
You may also want to have a provinceStateTerritory table to help you sort out the 38 places in the United States named Springfield. Or, you may want to handle that situation differently.
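The city table with a `countryId` foreign key described in this answer can be sketched as follows with Python's `sqlite3`. The snake_case column names and the sample rows are assumptions; continents and provinces are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE country (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE city (
    id         INTEGER PRIMARY KEY,
    country_id INTEGER NOT NULL REFERENCES country(id),
    name       TEXT NOT NULL
);
""")

conn.executemany("INSERT INTO country (id, name) VALUES (?, ?)",
                 [(1, "United States"), (2, "Germany")])
conn.executemany("INSERT INTO city (id, country_id, name) VALUES (?, ?, ?)",
                 [(1, 1, "Springfield"), (2, 2, "Berlin")])

# The hierarchy is now expressed in the data, so a join recovers it.
city_with_country = conn.execute("""
    SELECT city.name, country.name
    FROM city
    JOIN country ON country.id = city.country_id
    ORDER BY city.name
""").fetchall()
print(city_with_country)
```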

what is the best database design for this table when you have two types of records

i am tracking exercises. i have a workout table with
id
exercise_id (foreign key into exercise table)
now, some exercises like weight training would have the fields:
weight, reps (i just lifted 10 times @ 100 lbs.)
and other exercises like running would have the fields: time, distance (i just ran 5 miles and it took 1 hour)
should i store these all in the same table and just have some records have 2 fields filled in and the other fields blank or should this be broken down into multiple tables.
at the end of the day, i want to query for all exercises in a day (which will include both types of exercises) so i will have to have some "switch" somewhere to differentiate the different types of exercises
what is the best database design for this situation
There are a few different patterns for modelling object-oriented inheritance in database tables. The simplest is single table inheritance, which will probably work great in this case.
Implementing it is mostly according to your own suggestion to have some fields filled in and the others blank.
One way to do it is to have an "exercise" table with a "type" field that names another table where the exercise-specific details are, and a foreign key into that table.
if you plan on keeping it only 2 types, just have exercise_id, value1, value2, type
you can filter the type of exercise in the where clause and alias the column names in the same statement so that the results don't say value1 and value2, but weight and reps or time and distance
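A minimal sketch of the single-table variant with aliased columns, using Python's `sqlite3`. The `value1`, `value2` and `type` columns follow the suggestion above; the type labels, the units (minutes, miles) and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read results by column alias
conn.execute("""
    CREATE TABLE workout (
        id          INTEGER PRIMARY KEY,
        exercise_id INTEGER NOT NULL,
        type        TEXT NOT NULL CHECK (type IN ('weights', 'running')),
        value1      REAL,
        value2      REAL
    )
""")
conn.executemany(
    "INSERT INTO workout (exercise_id, type, value1, value2) VALUES (?, ?, ?, ?)",
    [(1, "weights", 100, 10),   # 100 lbs, 10 reps
     (2, "running", 60, 5)],    # 60 minutes, 5 miles
)

# Same two columns, different meaning per type; the alias names the meaning.
weights = conn.execute(
    "SELECT value1 AS weight, value2 AS reps FROM workout WHERE type = 'weights'"
).fetchall()
runs = conn.execute(
    "SELECT value1 AS minutes, value2 AS miles FROM workout WHERE type = 'running'"
).fetchall()

print([dict(r) for r in weights])
print([dict(r) for r in runs])
```

A query for everything done in a day simply omits the `WHERE type = ...` filter and returns `type` alongside the two value columns.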

How to represent a set of entities from separate tables?

I have a few tables representing geographical entities (CITIES, COUNTIES, ZIPCODES, STATES, COUNTRIES, etc).
I need a way to represent sets of geographical entities. A set can contain records from more than one table. For example, one set may contain 3 records from CITIES, 1 record from COUNTIES and 4 from COUNTRIES.
Here are two possible solutions:
A table which contains three columns, with one record for each entity. The table will contain multiple records for each set, all sharing the set number.
set_id INT, foreign_table VARCHAR(255), foreign_id INT
Sample entries for set #5:
(5,'CITIES',4)
(5,'CITIES',12)
(5,'ZIPCODES',91)
(5,'ZIPCODES',92)
(5,'COUNTRIES',15)
A table which contains a TEXT column for each entity type, which will include a string set with the appropriate entries:
set_id INT,cities TEXT,counties TEXT,zipcodes TEXT,states TEXT,countries TEXT
So the above set will be represented with a single record
(5,'4,12','','91,92','','15')
Any other ideas? Would love to hear your input.
Thanks!
Both solutions you propose don't have real foreign keys. In the first solution, one foreign_id can point to many tables, which is hard (or at least inefficient) for a database to enforce. The second solution stores multiple values in one column, which is the one thing everyone agrees you shouldn't do (it breaks first normal form.)
What I would do is this: cities, zip codes, and states all "have a" geographical location. The normal way to implement that is a one to many relation. Create a geolocation table, and add a geolocation_id column to the cities, zip code, and state tables.
EDIT: Per your comment, to get from a geolocation to its cities:
select *
from geolocation g
left join cities c
    on g.id = c.geolocation_id
left join zipcodes z
    on g.id = z.geolocation_id
....
The database will resolve the joins using the foreign key index, which is very fast.
One Location Set can have many Geography items
One Geography item can belong to many Location Sets
Regarding the Geography item table, two approaches are possible. In the first case the super-type/subtype relationship is overlapping -- more than one sub-type can be linked to the super-type.
For example, there can be GeographyID = 5 in Geography and Zipcodes, Cities, States, Countries tables.
In the second case, we can consider the exclusive (disjoint) relationship, in which only one subtype can be connected to the super-type. The parent-child relationship is used to create paths, like ZIP/City/State/Country -- that is if actual administrative areas allow for this type of relationship.
In this example, there can be GeographyID = 5 in the Geography and only one more sub-type table.
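The exclusive (disjoint) variant can be sketched roughly as below with Python's `sqlite3`. All table and column names (`geography`, `kind`, `location_set`, `location_set_member`) and the sample rows are assumptions for illustration; only two subtype tables are shown, and the parent-child path column is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Super-type: every geographical entity gets exactly one row here.
CREATE TABLE geography (
    id   INTEGER PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN ('city', 'country')),
    name TEXT NOT NULL
);
-- Sub-types share the super-type's key.
CREATE TABLE city (
    geography_id INTEGER PRIMARY KEY REFERENCES geography(id)
);
CREATE TABLE country (
    geography_id INTEGER PRIMARY KEY REFERENCES geography(id)
);
CREATE TABLE location_set (
    id INTEGER PRIMARY KEY
);
-- Many-to-many link between sets and geography items, with real foreign keys.
CREATE TABLE location_set_member (
    set_id       INTEGER NOT NULL REFERENCES location_set(id),
    geography_id INTEGER NOT NULL REFERENCES geography(id),
    PRIMARY KEY (set_id, geography_id)
);
""")

conn.executemany("INSERT INTO geography (id, kind, name) VALUES (?, ?, ?)",
                 [(4, "city", "Springfield"),
                  (12, "city", "Shelbyville"),
                  (15, "country", "United States")])
conn.executemany("INSERT INTO city (geography_id) VALUES (?)", [(4,), (12,)])
conn.execute("INSERT INTO country (geography_id) VALUES (15)")
conn.execute("INSERT INTO location_set (id) VALUES (5)")
conn.executemany(
    "INSERT INTO location_set_member (set_id, geography_id) VALUES (?, ?)",
    [(5, 4), (5, 12), (5, 15)],
)

# Everything in set 5, regardless of which subtype each member belongs to.
members = conn.execute("""
    SELECT g.kind, g.name
    FROM location_set_member AS m
    JOIN geography AS g ON g.id = m.geography_id
    WHERE m.set_id = 5
    ORDER BY g.id
""").fetchall()
print(members)
```

Note that this sketch does not by itself stop one `geography.id` from appearing in both subtype tables; enforcing strict disjointness would need an extra constraint, for example carrying `kind` into each subtype's foreign key.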