This is the query I'm trying to represent visually, along with the diagrams and the short explanation I made.
I realize it would be much easier not using Venn Diagrams for joins but I don't understand it enough to use anything else like Cartesian products.
I just want to make sure it makes sense because I'm having a hard time understanding the joining of 3 or more tables.
You would want something like this:
You will have the entirety of the blue (author) area.
You will have the part of the red (Allocation) area that overlaps the blue (Author) area; for the rest of the blue area, which does not overlap the red area, the Allocation column values will be NULL.
You will have the part of the green (Book) area that overlaps the intersection of the red (Allocation) and blue (Author) areas (where all 3 colours overlap); for the rest of the blue area, which does not overlap the green area, the Book column values will be NULL.
The areas of the red (Allocation) and green (Book) circles that do not overlap the blue (Author) circle will be excluded from the result set.
For a 3 table left join:
1. For each entry in table Author, it checks whether a matching entry exists in table Allocation. If it does not, the Allocation columns are returned as NULL.
2. For those entries which were not NULL in the previous join, it checks whether a matching entry exists in table Book. If it does not, the Book columns are returned as NULL.
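The two steps above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table names come from the question; the column names (author_id, book_id, name, title) and the sample rows are assumptions made up for illustration, since the original query is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical columns; only the table names come from the question.
    CREATE TABLE Author     (author_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Book       (book_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Allocation (author_id INTEGER, book_id INTEGER);

    INSERT INTO Author VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Book VALUES (10, 'SQL Basics'), (11, 'Unallocated Book');
    -- Ann is allocated to an existing book, Bob to a book id with no Book row,
    -- and Cy has no allocation at all.
    INSERT INTO Allocation VALUES (1, 10), (2, 99);
""")

rows = conn.execute("""
    SELECT a.name, al.book_id, b.title
    FROM Author a
    LEFT JOIN Allocation al ON al.author_id = a.author_id
    LEFT JOIN Book b        ON b.book_id = al.book_id
    ORDER BY a.author_id
""").fetchall()

for row in rows:
    print(row)
```

Every author appears exactly once in this sample. Bob keeps his Allocation value but gets NULL for Book, Cy gets NULL for both, and the book that no author is allocated to never shows up.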
Hope this helps. I agree that the diagram might not be the best representation for 3 or more tables joined.
Related
I'm trying to get a query together that selects products based on a combination of data (body type, materials and colors,etc.) and I've gotten close but I'm still missing a step to get exactly what I want. I've been looking more at pivoting data and I've tinkered with this query but I'm still left with this issue.
Basically, I have multiple products that only have one material and color attributed to them and they each have a unique stock keeping ID, but some of the body types will have 2 materials and colors per stock keeping unit.
My current results in this db fiddle https://www.db-fiddle.com/f/u4zKAdw3H4hFLbfnzEeZS2/1
are as shown:
For the most part the results are there but BodB has the same color for both materials and should be one single row like BoDB | Fabric | Black | Leather | Black
So if one body code is attached to 2 different materials but the same color with sequence 1 and 2, it should be a single row, effectively grouped by the body and the color (the materials wouldn't be the same, only the color). And there will only ever be up to 2 materials and colors on a given stock keeping unit.
The fiddle is ready to go with these results; any help is much appreciated.
I'm new to SQL and SSRS and can already do many things, but I think I must be missing some basics, so I keep banging my head against the wall.
I have a report that is almost working, but it needs to include more results, based on conditions.
My working query so far is like this:
SELECT projects.project_number, project_phases.project_phase_id, project_phases.project_phase_number, project_phases.project_phase_header, project_phase_expensegroups.projectphase_expense_total, invoicerows.invoicerow_total
FROM projects INNER JOIN
project_phases ON projects.project_id = project_phases.project_id
LEFT OUTER JOIN
project_phase_expensegroups ON project_phases.project_phase_id = project_phase_expensegroups.project_phase_id
LEFT OUTER JOIN
invoicerows ON project_phases.project_phase_id = invoicerows.project_phase_id
WHERE ( projects.project_number = #iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total >0 )
The parameter comes from a selection list that is used to choose a project for the report.
How can I also include records where project_phase_expensegroups.projectphase_expense_total is 0 but there might be invoices for that project phase?
I already tried adding another condition like this:
WHERE ( projects.project_number = #iProjectNumber )
AND
( project_phase_expensegroups.projectphase_expense_total > 0 )
OR
( invoicerows.invoicerow_total > 0 )
It does give some results, including the one where projectphase_expense_total is 0, but the report is a total mess.
So my question is: what am I doing wrong here?
There is a core problem with your query in that you are left joining to two tables, implying that rows may not exist, but then putting conditions on those tables, which will eliminate NULLs. That means your query is internally inconsistent as is.
The next problem is that you're joining two tables to project_phases that both may have multiple rows. Since these data are not related to each other (as proven by the fact that you have no join condition between project_phase_expensegroups and invoicerows), your query is not going to work correctly. For example, given a list of people, a list of those people's favorite foods, and a list of their favorite colors like so:
People
Person
------
Joe
Mary
FavoriteFoods
Person  Food
------  ---------
Joe     Broccoli
Joe     Bananas
Mary    Chocolate
Mary    Cake
FavoriteColors
Person  Color
------  ----------
Joe     Red
Joe     Blue
Mary    Periwinkle
Mary    Fuchsia
When you join these with links between Person <-> Food and Person <-> Color, you'll get a result like this:
Person  Food       Color
------  ---------  ----------
Joe     Broccoli   Red
Joe     Bananas    Red
Joe     Broccoli   Blue
Joe     Bananas    Blue
Mary    Chocolate  Periwinkle
Mary    Chocolate  Fuchsia
Mary    Cake       Periwinkle
Mary    Cake       Fuchsia
This is essentially a cross-join, also known as a Cartesian product, between the Foods and the Colors, because they have a many-to-one relationship with each person, but no relationship with each other.
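To see the row multiplication for yourself, here is a small sketch that loads the three example tables into an in-memory SQLite database and joins them on Person only. The use of Python and sqlite3 is just a convenient way to run the SQL; the table and column names are the ones from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People         (Person TEXT);
    CREATE TABLE FavoriteFoods  (Person TEXT, Food TEXT);
    CREATE TABLE FavoriteColors (Person TEXT, Color TEXT);

    INSERT INTO People VALUES ('Joe'), ('Mary');
    INSERT INTO FavoriteFoods VALUES
        ('Joe', 'Broccoli'), ('Joe', 'Bananas'),
        ('Mary', 'Chocolate'), ('Mary', 'Cake');
    INSERT INTO FavoriteColors VALUES
        ('Joe', 'Red'), ('Joe', 'Blue'),
        ('Mary', 'Periwinkle'), ('Mary', 'Fuchsia');
""")

joined = conn.execute("""
    SELECT p.Person, f.Food, c.Color
    FROM People p
    JOIN FavoriteFoods  f ON f.Person = p.Person
    JOIN FavoriteColors c ON c.Person = p.Person
""").fetchall()

# Each person has 2 foods and 2 colors, so each person yields 2 x 2 = 4 rows.
print(len(joined))
```

Adding a third food for Joe would give him 3 x 2 = 6 rows, which is why sums and counts computed over such a join come out inflated.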
There are a few ways to deal with this in the report.
Create ExpenseGroup and InvoiceRow subreports, that are called from the main report by a combination of project_id and project_phase_id parameters.
Summarize one or the other set of data into a single value. For example, you could sum the invoice rows. Or, you could concatenate the expense groups into a single string separated by commas.
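The summarizing option can be sketched as follows. This is only an illustration under assumptions: the schema is cut down to the columns used in the question's query, the sample amounts are invented, and it goes one step further than described above by summing both child tables in subqueries before joining them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified schema with invented sample data.
    CREATE TABLE project_phases (project_phase_id INTEGER PRIMARY KEY);
    CREATE TABLE project_phase_expensegroups (project_phase_id INTEGER, projectphase_expense_total INTEGER);
    CREATE TABLE invoicerows (project_phase_id INTEGER, invoicerow_total INTEGER);

    INSERT INTO project_phases VALUES (1), (2), (3);
    INSERT INTO project_phase_expensegroups VALUES (1, 100), (1, 50), (2, 0);
    INSERT INTO invoicerows VALUES (1, 10), (1, 20), (1, 30), (2, 40);
""")

# Naive version: both child tables joined directly, then summed.
naive = conn.execute("""
    SELECT ph.project_phase_id,
           SUM(e.projectphase_expense_total),
           SUM(i.invoicerow_total)
    FROM project_phases ph
    LEFT JOIN project_phase_expensegroups e ON e.project_phase_id = ph.project_phase_id
    LEFT JOIN invoicerows i                 ON i.project_phase_id = ph.project_phase_id
    GROUP BY ph.project_phase_id
    ORDER BY ph.project_phase_id
""").fetchall()

# Summarized version: each child table is reduced to one row per phase first.
summarized = conn.execute("""
    SELECT ph.project_phase_id,
           COALESCE(e.expense_total, 0),
           COALESCE(i.invoice_total, 0)
    FROM project_phases ph
    LEFT JOIN (SELECT project_phase_id, SUM(projectphase_expense_total) AS expense_total
               FROM project_phase_expensegroups
               GROUP BY project_phase_id) e ON e.project_phase_id = ph.project_phase_id
    LEFT JOIN (SELECT project_phase_id, SUM(invoicerow_total) AS invoice_total
               FROM invoicerows
               GROUP BY project_phase_id) i ON i.project_phase_id = ph.project_phase_id
    ORDER BY ph.project_phase_id
""").fetchall()

print(naive)       # phase 1 totals are inflated by the 2 x 3 row fan-out
print(summarized)  # phase 1 totals match the underlying rows
```

Because each subquery returns at most one row per project_phase_id, the outer joins can no longer multiply rows, and phases without expenses or invoices still appear with 0.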
Some notes:
Please, please format your query before posting it in a question. It is almost impossible to read when not formatted. It seems pretty clear that you're using a GUI to create the query, but do us the favor of not having to format it ourselves just to help you.
While formatting, please use aliases. Don't use full table names. It just makes the query that much harder to understand.
You need an extra pair of parentheses in your WHERE clause in order to get the logic right.
WHERE ( projects.project_number = #iProjectNumber )
AND (
(project_phase_expensegroups.projectphase_expense_total > 0)
OR
(invoicerows.invoicerow_total > 0)
)
Also, you're using a column in your WHERE clause from a table that is left joined without checking for NULLs. That basically makes it a (slow) inner join. If you want to include rows that don't match from that table, you also need to check for NULL. Any comparison other than IS NULL or IS NOT NULL evaluates to unknown for NULL values, so those rows are filtered out. See this page for more information about SQL's three-valued predicate logic: http://www.firstsql.com/idefend3.htm
To keep your LEFT JOINs working as you intended you would need to do this:
WHERE ( projects.project_number = #iProjectNumber )
AND (
project_phase_expensegroups.projectphase_expense_total > 0
OR project_phase_expensegroups.project_phase_id IS NULL
OR invoicerows.invoicerow_total > 0
OR invoicerows.project_phase_id IS NULL
)
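The effect of the parentheses and of the IS NULL checks can be compared side by side with a small sketch. It assumes a simplified schema (project_number is placed directly on project_phases to avoid the extra join, and the literal 100 stands in for the report parameter), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema: project_number lives on project_phases here.
    CREATE TABLE project_phases (project_phase_id INTEGER PRIMARY KEY, project_number INTEGER);
    CREATE TABLE project_phase_expensegroups (project_phase_id INTEGER, projectphase_expense_total INTEGER);
    CREATE TABLE invoicerows (project_phase_id INTEGER, invoicerow_total INTEGER);

    -- Phases 1-3 belong to project 100, phase 4 to project 200.
    INSERT INTO project_phases VALUES (1, 100), (2, 100), (3, 100), (4, 200);
    INSERT INTO project_phase_expensegroups VALUES (1, 500), (2, 0), (4, 0);
    INSERT INTO invoicerows VALUES (2, 75), (4, 30);
""")

def phases_matching(where_clause):
    """Return the phase ids that survive the given WHERE clause."""
    sql = f"""
        SELECT ph.project_phase_id
        FROM project_phases ph
        LEFT JOIN project_phase_expensegroups e ON e.project_phase_id = ph.project_phase_id
        LEFT JOIN invoicerows i                 ON i.project_phase_id = ph.project_phase_id
        WHERE {where_clause}
        ORDER BY ph.project_phase_id
    """
    return [row[0] for row in conn.execute(sql)]

# AND binds tighter than OR, so the invoice condition escapes the project filter.
no_parens = phases_matching(
    "ph.project_number = 100 AND e.projectphase_expense_total > 0 OR i.invoicerow_total > 0")

# Parenthesized: the project filter applies to both conditions.
with_parens = phases_matching(
    "ph.project_number = 100 AND (e.projectphase_expense_total > 0 OR i.invoicerow_total > 0)")

# With IS NULL checks: phases without matching child rows are kept as well.
with_null_checks = phases_matching("""
    ph.project_number = 100 AND (
        e.projectphase_expense_total > 0 OR e.project_phase_id IS NULL
        OR i.invoicerow_total > 0 OR i.project_phase_id IS NULL)
""")

print(no_parens, with_parens, with_null_checks)
```

In this sample, the version without parentheses lets phase 4 of the other project through, the parenthesized version drops it, and only the version with IS NULL checks keeps phase 3, which has neither expense groups nor invoice rows.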
I found the solution and it was kind of easy after all. I changed only the second LEFT OUTER JOIN to INNER JOIN and left out the condition that restricted the query to results over zero. I also used SELECT DISTINCT.
Now my report is working perfectly.
Say I have a Person table that stores information about that person (weird right?). I have select boxes for things like gender, hair color, and eye color. Instead of creating separate tables with a description field for each, is there a good way to use a single table? Maybe a Resources table with a Name and Description fields? Is it just that simple?
Resources
=========
ID Name Description
--------------------
1 Gender Male
2 Gender Female
3 Eye Color Blue
4 Eye Color Green
5 Eye Color Brown
6 Hair Color Black
7 Hair Color Brunette
8 Hair Color Blonde
9 Hair Color Red
Person
=========
ID Name Gender Eye_Color Hair_Color
-----------------------------------------------
1 Ryan 1 3 8
Is this the recommended way or is there something better for this?
Yes, it is that simple; IMO your approach is correct. But please note that your approach will not work if you need to select, for example, multiple hair colors for one person.
But I believe in keeping code simple until you get a requirement to change it. Read about YAGNI when you have some time :)
You could do it that way and it would be a polymorphic association.
If you don't need to query this information but just need to be able to access it, you can serialize the values and store them all in one column.
So a person record would have a column, let's call it attributes, that would have "eye_color: blue, gender: male", etc...
I'd create a separate table called Physical_attributes and an associative one between Person and Physical_attributes, personal_physical_attributes, where I'd store the person's id, the Physical_attribute's id and the description for that Physical_attribute.
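A minimal sketch of that three-table layout, run through Python's sqlite3 module, might look like this. The table names follow the description above; the column names and the sample data (including Ryan having two hair colors) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names are hypothetical; only the table names come from the text.
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Physical_attributes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE personal_physical_attributes (
        person_id             INTEGER REFERENCES Person(id),
        physical_attribute_id INTEGER REFERENCES Physical_attributes(id),
        description           TEXT
    );

    INSERT INTO Person VALUES (1, 'Ryan');
    INSERT INTO Physical_attributes VALUES (1, 'Gender'), (2, 'Eye Color'), (3, 'Hair Color');
    -- One row per person/attribute/value, so an attribute can repeat.
    INSERT INTO personal_physical_attributes VALUES
        (1, 1, 'Male'), (1, 2, 'Blue'), (1, 3, 'Blonde'), (1, 3, 'Red');
""")

ryan = conn.execute("""
    SELECT pa.name, ppa.description
    FROM personal_physical_attributes ppa
    JOIN Physical_attributes pa ON pa.id = ppa.physical_attribute_id
    WHERE ppa.person_id = 1
    ORDER BY pa.id, ppa.description
""").fetchall()

for attribute, value in ryan:
    print(attribute, value)
```

Unlike fixed Gender, Eye_Color and Hair_Color columns on Person, this layout can hold several values for the same attribute, at the cost of an extra join when reading a person's attributes.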
I want to implement graph coloring using databases.
There is a table that will store all the vertices (1, 2, 3, ...) and a table that stores the names of all colors (red, blue, green, etc.).
Now I want to create a coloring table with columns vertex and color which will take all possible combinations from the above tables and then check the constraints in each of those tables. Whichever table satisfies the constraints of graph coloring is a solution.
Now, how do I create tables for each combination?
Please help. I've been stuck on this for a while...
An example instance:
vertex
1
2
3
Colors
red
blue
coloring
a)
1 red
2 blue
3 red
b)
1 red
2 red
3 blue
c)
1 blue
2 red
3 red
.
.
.
6 tables
I'm not sure I understand your question, so I'll make some assumptions. Assuming you have a table called Vertex, with the following rows:
1
2
3
... and a table called Color, with the following rows:
Red
Green
Blue
... you can generate a table of all possible combinations with a simple unconstrained join, like this:
SELECT *
INTO VertexColor
FROM Vertex, Color
The result will be a new table, with the following rows:
1, Red
1, Green
1, Blue
2, Red
2, Green
2, Blue
3, Red
3, Green
3, Blue
Happy to help further if this does not answer your question.
SELECT Vertices.vertex, Colors.color
FROM Vertices
CROSS JOIN Colors
EDIT: Seeing the new comments: This doesn't sound like a problem that is well suited for SQL, mainly because the number of columns in your result set depends on the number of rows in your vertices table. That's not something that is easy in SQL (you probably need a multistep process, using dynamic SQL through sp_execute). Since the ordering of the columns carries significance, you can't return a result set containing only each vertex-color pair either, because the order in which the rows are returned may vary. To me it sounds like a problem better handled outside the database engine. You can still use the above cross join to get a preliminary dataset, where you filter out some conditions you have on the set.
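As a rough sketch of handling this outside the database engine, the candidate colorings can be enumerated in application code and filtered against the constraints. The vertices and colors below are the ones from the question's example instance; the edge list is purely hypothetical, since the question does not say which vertices are adjacent.

```python
from itertools import product

vertices = [1, 2, 3]
colors = ["red", "blue"]
# Hypothetical edges: vertex 1 touches 2, and 2 touches 3.
edges = [(1, 2), (2, 3)]

def is_valid(coloring):
    """A coloring is valid when no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)

# One color per vertex: len(colors) ** len(vertices) candidate colorings.
all_colorings = [
    dict(zip(vertices, combo))
    for combo in product(colors, repeat=len(vertices))
]
valid_colorings = [c for c in all_colorings if is_valid(c)]

print(len(all_colorings))
for coloring in valid_colorings:
    print(coloring)
```

Each dictionary plays the role of one "coloring table" from the question. The number of candidates grows exponentially with the number of vertices, so this brute-force enumeration is only practical for small graphs.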
I'm new to data warehousing, but I think my question can be answered relatively easily.
I built a star schema, with a dimension table 'product'. This table has a column 'PropertyName' and a column 'PropertyValue'.
The dimension therefore looks a little like this:
surrogate_key | natural_key (productID) | PropertyName | PropertyValue | ...
1             | 5                       | Size         | 20            | ...
2             | 5                       | Color        | red           |
3             | 6                       | Size         | 20            |
4             | 6                       | Material     | wood          |
and so on.
In my fact table I always use the surrogate keys of the dimensions. Because of the PropertyName and PropertyValue columns, my natural key isn't unique / identifying anymore, so I get way too many rows in my fact table.
My question now is, what should I do with the property columns? Would it be best to put each property into a separate dimension, like dimension size, dimension color and so on? I have about 30 different properties.
Or shall I create columns for each property in the fact table?
Or make one dimension with all properties?
Thanks in advance for any help.
Your dimension table 'product' should look like this:
surrogate_key | natural_key (productID) | Color | Material | Size | ...
1 5 red wood 20 ...
2 6 red ...
If you have too many properties, try to group them in another dimension. For example, Color and Material can be attributes of another dimension if you can have the same product with the same id and the same price in another color or material. Your fact table can then identify a product with two keys: product_id and colormaterial_id...
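One possible way to get from the property rows to a one-row-per-product dimension is conditional aggregation. The sketch below assumes the property rows sit in a table called product_eav (a hypothetical name) with the sample values from the question, and it only pivots three of the properties.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- product_eav is a hypothetical name for the question's current dimension rows.
    CREATE TABLE product_eav (productID INTEGER, PropertyName TEXT, PropertyValue TEXT);
    INSERT INTO product_eav VALUES
        (5, 'Size', '20'), (5, 'Color', 'red'),
        (6, 'Size', '20'), (6, 'Material', 'wood');
""")

wide = conn.execute("""
    SELECT productID,
           MAX(CASE WHEN PropertyName = 'Color'    THEN PropertyValue END) AS Color,
           MAX(CASE WHEN PropertyName = 'Material' THEN PropertyValue END) AS Material,
           MAX(CASE WHEN PropertyName = 'Size'     THEN PropertyValue END) AS Size
    FROM product_eav
    GROUP BY productID
    ORDER BY productID
""").fetchall()

for row in wide:
    print(row)
```

Each product now occupies a single row, so one surrogate key per product is enough and the fact table no longer multiplies by the number of properties. Properties a product does not have simply come out as NULL.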
Reading recommendation:
The Data Warehouse Toolkit, Ralph Kimball
Your design is called an EAV (entity-attribute-value) table.
It's a nice design for sparse matrices (a large number of properties with only a few of them filled at the same time).
However, it has several drawbacks.
It cannot be indexed (and hence efficiently searched) on two or more properties at once. A query like "get all products made of wood and having a size of 20" will be less efficient.
Implementing constraints involving several attributes at once is more complex.
etc.
If it's not a problem for you, you can use EAV design.
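To make the two-property drawback concrete, here is a sketch of the "made of wood and size 20" query against both layouts. The table names product_eav and product_wide, the index name, and the sample rows are assumptions for illustration; the sketch shows the shape of the queries, not measured performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical EAV layout: one row per product and property.
    CREATE TABLE product_eav (productID INTEGER, PropertyName TEXT, PropertyValue TEXT);
    INSERT INTO product_eav VALUES
        (5, 'Size', '20'), (5, 'Color', 'red'),
        (6, 'Size', '20'), (6, 'Material', 'wood');

    -- Hypothetical wide layout: one row per product, one column per property.
    CREATE TABLE product_wide (productID INTEGER PRIMARY KEY, Color TEXT, Material TEXT, Size TEXT);
    INSERT INTO product_wide VALUES (5, 'red', NULL, '20'), (6, NULL, 'wood', '20');
    -- A single composite index can cover both properties in the wide layout.
    CREATE INDEX idx_material_size ON product_wide (Material, Size);
""")

# EAV: one self-join per property that takes part in the filter.
eav_result = conn.execute("""
    SELECT m.productID
    FROM product_eav m
    JOIN product_eav s ON s.productID = m.productID
    WHERE m.PropertyName = 'Material' AND m.PropertyValue = 'wood'
      AND s.PropertyName = 'Size'     AND s.PropertyValue = '20'
""").fetchall()

# Wide: a plain two-column predicate.
wide_result = conn.execute("""
    SELECT productID
    FROM product_wide
    WHERE Material = 'wood' AND Size = '20'
""").fetchall()

print(eav_result, wide_result)
```

Both queries return the same product here, but the EAV version needs an additional self-join for every property added to the filter, while the wide version only adds another column to the WHERE clause.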