Is it possible to create a User-Defined hierarchy between dimension attributes that are unrelated? - sql-server-2012

I am working on spiking out a BI solution and I am trying to create a Hierarchy in out 'Client' dimension more or less to replicate what I know the end users will do.
The Client dimension table has 3 foreign key relationships with other tables, each of these relationships are standalone from the others. They are Role, Service Type, and Status.
Whenever this dimension will be used it will almost always be with the Role attribute first so I tried to create hierarchies like Role -> Service Type -> Client. Now when I try to process with this setup I get the error "The table that is required for a join cannot be reached based on the relationships in the data source view"
Is there any way to create a Hierarchy with disparate attributes like this?

It isn't possible to build a hierarchy across multiple dimensions that are not related to one another in that way.
One option would be to restructure your schema to a Star schema rather than a Snowflake schema. If that's not an option, depending on the details you might be able to build a query in SSAS or a view in SQL server, and then create a dimension in SSAS from that.
From the information given it's hard to judge whether Role, Service Type, and Status should be in separate tables, but have a good think about this. Should the attributes you want to use from them actually be attributes of Client? Would doing this allow you to get rid of some of those tables and/or reduce the amount of Snowflaking? Remember that dimensional models are very different from a normal relational model: the aim is to keep the number of joins low, and it can be acceptable to repeat attributes in different tables if that's going to be more efficient. Of course, sometimes restructuring just isn't a realistic prospect, which is where a query or view might come in useful to work around this issue.
I've been learning SSAS recently myself, and have found that when I'm struggling to get something working in SSAS, the problem is very often the underlying schema not being set up in the best way for what I'm trying to accomplish.
(Apologies if this is a little vague but it's difficult to be more detailed without knowing the nature of the data, and I don't yet have the ability to comment and ask for more information!)

Related

How to convert SQL many to many data model to firebase data model

I am having troubles trying to convert my data model of an attendance system for a football trainer (I have done it as if it were a SQL normalized relational model) to a firebase model. Here is a picture of my relational model:
I was thinking about making 4 Collections also:
Players
Attendance
Match
MatchType (it can be friendly-match, tournament, practice, among others)
I think this depends of how do you want to use the data. When I look at this it seems that one collection "Attandence" is enough, which in your schema is connecting all tables.
The idea of relational databases is that data should not be redundant, so every information is stored only once and connected by relation using keys like ex. PlayerID.
While in noSQL databases you does not care about data redundancy. So you are storing the same information (like player name) in many documents. The idea is to have everything in one document and do not create sophisticated queries to get information - just get document and you have everything.
So all depend how do you use information, which we do not know. You can put everything in one collection and just get all information from one document.
On the other hand you can create 4 collection with exactly the same fields as in SQL database and use in relational way just to have cheap, fast and serverless database engine.
What more you can change your solution any time as you do not define any schema.
So in Firestore you are free to choose any solution, you should think first how do you will use the information.

Normalisation and multi-valued fields

I'm having a problem with my students using multi-valued fields in access and getting confused about normalisation as a result.
Here is what I can make out. Given a 1-to-many relationship, e.g.
Articles Comments
-------- --------
artID{PK} commID{PK}
text text
artID{FK}
Access makes it possible to store this information into what appears to be one table, something like
Articles
--------
artID{PK}
text
comment
+ value
"value" referring to multiple comment values for the comment "column", which access actually stores as a separate table. The specifics of how the values are stored - table, its PK and FK - is completely hidden, but it is possible to query the multi-valued field, e.g. in the example above with the query
INSERT INTO article( [comment].Value )
VALUES ('thank you')
WHERE artID = 1;
But the query doesn't quite reveal the underlying structure of the hidden table implementing the multi-valued field.
Given this (disaster, in my view) - my problem is how to help newcomers to database design and normalisation understand what Access is offering them, why it may not be helpful, and that it is not a reason to ignore the basics of the relational model. More specifically:
Are there better ways, besides queries as above, to reveal the structure behind multi-valued fields?
Are there good examples of where the multi-valued field is not good enough, and shows the advantage of normalising explicitly?
Are there straightforward ways to obtain the multi-select visual output of Access multi-values, but based on separate, explicit tables?
Thanks!
I cannot give you advice in using this feature, because I never used it; however, I can give you reasons not to use it.
I want to have full control on what I'm doing. This is not the case for multi-valued fields, therefore I don't use them.
This feature is not expandable. What if you want to add a date field to your comments, for instance?
It is sometimes necessary to upsize an Access (backend) database to a "big" database (SQL Server, Oracle). These Databases don't offer such a feature. It is often the customer who decides which database has to be used. Recently I had to migrate an Access application (frontend) using an Oracle backend to a SQL-Server backend because my client decided to drop his Oracle server. Therefore it is a good idea to restrict yourself to use only common features.
For common tasks like editing lookup tables I created generic forms. My existing solutions will not work with multi-valued fields.
I have a (self-made) tool that synchronizes changes in the structure of the database on my developer’s site with the database on the client’s site. This tool cannot deal with multi-valued fields.
I have tools for the security management that can grant SELECT, INSERT, UPDATE and DELETE rights on tables or revoke them. Again, the management tool does not work with multi-valued fields.
Having a separate table for the comments allows you to quickly inspect all the comments (by opening the table). You cannot do this with multi-valued fields.
You will not see the 1 to n relation between the articles and the comments in a database diagram.
With a separate table you can choose whether you want to cascade deletes to the details table or not. If you don't, you will not be able to delete an article as long as there are comments attached to it. This can be desirable, if you want to protect the comments from being deleted inadvertently.
It is important to realize the difference between physical and logical relationships. Today the whole internet and web services (SOAP) quite much realizes on a data format that is multi-value in nature.
When you represent multi-value data with a relational database (such as Access), then behind the scenes you are using a traditional (and legitimate) relation. I cannot stress that as such, then the use of multi-value columns in Access is in fact a LEGITIMATE relational model.
The fact that table is not exposed does not negate this issue. In fact, if you represent an invoice (master record, and repeating details) as a XML data cube, then we see two things:
1) you can build and represent that invoice with a relational database like Access
2) such a relational data model that is normalized can ALSO be represented as a SINGLE xml string.
3) deleting the XML record (or string) means that cascade delete of the child rows (invoice details) MUST occur.
So while it is true that Multi-Value fields been added to Access to deal with SharePoint, it is MOST important to realize that such data can be mapped to a relational database (if you could not do this, then Access could not consume that XML data using relational database tables as ACCESS CURRENTLY DOES RIGHT NOW).
And with the web such as XML, and SharePoint then the need to consume and manage and utilize such data is not only widespread, but is in fact a basic staple of the internet.
As more and more data becomes of a complex nature, we find the requirement for multi-value data exploding in use. Anyone who used that so called "fad" the internet is thus relying and using data that is in fact VERY OFTEN XML and is multi-value (complex) in nature.
As long as the logical (not physical) relational data model is kept, then use of multi-value columns to represent such data is possible and this is exactly what Access is doing (it is mapping the relational data model to a complex model). Note that the complex (xml) data model does NOT necessary have to be relational in nature. However, if you ARE going to map such data to Access then the complex multi-value model MUST CONFORM TO A RELATIONAL data model.
This is EXACTLY what is occurring in Access.
The fact that such a correct and legitimate math relational model is not exposed is of little issue here. Are we to suggest that because Excel does not expose the binary codes used then users will never learn about computers? Or perhaps we all must program in assembler so we all correctly learn how computers works.
At the end of the day, who cares and why does this matter? The fact that people drive automatic cars today does not toss out the concept that they are using different gears to operate that car. The idea that we shut down all of society because someone is going to drive an automatic car or in this case use complex data would be galactic stupid on our part.
So keep in mind that extensions to SQL do exist in Access to query the multi-value data, but as well pointed out here those underlying tables are not exposed. However, as noted, exposing such tables would STILL REQUIRE one to not change or mess with cascade delete since that feature is required TO MAINTAIN A INTERSECTION OF FEATURES and a CORRECT MATH relational model between the complex data model (xml) and that of using two related tables to represent such data.
In other words, you can use related tables to represent the complex data model IF YOU REMOVE the ability of users to play with the referential integrity options. The RI options MUST remain as set in those hidden tables else such data will not be able to make the trip BACK to the XML or complex data model of which it was consumed from.
As noted, in regards to users being taught how gasoline reacts with oxygen for that of learning to drive a car, or using a word processor and being forced to learn a relational model and expose the underlying tables makes little sense here.
However, the points made here in regards to such tables being exposed are legitimate concerns.
The REAL problem is SQL server and Oracle etc. cannot consume or represent that complex data WHILE ACCESS CAN CONSUME such data.
As noted, the complex data ship has LONG ago sailed! XML, soap, and the basic technologies of the internet are based on this complex data model.
In effect, SQL server, Oracle and most databases cannot that consume this multi-value data represent it without users having to create and model such data in a relational fashion is a BIG shortcoming of SQL server etc.
Access stands alone in this ability to consume this data.
So, for anyone who used a smartphone, iPad or the web, you are using basic technologies that are built around using complex data, something that Access now allows.
It is likely that the rest of the industry will have to follow suit given that more and more data is complex in nature. If the database industry does not change, then the mainstream traditional relational database system will NOT be the resting place of such data.
A trend away from storing data in related tables is occurring at a rapid pace right now and products like SharePoint, or even Google docs is proof of this concept. So Access is only reacting to market pressures and it is likely that other database vendors will have to follow suit or simply give up on being part of the "fad" called the internet.
XML and complex data structures are STAPLE and fact of our industry right now – this is not an issue we all should run away from, but in fact embrace.
Albert D. Kallal (Access MVP)
Edmonton, Alberta Canada
kallal#msn.com
The technical discussion is interesting. I think the real problem lies in student understanding. Because it is available in Access students will use it, and initially it will probably provide a simple solution to some design problems. The negatives will occur later when they try and use the data. Maybe a simple example demonstrating the problems would persuade some students to avoid using multi-valued fields ? Maybe an example of storing the data in another, more usable format would help ?
Good luck !
Peter Bullard
MS Access does a great job of simplifying database management and abstracting out a lot of complexity. This however makes the learning of dbms concepts a bit difficult. Have you tried using other 'standard' dbms tools like MySQL (or even sqlite). From a learning perspective they may be better.
I know this post is old. But, it's not quite the same as every other post I've seen on this topic. This one has someone making a good case for using Multi Valued Fields...
As someone who is trying who is still trying very hard to get their head around Access, I find the discussion for and against using the Multi Valued Fields incredibly frustrating.
I'm trying to sort through it all, but if everyone is so against them, what is an alternative method? It seems that in every search result I find everyone is either telling you how to use Multi Valued Fields and Controls or telling you how horrible and what a mistake they are. Many people refer to an alternative to them, but nobody says "Here's an example". I'm here to learn about these things. And while I know that this is a simpler concept for a lot of people in these forums, I could really use some examples to take a look at.
I'm at a point where I have to decide which way to go. It would be wonderful to compare examples of using Multi Valued Fields and alternatives and using a control to select multiple values.
Or am I wrong and the functionality of a combobox where you can select multiple items is only available through Access?
I want to address the last of your questions first. There is a way of providing a visual presentation of a parent child relationship. It's called subforms. If you get help about subforms in Access, it will explain the concept.
I have used subforms in a project where I wanted to display the transaction header in a form and the transaction details in a subform. There is nothing to hinder this construct even when the data is stored in two normalized tables.
Of course, this affects the screen, not the database. That's the whole point. Normalization is relevant to storage and retrieval, not to other uses of data.

Is it possible to use Nhibernate with partition of an object over several tables?

We are having a system that gather large quantities of data each month and performes rather advanced calculations that increase the database even more. Since we have the requirements from the customer that we need to store data for fast access three years back and that we must be able to access older data (up to ten years), this however can be low performance and requires some work. We want to avoid performance issues where the database and its tables grows out of proportion.
After discussing using SQL Enterprise (VERY costly and full of traps since we haven't gotten the know-how) and since our system have so many tables that referenses each other we are leaning towards creating some kind of history tables to which we move data in a monthly fashion and rewrite the select queries that we have based on parameters to search either in the regular table or in the history or both depending on the situation.
Since we also are using NHibernate for mapping I was wondering if it is possbible to create a mapping file that handles this by itself (almost) using some sort of polymorfism or inheritance in which each object is stored in different tables based on parameters?
I know this sounds complicated and strange and that there is other methods to perform this but I this question I would rather have people answering the question asked and not give other sugestions to use instead.
As far as I know NHibernate can't do that (each class can be mapped to one table/view )but you can use SQL Queries or StoredProcedures (depends on the version of NHibernate that you are using) to populate mapped objects.
In your case you can have a combined view created by making unions of different tables Then you can use a SQL query to populated your entity.
There's also another solution that you create a summary object for your queries that uses that view ,therefore you can use both HQL and criteria to query this object.
Short answer "no". I would not create views as you mention a lot of joining.
Personally I would create summary tables and map to these directly using a stateless session or a very least mutable=false on the class definition. Think of these summary tables as denormalised data for report only. The only drawback is if historic data changes on a regular basis then the summary tables also needs changing. If historical data never changes then this should be simple to achieve.
I would also most probably store these summary tables on another catalog rather than adding to the size of the current system.
Its not a quick win this one I am afraid.

Is it a good idea to create schema types to organize the relationships of tables

Is it a good idea to create a schema type to separate the table relationships. I guess when you are browsing the tables in SSMS you will see them group together by schema type. But is it worth the trouble? Anyone with experience with this in real world scenarios?
I've generally found that to be more of a hassle than any help it's provided. What do you do with tables that are relevant to multiple areas? What happens when a table seems to belong to one area but later migrates to another area of the application? Do you change its schema and refactor all of your code?
I have used multiple schemata to make delineations when there is a VERY clear boundary between objects, but usually not something like what you have in your diagram. One example is objects which are used just for DBA support. I might put those into their own schema if they aren't used by the actual application itself.

Purpose and effect of SSAS hierarchies?

Firstly, I feel comfortable with what a hierarchy is in terms of the concept and how it impacts the design of a DW's star schema. I have some dimensions with lots of attributes, and I could create lots of hierarchies within SSAS. I would like a better understanding of how the OLAP engine uses the hierarchies that I create so that I can make a more informed decision on how I design my hierarchies(that's a tough word to type the first few times). There are also limitations with SSAS regarding attributes appearing in multiple hierachies so sometimes I have to do extra work to work around those limitations or decide which hierarchy is more important.
I also wonder what negative impacts a hierarchy might have, such as making the dimension more confusing for users. I might hide the attributes which are included in hierarchies to eliminate the duplicate attribute and make the dimension less confusing. But then a user wants to see which months of the year they typically get more sales. If I've hidden the month attribute so that it is only available through a Year->Month hierarchy, are they forced to always include the Year part of the hierarchy, preventing them from doing such analysis?
I few articles on hierarchies have stated something to the effect of "allowing the user to drill down to detailed data". Which is misleading, because you can simply drag the separate year and month attributes to a report and you've accomplished just that without the use of a hierarchy. So such an explanation is a little superficial. I feel like there must be a lot more to it than that.
Some articles seem to suggest it determines whether or not attributes are considered for aggregation. This seems counter intuitive, because I thought that already occurs when you included an attribute in a cube. I mean the whole point of creating a cube consisting of attributes, is to have an intersection of all of the attributes so that you can quickly aggregate on any combination of them, so it confuses me when something implies the opposite of that by saying only attributes in hierarchies are considered for aggregation:
Attributes only exposed in attribute hierarchies[as opposed to user
hierarchies] are not automatically considered for aggregation by the
Aggregation Design Wizard. Queries involving these attributes are
satisfied by summarizing data from the primary key. Without the
benefit of aggregations, query performance against these attributes
hierarchies can be slow.
-SSAS 2008 Performance Guide
Can someone explain how the engine uses my hierarchies in contrast with just including the attribute in the cube? (besides the aesthetics of grouping attributes together)
Unnatural hierarchies are confusing as heck to me in particular. In the SSAS 2008 Performance Guide they show one example as a Gender->Education hierarchy. I think my users would mumble "stupid programmer" every time they had to drill through Gender just to get to Education.
What rational do you follow on when and when not to create a hierarchy?
Not sure 100% the comments I will say applies to SSAS, but as we're both 100% MDX/XMLA compatible it's similar.
You may start by reading this and the many-to-many documentation.
The first difference between using hierarchies with levels and attributes is performance. You've two different scenarios for a drilldown (take [Asia] as a particular member and let's find all countries of [Asia]):
Using hierarchy with levels : [Asia].children()
Using attributes : ([Asia],[Countries])
The first option is trivial and very fast (the structure is in memory). The second one implies iterating though all countries and 'check' if they exist (aka are countries of [Asia]). This can be a pain for huge attributes (>100k). Once done, we need to go to our fact tables where each members has a set of associated fact rows. The version with a single hierarchy is again direct. The one with two might imply some additional internal operations -> all rows of [Asia] minus the ones of a particular country. Simplified version is also more handy for the cache.
Second, you define a 'natural' drilldown path that can be directly used in the GUI.
On top, you can add special aggregations types (First,Last, Min, Max...) that will take into account the structure of a given hierarchy.
There are successfully OLAP solutions that work without hierarchical structures but you've less features to play with for making a solution.
I hope it helps you understand these concepts better.