I have a very large amount of data that would most naturally be represented as a tree:
Category 1
    Sub-category 1
        data point 1
            attribute 1
    Sub-category 2
        data point 1
            attribute 1
            attribute 2
        data point 2
Category 2
    Sub-category 1
        Sub-category 1
            data point 1
        Sub-category 2
            data point 1
            data point 2
    Sub-category 2
        data point 1
        data point 2
        data point 3
...
The individual data points have text and numerical attributes, but the data isn't really suited to representation as a set of related tables. I would like to be able to perform SQL-like queries, but I would also like to be able to browse through the data in a way that makes its tree structure obvious, as with a file manager.
There's probably some class of application that is ideal for such a thing, but it isn't occurring to me at the moment. Some kind of combination of a database and a tree viewer control? Anyone know what it is I'm looking for? As always, I'm terrified of asking a question in the wrong forum, but I see some related questions here at stackoverflow, so hopefully it's OK. Thanks!
You could make a table like this:
id
name
parent_id
This structure allows for nested categories: each row points to its parent category through parent_id.
You could then make a second table that relates categories to data points.
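As a rough illustration of the table above, here is a minimal sketch using Python's built-in sqlite3 module. The table names `category` and `data_point`, the sample rows, and the helper `render_tree` are assumptions made for the example, not part of the original answer. It stores the hierarchy with a `parent_id` column and then prints it indented, file-manager style.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES category(id)  -- NULL for top-level categories
);
-- Hypothetical second table relating data points to a category.
CREATE TABLE data_point (
    id          INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES category(id),
    name        TEXT NOT NULL
);
INSERT INTO category VALUES
    (1, 'Category 1', NULL),
    (2, 'Sub-category 1', 1),
    (3, 'Sub-category 2', 1);
INSERT INTO data_point VALUES
    (1, 2, 'data point 1'),
    (2, 3, 'data point 1'),
    (3, 3, 'data point 2');
""")

def render_tree(conn):
    """Return the category tree as indented lines, data points included."""
    children = defaultdict(list)   # parent_id -> [(id, name), ...]
    for cat_id, name, parent_id in conn.execute(
            "SELECT id, name, parent_id FROM category ORDER BY id"):
        children[parent_id].append((cat_id, name))
    points = defaultdict(list)     # category_id -> [name, ...]
    for cat_id, name in conn.execute(
            "SELECT category_id, name FROM data_point ORDER BY id"):
        points[cat_id].append(name)

    lines = []
    def walk(parent_id, depth):
        for cat_id, name in children[parent_id]:
            lines.append("  " * depth + name)
            for point in points[cat_id]:
                lines.append("  " * (depth + 1) + point)
            walk(cat_id, depth + 1)
    walk(None, 0)
    return lines

tree_lines = render_tree(conn)
print("\n".join(tree_lines))
```

The same rows remain queryable with ordinary SQL, while the in-memory grouping by `parent_id` is what a tree-view control would typically be fed from.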
The javax.swing packages contain several table and tree solutions, such as the JTable and JTree classes. A JTree can easily be constructed to produce the tree structure you are looking for (it looks like a file directory).
The JTable class can be used to create sortable and searchable tables, although you would have to borrow or write your own sort & search methods.
Although these are from Java, other languages offer similar structures that may serve your needs without using a database. That being said, MySQL is a very easy-to-use database, and you can download the community edition for free.
Related
I have a .NET solution with a big form containing a lot of data that the customer needs to fill in, like a multi-step form that collects all the data we need.
So I was wondering whether it's better (from a performance and design standpoint) to use a traditional big table with many fields, or to store the data in a single field of XML type.
Example of one "TraditionalTable":
RecordId  CustomerId  Data 1      Data 2 ....  to Data N
1         120         01/01/1980  abcd ....    123
2         20          04/02/2004  fgh ....     230
3         10          05/01/1995  xyz ....     135
Example of one "DataWithXMLField":
RecordId  CustomerId  FormData
1         120         <data><customerdetails><borndate>01/01/1980</borndate></customerdetails><financialinfo>....
I've done many systems like this and prefer to keep the data as XML (often it's a serialized object). I find this to be efficient at runtime and at design time. (See item below about binary attachments).
The following are some suggestions based on what I've done in the past. Obviously it's not a one-sized hammer...
Often data is "collected" by a user and "approved" by an administrator. While the data is being collected, it's stored as XML. When approved, the XML is shredded and placed into "normal" relational tables/fields.
Often this data has been collected through multiple pages. Storing as XML allows collecting data in a way that is logical to the user but doesn't fit the final data structure very well.
If a form is abandoned (not completed or canceled) it's easy to delete a single row.
Things to keep in mind:
Some data is related to workflow and is separate from the data being collected. For example, a field for "Form Status" may go from "In Progress" to "Submitted" to "Approved". This type of data should be kept as regular columns.
Store Binary Data separately. If your form includes submitting binary data (like uploading a PDF) I like to generate a GUID on the front end. Store that GUID in the XML and then save the binary data separately using the GUID. Possibly on disk or in a separate "attachments" table.
Define a column for a "version number" of the XML. This way you can programmatically identify what is in the XML. This will help in the future when you need to make changes to the XML.
Define a column for a "Summary" that is a short, human-friendly version of the XML. For example, if your XML contains information for registering for summer camps, your "XML Summary" might contain the text: "SMITH,JOHN, Camp White Pine 2021". This text is calculated on the front end. It can then be used for displaying rows of data without having to poke into the XML. For example, an administrative page may exist that lists applications that require approval.
Define a column to indicate if the XML meets all your requirements. You don't want to validate XML in the database (it's often hard, and likely repetitive of the UI). Your business layer can apply business rules (Validation) to the XML (or classes) and store in the database an indicator that all business rules are met.
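To make the suggestions above concrete, here is a minimal sketch in Python with sqlite3 and xml.etree. The table and column names (`form_submission`, `customer_details`, `xml_version`, `is_valid`, and so on) and the helper `approve` are hypothetical, chosen only for illustration. Workflow status, version, summary, and validity flag live in regular columns, while the form itself stays as XML until it is approved and shredded.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE form_submission (
    record_id   INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    status      TEXT    NOT NULL DEFAULT 'In Progress',  -- workflow, kept out of the XML
    xml_version INTEGER NOT NULL,                        -- which XML layout this row uses
    summary     TEXT    NOT NULL,                        -- human-friendly text for list pages
    is_valid    INTEGER NOT NULL DEFAULT 0,              -- set by the business layer
    form_data   TEXT    NOT NULL                         -- the XML itself
);
-- Hypothetical "normal" relational table filled on approval.
CREATE TABLE customer_details (
    customer_id INTEGER PRIMARY KEY,
    born_date   TEXT NOT NULL
);
""")

form_xml = ("<data><customerdetails><borndate>01/01/1980</borndate>"
            "</customerdetails></data>")
conn.execute(
    "INSERT INTO form_submission "
    "(customer_id, xml_version, summary, is_valid, form_data) "
    "VALUES (?, ?, ?, ?, ?)",
    (120, 1, "Customer 120, born 01/01/1980", 1, form_xml))

def approve(conn, record_id):
    """Shred a valid submission's XML into relational columns and mark it approved."""
    row = conn.execute(
        "SELECT customer_id, form_data FROM form_submission "
        "WHERE record_id = ? AND is_valid = 1", (record_id,)).fetchone()
    if row is None:
        raise ValueError("submission is missing or has not passed validation")
    customer_id, xml_text = row
    born = ET.fromstring(xml_text).findtext("customerdetails/borndate")
    conn.execute("INSERT INTO customer_details VALUES (?, ?)", (customer_id, born))
    conn.execute("UPDATE form_submission SET status = 'Approved' "
                 "WHERE record_id = ?", (record_id,))

approve(conn, 1)
```

An abandoned form is still a single `DELETE` on `form_submission`, and an admin list page can read `summary` and `status` without parsing any XML.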
I have few tables as shown below
Polls
PollId Question Option
1 What 1
2 Why 4
Updates
UpdateId Text
1 Sleep
2 Play
Polls and Updates are just two sample tables (in reality there are more tables, such as photos, videos, links, etc.). When a user visits his home page (like the Facebook news feed), he must be shown the data relevant to him (no such data is included in this example). That is, I want to select data from all the tables with as few query executions as possible, so that I can present a mixture of data: polls, photos, videos, etc.
Currently, I'm fetching only the ids and types (i.e. which table) from all of the tables, and gathering the rest of the data while iterating through this result set (i.e. calling another SQL query from C#).
Is there a way to query the data from whole tables at once? (OUTER JOIN?, UNION?)
Or simply,
How can I select different type of entities at once in a single sql Query?
You could write your query so that you have one long select list for everything you want and it all comes back in one result set but I suspect that wouldn't work too well because you might have varying numbers of different types of items per user.
If you really must have it all in one hit then you can issue multiple queries in one go and get multiple result sets back. To handle this you can use an ADO.Net DataSet. See this SO example (but not the accepted answer - see Vikram Dibyal's answer as that gives a very basic overview of what I think you're asking for).
I won't copy and paste the stuff from the linked thread, just head over and take a look.
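For comparison, the UNION idea raised in the question can be sketched as follows. This is not the multiple-result-set approach recommended in this answer; it is a minimal illustration in Python with sqlite3, and the column aliases `kind`, `id`, `body`, and `extra` are assumptions introduced for the example. Each branch of the UNION ALL tags its rows with the table they came from and pads missing columns with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Polls   (PollId   INTEGER PRIMARY KEY, Question TEXT, "Option" INTEGER);
CREATE TABLE Updates (UpdateId INTEGER PRIMARY KEY, "Text" TEXT);
INSERT INTO Polls   VALUES (1, 'What', 1), (2, 'Why', 4);
INSERT INTO Updates VALUES (1, 'Sleep'), (2, 'Play');
""")

# One round trip: every branch must return the same number of columns,
# so tables with fewer fields pad with NULL.
feed = conn.execute("""
    SELECT 'poll' AS kind, PollId AS id, Question AS body, "Option" AS extra
    FROM Polls
    UNION ALL
    SELECT 'update', UpdateId, "Text", NULL
    FROM Updates
    ORDER BY kind, id
""").fetchall()

for kind, item_id, body, extra in feed:
    print(kind, item_id, body, extra)
```

This works best when the entity types share a few common display fields; as the answer notes, tables with very different shapes make a single wide select list awkward.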
I've got a table in my database storing items:
Items
-------
ItemID
Name
...
Etc
and a separate table storing the PK of two different items from the first table. I want to be able to list the one item, and then any number of related items. I've tried searching for examples but haven't found much surprisingly...
RelatedItems
------------
ItemID
RelatedItemID
If I have four products, whose IDs are 1, 2, 3 and 4... and 1 is related to 2 and 3 I might have data that looks like this:
ItemID RelatedItemID
1 2
1 3
4 1
I am then modeling them in the Entity Framework Designer, and the designer automatically adds an association from the Items table to itself (many to many). The designer also adds two navigation properties, if I use the first property on Item #1 I get all items where Item #1 is in the first column, and if I use the second property I get all the items where Item #1 is in the second column.
However, I just want one navigation property, so that I can say Items.RelatedItems and have it return all the items that the two properties above would return when combined. I know I can join the two results after the fact, but I can't help thinking I'm doing something wrong and there is a better way.
Hopefully this is all clear enough.
It sounds like SQL schemas just aren't very good at modeling the concept you're looking for. The schema you've chosen would work well if you want to establish a directional relationship (item A is related to item B, but item B may or may not be related to item A). If you were looking for a grouping-style relationship (Items A and B are in the same group), I can think of a different approach you'd use. But I can't think of a good way to model an inherently bi-directional relationship using a traditional relational database.
Some workarounds might be to use a View that joins the two results, or to use triggers to make sure that every mapping from A to B has a corresponding mapping from B to A, so that both of the properties always return the same objects.
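The view workaround can be sketched like this, using Python's sqlite3 module and the sample rows from the question. The view name `AllRelatedItems` and the helper `related_to` are assumptions made for the example. The view unions the mapping table with its own mirror image, so a single lookup on `ItemID` sees relationships recorded in either direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE RelatedItems (ItemID INTEGER, RelatedItemID INTEGER);
INSERT INTO Items VALUES (1, 'Item 1'), (2, 'Item 2'), (3, 'Item 3'), (4, 'Item 4');
INSERT INTO RelatedItems VALUES (1, 2), (1, 3), (4, 1);

-- Each stored pair appears once as written and once reversed;
-- UNION (not UNION ALL) removes duplicates if both directions were stored.
CREATE VIEW AllRelatedItems AS
    SELECT ItemID, RelatedItemID FROM RelatedItems
    UNION
    SELECT RelatedItemID, ItemID FROM RelatedItems;
""")

def related_to(conn, item_id):
    """Return the IDs of all items related to item_id, in either direction."""
    rows = conn.execute(
        "SELECT RelatedItemID FROM AllRelatedItems "
        "WHERE ItemID = ? ORDER BY RelatedItemID", (item_id,))
    return [r[0] for r in rows]

print(related_to(conn, 1))
```

Whether an ORM can map a navigation property onto such a view depends on the tool, so treat this only as the database half of the workaround.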
If you have an instance of an Item, call it item, then the following will give you the related items...
item.RelatedItems.Select(ri => ri.Item);
Your RelatedItems property on item (ie the first navigation property you mentioned) will be a collection of RelatedItem objects, each of which has two navigation properties of its own, one of which will be named Item and will be a link to the related item.
Note that this is air code, as I'm not in front of anything I can test this on right now, but I think this will do what you want.
If you want to make it simpler, you can write an extension method to wrap up the Select(), something like this...
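// Extension methods must be declared in a static class (needs System.Linq).
public static class ItemExtensions {
    public static IEnumerable<Item> RelItems(this Item item) {
        return item.RelatedItems.Select(ri => ri.Item);
    }
}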
Then you could just do...
item.RelItems();
Note that I couldn't name the extension method RelatedItems, as that would clash with the navigation property that EF would have created for the second table. That's perhaps not a good name for that table, as it doesn't hold the actual items, only their IDs. Either way, the above code should work.
At the moment the team I am working with is looking into the possibility of storing data entered by users through a series of input wizard screens as an XML blob in the database. The main reason is that I would like to write the input wizard as a component that can be brought into a number of systems without having to bring a large table structure with it.
To clarify: if the wizard has 100 input fields (for example) and I go with the normal relational DB structure, there will be a 1-to-1 relationship between fields and columns, so there will be 100 columns in the database. To get this working in another system, I would then have to bring the tables, stored procedures, etc. into the new system.
I have a number of reservations about this, but I would like people's opinions.
Thanks
If those input fields don't need to be updated, or used later to calculate or compute other values, then storing them as XML or JSON is a smart choice.
So for your scenario it seems like a perfect solution.
I'm building a shopping cart website and using SQL tables
CATEGORY
Id int,
Parent_Id,
Description varchar(100)
Data:
1 0 Electronics
2 0 Furniture
3 1 TVs
4 3 LCD
5 4 40 inches
6 4 42 inches
PRODUCTS
Id int,
Category_Id int
Description...
Data:
1 5 New Samsung 40in LCD TV
2 6 Sony 42in LCD TV
As you can see, I only have one column, for the last child category.
Now what I need to do is search by main category on the homepage. For example, if the user clicks Electronics, show both TVs, since their category chain leads up to Electronics, keeping in mind that the Products table has only one column for category.
Should I update the Products table to include 6 columns for child categories in order to solve this? Or how can I build an effective SQL stored procedure for this?
Thank you
Jerry
In Oracle, you would use CONNECT BY.
If you're using SQL Server 2008 then you might want to look at the HIERARCHYID data type. Otherwise, you might want to consider redesigning the Category table. The way you have it modeled now, you have to use recursion to get from child nodes to parents or from parents down through children.
Instead of using the adjacency list model (which is what you have) you could use the nested set model for hierarchies. Do a search on Joe Celko and Nested Set Model and you should be able to find some good descriptions of it. He also wrote an entire book on modeling trees and hierarchies in SQL. The nested set model requires a bit of setup to maintain the data, but it's much easier to work with when selecting data out. Since your categories will probably remain relatively stable, it seems like a good solution.
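As a rough sketch of the nested set idea, here is the question's sample data in Python with sqlite3. The `lft` and `rgt` column names and the specific numbering are assumptions made for the example, as is the helper `products_in`. Every category stores a left and right bound, and all of its descendants have bounds that fall inside that range, so a whole subtree is selected with one BETWEEN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CATEGORY (
    Id          INTEGER PRIMARY KEY,
    Description TEXT,
    lft         INTEGER NOT NULL,  -- left bound of this node's interval
    rgt         INTEGER NOT NULL   -- right bound; descendants lie strictly inside
);
CREATE TABLE PRODUCTS (Id INTEGER PRIMARY KEY, Category_Id INTEGER, Description TEXT);
INSERT INTO CATEGORY VALUES
    (1, 'Electronics', 1, 10),
    (2, 'Furniture',  11, 12),
    (3, 'TVs',         2,  9),
    (4, 'LCD',         3,  8),
    (5, '40 inches',   4,  5),
    (6, '42 inches',   6,  7);
INSERT INTO PRODUCTS VALUES
    (1, 5, 'New Samsung 40in LCD TV'),
    (2, 6, 'Sony 42in LCD TV');
""")

def products_in(conn, category_id):
    """Return product descriptions in category_id or any category beneath it."""
    rows = conn.execute("""
        SELECT p.Description
        FROM CATEGORY AS root
        JOIN CATEGORY AS c ON c.lft BETWEEN root.lft AND root.rgt
        JOIN PRODUCTS AS p ON p.Category_Id = c.Id
        WHERE root.Id = ?
        ORDER BY p.Id
    """, (category_id,))
    return [r[0] for r in rows]

print(products_in(conn, 1))
```

The trade-off mentioned above shows up on writes: inserting or moving a category means renumbering the `lft` and `rgt` values of other rows.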
EDIT: To actually answer your question... you could write a stored procedure that sits in a WHILE loop, selecting children and collecting any products found in a table variable. Check @@ROWCOUNT in each loop and if it's 0 then you've gotten to the end. Then you just select out from your table variable. It's a recursive (and slow) method, which is why this type of model doesn't work very well in many cases in SQL.
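The loop described above can be sketched outside a stored procedure as well. The following is a minimal Python and sqlite3 version, not T-SQL: a Python set stands in for the table variable, and an empty result stands in for the @@ROWCOUNT check. The helper name `products_under` is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CATEGORY (Id INTEGER PRIMARY KEY, Parent_Id INTEGER, Description TEXT);
CREATE TABLE PRODUCTS (Id INTEGER PRIMARY KEY, Category_Id INTEGER, Description TEXT);
INSERT INTO CATEGORY VALUES
    (1, 0, 'Electronics'), (2, 0, 'Furniture'), (3, 1, 'TVs'),
    (4, 3, 'LCD'), (5, 4, '40 inches'), (6, 4, '42 inches');
INSERT INTO PRODUCTS VALUES
    (1, 5, 'New Samsung 40in LCD TV'),
    (2, 6, 'Sony 42in LCD TV');
""")

def products_under(conn, root_id):
    """Walk the hierarchy one level per query, then fetch the products."""
    found = {root_id}        # plays the role of the table variable
    frontier = [root_id]
    while frontier:          # an empty level means the bottom was reached
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT Id FROM CATEGORY WHERE Parent_Id IN ({marks})",
            frontier).fetchall()
        frontier = [r[0] for r in rows if r[0] not in found]
        found.update(frontier)
    marks = ",".join("?" * len(found))
    rows = conn.execute(
        f"SELECT Description FROM PRODUCTS "
        f"WHERE Category_Id IN ({marks}) ORDER BY Id", sorted(found))
    return [r[0] for r in rows]

print(products_under(conn, 1))
```

It issues one query per level of the tree, which is the cost the answer is warning about.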
Under almost no circumstances should you just add 6 (or 7 or 8) category IDs to your products table. Bad. Bad. Bad. It will be a maintenance nightmare, among other things (what happens when your categories go 7 levels deep... then 8... then 9?).
Use recursive CTEs to do this! It works like a dream: http://msdn.microsoft.com/en-us/library/ms186243.aspx
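A recursive CTE over the question's tables might look like the sketch below. It runs on SQLite through Python's sqlite3 module, so it uses the `WITH RECURSIVE` spelling; SQL Server writes the same query with plain `WITH`. The CTE name `subtree` and the helper `products_for` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CATEGORY (Id INTEGER PRIMARY KEY, Parent_Id INTEGER, Description TEXT);
CREATE TABLE PRODUCTS (Id INTEGER PRIMARY KEY, Category_Id INTEGER, Description TEXT);
INSERT INTO CATEGORY VALUES
    (1, 0, 'Electronics'), (2, 0, 'Furniture'), (3, 1, 'TVs'),
    (4, 3, 'LCD'), (5, 4, '40 inches'), (6, 4, '42 inches');
INSERT INTO PRODUCTS VALUES
    (1, 5, 'New Samsung 40in LCD TV'),
    (2, 6, 'Sony 42in LCD TV');
""")

def products_for(conn, category_id):
    """Return products in category_id and all of its descendant categories."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(Id) AS (
            SELECT Id FROM CATEGORY WHERE Id = ?           -- anchor: the clicked category
            UNION ALL
            SELECT c.Id FROM CATEGORY AS c
            JOIN subtree AS s ON c.Parent_Id = s.Id        -- step: children of rows so far
        )
        SELECT p.Description
        FROM PRODUCTS AS p
        JOIN subtree AS s ON s.Id = p.Category_Id
        ORDER BY p.Id
    """, (category_id,))
    return [r[0] for r in rows]

print(products_for(conn, 1))
```

This keeps the existing Parent_Id design and the single Category_Id column on Products, with no extra category columns needed.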