I'm working on an app for which I'd like to do something very similar to the problem proposed in this question: How to add row-span in view-based NSTableView?
The second answer intrigues me as a clever solution, but I can't figure out how to keep the two table views in sync with each other. In particular, I don't see any obvious way to make sure that the rows in the item table view show up next to the group they correspond to. The most obvious solutions to me are:
Basing the data source of the items table on that of the group table view. So, each object in the group data source has a list of the items that belong to it, then each time the item table view needs a row, iterate through the groups, and count the items in each group until you find the one you need. This sounds horribly inefficient.
Some clever application of NSSortDescriptors on the items such that they end up sorted so the rows match up. This seems kind of magic to me, like you'd be lucky if you could get it to work deterministically.
Keeping a pointer to the group currently being processed and returning its next item until the group is exhausted, then moving on to the next group. This would depend on the table view asking for rows in sequential order, and it seems like it would be really difficult if there were any concurrency or out-of-order requests anywhere.
All of these solutions have some pretty obvious flaws. I feel like I'm missing the "trick" here, or maybe just the giant purple elephant standing in front of me. :)
Edit: In response to rdelmar's comment, I'll add a few clarifications. For my particular case, the number of items in a group is variable — some groups could have two, others ten. The solution I'd like to find shouldn't depend on there being a fixed number of items in a group.
In terms of selection behavior, it's probably not necessary that each item in a group be selectable, but they do need to be editable. Groups will probably be edited as a whole, i.e. the user will say "I want to edit group A", which will trigger the ability to edit any field in the group or the items that belong to it. It's probably possible to use labels instead of a table view, but it seems like that would involve duplicating a lot of work the table view would give you for free (arranging views in a grid, etc).
The very first solution I came up with for this actually involved embedding a table view for the items inside each row of the group table view. So, the top-level table view would consist only of groups, then each group would have its own embedded table for displaying the items it has. I eventually gave up on that solution in hopes of finding one that involved a shorter view tree.
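For what it's worth, the first option above does not have to rescan the groups on every request if the starting row of each group is cached. A minimal sketch of that index arithmetic, written in Python rather than Cocoa and using made-up group data (the names `groups`, `starts`, `locate` and `item_for_row` are hypothetical, not part of any NSTableView API):

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical model: each group owns a variable-length list of items.
groups = [
    ("A", ["a1", "a2"]),
    ("B", ["b1", "b2", "b3"]),
    ("C", ["c1"]),
]

# starts[i] is the flat row index of the first item in group i.
# This list must be rebuilt whenever a group gains or loses an item.
counts = [len(items) for _, items in groups]
starts = [0] + list(accumulate(counts))[:-1]

def locate(row):
    """Map a flat item-table row to (group index, index within that group)."""
    g = bisect_right(starts, row) - 1  # binary search instead of a linear scan
    return g, row - starts[g]

def item_for_row(row):
    g, i = locate(row)
    return groups[g][1][i]
```

Here `item_for_row(4)` would return the third item of group B. The lookup does not depend on the order in which rows are requested, which sidesteps the problem with the third option.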
I am setting up a fairly large dataset (a catalogue) in a SQL database (I'd guess about 100k records) to store information about products. Each product is characterized by about 20-30 properties, so that would basically mean 20-30 columns. The system is set up so that each of these properties is linked to a code, and each product is therefore characterized by a unique string made by concatenating all these codes (the string has to be unique; if two product codes are the same, then the two products are actually the same product). What I am trying to figure out is whether, SQL-wise, there is any difference between storing the catalogue as a table of 20-30 columns and having just one column with the code and decoding the properties from it. The difference being that in one case I would do
SELECT * FROM Catalogue WHERE Color='RED'
versus
SELECT * FROM Catalogue WHERE Code LIKE '____R____________'
Also, the single code column might make it easier to check whether a product already exists, as I would only be comparing one column instead of 20-30. I could also add an extra column to the complete table to store the code, and use one method for some operations and the other method for others.
I have almost no knowledge of how the SQL engine works so I might be completely off with my reasoning here.
The code approach seems silly. Why do I phrase it this way?
You have a few dozen columns with attributes and you know what they are. Why would you NOT include that information in the data model?
I am also amused by how you are going to distinguish these comparisons:
WHERE Code LIKE '____R____________'
WHERE Code LIKE '___R_____________'
WHERE Code LIKE '_____R___________'
WHERE Code LIKE '____R___________'
That just seems like a recipe for spending half of the rest of your life debugging -- if not your code then someone else's.
And, with separate columns, you can create indexes for commonly used combinations.
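A minimal sketch of the separate-columns design, run against an in-memory SQLite database from Python so that it is self-contained. Only three of the 20-30 properties are shown, and the column names `Size` and `Material`, the index name and the helper `try_insert` are assumptions made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Catalogue (
        ProductId INTEGER PRIMARY KEY,
        Color     TEXT NOT NULL,
        Size      TEXT NOT NULL,
        Material  TEXT NOT NULL,
        -- Two products with identical properties are the same product.
        UNIQUE (Color, Size, Material)
    );
    -- Index for a commonly used combination of filters.
    CREATE INDEX idx_catalogue_material_size ON Catalogue (Material, Size);
""")

conn.executemany(
    "INSERT INTO Catalogue (Color, Size, Material) VALUES (?, ?, ?)",
    [("RED", "M", "COTTON"), ("RED", "L", "COTTON"), ("BLUE", "M", "WOOL")],
)

# A readable filter on named columns; no position counting inside a code string.
cotton_medium = conn.execute(
    "SELECT Color, Size, Material FROM Catalogue WHERE Material = ? AND Size = ?",
    ("COTTON", "M"),
).fetchall()

def try_insert(color, size, material):
    """Return True if the product was inserted, False if it already exists."""
    try:
        conn.execute(
            "INSERT INTO Catalogue (Color, Size, Material) VALUES (?, ?, ?)",
            (color, size, material),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

The UNIQUE constraint lets the database do the "does this product already exist" check, so that check is no reason to prefer a single concatenated code column.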
If not all rows have all attributes -- or if the attributes can be expanded in the future -- you might want a structure with a separate line for each attribute:
entityId   code    value
1          Color   Red
This is called an entity-attribute-value (EAV) model and is appropriate under some circumstances.
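A rough sketch of what querying such a structure looks like, again using in-memory SQLite from Python. The table name `ProductAttributes`, the `Size` attribute and the sample rows are assumptions for illustration; only the `entityId`, `code` and `value` columns come from the layout above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductAttributes (
        entityId INTEGER NOT NULL,
        code     TEXT    NOT NULL,    -- attribute name, e.g. 'Color'
        value    TEXT    NOT NULL,
        PRIMARY KEY (entityId, code)  -- at most one value per attribute
    );
""")
conn.executemany(
    "INSERT INTO ProductAttributes VALUES (?, ?, ?)",
    [(1, "Color", "Red"), (1, "Size", "M"), (2, "Color", "Blue"), (3, "Color", "Red")],
)

# Filtering on a single attribute is straightforward.
red_ids = [row[0] for row in conn.execute(
    "SELECT entityId FROM ProductAttributes "
    "WHERE code = 'Color' AND value = 'Red' ORDER BY entityId")]

# Filtering on two attributes needs one self-join per additional attribute.
red_medium_ids = [row[0] for row in conn.execute("""
    SELECT c.entityId
    FROM ProductAttributes AS c
    JOIN ProductAttributes AS s ON s.entityId = c.entityId
    WHERE c.code = 'Color' AND c.value = 'Red'
      AND s.code = 'Size'  AND s.value = 'M'
    ORDER BY c.entityId""")]
```

The self-join in the second query is the usual price of this model: new attributes need no schema change, but multi-attribute filters get more verbose.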
I am trying to figure out how I can relate records from one table to each other.
I have a table with individual cases (e.g. disciplinary, grievance, etc.).
However, several of these cases could relate to each other.
E.g. a group of people get into a fight: all the people involved would get an individual case, and then all the cases would need to be related/linked.
I'm struggling to figure out how best to store this data, whether in the same table or in a new table.
This is the backend to a WinForms application. So now I need to be able to link records; in the example data the user would select that caseIDs 1-4 are linked.
So the question is: how do I store the fact that these cases are related, given that other cases might be linked at a later date?
It sounds like you need some sort of "master case" or "incident".
I think I would say that an "incident" comprises one or more "cases" and keep both an incidents table and a cases table. In fact, "cases" might be a bad name. Instead, it might be more like IncidentPerson.
An alternative approach would be to have a "master case" id on each case. This could be NULL or the same case. I'm not as fond of this approach because it will likely lead to confusion down the road. One analyst will count cases per month using "cases" and another using "master cases" and you'll spend a lot of time trying to figure out why the numbers are different.
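A minimal sketch of the incidents-plus-cases layout, using in-memory SQLite from Python. The column names, the sample data and the helpers `related_cases` and `link_case` are hypothetical, and the sketch assumes every case belongs to exactly one incident:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE Incidents (
        IncidentId  INTEGER PRIMARY KEY,
        Description TEXT NOT NULL
    );
    CREATE TABLE Cases (
        CaseId     INTEGER PRIMARY KEY,
        PersonName TEXT NOT NULL,
        CaseType   TEXT NOT NULL,
        IncidentId INTEGER NOT NULL REFERENCES Incidents (IncidentId)
    );
""")
conn.executemany("INSERT INTO Incidents VALUES (?, ?)",
                 [(1, "Fight"), (2, "Separate grievance")])
conn.executemany("INSERT INTO Cases VALUES (?, ?, ?, ?)", [
    (1, "Ann", "disciplinary", 1),
    (2, "Bob", "disciplinary", 1),
    (3, "Cid", "disciplinary", 1),
    (4, "Dee", "disciplinary", 1),
    (5, "Eve", "grievance", 2),
])

def related_cases(case_id):
    """Return the ids of all other cases that share an incident with case_id."""
    rows = conn.execute("""
        SELECT other.CaseId
        FROM Cases AS mine
        JOIN Cases AS other ON other.IncidentId = mine.IncidentId
        WHERE mine.CaseId = ? AND other.CaseId <> mine.CaseId
        ORDER BY other.CaseId""", (case_id,))
    return [row[0] for row in rows]

def link_case(case_id, incident_id):
    """Attach a case to an existing incident at a later date."""
    conn.execute("UPDATE Cases SET IncidentId = ? WHERE CaseId = ?",
                 (incident_id, case_id))
```

Linking a case later is then a single UPDATE, and counting incidents versus counting cases are two clearly different queries.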
Surfing the net I ran into Aquabrowser (no need to click, I'll post a pic of the relevant part).
It has a nice way of presenting search results and discovering semantically linked entities.
Here is a screenshot taken from one of the demos.
On the left side you have the word you typed and related words.
Clicking them refines your results.
Now, as an example project, I have a data set of film entities and subjects (like world-war-2 or prison-escape) and their relations.
Now I imagine several use cases, first where a user starts with a keyword.
For example "world war 2".
Then I would somehow like to calculate related keywords and rank them.
I am thinking of an SQL query like this:
Let's assume "world war 2" has id 3.
select keywordId, count(keywordId) as total from keywordRelations
WHERE movieId IN (select movieId from keywordRelations
join movies using (movieId)
where keywordId=3)
group by keywordId order by total desc
which basically should select all movies that have the keyword world-war-2, then look up the keywords those films have as well and select the ones that occur most often.
I think with these keywords I can select the movies that match best and have a nice tag cloud containing similar movies and related keywords.
I think this should work, but it's very, very, very inefficient.
And it's also only one level of relation.
There must be a better way to do this, but how??
I basically have a collection of entities. They could be different kinds of entities (movies, actors, subjects, plot keywords, etc.).
I also have relations between them.
It must somehow be possible to efficiently calculate "semantic distance" for entities.
I also would like to implement more levels of relation.
But I am totally stuck. I have tried different approaches, but everything ends up in algorithms that take ages to compute, with a runtime that grows exponentially.
Are there any database systems available optimized for that?
Can someone point me in the right direction?
You probably want an RDF triplestore. Redland is a pretty commonly used one, but it really depends on your needs. Queries are done in SPARQL, not SQL. Also... you have to drink the semantic web koolaid.
From your tags I see you're more familiar with SQL, and I think it's still possible to use it effectively for your task.
I have an application where a custom-made full-text search is implemented using SQLite as the database. In the search field I can enter terms, and a popup list shows suggestions for the current word; for each subsequent word, only those words are shown that appear in articles where the previously entered words appeared. So it's similar to the task you described.
To make things simpler, let's assume we have only three tables. I suppose you have a different schema, and the details may differ, but my explanation is just meant to give an idea.
Words
[Id, Word] The table contains words (keywords)
Index
[Id, WordId, ArticleId]
This table (indexed also by WordId) lists articles where this term appeared
ArticleRanges
[ArticleId, IndexIdFrom, IndexIdTo]
This table lists the range of Index.Id values for any given article (obviously it is also indexed by ArticleId). It requires that, for any new or updated article, the Index table contain that article's entries in a known, contiguous from-to range. I suppose this can be achieved in any RDBMS with a little help from an autoincrement feature.
So for any given string of words you:
Intersect all articles where all previous words appeared. This will narrow the search. SELECT ArticleId FROM Index Where WordId=... INTERSECT ...
For that list of articles, get the ranges of Index records from the ArticleRanges table.
For those ranges, you can efficiently query the WordId lists from Index, grouping the results to get a count and finally sorting by it.
Although I listed them as separate actions, the final query can be a single big SQL statement built from the parsed query string.
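The steps above can be sketched roughly as follows, using Python's built-in sqlite3 module. This is an illustration under assumptions, not the actual application code: the Index table is renamed WordIndex because INDEX is a reserved word in SQLite, the helpers `add_article` and `suggest` and the sample articles are made up, and words are matched by text rather than by a pre-resolved WordId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Words (Id INTEGER PRIMARY KEY, Word TEXT NOT NULL UNIQUE);
    CREATE TABLE WordIndex (
        Id        INTEGER PRIMARY KEY AUTOINCREMENT,
        WordId    INTEGER NOT NULL REFERENCES Words (Id),
        ArticleId INTEGER NOT NULL
    );
    CREATE INDEX idx_wordindex_word ON WordIndex (WordId);
    CREATE TABLE ArticleRanges (
        ArticleId   INTEGER PRIMARY KEY,
        IndexIdFrom INTEGER NOT NULL,
        IndexIdTo   INTEGER NOT NULL
    );
""")

def add_article(article_id, words):
    """Insert an article's words in one run so their WordIndex ids are contiguous."""
    ids = []
    for word in words:
        conn.execute("INSERT OR IGNORE INTO Words (Word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT Id FROM Words WHERE Word = ?", (word,)).fetchone()[0]
        cur = conn.execute(
            "INSERT INTO WordIndex (WordId, ArticleId) VALUES (?, ?)",
            (word_id, article_id))
        ids.append(cur.lastrowid)
    conn.execute("INSERT INTO ArticleRanges VALUES (?, ?, ?)",
                 (article_id, ids[0], ids[-1]))

def suggest(entered):
    """Rank words co-occurring with all entered words, most frequent first."""
    if not entered:
        return []
    one_word = ("SELECT ArticleId FROM WordIndex "
                "JOIN Words ON Words.Id = WordIndex.WordId WHERE Words.Word = ?")
    matching = " INTERSECT ".join([one_word] * len(entered))  # step 1
    marks = ", ".join("?" * len(entered))
    sql = f"""
        SELECT w.Word, COUNT(*) AS total
        FROM ({matching}) AS hits
        JOIN ArticleRanges AS r ON r.ArticleId = hits.ArticleId        -- step 2
        JOIN WordIndex AS i ON i.Id BETWEEN r.IndexIdFrom AND r.IndexIdTo
        JOIN Words AS w ON w.Id = i.WordId
        WHERE w.Word NOT IN ({marks})
        GROUP BY w.Word                                                -- step 3
        ORDER BY total DESC, w.Word
    """
    return conn.execute(sql, list(entered) + list(entered)).fetchall()

add_article(1, ["war", "prison", "escape"])
add_article(2, ["war", "escape", "tunnel"])
add_article(3, ["comedy", "prison"])
```

With this sample data, `suggest(["war"])` ranks "escape" first because it appears in both matching articles, and `suggest(["war", "prison"])` narrows the suggestions to the single article containing both words.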
Having dutifully normalised all my data, I'm having a problem combining 3NF rows into a single row for output.
Up until now I've been doing this with server-side coding, but for various reasons I now need to select all rows related to another row, and combine them in a single row, all in MySQL...
So to try and explain:
I have three tables.
Categories
Articles
CategoryArticles_3NF
A category contains CategoryID + titles, descriptions etc. It can contain any number of articles in the Articles table, consisting of ArticleID + a text field to house the content.
The CategoryArticles table is used to link the two, so contains both the CategoryID and the ArticleID.
Now, if I select a Category record, and I JOIN the Articles table via the linking CategoryArticles_3NF table, the result is a separate row for each article contained within that category.
The issue is that I want to output one single row for each category, containing content from all articles within.
If that sounds like a ridiculous request, it's because it is. I'm just using articles as a good way to describe the problem. My data is actually somewhat different.
Anyway - the only way I can see to achieve this is to use a 'GROUP_CONCAT' statement to group the content fields together - the problem with this is that there is a limit to how much data this can return, and I need it to be able to handle significantly more.
Can anyone tell me how to do this?
Thanks.
Without more information, this sounds like something that should be done in the front end.
If you need to, you can increase the size limit of GROUP_CONCAT by setting the system variable group_concat_max_len. It has a limit based on max_allowed_packet, which you can also increase. I think that the max size for a packet is 1GB. If you need to go higher than that then there are some serious flaws in your design.
EDIT: So that this is in the answer and not just buried in the comments...
If you don't want to change the group_concat_max_len globally then you can change it for just your session with:
SET SESSION group_concat_max_len = <your value here>
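If raising the limit is not an option, the front-end route could look roughly like the sketch below, which collapses the joined rows into one row per category in application code. It is written in Python purely for illustration; the row layout, the sample data and the function name `combine_by_category` are assumptions, since the real column list was not given.

```python
from collections import defaultdict

# Hypothetical rows as returned by the three-table JOIN:
# (CategoryID, category title, ArticleID, article content)
joined_rows = [
    (1, "News", 10, "first article"),
    (1, "News", 11, "second article"),
    (2, "Sport", 12, "third article"),
]

def combine_by_category(rows, separator="\n"):
    """Return one (CategoryID, title, combined content) tuple per category."""
    titles = {}                   # remembers categories in first-seen order
    contents = defaultdict(list)  # CategoryID -> list of article contents
    for category_id, title, _article_id, content in rows:
        titles[category_id] = title
        contents[category_id].append(content)
    return [(cid, titles[cid], separator.join(contents[cid])) for cid in titles]

combined = combine_by_category(joined_rows)
```

Nothing here is bounded by group_concat_max_len; the only limits are those of the client and the size of the joined result set.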
I'm involved in a project to build a survey system. We've been hammering out the logic for a few question types, and I could use a second opinion on the best way to proceed. We work on an ASP.NET 2.0 website using VB (VS2005) with an Oracle database.
On our Oracle server, we plan to use several tables to organize our data. There's a table for our surveys, one for questions (keys determine which survey a question belongs to), one for answers (again, keys determine which question an answer belongs to), and one for answer collection. Most questions only return one response, and that's pretty easy to handle. However, when we start thinking about items that return multiple answers, it gets tricky.
For example, suppose we have a simple 3x3 matrix filled with check boxes. The rows are the days 'Monday', 'Wednesday', 'Friday'. The columns are activities like 'Biking', 'Running', 'Driving'. The user checks each activity they did on a given day, so each row can have more than one response. Another case we want to think about is having textboxes instead of checkboxes, where users write in how many minutes they spent on an activity.
So far, for collecting responses, I like the idea of traversing a list of controls in the form and keeping tabs on the kinds of controls that collect data. Since the controls are created in code, usually they're given an ID of a string with a number affixed to the end to keep track of what question type it is, and what number it is.
Question #1:
Should the data returned from the user be in a single database entry with delimiters to separate each answer, or should each answer get its own entry?
Question #2:
What's the best way to identify what response goes with what answer (on the survey)?
Assuming that space and speed are unlikely to be serious limiting issues in this system, I suggest that you keep your data normalized.
So the answer to question 1 is: each answer gets its own entry, with a sequence number if necessary.
And the answer to question 2 is: by the foreign keys.
Question #1: Should the data returned from the user be in a single database entry with delimiters to separate each answer, or should each answer get its own entry?
I don't think so. I think you need to maintain a simple cross-reference table with the question ID and the answer (either a key if it's multiple choice, or text if you allow it).
If you are talking about an answer grid, then your cross reference table could have one more column, with the id of the "category".
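A minimal sketch of such a cross-reference table, with one row per checked box and the grid row stored in an extra category column. It uses in-memory SQLite from Python rather than Oracle and VB so that it is self-contained; all table names, column names and the helper `activities` are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE Questions (QuestionId INTEGER PRIMARY KEY, Prompt TEXT NOT NULL);
    CREATE TABLE Answers (
        AnswerId   INTEGER PRIMARY KEY,
        QuestionId INTEGER NOT NULL REFERENCES Questions (QuestionId),
        Label      TEXT NOT NULL
    );
    -- One row per individual response, never a delimited list.
    CREATE TABLE Responses (
        RespondentId INTEGER NOT NULL,
        QuestionId   INTEGER NOT NULL REFERENCES Questions (QuestionId),
        Category     TEXT,     -- grid row such as 'Monday'; NULL outside grids
        AnswerId     INTEGER REFERENCES Answers (AnswerId),
        AnswerText   TEXT      -- free text, e.g. minutes typed into a textbox
    );
""")
conn.execute("INSERT INTO Questions VALUES (1, 'Which activities did you do?')")
conn.executemany("INSERT INTO Answers VALUES (?, 1, ?)",
                 [(1, "Biking"), (2, "Running"), (3, "Driving")])
# Respondent 7 ticked Biking and Running on Monday and Driving on Friday.
conn.executemany("INSERT INTO Responses VALUES (7, 1, ?, ?, NULL)",
                 [("Monday", 1), ("Monday", 2), ("Friday", 3)])

def activities(respondent_id, day):
    """Return the labels a respondent ticked for one grid row."""
    rows = conn.execute("""
        SELECT a.Label
        FROM Responses AS r
        JOIN Answers AS a ON a.AnswerId = r.AnswerId
        WHERE r.RespondentId = ? AND r.Category = ?
        ORDER BY a.AnswerId""", (respondent_id, day))
    return [label for (label,) in rows]
```

For the textbox variant, the same row shape would carry the number of minutes in AnswerText, so neither question type needs delimiters.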