I am creating an object oriented representation of a novel or book. I am looking for design patterns or advice in terms of what to make an object, and what to make an attribute of another object.
For example, suppose I'm interested in the characters in the novel, and which chapters and pages they appear on.
The concept I'm struggling with is how to organize an object system with two types of "containers" that both participate in relationships with the same instances. I imagine this comes up in other systems where there is a metaphorical "physical" and a "logical" view of the underlying data.
There are a variety of possible objects here: Novel, Chapter, Page, Character.
There are a variety of relationships between these possible objects:
Novels have a sequence of Chapters, a sequence of Pages, and a collection of Characters
Chapters have a sequence of Pages, and a collection of zero or more Characters
Pages are associated with one or more Chapters, and zero or more Characters
Characters are associated with one or more Pages and Chapters
The purpose of these objects would be to answer questions like:
What pages does character Alice appear on?
What characters appear in chapter 6?
Which characters frequently appear on the same pages?
On what page is character Bob first mentioned?
I'm a bit lost as to how to approach this sort of design. I see a few approaches:
Make everything (Novel, Chapter, Page, Character) an object, and each of these objects has lists of references to other objects they contain / relate to
Give primacy to one or the other of Chapter or Page, and make the other an attribute of the first. For example we could go with just a Novel, Chapter, and Character object list, and make "pages" an attribute of the Chapter object.
We could take the above idea even further and just stick with Novel and Character, and give each Character a few attributes such as: "Occurs in Chapter" and "Occurs on Pages"
Well, I hope that's clear enough for some OO design gurus to suggest where to draw the line between Object or Attribute, and how to design an object system where there are different kinds of containers (Chapter, Page) that the objects of interest (Character) belong to.
It is important that you have stated the purpose of your software will be to answer various questions about the book, including what characters appear on a given page.
This tells us two things:
You are seeking to model the physical layout of a particular
edition of a book (since otherwise the example question above would
make no sense).
Your design is going to be heavily influenced by
the scope of the questions that need to be answered, as this will
determine whether your system will:
use data structures to pre-cache meta-data that will be required as answers (e.g.
explicitly store a list of pages a character appears on),
store the "raw data" (e.g. as a tree of novel > chapters > pages > text) which is then processed to answer a given question,
some combination of both
I suspect that a combination is likely to be right in your case, so your "raw data" will be represented in similar fashion to the first approach you mention:
Make everything (Novel, Chapter, Page, Character) an object, and each
of these objects has lists of references to other objects they contain
/ relate to
except that instances of Page would reference instances of Text, rather than Character which instead would become a meta-data class.
Meta-data could be pre-cached, or generated on demand by trawling the raw data.
Either way, you'll want to normalize your data model.
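As a rough illustration of the combined approach, here is a minimal Python sketch. The class layout, the `build_index` and `characters_in_chapter` helpers, and the naive substring matching are all assumptions made for the example, not a recommended final design. The raw data is a Novel > Chapter > Page tree holding text; the list of pages per character is meta-data that is cached once, while the per-chapter question is answered on demand.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Page:
    number: int
    text: str


@dataclass
class Chapter:
    number: int
    pages: list[Page] = field(default_factory=list)


@dataclass
class Novel:
    chapters: list[Chapter] = field(default_factory=list)


def build_index(novel, names):
    """Pre-cache meta-data: trawl the raw text once, map each name to its pages."""
    index = defaultdict(set)
    for chapter in novel.chapters:
        for page in chapter.pages:
            for name in names:
                if name in page.text:  # naive substring match (assumption)
                    index[name].add(page.number)
    # A set avoids double counting a page that is shared by two chapters.
    return {name: sorted(pages) for name, pages in index.items()}


def characters_in_chapter(novel, names, chapter_number):
    """Generate meta-data on demand by scanning one chapter's pages."""
    found = set()
    for chapter in novel.chapters:
        if chapter.number == chapter_number:
            for page in chapter.pages:
                found.update(n for n in names if n in page.text)
    return sorted(found)


# Sample data: page 2 straddles chapters 1 and 2.
page1 = Page(1, "Alice met Bob.")
page2 = Page(2, "Alice walked on alone.")
page3 = Page(3, "Bob returned.")
novel = Novel([Chapter(1, [page1, page2]), Chapter(2, [page2, page3])])

index = build_index(novel, ["Alice", "Bob"])
print(index["Alice"])   # [1, 2]
print(index["Bob"][0])  # 1, the first page Bob is mentioned on
print(characters_in_chapter(novel, ["Alice", "Bob"], 2))  # ['Alice', 'Bob']
```

Whether to cache or compute each answer would depend on how often the question is asked and how often the text changes.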
I have an MS Access database with several tables. Almost all tables contain inventory information about different classes of items (there are some utility tables which store extra information, such as a list of classes and lists of commonly used lookup values). Some classes of items have particular data specific to them - for instance, volume is relevant for liquids but not solid objects, but all objects have a location. The logical structure of my database is a textbook example of a case where an object oriented model provides clarity and maintainability benefits:
There is one basic table which is a catch-all table for all items that don't fit into other categories. It contains a few columns, like item name, date, location and notes, that are applicable to any item. This would be the top superclass, e.g. class InventoryTable.
There are tables for specific classes, such as a table for printer cartridges. This table will have all the columns that InventoryTable has, but also include some specialized information that is only relevant for printer cartridges, such as printer model, ink color and brand. This table would be a subclass, e.g. class PrinterCartridgeTable : InventoryTable.
Sometimes there is a deeper inheritance structure. For example, there may be a table for all documents (class DocumentTable : InventoryTable, includes extra field for how many pages a document has) and then another table for letters (class LetterTable : DocumentTable which also has columns for sender and recipient of the letter). The assumption is that one would look for letters in the LetterTable, and if not found there, could try looking in the DocumentTable and the top level InventoryTable.
Let's say my dates are currently displayed as MM/DD/YYYY. I want to change them to ISO format (YYYY-MM-DD). Currently, I have to open every single table I have (about 20) and change the format in each one of them one by one. If there was some kind of inheritance mechanism, I could instead change the format only in my top-level InventoryTable, and all my other tables would inherit the change.
Or, suppose I decide to store a new piece of data, called "Owner", for all items. This would describe who entered the item into the inventory. I could simply add this column to InventoryTable, and it would appear in all the child tables automatically.
Lastly, let's say I make cosmetic changes such as rearranging the order of columns. Let's say in my document-related tables, the page number appeared at the end. I instead move the page number to the very beginning of the table - this would propagate to both DocumentTable as well as LetterTable but not unrelated tables.
Bear in mind that I am editing these tables manually using the GUI of MS Access 2013. When editing information pertaining to a single class of items, I would not like to switch back and forth between tables or queries to edit different parts of the same record - I want to be able to see and edit all of the information for any given record in one place. Therefore, some complicated solutions based on chaining queries may be impractical.
Is it possible for me to accomplish what I want (the inheritance structure) in Access using some kind of object oriented scheme? Is there an alternative way of obtaining the same benefits? Do I have no choice except to give up and manually propagate every change to all tables?
The relational data model does not have inheritance built in. There are several design patterns that allow the database designer to mimic the behavior of inheritance in a system of relational tables. Two common designs are known as "Single Table Inheritance" and "Class Table Inheritance". There are two tags in this area with questions that relate to these two techniques, and a brief description in the info under the tag. With one of these two techniques, you will be able to model a superclass/subclass situation.
For a more complete description, you could search for Martin Fowler's treatment of the two techniques on the web. There is a third technique, called "Shared Primary Key" which allows you to enforce the one-to-one nature of the IS-A relationship between members of the subclasses and members of the superclass.
Your big problem in MS Access is going to be implementing the code that these techniques leave to the application programmer. Get ready to do plenty of coding in VBA, and tying this code to the user's dashboard.
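To make "Class Table Inheritance" with a shared primary key more concrete, here is a small sketch. It uses Python's built-in sqlite3 module purely so that it is runnable; it is not Access, and the table, column, and view names are hypothetical. Shared columns live only in the superclass table, the subclass table reuses the same key, and a view shows the whole record in one place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce the FK
conn.executescript("""
CREATE TABLE inventory_item (           -- superclass: columns every item has
    item_id  INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    location TEXT
);
CREATE TABLE printer_cartridge (        -- subclass: only the extra columns
    item_id       INTEGER PRIMARY KEY REFERENCES inventory_item(item_id),
    printer_model TEXT,
    ink_color     TEXT
);
CREATE VIEW printer_cartridge_full AS   -- one place to see the whole record
SELECT i.item_id, i.name, i.location, p.printer_model, p.ink_color
FROM inventory_item AS i
JOIN printer_cartridge AS p ON p.item_id = i.item_id;
""")

# Insert the superclass row first, then the subclass row with the same key.
cur = conn.execute(
    "INSERT INTO inventory_item (name, location) VALUES (?, ?)",
    ("Black cartridge", "Shelf 3"),
)
conn.execute(
    "INSERT INTO printer_cartridge (item_id, printer_model, ink_color) "
    "VALUES (?, ?, ?)",
    (cur.lastrowid, "Model A", "black"),
)

row = conn.execute(
    "SELECT name, location, ink_color FROM printer_cartridge_full"
).fetchone()
print(row)  # ('Black cartridge', 'Shelf 3', 'black')
```

Because the primary key of the subclass table is also a foreign key to the superclass table, each cartridge row corresponds to exactly one inventory row, which is the one-to-one IS-A relationship described above.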
It is not possible to make tables in Access object-oriented because it is not possible to directly associate methods with tables. An object is defined to be both properties and methods. Access is not designed to do that.
Also note that Access is not the best that Microsoft has to offer. You will get more power and capabilities with SQL Server.
Does the name of a directory title 'the container' or the 'contents'? This question nags at me because if the name of a directory semantically titles 'the container' then the name should be singular. (By analogy: When referring to an actual physical bag that contains your groceries - one would probably refer to it as a 'grocery' bag and not the 'groceries' bag.) Conversely if one were to assert that the name of the directory titles the contents of the directory then it would make more sense to use a plural form.
I understand that there are common-sense and even usability concerns associated with this question; however, although I would like to hear the practical results of these two options I am more concerned with semantics.
So in summary: does the name of a directory serve as a title for the container or the contents?
Thanks.
A directory on its own needs no name as a directory without any content is useless. Directories exist to group a set of files together. Even if a directory is currently empty, it represents such a group, just that there are currently no files in this group and that's why it is empty. So the name of a directory should always describe what you can find within that directory.
Assume you have a drawer with boxes and you use these boxes to group physical objects together. To know what is inside each box without having to first open it and look inside, you label the boxes. How would you label these boxes?
If a box contains pencils, you'd label it Pencils and not Pencil, correct? If a box contains paper clips, you'd label it Paper Clips and not Paper Clip, wouldn't you? That's because in these cases the label only describes the kind of item to be found within the box. Same goes for directories. A directory containing pictures should most likely be named Pictures, so you know that the files you can find inside it are of type picture.
But sometimes you group items together, not because they are of the same kind but because they belong to the same "entity". E.g. if you have a large box that contains all items related to your trip to Japan in 2012, you would label it "Trip to Japan, 2012" or maybe just "Japan, 2012". Actually you could label it "Trip to Japan in 2012 Items" but "Items" is redundant, as it is obvious you will find items inside. In the same way, it is redundant to add "Files" to a directory name. So if you are not grouping files because the files themselves have something in common but because they belong to a common "entity", you usually name the directory after that entity, and since there is only one such entity, it would be singular.
A directory with the pictures of Peter's Birthday would most likely be named Pictures/Peter's Birthday. On the other hand, if you keep pictures of every birthday of Peter, year after year, you would rather use a structure like Pictures/Peter's Birthdays/2016. Note how it suddenly became "Birthdays" as now the directory name again describes the kind of items found inside and not an individual event/purpose.
As a general rule of thumb: Always name directories in such a way that the reader of the directory name has a very good idea of what kind of files and other directories they can expect to find inside that directory, so they can decide whether it's interesting to "go there" or not by just having read the directory name.
If you name a directory Recipe, what will the reader expect to find inside? I would expect to find one or more files, all belonging to a single recipe, e.g. a short ingredient list, a longer instruction text and maybe some supporting photos. Conversely, if you name the directory Recipes, what will the reader expect to find there? I would expect several recipes, either multiple files where every file contains one recipe, or multiple sub-directories where each one contains the files that belong to one recipe. As you can clearly see from this simple example, whether you choose plural or not has an effect on the expectation of the reader.
My first impulse is to say that you name a generic container based on its contents. But I'll dig a little deeper just for the fun of it. (Scroll to the end if you just want the summarised conclusion.)
Firstly, I don't think Tum's car and cake examples help us very much. Sure, they are made from simpler components, but only for the purpose of creating new and self-contained objects. The grocery bag is a collection of molecules—so what. The more meaningful thing to ask is: is the object's fundamental purpose to hold other objects? In other words, is it a generic container, like the folder in a file system? You would definitely answer no to the cake. You would probably answer no to the car (even though it does, admittedly, hold people). You would certainly answer yes to the grocery bag.
In your grocery bag example, you said we would refer to it as a grocery bag and not the groceries bag. Sure. The following sentences are all grammatically correct and natural sounding:
Would you help me bring the bags in?
Would you help me bring the grocery bags in?
Would you help me bring the groceries in?
Would you help me bring the shopping in?
The first sentence is only meaningful if the listener knows we just went shopping and can infer the contents of said bags without us telling them. Without that information, the bags might, for all we know, contain venomous snakes. The second sentence is the most descriptive, but includes redundant information. Anyone familiar with groceries and the process of buying them can reasonably infer that they will be contained in bags. The third sentence tells us all we need to know in the most succinct manner.
The fourth sentence takes a different approach entirely, labelling the singular activity that produced the groceries. But it doesn't tell us what kind of product we shopped for, so it’s not as descriptive. (Sometimes this approach is the best option as will be seen in other examples.)
If you look at the user's home folder on any new Windows PC or Mac, you'll find it pre-populated with folders like Documents, Downloads, and Pictures. A folder labelled 'Pictures' tells us all we need to know. You could choose to suffix all your directory names with 'dir', 'folder' or something similarly redundant, but it adds nothing meaningful. (I'm old enough to remember seeing Mac users of yesteryear add the 'ƒ' character—generated by pressing Option-F—to the end of folder names. Crazy times.)
Ah, but there are exceptions! On my Mac, Apple (in its infinite wisdom, of course) chose singular names for three subfolders: Desktop, Library and Public. The rationale, one suspects, is that no one knows what the Desktop contains—not even the user oftentimes! Similarly, the Public folder might contain anything. To get all scholarly, we would call it a heterogeneous collection. It’s like that big box of stuff you pull out at Christmas time, which contains a hotchpotch of stuff like tinsel, baubles, stockings, wrapping paper and fake snow—easier to just label it ‘Christmas’, according to its purpose and theme.
The Library folder is an interesting one—not so much because Apple didn't name it according to its contents, but because it gives us an interesting real-world example. Is the purpose of a library to hold other objects, or is it a functional object in its own right? I'd say it's somewhere in the middle. Let’s say we decide to label a real-world library based on contents. We could call it ‘Books’. (Most libraries contain more than books, but to keep it simple we'll imagine that our library only contains books.) A library could hang a big sign out front that just said 'Books'—but then you might assume the books were for sale, rather than free to borrow. You'd probably assume this because you live in a capitalist society where most big signs are trying to peddle something. So the question of semantics is also one of context.
There's one more question of context that you didn't supply in your question. Where are these directories stored? Are they on your personal computer, or are they on a web server? Why does it matter? Well, on your PC, you're probably viewing each folder in a GUI, where the label is attached to an individual icon. But if the directories are part of a website structure, it's more likely that users will only ever see the names as part of a file path (if they notice them at all). The question here is, do you care, and if so, do you want the file path to read like a sentence? This is best illustrated by an example:
http://acme.com/order/explosive/detonator/dx3000
While each category could be plural, the path reads more like an English sentence by using singular directory names. While this seems a bit forced perhaps, you do see this approach on some websites.
TL;DR: When the function or theme of a container is more descriptive than its contents, a singular label (e.g. Library, Desktop, Christmas) can work best. The more heterogeneous a collection, the more likely this approach will make sense. But in most cases, labelling the plural contents (e.g. Documents, Downloads, Pictures) of the collection is easier and more descriptive.
Interesting question.
If you look at a class as a substitute for a tag name (something you would do when using a <div> or a <span>, which have no semantic meaning), your class would have to describe the contents of the element.
But at the same time, the element itself is the content. If you look at a car you say 'this is a car', you don't say 'these are car parts'. Or, if you eat a cake you call it 'cake' and not 'ingredients'.
So I guess a semantic class name would be singular and not plural, because it is always only one element. This might result in using a lot of 'wrapper' in your class names, because finding a fitting name for your element is not always that easy.
I hope this answers your question, if not you might want to read this: http://css-tricks.com/semantic-class-names/
The convention proposed by user Mecki is coherent, but I wonder whether it is really useful in the context of file organization.
In my experience, a very useful convention is to see file path components as a collection of searchable tags. This aids in classifying and finding files, which, for me, is the main purpose of establishing a convention for directory and file names.
For example, one file could be named camera-MODEL-manual.pdf (where MODEL is the concrete model). There could be also a directory that contains various manuals. Or perhaps there are multiple manuals for a particular device. In that case, they could be kept in manual/DEVICE/.... Naming directories that contain manuals with the tag "manual" in singular helps searches for manuals.
Within this convention, one can search for file paths containing "birthday" and "peter" without having to remember whether the directory is "2022-birthday-peter" or "birthday/peter/2022". Or one can search for all PDF files whose path contains "car" to find some car-related document.
While in English most plural forms are simply created by adding an "s" suffix, so that a search for "birthday" may bring up "birthdays", this is not true in many other languages. And even in English, it may be preferable to search for whole words, such that a search for "birth" does not bring up "birthday".
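The convention can be sketched in a few lines of Python. The helper names `path_tags` and `search`, the sample paths, and the choice to split on any non-letter, non-digit character are assumptions for illustration. Each path is broken into whole-word tags, so the order and nesting of the components no longer matter and "birth" does not match "birthday".

```python
import re
from pathlib import PurePosixPath


def path_tags(path):
    """Split every path component into lowercase whole-word tags."""
    tags = set()
    for part in PurePosixPath(path).parts:
        # [\W_]+ splits on punctuation, spaces and underscores but keeps
        # letters and digits, including non-ASCII letters.
        tags.update(t for t in re.split(r"[\W_]+", part.lower()) if t)
    return tags


def search(paths, *wanted):
    """Return the paths that carry every wanted tag as a whole word."""
    wanted_tags = {w.lower() for w in wanted}
    return [p for p in paths if wanted_tags <= path_tags(p)]


# Hypothetical sample paths.
paths = [
    "2022-birthday-peter/cake.jpg",
    "birthday/peter/2022/cake.jpg",
    "manual/camera/setup.pdf",
    "birth-certificate.pdf",
]

print(search(paths, "birthday", "peter"))  # both birthday paths, either layout
print(search(paths, "birth"))              # ['birth-certificate.pdf']
print(search(paths, "manual", "pdf"))      # ['manual/camera/setup.pdf']
```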
This is a generic question, I don't know if it belongs to Programming or StackOverflow.
I'm writing a little simulation. Without going very deep into its details, consider that many kinds of entities are involved. They correspond to objects, since I'm using an OOP language.
There are Guys that inhabit the world simulated
There are Maps
A map has many Lots, that are pieces of land with some characteristics
There are Tribes (guys belong to tribes)
There is a generic class called Position to locate the elements
There are Bots in control of tribes that move guys around
There is a World that represents the world simulated
and so on.
If the simulated world was laid down as a database, the objects would be tables with lots of references, but in memory I have to use a different strategy. So, for example, a Tribe has an array of Guys as a property. The World has an array of Bots, of Tribes, of Maps. A Map has a Dictionary whose key is a Position and whose value is a Lot. A Guy has a Position that is where he stands.
The way I lay down such connections is pretty much arbitrary. For example, I could have an array of Guys in the World, or an Array of guys per Lot (the guys standing on a piece of land), or an array of Guys per Bot (with the Guys controlled by the bot).
Doing so, I also have to pass around a lot of objects. For example, a Bot must have information about the Map and opponent Guys to decide how to move its Guys.
As said, in a database I'd have a Guys table connected to the Lots table (indicating its position), to the Tribe table (indicating which Tribe it belongs to) and so it would also be easy to query "All the guys in Position [1, 5]". "All the Guys of Tribe 123". "All the Guys controlled by Bot B standing on the Lot b34 not belonging to the Tribe 456" and so on.
I've worked with APIs where to get the simplest information you had to make an instance of the CustomerContextCollection and pass it to CustomerQueryFactory to get back a CustomerInPlaceQuery to... When people criticize OOP and cite verbose abstractions that soon smell ridiculous, that's what I mean. I want to avoid such things and having to rely on deep abstractions and (anti-pattern) abstract contexts.
The question is: what is the preferred, clean way to manage entities and collections of entities that are deeply linked in multiple ways?
It depends on your definition of "clean". In my case, I define clean as: I can implement desired behavior in an obvious, efficient manner.
Building OOP software is not a data modeling exercise. I'd suggest stepping back a little. What does each one of those objects actually do? What methods are you going to implement?
Just because "guys are in a lot" doesn't mean that the lot object needs a collection of guys; it only needs one if there are operations on a lot that affect all the guys in it. And even then, it doesn't necessarily need a collection of guys - it needs a way to get the guys in the lot. This may be an internally stored collection, but it could also be a simple method that calls back into the world to find guys matching some criteria. The implementation of that lookup should be transparent to anyone.
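A minimal Python sketch of that idea follows. The class shapes and the `guys_at` method name are hypothetical, and positions are plain tuples to keep it short. The Lot stores no guys at all; its `guys()` method calls back into the World, and callers cannot tell the difference.

```python
class Guy:
    def __init__(self, name, position):
        self.name = name
        self.position = position  # (x, y) tuple, an assumption for brevity


class World:
    def __init__(self):
        self.guys = []

    def guys_at(self, position):
        """Linear scan; could later be swapped for an indexed lookup."""
        return [g for g in self.guys if g.position == position]


class Lot:
    def __init__(self, world, position):
        self._world = world
        self.position = position

    def guys(self):
        """Callers only see 'the guys in this lot', not how they are found."""
        return self._world.guys_at(self.position)


world = World()
world.guys += [Guy("Ann", (1, 5)), Guy("Ben", (2, 2))]
lot = Lot(world, (1, 5))
print([g.name for g in lot.guys()])  # ['Ann']
```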
From the tenor of your questions, it seems like you're thinking of this from a "how do I generate reports" perspective. Step back and think of the behaviors you're trying to implement first.
Another thing I find extremely valuable is to differentiate between Entities and Values. Entities are objects where identity matters - you may have two guys, both named "Chris", but they are two different objects and remain distinct despite having the same "key". Values, on the other hand, act like ints. From your above list, Position sounds a lot like a value - Position(0,0) is Position(0,0) regardless of which chunk of memory (identity) those bits are stored in. The distinction has a big effect on how you compare and store values vs. entities. For example, your Guy objects (entities) would store their Position as a simple member variable.
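In Python, one way to express this distinction (a sketch, not the only option) is a frozen dataclass for the value and an identity-compared class for the entity. The field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Value: equality and hashing depend only on the coordinates."""
    x: int
    y: int


@dataclass(eq=False)
class Guy:
    """Entity: eq=False keeps the default identity-based comparison."""
    name: str
    position: Position  # stored as a simple member variable


chris_a = Guy("Chris", Position(0, 0))
chris_b = Guy("Chris", Position(0, 0))

print(chris_a == chris_b)                    # False: two distinct entities
print(chris_a.position == chris_b.position)  # True: same value

# Because Position is a hashable value, it works as a dictionary key,
# matching the Map described in the question (Position -> Lot).
lots = {Position(0, 0): "meadow"}
print(lots[chris_b.position])                # meadow
```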
I've found a great reference for how to think about such things is Eric Evans's "Domain-Driven Design" book. He's focused on business systems, but I've found the discussions very valuable for how you think about building OO systems in general.
I would say that no 'true' answer exists to your core question -- a best way to manage collections of entities that are linked in multiple ways. It really depends on the kind of application (simulation) - here are some thoughts:
Is execution time important?
If this is the case, there is really no way around analyzing how your simulator will iterate over (query) the objects from the pool: sketch out the basic simulation loop and check which kinds of events will require iterating over which kinds of model entities (I assume you are developing a discrete-event simulation?). Then you should organize the data structures in a way that optimizes the most frequent/time-consuming events (as opposed to "laying down the connections arbitrarily"). Additionally, you may want to use special data structures (such as k-d trees) to organize entities with properties that you need to query often (e.g., position data). For some typical problems, e.g. collision detection, there are also many established approaches to solving them efficiently (so look for suitable libraries/frameworks, e.g. for multi-agent simulation).
How flexible do you want to make it?
If you really want to make it super-flexible and really don't want to decide on the hierarchy of the model entities, why not just use an in-memory database? As you already said, databases are easily applicable to your problem (and you can easily save the model state, which may also be useful).
How clean is clean enough?
If you want to be absolutely sure that the rest of your simulator is not affected by the design choices you make with regard to your model representation, hide it behind an interface (say, ModelWorld), which defines methods for all the types of queries your simulator may invoke (this is orthogonal to the second point and may help with the first point, i.e. figuring out what kind of access pattern your simulator exhibits). This allows you to change implementations easily, without affecting any other parts of the simulator code.
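Such an interface might look like the following Python sketch. Only the name ModelWorld comes from the text above; the two query methods, the `ListBackedWorld` implementation, and the `count_crowd` helper are hypothetical. Simulator code depends on the interface only, so the flat list could later be replaced by an indexed structure or an in-memory database.

```python
from typing import Protocol


class ModelWorld(Protocol):
    """The queries the simulator is allowed to ask of the model."""

    def guys_at(self, position: tuple[int, int]) -> list[str]: ...

    def guys_of_tribe(self, tribe_id: int) -> list[str]: ...


class ListBackedWorld:
    """Simplest implementation: one flat list, scanned on every query."""

    def __init__(self):
        self._guys = []  # (name, tribe_id, position) tuples

    def add_guy(self, name, tribe_id, position):
        self._guys.append((name, tribe_id, position))

    def guys_at(self, position):
        return [n for n, _, p in self._guys if p == position]

    def guys_of_tribe(self, tribe_id):
        return [n for n, t, _ in self._guys if t == tribe_id]


def count_crowd(world: ModelWorld, position) -> int:
    """Simulator code: knows the interface, not the storage layout."""
    return len(world.guys_at(position))


world = ListBackedWorld()
world.add_guy("Ann", 123, (1, 5))
world.add_guy("Ben", 456, (1, 5))
world.add_guy("Cid", 123, (2, 2))

print(count_crowd(world, (1, 5)))  # 2
print(world.guys_of_tribe(123))    # ['Ann', 'Cid']
```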
I'm a newbie in SharePoint development.
I have a hierarchical structure like an internet forum:
Forum
Post
Comment
For each of these entities I created a content type.
I see that SharePoint allows storing different content types in one list, so I can store all forums with their posts and comments in a single list (Forum and Post would be 'Folder', Comment would be 'Item').
On the other hand, I can create separate lists for each content type:
Forums List, Posts List, Comments List, and link them in some way.
Can anybody outline the pros and cons of both solutions? I have about 2 weeks of experience in SharePoint and can't choose the best way.
P.S. Sorry for my English.
The short answer is: it depends.
First, they need to logically fit together. A user should expect items of these various types to be grouped together (or at least wouldn't be surprised that they have been grouped together). And in terms of design, they should have some common intersection of list type and fields. Combining Documents, Discussions, and Events into a single list wouldn't be a good idea. Likewise, I'm not sure Posts and Comments (as you mention above) would be a good fit for a single list. They just don't logically fit and their schemas probably do not have enough in common.
Once that has been determined, I would put multiple Content Types in the same list if they are meant to be used together. Will you want to show all of these items, regardless of Content Type, together in a view? Do all of these items share the same workflows, policies, permissions, etc? If the answer is no for any of these, then split the Content Types into different lists.
As I said, it depends. I'm not sure there really is a hard and fast rule for this. I see it a little like database normalization. We know the forms and the options. But depending on the project, sometimes we normalize a little more, sometimes we denormalize a little more, but we almost never (I hope) have one monster table that contains every type of row in the database.
I'm looking at a job description that I'm considering applying for, and one of the requirements listed is "Familiar with Meta-Data design principles".
Can someone give a brief explanation? I'm probably familiar with the concept, but I've never heard that terminology before.
I did Google to find more info, but didn't get good results. Except for this white paper titled Metadata Principles and Practicalities. It was a little heavy, and I was hoping to find a quick explanation.
Additional Note: Thanks for all the answers so far. They've been very good. I wanted to clarify that I'm familiar with what metadata is, but I've just never heard of "metadata design principles". What sort of design principles are there for metadata? Is this a large enough topic for a book? For a pamphlet? As Robert Harvey points out, it sounds like a nebulous term invented by someone in HR.
I'll bet it means "design principles include being driven by meta-data".
There aren't many design principles for meta-data -- it's usually given by your tools.
However, some organizations want to use meta-data as a key part of application software specification, construction and operation.
If they want someone whose design principles include using meta-data heavily, then it might come out as a phrase like "meta-data design principles".
But, before I said anything, I'd ask them what they think they meant by this.
Essentially, that would be the design of data about data; that is, characterizing data with additional data. Metadata is data about data; where data can be the orders that you get for a given item, the metadata about it can be things like how MANY orders you got, etc. Proper metadata design involves understanding what types of information are likely to be useful and interesting about whatever data you're analyzing, and recognizing how to most appropriately track and capture it.
For example, the number of sales of a given book in a particular day may be useful; not necessarily so the number of sales of the same book in a given minute. Likewise, the number of sales in a given year may be less useful than sales by month, etc. In this example, it's granularity, but metadata design can involve many other things; perhaps geographic distribution of sales is important, as another example.
The phrase, "Familiar with metadata design principles," sounds suspiciously like one of those nebulous phrases invented by an HR department that has no clue what they are talking about. However, I'll take a stab at it.
Metadata is data that enhances other data by describing the properties or characteristics of that other data.
Examples:
In the following tag:
<a href="...">Link to Google</a>
the href descriptor is metadata because it "decorates," or further describes, the link. It is a property of the link. In general all HTML attributes are metadata.
A C# attribute is metadata. Microsoft calls attributes "a way to associate declarative information with a class."
[System.Serializable]
public class SampleClass
{
// Objects of this type can be serialized.
}
In a database table, the value contained in the Address field of a record:
12345 Main Street
is just data, but the field's definition in the database:
Type: Text
Length: 50
is metadata.
In an MP3 file, the audio is just data, but the MP3 tags such as Author, Title, and Bitrate are metadata.
XML is data, XSD is metadata. XSD can be used to express a set of rules to which an XML document must conform in order to be considered 'valid'.
The number of sales of a particular book in a given period is not metadata for the book, because it does not further describe the book itself, only its sales. However, the Author, Title, and number of pages of a book are metadata for that book (as is the ISBN).
There. Now you know all about "Metadata Design Principles."
Here is an excerpt from "Applying UML and Patterns" by C. Larman:
Reflective or Meta-Level Designs
An example of this approach is using
the java.beans.Introspector to
obtain a BeanInfo object, asking for
the getter Method object for bean
property X, and calling
Method.invoke. The system is
protected from the impact of logic or
external code variations by
reflective algorithms that use
introspection and meta-language
services. It may be considered a
special case of data-driven designs.
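The excerpt describes Java's bean introspection. As a loose analogue only (this is not the java.beans API), the same reflective idea can be sketched in Python with the built-in `getattr`. The `Sale` class, the `get_` naming convention, and the `read_property` helper are hypothetical.

```python
class Sale:
    """Hypothetical bean-like class with a conventionally named getter."""

    def __init__(self, total):
        self._total = total

    def get_total(self):
        return self._total


def read_property(obj, prop):
    """Find the getter for `prop` by name at run time, then invoke it."""
    getter = getattr(obj, f"get_{prop}")  # looked up by name, not hard-coded
    return getter()


print(read_property(Sale(99), "total"))  # 99
```

The calling code never names `get_total` directly, so a new property on the class needs no change in `read_property`.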