Object oriented design. What better? [closed] - oop

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I have a Printer class that should print a number and text. The number never changes for a given Client object. I have several Client objects with different number values.
What design is better?
In sample1 the number is passed to the print() method as an argument, so all Client objects share a single Printer object. In sample2 the number is passed to the Printer constructor, so each Client object has its own Printer object.
Please help me figure it out.

Number 2 seems to fit your requirements better.
In the first solution you use a "generic" Printer, which knows nothing about Clients or their numbers and therefore needs the number as a parameter. This seems logical because you probably have a physical "printer" in real life, and that does not depend on any Clients.
However, your object model must fit the requirements, not "real life". This is a bit confusing, because we sometimes call the "requirements" "real life". Regardless, your requirements clearly state that Client wants to print some text and for the client the "number" is static, i.e. irrelevant. So just make a mental change, that the Printer is not a generic printer, but a Printer specifically there for the Clients.
With this mental model, solution 2 clearly fits better.
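A minimal sketch of what solution 2 might look like, written in Python since the original samples are not shown. The class and method names and the "number: text" output format are assumptions made for illustration only.

```python
class Printer:
    """A printer dedicated to one client: the number is fixed at construction."""

    def __init__(self, number):
        self._number = number

    def print(self, text):
        # The caller only supplies the text; the number is already known.
        line = f"{self._number}: {text}"
        print(line)
        return line


class Client:
    """Each client owns its own Printer (hypothetical wiring)."""

    def __init__(self, number):
        self._printer = Printer(number)

    def say(self, text):
        return self._printer.print(text)
```

With this shape the client never repeats its number at the call site, which is the point of the argument above.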

I would recommend solution 1.
Because if the number is not a requirement for the printer, it shouldn't go there. If the number is specific to clients, they should store their individual value and pass it to the printer. This makes the printer reusable in other places, where the number might change.
What if you need to query the client's number? Solution 2 makes you ask the printer about the number specific to the client, whether it is unique or not. That's not good: you are violating separation of concerns. Besides that, it doesn't feel smooth, right? Solution 2 forces you to initialize the printer again or create a new instance any time the number changes (and it will, as you stated).
The printer shouldn't care about the content it is printing. To make it more reusable, you could make the 'Print()' method accept an IPrintable with a method 'GetData()'. Then the printer doesn't have to change the signature of 'Print()' any time you add new content types or content, and you would also avoid too many arguments in the method's signature. New content types just implement IPrintable.
Now you decided you need to print a number, a text and an additional date or timestamp? Then simply modify the IPrintable object or create a new implementation instead of modifying the Printer class itself. The IPrintable object could also be responsible for formatting the output. The Printer shouldn't care about formatting either, in order to stay generic. Otherwise small changes require you to implement a new printer.
A printer usually has a queue to allow concurrent use by the clients. This is a lot more difficult to implement if you store that kind of information directly in the printer object, and the code will no longer look clean. It is better to keep associated data together, e.g. inside an IPrintable parameter.
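The IPrintable idea could be sketched roughly as follows. This is a Python illustration, not the asker's code: Printable, ClientDocument, submit and print_all are hypothetical names, and the simple list used as a queue ignores real concurrency.

```python
from typing import Protocol


class Printable(Protocol):
    """Anything the printer can print (hypothetical counterpart of IPrintable)."""

    def get_data(self) -> str:
        ...


class ClientDocument:
    """Bundles a client's number with the text to print."""

    def __init__(self, number, text):
        self.number = number
        self.text = text

    def get_data(self):
        # Formatting lives with the content, not with the printer.
        return f"{self.number}: {self.text}"


class Printer:
    """Generic printer: knows nothing about clients or their numbers."""

    def __init__(self):
        self.queue = []

    def submit(self, item):
        # Each queued item carries all the data it needs.
        self.queue.append(item)

    def print_all(self):
        # Drain the queue in submission order and return what was printed.
        printed = [item.get_data() for item in self.queue]
        self.queue.clear()
        for line in printed:
            print(line)
        return printed
```

Adding a timestamp would then mean writing another class with a get_data() method, leaving Printer untouched.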

Related

Should the back-end perform conformity checks before a SQL query? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
Context
By conformity check I mean eliminating queries that definitely are going to return nothing.
For example:
Consider table boxes, where one of the available columns is color CHAR(6);
A user sends this string 'abcdefg' to be queried against column color through his interaction with the front-end;
Then, the back-end would execute a query similar to SELECT * FROM boxes WHERE color = ?, using the same string mentioned above;
At least in my PostgreSQL installation I can execute this query, even knowing it's never going to return anything (the length of 'abcdefg' is 7).
Currently, both the front-end and the back-end perform conformity checks prior to accessing data from our DB (to avoid unnecessary calls).
As a matter of fact, the front-end is designed to forbid users from requesting invalid queries. But supposing that these checks didn't take place, especially at the back-end, how significant would that be to an application?
Question
How does PostgreSQL treat these queries? Does it have any type of algorithm that instantly returns nothing if such a query is executed? Or would it be better to not call the DB and just send the user something like not found or invalid request?
Further Context
We already sanitize all input acquired from our front-end interfaces, so this is not a question about the possible benefits/downsides regarding the safety gained after the execution of these checks.
The language used at our back-end is Go, which I believe to have no issues at performing these checks regularly (i.e. on most HTTP requests).
PS.: I know you can cast hexadecimal to ints in PostgreSQL, this is just a hypothetical problem which I used to ease the comprehension of the problem (I hope it did).
I would perform such checks either in the frontend or in the backend, wherever it is most convenient, but not in both. The second line of defense is the database, and two is enough.
It is a good thing to find incorrect data in the application, but don't go overboard: if you hard-code something like a maximal string length in both the database and the application, you'll have to modify that limit in two places whenever you do, and code redundancy is a bad thing.
What is still sane depends a lot on taste and opinion: I think it is fine to check length limits in the application rather than relying on errors from the database, but I think it is questionable to burden the application with complicated logic that guesses at the results of SQL statements.
What is important is to model all your important consistency checks in the database, then nothing much can go wrong as long as you catch and gracefully handle database errors. Everything beyond that can be considered performance tuning and should only be done if it offers a demonstrable benefit.
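As a rough illustration of that split, the sketch below keeps one cheap length check in the application and leaves the real consistency rule to the database. It uses Python with an in-memory SQLite database purely as a stand-in for the asker's Go and PostgreSQL setup; COLOR_LENGTH, open_db, find_boxes and add_box are hypothetical names. SQLite does not enforce CHAR(6) lengths, so the sketch uses an explicit CHECK constraint instead.

```python
import sqlite3

COLOR_LENGTH = 6  # assumed single source of truth for the length limit


def open_db():
    conn = sqlite3.connect(":memory:")
    # The database stays the authority: the CHECK constraint rejects bad rows.
    conn.execute(
        "CREATE TABLE boxes (id INTEGER PRIMARY KEY, "
        f"color TEXT NOT NULL CHECK (length(color) = {COLOR_LENGTH}))"
    )
    return conn


def find_boxes(conn, color):
    # Cheap pre-check: a value of the wrong length can never match.
    if len(color) != COLOR_LENGTH:
        return []
    query = "SELECT id, color FROM boxes WHERE color = ?"
    return conn.execute(query, (color,)).fetchall()


def add_box(conn, color):
    # No duplicated validation logic: catch the database error and handle it.
    try:
        conn.execute("INSERT INTO boxes (color) VALUES (?)", (color,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Whether the pre-check in find_boxes is worth keeping is exactly the kind of performance tuning that should be measured first.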

Primary Key Type Guid or Int? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I am wondering what the recommended type for a PK in SQL Server is. I remember reading this article a long time ago, but now I am wondering if it is still a wise decision to use a GUID.
One reason that got me thinking about it is that these days many sites use the id in the URL; for instance, Course/1 would get the information about that record.
You can't really do that with a GUID, which would mean you would need some new column that is unique and use that, which is more work as you have to make sure each record has a unique number.
There is never a "one solution fits all". You have to carefully design your architecture and select the best options for your scenario. Both INT and GUID types are valid options like they've always been.
You can absolutely use a GUID in a URL. In fact, in most scenarios, it is better to use a GUID (or another random ID) in the URL than a sequential numeric ID, for security reasons. If you use sequential IDs, your site visitors will be able to easily guess other users' IDs and potentially access their content. For example, if my profile URL is /Profiles/111, I can try /Profiles/112 and see if I can access it. If my reservation URL is Reservation/444, I can try Reservation/441 and see what happens. I can easily guess other IDs in the system. Of course, you must have strong permissions, so I should not be able to see those other pages that don't belong to my account, but if there are any issues or holes in your permissions and security, a breach can happen. With GUIDs and other random IDs, there is no practical way to guess other IDs in the system, so such a breach is much more difficult.
Another issue with sequential IDs is that your users can guess how many accounts or records you have and their order in your database. If my ID is 50269, I know that you must have almost this number of records. If my ID is 4, then I know that you had very few accounts when I registered. For that reason, many developers start the first ID at some random high number like 1529 instead of 1. It doesn't solve the issue entirely, but it avoids the issues with small IDs. How important all that guessing is depends on the system, so you have to evaluate your scenario carefully.
That's on top of the benefits mentioned in the article that you mentioned in your question. But still, an integer is better in some areas, so choose the best option for your scenario.
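To make the URL point concrete, here is a small Python sketch that builds and parses profile URLs keyed by a random UUID. The /Profiles/ prefix follows the example above; profile_url and parse_profile_id are hypothetical helpers, and none of this replaces proper permission checks.

```python
import uuid

PREFIX = "/Profiles/"


def profile_url(profile_id):
    # A random (version 4) UUID gives no hint about neighbouring records.
    return f"{PREFIX}{profile_id}"


def parse_profile_id(path):
    """Return the UUID in the path, or None if the path is not a valid profile URL."""
    if not path.startswith(PREFIX):
        return None
    try:
        return uuid.UUID(path[len(PREFIX):])
    except ValueError:
        # Guesses such as /Profiles/112 are simply not well-formed IDs.
        return None
```

A new record would get its ID from uuid.uuid4(), and a guessed path like /Profiles/112 is rejected before any lookup happens.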
EDIT To answer the point that you raised in your comment about user-friendly URLs: in those scenarios, sequential numbers are the wrong answer. A better solution is a unique string in the URL which is linked to your numeric ID. For example, the Cars movie has this URL on IMDB:
https://www.imdb.com/title/tt0317219/
Now, compare that to the URL of the same movie on Wikipedia, Rotten Tomatoes, Plugged In, or Facebook:
https://en.wikipedia.org/wiki/Cars_(film)
https://www.rottentomatoes.com/m/cars/
https://www.pluggedin.ca/movie-reviews/cars/
https://www.facebook.com/PixarCars
We must agree that those URLs are much friendlier than the one from IMDB.
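One way to link a unique string to a numeric ID is a slug lookup, sketched below in Python. The slug rules (lowercase, hyphens, numeric suffix on collision) and the SlugIndex class are assumptions for illustration; a real system would store the slug in a unique database column.

```python
import re


def slugify(title):
    # Lowercase, then collapse every run of other characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


class SlugIndex:
    """Maps URL-friendly slugs to numeric record IDs (in-memory stand-in)."""

    def __init__(self):
        self._ids = {}

    def register(self, title, record_id):
        base = slugify(title)
        slug = base
        suffix = 2
        # Keep slugs unique by appending -2, -3, ... on collision.
        while slug in self._ids:
            slug = f"{base}-{suffix}"
            suffix += 1
        self._ids[slug] = record_id
        return slug

    def resolve(self, slug):
        return self._ids.get(slug)
```

The numeric primary key stays internal, and only the slug appears in the URL.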
I've worked on small, medium, and large scale implementations (100k+ users) with SQL and Oracle. The majority of the time a PK type of INT is used. The GUID was more popular 10-15 years ago, but even at its height it was not as popular as the INT. Unless you see a need for it, I would recommend INT.
My experience has been that the only time a GUID is needed is if your data is on the move or merged with other databases. For example, say you have three sites running the same application and you merge those three systems for reporting purposes.
If your data is stationary or running a single instance, int should be sufficient.
According to the article you mention:
GUIDs are unique across every table, every database, every server
Well... this is a great promise, but it fails to deliver. GUIDs are supposed to be unique snowflakes. However, reality is much more complicated than that, and there are numerous reasons why they end up not being unique.
One of the main reasons is related not to the UUID/GUID specification, but to poor implementations of it. For example, some JavaScript implementations rank as the worst ones, using pseudo-random numbers that are quite predictable. Other implementations are much more decent.
So, bottom line, study the specific implementation of UUID/GUID you are and will be using. Don't just read and trust the specification. Otherwise you may be in for a surprise when you get called at 3 am on a Saturday night by angry customers.

Resource modelling in a REST API (problems with timeseries data & multiple identifiers) [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I have some trouble modeling the resources in a domain to fit with a REST API. The example is obviously contrived and simplified, but it illustrates 2 points where I'm stuck.
I know that:
a user has pets
a pet has multiple names - one by each member of the family
a pet has: a date of birth, a date of death and a type (dog,cat...)
I need to be able to query based on dates (actually the date, or range of dates is mandatory when asking about pets). E.g.: tell me what pets I have now; tell me what pets grandma says we had 5 years ago until 3 years ago.
How should I handle dates?
a. in the query string: /pets/dogs/d123?from=10102010&to=10102015 (but as I understand, query string is mostly for optional parameters; and date/range of dates is needed. I was thinking of having the current date as default, if there's nothing in the query string. Any thoughts on this?)
b. somewhere in the path. Before /pets? This seems a bit weird when I change between a date and a range of dates. And my real path is already kind of long
How should I handle multiple names?
The way I see it, I must specify who uses the name I'm searching for.
/pets/dogs/rex -> I want to know about the dog called rex (by whom, me or grandma?). But where to put grandma?
I've seen some people say not to worry about the URL and to use hypermedia. But the way I understood that (and it's possible I just got it wrong) is that you always have to start from the root (here /pets) and follow the links provided in the response. And then I'm even more stuck (since the date makes for a really, really long list of possibilities).
Any help is appreciated. Thanks
What might be useful in such scenarios is a kind of resource query language. I don't know the technology stack that you use, but a JavaScript example can be found here.
Absolutely do not put any dates in the path. This is considered bad style, and users may be confused, since most of them will not be used to such a strange design and simply will not know how to use the API. Passing dates via the query string is perfectly fine. You can introduce a default state, which is not a bad idea, but you need to describe the state (e.g. include the dates) in the response. You can also return a 400 Bad Request status code when the date range is missing from the request. Personally, I'd go for a default state and dates via the query string.
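The default-state approach might look like the Python sketch below. It assumes the from/to values use the DDMMYYYY form suggested by the example in the question (10102010); that reading, the function names and the response shape are all assumptions. Missing dates default to today, and the resolved range is echoed back so the response describes its own state.

```python
from datetime import date, datetime
from urllib.parse import parse_qs, urlsplit

DATE_FORMAT = "%d%m%Y"  # assumed reading of values such as 10102010


def _read_date(query, name, default):
    values = query.get(name)
    if not values:
        return default
    # Raises ValueError when the value does not match DATE_FORMAT.
    return datetime.strptime(values[0], DATE_FORMAT).date()


def resolve_date_range(url, today):
    """Return (status_code, body) for the date range requested in the URL."""
    query = parse_qs(urlsplit(url).query)
    try:
        start = _read_date(query, "from", today)
        end = _read_date(query, "to", today)
    except ValueError:
        return 400, {"error": "dates must look like 10102010"}
    if start > end:
        return 400, {"error": "'from' must not be after 'to'"}
    # Echo the effective range so defaults are visible to the caller.
    return 200, {"from": start.isoformat(), "to": end.isoformat()}
```

Passing today in as a parameter keeps the function easy to test; a real handler would supply date.today().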
In such a situation, the only thing that comes to my mind is to reverse the relation, so it would be:
/users/grandma/dogs/rex
or:
/dogs/rex/owners/grandma
What can also be done is to abandon REST rules and introduce a new endpoint /dogs/filter, which accepts a POST request with the filter in the body. This way it will be much easier to describe the whole query as well as to send it. As I mentioned, this is not a RESTful approach, but it seems reasonable in this situation. Such filtering can also be modeled with a pure REST design, in which the filter becomes a resource as well.
Hypermedia does not seem to be the way to go in this particular scenario, and to be honest I don't like hypermedia design very much.
You can use the query string if you want; there is no restriction about that. The path contains the hierarchical part, while the query contains the non-hierarchical part, but that is not mandatory either.
For the queries, I suggest you think about the parameters and about what will be in the response. For example:
I want to know about the dog called rex (by whom, me or grandma?)
The params are: rex and grandma and you need dogs in the response.
So the hyperlink will be something like GET /pets/dogs/?owner=grandma&name=rex or GET /pets/dogs/owner:grandma/name:rex/, etc. The URI structure does not really matter if you attach some RDF metadata to the hyperlink and the params, e.g. you can use the https://schema.org/AnimalShelter vocab. Of course, this is not the best fit, because it does not deal with multiple names given by multiple persons, but it is a good start for creating your own vocab if you decide to use RDF.

Method names for getting data [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Warning: This is a not very serious question/discussion that I am posting... but I am willing to bet that most developers have pondered this "issue"...
Always wanted to get other opinions regarding naming conventions for methods that went and got data from somewhere and returned it...
Most method names are somewhat simple and obvious... SaveEmployee(), DeleteOrder(), UploadDocument(). Of course, with classes, you would most likely use the short form...Save(), Delete(), Upload() respectively.
However, I have always struggled with the initial action...how to get the data. It seems that for every project I end up jumping between different naming conventions because I am never quite happy with the last one I used. As far as I can tell these are the possibilities -->
GetBooks()
FetchBooks()
RetrieveBooks()
FindBooks()
LoadBooks()
What is your thought?
It is all about consistent semantics;
In your question title you use getting data. This is extremely general, in the sense that you need to define what getting means in a semantically significant and unambiguous way. I offer the following examples to hopefully put you on the right track when thinking about naming things.
getBooks() is when you are getting all the books associated with an object; it implies the criteria for the set are already defined and where they are coming from is a hidden detail.
findBooks(criteria) is when you are trying to find a subset of the books based on parameters to the method call; this will usually be overloaded with different search criteria.
loadBooks(source) is when you are loading from an external source, like a file or db.
I would not use fetch/retrieve because they are too vague and get conflated with get, and there is no unambiguous semantic associated with the terms.
Example: fetch implies that some entity needs to go and get something that is remote and bring it back. Dogs fetch a stick, and retrieve is a synonym for fetch with the added semantic that you may have had possession of the thing prior as well. get is a synonym for obtain as well which implies that you have sole possession of something and no one else can acquire it simultaneously.
Semantics are extremely important:
the branch of linguistics and logic concerned with meaning
The comments are proof that generic terms like get and fetch have no specific semantic and are interpreted differently by different people. Pick a semantic for a term, document what it is intended to imply if the semantic is not clear, and be consistent with its use.
Words with vague or ambiguous meanings are given different semantics by different people because of their prejudices and preconceptions based on their personal opinions, and that will never end well.
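The get/find/load distinction described above could be sketched like this in Python (snake_case replaces the camelCase names from the answer). The Book and Library classes and the 'title;author' line format are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    author: str


class Library:
    def __init__(self):
        self._books = []

    def load_books(self, source):
        """load: bring books in from an external source (lines of 'title;author')."""
        for line in source:
            line = line.strip()
            if line:
                title, author = line.split(";", 1)
                self._books.append(Book(title.strip(), author.strip()))

    def get_books(self):
        """get: all books of this object; where they came from stays hidden."""
        return list(self._books)

    def find_books(self, author=None, title_contains=None):
        """find: a subset selected by the criteria passed to the call."""
        result = self._books
        if author is not None:
            result = [b for b in result if b.author == author]
        if title_contains is not None:
            needle = title_contains.lower()
            result = [b for b in result if needle in b.title.lower()]
        return list(result)
```

Because source is any iterable of lines, an open file object works as well as a plain list.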
Honestly, you should just decide with your team which naming convention to use. But for fun, let's see what your train of thought would be to decide on any of these:
GetBooks()
This method belongs to a data source, and we don't care how it is obtaining them, we just want to Get them from the data source.
FetchBooks()
You treat your data source like a bloodhound, and it is his job to fetch your books. I guess you should decide on your own how many he can fit in his mouth at once.
FindBooks()
Your data source is a librarian and will use the Dewey Decimal system to find your books.
LoadBooks()
These books belong in some sort of "electronic book bag" and must be loaded into it. Be sure to call ZipClosed() after loading to prevent losing them.
RetrieveBooks()
I have nothing.
The answer is to just stick to what you are comfortable with and be consistent.
If you have a Barnes and Noble website and you use GetBooks(), then if you have another item like a Movie entity, use GetMovies(). So go with whatever you and your team like and be consistent.
It is not clear what you mean by "getting the data". From the database? A file? Memory?
My view about method naming is that its role is to eliminate any ambiguities and ideally a need to look up documentation. I believe that this should be done even at the cost of longer method names. According to studies, most intermediate+ developers are able to read multiple words in camel case. With IDE and auto completions, writing long method names is also not a problem.
Thus, when I see "fetchBooks", unless the context is very clear (e.g., a class named BookFetcherFromDatabase), it is ambiguous. Fetch it from where? What is the difference between fetch and find? You're also risking the problem that some developers will associate semantics with certain keywords. For example, fetch for database (or memory) vs. load (from file) or download (from web).
I would rather see something like "fetchBooksFromDatabase", "loadBookFromFile", "findBooksInCollection", etc. It is less sightly, but once you get over the length, it is clear. Everyone reading this would right away get what it is that you are trying to do.
In OO (C++/Java) I tend to use getSomething and setSomething because very often if not always I am either getting a private attribute from the class representing that data object or setting it - the getter/setter pair. As a plus, Eclipse generates them for you.
I tend to use Load only when I mean files - as in "load into memory" and that usually implies loading into primitives, structs (C) or objects. I use send/receive for web.
As said above, consistency is everything, and that includes consistency across developers.

Are user-defined SQL datatypes used much? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
My DBA told me to use a user-defined SQL datatype to represent addresses, and then use a single column of that new type in our users table instead of multiple address columns. I've never done this before and am wondering if this is a common approach.
Also, what's the best place to get information about this - is it product-specific?
As far as I can tell, at least in the SQL Server world, UDTs aren't used very much.
The trouble with UDTs is that you can't easily update them. Once created and used in databases, they're almost set in stone.
There's no "CREATE OR ALTER (UDT)" command :-( So to change something, you have to do a lot of shuffling around - possibly copying away existing data, then dropping lots of columns from other tables, then dropping your UDT, re-creating it with the new structure and reapplying the data and everything.
That's just too much hassle - and you know : there will be change!
Right now, in SQL Server land, UDTs are just a nice idea, but really badly implemented. I wouldn't recommend using them extensively.
Marc
There are a number of other questions on SO about how to represent addresses in a database. AFAICR, none of them suggest a user-defined type for the purpose. I would not regard it as a common approach; that is not to say it is not a reasonable approach. The main difficulties lie in deciding what methods to provide to manipulate the address data - those used for formatting the data to appear on an envelope, or in specific places on a printed form, or to update fields, worrying about the many ramifications of international addresses, and so on.
Defining user-defined types is very product specific. The ways you do it in Informix are different from the ways it is done in DB2 and Oracle, for example.
I would also rather avoid using user-defined datatypes, as their definition and usability will make your code dependent on a particular database.
Instead, if you are using any object-oriented language, create a composition relationship to define addresses for an employee (for example) and store the addresses in a separate table.
Eg. Employees table and Employee_Addresses table. One employee can have multiple addresses.
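A minimal sketch of that two-table layout, using Python's sqlite3 module with an in-memory database as a stand-in for whichever product is actually in use. The column names (street, city, postal_code) and helper functions are assumptions, not a recommended address model.

```python
import sqlite3

SCHEMA = """
CREATE TABLE employees (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE employee_addresses (
    id          INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employees(id),
    street      TEXT NOT NULL,
    city        TEXT NOT NULL,
    postal_code TEXT NOT NULL
);
"""


def create_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def addresses_for(conn, employee_id):
    # One employee can have any number of address rows.
    query = (
        "SELECT street, city, postal_code FROM employee_addresses "
        "WHERE employee_id = ? ORDER BY id"
    )
    return conn.execute(query, (employee_id,)).fetchall()
```

Plain columns like these can be altered later with ordinary DDL, which sidesteps the UDT migration problem mentioned above.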
user-defined SQL datatype to represent addresses
User-defined types can be quite useful, but a mailing address doesn't jump out as one of those cases (to me, at least). What is a mailing address to you? Is it something you print on an envelope to mail someone? If so, text is about as good as it's going to get. If you need to know what state someone is in for legal reasons, store that separately and it's not a problem.
Other posts here have criticized UDTs, but I think they do have some amazing uses. PostgreSQL has had full text search as a plugin based on UDTs for a long time before full-text search was actually integrated into the core product. Right now PostGIS is a very successful GIS product that is entirely a plugin based on UDTs (it has GPL license, so will never be integrated into core).