PostgreSQL: How safe is it to rely on default constraint names? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
PostgreSQL provides the ability to magically generate constraint names in statements like CREATE TABLE, ALTER TABLE if none are provided explicitly. The naming convention is well known and I personally like it very much. But how stable and official is it? Is it something which one can rely on for different major releases or even the next 50 years?
I always had the impression that this is an implementation detail and that, while a lot of people rely on it, one shouldn't, and should instead always use explicit names to document things properly. I think I've read something like that in the official documentation in the past, but couldn't find it anymore...
So is there a definitive, official statement on how reliable this naming scheme is, or on whether users should always provide explicit names?

Strictly, if it's not in the documentation, you should not rely on it.
The docs only say:
If you don't specify a constraint name in this way, the system chooses a name for you.
so strictly I should recommend not baking the constraint names into the application unless you specify them explicitly in the SQL. This will also make the connection more apparent when reading the SQL - you bothered to specify constraint names for a reason.
That said, constraint name generation has not AFAIK changed, at least since I started using Pg around 7.4. So while it's not part of the official documented API, it's probably also not especially bad to rely on it. Also, constraint names are always going to be preserved by pg_dump and pg_upgrade, so it likely doesn't matter much unless you are doing a clean reload into a new version that has changed default constraint name generation.
TL;DR: It doesn't look like they're officially defined and documented, but they're unlikely to change, and if they do the impact is minimal. So relying on them is probably OK. Just document that in the app.
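For comparison, here is a minimal sketch of the "specify them explicitly" option. It uses Python's built-in sqlite3 module only so that it runs without a server; the table and constraint names (orders, orders_pkey, orders_quantity_positive) are made up for illustration. The CONSTRAINT name clause is standard SQL and PostgreSQL accepts the same form in CREATE TABLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Every constraint gets an explicit, hypothetical name instead of a generated one.
conn.execute(
    """
    CREATE TABLE orders (
        id       INTEGER CONSTRAINT orders_pkey PRIMARY KEY,
        quantity INTEGER NOT NULL,
        CONSTRAINT orders_quantity_positive CHECK (quantity > 0)
    )
    """
)

# SQLite stores the original DDL, so the explicit names can be read back.
ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'orders'"
).fetchone()[0]
print(ddl)

# Violating the named CHECK constraint raises an IntegrityError.
try:
    conn.execute("INSERT INTO orders (id, quantity) VALUES (1, 0)")
    violation = None
except sqlite3.IntegrityError as exc:
    violation = str(exc)
print(violation)
```

With names chosen in the DDL, application code that refers to a constraint depends on something written in your own schema rather than on an undocumented naming scheme.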

Related

Should a REST API select on a ID or a name field? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
I'm designing a REST API and trying to decide which is the more correct way of returning a single resource:
/resource/{id}
or
/resource/{name}
The ID would be immutable, so I'm thinking that it would be better to select by it, but name would be more friendly looking. What is the best practice? I've seen both used before "in the wild".

Basically REST is built on top of unique IDs, thus:
GET /resources/{id}/
should be used. However, nothing prevents you from making the name field unique (so that it behaves like a plain old ID) and building REST on top of that unique ID.
If this is not what you need and name cannot be made unique, then another option is to implement filtering via name:
GET /resources?name=<SOME_NAME>
It also should be resources (plural) since it indicates that there's a collection under the hood.
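A rough sketch of the two URL shapes, using only the Python standard library. The in-memory RESOURCES list and the handle_get function are hypothetical stand-ins for a real framework's routing and storage.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical data; note that the name "alpha" is not unique.
RESOURCES = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 3, "name": "alpha"},
]


def handle_get(url):
    """Resolve GET /resources/{id}/ or GET /resources?name=<SOME_NAME>."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if segments[:1] != ["resources"]:
        return None
    if len(segments) == 2:
        # Single resource addressed by its unique id.
        if not segments[1].isdigit():
            return None
        wanted = int(segments[1])
        return next((r for r in RESOURCES if r["id"] == wanted), None)
    if len(segments) != 1:
        return None
    # Collection, optionally filtered by name; several rows may match.
    names = parse_qs(parts.query).get("name")
    if names:
        return [r for r in RESOURCES if r["name"] == names[0]]
    return list(RESOURCES)


one = handle_get("/resources/2/")
filtered = handle_get("/resources?name=alpha")
```

The id route returns at most one resource, while the name filter returns a list because nothing guarantees the name is unique.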

Whether using name instead is practical comes down to your business case.
Will 'name' always be unique? Or will the application deal with there being more than one occurrence?
Are 'pretty' URLs important? In most applications I've worked on, querying uses unique IDs which are never exposed to the end-user, as they have no business meaning whatsoever. They are in effect surrogate primary keys.

/resource/{id} is more technically correct, but if it were me, I'd allow both. Assuming names can't contain ONLY numbers, and ids can ONLY be numbers, you could easily detect which was supplied and allow for either to be used. ;)
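Something like this would do the detection, assuming (as stated) that ids are purely numeric and names never are. The resolve function and the two lookup dicts are invented for the example.

```python
def resolve(identifier, by_id, by_name):
    """Treat an all-digit path segment as an id, anything else as a name."""
    if identifier.isascii() and identifier.isdigit():
        return by_id.get(int(identifier))
    return by_name.get(identifier)


# Hypothetical in-memory data standing in for a real store.
widget = {"id": 42, "name": "widget"}
by_id = {42: widget}
by_name = {"widget": widget}

found_by_id = resolve("42", by_id, by_name)
found_by_name = resolve("widget", by_id, by_name)
```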

This is a good question. It depends on the business case: for example, if the API is used through a CLI like docker, then you might want user-friendly identifiers such as a name.
But as soon as the name becomes part of a URL it has limitations, such as ASCII characters only (to avoid URL encoding or loss of readability) and some defined maximum length, like 128 chars.

Method names for getting data [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Warning: This is a not very serious question/discussion that I am posting... but I am willing to bet that most developers have pondered this "issue"...
Always wanted to get other opinions regarding naming conventions for methods that went and got data from somewhere and returned it...
Most method names are somewhat simple and obvious... SaveEmployee(), DeleteOrder(), UploadDocument(). Of course, with classes, you would most likely use the short form...Save(), Delete(), Upload() respectively.
However, I have always struggled with initial action...how to get the data. It seems that for every project I end up jumping between different naming conventions because I am never quite happy with the last one I used. As far as I can tell these are the possibilities -->
GetBooks()
FetchBooks()
RetrieveBooks()
FindBooks()
LoadBooks()
What is your thought?

It is all about consistent semantics.
In your question title you use getting data. This is extremely general, in the sense that you need to define what getting means in a semantically significant, unambiguous way. I offer the following examples to hopefully put you on the right track when thinking about naming things.
getBooks() is when you are getting all the books associated with an object; it implies the criteria for the set is already defined and where they are coming from is a hidden detail.
findBooks(criteria) is when you are trying to find a sub-set of the books based on parameters to the method call; this will usually be overloaded with different search criteria.
loadBooks(source) is when you are loading from an external source, like a file or db.
I would not use fetch/retrieve because they are too vague and get conflated with get, and there is no unambiguous semantic associated with the terms.
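To make the three meanings concrete, here is a small sketch in Python (so the names become get_books, find_books and load_books). The Library class, the keyword filters and the JSON input are all invented for the example; Python has no overloading, so optional keyword arguments stand in for the overloaded search criteria.

```python
import json


class Library:
    def __init__(self):
        self._books = []

    def get_books(self):
        """Return every book this object holds; no criteria, source hidden."""
        return list(self._books)

    def find_books(self, author=None, title_contains=None):
        """Return the sub-set of books matching the given criteria."""
        result = self._books
        if author is not None:
            result = [b for b in result if b["author"] == author]
        if title_contains is not None:
            needle = title_contains.lower()
            result = [b for b in result if needle in b["title"].lower()]
        return list(result)

    def load_books(self, source):
        """Load books from an external source (here: JSON text)."""
        self._books = json.loads(source)


library = Library()
library.load_books(
    '[{"title": "Dune", "author": "Herbert"},'
    ' {"title": "Emma", "author": "Austen"}]'
)
all_books = library.get_books()
austen = library.find_books(author="Austen")
```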
Example: fetch implies that some entity needs to go and get something that is remote and bring it back. Dogs fetch a stick, and retrieve is a synonym for fetch with the added semantic that you may have had possession of the thing prior as well. get is a synonym for obtain as well which implies that you have sole possession of something and no one else can acquire it simultaneously.
Semantics are extremely important:
the branch of linguistics and logic concerned with meaning
The comments are proof that generic terms like get and fetch have no specific semantic and are interpreted differently by different people. Pick a semantic for a term, document what it is intended to imply if the semantic is not clear, and be consistent in its use.
Words with vague or ambiguous meanings are given different semantics by different people because of their prejudices and preconceptions based on their personal opinions, and that will never end well.

Honestly, you should just decide with your team which naming convention to use. But for fun, let's see what your train of thought would be to decide on any of these:
GetBooks()
This method belongs to a data source, and we don't care how it is obtaining them, we just want to Get them from the data source.
FetchBooks()
You treat your data source like a bloodhound, and it is his job to fetch your books. I guess you should decide on your own how many he can fit in his mouth at once.
FindBooks()
Your data source is a librarian and will use the Dewey Decimal system to find your books.
LoadBooks()
These books belong in some sort of "electronic book bag" and must be loaded into it. Be sure to call ZipClosed() after loading to prevent losing them.
RetrieveBooks()
I have nothing.

The answer is to just stick to what you are comfortable with and be consistent.
If you have a Barnes & Noble website and you use GetBooks(), then if you have another item like a Movie entity, use GetMovies(). So pick whatever you and your team like, and be consistent.

It is not clear what you mean by "getting the data". From the database? A file? Memory?
My view about method naming is that its role is to eliminate any ambiguities and ideally a need to look up documentation. I believe that this should be done even at the cost of longer method names. According to studies, most intermediate+ developers are able to read multiple words in camel case. With IDE and auto completions, writing long method names is also not a problem.
Thus, when I see "fetchBooks", unless the context is very clear (e.g., a class named BookFetcherFromDatabase), it is ambiguous. Fetch it from where? What is the difference between fetch and find? You're also risking the problem that some developers will associate semantics with certain keywords. For example, fetch for database (or memory) vs. load (from file) or download (from web).
I would rather see something like "fetchBooksFromDatabase", "loadBookFromFile", "findBooksInCollection", etc. It is less sightly, but once you get over the length, it is clear. Everyone reading this would right away get what it is that you are trying to do.

In OO (C++/Java) I tend to use getSomething and setSomething because very often if not always I am either getting a private attribute from the class representing that data object or setting it - the getter/setter pair. As a plus, Eclipse generates them for you.
I tend to use Load only when I mean files - as in "load into memory" and that usually implies loading into primitives, structs (C) or objects. I use send/receive for web.
As said above, consistency is everything, and that includes consistency across developers.

Are user-defined SQL datatypes used much? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
My DBA told me to use a user-defined SQL datatype to represent addresses, and then use a single column of that new type in our users table instead of multiple address columns. I've never done this before and am wondering if this is a common approach.
Also, what's the best place to get information about this - is it product-specific?

As far as I can tell, at least in the SQL Server world, UDTs aren't used very much.
The trouble with UDTs is that you can't easily update them. Once created and used in databases, they're almost set in stone.
There's no "CREATE OR ALTER (UDT)" command :-( So to change something, you have to do a lot of shuffling around - possibly copying away existing data, then dropping lots of columns from other tables, then dropping your UDT, re-creating it with the new structure and reapplying the data and everything.
That's just too much hassle - and you know: there will be change!
Right now, in SQL Server land, UDTs are just a nice idea - but really badly implemented. I wouldn't recommend using them extensively.
Marc

There are a number of other questions on SO about how to represent addresses in a database. AFAICR, none of them suggest a user-defined type for the purpose. I would not regard it as a common approach; that is not to say it is not a reasonable approach. The main difficulties lie in deciding what methods to provide to manipulate the address data - those used for formatting the data to appear on an envelope, or in specific places on a printed form, or to update fields, worrying about the many ramifications of international addresses, and so on.
Defining user-defined types is very product specific. The ways you do it in Informix are different from the ways it is done in DB2 and Oracle, for example.

I would also rather avoid user-defined datatypes, as their definition and use will make your code dependent on a particular database.
Instead, if you are using an object-oriented language, create a composition relationship to define addresses for an employee (for example) and store the addresses in a separate table.
E.g. an Employees table and an Employee_Addresses table. One employee can have multiple addresses.
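A minimal sketch of that two-table layout, run through Python's sqlite3 module so it is self-contained. The column names (kind, street, city, postal_code) and the sample rows are assumptions, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES
conn.executescript(
    """
    CREATE TABLE employees (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- One row per address; an employee can have several.
    CREATE TABLE employee_addresses (
        id          INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL REFERENCES employees(id),
        kind        TEXT NOT NULL,
        street      TEXT NOT NULL,
        city        TEXT NOT NULL,
        postal_code TEXT NOT NULL
    );
    """
)

conn.execute("INSERT INTO employees (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO employee_addresses (employee_id, kind, street, city, postal_code)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        (1, "home", "1 Mill Lane", "Leeds", "LS1 1AA"),
        (1, "work", "2 Station Road", "York", "YO1 1AA"),
    ],
)

addresses = conn.execute(
    "SELECT kind, city FROM employee_addresses WHERE employee_id = ? ORDER BY kind",
    (1,),
).fetchall()
print(addresses)
```

Plain columns like these work in any SQL database, which is the portability point being made.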

user-defined SQL datatype to represent addresses
User-defined types can be quite useful, but a mailing address doesn't jump out as one of those cases (to me, at least). What is a mailing address to you? Is it something you print on an envelope to mail someone? If so, text is about as good as it's going to get. If you need to know what state someone is in for legal reasons, store that separately and it's not a problem.
Other posts here have criticized UDTs, but I think they do have some amazing uses. PostgreSQL has had full text search as a plugin based on UDTs for a long time before full-text search was actually integrated into the core product. Right now PostGIS is a very successful GIS product that is entirely a plugin based on UDTs (it has GPL license, so will never be integrated into core).

Are foreign key constraints needed? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
In a perfect world, are foreign key constraints ever really needed?

Foreign keys enforce consistency in an RDBMS. That is, no child row can ever reference a non-existent parent.
There's a school of thought that consistency rules should be enforced by application code, but this is both inefficient and error-prone. Even if your code is perfect and bug-free and never introduces a broken reference, how can you be certain that everyone else's code that accesses the same database is also perfect?
When constraints are enforced within the RDBMS, you can rely on consistency. In other words, the database never allows a change to be committed that breaks references.
When constraints are enforced by application code, you can never be quite sure that no errors have been introduced in the database. You find yourself running frequent SQL scripts to catch broken references and correct them. The extra code you have to write to do this far exceeds any performance cost of the RDBMS managing consistency.
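As a quick illustration of the database refusing a broken reference, here is a sketch using Python's sqlite3 module. The parent/child tables are hypothetical; note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on, whereas most other RDBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript(
    """
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent(id)
    );
    INSERT INTO parent (id) VALUES (1);
    """
)

# A child pointing at an existing parent is accepted.
conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 1)")

# A child pointing at a non-existent parent is rejected by the database itself.
try:
    conn.execute("INSERT INTO child (id, parent_id) VALUES (2, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

child_count = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
print(rejected, child_count)
```

No application code had to check anything; the orphan row simply never gets in.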

In addition to protecting the integrity of your data, FK constraints also help document the relationships between your tables within the database itself.

The world is not perfect; that's why they are needed.

A world cannot be perfect without foreign keys.

Yes, if you want to ensure referential integrity.

In addition to consistency enforcement and documentation, they can actually speed up queries. The query optimizer can see a foreign constraint, understand its effect, and make a plan optimization that would be impossible w/o the constraint in place. See Foreign Key Constraints (Without NOCHECK) Boost Performance and Data Integrity. (SQL Server specific)

In addition to the documentation effect Dave mentioned, FK constraints can help you write less code and automate some bits.
For example, if you delete a customer record, all of that customer's invoices and invoice lines are also deleted automatically if you have "ON DELETE CASCADE" on their FK constraints.
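The customer/invoice case can be sketched like this with Python's sqlite3 module. The table layout and sample rows are assumptions made for the example, and SQLite again needs PRAGMA foreign_keys = ON for the cascade to fire.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoices (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id) ON DELETE CASCADE
    );
    CREATE TABLE invoice_lines (
        id         INTEGER PRIMARY KEY,
        invoice_id INTEGER NOT NULL REFERENCES invoices(id) ON DELETE CASCADE,
        amount     INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme');
    INSERT INTO invoices VALUES (10, 1), (11, 1);
    INSERT INTO invoice_lines VALUES (100, 10, 5), (101, 10, 7), (102, 11, 9);
    """
)


def count(table):
    # Table names come from this script only, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


before = (count("invoices"), count("invoice_lines"))
conn.execute("DELETE FROM customers WHERE id = 1")  # one statement, no cleanup code
after = (count("invoices"), count("invoice_lines"))
print(before, after)
```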

What mysql database tables and relationships would support a Q&A survey with conditional questions? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I'm working on a fairly simple survey system right now. The database schema is going to be simple: a Survey table, in a one-to-many relation with Question table, which is in a one-to-many relation with the Answer table and with the PossibleAnswers table.
Recently the customer realised she wants the ability to show certain questions only to people who gave one particular answer to some previous question (e.g. Do you buy cigarettes? would be followed by What's your favourite cigarette brand?; there's no point in asking the second question to a non-smoker).
Now I have started to wonder what would be the best way to implement these conditional questions in terms of my database schema. For example, question A has 2 possible answers, A and B, and question B should only appear to a user if the answer was A.
Edit: What I'm looking for is a way to store that information about requirements in a database. The handling of the data will probably be done on the application side, as my SQL skills suck ;)

Survey Database Design
Last Update: 5/3/2015
Diagram and SQL files now available at https://github.com/durrantm/survey
If you use this (top) answer or any element, please add feedback on improvements !!!
This is a real classic, done by thousands. They always seem 'fairly simple' to start with, but to be good it's actually pretty complex. To do this in Rails I would use the model shown in the attached diagram. I'm sure it seems way over-complicated to some, but once you've built a few of these over the years, you realize that most of the design decisions are very classic patterns, best addressed by a dynamic, flexible data structure at the outset.
More details below:
Table details for key tables
answers
The answers table is critical as it captures the actual responses by users.
You'll notice that answers links to question_options, not questions. This is intentional.
input_types
input_types are the types of questions. Each question can only be of 1 type, e.g. all radio dials, all text field(s), etc. Use additional questions for when there are (say) 5 radio-dials and 1 check box for an "include?" option or some such combination. Label the two questions in the users view as one but internally have two questions, one for the radio-dials, one for the check box. The checkbox will have a group of 1 in this case.
option_groups
option_groups and option_choices let you build 'common' groups.
One example, in a real estate application there might be the question 'How old is the property?'.
The answers might be desired in the ranges:
1-5
6-10
10-25
25-100
100+
Then, for example, if there is a question about the adjoining property age, then the survey will want to 'reuse' the above ranges, so that same option_group and options get used.
units_of_measure
units_of_measure is as it sounds. Whether it's inches, cups, pixels, bricks or whatever, you can define it once here.
FYI: Although generic in nature, one can create an application on top of this, and this schema is well-suited to the Ruby On Rails framework with conventions such as "id" for the primary key for each table. Also the relationships are all simple one_to_many's with no many_to_many or has_many throughs needed. I would probably add has_many :throughs and/or :delegates though to get things like survey_name from an individual answer easily without.multiple.chaining.

You could also think about complex rules, and have a string based condition field in your Questions table, accepting/parsing any of these:
A(1)=3
( (A(1)=3) and (A(2)=4) )
A(3)>2
(A(3)=1) and (A(17)!=2) and C(1)
Where A(x)=y means "Answer of question x is y" and C(x) means the condition of question x (default is true)...
The questions have an order field, and you would go through them one-by one, skipping questions where the condition is FALSE.
This should allow surveys of any complexity you want. Your GUI could automatically create these in "Simple mode" and allow for an "Advanced mode" where a user can enter the equations directly.
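A rough sketch of how such condition strings might be evaluated, in Python. The CONDITIONS mapping, the sample answers and the condition_holds function are invented for illustration. It leans on eval with the builtins removed as a shortcut; for conditions typed in by untrusted users a small hand-written parser would be the safer choice.

```python
import re

# Hypothetical data: question id -> condition string (absent means "always show").
CONDITIONS = {
    2: "A(1)=1",
    3: "( (A(1)=1) and (A(2)>2) )",
    4: "C(3) and (A(2)!=5)",
}

# Turn a lone "=" into Python's "==" while leaving "!=", ">=" and "<=" alone.
_EQUALS = re.compile(r"(?<![!<>=])=(?!=)")


def condition_holds(question_id, answers, conditions=CONDITIONS):
    """Return True if the question should be shown given the answers so far."""
    expr = conditions.get(question_id)
    if expr is None:
        return True  # the default condition is true

    def A(x):
        # Answer of question x, or None if it was skipped or not yet asked.
        return answers.get(x)

    def C(x):
        # Condition of another question, evaluated recursively.
        return condition_holds(x, answers, conditions)

    python_expr = _EQUALS.sub("==", expr.strip())
    try:
        return bool(eval(python_expr, {"__builtins__": {}}, {"A": A, "C": C}))
    except TypeError:
        # Comparing a missing answer (None) with a number: treat as not met.
        return False


def questions_to_show(order, answers):
    """Walk the questions in order and skip those whose condition is false."""
    return [q for q in order if condition_holds(q, answers)]


shown_all = questions_to_show([1, 2, 3, 4], {1: 1, 2: 3})
shown_first_only = questions_to_show([1, 2, 3, 4], {1: 2})
```

In a live survey the answers dict would fill up as the user goes along, with the same check run before each question.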

One way is to add a table 'question requirements' with fields:
question_id (link to the "which brand?" question)
required_question_id (link to the "do you smoke?" question)
required_answer_id (link to the "yes" answer)
In the application you check this table before you pose a certain question.
With a separate table, it's easy to add required answers (adding another row for the "sometimes" answer etc...)
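Roughly, that could look like the sketch below, using Python's sqlite3 module. The ids (question 2 requiring answer 10 or 12 to question 1) are made up, and treating several rows for the same question as alternatives (any one match is enough) is an assumption based on the "sometimes" example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE question_requirements (
        question_id          INTEGER NOT NULL,
        required_question_id INTEGER NOT NULL,
        required_answer_id   INTEGER NOT NULL
    );
    -- Question 2 ("which brand?") requires question 1 ("do you smoke?")
    -- to be answered with answer 10 ("yes") or answer 12 ("sometimes").
    INSERT INTO question_requirements VALUES (2, 1, 10), (2, 1, 12);
    """
)


def should_ask(question_id, given_answers):
    """given_answers maps question id -> chosen answer id."""
    rows = conn.execute(
        "SELECT required_question_id, required_answer_id"
        " FROM question_requirements WHERE question_id = ?",
        (question_id,),
    ).fetchall()
    if not rows:
        return True  # no requirements: always ask
    # Assumption: any one matching row is enough.
    return any(given_answers.get(q) == a for q, a in rows)


ask_smoker = should_ask(2, {1: 10})
ask_non_smoker = should_ask(2, {1: 11})
```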

Personally, in this case, I would use the structure you described and use the database as a dumb storage mechanism. I'm a fan of putting these complex and dependent constraints into the application layer.
I think the only way to enforce these constraints without building new tables for every question with foreign keys to others is to use T-SQL or other vendor-specific mechanisms to build database triggers that enforce them.
At the application level you have many more possibilities and it is easier to port, so I would prefer that option.
I hope this will help you in finding a strategy for your app.
