Is it good to use the id of an object from a third-party server as my PK in a SQL database? [closed] - sql

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I am writing an app that analyzes Instagram and Twitter posts (the posts are stored in separate tables), and I load comments and likes too. Is it good to use their ids as my primary keys, or is it better to create my own ids that are not related to the third-party ids?

Create your own ids in your database. In general you want these properties to be true about your primary keys:
Unique. This one the database management system will enforce for you.
Unrelated to the data they identify. This means that you shouldn't be able to calculate the primary key of any row based on the info in the row. For example, first name+last name would be a bad primary key for a People table, and credit card number would be a bad primary key for a BillingInfo table.
By using the id generated by a third party service as your PK, you are unnecessarily coupling your database with their service.
Instead, there is a common pattern of using an altId column to store an extra id. You could even name the column better by calling it twitterId or something similar.
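The altId pattern above can be sketched roughly as follows. This is only a minimal illustration using Python's built-in sqlite3 module; the table name twitter_posts, the body column, and the helper functions are made up for the example and are not from the question.

```python
import sqlite3

def build_posts_db():
    """Create an in-memory database with a surrogate PK and a separate external id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE twitter_posts (
            id        INTEGER PRIMARY KEY,   -- internal key, generated by our database
            twitterId TEXT NOT NULL UNIQUE,  -- id assigned by the third-party service
            body      TEXT NOT NULL
        )
    """)
    return conn

def save_post(conn, twitter_id, body):
    """Insert the post if its external id is new; return our internal id either way."""
    conn.execute(
        "INSERT OR IGNORE INTO twitter_posts (twitterId, body) VALUES (?, ?)",
        (twitter_id, body),
    )
    row = conn.execute(
        "SELECT id FROM twitter_posts WHERE twitterId = ?", (twitter_id,)
    ).fetchone()
    return row[0]
```

The UNIQUE constraint still prevents loading the same external post twice, while foreign keys from comments and likes can point at the internal id, so they would not depend on the third party's id format.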

Apart from uniqueness and minimality, three sensible criteria for deciding on keys are stability, simplicity and familiarity.
Above all, there is the business requirement: the need to represent the external reality with some acceptable degree of accuracy. If your database is intended to represent accurately things sourced from some given domain then you will need an identifier also sourced from that domain.

Related

SQL column naming best practice, should I use abbreviation? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I want to know which is, in your opinion, the best practice for naming SQL columns.
Example: Let's say that I have two columns named referenceTransactionId and source.
Another way to write these is refTxnId and src.
Which one is the better way and why? Is it because of memory usage or because of readability?
Although this is a matter of opinion, I am going to answer anyway. If you are going to design a new database, write out all the names completely. Why?
The name is unambiguous. You'll notice that sites such as Wikipedia spell out complete names, as do standards such as time zones ("America/New_York").
Using a standard like the complete name means that users don't have to "think" about what the column might be called.
Nowadays, people type much faster than they used to. For those that don't, type ahead and menus provide assistance.
Primary keys and foreign keys, to the extent possible, should have the same name. So, I suspect that referenceTransactionId should simply be transactionId if it is referencing the Transactions table.
This comes from the "friction" of using multiple databases and having to figure out what a column name is.

Is it practical to use the person's ID card as a primary key? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I'm working on a project and I usually use the person's ID as a primary key to identify the person. But right now I'm working on something much more formal and serious than what I've been working on (a school DB). Is it good practice to use the person's/student's ID card number as the PK instead of having another auto-generated/sequence field (ID)?
It is a bad idea, for one simple reason: security.
You are better off designing your databases to have internal ids for all entities. The person's id would then be an attribute on the records, rather than a primary key. This allows you, for instance, to encrypt the id (and other sensitive information). If someone gets a hold of a print-out of some data, you don't have to worry that they are seeing personal information.
In the United States, this design is helped by the fact that social security numbers -- the closest thing we have to a national id -- were specifically designed not to be national id numbers. Apart from the issue of fraud, the approximately one billion numbers will run out one day.
I look after a similar student database and we use student ids as PK.
It helps us because students are aware of their IDs and if they have any issues they can come to us and quote their ID for us to resolve the issue. It certainly makes it easier than trudging through a load of John Smiths.
The downside I have found is that we export data to programs such as Excel, and a lot of the IDs have leading zeros which, if you are not careful, will be removed.
It is entirely up to you, but in my opinion I would use them.
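The leading-zero problem mentioned above can be illustrated with a small sketch. It uses Python's sqlite3 module; the students table, the sample ID "00123", and the fixed width of 5 are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Declaring the ID as TEXT keeps it as a string, so the zeros are part of the key.
conn.execute("CREATE TABLE students (student_id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO students VALUES (?, ?)", ("00123", "John Smith"))

stored_id = conn.execute("SELECT student_id FROM students").fetchone()[0]

# A numeric conversion, similar to what a spreadsheet does on import, drops the zeros.
as_number = int(stored_id)

# If every ID is known to have the same width (assumed to be 5 here), it can be padded back.
restored = str(as_number).zfill(5)
```

Here stored_id is still the string "00123", while as_number is the integer 123, which no longer matches the key unless it is padded back.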

Primary key design in SQL Server [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Right now I am designing a database. As you all know, primary keys are an important factor in database design.
I have a dilemma in choosing between a GUID, an identity (auto-increment) column, and a custom key for the primary key. In which scenario should I use each?
I am trying to design a database for a school. The school has many branches in different locations and cities. The school has a central database, which they want to update every evening.
Please guide me. Thanks
In my opinion, the default choice is to use a 32-bit integer with auto-increment.
You might use GUID under specific scenarios that are covered in this QA: What are the best practices for using a GUID as a primary key, specifically regarding performance?
A custom key should only be used if you need to assign a specific key for some reason based on your design or based on business requirements.
Edit: Based on your update, if you are merging disconnected tables you might want to consider using GUID keys so you don't have problems with collisions.
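To show why GUID keys help when merging disconnected tables, here is a rough sketch. It uses Python's uuid and sqlite3 modules rather than SQL Server, and the students table and helper names are hypothetical.

```python
import sqlite3
import uuid

def make_db():
    """Create an in-memory database; used here for both the branches and the central copy."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    return conn

def add_student(conn, name):
    """Generate the key on the branch side, without asking the central database."""
    student_id = str(uuid.uuid4())
    conn.execute("INSERT INTO students (id, name) VALUES (?, ?)", (student_id, name))
    return student_id

def merge_into_central(central, branch):
    """Copy all branch rows into the central database, keeping their keys unchanged."""
    rows = branch.execute("SELECT id, name FROM students").fetchall()
    central.executemany("INSERT INTO students (id, name) VALUES (?, ?)", rows)
    return len(rows)
```

With auto-increment keys, each branch would typically start counting at 1, so the same merge would hit primary key collisions unless the keys were remapped or partitioned per branch.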
I suggest you choose identity (auto-increment). With a custom primary key you have to generate a random number yourself and then save it, which identity (auto-increment) does not require. With a GUID, either SQL Server or you have to generate the GUID and then store it. Identity (auto-increment) is also more general and easier to understand.

Sql Server - Data warehouse design - Unique identifier for a customer [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I have a data warehouse that was designed by someone else. That person states that having a unique identifier for each customer that doesn't change with time is impossible. Not knowing data architecture very well, I want to know if that is true.
If it is possible, how complex would it be, given that any other information about the customer might change at some point?
Thanks
Data warehouse information usually comes from some other system. That system should have been designed to have a surrogate key if people could not be uniquely identified. It is very rare for people to have a good unique identifier that is not a surrogate. Emails are inappropriate, as they change (and people may have multiple emails) and can be reused for other people. SSNs are not as unique as you might expect. Even things like medical license numbers for doctors end up not being unique due to data entry errors in source systems. Names are clearly not unique, not even when combined with other information such as address. I have never seen, in the hundreds of databases that I have had reason to query, any one which had a good unique identifier for a person that was not a surrogate key.
If the designers of the original system were incompetent (no database table should ever be missing a primary key), then the data warehouse indeed may not have a way to uniquely identify individuals, and the chances of there being duplicates in the data are right at 100%. There is no point in adding a surrogate key to a data warehouse if it didn't come from the originating system. How would you know if this John Smith was id 1234 or id 4567 when updating the information?

Best practices for database logic in or out of database. Save logic in database? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
What is the best practice as far as saving enumerations and other lookup data in or out of the actual database? For instance in a web store is it okay to save all of the products if you are still going to write code to put the data into and out of the tables that use this product information. What if you had user information like roles (manager, employee, etc). Would it make sense to have a lookup table for the roles or can your CRUD logic keep all of that and when a new user is added/updated the CRUD code can do that validation?
This may or may not belong as a community wiki; that is fine if it needs to be tagged as such. I really just want more information and to know what others are doing.
EDIT: Great answers. And the consensus seems to be yes, put the constraints in the database. So my next question is what the technical mechanism is to make that happen. If I have a "roles lookup" table and I go to add a new user, how do I say that the roles column for a new user must be one of the values in that lookup? I know how to do this in code, but what is the SQL mechanism to do this?
The database is for data, and validation rules that enforce the integrity of that data.
To answer your specific question, yes, I would store users/roles in the database. There is no case in which I would want to have to update code in order to add users to the system.
The database is the place to enforce any logic that must be enforced to ensure data integrity. Doing that only in the application is a recipe for disaster, because databases are not changed only by the application.
In part you need the lookup tables to ensure data integrity, so that values which are not part of the lookups cannot be added.
To answer your second question, look up foreign key constraints.
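As a rough illustration of that mechanism, the sketch below uses Python's sqlite3 module; the roles and users tables and the helper names are assumptions based on the question, not an existing schema. Note that SQLite only enforces foreign keys when PRAGMA foreign_keys is turned on for the connection; other database systems generally enforce them by default.

```python
import sqlite3

def build_users_db():
    """Create a roles lookup table and a users table that must reference it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
    conn.execute("CREATE TABLE roles (name TEXT PRIMARY KEY)")
    conn.execute("""
        CREATE TABLE users (
            id       INTEGER PRIMARY KEY,
            username TEXT NOT NULL,
            role     TEXT NOT NULL REFERENCES roles(name)  -- must exist in roles
        )
    """)
    conn.executemany("INSERT INTO roles (name) VALUES (?)", [("manager",), ("employee",)])
    return conn

def try_add_user(conn, username, role):
    """Return True if the insert succeeds, False if the database rejects it."""
    try:
        conn.execute("INSERT INTO users (username, role) VALUES (?, ?)", (username, role))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, try_add_user(conn, "alice", "manager") succeeds, while a role that is not in the lookup table, such as "intern", is rejected by the database itself. Adding a new role then means inserting a row into roles rather than changing code.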