Two different definitions of database schema - sql

a) I found two definitions of schema:
FIRST - A set of information that describes a table is known as a schema,
and schemas are used to describe specific tables within a database, as well
as entire databases (and the relationship between tables in them, if any).
SECOND - A database schema is a way to logically group objects such as tables, views, stored procedures etc. Think of a schema as a container of objects.
I assume the two descriptions describe entirely different concepts, which just happen to use the same name?
b)
A database schema is a way to logically group objects such as tables, views, stored procedures etc. Think of a schema as a container of objects.
If I understand the above definition correctly, then a database schema is similar to a namespace, the only difference being that we can assign access permissions to a database schema, while the same can't be done with namespaces?
Thanks

Yes, this can be confusing. In the context of relational databases in general, your schema is the collection of your database structures: your tables, views, keys, constraints, etc. Depending on whom you ask, this may or may not include triggers, user-defined functions, custom user types, stored procedures, and the like, but I lump them in as schema objects as well.
Within the context of specific relational database management systems (e.g., MSSQL, Postgres), a schema is a logical grouping of database objects. It serves two purposes: 1) as you note, it acts as a namespace and allows you to group related database objects together, and reduces name collision; 2) you can assign security settings to the schema as a whole, rather than assigning permissions to the schema's objects individually.
The terminology collision is sometimes confusing, but intentional. It usually makes sense to talk about subsets of your entire schema, and to assign permissions and deal with behaviors on these subsets - and the database supports this by allowing you to group these subsets into their own schemas.
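To make the namespace point concrete, here is a minimal sketch in Python using SQLite, chosen only because it runs without a server. SQLite has no CREATE SCHEMA or GRANT; its attached databases play the role of the schema qualifier, so this illustrates only the name-collision half of the answer. The names sales, hr and contacts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: each attached in-memory database stands in for a schema.
conn.execute("ATTACH DATABASE ':memory:' AS sales")
conn.execute("ATTACH DATABASE ':memory:' AS hr")

# Two tables with the same name can coexist because the qualifier differs.
conn.execute("CREATE TABLE sales.contacts (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE hr.contacts (id INTEGER PRIMARY KEY, name TEXT)")

conn.execute("INSERT INTO sales.contacts (name) VALUES ('customer')")
conn.execute("INSERT INTO hr.contacts (name) VALUES ('employee')")

# Qualified names pick out the right table in each namespace.
sales_names = [row[0] for row in conn.execute("SELECT name FROM sales.contacts")]
hr_names = [row[0] for row in conn.execute("SELECT name FROM hr.contacts")]
```

The second purpose, assigning permissions to the schema as a whole, needs a system with users and grants such as MSSQL or Postgres, so it is not shown here.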

Related

Does DBMS store data in a hierarchical or navigational form, and how is it different from RDBMS? [duplicate]

After reading some answers on different websites I am confused now. So, it would be helpful to mention the key difference between DBMS and RDBMS and any relation between them.
Since this question has become popular on Stack Overflow, I am posting an answer which answers this question for me. I found this answer on the Udemy website. I hope this will help future users and newbies searching for a good answer on this topic.
Key Difference between DBMS and RDBMS:
The key difference is that RDBMS (relational database management system) applications store data in a tabular form, while DBMS applications store data as files.
Does that mean there are no tables in a DBMS?
There can be, but there will be no “relation” between the tables, like in an RDBMS. In a DBMS, data is generally stored in either a hierarchical form or a navigational form. This means that a single data unit will have one parent node and zero, one or more child nodes. It may even be stored in a graph form, which can be seen in the network model.
In an RDBMS, the tables will have an identifier called the primary key. Data values will be stored in the form of tables. The relationships between these data values will be stored in the form of a table as well. Every value stored in the relational database is accessible. This value can be updated by the system. The data in this system is also physically and logically independent.
You can say that an RDBMS is an extension of a DBMS, even if there are many differences between the two. Most software products on the market today are both DBMS and RDBMS compliant. Essentially, they can maintain databases in a (relational) tabular form as well as a file form, or both. This means that today an RDBMS application is a DBMS application, and vice versa. However, there are still major differences between a relational database system for storing data and a plain database system.
Every RDBMS is a DBMS, but the opposite is not true: RDBMS is a DBMS which is based on the relational model, but not every DBMS must be relational.
However, since RDBMS are most common, sometimes the term DBMS is used to denote a DBMS which is NOT relational. It depends on the context.
DBMS: Database Management System
..... for storage of data and efficient retrieval of data.
E.g.: FoxPro
1) A DBMS has to be persistent (the data should remain accessible when the program that created it no longer exists, or when the application that created it has been restarted).
2) A DBMS has to provide uniform methods, independent of any specific application, for accessing the information that is stored.
3) A DBMS does not impose any constraints or security with regard to data manipulation. It is the user's or the programmer's responsibility to ensure the ACID properties of the database.
4) In a DBMS, no normalization process is present.
5) In a DBMS, there is no relationship concept.
6) It supports a single user only.
7) It treats data as files internally.
8) It supports 3 of E. F. Codd's 12 rules.
9) It has low software and hardware requirements.
FoxPro and IMS are examples.
RDBMS: Relational Database Management System
..... a database that uses relations (tables) to store and retrieve information.
E.g.: Oracle, SQL..,
1) An RDBMS is based on the relational model, in which data is represented in the form of relations, with enforced relationships between the tables.
2) An RDBMS defines integrity constraints for the purpose of holding the ACID properties.
3) In an RDBMS, a normalization process is present to check the consistency of the database tables.
4) An RDBMS helps in recovery of the database in case of loss of data.
5) It is used to establish relationships between two database objects, i.e., tables.
6) It supports multiple users.
7) It treats data as tables internally.
8) It supports a minimum of 6 of E. F. Codd's rules.
9) It has high software and hardware requirements.
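As a rough illustration of "enforced relationships between the tables" and integrity constraints, here is a small Python sketch using SQLite. SQLite only stands in for an RDBMS here, and the department and employee tables are made up for the example. The point is that the database itself, not the application, rejects a row that points at a non-existent parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES department(id)
    )
""")

conn.execute("INSERT INTO department (id, name) VALUES (1, 'Engineering')")
conn.execute("INSERT INTO employee (name, department_id) VALUES ('Ada', 1)")

# Department 99 does not exist, so the foreign key constraint rejects this row.
try:
    conn.execute("INSERT INTO employee (name, department_id) VALUES ('Bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

With a plain file-based store, the check in the try block would have to be written by the programmer, which is what point 3 of the DBMS list is getting at.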
From Wikipedia,
A database management system (DBMS) is a computer software application that interacts with the user, other applications, and the database itself to capture and analyze data. A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases.
There are different types of DBMS products: relational, network and hierarchical. The most widely used type of DBMS today is the relational database management system (RDBMS).
DBMS:
A DBMS is a storage area that persists the data in files.
There are limitations to storing records in a single database file.
DBMS allows relations to be established between 2 files.
Data is stored in flat files with metadata.
DBMS does not support client/server architecture.
DBMS does not follow normalization. Only a single user can access the data.
DBMS does not impose integrity constraints.
ACID properties of the database must be implemented by the user or the developer.
RDBMS:
RDBMS stores the data in tabular form.
It adds conditions on the tabular data that enforce relationships among tables.
RDBMS supports client/server architecture.
RDBMS follows normalization.
RDBMS allows simultaneous access of users to data tables.
RDBMS imposes integrity constraints.
ACID properties of the database are defined in the integrity constraints.
Have a look at this article for more details.
A DBMS is used for storage of data in files. In a DBMS, relationships can be established between two files. Data is stored in flat files with metadata, whereas an RDBMS stores the data in tabular form, with additional conditions on the data that enforce relationships among the tables. Unlike an RDBMS, a DBMS does not support client/server architecture. An RDBMS imposes integrity constraints and also follows normalization, which is not supported in a DBMS.
A DBMS is the software program that is used to manage all the databases that are stored on the network or system hard disk, whereas an RDBMS is the database system in which the relationships among different tables are maintained.
DBMS: is a software system that allows Defining, Creation, Querying, Update, and Administration of data stored in data files.
Features:
Normal book keeping system, Flat files, MS Excel, FoxPRO, XML, etc.
Less or No provision for: Constraints, Security, ACID rules, users, etc.
RDBMS: is a DBMS that is based on Relational model that stores data in tabular form.
SQL Server, Sybase, Oracle, MySQL, IBM DB2, MS Access, etc.
Features:
Database, with Tables having relations maintained by FK
DDL, DML
Data Integrity & ACID rules
Multiple User Access
Backup & Restore
Database Administration
There are other database systems, such as document stores, key-value stores, columnar stores and object-oriented databases. These are databases too, but they are not based on relations (relational theory), i.e. they are not relational database systems.
So there are a lot of differences. Database management system is the name for all databases.
DBMS stands for "Database Management System"; it includes all databases. RDBMSs are a special type of DBMS. The R in RDBMS implies that the database uses the relational model: a collection of related tables in the relational model makes up a database. A DBMS is used for simple and small applications, while an RDBMS is used for applications with a huge database. DBMSs are for smaller organizations where security is not a concern (i.e. a DBMS does not impose any constraints), while an RDBMS is quite the opposite (an RDBMS defines integrity constraints for the purpose of holding the ACID properties).

How Does NoSQL Scale Out Exactly?

I use SQL Server 2012. I have a database sharded across physical tiers by User ID. In my app User is an aggregate root (i.e., nothing about Users comes from or goes into my repository without the entire User coming or going). Everything about one particular User exists on one particular machine.
Is my system any less scalable than one that employs NoSQL? Or, can someone explain how NoSQL systems scale out across servers exactly? Wouldn't they have to shard in a similar manner to what I'm doing? We've all read that NoSQL enables scalability but at the moment I don't see how, say, MongoDB would benefit my architecture.
MongoDB allows you to scale in two ways: sharding and replication. I think you can do both in MS SQL Server.
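To show what sharding by user ID amounts to on either kind of system, here is a minimal Python sketch of hash-based routing. This is one common approach and an assumption on my part, not MongoDB's or SQL Server's actual API; the shard names and the shard_for function are hypothetical.

```python
import hashlib

# Hypothetical shard names; in practice these would be connection targets.
SHARDS = ["db-node-0", "db-node-1", "db-node-2"]

def shard_for(user_id: str, shards=SHARDS) -> str:
    """Map a user ID to one shard, so all of a user's data lives together."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and reduce it to a shard index.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# The same user always routes to the same shard.
first_lookup = shard_for("user-42")
second_lookup = shard_for("user-42")
```

Simple modulo routing like this moves most keys when a shard is added, which is one reason real systems tend to use range- or chunk-based schemes with rebalancing on top.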
What usually is different is the data model:
In a relational database, you typically have multiple tables that reference each other. Theoretically, you can do something similar with MongoDB by using multiple collections; however, this is not the way it's typically done. Instead, in MongoDB, you tend to store all the data that belongs together in the same collection. So typically you have fewer collections than tables in a database. This will often result in more redundancy (data is copied). You can try to do that in a relational database, but it's not quite so easy (there will be fewer tables, each having more columns).
MongoDB collections are more flexible than tables in that you don't need to define the data model up front (the exact list of columns / properties, the data types). This allows you to change the data model without having to alter the tables - the disadvantage is that you need to take this into account in the application (you can't rely on all rows / documents having the same structure). I'm not sure if you can do that in MS SQL Server.
In MongoDB, each document is a JSON object, so it's a tree and not a flat table. This allows more flexibility in the data model. For example, in an application I'm developing (Apache Jackrabbit Oak / MongoMK), for each property (column) we can store multiple values: one value for each revision. Doing that in a relational database is possible, but quite tricky.
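A rough Python sketch of the "tree versus flat table" difference follows. The document layout, field names and revision labels are invented for illustration and are not Oak's actual storage format. It shows how one nested document turns into several rows once it has to fit a flat (id, property, revision, value) table.

```python
# Hypothetical document: each property holds one value per revision.
document = {
    "_id": "/content/page",
    "title": {"r1": "Draft", "r2": "Final"},
    "status": {"r1": "new", "r2": "published"},
}

def flatten(doc):
    """Turn the nested document into flat (id, property, revision, value) rows."""
    rows = []
    for prop, revisions in doc.items():
        if prop == "_id":
            continue
        for revision, value in revisions.items():
            rows.append((doc["_id"], prop, revision, value))
    return rows

rows = flatten(document)
```

One document read becomes a multi-row query (plus grouping) in the flat form, which is roughly the "possible, but quite tricky" part.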

Test whether the data in two databases is identical or not

For a testing project I need to test whether databases are working as required, and I also need to check whether the data in two given databases is identical or not.
So, is there any pre-defined algorithm for performing this task?
Define what "identical" means. Is it schema, data or both? If schema, is it only tables, or all elements including functions/procedures/views etc.?
If you use JDBC, then start with the schema-revealing functions and compare and contrast the objects one by one. Then repeat the same for the data inside the tables.
You can take advantage of open-source projects to read the schema details.
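The same two-pass idea (compare schema objects first, then the data) can be sketched in Python with SQLite instead of JDBC. This is only a sketch under assumptions: it covers tables and their columns, not views or procedures, and all function names are made up for the example.

```python
import sqlite3

def table_names(conn):
    """List user tables from SQLite's catalog, skipping internal ones."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    return [row[0] for row in rows]

def table_schema(conn, table):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [tuple(row[1:]) for row in conn.execute(f'PRAGMA table_info("{table}")')]

def table_rows(conn, table):
    # Sort so that physical row order does not affect the comparison.
    return sorted(conn.execute(f'SELECT * FROM "{table}"').fetchall(), key=repr)

def databases_identical(a, b):
    """Compare table lists, then column definitions, then row contents."""
    if table_names(a) != table_names(b):
        return False
    for table in table_names(a):
        if table_schema(a, table) != table_schema(b, table):
            return False
        if table_rows(a, table) != table_rows(b, table):
            return False
    return True

def make_db(rows):
    """Build a throwaway in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, label TEXT)")
    conn.executemany("INSERT INTO item (id, label) VALUES (?, ?)", rows)
    return conn

same = databases_identical(make_db([(1, "a"), (2, "b")]), make_db([(2, "b"), (1, "a")]))
different = databases_identical(make_db([(1, "a")]), make_db([(1, "z")]))
```

Loading whole tables into memory is fine for test fixtures; for large tables, comparing per-table checksums or streaming rows in key order would be the more realistic choice.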

How to implement a catalogue for meta-data and automating SQL in a database?

I have read here the discussions on 5NF, EAV and 6NF and the need for a catalogue to handle meta-data and the complex SQL "automatically". How is that implemented in practice?
PerformanceDBA wrote several answers on 6NF and EAV that mention catalogues, e.g. in the following questions:
Would like to Understand 6NF with an Example
6NF and historical attribute data
and especially Multiple fixed tables vs flexible abstract tables, where PerformanceDBA wrote
"Eg. For 6NF databases with a catalogue, I have a set of procs that
will [re]generate the SQL required to perform all SELECTs, and I
provide Views in 5NF for all users, so they do not need to know or
understand the underlying 6NF structure. They are driven off the
catalogue. Thus changes are easy and automated. EAV types do that
manually, due to the absence of the catalogue."
First, with LedgerSMB we reuse the system catalogs and information schema wherever we can. This means that the application actually spends some time querying the system catalogs. We also have some meta-data calculations for extended attributes. We don't do EAV here. Rather we have actual relations and meta-data about these which allows us to create relational queries on the client side. These are loaded at one point and cached. The catalog looks very much like an EAV catalog, but the underlying storage is actually relational and the functions which maintain these alter underlying tables. This gives you the flexibility of EAV without the underlying difficulties.
In future versions we will probably move to fewer application catalogs and greater use of the Pg system catalogs and information schema, and our interface will be simpler from an application perspective.
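As a small illustration of SQL being "driven off the catalogue", here is a Python sketch that reads column metadata from SQLite's catalog and regenerates a SELECT from it. It is not how LedgerSMB or PerformanceDBA's procs work internally; the generate_select function and the customer table are assumptions for the example.

```python
import sqlite3

def generate_select(conn, table):
    """Build a SELECT statement from catalog metadata instead of hand-written SQL."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    if not columns:
        raise ValueError(f"unknown table: {table}")
    column_list = ", ".join(f'"{column}"' for column in columns)
    return f'SELECT {column_list} FROM "{table}"'

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
sql_before = generate_select(conn, "customer")

# After a structural change, regenerating picks up the new column automatically.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
sql_after = generate_select(conn, "customer")
```

The same idea extends to generating views or joins, as long as the relationships between tables are also recorded somewhere the generator can read.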

NoSQL databases (maps)

I am starting to look into NoSQL databases.
I think I got the main concept which is to store the data as "maps" i.e. as key-value pairs which the NoSQL distinguishes via a unique id.
At this point I am confused (in relation to when designing the database).
Does this (that the data are stored as "maps") imply that the data we store in a Collection of a NoSQL database must be "homogeneous"?
Same as a Map in Java, for instance, where the keys are all of the same type and the values are all of the same type?
Also, is the concept of referential integrity supported in NoSQL databases? Or, since they are stored as maps, will I have to write code to manually update any related collection?
If we're talking about MongoDB:
There is no referential integrity enforcement on the DB side. You have to do it in the application code. Triggers and cascade update/delete are also on you.
Documents can be complex tree-like structures, where keys are strings and values can be of different types.
Documents are not required to be homogeneous. The same collection can contain documents of any structure and field set.
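A minimal Python sketch of what "do it in the application code" can look like follows. Plain dicts stand in for MongoDB collections here, so none of this is the MongoDB driver API; the collection contents and the delete_user function are hypothetical.

```python
# Stand-ins for two collections; note the documents do not share one structure.
users = {
    "u1": {"name": "Ann"},
    "u2": {"name": "Raj", "tags": ["admin"], "address": {"city": "Pune"}},
}
orders = {
    "o1": {"user_id": "u1", "total": 20},
    "o2": {"user_id": "u2", "total": 5},
    "o3": {"user_id": "u1", "items": [{"sku": "A1", "qty": 2}]},
}

def delete_user(user_id):
    """Application-level cascade delete: remove the user, then dependent orders."""
    users.pop(user_id, None)
    orphaned = [oid for oid, order in orders.items() if order.get("user_id") == user_id]
    for oid in orphaned:
        del orders[oid]
    return orphaned

removed = delete_user("u1")
remaining_orders = sorted(orders)
```

If the application skips the second step, the orders for u1 simply stay behind with a dangling user_id, because nothing on the database side objects.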