Recursively dissect SQL schema

I use DBeaver with postgresql. It has a feature that lists a tree view of a db's schemas, including information_schema, pg_catalog, and public. Then, within each schema, there are a set of headings: Tables, Views, Materialized Views, Indexes, Functions, Sequences, Data Types, Aggregate Functions. Within each of these headings there are other entities, and so on to several levels in depth.
I would like to create that tree view independently of DBeaver, using tkinter. I can handle the tkinter part, but I haven't been able to divine the SQL statements that dissect schemas recursively down to leaf nodes. I've only found the topmost statement, which is:
select schema_name from information_schema.schemata
Beyond that, I cannot find anything that enables me to display deeper structure. I have read all the so-called schema tutorials; they are focused only on user-created tables. I've also read the official postgresql docs on schemas; they read like a dictionary and have no tutorial value whatever.
Any help, please.

You'll find all the required information (schemata, tables, views, sequences, and so on) in the information_schema schema, which you can inspect with DBeaver itself. The PostgreSQL information schema is documented at https://www.postgresql.org/docs/current/information-schema.html . Good luck!
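To go one level at a time below schemata, queries along these lines work. This is a sketch against PostgreSQL's catalogs; 'public' and 'my_table' are placeholder names, and some of the headings DBeaver shows (materialized views, indexes) come from pg_catalog rather than information_schema:

```sql
-- 1. Schemas (the top level you already have)
SELECT schema_name FROM information_schema.schemata;

-- 2. Tables and views within a schema
SELECT table_name, table_type           -- 'BASE TABLE' or 'VIEW'
FROM   information_schema.tables
WHERE  table_schema = 'public';

-- 3. Columns (leaf nodes) within a table
SELECT column_name, data_type, is_nullable
FROM   information_schema.columns
WHERE  table_schema = 'public' AND table_name = 'my_table'
ORDER  BY ordinal_position;

-- Entities not in information_schema come from pg_catalog views:
SELECT matviewname   FROM pg_matviews WHERE schemaname = 'public';  -- materialized views
SELECT indexname     FROM pg_indexes  WHERE schemaname = 'public';  -- indexes
SELECT sequence_name FROM information_schema.sequences
WHERE  sequence_schema = 'public';                                  -- sequences
```

Each query takes the parent node's name as a filter, so the tkinter tree can issue them lazily as nodes are expanded.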

Related

Query Document Schema in MarkLogic

I would like to query the schema definition of an index in MarkLogic.
How can I query that? What would be the query to do that?
I am talking about a schema in the sense of an Elasticsearch schema, with field types, analyzers, etc.
Think of my question as if I were asking how to see the column names and types in Oracle. How do I do the same in MarkLogic? Any examples?
MarkLogic has a universal index, so there is no requirement to define a schema up front to search on specific elements or properties.
To do datatyped queries on element or properties, you can use TDE in MarkLogic 9 to define how to project datatyped values from documents in a collection into the indexes as a view over the documents. To find out the list of columns with data types for a view, you can either query the system columns view or retrieve the TDE template from the schemas database.
In MarkLogic 8 and before, you would define range indexes on elements, properties, fields, or paths. On the e-node, the Admin API can get the list of range indexes for any database. On the middle tier, the Management REST API can express the equivalent REST request.
Hoping that clarifies,

Difference between SQL View and WITH clause

Can anybody here tell me the difference between VIEW and WITH? I have searched many places and can't find anything about it.
My thought is that VIEW and WITH are the same, except that a VIEW is saved as a schema object, but I could be wrong.
SQL views and WITH clauses are very similar. Here are some differences.
Views create an actual object in the database, with associated metadata and security capabilities. WITH clauses are only part of a single query.
In many databases, views have options, for instance, to index them or to "instantiate" them.
WITH clauses offer the opportunity to write recursive CTEs, in some databases. This is not possible for a plain view.
For simple subqueries incorporated into queries, they are quite similar. The choice really depends on whether you want reusable code (views) or are focused on a single query (WITH).
Fundamentally, the definition of a view is saved in the database and can be reused by any query, whereas a WITH clause (or Common Table Expression, or CTE) is tied to one specific query and can only be reused by copying.
Otherwise, they are substantially the same.
If you use a recursive WITH clause, then you can't achieve the same result in a VIEW unless the view definition itself uses a WITH clause (which is legitimate).
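A sketch of that legitimate combination, with invented table and column names: the view itself is not "recursive", but its definition embeds a recursive CTE, so callers get the recursive result transparently.

```sql
CREATE VIEW org_chart AS
WITH RECURSIVE chain (id, name, depth) AS (
    SELECT id, name, 0
    FROM   employees
    WHERE  manager_id IS NULL             -- roots of the hierarchy
    UNION ALL
    SELECT e.id, e.name, c.depth + 1
    FROM   employees e
    JOIN   chain c ON e.manager_id = c.id -- walk down one level per step
)
SELECT id, name, depth FROM chain;

-- Reusable by any query, like an ordinary view:
SELECT * FROM org_chart WHERE depth <= 2;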
In a few words, WITH is a clause used in DML, while a VIEW is a database object. A view definition may contain a query that uses WITH. You can think of WITH as a variation of a derived table (in Microsoft terminology) or inline view (in Oracle) that is defined before the main DML statement and has the ability to refer to itself (recursive queries).
WITH is also used in SQL Server in a different context (query hints).

Is a snowflake schema better than a star schema for data mining?

I know the basic difference between a star schema and a snowflake schema: a snowflake schema breaks dimension tables down into multiple tables in order to normalize them, while a star schema has only one "level" of dimension tables. But the Wikipedia article on the snowflake schema says:
"Some users may wish to submit queries to the database which, using conventional multidimensional reporting tools, cannot be expressed within a simple star schema. This is particularly common in data mining of customer databases, where a common requirement is to locate common factors between customers who bought products meeting complex criteria. Some snowflaking would typically be required to permit simple query tools to form such a query, especially if provision for these forms of query weren't anticipated when the data warehouse was first designed."
When would it be impossible to write a query in a star schema that could be written in a snowflake schema for the same underlying data? It seems like a star schema would always allow the same queries.
For data mining, you almost always have to prepare your data -- mostly as one "flat table".
It may be a query, prepared view or CSV export -- depends on the tool and your preference.
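As a sketch, "flattening" a star schema for mining is usually just a wide join that denormalizes the dimensions next to each fact row; all table and column names here are invented:

```sql
-- One row per fact, with dimension attributes pulled alongside it.
CREATE VIEW mining_flat AS
SELECT f.sale_id,
       f.amount,
       c.age, c.region,        -- customer dimension
       p.category, p.brand,    -- product dimension
       d.year, d.month         -- date dimension
FROM   fact_sales   f
JOIN   dim_customer c ON c.customer_id = f.customer_id
JOIN   dim_product  p ON p.product_id  = f.product_id
JOIN   dim_date     d ON d.date_id     = f.date_id;
```

With a snowflaked schema the same view simply needs more joins to reach the outlying dimension tables.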
Now, to properly understand that article, one would probably have to smoke-drink the same thing as the author when he/she wrote it.
As you mention, preparing a flat table for data mining starting from a relational database is no simple task, and the snowflake or the star schema only work up to a point.
However, there is a piece of software called Dataconda that automatically creates a flat table from a DB.
Basically, you select a target table in a relational database, and Dataconda "expands" it by adding thousands of new attributes to it; these attributes are obtained by executing complex queries involving multiple tables.

MySQL Views - When to use & when not to

The MySQL certification guide suggests that views can be used for:
creating a summary that may involve calculations
selecting a set of rows with a WHERE clause, hiding irrelevant information
the result of a join or union
allowing changes made to the base table via a view that preserves the schema of the original table, to accommodate other applications
but from "how to implement search for 2 different table data?":
"And maybe you're right that it doesn't work since mysql views are not good friends with indexing. But still. Is there anything to search for in the shops table?"
I've learned that views don't work well with indexing, so will it be a big performance hit for the convenience they provide?
A view can be simply thought of as a SQL query stored permanently on the server. Whatever indices the query optimizes to will be used. In that sense, there is no difference between the SQL query or a view. It does not affect performance any more negatively than the actual SQL query. If anything, since it is stored on the server, and does not need to be evaluated at run time, it is actually faster.
It does afford you these additional advantages
reusability
a single source for optimization
This mysql-forum-thread about indexing views gives a lot of insight into what mysql views actually are.
Some key points:
A view is really nothing more than a stored select statement
The data of a view is the data of tables referenced by the View.
creating an index on a view will not work as of the current version
If merge algorithm is used, then indexes of underlying tables will be used.
The underlying indices are not visible, however. DESCRIBE on a view will show no indexed columns.
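The merge behaviour mentioned above can be made explicit when creating the view. A sketch in MySQL syntax, with invented table names:

```sql
-- With ALGORITHM = MERGE, MySQL folds the view's SELECT into the outer
-- query, so indexes on the base table remain usable.
CREATE ALGORITHM = MERGE VIEW active_users AS
SELECT id, name, email
FROM   users
WHERE  active = 1;

-- This query can still use an index on users(name), even though it
-- goes through the view:
SELECT id FROM active_users WHERE name = 'alice';
```

If the view's definition forces the TEMPTABLE algorithm (e.g. it contains GROUP BY or DISTINCT), MySQL materializes an intermediate result instead, and base-table indexes no longer help the outer query.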
MySQL views, according to the official MySQL documentation, are stored queries that when invoked produce a result set.
A database view is nothing but a virtual table or logical table (commonly consisting of a SELECT query with joins). A database view is similar to a database table, which consists of rows and columns, so you can query data against it.
Views should be used when:
Simplifying complex queries (like IF ELSE and JOIN, or working with triggers and such)
Putting an extra layer of security to limit or restrict data access (since views are merely virtual tables, they can be set to be read-only for a specific set of DB users, and can restrict INSERT)
Backward compatibility and query reusability
Working with computed columns. Computed columns should NOT be in DB tables, because that would be a bad schema design.
Views should not be used when:
the associated table(s) is/are tentative or subject to frequent structural change.
According to http://www.mysqltutorial.org/introduction-sql-views.aspx
A database table should not have calculated columns however a database view should.
I tend to use a view when I need to calculate totals, counts etc.
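For instance, a summary view along those lines might look like this (a sketch; table and column names are invented):

```sql
CREATE VIEW order_totals AS
SELECT customer_id,
       COUNT(*)    AS order_count,  -- computed, not stored in any table
       SUM(amount) AS total_spent
FROM   orders
GROUP  BY customer_id;
```

The computed values live only in the view definition, so the base table stays free of derived columns.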
Hope that helps!
One more downside: views don't play well with MySQL replication, and can cause the slave to fall a bit behind the master.
http://bugs.mysql.com/bug.php?id=30998

Enforce Referential Integrity on Materialized Path?

I'm trying to implement a tree like structure using a Materialized Path model described here: http://www.dbazine.com/oracle/or-articles/tropashko4.
Is it possible to enforce referential integrity on the [path] field? I don't see how SQL could do it, do I have to do it manually in the DAL?
Yes, you have to enforce data integrity yourself in the DAL when you use either Materialized Path or Nested Sets solutions for hierarchical data.
Adjacency List supports referential integrity, and this is true also for a design I call "Closure Table" (Tropashko calls this design "transitive closure relation").
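A minimal sketch of such a Closure Table, with invented names: every ancestor/descendant pair is stored as its own row, and both columns can carry ordinary foreign keys, which is exactly where declarative referential integrity comes back.

```sql
CREATE TABLE nodes (
    id   integer PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE tree_paths (
    ancestor   integer NOT NULL REFERENCES nodes (id),
    descendant integer NOT NULL REFERENCES nodes (id),
    depth      integer NOT NULL,          -- 0 = the node's self-reference row
    PRIMARY KEY (ancestor, descendant)
);

-- All descendants of node 1, with integrity enforced by the database:
SELECT n.*
FROM   nodes n
JOIN   tree_paths p ON p.descendant = n.id
WHERE  p.ancestor = 1;
```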
"Materialized path" as presented by Vadim Tropashko in that article, introduces the notion of order into a relation ("Jones is the second member".).
"Materialized path" is nothing but "some form of materialized view" on the transitive closure, and therefore suffers all and exactly the same problems as any other "materialized view", except that matters are algorithmically worse precisely because of the involvement of a closure.
SQL is almost completely powerless when constraints-on-a-closure are in play. (Meaning : yes, SQL requires you to do everything yourself.) It's one of those areas where the RM shows the maximum of its almost unlimited power, but SQL fails abysmally, and where it is such a shame that the majority of people mistake SQL for being relational.
(@Bill Karwin: I'd like to be able to give you +1 for your remark on the relation between the depth of the trees and the effect on performance. There are no known algorithms to compute closures that perform well in the case of trees with "crazy" depths. It's an algorithmic problem, not an SQL nor a relational one.)
EDIT
Yes, RM = Relational Model
In the materialized path model you can use arbitrary strings (maybe unicode strings, to allow more than 256 children per node) instead of special strings of the form "x.y.z". The id of the parent is then the id of a direct child with the last character removed. You can easily enforce this with a check constraint like the following (my example works in PostgreSQL):
check(parent_id = substring(id from 1 for char_length(id)-1)),
within your create table command. If you insist on strings of form "x.y.z", you'll have to play around with regular expressions, but I'd guess it's possible to find a corresponding check constraint.
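Putting it together, a complete sketch of such a table in PostgreSQL (the self-referencing foreign key plus the check constraint jointly enforce that every non-root node's parent exists and is the correct prefix):

```sql
CREATE TABLE tree (
    id        text PRIMARY KEY,
    parent_id text REFERENCES tree (id),
    -- the parent's id must be this id minus its last character;
    -- the root row has parent_id IS NULL
    CHECK (parent_id IS NULL
           OR parent_id = substring(id FROM 1 FOR char_length(id) - 1))
);

INSERT INTO tree VALUES ('a',  NULL);   -- root
INSERT INTO tree VALUES ('ab', 'a');    -- first child of 'a'
INSERT INTO tree VALUES ('ac', 'a');    -- second child of 'a'
-- INSERT INTO tree VALUES ('xy', 'a'); -- rejected by the CHECK constraint
```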
Yes, we can "enforce referential integrity on the [path] field". I did that a couple of years ago, described here:
Store your configuration settings as a hierarchy in a database