I have a table MYTYPE in Oracle 10g representing a tree structure, which is something like this:
ID | PARENTID | DETAIL
I would like to select all rows in MYTYPE which are descendants of a particular ID, such that I can create queries elsewhere such as:
SELECT *
FROM MYDETAIL
WHERE MYTYPEID IN [all MYTYPE which are descendants of some ID];
What is a cost-efficient way of building the descendant set, preferably without using PL/SQL?
Oracle didn't support the ANSI hierarchical syntax of using a recursive Subquery Factoring (CTE in SQL Server syntax) until 11g R2, so you have to use Oracle's native CONNECT BY syntax (supported since v2):
SELECT t.*
FROM MYTABLE t
START WITH t.parentid = ?
CONNECT BY PRIOR t.id = t.parentid
Replace the question mark with the parent you want to find the hierarchical data based on.
Reference:
ASK TOM: "connect by"
Managing hierarchical data using ID,ParentID columns in an RDBMS is known as the Adjacency List model. While very easy to implement and maintain (i.e. insert, update, delete), it's expensive to determine lineage (i.e. ancestors and descendants). As other answers already have written, Oracle's CONNECT BY will work, but this is an expensive operation. You may be better off representing your data differently.
For your case, the easiest solution might adding a what's called a Hierarchy Bridge table to your schema and adding a LEVEL column to your original table. The table has columns ID,DescendantID whereby selecting on ID gives all descendant records, and selecting by DescentantID gives all ancestor records. LEVEL is necessary on the base table to order records. In this way you make a tradeoff of expensive updates for cheap reads, which is what your question implies you want.
Other possibilities that involve changing your base data include Nested Set and Materialized Path representations. That offer similar tradeoffs of more expensive writes for much cheaper reads. For a complete list of options, pros and cons, and some implementation notes, see my previous question on the topic.
Oracle can do recursive queries.
Try looking into start with ... connect by, something like this:
Select *
from MYDETAIL
Starting with PARENTID= 1 --or whatever the root ID is
connect by PARENTID = prior ID
http://psoug.org/reference/connectby.html
Here is the details for 'connect by' features in oracle.
http://psoug.org/reference/connectby.html
I have a need to build a schema structure to support table of contents (so the level of sections / sub-sections could change for each book or document I add)...one of my first thoughts was that I could use a recursive table to handle that. I want to make sure that my structure is normalized, so I was trying to stay away from deonormalising the table of contents data into a single table (then have to add columns when there are more sub-sections).
It doesn't seem right to build a recursive table and could be kind of ugly to populate.
Just wanted to get some thoughts on some alternate solutions or if a recursive table is ok.
Thanks,
S
It helps that SQL Server 2008 has both the recursive WITH clause and hierarchyid to make working with hierarchical data easier - I was pointing out to someone yesterday that MySQL doesn't have either, making things difficult...
The most important thing is to review your data - if you can normalize it to be within a single table, great. But don't shoehorn it in to fit a single table setup - if it needs more tables, then design it that way. The data & usage will show you the correct way to model things.
When in doubt, keep it simple. Where you've a collection of similar items, e.g. employees then a table that references itself makes sense. Whilst here you can argue (quite rightly) that each item within the table is a 'section' of some form or another, unless you're comfortable with modelling the data as sections and handling the different types of sections through relationships to these entities, I would avoid the complexity of a self-referencing table and stick with a normalized approach.
Dealing with SQL shows us some limitations and gives us an opportunity to imagine what could be.
Which improvements to SQL are you waiting for? Which would you put on top of the wish list?
I think it can be nice if you post in your answer the database your feature request lacks.
T-SQL Specific: A decent way to select from a result set returned by a stored procedure that doesn't involve putting it into a temporary table or using some obscure function.
SELECT * FROM EXEC [master].[dbo].[xp_readerrorlog]
I know it's wildly unrealistic, but I wish they'd make the syntax of INSERT and UPDATE consistent. Talk about gratuitous non-orthogonality.
Operator to manage range of dates (or numbers):
where interval(date0, date1) intersects interval(date3, date4)
EDIT: Date or numbers, of course are the same.
EDIT 2: It seems Oracle have something to go, the undocumented OVERLAPS predicate. More info here.
A decent way of walking a tree with hierarchical data. Oracle has CONNECT BY but the simple and common structure of storing an object and a self-referential join back to the table for 'parent' is hard to query in a natural way.
More SQL Server than SQL but better integration with Source Control. Preferably SVN rather than VSS.
Implicit joins or what it should be called (That is, predefined views bound to the table definition)
SELECT CUSTOMERID, SUM(C.ORDERS.LINES.VALUE) FROM
CUSTOMER C
A redesign of the whole GROUP BY thing so that every expression in the SELECT clause doesn't have to be repeated in the GROUP BY clause
Some support for let expressions or otherwise more legal places to use an alias, a bit related to the GROUP BY thing, but I find other times what I just hate Oracle for forcing me to use an outer select just to reference a big expression by alias.
I would like to see the ability to use Regular Expressions in string handling.
A way of dynamically specifying columns/tables without having to resort to full dynamic sql that executes in another context.
Ability to define columns based on other columns ad infinitum (including disambiguation).
This is a contrived example and not a real world case, but I think you'll see where I'm going:
SELECT LTRIM(t1.a) AS [a.new]
,REPLICATE(' ', 20 - LEN([a.new])) + [a.new] AS [a.conformed]
,LEN([a.conformed]) as [a.length]
FROM t1
INNER JOIN TABLE t2
ON [a.new] = t2.a
ORDER BY [a.new]
instead of:
SELECT LTRIM(t1.a) AS [a.new]
,REPLICATE(' ', 20 - LEN(LTRIM(t1.a))) + LTRIM(t1.a) AS [a.conformed]
,LEN(REPLICATE(' ', 20 - LEN(LTRIM(t1.a))) + LTRIM(t1.a)) as [a.length]
FROM t1
INNER JOIN TABLE t2
ON LTRIM(t1.a) = t2.a
ORDER BY LTRIM(t1.a)
Right now, in SQL Server 2005 and up, I would use a CTE and build up in successive layers.
I'd like the vendors to actually standardise their SQL. They're all guilty of it. The LIMIT/OFFSET clause from MySQL and PostGresql is a good solution that no-one else appears to do. Oracle has it's own syntax for explicit JOINs whilst everyone else uses ANSI-92. MySQL should deprecate the CONCAT() function and use || like everyone else. And there are numerous clauses and statements that are outside the standard that could be wider spread. MySQL's REPLACE is a good example. There's more, with issues about casting and comparing types, quirks of column types, sequences, etc etc etc.
parameterized order by, as in:
select * from tableA order by #columName
Support in SQL to specify if you want your query plan to be optimized to return the first rows quickly, or all rows quickly.
Oracle has the concept of FIRST_ROWS hint, but a standard approach in the language would be useful.
Automatic denormalization.
But I may be dreaming.
Improved pivot tables. I'd like to tell it to automatically create the columns based on the keys found in the data.
On my wish list is a database supporting sub-queries in CHECK-constraints, without having to rely on materialized view tricks. And a database which supports the SQL standard's "assertions", i.e. constraints which may span more than one table.
Something else: A metadata-related function which would return the possible values of a given column, if the set of possible values is low. I.e., if a column has a foreign key to another column, it would return the existing values in the column being referred to. Of if the column has a CHECK-constraint like "CHECK foo IN(1,2,3)", it would return 1,2,3. This would make it easier to create GUI elements based on a table schema: If the function returned a list of two values, the programmer could decide that a radio button widget would be relevant - or if the function returned - e.g. - 10 values, the application showed a dropdown-widget instead. Etc.
UPSERT or MERGE in PostgreSQL. It's the one feature whose absence just boggles my mind. Postgres has everything else; why can't they get their act together and implement it, even in limited form?
Check constraints with subqueries, I mean something like:
CHECK ( 1 > (SELECT COUNT(*) FROM TABLE WHERE A = COLUMN))
These are all MS Sql Server/T-SQL specific:
"Natural" joins based on an existing Foreign Key relationship.
Easily use a stored proc result as a resultset
Some other loop construct besides while
Unique constraints across non NULL values
EXCEPT, IN, ALL clauses instead of LEFT|RIGHT JOIN WHERE x IS [NOT] NULL
Schema bound stored proc (to ease #2)
Relationships, schema bound views, etc. across multiple databases
WITH clause for other statements other than SELECT, it means for UPDATE and DELETE.
For instance:
WITH table as (
SELECT ...
)
DELETE from table2 where not exists (SELECT ...)
Something which I call REFERENCE JOIN. It joins two tables together by implicitly using the FOREIGN KEY...REFERENCES constraint between them.
A relational algebra DIVIDE operator. I hate always having to re-think how to do all elements of table a that are in all of given from table B.
http://www.tc.umn.edu/~hause011/code/SQLexample.txt
String Agregation on Group by (In Oracle is possible with this trick):
SELECT deptno, string_agg(ename) AS employees
FROM emp
GROUP BY deptno;
DEPTNO EMPLOYEES
---------- --------------------------------------------------
10 CLARK,KING,MILLER
20 SMITH,FORD,ADAMS,SCOTT,JONES
30 ALLEN,BLAKE,MARTIN,TURNER,JAMES,WARD
More OOP features:
stored procedures and user functions
CREATE PROCEDURE tablename.spname ( params ) AS ...
called via
EXECUTE spname
FROM tablename
WHERE conditions
ORDER BY
which implicitly passes a cursor or a current record to the SP. (similar to inserted and deleted pseudo-tables)
table definitions with inheritance
table definition as derived from base table, inheriting common columns etc
Btw, this is not necessarily real OOP, but only syntactic sugar on existing technology, but it would simplify development a lot.
Abstract tables and sub-classing
create abstract table person
(
id primary key,
name varchar(50)
);
create table concretePerson extends person
(
birth date,
death date
);
create table fictionalCharacter extends person
(
creator int references concretePerson.id
);
Increased temporal database support in Sql Server. Intervals, overlaps, etc.
Increased OVER support in Sql Server, including LAG, LEAD, and TOP.
Arrays
I'm not sure what's holding this back but lack of arrays lead to temp tables and related mess.
Some kind of UPGRADE table which allows to make changes on the table to be like the given:
CREATE OR UPGRADE TABLE
(
a VARCHAR,
---
)
My wish list (for SQLServer)
Ability to store/use multiple execution plans for a stored procedure concurrently and have the system automatically understand the best stored plan to use at each execution.
Currently theres one plan - if it is no longer optimal its used anyway or a brand new one is computed in its place.
Native UTF-8 storage
Database mirroring with more than one standby server and the ability to use a recovery model approaching 'simple' provided of course all servers are up and the transaction commits everywhere.
PCRE in replace functions
Some clever way of reusing fragments of large sql queries, stored match conditions, select conditions...etc. Similiar to functions but actually implemented more like preprocessor macros.
Comments for check constraints. With this feature, an application (or the database itself when raising an error) can query the metadata and retrieve that comment to show it to the user.
Automated dba notification in the case where the optimizer generates a plan different that the plan that that the query was tested with.
In other words, every query can be registered. At that time, the plan is saved. Later when the query is executed, if there is a change to the plan, the dba receives a notice, that something unexpected occurred.