SQL Conditional joins, data inheritance - sql

I need to implement a data inheritance logic, based on a dynamic tree, with a fixed depth of 3.
Output 1
--Output 1.1
--Output 1.2
----Ouptut 1.2.1
----Output 1.2.2
--Output 1.3
Ouput 2
-- Output 2.1
-- Output 2.2
-- etc.
So I have an object defined as followed:
Object
-id
-createdAt
And the data attached to the object
ObjectData
-owner_id (fk to the object)
-output_id (fk to the output)
-name (this is the real data)
What I am trying to do is retrieve a list of Object, inner join the ObjectData base on owner_id/ouput_id, but if no ObjectData with given output_id exists, join the ObjectData of parent output.
At the moment, with my basic SQL query, if the object_data has no entry with a given output, it is not returned, which is how works INNER JOIN.
SELECT c.*, d.name FROM object c
INNER JOIN object_data d ON c.id = d.owner_id AND d.ouput_id = ?
What I am trying to achieve is to avoid having to create a new object_data entry when data is fully inherited, to optimize storage size, and to avoid huge inserts when adding a new output.
I must also be able to order the query on a object_data field.
Thank you !
EDIT:
Thanks to Gordon Linoff I think that coalesce is a solution for my problem, but according to mysql doc, coalesce will return first non null.
In my case, if the row exists, I want to return its value wether it is null or not.

I think I understand what you want. The following SQL joins in both the object data and the parent's object data, and then chooses between them in the select statement:
select o.*, coalesce(od.name, od_parent.name)
from object o left outer join
objectdata od_o
on od_o.id = o.id left outer join
objectdata od_parent
on od_parent.id = o.parent_id
This may seem like unnecessary effort on the part of the database engine. But most engines are highly optimized for joining tables together, so performance probably is not an issue -- especially if you have the right indexes on the tables.

Related

Abap subquery Where Cond [duplicate]

I have a requirement to pull records, that do not have history in an archive table. 2 Fields of 1 record need to be checked for in the archive.
In technical sense my requirement is a left join where right side is 'null' (a.k.a. an excluding join), which in abap openSQL is commonly implemented like this (for my scenario anyways):
Select * from xxxx //xxxx is a result for a multiple table join
where xxxx~key not in (select key from archive_table where [conditions] )
and xxxx~foreign_key not in (select key from archive_table where [conditions] )
Those 2 fields are also checked against 2 more tables, so that would mean a total of 6 subqueries.
Database engines that I have worked with previously usually had some methods to deal with such problems (such as excluding join or outer apply).
For this particular case I will be trying to use ABAP logic with 'for all entries', but I would still like to know if it is possible to use results of a sub-query to check more than than 1 field or use another form of excluding join logic on multiple fields using SQL (without involving application server).
I have tested quite a few variations of sub-queries in the life-cycle of the program I was making. NOT EXISTS with multiple field check (shortened example below) to exclude based on 2 keys works in certain cases.
Performance acceptable (processing time is about 5 seconds), although, it's noticeably slower than the same query when excluding based on 1 field.
Select * from xxxx //xxxx is a result for a multiple table inner joins and 1 left join ( 1-* relation )
where NOT EXISTS (
select key from archive_table
where key = xxxx~key OR key = XXXX-foreign_key
)
EDIT:
With changing requirements (for more filtering) a lot has changed, so I figured I would update this. The construct I marked as XXXX in my example contained a single left join ( where main to secondary table relation is 1-* ) and it appeared relatively fast.
This is where context becomes helpful for understanding the problem:
Initial requirement: pull all vendors, without financial records in 3
tables.
Additional requirements: also exclude based on alternative
payers (1-* relationship). This is what example above is based on.
More requirements: also exclude based on alternative payee (*-* relationship between payer and payee).
Many-to-many join exponentially increased the record count within the construct I labeled XXXX, which in turn produces a lot of unnecessary work. For instance: a single customer with 3 payers, and 3 payees produced 9 rows, with a total of 27 fields to check (3 per row), when in reality there are only 7 unique values.
At this point, moving left-joined tables from main query into sub-queries and splitting them gave significantly better performance.
than any smarter looking alternatives.
select * from lfa1 inner join lfb1
where
( lfa1~lifnr not in ( select lifnr from bsik where bsik~lifnr = lfa1~lifnr )
and lfa1~lifnr not in ( select wyt3~lifnr from wyt3 inner join t024e on wyt3~ekorg = t024e~ekorg and wyt3~lifnr <> wyt3~lifn2
inner join bsik on bsik~lifnr = wyt3~lifn2 where wyt3~lifnr = lfa1~lifnr and t024e~bukrs = lfb1~bukrs )
and lfa1~lifnr not in ( select lfza~lifnr from lfza inner join bsik on bsik~lifnr = lfza~empfk where lfza~lifnr = lfa1~lifnr )
)
and [3 more sets of sub queries like the 3 above, just checking different tables].
My Conclusion:
When exclusion is based on a single field, both not in/not exits work. One might be better than the other, depending on filters you use.
When exclusion is based on 2 or more fields and you don't have many-to-many join in main query, not exists ( select .. from table where id = a.id or id = b.id or... ) appears to be the best.
The moment your exclusion criteria implements a many-to-many relationship within your main query, I would recommend looking for an optimal way to implement multiple sub-queries instead (even having a sub-query for each key-table combination will perform better than a many-to-many join with 1 good sub-query, that looks good).
Anyways, any additional insight into this is welcome.
EDIT2: Although it's slightly off topic, given how my question was about sub-queries, I figured I would post an update. After over a year I had to revisit the solution I worked on to expand it. I learned that proper excluding join works. I just failed horribly at implementing it the first time.
select header~key
from headers left join items on headers~key = items~key
where items~key is null
if it is possible to use results of a sub-query to check more than
than 1 field or use another form of excluding join logic on multiple
fields
No, it is not possible to check two columns in subquery, as SAP Help clearly says:
The clauses in the subquery subquery_clauses must constitute a scalar
subquery.
Scalar is keyword here, i.e. it should return exactly one column.
Your subquery can have multi-column key, and such syntax is completely legit:
SELECT planetype, seatsmax
FROM saplane AS plane
WHERE seatsmax < #wa-seatsmax AND
seatsmax >= ALL ( SELECT seatsocc
FROM sflight
WHERE carrid = #wa-carrid AND
connid = #wa-connid )
however you say that these two fields should be checked against different tables
Those 2 fields are also checked against two more tables
so it's not the case for you. Your only choice seems to be multi-join.
P.S. FOR ALL ENTRIES does not support negation logic, you cannot just use some sort of NOT IN FOR ALL ENTRIES, it won't be that easy.

In an EXISTS can my JOIN ON use a value from the original select

I have an order system. Users with can be attached to different orders as a type of different user. They can download documents associated with an order. Documents are only given to certain types of users on the order. I'm having trouble writing the query to check a user's permission to view a document and select the info about the document.
I have the following tables and (applicable) fields:
Docs: DocNo, FileNo
DocAccess: DocNo, UserTypeWithAccess
FileUsers: FileNo, UserType, UserNo
I have the following query:
SELECT Docs.*
FROM Docs
WHERE DocNo = 1000
AND EXISTS (
SELECT * FROM DocAccess
LEFT JOIN FileUsers
ON FileUsers.UserType = DocAccess.UserTypeWithAccess
AND FileUsers.FileNo = Docs.FileNo /* Errors here */
WHERE DocAccess.UserNo = 2000 )
The trouble is that in the Exists Select, it does not recognize Docs (at Docs.FileNo) as a valid table. If I move the second on argument to the where clause it works, but I would rather limit the initial join rather than filter them out after the fact.
I can get around this a couple ways, but this seems like it would be best. Anything I'm missing here? Or is it simply not allowed?
I think this is a limitation of your database engine. In most databases, docs would be in scope for the entire subquery -- including both the where and in clauses.
However, you do not need to worry about where you put the particular clause. SQL is a descriptive language, not a procedural language. The purpose of SQL is to describe the output. The SQL engine, parser, and compiler should be choosing the most optimal execution path. Not always true. But, move the condition to the where clause and don't worry about it.
I am not clear why do you need to join with FileUsers at all in your subquery?
What is the purpose and idea of the query (in plain English)?
In any case, if you do need to join with FileUsers then I suggest to use the inner join and move second filter to the WHERE condition. I don't think you can use it in JOIN condition in subquery - at least I've never seen it used this way before. I believe you can only correlate through WHERE clause.
You have to use aliases to get this working:
SELECT
doc.*
FROM
Docs doc
WHERE
doc.DocNo = 1000
AND EXISTS (
SELECT
*
FROM
DocAccess acc
LEFT OUTER JOIN
FileUsers usr
ON
usr.UserType = acc.UserTypeWithAccess
AND usr.FileNo = doc.FileNo
WHERE
acc.UserNo = 2000
)
This also makes it more clear which table each field belongs to (think about using the same table twice or more in the same query with different aliases).
If you would only like to limit the output to one row you can use TOP 1:
SELECT TOP 1
doc.*
FROM
Docs doc
INNER JOIN
FileUsers usr
ON
usr.FileNo = doc.FileNo
INNER JOIN
DocAccess acc
ON
acc.UserTypeWithAccess = usr.UserType
WHERE
doc.DocNo = 1000
AND acc.UserNo = 2000
Of course the second query works a bit different than the first one (both JOINS are INNER). Depeding on your data model you might even leave the TOP 1 out of that query.

NHibernate HQL subselect left join?

I have the following issue and I would really appreciate your help:
I have all my filtering done through the HQL and one of the filters is a left outer join which must be a sub-select. So my thinking here is that I can create a dummy business object which I would like to populate (with an SP if possible) with data and use it in my left join.
Now how can I do that and still have all of the logic in 1 HQL query?
I guess the biggest issue for me is understanding how can I have this BO (not actually mapped to a table etc.) be used in a query where all the other BOs are mapped to tables etc.
I am trying to avoid doing any filtering in the actual C# code.
Thanks!
Example:
Entity A - mapped to table A
Entity B - mapped to table B
Entity C - mapped to table C
Entity D - not mapped to table - coming from SQL query or SP
HQL:
from A pbo
inner join B.EntityType etype
inner join C.EntityAddressList eadr
left join D.Level lvl
You cannot join to arbitrary SQL in HQL. So I don't think what you're trying to do is possible. You'll likely either have to map whatever D is, or write your query in SQL using Session.CreateSQLQuery(). You can still specify any entities that come back. See this article for reference. http://www.nhforge.org/doc/nh/en/index.html#d0e10274

Speeding up a SQL query with generic data information

Due to a variety of design decisions, we have a table, 'CustomerVariable'. CustomerVariable has three bits of information--its own id, an id to Variable (a list of possible settings the customer can have), and the value for that variable. The Variable table, on the other hand, has the information on a default--in case the CustomerVariable is not set.
This works well in most situations, allowing us not to have to create an insanely long list of information--especially in a case where there are 16 similar variables that need to be handled for a customer.
The problem comes in trying to get this information into a select. So far, our 'best' solution involves far too many joins to be efficient--we get a list of the 16 VariableIds we need information on, setting them into variables, first. Later on, however, we have to do this:
CROSS JOIN dbo.Variable v01
LEFT JOIN dbo.CustomerVariable cv01 ON cv01.customerId = c.id
AND cv01.variableId = v01.id
CROSS JOIN dbo.Variable v02
LEFT JOIN dbo.CustomerVariable cv02 ON cv02.customerId = c.id
AND cv02.variableId = v02.id
-- snip --
CROSS JOIN dbo.Variable v16
LEFT JOIN dbo.CustomerVariable cv16 ON cv16.customerId = c.id
AND cv16.variableId = v16.id
WHERE
v01.id = #cv01VariableId
v02.id = #cv02VariableId
-- snip --
v16.id = #cv16VariableId
I know there has to be a better way, but we can't seem to find it amidst crunch time. Any help would be greatly appreciated.
If your data set is relatively small and not too volatile, you may want to use materialized views (assuming your database supports them) to optimize the lookup.
If materialized views are not an option, consider writing a stored procedure that retrieves that data in two passes:
First retrieve all of the CustomerVariables available for a particular customer (or set of customers)
Next, retrieve all of the default values from the Variables table
Perform a non-distinct union on the results merging the defaults in wherever a CustomerVariable record is missing.
Essentially, this is the equivalent of:
SELECT variableId,
CASE WHEN CV.variableId = NULL THEN VR.defaultValue ELSE CV.value END
FROM Variable VR
LEFT JOIN CUstomerVariable CV on CV.variableId = VR.variableId
WHERE CV.customerId = c.id
The type of query you want is called a pivot table or crosstab query.
By far the easiest way of dealing with this is to create a view based off of a crosstab query. This will flip the columns from being vertical to being horizontal like a regular sql table. Once that is done, just query the view. Easy. ;)

NHibernate HQL Join Not Returning All Required Rows

I am modifying an existing HQL query which returns individual columns rather than an object graph but now I am not getting all rows that I need.
Here a few facts about the current schema:
An Estimate belongs to a Contract.
The OwningDepartment property of a Contract can be null.
The ParentBusinessStream property of a Department cannot be null
This is the query:
select e.ID, e.StatusCode.ID, e.InputDate, e.ParentClient.Name, e.ParentContractLocation.ParentLocation.Description, e.Description, e.InternalRef, e.ExternalRef, e.TotalIncTax, e.TaxTotal, e.Closed, e.ViewedByClient, e.HelpdeskRef, e.ParentContract.Reference, d.ParentBusinessStream.Title, d.Name
from Estimate e, Department d where (e.ParentContract.ID in (select cs.ParentContract.ID from ContractStaff cs
where cs.ParentStaff.ID=:staffID)) and ((d.ID = e.ParentContract.OwningDepartment.ID) OR (d.ID is null)) order by e.ID
Unfortunately my query is not returning Estimates where the parent contract does not have an owning department. Instead I want the relevant fields to just be null. I tried a left outer join but got the same results.
Any help would be very much appreciated. Apologies if I've done something stupid.
Cheers,
James
i've found that unusual queries containing left outer joins are better off by using an ISQLQuery which gives you access to proper SQL syntax as well as some HQL power.
Besides that, you don't provide mapping files which usually are helpful
I think I have worked it out: d.ParentBusinessStream.Title is an implicit inner join but since d can be null it doesn't work correctly. I have changed my query to take this into account