Left outer joins in FetchXML with multiple conditions - sql

I'm trying to do a left outer join in FetchXML with multiple conditions.
Here is the approximate SQL of what I'm trying to achieve:
SELECT incident.*, externalCheck.*
FROM incident
LEFT OUTER JOIN externalCheck
ON externalCheck.incidentId = incident.incidentId
AND externalCheck.checkType = 1
AND externalCheck.isLatest = 1;
NB: This should always return a 1:1 relationship since our business logic requires that there is only one isLatest for each checkType.
And here is my FetchXML:
<entity name="incident">
  <all-attributes />
  <link-entity name="externalCheck" from="incidentId" to="incidentId" link-type="outer">
    <filter type="and">
      <condition attribute="checkType" operator="eq" value="1" />
      <condition attribute="isLatest" operator="eq" value="1" />
    </filter>
    <all-attributes />
  </link-entity>
</entity>
The Problem
The problem is that incident records where the right-hand side of the join is null (i.e. there is no externalCheck record) are not being returned, whereas in a left outer join I would expect the incident record to be returned even if the right-hand side of the join is null.
What I suspect is that FetchXML is converting my filter to a WHERE clause, rather than adding the conditions to the ON clause.
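The difference between the two placements can be reproduced in plain SQL. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (the sample rows are made up for illustration, and this says nothing about the SQL that FetchXML actually generates):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE incident (incidentId INTEGER PRIMARY KEY);
    CREATE TABLE externalCheck (
        incidentId INTEGER, checkType INTEGER, isLatest INTEGER
    );
    INSERT INTO incident VALUES (1), (2), (3);
    -- Hypothetical data: incident 1 has a matching check, incident 2 only
    -- has a check of another type, incident 3 has no check at all.
    INSERT INTO externalCheck VALUES (1, 1, 1), (2, 2, 1);
""")

# Conditions in the ON clause: every incident is kept.
on_rows = conn.execute("""
    SELECT i.incidentId, e.checkType
    FROM incident i
    LEFT OUTER JOIN externalCheck e
      ON e.incidentId = i.incidentId
     AND e.checkType = 1
     AND e.isLatest = 1
    ORDER BY i.incidentId
""").fetchall()

# Same conditions in the WHERE clause: rows with a null right-hand side
# are filtered out, so the join behaves like an inner join.
where_rows = conn.execute("""
    SELECT i.incidentId, e.checkType
    FROM incident i
    LEFT OUTER JOIN externalCheck e
      ON e.incidentId = i.incidentId
    WHERE e.checkType = 1
      AND e.isLatest = 1
    ORDER BY i.incidentId
""").fetchall()

print(on_rows)     # [(1, 1), (2, None), (3, None)]
print(where_rows)  # [(1, 1)]
```

The second result is the behaviour I am seeing from my FetchXML query.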
The Question
Can anyone confirm what is happening, and a possible solution?

Your suspicion is correct, but you can work around it to some extent.
FetchXML is flexible, and the snippet below gives the results of a left outer join with multiple conditions.
<entity name="incident">
  <all-attributes />
  <link-entity name="externalCheck" alias="ext" from="incidentId" to="incidentId" link-type="outer">
    <filter type="and">
      <condition attribute="checkType" operator="eq" value="1" />
      <condition attribute="isLatest" operator="eq" value="1" />
    </filter>
  </link-entity>
  <filter>
    <condition entityname="ext" attribute="externalCheckid" operator="null" />
  </filter>
</entity>
The real problem is with externalCheck.*: you cannot get the related entity's attributes this way, so you have to remove the <all-attributes /> from the link-entity node.

Related

How do I produce an XML response from PostgreSQL 1:n tables?

I have the table "client" with:
id  name     registered_on  status
--  -------  -------------  ------
1   Alice    2020-03-04     a
2   Vincent  2020-03-05     p
3   Anne     2020-03-06     a
And the table "account" with:
client_id  account_number  type  balance
---------  --------------  ----  -------
1          300-1           CHK   100
2          307-5           SAV   24
2          307-6           CHK   350
I created them in DB Fiddle (for a similar question I asked before about producing JSON).
Now, I need a SQL query to produce the 1:n XML document:
<client id="1" name="Alice" registered_on="2020-03-04" status="a">
<account account_number="300-1" type="CHK" balance="100" />
</client>
<client id="2" name="Vincent" registered_on="2020-03-05" status="p">
<account account_number="307-5" type="SAV" balance="24" />
<account account_number="307-6" type="CHK" balance="350" />
</client>
<client id="3" name="Anne" registered_on="2020-03-06" status="a" />
There's a 1:n relationship between the tables, and some clients may not have an account (such as "Anne"). The result is a simple join (probably an outer join) that I know how to do. I just don't get how to produce an XML document from it.
If it makes it easier/shorter, I'm open to an alternative XML result, as long as it represents the same data; using tags instead of attributes, for example.
After trying a bunch of options I was able to find the answer(s).
Original Format: With Attributes
It's possible to produce the XML result using an outer join:
select
  xmlserialize(content -- remove this line to keep as XML instead of VARCHAR
    xmlagg(r)
  as text) -- remove this line to keep as XML instead of VARCHAR
from (
  select
    xmlelement(name client,
      xmlattributes(c.id, c.name, c.registered_on, c.status),
      case when count(a.client_id) > 0 then
        xmlagg(xmlelement(name account,
          xmlattributes(a.account_number, a.type, a.balance) ))
      end
    ) as r
  from client c
  left join account a on a.client_id = c.id
  group by c.id
) s
Or using subqueries (shorter but less performant):
select
  xmlserialize(content -- remove this line to keep as XML instead of VARCHAR
    xmlagg(
      xmlelement(name client, xmlattributes(id, name, registered_on, status),
        ( select xmlagg(xmlelement(name account,
            xmlattributes(a.account_number, a.type, a.balance)
          )) from account a where a.client_id = c.id
        )
    ))
  as text) -- remove this line to keep as XML instead of VARCHAR
from client c;
Result:
<client id="1" name="Alice" registered_on="2020-03-04" status="a">
<account account_number="300-1" type="CHK" balance="100.00" />
</client>
<client id="2" name="Vincent" registered_on="2020-03-05" status="p">
<account account_number="307-5" type="SAV" balance="24.00" />
<account account_number="307-6" type="CHK" balance="350.00" />
</client>
<client id="3" name="Anne" registered_on="2020-03-06" status="a" />
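To make clear what the grouping in these queries does, here is a rough client-side equivalent: a sketch in Python that runs the same outer join and nests the accounts under each client with xml.etree.ElementTree. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and stores balance as text so that it prints as "100.00"; both are assumptions made only for this illustration, and it is not a replacement for the SQL/XML functions above.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT,
                         registered_on TEXT, status TEXT);
    -- balance kept as TEXT here purely to preserve the two decimals
    CREATE TABLE account (client_id INTEGER, account_number TEXT,
                          type TEXT, balance TEXT);
    INSERT INTO client VALUES
        (1, 'Alice', '2020-03-04', 'a'),
        (2, 'Vincent', '2020-03-05', 'p'),
        (3, 'Anne', '2020-03-06', 'a');
    INSERT INTO account VALUES
        (1, '300-1', 'CHK', '100.00'),
        (2, '307-5', 'SAV', '24.00'),
        (2, '307-6', 'CHK', '350.00');
""")

rows = conn.execute("""
    SELECT c.id, c.name, c.registered_on, c.status,
           a.account_number, a.type, a.balance
    FROM client c
    LEFT JOIN account a ON a.client_id = c.id
    ORDER BY c.id, a.account_number
""").fetchall()

clients = {}  # client id -> <client> element, in first-seen order
for cid, name, registered_on, status, acc_no, acc_type, balance in rows:
    if cid not in clients:
        clients[cid] = ET.Element("client", {
            "id": str(cid), "name": name,
            "registered_on": registered_on, "status": status,
        })
    # The outer join yields NULL account columns for clients without accounts.
    if acc_no is not None:
        ET.SubElement(clients[cid], "account", {
            "account_number": acc_no, "type": acc_type, "balance": balance,
        })

xml_text = "\n".join(ET.tostring(el, encoding="unicode")
                     for el in clients.values())
print(xml_text)
```

The check on acc_no plays the same role as the case when count(a.client_id) > 0 guard in the first query: a client without accounts gets no child elements.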
Alternative Format: Without Attributes
Some people prefer to avoid attributes altogether and always use tags. That can also be done, using:
select
  xmlserialize(content -- remove this line to keep as XML instead of VARCHAR
    xmlagg(xmlelement(name client,
      xmlforest(id, name, registered_on, status),
      ( select xmlagg(xmlelement(name account,
          xmlforest(a.account_number, a.type, a.balance)))
        from account a where a.client_id = c.id
      )
    ))
  as text) -- remove this line to keep as XML instead of VARCHAR
from client c;
Result:
<client>
<id>1</id>
<name>Alice</name>
<registered_on>2020-03-04</registered_on>
<status>a</status>
<account>
<account_number>300-1</account_number>
<type>CHK</type>
<balance>100.00</balance>
</account>
</client>
<client>
<id>2</id>
<name>Vincent</name>
<registered_on>2020-03-05</registered_on>
<status>p</status>
<account>
<account_number>307-5</account_number>
<type>SAV</type>
<balance>24.00</balance>
</account>
<account>
<account_number>307-6</account_number>
<type>CHK</type>
<balance>350.00</balance>
</account>
</client>
<client>
<id>3</id>
<name>Anne</name>
<registered_on>2020-03-06</registered_on>
<status>a</status>
</client>

Pentaho Mondrian MDX - removing parent titles

I am new to Pentaho, Mondrian & MDX. I started to use Pentaho CE 5.0.1 as my OLAP tool. I am struggling with an MDX query and hoping someone can give me some pointers on my issue.
I have the following time dimension in my Mondrian schema:
<!-- date dimension -->
<Dimension name="Time" type="TimeDimension">
  <!-- Year, Quarter, Month, Week, Day -->
  <Hierarchy name="YQMD" hasAll="true" allMemberName="All Dates" primaryKey="date_key" type="TimeDimension">
    <Table name="dim_date" schema="webportal"/>
    <Level name="Year" uniqueMembers="true" column="year4" levelType="TimeYears" type="Numeric"></Level>
    <Level name="Quarter" uniqueMembers="false" column="quarter_name" levelType="TimeQuarters" type="String"></Level>
    <Level name="Month" uniqueMembers="false" column="month_number" ordinalColumn="month_number" nameColumn="month_number" levelType="TimeMonths" type="Numeric"></Level>
    <Level name="Week" column="week_in_month" uniqueMembers="false" levelType="TimeWeeks"></Level>
    <Level name="Day" column="day_in_month" uniqueMembers="false" ordinalColumn="day_in_month" nameColumn="day_in_month" levelType="TimeDays" type="Numeric"></Level>
  </Hierarchy>
</Dimension>
I am running this MDX query:
WITH
MEMBER [Measures].[YTD] AS 'SUM(YTD(), [Measures].[Quotation Status])'
SELECT
NON EMPTY {[Measures].[Quotation Status], [Measures].[YTD]} ON COLUMNS,
NON EMPTY {
{[Time.YQMD].[Year].Members},
[Time.YQMD].[Month].Members
} ON ROWS
FROM [Broker Portal]
The MDX query works and returns the following data:
My questions are:
How do I remove the first two lines containing 2015 and 2016 in MDX?
How do I remove the lines for months 03 - 12 for the year 2016?
I suspect something like the following might be heading in the right direction:
WITH
MEMBER [Measures].[YTD] AS
Sum
(
YTD()
,[Measures].[Quotation Status]
)
SELECT
NON EMPTY
{
[Measures].[Quotation Status]
,[Measures].[YTD]
} ON COLUMNS
,NON EMPTY
[Time.YQMD].[Year].MEMBERS
* [Time.YQMD].[Month].[Month].MEMBERS HAVING
[Measures].[Quotation Status] > 0 ON ROWS
FROM [Broker Portal];

PL/SQL Query Optimization

I have this query that executes 100,000s of times. It currently runs pretty fast. I was just wondering if there was a better way to run it for even faster response times.
CODES TABLE = 160KB
INDEXES: INSTANCE (UNIQUE), SHORT_DESC
CODE_VALUES=10MB
INDEXES: INSTANCE(UNIQUE), INTFC_INST, CODE_INST,SHORT_DESC
INTERFACES=160KB
INDEXES: INSTANCE (UNIQUE), SHORT_DESC
id="0" operation="SELECT STATEMENT" optimizer="ALL_ROWS" search_columns="0" cost="7">
id="1" operation="NESTED LOOPS" search_columns="0" cost="7" cardinality="1" bytes="102" cpu_cost="54,820" io_cost="7" qblock_name="SEL$1" time="1">
id="2" operation="MERGE JOIN" option="CARTESIAN" search_columns="0" cost="3" cardinality="1" bytes="33" cpu_cost="23,764" io_cost="3" time="1">
object_ID="0" id="3" operation="TABLE ACCESS" option="BY INDEX ROWID" object_name="CODES" object_type="TABLE" search_columns="0" cost="2" cardinality="1" bytes="19" cpu_cost="15,443" io_cost="2" qblock_name="SEL$1" time="1">
object_ID="1" id="4" operation="INDEX" option="RANGE SCAN" object_name="CODES_SHORT_DESC_FINDX" object_type="INDEX" search_columns="1" cost="1" cardinality="1" cpu_cost="8,171" io_cost="1" qblock_name="SEL$1" access_predicates=""A"."SYS_NC00010$"='MANAGER_GROUP'" time="1"/>
id="5" operation="BUFFER" option="SORT" search_columns="0" cost="1" cardinality="1" bytes="14" cpu_cost="8,321" io_cost="1" time="1">
object_ID="2" id="6" operation="TABLE ACCESS" option="BY INDEX ROWID" object_name="INTERFACES" object_type="TABLE" search_columns="0" cost="1" cardinality="1" bytes="14" cpu_cost="8,321" io_cost="1" qblock_name="SEL$1" time="1">
object_ID="3" id="7" operation="INDEX" option="RANGE SCAN" object_name="INTERFACES_SHORT_DESC_FINDX" object_type="INDEX" search_columns="1" cost="0" cardinality="1" cpu_cost="1,050" io_cost="0" qblock_name="SEL$1" access_predicates=""C"."SYS_NC00007>
object_ID="4" id="8" operation="TABLE ACCESS" option="BY INDEX ROWID" object_name="CODE_VALUES" object_type="TABLE" search_columns="0" cost="4" cardinality="1" bytes="69" cpu_cost="31,056" io_cost="4" qblock_name="SEL$1" filter_predicates="("A"."INSTANCE"="B"."CODE_INST" AND "B"."INTFC_INST"="C"."INSTANCE")" time="1">
object_ID="5" id="9" operation="INDEX" option="RANGE SCAN" object_name="CODE_VALUES_FUN_IDX" object_type="INDEX" search_columns="1" cost="1" cardinality="4" cpu_cost="8,771" io_cost="1" qblock_name="SEL$1" access_predicates=""B"."SYS_NC00010$"='150'" time="1"/>
SELECT A.INSTANCE, C.INSTANCE, B.LONG_DESC
FROM CODES A,
CODE_VALUES B,
INTERFACES C
WHERE A.INSTANCE = B.CODE_INST
AND B.INTFC_INST = C.INSTANCE
AND TRIM (A.SHORT_DESC) = TRIM (var1)
AND TRIM (B.SHORT_DESC) = TRIM (var2)
AND TRIM (C.SHORT_DESC) = TRIM (var3)
Avoid TRIM functions in WHERE and JOIN clauses, such as TRIM (A.SHORT_DESC) = TRIM (var1).
Just creating indexes on the JOIN, WHERE and GROUP clause columns doesn't mean that your query will always return the required results quickly.
It is the query optimizer that selects the proper index for a query to give you optimum performance, but it can only produce an optimum query plan with the proper indexes when you help it by writing good query syntax.
Using any type of function (system or user-defined) in a WHERE or JOIN clause can dramatically decrease query performance, because it gets in the way of the optimizer's index selection.
One common example is the TRIM functions, which developers often use in the WHERE clause.
USE AdventureWorks
GO
SELECT pr.ProductID,pr.Name,pr.ProductNumber,wo.* FROM Production.WorkOrder wo
INNER JOIN Production.Product pr
ON pr.ProductID = wo.ProductID
WHERE LTRIM(RTRIM(pr.name)) = 'HL Mountain Handlebars'
GO
SELECT pr.ProductID,pr.Name,pr.ProductNumber,wo.* FROM Production.WorkOrder wo
INNER JOIN Production.Product pr
ON pr.ProductID = wo.ProductID
WHERE pr.name = 'HL Mountain Handlebars'
Though the outputs of both queries are the same, the first query took almost 99% of the total execution time. This huge difference is caused solely by the trim functions, so on production databases we must avoid TRIM and other functions in both JOIN and WHERE clauses.
Taken from AASIM ABDULLAH's blog.
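The effect described in the quote can be observed on a small scale. The sketch below uses Python's built-in sqlite3 module rather than Oracle or SQL Server, so the planner and its output format differ from yours; the table and index names are simplified stand-ins for your CODES table. It only illustrates the general principle that wrapping an indexed column in a function stops a plain index on that column from being used for the lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE codes (instance INTEGER PRIMARY KEY,
                        short_desc TEXT, long_desc TEXT);
    CREATE INDEX codes_short_desc_idx ON codes (short_desc);
""")
conn.executemany(
    "INSERT INTO codes (short_desc, long_desc) VALUES (?, ?)",
    [(f"CODE_{i}", f"Description {i}") for i in range(1000)],
)

def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("CODE_42",))
    return " | ".join(row[3] for row in rows)

# Function applied to the column: the plain index cannot be used for lookup.
wrapped = plan("SELECT long_desc FROM codes WHERE TRIM(short_desc) = ?")
# Bare column: the planner can search the index.
plain = plan("SELECT long_desc FROM codes WHERE short_desc = ?")

print(wrapped)
print(plain)
```

On SQLite the first plan is a full scan of codes, while the second is a search using codes_short_desc_idx.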
So what you could/should do is run an update on your data to trim it once and for all, and start trimming it as it is being added to the table, so that no new data will ever require trimming.
Or, if that is not possible for some reason, look into function-based indexes as suggested by Maurice Reeves in the comments.
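As a rough illustration of the second option, here is a sketch using SQLite's indexes on expressions, which play a similar role to function-based indexes. It runs on Python's built-in sqlite3 module (SQLite 3.9 or later is assumed), and the table and index names are hypothetical; the Oracle syntax and planner behaviour will differ, so treat it as a demonstration of the idea only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE codes (instance INTEGER PRIMARY KEY,
                        short_desc TEXT, long_desc TEXT);
    -- Index the trimmed value itself, so TRIM(short_desc) = ? can use it.
    CREATE INDEX codes_short_desc_trim_idx ON codes (TRIM(short_desc));
""")
conn.executemany(
    "INSERT INTO codes (short_desc, long_desc) VALUES (?, ?)",
    # Values are padded with spaces to mimic untrimmed data.
    [(f"  CODE_{i}  ", f"Description {i}") for i in range(1000)],
)

query = "SELECT long_desc FROM codes WHERE TRIM(short_desc) = ?"

plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("CODE_42",))
trim_plan = " | ".join(row[3] for row in plan_rows)
result = conn.execute(query, ("CODE_42",)).fetchall()

print(trim_plan)
print(result)  # [('Description 42',)]
```

The query still contains TRIM, but the lookup now goes through codes_short_desc_trim_idx instead of scanning the table.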

Pentaho Mondrian using aggregate tables

I have an XML schema description for Pentaho Mondrian. It looks like this:
<Dimension foreignKey="dt" name="dt" type="TimeDimension">
<Hierarchy allMemberName="All" hasAll="true" name="Hierarchy" primaryKey="dt" visible="true">
<View alias="dt_view">
<SQL dialect="generic">select distinct "dt",date_part('year', "dt")::integer AS year, date_part('month', "dt")::integer AS month, date_part('day', "dt")::integer AS day from "world_steel_production"."world_steel_production_data"
</SQL>
</View>
<Level captionColumn="year" column="year" hideMemberIf="Never" levelType="TimeYears" name="Years" type="Integer" uniqueMembers="false"/>
<Level column="month" formatter="capsidea.MemberMonthFormatter" hideMemberIf="Never" levelType="TimeMonths" name="Month" type="Integer" uniqueMembers="false"/>
</Hierarchy>
</Dimension>
<Dimension foreignKey="obj" name="Index">
<Hierarchy allMemberName="All" hasAll="true" name="name_ru" primaryKey="key" visible="true">
<Table name="world_steel_production_dict_obj" schema="world_steel_production"/>
<Level column="key" nameColumn="name_ru" parentColumn="parent_key" hideMemberIf="Never" levelType="Regular" name="Level" type="Integer" uniqueMembers="true"/>
</Hierarchy>
</Dimension>
<Measure aggregator="sum" column="vl" name="Value" visible="true"/>
</Cube>
Our time dimension <Dimension foreignKey="dt" name="dt" type="TimeDimension"> contains two levels: "year" and "month".
When I choose the "year" level, Mondrian aggregates the data by year.
That seems fine, but the table world_steel_production_data holds two kinds of dynamics, distinguished by the column dl (1 = year dynamics, 4 = month dynamics).
So when I aggregate at the year level, a year such as 1980 mixes data from both the year and the month dynamics.
I've read (http://mondrian.pentaho.com/documentation/aggregate_tables.php) that Pentaho can use aggregate tables, and I want to use them to separate my month and year dynamics.
I've created two views for this purpose:
create or replace view world_steel_production.world_steel_production_data_view_year
as
select *
from world_steel_production.world_steel_production_data
where dl = 1
and
create or replace view world_steel_production.world_steel_production_data_view_month
as
select *
from world_steel_production.world_steel_production_data
where dl = 4
But now I'm wondering how to tell Pentaho, in my XML schema definition, to use the first view for the year dynamics and the second one for the month dynamics.
Or maybe there is another way to separate the year and month dynamics?
I think I've found the solution.
Mondrian has a feature called closure tables (http://mondrian.pentaho.com/documentation/schema.php#Closure_tables ). In such a table you can define how to aggregate within your hierarchy dimension.
What I've done in my situation:
I've created and filled a hierarchy table for dates and linked this table to the data table in my schema.
Next, I've created and filled a closure table.
As you can see, I've filled the closure table as my world_steel_production_time_hierarchy has no hierarchy at all (time_id = parent_time_id).
And finally, I've changed the XML data definition.

Mondrian Rollup Total with a dimension

I want Mondrian to roll up according to my dimension table. For example:
fact_table(id int, dim_id)
1 2
2 2
3 1
4 3
dimension_table( dim_id, value)
1 first_value
2 second_value
The last value in fact_table is deliberately not in the list of dimensions.
<Cube name="BYT">
<Dimension type="StandardDimension" visible="true"
highCardinality="false" name="d1">
<Hierarchy name="dimension_table" visible="true" hasAll="true">
<Table name="dimension_table">
</Table>
<Level name="dimension" visible="true" column="dim_id"
type="String" uniqueMembers="false" levelType="Regular"
hideMemberIf="Never">
</Level>
</Hierarchy>
</Dimension>
<Measure name="m1" column="id" datatype="Integer"
aggregator="count" visible="true">
</Measure>
</Cube>
When I run the following query:
select [Measure].[m1] on columns, {([d1])} on rows from [BYT]
The result is
- Dimension Measure
- All D1 4
What I would like Mondrian to do is something like this:
- Dimension Measure
- All D1 3
i.e. count only the fact rows whose values are included in the dimension table.
Please suggest a solution. I read through the rollup policy in the documentation but that does not help my case.
If I understand you correctly, there is a mistake in your cube XML.
You should use distinct-count instead of count.
<Measure name="m1" column="id" datatype="Integer"
aggregator="distinct-count" visible="true">
</Measure>
Hope this helps.
You need to use a NON EMPTY context, which will result in a join with your fact table.
select [Measure].[m1] on columns, NON EMPTY {([d1])} on rows from [BYT]
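To see why a join with the dimension table changes the total, here is the same data in plain SQL, run through Python's built-in sqlite3 module. This is only a sketch of the counting logic on the example tables from the question; it is not meant to show the SQL that Mondrian actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_table (id INTEGER, dim_id INTEGER);
    CREATE TABLE dimension_table (dim_id INTEGER, value TEXT);
    INSERT INTO fact_table VALUES (1, 2), (2, 2), (3, 1), (4, 3);
    INSERT INTO dimension_table VALUES (1, 'first_value'), (2, 'second_value');
""")

# Counting the fact table alone includes the row with dim_id = 3.
total_all = conn.execute(
    "SELECT COUNT(id) FROM fact_table"
).fetchone()[0]

# Joining to the dimension table drops fact rows without a dimension member.
total_joined = conn.execute("""
    SELECT COUNT(f.id)
    FROM fact_table f
    JOIN dimension_table d ON d.dim_id = f.dim_id
""").fetchone()[0]

print(total_all)     # 4
print(total_joined)  # 3
```

The first number is what the "All D1" total shows without the join, the second is the total asked for in the question.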