Pentaho parametrized datasource

I am trying to build a parametrized data source (an SQL query over JNDI).
The query of my data source is:
SELECT ${param_interval}(dim_date.date), count(docs_fact.id) as docs_count
FROM rel_docs_dates
left join docs_fact on rel_docs_dates.doc_id = docs_fact.id
left join dim_date on rel_docs_dates.date_id = dim_date.id
The parameter ${param_interval} can take two values, MONTH and DAY, and I checked that it receives the correct values.
But when I try to preview my dashboard, I get the warning "error processing component".
Note that this query (see below) works fine.
SELECT MONTH(dim_date.date), count(docs_fact.id) as docs_count, ${param_interval} as tmp_fiel
FROM rel_docs_dates
left join docs_fact on rel_docs_dates.doc_id = docs_fact.id
left join dim_date on rel_docs_dates.date_id = dim_date.id
Can somebody tell me where the mistake is? Or is this way of using parameters in a data source perhaps not supported?

Finally I found a solution. It isn't what I would have liked, but it works, and that is the most important thing.
I rewrote my query with a CASE construct and, importantly, changed the type of my parameter from string to numeric (string doesn't work :( ). Now my query looks like this:
SELECT
case ${param_interval}
when 1 then MONTH(dim_date.date)
when 2 then DAY(dim_date.date)
end
,count(docs_fact.id) as fact_count
FROM rel_docs_dates
left join docs_fact on rel_docs_dates.doc_id = docs_fact.id
left join dim_date on rel_docs_dates.date_id = dim_date.id
where dim_date.date > LAST_DAY(DATE_SUB(CURDATE(), INTERVAL ${param_period} MONTH))
AND dim_date.date < LAST_DAY(DATE_SUB(CURDATE(), INTERVAL 0 MONTH))
group by
case ${param_interval}
when 1 then MONTH(dim_date.date)
when 2 then DAY(dim_date.date)
end
order by YEAR(dim_date.date), MONTH(dim_date.date)
Maybe it will help somebody else.
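A possible explanation for the behaviour above, assuming the data source passes ${param_interval} to the database as a bound prepared-statement value (an assumption, not something confirmed in this thread): a bound value can stand in for a literal, but not for a function name such as MONTH or DAY. The sketch below uses Python's sqlite3 as a stand-in for the real JNDI connection; the table contents and the strftime calls are illustrative only.

```python
import sqlite3

# In-memory stand-in for the real database (assumption: any engine that
# supports bound parameters behaves similarly for this purpose).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_date (d TEXT)")
conn.execute("INSERT INTO dim_date VALUES ('2013-10-16')")

# 1) A bound parameter in the position of a function name is a syntax error.
try:
    conn.execute("SELECT ?(d) FROM dim_date", ("upper",))
    placeholder_as_function = "accepted"
except sqlite3.OperationalError:
    placeholder_as_function = "rejected"

# 2) A bound parameter used as a value inside CASE is fine, which mirrors
#    the numeric-parameter workaround from the post.
query = """
    SELECT CASE ?
             WHEN 1 THEN CAST(strftime('%m', d) AS INTEGER)  -- month
             WHEN 2 THEN CAST(strftime('%d', d) AS INTEGER)  -- day of month
           END
    FROM dim_date
"""
month = conn.execute(query, (1,)).fetchone()[0]
day = conn.execute(query, (2,)).fetchone()[0]

print(placeholder_as_function, month, day)
```

If the parameter really is bound this way, it would also explain why the second query in the question worked: there the parameter sits in the select list as a plain value.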

Related

Error --> Can someone explain (to a newbie) how to join outside of a sub-query within a select statement?

Code:
select
cal.fiscal_week,
h.role_type,
h.se,
sub.*
from (
select
ds_denali__call_disposition__c as disposition,
datepart(hour, dateadd(hour, -5, ds_denali__call_started__c)) as call_time,
datepart(dw, ds_denali__call_started__c) as day_of_week,
ds_denali__owner__c as owner
from homer.sfdc.ds_denali__dialsource_action__c
where ds_denali__call_disposition__c in ('No Contact', 'Voicemail', 'Internal', 'Progression - Decs Mkr', 'Progression - Other', 'No Prog - Decs Mkr', 'Decision Positive', 'Decision Negative', 'No Prog - Other')
and trunc(ds_denali__call_started__c) >= '2018-11-01') sub
left join homer.homer_mapping.insidesales_hierarchy h on ds.ds_denali__owner__c = h.sfdc_id
left join homer.homer_mapping.adp_fiscal_calendar cal on trunc(ds.ds_denali__call_started__c) = cal.day_date
I need to join everything, but homer.sfdc.ds_denali__dialsource_action__c is a monster table, so I'm trying to cut down as much time as I can by sub-querying. But I still need to join it to two other tables we have (listed in the left joins). The error I'm running into is that it keeps telling me that 'ds.' doesn't exist. I'm unsure whether this is a foundational piece I'm not familiar with, since I'm self-taught and pretty new. Any and all help is appreciated.
You don't seem to understand aliasing. I presume the column ds.ds_denali__owner__c is meant to reference the column of the homer.sfdc.ds_denali__dialsource_action__c table in your subquery. At the moment the query is looking for a table called ds in the default database/schema, hence the error you are seeing.
Next, you don't seem to understand subqueries properly. When you write a subquery you are essentially creating a new table that can then be used elsewhere in your query. In this case the table you made is aliased (named) sub and contains 4 columns (disposition, call_time, day_of_week and owner).
Putting these together, this
left join homer.homer_mapping.insidesales_hierarchy h on ds.ds_denali__owner__c = h.sfdc_id
should become
left join homer.homer_mapping.insidesales_hierarchy h on sub.owner = h.sfdc_id
and to fix
left join homer.homer_mapping.adp_fiscal_calendar cal on trunc(ds.ds_denali__call_started__c) = cal.day_date
you would first need to add ds_denali__call_started__c to your subquery's select list, and then reference it as sub.ds_denali__call_started__c in the join condition.
As a final point, it would surprise me greatly if your subquery actually sped things up. I would fully expect it to slow things down compared to joining directly to homer.sfdc.ds_denali__dialsource_action__c.
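A minimal sketch of the aliasing rule described in this answer, using Python's sqlite3 with small hypothetical tables (actions, hierarchy, calendar) in place of the real homer tables. It shows that an alias used inside the subquery is not visible outside it, while the subquery's own alias and its exposed column names are.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-ins for the tables in the question.
conn.executescript("""
    CREATE TABLE actions (owner_c TEXT, disposition_c TEXT, started_c TEXT);
    CREATE TABLE hierarchy (sfdc_id TEXT, role_type TEXT);
    CREATE TABLE calendar (day_date TEXT, fiscal_week INTEGER);
    INSERT INTO actions VALUES ('u1', 'Voicemail', '2018-11-05 14:30:00');
    INSERT INTO actions VALUES ('u2', 'No Contact', '2018-10-01 09:00:00');
    INSERT INTO hierarchy VALUES ('u1', 'SDR');
    INSERT INTO calendar VALUES ('2018-11-05', 45);
""")

# Referencing the inner alias (ds) from outside the subquery fails.
try:
    conn.execute("""
        SELECT h.role_type
        FROM (SELECT owner_c AS owner FROM actions ds) sub
        LEFT JOIN hierarchy h ON ds.owner_c = h.sfdc_id
    """)
    inner_alias_visible = True
except sqlite3.OperationalError:
    inner_alias_visible = False

# Joining on the subquery alias and the columns it exposes works.
# The call date is added to the subquery so the calendar join can use it.
rows = conn.execute("""
    SELECT cal.fiscal_week, h.role_type, sub.disposition, sub.owner
    FROM (
        SELECT disposition_c AS disposition,
               owner_c AS owner,
               date(started_c) AS call_date
        FROM actions
        WHERE date(started_c) >= '2018-11-01'
    ) sub
    LEFT JOIN hierarchy h ON sub.owner = h.sfdc_id
    LEFT JOIN calendar cal ON sub.call_date = cal.day_date
""").fetchall()

print(inner_alias_visible, rows)
```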

Date filter in hive while doing left outer join

I am building a query in Hive; the query is given below.
Select * from CSS407
LEFT OUTER JOIN PROD_CORE.SERV_ACCT_ISVC_LINK SASP
ON CSS407.TABLE_ABBRV_CODE = 'SACT'
AND CSS407.EVENT_ITEM_REF_NUM = SASP.Serv_Acct_Id
AND to_date(CSS407.EVENT_RTS_VAL) >= SASP.Acct_Serv_Pnt_Strt_Dt
AND to_date(CSS407.EVENT_RTS_VAL) < SASP.Acct_Serv_Pnt_End_Dt
LEFT OUTER JOIN PROD_CORE.CUST_ACCT_SA_LINK ASA
ON CSS407.TABLE_ABBRV_CODE = 'SACT'
AND CSS407.EVENT_ITEM_REF_NUM = ASA.Serv_Acct_Id
AND CSS407.EVENT_RTS_VAL_UTC_DTTM >= ASA.Acct_Relt_Strt_Dttm
AND CSS407.EVENT_RTS_VAL_UTC_DTTM < ASA.Acct_Relt_End_Dttm
LEFT OUTER JOIN PROD_CORE.CUST_SA_LINK ASAT
ON CSS407.TABLE_ABBRV_CODE = 'TACT'
AND CSS407.EVENT_ITEM_REF_NUM = ASAT.Serv_Acct_Id
AND CSS407.EVENT_RTS_VAL_UTC_DTTM >= ASAT.Acct_Relt_Strt_Dttm
AND CSS407.EVENT_RTS_VAL_UTC_DTTM < ASAT.Acct_Relt_End_Dttm
When I execute the above query in Hive, I get the following error:
"Both left and right aliases encountered in JOIN 'SASP'"
On further investigation I found that we cannot use a date range filter in the join's ON condition. Every post I found suggests moving that filter into the WHERE clause.
But in our case, if we move the date range filter to the WHERE clause we do not get any data, because the left outer join is no longer satisfied.
I get this issue only when executing in Hive; the query works fine in Teradata and Oracle.
Please help.
Only equality (=) conditions work in a join condition in Hive. Move the inequality comparisons (>= and <) to the WHERE clause.
I had a similar issue earlier. Please check the thread below.
Hive Select MAX() in Join Condition
Hope this helps.
There might be some common column between CSS407 and SERV_ACCT_ISVC_LINK which might be creating this error.
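One way to reconcile the two points in this thread (equality-only join conditions, but no rows lost from the left side) is to apply the range filter inside a derived table and then left join the base table to that result on a key. The sketch below uses Python's sqlite3 with hypothetical tables events and links; it assumes the left table has a unique row identifier (id), which the original CSS407 table may or may not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins: events plays the role of CSS407, links of SASP.
conn.executescript("""
    CREATE TABLE events (id INTEGER, code TEXT, ref INTEGER, ts TEXT);
    CREATE TABLE links (serv_acct_id INTEGER, start_dt TEXT, end_dt TEXT, label TEXT);
    INSERT INTO events VALUES (1, 'SACT', 100, '2020-01-15');
    INSERT INTO events VALUES (2, 'SACT', 100, '2021-06-01');
    INSERT INTO events VALUES (3, 'TACT', 200, '2020-01-15');
    INSERT INTO links VALUES (100, '2020-01-01', '2020-12-31', 'A');
""")

# Naive move of the range filter to WHERE: unmatched left rows disappear,
# so the LEFT JOIN effectively behaves like an inner join.
naive = conn.execute("""
    SELECT e.id, l.label
    FROM events e
    LEFT JOIN links l ON e.ref = l.serv_acct_id
    WHERE e.code = 'SACT' AND e.ts >= l.start_dt AND e.ts < l.end_dt
    ORDER BY e.id
""").fetchall()

# Range filter applied inside a derived table; the outer join then uses
# equality only, and every left row survives.
kept = conn.execute("""
    SELECT e.id, m.label
    FROM events e
    LEFT JOIN (
        SELECT e2.id AS event_id, l.label
        FROM events e2
        JOIN links l ON e2.ref = l.serv_acct_id
        WHERE e2.code = 'SACT'
          AND e2.ts >= l.start_dt
          AND e2.ts < l.end_dt
    ) m ON e.id = m.event_id
    ORDER BY e.id
""").fetchall()

print(naive)
print(kept)
```

The cost is that the left table is read twice, so this is a trade-off rather than a free fix.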

Join query returns different results

Please help...
I'm trying to get traffic data for my site www.mentallica.co.il using SQL.
The problem is that my left join query returns a different value when I use two or more joins.
SELECT ISNULL(SUM([TrafficDay].UniqueTraffic), 0) AS TrafficDay,
[Topics].topicNumber
FROM [Topics]
LEFT JOIN [Traffic] AS TrafficDay
ON ([TrafficDay].Date >= '10/16/2013'
AND [TrafficDay].Date <= '10/16/2013')
AND [TrafficDay].TopicNumber = [Topics].TopicNumber
GROUP BY [Topics].TopicNumber
This returns the real TrafficDay value.
The problem query is:
SELECT ISNULL(SUM([TrafficDay].UniqueTraffic), 0) AS TrafficDay,
ISNULL(SUM([TrafficWeek].UniqueTraffic), 0) AS TrafficWeek,
[Topics].topicNumber
FROM [Topics]
LEFT JOIN [Traffic] AS TrafficDay
ON ([TrafficDay].Date >= '10/16/2013'
AND [TrafficDay].Date <= '10/16/2013')
AND [TrafficDay].TopicNumber = [Topics].TopicNumber
LEFT JOIN [Traffic] AS TrafficWeek
ON ([TrafficWeek].Date > '10/09/2013'
AND [TrafficWeek].Date < '10/16/2013')
AND [TrafficWeek].TopicNumber = [Topics].TopicNumber
GROUP BY [Topics].TopicNumber
This returns a different (much bigger) value for TrafficDay.
Why is that?
How can I get the real values for TrafficDay and TrafficWeek?
Any solution would be great!
How do I join the same table based on other table values...
Have you tried temporarily creating a new view by right-clicking on any view and selecting 'New View', then copying and pasting your query into the box below? From there you will have something like an Access view of your query and may be able to better visualize and change your joins, i.e. the arrows between tables. Just a suggestion. Works for me sometimes. Good luck.
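The much bigger TrafficDay value in the question is consistent with join fan-out: when a topic has one matching day row and several matching week rows, the two left joins produce one output row per combination, so the day value is summed once per week row. A minimal sketch with Python's sqlite3 and made-up numbers (dates written in ISO format so that string comparison works in SQLite), followed by one possible fix using a single join with conditional aggregation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data: topic 1 has one day row and three week rows,
# topic 2 has no traffic at all.
conn.executescript("""
    CREATE TABLE Topics (TopicNumber INTEGER);
    CREATE TABLE Traffic (TopicNumber INTEGER, Date TEXT, UniqueTraffic INTEGER);
    INSERT INTO Topics VALUES (1), (2);
    INSERT INTO Traffic VALUES (1, '2013-10-16', 5);
    INSERT INTO Traffic VALUES (1, '2013-10-10', 1);
    INSERT INTO Traffic VALUES (1, '2013-10-11', 2);
    INSERT INTO Traffic VALUES (1, '2013-10-12', 3);
""")

# Same shape as the problem query: the day row is repeated for each week row.
naive = conn.execute("""
    SELECT t.TopicNumber,
           COALESCE(SUM(d.UniqueTraffic), 0) AS TrafficDay,
           COALESCE(SUM(w.UniqueTraffic), 0) AS TrafficWeek
    FROM Topics t
    LEFT JOIN Traffic d
           ON d.Date >= '2013-10-16' AND d.Date <= '2013-10-16'
          AND d.TopicNumber = t.TopicNumber
    LEFT JOIN Traffic w
           ON w.Date > '2013-10-09' AND w.Date < '2013-10-16'
          AND w.TopicNumber = t.TopicNumber
    GROUP BY t.TopicNumber
    ORDER BY t.TopicNumber
""").fetchall()

# One join over the whole date range; CASE routes each row to one sum.
fixed = conn.execute("""
    SELECT t.TopicNumber,
           COALESCE(SUM(CASE WHEN tr.Date = '2013-10-16'
                             THEN tr.UniqueTraffic END), 0) AS TrafficDay,
           COALESCE(SUM(CASE WHEN tr.Date > '2013-10-09' AND tr.Date < '2013-10-16'
                             THEN tr.UniqueTraffic END), 0) AS TrafficWeek
    FROM Topics t
    LEFT JOIN Traffic tr
           ON tr.TopicNumber = t.TopicNumber
          AND tr.Date > '2013-10-09' AND tr.Date <= '2013-10-16'
    GROUP BY t.TopicNumber
    ORDER BY t.TopicNumber
""").fetchall()

print(naive)
print(fixed)
```

In SQL Server the same idea would use ISNULL instead of COALESCE if preferred; pre-aggregating each period in its own subquery before joining is another option.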

MS Access SQL: Troubles combining UNION ALL with a LEFT JOIN

I have created a query in MS Access to simulate a FULL OUTER JOIN and combine the results that looks something like the following:
SELECT NZ(estimates.employee_id, actuals.employee_id) AS employee_id
, NZ(estimates.a_date, actuals.a_date) AS a_date
, estimates.estimated_hours
, actuals.actual_hours
FROM (SELECT *
FROM estimates
LEFT JOIN actuals ON estimates.employee_id = actuals.employee_id
AND estimates.a_date = actuals.a_date
UNION ALL
SELECT *
FROM estimates
RIGHT JOIN actuals ON estimates.employee_id = actuals.employee_id
AND estimates.a_date = actuals.a_date
WHERE estimates.employee_id IS NULL
OR estimates.a_date IS NULL) AS qFullJoinEstimatesActuals
I have saved this query as an object (let's call it qJoinedTable). My objective is to LEFT JOIN qJoinedTable with another table. Something like the following:
SELECT *
FROM qJoinedTable
LEFT JOIN (SELECT *
FROM labor_rates) AS rates
ON qJoinedTable.employee_id = rates.employee_id
AND qJoinedTable.a_date BETWEEN rates.begin_date AND rates.end_date
MS Access accepts the syntax and runs the query, but it omits results that are clearly within the result set. Wondering if the date format was somehow lost, I placed a FORMAT around the begin_date and end_date to force them to be interpreted as Short Dates. Oddly, this produced a different result set, but it still omitted results that it shouldn't have.
I am wondering if the queries are performed in such a way that you can't LEFT JOIN the result set of a UNION ALL. Does anyone have any thoughts/ideas on this? Is there a better way of accomplishing the end goal?
I would try breaking each part of the query into its own Access query object, e.g.
SELECT *
FROM estimates
LEFT JOIN actuals ON estimates.employee_id = actuals.employee_id
AND estimates.a_date = actuals.a_date
Would be qryOne
SELECT *
FROM estimates
RIGHT JOIN actuals ON estimates.employee_id = actuals.employee_id
AND estimates.a_date = actuals.a_date
WHERE estimates.employee_id IS NULL
OR estimates.a_date IS NULL
Would be qryTwo
SELECT * FROM qryOne
UNION ALL
SELECT * FROM qryTwo
Would be qryFullJoinEstimatesActuals, and finally
SELECT NZ(estimates.employee_id, actuals.employee_id) AS employee_id
, NZ(estimates.a_date, actuals.a_date) AS a_date
, estimates.estimated_hours
, actuals.actual_hours
FROM qryFullJoinEstimatesActuals
I've found that constructs that don't work in complex Access SQL statements often do work properly if they are broken down into individual query objects and reassembled step-by-step. Additionally, you can test each part of the query individually. This will help you find a workaround if one proves to be necessary.
You can find exactly how to do this here.
You're missing an INNER JOIN.... UNION ALL step.
Consistent with the odd behavior surrounding the dates, this issue turned out to be related to the use of NZ to select a date from qFullJoinEstimatesActuals. The use of NZ appears to make the data type ambiguous. As such, the following line from the example in my post caused the error:
, NZ(estimates.a_date, actuals.a_date) AS a_date
The ambiguous data type of a_date caused the BETWEEN operator to produce erroneous results when comparing a_date to rates.begin_date and rates.end_date in the LEFT JOIN. The issue was resolved by type casting the result of the NZ function, as follows:
, CDate(NZ(estimates.a_date, actuals.a_date)) AS a_date
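One plausible mechanism for the erroneous BETWEEN results, assuming the NZ result ends up being compared as text rather than as a date (an assumption; the post only says the type became ambiguous): text comparison is character by character, so a date in October can sort before a date in September. The Python sketch below illustrates the difference; the date values and the m/d/yyyy format are made up for illustration and do not come from the original data.

```python
from datetime import datetime


def between(value, low, high):
    """Inclusive range check, like SQL's BETWEEN."""
    return low <= value <= high


def parse(text):
    """Parse an m/d/yyyy string into a real date value (the CDate analogue here)."""
    return datetime.strptime(text, "%m/%d/%Y")


a_date, begin_date, end_date = "10/5/2013", "9/1/2013", "12/31/2013"

# Compared as text: "1" sorts before "9", so October 5 looks earlier
# than September 1 and the row is wrongly excluded.
text_result = between(a_date, begin_date, end_date)

# Compared as dates: October 5 lies inside the range, as expected.
date_result = between(parse(a_date), parse(begin_date), parse(end_date))

print(text_result, date_result)
```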

MySQL to PostgreSQL: GROUP BY issues

So I decided to try out PostgreSQL instead of MySQL, but I am having some slight conversion problems. This was a query of mine that samples data from four tables and spits it all out in one result.
I am at a loss as to how to express this in PostgreSQL, and specifically in Django, but I am leaving that for another question, so bonus points if you can Django-fy it but no worries if you just pure SQL it.
SELECT links.id, links.created, links.url, links.title, user.username, category.title, SUM(votes.karma_delta) AS karma, SUM(IF(votes.user_id = 1, votes.karma_delta, 0)) AS user_vote
FROM links
LEFT OUTER JOIN `users` `user` ON (`links`.`user_id`=`user`.`id`)
LEFT OUTER JOIN `categories` `category` ON (`links`.`category_id`=`category`.`id`)
LEFT OUTER JOIN `votes` `votes` ON (`votes`.`link_id`=`links`.`id`)
WHERE (links.id = votes.link_id)
GROUP BY votes.link_id
ORDER BY (SUM(votes.karma_delta) - 1) / POW((TIMESTAMPDIFF(HOUR, links.created, NOW()) + 2), 1.5) DESC
LIMIT 20
The IF in the select was where my first troubles began. It seems it's an IF true/false THEN stuff ELSE other stuff END IF, yet I can't get the syntax right. I tried to use Navicat's SQL builder, but it constantly wanted me to place everything I had selected into the GROUP BY, and that, I think, is all kinds of wrong.
What I am looking for in summary is to make this MySQL query work in PostgreSQL. Thank you.
Current Progress
Just want to thank everybody for their help. This is what I have so far:
SELECT links_link.id, links_link.created, links_link.url, links_link.title, links_category.title, SUM(links_vote.karma_delta) AS karma, SUM(CASE WHEN links_vote.user_id = 1 THEN links_vote.karma_delta ELSE 0 END) AS user_vote
FROM links_link
LEFT OUTER JOIN auth_user ON (links_link.user_id = auth_user.id)
LEFT OUTER JOIN links_category ON (links_link.category_id = links_category.id)
LEFT OUTER JOIN links_vote ON (links_vote.link_id = links_link.id)
WHERE (links_link.id = links_vote.link_id)
GROUP BY links_link.id, links_link.created, links_link.url, links_link.title, links_category.title
ORDER BY links_link.created DESC
LIMIT 20
I had to make some table name changes and I am still working on my ORDER BY so till then we're just gonna cop out. Thanks again!
Have a look at this link GROUP BY
When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.
You need to include all the select columns in the group by that are not part of the aggregate functions.
A few things:
Drop the backticks
Use a CASE expression instead of IF(): CASE WHEN votes.user_id = 1 THEN votes.karma_delta ELSE 0 END
Change your timestampdiff to DATE_TRUNC('hour', now()) - DATE_TRUNC('hour', links.created) (you will then need to count the number of hours in the resulting interval; it would be much easier to compare timestamps)
Fix your GROUP BY and ORDER BY
Try to replace the IF with a CASE:
SUM(CASE WHEN votes.user_id = 1 THEN votes.karma_delta ELSE 0 END)
You also have to explicitly name every column or calculated column you use in the GROUP BY clause.
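The two points made in these answers (CASE instead of IF, and every non-aggregated select column listed in GROUP BY) can be sketched together as follows. This uses Python's sqlite3 with small hypothetical links and votes tables rather than PostgreSQL itself, but the CASE and GROUP BY syntax shown is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down versions of the tables in the question.
conn.executescript("""
    CREATE TABLE links (id INTEGER, title TEXT);
    CREATE TABLE votes (link_id INTEGER, user_id INTEGER, karma_delta INTEGER);
    INSERT INTO links VALUES (1, 'a'), (2, 'b');
    INSERT INTO votes VALUES (1, 1, 1), (1, 2, 1), (1, 3, -1), (2, 2, 1);
""")

rows = conn.execute("""
    SELECT l.id,
           l.title,
           SUM(v.karma_delta) AS karma,
           -- CASE replaces MySQL's IF(): count only user 1's votes here
           SUM(CASE WHEN v.user_id = 1 THEN v.karma_delta ELSE 0 END) AS user_vote
    FROM links l
    LEFT JOIN votes v ON v.link_id = l.id
    -- every selected column outside an aggregate appears in GROUP BY
    GROUP BY l.id, l.title
    ORDER BY l.id
""").fetchall()

print(rows)
```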