Combine multiple databases with different queries in one report (bar chart) - Pentaho

I need to create a report using the Pentaho User Console, and I want to view it as a bar chart. The report needs to include multiple queries from different databases, with the results combined in one chart. For example, I have 3 databases: Car, House, Employee. I also have 3 queries: quantity of cars of each type, quantity of available houses, and total number of employees in each department. Three different databases and three different queries, but I want to show all three results in one chart. How can I do that?

Do you know about schema creation?
Search on Google for how to create a schema, and what fact tables and dimension tables are.
You can use Pentaho Schema Workbench; it is meant for exactly this kind of purpose.
After creating a schema in Pentaho Schema Workbench you can publish it to the Pentaho BI Server, view it as a bar chart there, and do analysis with drill-up, drill-down, slicing, and dicing operations as well.
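As a rough illustration of the fact table / dimension table idea using the Car example (this is only a sketch; the names and plain SQL DDL below are hypothetical):

-- A dimension table describes the things you group or slice by:
CREATE TABLE dim_car_type (
    car_type_id INT PRIMARY KEY,
    type_name   VARCHAR(50)
);

-- A fact table holds the measures, keyed to the dimensions:
CREATE TABLE fact_car (
    car_type_id INT REFERENCES dim_car_type (car_type_id),
    quantity    INT
);

A Mondrian schema built in Schema Workbench then maps a cube, its dimensions, and its measures onto tables like these.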

You can use a Kettle transformation as a data source for a Pentaho report. Within the transformation it's perfectly fine to query 3 different DBs and prepare the resulting data set.
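For example, the three queries might each be shaped to the same (category, quantity) layout so that their rows can be appended into one data set for the bar chart. This is only a sketch; all table and column names are hypothetical:

-- Against the Car database:
SELECT type AS category, COUNT(*) AS quantity
FROM cars
GROUP BY type;

-- Against the House database:
SELECT 'Available houses' AS category, COUNT(*) AS quantity
FROM houses
WHERE available = 1;

-- Against the Employee database:
SELECT department AS category, COUNT(*) AS quantity
FROM employees
GROUP BY department;

In the transformation, each query would feed its own Table input step (one per database connection), and the three streams would then be combined with an Append streams step before being handed to the report.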

Related

How to join data in Google Sheets to Metabase to create a dashboard?

My company uses Metabase for data analysis. The data I need to build the dashboard in Metabase is split in two: part of it is retrieved by SQL queries in Metabase, and the other part is manual data kept in Google Sheets. How can I join the Metabase data and the Google Sheets data to create the dashboard in Metabase?
For example:
The data I need to build the dashboard in Metabase:

Name    Age    Address        Salary
Smith   25     Evans Mills    $9000

The data retrieved by SQL queries in Metabase:

Name    Age    Address
Smith   25     Evans Mills

The manual data in Google Sheets:

Salary
$9000
As far as I understand Metabase, one of its limitations is that it cannot run queries across different databases.
However, I have helped a customer solve a similar problem. The software architecture looks like this:
Metabase -> Presto SQL/Trino -> different databases and data sources
In this design:
Metabase handles the dashboard part of the work.
Trino handles the joins across the different data sources.
Note: in our customer's case the integration required a fair amount of programming work; it is not a trivial job.
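As a sketch of what the Trino side might look like once both sources are configured as catalogs (the catalog, schema, and table names below are assumptions, and the Google Sheets data is assumed to carry a Name column to join on):

-- Cross-catalog join in Trino; all names here are assumptions.
SELECT p.Name,
       p.Age,
       p.Address,
       s.Salary
FROM postgresql.public.people AS p    -- the data Metabase queries with SQL
JOIN gsheets."default".salaries AS s  -- the manual Google Sheets data
  ON p.Name = s.Name

Metabase is then pointed at Trino as a single database, so the dashboard can query this join directly.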

Value only showing the first item in SSRS report

So my problem here is that I have a part number which lives in two warehouses, hence it has two bin locations. If I just use =Fields!PrimBin.Value it only ever returns the first location. I need to display the PrimBin only if the location is from a specific warehouse. To get the warehouse I use =Fields!WarehouseCode.Value.
What I need to do is show only the PrimBin.Value for MAINWHSE and not CELLWHSE.
Thanks in advance.
OK, so the database is quite vast. However, for the information required I am using two tables: Part and PrimWhse.
Part shares the part ID with PrimWhse. In PrimWhse each PartID has two locations, "MAINWHSE" and "CELLWHSE", and one bin to pick in each warehouse, giving two possible locations.
So WarehouseCode.Value holds which warehouse the part is located in, and PrimBin.Value holds the warehouse position ID.
This is all set up via a report style within the Epicor system. When I create a query in a Business Activity Query to look in MAINWHSE, it shows the correct information.
However, in the Report Data Builder I'm not able to set this query, so I assume SSRS will be able to see both of these possible values for PrimBin.Value!? If not, I guess I need to work out how to add a query to the Report Data Builder, which at the moment does not seem possible.
Thanks again.
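For what it's worth, the dataset query the question is reaching for would look something like this sketch (the join and column names are guesses based on the description above, not Epicor's actual schema):

-- Sketch only: keep just the MAINWHSE row per part, so PrimBin
-- resolves to a single value. All column names are assumptions.
SELECT p.PartNum,
       w.WarehouseCode,
       w.PrimBin
FROM Part AS p
JOIN PrimWhse AS w
  ON w.PartNum = p.PartNum
WHERE w.WarehouseCode = 'MAINWHSE';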

How to send data to only one Azure SQL DB table from Azure Stream Analytics?

Background
I have set up an IoT project using an Azure Event Hub and Azure Stream Analytics (ASA), based on tutorials from here and here. JSON-formatted messages are sent from a Wi-Fi-enabled device to the event hub using webhooks, then fed through an ASA query and stored in one of three Azure SQL tables based on the input stream they came from.
The device (a Particle Photon) transmits 3 different messages with different payloads, for which there are 3 SQL tables defined for long-term storage/analysis. The next steps include real-time alerts and visualization through Power BI.
The ASA Query
SELECT
    ParticleId,
    TimePublished,
    PH,
    -- and other fields
INTO TpEnvStateOutputToSQL
FROM TpEnvStateInput

SELECT
    ParticleId,
    TimePublished,
    EventCode,
    -- and other fields
INTO TpEventsOutputToSQL
FROM TpEventsInput

SELECT
    ParticleId,
    TimePublished,
    FreshWater,
    -- and other fields
INTO TpConsLevelOutputToSQL
FROM TpConsLevelInput
Problem: for every message received, the data is pushed to all three tables in the database, not only to the output specified in the query. The table the data belongs in gets populated with a new row as expected, while the two other tables get populated with NULLs in the columns for which no data existed.
From the ASA documentation, it was my understanding that the INTO keyword would direct the output to the specified sink. But that does not seem to be the case, as the output from all three inputs gets pushed to all sinks (all 3 SQL tables).
The test script I wrote for the Particle Photon will send one of each type of message with hardcoded fields, in the order: EnvState, Event, ConsLevels, each 15 seconds apart, repeating.
An example of the output being sent to all tables, showing one column from each table, was generated using this query (in Visual Studio):
SELECT
    t1.TimePublished as t1_t2_t3_TimePublished,
    t1.ParticleId as t1_t2_t3_ParticleID,
    t1.PH as t1_PH,
    t2.EventCode as t2_EventCode,
    t3.FreshWater as t3_FreshWater
FROM dbo.EnvironmentState as t1, dbo.Event as t2, dbo.ConsumableLevel as t3
WHERE t1.TimePublished = t2.TimePublished
  AND t2.TimePublished = t3.TimePublished
For an input event of type TpEnvStateInput, where the key 'PH' exists (and not the keys 'EventCode' or 'FreshWater', which belong to TpEventsInput and TpConsLevelInput, respectively), an entry in only the EnvironmentState table is desired.
Question:
Is there a bug somewhere in the ASA query, or a misunderstanding on my part of how ASA should be used/set up?
I was hoping I would not have to define three separate Stream Analytics jobs, as they tend to be rather pricey. After running through this tutorial and leaving 4 ASA jobs running for one day, I used up nearly $5 in Azure credits. At a projected $150/mo cost, there's just no way I could justify sticking with Azure.
ASA is purposed for Complex Event Processing. You are using ASA in your queries essentially to pass data from the event hub to tables. It will be much cheaper if you instead host a simple "worker web app" to process the incoming events.
This blog post covers the best practices:
http://blogs.msdn.com/b/servicebus/archive/2015/01/16/event-processor-host-best-practices-part-1.aspx
ASA is great if you are doing transformations, filters, or light analytics on your input data in real time. It also works great if you have Azure Machine Learning models exposed as functions (currently in preview).
In your example, all three SELECT ... INTO statements are reading from the same input source and don't have any filter clauses, so all rows are selected.
If you only want to select specific rows for each of the outputs, you have to specify a filter condition. For example, assuming you only want records with a non-null value in the column "PH" for the output "TpEnvStateOutputToSQL", the ASA query would look like this:
SELECT
    ParticleId,
    TimePublished,
    PH
    -- and other fields
INTO TpEnvStateOutputToSQL
FROM TpEnvStateInput
WHERE PH IS NOT NULL
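Extending the same pattern to all three outputs, the complete job query might look like this sketch (assuming each payload type can be identified by its distinguishing key being non-null):

-- Sketch: the same NULL-filter pattern applied to each output.
SELECT ParticleId, TimePublished, PH
INTO TpEnvStateOutputToSQL
FROM TpEnvStateInput
WHERE PH IS NOT NULL

SELECT ParticleId, TimePublished, EventCode
INTO TpEventsOutputToSQL
FROM TpEventsInput
WHERE EventCode IS NOT NULL

SELECT ParticleId, TimePublished, FreshWater
INTO TpConsLevelOutputToSQL
FROM TpConsLevelInput
WHERE FreshWater IS NOT NULL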

Adding limit parameters to bar charts in Pentaho Report Designer

I am using Pentaho Report Designer to generate reports from my OLAP cube using MDX, and I want to produce bar-chart reports. I have 50,000 records and am writing an MDX query to display keywords along with their counts. The problem is that the resulting bar chart covers all 50,000 records; instead, I want to pass two parameters that act as start and end values for the range to display, i.e. the user is prompted to enter starting and ending parameters (suppose he enters 1 and 10), so that 10 records are displayed.
I do not know the specifics of Pentaho MDX, but in general I would use the following approach, assuming the 50,000 records are in the hierarchy [DimA].[Record]:
WITH SET [Selected Records] AS
    SubSet(
        [DimA].[Record].[Record].Members,
        ParamRef('start') - 1,
        ParamRef('end') - ParamRef('start') + 1
    )
SELECT
    { [Measures].[Count] } ON COLUMNS,
    [Selected Records] ON ROWS
FROM [MyCube]
I am guessing a bit about the use of ParamRef in Mondrian MDX here. The SubSet function is documented for Analysis Services here: http://msdn.microsoft.com/en-us/library/ms144767.aspx

Querying a Google Fusion Table

I have a Google Fusion Table with 3 row layouts. We can query the fusion table like this:
var query = new google.visualization.Query("https://www.google.com/fusiontables/gvizdata?tq=select * from *******************");
which selects the data from the first row layout, i.e. Rows 1, by default. Is there any way to query the second or third row layout of a fusion table?
API queries apply to the actual table data. The row layout tabs are just different views onto that data. You can get the actual query being executed for a tab with Tools > Publish; the HTML/JavaScript contains the FusionTablesLayer request.
I would recommend using the regular Fusion Tables API rather than the gvizdata API because it's much more flexible and not limited to 500 response rows.
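For example, a query issued through the regular API is a single HTTP GET against the v1 query endpoint (assumed here; the table id and API key are placeholders, and the SQL goes URL-encoded in the sql parameter):

GET https://www.googleapis.com/fusiontables/v1/query?sql=SELECT * FROM <table id>&key=<API key>

The response comes back as plain JSON rows rather than a gviz DataTable.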
The documentation for querying a Fusion Tables source has not been updated yet to account for the new structure, so this is just a guess. Try appending #rows:id=2 to the end of your table id:
select * from <table id>#rows:id=2
A couple of things:
Querying Fusion Tables with SQL is deprecated. Please see the porting guide.
Check out the Working With Rows part of the documentation. I believe this has your answers.