Adding column to existing pentaho reports - pentaho

I am pretty new to Pentaho. I have an existing *.prpt file that generates an Excel report. All I need is to add a new column to it. Could you suggest a way to do it?
Thanks in advance.

Note: this is all based on my assumptions about your report.
Open the *.prpt file in Pentaho Report Designer. On the right side there is a Data tab; click it, then find Data Sets and expand it. Query 1 is the main report data source. Open Query 1, paste sql1 into it, and press OK. Then drag and drop column1, column2 and column3 into the details section of your report. After you save the prpt file, the report shows a result like the sql1 output below.
sql1:- select column1, column2, column3 from table_name
output:-
-------------------------------
| column1 | column2 | column3 |
-------------------------------
| 1       | 12      | 13      |
-------------------------------
| 2       | 22      | 23      |
-------------------------------
=> Now suppose you want to add one more column to your report. Open the same query where you pasted sql1, remove sql1, and paste sql2 in its place.
sql2:- select column1, column2, column3, column4 from table_name
output:-
-----------------------------------------
| column1 | column2 | column3 | column4 |
-----------------------------------------
| 1       | 12      | 13      | 14      |
-----------------------------------------
| 2       | 22      | 23      | 24      |
-----------------------------------------
That's it, you have added an extra column to your prpt file.
I hope this information is useful to you. If you still have any doubts, feel free to ask.
Another simple approach is to change the SQL in whatever way is convenient for you and then drag and drop the new column into details.
Thank you.

Related

Sql Server how to find values in different tables that have different suffix

I'm struggling to find a value that might be in different tables but using UNION is a pain as there are a lot of tables.
[Different table that contains the suffixes from the TestTable_]
| ID | Name       |
| -- | ---------- |
| 1  | TestTable1 |
| 2  | TestTable2 |
| 3  | TestTable3 |
| 4  | TestTable4 |
TestTable1 content:
| id | Name    | q1              | a1             |
| -- | ------- | --------------- | -------------- |
| 1  | goose   | withFeather?    | featherID      |
| 2  | rooster | withoutFeather? | shinyfeatherID |
| 3  | rooster | age             | 20             |
TestTable2 content:
| id | Name             | q1              | a1             |
| -- | ---------------- | --------------- | -------------- |
| 1  | brazilian_goose  | withFeather?    | featherID      |
| 2  | annoying_rooster | withoutFeather? | shinyfeatherID |
| 3  | annoying_rooster | no_legs?        | dead           |
TestTable3 content:
| id | Name    | q1              | a1             |
| -- | ------- | --------------- | -------------- |
| 1  | goose   | withFeather?    | featherID      |
| 2  | rooster | withoutFeather? | shinyfeatherID |
| 3  | rooster | age             | 15             |
Common columns: q1 and a1
Is there a way to go through all of them to look up a specific value without using UNION, because some of them might have different columns?
Something like: check whether "q1='age'" exists in any of those tables (from 1 to 50)
Select q1,*
from (something)
where q1 exists in (TestTable_*)... or something like that.
If not possible, not a problem.
You could use dynamic SQL. However, when I have a list of tables that I want to quickly run the same action against, I often paste the list of tables into a spreadsheet, type a query into a cell with a placeholder such as #table, and then use the SUBSTITUTE function to replace the placeholder with each table name.
Alternatively, I just paste the list into SSMS and use SHIFT+ALT+ArrowKey to make a column (block) selection and start typing.
First I paste my list of tables, one per line.
Then I use that key combination, which places my cursor on all of those rows at once.
Now I can start typing and every selected row gets the input.
Then I go to the other side of the table names and repeat the action.
It's not a perfect solution, but it's a quick and dirty way of doing something repetitive.
If you want to find all the tables with that column name you can use information schema.
Select table_name from INFORMATION_SCHEMA.COLUMNS where COLUMN_NAME = 'q1'
Given the type of solution you are after, I can offer a method that I've had to use on legacy systems.
You can query sys.columns for the name of the column(s) you need to find in N tables and join on object_id to sys.tables where type='U'. This gives you a list of table names.
From this list you can then build a working query for each table and, depending on your requirements (is this ad hoc?), either execute it manually yourself or build a procedure that does it for you using sp_executesql.
E.g.
-- alias the columns: SELECT ... INTO fails if two columns are both called "name"
select t.name as table_name, c.name as column_name
into #workingtable
from sys.columns c
join sys.tables t on t.object_id = c.object_id
where c.name in .....
Pseudocode:
begin loop while rows exist in #workingtable
    select top 1 row from #workingtable
    set @sql = your query specific to that table and column(s)
    exec(@sql) / sp_executesql / try/catch as necessary
    delete that row from #workingtable
end loop
Hopefully that gives you some ideas, at least, for how you might implement your requirements.
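The catalog-then-loop idea above can be sketched in a runnable form. The following is only an illustration, and it swaps in an in-memory SQLite database driven from Python as a stand-in for SQL Server: sqlite_master and PRAGMA table_info play the role of sys.tables and sys.columns, and the helper name find_rows is hypothetical. The sample rows are a subset of the tables in the question.

```python
import sqlite3

def find_rows(conn, column, value, prefix="TestTable"):
    """Return {table_name: rows} for each table named prefix* that has `column`."""
    results = {}
    # Step 1: list candidate tables from the catalog (stand-in for sys.tables).
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE ?",
        (prefix + "%",))]
    for table in tables:
        # Step 2: keep only tables that really have the column (stand-in for sys.columns).
        cols = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
        if column not in cols:
            continue
        # Step 3: build and run one query per table (the dynamic SQL part).
        rows = conn.execute(
            f'SELECT * FROM "{table}" WHERE "{column}" = ?', (value,)).fetchall()
        if rows:
            results[table] = rows
    return results

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TestTable1 (id INTEGER, Name TEXT, q1 TEXT, a1 TEXT);
INSERT INTO TestTable1 VALUES (1, 'goose', 'withFeather?', 'featherID'),
                              (3, 'rooster', 'age', '20');
CREATE TABLE TestTable2 (id INTEGER, Name TEXT, q1 TEXT, a1 TEXT);
INSERT INTO TestTable2 VALUES (3, 'annoying_rooster', 'no_legs?', 'dead');
CREATE TABLE TestTable3 (id INTEGER, Name TEXT, q1 TEXT, a1 TEXT);
INSERT INTO TestTable3 VALUES (3, 'rooster', 'age', '15');
""")

matches = find_rows(conn, "q1", "age")
for table, rows in matches.items():
    print(table, rows)
```

Because each table is queried separately, tables with different column lists do not need to be reconciled the way a UNION would require.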

Conditional update column B with modified value based on column A

I am facing a large table with data that was imported from a CSV. However, the delimiters in the CSV were not sanitized, so the input data looked something like this:
alex#mail.com:Alex
dummy#mail.com;Bob
foo#bar.com:Foo
spam#yahoo.com;Spam
whatever#mail.com:Whatever
During the import, the colon (:) was defined as the delimiter, so each row that used a semicolon (;) instead was not imported properly. This resulted in a table structured like this:
| ID | MAIL                | USER     |
|----|---------------------|----------|
| 1  | alex#mail.com       | ALEX     |
| 2  | dummy#mail.com;Bob  | NULL     |
| 3  | foo#bar.com         | Foo      |
| 4  | spam#yahoo.com;Spam | NULL     |
| 5  | whatever#mail.com   | Whatever |
As reimporting is no option I was thinking about manually sanitizing the data in the affected rows by using SQL queries. So I tried to combine SELECT and UPDATE statements by filtering rows WHERE USER IS NULL and update both columns with the correct value where applicable.
What you need are string functions. Reading a bit, I find that Google BigQuery has STRPOS() and SUBSTR().
https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#substr
https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#strpos
An update query to fix the situation you are describing looks like this:
update table_name
set mail = SUBSTR(mail, 1, STRPOS(mail, ';') - 1),
    user = SUBSTR(mail, STRPOS(mail, ';') + 1)
where user is null
The idea here is to split mail into its two parts: the part before the ; and the part after it. Both expressions read the original value of mail, so the order of the two assignments does not matter. Hope this helps.
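To try the split on a few sample rows without touching the real table, here is a minimal sketch. It assumes SQLite (via Python's sqlite3 module) rather than BigQuery, so instr() stands in for STRPOS() and substr() for SUBSTR(); the table name imported is hypothetical. The extra instr(mail, ';') > 0 condition is an added safeguard, not part of the query above, so that rows with a NULL user but no semicolon are left alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE imported (id INTEGER, mail TEXT, "user" TEXT)')
conn.executemany("INSERT INTO imported VALUES (?, ?, ?)", [
    (1, "alex#mail.com", "Alex"),
    (2, "dummy#mail.com;Bob", None),
    (3, "foo#bar.com", "Foo"),
    (4, "spam#yahoo.com;Spam", None),
])

# Both right-hand sides are evaluated against the row's original mail value.
conn.execute("""
    UPDATE imported
    SET mail   = substr(mail, 1, instr(mail, ';') - 1),
        "user" = substr(mail, instr(mail, ';') + 1)
    WHERE "user" IS NULL AND instr(mail, ';') > 0
""")

rows = conn.execute('SELECT id, mail, "user" FROM imported ORDER BY id').fetchall()
for row in rows:
    print(row)
```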

How to pivot rows to columns for this kind of scenario?

I have a data source that returns data in the following format:
id | subId | code | name | col1 | col2 | col3
1 | 1 | abc | xyz | "Whatever" | "WhateverA" | "WhateverB"
| "Whatever2" | "Whatever2A" | "Whatever2B"
I need to turn the col1 row values into headers, with the col2 and col3 values as subcolumns under each col1 header for the respective rows, like this:
id | subId | code | name | "Whatever" | "Whatever2" |
1 | 1 | abc | xyz | col2 | col3 | col2 | col3 |
|"WhateverA" | "WhateverB" |"Whatever2A" | "Whatever2B" |
There's nothing that I know of, out of the box, that would achieve this. Here's a high-level overview of a possible solution:
1. Fetch and cache the dataset on the server
2. Bind the "main\top level" data to a GridView
3. Using jQuery, handle the tr\td click events captured inside the GridView
4. Make an AJAX call to the server -> fetch the sub data from the cached dataset -> return it -> append the fetched result to the GridView
Also, instead of making an AJAX call every single time a td is clicked, you can fetch all the data at once and store it in a variable or localStorage. I've done it this way before, and there are many resources on the web with step-by-step articles on how to achieve steps 1 to 4. Hope it gives you an idea.

How to inner-join in Excel (eg. using VLOOKUP)

Is there a way to inner join two different Excel spreadsheets using VLOOKUP?
In SQL, I would do it this way:
SELECT Sheet1.id, Sheet1.name
FROM Sheet1
INNER JOIN Sheet2
ON Sheet1.id = Sheet2.id;
Sheet1:
+----+------+
| ID | Name |
+----+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
+----+------+
Sheet2:
+----+-----+
| ID | Age |
+----+-----+
| 1 | 20 |
| 2 | 21 |
| 4 | 22 |
+----+-----+
And the result would be:
+----+------+
| ID | Name |
+----+------+
| 1 | A |
| 2 | B |
| 4 | D |
+----+------+
How can I do this in VLOOKUP? Or is there a better way to do this besides VLOOKUP?
Thanks.
You can achieve this result using Microsoft Query.
First, select Data > From Other Sources > From Microsoft Query.
Then select "Excel Files*".
In the "Select Workbook" window, select the current workbook.
Next, in the Query Wizard window, select sheet1$ and sheet2$ and click the ">" button.
Click Next and the visual query editor will open.
Click on the SQL button and paste this query:
SELECT `Sheet1$`.ID, `Sheet1$`.Name, `Sheet2$`.Age
FROM `Sheet1$`, `Sheet2$`
WHERE `Sheet1$`.ID = `Sheet2$`.ID
Finally, close the editor and put the table where you need it.
The result is a table with the ID, Name and Age columns for the IDs that appear in both sheets.
First, let's get a list of values that exist in both tables. If you are using Excel 2010 or later, put the following formula in Sheet 3, cell A2:
=IFERROR(AGGREGATE(15,6,Sheet2!$A$1:$A$5000/(COUNTIF(Sheet1!$A$1:$A$5000,Sheet2!$A$1:$A$5000)>0),ROW(1:1)),"")
If you are using 2007 or earlier, use this array formula instead:
=IFERROR(SMALL(IF(COUNTIF(Sheet1!$A$1:$A$5000,Sheet2!$A$1:$A$5000),Sheet2!$A$1:$A$5000),ROW(1:1)),"")
Because it is an array formula, copy and paste it into the formula bar, then press Ctrl-Shift-Enter instead of Enter or Tab to leave edit mode.
Then copy it down as many rows as desired. This creates a list of the IDs that are in both lists. It does assume that ID is a number and not text.
With that list in place, we use VLOOKUP:
=IF(A2<>"",VLOOKUP(A2,Sheet1!A:B,2,FALSE),"")
This returns the matching value from Sheet 1.
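The logic behind the two formulas (first collect the IDs present in both sheets, then look up each one in Sheet1) can be sketched outside Excel. This is only an illustration in Python, with the sheets modelled as plain dictionaries holding the sample data from the question; it is not a replacement for the formulas.

```python
# Sample data from the question, modelled as ID -> value mappings.
sheet1 = {1: "A", 2: "B", 3: "C", 4: "D"}   # ID -> Name
sheet2 = {1: 20, 2: 21, 4: 22}              # ID -> Age

# Step 1: IDs that exist in both sheets, smallest first
# (the role of the AGGREGATE/SMALL + COUNTIF formula).
common_ids = sorted(id_ for id_ in sheet2 if id_ in sheet1)

# Step 2: look up the Name for each common ID (the role of VLOOKUP).
joined = [(id_, sheet1[id_]) for id_ in common_ids]

for id_, name in joined:
    print(id_, name)
```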
For a basic Excel join without formulas or Excel macros, please check this website:
http://exceljoins.blogspot.com/2013/10/excel-inner-join.html
Joins can also be left outer, right outer and full outer. These are used only on rare occasions, but they can be achieved for Excel sheets as well. For more information, see:
http://exceljoins.blogspot.com/

SQL - Selecting all latest unique records

I'm struggling a bit at creating an SQL query to select some records from an Access Database (using Excel VBA).
A cut of one of the tables (let's call it 'table1') has the following columns:
| my_id | your_id | phase |
| 1 | 1 | Open |
| 2 | 1 | Close |
| 3 | 2 | Open |
| 4 | 3 | Close |
| 5 | 2 | Close |
| 6 | 3 | Open |
The field 'my_id' will always be a unique value whereas the 'your_id' field may contain duplicates.
What I would like to do is select everything from the table for the most recent record of the 'your_id' where the phase is 'Close'. So that means in the above example table it would select 5, 4 & 2.
Hope this makes sense, sorry if not - I'm struggling to articulate what I mean!
Thanks
With your example data, simply adding the WHERE condition phase='Close' would already return records 5, 4 and 2. But I am assuming there might be scenarios (not in your example) where more than one record with phase 'Close' exists for a given your_id, so the query should look like this:
Select *
from table1
where my_id in (
    Select Max(my_id) from table1 where phase = 'Close' group by your_id)
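To see the difference from a plain phase='Close' filter, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python (an assumption; the question uses Access via Excel VBA). The row with my_id 7 is a hypothetical addition, not in the question's data, that gives your_id 1 a second 'Close' record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (my_id INTEGER PRIMARY KEY, your_id INTEGER, phase TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    (1, 1, "Open"), (2, 1, "Close"), (3, 2, "Open"),
    (4, 3, "Close"), (5, 2, "Close"), (6, 3, "Open"),
    (7, 1, "Close"),  # hypothetical extra row: a second 'Close' for your_id 1
])

# Keep only the highest my_id among the 'Close' rows of each your_id.
latest_closed = conn.execute("""
    SELECT * FROM table1
    WHERE my_id IN (
        SELECT MAX(my_id) FROM table1 WHERE phase = 'Close' GROUP BY your_id
    )
    ORDER BY my_id
""").fetchall()

print(latest_closed)
```

With the extra row present, record 2 drops out in favour of record 7, which a bare phase='Close' filter would not do.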