Talend - Dynamic Column Name (Enterprise version) - dynamic

Can anyone help me solve this case?
I have many files to process; two of them are shown in the screenshot below, together with my expected output.
I use this flow in Talend: tFileList---tInputExcel---tUnpivotRow---tMap---tPostgresqlOutput
The output differs from what I expect. This is a screenshot of the actual output.
How can I get the expected output shown in the first picture above?

This will be pretty hard. You'd have to handle the file as text, and whenever you find the value "store" in the first column, update your type variables with the values from that row.
Here's how I'd start:
The begin part of a tJavaFlex would contain:
String col1Type = null;
String col2Type = null;
String colNType = null;
Main part:
if ("store".equalsIgnoreCase(input_row.col0)) {
    // Header row: remember the type of each column.
    col1Type = input_row.col1;
    col2Type = input_row.col2;
    colNType = input_row.colN;
    continue; /* so this record will be ignored by the rest of the components */
}
output_row.col1Type = col1Type;
/* the input is text and we need numbers */
output_row.col1Value = Integer.valueOf(input_row.col1);
I think using propagate results will save you from writing out all the other fields.
From here it is simple, because you end up with key-type-value-type-value-type-value results.
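The carry-forward idea above can also be sketched outside Talend. This is a minimal Python sketch, not the Talend job itself: the row layout (a "store" header row followed by data rows) and every sample value are assumptions, since the original screenshots are not available.

```python
def unpivot_with_carried_types(rows):
    """Emit (key, type, value) records, taking types from the latest 'store' row."""
    current_types = []  # types remembered from the most recent header row
    records = []
    for row in rows:
        key, values = row[0], row[1:]
        if key.strip().lower() == "store":
            # Header row: remember the column types and skip the row itself.
            current_types = values
            continue
        for type_name, value in zip(current_types, values):
            # The input is text, so convert each value to a number.
            records.append((key, type_name, int(value)))
    return records


# Hypothetical sample data, invented for illustration only.
sample_rows = [
    ["store", "apples", "pears"],
    ["S1", "3", "5"],
    ["store", "plums", "kiwis"],
    ["S2", "7", "1"],
]
records = unpivot_with_carried_types(sample_rows)
```

Each data row is expanded using whichever header row was seen last, which is the same role the col1Type ... colNType variables play in the tJavaFlex.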

Related

Pentaho JsonInput GET fields

I'm trying to use PDI to read data from an API (JSON). For now I just want to use the JSON Input step to get a few specific fields, but the Get Fields button on the step gives me:
ERROR (version 8.3.0.0-371, build 8.3.0.0-371 from 2019-06-11 11.09.08 by buildguy) : Index 1 out of bounds for length 1
All the steps execute fine and produce data; only the JSON Input step refuses to give me the fields option. I've tried the text file and JSON output steps, and both write valid JSON, so I don't know what's going on.
PS: This is my first time using PDI.
ISSUE 2:
It looks like PDI uses Jayway for its JSONPath parsing, so I've been using the Jayway option on https://jsonpath.herokuapp.com/, which gives me my expected path. When I put that path into the 'fields' of the JSON Input dialog, I get only the FIRST instance of that path value instead of every instance. I can't figure out why. I assume it has something to do with PDI's row-based view of things, but I don't know how to make it understand that the input is JSON and that it should return all values matching that path.
UPDATE 1:
I've been looking at https://forums.pentaho.com/threads/135882-Parsing-JSON-data-without-knowing-field-names/ and it seems the Modified Java Script Value step might be the way to go. Will continue testing.
UPDATE 2:
OK, I used the Modified Java Script Value step as posted above, along with a select fields step, and was finally able to get the keys:
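var obj = JSON.parse(mydata);
var keys = Object.keys(obj);
for (var i = 0; i < keys.length; i++) {
    // Copy the input row, resized to the output row width.
    var row = createRowCopy(getOutputRowMeta().size());
    // The new field sits right after the input fields.
    var idx = getInputRowMeta().size();
    row[idx++] = keys[i];
    putRow(row);
}
// Skip the original row; only the rows emitted via putRow continue.
trans_Status = SKIP_TRANSFORMATION;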
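For comparison, the same "one output row per top-level key" logic can be written as a short Python sketch. It is not PDI code; the function name and the sample document are made up for illustration, and it assumes the top level of the JSON is an object.

```python
import json


def keys_as_rows(json_text):
    """Return one single-field row per top-level key of a JSON object."""
    obj = json.loads(json_text)
    # Iterating a dict yields its keys in document order.
    return [[key] for key in obj]


# Hypothetical sample document.
rows = keys_as_rows('{"alpha": 1, "beta": {"x": 2}, "gamma": [3]}')
```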

How to retrieve a specific field from a list output?

I don't have any developer rights in my SAP system, but I found a way to write some ABAP code in a tiny "user exit" box (I don't know if that's what you call it) inside a report.
I'm trying to submit an HR report and feed the resulting PERNR values back into that same report.
There's a syntax error telling me that t_list doesn't have a component named PERNR.
What do I have to do to get this to work?
DATA: t_list TYPE TABLE OF abaplist WITH HEADER LINE,
seltab TYPE TABLE OF rsparams,
selline LIKE LINE OF seltab.
*I found out that the name of the selection field in the Report-GUI is "PNPPERNR" and tested it
selline-selname = 'PNPPERNR'.
selline-sign = 'I'.
selline-option = 'EQ'.
SUBMIT Y5000112
USING SELECTION-SET 'V1_TEST'
EXPORTING LIST TO MEMORY
AND RETURN.
CALL FUNCTION 'LIST_FROM_MEMORY'
TABLES
listobject = t_list
EXCEPTIONS
not_found = 1
OTHERS = 2.
IF sy-subrc <> 0.
WRITE 'Unable to get list from memory'.
ELSE.
LOOP AT t_list.
*The Problem is here: how do I get the pnppernr out of t_list, it's the first column of the report output
selline-low = t_list-pernr.
append selline to seltab.
ENDLOOP.
SUBMIT Y5000112
WITH SELECTION-TABLE seltab
USING SELECTION-SET 'V2_TEST'
AND RETURN.
ENDIF.
Use the function module LIST_TO_ASCI to decode the contents of t_list into something readable. This answer contains some sample code including the data types required. The data you're looking for will probably always occur at the same column range in the output, so you can use standard substring access, e.g. line+42(21), to obtain the part of the line you need.
vwegert's answer is more than useful! In my previous answer I forgot to mention the LIST_TO_ASCI FM :)
The only thing I can add is that parsing the result lines has no universal solution and depends heavily on their structure. Usually it is done like this:
LOOP AT t_list.
SPLIT t_list AT '|' INTO <required_structure>.
selline-low = <required_structure>-pernr.
APPEND selline TO seltab.
ENDLOOP.
where <required_structure> is your Y5000112 output structure. This may not be so simple, though, and may require additional manipulation.
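To illustrate the parsing step independently of ABAP, here is a minimal Python sketch that pulls the first column out of pipe-delimited list lines. The line layout, the helper name, and the sample values are all assumptions; the real text produced by LIST_TO_ASCI depends entirely on the report.

```python
def first_column_values(lines, delimiter="|"):
    """Collect the first cell of every line whose first cell is numeric."""
    values = []
    for line in lines:
        # Drop the outer delimiters, then split the line into trimmed cells.
        cells = [cell.strip() for cell in line.strip().strip(delimiter).split(delimiter)]
        # Header and separator lines are skipped because they are not numeric.
        if cells[0].isdigit():
            values.append(cells[0])
    return values


# Hypothetical list output, invented for illustration.
sample_lines = [
    "|PERNR   |Name      |",
    "|--------|----------|",
    "|00001234|Example A |",
    "|00005678|Example B |",
]
pernrs = first_column_values(sample_lines)
```

Filtering on numeric content is one way to skip header and ruler lines; a fixed offset such as line+42(21) is the alternative when the columns are not delimited.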

How to use Bioproject ID, for example, PRJNA12997, in biopython?

I have an Excel file listing more than 2000 organisms, each with an associated BioProject ID (like PRJNA12997). The idea is to use these IDs to fetch the sequences for a later multiple alignment with five other sequences that I have in a text file.
Can anyone help me understand how I can do this using Biopython? At least the part with the BioProject ID.
You can first get the info using Bio.Entrez:
from Bio import Entrez
Entrez.email = "Your.Name.Here@example.org"
# This call to efetch fails sometimes with a 400 error.
handle = Entrez.efetch(db="bioproject", id="PRJNA12997")
I've been trying, and Entrez.read(handle) doesn't seem to work. But if you do record_xml = handle.read() you'll get the XML entry for this record. In this XML you can find the ID for the organism, in this case 12997.
handle = Entrez.esearch(db="nuccore", term="12997[BioProject]")
search_results = Entrez.read(handle)
Now you can efetch from your search results. At this point you should use Biopython to parse whatever you get back from the efetch step, experimenting with the rettype: http://www.ncbi.nlm.nih.gov/books/NBK25499/table/chapter4.T._valid_values_of__retmode_and/
for result in search_results["IdList"]:
    entry = Entrez.efetch(db="nuccore", id=result, rettype="fasta")
    this_seq_in_fasta = entry.read()

Selecting all info from nodes with the same name

I'm a total newbie when it comes to xml stuff.
So far I have this piece of xml that I want to extract info from, but all the node names are the same (so it just grabs one of them, unless stated otherwise).
It looks something like this:
<DocumentElement>
<Screening>
<ScreeningID>2</ScreeningID>
<ScreeningDate>2011-09-13T00:00:00-04:00</ScreeningDate>
<ScreeningResult>1</ScreeningResult>
<ScreeningResultText>Negative</ScreeningResultText>
<TextResult>0</TextResult>
<TextResultText>Not Tested</TextResultText>
<PageNumber>0</PageNumber>
<AddedDate>2015-05-03T16:06:41.71774-04:00</AddedDate>
<UpdateDate>2015-05-03T16:06:41.71774-04:00</UpdateDate>
</Screening>
<Screening>
<ScreeningID>3</ScreeningID>
<ScreeningDate>2011-09-13T00:00:00-04:00</ScreeningDate>
<ScreeningResult>1</ScreeningResult>
<ScreeningResultText>Negative</ScreeningResultText>
<TextResult>1</TextResult>
<TextResultText>Negative</TextResultText>
<PageNumber>9</PageNumber>
<AddedDate>2015-05-03T16:25:21.2904988-04:00</AddedDate>
<UpdateDate>2015-05-03T16:25:21.2904988-04:00</UpdateDate>
</Screening>
</DocumentElement>
And I'm currently using this kind of snippet to extract info from the TextResult area
Select
answer.value('(/DocumentElement/Screening/TextResult)[1]','int')
From
Answers
However, that grabs only the first value. I know that if I write something like this, it'll get me the second value, but in another column: answer.value('(/DocumentElement/Screening[2]/TextResult)[1]','int')
I have two issues with this: 1. There won't necessarily be only 2 nodes with the same name; there could be any number. 2. I would like all the values gathered into a single column.
Any help would be appreciated!
You can try it this way:
SELECT
    X.value('.','int') AS TextResult
FROM Answers AS a
CROSS APPLY a.answer.nodes('/DocumentElement/Screening/TextResult') AS answers(X)
As I understand it, you want to get every TextResult in your XML document. If so, you can try this:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
string xml = @"<DocumentElement>
<Screening>
<ScreeningID>2</ScreeningID>
<ScreeningDate>2011-09-13T00:00:00-04:00</ScreeningDate>
<ScreeningResult>1</ScreeningResult>
<ScreeningResultText>Negative</ScreeningResultText>
<TextResult>0</TextResult>
<TextResultText>Not Tested</TextResultText>
<PageNumber>0</PageNumber>
<AddedDate>2015-05-03T16:06:41.71774-04:00</AddedDate>
<UpdateDate>2015-05-03T16:06:41.71774-04:00</UpdateDate>
</Screening>
<Screening>
<ScreeningID>3</ScreeningID>
<ScreeningDate>2011-09-13T00:00:00-04:00</ScreeningDate>
<ScreeningResult>1</ScreeningResult>
<ScreeningResultText>Negative</ScreeningResultText>
<TextResult>1</TextResult>
<TextResultText>Negative</TextResultText>
<PageNumber>9</PageNumber>
<AddedDate>2015-05-03T16:25:21.2904988-04:00</AddedDate>
<UpdateDate>2015-05-03T16:25:21.2904988-04:00</UpdateDate>
</Screening>
</DocumentElement>";
XElement xmlTree = XElement.Parse(xml);
IEnumerable<XElement> textResultList = from c in xmlTree.Descendants("TextResult")
                                       select c;
foreach (var item in textResultList)
{
    Console.WriteLine(item.Value);
}
Console.Read();
I hope this helps.
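The same "collect every TextResult into one list" idea can be checked quickly with Python's standard library. This is only a sketch for comparison with the two answers above; the XML is a trimmed copy of the sample from the question, keeping just the elements needed here.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the sample document from the question.
XML_TEXT = """<DocumentElement>
  <Screening>
    <ScreeningID>2</ScreeningID>
    <TextResult>0</TextResult>
  </Screening>
  <Screening>
    <ScreeningID>3</ScreeningID>
    <TextResult>1</TextResult>
  </Screening>
</DocumentElement>"""

root = ET.fromstring(XML_TEXT)
# findall returns every matching node, not just the first one.
text_results = [int(node.text) for node in root.findall("./Screening/TextResult")]
```

However many Screening nodes the document contains, the result is a single list, which corresponds to the single column the question asks for.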

Mule ESB: How to do Condition checking in Datamapper using Xpath

I'm facing an issue with XPath: I need to check two element values and, if the condition is satisfied, return a hard-coded value of my own. Below is my XML.
Inside each SubRoot, I need to check whether ItemType=Table1 and ItemCondition=Chair1; if so, I have to return the hard-coded value 'Proceed' (I will map this value to the target side of the DataMapper).
<Root>
<SubRoot>
<ItemType>Table1</ItemType>
<ItemCondition>Chair1</ItemCondition>
<ItemValue>
.......
</ItemValue>
</SubRoot>
<SubRoot>
<ItemType>Table2</ItemType>
<ItemCondition>chair2</ItemCondition>
<ItemValue>
.......
</ItemValue>
</SubRoot>
....Will have multiple subroot
</Root>
I tried to define the rule as below:
Type: String
Context:/Root
Xpath: substring("Proceed", 1 div boolean(/SubRoot[ItemType="Table1" and ItemCondition="Chair1"]))
But it throws an error like this:
net.sf.saxon.trans.XPathException: Arithmetic operator is not defined for arguments of types (xs:integer, xs:boolean)
Is there any other shortcut to do this? Could you please help me? I have put a lot of effort into this and am not able to resolve it. Thanks in advance.
I am not sure where you are applying this, but the XPath expression you are looking for is:
fn:contains(/Root/SubRoot[2]/ItemCondition, "chair") and fn:contains(/Root/SubRoot[2]/ItemType, "Table")
So here is an example returning "Proceed" or "Stop" as appropriate:
if (fn:contains(/Root/SubRoot[1]/ItemCondition, "Chair") and fn:contains(/Root/SubRoot[1]/ItemType, "Table")) then 'Proceed' else 'Stop'
To implement the above condition, I initially tried to do it in XPath, which gave me a lot of errors. I ended up implementing it with a simple if/else condition in the script part of the DataMapper:
if ((input.ItemType == 'Table') and (input.ItemCondition == 'chair')) {
    output.Item = 'Proceed';
} else {
    output.Item = 'Stop';
}
Pay attention to the order of the checks. For example, here in the XML structure (or the converted POJO), ItemType has to be checked first, followed by ItemCondition.
&& did not seem to work for me; I changed it to the 'and' operator.
If you are implementing this logic for the first time, this may help you.
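As a way to sanity-check the intended result outside Mule, the per-SubRoot condition can be sketched in Python. This is not DataMapper code; the helper name is made up, and the XML is a shortened copy of the sample from the question with the ItemValue elements omitted.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document from the question.
XML_TEXT = """<Root>
  <SubRoot><ItemType>Table1</ItemType><ItemCondition>Chair1</ItemCondition></SubRoot>
  <SubRoot><ItemType>Table2</ItemType><ItemCondition>chair2</ItemCondition></SubRoot>
</Root>"""


def decide(sub_root):
    """Return 'Proceed' when both values match, otherwise 'Stop'."""
    item_type = sub_root.findtext("ItemType", default="")
    item_condition = sub_root.findtext("ItemCondition", default="")
    # The comparison is case-sensitive, as it is in XPath.
    if item_type == "Table1" and item_condition == "Chair1":
        return "Proceed"
    return "Stop"


decisions = [decide(sub_root) for sub_root in ET.fromstring(XML_TEXT).findall("SubRoot")]
```

Evaluating the condition once per SubRoot avoids hard-coding positions such as SubRoot[1] or SubRoot[2] when the document has many of them.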