Send different metadata to different target streams - PDI - pentaho

I have two target streams (Matches and mismatches) defined as below:
#Override
public StepIOMetaInterface getStepIOMeta() {
StepMeta stepMeta = new StepMeta();
if (ioMeta == null) {
ioMeta = new StepIOMeta(true, false, false, false, false, true);
StreamInterface matchStream = new Stream(StreamType.TARGET, null, "Matches", StreamIcon.TARGET, null);
StreamInterface mismatchStream = new Stream(StreamType.TARGET, null, "Mismatches", StreamIcon.TARGET, null);
ioMeta.addStream(matchStream);
ioMeta.addStream(mismatchStream);
}
return ioMeta;
}
I want to send different meta data to these two targets. The meta data is received from the previous steps. For match, it needs to be a concatenation of both input streams and for mismatch just the first input stream.
I am stuck on how to define the metadata separately for the two target streams.
Appreciate your help.

List<StreamInterface> targets=getStepIOMeta().getTargetStreams();
List<StreamInterface> infos=getStepIOMeta().getInfoStreams();
if ( info != null )
{
if(targets!=null)
{
if(nextStep.getName().equals(targets.get(0).getStepname()))
{
if ( info != null ) {
for ( int i = 0; i < info.length; i++ ) {
if ( info[i] != null ) {
r.mergeRowMeta( info[i] );
}
}
}
}
if(nextStep.getName().equals(targets.get(1).getStepname()))
{
if ( info != null ) {
if ( info.length > 0 && info[0] != null ) {
r.mergeRowMeta( info[0] );
}
}
}
if(nextStep.getName().equals(targets.get(2).getStepname()))
{
if ( info != null ) {
if ( info.length > 0 && info[0] != null ) {
r.mergeRowMeta( info[1] );
}
}
}
}

Related

Best practice to check duplicate string data before insert data using Entity Framework Core in C#

I need an advice for my code. What I want to do is insert a row into a table using Entity Framework Core in ASP.NET Core.
Before inserting new data, I want to check if email and phone number is already used or not.
I want to return specifically, example if return = x, email used. If return = y, phone used.
Here's my code
public int Insert(Employee employee)
{
var checkEmail = context.Employees.Single(e => e.Email == employee.Email);
if (checkEmail != null)
{
var checkPhone = context.Employees.Single(e => e.Phone == employee.Phone);
if (checkPhone != null)
{
context.Employees.Add(employee);
context.SaveChanges();
return 1;
}
return 2;
}
return 3;
}
I'm not sure with my code, is there any advice for the best practice in my case?
I just don't like these "magic numbers" that indicate the result of your checks.... how are you or how is anyone else going to know what 1 or 2 means, 6 months down the road from now??
I would suggest to either at least create a constants class that make it's more obvious what these numbers mean:
public class CheckConstants
{
public const int Successful = 1;
public const int PhoneExists = 2;
public const int EmailExists = 3;
}
and then use these constants in your code:
public int Insert(Employee employee)
{
var checkEmail = context.Employees.Single(e => e.Email == employee.Email);
if (checkEmail != null)
{
var checkPhone = context.Employees.Single(e => e.Phone == employee.Phone);
if (checkPhone != null)
{
context.Employees.Add(employee);
context.SaveChanges();
return CheckConstants.Successful;
}
return CheckConstants.PhoneExists;
}
return CheckConstants.EmailExists;
}
and also in any code that calls this method and need to know about the return status code.
Alternatively, you could also change this to an enum (instead of an int):
public enum CheckConstants
{
Successful, PhoneExists, EmailExists
}
and then just return this enum - instead of an int - from your method:
public CheckConstants Insert(Employee employee)
{
var checkEmail = context.Employees.Single(e => e.Email == employee.Email);
if (checkEmail != null)
{
var checkPhone = context.Employees.Single(e => e.Phone == employee.Phone);
if (checkPhone != null)
{
context.Employees.Add(employee);
context.SaveChanges();
return CheckConstants.Successful;
}
return CheckConstants.PhoneExists;
}
return CheckConstants.EmailExists;
}
merge two database check to one Query
use SingleOrDefault instance of Single
public int Insert(Employee employee)
{
var checkEmail = context.Employees.Select (e=>new {e.Email , e.Phone }).SingleOrDefault(e => e.Email == employee.Email || e.Phone == employee.Phone);
if (checkEmail == null)
{
context.Employees.Add(employee);
context.SaveChanges();
return 1;
}
else if (checkEmail.Email == employee.Email)
return 3;
else
return 2;
}

Nextflow dynamic includes and output

I'm trying to use dynamic includes but I have problem to manage output files:
/*
* enables modules
*/
nextflow.enable.dsl = 2
include { requestData } from './modules/get_xapi_data'
include { uniqueActors } from './modules/unique_actors'
include { compileJson } from './modules/unique_actors'
if (params.user_algo) {
include { userAlgo } from params.user_algo
}
workflow {
dataChannel = Channel.from("xapi_data.json")
requestData(dataChannel)
uniqueActors(requestData.out.channel_data)
if (params.user_algo) {
user_algo = userAlgo(requestData.out.channel_data)
} else {
user_algo = null
}
output_json = [user_algo, uniqueActors.out]
// Filter output
Channel.fromList(output_json)
.filter{ it != null } <--- problem here
.map{ file(it) }
.set{jsonFiles}
compileJson(jsonFiles)
}
The problem is userAlgo can be dynamically loaded. And I don't know how I can take care of it. With this solution, I got a Unknown method invocation getFileSystem on ChannelOut type error.
The problem is that fromList expects a list of values, not a list of channels. If you use an empty Channel instead of checking for a null value, you could use:
if( params.user_algo ) {
user_algo = userAlgo(requestData.out.channel_data)
} else {
user_algo = Channel.empty()
}
user_algo
| concat( uniqueActors.out )
| map { file(it) }
| compileJson

BST delete - delete "tmp" causes lose of the tree

debug result
Attached my code for trying to delete a node in bst.
If I want to delete node 1, when specifying tmp = del in "if (del_node->l_ == NULL)", and remove tmp, then del is removed as well, and the tree data is lost. how can I solve this issue?
Example tree:
3
/ \
1 5
\
2
all data members and functions are declared public for simplicity.
void BST::DeleteNode(int data) {
BinaryTreeNode* &del_node = BST_Search(head_, data);
if (!del_node->l_ && !del_node->r_)
{
delete del_node;
del_node = nullptr;
return;
}
if (del_node->l_ == NULL)
{
BinaryTreeNode* tmp = del_node;
del_node = del_node->r_;
tmp = nullptr;
delete tmp;
return;
}
if (del_node->r_ == NULL)
{
BinaryTreeNode* tmp = del_node;
del_node = del_node->l_;
delete tmp;
return;
}
else
{
del_node->data_ = smallestRightSubTree(del_node->r_);
}
}
int BST::smallestRightSubTree(BinaryTreeNode* rightroot)
{
// if rightroot has no more left childs
if (rightroot && !rightroot->l_)
{
int tmpVal = rightroot->data_;
BinaryTreeNode* tmp = rightroot;
rightroot = rightroot->r_;
delete tmp;
return tmpVal;
}
return smallestRightSubTree(rightroot->l_);
}
int main()
{
BST bst;
bst.BST_Insert(bst.head_, 3);
bst.BST_Insert(bst.head_, 5);
bst.BST_Insert(bst.head_, 1);
bst.BST_Insert(bst.head_, 2);
bst.DeleteNode(1);
return 0;
}
Thanks for help!
EDIT: this is how tmp and del_node look like after the line "del_node = del_node->r_)" in the condition "if(del->l = null)"
void BST::BST_Insert(BinaryTreeNode*& head, int data) {
if (head == nullptr) {
head = new BinaryTreeNode(data, nullptr, nullptr);
return;
}
if (data > head->data_) {
BST_Insert(head->r_, data);
}
else {
BST_Insert(head->l_, data);
}
}
BinaryTreeNode* BST::BST_Search(BinaryTreeNode* root, int key) {
if (root == nullptr || root->data_ == key)
return root;
if (key > root->data_)
return BST_Search(root->r_, key);
return BST_Search(root->l_, key);
}
If your BST_Search returns a BinaryTreeNode* by value, what is the reference in BinaryTreeNode* &del_node = BST_Search(head_, data); actually referencing? It allocates a new temporary and references that. You probably wanted it to reference the variable that is holding the pointer in the tree so that you can modify the tree.
Your BST_Search would have to look like this:
BinaryTreeNode*& BST::BST_Search(BinaryTreeNode*& root, int key) {
if (root == nullptr || root->data_ == key)
return root;
if (key > root->data_)
return BST_Search(root->r_, key);
return BST_Search(root->l_, key);
}
I can't check whether this actually works, because you didn't provide a self-contained compilable example. But something along these lines.

How to fix '0 level missing for abstractListDefinition 0' error when using Emulator.getNumber()

I used docx4j to read the docx file. And I need to read the paragraph number format characters. I use Emulator.getNumber() to process, but I got this error. How should I deal with it?
try {
PPr pPr = ((P) p).getPPr();
if (pPr != null && pPr.getNumPr() != null) {
Emulator.ResultTriple triple = Emulator.getNumber(wordprocessingMLPackage, pPr);
if (triple != null) {
order = triple.getNumString();
}
}
} catch (Exception e) {
// throw error '0 level missing for abstractListDefinition 0'
e.printStackTrace();
}
Any help would be appreciated.Thanks.
docx4j version: 6.1.2
docx4j's html output uses it like so:
// Numbering
String numberText=null;
String numId=null;
String levelId=null;
if (pPrDirect.getNumPr()!=null) {
numId = pPrDirect.getNumPr().getNumId()==null ? null : pPrDirect.getNumPr().getNumId().getVal().toString();
levelId = pPrDirect.getNumPr().getIlvl()==null ? null : pPrDirect.getNumPr().getIlvl().getVal().toString();
}
ResultTriple triple = org.docx4j.model.listnumbering.Emulator.getNumber(
conversionContext.getWmlPackage(), pStyleVal, numId, levelId);
if (triple==null) {
getLog().debug("computed number ResultTriple was null");
} else {
if (triple.getBullet() != null) {
//numberText = (triple.getBullet() + " ");
numberText = "\u2022 ";
} else if (triple.getNumString() == null) {
getLog().error("computed NumString was null!");
numberText = ("?");
} else {
numberText = (triple.getNumString() + " ");
}
}
if (numberText!=null) {
currentParent.appendChild(document.createTextNode(
numberText + " "));
}
XSL-FO output:
if (pPrDirect!=null && pPrDirect.getNumPr()!=null) {
triple = org.docx4j.model.listnumbering.Emulator.getNumber(
conversionContext.getWmlPackage(), pStyleVal,
pPrDirect.getNumPr().getNumId().getVal().toString(),
pPrDirect.getNumPr().getIlvl().getVal().toString() );
} else {
// Get the effective values; since we already know this,
// save the effort of doing this again in Emulator
Ilvl ilvl = pPr.getNumPr().getIlvl();
String ilvlString = ilvl == null ? "0" : ilvl.getVal().toString();
triple = null;
if (pPr.getNumPr().getNumId()!=null) {
triple = org.docx4j.model.listnumbering.Emulator.getNumber(
conversionContext.getWmlPackage(), pStyleVal,
pPr.getNumPr().getNumId().getVal().toString(),
ilvlString );
}
}

How to access Topic name from pdfs using poppler?

I am using poppler, and I want to access topic or headings of a particular page number using poppler, so please tell me how to do this using poppler.
Using the glib API. Don't know which API you want.
I'm pretty sure there is no topic/heading stored with a particular page.
You have to walk the index, if there is one.
Walk the index with backtracking. If you are lucky, each index node contains a PopplerActionGotoDest (check type!).
You can grab the title from the PopplerAction object (gchar *title) and get the page number from the included PopplerDest (int page_num).
page_num should be the first page of the section.
Assuming your PDF has an index containing PopplerActionGotoDest objects.
Then you simply walk it, checking for the page_num.
If page_num > searched_num, go back one step.
When you are at the correct parent, walk the childs. This should give you the best match.
I just made some code for it:
gchar* getTitle(PopplerIndexIter *iter, int num, PopplerIndexIter *last,PopplerDocument *doc)
{
int cur_num = 0;
int next;
PopplerAction * action;
PopplerDest * dest;
gchar * title = NULL;
PopplerIndexIter * last_tmp;
do
{
action = poppler_index_iter_get_action(iter);
if (action->type != POPPLER_ACTION_GOTO_DEST) {
printf("No GOTO_DEST!\n");
return NULL;
}
//get page number of current node
if (action->goto_dest.dest->type == POPPLER_DEST_NAMED) {
dest = poppler_document_find_dest (doc, action->goto_dest.dest->named_dest);
cur_num = dest->page_num;
poppler_dest_free(dest);
} else {
cur_num = action->goto_dest.dest->page_num;
}
//printf("cur_num: %d, %d\n",cur_num,num);
//free action, as we don't need it anymore
poppler_action_free(action);
//are there nodes following this one?
last_tmp = poppler_index_iter_copy(iter);
next = poppler_index_iter_next (iter);
//descend
if (!next || cur_num > num) {
if ((!next && cur_num < num) || cur_num == num) {
//descend current node
if (last) {
poppler_index_iter_free(last);
}
last = last_tmp;
}
//descend last node (backtracking)
if (last) {
/* Get the the action and do something with it */
PopplerIndexIter *child = poppler_index_iter_get_child (last);
gchar * tmp = NULL;
if (child) {
tmp = getTitle(child,num,last,doc);
poppler_index_iter_free (child);
} else {
action = poppler_index_iter_get_action(last);
if (action->type != POPPLER_ACTION_GOTO_DEST) {
tmp = NULL;
} else {
tmp = g_strdup (action->any.title);
}
poppler_action_free(action);
poppler_index_iter_free (last);
}
return tmp;
} else {
return NULL;
}
}
if (cur_num > num || (next && cur_num != 0)) {
// free last index_iter
if (last) {
poppler_index_iter_free(last);
}
last = last_tmp;
}
}
while (next);
return NULL;
}
getTitle gets called by:
for (i = 0; i < num_pages; i++) {
iter = poppler_index_iter_new (document);
title = getTitle(iter,i,NULL,document);
poppler_index_iter_free (iter);
if (title) {
printf("title of %d: %s\n",i, title);
g_free(title);
} else {
printf("%d: no title\n",i);
}
}