How do I add a line break to an external text log file from a pentaho transform? - pentaho

I'm using Pentaho PDI (Spoon). I have a transformation that compares 2 database tables (via a query selecting year and quarters within those tables). It feeds a Merge Rows (diff) step into a Filter Rows step on "flagfield is not identical": rows that match are logged by one Text File Output step, and rows that don't match are logged by another...
my issue is my external log file gets appended and looks like this:
412542 - 21 - 4 - deleted - DOMAIN1
461623 - 22 - 1 - deleted - DOMAIN1
^failuresDOMAIN1 - 238388 - 12 - 4 - identical
DOMAIN1- 223016 - 13 - 1 - identical
DOMAIN1- 171764 - 13 - 2 - identical
DOMAIN1- 185569 - 13 - 3 - identical
DOMAIN1- 232247 - 13 - 4 - identical
DOMAIN1- 260057 - 14 - 1 - identical
^successes
I want this output:
412542 - 21 - 4 - deleted - DOMAIN1
461623 - 22 - 1 - deleted - DOMAIN1
^failures
DOMAIN1 - 238388 - 12 - 4 - identical
DOMAIN1- 223016 - 13 - 1 - identical
DOMAIN1- 171764 - 13 - 2 - identical
DOMAIN1- 185569 - 13 - 3 - identical
DOMAIN1- 232247 - 13 - 4 - identical
DOMAIN1- 260057 - 14 - 1 - identical
^successes
Notice the line breaks between the failures and successes sections.

I tried adding a Data Grid with a "line_break" string field that's simply a new line, then passing it to each Text File Output step so the column value gets written as a break, which helps, but I can't seem to sequence the transformation steps because they run in parallel...
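As a workaround (not a PDI-native fix), if post-processing the finished log file is acceptable, a small script can force each section marker onto its own line. This is a hypothetical sketch; the marker names are taken from the sample output above, and the file handling around it is up to you:

```python
import re

def fix_markers(text):
    """Insert a line break after any ^failures/^successes marker
    that is not already followed by one."""
    return re.sub(r'(\^(?:failures|successes))(?!\n)', r'\1\n', text)

# e.g. "^failuresDOMAIN1 - 238388" becomes "^failures" and
# "DOMAIN1 - 238388" on separate lines
```

The negative lookahead means markers that already end their line are left untouched, so the script is safe to run on logs that are partly correct.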

Related

Conditional formatting in webi Rich Client 4.1 of multiple values

I'm in BO 4.1 using a crosstab table. It is summary data based on specific detail information. Example:
Area-Days Late-Order #-Reason
1 - 5 - 12345-Lost
1 - 2 - 843254 - Lost
2 - 4 - 7532384 - Lost
1 - 7 - 12353 - Not home
So the output would be
Area 1 Area 2
Lost 2 1
Not home 1 0
Now for the conditional formatting part, I want it to highlight the Area 1 Lost cell as red because two of the orders are greater than 3 days late.
For whatever reason it seems to not be doing it: it's getting hung up on line item 2, because that one is less than 3 days late.
Thank you!
I cheated: I created a new object, summed it, and used an If statement. Thanks for looking at this.
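For what it's worth, the condition you were after is an "any order in this cell's group is more than 3 days late" test rather than a per-row test, which is why row 2 tripped it up. A rough sketch of that logic (Python rather than a WebI formula, with the sample data from the question):

```python
# detail rows from the question: (area, days_late, order_number, reason)
orders = [
    (1, 5, "12345", "Lost"),
    (1, 2, "843254", "Lost"),
    (2, 4, "7532384", "Lost"),
    (1, 7, "12353", "Not home"),
]

def should_highlight(area, reason, threshold=3):
    """True if ANY order in this (area, reason) cell exceeds the threshold."""
    return any(days > threshold
               for a, days, _, r in orders
               if a == area and r == reason)
```

The summed helper object in the accepted workaround plays the same role as `any()` here: it collapses the per-row test into one value per cell before the formatting rule looks at it.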

How to use index match when you have 2 values?

I have a list with 300 name codes, and each of these codes has more than 1 value, e.g.,
CODE - VALUE
300 - 1
300 - 2
300 - 3
400 - 1
400 - 2
For each code, I want to return the greatest value, and after that I want to transform this greatest value into its name, e.g.,
CODE - VALUE - NAME
300 - 1 - alpha
300 - 2 - beta
300 - 3 - gamma
400 - 1 - theta
400 - 2 - sigma
So for code “300” I want to return “gamma” and for code “400” I want to return “sigma”.
Any thoughts?
Regards
Place the following formula in F1 and the code you are looking for in E1. This assumes your second table is located in A1:C5. Adjust ranges to suit your data. Avoid full column references within the aggregate function.
=INDEX(C:C,AGGREGATE(14,6,ROW(A1:A5)/((A1:A5=E1)*(B1:B5=AGGREGATE(14,6,B1:B5/(A1:A5=E1),1))),1))
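The formula works in two steps: the inner AGGREGATE finds the largest VALUE for the code in E1, and the outer INDEX/AGGREGATE returns the NAME on that row. The same two-step lookup, sketched in Python for clarity (table data copied from the question):

```python
# (CODE, VALUE, NAME) rows from the question's second table
table = [
    (300, 1, "alpha"),
    (300, 2, "beta"),
    (300, 3, "gamma"),
    (400, 1, "theta"),
    (400, 2, "sigma"),
]

def name_of_greatest(code):
    """Return the NAME on the row holding the greatest VALUE for this code."""
    best = max((row for row in table if row[0] == code),
               key=lambda row: row[1])
    return best[2]

# name_of_greatest(300) -> "gamma", name_of_greatest(400) -> "sigma"
```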

How to handle a complex set of nested for loops vb.net

So I'm struggling to wrap my head around something I'm trying to write. Nested For-Next loops are clearly the only way to go (as far as I can tell), but I just can't get any sort of pseudo-code worked out. My problem is this: given a fixed number (let's say 100 for simplicity), I want to iterate through all combinations of sets of up to 5 numbers that total 100, in steps of 5. So to be clear, I would want to run through the following (first few examples):
100
95 - 5
90 - 10
---
10 - 90
5 - 95
90 - 5 - 5
85 - 5 - 10
80 - 5 - 15
---
5 - 5 - 85
85 - 10 - 5
80 - 10 - 10
75 - 10 - 15
---
---
80 - 5 - 5 - 5 - 5
75 - 5 - 5 - 5 - 10
---
Hopefully that gives you the idea of what my target is. My problem is that I just can't work out an effective way to program it. I'm pretty competent at actually writing code (usually), but every time I sit down to do this I end up with tens of nested For-Next loops that just don't work!
To remove that nesting problem there's a simple approach: use a queue or a stack.
Here's some pseudo-code:
// add 1 item to start with in the queue
while(queue.Count > 0)
{
// 1. dequeue item from queue
// 2. do your work on it
// 3. if there's another combination emerging from step 2, enqueue it in the queue
}
// 4. this point will be reached once finished
Using a custom type to store whatever information you need for the job, and having a single loop instead of many, should help you tackle the problem relatively quickly :D
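A minimal sketch of that queue idea (in Python rather than VB.NET; the function and parameter names are made up): each queue entry is one partial combination, and the single loop replaces all of the nested For-Next loops.

```python
from collections import deque

def combinations_totalling(total=100, step=5, max_parts=5):
    """Yield every ordered combination of up to max_parts multiples
    of step that sums to total, using one loop over a queue."""
    queue = deque([()])               # start with one empty partial combination
    while queue:
        partial = queue.popleft()     # 1. dequeue an item
        remaining = total - sum(partial)
        if remaining == 0:
            yield partial             # 2. a complete combination: do your work on it
            continue
        if len(partial) == max_parts:
            continue                  # no room for another part
        for n in range(step, remaining + 1, step):
            queue.append(partial + (n,))   # 3. enqueue the emerging combinations
```

Breadth-first like this, the one-part combinations come out before the two-part ones; swap the deque for a list and pop from the end if you prefer a depth-first ordering closer to the listing in the question.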

Change specific data of a column

I have a table with 10 records, and I want to update part of a specific column's data: some of each column value changes and some doesn't. For example, in row 1 I want to change "std" to "standard" while the rest of the value stays the same, and I want to do the same thing in every row in a single query. Is that possible? And remember, we can't remove and re-add the rows because that would change the id.
id - col1 - col2
1 - A - std abcad
2 - B - std bcddsad
3 - C - std avadsad
4 - A - std abcdsad
5 - B - std bcddsa
6 - C - std avadsad
7 - A - std abcdsd
8 - B - std bcddsds
9 - C - std avadsd
You can use the REPLACE function for this:
UPDATE table
SET col2 = REPLACE(col2, 'std', 'standard');
-- SQL Server alternative: .WRITE only works on (n)varchar(max) columns,
-- and it assumes 'std' is present (CHARINDEX returning 0 would make the offset -1 and error)
UPDATE tblName
SET Column .WRITE('standard', CHARINDEX('std', Column, 1) - 1, LEN('std'))

Need Help Parsing File for This Pattern "Feb 06 2010 15:49:00.017 MCO"

I need to parse a file for lines of data that start with this pattern "Feb 06 2010 15:49:00.017 MCO", where MCO could be any 3-letter ID, and return the entire record for the line. I think I could get the first part, but returning the rest of the line is where I get lost.
Here is some sample data.
Feb 06 2010 15:49:00.017 MCO -I -I -I -I 0.34 527 0.26 0.24 184 Tentative 0.00 0 Radar Only -RDR- - - - - No 282356N 0811758W - 3-3
Feb 06 2010 15:49:00.017 MLB -I -I -I -I 44.31 3175 -10.05 -10.05 216 Established 0.00 0 Radar Only -RDR- - - - - No 281336N 0812939W - 2-
Feb 06 2010 15:49:00.018 MLB -I -I -I -I 44.31 3175 -10.05 -10.05 216 Established 15.51 99 Radar Only -RDR- - - - - No 281336N 0812939W - 2-
Feb 06 2010 15:49:00.023 QML N856 7437-V -I 62-V 61-V 67.00 3420 -30.93 15.34 534 Established 328.53 129 Reinforced - - - - - - No 283900N 0815325W - -
Feb 06 2010 15:49:00.023 QML N516SP 0723-V -I 22-V 21-V 42.25 3460 -8.19 5.03 146 Established 243.93 83 Beacon Only - - - - - - No 282844N 0812734W - -
Feb 06 2010 15:49:00.023 QML 2247-V -I 145-V 144-V 78.88 3443 -39.68 23.68 676 Established 177.66 368 Reinforced - - - - - - No 284719N 0820325W - -
Feb 06 2010 15:49:00.023 MLB 1200-V -I 15-V 14-V 45.25 3015 -11.32 -20.97 475 Established 349.68 88 Beacon Only - - - - - - No 280239N 0813104W - -
Feb 06 2010 15:49:00.023 MLB 1011-V -I 91-V 90-V 94.50 3264 -56.77 10.21 698 Established 152.28 187 Beacon Only - - - - - - No 283341N 0822244W - -
- - - - - -
It seems like your date + the 3-letter ID are always the first 5 fields (with space as the delimiter). Just go through the file and split each line on spaces, then take the first 5 fields:
s = Split(strLineOfFile, " ")
WScript.Echo s(0), s(1), s(2), s(3), s(4)
No need for a regex.
From your sample data it seems that you don't have to check for the presence of a three letter identifier following the date -- it's always there. Add a final three letters to the regex if that's not a valid assumption. Also, add more grouping as needed for regex groups to be useful to you. Anyway:
import re
dtre = re.compile(r'^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9]{2} [0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}')
[line for line in file if dtre.match(line)]
Wrap it in a with statement or whatever to open your file, then do any processing you need on the list this builds up.
Another possibility would be to use a generator expression instead of a list comprehension (replace the outer [ and ] with ( and ) to do so). This is useful if you're outputting results to somewhere as you go, the file is large and you don't need to have it all in memory for different purposes. Just be sure not to close the file before you consume the entire generator if you go with this approach!
Also, you could use datetime's built-in parsing facility:
import datetime
for line in file:
    try:
        # the line[:24] bit assumes you're always going to have a three-digit
        # µs part
        dt = datetime.datetime.strptime(line[:24], '%b %d %Y %H:%M:%S.%f')
    except ValueError:
        # a ValueError means the beginning of the line isn't parseable as a datetime
        continue
    # do something with the line; the datetime is already parsed and stored in dt
That's probably better if you're going to create the datetime.datetime object anyway.