CSV Input Skipping Columns Into the Wrong Field Name - pentaho

I have a transformation with a CSV input step. My CSV file is inconsistent: some fields are separated by one tab and some by two. I think this is causing my data to shift into the next column under the wrong field name. It skipped "CtrAcct in LegacySys" and a few others, so those values are all null in my transformation now. Is there any way to strip the extra tab so the file consistently uses one tab?
See below: some fields are delimited by double tabs and some by one. The data looks just like this, but I changed the information for privacy.
" Cont.Account CtrAcct in LegacySys Contract account name c/o name House No. PO Box Street Supplement City Postl Code"
" 1234567890 1234567890 JOHN WILLIAM RHODEN 3722 LION BRIDGE BLVD GULF BAY 32563-3445"
" 1234567890 1234567890 JOHN PAUL DIAZ 555 GRAND PARK BLVD MIAMI 32533-6328"
" 1234567890 1234567890 JOHN A RODRIGUEZ 210 RICHARD BAYSHORE DR P C BEACH 32407-2543"
" 1234567890 1234567890 JOHN M YOUNG 12 BLUE LAGOON DR PANAMA CITY BEACH 32408-5125"

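One way to get the consistent single-tab file the question asks for is to pre-process the file before the CSV input step reads it. The sketch below is a minimal illustration in Python, not a Pentaho feature; the function names `normalize_tabs` and `split_fields` are hypothetical. It assumes that a double tab never stands for a genuinely empty field. That assumption should be checked first: the header lists columns such as "c/o name" and "PO Box" that look empty in the sample rows, and if the double tabs mark those empty fields, collapsing them would shift the data rather than fix it.

```python
import re


def normalize_tabs(line: str) -> str:
    """Collapse every run of consecutive tabs into a single tab."""
    return re.sub(r"\t+", "\t", line)


def split_fields(line: str) -> list[str]:
    """Strip the line ending, normalize the tabs, and split into fields."""
    return normalize_tabs(line.rstrip("\r\n")).split("\t")


# Made-up sample line mixing double and single tabs.
sample = "1234567890\t\t1234567890\tJOHN PAUL DIAZ\t\t555 GRAND PARK BLVD\n"
fields = split_fields(sample)
print(fields)
```

If the double tabs do turn out to be empty fields, the file is already consistently delimited by one tab, and the field list in the input step is the thing to check instead.
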
Related

How do I work with string matching using %?

There is a table that looks like below. I want to match sender's and receiver's names. In this example, I'm only interested in ABC and DEF as their names match (not a complete match but that is ok). How do I extract cases similar to ABC and DEF?
Table A:
id | sender full name | receiver first name | receiver last name
ABC | mike smith brown | mike | smith
DEF | kate josefin baker | kate | baker williams
GHI | kim jones | nathan | wilson
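The partial match described above can be expressed as word overlap: a row qualifies when at least one word of the receiver's first name and at least one word of the receiver's last name both appear in the sender's full name. The sketch below shows that rule in Python rather than with SQL `%` patterns; the function name `names_match` is hypothetical, and the rule itself is an assumption about what "not a complete match" should mean.

```python
def names_match(sender_full: str, receiver_first: str, receiver_last: str) -> bool:
    """Return True when the receiver's first and last names each share a word with the sender."""
    sender_tokens = set(sender_full.lower().split())
    first_tokens = set(receiver_first.lower().split())
    last_tokens = set(receiver_last.lower().split())
    # Both name parts must overlap with the sender's words.
    return bool(first_tokens & sender_tokens) and bool(last_tokens & sender_tokens)


rows = [
    ("ABC", "mike smith brown", "mike", "smith"),
    ("DEF", "kate josefin baker", "kate", "baker williams"),
    ("GHI", "kim jones", "nathan", "wilson"),
]

matches = [row_id for row_id, sender, first, last in rows if names_match(sender, first, last)]
print(matches)  # ['ABC', 'DEF']
```
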

How to continue a sequence when inserting

I have tried to simplify my question with the following example:
I have a table with the following data:
Marker Name Location
1 Eric Benson Mixed
2 John Smith Rural
3 A David Rural
4 B John Mixed
And I want to insert into the table:
Name Location
Andy Jones Mixed
Ian Davies Rural
How can I continue the sequence in the Marker column to end up with:
Marker Name Location
1 Eric Benson Mixed
2 John Smith Rural
3 A David Rural
4 B John Mixed
5 Andy Jones Mixed
6 Ian Davies Rural
If you do this in a stored procedure, you can query the maximum Marker value before inserting.
(That only works if the Marker column is not an identity column.)
Like this:
declare @max_marker int
set @max_marker=isnull((select max(marker) from table),0)
--Insert comes here
Insert into table (Marker,Name,Location) Values(@max_marker+1,'Andy Jones','Mixed')
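The same "read the maximum, then insert with an offset" idea extends to inserting several rows at once. The sketch below demonstrates it with Python's built-in `sqlite3` module and an in-memory database instead of SQL Server; the table name `people` and the function `insert_with_next_marker` are hypothetical. It runs the read and the inserts in one transaction, but it is not a substitute for an identity column or a sequence when several writers insert concurrently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Marker INTEGER, Name TEXT, Location TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Eric Benson", "Mixed"), (2, "John Smith", "Rural"),
     (3, "A David", "Rural"), (4, "B John", "Mixed")],
)


def insert_with_next_marker(conn, new_rows):
    """Insert (name, location) rows, numbering them after the current maximum Marker."""
    with conn:  # commits on success, rolls back on error
        (max_marker,) = conn.execute(
            "SELECT COALESCE(MAX(Marker), 0) FROM people"
        ).fetchone()
        for offset, (name, location) in enumerate(new_rows, start=1):
            conn.execute(
                "INSERT INTO people (Marker, Name, Location) VALUES (?, ?, ?)",
                (max_marker + offset, name, location),
            )


insert_with_next_marker(conn, [("Andy Jones", "Mixed"), ("Ian Davies", "Rural")])
result = conn.execute(
    "SELECT Marker, Name, Location FROM people ORDER BY Marker"
).fetchall()
print(result[-2:])
```
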

Excel Concatenate Cells while adding characters and skipping Empty cells

I am not a programmer, but I am doing Excel work for a small library. We have these fields in an Excel sheet:
John | J | Smith | BMI | 123 | 100 |
Sarah | P | Crown | ASCAP | 564 | 100 |
Tommy | T | Stew | BMI | 134 | 100 |
Suzy | S | Smith | BMI | 678 | 50 |
John | J | Smith | BMI | 123 | 50
What I would like is to be able to combine any of the cells (in the same row) into one cell that would read like this:
John J Smith, (BMI), 100%, IPI 123
or
Suzy S Smith, (BMI), 50%, IPI 678 | John J Smith (BMI), 50% IPI 123
I figured out how to use the Concatenate function to do this, but it doesn't skip empty cells, and I get extra "|" or "()" in those spots. I also found the =StringConCat topic, and that works great for skipping, but I can't figure out how to add the extra characters.
Any help would be most appreciated.
Thank you!!
EDIT: Thanks for the quick responses so far. I should be more clear -
the pipes in my example were only to designate different cells - they are not actual characters in the cells (thanks for converting it to a table for me, Bruce). The only Pipe character I would like to use is in the results, as in my example between Suzy and John.
There will rarely be more than 2 entries on the same result line, but it is possible. Mostly it will be to composers that are sharing the credit. But there is a chance that they will work on a Public Domain song and I have to list "Traditional" or maybe "Mozart" as another composer.
Sorry that I don't know enough to ask my question as intelligently as I should. Just learning how to do this, and trying to figure it out as I go.
Thanks again!
For the extra characters, use SUBSTITUTE to get rid of the empties.
So, if your formula is =concatenate(A1,B1,C1) and your 'empty spaces' show up as "| ", then edit your formula to become =substitute(concatenate(A1,B1,C1),"| ","")
You can even nest the substitutes to handle more possible 'empties', like "  " (two spaces) or the like: =substitute(substitute(concatenate(A1,B1,C1),"| ",""),"  ","")
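An alternative to cleaning up after concatenation is to add the extra characters only when a cell has a value. The sketch below shows that logic in Python rather than as an Excel formula, purely to illustrate the rule; the function names `format_credit` and `format_row` are hypothetical, and the column order (first, middle, last, society, IPI, share) is assumed from the sample rows in the question.

```python
def format_credit(first, middle, last, society, ipi, share):
    """Build one credit string, adding punctuation only around non-empty values."""
    name = " ".join(part for part in (first, middle, last) if part)
    parts = [
        name,
        f"({society})" if society else "",
        f"{share}%" if share else "",
        f"IPI {ipi}" if ipi else "",
    ]
    return ", ".join(part for part in parts if part)


def format_row(credits):
    """Join several credits with ' | ', skipping any that came out empty."""
    formatted = (format_credit(*credit) for credit in credits)
    return " | ".join(text for text in formatted if text)


print(format_credit("John", "J", "Smith", "BMI", "123", "100"))
# John J Smith, (BMI), 100%, IPI 123
print(format_row([("Suzy", "S", "Smith", "BMI", "678", "50"),
                  ("Traditional", "", "", "", "", "")]))
# Suzy S Smith, (BMI), 50%, IPI 678 | Traditional
```
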

appropriate method for text match in one column to other column in oracle

I have to write a query in Oracle. I have a table called 'Entity' with two columns, 'Pref_mail_name' and 'spouse_name'.
Now I want a list of all spouse_name values where the last name in spouse_name is not taken from pref_mail_name.
For example, my table has the following data:
Pref_mail_name | spouse_name
Kunio Tanaka | Lorraine
Mrs. Betty H. Williams | Chester Williams
Mr. John Baranger | Mrs. Cathy Baranger
William kane Gallio | Karen F. Gallio
Sangon Kim | Jungja
I need only the 1st and 5th rows as output. I did some analysis and came up with Oracle's built-in UTL_MATCH function:
SELECT PREF_MAIL_NAME, SPOUSE_NAME,
UTL_MATCH.JARO_WINKLER_SIMILARITY(PREF_MAIL_NAME, SPOUSE_NAME) similarity
FROM entity
ORDER BY similarity;
But the results of this query do not look right. Even when the spouse's last name is not taken from pref_mail_name, it gives a similarity value above 80.
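Because the requirement is about one specific word (the spouse's last name) rather than overall string similarity, an exact word comparison may fit better than a Jaro-Winkler score. The sketch below shows that rule in Python, not Oracle SQL; the function name `spouse_last_name_missing` is hypothetical. It assumes the last whitespace-separated word of spouse_name is the candidate last name, and it ignores case and trailing periods.

```python
def spouse_last_name_missing(pref_mail_name: str, spouse_name: str) -> bool:
    """Return True when the last word of spouse_name does not occur in pref_mail_name."""
    pref_tokens = {token.strip(".").lower() for token in pref_mail_name.split()}
    spouse_tokens = spouse_name.split()
    if not spouse_tokens:
        return False  # no spouse name at all, so there is nothing to compare
    return spouse_tokens[-1].strip(".").lower() not in pref_tokens


entity = [
    ("Kunio Tanaka", "Lorraine"),
    ("Mrs. Betty H. Williams", "Chester Williams"),
    ("Mr. John Baranger", "Mrs. Cathy Baranger"),
    ("William kane Gallio", "Karen F. Gallio"),
    ("Sangon Kim", "Jungja"),
]

# 1-based row numbers of the rows that qualify.
selected = [i for i, (pref, spouse) in enumerate(entity, start=1)
            if spouse_last_name_missing(pref, spouse)]
print(selected)  # [1, 5]
```
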

importing values from flat file excel

Hi, I have to import values into an Excel file in the following way:
Values have to be imported based on the last 8 characters in each line, e.g. 00004100.
Assume 00004 is column5 and 100 is column6.
I need to import values into a separate Excel sheet for every unique pair of column5 and column6, i.e. all rows whose last 8 digits are 00002100 will go in one Excel sheet, and so on.
1234 john smith america 00002100
1234 john smith america 00002100
1234 john smith america 00002200
1234 john smith america 00002200
1234 john smith america 00003100
1234 john smith america 00003200
1234 john smith america 00003200
1234 john smith america 00004100
1234 john smith america 00002100
How can I generate Excel sheets based on the given criteria in VB?
Thanks in advance!
I am not sure why this is a VB question. Do you use Interop?
Since I do not know, I am using pseudocode, but I think you will get the idea.
It is closer to VB.NET; if you use VBA, the general idea will still work.
Generally I would do the following: iterate over each row and concatenate a new string.
Like this:
For row = 0 To rows.Count - 1
    Dim str As New StringBuilder()
    For col = 0 To rows(row).Count - 1
        str.Append(rows(row).Item(col))
    Next
    Dim lastEight = str.ToString()
    lastEight = lastEight.Substring(lastEight.Length - 8)
    ' Keep a list of your sheet names. If the list contains lastEight, go to that sheet and append the row.
    ' If no such sheet exists, create one and append the row to it.
Next
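The grouping step in the pseudocode above can be tried out on its own before any Excel code is written. The sketch below is in Python rather than VB and stops at the grouping: it builds one list of lines per distinct 8-character suffix, which corresponds to one sheet per key. The function name `group_by_suffix` is hypothetical, and writing each group to a worksheet is left out because it depends on the Excel library in use.

```python
from collections import defaultdict


def group_by_suffix(lines, width=8):
    """Group lines by their last `width` characters (the column5 + column6 key)."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip()
        if len(line) < width:
            continue  # too short to carry a key, so skip it
        groups[line[-width:]].append(line)
    return dict(groups)


data = [
    "1234 john smith america 00002100",
    "1234 john smith america 00002100",
    "1234 john smith america 00002200",
    "1234 john smith america 00003100",
    "1234 john smith america 00004100",
    "1234 john smith america 00002100",
]

groups = group_by_suffix(data)
for key, rows in groups.items():
    print(key, len(rows))
```

Each key in `groups` would become a sheet name, and the lines stored under it would become that sheet's rows.
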