How to transform a text file with tab-separated fields into pipe-separated fields in Pentaho? - pentaho

I have a text file whose fields are separated by tabs, as in 'space-separated.png' below.
I want to transform this file by replacing the tabs with pipes (|), as in 'pipe-separated.png' below.
How can I do this in Pentaho?
space-separated.png
pipe-separated.png

This can be achieved with a transformation that has two steps:
Text file input (specify TAB as the separator on the Content tab)
Text file output (specify | as the separator on the Content tab)
Remember to click 'Get Fields' in both steps. Forgetting to do so is what cost me time.
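Outside Pentaho, the same read-with-TAB, write-with-pipe idea can be sketched in Python. This is only an illustration of what the two steps do, not a replacement for the transformation; the function name and the sample rows are made up for the example.

```python
import csv
import io

def tabs_to_pipes(src, dst):
    """Copy rows from a tab-separated stream to a pipe-separated stream."""
    reader = csv.reader(src, delimiter="\t")       # like Text file input with TAB
    writer = csv.writer(dst, delimiter="|", lineterminator="\n")  # like Text file output with |
    for row in reader:
        writer.writerow(row)

# Hypothetical in-memory sample; real files would be opened with newline="".
sample = io.StringIO("id\tname\tcity\n1\tAna\tLima\n")
out = io.StringIO()
tabs_to_pipes(sample, out)
print(out.getvalue())
```

Using a CSV reader and writer rather than a plain text replace means a field that itself contains a pipe gets quoted in the output.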

If for any reason you don't want to go through the Text file output step, you can also read the text file without a delimiter, so that each entire line arrives in a single field. Then use the Replace in string step with the regex option set to yes, search for \t, and replace it with |. That's all.
all data in a field:
data view
Replace in string step:
Configuration
Preview result:
result with pipe
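For comparison, the regex replacement that the Replace in string step performs on each line looks roughly like this in Python. The function name and sample line are assumptions for the sketch.

```python
import re

def replace_tabs(line):
    """Replace every tab in a line with a pipe, using the regex \\t."""
    return re.sub(r"\t", "|", line)

print(replace_tabs("1\tAna\tLima"))  # 1|Ana|Lima
```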

Related

Save Results As - Delimiter Issue

I am trying to save the results of my query to a CSV file. The first time I did this, I got a Text to Columns window and chose comma as the delimiter. Since then Excel automatically splits on commas, and I need to change the delimiter to TAB. What should I do? Results to Grid in SQL Server does not have a delimiter option.
Yes,
First, go to "Tools" --> "Options", search for "Results to text" and change the output format:
Then change how the query results are displayed, "Query" --> "Results to Text":
You will then get tab-separated results that you can save:
Don't forget to remove the trailing metadata from your file, or suppress it using the SQL Server Management Studio options:
(1000 rows affected)
Completion time: 2022-07-28T08:48:42.9340430+02:00
Don't forget to change the file extension in the "Save As" dialog, to something like myFile.csv
I don't know how to do this with the grid; I don't think it's possible.
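If the trailing metadata has already ended up in a saved file, it can be stripped afterwards. The following is a minimal Python sketch that assumes the trailer looks like the two lines quoted above; the pattern and function name are illustrative, not part of any SQL Server tooling.

```python
import re

# Matches the trailer lines quoted above, e.g. "(1000 rows affected)"
# and "Completion time: ...". Other trailer formats are not handled.
TRAILER = re.compile(r"^\(\d+ rows? affected\)$|^Completion time: ")

def strip_trailer(lines):
    """Drop blank lines and trailer lines from the end of the saved output."""
    kept = list(lines)
    while kept and (not kept[-1].strip() or TRAILER.search(kept[-1].strip())):
        kept.pop()
    return kept

rows = ["id\tname", "1\tAna", "", "(1000 rows affected)", "",
        "Completion time: 2022-07-28T08:48:42.9340430+02:00"]
print(strip_trailer(rows))  # ['id\tname', '1\tAna']
```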

How to call one line at a time in LabVIEW read text file?

I would like to read just one line of text at a time using the "Read from text file" function. Once this line has been processed, I would like to move on to the next line after the rest of the program has iterated once. When I change the "Read from text file" function to "Read lines", I can no longer put an indicator on the front panel for the text. How can I iterate one line at a time? How can I put an indicator on the front panel to display which line of text was read?
This is what you want:
Open/Create/Replace File (with inputs open and read-only)
-- Inside While Loop:
Read From Text File (with Read Lines)
End While Loop on error from Read From Text File
-- After While Loop:
Close File
You can process the data inside the While Loop, or index it and process it outside.
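LabVIEW is graphical, so the structure above cannot be shown as text directly. As a rough textual analogy only, the same open, loop, read-one-line, stop-at-end pattern looks like this in Python. The function name and sample text are made up, and the end of the file is detected by an empty read rather than by an error wire.

```python
import io

def read_lines_one_at_a_time(handle):
    """Yield (line_number, text) pairs, reading one line per loop iteration."""
    line_number = 0
    while True:
        line = handle.readline()   # reads exactly one line
        if line == "":             # an empty string signals end of file
            break
        line_number += 1
        yield line_number, line.rstrip("\n")

# In-memory stand-in for an opened, read-only text file.
demo = io.StringIO("first\nsecond\nthird\n")
for number, text in read_lines_one_at_a_time(demo):
    print(number, text)
```

The line number plays the role of the indicator asked about in the question: it shows which line was read in the current iteration.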
Read the text file.
Use a For Loop and add a shift register to it.
Wire the text file output (the initial data) to the input shift register terminal.
On the block diagram, right-click >> String >> Additional String Functions >> Search/Split String.
Wire the initial data (from the shift register) to the input string terminal and provide the search string/character (in my case, I used a line feed constant).
Search/Split String has two output terminals: "substring before match" and "match + rest of string".
Trim the whitespace and wire the "match + rest of string" output to the output shift register; take the desired output from the "substring before match" terminal.
Set the tunnel mode of the desired string to indexing so that each line in the file becomes an element of the array.
Note: the value wired to the For Loop's count terminal (N) is equal to the number of lines in the file.
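The shift-register approach can be sketched in text form as follows. This Python analogy is an assumption-laden illustration, not LabVIEW code: the `remaining` variable stands in for the shift register, and `str.partition` stands in for Search/Split String (unlike the LabVIEW function, it drops the matched line feed itself, so no separate trim of the remainder is needed). A while loop is used here instead of a For Loop with a known N.

```python
def split_lines(text):
    """Peel one line off the front of the text per iteration."""
    lines = []
    remaining = text                     # plays the role of the shift register
    while remaining:
        before, _match, rest = remaining.partition("\n")
        lines.append(before.strip())     # "substring before match", trimmed
        remaining = rest                 # fed back in for the next iteration
    return lines

print(split_lines("first\nsecond\nthird\n"))  # ['first', 'second', 'third']
```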

How to append one text file to another text file

I need to append 1 text file to another text file.
I've done several keyword searches, and keep coming up with instructions on adding text to an existing file ... which is not the same as appending one text file to another text file.
In a well-designed language, it might look something like the line below, with the contents of source2 added to source1.
Append(path/source1, path/source2, ResultCode)
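The question does not name a language, so as one possible illustration, here is a minimal Python sketch of the operation described: the contents of source2 are added to the end of source1. The function name and the temporary demo files are assumptions for the example. There is no result code; a failure raises an exception (OSError) instead.

```python
import os
import shutil
import tempfile

def append_file(target_path, source_path):
    """Append the contents of source_path to the end of target_path."""
    with open(source_path, "r", encoding="utf-8") as src, \
         open(target_path, "a", encoding="utf-8") as dst:
        shutil.copyfileobj(src, dst)

# Demo with throwaway files (hypothetical names and contents).
with tempfile.TemporaryDirectory() as folder:
    source1 = os.path.join(folder, "source1.txt")
    source2 = os.path.join(folder, "source2.txt")
    with open(source1, "w", encoding="utf-8") as f:
        f.write("first file\n")
    with open(source2, "w", encoding="utf-8") as f:
        f.write("second file\n")
    append_file(source1, source2)
    with open(source1, encoding="utf-8") as f:
        combined = f.read()
print(combined)
```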

Finding occurrences of a string in a column in an Excel-based text file

I am using VB.NET to count the occurrences of a string in a particular column of an (Excel-based) text file. The file is not tab-delimited, but its columns are neatly aligned. I have only learned how to read line by line using a StreamReader, and I have no idea how to read only the last column of each line and count the specific string that I want. Any idea how to do it? You don't necessarily need to provide the code.
If by "an Excel-based text file" you mean that the values are comma-separated, you can read it in line by line, as you are already doing with a stream reader, and then use Split to separate each line into an array. Google "vb.net split" to learn how to do this.
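The split-then-count idea is sketched below in Python rather than VB.NET, purely to show the logic; the function name, parameters, and sample rows are made up for the example. It assumes comma-separated values as the answer does. Passing `separator=None` splits on runs of whitespace instead, which may suit a file whose columns are aligned with spaces.

```python
def count_in_last_column(lines, target, separator=","):
    """Count the lines whose last column equals target."""
    count = 0
    for line in lines:
        fields = line.rstrip("\n").split(separator)
        # An empty line split on whitespace yields no fields at all.
        if fields and fields[-1].strip() == target:
            count += 1
    return count

rows = ["a,b,PASS\n", "c,d,FAIL\n", "e,f,PASS\n"]
print(count_in_last_column(rows, "PASS"))  # 2
```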

How to add line numbers to a file in Pentaho Data Integration (Kettle)?

I have a file names.txt with this data:
NAME;AGE;
alberto;22
andrea;51
ana;16
and I want to add a new column N with the line number of the row:
N;NAME;AGE;
1;alberto;22
2;andrea;51
3;ana;16
I've been looking, and what I found was something related to Add sequence. I tried it, but I couldn't work out how to use it.
Thank you very much.
The Add Sequence step will get the job done, but you don't even need that. Both the CSV file input and Text file input steps can add a row number to the input rows. For the 'CSV file input' step it's called 'The row number field name (optional)'.
For Text file input, check the 'Rownum in output?' box on the Content tab and fill in the 'Rownum fieldname' text box.
I'm really baffled why you couldn't figure out the Add sequence step. It should work with no changes at all. Just drop it in and connect the output of the CSV file step to it, and the sequence should appear as a field called 'valuename'. I would change that name personally, but still, it should work.
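To make the expected result concrete, the numbering that these steps produce can be sketched outside Kettle in a few lines of Python. This only illustrates the output shape from the question; the function name and its defaults are assumptions for the example.

```python
def add_row_numbers(lines, field_name="N", separator=";"):
    """Prefix the header with a new column name and each data row with its number."""
    lines = iter(lines)
    header = next(lines, None)
    if header is None:          # empty input: nothing to number
        return
    yield field_name + separator + header.rstrip("\n")
    for number, line in enumerate(lines, start=1):
        yield str(number) + separator + line.rstrip("\n")

names = ["NAME;AGE;\n", "alberto;22\n", "andrea;51\n", "ana;16\n"]
for row in add_row_numbers(names):
    print(row)
```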