I've got a simple Kettle transformation which just does Table Input -> Text File Output.
The Table Input, however, is SELECT * FROM ${tableName}
(with the table coming from a job parameter)
The Text file output just has the filename options and separator set.
The output data rows are written OK, but the header checkbox does nothing and I cannot work out how to generate a header.
I guess it is because I am not explicitly mapping fields in the output stage.
How can I introduce a header to my output?
Thx
It turns out that enabling "Append" disables "Header".
See the comment here: http://wiki.pentaho.com/display/EAI/Text+File+Output?focusedCommentId=21104316#comment-21104316
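For anyone wondering why: a header can only be written when the file is freshly created; on an append run it would land in the middle of the file. A minimal Python sketch of that rule (the step's behaviour paraphrased, names invented for illustration):

import csv
import os

def write_rows(path, fieldnames, rows, append=False):
    # Appending to an existing file must not inject a second header
    # mid-file, which is presumably why enabling "Append" disables
    # "Header" in the Text File Output step.
    need_header = (not append) or (not os.path.exists(path))
    mode = "a" if append else "w"
    with open(path, mode, newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter=";")
        if need_header:
            writer.writeheader()
        writer.writerows(rows)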
I have a fixed position input.txt file like this:
4033667 70040118401401
4033671 70040/8401901 < not int because of "/"
4033669 70040118401301
4033673 70060118401101
I'm using a Text File Input step to pull the data in, and I'd like to load the data into a database as ints and have errant data go to a log file.
I've tried using the Filter Rows step and the Data Validator step, but I can't seem to get either to work. I've even tried using the Text File Input step to bring it in as a string and then converting it to an int with the Select/Rename Values step, and changing the data type in the metadata section.
A typical error I keep running into is "String : couldn't convert String to Integer".
Any suggestions?
Thanks!
So I ended up using...
Text File Input > Filter Rows (regex \d+) > Select Values (to cast string to int) > Table Output
...and the error log comes off of the false result of the regex filter.
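For anyone wanting to sanity-check that routing outside Kettle, here is the same logic as a small Python sketch (values taken from the sample above, function names invented):

import re

INT_RE = re.compile(r"^\d+$")

def route(rows):
    good, errors = [], []
    for key, value in rows:
        # Mirror of Filter Rows with regex \d+: only pure-digit strings
        # are safe to cast; "70040/8401901" fails and goes to the log.
        if INT_RE.match(value):
            good.append((int(key), int(value)))  # the Select Values cast
        else:
            errors.append((key, value))          # false branch -> error log
    return good, errors

good, errors = route([("4033667", "70040118401401"),
                      ("4033671", "70040/8401901")])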
I understand your problem.
Let's keep it simple.
Hi, I have a row like the following in a .csv file:
12346,abcded,ssadsadc,2013.04.04 08.42.31,8,"I would like to use an
existing project as a template for a new project for another Report
Suite but it just overwrites the existing project rather than creates
new one even when I use the ""Save As"" function.",Analyst,,5,"Hotel
Room,Literature,Open/ Create",,
The text string has " and , as part of the string. Hence I am not able to use " as the text delimiter in the SAP BODS file format.
Could somebody help me on this?
Use a delimiter that is not expected to appear in your data (e.g. ~ or |) or a string of multiple characters (e.g. $^$).
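If you can't change how the file is produced, a small preprocessing pass can re-write it with such a delimiter before BODS reads it. A minimal Python sketch, assuming the source is well-formed quoted CSV and that neither | nor literal newlines occur inside the fields (paths are placeholders):

import csv

with open("input.csv", newline="") as src, \
        open("output.psv", "w", newline="") as dst:
    for row in csv.reader(src):          # the reader resolves "" escapes
        dst.write("|".join(row) + "\n")  # no quoting needed any more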
I'm stuck with what seems like a weird BigQuery bug: I cannot upload a CSV file that starts (first line, first column) with an integer.
Here's my schema: COL1:INTEGER,COL2:INTEGER,COL3:STRING
Here's my csv file content :
100,4,XXX
100,4,XXX
If I put the STRING column as the first column, the upload is OK.
If I add a header and tell BigQuery to skip it during the import, the upload is ok too.
But with the CSV and schema above, BigQuery always complains: Line:1 / Field:1, Value cannot be converted to expected type.
Does anyone know what the problem is?
Thank you in advance,
David
I could not reproduce this problem--I copied and pasted the content into a file and uploaded it with no problems.
Perhaps the uploaded file format is corrupted somehow? If there are extra bytes at the beginning of the file, those would be ignored in a header row but might result in this error if the first value of the first field is expected to be an integer. I'd recommend examining the actual binary data in the file to make sure there's nothing funny going on.
Also, are you doing this import via web UI, command-line tool, or API? Have you tried one of the other methods?
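One quick way to check for stray leading bytes (a UTF-8 BOM is a classic culprit: invisible in most editors, but it makes the first field non-numeric) is to hex-dump the start of the file, e.g. in Python (file name is a placeholder):

with open("upload.csv", "rb") as f:
    print(f.read(16).hex(" "))
# a clean file starts "31 30 30 2c ..." ("100,");
# "ef bb bf 31 30 30 ..." means there is a BOM in front of COL1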
I currently want to import my data from a flat file into the database.
The flat file is a .txt file in which I save a list of URLs. Example:
http://www.mimi.com/Hotels-g303188-Rurrenabaque-Hotels.html
I'm using the SQL Server Import and Export Wizard to do it, but at execution time it fails with this error:
Error 0xc02020a1:
Data Flow Task 1: Data conversion failed. The data conversion for column
"Column 0" returned status value 4 and status text "Text was truncated or one
or more characters had no match in the target code page.".
Can anyone help?
You get this error because the text is too long for the column you've chosen to put it in.
Text was truncated or
You might want to check the size of the database column vis-à-vis your input data. Is the longest URL shorter than the column width?
one or more characters had no match in the target code page.
Check if your input file has any special characters. An easy way to check this would be to save your file in ANSI (Notepad > Save As > Encoding = ANSI). Note - you'd still have to select the right code page so that the import interprets your input text correctly.
Here's a very nice link that has some background on what code pages are - http://www.joelonsoftware.com/articles/Unicode.html
Note that you can also change the target column data type (to text stream, for example) in the Data Source -> Advanced section.
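Both checks are easy to script before re-running the wizard; a rough Python sketch (file name and column width are placeholders):

COLUMN_WIDTH = 50  # whatever varchar(n) you picked for the URL column

with open("input.txt", encoding="utf-8") as f:
    for lineno, line in enumerate(f, 1):
        url = line.rstrip("\r\n")
        if len(url) > COLUMN_WIDTH:
            print(f"line {lineno}: {len(url)} chars exceeds the column width")
        try:
            url.encode("cp1252")  # a common ANSI code page on Windows
        except UnicodeEncodeError:
            print(f"line {lineno}: has characters with no cp1252 equivalent")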
I am creating a huge CSV import that uses the MS Text Driver to read the CSV file.
And I am using ColdFusion to create the scheme.ini in each folder where a file has been uploaded.
Here is a sample one I am using:
[some_filename.csv]
Format=CSVDelimited
ColNameHeader=True
MaxScanRows=0
Col1=user_id Text width 80
Col2=first_name Text width 20
Col3=last_name Text width 30
Col4=rights Text width 10
Col5=assign_training Text width 1
CharacterSet=ANSI
Then in my ColdFusion code, I am doing two cfdumps:
<cfdump var="#GetMetaData( csvfile )#" />
<cfdump var="#csvfile#">
The metadata shows that the query has not picked up the correct data types for reading the CSV file.
And the dump of the query that reads the file shows that it is missing values, because with Excel we cannot force users to use double quotes. And when fields have mixed data types, it causes our process to break.
How can I either change the data type inside the query (i.e. make it use the scheme.ini), or update the metadata to the correct data type?
I am using a view on information_schema in SQL Server 2005 to get the correct data types, column names, and max lengths...
Unless I have some kind of syntax error, I can't see why it's not grabbing the data as the correct data type.
Any suggestions?
Funnily enough, I had the filename spelled wrong: instead of schema.ini I had scheme.ini.
I hate it when you make little mistakes like this...
Thank You
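For anyone else hitting this: the Text Driver only looks for a file literally named schema.ini sitting in the same folder as the CSV; under any other name it is silently ignored and the driver guesses the column types itself. A minimal sketch of generating it under the right name (Python here for illustration, since the original ColdFusion code isn't shown; folder and columns are placeholders):

import os

SCHEMA = """[some_filename.csv]
Format=CSVDelimited
ColNameHeader=True
MaxScanRows=0
Col1=user_id Text width 80
CharacterSet=ANSI
"""

folder = r"C:\uploads\job_1"  # placeholder for the upload folder
# The file MUST be called schema.ini -- "scheme.ini" is ignored.
with open(os.path.join(folder, "schema.ini"), "w") as f:
    f.write(SCHEMA)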