I'm new to Pentaho 8.3 CE (Spoon) and am trying to add an extra column to a CSV file by concatenating 3 other text fields together. I've tried 2 options: the Calculator step and the built-in 'Concat fields' step.
The issue I'm facing is that some rows are enclosed by " " while others aren't... e.g.
Field A = "One thing, another thing"
Field B = Yet another thing
Field C = Final thing
Ideally, I want,
New field = "One thing, another thing Yet another thing Final thing",
I find I can't get the final " to enclose each line, so it looks like "One thing, another... Final thing
How do I get Pentaho to add that final " on? I've set to force the enclosure on.
First strip the double quotes with a String operations step or a Replace in String step (the latter allows regexp search and replace).
Then use a Concat fields step to join them all together, comma separated.
Finally, either prepend & append double quotes, or when writing out with e.g. a text file output, add the enclosure character.
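Outside of PDI, the same three steps (strip the quotes, join, then enclose on output) can be sketched in Python. This is only an illustration of the logic, not a Pentaho API; the helper name concat_fields is hypothetical, and the separator defaults to a space because that is what the desired output in the question shows.

```python
import csv
import io


def concat_fields(fields, sep=" "):
    """Strip stray double quotes from each field, then join the fields."""
    return sep.join(f.strip().strip('"') for f in fields)


# Sample values taken from the question.
row = ['"One thing, another thing"', "Yet another thing", "Final thing"]
combined = concat_fields(row)

# QUOTE_ALL plays the role of "force enclosure" when writing the file:
# the writer adds both the opening and the closing quote.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerow([combined])
output_line = buffer.getvalue()
```

Stripping first matters: if the quote that came with Field A is left in place, the concatenated value starts with a quote but does not end with one, which is the symptom described in the question.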
I have a csv source file with comma (,) delimiter and values are enclosed with double quotes (") and using Text file Input to read the data in PDI 8.3. I am using , in Separator and " in Enclosure options in Content tab.
However, there is a field that contains quotes within the double quotes in the values itself, see the example below:
"abc","cde",
"abc" - 1st col
"cde" - 2nd col
"ef"A"gh" - 3rd col
"ijk" - 4th col and so on..
The issue is in the 3rd col: the output reads "ef" as the 3rd col and passes the remaining value on to the subsequent cols. I hope that clarifies the issue; I only want to escape the " within the values.
I have tried " in the Escape option but it's not working. Can someone please suggest how to handle this.
Thanks!
You can just leave the Enclosure attribute empty. That way the string will only be divided into columns by the Delimiter.
See CSV File Input Doc and Text File Input Doc
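To show what "split only by the delimiter" means, here is a rough Python sketch of the idea. It is not how PDI is implemented; the function name split_without_enclosure and the sample line are assumptions made for illustration (the sample is pieced together from the column list in the question). It strips the outer quotes afterwards, which in PDI would need a separate step.

```python
def split_without_enclosure(line, delimiter=","):
    """Split on the delimiter only, then drop one pair of outer quotes per value."""
    cleaned = []
    for value in line.rstrip("\r\n").split(delimiter):
        # Remove the outer quotes but keep any quotes inside the value.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        cleaned.append(value)
    return cleaned


# Hypothetical input line, assumed from the columns listed in the question.
sample = '"abc","cde","ef"A"gh","ijk"'
columns = split_without_enclosure(sample)
```

The trade-off is that a value containing the delimiter itself would now be split in two, so this only works when commas never appear inside the values.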
I am investigating a bug in an Excel spreadsheet where the following formula is inserted into every cell in a column.
=AGGREGATE( 3, 5, InputData[#[Foo]:[Bar]]) > 0
The VBA is as follows:
Let AddColumn.DataBodyRange.formula = "=AGGREGATE( 3, 5, [#[Foo]:[Bar]]) > 0"
This evaluates to FALSE if all of the cells on the current row between columns Foo and Bar are empty; otherwise it evaluates to TRUE.
The problem I'm seeing is that the names Foo and Bar are variable and not under my control and the formula fails with Run-time error 1004 if a name contains a single quote:
Let AddColumn.DataBodyRange.formula = "=AGGREGATE( 3, 5, [#[Foo's name]:[Bar]]) > 0"
Is there a way I can escape the name in such a way that single quotes won't create the run-time error? Adding double quotes around the name gives me the same error.
Are there likely to be further problems if the names contain other characters that have special meaning in Excel?
I could also refer to the columns by address instead of name. Would that work with the current row '#' notation?
Excel version: 14.0.7188.5002
I hear you when you say the naming convention is "not under my control". This really puts you in a bind when anything can be pumped into your code.
Sadly, the only solution is to scrub the input when they finally hand it over to you. This means writing your own VBA function that takes in a string and returns a string with the special characters removed (or replaced with something else).
In your case, you are going to have to scrub the data in possibly two places.
First, you will need to change all the column names so they don't have special characters in them. You'll need to access each name and send it through the 'scrub' function and then replace the name with the scrubbed name.
Second, when someone inputs a column name for your AGGREGATE, you'll need to capture that input into a string variable and then pass that through the 'scrub' function. Then you'll need to validate that the input they gave you matches up with a valid column name. If it's not valid, send them an error message asking them to enter a valid name or to cancel out.
After you have valid values for foo and bar, you can add them to your AGGREGATE function and let it execute.
If you can't scrub/change your column names, then you'll have to make a list of scrubbed column names and associate them with the column address. Then when you get your input, scrub it, and then match to the list to grab the correct address. Then you can use hard addresses instead of variable naming schemes.
It's a lot of work. But it's necessary when you have naming conventions that are not under your control.
The other answers and comments put me on the right track:
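Function escapedColumnName(ByVal columnName As String) As String
    ' ByVal keeps the caller's variable unchanged.
    ' Double apostrophes first so the escapes added below are not doubled again.
    columnName = Replace(columnName, "'", "''")
    ' Prefix the other special characters with an apostrophe.
    columnName = Replace(columnName, "#", "'#")
    columnName = Replace(columnName, "[", "'[")
    columnName = Replace(columnName, "]", "']")
    escapedColumnName = columnName
End Function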
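For anyone who wants to check the escaping order outside Excel, the same logic can be sketched in Python. The names escape_column_name and build_formula are hypothetical, and the formula text is copied from the question; this only demonstrates the string handling and does not talk to Excel.

```python
def escape_column_name(name):
    """Escape characters that are special inside a structured reference."""
    # Apostrophes first, so the escapes added below are not doubled again.
    name = name.replace("'", "''")
    for ch in "#[]":
        name = name.replace(ch, "'" + ch)
    return name


def build_formula(first, last):
    """Build the AGGREGATE formula from the question with escaped names."""
    return "=AGGREGATE( 3, 5, [#[{}]:[{}]]) > 0".format(
        escape_column_name(first), escape_column_name(last)
    )


formula = build_formula("Foo's name", "Bar")
```

Reversing the order (escaping # and the brackets before the apostrophe) would double the apostrophes that were just inserted, so the apostrophe has to be handled first.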
I have a textbox control that allows the user to press the Enter key while entering details such as addresses or other demographic information. Addresses are typically entered as follows:
Address 1
Address 2
City, st
Zip
I am wondering if there is a way to tell whether the Enter key was used to make a new line here. I've looked around, and the only approach I've found is to check for vbCrLf in VB; however, my code doesn't seem to pick it up.
Test data for this would be something similar to below
123 N Street
S Test Street
Test City, XX
91883
The code below is my attempt to replace any carriage return with a space:
Text.Replace(vbCrLf, " ")
Will this vbCrLf not pick up a carriage return unless there's an actual space between the above test values?
If you are using a MultiLine textbox (as it seems from your sample) then you don't need to search for the newline characters and replace them with a space.
You can simply use the Lines property, which stores each line as a separate array element, and then use String.Join to build a single-line string:
Dim singleLine = string.Join(" ", myTextBox.Lines)
Of course, if you only want to know whether there is a newline character, just check the Length property of the Lines array: a value greater than 1 means the text spans more than one line.
vbCrLf actually refers to two characters, a carriage return (13) and a line feed (10).
I would search and replace each separately. It isn't strictly necessary (as replace will work on the two characters at once), but can catch instances in which the user cut and pasted information, instead of typing directly into the text box.
Text = Text.Replace(vbCr, " ")
Text = Text.Replace(vbLf, " ")
or
Text = Text.Replace(vbCr, " ").Replace(vbLf, " ")
https://msdn.microsoft.com/en-us/library/microsoft.visualbasic.constants.vbcrlf(v=vs.110).aspx
The Replace() function will indeed properly detect and replace all occurrences of the target with the replacement - it does not matter whether there are leading or trailing spaces.
However, please consider that String objects are immutable and cannot be changed after they have been instantiated. Therefore, Replace() does not modify the existing object but rather returns a new string as its result.
To actually see the results of the function call, you need to do something along these lines:
newString = Text.Replace(vbCrLf, " ")
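The same pitfall exists in other languages with immutable strings. As a language-neutral illustration (Python here, not VB.NET), the sketch below shows that a discarded replace call changes nothing, and that splitting into lines and re-joining handles every line-ending style at once. The sample text is the test data from the question with mixed line endings added as an assumption.

```python
# Test data from the question; the mixed line endings are an assumption
# standing in for typed and pasted input.
text = "123 N Street\r\nS Test Street\nTest City, XX\r91883"

# The result is discarded here, so `text` is unchanged.
text.replace("\r\n", " ")
still_multiline = "\n" in text

# splitlines() recognises \r\n, \n and \r, so one pass covers all cases.
lines = text.splitlines()
has_newline = len(lines) > 1
single_line = " ".join(lines)
```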
I spent quite some time to resolve exactly this problem, the solution I came across was:
text = text.Replace(ControlChars.CrLf, " ")
Sorry, I can't remember where I found the solution, but if I do remember I will post the link here.
I have some data which I run through, which generates a textfile.
The data is all pulled correctly, but it doesn't format correctly.
Right now, I am using TAB + Variable to space between each column but it is obviously made uneven as different variables differ in character length. Here is the layout:
RECORD NAME ADDRESS TELEPHONE SOMETHING SOMETHING
... Data is here.
Any ideas?
String.Format is your friend here.
It's very powerful and lets you align your output in fixed-width columns.
For example:
(EDIT: removed the txt prefix because could be confusing, now I suppose that data to be formatted is contained in string vars)
Dim result As String
result = String.Format("{0,-10}{1,-30}{2,-30}{3,-10}{4,20}", Record, Name, Address, Telephone, Something)
The first element (Record) is left-aligned in a 10-character column, and so on for the rest; the last element is right-aligned in a 20-character column.
If that's not enough, look at composite formatting for other useful options.
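The column-width idea is not specific to .NET. As a rough equivalent, Python's format mini-language expresses the same layout: `<10` corresponds to `{0,-10}` (left-aligned, width 10) and `>20` to `{4,20}` (right-aligned, width 20). The function name format_row and the sample values are made up for illustration.

```python
# Same widths as the String.Format call above: four left-aligned
# columns followed by one right-aligned column.
ROW_FORMAT = "{:<10}{:<30}{:<30}{:<10}{:>20}"


def format_row(record, name, address, telephone, something):
    """Pad each value to its column width; longer values are not truncated."""
    return ROW_FORMAT.format(record, name, address, telephone, something)


header = format_row("RECORD", "NAME", "ADDRESS", "TELEPHONE", "SOMETHING")
line = format_row("1", "Jane Doe", "123 N Street", "555-0100", "x")
```

As with composite formatting, a value longer than its column is written in full and pushes the later columns to the right, so the widths should be chosen for the longest expected value.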
I have some contract text that may change.
It is currently (non live system) stored in a database field (we determine which contract text to get by using other fields in table). I, however need to display a specific date (based on individual contract) within this text.
For example (the date, Jan 12, changes depending on the individual contract):
blah blah blah... on Jan 12, 2009... blah blah blah
but everything else in the contract text is the same.
I'm looking for a way to inject the date into this text. Similar to .NET's
Console.Write("Some text here {0} and some more here", "My text to inject");
Is there something like this? Or am I going to need to split the text into two fields and just concatenate?
I am always displaying this information using Crystal Reports so if I can inject the data in through Crystal then that's fine.
I am using Crystal Reports, SqlServer 2005, and VB.net.
You could use some "reserved string" (such as the "{0}") as part of the contract, and then perform a replace after reading from the database.
If a reserved string is not an option (for instance, if the contract may contain any characters in any sequence, which I find unlikely), then you'll probably need to split the text into 2 fields.
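A minimal sketch of the reserved-string approach, written in Python only to show the logic (the same replace call exists in VB.NET). The placeholder token, the function name inject_date and the date layout are assumptions based on the example in the question.

```python
from datetime import date

# Reserved token stored inside the contract text (assumed to be "{0}").
PLACEHOLDER = "{0}"
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def inject_date(template, contract_date):
    """Replace the reserved token with a date formatted like 'Jan 12, 2009'."""
    formatted = "{} {}, {}".format(
        MONTHS[contract_date.month - 1], contract_date.day, contract_date.year
    )
    return template.replace(PLACEHOLDER, formatted)


stored_text = "blah blah blah... on {0}... blah blah blah"
contract_text = inject_date(stored_text, date(2009, 1, 12))
```

Using a plain replace rather than a format call means any other braces in the contract text are left alone.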
Have you tried putting a text marker like the {0} above that can be replaced in the crystal reports code?
You can create a formula field and concatenate your text there.
If the data is stored in the database, the formula text should look like this:
"Some static text " & totext({yourRecord.yourDateField}, "yyyy")
Or you can provide it as a parameter before you show the report:
Dim parameterValue As New CrystalDecisions.Shared.ParameterDiscreteValue
parameterValue.Value = yourDate
Dim parameter As New CrystalDecisions.Shared.ParameterField
parameter.ParameterFieldName = "MyParam"
parameter.CurrentValues.Add(parameterValue)
parameter.HasCurrentValue = True
Me.CrystalReportViewer1.ReportSource = rapport
Me.CrystalReportViewer1.ParameterFieldInfo.Clear()
Me.CrystalReportViewer1.ParameterFieldInfo.Add(parameter)
Then the formula text should look like this:
"some static text " & {?MyParam}
I'm assuming you have a data source connected to your report. You can drag the field from the Database Explorer and drop it where it should appear. That way, whenever the report runs, the correct text will always be shown.