Filtering rows in Pentaho - pentaho

I have a dataset with columns containing numbers. However, some of the rows in that column have missing data. Instead of numbers, a dash (-) is placed in the cell.
What I want to happen is to separate those rows with a dash and output them to a separate excel file. Those without the dash, should output to a csv file.
I tried the "filter rows" but it gives me an error:
Unexpected conversion error while converting value [constant String] to a Number
constant String : couldn't convert String to number
constant String : couldn't convert String to number : non-numeric character found at position 1 for value [-]
My condition is if
Column1 CONTAINS - (String)

You cant try to convert to number in the select step,and handler the error, if can not convert to number that mean that is (-)

You can convert missing value indicators (like a dash or any other string) to null in Text-File-Input - see field option "Null if". That way you still can use the metadata detection feature and will not trip over a dash arriving in a Number field.
With CSV-File-Input you should stick to the String datatype until a Null-If step has cleansed the values, so you can change the datatype to Number in a Select-Values step.
If you must preserve the dash character, don't use metadata detection (as it suggests datatype Number) or use more rows to sample (so a field with a dash is encountered) or just revert the datatype to String again before saving and running the transformation.

My solution lies on the first 'Replace in String'. I replaced the dash into something numeric and can easily be distinguished from the rest of the numbers (I used 9999) and carried on with the rest of my process.
In filter rows, I had no problems anymore with the data type because both my variables and condition contained numbers, therefore, it no longer had to convert anything.
After filter rows, I added the 'Null-if' to remove the random 9999 that I used
just to have something to replace the dash.
After that, the separation was made just as I hope it would.
Thanks to #marabu for the Null-if idea.

Related

SSRS Report builder LENGTH expression & size specific field

I'm trying to build a bank file using SSRS report builder (3.0). I'm running into two issues:
I'm trying to get a length expression to work on a check number field but LEN(Field) doesn't work (returns 4 as the value regardless of the actual length of the field).
And LENGTH(Field) gives me an error:
The Value expression for the textrun 'Textbox15.Paragraphs[0]. TextRuns[0\' contains an error: [BC30451] Name 'LENGTH' is not declared*
The only reason why I'm even trying to get #1 to work is because I need to have one of the fields on the bank file have a constant length. Regardless of the check number, I need to make sure this field is always at 14 characters with leading zeros. I thought the best way to do this is to do a switch statement and add the number of appropriate zeros in depending on the size of the check number field.
Thanks for the help.
Edit: using a SQL server DB
For the length issue:
There are two ways to get string length
Using the LEN function
= LEN(Fields!myfield.Value)
Using the length property
= Fields!myfield.Value.Length
If your field is not a string, try converting first by using the Cstr function
= LEN( Cstr(Fields!myfield.Value) )
= Cstr(Fields!myfield.Value).Length
For the formatting issue:
For numeric fields set the cell format expression to have as many zeros as required eg. for 14 digit numbers
= "00000000000000"
I don't know on which database you are working if you are using sql server then try LEN
function and LENGHT in oracle.
I think you first convert it into integer if it's character and then try len function.

Is format ####0.000000 different to 0.000000?

I am working on some legacy code at the moment and have come across the following:
FooString = String.Format("{0:####0.000000}", FooDouble)
My question is, is the format string here, ####0.000000 any different from simply 0.000000?
I'm trying to generalize the return type of the function that sets FooDouble and so checking to make sure I don't break existing functionality hence trying to work out what the # add to it here.
I've run a couple tests in a toy program and couldn't see how the result was any different but maybe there's something I'm missing?
From MSDN
The "#" custom format specifier serves as a digit-placeholder symbol.
If the value that is being formatted has a digit in the position where
the "#" symbol appears in the format string, that digit is copied to
the result string. Otherwise, nothing is stored in that position in
the result string.
Note that this specifier never displays a zero that
is not a significant digit, even if zero is the only digit in the
string. It will display zero only if it is a significant digit in the
number that is being displayed.
Because you use one 0 before decimal separator 0.0 - both formats should return same result.

Extract only number/numeric

I have URL which is received by the application. The URL can take any format, but it always has a number at the end.
For example -
/Page/Modern-Day-Hotel-59460
/Page/Future-Fun-Days-At-Beach-223345
/Page/Hello-Page-123
/Page/This-Page/Second-Page/Main-Product-44231
As you can see there is a numeric value at the end of URL. I'm trying to extract only the numeric portion of the URL. is there any way to achieve this ? Since the URL length will change and the number size also varies, I'm looking for something generic
I'm using amazon redshift database
If the number is always preceded by a hyphen (as in your examples), relatively simple string manipulation is sufficient:
select right(col, position('-' in reverse(col)) - 1)
Note that this returns the last string after the last hyphen. It does not check that it is a number.

VB6 Change Numeric Field To Alphanumeric

There is a numeric field in a legacy application that I am trying to change to alphanumeric with a field length of about 15. The field is for data entry of account information. In the code, its referenced at numerous places:
.BANK_accno = Format(Me.txtBANK, "####-##-##-##-##")
and
!BANK_accno = Format(Me.txtBANK, "####-##-##-##-##")
The Format is: ####-##-##-##-## and the Mask is ####-##-##-##-##. What I am wondering is what Format (and code) changes should I make to get the field to become alphanumeric? I tried using ##########, however that has not worked.
As BobRodes commented, you can use # to mask characters not limited to numbers. There are other options (ignore spaces, force left-to-right filling, upper/lower case).
Have a look at Format function documentation at MSDN for details. This link is for VBA but Format strings should be compatible.
Please note that you still need to validate Input, Format function is not strict about input.

Convert an alphanumeric string to integer format

I need to store an alphanumeric string in an integer column on one of my models.
I have tried:
#result.each do |i|
hex_id = []
i["id"].split(//).each{|c| hex_id.push(c.hex)}
hex_id = hex_id.join
...
Model.create(:origin_id => hex_id)
...
end
When I run this in the console using puts hex_id in place of the create line, it returns the correct values, however the above code results in the origin_id being set to "2147483647" for every instance. An example string input is "t6gnk3pp86gg4sboh5oin5vr40" so that doesn't make any sense to me.
Can anyone tell me what is going wrong here or suggest a better way to store a string like the aforementioned example as a unique integer?
Thanks.
Answering by request form OP
It seems that the hex_id.join operation does not concatenate strings in this case but instead sums or performs binary complement of the hex values. The issue could also be that hex_id is an array of hex-es rather than a string, or char array. Nevertheless, what seems to happen is reaching the maximum positive value for the integer type 2147483647. Still, I was unable to find any documented effects on array.join applied on a hex array, it appears it is not concatenation of the elements.
On the other hand, the desired result 060003008600401100500050040 is too large to be recorded as an integer either. A better approach would be to keep it as a string, or use different algorithm for producing a number form the original string. Perhaps aggregating the hex values by an arithmetic operation will do better than join ?