Pentaho Database Join error when matching input data

I have an input.csv file in which I have field "id" .
I need to do a database lookup with below logic.
I need to check whether the "id" is present in the field "supp_text" and, if so, extract the field "loc_id".
E.g., id = 12345,
and in my supp_text I have the value "the value present is 12345".
I am using "Database join" function to do this.
viz.
select loc_id from SGTABLE where supp_text like '%?%';
and I am passing "id" as a parameter.
I get the below error when I run.
"Couldn't get field info from [select LOC_ID from SGTABLE WHERE SUPP_TEXT like '%?%']"
offending row : [ID String(5)]
All inputs are strings, and the table fields are VARCHAR.
I tried the "Database lookup" step too, but it does not have an option to match a substring within a string.
Please help.

The JDBC driver is not replacing the parameter within the string. You must make the wildcard string first and pass the whole thing as a parameter. Here is a quick transform I threw together that does just that:
Note that in the Database Join step the SQL does not have '' quotes around it. Note also that unless used properly, the Database Join step can be a performance killer. This however, looks to be a reasonable use of it if there are going to be a lot of different wildcard values to use (unlike in my transform).
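The fix described above can be sketched outside PDI as well: any driver that binds parameters positionally behaves the same way. Here is a minimal sqlite3 stand-in (table name and data copied from the question, contents made up) showing why the `?` inside a quoted string is never replaced, and how building the wildcard pattern first solves it:

```python
import sqlite3

# In-memory stand-in for SGTABLE; names match the question, data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SGTABLE (loc_id TEXT, supp_text TEXT)")
conn.execute("INSERT INTO SGTABLE VALUES ('LOC01', 'the value present is 12345')")

id_value = "12345"

# Wrong: a ? inside a string literal is part of the literal, not a parameter,
# so '%?%' is taken verbatim and the driver sees zero placeholders.
# conn.execute("SELECT loc_id FROM SGTABLE WHERE supp_text LIKE '%?%'", (id_value,))

# Right: build the whole wildcard string first, then bind it as the parameter.
pattern = "%" + id_value + "%"
rows = conn.execute(
    "SELECT loc_id FROM SGTABLE WHERE supp_text LIKE ?", (pattern,)
).fetchall()
print(rows)  # [('LOC01',)]
```

In PDI terms, the `pattern` variable corresponds to a field built in a Calculator or JavaScript step before the Database Join step.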

Related

Injection-safe "SELECT FROM table WHERE column IN list;" using python3 and sqlite3

I want to take a user input from an html form, do a SELECT by matching a column to the user input, and be safe against injection. BUT I want the user input to be a comma-separated list. For example, if the column is called "name" and user_input is "Alice,Bob,Carrol", I want to execute the query
SELECT FROM table WHERE name IN ("Alice","Bob","Carrol");
Which means I have the same problem as in this question select from sqlite table where rowid in list using python sqlite3 — DB-API 2.0.
But of course I do not want to do string concatenation myself to avoid injections. At the same time, because there could be any number of commas in user_input, I cannot do this:
db.execute('SELECT FROM table WHERE name IN (?,?,?)', user_input_splited)
I looked for a way to sanitize or escape the input by hand, to be able to do something like that:
db.execute('SELECT FROM table WHERE name IN ?', user_input_sanitized)
But I didn't find it. What's the best approach here?
Write your own code to take the user's input, split() it by comma, then iterate through that list. As you accept each value, push it onto one array, and push a literal "?" onto the other.
Of course, now verify that you have at least one acceptable value.
Now, construct your SQL statement to include, say, join(", ", $qmarks_array) to automatically build a string that looks like ?, ?, ? with an appropriate number of question marks. (It won't insert any commas if there's only one element.) Then, having constructed the SQL statement in this way, supply the other array as input to the function which executes that query.
In this way, you supply each value as a parameter, which you should always do, and you allow the number of parameters to vary.
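The steps above can be sketched directly in python3/sqlite3 as the question asks (table name and sample data are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")
conn.executemany("INSERT INTO people VALUES (?)",
                 [("Alice",), ("Bob",), ("Carrol",), ("Mallory",)])

user_input = "Alice,Bob,Carrol"

# Split the raw input by comma and discard empty entries.
values = [v.strip() for v in user_input.split(",") if v.strip()]
if not values:
    raise ValueError("no acceptable values supplied")

# Build a matching list of ? placeholders. Only the punctuation is
# concatenated into the SQL; the user-supplied values never are.
placeholders = ", ".join("?" * len(values))
sql = f"SELECT name FROM people WHERE name IN ({placeholders})"

rows = conn.execute(sql, values).fetchall()
print(sorted(rows))  # [('Alice',), ('Bob',), ('Carrol',)]
```

Each value still travels as a bound parameter, so a malicious entry like `x") OR 1=1 --` is matched as a literal name rather than executed.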

Hive/Impala: converting a string to lowercase before using it in HQL

I need to convert the name of the table to lowercase before passing it to the query.
Irrespective of the case in which I pass the value for parameter $1, I need it converted to lowercase before executing the query below.
QUERY:
show tables like '$1';
I have tried something like
QUERY
show tables like 'lower($1)';
But this doesn't work.
Please help; your response would be highly appreciated.
Impala identifiers are always case-insensitive. That is, tables named
t1 and T1 always refer to the same table, regardless of quote
characters. Internally, Impala always folds all specified table and
column names to lowercase. This is why the column headers in query
output are always displayed in lowercase.
Impala Documentation
All of the queries below give the same result, because Impala internally converts identifiers to lowercase.
show tables like 'test*';
show tables like 'TeSt*';
show tables like 'TEST*';

How to Pass parameter in table input?

I have one job with two transformations in it.
The first transformation gets a list of data which is passed to the second transformation, which executes once for each row passed from the first.
In second transformation I have used
"get row from result" -> "table input"
In "get row from result" there are five fields, but in the Table input I only need the fields in the 2nd and 3rd positions.
Even if I try to give a single parameter "?", I get the error:
"
2017/06/29 15:11:02 - Get Data from table.0 - Error setting value #3 [String] on prepared statement
2017/06/29 15:11:02 - Get Data from table.0 - Parameter index out of range (3 > number of parameters, which is 2).
"
My query is very simple
select * from table where col1= ? and col2 = ?
How can I fix this error? Am I doing anything wrong?
You can also give names to your parameters, so that your query becomes
select * from table where col1="${param2}" and col2="${param3}".
Don't forget to check the "Replace variable in script" checkbox, and to adapt the quotes to your sql dialect (ex: '${param1}' for SQL-Server).
Note that param2 and param3 must exist in the transformation's Settings/Parameters, without the ${...} decoration and with values that don't break the SQL.
The values of the parameters can be set or changed in a previous transformation with a Set variables step (variables and parameters are synonymous in first approximation) and a scope at least Valid in the parent job.
Of course, if you insist on unnamed parameters for legacy purposes or any other reason, you are responsible for telling PDI that the first one is to be discarded, e.g. where (? is null or 0=0) and col1=? and col2=?.
If you have 5 fields arriving to the table input, you need to pass 5 parameters to your query, and in the right order. Also, each parameter will be used only once.
So, if you have 5 fields and only use 2 of them, the best way is to put a Select values step between Get rows from result and Table input, and let only the actual query parameters through.
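Table input binds parameters positionally, exactly like any prepared statement. A minimal sqlite3 sketch (table and row data invented for illustration) of why five incoming fields break a two-placeholder query, and how keeping only the needed positions fixes it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, payload TEXT)")
conn.execute("INSERT INTO t VALUES ('a', 'b', 'row1')")

# Five fields arrive from the previous step, but the query has only two
# placeholders. Passing all five raises the same kind of
# "parameter index out of range" error reported in the question:
incoming_row = ("f1", "a", "b", "f4", "f5")

# The equivalent of the Select values step: keep only positions 2 and 3.
params = incoming_row[1:3]

rows = conn.execute(
    "SELECT payload FROM t WHERE col1 = ? AND col2 = ?", params
).fetchall()
print(rows)  # [('row1',)]
```

The parameter tuple must match the placeholders in both count and order; there is no way to tell a positional `?` to skip a field.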

SQL query against "bad word" filter list

My client wants to maintain a list of "bad words" that they can check auto-generated usernames against. This way, someone named "Fred Uckman" can be automatically rolled over to a different username format.
Given a particular input (#username nvarchar(60)), how best can I query against a single column table to see if that input contains any of the "bad words" in that table?
Build a table with a single column containing your bad words:
select count(*) from BadWords
where #userName like '%'+theBadWord+'%'
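The same predicate works in any SQL engine that supports string concatenation in a LIKE pattern. A sqlite3 sketch (table name from the answer, sample words and usernames made up), using `||` in place of T-SQL's `+`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BadWords (theBadWord TEXT)")
conn.executemany("INSERT INTO BadWords VALUES (?)", [("uckman",), ("xyzzy",)])

def contains_bad_word(username):
    # Here the column supplies the pattern: '%' || theBadWord || '%'
    # matches the bad word anywhere inside the candidate username.
    (count,) = conn.execute(
        "SELECT count(*) FROM BadWords WHERE ? LIKE '%' || theBadWord || '%'",
        (username.lower(),),
    ).fetchone()
    return count > 0

print(contains_bad_word("fred.uckman"))  # True
print(contains_bad_word("alice"))        # False
```

A count greater than zero means the generated username should roll over to the alternative format.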
You could use a trigger for this. If you store the "bad words" in a table and the destination usernames in another table, the trigger could run automatically BEFORE INSERT into the destination usernames table. To actually match the "bad word" you could use it as an operand to the LIKE SQL keyword, concatenating the wildcards % (multiple characters) or _ (a single character) to the beginning or end with CONCAT. Also, to create a trigger, you could use these steps.

I'm trying to do a solr search such that results are shown if a certain field has ANY value

I'm running a search with a type field. I'd like to show results of a certain type ONLY if two other fields have values for them. So in my filter query I thought it would be (type:sometype AND field1:* AND field2:*), but wildcard queries can't start with *.
Use a range query to express "field must have any value", e.g.:
type:sometype AND field1:[* TO *] AND field2:[* TO *]