How to parameterise a variable number of strings? - sql

I am writing a query where 'batch_name' is the parameter; sometimes I get only one batch name and sometimes I get 2 or more batch names. How can I handle this in an Oracle BI Publisher query?
Here is my query:
Select * from pay_batch_headers pbh Where UPPER(pbh.batch_name) = UPPER(:p_batch_name)
This query handles only one batch name; I want it to handle multiple batch names,
something like Where UPPER(pbh.batch_name) IN ('Batch1','Batch2','Batch3')
But the problem with using an IN clause is that I can't predict the number of batches I will have to query. Can anyone help me with this, please?

You have two choices. One is to munge the variables together into a string and use some method, such as regexp_like():
where regexp_like(upper(pbh.batch_name), ??)
The parameter string should look like: '^(abc|def|ghi|jkl)$' (the parentheses ensure the anchors apply to every alternative). You can make it as long as you like.
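For example, a minimal sketch of that approach against the original query, assuming the report passes one pipe-delimited value in a hypothetical :p_batch_names parameter (e.g. 'Batch1|Batch2|Batch3'):
Select *
From pay_batch_headers pbh
Where regexp_like(UPPER(pbh.batch_name), '^(' || UPPER(:p_batch_names) || ')$')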
Another method is to use execute immediate: dump the values into the SQL text as a string, using IN. The advantage of this method is that it can more easily use indexes.
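A rough PL/SQL sketch of that dynamic-SQL idea (OPEN ... FOR is shown rather than EXECUTE IMMEDIATE, since the statement returns rows); the list is hard-coded here for illustration and would really come from the report parameter, which must be trusted or escaped - exactly the injection concern discussed in the next question:
DECLARE
  l_batch_list VARCHAR2(4000) := '''BATCH1'',''BATCH2''';  -- would come from the parameter
  l_cur        SYS_REFCURSOR;
BEGIN
  -- Splice the IN list into the statement text, then open a cursor over it
  OPEN l_cur FOR
    'SELECT * FROM pay_batch_headers WHERE UPPER(batch_name) IN (' || l_batch_list || ')';
END;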

Related

How to separate the parameters in the SQL query and push them into an array to avoid SQL injection

SELECT * FROM table1 WHERE year_month BETWEEN '2021-08' AND '2022-01';
update table2 set note_description = 'test #8:57am', patient_id = '5840', note_updated_by = '10000019', note_update_date = '2022-07-13 09:45:49' where note_id = '639'
Now my backend queries can be attacked by SQL injection, so I want to avoid it.
In the above queries I want to separate the parameters from the queries and replace them with placeholder characters so that I can avoid SQL injection. Is there any package or anything to do this?
If you have received the SQL statement with the parameters already concatenated in, then this is the wrong place to fix your issue - there’s no way to safely parse the statement and separate out the parameters from the query.
You should find the place in the code where the parameters are concatenated into the statement and leverage prepared statements/parameterized queries to safely pass/bind the parameters.
If that’s not possible (for example because the code is structured to only pass along the statement) a less desirable alternative is to encode/enquote the parameters before concatenating them in, while ensuring they are all quoted in the statement. How you do that part will depend on the database / language being used.
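For instance, the two statements from the question rewritten with bind placeholders instead of concatenated literals (placeholder syntax varies by driver; ? is shown here). The client code binds the values separately, so they never become part of the SQL text:
SELECT * FROM table1 WHERE year_month BETWEEN ? AND ?;
UPDATE table2 SET note_description = ?, patient_id = ?, note_updated_by = ?, note_update_date = ? WHERE note_id = ?;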
I've seen one product that does this: pt-query-digest. It's a free tool that parses the MySQL query log, and produces reports of aggregate time spent running each query. To do this, it must establish a query "fingerprint" which allows it to group queries that are the same except for constant values. Like SELECT * FROM mytable WHERE id = 123 has the same fingerprint as SELECT * FROM mytable WHERE id = 456.
This means it must parse the queries and replace each constant value, like a numeric or string literal, with a placeholder ?. In cases of IN() predicates, it replaces the list of values with ?+. Also it reduces whitespace and removes comments.
It's a non-trivial amount of code, about 100 lines of Perl: https://github.com/percona/percona-toolkit/blob/3.x/lib/QueryRewriter.pm#L139-L248
Even so, the function is preceded by a comment in which the developers acknowledge it is not perfect and may miss some cases. Implementing what amounts to a parser using regular expressions is neither efficient nor fully correct.
But this is probably not what you want to do anyway. You shouldn't be starting from a query with constant values and making them into a parameterized query. You should design parameterized queries yourself, as needed.
Not every constant value in an SQL query necessarily must be parameterized. Only the ones that aren't fixed values. That is, if you need to combine a variable from your client code into the SQL query string, and you can't guarantee that the variable is safe, then use a parameter. If a query has a constant value that is fixed (not interpolated from a variable), then it can remain in the query. If a query has a value that comes from a variable, but that variable is known to be safe, and never can be tainted by untrusted input, then it can remain in the query.
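For example (table and column names made up), only the value coming from the client needs to be a parameter; the fixed status literal can stay in the query text:
SELECT * FROM orders WHERE status = 'SHIPPED' AND customer_id = ?;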
It's more reliable and economical for you to make these judgments. You know the code and the context much better than any automated system can.

SQL purpose of colon

My colleague wrote this SQL statement and I had a hard time understanding it. What exactly is the purpose of using a colon in the where clause?
WHERE MGM_YYMM like :AS_YYMM
Full Query:
SELECT A.MGM_YYMM,
A.MGM_DATE,
A.MGM_GB,
A.INDATE,
A.SUDATE,
A.EMPNUM
FROM SE_MAGAM A(NOLOCK)
WHERE MGM_YYMM like :AS_YYMM
ORDER BY MGM_YYMM DESC
It is a bind variable.
The program (or whatever else is issuing the query) will assign a value to :AS_YYMM, in this case the pattern to match against the MGM_YYMM column.
This kind of parameterized query is useful because it can be prepared/parsed/compiled/analyzed once and then run multiple times for varying inputs with reduced overhead (compared to issuing a new query each time). It also helps against SQL injection (compared to building a dynamic SQL statement from user input).
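As a small illustration of how the value is supplied outside the statement, assuming an Oracle-style tool such as SQL*Plus (the NOLOCK hint suggests the question's environment may be a different database, but the bind-variable principle is the same):
VARIABLE AS_YYMM VARCHAR2(10)
EXEC :AS_YYMM := '2023%'
SELECT A.MGM_YYMM, A.MGM_DATE
FROM SE_MAGAM A
WHERE A.MGM_YYMM LIKE :AS_YYMM
ORDER BY A.MGM_YYMM DESC;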

Hide Empty columns

I have a table with 75 columns. What is the SQL statement to display only the columns that have values in them?
Thanks
It's true that no such statement exists (in a SELECT you can apply condition filters only to the rows, not to the columns). But you could try to write a (slightly tricky) procedure. It must check, using queries, which columns contain at least one non-NULL/non-empty value. Once you have this list of columns, join them into a string with a comma between each one and compose a query that you can run, returning what you wanted.
EDIT: I thought about it and I think you can do it with a procedure, but only under one of these conditions:
find a way to retrieve the column names dynamically in the procedure, i.e. from the metadata (I have never done this, but I'm new to procedures)
or hardcode all the column names (losing generality)
You could collect the column names in an array, if your DBMS's stored procedures support arrays (or write the procedure in a programming language like C), and loop over them, issuing a SELECT each time to check whether the column is empty*. If it contains at least one value, concatenate it into a comma-separated string of column names. Finally you can build your query with only the non-empty columns! A sketch of this idea follows below.
As an alternative to a stored procedure you could write a short program (e.g. in Java), which gives you more flexibility.
*If you check for NULL values it will be simple, but if you check for empty values you will need to handle each column's data type... another array with the data types?
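A rough sketch of that idea, assuming Oracle and a hypothetical table MY_TABLE; it checks each column for at least one non-NULL value (the empty-value case would need the per-data-type handling mentioned above) and prints the resulting SELECT:
DECLARE
  l_cols VARCHAR2(4000);
  l_cnt  NUMBER;
BEGIN
  FOR c IN (SELECT column_name FROM user_tab_columns WHERE table_name = 'MY_TABLE') LOOP
    -- Count the non-NULL values in this column
    EXECUTE IMMEDIATE
      'SELECT COUNT(*) FROM my_table WHERE ' || c.column_name || ' IS NOT NULL'
      INTO l_cnt;
    IF l_cnt > 0 THEN
      l_cols := l_cols || CASE WHEN l_cols IS NOT NULL THEN ', ' END || c.column_name;
    END IF;
  END LOOP;
  DBMS_OUTPUT.put_line('SELECT ' || l_cols || ' FROM my_table');
END;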
I would suggest that you write a SELECT statement and define which COLUMNS you wish to display and then save that QUERY as a VIEW.
This will save you the trouble of typing in the column names every time you wish to run that query.
As marc_s pointed out in the comments, there is no select statement to hide columns of data.
You could do a pre-parse and dynamically create a statement to do this, but this would be a very inefficient thing to do from a SQL performance perspective. I would strongly advise against what you are trying to do.
A simplified version of this is to just select the relevant columns, which was what I needed personally. A quick look at what we're dealing with in the table:
SELECT * FROM table1 LIMIT 10;
-> shows 20 columns, of which I'm interested in 3. LIMIT is just there so the console doesn't overflow.
SELECT column1,column3,colum19 FROM table1 WHERE column3='valueX';
It is a bit of a manual filter but it works for what I need.

How to pass an entire row (in SQL, not PL/SQL) to a stored function?

I am having the following (pretty simple) problem. I would like to write an (Oracle) SQL query, roughly like the following:
SELECT count(*), MyFunc(MyTable.*)
FROM MyTable
GROUP BY MyFunc(MyTable.*)
Within PL/SQL, one can use a RECORD type (and/or %ROWTYPE), but to my knowledge, these tools are not available within SQL. The function expects the complete row, however. What can I do to pass the entire row to the stored function?
Thanks!
Don't think you can.
Either create the function with all the arguments you need, or pass the id of the row and do a SELECT within the function.
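A minimal sketch of the second suggestion, with made-up names (it assumes MyTable has an id primary key and a some_col column):
CREATE OR REPLACE FUNCTION MyFunc(p_id IN MyTable.id%TYPE) RETURN VARCHAR2 IS
  l_row MyTable%ROWTYPE;
BEGIN
  -- Fetch the whole row inside the function, then work with the record in PL/SQL
  SELECT * INTO l_row FROM MyTable WHERE id = p_id;
  RETURN l_row.some_col;
END;
The query then groups on the id-based call:
SELECT COUNT(*), MyFunc(t.id)
FROM MyTable t
GROUP BY MyFunc(t.id);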

Sql Optimization: Xml or Delimited String

This is hopefully just a simple question involving performance optimizations when it comes to queries in SQL Server 2008.
I've worked for companies that use Stored Procs a lot for their ETL processes as well as some of their websites. I've seen the scenario where they need to retrieve specific records based on a finite set of key values. I've seen it handled in 3 different ways, illustrated via pseudo-code below.
Dynamic SQL that concatenates a string and executes it.
EXEC('SELECT * FROM TableX WHERE xId IN (' + @Parameter + ')')
Using a user defined function to split a delimited string into a table
SELECT * FROM TableY INNER JOIN SPLIT(@Parameter) ON yID = splitId
Using XML as the parameter instead of a delimited varchar value
SELECT * FROM TableZ JOIN @Parameter.nodes(xpath) AS x (y) ON ...
While I know creating the dynamic SQL in the first snippet is a bad idea for a large number of reasons, my curiosity comes from the last 2 examples. Is it more efficient to do the due diligence in my code to pass such lists via XML as in snippet 3, or is it better to just delimit the values and use a UDF to take care of it?
There is now a 4th option - table-valued parameters (TVPs), whereby you can pass a table of values into a sproc as a parameter and then use it as you normally would a table variable. I'd prefer this approach over the XML (or CSV) parsing approach.
I can't quote performance figures between all the different approaches, but that's one I'd be trying - I'd recommend doing some real performance tests on them.
Edit:
A little more on TVPs. In order to pass the values into your sproc, you just define a SqlParameter (SqlDbType.Structured) - the value of this can be set to any IEnumerable, DataTable or DbDataReader source. So presumably, you already have the list of values in a list/array of some sort - you don't need to do anything to transform it into XML or CSV.
I think this also makes the sproc clearer, simpler and more maintainable, providing a more natural way to achieve the end result. One of the main points is that SQL performs best at set-based activities, not at looping or string manipulation.
That's not to say it will perform great with a large set of values passed in. But with smaller sets (up to ~1000) it should be fine.
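A minimal T-SQL sketch of the TVP approach, with illustrative names; on the client side this corresponds to the SqlDbType.Structured parameter described above:
CREATE TYPE dbo.IdList AS TABLE (id INT PRIMARY KEY);
GO
CREATE PROCEDURE dbo.GetRowsByIds
    @Ids dbo.IdList READONLY   -- table-valued parameters must be declared READONLY
AS
BEGIN
    SELECT t.*
    FROM TableX AS t
    JOIN @Ids AS i ON i.id = t.xId;   -- join the parameter like an ordinary table
END;
GO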
UDF invocation is a little bit more costly than splitting the XML using the built-in function.
However, this only needs to be done once per query, so the performance difference will be negligible.
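For comparison, a hedged sketch of the XML variant from the question, again with illustrative names:
DECLARE @Ids XML = '<ids><id>1</id><id>2</id><id>3</id></ids>';
SELECT t.*
FROM @Ids.nodes('/ids/id') AS x(n)
JOIN TableX AS t ON t.xId = x.n.value('.', 'INT');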