SELECT FROM (lv_tablename) error: the output table is too small - dynamic

I have an ABAP class method, say, select_something. select_something has an exporting parameter, say, et_result. et_result is of type standard table because the type of et_result cannot be determined until runtime.
The method sometimes gives a short dump saying With ABAP/4 Open SQL array select, the output table is too small at "select * into table et_result from (lv_tablename) where..."
Error analysis:
......in this particular case, the database table is 3806 bytes wide, but the internal table is only 70 bytes wide.
I tried "any table" too and the error is the same.

You could return a data reference. Your query will no longer fail, and you can assign the data to a correctly typed field symbol afterwards.
" Definition
class-methods select_all
importing
!tabname type string
returning
value(results) type ref to data.
...
...
" Implementation
method select_all.
data dref type ref to data.
create data dref type standard table of (tabname).
field-symbols <tab> type any table.
assign dref->* to <tab>.
select * from (tabname) into table <tab>.
get reference of <tab> into results.
endmethod.
Also, I agree with #vwegert that dynamic queries (and programming for that matter) should be avoided when possible.

What you're trying to do looks horribly wrong on many levels. NEVER use SELECT FROM (whatever) unless someone points a gun at your head AND the door is locked tight. You'll loose every kind of static error checking the system might be able to provide you with. For example, the compiler will no longer be able to tell you "Hey, that table you're reading from is 3806 bytes wide." It simply can't tell, even if you use constants. You'll find that out the hard way, producing short dumps, especially when switching between unicode and NUC systems, quite likely some in production systems. No fun.
(Actually there are a few - very very VERY few - good uses for dynamic table names in the SELECT statement. I need them about once every two to three years, and I code quite a lot weird stuff. Just avoid them wherever you can, even at the cost of writing more code. It's just not worth the trouble fixing broken stuff later.)
Then, changing the generic formal parameter type does not do anything to the type of the actual parameter. If you pass a STANRDARD TABLE OF mandt WITH DEFAULT KEY to your method, that table will have lines of 3 characters. It will be a STANDARD TABLE, and as such, it will also be an ANY TABLE, and that's about it. You can twist the generic types anywhere you like, there's no way to enforce correctness using generic types the way you use them. It's up to the caller to make sure that all the right types are used. That's a bad way to fly.

First off, I agree with vwegert's response, try to avoid dynamic sql selections if you can
That said, check the short dump. If the error is an exception class, you can wrap the SELECT statement in a try/catch block and at least stop it from dumping.
You can also try "INTO CORRESPONDING FIELDS OF TABLE et_result". If ET_RESULT is dynamic, you might have to cast it into the proper structure using RTTS. This might give you some ideas...

Couldn't agree more to vwegert, but if there is absolutely no other way (and there usually is) of performing your task than using dynamic select statements and dynamically typed parameters, do some checks on the type of the table and the parameter at runtime.
Use CL_ABAP_TYPEDESCR and its subclasses to do so.
This way, you can handle errors at runtime without your program dumping,
But as vwegert said, this dynamic stuff is pure evil and will most certainly break at some point during runtime. Adding the necessary error handling will most likely be a lot more work and a lot harder than redesigning your code to none dynamic SQL and typed parameters

Related

SQL UDF - Struct Diff

We have a table with 2 top level columns of type 'struct' - one is a 'before', and an 'after' image. The struct schemas are non trivial - nested, with arrays to a variable depth. The are sent to us from replication, so the schemas are always the same (but the schemas of course can be updated at some point, but always together)
Objective is for the two input structs, to return 2 struct 'diffs' of the before and after with only fields that have changed - essentially the 'delta' diff of the changes produce by the replication source. We know something has changed, but not 'what' since we get the full before and after image. this raw data lands in BQ and is then processed from there but need to determine the more granular change for high order BQ processing.
The table schema is very wide (1000's of leaf fields), and the data populated fairly spare (so alot of nulls will be present on both sides of the snapshot) - so would need to be performant as best as possible when executing over 10s of millions of rows.
All things are nullable for maximum flexibility.
So change could look like:
null -> value
value -> null
valueA -> valueB
Arrays:
recursive use of above for arrays of structs, ordering could be relaxed if that makes it easier?
It might not be possible.
Ive not attempted this yet as it seems really difficult so am looking to the community boffins for some support for this. I feel the arrays could be difficult part. There is probably an easy way perhaps in Python I dont or even doing some JSON conversion and comparison using JOSN tools? It feels like it would be a super cool feature built in to BQ as well, so if can get this to work, will add a feature request for it.
Id like to have a SQL UDF for reuse (we have SQL skills not python, although if easier in python then thats ok), and now with the new feature of persistent SQL UDFs, this seems the right time to ask and test the feature out!
sql
def struct_diff(before Struct, after Struct)
(beforeChange, afterChange) - type of signature but open to suggestions?
It appears to be really difficult to get a piece of reusable code. Since currently there is no support for recursive functions for SQL UDF, you cannot use a recursive approach for the nested structs.
Although, you might be able to get some specific SQL UDF functions depending on your array and structs structures. You can use an approach like this one to compare the structs.
CREATE TEMP FUNCTION final_compare(s1 ANY TYPE, s2 ANY TYPE) AS (
STRUCT(s1 as prev, s2 as cur)
);
CREATE TEMP FUNCTION compare(s1 ANY TYPE, s2 ANY TYPE) AS (
STRUCT(final_compare(s1.structA, s2.structA))
);
You can use UNNEST to work with arrays, and the final SQL UDF would really depend on your data.
As #rtenha suggested, Python could be a lot easier to handle this problem.
Finally, I did some tests using JavaScript UDF, and it was basically the same result, if not worst than SQL UDF.
The console allows a recursive definition of the function, however it will fail during execution. Also, javascript doesn't allow the ANY TYPE data type on the signature, so you would have to define the whole STRUCT definition or use a workaround like applying TO_JSON_STRING to your struct in order to pass it as a string.

Using bind variables in large insert statements

I am inheriting an application which has to read data from various types of files and use the OCI interface to move the data into an Oracle database. Most of the tables in question have about 40-50 columns, so the SQL insert statements become pretty lengthy.
When I inherited this code, it basically built up the insert statements via a series of strcats as a C string, then passed it to the appropriate OCI functions to set up and execute the statement. However, since much of the data is read directly from files into the column values, this leaves the application open to easy SQL injection. So I am trying to use bind variables to solve this problem.
In every example OCI application I can find, each variable is statically allocated and bound individually. This would lead to quite a bit of boilerplate, however and I'd like to reduce it to some sort of looping construct. So my solution is to, for each table, create a static array of strings containing the names of the table columns:
const char const *TABLE_NAME[N_COLS] = {
"COL_1",
"COL_2",
"COL_3",
...
"COL_N"
};
along with a short function that makes a placeholder out of a column name:
void makePlaceholder(char *buf, const char *col);
// "COLUMN_NAME" -> ":column_name"
So I then loop through each array and bind my values to each column, generating the placeholders as I go. One potential problem here is that, because the types of each column vary, I bind everything as SQLT_STR (strings) and thus expect Oracle to convert to the proper datatype on insertion.
So, my question(s) are:
What is the proper/idiomatic (if such a thing exists for SQL/OCI) to use bind variables for SQL insert statements with a large number of columns/params? More generally, what is the best way to use OCI to make this type of large insert statement?
Do large numbers of bind calls have a significant cost in efficiency compared to building and using vanilla C strings?
Is there any risk in binding all variables as strings and allowing Oracle to make the proper type conversion?
Thanks in advance!
Not sure about the C aspects of this. My answer will be from a DBA perspective.
Question 2:
Always use bind variables. It prevent SQL-injection and enhances performance.
The performance aspect is often overlooked by programmers. When Oracle receives a SQL it makes a hash of the entire SQL-text and looks in it's internal repository of execution plans to see if it has one. If bind variables was used it the SQL-text will be the same each time you run the query, not matter what the value of a variable is. However if you have concatenated the string your self Oracle will hash the SQL-text including content of (what you aught to have put in) variables, getting a unique hash every time. So if you do a query one million times Oracle will if you used bind variables make one execution plan, while if you did not use bind variables it will make one million execution plans and waste loads of resources doing that.

ABAP field symbols

Can someone simply explain me what happen in field symbols ABAP?
I'm glad if someone can explain the concept and how does it related to inheritance and how does it increasing the performance.
Field Symbols can be said to be pointers. Means, if You assign anything to a fields-symbol, the symbol is strong coupled ( linked ) to the variable, and any change to the fieldsymbol will change the variable immediately. In terms of performance, it comes to use, if You loop over an internal table. Instead of looping into a structure, You can loop into a fieldsymbol. If modifications to the internal table are made, then You can directly modify the fieldsymbol. Then You can get rid of the "modify" instruction,which is used in order to map the changes of the structure back to the corresponding line of the internal table.
"Read Table assigning" also serves the same purpose, like looping into a field-symbol.
Field-Symbol are more recommended then using a "workarea" ( when modifying ) , but references are the thing to go for now. They work almost similar to fieldsymbols.
Could I clarify it for You ?
Field-symbols in ABAP works as pointers in C++.
It has a lot of benefits:
Do not create extra-variables.
You can create a type ANY field-symbol, so you can point to any variable/table type memory space.
...
I hope these lines would be helpful.
Let's have a look at it when it comes to coding. Additionally i would like to throw in data references.
* The 'classic' way. Not recommended though.
LOOP AT lt_data INTO DATA(ls_data).
ls_data-value += 10.
MODIFY TABLE lt_data FROM ls_data.
ENDLOOP.
* Field symbols
LOOP AT lt_data ASSIGNING FIELD-SYMBOL(<fs_data>).
<fs_data>-value += 10.
ENDLOOP.
* Data references
LOOP AT lt_data REFERENCE INTO DATA(lr_data).
lr_data->value += 10.
ENDLOOP.
I personally prefer data references, because they go hand in hand with the OO approach. I have to admit that field symbols are slightly in front when it comes to performance.
The last two should be preferred when talking about modifying. The first example has an additional copy of data which decreases overall performance.

efficiency of using 'class' in Fortran 2003?

Fortran 2003 supports data polymorphism by using class, like:
subroutine excute(A)
class(*) :: A
select type (A)
class is ()
...
type is ()
...
end select
end subroutine
My question is: if I need to call this subroutine a huge amount of times, whether it will slow the code down because of the SELECT statement?
SELECT TYPE is typically implemented by the descriptor for the polymorphic object (CLASS(*) :: A here) having a token or pointer or index of similar that designates the dynamic type of the object. Execution of the select type is then like a SELECT CASE on this token, with the additional complication that non-polymorphic type guards (TYPE IS) are matched in preference to polymorphic guards (CLASS IS).
There is some overhead associated with this construct. That overhead is more than if the code did nothing at all. Then again, code that does nothing at all is rarely useful. So the real question is whether the use of SELECT TYPE is better or worse execution speed wise to some alternative approach that provides the necessary level of functionality (which might be less than the full functionality that SELECT TYPE provides) that your code needs. To answer that, you would need to define and implement that approach, and then measure the difference in speed in a context that's relevant to your use case.
As indicated in the comments, an unlimited polymorphic entity is essentially a type safe way of storing something of any type. In this case, SELECT TYPE is required at some stage to be able to access the value of the stored thing. However, this is only a rather specific subset of F2003's support for polymorphism. In more typical examples SELECT TYPE would not be used at all - the behaviour associated with the dynamic type of an object would be accessed by calling overridden bindings of the declared type.

Is using cfsqltype good practice?

When coding a cfqueryparam or cfprocparam, cfsqltype is optional. However, I've usually seen it coded. Are there any benefits to specifying cfsqltype?
The main benefit is an additional level of sanity checking for your query inputs, prior to passing it into your query. Also, in the case of date time values, I believe CF will properly translate datetime strings into the proper database format, if the cfsqltype="CF_SQL_DATE" or ="CF_SQL_TIMESTAMP" is specified.
In addition, I think it makes it more clear for future developers to see the types excepted when they read your code.
I would add to Jake's comment. In most RDBMS the database will need to run your variable through a type lookup to insure it's the proper type or can be cast to the proper type implicitly. A DB doesn't just throw a variable of "type Any" at a table or view. It has to build the proper typing into the execution plan. So if you don't provide a type you are asking the DB to "figure it out".
Whereas, when you specify the type you are pre-empting or pre-qualifying the data type. The engine knows the driver is presenting a variable of a certain type and can then use it directly or derive it directly.
Remember that, while security is a good reason to use cfqueryparam, it's only one reason. The other reason is to create correctly prepared statements that can executed efficiently - and ideally "pop" the execution plan cache on the DB server.