Delete the current row from an internal table in a loop - abap

Can I safely delete the active row while looping over an internal table?
As an example, consider this code:
LOOP AT lt_itab INTO ls_wa.
  IF [...]. " A check that can't be done inside a 'DELETE lt_itab WHERE'
    DELETE lt_itab INDEX sy-tabix.
    " OR
    DELETE lt_itab FROM ls_wa.
  ENDIF.
ENDLOOP.
Is it safe to delete records like this or will this logic not behave as intended?
Should I instead store the unique identifier for the rows in a temporary itab and run a DELETE lt_itab WHERE after the loop?
I assume that delete operations on records other than the one loaded in the current iteration will definitely cause issues, but I'm unsure whether this is a valid, let alone good, practice.

Whether it is safe or not depends largely on your coding skills. It has a defined result, and it's up to you to use the commands correctly. It is usually safe if nothing else happens after the DELETE statement within the loop. You can issue a CONTINUE statement right after the deletion to make sure that this is the case.
Do not use DELETE lt_itab INDEX sy-tabix. If you use some statement within your check that changes sy-tabix as a side effect (for example, looking up some entry in a check table - or calling a function module/method that does so), you will end up deleting the wrong lines.
Be aware that you can simply use the statement DELETE lt_itab. in your example since the line to delete is the current one.
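Putting these points together, a minimal sketch of the safe pattern (names taken from the question; the flag component is a hypothetical stand-in for the actual check):
LOOP AT lt_itab INTO ls_wa.
  IF ls_wa-flag = abap_true. " hypothetical check
    DELETE lt_itab.          " deletes the current row of the LOOP
    CONTINUE.                " ensure nothing else runs after the DELETE
  ENDIF.
ENDLOOP.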
If your table can have multiple identical lines, your second variant DELETE lt_itab FROM ls_wa. will delete all of them, not just the current one - whether that is intended depends on your requirements.
EDIT: To reiterate the "defined result": The current line is deleted. There is no "continuing with the next line" - with the addition INTO var you actually copied the entire line into your variable. That variable won't be touched, it's just out of sync with the table. This might be intentional - the system has no way of knowing this. If you use a field symbol instead, it will be UNASSIGNED, which - again - might be what you intended - and then again maybe not.
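For illustration, the same loop with a field symbol (again with a hypothetical check); after the DELETE, the field symbol is unassigned, so nothing may touch it before the CONTINUE:
LOOP AT lt_itab ASSIGNING FIELD-SYMBOL(<ls_wa>).
  IF <ls_wa>-flag = abap_true. " hypothetical check
    DELETE lt_itab.            " the current row is gone, <ls_wa> is now UNASSIGNED
    CONTINUE.                  " do not touch <ls_wa> after this point
  ENDIF.
ENDLOOP.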

Try this:
GET CURSOR LINE SY-CUROW.
DELETE ST_MAT INDEX SY-CUROW.

Related

Read statement fails as internal table is empty

Recently I've come across a READ statement which gives sy-subrc = 8. This happens because the internal table has no records and, at the same time, the variables in the WHERE clause are empty too. What I have been wondering is why we wouldn't check that the table is not initial before the READ statement.
Please let me know if we can check that the itab is not initial before the READ statement.
Thanks!
Yes, you can check that the itab is not initial before the READ statement. What's bothering you? It's not an issue. Also, the WITH KEY or WITH TABLE KEY clause is used with a READ statement, not WHERE.
And if no line exists in the internal table for the given key value, then SY-SUBRC will be 4.
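A minimal sketch of that guard (table, work area and key names are placeholders):
IF lt_itab IS NOT INITIAL.
  READ TABLE lt_itab INTO ls_wa WITH KEY matnr = lv_matnr.
  IF sy-subrc = 0.
    " line found
  ENDIF.
ENDIF.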
It's a wrong assumption to make a direct link between READ TABLE on an empty table and SY-SUBRC = 8. If you read the official ABAP documentation, you will see that SY-SUBRC = 8 is related to the variant READ TABLE ... BINARY SEARCH.
So, with READ TABLE ... BINARY SEARCH, SY-SUBRC will be 8 if the searched line doesn't exist and, had it been inserted, it would have been placed after the last line of the table. Of course, that's always the case when the internal table is empty.
Addendum (May 10th): SY-SUBRC = 8 may also occur with READ TABLE on internal tables of type SORTED (because it uses a binary search implicitly).
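To see the behavior, a small self-contained sketch (hypothetical names); on an empty table the searched line would always be inserted after the last line, so SY-SUBRC is 8:
TYPES: BEGIN OF ty_line,
         key_field TYPE c LENGTH 10,
       END OF ty_line.
DATA lt_itab TYPE STANDARD TABLE OF ty_line WITH DEFAULT KEY.

READ TABLE lt_itab INTO DATA(ls_line)
  WITH KEY key_field = 'X' BINARY SEARCH.
" lt_itab is empty, so SY-SUBRC = 8: the line was not found and its
" insertion point would be after the last (here: nonexistent) line.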

Suppressing an insuppressible warning while INSERT INTO itab

I am adding a new entry to a sorted internal table inside a loop. As the loop I'm in has a sort order that's different from that of the sorted table, I have to use an INSERT INTO statement instead of an APPEND TO, as the latter risks violating the sort order, causing a dump.
However, when I add that code, I get a syntax check warning with internal message code "MESSAGE GJK". In EPC it says:
Program: ZCL_CLASS Method METHOD_NAME Row: 301
Syntax check warning.
In the table "LT_TABLE_NAME" a row was to be changed,
deleted or inserted. It is not possible
to determine statically if a LOOP is active over "LT_TABLE_NAME"
Internal message code: MESSAGE GJK
Cannot be hidden using a pragma.
But "Cannot be hidden using a pragma" just doesn't work for me. I understand the reason for the warning but I know at build time with 100% certainty that no loop will be active on the internal table that I'm inserting new records into. Yet I cannot hide this warning. Aside from causing useless warnings while developing, in some environments I wouldn't be able to transport code with syntax check warnings in it!
Is there any way to suppress this insuppressible warning?
Failing that, is there any way to avoid it? I can probably do so by using a temporary unsorted table as an intermediate and then just APPENDing the rows into the sorted table, but I balk at creating a useless (million-row) internal table just to bypass what seems to be a glaring oversight.
This message cannot be suppressed, as was already stated in your previous question.
However, we can get rid of the initial cause of the problem, and that is the only right way to proceed here.
This error reports that some operation on the internal table was carried out using an implicit index specification, as described in the detailed message:
During the program flow, the current LOOP row is used, this means INDEX sy-tabix is used. If no LOOP is active over the table at this time, the runtime error TABLE_ILLEGAL_STATEMENT occurs.
For the current case of such an implicit operation, no encompassing LOOP statement for the table can be statically found (using the syntax check).
For some reason the compiler doesn't see your loop and therefore cannot find the loop index. What can be done in that case:
Use INSERT wa INTO TABLE instead of the short form of INSERT.
Use an explicit index for your INSERT statement:
INSERT wa INTO itab INDEX loopIdx.
The ABAP documentation for the INSERT wa INTO itab syntax variant confirms that this syntax requires a LOOP:
This variant is only possible within a LOOP across the same table and if the addition USING KEY is not specified in the LOOP. Each row to be inserted can be inserted before the current row in the LOOP.
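A minimal sketch of both options (lt_sorted, ls_row and lv_idx are hypothetical names):
" Option 1: key-based insert - works with or without an active LOOP
INSERT ls_row INTO TABLE lt_sorted.
" Option 2: explicit index instead of the implicit LOOP index
" (for a sorted table, the index must respect the sort order)
INSERT ls_row INTO lt_sorted INDEX lv_idx.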
P.S. The full text of this message can be fetched using the DOCU_CALL function module, passing it message code TRMSG_MESSAGE_GJK. All message codes are stored in the DOKIL table.
The most likely reason for getting this warning is actually a syntax error! It will happen whenever you've got a statement like the following:
INSERT [work area] INTO [internal table].
The actual syntax for insert into an itab requires INTO TABLE:
INSERT [work area] INTO TABLE [internal table].
The warning's description doesn't seem to match what is actually happening here. Presumably it's considering that the table might have a header line (which is not the case). If you run this code, you'll get a TABLE_ILLEGAL_STATEMENT dump with a much more descriptive error message:
An attempt was made to change, delete or add a row in internal table "[internal table]". There is no valid cursor for this table however.
This is actually the second time I've encountered this, but it's such a confusing message that I didn't remember the solution. I didn't intend to self-answer when I posted this, but I realised my mistake when I got the dump. I'm guessing the main problem is relying on syntax errors to tell me when I'm using incorrect syntax: the syntax check apparently doesn't consider this an outright error, even though it probably should.

ABAP field symbols

Can someone simply explain to me what happens with field symbols in ABAP?
I'd be glad if someone could explain the concept, how it relates to inheritance, and how it improves performance.
Field symbols can be thought of as pointers. That means if you assign anything to a field symbol, the symbol is strongly coupled (linked) to the variable, and any change to the field symbol changes the variable immediately. In terms of performance, this pays off when you loop over an internal table: instead of looping into a structure, you can loop into a field symbol. If modifications to the internal table are needed, you can modify the field symbol directly. You can then get rid of the MODIFY instruction, which is otherwise used to map the changes of the structure back to the corresponding line of the internal table.
"READ TABLE ... ASSIGNING" also serves the same purpose as looping into a field symbol (see the sketch below).
Field symbols are recommended over using a work area (when modifying), but references are the thing to go for now. They work almost the same as field symbols.
Does that clarify it for you?
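For example, a minimal sketch of "READ TABLE ... ASSIGNING" (all names here are hypothetical):
READ TABLE lt_data ASSIGNING FIELD-SYMBOL(<fs_data>) WITH KEY id = lv_id.
IF sy-subrc = 0.
  <fs_data>-value = 42. " changes the table line directly, no MODIFY needed
ENDIF.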
Field symbols in ABAP work like pointers in C++.
They have a lot of benefits:
They do not create extra variables.
You can create a field symbol of TYPE ANY, so it can point to memory of any variable or table type.
...
I hope these lines are helpful.
Let's have a look at it when it comes to coding. Additionally, I would like to throw in data references.
* The 'classic' way. Not recommended though.
LOOP AT lt_data INTO DATA(ls_data).
  ls_data-value += 10.
  MODIFY lt_data FROM ls_data. " short form inside a LOOP changes the current row
ENDLOOP.

* Field symbols
LOOP AT lt_data ASSIGNING FIELD-SYMBOL(<fs_data>).
  <fs_data>-value += 10.
ENDLOOP.

* Data references
LOOP AT lt_data REFERENCE INTO DATA(lr_data).
  lr_data->value += 10.
ENDLOOP.
I personally prefer data references, because they go hand in hand with the OO approach. I have to admit that field symbols are slightly in front when it comes to performance.
The last two should be preferred when it comes to modifying. The first example makes an additional copy of the data, which decreases overall performance.

Large number of UPDATE queries slowing down page

I am reading and validating large fixed-width text files (ranging from 10-50K lines) that are submitted via our ASP.net website (coded in VB.Net). I do an initial scan of the file to check for basic issues (line length, etc.). Then I import each row into an MS SQL table. Each DB row basically consists of a record_ID (primary, auto-incrementing) and about 50 varchar fields.
After the insert is done, I run a validation function on the file that checks each field in each row based on a bunch of criteria (trimmed length, isnumeric, range checks, etc.). If it finds an error in any field, it inserts a record into the Errors table, which has an error_ID, the record_ID and an error message. In addition, if the field fails in a particular way, I have to do a "reset" on that field. A reset might consist of blanking the entire field, or simply replacing the value with another value (e.g. replacing the string with a new one that has all illegal chars taken out).
I have a 5,000 line test file. The upload, initial check, and import takes about 5-6 seconds. The detailed error check and insert into the Errors table takes about 5-8 seconds (this file has about 1200 errors in it). However, the "resets" part takes about 40-45 seconds for 750 fields that need to be reset. When I comment out the resets function (returning immediately without actually calling the UPDATE stored proc), the process is very fast. With the resets turned on, the pages take 50 seconds to return.
My UPDATE stored proc is using some recommended code from http://sommarskog.se/dynamic_sql.html, whereby it uses CASE instead of dynamic SQL:
UPDATE dbo.Records
SET dbo.Records.file_ID = CASE @field_name WHEN 'file_ID' THEN @field_value ELSE file_ID END,
.
. (all 50 varchar field CASE statements here)
.
WHERE dbo.Records.record_ID = @record_ID
Is there any way I can improve performance here? Can I somehow group all of these UPDATE calls into a single transaction? Should I be reworking the UPDATE query somehow? Or is it just the sheer quantity of 750+ UPDATEs, and things are just slow (it's a quad-proc server with 8 GB RAM)?
Any suggestions appreciated.
Don't do this in SQL; fix the data up in code, then do your updates.
If you have SQL 2008, then look into table-valued parameters. They enable you to pass an entire table as a parameter to a sproc. From there you just have the one INSERT/UPDATE or MERGE statement.
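A minimal sketch of the idea, assuming SQL Server 2008+ (all names here are hypothetical):
-- One-time setup: a table type matching the reset data
CREATE TYPE dbo.FieldReset AS TABLE
(
    record_ID   int PRIMARY KEY,
    field_value varchar(255)
);
GO
-- One set-based UPDATE per column instead of hundreds of single-row calls
CREATE PROCEDURE dbo.ResetCol1
    @resets dbo.FieldReset READONLY
AS
    UPDATE r
    SET    r.col1 = t.field_value
    FROM   dbo.Records r
    JOIN   @resets t ON t.record_ID = r.record_ID;
GO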
If you're looping through the lines and doing individual updates/inserts, this can be really expensive... Consider using SqlBulkCopy, which can speed up all your inserts. Similarly, you can create a DataSet, make your updates on the DataSet, and then submit them all in one shot through a SqlDataAdapter.
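As a rough sketch of the SqlBulkCopy route (hypothetical names; assumes a DataTable already filled and corrected in code):
// using System.Data;
// using System.Data.SqlClient;
using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.Records";
    bulk.WriteToServer(recordsTable); // recordsTable is a populated DataTable
}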
I believe you are doing 50 CASE statements on every update. Sounds like that would be slow.
It is possible to solve this problem with injection-proof code via parameterized queries and a constant table of query strings.
Quick and dirty example code.
string[] queryList = { "UPDATE records SET col1 = @val WHERE ID = @key",
                       "UPDATE records SET col2 = @val WHERE ID = @key",
                       "UPDATE records SET col3 = @val WHERE ID = @key",
                       ...
                       "UPDATE records SET col50 = @val WHERE ID = @key" };
Then in your call to SQL you just pick the item in the array corresponding to the col you want to update, and set the value and key parameters.
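Roughly, the call might look like this (hypothetical variable names):
// colIndex picks the statement for the column being reset
using (var cmd = new SqlCommand(queryList[colIndex], connection))
{
    cmd.Parameters.AddWithValue("@val", newValue);
    cmd.Parameters.AddWithValue("@key", recordID);
    cmd.ExecuteNonQuery();
}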
I'm guessing you will see a significant improvement... let me know how it goes.
Um. Why are you inserting numeric data into VARCHAR fields then trying to run numeric checks on it? This is yucky.
Apply correct data typing and constraints to your table, do the INSERT, and see if it failed. SQL Server will happily report errors back to you.
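For instance, a sketch with hypothetical columns; the engine then rejects bad rows at INSERT time instead of needing a separate validation pass:
CREATE TABLE dbo.Records
(
    record_ID int IDENTITY(1,1) PRIMARY KEY,
    amount    int NOT NULL CHECK (amount BETWEEN 0 AND 99999),
    name      varchar(50) NOT NULL
);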
I would try changing the recovery model to simple and look at my indexes. Kimberly Tripp did a session showing a scenario with improved performance using a heap.

Can I maintain state between calls to a SQL Server UDF?

I have a SQL script that inserts data (via INSERT statements currently numbering in the thousands). One of the columns contains a unique identifier (though not an IDENTITY type, just a plain ol' int) that's actually unique across a few different tables.
I'd like to add a scalar function to my script that gets the next available ID (i.e. last used ID + 1) but I'm not sure this is possible because there doesn't seem to be a way to use a global or static variable from within a UDF, I can't use a temp table, and I can't update a permanent table from within a function.
Currently my script looks like this:
declare @v_baseID int
exec dbo.getNextID @v_baseID out --sproc to get the next available id
--Lots of these - where n is a hardcoded value
insert into tableOfStuff (someStuff, uniqueID) values ('stuff', @v_baseID + n )
exec dbo.UpdateNextID @v_baseID + lastUsedn --sproc to update the last used id
But I would like it to look like this:
--Lots of these
insert into tableOfStuff (someStuff, uniqueID) values ('stuff', dbo.getNextID() )
Hardcoding the offset is a pain in the arse and error-prone. Packaging it up into a simple scalar function is very appealing, but I'm starting to think it can't be done that way, since there doesn't seem to be a way to maintain the offset counter between calls. Is that right, or is there something I'm missing?
We're using SQL Server 2005 at the moment.
edits for clarification:
Two users hitting it won't happen. This is an upgrade script that will be run only once, and never concurrently.
The actual sproc isn't prefixed with sp_, fixed the example code.
In normal usage, we do use an id table and a sproc to get IDs as needed, I was just looking for a cleaner way to do it in this script, which essentially just dumps a bunch of data into the db.
I'm starting to think it can't be done that way since there doesn't seem to be a way to maintain the offset counter between calls. Is that right, or is there something I'm missing?
You aren't missing anything; SQL Server does not support global variables, and it doesn't support data modification within UDFs. And even if you wanted to do something as kludgy as using CONTEXT_INFO (see http://weblogs.sqlteam.com/mladenp/archive/2007/04/23/60185.aspx), you can't set that from within a UDF anyway.
Is there a way you can get around the "hardcoding" of the offset by making it a variable and looping over it, doing the inserts within that loop?
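Something along these lines, perhaps (a sketch against the script shown in the question; @row_count is hypothetical and the syntax is SQL 2005-compatible):
declare @v_baseID int, @n int, @row_count int
set @n = 0
set @row_count = 100 -- however many rows the script inserts
exec dbo.getNextID @v_baseID out

while @n < @row_count
begin
    insert into tableOfStuff (someStuff, uniqueID) values ('stuff', @v_baseID + @n)
    set @n = @n + 1
end

exec dbo.UpdateNextID @v_baseID + @n - 1 -- last used id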
If you have 2 users hitting it at the same time, they will get the same ID. Why not use an ID table with an IDENTITY column instead, insert into that, and use the generated value as the unique (guaranteed) ID? This will also perform much faster.
sp_getNextID
Never, ever prefix procs with sp_. This has performance implications, because the optimizer first checks the master DB to see if that proc exists there and only then the local DB. Also, if MS decides to create an sp_getNextID in a service pack, yours will never get executed.
It would probably be more work than it's worth, but you can use static C#/VB variables in a SQL CLR UDF, so I think you'd be able to do what you want by simply incrementing this variable every time the UDF is called. The static variable would be lost whenever the appdomain unloads, of course. So if you need continuity of your IDs from one day to the next, you'd need a way, on first access of NextId, to poll all of the tables that use this ID to find the highest value.