I have to read 10,000,000 records from a table.
Is it better:
to read these records one by one using SELECT ... ENDSELECT (without internal table)
or to read all of them at once using SELECT ... INTO TABLE itab and then LOOP through this internal table?
If all 10,000,000 entries fit into ABAP's main memory, you should select all of them with a single SELECT ... INTO TABLE ..., followed by a LOOP.
This reduces expensive database interaction to a minimum and will be fastest.
If the records don't fit into main memory, you need to retrieve them in packages. Check out the PACKAGE SIZE addition of the SELECT statement.
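A packaged read as suggested above could look roughly like this. It is only a minimal sketch: the table mara and the package size of 50,000 are placeholders chosen for illustration, the processing inside the loop is left open, and the syntax with @ and WITH EMPTY KEY assumes a 7.40 or later release.

```abap
DATA lt_mara TYPE STANDARD TABLE OF mara WITH EMPTY KEY.

" Each pass of the SELECT loop replaces the content of lt_mara
" with the next package of at most 50,000 rows.
SELECT *
  FROM mara
  INTO TABLE @lt_mara
  PACKAGE SIZE 50000.

  LOOP AT lt_mara ASSIGNING FIELD-SYMBOL(<ls_mara>).
    " process one row here (placeholder)
  ENDLOOP.

ENDSELECT.
```

Because of PACKAGE SIZE, the statement has to be closed with ENDSELECT, and only one package is held in memory at a time.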
They have about the same speed.
Contrary to popular belief, SELECT ... ENDSELECT does not fetch the rows one-by-one, so its performance is not much worse than SELECT ... INTO TABLE. See the explanation here.
The big problem with SELECT ... ENDSELECT is that it prevents other performance improvements.
Consider this coding:
SELECT matnr FROM mara
INTO lv_matnr WHERE (...).
SELECT SINGLE ebeln
FROM ekpo
INTO lv_ebeln
WHERE matnr = lv_matnr.
SELECT SINGLE zfield
FROM ztable
INTO lv_zfield
WHERE matnr = lv_matnr.
...
ENDSELECT.
Most of the time will be spent on the SELECT SINGLEs on tables ekpo and ztable. Often the solution for this is FOR ALL ENTRIES (but see the note on JOINs at the end); however, you need an internal table for that.
So it has to be converted into a SELECT ... INTO TABLE anyway:
SELECT matnr FROM mara
  INTO TABLE lt_matnr WHERE (...).
IF lt_matnr IS NOT INITIAL.
  SELECT matnr, ebeln FROM ekpo
    INTO TABLE @lt_ekpo "make it SORTED by matnr
    FOR ALL ENTRIES IN @lt_matnr
    WHERE matnr = @lt_matnr-table_line.
  SELECT matnr, zfield FROM ztable
    INTO TABLE @lt_ztable "make it SORTED by matnr
    FOR ALL ENTRIES IN @lt_matnr
    WHERE matnr = @lt_matnr-table_line.
ENDIF.
LOOP AT lt_matnr ASSIGNING <fs_matnr>.
  READ TABLE lt_ekpo ASSIGNING <fs_ekpo>
    WITH KEY matnr = <fs_matnr>.
  READ TABLE lt_ztable ASSIGNING <fs_ztable>
    WITH KEY matnr = <fs_matnr>.
  ...
ENDLOOP.
You should avoid SELECT ... ENDSELECT, not because of its own performance, but to make other improvements easier.
Note: you should use JOINs instead of FOR ALL ENTRIES whenever you can.
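As an illustration of the JOIN alternative, the lookup on ekpo from the example above could be folded into the main query. This is only a sketch: it assumes a release that supports the newer Open SQL syntax with inline declarations (7.40 SP05 or later), the result name lt_result is made up for this example, and the WHERE condition that the original elides is left out.

```abap
" One database round trip instead of a driver table plus FOR ALL ENTRIES.
" LEFT OUTER JOIN keeps materials that have no purchasing item at all.
SELECT m~matnr, e~ebeln
  FROM mara AS m
  LEFT OUTER JOIN ekpo AS e
    ON e~matnr = m~matnr
  INTO TABLE @DATA(lt_result).

LOOP AT lt_result ASSIGNING FIELD-SYMBOL(<ls_result>).
  " <ls_result>-ebeln is initial when no ekpo row matched
ENDLOOP.
```

Unlike the SELECT SINGLE version, the join returns one row per matching ekpo item, so a material can appear several times in lt_result.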
I'm trying to use the following:
update bseg from zbseg
where the tables do not have the same structure (ZBSEG is a reduced version of BSEG).
BSEG is just an example: I have a loop in which all cluster tables are iterated, so everything should be dynamic.
The table data from the cluster is reduced to only a few fields and copied to a transparent table (the new transparent table contains the primary key plus only a few of the cluster's fields). Afterwards the data is modified in the database and copied back to the cluster via UPDATE.
update bseg from zbseg
This statement updates the field values taken from ZBSEG, but the remaining fields do not keep their old values; they are set to initial values instead.
I've tried even that:
SELECT *
FROM bseg
INTO TABLE gt_bseg.
SELECT mandt bukrs belnr gjahr buzei buzid augdt
FROM zbseg
INTO CORRESPONDING FIELDS OF TABLE gt_bseg.
but it still overwrites the fields that are not contained in zbseg.
Is there any statement that updates only the fields extracted from ZBSEG, without touching the other BSEG fields?
I think you need to read the records from zbseg in limited batches, because there may be millions of them. Then read the matching records from bseg one by one and update them, and afterwards delete the processed records from zbseg (or flag them as done) for performance.
tables: BSEG, ZBSEG.
data: GT_ZBSEG like ZBSEG occurs 1 with header line,
      GS_BSEG  type BSEG.
" read the next batch of changed rows
select *
  into table GT_ZBSEG up to 1000 rows
  from ZBSEG.
check SY-SUBRC is initial.
check SY-DBCNT is not initial.
loop at GT_ZBSEG.
  " the client (MANDT) is handled automatically by Open SQL
  select single * from BSEG into GS_BSEG
    where BSEG~BUKRS = GT_ZBSEG-BUKRS
      and BSEG~BELNR = GT_ZBSEG-BELNR
      and BSEG~GJAHR = GT_ZBSEG-GJAHR
      and BSEG~BUZEI = GT_ZBSEG-BUZEI.
  if SY-SUBRC ne 0.
    message E208(00) with 'Record not found!'.
  endif.
  " write back only if one of the copied fields actually changed
  if GS_BSEG-BUZID ne GT_ZBSEG-BUZID
  or GS_BSEG-AUGDT ne GT_ZBSEG-AUGDT.
    move-corresponding GT_ZBSEG to GS_BSEG.
    update BSEG from GS_BSEG.
  endif.
  " delete the records that are unchanged or already transferred
  delete ZBSEG from GT_ZBSEG.
endloop.
Here is a piece of code you can use for your task. It is based on a dynamic UPDATE statement, which allows updating only certain fields:
DATA: handle TYPE REF TO data,
lref_struct TYPE REF TO cl_abap_structdescr,
source TYPE string,
columns TYPE string,
keys TYPE string,
cond TYPE string,
sets TYPE string.
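SELECT tabname FROM dd02l INTO TABLE @DATA(clusters) WHERE tabclass = 'CLUSTER'.
LOOP AT clusters ASSIGNING FIELD-SYMBOL(<cluster>).
* reset the dynamic clauses so they do not accumulate across tables
CLEAR: columns, keys, cond, sets.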
lref_struct ?= cl_abap_structdescr=>describe_by_name( <cluster>-tabname ).
source = 'Z' && <cluster>-tabname. " name of your ZBSEG-like table
* get key fields
DATA(key_fields) = VALUE ddfields( FOR line IN lref_struct->get_ddic_field_list( )
WHERE ( keyflag NE space ) ( line ) ).
lref_struct ?= cl_abap_structdescr=>describe_by_name( source ).
* get all fields from source reduced table
DATA(fields) = VALUE ddfields( FOR line IN lref_struct->get_ddic_field_list( ) ( line ) ).
* filling SELECT fields and SET clause
LOOP AT fields ASSIGNING FIELD-SYMBOL(<field>).
AT FIRST.
columns = <field>-fieldname.
CONTINUE.
ENDAT.
CONCATENATE columns <field>-fieldname INTO columns SEPARATED BY `, `.
IF NOT line_exists( key_fields[ fieldname = <field>-fieldname ] ).
IF sets IS INITIAL.
sets = <field>-fieldname && ` = @<fsym_wa>-` && <field>-fieldname.
ELSE.
sets = sets && `, ` && <field>-fieldname && ` = @<fsym_wa>-` && <field>-fieldname.
ENDIF.
ENDIF.
ENDLOOP.
* filling key fields and conditions
LOOP AT key_fields ASSIGNING <field>.
AT FIRST.
keys = <field>-fieldname.
CONTINUE.
ENDAT.
CONCATENATE keys <field>-fieldname INTO keys SEPARATED BY `, `.
IF cond IS INITIAL.
cond = <field>-fieldname && ` = @<fsym_wa>-` && <field>-fieldname.
ELSE.
cond = cond && ` AND ` && <field>-fieldname && ` = @<fsym_wa>-` && <field>-fieldname.
ENDIF.
ENDLOOP.
* constructing reduced table type
lref_struct ?= cl_abap_typedescr=>describe_by_name( source ).
CREATE DATA handle TYPE HANDLE lref_struct.
ASSIGN handle->* TO FIELD-SYMBOL(<fsym_wa>).
* updating result cluster table
SELECT (columns)
FROM (source)
INTO @<fsym_wa>.
UPDATE (<cluster>-tabname)
SET (sets)
WHERE (cond).
ENDSELECT.
ENDLOOP.
This piece selects all cluster tables from DD02L and assumes that you have a reduced DB table, prefixed with Z, for each target cluster table, e.g. ZBSEG for BSEG, ZBSET for BSET, ZKONV for KONV and so on.
Tables are updated by primary key, which must be included in the reduced table. The fields to be updated are all fields of the reduced table except the key fields, because primary key fields cannot be updated.
You could try to use the MODIFY statement to update the tables.
Another way to do it would be to use cl_abap_typedescr to get the fields of each table and compare them for the update.
Here is an example of how to get the fields.
DATA : ref_table_des TYPE REF TO cl_abap_structdescr,
columns TYPE abap_compdescr_tab.
ref_table_des ?= cl_abap_typedescr=>describe_by_data( struc ).
columns = ref_table_des->components[].
So far, I always used this to get specific lines from an internal table:
LOOP AT it_itab INTO ls_itab WHERE place = 'NEW YORK'.
  APPEND ls_itab TO it_anotherItab.
  " or: INSERT ls_itab INTO TABLE it_anotherItab.
ENDLOOP.
However, with 7.40 there seems to be REDUCE, FOR, LINES OF and FILTER. FILTER requires a sorted or hashed key, which isn't the case in my example. So I guess only FOR comes into question.
DATA(it_anotherItab) = VALUE t_itab( FOR wa IN it_itab WHERE ( place = 'LONDON' )
( col1 = wa-col2 col2 = wa-col3 col3 = ....... ) ).
The questions are:
Are both indeed doing the same? Is the 2nd one an APPEND or INSERT?
Is it possible in the second variant to use the whole structure and not specifying every column? Like just ( wa )
Is the second example faster?
Following up on your comment: you can also define a sorted secondary key on a standard table. Just look at this example:
TYPES:
BEGIN OF t_line_s,
name1 TYPE name1,
name2 TYPE name2,
ort01 TYPE ort01,
END OF t_line_s,
t_tab_tt TYPE STANDARD TABLE OF t_line_s
WITH NON-UNIQUE EMPTY KEY
WITH NON-UNIQUE SORTED KEY place_key COMPONENTS ort01. "<<<
DATA(i_data) = VALUE t_tab_tt( ). " fill table with test data
DATA(i_london_only) = FILTER #(
i_data
USING KEY place_key " we want to use the secondary key
WHERE ort01 = CONV #( 'london' ) " stupid conversion rules...
).
" i_london_only contains the filtered entries now
UPDATE:
In my quick & dirty performance test, FILTER is slow on first call but beats the LOOP-APPEND variant afterwards.
UPDATE 2:
Found the reason today...
... the administration of a non-unique secondary table key is updated at the next explicit use of the secondary table key (lazy update).
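Regarding the second question, a FOR expression can add the whole work area instead of listing every column, as long as the line type of the target table is compatible with that of the source. A minimal sketch, reusing the names it_itab, t_itab and place from the question; the line type t_line and the test data are assumptions made up for this example:

```abap
TYPES: BEGIN OF t_line,
         place TYPE string,
         col1  TYPE i,
       END OF t_line,
       t_itab TYPE STANDARD TABLE OF t_line WITH EMPTY KEY.

DATA(it_itab) = VALUE t_itab( ( place = `LONDON`   col1 = 1 )
                              ( place = `NEW YORK` col1 = 2 )
                              ( place = `LONDON`   col1 = 3 ) ).

" ( wa ) adds the complete line; FOR ... WHERE needs no secondary key.
DATA(it_another_itab) = VALUE t_itab( FOR wa IN it_itab
                                      WHERE ( place = `LONDON` )
                                      ( wa ) ).
```

For a standard table like this one, the selected lines are added in the order in which they are found in it_itab.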
I would like to find the columns in a table that contain null values.
Is there a system table that has that information?
To find columns where "null" values are allowed try...
select *
from dbc.columns
where databasename = 'your_db_name'
and tablename = 'your_table_name'
and Nullable = 'Y'
Then, to identify the specific rows with null values, take the "ColumnName" from the previous result set and run queries against it. Perhaps put the results in a volatile table if you want to take further action on them (update, delete).
-- for example you found out that column "foo" is nullable...
create volatile table isnull_foo_col
as
(
sel *
from your_table_name
where foo is null
) with data
on commit preserve rows;
If you have statistics collected on the column you can use the views found here for Teradata 12.0.03+ and Teradata 13.0.02+ to determine the number of records in the table that have NULL values.
In Teradata 14, if you use the SHOW STATISTICS with the VALUES clause you will get similar information generated by the views listed at the link above.
You can use the DBC.Columns data dictionary view to determine what columns in a particular table are nullable.
I have a little problem.
I need a SQL query that returns all rows whose value contains only zeros.
The column is defined as varchar2(6).
The values in the column look like this:
Row  Value
1    0
2    00
3    00
4    100
5    bc00
6    000000
7    00000
My first solution would be like this:
Oracle:
substr('000000' || COLUMN_NAME, -6) = '000000'
SQL Server:
right('000000' + COLUMN_NAME, 6) = '000000'
Is there another way?
(It needs to work on both systems.)
The output would be rows 1, 2, 3, 6 and 7.
This is the simplest one:
select * from tbl where replace(col,'0','') = ''
If you don't want to create a computed column for that expression, you can opt for a function-based index to make the query performant (note: Oracle and Postgres already support this; Sql Server, as of version 2008, does not yet):
create index ix_tbl on tbl(replace(col,'0',''))
[EDIT]
I am keeping the answer below for posterity; in it I tried to explain how to make the query use an index on a computed column.
Use this:
select * from tbl
where ISNUMERIC(col) = 1 and cast(col as int) = 0
For ISNUMERIC needs on Oracle, use this: http://www.oracle.com/technology/oramag/oracle/04-jul/o44asktom.html
[EDIT]
@Charles, re: computed column on Oracle:
For RDBMSes that support computed columns but have no persisted option, yes, the function is called for every row. If persisted columns are supported, no function call is made at query time: the table holds a real column that is precomputed from that function. Now, if the data could make the function raise an exception, there are two scenarios.
First, if you didn't specify PERSISTED, you are allowed to add the computed column (ALTER TABLE tbl ADD numeric_equivalent AS cast(col as int)) even if the existing data will raise an exception, but you cannot select that column unconditionally. This will raise an exception:
select * from tbl
this won't raise exception:
select * from tbl where is_col_numeric = 1
this will:
select * from tbl where numeric_equivalent = 0 and is_col_numeric = 1
this won't (Sql Server supports short-circuiting):
select * from tbl where is_col_numeric = 1 and numeric_equivalent = 0
For reference, the is_col_numeric above was created using this:
ALTER TABLE tbl ADD
is_col_numeric AS isnumeric(col)
And this is is_col_numeric's index:
create index ix_is_col_numeric on tbl(is_col_numeric)
Now for the second scenario: if you add a computed column with the PERSISTED option to a table that already has data (e.g. 'ABXY', 'X1', 'ETC') that raises an exception when the function or expression (e.g. cast) is applied to it, your RDBMS will not allow you to create the computed column. If your table has no data, it will accept the PERSISTED option, but when you later attempt to insert data that raises an exception (e.g. insert into tbl(col) values('ABXY')), your RDBMS will refuse to save it. Thus only numeric text can be saved in your table, and your PERSISTED computed column degenerates into a check constraint, albeit a very roundabout one.
For reference, here's the persisted computed column sample:
ALTER TABLE tbl ADD
numeric_equivalent AS cast(col as int) persisted
Now, some of us might be tempted to leave the PERSISTED option off the computed column. In terms of performance this is rather self-defeating, because you might not be able to create an index on it later. If you later want to create an index on the unpersisted computed column and the table already contains 'ABXY', the database won't allow it. Index creation needs to obtain the value of the column, and if evaluating the column raises an exception, the index cannot be created.
If we try to cheat a bit, i.e. we create an index on the unpersisted computed column immediately after creating the table, the database will allow that. But when we later insert 'ABXY' into the table, the row will not be saved, because the database maintains the index(es) whenever data is inserted. The index builder receives an exception instead of a value, so it cannot create an index entry for the row we tried to insert, and consequently the insert fails.
So how can we attain index nirvana on a computed column? First of all, we make sure that the computed column is PERSISTED, which ensures that errors kick in immediately. Without the PERSISTED option, anything that could raise an exception is deferred to index construction, which just makes things fail later. Bugs are easier to find when they happen sooner. After making the column persisted, put an index on it.
So if the existing data is '00', '01', '2', we are allowed to create the persisted computed column. If we then insert 'ABXY', it will not be inserted, because the database cannot persist anything from a computed column that raised an exception. So we will just roll our own cast that doesn't raise an exception.
To wit(just translate this to Oracle equivalent):
create function cast_as_int(@n varchar(20)) returns int with schemabinding
begin
    begin try
        return cast(@n as int);
    end try
    begin catch
        return null;
    end catch
end;
Please note that catching an exception in a UDF does not work yet in Sql Server, but Microsoft has plans to support that.
This is now our non-exception-raising persisted computed column:
ALTER TABLE tbl ADD
numeric_equivalent AS cast_as_int(col) persisted
Drop the existing index, then recreate it:
create index ix_num_equiv on tbl(numeric_equivalent)
Now this query becomes an index-abiding citizen: it is performant and won't raise an exception even if the order of the conditions is reversed:
select * from tbl where numeric_equivalent = 0 and is_col_numeric = 1
To make it more performant, since the numeric_equivalent column doesn't raise any more exceptions, we have no more use for is_col_numeric, so just use this:
select * from tbl where numeric_equivalent = 0
Do you like:
SELECT * FROM MY_TABLE
WHERE REPLACE (MY_COLUMN, '0', NULL) IS NULL
AND MY_COLUMN IS NOT NULL;
This would also work in Oracle (but not in SQL Server):
REPLACE(column_name, '0') IS NULL
This will work in Oracle (and perhaps also in SQL Server, you will have to check):
LTRIM(column_name, '0') IS NULL
Alternatively, since it is a VARCHAR(6) column, you could also just check:
column_name IN ('0', '00', '000', '0000', '00000', '000000')
This is not pretty but it is probably the most efficient if there is an index on the column.
Building off KM's answer, you can do the same thing in Oracle without needing to create an actual table.
SELECT y.*
FROM YourTable y
WHERE YourColumn IN
(SELECT LPAD('0',level,'0') FROM dual CONNECT BY LEVEL <= 6)
or
SELECT y.*
FROM YourTable y
INNER JOIN
(SELECT LPAD('0',level,'0') zeros FROM dual CONNECT BY LEVEL <= 6) z
ON y.YourColumn = z.zeros
I think this is the most flexible answer because if the maximum length of the column changes, you just need to change 6 to the new length.
How about using a regular expression (supported by Oracle, and I think also by MSSQL)?
Another SQL version would be:
...
where len(COLUMN_NAME) > 0
and len(replace(COLUMN_NAME, '0', '')) = 0
i.e., where there is at least one character in the column, and all of them are 0. Toss in TRIM if there can be leading, trailing, or embedded spaces.
Try this, which should be able to use an index on YourTable.COLUMN_NAME if it exists:
--SQL Server syntax, but should be similar in Oracle
--you could make this a temp or permanent table
CREATE TABLE Zeros (Zero varchar(6))
INSERT INTO Zeros VALUES ('0')
INSERT INTO Zeros VALUES ('00')
INSERT INTO Zeros VALUES ('000')
INSERT INTO Zeros VALUES ('0000')
INSERT INTO Zeros VALUES ('00000')
INSERT INTO Zeros VALUES ('000000')
SELECT
y.*
FROM YourTable y
INNER JOIN Zeros z On y.COLUMN_NAME=z.Zero
EDIT
or even just this:
SELECT
*
FROM YourTable
WHERE COLUMN_NAME IN ('0','00','000','0000','00000','000000')
building off of Dave Costa's answer:
Oracle:
SELECT
*
FROM YourTable
WHERE YourColumn IN
(SELECT LPAD('0',level,'0') FROM dual CONNECT BY LEVEL <= 6)
SQL Server 2005 and up:
;WITH Zeros AS
(SELECT
CONVERT(varchar(6),'0') AS Zero
UNION ALL
SELECT '0'+CONVERT(varchar(5),Zero)
FROM Zeros
WHERE LEN(CONVERT(varchar(6),Zero))<6
)
SELECT
y.*
FROM YourTable y
WHERE y.COLUMN_NAME IN (SELECT Zero FROM Zeros)