SAP HANA | BULK INSERT in SAP HANA

We have SAP HANA 1.0 SP11. We have a requirement to calculate the current stock at store and material level on a daily basis. The number of rows expected is around 250 million.
Currently we use a procedure for this. The flow of the procedure is as follows:
begin
    DECLARE v_cnt BIGINT;
    DECLARE v_loop BIGINT;
    DECLARE v_offset BIGINT := 0;  -- running offset for each chunk
    t_rst = select * from <LOGIC of deriving current stock on tables MARD, MARC, MBEW>;
    select count(*) into v_cnt from :t_rst;
    v_loop := :v_cnt / 2500000;  -- number of 2.5M-row chunks
    FOR x IN 0 .. :v_loop DO
        INSERT INTO CRRENT_STOCK_TABLE
        SELECT * FROM :t_rst LIMIT 2500000 OFFSET :v_offset;
        COMMIT;
        v_offset := :v_offset + 2500000;
    END FOR;
end;
The row count of the result set t_rst is around 250 million.
The total execution time of the procedure is around 2.5 hours. A few times the procedure has gone into a long-running state, resulting in an error. We run this procedure during non-peak business hours, so the load on the system is almost nothing.
Is there a way we can load the data into the target table in parallel threads and reduce the loading time? Also, is there a way to bulk insert efficiently in HANA?
The query for t_rst fetches the first 1000 rows in 5 minutes.

As Lars mentioned, the total resource usage will not change much.
But if you have limited time (non-peak hours) and the system configuration can meet the requirements of parallel execution, maybe you can try using
BEGIN PARALLEL EXECUTION
<stmt>
END;
Please refer to the reference documentation.
After you calculate the v_loop value, you know how many times you have to run the following INSERT command:
INSERT INTO CRRENT_STOCK_TABLE
SELECT * FROM :t_rst LIMIT 2500000 OFFSET :v_offset;
I'm not sure how to convert the above code into a dynamic calculation for PARALLEL EXECUTION.
But you can assume, say, 10 parallel processes and run that many INSERT commands, modifying the OFFSET clause according to the calculated values; see the sketch below.
The ones whose offset exceeds the row count will simply insert zero rows, which will not harm the overall process.
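Written out by hand for 10 slices of 25 million rows each, such a block would look like the following. Note that, per the restrictions quoted below, every statement here modifies the same target table, so HANA will not actually accept this exact pattern; treat it as a sketch of the shape of the idea, not working code:
-- illustrative only: ten fixed 25M-row slices of :t_rst
BEGIN PARALLEL EXECUTION
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 0;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 25000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 50000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 75000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 100000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 125000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 150000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 175000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 200000000;
    INSERT INTO CRRENT_STOCK_TABLE SELECT * FROM :t_rst LIMIT 25000000 OFFSET 225000000;
END;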
As a response to #LarsBr.: as he mentioned, there are limitations that will prevent parallel execution.
Restrictions and Limitations
The following restrictions apply:
Modification of tables with a foreign key or triggers is not allowed
Updating the same table in different statements is not allowed
Only concurrent reads on one table are allowed. Implicit SELECT and SELECT INTO scalar variable statements are supported.
Calling procedures containing dynamic SQL (for example, EXEC, EXECUTE IMMEDIATE) is not supported in parallel blocks
Mixing read-only procedure calls and read-write procedure calls in a parallel block is not allowed.
These limitations mean that inserting into the same table from different statements is not possible, and dynamic SQL cannot be used either.

Related

Returning the Number of Records after a Delete/Insert Operation in Oracle

I want to return the number of records before and after an operation performed in a stored procedure. I looked up a function that should have worked for returning the number of rows in a table, but it isn't working. Any help?
Similar: Please check this link on DBA Stack Exchange
The procedure consists only of dynamic SQL (EXECUTE IMMEDIATE commands). The code is too large to paste here (and confidential).
The real motive is that I want to know how many records a table contained before the insert/delete command (in an EXECUTE IMMEDIATE) and how many records it contained after the insert/delete operation.
I want to store the logs of the procedure in another table (a kind of log table) which keeps track of the number of rows inserted into/deleted from the table being operated on.
e.g.
PROCEDURE_NAME          OP_TYPE                       RUN_DATE   RECORDS_BEFORE   RECORDS_AFTER
Name of the procedure   Type of operation performed              1103929          1112982
The procedure body.
create or replace procedure vector as
begin
    -- select count(*) from some_table
    execute immediate 'delete from some_table
                       where trunc(creation_date) >= trunc(sysdate) - 7';
    execute immediate 'insert into log_table values
                       (''Procedure Name'', ''Delete'', sysdate, ''....'')';
    -- select count(*) from some_table
    execute immediate 'insert into some_table ....';
    execute immediate 'insert into log_table values
                       (''Procedure Name'', ''Insert'', sysdate, ''....'')';
    -- select count(*) from some_table
end vector;
Basic requirement: I want the count(*) of some_table to be inserted into the log_table.
What data exactly do you want to get?
If it is the number of rows affected by your command, it is in SQL%ROWCOUNT after each individual command you execute. (It will not "sum" all the modifications in the procedure; if that is what you need, you'll have to sum it manually after each insert/delete/update.)
But if you want the total number of rows in the table, you should run a
SELECT count(*) FROM TABNAME
before and after the command you executed (with the performance hit that implies).
You can also combine the two: run a count(*) at the beginning of your procedure, use SQL%ROWCOUNT to count the rows you modified, and assume the table now has count(*) minus the rowcount of the deletes.
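Putting that together for the procedure in the question, a minimal sketch might look like this (the log_table column names are assumed from the example layout above; adjust them to the real table definition):
create or replace procedure vector as
    v_before  pls_integer;
    v_deleted pls_integer;
begin
    -- total rows before the operation (see the isolation caveats below)
    select count(*) into v_before from some_table;
    execute immediate 'delete from some_table
                       where trunc(creation_date) >= trunc(sysdate) - 7';
    v_deleted := sql%rowcount;  -- rows affected by the statement just executed
    insert into log_table
        (procedure_name, op_type, run_date, records_before, records_after)
    values
        ('VECTOR', 'Delete', sysdate, v_before, v_before - v_deleted);
    commit;
end vector;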
Do remember that Oracle by default shows you the number of records in the table at the time the count(*) query is executed (after the current transaction's commands have run), so the changes you see without using the rowcount might include concurrent changes. For more information, read about Oracle isolation levels: http://www.oracle.com/technetwork/issue-archive/2005/05-nov/o65asktom-082389.html
In addition, there might be a concurrent change between the time you ran the count(*) query and the delete/update clause, so think about the scenarios that might occur in your specific case.
If you want a more detailed review, update the question with the relevant part of the procedure/queries you execute.

Table-Valued Parameters with an Estimated Number of Rows of 1

I have been searching the internet for hours trying to figure out how to improve the performance of my query using table-valued parameters (TVP).
After hours of searching, I finally determined what I believe is the root of the problem. Upon examining the Estimated Execution plan of my query, I discovered that the estimated number of rows for my query is 1 anytime I use a TVP. If I exchange the TVP for a query that selects the data I am interested in, then the estimated number of rows is much more accurate at around 7400. This significantly increases the performance.
However, in the real scenario, I cannot use a query, I must use a TVP. Is there any way to have SQL Server more accurately predict the number of rows when using a TVP so that a more appropriate plan will be used?
TVPs are table variables, which don't maintain statistics and hence are reported as having only 1 row. There are two ways to improve statistics on TVPs:
If you have no need to modify any of the values in the TVP or add columns to it to track operational data, then you can add a simple, statement-level OPTION (RECOMPILE) to any query that uses a table variable (TVP or locally created) and does more with that table variable than a simple SELECT (i.e. a plain INSERT INTO RealTable (columns) SELECT (columns) FROM @TVP; does not need the statement-level recompile). Do the following test in SSMS to see this behavior in action:
DECLARE @TableVariable TABLE (Col1 INT NOT NULL);

INSERT INTO @TableVariable (Col1)
SELECT so.[object_id]
FROM [master].[sys].[objects] so;

-- Ctrl+M to turn on "Include Actual Execution Plan"

SELECT * FROM @TableVariable; -- Estimated Number of Rows = 1 (incorrect)

SELECT * FROM @TableVariable
OPTION (RECOMPILE); -- Estimated Number of Rows = 91 (correct)

SELECT * FROM @TableVariable; -- Estimated Number of Rows = 1 (back to incorrect)
Create a local temporary table (single #) and copy the TVP data into it. While this does duplicate the data in tempdb, the benefits (see the sketch after this list) are:
better statistics on a temp table as opposed to a table variable (i.e. no need for statement-level recompiles)
ability to add columns
ability to modify values
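A minimal sketch of this approach, assuming a TVP parameter named @Tvp with the same single INT column as the test above:
-- copy the TVP into a local temp table so the optimizer has real statistics
CREATE TABLE #TvpData (Col1 INT NOT NULL);

INSERT INTO #TvpData (Col1)
SELECT Col1 FROM @Tvp;

-- queries against #TvpData now get row estimates from tempdb statistics,
-- with no statement-level OPTION (RECOMPILE) required
SELECT * FROM #TvpData;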

Generate a block of serial numbers from SQL Server while handling concurrent connections?

We have an application that has to generate unique sequential serial numbers and store the range in a database. We just need to log the range of numbers; we store the actual ones elsewhere.
Basically, how it's set up now is we have a simple table with three columns: SequenceStartNumber, SequenceEndNumber, and AllocatedFor.
So for example, the rows may look like so:
SequenceStartNumber   SequenceEndNumber   AllocatedFor
1                     1000                CustomerXYZ
1001                  2000                CustomerZZY
A piece of code will do a query with
SELECT MAX(SequenceEndNumber) + 1 AS FirstNumber
FROM SequenceNumberAllocation
The code takes the result of this query, adds on however many serial numbers it knows it needs, and performs an insert with
INSERT INTO SequenceNumberAllocation (SequenceStartNumber, SequenceEndNumber, AllocatedFor)
VALUES (%d, %d, 'CustomerABC')
This way we have a running list of these blocks of numbers and who's using them.
This works fine as-is, except it's apparent this method cannot account for concurrency. Hypothetically (it hasn't happened yet), two simultaneous processes could perform the first query at the same time and grab the same starting number.
What would be the best way to refactor this to lock the table until the insert is done? Should this operation be turned into a stored procedure somehow? SQL is not my strongest suit, so I have to ask what might be a rudimentary question to the rest of you. Thanks for your time.
BEGIN TRANSACTION;
INSERT INTO dbo.SequenceNumberAllocation
(
SequenceStartNumber,
SequenceEndNumber,
AllocatedFor
)
SELECT
MAX(SequenceEndNumber) + 1,
MAX(SequenceEndNumber) + @HoweverManyNumbersYouNeed,
'Customer ABC'
FROM dbo.SequenceNumberAllocation WITH (HOLDLOCK);
COMMIT TRANSACTION;
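One detail worth noting: MAX() returns NULL on an empty table, so the very first allocation would insert NULLs. A variant that guards against this with ISNULL, and also takes UPDLOCK to reduce deadlock risk (the variable and its value are illustrative):
DECLARE @HoweverManyNumbersYouNeed INT = 1000;  -- illustrative block size

BEGIN TRANSACTION;
INSERT INTO dbo.SequenceNumberAllocation
    (SequenceStartNumber, SequenceEndNumber, AllocatedFor)
SELECT
    ISNULL(MAX(SequenceEndNumber), 0) + 1,  -- start at 1 on an empty table
    ISNULL(MAX(SequenceEndNumber), 0) + @HoweverManyNumbersYouNeed,
    'Customer ABC'
FROM dbo.SequenceNumberAllocation WITH (UPDLOCK, HOLDLOCK);
COMMIT TRANSACTION;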

Working of WHERE condition in SAS PROC SQL while connecting to another database

I am working on a table with more than 30 million records. The table is on Sybase and I am working in SAS. There is a feed_key (numeric) variable which contains the timestamp for the record entry. I want to pull records for a particular time frame.
proc sql ;
Connect To Sybase (user="id" pass="password" server=concho);
create table table1 as
select * from connection to sybase
(
select a.feed_key as feed_key,
a.cm15,
a.country_cd,
a.se10,
convert(char(10),a.se10) as se_num,
a.trans_dt,
a.appr_deny_cd,
a.approval_cd,
a.amount
from abc.xyz a
where a.country_cd in ('ABC') and a.appr_deny_cd in ('0','1','6') and a.approval_cd not in ('123456') and feed_key > 12862298
);
disconnect from sybase;
quit;
It pulls the same number of records irrespective of whether I include the feed_key condition or not, and it takes almost the same time to execute the query (16 minutes without the feed_key condition and 15 minutes with it).
Please clarify how the WHERE clause works in this case,
as I believe the feed_key condition should have made the query run much faster, since more than 80% of the records do not match this condition...
If you're getting the same number of records back, it'll take the same amount of time to process the query.
This is because the I/O (transferring data back to SAS and storing it) is the most time-consuming part of the operation. This is also why the lack of an index doesn't make a big impact on the total time.
If you adjust your query so that it returns fewer rows, you will get faster processing.
You can tell when this is the case by looking at the SAS log, which shows how much time was used by the CPU (the rest is I/O):
NOTE: PROCEDURE SQL used (Total process time):
real time 11.07 seconds
cpu time 1.67 seconds

SQL Server - Simultaneous Inserts to the table from multiple clients - Check Limit and Block

We recently faced an issue with simultaneous inserts into one of our SQL Server tables from multiple clients. I hope you guys can help us through it.
We use a stored procedure to do the transactions. In that stored procedure, for each transaction, we calculate the total sales so far. If the total sales are less than the set limit,
then the transaction is allowed. Otherwise, the transaction is denied.
It works fine most of the time. But sometimes, when multiple clients try to do the transaction at exactly the same time, the limit check fails and both transactions get through.
Can you suggest how we can effectively enforce the limit all the time? Is there a better way to do that?
Thanks!
I don't think it is possible to do this declaratively.
If all inserts are guaranteed to go through the stored procedure and the SaleValue is not updated once inserted, then the following should work (I made up the table and column names, as these were not supplied in the initial question):
DECLARE @SumSaleValue MONEY;

BEGIN TRAN
    SELECT @SumSaleValue = SUM(SaleValue)
    FROM dbo.Orders WITH (UPDLOCK, HOLDLOCK)
    WHERE TransactionId = @TransactionId;

    IF @SumSaleValue > 1000
    BEGIN
        RAISERROR('Cannot do insert as total would exceed order limit', 16, 1);
        ROLLBACK;
        RETURN;
    END

    /* Code for INSERT goes here */
COMMIT
The HOLDLOCK gives serializable semantics and locks the entire range matching the TransactionId, and the UPDLOCK prevents two concurrent transactions from locking the same range, thus reducing the risk of deadlocks.
An index on TransactionId, SaleValue would be best to support this query; a sketch follows.
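For example (the index name is arbitrary; the table and column names follow the example above):
-- seek on TransactionId, with SaleValue included to cover the SUM
CREATE NONCLUSTERED INDEX IX_Orders_TransactionId_SaleValue
    ON dbo.Orders (TransactionId)
    INCLUDE (SaleValue);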