DataStage 11.5 CFF stage throwing exception: APT_BadAlloc: Heap allocation failed - error-handling

Can you please help me solve this? I am using DataStage 11.5, and in the CFF stage of one of my jobs I am getting an allocation failed error that aborts the job whenever a large CFF file comes in.
My job simply converts a CFF file into a text file.
Errors in the job log show:
Message: main_program: Current heap size: 2,072,104,336 bytes in 4,525,666 blocks
Message: main_program: Fatal Error: Throwing exception: APT_BadAlloc: Heap allocation failed. [error_handling/exception.C:132]

From https://www.ibm.com/support/pages/datastage-cff-stage-job-fails-message-aptbadalloc-heap-allocation-failed: the Complex Flat File (CFF) stage is a composite operator and inserts a Promote Sub-Record operator for every subrecord. Too many of these operators can exhaust the available heap. To diagnose the problem further, set the environment variable APT_DUMP_SCORE=True and verify the score dump in the job log to see whether the job is creating too many Promote Sub-Record operators, which could be exhausting the available heap. To improve performance and reduce memory usage, the table definition should be optimized further.
Resolving the problem
Here is what you can do to reduce the number of Promote Sub-Record operators:
1. Save the table definition from the CFF stage in the job.
2. Clear all the columns in the CFF stage.
3. Reload the table definition saved in step 1, checking the 'Remove group columns' check box. This step removes the additional group columns.
4. Check the layout; it should have the same record length as the original job. After reloading the table, the table structure will be flat (no more hierarchy).
After the above steps, the OSH script generated from the Complex Flat File stage will no longer contain Promote Sub-Record operators, performance will improve, and memory usage will be reduced to a minimum.

Related

Error message: "PostgreSQL said: could not write block 119518 of temporary file: No space left on device" PostgreSQL

I have a query that, intuitively, should work just fine. But, almost immediately after executing, I am served with this error message:
ERROR: could not write block 119518 of temporary file: No space left on device
Query failed
PostgreSQL said: could not write block 119518 of temporary file: No space left on device
I have approximately 4.5GB in free storage space.
I've tried whittling down the memory usage by replacing each of the CTEs with materialized views, in the hope that it would reduce the need for processing power.
Additionally, I've taken these steps:
--I boosted our AWS instance to the memory-optimized db.r4.16xlarge.
--I ran analyze verbose and vacuum full analyze.
--I've stopped all other processes.
The query does some small processing in two CTEs and then joins a table (roughly 20M rows) with a smaller lookup table (roughly 500K rows).
Just in case someone else runs into this, I found out the answer, and it was really simple: just go into the RDS admin panel and increase the allocated storage.
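If anyone wants to confirm beforehand that temporary-file spill (rather than RAM) is what is eating the space, the cumulative counters in pg_stat_database give a quick, rough check, for example:
-- Cumulative temp-file usage per database since the last statistics reset.
SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_spill
FROM pg_stat_database
ORDER BY temp_bytes DESC;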

Error in Apache Drill when doing JOIN between two large tables

I'm trying to do a JOIN between two tables, one with 1,250,910,444 records and the other with 385,377,113 records, using Apache Drill.
However, after 2 minutes of execution, it gives the following error:
org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. Failure allocating buffer.
Fragment 1:2 [Error Id: 51b70ce1-29d5-459b-b974-8682cec41961 on sbsb35.ipea.gov.br:31010]
(org.apache.drill.exec.exception.OutOfMemoryException) Failure allocating buffer. io.netty.buffer.PooledByteBufAllocatorL.allocate():64 org.apache.drill.exec.memory.AllocationManager.():80
org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation():243
org.apache.drill.exec.memory.BaseAllocator.buffer():225 org.apache.drill.exec.memory.BaseAllocator.buffer():195
org.apache.drill.exec.vector.VarCharVector.allocateNew():394 org.apache.drill.exec.vector.NullableVarCharVector.allocateNew():239
org.apache.drill.exec.test.generated.HashTableGen1800$BatchHolder.():137 org.apache.drill.exec.test.generated.HashTableGen1800.newBatchHolder():697
org.apache.drill.exec.test.generated.HashTableGen1800.addBatchHolder():690 org.apache.drill.exec.test.generated.HashTableGen1800.addBatchIfNeeded():679
org.apache.drill.exec.test.generated.HashTableGen1800.put():610 org.apache.drill.exec.test.generated.HashTableGen1800.put():549
org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():366
org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():222
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
...
java.lang.Thread.run():748
Drill configuration information I'm using: planner.memory_limit = 268435456
The server I'm using has 512GB of memory.
Could someone suggest how to solve this problem? Would creating an index for each table be a solution? If so, how do I do this in Drill?
Currently Apache Drill does not support indexing.
Your query fails during the execution stage, so planner.memory_limit won't have any effect.
Currently all you can do is allocate more memory:
make sure you have enough direct memory allocated in drill-env.sh;
use the planner.memory.max_query_memory_per_node option.
There is ongoing work in the community to allow spill to disk for the Hash Join
but it's still in progress (https://issues.apache.org/jira/browse/DRILL-6027).
Try setting planner.memory.max_query_memory_per_node as high as possible:
ALTER SESSION SET `planner.memory.max_query_memory_per_node` = (some value in bytes);
Make sure the parameter is set for the session that runs the query.
Also check DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY in drill-env.sh.
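For example, a rough sketch (the 10 GB figure is purely illustrative, size it to your nodes and workload; direct memory itself is raised in drill-env.sh, not through SQL):
-- See the planner memory options currently in effect.
SELECT * FROM sys.options WHERE name LIKE 'planner.memory%';
-- Raise the per-node query memory for this session only; the value is in bytes
-- (10737418240 bytes = 10 GB, an illustrative figure).
ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 10737418240;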

Is there size limit on appending ORC data files to Vora tables

I created a Vora table in Vora 1.3 and tried to append data to that table from ORC files that I got from the SAP BW archiving process (NLS on Hadoop). I had 20 files, containing approximately 50 million records in total.
When I tried to use the "files" setting in the APPEND statement as "/path/*", after approx 1 hour Vora returned this error message:
com.sap.spark.vora.client.VoraClientException: Could not load table F002_5F: [Vora [eba156.extendtec.com.au:42681.1640438]] java.lang.RuntimeException: Wrong magic number in response, expected: 0x56320170, actual: 0x00000000. An unsuccessful attempt to load a table might lead to an inconsistent table state. Please drop the table and re-create it if necessary. with error code 0, status ERROR_STATUS
The next thing I tried was appending data from each file using separate APPEND statements. On the 15th append (of 20) I got the same error message.
The error indicates that the Vora engine on node eba156.extendtec.com.au is not available. I suspect it either crashed or ran into an out-of-memory situation.
You can check the log directory for a crash dump. If you find one, please open a customer message for further investigation.
If you do not find a crash dump, it is likely an out-of-memory situation. You should find confirmation in either the engine log file or in /var/log/messages (if the OOM killer ended the process). In that case, the available memory is not sufficient to load the data.

Why will my SQL Transaction log file not auto-grow?

The Issue
I've been running a particularly large query, generating millions of records to be inserted into a table. Each time I run the query I get an error reporting that the transaction log file is full.
I've managed to get a test query to run with a reduced set of results and by using SELECT INTO instead of INSERT INTO a pre-built table. This reduced set of results generated a 20 GB table of 838,978,560 rows.
When trying to INSERT into the pre-built table I've also tried it with and without a clustered index. Both failed.
Server Settings
The server is running SQL Server 2005 (Full not Express).
The database being used is set to SIMPLE recovery, and there is space available (around 100 GB) on the drive the file sits on.
The transaction log file is set to grow by 250 MB, up to a maximum of 2,097,152 MB.
The log file appears to grow as expected until it reaches 4,729 MB.
When the issue first appeared the file grew to a lower value; however, I've since reduced the size of other log files on the same server, and this appears to allow the transaction log file to grow further by the same amount as the reduction in the other files.
I've now run out of ideas on how to solve this. If anyone has any suggestions or insight into what to do, it would be much appreciated.
First, you want to avoid auto-growth whenever possible; auto-growth events are HUGE performance killers. If you have 100 GB available, why not change the log file size to something like 20 GB (just temporarily, while you troubleshoot this)? My policy has always been to use 90%+ of the disk space allocated for a specific MDF/NDF/LDF file. There's no reason not to.
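Something along these lines would pre-size it (MyDb and MyDb_log are placeholder names, not from your setup; look up the actual logical log file name in sys.database_files first):
USE [master];
-- Pre-size the log once so it does not have to auto-grow mid-load (20 GB here).
ALTER DATABASE [MyDb] MODIFY FILE (NAME = N'MyDb_log', SIZE = 20480MB);
-- To find the logical file name:
-- SELECT name, type_desc, size FROM [MyDb].sys.database_files;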
If you are using SIMPLE recovery, SQL Server is supposed to manage the task of returning unused log space, but sometimes it does not do a great job. Before running your query, check the available free log space. You can do this by:
right-clicking the DB > Tasks > Shrink > Files
changing the type to "Log"
This will help you understand how much unused space you have. You can set "Reorganize pages before releasing unused space > Shrink File" to 0. Moving forward, you can also release unused space using CHECKPOINT (under SIMPLE recovery a checkpoint marks the inactive portion of the log as reusable); this may be something to include as a first step before your query runs.
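If you would rather do those checks from T-SQL instead of the GUI, something like this works (the logical file name MyDb_log and the 4,096 MB shrink target are illustrative only):
-- Show each database's log size and percentage used.
DBCC SQLPERF(LOGSPACE);
-- Under SIMPLE recovery, a checkpoint marks the inactive portion of the log as reusable.
CHECKPOINT;
-- Optional: shrink the log file back down afterwards (target size is in MB).
DBCC SHRINKFILE (N'MyDb_log', 4096);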

Error while processing the cube: There is not enough space on the disk

I am getting the following error while processing an SSAS cube using a PowerShell script. The error is "There is not enough space on the disk." However, there is 2 TB allocated on the server and the estimated cube size is not more than 8 GB. Can someone please advise me on why I am getting this error and how to resolve it?
The following system error occurred from a call to GetOverlappedResult for Physical file: '\?\K:\OLAP\Data\xxx.0.db\xyz.0.cub\Factx.0.det\Factx.0.prt\131.fact.data', Logical file: '' : There is not enough space on the disk. .
Errors in the OLAP storage engine: An error occurred while processing the 'Factx' partition of the 'Factx' measure group for the 'Factx' cube from the xxx database.
Server: The current operation was cancelled because another operation in the transaction failed.
thanks
Check the setting for TempDir under the advanced settings on your SSAS instance. By default this points to the C:\ drive, and if your cube is heavy on aggregations they can be written out to TempDir as the cube processes and can fill it up pretty quickly. Change TempDir to a folder on your K:\ drive or similar.