Does Neo4j calculate JVM heap on Ubuntu?

In the neo4j-wrapper.conf file I see this:
# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
#wrapper.java.initmemory=512
#wrapper.java.maxmemory=512
Does that mean that I should not worry about -Xms and -Xmx?
I've read elsewhere that -XX:ParallelGCThreads=4 -XX:+UseNUMA -XX:+UseConcMarkSweepGC would be good.
Should I add those flags on my machine (Intel Core i7-4770 quad-core Haswell, 32 GB DDR3 RAM, 2 x 240 GB 6 Gb/s SSDs in software RAID 1)?

I would still configure it manually.
Set both to 12 GB and use the remaining 16 GB for memory mapping in neo4j.properties. Try to match the mapped-memory settings to your store file sizes.
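A sketch of what that could look like, assuming the pre-2.2 layout implied by neo4j-wrapper.conf; the mapped-memory split below is purely illustrative and should be adjusted to match your actual store file sizes on disk:

```
# neo4j-wrapper.conf -- fixed 12 GB heap instead of the auto-calculated size
wrapper.java.initmemory=12288
wrapper.java.maxmemory=12288

# neo4j.properties -- example split of the remaining RAM across the store
# files (illustrative values; size each one to its store file)
neostore.nodestore.db.mapped_memory=2G
neostore.relationshipstore.db.mapped_memory=8G
neostore.propertystore.db.mapped_memory=2G
neostore.propertystore.db.strings.mapped_memory=2G
neostore.propertystore.db.arrays.mapped_memory=500M
```

Setting initmemory equal to maxmemory avoids heap resizing pauses, which is the usual practice for server workloads.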

Related

How vCores and Memory get allocated from Spark Pool

I have the below Spark pool config. Nodes: 3 to 10.
Spark job config:
After seeing the allocation below, it looks like the job is using all 10 nodes from the pool: 10 x 8 vCores = 80 vCores; 10 x 64 GB = 640 GB.
BUT I have set the number of executors to min 4 / max 6. So shouldn't it go to at most 6 x 8 vCores and 6 x 64 GB? Please correct me if I am missing something here.
You are getting confused between the Spark pool's allocated vCores/memory and the Spark job's executor size, which are two different things.
You have created a ContractsMed Spark pool with a maximum of 10 nodes, each node sized at 8 vCores and 64 GB of memory. That is why the last snippet you shared shows the Spark pool's allocated vCores and memory rather than the Spark job's details: 80 vCores and 640 GB is the Spark pool size, not the Spark job's.
Now coming to the Spark job configuration, where you are using the ContractsMed Spark pool: since you have configured a maximum of 6 executors with 8 vCores and 56 GB of memory each, those resources, i.e. 6 x 8 = 48 vCores and 6 x 56 = 336 GB of memory, will be fetched from the Spark pool and used by the job.
The remaining resources (80 - 48 = 32 vCores and 640 - 336 = 304 GB of memory) in the Spark pool will stay unused and are available to any other Spark job.
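The pool-versus-job arithmetic above can be sketched in a few lines (the node and executor sizes are the ones stated in the question and answer):

```python
# Pool: 10 nodes x (8 vCores, 64 GB); job: up to 6 executors x (8 vCores, 56 GB).
pool_nodes, node_vcores, node_mem_gb = 10, 8, 64
max_executors, exec_vcores, exec_mem_gb = 6, 8, 56

pool_vcores = pool_nodes * node_vcores      # 80 vCores -- the pool size
pool_mem_gb = pool_nodes * node_mem_gb      # 640 GB    -- the pool size
job_vcores = max_executors * exec_vcores    # 48 vCores -- what the job takes
job_mem_gb = max_executors * exec_mem_gb    # 336 GB    -- what the job takes

print(pool_vcores - job_vcores)  # 32 vCores left for other jobs
print(pool_mem_gb - job_mem_gb)  # 304 GB left for other jobs
```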

Vertex failed: out-of-memory error in Azure HDInsight Hive

I am experiencing an out-of-memory issue while joining 2 datasets; one contains 39M rows, the other contains 360K rows.
I have 2 worker nodes; each worker node has a maximum memory of 125 GB.
In YARN, memory allocated for all YARN containers on a node = 96 GB
Minimum container size (memory) = 3072 MB
In Hive settings:
hive.tez.java.opts=-Xmx2728M -Xms2728M -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB
hive.tez.container.size=3410
What values should I set to get rid of the out-of-memory issue?
I solved it by increasing the YARN memory allocation:
Minimum container size (memory): from 3072 to 3840 MB
Memory allocated for all YARN containers on a node: from 96 to 120 GB (each node had 120 GB)
Percentage of physical CPU allocated for all containers on a node: 80%
Number of virtual cores: 8
https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-hive-out-of-memory-error-oom
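The linked HDInsight doc describes a rule of thumb for pairing these two settings: the Tez task heap (-Xmx in hive.tez.java.opts) should be roughly 80% of hive.tez.container.size. A quick sketch with the increased container size from the answer:

```python
# Rule of thumb: Tez task Xmx ~ 80% of hive.tez.container.size (both in MB),
# leaving headroom for off-heap usage inside the container.
container_size_mb = 3840  # raised from 3410 along with the YARN minimum
xmx_mb = int(container_size_mb * 0.8)

print(f"hive.tez.container.size={container_size_mb}")
print(f"hive.tez.java.opts=-Xmx{xmx_mb}M")  # -Xmx3072M
```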

Loading a large set of images kills the process

Loading 1500 images of size (1000, 1000, 3) breaks the code and throws "kill 9" without any further error. Memory used before this line of code is 16% of total system memory. The total size of the images directory is 7.1 GB.
X = np.asarray(images).astype('float64')
y = np.asarray(labels).astype('float64')
System spec:
OS: macOS Catalina
Processor: 2.2 GHz 6-core Intel Core i7
Memory: 16 GB 2400 MHz DDR4
Update:
Getting the below error while running the code on 32 vCPUs and 120 GB of memory:
MemoryError: Unable to allocate 14.1 GiB for an array with shape (1200, 1024, 1024, 3) and data type float32
You would have to provide some more info/details for an exact answer, but assuming this is a memory error (incredibly likely): the size of the images on disk does not represent the size they occupy in memory, so the 7.1 GB figure is not very relevant. Decoded images always occupy far more space in memory than the compressed files, especially once cast to float64 (8 bytes per value): 1500 x 1000 x 1000 x 3 x 8 bytes is roughly 33.5 GiB. Intuitively, 16 GB of RAM is nowhere near enough to load this dataset at once; from experience I would say you'd need to bump it up to something like 64 GB, or avoid materializing everything at once. If you are using Keras, I would suggest looking into the DirectoryIterator.
Edit:
As Cris Luengo pointed out, I missed the fact that you stated the size of the images.
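The memory math is easy to check: the in-memory size of the array is count x height x width x channels x bytes-per-element, independent of the (compressed) size on disk. This also reproduces the 14.1 GiB figure in the MemoryError:

```python
import numpy as np

def array_gib(shape, dtype):
    """In-memory size in GiB of a dense array of the given shape and dtype."""
    return int(np.prod(shape)) * np.dtype(dtype).itemsize / 2**30

print(array_gib((1500, 1000, 1000, 3), 'float64'))  # ~33.5 GiB -- far over 16 GB of RAM
print(array_gib((1200, 1024, 1024, 3), 'float32'))  # ~14.1 GiB -- matches the MemoryError
print(array_gib((1500, 1000, 1000, 3), 'uint8'))    # ~4.2 GiB  -- fits if you defer the cast
```

Keeping the pixels as uint8 and casting per-batch (or using a generator such as Keras's DirectoryIterator) avoids allocating the full float array at all.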

Physical CPU in AIX

Can someone let me know why the number of physical CPUs is greater than the number of virtual CPUs in AIX?
Online Virtual CPUs : 8,
Active Physical CPUs in system : 48,
Desired Virtual CPUs : 8
Partition Number : 30
Type : Shared-SMT-4
Mode : Uncapped
Entitled Capacity : 0.80
Partition Group-ID : 32798
Shared Pool ID : 0
**Online Virtual CPUs : 8**
Maximum Virtual CPUs : 160
Minimum Virtual CPUs : 1
Online Memory : 84992 MB
Maximum Memory : 127488 MB
Minimum Memory : 256 MB
Variable Capacity Weight : 128
Minimum Capacity : 0.10
Maximum Capacity : 16.00
Capacity Increment : 0.01
Maximum Physical CPUs in system : 48
**Active Physical CPUs in system : 48**
Active CPUs in Pool : 48
Shared Physical CPUs in system : 48
Maximum Capacity of Pool : 4800
Entitled Capacity of Pool : 1190
Unallocated Capacity : 0.00
Physical CPU Percentage : 10.00%
Unallocated Weight : 0
Memory Mode : Dedicated
Total I/O Memory Entitlement : -
Variable Memory Capacity Weight : -
Memory Pool ID : -
Physical Memory in the Pool : -
Hypervisor Page Size : -
Unallocated Variable Memory Capacity Weight: -
Unallocated I/O Memory entitlement : -
Memory Group ID of LPAR : -
**Desired Virtual CPUs : 8**
Desired Memory : 84992 MB
Desired Variable Capacity Weight : 128
Desired Capacity : 0.80
Target Memory Expansion Factor : -
Target Memory Expansion Size : -
Power Saving Mode : Disabled
Sub Processor Mode : -
Your "Entitled Capacity" is 0.80, and each fraction of a single processor equals 0.1 of one physical processor, so you get 8 virtual processors. Here you can find more information about this:
What is the capacity entitlement?
Physical processors are presented to a logical partition's operating
system as virtual processors. Physical processors are virtualized
into portions or fractions; each fraction of a single processor equals
0.1 of one processor, with an additional fraction of 0.01. The number
of cores assigned to a partition is represented by the capacity
entitlement. To display the assigned capacity entitlement for a shared
partition, use the command
# lparstat | awk -F "ent=" '/ent\=/ {print $NF}'
The output will be the number of processors this partition is
entitled to use. This is the upper threshold the partition can have
from the processor pool (capped mode). The partition can use more than
the assigned capacity entitlement (uncapped mode). Capped and uncapped
modes are illustrated in detail later in this document. The number
of virtual processors and processing units that are assigned to a
partition can be changed through the HMC.
Capacity Entitlement considerations:
Capacity entitlement should be correctly configured for normal
production operation and to cover workload during peak time. Having
enough capacity entitlement is important so as not to impact operating
system performance and processor affinity. Running over entitled
capacity can cause bad affinity and noticeable performance degradation
affecting business operation.
Virtual Processors:
A virtual processor is a representation of a physical processor core
to the operating system of a partition that uses shared processors. It
is the number of physical processors that the logical partition can
spread out across. It represents the upper threshold for the number of
physical processors that can be used. We recommend not to increase the
ratio between the virtual processors to entitled capacity to more than
1.6 Each partition has its own assigned virtual processors. The partition will work only on the virtual processors needed for its
workload. The unneeded virtual processors assigned to a partition will
fold away using processor folding feature. To display the current
assigned virtual processors use the command # lparstat -i | grep -i
"Desired Virtual CPUs" Using an HMC, you can change the number of
virtual processors and processing units that are assigned to the
partition.
The "Active Physical CPUs in system" value is the number of physical CPUs installed on the Power machine hosting this LPAR, while "Online Virtual CPUs" is the number of virtual CPUs allocated to this particular LPAR.
Also, "Desired Virtual CPUs" and "Online Virtual CPUs" refer to the same thing.
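The numbers in the lparstat output above are self-consistent: the "Physical CPU Percentage : 10.00%" line is just the entitled capacity spread evenly across the virtual CPUs.

```python
# Entitled capacity and virtual CPU count, taken from the lparstat output above.
entitled_capacity = 0.80
virtual_cpus = 8

# Each virtual CPU is guaranteed this fraction of a physical core.
pct = entitled_capacity / virtual_cpus * 100
print(f"{pct:.2f}%")  # 10.00% -- matches "Physical CPU Percentage" in the output
```

Because the partition is uncapped, it can still burn up to 8 full cores when the shared pool has spare capacity; the 10% is only the guaranteed share.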

Yarn memory allocation for spark streaming

When we use Spark on YARN for non-streaming apps, the allocated memory generally matches the number of executors times the memory per executor. When running streaming apps, the allocated memory is immediately pushed to the limit (total memory), as shown in the YARN console.
With this set of parameters
--driver-memory 2g --num-executors 32 --executor-memory 500m
total memory 90G, memory used 85.88G
total vcores 64, vcores used 33
You would expect a baseline of 32 x 1 G (500 m + overhead) + driver memory, or around 34 G, and 33 vcores (32 executors + 1 driver).
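The arithmetic behind that expectation can be sketched as follows. On YARN, each container request is the executor (or driver) memory plus an overhead of max(384 MiB, 10% of the request) by default, rounded up to a multiple of yarn.scheduler.minimum-allocation-mb; the 1024 MB minimum here is an assumption, since the question does not state it.

```python
def container_mb(request_mb, min_alloc_mb=1024):
    """YARN container size for a Spark memory request, in MB (default overhead rule)."""
    overhead = max(384, int(request_mb * 0.10))       # spark.executor.memoryOverhead default
    raw = request_mb + overhead
    return -(-raw // min_alloc_mb) * min_alloc_mb     # round up to the YARN minimum allocation

executors = 32
exec_mb = container_mb(500)     # 500 m + 384 m overhead -> rounded up to 1024 MB
driver_mb = container_mb(2048)  # 2 g + 384 m overhead   -> rounded up to 3072 MB
total_gb = (executors * exec_mb + driver_mb) / 1024
print(total_gb)  # 35.0 -- close to the ~34 G estimate above
```

This explains the ~34-35 G baseline but not the jump to 85.88 G, which is the separate question below.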
question:
is the 64 vcore due to the requirement of 2 core pairs for streaming connection and processing?
how did the estimated 34 G get pushed to 85.88 G? is this always true that with streaming apps, yarn gives it all it has?