How vCores and Memory get allocated from Spark Pool - azure-synapse

I have below spark pool config. Nodes : 3 to 10.
Spark Job config:
Judging by the allocation below, it looks like the job is using all 10 nodes from the pool: 10 x 8 vCores = 80 vCores; 10 x 64 GB = 640 GB.
BUT I have set the number of executors (min and max) to 4 to 6. So shouldn't it max out at 6 x 8 vCores and 6 x 64 GB? Please correct me if I am missing something here.

You are confusing the Spark Pool's allocated vCores and memory with the Spark Job's executor size; these are two different things.
You have created a ContractsMed Spark Pool, which has a maximum of 10 nodes, each with 8 vCores and 64 GB memory. That is why the last snippet you shared shows the Spark Pool's allocated vCores and memory, not the Spark Job details. So 80 vCores and 640 GB is the Spark Pool size, not the Spark Job size.
Now coming to the Spark Job configuration, where you are using the ContractsMed Spark Pool. As you have configured a maximum of 6 executors with 8 vCores and 56 GB memory each, those resources, i.e. 6x8=48 vCores and 6x56=336 GB memory, will be fetched from the Spark Pool and used in the Job.
The remaining resources (80-48=32 vCores and 640-336=304 GB memory) in the Spark Pool will remain unused and can be used by any other Spark Job.
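The arithmetic in this answer can be checked with a minimal sketch. The helper name `job_footprint` is hypothetical, and the calculation counts executors only; it assumes the driver's own vCores and memory are ignored, as the answer above does.

```python
def job_footprint(executors, vcores_each, mem_gb_each):
    """Return (vCores, memory in GB) requested by the executors of one job."""
    return executors * vcores_each, executors * mem_gb_each


# Pool ceiling from the question: up to 10 nodes of 8 vCores / 64 GB each.
pool_vcores, pool_mem_gb = 10 * 8, 10 * 64

# Job ceiling from the answer: up to 6 executors of 8 vCores / 56 GB each.
job_vcores, job_mem_gb = job_footprint(6, 8, 56)

# What stays free in the pool for other jobs (driver not counted, by assumption).
free_vcores = pool_vcores - job_vcores
free_mem_gb = pool_mem_gb - job_mem_gb

print(job_vcores, job_mem_gb)    # 48 336
print(free_vcores, free_mem_gb)  # 32 304
```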

Related

vertex failed. Out of memory error in Azure HDINSIGHT hive

I am experiencing an out-of-memory issue while joining 2 datasets; one contains 39M rows and the other contains 360K rows.
I have 2 worker nodes, each of the worker node has maximum memory of 125 GB.
In Yarn Memory allocated for all YARN containers on a node = 96GB
Minimum Container Size (Memory) = 3072
In Hive settings :
hive.tez.java.opts=-Xmx2728M -Xms2728M -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB
hive.tez.container.size=3410
What values should I set to get rid of the out-of-memory issue?
I solved it by increasing the YARN memory allocation:
Minimum Container Size (Memory): from 3072 to 3840
Memory allocated for all YARN containers on a node: from 96 to 120 GB (each node had 120GB)
Percentage of physical CPU allocated for all containers on a node 80%
Number of virtual cores 8
https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-hive-out-of-memory-error-oom

What will be the best choice for batch size for one device (using Mirrored Strategy in TF)?

Question:
Suppose you have 4 GPUs (having 2GB memory each) to train your deep learning model. You have 1000 data points in your dataset that takes around 10 GB of storage. What will be the best choice for batch size for one device (using Mirrored Strategy in TF)?
Can someone help me to solve this assignment problem? Thanks in advance.
Each GPU has 2 GB of memory and there are 4 GPUs, which means you have a total of 8 GB of memory to work with.
You can't fit 10 GB of data into 8 GB in one go, so you split the 10 GB in half and get an overall batch size of 500 data points (or rather 512, to make it a power of 2).
Now you distribute these data points across the 4 GPUs, getting a batch size of 128 data points per device (125 if you keep exactly 500).
So the overall batch size would be 512 data points, and the per-GPU batch size would be 128.
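The same reasoning can be sketched as plain arithmetic. This is only a rough estimate: the function name `per_replica_batch` is hypothetical, and it assumes all data points are the same size and that GPU memory holds nothing but the batch (no model weights, activations, or optimizer state), which a real training run would not satisfy.

```python
def per_replica_batch(dataset_gb, num_points, gpu_mem_gb):
    """Largest power-of-two batch of equally sized points that fits on one GPU."""
    # Points that fit on one device, using integer arithmetic to avoid rounding.
    max_points = gpu_mem_gb * num_points // dataset_gb
    batch = 1
    while batch * 2 <= max_points:
        batch *= 2
    return batch


num_gpus = 4
per_gpu = per_replica_batch(dataset_gb=10, num_points=1000, gpu_mem_gb=2)
global_batch = per_gpu * num_gpus  # the global batch is split across replicas

print(per_gpu, global_batch)  # 128 512
```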

Why no linear scaling of Redis Cluster

I am trying to build a horizontally scalable system based on Redis Cluster, so I've measured the throughput of Redis Cluster with different numbers of nodes. But the measured results don't show the linear scalability that the cluster spec states: “High performance and linear scalability up to 1000 nodes.”
redis cluster benchmark:
The image above shows the measured results for Redis Cluster with (3+3), (4+4), (5+5), (6+6), (8+8), (10+10) and (12+12) nodes, where (3+3) means 3 master nodes plus 3 slave nodes. The results for C (create) and U (update) don't show the linear scalability of Redis Cluster, as the following picture shows.
I'd like to know why these measured results don't show linear scalability. Is there anything that could be limiting the scaling?
My test environment and related information are described as below
Server
HW: HP BL460c G9, 24 CPU (E5-2620 v3 @ 2.40GHz), 64G memory, 300G disk
I have two machines. In order to know the capacity of one HW machine, I run all master nodes on one machine and all slave nodes on the other machine. All Redis nodes are included in one Redis Cluster.
OS: SLES 12
I have updated some system settings to achieve higher performance.
echo 65535 > /proc/sys/net/core/somaxconn
echo 65535 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo never > /sys/kernel/mm/transparent_hugepage/enabled
sysctl vm.overcommit_memory=1
sysctl vm.swappiness=0
Furthermore, I've turned off swap, which could cause very unstable throughput when an AOF rewrite happened, even with swappiness already set to 0. As observed, the 15 million records in my test occupy around 48G of memory.
Redis 3.0.6: To eliminate the bursts caused by RDB, I turned off all RDB and enabled only AOF. All other settings in redis.conf are left at their default values.
Client
HW: HP DL380 G7, 16 CPU (E5620 @ 2.40GHz), 24G memory, 600G disk
OS: SLES 12
YCSB (0.6.0) with jedis (2.8.0)
I use a hash key to store each record (1 key and 21 fields) and N sorted sets to store all keys and their random scores. Here N is the number of master nodes in the cluster. The N sorted sets are distributed evenly across the master nodes.
The YCSB workload configuration is pasted below:
workload=com.yahoo.ycsb.workloads.CoreWorkload
recordcount=15000000
operationcount=150000000
insertstart=0
fieldcount=21
fieldlength=188
readallfields=true
writeallfields=false
fieldlengthdistribution=zipfian
readproportion=0.0
updateproportion=1.0
insertproportion=0
readmodifywriteproportion=0.0
scanproportion=0
maxscanlength=1000
scanlengthdistribution=uniform
insertorder=hashed
requestdistribution=zipfian
hotspotdatafraction=0.2
hotspotopnfraction=0.8
table=subscriber
measurementtype=histogram
histogram.buckets=1000
timeseries.granularity=1000
In most cases, the computing resources look sufficient to me, even though the throughput has already hit its limit.
CPU: plenty of CPU is left, 60~70% idle.
I/O usage: not very busy, 30~40% utilization at peak time.
Memory: only memory can be almost exhausted at peak time, i.e. when an AOF rewrite happens. Most of the time it's around 80%.

Yarn memory allocation for spark streaming

When we use spark on yarn for non-streaming apps, we generally get the allocated memory to match the number of executors times memory per executor. When doing streaming apps, the allocated memory is immediately pushed to the limit (total memory) as shown in the yarn console.
With this set of parameters
--driver-memory 2g --num-executors 32 --executor-memory 500m
total memory 90G, memory used 85.88G
total vcores 64, vcores used 33
you would expect a basis of 32 * 1 G (500m + overhead) + driver memory or around 34 G, and 33 vcores (32 workers + 1 driver)
Questions:
Is the 64 vcore total due to a requirement of 2 cores per executor for the streaming connection and processing?
How did the estimated 34 G get pushed to 85.88 G? Is it always true that with streaming apps, YARN gives them everything it has?
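The expected baseline in the question can be sketched as follows. This is an estimate under stated assumptions, not a description of this particular cluster: it assumes Spark's default memory overhead of max(384 MB, 10% of the requested memory), a YARN minimum allocation of 1024 MB with requests rounded up to a multiple of it, and cluster mode, where the driver runs in its own YARN container. The helper `container_mb` is hypothetical.

```python
import math


def container_mb(heap_mb, min_alloc_mb=1024, overhead_frac=0.10, overhead_floor_mb=384):
    """Estimate the YARN container size for one Spark JVM (assumed defaults)."""
    overhead = max(overhead_floor_mb, int(heap_mb * overhead_frac))
    requested = heap_mb + overhead
    # Assumption: YARN rounds each request up to a multiple of the minimum allocation.
    return math.ceil(requested / min_alloc_mb) * min_alloc_mb


executors = 32
executor_mb = container_mb(500)   # 500 + 384 = 884, rounded up to 1024
driver_mb = container_mb(2048)    # 2048 + 384 = 2432, rounded up to 3072

total_gb = (executors * executor_mb + driver_mb) / 1024
print(executor_mb, driver_mb, total_gb)  # 1024 3072 35.0
```

Under these assumptions the baseline comes out at about 35 GB, close to the 34 G estimated in the question and far below the 85.88 G reported as used.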

Does Neo4j calculate JVM heap on Ubuntu?

In the neo4j-wrapper.conf file I see this:
# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
#wrapper.java.initmemory=512
#wrapper.java.maxmemory=512
Does that mean that I should not worry about -Xms and -Xmx?
I've read elsewhere that -XX:ParallelGCThreads=4 -XX:+UseNUMA -XX:+UseConcMarkSweepGC would be good.
Should I add that on my Intel® Core™ i7-4770 Quad-Core Haswell 32 GB DDR3 RAM 2 x 240 GB 6 Gb/s SSD (Software-RAID 1) machine?
I would still configure it manually.
Set both initmemory and maxmemory to 12 GB and use the remaining 16GB for memory mapping in neo4j.properties. Try to match it to your store file sizes.