Running into an issue when running a query on Impala - sql

I've been running into the following issue when running basic queries on Impala (for example: select * from table limit 100) recently. I did some online research but have not found a fix for this. Any insights on how I could fix this? I use HUE for querying.
ExecQueryFInstances rpc query_id=5d4f8d25428xxxx:813cfbd30000xxxx failed:
Failed to get minimum memory reservation of 8.00 MB on daemon ser1830.xxxx.com:22000
for query xxxxxxxxx:813cfbd30xxxxxxxxx due to following error:
Failed to increase reservation by 8.00 MB because it would exceed the applicable
reservation limit for the "Process" ReservationTracker: reservation_limit=68.49 GB
reservation=68.49 GB used_reservation=0 child_reservations=68.49 GB
The top 5 queries that allocated memory under this tracker are:
Query(5240724f8exxxxxx:ab377425000xxxxxx): Reservation=41.81 GB ReservationLimit=64.46 GB OtherMemory=133.44 MB Total=41.94 GB Peak=42.62 GB
Query(394dcbbaf6bxxxxx2f4760000xxxxxx0): Reservation=26.68 GB ReservationLimit=64.46 GB OtherMemory=92.94 KB Total=26.68 GB Peak=26.68 GB
Query(5d4f8d25428xxxxx:813cfbd30000xxxxx): Limit=100.00 GB Reservation=0 ReservationLimit=64.46 GB OtherMemory=0 Total=0 Peak=0
Memory is likely oversubscribed. Reducing query concurrency or configuring admission control may help avoid this error.
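As the message itself suggests, one workaround until concurrency or admission control is sorted out is to cap how much memory a single query may reserve. A minimal sketch for the query from the question, assuming a 2 GB cap is enough (the value is only an example and can be set per session in Hue or impala-shell):
-- cap the per-query memory reservation, then run the statement
SET MEM_LIMIT=2g;
select * from table limit 100;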

Related

Postgres RDS slow response time

We have an AWS RDS Postgres DB of type t3.micro.
I am running simple queries on a pretty small table and I get pretty high response times - around 2 seconds per query run.
Query example:
select * from package where id='late-night';
The CPU usage is not high (around 5%).
We tried creating a bigger RDS DB (t3.medium) from a snapshot of the original one and the performance did not improve at all.
Table size: 2,600 rows
We tested the connection with both the external IP and the internal IP.
Disk size: 20 GiB
Storage type: SSD
Is there a way to improve performance?
Thanks for the help!
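One way to see where the two seconds go is to ask Postgres for the actual plan and timing; a minimal diagnostic sketch against the table from the question:
explain (analyze, buffers) select * from package where id = 'late-night';
If the execution time reported there is only a few milliseconds, the remaining time is likely spent on the network or on connection setup rather than in the query itself.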

How can I change physical memory in a mapreduce/hive job?

I'm trying to run a Hive INSERT OVERWRITE query on an EMR cluster with 40 worker nodes and a single master node.
However, while running the INSERT OVERWRITE query, as soon as it reaches the state
Stage-1 map = 100%, reduce = 100%, Cumulative CPU 180529.86 sec
I get the following error:
Ended Job = job_1599289114675_0001 with errors
Diagnostic Messages for this Task:
Container [pid=9944,containerID=container_1599289114675_0001_01_041995] is running beyond physical memory limits. Current usage: 1.5 GB of 1.5 GB physical memory used; 3.2 GB of 7.5 GB virtual memory used. Killing container.
Dump of the process-tree for container_1599289114675_0001_01_041995 :
I'm not sure how I can change the 1.5 GB physical memory number. I don't see such a number anywhere in my configuration, and I don't understand how that 1.5 GB is being calculated.
I even tried setting "yarn.nodemanager.vmem-pmem-ratio" to 5, as suggested in some forums. But irrespective of this change, I still get the error.
This is how the job starts:
Number of reduce tasks not specified. Estimated from input data size: 942
Hadoop job information for Stage-1: number of mappers: 910; number of reducers: 942
And this is how my configuration file looks for the cluster. I'm unable to understand which settings I have to change to avoid this issue. Could it also be due to Tez settings, even though I'm not using it as the execution engine?
Any suggestions will be greatly appreciated, thanks.
While opening the Hive console, append the following to the command:
--hiveconf mapreduce.map.memory.mb=8192 --hiveconf mapreduce.reduce.memory.mb=8192 --hiveconf mapreduce.map.java.opts=-Xmx7600M
In case you still get the Java heap error, try increasing to higher values, but make sure that mapreduce.map.java.opts doesn't exceed mapreduce.map.memory.mb.
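For example, assuming the query lives in a script called insert_overwrite.hql (the file name is only a placeholder), the full invocation could look like this; the same values can also be set inside an open Hive session with SET:
hive --hiveconf mapreduce.map.memory.mb=8192 --hiveconf mapreduce.reduce.memory.mb=8192 --hiveconf mapreduce.map.java.opts=-Xmx7600M -f insert_overwrite.hql
-- or, inside an open Hive session:
SET mapreduce.map.memory.mb=8192;
SET mapreduce.reduce.memory.mb=8192;
SET mapreduce.map.java.opts=-Xmx7600M;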

Azure Database cannot reduce the sizing

Azure Database cannot reduce the sizing from 750 to 500 GB.
Overall sizing as checked in the Azure dashboard:
Used space is 248.29 GB.
Allocated space is 500.02 GB.
Maximum storage size is 750 GB.
The validation message when I try to reduce the sizing is:
The storage size of your database cannot be smaller than the currently
allocated size. To reduce the database size, the database first needs
to reclaim unused space by running DBCC SHRINKDATABASE (XXX_Database
Name). This operation can impact performance while it is running and
may take several hours to complete.
What should I do?
Best Regards
If we want to reduce the database max size, we need to ensure that the allocated space is smaller than the new size we set. So, in your situation, we first need to reclaim unused allocated space. To do that, we can run the following command:
-- Shrink database data space allocated.
DBCC SHRINKDATABASE (N'db1')
For more details, please refer to the document
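Before shrinking, it can also help to confirm how much of the allocated space is actually unused; a hedged sketch using the catalog views (file sizes come back in 8 KB pages):
-- allocated vs. used space per data file, converted to MB
SELECT name,
       size * 8.0 / 1024 AS allocated_mb,
       CAST(FILEPROPERTY(name, 'SpaceUsed') AS int) * 8.0 / 1024 AS used_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';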
I got this error via the CLI after also disabling read scale, and the solution was to remove --max-size 250GB from the command:
az sql db update -g groupname -s servername -n dbname --edition GeneralPurpose --capacity 1 --max-size 250GB --family Gen5 --compute-model Serverless --no-wait
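That is, the same update call with the size flag dropped:
az sql db update -g groupname -s servername -n dbname --edition GeneralPurpose --capacity 1 --family Gen5 --compute-model Serverless --no-wait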

mbr2gpt: Too many MBR partitions found, no room to create EFI system partition

I tried to convert MBR to GPT with mbr2gpt, introduced with Windows 10 build 1703, and it failed with:
mbr2gpt: Too many MBR partitions found, no room to create EFI system partition.
Full log:
2017-06-07 22:23:24, Info ESP partition size will be 104857600
2017-06-07 22:23:24, Info MBR2GPT: Validating layout, disk sector size is: 512 bytes
2017-06-07 22:23:24, Error ValidateLayout: Too many MBR partitions found, no room to create EFI system partition.
2017-06-07 22:23:24, Error Disk layout validation failed for disk 1
The mbr2gpt disk conversion tool needs three conditions to validate the disk layout:
Admin rights (which you already know)
One physical disk (or hard drive) with a boot partition (MSR) AND an OS partition
The validation normally allows one additional partition (often a recovery partition)
If you have more than three partitions, check this with diskpart:
Microsoft Windows [Version 10.0.15063]
(c) 2017 Microsoft Corporation. All rights reserved.
C:\WINDOWS\system32>diskpart
Microsoft DiskPart version 10.0.15063.0
Copyright (C) Microsoft Corporation.
On computer: SILERRAS-PC
DISKPART> list disk
Disk ### Status Size Free Dyn GPT
-------- ------ ------ ------- ---- ---
Disk 0 Online 117 GB 1024 KB * *
Disk 1 Online 489 GB 455 MB *
Disk 2 Online 186 GB 0 B
Disk 3 Online 931 GB 0 B
Disk 4 Online 931 GB 1024 KB *
DISKPART> select disk 1
Disk 1 is now the selected disk.
DISKPART> list partition
Partition ### Type Size Offset
------------- ---------------- ------- -------
Partition 1 System 350 MB 1024 KB
Partition 2 Primary 487 GB 351 MB
Partition 3 Recovery 452 MB 488 GB
DISKPART>
Try to reduce the number of partitions to three.
If you have more than two recovery partitions, check them with "ReAgentc /info". This command shows you the current recovery partition. Often only one of them is active. You can delete the inactive ones with diskpart. Please be careful about which partition you delete. The diskpart command is "delete partition override".
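For example, assuming the partition to remove turns out to be partition 4 on disk 1 (the numbers here are only illustrative; confirm them with ReAgentc /info and list partition first), the diskpart sequence would be:
DISKPART> select disk 1
DISKPART> select partition 4
DISKPART> delete partition override
Once the layout is down to three partitions, the validation can be re-run with mbr2gpt /validate /disk:1 /allowFullOS (the /allowFullOS switch is needed when running from the full OS rather than from Windows PE).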
I hope my guide is helpful for you.

Slow query using Pyhs2 to fetch data in Hive

I tried to use Pyhs2 to communicate with Hive, fetch data and put it in a list (temporarily stored in RAM).
But it took a long time to query a table using very simple HQL like 'select fields1,fields2... from table_name', where the data scale is about 7 million rows and fewer than 20 fields. The whole process took nearly 90 minutes.
My server: CentOS 6.5, 8 CPU units, 32 processors and 32 GB RAM
Hadoop cloud: more than 200 machines
Can someone help solve this problem? Thanks very much.
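For reference, a minimal sketch of the fetch pattern in question (host, credentials and table name are placeholders). If a query with a small LIMIT comes back quickly, the 90 minutes are likely spent transferring and buffering the 7 million rows on the client side rather than in the query itself:
import pyhs2

# connection details below are placeholders
with pyhs2.connect(host='hive-server.example.com', port=10000,
                   authMechanism='PLAIN', user='user', password='pass',
                   database='default') as conn:
    with conn.cursor() as cur:
        # try a small LIMIT first to separate query time from transfer time
        cur.execute("select fields1, fields2 from table_name limit 1000")
        rows = cur.fetch()  # fetch() pulls the whole result set into memory
        print(len(rows))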