Question about initializing weedfs volume servers (disks) and fid - seaweedfs

I have two questions about SeaweedFS:
In each server I have 10 disks. How can I run the weed volume server on them?
Should I define "-dir=" 10 times on a single "./weed volume -max=100 -mserver=..." command,
or should I make a systemd unit file for each disk?
For example:
For sdb:
ExecStart=/home/weedfs/weed volume -max=100 -mserver=192.168.200.20:9333 -port=8080 -dataCenter=dc1 -dir="/srv/sdb/data"
For sdc:
ExecStart=/home/weedfs/weed volume -max=100 -mserver=192.168.200.20:9333 -port=8080 -dataCenter=dc1 -dir="/srv/sdc/data"
What is the best solution?
Can I create and define the fid myself instead of asking the master API?
For example, instead of these steps:
a) curl http://localhost:9333/dir/assign
{"fid":"14,8e3cf10b7811f43a542cfa34","url":"192.168.200.20:8080","publicUrl":"192.168.200.20:8080","count":1}
b) curl -F file=@/home/eitaa/weedfs/weed http://192.168.200.20:8080/14,8e3cf10b7811f43a542cfa34
I want to generate the fid directly myself (I mean this part: "8e3cf10b7811f43a542cfa34") with a desired volume id (e.g. "8") and upload the file.
Or should I use the master API (Assign a file key)?

Either way. Pick the one that is easier for you.
Possibly. You may need to run volume with "-index=leveldb" to optimize memory usage, in case the file keys are not monotonically increasing.
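Following up on that second answer, here is a minimal, unofficial sketch of the "generate the fid yourself" route, written in Scala with java.net.http (JDK 11+). The volume server address, volume id 8, file key, cookie and file path are placeholders taken from or around the question; the fid is built in the same "<volumeId>,<keyHex><cookieHex>" shape the master returns, and you are responsible for keeping keys unique within the volume (non-monotonic keys are where the "-index=leveldb" advice above comes in). If your SeaweedFS version rejects client-made fids, fall back to /dir/assign.

import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object SelfAssignedFidUpload {
  def main(args: Array[String]): Unit = {
    val volumeServer = "http://192.168.200.20:8080"   // assumed volume server (from the question)
    val volumeId     = 8                              // desired, already existing volume
    val fileKey      = 1001L                          // your own unique key within that volume
    val cookie       = "a542cfa3"                     // 8 hex chars, any value you choose

    // fid has the same shape the master hands out: "<volumeId>,<fileKeyHex><cookieHex>"
    val fid = s"$volumeId,${java.lang.Long.toHexString(fileKey)}$cookie"

    // Build a minimal multipart/form-data body with a single "file" part,
    // equivalent to `curl -F file=@...`.
    val path     = Paths.get("/home/eitaa/weedfs/weed")   // example file from the question
    val boundary = "----weed" + System.nanoTime()
    val head = ("--" + boundary + "\r\n" +
      "Content-Disposition: form-data; name=\"file\"; filename=\"" + path.getFileName + "\"\r\n" +
      "Content-Type: application/octet-stream\r\n\r\n").getBytes(StandardCharsets.UTF_8)
    val tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8)
    val body = head ++ Files.readAllBytes(path) ++ tail

    val request = HttpRequest.newBuilder()
      .uri(URI.create(s"$volumeServer/$fid"))
      .header("Content-Type", "multipart/form-data; boundary=" + boundary)
      .POST(HttpRequest.BodyPublishers.ofByteArray(body))
      .build()

    val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    println(s"${response.statusCode()} ${response.body()}")
  }
}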

Related

Spark - Failed to load collect frame - "RetryingBlockFetcher - Exception while beginning fetch"

We have a Scala Spark application that reads something like 70K records from the DB into a data frame; each record has 2 fields.
After reading the data from the DB, we do some minor mapping and load the result as a broadcast variable for later use.
Now, in the local environment, there is an exception, a timeout from the RetryingBlockFetcher, while running the following code:
dataframe.select("id", "mapping_id")
.rdd.map(row => row.getString(0) -> row.getLong(1))
.collectAsMap().toMap
The exception is:
2022-06-06 10:08:13.077 task-result-getter-2 ERROR org.apache.spark.network.shuffle.RetryingBlockFetcher Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to /1.1.1.1:62788
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)
    at org.apache.spark.network.netty.NettyBlockTransferService$$anon$2.createAndStart(NettyBlockTransferService.scala:122)
In the local environment, I simply create the Spark session with a local "spark.master".
When I limit the maximum number of records to 20K, it works well.
Can you please help? Maybe I need to configure something in my local environment so that the original code works properly?
Update:
I tried changing a lot of Spark-related configuration in my local environment (memory, number of executors, timeout-related settings, and more), but nothing helped; I just got the same timeout, only later...
I realized that the data frame I'm reading from the DB has a single partition of 62K records; after repartitioning it into 2 or more partitions, the process worked correctly and I managed to map and collect as needed.
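For reference, the workaround amounts to something like the following sketch, where dataframe is the same frame read from the DB and 2 is simply the smallest partition count that worked:
// Split the single 62K-record partition before pulling the pairs to the driver.
val mapping: Map[String, Long] =
  dataframe
    .select("id", "mapping_id")
    .repartition(2)                 // 2 or more partitions avoided the fetch timeout
    .rdd
    .map(row => row.getString(0) -> row.getLong(1))
    .collectAsMap()
    .toMap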
Any idea why this solves the issue? Is there a Spark configuration that can solve this instead of repartitioning?
Thanks!

Which is the correct Disk Used Size information getting from M_DISK_USAGE or M_DISKS view?

There are 2 system views provided by the SAP HANA database: M_DISK_USAGE and M_DISKS.
While comparing the two views, I noticed that the USED_SIZE information for the DATA, LOG, ... usage types differs between them.
Can someone please help me understand: if I want to monitor the disk usage of all usage types at the current time, which view should I use to get this information?
The question really is what you want to know.
If you want to know how large the filesystems of the HANA volumes are and how much space is left there, then M_DISKS is the right view.
Show free disk space in KiB:
/hana/data/SK1> df -BK .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda5 403469844K 134366892K 269102952K 34% /hana
Compared to the M_DISKS view (sizes converted from bytes to KiB):
DISK_ID DEVICE_ID HOST PATH SUBPATH FILESYSTEM_TYPE USAGE_TYPE TOTAL_SIZE_KB USED_SIZE_KB
1 113132 skullbox /hana/data/SK1/ mnt00001 xfs DATA 403469844 134366892
2 113132 skullbox /usr/sap/SK1/HDB01/backup/data/ xfs DATA_BACKUP 403469844 134366892
3 113132 skullbox /hana/log/SK1/ mnt00001 xfs LOG 403469844 134366892
4 113132 skullbox /usr/sap/SK1/HDB01/backup/log/ xfs LOG_BACKUP 403469844 134366892
5 113132 skullbox /usr/sap/SK1/HDB01/skullbox/ xfs TRACE 403469844 134366892
M_DISK_USAGE, on the other hand, shows what the HANA instance has allocated in total, grouped by usage type.
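So if the goal is the per-usage-type numbers the asker describes, that points to M_DISK_USAGE. A minimal monitoring sketch in Scala over JDBC, assuming the SAP HANA JDBC driver (ngdbc) is on the classpath, placeholder host/port/credentials, and the HOST, USAGE_TYPE and USED_SIZE columns of that view:

import java.sql.DriverManager

object DiskUsageByType {
  def main(args: Array[String]): Unit = {
    // Placeholder connection details for the HANA SQL port (3<instance>15).
    val url  = "jdbc:sap://skullbox:30115"
    val conn = DriverManager.getConnection(url, "MONITORING_USER", "***")
    try {
      val rs = conn.createStatement().executeQuery(
        "SELECT HOST, USAGE_TYPE, USED_SIZE FROM M_DISK_USAGE ORDER BY HOST, USAGE_TYPE")
      while (rs.next()) {
        // Convert to KiB to line up with df -BK above (assuming USED_SIZE is in bytes, as with M_DISKS).
        val usedKiB = rs.getLong("USED_SIZE") / 1024
        println(f"${rs.getString("HOST")}%-12s ${rs.getString("USAGE_TYPE")}%-12s $usedKiB%12d KiB")
      }
    } finally conn.close()
  }
}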

julia on PBS cluster: what to give to addprocs()?

I'm trying to set up a cluster across machines on a PBS-managed cluster. I'm perfectly able to compute within one node by saying julia -p 12 (after having reserved one node with 12 CPUs).
I understand that to use several machines, I have to add them to the master process with addprocs. I was able to do that on a different cluster (SGE), but on this one something is going wrong.
You can see everything I'm doing, including submit scripts etc., on this branch of a GitHub repo.
To get a list of machines, I parse the PBS_NODEFILE, which, for a submit script with the option
#PBS -l nodes=2:ppn=12 # give me 2 nodes with 12 processors each
looks like something like this:
red0004
red0004
...
red0004
red0347
...
red0347
I parse this file with bind_pe_procs() in sge.jl in the repo and give a vector of machine names to addprocs. When I submit this I get an error; I put up a gist with the resulting SSH error, and I don't know what it means.
Does this have to do with a system setting, i.e. do I have to talk to the sysadmin about SSH between machines? What are the right questions to ask?
I am also unsure about what exactly I have to give to addprocs(). I don't want to add the master process (I don't want worker 1 SSHing into itself?), so I exclude ENV["HOST"] = node001 from my list. But what about all processors with the same name node002? Do I list all of those:
machines = [ "red0347" for i=1:12]
or just once:
machines = ["red0347"]
in addprocs(machines)?
Thanks!

Need help in Apache Camel multicast/parallel/concurrent processing

I am trying to achieve concurrent/parallel processing for my requirement, but I did not find appropriate help in my multiple attempts in this regard.
I have 5 remote directories (which may be added or removed) containing log files. I want to download them every 15 minutes to my local directory and perform Lucene indexing after the FTP transfer job completes. I also want to add routes dynamically.
Since all those remote machines are different endpoints, with different routes, I don't have any particular endpoint to kick all of this off.
Start
<parallel>
<download remote dir from: sftp1>
<download remote dir from: sftp2>
....
</parallel>
<After above task complete>
<start Lucene indexing>
<end>
Repeat the above every 15 minutes.
I want to download all folders in parallel. Kindly suggest a solution if anybody has worked on a similar requirement.
I would like to know how to start/initiate these multiple routes (for the multiple remote directories) when I don't have a starting endpoint. I would like to start all FTP operations in parallel and, once they complete, run the indexing. Thanks for taking the time to read this post; I really appreciate your help.
I tried something like this:
from("bean:foo?method=start").multicast().to("direct:a").to("direct:b")...
from("direct:a").from("sftp:xxx").to("file:localdir")
from("direct:b").from("sftp:xxx").to("file:localdir")
camel-ftp supports periodic polling via the consumer.delay property.
Add camel-ftp consumer routes dynamically for each server, as shown in this unit test.
You can then aggregate your results based on a size or timeout value to initiate the Lucene indexing, etc.
[todo - put together an example]
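Along those lines, a rough sketch of one way to wire it up (not the referenced unit test), written in Scala against the Camel Java DSL; the SFTP URIs, credentials, local directory, Camel 2.x package names and the final indexing step are all placeholders to adapt:

import org.apache.camel.builder.RouteBuilder
import org.apache.camel.impl.DefaultCamelContext
import org.apache.camel.processor.aggregate.GroupedExchangeAggregationStrategy

object LogDownloadAndIndex {
  // Placeholder remote directories; consumer.delay=900000 polls every 15 minutes.
  val remoteDirs = Seq(
    "sftp://user@host1/var/logs?password=secret&consumer.delay=900000",
    "sftp://user@host2/var/logs?password=secret&consumer.delay=900000"
  )

  def main(args: Array[String]): Unit = {
    val context = new DefaultCamelContext()

    // One consumer route per remote directory; more routes can be added the
    // same way at runtime when a new server shows up.
    remoteDirs.foreach { uri =>
      context.addRoutes(new RouteBuilder() {
        override def configure(): Unit =
          from(uri)
            .to("file:/srv/logs/local")   // placeholder local download directory
            .to("direct:downloaded")      // hand each downloaded file to the collector
      })
    }

    // Collect the downloads and kick off indexing once no new file has arrived
    // for a while (completion by timeout); completionSize would work as well.
    context.addRoutes(new RouteBuilder() {
      override def configure(): Unit =
        from("direct:downloaded")
          .aggregate(constant(true), new GroupedExchangeAggregationStrategy())
          .completionTimeout(60000)
          .process(exchange => println("downloads settled, start Lucene indexing here"))
    })

    context.start()
    Thread.sleep(Long.MaxValue)   // keep the JVM alive for the polling consumers
  }
}

The same context.addRoutes(...) call is the hook for registering a route for a newly added remote directory at runtime, so the list does not have to be fixed at startup.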

How to dump Permgen?

I want to take a dump of the PermGen of an application server.
I do not want to use -XX:+TraceClassLoading -XX:+TraceClassUnloading as I do not want to restart the server, nor do I want to use jconsole.
Is there any tool like jmap (used for heap dumps; I didn't find any option for PermGen) to get the PermGen, so that I only have to supply the pid?
jmap -permstat <pid>
is going to produce output like this:
30337 intern Strings occupying 2746200 bytes.
class_loader classes bytes parent_loader alive? type
<bootstrap> 2031 7253392 null live <internal>
0x517474f0 1 1760 null dead sun/reflect/DelegatingClassLoader#0x43f95d38
0x4f83f670 1 1744 0x4ebfb8e8 dead sun/reflect/DelegatingClassLoader#0x43f95d38
[...]
total = 287 10020 35889952 N/A alive=3, dead=284 N/A
This is not a full dump, but it will allow you to do some investigation.
I am still looking into how to find more information.
It is not possible to 'dump PermGen' the way it is done for the heap.
In addition to jmap -permstat, as others have presented, you can analyze a standard heap dump to shed some light on your permanent generation, as described in this blog entry: 'The Unknown Generation: Perm'.
Because a heap dump does not really contain a lot of information about perm space, perm problems are difficult to tackle. Recently, I found this great article by Sporar, Sundararajan and Kieviet. The authors shed some light on the permanent generation. Of course, I had to check right away if and how I can use the Eclipse Memory Analyzer to analyze this “unknown” generation. This is what this blog is about.