I'm writing a program that monitors how processes use the GPU, and I found an API provided by NVML: nvmlDeviceGetProcessUtilization. According to the comment on this API, it reads the recent utilization of the GPU SM (3D/Compute), framebuffer, video encoder, and video decoder for running processes.
I called the API every 1 second or every 10 seconds and printed out the samples, as follows. "ReadTime" indicates when my program called the API; each "sample" line is one of the samples returned by the API.
ReadTime:10:58:56.194 - sample:[Pid:28128, Timestamp:10:58:55.462519, SmUtil:05, MemUtil:01, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:28104, Timestamp:10:58:55.127657, SmUtil:05, MemUtil:02, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:28084, Timestamp:10:58:48.051124, SmUtil:03, MemUtil:01, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:28050, Timestamp:10:58:53.944518, SmUtil:03, MemUtil:01, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:27989, Timestamp:10:58:47.043732, SmUtil:03, MemUtil:01, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:27976, Timestamp:10:58:53.604955, SmUtil:09, MemUtil:03, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:27814, Timestamp:10:58:48.386200, SmUtil:19, MemUtil:07, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:27900, Timestamp:10:58:56.132879, SmUtil:17, MemUtil:06, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:27960, Timestamp:10:58:51.423172, SmUtil:06, MemUtil:02, EncUtil:00, DecUtil:00]
ReadTime:10:58:56.194 - sample:[Pid:27832, Timestamp:10:58:47.883811, SmUtil:21, MemUtil:08, EncUtil:00, DecUtil:00]
SUM - GPUId:0, process:10, smSum:91, memSum:32, encSum:0, decSum:0
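The SUM line is my own aggregation over the per-process samples. As a minimal sketch (assuming the sums are plain totals of the per-process fields, with the values taken from the output above):

```python
# (pid, smUtil, memUtil, encUtil, decUtil) copied from the printed samples.
samples = [
    (28128, 5, 1, 0, 0),
    (28104, 5, 2, 0, 0),
    (28084, 3, 1, 0, 0),
    (28050, 3, 1, 0, 0),
    (27989, 3, 1, 0, 0),
    (27976, 9, 3, 0, 0),
    (27814, 19, 7, 0, 0),
    (27900, 17, 6, 0, 0),
    (27960, 6, 2, 0, 0),
    (27832, 21, 8, 0, 0),
]

# Sum each utilization field across all processes on the device.
sm_sum = sum(s[1] for s in samples)
mem_sum = sum(s[2] for s in samples)
enc_sum = sum(s[3] for s in samples)
dec_sum = sum(s[4] for s in samples)

print(f"SUM - GPUId:0, process:{len(samples)}, smSum:{sm_sum}, "
      f"memSum:{mem_sum}, encSum:{enc_sum}, decSum:{dec_sum}")
# -> SUM - GPUId:0, process:10, smSum:91, memSum:32, encSum:0, decSum:0
```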
My questions are:
Why is there only one sample per process in the results returned by this API, regardless of whether I call it every second or every 10 seconds?
The timestamps of the samples appear irregular. How does NVML determine the sampling time?
How is SmUtil derived? According to the description of the nvmlUtilization_st struct in the nvml.h header file, that struct is used by nvmlDeviceGetUtilizationRates, and GPU utilization there means "Percent of time over the past sample period during which one or more kernels was executing on the GPU". As I understand it, even if the GPU has many cores, the entire GPU is considered busy during a time slice in which even a single core is occupied, and that busy time is the numerator used to compute the utilization of the whole GPU. Given that definition, how should the per-process SmUtil returned by nvmlDeviceGetProcessUtilization be understood?
I want to reproduce the code of Cross-Modal Focal Loss (CVPR 2021), but I ran into a difficulty and don't know where to find a solution. The error is the following:
File "/data/run01/scz1974/chenjiawei/bob.paper.cross_modal_focal_loss_cvpr2021/src/bob.io.stream/bob/io/stream/stream_file.py", line 117, in get_stream_shape
descriptor = self.hdf5_file.describe(data_path)
RuntimeError: HDF5File - describe ('/HOME/scz1974/run/yanghao/fasdata/HQ-WMCA/MC-PixBiS-224/preprocessed/face-station/26.02.19/1_01_0064_0000_00_00_000-48a8d5a0.hdf5'): C++ exception caught: 'Cannot find dataset BASLER_BGR' at /HOME/scz1974/run/yanghao/fasdata/HQ-WMCA/MC-PixBiS-224/preprocessed/face-station/26.02.19/1_01_0064_0000_00_00_000-48a8d5a0.hdf5:''
The instructions assume that you have obtained the raw dataset, which has all the data channels. The preprocessed files contain only grayscale and SWIR differences. If you want to use grayscale and one of the SWIR differences as two channels, you can skip the preprocessing part as described in the documentation.
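The "Cannot find dataset BASLER_BGR" error means that dataset simply is not present in the preprocessed file. A quick way to check which datasets a given HDF5 file actually contains is to list them with h5py (a sketch; using h5py directly is an assumption, since the project itself reads the files through bob.io.stream, and the demo file created here stands in for one of your preprocessed .hdf5 files):

```python
import h5py

# Build a tiny demo file so the sketch is self-contained; with the real data
# you would open one of your preprocessed .hdf5 files instead.
demo_path = "demo_channels.hdf5"
with h5py.File(demo_path, "w") as f:
    f.create_dataset("SWIR_diff", shape=(4, 4), dtype="uint8")

def list_datasets(name, obj):
    # Visitor callback: print the full path of every dataset in the file.
    if isinstance(obj, h5py.Dataset):
        print(name, obj.shape, obj.dtype)

with h5py.File(demo_path, "r") as f:
    f.visititems(list_datasets)
```

Running this on one of the preprocessed files will show which channel datasets are available, so you can configure the loader to use only those.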
I'm trying to run a friend's code, which runs perfectly on her PC. It's exactly the same code, but I can't run it, so I suspect the problem is with my setup. I need to be sure it runs before I can submit it. Why won't it show the solution to the problem? I'm running the code in Coliop 4.1.0.
This is shown at the end of the output: Error (interfaces): Can't open GLPK Solution file:
C:/Users/Procópio/AppData/Local/Temp/cmpl.H14816.gsol
'''''''' PROBLEM
%arg -solver cbc
parameters:
farm:= set(1,2,3);
culture:= set(1,2,3);
area[farm]:= (400,650,350);
waterdisp[farm]:= (1800,2200,650);
maxcultarea[culture]:= (660,880,400);
wateruse[culture]:= (5.5,4,3.5);
revenue[culture]:= (5000,4000,1800);
variables:
x[farm,culture]: real[0..];
alpha : real[0..];
objectives:
tot_revenue: sum{i in farm, j in culture: x[i,j]*revenue[j]} -> max;
constraints:
cultivated_area {j in culture: sum{i in farm: x[i,j]}<= maxcultarea[j];}
water_used {i in farm: sum{j in culture: x[i,j] * wateruse[j]} <= waterdisp[i];}
area_used {i in farm: sum{j in culture: x[i,j]}<= area[i];}
same_area {i in farm: sum{j in culture: x[i,j]} = alpha * area[i];}
'''''''
OUTPUT WINDOW
CMPL model generation - running
CMPL version: 1.12.0
Authors: Thomas Schleiff, Mike Steglich
Distributed under the GPLv3
create model instance ...
write model instance ...
CMPL model generation - finished
Solver - running
Welcome to the CBC MILP Solver
Version: 2.9.9
Build Date: Mar 15 2018
command line - cbc C:/Users/Proc�pio/AppData/Local/Temp/cmpl.H14816.mps max solve gsolu C:/Users/Proc�pio/AppData/Local/Temp/cmpl.H14816.gsol (default strategy 1)
Unable to open file C:/Users/Proc�pio/AppData/Local/Temp/cmpl.H14816.mps
** Current model not valid
** Current model not valid
No match for C:/Users/Proc�pio/AppData/Local/Temp/cmpl.H14816.gsol - ? for list of commands
Total time (CPU seconds): 0.00 (Wallclock seconds): 0.00
Error (interfaces): Can't open GLPK Solution file: C:/Users/Procópio/AppData/Local/Temp/cmpl.H14816.gsol
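The root cause is visible in the solver log: the accented "ó" in the user name arrives at CBC as "Proc�pio", i.e. the temp path is handed over in the wrong character encoding, so CBC cannot open the .mps file and never writes the .gsol file that CMPL then fails to read. The mangling can be reproduced in a few lines (a sketch; the exact codepages involved are an assumption, but the pattern is the classic Windows-1252 vs. UTF-8 mismatch):

```python
# The Windows user name contains "ó". If the path is encoded in one codepage
# (here cp1252, a common Windows default) but decoded as UTF-8 by the
# receiving program, the accented byte is invalid UTF-8 and is shown as the
# replacement character U+FFFD ("�"):
path = "C:/Users/Procópio/AppData/Local/Temp/cmpl.H14816.mps"
mangled = path.encode("cp1252").decode("utf-8", errors="replace")
print(mangled)  # -> C:/Users/Proc�pio/AppData/Local/Temp/cmpl.H14816.mps
```

A common workaround is to point the TEMP/TMP environment variables at a directory whose path contains only ASCII characters (e.g. C:\Temp), so the intermediate .mps and .gsol files land in a path every tool in the chain can open.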
I have a matrix in my drone.yml, but it should only apply to one of my pipeline steps. Is it possible to apply the matrix to only certain steps?
For example, I do not want the matrix to apply on the publish step:
pipeline:
test:
image: ruby
commands:
- bundle exec rspec ${TESTFOLDER}
publish:
image: ruby
commands:
- magic-publish
matrix:
TESTFOLDER:
- integration/user
- integration/shopping_cart
- integration/payments
- units
If you wish to "magic-publish" only once, you might want to restrict it to a single element of your matrix (maybe the last one):
when:
matrix:
TESTFOLDER: units
You could also attach the deployment step to a tag or deploy event.
cf. How to setup conditional build steps
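Putting that condition into the pipeline from the question, the publish step would look something like this (a sketch; `magic-publish` is the placeholder command from the question, and the step still starts in every matrix build but only executes when TESTFOLDER is `units`):

```yaml
pipeline:
  test:
    image: ruby
    commands:
      - bundle exec rspec ${TESTFOLDER}
  publish:
    image: ruby
    commands:
      - magic-publish
    when:
      matrix:
        TESTFOLDER: units

matrix:
  TESTFOLDER:
    - integration/user
    - integration/shopping_cart
    - integration/payments
    - units
```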
When executing a Hive query, here is the output. For "Map 1" and "Reducer 2", what do the 1 and 2 mean?
Map 1: 21/27 Reducer 2: 0/1
Map 1: 22/27 Reducer 2: 0/1
Map 1: 23/27 Reducer 2: 0/1
Map 1: 24/27 Reducer 2: 0/1
Map 1: 26/27 Reducer 2: 0/1
Map 1: 27/27 Reducer 2: 0/1
Map 1: 27/27 Reducer 2: 1/1
thanks in advance,
Lin
The Hive query is compiled into an execution plan made up of stages, and each stage is assigned mappers or reducers based on the input. The numbers in "Map 1" and "Reducer 2" are simply the identifiers of those stages within the plan; they are not counts of anything. The fraction after each name shows that stage's progress as completed tasks over total tasks: "Map 1: 21/27" means 21 of the 27 map tasks have finished, and "Reducer 2: 0/1" means the single reduce task has not yet completed.
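As a concrete illustration of the notation, each progress line can be read as "stage name: completed tasks / total tasks" (a small parsing sketch over a line from the output above):

```python
import re

# Each Hive progress line pairs a stage label ("Map 1", "Reducer 2") with
# completed/total task counts for that stage.
line = "Map 1: 27/27 Reducer 2: 1/1"
stages = re.findall(r"(\w+ \d+): (\d+)/(\d+)", line)
for name, done, total in stages:
    print(f"{name}: {done} of {total} tasks complete")
# -> Map 1: 27 of 27 tasks complete
#    Reducer 2: 1 of 1 tasks complete
```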