How to calculate GPU utilization of a specified process on Windows?

The machines are divided into two groups: those with an integrated graphics card and those with an Nvidia graphics card. I want to monitor GPU performance per process, write the results to a log file, and upload that file to a server.

if ((res = nvmlDeviceGetAccountingStats(device, pids[index], &stat)) == NVML_SUCCESS)
{
    qDebug("GPU mem: %u%%", stat.memoryUtilization); // percent of time device memory was read/written by this process
    qDebug("GPU: %u%%", stat.gpuUtilization);        // percent of time a kernel of this process was running on the GPU
}
else
{
    qDebug("get accounting stat error: %s", nvmlErrorString(res));
}
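For the Nvidia case, the call above only returns data if accounting mode is enabled on the device and the tracked PIDs have been queried first. A minimal sketch of that surrounding setup with NVML (device index 0 and the PID buffer size are placeholder choices; enabling accounting typically requires administrator rights):

#include <nvml.h>
#include <cstdio>

// Sketch: enable accounting, list the PIDs NVML is tracking, then query
// per-process utilization for each of them. Error handling is abbreviated.
int main()
{
    nvmlDevice_t device;
    if (nvmlInit() != NVML_SUCCESS) return 1;
    if (nvmlDeviceGetHandleByIndex(0, &device) != NVML_SUCCESS) return 1; // first GPU assumed

    nvmlDeviceSetAccountingMode(device, NVML_FEATURE_ENABLED); // needs elevated privileges

    unsigned int pids[128]; // placeholder buffer size
    unsigned int count = 128;
    if (nvmlDeviceGetAccountingPids(device, &count, pids) == NVML_SUCCESS)
    {
        for (unsigned int i = 0; i < count; i++)
        {
            nvmlAccountingStats_t stat;
            if (nvmlDeviceGetAccountingStats(device, pids[i], &stat) == NVML_SUCCESS)
                printf("pid %u: gpu %u%%, mem %u%%\n", pids[i], stat.gpuUtilization, stat.memoryUtilization);
        }
    }
    nvmlShutdown();
    return 0;
}

Note that NVML only covers the Nvidia card; it does not report anything for the integrated GPU.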

Related

How to pick target hardware for OpenCL

How do you tell OpenCL to target a build for a GPU instead of a CPU? Will it automatically pick one over the other?
OpenCL will not automatically pick a device for you. You have to explicitly choose a platform (Intel/AMD/Nvidia) and a device (CPU/GPU) on that platform. Platform #0 and device #0 by default will not always give you the GPU. This is quite cumbersome when running code on different computers, as on each you have to manually select the device.
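For illustration, a minimal sketch of that manual selection with the C++ bindings (assuming the <CL/cl.hpp> header; the platform/device indices are placeholders you would have to change per machine):

#include <CL/cl.hpp> // C++ bindings; newer SDKs ship <CL/opencl.hpp> instead
#include <vector>

// Explicitly pick platform 0 and its first GPU; both indices are placeholders.
cl::Device pick_device_manually()
{
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);                         // all installed OpenCL drivers
    std::vector<cl::Device> devices;
    platforms[0].getDevices(CL_DEVICE_TYPE_GPU, &devices); // only the GPUs of this platform
    return devices[0];
}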
However there is a smart solution for this, a lightweight OpenCL-Wrapper that automatically picks the fastest available GPU (or CPU if no GPU is available) for you. This works by reading out the number of compute units and clock frequency and adding missing information (number of cores per CU) via vendor and device name with a small database.
Find the source code with an example here.
Here is just the code for automatically selecting the fastest device:
vector<cl::Device> cl_devices; // get all devices of all platforms
{
    vector<cl::Platform> cl_platforms; // get all platforms (drivers)
    cl::Platform::get(&cl_platforms);
    for(uint i=0u; i<(uint)cl_platforms.size(); i++) {
        vector<cl::Device> cl_devices_available;
        cl_platforms[i].getDevices(CL_DEVICE_TYPE_ALL, &cl_devices_available); // to query only GPUs, use CL_DEVICE_TYPE_GPU here
        for(uint j=0u; j<(uint)cl_devices_available.size(); j++) {
            cl_devices.push_back(cl_devices_available[j]);
        }
    }
}
cl::Device cl_device; // select fastest available device
{
    float best_value = 0.0f;
    uint best_i = 0u; // index of fastest device
    for(uint i=0u; i<(uint)cl_devices.size(); i++) { // find device with highest (estimated) floating point performance
        const string name = trim(cl_devices[i].getInfo<CL_DEVICE_NAME>()); // device name
        const string vendor = trim(cl_devices[i].getInfo<CL_DEVICE_VENDOR>()); // device vendor
        const uint compute_units = (uint)cl_devices[i].getInfo<CL_DEVICE_MAX_COMPUTE_UNITS>(); // compute units (CUs) can contain multiple cores depending on the microarchitecture
        const uint clock_frequency = (uint)cl_devices[i].getInfo<CL_DEVICE_MAX_CLOCK_FREQUENCY>(); // in MHz
        const bool is_gpu = cl_devices[i].getInfo<CL_DEVICE_TYPE>()==CL_DEVICE_TYPE_GPU;
        const uint ipc = is_gpu?2u:32u; // IPC (instructions per cycle) is 2 for GPUs and 32 for most modern CPUs
        const bool nvidia_192_cores_per_cu = contains_any(to_lower(name), {" 6", " 7", "ro k", "la k"}) || (clock_frequency<1000u&&contains(to_lower(name), "titan")); // identify Kepler GPUs
        const bool nvidia_64_cores_per_cu = contains_any(to_lower(name), {"p100", "v100", "a100", "a30", " 16", " 20", "titan v", "titan rtx", "ro t", "la t", "ro rtx"}) && !contains(to_lower(name), "rtx a"); // identify P100, Volta, Turing, A100, A30
        const bool amd_128_cores_per_dualcu = contains(to_lower(name), "gfx10"); // identify RDNA/RDNA2 GPUs where dual CUs are reported
        const float nvidia = (float)(contains(to_lower(vendor), "nvidia"))*(nvidia_192_cores_per_cu?192.0f:(nvidia_64_cores_per_cu?64.0f:128.0f)); // Nvidia GPUs have 192 cores/CU (Kepler), 128 cores/CU (Maxwell, Pascal, Ampere) or 64 cores/CU (P100, Volta, Turing, A100)
        const float amd = (float)(contains_any(to_lower(vendor), {"amd", "advanced"}))*(is_gpu?(amd_128_cores_per_dualcu?128.0f:64.0f):0.5f); // AMD GPUs have 64 cores/CU (GCN, CDNA) or 128 cores/dualCU (RDNA, RDNA2), AMD CPUs (with SMT) have 1/2 core/CU
        const float intel = (float)(contains(to_lower(vendor), "intel"))*(is_gpu?8.0f:0.5f); // Intel integrated GPUs usually have 8 cores/CU, Intel CPUs (with HT) have 1/2 core/CU
        const float arm = (float)(contains(to_lower(vendor), "arm"))*(is_gpu?8.0f:1.0f); // ARM GPUs usually have 8 cores/CU, ARM CPUs have 1 core/CU
        const uint cores = to_uint((float)compute_units*(nvidia+amd+intel+arm)); // for CPUs, compute_units is the number of threads (twice the number of cores with hyperthreading)
        const float tflops = 1E-6f*(float)cores*(float)ipc*(float)clock_frequency; // estimated device FP32 floating point performance in TeraFLOPs/s
        if(tflops>best_value) {
            best_value = tflops;
            best_i = i;
        }
    }
    const string name = trim(cl_devices[best_i].getInfo<CL_DEVICE_NAME>()); // device name
    cl_device = cl_devices[best_i];
    print_info(name); // print device name
}
Alternatively, you can make it automatically choose the device with the most memory rather than the most FLOPs, or a device with a specified ID from the list of all devices across all platforms. There are many more benefits to using this wrapper, for example significantly simpler code for handling arrays and automatic tracking of total device memory allocation, all without impacting performance in any way.
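For example, a selection by largest global memory could look like this (a sketch in the same style, assuming the cl_devices vector from above; CL_DEVICE_GLOBAL_MEM_SIZE is reported in bytes):

cl::Device cl_device_most_memory; // select device with the largest global memory
{
    cl_ulong best_memory = 0ull;
    uint best_i = 0u; // index of the device with the most memory
    for(uint i=0u; i<(uint)cl_devices.size(); i++) {
        const cl_ulong memory = cl_devices[i].getInfo<CL_DEVICE_GLOBAL_MEM_SIZE>(); // in bytes
        if(memory>best_memory) {
            best_memory = memory;
            best_i = i;
        }
    }
    cl_device_most_memory = cl_devices[best_i];
}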

GStreamer demo doesn't work in a Virtual Machine (seeking simple example)

I am trying to code an extremely simple GStreamer app. It doesn't matter what it does, as long as GStreamer does something. Even just displaying some text or a simple JPEG would be fine.
Below is about the best example I could find by Googling (I have added a few error checks). When I run it in a Linux Virtual Machine running under Windows, I see these console messages:
libEGL warning: pci id for fd 4: 80ee:beef, driver (null)
libEGL warning: DRI2: failed to open vboxvideo (search paths
/usr/lib/i386-linux-gnu/dri:${ORIGIN}/dri:/usr/lib/dri)
Googling indicates that this is an error with 3D rendering inside a virtual machine. I can find no solution.
So, can someone fix the code below so that it will run in a VM? I assume that would mean avoiding 3D rendering, so maybe displaying an image or some text? It is not necessary to play video; this is just a simple proof of concept of using GStreamer inside something else (which has to run in a VM).
Here's the code ...
void GstreamerPlayVideo()
{
    GstElement *pipeline;
    GstBus *bus;
    GstMessage *msg;
    int argc = 0; // no command-line arguments are forwarded
    GError *error = NULL;

    /* Initialize GStreamer */
    if (gst_init_check(&argc, NULL, &error) == TRUE)
    {
        /* Build the pipeline */
        // Change URL to test failure
        pipeline = gst_parse_launch ("playbin uri=http://docs.gstreamer.com/media/sintel_trailer-480p.webm", &error);
        //// pipeline = gst_parse_launch ("playbin uri=http://tecfa.unige.ch/guides/x3d/www.web3d.org/x3d/content/examples/HelloWorld.gif", &error);
        if (pipeline != NULL)
        {
            /* Start playing */
            gst_element_set_state (pipeline, GST_STATE_PLAYING);

            /* wait until it's up and running or failed */
            if (gst_element_get_state (pipeline, NULL, NULL, -1) == GST_STATE_CHANGE_FAILURE)
            {
                g_error ("GST failed to go into PLAYING state");
                exit(1);
            }

            /* Wait until error or EOS */
            bus = gst_element_get_bus (pipeline);
            if (bus != NULL)
            {
                msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE, GST_MESSAGE_ERROR | GST_MESSAGE_EOS);

                /* Parse message */
                if (msg != NULL)
                {
                    gchar *debug_info;
                    switch (GST_MESSAGE_TYPE (msg))
                    {
                    case GST_MESSAGE_ERROR:
                        gst_message_parse_error (msg, &error, &debug_info);
                        g_printerr ("Error received from element %s: %s\n", GST_OBJECT_NAME (msg->src), error->message);
                        g_printerr ("Debugging information: %s\n", debug_info ? debug_info : "none");
                        g_clear_error (&error);
                        g_free (debug_info);
                        break;
                    case GST_MESSAGE_EOS:
                        g_print ("End-Of-Stream reached.\n");
                        break;
                    default:
                        /* We should not reach here because we only asked for ERRORs and EOS */
                        g_printerr ("Unexpected message received.\n");
                        break;
                    }
                    gst_message_unref (msg);
                }

                /* Free resources */
                gst_object_unref (bus);
                gst_element_set_state (pipeline, GST_STATE_NULL);
                gst_object_unref (pipeline);
            }
            else
            {
                g_print ("GST get bus error: %s\n", error->message);
                exit(1);
            }
        }
        else
        {
            g_print ("GST parse error: %s\n", error->message);
            exit(1);
        }
    }
    else
    {
        g_print ("GST init error: %s\n", error->message);
        exit(1);
    }
} // GstreamerPlayVideo()
Try specifying a video sink by hand in your pipeline.
videotestsrc ! ximagesink
Your system may have an EGL video sink plugin installed as the primary video plugin. ximagesink seems a little more likely to work.
Like this:
//this line is where you're creating your pipeline
pipeline = gst_parse_launch ("videotestsrc ! ximagesink", &error);
I recommend experimenting with the gst-launch command first so you can get the hang of pipeline syntax, what sinks and sources are, etc. The simplest test you can run is something like this (if you have GStreamer 1.0 installed; you may have 0.10), from the command line:
gst-launch-1.0 videotestsrc ! autovideosink
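Since the question says that showing a single JPEG would already be enough, here is a minimal sketch along the same lines that sticks to a plain X11 sink and avoids GL/EGL entirely. It assumes GStreamer 1.0 with gst-plugins-good (for jpegdec and imagefreeze), and the image path is a placeholder:

#include <gst/gst.h>

int main(int argc, char **argv)
{
    GError *error = NULL;
    gst_init(&argc, &argv);

    /* Decode one JPEG, repeat it as a video stream, and show it with ximagesink. */
    GstElement *pipeline = gst_parse_launch(
        "filesrc location=/tmp/test.jpg ! jpegdec ! imagefreeze ! videoconvert ! ximagesink",
        &error);
    if (pipeline == NULL)
    {
        g_printerr("Parse error: %s\n", error->message);
        g_clear_error(&error);
        return 1;
    }

    gst_element_set_state(pipeline, GST_STATE_PLAYING);

    /* Block until an error or end-of-stream message arrives. */
    GstBus *bus = gst_element_get_bus(pipeline);
    GstMessage *msg = gst_bus_timed_pop_filtered(bus, GST_CLOCK_TIME_NONE,
        (GstMessageType)(GST_MESSAGE_ERROR | GST_MESSAGE_EOS));
    if (msg != NULL)
        gst_message_unref(msg);

    gst_object_unref(bus);
    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(pipeline);
    return 0;
}

The same pipeline string can also be dropped into the gst_parse_launch call of the original function above.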

Minecraft Forge: My armor textures are not showing up on armor

The game works fine, it just won't load the textures.
package theDwainFilms19.SuperSmashBrosMod.item;

import scala.tools.nsc.MainClass;
import theDwainFilms19.SuperSmashBrosMod.SuperSmashBrosMod;
import net.minecraft.item.ItemArmor;
import net.minecraft.item.ItemStack;
import com.sun.xml.internal.stream.Entity;

public class ItemmarioArmor extends ItemArmor {
    public ItemmarioArmor(ArmorMaterial armourMaterial, int id, int placement) {
        super(armourMaterial, id, placement);
    }

    public String getArmorTexture(ItemStack stack, Entity entity, int slot, String type) {
        if (stack.getItem().itemID == SuperSmashBrosMod.marioHelmet.itemID || stack.getItem().itemID == SuperSmashBrosMod.marioChestplate.itemID || stack.getItem().itemID == SuperSmashBrosMod.marioBoots.itemID) {
            return SuperSmashBrosMod.MODID + ":textures/models/armor/mario_layer_1.png";
        }
        if (stack.getItem().itemID == SuperSmashBrosMod.marioLeggings.itemID) {
            return SuperSmashBrosMod.MODID + ":textures/models/armor/mario_layer_2.png";
        } else {
            return null;
        }
    }
}
I don't have a clue how to fix this, since it doesn't give an error; it just ignores the code.
Edit: here is the crash report from when I went to add a proxy:
-- System Details --
Details:
Minecraft Version: 1.7.10
Operating System: Windows 8.1 (amd64) version 6.3
Java Version: 1.8.0_31, Oracle Corporation
Java VM Version: Java HotSpot(TM) 64-Bit Server VM (mixed mode), Oracle Corporation
Memory: 687494688 bytes (655 MB) / 1037959168 bytes (989 MB) up to 1037959168 bytes (989 MB)
JVM Flags: 3 total; -Xincgc -Xmx1024M -Xms1024M
AABB Pool Size: 0 (0 bytes; 0 MB) allocated, 0 (0 bytes; 0 MB) used
IntCache: cache: 0, tcache: 0, allocated: 0, tallocated: 0
FML: MCP v9.05 FML v7.10.85.1291 Minecraft Forge 10.13.2.1291 4 mods loaded, 4 mods active
mcp{9.05} [Minecraft Coder Pack] (minecraft.jar) Unloaded->Constructed
FML{7.10.85.1291} [Forge Mod Loader] (forgeSrc-1.7.10-10.13.2.1291.jar) Unloaded->Constructed
Forge{10.13.2.1291} [Minecraft Forge] (forgeSrc-1.7.10-10.13.2.1291.jar) Unloaded->Constructed
ssb{1.0} [Super Smash Bros mod] (bin) Unloaded->Errored
[22:56:55] [Client thread/INFO] [STDOUT]: [net.minecraft.client.Minecraft:displayCrashReport:398]: ##!## Game crashed! Crash report saved to: ##!## C:\Users\David\Desktop\SuperSmashBrosMOd\eclipse.\crash-reports\crash-2015-02-07_22.56.55-client.txt
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release

Embedded: SDHC SPI write issue

I am currently working on a logger that uses an MSP430F2618 MCU and a SanDisk 4GB SDHC card.
Card initialization works as expected, and I can also read the MBR and the FAT table.
The problem is that I can't write any data to it. I have checked whether it is write-protected by the notch, but it is not. Windows 7 has no problem reading from or writing to it.
However, I used a tool called "HxD" and tried to alter some sectors (under Windows). When I try to save the content to the SD card, the tool pops up a window telling me "Access denied!".
Then I came back to my code for writing to the SD card:
uint8_t SdWriteBlock(uchar_t *blockData, const uint32_t address)
{
    uint8_t result = OP_ERROR;
    uint16_t count;
    uchar_t dataResp;
    uint8_t idx;

    for (idx = RWTIMEOUT; idx > 0; idx--)
    {
        CS_LOW();
        SdCommand(CMD24, address, 0xFF);
        dataResp = SdResponse();
        if (dataResp == 0x00)
        {
            break;
        }
        else
        {
            CS_HIGH();
            SdWrite(0xFF);
        }
    }

    if (0x00 == dataResp)
    {
        //send command success, now send data starting with DATA TOKEN = 0xFE
        SdWrite(0xFE);
        //send 512 bytes of data
        for (count = 0; count < 512; count++)
        {
            SdWrite(*blockData++);
        }
        //now send two CRC bytes; though they are not used in SPI mode,
        //they are still required by the transfer format
        SdWrite(0xFF);
        SdWrite(0xFF);
        //now read in the DATA RESPONSE TOKEN
        do
        {
            SdWrite(0xFF);
            dataResp = SdRead();
        }
        while (dataResp == 0x00);
        //following the DATA RESPONSE TOKEN are a number of BUSY bytes:
        //a zero byte indicates the SD/MMC is busy programming,
        //a non-zero byte indicates the SD/MMC is not busy
        dataResp = dataResp & 0x0F;
        if (0x05 == dataResp)
        {
            idx = RWTIMEOUT;
            do
            {
                SdWrite(0xFF);
                dataResp = SdRead();
                if (0x0 == dataResp)
                {
                    result = OP_OK;
                    break;
                }
                idx--;
            }
            while (idx != 0);
            CS_HIGH();
            SdWrite(0xFF);
        }
        else
        {
            CS_HIGH();
            SdWrite(0xFF);
        }
    }
    return result;
}
The problem seems to be where I am waiting for the card status:
do
{
    SdWrite(0xFF);
    dataResp = SdRead();
}
while (dataResp == 0x00);
Here I am waiting for a response of the form "X5" (hex), where X is undefined.
But in most cases the response is 0x00 and I never get out of the loop. In a few cases the response is 0xFF.
I can't figure out what the problem is.
Can anyone help me? Thanks!
We need to see much more of your code. Many µC SPI codebases only support SD cards <= 2 GB, so using a smaller card might work.
You might check it yourself: SDHC needs a CMD 8 and an ACMD 41 after the CMD 0 (GO_IDLE_STATE) command, otherwise you cannot read or write data to it.
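For reference, a rough sketch of that SPI-mode initialization sequence, reusing the helpers from the question (SdCommand/SdResponse/SdWrite/CS_LOW/CS_HIGH, with CMD0/CMD8/CMD55/CMD41 assumed to be defined as command indices 0, 8, 55 and 41); the retry limit is a placeholder and the R7/OCR payload bytes are omitted:

uint8_t SdInitSdhc(void)
{
    uint8_t r1 = 0xFF;
    uint16_t retry;

    for (retry = 0; retry < 10; retry++)        // >= 74 clock cycles with CS high before any command
    {
        SdWrite(0xFF);
    }

    CS_LOW();
    SdCommand(CMD0, 0x00000000, 0x95);          // GO_IDLE_STATE, valid CRC required here
    if (SdResponse() != 0x01)                   // card must report "in idle state"
    {
        CS_HIGH();
        return OP_ERROR;
    }

    SdCommand(CMD8, 0x000001AA, 0x87);          // SEND_IF_COND: 2.7-3.6 V, check pattern 0xAA
    r1 = SdResponse();                          // 0x01 expected for v2 (SDHC/SDXC) cards
    // ...the four trailing R7 bytes should be read and verified here...

    for (retry = 0; retry < 1000; retry++)      // placeholder retry limit
    {
        SdCommand(CMD55, 0x00000000, 0xFF);     // APP_CMD prefix
        SdResponse();
        SdCommand(CMD41, 0x40000000, 0xFF);     // ACMD41 with the HCS (high capacity) bit set
        r1 = SdResponse();
        if (r1 == 0x00)                         // card left idle state: initialization done
        {
            break;
        }
    }

    CS_HIGH();
    SdWrite(0xFF);                              // extra clocks after deselect
    // A CMD58 (READ_OCR) afterwards tells you whether the card is byte- or block-addressed.
    return (r1 == 0x00) ? OP_OK : OP_ERROR;
}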
Thank you for your answers, but I solved my problem. It was a problem of timing. I had to put a delay at specific points.

How to get the MAC address in WinRT (Windows 8) programmatically?

I am looking for an API in WinRT to access the MAC address.
You can't retrieve the MAC address per se, but you can retrieve hardware-specific information to identify a machine, if that's what you're trying to do.
Here's a complete MSDN article discussing the subject: Guidance on using the App Specific Hardware ID (ASHWID) to implement per-device app logic (Windows)
Be careful to use just the information you need and not the complete ID, as it might change based on information that is useless to you (such as the docking station bytes, for instance).
Here's a code sample of a computed device ID based on a few components (CPU ID, size of memory, serial number of the disk device, and BIOS):
string deviceSerial = string.Empty;

// http://msdn.microsoft.com/en-us/library/windows/apps/jj553431
Windows.System.Profile.HardwareToken hardwareToken = Windows.System.Profile.HardwareIdentification.GetPackageSpecificToken(null);
using (DataReader dataReader = DataReader.FromBuffer(hardwareToken.Id))
{
    int offset = 0;
    while (offset < hardwareToken.Id.Length)
    {
        byte[] hardwareEntry = new byte[4];
        dataReader.ReadBytes(hardwareEntry);

        // CPU ID of the processor || Size of the memory || Serial number of the disk device || BIOS
        if ((hardwareEntry[0] == 1 || hardwareEntry[0] == 2 || hardwareEntry[0] == 3 || hardwareEntry[0] == 9) && hardwareEntry[1] == 0)
        {
            if (!string.IsNullOrEmpty(deviceSerial))
            {
                deviceSerial += "|";
            }
            deviceSerial += string.Format("{0}.{1}", hardwareEntry[2], hardwareEntry[3]);
        }
        offset += 4;
    }
}
Debug.WriteLine("deviceSerial=" + deviceSerial);
There is no way to do it. The Windows Store App APIs are sandboxed and are pretty restrictive on the information that you can get about the user, mainly because of privacy concerns.