What exactly is the "presentation engine" in Vulkan jargon? - definition

According to this Khronos presentation, a "presentation engine" is:
The platform’s compositor or display engine
According to the specs:
The presentation engine is an abstraction for the platform’s compositor or display engine.
The presentation engine may be synchronous or asynchronous with respect to the application and/or logical device.
Some implementations may use the device’s graphics queue or dedicated presentation hardware to perform presentation.
Both of these sources suggest that in most cases the presentation engine is a software entity (an "abstraction") of the platform (which is itself a software layer: the OS plus the window system).
Googling "window compositor display engine" provides me with this Wikipedia result, which seems relevant: https://en.wikipedia.org/wiki/Compositing_window_manager
Is that basically the article on the "presentation engine"? For example, on Windows the presentation engine would be the Desktop Window Manager, on a GNU/Linux system it could be Compiz, and so on? Or is the "presentation engine" a combination of the compositing manager and some other stuff?

The presentation engine in Vulkan is an external component that accepts and manages the images you rendered in Vulkan, presumably for the purpose of presenting them to the user.
From another point of view, it is whatever the interface gives you: vkAcquireNextImageKHR, vkQueuePresentKHR, etc., in the case of the VK_KHR_swapchain extension. Other extensions can be made as presentation engines that operate fundamentally differently emerge (e.g. VK_KHR_display_swapchain).
VK_KHR_swapchain requires VK_KHR_surface, which is specialized to VK_KHR_win32_surface, VK_KHR_xlib_surface, etc. So you can bet those are the APIs the driver talks to underneath. I.e. it talks to the Win32 API (aka the Windows API), probably to the GDI component (but possibly to a DXGI swapchain). On Linux with VK_KHR_xlib_surface, it would talk to the X server. And so on... which inevitably has to end up in the hands of the window manager, such as the DWM or Compiz.
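To make the "it is whatever the interface gives you" view a bit more concrete, here is a toy model of image ownership passing between the application and a presentation engine. It is only a sketch and not the Vulkan API: the class ToySwapchain and its methods are hypothetical names, the FIFO behaviour is an assumption, and a real vkAcquireNextImageKHR can block or time out rather than return nothing.

```python
from collections import deque


class ToySwapchain:
    """Toy model of image ownership; hypothetical, not the Vulkan API."""

    def __init__(self, image_count):
        self.available = deque(range(image_count))  # images the engine can hand out
        self.acquired = set()    # images currently owned by the application
        self.pending = deque()   # images queued for display, oldest first
        self.displayed = None    # image currently on screen

    def acquire_next_image(self):
        """Stand-in for vkAcquireNextImageKHR: an image index, or None if none is free."""
        if not self.available:
            return None
        index = self.available.popleft()
        self.acquired.add(index)
        return index

    def queue_present(self, index):
        """Stand-in for vkQueuePresentKHR: hand an acquired image back to the engine."""
        if index not in self.acquired:
            raise ValueError("image %d was not acquired" % index)
        self.acquired.remove(index)
        self.pending.append(index)

    def vblank(self):
        """Engine side: show the next queued image and release the previous one."""
        if not self.pending:
            return
        if self.displayed is not None:
            self.available.append(self.displayed)
        self.displayed = self.pending.popleft()
```

The point of the model is that the application only ever sees acquire and present; when and how the engine actually shows an image (the vblank step here) is outside its control.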

UiPath unattended automation

I was just curious how a UiPath process renders a GUI so it can interact with various applications in unattended mode, without a screen. I am trying to build my own RPA system for a few specific use cases, but I am stuck at running those processes unattended, because interacting with an application (clicking, etc.) requires the GUI to be rendered.
Thanks
According to this article (simplified a little), they either use the console session (which is a well-known solution / workaround) or they create RDP sessions programmatically using the FreeRDP framework. (I have tried my luck with FreeRDP, but most of its features are disabled in corporate environments.)
If you really want to dig into the whole thing, Microsoft provides a framework for implementing your own remoting solutions. Theoretically you could implement your own protocol with lower security boundaries that does not destroy the GUI when the remote session is not active (disconnected but not closed).
It's based on the coordinates of the controls and the text they contain. It recognizes graphical objects by their platform-specific attributes. In very particular scenarios, where object recognition is not available such as with RDP, it uses image and OCR text-based automation.

What differs different Vulkan loaders from each other?

First I wonder about some minor details to see if I understand some concepts properly:
Is vulkan-1.dll (or libvulkan.so.1 on Linux) what is referred to as the loader?
When I use HMODULE vulkan_module = LoadLibrary( "vulkan-1.dll" );, is this using the loader from the graphics driver (provided that the previous detail is true)?
Now to the actual question. It seems that the loader is responsible for pulling drivers together so that they appear as one "unit" of sorts, as well as collecting the available extensions and validation layers. What, then, distinguishes the LunarG loader (for example) from those provided by graphics drivers? Why would one want to use one over the other?
Vulkan drivers do not contain anything that would reasonably be called a "loader". They are "providers".
The purpose of a "loader" is to load what the "providers" provide. The most basic thing a loader does is find the implementations' DLLs and interact with them. This differs based on the platform. With Windows, they probably use registry settings to hunt down the implementation DLLs. On Android, their built-in support probably centralizes things. And so forth.
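As a rough illustration of that "find the implementations" step: on Linux the loader discovers drivers through small JSON manifest files whose "ICD" object names the driver library in "library_path". The sketch below only mimics that lookup. The function names are hypothetical, the directory list is an assumed and partial subset of what the real loader searches, and the Windows registry route is not covered.

```python
import json
import os

# Assumed, partial list of Linux manifest directories; the real loader checks more.
DEFAULT_ICD_DIRS = ["/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d"]


def parse_icd_manifest(text):
    """Return the driver library path named by one ICD manifest (JSON text)."""
    manifest = json.loads(text)
    return manifest["ICD"]["library_path"]


def find_driver_libraries(directories=DEFAULT_ICD_DIRS):
    """Collect driver library paths from every *.json manifest in the given directories."""
    libraries = []
    for directory in directories:
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if not name.endswith(".json"):
                continue
            path = os.path.join(directory, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    libraries.append(parse_icd_manifest(handle.read()))
            except (OSError, ValueError, KeyError, TypeError):
                continue  # skip unreadable or malformed manifests
    return libraries
```

Calling find_driver_libraries() on a machine with Vulkan drivers installed should list the driver libraries the manifests point at; on a machine without any, it returns an empty list.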
The only commonly used loader is LunarG's SDK loader (which does use the filename vulkan-1). Some have written their own, but LunarG's is the only one with widespread usage.
"the loader" or "official loader" or "Khronos loader" or "LunarG loader" or "VulkanRT" are AFAIK the same. It's from the project KhronosGroup/Vulkan-LoaderAndValidationLayers.
What differs (between those provided by Khronos, the LunarG SDK, and the drivers) is usually only the version. (Typically the LunarG SDK lags behind Khronos, and the driver lags behind both.)
More than you ever wanted to know about its inner workings is in the loader documentation.
Run-time dynamic linking as you propose should be possible (you would call LoadLibrary(), then GetProcAddress() for the vkGetInstanceProcAddr() command, and then get the rest from it).
(On Windows) I think most people use the convenient DLL import library vulkan-1.lib from the LunarG SDK with whatever vulkan-1.dll is in System32.
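The run-time dynamic linking route mentioned above can be sketched with Python's ctypes standing in for LoadLibrary()/GetProcAddress(). This is a minimal sketch, not a recommended setup: the helper names are hypothetical, only the two library file names from the question are tried, and it returns None when no loader is installed.

```python
import ctypes
import sys


def loader_library_name(platform=sys.platform):
    """File name of the Vulkan loader, using the two names from the question."""
    return "vulkan-1.dll" if platform.startswith("win") else "libvulkan.so.1"


def load_vulkan_entry_point():
    """Return vkGetInstanceProcAddr as a ctypes function, or None if unavailable."""
    # Only evaluate ctypes.WinDLL on Windows, where it exists.
    library_type = ctypes.WinDLL if sys.platform.startswith("win") else ctypes.CDLL
    try:
        library = library_type(loader_library_name())
    except OSError:
        return None  # no loader installed on this machine
    try:
        entry = library.vkGetInstanceProcAddr
    except AttributeError:
        return None  # library found, but the symbol is missing
    # PFN_vkVoidFunction vkGetInstanceProcAddr(VkInstance instance, const char* pName)
    entry.restype = ctypes.c_void_p
    entry.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    return entry
```

With the entry point in hand, something like load_vulkan_entry_point()(None, b"vkCreateInstance") would fetch the address of the next command, which is the "get the rest from it" step.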

Beginner in VB.net, lost with DLLs, DirectShow, AVIcap32.dll, and etc For Image Processing

Ultimate goal would be using VB.net to interface with webcam and do image processing.
Currently I'm just using Matlab, but it is insanely slow. Since I'm going into the area of image processing, coupled with object recognition, which path should I go down? By that I mean: is it GDI+? DirectX? Or some other API? Which API supports manipulating and analyzing graphical input data? That would let me dig deeper and create a standalone program just for my own interest/project.
Before going deep into digital image processing with VB.Net, I strongly suggest that you take your time to learn the basics first, after that moving on to the next step which is dealing with the APIs you mentioned.
However, to answer your question, an API (Application Programming Interface) is a set of programming instructions and standards that lets your application communicate with other applications.
Basically, it allows two different pieces of software to speak to one another through a common interface.
As for the DLL (Dynamic link library) files, they are a set of executable functions or data that can be used by a Windows application.
Or as I quote from Wikipedia:
Dynamic-link library (also written unhyphenated), or DLL, is Microsoft's implementation of the shared library concept in the Microsoft Windows and OS/2 operating systems. These libraries usually have the file extension DLL, OCX (for libraries containing ActiveX controls), or DRV (for legacy system drivers). The file formats for DLLs are the same as for Windows EXE files — that is, Portable Executable (PE) for 32-bit and 64-bit Windows, and New Executable (NE) for 16-bit Windows. As with EXEs, DLLs can contain code, data, and resources, in any combination.
Basically, you shouldn't go too deep at the beginning. I strongly encourage you to start by learning the language itself and then proceed step by step until you master it.
I would like to say something to you IvanWong....Welcome to the world of programming, Fun and challenges!!!

why don't more programming languages have builtin interfaces to the window manager?

Programming is at heart about automating tasks on a computer.
Presumably those tasks would normally be done manually by a human.
Humans use the computer through the keyboard, mouse, and interaction with the console or the window manager. But very few languages have built in functions that provide an interface to these basic computing objects.
A notable exception is AutoHotkey, an open source language on Windows, which provides built-in functions for the following simple tasks:
* Get Pixel Information
* Get mouse position
* Keyboard macros
* Simulate key strokes
* Simulate mouse click
* Window management
See examples on rosettacode.
There have been various attempts on Linux, many of which were stopped without explanation.
One is the inactive tcl library: android. Search google code for android, lang:tcl
I write web server code. No human being interacts with the code. It's simply a lot of complex plug-ins to Apache.
"Humans use the computer through the keyboard, mouse, and interaction with the console or the window manager. "
This is completely false in my case. The "user" sends requests through HTTP. No keyboard, no mouse, no console, no window manager.
The user may be using some kind of fancy GUI, but it doesn't matter to me or my software. All I see are HTTP GET and POST requests. Pure text.
"But very few languages have built in functions that provide an interface to these basic computing objects."
Correct. I have no use for keyboard, mouse, console or window manager.
All personal computing platforms have libraries that will do this.
The problem is that that would require standardizing user interactions over all systems. Java tried this, without a great deal of success. There have been other libraries with more or less success, Qt probably being the most promising one to date.
It's certainly possible to write a language for a single platform that will include all the UI fundamentals. It's also possible to fake it with a GUI and a library. However, there's good reason to want a language that is usable on any major platform, whether or not there's a GUI.
I doubt the premise is true. Java can do all that, except maybe "window management" since I do not know what is meant by this.
I'd be surprised if you can't do it with C#.
If there are many languages that can't do this, I'd guess it is because it is difficult to do it without tying the language to the operating system.
First of all, I think you're asking why the programming languages' standard libraries don't have built-in interfaces to the window manager. The language itself and its libraries are quite distinct.
One big reason is portability. If there are too many platform-specific functions in a programming language's libraries, it will be harder to port it to other systems. For example, I/O, math functions, strings, and various data structures and related algorithms are all generic and can be made to work on virtually any computer.
But things like the window manager, the GUI, etc. are much more specific to certain platforms, which is why they are not included in the standard libraries. This is what makes C/C++ so portable.
Tasks performed by computers without any human interface device interaction outnumber those directly actuated by a human by an enormous factor.
Programming languages try (or at least are currently trying) to be independent of the platform. For example, in .NET you have to reference some Win32 APIs to do some of the things you listed above. If they were built into the core programming model, .NET would become too coupled to the OS, and creating its Mono counterpart would be too tedious.
Regarding keystrokes, macros and so on, the simplest way I'm doing it right now is through VBScript or PowerShell :)
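The same pattern shows up outside .NET. As a small sketch (in Python rather than .NET, purely for illustration), "get mouse position" is not part of the core language, but it is reachable by calling the Win32 function GetCursorPos through the foreign-function library. The helper name get_mouse_position is hypothetical, and on anything other than Windows it simply returns None, which is exactly the portability problem being described.

```python
import ctypes
import sys


class POINT(ctypes.Structure):
    """Mirror of the Win32 POINT structure (two LONG fields)."""
    _fields_ = [("x", ctypes.c_long), ("y", ctypes.c_long)]


def get_mouse_position():
    """Return (x, y) of the cursor on Windows, or None where the API is unavailable."""
    if not sys.platform.startswith("win"):
        return None  # no portable equivalent in the standard library
    point = POINT()
    if not ctypes.windll.user32.GetCursorPos(ctypes.byref(point)):
        return None  # the call failed
    return (point.x, point.y)
```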

How do I get input from an XBox 360 controller?

I'm writing a program that needs to take input from an XBox 360 controller. The input will then be sent wirelessly to an RC Helicopter that I am building.
So far, I've learned that this can be done using either the XInput library from DirectX, or the Input framework in XNA.
I'm wondering if there are any other options available. The scope of my program is rather small, and having to install a large gaming library like DirectX or XNA seems excessive. Further, I'd like the program to be cross-platform and not Microsoft-specific.
Is there a simple lightweight way I can grab the controller input with something like Python?
Edit to answer some comments:
The copter will have 6 total propellers, arranged in 3 co-axial pairs. Basically, it will be very similar to this, only it will cost about $1,000 rather than $15,000. It will use an Arduino for onboard processing, and Zigbee for wireless control.
The 360 controller was selected because it is well designed. It is very ergonomic and has all of the control inputs needed. For those familiar with helicopter controls, the left joystick will control the collective, the right joystick will control the pitch and roll, and the analog triggers will control the yaw. The analog triggers are a big feature of the 360 controller. PS and most others do not have them.
I have a webpage for the project, but it is still pretty sparse. I do plan on documenting the whole design though, so eventually it will be interesting.
http://tricopter.googlecode.com
On a side note, would it kill Google to have a blog feature for googlecode projects?
I would like the 360 controller input program to run on both Linux and Windows if possible. Eventually, though, I'd like to hook the controller directly to an embedded microcontroller board (such as an Arduino) so that I don't have to go through a computer, but it's not a high priority at the moment.
It is not all that difficult. As the earlier guy mentioned, you can use the SDL libraries to read the status of the xbox controller and then you can do whatever you'd like with it.
There is a SDL tutorial: http://sdl.beuc.net/sdl.wiki/Handling_Joysticks which is fairly useful.
Note that an Xbox controller has the following:
* two joysticks and two triggers, reported as six axes:
  * left joystick is axis 0 & 1;
  * left trigger is axis 2;
  * right joystick is axis 3 & 4;
  * right trigger is axis 5
* one hat (the D-pad)
* 11 SDL buttons
  * two of them are joystick center presses
* two triggers (act as axes, see above)
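Given that axis layout, the mapping onto the helicopter controls described in the question (left stick for collective, right stick for pitch and roll, triggers for yaw) might look roughly like this. It is only a sketch under several assumptions that are not in the answer: the second axis of each stick pair is taken to be the vertical one, triggers are assumed to rest at -1 and read +1 when fully pressed, and the dead zone value is arbitrary. Axis numbering also varies between drivers and library versions, so check it on your own machine.

```python
DEAD_ZONE = 0.1  # assumed threshold below which stick noise is ignored


def apply_dead_zone(value, dead_zone=DEAD_ZONE):
    """Treat small deflections around the stick centre as zero."""
    return 0.0 if abs(value) < dead_zone else value


def axes_to_controls(axes):
    """Map six axis values in [-1, 1], indexed as in the list above, to controls."""
    left_trigger = (axes[2] + 1.0) / 2.0    # rescale -1..1 to 0..1
    right_trigger = (axes[5] + 1.0) / 2.0
    return {
        "collective": apply_dead_zone(axes[1]),  # left stick, assumed vertical axis
        "roll": apply_dead_zone(axes[3]),        # right stick, assumed horizontal axis
        "pitch": apply_dead_zone(axes[4]),       # right stick, assumed vertical axis
        "yaw": right_trigger - left_trigger,     # -1 (full left) .. +1 (full right)
    }
```

The resulting dictionary is what you would then encode and send over the wireless link; that part depends entirely on your own protocol.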
The upcoming SDL v1.3 also will support force feedback (aka. haptic).
I assume, since this thread is several years old, you have already done something, so this post is primarily to inform future visitors.
PyGame can read joysticks, which is what the X360 controller shows up as on a PC.
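A minimal sketch of what that could look like, assuming pygame is installed and the controller is the first joystick the system reports. The helper names snapshot and read_first_joystick are hypothetical; the pygame calls themselves (pygame.joystick.Joystick, get_axis, get_button, get_hat, pygame.event.pump) are part of its joystick and event modules. pygame is imported inside the function so the rest of the code can be used without it.

```python
def snapshot(joystick):
    """Read every axis, button and hat from a pygame-style joystick object."""
    return {
        "axes": [joystick.get_axis(i) for i in range(joystick.get_numaxes())],
        "buttons": [joystick.get_button(i) for i in range(joystick.get_numbuttons())],
        "hats": [joystick.get_hat(i) for i in range(joystick.get_numhats())],
    }


def read_first_joystick():
    """Return one snapshot of the first joystick, or None if none is connected."""
    import pygame  # third-party dependency, imported lazily

    pygame.init()
    pygame.joystick.init()
    if pygame.joystick.get_count() == 0:
        return None
    joystick = pygame.joystick.Joystick(0)
    joystick.init()
    pygame.event.pump()  # let pygame refresh its internal device state
    return snapshot(joystick)
```

In a real control loop you would call pygame.event.pump() and snapshot() repeatedly, at whatever rate the radio link can take.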
Well, if you really don't want to add a dependency on DirectX, you can use the old Windows Joystick API -- Windows Multimedia -> Joystick Reference in the platform SDK.
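For the curious, that API can be reached without any extra install. The sketch below calls joyGetPosEx from winmm.dll through Python's ctypes; the structure mirrors the JOYINFOEX layout from the Windows SDK headers, and read_joystick is a hypothetical helper name. It is Windows-only by nature, so it returns None elsewhere, and it reports raw positions rather than normalised values.

```python
import ctypes
import sys

DWORD = ctypes.c_uint32
JOY_RETURNALL = 0xFF   # request all position, POV and button fields
JOYERR_NOERROR = 0


class JOYINFOEX(ctypes.Structure):
    """Mirror of the Windows multimedia JOYINFOEX structure (13 DWORD fields)."""
    _fields_ = [(name, DWORD) for name in (
        "dwSize", "dwFlags", "dwXpos", "dwYpos", "dwZpos", "dwRpos",
        "dwUpos", "dwVpos", "dwButtons", "dwButtonNumber", "dwPOV",
        "dwReserved1", "dwReserved2")]


def read_joystick(joystick_id=0):
    """Return raw positions and the button mask, or None if nothing can be read."""
    if not sys.platform.startswith("win"):
        return None  # winmm.dll only exists on Windows
    info = JOYINFOEX()
    info.dwSize = ctypes.sizeof(JOYINFOEX)
    info.dwFlags = JOY_RETURNALL
    result = ctypes.windll.winmm.joyGetPosEx(joystick_id, ctypes.byref(info))
    if result != JOYERR_NOERROR:
        return None  # e.g. no joystick with that id is plugged in
    return {"x": info.dwXpos, "y": info.dwYpos, "z": info.dwZpos,
            "r": info.dwRpos, "buttons": info.dwButtons, "pov": info.dwPOV}
```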
The standard free cross-platform game library is Simple DirectMedia Layer, originally written to port Windows games to Unix (Linux) systems. It's a very basic, lightweight API that tends to support the minimal subset of features on each system, and it has bindings for most major languages. It has very basic joystick and gamepad support (no force feedback, for example), but it might be sufficient for your needs.
Perhaps the Mono.Xna library has added GamePad support, which would provide the cross platform functionality you were looking for:
http://code.google.com/p/monoxna/
As far as the concerns about the library being too heavy weight, sure, for this option it may be true ... however, it could open up opportunities to do some nice visualization in the future.
disclaimer: I'm not familiar with the status of the mono xna project, so it may not have added this feature yet. But still, 'tis an option :-)