I have a project where the source device has an SVideo and a Composite connector available for capture. In DirectShow, I can use IAMCrossbar to set which one to capture from, but in MediaFoundation I only get a single video stream and a C00D3704 status when I try to start streaming (using a SourceReader). Is there any way to select the input in MediaFoundation?
NB: LEADTOOLS claims to be able to do this, but I don't know how. Nothing else I've found says how to do it.
Pointers to the correct interface and/or attributes would be enough...
The answer depends on the specific capture card, but it is nevertheless pretty simple. Some capture cards (like a dual-head Datapath card) will appear as two separate devices, one per head, for each card in the system. Therefore, you activate them separately, following the enumeration (error checking omitted for brevity):
UINT32 deviceCount = 0;
IMFActivate** devices = nullptr;
Microsoft::WRL::ComPtr<IMFAttributes> attributes = nullptr;

HRESULT hr = ::MFCreateAttributes(attributes.GetAddressOf(), 1);
hr = attributes->SetGUID(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
                         MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID);
hr = ::MFEnumDeviceSources(attributes.Get(), &devices, &deviceCount);
And then activation of the device using GetMediaFoundationActivator and the member function ActivateObject.
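For instance, a minimal sketch, continuing from the snippet above (what you do with the source afterwards is up to you):

// Activate the first enumerated device as an IMFMediaSource.
IMFMediaSource* source = nullptr;
hr = devices[0]->ActivateObject(IID_PPV_ARGS(&source));

// ... create an IMFSourceReader from 'source' and capture ...

// MFEnumDeviceSources allocates the activate array with CoTaskMemAlloc,
// so release each element and free the array when done.
for (UINT32 i = 0; i < deviceCount; ++i)
    devices[i]->Release();
CoTaskMemFree(devices);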
This makes sense for a card like the one referenced above since it has separate hardware on the card for each input. And you can concurrently activate each as a result.
However, it is possible for the driver to report your SVideo and Composite as one device, since it will likely be using the same hardware. In that case, you will find the separate stream types on a single IMFSourceReader.
IMFMediaType* mediaType = nullptr;
DWORD index = 0;
HRESULT hr = S_OK;
while (hr == S_OK)
{
    hr = reader->GetNativeMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, index, &mediaType);
    if (hr == MF_E_NO_MORE_TYPES)
        break;
    // ... [ process media type ]
    mediaType->Release();
    mediaType = nullptr;
    ++index;
}
In this case, you set the stream selection (IMFSourceReader::SetStreamSelection). I go into some detail on that topic here.
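A minimal sketch, assuming reader is a valid IMFSourceReader and streamIndex is the stream you identified with GetNativeMediaType above:

// Deselect all streams, then select just the input you want to read from.
hr = reader->SetStreamSelection((DWORD)MF_SOURCE_READER_ALL_STREAMS, FALSE);
hr = reader->SetStreamSelection(streamIndex, TRUE);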
If you are intending to concurrently capture audio, you will have to build an aggregate source, which I wrote a bit about here.
Assuming that your capture card has fairly recent drivers, I am certain that you will locate and read from your available streams without much trouble. Good luck.
I have two compute shaders, and the first one modifies a DispatchIndirectCommand buffer that is later used by the second one.
// This is the code of the first shader
struct DispatchIndirectCommand{
    uint x;
    uint y;
    uint z;
};

restrict layout(std430, set = 0, binding = 5) buffer Indirect{
    DispatchIndirectCommand dispatch_indirect[1];
};

void main(){
    // some stuff
    dispatch_indirect[0].x = number_of_groups; // I used debugPrintfEXT to make sure that this number is correct
}
I execute them as follows:
vkCmdBindPipeline(cmd_buffer, VK_PIPELINE_BIND_POINT_COMPUTE, first_shader);
vkCmdDispatch(cmd_buffer, x, 1, 1);
vkCmdBindPipeline(cmd_buffer, VK_PIPELINE_BIND_POINT_COMPUTE, second_shader);
vkCmdDispatchIndirect(cmd_buffer, indirect_buffer, 0);
The problem is that the changes made by the first shader are not reflected by the second one.
// This is the code of the second shader
void main(){
    debugPrintfEXT("%d", gl_GlobalInvocationID.x); // this seems to never be called
}
I initialise the indirect_buffer with VkDispatchIndirectCommand{.x=0,.y=1,.z=1}, and it seems that the second shader always executes with x==0, because debugPrintfEXT never prints anything. I tried adding a memory barrier like this:
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = NULL;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
barrier.srcQueueFamilyIndex = queue_idx;
barrier.dstQueueFamilyIndex = queue_idx;
barrier.buffer = indirect_buffer;
barrier.offset = 0;
barrier.size = sizeof_indirect_buffer;
However, this does not seem to make any difference. What does seem to work is using
barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_INDIRECT_COMMAND_READ_BIT;
When I use those access flags, all the compute shaders work properly. However, I get a validation error:
Validation Error: [ VUID-vkCmdPipelineBarrier-dstAccessMask-02816 ] Object 0: handle = 0x5561b60356c8, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x69c8467d | vkCmdPipelineBarrier(): .pMemoryBarriers[1].dstAccessMask bit VK_ACCESS_INDIRECT_COMMAND_READ_BIT is not supported by stage mask (VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT). The Vulkan spec states: The dstAccessMask member of each element of pMemoryBarriers must only include access flags that are supported by one or more of the pipeline stages in dstStageMask, as specified in the table of supported access types (https://vulkan.lunarg.com/doc/view/1.2.182.0/linux/1.2-extensions/vkspec.html#VUID-vkCmdPipelineBarrier-dstAccessMask-02816)
Vulkan's documentation states that
VK_ACCESS_INDIRECT_COMMAND_READ_BIT specifies read access to indirect command data read as part of an indirect build, trace, drawing or dispatching command. Such access occurs in the VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT pipeline stage
It looks really confusing. It clearly does mention "dispatching command", but at the same time it says that the stage must be VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT and not VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT. Is the specification contradictory/imprecise or am I missing something?
You seem to be employing a trial-and-error strategy. Do not use that strategy with low-level APIs, and preferably not in computer engineering in general. Sooner or later you will encounter something that looks like it works when you test it, but is invalid code anyway. The spec told you the exact thing to do, so you never had a legitimate reason to try any other flags, or no barrier at all.
As you discovered, the appropriate access flag for indirect read is VK_ACCESS_INDIRECT_COMMAND_READ_BIT. And as the spec also says, the appropriate stage is VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT.
So the barrier for your case should probably be (a full call is sketched below):
srcStageMask = VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT
srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
dstStageMask = VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT;
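As a minimal sketch, reusing the names from the question (cmd_buffer, indirect_buffer, first_shader, second_shader), the recording could look like this:

// Barrier between the writing dispatch and the indirect dispatch.
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = NULL;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED; // no ownership transfer
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = indirect_buffer;
barrier.offset = 0;
barrier.size = VK_WHOLE_SIZE;

vkCmdBindPipeline(cmd_buffer, VK_PIPELINE_BIND_POINT_COMPUTE, first_shader);
vkCmdDispatch(cmd_buffer, x, 1, 1);

// Compute-shader write -> indirect-command read: dstStageMask is DRAW_INDIRECT.
vkCmdPipelineBarrier(cmd_buffer,
                     VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, // srcStageMask
                     VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT,  // dstStageMask
                     0, 0, NULL, 1, &barrier, 0, NULL);

vkCmdBindPipeline(cmd_buffer, VK_PIPELINE_BIND_POINT_COMPUTE, second_shader);
vkCmdDispatchIndirect(cmd_buffer, indirect_buffer, 0);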
The name of the stage flag is slightly confusing, but the spec is otherwise very clear that it applies to compute:
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT specifies the stage of the pipeline where VkDrawIndirect* / VkDispatchIndirect* / VkTraceRaysIndirect* data structures are consumed.
and also:
For the compute pipeline, the following stages occur in this order:
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT
PS: Also relevant: GitHub issue KhronosGroup/Vulkan-Docs#176.
I have a GSM module hooked up to a PIC18F87J11 and they communicate just fine. I can send an AT command from the microcontroller and read the response back. However, I have to know how many characters are in the response so I can have the PIC wait for that many characters. But if an error occurs, the response length might change. What is the best way to handle such a scenario?
For Example:
AT+CMGF=1
This will result in the following response:
\r\nOK\r\n
So I have to tell the PIC to wait for 6 characters. However, if the response is an error message, it would be something like this:
\r\nERROR\r\n
And if I already told the PIC to wait for only 6 characters, it will miss the rest of the characters; as a result, they might appear the next time I tell the PIC to read the response to a new AT command.
What is the best way to find the end of the line automatically and handle any error messages?
Thanks!
In a single line
There is no single best way, only trade-offs.
In detail
The problem can be divided in two related subproblems.
1. Receiving messages of arbitrary finite length
The trade-offs:
available memory vs implementation complexity;
bandwidth overhead vs implementation complexity.
In the simplest case, the amount of available RAM is not restricted. We just use a buffer wide enough to hold the longest possible message and keep receiving the messages bytewise. Then, we have to determine somehow that a complete message has been received and can be passed to further processing. That essentially means analyzing the received data.
2. Parsing the received messages
Analyzing the data in search of its syntactic structure is parsing by definition. And that is where the subtasks are related. Parsing in general is a very complex topic; dealing with it is expensive, both computationally and in engineering effort. It's often possible to reduce the costs if we limit the genericity of the data: the simpler the data structure, the easier it is to parse. That limitation is called a "transport layer protocol".
Thus, we have to read the data to parse it, and parse the data to read it. This kind of interlocked problem is generally solved with coroutines.
In your case we have to deal with the AT protocol. It is old and it is human-oriented by design. That's bad news, because parsing it correctly can be challenging despite how simple it sometimes looks. It has some terribly inconvenient features, such as the '+++' escape timing!
Things become worse when you're short of memory. In such a situation we can't defer parsing until the end of the message, because it very well might not even fit in the available RAM -- we have to parse it chunkwise.
...And we are not even close to opening TCP connections or making calls! You'll meet some unexpected troubles there as well, such as the dreaded "unsolicited result codes". The matter is wide enough for a whole book. Please have a look at least here:
http://en.wikibooks.org/wiki/Serial_Programming/Modems_and_AT_Commands. The wikibook discloses many more problems with the Hayes protocol, and describes some approaches to solve them.
Let's break the problem down into some layers of abstraction.
At the top layer is your application. The application layer deals with the response message as a whole and understands the meaning of a message. It shouldn't be mired down with details such as how many characters it should expect to receive.
The next layer is responsible for framing a message from the stream of characters. Framing means extracting the message from the stream by identifying its beginning and end.
The bottom layer is responsible for reading individual characters from the port.
Your application could call a function such as GetResponse(), which implements the framing layer. And GetResponse() could call GetChar(), which implements the bottom layer. It sounds like you've got the bottom layer under control and your question is about the framing layer.
A good pattern for framing a stream of characters into a message is to use a state machine. In your case the state machine includes states such as BEGIN_DELIM, MESSAGE_BODY, and END_DELIM. For more complex serial protocols other states might include MESSAGE_HEADER and MESSAGE_CHECKSUM, for example.
Here is some very basic code to give you an idea of how to implement the state machine in GetResponse(). You should add various types of error checking to prevent a buffer overflow and to handle dropped characters and such.
#include <stdbool.h>

// States for the framing state machine.
enum { BEGIN_DELIM1, BEGIN_DELIM2, MESSAGE_BODY, END_DELIM };

void GetResponse(char *message_buffer)
{
    unsigned int state = BEGIN_DELIM1;
    bool is_message_complete = false;
    while (!is_message_complete)
    {
        char c = GetChar();
        switch (state)
        {
        case BEGIN_DELIM1:
            if (c == '\r')
                state = BEGIN_DELIM2;
            break;
        case BEGIN_DELIM2:
            if (c == '\n')
                state = MESSAGE_BODY;
            break;
        case MESSAGE_BODY:
            if (c == '\r')
                state = END_DELIM;
            else
                *message_buffer++ = c;
            break;
        case END_DELIM:
            if (c == '\n')
                is_message_complete = true;
            break;
        }
    }
    *message_buffer = '\0'; // terminate the message body as a C string
}
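Hypothetical usage, assuming GetChar() blocks until the next character arrives from the UART:

#include <string.h>

char response[32];
GetResponse(response);
if (strcmp(response, "OK") == 0) {
    // command succeeded
} else if (strcmp(response, "ERROR") == 0) {
    // command failed
}

Because "OK" and "ERROR" are framed by the same \r\n ... \r\n delimiters, the caller no longer needs to know the response length in advance.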
I'm trying to use the "top" command in Mac OS X to determine which apps are using resources.
When I do:
top -stats "pid,command"
the command column is truncated if the process name is too long.
If you look at Activity Monitor, the process name is shown properly (the full name, plus an icon). My questions are:
how to get the full process name?
Sometimes the app icon shows next to the process name; is there any way to do a similar thing using Objective-C? Should I simply navigate to the app's Contents folder and grab the .icns image?
First, if you're trying to get the data programmatically, driving top is almost definitely not what you want to do.
But, to answer your direct questions:
how to get the full process name?
There is no way to control the truncation of commands. You can use the -ncols parameter to set the width of the output for non-interactive output, but that doesn't stop top from truncating if it wants to.
Sometimes the app icon shows next to the process name; is there any way to do a similar thing using Objective-C? Should I simply navigate to the app's Contents folder and grab the .icns image?
No. How would you deal with apps that have multiple .icns files, e.g., for document icons? (Try it with iTunes, for example. If you pick the first .icns, you get the AIFF document icon; if you pick the last, you get the internal-use recent TV shows icon.)
The right way to do it is to get the NSBundle for the application, then do something like this:
NSString *iconFile = [bundle objectForInfoDictionaryKey:@"CFBundleIconFile"];
if (iconFile) {
    NSString *iconPath = [bundle pathForResource:iconFile ofType:@"icns"];
    NSImage *icon = [[NSImage alloc] initWithContentsOfFile:iconPath];
    // display 'icon' however you like
}
So, how do you actually want to do this, if not by driving top?
Well, what you're asking for is actually not a well-defined thing. OS X has four different notions of task/process/program/application that don't correspond 1-to-1, and that makes life difficult if you want to write a mashup of two programs that use different notions—e.g., top deals in BSD processes, while Activity Monitor deals in OS X applications.
If what you actually want is the same list top uses, it's open source, so you can read it and do the same thing it does.
But the simplest way to get the list of BSD processes is probably the interfaces in libproc.h, in particular proc_listallpids and proc_pidinfo. For example:
#include <errno.h>
#include <libproc.h>
#include <stdio.h>
#include <string.h>
#include <sys/param.h>

int dump_proc_names(void) {
    int buf[16384];
    int count = proc_listallpids(buf, sizeof(buf));
    for (int i = 0; i != count; ++i) {
        int pid = buf[i];
        char path[MAXPATHLEN + 1] = {0};
        int ret = proc_pidinfo(pid, PROC_PIDPATHINFO, 0,
                               path, sizeof(path));
        if (ret <= 0) {
            printf("%d: error %s (%d)\n", pid, strerror(errno), errno);
        } else {
            printf("%d: %s\n", pid, path);
        }
    }
    return 0;
}
Obviously in real code you're going to want to allocate the buffer dynamically, return the values instead of just dumping them, get more than just the paths, etc. But this is enough to give you the basic idea. (When you go to get additional information, be aware that if you ask for any struct, you will get an EPERM error unless you have rights to see every member of that struct. So, don't go asking for PROC_PIDTASKALLINFO if you only want PROC_PIDT_SHORTBSDINFO.)
Anyway, since this API deals with BSD processes (and Mach tasks), not applications, it won't directly help you get at the NSBundle you need in order to provide Activity Monitor-style features.
There is no way to do this that's entirely correct, but you can probably get away with something like this:
NSString *path = processPath;
while (path && ![path isEqualToString:@"/"]) {
    NSBundle *bundle = [NSBundle bundleWithPath:path];
    if (bundle) {
        if (![[bundle executablePath] isEqualToString:processPath]) return nil;
        return bundle;
    }
    path = [path stringByDeletingLastPathComponent];
}
return nil;
There are probably alternative ways to do this, each with different tradeoffs. For example, using -[NSWorkspace runningApplications], storing the results in a dictionary mapping the bundle executable path to the bundle, and using that to look up each process is simple, but it only seems to be useful for applications owned by the current user (and probably in the current session). On the other hand, enumerating all bundles on the system, or asking Spotlight, or similar would probably be too slow to do on the fly, but would go out of date if you cached them on first run.
Another option, in place of libproc, is to use libtop.
Unfortunately, Apple doesn't provide it. They do have a libtop implementation, which they use for their top tool, but it's actually embedded in the source to top and not available from outside. You can find the source (at the link above) and embed it into your program the same way top itself does.
Alternatively, both GNU and BSD process utilities have Mac ports (although knowing which name to use with Homebrew/MacPorts/Google search isn't always easy…), so you could build one of those and use it.
However, unless you're trying to write cross-platform software (or already know how to write this code for linux or FreeBSD or whatever), I think that just adds extra complexity.
I've just spent most of today trying to find some sort of function to generate keys for known images, for later comparison to determine what the image is. I have attempted to use SIFT and SURF descriptors, both of which are too slow (and patented for commercial use). My latest attempt was creating a dct hash using:
#include <stdint.h>
#include <opencv2/core/core_c.h>
#include <opencv2/imgproc/imgproc_c.h>
#include <opencv2/highgui/highgui_c.h>

int mm_dct_imagehash(const char* file, float sigma, uint64_t *hash){
    if (!file) return -1;
    if (!hash) return -2;
    *hash = 0;

    IplImage *img = cvLoadImage(file, CV_LOAD_IMAGE_GRAYSCALE);
    if (!img) return -3;
    cvSmooth(img, img, CV_GAUSSIAN, 7, 7, sigma, sigma);

    IplImage *img_resized = cvCreateImage(cvSize(32,32), img->depth, img->nChannels);
    if (!img_resized) return -4;
    cvResize(img, img_resized, CV_INTER_CUBIC);

    IplImage *img_prime = cvCreateImage(cvSize(32,32), IPL_DEPTH_32F, img->nChannels);
    if (!img_prime) return -5;
    cvConvertScale(img_resized, img_prime, 1, 0);

    IplImage *dct_img = cvCreateImage(cvSize(32,32), IPL_DEPTH_32F, img->nChannels);
    if (!dct_img) return -6;
    cvDCT(img_prime, dct_img, CV_DXT_FORWARD);

    // Use the low-frequency 8x8 block, skipping the DC term at (0,0).
    cvSetImageROI(dct_img, cvRect(1,1,8,8));
    double minval, maxval;
    cvMinMaxLoc(dct_img, &minval, &maxval, NULL, NULL, NULL);
    double medval = (maxval + minval)/2;

    int i, j;
    for (i = 1; i <= 8; i++){
        const float *row = (const float*)(dct_img->imageData + i*dct_img->widthStep);
        for (j = 1; j <= 8; j++){
            (*hash) <<= 1;       // shift first, so no bit is pushed out
            if (row[j] > medval)
                (*hash) |= 1;
        }
    }

    cvReleaseImage(&img);
    cvReleaseImage(&img_resized);
    cvReleaseImage(&img_prime);
    cvReleaseImage(&dct_img);
    return 0;
}
This did generate something of the type I was looking for, but when I tried comparing it to a database of known hashes, I had as many false positives as I had positives. And so, I'm back at it and thought I might ask the experts.
Would any of you know/have a function that could give me some sort of identifier/checksum for provided images, one which would remain similar across similar images, so that it could be used to quickly identify an image via comparison against a database? In short: which entry in a database of known checksums does a given image best match?
I'm not looking for theories, concepts, papers or ideas, but actual working solutions. I'm not spending another day digging at a dead end, and I appreciate anyone who takes the time to put together some code.
With a bit more research, I found that the AutoIt devs designed PixelChecksum to use the Adler-32 algorithm. I guess the next step is to find a C implementation and get it to process pixel data. Any suggestions are welcome!
A Google search for "microsoft image hashing" turns up, near the top, the two best papers on the subject that I am aware of. Both offer practical solutions.
The short answer is that there is no out-of-the-box working solution for your problem. Additionally, the Adler-32 algorithm will not solve it.
Unfortunately, comparing images by visual similarity using image signatures (or a related concept) is a very active and open research topic. For example, you said that you had many false positives in your tests. However, what counts as a correct or incorrect result is subjective and will depend on your application.
In my opinion, the only way to solve your problem is to find an image descriptor adequate for your problem and then use it to compare the images. Note that comparing descriptors extracted from images is not a trivial task.
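Whatever descriptor you choose, note that compact binary signatures like the 64-bit DCT hash in your question are normally compared by Hamming distance rather than exact equality. A minimal sketch; the threshold is an assumption you must tune against your own data:

#include <stdint.h>

/* Hamming distance between two 64-bit perceptual hashes:
   the number of bits in which they differ. */
int hamming_distance(uint64_t a, uint64_t b) {
    return __builtin_popcountll(a ^ b); /* GCC/Clang builtin */
}

/* Treat images as "similar" when the distance is below an
   application-tuned threshold (e.g. around 10 of 64 bits). */
int probably_same_image(uint64_t a, uint64_t b, int threshold) {
    return hamming_distance(a, b) <= threshold;
}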
Please advise, with Objective-C code snippets and useful links: how can I control all audio output signals in OS X?
I think it should be something like a proxy layer somewhere in OS X's logic layers.
Thank you!
It's somewhat sad that there is no simple API to do this. Luckily it isn't too hard, just verbose.
First, get the system output device:
AudioDeviceID outputDevice;
UInt32 size = sizeof(outputDevice);
OSStatus result = AudioHardwareGetProperty(kAudioHardwarePropertyDefaultOutputDevice, &size, &outputDevice);
Then set the volume:
Float32 theVolume = 0.5f; // desired scalar volume, 0.0 - 1.0
result = AudioDeviceSetProperty(outputDevice, NULL, 0 /* master channel */, false,
                                kAudioDevicePropertyVolumeScalar, sizeof(Float32), &theVolume);
Obviously I've omitted any error checking, which is a must.
Things can get a bit tricky because not all devices support channel 0 (the master channel). If this is the case with your device (it probably is) you have two options: query the device for its preferred stereo pair and set the volume on those channels individually, or just set the volume on channels 1 and 2.
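A minimal sketch of that fallback, using the same (now deprecated) HAL calls as above; the 0.5 volume is just a placeholder:

// Ask the device for its preferred stereo pair, then set each channel's volume.
UInt32 channels[2];
UInt32 size = sizeof(channels);
OSStatus result = AudioDeviceGetProperty(outputDevice, 0, false,
    kAudioDevicePropertyPreferredChannelsForStereo, &size, channels);
if (result == noErr) {
    Float32 theVolume = 0.5f; // placeholder scalar volume, 0.0 - 1.0
    for (int i = 0; i < 2; ++i) {
        AudioDeviceSetProperty(outputDevice, NULL, channels[i], false,
            kAudioDevicePropertyVolumeScalar, sizeof(theVolume), &theVolume);
    }
}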