SUM of SUMPRODUCTs - Formula is too long?

I just tried running an admittedly long formula and Excel came back to tell me it's too long. Currently I am using SUM on a number (probably 30-40) of SUMPRODUCT formulas.
A bit of background about it:
I am trying to match the origin and destination of a product, so I created a new sheet (Sheet 2) with every country in the world listed down Column 1 and across Row 1. The SUMPRODUCT formula I wrote goes into Sheet 1, where there are a number of columns containing the origin and destination of various products. The rows are based on each organization, so one organization may have 13 products with different countries of origin and multiple destinations per product.
The SUMPRODUCT formula I'm using looks like this:
=SUMPRODUCT((BA2:BA501="NO")*(P2:P501="Guatemala")*(R2:R501="Guatemala")*(S2:S501))
P = the country of origin
R = the country of destination
S = the volume to be summed
BA = a flag: in some cases I have the actual distribution of the product, so the "destination" volume should not be included. The formula should only use the destination when BA is "NO".
I had to identify the various columns by hand, resulting in a truly horrendous length of SUMPRODUCTS, which I then used SUM(Sumproduct 1, 2, 3...) around to get the total amount of products matching a COUNTRY origin/destination in the matrix on Sheet 2.
So after going through all this work, I'm a bit upset that Excel is saying it's too long! Does anyone know if there is a way to circumvent this, or if there's some solution I'm overlooking?
Thanks!
p.s. I doubt anyone is interested in reading this, but here's the full formula that Excel rejected:


If your criteria are always the same and only the range to be summed changes, then you could use SUMPRODUCT in this way:
=SUMPRODUCT((BA2:BA501="NO")*(P2:P501="Guatemala")*(R2:R501="Guatemala"),
S2:S501+T2:T501+V2:V501+AA2:AA501)
Greetings
Axel
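A short note on why the combined form returns the same total as a SUM of separate SUMPRODUCTs. Write m_i for the 0/1 value the three criteria produce in row i, and s_i, t_i, v_i, a_i for the values in columns S, T, V and AA (these symbol names are only for this note, they are not Excel names). Because the criteria are identical for every summed column, the mask can be factored out:

```latex
\[
\sum_{i=2}^{501} m_i s_i + \sum_{i=2}^{501} m_i t_i + \sum_{i=2}^{501} m_i v_i + \sum_{i=2}^{501} m_i a_i
= \sum_{i=2}^{501} m_i \,(s_i + t_i + v_i + a_i),
\]
\[
m_i = [\mathrm{BA}_i = \text{"NO"}]\,[\mathrm{P}_i = \text{"Guatemala"}]\,[\mathrm{R}_i = \text{"Guatemala"}]
\]
```

This only works when the criteria columns really are the same for each summed column. If every volume column has its own origin and destination columns, the mask differs per column and cannot be factored out this way.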

Why is fence status checking and resetting in Vulkan so slow?

If I check the status of a fence with vkGetFenceStatus() it takes about 0.002 milliseconds. This may not seem like a long time, but that amount of time in a rendering or game engine is a very long time, especially when waiting on fences while doing other scheduled jobs will soon add up to time quickly approaching a millisecond. If the fence statuses are kept host-side why does it take so long to check these and reset them? Do other people get similar timings when calling this function?
Ideally, the time it takes to check for a fence being set shouldn't matter. While taking up 0.02% of a frame at 120FPS isn't ideal, at the end of the day, it should not be all that important. The ideal scenario works like this:
Basically, you should build your fence logic around the idea that you're only going to check the fence if it's almost certainly already set.
If you submit frame 1, you should not check the fence when you're starting to build frame 2. You should only check it when you're starting to build frame 3 (or 4, depending on how much delay you're willing to tolerate).
And most importantly, if it isn't set, that should represent a case where either the CPU isn't doing enough work or the GPU has been given too much work. If the CPU is outrunning the GPU, it's fine for the CPU to wait. That is, the CPU performance no longer matters, since you're GPU-bound.
So the time it takes to check the fence is more or less irrelevant.
If you're in a scenario where you're task dispatching and you want to run the graphics task ASAP, but you have other tasks available if the graphics task isn't ready yet, that's where this may become a problem. But even so, it would only be a problem for that small space of time between the first check to see if the graphics task is ready and the point where you've run out of other tasks to start and the CPU needs to start waiting on the GPU to be ready.
In that scenario, I would suggest testing the fence only twice per frame. Test it at the first opportunity; if it's not set, do all of the other tasks you can. After those tasks are dispatched/done... just wait on the GPU with vkWaitForFences. Either the fence is set and the function will return immediately, or you're waiting for the GPU to be ready for more data.
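The "test once, do other work, then block" pattern can be sketched without a GPU. This is a minimal simulation, not Vulkan code: MockFence, mock_fence_signaled, mock_fence_wait and run_frame are hypothetical names, with the two mock functions standing in for vkGetFenceStatus and vkWaitForFences. It assumes the GPU makes progress while the CPU runs its other tasks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VkFence: it becomes signaled once
   ticks_left units of simulated GPU time have elapsed. */
typedef struct {
    int ticks_left;
} MockFence;

/* Non-blocking check, playing the role of vkGetFenceStatus(). */
static bool mock_fence_signaled(const MockFence *f) {
    return f->ticks_left <= 0;
}

/* Blocking wait, playing the role of vkWaitForFences().
   Returns how many ticks the CPU spent blocked. */
static int mock_fence_wait(MockFence *f) {
    int waited = f->ticks_left > 0 ? f->ticks_left : 0;
    f->ticks_left = 0;
    return waited;
}

typedef struct {
    int polls;          /* non-blocking status checks made */
    int tasks_run;      /* other tasks run before the graphics work */
    int ticks_blocked;  /* simulated time spent inside the blocking wait */
} FrameStats;

/* One frame: poll once; if the fence is not set, run every other task
   (each costs task_cost ticks, during which the GPU also advances),
   then fall back to a single blocking wait. */
static FrameStats run_frame(MockFence *f, int other_tasks, int task_cost) {
    assert(other_tasks >= 0 && task_cost >= 0);
    FrameStats s = {0, 0, 0};
    s.polls++;
    if (!mock_fence_signaled(f)) {
        for (int i = 0; i < other_tasks; i++) {
            f->ticks_left -= task_cost;
            s.tasks_run++;
        }
        s.ticks_blocked = mock_fence_wait(f);
    }
    return s;
}
```

With a fence that needs 5 ticks and three 1-tick tasks available, the frame polls once, runs the three tasks, and blocks for the remaining 2 ticks. The cost of the status check is paid once per frame regardless of how long the GPU takes.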
There are other scenarios where this could be a problem. If the GPU lacks dedicated transfer queues, you may be testing the same fence for different purposes. But even in those cases, I would suggest only testing the fence once per frame. If the data upload isn't done, you either have to do a hard sync if that data is essential right now, or you delay using it until the next frame.
If this remains a concern, and your Vulkan implementation allows timeline semaphores, consider using them to keep track of queue progress. vkGetSemaphoreCounterValue may be faster than vkGetFenceStatus, since it's just reading a number.

What is the difference between Synchronous and asynchronous I2C in embedded programming?

What is the difference between Synchronous and asynchronous I2C in embedded programming? Could anyone explain this using an example? When to use either of them?
I2C is a synchronous protocol: a shared clock line (SCL) accompanies the data, so the communicating parties do not need to agree on a speed beforehand. Contrast this with asynchronous serial lines like RS-232, where no communication can succeed if the parties don't use the same baud rate.
When people talk about sync/async in relation to I2C, they usually mean another level, which we may call the API. A synchronous API (or routine) starts the communication and does not return control to the program until all the data has been sent or received. The time taken by the transfer is unavailable to the program.
If the communication is asynchronous, the calling program can invoke the I2C driver and then continue with its work. Later, the program is notified (or must check) about the result of the transaction: is the write/read still in progress? And if it has finished, did it go well or not?
Sync/async in the context of I2C can be thought of in the same way as disk (file) I/O. Synchronous disk access is common because it is simple and effective: read some data into memory, check that the read succeeded, do something with the data, and move on. In the asynchronous way, the program says something like "I need this data: I/O driver, please fetch it while I do something else; when the data is available I will do something with it".
The asynchronous mode for I2C can be pleasant, especially because I2C is slow compared to other ways of exchanging data. On the other hand, I2C is used for small amounts of data, certainly not for a hard disk!
Speaking strictly about the embedded world, the MCU often has to do many things concurrently, and an I2C device can be slow enough to make the MCU lose too much time if the I2C is bit-banged. But often there is hardware support, interrupt-driven. Anyway, a non-blocking (i.e. asynchronous) API is more difficult to manage.
-- UPDATE AFTER COMMENT --
"often there is hardware support, interrupt-driven. Anyway, a non-blocking (i.e. asynchronous) API is more difficult to manage" Do you mean the implementation of synchronous I2C in a multimodal sensor system can be easier than the other and still give similar performance?
Let's assume there is asynchronous hardware and driver support. We call
i2c_write(periph_addr, data_to_send, 6);
// send 6 bytes to the peripheral
After a few microseconds the routine returns, but the communication is still ongoing. At this point we cannot issue another i2c_write(...), because it would interrupt the ongoing one. The program could do something else, yes, but not use the same bus. And if instead of i2c_write(...) we used
i2c_read(...);
the data would not be ready when the routine returns: the program must call i2c_read(), but use the data only later, once it has arrived, and without touching the I2C bus in the meantime. This is not difficult to do, but a synchronous call/API like:
if (i2c_read(some_data) == I2COK)
    display(some_data);
else
    display(error);
is surely far simpler.
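To make the bookkeeping behind the asynchronous variant concrete, here is a minimal sketch against a simulated bus. Nothing in it is a real driver API: MockI2c, i2c_read_async, i2c_byte_irq, i2c_is_done and i2c_read_sync are hypothetical names, and the "interrupt" is just a function the test calls by hand, one received byte per call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { I2C_IDLE, I2C_BUSY, I2C_DONE } I2cState;

/* Hypothetical simulated bus: `wire` holds the bytes the device will send. */
typedef struct {
    I2cState       state;
    uint8_t       *dest;      /* caller's buffer for the current transfer */
    size_t         wanted;    /* bytes requested */
    size_t         received;  /* bytes delivered so far */
    const uint8_t *wire;
} MockI2c;

/* Non-blocking read: only records the request and returns at once.
   Returns false if a transfer is already in flight on this bus. */
static bool i2c_read_async(MockI2c *bus, uint8_t *dest, size_t len) {
    assert(dest != NULL && len > 0);
    if (bus->state == I2C_BUSY)
        return false;
    bus->dest = dest;
    bus->wanted = len;
    bus->received = 0;
    bus->state = I2C_BUSY;
    return true;
}

/* Stand-in for the interrupt handler: one byte arrives per call. */
static void i2c_byte_irq(MockI2c *bus) {
    if (bus->state != I2C_BUSY)
        return;
    bus->dest[bus->received] = bus->wire[bus->received];
    bus->received++;
    if (bus->received == bus->wanted)
        bus->state = I2C_DONE;
}

/* The caller must check this before touching the destination buffer. */
static bool i2c_is_done(const MockI2c *bus) {
    return bus->state == I2C_DONE;
}

/* Blocking read built on top: start, then spin until the transfer is done.
   On real hardware the loop would wait for interrupts instead. */
static bool i2c_read_sync(MockI2c *bus, uint8_t *dest, size_t len) {
    if (!i2c_read_async(bus, dest, len))
        return false;
    while (!i2c_is_done(bus))
        i2c_byte_irq(bus);
    return true;
}
```

The asynchronous caller has to track three things the synchronous caller never sees: whether the bus is busy, whether the buffer is valid yet, and when to come back and check. That extra state is the cost of getting the CPU time back.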

Corner Cases to Verify Synchronous FIFO

I'm trying to figure out the corner cases for verifying a synchronous FIFO during hardware verification.
My setup is a very simple two-port synchronous FIFO (write/read), and the write clock frequency is the same as the read clock frequency.
To test whether FIFO overflow occurs or not, can somebody help me identify the corner cases so that we can completely verify this simple synchronous FIFO?

When I should use I2C and when I should use SPI? [closed]

I want to know which of the SPI and I2C communication protocols is faster. I have read in an article that SPI is faster, but it did not explain why. Is it because SPI has less overhead than I2C (start, ack, stop)?
Which of the two is better? I have seen that SPI is mostly preferred for ADCs, but why? For flash I have also mostly seen SPI being used, but for sensors both SPI and I2C. What makes us decide that SPI should be used for one peripheral and I2C for another?
A better question is when you should use I2C and when you should use SPI. As always in engineering, both protocols have their pros and cons. I compared them below so you can assess which is the better match for your requirements.
Quick comparison of my own:
Additional remarks:
SPI is usually used for slave devices where speed matters e.g. ADC peripherals, FLASH memories etc.
I2C is usually used for slave devices that are fine with I2C speed constraints or that are inherently slow, like sensors that take a while to produce a measurement. For example, the popular HTU21-D temperature and humidity sensor with I2C takes between 3 and 16 ms to perform a measurement (the time depends on the selected measurement resolution).
Post explaining I2C Bus Length constraints.
Post explaining why SPI is faster than I2C
PS:
The fastest ADC peripherals use neither I2C nor SPI. They use parallel I/O.
Keep in mind that for simple (hobby) projects it usually doesn’t matter and you will be fine with either of them.
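On the overhead part of the question, a rough back-of-the-envelope count may help. The sketch below assumes a plain I2C write with 7-bit addressing (one address byte plus the data bytes, each followed by an ACK bit) and ignores START/STOP timing and clock stretching; for SPI it assumes 8 clocks per byte with nothing else on the wire. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Clock cycles for an I2C write of n data bytes, 7-bit addressing:
   (1 address byte + n data bytes) * (8 bits + 1 ACK bit).
   START/STOP conditions and clock stretching are ignored. */
static uint32_t i2c_clocks(uint32_t n) {
    return 9u * (n + 1u);
}

/* Clock cycles for an SPI transfer of n bytes: no address, no ACK. */
static uint32_t spi_clocks(uint32_t n) {
    return 8u * n;
}

/* Time on the wire in nanoseconds for a given clock frequency. */
static uint64_t transfer_ns(uint32_t clocks, uint32_t clock_hz) {
    assert(clock_hz > 0);
    return (uint64_t)clocks * 1000000000ull / clock_hz;
}
```

For a 2-byte payload this gives 27 clocks on I2C against 16 on SPI. At 400 kHz (I2C fast mode) the I2C transfer takes 67.5 µs, while 16 clocks on an SPI bus assumed to run at 10 MHz take 1.6 µs. In this example the framing overhead accounts for less than a factor of two, so most of the difference comes from the clock rate each bus can sustain.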

In what situations is VkFence better than vkQueueWaitIdle for vkQueueSubmit?

As described here, vkQueueWaitIdle is equivalent to using a VkFence.
So in which situations should each of them be used?
As you say, vkQueueWaitIdle() is just a special case of Fence use.
So you would use it when you would have to write 10 lines of equivalent Fence code instead — especially if you do not care to remember all the previous queue submissions. It is somewhat a debug feature (most frequently you would use it temporarily to test your synchronization). And it may be useful during cleanup (e.g. application termination, or rebuilding the swapchain).
In all other cases you should prefer VkFences, which are more general:
You can take advantage of advanced vkWaitForFences() usage. I.e. wait-one vs wait-all and timeout.
You supply it to some command that is supposed to signal it (can't do that with vkQueueWaitIdle()). You can do something like:
vkQueueSubmit( q, 1, si1, fence1 );
vkQueueSubmit( q, 1, si2, fence2 );
vkWaitForFences( device, 1, &fence1, VK_TRUE, UINT64_MAX ); // won't block on the 2nd submit unlike vkQueueWaitIdle(q)
which can even be potentially faster than:
vkQueueSubmit( q, 1, si1, VK_NULL_HANDLE );
vkQueueWaitIdle(q);
vkQueueSubmit( q, 1, si2, VK_NULL_HANDLE );
You can just query the state of the Fence without waiting with vkGetFenceStatus(). E.g. having some background job and just periodically asking if it's done already while you do other jobs.
VkFence may be faster even in identical situations. vkQueueWaitIdle() might be implemented as
vkQueueSubmit( q, 0, nullptr, fence );
vkWaitForFences( device, 1, &fence, VK_TRUE, UINT64_MAX ); // infinite wait
where you would potentially pay extra for the vkQueueSubmit.
In what situations is VkFence better than vkQueueWaitIdle for vkQueueSubmit?
When you aren't shutting down the Vulkan context, i.e. in virtually all situations. vkQueueWaitIdle is a sledgehammer approach to synchronization, roughly analogous to glFinish(). A Vulkan queue is something you want to keep populated, because when it's empty that's a kind of inefficiency. Using vkQueueWaitIdle creates a kind of synchronization point between the client code and parts of the Vulkan driver, which can potentially lead to stalls and bubbles in the GPU pipeline.
A fence is much more fine-grained. Instead of asking the queue to be empty of all work, you're just asking when it finished the specific set of work queued prior to or with the fence. Even though it still creates a synchronization point by having to sync the client CPU thread with the driver CPU thread, this still leaves the driver free to continue working on the remaining items in the queue.
Semaphores are even better than fences, because they're telling the driver that one piece of work is dependent on another piece of work and letting the driver work out the synchronization entirely internally, but they're not viable for all situations, since sometimes the client needs to know when some piece of work is done.
Quite frankly you should always prefer waiting on a fence because it is much more flexible.
With a fence you can wait on completion of work without having to wait on work submitted after the work you are waiting on. A fence also allows other threads to push command buffers to the queue without interfering with the wait.
Besides that, vkQueueWaitIdle may be implemented differently (and less efficiently) compared to waiting on the fence.
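The difference in how long the CPU ends up waiting can be illustrated with a toy model. This is a simulation under a strong simplifying assumption, not Vulkan code: submissions on one queue are modeled as completing strictly in order with no overlap, each taking cost[i] ticks. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: cost[i] is the GPU time of the i-th submission on one queue;
   submissions finish strictly in order and do not overlap. */

/* Ticks until the fence passed with submission fence_index signals:
   only that submission and the ones before it have to finish. */
static int ticks_until_fence(const int *cost, size_t n, size_t fence_index) {
    assert(cost != NULL && fence_index < n);
    int total = 0;
    for (size_t i = 0; i <= fence_index; i++)
        total += cost[i];
    return total;
}

/* Ticks until the queue is idle, which is what vkQueueWaitIdle() waits for:
   every submission, including later ones, has to finish. */
static int ticks_until_idle(const int *cost, size_t n) {
    assert(cost != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += cost[i];
    return total;
}
```

With three submissions costing 4, 10 and 3 ticks, waiting on the first submission's fence takes 4 ticks, while waiting for the queue to go idle takes 17. The two only coincide when the fence belongs to the last submission.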