VLOOKUP two different items, use whichever one has a number - vba

I have a list of financial metrics in column A; the rest of the columns are the time periods the financial data covers.
Let's say I'm trying to calculate a ratio, but the financial metrics in A are not entirely unique, in the sense that a metric type may have more than one associated metric depending on how the company reports the metric.
For example, let's say I need Depreciation Expense on the income statement... that item may be reported as Depreciation, or DepreciationAndAmortization, or something else.
Any ideas how the formula for the ratio I'm trying to calculate can look up the metric in A1 and use the number immediately to its right as part of the formula... and, if the metric Depreciation (for example) is 0, look for the next one I specify, such as DepreciationAndAmortization, and use that one instead, since the first one isn't reported?

If I understand correctly, this should do it:
=MAX(INDEX(B:B,MATCH("*depreciation*",A:A,)),INDEX(B:B,MATCH("*depreciation*",A:A,)+MATCH("*depreciation*",INDEX(A:A,1+MATCH("*depreciation*",A:A,)):INDEX(A:A,100+MATCH("*depreciation*",A:A,)),)))

If the alternatives are say in E2 and E3 then:
=MAX(VLOOKUP(E2,A:B,2,0),VLOOKUP(E3,A:B,2,0))
i.e. try both and take whichever is larger.
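If one of the alternative labels might not appear in column A at all, VLOOKUP returns #N/A and the MAX breaks. A hedged variant (still assuming the alternatives are in E2 and E3) wraps each lookup in IFERROR so a missing item simply contributes 0:
=MAX(IFERROR(VLOOKUP(E2,A:B,2,0),0),IFERROR(VLOOKUP(E3,A:B,2,0),0))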

Regarding your concern that Excel Hero's answer returns a #VALUE! error: you can wrap it in IFERROR and return 0 if the value you're looking for isn't available.
=IFERROR(MAX(INDEX(B:B,MATCH("*depreciation*",$A:$A,)),INDEX(B:B,MATCH("*depreciation*",$A:$A,)+MATCH("*depreciation*",INDEX($A:$A,1+MATCH("*depreciation*",$A:$A,)):INDEX($A:$A,100+MATCH("*depreciation*",$A:$A,)),))),0)


How to populate all possible combinations of values in columns, using Spark/normal SQL

I have a scenario where my original dataset looks like the one below.
Data:
Country,Commodity,Year,Type,Amount
US,Vegetable,2010,Harvested,2.44
US,Vegetable,2010,Yield,15.8
US,Vegetable,2010,Production,6.48
US,Vegetable,2011,Harvested,6
US,Vegetable,2011,Yield,18
US,Vegetable,2011,Production,3
Argentina,Vegetable,2010,Harvested,15.2
Argentina,Vegetable,2010,Yield,40.5
Argentina,Vegetable,2010,Production,2.66
Argentina,Vegetable,2011,Harvested,15.2
Argentina,Vegetable,2011,Yield,40.5
Argentina,Vegetable,2011,Production,2.66
Bhutan,Vegetable,2010,Harvested,7
Bhutan,Vegetable,2010,Yield,35
Bhutan,Vegetable,2010,Production,5
Bhutan,Vegetable,2011,Harvested,2
Bhutan,Vegetable,2011,Yield,6
Bhutan,Vegetable,2011,Production,3
Now there is a very small country lookup table which lists all possible countries the source data can contain.
I want the number of columns in the output data to always be fixed (this is to ensure the reporting/visualization tool doesn't receive a varying number of columns with each day's new source-data ingestion, depending on how many distinct countries happen to be present).
So I have to somehow join the source data with the country_lookup CSV and populate all of those country columns, with F as the default value. Every country column is binary, with T or F as the possible values.
The original dataset above has to be converted into the one below:
Data (for rows where Type is Derived Yield, I've left the Amount field as an unevaluated expression rather than calculating it, so you can more easily match it against the formula below):
Country,Commodity,Year,Type,Amount,US,Argentina,Bhutan,India,Nepal,Bangladesh
US,Vegetable,2010,Harvested,2.44,T,F,F,F,F,F
US,Vegetable,2010,Yield,15.8,T,F,F,F,F,F
US,Vegetable,2010,Production,6.48,T,F,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
US,Vegetable,2011,Harvested,6,T,F,F,F,F,F
US,Vegetable,2011,Yield,18,T,F,F,F,F,F
US,Vegetable,2011,Production,3,T,F,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
US,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Argentina,Vegetable,2010,Harvested,15.2,F,T,F,F,F,F
Argentina,Vegetable,2010,Yield,40.5,F,T,F,F,F,F
Argentina,Vegetable,2010,Production,2.66,F,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Argentina,Vegetable,2011,Harvested,10,F,T,F,F,F,F
Argentina,Vegetable,2011,Yield,90,F,T,F,F,F,F
Argentina,Vegetable,2011,Production,9,F,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Bhutan,Vegetable,2010,Harvested,7,F,F,T,F,F,F
Bhutan,Vegetable,2010,Yield,35,F,F,T,F,F,F
Bhutan,Vegetable,2010,Production,5,F,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Bhutan,Vegetable,2011,Harvested,2,F,F,T,F,F,F
Bhutan,Vegetable,2011,Yield,6,F,F,T,F,F,F
Bhutan,Vegetable,2011,Production,3,F,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Formula for populating the Amount field for the Derived Yield type:
Derived Amount = (sum of Harvested for all countries marked T, grouped by the Year and Commodity columns) divided by (sum of Production for all countries marked T, grouped by the Year and Commodity columns).
So the target is to take every combination of the countries present in the source, calculate the sums of the respective Harvested and Production values, and then divide one by the other. In the actual scenario there can be more than one commodity for any given country, but that shouldn't matter, since the amounts are summed per commodity and year group.
Note: Users in the frontend can select any combination of countries. The sole reason for doing this in the backend rather than dynamically in the frontend is that AWS QuickSight (our visualisation tool), although it can sum a column over the selected filters, does not yet support calculations on those derived summed fields. Hence the calculation for every combination of countries has to be pre-populated (a very naive approach) so that it is available in the report for whatever countries the user selects.
Also, if you have a better approach than the naive one mentioned in the note, you are most welcome to guide me. I've also posted a question on the same problem, without writing out my expected approach, for experts to show how this kind of problem can be solved better; if you want to help solve it with some other technique, you're most welcome, here is the link to that question.
Any help would be greatly appreciated.
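A minimal PySpark sketch of the naive approach described above, for illustration only. The file names, the source_df/all_countries names, and the assumption that the lookup CSV has a single Country column are all mine, not from the original post:
from itertools import combinations
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Source data: Country, Commodity, Year, Type, Amount
source_df = spark.read.csv("source_data.csv", header=True, inferSchema=True)

# Fixed list of all possible countries from the small lookup table
all_countries = [r["Country"] for r in
                 spark.read.csv("country_lookup.csv", header=True).collect()]

# 1) Original rows: T for the row's own country, F for every other country column
base_df = source_df
for c in all_countries:
    base_df = base_df.withColumn(
        c, F.when(F.col("Country") == c, F.lit("T")).otherwise(F.lit("F")))

# 2) Harvested and Production per Country/Commodity/Year, pivoted into two columns
agg_rows = (source_df
            .filter(F.col("Type").isin("Harvested", "Production"))
            .groupBy("Country", "Commodity", "Year")
            .pivot("Type", ["Harvested", "Production"])
            .sum("Amount")
            .collect())

# 3) One "Derived Yield" row per (commodity, year, country combination of size >= 2),
#    repeated under each member country as in the expected output
by_key = {}
for r in agg_rows:
    by_key.setdefault((r["Commodity"], r["Year"]), []).append(r)

derived = []
for (commodity, year), recs in by_key.items():
    present = sorted(r["Country"] for r in recs)
    for size in range(2, len(present) + 1):
        for combo in combinations(present, size):
            harvested = sum(r["Harvested"] for r in recs if r["Country"] in combo)
            production = sum(r["Production"] for r in recs if r["Country"] in combo)
            flags = ["T" if c in combo else "F" for c in all_countries]
            for member in combo:
                derived.append((member, commodity, year, "Derived Yield",
                                harvested / production, *flags))

derived_df = spark.createDataFrame(
    derived, ["Country", "Commodity", "Year", "Type", "Amount"] + all_countries)

result_df = base_df.unionByName(derived_df)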

Tableau: Display Different Number Formats for the Same Measure

I have a field in my dashboard called outcome that displays the performance results for a doctor's office for multiple measures. A majority of the outcome values are rates and should be displayed as percentages. Unfortunately, there are two outcomes, "Utilization Management: Measure 3" and "Utilization Management: Measure 5" that are NOT rates, but actually number values.
Is there a way to display the outcome field in my table so that all 'Measures' that are NOT "Utilization Management: Measure 3" or "Utilization Management: Measure 5" get displayed as percentages, while the two aforementioned measures are displayed as number values?
Please do not get hung up on the appropriateness of combining rate and number values in the same field, as I've tried to have that conversation with my customer...they are insistent on this display ability and will not let it through UAT without it. Thanks.
Same question posed and packaged workbook attached for reference here.
One approach is to define a string valued calculated field that makes the number format part of the calculation logic. This solution does not play well with Measure Names and Measure Values, since you can't put string valued measures on the Measure Values shelf. But you can still build the view your customer wants, with a little effort.
First create a static set based on your field [Measure] (confusing choice of field name, by the way. You have a dimension named Measure.). Call it [Percentage Measures] and check off the ones you want displayed as percentages.
Then for each of your numeric measure fields that you want treated this way, make a corresponding calculated field that looks something like:
if attr([Percentage Measures]) then
    // rate measures (in the set): format as a percentage string, e.g. "12.34%"
    str(round(sum([outcome]) * 100, 2)) + "%"
else
    // non-rate measures (not in the set): format as a plain number string
    str(round(sum([outcome]), 3))
end
This approach assumes you will use your calculation on views where your dimension named [Measure] is on some (non-filter) shelf. Adjust the round() function arguments as desired
I took a look at your workbook. It seems that it is not possible to dynamically format values. There is a feature idea here: https://community.tableau.com/ideas/1411 which I believe would allow you to do what you want.

DAX sum different DateTime

I have a problem here. I would like to sum the work time of my employees based on the data (time2 - time1) per day, and here is my measure:
Effective Minute Work Time = 24 * 60 * (LASTNONBLANK(time2, 0) - FIRSTNONBLANK(time1, 0))
It works daily, but if I drill up to weekly/monthly data it shows the wrong sum, as shown below:
What I want is the sum of the minutes between the daily times (time2 - time1).
Thanks for your help :)
You have several approaches you can take: the hard way or the easier way :). The harder (at least for me :)) is to use DAX to do this. You would:
1) create a date table,
2) Use the DAX CALCULATE function to evaluate your last non-blank and first non-blank values (you might need CALCULATETABLE, but I'm not sure; DAX experts, jump in). Then subtract one from the other.
This will give you correct values for a given day for a given person. You can enforce the latter condition by putting a HASONEVALUE guard on the person name so that your measure informs the report author if they're not using it right.
Doing the same for dates is a little trickier. In the example you show, you are including the date in the row grouping. But if you change your mind and want instead to have 'total hours worked by person' or 'total hours worked by everyone', you're not done with modelling yet.
Your next step is to use CALCULATETABLE in combination with CALCULATE to create a measure that returns the total. You'll use CALCULATETABLE so you evaluate each date and the hours worked on that date by each person. Then you'll use CALCULATE to summarize that all down to a single number. If you're not careful with your DAX (or report authoring) you might mix up which person you're summarizing for, so that your first/last non-blank values are not at the person level. It gets intense quickly.
Your easier solution, though it might be more limited in its application (it really depends on your scenario), is to use the query editor to transform the data into a summary by day and person using a Group By step. This gives you a row per person per day with their start and end times, from which you can quickly calculate the hours worked that day, and then quite easily build visuals on top of the summary data. Of course you give up some of the flexibility of having a proper data model. However, if you have a date table, a person table, and your summary table, and you set up their relationships correctly, you can answer the most common questions.
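For the DAX route, here is a minimal sketch of a measure that sums the per-person, per-day differences, using SUMX over SUMMARIZE rather than the CALCULATETABLE wording above. The 'Attendance' table name and the Employee/Date column names are assumptions (only time1/time2 come from the question):
Effective Minute Work Time =
SUMX (
    SUMMARIZE ( 'Attendance', 'Attendance'[Employee], 'Attendance'[Date] ),
    VAR FirstIn = CALCULATE ( MIN ( 'Attendance'[time1] ) )  -- first clock-in for that person/day
    VAR LastOut = CALCULATE ( MAX ( 'Attendance'[time2] ) )  -- last clock-out for that person/day
    RETURN 24 * 60 * ( LastOut - FirstIn )                   -- datetime difference in minutes
)
Because the iteration is per person per day, the measure still adds up correctly when the visual is drilled up to week or month level.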

Aggregation of an MDX calculated measure when multiple time periods are selected

In my SSAS cube, I've several measures defined in MDX which work fine except in one type of aggregation across time periods. Some don't aggregate (and aren't meant to) but one does aggregate but gives the wrong answers. I can see why, but not what to do to prevent it.
The total highlighted in the Excel screenshot below (damn, not allowed to include an image, reverting to an old-fashioned table) is the simplest case of what goes wrong. In that example, 23,621 is not the grand total of 5,713 and 6,837.
        Active Commitments   Acquisitions   Net Lost Commitments   Growth in Commitments
2009                88,526         13,185                  5,713                   7,472
2010                92,125         10,436                  6,837                   3,599
Total                              23,621                 23,621
Active Commitments works fine. It is calculated for a point in time and should not be aggregated across time periods.
Acquisitions works fine.
[Measures].[Growth in Commitments] = ([Measures].[Active Commitments],[Date Dimension].[Fiscal Year Hierarchy].currentMember) - ([Measures].[Active Commitments],[Date Dimension].[Fiscal Year Hierarchy].prevMember)
[Measures].[Net Lost Commitments] = ([Measures].[Acquisitions] - [Measures].[Growth in Commitments])
What's happening in the screenshot is that the total of Net Lost Commitments is calculated from the total of Acquisitions (23,621) minus the total of Growth in Commitments (which is null).
Aggregation of Net Lost Commitments makes sense and works for non-time dimensions. But I want it to show null when multiple time periods are selected rather than an erroneous value. Note that this is not the same as simply disabling all aggregation on the time dimension. The aggregation of Net Lost Commitment works fine up the time hierarchy -- the screenshot shows correct values for 2009 and 2010, and if you expand to quarters or months you still get correct values. It is only when multiple time periods are selected that the aggregation fails.
So my question is how to change the definition of Net Lost Commitments so that it does not aggregate when multiple time periods are selected, but continues to aggregate across all other dimensions? For instance, is there a way of writing in MDX:
CREATE MEMBER CURRENTCUBE.[Measures].[Net Lost Commitments]
AS (iif([Date Dimension].[Fiscal Year Hierarchy].**MultipleMembersSelected**
, null
, [Measures].[Acquisitions] - [Measures].[Growth in Commitments]))
Thanks in advance,
Matt.
A suggestion from another source has solved this for me. I can use:
iif(iserror([Date Dimension].[Fiscal Year Hierarchy].CurrentMember)
    , null
    , [Measures].[Acquisitions] - [Measures].[Growth in Commitments])
CurrentMember will return an error when multiple members have been selected.
I didn't understand much of the first part of the question, sorry...but at the end I think you ask how to detect if multiple members from a particular dimension are in use in the MDX.
You can examine either of the two axes as a string, and use that to form a true/false test. Remember you can use VBA functions in Microsoft implementations of MDX.
I suggest InStr(1, SetToStr(StrToSet("Axis(1)")), "whatever") = 0 as a way to craft the first argument of your IIF.
This gets the set of members on axis number one, converts it to a string, and looks to see if a certain string is present (it returns the position of that string within the other). Zero means not found (so it returns true). You may need to use axis zero instead, or maybe check both.
To see if multiple members from the same dimension were used, the test string above would have to be more complicated. You want to know whether the name occurs once or more than once. You could test whether the first occurrence of the string is at the same position as the last occurrence (by searching backwards); though that could also mean the string wasn't found at all:
IIF(
InStr(1, bigstring, littlestring) = InStrRev(bigstring, littlestring),
'used once',
'used twice or not at all'
)
I came across this post while researching a solution for my own issue with grand totals of calculated measures over time when filters are involved. I think you could have fixed the calculations instead of suppressing them by using dynamic sets. This worked for me.
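For reference, a rough sketch of the dynamic-set idea; this is an assumption about how it could look for this cube (the set name is made up and the level reference would need to match the actual Fiscal Year Hierarchy), not the exact script used:
CREATE DYNAMIC SET CURRENTCUBE.[Selected Fiscal Years] AS
    [Date Dimension].[Fiscal Year Hierarchy].[Fiscal Year].Members;

CREATE MEMBER CURRENTCUBE.[Measures].[Net Lost Commitments] AS
    SUM ( EXISTING [Selected Fiscal Years],
          [Measures].[Acquisitions] - [Measures].[Growth in Commitments] );
Because a dynamic set is re-evaluated against the query's selection, the total sums only the periods the user actually picked, while individual year rows still show their own values.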

track sales for week/month and find the best sellers

Let's say I have a website that sells widgets. I would like to do something similar to a tag cloud tracking best sellers. However, because I am constantly acquiring and selling new widgets, I would like the sales to decay on a weekly time scale.
I'm having trouble puzzling out how to store and manipulate this data and have it decay properly over time, so that something that was an ultra-hot item 2 months ago but has since tapered off doesn't show at the top of the list over the current best sellers. What would be the logic and database design for this?
Part 1: You have to have tables storing the data that you want to report on. Date/time sold is obviously key. If you need to work in decay factors, that raises the question: for how long is the data good and/or relevant? At what point in time has the "value" of the data decayed so much that you no longer care about it? When this point is reached for any given entry in the database, what do you do--keep it there but ensure it gets factored out of all subsequent computations? Or do you archive it--copy it to a "history" table and delete it from your main "sales" table? This is relevant, as it has to be factored into your decay formula (as well as your capacity planning, annual reporting requirements, and who knows what all else).
Part 2: How much thought has been given to the decay formula that you want to use? There's no end of detail you can work into this. Options and factors to wade through include but are not limited to:
Simple age-based. Everything before the cutoff date counts as 1; everything after counts as 0. Sum and you're done.
What's the cutoff date? Precisely 14 days ago, to the minute? Midnight as of two Saturdays ago from (now)?
Does the cutoff date depend on the item that was sold? If some items are hot but some are not, does that affect things? What if you want to emphasize some things (the expensive/hard to sell ones) over others (the fluff you'd sell anyway)?
Simple age-based decays are trivial, but can be insufficient. Time to go nuclear.
Perhaps you want some kind of half-life, Dr. Freeman?
Everything sold is "worth" X, where the value of X is either always the same or varies with the item sold. And the value of X can decay over time.
Perhaps the value of X decreases by one-half every week. Or every day. Or every month. Or (again) it may vary depending on the item.
If you do half-lives, the value of X may never reach zero, and you're stuck tracking it forever (which is why I wrote "part 1" first). At some point, you probably need some kind of cut-off, some point after which you just don't care. X has decreased to one-tenth the initial value? Three months have passed? Either/or, but the "range" depends on the inherent value of the item?
My real point here is that how you calculate your decay rate is far more important than how you store it in the database. So long as the data the formula needs for its calculations is there, you should be good. And if you only need the last month's data to do this, you should perhaps move everything older to some kind of archive table.
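As a concrete illustration of the half-life idea, here is a hedged SQL sketch (SQL Server flavor; the sales table and column names are invented for the example) that scores each widget by exponentially decayed sales, with a one-week half-life and a three-month hard cutoff:
SELECT TOP (20)
       item_id,
       -- each sale is worth 0.5 ^ (age in days / 7): full weight today,
       -- half after one week, a quarter after two weeks, and so on
       SUM(POWER(0.5, DATEDIFF(DAY, sold_at, GETDATE()) / 7.0)) AS decayed_score
FROM   sales
WHERE  sold_at >= DATEADD(MONTH, -3, GETDATE())   -- hard cutoff: ignore anything older
GROUP  BY item_id
ORDER  BY decayed_score DESC;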
You could just count the sales for the last month/week/whatever, and sort your items according to that.
If you want, you can always add the total amount of sold items into your formula.
You might have a table which contains the definitions of the scoring criteria (most sales, most this, most that, etc.), then, for a given period, store in another table the points attributed for each of the criteria defined in the criteria table. Obviously, a historical table will be used to store the score of each seller for a given period or promotion; call it whatever you want.
Does it help a little?