Two Dimensional Diagram with aggr function - qlikview

I'm having a very curious problem in QlikView.
I have a number of readouts from a database that show the amount of time spent in different states.
In that table there are 49 variables that describe the state: seven levels of the SOC and seven states of the temperature.
For example, one of the fields could be named SOC1_T1 or SOC2_T1, and so on.
So what I get is a table full of readouts in which every entry has a specific id for the object, the state variables and an age. There are multiple entries per object.
What I want to do is plot a two-dimensional diagram over all the states, so that I get an SOC-over-temperature histogram (the average of the maximum (or newest) value of every object).
I tried creating two dynamic (or synthetic) dimensions (ValueLoop(1,7) and ValueLoop(1,8)).
In the formulas I referred to them with
=If(ValueLoop(1,7) = 1 and ValueLoop(1,8) = 1,
(avg(aggr(FirstSortedValue (SOC1_T1
, -age), id)) * 100))
and created 49 formulas, one for each state variable.
The problem now is:
It only shows the first entry. I can replace the whole expression in the if condition with a specific number (100) and get a result. I also plotted the inner expression in a listbox and checked that the result is not null.
As soon as I delete the aggr function and just take the avg over everything (which is not what I want), everything works fine. When I switch back to aggr, only the first one is shown.
By the way, it doesn't help when I delete one of the dimensions; this doesn't work with one dimension either.
Any ideas or workarounds?
Greetings
Julian
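To pin down the intended result independently of QlikView, here is a minimal Python sketch of what avg(aggr(FirstSortedValue(SOC1_T1, -age), id)) * 100 is meant to compute for one state variable: take the readout with the highest age per id, then average those values. The function name and sample rows are made up for illustration, and ties on age are resolved by keeping the first row seen, which may differ from how QlikView treats ties.

```python
def avg_of_latest(rows, field):
    """Average `field` over the highest-age readout of each object, times 100."""
    latest = {}
    for row in rows:
        obj = row["id"]
        # Keep only the readout with the largest age per object.
        if obj not in latest or row["age"] > latest[obj]["age"]:
            latest[obj] = row
    values = [r[field] for r in latest.values() if r[field] is not None]
    if not values:
        return None
    return 100 * sum(values) / len(values)


# Hypothetical sample data: two objects, three readouts.
sample_rows = [
    {"id": 1, "age": 1, "SOC1_T1": 0.1},
    {"id": 1, "age": 5, "SOC1_T1": 0.3},
    {"id": 2, "age": 2, "SOC1_T1": 0.5},
]
result = avg_of_latest(sample_rows, "SOC1_T1")
```

Comparing a hand-computed number like this against the chart for a few ids can help tell whether the aggr expression or the dimension setup is the part that misbehaves.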

Related

Stata Create panel dataset with two dataframes, no common variable

I am creating a city-by-day panel from scratch, but I'm having trouble balancing and filling in the data. Every city needs to have an observation every day between 01jan2000 and 31dec2019. My variable of interest is a dummy variable recording whether or not an event took place on that day in that city.
My original dataset only recorded observations if event == 1, and I managed to fill in time gaps using tsfill, but I can't figure out how to balance the data or extend it to start on 01jan2000 and end on 31dec2019. I need every date and city because eventually it will be merged with data that uses that sample period.
My current approach is to create a balanced and filled-in panel and then merge the event data using the date it took place. I have a Stata df containing the 7,305 dates, and another containing the 273 cityids I'm observing. Is it possible to generate a new df that combines these two so all 273 cities are observed every day? Essentially there will be 273 x 7,305 observations, with no variables of interest.
Any help figuring out how to solve the unbalanced issue using either of these approaches is hugely appreciated.
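The balanced skeleton described above is a cross product of the city list and the date list, with the event dummy defaulting to 0 and switched to 1 where an event row exists. The Python sketch below only illustrates that logic; the function names are hypothetical. In Stata itself, the cross and fillin commands are the usual starting points for forming all pairwise combinations, but check their documentation for the exact usage.

```python
from datetime import date, timedelta
from itertools import product


def daily_dates(start, end):
    """Every calendar day from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]


def build_panel(city_ids, dates, events):
    """One row per (city, date); event is 1 only where an event was recorded."""
    event_set = set(events)
    return [
        {"cityid": c, "date": d, "event": int((c, d) in event_set)}
        for c, d in product(city_ids, dates)
    ]


all_dates = daily_dates(date(2000, 1, 1), date(2019, 12, 31))

# Small hypothetical example: 3 cities, first 10 days, one recorded event.
small_panel = build_panel([1, 2, 3], all_dates[:10], [(2, date(2000, 1, 5))])
```

Building the full skeleton first and merging events into it afterwards avoids having to balance the panel after the fact.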

Count number of rows with similar times

Good afternoon. Here is the situation: a train has a geotag that determines its position in space. Location data is entered into a table. I need to calculate how many times the train was at a certain point. The problem is that while the train is at a certain point, the geotag leaves several entries in the table over time. I wrote a query that counts the number of arrivals, but it only works if a single train is selected; if several trains are selected, the query counts incorrectly. Below I attach the table and the query I wrote.
table
query1
If you select several train numbers, the values are mixed up and counted as one.
Now I have this query. It counts the number of arrivals, but it counts incorrectly. If I specify only one train in the selection instead of several, everything is correct. What is my mistake?
query2
I think you should use GROUP BY on address and zone.
By "train", do you mean address?
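Since the table and queries are not shown, the following is only a sketch of the counting logic under assumed column names (train, point, ts are hypothetical). The idea is to order the readings per train and count a new arrival only when the point differs from that train's previous reading, so that consecutive entries at the same point collapse into one visit and different trains never mix. In SQL the same idea is typically expressed with LAG(...) OVER (PARTITION BY train ORDER BY ts).

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter


def count_arrivals(rows):
    """Count arrivals per (train, point), collapsing consecutive repeats."""
    counts = Counter()
    ordered = sorted(rows, key=itemgetter("train", "ts"))
    # Process each train separately so readings of different trains never mix.
    for train, readings in groupby(ordered, key=itemgetter("train")):
        previous_point = None
        for reading in readings:
            if reading["point"] != previous_point:
                counts[(train, reading["point"])] += 1
            previous_point = reading["point"]
    return counts


# Hypothetical readings: train A visits X twice, train B visits X once.
readings = [
    {"train": "A", "point": "X", "ts": 1},
    {"train": "B", "point": "X", "ts": 2},
    {"train": "A", "point": "X", "ts": 3},
    {"train": "A", "point": "Y", "ts": 4},
    {"train": "B", "point": "X", "ts": 5},
    {"train": "A", "point": "X", "ts": 6},
]
arrivals = count_arrivals(readings)
```

Without the per-train partition, the interleaved readings of A and B at point X would be merged into a single visit, which matches the symptom described in the question.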

How to populate all possible combination of values in columns, using Spark/normal SQL

I have a scenario, where my original dataset looks like below
Data:
Country,Commodity,Year,Type,Amount
US,Vegetable,2010,Harvested,2.44
US,Vegetable,2010,Yield,15.8
US,Vegetable,2010,Production,6.48
US,Vegetable,2011,Harvested,6
US,Vegetable,2011,Yield,18
US,Vegetable,2011,Production,3
Argentina,Vegetable,2010,Harvested,15.2
Argentina,Vegetable,2010,Yield,40.5
Argentina,Vegetable,2010,Production,2.66
Argentina,Vegetable,2011,Harvested,10
Argentina,Vegetable,2011,Yield,90
Argentina,Vegetable,2011,Production,9
Bhutan,Vegetable,2010,Harvested,7
Bhutan,Vegetable,2010,Yield,35
Bhutan,Vegetable,2010,Production,5
Bhutan,Vegetable,2011,Harvested,2
Bhutan,Vegetable,2011,Yield,6
Bhutan,Vegetable,2011,Production,3
Image of the above csv:
There is also a very small country lookup table that lists every country the source data can contain. Please find it below:
I want to have the output data's number of columns always fixed (this is to ensure the reporting/visualization tool doesn't get dynamic number columns with every day's new source data ingestions depending on the varying distinct number of countries present).
So I have to somehow join the source data with the country_lookup csv and populate all those columns with F as the default value. Every country column would be binary, with T or F being the possible values.
The original dataset from the above has to be converted into below:
Data (for rows whose Type is Derived Yield, I've left the Amount as an unevaluated expression rather than calculating it, so that you can match it against the formula):
Country,Commodity,Year,Type,Amount,US,Argentina,Bhutan,India,Nepal,Bangladesh
US,Vegetable,2010,Harvested,2.44,T,F,F,F,F,F
US,Vegetable,2010,Yield,15.8,T,F,F,F,F,F
US,Vegetable,2010,Production,6.48,T,F,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
US,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
US,Vegetable,2011,Harvested,6,T,F,F,F,F,F
US,Vegetable,2011,Yield,18,T,F,F,F,F,F
US,Vegetable,2011,Production,3,T,F,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
US,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
US,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Argentina,Vegetable,2010,Harvested,15.2,F,T,F,F,F,F
Argentina,Vegetable,2010,Yield,40.5,F,T,F,F,F,F
Argentina,Vegetable,2010,Production,2.66,F,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2)/(6.48+2.66),T,T,F,F,F,F
Argentina,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Argentina,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Argentina,Vegetable,2011,Harvested,10,F,T,F,F,F,F
Argentina,Vegetable,2011,Yield,90,F,T,F,F,F,F
Argentina,Vegetable,2011,Production,9,F,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10)/(3+9),T,T,F,F,F,F
Argentina,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Argentina,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
Bhutan,Vegetable,2010,Harvested,7,F,F,T,F,F,F
Bhutan,Vegetable,2010,Yield,35,F,F,T,F,F,F
Bhutan,Vegetable,2010,Production,5,F,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+7)/(6.48+5),T,F,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(15.2+7)/(2.66+5),F,T,T,F,F,F
Bhutan,Vegetable,2010,Derived Yield,(2.44+15.2+7)/(6.48+2.66+5),T,T,T,F,F,F
Bhutan,Vegetable,2011,Harvested,2,F,F,T,F,F,F
Bhutan,Vegetable,2011,Yield,6,F,F,T,F,F,F
Bhutan,Vegetable,2011,Production,3,F,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+2)/(3+3),T,F,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(10+2)/(9+3),F,T,T,F,F,F
Bhutan,Vegetable,2011,Derived Yield,(6+10+2)/(3+9+3),T,T,T,F,F,F
The image of the above expected output data for a structured look at it:
Part 1 -
Part 2 -
Formulae for populating Amount Field for Derived Type:
Derived Amount = (sum of Harvested over all countries flagged T, grouped by Year and Commodity) / (sum of Production over all countries flagged T, grouped by Year and Commodity).
So, the target is to have a combination of all the countries from source and calculate the sum of respective Harvested and Production values which then has to be divided. The commodity can be more than one in the actual scenario for any given country, but that should not bother as the summation of amount happens on grouped commodity and year.
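As a rough illustration of the combination step, the Python sketch below enumerates every combination of two or more countries present in one (Commodity, Year) group and computes the derived amount from the formula above. The function name and the dictionary-based input shape are assumptions made for the example, not part of the original pipeline. Note that the number of combinations grows as 2^n in the number of countries, so this pre-computation only stays practical for a small country list.

```python
from itertools import combinations


def derived_rows(harvested, production, all_countries):
    """Derived yield for every combination of 2+ countries in one group.

    harvested / production map country -> amount for a single
    (Commodity, Year) group; all_countries is the fixed lookup list.
    """
    present = [c for c in all_countries if c in harvested]
    rows = []
    for size in range(2, len(present) + 1):
        for combo in combinations(present, size):
            amount = sum(harvested[c] for c in combo) / sum(
                production[c] for c in combo
            )
            # One fixed T/F column per lookup country, defaulting to F.
            flags = {c: ("T" if c in combo else "F") for c in all_countries}
            rows.append({"countries": combo, "amount": amount, "flags": flags})
    return rows


lookup = ["US", "Argentina", "Bhutan", "India", "Nepal", "Bangladesh"]
harvested_2010 = {"US": 2.44, "Argentina": 15.2, "Bhutan": 7}
production_2010 = {"US": 6.48, "Argentina": 2.66, "Bhutan": 5}
rows_2010 = derived_rows(harvested_2010, production_2010, lookup)
```

In the expected output each derived row is repeated under every country it includes; that replication would be a separate step on top of the rows produced here.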
Note: The users in the frontend can select any combination of countries. The sole reason for doing this in the backend rather than dynamically in the frontend is that AWS QuickSight (our visualisation tool), although it can compute sums over the selected column filters, does not yet support calculations on those derived sums. Hence the calculation for every combination of countries has to be pre-populated (a very naive approach) so that it is available in the report for whatever countries the users select.
Also, if you have a better approach than the naive one mentioned in the note, you are most welcome to guide me. I've also posted a question on the same problem without describing my expected approach, so that experts can show me how this kind of problem can be solved better. If you want to help solve it with some other technique, you're most welcome, here is the link to that question.
Any help will be greatly appreciated.

Can someone explain the following Essbase code: FIX, #relative

Can someone please explain the Essbase code below to me? This is my first time looking at any Essbase code and I'm getting a bit confused as to what it is actually doing.
FIX(&Mth, &Yr, &Version,
"Sector1","Sector2", #relative("Source Code",0), #relative("Channel", 0) )
FIX("AccountNo","DepNo")
DATACOPY "1A11"->"A-500" TO "1BCD"->"C-800";
ENDFIX
ENDFIX
From what I have googled the following is my understanding:
Creates a new command block which restricts database calculations to this subset.
Passes the following members into the command to be used:
Mth
Yr
Version
Returns the following fields:
Sector1
Sector2
returns the 0-level members of the Source Code member - meaning it returns the members of the Total Source Code without children (no other dimensions)
returns the 0-level members of the Channel member - meaning it returns the members of the Channel without children (no other dimensions)
Begins a new command block and passes the following members into the command to be used:
AccountNo
DepNo
Copies the range of cells 1A11, A-500 over to the range 1BCD, C-800
The above is what I understand from the Oracle documentation on each of the functions, but I can't actually figure out what is happening.
Welcome to the world of Essbase; it can be a little daunting at first especially if you're new to the concept of multidimensionality. You are on the right track regarding analyzing your calc script.
Try not to think of the FIX statement as a command block, per se. A FIX is used to select a portion of cells in the cube. Every single piece of data in your cube has a particular address that consists of one member from every dimension, plus the actual data value itself. For instance, a cube with the dimensions Time, Year, Scenario, and Location might have a particular piece of data at Jan->2018->Actual->Washington. The number of possible member combinations in a cube can quickly get very large. For instance, if your organization has 4 years of data, 12 months in a year, 100 locations, 10000 accounts, 3 versions, and 10 departments, you are talking about 4 * 12 * 100 * 10000 * 3 * 10 = 1.44 billion different potential addresses (cells) of data, and that's actually fairly small for a cube, as they tend to grow much larger.
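The cell count in that hypothetical example is simply the product of the dimension sizes, which a few lines of Python can confirm. The dimension names are labels chosen for the example.

```python
from math import prod

# Member counts per dimension from the hypothetical example above.
dimension_sizes = {
    "Year": 4,
    "Month": 12,
    "Location": 100,
    "Account": 10000,
    "Version": 3,
    "Department": 10,
}

# Every cell address picks exactly one member from each dimension.
total_cells = prod(dimension_sizes.values())
```

Adding even one more dimension with a handful of members multiplies this number again, which is why narrowing the scope of a calculation matters.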
That said, FIX statements are used to narrow down the scope of your calculation operation. Rather than operating on the ENTIRE cube (all 1.4 billion cells in my hypothetical example), the FIX restricts the calculation to cells that match the criteria you specify. In this case, the first FIX statement restricts the calculation to a particular month, year, version, sectors, sources, and channels. Note that the ampersand on Mth, Yr, and Version means that a substitution variable is to be used. This means your server or cube has a substitution variable value set, such as Mth = "Jan" and Yr = "FY2018", and Version might be "Working" or "Final" or something similar. I would guess that Sector1 and Sector2 are two different members from the same dimension. #RELATIVE("Source Code", 0) is a function that finds the level-0 members (leaf/bottom-level members in a dimension, that is, members that do not have children below them) of the specified member.
In other words, the first FIX statement is narrowing the scope of the calculation to a particular month in a particular year in a particular version (as opposed to all months, all years, all versions), and for that particular month/year/version (for either Sector1 or Sector2) it is fixing on all of the level-0/bottom/leaf members in Source Code and Channel dimensions.
The next FIX statement just further narrows the current scope of cells to calculate on in addition to the outer FIX. It's not uncommon to see FIX statements nested like this.
Lastly we get to the part where something actually happens: the DATACOPY. In the given FIX context, this DATACOPY command is saying that for EACH cell in the current FIX, copy values from the source to the destination. DATACOPY is a little more straightforward when it's just DATACOPY "Source" TO "Target" as opposed to using the cross-dimensional operator (->), but this is perhaps more easily understood in terms of the time/year dimensions. For example, imagine the data copy was written like this:
DATACOPY "FY2018"->"Dec" TO "FY2019"->"Jan";
In this DATACOPY I'd be telling Essbase that for the given FIX context I would like to copy values from the end of the year (data values where the year is FY2018 AND the month is Dec) to the beginning of the next year (data values where the year is FY2019 AND the month is Jan). Your DATACOPY is working in a similar fashion, but using cost centers or something else. It all just depends on how the cube is set up.
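To make the interplay of FIX and DATACOPY more concrete, here is a toy Python model, not Essbase itself: the cube is a dictionary from cell addresses to values, the FIX is a filter on those addresses, and the copy rewrites the source members to the target members for every cell that passes the filter. The three dimensions, the member names, and the datacopy function are all invented for illustration, and real Essbase behavior (missing values, block storage, and so on) is not modeled.

```python
DIMS = ("Year", "Month", "Account")  # hypothetical dimension order


def datacopy(cube, fix, source, target):
    """Copy values from source cells to target cells within a FIX scope.

    cube:   dict mapping (year, month, account) tuples to values
    fix:    dict mapping a dimension to the set of members in scope
    source: dict of dimension -> member to copy from
    target: dict of dimension -> member to copy to
    """
    for address, value in list(cube.items()):
        cell = dict(zip(DIMS, address))
        in_fix = all(cell[d] in members for d, members in fix.items())
        is_source = all(cell[d] == m for d, m in source.items())
        if in_fix and is_source:
            cell.update(target)  # swap the source members for the target ones
            cube[tuple(cell[d] for d in DIMS)] = value


cube = {
    ("FY2018", "Dec", "Sales"): 10.0,
    ("FY2018", "Dec", "Rent"): 4.0,
    ("FY2018", "Nov", "Sales"): 7.0,
}
# FIX on the Sales account only, then copy FY2018->Dec to FY2019->Jan.
datacopy(
    cube,
    fix={"Account": {"Sales"}},
    source={"Year": "FY2018", "Month": "Dec"},
    target={"Year": "FY2019", "Month": "Jan"},
)
```

Only the Sales value lands in FY2019->Jan here; Rent is outside the FIX and Nov is not the source month, so neither is copied.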

Excel VBA - Random extra grouping levels

This is not really an issue that affects the code but rather a question of the table's appearance.
So, the table is the summary of records for income and expenses of different business departments. Let's call each department a type of the record. Each of those types has subtype1. Each subtype1 has subtypes2 and each subtype2 has subtypes3.
So the sample data would be something like this.
1, Type1, sum of subtypes1
1.1, Subtype1, sum of subtypes2
1.1.1 Subtype2, sum of subtypes3
1.1.1.1 Subtype3, amount
1.1.1.2 Subtype3, amount
1.2, Subtype1, sum of subtypes2
1.2.1, Subtype2, sum of subtypes3
1.2.1.1, Subtype3, amount
Each subtype can have a different number of "children subtypes". Children subtypes can't go further than subtype3.
Then I use a VBA script to group the records of the same subtype under their direct parent, up to the main type. Everything works fine: I can expand or hide every single level of this structure.
However, logically the group outline on the left side of the table for rows should show 4 levels. Instead it shows 8 levels of groups. The first 4 do exactly what you would expect, showing or hiding the respective subtypes, while the other 4 levels do absolutely nothing, which is also expected, because I don't see a reason for them to be there.
Any ideas why extra levels have been created and how to get rid of them?
I may not have explained this very clearly, so feel free to ask for further information.
Try stepping through your code in the debugger to watch the groups being set up (open the VBA window and use the F8 key to step through one line at a time).
This may reveal why the extra groups are being defined and suggest what to change.