DAX Query to get average of a column within the same table - powerpivot

We have a table named MetricsTable which has columns A1 and Group simply.
We want to add a calculated column AvgA1 to this table which calculates the average of column A1 filtered by the value of Group . What should be our DAX query? The point is that we want to claculate the average from the values within the same table.
| id | A1| Group | AvgA1
| -- | --- | --- ------| ----
| 1 | 20 | Group1| 20
| 2 | 10 | Group2| 30
| 3 | 50 | Group2| 30
| 4 | 30 | Group2| 30
| 5 | 35 | Group3| 35
Regards

Likely you should use a measure and put that measure into a pivot table's 'Values' section:
AverageA1:=
AVERAGE( Metrics[A1] )
Then it will be updated based on filter and slicer selections in the pivot table, and subtotaled appropriately across various dimension categories.
If it strictly needs to be a column in the table for reasons not enumerated in your question, then the following will work:
AverageA1 =
CALCULATE(
AVERAGE( Metrics[A1] )
,ALLEXCEPT( Metrics, Metrics[Group] )
)
CALCULATE() takes an expression and a list of 0-N arguments to modify the filter context in which that expression is evaluated.
ALLEXCEPT() takes a table, and a list of 1-N fields from which to preserve context. The current (row) context in the evaluation of this column definition is the value of every field on that row. We remove the context from ALL fields EXCEPT those named in arguments 2-N of ALLEXCEPT(). Thus we preserve the row context of [Group], and calculate an average across the table where that [Group] is the same as in the current context.

Related

How to sum the minutes of each activity in Postgresql?

The column "activitie_time_enter" has the times.
The column "activitie_still" indicates the type of activity.
The column "activitie_walking" indicates the other type of activity.
Table example:
activitie_time_enter | activitie_still | activitie_walking
17:30:20 | Still |
17:31:32 | Still |
17:32:24 | | Walking
17:33:37 | | Walking
17:34:20 | Still |
17:35:37 | Still |
17:45:13 | Still |
17:50:23 | Still |
17:51:32 | | Walking
What I need is to sum up the total minutes for each activity separately.
Any suggestions or solution?
First calculate the duration for each activity (the with CTE) and then do conditional sum.
with t as
(
select
*, lead(activitie_time_enter) over (order by activitie_time_enter) - activitie_time_enter as duration
from _table
)
select
sum (duration) filter (where activitie_still = 'Still') as total_still,
sum (duration) filter (where activitie_walking = 'Walking') as total_walking
from t;
/** Result:
total_still|total_walking|
-----------+-------------+
00:19:16| 00:01:56|
*/
BTW do you really need two columns (activitie_still and activitie_walking)? Only one activity column with those values will do. This will allow more activities (Running, Sleeping, Working etc.) w/o having to change the table structure.

Dynamically determine and categorize duplicates in Tableau

I have a set that has the following structure:
ID | Date | DollarAmount
1 | Jan | 50
1 | Jan | 20
2 | Jan | 10
1 | Feb | 20
2 | Feb | 10
I am trying to dynamically be able to determine if for a particular period in time there is a duplicate based on the ID column.
For example, based on the data above, I would ideally have
I have tried to filter based on Number of Records but it shows filters out based on the TOTAL observations across the dataset, not date ranges.
Any help is much appreciated
Thanks!
Apparently you define a duplicate records as those that have the same value for the ID and Date fields, where Date is really a string containing the abbreviation for the month name.
In that case, define a (Boolean valued) LOD calculated field called [Duplicates] as {FIXED [ID], [Date] : Count(1) > 1}
Place [Duplicates] on the color shelf, Sum([Dollar Amount]) on rows and [Date] on Columns.
You will see the values True and False on the Color Legend. You can edit the aliases for those values if you want to display a more clear label such Duplicates, Non-Duplicates
If you have a true date valued field instead of a string, you may want to use DateTrunc() to define your duplicate test at the level of granularity that matches your problem.

Searching a "vertical" table in SQLite

Tables are usually laid out in a "horizontal" fashion:
+-----+----+----+--------+
|recID|FirstName|LastName|
+-----+----+----+--------+
| 1 | Jim | Jones |
+-----+----+----+--------+
| 2 | Adam | Smith |
+-----+----+----+--------+
Here, however, is a table with the same data in a "vertical" layout:
+-----+-----+----+-----+-------+
|rowID|recID| Property | Value |
+-----+-----+----+-----+-------+
| 1 | 1 |FirstName | Jim | \
+-----+-----+----+-----+-------+ These two rows constitute a single logical record
| 2 | 1 |LastName | Jones | /
+-----+-----+----+-----+-------+
| 3 | 2 |FirstName | Adam | \
+-----+-----+----+-----+-------+ These two rows are another single logical record
| 4 | 2 |LastName | Smith | /
+-----+-----+----+-----+-------+
Question: In SQLite, how can I search the vertical table efficiently and in such a way that recIDs are not duplicated in the result set? That is, if multiple matches are found with the same recID, only one (any one) is returned?
Example (incorrect):
SELECT rowID from items WHERE "Value" LIKE "J%"
returns of course two rows with the same recID:
1 (Jim)
2 (Jones)
What is the optimal solution here? I can imagine storing intermediate results in a temp table, but hoping for a more efficient way.
(I need to search through all properties, so the SELECT cannot be restricted with e.g. "Property" = "FirstName". The database is maintained by a third-party product; I suppose the design makes sense because the number of property fields is variable.)
To avoid duplicate rows in the result returned by a SELECT, use DISTINCT:
SELECT DISTINCT recID
FROM items
WHERE "Value" LIKE 'J%'
However, this works only for the values that are actually returned, and only for entire result rows.
In the general case, to return one result record for each group of table records, use GROUP BY to create such groups.
For any column that does not appear in the GROUP BY clause, you then have to choose which rowID in the group to return; here we use MIN:
SELECT MIN(rowID)
FROM items
WHERE "Value" LIKE 'J%'
GROUP BY recID
To make this query more efficient, create an index on the recID column.

Access 2007 select first value of query results

I am running into a rather annoying thingy in Access (2007) and I am not sure if this is a feature or if I am asking for the impossible.
Although the actual database structure is more complex, my problem boils down to this:
I have a table with data about Units for specific years. This data comes from different sources and might overlap.
Unit | IYR | X1 | Source |
-----------------------------
A | 2009 | 55 | 1 |
A | 2010 | 80 | 1 |
A | 2010 | 101 | 2 |
A | 2010 | 150 | 3 |
A | 2011 | 90 | 1 |
...
Now I would like the user to select certain sources, order them by priority and then extract one data value for each year.
For example, if the user selects source 1, 2 and 3 and orders them by (3, 1, 2), then I would like the following result:
Unit | IYR | X1 | Source |
-----------------------------
A | 2009 | 55 | 1 |
A | 2010 | 150 | 3 |
A | 2011 | 90 | 1 |
I am able to order the initial table, based on a specific order. I do this with the following query
SELECT Unit, IYR, X1, Source
FROM TestTable
WHERE Source In (1,2,3)
ORDER BY Unit, IYR,
IIf(Source=3,1,IIf(Source=1,2,IIf(Source=2,3,4)))
This gives me the following intermediate result:
Unit | IYR | X1 | Source |
-----------------------------
A | 2009 | 55 | 1 |
A | 2010 | 150 | 3 |
A | 2010 | 80 | 1 |
A | 2010 | 101 | 2 |
A | 2011 | 90 | 1 |
Next step is to only get the first value of each year. I was thinking to use the following query:
SELECT X.Unit, X.IYR, first(X.X1) as FirstX1
FROM (...) AS X
GROUP BY X.Unit, X.IYR
Where (…) is the above query.
Now Access goes bananas. Whatever order I give to the intermediate results, the result of this query is.
Unit | IYR | X1 |
--------------------
A | 2009 | 55 |
A | 2010 | 80 |
A | 2011 | 90 |
In other words, for year 2010 it shows the value of source 1 instead of 3. It seems that Access does not care about the ordering of the nested query when it applies the FIRST() function and sticks to the original ordering of the data.
Is this a feature of Access or is there a different way of achieving the desired results?
Ps: Next step would be to use a self join to add the source column to the results again, but I first need to resolve above problem.
Rather than use first it may be better to determine the MIN Priority and then join back e.g.
SELECT
t.UNIT,
t.IYR,
t.X1,
t.Source ,
t.PrioritySource
FROM
(SELECT
Unit,
IYR,
X1,
Source,
SWITCH ( [Source]=3, 1,
[Source]=1, 2,
[Source]=2, 3) as PrioritySource
FROM
TestTable
WHERE
Source In (1,2,3)
) as t
INNER JOIN
(SELECT
Unit,
IYR,
MIN(SWITCH ( [Source]=3, 1,
[Source]=1, 2,
[Source]=2, 3)) as PrioritySource
FROM
TestTable
WHERE
Source In (1,2,3)
GROUP BY
Unit,
IYR ) as MinPriortiy
ON t.Unit = MinPriortiy.Unit and
t.IYR = MinPriortiy.IYR and
t.PrioritySource = MinPriortiy.PrioritySource
which will produce this result (Note I include Source and priority source for demonstration purposes only)
UNIT | IYR | X1 | Source | PrioritySource
----------------------------------------------
A | 2009 | 55 | 1 | 2
A | 2010 | 150 | 3 | 1
A | 2011 | 90 | 1 | 2
Note the first subquery is to handle the fact that Access won't let you join on a Switch
Yes, FIRST() does use an arbitrary ordering. From the Access Help:
These functions return the value of a specified field in the first or
last record, respectively, of the result set returned by a query. If
the query does not include an ORDER BY clause, the values returned by
these functions will be arbitrary because records are usually returned
in no particular order.
I don't know whether FROM (...) AS X means you are using an ORDER BY inline (assuming that is actually possible) or if you are using a VIEW ('stored Query object') here but either way I assume the ORDER BY is being disregarded (because an ORDER BY should only apply to the final result).
The alternative is to use MIN() (or possibly MAX()).
This is the most concise way I have found to write such queries in Access that require pulling back all columns that correspond to the first row in a group of records that are ordered in a particular way.
First, I added a UniqueID to your table. In this case, it's just an AutoNumber field. You may already have a unique value in your table, in which case you can use that.
This will choose the row with a Source 3 first, then Source 1, then Source 2. If there is a tie, it picks the one with the higher X1 value. If there is a further tie, it is broken by the UniqueID value:
SELECT t.* INTO [Chosen Rows]
FROM TestTable AS t
WHERE t.UniqueID=
(SELECT TOP 1 [UniqueID] FROM [TestTable]
WHERE t.IYR=IYR ORDER BY Choose([Source],2,3,1), X1 DESC, UniqueID)
This yields:
Unit IYR X1 Source UniqueID
A 2009 55 1 1
A 2010 150 3 4
A 2011 90 1 5
I recommend (1) you create an index on the IYR field -- this will dramatically increase your performance for this type of query, and (2) if you have a lot (>~100K) records, this isn't the best choice. I find it works quite well for tables in the 1-70K range. For larger datasets, I like to use my GroupIncrement function to partition each group (similar to SQL Server's ROW_NUMBER() OVER statement).
The Choose() function is a VBA function and may not be clear here. In your case, it sounds like there is some interactivity required. For that, you could create a second table called "Choices", like so:
Rank Choice
1 3
2 1
3 2
Then, you could substitute the following:
SELECT t.* INTO [Chosen Rows]
FROM TestTable AS t
WHERE t.UniqueID=(SELECT TOP 1 [UniqueID] FROM
[TestTable] t2 INNER JOIN [Choices] c
ON t2.Source=c.Choice
WHERE t.IYR=t2.IYR ORDER BY c.[Rank], t2.X1 DESC, t2.UniqueID);
Indexing Source on TestTable and Choice on the Choices table may be helpful here, too, depending on the number of choices required.
Q:
Can you get this to work without the need for surrogate key? For
example what if the unique key is the composite of
{Unit,IYR,X1,Source}
A:
If you have a compound key, you can do it like this-- however I think that if you have a large dataset, it will totally kill the performance of the query. It may help to index all four columns, but I can't say for sure because I don't regularly use this method.
SELECT t.* INTO [Chosen Rows]
FROM TestTable AS t
WHERE t.Unit & t.IYR & t.X1 & t.Source =
(SELECT TOP 1 Unit & IYR & X1 & Source FROM [TestTable]
WHERE t.IYR=IYR ORDER BY Choose([Source],2,3,1), X1 DESC, Unit, IYR)
In certain cases, you may have to coalesce some of the individual parts of the key as follows (though Access generally will coalesce values automatically):
t.Unit & CStr(t.IYR) & CStr(t.X1) & CStr(t.Source)
You could also use a query in your FROM statements instead of the actual table. The query itself would build a composite of the four fields used in the key, and then you'd use the new key name in the WHERE clause of the top SELECT statement, and in the SELECT TOP 1 [key] of the subquery.
In general, though, I will either: (a) create a new table with an AutoNumber field, (b) add an AutoNumber field, (c) add an integer and populate it with a unique number using VBA - this is useful when you get a MaxLocks error when trying to add an AutoNumber, or (d) use an already indexed unique key.

How to properly group SQL results set?

SQL noob, please bear with me!!
I am storing a 3-tuple in a database (x,y, {signal1, signal2,..}).
I have a database with tables coordinates (x,y) and another table called signals (signal, coordinate_id, group) which stores the individual signal values. There can be several signals at the same coordinate.
The group is just an abitrary integer which marks the entries in the signal table as belonging to the same set (provided they belong to the same coordinate). So that any signals with the same 'coordinate_id' and 'group' together form a tuple as shown above.
For example,
Coordinates table Signals table
-------------------- -----------------------------
| id | x | y | | id | signal | coordinate_id | group |
| 1 | 1 | 2 | | 1 | 45 | 1 | 1 |
| 2 | 2 | 5 | | 2 | 95 | 1 | 1 |
| 3 | 33 | 1 | 1 |
| 4 | 65 | 1 | 2 |
| 5 | 57 | 1 | 2 |
| 6 | 63 | 2 | 1 |
This would produce the tuples (1,2 {45,95,33}), (1,2,{65,57}), (2,5, {63}) and so on.
I would like to retrieve the sets of {signal1, signal2,...} for each coordinate. The signals belonging to a set have the same coordinate_id and group, but I do not necessarily know the group value. I only know that if the group value is the same for a particular coordinate_id, then all those with that group form one set.
I tried looking into SQL GROUP BY, but I realized that it is for use with aggregate functions.
Can someone point out how to do this properly in SQL or give tips for improving my database structure.
SQLite supports the GROUP_CONCAT() aggregate function similar to MySQL. It rolls up a set of values in the group and concatenates them together comma-separated.
SELECT c.x, c.y, GROUP_CONCAT(s.signal) AS signal_list
FROM Signals s
JOIN Coordinates ON s.coordinate_id = c.id
GROUP BY s.coordinate_id, s.group
SQLite also permits the mismatch between columns in the select-list and columns in the group-by clause, even though this isn't strictly permitted by ANSI SQL and most implementations.
personally I would write the database as 3 tables:
x_y(x, y, id) coords_groups(pos, group, id) signals(group, signal)
with signals.group->coords_groups.id and coords_groups.pos->x_y.id
as you are trying to represent a sort-of 4 dimensional array.
then, to get from a couple of coordinates (X, Y) an ArrayList of List of Signal you can use this
SELECT temp."group", signals.signal
FROM (
SELECT cg."group", cg.id
FROM x_y JOIN coords_groups AS cg ON x_y.id = cg.pos
WHERE x_y.x=X AND x_y.y=Y )
AS temp JOIN signals ON temp.id=signals."group"
ORDER BY temp."group" ASC
(X Y are in the innermost where)
inside this sort of pseudo-code:
getSignalsGroups(X, Y)
ArrayList<List<Signals>> a
List<Signals> temp
query=sqlLiteExecute(THE_SQL_SNIPPET, x, y)
row=query.fetch() //fetch the first row to set the groupCounter
actualGroup=row.group
temp.add(row.signal)
for(row : query) //foreach row add the signal to the list
if(row.group!=actualGroup) //or reset the list if is a new group
a.add(actualGroup, temp)
actualGroup=row.group; temp= new List
temp.add(row.signal)
return a