How to add the total sum of each group to every row within the group with DAX in Power BI

I am trying to add a rank column for every group, repeated on every row within the group of the original table, not in a summed-up shape.
I found the formula below on another site, but it shows an error:
https://intellipaat.com/community/9734/rank-categories-by-sum-power-bi
Table1
+-----------+------------+-------+
| product | date | sales |
+-----------+------------+-------+
| coffee | 11/03/2019 | 15 |
| coffee | 12/03/2019 | 10 |
| coffee | 13/03/2019 | 28 |
| coffee | 14/03/2019 | 1 |
| tea | 11/03/2019 | 5 |
| tea | 12/03/2019 | 2 |
| tea | 13/03/2019 | 6 |
| tea | 14/03/2019 | 7 |
| Chocolate | 11/03/2019 | 30 |
| Chocolate | 11/03/2019 | 4 |
| Chocolate | 11/03/2019 | 15 |
| Chocolate | 11/03/2019 | 10 |
+-----------+------------+-------+
The goal:
+-----------+------------+-------+-----+------+
| product | date | sales | sum | rank |
+-----------+------------+-------+-----+------+
| coffee | 11/03/2019 | 15 | 54 | 5 |
| coffee | 12/03/2019 | 10 | 54 | 5 |
| coffee | 13/03/2019 | 28 | 54 | 5 |
| coffee | 14/03/2019 | 1 | 54 | 5 |
| tea | 11/03/2019 | 5 | 20 | 9 |
| tea | 12/03/2019 | 2 | 20 | 9 |
| tea | 13/03/2019 | 6 | 20 | 9 |
| tea | 14/03/2019 | 7 | 20 | 9 |
| Chocolate | 11/03/2019 | 30 | 59 | 1 |
| Chocolate | 11/03/2019 | 4 | 59 | 1 |
| Chocolate | 11/03/2019 | 15 | 59 | 1 |
| Chocolate | 11/03/2019 | 10 | 59 | 1 |
+-----------+------------+-------+-----+------+
The script:
sum =
SUMX(
    FILTER(
        Table1;
        Table1[product] = EARLIER(Table1[product])
    );
    Table1[sales]
)
The error:
EARLIER(Table1[product]) # Parameter is not correct type cannot find name 'product'
What's wrong with the script above?
* I am not able to test this rank script until the sum approach is fixed:
rank = RANKX( ALL(Table1); Table1[sum]; ; DESC; Dense )

The script is designed for a calculated column, not a measure. If you enter it as a measure, EARLIER has no outer row context to refer to, which gives you the error.
Create a measure:
Total Sales = SUM(Table1[sales])
This measure will be used to show sales.
Create another measure:
Sales by Product =
SUMX(
    VALUES(Table1[product]);
    CALCULATE([Total Sales]; ALL(Table1[date]))
)
This measure will show sales by product ignoring dates.
Third measure:
Sale Rank =
RANKX(
    ALL(Table1[product]; Table1[date]);
    [Sales by Product];
    ;
    DESC;
    Dense
)
Create a report with product and date on a pivot, and drop all three measures into it.
Tweak the RANKX parameters to change the ranking mode, if necessary.
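Alternatively, if you want the sum and rank physically repeated on every row, as in the goal table, both can be written as calculated columns. A minimal sketch, assuming the Table1 layout above (ALLEXCEPT keeps only the product filter during the context transition; semicolon separators as in the question's locale):
sum =
CALCULATE(
    SUM(Table1[sales]);  // total sales of the current row's product
    ALLEXCEPT(Table1; Table1[product])
)
rank =
RANKX(
    ALL(Table1);   // rank the current row's sum against all rows
    Table1[sum];
    ;
    DESC;
    Dense
)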

Related

What's the shortest way to generate a column of row numbers in a query, instead of having to count the rows by hand in SQL?

I'm going through some practice questions. One asked for the number of rows returned by my query, and I found myself counting the rows by hand, which seemed inefficient.
How do I create a new column that numbers the rows from 1 to the number of rows?
If my query is as follows,
SELECT *
FROM invoices
WHERE BillingCountry = 'Germany' AND Total > 5
then the result is:
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
| InvoiceId | CustomerId | InvoiceDate | BillingAddress | BillingCity | BillingState | BillingCountry | BillingPostalCode | Total |
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
| 12 | 2 | 2009-02-11 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | None | Germany | 70174 | 13.86 |
| 40 | 36 | 2009-06-15 00:00:00 | Tauentzienstraße 8 | Berlin | None | Germany | 10789 | 13.86 |
| 52 | 38 | 2009-08-08 00:00:00 | Barbarossastraße 19 | Berlin | None | Germany | 10779 | 5.94 |
| 67 | 2 | 2009-10-12 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | None | Germany | 70174 | 8.91 |
| 95 | 36 | 2010-02-13 00:00:00 | Tauentzienstraße 8 | Berlin | None | Germany | 10789 | 8.91 |
| 138 | 37 | 2010-08-23 00:00:00 | Berger Straße 10 | Frankfurt | None | Germany | 60316 | 13.86 |
| 193 | 37 | 2011-04-23 00:00:00 | Berger Straße 10 | Frankfurt | None | Germany | 60316 | 14.91 |
| 236 | 38 | 2011-10-31 00:00:00 | Barbarossastraße 19 | Berlin | None | Germany | 10779 | 13.86 |
| 241 | 2 | 2011-11-23 00:00:00 | Theodor-Heuss-Straße 34 | Stuttgart | None | Germany | 70174 | 5.94 |
| 269 | 36 | 2012-03-26 00:00:00 | Tauentzienstraße 8 | Berlin | None | Germany | 10789 | 5.94 |
| 291 | 38 | 2012-06-30 00:00:00 | Barbarossastraße 19 | Berlin | None | Germany | 10779 | 8.91 |
| 367 | 37 | 2013-06-03 00:00:00 | Berger Straße 10 | Frankfurt | None | Germany | 60316 | 5.94 |
+-----------+------------+---------------------+-------------------------+-------------+--------------+----------------+-------------------+-------+
There are 12 rows pulled from the dataset, but I only realized that after manually counting them.
What can I add to my query to produce a column on the left-most side of the result that numbers the rows 1 through 12, the way Excel does? And is there a way to do the same for the columns, but in alphabetical order?
I would use the ROW_NUMBER function:
SELECT ROW_NUMBER() OVER (ORDER BY BillingAddress, BillingCity) AS RN, *
FROM invoices
WHERE BillingCountry = 'Germany' AND Total > 5
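If you want the numbering to follow the order the rows are shown in above (ascending InvoiceId), just order the window by that column instead:
SELECT ROW_NUMBER() OVER (ORDER BY InvoiceId) AS RN, *
FROM invoices
WHERE BillingCountry = 'Germany' AND Total > 5
As for the second part of the question: SQL has no notion of Excel-style alphabetical column labels; result columns are identified by their names or aliases, so lettering them is a job for the presentation layer, not the query.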

SQL window excluding current group?

I'm trying to produce rolled-up summaries of the following data, covering both the group in question and everything excluding that group. I think this can be done with a window function, but I'm having trouble getting the syntax down (in my case, Hive SQL).
I want the following data to be aggregated:
+------------+---------+--------+
| date | product | rating |
+------------+---------+--------+
| 2018-01-01 | A | 1 |
| 2018-01-02 | A | 3 |
| 2018-01-20 | A | 4 |
| 2018-01-27 | A | 5 |
| 2018-01-29 | A | 4 |
| 2018-02-01 | A | 5 |
| 2017-01-09 | B | NULL |
| 2017-01-12 | B | 3 |
| 2017-01-15 | B | 4 |
| 2017-01-28 | B | 4 |
| 2017-07-21 | B | 2 |
| 2017-09-21 | B | 5 |
| 2017-09-13 | C | 3 |
| 2017-09-14 | C | 4 |
| 2017-09-15 | C | 5 |
| 2017-09-16 | C | 5 |
| 2018-04-01 | C | 2 |
| 2018-01-13 | D | 1 |
| 2018-01-14 | D | 2 |
| 2018-01-24 | D | 3 |
| 2018-01-31 | D | 4 |
+------------+---------+--------+
Aggregated results:
+------+-------+---------+----+------------+------------------+----------+
| year | month | product | ct | avg_rating | avg_rating_other | other_ct |
+------+-------+---------+----+------------+------------------+----------+
| 2018 | 1 | A | 5 | 3.4 | 2.5 | 4 |
| 2018 | 2 | A | 1 | 5 | NULL | 0 |
| 2017 | 1 | B | 4 | 3.6666667 | NULL | 0 |
| 2017 | 7 | B | 1 | 2 | NULL | 0 |
| 2017 | 9 | B | 1 | 5 | 4.25 | 4 |
| 2017 | 9 | C | 4 | 4.25 | 5 | 1 |
| 2018 | 4 | C | 1 | 2 | NULL | 0 |
| 2018 | 1 | D | 4 | 2.5 | 3.4 | 5 |
+------+-------+---------+----+------------+------------------+----------+
I've also considered producing two aggregates, one including the product in question and one excluding it, but I'm having trouble creating the appropriate join key.
You can do:
select year(date), month(date), product,
       count(*) as ct, avg(rating) as avg_rating,
       sum(count(*)) over (partition by year(date), month(date)) - count(*) as ct_other,
       (sum(sum(rating)) over (partition by year(date), month(date)) - sum(rating)) /
       (sum(count(*)) over (partition by year(date), month(date)) - count(*)) as avg_other
from t
group by year(date), month(date), product;
The rating for the "other" group is a bit tricky. You need to add everything up and subtract out the current group, then calculate the average as that sum divided by that count.
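One subtlety: when a month has no other products, the denominator above is zero. Hive returns NULL for division by zero, which happens to match the desired output, but stricter engines raise an error. A sketch of the same query with a NULLIF guard; it also uses count(rating) in the denominator so rows with NULL ratings (like product B's first row) don't drag the "other" average down:
select year(date), month(date), product,
       count(*) as ct, avg(rating) as avg_rating,
       sum(count(*)) over (partition by year(date), month(date)) - count(*) as ct_other,
       -- divide by non-NULL ratings only; NULLIF turns a zero denominator into NULL
       (sum(sum(rating)) over (partition by year(date), month(date)) - sum(rating)) /
       nullif(sum(count(rating)) over (partition by year(date), month(date)) - count(rating), 0)
           as avg_other
from t
group by year(date), month(date), product;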

Outer join of multiple tables, keeping all rows in the common columns

I'm quite new to SQL - hope you can help:
I have several tables that all have 3 columns in common: ObjNo, Date (year-month), Product.
Each table has one other column that represents an economic value (sales, count, netsales, plan, ...).
I need to join all tables on the 3 common columns. The outcome must have one row for each existing combination of the 3 common columns; not every combination exists in every table.
If I do full outer joins, I get ObjNo, Date, etc. for each table, but I only need them once.
How can I achieve this?
tblCount
+-------+--------+---------+-------+
| ObjNo | Date   | Product | count |
+-------+--------+---------+-------+
| 1     | 201601 | Snacks  | 22    |
| 2     | 201602 | Coffee  | 23    |
| 4     | 201605 | Tea     | 30    |
+-------+--------+---------+-------+

tblSalesPlan
+-------+--------+---------+-----------+
| ObjNo | Date   | Product | salesplan |
+-------+--------+---------+-----------+
| 1     | 201601 | Beer    | 2000      |
| 2     | 201602 | Snacks  | 2000      |
| 5     | 201605 | Tea     | 2000      |
+-------+--------+---------+-----------+

tblSales
+-------+--------+---------+-------+
| ObjNo | Date   | Product | Sales |
+-------+--------+---------+-------+
| 1     | 201601 | Beer    | 1000  |
| 2     | 201602 | Coffee  | 2000  |
| 3     | 201603 | Tea     | 3000  |
+-------+--------+---------+-------+
Thx
Devon
It sounds like you're using SELECT * FROM... which is giving you every field from every table. You probably only want to get the values from one table, so you should be explicit about which fields you want to include in the results.
If you're not sure which table is going to have a record for each case (i.e. there is not guaranteed to be a record in any particular table) you can use the COALESCE function to get the first non-null value in each case.
SELECT COALESCE(tbl1.ObjNo, tbl2.ObjNo, tbl3.ObjNo) AS ObjNo, ....
tbl1.Sales, tbl2.Count, tbl3.Netsales
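Putting that together for the three tables shown, a sketch (assuming the table and column names from the sample data; count may need quoting since it is a reserved word in most dialects):
SELECT COALESCE(c.ObjNo, p.ObjNo, s.ObjNo)       AS ObjNo,
       COALESCE(c.Date, p.Date, s.Date)          AS Date,
       COALESCE(c.Product, p.Product, s.Product) AS Product,
       c.count,      -- from tblCount
       p.salesplan,  -- from tblSalesPlan
       s.Sales       -- from tblSales
FROM tblCount c
FULL OUTER JOIN tblSalesPlan p
    ON  p.ObjNo   = c.ObjNo
    AND p.Date    = c.Date
    AND p.Product = c.Product
FULL OUTER JOIN tblSales s
    ON  s.ObjNo   = COALESCE(c.ObjNo, p.ObjNo)
    AND s.Date    = COALESCE(c.Date, p.Date)
    AND s.Product = COALESCE(c.Product, p.Product)
The COALESCE in the second join condition matters: after the first full outer join, either side's key columns can be NULL, so tblSales has to be joined against whichever key survived.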

Creating an SSRS report from 2 tables

I have two tables in a SQL Server database.
Here's my first table, Table1:
+------------+------------------+----------------+
| Project ID | Project Manager | Approved Hours |
+------------+------------------+----------------+
| 1 | Mr.A | 120 |
| 2 | Mr.B | 100 |
+------------+------------------+----------------+
Here's my second table, Table2:
+-----------+-----------------+-----------+----------+---------------+
| ProjectID | Project Manager | Personnel | Week No. | Working Hours |
+-----------+-----------------+-----------+----------+---------------+
| 1 | Mr.A | Tom | 1 | 20 |
| 1 | Mr.A | Tom | 2 | 20 |
| 1 | Mr.A | Tom | 3 | 10 |
| 1 | Mr.A | Harry | 1 | 20 |
| 1 | Mr.A | Harry | 2 | 20 |
| 1 | Mr.A | Harry | 3 | 20 |
| 2 | Mr.B | Tom | 1 | 20 |
| 2 | Mr.B | Tom | 2 | 10 |
| 2 | Mr.B | Tom | 3 | 20 |
| 2 | Mr.B | Harry | 1 | 20 |
| 2 | Mr.B | Harry | 2 | 15 |
+-----------+-----------------+-----------+----------+---------------+
I would like to create an SSRS report that looks like the table below. I'm using the 2012 version.
Actual Hours is the sum of Working Hours for each project.
+------------+-----------------+----------------+--------------+
| Project ID | Project Manager | Approved Hours | Actual Hours |
+------------+-----------------+----------------+--------------+
| 1 | Mr.A | 120 | 110 |
| 2 | Mr.B | 100 | 85 |
+------------+-----------------+----------------+--------------+
I'm kind of new to SQL. Can I get this done with a single query?
As @jarlh suggests, simply do an INNER JOIN and GROUP BY as below:
SELECT T.[Project ID],
       T.[Project Manager],
       T.[Approved Hours],
       SUM(T1.[Working Hours]) AS [Actual Hours]
FROM Table1 T
INNER JOIN Table2 T1
    ON T.[Project ID] = T1.ProjectID
GROUP BY T.[Project ID],
         T.[Project Manager],
         T.[Approved Hours];
Result:
+------------+-----------------+----------------+--------------+
| Project ID | Project Manager | Approved Hours | Actual Hours |
+------------+-----------------+----------------+--------------+
| 1 | Mr.A | 120 | 110 |
| 2 | Mr.B | 100 | 85 |
+------------+-----------------+----------------+--------------+
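One caveat: if a project has no rows in Table2 yet, the INNER JOIN drops it from the result. A LEFT JOIN variant (a sketch using the same table names) keeps such projects and reports 0 actual hours:
SELECT T.[Project ID],
       T.[Project Manager],
       T.[Approved Hours],
       ISNULL(SUM(T1.[Working Hours]), 0) AS [Actual Hours]  -- 0 when no Table2 rows match
FROM Table1 T
LEFT JOIN Table2 T1
    ON T.[Project ID] = T1.ProjectID
GROUP BY T.[Project ID],
         T.[Project Manager],
         T.[Approved Hours];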

Rolling total with no sub-select and no vendor-specific extensions

What I'm trying to achieve: rolling total for quantity and amount for a given day, grouped by hour.
It's easy in most cases, but if you have some additional columns (dir and product in my case) and you don't want to group/filter on them, that's a problem.
I know Oracle and MSSQL have extensions specifically for that, and Postgres has SELECT ... OVER (PARTITION BY ...).
At the moment I'm working on an app prototype backed by MySQL, and I have no idea what it will use in production, so I'm trying to avoid vendor lock-in.
The entire table:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
ORDER BY date, hour;
+------+-----+---------+------------+------+----------+--------+
| id | dir | product | date | hour | quantity | amount |
+------+-----+---------+------------+------+----------+--------+
| 2230 | 65 | ABCDEDF | 2014-09-11 | 1 | 1 | 10 |
| 2231 | 64 | ABCDEDF | 2014-09-11 | 3 | 4 | 40 |
| 2232 | 64 | ABCDEDF | 2014-09-11 | 5 | 5 | 50 |
| 2235 | 64 | ZZ | 2014-09-11 | 7 | 6 | 60 |
| 2233 | 64 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2237 | 66 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
+------+-----+---------+------------+------+----------+--------+
For a given date:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
WHERE date = '2014-09-18'
ORDER BY hour;
+------+-----+---------+------------+------+----------+--------+
| id | dir | product | date | hour | quantity | amount |
+------+-----+---------+------------+------+----------+--------+
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
+------+-----+---------+------------+------+----------+--------+
The results that I need, using a sub-select:
> SELECT date, hour, SUM(quantity),
( SELECT SUM(quantity) FROM sales s2
WHERE s2.hour <= s1.hour AND s2.date = s1.date
) AS total
FROM sales s1
WHERE s1.date = '2014-09-18'
GROUP by date, hour;
+------------+------+---------------+-------+
| date | hour | sum(quantity) | total |
+------------+------+---------------+-------+
| 2014-09-18 | 3 | 3 | 3 |
| 2014-09-18 | 5 | 2 | 5 |
| 2014-09-18 | 7 | 3 | 8 |
+------------+------+---------------+-------+
My concerns with using a sub-select:
* once there are around a million records in the table, the query may become too slow, and I'm not sure whether it can be optimized, even though it has no HAVING clauses.
* if I had to filter on a product or dir, I would have to put those conditions in both the main SELECT and the sub-SELECT (WHERE product = / WHERE dir =).
* a sub-select can only return a single sum, while I need two of them (SUM(quantity) and SUM(amount)): ERROR 1241 (21000): Operand should contain 1 column(s).
The closest result I was able to get uses a JOIN:
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount, s2.id
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
+----+------------+------+----------+--------+------+
| ih | date | hour | quantity | amount | id |
+----+------------+------+----------+--------+------+
| 3 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 3 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 3 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 5 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 5 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 7 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 7 | 2014-09-18 | 7 | 3 | 300 | 2229 |
| 7 | 2014-09-18 | 3 | 1 | 11 | 2234 |
+----+------------+------+----------+--------+------+
I could stop here and just use those results: group by ih (hour), calculate the sums for quantity and amount, and be happy. But something tells me that this is wrong.
If I remove DISTINCT, most rows come back duplicated. Replacing JOIN with its variants doesn't help.
Once I remove s2.id from the statement, I get a complete mess, with meaningful rows disappearing or collapsing (e.g. ids 2236/2227 get collapsed):
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
+----+------------+------+----------+--------+
| ih | date | hour | quantity | amount |
+----+------------+------+----------+--------+
| 3 | 2014-09-18 | 3 | 1 | 100 |
| 3 | 2014-09-18 | 3 | 1 | 11 |
| 5 | 2014-09-18 | 3 | 1 | 100 |
| 5 | 2014-09-18 | 5 | 2 | 200 |
| 5 | 2014-09-18 | 3 | 1 | 11 |
| 7 | 2014-09-18 | 3 | 1 | 100 |
| 7 | 2014-09-18 | 5 | 2 | 200 |
| 7 | 2014-09-18 | 7 | 3 | 300 |
| 7 | 2014-09-18 | 3 | 1 | 11 |
+----+------------+------+----------+--------+
Summing doesn't help; it only adds to the mess.
The first row (hour = 3) should have SUM(s2.quantity) equal to 3, but it shows 9. What SUM(s1.quantity) shows is a complete mystery to me.
> SELECT DISTINCT(s1.hour) AS hour, sum(s1.quantity), s2.date, SUM(s2.quantity)
FROM sales s1 JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
GROUP BY hour;
+------+------------------+------------+------------------+
| hour | sum(s1.quantity) | date | sum(s2.quantity) |
+------+------------------+------------+------------------+
| 3 | 9 | 2014-09-18 | 9 |
| 5 | 8 | 2014-09-18 | 5 |
| 7 | 15 | 2014-09-18 | 8 |
+------+------------------+------------+------------------+
Bonus points/boss level:
I also need a total_reference column: the same rolling total over the same periods, but for a different date (e.g. 2014-09-11).
If you want a cumulative sum in MySQL, the most efficient way is to use variables:
SELECT date, hour,
       (@q := q + @q) AS cumeq,   -- running quantity
       (@a := a + @a) AS cumea    -- running amount
FROM (SELECT date, hour, SUM(quantity) AS q, SUM(amount) AS a
      FROM sales s
      WHERE s.date = '2014-09-18'
      GROUP BY date, hour
     ) dh CROSS JOIN
     (SELECT @q := 0, @a := 0) vars
ORDER BY date, hour;
The right way to do this is with window functions, but MySQL doesn't support them. If you are planning on working with databases such as Oracle, SQL Server, and Postgres, you should develop against a database that is closer in functionality and supports the ANSI-standard window functions. Postgres, SQL Server, and Oracle all have free versions that you can use for development purposes.
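For reference, here is what the window-function version looks like on engines that support the ANSI syntax (Postgres, SQL Server, Oracle; MySQL added them in 8.0). A sketch against the sales table above:
SELECT date, hour,
       SUM(quantity) AS q,
       SUM(amount)   AS a,
       -- running totals per day, in hour order
       SUM(SUM(quantity)) OVER (PARTITION BY date ORDER BY hour) AS total_quantity,
       SUM(SUM(amount))   OVER (PARTITION BY date ORDER BY hour) AS total_amount
FROM sales
WHERE date = '2014-09-18'
GROUP BY date, hour
ORDER BY date, hour;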
Also, with proper indexing, you shouldn't have a problem with the subquery approach, even on large tables.