Get the item with the highest count - datatables

Can you please help me to get the item with the highest count using DAX?
Measure = FIRSTNONBLANK('Table1'[ItemName],CALCULATE(COUNT('Table2'[Instance])))
This shows the first ItemName in the table but doesn't return the ItemName with the highest count.
Thanks

Well, it's more complicated than I would have wanted, but here's what I came up with.
There are two things you are hoping to do that are not so straightforward in DAX. First, you want an aggregated aggregation ;) -- in this case, the Max of a Count. The second thing is that you want to use a value from one column that you identify by what's in another column. That's row-based thinking and DAX prefers column-based thinking.
So, to do the aggregate of aggregates, we just have to slog through it. SUMMARIZE gives us counts of items. Max and Rank functions could help us find the biggest count, but wouldn't be so useful for getting the item name. TOPN gives us the whole row where our count is the biggest.
But now we need to get our ItemName out of the row, so SELECTCOLUMNS lets us pick the field to work with. Finally, we really want a value, not a 1-column, 1-row table. So FIRSTNONBLANK finishes the job.
Hope it helps.
Here's my DAX
MostFrequentItem =
VAR SummaryTable = SUMMARIZE ( 'Table', 'Table'[ItemName], "CountsByItem", COUNT ( 'Table'[ItemName] ) )
VAR TopSummaryItemRow = TOPN(1, SummaryTable, [CountsByItem], DESC)
VAR TopItem = SELECTCOLUMNS (TopSummaryItemRow, "TopItemName", [ItemName])
RETURN FIRSTNONBLANK (TopItem, [TopItemName])
Here's the DAX without using variables (not tested, sorry. Should be close):
MostFrequentItem_2 =
FIRSTNONBLANK (
SELECTCOLUMNS (
TOPN (
1,
SUMMARIZE ( 'Table', 'Table'[ItemName], "Count", COUNT ( 'Table'[ItemName] ) ),
[Count], DESC
),
"ItemName", [ItemName]
),
[ItemName]
)
Here's the mock data:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcipNSspJTS/NVYrVIZ/nnFmUnJOKznRJzSlJxMlyzi9PSs3JAbODElMyizNQmLEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Stuff = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Stuff", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Stuff", "ItemName"}})
in
#"Renamed Columns"
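For readers more at home outside DAX, the same three steps (count per item, keep the top row, pull out the name) can be sketched in Python. This is only an illustration of the logic, not a replacement for the measure; the sample list is made up and is not the mock data above.

```python
from collections import Counter


def most_frequent_item(item_names):
    """Return the name with the highest count, or None for empty input."""
    counts = Counter(item_names)        # like SUMMARIZE: one count per item
    top = counts.most_common(1)         # like TOPN(1, ..., DESC)
    return top[0][0] if top else None   # like SELECTCOLUMNS + FIRSTNONBLANK


# Hypothetical sample data, invented for this sketch.
sample = ["Subscribe", "Delete", "Subscribe", "Cowbell", "Subscribe", "Delete"]
print(most_frequent_item(sample))
```

One difference worth noting: on a tie, `most_common` keeps whichever item was seen first, whereas TOPN returns every tied row and FIRSTNONBLANK then picks one of them.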

Related

How to return distinct rows while keeping the ordering in a query (SQL Alchemy)

I've been stuck on this for a few days now. An event can have multiple dates, and I want the query to only return the date closest to today (the next date). I have considered querying for Events and then adding a hybrid property to Event that returns the next Event Date but I believe this won't work out (such as if I want to query EventDates in a certain range).
I'm having a problem with distinct() not working as I would expect. Keep in mind I'm not a SQL expert. Also, I'm using postgres.
My query starts like this:
distance_expression = func.ST_Distance(
cast(EventLocation.geo, Geography(srid=4326)),
cast("SRID=4326;POINT(%f %f)" % (lng, lat), Geography(srid=4326)),
)
query = (
db.session.query(EventDate)
.populate_existing()
.options(
with_expression(
EventDate.distance,
distance_expression,
)
)
.join(Event, EventDate.event_id == Event.id)
.join(EventLocation, EventDate.location_id == EventLocation.id)
)
And then I have multiple filters (just showing a few as an example)
query = query.filter(EventDate.start >= datetime.utcnow())
if kwargs.get("locality_id", None) is not None:
query = query.filter(EventLocation.locality_id == kwargs.pop("locality_id"))
if kwargs.get("region_id", None) is not None:
query = query.filter(EventLocation.region_id == kwargs.pop("region_id"))
if kwargs.get("country_id", None) is not None:
query = query.filter(EventLocation.country_id == kwargs.pop("country_id"))
Then I want to order by date and distance (using my query expression)
query = query.order_by(
EventDate.start.asc(),
distance_expression.asc(),
)
And finally I want to get distinct rows, and only return the next EventDate of an event, according to the ordering in the code block above.
query = query.distinct(Event.id)
The problem is that this doesn't work and I get a database error. This is what the generated SQL looks like:
SELECT DISTINCT ON (events.id) ST_Distance(CAST(event_locations.geo AS geography(GEOMETRY,4326)), CAST(ST_GeogFromText(%(param_1)s) AS geography(GEOMETRY,4326))) AS "ST_Distance_1", event_dates.id AS event_dates_id, event_dates.created_at AS event_dates_created_at, event_dates.event_id AS event_dates_event_id, event_dates.tz AS event_dates_tz, event_dates.start AS event_dates_start, event_dates."end" AS event_dates_end, event_dates.start_naive AS event_dates_start_naive, event_dates.end_naive AS event_dates_end_naive, event_dates.location_id AS event_dates_location_id, event_dates.description AS event_dates_description, event_dates.description_attribute AS event_dates_description_attribute, event_dates.url AS event_dates_url, event_dates.ticket_url AS event_dates_ticket_url, event_dates.cancelled AS event_dates_cancelled, event_dates.size AS event_dates_size
FROM event_dates JOIN events ON event_dates.event_id = events.id JOIN event_locations ON event_dates.location_id = event_locations.id
WHERE events.hidden = false AND event_dates.start >= %(start_1)s AND (event_locations.lat BETWEEN %(lat_1)s AND %(lat_2)s OR false) AND (event_locations.lng BETWEEN %(lng_1)s AND %(lng_2)s OR false) AND ST_DWithin(CAST(event_locations.geo AS geography(GEOMETRY,4326)), CAST(ST_GeogFromText(%(param_2)s) AS geography(GEOMETRY,4326)), %(ST_DWithin_1)s) ORDER BY event_dates.start ASC, ST_Distance(CAST(event_locations.geo AS geography(GEOMETRY,4326)), CAST(ST_GeogFromText(%(param_3)s) AS geography(GEOMETRY,4326))) ASC
I've tried a lot of different things and orderings but I can't work this out. I've also tried to create a subquery at the end using from_self() but it doesn't keep the ordering.
Any help would be much appreciated!
EDIT:
On further experimentation it seems that order_by will only work if it's ordering by the same field that I'm using for distinct(). So
query = query.order_by(EventDate.event_id).distinct(EventDate.event_id)
will work, but
query.order_by(EventDate.start).distinct(EventDate.event_id)
will not :/
I solved this by adding a row_number column and then filtering on the first row number, like in this answer:
filter by row_number in sqlalchemy
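Some background on the error: PostgreSQL requires the DISTINCT ON expressions to match the leftmost ORDER BY expressions, which is why ordering by start while using distinct(event_id) is rejected. The row_number approach avoids that restriction. Below is a minimal sketch of the idea using Python's built-in sqlite3 module rather than Postgres or SQLAlchemy; the table layout and rows are invented for illustration, and it assumes an SQLite build with window function support (3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event_dates (id INTEGER PRIMARY KEY, event_id INTEGER, start TEXT);
    INSERT INTO event_dates (id, event_id, start) VALUES
        (1, 10, '2024-05-03'),
        (2, 10, '2024-05-01'),
        (3, 20, '2024-05-02'),
        (4, 20, '2024-05-09');
""")

# Number the dates of each event by start time, then keep only the first one.
# The outer ORDER BY is free to use any column because DISTINCT ON is not involved.
NEXT_DATE_SQL = """
    SELECT id, event_id, start
    FROM (
        SELECT id, event_id, start,
               ROW_NUMBER() OVER (PARTITION BY event_id ORDER BY start) AS rn
        FROM event_dates
    ) AS ranked
    WHERE rn = 1
    ORDER BY start
"""

next_dates = conn.execute(NEXT_DATE_SQL).fetchall()
print(next_dates)
```

In SQLAlchemy the same shape would be a subquery with a row_number window column and a filter on that column equal to 1, which is what the linked answer describes.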

Select only the row with the max value, but the column with this info is a SUM()

I have the following query:
SELECT DISTINCT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT
FROM TGFCAB CAB, TGFPAR PAR, TSIBAI BAI
WHERE CAB.CODPARC = PAR.CODPARC
AND PAR.CODBAI = BAI.CODBAI
AND CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI
Which gives this result (company names and neighborhoods hidden for obvious reasons).
For those who don't understand Latin languages: the query currently returns the client code, company name, company neighborhood, and the total value of movements.
The WHERE clause restricts it to sales movements of companies from one specific city.
Notice that in the SELECT list, the column that returns the total value of sales is a SUM().
My goal is to return only the company that has the maximum value in this column; if there is a tie, display all tied companies.
This is where I'm struggling, because I can't seem to find a simple solution. I tried to use
WHERE AMOUNT = MAX(AMOUNT)
But as expected it didn't work
You tagged the question with a whole bunch of different databases; do you really use all of them?
Because, "PL/SQL" reads as "Oracle". If that's so, here's one option.
with temp as
-- this is your current query
(select columns,
sum(vlrnota) as amount
from ...
where ...
)
-- query that returns what you asked for
select *
from temp t
where t.amount = (select max(a.amount)
from temp a
);
You should be able to achieve the same without a subquery by using a window function with OVER ():
WITH T AS (
SELECT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT,
MAX(SUM(VLRNOTA)) OVER () AS MAMOUNT
FROM TGFCAB CAB
JOIN TGFPAR PAR ON PAR.CODPARC = CAB.CODPARC
JOIN TSIBAI BAI ON BAI.CODBAI = PAR.CODBAI
WHERE CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY CAB.CODPARC, PAR.RAZAOSOCIAL, BAI.NOMEBAI
)
SELECT CODPARC, RAZAOSOCIAL, NOMEBAI, AMOUNT
FROM T
WHERE AMOUNT=MAMOUNT
Note it's usually (always) beneficial to join tables using clear explicit join syntax. This should be fine cross-platform between Oracle & SQL Server.
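To see the "aggregate first, then compare against the maximum" pattern run end to end, here is a small sketch using Python's built-in sqlite3 module. The `sales` table and its rows are made up for the example and stand in for the real TGFCAB/TGFPAR/TSIBAI joins; the data deliberately contains a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (client TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('A', 100), ('A', 50),
        ('B', 150),
        ('C', 20);
""")

# Step 1 (CTE): total per client. Step 2: keep clients whose total equals the maximum.
TOP_CLIENTS_SQL = """
    WITH totals AS (
        SELECT client, SUM(amount) AS total
        FROM sales
        GROUP BY client
    )
    SELECT client, total
    FROM totals
    WHERE total = (SELECT MAX(total) FROM totals)
    ORDER BY client
"""

top_clients = conn.execute(TOP_CLIENTS_SQL).fetchall()
print(top_clients)
```

Because the filter is an equality against the maximum, both tied clients come back, which is the behavior the question asks for.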

SAP Query IMRG Measure documents

I'm learning SAP queries.
I want to get all the measurement documents for a piece of equipment.
To do that, I use 3 tables :
EQUI, IMPTT, IMRG
The query works, but it returns all documents, whereas I only want the latest one by date. I can't get that to work. I'm sure I have to add a custom field, but none of the ones I have tried works.
For example, my last code :
select min( IMRG~INVTS ) IMRG~RECDV
from IMRG inner join IMPTT on
IMRG~POINT = IMPTT~POINT into (INVTS, IMRGVAL)
where IMRG~POINT = IMPTT-POINT AND
IMPTT~MPOBJ = EQUI-OBJNR
and IMRG~CANCL = '' group by IMRG~MDOCM IMRG~RECDV.
ENDSELECT.
Thanks for your help.
You need the date from IMRG. Because INVTS is an inverted timestamp, its MIN() is the most recent measurement, so that part looks correct.
However your GROUP BY looks wrong. You should be grouping on the IMPTT~POINT field so that you get one record per measurement point. Note that one Point IMPTT can have many measurements (IMRG), so something like this:
SELECT EQUI-OBJNR, IMPTT~POINT, MIN(IMRG~IMRC_INVTS)
...
GROUP BY EQUI-OBJNR, IMPTT~POINT
If I understood you correctly, you are trying to get the most recent measurement of the equipment regardless of measurement point. So you can try this query, which is not so beautiful, but it works.
SELECT objnr COUNT(*) MIN( invts )
FROM equi AS eq
JOIN imptt AS tt
ON tt~mpobj = eq~objnr
JOIN imrg AS ig
ON ig~point = tt~point
INTO (wa_objnr, count, wa_invts)
WHERE ig~cancl = ''
GROUP BY objnr.
SELECT SINGLE recdv FROM imrg JOIN imptt ON imptt~point = imrg~point INTO wa_imrgval WHERE invts = wa_invts AND imptt~mpobj = wa_objnr.
WRITE: / wa_objnr, count, wa_invts, wa_imrgval.
ENDSELECT.
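Both answers rely on the same trick: with an inverted timestamp, the smallest value belongs to the newest record. A short Python sketch of that idea follows. It assumes, purely for illustration, that the inverted value is a large constant minus the real timestamp; the constant, the point names, and the readings are all hypothetical, and the real IMRG encoding may differ.

```python
# Assumed inversion base for this sketch only; not taken from SAP documentation.
INVERT_BASE = 99_999_999_999_999


def invert(timestamp):
    """Turn a YYYYMMDDhhmmss timestamp into an inverted one (newer means smaller)."""
    return INVERT_BASE - timestamp


def latest_per_point(rows):
    """rows: (point, inverted_timestamp, reading). Return the newest reading per point."""
    best = {}
    for point, inverted_ts, reading in rows:
        # The smallest inverted timestamp is the most recent measurement.
        if point not in best or inverted_ts < best[point][0]:
            best[point] = (inverted_ts, reading)
    return {point: reading for point, (_, reading) in best.items()}


# Hypothetical measurement documents.
readings = [
    ("P1", invert(20240101080000), 10.0),
    ("P1", invert(20240301080000), 12.5),
    ("P2", invert(20240215120000), 7.0),
    ("P2", invert(20240110120000), 6.0),
]
latest = latest_per_point(readings)
print(latest)
```

This mirrors grouping by measurement point and taking MIN() of the inverted timestamp, then looking up the reading that belongs to that minimum.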

Handling negative values with sql

I have a data set that lists the date and quantity of future stock of products. Occasionally our demand outstrips our future supply and we wind up with a negative future quantity. I need to factor that future negative quantity into previous supply so we don't compound the problem by overselling our supply.
In the following data set, I need to prepare for demand on 10-19 by applying the negative quantity up the chain until I'm left with a positive quantity:
"ID","SKU","DATE","SEASON","QUANTITY"
"1","001","2012-06-22","S12","1656"
"2","001","2012-07-13","F12","1986"
"3","001","2012-07-27","F12","-283"
"4","001","2012-08-17","F12","2718"
"5","001","2012-08-31","F12","-4019"
"6","001","2012-09-14","F12","7212"
"7","001","2012-09-21","F12","782"
"8","001","2012-09-28","F12","2073"
"9","001","2012-10-12","F12","1842"
"10","001","2012-10-19","F12","-12159"
I need to get it to this:
"ID","SKU","DATE","SEASON","QUANTITY"
"1","001","2012-06-22","S12","1656"
"2","001","2012-07-13","F12","152"
I have looked at using a while loop as well as an outer apply but cannot seem to find a way to do this yet. Any help would be much appreciated. This would need to work for sql server 2008 R2.
Here's another example:
"1","002","2012-07-13","S12","1980"
"2","002","2012-08-10","F12","-306"
"3","002","2012-09-07","F12","826"
Would become:
"1","002","2012-07-13","S12","1674"
"3","002","2012-09-07","F12","826"
You don't seem to be getting many answers, so here's something in case you don't get a proper 'how to do it in pure SQL' answer. Ignore this solution if a SQL-only one turns up; it's just defensive coding, not elegant.
If you want the sum of all rows with the same season, why delete duplicate records in SQL? Just pull the data out, run a foreach loop, sum all quantities with the same season value, update the table with the right values, and delete the unnecessary entries. Here's one way to do it (pseudocode):
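productsArray = SELECT * FROM products ORDER BY id
totals = array (associative)    # season -> summed quantity
firstId = array (associative)   # season -> id of the row that is kept
foreach product in productsArray:
    if product[season] not in totals:
        totals[season] = product[quantity]
        firstId[season] = product[id]
    else:
        totals[season] = totals[season] + product[quantity]
        DELETE FROM products WHERE id = product[id]
# write the totals only after every row has been added up
foreach season in totals:
    UPDATE products SET quantity = totals[season] WHERE id = firstId[season]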
Here is a CROSS APPLY - tested
SELECT b.ID,SKU,b.DATE,SEASON,QUANTITY
FROM (
SELECT SKU,SEASON, SUM(QUANTITY) AS QUANTITY
FROM T1
GROUP BY SKU,SEASON
) a
CROSS APPLY (
SELECT TOP 1 b.ID,b.Date FROM T1 b
WHERE a.SKU = b.SKU AND a.SEASON = b.SEASON
ORDER BY b.ID ASC
) b
ORDER BY ID ASC
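Read literally, the question asks for each negative quantity to be pushed back onto earlier rows until the running result is positive again. That rule can be sketched in Python as a single backward pass. This is one interpretation of the requirement, not a SQL Server 2008 R2 solution, and the function name is made up. It reproduces both expected outputs from the question, whereas grouping by season matches the first example but not the second.

```python
def absorb_negatives(rows):
    """rows: list of (id, quantity) in date order for one SKU.

    Returns (kept_rows, unresolved_deficit). Rows swallowed by a later
    shortfall are dropped; the deficit is 0 or negative.
    """
    kept = []
    deficit = 0  # shortfall still to be absorbed by earlier rows
    for row_id, quantity in reversed(rows):
        remaining = quantity + deficit
        if remaining > 0:
            kept.append((row_id, remaining))
            deficit = 0
        else:
            deficit = remaining  # carry the shortfall further up the chain
    kept.reverse()
    return kept, deficit


# Data copied from the two examples in the question (id, quantity).
sku_001 = [(1, 1656), (2, 1986), (3, -283), (4, 2718), (5, -4019),
           (6, 7212), (7, 782), (8, 2073), (9, 1842), (10, -12159)]
sku_002 = [(1, 1980), (2, -306), (3, 826)]

print(absorb_negatives(sku_001))
print(absorb_negatives(sku_002))
```

A non-zero deficit in the result would mean that even the earliest supply cannot cover the shortfall, a case the question does not describe.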

Linq to sql, aggregate columns ,group by date into listview

Okay so I have a listview control with 3 columns:
Year | NetTotal | GrossTotal
I also have a table called Orders with several columns; they contain information about the order and store the ID of the Customer it belongs to.
My question is: how can I query that table with (preferably) LINQ (the DataContext is from LINQ to SQL) to return the following data?
I want to find every entry with the matching CustomerID, group them by year, sum the totals for each year, and add them to the listview.
I know I could use lambda expressions and Aggregate, it's just not clear how (Option Infer On, db is a DataContext object, CustomerID is an Int32 variable):
Dim Orders = (From order In db.Orders Where order.CustomerID = CustomerID).GroupBy(Function(p) p.Date.Year).GetEnumerator
I reckon i'd have to create an anonymous type like the following:
Dim tmpYears = From prevs In db.Orders Select New With {.CustID = prevs.CustomerID, .Year = prevs.PaymentDate.Year, .NetPurchase, .GrossPurchase}
But how do I aggregate the Purchased column in a group?
Dim CustomerOrders = From ord In db.Orders Where Ord.CustomerID = custID Select ord
Dim tot = From O in CustomerOrders Select Aggregate netTot In O Into Sum(netTot.Price * netTot.Quantity * 1+ (netTot.Discount/100))
I want to merge the two.
Any suggestions? (I've read this, but I want it in LINQ because it's a team project and we agreed on using LINQ instead of sending .ExecuteQuery calls and the like to the db. It's also a LINQ to SQL solution, so it would be better if I could make some use of it.)
I can't guarantee I understand your requirements exactly, but long story short it seems you want to display orders, grouped by year, with the aggregated sums for net/gross value, where the orders match a provided CustomerID?
Sorry if the syntax is slightly out but I'm doing this freehand...
Dim results = db.Orders.Where(Function(n) n.CustomerId = customerId).GroupBy(Function(n) n.Date.Year, Function(key, values) New With {.Year = key, .NetTotal = values.Sum(Function(n) n.NetPurchase * n.Quantity * (1 + (n.Discount/100))), .GrossTotal = values.Sum(Function(n) n.GrossPurchase)})
This should provide you an anonymous type with Year, NetTotal, and GrossTotal populated as per the requirements I listed.
EDIT: Also apologies for the one liner but I'm sure you can reformat it to taste.
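The shape of that query (filter by customer, group by year, sum two columns) may be easier to follow when spelled out step by step. The following Python sketch shows the same logic on in-memory data; the `Order` class, its field names, and the sample rows are hypothetical and do not come from the asker's schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Order:
    customer_id: int
    order_date: date
    net_purchase: float
    gross_purchase: float


def yearly_totals(orders, customer_id):
    """Return (year, net_total, gross_total) tuples for one customer, sorted by year."""
    totals = defaultdict(lambda: [0.0, 0.0])  # year -> [net, gross]
    for order in orders:
        if order.customer_id == customer_id:           # the Where step
            bucket = totals[order.order_date.year]     # the GroupBy key
            bucket[0] += order.net_purchase            # Sum of net
            bucket[1] += order.gross_purchase          # Sum of gross
    return [(year, net, gross) for year, (net, gross) in sorted(totals.items())]


# Hypothetical orders, invented for this sketch.
orders = [
    Order(1, date(2011, 3, 1), 100.0, 120.0),
    Order(1, date(2011, 7, 9), 50.0, 60.0),
    Order(1, date(2012, 1, 5), 10.0, 12.0),
    Order(2, date(2011, 2, 2), 999.0, 999.0),
]
rows = yearly_totals(orders, 1)
print(rows)
```

Each resulting tuple corresponds to one listview row with the Year, NetTotal, and GrossTotal columns.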