Tableau - Conditions on LOD Expression - conditional-statements

I have a purchase_log table which has the following fields:
uid, date, category, amount
I would like to know the first and second purchase dates for each user within each category.
For example:
+-----+------+----------+--------+
| uid | date | category | amount |
+-----+------+----------+--------+
| A | d1 | c1 | 100 |
| A | d2 | c2 | 200 |
| A | d3 | c1 | 120 |
| A | d4 | c2 | 300 |
+-----+------+----------+--------+
For the user records above, I would like to be able to say that the first purchase from category c1 was made on date d1 and the second purchase from category c1 was made on date d3.
I currently created 3 calculated fields:
1st purchase:
{ FIXED [uid] : MIN([date])}
Repeat purchase:
IIF([date]>[1st Purchase],[date],null)
2nd purchase:
{ FIXED [uid] : MIN([Repeat Purchase])}
But since there is no distinction between categories, I'm not able to see dates with respect to categories.
How should I solve this problem?
Thanks.

You can do so by LODing on both uid and the category.
1st purchase:
{ FIXED [uid],[category] : MIN([date])}
Repeat purchase (keep this one as a row-level calculation, because a FIXED expression needs an aggregate inside it):
IIF([date]>[1st Purchase],[date],null)
2nd purchase:
{ FIXED [uid],[category] : MIN([Repeat Purchase])}
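As a cross-check outside Tableau, the same per-user, per-category logic can be sketched in plain Python. This is only an illustration: the concrete dates stand in for d1..d4 and are assumptions, and first_and_second_purchase is a hypothetical helper name.

```python
from collections import defaultdict
from datetime import date

# Hypothetical stand-ins for d1..d4 in the question's sample table.
purchase_log = [
    ("A", date(2020, 1, 1), "c1", 100),
    ("A", date(2020, 1, 2), "c2", 200),
    ("A", date(2020, 1, 3), "c1", 120),
    ("A", date(2020, 1, 4), "c2", 300),
]


def first_and_second_purchase(rows):
    """Return {(uid, category): (first_date, second_date_or_None)}."""
    dates_by_key = defaultdict(set)
    for uid, day, category, _amount in rows:
        dates_by_key[(uid, category)].add(day)

    result = {}
    for key, days in dates_by_key.items():
        first = min(days)                        # MIN([date]) per uid and category
        later = [d for d in days if d > first]   # dates after the first purchase
        second = min(later) if later else None   # MIN of the repeat purchases
        result[key] = (first, second)
    return result


purchases = first_and_second_purchase(purchase_log)
for (uid, category), (first, second) in sorted(purchases.items()):
    print(uid, category, first, second)
```

Grouping on the pair (uid, category) plays the role of listing both dimensions in the FIXED expression.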

Related

SQL/Power BI Joins without common column

So I have the following problem:
I have 2 tables, one containing different bids for a product_type, and one containing the price, date etc. to which the product was sold.
The tables look like this:
Table bids:
+----------+---------------------+---------------------+--------------+-------+
| Bid_id | Start_time | End_time | Product_type | price |
+----------+---------------------+---------------------+--------------+-------+
| 1 | 18.01.2020 06:00:00 | 18.01.2020 06:02:33 | blue | 5 € |
| 2 | 18.01.2020 06:00:07 | 18.01.2020 06:00:43 | blue | 7 € |
| 3 | 18.01.2020 06:01:10 | 19.01.2020 15:03:15 | red | 3 € |
| 4 | 18.01.2020 06:02:20 | 18.01.2020 06:05:44 | blue | 6 € |
+----------+---------------------+---------------------+--------------+-------+
Table sells:
+---------+---------------------+--------------+--------+
| Sell_id | Sell_time | Product_type | Price |
+---------+---------------------+--------------+--------+
| 1 | 18.01.2020 06:00:31 | Blue | 6,50 € |
| 2 | 18:01.2020 06:51:03 | Red | 2,50 € |
+---------+---------------------+--------------+--------+
The sell_id and the bid_id have no relation with each other.
What I want to find out is the maximum bid at the time we sold the product_type. So for sell_id 1, it should check which bids for this specific product_type were active at the sell_time (in this case bid_id 1 and 2) and return the higher price (in this case bid_id 2).
I tried to solve this problem in Power BI but was not able to find a solution. I assume that I have to work with SQL joins to solve it.
Is it possible to join based on criteria instead of matching columns? Something like:
SELECT bids.start_time, bids.end_time, bids.product_type, MAX(bids.price), sells.sell_time, sells.product_type, sells.price
FROM sells
INNER JOIN bids ON bids.start_time<sells.sell_time AND bids.end_time > sells.sell_time;
I am sorry if this question is confusing; I am still new to this. Thanks in advance for ANY help!
Your sample data Sell_time should be 18.01.2020, right? You can try this code (it can be resource-intensive relative to the amount of data because of the Cartesian join). If you are sure that the sell day always matches the bid start day, you can add a date column to your tables and use an additional TREATAS ( VALUES ( bids[day] ), sells[day] ) filter.
Test =
VAR __tretasfilter =
    TREATAS ( VALUES ( bids[Product_type] ), sells[Product_type] )
RETURN
    SUMMARIZE (
        FILTER (
            SUMMARIZECOLUMNS (
                sells[Sell_id],
                bids[Price],
                bids[Start_time],
                sells[Sell_time],
                bids[End_time],
                sells[Product_type],
                __tretasfilter
            ),
            [Start_time] <= [Sell_time]
                && [End_time] >= [Sell_time]
        ),
        sells[Sell_id],
        "MaxPrice", MAX ( bids[Price] )
    )
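To the SQL part of the question: yes, a join condition can be a range test rather than an equality. Below is a minimal sketch using Python's built-in sqlite3 module (SQLite dialect, not Power BI or SQL Server). It assumes the timestamps are stored as ISO 8601 text, that the second sell happened on 18.01.2020, and that product types should match case-insensitively; the lowercase column names are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bids  (bid_id INTEGER, start_time TEXT, end_time TEXT,
                    product_type TEXT, price REAL);
CREATE TABLE sells (sell_id INTEGER, sell_time TEXT,
                    product_type TEXT, price REAL);
""")
# Timestamps rewritten as ISO 8601 text so string comparison follows time order.
conn.executemany("INSERT INTO bids VALUES (?, ?, ?, ?, ?)", [
    (1, "2020-01-18 06:00:00", "2020-01-18 06:02:33", "blue", 5.0),
    (2, "2020-01-18 06:00:07", "2020-01-18 06:00:43", "blue", 7.0),
    (3, "2020-01-18 06:01:10", "2020-01-19 15:03:15", "red", 3.0),
    (4, "2020-01-18 06:02:20", "2020-01-18 06:05:44", "blue", 6.0),
])
conn.executemany("INSERT INTO sells VALUES (?, ?, ?, ?)", [
    (1, "2020-01-18 06:00:31", "Blue", 6.5),
    (2, "2020-01-18 06:51:03", "Red", 2.5),  # assumed date, see lead-in
])

# Join each sell to every bid of the same product type that was active
# at the sell time, then keep the highest bid price per sell.
max_active_bid = conn.execute("""
    SELECT s.sell_id, s.sell_time, s.product_type, MAX(b.price) AS max_bid
    FROM sells AS s
    LEFT JOIN bids AS b
           ON LOWER(b.product_type) = LOWER(s.product_type)
          AND b.start_time <= s.sell_time
          AND b.end_time   >= s.sell_time
    GROUP BY s.sell_id, s.sell_time, s.product_type
    ORDER BY s.sell_id
""").fetchall()

for row in max_active_bid:
    print(row)
```

The LEFT JOIN keeps sells that had no active bid (their max_bid would be NULL); an INNER JOIN would drop them.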

Returning singular row/value from joined table date based on closest date

I have a Production Table and a Standing Data table. The relationship of Production to Standing Data is actually Many-To-Many which is different to how this relationship is usually represented (Many-to-One).
The standing data table holds a list of tasks and the score each task is worth. Tasks can appear multiple times with different "ValidFrom" dates for changing the score at different points in time. What I am trying to do is query the Production Table so that the TaskID is looked up in the table and uses the date it was logged to check what score it should return.
Here's an example of how I want the data to look:
Production Table:
+----------+------------+-------+-----------+--------+-------+
| RecordID | Date | EmpID | Reference | TaskID | Score |
+----------+------------+-------+-----------+--------+-------+
| 1 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 2 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 3 | 30/02/2020 | 1 | 123 | 1 | 2 |
| 4 | 31/02/2020 | 1 | 123 | 1 | 2 |
+----------+------------+-------+-----------+--------+-------+
Standing Data
+----------+--------+----------------+-------+
| RecordID | TaskID | DateActiveFrom | Score |
+----------+--------+----------------+-------+
| 1 | 1 | 01/02/2020 | 1.5 |
| 2 | 1 | 28/02/2020 | 2 |
+----------+--------+----------------+-------+
I have tried the below code but unfortunately due to multiple records meeting the criteria, the production data duplicates with two different scores per record:
SELECT p.[RecordID],
p.[Date],
p.[EmpID],
p.[Reference],
p.[TaskID],
s.[Score]
FROM ProductionTable as p
LEFT JOIN StandingDataTable as s
ON s.[TaskID] = p.[TaskID]
AND s.[DateActiveFrom] <= p.[Date];
What is the correct way to return the correct and singular/scalar Score value for this record based on the date?
You can use APPLY:
SELECT p.[RecordID], p.[Date], p.[EmpID], p.[Reference], p.[TaskID], s.[Score]
FROM ProductionTable as p OUTER APPLY
( SELECT TOP (1) s.[Score]
FROM StandingDataTable AS s
WHERE s.[TaskID] = p.[TaskID] AND
s.[DateActiveFrom] <= p.[Date]
ORDER BY s.[DateActiveFrom] DESC
) s;
If you want the score matched at the record level instead, change the WHERE clause inside the APPLY accordingly.
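For readers without SQL Server at hand, the same "latest row on or before the date" lookup can be sketched with Python's built-in sqlite3 module. SQLite has no OUTER APPLY, so this uses a correlated subquery with ORDER BY ... LIMIT 1 instead. The table and column names are simplified, and the last two production dates are assumptions, since 30/02/2020 and 31/02/2020 in the sample are not real calendar dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE production (record_id INTEGER, log_date TEXT, emp_id INTEGER,
                         reference INTEGER, task_id INTEGER);
CREATE TABLE standing_data (record_id INTEGER, task_id INTEGER,
                            date_active_from TEXT, score REAL);
""")
conn.executemany("INSERT INTO production VALUES (?, ?, ?, ?, ?)", [
    (1, "2020-02-27", 1, 123, 1),
    (2, "2020-02-27", 1, 123, 1),
    (3, "2020-02-28", 1, 123, 1),  # assumed date: the day the new score starts
    (4, "2020-02-29", 1, 123, 1),  # assumed date: after the score change
])
conn.executemany("INSERT INTO standing_data VALUES (?, ?, ?, ?)", [
    (1, 1, "2020-02-01", 1.5),
    (2, 1, "2020-02-28", 2.0),
])

# For each production row, pick the single standing-data row with the
# latest date_active_from that is not after the production date.
scored = conn.execute("""
    SELECT p.record_id, p.log_date, p.task_id,
           (SELECT s.score
              FROM standing_data AS s
             WHERE s.task_id = p.task_id
               AND s.date_active_from <= p.log_date
             ORDER BY s.date_active_from DESC
             LIMIT 1) AS score
    FROM production AS p
    ORDER BY p.record_id
""").fetchall()

for row in scored:
    print(row)
```

Because the subquery returns at most one value per production row, the result cannot contain duplicated production records.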

How do I rank customer visits by date in PowerPivot?

I have a table with customers' transactions named "purchases" with fields like this:
--------------------------------------------------
| title | price |qty| client_id | created_at |
--------------------------------------------------
| product A | 100 | 1 | 1 | 01.01.2010 |
| product B | 120 | 2 | 1 | 05.01.2010 |
| product B | 120 | 1 | 2 | 08.01.2010 |
When I create a calc column for total purchase count, it works great:
=calculate(DISTINCTCOUNT([created_at]);ALLEXCEPT(purchases;purchases[client_id]))
but when I try to calculate the number of each exact customer visit (or rank) with the formula
=calculate(DISTINCTCOUNT([created_at]);filter(purchases;purchases[created_at]<=earlier([created_at]));ALLEXCEPT(purchases;purchases[client_id]))
it calculates the visit number regardless of the current client_id; it ignores the ALLEXCEPT part of the filter.
How can I fix it?
I also tried to solve it with RANKX, but the issue was similar: I don't know how to filter according to the current client_id.
not sure how, but it works :))
=CALCULATE (DISTINCTCOUNT ( [created_at] );FILTER (ALL ( purchases ); [client_id] = EARLIER ( [client_id] ) && [created_at]<=EARLIER([created_at])))
got this hint on Facebook. hope it helps someone.
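The formula above counts, for each row, the distinct dates of the same client that fall on or before the row's own date. Here is a rough equivalent outside PowerPivot, sketched with Python's built-in sqlite3 module; it assumes created_at is in dd.mm.yyyy format and stores it as ISO text so that string comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE purchases (title TEXT, price REAL, qty INTEGER,
                            client_id INTEGER, created_at TEXT)
""")
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?, ?)", [
    ("product A", 100, 1, 1, "2010-01-01"),
    ("product B", 120, 2, 1, "2010-01-05"),
    ("product B", 120, 1, 2, "2010-01-08"),
])

# Visit number = distinct purchase dates of the same client up to and
# including the current row's date.
visit_ranks = conn.execute("""
    SELECT p.client_id, p.created_at,
           (SELECT COUNT(DISTINCT q.created_at)
              FROM purchases AS q
             WHERE q.client_id = p.client_id
               AND q.created_at <= p.created_at) AS visit_number
    FROM purchases AS p
    ORDER BY p.client_id, p.created_at
""").fetchall()

for row in visit_ranks:
    print(row)
```

The two conditions in the subquery's WHERE clause correspond to the two EARLIER comparisons inside FILTER.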

SQL deleting rows with duplicate dates conditional upon values in two columns

I have data on approx 1000 individuals, where each individual can have multiple rows, with multiple dates and where the columns indicate the program admitted to and a code number.
I need each row to contain a distinct date, so I need to delete the rows of duplicate dates from my table. Where there are multiple rows with the same date, I need to keep the row that has the lowest code number. In the case of more than one row having both the same date and the same lowest code, then I need to keep the row that also has been in program (prog) B. For example;
| ID | DATE | CODE | PROG|
--------------------------------
| 1 | 1996-08-16 | 24 | A |
| 1 | 1997-06-02 | 123 | A |
| 1 | 1997-06-02 | 123 | B |
| 1 | 1997-06-02 | 211 | B |
| 1 | 1997-08-19 | 67 | A |
| 1 | 1997-08-19 | 23 | A |
So my desired output would look like this;
| ID | DATE | CODE | PROG|
--------------------------------
| 1 | 1996-08-16 | 24 | A |
| 1 | 1997-06-02 | 123 | B |
| 1 | 1997-08-19 | 23 | A |
I'm struggling to come up with a solution to this, so any help greatly appreciated!
Microsoft SQL Server 2012 (X64)
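The following works with your test data. Note that MIN and MAX are computed independently of each other, so on other data the query can pair a code with a prog that comes from a different row.
SELECT ID, date, MIN(code) AS code, MAX(prog) AS prog FROM table
GROUP BY ID, date
You can then use the results of this query to create or populate a new table, or to delete all records not returned by this query.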
SQLFiddle http://sqlfiddle.com/#!9/0ebb5/5
You can use the min() function:
select ID, DATE, min(CODE), max(PROG)
from table
group by ID, DATE
I assume that your table has a valid primary key. However, I would recommend that you take ID as the primary key. Hope this helps.
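An alternative that always keeps a whole existing row is to number the rows within each (ID, date) group and delete everything except the first. The sketch below uses Python's built-in sqlite3 module and needs a SQLite build with window functions (3.25 or later). The table name admissions is hypothetical. SQL Server 2012 also supports ROW_NUMBER(), but the rowid-based DELETE shown here is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admissions (id INTEGER, date TEXT, code INTEGER, prog TEXT)"
)
conn.executemany("INSERT INTO admissions VALUES (?, ?, ?, ?)", [
    (1, "1996-08-16", 24, "A"),
    (1, "1997-06-02", 123, "A"),
    (1, "1997-06-02", 123, "B"),
    (1, "1997-06-02", 211, "B"),
    (1, "1997-08-19", 67, "A"),
    (1, "1997-08-19", 23, "A"),
])

# Rank rows per (id, date): lowest code first, and prog 'B' ahead of any
# other prog when codes tie. Every row ranked after the first is deleted.
conn.execute("""
    DELETE FROM admissions
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (
                       PARTITION BY id, date
                       ORDER BY code,
                                CASE WHEN prog = 'B' THEN 0 ELSE 1 END
                   ) AS rn
            FROM admissions
        )
        WHERE rn > 1
    )
""")

remaining = conn.execute(
    "SELECT id, date, code, prog FROM admissions ORDER BY id, date"
).fetchall()

for row in remaining:
    print(row)
```

If two rows tie on both code and prog, ROW_NUMBER() still keeps exactly one of them, chosen arbitrarily.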

How to Count the same field with different criteria on the same Query

I have a database like this
| Contact | Incident | OpenTime | Country | Product |
| C1 | | 1/1/2014 | MX | Office |
| C2 | I1 | 2/2/2014 | BR | SAP |
| C3 | | 3/2/2014 | US | SAP |
| C4 | I2 | 3/3/2014 | US | SAP |
| C5 | I3 | 3/4/2014 | US | Office |
| C6 | | 3/5/2014 | TW | SAP |
I want to run a query with criteria on country and open time, and I want to get back something like this:
| Product | Contacts with | Incidents |
| | no Incidents | |
| Office | 1 | 1 |
| SAP | 2 | 2 |
I can easily get one part to work with a query like
SELECT Service, count(
FROM database
WHERE criterias AND Incident is Null //(or Not Null) depending on the row
GROUP BY Product
What I am struggling to do is counting Incident is Null, and Incident is not Null on the same table as a result of the same query as in the example above.
I have tried the following
SELECT Service AS Service,
(SELECT count Contacts FROM Database Where Incident Is Null) as Contact,
(SELECT count Contacts FROM Database Where Incident Is not Null) as Incident
FROM database
WHERE criterias AND Incident is Null //(or Not Null) depending on the row
GROUP BY Product
The issue I have with the above sentence is that whatever criteria I use on the "main" select are ignored by the nested Selects.
I have tried using UNION ALL as well, but did not manage to make it work.
Ultimately I resolved it with this approach: I counted the total contacts per product, counted the numbers of incidents and added a calculated field with the result
SELECT Service, COUNT (Contact) AS Total, COUNT (Incident) as Incidents,
(Total - Incident) as Only Contact
From Database
Where <criterias>
GROUP BY Service
Although I made it work, I am still sure that there is a more elegant approach.
How can I retrieve different counts of the same column, each with its own criteria, in one query?
Just use conditional aggregation:
SELECT Product,
SUM(IIF(incident is not null, 1, 0)) as incidents,
SUM(IIF(incident is null, 1, 0)) as noincidents
FROM database
WHERE criterias
GROUP BY Product;
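The conditional-aggregation idea can be checked against the sample data with Python's built-in sqlite3 module. This is a sketch in the SQLite dialect, so it uses CASE in place of IIF. It assumes blank Incident cells are NULL, and it leaves out the country and open-time criteria because the question does not spell them out. The table name contacts is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE contacts (contact TEXT, incident TEXT, open_time TEXT,
                           country TEXT, product TEXT)
""")
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?, ?, ?)", [
    ("C1", None, "1/1/2014", "MX", "Office"),
    ("C2", "I1", "2/2/2014", "BR", "SAP"),
    ("C3", None, "3/2/2014", "US", "SAP"),
    ("C4", "I2", "3/3/2014", "US", "SAP"),
    ("C5", "I3", "3/4/2014", "US", "Office"),
    ("C6", None, "3/5/2014", "TW", "SAP"),
])

# Each SUM adds 1 only for rows that satisfy its own condition, so both
# counts come out of one pass over the same filtered rows.
counts = conn.execute("""
    SELECT product,
           SUM(CASE WHEN incident IS NULL     THEN 1 ELSE 0 END) AS no_incidents,
           SUM(CASE WHEN incident IS NOT NULL THEN 1 ELSE 0 END) AS incidents
    FROM contacts
    GROUP BY product
    ORDER BY product
""").fetchall()

for row in counts:
    print(row)
```

A WHERE clause added to this query applies to both counts, which avoids the problem of nested SELECTs ignoring the outer criteria.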
Possibly a very MS Access solution would suit:
TRANSFORM Count(tmp.Contact) AS CountOfContact
SELECT tmp.Product
FROM tmp
GROUP BY tmp.Product
PIVOT IIf(Trim([Incident] & "")="","No Incident","Incident");
The test Trim([Incident] & "")="" inside the IIf covers Null, empty-string, and space-only values.
tmp is the name of the table.