Calculate First Time Buyer and Repeating Buyers using MAQL Queries in GOOD-DATA platform - gooddata

I have been recently working with GOOD-DATA platform. I don't have that much experience in MAQL, but I am working on it. I did some metric and reports in GOOD-DATA platform. Recently I tried to create a metric for calculating Total Buyers,First Time Buyers and Repeating Buyers. I created these three reports and working perfect.But when i try to add a order date parent filter the first time buyers and repeating buyers value getting wrong. please have Look at Following queries.
I can find out the correct values using sql queries.
MAQL Queries:
TOTAL ORDERS-
SELECT COUNT(NexternalOrderNo) BY CustomerNo WITHOUT PF
TOTAL FIRSTTIMEBUYERS-
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF=1) WITHOUT PF
TOTAL REPEATINGBUYERS-
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF>1) WITHOUT PF
Can any one suggest a logic for finding these values using MAQL

It's not clear what you want to do. If you could provide more details about the report you need to get, it would be great.
It's not necessary to put "without pf" into the metrics. This clause bans filter application, so when you remove it, the parent filter will be used there. And you will probably get what you want. Specifically, modify this:
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF>1) WITHOUT PF
to:
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF>1)

The only thing you miss here is "ALL IN ALL OTHER DIMENSIONS" aka "ALL OTHER".
This keyword locks and overrides all attributes in all other dimensions—keeping them from having any affect on the metric. You can read about it more in MAQL Reference Guide.
FIRSTTIMEBUYERS:
SELECT COUNT(CustomerNo)
WHERE (SELECT IFNULL(COUNT(NexternalOrderNo), 0) BY Customer ID, ALL OTHER) = 1
REPEATINGBUYERS:
SELECT COUNT(CustomerNo)
WHERE (SELECT IFNULL(COUNT(NexternalOrderNo), 0) BY Customer ID, ALL OTHER) > 1

Related

Running Sum Query

I'm new to Access and I'm trying to develop my own Inventory Management Database.
I'm trying to make a query that could display a running total of the Inventory on Hand as of a specific date. This is how my table looks:
It's sorted according to ITEM_ID then TRANDATE in ascending order. I'd like to add a calculated field beside the NET field that would show a running total of the specific ITEM_ID after a specific date. Negative numbers in the NET field represent a sale while the positive ones represent a purchase. I tried using the DSUM function as it is widely recommended in creating a running sum field. My expression is this
DSum([NET],"InvtyTransT", "[ITEM_ID]=" & [ITEM_ID] And "[TRANDATE]<=#" & [TRANDATE] & "#"). But it only shows the total of the NET field (6827) in each record like this:
What I needed is like this:
(I used an IF function in excel to compute this)
Please help. I think I might have missed something in my expression. I've tried revising it several times and it would always give me the same wrong answer in every record.
Thanks in advance.
Try correlated sub-query.
SELECT t.*, (SELECT SUM(t2.NET)
from InvtyTransT as t2
WHERE (t2.TRANDATE <= t.TRANDATE AND t2.ITEM_ID = t.ITEM_ID)
) AS rSUM
FROM InvtyTransT AS t;

Quick one on Big Query SQL-Ecommerce Data

I am trying to replicate the Google Analyitcs data in Big Query but couldnt do that.
Basically I am using Custom Dimension 40 (user subscription status)
but I am getting wrong numbers in BQ.
Can someone help me on this?
I am using this query but couldn't find it out the exact one.
SELECT
(SELECT value FROM hits.customDimensions where index=40) AS UserStatus,
COUNT(hits.transaction.transactionId) AS Unique_Purchases
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA, --new rollup
UNNEST(GA.hits) AS hits
WHERE
(SELECT value FROM hits.customDimensions where index=40) IN ("xx001","xxx002")
GROUP BY 1
I am getting this from big query which is wrong.
I have check out the dates also but dont know why its wrong.
Your question is rather unclear. But because you want something to be unique and numbers are mysteriously not what you want, I would suggest using COUNT(DISTINCT):
COUNT(DISTINCT hits.transaction.transactionId) AS Unique_Purchases
As far as I understand, you imported Google Analytics data into Bigquery and you are trying to group the custom dimension with index 40 and values ("xx001","xxx002") in order to know how many hit transactions were performed in function of these dimension values.
Replicating your scenario and trying to execute the query you posted, I got the following error.
However, I created a query that could help with your use-case. At first, it selects the transactionId and dimension values with the transactionId different from null and with index value equal to 40, then the grouping is done by the dimension value, filtered with values equals to "xx001"&"xxx002".
WITH tx AS (
SELECT
HIT.transaction.transactionId,
CD.value
FROM
`xxxxxxxxxxxxx.ga_sessions_2020*` AS GA,
UNNEST(GA.hits) AS HIT,
UNNEST(HIT.customDimensions) AS CD
WHERE
HIT.transaction.transactionId IS NOT NULL
AND
CD.index = 40
)
SELECT tx.value AS UserStatus, count(tx.transactionId) AS Unique_Purchases
FROM tx
WHERE tx.value IN ("xx001","xx002")
GROUP BY tx.value
For further details about the format and schema of the data that is imported into BigQuery, I found this document.

GROUP BY/SELECT DISTINCT issues

I have gone through dozens of questions on this site and others, but am still having issues adding a GROUP BY to my code. One came close, but not sure I fully understand it:
SQL/mysql - Select distinct/UNIQUE but return all columns?
I'm working on a program that tracks tracking numbers for orders shipped, but need each sales order only listed once as I need to assign a status to each order. PackageReferenceNo1 contains the sales order numbers, but I only need one tracking number for each order. The following code pulls all tracking numbers and their sales orders. My thought was if I could SELECT DISTINCT, or GROUP BY, I could resolve this issue but haven't had any luck with either option.
My working code is:
SELECT tblImportUPS.ManifestDate,
tblImportUPS.TrackingNumber,
tblImportUPS.PackageReferenceNo1,
tblImportUPS.PackageReferenceNo2,
tblImportUPS.STATUS,
tblTrackingLinks.Notes,
tblImportUPS.ShipToName,
tblTrackingLinks.PGId,
tblTrackingLinks.BBReleased,
tblTrackingLinks.BBReleaseDate,
tblTrackingLinks.SRFNumber,
tblImportUPS.ShipToCity,
tblImportUPS.[ShipToState/Province],
tblImportUPS.ShipToCountry,
tblImportUPS.ShipperName,
tblImportUPS.NumberofPieces,
tblImportUPS.Weight,
tblImportUPS.ScheduledDelivery,
tblImportUPS.DateDelivered
FROM tblTrackingLinks
INNER JOIN tblImportUPS ON (tblTrackingLinks.TrackingNumber = tblImportUPS.TrackingNumber)
AND (tblTrackingLinks.TrackingNumber = tblImportUPS.TrackingNumber)
WHERE (((tblImportUPS.PackageReferenceNo1) LIKE "*56*")
AND ((tblImportUPS.STATUS) <> "Void"))
ORDER BY tblImportUPS.ManifestDate DESC;
This works fine, but repeats multiple tracking numbers for each order. I try to add GROUP BY tblImportUPS.PackageReferenceNo1 before the ORDER BY line and receive an aggregate error.
Can anyone advise the correct way to proceed and why? I'd prefer to understand and not just receive the solution.

Access 2013 SQL to perform linear interpolation where necessary

I have a database in which there are 13 different products, sold in 6 different countries.
Prices increase once a year.
Prices need to be calculated using a linear interpolation method.  I have 21 different price and quantity increments for each product for each country for each year.
The user needs to be able to see how much an order would cost for any given value (as you would expect).
What the database needs to do (in English!) is to:
If there is a matching quantity from TblOrderDetail in the TblPrices,
use the price for the current product, country and year
if there isn't a matching quantity but the quantity required is greater than 1000 for one product (GT) and greater than 100 for every other product:
Find the highest quantity for the product, country and year (so, 1000 or 100, depending on the product), and calculate a pro-rated price.  eg.  If someone wanted 1500 of product GT for the UK for 2015, we'd look at the price for 1000 GT in the UK for 2015 and multiply it by 1.5.  If 1800 were required, we'd multiply it by 1.8.  I haven't been able to get this working yet as I'm looking at it alongside the formula for the next possibility...
If there isn't a matching quantity and the quantity required is less than 1000 for the product GT but 100 for the other products (this is the norm)...
Find the quantity and price for the increment directly below the quantity required by the user for the required product, country and year (let's call these quantitybelow and pricebelow)
Find the quantity and price for the increment directly above the quantity required by the user for the required product, country and year (let's call these quantityabove and priceabove)
Calculate the price for the required number of products for an account holder in a particular country for a given year using this formula.
ActualPrice: PriceBelow + ((PriceAbove - PriceBelow) * (The quantity required in the order detail - QuantityBelow) / (QuantityAbove - QuantityBelow))
I have spent days on this and have sought advice about this before but I am still getting very stuck.
The tables I've been working with to try and make this work are as follows:
TblAccount (primary key is AccountID, it also has a Country field which joins to the TblCountry.Code (primary key)
TblOrders (primary key is Order ID) which joins to TblAccount via the AccountID field; TblOrderDetail via the OrderID.  This table also holds the OrderDate and Recipient ID which links to a person in TblContact - I don't need that here but will need it later to generate an invoice 
TblOrderDetail (primary key is DetailID) which joins to TblOrders via OrderID field; TblProducts via ProductID field, and holds the Quantity required as well as the product
TblProducts (primary key is ProductCode) which as well as joining to TblOrderDetail, also joins to TblPrice via the Product field
TblPrices links to the TblProducts (as you have just read).  I've also created an Alias for the TblCountry (CountryAliasForProductCode) so I can link it to the TblPrices to show the country link. I'm not sure if I needed to do this - it doesn't work if I do or I don't do it, so I seek guidance again here.
This is the code I've been trying to use (and failing) to get my price and quantity steps above and I hope to replicate it, making a couple of tweaks to get the steps below:
SELECT MIN(TblPrices.stepquantity) AS QuantityAbove, MIN(TblPrices.StepPrice) AS PriceAbove, TblOrders.OrderID, TblOrders.OldOrderID, TblOrders.AccountID, TblOrders.OrderDate, TblOrders.RecipientID, TblOrders.OrderStatus, TblOrderDetail.DetailID, TblOrderDetail.Product, TblOrderDetail.Quantity
FROM (TblCountry INNER JOIN ((TblAccount INNER JOIN TblOrders ON TblAccount.AccountID = TblOrders.AccountID) INNER JOIN (TblOrderDetail INNER JOIN TblProducts ON TblOrderDetail.Product = TblProducts.ProductCode) ON TblOrders.OrderID = TblOrderDetail.OrderID) ON TblCountry.Code = TblAccount.Country) INNER JOIN (TblCountry AS CountryAliasForProduct INNER JOIN TblPrices ON CountryAliasForProduct.Code = TblPrices.CountryCode) ON TblProducts.ProductCode = TblPrices.Product
WHERE (StepQuantity >= TblOrderDetails.Quantity)
AND (TblPrices.CountryCode = TblAccount.Country)
AND (TblOrderDetail.Product = TblPrices.Product)
AND (DATEPART('yyyy', TblPrices.DateEffective) = DATEPART('yyyy', TblOrders.OrderDate));
I've also tried...
I've even tried going back to basics and trying again to generate the steps below in 1 query, then try the steps above in another and finally, create the final calculation in another query.
This is what I have been trying to get my prices and quantities below:
SELECT Max(StepQuantity) AS quantity_below, Max(StepPrice) AS price_below, TblOrderDetails.Quantity, TblAccounts.Country
FROM 
(TblProducts INNER JOIN TblPrices ON TblProducts.ProductCode = TblPrices.Product)
(TblOrderDetail INNER JOIN TblProducts ON TblOrderDetail.Product = TblProducts.ProductCode)
(TblOrders INNER JOIN TblOrderDetail ON TblOrders.OrderID = TblOrderDetail.OrderID)
(TblAccount INNER JOIN TblOrders ON TblAccount.AccountID = TblOrders.AccountID),
WHERE (((TblPrices.StepQuantity)<=(TblOrderDetail.Quantity)) AND ((TblPrices.CountryCode)=([TblAccounts].[country])) AND ((TblPrices.Product)=([TblOrderDetail].[product])) AND ((DatePart('yyyy',[TblPrices].[DateApplicable]))=(DatePart('yyyy',[TblOrders].[OrderDate]))));
You may be able to see glaring errors in this but I'm afraid I can't.  I've tried re-jigging it and I'm getting nowhere.
I need to be able to tie the information in to the OrderDetail records as the price generated will need to be added to a financial transactions table as a debit amount and will show as an amount owing on statements.
I'm really not very good at SQL.  I've read and worked though several self-study books and I have asked part of this question before; but I really am struggling with it.  If anyone has any ideas on how to proceed, or even where I've gone wrong with my code, I'd be delighted, even if you tell me I shouldn't be using SQL. For the record, I originally posted this question on a different forum under Visual Basic. Responses from that forum brought me to SQL - however, anything that works would be good!
I've even tried, using Excel, concatenating the Year&Product&Country&Quantity to get a unique product code, interpolating the prices for every quantity between 1 and 1000 for each product, country and year and bringing them into a TblProductsAndPrices table. In Access, I created a query to concatenate the Year(of order date from tblOrders)&Product(of tblorderdetails)&Country(of tblAccount) in order to get the required product code for the order. Another query would find a price for me. However, any product code that doesn't appear on the list (such as where a quantity isn't listed in the tblProductsAndPrices as it is larger than the highest price increment) doesn't have a price.
If there was a workable solution to what I've just described that would generate a price for everything, then I'd be so pleased.
I'd really like to be able to generate an order for any quantity of any product for any account based in any country on any date and retrieve a price which will be used to "debit" a financial account in the database, who in a transaction history for an account and appear on statements. I'd also like to be able to do an ad-hoc price check on the spot.
Thank you very much for taking the time to read this.  I really appreciate it. If you could offer any help or words of encouragement, I'd be very grateful.
Many thanks
Karen
Maybe no one thinks on an easy solution to the problem, since not all minds work in database thinking.
Easy solution: Create one view that gives all calculated values, not only the final one you need, each one as a column. Then you can use such view in a relation view and use on some rows one of the values and on other rows other values, etc.
How to think is simple, think in reverse order, instead of thinking "if that then I need to calculate such else I need this other", think as "I need "such" and I need "this other", both are columns of an intermediate view, then think on top level "if" that would be another view, such view will select the correct value ignoring the rest.
Never ever try to solve all in one step, that can be a really big headache.
Pros: You can isolate calculated values (needed or not), sql is much more easy to write and maintain.
Cons: Resources use is bigger than minimal, but most of times that extra calculated values does not represent a really big impact.
In terms of tutorial out there: Instead of a Top-Down method, use a Down-Top method.
Sometimes it is better (with your example) to calculate all three values (you write sentences on bold) ignoring the if part, and have all three possible values for your order and after that discard the ones not wanted, than trying to only calculate one.
Trying to calculate only one is thinking as a procedural programming, when working with databases most times one must get rid of such thinking and think as reverse, first do the most internal part of such procedural programming to have all data collected, then do the external selection of the procedural programing.
Note: If one of the values can not be calculated, just generate a Null.
I know it is hard to think on First in, last out (Down-Top) model, but it is great for things as the one you want.
Step1 (on specific view, or a join from one view per calculation):
Calculate column 1 as price for the current product, country and
year
Calculate column 2 as calculate a pro-rated price as if 1000
Calculate column 3 as calculate a pro-rated price as if 100
Calculate column 4 as etc
Calculate column N as etc
Step 2 (Another view, the one you want):
Calculate the if part, so you can choose adequate column from previous view (you can use immediately if or a calculated auxiliary field).
Hope you can follow theese way of thinking, I have solved a lot of things like that one (and more complex) thinking in that way, but it is not easy to think as that, needs an extra effort.

Using distinct and then an aggregate function in Postgresql?

This is a pretty basic problem and for whatever reason I can't find a reasonable solution. I'll do my best to explain.
Say you have an event ticket (section, row, seat #). Each ticket belongs to an attendee. Multiple tickets can belong to the same attendee. Each attendee has a worth (ex: Attendee #1 is worth $10,000). That said, here's what I want to do:
1. Group the tickets by their section
2. Get number of tickets (count)
3. Get total worth of the attendees in that section
Here's where I'm having problems: If Attendee #1 is worth $10,000 and is using 4 tickets, sum(attendees.worth) is returning $40,000. Which is not accurate. The worth should be $10,000. Yet when I make the result distinct on the attendee, the count is not accurate. In an ideal world it'd be nice to do something like
select
tickets.section,
count(tickets.*) as count,
sum(DISTINCT ON (attendees.id) attendees.worth) as total_worth
from
tickets
INNER JOIN
attendees ON attendees.id = tickets.attendee_id
GROUP BY tickets.section
Obviously this query doesn't work. How can I accomplish this same thing in a single query? OR is it even possible? I'd prefer to stay away from sub queries too because this is part of a much larger solution where I would need to do this across multiple tables.
Also, the worth should follow the ticket divided evenly. Ex: $10,000 / 4. Each ticket has an attendee worth of $5,000. So if the tickets are in different sections, they take their prorated worth with them.
Thanks for your help.
You need to aggregate the tickets before the attendees:
select ta.section, sum(ta.numtickets) as count, sum(a.worth) as total_worth
from (select attendee_id, section, count(*) as numtickets
from tickets
group by attendee_id, section
) ta INNER JOIN
attendees a
ON a.id = ta.attendee_id
GROUP BY ta.section
You still have a problem of a single attendee having seats in multiple sections. However, you do not specify how to solve that (apportion the worth? randomly choose one section? attribute it to all sections? canonically choose a section?)
Using jsonb_object_agg:
select
tickets.section,
count(tickets.*) as count,
(
SELECT SUM(value::int4)
FROM jsonb_each_text(jsonb_object_agg(attendees.id, attendees.worth))
) as total_worth
from
tickets
INNER JOIN
attendees ON attendees.id = tickets.attendee_id
GROUP BY tickets.section