COUNT Clicks/Opens for Engagement Scoring

I am a bit rusty on SQL so any assistance is appreciated. I am also referencing my SQL textbook but I thought I would try this out.
I am developing a lead scoring model starting with engagement scoring. I created a data extension to house the results and used the following query to populate:
SELECT a.[opportunityid],
a.[first name],
a.[last name],
a.[anticipatedentryterm],
a.[funnelstage],
a.[programofinterest],
a.[opportunitystage],
a.[opportunitystatus],
a.[createdon],
a.[ownerfirstname],
a.[ownerlastname],
a.[f or j visa student],
a.[donotbulkemail],
a.[statecode],
Count(DISTINCT c.[subscriberkey]) AS 'Clicks',
Count(DISTINCT b.[subscriberkey]) AS 'Opens',
Count(DISTINCT b.[subscriberkey]) * 1.5 +
Count(DISTINCT c.[subscriberkey]) * 3 AS 'Probability'
FROM [ug_all_time_joined] a
INNER JOIN [open] b
ON a.[opportunityid] = b.[subscriberkey]
INNER JOIN [click] c
ON a.[opportunityid] = c.[subscriberkey]
GROUP BY a.[opportunityid],
a.[first name],
a.[last name],
a.[anticipatedentryterm],
a.[funnelstage],
a.[programofinterest],
a.[opportunitystage],
a.[opportunitystatus],
a.[createdon],
a.[ownerfirstname],
a.[ownerlastname],
a.[f or j visa student],
a.[donotbulkemail],
a.[statecode]
Something is wrong with my COUNT functions: the query populates the same value in both Clicks and Opens, and I don't think it's accurate. The result I am aiming for is how many times a subscriber key appears (which would correspond to the individual clicks/opens, since each row is one action).
Thank you!

Why is that surprising?
You have two joins that, taken to their logical conclusion, imply that
b.[SubscriberKey] = c.[SubscriberKey]
Hence, counting distinct values will give the same result for both.
You have not provided sample data or desired results. I can speculate, though, that you intend LEFT JOINs so that you keep values in one table that are not matched in the other.
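To illustrate the LEFT JOIN idea, here is a minimal sketch, not the asker's actual setup. It uses Python's built-in sqlite3 module as a stand-in for the real database, and the table names (leads, opens, clicks) and sample rows are made up. Each activity table is counted per subscriber key in its own subquery before joining, so the two counts cannot multiply each other, and the LEFT JOINs keep leads with no activity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads  (opportunityid TEXT PRIMARY KEY);
CREATE TABLE opens  (subscriberkey TEXT);
CREATE TABLE clicks (subscriberkey TEXT);
INSERT INTO leads  VALUES ('A'), ('B'), ('C');
INSERT INTO opens  VALUES ('A'), ('A'), ('A'), ('B');
INSERT INTO clicks VALUES ('A'), ('A');
""")

# Count rows per key first, then join the two small result sets to the leads.
SCORE_SQL = """
SELECT l.opportunityid,
       COALESCE(o.n_opens, 0)  AS opens,
       COALESCE(c.n_clicks, 0) AS clicks,
       COALESCE(o.n_opens, 0) * 1.5 + COALESCE(c.n_clicks, 0) * 3 AS probability
FROM leads l
LEFT JOIN (SELECT subscriberkey, COUNT(*) AS n_opens
           FROM opens GROUP BY subscriberkey) o
       ON o.subscriberkey = l.opportunityid
LEFT JOIN (SELECT subscriberkey, COUNT(*) AS n_clicks
           FROM clicks GROUP BY subscriberkey) c
       ON c.subscriberkey = l.opportunityid
ORDER BY l.opportunityid
"""

scores = conn.execute(SCORE_SQL).fetchall()
for row in scores:
    print(row)
```

Lead C has no opens or clicks and still appears with zeros, which an INNER JOIN would not give you.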

When you inner join a to b and then a to c, only the rows that have a match in both b and c survive, which will give you incorrect results. Having no view of your data and no background on your tables, this is the best guess I have.

Related

SQL report filtering, results stop after first item

Thanks in advance. I am a newbie, working on one of my first reports. I have orders, each of which has a terminal assigned to it (a "DC"). The report is set up to return all open orders, the "DC", and a few other columns (driver #, city, etc.). I made a drop-down filter so I can look at one, several, or all of the DCs. My problem is that it stops looking after the first item that is checked in the drop-down list. So if the first item in the list has 100 orders, but the rest of them have thousands more, it only shows me the 100 orders. Am I making any sense here? I am not sure what information from my report's setup would be pertinent here.
This is the query that the report is based on. Using SQL Report Builder.
SELECT
o.OrderTrackingID,
cm.accountno,
o.ClientRefNo,
o.DCoName,
o.DStreet,
o.DCity,
o.DState,
o.DZip,
o.DZone,
t.TerminalName as 'OrderDC',
e.LastName as 'DrvLast',
e.FirstName as 'DrvFirst',
e.DriverNo,
et.TerminalName as 'DriverDC'
FROM Orders o
FULL JOIN OrderDrivers od ON o.OrderTrackingID = od.OrderTrackingID
FULL JOIN Employees e ON od.DriverID = e.ID
FULL JOIN ClientMaster cm ON o.ClientID = cm.ClientID
FULL JOIN Terminals t ON o.TerminalID = t.TerminalID
FULL JOIN Terminals et ON e.TerminalID = et.TerminalID
WHERE o.Status = 'N'
Order By o.aTimeStamp ASC

(I am writing this as an answer even though it isn't a complete answer, mostly because the comment field is kind of limited.)
In the SQL you posted, the part below stands out as wrong:
FULL JOIN Terminals t ON o.TerminalID = t.TerminalID
FULL JOIN Terminals et ON e.TerminalID = et.TerminalID
You are joining the same table twice, but there is nothing that separates the two joins, and my guess is that this is why you are not getting any more orders in your report.
I don't know what the drop-down list corresponds to, but I assume it is some kind of identifier in the Terminals table.
From a pure SQL point of view I would expect something like this
FULL JOIN Terminals t ON o.TerminalID = t.TerminalID
WHERE t.someColumn IN (value1, value2)
where value1 and value2 come from the drop-down list.
I see in your select list that you include the same column from both of the Terminals joins, and I expect those two columns to always have the same values. You should need that column only once in your select list.
Not a solution but maybe this can get you in the right direction.
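As a rough illustration of the IN (value1, value2) filter, here is a small sketch using Python's built-in sqlite3 module as a stand-in for the report's data source. The helper name open_orders_for, the sample rows, and the assumption that the drop-down returns terminal names are all mine, not from the question. The point is only that a multi-select filter needs one placeholder per selected value; comparing with = against a single value would return just one terminal's orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Terminals (TerminalID INTEGER PRIMARY KEY, TerminalName TEXT);
CREATE TABLE Orders (OrderTrackingID INTEGER PRIMARY KEY,
                     TerminalID INTEGER, Status TEXT);
INSERT INTO Terminals VALUES (1, 'North'), (2, 'South'), (3, 'West');
INSERT INTO Orders VALUES (10, 1, 'N'), (11, 2, 'N'), (12, 2, 'N'),
                          (13, 3, 'N'), (14, 1, 'C');
""")

def open_orders_for(terminal_names):
    """Return open orders for every selected terminal (hypothetical helper)."""
    if not terminal_names:
        return []
    # Build one "?" placeholder per selected value: IN (?, ?, ...)
    placeholders = ", ".join("?" for _ in terminal_names)
    sql = f"""
        SELECT o.OrderTrackingID, t.TerminalName
        FROM Orders o
        JOIN Terminals t ON o.TerminalID = t.TerminalID
        WHERE o.Status = 'N'
          AND t.TerminalName IN ({placeholders})
        ORDER BY o.OrderTrackingID
    """
    return conn.execute(sql, list(terminal_names)).fetchall()

selected = open_orders_for(["North", "South"])
print(selected)
```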

Include missing years in Group By query

I am fairly new to Access and SQL programming. I am trying to do the following:
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
and group by year even when there is no amount in some of the years. I would like to have these years listed as well for a report with charts. I'm not certain if this is possible, but every bit of help is appreciated.
My code so far is as follows:
SELECT
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
FROM
Base_CustomerT
INNER JOIN (
SO_SalesOrderPaymentHistoryLineT
INNER JOIN SO_SalesOrderT
ON SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId
) ON Base_CustomerT.CustomerId = SO_SalesOrderT.CustomerId
GROUP BY
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
SO_SalesOrderPaymentHistoryLineT.PaymentType,
Base_CustomerT.IsActive
HAVING
(((SO_SalesOrderPaymentHistoryLineT.PaymentType)=1)
AND ((Base_CustomerT.IsActive)=Yes))
ORDER BY
Base_CustomerT.SalesRep,
Base_CustomerT.Customer;

You need another table with all years listed; you can create it on the fly or keep one in the database, and then join from it. If you had a table called alltheyears with a column called y that just listed the years, you could use code like this:
WITH minmax as
(
select min(year(SO_SalesOrderPaymentHistoryLineT.DatePaid)) as minyear,
max(year(SO_SalesOrderPaymentHistoryLineT.DatePaid)) as maxyear
from SO_SalesOrderPaymentHistoryLineT
), yearsused as
(
select y
from alltheyears, minmax
where alltheyears.y >= minyear and alltheyears.y <= maxyear
)
select *
from yearsused
left join ( /* your query above goes here */ ) T
ON year(T.DatePaid) = yearsused.y

You need a data source that will provide the year numbers. You cannot manufacture them out of thin air. Supposing you had a table Interesting_year with a single column year, populated, say, with every distinct integer between 2000 and 2050, you could do something like this:
SELECT
base.SalesRep,
base.CustomerId,
base.Customer,
base.year,
Sum(NZ(data.Amount)) AS [Sum Of PaymentPerYear]
FROM
(SELECT * FROM Base_CustomerT INNER JOIN Interesting_year) AS base
LEFT JOIN
(SELECT * FROM
SO_SalesOrderT
INNER JOIN SO_SalesOrderPaymentHistoryLineT
ON (SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId)
WHERE (SO_SalesOrderPaymentHistoryLineT.PaymentType = 1)
) AS data
ON ((base.CustomerId = data.CustomerId)
AND (base.year = Year(data.DatePaid)))
WHERE
(base.IsActive = Yes)
AND (base.year BETWEEN
(SELECT Min(year(DatePaid)) FROM SO_SalesOrderPaymentHistoryLineT)
AND (SELECT Max(year(DatePaid)) FROM SO_SalesOrderPaymentHistoryLineT))
GROUP BY
base.SalesRep,
base.CustomerId,
base.Customer,
base.year
ORDER BY
base.SalesRep,
base.Customer;
Note the following:
The revised query first forms the Cartesian product of Base_CustomerT with Interesting_year in order to have base customer data associated with each year (this is sometimes called a CROSS JOIN, but it's the same thing as an INNER JOIN with no join predicate, which is what Access requires).
In order to have result rows for years with no payments, you must perform an outer join (in this case a LEFT JOIN). Where a (base customer, year) combination has no associated orders, the rest of the columns of the join result will be NULL.
I'm selecting the CustomerId from Base_CustomerT because you would sometimes get a NULL if you selected from SO_SalesOrderT as in the starting query.
I'm using the Access Nz() function to convert NULL payment amounts to 0 (from rows corresponding to years with no payments).
I converted your HAVING clause to a WHERE clause. That's semantically equivalent in this particular case, and it will be more efficient because the WHERE filter is applied before groups are formed, and because it allows some columns to be omitted from the GROUP BY clause.
Following Hogan's example, I filter out data for years outside the overall range covered by your data. Alternatively, you could achieve the same effect without that filter condition and its subqueries by ensuring that table Interesting_year contains only the year numbers for which you want results.
Update: modified the query to a different, but logically equivalent "something like this" that I hope Access will like better. Aside from adding a bunch of parentheses, the main difference is making both the left and the right operand of the LEFT JOIN into a subquery. That's consistent with the consensus recommendation for resolving Access "ambiguous outer join" errors.
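For anyone who wants to see the years-table technique run end to end, here is a small sketch. It uses Python's built-in sqlite3 module rather than Access, so the date function (strftime instead of Year) and COALESCE (instead of Nz) differ from the queries above, and the tables customers, years, and payments with their sample rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (CustomerId INTEGER PRIMARY KEY, Customer TEXT);
CREATE TABLE years (y INTEGER PRIMARY KEY);
CREATE TABLE payments (CustomerId INTEGER, DatePaid TEXT, Amount REAL);
INSERT INTO customers VALUES (1, 'Acme');
INSERT INTO years VALUES (2019), (2020), (2021), (2022);
INSERT INTO payments VALUES (1, '2019-03-01', 100.0),
                            (1, '2019-09-15', 50.0),
                            (1, '2021-06-30', 75.0);
""")

# Pair every customer with every year, then LEFT JOIN the payments so that
# years without payments survive with a total of 0.
PER_YEAR_SQL = """
SELECT c.Customer,
       yr.y AS RevenueYear,
       COALESCE(SUM(p.Amount), 0) AS SumOfPaymentPerYear
FROM customers c
CROSS JOIN years yr
LEFT JOIN payments p
       ON p.CustomerId = c.CustomerId
      AND CAST(strftime('%Y', p.DatePaid) AS INTEGER) = yr.y
WHERE yr.y BETWEEN
      (SELECT MIN(CAST(strftime('%Y', DatePaid) AS INTEGER)) FROM payments)
  AND (SELECT MAX(CAST(strftime('%Y', DatePaid) AS INTEGER)) FROM payments)
GROUP BY c.Customer, yr.y
ORDER BY c.Customer, yr.y
"""

per_year = conn.execute(PER_YEAR_SQL).fetchall()
for row in per_year:
    print(row)
```

2020 has no payments and still shows up with a total of 0, while 2022 is dropped because it lies outside the range covered by the data.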

Thank you, John, for your help. I found a solution that works for me. It looks quite different, but I learned a lot from it. If you are interested, here is how it looks now.
SELECT DISTINCTROW
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
FROM
Base_Customer_RevenueYearQ
LEFT JOIN CustomerPaymentPerYearQ
ON (Base_Customer_RevenueYearQ.RevenueYear = CustomerPaymentPerYearQ.[RevenueYear])
AND (Base_Customer_RevenueYearQ.CustomerId = CustomerPaymentPerYearQ.CustomerId)
GROUP BY
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
;

SQL LEFT JOIN WHERE not displaying right result

So I got this query:
Data structure:
Users
id---inlog----name----more stuff
llntoets
id---code----inlog----more stuff
oefeningen
id---speler---status----morestuff
(inlog and speler are always the same values for a user)
SELECT
// Some other stuff working
SUM(o.status) AS oefn
FROM users AS u
LEFT JOIN llntoets AS l
ON (u.inlog = l.inlog)
LEFT JOIN oefeningen AS o
ON (u.inlog = o.speler) AND o.status = 'afgewerkt'
WHERE
code = '$code'
GROUP BY l.inlog
ORDER BY klas ASC, klasnr ASC
Everything runs fine except one thing: the oefn value. Sometimes it shows the correct value and sometimes it shows a value that is much higher than it should be. Someone told me it could be because of the GROUP BY. Can someone help me, please?
It is supposed to count the total records from table oefeningen where status = 'afgewerkt' and where the speler is the inlog from users. Thanks; if you have other questions, ask and I will try to explain more.

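The SUM(o.status) in your query does not count the records of table oefeningen.
It adds up the status values of all the joined rows that satisfy your criteria, and because every llntoets row for a user is paired with every matching oefeningen row, that can be a much higher number.
Also note that where a condition sits matters: o.status = 'afgewerkt' is in the ON clause, so that join stays a LEFT JOIN, but code = '$code' in the WHERE clause (if code is a column of llntoets, as your data structure suggests) discards the rows without a match, so that join behaves like a plain JOIN even though you wrote LEFT JOIN throughout the query.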
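Here is a small sketch of how the number can end up too high, using Python's built-in sqlite3 module and invented sample rows (the users anna and bram and the code T1 are assumptions for the demo). When a user has more than one llntoets row, each finished exercise is repeated once per llntoets row, so a plain count is inflated; counting DISTINCT o.id is one way to count each exercise once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, inlog TEXT);
CREATE TABLE llntoets (id INTEGER PRIMARY KEY, code TEXT, inlog TEXT);
CREATE TABLE oefeningen (id INTEGER PRIMARY KEY, speler TEXT, status TEXT);
INSERT INTO users VALUES (1, 'anna'), (2, 'bram');
-- anna has two llntoets rows for the same code, bram has one
INSERT INTO llntoets VALUES (1, 'T1', 'anna'), (2, 'T1', 'anna'), (3, 'T1', 'bram');
INSERT INTO oefeningen VALUES (1, 'anna', 'afgewerkt'), (2, 'anna', 'afgewerkt'),
                              (3, 'anna', 'bezig'), (4, 'bram', 'bezig');
""")

COUNT_SQL = """
SELECT u.inlog,
       COUNT(o.id)          AS inflated,  -- one per joined row
       COUNT(DISTINCT o.id) AS oefn       -- one per finished exercise
FROM users AS u
LEFT JOIN llntoets AS l
       ON u.inlog = l.inlog
LEFT JOIN oefeningen AS o
       ON u.inlog = o.speler AND o.status = 'afgewerkt'
WHERE l.code = ?
GROUP BY u.inlog
ORDER BY u.inlog
"""

counts = conn.execute(COUNT_SQL, ("T1",)).fetchall()
for row in counts:
    print(row)
```

anna has 2 finished exercises, but the joined result holds 2 x 2 = 4 rows for her, which is where the inflated figure comes from.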

Two identical queries that give different results

I have two queries: one is written in ANSI SQL, the other in the Oracle dialect.
I think they should both give the same result set, but they do not. The first query gives 385 rows and the second only 25.
First:
SELECT idclient, cl.surname, sum(sub1.s)
FROM client cl JOIN incomestatement incst USING(idclient)
JOIN (SELECT c.idincome ID, sum(inst.total) AS s
FROM instalment inst JOIN credit c USING(idcredit)
WHERE inst.paydate > c.paydate AND c.isloaned = 1
GROUP BY c.idincome) sub1 ON incst.idincome = sub1.ID
GROUP BY idclient, cl.surname;
Second:
SELECT c.idclient, c.surname, sum(sub.s)
FROM client c, incomestatement inc,
(SELECT sum(inst.total) as s, cr.idincome as id
FROM instalment inst, credit cr
WHERE inst.paydate > cr.paydate AND cr.isloaned = 1 AND cr.idcredit = inst.idcredit
GROUP BY cr.idincome
) sub
WHERE c.idclient = inc.idclient AND inc.income = sub.ID
group by c.idclient, c.surname;
So why don't they give the same result?

I'd approach the problem in steps.
1. Do the two sub-queries produce the same data sets?
   If they do, proceed to step 2.
   If not, then you have two simpler queries to analyze and dissect.
2. Given that the pair of sub-queries produce the same answer, you can then establish whether the Client and IncomeStatement joins give the same results (treat it as another sub-query).
   If they do, proceed to step 3.
   If not, then you have a pair of queries (one with JOIN, one with classic SQL notation) to analyze and dissect.
3. Given that the pair of joins and the pair of sub-queries each produce the same result, analyze why the join of these does not work correctly.
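One way to carry out such a pairwise comparison is to take the set difference of the two result sets in both directions. The sketch below does this with Python's built-in sqlite3 module and EXCEPT (Oracle spells this operator MINUS). The helper diff, the simplified tables, and the sample rows are all made up. The third query joins on income instead of idincome, mirroring a difference that is visible between the two posted queries (incst.idincome = sub1.ID versus inc.income = sub.ID); whether that is the actual cause cannot be confirmed without the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE incomestatement (idincome INTEGER, idclient INTEGER, income INTEGER);
CREATE TABLE credit_totals (id INTEGER, s REAL);
INSERT INTO incomestatement VALUES (1, 100, 5000), (2, 101, 2), (3, 102, 7000);
INSERT INTO credit_totals VALUES (1, 10.0), (2, 20.0), (3, 30.0);
""")

def diff(sql_a, sql_b):
    """Return (rows only in A, rows only in B); EXCEPT ignores duplicate rows."""
    only_a = conn.execute(
        f"SELECT * FROM ({sql_a}) EXCEPT SELECT * FROM ({sql_b})").fetchall()
    only_b = conn.execute(
        f"SELECT * FROM ({sql_b}) EXCEPT SELECT * FROM ({sql_a})").fetchall()
    return sorted(only_a), sorted(only_b)

# Same join written with JOIN ... ON and with the classic comma notation.
ANSI = ("SELECT i.idclient, t.s FROM incomestatement i "
        "JOIN credit_totals t ON i.idincome = t.id")
CLASSIC_SAME = ("SELECT i.idclient, t.s FROM incomestatement i, credit_totals t "
                "WHERE i.idincome = t.id")
# Classic notation again, but joining on a different column.
CLASSIC_OTHER_COLUMN = ("SELECT i.idclient, t.s FROM incomestatement i, credit_totals t "
                        "WHERE i.income = t.id")

print(diff(ANSI, CLASSIC_SAME))
print(diff(ANSI, CLASSIC_OTHER_COLUMN))
```

An empty pair means the two notations agree for that step; any rows that come back point at the part of the query to dissect next.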

Have you committed your changes?
It's possible that some transactions have not been committed, so the results can differ between sessions.

SUM(a*b) not working

I have a PHP page running against Postgres. I have 3 tables: workorders, wo_parts and part2vendor. I am trying to multiply the values of two columns from different tables, i.e. wo_parts has a field called qty and part2vendor has a field called cost. The two tables are joined on wo_parts.pn and part2vendor.pn. I have created a query like this:
$scoreCostQuery = "SELECT SUM(part2vendor.cost*wo_parts.qty) as total_score
FROM part2vendor
INNER JOIN wo_parts
ON (wo_parts.pn=part2vendor.pn)
WHERE workorder=$workorder";
But if I add up the costs of the parts multiplied by the quantities supplied by hand, I get a different number than the script does. Help! I am new to this, but if someone can show me in SQL, I can modify it for Postgres. Thanks.

Without seeing example data, there's no way for us to know why your query totals are coming out differently than when you do the math by hand. It could be a bad join, so you are getting more or fewer records than you expected. It's also possible that your calculations are off. Pick an example with the smallest number of associated records and compare.
My suggestion is to add a GROUP BY to the query:
SELECT SUM(p.cost * wp.qty) as total_score
FROM part2vendor p
JOIN wo_parts wp ON wp.pn = p.pn
WHERE workorder = $workorder
GROUP BY workorder
FYI: MySQL was designed to allow flexibility in the GROUP BY, while no other db I've used does. It's a source of numerous questions on SO along the lines of "why does this work in MySQL when it doesn't work on db x...".
To check that your quantities are correct:
SELECT wp.qty,
p.cost
FROM WO_PARTS wp
JOIN PART2VENDOR p ON p.pn = wp.pn
WHERE p.workorder = $workorder
Check that the numbers are correct for a given order.
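To make the "bad join" possibility concrete, here is a small sketch with Python's built-in sqlite3 module. It assumes, purely for illustration, that part2vendor can hold more than one vendor row per part number; the vendor column and the sample rows are invented. Listing the joined rows next to the total shows where extra rows come from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wo_parts (workorder INTEGER, pn TEXT, qty INTEGER);
CREATE TABLE part2vendor (pn TEXT, vendor TEXT, cost REAL);
INSERT INTO wo_parts VALUES (7, 'bolt', 10), (7, 'nut', 4);
-- 'bolt' is listed for two vendors, so it will match twice in the join
INSERT INTO part2vendor VALUES ('bolt', 'V1', 2.0), ('bolt', 'V2', 2.5),
                               ('nut', 'V1', 1.0);
""")

# Row-by-row view of what the join actually produces for one work order.
detail = conn.execute("""
    SELECT wp.pn, wp.qty, p.vendor, p.cost, wp.qty * p.cost AS line_total
    FROM wo_parts wp
    JOIN part2vendor p ON p.pn = wp.pn
    WHERE wp.workorder = ?
    ORDER BY wp.pn, p.vendor
""", (7,)).fetchall()

# The aggregate over the same join.
total = conn.execute("""
    SELECT SUM(p.cost * wp.qty)
    FROM wo_parts wp
    JOIN part2vendor p ON p.pn = wp.pn
    WHERE wp.workorder = ?
""", (7,)).fetchone()[0]

for row in detail:
    print(row)
print(total)
```

The work order has two parts, but the join returns three rows because the bolt is counted once per vendor, so the total is higher than a hand calculation that uses a single cost per part.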

You could try a sub-query instead.
(Note: I don't have a Postgres installation to test this on, so consider this more like pseudo code than a working example. It does work in MySQL, though.)
SELECT
SUM(p.`score`) AS 'total_score'
FROM part2vendor AS p2v
INNER JOIN (
SELECT pn, cost * qty AS `score`
FROM wo_parts
) AS p
ON p.pn = p2v.pn
WHERE p2v.workorder=$workorder

In the question, you say the cost column is in part2vendor, but in the query you reference wo_parts.cost. If the wo_parts table has its own cost column, that's the source of the problem.
