Counting presence of discount ranges in SQL - sql

I have a simple orders table for a product being sold that has a column with the discount per order. For example:
Order Number DiscountPrcnt
1234 0
1235 10
1236 41
What I would like to do is create an output where I can join the order table to a customer table and group the discounts into ranges by email, as follows:
Email_Address 0-20 20-50 50-100
joe#abc.com Yes Yes
tom#abc.com Yes Yes
So the idea is to determine if each customer (here designated by email) has ever received a discount in the specified range, and if not, it should return NULL for the range.
A simplified version of the table structure is:
Customer Table:
CustID Email
123 joe#abc.com
234 tom#abc.com
456 joe#abc.com
So emails can repeat across customers.
Orders Table:
CustID OrderID Amount DiscPrcnt
123 1234 50.00 0
234 1235 75.00 10
456 1236 20.00 41

select c.email, COUNT(o1.order_number), COUNT(o2.order_number), COUNT(o3.order_number)
from customer c, order o1, order o2, order o3
where c.CustID = o1.CustId
and c.CustID = o2.CustID
and c.CustID = o3.CustID
and o1.DiscountPrcnt > 0
and o1.DiscountPrcnt <= 20
and o2.DiscountPrcnt > 2
and o2.DiscountPrcnt <= 50
and o3.DiscountPrcnt > 50
and o3.DiscountPrcnt <= 100
group by c.email
You'll not get yes as expected but the number of time the user benefits the discount. 0 if none.

Related

SQL - Case when product exists, fill up its corresponding value to all rows within the partition

I have the something like the following monthly data set.
I have a Product, company ID, Date, and Quantity. A company (denoted by Company ID) can buy multiple products. I want to create a new column that will have the quantity of Product 'C' if the company bought in the month at each line item. If Product 'C' is not bought, then return 0.
Product Company_ID Date Quantity Desired_Calculated_Column
A 1 5/1/2019 100 300
B 1 5/1/2019 200 300
C 1 5/1/2019 300 300
A 2 6/1/2019 150 125
B 2 6/1/2019 250 125
C 2 6/1/2019 125 125
A 3 7/1/2019 175 0
B 3 7/1/2019 275 0
I have been trying to partition the data based on Product and Company ID. I have been trying to leverage the LAST_VALUE but haven't been successful.
LAST_VALUE(quantity) OVER (PARTITION BY Date, Company_ID
ORDER BY product_group
) AS Desired_Calculated_Column
You don't want last_value(). You can use conditional aggregation, assuming that 'C' occurs once per group:
MAX(CASE WHEN product_group = 'C' THEN quantity ELSE 0 END) OVER
(PARTITION BY Date, Company_ID) AS C_quantity

Selecting Account Numbers with latest date

I have been trying to solve this problem now for days.
I have table called Stat with the following simplified structure and sample data:
Customer BankID AccNumb Type Date Amount AccType
Customer 1 Boa 5 Account Statement 2015-01-01 10000,00 Eur
Customer 1 CS 10 Account Statement 2015-04-04 22000,00 Eur
Customer 2 Sa 15 Account Statement 2015-03-13 3000,00 Eur
Customer 2 Sa 40 Account Statement 2015-04-24 1000,00 Eur
Customer 2 Sa 15 Sale Advice 2015-04-16 400,00 Eur
Customer 2 Sa 15 Account Statement 2015-12-24 50,00 Usd
Customer 2 Boa 20 Sale Advice 2015-05-15 6000,00 Eur
Customer 3 Cu 25 Account Statement 2015-11-27 81000,00 Eur
Customer 3 Cu 30 Sale Advice 2015-11-27 3000,00 Usd
Customer 3 Pop 30 Account Statement 2015-11-27 12000,00 Eur
What I'm trying to do is to Select the AccountNumber with the latest date specified. A Customer can also have different Account Numbers on various Banks, so it should also be grouped by BankID and Customer.
I have come this far:
SELECT AccNumb, Customer, BankID,
(SELECT TOP 1 Amount FROM Stat
WHERE AccNumb = y.AccNumb AND Customer = y.Customer AND
BankID = y.BankID AND Type = 'Account Statement' AND
Date = MAX(y.Date) GROUP BY Amount) Amount
FROM Stat y
GROUP BY AccNumb, Customer, BankID
ORDER BY Customer, AccNumb
And it works fine, the problem is i should also add the column AccType and Date
I managed to do this with 2 more subselects (the query takes long but it works).
But now i have the problem that there are also NULL values in Customer (or Date) Column. Now, the account number of these 'NULL' Customers still should be displayed if it's the latest date. I also tried to do the same by joining the table by itself, and it didnt work out.
SELECT x.AccNumber, x.Customer, x.BankID, x.Date, y.Amount, y.AccType
FROM Stat y RIGHT JOIN
(SELECT AccNumber, Customer, BankID, MAX(Date) Date FROM Stat
GROUP BY AccNumber, Customer, BankID) x
ON x.AccNumber = y.AccNumber AND
x.Customer = y.Customer AND
x.BankID = y.BankID AND
x.Date = y.Date
ORDER BY y.Customer, y.AccNumber
But now the 'NULL' Customers only have NULL values in the Amount, Date and AccType Columns, which is not correct.
The output should be something like this
AccNumb Customer BankID Amount Date AccType
111111111 a Boa 1234.40 31.06.2014 Eur
222222222 NULL Boa 5678.40 31.04.2014 Eur
333333333 b Boa 0.00 25.02.2014 Eur
444444444 NULL Boa 9101.40 23.04.2015 Eur
555555555 NULL Boa 1213.40 31.02.2014 Usd
A66666666 c Sa NULL 31.02.2014 Eur
777777777 c Sa 1415.00 31.12.2014 Eur
888888888 c Boa 1617.40 31.12.2014 Usd
999999999 f Pop 5678.64 31.10.2014 Eur
Thanks in advance.
Just use row_number(), if I understand correctly:
select s.*
from (select s.*,
row_number() over (partition by customer, bankId order by date desc) as seqnum
from stat s
) s
where seqnum = 1;
Find the rows with latest dates, i.e. return a row if no other row with same AccountNumber, BankID and Customer but a later Date exists:
select *
from stat s1
where not exists (select 1 from stat s2
where s1.AccountNumber = s2.AccountNumber
and s1.BankID = s2.BankID
and s1.Customer = s2.Customer
and s1.Date < s2.Date)
You're first query works closely to what, I believe, you are looking for. Using it as a base, we can alter to work for you:
SELECT
AccNumb,
Customer,
BankID,
Amount,
Date,
AccType
FROM Stat y
WHERE Date = (SELECT MAX(z.DATE)
FROM Stat z
WHERE z.AccNumb = y.AccNumb
AND z.Customer = y.Customer
AND z.BankID = y.BankID AND Type = 'Account Statement')
ORDER BY Customer, AccNumb

Concatenating data from one row into the results from another

I have a SQL Server database of orders I'm struggling with. For a normal order a single table provides the following results:
Orders:
ID Customer Shipdate Order ID
-----------------------------------------------------------------
1 Tom 2015-01-01 100
2 Bob 2014-03-20 200
At some point they needed orders that were placed by more than one customer. So they created a row for each customer and split the record over multiple rows.
Orders:
ID Customer Shipdate Order ID
-----------------------------------------------------------------
1 Tom 2015-01-01 100
2 Bob 2014-03-20 200
3 John
4 Dan
5 2014-05-10 300
So there is another table I can join on to make sense of this which relates the three rows which are actually one order.
Joint.Orders:
ID Related ID
-----------------------------------------------------------------
5 3
5 4
I'm a little new to SQL and while I can join on the other table and filter to only get the data relating to Order ID 300, but what I'd really like is to concatenate the customers, but after searching for a while I can't see how to do this. What'd I'd really like to achieve is this as an output:
ID Customer Shipdate Order ID
----------------------------------------------------------------
1 Tom 2015-01-01 100
2 Bob 2014-03-20 200
5 John, Dan 2014-05-10 300
You should consider changing the schema first. The below query might help you get a feel of how it can be done with your current design.
Select * From Orders Where IsNull(Customer, '') <> ''
Union All
Select ID,
Customer = (Select Customer + ',' From Orders OI Where OI.ID (Select RelatedID from JointOrders JO Where JO.ID = O.ID)
,ShipDate, OrderID
From Orders O Where IsNull(O.Customer, '') = ''

Joining to another table only on the first occurrence of a field

Note: I have tried to simplify the below to make it simpler both for me and for anyone else to understand, the tables I reference below are in fact sub-queries joining a lot of different data together from different sources)
I have a table of purchased items:
Items
ItemSaleID CustomerID ItemCode
1 100 A
2 100 B
3 100 C
4 200 A
5 200 C
I also have transaction header and detail tables coming from a till system:
TranDetail
TranDetailID TranHeaderID ItemSaleID Cost
11 51 1 $10
12 51 2 $10
13 51 3 $10
14 52 4 $20
15 52 5 $10
TranHeader
TranHeaderID CustomerID Payment Time
51 100 $100 11:00
52 200 $50 12:00
53 100 $20 13:00
I want to get to a point where I have a table like:
ItemSaleID CustomerID ItemCode Cost Payment Time
1 100 A $10 $120 11:00
2 100 B $10 11:00
3 100 C $10 11:00
4 200 D $20 $50 12:00
5 200 E $10 12:00
I have a query which produces the results but when I add in the ROW_NUMBER() case statement goes from 2 minutes to 30+ minutes.
The query is further confused because I need to supply the earliest date relating to the list of transactions and the total price paid (could be many transactions throughout the day for upgrades etc)
Query below:
SELECT ItemSaleID
, CustomerID
, ItemCode
, Cost
, CASE WHEN ROW_NUMBER() OVER (PARTITION BY TranHeaderID ORDER BY ItemSaleID) = 1
THEN TRN.Payment ELSE NULL END AS Payment
FROM Items I
OUTER APPLY (
SELECT TOP 1 SUB.Payment, Time
FROM TranHeader H
INNER JOIN TranDetail D ON H.TranHeaderID = D.TranHeaderID
OUTER APPLY (SELECT SUM(Payment) AS Payment
FROM TranHeader H2
WHERE H2.CustomerID = Items.CustomerID
) SUB
WHERE D.CustomerID = I.CustomerID
) TRN
WHERE ...
Is there a way that I can only show payments for each occurrence of the customer ID whilst maintaining performance

SQL: How do I count the number of clients that have already bought the same product?

I have a table like the one below. It is a record of daily featured products and the customers that purchased them (similar to a daily deal site). A given client can only purchase a product one time per feature, but they may purchase the same product if it is featured multiple times.
FeatureID | ClientID | FeatureDate | ProductID
1 1002 2011-05-01 500
1 2333 2011-05-01 500
1 4458 2011-05-01 500
2 8888 2011-05-10 700
2 2333 2011-05-10 700
2 1111 2011-05-10 700
3 1002 2011-05-20 500
3 4444 2011-05-20 500
4 4444 2011-05-30 500
4 2333 2011-05-30 500
4 1002 2011-05-30 500
I want to count by FeatureID the number of clients that purchased FeatureID X AND who purchased the same productID during a previous feature.
For the table above the expected result would be:
FeatureID | CountofReturningClients
1 0
2 0
3 1
4 3
Ideally I would like to do this with SQL, but am also open to doing some manipulation in Excel/PowerPivot. Thanks!!
If you join your table to itself, you can find the data you're looking for. Be careful, because this query can take a long time if the table has a lot of data and is not indexed well.
SELECT t_current.FEATUREID, COUNT(DISTINCT t_prior.CLIENTID)
FROM table_name t_current
LEFT JOIN table_name t_prior
ON t_current.FEATUREDATE > t_prior.FEATUREDATE
AND t_current.CLIENTID = t_prior.CLIENTID
AND t_current.PRODUCTID = t_prior.PRODUCTID
GROUP BY t_current.FEATUREID
"Per feature, count the clients who match for any earlier Features with the same product"
SELECT
Curr.FeatureID
COUNT(DISTINCT Prev.ClientID) AS CountofReturningClients --edit thanks to feedback
FROM
MyTable Curr
LEFT JOIN
MyTable Prev WHERE Curr.FeatureID > Prev.FeatureID
AND Curr.ClientID = Prev.ClientID
AND Curr.ProductID = Prev.ProductID
GROUP BY
Curr.FeatureID
Assumptions: You have a table called Features that is:
FeatureID, FeatureDate, ProductID
If not then you could always create one on the fly with a temporary table, cte or view.
Then:
SELECT
FeatureID
, (
SELECT COUNT(DISTINCT ClientID) FROM Purchases WHERE Purchases.FeatureDate < Feature.FeatureDate AND Feature.ProductID = Purchases.ProductID
) as CountOfReturningClients
FROM Features
ORDER BY FeatureID
New to this, but wouldn't the following work?
SELECT FeatureID, (CASE WHEN COUNT(clientid) > 1 THEN COUNT(clientid) ELSE 0 END)
FROM table
GROUP BY featureID