Conditional left join on max date and where clause in second table - sql

I am attempting to join a customer table with sales table where I show the list of all customers in database and any paid sale the customer might have in the sales tables. Now a customer can have multiple sales rows in the sales table.
This is an example sales record of one customer with multiple sales in the sale tables
while extracting this record I would like to get only the MAX (q_saledatetime) WHERE the q_paidamount is > 0.
as in show me the last time this customer made a payment to us. So in this case row 2 where they paid 8.90 is what I would like to get for that customer. If a customer has no record in the sales table, show their name/details on the list either way.
My failure at the moment is how to include the where clause of the paid amount + max date column.
ATTEMPT A
select DISTINCT ON (q_customer.q_code)
q_customer.q_code, q_customer.q_name, -- customer info
MAX(q_saleheader.q_saledatetime) AS latestDate, q_saleheader.q_paidamount -- saleheader info
FROM q_customer
LEFT JOIN q_saleheader ON (q_customer.q_code = q_saleheader.q_customercode)
group by q_customer.q_code, q_customer.q_name , q_saleheader.q_saledatetime, q_saleheader.q_paidamount
order by q_customer.q_code ASC
which results in
so for Fred Blogg is picking up details from row 4 instead of 2 (first image). As there's no rule for q_paidamount at this point
ATTEMPT B
SELECT
customer.q_code, customer.q_name, -- customer info
sale.q_saledatetime, sale.q_paidamount -- sale info
FROM q_customer customer
LEFT JOIN (SELECT * FROM q_saleheader WHERE q_saledatetime =
(SELECT MAX(q_saledatetime) FROM q_saleheader b1 where q_paidamount > 0 ))
sale ON sale.q_customercode = customer.q_code
which results in
This doesnt seem to be getting any information from the sale table at all.
Update:
After having a closer look at my first attempt I amended the statement and came up with this solution which achieves the same results as Michal's answer. I just curious to know is there any pitfalls or perfomance disadvantages with the following way.
select DISTINCT ON (q_customer.q_code)
q_customer.q_code, q_customer.q_name, -- customer info
q_saleheader.q_saledatetime, q_saleheader.q_paidamount -- saleheader info
FROM q_customer
LEFT JOIN q_saleheader ON (q_customer.q_code = q_saleheader.q_customercode AND
q_saleheader.q_paidamount > 0 )
group by q_customer.q_code, q_customer.q_name , q_saleheader.q_saledatetime,
q_saleheader.q_paidamount
order by q_customer.q_code ASC, q_saleheader.q_saledatetime DESC
main change was adding AND q_saleheader.q_paidamount > 0 on the join and q_saleheader.q_saledatetime DESC to make sure are getting the top row of that related data. As mentioned both Michal's answer and this solution achieve the same results. Just curious about pitfalls in either of the two ways.

Try this query:
SELECT c.q_code,
c.q_name,
CASE WHEN q_saledatetime <> '1900-01-01 00:00:00.000' THEN q_saledatetime END q_saledatetime,
q_paidamount
FROM (
SELECT c.q_code,
c.q_name,
coalesce(s.q_saledatetime, '1900-01-01 00:00:00.000') q_saledatetime, --it will indicate customer with no data
s.q_paidamount,
ROW_NUMBER() OVER (PARTITION BY c.q_code ORDER BY COALESCE(s.q_saledatetime, '1900-01-01') DESC) rn
FROM q_customer c
LEFT JOIN (SELECT q_saledatetime,
q_paidamount
FROM q_saleheader
WHERE q_paidamount > 0) s
ON c.q_code = s.q_customercode
) c WHERE rn = 1

Related

SQL query to add column that counts number of encounters for the past year from each encounter

I am trying to identify High-Usage status for customers, so at time of order how many orders did the customer place in the last year. Each customer has unique ID and each order has unique ID, with a date/time stamp at time of order. This is not just adding a count column, but a conditional count. I can recreate this in Excel using sumproduct, but wanted to see if I can automate the process in SMSS before my pull.
I tried a subquery column and then doing a join on a subquery result:
SELECT (*)
,HU_CUSTOMER_YOY
FROM data
LEFT JOIN (SELECT MAX(ORDER_ID) AS ORDER_ID
,COUNT(CUSTOMER_ID) AS HU_CUSTOMER_YOY
FROM data AS CUS_HU
WHERE ORDER_DTTM > DATEADD(YEAR, -1, ORDER_DTTM)
GROUP BY CUS_HU.CUSTOMER_ID)
CUSHU on CUSHU.ORDER_ID = data.ORDER_ID
This pulls in a value ONLY on the most recent order and counts ALL previous orders. To reiterate, I need a value on EACH unique order to count every order for that customer for the previous year from that unique order. My issue is using the DTTM column. If I use a static date like getdate(), it will count but I need the count for the DTTM-1year on EACH order to view historical data, i.e., when a customer began and fell-off High-Usage status, what contributed to the change, etc.
This is for a rather large dataset that is refreshed daily. I would prefer to not have the main query be aggregated, if possible, which is why I thought creating and joining a reference table would be preferred.
Is this possible?
Adding expected query results:
customer_id
order_id
HU_count
order_dttm
c1
c1-1
0
1/1/2020
c1
c1-2
1
7/1/2020
c1
c1-3
0
1/1/2022
c1
c1-4
1
1/10/2022
c2
c2-1
0
1/11/2022
c1
c1-5
2
1/14/2022
c2
c2-2
1
1/15/2022
I assume you are using SQL Server from your usage of the DATEADD function.
Based on my understanding of the requirements, this will show the count for each order for each customer in the previous year.
SELECT DISTINCT
customer_id as hu_customer_yoy
,COUNT(case when order_dttm > DATEADD(YEAR, -1, order_dttm) THEN 1 ELSE null END)
over (partition by customer_id, order_id) AS ORDER_COUNT
,ORDER_ID
FROM data

SQL - Count new entries based on last date

I have a table with the follow structure
ID ReportDate Object_id
What I need to know, is the count of new and count of old (Object id's)
For example: If I have the data below:
I want the following output grouped by ReportDate:
I thought a way doing it using a Where clause based on date, however i need the data for all the dates I have in the table. To see the count of what already existed in the previous report and what is new at that report. Any Ideas?
Edit: New/Old definition- New would be the records that never appeared before that report run date and appeared on this one, whereas old is the number of records that had at least one match in previous dates. I'll edit the post to include this info.
managed to do it using a left join. Below is my solution in case it helps anyone in the future :)
SELECT table.ReportRunDate,
-1*sum(table.ReportRunDate = new_table.init_date) as count_new,
-1*sum(table.ReportRunDate <> new_table.init_date) as count_old,
count(*) as count_total
FROM table LEFT JOIN
((SELECT Object_ID, min(ReportRunDate) as init_date
FROM table
GROUP By OBJECT_ID) as new_table)
ON table.Object_ID = new_table.Object_ID
GROUP BY ReportRunDate
This would work in Oracle, not sure about ms-access:
SELECT ReportDate
,COUNT(CASE WHEN rnk = 1 THEN 1 ELSE NULL END) count_of_new
,COUNT(CASE WHEN rnk <> 1 THEN 1 ELSE NULL END)count_of_old
FROM (SELECT ID
,ReportDate
,Object_id
,RANK() OVER (PARTITION BY Object_id ORDER BY ReportDate) rnk
FROM table_name)
GROUP BY ReportDate
Inner query should rank each occurence of object_id based on the ReportDate so the 1st occurrence of certain object_id will have rank = 1, the next one rank = 2 etc.
Then the outer query counts how many records with rank equal/not equal 1 are the within each group.
I assumed that 1 object_id can appear only once within each reportDate.

Use query result

I´m having issues with the following query. I have two tables; Table Orderheader and table Bought. The first query I execute gives me, for example, two dates. Based on these two dates, I need to find Production data AND, based on the production data, I need to find the Bought data, and combine those data together. Lets say I do the following:
Select Lotdate From Orderheader where orhsysid = 1
This results in two rows: '2019-02-05' and '2019-02-04'. Now I need to do two things: I need two run two queries using this result set. The first one is easy; use the dates returned and get a sum of column A like this:
Select date, SUM(Amount) from Orderheader where date = Sales.date() [use the two dates here]
The second one is slighty more complicated, I need to find the last day where something has been bought based on the two dates. Production is everyday so Productiondate=Sales.date()-1. But Bought is derived from Productionday and is not everyday so for every Productionday it needs to find the last Boughtday. So I can't say where date = Orderheader.date. I need to do something like:
Select date, SUM(Amount)
FROM Bought
WHERE date = (
SELECT top 1 date
FROM Bought
WHERE date < Orderheader.date)
But twice, for both the dates I got.
This needs to result in 1 table giving me:
Bought.date, Bought.SUM(AMOUNT), Orderheader.date, Orderheader.SUM(AMOUNT)
All based on the, possible multiple, Lotdate(s) I got from the first query from Sales table.
I've been struggling with this for a moment now, using joins and nested queries but I can't seem to figure it out!
Example sample:
SELECT CONVERT(date,ORF.orfDate) as Productiedatum, SUM(orlQuantityRegistered) as 'Aantal'
FROM OrderHeader ORH
LEFT JOIN OrderFrame ORF ON ORH.orhFrameSysID = ORF.orfSysID
LEFT JOIN OrderLine ORL ON ORL.orhSysID = ORH.orhSysID
LEFT JOIN Item ON Item.itmSysID = ORL.orlitmSysID
where CONVERT(date,ORF.orfDate) IN
(
SELECT DISTINCT(CONVERT(date, Lot.lotproductiondate)) as Productiedatum
FROM OrderHeader ORH
LEFT JOIN Registration reg ON reg.regorhSysID = ORH.orhSysID
LEFT JOIN StockRegistration stcreg ON stcreg.stcregRegistrationSysID = reg.regSysID
LEFT JOIN Lot ON Lot.lotSysID = stcregSrclotSysID
WHERE ORH.orhSysID = 514955
AND regRevokeRegSysID IS NULL
AND stcregSrcitmSysID = 5103
)
AND ORL.orlitmSysID = 5103
AND orldirSysID = 2
AND NOT orlQuantityRegistered IS NULL
GROUP BY Orf.orfDate
Sample output:
Productiedatum Aantal
2019-02-05 20
2019-02-06 20
Here I used a nested subquery to get the results from 'Production' (orderheader) because I just can use date = date. I'm struggling with the Sales part where I need to find the last date(s) and use those dates in the Sales table to get the sum of that date.
Expected output:
Productiedatum Aantal Boughtdate Aantal
2019-02-04 20 2019-02-01 55
2019-02-05 20 2019-02-04 60
Try this.
IF OBJECT_ID('tempdb..#Production') IS NOT NULL DROP TABLE #Production
IF OBJECT_ID('tempdb..#Bought') IS NOT NULL DROP TABLE #Bought
CREATE table #Production(R_NO int,ProductionDate datetime,ProductionAmount float)
CREATE table #Bought(R_NO int,Boughtdate datetime,Boughtamount float)
insert into #Production(ProductionDate,ProductionAmount,R_NO)
select p.date ProductionDate,sum(Amount) ProductionAmount,row_number()over (order by p.date) R_NO
from Production P
join Sales s on p.date=S.date-1
where orhsysid=1
group by p.date
declare #loop int,#ProdDate datetime
select #loop =max(R_NO) from #Production
while (1<=#loop)
begin
select #ProdDate=ProductionDate from #Production where r_no=#loop
insert into #Bought(Boughtdate,Boughtamount,R_NO)
select Date,Sum(Amount),#loop R_NO from Bought where date=(
select max(date) from bought B
where B.Date<#ProdDate)
group by Date
set #loop=#loop-1
end
select ProductionDate,ProductionAmount,Boughtdate,Boughtamount from #Bought B
join #Production p on B.R_NO=P.R_NO

Using a stored procedure in Teradata to build a summarial history table

I am using Terdata SQL Assistant connected to an enterprise DW. I have written the query below to show an inventory of outstanding items as of a specific point in time. The table referenced loads and stores new records as changes are made to their state by load date (and does not delete historical records). The output of my query is 1 row for the specified date. Can I create a stored procedure or recursive query of some sort to build a history of these summary rows (with 1 new row per day)? I have not used such functions in the past; links to pertinent previously answered questions or suggestions on how I could get on the right track in researching other possible solutions are totally fine if applicable; just trying to bridge this gap in my knowledge.
SELECT
'2017-10-02' as Dt
,COUNT(DISTINCT A.RECORD_NBR) as Pending_Records
,SUM(A.PAY_AMT) AS Total_Pending_Payments
FROM DB.RECORD_HISTORY A
INNER JOIN
(SELECT MAX(LOAD_DT) AS LOAD_DT
,RECORD_NBR
FROM DB.RECORD_HISTORY
WHERE LOAD_DT <= '2017-10-02'
GROUP BY RECORD_NBR
) B
ON A.RECORD_NBR = B.RECORD_NBR
AND A.LOAD_DT = B.LOAD_DT
WHERE
A.RECORD_ORDER =1 AND Final_DT Is Null
GROUP BY Dt
ORDER BY 1 desc
Here is my interpretation of your query:
For the most recent load_dt (up until 2017-10-02) for record_order #1,
return
1) the number of different pending records
2) the total amount of pending payments
Is this correct? If you're looking for this info, but one row for each "Load_Dt", you just need to remove that INNER JOIN:
SELECT
load_Dt,
COUNT(DISTINCT record_nbr) AS Pending_Records,
SUM(pay_amt) AS Total_Pending_Payments
FROM DB.record_history
WHERE record_order = 1
AND final_Dt IS NULL
GROUP BY load_Dt
ORDER BY 1 DESC
If you want to get the summary info per record_order, just add record_order as a grouping column:
SELECT
load_Dt,
record_order,
COUNT(DISTINCT record_nbr) AS Pending_Records,
SUM(pay_amt) AS Total_Pending_Payments
FROM DB.record_history
WHERE final_Dt IS NULL
GROUP BY load_Dt, record_order
ORDER BY 1,2 DESC
If you want to get one row per day (if there are calendar days with no corresponding "load_dt" days), then you can SELECT from the sys_calendar.calendar view and LEFT JOIN the query above on the "load_dt" field:
SELECT cal.calendar_date, src.Pending_Records, src.Total_Pending_Payments
FROM sys_calendar.calendar cal
LEFT JOIN (
SELECT
load_Dt,
COUNT(DISTINCT record_nbr) AS Pending_Records,
SUM(pay_amt) AS Total_Pending_Payments
FROM DB.record_history
WHERE record_order = 1
AND final_Dt IS NULL
GROUP BY load_Dt
) src ON cal.calendar_date = src.load_Dt
WHERE cal.calendar_date BETWEEN <start_date> AND <end_date>
ORDER BY 1 DESC
I don't have access to a TD system, so you may get syntax errors. Let me know if that works or you're looking for something else.

SELECT TOP 10 rows

I have built an SQL Query that returns me the top 10 customers which have the highest outstanding. The oustanding is on product level (each product has its own outstanding).
Untill now everything works fine, my only problem is that if a certain customer has more then 1 product then the second product or more should be categorized under the same customer_id like in the second picture (because the first product that has the highest outstanding contagions the second product that may have a lower outstanding that the other 9 clients of top 10).
How can I modify my query in order to do that? Is it possible in SQL Server 2012?
My query is:
select top 10 CUSTOMER_ID
,S90T01_GROSS_EXPOSURE_THSD_EUR
,S90T01_COGNOS_PROD_NAME
,S90T01_DPD_C
,PREVIOUS_BUCKET_DPD_REP
,S90T01_BUCKET_DPD_REP
from [dbo].[DM_07MONTHLY_DATA]
where S90T01_CLIENT_SEGMENT = 'PI'
and YYYY_MM = '2017_01'
group by CUSTOMER_ID
,S90T01_GROSS_EXPOSURE_THSD_EUR
,S90T01_COGNOS_PROD_NAME
,S90T01_DPD_C
,PREVIOUS_BUCKET_DPD_REP
,S90T01_BUCKET_DPD_REP
order by S90T01_GROSS_EXPOSURE_THSD_EUR desc;
You need to calculate the top Customers first, then pull out all their products. You can do this with a Common Table Expression.
As you haven't provided any test data this is untested, but I think it will work for you:
with top10 as
(
select top 10 CUSTOMER_ID
,sum(S90T01_GROSS_EXPOSURE_THSD_EUR) as TOTAL_S90T01_GROSS_EXPOSURE_THSD_EUR
from [dbo].[DM_07MONTHLY_DATA]
where S90T01_CLIENT_SEGMENT = 'PI'
and YYYY_MM = '2017_01'
group by CUSTOMER_ID
order by TOTAL_S90T01_GROSS_EXPOSURE_THSD_EUR desc
)
select m.CUSTOMER_ID
,m.S90T01_GROSS_EXPOSURE_THSD_EUR
,m.S90T01_COGNOS_PROD_NAME
,m.S90T01_DPD_C
,m.PREVIOUS_BUCKET_DPD_REP
,m.S90T01_BUCKET_DPD_REP
from [dbo].[DM_07MONTHLY_DATA] m
join top10 t
on m.CUSTOMER_ID = t.CUSTOMER_ID
order by t.TOTAL_S90T01_GROSS_EXPOSURE_THSD_EUR desc
,m.S90T01_GROSS_EXPOSURE_THSD_EUR;