select and delete query based on older entries - sql

I have an Excel sheet that pushes data to an Access database using ADO. It essentially puts invoices into the database. Sometimes I revise an invoice, so the database ends up with the same invoice twice. I need a select and delete query that finds duplicates based on the invoice number and deletes the older version of the invoice (the older records). A simple example:
id  invoice#  total    item   datestamp
1   1234      456.29$  shoes  06/06/2016 03:51
2   1234      78.58$   boots  06/06/2016 03:51
3   1234      22.74$   scarf  06/06/2016 03:51
4   1234      539.34$  shoes  06/07/2016 12:44
5   1234      66.24$   pants  06/07/2016 12:44
As you can see, rows 4 and 5 are my new invoice for this customer. I want every previous version of the same invoice # to be deleted. Please note: the rows are not actual duplicates; only the invoice number is duplicated. The query needs to find duplicates by invoice number and delete the rows whose datestamp is older than the most recent one for that invoice.
At that point it is way beyond me. I would appreciate the help.

Consider using a correlated aggregate subquery in WHERE clause:
DELETE *
FROM InvoiceTable
WHERE NOT datestamp IN
(SELECT Max(datestamp)
FROM InvoiceTable sub
WHERE sub.InvoiceNumber = InvoiceTable.InvoiceNumber)
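To try the idea without touching the real database, here is a minimal sketch in Python with an in-memory SQLite table. The table name invoices, the column name invoice and the ISO-formatted datestamp strings are assumptions made for the sketch, and the SQLite statement uses a plain < comparison instead of Access's NOT ... IN form, so treat it as an illustration of the correlated subquery rather than the exact Access syntax.

```python
import sqlite3

# Hypothetical stand-in for the Access table, using the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, invoice INTEGER, "
    "total REAL, item TEXT, datestamp TEXT)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1234, 456.29, "shoes", "2016-06-06 03:51"),
        (2, 1234, 78.58, "boots", "2016-06-06 03:51"),
        (3, 1234, 22.74, "scarf", "2016-06-06 03:51"),
        (4, 1234, 539.34, "shoes", "2016-06-07 12:44"),
        (5, 1234, 66.24, "pants", "2016-06-07 12:44"),
    ],
)

# Delete every row whose datestamp is older than the newest datestamp
# recorded for the same invoice number (correlated subquery).
deleted = conn.execute(
    """
    DELETE FROM invoices
    WHERE datestamp < (SELECT MAX(sub.datestamp)
                       FROM invoices AS sub
                       WHERE sub.invoice = invoices.invoice)
    """
).rowcount

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM invoices ORDER BY id")]
print(deleted, remaining_ids)
```

ISO-style strings are used for the datestamps because SQLite compares them as text; in Access a real Date/Time column would be compared directly.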

As I said, try being conservative and not deleting. Instead, select the rows that carry the maximum datestamp for a given invoice number:
SELECT
invoices.id, invoices.invoice, invoices.total, invoices.item, invoices.datestamp
FROM
invoices
INNER JOIN
(SELECT
invoice, MAX(datestamp) AS maxdate
FROM
invoices
GROUP BY
invoice) lastinv
ON invoices.invoice = lastinv.invoice AND
invoices.datestamp = lastinv.maxdate
This is untested code, but it should do pretty much what you want. It is written as T-SQL, so you will have to adapt it to Microsoft Access.
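A small sketch of this "preview first" approach, again in Python with an in-memory SQLite table. The second invoice number (5678) and the trimmed column list are made up for the sketch; the point is only that the derived table is grouped by invoice number, so each invoice keeps just its newest rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER, invoice INTEGER, item TEXT, datestamp TEXT)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [
        (1, 1234, "shoes", "2016-06-06 03:51"),
        (2, 1234, "boots", "2016-06-06 03:51"),
        (3, 1234, "shoes", "2016-06-07 12:44"),
        (4, 5678, "scarf", "2016-06-05 09:00"),  # hypothetical second invoice
    ],
)

# Join each row to the newest datestamp of its own invoice number
# and keep only the rows that match it.
latest_rows = conn.execute(
    """
    SELECT i.id, i.invoice, i.item
    FROM invoices AS i
    INNER JOIN (SELECT invoice, MAX(datestamp) AS maxdate
                FROM invoices
                GROUP BY invoice) AS lastinv
        ON i.invoice = lastinv.invoice
       AND i.datestamp = lastinv.maxdate
    ORDER BY i.id
    """
).fetchall()
print(latest_rows)
```

Rows that do not show up in this result are the ones a later delete would remove.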

Related

Find the latest record

In MS Access I have the DateList table, which holds the due date of different orders. Thus, the table has two columns: OrderNo and DueDate. For some order numbers, there could be multiple DueDates. The table could look like below:
OrderNo DueDate
100 12/9/2021
101 20/9/2021
102 30/9/2021
100 7/10/2021
102 11/10/2021
103 15/10/2021
…
My goal is to write a query that fetches the latest DueDate of each OrderNo.
I created two queries.
The first one, qry1, generates a list of OrderNo values without duplicates:
SELECT
DateList.OrderNo AS UniqOrderNo
FROM DateList
GROUP BY DateList.OrderNo;
in the second query, qry2, I used the DMax function in order to search through DueDates of each order for the maximum value.
SELECT
qry1.UniqOrderNo
,DMax("[DueDate]","[DateList]","[OrderNo]='[qry1]![UniqOrderNo]'") AS LatDuDate
FROM qry1
INNER JOIN DateList
ON qry1.UniqOrderNo = DateList.OrderNo;
LatDuDate represents the latest DueDate of the Order.
Unfortunately, the query does not work and returns nothing.
Now my questions:
Is there something wrong with my approach / queries?
Is there better way to accomplish this task in MS Access?
You almost figured it out yourself. Max returns you the biggest value of the group.
SELECT Max(DueDate) DueDate, OrderNo
FROM DateList
GROUP BY OrderNo
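For a quick check of the GROUP BY / Max idea, here is a minimal sketch using Python and an in-memory SQLite table loaded with the sample rows from the question. The dates are rewritten as ISO strings (an assumption of the sketch) so that SQLite orders them correctly; in Access a real Date/Time column would be used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DateList (OrderNo INTEGER, DueDate TEXT)")
conn.executemany(
    "INSERT INTO DateList VALUES (?, ?)",
    [
        (100, "2021-09-12"),
        (101, "2021-09-20"),
        (102, "2021-09-30"),
        (100, "2021-10-07"),
        (102, "2021-10-11"),
        (103, "2021-10-15"),
    ],
)

# One row per order number, carrying the latest due date of that order.
latest_due = conn.execute(
    "SELECT OrderNo, MAX(DueDate) AS DueDate "
    "FROM DateList GROUP BY OrderNo ORDER BY OrderNo"
).fetchall()
print(latest_due)
```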
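Similar to Christian's answer. If the table had further columns that are determined by OrderNo, you could select them with First() instead of adding them to the GROUP BY - it performs better. ** OrderNo itself repeats in DateList, so it still has to be grouped; without the GROUP BY the query would return a single row for the whole table.
Of course it depends on the number of records the table holds.
SELECT OrderNo, Max(DueDate) AS DueDate
FROM DateList
GROUP BY OrderNo;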
** Source: Allen Browne - Optimizing queries

I want NAV price as per (Today date minus 1) date

I have two tables. One is NAV, where each product's new price is recorded daily. The second is TDK, which holds the stock per item.
Now I want a summary report per buyer that shows the product-wise totals together with the latest price from the NAV table.
I have tried below query...
SELECT dbo.TDK.buyer, dbo.NAV.Product_Name, sum(dbo.TDK.TD_UNITS) as Units, sum(dbo.TDK.TD_AMT) as 'Amount',dbo.NAV.NAValue
FROM dbo.TDK INNER JOIN
dbo.NAV
ON dbo.TDK.Products = dbo.NAV.Product_Name
group by dbo.TDK.buyer, dbo.NAV.Product_Name, dbo.NAV.NAValue
Important: the common columns in the two tables are...
Table NAV has the column Product_Name
Table TDK has the column Products
If NAV holds 4 NAValue records for one product, this query shows 4 lines with the same total.
What I need:
I want this query to show only one line per product, with the latest NAValue price.
I want to display one more column, Units*NAValue (latest), as "Latest Market Value".
Please guide.
What field contains the quote date? I am assuming you have a DATETIME field, quoteDate, in the dbo.NAV table, and that you store only the date part (i.e. the time is midnight, 00:00:00).
SELECT
t.buyer,
n.Product_Name,
sum(t.TD_UNITS) as Units,
sum(t.TD_AMT) as 'Amount',
n.NAValue
FROM dbo.TDK t
INNER JOIN dbo.NAV n
ON t.Products = n.Product_Name
AND n.quoteDate > getdate()-2
group by t.buyer, n.Product_Name, n.NAValue, n.QuoteDate
GetDate() returns the current date and time. Subtracting 2 gives the same time of day two days ago, so with midnight-only quote dates the comparison keeps yesterday's record (and today's, once it has been loaded).
Also, add n.quoteDate to your select and group by. Even though you don't strictly need it, it helps on a day when bad data leaves two records in the NAV table, one with a midnight time and another with, say, a 6 PM time.
Your code looks like SQL Server. I think you just want APPLY:
SELECT t.buyer, n.Product_Name, t.TD_UNITS as Units, t.TD_AMT as Amount, n.NAValue
FROM dbo.TDK t CROSS APPLY
(SELECT TOP (1) n.*
FROM dbo.NAV n
WHERE t.Products = n.Product_Name
ORDER BY ?? DESC -- however you define "latest"
) n;
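To see the "latest price per product" idea end to end, here is a rough sketch in Python with in-memory SQLite tables. SQLite has no CROSS APPLY, so the sketch swaps in a correlated MAX() subquery to pick the newest NAV row per product. The quoteDate column, the product and buyer names, and all the figures are invented for the sketch; it also adds the Units * NAValue column the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# quoteDate is a hypothetical column; the question never names the date field.
conn.execute("CREATE TABLE NAV (Product_Name TEXT, NAValue REAL, quoteDate TEXT)")
conn.execute(
    "CREATE TABLE TDK (buyer TEXT, Products TEXT, TD_UNITS INTEGER, TD_AMT REAL)"
)
conn.executemany(
    "INSERT INTO NAV VALUES (?, ?, ?)",
    [
        ("FundA", 10.0, "2024-01-01"),
        ("FundA", 11.0, "2024-01-02"),
        ("FundA", 12.0, "2024-01-03"),
        ("FundB", 20.0, "2024-01-03"),
    ],
)
conn.executemany(
    "INSERT INTO TDK VALUES (?, ?, ?, ?)",
    [
        ("Alice", "FundA", 5, 50.0),
        ("Alice", "FundA", 3, 33.0),
        ("Bob", "FundB", 2, 40.0),
    ],
)

# Join each holding only to the newest NAV row of its product, then aggregate.
report = conn.execute(
    """
    SELECT t.buyer,
           t.Products,
           SUM(t.TD_UNITS)             AS Units,
           SUM(t.TD_AMT)               AS Amount,
           n.NAValue,
           SUM(t.TD_UNITS) * n.NAValue AS LatestMarketValue
    FROM TDK AS t
    INNER JOIN NAV AS n
        ON n.Product_Name = t.Products
       AND n.quoteDate = (SELECT MAX(x.quoteDate)
                          FROM NAV AS x
                          WHERE x.Product_Name = n.Product_Name)
    GROUP BY t.buyer, t.Products, n.NAValue
    ORDER BY t.buyer
    """
).fetchall()
print(report)
```

Because only one NAV row per product survives the join, each buyer/product pair comes out as a single line even though FundA has three price records.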

How many customers upgraded from Product A to Product B?

I have a "daily changes" table that records when a customer "upgrades" or "downgrades" their membership level. In the table, let's say field 1 is customer ID, field 2 is membership type and field 3 is the date of change. Customers 123 and ABC each have two rows in the table. Values in field 1 (ID) are the same, but values in field 2 (TYPE) and 3 (DATE) are different. I'd like to write a SQL query to tell me how many customers "upgraded" from membership type 1 to membership type 2 and how many customers "downgraded" from membership type 2 to membership type 1 in any given time frame.
The table also shows other types of changes. To identify the records with changes in the membership type field, I've created the following code:
SELECT *
FROM member_detail_daily_changes_new
WHERE customer IN (
SELECT customer
FROM member_detail_daily_changes_new
GROUP BY customer
HAVING COUNT(distinct member_type_cd) > 1)
I'd like to see an end report which tells me:
For Fiscal 2018,
X,XXX customers moved from Member Type 1 to Member Type 2 and
X,XXX customers moved from Member Type 2 to Member type 1
Sounds like a good time to use the LEAD() analytic function to look ahead to a given customer's next member_Type, compare it to the current record, evaluate whether that's an upgrade or a downgrade, and then sum the results.
DEMO
WITH CTE AS (SELECT case when lead(Member_Type_Code) over (partition by Customer order by date asc) > member_Type_Code then 1 else 0 end as Upgrade
, case when lead(Member_Type_Code) over (partition by Customer order by date asc) < member_Type_Code then 1 else 0 end as DownGrade
FROM member_detail_daily_changes_new
WHERE Date between '20190101' and '20190201')
SELECT sum(Upgrade) upgrades, sum(downgrade) downgrades
FROM CTE
Using my sample data, this gives us:
+----+----------+------------+
| | upgrades | downgrades |
+----+----------+------------+
| 1 | 3 | 2 |
+----+----------+------------+
I'm not sure if SQL express on rex tester just doesn't support the sum() on the analytic itself which is why I had to add the CTE or if that's a rule in non-SQL express versions too.
Some other notes:
I let the system implicitly cast the dates in the where clause
I assume the member_Type_Code itself tells me if it's an upgrade or downgrade which long term probably isn't right. Say we add membership type 3 and it goes between 1 and 2... now what... So maybe we need a decimal number outside of the Member_Type_Code so we can handle future memberships and if it's an upgrade/downgrade or a lateral...
I assumed all upgrades/downgrades are counted and a user can be counted multiple times if membership changed that often in time period desired.
I assume an upgrade/downgrade can't occur on the same date/time. Otherwise the sorting for lead may not work right. (but if it's a timestamp field we shouldn't have an issue)
So how does this work?
We use a Common table expression (CTE) to generate the desired evaluations of downgrade/upgrade per customer. This could be done in a derived table as well in-line but I find CTE's easier to read; and then we sum it up.
Lead(Member_Type_Code) over (partition by customer order by date asc) does the following
It organizes the data by customer and then sorts it by date in ascending order.
So we end up getting all the same customers records in subsequent rows ordered by date. Lead(field) then starts on record 1 and Looks ahead to record 2 for the same customer and returns the Member_Type_Code of record 2 on record 1. We then can compare those type codes and determine if an upgrade or downgrade occurred. We then are able to sum the results of the comparison and provide the desired totals.
And now we have a long winded explanation for a very small query :P
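The walkthrough above can be sketched with Python and an in-memory SQLite table (window functions need SQLite 3.25 or newer). The table name changes, the column change_date and the sample rows are made up for the sketch; a customer's last row has no following row, so LEAD() returns NULL there and both CASE expressions fall through to 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changes (customer TEXT, member_type_cd INTEGER, change_date TEXT)"
)
conn.executemany(
    "INSERT INTO changes VALUES (?, ?, ?)",
    [
        ("123", 1, "2019-01-05"),
        ("123", 2, "2019-01-20"),  # 1 -> 2: upgrade
        ("ABC", 2, "2019-01-03"),
        ("ABC", 1, "2019-01-15"),  # 2 -> 1: downgrade
        ("XYZ", 1, "2019-01-02"),
        ("XYZ", 2, "2019-01-10"),  # 1 -> 2: upgrade
        ("XYZ", 1, "2019-01-25"),  # 2 -> 1: downgrade
    ],
)

# Compare each row with the same customer's next row, then total the flags.
upgrades, downgrades = conn.execute(
    """
    WITH cte AS (
        SELECT CASE WHEN LEAD(member_type_cd) OVER w > member_type_cd
                    THEN 1 ELSE 0 END AS upgrade,
               CASE WHEN LEAD(member_type_cd) OVER w < member_type_cd
                    THEN 1 ELSE 0 END AS downgrade
        FROM changes
        WINDOW w AS (PARTITION BY customer ORDER BY change_date)
    )
    SELECT SUM(upgrade), SUM(downgrade) FROM cte
    """
).fetchone()
print(upgrades, downgrades)
```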
You want to use lag() for this, but you need to be careful about the date filtering. So, I think you want:
SELECT prev_membership_type, membership_type,
COUNT(*) as num_changes,
COUNT(DISTINCT customer_id) as num_members
FROM (SELECT mddc.*,
LAG(mddc.membership_type) OVER (PARTITION BY mddc.customer_id ORDER BY mddc.date) as prev_membership_type
FROM member_detail_daily_changes_new mddc
) mddc
WHERE prev_membership_type <> membership_type AND
date >= '2018-01-01' AND
date < '2019-01-01'
GROUP BY membership_type, prev_membership_type;
Notes:
The filtering on date needs to occur after the calculation of lag().
This takes into account that members may have a certain type in 2017 and then change to a new type in 2018.
The date filtering is compatible with indexes.
Two values are calculated. One is the overall number of changes. The other counts each member only once for each type of change.
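A small sketch of why the date filter has to come after LAG(), using Python and an in-memory SQLite table (SQLite 3.25+ for window functions). The table, column names and rows are invented for the sketch. Customer 123 held type 1 in 2017 and changed in 2018; that change is only seen because the 2017 row is still present when LAG() runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changes (customer TEXT, member_type_cd INTEGER, change_date TEXT)"
)
conn.executemany(
    "INSERT INTO changes VALUES (?, ?, ?)",
    [
        ("123", 1, "2017-11-01"),
        ("123", 2, "2018-03-01"),  # change in 2018, previous row in 2017
        ("ABC", 2, "2018-02-01"),
        ("ABC", 1, "2018-06-01"),
        ("ABC", 2, "2018-09-01"),
        ("ABC", 1, "2018-12-01"),
        ("XYZ", 1, "2018-05-01"),
        ("XYZ", 2, "2019-02-01"),  # change in 2019, outside the window
    ],
)

# LAG() is computed over all rows first; the date filter is applied afterwards.
summary = conn.execute(
    """
    SELECT prev_type, member_type_cd,
           COUNT(*)                 AS num_changes,
           COUNT(DISTINCT customer) AS num_members
    FROM (SELECT c.*,
                 LAG(c.member_type_cd) OVER (PARTITION BY c.customer
                                             ORDER BY c.change_date) AS prev_type
          FROM changes AS c) AS lagged
    WHERE prev_type <> member_type_cd
      AND change_date >= '2018-01-01'
      AND change_date <  '2019-01-01'
    GROUP BY prev_type, member_type_cd
    ORDER BY prev_type
    """
).fetchall()
print(summary)
```

Customer ABC downgrades twice in 2018, which is why num_changes and num_members differ for the 2 -> 1 row.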
With conditional aggregation after self joining the table:
select
2018 fiscal,
sum(case when m.member_type_cd > t.member_type_cd then 1 else 0 end) upgrades,
sum(case when m.member_type_cd < t.member_type_cd then 1 else 0 end) downgrades
from member_detail_daily_changes_new m inner join member_detail_daily_changes_new t
on
t.customer = m.customer
and
t.changedate = (
select max(changedate) from member_detail_daily_changes_new
where customer = m.customer and changedate < m.changedate
)
where year(m.changedate) = 2018
This will work even if there are more than 2 types of membership level.

Select most current data in grouped set in Oracle

I am writing a procedure to query some data in Oracle and grouping it:
Account  Amt Due  Last payment  Last Payment Date (mm/dd/yyyy format)
1234     10.00    5.00          12/12/2013
1234     35.00    8.00          12/12/2013
3293     15.00    10.00         11/18/2013
4455     8.00     3.00          5/23/2013
4455     14.00    5.00          10/18/2013
I want to group the data, so there is one record per account, the Amt due is summed, as well as the last payment. Unless the last payment date is different -- if the date is different, then I just want the last payment. So I would want to have a result of something like this:
Account  Amt Due  Last payment  Last Payment Date
1234     45.00    13.00         12/12/2013
3293     15.00    10.00         11/18/2013
4455     22.00    5.00          10/18/2013
I was doing something like
select Account, sum (AmtDue), sum (LastPmt), Max (LastPmtDt)
from all my tables
group by Account
But, that doesn't work for the last record above, because the last payment was only the $5.00 on 10/18, not the sum of them on 10/18.
If I group by Account and LastPmtDt, then I get two records for the last, but I only want one per account.
I have other data I'm querying, and I'm using a CASE, INSTR, and LISTAGG on another field (if combining them gives me this substring and that, then output 'Both'; else if it only gives me this substring, then output the substring; else if it only gives me the other substring, then output that one). It seems like I may need something similar, but not by looking for a specific date. If the dates are the same, then sum (LastPmt) and max (LastPmtDt) works fine, if they are not the same, then I want to ignore all but the most recent LastPmt and LastPmtDt record(s).
Oh, and my LastPmt and LastPmtDt fields are already case statements within the select. They aren't fields that I already can just access. I'm reading other posts about RANK and KEEP, but to involve both fields, I'd need all that calculation of each field as well. Would it be more efficient to query everything, and then wrap another query around that to do the grouping, summing, and selecting fields I want?
Related: HAVING - GROUP BY to get the latest record
Can someone provide some direction on how to solve this?
Try this:
select Account,
sum ( Amt_Due),
sum (CASE WHEN Last_Payment_Date = last_dat THEN Last_payment ELSE 0 END),
Max (Last_Payment_Date)
from (
SELECT t.*,
max( Last_Payment_Date ) OVER( partition by Account ) last_dat
FROM table1 t
)
group by Account
Demo --> http://www.sqlfiddle.com/#!4/fc650/8
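If you want to check the conditional sum against the sample data locally, here is a rough sketch in Python with an in-memory SQLite table (SQLite 3.25+ for the window function). The table name payments and the ISO-formatted dates are assumptions of the sketch; the query shape follows the answer above. Every row still counts towards Amt Due, but only rows on the account's latest payment date count towards Last payment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (Account INTEGER, AmtDue REAL, LastPmt REAL, LastPmtDt TEXT)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        (1234, 10.00, 5.00, "2013-12-12"),
        (1234, 35.00, 8.00, "2013-12-12"),
        (3293, 15.00, 10.00, "2013-11-18"),
        (4455, 8.00, 3.00, "2013-05-23"),
        (4455, 14.00, 5.00, "2013-10-18"),
    ],
)

# Tag each row with its account's latest payment date, then aggregate:
# AmtDue is summed over all rows, LastPmt only over rows on that latest date.
grouped = conn.execute(
    """
    SELECT Account,
           SUM(AmtDue),
           SUM(CASE WHEN LastPmtDt = last_dt THEN LastPmt ELSE 0 END),
           MAX(LastPmtDt)
    FROM (SELECT p.*,
                 MAX(p.LastPmtDt) OVER (PARTITION BY p.Account) AS last_dt
          FROM payments AS p)
    GROUP BY Account
    ORDER BY Account
    """
).fetchall()
print(grouped)
```

Note that filtering down to the latest rows before summing would also drop the older Amt Due values, so account 4455 would show 14.00 instead of the 22.00 in the desired result.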
Rank is the right idea.
Try this
select a.Account, a.AmtDue, a.LastPmt, a.LastPmtDt from (
select Account, sum (sum (AmtDue)) over (partition by Account) AmtDue, sum (LastPmt) LastPmt, LastPmtDt,
RANK() OVER (PARTITION BY Account ORDER BY LastPmtDt desc) as rnk
from all my tables
group by Account, LastPmtDt
) a
where a.rnk = 1
I haven't tested this, but it should give you the right idea.
Try this:
select Account, max(TotalAmtDue), sum(LastPmt), LastPmtDt
from (select Account,
AmtDue,
LastPmt,
LastPmtDt,
sum(AmtDue) over(partition by Account) TotalAmtDue,
max(LastPmtDt) over(partition by Account) MaxLastPmtDt
from your_table) t
where t.LastPmtDt = t.MaxLastPmtDt
group by Account, LastPmtDt

Create SQL view column from result set

I'm new to SQL and attempting to revise a Create View script to add a new column from a select statement's result set. I've googled this quite a bit but haven't seen a good example.
Here's the select statement:
select lease_id, year(posting_date) as years1, SUM(amount) as Annual
from la_tbl_lease_projection
group by year(posting_date), lease_id
order by lease_id
The complicating factor is this. The Annual column in the result set is the Annual sum of expenses for a lease_id. However, in the view I'm adding the column to, expenses are listed monthly. So lease_id 100001 has 12 lines in 2010, 2011, etc. I want the view to have the new column show the Annual amount on each of the 12 monthly line items. The new Annual column should be to the right of the amount column and each line should contain the sum of the amount column for that year. e.g.:
Lease_id Posting_Date Amount Annual
100001 2010-01-01 $25 $300
100001 2010-02-01 $25 $300
etc...............
The view I'm adding to is a reasonably complex join and union from multiple tables. Instead of creating a new table for my result set, I'd like to access it using a stored procedure, unless there's a better option. MSDN says temp tables and table variables don't work in views so that's not an option.
I think this can be done by something like "when years1 = years1 AND lease_id = lease_id then [Annual] = resultset total", but I can't seem to visualize it. Thanks in advance for your input.
Since you were looking at MSDN, I'm assuming SQL Server for this answer.
To get a column that holds the per-lease yearly sum of amounts on every row, you can use SUM() OVER ():
SELECT *, SUM(Amount) OVER (PARTITION BY lease_id, YEAR(Posting_Date)) Yearly
FROM la_tbl_lease_projection;
An SQLfiddle to test with.
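A minimal sketch of the windowed sum, run through Python against an in-memory SQLite table (SQLite 3.25+). SQLite has no YEAR() function, so strftime('%Y', ...) stands in for it; the table name projection and the sample amounts are made up. The window is partitioned by both lease and year, so each monthly row carries its own lease's annual total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projection (lease_id INTEGER, posting_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO projection VALUES (?, ?, ?)",
    [
        (100001, "2010-01-01", 25),
        (100001, "2010-02-01", 25),
        (100001, "2011-01-01", 30),
        (100002, "2010-01-01", 40),
    ],
)

# Every row keeps its monthly amount and also gets the lease's total for that year.
rows = conn.execute(
    """
    SELECT lease_id, posting_date, amount,
           SUM(amount) OVER (PARTITION BY lease_id,
                                          strftime('%Y', posting_date)) AS Annual
    FROM projection
    ORDER BY lease_id, posting_date
    """
).fetchall()
print(rows)
```

Since this is a plain SELECT with no temp tables, the same shape can sit inside a view definition.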
I think a derived table would do the trick for you, something like:
select blah, blah2, blah3, ..., a.annual
from
<Long complicated set of joins>
join
(select lease_id, year(posting_date) as years1, SUM(amount) as Annual
from la_tbl_lease_projection
group by year(posting_date), lease_id) a
on sometable.lease_id = a.lease_id and year(sometable.posting_date) = a.years1
Where <complex where conditions>