MS Access Query to retrieve records with lasted date in orderdate column - sql

I have table in Access Database with the following columns
ProductID|ProductName|StoreID|StoreName|AuditRating|AuditVisit|NextAuditDue
100100 |Calculator |SC12345|CrawlyRoad| B |11/12/2013|21/02/2014
100100 |Calculator |SC12345|CrawlyRoad| A |11/12/2014|30/04/2015
100100 |Calculator |SC12345|CrawlyRoad| C |16/12/2015|24/01/2017
I need to make a query which will only give me the distinct record where the AuditVisit date is maximum like in this case I only want the third row
100100 |Calculator |SC12345|CrawlyRoad| C |16/12/2015|24/01/2017
I have used group by but as I need to bring all the columns I am getting all the records as the AuditRating column is different in all three rows.

You can use TOP 1:
Select Top 1 * From YourTable Order By AuditVisit Desc

You don't want to have groups, so why use group by?
SELECT
*
FROM your_table
WHERE AuditVisit = (SELECT MAX(AuditVisit) FROM your_table)
Pretty self-explaining, I think.

If you want one record in MS Access, then you need to be very careful with SELECT TOP. It is really SELECT TOP WITH TIES.
Hence, the obvious answer of:
Select Top 1 *
From t
Order By AuditVisit Desc;
would return multiple rows if multiple rows have the same date. If you really want one, then you want to add a unique column as the last key in the order by:
Select Top 1 *
From t
Order By AuditVisit Desc, id;
I don't see such a key in your data, although you might have a combination of columns that are unique in each row (multiple columns can be added to the ORDER BY).
In MS Access -- even more so than in other databases -- primary keys are important on tables for this reason.

Related

Select first three rows for each ID

I have executed the following query:
SELECT ProductID, Quantity, Location
FROM DBLocations
ORDER BY ProductID, LocationDistanceIndex DESC;
Afterwards, I've been trying to select up to 3 closest warehouses which have each of the products - LocationDistanceIndex column (Also there could be none, 1 or 2).
How would I write the query to remain with up to 3 records for each ProductID - the 3 records with the highest LocationDistanceIndex hence the descending order by.
Also if there is a way to perform such filtering without manually written queries in MS Access, it would be great if somebody points that out.
Note: I tried using Row_Number() Over Partition but MS Access does not seem to support that.
Here is one method for MS Access:
SELECT l.*
FROM DBLocations l
WHERE l.LocationDistanceIndex IN (SELECT TOP 3 l2.LocationDistanceIndex
FROM DBLocations l2
WHERE l.ProductID = l2.ProductID
ORDER BY l2.LocationDistanceIndex DESC
);

Join query in Access 2013

Currently have a single table with large amount of data in access, due to the size I couldn't easily work with it in Excel any more.
I'm partially there on a query to pull data from this table.
7 Column table
One column GL_GL_NUM contains a transaction number. ~ 75% of these numbers are pairs. I'm trying to pull the records (all columns information) for each unique transaction number in this column.
I have put together some code from googling that hypothetically should work but I think I'm missing something on the syntax or simply asking access to do what it cannot.
See below:
SELECT SOURCE_FUND, GLType, Contract, Status, Debit, Credit, GL_GL_NUM
FROM Suspense
JOIN (
SELECT TC_TXN_NUM TXN_NUM, COUNT(GL_GL_NUM) GL_NUM
FROM Suspense
GROUP BY TC_TXN_NUM HAVING COUNT(GL_GL_NUM) > 1 ) SUB ON GL_GL_NUM = GL_NUM
Hey Beth is this the suggested code? It says there is a syntax error in the FROM clause. Thanks.
SELECT * from SuspenseGL
JOIN (
SELECT TC_TXN_NUM, COUNT(GL_GL_NUM) GL_NUM
FROM Suspense
GROUP BY TC_TXN_NUM
HAVING COUNT(GL_GL_NUM) > 1
Do you want detailed results (all rows and columns) or aggregate results, with one row per tx number?
If you want an aggregate result, like the count of distinct transaction numbers, then you need to apply one or more aggregate functions to any other columns you include.
If you run
SELECT TC_TXN_NUM, COUNT(GL_GL_NUM) GL_NUM
FROM Suspense
GROUP BY TC_TXN_NUM
HAVING COUNT(GL_GL_NUM) > 1
you'll get one row for each distinct txn, but if you then join those results back with your original table, you'll have the same number of rows as if you didn't join them with distinct txns at all.
Is there a column you don't want included in your results? If not, then the only query you need to work with is
select * from suspense
Considering your column names, what you may want is:
SELECT SOURCE_FUND, GLType, Contract, Status, sum(Debit) as sum_debit,
sum(Credit) as sum_credit, count(*) as txCount
FROM Suspense
group by
SOURCE_FUND, GLType, Contract, Status
based on your comments, if you can't work with aggregate results, you need to work with them all:
Select * from suspense
What's not working? It doesn't matter if 75% of the txns are duplicates, you need to send out every column in every row.
OK, let's say
Select * from suspense
returns 8 rows, and
select GL_GL_NUM from suspense group by GL_GL_NUM
returns 5 rows, because 3 of them have duplicate GL_GL_NUMs and 2 of them don't.
How many rows do you want in your result set? if you want less than 8 rows back, you need to perform some sort of aggregate function on each column you want returned.
You could do something like the following:
SELECT S.* FROM
SUSPENSE AS S
INNER JOIN (SELECT DISTINCT GL_GL_NUM, MIN(ID) AS ID FROM SUSPENSE
GROUP BY GL_GL_NUM) AS S2
ON S.ID = S2.ID
AND S.GL_GL_NUM = S2.GL_GL_NUM
Which would return a single row for a unique gl_gl_num. However if the other rows have different data it will not be shown. You would have to either aggregate that data up using SUM(Credit), SUM(Debit) and then GROUP BY the gl_gl_num.
I have attached a SQL Fiddle to demonstrate my results and make this clearer.
http://sqlfiddle.com/#!3/8284f/2

Find row number in a sort based on row id, then find its neighbours

Say that I have some SELECT statement:
SELECT id, name FROM people
ORDER BY name ASC;
I have a few million rows in the people table and the ORDER BY clause can be much more complex than what I have shown here (possibly operating on a dozen columns).
I retrieve only a small subset of the rows (say rows 1..11) in order to display them in the UI. Now, I would like to solve following problems:
Find the number of a row with a given id.
Display the 5 items before and the 5 items after a row with a given id.
Problem 2 is easy to solve once I have solved problem 1, as I can then use something like this if I know that the item I was looking for has row number 1000 in the sorted result set (this is the Firebird SQL dialect):
SELECT id, name FROM people
ORDER BY name ASC
ROWS 995 TO 1005;
I also know that I can find the rank of a row by counting all of the rows which come before the one I am looking for, but this can lead to very long WHERE clauses with tons of OR and AND in the condition. And I have to do this repeatedly. With my test data, this takes hundreds of milliseconds, even when using properly indexed columns, which is way too slow.
Is there some means of achieving this by using some SQL:2003 features (such as row_number supported in Firebird 3.0)? I am by no way an SQL guru and I need some pointers here. Could I create a cached view where the result would include a rank/dense rank/row index?
Firebird appears to support window functions (called analytic functions in Oracle). So you can do the following:
To find the "row" number of a a row with a given id:
select id, row_number() over (partition by NULL order by name, id)
from t
where id = <id>
This assumes the id's are unique.
To solve the second problem:
select t.*
from (select id, row_number() over (partition by NULL order by name, id) as rownum
from t
) t join
(select id, row_number() over (partition by NULL order by name, id) as rownum
from t
where id = <id>
) tid
on t.rownum between tid.rownum - 5 and tid.rownum + 5
I might suggest something else, though, if you can modify the table structure. Most databases offer the ability to add an auto-increment column when a row is inserted. If your records are never deleted, this can server as your counter, simplifying your queries.

Filtering Database Results to Top n Records for Each Value in a Lookup Column

Let's say I have two tables in my database.
TABLE:Categories
ID|CategoryName
01|CategoryA
02|CategoryB
03|CategoryC
and a table that references the Categories and also has a column storing some random number.
TABLE:CategoriesAndNumbers
CategoryType|Number
CategoryA|24
CategoryA|22
CategoryC|105
.....(20,000 records)
CategoryB|3
Now, how do I filter out this data? So, I want to know what the 3 smallest numbers are out of each category and delete the rest. The end result would be like this:
TABLE:CategoriesAndNumbers
CategoryType|Number
CategoryA|2
CategoryA|5
CategoryA|18
CategoryB|3
CategoryB|500
CategoryB|1601
CategoryC|1
CategoryC|4
CategoryC|62
Right now, I can get the smallest numbers between all the categories, but I would like each category to be compared individually.
EDIT: I'm using Access and here's my code so far
SELECT TOP 10 cdt1.sourceCounty, cdt1.destCounty, cdt1.distMiles
FROM countyDistanceTable as cdt1, countyTable
WHERE cdt1.sourceCounty = countyTable.countyID
ORDER BY cdt1.sourceCounty, cdt1.distMiles, cdt1.destCounty
EDIT2: Thanks to Remou, here would be the working query that solved my problem. Thank you!
DELETE
FROM CategoriesAndNumbers a
WHERE a.Number NOT IN (
SELECT Top 3 [Number]
FROM CategoriesAndNumbers b
WHERE b.CategoryType=a.CategoryType
ORDER BY [Number])
You could use something like:
SELECT a.CategoryType, a.Number
FROM CategoriesAndNumbers a
WHERE a.Number IN (
SELECT Top 3 [Number]
FROM CategoriesAndNumbers b
WHERE b.CategoryType=a.CategoryType
ORDER BY [Number])
ORDER BY a.CategoryType
The difficulty with this is that Jet/ACE Top selects duplicate values where they exist, so you will not necessarily get three values, but more, if there are ties. The problem can often be solved with a key field, if one exists :
WHERE a.Number IN (
SELECT Top 3 [Number]
FROM CategoriesAndNumbers b
WHERE b.CategoryType=a.CategoryType
ORDER BY [Number], [KeyField])
However, I do not think it will help in this instance, because the outer table will include ties.
Order it by number and take 3, find out what the biggest number is and then remove rows where Number is greater than the Number.
I imagine it would need to be two seperate queries as your business tier would hold the value for the biggest number out of the 3 results and dynamically build the query to delete the rest.

Using a DISTINCT clause to filter data but still pull other fields that are not DISTINCT

I am trying to write a query in Postgresql that pulls a set of ordered data and filters it by a distinct field. I also need to pull several other fields from the same table row, but they need to be left out of the distinct evaluation. example:
SELECT DISTINCT(user_id) user_id,
created_at
FROM creations
ORDER BY created_at
LIMIT 20
I need the user_id to be DISTINCT, but don't care whether the created_at date is unique or not. Because the created_at date is being included in the evaluation, I am getting duplicate user_id in my result set.
Also, the data must be ordered by the date, so using DISTINCT ON is not an option here. It required that the DISTINCT ON field be the first field in the ORDER BY clause and that does not deliver the results that I seek.
How do I properly use the DISTINCT clause but limit its scope to only one field while still selecting other fields?
As you've discovered, standard SQL treats DISTINCT as applying to the whole select-list, not just one column or a few columns. The reason for this is that it's ambiguous what value to put in the columns you exclude from the DISTINCT. For the same reason, standard SQL doesn't allow you to have ambiguous columns in a query with GROUP BY.
But PostgreSQL has a nonstandard extension to SQL to allow for what you're asking: DISTINCT ON (expr).
SELECT DISTINCT ON (user_id) user_id, created_at
FROM creations
ORDER BY user_id, created_at
LIMIT 20
You have to include the distinct expression(s) as the leftmost part of your ORDER BY clause.
See the manual on DISTINCT Clause for more information.
If you want the most recent created_at for each user then I suggest you aggregate like this:
SELECT user_id, MAX(created_at)
FROM creations
WHERE ....
GROUP BY user_id
ORDER BY created_at DESC
This will return the most recent created_at for each user_id
If you only want the top 20, then append
LIMIT 20
EDIT: This is basically the same thing Unreason said above... define from which row you want the data by aggregation.
The GROUP BY should ensure distinct values of the grouped columns, this might give you what you are after.
(Note I'm putting in my 2 cents even though I am not familiar with PostgreSQL, but rather MySQL and Oracle)
In MySql
SELECT user_id, created_at
FROM creations
GROUP BY user_id
ORDER BY user_id
In Oracle sqlplus
SELECT user_id, FIRST(created_at)
FROM creations
GROUP BY user_id
ORDER BY user_id
These will give you the user_id followed by the first created_at associated with that user_id. If you want a different created_at you have the option to substitute FIRST with other functions like AVG, MIN, MAX, or LAST in Oracle, you can also try adding ORDER BY on other columns (including ones that are not returned, to give you a different created_at.
Your question is not well defined - when you say you need also other data from the same row you are not defining which row.
You do say you need to order the results by created_at, so I will assume that you want values from the row with min created_at (earliest).
This now becomes one of the most common so SQL questions - retrieving rows containing some aggregate value (MIN, MAX).
For example
SELECT user_id, MIN(created_at) AS created_at
FROM creations
GROUP BY user_id
ORDER BY MIN(create_at)
LIMIT 20
This approach will not let you (easily) pick other values from the same row.
One approach that will let you pick other values is
SELECT c.user_id, c.created_at, c.other_columns
FROM creations c LEFT JOIN creation c_help
ON c.user_id = c_help.user_id AND c.created_at > c_help.create_at
WHERE c_help IS NULL
ORDER BY c.created_at
LIMIT 20
Using a sub-query was suggested by someone on the irc #postgresql channel. It worked:
SELECT user_id
FROM (SELECT DISTINCT ON (user_id) * FROM creations) ss
ORDER BY created_at DESC
LIMIT 20;