SQL statement selecting most recent date no longer working - sql

I've had a sudden failure in one of my reporting routines and have traced it back to the having portion of my statement. The function this has been serving, up until 2 days ago, was selecting the most recent date from the dbo.data_feed_file table (column name: File_Date).
Statement follows
HAVING (dbo.data_feed_file.file_date = (Select MAX(File_Date) as Expr1
FROM dbo.data_feed_file AS data_feed_file_1))
First: is there an alternative way to write this? I've gotten my report working by removing the statement, it's just 2.5 million more lines than I want. I know I can hard code the date to pull just the specific date I want, but automation is obviously preferred.
Second: Does anyone know what could cause this to spontaneously fail? I'm the only person with access to edit this query so I know nothing was changed (no really, nothing changed).
Thanks in advance.
Edit: To add clarification: There is no error message, the column headers are showing up as anticipated but no data is populated, it's just blank fields (as though nothing met the having criteria). The statement completes as though there is nothing wrong. I've confirmed there are no NULL values in the File_Date column.

I can think of two reasons why no rows would return. The first is that the subquery is returning NULL. This is easily fixed as:
HAVING (dbo.data_feed_file.file_date = (Select MAX(File_Date) as Expr1
FROM dbo.data_feed_file AS data_feed_file_1
where file_date is not null))
The second is that File_Date is stored as a datetime, rather than a date. If so, you might have a where clause that filters out the most recent value, and be missing it in the having clause. If you intend dates, but the value is stored as a datetime, then you can try:
HAVING (cast(dbo.data_feed_file.file_date as date) =
(Select cast(MAX(File_Date) as date) as Expr1
FROM dbo.data_feed_file AS data_feed_file_1
where file_date is not null))
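To see why the datetime-versus-date distinction matters, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows are made up, and SQLite's date() plays the role of CAST(... AS date); treat it as an illustration of the idea rather than the asker's actual schema.

```python
import sqlite3

# SQLite stands in for SQL Server; the table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_feed_file (id INTEGER PRIMARY KEY, file_date TEXT)")
conn.executemany(
    "INSERT INTO data_feed_file (file_date) VALUES (?)",
    [
        ("2024-03-01 08:00:00",),
        ("2024-03-02 08:00:00",),
        ("2024-03-02 17:30:00",),  # latest timestamp on the latest day
    ],
)

# Exact comparison: only the row carrying the maximum timestamp matches.
exact_match = conn.execute(
    "SELECT COUNT(*) FROM data_feed_file "
    "WHERE file_date = (SELECT MAX(file_date) FROM data_feed_file)"
).fetchone()[0]

# Date-level comparison: every row from the most recent day matches.
same_day_match = conn.execute(
    "SELECT COUNT(*) FROM data_feed_file "
    "WHERE date(file_date) = (SELECT date(MAX(file_date)) FROM data_feed_file)"
).fetchone()[0]

print(exact_match, same_day_match)
```

If another filter in the real query happens to exclude the single row with the maximum timestamp, the exact comparison is left with nothing to match, which is consistent with the empty result described in the question.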

Related

Unexpected result with ORDER BY

I have the following query:
SELECT
D.[Year] AS [Year]
, D.[Month] AS [Month]
, CASE
WHEN f.Dept IN ('XSD') THEN 'Marketing'
ELSE f.Dept
END AS DeptS
, COUNT(DISTINCT f.OrderNo) AS CountOrders
FROM Sales.LocalOrders AS l
INNER JOIN Sales.FiscalOrders AS f
ON l.ORDER_NUMBER = f.OrderNo
INNER JOIN Dimensions.Date_Dim AS D
ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE)
WHERE YEAR(f.OrderDate) = 2019
AND f.Dept IN ('XSD', 'PPM', 'XPP')
GROUP BY
D.[Year]
, D.[Month]
, f.Dept
ORDER BY
D.[Year] ASC
, D.[Month] ASC
I get the following result. The ORDER BY isn't giving the right result for the Month column; as we can see, it is not ordered:
Year  Month  Depts  CountOrders
2019  1      XSD    200
2019  10     PPM    290
2019  10     XPP    150
2019  2      XSD    200
2019  3      XPP    300
The expected output:
Year  Month  Depts  CountOrders
2019  1      XSD    200
2019  2      XSD    200
2019  3      XPP    300
2019  10     PPM    290
2019  10     XPP    150
Your query
It is ordered by month, as your D.[Month] is treated like a text string in the ORDER BY clause.
You could do one of two things to fix this:
Use a two-digit month number (e.g. 01... 12)
Use a data type for the ORDER BY clause that will be recognized as representing a month
A quick fix
You can correct this in your code by quickly changing the ORDER BY clause to analyze those columns as though they are numbers, which is done by converting ("casting") them to an integer data type like this:
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
This will correct your unexpected query results, but does not address the root cause, which is your underlying data (more on that below).
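The difference between text ordering and numeric ordering can be reproduced in a few lines. This is a sketch using Python's sqlite3 module with a made-up date_dim table whose year and month columns are stored as text; SQLite spells the cast as CAST(... AS INTEGER) rather than INT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical dimension table with year and month stored as text.
conn.execute("CREATE TABLE date_dim (year TEXT, month TEXT)")
conn.executemany(
    "INSERT INTO date_dim VALUES (?, ?)",
    [("2019", "1"), ("2019", "10"), ("2019", "2"), ("2019", "3")],
)

# Ordering on the raw text columns compares character by character.
text_order = [
    row[0]
    for row in conn.execute("SELECT month FROM date_dim ORDER BY year, month")
]

# Casting in ORDER BY makes the sort numeric.
int_order = [
    row[0]
    for row in conn.execute(
        "SELECT month FROM date_dim "
        "ORDER BY CAST(year AS INTEGER), CAST(month AS INTEGER)"
    )
]

print(text_order)  # ['1', '10', '2', '3']
print(int_order)   # ['1', '2', '3', '10']
```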
Your underlying data
The root cause of your issue is how your underlying data is stored and/or surfaced.
Your Month seems to be appearing as a default data type (VarChar), rather than something more specifically suited to a month or date.
If you administer or have access to or control over the database, it is a good idea to consider correcting this.
In considering this, be mindful of potential context and change management issues, including:
Is this underlying data, or just a representation of upstream data that is elsewhere? (e.g. something that is refreshed periodically using a process that you do not control, or a view that is redefined periodically)
What other queries or processes rely on how this data is currently stored or surfaced (including data types), that may break if you mess with it?
Might there be validation issues if correcting it? (such as from the way zero, null, non-numeric or non-date data is stored, even if invalid)
What change management practices should be followed in your environment?
Is the data source under high transactional load?
Is it a production dataset?
Are other reporting processes dependent on it?
None of these issues are a good excuse to leave something set up incorrectly forever, which will likely compound the issue and introduce others. However, that is only part of the story.
The appropriate approach (correct it, or leave it) will depend on your situation. In a perfect textbook world, you'd correct it. In your world, you will have to decide.
A better way?
The above solution is a bit of a quick and nasty way to force your query to work.
The fact that the solution CASTs late in the query syntax, after the results have been selected and filtered, hints that it is not the most elegant way to achieve this.
Ideally you can convert data types as early as possible in the process:
If done in the underlying data rather than the query, this is the ultimate fix, but it may not suit the situation (see "Your underlying data" above)
If done in the query, try to do it earlier.
In your case, your GROUP BY and ORDER BY are both using columns that look to be redundant data from the original query results, that is, you are getting a DATE and a MONTH and a YEAR. Ideally you would just get a DATE and then use the MONTH or YEAR from that date. Your issue is your dates are not actually dates (see "underlying data" above), which:
In the case of DATE, is converted in your INNER JOIN line ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE) (likely to minimise issues with the join)
In the case of D.[year] and D.[month], are not converted (which is why we still need to convert them further down, in ORDER BY)
You could consider ignoring D.[month] and use the MONTH DATEPART computed from DATE, which would avoid the need to use CAST in the ORDER BY clause.
In your instance, this approach is a middle ground. The quick fix is included at the top of this answer, and the best fix is to correct the underlying data. This last section considers optimizing the quick fix, but does not correct the underlying issue. It is only mentioned for awareness and to avoid promoting the use of CAST in an ORDER BY clause as the most legitimate way of addressing your issue with good clean query syntax.
There are also potential performance tradeoffs between how many columns you select that you don't need (e.g. all of the ones in D?), whether to compute the month from the date or a separate month column, whether to cast to date before filtering, etc. These are beyond the scope of this solution.
So:
The immediate solution: use the quick fix
The optimal solution: after it's working, consider the underlying data (in your situation)
The real problem is your object Dimensions.Date_Dim. As you are simply ordering on the values of D.[Year] and D.[Month] without manipulating them at all, the object itself is flawed: you are storing numerical data as a varchar. varchar and numerical data types are completely different. For example, 2 is less than 10, but '2' is greater than '10': strings are compared character by character, and '2' is greater than '1'.
The real solution, therefore, is fixing your object. Assuming that both Month and Year are incorrectly stored as a varchar, don't contain any non-integer values (another, different flaw if so), and are not computed columns, then you could just do:
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Year] int NOT NULL;
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Month] int NOT NULL;
You could, however, also make the columns a PERSISTED computed column, which might well be easier, in my opinion, as DATEPART already returns a strongly typed int value.
ALTER TABLE Dimensions.Date_Dim DROP COLUMN [Month];
ALTER TABLE Dimensions.Date_Dim ADD [Month] AS DATEPART(MONTH,[Date]) PERSISTED;
Of course, for both solutions, you'll need to (first) DROP and (afterwards) reCREATE any indexes and constraints on the columns.
As long as your "Month" is always 1-12, you can use
SELECT ..., TRY_CAST(D.[Month] AS INT) AS [Month],...
ORDER BY TRY_CAST(D.[Month] AS INT)
The simplest solution is:
ORDER BY MIN(D.[Date])
or:
ORDER BY MIN(f.OrderDate)
Fiddling with the year and month columns is totally unnecessary when you have a date column that is available.
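As a rough illustration of ordering a grouped result by the earliest date in each group, here is a sketch in Python's sqlite3 module. The flattened orders table and its rows are invented for the example, and dates are stored as ISO-8601 text, which sorts chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: a real date plus text year/month columns.
conn.execute(
    "CREATE TABLE orders (order_date TEXT, year TEXT, month TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("2019-01-15", "2019", "1", "XSD"),
        ("2019-10-03", "2019", "10", "PPM"),
        ("2019-10-20", "2019", "10", "XPP"),
        ("2019-02-07", "2019", "2", "XSD"),
        ("2019-03-11", "2019", "3", "XPP"),
    ],
)

# Group on the text columns, but sort each group by its earliest real date.
grouped = conn.execute(
    "SELECT year, month, dept, COUNT(*) AS count_orders "
    "FROM orders "
    "GROUP BY year, month, dept "
    "ORDER BY MIN(order_date)"
).fetchall()

months = [row[1] for row in grouped]
print(months)  # ['1', '2', '3', '10', '10']
```

The text month column is still displayed, but it no longer takes part in the sort, so its data type stops mattering for the ordering.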
A very common issue when you store numerical data as a varchar/nvarchar.
Try to cast Year and Month to INT.
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
If you try using the <, > and BETWEEN operators, you will get some really "weird" results.
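A quick way to see those results for yourself, sketched with Python's sqlite3 module (no tables needed, only literal comparisons):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Text comparison goes character by character, so '5' sorts after '10'.
text_between = conn.execute("SELECT '5' BETWEEN '1' AND '10'").fetchone()[0]
text_greater = conn.execute("SELECT '2' > '10'").fetchone()[0]

# The same comparisons on integers behave as expected.
int_between = conn.execute("SELECT 5 BETWEEN 1 AND 10").fetchone()[0]
int_greater = conn.execute("SELECT 2 > 10").fetchone()[0]

print(text_between, text_greater)  # 0 1
print(int_between, int_greater)    # 1 0
```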

Access query expression returns incorrect

I'm working on a query that pulls a date from another query, I have my reasons for the nesting. The problem I'm facing is that there is a field that is called DueDate.
My SQL is
SELECT DueDate
FROM qryDueDates
WHERE DueDates <= DateAdd("d",60,Date())
The data causing the issue is when it equals something like "1/25/2019", "11/19/2019" or any date in 2019.
Goal
I need to limit the results to show dates that are expired or expiring within 60 days or less.
I'm trying to prepare the dataset for the conditional formatting.
If you can put your nested sub-query in your post, that may give a better picture, and if you can mention what error you are getting, that may also help. Since you mentioned that you only get the error when the sub-query returns certain dates, I would suggest casting your sub-query result to DATE if you have not already done so.
Below is my attempt to help with the limited information I could extract from your post. I have used some MS-SQL functions below; please replace them with your DB-specific functions.
SELECT myDates.*
FROM (SELECT COLUMN_NAME AS DueDates FROM TABLE_NAME) AS myDates
WHERE myDates.DueDates <= DATEADD(day, 60, GETDATE())
Turns out that the original query was screwing it up. I moved the query into the main one and it worked.

Why is SQL Server returning a different order when using 'month' in 'where'?

I run a procedure call that calculates sums into table rows. At first I thought the procedure was not working as expected, so I wasted half a day trying to fix what actually works fine.
Later I actually took a look at the SELECT that gets the data on screen and was surprised by this:
YEAR(M.date) = 2016
--and MONTH(M.date) = 2
and
YEAR(M.date) = 2016
and MONTH(M.date) = 2
So the second example returns a different sorting than the first.
The thing is I do calculations on the whole year. Display data on year + month parameters.
Can someone explain why this is happening and how to avoid this?
In my procedure that calls the SELECT for on screen data I have it implemented like so:
and (@month = 0 or (month(M.date) = @month))
and year(M.date) = @year
So the month parameter is optional if the user wants to see the data for the whole year and year parameter is mandatory.
You are ordering by the date column. However, the date column is not unique -- multiple rows have the same date. The ORDER BY returns these in arbitrary order. In fact, you might get a different ordering for the same query running at different times.
To "fix" this, you need to include another column (or columns) that is unique for each row. In your case, that would appear to be the id column:
order by date, id
Another way to think about this is that in SQL the sorts are not stable. That is, they do not preserve the original ordering of the data. This is easy to remember, because there is no "original ordering" for a table or result set. Remember, tables represent unordered sets.
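A small sketch of the tie-breaker idea, using Python's sqlite3 module. The table name, the entry_date column, and the rows are hypothetical; the point is only that adding a unique column to ORDER BY pins down the order of rows that share a date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: several rows share the same date.
conn.execute(
    "CREATE TABLE movements (id INTEGER PRIMARY KEY, entry_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO movements VALUES (?, ?, ?)",
    [
        (3, "2016-02-01", 30),
        (1, "2016-02-01", 10),
        (2, "2016-01-15", 20),
        (4, "2016-02-01", 40),
    ],
)

# The unique id acts as a tie-breaker, so rows with equal dates
# always come back in the same relative order.
ordered = conn.execute(
    "SELECT id, entry_date FROM movements ORDER BY entry_date, id"
).fetchall()

print(ordered)
```

With ORDER BY entry_date alone, the three rows dated 2016-02-01 could legitimately come back in any relative order, and that order may change when the WHERE clause (and therefore the execution plan) changes.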

Update statement returning 0 rows affected

I am having issues with an update statement where I need to change the employeeid field from 4 to 6 for orders that have an order date of 7-19-1996.
The statement I made says 0 rows affected, when I know for a fact there are several rows that fit this description. Can someone steer me in the right direction as to why I am getting this result and what I did wrong? Thanks
Here is the statement I have so far:
UPDATE [dbo].[LMOrders]
SET [EmployeeID] = 6
WHERE OrderDate= 7-19-1996
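If OrderDate is a DateTime, you need to write the value in your WHERE clause as a correctly formatted, quoted literal: '2007-05-08 12:35:29.123', or '2007-05-08' if it is a Date. Unquoted, 7-19-1996 is integer arithmetic and evaluates to -2008.
Here is a great resource; the bottom of the doc shows the different formats: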
https://msdn.microsoft.com/en-us/library/ms187819.aspx
The OrderDate value here is wrong in any database. If OrderDate is stored as a string, you can always write WHERE OrderDate= '7/19/1996'
If you need a conversion to date (the [dbo]. prefix suggests you are on MS SQL Server), see https://msdn.microsoft.com/fr-CA/library/ms187928.aspx for conversion.
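The effect of the missing quotes can be reproduced with a short sketch in Python's sqlite3 module. SQLite is only a stand-in for SQL Server here, the rows are invented, and the dates are stored as ISO-8601 text, which is an assumption about the storage rather than the asker's actual schema.

```python
import sqlite3

# Without quotes, the "date" is just subtraction.
print(7 - 19 - 1996)  # -2008

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LMOrders (OrderID INTEGER PRIMARY KEY, EmployeeID INTEGER, OrderDate TEXT)"
)
conn.executemany(
    "INSERT INTO LMOrders VALUES (?, ?, ?)",
    [(1, 4, "1996-07-19"), (2, 4, "1996-07-19"), (3, 4, "1996-07-22")],
)

# Unquoted: the WHERE clause compares OrderDate with the number -2008.
unquoted_rows = conn.execute(
    "UPDATE LMOrders SET EmployeeID = 6 WHERE OrderDate = 7-19-1996"
).rowcount

# Passing the date as a bound parameter avoids quoting mistakes altogether.
quoted_rows = conn.execute(
    "UPDATE LMOrders SET EmployeeID = 6 WHERE OrderDate = ?",
    ("1996-07-19",),
).rowcount

print(unquoted_rows, quoted_rows)  # 0 2
```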

Access SQL Query: Comparing Date In Select Statement

I have a problem that I simply cannot seem to figure out. I have a list of employees with different travel dates and I want to display all of them in a cascading list format. The problem is that I only want to see employees once, and only the date closest to today.
For example I could have 'Smith' in there multiple times with dates before and after today, as we also keep historical records. This means I can't just do min, as it will try and display a date before today, and max is too far forward.
The code example below ALMOST works. The problem is in the select statement. I want to show the minimum date after today, but instead it gives me 0's and -1's where the dates should be. There might just be another way to do this all together, but this is the only configuration that seems to allow the other information such as Site, Position, and Comments to be displayed correctly alongside it.
SELECT A.`Last Name` AS [Last Name], Min(A.`Date In`) > Now() AS [Date In], Max(B.Site) AS Site, Max(B.Position), Max(B.Comments) AS Comments
FROM Deployments AS A
INNER JOIN Deployments AS B ON A.ID = B.ID
GROUP BY A.`FSR Name`
HAVING (((Max(A.`Actual TEP IN`))>Now()));
I did a group by Name because I only want to see each individual once. If I don't add the table to itself with a join it gives a self reference error. This is my first time posting so I hope this makes sense! All help will be greatly appreciated!
Not sure what DB you're on, but in general, you need to return MIN(date) instead of the result of the comparison "Min(Date) > Now()" - I'm guessing this is where you're seeing 0's and -1's, since that would be the result of the comparison, when you want the minimum date value itself.
Also, if you are just wanting people who have a trip date in the future, just restrict your query with a WHERE clause, do a GROUP BY, and you get rid of the self-join. Also note that the example below aligns some discrepancies in your OP like where you're selecting based on "Last Name" but grouping on "FSR Name" - these things must be consistent, whichever field you're concerned about.
Example:
SELECT A.[FSR Name] AS [FSR Name],
Min(A.[Date In]) AS [Date In],
Max(A.Site) AS Site,
Max(A.Position) AS Position,
Max(A.Comments) AS Comments
FROM Deployments AS A
WHERE A.[Date In] > Now()
GROUP BY A.[FSR Name];
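For anyone who wants to try the WHERE plus GROUP BY pattern without an Access database at hand, here is a sketch using Python's sqlite3 module. The column names are simplified, the rows are made up, and "today" is passed in as a fixed parameter (instead of Now()) so the output is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of the Deployments table.
conn.execute(
    "CREATE TABLE deployments (id INTEGER PRIMARY KEY, fsr_name TEXT, date_in TEXT, site TEXT)"
)
conn.executemany(
    "INSERT INTO deployments VALUES (?, ?, ?, ?)",
    [
        (1, "Smith", "2024-01-10", "A"),  # historical record
        (2, "Smith", "2024-07-01", "B"),  # next upcoming trip
        (3, "Smith", "2024-09-15", "C"),
        (4, "Jones", "2024-06-20", "D"),
        (5, "Lee", "2023-12-01", "E"),    # only historical records
    ],
)

today = "2024-06-01"  # fixed stand-in for Now()

# Filter to future dates first, then take the earliest one per person.
next_trips = conn.execute(
    "SELECT fsr_name, MIN(date_in) AS date_in "
    "FROM deployments "
    "WHERE date_in > ? "
    "GROUP BY fsr_name "
    "ORDER BY fsr_name",
    (today,),
).fetchall()

print(next_trips)  # [('Jones', '2024-06-20'), ('Smith', '2024-07-01')]
```

People with no future trip (Lee in this sample) drop out of the result entirely, which matches the behaviour of the HAVING clause in the original query.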
EDIT: If you need to make sure that Site,Position,Comments all came from the same row, you have to do something like one of these options:
If you have a Primary Key:
select * from Deployments A3 where A3.pk_value =
(select max(A2.pk_value) from Deployments A2
where A2.[Date In] =
(select Max([Date In]) from Deployments A where A.[FSR Name] = A2.[FSR Name])
and A2.[FSR Name] = A3.[FSR Name]
)
This guarantees you to get 1 row per FSR Name, even if there are multiple rows for that FSR with the same "latest" date.
Otherwise, you can leave out the secondary query dealing with the pk_value, but you run a risk of getting multiple rows for an FSR that has multiple records with the same "latest" date.
Note: when you get to queries this complex, running on a full-featured database (SQL Server, Oracle, anything but Access) allows you to use much more sophistication. For this example, "Windowing Functions" would give you the answer without as much wrangling. Not sure if you're stuck with Access for now, but consider this for the future, anyway.
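As a sketch of what the window-function version could look like, here is ROW_NUMBER() run through Python's sqlite3 module (this assumes the bundled SQLite is version 3.25 or later, where window functions were introduced). The table layout and rows are hypothetical, and "today" is a fixed parameter for reproducibility.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of the Deployments table.
conn.execute(
    "CREATE TABLE deployments (id INTEGER PRIMARY KEY, fsr_name TEXT, date_in TEXT, site TEXT)"
)
conn.executemany(
    "INSERT INTO deployments VALUES (?, ?, ?, ?)",
    [
        (1, "Smith", "2024-01-10", "A"),
        (2, "Smith", "2024-07-01", "B"),
        (3, "Smith", "2024-09-15", "C"),
        (4, "Jones", "2024-06-20", "D"),
    ],
)

today = "2024-06-01"  # fixed stand-in for the current date

# Number each person's future trips by date; rn = 1 is the next trip.
# Every selected column comes from that same row, unlike separate MIN/MAX calls.
next_rows = conn.execute(
    "SELECT fsr_name, date_in, site FROM ("
    "  SELECT fsr_name, date_in, site,"
    "         ROW_NUMBER() OVER (PARTITION BY fsr_name ORDER BY date_in, id) AS rn"
    "  FROM deployments"
    "  WHERE date_in > ?"
    ") WHERE rn = 1 "
    "ORDER BY fsr_name",
    (today,),
).fetchall()

print(next_rows)  # [('Jones', '2024-06-20', 'D'), ('Smith', '2024-07-01', 'B')]
```

Including id in the window's ORDER BY breaks ties between rows that share the same date, so exactly one row per person is returned.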
Try something like this
Select A.LastName, A.DateIn, A.Site, A.Position, A.Comments
From deployments a
Where not exists (Select *
From deployments b
Where b.id <> a.id
and (abs(datediff(d, getdate(), a.datein)) > abs(datediff(d, getdate(), b.datein))
or (abs(datediff(d, getdate(), a.datein)) = abs(datediff(d, getdate(), b.datein)) and a.id > b.id)))
Instead of the funny mins and maxes that you are using to try to get the row with the datein that is closest to today, try using datediff. With this function, you can specify what type of date or time value you are looking to compare (day, month, year, minute) and then find the difference between two different datetimes. In this case, I used getdate() to find the current date and time. Then, we want the datein with the least value for datediff, the datein that is closest to today. Datediff will return positive or negative values, so I used abs to get the absolute value of the result. I did this because it doesn't matter if the date is before today or after today.
Then we are looking in the deployments table. The subquery says that we should look at all the rows other than the current one. Then, find all the rows that have a smaller datediff than the current record. Also, find all the records that have the same datediff as the current record and a smaller id. We will only include the current record if there isn't anything that fits these criteria. It is a little weird to think about, but this type of query should help you find what you are looking for a lot more easily. The only thing is that you will need to add criteria in the where clause of the subquery to determine which entries to compare. As it stands, this query will look at all of the entries in your deployments table and pull back the one row that has a datein closest to today. Since you want one row for each person, it will need a few more conditions, such as matching the person's name between a and b.