Why is SQL Server returning a different order when using 'month' in 'where'? - sql

I run a procedure that calculates sums into table rows. At first I thought the procedure was not working as expected, so I wasted half a day trying to fix something that actually works fine.
Later I took a look at the SELECT that puts the data on screen and was surprised by this:
YEAR(M.date) = 2016
--and MONTH(M.date) = 2
and
YEAR(M.date) = 2016
and MONTH(M.date) = 2
So the second example returns a different sort order than the first.
The thing is, I do the calculations on the whole year, but display the data by year and month parameters.
Can someone explain why this is happening and how to avoid this?
In my procedure that calls the SELECT for on screen data I have it implemented like so:
and (@month = 0 or (month(M.date) = @month))
and year(M.date) = @year
So the month parameter is optional (it is 0 when the user wants to see the data for the whole year) and the year parameter is mandatory.

You are ordering by the date column. However, the date column is not unique -- multiple rows have the same date. The ORDER BY returns these in arbitrary order. In fact, you might get a different ordering for the same query running at different times.
To "fix" this, you need to include another column (or columns) that is unique for each row. In your case, that would appear to be the id column:
order by date, id
Another way to think about this is that in SQL the sorts are not stable. That is, they do not preserve the original ordering of the data. This is easy to remember, because there is no "original ordering" for a table or result set. Remember, tables represent unordered sets.
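A minimal sketch of the tie-breaker idea, using Python's built-in sqlite3 as a stand-in for SQL Server. The table name movements and its rows are made up for illustration. Ordering by date alone would leave the relative order of the two rows that share a date up to the engine; adding the unique id makes the order fully determined.

```python
import sqlite3

# Hypothetical table: two rows share the same date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movements (id INTEGER PRIMARY KEY, date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO movements (id, date, amount) VALUES (?, ?, ?)",
    [(3, "2016-02-01", 30), (1, "2016-02-01", 10), (2, "2016-01-15", 20)],
)

# The unique id breaks the tie between rows 1 and 3, so the result
# is the same no matter how the engine reads the rows.
ordered_ids = [row[0] for row in conn.execute("SELECT id FROM movements ORDER BY date, id")]
print(ordered_ids)
```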

Related

Query another table with results of an another query that include a csv column

Brief Summary:
I am currently trying to get a count of completed parts that fall within a specific time range and match the machine number, operation number, and tool number.
For example:
SELECT Sequence, Serial, Operation,Machine,DateTime,value as Tool
FROM tbPartProfile
CROSS APPLY STRING_SPLIT(Tool_Used, ',')
ORDER BY DateTime desc
This query pulls all the instances where a tool has been changed. I am splitting the CSV in the Tool_Used column because there can be multiple changes during one operation.
Objective:
This is where the production count comes into play. For example, record 1 has a tool change of 36 on 12/12/2022. I need to go back into the table and get the number of parts completed that match the OPERATION/MACHINE/TOOL and fall within the date range.
For example:
SELECT *
FROM tbPartProfile
WHERE Operation = 20 AND Machine = 1 AND Tool_Used LIKE '%36%'
ORDER BY DateTime desc
This query gives me the datetimes at which tools LIKE 36 were changed. I need to take this datetime, compare it to the previous query, and get the sum of all parts that were run in this TimeRange/Operation/Machine/Tool Used.

Unexpected result with ORDER BY

I have the following query:
SELECT
D.[Year] AS [Year]
, D.[Month] AS [Month]
, CASE
WHEN f.Dept IN ('XSD') THEN 'Marketing'
ELSE f.Dept
END AS DeptS
, COUNT(DISTINCT f.OrderNo) AS CountOrders
FROM Sales.LocalOrders AS l
INNER JOIN Sales.FiscalOrders AS f
ON l.ORDER_NUMBER = f.OrderNo
INNER JOIN Dimensions.Date_Dim AS D
ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE)
WHERE YEAR(f.OrderDate) = 2019
AND f.Dept IN ('XSD', 'PPM', 'XPP')
GROUP BY
D.[Year]
, D.[Month]
, f.Dept
ORDER BY
D.[Year] ASC
, D.[Month] ASC
I get the following result. The ORDER BY isn't giving the right result for the Month column; as we can see, it is not ordered numerically:
Year  Month  Depts  CountOrders
2019  1      XSD    200
2019  10     PPM    290
2019  10     XPP    150
2019  2      XSD    200
2019  3      XPP    300
The expected output:
Year  Month  Depts  CountOrders
2019  1      XSD    200
2019  2      XSD    200
2019  3      XPP    300
2019  10     PPM    290
2019  10     XPP    150
Your query
It is ordered by month, as your D.[Month] is treated like a text string in the ORDER BY clause.
You could do one of two things to fix this:
Use a two-digit month number (e.g. 01... 12)
Use a data type for the ORDER BY clause that will be recognized as representing a month
A quick fix
You can correct this in your code by quickly changing the ORDER BY clause to analyze those columns as though they are numbers, which is done by converting ("casting") them to an integer data type like this:
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
This will correct your unexpected query results, but does not address the root cause, which is your underlying data (more on that below).
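To see the effect of the cast in isolation, here is a small sketch using Python's built-in sqlite3 rather than SQL Server. The date_dim table and its TEXT month column are assumptions standing in for a varchar Month column; SQLite spells the target type INTEGER where T-SQL uses INT.

```python
import sqlite3

# Hypothetical dimension table whose month is stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE date_dim (month TEXT)")
conn.executemany("INSERT INTO date_dim (month) VALUES (?)", [("1",), ("10",), ("2",), ("3",)])

# Sorting the raw text compares character by character, so '10' lands before '2'.
as_text = [r[0] for r in conn.execute("SELECT month FROM date_dim ORDER BY month")]

# Casting in the ORDER BY sorts by numeric value instead.
as_int = [r[0] for r in conn.execute("SELECT month FROM date_dim ORDER BY CAST(month AS INTEGER)")]

print(as_text)
print(as_int)
```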
Your underlying data
The root cause of your issue is how your underlying data is stored and/or surfaced.
Your Month seems to be appearing as a default data type (VarChar), rather than something more specifically suited to a month or date.
If you administer or have access to or control over the database, it is a good idea to consider correcting this.
In considering this, be mindful of potential context and change management issues, including:
Is this underlying data, or just a representation of upstream data that is elsewhere? (e.g. something that is refreshed periodically using a process that you do not control, or a view that is redefined periodically)
What other queries or processes rely on how this data is currently stored or surfaced (including data types), that may break if you mess with it?
Might there be validation issues if correcting it? (such as from the way zero, null, non-numeric or non-date data is stored, even if invalid)
What change management practices should be followed in your environment?
Is the data source under high transactional load?
Is it a production dataset?
Are other reporting processes dependent on it?
None of these issues are a good excuse to leave something set up incorrectly forever, which will likely compound the issue and introduce others. However, that is only part of the story.
The appropriate approach (correct it, or leave it) will depend on your situation. In a perfect textbook world, you'd correct it. In your world, you will have to decide.
A better way?
The above solution is a bit of a quick and nasty way to force your query to work.
The fact that the solution CASTs late in the query syntax, after the results have been selected and filtered, hints that it is not the most elegant way to achieve this.
Ideally you can convert data types as early as possible in the process:
If done in the underlying data rather than in the query, this is the ideal fix, but it may not suit the situation (see "Your underlying data" above)
If done in the query, try to do it earlier.
In your case, your GROUP BY and ORDER BY are both using columns that look to be redundant data from the original query results, that is, you are getting a DATE and a MONTH and a YEAR. Ideally you would just get a DATE and then use the MONTH or YEAR from that date. Your issue is your dates are not actually dates (see "underlying data" above), which:
In the case of DATE, is converted in your INNER JOIN line ON CAST(D.[Date] AS DATE) = CAST(f.OrderDate AS DATE) (likely to minimise issues with the join)
In the case of D.[year] and D.[month], are not converted (which is why we still need to convert them further down, in ORDER BY)
You could consider ignoring D.[month] and use the MONTH DATEPART computed from DATE, which would avoid the need to use CAST in the ORDER BY clause.
In your instance, this approach is a middle ground. The quick fix is included at the top of this answer, and the best fix is to correct the underlying data. This last section considers optimizing the quick fix, but does not correct the underlying issue. It is only mentioned for awareness and to avoid promoting the use of CAST in an ORDER BY clause as the most legitimate way of addressing your issue with good clean query syntax.
There are also potential performance tradeoffs between how many columns you select that you don't need (e.g. all of the ones in D?), whether to compute the month from the date or from a separate month column, whether to cast to date before filtering, etc. These are beyond the scope of this solution.
So:
The immediate solution: use the quick fix
The optimal solution: after it's working, consider the underlying data (in your situation)
The real problem is your object Dimensions.Date_Dim here. As you are simply ordering on the value of D.[Year] and D.[Month] without manipulating the values at all, this means the object is severely flawed; you are storing numerical data as a varchar. varchar, and numerical data types are completely different. For example 2 is less than 10 but '2' is greater than '10'; because '2' is greater than '1', so therefore it must also be greater than '10'.
The real solution, therefore, is fixing your object. Assuming that both Month and Year are incorrectly stored as a varchar, don't contain any non-integer values (another, different flaw if so), and are not computed columns, then you could just do:
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Year] int NOT NULL;
ALTER TABLE Dimensions.Date_Dim ALTER COLUMN [Month] int NOT NULL;
You could, however, also make the columns a PERSISTED computed column, which might well be easier, in my opinion, as DATEPART already returns a strongly typed int value.
ALTER TABLE Dimensions.Date_Dim DROP COLUMN [Month];
ALTER TABLE Dimensions.Date_Dim ADD [Month] AS DATEPART(MONTH,[Date]) PERSISTED;
Of course, for both solutions, you'll need to (first) DROP and (afterwards) reCREATE any indexes and constraints on the columns.
As long as your "Month" is always 1-12, you can use
SELECT ..., TRY_CAST(D.[Month] AS INT) AS [Month],...
ORDER BY TRY_CAST(D.[Month] AS INT)
The simplest solution is:
ORDER BY MIN(D.[Date])
or:
ORDER BY MIN(f.OrderDate)
Fiddling with the year and month columns is totally unnecessary when you have a date column that is available.
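As a rough illustration of ordering grouped rows by the earliest date in each group, here is a sketch in Python's built-in sqlite3. The orders table and its columns are hypothetical, and the dates are stored as ISO-8601 text (which sorts chronologically) because SQLite has no dedicated date type; in SQL Server the column would be a real date.

```python
import sqlite3

# Hypothetical orders table: a proper date plus a text month, as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, month TEXT)")
conn.executemany(
    "INSERT INTO orders (order_date, month) VALUES (?, ?)",
    [("2019-10-05", "10"), ("2019-02-11", "2"), ("2019-01-20", "1"), ("2019-10-17", "10")],
)

# Group by the text month, but order the groups by their earliest date,
# so the text month is never compared at all.
rows = conn.execute(
    "SELECT month, COUNT(*) FROM orders GROUP BY month ORDER BY MIN(order_date)"
).fetchall()
print(rows)
```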
A very common issue when you store numerical data as a varchar/nvarchar.
Try to cast Year and Month to INT.
ORDER BY
CAST(D.[Year] AS INT) ASC
,CAST(D.[Month] AS INT) ASC
If you try using the <, > and BETWEEN operators, you will get some really "weird" results.
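The "weird" results are easy to reproduce outside the database. This is a plain Python sketch, not SQL Server itself: Python compares strings by code point while SQL Server uses the column's collation, but for digit-only strings the outcome is the same. The months list is made up for illustration.

```python
# Months held as strings, the way a varchar column would hold them.
months = ["1", "2", "3", "10", "11", "12"]

# Intended filter: months BETWEEN 1 AND 3.
# Compared as text, '10', '11' and '12' all sort between '1' and '3'.
text_between = [m for m in months if "1" <= m <= "3"]

# Compared as integers, only the first quarter matches.
int_between = [m for m in months if 1 <= int(m) <= 3]

print(text_between)
print(int_between)
print("2" > "10")  # text comparison looks only at '2' versus '1'
```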

Ms Access query previous day's date

How do I build a query in MS Access that includes the previous day's amounts as an opening balance? On running the query I enter 3/10/18 in the WorkDay parameter box, and records for 3/10/18 and 2/10/18 should be shown. The table is set up as follows:
WorkDay   TransactionID   Amount
2/10/18   Opening         1000
2/10/18   Credit          500
2/10/18   Debit           300
3/10/18   Credit          700
3/10/18   Debit           200
So if I run the query for 3/10/18 it should return
WorkDay   TransactionID   Amount
2/10/18   [Expr]          800
3/10/18   Credit          700
3/10/18   Debit           200
If you are using the GUI add DateAdd("d",-1,[MyDateParameter]) to the OR line under [MyDateParameter] in the Workday field.
For SQL WHERE statement you would use
WorkDay=[MyDateParameter] OR Workday=DateAdd("d",-1,[MyDateParameter])
Obviously substitute [MyDateParameter] with whatever your date parameter actually is.
First some notes about the request:
The desired results impose different requirements for the current day vs the previous day, so there must be two different queries. If you want them in one result set, you would need to use a UNION.
(You could write a single SQL UNION query, but since UNION queries do not work at all with the visual designer, you are left to write and test the query without any advantages of the query Design View. My preference is therefore to create two saved queries instead of embedded subqueries, then create a UNION which combines the results of the saved queries.)
Neither the question, nor answers to comments indicate what to do with any exceptions, like missing dates, weekends, etc. The following queries take the "day before" literally without exception.
The other difficulty is that the Credit entries also have a positive amount, so you must handle them specially. If Credits were saved with negative values, the summation would be simple and direct.
QueryCurrent:
PARAMETERS [Which WorkDay] DateTime;
SELECT S.WorkDay, S.TransactionID, Sum(S.[Amount]) As Amount
FROM [SomeUnspecifiedTable] As S
WHERE S.WorkDay = [Which WorkDay]
GROUP BY S.WorkDay, S.TransactionID
QueryPrevious:
PARAMETERS [Which WorkDay] DateTime;
SELECT S.WorkDay, "[Expr]" As TransactionID,
Sum(IIF(S.TransactionID = "Credit", -1, 1) * S.[Amount]) As Amount
FROM [SomeUnspecifiedTable] As S
WHERE S.WorkDay = ([Which WorkDay] - 1)
GROUP BY S.WorkDay
Union query:
SELECT * FROM QueryCurrent
UNION
SELECT * FROM QueryPrevious
ORDER BY [WorkDay]
Notes about the solution:
You could also use the DateAdd() function, but adding or subtracting an integer from a date value shifts it by that many days by default, so [Which WorkDay] - 1 is the previous day.
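The combined logic can be sketched outside Access with Python's built-in sqlite3. This is an approximation under stated assumptions: the table name ledger is hypothetical, the sample dates are read as day/month/year (2 and 3 October 2018) and stored as ISO text, CASE replaces IIF, and SQLite's date(:day, '-1 day') replaces the Access date arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (WorkDay TEXT, TransactionID TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?)",
    [
        ("2018-10-02", "Opening", 1000),
        ("2018-10-02", "Credit", 500),
        ("2018-10-02", "Debit", 300),
        ("2018-10-03", "Credit", 700),
        ("2018-10-03", "Debit", 200),
    ],
)

# First branch: the requested day, one row per transaction type.
# Second branch: the day before, collapsed to one net row with credits subtracted.
query = """
SELECT WorkDay, TransactionID, SUM(Amount) AS Amount
FROM ledger
WHERE WorkDay = :day
GROUP BY WorkDay, TransactionID
UNION
SELECT WorkDay, '[Expr]', SUM(CASE WHEN TransactionID = 'Credit' THEN -1 ELSE 1 END * Amount)
FROM ledger
WHERE WorkDay = date(:day, '-1 day')
GROUP BY WorkDay
ORDER BY 1, 2
"""
rows = conn.execute(query, {"day": "2018-10-03"}).fetchall()
for row in rows:
    print(row)
```

With the sample data the previous day nets to 1000 - 500 + 300 = 800, matching the expected output in the question.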

Is it possible to return part of a field from the last row entered into a table

I am proposing to have a table (the design isn't settled on yet and can be altered dependent upon the views expressed in reply to this question) that will have a primary key of type int (using auto increment) and a field (ReturnPeriod of type Nchar) that will contain data in the form of '06 2013' (representing in this instance June 2013).
I would simply like to return 06 or whatever happens to be in the last record entered in the table. This table will never grow by more than 4 records per annum (so it will never be that big). It also has a column indicating the date that the last entry was created.
That column seems, to my mind at least, to be the most suitable candidate for getting the last record, so essentially I'd like to know if SQL has an inbuilt function for comparing the date the query is run to the nearest match in a column, and to return the first two characters of a field.
So far I have:
Select Mid(ReturnPeriod,1,2) from Returns
Where DateReturnEntered = <and this is where I'm stuck>
What I'm looking for is a where clause that would get me the last entered record using the date the query is run as its reference point (DateReturnEntered, of type Date, contains the date a record was entered).
Of course there may be an even easier way to guarantee that one has the last record in which case I'm open to suggestions.
Thanks
I think you should store ReturnPeriod as a datetime: for example, not '06 2013' as a VARCHAR but 01.06.2013 as a DATETIME (the first day of 06.2013).
In this case, if I've got your question right, you can use GETDATE() to get current time:
SELECT TOP 1 MONTH(ReturnPeriod)
FROM Returns
WHERE DateReturnEntered<=GETDATE()
ORDER BY DateReturnEntered DESC
If you store ReturnPeriod as a varchar then
SELECT TOP 1 LEFT(ReturnPeriod,2)
FROM Returns
WHERE DateReturnEntered<=GETDATE()
ORDER BY DateReturnEntered DESC
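The varchar variant can be tried out with Python's built-in sqlite3. This is only a sketch: the sample rows are invented, and SQLite uses LIMIT 1, substr() and date('now') where the T-SQL above uses TOP 1, LEFT() and GETDATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Returns ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, ReturnPeriod TEXT, DateReturnEntered TEXT)"
)
# Invented sample rows; dates are ISO text so they sort chronologically.
conn.executemany(
    "INSERT INTO Returns (ReturnPeriod, DateReturnEntered) VALUES (?, ?)",
    [("12 2012", "2013-01-10"), ("03 2013", "2013-04-08"), ("06 2013", "2013-07-05")],
)

# Newest entry on or before today, reduced to its first two characters.
latest_month = conn.execute(
    "SELECT substr(ReturnPeriod, 1, 2) FROM Returns "
    "WHERE DateReturnEntered <= date('now') "
    "ORDER BY DateReturnEntered DESC LIMIT 1"
).fetchone()[0]
print(latest_month)
```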
I would store your ReturnPeriod as a date datatype, using a nominal 1st of the month, e.g. 1 Jun 2013, if you don't have the actual date.
This will allow direct comparison against your entered date, with trivial formatting of the return value if required.
Your query would then find the latest date prior to your date entered.
SELECT MONTH(MAX(ReturnPeriod)) AS ReturnMonth
FROM Returns
WHERE ReturnPeriod <= @DateReturnEntered

How can I query just the month and day of a DATE column?

I have a date of birth DATE column in a customer table with ~13 million rows. I would like to query this table to find all customers who were born on a certain month and day of that month, but any year.
Can I do this by casting the date into a char and doing a subscript query on the cast, or should I create an additional char column, update it to hold just the month and day, or create three new integer columns to hold month, day and year, respectively?
This will be a very frequently used query criteria...
EDIT:... and the table has ~13 million rows.
Can you please provide an example of your best solution?
If it will be frequently used, consider a 'functional index'. Searching on that term at the Informix 11.70 InfoCentre produces a number of relevant hits.
You can use:
WHERE MONTH(date_col) = 12 AND DAY(date_col) = 25;
You can also play games such as:
WHERE MONTH(date_col) * 100 + DAY(date_col) = 1225;
This might be more suitable for a functional index, but isn't as clear for everyday use. Note that in the absence of a functional index, invoking functions on a column in the criterion means that an index is unlikely to be used.
You could easily write a stored procedure too:
CREATE FUNCTION mmdd(date_val DATE DEFAULT TODAY) RETURNING SMALLINT AS mmdd;
RETURN MONTH(date_val) * 100 + DAY(date_val);
END FUNCTION;
And use it as:
WHERE mmdd(date_col) = 1225;
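The month * 100 + day encoding is easy to check outside the database. The following is a plain Python sketch of the same arithmetic, not Informix code; the customers list is invented for illustration.

```python
from datetime import date


def mmdd(d: date) -> int:
    """Encode month and day as one integer, e.g. 25 December -> 1225."""
    return d.month * 100 + d.day


# Invented sample data: (name, date of birth).
customers = [
    ("Ann", date(1980, 12, 25)),
    ("Bob", date(1975, 7, 4)),
    ("Cy", date(1992, 12, 25)),
]

# Everyone born on 25 December, regardless of year.
born_on_dec_25 = [name for name, dob in customers if mmdd(dob) == 1225]
print(born_on_dec_25)
```

Because the day never exceeds 31, multiplying the month by 100 keeps every month/day pair distinct, which is what makes the single integer usable as an index key.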
Depending on how frequently you do this and how fast it needs to run you might think about splitting the date column into day, month and year columns. This would make search faster but cause all sorts of other problems when you want to retrieve a whole date (and also problems in validating that it is a date) - not a great idea.
Assuming speed isn't a problem I would do something like:
select *
FROM Table
WHERE Month(*DateOfBirthColumn*) = *SomeMonth* AND DAY(*DateOfBirthColumn*) = *SomeDay*
I don't have informix in front of me at the moment but I think the syntax is right.