IIf Statement only manipulating part of the data in a field - sql

I created an IIf statement in a MS Access query, to fill null spaces in a particular field. The statement in only affecting some of the data, but not all. Why would it only work on a certain percentage of the records?
Field:Line_off_Date
Table:Claims
Criteria:IIf(IsNull("Line_off_Date "),"1/1/1900"
Here is the official Sql after delimiting:
SELECT Claims.Line_off_Date
FROM Claims
WHERE (((Claims.Line_off_Date)=IIf("Line_off_Date IsNull",#1/1/1900#,"Line_off_Date ")));
I can't screen shot onto this site, but I could give a mock up representation:
Line_off_Date
12/23/2013
12/23/2013
5/16/2010
1/1/1900
1/1/1900
12/10/2000
11/4/2008
This is listed as a column, with a space between 1/1/1900 and 12/10/2000. When I post it here, it turns into a paragraph. I hope this helps in some way...

I'm unsure about your goal, but this looks like something where you could use the Nz Function.
Criteria: Nz(Line_off_Date, "1/1/1900")
If you want IIf instead, try it this way ...
Criteria: IIf(Line_off_Date Is Null, "1/1/1900", Line_off_Date)
Note those suggestions assume Line_off_Date is text datatype. If it is actually Date/Time, delimit the value with # instead of quotes.
Criteria: Nz(Line_off_Date, #1/1/1900#)
Criteria: IIf(Line_off_Date Is Null, #1/1/1900#, Line_off_Date)
I'm more confused after seeing your WHERE clause ...
WHERE (((Claims.Line_off_Date)=IIf("Line_off_Date IsNull",#1/1/1900#,"Line_off_Date ")));
Aside from the issues of quoting and so forth, I think the logic is faulty. If you say "show me rows where Line_off_Date = something other than what is stored in Line_off_Date", you may not get any rows back.
I think you need a different approach. If you need more help, show us a brief set of sample data and your desired output based on those data.
Based on the comment discussion, my understanding is you don't really want to filter the result set based on Line_off_Date --- which means this is not a WHERE clause issue. Instead, you want to retrieve Line_off_Date values, but substitute #1900-1-1# for any which are Null. If that is correct, do the transformation with a field expression in the SELECT clause.
SELECT Nz(Claims.Line_off_Date, #1900-1-1-#) AS Line_off_Date_adjusted
FROM Claims;

Related

"Cannot construct data type datetime" when filtering data, but all values filtered DO have valid dates

I am convinced that this question is NOT a duplicate of:
Cannot construct data type datetime, some of the arguments have values which are not valid
In that case the values passed in are explicitly not valid. Whereas in this case the values that the function could be expected to be called upon are all valid.
I know what the actual problem is, and it's not something that would help most people that find the other question. But it IS something that would be good to be findable on SO.
Please read the answer, and understand why it's different from the linked question before voting to close as dupe of that question.
I've run some SQL that's errored with the error message: Cannot construct data type datetime, some of the arguments have values which are not valid.
My SQL uses DATETIMEFROMPARTS, but it's fine evaluating that function in the select - it's only a problem when I filter on the selected value.
It's also demonstrating weird, can't-possibly-be-happening behaviour w.r.t. other changes to the query.
My query looks roughly like this:
WITH FilteredDataWithDate (
SELECT *, DATETIMEFROMPARTS(...some integer columns representing date data...) AS Date
FROM Table
WHERE <unrelated pre-condition filter>
)
SELECT * FROM FilteredDataWithDate
WHERE Date > '2020-01-01'
If I run that query, then it errors with the invalid data error.
But if I omit the final Date > filter, then it happily renders every result record, so clearly none of the values it's filtering on are invalid.
I've also manually examined the contents of Table WHERE <unrelated pre-condition filter> and verified that everything is a valid date.
It also has a wild collection of other behaviours:
If I replace all of ...some integer columns representing date data... with hard-coded numbers then it's fine.
If I replace some parts of that data with hardcoded values, that fixes it, but others don't. I don't find any particular patterns in what does or doesn't help.
If I remove most of the * columns from the Table select. Then it starts to be fine again.
Specifically, it appears to break any time I include an nvarchar(max) column in the CTE.
If I add an additional filter to the CTE that limits the results to Id values in the following ranges, then the results are:
130,000 and 140,000. Error.
130,000 and 135,000. Fine.
135,000 and 140,000. Fine.!!!!
Filtering by the Date column breaks everything ... but ORDER BY Date is fine. (and confirms that all dates lie within perfectly sensible bounds.)
Adding TOP 1000000 makes it work ... even though there are only about 1000 rows.
... WTAF?!
This took me a while to decode, but it turns out that the SS compiler doesn't necessarily restrict its execution of the function just to rows that are, or could be, relevant to the result set.
Depending on the execution plan it arrives at, the function could get called on any record in Table, even one that doesn't satisfy WHERE <unrelated pre-condition filter>.
This was found by another user, for another function, over here.
So the fact that it could return all the results without the filter wasn't actually proving that every input into the function was valid. And indeed there were some records in the table that weren't in the result set, but still had invalid data.
That actually means that even if you were to add an explicit WHERE filter to exclude rows containing invalid date-component data ... that isn't actually guaranteed to fix it, because the function may still get called against the 'excluded' rows.
Each of the random other things I did will have been influencing the query plan in one way or another that happened to fix/break things.
The solution is, naturally, to fix the underlying table data.

MS Access Having Clause

Alright so I understand the point of the HAVING clause. I am having an issue and I am wondering if I can solve this the way I want to.
I want to execute one query using ADODB.Recordset and then use the Filter function to sift through the data set.
The problem is the query at the moment which looks like this:
SELECT tblMT.Folder, tblMT.MTDATE, tblMT.Cust, Sum(tblMT.Hours)
FROM tblMT
GROUP BY tblMT.Folder, tblMT.MTDATE, tblMT.Cust
HAVING tblMT.Cust LIKE "TEST*" AND Min(tblMT.MTDATE)>=Date()-30 AND MAX(tblMT.MTDATE)<=Date()
ORDER BY tblMT.TheDATE DESC;
So the above works as expected.... however I want to be able to use the tblMT.Cust as the filter without having to keep re querying the database. If I remove it I get a:
Data type mismatch in criteria expression.
Is what I am trying to do possible? If someone can point me in the right direction here would be great.
Ok... the type mismatch is caused because either tblmt.mtdate isn't a date field or tblmt.hours isn't a number field AND you have data that either isn't a date or isn't a number when the customer isn't like 'TEST*'. Or, for some customers, you have a NULL in mt.date and null can't be compared with >=. you'd still get the error if you said where tblMt.cust not like "TEST*" too.
Problem is likely with the data or your expectation and you need to handle it.
What data types are tblMT.hours and tblMt.MtDate?

Access Query resulting in #Error for one outcome

I am pulling information via a query in access. Below is my code. The results populate when the "Pinacle Type" is "DOM" or "BOOK", but if the type is anything else, as stated by the IIF statement, I get #Error.
5_BeneBankID: IIf([Pinacle_Type]="DOM" Or "BOOK",Mid(Replace(Replace([BeneABA]," ",""),"-",""),1,11),Mid(Replace(Replace([Intl_BeneBankID]," ",""),"-",""),1,11))
Embedded in this statement is are also formatting parts but those work for the first instance. The BeneABA field is a bank ABA number so this is always numeric. The Intl_BeneBankID is what is known as a SWIFT code which is either all alpha or alphanumeric. Both have a maximum length of 11 characters.
Also, If I type the following, the Intl_BeneBankID POPULATES! which is why I am stumped:
5_BeneBankID: Intl_BeneBankID
relevant table
I'm not sure, as your description of the problem seems vague to me, but if you are getting #error on some rows, but not all, you probably have null fields in your data. This is a common problem in test data, not so often in production code, but needs to be handled. Try wrapping the fields with Nz() i.e. ((Replace(Nz([Intl_BeneBankID],"")... and see what you get.

For an Oracle NUMBER datatype, LIKE operator vs BETWEEN..AND operator

Assume mytable is an Oracle table and it has a field called id. The datatype of id is NUMBER(8). Compare the following queries:
select * from mytable where id like '715%'
and
select * from mytable where id between 71500000 and 71599999
I would think the second is more efficient since I think "number comparison" would require fewer number of assembly language instructions than "string comparison". I need a confirmation or correction. Please confirm/correct and throw any further comment related to either operator.
UPDATE: I forgot to mention 1 important piece of info. id in this case must be an 8-digit number.
If you only want values between 71500000 and 71599999 then yes the second one is much more efficient. The first one would also return values between 7150-7159, 71500-71599 etc. and so forth. You would either need to sift through unecessary results or write another couple lines of code to filter the rest of them out. The second option is definitely more efficient for what you seem to want to do.
It seems like the execution plan on the second query is more efficient.
The first query is doing a full table scan of the id's, whereas the second query is not.
My Test Data:
Execution Plan of first query:
Execution Plan of second query:
I don't like the idea of using LIKE with a numeric column.
Also, it may not give the results you are looking for.
If you have a value of 715000000, it will show up in the query result, even though it is larger than 71599999.
Also, I do not like between on principle.
If a thing is between two other things, it should not include those two other things. But this is just a personal annoyance.
I prefer to use >= and <= This avoids confusion when I read the query. In addition, sometimes I have to change the query to something like >= a and < c. If I started by using the between operator, I would have to rewrite it when I don't want to be inclusive.
Harv
In addition to the other points raised, using LIKE in the manner you suggest would cause Oracle to not use any indexes on the ID column due to the implicit conversion of the data from number to character, resulting in a full table scan when using LIKE versus and index range scan when using BETWEEN. Assuming, of course, you have an index on ID. Even if you don't, however, Oracle will have to do the type conversion on each value it scans in the LIKE case, which it won't have to do in the other.
You can use math function, otherwise you have to use to_char function to use like, but it will cause performance problems.
select * from mytable where floor(id /100000) = 715
or
select * from mytable where floor(id /100000) = TO_NUMBER('715') // this is parametric

Can scalar functions be applied before filtering when executing a SQL Statement?

I suppose I have always naively assumed that scalar functions in the select part of a SQL query will only get applied to the rows that meet all the criteria of the where clause.
Today I was debugging some code from a vendor and had that assumption challenged. The only reason I can think of for this code failing is that the Substring() function is getting called on data that should have been filtered out by the WHERE clause. But it appears that the substring call is being applied before the filtering happens, the query is failing.
Here is an example of what I mean. Let's say we have two tables, each with 2 columns and having 2 rows and 1 row respectively. The first column in each is just an id. NAME is just a string, and NAME_LENGTH tells us how many characters in the name with the same ID. Note that only names with more than one character have a corresponding row in the LONG_NAMES table.
NAMES: ID, NAME
1, "Peter"
2, "X"
LONG_NAMES: ID, NAME_LENGTH
1, 5
If I want a query to print each name with the last 3 letters cut off, I might first try something like this (assuming SQL Server syntax for now):
SELECT substring(NAME,1,len(NAME)-3)
FROM NAMES;
I would soon find out that this would give me an error, because when it reaches "X" it will try using a negative number for in the substring call, and it will fail.
The way my vendor decided to solve this was by filtering out rows where the strings were too short for the len - 3 query to work. He did it by joining to another table:
SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3)
FROM NAMES
INNER JOIN LONG_NAMES
ON NAMES.ID = LONG_NAMES.ID;
At first glance, this query looks like it might work. The join condition will eliminate any rows that have NAME fields short enough for the substring call to fail.
However, from what I can observe, SQL Server will sometimes try to calculate the the substring expression for everything in the table, and then apply the join to filter out rows. Is this supposed to happen this way? Is there a documented order of operations where I can find out when certain things will happen? Is it specific to a particular Database engine or part of the SQL standard? If I decided to include some predicate on my NAMES table to filter out short names, (like len(NAME) > 3), could SQL Server also choose to apply that after trying to apply the substring? If so then it seems the only safe way to do a substring would be to wrap it in a "case when" construct in the select?
Martin gave this link that pretty much explains what is going on - the query optimizer has free rein to reorder things however it likes. I am including this as an answer so I can accept something. Martin, if you create an answer with your link in it i will gladly accept that instead of this one.
I do want to leave my question here because I think it is a tricky one to search for, and my particular phrasing of the issue may be easier for someone else to find in the future.
TSQL divide by zero encountered despite no columns containing 0
EDIT: As more responses have come in, I am again confused. It does not seem clear yet when exactly the optimizer is allowed to evaluate things in the select clause. I guess I'll have to go find the SQL standard myself and see if i can make sense of it.
Joe Celko, who helped write early SQL standards, has posted something similar to this several times in various USENET newsfroups. (I'm skipping over the clauses that don't apply to your SELECT statement.) He usually said something like "This is how statements are supposed to act like they work". In other words, SQL implementations should behave exactly as if they did these steps, without actually being required to do each of these steps.
Build a working table from all of
the table constructors in the FROM
clause.
Remove from the working table those
rows that do not satisfy the WHERE
clause.
Construct the expressions in the
SELECT clause against the working table.
So, following this, no SQL dbms should act like it evaluates functions in the SELECT clause before it acts like it applies the WHERE clause.
In a recent posting, Joe expands the steps to include CTEs.
CJ Date and Hugh Darwen say essentially the same thing in chapter 11 ("Table Expressions") of their book A Guide to the SQL Standard. They also note that this chapter corresponds to the "Query Specification" section (sections?) in the SQL standards.
You are thinking about something called query execution plan. It's based on query optimization rules, indexes, temporaty buffers and execution time statistics. If you are using SQL Managment Studio you have toolbox over your query editor where you can look at estimated execution plan, it shows how your query will change to gain some speed. So if just used your Name table and it is in buffer, engine might first try to subquery your data, and then join it with other table.