I am looking for specific pieces of software across our network by querying the SCCM Database. My problem is that, for various reasons, sometimes I can search by the program's name and other times I need to search for a specific EXE.
When I run the query below, does it take 13 seconds to run if the where clause contains an AND, but it will run for days with no results if the AND is replaced with OR. I'm assuming it is doing this because I am not properly joining the tables. How can I fix this?
select vrs.Name0
FROM v_r_system as vrs
join v_GS_INSTALLED_SOFTWARE as VIS on VIS.resourceid = vrs.resourceid
join v_GS_SoftwareFile as sf on SF.resourceid = vrs.resourceid
where
VIS.productname0 LIKE '%office%' AND SF.Filename LIKE 'Office2007%'
GROUP BY vrs.Name0
Thanks!
Your LIKE clause contains a wildcard match at the start of a string:
LIKE '%office%'
This prevents SQL Server from using an index on this column, hence the slow running query. Ideally you should change your query so your LIKE clause doesn't use a wildcard at the start.
In the case where the WHERE clause contains an AND its querying based on the Filename clause first (it is able to use an index here and so this is relatively quick) and then filtering this reduced rowset based on your productname0 clause. When you use an OR however it isn't restricted to just returning rows that match your Filename clause, and so it must search through the entire table checking to see if each productname0 field matches.
Here's a good Microsoft article http://msdn.microsoft.com/en-us/library/ms172984.aspx on improving indexes. See the section on Indexes with filter clauses (reiterates the previous answer.)
Have you tried something along these lines instead of a like query?
... where product name in ('Microsoft Office 2000','Microsoft Office xyz','Whateverelse')
Related
I am trying to create a select query in access with two tables I want to link/create a relationship.
Normally, if both tables contains same value you can just "drag" and create a link between those two columns.
In this case however, the second table have an " /CUSTOMER" added at the end in the fields.
Example;
Table1.OrderNumber contains order numbers which always contains 10 characters
Table2.Refference contains same order numbers, but will have a " /CUSTOMER" added to the end.
Can I link/create a relationship between these two in a Query? And how?
Thanks for the help!
Sebastian
Table1.OrderNumber contains order numbers which always contains 10 characters
If so, try this join:
ON Table1.OrderNumber = Left(Table2.Reference, 10)
For these nuanced joins you will have to use SQL and not design view with diagram. Consider the following steps in MS Access:
In Design view, create the join as if two customer fields match exactly. Then run the query which as you point out should return no results.
In SQL view, find the ON clause and adjust to replace that string. Specifically, change this clause
ON Table1.OrderNumber = Table2.Refference
To this clause:
ON Table1.OrderNumber = REPLACE(Table2.Refference, '/CUSTOMER', '')
Then run query to see results.
Do note: with this above change, you may get an Access warning when trying to open query in Design View since it may not be able to be visualized. Should you ignore the warning, above SQL change may be reverted. Therefore, make any changes to query only in SQL view.
Alternatively (arguably better solution), consider cleaning out that string using UPDATE query on the source table so the original join can work. Any change to avoid complexity is an ideal approach. Run below SQL query only one time:
UPDATE Table2
SET Refference = REPLACE(Refference, '/CUSTOMER', '')
I have a list of proper names (in a table), and another table with a free-text field. I want to check whether that field contains any of the proper names. If it were just one, I could do
WHERE free_text LIKE "%proper_name%"
but how do you do that for an entire list? Is there a better string function I can use with a list?
Thanks
No, like does not have that capability.
Many databases support regular expressions, which enable to you do what you want. For instance, in Postgres this is phrased as:
where free_text ~ 'name1|name2|name3'
Many databases also have full-text search capabilities that speed such searches.
Both capabilities are highly specific to the database you are using.
Well, you can use LIKE in a standard JOIN, but the query most likely will be slow, because it will search each proper name in each free_text.
For example, if you have 10 proper names in a list and a certain free_text value contains the first name, the server will continue processing the rest of 9 names.
Here is the query:
SELECT -- DISTINCT
free_text_table.*
FROM
free_text_table
INNER JOIN proper_names_table ON free_text_table.free_text LIKE proper_names_table.proper_name
;
If a certain free_text value contains several proper names, that row will be returned several times, so you may need to add DISTINCT to the query. It depends on what you need.
It is possible to use LATERAL JOIN to avoid Cartesian product (where each row in free_text_table is compared to each rows in proper_names_table). The end result may be faster, than the simple variant. It depends on your data distribution.
Here is SQL Server syntax.
SELECT
free_text_table.*
FROM
free_text_table
CROSS APPLY
(
SELECT TOP(1)
proper_names_table.proper_name
FROM proper_names_table
WHERE free_text_table.free_text LIKE proper_names_table.proper_name
-- ORDER BY proper_names_table.frequency
) AS A
;
Here we don't need DISTINCT, there will be at most one row in the result for each row from free_text_table (one or zero). Optimiser should be smart enough to stop reading and processing proper_names_table as soon as the first match is found due to TOP(1) clause.
If you also can somehow order your proper names and put those that are most likely to be found first, then the query is more likely to be faster than a simple JOIN. (Add a suitable ORDER BY clause in subquery).
I suppose I have always naively assumed that scalar functions in the select part of a SQL query will only get applied to the rows that meet all the criteria of the where clause.
Today I was debugging some code from a vendor and had that assumption challenged. The only reason I can think of for this code failing is that the Substring() function is getting called on data that should have been filtered out by the WHERE clause. But it appears that the substring call is being applied before the filtering happens, the query is failing.
Here is an example of what I mean. Let's say we have two tables, each with 2 columns and having 2 rows and 1 row respectively. The first column in each is just an id. NAME is just a string, and NAME_LENGTH tells us how many characters in the name with the same ID. Note that only names with more than one character have a corresponding row in the LONG_NAMES table.
NAMES: ID, NAME
1, "Peter"
2, "X"
LONG_NAMES: ID, NAME_LENGTH
1, 5
If I want a query to print each name with the last 3 letters cut off, I might first try something like this (assuming SQL Server syntax for now):
SELECT substring(NAME,1,len(NAME)-3)
FROM NAMES;
I would soon find out that this would give me an error, because when it reaches "X" it will try using a negative number for in the substring call, and it will fail.
The way my vendor decided to solve this was by filtering out rows where the strings were too short for the len - 3 query to work. He did it by joining to another table:
SELECT substring(NAMES.NAME,1,len(NAMES.NAME)-3)
FROM NAMES
INNER JOIN LONG_NAMES
ON NAMES.ID = LONG_NAMES.ID;
At first glance, this query looks like it might work. The join condition will eliminate any rows that have NAME fields short enough for the substring call to fail.
However, from what I can observe, SQL Server will sometimes try to calculate the the substring expression for everything in the table, and then apply the join to filter out rows. Is this supposed to happen this way? Is there a documented order of operations where I can find out when certain things will happen? Is it specific to a particular Database engine or part of the SQL standard? If I decided to include some predicate on my NAMES table to filter out short names, (like len(NAME) > 3), could SQL Server also choose to apply that after trying to apply the substring? If so then it seems the only safe way to do a substring would be to wrap it in a "case when" construct in the select?
Martin gave this link that pretty much explains what is going on - the query optimizer has free rein to reorder things however it likes. I am including this as an answer so I can accept something. Martin, if you create an answer with your link in it i will gladly accept that instead of this one.
I do want to leave my question here because I think it is a tricky one to search for, and my particular phrasing of the issue may be easier for someone else to find in the future.
TSQL divide by zero encountered despite no columns containing 0
EDIT: As more responses have come in, I am again confused. It does not seem clear yet when exactly the optimizer is allowed to evaluate things in the select clause. I guess I'll have to go find the SQL standard myself and see if i can make sense of it.
Joe Celko, who helped write early SQL standards, has posted something similar to this several times in various USENET newsfroups. (I'm skipping over the clauses that don't apply to your SELECT statement.) He usually said something like "This is how statements are supposed to act like they work". In other words, SQL implementations should behave exactly as if they did these steps, without actually being required to do each of these steps.
Build a working table from all of
the table constructors in the FROM
clause.
Remove from the working table those
rows that do not satisfy the WHERE
clause.
Construct the expressions in the
SELECT clause against the working table.
So, following this, no SQL dbms should act like it evaluates functions in the SELECT clause before it acts like it applies the WHERE clause.
In a recent posting, Joe expands the steps to include CTEs.
CJ Date and Hugh Darwen say essentially the same thing in chapter 11 ("Table Expressions") of their book A Guide to the SQL Standard. They also note that this chapter corresponds to the "Query Specification" section (sections?) in the SQL standards.
You are thinking about something called query execution plan. It's based on query optimization rules, indexes, temporaty buffers and execution time statistics. If you are using SQL Managment Studio you have toolbox over your query editor where you can look at estimated execution plan, it shows how your query will change to gain some speed. So if just used your Name table and it is in buffer, engine might first try to subquery your data, and then join it with other table.
I was curious since i read it in a doc. Does writing
select * from CONTACTS where id = ‘098’ and name like ‘Tom%’;
speed up the query as oppose to
select * from CONTACTS where name like ‘Tom%’ and id = ‘098’;
The first has an indexed column on the left side. Does it actually speed things up or is it superstition?
Using php and mysql
Check the query plans with explain. They should be exactly the same.
This is purely superstition. I see no reason that either query would differ in speed. If it was an OR query rather than an AND query however, then I could see that having it on the left may spped things up.
interesting question, i tried this once. query plans are the same (using EXPLAIN).
but considering short-circuit-evaluation i was wondering too why there is no difference (or does mysql fully evaluate boolean statements?)
You may be mis-remembering or mis-reading something else, regarding which side the wildcards are on a string literal in a Like predicate. Putting the wildcard on the right (as in yr example), allows the query engine to use any indices that might exist on the table column you are searching (in this case - name). But if you put the wildcard on the left,
select * from CONTACTS where name like ‘%Tom’ and id = ‘098’;
then the engine cannot use any existing index and must do a complete table scan.
I am running a query against a providex database that we use in MAS 90. The query has three tables joined together, and has been slow but not unbearably, taking about 8 minutes per run. The query has a fair number of conditions in the where clause:
I'm going to omit the select part of the query as its long and simple, just a list of fields from the three tables that are to be used in the results.
But the tables and the where clauses in the 8 minute run time version are:
(The first parameter is the lower bound of the user-selected date range, the second is the upper bound.)
FROM "AR_InvoiceHistoryDetail" "AR_InvoiceHistoryDetail",
"AR_InvoiceHistoryHeader" "AR_InvoiceHistoryHeader", "IM1_InventoryMasterfile"
"IM1_InventoryMasterfile"
WHERE "AR_InvoiceHistoryDetail"."InvoiceNo" = "AR_InvoiceHistoryHeader"."InvoiceNo"
AND "AR_InvoiceHistoryDetail"."ItemCode" = "IM1_InventoryMasterfile"."ItemNumber"
AND "AR_InvoiceHistoryHeader"."SalespersonNo" = 'SMC'
AND "AR_InvoiceHistoryHeader"."OrderDate" >= #p_dr
AND "AR_InvoiceHistoryHeader"."OrderDate" <= #p_d2
However, it turns out that another date field in the same table needs to be the one that the Date Range is compared with. So I changed the Order Dates at the end of the WHERE clause to InvoiceDate. I haven't had the query run successfully at all yet. And I've waited over 40 minutes. I have no control over indexing because this is a MAS 90 database which I don't believe I can directly change the database characteristics of.
What could cause such a large (at least 5 fold) difference in performance. Is it that OrderDate might have been indexed while InvoiceDate was not? I have tried BETWEEN clauses but it doesn't seem to work in the providex dialect. I am using the ODBC interface through .NET in my custom report engine. I have been debugging the report and it is running at the database execution point when I asked VS to Break All, at the same spot where the 8 minute report was waiting, so it is almost certainly either something in my query or something in the database that is screwed up.
If its just the case that InvoiceDates aren't indexed, what else can I do in the providex dialect of SQL to optimize the performance of these queries? Should I change the order of my criteria? This report gets results for a specific salesperson which is why the SMC clause exists. The prior clauses are for the inner joins, and the last clause is for the date range.
I used an identical date range in both the OrderDate and InvoiceDate versions and have ran them all mulitiple times and got the same results.
I still don't know exactly why it was so slow, but we had another problem with the results coming from the query (we switched back to using OrderDate). We weren't getting some of the results because of the nature of the IM1 table.
So I added a Left Outer Join once I figured out Providex's syntax for that. And for some reason, even though we still have 3 tables, it runs a lot faster now.
The new query criteria are:
FROM "AR_InvoiceHistoryHeader" "AR_InvoiceHistoryHeader",
{OJ "AR_InvoiceHistoryDetail" "AR_InvoiceHistoryDetail"
LEFT OUTER JOIN "IM1_InventoryMasterfile" "IM1_InventoryMasterfile"
ON "AR_InvoiceHistoryDetail"."ItemCode" =
"IM1_InventoryMasterfile"."ItemNumber" }
WHERE "AR_InvoiceHistoryDetail"."InvoiceNo" =
"AR_InvoiceHistoryHeader"."InvoiceNo" AND
"AR_InvoiceHistoryHeader"."SalespersonNo" = 'SMC'
AND "AR_InvoiceHistoryHeader"."InvoiceDate" >= ?
AND "AR_InvoiceHistoryHeader"."InvoiceDate" <= ?
Strange, but at least I learned more of the world of Providex Sql in the process.
I've never used providex before.
A search turned up this reference article on the syntax for creating an index.
Looking over your query, there's three tables and five criteria. Two of the criteria are "join criteria", and three criteria are filtering criteria:
AND "AR_InvoiceHistoryHeader"."SalespersonNo" = 'SMC'
AND "AR_InvoiceHistoryHeader"."OrderDate" >= #p_dr
AND "AR_InvoiceHistoryHeader"."OrderDate" <= #p_d2
I don't know how good SalespersonNo is for limiting return results, but it might be good to add an index on that.
I haven't used .NET so my question may show ignorance, but in Access you must use a SQL Pass-Through query to wring any results from ProvideX, if more than one table is involved.