Access Query link two tables with similar values - sql

I am trying to create a select query in Access with two tables that I want to link (create a relationship between).
Normally, if both tables contain the same value, you can just "drag" to create a link between those two columns.
In this case, however, the values in the second table have " /CUSTOMER" appended to them.
Example:
Table1.OrderNumber contains order numbers which always contain 10 characters
Table2.Refference contains the same order numbers, but with " /CUSTOMER" added to the end.
Can I link/create a relationship between these two in a Query? And how?
Thanks for the help!
Sebastian

Table1.OrderNumber contains order numbers which always contain 10 characters
If so, try this join:
ON Table1.OrderNumber = Left(Table2.Refference, 10)
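To see the prefix join in action outside Access, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in: its substr(x, 1, 10) plays the role of Access's Left(x, 10), and the sample rows and the Amount column are made up for illustration.

```python
import sqlite3

# SQLite stands in for Access; substr(x, 1, 10) is the equivalent of Left(x, 10).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (OrderNumber TEXT);
    CREATE TABLE Table2 (Refference TEXT, Amount INTEGER);  -- Amount is hypothetical
    INSERT INTO Table1 VALUES ('A000000001'), ('A000000002');
    INSERT INTO Table2 VALUES ('A000000001 /CUSTOMER', 50),
                              ('A000000003 /CUSTOMER', 75);
""")

# Join on the first 10 characters of the reference, ignoring the suffix.
rows = conn.execute("""
    SELECT t1.OrderNumber, t2.Amount
    FROM Table1 AS t1
    INNER JOIN Table2 AS t2
        ON t1.OrderNumber = substr(t2.Refference, 1, 10)
""").fetchall()
print(rows)
```

Only the order number present in both tables is returned. The approach relies on the order number always being exactly 10 characters long.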

For these nuanced joins you will have to use SQL view rather than Design view with its diagram. Consider the following steps in MS Access:
In Design view, create the join as if the two fields match exactly. Then run the query, which, as you point out, should return no results.
In SQL view, find the ON clause and adjust it to strip the extra string. Specifically, change this clause:
ON Table1.OrderNumber = Table2.Refference
To this clause:
ON Table1.OrderNumber = REPLACE(Table2.Refference, ' /CUSTOMER', '')
Then run the query to see the results.
Do note: after this change, you may get an Access warning when trying to open the query in Design view, since the join can no longer be visualized. If you ignore the warning, the SQL change above may be reverted. Therefore, make any further changes to the query only in SQL view.
Alternatively (and arguably the better solution), consider stripping that string with an UPDATE query on the source table so the original join can work. Any change that avoids complexity is the ideal approach. Run the SQL query below only once:
UPDATE Table2
SET Refference = REPLACE(Refference, ' /CUSTOMER', '')
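The one-time cleanup can be tried safely on a throwaway database first. The sketch below uses Python's sqlite3 as a stand-in for Access and assumes the suffix includes the leading space described in the question. The WHERE clause is an addition of mine so that rows without the suffix are left untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table2 (Refference TEXT);
    INSERT INTO Table2 VALUES ('A000000001 /CUSTOMER'), ('A000000002');
""")

# Strip the suffix (including its leading space) only where it is present.
conn.execute("""
    UPDATE Table2
    SET Refference = replace(Refference, ' /CUSTOMER', '')
    WHERE Refference LIKE '% /CUSTOMER'
""")

cleaned = [row[0] for row in
           conn.execute("SELECT Refference FROM Table2 ORDER BY Refference")]
print(cleaned)
```

After this runs, a plain equality join between the two columns works again.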

ERROR in CREATE VIEW

I tried to create a new view in my MS Access database so I can select from it more easily, but I can't tell what is going wrong here.
CREATE VIEW new
AS
SELECT msthread.id,
msthread.threadname,
Count(msthread.threadname) AS TotalPost,
threadcategory
FROM msthread
LEFT OUTER JOIN msposts
ON msthread.threadname = msposts.threadname
GROUP BY msthread.id,
msthread.threadname,
msthread.threadcategory
Access gives me this error message when I try to execute that statement.
Syntax error in create table statement
Are there specific problems with creating a view that uses JOINs? I'm trying to access 2 tables.
CREATE VIEW was introduced with Jet 4 in Access 2000. But you must execute the statement from ADO/OleDb. If executed from DAO, it triggers error 3290, "Syntax error in CREATE TABLE statement", which is more confusing than helpful.
Also CREATE VIEW can only create simple SELECT queries. Use CREATE PROCEDURE for any which CREATE VIEW can't handle.
But CREATE VIEW should handle yours. I used a string variable to hold the DDL statement below, and then executed it from CurrentProject.Connection in an Access session:
CurrentProject.Connection.Execute strSql
That worked because CurrentProject.Connection is an ADO object. If you will be doing this from outside Access, use an OleDb connection.
Notice I made a few changes to your query. Most were minor. But I think the query name change may be important. New is a reserved word so I chose qryNew instead. Reserved words as object names seem especially troublesome in queries run from ADO/OleDb.
CREATE VIEW qryNew
AS
SELECT
mst.id,
mst.threadname,
mst.threadcategory,
Count(mst.threadname) AS TotalPost
FROM
msthread AS mst
LEFT JOIN msposts AS msp
ON mst.threadname = msp.threadname
GROUP BY
mst.id,
mst.threadname,
mst.threadcategory;
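The view definition itself can be sanity-checked outside Access. Here is a minimal sketch with Python's sqlite3, used only as a stand-in engine; the sample rows and the body column are invented for illustration, and this does not reproduce the ADO versus DAO behavior described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE msthread (id INTEGER, threadname TEXT, threadcategory TEXT);
    CREATE TABLE msposts (threadname TEXT, body TEXT);  -- body is hypothetical
    INSERT INTO msthread VALUES (1, 'intro', 'general'), (2, 'empty', 'general');
    INSERT INTO msposts VALUES ('intro', 'first'), ('intro', 'second');

    CREATE VIEW qryNew AS
    SELECT mst.id, mst.threadname, mst.threadcategory,
           Count(mst.threadname) AS TotalPost
    FROM msthread AS mst
    LEFT JOIN msposts AS msp ON mst.threadname = msp.threadname
    GROUP BY mst.id, mst.threadname, mst.threadcategory;
""")

view_rows = conn.execute(
    "SELECT id, threadname, TotalPost FROM qryNew ORDER BY id"
).fetchall()
print(view_rows)
```

One thing the sample data makes visible: because the count is taken on a column of the left-hand table, a thread with no posts still reports 1. Counting msp.threadname instead would report 0 for such threads, if that is what you want.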
Going out on a limb here without the error message but my assumption is that you need an alias in front of your non-aliased column.
You may also have a problem titling the view as new. This is a problem with using a generic name for a view or table. Try giving it a distinct name that matters. I'll use msThreadPosts as an example.
CREATE VIEW msThreadPosts
AS
SELECT msthread.id,
msthread.threadname,
Count(msthread.threadname) AS TotalPost,
msposts.threadcategory --Not sure if you want msposts or msthread just pick one
FROM msthread
LEFT OUTER JOIN msposts
ON msthread.threadname = msposts.threadname
GROUP BY msthread.id,
msthread.threadname,
msthread.threadcategory
As long as we are looking at this query, let's fix some other things that are being done in a silly way.
Let's start off with aliasing. If you alias your tables, you can make your query much easier to read and understand for anyone inclined to read it.
CREATE VIEW msThreadPosts
AS
SELECT mt.id,
mt.threadname,
Count(mt.threadname) AS TotalPost,
mt.threadcategory
FROM msthread AS mt
LEFT OUTER JOIN msposts mp
ON mt.threadname = mp.threadname
GROUP BY mt.id,
mt.threadname,
mt.threadcategory
There, now doesn't that look better?
The next thing to look at is your column names. msthread has an id column. That column name is incredibly generic. This can cause problems when a column isn't aliased and an id exists in multiple places or there are multiple id columns. If we change that column name to msthreadID, it makes things much clearer. The goal is to design your tables in a way that anyone working on your database can immediately tell what a column is doing.
The next thing to look at is your join. Why are you joining on thread name? threadname is likely a character string and therefore not terribly efficient for joins. If msthread has an id column and needs to be joined to msposts, then shouldn't msposts also have that id column to match on, making joins more efficient?

why would you use WHERE 1=0 statement in SQL?

I saw a query in an application's log file. It looked like this:
SELECT ID FROM CUST_ATTR49 WHERE 1=0
what is the use of such a query that is bound to return nothing?
A query like this can be used to ping the database. The clause:
WHERE 1=0
Ensures that no data is sent back, so there is very little CPU cost, network traffic, or other resource consumption.
A query like that can test for:
server availability
CUST_ATTR49 table existence
ID column existence
Keeping a connection alive
Cause a trigger to fire without changing any rows (with the where clause, but not in a select query)
manage many OR conditions in dynamic queries (e.g WHERE 1=0 OR <condition>)
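The table and column existence checks from the list above can be sketched as a small probe function. This uses Python's sqlite3 purely for illustration; the exception class to catch differs between database drivers, and probe is a hypothetical helper name.

```python
import sqlite3


def probe(conn, table, column):
    """Return True if `SELECT column FROM table WHERE 1=0` runs without error.

    Identifiers cannot be bound as parameters, so pass only trusted names.
    """
    try:
        conn.execute(f"SELECT {column} FROM {table} WHERE 1=0")
        return True
    except sqlite3.OperationalError:
        # Raised by SQLite for a missing table or a missing column.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUST_ATTR49 (ID INTEGER)")

print(probe(conn, "CUST_ATTR49", "ID"))
print(probe(conn, "CUST_ATTR49", "NOPE"))
print(probe(conn, "MISSING_TABLE", "ID"))
```

The query never returns rows, so the only information that comes back is whether the statement could be prepared and executed at all.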
This may be also used to extract the table schema from a table without extracting any data inside that table. As Andrea Colleoni said those will be the other benefits of using this.
A use case I can think of: you have a filter form that should show no search results until a filter is set. Any filters you specify get added to the WHERE clause.
It is also commonly used when you have to build a SQL query by hand. You don't want to check whether the WHERE clause is empty or not, so you can just add conditions like this:
where := "WHERE 0=1"
if X then where := where + " OR ... "
if Y then where := where + " OR ... "
(if you connect the clauses with OR you need 0=1, if you have AND you have 1=1)
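The pseudocode above translates into a few lines of Python. This is a minimal sketch; build_where is a hypothetical helper, and the conditions are expected to be trusted fragments with bind placeholders, not raw user input.

```python
def build_where(conditions, joiner="OR"):
    """Build a WHERE clause from condition fragments.

    The seed is the neutral element of the joiner: 0=1 for OR, 1=1 for AND,
    so the first real condition needs no special handling.
    """
    if joiner not in ("AND", "OR"):
        raise ValueError("joiner must be 'AND' or 'OR'")
    seed = "0=1" if joiner == "OR" else "1=1"
    clause = "WHERE " + seed
    for condition in conditions:
        clause += f" {joiner} {condition}"
    return clause


print(build_where([]))
print(build_where(["color = ?", "size = ?"]))
print(build_where(["color = ?"], joiner="AND"))
```

With no conditions at all, the OR variant matches nothing and the AND variant matches everything, which is usually the behavior you want from each.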
As an answer - but also as further clarification to what @AndreaColleoni already mentioned:
manage many OR conditions in dynamic queries (e.g WHERE 1=0 OR <condition>)
Purpose as an on/off switch
I am using this as a switch (on/off) statement for portions of my Query.
If I were to use
WHERE 1=1
AND (0=? OR first_name = ?)
AND (0=? OR last_name = ?)
Then I can use the first bind variable (?) to turn the first_name search criterion on or off, and the third bind variable (?) to turn the last_name criterion on or off.
I have also added a literal 1=1 just for esthetics so the text of the query aligns nicely.
For just those two criteria it does not appear that helpful, as one might think it is easier to build the WHERE condition dynamically, including only first_name, only last_name, both, or neither. But then your code has to build 4 versions of the same query. Imagine what would happen with 10 different criteria: how many combinations of the same query (2^10 = 1024) would you have to manage then?
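Here is a minimal runnable sketch of the switch idea, using Python's sqlite3 and a made-up people table. The search helper is hypothetical; it passes 0 to switch a criterion off and 1 to switch it on, so a single fixed query text covers all four combinations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (first_name TEXT, last_name TEXT);
    INSERT INTO people VALUES ('Ann', 'Lee'), ('Ann', 'Kim'), ('Bob', 'Lee');
""")

QUERY = """
    SELECT first_name, last_name FROM people
    WHERE 1=1
      AND (0=? OR first_name = ?)
      AND (0=? OR last_name = ?)
    ORDER BY first_name, last_name
"""


def search(first_name=None, last_name=None):
    # A switch value of 0 makes "0=?" true, which disables that criterion.
    params = (
        0 if first_name is None else 1, first_name,
        0 if last_name is None else 1, last_name,
    )
    return conn.execute(QUERY, params).fetchall()


print(search())
print(search(first_name="Ann"))
print(search(first_name="Ann", last_name="Lee"))
```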
Compile Time Optimization
I might also add that using 0=? as a bind variable switch will not work very well if all your criteria are indexed. The run time optimizer that selects indexes and execution plans might not see the cost benefit of using an index with these slightly more complex predicates. Hence I usually advise injecting the 0 / 1 explicitly into your query (by string concatenation in your SQL, or by search/replace). Doing so gives the compiler the chance to optimize out redundant statements, and gives the runtime executor a much simpler query to look at.
(0=1 OR cond = ?) --> (cond = ?)
(0=0 OR cond = ?) --> Always True (ignore predicate)
In the second statement above the compiler knows that it never has to even consider the second part of the condition (cond = ?), and it will simply remove the entire predicate. If it were a bind variable, the compiler could never have accomplished this.
Because you are simply, and forcibly, injecting a 0 or a 1, there is zero chance of SQL injection.
In my SQL, as one approach, I typically mark my injection points as ${literal_name}, and I then simply search/replace any ${...} occurrence with the appropriate literal using a regex, before I even let the compiler have a stab at it. This basically leads to a query stored as follows:
WHERE 1=1
AND (0=${cond1_enabled} OR cond1 = ?)
AND (0=${cond2_enabled} OR cond2 = ?)
Looks good, easily understood, the compiler handles it well, and the Runtime Cost Based Optimizer understands it better and will have a higher likelihood of selecting the right index.
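The search/replace step can be sketched with Python's re module. This is an illustration under my own assumptions, not the author's actual code: inject_switches is a hypothetical helper, and it refuses anything other than 0 or 1 so that the substitution cannot smuggle in arbitrary SQL.

```python
import re

TEMPLATE = """WHERE 1=1
  AND (0=${cond1_enabled} OR cond1 = ?)
  AND (0=${cond2_enabled} OR cond2 = ?)"""


def inject_switches(template, switches):
    """Replace each ${name} with the literal 0 or 1 taken from `switches`."""
    def substitute(match):
        name = match.group(1)
        value = switches[name]  # KeyError if a placeholder has no value
        if value not in (0, 1):
            raise ValueError(f"switch {name!r} must be 0 or 1")
        return str(int(value))

    return re.sub(r"\$\{(\w+)\}", substitute, template)


sql = inject_switches(TEMPLATE, {"cond1_enabled": 1, "cond2_enabled": 0})
print(sql)
```

The actual search values still travel as bind variables; only the on/off literals are written into the query text.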
I take special care in what I inject. The primary way of passing variables is, and remains, bind variables, for all the obvious reasons.
This is very useful for fetching metadata and keeps things generic.
Many DBs have an optimizer, so they will not actually execute it, but it is still a valid SQL statement and should execute on all DBs.
This will not fetch any results, but you will know that the column names, data types, etc. are valid. If it does not execute, you know something is wrong with the DB (not up, etc.).
So many generic programs execute this dummy statement for testing and for fetching metadata.
Some systems use scripts and can dynamically set selected records to be hidden from a full list, so a false condition needs to be passed to the SQL. For example, three records out of 500 may be marked as Privacy for medical reasons and should not be visible to everyone. A dynamic query will ensure that all 500 records are visible to those in HR, while only 497 are visible to managers. A conditionally set value would be passed to the SQL clause, i.e. ' WHERE 1=1 ' or ' WHERE 1=0 ', depending on who is logged into the system.
quoted from Greg
If the list of conditions is not known at compile time and is instead
built at run time, you don't have to worry about whether you have one
or more than one condition. You can generate them all like:
and <condition>
and concatenate them all together. With the 1=1 at the start, the
initial and has something to associate with.
I've never seen this used for any kind of injection protection, as you
say it doesn't seem like it would help much. I have seen it used as an
implementation convenience. The SQL query engine will end up ignoring
the 1=1 so it should have no performance impact.
Why would someone use WHERE 1=1 AND <conditions> in a SQL clause?
If the user intends only to append records, then the fastest method is to open the recordset without returning any existing records.
It can be useful when only table metadata is desired in an application. For example, if you are writing a JDBC application and want to get the column display size of columns in the table.
Pasting a code snippet here
String query = "SELECT * from <Table_name> where 1=0";
PreparedStatement stmt = connection.prepareStatement(query);
ResultSet rs = stmt.executeQuery();
ResultSetMetaData rsMD = rs.getMetaData();
int columnCount = rsMD.getColumnCount();
for(int i=0;i<columnCount;i++) {
System.out.println("Column display size is: " + rsMD.getColumnDisplaySize(i+1));
}
Here having a query like "select * from table" can cause performance issues if you are dealing with huge data because it will try to fetch all the records from the table. Instead if you provide a query like "select * from table where 1=0" then it will fetch only table metadata and not the records so it will be efficient.
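The same metadata-only idea can be sketched in Python, where the DB-API cursor exposes column names through cursor.description. The customers table here is made up, and sqlite3 is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ann', 'Oslo')")

# WHERE 1=0 returns no rows, but the cursor still describes the result columns.
cursor = conn.execute("SELECT * FROM customers WHERE 1=0")
column_names = [entry[0] for entry in cursor.description]
fetched = cursor.fetchall()

print(column_names)
print(fetched)
```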
Per user milso in another thread, another purpose for "WHERE 1=0":
CREATE TABLE New_table_name AS SELECT * FROM Old_table_name WHERE 1 = 2;
This will create a new table with the same schema as the old table. (Very handy if you want to load some data for compares.)
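A quick way to convince yourself that this copies the columns but none of the rows is to try it on SQLite through Python's sqlite3. The table names are placeholders, and note that in SQLite this form carries over column names but not constraints or indexes; other engines may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (id INTEGER, name TEXT);
    INSERT INTO old_table VALUES (1, 'a'), (2, 'b');
    -- The always-false predicate copies the column layout but no rows.
    CREATE TABLE new_table AS SELECT * FROM old_table WHERE 1 = 2;
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
new_columns = [row[1] for row in conn.execute("PRAGMA table_info(new_table)")]
new_count = conn.execute("SELECT COUNT(*) FROM new_table").fetchone()[0]

print(new_columns, new_count)
```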
An example of using a where condition of 1=0 is found in the Northwind 2007 database. On the main page the New Customer Order and New Purchase Order command buttons use embedded macros with the Where Condition set to 1=0. This opens the form with a filter that forces the sub-form to display only records related to the parent form. This can be verified by opening either of those forms from the tree without using the macro. When opened this way all records are displayed by the sub-form.
In the ActiveRecord ORM, part of Ruby on Rails:
Post.where(category_id: []).to_sql
# => SELECT * FROM posts WHERE 1=0
This is presumably because the following is invalid (at least in Postgres):
select id FROM bookings WHERE office_id IN ()
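The same guard is easy to write by hand when building IN clauses yourself. A minimal Python sketch follows; in_clause is a hypothetical helper, and the column name is assumed to come from trusted code.

```python
def in_clause(column, values):
    """Return (sql_fragment, params) for `column IN (...)`.

    An empty list cannot be written as IN (), so it becomes the
    always-false predicate 1=0 instead.
    """
    values = list(values)
    if not values:
        return "1=0", []
    placeholders = ", ".join("?" for _ in values)
    return f"{column} IN ({placeholders})", values


print(in_clause("office_id", []))
print(in_clause("office_id", [3, 5]))
```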
It seems that someone is trying to hack your database. It looks like someone attempted a MySQL injection. You can read more about it here: Mysql Injection

Querying for software using SQL query in SCCM

I am looking for specific pieces of software across our network by querying the SCCM Database. My problem is that, for various reasons, sometimes I can search by the program's name and other times I need to search for a specific EXE.
When I run the query below, it takes 13 seconds to run if the where clause contains an AND, but it will run for days with no results if the AND is replaced with OR. I'm assuming it is doing this because I am not properly joining the tables. How can I fix this?
select vrs.Name0
FROM v_r_system as vrs
join v_GS_INSTALLED_SOFTWARE as VIS on VIS.resourceid = vrs.resourceid
join v_GS_SoftwareFile as sf on SF.resourceid = vrs.resourceid
where
VIS.productname0 LIKE '%office%' AND SF.Filename LIKE 'Office2007%'
GROUP BY vrs.Name0
Thanks!
Your LIKE clause contains a wildcard match at the start of a string:
LIKE '%office%'
This prevents SQL Server from using an index on this column, hence the slow running query. Ideally you should change your query so your LIKE clause doesn't use a wildcard at the start.
When the WHERE clause contains an AND, it queries based on the Filename clause first (it is able to use an index here, so this is relatively quick) and then filters this reduced rowset based on your productname0 clause. When you use an OR, however, it isn't restricted to just returning rows that match your Filename clause, so it must search through the entire table, checking whether each productname0 field matches.
Here's a good Microsoft article http://msdn.microsoft.com/en-us/library/ms172984.aspx on improving indexes. See the section on Indexes with filter clauses (reiterates the previous answer.)
Have you tried something along these lines instead of a like query?
... where VIS.productname0 in ('Microsoft Office 2000','Microsoft Office xyz','Whateverelse')

Hide Empty columns

I have a table with 75 columns. What is the SQL statement to display only the columns that have values in them?
thanks
It's true that no such statement exists (in a SELECT you can use condition filters only for rows, not for columns). But you could try to write a (slightly tricky) procedure. It must use queries to check which columns contain at least one non-NULL/non-empty value. Once you have this list of columns, join them into a comma-separated string and compose a query that you can run to return what you wanted.
EDIT: I thought about it and I think you can do it with a procedure but under one of these conditions:
find a way to retrieve column names dynamically in the procedure, that is, from the metadata (I have never heard of a way, but I'm new to procedures)
or hardcode all column names (losing generality)
You could collect the column names in an array, if your DBMS's stored procedures support arrays (or write the procedure in a programming language like C), and loop over them, running a SELECT each time to check whether the column is empty* or not. If it contains at least one value, append it to a string of comma-separated column names. Finally, you can build your query with only the non-empty columns!
As an alternative to a stored procedure, you could write a short program (e.g. in Java), which gives you more flexibility.
*if you check for NULL values it will be simple, but if you check for empty values you will need to deal with each column's data type... another array with data types?
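As a rough illustration of the "short program" route, here is a sketch in Python with sqlite3. It reads the column names from SQLite's PRAGMA table_info (other databases expose metadata differently, for example through INFORMATION_SCHEMA), keeps the columns that have at least one non-NULL value, and composes the final SELECT. The helper names and the items table are hypothetical, and only NULL is treated as empty.

```python
import sqlite3


def non_empty_columns(conn, table):
    """Return the columns of `table` that hold at least one non-NULL value."""
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    keep = []
    for column in columns:
        # COUNT(column) counts only non-NULL values.
        (count,) = conn.execute(f"SELECT COUNT({column}) FROM {table}").fetchone()
        if count > 0:
            keep.append(column)
    return keep


def select_non_empty(conn, table):
    """Compose a SELECT statement listing only the non-empty columns."""
    columns = non_empty_columns(conn, table)
    if not columns:
        raise ValueError("every column is empty")
    return f"SELECT {', '.join(columns)} FROM {table}"


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (a INTEGER, b TEXT, c TEXT);
    INSERT INTO items VALUES (1, NULL, 'x'), (2, NULL, NULL);
""")
print(select_non_empty(conn, "items"))
```

This issues one query per column, so on a wide, large table it will be slow, which matches the performance caveat raised in the other answers.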
I would suggest that you write a SELECT statement and define which COLUMNS you wish to display and then save that QUERY as a VIEW.
This will save you the trouble of typing in the column names every time you wish to run that query.
As marc_s pointed out in the comments, there is no select statement to hide columns of data.
You could do a pre-parse and dynamically create a statement to do this, but this would be very inefficient from a SQL performance perspective. I would strongly advise against what you are trying to do.
A simplified version of this is to just select the relevant columns, which was what I needed personally. First, a quick look at what we're dealing with in the table:
SELECT * FROM table1 LIMIT 10;
-> shows 20 columns, of which I'm interested in 3. The LIMIT is just there to avoid flooding the console.
SELECT column1,column3,column19 FROM table1 WHERE column3='valueX';
It is a bit of a manual filter but it works for what I need.

Building Query from Multi-Selection Criteria

I am wondering how others would handle a scenario like such:
Say I have multiple choices for a user to choose from.
Like, Color, Size, Make, Model, etc.
What is the best solution or practice for handling the build of your query for this scenario?
So what if they select 6 of the 8 possible colors, 4 of the possible 7 makes, and 8 of the 12 possible brands?
You could do dynamic OR statements or dynamic IN Statements, but I am trying to figure out if there is a better solution for handling this "WHERE" criteria type logic?
EDIT:
I am getting some really good feedback (thanks everyone)...one other thing to note is that some of the selections could even be like (40 of the selections out of the possible 46) so kind of large. Thanks again!
Thanks,
S
What I would suggest doing is creating a function that takes in a delimited list of makeIds, colorIds, etc. (probably ints, or whatever your key type is) and splits them into a table for you.
Your SP will take in a list of makes, colors, etc as you've said above.
YourSP '1,4,7,11', '1,6,7', '6'....
Inside your SP you'll call your splitting function, which will return a table:
SELECT * FROM
Cars C
JOIN YourFunction(@models) YF ON YF.Id = C.ModelId
JOIN YourFunction(@colors) YF2 ON YF2.Id = C.ColorId
Then, if they select nothing they get nothing. If they select everything, they'll get everything.
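The split-and-join idea can be sketched without a stored procedure. The Python example below uses sqlite3, with temp tables standing in for the table-valued function; load_ids, the temp table names, and the sample Cars rows are all hypothetical.

```python
import sqlite3


def load_ids(conn, temp_name, delimited):
    """Split a comma-delimited id list into a temp table with one Id column."""
    conn.execute(f"CREATE TEMP TABLE {temp_name} (Id INTEGER PRIMARY KEY)")
    ids = [(int(part),) for part in delimited.split(",") if part.strip()]
    # OR IGNORE tolerates duplicate ids in the incoming list.
    conn.executemany(f"INSERT OR IGNORE INTO {temp_name} (Id) VALUES (?)", ids)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cars (CarId INTEGER, ModelId INTEGER, ColorId INTEGER);
    INSERT INTO Cars VALUES (1, 1, 6), (2, 4, 2), (3, 7, 7), (4, 9, 1);
""")

load_ids(conn, "sel_models", "1,4,7,11")
load_ids(conn, "sel_colors", "1,6,7")

# Each join keeps only the cars whose key appears in that selection.
matches = conn.execute("""
    SELECT C.CarId
    FROM Cars C
    JOIN sel_models M ON M.Id = C.ModelId
    JOIN sel_colors K ON K.Id = C.ColorId
    ORDER BY C.CarId
""").fetchall()
print(matches)
```

Because every criterion is an inner join, an empty selection for any one of them yields no rows, which is the behavior described above.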
What is the best solution or practice for handling the build of your query for this scenario?
Dynamic SQL.
A single parameter represents two states: NULL/non-existent, or having a value. Each additional parameter doubles the number of total possibilities: 2 parameters yield 4, 3 yield 8, etc. A single, non-dynamic query can contain all the possibilities but will perform horribly because of:
ORs
overall non-sargability
and inability to reuse the query plan
...when compared to a dynamic SQL query that constructs the query out of only the absolutely necessary parts.
The query plan is cached in SQL Server 2005+, if you use the sp_executesql command - it is not if you only use EXEC.
I highly recommend reading The Curse and Blessing of Dynamic SQL.
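To make "only the absolutely necessary parts" concrete, here is a minimal Python sketch that assembles the statement from whichever criteria were actually selected. build_query, the Cars table, and the column names are hypothetical; column names must come from a trusted whitelist, while the values travel as bind parameters.

```python
def build_query(filters):
    """Build (sql, params) from a mapping of column name -> selected values.

    Criteria with no selected values are left out of the query entirely.
    """
    clauses, params = [], []
    for column, values in filters.items():
        if not values:
            continue
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(values)
    sql = "SELECT * FROM Cars"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = build_query({"ColorId": [1, 2], "MakeId": [], "SizeId": [5]})
print(sql)
print(params)
```

The resulting text and parameter list can then be handed to whatever parameterized execution call your driver provides.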
For something this complex, you may want a session table that you update when the user selects their criteria. Then you can join the session table to your items table.
This solution may not scale well to thousands of users, so be careful.
If you want to create dynamic SQL it won't matter if you use the OR approach or the IN approach. SQL Server will process the statements the same way (maybe with little variation in some situations.)
You may also consider using temp tables for this scenario. You can insert the selections for each criterion into temp tables (e.g., #tmpColor, #tmpSize, #tmpMake, etc.). Then you can create a non-dynamic SELECT statement. Something like the following may work:
SELECT <column list>
FROM MyTable
WHERE MyTable.ColorID in (SELECT ColorID FROM #tmpColor)
OR MyTable.SizeID in (SELECT SizeID FROM #tmpSize)
OR MyTable.MakeID in (SELECT MakeID FROM #tmpMake)
The dynamic OR/IN and the temp table solutions work fine if each condition is independent of the other conditions. In other words, if you need to select rows where ((Color is Red and Size is Medium) or (Color is Green and Size is Large)) you'll need to try other solutions.