I understand that this question is based on a very old programming language, and is also based on some poor practices with database design, but I am hoping for some guidance as my VBScript is not up to scratch, and at my job I am unable to change my database structure or my scripting language.
I am working on a webpage that gets its data from a database which originally contained all of its data in one table. However, the database managers at my work have decided to split the large table into many smaller ones. Our webpage contains an online map with point data, which displays in a new page a table of data for each point found. The code to connect to the database originally looked something like this:
sql_select = "SELECT * FROM table_name WHERE master_id='"&key&"'"
Set rs = Conn.Execute(sql_select)
This code worked fine for what it was used for. However, now that the main table has been divided into seven sub-tables, I need code that will select all data from all of the tables and then filter it based on the master id.
I feel like I understand the theory of how to make this work (I am not a computer science major, so this is still somewhat foreign to me), and that I need to implement a union. My main problem is dealing with the syntax of VBScript/ASP in creating this script, as it seems like everything I try doesn't work.
Can anybody please lend me some guidance? Much appreciated!
It sounds like what you actually need is a JOIN rather than a UNION. A JOIN glues together data from several places into a single row, while a UNION stacks the rows of several queries into one longer set of rows.
The DB Query
In this case you have a master table and several sub tables. For the example's sake I'll assume the tables look like:
Table        Columns
---------------------------
MasterTable  MasterId, M1, M2, M3
SubTableA    SubTableAId, MasterId, A1, A2, A3
SubTableB    SubTableBId, MasterId, B1, B2, B3
So if you were wanting to retrieve the columns M1-3 and A1-A3 then you could join the columns from MasterTable with the columns on SubTableA where they have matching MasterId values:
SELECT *
FROM MasterTable MT
INNER JOIN SubTableA STA ON MT.MasterId = STA.MasterId
The MT and the STA are aliases, so we don't have to type out the entire table names when we're clarifying which of the two master ids we're referring to.
If we then only wanted the values for a single, specific master ID we could add that as a where clause on the end:
SELECT *
FROM MasterTable MT
INNER JOIN SubTableA STA ON MT.MasterId = STA.MasterId
WHERE MT.MasterId = ?
If we needed columns from additional tables, we could JOIN to those tables as well:
SELECT *
FROM MasterTable MT
INNER JOIN SubTableA STA ON MT.MasterId = STA.MasterId
INNER JOIN SubTableB STB ON MT.MasterId = STB.MasterId
WHERE MT.MasterId = ?
We are specifying an INNER JOIN because we only want records returned where there are matching values in all three tables. If SubTableB only had values for each master id some of the time, then we would want to switch to a LEFT JOIN or LEFT OUTER JOIN (the two are synonyms), which tells the database to return our columns from the LEFT side of the join even when there are no matching rows on the right side. The right-side columns come back as NULL in that case.
So to get all columns in a situation where we may not always have SubTableB records but still want the Master and SubTableA columns:
SELECT *
FROM MasterTable MT
INNER JOIN SubTableA STA ON MT.MasterId = STA.MasterId
LEFT JOIN SubTableB STB ON MT.MasterId = STB.MasterId
WHERE MT.MasterId = ?
If you may have multiple records in one of the sub-tables for a single MasterId, the query returns one row for each combination of table rows that matches the JOIN criteria.
So if we had one record in MasterTable with an ID of 5, one in SubTableA with a master id of 5, and three in SubTableB with a master id of 5, we would actually receive 3 rows back, one for each pairing of the MasterTable and SubTableA values with a SubTableB row. Your calling code will therefore either need to handle the repeated values/rows, or you can split the work and query SubTableB separately.
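To make the row counts above concrete, here is a minimal sketch in Python using an in-memory SQLite database rather than the asker's real database. The table and column names follow the example schema; the sample values and the second master id (6, which has no SubTableB rows) are made up for illustration.

```python
import sqlite3

# Hypothetical miniature of the example schema, held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MasterTable (MasterId INTEGER PRIMARY KEY, M1 TEXT);
    CREATE TABLE SubTableA (SubTableAId INTEGER PRIMARY KEY, MasterId INTEGER, A1 TEXT);
    CREATE TABLE SubTableB (SubTableBId INTEGER PRIMARY KEY, MasterId INTEGER, B1 TEXT);

    INSERT INTO MasterTable VALUES (5, 'm5'), (6, 'm6');
    INSERT INTO SubTableA (MasterId, A1) VALUES (5, 'a5'), (6, 'a6');
    -- Master 5 has three SubTableB rows; master 6 has none.
    INSERT INTO SubTableB (MasterId, B1) VALUES (5, 'b5-1'), (5, 'b5-2'), (5, 'b5-3');
""")

QUERY = """
    SELECT MT.MasterId, MT.M1, STA.A1, STB.B1
    FROM MasterTable MT
    INNER JOIN SubTableA STA ON MT.MasterId = STA.MasterId
    {join_type} JOIN SubTableB STB ON MT.MasterId = STB.MasterId
    WHERE MT.MasterId = ?
    ORDER BY STB.B1
"""


def fetch(join_type, master_id):
    """Run the three-table query with INNER or LEFT for the SubTableB join."""
    sql = QUERY.format(join_type=join_type)
    return conn.execute(sql, (master_id,)).fetchall()


rows_for_5 = fetch("INNER", 5)   # one row per matching SubTableB record
inner_for_6 = fetch("INNER", 6)  # no SubTableB match, so the row is dropped
left_for_6 = fetch("LEFT", 6)    # row is kept and B1 comes back as NULL (None)

print(len(rows_for_5), inner_for_6, left_for_6)
```

Only the join keyword is formatted into the SQL text here; the master id itself is still passed as a bound parameter.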
VBScript SQL Code
The code you provided, while a sample, has two downsides. One, because you are concatenating the argument into the database string, special characters in that string could break your SQL statement or, if you are retrieving that value from a form POST, Querystring, etc, then an end user could actually inject a SQL statement into the variable to run on your database (potentially dropping valuable records, manipulating the database, or potentially even uploading trojans or executables to the server, depending on the permissions level).
The second downside is performance. Depending on your database engine, you could take minor performance hits from implicit type conversions, lack of plan caching, etc. simply because the statement was delivered as a non-parametrized string.
To resolve these issues, you should look into using parametrized SQL statements, or creating a stored procedure for your statement and calling that with parametrized values. The ADODB Command object has support for parameters:
Dim yourKeyValue : yourKeyValue = 5
Dim objConn, objCommand, rsResults
Set objConn = Server.CreateObject("ADODB.Connection")
objConn.Open("Your connection string")
Set objCommand = Server.CreateObject("ADODB.Command")
Set objCommand.ActiveConnection = objConn 'Set reuses the already-open connection object
objCommand.CommandText = "YourStoreProcName"
objCommand.CommandType = 4 '4 = adCmdStoredProc (stored procedure)
objCommand.Parameters.Append objCommand.CreateParameter("Key", 3, 1, , yourKeyValue) '3 = adInteger, 1 = adParamInput
set rsResults = objCommand.Execute()
More information on parameters and constants: http://www.w3schools.com/ado/met_comm_createparameter.asp
More info on CommandType: http://www.w3schools.com/ado/prop_comm_commandtype.asp
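You could change out the command text for a SQL string instead of a stored procedure name, set CommandType to 1 (adCmdText), and use ? marks as placeholders for the parameters you define. Append the parameters in the same order as the placeholders appear in the statement.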
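The difference between concatenating a value and binding it to a ? placeholder can be sketched as follows. This is Python with an in-memory SQLite database rather than VBScript/ADO, so it only illustrates the principle; the label column, the sample rows and the hostile input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (master_id TEXT, label TEXT);
    INSERT INTO table_name VALUES ('A1', 'first point'), ('B2', 'second point');
""")

# A value an attacker might send in a query string (hypothetical).
hostile_key = "A1' OR '1'='1"

# Concatenation: the quote inside the input ends the string literal early,
# so the OR clause becomes part of the statement and every row matches.
concatenated = conn.execute(
    "SELECT label FROM table_name WHERE master_id='" + hostile_key + "'"
).fetchall()

# Placeholder: the value travels separately from the SQL text, so it is
# only ever compared as data and matches nothing here.
parameterized = conn.execute(
    "SELECT label FROM table_name WHERE master_id = ?", (hostile_key,)
).fetchall()

# A normal key works the same way through the placeholder.
legitimate = conn.execute(
    "SELECT label FROM table_name WHERE master_id = ?", ("A1",)
).fetchall()

print(len(concatenated), parameterized, legitimate)
```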
More Cleanup
Just a little more cleanup real quick.
Rather than specifying a * for your query, you will probably want to list the columns you actually want returned. This reduces the data coming across the wire to only what you need, tells the database exactly what you need so it doesn't have to look up the field names on its own, and reduces some of the confusion in your recordset object (you would otherwise have quite a few master id columns, for instance).
As Dee mentioned in their response, you probably should include whoever made the database changes, but I don't think you want to just dump this in their laps because you will want to understand why and how it works, otherwise trying to make additional changes or maintaining the app will be that much harder than it already is.
The database managers split the table up, so tell them you need a new SQL statement to replace the one you have. Since they know the new structure, they can write it for you in a couple of minutes.
Related
noob SQL question here. I have a database in SQL Server 2012 and access to another MySQL database with information I'd like to pull.
I would like to update a column representing quantities of a workstation, and that information is present on the other database. I would like to structure a query that updates quantities of all items where model numbers match in each database.
I'll include two select statements to demonstrate what I'd like to work with.
The data from the other database I want to pull from.
SELECT *
FROM OPENQUERY(S3MONITOR,
'SELECT count(BuildDriver) AS ''# of Workstations'', BuildDriver AS ''Workstation Model''
FROM workstation
GROUP BY BuildDriver')
Which produces this result
My database that I'd like to update.
SELECT NumOfDevices, Model
FROM dbo.Currency
INNER JOIN dbo.SubCategory
ON Currency.SubCategoryId = SubCategory.SubCategoryId
WHERE SubCategory.SubCategoryName = 'WorkStation';
Which produces this result
The models will be updated in my database to make sure model names correspond to one another but in the interim I'd like to test a query using the M93z. So what would be the most ideal way of updating NumofDevices in my database using # of Workstations in the other database, where the model names are equal?
Sorry if this is an obvious question, I'm still an SQL noob and it's still not intuitive to me when it comes to these kinds of things.
EDIT: The end goal is to routinely update all Workstation quantities nightly through an SQL Server Agent scheduled job but I'll cross that bridge when I come to it (I actually have some faith that I'll be able to apply the query solution to this job as soon as it's figured out) but I want to include this information in the event that it changes any of your suggestions.
You can use a join in the UPDATE statement. Give the columns of the remote query plain aliases so they can be referenced by name:
with newvals as (
SELECT *
FROM OPENQUERY(S3MONITOR,
'SELECT count(BuildDriver) AS num_workstations, BuildDriver AS model
FROM workstation
GROUP BY BuildDriver'
) oq
)
update c
set NumOfDevices = newvals.num_workstations
from dbo.Currency c join
dbo.SubCategory sc
on c.SubCategoryId = sc.SubCategoryId join
newvals
on newvals.model = c.model
where sc.SubCategoryName = 'WorkStation';
Note: This updates values already in the table. You might want to use merge if you want to add new rows as well. If this is the case, ask another question, because this question is specifically about updating.
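The shape of that update can be tried out locally with the sketch below. It is an assumption-laden stand-in: it runs on an in-memory SQLite database instead of SQL Server with a linked MySQL server, a local workstation table replaces the OPENQUERY call, and a correlated subquery replaces SQL Server's UPDATE ... FROM join so that it runs on older SQLite versions too. All sample rows other than the M93z model name are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the remote MySQL table.
    CREATE TABLE workstation (BuildDriver TEXT);
    INSERT INTO workstation VALUES ('M93z'), ('M93z'), ('M93z'), ('OtherModel');

    CREATE TABLE SubCategory (SubCategoryId INTEGER PRIMARY KEY, SubCategoryName TEXT);
    INSERT INTO SubCategory VALUES (1, 'WorkStation'), (2, 'Printer');

    CREATE TABLE Currency (Model TEXT, NumOfDevices INTEGER, SubCategoryId INTEGER);
    INSERT INTO Currency VALUES ('M93z', 0, 1), ('Unmatched', 7, 1), ('M93z', 0, 2);
""")

conn.execute("""
    WITH newvals AS (
        SELECT COUNT(BuildDriver) AS num_workstations, BuildDriver AS model
        FROM workstation
        GROUP BY BuildDriver
    )
    UPDATE Currency
    SET NumOfDevices = (
        SELECT newvals.num_workstations
        FROM newvals
        WHERE newvals.model = Currency.Model
    )
    -- Only touch workstation rows that have a matching model in newvals.
    WHERE Currency.SubCategoryId IN (
        SELECT SubCategoryId FROM SubCategory WHERE SubCategoryName = 'WorkStation'
    )
      AND Currency.Model IN (SELECT newvals.model FROM newvals)
""")

result = conn.execute(
    "SELECT Model, NumOfDevices, SubCategoryId FROM Currency "
    "ORDER BY SubCategoryId, Model"
).fetchall()
print(result)
```

Rows without a matching model, and rows outside the WorkStation subcategory, keep their existing NumOfDevices value.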
I have imported data from a monolithic csv file into MS Access. One of my fields is notes which can be pretty much anything, or be any length. Regardless, its content is often repeated across records.
So I have split each unique note into a new table and added an 'autonumber' field to serve as the primary key. All good so far.
The problem is that I now need to link the original table with the new notes one, but the original table has no knowledge of which ID should match which note, and so I am unable to replace the note with the ID.
I also cannot link the note field on the original table with the note field in the note table (nor would I want to) because the fields are 'long text'.
Since you can't JOIN on Memo fields in Access, you will need to create a CROSS JOIN (list both tables) and filter where the note text is equal. From there, it's just an UPDATE:
UPDATE Products, Notes
SET Products.NoteID = Notes.NoteID
WHERE (Products.Notes=Notes.Notes);
sgeddes has the same idea in his code as well.
You'll need to use an UPDATE statement with a JOIN:
update o
set o.noteid = n.id
from original o
join notes n on o.notes = n.note;
SQL Fiddle Demo
If this is a Memo field and you can't use a JOIN in Access, then this trick should work as well:
update o
set o.noteid = n.id
from original o, notes n
where o.notes = n.note;
I am working on a project where I have inherited an SQL Join that uses join
criteria in a format I have not seen before. The basic format of the join
is this:
Proc Sql;
create table mytest as
select t1.var1,
t1.var2,
t1.var3
from mysource1 t1
left join mysource2 t2 on
(t1.var1 = t2.var1), myparam t3;
quit;
The bit I am confused about is why myparam is included as a join
condition within the ON statement of the LEFT JOIN. The contents of
'myparam' is derived from the SAS Parameter File we have defined on our
system and contains just one row, with two columns. One contains month
start date, the other month end date.
None of the columns in this parameter file are in the other two source
tables and none of the columns in the parameter file appear in the final
output (they aren't referenced in the SELECT statement, so they can't be).
I'm guessing that including the 'myparam' dataset in this context is
somehow using the date values within it to cut the data in mysource1 and
mysource2, but could someone please provide confirmation that this is the
case and the exact mechanism at work please?
Thanks
This is an unusual construction for a join in SAS, but it's basically a Cartesian product. The myparam table isn't part of the LEFT JOIN condition but a new table, starting a new join. Any table included using a comma and no join condition is joined so that every row from one side is paired with every row from the other. This can be dangerous when two large tables are used (as the number of rows is multiplied), but in your case the myparam table has one row, so it's only 1 x n.
However, saying all that, the query you have come across doesn't use any values from myparam (or mysource2 for that matter), so I don't see why these tables are being joined on at all. As long as myparam really has exactly one row and mysource2 has at most one row per var1 (otherwise the joins would change the number of rows returned), I'm fairly certain the following query would be equivalent:
proc sql;
create table mytest as
select var1,var2,var3
from mysource1;
quit;
I'm aware this answer might come across as incomplete, so please feel free to comment...
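The effect of the trailing comma join can be checked with a small sketch. This uses Python with an in-memory SQLite database instead of SAS PROC SQL, on the assumption that the comma behaves the same way (as a cross join) in both; the column names of myparam and all the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mysource1 (var1 INTEGER, var2 TEXT, var3 TEXT);
    CREATE TABLE mysource2 (var1 INTEGER);
    CREATE TABLE myparam (month_start TEXT, month_end TEXT);

    INSERT INTO mysource1 VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z');
    INSERT INTO mysource2 VALUES (1), (2);
""")

# Same shape as the inherited query: a LEFT JOIN followed by a comma join.
QUERY = """
    SELECT t1.var1, t1.var2, t1.var3
    FROM mysource1 t1
    LEFT JOIN mysource2 t2 ON (t1.var1 = t2.var1), myparam t3
"""


def count_rows():
    """Number of rows the inherited-style query returns right now."""
    return len(conn.execute(QUERY).fetchall())


counts = {}
counts["empty myparam"] = count_rows()    # 3 x 0 rows
conn.execute("INSERT INTO myparam VALUES ('2024-01-01', '2024-01-31')")
counts["one-row myparam"] = count_rows()  # 3 x 1 rows
conn.execute("INSERT INTO myparam VALUES ('2024-02-01', '2024-02-29')")
counts["two-row myparam"] = count_rows()  # 3 x 2 rows

print(counts)
```

The dates in myparam never filter anything; only the number of rows in it changes the output.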
When I have to select a number of fields from different tables:
do I always need to join tables?
which tables do I need to join?
which fields do I have to use for the join/s?
do the joins effects reflect on fields specified in select clause or on where conditions?
Thanks in advance.
Think about joins as a way of creating a new table (just for the purposes of running the query) with data from several different sources. Absent a specific example to work with, let's imagine we have a database of cars which includes these two tables:
CREATE TABLE car (plate_number CHAR(8),
state_code CHAR(2),
make VARCHAR(128),
model VARCHAR(128));
CREATE TABLE state (state_code CHAR(2),
state_name VARCHAR(128));
If you wanted, say, to get a list of the license plates of all the Hondas in the database, that information is already contained in the car table. You can simply SELECT * FROM car WHERE make='Honda';
Similarly, if you wanted a list of all the states beginning with "A" you can SELECT * FROM state WHERE state_name LIKE 'A%';
In either case, since you already have a table with the information you want, there's no need for a join.
You may even want a list of cars with Colorado plates, but if you already know that "CO" is the state code for Colorado you can SELECT * FROM car WHERE state_code='CO'; Once again, the information you need is all in one place, so there is no need for a join.
But suppose you want a list of Hondas including the name of the state where they're registered. That information is not already contained within a table in your database. You will need to "create" one via a join:
car INNER JOIN state ON (car.state_code = state.state_code)
Note that I've said absolutely nothing about what we're SELECTing. That's a separate question entirely. We also haven't applied a WHERE clause limiting what rows are included in the results. That too is a separate question. The only thing we're addressing with the join is getting data from two tables together. We now, in effect, have a new table called car INNER JOIN state with each row from car joined to each row in state that has the same state_code.
Now, from this new "table" we can apply some conditions and select some specific fields:
SELECT plate_number, make, model, state_name
FROM car
INNER JOIN state ON (car.state_code = state.state_code)
WHERE make = 'Honda'
So, to answer your questions more directly, do you always need to join tables? Yes, if you intend to select data from both of them. You cannot select fields from car that are not in the car table. You must first join in the other tables you need.
Which tables do you need to join? Whichever tables contain the data you're interested in querying.
Which fields do you have to use? Whichever fields are relevant. In this case, the relationship between cars and states is through the state_code field in both tables. I could just as easily have written
car INNER JOIN state ON (state.state_code = car.plate_number)
This would, for each car, show any states whose abbreviations happen to match the car's license plate number. This is, of course, nonsensical and likely to find no results, but as far as your database is concerned it's perfectly valid. Only you know that state_code is what's relevant.
And does the join affect SELECTed fields or WHERE conditions? Not really. You can still select whatever fields you want and you can still limit the results to whichever rows you want. There are two caveats.
First, if you have the same column name in both tables (e.g., state_code) you cannot select it without clarifying which table you want it from. In this case I might write SELECT car.state_code ...
Second, when you're using an INNER JOIN (or on many database engines just a JOIN), only rows where your join conditions are met will be returned. So in my nonsensical example of looking for a state code that matches a car's license plate, there probably won't be any states that match. No rows will be returned. So while you can still use the WHERE clause however you'd like, if you have an INNER JOIN your results may already be limited by that condition.
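The car and state example can be run end to end with the sketch below, which uses Python and an in-memory SQLite database as a stand-in for whatever engine you use. The sample rows are invented, including one car whose state_code ('ZZ') has no matching state, to show the INNER JOIN silently dropping it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (plate_number CHAR(8),
                      state_code CHAR(2),
                      make VARCHAR(128),
                      model VARCHAR(128));
    CREATE TABLE state (state_code CHAR(2),
                        state_name VARCHAR(128));

    INSERT INTO state VALUES ('CO', 'Colorado'), ('AZ', 'Arizona');
    INSERT INTO car VALUES
        ('ABC-123',  'CO', 'Honda', 'Civic'),
        ('XYZ-789',  'AZ', 'Honda', 'Accord'),
        ('JKL-456',  'CO', 'Ford',  'Focus'),
        ('NOSTATE1', 'ZZ', 'Honda', 'Fit');
""")

# Sensible join: match cars to states on the shared state_code.
hondas = conn.execute("""
    SELECT car.plate_number, car.make, car.model, state.state_name
    FROM car
    INNER JOIN state ON (car.state_code = state.state_code)
    WHERE car.make = 'Honda'
    ORDER BY car.plate_number
""").fetchall()

# Nonsensical but valid join: no state code equals a plate number.
nonsense = conn.execute("""
    SELECT car.plate_number, state.state_name
    FROM car
    INNER JOIN state ON (state.state_code = car.plate_number)
""").fetchall()

print(hondas)
print(nonsense)
```

The Honda with state code 'ZZ' is missing from the first result even though the WHERE clause would accept it, which is the second caveat in action.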
Very broad question; I would suggest doing some reading on it first, but in summary:
1. joins can make life much easier and queries faster, in a nut shell try to
2. the ones with the data you are looking for
3. a field that is in both tables and generally is unique in at least one
4. Yes, essentially you are creating one larger table with joins. If there are two fields with the same name, you will need to reference them as tablename.columnname.
do I always need to join tables?
No - you could perform multiple selects if you wished
which tables do I need to join?
Any that you want the data from (and need to be related to each other)
which fields do I have to use for the join/s?
Any that are the same in any tables within the join (usually primary key)
do the joins effects reflect on fields specified in select clause or on where conditions?
No, however outer joins can cause problems
(1) What else but tables would you want to join in MySQL?
(2) those from which you want to correlate and retrieve fields (=data)
(3) Best use indexed fields (unique identifiers) for the join, as this is fast.
For example, to retrieve a user's email and all of that user's comments in a two-table database
(with tables tableA=user_settings and tableB=comments, both having the column uid to identify the user by):
select * from user_settings as uset join comments as c on uset.uid = c.uid where uset.email = "test#stackoverflow.com";
(4) both...
I'm always discouraged from using one, but is there a circumstance when it's the best approach?
It's rare, but I have a few cases where it's used. Typically in exception reports or ETL or other very peculiar situations where both sides have data you are trying to combine.
The alternative is to use an INNER JOIN, a LEFT JOIN (with right side IS NULL) and a RIGHT JOIN (with left side IS NULL) and do a UNION - sometimes this approach is better because you can customize each individual join more obviously (and add a derived column to indicate which side is found or whether it's found in both and which one is going to win).
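A minimal sketch of that alternative is shown below, using Python with an in-memory SQLite database and two hypothetical tables, table_a and table_b. Two choices here are mine rather than the answer's: the RIGHT JOIN leg is written as a LEFT JOIN with the tables swapped (older SQLite versions lack RIGHT JOIN), and UNION ALL is used instead of UNION because the three legs cannot overlap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, val TEXT);
    CREATE TABLE table_b (id INTEGER, val TEXT);
    INSERT INTO table_a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO table_b VALUES (2, 'b2'), (3, 'b3');
""")

rows = conn.execute("""
    -- Leg 1: ids present on both sides.
    SELECT a.id AS id, a.val AS a_val, b.val AS b_val, 'both' AS found_in
    FROM table_a a
    INNER JOIN table_b b ON a.id = b.id

    UNION ALL

    -- Leg 2: ids only in table_a (LEFT JOIN with right side IS NULL).
    SELECT a.id, a.val, NULL, 'a only'
    FROM table_a a
    LEFT JOIN table_b b ON a.id = b.id
    WHERE b.id IS NULL

    UNION ALL

    -- Leg 3: ids only in table_b; stands in for the RIGHT JOIN leg.
    SELECT b.id, NULL, b.val, 'b only'
    FROM table_b b
    LEFT JOIN table_a a ON a.id = b.id
    WHERE a.id IS NULL

    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

The found_in column plays the role of the derived column mentioned above, and each leg can carry its own extra conditions without affecting the other two.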
I noticed that the wikipedia page provides an example.
For example, this allows us to see
each employee who is in a department
and each department that has an
employee, but also see each employee
who is not part of a department and
each department which doesn't have an
employee.
Note that I never encountered the need of a full outer join in practice...
I've used full outer joins when attempting to find mismatched, orphaned data, from both of my tables and wanted all of my result set, not just matches.
Just today I had to use Full Outer Join. It is handy in situations where you're comparing two tables. For example, the two tables I was comparing were from different systems so I wanted to get following information:
Table A has any rows that are not in Table B
Table B has any rows that are not in Table A
Duplicates in either Table A or Table B
For matching rows, whether values are different (example: Table A and Table B both have Acct# 12345, LoanID abc123, but Interest Rate or Loan Amount is different)
In addition, I created an additional field in SELECT statement that uses a CASE statement to 'comment' why I am flagging this row. Example: Interest Rate does not match / The Acct doesn't exist in System A, etc.
Then saved it as a view. Now, I can use this view to either create a report and send it to users for data correction/entry or use it to pull specific population by 'comment' field I created using a CASE statement (example: all records with non-matching interest rates) in my stored procedure and automate correction, etc.
If you want to see an example, let me know.
The rare times I have used it have been around testing for NULLs on both sides of the join, in case I think data is missing from the initial INNER JOIN used in the SQL I'm testing.
They're handy for finding orphaned data but I rarely use them in production code. I wouldn't be "always discouraged from using one" but I think in the real world they are less frequently the best solution compared to inners and left/right outers.
In the rare times that I used a Full Outer Join, it was for data analysis and comparison purposes, such as comparing two customer tables from different databases to find duplicates in each table, comparing the two tables' structures, finding null values in one table compared to the other, or finding information missing from one table compared to the other.
For example, suppose you have two tables: one containing customer data and another containing order data. A full outer join would allow you to see all customers and all orders, even if some customers have no orders or some orders have no corresponding customer. This can help you identify any gaps in the data and ensure that all relevant information is included in the result set.
It's important to note that a full outer join can produce a huge result set since it includes all rows from both tables. This can be inefficient in terms of performance, so it's best to use a full outer join only when it is necessary to include all rows from both tables.
SELECT *
FROM table1
FULL OUTER JOIN table2
ON table1.column_name = table2.column_name;
This will return all rows from both table1 and table2, filling in NULL values for missing matches on either side.