(NB. This question is not a duplicate of that one, since I am dealing with an ORM system.)
I have a table in my database to store all Contact information. Some of the columns for each contact are fixed (e.g. Id, InsertDate and UpdateDate). In my program I would like to give the user the option to add or remove properties for each contact.
Now there are of course two alternatives here:
1. Save it all in one table, and add or remove entire columns when the user needs to.
2. Create a key-value table that saves each property alongside its type, and connect each record to the user's id.
These alternatives are both doable, but I am wondering which one is better in terms of speed. In the program it will be very common for the user to view the entire contact list to check for updates. Plus, I am using an ORM framework (Microsoft's Entity Framework) to handle database queries, so if the user adds and removes columns from a table all the time, it will be difficult to keep them mapped in my program. But again, if alternative (1) is significantly better than (2), then I can reconsider the key-value option.
I have actually done both of these.
Example #1
Large, wide table with columns of data holding names, phone, address and lots of small integer values of information that tracked details of the clients.
Example #2
Many different tables separating out all of the Character Varying data fields, the small integer values etc.
Example #1 was a lot faster to code for, but in terms of performance it got pretty slow once the table filled with records. 5,000 records wasn't a problem; when it reached 50,000 there was noticeable performance degradation.
Example #2 was built later in my coding experience, and was built to resolve the issues found in Example #1. While it took more work to get the records I was after (LEFT JOIN this and UNION that), it was MUCH faster, as you could pick and choose EXACTLY what the client was after without having to scan a massive wide table full of data that wasn't all being requested.
I would recommend Example #2 to fit your #2 in the question.
And the USER-specified columns for their data set could be stored in a table of their own (depending on how many you have, I suppose), which would allow you to draw on the table specific to that USER and give you unlimited ability to add and remove columns to suit that particular setup.
You could then also have another table that keeps track of the custom columns in the custom-column table, which would give you the ability to "recover" columns later, as in "Do you want to add this to your current column choices, or to one of these columns you have deleted in the past?"
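To make option #2 from the question concrete, here is a minimal sketch of the key-value layout using sqlite3 (the table and column names here are my own invention, not from your schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Contact (
    Id INTEGER PRIMARY KEY,
    InsertDate TEXT,
    UpdateDate TEXT
);
CREATE TABLE ContactProperty (
    ContactId INTEGER REFERENCES Contact(Id),
    Name TEXT NOT NULL,
    Value TEXT,
    ValueType TEXT,            -- e.g. 'string', 'int', 'date'
    PRIMARY KEY (ContactId, Name)
);
""")

conn.execute("INSERT INTO Contact VALUES (1, '2024-01-01', '2024-01-01')")
conn.executemany(
    "INSERT INTO ContactProperty VALUES (?, ?, ?, ?)",
    [(1, 'Phone', '555-1234', 'string'),
     (1, 'Age', '42', 'int')])

# Viewing the whole contact list costs one extra join (or a pivot),
# but adding or removing a "column" is just an INSERT or DELETE here.
rows = conn.execute("""
    SELECT c.Id, p.Name, p.Value
    FROM Contact c JOIN ContactProperty p ON p.ContactId = c.Id
    ORDER BY c.Id, p.Name
""").fetchall()
print(rows)
```

The point of the sketch is that user-defined properties never change the schema, so an ORM mapping stays stable.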
I have a table with 5 fields that are all the same. They each can hold a reference to a row from another table with relationships. I want to update all of these fields at the same time on a row, but with a randomly selected row from the other table for each field (with no duplicates). I am not sure how, in Access SQL, you can update a lookup/relationship field like this. Any advice is greatly appreciated.
The simple answer is that you can't, not as it appears you would like to, anyway. The closest thing possible would be to create an Update query with parameters and then feed in your 5 values using VBA. Since you will have to use VBA anyway, you may as well go the whole hog and conduct the entire process with Recordsets.
But that's not the fiddly part; selecting your source values is (relatively speaking). What you will need to do is open a Recordset on your source table and hook it up to your random-no-duplicates logic to select your 5 record references; then you open a Recordset on your destination table and drop them into the appropriate fields.
This tutorial will get you started on Recordsets: http://www.utteraccess.com/wiki/index.php/Recordsets_for_Beginners
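The random-no-duplicates selection itself is the easy bit to sketch. In Python terms (just to show the idea; the VBA version would do the same by walking a Recordset with Rnd, and the ids below are made up):

```python
import random

# Suppose these are the primary keys read from the source table's Recordset.
source_ids = [101, 102, 103, 104, 105, 106, 107, 108]

# random.sample draws 5 distinct values in one call, which is exactly
# the "random with no duplicates" requirement for the 5 fields.
picks = random.sample(source_ids, 5)
print(picks)
```

Each of the 5 picked keys then goes into one of the 5 destination fields.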
I have a DB (Access 2010) that I am pulling data from, but I am trying to make it easier to pull specific cases instead of mucking about in Excel.
We have about 78 product type codes that we classify as a certain account type. Unfortunately I can't use an IN() function because there are too many characters (there is a 1024-character limit). I looked online for help and it was suggested that I make a table to inner join on the product codes that I want.
I created a table with the codes I want to pull, then joined on the productcodetype field in the linked database table. Unfortunately, when I run the SQL nothing shows up, just blank results. I tried different join combinations to no avail, read up further, and found that you can't enforce referential integrity on linked DB tables from non-linked DB tables.
I think this is my problem but I'm not sure, and I don't know if I'm using the right language, but I can't find a similar issue to mine so I'm hoping it's an easy fix and I'm just not thinking about it the right way.
Is there any way to select certain cases (78 product type codes) from a large database using something like IN() or a reference table when I can't create a new table in the linked db?
Thank you,
K
You must use two tables and build a query that joins them. If your join doesn't return any results, make sure the joined fields are of the same data type and really share the same values.
If your data source is Excel, make sure there aren't any trailing blanks or other "invisible" characters.
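To illustrate the trailing-blank problem, here is a small sqlite3 sketch (the table names are invented, and Access's Trim() plays the same role as TRIM() here). Joining on trimmed values is a quick way to confirm the diagnosis before you clean the source data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CodesWanted (ProductCode TEXT);
CREATE TABLE Products (ProductCode TEXT, Name TEXT);
""")
conn.execute("INSERT INTO CodesWanted VALUES ('AB1')")
# Note the trailing blank imported from Excel:
conn.execute("INSERT INTO Products VALUES ('AB1 ', 'Widget')")

plain = conn.execute("""
    SELECT p.Name FROM Products p
    JOIN CodesWanted c ON p.ProductCode = c.ProductCode
""").fetchall()

trimmed = conn.execute("""
    SELECT p.Name FROM Products p
    JOIN CodesWanted c ON TRIM(p.ProductCode) = TRIM(c.ProductCode)
""").fetchall()

print(plain, trimmed)  # the plain join finds nothing; the trimmed one does
```

If the trimmed join returns rows and the plain one doesn't, the fix is to clean the data (or keep trimming in the join).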
We receive a data feed from our customers, and we get roughly the same schema each time, though it can change on the customer end as they are using a 3rd-party application. When we receive the data files, we import the data into a staging database with a table for each data file (students, attendance, etc.).
We then want to compare that data to the data that already exists in the database for that customer and see what has changed (either a column value has changed, or the whole row was possibly deleted) since the previous run. We then want to write the updated values or deleted rows to an audit table so we can go back and see what data changed from the previous data import. We don't want to update the data itself; we only want to record what's different between the two datasets. We will then delete all the data from the customer database and import the data exactly as-is from the new data files, without changing it (this directive has been handed down and cannot change).
The big problem is that I need to do this dynamically, since I don't know exactly what schema I'm going to get from our customers, as they can make customizations to their tables. I need to be able to dynamically determine what tables exist in the destination and what their structure is, and then look at the source and compare the values to see what has changed in the data.
Additional info:
There are no ID columns in the source, though there are several columns that can be used together as a surrogate key to identify a distinct row.
I'd like to be able to do this generically for each table without having to hard-code values in, though I might have to do that for the surrogate keys for each table in a separate reference table.
I can use either SSIS, SPs, triggers, etc., whichever would make more sense. I've looked at all, including tablediff, and none seem to have everything I need or the logic starts to get extremely complex once I get into them.
Of course any specific examples anyone has of something like this they have already done would be greatly appreciated.
Let me know if there's any other information that would be helpful.
Thanks
I've worked on a similar problem and used a series of metadata tables to dynamically compare datasets. These metadata tables described which datasets needed to be staged and which combination of columns (and their data types) served as the business key for each table.
This way you can dynamically construct a SQL query (e.g., with an SSIS script component) that performs a full outer join to find the differences between the two datasets.
You can join your own metadata with SQL Server's metadata (using sys.* or INFORMATION_SCHEMA.*) to detect whether the columns still exist in the source and the data types are as you anticipated.
Redirect unmatched metadata to an error flow for evaluation.
This way of working is very risky, but it can be done if you maintain your metadata well.
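As a rough illustration of the metadata-driven comparison, here is the idea in pure Python standing in for the generated SQL (the metadata shape, table name and sample rows are all invented):

```python
# Metadata describing, per table, which columns form the business key
# and which columns to compare for changes.
meta = {"students": {"key": ["last_name", "dob"], "compare": ["grade"]}}

staging = [{"last_name": "Smith", "dob": "2001-01-01", "grade": "B"},
           {"last_name": "Jones", "dob": "2002-05-05", "grade": "A"}]
existing = [{"last_name": "Smith", "dob": "2001-01-01", "grade": "A"},
            {"last_name": "Brown", "dob": "2003-03-03", "grade": "C"}]

def diff(table, staging_rows, existing_rows):
    key_cols = meta[table]["key"]
    cmp_cols = meta[table]["compare"]
    keyfn = lambda r: tuple(r[c] for c in key_cols)
    s = {keyfn(r): r for r in staging_rows}
    e = {keyfn(r): r for r in existing_rows}
    # Keys in both sides with differing compare-columns are updates;
    # keys only on one side are deletes/inserts (like a full outer join).
    changed = sorted(k for k in s.keys() & e.keys()
                     if any(s[k][c] != e[k][c] for c in cmp_cols))
    deleted = sorted(e.keys() - s.keys())
    inserted = sorted(s.keys() - e.keys())
    return changed, deleted, inserted

changed, deleted, inserted = diff("students", staging, existing)
print(changed, deleted, inserted)
```

In the real thing, the same metadata drives a generated FULL OUTER JOIN on the business-key columns instead of in-memory dictionaries, and the three result sets feed the audit table.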
If you want to compare two tables to see what is different, the keyword is EXCEPT:
SELECT col1, col2, ... FROM table1
EXCEPT
SELECT col1, col2, ... FROM table2
This gives you everything in table1 that is not in table2.
SELECT col1, col2, ... FROM table2
EXCEPT
SELECT col1, col2, ... FROM table1
This gives you everything in table2 that is not in table1.
Assuming you have some kind of useful, durable primary key on the two tables: a key that appears in both result sets is a change; a key only in the first set is an insert; a key only in the second set is a delete.
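Here is the same thing end to end via sqlite3, which also supports EXCEPT (the sample rows are made up): id 2 shows up in both difference sets (a change), id 3 only in the first (an insert), and id 4 only in the second (a delete).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, val TEXT);
CREATE TABLE table2 (id INTEGER, val TEXT);
INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO table2 VALUES (1, 'a'), (2, 'B'), (4, 'd');
""")

in1_not2 = conn.execute(
    "SELECT id, val FROM table1 EXCEPT "
    "SELECT id, val FROM table2 ORDER BY id").fetchall()
in2_not1 = conn.execute(
    "SELECT id, val FROM table2 EXCEPT "
    "SELECT id, val FROM table1 ORDER BY id").fetchall()

print(in1_not2)  # rows in table1 but not table2
print(in2_not1)  # rows in table2 but not table1
```

Note that identical rows (id 1 here) drop out of both sides, which is what makes this so handy for audit diffs.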
I have a table, let's say it stores staff names (this isn't the exact situation, but the concept is the same), and the primary key is set to an autonumber. When adding a new member of staff I want to check whether that name already exists in the database before it is added, and maybe give an error if it already exists. How can I do this from the normal add form for the table?
I tried creating a query for it but that won't work because the form is based on the table and can't use a query as the control source. I saw some examples online saying how to do something like this with VB code, but I couldn't get it to work as it wasn't a simple example and some lines were left out.
Is there any simple way in which this can be done?
In the table design view, you could make the Name column Indexed with No Duplicates.
Then Access itself will reject the entry. I think it will, however, use up one of the autonumbers before rejecting the input.
You're dealing with the issue of pre-qualifying records before inserting them into the database. Simple and absolute rules that you will never violate (like never, ever, ever allowing two records with the same name) can be dealt with through database constraints: in this case, creating an index on the column in question with AllowDuplicates set to No.
However, in the real world pre-qualification is generally more complex. You may need to simply warn the user of a possible duplicate, but allow them to add the record anyway. And you may need to check other tables for certain conditions or to collect information for more than one table at a time.
In these cases you need to write your interface so it is not directly bound to the table (in Access terms, create a form with the record source empty), collect the information in various controls, perform your checks in code (often using DCOUNT and DLOOKUP) and then issue a series of INSERT and UPDATE statements in code using DoCmd.RunSQL.
You can occasionally use some tricks to get around having to code this in the front end, but sooner rather than later you'll encounter cases that require this level of coding.
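A sketch of that pre-check logic, with Python and sqlite3 standing in for DCOUNT and DoCmd.RunSQL (the table and names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Staff (Id INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT)")
conn.execute("INSERT INTO Staff (Name) VALUES ('Jane Jones')")

def add_staff(name):
    # The DCOUNT-style pre-check: count matches before inserting,
    # so the user can be warned instead of hitting a hard constraint.
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM Staff WHERE Name = ?", (name,)).fetchone()
    if n > 0:
        return "possible duplicate"
    # The unbound form would issue this INSERT only after the user confirms.
    conn.execute("INSERT INTO Staff (Name) VALUES (?)", (name,))
    return "added"

first = add_staff("John Smith")
second = add_staff("Jane Jones")
print(first, second)
```

The key difference from a hard unique index is that "possible duplicate" is a warning the user can override, not a rejection.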
I'll put my vote in on the side of using an unbound form to collect the required fields and presenting possible duplicates. Here's an example from a recent app:
(source: dfenton.com)
(I edited real people's names out and put in fake stuff, and my graphics program's anti-aliasing is different from ClearType's, hence the weirdness)
The idea here is that the user puts data in any of the four fields (no requirement for all of them) and clicks the ADD button. The first time, it populates the possible matches. Then the user has to decide whether one of the matches is the intended person or not, and either click ADD again (to add it, even if it's a duplicate), or click the button at the bottom to go to the selected customer.
The colored indicators are intended to convey how close the match is. In this case, the email address put in is an exact match for the first person listed, and exact match on email by itself is considered an exact match. Also, in this particular app, the client wants to minimize having multiple people entered at the same company (it's the nature of their business), so an exact match on Organization is counted as a partial match.
In addition to that, there's matching using Soundex, Soundex2 and Simil, along with substrings and substrings combined with Soundex/Soundex2/Simil. In this case, the second entry is the duplicate, but Soundex and Soundex2 don't catch it, while Simil returns 67% similarity, and I've set the sensitivity to greater than 50%, so "Wightman" shows up as a close match with "Whiteman". Last of all, I'm not sure why the last two are in the list, but there's obviously some reason for it (probably Simil and initials).
I run the names, the company and the email through scoring routines and then use the combination to calculate a final score. I store Soundex and Soundex2 values in each person record. Simil, of course, has to be calculated on the fly, but it works out OK because the Jet/ACE query optimizer knows to restrict on the other fields and thus calls Simil for a much-reduced data set (this is actually the first app I've used Simil for, and it is working great so far).
It takes a bit of a pause to load the possible matches, but is not inordinately slow (the app this version is taken from has about 8K existing records that are being tested against). I created this design for an app that had 250K records in the person table, and it worked just fine when the back end was still Jet, and still works just great after the back end was upsized to SQL Server several years ago.
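For reference, classic Soundex is simple enough to sketch in a few lines (this is plain Soundex only; Soundex2 and Simil are separate algorithms and not shown here). It also bears out the point above: "Whiteman" and "Wightman" get different codes, so plain Soundex alone misses that near-duplicate.

```python
def soundex(name):
    # Standard Soundex letter groups.
    codes = {**dict.fromkeys("BFPV", "1"), **dict.fromkeys("CGJKQSXZ", "2"),
             **dict.fromkeys("DT", "3"), "L": "4",
             **dict.fromkeys("MN", "5"), "R": "6"}
    name = name.upper()
    result = name[0]              # keep the first letter as-is
    prev = codes.get(name[0], "")
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":        # H and W don't reset the previous code
            prev = code           # vowels (code "") do reset it
    return (result + "000")[:4]   # pad/truncate to 4 characters

print(soundex("Whiteman"), soundex("Wightman"))
```

Storing the code in each person record, as described above, means the comparison at lookup time is a cheap indexed equality test.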
The best solution is to get the user to enter a few characters of the first and last name and show a continuous form of all the individuals based on those search criteria. Also display relevant information such as middle name, if available, phone number and address to weed out potential duplicates. Then if a duplicate isn't found then they can add the person.
There will always be two John Smiths or Jane Joneses in every town.
I read of a situation where two women with identical first, middle and last names and birth dates were in a hospital at the same time. Truly scary, that.
Thanks for all the info here. It is great info and I would use it as I am self-taught.
The easiest way I found, as alluded to here, is to use an index on all the fields I want (with No Duplicates). The trick is really to use a multiple-field index (this basically gives you a compound, or "virtual", index made up of more than one field).
The method can be found here: http://en.allexperts.com/q/Using-MS-Access-1440/Creating-Unique-Value-check.htm, but I will repeat it in the event that the link gets removed.
From Access Help:
Prevent duplicate values from being entered into a combination of fields
Create a multiple-field index using the fields you want to prohibit duplicate values for. Leave the Indexes window open when you have finished defining the index.
How?
1. Open the table in Design view.
2. Click Indexes on the toolbar.
3. In the first blank row in the Index Name column, type a name for the index. You can name the index after one of the index fields, or use another name.
4. In the Field Name column, click the arrow and select the first field for the index.
5. In the next row in the Field Name column, select the second field for the index. (Leave the Index Name column blank in that row.) Repeat this step until you have selected all the fields you want to include in this index. The default sort order is Ascending; select Descending in the Sort Order column of the Indexes window to sort the corresponding field's data in descending order.
6. In the upper portion of the Indexes window, click the new index name.
7. In the lower portion of the Indexes window, click the Unique property box, and then click Yes.
You should now be unable to enter records that have the same values in the indexed fields. I ran into problems where I could still enter duplicates if one of the indexed fields contained a space (there is an option to check/ignore null values when you set up the index, but it didn't work for me); the solution still worked in my case because I won't have null values anyway.
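The same behaviour can be sketched outside Access with a composite unique index in sqlite3 (the table and field names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staff (FirstName TEXT, LastName TEXT, Phone TEXT);
-- The multiple-field index: only the full combination is blocked.
CREATE UNIQUE INDEX CheckDuplicates ON Staff (FirstName, LastName, Phone);
""")

conn.execute("INSERT INTO Staff VALUES ('John', 'Smith', '555-1234')")
# Same last name alone is fine; the index covers all three fields together.
conn.execute("INSERT INTO Staff VALUES ('Jane', 'Smith', '555-9999')")

try:
    # Exact repeat of the first row: the engine rejects it, like Access does.
    conn.execute("INSERT INTO Staff VALUES ('John', 'Smith', '555-1234')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

This mirrors the Access setup above: the constraint lives in the table, so every form and query gets the protection for free.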
This is called an 'UPSERT' in the SQL world. The ISO/ANSI SQL:2003 Standard defines a MERGE syntax, which was recently added to SQL Server 2008 with proprietary extensions to the Standard. Happily this is the way the SQL world is going, following the trail blazed by MySQL.
Sadly, the Access Database Engine is an entirely different story. Even a simple UPDATE does not support the SQL-92 scalar subquery syntax; instead it has its own proprietary syntax with arbitrary (unpredictable? certainly undocumented) results. The Windows team scuppered the SQL Server team's attempts to fix this in Jet 4.0. Even now that the Access team has its own format in ACE, they seem uninterested in making changes to the SQL syntax, so the chances of the product embracing a Standard construct -- MERGE, or even their own alternative -- are very remote :(
One obvious workaround, assuming performance is not an issue (as always, this needs to be tested), is to do the INSERT, ignoring any key-failure errors, then immediately do the UPDATE. Even if you are of the (IMO highly dubious) 'autonumber PK on every table' persuasion, you should have a unique key on your natural key, so all should be fine.
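A sketch of that INSERT-then-UPDATE workaround (sqlite3 here purely for illustration; Jet/ACE would behave analogously given a unique key, and the table is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Contact (Email TEXT PRIMARY KEY, Name TEXT)")
conn.execute("INSERT INTO Contact VALUES ('a@example.com', 'Old Name')")

def upsert(email, name):
    # Step 1: try the INSERT and swallow the key-violation error ...
    try:
        conn.execute("INSERT INTO Contact VALUES (?, ?)", (email, name))
    except sqlite3.IntegrityError:
        pass
    # Step 2: ... then UPDATE unconditionally; for a freshly inserted
    # row this is a harmless no-op that rewrites the same values.
    conn.execute("UPDATE Contact SET Name = ? WHERE Email = ?", (name, email))

upsert("a@example.com", "New Name")   # existing key: becomes an update
upsert("b@example.com", "Brand New")  # new key: becomes an insert
rows = conn.execute(
    "SELECT Email, Name FROM Contact ORDER BY Email").fetchall()
print(rows)
```

The cost is one wasted statement per call (a failed INSERT or a redundant UPDATE), which is why it's worth benchmarking before relying on it.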