I am trying to compare a series of tables within an Access database, two local and one linked.
Table A (local) contains UserID, Title, and Position; Table B (linked) contains UserID, Title, and Position from the previous week (records can change from week to week); Table C (local) contains unique UserIDs and Titles.
I need to ensure that all UserIDs contained in Table C still exist in Table A.
I need to ensure that all UserIDs contained in Table C have not had a change in Title or Position since the previous week. If a change has occurred, add the record to a temp table.
I'd prefer to use Access VBA or SQL to accomplish this task, and the information will be displayed in a report.
Basically the same logic works for both checks: use a left join to identify mismatches.
Identify missing users in A
Insert into TableA (userID,Title)
select TableC.UserID, TableC.Title
from TableC
left join TableA on TableC.UserID=TableA.UserID
where TableA.UserID is null
Identify changes from B to A
insert into temp (userID, title, position)
select a.userID, a.title, a.position
from TableA a
left join TableB b on b.userID = a.userID and b.title = a.title and b.position = a.position
where b.userID is null
I have a large table in SQL Server with user activity (Table A) and another table with list of users (Table B).
I need to run through the activity table and do a serialized selection of each user, and put the new records into a third table (Table C).
In other words, for each user in Table B, I need to get one matching record from Table A and put it into a new Table C, then repeat the whole process until everyone gets x number of records each.
The end result is so that I can get a distributed record set in Table C, where each user is equally represented.
You can use window functions or a lateral join (CROSS APPLY in SQL Server). Let's do the lateral join. If you want a random selection of records for each user:
select a.*
from b cross apply
(select top (x) a.*
from a
where a.user_id = b.user_id
order by newid()
) a;
Of course, if there aren't enough records in a, then some users will have fewer records.
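For the window-function alternative mentioned above, here is a minimal sketch, assuming the same names as the query above (a, b, user_id) and that x is the desired number of rows per user:

select t.*
from (select a.*,
             row_number() over (partition by a.user_id
                                order by newid()) as rn   -- random order within each user
      from a
      join (select distinct user_id from b) b
           on b.user_id = a.user_id
     ) t
where t.rn <= x;   -- keep at most x rows per user

The result can then be inserted into Table C with an ordinary INSERT ... SELECT.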
Looking for the best way to identify whether there is a duplicate across keyed values. For instance, if I have t1 called Item and t2 called SKU, and I want to join them together and look for an attribute in the SKU table that is duplicated across that join, what is the best way to write that SQL?
Example:
Item = A
SKUs = 1, 2, 3
Duplicate Attribute in SKU table = testdupe
The assumption is that the value testdupe appears in the SKU table under A - 1 and A - 3, but not under A - 2. The expected result would be to show 2 duplicates. This would then repeat for items B, C, D, etc. and their related SKUs if the join contains an attribute that is duplicated.
Hope that lengthy description is clear and thanks in advance for your input!
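One minimal sketch of how such a check might look, assuming hypothetical Item(ItemId) and SKU(ItemId, SkuId, Attribute) tables: group by item and attribute across the join, and keep only the groups that span more than one SKU.

select i.ItemId, s.Attribute, count(*) as DupeCount
from Item i
join SKU s on s.ItemId = i.ItemId
group by i.ItemId, s.Attribute
having count(*) > 1;   -- e.g. testdupe under item A on SKUs 1 and 3 gives DupeCount = 2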
You have a list of 2.0 million IDs in Table A. You have another list of 3.5 million IDs in Table B. Some customer IDs show up multiple times in each table and some IDs show up in both tables. Which of the following would you use to create (in one step - no subqueries) a Table C that contains a list of distinct (no duplicates) customer IDs present in either Table A, Table B, or both?
UNION, UNION ALL, OUTER JOIN, or UNION JOIN?
UNION is a set operation (in SQL, and more generally in the mathematical field of set theory). A set is a collection of distinct items, like a list that does not contain any duplicates. A union of sets is the combination of all the distinct items in those sets.
So in SQL, the UNION operator combines the results of the queries it joins and returns a set of unique values from the combined list, which is exactly what is asked for here.
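As a minimal sketch, assuming a customer_id column in both tables and SQL Server-style SELECT ... INTO (other databases use CREATE TABLE ... AS SELECT):

select customer_id
into TableC            -- creates Table C in the same statement
from TableA
union                  -- UNION (not UNION ALL) removes duplicates
select customer_id
from TableB;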
Where are Cartesian Joins used in real life?
Can someone please give examples of such a join in any SQL database?
Just a random example: you have a table of cities with columns Id, Lat, Lon, and Name, and you want to show the user a table of distances from one city to another. You would write something like

SELECT c1.Name, c2.Name,
       SQRT((c1.Lat - c2.Lat) * (c1.Lat - c2.Lat) + (c1.Lon - c2.Lon) * (c1.Lon - c2.Lon))
FROM City c1, City c2
Here are two examples:
To create multiple copies of an invoice or other document you can populate a temporary table with names of the copies, then cartesian join that table to the actual invoice records. The result set will contain one record for each copy of the invoice, including the "name" of the copy to print in a bar at the top or bottom of the page or as a watermark. Using this technique the program can provide the user with checkboxes letting them choose what copies to print, or even allow them to print "special copies" in which the user inputs the copy name.
CREATE TEMP TABLE tDocCopies (CopyName TEXT(20))
INSERT INTO tDocCopies (CopyName) VALUES ('Customer Copy')
INSERT INTO tDocCopies (CopyName) VALUES ('Office Copy')
...
INSERT INTO tDocCopies (CopyName) VALUES ('File Copy')
SELECT * FROM InvoiceInfo, tDocCopies WHERE InvoiceDate = TODAY()
To create a calendar matrix, with one record per person per day, cartesian join the people table to another table containing all days in a week, month, or year.
SELECT People.PeopleID, People.Name, CalDates.CalDate
FROM People, CalDates
I've noticed this being done deliberately to slow down the system, either to perform a stress test or as an excuse for missing development deliverables.
Usually, to generate a superset for the reports.
In PostgreSQL:
SELECT d.id, month.month, COALESCE(SUM(s.sales), 0)
FROM generate_series(1, 12) AS month
CROSS JOIN department d
LEFT JOIN sales s
       ON s.department = d.id
      AND s.month = month.month
GROUP BY d.id, month.month
This is the only time in my life that I've found a legitimate use for a Cartesian product.
At the last company I worked at, there was a report that was requested on a quarterly basis to determine what FAQs were used at each geographic region for a national website we worked on.
Our database described geographic regions (markets) by a tuple (4, x), where 4 represented a level number in a hierarchy, and x represented a unique marketId.
Each FAQ is identified by an FaqId, and each association to an FAQ is defined by the composite key of the marketId tuple and the FaqId. The associations are set through an admin application, but given that there were 1000 FAQs in the system and 120 markets, it was a hassle to set initial associations whenever a new FAQ was created. So we created a default market selection and reserved the marketId tuple (-1, -1) to represent it.
Back to the report - the report needed to show every FAQ question/answer and the markets that displayed this FAQ in a 2D matrix (we used an Excel spreadsheet). I found that the easiest way to associate each FAQ to each market in the default market selection case was with this query, unioning the exploded result with all other direct FAQ-market associations.
The Faq2LevelDefault table holds all of the markets that are defined as being in the default selection (I believe it was just a list of marketIds).
SELECT fl.FaqId, fld.LevelId, 1 AS [Exists]
FROM Faq2Levels fl
CROSS JOIN Faq2LevelDefault fld
WHERE fl.LevelId = -1 AND fl.LevelNumber = -1 AND fld.LevelNumber = 4
UNION
SELECT FaqId, LevelId, 1 AS [Exists] FROM Faq2Levels WHERE LevelNumber = 4
You might want to create a report using all of the possible combinations from two lookup tables, in order to create a report with a value for every possible result.
Consider bug tracking: you've got one table for severity and another for priority and you want to show the counts for each combination. You might end up with something like this:
select sp.severity_name, sp.priority_name, count(e.severity_id)
from (select severity_id, severity_name,
             priority_id, priority_name
      from severity, priority) sp
left outer join errors e
       on e.severity_id = sp.severity_id
      and e.priority_id = sp.priority_id
group by sp.severity_name, sp.priority_name
In this case, the Cartesian join between severity and priority provides the master list of every combination that the subsequent outer join runs against (counting e.severity_id rather than * lets combinations with no matching errors show a count of zero).
When running a query for each date in a given range. For example, for a website, you might want to know for each day, how many users were active in the last N days. You could run a query for each day in a loop, but it's simplest to keep all the logic in the same query, and in some cases the DB can optimize the Cartesian join away.
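A minimal sketch of that pattern, assuming hypothetical tables dates(d) holding one row per day of the range and activity(user_id, activity_date), with N as a placeholder for the look-back window and PostgreSQL-style date arithmetic (d.d - N subtracts N days):

select d.d as report_day,
       count(distinct a.user_id) as active_users
from dates d
cross join activity a
where a.activity_date >  d.d - N   -- within the last N days...
  and a.activity_date <= d.d       -- ...up to and including the report day
group by d.d
order by d.d;

The cross join plus the WHERE filter is the form the optimizer can usually rewrite as an ordinary range join.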
To create a list of related words in text mining, using similarity functions, e.g. Edit Distance
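A minimal sketch of that idea, assuming PostgreSQL with the fuzzystrmatch extension (which provides levenshtein) and a hypothetical words(word) table:

-- CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
select w1.word, w2.word, levenshtein(w1.word, w2.word) as distance
from words w1
cross join words w2
where w1.word < w2.word                   -- each unordered pair only once
  and levenshtein(w1.word, w2.word) <= 2  -- keep only near matches
order by distance;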