MS Access Database tables comparison - sql

I am trying to compare three MS Access tables on a given field. For example, I have a Main table that holds the records for school children, with the fields Student ID and Name. Then there are three sub-tables, one per school, which have some data discrepancies. Let's call these schools A, B and C. These schools have somehow mixed up Student ID with Name, so I need a way to return any Student ID that has a mismatched Name. Student ID is the primary key in the Main table and in each of A, B and C. The problem is that when I build relationships in Access, it only returns IDs that are common to all three tables (an INNER JOIN). I need an efficient way to match schools A -> B and A -> C and concatenate the results. I think joining each of these in pairs might take far too long. Please let me know if you have any alternatives.

So, you have two problems:
1. You have bad data that needs to be fixed: Student_ID and Name are mixed up.
2. Your schema is not good.
Addressing the data issue:
If your student_ids are all numeric, you could try something like:
UPDATE subA SET student_id = [name], [name]=student_id WHERE isnumeric([name]);
And repeat for the other mixed up sub tables.
Addressing the schema issue:
You have three "Subtables" one for each school. These three tables should be a single table, and "School" should be a field in that table. So your data looks something like:
+--------+------------+---------+
| School | Student_Id | Name    |
+--------+------------+---------+
| A      | 1          | John    |
| A      | 2          | Jasmine |
| B      | 3          | Fred    |
| C      | 5          | Harold  |
| C      | 6          | Donna   |
+--------+------------+---------+
This way you only join in a single table, and your data only grows in rows as new schools are brought into your database.
Second, if I'm reading your question correctly, you have both student_id and name in the main table as well as the three sub-tables? It seems like you should only keep these in a single table, maybe named student.
Lastly, you can combine the three subtables into a single view that will make it 9000% (guesstimate) easier to join for future queries, using a UNION query:
SELECT 'A' as school, student_id, name FROM subA
UNION ALL
SELECT 'B', student_id, name FROM subB
UNION ALL
SELECT 'C', student_id, name FROM subC
This will stack all three tables on top of each other and give you a schema similar to the example above. You can join to your main table like:
SELECT *
FROM mainTable
INNER JOIN
(
SELECT 'A' as school, student_id, name FROM subA
UNION ALL
SELECT 'B', student_id, name FROM subB
UNION ALL
SELECT 'C', student_id, name FROM subC
) AS subs ON
mainTable.student_id = subs.student_id
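To get back to the original question (student IDs whose name in a school table differs from the main table), the same stacked join can be filtered on a name mismatch. This is only a sketch: it uses Python's sqlite3 module as a stand-in for Access, the sample rows are made up, and the table names mainTable, subA, subB and subC are assumed from the queries above. Access syntax may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mainTable (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subA (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subB (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subC (student_id INTEGER PRIMARY KEY, name TEXT);
    -- Made-up sample data: school A misspells student 2, school C renames student 3.
    INSERT INTO mainTable VALUES (1, 'John'), (2, 'Jasmine'), (3, 'Fred');
    INSERT INTO subA VALUES (1, 'John'), (2, 'Jasmin');
    INSERT INTO subB VALUES (3, 'Fred');
    INSERT INTO subC VALUES (2, 'Jasmine'), (3, 'Frederick');
""")

# Stack the three school tables, join once to the main table,
# and keep only rows where the names disagree.
MISMATCH_SQL = """
    SELECT subs.school, m.student_id, m.name, subs.name
    FROM mainTable AS m
    INNER JOIN (
        SELECT 'A' AS school, student_id, name FROM subA
        UNION ALL
        SELECT 'B', student_id, name FROM subB
        UNION ALL
        SELECT 'C', student_id, name FROM subC
    ) AS subs ON m.student_id = subs.student_id
    WHERE m.name <> subs.name
    ORDER BY subs.school, m.student_id
"""

mismatches = conn.execute(MISMATCH_SQL).fetchall()
for row in mismatches:
    print(row)
```

Because each school row is compared against the main table only once, there is no need to join the school tables to each other in pairs.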

Related

Language dependent column headers

I am working on an PostgreSQL based application and am very curious if there might be a clever solution to have language dependent column headers.
I sure know, that I can set an alias for a header with the "as" keyword, but that obviously has to be done for every select and over and over again.
So I have a table for converting the technical column name to a mnemonic one, to be shown to the user.
I can handle the mapping in the application, but would prefer a database solution. Is there any?
At least could I set the column header to table.column?
You could use a "view". You can think of a view as a pseudo-table: it is defined by a query over one or more tables. For instance, if I have a table that has the following shape
Table: Pets
Id | Name | OwnerId | AnimalType
1 | Frank| 1 | 1
2 | Jim | 1 | 2
3 | Bobo | 2 | 1
I could create a "view" that changes the Name field to look like PetName instead without changing the table
CREATE VIEW PetView AS
SELECT Id, Name as PetName, OwnerId, AnimalType
FROM Pets
Then I can use the view just like any other table
SELECT PetName
FROM PetView
WHERE AnimalType = 1
Further we could combine another table as well into the view. For instance if we add another table to our DB for Owners then we could create a view that automatically joins the two tables together before subjecting to other queries
Table: Owners
Id | Name
1 | Susan
2 | Ravi
CREATE VIEW PetsAndOwners AS
SELECT p.Id, p.Name as PetName, o.Name as OwnerName, p.AnimalType
FROM Pets p, Owners o
WHERE p.OwnerId = o.Id
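Now we can use the new view like any other table for querying. (In PostgreSQL, a view that joins several tables like this one is not automatically updatable, so inserts and deletes against it are not supported without INSTEAD OF triggers or rules.)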
SELECT * FROM PetsAndOwners
WHERE OwnerName = 'Susan'
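A quick way to see that the view's aliases become the column headers a client receives is to inspect the cursor metadata. The sketch below uses Python's sqlite3 module rather than PostgreSQL, so treat it as an illustration only. The second view, PetView_de, and its German header are hypothetical: they show how one view per language could carry translated headers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Pets (Id INTEGER PRIMARY KEY, Name TEXT, OwnerId INTEGER, AnimalType INTEGER);
    INSERT INTO Pets VALUES (1, 'Frank', 1, 1), (2, 'Jim', 1, 2), (3, 'Bobo', 2, 1);

    CREATE VIEW PetView AS
        SELECT Id, Name AS PetName, OwnerId, AnimalType FROM Pets;

    -- Hypothetical language-specific view: same data, translated header.
    CREATE VIEW PetView_de AS
        SELECT Id, Name AS Tiername, OwnerId, AnimalType FROM Pets;
""")

# cursor.description holds one entry per result column; item 0 is the header.
cur = conn.execute("SELECT * FROM PetView WHERE AnimalType = 1 ORDER BY Id")
headers = [col[0] for col in cur.description]
rows = cur.fetchall()

cur_de = conn.execute("SELECT * FROM PetView_de WHERE AnimalType = 1 ORDER BY Id")
headers_de = [col[0] for col in cur_de.description]

print(headers, rows)
print(headers_de)
```

The application then only has to pick the view that matches the user's language; the queries themselves need no AS clauses.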

Can I get duplicate results (from one table) in an INTERSECT operation between two tables?

I know the wording of the question is awkward, but I couldn't phrase it any better. Let me explain the situation.
There's table A which has a bunch of columns (a, b, c ... ) and I run a SELECT query on it like so:
SELECT a FROM A WHERE b IN ('....') (the ellipsis indicates a number of values to be matched to)
There's another table B which has a bunch of columns (d, e, f ... ) and I run a SELECT query on it like so:
SELECT d FROM B WHERE f = '...' (the ellipsis indicates a single value to be matched to)
Now I should say here that the two tables store different types of information about the same entity, but the columns a and d contain the exact same data (in this case, an ID). I want to find out the intersection of the two tables so I run this:
SELECT a FROM A WHERE b IN ('....') INTERSECT SELECT d FROM B WHERE f = '...'
Now here's the problem:
The first SELECT contains a set of values in the WHERE clause, right? So let's say the set is (1234, 2345,3456). Now, the result of this query when b is matched ONLY to 1234 is, let's say, abc. When it's matched to 2345, it's def, suppose. And matching to 3456, it gives abc.
Let's suppose these two results (abc and def) are also in the set of results from the second SELECT.
So, now, putting the entire set of values to be matched back into the WHERE clause, the INTERSECT operation will give me abc and def. But I want abc twice, since two values in the WHERE clause set match the second SELECT.
Is there any way I can get that?
I hope it's not too complicated to understand my problem. This is a real-life problem I'm facing in my job.
Data structure and my code
Table A contains general information about a company:
company_id | branch_id | no_of_employees | city
Table B contains the financials of the company:
company_id | branch_id | revenue | profits
First SELECT:
SELECT branch_id FROM A WHERE CITY IN ('Dallas', 'Miami', 'New Orleans')
Now, running each city separately in the first SELECT, I get the branch_ids:
branch_id | city
23 | Dallas
45 | Miami
45 | New Orleans
Once again, this seems impractical as to how two cities can have the same branch ids, but please bear with me on this.
Second SELECT:
SELECT branch_id FROM B
WHERE REVENUE = 5000000
I know this is a little impractical, but for the purpose of this example, it suffices.
Running this query I get the following set:
11
23
45
22
10
So the INTERSECT will give me just 23 and 45. But I want 45 twice, since both Miami and New Orleans have that branch_id and that branch_id has generated a revenue of 5 million.
Directly from Microsoft's documentation (https://msdn.microsoft.com/en-us/library/ms188055.aspx):
"INTERSECT returns distinct rows that are output by both the left and right input queries operator."
So NO, it is not possible to get the same value twice when using INTERSECT because the results will be DISTINCT. However if you build an INNER JOIN correctly you can do essentially the same thing as INTERSECT except keep the repetitive results by NOT using distinct or group by.
SELECT
A.a
FROM
A
INNER JOIN B
ON A.a = B.d
AND B.F = '....'
WHERE b IN ('....')
And for your specific Example that you edited:
SELECT
A.branch_id
FROM
A
INNER JOIN B
ON A.branch_id = B.branch_id
AND B.REVENUE = 5000000
WHERE A.CITY IN ('Dallas', 'Miami', 'New Orleans')
You overcomplicated your task a lot:
SELECT *
FROM A
WHERE CITY IN (...)
AND EXISTS
(
SELECT 1 FROM B
WHERE B.REVENUE = 5000000
AND B.branch_id = A.branch_id
)
INTERSECT and EXCEPT both return row sets with DISTINCT applied.
They are set operators and do not perform regular joining/filtering; use a join or a filter such as EXISTS when repeated rows must be kept.
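The difference between the three approaches is easy to check on the sample data from the question. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server. The company_id, no_of_employees and profits values are made up, and an extra Austin row is added to A so that the city filter does some work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (company_id INTEGER, branch_id INTEGER, no_of_employees INTEGER, city TEXT);
    CREATE TABLE B (company_id INTEGER, branch_id INTEGER, revenue INTEGER, profits INTEGER);
    INSERT INTO A VALUES
        (1, 23, 10, 'Dallas'), (1, 45, 20, 'Miami'),
        (1, 45, 30, 'New Orleans'), (1, 99, 5, 'Austin');
    INSERT INTO B VALUES
        (1, 11, 5000000, 0), (1, 23, 5000000, 0), (1, 45, 5000000, 0),
        (1, 22, 5000000, 0), (1, 10, 5000000, 0);
""")

CITIES = "('Dallas', 'Miami', 'New Orleans')"

# INTERSECT: duplicates are removed, so 45 appears once.
intersect_ids = sorted(r[0] for r in conn.execute(f"""
    SELECT branch_id FROM A WHERE city IN {CITIES}
    INTERSECT
    SELECT branch_id FROM B WHERE revenue = 5000000
"""))

# INNER JOIN: one output row per matching pair, so 45 appears twice.
join_ids = sorted(r[0] for r in conn.execute(f"""
    SELECT A.branch_id
    FROM A
    INNER JOIN B ON A.branch_id = B.branch_id AND B.revenue = 5000000
    WHERE A.city IN {CITIES}
"""))

# EXISTS: one output row per qualifying row of A, so 45 appears twice as well.
exists_ids = sorted(r[0] for r in conn.execute(f"""
    SELECT A.branch_id
    FROM A
    WHERE A.city IN {CITIES}
      AND EXISTS (SELECT 1 FROM B
                  WHERE B.revenue = 5000000 AND B.branch_id = A.branch_id)
"""))

print(intersect_ids, join_ids, exists_ids)
```

The join and EXISTS versions only agree as long as B has at most one matching row per branch_id. If B could contain several, the join would multiply the rows of A while EXISTS would not.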

How to concatenate field values with recursive query in postgresql?

I have a table in PostgreSQL database that contains parts of addresses in a form of a tree and looks like this:
Id | Name | ParentId
1 | London | 0
2 | Hallam Street| 1
3 | Bld 26 | 2
4 | Office 5 | 3
I would like to make a query to return an address, concatenated from all ancestor names. I need the result table to be like this:
Id | Address
1 | London
2 | London, Hallam Street
3 | London, Hallam Street, Bld 26
4 | London, Hallam Street, Bld 26, Office 5
I guess I have to use WITH RECURSIVE query, but all the examples I've found use the where clause, so I have to put WHERE name='Office 5' to get the result only for that particular row. But I need a concatenated address for each row of my initial table. How can this be done?
The trick with recursive queries is that you need to specify a seed query. This is the query that determines your root node, or the starting point to descend or ascend the tree that you are building.
The reason the WHERE clause is there is to establish the seed ID=1 or Name=Bld 26. If you want every record to have the tree ascended or descended (depending on what you specify in the unioned select), then you should just scrap the WHERE statement so all records are seeded.
Although, the example you give... you might want to start with WHERE ID=1 in the seed, write out the child ID and parent ID. Then in the Union'd SELECT join your derived Recursive table with your table from which you are selecting and join on the Derived Recursive table's Child to your table's parent.
Something like:
WITH RECURSIVE my_tree AS (
-- Seed: the root row
SELECT
Id AS Child,
ParentId AS Parent,
Name,
Name AS Address
FROM <table>
WHERE Id = 1
UNION
-- Recursive term: attach each child to its already-built parent
SELECT
tbl.Id AS Child,
tbl.ParentId AS Parent,
tbl.Name,
t.Address || ', ' || tbl.Name AS Address
FROM my_tree AS t
INNER JOIN <table> AS tbl ON
t.Child = tbl.ParentId
)
SELECT Child, Address FROM my_tree;
I've not used PostgreSQL before, so you might have to fuss a bit with the syntax, but I think this is pretty accurate for that RDBMS.
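For what it's worth, here is a runnable sketch of the same idea on the sample rows from the question. It uses Python's sqlite3 module, which also supports WITH RECURSIVE, rather than PostgreSQL. The table name address_parts is an assumption, and the seed takes every row with ParentId = 0 as a root instead of hard-coding Id = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address_parts (Id INTEGER PRIMARY KEY, Name TEXT, ParentId INTEGER);
    INSERT INTO address_parts VALUES
        (1, 'London', 0), (2, 'Hallam Street', 1), (3, 'Bld 26', 2), (4, 'Office 5', 3);
""")

ADDRESS_SQL = """
    WITH RECURSIVE my_tree (Id, Address) AS (
        -- Seed: all root rows (ParentId = 0 marks a root in the sample data)
        SELECT Id, Name FROM address_parts WHERE ParentId = 0
        UNION ALL
        -- Recursive term: append each child's name to its parent's address
        SELECT c.Id, t.Address || ', ' || c.Name
        FROM my_tree AS t
        INNER JOIN address_parts AS c ON c.ParentId = t.Id
    )
    SELECT Id, Address FROM my_tree ORDER BY Id
"""

addresses = conn.execute(ADDRESS_SQL).fetchall()
for row in addresses:
    print(row)
```

Every row of the table ends up in the result exactly once because each row is reachable from a root by following ParentId.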

Multiple records in a table matched with a column

The architecture of my DB involves records in a Tags table. Each record in the Tags table has a string, which is a Name, and a foreign key to the primary ID of a record in another table, Worker.
Records in the Worker table have tags. Every time we create a Tag for a worker, we add a new row in the Tags table with the inputted Name and foreign key to the worker's PrimaryID. Therefore, we can have multiple Tags with different names per same worker.
Worker Table
ID | Worker Name | Other Information
__________________________________________________________________
1 | Worker1 | ..........................
2 | Worker2 | ..........................
3 | Worker3 | ..........................
4 | Worker4 | ..........................
Tags Table
ID |Foreign Key(WorkerID) | Name
__________________________________________________________________
1 | 1 | foo
2 | 1 | bar
3 | 2 | foo
5 | 3 | foo
6 | 3 | bar
7 | 3 | baz
8 | 1 | qux
My goal is to filter WorkerID's based on an inputted table of strings. I want to get the set of WorkerID's that have the same tags as the inputted ones. For example, if the inputted strings are foo and bar, I would like to return WorkerID's 1 and 3. Any idea how to do this? I was thinking something to do with GROUP BY or JOINING tables. I am new to SQL and can't seem to figure it out.
This is a variant of relational division. Here's one attempt:
select workerid
from tags
where name in ('foo', 'bar')
group by workerid
having count(distinct name) = 2
You can use the following:
select WorkerID
from tags where name in ('foo', 'bar')
group by WorkerID
having count(*) = 2
This will retrieve your desired result, provided a worker never has the same tag twice (count(*) would count the duplicate).
This article is an excellent resource on the subject.
While the answer from @Lennart works fine in Query Analyzer, you're not going to be able to duplicate that in a stored procedure or from a consuming application without building the IN list dynamically, which opens you up to SQL injection attacks. To extend the solution, you'll want to look into passing your list of tags as a table-valued parameter, since SQL Server doesn't support array parameters.
Essentially, you create a custom type in the database that mimics a table with only one column:
CREATE TYPE list_of_tags AS TABLE (t varchar(50) NOT NULL PRIMARY KEY)
Then you populate an instance of that type in memory:
DECLARE @mylist list_of_tags
INSERT INTO @mylist (t) VALUES ('foo'), ('bar')
Then you can select against that as a join using the GROUP BY/HAVING described in the previous answers:
select workerid
from tags inner join @mylist on name = t
group by workerid
having count(distinct name) = 2
*Note: I'm not at a computer where I can test the query. If someone sees a flaw in my query, please let me know and I'll happily correct it and thank them.
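From a consuming application, the same GROUP BY/HAVING query can be issued with bound parameters instead of string concatenation, which avoids the injection problem. The sketch below uses Python's sqlite3 module, not SQL Server. The helper workers_with_all_tags and the column name WorkerID are assumptions for illustration, and the rows are the sample Tags data from the question.

```python
import sqlite3

def workers_with_all_tags(conn, tags):
    """Return the WorkerIDs that carry every tag in `tags` (hypothetical helper)."""
    wanted = sorted(set(tags))          # de-duplicate so the count comparison is exact
    if not wanted:
        return []
    # Only the "?" placeholders are interpolated; the tag values are bound by the driver.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT WorkerID
        FROM Tags
        WHERE Name IN ({placeholders})
        GROUP BY WorkerID
        HAVING COUNT(DISTINCT Name) = ?
        ORDER BY WorkerID
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tags (ID INTEGER PRIMARY KEY, WorkerID INTEGER, Name TEXT);
    INSERT INTO Tags VALUES
        (1, 1, 'foo'), (2, 1, 'bar'), (3, 2, 'foo'), (5, 3, 'foo'),
        (6, 3, 'bar'), (7, 3, 'baz'), (8, 1, 'qux');
""")

print(workers_with_all_tags(conn, ["foo", "bar"]))
```

Comparing against the number of distinct requested tags, rather than a hard-coded 2, lets the same query serve any number of input tags.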

Recursively duplicating entries

I am attempting to duplicate an entry. That part isn't hard. The tricky part is: there are n entries connected with a foreign key. And for each of those entries, there are n entries connected to that. I did it manually using a lookup to duplicate and cross reference the foreign keys.
Is there some subroutine or method to duplicate an entry and search for and duplicate foreign entries? Perhaps there is a name for this type of replication I haven't stumbled on yet, is there a specific database related title for this type of operation?
PostgreSQL 8.4.13
main entry (uid is serial)
uid | title
-----+-------
1 | stuff
department (departmentid is serial, uidref is foreign key for uid above)
departmentid | uidref | title
--------------+--------+-------
100 | 1 | Foo
101 | 1 | Bar
sub_category of department (textid is serial, departmentref is foreign for departmentid above)
textid | departmentref | title
-------+---------------+----------------
1000 | 100 | Text for Foo 1
1001 | 100 | Text for Foo 2
1002 | 101 | Text for Bar 1
You can do it all in a single statement using data-modifying CTEs (requires Postgres 9.1 or later).
Your primary keys being serial columns makes it easier:
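WITH m AS (
INSERT INTO main (<all columns except pk>)
SELECT <all columns except pk>
FROM main
WHERE uid = 1
RETURNING uid AS new_uid -- the new uid
)
, dmap AS ( -- pairs each old departmentid with a freshly drawn one
SELECT departmentid AS old_id
, nextval(pg_get_serial_sequence('department', 'departmentid')) AS new_id
FROM department
WHERE uidref = 1
)
, d AS (
INSERT INTO department (departmentid, uidref, <all other columns>)
SELECT dm.new_id, m.new_uid, <all other columns>
FROM dmap dm
JOIN department dep ON dep.departmentid = dm.old_id
CROSS JOIN m
)
INSERT INTO sub_category (departmentref, <all other columns>)
SELECT dm.new_id, <all other columns>
FROM dmap dm
JOIN sub_category s ON s.departmentref = dm.old_id;
Replace the <...> placeholders with your actual columns. pk is for primary key, like main.uid. RETURNING only hands back the new keys, which no child row references yet, so dmap records which new departmentid belongs to which old one; the copied sub_category rows are then pointed at the copied departments.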
As written, the statement returns nothing. Add a RETURNING clause to the final INSERT if you want it to return something, for example the new ids.
You wouldn't call that "replication". That term is usually applied to keeping multiple database instances or objects in sync. You are just duplicating an entry, along with its dependent objects, recursively.
Aside about naming conventions:
It would get even simpler with a naming convention that labels all columns signifying "ID of table foo" with the same (descriptive) name, like foo_id. There are other naming conventions floating around, but this is the best for writing queries, IMO.
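If a single statement is not an option (the question mentions PostgreSQL 8.4, which predates data-modifying CTEs), the copy can also be done step by step from the application. The essential bookkeeping is the mapping from each old key to its copy. The sketch below uses Python's sqlite3 module rather than PostgreSQL. The function duplicate_entry is a hypothetical helper, and each table is assumed to have only the title column besides its keys, as in the question.

```python
import sqlite3

def duplicate_entry(conn, uid):
    """Copy one main row with its departments and their sub_category rows.

    Returns the uid of the copy. Hypothetical helper for illustration.
    """
    cur = conn.cursor()
    row = cur.execute("SELECT title FROM main WHERE uid = ?", (uid,)).fetchone()
    if row is None:
        raise ValueError(f"no main entry with uid {uid}")
    cur.execute("INSERT INTO main (title) VALUES (?)", row)
    new_uid = cur.lastrowid

    # Read the old departments up front, then copy each one and its children.
    departments = cur.execute(
        "SELECT departmentid, title FROM department WHERE uidref = ? ORDER BY departmentid",
        (uid,),
    ).fetchall()
    for old_dept_id, title in departments:
        cur.execute("INSERT INTO department (uidref, title) VALUES (?, ?)", (new_uid, title))
        new_dept_id = cur.lastrowid          # old_dept_id -> new_dept_id
        cur.execute(
            "INSERT INTO sub_category (departmentref, title) "
            "SELECT ?, title FROM sub_category WHERE departmentref = ?",
            (new_dept_id, old_dept_id),
        )
    conn.commit()
    return new_uid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main (uid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE department (departmentid INTEGER PRIMARY KEY,
                             uidref INTEGER REFERENCES main(uid), title TEXT);
    CREATE TABLE sub_category (textid INTEGER PRIMARY KEY,
                               departmentref INTEGER REFERENCES department(departmentid),
                               title TEXT);
    INSERT INTO main VALUES (1, 'stuff');
    INSERT INTO department VALUES (100, 1, 'Foo'), (101, 1, 'Bar');
    INSERT INTO sub_category VALUES
        (1000, 100, 'Text for Foo 1'), (1001, 100, 'Text for Foo 2'), (1002, 101, 'Text for Bar 1');
""")

new_uid = duplicate_entry(conn, 1)
copied = conn.execute("""
    SELECT d.title, s.title
    FROM department AS d
    JOIN sub_category AS s ON s.departmentref = d.departmentid
    WHERE d.uidref = ?
    ORDER BY s.textid
""", (new_uid,)).fetchall()
print(new_uid, copied)
```

In PostgreSQL the same loop would use INSERT ... RETURNING instead of lastrowid to obtain each new key.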