Assign unique ID to duplicates in Access - sql

I had a very big excel spreadsheet that I moved into Access to try to deal with it easier. I'm very much a novice. I'm trying to use SQL via Access.
I need to assign a unique identifier to duplicates. I've seen people use DENSE_RANK in SQL but I can't get it to work in Access.
Here's what I'm trying to do: I have a large amount of patient and sample data (20k rows). My columns are called FULL_NAME, SAMPLE_NUM, and DATE_REC. Some patients have come in more than once and have multiple samples. I want to give each patient a unique ID that I want to call PATIENT_ID.
I can't figure out how to do this, aside from typing it out on each row. I would greatly appreciate help as I really don't know what I'm doing and there is no one at my work who can help.

To illustrate the previous answers' textual explanation, consider the following SQL action queries, which can be run one by one in an Access query window or as VBA string queries with DAO's CurrentDb.Execute or DoCmd.RunSQL. The ALTER statements can be run in MSAccess.exe.
Create a Patients table (make-table query)
SELECT DISTINCT s.FULL_NAME INTO myPatientsTable
FROM mySamplesTable s
WHERE s.FULL_NAME IS NOT NULL;
Add an autonumber field to new Patients table as a Primary Key
ALTER TABLE myPatientsTable ADD COLUMN PATIENT_ID AUTOINCREMENT NOT NULL PRIMARY KEY;
Add a blank Patient_ID column to Samples table
ALTER TABLE mySamplesTable ADD COLUMN PATIENT_ID INTEGER;
Update Patient_ID Column in Samples table using FULL_NAME field
UPDATE mySamplesTable s
INNER JOIN myPatientsTable p
ON s.[FULL_NAME] = p.[FULL_NAME]
SET s.PATIENT_ID = p.PATIENT_ID;
Maintain the third normal form principles of relational databases by removing the FULL_NAME field from the Samples table
ALTER TABLE mySamplesTable DROP COLUMN FULL_NAME;
Then in a separate query, add a foreign key constraint on PATIENT_ID
ALTER TABLE mySamplesTable
ADD CONSTRAINT PatientRelationship
FOREIGN KEY (PATIENT_ID)
REFERENCES myPatientsTable (PATIENT_ID);

Sounds like FULL_NAME is currently the unique identifier. However, names make very poor unique identifiers and name parts should be in separate fields. Are you sure you don't have multiple patients with same name, e.g. John Smith?
You need a PatientInfo table and then the SampleData table. Do a query that pulls DISTINCT patient info (apparently this is only one field - FULL_NAME) and create a table that generates unique ID with autonumber field. Then build a query that joins tables on the two FULL_Name fields and updates a new field in SampleData called PatientID. Delete the FULL_Name field from SampleData.

The command to number rows in your table is [1]
ALTER TABLE MyTable ADD COLUMN ID AUTOINCREMENT;
Anyway, as June7 pointed out, it might not be a good idea to combine records based on patient name alone, as different patients may share a name. A better way is to treat each record as a unique patient for now and have a way to fix the patient ID when a patient comes back. I would suggest going this way:
create two new columns in your samples table
ID with autoincrement as per query above
patientID where you will copy values from ID column - for now they will be same. But in future they will diverge
copy columns patientID and patientName into separate table patients
now you can delete patientName column from samples table
add a column imported to the patients table to indicate that there might be other records that belong to this patient.
when a patient comes back, open their record, update all the other info (address, phone, ...) and look for all sample records that may belong to them. If there are any, fix the patient ID in those records.
now you can switch off the imported indicator, because this patient's data is up to date.
after fixing patientID in the samples records, you will end up with patients that have no records in the samples table, so you can delete them.
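The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of Access, the sample rows are made up, and merge_patient is a hypothetical helper name. Dropping the patientName column from samples is left out to keep the sketch short.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Access (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE samples (
        ID          INTEGER PRIMARY KEY AUTOINCREMENT,
        patientName TEXT,
        SAMPLE_NUM  TEXT,
        patientID   INTEGER
    );
    -- Made-up rows for illustration only.
    INSERT INTO samples (patientName, SAMPLE_NUM) VALUES
        ('John Smith', 'S1'), ('Mary Jones', 'S2'), ('John Smith', 'S3');

    -- Every sample starts out as its own patient.
    UPDATE samples SET patientID = ID;

    -- Copy the ID/name pairs into a patients table, flagged as imported.
    CREATE TABLE patients (
        patientID   INTEGER PRIMARY KEY,
        patientName TEXT,
        imported    INTEGER NOT NULL DEFAULT 1
    );
    INSERT INTO patients (patientID, patientName)
        SELECT patientID, patientName FROM samples;
""")


def merge_patient(keep_id, duplicate_ids):
    """Hypothetical helper: a person has confirmed that duplicate_ids are
    really the same patient as keep_id."""
    for dup in duplicate_ids:
        # Re-point the samples of the duplicate at the patient we keep.
        conn.execute(
            "UPDATE samples SET patientID = ? WHERE patientID = ?",
            (keep_id, dup),
        )
    # Patients left without any sample can be deleted.
    conn.execute(
        "DELETE FROM patients "
        "WHERE patientID NOT IN (SELECT patientID FROM samples)"
    )
    # This patient's data is now up to date.
    conn.execute(
        "UPDATE patients SET imported = 0 WHERE patientID = ?", (keep_id,)
    )


# Someone confirmed that samples S1 and S3 belong to the same John Smith.
merge_patient(1, [3])
sample_owner = conn.execute(
    "SELECT SAMPLE_NUM, patientID FROM samples ORDER BY ID"
).fetchall()
remaining = conn.execute(
    "SELECT patientID, imported FROM patients ORDER BY patientID"
).fetchall()
```

The merge is deliberately a manual, per-patient decision, which is the point of this approach: two rows with the same name are only combined once someone has confirmed they are the same person.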

Unless you already have a natural key you will be corrupting this data when you run the distinct query and build a key from it. From your posting I would guess a natural key would be SAMPLE_NUM. Another problem is that if you roll up by last name you will almost certainly be combining different patients into one.

Related

Set column value to foreign key based on another column

I am using SQL Server and have imported data from an Excel file into my tables.
My tables consist of:
BH_Overview (foreign key table)
BH_Equipment (primary key table)
I have different types of equipment and looking to split it out into its own table called BH_Equipment and link it into the main table BH_Overview.
I have my tables created and constraints made; however, when the data was imported I just stored the equipment name in the BH_Overview table in a column "Equipment" that isn't linked with BH_Equipment.
I'm wondering how I go about updating the equipmentId column based on what is in the equipment column in the BH_Overview table to match the Id in the BH_Equipment table.
You can see I have the foreign keys done for Factory Area and Responsibility. That was done manually with update statements, as there were only a few foreign keys to link, but with equipment there are 291 types in the BH_Equipment table.
I have tried an update with inner joins, but can't get my head around it. Apologies if I have gone about this in an awful way; I'm relatively new to SQL, so please show me if there is a much easier way, or if this has been asked before please link it and I'll give it a look.
UPDATE:
@Charlieface - error message appearing
The other answer is good, but for SQL Server you can update much more easily directly through a join:
update o
set equipmentId = e.id
from BH_OverView o
join BH_Equipment e on e.equipment = o.Equipment;
This rough syntax also works on Postgres, MySQL/MariaDB and the later versions of SQLite.
First, make sure that every value in the column Equipment of the table BH_OverView matches a value in the column equipment of the table BH_Equipment.
Then the following SQL statement populates the corresponding equipmentId in the table BH_OverView:
update BH_OverView
set equipmentId = (select id from BH_Equipment
where BH_Equipment.equipment=BH_OverView.Equipment)
After verifying the content of equipmentId in the table BH_OverView, you may drop the column Equipment from the table BH_OverView with
alter table BH_OverView drop column Equipment
I am using standard SQL, which should work on the majority of databases.
Based on your comment, you got the error message
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
This means that in your table BH_Equipment more than one row has the same equipment name.
To list these equipment names and the number of times each one occurs, use the following SQL statement
select equipment, count(id)
from BH_Equipment
group by equipment
having count(id)>1
Delete the repeated rows so that each equipment name is left only once, and the error message will no longer appear.

How can I add rows to an SQL table without knowing how many column it has

I'm building an attendance system that tracks attendance in an SQL table.
I do this by adding a new column of a date to the table every time it's used.
Now I have a problem adding new rows for new users, because I don't know how many columns I have, so I can't use INSERT INTO Table VALUES().
Is there any alternative way to do it?
Edit:
here's how it's supposed to look
and every day it's supposed to add a column for the date.
I don't really understand how can I do it with adding dates as rows
can someone elaborate?
I think you should probably tweak the design of your tables. Using the example of a school:
Student Table:
ID (Primary Key)
Name
(More student specific columns here)
Attendance Table:
ID (Primary Key)
StudentID (Foreign Key)
Date
Attended
TimeArrived
(More Attendance specific columns here)
In the above example, each day would get a new row added to the attendance table and you could avoid dynamically adding columns.
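As a minimal sketch of that layout, assuming SQLite through Python's built-in sqlite3 module (the asker's actual database is not stated), the lower-case table and column names, the sample data and the helper functions add_student and record_attendance below are all made up for illustration:

```python
import sqlite3

# In-memory SQLite database; the real DBMS is an assumption here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE attendance (
        id           INTEGER PRIMARY KEY,
        student_id   INTEGER NOT NULL REFERENCES student (id),
        date         TEXT NOT NULL,      -- ISO date, e.g. '2024-01-08'
        attended     INTEGER NOT NULL,   -- 1 = present, 0 = absent
        time_arrived TEXT,
        UNIQUE (student_id, date)        -- one row per student per day
    );
""")


def add_student(name):
    """Adding a new user never depends on how many days were recorded."""
    cur = conn.execute("INSERT INTO student (name) VALUES (?)", (name,))
    return cur.lastrowid


def record_attendance(student_id, day, attended, time_arrived=None):
    """Each day is a new row in attendance, not a new column."""
    conn.execute(
        "INSERT INTO attendance (student_id, date, attended, time_arrived) "
        "VALUES (?, ?, ?, ?)",
        (student_id, day, int(attended), time_arrived),
    )


alice = add_student("Alice")
record_attendance(alice, "2024-01-08", True, "08:55")
record_attendance(alice, "2024-01-09", False)
bob = add_student("Bob")  # joins later; no columns to count
record_attendance(bob, "2024-01-09", True, "09:02")

# Number of students present on each day.
per_day = conn.execute(
    "SELECT date, COUNT(*) FROM attendance "
    "WHERE attended = 1 GROUP BY date ORDER BY date"
).fetchall()
```

If a grid with one column per date is needed for display, it can be produced from these rows by the reporting query or the application rather than stored that way.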

SQL inserting rows from multiple tables

I have got an assignment. We have been given a table, MAIN_TABLE, which has a column patient_id as foreign key.
I need to make a separate table named patient which has patient_id as a primary key along with some other attributes such as name and address.
I did successfully create schema of this table. Now there is a serious problem I am facing. After creating this table I used insert statement to insert values for name and address from a dummy table.
Up to this point everything works fine. However, the column patient_id is still empty; or rather, I have set it to 0 by default.
Now the problem is that I need to get values into this column, patient_id, from the patient_id column of MAIN TABLE.
I can't figure out how to do this. I did try to use:
UPDATE patient
SET patient_id=(select id from MAIN_TABLE)
But this gives me an error that multiple rows were returned, which does make sense, but what condition do I put in the WHERE clause then?
That sounds strange. How can there be a table MAIN_TABLE with a foreign key patient_id when the master table patient does not exist? Where do the patient_ids in MAIN_TABLE come from?
I suggest not inserting your data from the dummy table alone and then trying to update it. Instead, insert it from both the MAIN_TABLE and the dummy table, joined. If you cannot join them for the insert, you will not be able to join them for the update either.
Since I think they have no connected primary/foreign keys, the only way to join them is with a good business key. Do you have a good business key?
You are talking about persons, so first name, last name, birthday and address are often good enough. But you have to think about it.
With your given data I can only give you some kind of meta insert statement. But you will get the point.
Example:
insert into patient (col1, col2, col3)
select
a.colA,
a.colF,
b.colX
from
dummy_table a
inner join MAIN_TABLE b on a.colN=b.colA and a.colM=b.colB
Also, if patient_id is your primary key in patient, you should make sure that duplicate or null values in this column are not even possible, and you should use constraints to ensure your data integrity.
http://docs.oracle.com/cd/B19306_01/server.102/b14200/clauses002.htm
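To illustrate the point about constraints, here is a small sketch. It assumes SQLite through Python's sqlite3 module rather than the Oracle database the link refers to; the name and address columns follow the question, while the sample values and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient (
        -- Declared INT (not INTEGER) so that SQLite treats it as an
        -- ordinary column instead of an auto-assigned rowid alias.
        patient_id INT NOT NULL PRIMARY KEY,
        name       TEXT NOT NULL,
        address    TEXT
    )
""")
conn.execute("INSERT INTO patient VALUES (1, 'Ann Lee', '1 High St')")


def try_insert(patient_id, name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO patient (patient_id, name) VALUES (?, ?)",
            (patient_id, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(1, "Someone Else")  # same key as Ann Lee
null_ok = try_insert(None, "No Id")           # missing key
new_ok = try_insert(2, "Raj Patel")           # fresh key
```

With the constraints in place the database itself refuses the first two inserts, so a default value such as 0 cannot silently end up as the key of several patients.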

Query to retrieve data using two foreign keys

I'm working on a football statistics database, and in the table to store results of matches, I have two references to the primary key of a team table: one home, one away.
My intention is to create a query which returns the name of both of the teams, along with other details, but I can't think of a way to achieve this WITH the team names (my attempts so far can only produce one team name, with the other an ID number). I'll give the relation structure if this wasn't clear:
(PKs in bold, FKs asterisk)
team(team_id, team_name, venue)
match(match_id, home_team*, away_team*, home_score, away_score, date)
My desired output would be a table with these columns:
home_team_name, home_team_score, away_team_score, away_team_name, date, venue
Is this possible with my tables, or should I change the way I store results?
When joining the team table to the match table in a query, you'll need to join the match table to the team table twice. You need to use a different alias for the team table each time.
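A minimal sketch of the double join, assuming SQLite through Python's sqlite3 module (the asker's DBMS is not stated). The team names, venues and the single match row are made up, and taking the venue from the home team is an assumption about what the asker wants.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE team (
        team_id   INTEGER PRIMARY KEY,
        team_name TEXT NOT NULL,
        venue     TEXT
    );
    -- "match" is quoted because MATCH is a keyword in SQLite.
    CREATE TABLE "match" (
        match_id   INTEGER PRIMARY KEY,
        home_team  INTEGER NOT NULL REFERENCES team (team_id),
        away_team  INTEGER NOT NULL REFERENCES team (team_id),
        home_score INTEGER,
        away_score INTEGER,
        "date"     TEXT
    );
    -- Made-up data for illustration only.
    INSERT INTO team VALUES
        (1, 'Rovers', 'North Park'),
        (2, 'United', 'South Ground');
    INSERT INTO "match" VALUES (1, 1, 2, 3, 1, '2012-03-10');
""")

# The team table appears twice, once per foreign key, under aliases h and a.
query = """
    SELECT h.team_name  AS home_team_name,
           m.home_score AS home_team_score,
           m.away_score AS away_team_score,
           a.team_name  AS away_team_name,
           m."date",
           h.venue      -- assumes the match is played at the home venue
    FROM "match" AS m
    JOIN team AS h ON h.team_id = m.home_team
    JOIN team AS a ON a.team_id = m.away_team
"""
rows = conn.execute(query).fetchall()
```

So the existing table design can stay as it is; only the query needs the second join.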

Mysql how to avoid repeating myself

I have a table students with the following fields.
Id,FirstName,SecondName,Photo,Student_ID
I also have a table called class_of_2011 with the fields
Id,FirstName,SecondName,Photo,Student_ID,Subjects
I want to select some specific students from table students and have them in table class_of_2011, but I already have the names in table students. I am thinking the only way to do this is to copy the names I want to the table class_of_2011, but since there will be a class of 2012 and beyond, I feel like I will simply be copying data from one table to the other.
Is repeating myself inevitable in my case?
It looks like this could be normalized easily. Why not have your class_of_ tables simply have a foreign key to the student table's id column?
StudentId,Subjects
In this way, one student record could be associated with several classes, in case someone is on the 5-year plan.
I'm assuming that the Student_ID field in the Students table is their id number or something, and not the primary key of that table.
Students Table
Id,FirstName,SecondName,Photo,Student_ID
Subjects Table
Id,Subject
Student_Subjects Table
Id,Student_Id,Subject_Id,Year
You may then assign a student multiple subjects, for multiple years.
The class_of_2011 table should contain the primary key of the students table and none of the other "repeated" data. All of the other columns you're interested in can then be obtained by joining the two tables together in a query.
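A rough sketch of that idea, assuming SQLite through Python's sqlite3 module in place of MySQL; the column name StudentId follows the earlier answer and the sample rows are made up:

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (
        Id         INTEGER PRIMARY KEY,
        FirstName  TEXT,
        SecondName TEXT,
        Photo      TEXT,
        Student_ID TEXT
    );
    -- Only the key of the student plus the class-specific data is stored.
    CREATE TABLE class_of_2011 (
        StudentId INTEGER NOT NULL REFERENCES students (Id),
        Subjects  TEXT
    );
    -- Made-up rows for illustration only.
    INSERT INTO students VALUES
        (1, 'Ann', 'Lee',   'ann.png', 'S-001'),
        (2, 'Raj', 'Patel', 'raj.png', 'S-002');
    INSERT INTO class_of_2011 VALUES (2, 'Maths, Physics');
""")

# Names and other student details come from the join, not from copies.
class_rows = conn.execute("""
    SELECT s.FirstName, s.SecondName, s.Student_ID, c.Subjects
    FROM class_of_2011 AS c
    JOIN students AS s ON s.Id = c.StudentId
""").fetchall()
```

Changing a student's name in students is then reflected in every class query without touching the class tables.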
I would restructure the data if possible... Something like....
Student Table
ID, Name, Address, other common fields specific to the student
GraduatingClass Table
YearGraduate, StudentID
Enrollment Table
StudentID, ClassID, SemesterID