How to normalize data in a SQL query - sql

I have a table that has clinic names and doctor names. One clinic can have many doctors. I need to split this data into two tables: one with clinic info and the other with doctor info.
I am trying to do this in a SQL query.
Table CLINIC_DOC:
ID ClinicName Doctor
------------------------
1 xyz Dr Joe
2 xyz Dr Bob
3 abc Dr Mary
4 abc Dr John
I want to split the data into the following tables like this:
Table ClinicsData:
ClinicID ClinicName
----------------------
1 xyz
2 abc
Table DoctorData:
DocId ClinicID Doctor
--------------------------
1 1 Dr Joe
2 1 Dr Bob
3 2 Dr Mary
4 2 Dr John

Assuming that the ID columns (ClinicID and DocID) are automatically generated and that the clinic names are unique (i.e., there are no two clinics with the same name in the portion of the real world your data represents), you can try:
INSERT INTO clinicsdata
(clinicname)
SELECT DISTINCT
cd.clinicname
FROM clinic_doc cd;
INSERT INTO doctordata
(clinicid,
doctor)
SELECT c.clinicid,
cd.doctor
FROM clinic_doc cd
INNER JOIN clinicsdata c
ON c.clinicname = cd.clinicname;
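The two statements above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for the real database, so the column types and the INTEGER PRIMARY KEY auto-numbering are assumptions of this example, not part of the original tables.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clinic_doc (id INTEGER PRIMARY KEY, clinicname TEXT, doctor TEXT);
INSERT INTO clinic_doc (id, clinicname, doctor) VALUES
    (1, 'xyz', 'Dr Joe'), (2, 'xyz', 'Dr Bob'),
    (3, 'abc', 'Dr Mary'), (4, 'abc', 'Dr John');

-- INTEGER PRIMARY KEY columns are auto-generated in SQLite.
CREATE TABLE clinicsdata (clinicid INTEGER PRIMARY KEY, clinicname TEXT UNIQUE);
CREATE TABLE doctordata (docid INTEGER PRIMARY KEY, clinicid INTEGER, doctor TEXT);

-- Step 1: one row per distinct clinic name.
INSERT INTO clinicsdata (clinicname)
SELECT DISTINCT cd.clinicname FROM clinic_doc cd;

-- Step 2: look up each doctor's generated clinic id by clinic name.
INSERT INTO doctordata (clinicid, doctor)
SELECT c.clinicid, cd.doctor
FROM clinic_doc cd
INNER JOIN clinicsdata c ON c.clinicname = cd.clinicname;
""")

clinic_names = sorted(r[0] for r in conn.execute("SELECT clinicname FROM clinicsdata"))
# Joining the two new tables back together should reproduce the original pairs.
doctor_pairs = sorted(conn.execute("""
    SELECT c.clinicname, d.doctor
    FROM doctordata d
    JOIN clinicsdata c ON c.clinicid = d.clinicid
"""))
print(clinic_names)
print(doctor_pairs)
```

Note that the generated ClinicID values depend on the order in which the DISTINCT rows are produced, so they may not match the numbering shown in the question.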

First, you'll probably want to create the tables you're going to populate. Here's my best guess at data types:
CREATE TABLE ClinicsData
(
ClinicID INT IDENTITY(1,1),
ClinicName varchar(100)
)
CREATE TABLE DoctorData
(
DocID INT IDENTITY(1,1),
ClinicID INT,
Doctor VARCHAR(100)
)
Notice that I've made ClinicsData.ClinicID an IDENTITY column. This will help us to populate DoctorData later.
Next, let's populate ClinicsData with all the distinct clinic names.
INSERT INTO ClinicsData
(
ClinicName
)
SELECT DISTINCT
ClinicName
FROM CLINIC_DOC;
Now, we can utilize ClinicsData to populate DoctorData, using an INNER JOIN.
INSERT INTO DoctorData
(
ClinicID
,Doctor
)
SELECT DISTINCT
cd.ClinicID,
c_d.Doctor
FROM CLINIC_DOC c_d
INNER JOIN ClinicsData cd ON cd.ClinicName = c_d.ClinicName
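One effect of the SELECT DISTINCT in the last statement is worth seeing on data: if the source table lists the same doctor twice for one clinic, DISTINCT collapses those rows. The sketch below uses Python's sqlite3 module and hypothetical duplicate rows (not from the question) to compare the row counts with and without DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clinic_doc (id INTEGER PRIMARY KEY, clinicname TEXT, doctor TEXT);
-- Hypothetical data: 'Dr Joe' appears twice for the same clinic.
INSERT INTO clinic_doc (clinicname, doctor) VALUES
    ('xyz', 'Dr Joe'), ('xyz', 'Dr Joe'), ('xyz', 'Dr Bob');
CREATE TABLE clinicsdata (clinicid INTEGER PRIMARY KEY, clinicname TEXT);
INSERT INTO clinicsdata (clinicname)
SELECT DISTINCT clinicname FROM clinic_doc;
""")

join_sql = """
    FROM clinic_doc c_d
    INNER JOIN clinicsdata cd ON cd.clinicname = c_d.clinicname
"""
# Rows the doctor insert would produce without DISTINCT ...
plain_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT cd.clinicid, c_d.doctor" + join_sql + ")"
).fetchone()[0]
# ... and with DISTINCT, as in the answer above.
distinct_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT cd.clinicid, c_d.doctor" + join_sql + ")"
).fetchone()[0]
print(plain_count, distinct_count)
```

Whether collapsing is desirable depends on the data: two different doctors with the same name at one clinic would also be merged.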

Related

How to copy information in SQL from one table to another

I need to build a query to copy information in a column from one table to a column in another table.
This is how the tables looks like:
People:
PersonId  Name   StatusId
-------------------------
1         John
2         Jenny
3         Steve
Assignments:
AssignmentId  Country  PersonId
-------------------------------
1             UK       1
2             USA      3
Status:
StateId  Name
-------------
1        Busy
2        Free
There is a relationship between the People and Assignments tables: PersonId on the Assignments table is a FK. The People table has a relationship with the Status table through the FK StatusId. What I need to do is populate the StatusId on the People table with the StatusId from the Status table if the person in the People table exists in the Assignments table.
In the sample above, both John and Steve are in the Assignments table, so their StatusId on the People table should be set to 1.
I was trying to do it with this:
update People
set StatusId = 1
where PersonId IN (
select PersonId
from Assignments
where Assignments.PersonId = People.PersonId
)
but as you can see, I am hardcoding the StatusId, which will not work. Is there some way to get the StatusId based on the result of the select? Or is there another way to get the StatusId?
If you want to refer to it by "name", you can use a subquery:
update People
set StatusId = (select s.StatusId from status s where name = 'Busy')
where PersonId IN (select a.PersonId from Assignments a where a.PersonId = People.PersonId);
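A runnable sketch of this update, using Python's sqlite3 module as a stand-in database. It assumes the key column of Status is named StatusId, as the answer's query does, and it writes the membership test as EXISTS, which is equivalent to the correlated IN above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Status (StatusId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE People (
    PersonId INTEGER PRIMARY KEY,
    Name TEXT,
    StatusId INTEGER REFERENCES Status (StatusId)
);
CREATE TABLE Assignments (
    AssignmentId INTEGER PRIMARY KEY,
    Country TEXT,
    PersonId INTEGER REFERENCES People (PersonId)
);
INSERT INTO Status VALUES (1, 'Busy'), (2, 'Free');
INSERT INTO People (PersonId, Name) VALUES (1, 'John'), (2, 'Jenny'), (3, 'Steve');
INSERT INTO Assignments VALUES (1, 'UK', 1), (2, 'USA', 3);

-- Look the status id up by name instead of hardcoding it.
UPDATE People
SET StatusId = (SELECT s.StatusId FROM Status s WHERE s.Name = 'Busy')
WHERE EXISTS (SELECT 1 FROM Assignments a WHERE a.PersonId = People.PersonId);
""")

# Map each person's name to the status id they ended up with.
statuses = dict(conn.execute("SELECT Name, StatusId FROM People"))
print(statuses)
```

Jenny has no assignment, so her StatusId stays NULL (None in Python).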

SQL Select Where Opposite Match Does Not Exist

I'm trying to compare two columns and find records for which no record exists with those two columns reversed. In other words, I'm looking for instances where 1->3 exists but 3->1 does not exist. If 1->2 and 2->1 both exist, we will still consider 1 to be part of the results.
Table = Betweens
start_id | end_id
1 | 2
2 | 1
1 | 3
1 would be added since it is the start of a pair (1, 3) whose opposite (3, 1) is not present. It only gets added at the 3rd entry, though, since the pair 1 and 2 had an opposite.
So, eventually, it will just return the ids where the reversal does not exist.
I then want to join another table that holds the name for each id from the previous problem.
Table = Names
id | name
1 | Mars
2 | Earth
3 | Jupiter
So results will just be the names of those that don't have an opposite.
You can use a not exists condition:
select t1.start_id, t1.end_id
from the_table t1
where not exists (select *
from the_table t2
where t2.end_id = t1.start_id
and t2.start_id = t1.end_id);
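The question also asks for the names, so the NOT EXISTS filter can be combined with a join to the Names table. The sketch below runs this in Python's sqlite3 module; it assumes that only the start side of an unmatched pair should be reported, which is one reading of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Betweens (start_id INTEGER, end_id INTEGER);
CREATE TABLE Names (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO Betweens VALUES (1, 2), (2, 1), (1, 3);
INSERT INTO Names VALUES (1, 'Mars'), (2, 'Earth'), (3, 'Jupiter');
""")

rows = conn.execute("""
    SELECT DISTINCT n.name
    FROM Betweens t1
    JOIN Names n ON n.id = t1.start_id      -- name of the start side only (assumption)
    WHERE NOT EXISTS (SELECT 1
                      FROM Betweens t2
                      WHERE t2.start_id = t1.end_id
                        AND t2.end_id = t1.start_id)
""").fetchall()
names = [r[0] for r in rows]
print(names)
```

Only the pair (1, 3) lacks a reversed row, so only the name for id 1 comes back.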
I'm not sure about your data volume, but the query below should give the desired result in SQL Server.
create table TableBetweens
(start_id INT,
end_id INT
)
INSERT INTO TableBetweens VALUES(1,2)
INSERT INTO TableBetweens VALUES(2,1)
INSERT INTO TableBetweens VALUES(1,3)
create table TableNames
(id INT,
NAME VARCHAR(50)
)
INSERT INTO TableNames VALUES(1,'Mars')
INSERT INTO TableNames VALUES(2,'Earth')
INSERT INTO TableNames VALUES(3,'Jupiter')
SELECT *
FROM TableNames c
WHERE c.id IN (
SELECT nameid1.nameid
FROM (SELECT a.start_id, a.end_id
FROM TableBetweens a
LEFT JOIN TableBetweens b
ON a.start_id = b.end_id AND a.end_id = b.start_id -- compare the columns directly; concatenated ids can collide (e.g. 1,23 vs 12,3)
WHERE b.end_id IS NULL
AND b.start_id IS NULL) filterData
UNPIVOT
(
nameid
FOR id IN (filterData.start_id,filterData.end_id)
) AS nameid1
)

Nested result sets as with dynamic names

I'm trying to join the results of two referencing tables to get row values. The rows reference different table names and are selectable by their uuid.
My tables look like this:
table entry
nr refInt
1 123
2 456
3 789
4 123
table map
id name
123 'dogs'
456 'cats'
789 'dogs'
table cats
mapRef breed
456 'bengal'
888 'birma'
table dogs
mapRef breed
123 'sheepdog'
999 'poodle'
refInt in entry references map. The name in map is the reference to the table to load from, together with the id, which is also used as mapRef in the tables cats/dogs (dynamic table loading).
// subset 1: list of numbers that needs to be loaded from entry table (1-4)
SELECT DISTINCT refInt FROM entry WHERE nr in (1,2,3,4)
// subset 2: get all names from map that have the same id like refInt from subset1
SELECT name FROM map WHERE id in subset1
// main query: load all rows from table with the given name
// from map table that have the same mapRef value on it
SELECT * FROM (subset2.names) WHERE mapRef IN (subset2.ids)
result should be the rows:
1) 456 bengal
2) 123 sheepdog
I also made a SQLFiddle of it.
Is there a way to combine this to one query?
It's going to look something like:
SqlFiddle
select
sub.nr,
sub.breed
from (
select e.nr, e.refInt,
case
when c.breed is not null then c.breed
when d.breed is not null then d.breed
else null
end as breed
from (
select e.nr, e.refInt, m.name
from entry e
inner join map m on e.refInt = m.id
) e
left join cats c on e.refInt = c.mapRef and e.name = 'cats'
left join dogs d on e.refInt = d.mapref and e.name = 'dogs'
) sub
where sub.breed is not null
This is going to be very poor in performance.
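For reference, here is a compact variant of the same conditional LEFT JOIN idea that can be run against the sample data. It uses Python's sqlite3 module and replaces the CASE expression with COALESCE, which behaves the same here because at most one of the two joins can match a given map name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (nr INTEGER, refInt INTEGER);
CREATE TABLE map (id INTEGER, name TEXT);
CREATE TABLE cats (mapRef INTEGER, breed TEXT);
CREATE TABLE dogs (mapRef INTEGER, breed TEXT);
INSERT INTO entry VALUES (1, 123), (2, 456), (3, 789), (4, 123);
INSERT INTO map VALUES (123, 'dogs'), (456, 'cats'), (789, 'dogs');
INSERT INTO cats VALUES (456, 'bengal'), (888, 'birma');
INSERT INTO dogs VALUES (123, 'sheepdog'), (999, 'poodle');
""")

rows = conn.execute("""
    SELECT e.nr, COALESCE(c.breed, d.breed) AS breed
    FROM entry e
    JOIN map m ON m.id = e.refInt
    -- Each LEFT JOIN only fires when map.name points at that table.
    LEFT JOIN cats c ON c.mapRef = e.refInt AND m.name = 'cats'
    LEFT JOIN dogs d ON d.mapRef = e.refInt AND m.name = 'dogs'
    WHERE COALESCE(c.breed, d.breed) IS NOT NULL
    ORDER BY e.nr
""").fetchall()
print(rows)
```

Entries 1 and 4 both reference 123, so sheepdog appears once per entry; entry 3 has no matching dog row and is filtered out.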
IMO, the correct schema would be:
table entry
nr refint
1 123
2 456
3 789
4 124 (duplicate?)
table breed
mapRef breed species
123 sheepdog 1
999 poodle 1
456 bengal 2
888 birma 2
table species
id species
1 dogs
2 cats
This is normalized and has very good performance.
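With the three-table schema above, the lookup becomes a plain join with no per-table branching. A minimal sketch in Python's sqlite3 module, using the rows listed above; the column types and key constraints are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (id INTEGER PRIMARY KEY, species TEXT);
CREATE TABLE breed (
    mapRef INTEGER PRIMARY KEY,
    breed TEXT,
    species INTEGER REFERENCES species (id)
);
CREATE TABLE entry (nr INTEGER PRIMARY KEY, refint INTEGER);
INSERT INTO species VALUES (1, 'dogs'), (2, 'cats');
INSERT INTO breed VALUES
    (123, 'sheepdog', 1), (999, 'poodle', 1),
    (456, 'bengal', 2), (888, 'birma', 2);
INSERT INTO entry VALUES (1, 123), (2, 456), (3, 789), (4, 124);
""")

rows = conn.execute("""
    SELECT e.nr, b.breed, s.species
    FROM entry e
    JOIN breed b ON b.mapRef = e.refint
    JOIN species s ON s.id = b.species
    WHERE e.nr IN (1, 2, 3, 4)
    ORDER BY e.nr
""").fetchall()
print(rows)
```

Entries whose refint has no row in breed (789 and 124 in this data) simply drop out of the inner join.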
Note how the following query fully achieves the desired result set and demonstrates that the tables cats and dogs are really just partitions of a single entity, animals. The schema should be reworked to reflect this new understanding. This query is also efficient because the inclusion test is pushed down into the innermost CTEs, at the level where actual table rows are being read, without relying on the engine to discover this potential optimization (which can be problematic with UNIONs).
with
cats2 as (
select species='cat', mapref, breed
from cats animals
join entry on entry.refint = animals.mapref
where entry.nr in (1,2,3,4)
),
dogs2 as (
select species='dog', mapref, breed
from dogs animals
join entry on entry.refint = animals.mapref
where entry.nr in (1,2,3,4)
),
animals as (
select species, mapref, breed from cats2
union all
select species, mapref, breed from dogs2
)
select species, mapref, breed
from animals
group by species, mapref, breed
This test script:
declare @entry table (nr int, refint int );
declare @map table (id int, name varchar(20) );
declare @cats table (mapRef int, breed varchar(20));
declare @dogs table (mapRef int, breed varchar(20));
insert @entry(nr,refint) values
(1,123)
,(2,456)
,(3,789)
,(4,123);
insert @map(id,name) values
(123,'dogs')
,(456,'cats')
,(789,'dogs');
insert @cats(mapRef,breed) values
(456,'bengal'),(888,'burma');
insert @dogs(mapRef,breed) values
(123,'sheepdog'), (999,'poodle');
with
cats2 as (
select species='cat', mapref, breed
from @cats animals
join @entry entry on entry.refint = animals.mapref
where entry.nr in (1,2,3,4)
),
dogs2 as (
select species='dog', mapref, breed
from @dogs animals
join @entry entry on entry.refint = animals.mapref
where entry.nr in (1,2,3,4)
),
animals as (
select species, mapref, breed from cats2
union all
select species, mapref, breed from dogs2
)
select species, mapref, breed
from animals
group by species, mapref, breed
yields as desired:
species mapref breed
------- ----------- --------------------
cat 456 bengal
dog 123 sheepdog

One SQL statement for counting the records in the master table based on matching records in the detail table?

I have the following master table called Master and sample data
ID  Date
--------------
1   2014-09-07
2   2014-09-07
3   2014-09-08
The following details table called Details
masterId  Name
---------------------
1         John Walsh
1         John Jones
2         John Carney
1         Peter Lewis
3         John Wilson
Now I want to find the count of Master records (grouped on the Date column) that have a corresponding Details record whose Name has the value "John".
I cannot figure how to write a single SQL statement for this job.
Please note that a join is needed in order to find the master records to count. However, such a join creates duplicate master records in the count. I need to keep such duplicate records from being counted when grouping on the Date column in the Master table.
The correct results should be:
count: grouped on Date column
2 2014-09-07
1 2014-09-08
Thanks and regards!
This answer assumes the following:
The Name field is always FirstName LastName
You are looking once and only once for the John first name. The search criteria would be different, depending on what you need
SELECT Date, Count(*)
FROM tblmaster
INNER JOIN tbldetails ON tblmaster.ID=tbldetails.masterId
WHERE NAME LIKE 'John%'
GROUP BY Date, tbldetails.masterId
What we're doing here is using a wildcard character in our string search to say "Look for John where any characters of any length follow".
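The position of the wildcard matters. The short sketch below, run through Python's sqlite3 module, contrasts the prefix pattern 'John%' with the contains pattern '%John%'. The name 'Peter Johnson' is a hypothetical row added for illustration and is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE details (name TEXT)")
conn.executemany(
    "INSERT INTO details VALUES (?)",
    [("John Walsh",), ("Peter Lewis",), ("Peter Johnson",)],  # last row is hypothetical
)

# 'John%' matches only values that start with John.
prefix = [r[0] for r in conn.execute(
    "SELECT name FROM details WHERE name LIKE 'John%' ORDER BY name")]
# '%John%' matches John anywhere in the value, including inside a last name.
anywhere = [r[0] for r in conn.execute(
    "SELECT name FROM details WHERE name LIKE '%John%' ORDER BY name")]
print(prefix)
print(anywhere)
```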
Also, here is a way to create table variables based on what we're working with
DECLARE @tblmaster as table(
ID int,
[date] datetime
)
DECLARE @tbldetails as table(
masterID int,
name varchar(50)
)
INSERT INTO @tblmaster (ID,[date])
VALUES
(1,'2014-09-07'),(2,'2014-09-07'),(3,'2014-09-08')
INSERT INTO @tbldetails(masterID, name) VALUES
(1,'John Walsh'),
(1,'John Jones'),
(2,'John Carney'),
(1,'Peter Lewis'),
(3,'John Wilson')
Based on all comments below, this SQL statement in its clunky glory should do the trick.
SELECT date,count(t1.ID) FROM @tblmaster mainTable INNER JOIN
(
SELECT ID, COUNT(*) as countOfAll
FROM @tblmaster t1
INNER JOIN @tbldetails t2 ON t1.ID=t2.masterId
WHERE NAME LIKE 'John%'
GROUP BY id)
as t1 on t1.ID = mainTable.id
GROUP BY mainTable.date
Is this what you want?
select date, count(distinct m.id)
from master m join
details d
on d.masterid = m.id
where name like '%John%'
group by date;
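To see why COUNT(DISTINCT ...) is the piece that removes the duplicates, the sketch below runs the same join twice on the question's data, once with COUNT(*) and once with COUNT(DISTINCT m.id). It uses Python's sqlite3 module, stores the dates as text, and uses the prefix pattern 'John%'; those choices are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE master (id INTEGER PRIMARY KEY, date TEXT);
CREATE TABLE details (masterid INTEGER, name TEXT);
INSERT INTO master VALUES (1, '2014-09-07'), (2, '2014-09-07'), (3, '2014-09-08');
INSERT INTO details VALUES
    (1, 'John Walsh'), (1, 'John Jones'), (2, 'John Carney'),
    (1, 'Peter Lewis'), (3, 'John Wilson');
""")

query = """
    SELECT m.date, {agg}
    FROM master m
    JOIN details d ON d.masterid = m.id
    WHERE d.name LIKE 'John%'
    GROUP BY m.date
    ORDER BY m.date
"""
# COUNT(*) counts joined rows, so master 1 is counted once per matching detail.
naive = conn.execute(query.format(agg="COUNT(*)")).fetchall()
# COUNT(DISTINCT m.id) counts each master record once per date.
deduped = conn.execute(query.format(agg="COUNT(DISTINCT m.id)")).fetchall()
print(naive)
print(deduped)
```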

Add or delete repeated row

I have an output like this:
id name date school school1
1 john 11/11/2001 nyu ucla
1 john 11/11/2001 ucla nyu
2 paul 11/11/2011 uft mit
2 paul 11/11/2011 mit uft
I would like to achieve this:
id name date school school1
1 john 11/11/2001 nyu ucla
2 paul 11/11/2011 mit uft
I am using direct join as in:
select distinct
a.id, a.name,
b.date,
c.school,
a1.id, a1.name,
b1.date,
c1.school
from table a, table b, table c,table a1, table b1, table c1
where
a.id=b.id
and...
Any ideas?
We will need more information such as what your tables contain and what you are after.
One thing I noticed is that you have a school and then a school1 column. Normalization rules (first normal form in particular) say that you should not duplicate fields and append numbers to them to hold more values, even if you think there will only ever be 1 or 2 additional items. You need to create a second table that associates a user with one to many schools.
I agree with everyone else that both your source table and your desired output are poor design. While you probably can't do anything about your source table, I recommend the following code and output:
Select id, name, date, school from MyTable
union
Select id, name, date, school1 from MyTable;
(repeat as necessary)
This will give you results in the format:
id name date school
1 john 11/11/2001 nyu
1 john 11/11/2001 ucla
2 paul 11/11/2011 mit
2 paul 11/11/2011 uft
(Note: in my version of SQL, union queries automatically select distinct records so the distinct flag isn't needed)
With this format, you could easily count the number of schools per student, number of students per school, etc.
If processing time and/or storage space is a factor here, you could then split this into 2 tables, 1 with the id,name & date, the other with the id & school (basically what JonH just said). But if you're just working up some simple statistics, this should suffice.
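The UNION approach can be checked against the question's sample rows. The sketch below uses Python's sqlite3 module; MyTable stands for whatever query currently produces the duplicated output, and the dates are kept as plain text exactly as shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (id INTEGER, name TEXT, date TEXT, school TEXT, school1 TEXT);
INSERT INTO MyTable VALUES
    (1, 'john', '11/11/2001', 'nyu', 'ucla'),
    (1, 'john', '11/11/2001', 'ucla', 'nyu'),
    (2, 'paul', '11/11/2011', 'uft', 'mit'),
    (2, 'paul', '11/11/2011', 'mit', 'uft');
""")

# UNION (without ALL) removes duplicate rows, leaving one row per student and school.
rows = conn.execute("""
    SELECT id, name, date, school FROM MyTable
    UNION
    SELECT id, name, date, school1 FROM MyTable
    ORDER BY id, school
""").fetchall()
for row in rows:
    print(row)
```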
This problem was just too irresistible, so I took a guess at the data structures that we are dealing with. The technology wasn't specified in the question. This is in Transact-SQL.
create table student
(
id int not null primary key identity,
name nvarchar(100) not null default '',
graduation_date date not null default getdate(),
)
go
create table school
(
id int not null primary key identity,
name nvarchar(100) not null default ''
)
go
create table student_school_asc
(
student_id int not null foreign key references student (id),
school_id int not null foreign key references school (id),
primary key (student_id, school_id)
)
go
insert into student (name, graduation_date) values ('john', '2001-11-11')
insert into student (name, graduation_date) values ('paul', '2011-11-11')
insert into school (name) values ('nyu')
insert into school (name) values ('ucla')
insert into school (name) values ('uft')
insert into school (name) values ('mit')
insert into student_school_asc (student_id, school_id) values (1,1)
insert into student_school_asc (student_id, school_id) values (1,2)
insert into student_school_asc (student_id, school_id) values (2,3)
insert into student_school_asc (student_id, school_id) values (2,4)
select
s.id,
s.name,
s.graduation_date as [date],
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s1 where s1.rank_num = 1) as school,
(select max(name) from
(select name,
RANK() over (order by name) as rank_num
from school sc
inner join student_school_asc ssa on ssa.school_id = sc.id
where ssa.student_id = s.id) s2 where s2.rank_num = 2) as school1
from
student s
Result:
id name date school school1
--- ----- ---------- ------- --------
1 john 2001-11-11 nyu ucla
2 paul 2011-11-11 mit uft