Join tables where no common field exists and based on ID columns - sql

This question seems hard to phrase, but after explaining my situation it should be easy to understand. I have two tables: one called INSTRUCTORS that holds instructor data and another called LIST_OPTION_ITEM that holds the ID values of different ID columns stored in the INSTRUCTORS table. A third and possibly important table to include is LIST_OPTION_TYPE, which contains the IDs of whatever ID column there is in INSTRUCTORS. Perhaps it would be easier to explain by showing sample data and my desired output.
INSTRUCTORS
RANK_ID | SPECIALTY_ID | DUTY_TITLE_ID | SERVICE_BRANCH_ID | STATUS_ID | UNIT_ID | OFFICE_SYMBOL_ID
1354 | 319 | 931 | 2604 | 1378 | 1406 | 1429
LIST_OPTION_ITEM
OPTION_ITEM_ID | OPTION_TYPE_ID | ITEM_VALUE
1354 | 22 | CAPT
319 | 20 | CBRN TRAUMA NURSE
931 | 21 | IDMT-Squadron Medical Element
2604 | 128 | 46N NURSE
1378 | 23 | USA
1406 | 24 | Guard
1429 | 126 | CERFP
LIST_OPTION_TYPE
OPTION_TYPE_ID | OPTION_TYPE
20 | Specialty
21 | Duty_Title
22 | Rank
23 | Service_Branch
24 | Status
126 | Unit
128 | Office_Symbol
It is important to note that I cannot join INSTRUCTORS and LIST_OPTION_ITEM, as there is no common column. However, LIST_OPTION_ITEM and LIST_OPTION_TYPE can join on OPTION_TYPE_ID. My desired output from a SELECT query:
Rank | Specialty | Duty Title | Service Branch | Status | Unit | Office Symbol
CAPT | CBRN TRAUMA NURSE | IDMT-Squadron Medical Element | 46N NURSE | USA | Guard | CERFP
I've tried some solutions but can't come up with anything. Do I need a cross join or something? Help would be much appreciated.

I tried this with the PIVOT and UNPIVOT functions.
Below is sample SQL:
with inst as (
  -- Turn each ID column into its own row: (inst_id, col, col_id)
  select inst_id, col, col_id
  from (select rownum as inst_id, a.* from instructors a)
  unpivot (
    col_id for col in (status_id as 'STATUS', rank_id as 'RANK',
                       specialty_id as 'SPECIALTY', duty_title_id as 'DUTY_TITLE')
  )
)
select * from (
  -- Look up the display value for each (option type, item ID) pair
  select inst_id, col, item_value
  from inst,
       (select a.option_type, b.option_item_id, b.item_value
        from list_option_type a, list_option_item b
        where a.option_type_id = b.option_type_id) opt
  where inst.col = upper(opt.option_type)
    and opt.option_item_id = inst.col_id
)
-- Turn the rows back into one column per option type
pivot (
  max(item_value)
  for col in ('STATUS', 'RANK', 'SPECIALTY', 'DUTY_TITLE')
)
order by inst_id;
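This gives the desired output for the four columns listed. The remaining ID columns can be added to both IN lists in the same way.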
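As an alternative to PIVOT/UNPIVOT: in the sample data, every ID column in INSTRUCTORS holds a value that appears in LIST_OPTION_ITEM.OPTION_ITEM_ID, so the lookup table can also be joined once per ID column. The following is only a minimal sketch of that idea. It runs the SQL through Python's sqlite3 module purely so that it is self-contained; the table aliases, the output column aliases, and the choice of LEFT JOIN (so that a missing ID yields NULL instead of dropping the row) are my assumptions, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE INSTRUCTORS (
    RANK_ID INTEGER, SPECIALTY_ID INTEGER, DUTY_TITLE_ID INTEGER,
    SERVICE_BRANCH_ID INTEGER, STATUS_ID INTEGER, UNIT_ID INTEGER,
    OFFICE_SYMBOL_ID INTEGER);
CREATE TABLE LIST_OPTION_ITEM (
    OPTION_ITEM_ID INTEGER, OPTION_TYPE_ID INTEGER, ITEM_VALUE TEXT);
-- Sample rows copied from the question
INSERT INTO INSTRUCTORS VALUES (1354, 319, 931, 2604, 1378, 1406, 1429);
INSERT INTO LIST_OPTION_ITEM VALUES
    (1354, 22, 'CAPT'), (319, 20, 'CBRN TRAUMA NURSE'),
    (931, 21, 'IDMT-Squadron Medical Element'), (2604, 128, '46N NURSE'),
    (1378, 23, 'USA'), (1406, 24, 'Guard'), (1429, 126, 'CERFP');
""")

# One join of the lookup table per ID column; each alias resolves one ID.
LOOKUP_QUERY = """
SELECT rk.ITEM_VALUE AS rank_value,
       sp.ITEM_VALUE AS specialty,
       dt.ITEM_VALUE AS duty_title,
       sb.ITEM_VALUE AS service_branch,
       st.ITEM_VALUE AS status,
       un.ITEM_VALUE AS unit,
       os.ITEM_VALUE AS office_symbol
FROM INSTRUCTORS i
LEFT JOIN LIST_OPTION_ITEM rk ON rk.OPTION_ITEM_ID = i.RANK_ID
LEFT JOIN LIST_OPTION_ITEM sp ON sp.OPTION_ITEM_ID = i.SPECIALTY_ID
LEFT JOIN LIST_OPTION_ITEM dt ON dt.OPTION_ITEM_ID = i.DUTY_TITLE_ID
LEFT JOIN LIST_OPTION_ITEM sb ON sb.OPTION_ITEM_ID = i.SERVICE_BRANCH_ID
LEFT JOIN LIST_OPTION_ITEM st ON st.OPTION_ITEM_ID = i.STATUS_ID
LEFT JOIN LIST_OPTION_ITEM un ON un.OPTION_ITEM_ID = i.UNIT_ID
LEFT JOIN LIST_OPTION_ITEM os ON os.OPTION_ITEM_ID = i.OFFICE_SYMBOL_ID
"""
instructor_row = conn.execute(LOOKUP_QUERY).fetchone()
print(instructor_row)
```

This variant does not need LIST_OPTION_TYPE at all, but it relies on OPTION_ITEM_ID being unique across option types, which holds for the sample rows and would need checking on the real data.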

Related

SQL Code to Update Entities with the same ID

We have an app that manages our members' information and is tied to a SQL database. Users can set attributes that apply to the whole family. I am trying to write a SQL script that will update the values for the whole family.
Example:
Here is a sample of a few columns of our dbo.AttributeValue table:
AttributeID | EntityID | Value | CreatedDateTime | ModifiedDateTime
5856 | 733 | True | 2021-11-06 17:30:38.207 | 2021-11-10 13:52:09.843
5856 | 613 | False | 2021-11-05 12:12:08.207 | 2021-11-16 3:32:01.843
Here is a sample of a few columns in our dbo.Person Table:
ID | PrimaryFamilyID
733 | 187
709 | 187
137 | 187
I would like anyone with the same value in PrimaryFamilyID to have the same values in the dbo.AttributeValue table. Bonus points if we can make it update to the value with the most recent ModifiedDateTime in the dbo.AttributeValue table, so that if someone in the family modifies the value after everyone has an assigned attribute, it will go ahead and update those as well.
Desired outcome:
AttributeID | EntityID | Value | CreatedDateTime | ModifiedDateTime
5856 | 733 | True | 2021-11-06 17:30:38.207 | 2021-11-10 13:52:09.843
5856 | 709 | True | 2021-11-06 17:30:38.207 | 2021-11-10 13:52:09.843
5856 | 137 | True | 2021-11-06 17:30:38.207 | 2021-11-10 13:52:09.843
It took me a while to work out what you want, but here is a solution (a running version is on DB Fiddle).
You can start from here.
-- Families with more than one member
With PersonCTE as (
    Select Count(*) as cnt, PrimaryFamilyId
    from Person
    group by PrimaryFamilyId
    having count(*) > 1
)
-- Build a new table with one row per family member and attribute value
Select AV.AttributeId, P.Id, AV.Value, AV.CreatedDateTime, AV.ModifiedDateTime
into NewAttributeValue
from Person P
inner join PersonCTE C on P.PrimaryFamilyId = C.PrimaryFamilyId
-- Note: the cross join is not restricted to the member's own family
cross join AttributeValue AV
where AV.EntityId in (Select distinct Id from Person)
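For the bonus requirement (every family member gets the most recently modified value), one possible approach is to rank each family's attribute rows by ModifiedDateTime and fan the newest row out to all members of that family. The sketch below is a hedged illustration of that idea, not the answer's own method. It uses Python's sqlite3 module only to be runnable; it assumes timestamps are stored as zero-padded ISO-style text so that text order matches time order, and the older 'False' row for person 709 is hypothetical data added to show that the newest value wins. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (ID INTEGER, PrimaryFamilyID INTEGER);
CREATE TABLE AttributeValue (
    AttributeID INTEGER, EntityID INTEGER, Value TEXT,
    CreatedDateTime TEXT, ModifiedDateTime TEXT);
INSERT INTO Person VALUES (733, 187), (709, 187), (137, 187);
INSERT INTO AttributeValue VALUES
    -- Row from the question
    (5856, 733, 'True',
     '2021-11-06 17:30:38.207', '2021-11-10 13:52:09.843'),
    -- Hypothetical older row for another family member
    (5856, 709, 'False',
     '2021-11-07 08:00:00.000', '2021-11-08 09:00:00.000');
""")

FAMILY_QUERY = """
WITH ranked AS (
    -- Newest row per (family, attribute) gets rn = 1
    SELECT av.AttributeID, p.PrimaryFamilyID, av.Value,
           av.CreatedDateTime, av.ModifiedDateTime,
           ROW_NUMBER() OVER (
               PARTITION BY p.PrimaryFamilyID, av.AttributeID
               ORDER BY av.ModifiedDateTime DESC) AS rn
    FROM AttributeValue av
    JOIN Person p ON p.ID = av.EntityID
)
-- Fan the newest row out to every member of the same family
SELECT r.AttributeID, m.ID AS EntityID, r.Value,
       r.CreatedDateTime, r.ModifiedDateTime
FROM ranked r
JOIN Person m ON m.PrimaryFamilyID = r.PrimaryFamilyID
WHERE r.rn = 1
ORDER BY m.ID
"""
family_rows = conn.execute(FAMILY_QUERY).fetchall()
for row in family_rows:
    print(row)
```

The result set has one row per family member, all carrying the value and timestamps of the newest row in the family. It could then feed an update or insert against dbo.AttributeValue; that step is left out here because it depends on how existing rows should be handled.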

Including variables after filtering selecting only minimum values in SQL?

I am working with a twin dataset and would like to create a table with the Subject ID (Subject) and twin pair ID (twpair) for the twins with the lower (or one of the twins if the values are equal) lifetime total of marijuana use (MJ1a).
A portion of my table looks like this:
Subject | twpair | MJ1a
156 | 345 | 10
157 | 345 | 7
158 | 346 | 20
159 | 346 | 3
160 | 347 | 4
161 | 347 | 4
I'm hoping to create a table with only the twins that have the lower amount of marijuana use which would look like this:
Subject | twpair | MJ1a
157 | 345 | 7
159 | 346 | 3
161 | 347 | 4
This is the SQL code I have so far:
proc sql;
create table one_twin as
select twpair,min(MJ1a) as minUse, Subject
from twins_deviation
group by twpair;
Unfortunately this ends up causing all of the subjects to be remerged back in the dataset. If I don't include the Subject portion I get the correct values for twpair and MJ1a but not the Subject IDs.
How do I filter the dataset to only include those with the minimum values while also including variables of interest like Subject ID? Note that if both twins in a pair have the SAME value, I would like to select one, but it doesn't matter which one. Any tips would be extremely appreciated!
This query should give you the desired result.
select a.subject, a.twpair, a.MJ1a
from twins_deviation a
join (select twpair, min(mj1a) as mj1a
      from twins_deviation
      group by twpair) b
  on a.twpair = b.twpair and a.mj1a = b.mj1a
If your DB supports analytic/window functions, the same can be accomplished with a ranking function; the solution is given below.
EDIT1: to handle equal values of mj1a
select subject, twpair, mj1a
from (select subject, twpair, mj1a,
             row_number() over (partition by twpair order by mj1a) as rnk
      from twins_deviation) out1
where rnk = 1;
EDIT2: Updated solution 1 to include only one twin per pair.
select min(subject) as subject, twpair, mj1a
from (select a.subject as subject, a.twpair as twpair, a.MJ1a as MJ1a
      from twins_deviation a
      join (select twpair, min(mj1a) as mj1a
            from twins_deviation
            group by twpair) b
        on a.twpair = b.twpair and a.mj1a = b.mj1a) out1
group by twpair, MJ1a;
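To check how the two approaches behave on the tied pair (twpair 347), here is a small sketch that runs both against the sample rows from the question. It uses Python's sqlite3 module only so that it is self-contained (window functions need SQLite 3.25 or later). Adding Subject as a second sort key in the ROW_NUMBER() ordering is my assumption: it makes the choice between tied twins deterministic, which the original window query does not guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE twins_deviation (Subject INTEGER, twpair INTEGER, MJ1a INTEGER);
INSERT INTO twins_deviation VALUES
    (156, 345, 10), (157, 345, 7), (158, 346, 20),
    (159, 346, 3), (160, 347, 4), (161, 347, 4);
""")

# Window-function variant; Subject breaks ties so the result is repeatable.
WINDOW_QUERY = """
SELECT Subject, twpair, MJ1a
FROM (
    SELECT Subject, twpair, MJ1a,
           ROW_NUMBER() OVER (PARTITION BY twpair
                              ORDER BY MJ1a, Subject) AS rnk
    FROM twins_deviation
) AS ranked
WHERE rnk = 1
ORDER BY twpair
"""

# Join-and-group variant; MIN(Subject) picks one twin when values are tied.
JOIN_QUERY = """
SELECT MIN(a.Subject) AS Subject, a.twpair, a.MJ1a
FROM twins_deviation a
JOIN (SELECT twpair, MIN(MJ1a) AS MJ1a
      FROM twins_deviation
      GROUP BY twpair) b
  ON a.twpair = b.twpair AND a.MJ1a = b.MJ1a
GROUP BY a.twpair, a.MJ1a
ORDER BY a.twpair
"""

window_rows = conn.execute(WINDOW_QUERY).fetchall()
join_rows = conn.execute(JOIN_QUERY).fetchall()
print(window_rows)
print(join_rows)
```

Both queries keep subject 160 for pair 347, whereas the example output in the question shows 161; the question states that either twin is acceptable in that case.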

Run a set number of joined select statements under a single sql query

I have a counselling appointment website. Currently I list clients in one table, but also have couples listed by id in a second table for when they book a couples session, i.e.:
client
id_no first last
564 John Smith
983 Mary Jones
999 Mark Fields
882 Joan Hancock
couple
id_no client1 client2
623 564 983
555 999 882
I would like to write a single select statement, using aliases, which will list out couples on a single line. Up until now, I have been doing a simple join then cleaning up the result using php after running the query, but would like to clean this up in sql so that I get a result like the following
id_no first_1 last_1 first_2 last_2
623 John Smith Mary Jones
555 Mark Fields Joan Hancock
I suspect that sub queries might be involved, but can't for the life of me wrangle them to get this result.
Update
I just tried the following:
SELECT id_no,first_1,last_1,first_2,last_2
FROM ( SELECT a.id_no AS id_no, b.first AS first_1,b.last AS last_1
FROM couple AS a, client AS b WHERE a.client1=b.id_no ) c1
JOIN ( SELECT a.id_no AS id_no, b.first AS first_2,b.last AS last_2
FROM couple AS a, client AS b WHERE a.client2=b.id_no ) c2 ON
(c1.id_no=c2.id_no)
And am getting a message that "Column 'id_no' in field list is ambiguous". Not sure if I am on the right track
You are getting the error because you did not qualify id_no with a table alias. The column exists in both derived tables (c1 and c2), so SQL cannot tell which one to return and raises the ambiguous column error.
Also, you don't need subqueries. You just need to join the client table to the couple table twice, once for client1 and once for client2, like below:
Select cp.id_no,
cl1.first as first_1,
cl1.last as last_1,
cl2.first as first_2,
cl2.last as last_2
From couple cp
inner join client cl1 on cl1.id_no = cp.client1
inner join client cl2 on cl2.id_no = cp.client2

JOIN the same table on two columns

I use JOINs to replace country and product IDs in import and export data with actual country and products names stored in separate tables. In the data source table (data), there are two columns with country IDs, for origin and destination, both of which I am replacing with country names.
The code I have come up with refers to the country_names table twice – as country_names, and country_names2, – which doesn’t seem to be very elegant. I expected to be able to refer to the table just once, by a single name. I would be grateful if someone pointed me to a more elegant and maybe more efficient way to achieve the same result.
SELECT
country_names.name AS origin,
country_names2.name AS dest,
product_names.name AS product,
SUM(data.export_val) AS export_val,
SUM(data.import_val) AS import_val
FROM
OEC.year_origin_destination_hs92_6 AS data
JOIN
OEC.products_hs_92 AS product_names
ON
data.hs92 = product_names.hs92
JOIN
OEC.country_names AS country_names
ON
data.origin = country_names.id_3char
JOIN
OEC.country_names AS country_names2
ON
data.dest = country_names2.id_3char
WHERE
data.year > 2012
AND data.export_val > 1E8
GROUP BY
origin,
dest,
product
The table to convert product IDs to product names has 6K+ rows. Here is a small sample:
id hs92 name
63215 3215 Ink
2130110 130110 Lac
21002 1002 Rye
2100200 100200 Rye
52706 2706 Tar
20902 902 Tea
42203 2203 Beer
42302 2302 Bran
178703 8703 Cars
The table to convert country IDs to country names (which is the table I have to JOIN on twice) has 264 rows for all countries in the world. (id_3char is the column used.) Here is a sample:
id id_3char name
euchi chi Channel Islands
askhm khm Cambodia
eublx blx Belgium-Luxembourg
eublr blr Belarus
eumne mne Montenegro
euhun hun Hungary
asmng mng Mongolia
nabhs bhs Bahamas
afsen sen Senegal
And here is a sample of data from the import and export data table with a total of 205M rows that has the two columns origin and dest that I am making a join on:
year origin dest hs92 export_val import_val
2009 can isr 300410 2152838.47 3199.24
1995 chn jpn 590190 275748.65 554154.24
2000 deu gmb 100610 1573508.44 1327.0
2008 deu jpn 540822 10000.0 202062.43
2010 deu ukr 950390 1626012.04 159423.38
2006 esp prt 080530 2470699.19 125291.33
2006 grc ind 844859 8667.0 3182.0
2000 ltu deu 630399 6018.12 5061.96
2005 usa zaf 290219 2126216.52 34561.61
1997 ven ecu 281122 155347.73 1010.0
I think what you already have is good enough to use as is :o)
Meanwhile, if for some reason you really want to avoid two joins on the country table, you can materialize the select statement below into, let's say, an `OEC.origin_destination_pairs` table
SELECT
o.id_3char o_id_3char,
o.name o_name,
d.id_3char d_id_3char,
d.name d_name
FROM `OEC.country_names` o
CROSS JOIN `OEC.country_names` d
Then you can just join on that new table as below
SELECT
country_names.o_name AS origin,
country_names.d_name AS dest,
product_names.name AS product,
SUM(data.export_val) AS export_val,
SUM(data.import_val) AS import_val
FROM OEC.year_origin_destination_hs92_6 AS data
JOIN OEC.products_hs_92 AS product_names
ON data.hs92 = product_names.hs92
JOIN OEC.origin_destination_pairs AS country_names
ON data.origin = country_names.o_id_3char
AND data.dest = country_names.d_id_3char
WHERE data.year > 2012
AND data.export_val > 1E8
GROUP BY
origin,
dest,
product
The motivation for the above is the cost of storing and querying in your particular case.
Your `OEC.country_names` table is only about 10 KB in size.
Each time you query it, you pay as if it were 10 MB (charges are rounded to the nearest MB, with a minimum 10 MB data processed per table referenced by the query, and with a minimum 10 MB data processed per query).
So if you materialize the table mentioned above, it will still be less than 10 MB, so there is no difference in querying charges.
The situation is similar for storing that table: no visible change in charges.
You can check more about pricing here

Display the related attributes of a MAX in an ACCESS query

I have 2 tables joined with political results and I need to have the votes SUM per county, and then the MAX of the vote counts per county, with the Party that relates to the MAX in another column. I'm having trouble getting the Party into the Query results without messing up the SUM and MAX columns.
This Table I can get with the Following SQL
County Name SumOfVoteCount MaxOfVoteCount OfficeID
Baker 7253 4008 S
SELECT NY_Race.[County Name], Sum(NY_Results.VoteCount) AS SumOfVoteCount, Max(NY_Results.VoteCount) AS MaxOfVoteCount
FROM NY_Race INNER JOIN NY_Results ON NY_Race.RaceCountyID = NY_Results.RaceCountyID
GROUP BY NY_Race.[County Name], NY_Race.OfficeID
HAVING (((NY_Race.OfficeID)="S"));
What I need is for the Party that has that 4008 vote total to be included in the query results, but when I try to add Party to the select list, it shows all of them and messes up the SUM of the vote count, and I end up with this:
County Name SumOfVoteCount MaxOfVoteCount1 Party OfficeID
Baker 2927 2927 Dem S
Baker 4008 4008 GOP S
Baker 101 101 Lib S
Baker 53 53 Prg S
Baker 164 164 WF S
This is the SQL code I am using that gets the above Table:
SELECT NY_Race.[County Name], Sum(NY_Results.VoteCount) AS SumOfVoteCount, Max(NY_Results.VoteCount) AS MaxOfVoteCount, NY_Results.Party
FROM NY_Race INNER JOIN NY_Results ON NY_Race.RaceCountyID = NY_Results.RaceCountyID
GROUP BY NY_Race.[County Name], NY_Race.OfficeID, NY_Results.Party
HAVING (((NY_Race.OfficeID)="S"));
How can I get this table in the query results?
County Name SumOfVoteCount MaxOfVoteCount Party OfficeID
Baker 7253 4008 GOP S
I can't help but think I'm missing a WHERE clause somewhere that compares Party to MaxOfVoteCount
One way to approach these is to have a nested subquery that gets the MAX() for the field of interest. Then, only select the record with that MAX(). Here's the structure:
select COUNTY_NAME, R1.*
     , (select sum(votecount)
        from results R2
        where R1.COUNTY_ID = R2.COUNTY_ID and R1.OFFICE_ID = R2.OFFICE_ID)
from RESULTS R1
join RACE on R1.COUNTY_ID = RACE.COUNTY_ID and R1.OFFICE_ID = RACE.OFFICE_ID
where R1.office_id = 'S'
  and voteCount =
      (select max(votecount)
       from results R3
       where R1.COUNTY_ID = R3.COUNTY_ID and R1.OFFICE_ID = R3.OFFICE_ID)
I created a demo on SQLFiddle.
One issue: what if two parties get exactly the same number of votes? With the query above, both rows are returned for that county. That's a functional issue you will have to resolve.
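If exactly one row per county is required even when there is a tie, one option is to rank the parties within each county and keep only the first. The sketch below illustrates this with an assumed tie-break rule (alphabetical by Party), which is only a placeholder for whatever rule fits the real requirement. It runs in SQLite through Python's sqlite3 module (3.25 or later for window functions); MS Access has no window functions, so the same idea there would need a correlated subquery instead. The column name CountyName (instead of [County Name]), the RaceCountyID values, and the tied 'Tie County' rows are hypothetical; the Baker figures come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE NY_Race (RaceCountyID INTEGER, CountyName TEXT, OfficeID TEXT);
CREATE TABLE NY_Results (RaceCountyID INTEGER, Party TEXT, VoteCount INTEGER);
INSERT INTO NY_Race VALUES (1, 'Baker', 'S'), (2, 'Tie County', 'S');
INSERT INTO NY_Results VALUES
    -- Baker rows from the question
    (1, 'Dem', 2927), (1, 'GOP', 4008), (1, 'Lib', 101),
    (1, 'Prg', 53), (1, 'WF', 164),
    -- Hypothetical county with an exact tie
    (2, 'Dem', 100), (2, 'GOP', 100);
""")

WINNER_QUERY = """
WITH ranked AS (
    SELECT r.CountyName, r.OfficeID, s.Party, s.VoteCount,
           -- Total votes in the county, repeated on every row
           SUM(s.VoteCount) OVER (PARTITION BY r.RaceCountyID)
               AS SumOfVoteCount,
           -- Highest vote count first; Party name breaks ties (assumed rule)
           ROW_NUMBER() OVER (PARTITION BY r.RaceCountyID
                              ORDER BY s.VoteCount DESC, s.Party) AS rn
    FROM NY_Race r
    JOIN NY_Results s ON s.RaceCountyID = r.RaceCountyID
    WHERE r.OfficeID = 'S'
)
SELECT CountyName, SumOfVoteCount, VoteCount AS MaxOfVoteCount,
       Party, OfficeID
FROM ranked
WHERE rn = 1
ORDER BY CountyName
"""
winner_rows = conn.execute(WINNER_QUERY).fetchall()
for row in winner_rows:
    print(row)
```

To return every tied party instead of picking one, RANK() could replace ROW_NUMBER() with the Party sort key removed, so that tied rows share rank 1.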