What is the best way to join tables - sql

this is more like a general question.
I am looking for the best way to join 4, maybe 5 different tables. I am trying to create a Power Bi pulling live information from an IBM AS400 where customer service can type one of our parts number,
see how many parts we have in inventory, if none, see the lead time and if there are any orders already already entered for the typed part number.
SERI is our inventory table with 37180 records.
(active inventory that is available)
METHDM is our kit table with 37459 records.
(this table contains the bill of materials for custom kits, KIT A123 contains different part numbers in it witch are in SERI as well.)
STKA is our part lead time table with 76796 records.
(lead time means how long will it take for parts to come in)
OCRI is our sales order table with 6497 records.
(This table contains all customer orders)
I have some knowledge in writing queries but this one is more challenging of what I have created in the past. Should I start with the table that has the most records and start left joining the rest ?
From STKA 76796 records
Left join METHDM 37459 records on STKA
left join SERI 37180 records on STKA
left join OCRI 6497 records on STAK
Select
STKA.v6part as part,
STKA.v6plnt as plant,
STKA.v6tdys as pur_leadtime,
STKA.v6prpt as Pur_PrepLeadtime,
STKA.v6lead as Mfg_leadtime,
STKA.v6prpt as Mfg_PrepLeadTime,
METHDM.AQMTLP AS COMPONENT,
METHDM.AQQPPC AS QTYNEEDED,
SERI.HTLOTN AS BATCH,
SERI.HTUNIT AS UOM,
(HTQTY - HTQTYC) as ONHAND,
OCRI.DDORD# AS SALESORDER,
OCRI.DDRDAT AS PROMISED
from stka
left join METHDM on STKA.V6PART = METHDM.AQPART
left join SERI on STKA.V6PART = SERI.HTPART
left join OCRI on STKA.V6PART = OCRI.DDPART
Is this the best way to join the tables?

I think you already have your answer, but conceptually, there are a few issues here to deal with, and I figured I would give you a few examples, using data a little bit like yours, but massively simplified.
CREATE TABLE #STKA (V6PART INT, OTHER_DATA VARCHAR(50));
CREATE TABLE #METHDM (AQPART INT, KIT_ID INT, SOME_DATE DATETIME, OTHER_DATA VARCHAR(50));
CREATE TABLE #SERI (HTPART INT, OTHER_DATA VARCHAR(50));
CREATE TABLE #OCRI (DDPART INT, OTHER_DATA VARCHAR(50));
INSERT INTO #STKA SELECT 1, NULL UNION ALL SELECT 2, NULL UNION ALL SELECT 3, NULL; --1, 2, 3 Ids
INSERT INTO #METHDM SELECT 1, 1, '20200108 10:00', NULL UNION ALL SELECT 1, 2, '20200108 11:00', NULL UNION ALL SELECT 2, 1, '20200108 13:00', NULL; --1 Id appears twice, 2 Id once, no 3 Id
INSERT INTO #SERI SELECT 1, NULL UNION ALL SELECT 3, NULL; --1 and 3 Ids
INSERT INTO #OCRI SELECT 1, NULL UNION ALL SELECT 4, NULL; --1 and 4 Ids
So fundamentally we have a few issues here:
o the first problem is that the IDs in the tables differ, one table has an ID #4 but this isn't in any of the others;
o the second issue is that we have multiple rows for the same ID in one table;
o the third issue is that some tables are "missing" IDs that are in other tables, which you already covered by using LEFT JOINs, so I will ignore this.
--This will select ID 1 twice, 2 once, 3 once, and miss 4 completely
SELECT
*
FROM
#STKA
LEFT JOIN #METHDM ON #METHDM.AQPART = #STKA.V6PART
LEFT JOIN #SERI ON #SERI.HTPART = #STKA.V6PART
LEFT JOIN #OCRI ON #OCRI.DDPART = #STKA.V6PART;
So the problem here is that we don't have every ID in our "anchor" table STKA, and in fact there's no single table that has every ID in it. Now your data might be fine here, but if it isn't then you can simply add a step to find every ID, and use this as the anchor.
--This will select each ID, but still doubles up on ID 1
WITH Ids AS (
SELECT V6PART AS ID FROM #STKA
UNION
SELECT AQPART AS ID FROM #METHDM
UNION
SELECT HTPART AS ID FROM #SERI
UNION
SELECT DDPART AS ID FROM #OCRI)
SELECT
*
FROM
Ids I
LEFT JOIN #STKA ON #STKA.V6PART = I.Id
LEFT JOIN #METHDM ON #METHDM.AQPART = I.Id
LEFT JOIN #SERI ON #SERI.HTPART = I.Id
LEFT JOIN #OCRI ON #OCRI.DDPART = I.Id;
That's using a common-table expression, but a subquery would also do the job. However, this still leaves us with an issue where ID 1 appears twice in the list, because it has multiple rows in one of the sub-tables.
One way to fix this is to pick the row with the latest date, or any other ORDER you can apply to the data:
--Pick the best row for the table where it has multiple rows, now we get one row per ID
WITH Ids AS (
SELECT V6PART AS ID FROM #STKA
UNION
SELECT AQPART AS ID FROM #METHDM
UNION
SELECT HTPART AS ID FROM #SERI
UNION
SELECT DDPART AS ID FROM #OCRI),
BestMETHDM AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY AQPART ORDER BY SOME_DATE DESC) AS ORDER_ID
FROM
#METHDM)
SELECT
*
FROM
Ids I
LEFT JOIN #STKA ON #STKA.V6PART = I.Id
LEFT JOIN BestMETHDM ON BestMETHDM.AQPART = I.Id AND BestMETHDM.ORDER_ID = 1
LEFT JOIN #SERI ON #SERI.HTPART = I.Id
LEFT JOIN #OCRI ON #OCRI.DDPART = I.Id;
Of course you could also add some aggregation (SUM, MAX, MIN, AVG, etc.) to fix this problem (if it is indeed an issue). Also, I used a common-table expression, but this would work just as well with a subquery.

Expanding on a comment made on the question..
I would say I will start with SERI as that table contains the entire inventory for our facility and should cover the other tables
However the question said
SERI is our inventory table with 37180 records. (active inventory that is available)
In my experience, active inventory, isn't the same as all parts.
Normally, in a query like this, I'd expect the first table to be a Parts Master table of some sort that contains every possible part ID.

Related

How to copy records from inter-linked tables to another in a different database?

I have 3 tables that are inter-linked between each other. The design of the tables are as below.
First (PK:FirstID, vchar:Name, int:Year)
Second (PK:SecondID, FK:FirstID, int:Day, int:Month)
Third (PK:ThirdID, FK:SecondID, int:Speed, vchar:Remark)
I'm trying to copy records from 3 inter-linked tables from Database A to Database B. So my Transact-SQL looks something like this:
INSERT INTO First
(Name, Year)
SELECT Name, Year
FROM DB_A.dbo.First
WHERE Year >= 1992
INSERT INTO Second
(FirstID, Day, Month)
SELECT FirstID, Day, Month
FROM DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID
WHERE Month > 6
INSERT INTO Third
(SecondID, Speed, Remark)
SELECT SecondID, Speed, Remark
FROM DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID
WHERE Remark <> NULL
These statements works all well and fine until the starting position of First.FirstID in Database A and B becomes not the same due to the three tables in Database B being empty. Hence, the constraint on foreign_key error is produced.
Possible Solutions
Reuse old First.FirstID One of the solution I have figured out is to use reuse the old First.FirstID from Database A. This can be done by setting SET IDENTITY_INSERT TableName ON just before the insert into TableName and including the TableName.TableNameID into the insert statement. However, I'm advised against doing this by my colleagues.
Overwrite Second.FirstID with new First.FirstID and subsequently, Third.SecondID with the new Second.SecondID I'm trying to apply this solution using OUTPUT and TABLE variable by outputting all First.FirstID into a temporary table variable and associate them with table Second similar to this answer However, I'm stuck on how to associate and replace the Second.FirstIDs with the correct IDs in the temporary table. An answer on how to do this would also be accepted as the answer for this question.
Using solution No. 1 and Update the primary and foreign keys using UPDATE CASCADE. I just got this idea but I have a feeling it will be very tedious. More research needs to be done but if there's an answer that shows how to implement this successfully, then I'll accept that answer.
So how do I copy records from 3 inter-linked tables to another 3 similar tables but different primary keys? Are there any better solutions than the ones proposed above?
You can use OUTPUT Clause.
CREATE TABLE #First (NewId INT PRIMARY KEY, OldId INT)
INSERT INTO First
(
Name,
Year,
OldId -- Added new column
)
OUTPUT Inserted.FirstID, Inserted.OldId INTO #First
SELECT
Name,
Year,
FirstID -- Old Id to OldId Column
FROM
DB_A.dbo.First
WHERE
Year >= 1992
Second Table
CREATE TABLE #Second (NewId INT PRIMARY KEY, OldId INT)
INSERT INTO Second
(
FirstID,
Day,
Month,
OldId -- Added new column
)
OUTPUT Inserted.SecondID, Inserted.OldId INTO #Second
SELECT
OF.NewId, --FirstID
Day,
Month,
SecondID
FROM
DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID INNER JOIN
#First OF ON F.FirstId = OF.OldId -- Old ids here
WHERE
Month > 6
Last one
INSERT INTO Third
(
SecondID,
Speed,
Remark
)
SELECT
OS.NewId, -- SecondID
Speed,
Remark
FROM
DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID INNER JOIN
#Second OS ON S.SecondID = OS.OldId
WHERE Remark <> NULL
First Solution
Using MERGE and OUTPUT together
OUTPUT combined with MERGE function has the ability to retrieve the old primary keys before inserting into the table.
Second Solution
NOTE: This solution only works if you are sure that you have another column that has its values unique in the table besides the table's primary key.
You may use this column as a link between the table in the source database and its sister table in the target database. The code below is an example taking into account that First.Name has unique values when month > 6.
-- no changes to insert code in First table
INSERT INTO First
(Name, Year)
SELECT Name, Year
FROM DB_A.dbo.First
WHERE Year >= 1992
INSERT INTO Second
(FirstID, Day, Month)
SELECT CurrentF.FirstID, Day, Month -- 2. Use the FirstID that has been input in First table
FROM DB_A.dbo.Second S INNER JOIN
DB_A.dbo.First F ON S.FirstID = F.FirstID INNER JOIN
First CurrentF ON CurrentF.Name = F.Name -- 1. Join Name as a link
WHERE Month > 6
INSERT INTO Third
(SecondID, Speed, Remark)
SELECT CurrentS.SecondID, Speed, Remark --5. Get the proper SecondID
FROM DB_A.dbo.Third T INNER JOIN
DB_A.dbo.Second S ON T.SecondID = S.SecondID INNER JOIN
DB_A.dbo.First F ON F.FirstID = S.FirstID INNER JOIN
First CurrentF ON CurrentF.Name = F.Name INNER JOIN -- 3. Join using Name as Link
Second CurrentS ON CurrentS.FirstID= CurrentF.FirstID -- 4. Link Second and First table to get the proper SecondID.
WHERE Remark <> NULL

Join on resultant table of another join without using subquery,CTE or temp tables

My question is can we join a table A to resultant table of inner join of table A and B without using subquery, CTE or temp tables ?
I am using SQL Server.
I will explain the situation with an example
The are two tables GoaLScorers and GoalScoredDetails.
GoaLScorers
gid Name
-----------
1 A
2 B
3 A
GoalScoredDetails
DetailId gid stadium goals Cards
---------------------------------------------
1 1 X 2 1
2 2 Y 5 2
3 3 Y 2 1
The result I am expecting is if I select a stadium 'X' (or 'Y')
I should get name of all who may or may not have scored there, also aggregate total number of goals,total cards.
Null value is acceptable for names if no goals or no cards.
I can get the result I am expecting with the below query
SELECT
gs.name,
SUM(goal) as TotalGoals,
SUM(cards) as TotalCards
FROM
(SELECT
gid, stadium, goal, cards
FROM
GoalScoredDetails
WHERE
stadium = 'Y') AS vtable
RIGHT OUTER JOIN
GoalScorers AS gs ON vtable.gid = gs.gid
GROUP BY
gs.name
My question is can we get the above result without using a subquery or CTE or temp table ?
Basically what we need to do is OUTER JOIN GoalScorers to resultant virtual table of INNER JOIN OF GoalScorers and GoalScoredDetails.
But I am always faced with ambiguous column name error as "gid" column is present in GoalScorers and also in resultant table. Error persists even if I try to use alias for column names.
I have created a sql fiddle for this her: http://sqlfiddle.com/#!3/40162/8
SELECT gs.name, SUM(gsd.goal) AS totalGoals, SUM(gsd.cards) AS totalCards
FROM GoalScorers gs
LEFT JOIN GoalScoredDetails gsd ON gsd.gid = gs.gid AND
gsd.Stadium = 'Y'
GROUP BY gs.name;
IOW, you could push your where criteria onto joining expression.
The error Ambiguous column name 'ColumnName' occurs when SQL Server encounters two or more columns with the same and it hasn't been told which to use. You can avoid the error by prefixing your column names with either the full table name, or an alias if provided. For the examples below use the following data:
Sample Data
DECLARE #GoalScorers TABLE
(
gid INT,
Name VARCHAR(1)
)
;
DECLARE #GoalScoredDetails TABLE
(
DetailId INT,
gid INT,
stadium VARCHAR(1),
goals INT,
Cards INT
)
;
INSERT INTO #GoalScorers
(
gid,
Name
)
VALUES
(1, 'A'),
(2, 'B'),
(3, 'A')
;
INSERT INTO #GoalScoredDetails
(
DetailId,
gid,
stadium,
goals,
Cards
)
VALUES
(1, 1, 'x', 2, 1),
(2, 2, 'y', 5, 2),
(3, 3, 'y', 2, 1)
;
In this first example we recieve the error. Why? Because there is more than one column called gid it cannot tell which to use.
Failed Example
SELECT
gid
FROM
#GoalScoredDetails AS gsd
RIGHT OUTER JOIN #GoalScorers as gs ON gs.gid = gsd.gid
;
This example works because we explicitly tell SQL which gid to return:
Working Example
SELECT
gs.gid
FROM
#GoalScoredDetails AS gsd
RIGHT OUTER JOIN #GoalScorers as gs ON gs.gid = gsd.gid
;
You can, of course, return both:
Example
SELECT
gs.gid,
gsd.gid
FROM
#GoalScoredDetails AS gsd
RIGHT OUTER JOIN #GoalScorers as gs ON gs.gid = gsd.gid
;
In multi table queries I would always recommend prefixing every column name with a table/alias name. This makes the query easier to follow, and reduces the likelihood of this sort of error.

SQL Inner join in a nested select statement

I'm trying to do an inner join in a nested select statement. Basically, There are first and last reason IDs that produce a certain number (EX: 200). In another table, there are definitions for the IDs. I'm trying to pull the Last ID, along with the corresponding comment for whatever is pulled (EX: 200 - Patient Cancelled), then the first ID and the comment for whatever ID it is.
This is what I have so far:
Select BUSN_ID
AREA_NAME
DATE
AREA_STATUS
(Select B.REASON_ID
A.LAST_REASON_ID
FROM BUSN_INFO A, BUSN_REASONS B
WHERE A.LAST_REASON _ID=B.REASON_ID,
(Select B.REASON_ID
A. FIRST_REASON_ID
FROM BUSN_INFO A, BUSN_REASONS B
WHERE A_FIRST_REASON_ID = B.REASON_ID)
FROM BUSN_INFO
I believe an inner join is best, but I'm stuck on how it would actually work.
Required result would look like (this is example dummy data):
First ID -- Busn Reason -- Last ID -- Busn Reason
1 Patient Sick 2 Patient Cancelled
2 Patient Cancelled 2 Patient Cancelled
3 Patient No Show 1 Patient Sick
Justin_Cave's SECOND example is the way I used to solve this problem.
If you want to use inline select statements, your inline select has to select a single column and should just join back to the table that is the basis of your query. In the query you posted, you're selecting the same numeric identifier multiple times. My guess is that you really want to query a string column from the lookup table-- I'll assume that the column is called reason_description
Select BUSN_ID,
AREA_NAME,
DATE,
AREA_STATUS,
a.last_reason_id,
(Select B.REASON_description
FROM BUSN_REASONS B
WHERE A.LAST_REASON_ID=B.REASON_ID),
a.first_reason_id,
(Select B.REASON_description
FROM BUSN_REASONS B
WHERE A.FIRST_REASON_ID = B.REASON_ID)
FROM BUSN_INFO A
More conventionally, though, you'd just join to the busn_reasons table twice
SELECT i.busn_id,
i.area_name,
i.date,
i.area_status,
i.last_reason_id,
last_reason.reason_description,
i.first_reason_id,
first_reason.reason_description
FROM busn_info i
JOIN busn_reason first_reason
ON( i.first_reason_id = first_reason.reason_id )
JOIN busn_reason last_reason
ON( i.last_reason_id = last_reason.reason_id )

SELECT Statement in CASE

Please don't downgrade this as it is bit complex for me to explain. I'm working on data migration so some of the structures look weird because it was designed by someone like that.
For ex, I have a table Person with PersonID and PersonName as columns. I have duplicates in the table.
I have Details table where I have PersonName stored in a column. This PersonName may or may not exist in the Person table. I need to retrieve PersonID from the matching records otherwise put some hardcode value in PersonID.
I can't write below query because PersonName is duplicated in Person Table, this join doubles the rows if there is a matching record due to join.
SELECT d.Fields, PersonID
FROM Details d
JOIN Person p ON d.PersonName = p.PersonName
The below query works but I don't know how to replace "NULL" with some value I want in place of NULL
SELECT d.Fields, (SELECT TOP 1 PersonID FROM Person where PersonName = d.PersonName )
FROM Details d
So, there are some PersonNames in the Details table which are not existent in Person table. How do I write CASE WHEN in this case?
I tried below but it didn't work
SELECT d.Fields,
CASE WHEN (SELECT TOP 1 PersonID
FROM Person
WHERE PersonName = d.PersonName) = null
THEN 123
ELSE (SELECT TOP 1 PersonID
FROM Person
WHERE PersonName = d.PersonName) END Name
FROM Details d
This query is still showing the same output as 2nd query. Please advise me on this. Let me know, if I'm unclear anywhere. Thanks
well.. I figured I can put ISNULL on top of SELECT to make it work.
SELECT d.Fields,
ISNULL(SELECT TOP 1 p.PersonID
FROM Person p where p.PersonName = d.PersonName, 124) id
FROM Details d
A simple left outer join to pull back all persons with an optional match on the details table should work with a case statement to get your desired result.
SELECT
*
FROM
(
SELECT
Instance=ROW_NUMBER() OVER (PARTITION BY PersonName),
PersonID=CASE WHEN d.PersonName IS NULL THEN 'XXXX' ELSE p.PersonID END,
d.Fields
FROM
Person p
LEFT OUTER JOIN Details d on d.PersonName=p.PersonName
)AS X
WHERE
Instance=1
Ooh goody, a chance to use two LEFT JOINs. The first will list the IDs where they exist, and insert a default otherwise; the second will eliminate the duplicates.
SELECT d.Fields, ISNULL(p1.PersonID, 123)
FROM Details d
LEFT JOIN Person p1 ON d.PersonName = p1.PersonName
LEFT JOIN Person p2 ON p2.PersonName = p1.PersonName
AND p2.PersonID < p1.PersonID
WHERE p2.PersonID IS NULL
You could use common table expressions to build up the missing datasets, i.e. your complete Person table, then join that to your Detail table as follows;
declare #n int;
-- set your default PersonID here;
set #n = 123;
-- Make sure previous SQL statement is terminated with semilcolon for with clause to parse successfully.
-- First build our unique list of names from table Detail.
with cteUniqueDetailPerson
(
[PersonName]
)
as
(
select distinct [PersonName]
from [Details]
)
-- Second get unique Person entries and record the most recent PersonID value as the active Person.
, cteUniquePersonPerson
(
[PersonID]
, [PersonName]
)
as
(
select
max([PersonID]) -- if you wanted the original Person record instead of the last, change this to min.
, [PersonName]
from [Person]
group by [PersonName]
)
-- Third join unique datasets to get the PersonID when there is a match, otherwise use our default id #n.
-- NB, this would also include records when a Person exists with no Detail rows (they are filtered out with the final inner join)
, cteSudoPerson
(
[PersonID]
, [PersonName]
)
as
(
select
coalesce(upp.[PersonID],#n) as [PersonID]
coalesce(upp.[PersonName],udp.[PersonName]) as [PersonName]
from cteUniquePersonPerson upp
full outer join cteUniqueDetailPerson udp
on udp.[PersonName] = p.[PersonName]
)
-- Fourth, join detail to the sudo person table that includes either the original ID or our default ID.
select
d.[Fields]
, sp.[PersonID]
from [Details] d
inner join cteSudoPerson sp
on sp.[PersonName] = d.[PersonName];

SQL joins with multiple records into one with a default

My 'people' table has one row per person, and that person has a division (not unique) and a company (not unique).
I need to join people to p_features, c_features, d_features on:
people.person=p_features.num_value
people.division=d_features.num_value
people.company=c_features.num_value
... in a way that if there is a record match in p_features/d_features/c_features only, it would be returned, but if it was in 2 or 3 of the tables, the most specific record would be returned.
From my test data below, for example, query for person=1 would return
'FALSE'
person 3 returns maybe, person 4 returns true, and person 9 returns default
The biggest issue is that there are 100 features and I have queries that need to return all of them in one row. My previous attempt was a function which queried on feature,num_value in each table and did a foreach, but 100 features * 4 tables meant 400 reads and it brought the database to a halt it was so slow when I loaded up a few million rows of data.
create table p_features (
num_value int8,
feature varchar(20),
feature_value varchar(128)
);
create table c_features (
num_value int8,
feature varchar(20),
feature_value varchar(128)
);
create table d_features (
num_value int8,
feature varchar(20),
feature_value varchar(128)
);
create table default_features (
feature varchar(20),
feature_value varchar(128)
);
create table people (
person int8 not null,
division int8 not null,
company int8 not null
);
insert into people values (4,5,6);
insert into people values (3,5,6);
insert into people values (1,2,6);
insert into p_features values (4,'WEARING PANTS','TRUE');
insert into c_features values (6,'WEARING PANTS','FALSE');
insert into d_features values (5,'WEARING PANTS','MAYBE');
insert into default_features values('WEARING PANTS','DEFAULT');
You need to transpose the features into rows with a ranking. Here I used a common-table expression. If your database product does not support them, you can use temporary tables to achieve the same effect.
;With RankedFeatures As
(
Select 1 As FeatureRank, P.person, PF.feature, PF.feature_value
From people As P
Join p_features As PF
On PF.num_value = P.person
Union All
Select 2, P.person, PF.feature, PF.feature_value
From people As P
Join d_features As PF
On PF.num_value = P.division
Union All
Select 3, P.person, PF.feature, PF.feature_value
From people As P
Join c_features As PF
On PF.num_value = P.company
Union All
Select 4, P.person, DF.feature, DF.feature_value
From people As P
Cross Join default_features As DF
)
, HighestRankedFeature As
(
Select Min(FeatureRank) As FeatureRank, person
From RankedFeatures
Group By person
)
Select RF.person, RF.FeatureRank, RF.feature, RF.feature_value
From people As P
Join HighestRankedFeature As HRF
On HRF.person = P.person
Join RankedFeatures As RF
On RF.FeatureRank = HRF.FeatureRank
And RF.person = P.person
Order By P.person
I don't know if I had understood very well your question, but to use JOIN, you need your table loaded already and then use the SELECT statement with INNER JOIN, LEFT JOIN or whatever you need to show.
If you post some more information, maybe turn it easy to understand.
There are some aspects of your schema I'm not understanding, like how to relate to the default_features table if there's no match in any of the specific tables. The only possible join condition is on feature, but if there's no match in the other 3 tables, there's no value to join on. So, in my example, I've hard-coded the DEFAULT since I can't think of how else to get it.
Hopefully this can get you started and if you can clarify the model a bit more, the solution can be refined.
select p.person, coalesce(pf.feature_value, df.feature_value, cf.feature_value, 'DEFAULT')
from people p
left join p_features pf
on p.person = pf.num_value
left join d_features df
on p.division = df.num_value
left join c_features cf
on p.company = cf.num_value