Conditionally Replace Default Null Values from Left Join

Conditionally Replace Default Null Values from Left Join - sql

I am joining 3 tables. In tables 2 and 3, records do not exist for some of the IDs in table 1. This of course, yields null values for these instances in my joined table.
I know I can replace these nulls with coalesce, but I don't know how to replace the nulls conditionally. In my reprex below, I would like to replace a null with an ID's max total (if an ID has a non-null total under a different job title) or replace a null with 0 (if an ID has null totals under both of their job titles).
Reprex code below:
/*sample table a (contains full data - 2 records for each ID)*/
data table_a;
input id title $ region $ calls;
cards;
1 manager south 30
1 agent north 20
2 manager west 20
2 agent south 25
;
run;
/*sample table b (missing an agent record for ID 1 -- will result in null total sales for ID 1's agent record in joined table)*/
data table_b;
input id title $ sales;
cards;
1 manager 20
2 manager 5
2 agent 3
;
run;
/*sample table c (missing both records for ID 2 - will result in null total_leads for ID 2 in joined table)*/
data table_c;
input id title $ leads;
cards;
1 manager .
1 agent 10
;
run;
/*join tables*/
proc sql;
create table reprex as
select a.id,
a.region,
a.calls,
a.title,
coalesce(b.total_sales, 0) as total_sales, /*this replaces all nulls as 0, but I'd like to replace
b.sales, them conditionally */
coalesce(c.total_leads, 0) as total_leads,
c.leads
from table_a as a
left join (select sum(coalesce(sales, 0)) as total_sales, sales, id, title from table_b group by id) b
on a.id = b.id and a.title = b.title
left join (select sum(coalesce(leads, 0)) as total_leads, leads, id, title from table_c group by id) c
on a.id = c.id and a.title = c.title;
quit;

Ended up nesting subqueries to get what I was after:
proc sql;
create table reprex as
select id, region, title,
calls,
case
when max_total_calls is null then 0
else max_total_calls
end as total_calls,
sales,
case
when max_total_sales is null then 0
else max_total_sales
end as total_sales,
leads,
case
when max_total_leads is null then 0
else max_total_leads
end as total_leads
from
(select *,
max(total_sales) as max_total_sales,
max(total_leads) as max_total_leads,
max(total_calls) as max_total_calls
from
(select a.id, a.region, a.calls, a2.total_calls, a.title,
b.sales, c.total_leads, c.leads
from table_a as a
left join (select sum(coalesce(calls, 0)) as total_calls, calls, id, title from table_a group by id) a2
on a.id = a2.id and a.title = a2.title
left join (select sum(coalesce(sales, 0)) as total_sales, max(coalesce(sales,0)) as max_sales, sales, id, title from table_b group by id) b
on a.id = b.id and a.title = b.title
left join (select sum(coalesce(leads, 0)) as total_leads, leads, id, title from table_c group by id) c
on a.id = c.id and a.title = c.title)
group by id);
quit;

Related

SQL - Unique results in column A based on a specific value in column B being the most frequent value

So I have the following challenge:
I'm trying to get unique results from all the clients (Column A) that made most of their purchases at store 103 (Column B).
The store is defined in the first 3 digits of the ticket number. The challenge is that I'm also getting every ticket for each client. And I just need SQL to calculate and filter the results, based on all the unique clients that made most of their purchases at store 103.
The information in Column A comes from Table 1 and the information in column B comes from Table 2.
Example
I've been trying the following:
SELECT DISTINCT Table_1.Full_Name, Table_2.Ticket_#
FROM Table_2
LEFT OUTER JOIN Table_1
ON Table_2.Customer_Number = Table_1.Customer_Number;
I know I'm missing either the group by or order by keywords, but I don't know how to use them properly in this particular case.
Thank you very much in advance.

Here are three options.
SELECT customers.Full_Name, tickets."Ticket_#"
FROM Table_2 tickets INNER JOIN Table_1 customers
ON customers.Customer_Number = tickets.Customer_Number INNER JOIN
(
SELECT Customer_Number
FROM Table_2 tickets
GROUP BY Customer_Number
HAVING COUNT(CASE WHEN LEFT("Ticket_#", 3) = '103' then 1 end)
> COUNT(CASE WHEN LEFT("Ticket_#", 3) <> '103' then 1 end)
) AS m ON m.Customer_Number = customers.Customer_Number
SELECT customers.Full_Name, tickets."Ticket_#"
FROM Table_2 tickets INNER JOIN Table_1 customers
ON customers.Customer_Number = tickets.Customer_Number
WHERE customers.Customer_Number IN (
SELECT Customer_Number
FROM Table2 tickets
WHERE "Ticket_#" LIKE '103%'
GROUP BY Customer_Number
HAVING COUNT(*) > (
SELECT COUNT(*)
FROM Table2 tickets2
WHERE tickets2.Customer_Number = tickets.Customer_Number
AND NOT "Ticket_#" LIKE '103%'
)
)
WITH data AS (
SELECT customers.Full_Name, tickets."Ticket_#"
COUNT(CASE WHEN LEFT(tickets."Ticket_#", 3) = '103' then 1 end)
OVER (PARTITION BY customers.Customer_Number) AS MatchCount
COUNT(CASE WHEN LEFT(tickets."Ticket_#", 3) <> '103' then 1 end)
OVER (PARTITION BY customers.Customer_Number) AS NonmatchCount
FROM Table_2 tickets INNER JOIN Table_1 customers
ON customers.Customer_Number = tickets.Customer_Number
)
SELECT * FROM data WHERE MatchCount > NonmatchCount;

Limit result from inner join query to 2 rows

My query is giving me result from grouped data but now I want only two rows
I have tried HAVING COUNT(*) <= 2 but issue is is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
my query is
select f.CompanyName, f.EmployeeCity, f.PrioritySL ,f.EmployeeSeniorityLevel ,f.EmployeeID
from (
select ConcatKey, min(PrioritySL) as PSL
from dbo.WalkerItContacts group by ConcatKey
) as x inner join dbo.WalkerItContacts as f on f.ConcatKey = x.ConcatKey and f.PrioritySL = x.PSL
where f.PrioritySL != '10'
Company apple have 9 records I want only 2 records
my data
company name priority
a 10
a 1
a 3
b 2
b 4
b 3
b 5
c 1
c 10
c 2
my expected data
company name priority
a 1
a 3
b 2
b 3
c 1
c 2

Add a 'top 2' clause to the outer query:
select top 2 f.CompanyName, f.EmployeeCity, f.PrioritySL ,f.EmployeeSeniorityLevel ,f.EmployeeID
from (
select ConcatKey, min(PrioritySL) as PSL
from dbo.WalkerItContacts group by ConcatKey
) as x inner join dbo.WalkerItContacts as f on f.ConcatKey = x.ConcatKey and f.PrioritySL = x.PSL
where f.PrioritySL != '10'
and f.CompanyName= 'Apple'
will give you two rows. Add a order clause by in the outer query so you can control which two rows are returned.

You can phrase this more succinctly and with better performance as:
select top (2) wic.*
from (select wic,
rank() over (partition by CompanyName, ConcatKey order by PrioritySL) as seqnum
from dbo.WalkerItContacts wic
) wic
where seqnum = 1 and
wic.PrioritySL <> 10 and
wic.CompanyName = 'Apple';

I think you could solve your problem using the ROW_NUMBER() function to count the rows and filter it in the WHERE clause to only show 2 rows per group.
I think something like this might work for you:
SELECT rownum, f.CompanyName, f.EmployeeCity, f.PrioritySL,
f.EmployeeSeniorityLevel, f.EmployeeID
FROM ( SELECT ConcatKey, MIN(PrioritySL) AS PSL, ROW_NUMBER() OVER(PARTITION BY
f.CompanyName) AS rownum
FROM dbo.WalkerItContacts
GROUP BY ConcatKey) AS x
INNER JOIN dbo.WalkerItContacts AS f ON f.ConcatKey = x.ConcatKey
AND f.PrioritySL = x.PSL
WHERE f.PrioritySL != '10' AND rownum <= 2
ORDER BY f.CompanyName ASC;
Hope this helps some.

count after join on multiple tables and count of multiple column values

Please help me with below problem.
table 1 employee details
emp name empno.
---------------------------------
John 1234
Joe 6789
table 2 employee assignment
empno assignmentstartdate assignmentenddate assignmentID empassignmentID
-----------------------------------------------------------------------------
1234 01JAN2017 02JAN2017 A1 X1
6789 01jan2017 02JAN2017 B1 Z1
table 3 employee assignment property
empassignmentID assignmentID propertyname propertyvalue
-------------------------------------------------------------------
X1 A1 COMPLETED true
X1 A1 STARTED true
Z1 B1 STARTED true
Z1 B1 COMPLETED false
Result wanted: (count of completed and started for each employee)
emp name emp no. COMPLETED STARTED
------------------------------------------
John 1234 1 1
Joe 6789 0 1
Currently with my query it is not putting count correctly for propertyvalue if I run for one employee it works correctly but not for multiple employees.
Please help.
SELECT empno ,
empname ,
(SELECT COUNT(A.propertyvalue)
FROM employeedetails C ,
employees_ASSIGNMENT RCA,
employee_assignment_property A
WHERE TRUNC(startdate) >= '14jun2017'
AND TRUNC(endate) <= '20jun2017'
AND RCA.empno = C.empno
AND RCA.empassignmetid = A.empassignmetid
AND rca.EMPNO IN ('1234','6789')
AND RCA.assignmentid = A.assignmentid
AND A.Name = 'COMPLETED'
AND A.propertyvalue = 'true') ,
(SELECT COUNT(A.propertyvalue)
FROM employeedetails C ,
employees_ASSIGNMENT RCA,
employee_assignment_property A
WHERE TRUNC(startdate) >= '14jun2017'
AND TRUNC(endate) <= '20jun2017'
AND RCA.empno = C.empno
AND RCA.empassignmetid = A.empassignmetid
AND rca.EMPNO IN ('1234','6789')
AND RCA.assignmentid = A.assignmentid
AND A.Name = 'STARTED'
AND A.propertyvalue = 'true')FROM employeedetails WHERE EMPNO IN
('1234','6789') GROUP BY C.empno ,
C.EMPNAME

I think you are simply looking for this:
SELECT DET.empname
, COUNT(CASE WHEN PROP.propertyname = 'COMPLETED' THEN 1 END) COMP_COUNT
, COUNT(CASE WHEN PROP.propertyname = 'STARTED' THEN 1 END) START_COUNT
FROM employeedetails DET
INNER JOIN employees_ASSIGNMENT ASS
ON ASS.empno = DET.empno
INNER JOIN employee_assignment_property PROP
ON PROP.empassignmentID = ASS.empassignmentID
AND PROP.assignmentID = ASS.assignmentID
GROUP BY DET.empname
Just add a WHERE clause if you need one.

if you want you result as a query without CTEs this should work:
select empName,
empNo,
(select employee_details.empNo, count(employee_assignment.assId)
from employee_details as t1
join employee_assignment on (t1.empno = employee_assignment.empno)
join employee_assignment_property on (employee_assignment.assId = employee_assignment_property.assId)
where employee_assignment.ptop = 'COMPLETED'
and t.empNo = t1.empNo
group by t1.empNo ) as [COMPLETED],
(select employee_details.empNo, count(employee_assignment.assId)
from employee_details as t1
join employee_assignment on (t1.empno = employee_assignment.empno)
join employee_assignment_property on (employee_assignment.assId = employee_assignment_property.assId)
where employee_assignment.ptop = 'STARTED'
and t.empNo = t1.empNo
group by t1.empNo ) as [STARTED],
from employee_details as t

If you don't want to do a dirty query composed of subqueries, you can try creating a view (if your database permits it).
What does it mean : I'll be useless in front of this. In summary, a view is a temporary table.
Hope this helps

this should work using CTEs:
Using Common Table Expressions
with numComplet()
as
(
select tbl1.empNo, count(tbl2.assId)
from tbl1
join tbl2 on (tbl1.empno = tbl2.empno)
join tbl3 on (tbl2.assId = tbl3.assId)
where tbl2.ptop = 'COMPLETED'
group by tbl1.empNo
),
with numStarted()
as
(
select tbl1.empNo, count(tbl2.assId)
from tbl1
join tbl2 on (tbl1.empno = tbl2.empno)
join tbl3 on (tbl2.assId = tbl3.assId)
where tbl2.ptop = 'STARTED'
group by tbl1.empNo
)
select *
from tbl1
join numComplet on (tbl1.empNo = numComplet.empNo)
join numStarted on (tbl1.empNo = numStarted.empNo)
I put down table names as tbl[1|2|3]

SQL join query about 2 tables

I have a table A
Trade# Trade_DT
1 08/10/2013
2 08/20/2013
and table B
BaseRate EffectiveDT
1.5 08/01/2013
2.0 08/15/2013
3.0 08/25/2013
I want to have a join such that i get the EffectiveDT after the TradeDT
Trade# Trade_DT BaseRate EffectiveDT
1 08/10/2013 2.0 08/15/2013
2 08/20/2013 3.0 08/25/2013

I am guessing that you want the earliest effective date after the Trade_Dt. The following will work in any SQL dialect:
select a.*,
(select min(EffectiveDt)
from b
where b.EffectiveDt > a.TradeDt
) as EffectiveDt
from a;
EDIT:
To get all the values from the b table requires just joining the table back in:
select t.Trade#, t.Trade_DT, b.BaseRate, b.EffectiveDt
from (select a.*,
(select min(EffectiveDt)
from b
where b.EffectiveDt > a.TradeDt
) as EffectiveDt
from a
) t join
b
on a.EffectiveDt = b.EffectiveDt;

in Oracle you can use this:
select distinct Trade,
Trade_DT,
FIRST_VALUE(BaseRate) Over (partition by Trade order by EffectiveDT desc) BaseRate,
FIRST_VALUE(EffectiveDT) Over (partition by Trade order by EffectiveDT desc) EffectiveDT
from tableA
inner join tableB
on tableA.Trade_DT >= tableB.EffectiveDT
And here a demo in SQLFiddle.

This code uses only standard SQL:
SELECT a.*, b.*
FROM TableA a
JOIN TableB b ON b.EffectiveDT =
(SELECT MIN(EffectiveDT)
FROM TableB b1
WHERE a.TradeDT < b1.EffectiveDT)

SQL Dynamic join?

Please see http://sqlfiddle.com/#!3/2506f/2/0
I have two tables. One is a general record, and the other is a table containing related documents that link to that record.
In my example I've created a straightforward query which shows all records and their associated documents. This is fine, but I want a more complex situation.
In the 'mainrecord' table there is a 'multiple' field. If this is 0, then I only want the most recent document from the documents table (that is, with the highest ID). If it is 1, I want to join all linked documents.
So, rather than the result of the query being this:-
ID NAME MULTIPLE DOCUMENTNAME IDLINK
1 One 1 first document 1
1 One 1 second document 1
2 Two 0 third document 2
2 Two 0 fourth document 2
3 Three 1 fifth document 3
3 Three 1 sixth document 3
It should look like this:-
ID NAME MULTIPLE DOCUMENTNAME IDLINK
1 One 1 first document 1
1 One 1 second document 1
2 Two 0 fourth document 2
3 Three 1 fifth document 3
3 Three 1 sixth document 3
Is there a way of including this condition into my query to get the results I'm after. I'm happy to explain further if needed.
Thanks in advance.

WITH myData
AS
(SELECT mainrecord.*, documentlinks.documentName, documentlinks.idlink,
Row_number()
OVER (
partition BY mainrecord.ID
ORDER BY mainrecord.ID ASC) AS ROWNUM
FROM mainrecord INNER JOIN documentlinks
ON mainrecord.id = documentlinks.idlink)
SELECT *
FROM mydata o
WHERE multiple = 0 AND rownum =
(SELECT max(rownum) FROM mydata i WHERE i.id = o.id)
UNION
SELECT *
FROM myData
WHERE multiple = 1
http://sqlfiddle.com/#!3/2506f/57

Another solution (tested at SQL-Fiddle):
SELECT m.*,
d.id as did, d.documentName, d.IDLink
FROM mainrecord AS m
JOIN documentlinks AS d
ON d.IDLink = m.id
AND m.multiple = 1
UNION ALL
SELECT m.*,
d.id as did, d.documentName, d.IDLink
FROM mainrecord AS m
JOIN
( SELECT d.IDLink
, MAX(d.id) AS did
FROM mainrecord AS m
JOIN documentlinks AS d
ON d.IDLink = m.id
AND m.multiple = 0
GROUP BY d.IDLink
) AS g
ON g.IDLink = m.id
JOIN documentlinks AS d
ON d.id = g.did
ORDER BY id, did ;

This will probably do:
SELECT mainrecord.name, documentlinks.documentname
FROM documentlinks
INNER JOIN mainrecord ON mainrecord.id = documentlinks.IDLink AND multiple = 1
UNION
SELECT mainrecord.name, documentlinks.documentname
FROM (SELECT max(id) id, IDLink FROM documentlinks group by IDLink) maxdocuments
INNER JOIN documentlinks ON documentlinks.id = maxdocuments.id
INNER JOIN mainrecord ON mainrecord.id = documentlinks.IDLink AND multiple = 0

How about this:
select * from mainrecord a inner join documentlinks b on a.Id=b.IDLink
where b.id=(case
when a.multiple=1 then b.id
else (select max(id) from documentlinks c where c.IDLink=b.IDLink) end)

SQL FIDDLE
select
m.ID, m.name, m.multiple, dl.idlink,
dl.documentName
from mainrecord as m
left outer join documentlinks as dl on dl.IDlink = m.id
where
m.multiple = 1 or
not exists (select * from documentlinks as t where t.idlink = m.id and t.id < dl.id)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Conditionally Replace Default Null Values from Left Join - sql

Related

SQL - Unique results in column A based on a specific value in column B being the most frequent value

Limit result from inner join query to 2 rows

count after join on multiple tables and count of multiple column values

SQL join query about 2 tables

SQL Dynamic join?

Categories

Resources