Interesting and challenging self join problem (SQL Server 2005)

I have this table named OrdersToCall
Data types: All bigints, except for date which is a datetime
|-Order Num-|----Date--- |- Primary Ph -| Secondary Ph | Alternate Ph
|----101----| 02-07-2010 | 925-515-1234 | 916-515-1234 | 707-568-5778
|----102----| 02-07-2010 | 925-888-4141 | 925-888-4141 | 000-000-0000
|----103----| 02-07-2010 | 000-000-0000 | 000-000-0000 | 510-555-4575
|----104----| 02-07-2010 | 415-789-5454 | 415-707-5588 | 735-874-9566
|----105----| 02-07-2010 | 925-887-7979 | 925-887-7979 | 925-887-7979
and I have another table named PhoneNumCalled
|-AgentID-|----Date----|-Dialed Number|
|-145564--| 02-07-2010 | 925-515-1234 |
|-145564--| 02-07-2010 | 707-568-5778 |
|-145566--| 02-07-2010 | 925-888-4141 |
|-145567--| 02-07-2010 | 510-555-4575 |
|-145568--| 02-07-2010 | 415-789-5454 |
|-145568--| 02-07-2010 | 415-707-5588 |
|-145568--| 02-07-2010 | 735-874-9566 |
|-145570--| 02-07-2010 | 925-887-7979 |
|-145570--| 02-07-2010 | 925-887-7979 |
Now my challenge is: I want to count how many orders (Order Num) each agent called and create a table from the results.
For example, if agent 1234 called all 3 numbers on 1 order, that still counts as only 1 order for that agent. The ratio is 1:1: once any phone number on an order is called, it counts as 1 order. An agent only has to call one of the phone numbers to get credit for the order, no matter whether all 3 were called.
In less than 3 months I already have almost half a million records, so please be as space-conscious as possible.
My solution (Which I wish to revise with your help):
I ended up creating a stored procedure which:
--Delete and recreate the CombinedData table created yesterday
Insert into the CombinedData table
Select Order Num, Date, Primary Ph as Phone
from OrdersToCall
Union
Select Order Num, Date, Secondary Ph as Phone
from OrdersToCall
Union
Select Order Num, Date, Alternate Ph as Phone
from OrdersToCall
Delete from the CombinedData table
where phone in ('000-000-0000', '999-999-9999')
Not only does this create a new table, but since each phone number in each order is now its own row, the table becomes HUGE and takes up to 2 minutes to create.
Then from this table I derive the counts and store those in yet another table.

I think this is what you're looking for:
SELECT c.AgentId, COUNT(DISTINCT o.[Order Num]) AS [Orders per Agent]
FROM OrdersToCall o
JOIN PhoneNumCalled c ON c.[Dialed Number] = o.[Primary Ph]
OR c.[Dialed Number] = o.[Secondary Ph]
OR c.[Dialed Number] = o.[Alternate Ph]
GROUP BY c.AgentId
If you want to know how many orders were called on each date, join on the date as well:
SELECT c.AgentId, c.Date, COUNT(DISTINCT o.[Order Num]) AS [Orders per Agent]
FROM OrdersToCall o
JOIN PhoneNumCalled c ON (c.[Dialed Number] = o.[Primary Ph]
OR c.[Dialed Number] = o.[Secondary Ph]
OR c.[Dialed Number] = o.[Alternate Ph])
AND o.Date = c.Date
GROUP BY c.AgentId, c.Date
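The question also asks for the counts to be stored in a table without building the large intermediate CombinedData table. A minimal sketch of that, assuming SQL Server 2005 syntax: the target table name dbo.AgentOrderCounts and the column alias OrdersCalled are hypothetical, and the placeholder numbers are copied from the DELETE in the question (adjust the literals if the phone columns really are bigint).

```sql
-- Hypothetical summary table; drop yesterday's copy if it exists.
IF OBJECT_ID('dbo.AgentOrderCounts', 'U') IS NOT NULL
    DROP TABLE dbo.AgentOrderCounts;

-- One row per agent and date; an order is counted once
-- no matter how many of its three numbers were dialed.
SELECT c.AgentID,
       c.[Date],
       COUNT(DISTINCT o.[Order Num]) AS OrdersCalled
INTO dbo.AgentOrderCounts
FROM PhoneNumCalled c
JOIN OrdersToCall o
  ON o.[Date] = c.[Date]
 AND c.[Dialed Number] IN (o.[Primary Ph], o.[Secondary Ph], o.[Alternate Ph])
-- Placeholder numbers taken from the question; they never earn credit.
WHERE c.[Dialed Number] NOT IN ('000-000-0000', '999-999-9999')
GROUP BY c.AgentID, c.[Date];
```

Because only the aggregated rows are written, the stored table stays small: at most one row per agent per day.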

Related

Complex SQL Joins with Where Clauses

Being pretty new to SQL, I ask for your patience. I have been banging my head trying to figure out how to create this VIEW by joining 3 tables. I am going to use mock tables to keep this very simple, so that I can try to understand the answer and not just copy and paste.
ICS_Supplies:
Supplies_ID Item_Description
-------------------------------------
1 | PaperClips
2 | Rubber Bands
3 | Stamps
4 | Staples
ICS_Orders:
ID SuppliesID RequisitionNumber
----------------------------------------------------
1 | 1 | R1234a
6 | 4 | R1234a
2 | 1 | P2345b
3 | 2 | P3456c
4 | 3 | R4567d
5 | 4 | P5678e
ICS_Transactions:
ID RequsitionNumber OrigDate TransType OpenClosed
------------------------------------------------------------------
1 | R1234a | 06/12/20 | Req | Open
2 | P2345b | 07/09/20 | PO | Open
3 | P3456c | 07/14/20 | PO | Closed
4 | R4567d | 08/22/20 | Req | Open
5 | P5678e | 11/11/20 | PO | Open
And this is what I want to see in my View Results
Supplies_ID Item RequsitionNumber OriginalDate TransType OpenClosed
---------------------------------------------------------------------------------------
1 | Paper Clips | P2345b | 07/09/20 | PO | OPEN
2 | Rubber Bands | Null | Null | Null | Null
3 | Stamps | Null | Null | Null | Null
4 | Staples | P5678e | 11/11/20 | PO | OPEN
I just can't get there. I always want the same number of records as in the ICS_Supplies table. I need to join to the ICS_Orders table to get the requisition number, because that is what I need to join on the ICS_Transactions table. I don't want to see data in the newly added fields UNLESS ICS_Transactions.TransType = 'PO' AND ICS_Transactions.OpenClosed = 'OPEN'; otherwise the joined fields should be null, regardless of what they contain. Is that possible?
My research shows this is probably a LEFT JOIN, which is very new to me. I made many attempts on my own and then posted my question yesterday, but I was struggling to ask the correct question, and other members recommended that I post the question again.
If needed, I can share what I have done, but I fear it will make things overly confusing as I was going in the wrong direction.
I am adding a link to the original question, for those that need some background info
Original Question
If there is any additional information needed, just ask. I do apologize in advance if I have left out any needed details.
This is a bit tricky, because you want to exclude rows in the second table depending on whether there is a match in the third table - so two left joins are not what you are after.
I think this implements the logic you want:
select s.Supplies_ID, s.Item_Description,
       t.RequisitionNumber, t.OrigDate, t.TransType, t.OpenClosed
from ICS_Supplies s
left join ICS_Transactions t
    on t.TransType = 'PO'
    and t.OpenClosed = 'Open'
    and exists (
        select 1
        from ICS_Orders o
        where o.SuppliesID = s.Supplies_ID
          and o.RequisitionNumber = t.RequisitionNumber
    )
Another way to phrase this would be an inner join in a subquery, then a left join:
select s.Supplies_ID, s.Item_Description,
       t.RequisitionNumber, t.OrigDate, t.TransType, t.OpenClosed
from ICS_Supplies s
left join (
    select o.SuppliesID, t.*
    from ICS_Orders o
    inner join ICS_Transactions t
        on t.RequisitionNumber = o.RequisitionNumber
    where t.TransType = 'PO' and t.OpenClosed = 'Open'
) t on t.SuppliesID = s.Supplies_ID
This query should return the data for supplies. The left joins keep every row from ICS_Supplies and return NULL in the order and transaction columns where there is no match. The TransType and OpenClosed filters go in the ON clause; in a WHERE clause they would discard the NULL rows and turn the left join into an inner join.
select
    s.Supplies_ID
    ,s.Item_Description as [Item]
    ,t.RequisitionNumber
    ,t.OrigDate as [OriginalDate]
    ,t.TransType
    ,t.OpenClosed
from ICS_Supplies s
left join ICS_Orders o on o.SuppliesID = s.Supplies_ID
left join ICS_Transactions t on t.RequisitionNumber = o.RequisitionNumber
    and t.TransType = 'PO'
    and t.OpenClosed = 'Open'
Columns from the joined tables automatically show NULL when no matching record exists. For example, if there is no matching row in ICS_Transactions for a supply, the transaction columns return NULL.
Modify your query, run it, then maybe update your question using real examples if it's possible.
In the original question you wrote:
"I only need ONE matching record from the ICS_Transactions Table.
Ideally, the one that I want is the most current
'ICS_Transactions.OriginalDate'."
So the goal is to get the most recent transaction for which TransType is 'PO' and OpenClosed is 'Open'. That is the purpose of the CTE 'oa_cte' in this code. The appropriate transactions are then LEFT JOINed on the supplies ID. Something like this:
with oa_cte(SuppliesId, RequisitionNumber, OriginalDate,
            TransType, OpenClosed, RowNum) as (
    select o.SuppliesID, o.RequisitionNumber,
           t.OrigDate, t.TransType, t.OpenClosed,
           row_number() over (partition by o.SuppliesID
                              order by t.OrigDate desc)
    from ICS_Orders o
    join ICS_Transactions t on o.RequisitionNumber = t.RequisitionNumber
    where t.TransType = 'PO'
      and t.OpenClosed = 'Open')
select s.*, oa.*
from ICS_Supplies s
left join oa_cte oa on s.Supplies_ID = oa.SuppliesId
                   and oa.RowNum = 1;

SQL - BigQuery - How do I fill in dates from a calendar table?

My goal is to join a sales program table to a calendar table so that there would be a joined table with the full trailing 52 weeks by day, and then the sales data would be joined to it. The idea would be that there are nulls I could COALESCE after the fact. However, my problem is that I only get results without nulls from my sales data table.
The questions I've consulted so far are:
Join to Calendar Table - 5 Business Days
Joining missing dates from calendar table Which points to
MySQL how to fill missing dates in range?
My Calendar table is all 364 days previous to today (today being day 0). And the sales data has a program field, a store field, and then a start date and an end date for the program.
Here's what I have coded:
SELECT
CAL.DATE,
CAL.DAY,
SALES.ITEM,
SALES.PROGRAM,
SALES.SALE_DT,
SALES.EFF_BGN_DT,
SALES.EFF_END_DT
FROM
CALENDAR_TABLE AS CAL
LEFT JOIN
SALES_TABLE AS SALES
ON CAL.DATE = SALES.SALE_DT
WHERE 1=1
and SALES.ITEM = 1 or SALES.ITEM is null
ORDER BY DATE ASC
What I expected was 365 records with dates where there were nulls and dates where there were filled in records. My query resulted in a few dates with null values but otherwise just the dates where a program exists.
DATE | ITEM | PROGRAM | SALE_DT | PRGM_BGN | PRGM_END |
----------|--------|---------|----------|-----------|-----------|
8/27/2020 | | | | | |
8/26/2020 | | | | | |
8/25/2020 | | | | | |
8/24/2020 | | | | | |
6/7/2020 | 1 | 5 | 6/7/2020 | 2/13/2016 | 6/7/2020 |
6/6/2020 | 1 | 5 | 6/6/2020 | 2/13/2016 | 6/7/2020 |
6/5/2020 | 1 | 5 | 6/5/2020 | 2/13/2016 | 6/7/2020 |
6/4/2020 | 1 | 5 | 6/4/2020 | 2/13/2016 | 6/7/2020 |
Date = Calendar day.
Item = Item number being sold.
Program = Unique numeric ID of program.
Sale_Dt = Field populated if at least one item was sold under this program.
Prgm_bgn = First day when item was eligible to be sold under this program.
Prgm_end = Last day when item was eligible to be sold under this program.
I expected records for every day between June 7 and August 24, with only the DATE column populated and nulls elsewhere, as in the four most recent records.
I'm trying to understand why a calendar table and what I've written are not providing the in-between dates.
EDIT: I've removed the request for feedback to shorten the question as well as an example I don't think added value. But please continue to give feedback as you see necessary.
I'd be more than happy to delete this whole question or have someone else give a better answer, but after staring at the logic in some of the answers in this thread (MySQL how to fill missing dates in range?) long enough, I came up with this:
SELECT
CAL.DATE,
t.* EXCEPT (DATE)
FROM
CALENDAR_TABLE AS CAL
LEFT JOIN
(SELECT
CAL.DATE,
CAL.DAY,
SALES.ITEM,
SALES.PROGRAM,
SALES.SALE_DT,
SALES.EFF_BGN_DT,
SALES.EFF_END_DT
FROM
CALENDAR_TABLE AS CAL
LEFT JOIN
SALES_TABLE AS SALES
ON CAL.DATE = SALES.SALE_DT
WHERE 1=1
and SALES.ITEM = 1 or SALES.ITEM is null
ORDER BY DATE ASC) t
ON CAL.DATE = t.DATE
From what I can tell, it seems to be what I needed. It allows for the subquery to connect a date to all those records, then just joins on the calendar table again solely on date to allow for those nulls to be created.
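A probable reason the original query dropped the in-between dates: a calendar day that matches only sales rows for other items survives the join with a non-null ITEM, and the WHERE clause then removes it because ITEM is neither 1 nor null. A shorter sketch of the same idea, assuming the table and column names from the question, moves the item filter into the ON clause so that non-matching days are null-filled instead of filtered out:

```sql
SELECT
  CAL.DATE,
  CAL.DAY,
  SALES.ITEM,
  SALES.PROGRAM,
  SALES.SALE_DT,
  SALES.EFF_BGN_DT,
  SALES.EFF_END_DT
FROM
  CALENDAR_TABLE AS CAL
LEFT JOIN
  SALES_TABLE AS SALES
  ON CAL.DATE = SALES.SALE_DT
  -- The filter is part of the join condition: days without a sale
  -- of item 1 keep their calendar row and get NULLs on the SALES side.
  AND SALES.ITEM = 1
ORDER BY CAL.DATE ASC
```

This avoids joining the calendar table a second time, since every calendar row is preserved by the single LEFT JOIN.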

How to select company which have two groups

I am trying to select all customers that are in two groups. Duplicate customers are normal because the select is from invoices, but I need to find the customers that had one group in the first half of the year and jumped to another group in the second half.
Example:
SELECT
f.eankod as kod, --(groups)
ad.kod as firma, --(markComp)
f.nazfirmy as nazev, --(nameComp)
COUNT(ad.kod),
sum(f.sumZklZakl + f.sumZklSniz + f.sumOsv) as cena_bez_dph --(Price)
FROM
ddoklfak as f
LEFT OUTER JOIN aadresar ad ON ad.idfirmy = f.idfirmy
WHERE
f.datvyst >= '2017-01-01'
and f.datvyst <= '2017-12-31'
and f.modul like 'FAV'
GROUP BY
f.eankod,
ad.kod,
f.nazfirmy
HAVING COUNT (ad.kod) > 1
order by
ad.kod
Result:
GROUP markcomp nameComp price
| D002 | B5846 | Cosmopolis | price ... |
| D003 | B6987 | Tismotis | price ... |
| D009 | B8974 | Teramis | price ... |
| D006 | B8876 | Kesmethis | price ... |
| D008 | B8876 | Kesmethis | price ... |
I need only the last two rows of the example: the same company in different groups, because this company jumped. I only need to know the companies that jumped.
Thx for help.
You can use a CTE to find out which nameComp show up multiple times, and keep those ones only. For example:
with
x as (
-- your query
)
select * from x where nameComp in (
select nameComp from x group by nameComp having count(*) > 1
)
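An alternative sketch that skips the outer filter and asks directly which companies were invoiced under more than one group during the year. It reuses the table and column names from the question and assumes that f.eankod is the group and ad.kod identifies the company. It checks for more than one distinct group in the period, not specifically for a first-half versus second-half change.

```sql
SELECT
    ad.kod     AS firma,                        -- company mark
    f.nazfirmy AS nazev,                        -- company name
    COUNT(DISTINCT f.eankod) AS pocet_skupin    -- number of distinct groups (alias is hypothetical)
FROM ddoklfak AS f
LEFT OUTER JOIN aadresar ad ON ad.idfirmy = f.idfirmy
WHERE f.datvyst >= '2017-01-01'
  AND f.datvyst <= '2017-12-31'
  AND f.modul LIKE 'FAV'
GROUP BY ad.kod, f.nazfirmy
-- Keep only companies that appear under more than one group.
HAVING COUNT(DISTINCT f.eankod) > 1
ORDER BY ad.kod
```

Grouping by company only, and counting distinct groups in HAVING, returns one row per jumped company instead of one row per company and group.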

Creating user time report that includes zero hour weeks

I'm having a heck of a time putting together a query that I thought would be quite simple. I have a table that records total hours spent on a task and the user that reported those hours. I need to put together a query that returns how many hours a given user charged to each week of the year (including weeks where no hours were charged).
Expected Output:
|USER_ID | START_DATE | END_DATE | HOURS |
-------------------------------------------
|'JIM' | 4/28/2019 | 5/4/2019 | 6 |
|'JIM' | 5/5/2019 | 5/11/2019 | 0 |
|'JIM' | 5/12/2019 | 5/18/2019 | 16 |
I have a function that returns the start and end date of the week for each day, so I used that and joined it to the task table by date and summed up the hours. This gets me very close, but since I'm joining on date I obviously end up with NULL for the USER_ID on all zero hour rows.
Current Output:
|USER_ID | START_DATE | END_DATE | HOURS |
-------------------------------------------
|'JIM' | 4/28/2019 | 5/4/2019 | 6 |
| NULL | 5/5/2019 | 5/11/2019 | 0 |
|'JIM' | 5/12/2019 | 5/18/2019 | 16 |
I've tried a few other approaches, but each time I end up hitting the same problem. Any ideas?
Schema:
---------------------------------
| TASK_LOG |
---------------------------------
|USER_ID | DATE_ENTERED | HOURS |
-------------------------------
|'JIM' | 4/28/2019 | 6 |
|'JIM' | 5/12/2019 | 6 |
|'JIM' | 5/13/2019 | 10 |
------------------------------------
| DATE_HELPER_TABLE |
|(This is actually a function, but I|
| put it in a table to simplify) |
-------------------------------------
|DATE | START_OF_WEEK | END_OF_WEEK |
-------------------------------------
|5/3/2019 | 4/28/2019 | 5/4/2019 |
|5/4/2019 | 4/28/2019 | 5/4/2019 |
|5/5/2019 | 5/5/2019 | 5/11/2019 |
| ETC ... |
Query:
SELECT HRS.USER_ID
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
,SUM(HOURS)
FROM DATE_HELPER_TABLE DHT
LEFT JOIN (
SELECT TL.USER_ID
,TL.HOURS
,DHT2.START_OF_WEEK
,DHT2.END_OF_WEEK
FROM TASK_LOG TL
JOIN DATE_HELPER_TABLE DHT2 ON DHT2.DATE_VALUE = TL.DATE_ENTERED
WHERE TL.USER_ID = 'JIM1'
) HRS ON HRS.START_OF_WEEK = DHT.START_OF_WEEK
GROUP BY USER_ID
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
ORDER BY DHT.START_OF_WEEK
http://sqlfiddle.com/#!18/02d43/3 (note: for this sql fiddle, I converted my date helper function into a table to simplify)
Cross join the users (in question) and include them in the join condition. Use coalesce() to get 0 instead of NULL for the hours of weeks where no work was done.
SELECT u.user_id,
dht.start_of_week,
dht.end_of_week,
coalesce(sum(hrs.hours), 0)
FROM date_helper_table dht
CROSS JOIN (VALUES ('JIM1')) u (user_id)
LEFT JOIN (SELECT tl.user_id,
dht2.start_of_week,
tl.hours
FROM task_log tl
INNER JOIN date_helper_table dht2
ON dht2.date_value = tl.date_entered) hrs
ON hrs.user_id = u.user_id
AND hrs.start_of_week = dht.start_of_week
GROUP BY u.user_id,
dht.start_of_week,
dht.end_of_week
ORDER BY dht.start_of_week;
I used a VALUES clause here to list the users. If you only want to get the times for particular users you can do so too (or use any other subquery, or ...). Otherwise you can use your user table (which you didn't post, so I had to use that substitute).
However, the figures produced by this (and by your original query) look strange to me. In the fiddle, your user has worked a total of 23 hours in the task_log table. Yet the sums in the result are 24 and 80, which is way too much on its own, and even worse considering that 1 hour in task_log isn't even on a date listed in date_helper_table.
I suspect you get more accurate figures if you just join task_log, not that weird derived table.
SELECT u.user_id,
dht.start_of_week,
dht.end_of_week,
coalesce(sum(tl.hours), 0)
FROM date_helper_table dht
CROSS JOIN (VALUES ('JIM1')) u (user_id)
LEFT JOIN task_log tl
ON tl.user_id = u.user_id
AND tl.date_entered = dht.date_value
GROUP BY u.user_id,
dht.start_of_week,
dht.end_of_week
ORDER BY dht.start_of_week;
But maybe that's just me.
SQL Fiddle
http://sqlfiddle.com/#!18/02d43/65
Using your SQL fiddle, I simply updated the select statement to account for and convert null values. As far as I can tell, there is nothing in your post that makes this option not viable. Please let me know if this is not the case and I will update. (This is not intended to detract from sticky bit's answer, but to offer an alternative)
SELECT ISNULL(HRS.USER_ID, '') as [USER_ID]
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
,SUM(ISNULL(HOURS,0)) as [SUM]
FROM DATE_HELPER_TABLE DHT
LEFT JOIN (
SELECT TL.USER_ID
,TL.HOURS
,DHT2.START_OF_WEEK
,DHT2.END_OF_WEEK
FROM TASK_LOG TL
JOIN DATE_HELPER_TABLE DHT2 ON DHT2.DATE_VALUE = TL.DATE_ENTERED
WHERE TL.USER_ID = 'JIM1'
) HRS ON HRS.START_OF_WEEK = DHT.START_OF_WEEK
GROUP BY USER_ID
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
ORDER BY DHT.START_OF_WEEK
Create a dates table that holds every date for the next 100 years in the first column, with the week of the year, day of the month, and so on in the following columns.
Then select from that dates table and left join everything else. Use the ISNULL function to replace nulls with zeros.
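A minimal sketch of building such a dates table, assuming SQL Server 2008 or later (it uses the date type) and covering a single year to keep it short. The table name dbo.DatesTable and its column names are hypothetical.

```sql
-- Generate one row per day of 2019 with a recursive CTE,
-- then store it together with a few derived calendar columns.
WITH dates AS (
    SELECT CAST('2019-01-01' AS date) AS date_value
    UNION ALL
    SELECT DATEADD(day, 1, date_value)
    FROM dates
    WHERE date_value < '2019-12-31'
)
SELECT date_value,
       DATEPART(week, date_value) AS week_of_year,
       DAY(date_value)            AS day_of_month
INTO dbo.DatesTable
FROM dates
-- The default limit of 100 recursion levels is too low for a full year.
OPTION (MAXRECURSION 366);
```

With the table in place, the report selects from dbo.DatesTable, left joins TASK_LOG on the date, and wraps the summed hours in ISNULL so that empty weeks show 0.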

INNER JOIN Need to use column value twice in results

I've put in the requisite 2+ hours of digging and not getting an answer.
I'd like to merge 3 SQL tables, where Table A and B share a column in common, and Table B and C share a column in common--Tables A and C do not.
For example:
Table A - entity_list
entity_id | entity_name | Other, irrelevant columns
Example:
1 | Microsoft |
2 | Google |
Table B - transaction_history
transaction_id | purchasing_entity | supplying_entity | other, irrelevant columns
Example:
1 | 2 | 1
Table C - transaction_details
transaction_id | amount_of_purchase | Other, irrelevant columns
1 | 5000000 |
Using INNER JOIN, I've been able to get a result where I can link entity_name to either purchasing_entity or supplying_entity. And then, in the results, rather than seeing the entity_id, I get the entity name. But I want to substitute the entity name for both purchasing and supplying entity.
My ideal results would look like this:
1 [transaction ID] | Microsoft | Google | 5000000
The closest I've come is:
1 [transaction ID] | Microsoft | 2 [Supplying Entity] | 5000000
To get there, I've done:
SELECT transaction_history.transaction_id,
       entity_list.entity_name,
       transaction_history.supplying_entity,
       transaction_details.amount_of_purchase
FROM transaction_history
INNER JOIN entity_list
    ON transaction_history.purchasing_entity = entity_list.entity_id
INNER JOIN transaction_details
    ON transaction_history.transaction_id = transaction_details.transaction_id
I can't get entity_name to feed to both purchasing_entity and supplying_entity.
Here is the query:
SELECT h.transaction_id, h.purchasing_entity, purchaser.entity_name, h.supplying_entity, supplier.entity_name, d.amount_of_purchase
FROM transaction_history h
INNER JOIN transaction_details d
ON h.transaction_id = d.transaction_id
INNER JOIN entity_list purchaser
ON h.purchasing_entity = purchaser.entity_id
INNER JOIN entity_list supplier
ON h.supplying_entity = supplier.entity_id