SQL: Using one table as a criterion for itself

I am trying to use a parameter of a table as a criterion for itself, and can't quite get my SQL statement right. It seems to be a relatively simple query; I'm using a subquery for my criteria, but it is not filtering out the other rows of my table.
Background:
Manufacturing production floor: I have a bunch of machinists on their machines right now running an operation (OprSeq) of a job (JobNum). From the LaborDtl table, which keeps a record of all labor activity, I can see what labor is currently active (ActiveTrans = 1). With this criteria of active labor, I want to sum up all the past labor transactions on each active labor entry. So I need a LaborDtl table of inactive labor activity with the criteria of active labor from the same table.
The code:
Here's my 'criteria' subquery:
SELECT
LaborDtl.JobNum,
LaborDtl.OprSeq
FROM Erp.LaborDtl
WHERE LaborDtl.ActiveTrans = 1
Which returns active transactions, here's the first couple (sorted by job):
Job Operation
000193 90
000457 70
000457 70
020008-1 140
020008-2 130
020010 60
020035 130
020175 40
020175-2 50
020186 80
020199 10
020203 50
020212 40
020258 60
020272 10
020283 30
020298 10
020299 30
Then here's the full SQL Statement, with the query above embedded:
SELECT
LaborDtl.JobNum,
LaborDtl.OprSeq as "Op",
SUM(LaborDtl.LaborQty) as "Total Labor"
FROM Erp.LaborDtl
WHERE EXISTS
(
SELECT
LaborDtl.Company,
LaborDtl.JobNum,
LaborDtl.OprSeq
FROM Erp.LaborDtl
WHERE LaborDtl.ActiveTrans = 1 --Labor table of just current activity
)
GROUP BY LaborDtl.JobNum, LaborDtl.OprSeq
I expect to see only the Job and Operation numbers that exist in my sub query, but I'm getting both jobs and operations that don't exist in my sub query. Here are the first 10 (note, the first JobNum should be 000193 per my criteria)
JobNum Op Total Labor
0 0.00000000
000004 1 32.00000000
000019 1 106.00000000
000029 1 175.00000000
000143 1 85.00000000
000164 1 58.00000000
000181 1 500.00000000
000227 1 116.00000000
000421 1 154.00000000
000458 1 67.00000000

You're missing a condition to tie the outer and inner queries together. Right now, without that condition, the inner query just evaluates to "true" for every row, because there are jobs with active transactions, and so all the rows in the outer query are returned. Note that you'll have to add aliases to the tables, as the inner and outer queries use the same table:
SELECT a.JobNum, a.OprSeq as "Op", SUM(a.LaborQty) as "Total Labor"
FROM Erp.LaborDtl a
WHERE EXISTS (SELECT * -- The select list doesn't really matter here
FROM Erp.LaborDtl b
WHERE a.JobNum = b.JobNum AND -- Here!
a.OprSeq = b.OprSeq AND -- And here!
b.ActiveTrans = 1 -- Labor table of just current activity
)
GROUP BY a.JobNum, a.OprSeq
Note, however, that there's an easier (IMHO) way. Since you're grouping by JobNum and OprSeq anyway, you could just count the number of active transactions and use a HAVING clause to keep only those groups that have at least one active transaction:
SELECT JobNum, OprSeq as "Op", SUM(LaborQty) as "Total Labor"
FROM Erp.LaborDtl
GROUP BY JobNum, OprSeq
HAVING COUNT(CASE ActiveTrans WHEN 1 THEN 1 END) > 0
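To make the difference concrete, here is a minimal sketch that runs the uncorrelated EXISTS, the correlated EXISTS, and the HAVING version against an in-memory SQLite database from Python. The sample rows are invented for illustration, only the column names come from the question, and the Erp. schema prefix is dropped because this throwaway database has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LaborDtl (JobNum TEXT, OprSeq INTEGER, LaborQty REAL, ActiveTrans INTEGER)"
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO LaborDtl VALUES (?, ?, ?, ?)",
    [
        ("000004", 1, 32.0, 0),   # job with no active labor at all
        ("000193", 90, 10.0, 0),  # past labor on an operation that is active now
        ("000193", 90, 5.0, 0),
        ("000193", 90, 0.0, 1),   # the currently active entry
        ("000193", 80, 7.0, 0),   # same job, but an inactive operation
    ],
)

# Uncorrelated EXISTS: true for every outer row as soon as any active row exists.
uncorrelated = conn.execute("""
    SELECT JobNum, OprSeq, SUM(LaborQty)
    FROM LaborDtl
    WHERE EXISTS (SELECT 1 FROM LaborDtl WHERE ActiveTrans = 1)
    GROUP BY JobNum, OprSeq
""").fetchall()

# Correlated EXISTS: the inner query is tied to the outer row.
correlated = conn.execute("""
    SELECT a.JobNum, a.OprSeq, SUM(a.LaborQty)
    FROM LaborDtl a
    WHERE EXISTS (SELECT 1 FROM LaborDtl b
                  WHERE b.JobNum = a.JobNum
                    AND b.OprSeq = a.OprSeq
                    AND b.ActiveTrans = 1)
    GROUP BY a.JobNum, a.OprSeq
""").fetchall()

# HAVING: keep only groups that contain at least one active row.
having = conn.execute("""
    SELECT JobNum, OprSeq, SUM(LaborQty)
    FROM LaborDtl
    GROUP BY JobNum, OprSeq
    HAVING COUNT(CASE ActiveTrans WHEN 1 THEN 1 END) > 0
""").fetchall()

print(len(uncorrelated), correlated, having)
```

In this sketch the uncorrelated query returns all three groups, while the other two return only the group for job 000193, operation 90.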

Without knowing RDBMS vendor and version this is the best I can do:
SELECT
t1.JobNum,
t1.OprSeq as "Op",
SUM(t1.LaborQty) as "Total Labor"
FROM Erp.LaborDtl t1
WHERE EXISTS
(
SELECT 1
FROM Erp.LaborDtl t2
WHERE t2.ActiveTrans = 1 --Labor table of just current activity
and t2.Company = t1.Company
and t2.JobNum = t1.JobNum
and t2.OprSeq = t1.OprSeq
)
GROUP BY t1.JobNum, t1.OprSeq

Related

How can I replace the LAST() function in MS Access with proper ordering on a rather large table?

I have an MS Access database with the two tables, Asset and Transaction. The schema looks like this:
Table ASSET
Key Date1 AType FieldB FieldC ...
A 2023.01.01 T1
B 2022.01.01 T1
C 2023.01.01 T2
.
.
TABLE TRANSACTION
Date2 Key TType1 TType2 TType3 FieldOfInterest ...
2022.05.31 A 1 1 1 10
2022.08.31 A 1 1 1 40
2022.08.31 A 1 2 1 41
2022.09.31 A 1 1 1 30
2022.07.31 A 1 1 1 30
2022.06.31 A 1 1 1 20
2022.10.31 A 1 1 1 45
2022.12.31 A 2 1 1 50
2022.11.31 A 1 2 1 47
2022.05.23 B 2 1 1 30
2022.05.01 B 1 1 1 10
2022.05.12 B 1 2 1 20
.
.
.
The ASSET table has a PK (Key).
The TRANSACTION table has a composite key that is (Key, Date2, TType1, TType2, TType3).
Given the above tables let's see an example:
Input1 = 2022.04.01
Input2 = 2022.08.31
Desired result:
Key FieldOfInterest
A 41
because if the Transactions in scope were ordered by Date2, TType1, TType2, TType3, all ascending, then the record having FieldOfInterest = 41 would be the last one.
Note that Asset B is not in scope due to Asset.Date1 < Input1, neither is Asset C because AType != T1. Ultimately I am curious about the SUM(FieldOfInterest) of all the last transactions belonging to an Asset that is in scope determined by the input variables.
The following query has so far provided the right results but after upgrading to a newer MS Access version, the LAST() operation is no longer reliably returning the row which is the latest addition to the Transaction table.
I have several input values, but the most important ones are two dates; let's call them InputDate1 and InputDate2.
This is how it worked so far:
SELECT Asset.AType, Last(FieldOfInterest) AS CurrentValue ,Asset.Key
FROM Transaction
INNER JOIN Asset ON Transaction.Key = Asset.Key
WHERE Transaction.Date2 <= InputDate2 And Asset.Date1 >= InputDate1
GROUP BY Asset.Key, Asset.AType
HAVING Asset.AType='T1'
It is known that grouped records are not guaranteed to be in any order. Obviously it is a mistake to rely on the GROUP BY operation always preserving the original table order, but let's ignore that for now.
I have been struggling to come up with the right way to do the following:
join the Asset and Transaction tables on Asset.Key = Transaction.Key
filter by Asset.Date1 >= InputDate1 AND Transaction.Date2 <= InputDate2
then I need to select one record for all Transaction.Key where Date2 and TType1 and TType2 and TType3 has the highest value. (this represents the actual last record for given Key)
As far as I know there is no way to order records within a group by clause which is unfortunate.
I have tried ranking, but the Transaction table is large (800k rows) and the performance was very slow; I need something faster than this. The following is an example of three saved queries that I wrote and chained together, but the performance is very disappointing, probably due to the ranking step.
-- Saved query step_1
SELECT Asset.*, Transaction.*
FROM Transaction
INNER JOIN Asset ON Transaction.Key = Asset.Key
WHERE Transaction.Date2 <= 44926
AND Asset.Date1 >= 44562
AND Asset.aType = 'T1'
-- Saved query step_2
SELECT tr.FieldOfInterest, (SELECT Count(*) FROM
(SELECT tr2.Transaction.Key, tr2.Date2, tr2.Transaction.tType1, tr2.tType2, tr2.tType3 FROM step_1 AS tr2) AS tr1
WHERE (tr1.Date2 > tr.Date2 OR
(tr1.Date2 = tr.Date2 AND tr1.tType1 > tr.Transaction.tType1) OR
(tr1.Date2 = tr.Date2 AND tr1.tType1 = tr.Transaction.tType1 AND tr1.tType2 > tr.tType2) OR
(tr1.Date2 = tr.Date2 AND tr1.tType1 = tr.Transaction.tType1 AND tr1.tType2 = tr.tType2 AND tr1.tType3 > tr.tType3))
AND tr1.Key = tr.Transaction.Key)+1 AS Rank
FROM step_1 AS tr
-- Saved query step_3
SELECT SUM(FieldOfInterest) FROM step_2
WHERE Rank = 1
I hope I am being clear enough so that I can get some useful recommendations. I've been stuck with this for weeks now and really don't know what to do about it. I am open for any suggestions.
Going by this part of your specification:
then I need to select one record for all Transaction.Key where Date2 and TType1 and TType2 and TType3 has the highest value. (this represents the actual last record for given Key)
Consider a simple aggregation in step 2 to retrieve the max values, then in step 3 join on all of those fields back to the first query.
Step 1 (rewritten to avoid name collision and too many columns)
SELECT a.[Key] AS Asset_Key, a.Date1, a.AType,
t.[Key] AS Transaction_Key, t.Date2,
t.TType1, t.TType2, t.TType3, t.FieldOfInterest
FROM Transaction t
INNER JOIN Asset a ON t.[Key] = a.[Key]
WHERE t.Date2 <= 44926
AND a.Date1 >= 44562
AND a.AType = 'T1'
Step 2
SELECT Transaction_Key,
MAX(Date2) AS Max_Date2,
MAX(TType1) AS Max_TType1,
MAX(TType2) AS Max_TType2,
MAX(TType3) AS Max_TType3
FROM step_1
GROUP BY Transaction_Key
Step 3
SELECT s1.*
FROM step_1 s1
INNER JOIN step_2 s2
ON s1.Transaction_Key = s2.Transaction_Key
AND s1.Date2 = s2.Max_Date2
AND s1.TType1 = s2.Max_TType1
AND s1.TType2 = s2.Max_TType2
AND s1.TType3 = s2.Max_TType3
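As a rough check of the three steps, the sketch below chains them as CTEs in an in-memory SQLite database driven from Python, using a subset of the question's sample rows. Storing the dates as 'YYYY.MM.DD' text (so that string comparison orders them) and using SQLite instead of Access are assumptions of this sketch, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Asset ([Key] TEXT PRIMARY KEY, Date1 TEXT, AType TEXT);
    CREATE TABLE [Transaction] (Date2 TEXT, [Key] TEXT, TType1 INTEGER,
                                TType2 INTEGER, TType3 INTEGER, FieldOfInterest INTEGER);
    INSERT INTO Asset VALUES
        ('A', '2023.01.01', 'T1'), ('B', '2022.01.01', 'T1'), ('C', '2023.01.01', 'T2');
    INSERT INTO [Transaction] VALUES
        ('2022.05.31', 'A', 1, 1, 1, 10), ('2022.08.31', 'A', 1, 1, 1, 40),
        ('2022.08.31', 'A', 1, 2, 1, 41), ('2022.09.31', 'A', 1, 1, 1, 30),
        ('2022.07.31', 'A', 1, 1, 1, 30), ('2022.12.31', 'A', 2, 1, 1, 50),
        ('2022.05.23', 'B', 2, 1, 1, 30), ('2022.05.01', 'B', 1, 1, 1, 10);
""")

query = """
WITH step_1 AS (      -- filter and join, one row per in-scope transaction
    SELECT t.[Key] AS Transaction_Key, t.Date2,
           t.TType1, t.TType2, t.TType3, t.FieldOfInterest
    FROM [Transaction] t
    INNER JOIN Asset a ON t.[Key] = a.[Key]
    WHERE t.Date2 <= ? AND a.Date1 >= ? AND a.AType = 'T1'
),
step_2 AS (           -- per-key maximum of each ordering column
    SELECT Transaction_Key,
           MAX(Date2)  AS Max_Date2,
           MAX(TType1) AS Max_TType1,
           MAX(TType2) AS Max_TType2,
           MAX(TType3) AS Max_TType3
    FROM step_1
    GROUP BY Transaction_Key
)
SELECT s1.Transaction_Key, s1.FieldOfInterest   -- join the maxima back
FROM step_1 s1
INNER JOIN step_2 s2
    ON  s1.Transaction_Key = s2.Transaction_Key
    AND s1.Date2  = s2.Max_Date2
    AND s1.TType1 = s2.Max_TType1
    AND s1.TType2 = s2.Max_TType2
    AND s1.TType3 = s2.Max_TType3
"""
# Input2 bounds Date2, Input1 bounds Date1 (values from the question's example).
rows = conn.execute(query, ("2022.08.31", "2022.04.01")).fetchall()
print(rows)
```

One thing worth keeping in mind when adapting this: each MAX is computed independently, so the final join only finds a row when a single transaction holds all four maxima at once, as happens with this sample data.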

Sum multiple rows in a joined query

I'm not even sure how to ask this. So here goes. I have two tables I am joining, and am needing to sum one column of data, easy enough, but the data that needs to be summed is dependent on a certain character associated with the same job number from a different table.
Table1
JobNumber CostType Amount
1 A 10
1 B 20
1 C 50
1 C 50
3 C 75
Table 2
JobNumber Status Value
1 A 100
2 I 50
3 A 75
Okay, so some of the jobs will have multiple lines for CostType 'C'. I'm trying to display all JobNumbers with the total of any amounts for CostType C, BUT only for jobs that have the Status of 'A'. Here's my query so far:
SELECT Table1.JobNumber
,Table1.Amount
,Table2.Value
FROM DB.Table1, DB.Table2
WHERE Table1.JobNumber = Table2.JobNumber and Table1.CostType = 'C' and Table2.Status = 'A'
GROUP BY Table1.JobNumber, Table1.Amount, Table2.Value
ORDER BY Table1.JobNumber ASC
It's giving me the list of job numbers, their amounts, and the contract value, and only for CostType 'C' and with the Status of 'A'. But each separate CostType 'C' amount has its own row. Is there a way to combine them and display the total Amount along with the Value for each JobNumber, like this?
JobNumber CostTypeCTotal Value
1 100 100
3 75 75
Hmmm . . . Try aggregating table1 before joining the tables:
SELECT t2.JobNumber, t1.c_Amount, t2.Value
FROM DB.Table2 t2 JOIN
(SELECT JobNumber,
SUM(CASE WHEN CostType = 'C' THEN Amount END) as c_Amount
FROM DB.Table1
GROUP BY JobNumber
) t1
ON t1.JobNumber = t2.JobNumber
WHERE t2.Status = 'A';
Note: Learn to use proper, explicit, standard, readable JOIN syntax. Never use commas in the FROM clause. Such use of , is archaic syntax that has been out of date since the 1990s.
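Here is a small sketch of the aggregate-then-join idea, run from Python against an in-memory SQLite database loaded with the question's sample rows. It assumes the Status filter belongs on Table2 in the outer query (Status is a Table2 column in the question), and it drops the DB. prefix since the throwaway database has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (JobNumber INTEGER, CostType TEXT, Amount INTEGER);
    CREATE TABLE Table2 (JobNumber INTEGER, Status TEXT, Value INTEGER);
    INSERT INTO Table1 VALUES
        (1, 'A', 10), (1, 'B', 20), (1, 'C', 50), (1, 'C', 50), (3, 'C', 75);
    INSERT INTO Table2 VALUES
        (1, 'A', 100), (2, 'I', 50), (3, 'A', 75);
""")

rows = conn.execute("""
    SELECT t2.JobNumber, t1.c_amount AS CostTypeCTotal, t2.Value
    FROM Table2 t2
    JOIN (
        -- collapse Table1 to one row per job before joining
        SELECT JobNumber,
               SUM(CASE WHEN CostType = 'C' THEN Amount END) AS c_amount
        FROM Table1
        GROUP BY JobNumber
    ) t1 ON t1.JobNumber = t2.JobNumber
    WHERE t2.Status = 'A'          -- only active jobs
    ORDER BY t2.JobNumber
""").fetchall()

for row in rows:
    print(row)
```

Because Table1 is reduced to one row per job before the join, each job appears once in the output regardless of how many CostType 'C' lines it has.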

SQL query Splitting a column into Multiple rows divide by percentage

How to get percentage of a column and then inserting it as rows
Col1 item TotalAmount
1 ABC 5558767.82
2 ABC 4747605.5
3 ABC 667377.69
4 ABC 3844204
6 CTB 100
7 CTB 500.52
I need to create a new percentage column for each item, which I have done as:
Select item, (totalAmount/select sum(totalAmount) from table1) as Percentage
From table1
Group by item
Col1 item TotalAmount percentage
1 ABC 5558767.82 38
2 ABC 4747605.5 32
3 ABC 667377.69 5
4 ABC 3844204 26
6 CTB 100 17
7 CTB 500.52 83
Now the complex part:
i) I have to calculate another amount by multiplying this percentage by an amount from another table, say table2:
Select t1.percentage * t2.[new amount] as [PledgeAmount]
From table1 t1 join table2 t2 on t1.item = t2.item
ii) Then update the TotalAmount column by splitting each TotalAmount of table1 into two rows: the first row holds the newly calculated PledgeAmount and the second row holds (TotalAmount - PledgeAmount).
E.g. the Col1 = 1 amount of 5558767.82 will be split into two rows.
Final Result sample for :-
Col1 item TotalAmount Type
1 ABC 363700.00 Pledge
1 ABC 5195067.82 Unpledge
....
I am using Temporary table to do calculations.
One way I can think of is to calculate the Pledged and Unpledged amounts as new columns and pivot them, but it's a huge table with hundreds of columns, so it will not perform well.
Any other efficient way?
You can use a windowing function to solve this problem -- first in a sub-query calculate the total and then in the main query the percent:
Select *, (totalAmount/total_for_item)*100 as percent_of_total
from (
SELECT t.*,
SUM(totalAmount) OVER (PARTITION BY item) as total_for_item
FROM table1 t
) sub
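The sketch below tries the window-function approach on the question's sample rows, using an in-memory SQLite database from Python. It assumes a SQLite build with window-function support (3.25 or later); the table name table1 and the alias pct are choices made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Col1 INTEGER, item TEXT, TotalAmount REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "ABC", 5558767.82), (2, "ABC", 4747605.5),
        (3, "ABC", 667377.69), (4, "ABC", 3844204),
        (6, "CTB", 100), (7, "CTB", 500.52),
    ],
)

rows = conn.execute("""
    SELECT Col1, item, 100.0 * TotalAmount / total_for_item AS pct
    FROM (
        -- attach the per-item total to every row without collapsing rows
        SELECT t.*,
               SUM(TotalAmount) OVER (PARTITION BY item) AS total_for_item
        FROM table1 t
    ) sub
    ORDER BY Col1
""").fetchall()

for col1, item, pct in rows:
    print(col1, item, round(pct))
```

Unlike a GROUP BY, the windowed SUM keeps one output row per input row, which is what the later split into Pledge and Unpledge rows needs as a starting point.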
First, let's get the total amount per item:
SELECT item, SUM( totalAmount ) as sumTotal
INTO #totalperitem
FROM table1
GROUP BY item
Now it's easy to get to the percentages:
SELECT t1.Col1,
t1.item,
t1.totalAmount,
t1.totalAmount/tpi.sumTotal*100 AS percentage
FROM table1 t1
INNER JOIN #totalperitem tpi on ...
Tricky part: Separate rows with/without match in table2. Can be done with a WHERE NOT EXISTS, or, my preference, with a single outer join:
SELECT t1.item,
CASE WHEN tpledged.item IS NULL
THEN 'Unpledged'
ELSE 'Pledged'
END AS type,
SUM( t1.totalAmount ) AS amount
FROM table1 t1
LEFT OUTER JOIN table2 tpledged ON t1. ... = tpledged. ...
GROUP BY t1.item,
CASE WHEN tpledged.item IS NULL
THEN 'Unpledged'
ELSE 'Pledged'
END
The basic trick is to create an artificial column from the presence/absence of records in table2 and to also group by that artificial column.

Select rows in one table, adding column where MAX(Date) of rows in other, related table

I have a table containing a set of tasks to perform:
Task
ID Name
1 Washing Up
2 Hoovering
3 Dusting
The user can add one or more Notes to a Note table. Each note is associated with a task:
Note
ID ID_Task Completed(%) Date
11 1 25 05/07/2013 14:00
12 1 50 05/07/2013 14:30
13 1 75 05/07/2013 15:00
14 3 20 05/07/2013 16:00
15 3 60 05/07/2013 17:30
I want a query that will select the Task ID, Name and its % complete, which should be zero if there aren't any notes for it. The query should return:
ID Name Completed (%)
1 Washing Up 75
2 Hoovering 0
3 Dusting 60
I've really been struggling with the query for this, which I've read is a "greatest n per group" type problem, of which there are many examples on SO, none of which I can apply to my case (or at least fully understand). My intuition was to start by finding the MAX(Date) for each task in the note table:
SELECT ID_Task,
MAX(Date) AS Date
FROM
Note
GROUP BY
ID_Task
Annoyingly, I can't just add "Complete %" to the above query unless it's contained in a GROUP clause. Argh! I'm not sure how to jump through this hoop in order to somehow get the task table rows with the column appended to it. Here is my pathetic attempt, which fails as it only returns tasks with notes and then duplicates task records at that (one for each note, so it's a complete fail).
SELECT Task.ID,
Task.Name,
Note.Complete
FROM
Task
JOIN
(SELECT ID_Task,
MAX(Date) AS Date
FROM
Note
GROUP BY
ID_Task) AS InnerNote
ON
Task.ID = InnerNote.ID_Task
JOIN
Note
ON
Task.ID = Note.ID_Task
Can anyone help me please?
If we assume that tasks only become more complete, you can do this with a left outer join and aggregation:
select t.ID, t.Name, coalesce(max(n.complete), 0)
from tasks t left outer join
notes n
on t.id = n.id_task
group by t.id, t.name
If tasks can become "less complete" then you want the one with the last date. For this, you can use row_number():
select t.ID, t.Name, coalesce(n.complete, 0)
from tasks t left outer join
(select n.*, row_number() over (partition by id_task order by date desc) as seqnum
from notes n
) n
on t.id = n.id_task and n.seqnum = 1;
In this case, you don't need a group by, because the seqnum = 1 performs the same role.
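A minimal sketch of the row_number() variant, run from Python against an in-memory SQLite database with the question's sample data. It assumes SQLite 3.25 or later for window functions, uses the answer's table and column names (tasks, notes, complete), and stores the dates as ISO-style text so that ORDER BY date sorts chronologically; those storage choices are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE notes (id INTEGER PRIMARY KEY, id_task INTEGER,
                        complete INTEGER, date TEXT);
    INSERT INTO tasks VALUES (1, 'Washing Up'), (2, 'Hoovering'), (3, 'Dusting');
    INSERT INTO notes VALUES
        (11, 1, 25, '2013-07-05 14:00'),
        (12, 1, 50, '2013-07-05 14:30'),
        (13, 1, 75, '2013-07-05 15:00'),
        (14, 3, 20, '2013-07-05 16:00'),
        (15, 3, 60, '2013-07-05 17:30');
""")

rows = conn.execute("""
    SELECT t.id, t.name, COALESCE(n.complete, 0) AS completed
    FROM tasks t
    LEFT OUTER JOIN (
        -- number each task's notes from newest (1) to oldest
        SELECT n.*,
               ROW_NUMBER() OVER (PARTITION BY id_task ORDER BY date DESC) AS seqnum
        FROM notes n
    ) n ON t.id = n.id_task AND n.seqnum = 1
    ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
```

Putting n.seqnum = 1 in the ON clause rather than in a WHERE clause is what keeps tasks without notes in the result; the COALESCE then turns their missing value into 0.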
How about this: just get the max of completed and group by task ID:
SELECT t.ID, t.`name`, MAX(n.completed) AS completed
FROM `note` n RIGHT JOIN `task` t ON ( n.ID_Task = t.ID )
GROUP BY t.ID, t.`name`
Or, to return 0 instead of NULL for tasks without notes:
SELECT t.ID, t.`name`,
(CASE WHEN MAX(n.completed) IS NULL THEN 0 ELSE MAX(n.completed) END) AS completed
FROM `note` n RIGHT JOIN `task` t ON ( n.ID_Task = t.ID )
GROUP BY t.ID, t.`name`
select a.ID,
a.Name,
isnull((select completed
from Note
where ID_Task = b.ID_Task
and Date = b.date),0)
from Task a
LEFT OUTER JOIN (select ID_Task,
max(date) date
from Note
group by ID_Task) b
ON a.ID = b.ID_Task;

DB2 SQL filter query result by evaluating an ID which has two types of entries

After many attempts I have failed at this and am hoping someone can help. The query returns every entry a user makes when items are made in the factory against an order number. For example:
Order Number Entry type Quantity
3000 1 1000
3000 1 500
3000 2 300
3000 2 100
4000 2 1000
5000 1 1000
What I want the query to do is filter the results like this:
If the order number has entries of both type 1 and type 2, return only the rows of type 1;
otherwise just return the rows for that order number, whatever their type.
So the above would end up:
Order Number Entry type Quantity
3000 1 1000
3000 1 500
4000 2 1000
5000 1 1000
Currently my query (DB2, in very basic terms) looks like this, and it was correct until a change request came through!
Select * from bookings where type=1 or type=2
thanks!
select * from bookings
left outer join (
select order_number,
max(case when type=1 then 1 else 0 end) +
max(case when type=2 then 1 else 0 end) as type_1_and_2
from bookings
group by order_number
) has_1_and_2 on
type_1_and_2 = 2 and
has_1_and_2.order_number = bookings.order_number
where
bookings.type = 1 or
has_1_and_2.order_number is null
Find all the orders that have both type 1 and type 2, and then join it.
If the row matched the join, only return it if it is type 1
If the row did not match the join (has_1_and_2.order_number is null), return it no matter what the type is.
A "common table expression" [CTE] can often simplify your logic. You can think of it as a way to break a complex problem into conceptual steps. In the example below, you can think of g as the name of the result set of the CTE, which will then be joined to bookings.
WITH g as
( SELECT order_number, min(type) as low_type
FROM bookings
GROUP BY order_number
)
SELECT b.*
FROM g
JOIN bookings b ON g.order_number = b.order_number
AND g.low_type = b.type
The JOIN ON conditions will work so that if both types are present then low_type will be 1, and only that type of record will be chosen. If there is only one type it will be identical to low_type.
This should work fine as long as 1 and 2 are the only types allowed in the bookings table. If not then you can simply add a WHERE clause in the CTE and in the outer SELECT.
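For a quick check of the CTE approach, here is a sketch that runs it from Python on an in-memory SQLite database holding the question's sample rows. SQLite stands in for DB2 here, and the column names order_number, type and quantity are assumptions based on the answers' queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (order_number INTEGER, type INTEGER, quantity INTEGER)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?)",
    [
        (3000, 1, 1000), (3000, 1, 500),
        (3000, 2, 300), (3000, 2, 100),
        (4000, 2, 1000),
        (5000, 1, 1000),
    ],
)

rows = conn.execute("""
    WITH g AS (
        -- lowest entry type present for each order
        SELECT order_number, MIN(type) AS low_type
        FROM bookings
        GROUP BY order_number
    )
    SELECT b.order_number, b.type, b.quantity
    FROM g
    JOIN bookings b ON g.order_number = b.order_number
                   AND g.low_type = b.type
    ORDER BY b.order_number, b.quantity DESC
""").fetchall()

for row in rows:
    print(row)
```

Order 3000 keeps only its type 1 rows, while orders 4000 and 5000 keep whatever single type they have, which matches the result requested in the question.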