Select distinct values from one table and join with another table - sql

I have a problem that I've spent way too much time trying to figure out, with close to no success. I'll try to describe the problem as well as I can, with an example, which is the solution I use right now.
I have two different MS SQL tables.
Table 1:
itemNumber - 192031, 533853 etc.
date - the date the database post was added
quantity - the amount of items for each item number
Table 2:
MTITNO - also item number, contains many different item numbers (more than Table 1)
MTTRDT - the date the database post was added
MTTTYP - transaction type. I will be looking for MTTTYP = 11
MTTRQT - transaction quantity. I will be looking for MTTRQT < 0
So what I want to do is get the DISTINCT itemNumber values between two dates from Table 1. Once I have those item numbers, I would like to join Table 2 on item number, restricted to the same dates that I use in the query for Table 1. I also only need the rows from Table 2 where MTTTYP = 11 and MTTRQT < 0, and I want SUM(MTTRQT) over those rows.
I've solved this with loops in Java code, which isn't a good solution, to be honest. What I do is this:
SELECT DISTINCT itemNumber "itemNumber"
FROM Table 1
WHERE date BETWEEN #fromDate AND #toDate;
Take the top value from this result (that is the first item number) and then:
SELECT Sum(MTTRQT) "SUM_MTTRQT_"
FROM Table 2
WHERE MTITNO = "the first item number from the result query from above"
AND MTTTYP = 11
AND MTTRDT BETWEEN #fromDate AND #toDate
AND MITTRA.MTTRQT < 0
Add the result to a new list. Remove the item number used
Loop through all the item numbers in the list and run step 3 and 4 for every single item number (this is the bad part).
Surely there must be a SQL query that produces the same result!?
Appreciate any help I can get!
Update:
This is the data I have.
Table 1
|Item number | Quantity | date
192031 | 1 | 20190521
192031 | 1 | 20190522
19192301 | 2 | 20190521
19189507 | 1 | 20190523
19189507 | 1 | 20190521
19189507 | 1 | 20190524
Table 2
|MTITNO | MTTRDT | MTTTYP | MTTRQT
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
192031 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190520 | 11 | -1
19189507 | 20190521 | 11 | -1
19189507 | 20190521 | 11 | -1
19189507 | 20190521 | 11 | -1
Table 2 contains all sorts of item numbers (that is item numbers that you can find in Table 2, but not in Table 1), and many more posts. There can be posts in Table 1 and no posts in Table 2 for one or more item numbers.
I want to summarise the MTTRQT for all items where the item number is in both Table 1 and Table 2 and within the date span I have set. The "amount used" in the desired result below is MTTRQT added up for every single item number.
Desired result
So if I look for all the item numbers with date between 20190520 - 20190524, I should get the list below.
"Item number" is supposed to be DISTINCT item numbers from Table 1.
"Amount used" is the SUM function, that sums MTTRQT where all the conditions are met.
|Item Number | Amount used
192031 | -4
19189507 | -7

Reading through the lines a bit, but is this not what you're after?
SELECT SUM(T2.MTTRQT) AS [SUM_MTTRQT_]
FROM [Table 2] T2
JOIN (SELECT TOP (1) --Inner join, so only rows for the selected item number are summed
T1.ItemNumber
FROM [Table 1] T1
WHERE T1.[date] BETWEEN #fromDate AND #toDate --Note, if [date] has a time portion, this is unlikely to work as you expect
ORDER BY T1.ItemNumber) T1 ON T2.MTITNO = T1.ItemNumber --Assumed ORDER BY clause
WHERE T2.MTTTYP = 11
AND T2.MTTRDT BETWEEN #fromDate AND #toDate --Note, if MTTRDT has a time portion, this is unlikely to work as you expect
AND T2.MTTRQT < 0;

If I am following your logic correctly:
select sum(mttrqt)
from table2 t2
where t2.mtitno in (select t1.itemNumber
from table1 t1
where t1.date >= #date1 and t1.date <= #date2
) and
t2.mttrdt >= #date1 and
t2.mttrdt <= #date2 and
t2.mtttyp = 11 and
t2.mttrqt < 0;

Have you tried this:
SELECT Sum(MTTRQT) "SUM_MTTRQT_"
FROM Table 2
WHERE MTITNO in (SELECT DISTINCT itemNumber "itemNumber"
FROM Table 1
WHERE date BETWEEN #fromDate AND #toDate)
AND MTTTYP = 11
AND MTTRDT BETWEEN #fromDate AND #toDate
AND MTTRQT < 0
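The desired result in the question lists one total per item number, which a GROUP BY on the item number can produce in a single statement instead of a loop. Here is a minimal sketch of that idea, run against the sample data in an in-memory SQLite database so it can be checked. The table names table1/table2, the integer yyyymmdd dates and the parameter names are assumptions made for the example; on SQL Server the same SELECT shape should apply with your real table names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (itemNumber INTEGER, quantity INTEGER, date INTEGER);
CREATE TABLE table2 (MTITNO INTEGER, MTTRDT INTEGER, MTTTYP INTEGER, MTTRQT INTEGER);
""")

# Sample rows from the question (dates kept as yyyymmdd integers).
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", [
    (192031, 1, 20190521), (192031, 1, 20190522), (19192301, 2, 20190521),
    (19189507, 1, 20190523), (19189507, 1, 20190521), (19189507, 1, 20190524),
])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, 11, -1)",
    [(192031, 20190520)] * 4 + [(19189507, 20190520)] * 4 + [(19189507, 20190521)] * 3,
)

# One statement: restrict table2 to items seen in table1 within the date span,
# apply the transaction filters, then sum per item number.
query = """
SELECT t2.MTITNO AS item_number, SUM(t2.MTTRQT) AS amount_used
FROM table2 AS t2
WHERE t2.MTITNO IN (SELECT t1.itemNumber
                    FROM table1 AS t1
                    WHERE t1.date BETWEEN :from_date AND :to_date)
  AND t2.MTTRDT BETWEEN :from_date AND :to_date
  AND t2.MTTTYP = 11
  AND t2.MTTRQT < 0
GROUP BY t2.MTITNO
ORDER BY t2.MTITNO
"""
rows = conn.execute(query, {"from_date": 20190520, "to_date": 20190524}).fetchall()
print(rows)  # [(192031, -4), (19189507, -7)]
```

Item 19192301 is in Table 1 but has no matching rows in Table 2, so it does not appear in the output, which matches the desired result.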

Related

Merge rows based on a condition

Is it possible to merge a collection of rows based on a condition in Spark SQL using a sql query ?
If the difference between the purch_dt values of two consecutive rows (ordered by line_num) is less than 5 days, combine them into one row. The merged row should carry the max value of purch_dt for that group. I tried using the LEAD function, but I can't get it to reset after a false condition is encountered and treat the following rows as a new group, so I am not able to get the max of purch_dt for each such group.
Input:
orderid | line_num | purch_dt
1 | 1 | 10-02-2020
1 | 2 | 12-02-2020
1 | 3 | 14-02-2020
1 | 4 | 21-03-2020
1 | 5 | 23-03-2020
Output:
orderid | purch_dt
1 | 14-02-2020 -- 1 - 3 combined into 1 row because difference is <5 between each
1 | 23-03-2020 -- 4 - 5 combined into 1 row because difference is <5 between each
Total Output rows = 2 because we have 2 groups.
Please note that line_num 4 is used as a set break, since its difference from line_num 3 is greater than 5 days. Hence it should start its own merged record set.
I have the sql below so far, but I can't get to break out and create the groups.
create temporary view next_dt as
select
orderid,
LEAD(purch_dt) over (partition by orderid order by line_num asc) AS next_purch_dt,
purch_dt
from orders;
select *
from (
select
orderid,
CASE WHEN datediff(next_purch_dt, purch_dt) < 5 OR next_purch_dt IS NULL THEN 'Y'
ELSE 'N'
END AS flg
from
next_dt)
WHERE flg = 'Y';
Any help is appreciated.
UPDATE:
Slight change in the requirements:-
The comparison has now to be made between two different fields in consecutive records - purch_dt of the current record and the return_dt of the next record.
Also, when a merged record group is being output, it should have the purch_dt populated with the value of the record with the least line_num in that group. And the return_dt column populated with the value of the max line_num record of that same group.
Input:
orderid | line_num | purch_dt | return_dt
1 | 1 | 10-02-2020 | 10-02-2020
1 | 2 | 12-02-2020 | 13-02-2020
1 | 3 | 14-02-2020 | 14-02-2020
1 | 4 | 21-03-2020 | 23-02-2020
1 | 5 | 23-03-2020 | 24-02-2020
Output:
orderid | purch_dt | return_dt
1 | 10-02-2020 | 14-02-2020
1 | 21-03-2020 | 24-02-2020
Total Output rows = 2 because we have 2 groups.
Note that each output record contains the purch_dt of the record with min line_num in that group. And contains return_dt populated as per the record with max line_num in that group.
You almost got this; the query below worked for me:
sql("""create temporary view next_dt_orders as
select *
from (
select
orderid,line_num,purch_dt,
case when datediff(
(lead(purch_dt) over (partition by orderid order by line_num asc)),
purch_dt) < 5
then "N"
else "Y"
end as flag
from
orders) tab
where
flag='Y'""")
sql("select * from next_dt_orders").show()
+-------+--------+----------+----+
|orderid|line_num| purch_dt|flag|
+-------+--------+----------+----+
| 1| 3|2020-02-14| Y|
| 1| 5|2020-03-23| Y|
+-------+--------+----------+----+
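For reference, the usual way to make the grouping "reset" is the gaps-and-islands pattern: flag each row that starts a new group, then take a running sum of the flags to get a group id. Below is a minimal sketch of that pattern for the original requirement (max purch_dt per group). It runs on SQLite through Python so it can be checked. The ISO date strings and the use of julianday for the day difference are SQLite-specific assumptions; in Spark SQL the same shape should work with lag and datediff, which I have not run here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, line_num INTEGER, purch_dt TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 1, "2020-02-10"), (1, 2, "2020-02-12"), (1, 3, "2020-02-14"),
    (1, 4, "2020-03-21"), (1, 5, "2020-03-23"),
])

query = """
WITH flagged AS (
    -- new_grp = 1 when the row is the first of its order or is 5+ days after the previous row
    SELECT orderid, line_num, purch_dt,
           CASE WHEN julianday(purch_dt)
                     - julianday(LAG(purch_dt) OVER (PARTITION BY orderid ORDER BY line_num)) < 5
                THEN 0 ELSE 1 END AS new_grp
    FROM orders
),
grouped AS (
    -- running sum of the flags turns them into a group id per order
    SELECT orderid, purch_dt,
           SUM(new_grp) OVER (PARTITION BY orderid ORDER BY line_num) AS grp
    FROM flagged
)
SELECT orderid, MAX(purch_dt) AS purch_dt
FROM grouped
GROUP BY orderid, grp
ORDER BY orderid, grp
"""
merged = conn.execute(query).fetchall()
print(merged)  # [(1, '2020-02-14'), (1, '2020-03-23')]
```

The first row of each order has no previous row, so the comparison is NULL and the CASE falls through to 1, which starts the first group.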

How to aggregate based on various conditions

Let's say I have a table which stores itemID, Date and total_shipped over a period of time:
ItemID | Date | Total_shipped
__________________________________
1 | 1/20/2000 | 2
2 | 1/20/2000 | 3
1 | 1/21/2000 | 5
2 | 1/21/2000 | 4
1 | 1/22/2000 | 1
2 | 1/22/2000 | 7
1 | 1/23/2000 | 5
2 | 1/23/2000 | 6
Now I want to aggregate based on several periods of time. For example, I want to know how many of each item were shipped every two days and in total. So the desired output should look something like:
ItemID | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
_____________________________________________
1 | 7 | 6 | 13
2 | 7 | 13 | 20
How do I do that in the most efficient way?
I know I can make three different subqueries, but I think there should be a better way. My real data is large and there are several different time periods to be considered, i.e. in my real problem I want the shipped items for current_week, last_week, two_weeks_ago, three_weeks_ago, last_month, two_months_ago, three_months_ago, so I do not think writing 7 different subqueries would be a good idea.
Here is the general idea of what I can already run, but it is very expensive for the database:
WITH
sq1 as (
SELECT ItemID, sum(Total_shipped) sum1
FROM table
WHERE Date BETWEEN '1/20/2000' and '1/21/2000'
GROUP BY ItemID),
sq2 as (
SELECT ItemID, sum(Total_Shipped) sum2
FROM table
WHERE Date BETWEEN '1/22/2000' and '1/23/2000'
GROUP BY ItemID),
sq3 as(
SELECT ItemID, sum(Total_Shipped) sum3
FROM Table
GROUP BY ItemID)
SELECT ItemID, sq1.sum1, sq2.sum2, sq3.sum3
FROM Table
JOIN sq1 on Table.ItemID = sq1.ItemID
JOIN sq2 on Table.ItemID = sq2.ItemID
JOIN sq3 on Table.ItemID = sq3.ItemID
I don't know why you have tagged this question with multiple databases.
Anyway, you can use conditional aggregation, as follows in Oracle:
select
item_id,
sum(case when "date" between date'2000-01-20' and date'2000-01-21' then total_shipped end) as "Jan20-Jan21",
sum(case when "date" between date'2000-01-22' and date'2000-01-23' then total_shipped end) as "Jan22-Jan23",
sum(case when "date" between date'2000-01-20' and date'2000-01-23' then total_shipped end) as "Jan20-Jan23"
from my_table
group by item_id
Cheers!!
Use FILTER:
select
item_id,
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-21') as "Jan20-Jan21",
sum(total_shipped) filter (where date between '2000-01-22' and '2000-01-23') as "Jan22-Jan23",
sum(total_shipped) filter (where date between '2000-01-20' and '2000-01-23') as "Jan20-Jan23"
from my_table
group by 1
item_id | Jan20-Jan21 | Jan22-Jan23 | Jan20-Jan23
---------+-------------+-------------+-------------
1 | 7 | 6 | 13
2 | 7 | 13 | 20
(2 rows)
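With seven or more periods, the conditional-aggregation columns can be generated from a list instead of being typed by hand. The sketch below does that in Python against an in-memory SQLite table. The table name shipments, the ISO date strings and the periods list are assumptions made for the example. The column labels are interpolated into the SQL text, so they should only ever come from trusted, hard-coded values; the date bounds are passed as bound parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (ItemID INTEGER, Date TEXT, Total_shipped INTEGER)")
conn.executemany("INSERT INTO shipments VALUES (?, ?, ?)", [
    (1, "2000-01-20", 2), (2, "2000-01-20", 3),
    (1, "2000-01-21", 5), (2, "2000-01-21", 4),
    (1, "2000-01-22", 1), (2, "2000-01-22", 7),
    (1, "2000-01-23", 5), (2, "2000-01-23", 6),
])

# (label, first day, last day) for each reporting period; labels are hard-coded.
periods = [
    ("Jan20-Jan21", "2000-01-20", "2000-01-21"),
    ("Jan22-Jan23", "2000-01-22", "2000-01-23"),
    ("Jan20-Jan23", "2000-01-20", "2000-01-23"),
]

# One SUM(CASE ...) column per period, so the table is scanned once.
columns = ",\n  ".join(
    f'SUM(CASE WHEN Date BETWEEN ? AND ? THEN Total_shipped END) AS "{label}"'
    for label, _, _ in periods
)
params = [bound for _, start, end in periods for bound in (start, end)]
query = f"SELECT ItemID,\n  {columns}\nFROM shipments\nGROUP BY ItemID\nORDER BY ItemID"

totals = conn.execute(query, params).fetchall()
print(totals)  # [(1, 7, 6, 13), (2, 7, 13, 20)]
```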

In MS Access, how do I update a table record to its current value plus the count of records in a different table?

I have two tables.
**tblMonthlyData**
ReportMonth | TotalItems | TotalVariances
Jan | 5 | 0
Feb | 1 | 1
Mar | 2 | 0
Apr | 8 | 4
May | 4 | 0
Jun | 5 | 0
Jul | 3 | 0
Aug | 5 | 0
Sep | 9 | 3
Oct | 1 | 0
Nov | 7 | 0
Dec | 6 | 0
and
**tblDailyData**
ID | ItemNum | CountedQty | SystemQty | Variance
1 | Item1 | 4 | 4 | 0
2 | Item2 | 8 | 5 | -3
3 | Item3 | 1 | 2 | 1
4 | Item4 | 6 | 4 | -2
For the sake of clarity, we'll say the above tblDailyData is from a count done today, 01/27/2017. Variance is a calculated field based on the data in both quantity fields.
I'm trying to add the count of records in tblDailyData to TotalItems in tblMonthlyData based on the date of the count (i.e. counts are done daily and each count's data needs to be added to the appropriate month in tblMonthlyData). So for the above example I'd need to add 4 (number of records) to TotalItems in tblMonthlyData for the Jan record, resulting in the updated record being 9, and add 3 (number of variances) to TotalVariances, resulting in the updated record being 3.
So far, I've tried using a Make Table Query for both total items counted and total number of variances, then using an Update Query that looks like this:
UPDATE tblMonthlyData
SET TotalItems = TotalItems + tblTempTotalItems.CountOfItems,
TotalVariances = TotalVariances + tblTempTotalVariances.CountOfVariances
WHERE Format$([ReportMonth],"mmm")=Format$(Now(),"mmm");
I've also tried a similar method using select queries to count records and variances (without creating the temporary tables) and running the update query based on those. Both methods result in Access prompting for the CountOfItems and CountOfVariances parameters when the update query is run, instead of just taking the values from the specified temporary table or select query.
This seemed like it'd be such a simple operation (query the count of records and variances, add them to the appropriate monthly record in separate table), but it turns out I can't figure out how to make it work. Thanks for any help!
This does not seem to be a situation for a table, but rather for some views/queries, which will always be up to date.
Use a GROUP BY FORMAT([date_field],"mm/dd/yyyy") clause in your query for the daily item count (if you want to add that to a monthly count, we will do that in another query).
SELECT FORMAT([date_field],"mm/dd/yyyy") AS Date, COUNT(ID) AS TotalItems
FROM tblDailyData
GROUP BY FORMAT([date_field],"mm/dd/yyyy")
Call this query dailyTotalItems.
SELECT FORMAT([date_field],"mm/dd/yyyy") AS Date, COUNT(ID) AS TotalItemsWithVariance
FROM tblDailyData
WHERE NOT (Variance = 0)
GROUP BY FORMAT([date_field],"mm/dd/yyyy")
Call this query dailyTotalItemsWithVariance.
SELECT MONTH([Date]) As MonthDate, SUM(TotalItems) As TotalMonthlyItems
FROM dailyTotalItems
GROUP BY MONTH([Date])
Call this query monthlyTotalItems.
SELECT MONTH([Date]) As MonthDate, SUM(TotalItemsWithVariance) As TotalMonthlyItemsWithVariance
FROM dailyTotalItemsWithVariance
GROUP BY MONTH([Date])
Call this query monthlyTotalItemsWithVariance.
Then LEFT JOIN both on MonthDate.
SELECT * FROM monthlyTotalItems
LEFT JOIN monthlyTotalItemsWithVariance ON monthlyTotalItems.MonthDate = monthlyTotalItemsWithVariance.MonthDate
NOTE: TotalItems will always be >= TotalItemsWithVariance AND every date with a variance must have had a count. So get ALL dates in monthlyTotalItems and left join to match the monthlyTotalItemsWithVariance items (which must be included, as shown above)
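The monthly item count and the monthly variance count can also be computed in one pass with conditional aggregation, which avoids the join between two monthly queries. Here is a rough sketch of that idea on SQLite via Python. The CountDate column is hypothetical (the sample tblDailyData has no date column, so one is assumed here), and strftime is SQLite-specific; in Access the analogous pieces would presumably be Month() and IIf(), which I have not tested.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE tblDailyData (
    ID INTEGER, ItemNum TEXT, CountedQty INTEGER, SystemQty INTEGER,
    Variance INTEGER,
    CountDate TEXT  -- hypothetical column: the day the count was done
)
""")
conn.executemany("INSERT INTO tblDailyData VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Item1", 4, 4, 0, "2017-01-27"),
    (2, "Item2", 8, 5, -3, "2017-01-27"),
    (3, "Item3", 1, 2, 1, "2017-01-27"),
    (4, "Item4", 6, 4, -2, "2017-01-27"),
])

# Per month: count every row, and count only the rows with a non-zero variance.
query = """
SELECT strftime('%m', CountDate) AS ReportMonth,
       COUNT(*) AS TotalItems,
       SUM(CASE WHEN Variance <> 0 THEN 1 ELSE 0 END) AS TotalVariances
FROM tblDailyData
GROUP BY strftime('%m', CountDate)
ORDER BY ReportMonth
"""
monthly = conn.execute(query).fetchall()
print(monthly)  # [('01', 4, 3)]
```

Because the totals are derived from the daily rows each time the query runs, there is no stored monthly figure to keep in sync.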

Disassemble string, group, and reconstruct in Oracle SQL

So here is what a sample of my data look like:
ID | Amount
1111-1 | 5
1111-1 | -5
1111-2 | 5
1111-2 | -5
12R-1 | 8
12R-1 | -8
12R-3 | 8
12R-3 | -8
54A73-1| 2
54A73-1| -2
54A73-2| 2
54A73-2| -1
What I want to do is group by the string in the ID column before the dash, and find the group of IDs that have a sum of zero. The kicker is that after I find which group of IDs sum to zero, I want to add back the dash and number following the dash.
Here is what I hope the solution to look like:
ID | Amount
1111-1 | 5
1111-1 | -5
1111-2 | 5
1111-2 | -5
12R-1 | 8
12R-1 | -8
12R-3 | 8
12R-3 | -8
Notice how the IDs starting with 54A73 are not there anymore; that's because the sum of their Amounts is not equal to zero.
Any help solving this question would be much appreciated!
Here's one option joining the table back to itself after grouping by the beginning part of the id field using left and locate:
MySQL Version
select id, amount
from yourtable t
join (
select left(id, locate('-', id)-1) shortid
from yourtable
group by left(id, locate('-', id)-1)
having sum(amount) = 0
) t2 on left(t.id, locate('-', t.id)-1) = t2.shortid
Oracle Version
select id, amount
from yourtable t
join (
select substr(id, 0, instr(id,'-')-1) shortid
from yourtable
group by substr(id, 0, instr(id,'-')-1)
having sum(amount) = 0
) t2 on substr(t.id, 0, instr(t.id,'-')-1) = t2.shortid
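An alternative to joining the table back to itself is a windowed SUM partitioned by the prefix, so every row carries its group's total and can be filtered directly. The sketch below shows that variant on SQLite via Python, using the sample data from the question. It is an illustration of the technique rather than a tested Oracle query, although SUM(...) OVER (PARTITION BY ...) with substr/instr should have the same shape there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id TEXT, amount INTEGER)")
data = [
    ("1111-1", 5), ("1111-1", -5), ("1111-2", 5), ("1111-2", -5),
    ("12R-1", 8), ("12R-1", -8), ("12R-3", 8), ("12R-3", -8),
    ("54A73-1", 2), ("54A73-1", -2), ("54A73-2", 2), ("54A73-2", -1),
]
conn.executemany("INSERT INTO yourtable VALUES (?, ?)", data)

# Each row gets the total of all rows sharing its prefix (text before the dash);
# rows whose prefix total is zero are kept with their full original id.
query = """
SELECT id, amount
FROM (
    SELECT id, amount,
           SUM(amount) OVER (PARTITION BY substr(id, 1, instr(id, '-') - 1)) AS prefix_total
    FROM yourtable
) AS t
WHERE prefix_total = 0
ORDER BY id, amount DESC
"""
kept = conn.execute(query).fetchall()
print(kept)
```

The 54A73 rows sum to 1, so they are dropped, and the eight remaining rows keep their original dash suffixes.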

Grouping by partial value

I'm very new to SQL and I have no clue how to even begin with this one.
I've got two tables: Warehouse and Items. Here's what they look like (simplified):
Warehouse
ItemID | QuantityInStock | QuantityOnOrder | QuantityOnOrder2 | QuantityOnOrder3 | QuantityOnOrder4
-------+-----------------+-----------------+------------------+------------------+-----------------
1111 | 8 | 1 | 0 | 1 | 0
2222 | 3 | 0 | 0 | 0 | 0
3333 | 4 | 0 | 1 | 0 | 0
Items
ItemID | Code
-------+-----------------
1111 | abc123456-111-01
2222 | abc123456-111-02
3333 | abc123457-112-01
What I need to return via SQL query is this:
ShortCode | Quantity
----------+---------
abc123456 | 9
abc123457 | 3
ItemID is the key to join both tables
Code in the Items table include main product code (abc123456) and variants (-111-01). I need to group the lines by main product code only
Quantity I need comes from Warehouse table and it equals "QuantityInStock - QuantityOnOrder - QuantityOnOrder2 - QuantityOnOrder3 - QuantityOnOrder4". Using this we get abc123456 (comes in two variants in Items table with ItemId 1111 and 2222) and Quantity is equal 8 minus 1 minus 0 minus 1 minus 0 for 1111, and 3 minus 0 minus 0 minus 0 minus 0 for 2222 which together gives 9
This is probably the worst explanation ever, so I hope there is someone that can understand it.
Please help.
Assuming that you can always count on matching the first 9 characters of the Code column then the following query should work.
-- Note that SUM may return a negative (-) number
SELECT LEFT(I.[Code], 9) AS 'ShortCode', SUM([QuantityInStock] - [QuantityOnOrder] - [QuantityOnOrder2] - [QuantityOnOrder3] - [QuantityOnOrder4]) AS 'Quantity'
FROM [dbo].[Warehouse] AS W
INNER JOIN [dbo].[Items] AS I ON I.[ItemId] = W.[ItemId]
GROUP BY LEFT(I.[Code], 9)
Using standard SQL:
SELECT
LEFT(Items.Code, 9) AS ShortCode,
SUM(T.remaining) AS Quantity
FROM Items
JOIN (
SELECT
ItemID,
QuantityInStock - QuantityOnOrder - QuantityOnOrder2 - QuantityOnOrder3 - QuantityOnOrder4 AS remaining
FROM Warehouse
) AS T ON (T.ItemID = Items.ItemID)
GROUP BY LEFT(Items.Code, 9);
Not tested, but should work. Only potential issue is that you use uppercase letters in your table and column names, so you might have to enclose all table and column names in backticks (`) or square brackets depending on your DB server.
EDIT: If you want to filter those with less than a certain number of pieces left, just add:
HAVING SUM(T.remaining) > xxx
Where xxx is the minimum quantity you want
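If the main product code is not always exactly 9 characters, the short code can be taken as everything before the first dash instead. Here is a small sketch of that variation, run on SQLite via Python with the sample data from the question. The substr/instr calls are SQLite's; on SQL Server the equivalent would likely use LEFT with CHARINDEX, which is an assumption I have not run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Warehouse (
    ItemID INTEGER, QuantityInStock INTEGER, QuantityOnOrder INTEGER,
    QuantityOnOrder2 INTEGER, QuantityOnOrder3 INTEGER, QuantityOnOrder4 INTEGER
);
CREATE TABLE Items (ItemID INTEGER, Code TEXT);
""")
conn.executemany("INSERT INTO Warehouse VALUES (?, ?, ?, ?, ?, ?)", [
    (1111, 8, 1, 0, 1, 0),
    (2222, 3, 0, 0, 0, 0),
    (3333, 4, 0, 1, 0, 0),
])
conn.executemany("INSERT INTO Items VALUES (?, ?)", [
    (1111, "abc123456-111-01"),
    (2222, "abc123456-111-02"),
    (3333, "abc123457-112-01"),
])

# Short code = text before the first dash; quantity = stock minus all on-order columns.
query = """
SELECT substr(i.Code, 1, instr(i.Code, '-') - 1) AS ShortCode,
       SUM(w.QuantityInStock - w.QuantityOnOrder - w.QuantityOnOrder2
           - w.QuantityOnOrder3 - w.QuantityOnOrder4) AS Quantity
FROM Warehouse AS w
JOIN Items AS i ON i.ItemID = w.ItemID
GROUP BY substr(i.Code, 1, instr(i.Code, '-') - 1)
ORDER BY ShortCode
"""
stock = conn.execute(query).fetchall()
print(stock)  # [('abc123456', 9), ('abc123457', 3)]
```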