SQL Query Creates Duplicated Results

My task is to produce a report that shows the on-time delivery of products to consumers. In essence I have achieved this. However, as you will see, only some of the data is accurate.
Here is our test case: we have sales order number '12312'. This sales order has had 5 partial shipments made (200 pieces each). The result from our DUE_DTS table is shown below.
[Image: Due Dates table data]
The following code gives me the information I need (excluding due date information) to show the packing details of the 5 shipments:
DECLARE @t AS TABLE (
    CUSTNAME char(35),
    SONO char(10),
    INVDATE date,
    PACKLISTNO char(10),
    PART_NO char(25),
    SOBALANCE numeric(9,2)
)
INSERT INTO @t
SELECT DISTINCT c.CUSTNAME, s.SONO, p.INVDATE, p.PACKLISTNO, i.PART_NO, q.SOBALANCE
FROM [manex].[dbo].[SODETAIL]
INNER JOIN [manex].[dbo].[SOMAIN] s ON s.SONO = SODETAIL.SONO
INNER JOIN [manex].[dbo].[CUSTOMER] c ON c.CUSTNO = s.CUSTNO
INNER JOIN [manex].[dbo].[INVENTOR] i ON i.UNIQ_KEY = SODETAIL.UNIQ_KEY
INNER JOIN [manex].[dbo].[DUE_DTS] d ON d.SONO = s.SONO
INNER JOIN [manex].[dbo].[PLMAIN] p ON p.SONO = s.SONO
INNER JOIN [manex].[dbo].[PLDETAIL] q ON q.PACKLISTNO = p.PACKLISTNO
WHERE s.SONO LIKE '%12312'
SELECT * FROM @t
Here is a screenshot of the results from running this query:
[Image: Query Result]
Now is when it should be time to join my due dates table (adding in the appropriate column(s) to my table definition and select statement) and make DATEDIFF comparisons to determine if shipments were on time or late. However, once I reference the due dates table, each of the 5 shipments is compared to all 5 dates in the due dates table, resulting in 25 rows. The only linking column DUE_DTS has is the SONO column. I've tried using DISTINCT and variations of the group by clause without success.
I've put enough together myself to figure joining the DUE_DTS table on SONO must be causing this to happen, as there are 5 instances of that value in the table (making it not unique) and a join should be based on a unique column. Is there a workaround for something like this?

You will need to use additional fields to join the records and reduce the results. You may need to link SONO to SODETAIL to DUE_DTS because the dates are tied to the items, not to the SONO.
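To make the fan-out concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so it runs without a database server. The tables `shipments` and `due_dates`, the column `line_key`, and the sample dates are all hypothetical. In particular it assumes each shipment row and each due-date row share a second key, which the question does not confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipments (sono TEXT, line_key INTEGER, packlistno TEXT);
    CREATE TABLE due_dates (sono TEXT, line_key INTEGER, due_date TEXT);
""")
# Five partial shipments and five due dates for the same sales order
# (made-up values; line_key is an assumed shared key).
for n in range(1, 6):
    conn.execute("INSERT INTO shipments VALUES ('12312', ?, ?)", (n, f"PL{n}"))
    conn.execute("INSERT INTO due_dates VALUES ('12312', ?, ?)", (n, f"2020-01-0{n}"))

# Joining on SONO alone pairs every shipment with every due date.
fanout = conn.execute("""
    SELECT s.packlistno, d.due_date
    FROM shipments s
    JOIN due_dates d ON d.sono = s.sono
""").fetchall()

# Adding the second key column keeps one due date per shipment.
matched = conn.execute("""
    SELECT s.packlistno, d.due_date
    FROM shipments s
    JOIN due_dates d ON d.sono = s.sono AND d.line_key = s.line_key
    ORDER BY s.packlistno
""").fetchall()

print(len(fanout), len(matched))
```

If no such shared column exists in the real schema, DISTINCT and GROUP BY cannot repair the result, because the 25 rows really are different from each other.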

Related

Left Join on three tables in access with Group by

I have been racking my brain over a syntax error from the Access Jet engine.
I have three tables.
The first, "tblMstItem", is the master table for item details. It contains two columns: "colItemID" (PK) and "colItemName".
The second, "tblStocks", is where purchases are maintained. It has a column "colRQty" which keeps the quantity of the particular item purchased, and "colItemID" as an FK from "tblMstItem".
The third, "tblSales", is where sales are maintained. It has a column "colSoldQty" which keeps the quantity of the particular item sold, and "colItemID" as an FK from "tblMstItem".
Therefore "colItemID" is common to all three tables and the links have been created.
My requirement: I need all the items listed in "tblMstItem" (columns "colItemID" and "colItemName"), and for any item that has been purchased or sold, the summed quantity for that item.
I have used a left join as shown in the following select statement, but it always gives me an error message.
The select statement is as follows:
SELECT
i.colItemID,
i.colItemName,
s.rqty,
n.soldqty
from tblMstItem i
left join
( select sum( colRQty ) as rqty from tblStocks group by colItemID ) s
on i.colItemID = s.colItemID
left join
( select sum( colSoldQty ) as soldqty from tblSales group by colItemID ) n
on i.colItemID = n.colItemID
I tried the code above with many different syntax variations, but every time I get a syntax error. It makes me doubt whether MS Access supports three-table joins, though I am sure I am wrong.
See the error Message below
Table columns and table link shown below
I would be very thankful for any help on this. Access SQL please, because I am already able to get results in SQL Server.
Thanks in Advance

MS Access has a picky syntax. For instance, joins need extra parentheses. So, try this:
select i.colItemID, i.colItemName,
s.rqty, n.soldqty
from (tblMstItem as i left join
(select colItemID, sum(colRQty ) as rqty
from tblStocks
group by colItemID
) as s
on i.colItemID = s.colItemID
) left join
(select colItemID, sum( colSoldQty ) as soldqty
from tblSales
group by colItemID
) as n
on i.colItemID = n.colItemID;
You also need to select colItemID in the subqueries.
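The second point, selecting colItemID inside each subquery so the outer join has something to match on, can be checked with a small sketch. It runs on Python's built-in sqlite3 module rather than Access, so the extra parentheses Access requires are left out here. The sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblMstItem (colItemID INTEGER PRIMARY KEY, colItemName TEXT);
    CREATE TABLE tblStocks (colItemID INTEGER, colRQty INTEGER);
    CREATE TABLE tblSales (colItemID INTEGER, colSoldQty INTEGER);
    INSERT INTO tblMstItem VALUES (1, 'Bolt'), (2, 'Nut'), (3, 'Washer');
    INSERT INTO tblStocks VALUES (1, 10), (1, 5), (2, 7);
    INSERT INTO tblSales VALUES (1, 4), (1, 2);
""")

# Each subquery aggregates to one row per item and exposes colItemID,
# so the left joins cannot multiply the master rows.
totals = conn.execute("""
    SELECT i.colItemID, i.colItemName, s.rqty, n.soldqty
    FROM tblMstItem AS i
    LEFT JOIN (SELECT colItemID, SUM(colRQty) AS rqty
               FROM tblStocks GROUP BY colItemID) AS s
           ON i.colItemID = s.colItemID
    LEFT JOIN (SELECT colItemID, SUM(colSoldQty) AS soldqty
               FROM tblSales GROUP BY colItemID) AS n
           ON i.colItemID = n.colItemID
    ORDER BY i.colItemID
""").fetchall()

for row in totals:
    print(row)
```

Items with no purchases or no sales come back with NULL (None in Python) in the corresponding column.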

LEFT JOIN two tables with multiple AND conditions

I need help getting a SQL statement right by joining two tables.
My goal is to return the number of purchases between certain purchase dates for given products where customer_id is null. The foreign key for table Purchases is prospect_id, corresponding to id in Prospect.
As separate SQL statements, I have this:
SELECT COUNT (id) FROM Purchases
WHERE (purchasedate BETWEEN '5/1/18' AND '12/31/18')
AND (product = 'Scooter')
SELECT id
from Prospect
where customer_id is null
So, I am coming up with a query like this:
SELECT COUNT (id)
FROM Purchases
LEFT JOIN Prospect
ON Prospect.id = Purchases.prospect_id
AND (Purchases.purchasedate BETWEEN '5/1/18' AND '12/31/18')
AND Purchases.product = 'Scooter'
AND Prospect.customer_id is null;
but then I am getting ERROR: column reference "id" is ambiguous.

Use count(*):
SELECT COUNT(*)
FROM Purchases pu LEFT JOIN
Prospect pr
ON pr.id = pu.prospect_id AND
pr.customer_id is null
WHERE pu.purchasedate >= '2018-05-01' AND
pu.purchasedate < '2019-01-01' AND
pu.product = 'Scooter';
I made a few changes to the query.
First, the conditions on purchase are in the where clause rather than the on clause. Presumably, you actually want these to be filters.
Second, the dates use a proper format, YYYY-MM-DD.
I've also replaced the between with explicit comparisons. This means that the code works even when the "date" column has a time component.
Finally, I also introduced table aliases.
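The point about time components can be seen in a small sketch. It uses Python's built-in sqlite3 module, where dates are stored as text and compared as strings. For these ISO-formatted values that gives the same result as the date-time comparison the answer describes, but that is an assumption about the target database. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Purchases (id INTEGER, product TEXT, purchasedate TEXT)")
conn.executemany(
    "INSERT INTO Purchases VALUES (?, ?, ?)",
    [
        (1, "Scooter", "2018-05-01 00:00:00"),
        (2, "Scooter", "2018-12-31 15:30:00"),  # afternoon of the last day
        (3, "Scooter", "2019-01-01 00:00:00"),  # just outside the range
    ],
)

# BETWEEN with a date-only upper bound stops at the start of 2018-12-31.
between_count = conn.execute("""
    SELECT COUNT(*) FROM Purchases
    WHERE purchasedate BETWEEN '2018-05-01' AND '2018-12-31'
""").fetchone()[0]

# The half-open range keeps everything before 2019-01-01.
half_open_count = conn.execute("""
    SELECT COUNT(*) FROM Purchases
    WHERE purchasedate >= '2018-05-01' AND purchasedate < '2019-01-01'
""").fetchone()[0]

print(between_count, half_open_count)
```

The BETWEEN version misses the purchase made on the afternoon of the last day, while the half-open range counts it.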

The reason for your error is that you did not specify which table the "id" column you want to count comes from.
SELECT
COUNT(PURCHASES.ID) AS PURCHASE_COUNT
FROM
Purchases
LEFT JOIN Prospect
ON Prospect.id = Purchases.prospect_id
AND (Purchases.purchasedate BETWEEN '5/1/18' AND '12/31/18')
AND Purchases.product = 'Scooter'
AND Prospect.customer_id is null;
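A small sketch of the ambiguity, run on Python's built-in sqlite3 module with made-up rows (the original error came from a different database, so the message wording differs). It also shows a separate point worth knowing: in a LEFT JOIN, conditions placed in the ON clause do not remove rows from the left table, so they do not act as filters on Purchases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Prospect (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE Purchases (id INTEGER PRIMARY KEY, prospect_id INTEGER, product TEXT);
    INSERT INTO Prospect VALUES (1, NULL), (2, 42);
    INSERT INTO Purchases VALUES (10, 1, 'Scooter'), (11, 2, 'Scooter'), (12, 1, 'Bike');
""")

join_sql = "FROM Purchases LEFT JOIN Prospect ON Prospect.id = Purchases.prospect_id"

# Both tables have an id column, so an unqualified id is rejected.
try:
    conn.execute(f"SELECT COUNT(id) {join_sql}")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Conditions in ON of a LEFT JOIN: every purchase is still counted.
on_only = conn.execute(f"""
    SELECT COUNT(Purchases.id) {join_sql}
    AND Purchases.product = 'Scooter'
    AND Prospect.customer_id IS NULL
""").fetchone()[0]

# Inner join plus WHERE: only scooter purchases by prospects
# without a customer_id are counted.
filtered = conn.execute("""
    SELECT COUNT(Purchases.id)
    FROM Purchases
    JOIN Prospect ON Prospect.id = Purchases.prospect_id
    WHERE Purchases.product = 'Scooter' AND Prospect.customer_id IS NULL
""").fetchone()[0]

print(error_message, on_only, filtered)
```

So qualifying the column fixes the error, but the count only matches the stated goal once the conditions actually filter the rows.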

Compare 2 tables and add missing records to the first, taking into account year/months

I have 2 tables, one with codes and budgets called FACT_QUANTITY_TMP and the other is a tree with all possible codes called C_DS_BD_AP_A.
All codes that exist are in this C_DS_BD_AP_A table, yet not all are in FACT_QUANTITY_TMP. Only those with budget get added by the ERP.
We need all codes to be in this FACT_QUANTITY_TMP table, just with budget to be 0 in that case.
I was trying first to get the missing codes by the following query:
SELECT T2.D_ACTIECODE From
(SELECT distinct
A.FULL_DATE as FULL_DATE, A.DIM03 as DIM03
FROM FACT_QUANTITY_TMP A) T1
RIGHT JOIN
(select distinct B.D_ACTIECODE AS D_ACTIECODE from C_DS_BD_AP_A B) T2
ON
T1.DIM03 = T2.D_ACTIECODE
where T1.DIM03 is null
order by T1.full_date
I get a list of my missing records, yet it doesn't take into account the FULL_DATE (year and month) of the destination table.
In short, FACT_QUANTITY_TMP needs to have all the records it's missing added, grouped by month and year.
I'm looking for the best approach here; this query would be used in a stored proc that runs automatically every month when the ERP data gets pulled.

You can generate the missing records by doing a cross join to generate all combinations and then removing those that are already there. For example:
select fd.FULL_DATE, c.D_ACTIECODE
from (select distinct FULL_DATE from fact_quantity_tmp) fd cross join
(select D_ACTIECODE from C_DS_BD_AP_A) c left join
fact_quantity_tmp fqt
on fqt.FULL_DATE = fd.FULL_DATE and fqt.dim03 = c.D_ACTIECODE
where fqt.FULL_DATE is null;
You can put an insert before this to insert these rows into the fact table.
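A sketch of the whole insert, run on Python's built-in sqlite3 module with made-up rows. The BUDGET column name and the 'YYYY-MM' date values are assumptions, since the question does not show the table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE C_DS_BD_AP_A (D_ACTIECODE TEXT);
    CREATE TABLE FACT_QUANTITY_TMP (FULL_DATE TEXT, DIM03 TEXT, BUDGET INTEGER);
    INSERT INTO C_DS_BD_AP_A VALUES ('A1'), ('A2'), ('A3');
    INSERT INTO FACT_QUANTITY_TMP VALUES
        ('2020-01', 'A1', 100), ('2020-01', 'A2', 50), ('2020-02', 'A3', 75);
""")

# All (month, code) combinations, minus those already present,
# inserted with a budget of 0.
conn.execute("""
    INSERT INTO FACT_QUANTITY_TMP (FULL_DATE, DIM03, BUDGET)
    SELECT fd.FULL_DATE, c.D_ACTIECODE, 0
    FROM (SELECT DISTINCT FULL_DATE FROM FACT_QUANTITY_TMP) fd
    CROSS JOIN (SELECT DISTINCT D_ACTIECODE FROM C_DS_BD_AP_A) c
    LEFT JOIN FACT_QUANTITY_TMP fqt
           ON fqt.FULL_DATE = fd.FULL_DATE AND fqt.DIM03 = c.D_ACTIECODE
    WHERE fqt.FULL_DATE IS NULL
""")

rows = conn.execute("""
    SELECT FULL_DATE, DIM03, BUDGET
    FROM FACT_QUANTITY_TMP
    ORDER BY FULL_DATE, DIM03
""").fetchall()
added = [r[:2] for r in rows if r[2] == 0]
print(len(rows), added)
```

Running the statement a second time inserts nothing, because every combination is then present, which suits a monthly scheduled job.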

Joining table issue with SQL Server 2008

I am using the following query to obtain some sales figures. The problem is that it is returning the wrong data.
I am joining together three tables tbl_orders tbl_orderitems tbl_payment. The tbl_orders table holds summary information, the tbl_orderitems holds the items ordered and the tbl_payment table holds payment information regarding the order. Multiple payments can be placed against each order.
I am trying to get the sum of the items sum(mon_orditems_pprice), and also the amount of items sold count(uid_orderitems).
When I run the following query against a specific order number, which I know has 1 order item. It returns a count of 2 and the sum of two items.
Item                      ProdTotal  ProdCount
Westvale Climbing Frame   1198       2
This order has two payment records held in the tbl_payment table, which is causing the double count. If I remove the payment table join it reports the correct figures, or if I select an order which has a single payment it works as well. Am I missing something, I am tired!!??
SELECT
txt_orditems_pname,
SUM(mon_orditems_pprice) AS prodTotal,
COUNT(uid_orderitems) AS prodCount
FROM dbo.tbl_orders
INNER JOIN dbo.tbl_orderitems ON (dbo.tbl_orders.uid_orders = dbo.tbl_orderitems.uid_orditems_orderid)
INNER JOIN dbo.tbl_payment ON (dbo.tbl_orders.uid_orders = dbo.tbl_payment.uid_pay_orderid)
WHERE
uid_orditems_orderid = 61571
GROUP BY
dbo.tbl_orderitems.txt_orditems_pname
ORDER BY
dbo.tbl_orderitems.txt_orditems_pname
Any suggestions?
Thank you.
Drill down Table columns
dbo.tbl_payment.bit_pay_paid (1/0) Has this payment been paid, yes no
dbo.tbl_orders.bit_order_archive (1/0) Is this order archived, yes no
dbo.tbl_orders.uid_order_webid (integer) Web Shop's ID
dbo.tbl_orders.bit_order_preorder (1/0) Is this a pre-order, yes no
YEAR(dbo.tbl_orders.dte_order_stamp) (2012) Sales year
dbo.tbl_orders.txt_order_status (varchar) Is the order dispatched, awaiting delivery
dbo.tbl_orderitems.uid_orditems_pcatid (integer) Product category ID

This is normal behavior. If you remove the grouping clause, you'll see that there really are 2 rows after joining, and both have 599 as mon_orditems_pprice, so the SUM of 1198 is correct for the joined rows. When a row matches multiple rows in any joined table, the output row is repeated once per match, and the data being summed (or counted, or aggregated in any other way) is aggregated multiple times as well. Try this:
SELECT txt_orditems_pname,
SUM(mon_orditems_pprice) AS prodTotal,
COUNT(uid_orderitems) AS prodCount
FROM dbo.tbl_orders
INNER JOIN dbo.tbl_orderitems ON (dbo.tbl_orders.uid_orders = dbo.tbl_orderitems.uid_orditems_orderid)
INNER JOIN
(
SELECT x.uid_pay_orderid
FROM dbo.tbl_payment x
GROUP BY x.uid_pay_orderid
) AS payments ON (dbo.tbl_orders.uid_orders = payments.uid_pay_orderid)
WHERE
uid_orditems_orderid = 61571
GROUP BY
dbo.tbl_orderitems.txt_orditems_pname
ORDER BY
dbo.tbl_orderitems.txt_orditems_pname
I don't know what data from tbl_payment you are using, are any of the columns from the SELECT list actually from tbl_payment? Why is tbl_payment being joined?
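The double count and the fix can be reproduced with a small sketch on Python's built-in sqlite3 module. The short table and column names are simplified stand-ins for the ones in the question. The data mirrors the case described: one item at 599 and two payment rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER);
    CREATE TABLE order_items (item_id INTEGER, order_id INTEGER, name TEXT, price INTEGER);
    CREATE TABLE payments (pay_id INTEGER, order_id INTEGER);
    INSERT INTO orders VALUES (61571);
    INSERT INTO order_items VALUES (1, 61571, 'Westvale Climbing Frame', 599);
    INSERT INTO payments VALUES (1, 61571), (2, 61571);
""")

# Joining the raw payments table repeats the item once per payment.
naive = conn.execute("""
    SELECT i.name, SUM(i.price), COUNT(i.item_id)
    FROM orders o
    JOIN order_items i ON i.order_id = o.order_id
    JOIN payments p ON p.order_id = o.order_id
    GROUP BY i.name
""").fetchone()

# Collapsing payments to one row per order first avoids the repetition.
fixed = conn.execute("""
    SELECT i.name, SUM(i.price), COUNT(i.item_id)
    FROM orders o
    JOIN order_items i ON i.order_id = o.order_id
    JOIN (SELECT order_id FROM payments GROUP BY order_id) p
      ON p.order_id = o.order_id
    GROUP BY i.name
""").fetchone()

print(naive, fixed)
```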

Some SQL Questions

I have been using SQL for years, but have mostly been using the query designer within SQL Studio (etc.) to put together my queries. I've recently found some time to actually "learn" what everything is doing and have set myself the following fairly simple tasks. Before I begin, I'd like to ask the SOF community their thoughts on the questions, possible answers and any tips they may have.
The questions are;
Find all records w/ a duplicate in a particular column (e.g. a linking id is in more than 1 record throughout table)
SUM price from a linked table within the same query (select within a select?)
Explain the difference between the 4 joins; LEFT, RIGHT, OUTER, INNER
Copy data from one table to another based on SELECT and WHERE criteria
Input welcomed & appreciated.
Chris

I recommend that you start by following some tutorials on this topic. Your questions are not uncommon questions for someone moving from a beginner to intermediate level in SQL. SQLZoo is an excellent resource for learning SQL so consider following that.
In response to your questions:
1) Find all records with a duplicate in a particular column
There are two steps here: find duplicate records and select those records. To find the duplicate records you should be doing something along the lines of:
select possible_duplicate_field, count(*)
from table
group by possible_duplicate_field
having count(*) > 1
What we're doing here is selecting everything from a table, then grouping it by the field we want to check for duplicates. The count function then gives me a count of the number of items within that group. The HAVING clause indicates that we want to filter AFTER the grouping to only show the groups which have more than one entry.
This is all fine in itself but it doesn't give you the actual records that have those values on them. If you knew the duplicate values then you'd write this:
select * from table where possible_duplicate_field = 'known_duplicate_value'
We can use the SELECT within a select to get a list of the matches:
select *
from table
where possible_duplicate_field in (
select possible_duplicate_field
from table
group by possible_duplicate_field
having count(*) > 1
)
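Here is the same pair of queries run against a few made-up rows, using Python's built-in sqlite3 module. The table `records` and column `link_id` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (id INTEGER PRIMARY KEY, link_id TEXT);
    INSERT INTO records (id, link_id)
    VALUES (1, 'a'), (2, 'b'), (3, 'a'), (4, 'c'), (5, 'b');
""")

# Step 1: which values occur more than once?
dup_values = conn.execute("""
    SELECT link_id, COUNT(*)
    FROM records
    GROUP BY link_id
    HAVING COUNT(*) > 1
    ORDER BY link_id
""").fetchall()

# Step 2: fetch the full rows that carry those values.
dup_rows = conn.execute("""
    SELECT id, link_id
    FROM records
    WHERE link_id IN (
        SELECT link_id FROM records
        GROUP BY link_id
        HAVING COUNT(*) > 1
    )
    ORDER BY id
""").fetchall()

print(dup_values)
print(dup_rows)
```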
2) SUM price from a linked table within the same query
This is a simple JOIN between two tables with a SUM of the two:
select sum(tableA.X + tableB.Y)
from tableA
join tableB on tableA.keyA = tableB.keyB
What you're doing here is joining two tables together where those two tables are linked by a key field. In this case, this is an inner join (a plain JOIN), which operates as you would expect (i.e. get me every row from the left table which has a matching record in the right table).
3) Explain the difference between the 4 joins; LEFT, RIGHT, OUTER, INNER
Consider two tables A and B. The concept of "LEFT" and "RIGHT" in this case are slightly clearer if you read your SQL from left to right. So, when I say:
select x from A join B ...
The left table is "A" and the right table is "B". When you write LEFT or RIGHT you are declaring which of the two tables is the primary table, that is, the one whose rows are all kept even when nothing matches in the other table. LEFT and RIGHT are outer joins (LEFT JOIN is shorthand for LEFT OUTER JOIN). Incidentally, if you omit LEFT or RIGHT and write a plain JOIN, SQL performs an INNER join.
INNER and OUTER declare what to do when matches don't exist in one of the tables. INNER returns only the rows that have a matching record in both tables. Hence, if table A contains keys "X", "Y" and "Z", and table B contains keys "X" and "Z", then an INNER join will only return "X" and "Z" records from the two tables.
When an OUTER join is used, we're saying: give me everything from the primary table and anything that matches from the secondary table. Hence, in the previous example, A LEFT OUTER JOIN B returns "X", "Y" and "Z" records. However, there would be NULLs in the fields which should have come from the secondary table for key value "Y", as it doesn't exist in the secondary table. A FULL OUTER JOIN keeps the unmatched rows from both tables.
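The X/Y/Z example can be checked directly. This sketch uses Python's built-in sqlite3 module and includes a plain JOIN for comparison. RIGHT and FULL joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (k TEXT);
    CREATE TABLE B (k TEXT);
    INSERT INTO A VALUES ('X'), ('Y'), ('Z');
    INSERT INTO B VALUES ('X'), ('Z');
""")

def run(join_keyword):
    # Same query each time; only the join keyword changes.
    sql = f"SELECT A.k, B.k FROM A {join_keyword} B ON B.k = A.k ORDER BY A.k"
    return conn.execute(sql).fetchall()

inner_rows = run("INNER JOIN")
plain_rows = run("JOIN")
left_rows = run("LEFT OUTER JOIN")

print(inner_rows)
print(plain_rows)
print(left_rows)
```

The plain JOIN returns the same rows as INNER JOIN, and the left outer join adds the "Y" row with a NULL (None) on the B side.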
4) Copy data from one table to another based on SELECT and WHERE criteria
This is pretty trivial and I'm surprised you've never encountered it. It's a simple nested SELECT in an INSERT statement (this may not be supported by your database - if not, try the next option):
insert into new_table select * from old_table where x = y
This assumes the tables have the same structure. If you have different structures then you'll need to specify the columns:
insert into new_table (list, of, fields)
select list, of, fields from old_table where x = y
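A runnable version of the column-list form, as a sketch on Python's built-in sqlite3 module. The tables and the `status` filter are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (id INTEGER, status TEXT);
    CREATE TABLE new_table (id INTEGER, status TEXT);
    INSERT INTO old_table VALUES (1, 'Closed'), (2, 'Open'), (3, 'Closed');

    -- Copy only the rows that satisfy the WHERE criteria.
    INSERT INTO new_table (id, status)
    SELECT id, status FROM old_table WHERE status = 'Closed';
""")

copied = conn.execute("SELECT id, status FROM new_table ORDER BY id").fetchall()
print(copied)
```

Naming the columns on both sides keeps the copy working even if the two tables later gain extra columns or differ in column order.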

Let's say you have 2 tables named :
[OrderLine] with the columns [Id, OrderId, ProductId, Qty, Status]
[Product] with [Id, Name, Price]
1) all orders having more than 1 order line (it's technically the same as looking for duplicates on OrderId):
select OrderId, count(*)
from OrderLine
group by OrderId
having count(*) > 1
2) total price for all order lines of order 1000:
select sum(p.Price * ol.Qty) as Price
from OrderLine ol
inner join Product p on ol.ProductId = p.Id
where ol.OrderId = 1000
3) difference between joins:
a inner join b => take every a that has a match in b; if no b is found, a is not returned
a left join b => take all a, match them with b, include a even if b is not found
a right join b => b left join a
a full outer join b => (a left join b) union (a right join b)
4) copy order lines to a history table :
insert into OrderLinesHistory
(CopiedOn, OrderLineId, OrderId, ProductId, Qty)
select
getDate(), Id, OrderId, ProductId, Qty
from
OrderLine
where
status = 'Closed'

To answer #4 and to perhaps show at least some understanding of SQL and the fact this isn't HW, just me trying to learn best practise;
SET NOCOUNT ON;
DECLARE @rc int
if @what = 1
BEGIN
    select id from color_mapper where product = @productid and color = @colorid;
    select @rc = @@rowcount
    if @rc = 0
    BEGIN
        exec doSavingSPROC @colorid, @productid;
    END
END
END
