Table 1: Customer Info Data (has ~50K records)
Table 2: Sales by customer ID (has ~25K records)
When I am performing a left join of Sales data from Table 2 on Table 1 based on the Customer ID, the output has a handful of records (~500) with a unit increase in the sales quantity. For e.g. sales quantity for customer #1 is 200, the sale quantity that I am getting in my joined output is 201. Note again that this is only for handful of records, for the majority of the data, it is joined absolutely correctly.
The SQL query is a pretty standard one:
SELECT [Table1$].[ID], Name, Volume, Amount
FROM [Table1$] LEFT JOIN [Table2$] ON [Table1$].[ID] = [Table2$].[ID]
What is weird is that this error is only for a few records and it is changing only by a unit for all of them. What do you think could be the potential reason here? Note that I run this SQL query from VBA.
Edit:
Maybe if I add an image, it would help, I have masked my data:
Table1, blue colored column is the joined column, I have filtered for the ID where there is a problem:
Table2, source of the joined column, as you can see it is 20, but the value that has been joined with table 1 is 21. Interestingly, there is no ID with value 21 in table2
There is certainly a space or something that makes seemingly identical IDs different. To make sure do a select distinct id on the sales table.
Related
I have a select query which combined several tables. PRODUCTION_ORDER_RESULTS, PRODUCTION_ORDERS and SERVICE_GUARANTY_NEW have common value however STOCKS table does not.
SELECT PR_ORDERS.ARRIVED_CITY,
PR_ORDERS.MONTAJ_DATE,
PR_ORDER_RESULT.TRANSFER_DATE,
PR_ORDERS.P_ORDER_ID,
PR_ORDER_RESULT.P_ORDER_ID,
SG.SALE_CONSUMER_ID,
SG.IS_SERI_SONU,
S.BRAND_ID,
S.PROPERTY
FROM workcube_test_1.PRODUCTION_ORDER_RESULTS PR_ORDER_RESULT,
workcube_test_1.PRODUCTION_ORDERS PR_ORDERS,
workcube_test_1.SERVICE_GUARANTY_NEW SG,
workcube_test_1.STOCKS S
WHERE PR_ORDER_RESULT.P_ORDER_ID = PR_ORDERS.P_ORDER_ID
AND PR_ORDER_RESULT.PR_ORDER_ID = SG.PROCESS_ID
when I run the query, it shows the output as below.
The problem here is there are four data rows returned from PRODUCTION_ORDER_RESULTS, PRODUCTION_ORDERS, SERVICE_GUARANTY_NEW and once I have added the STOCKS table, arrived_city, montaj_date, transfer_date columns are side by side with STOCKS table's rows, but the columns value should be null, not filled with data.
The way I tried is UNION of STOCKS table, however unioned table values are ignored, can not use them in html blocks.
there needs to be at least one more join condition among tables where there's for STOCKS table, I think there might exist such a column STOCK_ID within a table such as PRODUCTION_ORDER_RESULTS in order to join with STOCKS table. I think this should be the reason for multiple returning rows. If there's no common column, then the returning data will be produced as many as the number of records within STOCK table due to existing CROSS JOIN logic within the current query. So rearrange your query as
SELECT PR_ORDERS.ARRIVED_CITY,
PR_ORDERS.MONTAJ_DATE,
PR_ORDER_RESULT.TRANSFER_DATE,
PR_ORDERS.P_ORDER_ID,
PR_ORDER_RESULT.P_ORDER_ID,
SG.SALE_CONSUMER_ID,
SG.IS_SERI_SONU,
S.BRAND_ID,
S.PROPERTY
FROM workcube_test_1.PRODUCTION_ORDER_RESULTS PR_ORDER_RESULT
JOIN workcube_test_1.PRODUCTION_ORDERS PR_ORDERS
ON PR_ORDER_RESULT.P_ORDER_ID = PR_ORDERS.P_ORDER_ID
JOIN workcube_test_1.SERVICE_GUARANTY_NEW SG
ON PR_ORDER_RESULT.PR_ORDER_ID = SG.PROCESS_ID
JOIN workcube_test_1.STOCKS S
ON PR_ORDER_RESULT.STOCK_ID = S.ID
I'm with a doubt about joins. For example, using an example database dvdrental, this query:
SELECT customer.customer_id
, first_name
, last_name
FROM customer
INNER JOIN payment ON Customer.customer_id = Payment.customer_id
Some records appear repeated, for example, it appears 3 times "342 Harold Martino" like:
342 Harold Martino
342 Harold Martino
342 Harold Martino
Do you know why it appears repeated records like in this example that appears the same Record 3 times? This repetition means that there are 3 records in the payment table where customer_id = 342? But this query "select * from payment where customer_id = 342" returns 32 records. So I'm not understanding properly how the join works.
There are many resources around this, so to be short your expression says this in plain english:
Get all the records from the customer table
Then for each of those records, get every payment record that has the same value in the customer_Id field.
return a single row for each payment record that duplicates all the fields from the customer record for each row in the payment record.
Finally, only return 3 columns:
the customer_id column from the customer table
the first_name column that is in one of the customer or payment table
the last_name column that is in one of the customer or payment table
Note that we didn't bring back any columns from the payment table... (I assume first_name and last_name are fields in the customer table...)
Keep in mind, a CROSS JOIN (or a FULL OUTER JOIN) is a join that says take all fields from the left side and create a single row combination that is multiplied by the right side, so for every row on the left, return a combination of the left row with every row on the right. So the number of rows returned in a CROSS JOIN is the number of rows in the current table, multiplied by the number of rows in the joined table.
In your query, an INNER JOIN or LEFT INNER JOIN will produce a recordset that includes all the fields from the current table structure and will include fields from the joined table as well.
the implicit LEFT component specifies that only records that match the existing table structure should be returned, so in this case only Payment records that match a customer_id in the currently not filtered customer table will be returned.
The number of resulting rows is the number of rows in the joined table that have a match in the current table.
If instead you want to query:
Select all the customers that have payments
then you can use a join, but you should also use a DISTINCT clause, to only return the unique records:
SELECT DISTINCT customer.customer_id
, first_name
, last_name
FROM customer
INNER JOIN payment ON Customer.customer_id = Payment.customer_id
An alternative way to do this is to use a sub-query instead of a join:
SELECT customer_id
, first_name
, last_name
FROM customer
WHERE EXISTS (SELECT customer_id FROM payment WHERE payment.customer_id = customer.customer_id)
The rules on when to use one style of query over the other are pretty convoluted and very dependant on the number of rows in each table, the types of indexes that are available and even the type or version of the RDBMS you are operating within.
To optimise, run both queries, compare the results and timing and use the one that fits your database better. If later performance becomes an issue, try the other one :)
Select the Customer_id field
Good Evening Stackoverflow team/members
Oracle version: 11g Release 11.1.0.6.0 - 64bit Production
I have two Tables: ORDERS and ITEMS.
ORDERS looks like this
ORDERS
ITEMS looks like this
enter image description here
Table ORDERS has 1 or more Order_Number each assigned to at least one depot or more. the column Total_Value SHOULD store exactly the sum of the item values associated to an order number. the table ITEMS in fact stores via parcel_code items values for a specific order_number/s.
my db has a bug for orders that have more than one depot assigned to an order number (e.g. order number 1 and 4) do not store the actual total value correctly.
in my case I cannot figure out the UPDATE statement that would select order numbers that would take the sum from the table ITEMS and link it via parcel_code and update the column total_value in the table ORDERS.
the result of my update should give this back:
for table orders i should get back for
order number 1:
for both rows total value 1120
order number 4
for both rows total value: 350
order number 2 and 3 as they are single depot order remain the same:
50 and 20
pseudo-code :
update ORDERS
set total_value = (select sum(I.item_value) from ITEMS I, ORDERS O where O.parcel_code = I.parcel_code)
i would update also those orders which have on e depot assigned as they will exactly the same.
i was looking at MERGE statements or INNER select queries. the problem i am facing is that my update must be dynamic. this means not driven by values but by columns joins as i may have to create a process that update this every day.
can someone help me?
You need to join the items and orders tables together, then select the sum of all item_values where your order_number is equal to the row which you are updating (in table o1).
update
Orders o1
set o1.total_value = (
select sum(i.item_value) from Items i
join Orders o2 on o2.parcel_code = i.parcel_code
where o2.Order_number = o1.Order_number
)
I am using the following query to obtain some sales figures. The problem is that it is returning the wrong data.
I am joining together three tables tbl_orders tbl_orderitems tbl_payment. The tbl_orders table holds summary information, the tbl_orderitems holds the items ordered and the tbl_payment table holds payment information regarding the order. Multiple payments can be placed against each order.
I am trying to get the sum of the items sum(mon_orditems_pprice), and also the amount of items sold count(uid_orderitems).
When I run the following query against a specific order number, which I know has 1 order item. It returns a count of 2 and the sum of two items.
Item ProdTotal ProdCount
Westvale Climbing Frame 1198 2
This order has two payment records held in the tbl_payment table, which is causing the double count. If I remove the payment table join it reports the correct figures, or if I select an order which has a single payment it works as well. Am I missing something, I am tired!!??
SELECT
txt_orditems_pname,
SUM(mon_orditems_pprice) AS prodTotal,
COUNT(uid_orderitems) AS prodCount
FROM dbo.tbl_orders
INNER JOIN dbo.tbl_orderitems ON (dbo.tbl_orders.uid_orders = dbo.tbl_orderitems.uid_orditems_orderid)
INNER JOIN dbo.tbl_payment ON (dbo.tbl_orders.uid_orders = dbo.tbl_payment.uid_pay_orderid)
WHERE
uid_orditems_orderid = 61571
GROUP BY
dbo.tbl_orderitems.txt_orditems_pname
ORDER BY
dbo.tbl_orderitems.txt_orditems_pname
Any suggestions?
Thank you.
Drill down Table columns
dbo.tbl_payment.bit_pay_paid (1/0) Has this payment been paid, yes no
dbo.tbl_orders.bit_order_archive (1/0) Is this order archived, yes no
dbo.tbl_orders.uid_order_webid (integer) Web Shop's ID
dbo.tbl_orders.bit_order_preorder (1/0) Is this a pre-order, yes no
YEAR(dbo.tbl_orders.dte_order_stamp) (2012) Sales year
dbo.tbl_orders.txt_order_status (varchar) Is the order dispatched, awaiting delivery
dbo.tbl_orderitems.uid_orditems_pcatid (integer) Product category ID
It's a normal behavior, if you remove grouping clause you'll see that there really are 2 rows after joining and they both have 599 as a mon_orditems_pprice hence the SUM is correct. When there is a multiple match in any joined table the entire output row becomes multiple and the data that is being summed (or counted or aggregated in any other way) also gets summed multiple times. Try this:
SELECT txt_orditems_pname,
SUM(mon_orditems_pprice) AS prodTotal,
COUNT(uid_orderitems) AS prodCount
FROM dbo.tbl_orders
INNER JOIN dbo.tbl_orderitems ON (dbo.tbl_orders.uid_orders = dbo.tbl_orderitems.uid_orditems_orderid)
INNER JOIN
(
SELECT x.uid_pay_orderid
FROM dbo.tbl_payment x
GROUP BY x.uid_pay_orderid
) AS payments ON (dbo.tbl_orders.uid_orders = payments.uid_pay_orderid)
WHERE
uid_orditems_orderid = 61571
GROUP BY
dbo.tbl_orderitems.txt_orditems_pname
ORDER BY
dbo.tbl_orderitems.txt_orditems_pname
I don't know what data from tbl_payment you are using, are any of the columns from the SELECT list actually from tbl_payment? Why is tbl_payment being joined?
I have 3 data tables.
In the entries data table I have entries with ID (entryId as primary key).
I have another table called EntryUsersRatings in there are multiple entries that have entryId field and a rating value (from 1 to 5).
(ratings are stored multiple times for one entryId).
Columns: ratingId (primary key), entryId, rating (integer value).
In the third data table I have translations of entries in the first table (with entryId, languageId and title - translation).
What I would like to do is select all entries from first data table with their titles (by language ID).
On a top of that I want average rating of each entry (which can be stored multiple times) that is stored in EntryUsersRatings.
I have tried this:
SELECT entries.entryId, EntryTranslations.title, AVG(EntryUsersRatings.rating) AS AverageRating
FROM entries
LEFT OUTER JOIN
EntryTranslations ON entries.entryId = EntryTranslations.entryId AND EntryTranslations.languageId = 1
LEFT OUTER JOIN
EntryUsersRatings ON entries.entryId = EntryUsersRatings.entryId
WHERE entries.isDraft=0
GROUP BY title, entries.entryId
isDraft is just something that means that entries are not stored with all information needed (just incomplete data - irrelevant for our case here).
Any help would be greatly appreciated.
EDIT: my solution gives me null values for rating.
Edit1: this query is working perfectly OK, I was looking into wrong database.
We also came to another solution, which gives us the same result (I hope someone will find this useful):
SELECT entries.entryId, COALESCE(x.EntryUsersRatings, 0) as averageRating
FROM entries
LEFT JOIN(
SELECT rr.entryId, AVG(rating) AS entryRating
FROM EntryUsersRatings rr
GROUP BY rr.entryId) x ON x.entryId = entries.entryId
#CyberHawk: as you are using left outer join with entries, your result will give all records from left table and matching record with your join condition from right table. but for unmatching records it will give you a null value .
check out following link for the deta:
http://msdn.microsoft.com/en-us/library/ms187518(v=sql.105).aspx