Unexpected query results with multiple joins - mysql - sql

I have a multi-table setup to handle online purchase transactions. The main table I'm getting data from is a purchase activity table. It contains ShipAddrID, which connects it to an addresses table, and AcctID, which connects it to a users table - sort of. The AcctID field is a reference to the user's account ID, which is stored in the users table, but what I need to return is their billing address, which is connected in the addresses table via the customer_id field.
To try to clear that up, here's what the tables actually look like.
Purchases table
ID | PurchAmt | AcctID | ShipAddrID
================================================
1 | 30.99 | 25 | 420
2 | 45.22 | 31 | 209
Users table
ID | Name
=================================
25 | Anastasia Beaverhausen
31 | Charles Beaverhausen
45 | Bennie Beaverhausen
Addresses table
ID | customer_id | name | address
==========================================================================
300 | 25 | Anastasia Beaverhausen | 123 Park Avenue
209 | 31 | Charles Beaverhausen | 500 5th Avenue
420 | 45 | Bennie Beaverhausen | 500 North Michigan Avenue
What I need to do is return something like this:
PurchaseID | PurchAmt | billname |billAddress |shipName | shipaddress
====================================================================================================================================
1 | 30.99 | Anastasia Beaverhausen |123 Park Avenue |Bennie Beaverhausen |500 North Michigan Avenue
So I need to get the billaddress by joining purchases to addresses via the purchases.AcctID = addresses.customer_id relationship; then get the shipaddress by joining purchases directly to addresses via the purchases.ShipAddrID = addresses.id relationship. It makes sense in my head, anyway. But when I run the query, I get multiple rows per purchase ID, like this:
PurchActvtyID | billName | billAddress1 |shipName | shipAddress1
==================================================================================================================================
1535 | Anastasia Beaverhausen | 123 Park Avenue |Bennie Beaverhausen | 500 North Michigan Avenue
1535 | Bennie Beaverhausen | 500 North Michigan Avenue | Bennie Beaverhausen | 500 North Michigan Avenue
Can anyone explain why this is happening? I'm sure it's probably something to do with which kind of join to use, but I can't seem to get the correct results no matter which kind I try. Here's my query:
SELECT p.PurchActvtyID, a1.name AS billName, a1.address1 AS billAddress1, a2.name AS shipName, a2.address1 AS shipAddress1
FROM arrc_PurchaseActivity p
LEFT OUTER JOIN jos_customers_addresses a1 ON p.AcctID = a1.customer_id
LEFT OUTER JOIN jos_customers_addresses a2 ON p.ShipAddrID = a2.id
ORDER BY p.PurchActvtyID ASC
EDITS
results of Stephen's query:
PurchActvtyID | billName | shipName | billAddress | shipAddress
========================================================================================
1535 | Esther Strom | Esther Strom |123 Park Avenue | 500 North Michigan Avenue
1535 | Esther Strom | Esther Strom |500 North Michigan Avenue |500 North Michigan Avenue
The reason the name differs from what I showed in my example of the desired outcome is that your query is pulling the name from the users table, which isn't accurate - the user name isn't necessarily the same as the billing or shipping name associated with a given user. This is why I need to pull those values from the addresses table, not the users table.

Although there aren't duplicates as such, you can have multiple addresses for a single customer ID - this is what appears to be happening in the example, as a single purchase (1535) is returning multiple billing addresses (both 123 Park Avenue and 500 North Michigan Avenue).
Normally, a customer could have many billing addresses (over time), although there is normally only one billing address per transaction. I therefore suggest adding a BillAddrID field to arrc_PurchaseActivity (if it doesn't already have one), and changing the join condition for the jos_customers_addresses alias a1 to p.BillAddrID = a1.id.
Alternatively, you may want (or already have) only one billing address per customer, in which case you should add a billing address ID field to your customer table (users, in the question), and then change the query to link from the purchase table to the customer table, and then from the customer's billing address ID to the address table, to return a single billing address for the transaction.
EDIT, following comments:
The following query should resolve the issue of multiple addresses being returned on the billing alias:
SELECT p.PurchActvtyID, a1.name AS billName, a1.address1 AS billAddress1, a2.name AS shipName, a2.address1 AS shipAddress1
FROM arrc_PurchaseActivity p
LEFT OUTER JOIN jos_customers_addresses a1 ON p.AcctID = a1.customer_id and a1.billing = 1
LEFT OUTER JOIN jos_customers_addresses a2 ON p.ShipAddrID = a2.id
ORDER BY p.PurchActvtyID ASC
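To see why the extra row appears and why the billing filter removes it, here is a minimal sketch using Python's built-in sqlite3 module. The shortened table and column names and the sample rows are invented for illustration, and it assumes (as the query above does) that the addresses table has a billing flag column. The question is about MySQL, but the join behaviour shown here is standard SQL.

```python
import sqlite3

# Hypothetical miniature of the question's tables; names are shortened and the
# rows are invented. The `billing` flag is the column used in the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (id INTEGER, acct_id INTEGER, ship_addr_id INTEGER);
CREATE TABLE addresses (id INTEGER, customer_id INTEGER, name TEXT,
                        address TEXT, billing INTEGER);
INSERT INTO purchases VALUES (1535, 25, 420);
-- Customer 25 owns two address rows: one billing, one shipping.
INSERT INTO addresses VALUES
    (300, 25, 'Anastasia Beaverhausen', '123 Park Avenue', 1),
    (420, 25, 'Bennie Beaverhausen', '500 North Michigan Avenue', 0);
""")

base = """
SELECT p.id, a1.name AS billName, a2.name AS shipName
FROM purchases p
LEFT JOIN addresses a1 ON p.acct_id = a1.customer_id {extra}
LEFT JOIN addresses a2 ON p.ship_addr_id = a2.id
ORDER BY p.id, a1.id
"""

# Without the filter, a1 matches every address owned by the customer.
unfiltered = conn.execute(base.format(extra="")).fetchall()
# With the filter, a1 matches only the address flagged as billing.
filtered = conn.execute(base.format(extra="AND a1.billing = 1")).fetchall()

print(unfiltered)
print(filtered)
```

The unfiltered join returns one row per address owned by the customer, which is the same shape as the duplicate rows in the question; the filtered join returns a single row per purchase as long as each customer has exactly one address flagged as billing.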

To create the output you want you only need two tables:
Purchases table
Addresses table
However, you need to join to the Addresses table twice:
Once for the Billing Address
Once for the Shipping Address
SELECT
t1.ID as PurchaseID,
t1.PurchAmt as PurchAmt,
t2.name as billname,
t2.address as billaddress,
t3.name as shipname,
t3.address as shipaddress
FROM Purchases t1
INNER JOIN Addresses t2
ON t1.AcctID
= t2.customer_id
INNER JOIN Addresses t3
ON t1.ShipAddrID
= t3.ID
The 1st INNER JOIN links to the billing information
The 2nd INNER JOIN links to the shipping information

I did get an answer to this on experts-exchange. All I needed to do was use a group by clause. So my query now looks like this:
SELECT p.PurchActvtyID, a1.name AS billName, a1.address1 AS billAddress1, a2.name AS shipName, a2.address1 AS shipAddress1
FROM arrc_PurchaseActivity p
LEFT OUTER JOIN jos_customers_addresses a1 ON p.AcctID = a1.customer_id
LEFT OUTER JOIN jos_customers_addresses a2 ON p.ShipAddrID = a2.id
GROUP BY p.PurchActvtyID
ORDER BY p.PurchActvtyID ASC

I don't think there's a reason to join the same table twice based on your criteria...
Try:
SELECT p.PurchActvtyID, a1.name AS billName, a1.address1 AS billAddress1, a1.name AS shipName, a1.address1 AS shipAddress1
FROM arrc_PurchaseActivity p
LEFT OUTER JOIN jos_customers_addresses a1 ON p.AcctID = a1.customer_id AND p.ShipAddrID = a1.id
ORDER BY p.PurchActvtyID ASC
Note - I work with SQL Server, I'm assuming syntax will work with MySQL.
Edit:
I suspect it's happening because you are doing two left outer joins to the same table on different criteria.

Related

Oracle SQL query partially including the desired results

My requirement is to display country name, total number of invoices and their average amount. Moreover, I need to return only those countries where the average invoice amount is greater than the average invoice amount of all invoices.
Query for Oracle Database
SELECT cntry.NAME,
COUNT(inv.NUMBER),
AVG(inv.TOTAL_PRICE)
FROM COUNTRY cntry JOIN
CITY ct ON ct.COUNTRY_ID = cntry.ID JOIN
CUSTOMER cst ON cst.CITY_ID = ct.ID JOIN
INVOICE inv ON inv.CUSTOMER_ID = cst.ID
GROUP BY cntry.NAME,
inv.NUMBER,
inv.TOTAL_PRICE
HAVING AVG(inv.TOTAL_PRICE) > (SELECT AVG(TOTAL_PRICE)
FROM INVOICE);
Result: Austria 1 9500
Expected: Austria 2 4825
Schema
Country
ID(INT)(PK) | NAME(VARCHAR)
City
ID(INT)(PK) | NAME(VARCHAR) | POSTAL_CODE(VARCHAR) | COUNTRY_ID(INT)(FK)
Customer
ID(INT)(PK) | NAME(VARCHAR) | CITY_ID(INT)(FK) | ADDRS(VARCHAR) | POC(VARCHAR) | EMAIL(VARCHAR) | IS_ACTV(INT)(0/1)
Invoice
ID(INT)(PK) | NUMBER(VARCHAR) | CUSTOMER_ID(INT)(FK) | USER_ACC_ID(INT) | TOTAL_PRICE(INT)
With no sample data, we can't really tell whether this:
Expected: Austria 2 4825
is true or not.
Anyway: would changing the GROUP BY clause to
GROUP BY cntry.NAME
(i.e. removing additional two columns from it) do any good?
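As a rough check of that suggestion, the sketch below runs the query against SQLite from Python with both GROUP BY variants. The sample rows are invented (the question gives none) and were chosen so that they happen to reproduce both the reported and the expected line; the schema is trimmed to the columns the query touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, country_id INTEGER);
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city_id INTEGER);
CREATE TABLE invoice (id INTEGER PRIMARY KEY, number TEXT,
                      customer_id INTEGER, total_price INTEGER);
-- Invented sample data: overall average price is (9500 + 150 + 100) / 3 = 3250.
INSERT INTO country VALUES (1, 'Austria'), (2, 'Spain');
INSERT INTO city VALUES (1, 'Vienna', 1), (2, 'Madrid', 2);
INSERT INTO customer VALUES (1, 'A', 1), (2, 'B', 2);
INSERT INTO invoice VALUES (1, 'INV-1', 1, 9500), (2, 'INV-2', 1, 150),
                           (3, 'INV-3', 2, 100);
""")

query = """
SELECT cntry.name, COUNT(inv.number), AVG(inv.total_price)
FROM country cntry
JOIN city ct ON ct.country_id = cntry.id
JOIN customer cst ON cst.city_id = ct.id
JOIN invoice inv ON inv.customer_id = cst.id
GROUP BY {group_by}
HAVING AVG(inv.total_price) > (SELECT AVG(total_price) FROM invoice)
"""

# Original grouping: every invoice is its own group, so the count is always 1.
per_invoice = conn.execute(
    query.format(group_by="cntry.name, inv.number, inv.total_price")).fetchall()
# Suggested grouping: one group per country, so invoices are aggregated.
per_country = conn.execute(query.format(group_by="cntry.name")).fetchall()

print(per_invoice)
print(per_country)
```

With the invoice columns in the GROUP BY, each group holds a single invoice, so only invoices that individually exceed the overall average survive the HAVING clause. Grouping by country alone lets COUNT and AVG work across all of a country's invoices.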
SELECT C.COUNTRY_NAME, COUNT(I.INVOICE_NUMBER), AVG(I.TOTAL_PRICE) AS AVERAGE
FROM COUNTRY C JOIN CITY CS ON C.ID = CS.COUNTRY_ID
JOIN CUSTOMER CUS ON CUS.CITY_ID = CS.ID
JOIN INVOICE I ON I.CUSTOMER_ID = CUS.ID
GROUP BY C.COUNTRY_NAME, C.ID
HAVING AVG(I.TOTAL_PRICE) > (SELECT AVG(TOTAL_PRICE) FROM INVOICE)
Would changing the GROUP BY clause to the following help?
GROUP BY cntry.NAME , cntry.ID
Fix your group by columns.
Keep only cntry.name.
It will work.
This is a hackerrank question.

Coalesce value column on basis of other column without using multiple left join

So, I have a table like this:
|---------------------|------------------|------------------|
| ID | Region |isProductAvailable|
|---------------------|------------------|------------------|
| 12 | USA | Yes |
|---------------------|------------------|------------------|
| 13 | Ohio | No |
|---------------------|------------------|------------------|
| 14 | Australia | Yes |
|---------------------|------------------|------------------|
My use case is this: there is a product, and its availability is based on a predefined hierarchy.
For example:
USA -> Ohio
Australia -> Sydney
Case 1: When I query this product table to check whether the product is available in Ohio, there is an entry for Ohio, so that row should be returned.
Case 2: When I query for Sydney, the table does not contain Sydney, so it should look up its parent in the hierarchy specified above. Since there is an entry for Australia, the value for Australia should be returned.
P.S. I have solved this problem with left join and coalesce, but the problem with that is that the number of left joins increases as the length of the specified hierarchy increases.
select coalesce(rgn_Oh.isProductAvailable,rgn_USA.isProductAvailable)
from
(select t.* from t where region = 'Ohio') rgn_Oh
left join
(select t.* from t where region = 'USA') rgn_USA
on rgn_Oh.id = rgn_USA.id;
If I understand correctly, you can use order by for this:
select t.*
from t
where region in ('USA', 'Ohio', 'Australia', 'Sydney')
order by (case region
when 'Sydney' then 1
when 'Australia' then 2
when 'Ohio' then 3
when 'USA' then 4
end)
fetch first 1 row only;
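A minimal sketch of the same ordering idea, run against SQLite from Python. It differs from the query above in two ways that are assumptions of this sketch: it uses LIMIT 1 because SQLite does not support FETCH FIRST, and the hypothetical helper availability() takes one hierarchy chain at a time (most specific region first) rather than listing both hierarchies in one CASE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER, region TEXT, isProductAvailable TEXT);
INSERT INTO t VALUES (12, 'USA', 'Yes'), (13, 'Ohio', 'No'),
                     (14, 'Australia', 'Yes');
""")

def availability(chain):
    """Return (region, flag) for the first region in `chain` that has a row.

    `chain` is a non-empty list: a region followed by its ancestors,
    most specific first. Returns None if no region in the chain has a row.
    """
    placeholders = ", ".join("?" for _ in chain)
    # Rank each region by its position in the chain: 0 is the most specific.
    whens = " ".join(f"WHEN ? THEN {rank}" for rank in range(len(chain)))
    sql = f"""
        SELECT region, isProductAvailable
        FROM t
        WHERE region IN ({placeholders})
        ORDER BY CASE region {whens} END
        LIMIT 1
    """
    # Parameters are bound in textual order: the IN list, then the CASE arms.
    return conn.execute(sql, [*chain, *chain]).fetchone()

ohio = availability(["Ohio", "USA"])            # Ohio has its own row
sydney = availability(["Sydney", "Australia"])  # falls back to Australia
print(ohio, sydney)
```

Because the chain is just a list, a deeper hierarchy only makes the list longer; no extra joins are needed.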

How can I do a group-concat call with a max value?

I'm tracking game prices across multiple stores. I have a games table:
id | title | platform_id
---|-------------|-----------
1 | Super Mario | 1
2 | Tetris | 3
3 | Sonic | 2
a stores table:
id | title
---|-------------
1 | Target
2 | Amazon
3 | EB Games
and a copies table with one entry for Target's copy of a given game, one entry for Amazon's, etc. I store the SKU so I can use it when scraping their websites.
game_id | store_id | sku
--------|----------|----------
1 | 2 | AMZ-3F4YK
1 | 3 | 001481
I run one scrape a day or a week or however long, and I store the result as cents in a prices table:
sku | price | time
----------|---------|------
AMZ-3F4YK | 4010 | 13811101
001481 | 3210 | 13811105
Plus a platforms table that just maps IDs to names.
Here's where I get confused and stuck.
I want to issue a query that selects each game, plus its most recent price at each store. So it would net results like
games.title | platform_name | info
------------|---------------|------
Super Mario | NES | EB Games,1050;Amazon,3720;Target,5995
Tetris | Game Boy | EB Games,3720;Amazon,410;Target,5995
My best attempt thus far is
select
games.title as title,
platforms.name as platform,
group_concat(distinct(stores.name) || "~" || prices.price) as price_info
from games
join platforms on games.platform_id = platforms.id
join copies on copies.game_id = games.id
join prices on prices.sku = copies.sku
join stores on stores.id = copies.store_id
group by title
Which nets results like
Super Mario | NES | EB Games~2300,Target~2300,Target~3800
that is, it includes every price listed, when I only want one per store (and for it to be the most recent). Figuring out how to integrate the 'select price where id = (select id from max(time)...' etc subquery to sort this out has totally stumped me all night and I'd appreciate any advice anyone could offer me.
I'm using SQLite, but if there's a better option in Postgres I could do it there.
You need two levels of aggregation . . . And, Postgres is much simpler for this, so I'll use Postgres syntax:
select title, platform,
       string_agg(store || '~' || price, ',' order by store) as price_info
from (-- distinct on keeps the first row per (game, platform, store),
      -- which the order by makes the most recent price
      select distinct on (g.title, p.name, s.name)
             g.title as title, p.name as platform, s.name as store, pr.price
      from games g join
           platforms p
           on g.platform_id = p.id join
           copies c
           on c.game_id = g.id join
           prices pr
           on pr.sku = c.sku join
           stores s
           on s.id = c.store_id
      order by g.title, p.name, s.name, pr.time desc
     ) gps
group by title, platform
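Since the question mentions SQLite, here is a hedged sketch of one way to get a similar result there, run from Python. It swaps in a different technique from the one above: SQLite has no DISTINCT ON, so a correlated subquery keeps only the latest price row per SKU before group_concat runs. The sample rows are invented, and the stores table is assumed to have a name column, as in the question's own query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE platforms (id INTEGER, name TEXT);
CREATE TABLE games (id INTEGER, title TEXT, platform_id INTEGER);
CREATE TABLE stores (id INTEGER, name TEXT);
CREATE TABLE copies (game_id INTEGER, store_id INTEGER, sku TEXT);
CREATE TABLE prices (sku TEXT, price INTEGER, time INTEGER);
INSERT INTO platforms VALUES (1, 'NES');
INSERT INTO games VALUES (1, 'Super Mario', 1);
INSERT INTO stores VALUES (2, 'Amazon'), (3, 'EB Games');
INSERT INTO copies VALUES (1, 2, 'AMZ-3F4YK'), (1, 3, '001481');
-- Two scrapes for the Amazon SKU; only the later one (time 200) should appear.
INSERT INTO prices VALUES ('AMZ-3F4YK', 4500, 100), ('AMZ-3F4YK', 4010, 200),
                          ('001481', 3210, 150);
""")

sql = """
SELECT g.title, p.name, group_concat(s.name || '~' || pr.price, ';')
FROM games g
JOIN platforms p ON g.platform_id = p.id
JOIN copies c ON c.game_id = g.id
JOIN stores s ON s.id = c.store_id
JOIN prices pr ON pr.sku = c.sku
-- Keep only the most recent price row for each SKU.
WHERE pr.time = (SELECT MAX(latest.time) FROM prices latest
                 WHERE latest.sku = pr.sku)
GROUP BY g.id, g.title, p.name
"""

rows = conn.execute(sql).fetchall()
title, platform, price_info = rows[0]
# group_concat does not guarantee an order, so parse into a dict to inspect.
info = dict(item.split("~") for item in price_info.split(";"))
print(title, platform, info)
```

If two scrapes for the same SKU share the same timestamp, both rows would pass the filter, so this sketch assumes timestamps are unique per SKU.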

Getting count of cases of a variable from one table such that its entries are different in another column of a different table in SQLite?

So here it is.
From the customers table I have:
customerid (unique for each customer)
city (the customer's city)
From the invoices table I have:
billingcity
customerid
Now I have to find the customer IDs whose billing city is different from the city they live in (customers.city). My code is this:
Select Customers.customerid, Customers.city, Invoices.Billingcity
From Customers Inner join
Invoices
ON customers.city <> invoices.billingcity
What I want is each unique customer ID (1, 2, 3, 4, ...) in one column and the number of mismatch cases in another column. But what I am getting is something like this:
(Read it like this: after the billing city, each time the customer ID 1 repeats, it indicates a new entry. I don't know how to format this column, sorry.)
CustomerId | City | BillingCity |
| 1 | São José dos Campos | Stuttgart |
| 1 | São José dos Campos | Oslo |
| 1 | São José dos Campos | Brussels |
(Output limit exceeded, 10 of 23812 total rows shown)
You need to join on the customer id and then compare the cities:
Select c.customerid, c.city, i.Billingcity
From Customers c join
Invoices i
on c.customerid = i.customerid
where c.city <> i.billingcity;
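A small sketch, using SQLite from Python with invented rows, of why the original join produced so many rows and what the keyed join returns instead. The last query, which counts mismatches per customer, is one reading of what the question asks for and is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerid INTEGER, city TEXT);
CREATE TABLE invoices (invoiceid INTEGER, customerid INTEGER, billingcity TEXT);
-- Invented rows: only invoice 11 is billed to a city other than the customer's.
INSERT INTO customers VALUES (1, 'Oslo'), (2, 'Stuttgart');
INSERT INTO invoices VALUES (10, 1, 'Oslo'), (11, 1, 'Brussels'),
                            (12, 2, 'Stuttgart');
""")

# Joining only on "cities differ" pairs every customer with every invoice
# (including other customers' invoices) whose city is different.
unkeyed = conn.execute("""
    SELECT COUNT(*)
    FROM customers c JOIN invoices i ON c.city <> i.billingcity
""").fetchone()[0]

# Joining on the customer id first restricts each customer to their own invoices.
keyed = conn.execute("""
    SELECT c.customerid, c.city, i.billingcity
    FROM customers c JOIN invoices i ON c.customerid = i.customerid
    WHERE c.city <> i.billingcity
    ORDER BY c.customerid
""").fetchall()

# Number of mismatching invoices per customer.
per_customer = conn.execute("""
    SELECT c.customerid, COUNT(*)
    FROM customers c JOIN invoices i ON c.customerid = i.customerid
    WHERE c.city <> i.billingcity
    GROUP BY c.customerid
""").fetchall()

print(unkeyed, keyed, per_customer)
```

Even with three invoices, the unkeyed join returns more rows than there are invoices, which is the same effect that produced 23812 rows in the question.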

Condition SQL Join

We have a table of users where most have a school number attached. I've then used an Inner Join to join the school's table to get the name of the school. Some users don't have a school number so there is a NULL value - which means that none of their data is appearing. Is there a way I can do a conditional join dependent on the schoolid field?
Users Table:
Name | Schoolid
-----|---------
John | 27
Fred | 49
Sam | NULL
School Table:
Schoolid | Schoolname
----------|-----------
27 | John's School
49 | Fred's School
When the tables are Joined on the Schoolid the results are
Name | Schoolname
-----|-----------
John | John's School
Fred | Fred's School
Ideally I would like the results to look like this:
Name | Schoolname
-----|-----------
John | John's School
Fred | Fred's School
Sam | NULL
Can anybody help? Is it something simple and I'm just being an idiot?
Thanks
You're looking for an outer join.
E.g.
select * from Users left outer join School on Users.Schoolid = School.Schoolid
Microsoft has an article with some examples that may make it more clear, even if you're using a different SQL dialect.
SELECT *
FROM Users u
LEFT OUTER JOIN School s ON u.Schoolid = s.Schoolid
Or even.
SELECT *
FROM Users u
FULL OUTER JOIN School s ON u.Schoolid = s.Schoolid
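For a quick comparison of the inner and left outer join on the question's sample data, here is a minimal sketch using SQLite from Python. The ORDER BY is added only to make the printed output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (Name TEXT, Schoolid INTEGER);
CREATE TABLE School (Schoolid INTEGER, Schoolname TEXT);
INSERT INTO Users VALUES ('John', 27), ('Fred', 49), ('Sam', NULL);
INSERT INTO School VALUES (27, 'John''s School'), (49, 'Fred''s School');
""")

query = """
SELECT u.Name, s.Schoolname
FROM Users u
{join_type} JOIN School s ON u.Schoolid = s.Schoolid
ORDER BY u.Name
"""

# INNER JOIN drops Sam: NULL never compares equal to any Schoolid.
inner_rows = conn.execute(query.format(join_type="INNER")).fetchall()
# LEFT OUTER JOIN keeps every user and fills Schoolname with NULL (None).
left_rows = conn.execute(query.format(join_type="LEFT OUTER")).fetchall()

print(inner_rows)
print(left_rows)
```

The left outer join keeps every row from Users, so no conditional logic on Schoolid is needed.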