New to SQL. Query SQL database using info across three tables

This is using phpMyAdmin.
I need to find the contact information for Subscribers who have pending Orders on November 15th. Their contact information is stored in a table called Subscribers, and the primary key is UID (User ID). The Subscriptions Table has a primary key called SID (Subscriptions ID). The Subscriptions table also stores the UID for each Subscription. However, the Orders table is where the Date is stored, and this table stores the SID but not the UID, so I can't directly JOIN Orders with Subscribers.
I have to JOIN Orders with Subscriptions on SID where the Orders Date is 11-15-10, and then I have to JOIN the resulting table with the Subscribers table on UID.
I'm currently trying this:
SELECT * FROM Subscribers
RIGHT JOIN (Orders a, Subscriptions b, Subscribers c)
ON (a.SID = b.SID AND b.UID = c.UID)
WHERE a.Date = '2010-11-01'
This is causing a massive lag followed by Gateway Timeout.
This is a classic case of knowing what to do, but not knowing how to do it. Any help would be greatly appreciated. Thanks!

You could try this:
SELECT
    scrb.*
FROM
    Subscribers scrb
WHERE
    scrb.UID IN (
        SELECT DISTINCT
            scrp.UID
        FROM
            Subscriptions scrp
            INNER JOIN Orders ordr ON
                ordr.SID = scrp.SID
        WHERE
            ordr.Date = STR_TO_DATE('2010-11-01', '%Y-%m-%d')
    )
I'm not sure you'll see a big performance improvement, though. Your tables may simply be missing useful indexes.
In fact, try executing just the inner query (SELECT DISTINCT scrp.UID ...) first. If that alone is too slow, I would guess the problem is the Orders.Date column: without an index on it, every lookup is a full scan of the Orders table, which is expensive.
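To see whether a date filter triggers a full scan, you can compare the query plan before and after adding an index. The sketch below uses Python's built-in sqlite3 module rather than MySQL, so the plan text will differ from what phpMyAdmin's EXPLAIN shows; the OID column, the index name idx_orders_date and the sample rows are assumptions made up for the illustration.

```python
import sqlite3

# Hypothetical minimal Orders table; only SID and Date come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (OID INTEGER PRIMARY KEY, SID INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO Orders (SID, Date) VALUES (?, ?)",
    [(i % 50, "2010-11-%02d" % (i % 28 + 1)) for i in range(1000)],
)

query = "SELECT SID FROM Orders WHERE Date = '2010-11-01'"

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes one plan step.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)  # no index yet, so the table is scanned
conn.execute("CREATE INDEX idx_orders_date ON Orders (Date)")
plan_after = plan(query)   # the plan should now mention idx_orders_date

print(plan_before)
print(plan_after)
```

In MySQL the equivalent check would be to prefix the query with EXPLAIN and look at the key column of the output.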

Why do you join Subscribers to Subscribers?
SELECT * FROM Subscribers ... JOIN ... Subscribers c)

Given how little we know about your schema, it seems you'd do better with an INNER JOIN, which filters the records for you. @seriyPS is right about the redundant Subscribers table: as written, the query performs a CROSS JOIN, pairing every Subscribers row with every result of Subscribers joined to Subscriptions joined to Orders.
Is there a reason why this won't work?
SELECT a.*
FROM Subscribers a
INNER JOIN Subscriptions b ON a.UID = b.UID
INNER JOIN Orders c ON b.SID = c.SID
WHERE c.Date = '2010-11-01'
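Here is a small runnable sketch of that three-table join, using Python's built-in sqlite3 module in place of MySQL. The Email and OID columns and all sample rows are made up for the illustration. DISTINCT is added because a subscriber with several matching orders would otherwise be returned once per order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Subscribers   (UID INTEGER PRIMARY KEY, Email TEXT);
CREATE TABLE Subscriptions (SID INTEGER PRIMARY KEY, UID INTEGER);
CREATE TABLE Orders        (OID INTEGER PRIMARY KEY, SID INTEGER, Date TEXT);

INSERT INTO Subscribers   VALUES (1, 'ann@example.com'), (2, 'bob@example.com');
INSERT INTO Subscriptions VALUES (10, 1), (11, 1), (20, 2);
INSERT INTO Orders        VALUES (100, 10, '2010-11-01'),
                                 (101, 11, '2010-11-01'),
                                 (102, 20, '2010-12-01');
""")

# Subscribers -> Subscriptions on UID, Subscriptions -> Orders on SID.
rows = conn.execute("""
    SELECT DISTINCT a.UID, a.Email
    FROM Subscribers a
    INNER JOIN Subscriptions b ON a.UID = b.UID
    INNER JOIN Orders c        ON b.SID = c.SID
    WHERE c.Date = '2010-11-01'
""").fetchall()

print(rows)  # only the subscriber with orders on that date, listed once
```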

Related

How to join 4 tables in SQL?

I just started using SQL and I need some help. I have 4 tables in a database. All four are connected with each other. I need to find the amount of unique transactions but can't seem to find it.
Transactions
    transaction_id pk
    name
Partyinvolved
    transaction.id pk
    partyinvolved.id
    type (buyer, seller)
PartyCompany
    partyinvolved.id
    Partycompany.id
Companies
    PartyCompany.id pk
    sector
pk = primary key
The transaction is unique if the conditions are met.
I only need a certain sector out of Companies, this is condition1. Condition2 is a condition inside table Partyinvolved but we first need to execute condition1. I know the conditions but do not know where to put them.
SELECT *
FROM group
INNER JOIN groupB ON groupB.group_id = group.id
INNER JOIN companies ON companies.id = groupB.company_id
WHERE condition1 AND condition2 ;
I want to output the amount of unique transactions with the name.

It is a bit unclear what you are asking, because your table definitions look like you're hinting at column meanings rather than giving column names. For example, by partycompany.id you probably mean the column that stores the relationship to the Id column of PartyCompany.
Anyway, if I follow that logic, your question is about where to put the conditions that limit the record sets during the join. You could put them in the WHERE clause, because you are using inner joins and it won't change your results; the same would not be true with an outer join. For optimization it is also typically best to add the limiting condition to the ON clause of the join.
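To illustrate why the placement matters for outer joins, here is a small sketch using Python's built-in sqlite3 module. The two tables are cut down to a few made-up columns and rows; they are not your real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions  (transaction_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE PartyInvolved (transaction_id INTEGER, party_id INTEGER, type TEXT);
INSERT INTO Transactions  VALUES (1, 'T1'), (2, 'T2');
INSERT INTO PartyInvolved VALUES (1, 7, 'buyer'), (2, 8, 'seller');
""")

# Condition in ON: every transaction is kept, unmatched ones get NULL.
on_rows = conn.execute("""
    SELECT t.name, p.party_id
    FROM Transactions t
    LEFT JOIN PartyInvolved p
           ON p.transaction_id = t.transaction_id AND p.type = 'buyer'
    ORDER BY t.transaction_id
""").fetchall()

# Condition in WHERE: rows that fail the filter are dropped after the join,
# so the LEFT JOIN ends up behaving like an INNER JOIN.
where_rows = conn.execute("""
    SELECT t.name, p.party_id
    FROM Transactions t
    LEFT JOIN PartyInvolved p ON p.transaction_id = t.transaction_id
    WHERE p.type = 'buyer'
    ORDER BY t.transaction_id
""").fetchall()

print(on_rows)
print(where_rows)
```

With INNER JOIN both forms return the same rows, which is why either placement is safe there.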
I am also a bit lost as to what exactly you want, e.g. a count of transactions or the actual transactions associated with a particular sector. Either can be derived from a basic query structure like:
SELECT
    t.*
FROM
    Companies co
    INNER JOIN PartyCompany pco
        ON co.PartyCompanyId = pco.PartyCompanyId
    INNER JOIN PartyInvolved pinv
        ON pco.PartyInvolvedId = pinv.PartyInvolvedId
        AND pinv.[type] = 'buyer'
    INNER JOIN Transactions t
        ON pinv.TransactionId = t.TransactionId
WHERE
    co.sector = 'some sector'
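If what you want is the number of unique transactions, the same join chain can feed COUNT(DISTINCT ...). The sketch below runs it with Python's built-in sqlite3 module. The column names follow my guesses above, while CompanyId, the 'energy' sector and the sample rows are invented, so adjust them to your real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions  (TransactionId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE PartyInvolved (PartyInvolvedId INTEGER PRIMARY KEY,
                            TransactionId INTEGER, type TEXT);
CREATE TABLE PartyCompany  (PartyCompanyId INTEGER PRIMARY KEY,
                            PartyInvolvedId INTEGER);
CREATE TABLE Companies     (CompanyId INTEGER PRIMARY KEY,
                            PartyCompanyId INTEGER, sector TEXT);

INSERT INTO Transactions  VALUES (1, 'Deal A'), (2, 'Deal B');
INSERT INTO PartyInvolved VALUES (1, 1, 'buyer'), (2, 1, 'buyer'), (3, 2, 'seller');
INSERT INTO PartyCompany  VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO Companies     VALUES (1, 1, 'energy'), (2, 2, 'energy'), (3, 3, 'energy');
""")

# Shared join chain: sector filter in WHERE, party type filter in ON.
joins = """
    FROM Companies co
    INNER JOIN PartyCompany pco   ON co.PartyCompanyId = pco.PartyCompanyId
    INNER JOIN PartyInvolved pinv ON pco.PartyInvolvedId = pinv.PartyInvolvedId
                                 AND pinv.type = 'buyer'
    INNER JOIN Transactions t     ON pinv.TransactionId = t.TransactionId
    WHERE co.sector = 'energy'
"""

# Deal A has two buyers, so it would show up twice without DISTINCT.
unique_count = conn.execute(
    "SELECT COUNT(DISTINCT t.TransactionId)" + joins
).fetchone()[0]
unique_rows = conn.execute(
    "SELECT DISTINCT t.TransactionId, t.name" + joins + " ORDER BY t.TransactionId"
).fetchall()

print(unique_count, unique_rows)
```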

Select SQL table content based on content in foreign table

I have a database holding information about different jobs.
The jobs can be for either internal or external customers.
I need to select the rows in the table Job that point to a record in Customer where isInternal is set to true.
I've tried using inner joins:
select Job.* from Job as Job
INNER JOIN Task as Task
ON Job.JobID = Task.JobID
Inner Join Customer as Customer
ON Task.CustomerID = Customer.CustomerID
But this way I end up with a lot of duplicate jobs.
I tried using DISTINCT as well, but then I end up with fewer rows than I actually have.
Can anyone point me in the right direction on how to approach this kind of task in SQL?
In the end this will be used in an SSIS package for loading data into the staging layer of a DWH.

If you want jobs where any task has an internal customer, you can use exists:
select j.*
from Job j
where exists (select 1
              from Task t join
                   Customer c
                   on t.CustomerID = c.CustomerID
              where j.JobID = t.JobID and
                    c.isInternal = 1
             );
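The sketch below shows why EXISTS avoids the duplicates that the join produces. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the TaskID column and the sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Job      (JobID INTEGER PRIMARY KEY);
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, isInternal INTEGER);
CREATE TABLE Task     (TaskID INTEGER PRIMARY KEY, JobID INTEGER, CustomerID INTEGER);

INSERT INTO Job      VALUES (1), (2);
INSERT INTO Customer VALUES (1, 1), (2, 0);
-- Job 1 has two tasks for the internal customer, job 2 one external task.
INSERT INTO Task     VALUES (1, 1, 1), (2, 1, 1), (3, 2, 2);
""")

# A join returns one row per matching task, so job 1 appears twice.
join_rows = conn.execute("""
    SELECT j.JobID
    FROM Job j
    JOIN Task t     ON j.JobID = t.JobID
    JOIN Customer c ON t.CustomerID = c.CustomerID
    WHERE c.isInternal = 1
""").fetchall()

# EXISTS only asks whether at least one matching task exists per job.
exists_rows = conn.execute("""
    SELECT j.JobID
    FROM Job j
    WHERE EXISTS (SELECT 1
                  FROM Task t
                  JOIN Customer c ON t.CustomerID = c.CustomerID
                  WHERE t.JobID = j.JobID AND c.isInternal = 1)
""").fetchall()

print(join_rows, exists_rows)
```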

SQL: Showing data from two tables together

I'm just starting in the SQL world, so I have a very noob question:
I have 2 tables:
clients (columns: client_id and name)
accounts (columns: account_id and client_id)
and I need to write a query that shows the accounts of all the clients.
The problem is that not all clients have accounts. If a client doesn't have one, how can I show the client_id, the name, and NULL for the account_id column?

This query should work:
SELECT clients.client_id, clients.name, accounts.account_id
FROM clients
LEFT OUTER JOIN accounts
ON accounts.client_id = clients.client_id;
If you only want the clients that do have an account, filter out the unmatched rows:
SELECT clients.client_id, clients.name, accounts.account_id
FROM clients
LEFT OUTER JOIN accounts
ON accounts.client_id = clients.client_id WHERE accounts.account_id IS NOT NULL;
These are plain SQL queries; they are not PL/SQL specific. LEFT OUTER JOIN returns every row of the left table (clients) and fills the columns of the right table (accounts) with NULL where there is no match. The OUTER keyword is optional. ON accounts.client_id = clients.client_id matches the client_id columns of both tables. Lastly, the WHERE accounts.account_id IS NOT NULL part removes the rows for clients that have no account.
Useful link: https://www.techonthenet.com/oracle/joins.php

Try this query. It returns each client's id and name, and shows NULL as the account_id for clients who have no account.
select clients.client_id, clients.name, accounts.account_id from clients left join accounts on
accounts.client_id=clients.client_id
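A quick way to check the behaviour the question asks for is to run a LEFT JOIN that starts from clients against a tiny data set. The sketch below uses Python's built-in sqlite3 module, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients  (client_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, client_id INTEGER);

INSERT INTO clients  VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO accounts VALUES (100, 1);   -- Bob has no account
""")

# clients is the left table, so every client is kept;
# account_id is NULL (None in Python) when there is no matching account.
rows = conn.execute("""
    SELECT c.client_id, c.name, a.account_id
    FROM clients c
    LEFT JOIN accounts a ON a.client_id = c.client_id
    ORDER BY c.client_id
""").fetchall()

print(rows)
```

Swapping the two tables around the LEFT JOIN would keep every account instead, and clients without accounts would disappear from the result.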

What's the most efficient way to exclude possible results from an SQL query?

I have a subscription database containing Customers, Subscriptions and Publications tables.
The Subscriptions table contains ALL subscription records and each record has three flags to mark the status: isActive, isExpired and isPending. These are Booleans and only one flag can be True - this is handled by the application.
I need to identify all customers who have not renewed any magazines to which they have previously subscribed and I'm not sure that I've written the most efficient SQL query. If I find a lapsed subscription I need to ignore it if they already have an active or pending subscription for that particular magazine.
Here's what I have:
SELECT DISTINCT Customers.id, Subscriptions.publicationName
FROM Subscriptions
LEFT JOIN Customers
ON Subscriptions.id_Customer = Customers.id
LEFT JOIN Publications
ON Subscriptions.id_Publication = Publications.id
WHERE Subscriptions.isExpired = 1
AND NOT EXISTS
( SELECT * FROM Subscriptions s2
WHERE s2.id_Publication = Subscriptions.id_Publication
AND s2.id_Customer = Subscriptions.id_Customer
AND s2.isPending = 1 )
AND NOT EXISTS
( SELECT * FROM Subscriptions s3
WHERE s3.id_Publication = Subscriptions.id_Publication
AND s3.id_Customer = Subscriptions.id_Customer
AND s3.isActive = 1 )
I have just over 50,000 subscription records and this query takes almost an hour to run, which suggests that for each record the SQL engine is searching the table again to find any 'isPending' and 'isActive' records.
This is my first post so please be gentle if I've missed out any information in my question :) Thanks.

I don't have your complete database structure, so I can't test the following query, but it may contain some optimizations. I will leave it to you to test, and will explain what I changed and why.
select Distinct Customers.id, Subscriptions.publicationName
from Subscriptions
join Customers on Subscriptions.id_Customer = Customers.id
join Publications
    ON Subscriptions.id_Publication = Publications.id
Where Subscriptions.isExpired = 1
And Not Exists
    (select * from Subscriptions s2
     join Customers on s2.id_Customer = Customers.id
     join Publications
         ON s2.id_Publication = Publications.id
     where s2.id_Customer = Subscriptions.id_Customer
     and s2.id_Publication = Subscriptions.id_Publication
     and (s2.isPending = 1 or s2.isActive = 1))
If there is no matching row in Customers or Publications, the subscription information isn't useful, so I replaced the LEFT JOINs with plain JOINs. I also combined the two NOT EXISTS subqueries into one; if I recall correctly these are fairly expensive, so the fewer the better. One more thing that is not shown above but may be worth looking into: can the subquery return specific columns instead of SELECT *? I'm not sure it makes a difference inside an EXISTS clause, and I don't have an equivalent DB available to test on.
I suspect there are further optimizations that could be made to this query. Replacing the EXISTS clause with an 'IN' clause may help, but I can't think of a way right now, since you have to match on two fields (the customer id and the relevant publication). Let me know if this helps at all.
With a table of 50k rows, you should be able to run a query like this in seconds.
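As a rough sketch of the combined NOT EXISTS, here is a cut-down version that runs on Python's built-in sqlite3 module. It only uses the Subscriptions table, and the composite index idx_sub_cust_pub is an assumption on my part: the question does not say which indexes exist, but an index on the two correlated columns is the usual first thing to try when a correlated subquery is slow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Subscriptions (
    id INTEGER PRIMARY KEY,
    id_Customer INTEGER, id_Publication INTEGER,
    isActive INTEGER, isExpired INTEGER, isPending INTEGER
);
-- Hypothetical index covering the columns the subquery correlates on.
CREATE INDEX idx_sub_cust_pub ON Subscriptions (id_Customer, id_Publication);

INSERT INTO Subscriptions
    (id_Customer, id_Publication, isActive, isExpired, isPending) VALUES
    (1, 1, 0, 1, 0),   -- customer 1 let publication 1 lapse ...
    (1, 1, 1, 0, 0),   -- ... but has renewed it
    (2, 1, 0, 1, 0),   -- customer 2 lapsed, no renewal
    (3, 2, 0, 1, 0),   -- customer 3 lapsed on publication 2 ...
    (3, 2, 0, 0, 1),   -- ... with a pending renewal
    (3, 1, 0, 1, 0);   -- customer 3 lapsed on publication 1, no renewal
""")

# One correlated subquery covers both the pending and the active case.
lapsed = conn.execute("""
    SELECT DISTINCT s.id_Customer, s.id_Publication
    FROM Subscriptions s
    WHERE s.isExpired = 1
      AND NOT EXISTS (SELECT 1
                      FROM Subscriptions s2
                      WHERE s2.id_Customer = s.id_Customer
                        AND s2.id_Publication = s.id_Publication
                        AND (s2.isPending = 1 OR s2.isActive = 1))
    ORDER BY s.id_Customer, s.id_Publication
""").fetchall()

print(lapsed)
```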

Get an array in a field after a JOIN

Let's say that I have a table called Appointments, another called Clients and a joining table called ClientsAppointments which relates the two. This is used in a scenario where a client can have several appointments, and several clients can attend the same appointment (n-n relation).
I would like to list the Appointments with a field "clients" being an array of all the Clients related to that appointment. What I've tried so far:
SELECT * FROM Appointments a
INNER JOIN ClientsAppointments ca ON ca.IdAppointment = a.IdAppointment
INNER JOIN Clients c ON c.IdClient = ca.IdClient
That doesn't work, of course. It gives me a list of appointments repeated with each of the clients they have. Then in PHP I would process them to achieve this. It seemed more efficient this way rather than making multiple queries because there's usually only one client per appointment.
Table schema (these are not the actual tables, but simplified to illustrate the case):
Appointment(
INT idAppointment,
DATETIME start,
DATETIME end)
Clients(
INT idClient,
VARCHAR name)
ClientsAppointments(
INT idAppointment,
INT idClient)

If you use MySQL as your database, try GROUP_CONCAT:
SELECT a.IdAppointment, GROUP_CONCAT(CAST(ca.IdClient AS CHAR) SEPARATOR ',') AS clients
FROM Appointments a
INNER JOIN ClientsAppointments ca ON ca.IdAppointment = a.IdAppointment
GROUP BY a.IdAppointment
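Below is a small sketch of the same idea end to end: one row per appointment, with the concatenated field split back into an array on the application side. It uses Python and its built-in sqlite3 module as a stand-in for PHP and MySQL. Note that SQLite writes the separator as a second argument, group_concat(x, ','), instead of MySQL's SEPARATOR keyword. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Appointments        (IdAppointment INTEGER PRIMARY KEY);
CREATE TABLE Clients             (IdClient INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ClientsAppointments (IdAppointment INTEGER, IdClient INTEGER);

INSERT INTO Appointments        VALUES (1), (2);
INSERT INTO Clients             VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO ClientsAppointments VALUES (1, 1), (1, 2), (2, 3);
""")

rows = conn.execute("""
    SELECT a.IdAppointment, group_concat(c.Name, ',') AS clients
    FROM Appointments a
    INNER JOIN ClientsAppointments ca ON ca.IdAppointment = a.IdAppointment
    INNER JOIN Clients c              ON c.IdClient = ca.IdClient
    GROUP BY a.IdAppointment
""").fetchall()

# Split the concatenated field into a list; sort because the order in which
# group_concat joins the values is not guaranteed.
appointments = {appt_id: sorted(names.split(",")) for appt_id, names in rows}

print(appointments)
```

Splitting on a comma only works if the values themselves never contain one, so concatenating ids rather than names is the safer choice in real code.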

Try to use LEFT OUTER JOIN
SELECT a.*,c.Name FROM ClientsAppointments ca
LEFT OUTER JOIN Appointments a ON ca.IdAppointment = a.IdAppointment
LEFT OUTER JOIN Clients c ON ca.IdClient = c.IdClient
