I am new to sql and have a question about joining 2 tables. Why is there a . in between customers.custnum in this example. What is its significance and what does it do?
Ex.
Select
customers.custnum, state, qty
From
customers
Inner join
sales On customers.custnum = sales.custnum
The . is to specify a column of a table.
Let's use your customer table; we could do:
SELECT c.custnum, c.state, c.qty FROM customers as c INNER JOIN
sales as s ON c.custnum = s.custnum
You don't really need the . unless two tables have columns with the same name.
In the below query, there are two tables being referred. One is CUSTOMERS another is STATE. Since both has same column CUSTNUM, we need a way to tell the database which CUSTNUM are we referring to. Same as there may be many Bob's, if so their last name is used for disambiguation.
I would consider the below style as more clearer. That's opinionated.
Select
cust.custnum, cust.state, s.qty
From
customers cust -- use alias for meaningful referencing, you may be self-joining, during that time you can use cust1, cust2 as aliases.
Inner join
sales as s On cust.custnum = s.custnum
Think of it as a way to categorize the hierarchical nature of the database. Within a DB, there are tables, and within tables there are columns. It's just a way of keeping track, especially if you are working with multiple tables that may have the same column name.
For example, a table called Sales and a table called Customers might both have a column called Date. You may be writing a query where you only want the date from the Sales table, so you would specify that by writing:
Select *
From Sales
inner join Customers on Sales.ID = Customers.ID
where Sales.Date = '1/1/2019'
Related
I have a distinct list of part numbers from one table. It is basically a table that contains a record of all the company's part numbers. I want to add columns that will pull data from different tables but only pertaining to the part number on that row of the distinct part list.
For example: if I have part A, B, C from the unique part list I want to add columns for Purchase quantity, repair quantity, loan quantity, etc... from three totally unique tables.
So it's almost like I need 3 subqueries that will sum of that data from the different tables for each part.
Can anybody steer me in the direction of how to do this? Please and thank you so much!
One method is correlated subqueries. Something like this:
select p.*,
(select count(*)
from purchases pu
where pu.part_id = p.part_id
) as num_purchases,
(select count(*)
from repairs r
where r.part_id = p.part_id
) as num_repairs,
(select count(*)
from loans l
where l.part_id = p.part_id
) as num_loans
from parts p;
Another option is joins with aggregation before the join. Or lateral joins (which are quite similar to correlated subqueries).
I'm learning SQL language using online resources, but mostly using queries that my predecessors have written at my company. I'm editing fields correspondingly to produce the correct results. But I want to understand more.
I have a few questions about this section of code.
1. Why is there a "p" before TrackingNumber, and oh/cc/im in front of others?
It seems to matter which I choose, so I just use trial and error until it runs.
2. Why do I need to have tracking number - when I delete this line, the code won't run!
select
p.TrackingNumber
,im.Sku
,oh.BusinessUnitCode
,cc.Qty
,oh.ShipCode
,oh.OrigShipCode
,oh.Store
,convert(date,oh.ShipTime) as 'OrderDate'
,oh.ShipToName
,oh.OrderNumber
from dmhost.tblOrderHeader oh
join dmhost.tblContainer c on oh.OrderHeaderID = c.OrderHeaderID
join dmhost.tblPackage p on c.ContainerID = p.ContainerID
join dmhost.tblContainerContents cc on c.ContainerID = cc.ContainerID
join dmhost.tblItemMaster im on im.ItemMasterID = cc.ItemMasterID
where (oh.ShipTime between '04/07/2019' and '05/05/2019')
The letters you talk about are referring to table names (or aliases).
Example using aliases would be:
SELECT c.customerName, o.orderNumber from Customers c
INNER JOIN Orders o on c.id=o.customerid
Same query without aliases:
SELECT Customers.customerName, Orders.orderNumber from Customers
INNER JOIN Orders on Customers.id=Orders.customerid
or omitting table names
SELECT customerName, orderNumber from Customers
INNER JOIN Orders on Customers.id=Orders.customerid
The table denomination is specially important when you retrieve columns with the same name from different table. For example id from Customers and id from Orders
SELECT c.id as CustomerId, o.id as OrderId from Customers c
INNER JOIN Orders o on c.id=o.customerid
The bits before the dot (.) in your field names are table aliases. If you look in the FROM clause of this query you should see these abbreviations in front of the various tables listed in there. They're used to
a) make it less tedious to type table names and
b) make it unambiguous which table you are selecting the column from (this both increases readability of the code and also deals with any cases where two of the tables have columns with the same name)
Here's a simple example of table alias usage:
SELECT emp.ID, emp.Name, dep.ID, dep.Name
FROM employees emp
INNER JOIN departments dep ON dep.ID = emp.DepartmentID
Here you can see that the employees and departments tables have each got aliases to shorten their name. In the query we refer to each field using it's alias. This is especially useful since both tables have fields called "ID" and "Name".
As for why it crashes when you remove p.TrackingNumber, it's likely because you did not also remove the comma (,) from the next line. The comma is used to mark where the name of the next field begins - it could be at the end of the previous line, rather than the start of the next one. Clearly you can't start the list of fields with a comma, because there is no field name preceding it - hence you get a syntax error.
The same query could have been written
select
p.TrackingNumber,
im.Sku,
oh.BusinessUnitCode,
-- etc
which might make it easier to see the usage of the comma.
i have a complex query to be written but cannot figure it out
here are my tables
Sales --one row for each sale made in the system
SaleProducts --one row for each line in the invoice (similar to OrderDetails in NW)
Deals --a list of possible deals/offers that a sale may be entitled to
DealProducts --a list of quantities of products that must be purchased in order to get a deal
now im trying to make a query which will tell me for each sale which deals he may get
the relevant fields are:
Sales: SaleID (PK)
SaleProducts: SaleID (FK), ProductID (FK)
Deals: DealID (PK)
DealProducts: DealID(FK), ProductID(FK), Mandatories (int) for required qty
i believe that i should be able to use some sort of cross join or outer join, but it aint working
here is one sample (of about 30 things i tried)
SELECT DealProducts.DealID, DealProducts.ProductID, DealProducts.Mandatories,
viwSaleProductCount.SaleID, viwSaleProductCount.ProductCount
FROM DealProducts
LEFT OUTER JOIN viwSaleProductCount
ON DealProducts.ProductID = viwSaleProductCount.ProductID
GROUP BY DealProducts.DealID, DealProducts.ProductID, DealProducts.Mandatories,
viwSaleProductCount.SaleID, viwSaleProductCount.ProductCount
The problem is that it doesn't show any product deals that are not fulfilled (probably because of the ProductID join). i need that also sales that don't have the requirements show up, then I can filter out any SaleID that exists in this query where AmountBought < Mandatories etc
Thank you for your help
I'm not sure how well I follow your question (where does viwSaleProductCount fit in?) but it sounds like you will want an outer join to a subquery that returns a list of deals along with their associated products. I think it would go something like this:
Select *
From Sales s Inner Join SaleProducts sp on s.SaleID = sp.SaleID
Left Join (
Select *
From Deals d Inner Join DealProducts dp on d.DealID = dp.DealId
) as sub on sp.ProductID = sub.ProductID
You may need to add logic to ensure that deals don't appear twice, and of course replace * with the specific column names you'd need in all cases.
edit: if you don't actually need any information from the sale or deal tables, something like this could be used:
Select sp.SaleID, sp.ProductID, sp.ProductCount, dp.DealID, dp.Mandatories
From SaleProducts sp
Left Join DealProducts as dp on sp.ProductID = dp.ProductID
If you need to do grouping/aggregation on this result you will need to be careful to ensure that deals aren't counted multiple times for a given sale (Count Distinct may be appropriate, depending on your grouping). Because it is a Left Join, you don't need to worry about excluding sales that don't have a match in DealProducts.
Of all the thousands of queries I've written, I can probably count on one hand the number of times I've used a non-equijoin. e.g.:
SELECT * FROM tbl1 INNER JOIN tbl2 ON tbl1.date > tbl2.date
And most of those instances were probably better solved using another method. Are there any good/clever real-world uses for non-equijoins that you've come across?
Bitmasks come to mind. In one of my jobs, we had permissions for a particular user or group on an "object" (usually corresponding to a form or class in the code) stored in the database. Rather than including a row or column for each particular permission (read, write, read others, write others, etc.), we would typically assign a bit value to each one. From there, we could then join using bitwise operators to get objects with a particular permission.
How about for checking for overlaps?
select ...
from employee_assignments ea1
, employee_assignments ea2
where ea1.emp_id = ea2.emp_id
and ea1.end_date >= ea2.start_date
and ea1.start_date <= ea1.start_date
Whole-day inetervals in date_time fields:
date_time_field >= begin_date and date_time_field < end_date_plus_1
Just found another interesting use of an unequal join on the MCTS 70-433 (SQL Server 2008 Database Development) Training Kit book. Verbatim below.
By combining derived tables with unequal joins, you can calculate a variety of cumulative aggregates. The following query returns a running aggregate of orders for each salesperson (my note - with reference to the ubiquitous AdventureWorks sample db):
select
SH3.SalesPersonID,
SH3.OrderDate,
SH3.DailyTotal,
SUM(SH4.DailyTotal) RunningTotal
from
(select SH1.SalesPersonID, SH1.OrderDate, SUM(SH1.TotalDue) DailyTotal
from Sales.SalesOrderHeader SH1
where SH1.SalesPersonID IS NOT NULL
group by SH1.SalesPersonID, SH1.OrderDate) SH3
join
(select SH1.SalesPersonID, SH1.OrderDate, SUM(SH1.TotalDue) DailyTotal
from Sales.SalesOrderHeader SH1
where SH1.SalesPersonID IS NOT NULL
group by SH1.SalesPersonID, SH1.OrderDate) SH4
on SH3.SalesPersonID = SH4.SalesPersonID AND SH3.OrderDate >= SH4.OrderDate
group by SH3.SalesPersonID, SH3.OrderDate, SH3.DailyTotal
order by SH3.SalesPersonID, SH3.OrderDate
The derived tables are used to combine all orders for salespeople who have more than one order on a single day. The join on SalesPersonID ensures that you are accumulating rows for only a single salesperson. The unequal join allows the aggregate to consider only the rows for a salesperson where the order date is earlier than the order date currently being considered within the result set.
In this particular example, the unequal join is creating a "sliding window" kind of sum on the daily total column in SH4.
Dublicates;
SELECT
*
FROM
table a, (
SELECT
id,
min(rowid)
FROM
table
GROUP BY
id
) b
WHERE
a.id = b.id
and a.rowid > b.rowid;
If you wanted to get all of the products to offer to a customer and don't want to offer them products that they already have:
SELECT
C.customer_id,
P.product_id
FROM
Customers C
INNER JOIN Products P ON
P.product_id NOT IN
(
SELECT
O.product_id
FROM
Orders O
WHERE
O.customer_id = C.customer_id
)
Most often though, when I use a non-equijoin it's because I'm doing some kind of manual fix to data. For example, the business tells me that a person in a user table should be given all access roles that they don't already have, etc.
If you want to do a dirty join of two not really related tables, you can join with a <>.
For example, you could have a Product table and a Customer table. Hypothetically, if you want to show a list of every product with every customer, you could do somthing like this:
SELECT *
FROM Product p
JOIN Customer c on p.SKU <> c.SSN
It can be useful. Be careful, though, because it can create ginormous result sets.
I have a table B with cids and cities. I also have a table C that has these cids with extra information. I want to list all the cids in table C that are associated with ALL appearances of a given city in Table B.
My current solution relies on counting the number of times the given city appears in Table B and selecting only the cids that appear that many times. I don't know all the SQL syntax yet, but is there a way to select for this kind of pattern?
My current solution:
SELECT Agents.aid
FROM Agents, Customers, Orders
WHERE (Customers.city='Duluth')
AND (Agents.aid = Orders.aid)
AND (Customers.cid = Orders.cid)
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
It only works because I know right now with the HAVING statement.
Thanks for the help. I wasn't sure how to google this problem, since it's pretty specific.
EDIT: I'm pinpointing my problem a bit. I need to know how to determine if EVERY row in a table has a certain value for a field. Declaring a variable and counting the rows in a sub-selection and filtering out my results by IDs that appear that many times works, but It's really ugly.
There HAS to be a way to do this without explicitly count()ing rows. I hope.
Not an answer to your question, but a general improvement.
I'd recommend using JOIN syntax to join your tables together.
This would change your query to be:
SELECT Agents.aid
FROM Agents
INNER JOIN Orders
ON Agents.aid = Orders.aid
INNER JOIN Customers
ON Customers.cid = Orders.cid
WHERE Customers.city='Duluth'
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
What variant of SQL are you using?
To start with, you can (and should) use JOIN instead of doing it in the WHERE clause, e.g.,
select Agents.aid
from Agents
join Orders on Agents.aid = Orders.aid
join Customers on Customers.cid = Orders.cid
where Customers.city = 'Duluth'
group by Agents.aid
having count(Agents.aid) > 1
After that, I'm afraid I might be a little lost. Using the table names in your example query, what (in English, not pseudocode) are you trying to retrieve? For example, I think your sample query is retrieving the PK for all Agents that have been involved in at least 2 Orders involving Customers in Duluth.
Also, some table definitions for Agents, Orders, and Customers might help (then again, they might be irrelevant).
I'm not sure if I understood you problem, but I think the following query is what you want:
SELECT *
FROM customers b
INNER JOIN orders c USING (cid)
WHERE b.city = 'Duluth'
AND NOT EXISTS (SELECT 1
FROM customers b2
WHERE b2.city = b.city
AND b2.cid <> cid);
Probably you will need some indexes on these columns.