"Joining" 2 different selects on the same table? - sql

I have a table of orders with products; each product has its own shipping date. How can I retrieve the orders so that every row shows the earliest shipping date of its order?
For example:
Order Product Ship date
1 phone 02/03/2019
1 charger 02/07/2019
2 printer 03/01/2019
What would be the sql query to retrieve the following?
Order Product Ship date
1 phone 02/03/2019
1 charger 02/03/2019
2 printer 03/01/2019
I.e., on order 1, all ship dates are 02/03/2019 since it's the earliest.
I tried this:
SELECT order,
product,
(SELECT ship_date FROM Tracking ORDER BY ship_date ASC) tbl ON tbl.order = t.order
FROM Tracking t
But I'm getting the error:
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.

Judging by the error message, I believe this is SQL Server, so window functions should be available.
You could use the windowed version of min() to get the minimum shipping date for an order.
SELECT [order],
[product],
min([ship date]) OVER (PARTITION BY [order]) [ship date]
FROM tracking;
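To try the windowed min() without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table layout, the ship_date column name and the ISO-formatted dates are assumptions made for the demo, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# In-memory SQLite stand-in for the SQL Server table (assumed layout).
# Dates are stored as ISO strings so they compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tracking ("order" INTEGER, product TEXT, ship_date TEXT)')
conn.executemany(
    "INSERT INTO tracking VALUES (?, ?, ?)",
    [
        (1, "phone", "2019-02-03"),
        (1, "charger", "2019-02-07"),
        (2, "printer", "2019-03-01"),
    ],
)

# min() used as a window function: each row is kept, but it receives the
# earliest ship_date found among the rows of its own order.
rows = conn.execute(
    """
    SELECT "order",
           product,
           min(ship_date) OVER (PARTITION BY "order") AS ship_date
    FROM tracking
    ORDER BY "order", product
    """
).fetchall()

for row in rows:
    print(row)
```

Both rows of order 1 come back with 2019-02-03, while the printer row keeps its own date.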

To get rid of the "The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified." message, add "SELECT TOP 100 PERCENT" to the subquery. That only silences the error, though: a subquery in the select list still has to return a single value per row.
But I'd suggest looking into the RANK and DENSE_RANK functions, as they're probably going to be more helpful.

I am not sure that I understand what you are trying to achieve here. If you just want to query this table ordered by shipping date, I think something like this would work:
SELECT * FROM Tracking ORDER BY ship_date ASC;

You are making it very complicated. The query is pretty simple:
select * from my_table group by shipdate order by shipdate asc

Related

Query GROUP BY and COUNT

I'm new to SQL and taking Coursera's "SQL for Data Science" course. I have the following question in a summary assignment:
Show the number of orders placed by each customer and sort the result by the number of orders in descending order.
After I failed to write the correct code, the answer given was as follows (one of several options, of course):
SELECT *
,COUNT (InvoiceId) AS number_of_orders
FROM Invoices
GROUP BY CustomerId
ORDER BY number_of_orders DESC
I am still having trouble understanding the query logic. I would appreciate your assistance in understanding this query.
I seriously hope that Coursera isn't giving you the query you cited above as the recommended answer. It won't run on most databases, and even in cases such as MySQL where it might run, it is not completely correct. You should be using this version:
SELECT CustomerId, COUNT (InvoiceId) AS number_of_orders
FROM Invoices
GROUP BY CustomerId
ORDER BY number_of_orders DESC;
A basic rule of GROUP BY is that the only columns available for selection are those which appear in the GROUP BY clause. In addition to these columns, aggregates of any column(s) may also appear in the select. The version I gave you above follows these rules, and is ANSI compliant, meaning it would run on any database.
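As a quick check of that rule, here is a small sketch that runs the corrected query against an in-memory SQLite database through Python's sqlite3 module. The sample invoices and the two-column table layout are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: six invoices spread over three customers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Invoices (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER)")
conn.executemany(
    "INSERT INTO Invoices (InvoiceId, CustomerId) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 10), (4, 30), (5, 10), (6, 20)],
)

# Only the grouped column and an aggregate appear in the select list.
order_counts = conn.execute(
    """
    SELECT CustomerId, COUNT(InvoiceId) AS number_of_orders
    FROM Invoices
    GROUP BY CustomerId
    ORDER BY number_of_orders DESC
    """
).fetchall()

print(order_counts)  # [(10, 3), (20, 2), (30, 1)]
```

Each customer collapses into one row, and the alias number_of_orders can be reused in ORDER BY.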
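When you say SELECT *, it means ALL COLUMNS, but you are grouping only by CustomerId, which is invalid in standard SQL.
Specify in the GROUP BY clause the other columns that you want to show. Keep in mind that every extra grouping column makes the groups finer, so the count is then per combination of those columns.
The script should be something like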
SELECT CustomerName, DateEntered
,COUNT (InvoiceId) AS number_of_orders
FROM Invoices
GROUP BY CustomerId, CustomerName, DateEntered
ORDER BY number_of_orders DESC

How to work past "At most one record can be returned by this subquery"

I'm having trouble understanding this error through all the researching I have done. I have the following query
SELECT M.[PO Concatenate], Sum(M.SumofAward) AS TotalAward, (SELECT TOP 1 M1.[Material Group] FROM
[MGETCpreMG] AS M1 WHERE M1.[PO Concatenate]=M.[PO Concatenate] ORDER BY M1.SumofAward DESC) AS TopGroup
FROM MGETCpreMG AS M
GROUP BY M.[PO Concatenate];
For a brief instant it shows the results I want, but then the "At most one record can be returned by this subquery" error appears and all the data is replaced with #Name?
For context, [MGETCpreMG] is a query off a main table [MG ETC] that was used to consolidate Award for differing Material Groups on a PO transaction ([PO Concatenate])
SELECT [MG ETC].[PO Concatenate], Sum([MG ETC].Award) AS SumOfAward, [MG ETC].[Material Group]
FROM [MG ETC]
GROUP BY [MG ETC].[PO Concatenate], [MG ETC].[Material Group]
ORDER BY [MG ETC].[PO Concatenate];
I'm thinking it lies in my inability to understand how to utilize a subquery.
If the subquery can return more than one value, simply add an additional sort column.
So, a common subquery might be to get the last invoice. So you might have:
select ID, CompanyName,
(SELECT TOP 1 InvoiceDate from tblInvoice
where tblInvoice.CustomerID = tblCustomers.ID
Order by InvoiceDate DESC)
As LastInvoiceDate
From tblCustomers
Now the above might work for some time, but then it will blow up, since you might have two invoices on the same day!
So, all you have to do is add an extra ORDER BY column, say the PK of the child table, like this:
Order by InvoiceDate DESC,ID DESC)
So TOP 1 will respect the additional order columns you add, and thus only ever return one row, even if multiple rows tie on the first ORDER BY column.
I suppose in the above we could forget InvoiceDate and always take the highest autonumber ID, but for a lot of queries you can't be sure that works; it might be that we want the most expensive invoice amount. Again, if two invoices share the maximum amount, two rows could be returned. So, simply add an extra ORDER BY column that further orders the data, and TOP 1 will only pull the first row. Your top group query is such a case: just tack on an extra ORDER BY on "ID" or whatever the autonumber ID column is.
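To make the tie behaviour concrete, here is a plain-Python model of a "TOP 1 that keeps ties". It is only a sketch of the semantics described above, not Access itself; the helper name top1_with_ties and the sample invoices are made up.

```python
def top1_with_ties(rows, sort_key):
    """Return every row that ties for first place when sorting descending by sort_key."""
    best = max(sort_key(row) for row in rows)
    return [row for row in rows if sort_key(row) == best]


# Hypothetical invoices: two of them share the latest date.
invoices = [
    {"ID": 1, "InvoiceDate": "2020-05-01"},
    {"ID": 2, "InvoiceDate": "2020-05-03"},
    {"ID": 3, "InvoiceDate": "2020-05-03"},
]

# Sorting by date alone: both invoices dated 2020-05-03 tie, so two rows come back.
by_date = top1_with_ties(invoices, lambda row: row["InvoiceDate"])

# Adding the unique ID as a second sort column breaks the tie: exactly one row.
by_date_then_id = top1_with_ties(invoices, lambda row: (row["InvoiceDate"], row["ID"]))

print(len(by_date), len(by_date_then_id))  # 2 1
```

Because the ID is unique, the combined sort key can never tie, which is why the extra ORDER BY column guarantees a single row.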

Efficiently find last date in a table - Teradata SQL

Say I have a rather large table in a Teradata database, "Sales" that has a daily record for every sale and I want to write a SQL statement that limits this to the latest date only. This will not always be the previous day, for example, if it was a Monday the latest date would be the previous Friday.
I know I can get the results by the following:
SELECT s.*
FROM Sales s
JOIN (
SELECT MAX(SalesDate) as SalesDate
FROM Sales
) sd
ON s.SalesDate=sd.SalesDate
I don't know how Teradata would process the subquery. Since Sales is a large table, is there a more efficient way to do this, given that there is no other table I could use?
Another (more flexible) way to get the top n utilizes OLAP-functions:
SELECT *
FROM Sales s
QUALIFY
RANK() OVER (ORDER BY SalesDate DESC) = 1
This will return all rows with the max date. If you want only one of them, switch to ROW_NUMBER.
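The difference between RANK and ROW_NUMBER can be seen in a small sketch run on SQLite through Python's sqlite3 module. SQLite has no QUALIFY clause, so the filter is written as a derived table instead; the SaleId column and the sample rows are assumptions for the demo, and SQLite 3.25 or newer is needed for window functions.

```python
import sqlite3

# Hypothetical sales rows: two of them fall on the latest date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (SaleId INTEGER, SalesDate TEXT)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [(1, "2019-03-07"), (2, "2019-03-08"), (3, "2019-03-08"), (4, "2019-03-06")],
)


def latest(window_function):
    # window_function is one of two fixed names below, never user input.
    query = f"""
        SELECT SaleId, SalesDate
        FROM (
            SELECT s.*, {window_function}() OVER (ORDER BY SalesDate DESC) AS rnk
            FROM Sales s
        ) t
        WHERE rnk = 1
        ORDER BY SaleId
    """
    return conn.execute(query).fetchall()


with_ties = latest("RANK")         # every row on the latest date
single_row = latest("ROW_NUMBER")  # exactly one row on the latest date

print(with_ties)
print(single_row)
```

RANK gives both rows dated 2019-03-08 the value 1, whereas ROW_NUMBER numbers them 1 and 2, and which of the two gets 1 is not defined unless a tiebreaker column is added to the ORDER BY.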
That is probably fine, if you have an index on salesdate.
If there is only one row, then I would recommend:
select top 1 s.*
from sales s
order by salesdate desc;
In particular, this should make use of an index on salesdate.
If there is more than one row, use top 1 with ties.

Count(), max(), min() functions definition with many selects

Let's say we have a view/table hotel(hotel_n, hotel_name, room_n, price). I want to find the cheapest room. I tried group by room_n, but I want the hotel's name (hotel_name) to be shown in the result without grouping by it.
So, as an amateur with SQL (Oracle 11g), I began with
select hotel_n, room_n, min(price)
from hotel
group by room_n;
but it shows the error: ORA-00979: not a GROUP BY expression. I know I have to type group by room_n, hotel_n, but I want the hotel_n to be seen in the table that I make without grouping by it!
Any ideas? Thank you very much!
Aggregate functions are useful to show, well, aggregate information per group of rows. If you want to get a specific row from a group of rows in relation to the other group members (e.g., the cheapest room per room_n), you'd probably need an analytic function, such as rank:
SELECT hotel_n, hotel_name, room_n, price
FROM (SELECT hotel_n, hotel_name, room_n, price,
RANK() OVER (PARTITION BY room_n ORDER BY price ASC) rk
FROM hotel) t
WHERE rk = 1
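Here is a hedged sketch of the same RANK query running on SQLite via Python's sqlite3 module instead of Oracle. The hotel rows are invented, prices are stored as integers for simplicity, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical data: room 101 has a clear cheapest offer, room 102 has a tie.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hotel (hotel_n INTEGER, hotel_name TEXT, room_n INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO hotel VALUES (?, ?, ?, ?)",
    [
        (1, "Alpha", 101, 80),
        (2, "Beta", 101, 65),
        (1, "Alpha", 102, 120),
        (3, "Gamma", 102, 120),
    ],
)

# Rank rows by price within each room_n, then keep only the cheapest rank.
cheapest = conn.execute(
    """
    SELECT hotel_n, hotel_name, room_n, price
    FROM (SELECT hotel_n, hotel_name, room_n, price,
                 RANK() OVER (PARTITION BY room_n ORDER BY price ASC) AS rk
          FROM hotel) t
    WHERE rk = 1
    ORDER BY room_n, hotel_n
    """
).fetchall()

for row in cheapest:
    print(row)
```

The hotel name comes along without being grouped, and because RANK is used, both hotels that tie on the cheapest price for room 102 are returned.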

Distinct with order by clause

I want to get distinct Category values and order the result by the curdate column.
select distinct(Category)'Category' from sizes order by curdate desc
But this simple query is generating errors.
ORDER BY items must appear in the select list if SELECT DISTINCT is specified.
I'm afraid you have the same constraint for SELECT DISTINCT as for GROUP BY clauses: namely, you cannot make use of a field that's not declared in the fields list, because it simply doesn't know which curdate to use when sorting in case there are several rows with different curdate values for the same Category.
EDIT: try something like:
SELECT Category FROM sizes GROUP BY Category ORDER BY MAX(curdate) DESC
Replace MAX with MIN or whatever suits you.
EDIT2: In this case, MAX(curdate) doesn't even have to be present in the field list since it's used in an aggregate function.
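A small sketch of the GROUP BY approach, run on SQLite through Python's sqlite3 module, shows how the choice of aggregate decides the order. The three sample rows and their ISO-formatted dates are assumptions for the demo.

```python
import sqlite3

# Hypothetical rows: AAA has both the earliest and the latest date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sizes (Category TEXT, curdate TEXT)")
conn.executemany(
    "INSERT INTO sizes VALUES (?, ?)",
    [("AAA", "2011-01-01"), ("BBB", "2011-02-01"), ("AAA", "2011-03-01")],
)

# One row per Category, ordered by each category's latest date.
by_latest = conn.execute(
    "SELECT Category FROM sizes GROUP BY Category ORDER BY MAX(curdate) DESC"
).fetchall()

# Same grouping, ordered by each category's earliest date instead.
by_earliest = conn.execute(
    "SELECT Category FROM sizes GROUP BY Category ORDER BY MIN(curdate) DESC"
).fetchall()

print(by_latest)    # [('AAA',), ('BBB',)]
print(by_earliest)  # [('BBB',), ('AAA',)]
```

The two orders differ, which is exactly the ambiguity that plain SELECT DISTINCT cannot resolve on its own.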
with cte as
(
select
Category,
[CurDate],
row_number() over(partition by Category order by [CurDate]) as rn
from sizes
)
select
Category
from cte
where rn = 1
order by [CurDate]
You look to be after a list of all the categories, with a date associated with each one. Whether you want the earliest first or latest first, you should be able to do one of the following:
SELECT Category, MAX(curdate) FROM sizes GROUP BY Category
Or:
SELECT Category, MIN(curdate) FROM sizes GROUP BY Category
Depending on whether you want the most recent or earliest dates associated with each category. If you need the list to then be ORDERed by the dates, add one of the following onto the end:
ORDER BY MAX(curdate)
ORDER BY MIN(curdate)
Curdate must also be in your select statement; right now you are only specifying Category.
Exactly what the error says: it cannot order a distinct list if the sort field is not part of the select. The reason is that there may be multiple sort values for each of the distinct values selected. If the data looks like this
Category CurDate
AAA 1/1/2011
BBB 2/1/2011
AAA 3/1/2011
Should AAA be before or after BBB in the distinct list? If you just ordered by the date without the distinct, you would get it in both positions. Since SQL doesn't know which date should be associated with the distinct category, it will not let you sort by the date.
As the error message said, you can't order by a column that isn't selected when using SELECT DISTINCT (same problem as with GROUP BY...). Change your query to this:
SELECT DISTINCT category, curdate FROM sizes ORDER BY curdate DESC
EDIT: replying to your comment:
If you want to select the distinct category with the last date for every category, you'll have to change your query a bit. I can think of two possibilities for this: using MAX() like Costi Ciudatu posted, or doing some crazy stuff with subselects. The first one would be the better approach.