Selecting average total where an associated table has id present - sql

I'm fairly new to SQL and I'm trying to answer this question:
What is the average order total, where product X is present.
We have an orders table, and line items. An order has many line items, which again store the product_id.
SELECT
avg(total)
FROM
orders
WHERE
(shipment_state = 'shipped')
AND (delivery_date BETWEEN '2017-09-11' AND '2017-09-18');
This is what I have now, and it works. However, I do not know how to filter and calculate the average based on another table (in this case line_item).

You can use exists:
SELECT AVG(o.total)
FROM orders o
WHERE o.shipment_state = 'shipped' AND
o.delivery_date BETWEEN '2017-09-11' AND '2017-09-18' AND
EXISTS (SELECT 1
FROM orderlines ol
WHERE ol.order_id = o.order_id AND
ol.product_id = X
);
Notes:
When you have more than one table in a query, always use table aliases and qualified column names.
When working with dates, BETWEEN is not recommended. The recommended construct is o.delivery_date >= '2017-09-11' AND o.delivery_date < '2017-09-19'. This works for both dates and date/time values.
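To make both notes concrete, here is a minimal sketch that runs the EXISTS query with the half-open date range against an in-memory SQLite database. The table and column names (orders.id, line_items.order_id) and the sample rows are assumptions made for illustration; the real schema may differ.

```python
import sqlite3

def average_total_with_product(conn, product_id, start, end_exclusive):
    """Average total of shipped orders delivered in [start, end_exclusive)
    that contain at least one line item for product_id."""
    sql = """
        SELECT AVG(o.total)
        FROM orders o
        WHERE o.shipment_state = 'shipped'
          AND o.delivery_date >= ?
          AND o.delivery_date < ?
          AND EXISTS (SELECT 1
                      FROM line_items li
                      WHERE li.order_id = o.id
                        AND li.product_id = ?)
    """
    return conn.execute(sql, (start, end_exclusive, product_id)).fetchone()[0]

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL,
                         shipment_state TEXT, delivery_date TEXT);
    CREATE TABLE line_items (order_id INTEGER, product_id INTEGER);
    INSERT INTO orders VALUES
        (1, 100.0, 'shipped', '2017-09-12'),
        (2, 300.0, 'shipped', '2017-09-18 15:30:00'),
        (3, 900.0, 'shipped', '2017-09-13'),
        (4, 500.0, 'pending', '2017-09-14');
    -- Order 1 contains product 7 twice; EXISTS still counts the order once.
    INSERT INTO line_items VALUES (1, 7), (1, 7), (2, 7), (3, 8), (4, 7);
""")

print(average_total_with_product(conn, 7, "2017-09-11", "2017-09-19"))
```

Two things are worth noticing in this sample data. A plain join to the line items would count order 1 twice and skew the average, which EXISTS avoids. Order 2 has a time component on the last day, so it is picked up by the half-open range, whereas an upper bound of exactly '2017-09-18' would miss it.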

Related

Combining SQL queries into one with various having/group by/where rownum

I currently have three ORACLE SQL queries which are similar to this simplified example.
I get a list of customers which fulfill my requirements:
CREATE VIEW customerQRY AS
SELECT
o.customer_id,
o.order_id,
si.item_id,
o.price,
o.discount
FROM
Orders o
JOIN StockItems si ON o.order_id = si.order_id
WHERE
o.returned = 'N'
AND o.num_items = 1
AND o.completed = 'Y'
AND o.order_date BETWEEN TO_DATE('01-01-2019', 'DD-MM-YYYY') AND TO_DATE('01-01-2020', 'DD-MM-YYYY')
;
From those I get the top 1000 customers which have bought more than 10 items at max 10% discount:
CREATE TABLE CustomerSamples AS
SELECT
customer_id
FROM (
SELECT
customer_id
FROM
customerQRY
GROUP BY
customer_id
HAVING
COUNT(DISTINCT(order_id)) > 9
AND discount < 11
ORDER BY
COUNT(DISTINCT(order_id)) DESC,
discount DESC
)
WHERE
ROWNUM < 1001
;
Then I get all the data related to the order and items for this subset of customers:
(edit: this is actually not totally correct: I want the order details here to be a subset of the orders specified in CustomerSamples i.e. the ones which fall into the discount < 11 category; this can be done with a "where" clause here or however defined in a potential single query)
SELECT
Orders.*,
StockItems.*
FROM
CustomerSamples cs
JOIN Orders ON Orders.customer_id = cs.customer_id
JOIN StockItems ON StockItems.order_id = Orders.order_id
;
(please forgive any missed syntax errors as I've simplified the real ones - these run correctly in reality)
This is fair enough - it works - but I was asked to try and combine this into one query which makes sense for us with running on production boxes etc.
I have gone back and forth trying different things, but can't come up with a sensible solution!
Sure I can literally use customerQry as a subquery in the CustomerSamples, but this means I don't have the data from customerQRY and suddenly things get more complicated. I can't return order_ids from query 2 as we are grouping on the customer and counting the order_ids.
I can't see a way to get the 1000 customer_ids and their related order_ids in one go. I feel like I'm missing an obvious solution here, but I can't see it. Anyone have any ideas? Am I just fighting a waterfall?
If you take the text of your queries as they are, you can nest them into a single statement, for example:
SELECT
Orders.*,
StockItems.*
FROM
Orders JOIN StockItems ON StockItems.order_id = Orders.order_id
WHERE
Orders.customer_id in (
SELECT
customer_id
FROM (
SELECT
customer_id
FROM
(
SELECT
o.customer_id,
o.order_id,
si.item_id,
o.price,
o.discount
FROM
Orders o
JOIN StockItems si ON o.order_id = si.order_id
WHERE
o.returned = 'N'
AND o.num_items = 1
AND o.completed = 'Y'
AND o.order_date BETWEEN TO_DATE('01-01-2019', 'DD-MM-YYYY') AND TO_DATE('01-01-2020', 'DD-MM-YYYY')
)
GROUP BY
customer_id
HAVING
COUNT(DISTINCT(order_id)) > 9
AND discount < 11
ORDER BY
COUNT(DISTINCT(order_id)) DESC,
discount DESC
)
WHERE
ROWNUM < 1001)
;

Selecting current valid record of historical Data with SQL

I have 2 tables
Table Customer
customer_shortcut (char)
Table CustomerData
customerID (ForeignKey to Customer)
customer_valid (Valid date for the record)
customer_name (char)
Table CustomerData can have multiple records for a customer, but with different valid dates, e.g.
01.01.2019
01.01.2020
01.01.2021
I managed to get the last record for each customer using the query:
SELECT Customer.*
FROM Customer
FULL JOIN CustomerData ON (Customer.id = CustomerData."customerID_id")
FULL JOIN CustomerData CustomerData2 ON (Customer.id = CustomerData2."customerID_id"
AND (CustomerData.customer_valid < CustomerData2.customer_valid
OR CustomerData.customer_valid = CustomerData2.customer_valid
AND CustomerData.id < CustomerData2.id)
)
WHERE CustomerData2.id IS NULL
How do I now get the currently valid record (in my example the record with customer_valid 01.01.2020)?
I tried adding "AND customer_valid <= '2020-05-05'" at nearly every position within the query but never got the expected result.
If I understand you correctly you are looking for the highest "valid date" that is before "today" (or any given date). This can be achieved using a lateral join in Postgres:
SELECT c.*, cd.customer_name
FROM customer c
JOIN LATERAL (
SELECT *
FROM customerdata cd
WHERE c.id = cd.customer_id
AND cd.customer_valid <= current_date
ORDER BY cd.customer_valid DESC
LIMIT 1
) cd on true
A more efficient option would be (in my opinion) to store the start and the end of the valid period in a daterange column:
create table customer_data
(
customer_id int not null references customer,
valid_during daterange not null,
customer_name text
);
Overlapping ranges can be prevented using an exclusion constraint.
And the example ranges from your question would be stored as
[2019-01-01,2020-01-01)
[2020-01-01,2021-01-01)
[2021-01-01,infinity)
The ) denotes that the right edge is excluded.
The query then becomes as simple as:
SELECT c.*, cd.customer_name
FROM customer c
JOIN customer_data cd
on c.id = cd.customer_id
AND cd.valid_during @> current_date;
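The "highest valid date on or before a given date" idea can also be tried out without Postgres. The sketch below swaps in a different technique, a correlated MAX() subquery, because SQLite has neither LATERAL nor daterange. The table layout and sample rows are assumptions for illustration.

```python
import sqlite3

def current_names(conn, as_of):
    """Return (customer id, name) using the latest record valid on or before as_of."""
    sql = """
        SELECT c.id, cd.customer_name
        FROM customer c
        JOIN customer_data cd ON cd.customer_id = c.id
        WHERE cd.customer_valid = (SELECT MAX(x.customer_valid)
                                   FROM customer_data x
                                   WHERE x.customer_id = c.id
                                     AND x.customer_valid <= ?)
        ORDER BY c.id
    """
    return conn.execute(sql, (as_of,)).fetchall()

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, customer_shortcut TEXT);
    CREATE TABLE customer_data (customer_id INTEGER, customer_valid TEXT,
                                customer_name TEXT);
    INSERT INTO customer VALUES (1, 'ACM'), (2, 'FUT');
    INSERT INTO customer_data VALUES
        (1, '2019-01-01', 'Acme'),
        (1, '2020-01-01', 'Acme Ltd'),
        (1, '2021-01-01', 'Acme Inc'),
        (2, '2021-01-01', 'Future Co');
""")

print(current_names(conn, "2020-05-05"))
```

Customer 2 has no record that is valid on 2020-05-05, so it drops out of the result, just as it would with the inner lateral join. If two records of one customer could share the same customer_valid date, this variant would return both, so a tie-breaker on the record id would be needed in that case.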

Oracle find whether a corresponding record exists within a number of days

I am trying to subtract two dates from each other.
The question says that I have to create a query to display the orders that were not shipped within 30 days of ordering.
Here is my attempt:
select orderno
from orders
where 30> (select datediff(dd,s.ship_date,o.odate )
from o.orders,s.shipment);
The error I get is
ERROR at line 1:
ORA-00942: table or view does not exist
These are the two tables :
SQL> desc orders
Name Null? Type
----------------------------------------- -------- ----------------------------
ORDERNO NOT NULL NUMBER(3)
ODATE NOT NULL DATE
CUSTNO NUMBER(3)
ORD_AMT NUMBER(5)
SQL> desc shipment
Name Null? Type
----------------------------------------- -------- ----------------------------
ORDERNO NOT NULL NUMBER(3)
WAREHOUSENO NOT NULL VARCHAR2(3)
SHIP_DATE DATE
You'd be wanting something along the lines of:
select ...
from orders o
where not exists (
select null
from shipment s
where s.orderno = o.orderno
and s.ship_date <= (o.odate + 30))
Date arithmetic is pretty easy if you just want a difference in days, as you can add or subtract days as integers. If it were months, quarters or years you'd want to use Add_Months().
Also, it's better in the query above to say "shipment_date <= (order_date + 30)" rather than "(shipment_date - order_date) <= 30", as it lets indexes be used on the join key and shipment date combined. In practice you'd probably want an index on (s.orderno, s.ship_date) so that the shipment table does not have to be accessed for this query.
I used NOT EXISTS here because, if there can be multiple shipments per order, you want the query to stop looking for additional shipments as soon as it has found a single one.
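As a rough way to check the NOT EXISTS logic, here is a sketch against an in-memory SQLite database. SQLite has no Oracle-style "date + 30" arithmetic, so the sketch uses SQLite's date() function with a '+N days' modifier instead; the sample rows are made up for illustration.

```python
import sqlite3

def orders_not_shipped_within(conn, days):
    """Order numbers with no shipment dated within `days` days of the order date."""
    sql = """
        SELECT o.orderno
        FROM orders o
        WHERE NOT EXISTS (SELECT 1
                          FROM shipment s
                          WHERE s.orderno = o.orderno
                            AND s.ship_date <= date(o.odate, ?))
        ORDER BY o.orderno
    """
    modifier = f"+{int(days)} days"  # SQLite date modifier, e.g. '+30 days'
    return [row[0] for row in conn.execute(sql, (modifier,))]

# Sample data is hypothetical; column names follow the DESC output above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderno INTEGER PRIMARY KEY, odate TEXT NOT NULL);
    CREATE TABLE shipment (orderno INTEGER NOT NULL, warehouseno TEXT NOT NULL,
                           ship_date TEXT);
    INSERT INTO orders VALUES
        (1, '2024-01-01'), (2, '2024-01-01'), (3, '2024-01-01'), (4, '2024-01-01');
    INSERT INTO shipment VALUES
        (1, 'W1', '2024-01-10'),   -- shipped after 9 days
        (2, 'W1', '2024-03-01'),   -- shipped late
        (3, 'W1', '2024-02-15'),   -- late part ...
        (3, 'W2', '2024-01-20');   -- ... but one part shipped in time
    -- Order 4 has no shipment at all.
""")

print(orders_not_shipped_within(conn, 30))
```

Orders that have no shipment row at all are reported too, which a scalar subquery on the date difference would not do, because the difference would be NULL for them.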
Here is one method, using Oracle syntax:
select o.orderno
from orders o
where 30 < (select s.ship_date - o.odate
from shipment s
where s.orderno = o.orderno
);
Note the correlation clause in the subquery, but each table is only mentioned once.
The problem that you have is that an order could ship on more than one occasion -- and this would generate an error in the query, because the subquery would return more than one row. One solution is aggregation. You need to decide if the question is "the entire order does not ship within 30 days" or "no part of the order ships within 30 days". The latter would use MIN():
select o.orderno
from orders o
where (select MIN(s.ship_date - o.odate)
from shipment s
where s.orderno = o.orderno
) > 30;
Your syntax is wrong and you are implicitly trying to do a cross join. I think what you need is an INNER JOIN, which I assume is going to return one row (if it returns multiple rows then use >ALL), like:
select orderno
from orders
where 30> (select s.ship_date - o.odate
from orders o INNER JOIN shipment s
ON o.orderNo = s.orderNo);

Subselect on date

In an Access database there are two tables.
Table one contains Articles and table two contains Prices.
So Articles holds the description of all articles, and the Prices
table just contains an article number, a date and a price.
If a price changes, a new row is added to Prices.
Each price has a date from which that price shall be used.
Now I want to get the prices that were valid on 01. Oct 2012.
I used my query on current prices and added and prsdat<#02/10/2012#
to the subselect in the query.
Here is what I already have:
SELECT
Articles.ARTNR,
Articles.TXT,
Articles.ACTIVE,
Prices.PRICE,
Prices.PRSGR,
Prices.PRSDAT
FROM
Articles INNER JOIN Prices ON Articles.ARTNR = Prices.ARTNR
WHERE
(((Articles.ACTIVE)="Y") AND
((Prices.PRSGR)=0) AND
((Prices.PRSDAT)=
(SELECT
max(prsdat)
FROM
Prices as art
WHERE
art.artnr = Prices.artnr and prsdat<#02/10/2012#)))
ORDER BY
Articles.ARTNR;
Now the select returns articles that I did not see with the select
I used before; all I did was add and prsdat<#01/10/2012#.
The result is now 430 articles; before I had only about 260.
The prices returned are older, but I'm not sure about the date format.
In the table I see DD.MM.YYYY; in the query, should I use MM/DD/YYYY or DD/MM/YYYY?
What is the correct form of this select?
SELECT
a.ARTNR
, a.TXT
, a.ACTIVE -- dubious, since it is constant
, p.PRICE
, p.PRSGR -- dubious, since it is constant
, p.PRSDAT
FROM Articles a
INNER JOIN Prices p ON a.ARTNR = p.ARTNR
WHERE a.ACTIVE = 'Y'
AND p.PRSGR = 0
AND p.prsdat < #10/02/2012# -- Access reads #...# literals as MM/DD/YYYY, so this is 2 Oct 2012
AND NOT EXISTS (
SELECT *
FROM Prices nx
WHERE nx.ARTNR = p.ARTNR
AND nx.PRSGR = 0
AND nx.prsdat < #10/02/2012#
AND nx.prsdat > p.prsdat
)
ORDER BY
a.ARTNR
;
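The NOT EXISTS pattern ("there is no later price that is still before the cutoff") can be tried outside Access as well. The following is a sketch on an in-memory SQLite database with made-up rows. Dates are stored as ISO strings (YYYY-MM-DD), which is an assumption of this sketch and sidesteps the DD/MM versus MM/DD question.

```python
import sqlite3

def prices_as_of(conn, cutoff_exclusive):
    """Latest price per active article with a price date before cutoff_exclusive."""
    sql = """
        SELECT a.artnr, p.price, p.prsdat
        FROM articles a
        JOIN prices p ON p.artnr = a.artnr
        WHERE a.active = 'Y'
          AND p.prsgr = 0
          AND p.prsdat < ?
          AND NOT EXISTS (SELECT 1
                          FROM prices nx
                          WHERE nx.artnr = p.artnr
                            AND nx.prsgr = 0
                            AND nx.prsdat < ?
                            AND nx.prsdat > p.prsdat)
        ORDER BY a.artnr
    """
    return conn.execute(sql, (cutoff_exclusive, cutoff_exclusive)).fetchall()

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (artnr INTEGER PRIMARY KEY, txt TEXT, active TEXT);
    CREATE TABLE prices (artnr INTEGER, prsgr INTEGER, prsdat TEXT, price REAL);
    INSERT INTO articles VALUES (1, 'Bolt', 'Y'), (2, 'Nut', 'Y'), (3, 'Old', 'N');
    INSERT INTO prices VALUES
        (1, 0, '2012-01-15', 1.00),
        (1, 0, '2012-09-01', 1.20),
        (1, 0, '2012-11-01', 1.50),
        (2, 0, '2012-12-01', 0.50),
        (3, 0, '2012-01-01', 9.00);
""")

print(prices_as_of(conn, "2012-10-02"))  # prices valid on 1 Oct 2012
print(prices_as_of(conn, "2012-02-10"))  # what a day/month mix-up would ask for
```

The second call shows the symptom described in the question: with the cutoff read as 10 February instead of 2 October, the query silently returns an older price instead of failing.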

TSQL - How to extract column with the Minimum and Maximum Value in another column

eg.
I have 2 tables Customers and Orders
Customers
has columns
CustomerID, Name
Orders has columns OrderID, OrderedOn
where OrderedOn is a DateTime
Now I want a query which will give me the
CustomerID OrderID and OrderTally
where OrderTally = 'Initial' for the first order
OrderTally = 'InTheMiddle' for everything in the middle
and
OrderTally = 'Final' if it's the last order and was 30 days ago,
I am trying to create a Case statement for OrderTally
and struggling
How do I check if the OrderID is the first or last or in the middle
-- The First Order
CASE WHEN OrderID IN (...)
THEN 'Initial'
-- The Last Order
WHEN OrderID IN (...)
THEN 'Final'
ELSE
'InTheMiddle'
END
I was thinking of writing ranking statements and then checking the rank: if it is one, that's the first order, and if it equals the total count of all orders, that's the last one... but this seems a little complicated.
Is there an easy way of doing this?
How about this:
SELECT o.CustomerId, o.OrderId,
CASE
WHEN o.OrderedOn = o2.FirstOrderDate THEN 'Initial'
WHEN o.OrderedOn = o2.LastOrderDate THEN 'Final'
ELSE 'InTheMiddle'
END AS OrderTally
FROM [Orders] o
JOIN
(
SELECT CustomerId, MIN(OrderedOn) AS FirstOrderDate, MAX(OrderedOn) AS LastOrderDate
FROM [Orders]
GROUP BY CustomerId
) o2 ON o.CustomerId = o2.CustomerId
I've made the assumption that a customer won't have 2 orders with the same OrderedOn date/time - if they do, this would possibly result in 2 orders being classed as "Initial" or "Final". But it seemed like a reasonable assumption.
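To see how the MIN/MAX join behaves, here is a small sketch on an in-memory SQLite database. It assumes the Orders table carries a CustomerID column (the question lists only OrderID and OrderedOn), and the rows are invented for illustration. It does not handle the "was 30 days ago" condition from the question.

```python
import sqlite3

def order_tally(conn):
    """Label each order as the customer's first, last, or an in-between order."""
    sql = """
        SELECT o.CustomerID, o.OrderID,
               CASE
                   WHEN o.OrderedOn = o2.FirstOrderDate THEN 'Initial'
                   WHEN o.OrderedOn = o2.LastOrderDate THEN 'Final'
                   ELSE 'InTheMiddle'
               END AS OrderTally
        FROM Orders o
        JOIN (SELECT CustomerID,
                     MIN(OrderedOn) AS FirstOrderDate,
                     MAX(OrderedOn) AS LastOrderDate
              FROM Orders
              GROUP BY CustomerID) o2 ON o2.CustomerID = o.CustomerID
        ORDER BY o.CustomerID, o.OrderedOn, o.OrderID
    """
    return conn.execute(sql).fetchall()

# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER,
                         OrderedOn TEXT);
    INSERT INTO Orders VALUES
        (10, 1, '2024-01-01 09:00:00'),
        (11, 1, '2024-02-01 09:00:00'),
        (12, 1, '2024-03-01 09:00:00'),
        (20, 2, '2024-01-05 12:00:00');
""")

for row in order_tally(conn):
    print(row)
```

Note that a customer with a single order gets 'Initial', because the first WHEN branch wins when the first and last order dates coincide. Swap the two branches if such an order should read 'Final' instead.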