Distinct record with oldest date - SQL

OK, I have two tables.
They have matching customer ID fields.
customer has cust_id as its primary key, and orders can have many rows per cust_id.
I want to display the first order record (earliest dated) for each customer ID.
Select customer.*, orders.*
from customer , orders
where orders.date = (select max(orders.date) from orders
where customer.customer-id = orders.customer-id)
This query combines the tables, but I get multiple entries for each customer ID and I only want the entry with the oldest date for each customer ID.
How do I get just the oldest dated record for each customer?

You could accomplish this using an outer apply. That would look something like this:
select c.*, o.*
from customer c
outer apply (
select top 1 *
from orders o
where o.[customer-id] = c.[customer-id] -- a hyphenated column name has to be quoted
order by o.[date] asc
) o
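If your database has no OUTER APPLY, a correlated MIN() subquery in the join condition gives a similar result. Below is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module. The table layout, the order_date column name and the sample rows are assumptions made up for the demo, not the asker's real schema. Note that, unlike TOP 1, this returns every tied row if a customer has two orders on the same earliest date.

```python
import sqlite3

# Hypothetical schema and data, invented only for this demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, order_date TEXT);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES
        (10, 1, '2020-03-01'),
        (11, 1, '2020-01-15'),
        (12, 2, '2021-06-30');
""")

# LEFT JOIN keeps customers with no orders; the correlated subquery
# restricts the join to the order(s) carrying the earliest date.
earliest_orders = conn.execute("""
    SELECT c.cust_id, c.name, o.order_id, o.order_date
    FROM customer c
    LEFT JOIN orders o
           ON o.cust_id = c.cust_id
          AND o.order_date = (SELECT MIN(o2.order_date)
                              FROM orders o2
                              WHERE o2.cust_id = c.cust_id)
    ORDER BY c.cust_id
""").fetchall()

for row in earliest_orders:
    print(row)
```

Customer 3 has no orders in the sample data, so it comes back with NULLs in the order columns, the same way OUTER APPLY would return it.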

Related

Getting only the last customer purchase

I have to write a query to get the following result: "We want to give a coupon with a 10% value of the last customer purchase. The customers eligible for this coupon must have made another purchase before the last one that is equal or bigger than the last one. Create a query that returns the coupon values for each one of the eligible customers."
I have already written the following query, which returns all the purchases along with the coupon values, but for the result to be right I need only the rows for each customer's last purchase to be shown. Any idea how to do this?
SELECT * FROM(
select *,
lag(PaymentValue) over (PARTITION BY IDCustomer ORDER BY ApprovalDate) LastPurchaseValue,
PaymentValue*0.1 as CouponValue
from (
SELECT Customers.customer_unique_id as IDCustomer,
Orders.order_id as IDOrder,
order_approved_at as ApprovalDate,
SUM(Payments.payment_value) as PaymentValue
from olist_customers_dataset as Customers
inner join olist_orders_dataset as Orders on Customers.customer_id = Orders.customer_id
INNER JOIN olist_order_payments_dataset as Payments on Payments.order_id = Orders.order_id
GROUP BY IDCustomer, IDOrder, ApprovalDate
ORDER BY IDCustomer
)
) WHERE LastPurchaseValue > PaymentValue
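One possible approach, as a sketch only: number each customer's purchases from newest to oldest with ROW_NUMBER() and keep row 1, while a second window computes the largest earlier purchase. The example runs on an in-memory SQLite database (window functions need SQLite 3.25 or later). The purchases table is a hypothetical stand-in for the aggregated inner query from the question, and the sample rows are invented. It also assumes "another purchase before the last one" means any earlier purchase; if only the immediately preceding purchase should count, lag() as in the question would be used instead.

```python
import sqlite3

# Hypothetical pre-aggregated data standing in for the inner query
# (one row per customer and order); values are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (IDCustomer TEXT, IDOrder TEXT,
                            ApprovalDate TEXT, PaymentValue REAL);
    INSERT INTO purchases VALUES
        ('a', 'o1', '2018-01-01', 100.0),
        ('a', 'o2', '2018-02-01', 150.0),
        ('a', 'o3', '2018-03-01', 120.0),
        ('b', 'o4', '2018-01-05',  50.0),
        ('b', 'o5', '2018-02-05',  80.0),
        ('c', 'o6', '2018-01-09',  70.0);
""")

coupons = conn.execute("""
    SELECT IDCustomer, IDOrder, PaymentValue * 0.1 AS CouponValue
    FROM (
        SELECT *,
               -- rn = 1 marks each customer's most recent purchase
               ROW_NUMBER() OVER (PARTITION BY IDCustomer
                                  ORDER BY ApprovalDate DESC) AS rn,
               -- largest purchase strictly before the current row
               MAX(PaymentValue) OVER (PARTITION BY IDCustomer
                                       ORDER BY ApprovalDate
                                       ROWS BETWEEN UNBOUNDED PRECEDING
                                                AND 1 PRECEDING) AS BestEarlierValue
        FROM purchases
    )
    WHERE rn = 1
      AND BestEarlierValue >= PaymentValue
    ORDER BY IDCustomer
""").fetchall()

for row in coupons:
    print(row)
```

With this sample data only customer 'a' qualifies: the last purchase is 120 and an earlier one was 150. Customer 'b' never spent as much before, and customer 'c' has a single purchase, so BestEarlierValue is NULL and the row is filtered out.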

Finding the most frequently occurring combination

I have two tables named Orders and Products. The Orders table contains the orders made by customers, and the products included in each order are in the Products table.
My requirement is to get the total number of orders for the most frequently occurring products.
That is, for Product 1, Product 2 and Product 3, what is the total number of orders? If an order contains 10 products that include Product 1, Product 2 and Product 3, that order should be counted.
An order_id can have multiple products, and I'm confused about how to get this result. Can anyone share or suggest a solution?
I'm using PostgreSQL.
Below is the sample query:
SELECT
"Orders"."order_id",pr.product_name
FROM
"data"."orders" AS "Orders"
LEFT JOIN data.items i On i."order_id"="Orders"."order_id"
LEFT join data.products pr on pr."product_id"=i."product_id"
WHERE TO_CHAR("Orders"."created_at_order",'YYYY-MM-DD') BETWEEN '2019-02-01' AND '2019-04-30'
ORDER BY "Orders"."order_id"
The desired result looks like this (3 columns): the most purchased product combination with the number of orders in which it occurs.
Product 1, Product 2, Product 3, etc., Number Of Orders
This is the sample data output. I need the list of products that are purchased in combination most often. (For now I have given only 3 columns as a sample, but it may vary according to the number of products in an order.)
An example:
SELECT
"Orders"."order_id",
string_agg(DISTINCT pr.product_name::character varying, ',') AS product_name,
count(1) AS product_no
FROM
"data"."orders" AS "Orders"
LEFT JOIN data.items i On i."order_id"="Orders"."order_id"
LEFT join data.products pr on pr."product_id"=i."product_id"
WHERE TO_CHAR("Orders"."created_at_order",'YYYY-MM-DD') BETWEEN '2019-02-01' AND '2019-04-30'
GROUP BY "Orders"."order_id"
ORDER BY count(1);
You can try using a GROUP BY clause.
If you want to get the number of orders for each product in general, you can count the orders grouped by product. Assuming data.items has one row per product in an order (as the joins in the question suggest), the query should look something like this:
SELECT product_id, COUNT(DISTINCT order_id)
FROM data.items
GROUP BY product_id
ORDER BY COUNT(DISTINCT order_id) DESC
LIMIT 1;
Hope this helps!
Try using GROUP BY and take the most counted value, as below:
SELECT
pr.product_name,
COUNT(DISTINCT "Orders"."order_id")
FROM
"data"."orders" AS "Orders"
LEFT JOIN data.items i On i."order_id"="Orders"."order_id"
LEFT join data.products pr on pr."product_id"=i."product_id"
WHERE TO_CHAR("Orders"."created_at_order",'YYYY-MM-DD') BETWEEN '2019-02-01' AND '2019-04-30'
GROUP BY pr.product_name
ORDER BY COUNT(DISTINCT "Orders"."order_id") DESC
LIMIT 1 -- keep or drop the LIMIT as required
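For the other half of the question (counting the orders that contain Product 1, Product 2 and Product 3, whatever else is in them), one common pattern is to filter the item rows to the wanted products, group by order, and keep the orders where all of the wanted products showed up. Here is a small sketch on an in-memory SQLite database. The items table with order_id and product_id columns is taken from the joins in the question, but the sample rows and the wanted product IDs are invented for the demo.

```python
import sqlite3

# Hypothetical sample rows: one row per product in an order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (order_id INTEGER, product_id INTEGER);
    INSERT INTO items VALUES
        (1, 1), (1, 2), (1, 3), (1, 4),
        (2, 1), (2, 2),
        (3, 1), (3, 2), (3, 3),
        (4, 3);
""")

wanted = (1, 2, 3)  # product IDs that must all be present in an order
placeholders = ",".join("?" for _ in wanted)

# An order qualifies when the number of distinct wanted products
# it contains equals the size of the wanted set.
query = f"""
    SELECT COUNT(*) FROM (
        SELECT order_id
        FROM items
        WHERE product_id IN ({placeholders})
        GROUP BY order_id
        HAVING COUNT(DISTINCT product_id) = ?
    )
"""
(order_count,) = conn.execute(query, (*wanted, len(wanted))).fetchone()
print(order_count)
```

In the sample data, orders 1 and 3 contain all three products (order 1 also has a fourth), while orders 2 and 4 are missing at least one. The same query shape should carry over to PostgreSQL, apart from the parameter placeholder syntax of your driver.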

SQL LEFT JOIN - Inner select not returning columns

I have two tables called 'Customers' and 'Orders'. The column names are as follows:
Customers: id, name, address
Orders: id, person_id, product, price
The desired outcome is to query all customers together with one of their latest purchases. I have a lot of duplicates in the 'Orders' table, where two records have the same timestamp due to a bug.
I have written the following code, but the issue is that the query does not return the column values from the second table (Orders). Can anyone advise what the issue is?
SELECT C.Id,C.Name, O.item, O.price, O.product
FROM Customers C
LEFT JOIN
(
SELECT TOP 1 person_id
FROM Orders
WHERE status = 'Pending'
) O ON C.ID = O.person_id
Results: O.item, O.price, O.product values are all null
Edit: Sample Data
Customers:
ID/ NAME/ ADDRESS
1/ A/ Ad1
2/ B/ Ad2
3/ C/ Ad3
Orders:
ID/ Person ID/ PRODUCT/ PRICE/ Created Date
ID-1234/ 1/ Book/ $5/ 26-2-2017
ID-1235/ 1/ Book/ $5/ 26-2-2017
ID-1236/ 2/ Calendar/ $10/ 4-2-2017
ID-1238/ 1/ Pen/ $2/ 1-1-2016
Assuming that the id column in Orders is a primary key autoincrement, then the following should work:
SELECT c.id,
c.name,
COALESCE(t1.price, 0.0) AS price,
COALESCE(t1.product, 'NA') AS product
FROM Customers c
LEFT JOIN Orders t1
ON c.id = t1.person_id
LEFT JOIN
(
SELECT person_id, MAX(CAST(SUBSTRING(id, 4, LEN(id)) AS INT)) AS max_id
FROM Orders
GROUP BY person_id
) t2
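ON t1.person_id = t2.person_id AND
t2.max_id = CAST(SUBSTRING(t1.id, 4, LEN(t1.id)) AS INT)
-- keep only each customer's highest-ID order, plus customers with no orders
WHERE t2.max_id IS NOT NULL OR t1.id IS NULL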
This answer assumes that taking the greatest order ID per customer will yield the most recent purchase. Ideally you should have a timestamp column which captures when a transaction took place. Note that even in the query above, we still have no way of knowing when the most recent transaction took place.
So where is the timestamp column? It's not mentioned in your table schema. But your description does not mention the status column either, and that is clearly in there.
Is orders.id unique? Is it the key for the Orders table? If it is, then your schema has no way to identify "duplicate" records. You cannot mean to imply that only one order per customer is allowed, so if there are multiple orders for a single customer, how do we identify the duplicates? By the unmentioned timestamp column?
If there IS a timestamp column, and that's how you would identify dupes, then use it.
SELECT C.Id,C.Name, O.item, O.price, O.product
FROM Customers C LEFT JOIN Orders o
on o.id = (Select Min(id) from orders
where person_id = c.Id
and timestamp = o.timestamp
and status = 'Pending')
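To see how a timestamp-based pick behaves on the sample data from the question, here is a small sketch on an in-memory SQLite database. The created column name and the ISO date format are assumptions (the question only shows a "Created Date" heading), and the status filter is left out because the sample data has no status values. The duplicate rows with the same date are resolved by taking the lowest order ID.

```python
import sqlite3

# Sample data from the question; the 'created' column name and
# ISO date format are assumptions made for this demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE Orders (id TEXT PRIMARY KEY, person_id INTEGER,
                         product TEXT, price REAL, created TEXT);
    INSERT INTO Customers VALUES (1, 'A', 'Ad1'), (2, 'B', 'Ad2'), (3, 'C', 'Ad3');
    INSERT INTO Orders VALUES
        ('ID-1234', 1, 'Book',     5,  '2017-02-26'),
        ('ID-1235', 1, 'Book',     5,  '2017-02-26'),
        ('ID-1236', 2, 'Calendar', 10, '2017-02-04'),
        ('ID-1238', 1, 'Pen',      2,  '2016-01-01');
""")

# For each customer, join exactly one order: the newest by date,
# with ties on the date broken by the lowest order ID.
latest = conn.execute("""
    SELECT c.id, c.name, o.id, o.product, o.price
    FROM Customers c
    LEFT JOIN Orders o
           ON o.id = (SELECT o2.id
                      FROM Orders o2
                      WHERE o2.person_id = c.id
                      ORDER BY o2.created DESC, o2.id ASC
                      LIMIT 1)
    ORDER BY c.id
""").fetchall()

for row in latest:
    print(row)
```

Customer 1 gets a single 'Book' row even though the order is duplicated, and customer 3, who has no orders, is still returned with NULLs. In SQL Server the same idea would use TOP 1 instead of LIMIT 1.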

sql getting data three columns from one table and one from another

I have two tables, sales and customers.
From the first table I want to sum two columns, recived_amount and bill_amount, subtract one total from the other, and also get customer_id.
From the second table I just want to get Customer_name.
I've tried:
SELECT Customer_details.CUS_name,
SUM(SALES.Bill_Amount) - SUM(SALES.Recived_Amount),
Customer_details.Cus_id
from sales
INNER JOIN Customer_details
ON Customer_details.Cus_id=sales.Cus_id
where SALES.Cus_id = 1 order BY Cus_id
SELECT Customer_details.CUS_name, (SUM(SALES.Bill_Amount) - SUM(SALES.Recived_Amount)) as Subtract,Customer_details.Cus_id
from sales INNER JOIN Customer_details ON Customer_details.Cus_id=sales.Cus_id
where SALES.Cus_id = 1
group by Customer_details.CUS_name,Customer_details.Cus_id
order BY Cus_id
SELECT Customer_details.CUS_name,
(SUM(SALES.Bill_Amount)-SUM(SALES.Recived_Amount)) AS Balance_Owed,
Customer_details.Cus_ID
FROM Sales INNER JOIN Customer_details ON Customer_details.Cus_id=Sales.Cus_id
GROUP BY Customer_details.CUS_name,Customer_details.Cus_id
ORDER BY Customer_details.Cus_id;
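What the queries with GROUP BY add over the first attempt is the grouping itself: once SUM() appears next to plain columns, those columns have to be listed in GROUP BY so that there is one total per customer. A quick way to check the result is a sketch like the one below, run on an in-memory SQLite database. The table and column names follow the question, while the sample rows are invented.

```python
import sqlite3

# Invented sample rows; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer_details (Cus_id INTEGER PRIMARY KEY, CUS_name TEXT);
    CREATE TABLE Sales (Cus_id INTEGER, Bill_Amount REAL, Recived_Amount REAL);
    INSERT INTO Customer_details VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Sales VALUES (1, 100, 60), (1, 50, 50), (2, 30, 10);
""")

# One row per customer: total billed minus total received.
balances = conn.execute("""
    SELECT cd.CUS_name,
           SUM(s.Bill_Amount) - SUM(s.Recived_Amount) AS Balance_Owed,
           cd.Cus_id
    FROM Sales s
    INNER JOIN Customer_details cd ON cd.Cus_id = s.Cus_id
    GROUP BY cd.CUS_name, cd.Cus_id
    ORDER BY cd.Cus_id
""").fetchall()

for row in balances:
    print(row)
```

Adding WHERE s.Cus_id = 1 before the GROUP BY limits the output to a single customer.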

SQL query for join with condition

I have these two tables:
Customers: Id, Name
Orders: Id, CustomerId, Time, Status
I want to get a list of customers for which the LAST order does not have a status of 'Wrong'.
I know how to use a LEFT JOIN to get a count of orders for each customer, but I don't know how I can use this statement for what I want. Maybe a JOIN is not the right thing to use either, I'm not sure.
It's possible that customers do not have any order, and they should be returned.
I'm abstracting the real tables here, but the scenario is for a windows phone app sending notifications. I want to get all clients for which their last notification does not have a 'Dropped' status. I can sort their notifications (orders) by the 'Time' field. Thanks for the help, while I continue experimenting with subqueries in the where clause.
Select ...
From Customers As C
Where Not Exists (
Select 1
From Orders As O1
Join (
Select O2.CustomerId, Max( O2.Time ) As Time
From Orders As O2
Group By O2.CustomerId
) As LastOrderTime
On LastOrderTime.CustomerId = O1.CustomerId
And LastOrderTime.Time = O1.Time
Where O1.Status = 'Dropped'
And O1.CustomerId = C.Id
)
There are obviously alternatives based on the actual database product and version. For example, in SQL Server one could use the TOP command or a CTE perhaps. However, without knowing what specific product is being used, the above solution should produce the results you want in almost any database product.
Addition
If you were using a product that supported ranking functions (which database product and version isn't mentioned) and common-table expressions, then an alternative solution might be something like so:
With RankedOrders As
(
Select O.CustomerId, O.Status
, Row_Number() Over( Partition By CustomerId Order By Time Desc ) As Rnk
From Orders As O
)
Select ...
From Customers As C
Where Not Exists (
Select 1
From RankedOrders As O1
Where O1.CustomerId = C.Id
And O1.Rnk = 1
And O1.Status = 'Dropped'
)
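To check the NOT EXISTS idea against some data, here is a small sketch on an in-memory SQLite database. It uses a correlated MAX(Time) subquery rather than the derived table or the CTE, which should be equivalent for this purpose. The sample rows and the 'Sent' status are invented for the demo; only 'Dropped' comes from the question.

```python
import sqlite3

# Invented sample rows; 'Sent' is a hypothetical non-dropped status.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER,
                         Time TEXT, Status TEXT);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Orders VALUES
        (1, 1, '2012-01-01', 'Dropped'),
        (2, 1, '2012-01-02', 'Sent'),
        (3, 2, '2012-01-01', 'Sent'),
        (4, 2, '2012-01-03', 'Dropped');
""")

# Exclude a customer only when their most recent order is 'Dropped'.
rows = conn.execute("""
    SELECT C.Id, C.Name
    FROM Customers AS C
    WHERE NOT EXISTS (
        SELECT 1
        FROM Orders AS O1
        WHERE O1.CustomerId = C.Id
          AND O1.Status = 'Dropped'
          AND O1.Time = (SELECT MAX(O2.Time)
                         FROM Orders AS O2
                         WHERE O2.CustomerId = C.Id)
    )
    ORDER BY C.Id
""").fetchall()

for row in rows:
    print(row)
```

Ann is kept because her dropped order is not her latest one, Bob is excluded because his latest order is dropped, and Cy is kept because he has no orders at all.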
Assuming "last order" refers to the Time column, here is my query:
SELECT C.Id,
C.Name,
MAX(O.Time)
FROM
Customers C
INNER JOIN Orders O
ON C.Id = O.CustomerId
WHERE
O.Status != 'Wrong'
GROUP BY C.Id,
C.Name
EDIT:
Regarding your table configuration. You should really consider revising the structure to include a third table. They would look like this:
Customer
CustomerId | Name
Order
OrderId | Status | Time
CompletedOrders
CoId | CustomerId | OrderId
Now what you do is store the info about a customer or order in their respective tables ... then when an order is made you just create a CompletedOrders entry with the ids of the 2 individual records. This will allow for a 1 to Many relationship between customer and orders.
Didn't check it out, but something like this?
SELECT c.Id, c.Name, MAX(o.Time)
FROM Customers c
LEFT JOIN Orders o ON o.CustomerId = c.Id
WHERE o.Status <> 'Wrong'
GROUP BY c.Id, c.Name
You can get the list of customers whose LAST order has a status of 'Wrong' with something like:
select o.customerId from orders o
where o.status = 'Wrong'
and o.time = (select max(o2.time) from orders o2
where o2.customerId = o.customerId)