SQL - create flag in query to highlight orders which contain quantity = 1

I have tried creating a CASE statement, but it doesn't seem to give me what I want. I'd like to take the table (which is at a product level), aggregate it at an order level, and split out the orders which contain a quantity of 1.
Any ideas on how I would do this?
order id | Product | Quantity
---------+---------+--------------
11111 | sdsd4 | 1 (single item )
22222 | sasas | 1 (multiple items)
22222 | wertt | 1 (multiple items)
I'd like a CASE statement that adds another column to split out orders with quantity = 1 from orders with quantity greater than 1.
The desired outcome would be the column in (brackets).
I could then count the orders and bring in the newly created column as the dimension.
More detail here:
Attached is an image of table structure.
Logic: if quantity = 1 and the order has one line, then it is a single item order;
if the order has one item but multiples of the same item, then it is a non single item order;
if the order has more than one product, then it is a non single item order.

If your database supports analytic functions, then you can use a query like this one:
SELECT *,
CASE WHEN count("Product") OVER (partition by "order id") > 1
THEN 'multiple items' ELSE 'single item'
END As "How many items"
FROM Table1
Demo: https://dbfiddle.uk/?rdbms=postgres_11&fiddle=b659279fc16d2084cb1cf4a3bea361a1
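The asker's rules also treat an order with one product but a quantity above 1 as a non single item order, which a count of products alone does not capture. Below is a minimal sketch of one way to cover that case, assuming an in-memory SQLite database (3.25 or later, for window functions) driven from Python. The table name `order_lines` and the third sample order are made up for illustration, and the `SUM(quantity) = 1` test assumes quantities are positive integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (order_id INTEGER, product TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [
        (11111, "sdsd4", 1),
        (22222, "sasas", 1),
        (22222, "wertt", 1),
        (33333, "qwert", 2),  # hypothetical order: one product, quantity 2
    ],
)

# Row-level flag: an order is 'single item' only when its quantities sum to 1,
# i.e. it has exactly one line and that line has quantity 1.
rows = conn.execute(
    """
    SELECT order_id, product, quantity,
           CASE WHEN SUM(quantity) OVER (PARTITION BY order_id) = 1
                THEN 'single item'
                ELSE 'multiple items'
           END AS item_flag
    FROM order_lines
    ORDER BY order_id, product
    """
).fetchall()

# Order-level counts, using the flag as the dimension.
counts = conn.execute(
    """
    SELECT item_flag, COUNT(*) AS orders
    FROM (SELECT order_id,
                 CASE WHEN SUM(quantity) = 1
                      THEN 'single item'
                      ELSE 'multiple items'
                 END AS item_flag
          FROM order_lines
          GROUP BY order_id) AS per_order
    GROUP BY item_flag
    ORDER BY item_flag
    """
).fetchall()

for row in rows:
    print(row)
print(counts)
```

On this sample only order 11111 is flagged 'single item', and the order-level counts are 2 'multiple items' orders and 1 'single item' order.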

Below is for BigQuery Standard SQL
#standardSQL
SELECT *,
CASE COUNT(DISTINCT Product) OVER(PARTITION BY order_id)
WHEN 1 THEN 'Single Item Order'
ELSE 'Multiple Items Order'
END Single_or_Multiple
FROM `project.dataset.table`
You can test and play with the above using dummy data as below
#standardSQL
WITH `project.dataset.table` AS (
SELECT 11111 order_id, 'sdsd4' Product, 1 Quantity UNION ALL
SELECT 22222, 'sasas', 2 UNION ALL
SELECT 22222, 'wertt', 1
)
SELECT *,
CASE COUNT(DISTINCT Product) OVER(PARTITION BY order_id)
WHEN 1 THEN 'Single Item Order'
ELSE 'Multiple Items Order'
END Single_or_Multiple
FROM `project.dataset.table`
with result
Row | order_id | Product | Quantity | Single_or_Multiple
----+----------+---------+----------+---------------------
1 | 11111 | sdsd4 | 1 | Single Item Order
2 | 22222 | sasas | 2 | Multiple Items Order
3 | 22222 | wertt | 1 | Multiple Items Order

If I understand this right, you could use a subquery to get the count of records for an order and flag a record if this count is larger than 1 and the quantity is equal to 1.
SELECT t1.order_id,
t1.product,
t1.quantity,
CASE
WHEN t1.quantity = 1
AND (SELECT count(*)
FROM elbat t2
WHERE t2.order_id = t1.order_id) > 1 THEN
'flag'
ELSE
'no flag'
END flag
FROM elbat t1;

Related

Get count With Distinct in SQL Server

Select
Count(Distinct iif(t.HasReplyTask = 1, t.CustomerID, Null)) As Reply,
Count(Distinct iif(t.HasOverdueTask = 1, t.CustomerID, Null)) As Overdue,
Count(Distinct t.CustomerID) As Total
From
Table1 t
If a customer is in Reply, we need to remove that customer from the Overdue count. That means if customer 123 is in both, the Overdue count should be one less. How can I do this?
I am adding some data here.
Customer 123 has "HasReplyTask", so we have to filter that customer out of the Overdue count (even though that customer has one overdue task without HasReplyTask). 234 counts as one, and the distinct count of 456 is one.
So the Overdue count should be 2; the above query returns 3.
If I've got it right, this can be done using a subquery to get the numbers for each customer, and then get the summary information as follows:
Select Sum(HasReplyTask) As Reply,
Sum(HasOverdueTask) As Overdue,
Count(CustomerID) As Total
From (
Select CustomerID,
IIF(Max(Cast(HasReplyTask As TinyInt))<>0, 0, Max(Cast(HasOverdueTask As TinyInt))) As HasOverdueTask,
Max(Cast(HasReplyTask As TinyInt)) As HasReplyTask
From Table1
Group by CustomerID) As T
I don't know the column data types, so I used the CAST function to be able to use the MAX function.
db<>fiddle
Reply | Overdue | Total
------+---------+------
1 | 2 | 3
What would probably be more efficient for you is to pre-aggregate your table by customer ID and have counts per customer. Then your outer query can test for whatever you are really looking for. Something like
select
sum( case when PQ.ReplyCount > 0 then 1 else 0 end ) UniqReply,
sum( case when PQ.OverdueCount > 0 then 1 else 0 end ) UniqOverdue,
sum( case when PQ.OverdueCount - PQ.ReplyCount > 0 then 1 else 0 end ) PendingReplies,
count(*) as UniqCustomers
from
( select
yt.customerid,
count(*) CustRecs,
sum( case when yt.HasReplyTask = 1 then 1 else 0 end ) ReplyCount,
sum( case when yt.HasOverdueTask = 1 then 1 else 0 end ) OverdueCount
from
yourTable yt
group by
yt.customerid ) PQ
Now, to differentiate the count you are REALLY looking for, you might need to test ReplyCount against OverdueCount in the prequery (PQ). For a single customer ID (hence the prequery), if the OverdueCount is GREATER than the ReplyCount, is it still considered overdue? So for customer 123, they had 3 overdue but only 2 replies. Do you want that counted once? But customers 234 and 456 only had overdue entries and NO replies. So the total where Overdue - Reply > 0 is 3 distinct people.
Is that your ultimate goal?
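To make the pre-aggregation idea concrete, here is a small sketch, assuming an in-memory SQLite database driven from Python. The sample rows are hypothetical, reconstructed only from the asker's description (123 has both a reply task and an overdue task, 234 has one overdue task, 456 has two), and `CASE` is used in place of `IIF` so that it runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (CustomerID INTEGER, HasReplyTask INTEGER, HasOverdueTask INTEGER)"
)
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (123, 1, 0),  # hypothetical rows matching the description in the question
        (123, 0, 1),
        (234, 0, 1),
        (456, 0, 1),
        (456, 0, 1),
    ],
)

# Collapse to one row per customer first, then count a customer as overdue
# only when that customer has no reply task at all.
summary = conn.execute(
    """
    SELECT SUM(has_reply) AS reply,
           SUM(CASE WHEN has_reply = 0 THEN has_overdue ELSE 0 END) AS overdue,
           COUNT(*) AS total
    FROM (SELECT CustomerID,
                 MAX(HasReplyTask)   AS has_reply,
                 MAX(HasOverdueTask) AS has_overdue
          FROM Table1
          GROUP BY CustomerID) AS per_customer
    """
).fetchone()

print(summary)
```

With these rows the summary is Reply 1, Overdue 2, Total 3, which is the outcome the asker describes.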

Last record per transaction

I am trying to select the last record per sales order.
My query is simple in SQL Server management.
SELECT *
FROM DOCSTATUS
The problem is that this database has tens of thousands of records, as it tracks all SO steps.
ID SO SL Status Reason Attach Name Name Systemdate
22 951581 3 Processed Customer NULL NULL BW 2016-12-05 13:33:27.857
23 951581 3 Submitted Customer NULL NULL BW 2016-17-05 13:33:27.997
24 947318 1 Hold Customer NULL NULL bw 2016-12-05 13:54:27.173
25 947318 1 Invoices Submit Customer NULL NULL bw 2016-13-05 13:54:27.300
26 947318 1 Ship Customer NULL NULL bw 2016-14-05 13:54:27.440
I would like to see the most recent record per SO
ID SO SL Status Reason Attach Name Name Systemdate
23 951581 3 Submitted Customer NULL NULL BW 2016-17-05 13:33:27.997
26 947318 1 Ship Customer NULL NULL bw 2016-14-05 13:54:27.440
Well I'm not sure how that table has two Name columns, but one easy way to do this is with ROW_NUMBER():
;WITH cte AS
(
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY SO ORDER BY Systemdate DESC)
FROM dbo.DOCSTATUS
)
SELECT ID, SO, SL, Status, Reason, ..., Systemdate
FROM cte WHERE rn = 1;
Also please always reference the schema, even if today everything is under dbo.
I think you can keep it this simple:
SELECT *
FROM DOCSTATUS
WHERE ID IN (SELECT MAX(ID)
FROM DOCSTATUS
GROUP BY SO)
You want only the maximum ID from each SO.
An efficient method with the right index is a correlated subquery:
select t.*
from t
where t.systemdate = (select max(t2.systemdate) from t t2 where t2.so = t.so);
The index is on (so, systemdate).
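As a quick check of the approaches above, here is a sketch that loads the question's sample rows into an in-memory SQLite database (3.25 or later) from Python and compares the ROW_NUMBER() query with the MAX(ID) query. Only a subset of the columns is modelled, and the Systemdate values are copied verbatim and stored as text, so they are compared as strings here; both are assumptions of this sketch rather than properties of the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DOCSTATUS (ID INTEGER, SO INTEGER, SL INTEGER, Status TEXT, Systemdate TEXT)"
)
conn.executemany(
    "INSERT INTO DOCSTATUS VALUES (?, ?, ?, ?, ?)",
    [
        (22, 951581, 3, "Processed", "2016-12-05 13:33:27.857"),
        (23, 951581, 3, "Submitted", "2016-17-05 13:33:27.997"),
        (24, 947318, 1, "Hold", "2016-12-05 13:54:27.173"),
        (25, 947318, 1, "Invoices Submit", "2016-13-05 13:54:27.300"),
        (26, 947318, 1, "Ship", "2016-14-05 13:54:27.440"),
    ],
)

# Latest row per SO by Systemdate, using ROW_NUMBER().
by_row_number = conn.execute(
    """
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY SO ORDER BY Systemdate DESC) AS rn
        FROM DOCSTATUS
    )
    SELECT ID, SO, Status FROM cte WHERE rn = 1 ORDER BY ID
    """
).fetchall()

# Latest row per SO by ID, using MAX(ID).
by_max_id = conn.execute(
    """
    SELECT ID, SO, Status
    FROM DOCSTATUS
    WHERE ID IN (SELECT MAX(ID) FROM DOCSTATUS GROUP BY SO)
    ORDER BY ID
    """
).fetchall()

print(by_row_number)
print(by_max_id)
```

Both queries return rows 23 and 26 on this sample. They agree only because the IDs here increase together with Systemdate; if that does not hold in the real data, the two approaches can pick different rows.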

Query to list every line item but only list total once

We have a sales report that pulls data from multiple tables and my query shows correct data except orders that have multiple line items, i.e., the Total from the Orders table is listed on every line item row.
How can we list the Order Total only once on the row that has the smallest line item ID (for that order) but still list every line item row? Thanks!
Data Structure:
Orders Table:
Order_ID
Total
Line Items Table:
ID
Order_ID
Line_Item_Price
Line_Item_Qty
Result should be:
Order_ID Total Line_Item_Price Line_item_Qty Line_Item_ID
---------- ------- ----------------- --------------- --------------
10001 200 100 2 32001
10002 150 150 1 32002
10003 210 55 1 32003
10003 30 2 32004
10003 95 1 32005
10004 125 125 1 32006
This should be done in the application, not in SQL.
But you can do that using window functions:
select o.order_id,
case row_number() over (partition by o.order_id order by line_item_id)
when 1 then o.total
end as total,
li.line_item_price,
li.line_item_qty,
li.line_item_id
from orders o
join line_item li on o.order_id = li.order_id
order by o.order_id, li.line_item_id;
row_number() assigns a unique row number for each line item for every order. When the number is 1 the total is displayed, otherwise it's not.
In a relational database there is no such thing as "the first row" unless you specify an order by - in this case the "first row" is the line item with the smallest line_item_id.
Online example: http://rextester.com/TQOIX50171
Unrelated, but: storing the total in the orders table is not a terribly good idea. In a normalized design you shouldn't store information that can easily be derived from existing data.
What this does is get the orders with their items and rank them, so that the smallest item_id is rank 1 for that order and the largest is the last rank.
ROW_NUMBER() is the function that gives the index of the rows in the output of the query (row 1 = 1, row 2 = 2). Then we can combine this with OVER (PARTITION BY), which means get the row numbers within a certain window, a partition. In this case we want to number the rows for the windows of Order_IDs. We use ORDER BY alongside to say how we order the rows within the window
When we have this table, we can then write a query on it to show the total only on the rows where the item_rank = 1
WITH rank_items_for_orders AS (
SELECT
o.Order_ID,
li.Line_Item_Price,
li.Line_Item_Qty,
li.Line_Item_ID,
o.Total,
-- number the line items of each order, smallest ID first
ROW_NUMBER()
OVER (PARTITION BY o.Order_ID ORDER BY li.Line_Item_ID ASC)
AS order_the_items_IDs
FROM orders o
LEFT JOIN line_items li ON o.order_id = li.order_id)
SELECT
Order_ID,
Line_Item_Price,
Line_Item_Qty,
Line_Item_ID,
-- show the total only on the first line item of each order
CASE WHEN order_the_items_IDs = 1
THEN Total ELSE NULL END
AS Total
FROM
rank_items_for_orders
ORDER BY Order_ID ASC, Line_Item_ID ASC
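To see the effect on the question's sample data, here is a small sketch using an in-memory SQLite database (3.25 or later) from Python. The table names `orders` and `line_items` and the column types are assumptions; the line item key is called `ID`, as in the question's data structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (Order_ID INTEGER, Total INTEGER)")
conn.execute(
    "CREATE TABLE line_items (ID INTEGER, Order_ID INTEGER, "
    "Line_Item_Price INTEGER, Line_Item_Qty INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(10001, 200), (10002, 150), (10003, 210), (10004, 125)],
)
conn.executemany(
    "INSERT INTO line_items VALUES (?, ?, ?, ?)",
    [
        (32001, 10001, 100, 2),
        (32002, 10002, 150, 1),
        (32003, 10003, 55, 1),
        (32004, 10003, 30, 2),
        (32005, 10003, 95, 1),
        (32006, 10004, 125, 1),
    ],
)

# Show the order total only on the line item with the smallest ID per order;
# every other line item of that order gets NULL (None in Python).
rows = conn.execute(
    """
    SELECT o.Order_ID,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY o.Order_ID ORDER BY li.ID) = 1
                THEN o.Total
           END AS Total,
           li.Line_Item_Price,
           li.Line_Item_Qty,
           li.ID AS Line_Item_ID
    FROM orders o
    JOIN line_items li ON li.Order_ID = o.Order_ID
    ORDER BY o.Order_ID, li.ID
    """
).fetchall()

for row in rows:
    print(row)
```

Order 10003 shows its total of 210 only on line item 32003; the rows for 32004 and 32005 carry None in the Total column.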

SQL aggregate rows with same id , specific value in secondary column

I'm looking to filter out rows in the database (PostgreSQL) if one of the values in the status column occurs. The idea is to sum the amount column if the unique reference only has a status equals to 1. The query should not SELECT the reference at all if it has also a status of 2 or any other status for that matter. status refers to the state of the transaction.
Current data table:
reference | amount | status
1 100 1
2 120 1
2 -120 2
3 200 1
3 -200 2
4 450 1
Result:
amount | status
550 1
I've simplified the data example but I think it gives a good idea of what I'm looking for.
I'm unsuccessful in selecting only references that only have status 1.
I've tried sub-queries, using the HAVING clause and other methods without success.
Thanks
Here's a way using not exists to sum all rows where the status is 1 and other rows with the same reference and a non 1 status do not exist.
select sum(amount) from mytable t1
where status = 1
and not exists (
select 1 from mytable t2
where t2.reference = t1.reference
and t2.status <> 1
)
SELECT SUM(amount)
FROM mytable
WHERE reference NOT IN (
SELECT reference
FROM mytable
WHERE status<>1
)
The subquery SELECTs all references that must be excluded, then the main query sums everything except them.
select sum (amount) as amount
from (
select sum(amount) as amount
from t
group by reference
having not bool_or(status <> 1)
) s;
amount
--------
550
You could use windowed functions to count occurrences of status different than 1 per each group:
SELECT SUM(amount) AS amount
FROM (SELECT *,COUNT(*) FILTER(WHERE status<>1) OVER(PARTITION BY reference) cnt
FROM tc) AS sub
WHERE cnt = 0;
Rextester Demo
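The answers above can be checked against the question's sample data with a short sketch, assuming an in-memory SQLite database driven from Python. SQLite has no `bool_or`, so the grouped variant below swaps in `MIN(status) = 1 AND MAX(status) = 1`, which is a substitution made for this sketch and not the PostgreSQL syntax used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (reference INTEGER, amount INTEGER, status INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 100, 1),
        (2, 120, 1),
        (2, -120, 2),
        (3, 200, 1),
        (3, -200, 2),
        (4, 450, 1),
    ],
)

# Variant 1: NOT EXISTS, skipping any reference that has a non-1 status.
not_exists_total = conn.execute(
    """
    SELECT SUM(amount)
    FROM mytable t1
    WHERE status = 1
      AND NOT EXISTS (SELECT 1
                      FROM mytable t2
                      WHERE t2.reference = t1.reference
                        AND t2.status <> 1)
    """
).fetchone()[0]

# Variant 2: group per reference and keep groups whose only status is 1.
having_total = conn.execute(
    """
    SELECT SUM(amount)
    FROM (SELECT SUM(amount) AS amount
          FROM mytable
          GROUP BY reference
          HAVING MIN(status) = 1 AND MAX(status) = 1) AS s
    """
).fetchone()[0]

print(not_exists_total, having_total)
```

Both variants give 550 on the sample, since only references 1 and 4 have status 1 exclusively.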

Select if then case with first record

Can you do something like this in SQL Server?
I want to select from a table which has some records with the same product_id in one column and a Y or N in another (in stock), and take the first one which has a Y where the product_id is the same, while matching the product_id_set from another table.
... ,
SELECT
(SELECT TOP 1
(product_name),
CASE
WHEN in_stock = 'Y' THEN product_name
ELSE product_name
END
FROM
Products
WHERE
Products.product_set = Parent_Table.product_set) AS 'Product Name',
...
Sample data would be
product_set in_stock product_id product_name
---------------------------------------------------
1 N 12 Orange
1 Y 12 Pear
2 N 12 Apple
2 N 12 Lemon
Output from product_set = 1 would be 'Pear' for example.
So there are two possible solutions, depending on the answers to the following questions. First, if there are no records for a product id with an in_stock value of 'Y', should anything be returned? Second, if there are multiple rows with in_stock 'Y', do you care which one it picks?
The first solution assumes you want the first row, whether or not there is ANY "Y" value.
select *
from (select *, RID = row_number() over (partition by product_set order by in_stock desc) -- i.e. sort Y before N
from Products) a
where a.RID = 1
The second will only return a value if there is at least one row with a 'Y' for in_stock. Note that the order by (select null) is essentially saying you don't care which one it picks if there are multiple in_stock items. If you DO care the order, replace it with the appropriate sort condition.
select *
from (select *, RID = row_number() over (partition by product_set order by (select null)) -- i.e. arbitrary order among the in-stock rows
from Products
where in_stock = 'Y') a
where a.RID = 1
I don't know what the structure of the "parent table" in your query is, so I've simplified it to assume you have what you need in Products alone.
SELECT ISNULL(
(
SELECT TOP 1 product_name
FROM Products
WHERE Products.product_set = Parent_Table.product_set
AND Products.in_stock = 'Y'
), 'Not in the stock') AS 'Product Name'
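Here is a sketch of the same fallback idea on the question's sample data, assuming an in-memory SQLite database driven from Python. SQLite has neither `TOP 1` nor `ISNULL`, so `LIMIT 1` and `COALESCE` stand in for them here. `Parent_Table` is a hypothetical one-column table, since its real structure is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (product_set INTEGER, in_stock TEXT, "
    "product_id INTEGER, product_name TEXT)"
)
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "N", 12, "Orange"),
        (1, "Y", 12, "Pear"),
        (2, "N", 12, "Apple"),
        (2, "N", 12, "Lemon"),
    ],
)
# Hypothetical parent table: one row per product_set.
conn.execute("CREATE TABLE Parent_Table (product_set INTEGER)")
conn.executemany("INSERT INTO Parent_Table VALUES (?)", [(1,), (2,)])

# For each parent row, pick one in-stock product name, or fall back to a label
# when the set has nothing in stock.
result = conn.execute(
    """
    SELECT p.product_set,
           COALESCE((SELECT product_name
                     FROM Products
                     WHERE Products.product_set = p.product_set
                       AND Products.in_stock = 'Y'
                     LIMIT 1),
                    'Not in the stock') AS product_name
    FROM Parent_Table p
    ORDER BY p.product_set
    """
).fetchall()

print(result)
```

Product set 1 resolves to 'Pear' and product set 2, which has nothing in stock, falls back to 'Not in the stock'. If a set had several in-stock rows, an ORDER BY inside the subquery would be needed to make the pick deterministic.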