The problem is as follows. There are two types of orders, VE and VC (VE orders have priority over VC orders), and two priorities, HIGH and LOW. Every order is identified by an ORDER_ID and is labeled with an order type and a priority. Over time an order can improve its type, its priority, or both, which results in several entries with the same order ID. The task is to label the highest-priority state of each order with 1 and the rest with 0. How would you attempt to do this considering that the ORDERS table is sufficiently big and that in some cases some rows would have to be re-labeled?
Example input:
Example output:
"How would you attempt to do this considering that the ORDERS table is
sufficiently big "
Well, first of all I wouldn't query all the rows in a "sufficiently big" table. That's why Nature gave us the WHERE clause.
So, given some form of filtering, the remaining logic is:
select order_id
     , order_type
     , priority
     , case when rn = 1 then 1 else 0 end as temp_label
from
    ( select order_id
           , order_type
           , priority
           , row_number() over ( partition by order_id
                                 order by decode(order_type, 'VE', 1, 2)
                                        , decode(priority, 'HIGH', 1, 2)
                               ) as rn
      from your_table
      where whatever = 'BLAH' -- your criteria go here
    )
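For anyone who wants to try the ranking logic without an Oracle instance, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module. The sample rows are made up, the table is called orders here, and decode is swapped for a CASE expression because SQLite has no decode. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample data: (order_id, order_type, priority).
ROWS = [
    (1, "VC", "LOW"),
    (1, "VC", "HIGH"),
    (1, "VE", "LOW"),
    (2, "VC", "LOW"),
    (2, "VC", "HIGH"),
]

# Same ranking as the answer above: order type first, then priority.
LABEL_SQL = """
select order_id, order_type, priority,
       case when rn = 1 then 1 else 0 end as temp_label
from (
    select order_id, order_type, priority,
           row_number() over (
               partition by order_id
               order by case order_type when 'VE' then 1 else 2 end,
                        case priority when 'HIGH' then 1 else 2 end
           ) as rn
    from orders
)
order by order_id, rn
"""


def label_orders(rows):
    """Load the rows into an in-memory table and return the labeled rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table orders (order_id int, order_type text, priority text)")
    conn.executemany("insert into orders values (?, ?, ?)", rows)
    result = conn.execute(LABEL_SQL).fetchall()
    conn.close()
    return result


labeled = label_orders(ROWS)
for row in labeled:
    print(row)
```

Note that a VE/LOW row outranks a VC/HIGH row here, because the order type is the first sort key.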
Retrieve the total number of orders made and the number of orders for which payment has been made (status Delivered).
TABLE ORDER
------------------------------------------------------
ORDERID QUOTATIONID STATUS
----------------------------------------------------
Q1001 Q1002 Delivered
O1002 Q1006 Ordered
O1003 Q1003 Delivered
O1004 Q1006 Delivered
O1005 Q1002 Delivered
O1006 Q1008 Delivered
O1007 Q1009 Ordered
O1008 Q1013 Ordered
I am unable to get the total number of order IDs, i.e. 8, with this query:
select count(orderid) as "TOTALORDERSCOUNT",count(Status) as "PAIDORDERSCOUNT"
from orders
where status ='Delivered'
The expected output is
TOTALORDERSCOUNT PAIDORDERSCOUNT
8                5
I think you want conditional aggregation:
select count(*) as TOTALORDERSCOUNT,
sum(case when status = 'Delivered' then 1 else 0 end) as PAIDORDERSCOUNT
from orders;
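To see why the WHERE clause breaks the total, here is a small sketch using Python's sqlite3 module with the eight rows from the question (the lowercase column names are my own choice). It runs the conditional aggregation and, for comparison, the original filtered query.

```python
import sqlite3

# The eight rows from the question: (orderid, quotationid, status).
ORDERS = [
    ("Q1001", "Q1002", "Delivered"),
    ("O1002", "Q1006", "Ordered"),
    ("O1003", "Q1003", "Delivered"),
    ("O1004", "Q1006", "Delivered"),
    ("O1005", "Q1002", "Delivered"),
    ("O1006", "Q1008", "Delivered"),
    ("O1007", "Q1009", "Ordered"),
    ("O1008", "Q1013", "Ordered"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (orderid text, quotationid text, status text)")
conn.executemany("insert into orders values (?, ?, ?)", ORDERS)

# Conditional aggregation: every row is counted, only delivered rows add 1.
total, paid = conn.execute(
    """
    select count(*),
           sum(case when status = 'Delivered' then 1 else 0 end)
    from orders
    """
).fetchone()

# The original attempt: the WHERE clause removes rows before either count runs.
filtered = conn.execute(
    "select count(orderid), count(status) from orders where status = 'Delivered'"
).fetchone()
conn.close()

print(total, paid)
print(filtered)
```

The filter runs before aggregation, so both counts in the original query only ever see the delivered rows.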
Try this:
SELECT COUNT(ORDERID) TOTALORDERSCOUNT,
       SUM(CASE WHEN STATUS = 'Delivered' THEN 1 ELSE 0 END) PAIDORDERSCOUNT
FROM orders
You can also use COUNT in place of SUM, as below:
SELECT COUNT(ORDERID) TOTALORDERSCOUNT,
       COUNT(CASE WHEN STATUS = 'Delivered' THEN 1 ELSE NULL END) PAIDORDERSCOUNT
FROM orders
You could also use a cross join between the two counts:
select count(o.orderid) as TOTALORDERSCOUNT, t.PAIDORDERSCOUNT
from orders o
cross join (
    select count(status) as PAIDORDERSCOUNT
    from orders
    where status = 'Delivered'
) t
group by t.PAIDORDERSCOUNT
What I've used in the past for summarizing totals is
SELECT
count(*) 'Total Orders',
sum( iif( orders.STATUS = 'Delivered', 1, 0 ) ) 'Total Paid Orders'
FROM orders
I personally don't like using CASE WHEN if I don't have to. This may look like a little too much for a simple summation of totals, but it allows more conditions to be added quite easily and also involves less typing, at least for what I regularly use it for.
The iif() call sets up the condition on the STATUS column: if the status is 'Delivered', it yields 1 for that order; if the status is 'Ordered' or any other value, including NULL or something like 'Pending' that you might add later, it yields 0, so the count stays accurate.
Nesting this within the sum function then totals all of the 1's from the matched rows. I use this method regularly for report queries when many conditions need to be narrowed down to a summed value. It also opens up a lot of options in case you need to join tables in your FROM clause.
Also, out of personal preference and depending on which SQL environment you're using, I tend to use AS for renaming only when absolutely necessary and instead just give the column name as a single-quoted string. It does the same thing.
As stated before, this may seem like it's doing too much, but for me, good SQL allows conditions to be changed easily without having to rewrite an entire query.
EDIT: I forgot to mention that count(*) counts rows, so it only gives the number of orders if the orderid values are all unique. Generally speaking, orderid is expected to be unique in an orders table, but I wanted to add that as a side note.
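The side note about uniqueness can be checked with a short sketch. The rows below are hypothetical (order O1001 deliberately appears twice) and are run through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical rows in which order O1001 appears twice (an assumption for illustration).
ROWS = [
    ("O1001", "Ordered"),
    ("O1001", "Delivered"),
    ("O1002", "Delivered"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table orders (orderid text, status text)")
conn.executemany("insert into orders values (?, ?)", ROWS)

# count(*) counts rows; count(distinct orderid) counts distinct order IDs.
row_count, distinct_orders = conn.execute(
    "select count(*), count(distinct orderid) from orders"
).fetchone()
conn.close()

print(row_count, distinct_orders)
```

With duplicates present the two numbers diverge, so count(distinct orderid) is the safer choice when uniqueness is not guaranteed.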
SELECT COUNT(DISTINCT ORDERID) AS [TOTALORDERSCOUNT],
       COUNT(CASE WHEN STATUS = 'Delivered' THEN ORDERID ELSE NULL END) AS [PAIDORDERSCOUNT]
FROM ORDERS
TOTALORDERSCOUNT counts all distinct values in ORDERID, while the CASE expression in PAIDORDERSCOUNT returns NULL for any row that does not have the desired status, and COUNT skips NULLs.
I need to update the following query so that it only returns one child record (remittance) per parent (claim).
Table Remit_To_Activate contains exactly one date/timestamp per claim, which is what I wanted.
But when I join the full Remittance table to it, since some claims have multiple remittances with the same date/timestamps, the outermost query returns more than 1 row per claim for those claim IDs.
SELECT * FROM REMITTANCE
WHERE BILLED_AMOUNT>0 AND ACTIVE=0
AND REMITTANCE_UUID IN (
SELECT REMITTANCE_UUID FROM Claims_Group2 G2
INNER JOIN Remit_To_Activate t ON (
(t.ClaimID = G2.CLAIM_ID) AND
(t.DATE_OF_LATEST_REGULAR_REMIT = G2.CREATE_DATETIME)
)
where ACTIVE=0 and BILLED_AMOUNT>0
)
I believe the problem would be resolved if I included REMITTANCE_UUID as a column in Remit_To_Activate. That's the REAL issue. This is how I created the Remit_To_Activate table (trying to get the most recent remittance for a claim):
SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
       MAX(claim_id) AS ClaimID
INTO Latest_Remit_To_Activate
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID
ORDER BY Claim_ID
Claims_Group2 contains these fields:
REMITTANCE_UUID,
CLAIM_ID,
BILLED_AMOUNT,
CREATE_DATETIME
The 2 rows that are currently giving me the problem are both remits for the SAME CLAIM, with the SAME TIMESTAMP. I only want one of them in the Remit_To_Activate table, so only ONE remittance will be "activated" per claim.
You can change your query like this:
SELECT
    p.*, latest_remit.DATE_OF_LATEST_REMIT
FROM
    Remittance AS p
    INNER JOIN
    (SELECT MAX(create_datetime) AS DATE_OF_LATEST_REMIT,
            claim_id
     FROM Claims_Group2
     WHERE BILLED_AMOUNT > 0
     GROUP BY claim_id) AS latest_remit
    ON latest_remit.claim_id = p.claim_id;
This is intended to give you only one row per claim. It is untested, so please run it and adjust as needed.
Without more information on the structure of your database, especially the structure of Claims_Group2 and REMITTANCE and the relationship between them, it's not really possible to advise you on how to introduce a remittance UUID into Remit_To_Activate.
Since you are using SQL Server, however, it is possible to use a window function to introduce a synthetic means to choose among remittances having the same timestamp. For example, it looks like you could approach the problem something like this:
select *
from (
    select
        r.*,
        row_number() over (partition by cg2.claim_id order by cg2.create_datetime desc) as rn
    from
        remittance r
        join claims_group2 cg2
            on r.remittance_uuid = cg2.remittance_uuid
    where
        r.active = 0
        and r.billed_amount > 0
        and cg2.active = 0
        and cg2.billed_amount > 0
) t
where t.rn = 1
Note that this does not depend on your Remit_To_Activate table at all, it having been subsumed into the inline view. Note also that this will introduce one extra column (rn) into your results, though you could avoid that by enumerating the columns of table remittance in the outer select clause.
It also seems odd to be filtering on two sets of active and billed_amount columns, but that appears to follow from what you were doing in your original queries. In that vein, I urge you to check the results carefully, as lifting the filter conditions on cg2 columns up to the level of the join to remittance yields a result that may return rows that the original query did not (but never more than one per claim_id).
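As a rough illustration of the window-function approach, here is a sketch run against SQLite through Python's sqlite3 module (SQLite 3.25+ for window functions). The rows and UUID strings are invented, only a simplified claims_group2 table is modelled, and the extra remittance_uuid sort key is my own addition so that the choice between two remits with the same timestamp is repeatable.

```python
import sqlite3

# Hypothetical rows: (remittance_uuid, claim_id, billed_amount, create_datetime).
# Claim 1 has two remits with an identical timestamp, as in the question.
REMITS = [
    ("uuid-b", 1, 100, "2018-08-15 13:07:50.933"),
    ("uuid-a", 1, 100, "2018-08-15 13:07:50.933"),
    ("uuid-c", 2, 50, "2017-12-31 10:00:00.000"),
    ("uuid-d", 2, 50, "2018-01-05 09:00:00.000"),
]

# Rank remits per claim, newest first; remittance_uuid is an assumed tie-breaker.
PICK_ONE_SQL = """
select remittance_uuid, claim_id, create_datetime
from (
    select cg2.*,
           row_number() over (
               partition by claim_id
               order by create_datetime desc, remittance_uuid
           ) as rn
    from claims_group2 cg2
    where billed_amount > 0
) t
where rn = 1
order by claim_id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table claims_group2 "
    "(remittance_uuid text, claim_id int, billed_amount int, create_datetime text)"
)
conn.executemany("insert into claims_group2 values (?, ?, ?, ?)", REMITS)
latest = conn.execute(PICK_ONE_SQL).fetchall()
conn.close()

for row in latest:
    print(row)
```

Without a tie-breaker, which of the two tied rows gets row number 1 is up to the engine, so the result could differ between runs.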
A co-worker offered me this elegant demonstration of a solution. I'd never used "over" or "partition" before. Works great! Thank you John and Gaurasvsa for your input.
if OBJECT_ID('tempdb..#t') is not null
    drop table #t
select *, ROW_NUMBER() over (partition by CLAIM_ID order by CLAIM_ID) as ROW_NUM
into #t
from
(
    select '2018-08-15 13:07:50.933' as CREATE_DATE, 1 as CLAIM_ID, NEWID() as REMIT_UUID
    union select '2018-08-15 13:07:50.933', 1, NEWID()
    union select '2017-12-31 10:00:00.000', 2, NEWID()
) x
select *
from #t
order by CLAIM_ID, ROW_NUM
select CREATE_DATE, MAX(CLAIM_ID), MAX(REMIT_UUID)
from #t
where ROW_NUM = 1
group by CREATE_DATE
I have three tables:
T_ORDER_PLACEMENTS (ORDER_ID, CUSTOMER_ID, ORDER_DATE)
T_ORDER_DETAILS (ORDER_ID, STOCK_ID)
T_STOCK_DETAILS(STOCK_ID, STOCK_NAME, STOCK_PRICE)
Can someone please help me to write a query which generates the following output:
STOCK_ID  STOCK_NAME  STOCK_PRICE  ORDERED_STATUS
1         stock1      5000         ordered
2         stock2      10000        unordered
Populate the ORDERED_STATUS column with 'ordered' if the stock is ordered and 'unordered' if the stock is unordered.
SELECT
    t_stock_details.*,
    CASE WHEN order_check.stock_id IS NULL THEN 'unordered' ELSE 'ordered' END AS ordered_status
FROM
    t_stock_details
LEFT JOIN
    (
        SELECT stock_id FROM t_order_details GROUP BY stock_id
    )
    order_check
        ON order_check.stock_id = t_stock_details.stock_id
The sub-query checks to see which stock_ids have an order associated with them. It also uses GROUP BY to ensure only one row is returned per stock_id, no matter how many orders are found.
The LEFT JOIN ensures that every row in t_stock_details is returned, whether or not it is successfully joined to anything. Where there is a successful join, we know there has been an order. Each row will also only ever be joined to one row at most (thanks to the above-mentioned GROUP BY), so no duplication is caused.
An unsuccessful join will have NULL in the order_check.stock_id, so we use that to check which string to return, using a CASE statement.
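Here is a small sketch of that query running against SQLite via Python's sqlite3 module. The two stock rows mirror the expected output in the question; the order detail rows are invented, with stock 1 ordered twice to show that the GROUP BY prevents duplicates.

```python
import sqlite3

# Stock rows taken from the expected output in the question.
STOCKS = [(1, "stock1", 5000), (2, "stock2", 10000)]
# Hypothetical order details: stock 1 is ordered twice, stock 2 never.
ORDER_DETAILS = [(101, 1), (102, 1)]

STATUS_SQL = """
select s.stock_id, s.stock_name, s.stock_price,
       case when order_check.stock_id is null then 'unordered' else 'ordered' end
           as ordered_status
from t_stock_details s
left join (
    select stock_id from t_order_details group by stock_id
) order_check
    on order_check.stock_id = s.stock_id
order by s.stock_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("create table t_stock_details (stock_id int, stock_name text, stock_price int)")
conn.execute("create table t_order_details (order_id int, stock_id int)")
conn.executemany("insert into t_stock_details values (?, ?, ?)", STOCKS)
conn.executemany("insert into t_order_details values (?, ?)", ORDER_DETAILS)
stock_status = conn.execute(STATUS_SQL).fetchall()
conn.close()

for row in stock_status:
    print(row)
```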
I'm running PostgreSQL 9.4 and have the following table structure for invoicing:
id BIGINT, time UNIX_TIMESTAMP, customer TEXT, amount BIGINT, status TEXT, billing_id TEXT
I hope I can explain my challenge correctly.
An invoice record can have 3 different statuses: begin, ongoing and done.
Several invoice records can be part of the same invoice over time.
So when an invoice period begins, a record is created with status begin.
Then every 6 hours a new record is generated with status ongoing, containing the amount spent so far in amount.
When an invoice is closed, a record with status done is generated with the total amount spent in column amount. All the invoice records within the same invoice contain the same billing_id.
To calculate a customer's current spending I can run the following:
SELECT sum(amount) FROM invoice_records where id = $1 and time between '2017-06-01' and '2017-07-01' and status = 'done'
But that does not take into account an ongoing invoice which is not closed yet.
How can I also count the largest billing_id with no status done?
I hope that makes sense.
Per invoice (i.e. billing_id) you want the amount of the record with status = 'done' if such exists or of the last record with status = 'ongoing'. You can use PostgreSQL's DISTINCT ON for this (or use standard SQL's ROW_NUMBER to rank the records per invoice).
SELECT DISTINCT ON (billing_id) billing_id, amount
FROM invoice_records
WHERE status IN ('done', 'ongoing', 'begin')
ORDER BY
    billing_id,
    CASE status WHEN 'done' THEN 1 WHEN 'ongoing' THEN 2 ELSE 3 END,
    "time" DESC;
The ORDER BY clause represents the ranking.
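The ROW_NUMBER variant mentioned above might look roughly like the sketch below. It runs on SQLite through Python's sqlite3 module (SQLite has no DISTINCT ON, and needs 3.25+ for window functions). The invoice rows are invented, and time is stored as a plain integer for simplicity.

```python
import sqlite3

# Hypothetical records: (id, time, customer, amount, status, billing_id).
# Invoice b1 is closed; invoice b2 is still ongoing.
RECORDS = [
    (1, 1, "acme", 0, "begin", "b1"),
    (2, 2, "acme", 10, "ongoing", "b1"),
    (3, 3, "acme", 25, "ongoing", "b1"),
    (4, 4, "acme", 40, "done", "b1"),
    (5, 5, "acme", 0, "begin", "b2"),
    (6, 6, "acme", 7, "ongoing", "b2"),
    (7, 7, "acme", 15, "ongoing", "b2"),
]

# Per invoice, prefer the 'done' record, else the newest 'ongoing' record.
PER_INVOICE_SQL = """
select billing_id, amount
from (
    select billing_id, amount,
           row_number() over (
               partition by billing_id
               order by case status when 'done' then 1
                                    when 'ongoing' then 2
                                    else 3 end,
                        "time" desc
           ) as rn
    from invoice_records
) ranked
where rn = 1
order by billing_id
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    'create table invoice_records '
    '(id int, "time" int, customer text, amount int, status text, billing_id text)'
)
conn.executemany("insert into invoice_records values (?, ?, ?, ?, ?, ?)", RECORDS)
per_invoice = conn.execute(PER_INVOICE_SQL).fetchall()
conn.close()

# Current spending is the sum of the one chosen amount per invoice.
current_spending = sum(amount for _, amount in per_invoice)
print(per_invoice, current_spending)
```

A customer and date filter would go in the inner query, next to the FROM clause, in the same way as in the original question.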
select sum (amount), id
from (
    select distinct on (billing_id) *
    from (
        select distinct on (status, billing_id) *
        from invoice_records
        where
            id = $1
            and time between '2017-06-01' and '2017-07-01'
            and status in ('done', 'ongoing')
        order by status, billing_id desc
    ) s
    order by billing_id desc
) s
group by id
My SQL is quite rusty, so much so that I have not created a view before, and I am not entirely sure how to do what I need. Perhaps I need a stored procedure. Here is the deal.
We have a database of ticket history (purchases). We want to filter on a certain SKU, but we want all line items from each ticket that has that SKU. For instance, someone buys a shirt and a hat. I want to filter on the shirt to find everyone who wants a shirt, but display the entire ticket showing the shirt and the hat.
I thought my query would be something like this but I don't think it would work.
select
ticket_id, post_date, qty_sold, total_price, sales_total
from
ticket_history
where
sku = 'xxxx'
Union
select
sku as trans_sku, qty_sold as trans_qty_sold, desc as trans_desc, total_price as trans_total_price
from
ticket_history
where
ticket_id = <the ticket id in first query>
Perhaps a sub-select is what is needed, but I don't really understand how to do that either.
Any suggestions would be great.
I am not sure what you are trying to do here, or whether UNION is what you are looking for.
In your query the columns are different and don't match between the two queries. Anyway, you can use a common table expression so that you can reuse the subquery. This should solve your problem:
WITH FirstQuery
AS
(
select
ticket_id,
post_date,
qty_sold,
total_price,
sales_total
from ticket_history
where sku = 'xxxx'
)
SELECT *
FROM FirstQuery
UNION
SELECT
... -- You should select the same number of columns
... -- and with the same data types to match the first columns
from ticket_history
where ticket_id IN(SELECT ticket_id FROM FirstQuery);
Here FirstQuery acts like a subquery, but one you can reuse later, as we did by referencing it again in the WHERE clause.
But, again the columns you selected in the first query:
ticket_id,
post_date,
qty_sold,
total_price,
sales_total
are different than the columns you selected in the second query:
sku as trans_sku,
qty_sold as trans_qty_sold,
desc as trans_desc,
total_price as trans_total_price
These columns must match in both count and data types. Otherwise you will get an error.
Things to note about UNION:
The column count must be the same in the two queries.
The column names are taken from the first query.
When doing a UNION, the selected columns must match between the two select's. (Same number of columns, and matching data types.)
Maybe you want a self join instead?
select th1.ticket_id, th1.post_date, th1.qty_sold, th1.total_price, th1.sales_total,
th2.sku as trans_sku, th2.qty_sold as trans_qty_sold,
th2.desc as trans_desc, th2.total_price as trans_total_price
from ticket_history th1
left join ticket_history th2 on th2.ticket_id = th1.ticket_id
where th1.sku = 'xxxx'
The LEFT JOIN returns th1 rows even if there is no matching th2 row.
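If the goal is simply "every line item of every ticket that contains the SKU", a plain IN subquery may be enough. Below is a minimal sketch using Python's sqlite3 module; the line items are invented and only three of the columns from the question are modelled.

```python
import sqlite3

# Hypothetical line items: (ticket_id, sku, qty_sold).
LINE_ITEMS = [
    (1, "SHIRT", 2),
    (1, "HAT", 1),
    (2, "HAT", 3),
    (3, "SHIRT", 1),
]

# Find the tickets that contain the SKU, then return all of their line items.
FULL_TICKETS_SQL = """
select ticket_id, sku, qty_sold
from ticket_history
where ticket_id in (select ticket_id from ticket_history where sku = ?)
order by ticket_id, sku
"""


def tickets_containing(sku):
    """Return every line item of each ticket that includes the given SKU."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table ticket_history (ticket_id int, sku text, qty_sold int)")
    conn.executemany("insert into ticket_history values (?, ?, ?)", LINE_ITEMS)
    rows = conn.execute(FULL_TICKETS_SQL, (sku,)).fetchall()
    conn.close()
    return rows


shirt_tickets = tickets_containing("SHIRT")
for row in shirt_tickets:
    print(row)
```

Ticket 2 is left out because it has no shirt, while the hat on ticket 1 is kept because that ticket does.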