Removing Duplicates only when criteria's are met - sql

Last question in regards to duplication. I understand how to select duplicate records using COUNT(*) with the HAVING clause > 1, but I'm faced with a challenge of removing duplicates given when a criteria has been met.
I asked one part of this yesterday in removing duplicates when the bill amount cancels out, but now I have to include a criteria to it where when the bill amount has the same positive and negative value that cancels out, the date is the same for both as well as the code.
So for example, record 1 has a bill amount of $250 with code "JUN" and a date of 03/02/2020, record 2 has a bill amount of $250 with code "PII" and a date of 03/07/2020 and record 3 has a bill amount of -$250 with code "PII" and a date of 03/07/2020. The results I would like to see in this example is only record 1 where record 2 and 3 would be consider the duplicates given the criteria that I stated.
Table Creation:
CREATE TABLE Billing (
BillId varchar(10),
SerialNo varchar(10),
BillAmt MONEY,
Code varchar(5),
DispenseDt DATE
);
Data Entry:
INSERT INTO Billing (BillId, SerialNo, BillAmt, Code, DispenseDt)
VALUES ('BL_001','aaa-111',250,'AAP','20200503')
,('BL_002','aab-112',250,'ADD','20200309')
,('BL_003','aab-112',-250,'ADD','20200309')
,('BL_004','aba-120',700,'YED','20200503')
,('BL_005','aba-120',370,'TPP','20200822')
,('BL_006','aba-120',370,'TPP','20201003')
,('BL_007','aba-120',400,'TPP','20200822')
,('BL_008','aba-120',-370,'TPP','20200822')
,('BL_009','aba-120',-700,'YED','20200503')
,('BL_010','baa-201',1000,'TOK','20200927')
,('BL_011','baa-201',-1000,'TOK','20200927')
,('BL_012','bab-210',1000,'TOK','20200927');
Sample Data:
+----------+-----------+---------+------+------------+
| BillId | SerialNo | BillAmt | Code | DispenseDt |
+----------+-----------+---------+------+------------+
| BL_001 | aaa-111 | $250 | AAP | 20200503 |
| BL_002 | aab-112 | $250 | ADD | 20200309 |
| BL_003 | aab-112 |-$250 | ADD | 20200309 |
| BL_004 | aba-120 | $700 | YED | 20200503 |
| BL_005 | aba-120 | $370 | TPP | 20200822 |
| BL_006 | aba-120 | $370 | TPP | 20201003 |
| BL_007 | aba-120 | $400 | TPP | 20200822 |
| BL_008 | aba-120 |-$370 | TPP | 20200822 |
| BL_009 | aba-120 |-$700 | YED | 20200503 |
| BL_010 | baa-201 | $1000 | TOK | 20200927 |
| BL_011 | baa-201 |-$1000 | TOK | 20200927 |
| BL_012 | bab-210 | $1000 | TOK | 20200927 |
+----------+-----------+---------+------+------------+
Desire Results:
+----------+-----------+---------+------+------------+
| BillId | SerialNo | BillAmt | Code | DispenseDt |
+----------+-----------+---------+------+------------+
| BL_001 | aaa-111 | $250 | AAP | 20200503 |
| BL_006 | aba-120 | $370 | TPP | 20201003 |
| BL_007 | aba-120 | $400 | TPP | 20200822 |
| BL_012 | bab-210 | $1000 | TOK | 20200927 |
+----------+-----------+---------+------+------------+
My Code:
select a.SerialNo, a.BillAmt, a.Code, a.DispenseDt
from (
select *,
count(SerialNo) over(partition by SerialNo, DispenseDt) b
from Billing ) a
where b = 1
AND
InvoiceDt >= '20200601' And InvoiceDt <= '20200630'
AND
FacID IN ('IND600','IND605','IND610','IND620','IND630','IND640','IND650','IND660','IND670','IND680','IND690','IND695')
ORDER BY a.Serial;

i tried to solve it but I am kind of stuck myself. The logic here was to get ranks and then filter the rank that is same but somehow my code creates rank [created 2 of them using rank() and row_number()] that will remove some of the cases that you need as output maybe if someone else can edit this code? that would be great
fiddle link:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=e0c990d3694ad99b628b3e05a5de624f
select
Bill_ID,
Code,
DispenseDt,
new_bill_amt,
rank()
over(partition by new_bill_amt,DispenseDt, code) as rank_,
row_number()
over(partition by new_bill_amt,DispenseDt, code) as rank_2
from (
select
*,
replace(billamt,'-','') as new_bill_amt
from Billing
) as f

I think this might work.
(I used a CTE, but you could convert that into a subquery.)
WITH base_cte AS (
SELECT
B1.SerialNo
, SUM(B1.BillAmt) AS [TotAmt]
, B1.Code
, B1.DispenseDt
FROM #Billing AS B1
GROUP BY
B1.SerialNo
, B1.Code
, B1.DispenseDt
)
SELECT
B.BillId
, B.SerialNo
, B.BillAmt
, B.code
, B.DispenseDt
FROM #Billing AS B
LEFT JOIN base_cte AS X ON X.SerialNo = B.SerialNo
WHERE X.TotAmt = B.BillAmt
AND X.DispenseDt = B.DispenseDt
Output:
BillId SerialNo BillAmt code DispenseDt
BL_001 aaa-111 250.00 AAP 2020-05-03
BL_006 aba-120 370.00 TPP 2020-10-03
BL_007 aba-120 400.00 TPP 2020-08-22
BL_012 bab-210 1000.00 TOK 2020-09-27
EDIT: Here's a different method with OVER().
SELECT
Y.BillId
, Y.SerialNo
, Y.BillAmt
, Y.Code
, Y.DispenseDt
FROM (
SELECT X.*
, [Ct] = COUNT(*) OVER(PARTITION BY X.code, X.TotAmt, X.DispenseDt ORDER BY X.SerialNo, X.code, X.DispenseDt)
FROM (
SELECT
B.BillId
, B.SerialNo
, B.BillAmt
, B.code
, B.DispenseDt
, [TotAmt] = SUM(B.BillAmt) OVER(PARTITION BY B.SerialNo, B.code, B.DispenseDt ORDER BY B.SerialNo, B.code, B.DispenseDt)
FROM #Billing AS B
) AS X
) AS Y
WHERE Y.BillAmt = Y.TotAmt
ORDER BY Y.BillId

Related

Find the first order of a supplier in a day using SQL

I am trying to write a query to return supplier ID (sup_id), order date and the order ID of the first order (based on earliest time).
+--------+--------+------------+--------+-----------------+
|orderid | sup_id | items | sales | order_ts |
+--------+--------+------------+--------+-----------------+
|1111132 | 3 | 1 | 27,0 | 24/04/17 13:00 |
|1111137 | 3 | 2 | 69,0 | 02/02/17 16:30 |
|1111147 | 1 | 1 | 87,0 | 25/04/17 08:25 |
|1111153 | 1 | 3 | 82,0 | 05/11/17 10:30 |
|1111155 | 2 | 1 | 29,0 | 03/07/17 02:30 |
|1111160 | 2 | 2 | 44,0 | 30/01/17 20:45 |
|....... | ... | ... | ... | ... ... |
+--------+--------+------------+--------+-----------------+
Output I am looking for:
+--------+--------+------------+
| sup_id | date | order_id |
+--------+--------+------------+
|....... | ... | ... |
+--------+--------+------------+
I tried using a subquery in the join clause as below but didn't know how to join it without having selected order_id.
SELECT sup_id, date(order_ts), order_id
FROM sales s
JOIN
(
SELECT sup_id, date(order_ts) as date, min(time(order_date))
FROM sales
GROUP BY merchant_id, date
) m
on ...
Kindly assist.
You can use not exists:
select *
from sales
where not exists (
-- find sales for same supplier, earlier date, same day
select *
from sales as older
where older.sup_id = sales.sup_id
and older.order_ts < sales.order_ts
and older.order_ts >= cast(sales.order_ts as date)
)
The query below might not be the fastest in the world, but it should give you all information you need.
select order_id, sup_id, items, sales, order_ts
from sales s
where order_ts <= (
select min(order_ts)
from sales m
where m.sup_id = s.sup_id
)
select sup_id, min(order_ts), min(order_id) from sales
where order_ts = '2022-15-03'
group by sup_id
Assumed orderid is an identity / auto increment column

SQL Group By and Join based on a weird client table

I have 3 tables that I want to join together and group it to get client membership info. My code works for grouping the base table together but it breaks at the join part and I can't figure out why.
BASE TABLE : sales_detail
+-------+-----------+-----------+-----------------------------------------+
| order_date | transaction_id| product_cost | payment_type | country
+-------+-----------+-----------+------------------------------------------+
| 10/1 | 12345 | 20 | mastercard | usa
| 10/1 | 12345 | 50 | mastercard | usa
| 10/5 | 82456 | 50 | mastercard | usa
| 10/9 | 64789 | 30 | visa | canada
| 10/15 | 08546 | 20 | mastercard | usa
| 10/15 | 08546 | 90 | mastercard | usa
| 10/17 | 65898 | 50 | mastercard | usa
+-------+-----------+-----------+-------------------------------------+
table : client_information
+-------+-----------+-----------+-------------------+
| other_id | client_Type | item
+-------+-----------+-----------+----------+
| 112341 | new | hola |
| 112341 | old | mango |
| 145634 | old | pine |
| 879547 | old | vip |
| 745688 | new | unio |
| 745688 | old | dog |
| 147899 | new | cat |
| 124589 | new | amigo |
+-------+-----------+-----------+-----------+
table : connector
+-------+-----------+-----------+-------------------+
| transaction_ID | other_id | item
+-------+-----------+-----------+----------+
| 12345 | 112341 | hola |
| 82456 | 145634 | pine |
| 08157 | 879547 | unio |
| 08546 | 745688 | dog |
| 65898 | 147899 | cat |
| 06587 | 124589 | amigo |
+-------+-----------+-----------+-----------+
**I want the output to look something like this: **
IDEAL OUTPUT
+-------+-----------+-----------+--------------------------------+
| order_date | transaction_ID | product_cost | client_Type|
+-------+-----------+-----------+--------------------------------+
| 10/1 | 12345 | 70 | new |
| 10/5 | 82456 | 70 | old |
| 10/15 | 08546 | 110 | old |
| 10/17 | 65898 | 50 | new |
+-------+-----------+-----------+----------------------------------+
**i am trying to join my base table to the connector table by transaction ID to get other_id and items to match to client_type **
This is the code i used but it failed to compile after adding in left joins :
select t1.transaction_id, sum(t1.product_cost), t1.order_date, t3.client_type
from sales_detail t1
left join (select DISTINCT transaction_ID, other_id, fruits from connector) t2
ON t1.transaction_ID=t2.transaction_ID
left join (select DISTINCT order_id, client_type, fruits from client information) t3
ON t2.other_id=t3.other_id and t2.item=t3.item
where t1.payment_type='mastercard' and t1.order_Date between '2020-10-01' and'2020-10-31'
and country != 'canada'
GROUP BY t1.transaction_id, t1.order_date, t3.client_type;
Thanks in advance! I am a beginner so still learning the ins and outs of sql! (am using hive)
I think that's joins and aggregation. For more efficiency, you can pre-aggregate in a subquery, then join:
select sd.*, ci.client_type
from (
select order_date, transaction_id, sum(product_cost) product_cost
from sales_detail
where
payment_type = 'mastercard'
and order_date >= '2020-10-01'
and order_date < '2020-11-01'
and country <> 'canada'
group by order_date, transaction_id
) sd
inner join connector c on c.transaction_id = sd.transaction_id
inner join client_information ci on ci.other_id = c.other_id
Note that I rewrote the filter on order_date to use half-open intervals rather than between. This properly handles the case when your dates have a time portion.
From what I have understood, your code works although not as you would like using an INNER JOIN and it fails to add a LEFT JOIN. I think what happens is a failure due to the NULL elements. to add a NULL element and not get an error, you have to use some function that changes the NULL value to 0 .
One such function is the ISNULL(yourColumn, 0) function of T-SQL.The documentation.
I can see that in result table you only need clients who used mastercard, so you should use inner join there so only those client who used mastercard will be considered. While the remaining query is okay i guess, but main problem was the join on client information.
I think on the answer with GMB you also need to join on item column otherwise you will get multiple rows output.
select sd.*, ci.client_type
from (
select order_date, transaction_id, sum(product_cost) product_cost
from sales_detail
group by order_date, transaction_id
) sd
inner join connector c on c.transaction_id = sd.transaction_id
inner join client_information ci on ci.other_id = c.other_id and ci.item = c.item
Just modify with your filters and you should be sorted.

Duplicate records upon joining table

I am still very new to SQL and Tableau however I am trying to work myself towards achieving a personal project of mine.
Table A; shows a table which contains the defect quantity per product category and when it was raised
+--------+-------------+--------------+-----------------+
| Issue# | Date_Raised | Category_ID# | Defect_Quantity |
+--------+-------------+--------------+-----------------+
| PCR12 | 11-Jan-2019 | Product#1 | 14 |
| PCR13 | 12-Jan-2019 | Product#1 | 54 |
| PCR14 | 5-Feb-2019 | Product#1 | 5 |
| PCR15 | 5-Feb-2019 | Product#2 | 7 |
| PCR16 | 20-Mar-2019 | Product#1 | 76 |
| PCR17 | 22-Mar-2019 | Product#2 | 5 |
| PCR18 | 25-Mar-2019 | Product#1 | 89 |
+--------+-------------+--------------+-----------------+
Table B; shows the consumption quantity of each product by month
+-------------+--------------+-------------------+
| Date_Raised | Category_ID# | Consumed_Quantity |
+-------------+--------------+-------------------+
| 5-Jan-2019 | Product#1 | 100 |
| 17-Jan-2019 | Product#1 | 200 |
| 5-Feb-2019 | Product#1 | 100 |
| 8-Feb-2019 | Product#2 | 50 |
| 10-Mar-2019 | Product#1 | 100 |
| 12-Mar-2019 | Product#2 | 50 |
+-------------+--------------+-------------------+
END RESULT
I would like to create a table/bar chart in tableau that shows that Defect_Quantity/Consumed_Quantity per month, per Category_ID#, so something like this below;
+----------+-----------+-----------+
| Month | Product#1 | Product#2 |
+----------+-----------+-----------+
| Jan-2019 | 23% | |
| Feb-2019 | 5% | 14% |
| Mar-2019 | 89% | 10% |
+----------+-----------+-----------+
WHAT I HAVE TRIED SO FAR
Unfortunately i have not really done anything, i am struggling to understand how do i get rid of the duplicates upon joining the tables based on Category_ID#.
Appreciate all the help I can receive here.
I can think of doing left joins on both product1 and 2.
select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')
, (p2.product1 - sum(case when category_id='Product#1' then Defect_Quantity else 0 end))/p2.product1 * 100
, (p2.product2 - sum(case when category_id='Product#2' then Defect_Quantity else 0 end))/p2.product2 * 100
from tableA t1
left join
(select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy') Date_Raised
, sum(Comsumed_Quantity) as product1 tableB
where category_id = 'Product#1'
group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')) p1
on p1.Date_Raised = t1.Date_Raised
left join
(select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy') Date_Raised
, sum(Comsumed_Quantity) as product2 tableB
where category_id = 'Product#2'
group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')) p2
on p2.Date_Raised = t1.Date_Raised
group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')
By using ROW_NUMBER() OVER (PARTITION BY ORDER BY ) as RN, you can remove duplicate rows. As of your end result you should extract month from date and use pivot to achieve.
I would do this as:
select to_char(date_raised, 'YYYY-MM'),
(sum(case when product = 'Product#1' then defect_quantity end) /
sum(case when product = 'Product#1' then consumed_quantity end)
) as product1,
(sum(case when product = 'Product#2' then defect_quantity end) /
sum(case when product = 'Product#2' then consumed_quantity end)
) as product2
from ((select date_raised, product, defect_quantity, 0 as consumed_quantity
from a
) union all
(select date_raised, product, 0 as defect_quantity, consumed_quantity
from b
)
) ab
group by to_char(date_raised, 'YYYY-MM')
order by min(date_raised);
(I changed the date format because I much prefer YYYY-MM, but that is irrelevant to the logic.)
Why do I prefer this method? This will include all months where there is a row in either table. I don't have to worry that some months are inadvertently filtered out, because there are missing production or defects in one month.

eSQL multiple join but with conditions

I've 3 tables as under
MERCHANDISE
+-----------+-----------+---------------+
| MERCH_NUM | MERCH_DIV | MERCH_SUB_DIV |
+-----------+-----------+---------------+
| 1 | car | awd |
| 1 | car | awd |
| 2 | bike | 1kcc |
| 3 | cycle | hybrid |
| 3 | cycle | city |
| 4 | moped | fixie |
+-----------+-----------+---------------+
PRIORITY
+----------+-----------+---------+---------+------------+------------+---------------+
| CUST_NUM | SALES_NUM | DOC_NUM | BALANCE | PRIORITY_1 | PRIORITY_2 | PRIORITY_CODE |
+----------+-----------+---------+---------+------------+------------+---------------+
| 90 | 1000 | 10 | 23 | 1 | 6 | NO |
| 91 | 1001 | 20 | 32 | 3 | 7 | PRI |
| 92 | 1002 | 30 | 11 | 2 | 8 | LATE |
| 93 | 1003 | 40 | 22 | 5 | 9 | 1MON |
+----------+-----------+---------+---------+------------+------------+---------------+
ORDER
+----------+-----------+---------+---------+-----------+-----------+
| CUST_NUM | SALES_NUM | DOC_NUM | COUNTRY | MERCH_NUM | MERCH_DIV |
+----------+-----------+---------+---------+-----------+-----------+
| 90 | 1000 | 10 | INDIA | 1 | car |
| 91 | 1001 | 20 | CHINA | 2 | bike |
| 92 | 1002 | 30 | USA | 3 | cycle |
| 93 | 1003 | 40 | UK | 4 | moped |
+----------+-----------+---------+---------+-----------+-----------+
I want to join the left joined table from the last two tables with the first one such that the MERCH_SUB_DIV 'awd' appears only once for each unique combination of merch_num and merch_div
the code I came up with is as under, but I'm not sure how do I eliminate the duplicate row just for the awd
select
ROW#, MERCH.MERCH_NUMBER, ORDPRI.MERCH_NUMBER, ORDPRI.CUST_NUM,
BALANCE, SALES_NUM, ITEM_NUM, RANK, PRIORITY_1
from (
select
ROW_NUMBER() OVER(
PARTITION BY ORD.DOC_NUM, ORD.ITEM_NUM
ORDER BY ORD.DOC_NUM, ORD.ITEM_NUM ASC
) AS Row#,
ORD.CUST_NUM, PRI.CUST_NUM, ORD.MERCH_NUM, ORD.MERCH_DIV, PRI.BALANCE,
pri.DOC_NUM, pri.SALES_NUM, pri.PRIORITY_1, pri.PRIORITY_2
from ORDER as ORD
left join PRIORITY as PRI on ORD.DOC_NUM = PRI.DOC_NUM
and ORD.SALES_NUMBER = PRI.SALES_NUM
where country_name in ('USA', ‘INDIA’)
) as ORDPRI
left join MERCHANDISE as MERCH on ORDPRI.DIV = MERCH.DIV
and ORDPRI.MERCH_NUM = MERCH.MERCH_NUM
You have to use 'DISTINCT' keyword to get unique values, but if your 'Priority table' & 'Order table' contains different values for Same MERCH_NUM then the final result contains the repetation of the 'MERCH_NUM'.
SELECT DISTINCT M.MERCH_NUMBER, O.MERCH_NUMBER, O.CUST_NUM, BALANCE, SALES_NUM,ITEM_NUM,RANK,PRIORITY_1
FROM priority_table P
LEFT JOIN order_table O ON P.CUST_NUM = O.CUST_NUM AND P.SALES_NUM=O.SALES_NUM AND P.DOC_NUM = O.DOC_NUM
LEFT JOIN merchandise_table M ON M.MERCH_NUM = O.MERCH_NUM
A way around can be to add one new Row_Number() in the outermost query having Partition by MERCH_SUB_DIV + all the columns in the final list and then filter final results based on the New Row_Number() . Follows a pseudo code that might help:
select
-- All expected columns in final result except the newRow#
ROW#, MERCH_NUM, CUST_NUM,
BALANCE, SALES_NUM, PRIORITY_1
from (
select
ROW#,
-- the new row number includes all column you want to show in final result
row_number() over ( PARTITION BY MERCH.MERCH_SUB_DIV ,
MERCH.MERCH_NUM, ORDPRI.MERCH_NUM, ORDPRI.CUST_NUM,
BALANCE, SALES_NUM, PRIORITY_1
order by (select 1 )) as newRow# ,
MERCH.MERCH_NUM, ORDPRI.CUST_NUM,
BALANCE, SALES_NUM, PRIORITY_1
from (
-- main query goes here
select
ROW_NUMBER() OVER(
PARTITION BY ORD.DOC_NUM --, ORD.ITEM_NUM
ORDER BY ORD.DOC_NUM ASC --, ORD.ITEM_NUM
) AS Row#,
ORD.CUST_NUM, ORD.MERCH_NUM, ORD.MERCH_DIV as DIV, PRI.BALANCE,
pri.DOC_NUM, pri.SALES_NUM, pri.PRIORITY_1, pri.PRIORITY_2
from #ORDER as ORD
left join #PRIORITY as PRI on ORD.DOC_NUM = PRI.DOC_NUM
and ORD.SALES_NUMBER = PRI.SALES_NUM
where country_name in ('USA', 'INDIA')
) as ORDPRI
left join #MERCHANDISE as MERCH on ORDPRI.DIV = MERCH.DIV
and ORDPRI.MERCH_NUM = MERCH.MERCH_NUM
) as T
-- final filter to get distinct values
where newRow# = 1
Sample code here .. Hope this helps!!

sql compact 3.5, select top n rows from each group

I am writing a query to select the top 1 record from each group. Keep in mind that I working on sql compact 3.5 and thus can not use the rank function. I'm pretty sure my query is incorrect but I'm not sure how to select top from each group. Any one got any ideas?
Here is the query I was trying to get working
/*
* added fH.InvoiceNumber to my query to get result further below.
/
select tH., t.CustomerNumber, c.CustomerName, fH.Status, fH.InvoiceNumber
from tenderHeader tH
join task t ON tH.TaskActivityID = t.ActivityID
join finalizeTicketHeader fH ON tH.FinalizeTicketTaskActivityID = fH.TaskActivityID
join customer c ON t.CustomerNumber = c.CustomerNumber
where fH.Status <> '3' AND t.TripID = '08ea6982-6efd-46fa-9753-0fd8b076f24c';
Here is what my tables look like:
customer table:
|------------------------------------------------|
| CustomerNumber | CustomerName | Address1 | ... |
|------------------------------------------------|
| 0012084737 | Customer A | 150 Rd A | ... |
|------------------------------------------------|
| 0012301891 | Customer B | 152 Rd A | ... |
|------------------------------------------------|
task table
|-----------------------------------------------------------------|
| ActivityID | TripID | TaskTypeName | Status | CustomerNumber |
|-----------------------------------------------------------------|
| 4967f6cc | 08ea6982 | Payment | 2 | 0012084737 |
|-----------------------------------------------------------------|
| e96469a1 | 08ea6982 | Payment | 2 | 0012301891 |
|-----------------------------------------------------------------|
finalizeTicketHeader table
|---------------------------------------------------|
| TaskActivityID | InvoiceNumber | Amount | Status |
|---------------------------------------------------|
| 916082c8 | 1000 | 563.32 | 3 |
|---------------------------------------------------|
| 916082c8 | 1001 | -343.68 | 0 |
|---------------------------------------------------|
| 4b38bf60 | 1002 | 152.29 | 0 |
|---------------------------------------------------|
| 4b38bf60 | 1003 | -35.80 | 0 |
|---------------------------------------------------|
tenderHeader table
|-------------------------------------------------------------------------------------|
| TaskActivityID | InvoiceNumber | PastDue | TodaysDue | FinalizeTicketTaskActivityID |
|-------------------------------------------------------------------------------------|
| 4967f6cc | 1234567891 | 23.55 | 219.64 | 916082c8 |
|-------------------------------------------------------------------------------------|
| e96469a1 | 1234567893 | 0.00 | 116.49 | 4b38bf60 |
|-------------------------------------------------------------------------------------|
the problem I was having was getting duplicates.
like so:
|------------------------------------------------------------------------------------------------------------------------------------|
| TaskActivityID | InvoiceNumber | PastDue | TodaysDue | FinalizeTicketTaskActivityID | CustomerNumber | CustomerName | InvoiceNumber |
|------------------------------------------------------------------------------------------------------------------------------------|
| 4967f6cc | 1234567891 | 23.55 | 219.64 | 916082c8 | 0012084737 | Customer A | 1001 |
|------------------------------------------------------------------------------------------------------------------------------------|
| e96469a1 | 1234567893 | 0.00 | 116.49 | 4b38bf60 | 0012301891 | Customer B | 1002 |
|------------------------------------------------------------------------------------------------------------------------------------|
| e96469a1 | 1234567893 | 0.00 | 116.49 | 4b38bf60 | 0012301891 | Customer B | 1003 |
|------------------------------------------------------------------------------------------------------------------------------------|
I've rewritten the query like so, but I need to get specific columns from the sub query.
select tH.* from tenderHeader th
inner join task t on tH.TaskActivityID = t.ActivityID
inner join (
select k.TaskActivityID from finalizeTicketHeader k group by k.TaskActivityID
) as fH on tH.FinalizeTicketTaskActivityID = fH.TaskActivityID
inner join customer c on t.CustomerNumber = c.CustomerNumber
I need to get the status from fH. Any ideas of how to do that?
select tH.*, fH.Status from tenderHeader th
inner join task t on tH.TaskActivityID = t.ActivityID
inner join finalizeTicketHeader fH on tH.FinalizeTicketTaskActivityID = tH.TaskActivityID
inner join customer c on t.CustomerNumber = c.CustomerNumber
where tH.FinalizeTicketTaskActivityID = (
select top (1) k.TaskActivityID from finalizeTicketHeader k
);
but it seems that sql compact 3.5 does not support scalar values with subquery in where cause.
Here is an example that demonstrat a way of selecting the top 1 from each group
id|time
--------
2 | 1:10
2 | 0:45
2 | 1:45
2 | 1:30
1 | 1:00
1 | 1:10
the table is called table_1; we group by id and assume that time should be desc ordered
select table_1.* from table_1
inner join (
select id, max(time) as max_time from table_1
group by id
) as t
on t.max_time = table_1.time and table_1.id = t.id
order by table_1.id
the result we get is
id|time
--------
1 | 1:10
2 | 1:45