How to fix a correlated subquery that selects the wrong data? - sql

I want to select the batchid values from the table below for which every record has subId = '23' and substatus = 'READY'. If any record in a batch has a different subId or substatus, that batch must be excluded.
Table:
+---------+----------+--------+-------+-----------+
| batchid | dcn | dcnseq | subId | substatus |
+---------+----------+--------+-------+-----------+
| 10001 | 10001001 | 1 | 23 | READY |
| 10001 | 10001001 | 2 | 23 | READY |
| 10001 | 10001002 | 1 | 23 | READY |
| 10001 | 10001003 | 1 | 23 | READY |
| 10001 | 10001004 | 1 | 23 | READY |
| 10001 | 10001004 | 2 | 23 | READY |
| 10001 | 10001004 | 3 | 23 | READY |
| 10002 | 10001005 | 1 | 23 | READY |
| 10002 | 10001005 | 2 | 23 | READY |
| 10002 | 10001006 | 1 | 23 | READY |
| 10002 | 10001007 | 1 | 23 | READY |
| 10002 | 10001008 | 1 | 23 | READY |
| 10002 | 10001008 | 2 | 23 | READY |
| 10002 | 10001009 | 1 | 23 | READY |
+---------+----------+--------+-------+-----------+
I am using the query below to achieve this.
select distinct batchid from fm o
where o.subId='23' and o.substatus='READY'
and o.dcnseq='1' and o.batchid in
(
select a.batchid from
(
select i.batchid, SUM(case when i.subId='23' and i.substatus='READY' then 0 else 1 end)match from fm i
where i.batchid=o.batchid
group by i.batchid
having SUM(case when i.subId='23' and i.substatus='READY' then 0 else 1 end)=0
)a
)
Result:
+---------+
| batchid |
+---------+
| 10001 |
| 10002 |
+---------+
It works perfectly. Now I changed the substatus value of one record to 'HOLD':
+---------+----------+--------+-------+-----------+
| batchid | dcn | dcnseq | subId | substatus |
+---------+----------+--------+-------+-----------+
| 10001 | 10001001 | 1 | 23 | HOLD |
| 10001 | 10001001 | 2 | 23 | READY |
| 10001 | 10001002 | 1 | 23 | READY |
| 10001 | 10001003 | 1 | 23 | READY |
| 10001 | 10001004 | 1 | 23 | READY |
| 10001 | 10001004 | 2 | 23 | READY |
| 10001 | 10001004 | 3 | 23 | READY |
| 10002 | 10001005 | 1 | 23 | READY |
| 10002 | 10001005 | 2 | 23 | READY |
| 10002 | 10001006 | 1 | 23 | READY |
| 10002 | 10001007 | 1 | 23 | READY |
| 10002 | 10001008 | 1 | 23 | READY |
| 10002 | 10001008 | 2 | 23 | READY |
| 10002 | 10001009 | 1 | 23 | READY |
+---------+----------+--------+-------+-----------+
Now the result is:
+---------+
| batchid |
+---------+
| 10002 |
+---------+
This is also correct. But sometimes '10001' is still picked in the same situation; it happens when the table contains a lot of batchids. I have tried to find the mistake, but I can't.

I think your query is too complicated. Just use aggregation and having:
select batchid
from fm
group by batchid
having min(subid) = max(subid) and max(subid) = 23 and
min(substatus) = max(substatus) and max(substatus)= 'READY';
I don't know if your other conditions are important. They are in your query but not mentioned in the question.
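As a minimal sketch of the aggregation approach above, the following runs the same GROUP BY / HAVING query against a trimmed copy of the question's data. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for the asker's database, and the row subset is chosen only for illustration.

```python
import sqlite3

# In-memory SQLite database; a stand-in for the asker's real RDBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fm (batchid INT, dcn INT, dcnseq INT, subId INT, substatus TEXT)"
)

# Trimmed sample from the question: batch 10001 has one HOLD record.
rows = [
    (10001, 10001001, 1, 23, "HOLD"),
    (10001, 10001001, 2, 23, "READY"),
    (10001, 10001002, 1, 23, "READY"),
    (10002, 10001005, 1, 23, "READY"),
    (10002, 10001005, 2, 23, "READY"),
]
conn.executemany("INSERT INTO fm VALUES (?, ?, ?, ?, ?)", rows)

# A batch qualifies only if every row shares one subId (23) and one substatus (READY).
query = """
SELECT batchid
FROM fm
GROUP BY batchid
HAVING MIN(subId) = MAX(subId) AND MAX(subId) = 23
   AND MIN(substatus) = MAX(substatus) AND MAX(substatus) = 'READY'
ORDER BY batchid
"""
qualifying = [r[0] for r in conn.execute(query)]
print(qualifying)  # [10002]
```

Because the check is done once per group, there is no correlated subquery left to go wrong on larger tables.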

This selects the batchids that have (subid = 23 and substatus = 'READY') and no other values of subid or substatus:
select batchid
from fm
where subId=23 and substatus='READY'
group by batchid
except
select batchid
from fm
where not(subId=23 and substatus='READY' )
group by batchid

The simplest solution is with NOT EXISTS:
select distinct f.batchid
from fm f
where not exists (
select 1 from fm
where batchid = f.batchid and (coalesce(subid, 0) <> 23 or coalesce(substatus, '') <> 'READY')
)
coalesce() is needed only if the columns subId and substatus may contain nulls.
If there are no nulls, the where clause can be simplified to:
where batchid = f.batchid and (subid <> 23 or substatus <> 'READY')
See the demo.
Results:
> | batchid |
> | ------: |
> | 10001 |
> | 10002 |
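To illustrate why the coalesce() calls matter, here is a small sketch that compares the plain and the null-safe NOT EXISTS predicates. It assumes SQLite via Python's sqlite3 module, and the row with a NULL substatus is hypothetical; it does not appear in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fm (batchid INT, subId INT, substatus TEXT)")
conn.executemany(
    "INSERT INTO fm VALUES (?, ?, ?)",
    [
        (10001, 23, "READY"),
        (10001, 23, None),  # hypothetical record with a missing status
        (10002, 23, "READY"),
    ],
)

# Without coalesce: NULL <> 'READY' evaluates to NULL, so the bad row is never found.
plain = """
SELECT DISTINCT f.batchid FROM fm f
WHERE NOT EXISTS (
    SELECT 1 FROM fm
    WHERE batchid = f.batchid AND (subId <> 23 OR substatus <> 'READY')
)
ORDER BY f.batchid
"""

# With coalesce: NULL is mapped to a value that fails the comparison.
null_safe = """
SELECT DISTINCT f.batchid FROM fm f
WHERE NOT EXISTS (
    SELECT 1 FROM fm
    WHERE batchid = f.batchid
      AND (coalesce(subId, 0) <> 23 OR coalesce(substatus, '') <> 'READY')
)
ORDER BY f.batchid
"""

plain_result = [r[0] for r in conn.execute(plain)]
null_safe_result = [r[0] for r in conn.execute(null_safe)]
print(plain_result)      # [10001, 10002]
print(null_safe_result)  # [10002]
```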

Related

SQL count duplicates in another column based on one field per row

I am building out a customer retention report. We identify customers by their email. Here is some sample data from our table:
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
| Email | BrandNewCustomer | RecurringCustomer | ReactivatedCustomer | OrderCount | TotalOrders | Date_Created | Customer_Name | Customer_Address | Customer_City | Customer_State | Customer_Zip | Customer_Country |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
| zyw@marketplace.amazon.com | 1 | 0 | 0 | 1 | 1 | 41:50.0 | Sha | 990 | BRO | NY | 112 | US |
| zyu@gmail.com | 1 | 0 | 0 | 1 | 1 | 57:25.0 | Zyu | 181 | Mia | FL | 330 | US |
| ZyR@aol.com | 1 | 0 | 0 | 1 | 1 | 10:19.0 | Day | 581 | Myr | SC | 295 | US |
| zyr@gmail.com | 1 | 0 | 0 | 1 | 1 | 25:19.0 | Nic | 173 | Was | DC | 200 | US |
| zy@gmail.com | 1 | 0 | 0 | 1 | 1 | 19:18.0 | Kim | 675 | MIA | FL | 331 | US |
| zyou@gmail.com | 1 | 0 | 0 | 1 | 1 | 40:29.0 | zoe | 160 | Mob | AL | 366 | US |
| zyon@yahoo.com | 1 | 0 | 0 | 1 | 1 | 17:21.0 | Zyo | 84 | Sta | CT | 690 | US |
| zyo@gmail.com | 1 | 0 | 0 | 2 | 2 | 02:03.0 | Zyo | 432 | Ell | GA | 302 | US |
| zyo@gmail.com | 1 | 0 | 0 | 1 | 2 | 12:54.0 | Zyo | 432 | Ell | GA | 302 | US |
| zyn@icloud.com | 1 | 0 | 0 | 1 | 1 | 54:56.0 | Zyn | 916 | Nor | CA | 913 | US |
| zyl@gmail.com | 0 | 1 | 0 | 3 | 3 | 31:27.0 | Ser | 123 | Mia | FL | 331 | US |
| zyk@marketplace.amazon.com | 1 | 0 | 0 | 1 | 1 | 44:00.0 | Myr | 101 | MIA | FL | 331 | US |
+----------------------------+------------------+-------------------+---------------------+------------+-------------+--------------+---------------+------------------+---------------+----------------+--------------+------------------+
We define our customer by email. So all orders with the same email are marked to be under one customer and then we do calculations on top of that.
Now I am trying to find customers whose emails have changed. To do this, we will try to match customers by their address.
For each row (that is, per email), I want another column called something like Orders_With_Same_Address_Different_Email. How would I do that?
I have tried something with DENSE_RANK, but it doesn't seem to work:
SELECT DISTINCT
Email
,BrandNewCustomer
,RecurringCustomer
,ReactivatedCustomer
,OrderCount
,TotalOrders
,Date_Created
,Customer_Name
,Customer_Address
,Customer_City
,Customer_State
,Customer_Zip
,Customer_Country
,(DENSE_RANK() over (partition by Email order by (case when email <> email then Customer_Address end) asc)
+DENSE_RANK() over ( partition by Email order by (case when email <> email then Customer_Address end) desc)
- 1) as Orders_With_Same_Name_Different_Email
--*
FROM Customers
Try counting the email partitioned by address, not by email:
select Email,
-- ...
Orders_With_Same_Name_Different_Email = iif(
count(email) over (partition by Customer_Address) > 1,
1, 0)
from Customers;
But this is a lesson in why you wouldn't use an email as an identifier for a client. Address is a bad idea as well. Use something that won't change. That usually means making an internal identifier, such as something that auto-increments:
alter table #customers
add customerId int identity(1,1) primary key not null
Now customerId = 1 will always refer to that particular customer.
You can group by Customer_Address and check the count. This assumes that each customer has one address.
Select * from Customers where
Customer_Address IN (
Select Customer_Address
From Customers group by Customer_Address
having count(distinct Email)
>1)
If I understand what you want to do, this is how I would solve it:
Note, you don't need the having clause in the CTE but depending on your data it could make it faster. (That is, if you have a large dataset.)
WITH email2addr AS
(
select email, count(distinct customer_address) as addr_cnt
from customers
group by email
having count(distinct customer_address) > 1
)
SELECT
Email
,BrandNewCustomer
,RecurringCustomer
,ReactivatedCustomer
,OrderCount
,TotalOrders
,Date_Created
,Customer_Name
,Customer_Address
,Customer_City
,Customer_State
,Customer_Zip
,Customer_Country
,CASE when coalesce(email2addr.addr_cnt,1) > 1 then 'Y' ELSE 'N' END as has_more_than_1_email
from customers
left join email2addr on customers.email = email2addr.email
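One way to get the per-row count the question asks for is a correlated subquery that counts rows sharing the address but carrying a different email. The sketch below assumes SQLite via Python's sqlite3 module, a Customers table reduced to two columns, and made-up emails and addresses; none of these values come from the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Email TEXT, Customer_Address TEXT)")

# Hypothetical orders: one address is used by two different emails.
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("old@example.com", "432 Elm St"),
        ("old@example.com", "432 Elm St"),
        ("new@example.com", "432 Elm St"),
        ("solo@example.com", "990 Oak Ave"),
    ],
)

# For each row, count the other rows at the same address with a different email.
query = """
SELECT c.Email,
       c.Customer_Address,
       (SELECT COUNT(*)
        FROM Customers o
        WHERE o.Customer_Address = c.Customer_Address
          AND o.Email <> c.Email) AS Orders_With_Same_Address_Different_Email
FROM Customers c
ORDER BY c.Email, c.Customer_Address
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Unlike a plain row count per address, this does not flag an address just because the same email ordered twice.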

Moving data to correct record

I have a table where the data needs to be corrected. Below is an example of one record. Basically, the close_units value on the Selling row needs to move to the Agent to Agent Ref row. I have tried everything I can think of, but I can't figure it out. I am sure it is fairly simple; I think I am just looking at it the wrong way. Any help is greatly appreciated!
Current (bad) data:
+---------+---------+--------------------+-------------+-----------------+----------------+-------------------+----------+
| sale_no | payeeID | ComType | close_units | record_type | ref_agent_type | referring_agentID | ref_side |
+---------+---------+--------------------+-------------+-----------------+----------------+-------------------+----------+
| 7586 | 1001 | Listing | 1 | Listing | NULL | 0 | |
| 7586 | 2001 | Selling | 1 | Selling | NULL | 0 | |
| 7586 | 3254 | NULL | 0 | Off The Top Ref | NULL | 0 | L |
| 7586 | 4684 | Agent to Agent Ref | 0 | Agent Paid Ref | Selling | 2001 | |
+---------+---------+--------------------+-------------+-----------------+----------------+-------------------+----------+
Expected result:
+---------+---------+--------------------+-------------+-----------------+----------------+-------------------+----------+
| sale_no | payeeID | ComType | close_units | record_type | ref_agent_type | referring_agentID | ref_side |
+---------+---------+--------------------+-------------+-----------------+----------------+-------------------+----------+
| 7586 | 1001 | Listing | 1 | Listing | NULL | 0 | |
| 7586 | 2001 | Selling | 0 | Selling | NULL | 0 | |
| 7586 | 3254 | NULL | 0 | Off The Top Ref | NULL | 0 | L |
| 7586 | 4684 | Agent to Agent Ref | 1 | Agent Paid Ref | Selling | 2001 | |
+---------+---------+--------------------+-------------+-----------------+----------------+-------------------+----------+
The following query will copy the value to the "Agent to Agent Ref" row:
update my_table t1 set close_units = (
select close_units from my_table t2
where t2.sale_no = t1.sale_no and t2.ComType = 'Selling'
)
where ComType = 'Agent to Agent Ref';
And this one will reset the "Selling" value to zero:
update my_table t1
set close_units = 0
where ComType = 'Selling'
and exists (
select close_units from my_table t2
where t2.sale_no = t1.sale_no and t2.ComType = 'Agent to Agent Ref'
)
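The two updates above depend on their order: the copy has to run before the reset. A minimal sketch of running them as one transaction follows, assuming SQLite via Python's sqlite3 module and a table cut down to the columns involved. The extra EXISTS guard on the first update is an addition of this sketch, so that a sale without a Selling row is not overwritten with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (sale_no INT, payeeID INT, ComType TEXT, close_units INT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (7586, 1001, "Listing", 1),
        (7586, 2001, "Selling", 1),
        (7586, 3254, None, 0),
        (7586, 4684, "Agent to Agent Ref", 0),
    ],
)

with conn:  # commits both statements together, or rolls both back on error
    # Step 1: copy the Selling value onto the Agent to Agent Ref row.
    conn.execute("""
        UPDATE my_table
        SET close_units = (SELECT s.close_units FROM my_table s
                           WHERE s.sale_no = my_table.sale_no
                             AND s.ComType = 'Selling')
        WHERE ComType = 'Agent to Agent Ref'
          AND EXISTS (SELECT 1 FROM my_table s
                      WHERE s.sale_no = my_table.sale_no
                        AND s.ComType = 'Selling')
    """)
    # Step 2: reset the Selling row, only where a referral row exists.
    conn.execute("""
        UPDATE my_table
        SET close_units = 0
        WHERE ComType = 'Selling'
          AND EXISTS (SELECT 1 FROM my_table r
                      WHERE r.sale_no = my_table.sale_no
                        AND r.ComType = 'Agent to Agent Ref')
    """)

result = conn.execute(
    "SELECT payeeID, close_units FROM my_table ORDER BY payeeID"
).fetchall()
print(result)  # [(1001, 1), (2001, 0), (3254, 0), (4684, 1)]
```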

Considering values from one table as column header in another

I have a base table where I need to calculate the difference between two dates based on the type of the entry.
tblA
+----------+------------+---------------+--------------+
| TypeCode | Log_Date | Complete_Date | Pending_Date |
+----------+------------+---------------+--------------+
| 1 | 18/04/2016 | 19/04/2016 | |
| 2 | 10/04/2016 | 18/04/2016 | 15/04/2016 |
| 3 | 12/04/2016 | 19/04/2016 | |
| 4 | 15/04/2016 | 17/04/2016 | 16/04/2016 |
| 5 | 16/04/2016 | 21/04/2016 | |
| 1 | 19/04/2016 | 20/04/2016 | |
| 2 | 20/03/2016 | 31/03/2015 | |
| 3 | 25/03/2016 | 28/03/2016 | |
| 4 | 26/03/2016 | 27/03/2016 | |
| 5 | 27/03/2016 | 30/03/2016 | |
+----------+------------+---------------+--------------+
I have another look up table which has the column names to be considered based on the TypeCode.
tblB
+----------+----------+---------------+
| TypeCode | DateCol1 | DateCol2 |
+----------+----------+---------------+
| 1 | Log_Date | Complete_Date |
| 2 | Log_Date | Pending_Date |
| 3 | Log_Date | Complete_Date |
| 4 | Log_Date | Pending_Date |
| 5 | Log_Date | Complete_Date |
+----------+----------+---------------+
I am doing a simple DATEDIFF between two dates for my calculation. However I want to lookup which columns to consider for this calculation from tblB and apply it on tblA based on the TypeCode.
Resulting table:
For example: When the TypeCode is 2 or 4 then the calculation should be DATEDIFF(d, Log_Date, Pending_Date), otherwise DATEDIFF(d, Log_Date, Complete_Date)
+----------+------------+---------------+--------------+----------+
| TypeCode | Log_Date | Complete_Date | Pending_Date | Cal_Days |
+----------+------------+---------------+--------------+----------+
| 1 | 18/04/2016 | 19/04/2016 | | 1 |
| 2 | 10/04/2016 | 18/04/2016 | 15/04/2016 | 5 |
| 3 | 12/04/2016 | 19/04/2016 | | 7 |
| 4 | 15/04/2016 | 17/04/2016 | 16/04/2016 | 1 |
| 5 | 16/04/2016 | 21/04/2016 | | 5 |
| 1 | 19/04/2016 | 20/04/2016 | | 1 |
| 2 | 20/03/2016 | 31/03/2015 | | |
| 3 | 25/03/2016 | 28/03/2016 | | 3 |
| 4 | 26/03/2016 | 27/03/2016 | | |
| 5 | 27/03/2016 | 30/03/2016 | | 3 |
+----------+------------+---------------+--------------+----------+
Any help would be appreciated. Thanks.
Use JOIN with CASE expression:
SELECT
a.*,
Cal_Days =
DATEDIFF(
DAY,
CASE
WHEN b.DateCol1 = 'Log_Date' THEN a.Log_Date
WHEN b.DateCol1 = 'Complete_Date' THEN a.Complete_Date
ELSE a.Pending_Date
END,
CASE
WHEN b.DateCol2 = 'Log_Date' THEN a.Log_Date
WHEN b.DateCol2 = 'Complete_Date' THEN a.Complete_Date
ELSE a.Pending_Date
END
)
FROM TblA a
INNER JOIN TblB b
ON b.TypeCode = a.TypeCode
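The same JOIN plus CASE lookup can be tried out on a few of the sample rows. This sketch assumes SQLite via Python's sqlite3 module; SQLite has no DATEDIFF, so a julianday() difference stands in for it, and the dates are rewritten in ISO format so SQLite can parse them. Empty Pending_Date cells are stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblA (TypeCode INT, Log_Date TEXT, Complete_Date TEXT, Pending_Date TEXT)"
)
conn.execute("CREATE TABLE tblB (TypeCode INT, DateCol1 TEXT, DateCol2 TEXT)")

# A subset of the question's rows, dates converted to YYYY-MM-DD.
conn.executemany(
    "INSERT INTO tblA VALUES (?, ?, ?, ?)",
    [
        (1, "2016-04-18", "2016-04-19", None),
        (2, "2016-04-10", "2016-04-18", "2016-04-15"),
        (3, "2016-04-12", "2016-04-19", None),
        (4, "2016-04-15", "2016-04-17", "2016-04-16"),
        (4, "2016-03-26", "2016-03-27", None),
    ],
)
conn.executemany(
    "INSERT INTO tblB VALUES (?, ?, ?)",
    [
        (1, "Log_Date", "Complete_Date"),
        (2, "Log_Date", "Pending_Date"),
        (3, "Log_Date", "Complete_Date"),
        (4, "Log_Date", "Pending_Date"),
    ],
)

# The CASE expressions turn the column name stored in tblB into the actual column.
query = """
SELECT a.TypeCode,
       a.Log_Date,
       CAST(
           julianday(CASE b.DateCol2
                         WHEN 'Log_Date' THEN a.Log_Date
                         WHEN 'Complete_Date' THEN a.Complete_Date
                         ELSE a.Pending_Date
                     END)
         - julianday(CASE b.DateCol1
                         WHEN 'Log_Date' THEN a.Log_Date
                         WHEN 'Complete_Date' THEN a.Complete_Date
                         ELSE a.Pending_Date
                     END)
       AS INTEGER) AS Cal_Days
FROM tblA a
INNER JOIN tblB b ON b.TypeCode = a.TypeCode
ORDER BY a.Log_Date
"""
cal_days = [row[2] for row in conn.execute(query)]
print(cal_days)  # [None, 5, 7, 1, 1]
```

The NULL in the first position is the TypeCode 4 row with no Pending_Date, matching the blank Cal_Days cells in the expected table.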

Sum data from two tables with different number of rows

There are 3 Tables (SorMaster, SorDetail, and InvWarehouse):
SorMaster:
+------------+
| SalesOrder |
+------------+
| 100 |
| 101 |
| 102 |
+------------+
SorDetail:
+------------+------------+---------------+
| SalesOrder | MStockCode | MBackOrderQty |
+------------+------------+---------------+
| 100 | PN-1 | 4 |
| 100 | PN-2 | 9 |
| 100 | PN-3 | 1 |
| 100 | PN-4 | 6 |
| 101 | PN-1 | 6 |
| 101 | PN-3 | 2 |
| 102 | PN-2 | 19 |
| 102 | PN-3 | 14 |
| 102 | PN-4 | 6 |
| 102 | PN-5 | 4 |
+------------+------------+---------------+
InvWarehouse:
+------------+-----------+-----------+
| MStockCode | Warehouse | QtyOnHand |
+------------+-----------+-----------+
| PN-1 | A | 1 |
| PN-2 | B | 9 |
| PN-3 | A | 0 |
| PN-4 | B | 1 |
| PN-1 | A | 0 |
| PN-3 | B | 5 |
| PN-2 | A | 9 |
| PN-3 | B | 4 |
| PN-4 | A | 6 |
| PN-5 | B | 0 |
+------------+-----------+-----------+
Desired Results:
+------------+-----------------+--------------+
| MStockCode | SumBackOrderQty | SumQtyOnHand |
+------------+-----------------+--------------+
| PN-1 | 10 | 10 |
| PN-2 | 28 | 1 |
| PN-3 | 17 | 5 |
| PN-4 | 12 | 13 |
| PN-5 | 11 | 6 |
+------------+-----------------+--------------+
I have been going around in circles with no end in sight. It seems like it should be simple, but I just can't wrap my head around it. The SumBackOrderQty is obviously getting counted twice as the SumQtyOnHand is evaluated. Up to this point I have been doing the calculations in PHP instead of in the select statement, but I would like to clean things up a bit where possible.
Current query statement is:
SELECT SorDetail.MStockCode,
SUM(SorDetail.MBackOrderQty) AS 'SumMBackOrderQty',
SUM(InvWarehouse.QtyOnHand) AS 'SumQtyOnHand'
FROM SysproCompanyJ.dbo.SorMaster SorMaster,
SysproCompanyJ.dbo.SorDetail SorDetail LEFT OUTER JOIN SysproCompanyJ.dbo.InvWarehouse InvWarehouse
ON SorDetail.MStockCode = InvWarehouse.StockCode
WHERE SorMaster.SalesOrder = SorDetail.SalesOrder
AND SorMaster.ActiveFlag != 'N'
AND SorDetail.MBackOrderQty > '0'
AND SorDetail.MPrice > '0'
GROUP BY SorDetail.MStockCode
ORDER BY SorDetail.MStockCode ASC
Without providing the complete picture, in terms of your RDBMS, database schema, a description of the problem you're trying to solve and sample data that matches the aforementioned, the following is just an illustration of what a solution based on Barmar's comment could look like:
SELECT SD.MStockCode,
SD.SumBackOrderQty,
IW.SumQtyOnHand
FROM (SELECT MStockCode,
SUM(MBackOrderQty) AS `SumBackOrderQty`
FROM SorDetail
JOIN SorMaster ON SorDetail.SalesOrder=SorMaster.SalesOrder
WHERE SorMaster.ActiveFlag != 'N'
AND SorDetail.MBackOrderQty > 0
AND SorDetail.MPrice > 0
GROUP BY MStockCode) AS SD
LEFT JOIN (SELECT MStockCode,
SUM(QtyOnHand) AS `SumQtyOnHand`
FROM InvWarehouse
GROUP BY MStockCode) AS IW ON SD.MStockCode=IW.MStockCode
ORDER BY SD.MStockCode;
Here's one approach:
select MStockCode,
(select sum(MBackOrderQty) from sorDetail as T2
where T2.MStockCode = T1.MStockCode ) as SumBackOrderQty,
(select sum(QtyOnHand) from invWarehouse as T3
where T3.MStockCode = T1.MStockCode ) as SumQtyOnHand
from
(
select mstockcode from sorDetail
union
select mstockcode from invWarehouse
) as T1
In a fiddle here: http://sqlfiddle.com/#!9/fdaca/6
Though my SumQtyOnHand values don't match yours (as @Gordon pointed out).
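To see the double counting and the effect of aggregating each table before joining, the sketch below loads the SorDetail and InvWarehouse rows from the question and runs both variants. It assumes SQLite via Python's sqlite3 module; the ActiveFlag and MPrice filters are left out because the sample data has no such columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SorDetail (SalesOrder INT, MStockCode TEXT, MBackOrderQty INT)")
conn.execute("CREATE TABLE InvWarehouse (MStockCode TEXT, Warehouse TEXT, QtyOnHand INT)")
conn.executemany(
    "INSERT INTO SorDetail VALUES (?, ?, ?)",
    [
        (100, "PN-1", 4), (100, "PN-2", 9), (100, "PN-3", 1), (100, "PN-4", 6),
        (101, "PN-1", 6), (101, "PN-3", 2),
        (102, "PN-2", 19), (102, "PN-3", 14), (102, "PN-4", 6), (102, "PN-5", 4),
    ],
)
conn.executemany(
    "INSERT INTO InvWarehouse VALUES (?, ?, ?)",
    [
        ("PN-1", "A", 1), ("PN-2", "B", 9), ("PN-3", "A", 0), ("PN-4", "B", 1),
        ("PN-1", "A", 0), ("PN-3", "B", 5), ("PN-2", "A", 9), ("PN-3", "B", 4),
        ("PN-4", "A", 6), ("PN-5", "B", 0),
    ],
)

# Joining first multiplies rows: every detail row repeats once per warehouse row.
naive = conn.execute("""
    SELECT d.MStockCode, SUM(d.MBackOrderQty), SUM(w.QtyOnHand)
    FROM SorDetail d
    LEFT JOIN InvWarehouse w ON w.MStockCode = d.MStockCode
    GROUP BY d.MStockCode
    ORDER BY d.MStockCode
""").fetchall()

# Aggregating each table on its own, then joining one row per stock code.
pre_aggregated = conn.execute("""
    SELECT sd.MStockCode, sd.SumBackOrderQty, iw.SumQtyOnHand
    FROM (SELECT MStockCode, SUM(MBackOrderQty) AS SumBackOrderQty
          FROM SorDetail GROUP BY MStockCode) sd
    LEFT JOIN (SELECT MStockCode, SUM(QtyOnHand) AS SumQtyOnHand
               FROM InvWarehouse GROUP BY MStockCode) iw
      ON iw.MStockCode = sd.MStockCode
    ORDER BY sd.MStockCode
""").fetchall()

print(naive[0])        # ('PN-1', 20, 2)
print(pre_aggregated)  # PN-1 is now (10, 1)
```

On this sample data the pre-aggregated sums are the plain per-table totals, which also shows that the "Desired Results" table in the question does not follow from the rows given.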

Question about a query in SQL Server

I have a table like this:
| ID | id_number | a | b |
| 1 | 1 | 0 | 215 |
| 2 | 2 | 28 | 8952 |
| 3 | 3 | 10 | 2000 |
| 4 | 1 | 0 | 215 |
| 5 | 1 | 0 |10000 |
| 6 | 3 | 10 | 5000 |
| 7 | 2 | 3 |90933 |
I want to sum a*b over the rows that share the same id_number. What query returns that value for every id_number? For example, the result should look like this:
| ID | id_number | result |
| 1 | 1 | 0 |
| 2 | 2 | 523455 |
| 3 | 3 | 70000 |
This is a simple aggregation query:
select id_number, sum(a*b)
from t
group by id_number
I'm not sure what the first column is for.
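A quick check of that aggregation against the sample rows, as a sketch assuming SQLite via Python's sqlite3 module and the table name t used in the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INT, id_number INT, a INT, b INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0, 215),
        (2, 2, 28, 8952),
        (3, 3, 10, 2000),
        (4, 1, 0, 215),
        (5, 1, 0, 10000),
        (6, 3, 10, 5000),
        (7, 2, 3, 90933),
    ],
)

# The product a*b is computed per row, then summed within each id_number group.
totals = conn.execute("""
    SELECT id_number, SUM(a * b) AS result
    FROM t
    GROUP BY id_number
    ORDER BY id_number
""").fetchall()
print(totals)  # [(1, 0), (2, 523455), (3, 70000)]
```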