The source contains yearly data. I need to exclude some of it; the rows to exclude are listed in another table, and the number of rows in that table is not fixed.
Source data:
Dates to be excluded:
There can be 2 rows or 5 rows of data to exclude, so the exclusion needs to be dynamic. The two tables can be joined on the DISPLAY_NAME column.
I am trying to do this with a query and don't want to use a stored procedure. Is there a way, or is a stored procedure the only choice?
Maybe I could add a CASE WHEN column (1 / 0) for each exclusion row and keep only the rows where all of those columns are 1, but I don't know how many CASE WHEN expressions I would need, since the row count of the exclusion table is not fixed.
Are you looking for not exists?
select s.*
from source s
where not exists (select 1
from excluded e
where e.display_name = s.display_name and
s.start_datetime >= e.start_date and
s.end_datetime < e.end_date
);
Note: Your question does not explain how the end_date should be handled. This assumes that the data on that date should be included in the result set. You can tweak the logic to exclude data from that date as well.
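To see the NOT EXISTS approach run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table names source_data and excluded_dates, the sample rows, and the text-formatted dates are all assumptions made for illustration; only the shape of the query comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the source and exclusion tables.
    CREATE TABLE source_data (display_name TEXT, start_datetime TEXT, end_datetime TEXT);
    CREATE TABLE excluded_dates (display_name TEXT, start_date TEXT, end_date TEXT);

    INSERT INTO source_data VALUES
        ('A', '2022-01-10 08:00', '2022-01-10 09:00'),
        ('A', '2022-03-05 08:00', '2022-03-05 09:00'),
        ('B', '2022-01-10 08:00', '2022-01-10 09:00');

    -- Any number of exclusion rows can go here; the query does not change.
    INSERT INTO excluded_dates VALUES
        ('A', '2022-01-01', '2022-02-01');
""")

rows = conn.execute("""
    SELECT s.display_name, s.start_datetime
    FROM source_data s
    WHERE NOT EXISTS (
        SELECT 1
        FROM excluded_dates e
        WHERE e.display_name = s.display_name
          AND s.start_datetime >= e.start_date
          AND s.end_datetime < e.end_date
    )
    ORDER BY s.display_name, s.start_datetime
""").fetchall()

print(rows)
```

Only the January row for 'A' falls inside an exclusion range, so it is the only one dropped. Adding more rows to excluded_dates changes the result without touching the query, which is the dynamic behaviour the question asks for.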
I have an initial query, written below, and need to find values in the quote_id column that are different while the corresponding values in the benefit_plan_cd column are the same. The output should look like the example below. I know the prospect_nbr for this case, which is why I can add it to my initial query to get the expected results, but I need to be able to find other cases going forward.
select prospect_nbr, qb.quote_id, quote_type, effective_date,
benefit_plan_cd, package_item_cd
from qo_benefit_data qb
inner join
qo_quote qq on qb.quote_id = qq.quote_id
where quote_type = 'R'
and effective_date >= to_date('06/01/2022','mm/dd/yyyy')
and package_item_cd = 'MED'
The output should look something like this, excluding the other columns.
quote_id benefit_plan_cd
514 1234
513 1234
Let's do this in two steps.
First take your existing query and add the following at the end of your select list:
select ... /* the columns you have already */
, count(distinct qb.quote_id) over (partition by benefit_plan_cd) as ct
That is the only change - don't change anything else. You may want to run this first, to see what it produces. (Looking at a few rows should suffice, you don't need to look at all the rows.)
Then use this as a subquery, to filter on this count being > 1:
select ... /* only the ORIGINAL columns, without the one we added */
from (
/* write the query from above here, as a SUBquery */
)
where ct > 1
;
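The same filter can also be written with GROUP BY and HAVING instead of an analytic count. The sketch below uses that variant because it runs on Python's built-in sqlite3 module, which rejects DISTINCT inside window functions; it is a swapped-in technique, not the analytic query from the answer. The single quotes table and its rows are hypothetical stand-ins for the joined query in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, flattened stand-in for the question's joined result.
    CREATE TABLE quotes (quote_id INTEGER, benefit_plan_cd TEXT);
    INSERT INTO quotes VALUES
        (513, '1234'),
        (514, '1234'),
        (600, '5678'),
        (600, '5678');
""")

rows = conn.execute("""
    SELECT DISTINCT q.quote_id, q.benefit_plan_cd
    FROM quotes q
    JOIN (
        -- Plans that appear under more than one distinct quote_id.
        SELECT benefit_plan_cd
        FROM quotes
        GROUP BY benefit_plan_cd
        HAVING COUNT(DISTINCT quote_id) > 1
    ) shared ON shared.benefit_plan_cd = q.benefit_plan_cd
    ORDER BY q.quote_id DESC
""").fetchall()

print(rows)
```

Plan '5678' appears twice but always under quote 600, so it is filtered out; plan '1234' is shared by two different quotes and is kept.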
I am seeing duplicates in my data after running my SQL query, and have traced the issue to our data team adding a new row instead of updating the existing one. In this instance, I need to use the largest LD_SEQ_NBR to get the latest data.
Given the following table -- ORDERS
ID ORD_NBR LD_SEQ_NBR
0 130263789 1665
1 130263789 1870
What do I need to add to my WHERE clause to make sure I'm taking the rows with the largest LD_SEQ_NBR?
LD_SEQ_NBR = (SELECT MAX(LD_SEQ_NBR) FROM ORDERS A WHERE A.ORD_NBR = ORDERS.ORD_NBR)
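A quick way to check that condition is to run it against the sample rows. This is a minimal sketch using Python's built-in sqlite3 module; the third row (a second order number) is a made-up addition to show that the filter works per ORD_NBR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, ord_nbr INTEGER, ld_seq_nbr INTEGER);
    INSERT INTO orders VALUES
        (0, 130263789, 1665),
        (1, 130263789, 1870),
        (2, 130263790, 42);   -- hypothetical extra order, not in the question
""")

rows = conn.execute("""
    SELECT id, ord_nbr, ld_seq_nbr
    FROM orders
    WHERE ld_seq_nbr = (SELECT MAX(a.ld_seq_nbr)
                        FROM orders a
                        WHERE a.ord_nbr = orders.ord_nbr)
    ORDER BY id
""").fetchall()

print(rows)
```

The correlated subquery is evaluated per order number, so each ORD_NBR keeps only its highest LD_SEQ_NBR row.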
I have the following code in SAS:
proc sql;
create table play2
as select a.anndats,a.amaskcd,count(b.amaskcd) as experience
from test1 as a, test1 as b
where a.amaskcd = b.amaskcd and intck('day', b.anndats, a.anndats)>0
group by a.amaskcd, a.ANNDATS;
quit;
The dataset test1 has 32 distinct observations, while play2 only returns 22. All I want is, for each observation, to count how many times the same amaskcd appears in its history. What is the best way to solve this? Thanks.
The reason this returns 22 observations (which might not actually be 22 distinct rows out of the 32) is that this is a comma join, which in this case ends up being basically an inner join. For any given row a, if there are no rows b that have an earlier anndats with the same amaskcd, then that a will not be returned.
What you want to do here is a left join, which returns all rows from a once.
create table play2
as select ...
from test1 a
left join test1 b
on a.amaskcd=b.amaskcd
and intck(...)>0 /* keep the date filter in ON; in WHERE it would drop the unmatched rows again */
group by ...
;
I would actually write this differently, as I'm not sure the above will do exactly what you want.
create table play2
as select a.anndats, a.amaskcd,
(select count(1) from test1 b
where b.amaskcd=a.amaskcd
and b.anndats<a.anndats /* intck('day') is unnecessary; SAS dates are stored as integer day counts */
) as experience
from test1 a
;
If your test1 isn't already grouped by amaskcd and anndats, you may need to rework this some. This kind of subquery is easier to write and more accurately reflects what you're trying to do, I suspect.
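The correlated-subquery form is easy to try outside SAS. Below is a minimal sketch using Python's built-in sqlite3 module, assuming that "history" means rows with a strictly earlier anndats for the same amaskcd; the integer dates and the four sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dates are plain integers here, mirroring how SAS stores date values.
    CREATE TABLE test1 (amaskcd INTEGER, anndats INTEGER);
    INSERT INTO test1 VALUES
        (1, 100), (1, 110), (1, 125),
        (2, 100);
""")

rows = conn.execute("""
    SELECT a.amaskcd, a.anndats,
           (SELECT COUNT(*)
            FROM test1 b
            WHERE b.amaskcd = a.amaskcd
              AND b.anndats < a.anndats) AS experience
    FROM test1 a
    ORDER BY a.amaskcd, a.anndats
""").fetchall()

print(rows)
```

Every input row comes back exactly once, including rows with no history, which get an experience of 0 instead of disappearing as they do with the inner join.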
If the anndats variables in both datasets are dates (not datetimes), you can simply compare them directly. Date variables in SAS are integers where 1 represents one day, so you do not need the intck function to get the difference in days; just use subtraction.
The second thing I noticed is that your code looks for > 0 days returned. The intck function can return a negative value if the second value is less than the first.
I am still not sure I understand what you're looking to produce with the query. It joins two copies of the dataset using the amaskcd field as the key. You're then filtering on anndats, selecting only records where b's anndats value is earlier than a's, i.e. b.anndats < a.anndats.
I am running into an error and I know what is happening, but I can't see what is causing it. Below is the SQL code I am using. Basically I am getting the general results I want; however, I am not giving the query the correct 'where' clause.
If this is of any assistance, the count is coming out like this:
Total Tier
1 High
2 Low
There are 4 records in the Enrollment table. 3 are active, and 1 is not. Only 2 of the records should be displayed: 1 for High, and 1 for Low. The second Low record that is in the total was flagged as 'inactive' on 12/30/2010 and reflagged again on 1/12/2011, so it should not be in the results. I changed the initial '<=' to '=' and the results stayed the same.
I need to exclude any record from Enrollment_Status_Change where the "active_status" was changed to 0 before the date.
SELECT COUNT(dbo.Enrollments.Customer_ID) AS Total,
dbo.Phone_Tier.Tier
FROM dbo.Phone_Tier as p
JOIN dbo.Enrollments as e ON p.Phone_Model = e.Phone_Model
WHERE (e.Customer_ID NOT IN
(Select Customer_ID
From dbo.Enrollment_Status_Change as Status
Where (Change_Date >'12/31/2010')))
GROUP BY dbo.Phone_Tier.Tier
Thanks for any assistance, and I apologize for any confusion. This is my first time here and I'm trying to correct my etiquette on the fly.
If you don't want any of the fields from that table dbo.Enrollment_Status_Change, and you don't seem to use it in any way — why even include it in the JOINs? Just leave it out.
Plus: start using table aliases. This is very hard to read if you use the full table name in each JOIN condition and WHERE clause.
Your code should be:
SELECT
COUNT(e.Customer_ID) AS Total, p.Tier
FROM
dbo.Phone_Tier p
INNER JOIN
dbo.Enrollments e ON p.Phone_Model = e.Phone_Model
WHERE
e.Active_Status = 1
AND EXISTS (SELECT DISTINCT Customer_ID
FROM dbo.Enrollment_Status_Change AS Status
WHERE (Change_Date <= '12/31/2010'))
GROUP BY
p.Tier
Also: most likely, your EXISTS check is wrong — since you didn't post your table structures, I can only guess — but my guess would be:
AND EXISTS (SELECT * FROM dbo.Enrollment_Status_Change
WHERE Change_Date <= '12/31/2010' AND Customer_ID = e.Customer_ID)
Check for existence of any entries in dbo.Enrollment_Status_Change for the customer defined by e.Customer_ID, with a Change_Date on or before that cut-off date. Right?
Assuming you want to:
exclude all customers whose latest enrollment_status_change record was since the start of 2011
but
include all customers whose latest enrollment_status_change record was earlier than the end of 2010 (why else would you have put that EXISTS clause in?)
Then this should do it:
SELECT COUNT(e.Customer_ID) AS Total,
p.Tier
FROM dbo.Phone_Tier p
JOIN dbo.Enrollments e ON p.Phone_Model = e.Phone_Model
WHERE e.Active_Status = 1
AND e.Customer_ID NOT IN (
SELECT Customer_ID
FROM dbo.Enrollment_Status_Change status
WHERE (Change_Date >= '2011-01-01')
)
GROUP BY p.Tier
Basically, the problem with your code is that joining a one-to-many table will always increase the row count. If you wanted to exclude all the records that had a matching row in the other table this would be fine -- you could just use a LEFT JOIN and then set a WHERE clause like Customer_ID IS NULL.
But because you want to exclude a subset of the enrollment_status_change table, you must use a subquery.
Your intention is not clear from the example given, but if you wanted to exclude anyone whose enrollment_status_change was before 2011 and include those whose status change was since 2011, you'd just swap the date comparator for <.
Is this any help?
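For what it's worth, here is a small runnable sketch of the NOT IN version, using Python's built-in sqlite3 module. The table structures were never posted, so the columns, phone models, customer IDs and ISO-formatted dates below are all guesses modelled on the description in the question (four enrollments, one inactive, one reflagged in January 2011).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Guessed table structures; the question did not include them.
    CREATE TABLE Phone_Tier (Phone_Model TEXT, Tier TEXT);
    CREATE TABLE Enrollments (Customer_ID INTEGER, Phone_Model TEXT, Active_Status INTEGER);
    CREATE TABLE Enrollment_Status_Change (Customer_ID INTEGER, Change_Date TEXT);

    INSERT INTO Phone_Tier VALUES ('X1', 'High'), ('Y1', 'Low');
    INSERT INTO Enrollments VALUES
        (1, 'X1', 1),
        (2, 'Y1', 1),
        (3, 'Y1', 1),   -- reflagged after the cut-off, should be excluded
        (4, 'Y1', 0);   -- inactive, should be excluded
    INSERT INTO Enrollment_Status_Change VALUES
        (3, '2010-12-30'),
        (3, '2011-01-12'),
        (4, '2010-11-01');
""")

rows = conn.execute("""
    SELECT COUNT(e.Customer_ID) AS Total, p.Tier
    FROM Phone_Tier p
    JOIN Enrollments e ON p.Phone_Model = e.Phone_Model
    WHERE e.Active_Status = 1
      AND e.Customer_ID NOT IN (
          SELECT Customer_ID
          FROM Enrollment_Status_Change
          WHERE Change_Date >= '2011-01-01'
      )
    GROUP BY p.Tier
    ORDER BY p.Tier
""").fetchall()

print(rows)
```

One thing to watch with NOT IN: if the subquery can return a NULL Customer_ID, the predicate never evaluates to true and no rows come back, so NOT EXISTS is the safer form when that column is nullable.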
I have the query below. The problem is that the subquery for the last column, productdescr, returns two distinct records, so the query fails. I need to add one more column to the WHERE clause of that subquery so that it returns one record. The issue is that the column I need to add should not be part of the GROUP BY clause.
SELECT product_billing_id,
billing_ele,
SUM(round(summary_net_amt_excl_gst/100)) gross,
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele) productdescr
FROM bil.bill_sum aa
WHERE file_id = 38613 --1=1
AND line_type = 'D'
AND (product_billing_id, billing_ele) IN (SELECT DISTINCT
product_billing_id,
billing_ele
FROM bil.bill_l2 )
AND trans_type_desc <> 'Change'
GROUP BY product_billing_id, billing_ele
I want to modify the subquery as shown below, adding a new filter to its WHERE clause so that it returns one record.
(SELECT DISTINCT description
FROM RES.tariff_nt
WHERE product_billing_id = aa.product_billing_id
AND billing_ele = aa.billing_ele
AND (rate_structure_start_date <= TO_DATE(aa.p_effective_date,'yyyymmdd')
AND rate_structure_end_date > TO_DATE(aa.p_effective_date,'yyyymmdd'))
) productdescr
The aa.p_effective_date should not be a part of GROUP BY clause. How can I do it? Oracle is the Database.
So there are multiple RES.tariff_nt records for a given product_billing_id/billing_ele, differentiated by the start/end dates.
You want the description for the record that encompasses the 'p_effective_date' from bil.bill_sum. The kicker is that you can't (or don't want to) include that in the group by. That suggests you've got multiple rows in bil.bill_sum with different effective dates.
The issue is what you want to happen when you are summarising those multiple rows with different dates: which of those dates do you want to use to get the description?
If it doesn't matter, simply use MIN(aa.p_effective_date), or MAX.
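One way to apply the MIN idea is to aggregate first and look up the description afterwards, so the effective date never has to appear in the GROUP BY. The sketch below uses Python's built-in sqlite3 module rather than Oracle. The unqualified table names, the amt column, the sample rows and the yyyymmdd text dates are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical versions of bil.bill_sum and RES.tariff_nt.
    CREATE TABLE bill_sum (product_billing_id TEXT, billing_ele TEXT,
                           amt INTEGER, p_effective_date TEXT);
    CREATE TABLE tariff_nt (product_billing_id TEXT, billing_ele TEXT, description TEXT,
                            rate_structure_start_date TEXT, rate_structure_end_date TEXT);

    INSERT INTO bill_sum VALUES
        ('P1', 'E1', 100, '20220115'),
        ('P1', 'E1', 250, '20220720');
    INSERT INTO tariff_nt VALUES
        ('P1', 'E1', 'Old tariff', '20220101', '20220701'),
        ('P1', 'E1', 'New tariff', '20220701', '20230101');
""")

rows = conn.execute("""
    SELECT g.product_billing_id, g.billing_ele, g.gross,
           (SELECT t.description
            FROM tariff_nt t
            WHERE t.product_billing_id = g.product_billing_id
              AND t.billing_ele = g.billing_ele
              AND t.rate_structure_start_date <= g.eff_date
              AND t.rate_structure_end_date > g.eff_date) AS productdescr
    FROM (
        -- Aggregate first; MIN picks one effective date per group.
        SELECT product_billing_id, billing_ele,
               SUM(amt) AS gross,
               MIN(p_effective_date) AS eff_date
        FROM bill_sum
        GROUP BY product_billing_id, billing_ele
    ) g
""").fetchall()

print(rows)
```

Swapping MIN for MAX would pick 'New tariff' instead, which is exactly the choice the answer says you have to make. This still assumes the tariff date ranges for a given product and element do not overlap; if they do, the lookup can return more than one row again.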
Have you looked into the Oracle analytical functions? This is a good link: Analytical Functions by Example