Comparing two max dates with a condition in Oracle SQL - sql

I have the data as below:
ID  date       state
1   24-Aug-18  Not defined
1   23-Aug-18  Incorrect
1   22-Aug-18  Incorrect
1   21-Aug-18  Incorrect
1   1-Aug-18   Correct
1   23-Jul-17  Incorrect
1   22-Jul-17  Incorrect
1   21-Jul-17  Incorrect
1   10-Jul-17  Correct
Record 1 can stay in the Incorrect state for 3 days; after that it goes to 'Not defined', unless an update is made to the record, in which case it goes back to Correct. The Not defined state has to be avoided. I need a query that identifies the earliest date of the latest run of Incorrect records, i.e. the date on which the record most recently went into the Incorrect state; in this case 21-Aug-2018. An added problem is that the table doesn't have unique keys.
I have been trying the code below, but it throws the error
'ORA-01427: single-row subquery returns more than one row'
select id, min(date) from table where state = 'Incorrect' group by id having
((Select trunc(MAX (date)) from table where state = 'Incorrect'
group by id) >= (select trunc(Max (date)) from table where state = 'Correct'
group by id))

Hmmm, I think this does what you want:
select id, min(date) as min_latest_incorrect_date
from (select t.*,
max(case when state = 'Correct' then date end) over (partition by id) as max_date_correct
from t
) t
where (date > max_date_correct or max_date_correct is null) and
state = 'Incorrect'
group by id

Per ID, you are looking for incorrect records that are not followed by any correct record. Of these take the first.
select id, min(date)
from mytable i
where state = 'Incorrect'
and not exists
(
select *
from mytable c
where c.id = i.id
and c.state = 'Correct'
and c.date > i.date
)
group by id
order by id;
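To check the NOT EXISTS idea against the sample data from the question, here is a minimal sketch that runs the same logic in SQLite through Python's sqlite3 module rather than in Oracle. The table name records, the column name rec_date, and the ISO date strings are assumptions made for this illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, rec_date TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        (1, "2018-08-24", "Not defined"),
        (1, "2018-08-23", "Incorrect"),
        (1, "2018-08-22", "Incorrect"),
        (1, "2018-08-21", "Incorrect"),
        (1, "2018-08-01", "Correct"),
        (1, "2017-07-23", "Incorrect"),
        (1, "2017-07-22", "Incorrect"),
        (1, "2017-07-21", "Incorrect"),
        (1, "2017-07-10", "Correct"),
    ],
)

# Keep only Incorrect rows that have no later Correct row for the same id,
# then take the earliest of them per id.
query = """
SELECT i.id, MIN(i.rec_date) AS first_open_incorrect
FROM records AS i
WHERE i.state = 'Incorrect'
  AND NOT EXISTS (
        SELECT 1
        FROM records AS c
        WHERE c.id = i.id
          AND c.state = 'Correct'
          AND c.rec_date > i.rec_date
      )
GROUP BY i.id
ORDER BY i.id
"""
latest_incorrect = conn.execute(query).fetchall()
print(latest_incorrect)
```

ISO-formatted text dates sort chronologically, which is why plain string comparison is enough in this sketch; in Oracle you would compare real DATE values instead.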

Grouping by id and looking at another column in a particular order to see if the id group satisfies a particular condition

customer_id  transaction success
1            Failed
2            Complete
1            Failed
1            Complete
3            Failed
2            Failed
3            Complete
3            Failed
3            Failed
3            Complete
Essentially I want to write a statement to identify whether a customer has had a completed transaction after having had a failed transaction at some earlier point. In this example, customer 1 and customer 3 would satisfy this. Assume that there is an added timestamp column next to transaction success.
The resulting table should look like this:
customer_id  returning_success
1            True
2            False
3            True
Assuming it doesn't matter whether the Complete came before or after the failure, you can LEFT JOIN the table with a subquery that returns only the customers with a Complete transaction. If the joined value is NULL, the customer has no Complete transaction; otherwise the customer has one.
As you didn't specify your DBMS (please read: Why should I "tag my RDBMS"?), the query uses a CASE expression, which is standard SQL; NULL-handling functions such as IFNULL differ between DBMSs: https://www.w3schools.com/sql/sql_isnull.asp
SELECT DISTINCT
yt.customer_id,
CASE WHEN completes.customer_id IS NULL THEN 'false' ELSE 'true' END AS returning_success
FROM
yourtable yt
LEFT JOIN
(
SELECT DISTINCT
customer_id
FROM
yourtable
WHERE transaction_success = 'Complete') completes
ON completes.customer_id = yt.customer_id
If you just need customers that have had both successful and failed transactions, you could use this:
select customer_id, case when sum(case
when transaction='Failed'
then 1
else 0 end)>0
and
sum(case
when transaction='Complete'
then 1
else 0 end)>0
then 'True'
else 'False' end
returning_success
from table_
group by customer_id
 If you actually do have some timestamp column:
select nvl(c.customer_id, f.customer_id) customer_id
, case when last_complete_time is null
or first_fail_time is null
or first_fail_time>last_complete_time
then 'False'
else 'True' end
returning_success
from (
select customer_id, max(time_) last_complete_time
from table_
where transaction='Complete'
group by customer_id
) c
full join (
select customer_id, min(time_) first_fail_time
from table_
where transaction='Failed'
group by customer_id
) f on c.customer_id=f.customer_id
You can also use this query to return all the True cases and then union or join the rest:
select f.customer_id, 'True'
from (
select customer_id, max(time_) last_complete_time
from table_
where transaction='Complete'
group by customer_id
) c
join (
select customer_id, min(time_) first_fail_time
from table_
where transaction='Failed'
group by customer_id
) f on c.customer_id=f.customer_id
where first_fail_time<last_complete_time
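The timestamp comparison (first failure earlier than last completion) can also be written as a single conditional aggregation. The sketch below tries that against the question's sample rows in SQLite via Python's sqlite3. The table name transactions, the column names status and ts, and the use of the row order as the timestamp are all assumptions, since the question only says a timestamp column exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (customer_id INTEGER, status TEXT, ts INTEGER)"
)

# Sample rows in the order shown in the question; ts is a hypothetical
# timestamp that simply follows that order (1, 2, 3, ...).
sample = [
    (1, "Failed"), (2, "Complete"), (1, "Failed"), (1, "Complete"),
    (3, "Failed"), (2, "Failed"), (3, "Complete"), (3, "Failed"),
    (3, "Failed"), (3, "Complete"),
]
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(cust, status, ts) for ts, (cust, status) in enumerate(sample, start=1)],
)

# A customer is a "returning success" when the earliest failure happened
# before the latest completion. If either side is missing, the comparison
# is NULL and the CASE falls through to 'False'.
query = """
SELECT customer_id,
       CASE WHEN MIN(CASE WHEN status = 'Failed'   THEN ts END)
               < MAX(CASE WHEN status = 'Complete' THEN ts END)
            THEN 'True' ELSE 'False' END AS returning_success
FROM transactions
GROUP BY customer_id
ORDER BY customer_id
"""
returning = conn.execute(query).fetchall()
print(returning)
```

This avoids the full join between two subqueries at the cost of packing both conditions into one expression; which reads better is a matter of taste.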

Finding rows in SQL where changes but only certain changes while keeping others

I have a scenario where I want every active row to come back in my result set, and an inactive row to come back only if it is the only record for that IDENTIFIER. If an IDENTIFIER has more than one active row, all of them should be shown. I've used the ROW_NUMBER function and then, in another query, filtered to row = 1, but then only the first row per IDENTIFIER comes back and I lose some of my desired results. To restate: I want all active records to come back, and inactive records only where the IDENTIFIER is unique. The row that is bold should not be shown in the results.
1 has 1 active record in the DB.
2 has 2 active and 1 inactive records.
3 has no active records.
4 has only 2 active records, no inactive.
You can use a windowed conditional count; this has the benefit of scanning the table only once:
SELECT
t.IDENTIFIER,
t.DB_ID,
t.Status
FROM (
SELECT *,
HasActive = COUNT(CASE WHEN t.Status = 'Active' THEN 1 END) OVER (PARTITION BY t.IDENTIFIER)
FROM YourTable t
) t
WHERE t.Status = 'Active' OR t.HasActive = 0;
One way to do this is with NOT EXISTS:
SELECT t1.*
FROM tablename t1
WHERE t1.Status = 'Active'
OR NOT EXISTS (
SELECT 1
FROM tablename t2
WHERE t2.identifier = t1.identifier AND t2.db_id <> t1.db_id
);
I assume that the column db_id is unique, at least for the same identifier.
If I understood you correctly, this is my variant.
select IDENTIFIER, [DB_ID], [Status]
from Tab
where [Status]='Active'
union
select IDENTIFIER, [DB_ID], [Status]
from Tab as t
where [Status]='Inactive' And 1=(select Count(*) from Tab where
IDENTIFIER=t.IDENTIFIER)
Order by IDENTIFIER, [DB_ID]
You can do it like this, because (rank = 1 and Status = 'Inactive') holds only if there are no active rows for a particular identifier:
select * from (
select *,
DENSE_RANK() OVER (PARTITION BY identifier order by status) AS rnk
from some_table
) t
where rnk = 1 or status = 'Active'
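For a quick check of the "all active rows, plus inactive rows only when the identifier has no active row" reading, here is a small sketch in SQLite via Python's sqlite3. The table name items, the db_id values, and the exact rows are invented to match the counts described in the question (identifier 3 is assumed to have a single inactive row).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (identifier INTEGER, db_id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, 10, "Active"),                      # 1: one active record
        (2, 20, "Active"), (2, 21, "Active"),   # 2: two active ...
        (2, 22, "Inactive"),                    #    ... and one inactive
        (3, 30, "Inactive"),                    # 3: no active records
        (4, 40, "Active"), (4, 41, "Active"),   # 4: two active, no inactive
    ],
)

# Return every active row, and an inactive row only if its identifier
# has no active row at all.
query = """
SELECT t1.identifier, t1.db_id, t1.status
FROM items AS t1
WHERE t1.status = 'Active'
   OR NOT EXISTS (
        SELECT 1
        FROM items AS t2
        WHERE t2.identifier = t1.identifier
          AND t2.status = 'Active'
      )
ORDER BY t1.identifier, t1.db_id
"""
kept = conn.execute(query).fetchall()
print(kept)
```

With this data only the inactive row of identifier 2 is dropped. If an identifier could have several inactive rows and no active one, this reading keeps all of them, which may or may not be what is wanted.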

Check whether an employee is present on three consecutive days

I have a table called tbl_A with the following schema:
After insert, I have the following data in tbl_A:
Now the question is how to write a query for the following scenario:
Put (1) in front of any employee who was present three days consecutively
Put (0) in front of employee who was not present three days consecutively
The output screenshot:
I think we should use a CASE expression, but I am not able to check for three consecutive days from the date. Any help would be appreciated.
Thank you
select name, case when max(cons_days) >= 3 then 1 else 0 end as presence
from (
-- count only the days on which the employee was present
select name, count(case when available = 'Y' then 1 end) as cons_days
from tbl_A, (values (0),(1),(2)) as a(dd)
group by name, adate + dd
)x
group by name
With a self-join on name and available = 'Y', the inner query pairs each date of an employee with the other dates of the same employee and, for each date adate, counts the entries that fall on adate, adate + 1, or adate + 2. If all three entries are present, the count for that date is 3, and the outer query sets the flag to 1 for any name that has such a date. Try the query below:
SELECT Z.NAME,
CASE WHEN MAX(Z.CONSEQ_AVAIL) >= 3 THEN 1 ELSE 0 END AS YOUR_FLAG
FROM
(
SELECT A.NAME, A.ADATE,
SUM(CASE WHEN B.ADATE >= A.ADATE AND B.ADATE <= A.ADATE + 2 THEN 1 ELSE 0 END) AS CONSEQ_AVAIL
FROM
tbl_A A INNER JOIN tbl_A B
ON A.NAME = B.NAME AND A.AVAILABLE = 'Y' AND B.AVAILABLE = 'Y'
GROUP BY A.NAME, A.ADATE
) Z
GROUP BY Z.NAME;
Due to the complexity of the problem, I have not been able to test it out. If something is really wrong, please let me know and I will be happy to take down my answer.
-- Below is my approach
select Name,
Case WHen Max_Count>=3 Then 1 else 0 end as Presence
from
(
Select Name,MAx(Coun) as Max_Count
from
(
select Name, (count(*) over (partition by Name,Ref_Date)) as Coun from
(
select Name,adate + row_number() over (partition by Name order by Adate desc) as Ref_Date
from temp
where available='Y'
)
) group by Name
);
select name as employee, max(diff) as presence
from
(select name,
-- 1 when the present day two rows later is exactly 2 days after this one
-- (assumes at most one row per employee per date)
case when datediff(day, Adate, lead(Adate, 2) over (partition by name order by Adate)) = 2 then 1 else 0 end as diff
from table_A
where Available = 'Y') A
group by name;
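Since the schema and data in the question were only posted as images, here is a hedged sketch of the "shift each present day by 0, 1 and 2 days and count" idea from the first answer, run in SQLite via Python's sqlite3. The column names name, adate and available are taken from the answers; the employees and dates are invented, and the sketch assumes at most one row per employee per date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_A (name TEXT, adate TEXT, available TEXT)")
conn.executemany(
    "INSERT INTO tbl_A VALUES (?, ?, ?)",
    [
        ("Ann", "2020-01-01", "Y"), ("Ann", "2020-01-02", "Y"),
        ("Ann", "2020-01-03", "Y"),                              # 3 in a row
        ("Bob", "2020-01-01", "Y"), ("Bob", "2020-01-02", "N"),
        ("Bob", "2020-01-03", "Y"), ("Bob", "2020-01-04", "Y"),  # only 2 in a row
        ("Cy", "2020-01-01", "N"),                               # never present
    ],
)

# Each present day is projected onto itself, the next day and the day after.
# A projected date hit by 3 different present days closes a 3-day streak.
# The outer LEFT JOIN keeps employees who were never present (flag 0).
query = """
SELECT e.name,
       CASE WHEN COALESCE(MAX(x.cons_days), 0) >= 3 THEN 1 ELSE 0 END AS presence
FROM (SELECT DISTINCT name FROM tbl_A) AS e
LEFT JOIN (
    SELECT a.name, COUNT(*) AS cons_days
    FROM tbl_A AS a
    CROSS JOIN (SELECT 0 AS dd UNION ALL SELECT 1 UNION ALL SELECT 2) AS s
    WHERE a.available = 'Y'
    GROUP BY a.name, date(a.adate, '+' || s.dd || ' day')
) AS x ON x.name = e.name
GROUP BY e.name
ORDER BY e.name
"""
presence = conn.execute(query).fetchall()
print(presence)
```

The date(..., '+N day') call is SQLite's way of adding days; on SQL Server the equivalent would be DATEADD, and on Oracle plain date arithmetic.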

Constructing A Query In BigQuery With CASE Statements

So I'm trying to construct a query in BigQuery that I'm struggling with for a final part.
As of now I have:
SELECT
UNIQUE(Name) as SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) as RevenueGenerated
FROM (
SELECT
mantaSubscriptionIdmetadata,
planIdmetadata,
INTEGER(Amount) as RevenueGenerated
FROM
[sample_internal_data.charge0209]
WHERE
revenueSourcemetadata = 'new'
AND
Status = 'Paid'
GROUP BY
mantaSubscriptionIdmetadata,
planIdmetadata,
RevenueGenerated
)a
JOIN (
SELECT
id,
Name,
Interval
FROM
[sample_internal_data.subplans]
WHERE
id in ('150017','150030','150033','150019')
GROUP BY
id,
Name,
Interval )b
ON
a.planIdmetadata = b.id
GROUP BY
ID,
Interval,
Name
ORDER BY
Interval ASC
The resulting query looks like this
Which is exactly what I'm looking for up to that point.
Here is where I'm stuck. There is another column I need to add called SalesRepName. The field will either be null or not null. If it's null, the subscription was sold online; if it's not null, it was sold via telephone. I want to create two additional columns that show how many were sold via telesales and how many online. The sum of the two columns will always equal the SubsPurchased total.
Can anyone help?
You can include case statements within aggregate functions. Here you could choose sum(case when SalesRepName is null then 1 else 0 end) as online and sum(case when SalesRepName is not null then 1 else 0 end) as telesales.
count(case when SalesRepName is null then 1 end) as online would give the same result. Using sum in these situations is simply my personal preference.
Note that omitting the else clause is equivalent to setting else null, and null isn't counted by count. This can be very useful in combination with exact_count_distinct, which has no equivalent in terms of sum.
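To see that the SUM(CASE ...) and COUNT(CASE ...) forms behave the same way, here is a tiny sketch using SQLite through Python's sqlite3 rather than BigQuery. The table name sales, the column names plan and sales_rep_name, and the rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (plan TEXT, sales_rep_name TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("monthly", None),     # no rep: sold online
        ("monthly", "Dana"),   # rep present: sold by telephone
        ("monthly", None),
        ("yearly", "Eli"),
    ],
)

# online uses SUM over 1/0; telesales uses COUNT, which skips the NULL
# produced by a CASE without an ELSE branch.
query = """
SELECT plan,
       COUNT(*) AS subs_purchased,
       SUM(CASE WHEN sales_rep_name IS NULL THEN 1 ELSE 0 END) AS online,
       COUNT(CASE WHEN sales_rep_name IS NOT NULL THEN 1 END) AS telesales
FROM sales
GROUP BY plan
ORDER BY plan
"""
breakdown = conn.execute(query).fetchall()
print(breakdown)
```

In each row, online plus telesales adds up to subs_purchased, which is the property the question asks for.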
Try the query below.
It assumes your SalesRepName field is in the [sample_internal_data.charge0209] table,
and it uses a "tiny version" of SUM(CASE WHEN ... END), which works when the value to be summed is 0 or 1:
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
SELECT
UNIQUE(Name) AS SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) AS RevenueGenerated,
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
FROM (
SELECT SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, INTEGER(Amount) AS RevenueGenerated
FROM [sample_internal_data.charge0209]
WHERE revenueSourcemetadata = 'new'
AND Status = 'Paid'
GROUP BY SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, RevenueGenerated
)a
JOIN (
SELECT id, Name, Interval
FROM [sample_internal_data.subplans]
WHERE id IN ('150017','150030','150033','150019')
GROUP BY id, Name, Interval
)b
ON a.planIdmetadata = b.id
GROUP BY ID, Interval, Name
ORDER BY Interval ASC

SQL Case on Where clause to different columns

I've been trying to write a query that, based on a given condition (whether isCurrent = 1 or not), returns just one row for a given CurriculumId (which will be a parameter of a stored procedure).
If isCurrent = 1, it should return the item with the most recent StartDate; if isCurrent = 0, it should return the one with the most recent EndDate.
I only want one item per CurriculumId: ideally the one with isCurrent = 1 and the most recent StartDate (ignoring the remaining rows), but if there are no experiences with isCurrent = 1, the one with the most recent EndDate.
My previous query was almost working, but it returned both the row with the most recent StartDate (isCurrent = 1) and the row with the most recent EndDate, when I want just one or the other.
I've come up with the query below:
SELECT table.IntProfessionalExperienceId,
table.IsCurrent,
table.StartDate,
table.EndDate
FROM table
WHERE table.CurriculumId = 12
AND
CASE table.IsCurrent
WHEN 1
THEN
table.StartDate = (
SELECT max(table.StartDate)
FROM table
WHERE table.IsCurrent = 1
AND table.CurriculumId = 12
GROUP BY table.CurriculumId
)
ELSE
table.EndDate = (
SELECT max(table.EndDate)
FROM table
WHERE table.CurriculumId = 12
GROUP BY table.CurriculumId
)
END
Individually, the subqueries seem to work and return the expected value, but when run as a whole I get the following errors:
Msg 102, Level 15, State 1, Line 8
Incorrect syntax near '='.
Msg 102, Level 15, State 1, Line 14
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 21
Incorrect syntax near ')'.
What in my syntax is wrong? I know from reading the errors what is wrong with the query but I just don't know how to fix it. And is it just the syntax or am I doing the query wrong to start with?
Split this into multiple conditions, like this:
SELECT
table.IntProfessionalExperienceId,
table.IsCurrent,
table.StartDate,
table.EndDate
FROM table
WHERE
table.CurriculumId = 12 AND
(
(
Table.IsCurrent = 1 AND
table.StartDate =
(
SELECT max(table.StartDate)
FROM table
WHERE
table.IsCurrent = 1 AND
table.CurriculumId = 12
GROUP BY table.CurriculumId
)
) OR
(
ISNULL(table.IsCurrent,0) != 1 AND
table.EndDate =
(
SELECT max(table.EndDate)
FROM table
WHERE table.CurriculumId = 12
GROUP BY table.CurriculumId
)
)
)
EDIT: another, arguably simpler approach would be to pre-aggregate the data you want in your WHERE clause so that you only need to call it a single time, rather than evaluate each row separately. Something like the following:
SELECT
table.IntProfessionalExperienceId,
table.IsCurrent,
table.StartDate,
table.EndDate
FROM
table
INNER JOIN
(
SELECT
MAX(table.EndDate) AS MaxEndDate,
MAX(CASE WHEN table.IsCurrent = 1 THEN table.StartDate END) AS MaxCurrentStartDate
FROM table
WHERE CurriculumID = 12
) MaxDates ON
(Table.IsCurrent = 1 AND Table.StartDate = MaxDates.MaxCurrentStartDate) OR
(ISNULL(Table.IsCurrent, 0) != 1 AND Table.EndDate = MaxDates.MaxEndDate)
WHERE
table.CurriculumId = 12
Give each row a rank in its curriculumid group, using ROW_NUMBER with an appropriate order by clause. Then only take the records ranked 1 (i.e. best matching).
select
intprofessionalexperienceid,
iscurrent,
startdate,
enddate
from
(
select mytable.*,
row_number() over
(
partition by curriculumid
order by
case when iscurrent = 1 then 1 else 2 end,
case when iscurrent = 1 then startdate else enddate end desc
) as rn
from mytable
) ranked
where rn = 1;
(I know this doesn't actually answer your question, but is the straight-forward way to approach the problem in my opinion.)
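As a rough check of the ROW_NUMBER ranking, here is a sketch in SQLite via Python's sqlite3 (window functions need SQLite 3.25 or later). The table name experiences, the snake_case column names, and the sample rows are assumptions for the example, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE experiences ("
    "id INTEGER, curriculum_id INTEGER, is_current INTEGER, "
    "start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO experiences VALUES (?, ?, ?, ?, ?)",
    [
        (1, 12, 1, "2015-01-01", None),
        (2, 12, 1, "2017-06-01", None),          # current, latest start
        (3, 12, 0, "2010-01-01", "2014-12-31"),
        (4, 13, 0, "2009-01-01", "2011-05-31"),
        (5, 13, 0, "2011-06-01", "2013-03-31"),  # no current row, latest end
    ],
)

# Rank rows per curriculum: current rows first (by latest start date),
# then non-current rows (by latest end date). Keep only the best row.
query = """
SELECT id, curriculum_id
FROM (
    SELECT e.*,
           ROW_NUMBER() OVER (
               PARTITION BY curriculum_id
               ORDER BY CASE WHEN is_current = 1 THEN 1 ELSE 2 END,
                        CASE WHEN is_current = 1 THEN start_date
                             ELSE end_date END DESC
           ) AS rn
    FROM experiences AS e
) AS ranked
WHERE rn = 1
ORDER BY curriculum_id
"""
best = conn.execute(query).fetchall()
print(best)
```

Because the ranking always yields exactly one row with rn = 1 per curriculum, this avoids the "one row for StartDate and another for EndDate" result described in the question.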
Try using a CASE expression this way:
SELECT table.IntProfessionalExperienceId,
table.IsCurrent,
table.StartDate,
table.EndDate
FROM table
WHERE table.CurriculumId = 12
AND CASE WHEN table.IsCurrent = 1 THEN table.StartDate ELSE table.EndDate END = CASE
WHEN table.IsCurrent = 1
THEN (
SELECT max(table.StartDate)
FROM table
WHERE table.IsCurrent = 1
AND table.CurriculumId = 12
GROUP BY table.CurriculumId
)
ELSE
(
SELECT max(table.EndDate)
FROM table
WHERE table.CurriculumId = 12
GROUP BY table.CurriculumId
)
END