ORACLE How to get data from a table depending on year and the amount of duplicates - sql

The question is: "Produce a list of those employees who have made bookings at the Sports Club more than 5 times in the last calendar year (this should be calculated and not hard coded). Order these by the number of bookings made."
What I am struggling with is getting today's year and subtracting 1 year from it, as well as showing each person only once when they appear 5 or more times.
This is what I have so far, but it doesn't work.
SELECT DISTINCT
FIRSTNAME,SURNAME,DATEBOOKED,MEMBERSHIPTYPEID,BOOKINGID,MEMBER.MEMBERID
FROM MEMBERSHIP,MEMBER,BOOKING
WHERE MEMBER.MEMBERID = BOOKING.MEMBERID
AND MEMBERSHIP.MEMBERSHIPTYPEID = 3 AND BOOKING.DATEBOOKED = (SYSDATE,
'DD/MON/YY -1') AND FIRSTNAME IN (
SELECT FIRSTNAME
FROM MEMBER
GROUP BY FIRSTNAME,SURNAME
HAVING COUNT(FIRSTNAME) >= 5
)
ORDER BY MEMBER.MEMBERID;
[Images: ERD and table structures]

What you need is this day last year. There are various different ways of calculating that. For instance,
add_months(sysdate, -12)
or
sysdate - interval '1' year
The subquery looks a bit whacky too. What you're after is the number of BOOKING records in the last year. So the subquery should drive off the BOOKING table, and the date filter should be there.
Finally, there is a missing join condition between MEMBER and MEMBERSHIP. That's probably why you think you need that distinct; fix the join and you'll get the result set you want and not a product. One advantage of the ANSI 92 explicit join syntax is that it stops us from missing joins.
Your query needs to be sorted by the number of bookings, so you need to count those and sort by the total. This means you don't actually need a subquery at all.
So your query should look something like:
SELECT MEMBER.MEMBERID
, MEMBER.FIRSTNAME
, MEMBER.SURNAME
, count(BOOKING.BOOKINGID) as no_of_bookings
FROM MEMBER
inner join MEMBERSHIP
on MEMBER.MEMBERID = MEMBERSHIP.MEMBERID
inner join BOOKING
on MEMBER.MEMBERID = BOOKING.MEMBERID
WHERE MEMBERSHIP.MEMBERSHIPTYPEID = 3
and BOOKING.DATEBOOKED >= add_months(trunc(sysdate), -12)
GROUP BY MEMBER.MEMBERID
, MEMBER.FIRSTNAME
, MEMBER.SURNAME
HAVING COUNT(*) >= 5
ORDER BY no_of_bookings desc;
Here is a SQL Fiddle demo for my query.
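For anyone who wants to try the idea without an Oracle instance, here is a minimal sketch of the same count-and-filter pattern using Python's sqlite3 module. It is not the Oracle query above: the tables are cut down to MEMBER and BOOKING (the MEMBERSHIP join is left out for brevity), dates are assumed to be stored as ISO text, and the helper names (one_year_before, frequent_bookers, demo_connection) are made up for the example.

```python
import sqlite3
from datetime import date


def one_year_before(day):
    """Same day last year; 29 Feb falls back to 28 Feb, like add_months(d, -12)."""
    try:
        return day.replace(year=day.year - 1)
    except ValueError:
        return day.replace(year=day.year - 1, day=28)


def frequent_bookers(conn, today, min_bookings=5):
    """Members with at least min_bookings bookings since this day last year."""
    sql = """
        SELECT m.memberid, m.firstname, m.surname,
               COUNT(b.bookingid) AS no_of_bookings
        FROM member m
        JOIN booking b ON b.memberid = m.memberid
        WHERE b.datebooked >= ?
        GROUP BY m.memberid, m.firstname, m.surname
        HAVING COUNT(*) >= ?
        ORDER BY no_of_bookings DESC
    """
    start = one_year_before(today).isoformat()
    return conn.execute(sql, (start, min_bookings)).fetchall()


def demo_connection():
    """Hypothetical sample data, only for trying the query out."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE member (memberid INTEGER PRIMARY KEY, firstname TEXT, surname TEXT);
        CREATE TABLE booking (bookingid INTEGER PRIMARY KEY, memberid INTEGER, datebooked TEXT);
        INSERT INTO member VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    """)
    # Ann has six recent bookings; Bob has five, but two of them are too old.
    rows = [(1, f"2023-0{month}-10") for month in range(1, 7)]
    rows += [(2, d) for d in ("2021-01-05", "2021-02-05", "2023-01-05",
                              "2023-02-05", "2023-03-05")]
    conn.executemany("INSERT INTO booking (memberid, datebooked) VALUES (?, ?)", rows)
    return conn
```

Passing today in as a parameter, rather than reading the clock inside the query, is only there to make the sketch repeatable. In Oracle you would keep sysdate in the WHERE clause as shown above.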

SELECT
FIRSTNAME,SURNAME,DATEBOOKED,MEMBERSHIPTYPEID,BOOKINGID,MEMBER.MEMBERID
FROM MEMBERSHIP,MEMBER,BOOKING
WHERE MEMBER.MEMBERID = BOOKING.MEMBERID
AND MEMBERSHIP.MEMBERSHIPTYPEID = 3 AND BOOKING.DATEBOOKED between
(sysdate - interval '1' year) and (SYSDATE)
AND FIRSTNAME IN (
SELECT distinct FIRSTNAME
FROM MEMBER
GROUP BY FIRSTNAME,SURNAME
HAVING COUNT(FIRSTNAME) >= 5
)
ORDER BY MEMBER.MEMBERID;

I think this answers the question:
SELECT m.FIRSTNAME, m.SURNAME, m.MEMBERID
FROM MEMBER m JOIN
BOOKING b
ON m.MEMBERID = b.MEMBERID JOIN
MEMBERSHIP ms
ON ms.MEMBERID = m.MEMBERID
WHERE ms.MEMBERSHIPTYPEID = 3 AND
B.DATEBOOKED >= SYSDATE - INTERVAL '1' YEAR
GROUP BY m.FIRSTNAME, m.SURNAME, m.MEMBERID
HAVING COUNT(*) >= 5
ORDER BY COUNT(*) DESC;
Table aliases make the query much easier to write and to read.

Related

Filter customers with at least 3 transactions a year for the past 2 years Presto/SQL

I have a table of customer transactions called cust_trans where each transaction made by a customer is stored as one row. Another column, visit_date, contains the transaction date. I would like to filter for the customers who transacted at least 3 times a year for the past 2 years.
The data looks like below
Id visit_date
---- ------
1 01/01/2019
1 01/02/2019
1 01/01/2019
1 02/01/2020
1 02/01/2020
1 03/01/2020
1 03/01/2020
2 01/02/2019
3 02/04/2019
I would like to know the customers who visited at least 3 times every year for the past two years,
i.e. I want the output below.
id
---
1
From the customer table, only one person visited at least 3 times in each of the 2 years.
I tried the query below, but it only checks whether the total number of visits is greater than or equal to 3:
select id
from
cust_scan
GROUP by
id
having count(visit_date) >= 3
and year(date(max(visit_date)))-year(date(min(visit_date))) >=2
I would appreciate any help, guidance or suggestions.
One option would be to generate a list of distinct ids, cross join it with the last two years, and then bring in the original table with a left join. You can then aggregate to count how many visits each id had in each year. The final step is to aggregate again and filter with a having clause:
select t.id
from (
select i.id, y.yr, count(c.id) cnt
from (select distinct id from cust_scan) i
cross join (values
(date_trunc('year', current_date)),
(date_trunc('year', current_date) - interval '1' year)
) as y(yr)
left join cust_scan c
on i.id = c.id
and c.visit_date >= y.yr
and c.visit_date < y.yr + interval '1' year
group by i.id, y.yr
) t
group by t.id
having min(cnt) >= 3
Another option would be to use two correlated subqueries:
select distinct id
from cust_scan c
where
(
select count(*)
from cust_scan c1
where
c1.id = c.id
and c1.visit_date >= date_trunc('year', current_date)
and c1.visit_date < date_trunc('year', current_date) + interval '1' year
) >= 3
and (
select count(*)
from cust_scan c1
where
c1.id = c.id
and c1.visit_date >= date_trunc('year', current_date) - interval '1' year
and c1.visit_date < date_trunc('year', current_date)
) >= 3
I assume you mean calendar years. I think I would use two levels of aggregation:
select ct.id
from (select ct.id, year(visit_date) as yyyy, count(*) as cnt
from cust_trans ct
where ct.visit_date >= '2019-01-01' -- or whatever
group by ct.id, year(visit_date)
) ct
group by ct.id
having count(*) = 2 and -- both year
min(cnt) >= 3; -- at least three transactions
If you want the last two complete years, just change the where clause in the subquery.
You can use a similar idea -- of two aggregations -- if you want the last two years relative to the current date. That would be two full years, rather than 1 and some fraction of the current year.
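As a rough illustration of the two-level aggregation for two complete calendar years, here is a sketch using Python's sqlite3 module rather than Presto. It assumes visit_date is stored as ISO text (the sample dates from the question are converted by hand, and their day/month order does not matter here because only the year is used). The function names and the first_year parameter are invented for the example.

```python
import sqlite3


def regular_customers(conn, first_year, min_visits=3):
    """Ids with at least min_visits visits in first_year AND in the year after."""
    sql = """
        SELECT id
        FROM (
            -- inner level: one row per customer per calendar year
            SELECT id, strftime('%Y', visit_date) AS yyyy, COUNT(*) AS cnt
            FROM cust_trans
            WHERE visit_date >= ? AND visit_date < ?
            GROUP BY id, strftime('%Y', visit_date)
        ) AS per_year
        -- outer level: both years present, and the weaker year still qualifies
        GROUP BY id
        HAVING COUNT(*) = 2 AND MIN(cnt) >= ?
        ORDER BY id
    """
    params = (f"{first_year:04d}-01-01", f"{first_year + 2:04d}-01-01", min_visits)
    return [row[0] for row in conn.execute(sql, params)]


def demo_visits():
    """The sample rows from the question, rewritten as ISO dates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cust_trans (id INTEGER, visit_date TEXT)")
    rows = [
        (1, "2019-01-01"), (1, "2019-01-02"), (1, "2019-01-01"),
        (1, "2020-02-01"), (1, "2020-02-01"), (1, "2020-03-01"), (1, "2020-03-01"),
        (2, "2019-01-02"),
        (3, "2019-02-04"),
    ]
    conn.executemany("INSERT INTO cust_trans VALUES (?, ?)", rows)
    return conn
```

With the question's data, regular_customers(demo_visits(), 2019) keeps only id 1, because ids 2 and 3 have a single visit in 2019 and none in 2020.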

SQL In Oracle - How to search through occurrences in an interval?

I've gotten myself stuck working in Oracle with SQL for the first time. In my library example, I need to make a query on my tables for a library member who has borrowed more than 5 books in some week during the past year. Here's my attempt:
SELECT
PN.F_NAME,
PN.L_NAME,
M.ENROLL_DATE,
COUNT(*) AS BORROWED_COUNT,
(SELECT
(BD.DATE_BORROWED + INTERVAL '7' DAY)
FROM DUAL, BORROW_DETAILS BD
GROUP BY BD.DATE_BORROWED + INTERVAL '7' DAY
HAVING COUNT(*) > 5
) AS VALID_INTERVALS
FROM PERSON_NAME PN, BORROW_DETAILS BD, HAS H, MEMBER M
WHERE
PN.PID = M.PID AND
M.PID = BD.PID AND
BD.BORROWID = H.BORROWID
GROUP BY PN.F_NAME, PN.L_NAME, M.ENROLL_DATE, DATEDIFF(DAY, BD.DATE_RETURNED, VALID_INTERVALS)
ORDER BY BORROWED_COUNT DESC;
As I'm sure you can tell, I'm really struggling with dates in Oracle. For some reason DATEDIFF won't work at all for me, and I can't find any way to evaluate VALID_INTERVALS, which should be another date...
Also apologies for the all caps.
DATEDIFF is not a valid function in Oracle; if you want the difference then subtract one date from another and you'll get a number representing the number of days (or fraction thereof) between the values.
If you want to count it for a week starting from Midnight Monday then you can TRUNCate the date to the start of the ISO week (which will be Midnight of the Monday of that week) and then group and count:
SELECT MAX( PN.F_NAME ) AS F_NAME,
MAX( PN.L_NAME ) AS L_NAME,
MAX( M.ENROLL_DATE ) AS ENROLL_DATE,
TRUNC( BD.DATE_BORROWED, 'IW' ) AS monday_of_iso_week,
COUNT(*) AS BORROWED_COUNT
FROM PERSON_NAME PN
INNER JOIN MEMBER M
ON ( PN.PID = M.PID )
INNER JOIN BORROW_DETAILS BD
ON ( M.PID = BD.PID )
GROUP BY
PN.PID,
TRUNC( BD.DATE_BORROWED, 'IW' )
HAVING COUNT(*) > 5
ORDER BY BORROWED_COUNT DESC;
You haven't given your table structures or any sample data, so it's difficult to test; but you don't appear to need to include the HAS table, and I'm assuming there is a 1:1 relationship between person and member.
You also don't want to GROUP BY names as there could be two people with the same first and last name (who happened to enrol on the same date) and should use something that uniquely identifies the person (which I assume is PID).
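To make the "truncate to the Monday of the ISO week, then group and count" step concrete outside the database, here is a small plain-Python sketch. It only mirrors the grouping logic, not the joins, and the function names and the (pid, date_borrowed) tuple layout are assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta


def monday_of_iso_week(day):
    """Rough equivalent of TRUNC(day, 'IW'): the Monday starting day's ISO week."""
    # date.weekday() is 0 for Monday, so this steps back to that week's Monday.
    return day - timedelta(days=day.weekday())


def heavy_weeks(borrows, threshold=5):
    """Return (pid, week_start, count) for weeks with more than threshold borrows.

    borrows is an iterable of (pid, date_borrowed) pairs.
    """
    counts = Counter((pid, monday_of_iso_week(day)) for pid, day in borrows)
    rows = [(pid, week, n) for (pid, week), n in counts.items() if n > threshold]
    # Busiest weeks first, like ORDER BY BORROWED_COUNT DESC.
    return sorted(rows, key=lambda row: (-row[2], row[0], row[1]))
```

Grouping on the pair (pid, week start) is the same choice as GROUP BY PN.PID, TRUNC(BD.DATE_BORROWED, 'IW') in the query above: the person is identified by PID, not by name.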

SQL creating a pivot function

I have a SQL code that looks like this:
select cast(avg(age) as decimal(16,2)) as 'avg' From
(select distinct acct.Account, cast(Avg(year(getdate())- year(client_birth_date)) as decimal(16,2)) as 'Age'
from WF_PM_ACCT_DB DET
inner join WF_PM_ACCT_DET_DB ACCT
ON det.Account = acct.Account
where (acct_closing_date is null or acct_closing_date > '2017-01-01')
and Acct_Open_Date < '2017-01-01'
group by acct.Account
) x
What this basically gives me is a single-cell answer: the average age of accounts for the year defined by Acct_Open_Date < '2017-01-01'. I am an amateur, so I change the date every time and run the query again and again to get the remaining years. Is there an easy way to have all the years as column headings and just one row with the average account age in each year?
Please note that the account closing date being null means the account never got closed, and I have to change it to less than the analysis year in order to get a true picture of the average account age that existed at that time.
Any help is appreciated. Thanks.
You can run this for multiple dates by including them in a single derived table:
with dates as (
select cast('2017-01-01' as date) as yyyy union all
select cast('2016-01-01' as date)
)
select yyyy, cast(avg(age) as decimal(16,2)) as avg_age
From (select dates.yyyy, acct.Account,
cast(Avg(year(getdate())- year(client_birth_date)) as decimal(16,2)) as Age
from dates cross join
WF_PM_ACCT_DB DET inner join
WF_PM_ACCT_DET_DB ACCT
on det.Account = acct.Account
where (acct_closing_date is null or acct_closing_date > dates.yyyy) and
Acct_Open_Date < dates.yyyy
group by acct.Account, dates.yyyy
) x
group by yyyy
order by yyyy;
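Here is a hedged sketch of the same "cross join a list of cutoff dates" idea, run against SQLite from Python instead of SQL Server. It collapses the two account tables into one hypothetical account table with open_date, close_date and birth_date stored as ISO text, and takes the reference year as a parameter in place of year(getdate()). All names are made up for the example.

```python
import sqlite3


def average_age_by_cutoff(conn, cutoffs, reference_year):
    """One row per cutoff date: average client age over accounts open at that date.

    cutoffs must be a non-empty list of ISO date strings.
    """
    values = ", ".join("(?)" for _ in cutoffs)
    sql = f"""
        WITH dates(cutoff) AS (VALUES {values})
        SELECT d.cutoff,
               ROUND(AVG(? - CAST(strftime('%Y', a.birth_date) AS INTEGER)), 2) AS avg_age
        FROM dates AS d
        CROSS JOIN account AS a
        WHERE a.open_date < d.cutoff
          AND (a.close_date IS NULL OR a.close_date > d.cutoff)
        GROUP BY d.cutoff
        ORDER BY d.cutoff
    """
    # Placeholders are bound in textual order: the cutoffs, then the reference year.
    return conn.execute(sql, [*cutoffs, reference_year]).fetchall()


def demo_accounts():
    """Three hypothetical accounts, only for trying the query out."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account (account TEXT, open_date TEXT, close_date TEXT, birth_date TEXT)"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?, ?, ?)", [
        ("A", "2015-06-01", None, "1980-01-01"),
        ("B", "2016-03-01", None, "1990-01-01"),
        ("C", "2014-01-01", "2016-06-01", "1970-01-01"),
    ])
    return conn
```

This returns the years as rows rather than as column headings. Turning the rows into columns would be a separate pivot step on top of this result.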

Searching records between today's date and a specified month of this year

I want to search records between today's date and a specified month of this year. If I wanted to perform the same query next year without altering the query, then the month has to be based on sysdate, I think? This is what I am trying to achieve anyway. Here is my query:
SELECT *
FROM details n
WHERE n.id (SELECT p.nhno
FROM patient p
WHERE p.fname = 'James'
AND p.surname = 'Gump'
AND p.nhno = n.nhno
AND n.daterecorded
BETWEEN '01-OCT-15' AND to_char(sysdate, 'dd-mon-yy'))
Thanks
What you are inferring above is not correct. You have hard-coded the date '01-OCT-15' and want to run the same query next year without altering it. What do you think the result would be? Every year you run it, you will get the records from '01-OCT-15' onwards, not from a specific month of the current year.
From what you said, you need records from a specified month of the current year, not from the same '01-OCT-15' every year.
If you need records between 2 months ago and today, use the query below:
SELECT *
FROM details n
WHERE EXISTS (SELECT p.nhno
FROM patient p
WHERE p.fname = 'James'
AND p.surname = 'Gump'
AND p.nhno = n.nhno
AND n.daterecorded BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE);
If you need records between today and 2 months from now, replace the BETWEEN condition in the above query with:
BETWEEN SYSDATE AND ADD_MONTHS(SYSDATE, 2)
You can also change the ADD_MONTHS offset to reach back to whichever month you need.
Say, if you want records between 6 months ago and the present date:
BETWEEN ADD_MONTHS(SYSDATE, -6) AND SYSDATE
You can run this query every year without any alterations.
This should do it. If not, get back to me.
Try this:
SELECT *
FROM details n
WHERE EXISTS (SELECT p.nhno
FROM patient p
WHERE p.fname = 'James'
AND p.surname = 'Gump'
AND p.nhno = n.nhno
AND n.daterecorded
BETWEEN TO_DATE('01-OCT-' || EXTRACT(YEAR FROM SYSDATE), 'DD-MON-YYYY')
AND SYSDATE);

RedShift: Alternative to 'where in' to compare annual login activity

Here are the two cases:
Members Lost: Get the distinct count of user ids from 365 days ago who haven't had any activity since then
Members Added: Get the distinct count of user ids from today who don't exist in the previous 365 days.
Here are the SQL statements I've been writing. Logically I feel like this should work (and it does for sample data), but the dataset is 5Million+ rows and takes forever! Is there any way to do this more efficiently? (base_date is a calendar that I'm joining on to build out a 2 year trend. I figured this was faster than joining the 5million table on itself...)
-- Members Lost
SELECT
effective_date,
COUNT(DISTINCT dwuserid) as members_lost
FROM base_date
LEFT JOIN site_visit
-- Get Login Activity for 365th day
ON DATEDIFF(day, srclogindate, effective_date) = 365
WHERE dwuserid NOT IN (
-- Get Distinct Login activity for Current Day (PY) + 1 to Current Day (CY) (i.e. 2013-01-02 to 2014-01-01)
SELECT DISTINCT dwuserid
FROM site_visit b
WHERE DATEDIFF(day, b.srclogindate, effective_date) BETWEEN 0 AND 364
)
GROUP BY effective_date
ORDER BY effective_date;
-- Members Added
SELECT
effective_date,
COUNT(DISTINCT dwuserid) as members_added
FROM base_date
LEFT JOIN site_visit ON srclogindate = effective_date
WHERE dwuserid NOT IN (
SELECT DISTINCT dwuserid
FROM site_visit b
WHERE DATEDIFF(day, b.srclogindate, effective_date) BETWEEN 1 AND 365
)
GROUP BY effective_date
ORDER BY effective_date;
Thanks in advance for any help.
UPDATE
Thanks to @JohnR for pointing me in the right direction. I had to tweak your response a bit because I need to know, for any login day, how many users were "Member Added" or "Member Lost", so it had to be a 365-day rolling window looking back or looking forward. Finding the IDs that didn't have a match in the LEFT JOIN was much faster.
-- Trim data down to one user login per day
CREATE TABLE base_login AS
SELECT DISTINCT "dwuserid", "srclogindate"
FROM site_visit
-- Members Lost
SELECT
current."srclogindate",
COUNT(DISTINCT current."dwuserid") as "members_lost"
FROM base_login current
LEFT JOIN base_login future
ON current."dwuserid" = future."dwuserid"
AND current."srclogindate" < future."srclogindate"
AND DATEADD(day, 365, current."srclogindate") >= future."srclogindate"
WHERE future."dwuserid" IS NULL
GROUP BY current."srclogindate"
-- Members Added
SELECT
current."srclogindate",
COUNT(DISTINCT current."dwuserid") as "members_added"
FROM base_login current
LEFT JOIN base_login past
ON current."dwuserid" = past."dwuserid"
AND current."srclogindate" > past."srclogindate"
AND DATEADD(day, 365, past."srclogindate") >= current."srclogindate"
WHERE past."dwuserid" IS NULL
GROUP BY current."srclogindate"
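For readers who want to see the LEFT JOIN ... IS NULL pattern from the update run end to end, here is a small sketch against SQLite via Python. It is not Redshift: DATEADD is replaced by SQLite's date() modifier, the column names are shortened to user_id and login_date, and base_login is assumed to already hold one row per user per login day. The function names and sample rows are invented.

```python
import sqlite3

# A login counts as "added" when the same user has no login in the prior 365 days.
ADDED_SQL = """
    SELECT cur.login_date, COUNT(DISTINCT cur.user_id) AS members_added
    FROM base_login AS cur
    LEFT JOIN base_login AS past
      ON past.user_id = cur.user_id
     AND past.login_date < cur.login_date
     AND past.login_date >= date(cur.login_date, '-365 days')
    WHERE past.user_id IS NULL
    GROUP BY cur.login_date
    ORDER BY cur.login_date
"""

# A login counts as "lost" when the same user has no login in the next 365 days.
LOST_SQL = """
    SELECT cur.login_date, COUNT(DISTINCT cur.user_id) AS members_lost
    FROM base_login AS cur
    LEFT JOIN base_login AS future
      ON future.user_id = cur.user_id
     AND future.login_date > cur.login_date
     AND future.login_date <= date(cur.login_date, '+365 days')
    WHERE future.user_id IS NULL
    GROUP BY cur.login_date
    ORDER BY cur.login_date
"""


def members_added(conn):
    return conn.execute(ADDED_SQL).fetchall()


def members_lost(conn):
    return conn.execute(LOST_SQL).fetchall()


def demo_logins():
    """Hypothetical login days for two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE base_login (user_id INTEGER, login_date TEXT)")
    conn.executemany("INSERT INTO base_login VALUES (?, ?)", [
        (1, "2022-01-01"), (1, "2022-06-01"), (1, "2024-01-01"),
        (2, "2022-06-01"),
    ])
    return conn
```

Note that with this definition the most recent logins always count as lost, since no future row can exist for them yet. Whether to trim the last 365 days from the report is a separate decision.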
NOT IN with a subquery should generally be avoided on large tables, because it tends to force the database to scan all of the data.
Instead of joining to the site_visit table (which is presumably huge), try joining to a sub-query that selects each user's first and most recent login dates. That way, there is only one row per user instead of one row per visit.
For example:
SELECT dwuserid, min (srclogindate) as first_login, max(srclogindate) as last_login
FROM site_visit
GROUP BY dwuserid
You could then simplify the queries to something like:
-- Members Lost: Last login was between 12 and 13 months ago
SELECT
COUNT(*)
FROM
(
SELECT dwuserid, min(srclogindate) as first_login, max(srclogindate) as last_login
FROM site_visit
GROUP BY dwuserid
) AS user_logins
WHERE
last_login BETWEEN current_date - interval '13 months' and current_date - interval '12 months'
-- Members Added: First visit in last 12 months
SELECT
COUNT(*)
FROM
(
SELECT dwuserid, min(srclogindate) as first_login, max(srclogindate) as last_login
FROM site_visit
GROUP BY dwuserid
) AS user_logins
WHERE
first_login > current_date - interval '12 months'