Finding days when users haven't created any entries - sql

I have two tables, users and time_entries; time_entries has a foreign key to the users table. Users may create time entries, each with some amount of time. I want to write a query that returns the summed amounts of time over an arbitrary date range, grouped by user and date. That part is easy, but I also need to include the days on which nobody entered any time_entry. I tried creating an additional table called calendar with dates and left joining time_entries to it, but I couldn't retrieve the list of users that haven't entered any time_entry. Here is my query:
SELECT te.date, SUM(te.amount), user_name
FROM calendar c
LEFT JOIN time_entries te on c.date = te.date
RIGHT JOIN asp_net_users anu on te.user_id = anu.id
GROUP BY user_name, te.date

If you just want the days on which no user made any entry, you can use NOT EXISTS and a correlated subquery.
SELECT c.date
FROM calendar c
WHERE NOT EXISTS (SELECT *
FROM time_entries te
WHERE te.date = c.date);
If you want all users along with the days on which they haven't made any entry, cross join the users and the days, and then also use a NOT EXISTS.
SELECT anu.user_name,
c.date
FROM asp_net_users anu
CROSS JOIN calendar c
WHERE NOT EXISTS (SELECT *
FROM time_entries te
WHERE te.user_id = anu.id
AND te.date = c.date);
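To see the cross join plus NOT EXISTS idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question; the sample rows and the variable name `missing` are made up for illustration, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE asp_net_users (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE calendar (date TEXT PRIMARY KEY);
    CREATE TABLE time_entries (user_id INTEGER, date TEXT, amount REAL);
    -- illustrative sample data, not from the question
    INSERT INTO asp_net_users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO calendar VALUES ('2019-10-01'), ('2019-10-02');
    INSERT INTO time_entries VALUES (1, '2019-10-01', 2.5), (1, '2019-10-01', 1.5);
""")

# Every (user, day) pair comes from the cross join; NOT EXISTS keeps only
# the pairs that have no matching time entry.
missing = conn.execute("""
    SELECT anu.user_name, c.date
    FROM asp_net_users anu
    CROSS JOIN calendar c
    WHERE NOT EXISTS (SELECT 1
                      FROM time_entries te
                      WHERE te.user_id = anu.id
                        AND te.date = c.date)
    ORDER BY anu.user_name, c.date
""").fetchall()

for user_name, day in missing:
    print(user_name, day)
```

With this sample data alice is missing only the second day, while bob, who has no entries at all, is listed for both days.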

Thanks to sticky bit's examples I was able to write the following query, which solves my problem:
SELECT c.date, a.id, COALESCE(sum(te.amount), 0)
FROM asp_net_users a
CROSS JOIN (SELECT *
FROM calendar
WHERE date BETWEEN '2019-10-01 00:00:00'::timestamp AND '2019-10-31 00:00:00'::timestamp) c
LEFT JOIN time_entries te on a.id = te.user_id AND c.date = te.date
WHERE a.department_guid = '95b7538d-3830-48d7-ba06-ad7c51a57191'
GROUP BY c.date, a.id
ORDER BY c.date
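A small sketch of the same shape (cross join users with the calendar slice, left join the entries, COALESCE the sum) using Python's sqlite3, in case it helps to check the behaviour on toy data. The sample rows and the name `totals` are assumptions for illustration, and the department filter is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE asp_net_users (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE calendar (date TEXT PRIMARY KEY);
    CREATE TABLE time_entries (user_id INTEGER, date TEXT, amount REAL);
    -- illustrative sample data, not from the question
    INSERT INTO asp_net_users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO calendar VALUES ('2019-10-01'), ('2019-10-02'), ('2019-10-03');
    INSERT INTO time_entries VALUES
        (1, '2019-10-01', 2.5), (1, '2019-10-01', 1.5), (2, '2019-10-02', 3.0);
""")

# One row per (day, user) in the requested range; days without entries
# get a total of 0 instead of disappearing from the result.
totals = conn.execute("""
    SELECT c.date, a.id, COALESCE(SUM(te.amount), 0)
    FROM asp_net_users a
    CROSS JOIN (SELECT date FROM calendar WHERE date BETWEEN ? AND ?) c
    LEFT JOIN time_entries te ON te.user_id = a.id AND te.date = c.date
    GROUP BY c.date, a.id
    ORDER BY c.date, a.id
""", ("2019-10-01", "2019-10-02")).fetchall()

for row in totals:
    print(row)
```

The join condition on te.date has to stay in the ON clause; moving it to WHERE would drop the zero rows again.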

Related

Why is the SQL full outer join not presenting unmatched customers (avc_id)?

I appreciate your help in advance!
The right table avc_enr has 108K customers (b.avc_id) in it. In the 2nd table (alias a), we have about 97K customers (a.avc_id).
I tried right, left and full outer joins, but every time the customer count (Total_users) shows 97K rather than 108K. Any idea why, even with a full outer join, the count function is not counting all customers when there is no match between the two tables?
with avc_enr as
(
select
dt, avc_id, service_template_name
from
hive.thor_satellite.v_nms_inventory_nmsdb_avc_service
where
current_status = 'ACTIVE' and dt = 20220809
)
select
a.dt, a.metrics_date,
avg(a.vsat_fl_byte_count_kbps) as AUPU_Kbps,
count(b.avc_id) as Total_users
from
hive.thor_satellite.vda_satellite_nms_performance_smts_avc_pm_throughput a
full outer join
avc_enr b on a.avc_id = b.avc_id and a.dt = b.dt
where
a.dt = 20220809
group by
a.dt, a.metrics_date
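One likely cause, offered as an assumption since the real data isn't available: the WHERE a.dt = 20220809 filter runs after the join, and for customers that exist only in avc_enr every column of a is NULL, so those rows are discarded. The sketch below reproduces the effect with Python's sqlite3 on made-up rows. It uses a LEFT JOIN from avc_enr rather than FULL OUTER JOIN (older SQLite builds lack the latter), and the table name `perf` is a hypothetical stand-in for the long throughput table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE perf (avc_id TEXT, dt INTEGER);     -- stand-in for table a
    CREATE TABLE avc_enr (avc_id TEXT, dt INTEGER);  -- stand-in for CTE b
    -- illustrative sample data: customer C has no performance rows
    INSERT INTO avc_enr VALUES ('A', 20220809), ('B', 20220809), ('C', 20220809);
    INSERT INTO perf VALUES ('A', 20220809), ('B', 20220809);
""")

# Filtering on a.dt in WHERE: the unmatched customer has a.dt = NULL,
# so the row produced by the outer join is thrown away again.
filtered_in_where = conn.execute("""
    SELECT COUNT(b.avc_id)
    FROM avc_enr b
    LEFT JOIN perf a ON a.avc_id = b.avc_id AND a.dt = b.dt
    WHERE a.dt = 20220809
""").fetchone()[0]

# Filtering on the preserved table's own column keeps unmatched customers.
filtered_on_preserved = conn.execute("""
    SELECT COUNT(b.avc_id)
    FROM avc_enr b
    LEFT JOIN perf a ON a.avc_id = b.avc_id AND a.dt = b.dt
    WHERE b.dt = 20220809
""").fetchone()[0]

print(filtered_in_where, filtered_on_preserved)
```

In the original query the grouping on a.dt and a.metrics_date would also put all unmatched customers into a single NULL group, so the date filter and the grouping columns probably both need to come from a source that is never NULL.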

How to select objects if not exist between a period in Sqlite

I would like to select the users who haven't filled out a form in the last 7 days, but I'm stuck. The background: I am working on an app that lists the users who have filled out the form, and I already have a separate query for that which works fine. Now I need to select just the users who haven't filled out the form in the last 7 days.
The query I wrote selects all the users, because everyone has objects that fall outside the period.
How can I select just the users who haven't filled out the form in the given period, without including all users? As you can see in the picture, the user with id 1 appears twice, once with Yes and once with No.
The query I wrote:
SELECT DISTINCT auth_user.id,
CASE WHEN felmeres.date BETWEEN date("now", "-7 day") AND date('now')
THEN 'Yes'
ELSE 'No'
END AS period
FROM felmeres
LEFT JOIN profile ON profile.user_id = felmeres.user_name_id
ORDER BY felmeres.date DESC
You could use a join aggregation approach:
SELECT p.user_id
FROM profile p
INNER JOIN felmeres f
ON f.user_name_id = p.user_id
GROUP BY p.user_id
HAVING SUM(f.date BETWEEN date('now', '-7 day') AND date('now')) = 0;
If profile contains the users' data then it should be the left table in the LEFT join and the condition for the dates should be placed in the ON clause, so that you filter out the matching users:
SELECT p.*
FROM profile p LEFT JOIN felmeres f
ON f.user_name_id = p.user_id AND f.date BETWEEN date(CURRENT_DATE, '-7 day') AND CURRENT_DATE
WHERE f.user_name_id IS NULL;
Or, with NOT EXISTS:
SELECT p.*
FROM profile p
WHERE NOT EXISTS (
SELECT 1
FROM felmeres f
WHERE f.user_name_id = p.user_id AND f.date BETWEEN date(CURRENT_DATE, '-7 day') AND CURRENT_DATE
);
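A quick way to compare the approaches above is to run them on a tiny data set. The sketch below uses Python's sqlite3 with made-up rows and a fixed reference date (the `today` variable is an assumption, used in place of date('now') so the result is reproducible). It also shows one difference worth knowing: the INNER JOIN aggregation cannot return a user who has never filled out any form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile (user_id INTEGER PRIMARY KEY);
    CREATE TABLE felmeres (user_name_id INTEGER, date TEXT);
    INSERT INTO profile VALUES (1), (2), (3);
    -- user 1: one old and one recent form; user 2: only an old form; user 3: none
    INSERT INTO felmeres VALUES (1, '2022-01-01'), (1, '2022-03-09'), (2, '2022-01-15');
""")
today = "2022-03-10"  # fixed reference date instead of date('now')

# Anti-join: the period condition lives in ON, unmatched profiles are kept.
anti_join = [row[0] for row in conn.execute("""
    SELECT p.user_id
    FROM profile p
    LEFT JOIN felmeres f
      ON f.user_name_id = p.user_id
     AND f.date BETWEEN date(:today, '-7 day') AND :today
    WHERE f.user_name_id IS NULL
    ORDER BY p.user_id
""", {"today": today})]

# Aggregation over an INNER JOIN: users without any form never reach HAVING.
aggregated = [row[0] for row in conn.execute("""
    SELECT p.user_id
    FROM profile p
    INNER JOIN felmeres f ON f.user_name_id = p.user_id
    GROUP BY p.user_id
    HAVING SUM(f.date BETWEEN date(:today, '-7 day') AND :today) = 0
    ORDER BY p.user_id
""", {"today": today})]

print(anti_join, aggregated)
```

Which result is wanted depends on whether users with no forms at all should count as "haven't filled out the form in the last 7 days".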

Getting single record in subquery

I have a DB2 database with multiple user records, which are related to a newsletter record through a couple intermediate tables.
My problem is that I need to get just the users where the latest newsletter they received was in the last week. I've been banging my head against this for hours and still haven't found a way to cleanly get the records I need. I thought this would be the solution, but I keep running into very generic errors that I don't really understand the cause of.
SELECT a.*, d.tech_id as newsletter FROM users a
JOIN user_profile b ON b.tech_id = a.profile_id
JOIN user_contact c ON c.user_id = b.tech_id
JOIN (
SELECT newsletters.tech_id, ROW_NUMBER() OVER (ORDER BY timestamp(tech_id) DESC) AS RN
FROM NEWSLETTERS
) d ON d.tech_id = c.newsletter_id
WHERE (timestamp(d.rn) < current_timestamp - 7 days)
Is there a better way to do this, or am I missing an obvious problem?
EDIT:
This is what I'd like to be doing, though it doesn't work right either:
SELECT a.* as newsletter FROM users a
WHERE (
SELECT MAX(timestamp(newsletters.tech_id))
FROM newsletters
WHERE newsletters.tech_id IN(
SELECT newsletter_id FROM user_contact WHERE user_contact.profile_id = a.tech_id
)
) < current_timestamp - 7 days
The structure is pretty straightforward. The users table has a foreign key of profile_id which is keyed to the user_profile.tech_id. The user_contact.user_id field is keyed to the user_profile.tech_id. And the user_contact table has a foreign key called user_contact.newsletter_id that is keyed to the newsletters.tech_id
RN is a row number generated based on the latest timestamp of the column. It should not be compared with dates in the final where clause. However, in your row_number function, you have ordered by timestamp(tech_id) which won't work as expected if tech_id is not a datetime datatype.
As per your requirement, row_number isn't needed.
Try the query below.
SELECT a.*, d.tech_id as newsletter
FROM users a
JOIN user_profile b ON b.tech_id = a.profile_id
JOIN user_contact c ON c.user_id = b.tech_id
JOIN NEWSLETTERS d ON d.tech_id = c.newsletter_id
WHERE (timestamp(d.datecolumn) < current_timestamp - 7 days)
--change the d.datecolumn to the datetime column in the table
It seems that you need to use GROUP BY. Try something like this:
SELECT a.*, e.latest_sent
FROM users a
JOIN (
SELECT b.tech_id AS profile_id, MAX(timestamp(d.datecol)) AS latest_sent
FROM user_profile b
JOIN user_contact c ON c.user_id = b.tech_id
JOIN NEWSLETTERS d ON d.tech_id = c.newsletter_id
GROUP BY b.tech_id
HAVING MAX(timestamp(d.datecol)) < current_timestamp - 7 days
) e ON e.profile_id = a.profile_id
--datecol stands for the datetime column of the NEWSLETTERS table
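To illustrate why the filter on the latest newsletter belongs in HAVING rather than WHERE, here is a rough sketch with Python's sqlite3. The `sent_at` column is hypothetical (the question never names the date column), the sample rows are made up, and `cutoff` stands in for current_timestamp - 7 days. It keeps the `<` comparison from the question's queries; flip it to `>=` if "within the last week" is what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (tech_id INTEGER PRIMARY KEY, profile_id INTEGER);
    CREATE TABLE user_profile (tech_id INTEGER PRIMARY KEY);
    CREATE TABLE user_contact (user_id INTEGER, newsletter_id INTEGER);
    CREATE TABLE newsletters (tech_id INTEGER PRIMARY KEY, sent_at TEXT);  -- sent_at is hypothetical
    INSERT INTO user_profile VALUES (10), (20);
    INSERT INTO users VALUES (1, 10), (2, 20);
    INSERT INTO newsletters VALUES (100, '2023-01-01'), (101, '2023-02-01'), (102, '2023-02-27');
    INSERT INTO user_contact VALUES (10, 100), (10, 102), (20, 100), (20, 101);
""")
cutoff = "2023-02-22"  # plays the role of current_timestamp - 7 days

joins = """
    FROM users a
    JOIN user_profile b ON b.tech_id = a.profile_id
    JOIN user_contact c ON c.user_id = b.tech_id
    JOIN newsletters d ON d.tech_id = c.newsletter_id
"""

# HAVING tests the latest newsletter per user.
latest_is_old = conn.execute(
    "SELECT a.tech_id, MAX(d.sent_at)" + joins +
    "GROUP BY a.tech_id HAVING MAX(d.sent_at) < ? ORDER BY a.tech_id",
    (cutoff,),
).fetchall()

# WHERE tests every newsletter, so any old newsletter qualifies the user.
any_is_old = conn.execute(
    "SELECT DISTINCT a.tech_id" + joins +
    "WHERE d.sent_at < ? ORDER BY a.tech_id",
    (cutoff,),
).fetchall()

print(latest_is_old, any_is_old)
```

User 1 received a newsletter after the cutoff, so only the HAVING version leaves that user out.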

SQL join: selecting last record that meets a condition from the original table

I am new to SQL, so excuse any lapse of notation. A much simplified version of my problem is as follows. I have hospital admissions in table ADMISSIONS and need to collect the most recent outpatient claim of a certain type from table CLAIMS prior to the admission date:
SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN claims b on (a.ID=b.ID) and (a.date>b.claim_date)
LEFT JOIN claims c on ((a.ID=c.ID) and (a.date>c.claim_date))
and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL
The problem is that for some IDs I get many records with duplicate a.date, c.claim_date values.
My problem is similar to one discussed here
SQL join: selecting the last records in a one-to-many relationship
and elaborated on here
SQL Left join: selecting the last records in a one-to-many relationship
However, there is the added wrinkle of looking only for records in CLAIMS that occur prior to a.date and I think that is causing the problem.
Update
Times are not stored, just dates, and since a patient can have multiple records on the same day, it's an issue. There is another wrinkle, which is that I only want to look at a subset of CLAIMS (let's say claims.flag=TRUE). Here's what I tried last:
SELECT a.ID , a.date, b.claim_date
FROM admissions as a
LEFT JOIN (
select d.ID , max(d.claim_date) cdate
from claims as d
where d.flag=TRUE
group by d.ID
) as b on (a.ID=b.ID) and (b.claim_date < a.date)
LEFT JOIN claims c on ((a.ID=c.ID) and (c.claim_date < a.claim_date))
and c.flag=TRUE
and (b.claim_date<c.claim_date or b.claim_date=c.claim_date and b.ID<c.ID)
WHERE c.ID is NULL
However, this ran for a couple of hours before aborting (typically takes about 30 mins with LIMIT 10).
You may want to try using a subquery to solve this problem:
SELECT a.ID, a.date, b.claim_date
FROM admissions as a
LEFT JOIN claims b ON (a.ID = b.ID)
WHERE b.claim_date = (
SELECT MAX(c.claim_date)
FROM claims c
WHERE c.id = a.id -- Assuming that c.id is a foreign key to a.id
AND c.claim_date < a.date -- Claim date is less than admission date
);
An attempt to clarify with different IDs, and using an additional subquery to account for duplicate dates:
SELECT a.ID, a.patient_id, a.date, b.claim_id, b.claim_date
FROM admissions as a
LEFT JOIN claims b ON (a.patient_ID = b.patient_ID)
WHERE b.claim_id = (
SELECT MAX(c.claim_id) -- Max claim identifier (likely most recent if sequential)
FROM claims c
WHERE c.patient_ID = a.patient_ID
AND c.flag = TRUE
AND c.claim_date = (
SELECT MAX(d.claim_date)
FROM claims d
WHERE d.patient_id = c.patient_id
AND d.claim_date < a.date -- Claim date is less than admission date
AND d.flag = TRUE
)
)
AND b.flag = TRUE;
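As a variation on the subquery idea, and not what either answer above wrote, the latest prior claim date can also be fetched with a scalar subquery in the SELECT list. That returns exactly one row per admission, so same-day duplicate claims cannot multiply rows, and admissions without any earlier claim are kept with a NULL. A minimal sketch with Python's sqlite3 and made-up rows (flag is stored as 1/0 here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE admissions (ID INTEGER, date TEXT);
    CREATE TABLE claims (ID INTEGER, claim_date TEXT, flag INTEGER);
    INSERT INTO admissions VALUES (1, '2020-05-10'), (2, '2020-06-01');
    INSERT INTO claims VALUES
        (1, '2020-04-01', 1),
        (1, '2020-05-01', 1),
        (1, '2020-05-01', 1),   -- same-day duplicate
        (1, '2020-05-05', 0),   -- unflagged, ignored
        (1, '2020-06-15', 1);   -- after the admission, ignored
""")

# The correlated scalar subquery yields one value per admission row.
latest_prior = conn.execute("""
    SELECT a.ID, a.date,
           (SELECT MAX(c.claim_date)
            FROM claims c
            WHERE c.ID = a.ID
              AND c.flag = 1
              AND c.claim_date < a.date) AS claim_date
    FROM admissions a
    ORDER BY a.ID
""").fetchall()

for row in latest_prior:
    print(row)
```

If other columns of the claim are needed, a unique claim identifier is still required to pick one of the same-day rows.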

SQL Three table join

I haven't been able to solve this problem for several days now and I'm hoping you can help.
I'm trying to write a query that returns all the information about a stock and the last time it was updated. I would like to filter the results based on the parameter #date and return only the stocks whose latest update is earlier than the supplied #date parameter. I also need the stocks with a null timestamp, so I know that these stocks need to be updated. I have the following three tables that I'm working with:
stocks
- id
- asset_id
- market_id
- name
- symbol
- IPOYear
- sector
- industry
updates
- id
- [timestamp]
stock_updates
- stock_id
- update_id
I've been using the following query, and it was working well for me until I realized it doesn't work if the stock doesn't have an update:
select * from stocks s
where #date < (
select top 1 u.timestamp from
updates u,
stock_updates su
where
s.id = su.stock_id and
u.id = su.update_id
order by u.timestamp desc
)
So after some research I came across outer joins, and I think they are what I need to fix my problem; I just haven't been able to construct the correct query. The closest I've come is the following, but it returns a record for each time the stock was updated. Thanks in advance for your help!
This is where I'm at now:
select * from stocks s
left outer join stock_updates su on s.id = su.stock_id
left outer join updates u on u.id = su.update_id
where u.[timestamp] < #date
select s.*, u.timestamp
from stocks s
left join
(select su.stock_id, MAX(u.timestamp) timestamp
from updates u
inner join stock_updates su
on u.id = su.update_id
group by su.stock_id
) as u
on s.id = u.stock_id
where u.[timestamp] is null or u.[timestamp] < #date
Something like this perhaps?
SELECT s.*, v.timestamp
FROM stocks s
LEFT JOIN (
SELECT MAX(u.timestamp) AS timestamp, su.stock_id
FROM stock_updates su
INNER JOIN updates u ON (u.id = su.update_id)
GROUP BY su.stock_id
) v ON (v.stock_id = s.id)
Basically it just joins the stocks table to an "inline view" that is the result of a query to determine the maximum timestamp for each stock_id.
I haven't included any filtering by the #date parameter, as your question states
"I'm trying to write a query that returns all the information about a
stock and the last time it was updated"
and for that you don't require any filtering.
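For anyone who wants to try the inline-view join on toy data, here is a minimal sketch with Python's sqlite3. The sample rows are invented, the stocks table is cut down to two columns, and `cutoff` is a stand-in for the #date parameter. It adds the filter from the question on top of the join: stocks whose latest update is before the cutoff, plus stocks that were never updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stocks (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE updates (id INTEGER PRIMARY KEY, timestamp TEXT);
    CREATE TABLE stock_updates (stock_id INTEGER, update_id INTEGER);
    -- illustrative sample data: CCC has never been updated
    INSERT INTO stocks VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
    INSERT INTO updates VALUES (1, '2012-01-01'), (2, '2012-03-01');
    INSERT INTO stock_updates VALUES (1, 1), (1, 2), (2, 1);
""")
cutoff = "2012-02-01"  # plays the role of the #date parameter

# The derived table has at most one row per stock, so the left join
# cannot produce a record for each individual update.
due = conn.execute("""
    SELECT s.symbol, v.last_update
    FROM stocks s
    LEFT JOIN (SELECT su.stock_id, MAX(u.timestamp) AS last_update
               FROM stock_updates su
               INNER JOIN updates u ON u.id = su.update_id
               GROUP BY su.stock_id) v ON v.stock_id = s.id
    WHERE v.last_update IS NULL OR v.last_update < ?
    ORDER BY s.id
""", (cutoff,)).fetchall()

for row in due:
    print(row)
```

AAA is left out because its latest update is after the cutoff, even though it also has an older one.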
This query does exactly that:
select s.*, dr.maxtime
from stocks s
left join (select MAX(u.timestamp) as maxtime, su.stock_id
from stock_updates su inner join updates u on u.id = su.update_id
group by su.stock_id) dr
on dr.stock_id = s.id
where
maxtime < #date or maxtime is null
[BTW: left join is the same as left outer join]
Try this
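select s.id, s.asset_id, s.market_id, s.name, s.symbol, s.IPOYear, s.sector, s.industry, max(u.[timestamp]) as last_update
from
stocks s
left outer join
stock_updates su
on (s.id = su.stock_id)
left outer join
updates u
on (u.id = su.update_id)
group by s.id, s.asset_id, s.market_id, s.name, s.symbol, s.IPOYear, s.sector, s.industry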
It's written off the top of my head. What do you refer to with #date? Does that mean "now"? Do you mean the latest timestamp, or the latest before #date?