I'm having trouble with COUNT() values when joining tables in SQL

I have two independent tables, tbl_timesheet and tbl_absence. tbl_timesheet will have a row every day that an employee logs into a system. tbl_absence is a single row for a unique instance of absence, where the employee isn't in work. Each table looks like:
tbl_timesheet:
Staff_ID DEPT LOG_DATE
001 IT 2020-09-01
002 HR 2020-09-01
003 SALES 2020-09-01
001 IT 2020-09-02
002 HR 2020-09-02
003 SALES 2020-09-02
001 IT 2020-09-03
002 HR 2020-09-03
003 SALES 2020-09-03
tbl_absence:
Staff_ID ABSENCE_DATE
001 2020-09-10
003 2020-09-15
003 2020-09-22
I want to join the two tables, where I can count the instances of absence. I've attempted to do this using the following script:
SELECT t.Staff_ID as ID, t.DEPT as Dept, COUNT(a.Staff_ID) as 'Instances'
FROM tbl_timesheet t
JOIN tbl_absence a
ON t.Staff_ID = a.Staff_ID
GROUP BY t.Staff_ID, t.DEPT
I'd expect the following:
ID Dept Instances
001 IT 1
003 SALES 2
However due to the join between the tables, I believe the Staff_ID is being duplicated because each appears multiple times in tbl_timesheet.
Any suggestions?

When you JOIN the two tables before reducing tbl_timesheet to distinct Staff_ID and DEPT values, every absence row is repeated once for each matching timesheet row, which multiplies the count. For example, Staff_ID '003' has 2 rows in the absence table and 3 rows in the timesheet table, so the join produces 6 rows for that employee. Join against the distinct Staff_ID and DEPT pairs instead:
SELECT
    t.Staff_ID AS ID,
    t.DEPT AS Dept,
    COUNT(a.Staff_ID) AS Instances
FROM tbl_absence a
JOIN (SELECT DISTINCT Staff_ID, DEPT FROM tbl_timesheet) t
    ON t.Staff_ID = a.Staff_ID
GROUP BY t.Staff_ID, t.DEPT
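To see the multiplication concretely, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and runs both the original join and the DISTINCT version. SQLite and Python are assumptions made only to keep the example runnable; the question does not say which database is in use.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_timesheet (Staff_ID TEXT, DEPT TEXT, LOG_DATE TEXT);
CREATE TABLE tbl_absence (Staff_ID TEXT, ABSENCE_DATE TEXT);
INSERT INTO tbl_timesheet VALUES
  ('001','IT','2020-09-01'), ('002','HR','2020-09-01'), ('003','SALES','2020-09-01'),
  ('001','IT','2020-09-02'), ('002','HR','2020-09-02'), ('003','SALES','2020-09-02'),
  ('001','IT','2020-09-03'), ('002','HR','2020-09-03'), ('003','SALES','2020-09-03');
INSERT INTO tbl_absence VALUES
  ('001','2020-09-10'), ('003','2020-09-15'), ('003','2020-09-22');
""")

# Original query: each absence row is repeated once per timesheet row.
naive = conn.execute("""
    SELECT t.Staff_ID, t.DEPT, COUNT(a.Staff_ID)
    FROM tbl_timesheet t
    JOIN tbl_absence a ON t.Staff_ID = a.Staff_ID
    GROUP BY t.Staff_ID, t.DEPT
    ORDER BY t.Staff_ID
""").fetchall()

# Fixed query: collapse the timesheet to one row per employee before joining.
deduped = conn.execute("""
    SELECT t.Staff_ID, t.DEPT, COUNT(a.Staff_ID)
    FROM tbl_absence a
    JOIN (SELECT DISTINCT Staff_ID, DEPT FROM tbl_timesheet) t
      ON t.Staff_ID = a.Staff_ID
    GROUP BY t.Staff_ID, t.DEPT
    ORDER BY t.Staff_ID
""").fetchall()

print(naive)    # [('001', 'IT', 3), ('003', 'SALES', 6)]
print(deduped)  # [('001', 'IT', 1), ('003', 'SALES', 2)]
```

Note that an employee who changed department would appear once per department in the DISTINCT subquery, so their absences would be reported under each department.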

Related

Trying to count unique observations in SQL using Partition By

I have these two datasets:
Conditions: I would like to count the number of Unique Discharge_ID as Total_Discharges in my final dataset.
ICU_ID is a little bit more difficult. For PT_ID 001, what is happening is that PT 001 has 4 of the same discharge dates but 4 unique ICU_IDs. Since all of these ICU_IDs occur within 30 days of the Discharge_DT, I only want to count one of them. That is why total discharges for AZ is 1 and ICU_Admits = 1.
For PT_ID 002, I have 2 different Discharge_IDs but 1 ICU Admit that occurred within 30 days of both of the Discharge_IDs. I would like to count the Discharges as 2, and ICU_admits as 1.
DF1: Dataset of Discharges from hospital and admission to ICU within 30 days of Discharge_DT
City|PT_ID|Hospital_ID|Admit_Dt  |Discharge_DT|Discharge_ID                 |ICU_ID
----+-----+-----------+----------+------------+-----------------------------+-----------------------------
AZ  |001  |ABC        |01-01-2021|01-03-2021  |001,ABC,01-01-2021,01-03-2021|001,XYZ,01-05-2021,01-06-2021
AZ  |001  |ABC        |01-01-2021|01-03-2021  |001,ABC,01-01-2021,01-03-2021|001,XYZ,01-08-2021,01-09-2021
AZ  |001  |ABC        |01-01-2021|01-03-2021  |001,ABC,01-01-2021,01-03-2021|001,XYZ,01-11-2021,01-11-2021
AZ  |001  |ABC        |01-01-2021|01-03-2021  |001,ABC,01-01-2021,01-03-2021|001,XYZ,01-15-2021,01-16-2021
CA  |002  |DEF        |04-03-2021|04-07-2021  |001,ABC,04-03-2021,04-07-2021|002,LMN,04-27-2021,04-27-2021
CA  |002  |DEF        |04-20-2021|04-21-2021  |001,ABC,04-20-2021,04-21-2021|002,LMN,04-27-2021,04-27-2021
DF desired:
City|TotalDischarges|ICU_Admit
----+---------------+---------
AZ  |1              |1
CA  |2              |1
Current Code:
DROP TABLE IF EXISTS #edit1
WITH CTE_df1 as (
select * from df1
)
select
City,
PT_ID,
Hospital_ID,
Admit_Dt,
Discharge_DT,
Discharge_ID,
count(ICU_ID) over (partition by ICU_ID) as ICU_Pts,
count(distinct Discharge_ID) as Total_Discharges
into #edit1
from CTE_df1
group by City, Discharge_ID, ICU_ID, PT_ID
order by City
;with CTE_edit1 as (
select * from #edit1
)
select City, sum(ICU_Pts), sum(Total_Discharges)
from CTE_edit1
group by City
order by City
Current Output: PT_ID 001 works as intended, but PT_ID 002 shows up as 2 in ICU_Admit because the same ICU visit is counted once for each of its two discharges.
City|TotalDischarges|ICU_Admit
----+---------------+---------
AZ  |1              |1
CA  |2              |2
Any help would be appreciated
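One possible reading of the rule described above is: at most one ICU admission counts per discharge, and an ICU admission shared by several discharges counts only once. The sketch below is an illustration under that assumption, not a confirmed solution. It keeps one representative ICU_ID per Discharge_ID (MIN is an arbitrary choice) and then counts distinct values per city. It runs on an in-memory SQLite database from Python purely so it is self-contained; the original code is T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE df1 (City TEXT, PT_ID TEXT, Hospital_ID TEXT,
                Admit_Dt TEXT, Discharge_DT TEXT, Discharge_ID TEXT, ICU_ID TEXT)""")

# Sample rows copied from DF1 in the question.
rows = [
    ("AZ", "001", "ABC", "01-01-2021", "01-03-2021", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-05-2021,01-06-2021"),
    ("AZ", "001", "ABC", "01-01-2021", "01-03-2021", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-08-2021,01-09-2021"),
    ("AZ", "001", "ABC", "01-01-2021", "01-03-2021", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-11-2021,01-11-2021"),
    ("AZ", "001", "ABC", "01-01-2021", "01-03-2021", "001,ABC,01-01-2021,01-03-2021", "001,XYZ,01-15-2021,01-16-2021"),
    ("CA", "002", "DEF", "04-03-2021", "04-07-2021", "001,ABC,04-03-2021,04-07-2021", "002,LMN,04-27-2021,04-27-2021"),
    ("CA", "002", "DEF", "04-20-2021", "04-21-2021", "001,ABC,04-20-2021,04-21-2021", "002,LMN,04-27-2021,04-27-2021"),
]
conn.executemany("INSERT INTO df1 VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

summary = conn.execute("""
    WITH per_discharge AS (
        -- Assumption: keep one representative ICU_ID per discharge.
        SELECT City, Discharge_ID, MIN(ICU_ID) AS one_icu
        FROM df1
        GROUP BY City, Discharge_ID
    )
    SELECT City,
           COUNT(DISTINCT Discharge_ID) AS Total_Discharges,
           COUNT(DISTINCT one_icu)      AS ICU_Admits
    FROM per_discharge
    GROUP BY City
    ORDER BY City
""").fetchall()

print(summary)  # [('AZ', 1, 1), ('CA', 2, 1)]
```

MIN(ICU_ID) compares the IDs as text, so it picks a stable representative rather than the chronologically first admission. If the earliest admission matters, the ICU admit date would need to be stored in a sortable column.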

How to count teams and members

I have a table like this:
|personid| supervisorid| date_in|
Each person can have only one supervisor (who is also a person and can in turn have a supervisor).
I'd like to compute:
the number of "teams", defined as the number of persons without a supervisor who have at least two persons under them;
the number of members in each team;
the number of "active teams", defined as teams with a new person added less than X days ago.
Thanks in advance for your help.
DATA:
personid|supervisorid |datein
--------+---------------+----------
001 |NA |01/09/2020
002 |001 |01/09/2020
003 |001 |01/09/2020
004 |003 |01/09/2020
005 |003 |01/09/2020
006 |003 |01/09/2020
007 |003 |01/09/2020
008 |NA |01/09/2020
009 |008 |01/01/1990
010 |008 |01/01/1990
011 |NA |01/01/1990
012 |011 |01/01/1990
Result:
-number of teams:2
-members per team:
supervisor team|num_members
---------------+-----------
001 |7
008 |3
-active teams in the last 30 days: 1 (supervisorid=001)
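If you have SAS, you can use PROC SQL. Note that the queries below count only the direct reports of each top-level person, not the whole hierarchy beneath them.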
*the number of "teams" that should be the number of person without a supervisor that have at least two persons under them;
select count(distinct a.personid) as teams
from (select personid from yourtable where supervisorid='NA' group by 1) a
inner join (select supervisorid, count(distinct personid) as num_members from yourtable group by 1 having num_members>1) b on a.personid=b.supervisorid ;
*the number of members of persons for each teams;
select a.personid as supervisorteam, b.num_members
from (select personid from yourtable where supervisorid='NA' group by 1) a
inner join (select supervisorid, count(distinct personid) as num_members from yourtable group by 1 having num_members>1) b on a.personid=b.supervisorid ;
*the number of "active teams" defined as teams with a new person added less than X days ago (Xdaysago stands for the cutoff date);
select count(distinct a.personid) as active
from (select personid from yourtable where supervisorid='NA' group by 1) a
inner join (select supervisorid, count(distinct personid) as num_members from yourtable group by 1 having num_members>1 and max(datein)>Xdaysago) b on a.personid=b.supervisorid ;
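The expected result in the question (7 members for team 001) includes the supervisor and everyone below them at any depth, which needs a walk down the hierarchy. The sketch below is one way to do that with a recursive CTE, run on an in-memory SQLite database from Python so it is self-contained. Several things here are assumptions: the table name people, reading the dates as day/month/year and storing them in ISO format, and the reference date of 2020-09-15 used for the 30-day window.

```python
import sqlite3
from datetime import date, timedelta

# Sample data from the question; dates assumed to be dd/mm/yyyy, stored as ISO text.
rows = [
    ("001", "NA", "2020-09-01"), ("002", "001", "2020-09-01"), ("003", "001", "2020-09-01"),
    ("004", "003", "2020-09-01"), ("005", "003", "2020-09-01"), ("006", "003", "2020-09-01"),
    ("007", "003", "2020-09-01"), ("008", "NA", "2020-09-01"), ("009", "008", "1990-01-01"),
    ("010", "008", "1990-01-01"), ("011", "NA", "1990-01-01"), ("012", "011", "1990-01-01"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (personid TEXT, supervisorid TEXT, datein TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

teams = conn.execute("""
    WITH RECURSIVE team(root, personid, datein) AS (
        -- Start from every person without a supervisor.
        SELECT personid, personid, datein FROM people WHERE supervisorid = 'NA'
        UNION ALL
        -- Add everyone whose supervisor is already in the team.
        SELECT t.root, p.personid, p.datein
        FROM people p JOIN team t ON p.supervisorid = t.personid
    )
    SELECT root,
           COUNT(*) AS num_members,  -- includes the top-level person
           MAX(CASE WHEN personid <> root THEN datein END) AS last_added
    FROM team
    GROUP BY root
    HAVING COUNT(*) - 1 >= 2         -- at least two persons under the root
    ORDER BY root
""").fetchall()

# Hypothetical reference date; a team is active if someone joined in the last 30 days.
as_of = date(2020, 9, 15)
cutoff = (as_of - timedelta(days=30)).isoformat()
active = [root for root, _, last_added in teams if last_added >= cutoff]

print(len(teams))  # 2
print(teams)       # [('001', 7, '2020-09-01'), ('008', 3, '1990-01-01')]
print(active)      # ['001']
```

The top-level person's own date is excluded from last_added, which matches the question's result where team 008 is not active even though 008 itself has a recent date.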

SQL Consecutive Days - Oracle

[Data]
[Emp] [Emp_group] [Date_purchase]
1 001 12-jan-2016
1 001 13-jan-2016
1 001 19-jan-2016
1 003 14-jan-2016
2 004 21-feb-2016
2 004 22-feb-2016
2 004 23-feb-2016
3 005 01-apr-2016
Need SQL to find consecutive purchase dates. Emp (1) of emp group (001) has purchased consecutively on 12 and 13 of January. Emp and Emp group partition must be considered.
Just use lag()/lead():
select t.*
from (select t.*,
             lag(date_purchase) over (partition by emp, emp_group order by date_purchase) as prev_dp,
             lead(date_purchase) over (partition by emp, emp_group order by date_purchase) as next_dp
      from t
     ) t
where date_purchase in (prev_dp + 1, next_dp - 1);
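As a runnable sketch of the lag()/lead() approach, the example below keeps a row when its date is exactly one day after the previous purchase or one day before the next purchase within the same emp and emp_group. It uses an in-memory SQLite database from Python instead of Oracle, which is an assumption made only for portability: dates are stored as ISO text and shifted with SQLite's date() function, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (emp INTEGER, emp_group TEXT, date_purchase TEXT)")
# Sample rows from the question, with dates rewritten in ISO format.
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (1, "001", "2016-01-12"), (1, "001", "2016-01-13"), (1, "001", "2016-01-19"),
    (1, "003", "2016-01-14"),
    (2, "004", "2016-02-21"), (2, "004", "2016-02-22"), (2, "004", "2016-02-23"),
    (3, "005", "2016-04-01"),
])

consecutive = conn.execute("""
    SELECT emp, emp_group, date_purchase
    FROM (SELECT t.*,
                 LAG(date_purchase)  OVER (PARTITION BY emp, emp_group
                                           ORDER BY date_purchase) AS prev_dp,
                 LEAD(date_purchase) OVER (PARTITION BY emp, emp_group
                                           ORDER BY date_purchase) AS next_dp
          FROM t) AS x
    -- date(NULL, ...) is NULL, so rows with no neighbour never match.
    WHERE date_purchase = date(prev_dp, '+1 day')
       OR date_purchase = date(next_dp, '-1 day')
    ORDER BY emp, emp_group, date_purchase
""").fetchall()

for row in consecutive:
    print(row)
# (1, '001', '2016-01-12')
# (1, '001', '2016-01-13')
# (2, '004', '2016-02-21')
# (2, '004', '2016-02-22')
# (2, '004', '2016-02-23')
```

The purchase on 14 January belongs to emp_group 003, so it is not treated as consecutive with the 13 January purchase in emp_group 001.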

insert duplicate rows in temp table

I'm new to SQL and to this forum, and I need help inserting duplicate rows into a temp table. I would like to create a view as the result.
View - Employee:
Name Empid Status Manager Dept StartDate EndDate
AAA 111 Active A111 Cashier 2015-01-01 2015-05-01
AAA 111 Active A222 Sales 2015-05-01 NULL
I don't know how to write a function, but do have a DATE table.
Date Table: (365 days) goes up to 2018
Date Fiscal_Wk Fiscal_Mon Fiscal_Yr
2015-01-01 1 1 2015
Result inquiry
How do I duplicate each record from Employee for every calendar date from its start date onward, for the entire calendar year?
Result:
Name Empid Status Manager Dept Date FW FM FY
AAA 111 Active A111 Cashier 2015-01-01 1 1 2015
AAA 111 Active A111 Cashier 2015-01-02 1 1 2015
... and so on
AAA 111 Active A222 Sales 2015-05-01 18 5 2015
AAA 111 Active A222 Sales 2015-05-02 18 5 2015
... and so on
Thanks in advance,
Quinn
Select * from Employee cross join Calendar
This joins every record in Calendar to every record in Employee.
So if there are 2 records in Employee and 10 in Calendar, you'll end up with 20 in total, 10 for each Employee record.
What you are looking for is a join operation. However, the condition for the join is not equality, because you want every calendar row whose date falls between two values (the start date and the end date).
The basic idea is:
select e.*, c.date
from employee e join
     calendar c
     on c.date >= e.startdate and
        (e.enddate is null or c.date <= e.enddate);
Modified query, which returns rows for both the previous and the most recent records:
select e.*, c.date, c.FW, c.FM, c.FY
from employee e
join calendar c
on e.startdate <= c.date and
ISNULL(e.enddate,GETDATE()) > c.date
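To check how the range join behaves on the two Employee rows from the question, here is a small sketch on an in-memory SQLite database driven from Python. The setup is assumed for illustration: the calendar table holds only a date column covering 2015, the fiscal columns are left out, and COALESCE(e.enddate, date('now')) stands in for the SQL Server expression ISNULL(e.enddate, GETDATE()).

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, empid INTEGER, status TEXT, manager TEXT,
                       dept TEXT, startdate TEXT, enddate TEXT);
CREATE TABLE calendar (date TEXT);
INSERT INTO employee VALUES
  ('AAA', 111, 'Active', 'A111', 'Cashier', '2015-01-01', '2015-05-01'),
  ('AAA', 111, 'Active', 'A222', 'Sales',   '2015-05-01', NULL);
""")

# One calendar row per day of 2015 (assumed range for this example).
days = [(date(2015, 1, 1) + timedelta(days=i)).isoformat() for i in range(365)]
conn.executemany("INSERT INTO calendar VALUES (?)", [(d,) for d in days])

per_dept = conn.execute("""
    SELECT e.dept, COUNT(*), MIN(c.date), MAX(c.date)
    FROM employee e
    JOIN calendar c
      ON e.startdate <= c.date
     AND COALESCE(e.enddate, date('now')) > c.date  -- open-ended rows run to today
    GROUP BY e.dept
    ORDER BY MIN(c.date)
""").fetchall()

for row in per_dept:
    print(row)
# ('Cashier', 120, '2015-01-01', '2015-04-30')
# ('Sales', 245, '2015-05-01', '2015-12-31')
```

Because the end date is compared with a strict >, 2015-05-01 is assigned only to the Sales record and no calendar date is produced twice.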

Last Invoice using Postgres

I have a Postgres 9.1 database with three tables - Customer, Invoice, and Line_Items
I want to create a customer list showing the customer and last invoice date for any customer with a specific item (specifically all invoices that have the line_items.code beginning with 'L3').
First, I am trying to pull one transaction for each customer (the last invoice with the 'L3' code), figuring I can JOIN the customer names once this list is created.
Tables are something like this:
Customers
cust_number  last_name  first_name
===========  =========  ==========
1            Smith      John
2            Jones      Paul
3            Jackson    Mary
4            Brown      Phil
Transactions
trans_number  date        cust_number
============  ==========  ===========
1001          2014-01-01  1
1002          2014-02-01  4
1003          2014-03-02  2
1004          2014-03-06  3
Line_Items
trans_number  date        item_code
============  ==========  =========
1001          2014-01-01  L3000
1001          2014-01-01  M2420
1001          2014-01-01  L3500
1002          2014-02-01  M2420
1003          2014-03-02  M2420
1004          2014-03-06  L3000
So far, I have:
Select transactions.cust_number, transactions.trans_number
from transactions
where transactions.trans_number in
( SELECT Line_Items.trans_number
FROM Line_Items
WHERE Line_Items.item_code ilike 'L3%'
ORDER BY line_items.date DESC
)
order by transactions.pt_number
This pulls all the invoices for each customer with an 'L3' code on the invoice, but I can't figure out how to just have the last invoice.
Use DISTINCT ON:
SELECT DISTINCT ON (t.cust_number)
t.cust_number, t.trans_number
FROM line_items l
JOIN transactions t USING (trans_number)
WHERE l.item_code ILIKE 'L3%'
ORDER BY t.cust_number, l.date DESC;
This returns at most one row per cust_number: the transaction with the latest line_items.date among those containing an 'L3' item. You can add more columns to the SELECT list freely.
Detailed explanation:
Select first row in each GROUP BY group?
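DISTINCT ON is specific to Postgres. For a runnable illustration of the same "first row per group" idea, the sketch below swaps in a different but equivalent technique, ROW_NUMBER() with a filter on rank 1, and runs it on an in-memory SQLite database from Python (SQLite 3.25 or later). Transaction 1005 is a hypothetical extra row, not part of the question's data, added so that customer 1 has two 'L3' invoices to choose between. SQLite's LIKE is case-insensitive for ASCII, so it stands in for ILIKE here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (trans_number INTEGER, date TEXT, cust_number INTEGER);
CREATE TABLE line_items (trans_number INTEGER, date TEXT, item_code TEXT);
INSERT INTO transactions VALUES
  (1001, '2014-01-01', 1), (1002, '2014-02-01', 4),
  (1003, '2014-03-02', 2), (1004, '2014-03-06', 3),
  (1005, '2014-04-01', 1);   -- hypothetical row, for illustration only
INSERT INTO line_items VALUES
  (1001, '2014-01-01', 'L3000'), (1001, '2014-01-01', 'M2420'),
  (1001, '2014-01-01', 'L3500'), (1002, '2014-02-01', 'M2420'),
  (1003, '2014-03-02', 'M2420'), (1004, '2014-03-06', 'L3000'),
  (1005, '2014-04-01', 'L3000'); -- hypothetical row, for illustration only
""")

last_invoice = conn.execute("""
    SELECT cust_number, trans_number
    FROM (SELECT t.cust_number, t.trans_number,
                 -- Rank each customer's matching line items, newest first.
                 ROW_NUMBER() OVER (PARTITION BY t.cust_number
                                    ORDER BY l.date DESC) AS rn
          FROM line_items l
          JOIN transactions t USING (trans_number)
          WHERE l.item_code LIKE 'L3%') AS ranked
    WHERE rn = 1
    ORDER BY cust_number
""").fetchall()

print(last_invoice)  # [(1, 1005), (3, 1004)]
```

Customers 2 and 4 have no 'L3' line items, so they do not appear in the result.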
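You could also use MAX, grouping by customer to get the last 'L3' invoice date:
SELECT t.cust_number, MAX(l.date) AS last_date
FROM line_items l
JOIN transactions t USING (trans_number)
WHERE l.item_code ILIKE 'L3%'
GROUP BY t.cust_number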