Getting records from three tables with sometimes unmatching fields - sql

I have a question that should be easy to answer...
I have a table called Projects, primary key is ProjectId.
The second table is ProjectResources, with ProjectId as a foreign key plus fields for the user and hours (it represents the users assigned to work on the project and their budgeted hours).
The third table is TimesheetEntries (where users record the hours they actually worked on a project), with ProjectId as a foreign key and a User field.
I need to show ProjectId, BudgetedHours (from the ProjectResources table) and ActualHours (from the TimesheetEntries table), and the result must cover all of the following cases:
a user was assigned to the project but did not work on it (in this case the budgeted hours should have a value and the actual hours should have zero)
a user was not assigned to the project but has nonetheless worked on it (in which case the BudgetedHours should be zero and the ActualHours should have a value)
a user was both assigned to the project and has worked on it (both BudgetedHours and ActualHours have values)
Could somebody direct me to a T-SQL statement to get this kind of result?

You could try something like:
SELECT p.ProjectId,
COALESCE(pr.[User], te.[User]) AS [User],
ISNULL(pr.BudgetedHours, 0.0) AS BudgetedHours,
ISNULL(SUM(te.ActualHours), 0.0) AS ActualHours
FROM ProjectResources pr
-- FULL OUTER JOIN keeps users who are only assigned and users who only logged time
FULL OUTER JOIN TimesheetEntries te
ON te.ProjectId = pr.ProjectId AND te.[User] = pr.[User]
JOIN Projects p ON p.ProjectId = COALESCE(pr.ProjectId, te.ProjectId)
GROUP BY p.ProjectId, COALESCE(pr.[User], te.[User]), pr.BudgetedHours
But I'm guessing at your field names, etc. so you would need to adapt this for your own domain.

I think you are having problems because you are treating the actual hours and the budgeted hours as if they were the same kind of data, since both are attached to the project.
However, actual and budgeted hours come from different tables and are calculated in very different ways. You need to calculate each in its own subquery and then combine the two, as in the following example:
select coalesce(budget.ProjectId, actual.ProjectId) as ProjectId,
coalesce(budget.BudgetedHours, 0) as BudgetedHours,
coalesce(actual.ActualHours, 0) as ActualHours
from
(
select p.ProjectId, sum(pr.hours) as BudgetedHours
from Projects p
left outer join ProjectResources pr
on pr.ProjectId = p.ProjectId
group by p.ProjectId
) budget
full outer join
(
select p.ProjectId, sum(tse.hours) as ActualHours
from Projects p
left outer join TimeSheetEntries tse
on tse.ProjectId = p.ProjectId
group by p.ProjectId
) actual
on budget.ProjectId = actual.ProjectId
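If you need the figures per user rather than per project, one possible way to cover all three cases without FULL OUTER JOIN is to stack both tables with UNION ALL and aggregate once. This is only a minimal sketch, run on SQLite through Python's sqlite3 module so it can be tried anywhere; the UserName and Hours column names and the sample rows are assumptions, not your real schema.

```python
import sqlite3

# In-memory database with hypothetical column names (UserName, Hours).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Projects (ProjectId INTEGER PRIMARY KEY);
CREATE TABLE ProjectResources (ProjectId INTEGER, UserName TEXT, Hours REAL);
CREATE TABLE TimesheetEntries (ProjectId INTEGER, UserName TEXT, Hours REAL);
INSERT INTO Projects VALUES (1);
-- ann: assigned only; bob: assigned and worked; cy: worked only
INSERT INTO ProjectResources VALUES (1, 'ann', 10), (1, 'bob', 5);
INSERT INTO TimesheetEntries VALUES (1, 'bob', 2), (1, 'bob', 1.5), (1, 'cy', 4);
""")

# Each source contributes rows with a zero in the "other" column,
# so a single GROUP BY yields budgeted and actual hours side by side.
QUERY = """
SELECT ProjectId, UserName,
       SUM(BudgetedHours) AS BudgetedHours,
       SUM(ActualHours)   AS ActualHours
FROM (
    SELECT ProjectId, UserName, Hours AS BudgetedHours, 0.0 AS ActualHours
    FROM ProjectResources
    UNION ALL
    SELECT ProjectId, UserName, 0.0, Hours
    FROM TimesheetEntries
) AS combined
GROUP BY ProjectId, UserName
ORDER BY ProjectId, UserName
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The inner SELECT ... UNION ALL ... GROUP BY part is plain SQL and should carry over to T-SQL largely unchanged, apart from your own table and column names.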

Related

Postgres Join 3 Tables(Timestamp)

I have 3 tables and need multiple values from them: pname (project), lead (project), author (changegroup) and updated (jiraissue) / created (changegroup). pname and lead are the project name and the leader of the project. Author is whoever last changed the ticket, and updated and created are the timestamps of that action. However, updated and created are often not the same, so I can't use them to filter. I have no problem getting pname, lead, and updated, but I need the author too.
SELECT pname, lead, MAX(updated) FROM project join jiraissue on project.id=jiraissue.project GROUP BY(pname, lead);
That is the query that gets pname, lead, and updated. It looks at all tickets of a project and finds the one that was edited most recently; I take that timestamp as the time the whole project was last edited. But I couldn't write a query that returns the author too. Can you help me?
Project Table:
The project itself, with project name and lead | project.id=jiraissue.project
Jira Issue Table:
jiraissue holds all tickets of the projects | project.id=jiraissue.project, jiraissue.updated usually equals changegroup.created, changegroup.issueid=jiraissue.id
Changegroup Table:
changegroup lists every time a ticket was edited | changegroup.issueid=jiraissue.id, changegroup.created=jiraissue.updated
Your explanation left me confused. But here is my educated guess:
SELECT p.pname, p.lead, c.author, j.updated, c.created
FROM project p
LEFT JOIN (
SELECT DISTINCT ON (j.project)
j.project, j.updated, j.id
FROM jiraissue j
ORDER BY j.project, j.updated DESC NULLS LAST
) j ON j.project = p.id
LEFT JOIN changegroup c ON c.issueid = j.id;
Another case for DISTINCT ON. See:
Select first row in each GROUP BY group?
Eliminating unneeded rows before joining should be the fastest way.
I use 2x LEFT JOIN to retain projects without any Jira issues in the result.
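DISTINCT ON is Postgres-specific. For comparison, here is a rough portable sketch of the same "latest row per group" idea using correlated subqueries with ORDER BY ... LIMIT 1, run on SQLite via Python's sqlite3 module. It also keeps only the newest changegroup row per issue, which may or may not be what you want. The changegroup.id column and all sample data are assumptions on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (id INTEGER PRIMARY KEY, pname TEXT, lead TEXT);
CREATE TABLE jiraissue (id INTEGER PRIMARY KEY, project INTEGER, updated TEXT);
-- changegroup.id is an assumed surrogate key
CREATE TABLE changegroup (id INTEGER PRIMARY KEY, issueid INTEGER,
                          author TEXT, created TEXT);
INSERT INTO project VALUES (1, 'Alpha', 'ann'), (2, 'Beta', 'bob');
INSERT INTO jiraissue VALUES (10, 1, '2024-01-01'), (11, 1, '2024-03-01');
INSERT INTO changegroup VALUES
    (100, 11, 'carl', '2024-02-01'),
    (101, 11, 'dana', '2024-03-01'),
    (102, 10, 'erin', '2024-01-01');
""")

# For each project: join only its most recently updated issue, then only
# that issue's most recent changegroup row. LEFT JOINs keep projects
# that have no issues at all.
QUERY = """
SELECT p.pname, p.lead, j.updated, c.author, c.created
FROM project p
LEFT JOIN jiraissue j
       ON j.id = (SELECT j2.id
                  FROM jiraissue j2
                  WHERE j2.project = p.id
                  ORDER BY j2.updated DESC, j2.id DESC
                  LIMIT 1)
LEFT JOIN changegroup c
       ON c.id = (SELECT c2.id
                  FROM changegroup c2
                  WHERE c2.issueid = j.id
                  ORDER BY c2.created DESC, c2.id DESC
                  LIMIT 1)
ORDER BY p.pname
"""

latest = conn.execute(QUERY).fetchall()
for row in latest:
    print(row)
```

On Postgres the DISTINCT ON version will usually be shorter; this form is just a fallback sketch for engines that lack it.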

Left Join on three tables in access with Group by

I have been racking my brain over a syntax error from the Access Jet engine.
I have three tables.
The first one, "tblMstItem", is the master table for item details. It contains two columns: "colItemID" (PK) and "colItemName".
The second one, "tblStocks", is where purchases are maintained. Its column "colRQty" holds the quantity purchased of a particular item; "colItemID" is an FK from "tblMstItem".
The third one, "tblSales", is where sales are maintained. Its column "colSoldQty" holds the quantity sold of a particular item; "colItemID" is an FK from "tblMstItem".
So "colItemID" is common to all three tables, and the relationships have been created.
I need every item in "tblMstItem" listed with "colItemID" and "colItemName", plus the total quantity purchased and the total quantity sold for that item, if any.
I used LEFT JOIN as shown in the following select statement, but it always gives me an error message.
Select statement as follows:
SELECT
i.colItemID,
i.colItemName,
s.rqty,
n.soldqty
from tblMstItem i
left join
( select sum( colRQty ) as rqty from tblStocks group by colItemID ) s
on i.colItemID = s.colItemID
left join
( select sum( colSoldQty ) as soldqty from tblSales group by colItemID ) n
on i.colItemID=n.colItemID
I tried the code above with many different syntax variations, but every time I get a syntax error. It is making me doubt whether MS Access supports three-table joins at all, though I am sure I am wrong.
See the error Message below
Table columns and table link shown below
I would be very thankful for any help on this. Access SQL please, because I am already able to get the results in SQL Server.
Thanks in Advance
MS Access has a picky syntax. For instance, joins need extra parentheses. So, try this:
select i.colItemID, i.colItemName,
s.rqty, n.soldqty
from (tblMstItem as i left join
(select colItemID, sum(colRQty ) as rqty
from tblStocks
group by colItemID
) as s
on i.colItemID = s.colItemID
) left join
(select colItemID, sum( colSoldQty ) as soldqty
from tblSales
group by colItemID
) as n
on i.colItemID = n.colItemID;
You also need to select colItemID in the subqueries.
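To illustrate why each table is summed in its own subquery before joining, here is a small sketch run on SQLite through Python's sqlite3 module (SQLite does not need the Access parentheses, so this only demonstrates the logic, not the Access syntax). The sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblMstItem (colItemID INTEGER PRIMARY KEY, colItemName TEXT);
CREATE TABLE tblStocks (colItemID INTEGER, colRQty INTEGER);
CREATE TABLE tblSales  (colItemID INTEGER, colSoldQty INTEGER);
INSERT INTO tblMstItem VALUES (1, 'Bolt'), (2, 'Nut'), (3, 'Washer');
INSERT INTO tblStocks VALUES (1, 10), (1, 5), (2, 7);
INSERT INTO tblSales  VALUES (1, 3), (1, 4);
""")

# Aggregate each detail table first; each subquery exposes colItemID
# so the outer join has a key to match on.
pre_aggregated = conn.execute("""
SELECT i.colItemID, i.colItemName, s.rqty, n.soldqty
FROM tblMstItem AS i
LEFT JOIN (SELECT colItemID, SUM(colRQty) AS rqty
           FROM tblStocks GROUP BY colItemID) AS s
       ON i.colItemID = s.colItemID
LEFT JOIN (SELECT colItemID, SUM(colSoldQty) AS soldqty
           FROM tblSales GROUP BY colItemID) AS n
       ON i.colItemID = n.colItemID
ORDER BY i.colItemID
""").fetchall()

# Joining the raw tables and summing afterwards pairs every stock row
# with every sales row of the same item, which inflates both totals.
naive = conn.execute("""
SELECT i.colItemID, SUM(s.colRQty), SUM(n.colSoldQty)
FROM tblMstItem AS i
LEFT JOIN tblStocks AS s ON i.colItemID = s.colItemID
LEFT JOIN tblSales  AS n ON i.colItemID = n.colItemID
GROUP BY i.colItemID
ORDER BY i.colItemID
""").fetchall()

print(pre_aggregated)
print(naive)
```

With this data, item 1 has two purchase rows and two sales rows, so the second query counts every quantity twice, while the pre-aggregated version reports each total once.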

Query to find projects without leader

I'm doing a java application with a Postgres database and the following schema:
The entities employee, rol and project contain some data, and the entity participants is empty. I want to show in my application a table of all projects with no leader assigned yet. I'm sure that's possible with an SQL query but I'm not sure how. I tried this query:
SELECT p.projectnumber from participants pa, projecto p
where p.projectnumber=pa.projectnumber and pa.leaderid IS NULL;
But no rows are returned. That's because the participants entity is empty, but I cannot fill that entity with only the projectnumbers. Do you think I could make it easier with a query or well any other suggestion?
I want to show on my application a table that shows all the projects with no Leader assigned yet
Guessing that leaders are signified by having a non-null value in participants.leaderid:
SELECT projectnumber
FROM projecto p
WHERE NOT EXISTS (
SELECT 1
FROM participants
WHERE projectnumber = p.projectnumber
AND leaderid IS NOT NULL
);
You can solve it with a LEFT JOIN as well, but then include the leaderid in the join condition:
SELECT p.projectnumber
FROM projecto p
LEFT JOIN participants pa ON pa.projectnumber = p.projectnumber
AND pa.leaderid IS NOT NULL
WHERE pa.projectnumber IS NULL;
The check on leaderid in the WHERE condition (after the LEFT JOIN) cannot distinguish whether the column leaderid is NULL in the underlying table or because there is no connected row in participants at all. In this particular query, the result would still be correct (no participant, no leader). But it would return one row per participant that's not a leader, and I expect you want to list every leader-less project once only. You would have to aggregate, but why join to multiple non-leaders to begin with?
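To make the difference concrete, here is a small sketch run on SQLite through Python's sqlite3 module. It compares NOT EXISTS, the LEFT JOIN with the leaderid check inside the join condition, and a LEFT JOIN that checks leaderid only in WHERE. The participants layout (one row per participant, with an employeeid column and leaderid filled only on the leader's row) and the sample rows are assumptions, since the real schema isn't shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projecto (projectnumber INTEGER PRIMARY KEY);
-- employeeid is a hypothetical column; leaderid is NULL for non-leaders
CREATE TABLE participants (projectnumber INTEGER, employeeid INTEGER,
                           leaderid INTEGER);
INSERT INTO projecto VALUES (1), (2), (3);
-- project 1: two participants, no leader; project 2: one leader;
-- project 3: no participants at all
INSERT INTO participants VALUES (1, 11, NULL), (1, 12, NULL), (2, 21, 21);
""")

def project_numbers(sql):
    """Run a query and return the projectnumber column as a list."""
    return [row[0] for row in conn.execute(sql)]

not_exists = project_numbers("""
SELECT projectnumber FROM projecto p
WHERE NOT EXISTS (SELECT 1 FROM participants
                  WHERE projectnumber = p.projectnumber
                    AND leaderid IS NOT NULL)
ORDER BY projectnumber
""")

join_condition = project_numbers("""
SELECT p.projectnumber FROM projecto p
LEFT JOIN participants pa ON pa.projectnumber = p.projectnumber
                         AND pa.leaderid IS NOT NULL
WHERE pa.projectnumber IS NULL
ORDER BY p.projectnumber
""")

where_only = project_numbers("""
SELECT p.projectnumber FROM projecto p
LEFT JOIN participants pa ON pa.projectnumber = p.projectnumber
WHERE pa.leaderid IS NULL
ORDER BY p.projectnumber
""")

print(not_exists, join_condition, where_only)
```

With this data the first two queries list each leader-less project once, while the WHERE-only variant repeats project 1 for each of its two non-leader participants.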
Basics:
Select rows which are not present in other table
That aside, your relational design doesn't seem to add up. What's to prevent multiple leaders for the same project? Why varchar(30) for most columns? Why no FK constraint between participant and project? Why projecto in the query, but project in the ER diagram? Etc.
You can use left join assuming projects which don't have entries in Participants table will be without leader:
SELECT p.projectnumber
FROM projecto p LEFT JOIN participants pa
ON p.projectnumber=pa.projectnumber
WHERE pa.leaderid IS NULL;

Including NULL results after join

I am trying to do a report which shows all payments we have received and for the report I have to show names of patients who pay, but this table also contains checks from payers (insurance companies) and after I do a join all of the payers are excluded. I have tried every join version I know left, right, outer, inner, and combinations of the two. SQL Server 2005.
UPDATE
The line that is causing the left join not to work is
select p.*, max(episode_id) over (partition by patient_id) as maxei from patient p
which I am using in the join. Our system is episodic, so if a patient leaves and comes back they get a new episode. The payment table does not have an episode field though, so I used that line of code in the join to show only the current episode. Any idea how I can keep showing only the current episode while not losing the rows without a patient_id? Examples of the rows without patient ids are shown in the second image.
select
pay.patient_id,
p.lname + ', ' + p.fname as 'Name',
pay.source_type,
pay.instrument,
pay.doc_reference,
pay.instrument_date,
pay.payment_amount,
pay.user_id,
pay.entry_chron,
pay.payor_id
from payment pay
join (select p.*, max(episode_id) over (partition by patient_id) as maxei from patient p) p
on p.patient_id = pay.patient_id
where episode_id = maxei and (pay.instrument_date between '2014-11-01' and '2014-11-30')
order by pay.payment_amount
This is what the results look like for patients with some fields commented out for confidentiality.
These are the fields that are being excluded

Design : multiple visits per patient

Above is my schema. What you can't see in tblPatientVisits is the foreign key from tblPatient, which is patientid.
tblPatient contains a distinct copy of each patient in the dataset as well as their gender. tblPatientVisits contains their demographic information: where they lived at the time of admission and which hospital they went to. I chose to put that information into a separate table because it changes throughout the data (a person can move between visits and go to a different hospital).
I don't get any strange numbers with my queries until I add tblPatientVisits. There are just under one million claims in tblClaims, but when I add tblPatientVisits so I can check where that person was from, the query returns over a million rows. I think this is because the same patientID shows up more than once in tblPatientVisits (a patient can have several admission/discharge dates).
For the life of me I can't see where this is incorrect design, nor do I know how to rectify it beyond doing one query with count(tblPatientVisits.patientID)=1 and then a union with count(tblPatientVisits.patientID)>1.
Any insight into this type of design, or into how I might more elegantly get the claimType from tblClaims with the correct number of rows when I associate a claim ID with a patientID?
EDIT: The biggest problem I'm having is that if I include admissionDate, dischargeDate or patientState in the tblPatient table, I can't use patientID as a primary key.
It should be noted that tblClaims are NOT necessarily related to tblPatientVisits.admissionDate, tblPatientVisits.dischargeDate.
EDIT: sample queries to show that when tblPatientVisits is added, more rows are returned than claims
SELECT tblclaims.id, tblClaims.claimType
FROM tblClaims INNER JOIN
tblPatientClaims ON tblClaims.id = tblPatientClaims.id INNER JOIN
tblPatient ON tblPatientClaims.patientid = tblPatient.patientID INNER JOIN
tblPatientVisits ON tblPatient.patientID = tblPatientVisits.patientID
more than one million query rows returned
SELECT tblClaims.id, tblPatient.patientID
FROM tblClaims INNER JOIN
tblPatientClaims ON tblClaims.id = tblPatientClaims.id INNER JOIN
tblPatient ON tblPatientClaims.patientid = tblPatient.patientID
less than one million query rows returned
I think this is crying for a better design. I really think that a visit should be associated with a claim, and that a claim can only be associated with a single patient, so I think the design should be (and eliminating the needless tbl prefix, which is just clutter):
CREATE TABLE dbo.Patients
(
PatientID INT PRIMARY KEY
-- , ... other columns ...
);
CREATE TABLE dbo.Claims
(
ClaimID INT PRIMARY KEY,
PatientID INT NOT NULL FOREIGN KEY
REFERENCES dbo.Patients(PatientID)
-- , ... other columns ...
);
CREATE TABLE dbo.PatientVisits
(
PatientID INT NOT NULL FOREIGN KEY
REFERENCES dbo.Patients(PatientID),
ClaimID INT NULL FOREIGN KEY
REFERENCES dbo.Claims(ClaimID),
VisitDate DATE
-- , ... other columns ...
, UNIQUE (PatientID, ClaimID, VisitDate) -- not convinced on this one; UNIQUE rather than PRIMARY KEY because ClaimID is nullable
);
There is some redundant information here, but it's not clear from your model whether a patient can have a visit that is not associated with a specific claim, or even whether you know that a visit belongs to a specific claim (this seems like crucial information given the type of query you're after).
In any case, given your current model, one query you might try is:
SELECT c.id, c.claimType
FROM dbo.tblClaims AS c
INNER JOIN dbo.tblPatientClaims AS pc
ON c.id = pc.id
INNER JOIN dbo.tblPatient AS p
ON pc.patientid = p.patientID
-- where exists tells SQL server you don't care how many
-- visits took place, as long as there was at least one:
WHERE EXISTS (SELECT 1 FROM dbo.tblPatientVisits AS pv
WHERE pv.patientID = p.patientID);
This will still return one row for every patient / claim combination, but it no longer returns an extra row for every visit, because EXISTS only checks that at least one visit exists. Again, it really feels like the design isn't right here. You should also get in the habit of using table aliases - they make your query much easier to read, especially if you insist on the messy tbl prefix. You should also always use the dbo (or whatever schema you use) prefix when creating and referencing objects.
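Here is a small sketch of that row multiplication, run on SQLite through Python's sqlite3 module (so without the dbo prefix). The table and column names follow the question, but the admissionDate values and all sample rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPatient (patientID INTEGER PRIMARY KEY);
CREATE TABLE tblClaims (id INTEGER PRIMARY KEY, claimType TEXT);
CREATE TABLE tblPatientClaims (id INTEGER, patientid INTEGER);
CREATE TABLE tblPatientVisits (patientID INTEGER, admissionDate TEXT);
INSERT INTO tblPatient VALUES (1), (2);
INSERT INTO tblClaims VALUES (100, 'A'), (101, 'B'), (102, 'A');
-- patient 1 has two claims, patient 2 has one
INSERT INTO tblPatientClaims VALUES (100, 1), (101, 1), (102, 2);
-- patient 1 has three visits, patient 2 has one
INSERT INTO tblPatientVisits VALUES
    (1, '2020-01-01'), (1, '2020-02-01'), (1, '2020-03-01'),
    (2, '2020-01-15');
""")

# Joining to visits repeats every claim once per visit of its patient.
joined = conn.execute("""
SELECT c.id, c.claimType
FROM tblClaims AS c
INNER JOIN tblPatientClaims AS pc ON c.id = pc.id
INNER JOIN tblPatient AS p ON pc.patientid = p.patientID
INNER JOIN tblPatientVisits AS pv ON p.patientID = pv.patientID
""").fetchall()

# EXISTS only asks whether at least one visit exists, so each claim
# appears once.
with_exists = conn.execute("""
SELECT c.id, c.claimType
FROM tblClaims AS c
INNER JOIN tblPatientClaims AS pc ON c.id = pc.id
INNER JOIN tblPatient AS p ON pc.patientid = p.patientID
WHERE EXISTS (SELECT 1 FROM tblPatientVisits AS pv
              WHERE pv.patientID = p.patientID)
ORDER BY c.id
""").fetchall()

print(len(joined), len(with_exists))
```

Three claims come back as seven rows once the visits are joined in (2 claims x 3 visits, plus 1 claim x 1 visit), and as three rows with EXISTS.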
I'm not sure I understand the concept of a claim but I suspect you want to remove the link table between claims and patient and instead make the association between patient visit and a claim.
Would that work out better for you?