In a loop How to set a value if even one record passes check - sql

I am joining two tables, Employee and Wages. An employee can have multiple wages, since they can be working on different projects, and I want the sum of their wages. There is also a column called Employee_benefits_Claimed_Ind. This is an indicator of whether the employee claimed any benefits on each project. They can claim benefits for some projects and not for others, but as far as I am concerned, if they claim benefits on even one project they do not qualify. Here is the table I am trying to populate:
CREATE TABLE EMPLOYEE_QUAL
(Employee_id NUMBER,
TOTAL_WAGES NUMBER,
EMPLOYEE_disQUALIFY CHAR(1))
INSERT INTO EMPLOYEE_QUAL(Employee_id, Total_wages, EMployee_disQualify)
SELECT c_Employee_id,
c_Total_wages,
c_Employee_disQualify
FROM (
Select Employee_ID as c_Employee_id
from Employees
) e
LEFT JOIN (
Select SUM(Wages),
Employee_disQualify
from wages
group by Employee_disQualify
) w on e.employee_id = w.employee_id
However if an employee claims benefits for one and does not for another this will just have two entries because of the GROUP BY. Ideally it should only be one entry with the Employee_disqualify_ind as 'Y' since he claimed benefits on one project. It does not matter if he did not on the other one. How do I go about achieving this?

I'm going to guess at what you're really saying.
In English: For all employees as specified by Employees.Employee_ID, sum up wages on the WAGES table, and set EMployee_disQualify = 'Y' if any of the wages.Employee_disQualify flags are set to Yes.
In SQL, that would be:
SELECT e.Employee_ID,
sum(w.wages),
case when
(sum (case w.Employee_disQualify WHEN 'Y' THEN 1 else 0 end)) > 0
THEN 'Y'
ELSE 'N'
end
FROM Employees e
LEFT JOIN wages w
on e.employee_id = w.employee_id
group by e.employee_ID
(where I've just shown the select).
The main trick is to convert the employee_disqualify flag into something numeric (using CASE) so it can be easily aggregated, and then convert the result of this aggregation back to a Y or an N with another CASE. If there is at least one Y in any of the matching rows, then the sum will be > 0, so you'll get a Y as your final result. Otherwise, N. (And again, I'm guessing as to how your field is set.)
If you weren't aggregating, I might do it with an in-line select 'Y' where exists . . . type clause, but you're already aggregating anyway for sum(wages), and this gets calculated on the same scan through the wages table, so it should be reasonably efficient.
For example, see here: http://sqlfiddle.com/#!4/67691/2
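As a self-contained check of the conditional-aggregation trick above, here is a minimal sketch that runs the same SELECT against an in-memory SQLite database from Python. SQLite stands in for the asker's database here, and the three sample employees and their wage rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Employee_ID INTEGER PRIMARY KEY);
    CREATE TABLE wages (Employee_ID INTEGER, Wages NUMERIC, Employee_disQualify TEXT);
    INSERT INTO Employees VALUES (1), (2), (3);
    -- Hypothetical data: employee 1 claimed benefits on one of two projects,
    -- employee 2 never did, employee 3 has no wage rows at all.
    INSERT INTO wages VALUES (1, 100, 'Y'), (1, 250, 'N'), (2, 300, 'N');
""")

rows = conn.execute("""
    SELECT e.Employee_ID,
           SUM(w.Wages) AS total_wages,
           CASE WHEN SUM(CASE w.Employee_disQualify WHEN 'Y' THEN 1 ELSE 0 END) > 0
                THEN 'Y' ELSE 'N' END AS disqualify
    FROM Employees e
    LEFT JOIN wages w ON e.Employee_ID = w.Employee_ID
    GROUP BY e.Employee_ID
    ORDER BY e.Employee_ID
""").fetchall()

for row in rows:
    print(row)
```

Each employee comes back exactly once. An employee with no wage rows gets a NULL total and an 'N' flag because of the LEFT JOIN.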

Related

SQL Select Count Subquery, Joins messing up everything

I've got a task to work with 4 different tables. I think I've got the "logic" correct, but I think I'm failing on joining the various separately working things together.
The Case somehow returns two rows when the comparison is true; if it isn't, it displays (correctly) just one. Works fine without joins.
The count subquery works by itself, but when I try to tie it all together, it shows anything from the same number everywhere to far too large numbers (likely multiples of multiples).
Select Distinct RPD_PERSONS.PERSON_ID "id",
RPD_PERSONS.SURN_TXT ||' '|| RPD_PERSONS.NAME_TXT "Name",
Case ADD_ROLE_PERS.ROLE_CODE When 'Manager'
Then 'yes'
Else 'no'
End "Manager",
(
Select Count(LDD_CERTS.Cert_ID)
From LDD_CERTS
Join LDD_PERS_CERTS
On LDD_PERS_CERTS.CERT_ID = LDD_CERTS.CERT_ID
Where MONTHS_BETWEEN(LDD_CERTS.VALID_TO,SYSDATE)>0
And LDD_PERS_CERTS.CERT_CHANGE_TYPE>=0
) "no. of certificates"
From RPD_PERSONS
Join ADD_ROLE_PERS
On ADD_ROLE_PERS.Person_ID = RPD_PERSONS.Person_ID
Where RPD_PERSONS.Partic_ID = 1
Group By RPD_PERSONS.PERSON_ID, RPD_PERSONS.SURN_TXT ||' '|| RPD_PERSONS.NAME_TXT, ADD_ROLE_PERS.ROLE_CODE
Order By RPD_PERSONS.Person_ID;
This is the subquery that, by itself, seems to work perfectly.
Select LDD_PERS_CERTS.PERSON_UID,Count(LDD_CERTS.Cert_ID)
From LDD_CERTS
Join LDD_PERS_CERTS
ON LDD_PERS_CERTS.CERT_ID = LDD_CERTS.CERT_ID
Where MONTHS_BETWEEN(LDD_CERTS.VALID_TO,SYSDATE)>0
AND LDD_PERS_CERTS.CERT_CHANGE_TYPE>=0
Group By LDD_PERS_CERTS.PERSON_UID
order by LDD_PERS_CERTS.PERSON_UID;
There is a lot going on in this short query, so let me try to summarize what I THINK you are trying to get.
You want a list of distinct people within the company with a count of how many ACTIVE certs (not expired) per person. From that, you also want to know if they are in a management position or not (via roles).
Q: For a person who may be a manager, but also an under-manager to a higher-up, do you want to see that person in both roles as typical business structures could have multiple layers of management, OR... Do you only care to see a person once, and if they are a manager OR some other level. What if a person has 3 or more roles, do you want to see them every instance? If your PRIMARY care is Manager Yes or No, the query gets even more simplified.
Now, your query of counts for valid certs. The MONTHS_BETWEEN() function suggests you are running Oracle. Comparing Valid_To against SYSDATE that way indicates that Valid_To is always intended to be in the future (i.e. the cert is still active). If so, the predicate is hard to optimize, because wrapping the column in a function call makes it non-sargable, so a regular index on Valid_To cannot be used for it.
Instead, you should only have to write WHERE Valid_To > SysDate, in other words, only those certs that have not yet expired. You MIGHT even be better served by pre-aggregating the counts of still-active certs per Cert ID and then joining to the person certs table, since the person cert check is for all rows where Cert_Change_Type >= 0, which could imply ALL of them. Under what condition would a Cert_Change_Type be less than zero? If never, that WHERE clause is pointless.
Next, your SELECT DISTINCT query needs a few adjustments. The subquery in your select list has no connection to the outer person ID, so it just counts the total certs across everyone. Nothing correlates the person ID with the certs being counted. I can only GUESS that there is some relationship such as
RPD_Persons.Person_id = LDD_Pers_Certs.Person_UID
Having stated all that, I would have the following indexes:
Table            Index columns
LDD_PERS_CERTS   ( CERT_CHANGE_TYPE, PERSON_UID, CERT_ID )
LDD_CERTS        ( valid_to, cert_id )
RPD_PERSONS      ( partic_id, person_id, surn_txt, name_txt )
ADD_ROLE_PERS    ( person_id, role_code )
I would try something like
Select
lpc.PERSON_UID,
ValCerts.CertCount
From
( select
Cert_id,
count(*) CertCount
from
LDD_CERTS
where
Valid_To > sysDate
group by
Cert_id ) ValCerts
JOIN LDD_PERS_CERTS lpc
on ValCerts.Cert_id = lpc.cert_id
Where
lpc.CERT_CHANGE_TYPE >= 0
Now, if you only care whether a given person is a manager or not, I would pre-query only that, since you were not actually returning a person's SPECIFIC ROLE, just whether they were a manager. My final query might look like:
select
p.PERSON_ID id,
max( p.SURN_TXT || ' ' || p.NAME_TXT ) Name,
max( Case when arp.Person_id IS NULL
then 'no' else 'yes' end ) Manager,
max( coalesce( certs.CertCount, 0 )) ActiveCertsForUser
from
RPD_PERSONS p
LEFT Join ADD_ROLE_PERS arp
On p.Person_ID = arp.Person_ID
AND arp.role_code = 'Manager'
LEFT JOIN
( Select
lpc.PERSON_UID,
ValCerts.CertCount
From
( select
Cert_id,
count(*) CertCount
from
LDD_CERTS
where
Valid_To > sysDate
group by
Cert_id ) ValCerts
JOIN LDD_PERS_CERTS lpc
on ValCerts.Cert_id = lpc.cert_id
AND lpc.CERT_CHANGE_TYPE >= 0
) Certs
on p.Person_id = Certs.Person_uid
Where
p.Partic_ID = 1
GROUP BY
p.PERSON_ID
Now, if p.partic_id = 1 represents only 1 person, then it won't make as much sense to query all people with a given certificate status, etc. But if Partic_id = 1 represents a group of people, such as a given association / division of a company, then it should be fine.
Any questions, let me know and I can revise / update the answer.
CASE issue: there can, presumably, be multiple records in ADD_ROLE_PERS for each person. If a person can have two or more roles running concurrently, then you need to decide what business logic to use to handle this. If a person can only have one active role at a time, presumably there is an "active/disabled" column or effective date columns you should be using to identify the active record (or, potentially, there is a data issue).
The subquery should return the same value for every single row in your result set, as it is completely isolated/standalone from your main query. If you want it to produce counts that are relevant to each row, then you will need to connect it to the tables in the main query (look up correlated subqueries if you don't know how to do this).
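To make the difference between a standalone and a correlated subquery concrete, here is a minimal sketch using an in-memory SQLite database from Python. The tables persons and pers_certs and their rows are simplified, hypothetical stand-ins for RPD_PERSONS and LDD_PERS_CERTS, not the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for RPD_PERSONS and LDD_PERS_CERTS.
    CREATE TABLE persons (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pers_certs (person_uid INTEGER, cert_id INTEGER);
    INSERT INTO persons VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO pers_certs VALUES (1, 10), (1, 11), (2, 10);
""")

# Standalone subquery: it never looks at p, so every row gets the grand total.
uncorrelated = conn.execute("""
    SELECT p.person_id,
           (SELECT COUNT(*) FROM pers_certs) AS certs
    FROM persons p
    ORDER BY p.person_id
""").fetchall()

# Correlated subquery: the WHERE clause ties the count to the outer row.
correlated = conn.execute("""
    SELECT p.person_id,
           (SELECT COUNT(*) FROM pers_certs pc
             WHERE pc.person_uid = p.person_id) AS certs
    FROM persons p
    ORDER BY p.person_id
""").fetchall()

print(uncorrelated)
print(correlated)
```

The first query repeats the overall total on every row, which matches the "same number everywhere" symptom in the question. The second gives each person their own count, including zero for a person with no certs.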

How do I sum a column based on two conditions, one based on field value, the other based upon retrieved record's values?

I have a list like the one below from which I'm looking to aggregate the sum in the "amount" column for a given company. The trick of the matter is that I want to include family members of employees of the company. Those relations are kept by the ID to the right and will differ by the 12th character (if the family in question only has one member, then the 12th character is a space).
My question is, what is the most efficient way to get the amount for all employees of ABC Inc, including family members. I believe that this will require first one query for all employees of ABC Inc, then another for their family members by using the resulting list from query one.
Is this the most efficient way to do this? My table is extremely large (over 10GB of flat data), and thousands of such queries will be required, so efficiency is important.
The code I'm using thus far to get the data without family members is:
select ID, Name, Company_Name, sum(Amount) from indivs
where Orgname ='APC Inc' --or Employer like '%APC Inc%'
group by ID, Name, Company_Name
However, this only gives me the amounts from the direct employees.
What would be the next step to add the amounts for family members?
I think you want:
select sum(amount)
from t
where exists (select 1
from t t2
where t2.company = 'APC Inc.' and
left(t2.id, 11) = left(t.id, 11)
);
For performance, you can create a computed column and index:
alter table t add id11 as (left(id, 11)) persisted;
create index idx_company_id11 on t(company, id11);
Then phrase the query as:
select sum(amount)
from t
where exists (select 1
from t t2
where t2.company = 'APC Inc.' and
t2.id11 = t.id11
);
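A small sketch of the EXISTS approach above, run against an in-memory SQLite database from Python. SQLite has no left(), so substr(id, 1, 11) is used in its place. The table t, its columns, and the four sample rows are illustrative assumptions, and the persisted computed column plus index from the answer is SQL Server syntax that is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, company TEXT, amount INTEGER)")

# Hypothetical 12-character IDs: the first 11 characters identify the family,
# the 12th distinguishes members (a space when there is only one member).
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("FAM00000001 ", "APC Inc.", 100),  # employee
        ("FAM00000001A", "Other Co", 50),   # family member of that employee
        ("FAM00000002 ", "Other Co", 70),   # unrelated person
        ("FAM00000003 ", "APC Inc.", 30),   # employee with no family rows
    ],
)

total = conn.execute("""
    SELECT SUM(amount)
    FROM t
    WHERE EXISTS (SELECT 1
                  FROM t t2
                  WHERE t2.company = 'APC Inc.'
                    AND substr(t2.id, 1, 11) = substr(t.id, 1, 11))
""").fetchone()[0]

print(total)
```

The unrelated person's 70 is excluded, while the family member's 50 is picked up through the shared 11-character prefix.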

auto updated column by another column in another table

I have two tables, Employee and Sales.
In the Employee table there is a column called 'number of sales',
but I want it to be uninsertable,
so that you cannot insert anything into it and it is updated by another factor instead:
for every row in Sales that has the same ID as that employee, I want that counted in the Employee 'number of sales' column.
something like [number of sales]=select count(*) from sales s group by employeeID where EmployeeID=s.EmployeeID
The usual approach to this is a trigger (documented here).
You can also use a generated column with a user-defined function.
However, I would caution you against both of these approaches because they can be complex and can affect performance in unexpected ways. Instead, why not just create a view?
create view v_employees as
select e.*, s.cnt
from employees e outer apply
(select count(*) as cnt
from sales s
where s.EmployeeID = e.EmployeeID
) s;
You can query the view and get the value whenever you need it. The value is automatically "updated" when the values in sales change -- due to inserts, updates, and deletes.
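Here is a minimal sketch of the view idea, run against an in-memory SQLite database from Python. SQLite does not support OUTER APPLY, so the count is written as a correlated scalar subquery instead, which should give the same result for this case. The table layouts and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal schema.
    CREATE TABLE employees (EmployeeID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (SaleID INTEGER PRIMARY KEY, EmployeeID INTEGER);

    -- The count is computed on read, so there is nothing to insert or maintain.
    CREATE VIEW v_employees AS
    SELECT e.*,
           (SELECT COUNT(*) FROM sales s
             WHERE s.EmployeeID = e.EmployeeID) AS cnt
    FROM employees e;

    INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO sales (EmployeeID) VALUES (1), (1);
""")

query = "SELECT EmployeeID, cnt FROM v_employees ORDER BY EmployeeID"
before = conn.execute(query).fetchall()

# A new sale shows up in the view without touching the employees table.
conn.execute("INSERT INTO sales (EmployeeID) VALUES (2)")
after = conn.execute(query).fetchall()

print(before)
print(after)
```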

SQL - Why Does This Happen?

These are the tables that I'm working with.
With that in mind, I want to showcase the Employees that are both a supervisor and a manager.
But when I used this
select e1.fname,e1.lname
from employee e1,employee e2,department
where e1.ssn=e2.super_ssn and e1.ssn = Mgr_ssn
This was the output
I know I can solve the problem with 'distinct', but I'm more interested to know why the output turned out like it did.
How about exists?
select e.*
from employee e
where exists (select 1 from department d where d.mgr_ssn = e.ssn) and
exists (select 1 from employee e2 where e2.super_ssn = e.ssn) ;
Your query returns duplicates for two reasons. First, presumably supervisors have multiple employees below them, so you end up with a row for each such employee. Second, the join to department multiplies those rows again, once for each department the person manages. (I'm assuming Mgr_ssn is a column of department, since your query references it unqualified.) None of the e2 or department columns are selected, so the extra rows just show up as repeated names.
Using select distinct is not a good solution in this case. The database just ends up having to do a lot more work than necessary -- first to create the duplicate rows and then to remove them.
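The row multiplication can be reproduced with a tiny example. This is a sketch using an in-memory SQLite database from Python; the column layout (super_ssn on employee, mgr_ssn on department) and the sample rows are assumptions, since the original tables were only shown as images.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: super_ssn lives on employee, mgr_ssn on department.
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, fname TEXT, lname TEXT,
                           super_ssn INTEGER);
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, mgr_ssn INTEGER);
    -- Ann supervises Bob and Cy and manages department 10.
    -- Bob supervises Di but manages no department.
    INSERT INTO employee VALUES (1, 'Ann', 'Lee', NULL), (2, 'Bob', 'Ray', 1),
                                (3, 'Cy', 'Fox', 1), (4, 'Di', 'Kim', 2);
    INSERT INTO department VALUES (10, 1);
""")

# Join form: one output row per (supervised employee, managed department) pair.
joined = conn.execute("""
    SELECT e1.fname
    FROM employee e1, employee e2, department d
    WHERE e1.ssn = e2.super_ssn AND e1.ssn = d.mgr_ssn
""").fetchall()

# EXISTS form: each qualifying employee appears once, with no DISTINCT needed.
with_exists = conn.execute("""
    SELECT e.fname
    FROM employee e
    WHERE EXISTS (SELECT 1 FROM department d WHERE d.mgr_ssn = e.ssn)
      AND EXISTS (SELECT 1 FROM employee e2 WHERE e2.super_ssn = e.ssn)
""").fetchall()

print(joined)
print(with_exists)
```

Ann appears twice in the join form because she supervises two people, and once in the EXISTS form.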
Add a department-matching clause to the WHERE, like:
select e1.fname,e1.lname
from employee e1,employee e2,department d
where e1.ssn=e2.super_ssn and e1.ssn = Mgr_ssn and
d.Dnumber=e1.Dno

Counting number of occurrences in subquery

My task is to find the number of occurrences of late timesheet submissions for each employee in our database. There are two tables which I have primarily been looking at, but I'm having trouble putting the two together and coming up with a decent view of the COUNT of occurrences and the employee ID for which they are associated with.
I have created this query which provides me with the EmployeeID for each occurrence.
SELECT db.Employee.EmployeeID
FROM db.LateTimesheets
INNER JOIN db.Employee ON Employee.LastName = LateTimesheets.LastName AND Employee.FirstName = LateTimesheets.FirstName
Now, with this simple query I have a view of the EmployeeID repeated however many times these incidents have occurred. However, what I ultimately want to end up with is a table that displays a count of occurrences, along with the EmployeeID that count is associated with.
I assume I would need to use the COUNT() function to count the number of rows for each EmployeeID, and then select that value along with EmployeeID. However, I am having trouble structuring the subquery correctly, and everything I have tried thus far has only generated errors in MS SQL Server Management Studio.
A simpler version of usr's answer would be the following which avoids the construction of the derived table:
Select db.Employee.EmployeeID, Count( db.LateTimesheets.somecolumn ) As Total
From db.Employee
Left Join db.LateTimesheets
On LateTimesheets.LastName = Employee.LastName
And LateTimesheets.FirstName = Employee.FirstName
Group By db.Employee.EmployeeID
I may have misunderstood the question, but wouldn't GROUP BY solve your problem?
SELECT COUNT(db.LateTimesheets.somecolumn), db.Employee.EmployeeID
FROM db.LateTimesheets
INNER JOIN db.Employee ON Employee.LastName = LateTimesheets.LastName
AND Employee.FirstName = LateTimesheets.FirstName
GROUP BY db.Employee.EmployeeID
Just replace somecolumn with the name of a column that's actually in the table.
select e.*, isnull(lt.Count, 0) as Count
from Employee e
left join (
select LastName, count(*) as Count
from LateTimesheets
group by LastName
) lt on e.LastName = lt.LastName
The trick is to do the grouping in a derived table. You don't want to group everything, just the LateTimesheets.
We need a left join to still get employees with no LateTimesheets.
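The derived-table approach can be tried out with the sketch below, which uses an in-memory SQLite database from Python. SQLite's coalesce() stands in for SQL Server's isnull(), the derived table groups on both name columns (an assumption, to match the join in the question), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal versions of the two tables.
    CREATE TABLE Employee (EmployeeID INTEGER PRIMARY KEY,
                           FirstName TEXT, LastName TEXT);
    CREATE TABLE LateTimesheets (FirstName TEXT, LastName TEXT);
    INSERT INTO Employee VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO LateTimesheets VALUES ('Ann', 'Lee'), ('Ann', 'Lee');
""")

rows = conn.execute("""
    SELECT e.EmployeeID, COALESCE(lt.late_count, 0) AS late_count
    FROM Employee e
    LEFT JOIN (
        -- Group only the late timesheets, not the whole join.
        SELECT FirstName, LastName, COUNT(*) AS late_count
        FROM LateTimesheets
        GROUP BY FirstName, LastName
    ) lt ON e.FirstName = lt.FirstName AND e.LastName = lt.LastName
    ORDER BY e.EmployeeID
""").fetchall()

print(rows)
```

The employee with no late timesheets still appears, with a count of 0, because of the left join.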