SQL Select Count Subquery, Joins messing up everything - sql

I've got a task to work with 4 different tables. I think I've got the "logic" correct, but I think I'm failing on joining the various separately working things together.
The Case somehow returns two rows when the comparison is true; if it isn't, it displays (correctly) just one. Works fine without joins.
The count subquery works by itself, but when I try to tie it together, the result ranges from the same number on every row to numbers that are far too large (likely multiples of multiples).
Select Distinct RPD_PERSONS.PERSON_ID "id",
RPD_PERSONS.SURN_TXT ||' '|| RPD_PERSONS.NAME_TXT "Name",
Case ADD_ROLE_PERS.ROLE_CODE When 'Manager'
Then 'yes'
Else 'no'
End "Manager",
(
Select Count(LDD_CERTS.Cert_ID)
From LDD_CERTS
Join LDD_PERS_CERTS
On LDD_PERS_CERTS.CERT_ID = LDD_CERTS.CERT_ID
Where MONTHS_BETWEEN(LDD_CERTS.VALID_TO,SYSDATE)>0
And LDD_PERS_CERTS.CERT_CHANGE_TYPE>=0
) "no. of certificates"
From RPD_PERSONS
Join ADD_ROLE_PERS
On ADD_ROLE_PERS.Person_ID = RPD_PERSONS.Person_ID
Where RPD_PERSONS.Partic_ID = 1
Group By RPD_PERSONS.PERSON_ID, RPD_PERSONS.SURN_TXT ||' '|| RPD_PERSONS.NAME_TXT, ADD_ROLE_PERS.ROLE_CODE
Order By RPD_PERSONS.Person_ID;
This is the subquery that, by itself, seems to work perfectly.
Select LDD_PERS_CERTS.PERSON_UID,Count(LDD_CERTS.Cert_ID)
From LDD_CERTS
Join LDD_PERS_CERTS
ON LDD_PERS_CERTS.CERT_ID = LDD_CERTS.CERT_ID
Where MONTHS_BETWEEN(LDD_CERTS.VALID_TO,SYSDATE)>0
AND LDD_PERS_CERTS.CERT_CHANGE_TYPE>=0
Group By LDD_PERS_CERTS.PERSON_UID
order by LDD_PERS_CERTS.PERSON_UID;

You have a lot going on for such a short query, so let me try to summarize what I THINK you are trying to get.
You want a list of distinct people within the company with a count of how many ACTIVE certs (not expired) per person. From that, you also want to know if they are in a management position or not (via roles).
Q: A person may be a manager and also report to a higher-up, since typical business structures can have multiple layers of management. Do you want to see that person once per role, OR do you only care to see each person once, flagged as a manager or not? If a person has 3 or more roles, do you want to see every instance? If your PRIMARY care is Manager Yes or No, the query gets even simpler.
Now, your query of counts for valid certs. The MONTHS_BETWEEN() function suggests you are running Oracle. Comparing Valid_To against SYSDATE indicates that an active cert always has a Valid_To in the future. If that is the case, wrapping the column in a function call keeps the optimizer from using a plain index on Valid_To, because the predicate is not sargable.
Instead, you should only need WHERE Valid_To > SysDate, in other words, only those certs that have not yet expired. You MIGHT even be better served by pre-aggregating the count of still-active certs per Cert ID and then joining to the person certs table, since the person cert check is for cert_change_type >= 0, which could imply ALL rows. Under what condition would Cert_Change_Type be less than zero? If never, that WHERE clause is pointless.
Next, your SELECT DISTINCT query needs some adjustment. The subquery in your select list has no reference to the outer person ID, so it just aggregates the total certs for everyone. Nothing correlates the certs being counted to the person. I can only GUESS that there is some relationship such as
RPD_Persons.Person_id = LDD_Pers_Certs.Person_UID
Having stated all that, I would have the following indexes:
table            index
LDD_PERS_CERTS   ( CERT_CHANGE_TYPE, PERSON_UID, CERT_ID )
LDD_CERTS        ( valid_to, cert_id )
RPD_PERSONS      ( partic_id, person_id, surn_txt, name_txt )
ADD_ROLE_PERS    ( person_id, role_code )
I would try something like
Select
      lpc.PERSON_UID,
      -- total of still-active certs, one row per person
      sum( ValCerts.CertCounts ) CertCount
   From
      ( select
              Cert_id,
              count(*) CertCounts
           from
              LDD_CERTS
           where
              Valid_To > sysDate
           group by
              Cert_id ) ValCerts
         JOIN LDD_PERS_CERTS lpc
            on ValCerts.Cert_id = lpc.cert_id
   Where
      lpc.CERT_CHANGE_TYPE >= 0
   Group By
      lpc.PERSON_UID
Now, if you only care whether a given person is a manager or not, I would pre-query only that, since you were not actually returning a person's SPECIFIC ROLE, just the fact that they were a manager or not. My final query might look like
select
      p.PERSON_ID id,
      max( p.SURN_TXT || ' ' || p.NAME_TXT ) Name,
      -- 'yes' sorts after 'no', so max() yields 'yes' if any Manager row matched
      max( Case when arp.Person_id IS NULL
                then 'no' else 'yes' end ) Manager,
      max( coalesce( Certs.CertCount, 0 )) ActiveCertsForUser
   from
      RPD_PERSONS p
         LEFT Join ADD_ROLE_PERS arp
            On p.Person_ID = arp.Person_ID
            AND arp.role_code = 'Manager'
         LEFT JOIN
         ( Select
                 lpc.PERSON_UID,
                 -- one row per person: total of still-active certs
                 sum( ValCerts.CertCounts ) CertCount
              From
                 ( select
                         Cert_id,
                         count(*) CertCounts
                      from
                         LDD_CERTS
                      where
                         Valid_To > sysDate
                      group by
                         Cert_id ) ValCerts
                    JOIN LDD_PERS_CERTS lpc
                       on ValCerts.Cert_id = lpc.cert_id
                       AND lpc.CERT_CHANGE_TYPE >= 0
              Group By
                 lpc.PERSON_UID
         ) Certs
            on p.Person_id = Certs.Person_uid
   Where
      p.Partic_ID = 1
   GROUP BY
      p.PERSON_ID
Now, if p.partic_id = 1 represents only 1 person, then it won't make as much sense to query all people with a given certificate status, etc. But if Partic_id = 1 represents a group of people, such as a given association / division of a company, then it should be fine.
Any questions, let me know and I can revise / update answer
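To make the shape of that final query concrete, here is a minimal sketch that runs the same idea (pre-aggregate active certs per person, then LEFT JOIN) against an in-memory SQLite database from Python. SQLite only stands in for Oracle here, date('now') replaces SYSDATE, the lowercase table names mirror the question, and all sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; column names follow the question's tables.
conn.executescript("""
CREATE TABLE rpd_persons (person_id INTEGER, surn_txt TEXT, name_txt TEXT, partic_id INTEGER);
CREATE TABLE add_role_pers (person_id INTEGER, role_code TEXT);
CREATE TABLE ldd_certs (cert_id INTEGER, valid_to TEXT);
CREATE TABLE ldd_pers_certs (person_uid INTEGER, cert_id INTEGER, cert_change_type INTEGER);

INSERT INTO rpd_persons VALUES (1,'Smith','Ann',1),(2,'Jones','Bob',1),(3,'Brown','Cy',1);
INSERT INTO add_role_pers VALUES (1,'Manager'),(1,'Clerk'),(2,'Clerk');
INSERT INTO ldd_certs VALUES (10,'2999-01-01'),(11,'2999-01-01'),(12,'2000-01-01');
INSERT INTO ldd_pers_certs VALUES (1,10,0),(1,11,1),(1,12,0),(2,10,0),(2,12,0);
""")

rows = conn.execute("""
SELECT p.person_id,
       MAX(p.surn_txt || ' ' || p.name_txt) AS name,
       MAX(CASE WHEN arp.person_id IS NULL THEN 'no' ELSE 'yes' END) AS manager,
       MAX(COALESCE(certs.cert_count, 0)) AS active_certs
FROM rpd_persons p
LEFT JOIN add_role_pers arp
       ON arp.person_id = p.person_id
      AND arp.role_code = 'Manager'
LEFT JOIN (
    -- one row per person: number of certs that have not expired yet
    SELECT lpc.person_uid, COUNT(*) AS cert_count
    FROM ldd_pers_certs lpc
    JOIN ldd_certs c ON c.cert_id = lpc.cert_id
    WHERE c.valid_to > date('now')
      AND lpc.cert_change_type >= 0
    GROUP BY lpc.person_uid
) certs ON certs.person_uid = p.person_id
WHERE p.partic_id = 1
GROUP BY p.person_id
ORDER BY p.person_id
""").fetchall()

for row in rows:
    print(row)
```

Ann has two roles and three certs (one expired), yet she appears once, which is the point of filtering the role join to 'Manager' and aggregating the certs before joining.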

CASE issue: there can, presumably, be multiple records in ADD_ROLE_PERS for each person. If a person can have two or more roles running concurrently, then you need to decide what business logic should handle this. If a person can only have one active role at a time, presumably there is an "active/disabled" column or effective date columns you should be using to identify the active record (or, potentially, there is a data issue).
The subquery returns the same value for every single row in your resultset, as it is completely isolated/standalone from your main query. If you want it to produce counts that are relevant to each row, then you will need to connect it to the tables in the main query (look up correlated subqueries if you don't know how to do this).
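The difference between a standalone and a correlated scalar subquery can be seen in a small sketch. It uses Python's sqlite3 module with hypothetical persons and pers_certs tables and invented rows; only the WHERE clause inside the subquery changes between the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified tables for illustration only.
conn.executescript("""
CREATE TABLE persons (person_id INTEGER, name TEXT);
CREATE TABLE pers_certs (person_uid INTEGER, cert_id INTEGER);
INSERT INTO persons VALUES (1,'Ann'),(2,'Bob'),(3,'Cy');
INSERT INTO pers_certs VALUES (1,10),(1,11),(2,10);
""")

# Standalone subquery: no reference to the outer row, so every row gets the grand total.
uncorrelated = conn.execute("""
SELECT p.person_id,
       (SELECT COUNT(*) FROM pers_certs pc) AS certs
FROM persons p
ORDER BY p.person_id
""").fetchall()

# Correlated subquery: the inner WHERE ties the count to the current outer person.
correlated = conn.execute("""
SELECT p.person_id,
       (SELECT COUNT(*) FROM pers_certs pc
         WHERE pc.person_uid = p.person_id) AS certs
FROM persons p
ORDER BY p.person_id
""").fetchall()

print(uncorrelated)
print(correlated)
```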

Related

Get account information based on last login time

I have this query
SELECT
c.* ,concat ( s.FirstName,'',s.LastName) as FullName
FROM [dbo].[Monitor] c
left join acc.Staff s on s.Id = c.UserId where c.UserId=1
How can I get account information based on the last login time in SQL Server? I don't know how to approach it.
From what I understand you want to query the very last login.
SELECT TOP 1 * FROM Monitor m Join Staff s on s.Id = m.UserId
WHERE Object = 'Login' ORDER BY AccessDate Desc
Here is an explanation of the code.
SELECT TOP 1 * FROM Monitor m
The code above queries only 1 result (TOP 1) and shows all the columns (*) from the table Monitor. If you want only specific columns, change * to the columns you need. I've given the table Monitor the alias m because the name starts with that letter, but you can name the alias however you please, as long as you remember it or it is easy to tell which table it refers to.
Join Staff s ON s.Id = m.UserID
I've used Join because you haven't really specified what exact result you are expecting. Your question is more about getting the last login, so which join to use depends on your expectations. The same goes for the columns I've joined the two tables on. I just copied yours, but they depend on the result you need, and obviously if you have a foreign key in either of the tables, use that key to join them.
WHERE Object = 'Login' ORDER BY AccessDate DESC
This is the important part of the code for your question. By specifying that we only need rows WHERE the column Object has the value 'Login', we make sure that only logins are shown and all the logouts are excluded. With ORDER BY AccessDate DESC, we make sure that the biggest date value is at the top. When SQL Server compares two dates, the later date is considered bigger, so the last login ends up at the very top, and since we have SELECT TOP 1 at the beginning, we get only that last login row.
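A minimal sketch of the same filter-then-sort idea, run through Python's sqlite3 module. SQLite has no TOP, so LIMIT 1 is swapped in for SELECT TOP 1; the lowercase table and column names and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the Monitor and Staff tables.
conn.executescript("""
CREATE TABLE staff (id INTEGER, first_name TEXT, last_name TEXT);
CREATE TABLE monitor (user_id INTEGER, object TEXT, access_date TEXT);
INSERT INTO staff VALUES (1, 'Ann', 'Smith');
INSERT INTO monitor VALUES
    (1, 'Login',  '2020-01-01 08:00:00'),
    (1, 'Login',  '2020-01-02 08:00:00'),
    (1, 'Logout', '2020-01-03 09:00:00');
""")

# Keep only logins, put the latest first, and take a single row.
last_login = conn.execute("""
SELECT s.first_name || ' ' || s.last_name AS full_name,
       m.access_date
FROM monitor m
JOIN staff s ON s.id = m.user_id
WHERE m.object = 'Login'
ORDER BY m.access_date DESC
LIMIT 1
""").fetchone()

print(last_login)
```

The later logout row is ignored because of the WHERE filter, so the row returned is the 2020-01-02 login.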

Query build to find records where all of a series of records have a value

Let me explain a little bit about what I am trying to do, because I don't even know the vocabulary to ask. I have an Access 2016 database that records staff QA data. When a staff member misses a QA, we assign a job aid that explains the process, and they can optionally send back a worksheet showing they learned about what was missed. If they do all of these in a 3-month period, they get a credit on their QA score. So I have a series of records, all of which have a date we assigned the work (RA1) and MAY have a work returned date (RC1).
In the below image "lavalleer" has earned the credit because both of her sheets got returned. "maduncn" did not earn the credit because he didn't do one.
I want to create a query that returns only the people that are like "lavalleer". I tried Google and searched here and access.programmers.co.uk, but I'm only coming up with instructions to use Not Null statements. That wouldn't work for me, because if I did an IS NOT NULL on "maduncn" I would get the 4 records and it would only exclude the null one.
What I need to do is build a query where I can see staff that have dates in ALL of their RC1 fields. If any of their RC1 fields are blank, I don't want them returned.
Consider:
SELECT * FROM tablename WHERE NOT UserLogin IN (SELECT UserLogin FROM tablename WHERE RC1 IS NULL);
You could use a not exists clause with a correlated subquery, e.g.
select t.* from YourTable t where not exists
(select 1 from YourTable u where t.userlogin = u.userlogin and u.rc1 is null)
Here, select 1 is used purely for optimisation - we don't care what the query returns, just that it has records (or doesn't have records).
Or, you could use a left join to exclude those users for which there is a null rc1 record, e.g.:
select t.* from YourTable t left join
(select u.userlogin from YourTable u where u.rc1 is null) v on t.userlogin = v.userlogin
where v.userlogin is null
In all of the above, change all occurrences of YourTable to the name of your table.
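As a quick check of the not exists variant, here is a small sketch using Python's sqlite3 module in place of Access. The table name qa and the sample dates are made up; the two logins come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: ra1 = date assigned, rc1 = date returned (may be NULL).
conn.executescript("""
CREATE TABLE qa (userlogin TEXT, ra1 TEXT, rc1 TEXT);
INSERT INTO qa VALUES
    ('lavalleer', '2020-01-05', '2020-01-10'),
    ('lavalleer', '2020-02-05', '2020-02-09'),
    ('maduncn',   '2020-01-06', '2020-01-12'),
    ('maduncn',   '2020-02-06', NULL);
""")

# Keep a row only if the same user has no row with a missing rc1.
rows = conn.execute("""
SELECT t.userlogin, t.ra1, t.rc1
FROM qa t
WHERE NOT EXISTS (
    SELECT 1 FROM qa u
    WHERE u.userlogin = t.userlogin
      AND u.rc1 IS NULL
)
ORDER BY t.ra1
""").fetchall()

for row in rows:
    print(row)
```

Both of maduncn's rows drop out, including the one with a returned date, because one missing rc1 disqualifies the whole user.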

Oracle - Join multiple columns trying different combinations

I'll try to explain my problem:
I need to find the most efficient way to join two tables on 4 columns, but the data is really crappy, so there could be cases where I can join only on 3 or 2 columns because the fourth and/or third were stored badly (with spaces, zeros, dashes,...)
I should try to achieve something like this:
select * from table a
join table b
  on a.key1=b.key1
  and a.key2=b.key2
  or a.key3=b.key3
  or a.key4=b.key4
I already performed some data quality but the number of records is really high (table a is 300k records and table b is about 25M records).
I know that the example I provided is not efficient and it would be better making separate joins and then "union" them, but I'm asking you if there could be some better way to do it.
Thanks in advance
You haven't explained your problem very well, so let's create an example:
There is a table of clients and a table of orders. Both are not related via keys, because both are imported from different systems. Your task is now to find the client per order.
Both tables contain the client's last name, first name, city, and a client number. However, these columns are optional in the order table (but either last name or client number are always given). And sometimes a first name or city may be abbreviated or misspelled (e.g. J./James, NY/New York, Cris/Chris).
So, if the order contains a client number, we have a match and are done. Otherwise the last name must match. In the latter case we look at first name and city, too. Do both match? Only one? Neither?
We use RANK to rank the clients per order and pick the best matches. Some orders will end up with exactly one match, others will have ties and we must examine the data manually then (the worst case being no client number and no last name match because of a misspelled name).
select *
from
(
select
o.*,
c.*,
rank() over
(
partition by o.order_number
order by
case
when c.client_number = o.client_number then 1
when c.last_name = o.last_name and c.first_name = o.first_name and c.city = o.city then 2
when c.last_name = o.last_name and (c.first_name = o.first_name or c.city = o.city) then 3
when c.last_name = o.last_name then 4
else 5
end
) as rnk
from orders o
left join clients c on c.client_number = o.client_number or c.last_name = o.last_name
) ranked
where rnk = 1
order by order_number;
I hope this gives you an idea of how to write such a query, and that you will be able to adapt the concept to your case.
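The ranking idea can be tried on a few rows with Python's sqlite3 module. This is only a sketch: SQLite stands in for Oracle, window functions need SQLite 3.25 or newer, and the clients and orders rows are invented to exercise the client-number match, the partial match, and the last-name-only match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; orders may lack a client number or have sloppy names.
conn.executescript("""
CREATE TABLE clients (client_number INTEGER, last_name TEXT, first_name TEXT, city TEXT);
CREATE TABLE orders (order_number INTEGER, client_number INTEGER,
                     last_name TEXT, first_name TEXT, city TEXT);
INSERT INTO clients VALUES
    (100, 'Smith', 'James', 'New York'),
    (101, 'Smith', 'John',  'Boston'),
    (102, 'Jones', 'Chris', 'NY');
INSERT INTO orders VALUES
    (1, 100,  NULL,    NULL,   NULL),
    (2, NULL, 'Smith', 'J.',   'Boston'),
    (3, NULL, 'Jones', 'Cris', 'LA');
""")

matches = conn.execute("""
SELECT order_number, matched_client
FROM (
    SELECT o.order_number,
           c.client_number AS matched_client,
           RANK() OVER (
               PARTITION BY o.order_number
               ORDER BY CASE
                   WHEN c.client_number = o.client_number THEN 1
                   WHEN c.last_name = o.last_name AND c.first_name = o.first_name
                        AND c.city = o.city THEN 2
                   WHEN c.last_name = o.last_name
                        AND (c.first_name = o.first_name OR c.city = o.city) THEN 3
                   WHEN c.last_name = o.last_name THEN 4
                   ELSE 5
               END
           ) AS rnk
    FROM orders o
    LEFT JOIN clients c
           ON c.client_number = o.client_number
           OR c.last_name = o.last_name
) ranked
WHERE rnk = 1
ORDER BY order_number
""").fetchall()

print(matches)
```

Order 2 has two candidate Smiths; the Boston one wins because it matches on last name plus city, while the other matches on last name only.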

JOIN'ing by 3 tables and retrieving field based on content of those tables

Kind of hard to explain, so I'll map it out. Given these four tables: houses, landlord_houses, company and tenant, I need to find all the houses that have signed up in the last 14 days and get some information about them.
Previously I've done this with a very simple select query from the houses table. But now I need to get the letting agent of the house to display on a report. The problem is, the letting agent can be in any one of 3 locations: in the company table, the houses table or in the tenant table. I've come up with this query so far:
select distinct h.id as id,
h.address1 as address1,
h.town as town,
h.postcode as postcode,
h.valid_from as valid_from,
h.valid_to as valid_to,
(CASE WHEN c.name IS NOT NULL THEN c.name || ' (MS)'
WHEN h.letting_agent IS NOT NULL THEN h.letting_agent
WHEN t.id IS NOT NULL THEN t.letting_agent
ELSE 'Unknown (Not set yet)' END) AS agent
from houses h
left join landlord_houses lh on lh.house_id = h.id
left join company c on c.id = lh.company_id
left join tenant t on t.house_id = h.id
where h.deleted IS FALSE
and h.archived IS FALSE
and h.sign_up_complete IS TRUE
and h.completed > NOW() - '14 days'::INTERVAL
order by h.id
This kind of works, however I'm getting results back that have an empty agent field even though it's meant to say "Unknown (Not set yet)". I'm also getting duplicate houses returned even though I've used distinct h.id. I think this is because there are multiple letting agents for these houses in the company, houses and tenant tables.
What needs to be changed in the query to get this to work?
Thank you.
That case statement looks a little wonky. You may be returning multiple values. Try this:
COALESCE(c.name, h.letting_agent, t.letting_agent, 'Unknown (Not set yet)') as Agent
This checks c.name. If it is null, move to the next argument and do the same.
Everything else in the query looks fine and this does work in postgresql.
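To see why the original CASE can come back empty while COALESCE does not, here is a small sketch using Python's sqlite3 module rather than PostgreSQL. The table joined is a hypothetical flattened stand-in for the result of the three left joins, and the rows are invented. The sketch keeps the ' (MS)' suffix from the question by coalescing c_name || ' (MS)', which is NULL whenever c_name is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows as they might look after the left joins in the question.
conn.executescript("""
CREATE TABLE joined (house_id INTEGER, c_name TEXT, h_agent TEXT, t_id INTEGER, t_agent TEXT);
INSERT INTO joined VALUES
    (1, 'Acme', NULL,         NULL, NULL),
    (2, NULL,   'House Lets', NULL, NULL),
    (3, NULL,   NULL,         7,    'Tenant Lets'),
    (4, NULL,   NULL,         8,    NULL),
    (5, NULL,   NULL,         NULL, NULL);
""")

rows = conn.execute("""
SELECT house_id,
       CASE WHEN c_name IS NOT NULL THEN c_name || ' (MS)'
            WHEN h_agent IS NOT NULL THEN h_agent
            WHEN t_id IS NOT NULL THEN t_agent   -- NULL if the tenant has no agent
            ELSE 'Unknown (Not set yet)' END AS case_agent,
       COALESCE(c_name || ' (MS)', h_agent, t_agent,
                'Unknown (Not set yet)') AS coalesce_agent
FROM joined
ORDER BY house_id
""").fetchall()

for row in rows:
    print(row)
```

House 4 is the interesting row: a tenant exists but has no letting agent, so the CASE branch on t_id returns NULL, whereas COALESCE falls through to the default text.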

In a loop How to set a value if even one record passes check

I am joining two tables, Employee and Wages. An employee can have multiple wages, as they can be working on different projects, and I want the sum of their wages. There is also a column called Employee_benefits_Claimed_Ind, an indicator of whether the employee claimed any benefits in each project. They can claim benefits for some projects and not for others, but as far as I am concerned, if they claim benefits on even one project they do not qualify. Here is the table I am trying to populate:
CREATE TABLE EMPLOYEE_QUAL
(Employee_id NUMBER,
TOTAL_WAGES NUMBER,
EMPLOYEE_disQUALIFY CHAR(1))
INSERT INTO EMPLOYEE_QUAL(Employee_id, Total_wages, EMployee_disQualify)
SELECT c_Employee_id,
c_Total_wages,
c_Employee_disQualify
FROM (
Select Employee_ID as c_Employee_id
from Employees
) e
LEFT JOIN (
Select SUM(Wages),
Employee_disQualify
from wages
group by Employee_disQualify
) w on e.employee_id = w.employee_id
However if an employee claims benefits for one and does not for another this will just have two entries because of the GROUP BY. Ideally it should only be one entry with the Employee_disqualify_ind as 'Y' since he claimed benefits on one project. It does not matter if he did not on the other one. How do I go about achieving this?
I'm going to guess at what you're really saying.
In English: for all employees as specified by Employees.Employee_ID, sum up wages on the WAGES table, and set EMployee_disQualify = 'Y' if any of the wages.Employee_disqualify flags are set to Yes.
In SQL, that would be:
SELECT e.Employee_ID,
sum(w.wages),
case when
(sum (case w.Employee_disQualify WHEN 'Y' THEN 1 else 0 end)) > 0
THEN 'Y'
ELSE 'N'
end
FROM Employees e
LEFT JOIN wages w
on e.employee_id = w.employee_id
group by e.employee_ID
(where I've just shown the select).
The main trick is to convert the employee_disqualify flag into something numeric (using CASE) so it can be easily aggregated, and then convert the result of this aggregation back to a Y or an N with another CASE. If there is at least one Y in any of the matching rows, then the sum will be > 0, so you'll get a Y as your final result. Otherwise, N. (And again, I'm guessing as to how your field is set.)
If you weren't aggregating, I might do it with an in-line select 'Y' where exists . . type clause, but you're already aggregating anyway for sum(wages), and this gets calculated on the same scan through the wages table, so it should be reasonably efficient.
For example, see here: http://sqlfiddle.com/#!4/67691/2
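The same conditional aggregation can also be checked locally. Below is a minimal sketch using Python's sqlite3 module instead of Oracle, with invented rows: employee 1 claims benefits on one of two projects, employee 2 never claims, and employee 3 has no wage rows at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data for illustration only.
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER);
CREATE TABLE wages (employee_id INTEGER, wages INTEGER, employee_disqualify TEXT);
INSERT INTO employees VALUES (1), (2), (3);
INSERT INTO wages VALUES (1, 100, 'N'), (1, 200, 'Y'), (2, 50, 'N');
""")

rows = conn.execute("""
SELECT e.employee_id,
       SUM(w.wages) AS total_wages,
       -- turn each flag into 1/0, add them up, then map back to Y/N
       CASE WHEN SUM(CASE w.employee_disqualify WHEN 'Y' THEN 1 ELSE 0 END) > 0
            THEN 'Y' ELSE 'N' END AS employee_disqualify
FROM employees e
LEFT JOIN wages w ON w.employee_id = e.employee_id
GROUP BY e.employee_id
ORDER BY e.employee_id
""").fetchall()

for row in rows:
    print(row)
```

Employee 3 comes back with a NULL total (None in Python) and 'N', since the left join produces no wage rows to sum; wrap the sum in COALESCE if a 0 is preferred there.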