How do I check if all posts from a joined table has the same value in a column? - sql

I'm building a BI report for a client where there is a 1-n related join involved.
The joined table has a field for employee ID (EmplId).
The query that I've built for this report is supposed to give a 1 in its field "OneEmployee" if all the related posts have the same employee in the EmplId field, null if it's different employees, i.e:
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'John'
This should give a 1 in the said field in the query
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'George'
This should leave the said field blank
The idea is to create a field where a case function checks this and returns the correct value. But my problem is whereas there is a way to check for this through SQL.

select not count(*) from your_table
where employee_id = GIVEN_ID
and your_field not in ( select min(your_field)
from your_table
where employee_id = GIVEN_ID);
Note: my first idea was to use LIMIT 1 in the inner query, but MYSQL didn't like it, so min it was - the points to use any, but only one. Min should work, but the field should be indexed, then this query will actually execute rather fast, as only indexes would be used (obviously employee_id should also be indexed).
Note2: Do not get too confused with not in front of count(*), you want 1 when there is none that is different, I count different ones, and then give you the not count(*), which will be one if count is 0, otherwise 0.

Seems a job for a window COUNT():
SELECT
…,
CASE COUNT(DISTINCT TaskTransHours.EmplId) OVER () WHEN 1 THEN 1 END
AS OneEmployee
FROM …

Related

SQL query to return results from one to many table

I'm having difficulties trying to return some data from a poorly structured one to many table.
I've been provided with a data export where everything from 'Section Codes' onwards (in cat_fullxPath) relates to a 'skillID' in my clients database.
The results previously returned on one line but I've used a split function to break these out (from the cat_fullXPath column). You can see the relevant 'skillID' from my clients DB in the far right column:
From here, there are thousands of records that may have a mixture of these skillIDs (and many others, I've just provided this one example). I want to be able to find the records that match all 4 (or however many match from another example) skillIDs and ONLY those.
For example (I just happen to know this ID gives me the results I want):
SELECT
id
skillID
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
AND id = 69580;
This returns me:
Note that these are the only columns in that table.
So this is an ID I'd want to return.
These are results I'd not want to return as one of the skillIDs are missing:
I've created a temp table with a count of all the skills for each ID but I'm not sure if I'm going down the right path at this point
I'm pretty sure that there's a simple solution to this, however I'm hitting my head against the wall. Hope someone can help!
EDIT
This might be a clearer example of when there are different groups of skillIds that I need to align. I've partitioned these off by cat_fullxpath to see if this makes things clearer:
In this screenshot, for example I want to find the ids for everything in table1 where skillID IN (1003914,1005354,1004701) then repeat for (1004659,1004492,1004493,1004701). etc
We know that you need exactly 4 skills, so just make a subquery:
select id from
(
SELECT
id
count(skillID) countSkill
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
group by id;
)
where countSkill = 4;
Could work with sum, instead of count. But instead of filtering by the 4, you filter by 4022352, which is the sum of all skillID.
You can also remove the subquery and use HAVING. But you will obtain worse performance.
SELECT
id
count(skillID) countSkill
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
group by id
having count(skillID) = 4;
You haven't told us your DBMS. Here is a standard SQL approach:
select id
from table1
group by id
having count(case when skillid = 1004464 then 1 end) > 0
and count(case when skillid = 1006543 then 1 end) > 0
and count(case when skillid = 1004605 then 1 end) > 0
and count(case when skillid = 1006740 then 1 end) > 0
and count(case when skillid not in (1004464, 1006543, 1004605, 1006740) then 1 end) = 0;
Another option is to concatenate all skills and see if the resulting skill list matches the desired skill list. In SQL Server the string aggregation function is STRING_AGG.
select id
from table1
group by id
having string_agg(skillid, ',') within group (order by skillid) in
(
'1004464,1004605,1006543,1006740'
);
You can easily extend the IN clause with other combinations or even get the list from another table. Only make sure the skill IDs in the strings are sorted in order to make the strings comparable ('1004464,1004605,1006543,1006740' <> '1006740,1004464,1004605,1006543').

Grouping and null values in column

Need some help in how to fix a problem.
Below is my input data. Here I am doing a group by based on name field. The query which I am currently used for grouping is given below.
select name from Table
group by name having count(distinct DOB)='1'
But the problem is that the above query won't fecth records if the DOB field is null for all records within a group.In case if I try to give some dummy value for DOB field, then It won't fetch the result for first two rows and if I didn't give the dummy value for it won't fecth the records in 3 and 4
I tried something like this, but it is wrong
select name from Table
group by name having count(distinct case when DOB is null then '9999-01-01' else DOB END)='1'
Could someone help here with some suggestions. My expected result is given below.
You can replace the logic with:
having min(dob) = max(dob) or
min(dob) is null
Depending on your data, count(distinct) can be relatively expensive, so this can actually be cheaper than using it.
You can use count(distinct). Just change the comparison value:
having count(distinct dob) <= 1

if-then-else construction in complex stored procedure

I am relatively new to sql queries and I was wondering how to create a complex stored procedure. My database runs on SQL server.
I have a table customer (id, name) and a table customer_events (id, customer_id, timestamp, action_type). I want add a calculated field customer_status to table customer which is
0: (if there is no event for this customer in customer_events) or (the most recent event is > 5 minutes ago)
1: if the most recent event is < 5 minutes ago and action_type=0
2: if the most recent event is < 5 minutes ago and action_type=1
Can I use if-then-else constructions or should I solve this challenge differently?
As you mentioned in comments, you actually want to add a field to a select query, and in a general sense what you want is a CASE statement. They work like this:
SELECT field1,
field2,
CASE
WHEN some_condition THEN some_result
WHEN another_condition THEN another_result
END AS field_alias
FROM table
Applied to your specific scenario, well it's not totally straightforward. You're certainly going to need to left join your status table, you also want to aggregate to find the most recent event, along with that event's action type. Once you have that information, the case statement is straightforward.
Always hard to write sql without access to your data, but something like:
SELECT c.id,
c.name,
CASE
WHEN e.id IS NULL OR DATEDIFF(minute,e.timestamp,getDate())>=5 THEN 0
WHEN DATEDIFF(minute,e.timestamp,getDate())<5 AND s.action_type=1 THEN 1
WHEN DATEDIFF(minute,e.timestamp,getDate())<5 AND s.action_type=0 THEN 2
END as customer_status
FROM clients c
LEFT JOIN (
SELECT id, client_id, action_type,
rank() OVER(partition by client_id order by timestamp desc) AS r
FROM customer_events
) e
ON c.id=e.client_id AND e.r=1
The core of this is the subquery in the middle, it's using a rank funtion to give a number to each status by client_id ordered by the timestamp descending. Therefore every record with a rank of 1 will be the most recent (for that client). Thereafter, you simply join it on to the client table, and use it to determine the right value for customer_status
Presuming you get the event info into "Most_Recent_Event_Mins_Ago". If none it will be NULL.
SELECT Id, Name,
CASE
WHEN Most_Recent_Event_Mins_Ago IS NULL THEN 0
WHEN Most_Recent_Event_Mins_Ago <5 AND Action_type = 0 THEN 1
WHEN Most_Recent_Event_Mins_Ago <5 AND Action_type = 1 THEN 0
..other scenarions
ELSE yourDefaultValueForStatus
END as Status
FROM customer
WHERE
...
...

SQL Query For Grouping

Hi i need help with a query
I have a table Where Jobs and Employee are linked its called EmployeeToJobsApplied
Id EmployeeId JobsId Applied Viewed
1 1 1 True True
2 1 2 False True
3 1 1 True True
4 1 3 True True
If you noticed there are repeating values like in ID=3
I didn't create the database structure. I can't do much about the table structure as of this point since this is a post production project.
The thing i can change is the StoredProcedure that could retrieve information from this table.
So what i need is a single column sigle row value of the Total of Jobs Applied
So basically what i need based on this example is to get a value of
2 Jobs Applied for Employee ID = 1
i want to ignore the duplicates.
Thank You!
Please feel free to edit/retag
UPDATE
I do need the total of the result,
I need the total count (not the list) of Employees who applied for a specific job.
I tried using count and i'ts not working accordingly, Because it counts also those who are not distinct. Thank you for your kind help
If you need to aggregate on distinct values, then you can write:
select EmployeeId, count(distinct JobsId) as JobsApplied
from EmployeeToJobsApplied
where Applied = 1
group by EmployeeId
Use distinct.
select distinct * from tablename
if you dont need id in your result set then, it's simple to use distinct something like:
SELECT distinct employeeid, JobsId, Applied, Viewed
FROM EmployeeToJobsApplied
additionally you can use the where clause to remove results where Applied is false:
WHERE Applied = true
select distinct EmployeeId, JobsId from EmployeeToJobsApplied
since you don't seem to care about the applied and viewed columns
Instead of distinct we can group by our columns
select id,JobsId,Applied,Viewed from Emp
group by id,JobsId
As distinct hits performance.

Getting the last record in SQL in WHERE condition

i have loanTable that contain two field loan_id and status
loan_id status
==============
1 0
2 9
1 6
5 3
4 5
1 4 <-- How do I select this??
4 6
In this Situation i need to show the last Status of loan_id 1 i.e is status 4. Can please help me in this query.
Since the 'last' row for ID 1 is neither the minimum nor the maximum, you are living in a state of mild confusion. Rows in a table have no order. So, you should be providing another column, possibly the date/time when each row is inserted, to provide the sequencing of the data. Another option could be a separate, automatically incremented column which records the sequence in which the rows are inserted. Then the query can be written.
If the extra column is called status_id, then you could write:
SELECT L1.*
FROM LoanTable AS L1
WHERE L1.Status_ID = (SELECT MAX(Status_ID)
FROM LoanTable AS L2
WHERE L2.Loan_ID = 1);
(The table aliases L1 and L2 could be omitted without confusing the DBMS or experienced SQL programmers.)
As it stands, there is no reliable way of knowing which is the last row, so your query is unanswerable.
Does your table happen to have a primary id or a timestamp? If not then what you want is not really possible.
If yes then:
SELECT TOP 1 status
FROM loanTable
WHERE loan_id = 1
ORDER BY primaryId DESC
-- or
-- ORDER BY yourTimestamp DESC
I assume that with "last status" you mean the record that was inserted most recently? AFAIK there is no way to make such a query unless you add timestamp into your table where you store the date and time when the record was added. RDBMS don't keep any internal order of the records.
But if last = last inserted, that's not possible for current schema, until a PK addition:
select top 1 status, loan_id
from loanTable
where loan_id = 1
order by id desc -- PK
Use a data reader. When it exits the while loop it will be on the last row. As the other posters stated unless you put a sort on the query, the row order could change. Even if there is a clustered index on the table it might not return the rows in that order (without a sort on the clustered index).
SqlDataReader rdr = SQLcmd.ExecuteReader();
while (rdr.Read())
{
}
string lastVal = rdr[0].ToString()
rdr.Close();
You could also use a ROW_NUMBER() but that requires a sort and you cannot use ROW_NUMBER() directly in the Where. But you can fool it by creating a derived table. The rdr solution above is faster.
In oracle database this is very simple.
select * from (select * from loanTable order by rownum desc) where rownum=1
Hi if this has not been solved yet.
To get the last record for any field from a table the easiest way would be to add an ID to each record say pID. Also say that in your table you would like to hhet the last record for each 'Name', run the simple query
SELECT Name, MAX(pID) as LastID
INTO [TableName]
FROM [YourTableName]
GROUP BY [Name]/[Any other field you would like your last records to appear by]
You should now have a table containing the Names in one column and the last available ID for that Name.
Now you can use a join to get the other details from your primary table, say this is some price or date then run the following:
SELECT a.*,b.Price/b.date/b.[Whatever other field you want]
FROM [TableName] a LEFT JOIN [YourTableName]
ON a.Name = b.Name and a.LastID = b.pID
This should then give you the last records for each Name, for the first record run the same queries as above just replace the Max by Min above.
This should be easy to follow and should run quicker as well
If you don't have any identifying columns you could use to get the insert order. You can always do it like this. But it's hacky, and not very pretty.
select
t.row1,
t.row2,
ROW_NUMBER() OVER (ORDER BY t.[count]) AS rownum from (
select
tab.row1,
tab.row2,
1 as [count]
from table tab) t
So basically you get the 'natural order' if you can call it that, and add some column with all the same data. This can be used to sort by the 'natural order', giving you an opportunity to place a row number column on the next query.
Personally, if the system you are using hasn't got a time stamp/identity column, and the current users are using the 'natural order', I would quickly add a column and use this query to create some sort of time stamp/incremental key. Rather than risking having some automation mechanism change the 'natural order', breaking the data needed.
I think this code may help you:
WITH cte_Loans
AS
(
SELECT LoanID
,[Status]
,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS RN
FROM LoanTable
)
SELECT LoanID
,[Status]
FROM LoanTable L1
WHERE RN = ( SELECT max(RN)
FROM LoanTable L2
WHERE L2.LoanID = L1.LoanID)