SQL Query For Grouping - sql

Hi, I need help with a query.
I have a table that links Jobs and Employees, called EmployeeToJobsApplied:
Id  EmployeeId  JobsId  Applied  Viewed
1   1           1       True     True
2   1           2       False    True
3   1           1       True     True
4   1           3       True     True
Notice that there are repeated values: the row with Id = 3 duplicates the row with Id = 1.
I didn't create the database structure, and I can't do much about the table structure at this point since this is a post-production project.
The only thing I can change is the stored procedure that retrieves information from this table.
So what I need is a single-column, single-row value: the total number of jobs applied for.
Based on this example, I need to get a value of
2 Jobs Applied for Employee ID = 1
I want to ignore the duplicates.
Thank You!
Please feel free to edit/retag
UPDATE
I do need the total of the result.
I need the total count (not the list) of Employees who applied for a specific job.
I tried using count and it's not working as expected, because it also counts the rows that are not distinct. Thank you for your kind help.

If you need to aggregate on distinct values, then you can write:
select EmployeeId, count(distinct JobsId) as JobsApplied
from EmployeeToJobsApplied
where Applied = 1
group by EmployeeId
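A minimal sketch of the query above, run against the sample rows from the question. It uses Python's sqlite3 module purely as a stand-in for the asker's database (an assumption; the original engine is not stated), with True/False stored as 1/0. The second query is one possible reading of the UPDATE in the question: the number of distinct employees who applied for a given job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeToJobsApplied ("
    "Id INTEGER PRIMARY KEY, EmployeeId INTEGER, JobsId INTEGER, "
    "Applied INTEGER, Viewed INTEGER)"
)
# Sample rows from the question; True/False stored as 1/0.
conn.executemany(
    "INSERT INTO EmployeeToJobsApplied VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1, 1), (2, 1, 2, 0, 1), (3, 1, 1, 1, 1), (4, 1, 3, 1, 1)],
)

# Distinct jobs applied for, per employee (duplicate rows are ignored).
jobs_applied = conn.execute(
    "SELECT EmployeeId, COUNT(DISTINCT JobsId) AS JobsApplied "
    "FROM EmployeeToJobsApplied WHERE Applied = 1 GROUP BY EmployeeId"
).fetchall()

# Distinct employees who applied for one specific job (here JobsId = 1).
applicants_for_job = conn.execute(
    "SELECT COUNT(DISTINCT EmployeeId) FROM EmployeeToJobsApplied "
    "WHERE JobsId = ? AND Applied = 1",
    (1,),
).fetchone()[0]

print(jobs_applied)        # [(1, 2)]
print(applicants_for_job)  # 1
```

Employee 1 has three applied rows (jobs 1, 1 and 3), so the distinct count is 2, which matches the value the question asks for.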

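Use distinct, but leave out the Id column: select distinct * would still return every row, because Id is unique.
select distinct EmployeeId, JobsId, Applied, Viewed from EmployeeToJobsApplied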

If you don't need Id in your result set, you can simply use distinct, something like:
SELECT distinct EmployeeId, JobsId, Applied, Viewed
FROM EmployeeToJobsApplied
Additionally, you can add a where clause to remove results where Applied is false:
WHERE Applied = true

select distinct EmployeeId, JobsId from EmployeeToJobsApplied
since you don't seem to care about the applied and viewed columns

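Instead of distinct, we can group by the columns we want to keep unique:
select EmployeeId, JobsId, Applied, Viewed from EmployeeToJobsApplied
group by EmployeeId, JobsId, Applied, Viewed
Distinct can hurt performance on some engines, although many optimizers plan both forms identically.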

Related

How to create a new table that only keeps rows with more than 5 data records under the same id in Bigquery

I have a table like this:
Id  Date        Steps  Distance
1   2016-06-01  1000   1
There are over 1000 records and 50 Ids in this table. Most Ids have about 20 records, but some have only 1 or 2, which I think are useless.
I want to create a table that excludes the Ids with fewer than 5 records.
I wrote this code to find the ids that I want to exclude:
SELECT
Id,
COUNT(Id) AS num_id
FROM `table`
GROUP BY
Id
ORDER BY
num_id
Since there are only two Ids I need to exclude, I used a WHERE clause:
CREATE TABLE `` AS
SELECT
*
FROM ``
WHERE
Id <> 2320127002
AND Id <> 7007744171
Although I can get the result I want, I think there are better ways to solve this kind of problem. For example, what should I do if there are over 20 Ids with fewer than 5 records in this table? Thank you.
Consider this:
CREATE TABLE `filtered_table` AS
SELECT *
FROM `table`
WHERE TRUE
QUALIFY COUNT(*) OVER (PARTITION BY Id) >= 5
Note: You can remove WHERE TRUE if it runs successfully without it.
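For engines without QUALIFY, the same filter can be sketched with GROUP BY ... HAVING in a subquery. The example below uses Python's sqlite3 module as a stand-in for BigQuery; the table names activity and filtered_activity and the sample rows are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; columns follow the question.
conn.execute(
    "CREATE TABLE activity (Id INTEGER, Date TEXT, Steps INTEGER, Distance REAL)"
)
rows = [(1, f"2016-06-0{d}", 1000 * d, float(d)) for d in range(1, 6)]  # 5 records
rows += [(2, "2016-06-01", 500, 0.5), (2, "2016-06-02", 700, 0.7)]      # 2 records
conn.executemany("INSERT INTO activity VALUES (?, ?, ?, ?)", rows)

# Keep only the rows whose Id appears at least 5 times.
conn.execute(
    "CREATE TABLE filtered_activity AS "
    "SELECT * FROM activity WHERE Id IN ("
    "  SELECT Id FROM activity GROUP BY Id HAVING COUNT(*) >= 5)"
)

kept = conn.execute(
    "SELECT Id, COUNT(*) FROM filtered_activity GROUP BY Id ORDER BY Id"
).fetchall()
print(kept)  # [(1, 5)]
```

Either form scales to any number of sparse Ids, since none of them have to be listed by hand.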

Need sql query to pull back data that meets several groups of criteria from same table in one query

I need to write an sql query that will pull back the data that meets several groups of criteria from the same table. The easiest way to describe is to imagine using an SQL "in" clause but instead of the internals of that clause being "or"s joining the parameters you want it to match it is instead an "and".
I attempted to use count to verify the correct amount of data was pulled back for each "in" statement but the count can't always be trusted due to other entries being similar for each column.
A sample table might be this:
id   count  animal
---  -----  ------
1    5      puppy
1    6      cat
1    6      puppy
So, now I need a query that will pull back all entries with an id of 1 and a count of 5 and 6 and an animal of puppy and cat. I pretty much need to verify the entire path of the table entry to know I want to pull it back. Is there any built in function that can do this? Do I need to use a recursive CTE to dig deep after confirming that one set of criteria is met? Thanks for any help.
If I got it right
with cnt as(
select id
from tbl
where [count] in (5,6) and animal in ('puppy', 'cat')
group by id
having count(distinct [count]) = 2 and count(distinct animal) = 2
)
select id, [count], animal
from tbl
where id in (select id from cnt);
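A small sketch of how this "must match every value" query behaves, using Python's sqlite3 module as a stand-in engine (so the column is quoted as "count" rather than [count]). The rows for id 2 are hypothetical, added only to show an id being rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE tbl (id INTEGER, "count" INTEGER, animal TEXT)')
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, 5, "puppy"), (1, 6, "cat"), (1, 6, "puppy"),  # rows from the question
        (2, 5, "puppy"), (2, 5, "cat"),                   # hypothetical: no count of 6
    ],
)

query = """
WITH cnt AS (
    SELECT id FROM tbl
    WHERE "count" IN (5, 6) AND animal IN ('puppy', 'cat')
    GROUP BY id
    HAVING COUNT(DISTINCT "count") = 2 AND COUNT(DISTINCT animal) = 2
)
SELECT id, "count", animal FROM tbl
WHERE id IN (SELECT id FROM cnt)
ORDER BY id, "count", animal
"""
matches = conn.execute(query).fetchall()
print(matches)  # [(1, 5, 'puppy'), (1, 6, 'cat'), (1, 6, 'puppy')]
```

The HAVING clause is what turns the two IN lists into "and" conditions: an id qualifies only if both counts and both animals are present among its rows.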
It's a bit unclear what exactly you're looking for, but can't you just combine ORs and ANDs?
select id, count, animal
from table
where id = 1 and
(count = 5 or count = 6) and
(animal = 'puppy' or animal = 'cat')
I think you just want:
select t.*
from t
where id = 1 and
count in (5, 6) and
animal in ('puppy', 'cat');
EDIT:
If you want them all in the same row, just rearrange the conditions:
select t.*
from t
where id = 1 and
( (count = 5 and animal = 'puppy') or
(count = 6 and animal = 'cat')
);

Query to find duplicate values for two fields

Sorry for the title, but I didn't know how to explain it.
I have a table that has 2 fields, A and B.
I want to find all rows in the table that have a duplicate A (more than one record), but A should be considered a duplicate only if B differs between the rows.
Example:
Field A  Field B
10       10
10       10   // This is not a duplicate
10       10
10       5    // This is a duplicate
How can I do this in a single query?
Let's break this down into how you would go about constructing such a query. You don't make it clear whether you're looking for all values of A or all rows but let's assume all values of A initially.
The first step therefore is to create a list of all values of A. This can be done two ways, DISTINCT or GROUP BY. I'm going to use GROUP BY because of what else you want to do:
select a
from your_table
group by a
This returns a single column that is unique on A. Now, how can you change this to give you the unique values? The most obvious thing to use is the HAVING clause, which allows you to restrict on aggregated values. For instance the following will give you all values of A which only appear once in the table
select a
from your_table
group by a
having count(*) = 1
That is, the number of rows in each group of A is 1. You don't want this, of course; you want to apply the condition to column B. There needs to be more than one value of B for the situation you want to identify to be possible (if there's only one value of B, it's impossible). This gets us to
select a
from your_table
group by a
having count(b) > 1
This still isn't enough, as you want two different values of B. The above just counts the number of records in the group with a non-null B. Inside an aggregate function you can use the DISTINCT keyword to count only unique values, bringing us to:
select a
from your_table
group by a
having count(distinct b) > 1
To transcribe this into English: select all unique values of A from YOUR_TABLE that have more than one value of B in the group.
You can use this method, or something similar, to build up your own queries as you create them. Determine what you want to achieve and slowly build up to it.
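The difference between the last two steps can be checked with a small sketch. It uses Python's sqlite3 module as a stand-in engine; the rows with A = 20 are hypothetical, added to show a value of A that repeats without B ever differing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        (10, 10), (10, 10), (10, 10), (10, 5),  # rows from the question
        (20, 7), (20, 7),                        # hypothetical: repeated A, same B
    ],
)

# COUNT(b) > 1: any A that occurs more than once, whatever B is.
any_repeat = [r[0] for r in conn.execute(
    "SELECT a FROM your_table GROUP BY a HAVING COUNT(b) > 1 ORDER BY a")]

# COUNT(DISTINCT b) > 1: only A values that have at least two different Bs.
differing_b = [r[0] for r in conn.execute(
    "SELECT a FROM your_table GROUP BY a HAVING COUNT(DISTINCT b) > 1 ORDER BY a")]

print(any_repeat)   # [10, 20]
print(differing_b)  # [10]
```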
select FIELD from your_table group by FIELD having count(b) > 1
Take into consideration that count(b) counts every row in the group, duplicates included.
For example, if you have the values
1
1
2
1
the count is 3 for value 1, not 2.

SQL Server Sum multiple rows into one - no temp table

I would like to see the most concise way to do what is outlined in this SO question: Sum values from multiple rows into one row
that is, combine multiple rows while summing a column.
But how do I then delete the duplicates? In other words, I have data like this:
Person  Value
--------------
1       10
1       20
2       15
And I want to sum the values for any duplicates (on the Person col) into a single row and get rid of the other duplicates on the Person value. So my output would be:
Person  Value
-------------
1       30
2       15
And I would like to do this without using a temp table. I think that I'll need to use OVER PARTITION BY but just not sure. Just trying to challenge myself in not doing it the temp table way. Working with SQL Server 2008 R2
Simply put, give me a concise stmt getting from my input to my output in the same table. So if my table name is People if I do a select * from People on it before the operation that I am asking in this question I get the first set above and then when I do a select * from People after the operation, I get the second set of data above.
Not sure why you don't want to use a temp table, but here's one way to avoid it (though IMHO this is overkill):
UPDATE MyTable SET VALUE = (SELECT SUM(Value) FROM MyTable MT WHERE MT.Person = MyTable.Person);
WITH DUP_TABLE AS
(SELECT ROW_NUMBER()
OVER (PARTITION BY Person ORDER BY Person) As ROW_NO
FROM MyTable)
DELETE FROM DUP_TABLE WHERE ROW_NO > 1;
First query updates every duplicate person to the summary value. Second query removes duplicate persons.
Demo: http://sqlfiddle.com/#!3/db7aa/11
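For readers without SQL Server at hand, here is a rough sketch of the same in-place idea using Python's sqlite3 module. It is a variant, not the answer's exact statements: SQLite cannot delete through a CTE, so this version picks one "keeper" row per person by its implicit rowid (an SQLite-specific assumption), writes the total into that row, and deletes the rest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Person INTEGER, Value INTEGER)")
conn.executemany("INSERT INTO People VALUES (?, ?)", [(1, 10), (1, 20), (2, 15)])

# One keeper row per person: the row with the smallest rowid.
keepers = "SELECT MIN(rowid) FROM People GROUP BY Person"

# Step 1: write each person's total into that person's keeper row.
conn.execute(
    "UPDATE People SET Value = "
    "(SELECT SUM(p.Value) FROM People AS p WHERE p.Person = People.Person) "
    f"WHERE rowid IN ({keepers})"
)
# Step 2: delete every row that is not a keeper.
conn.execute(f"DELETE FROM People WHERE rowid NOT IN ({keepers})")

result = conn.execute("SELECT Person, Value FROM People ORDER BY Person").fetchall()
print(result)  # [(1, 30), (2, 15)]
```

In real use the two statements would normally be wrapped in a single transaction so the table is never visible in its half-updated state.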
All you're asking for is a simple SUM() aggregate function and a GROUP BY
SELECT Person, SUM(Value)
FROM myTable
GROUP BY Person
The SUM() by itself would sum up the values in a column, but when you add a secondary column and GROUP BY it, SQL will show distinct values from the secondary column and perform the aggregate function by those distinct categories.
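A minimal sketch of that query on the sample data, with Python's sqlite3 module standing in for SQL Server. Note that it only returns the summed rows; it does not modify the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (Person INTEGER, Value INTEGER)")
conn.executemany("INSERT INTO myTable VALUES (?, ?)", [(1, 10), (1, 20), (2, 15)])

# One output row per distinct Person, with that person's values summed.
totals = conn.execute(
    "SELECT Person, SUM(Value) FROM myTable GROUP BY Person ORDER BY Person"
).fetchall()
print(totals)  # [(1, 30), (2, 15)]
```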

How do I check if all posts from a joined table have the same value in a column?

I'm building a BI report for a client where there is a 1-n related join involved.
The joined table has a field for employee ID (EmplId).
The query that I've built for this report is supposed to give a 1 in its field "OneEmployee" if all the related posts have the same employee in the EmplId field, and null if they have different employees, e.g.:
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'John'
This should give a 1 in the said field in the query
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'George'
This should leave the said field blank
The idea is to create a field where a case function checks this and returns the correct value. My problem is whether there is a way to check for this in SQL.
select not count(*) from your_table
where employee_id = GIVEN_ID
and your_field not in ( select min(your_field)
from your_table
where employee_id = GIVEN_ID);
Note: my first idea was to use LIMIT 1 in the inner query, but MySQL didn't like it, so MIN it was. The point is to pick any value, but only one. MIN should work, but the field should be indexed; then this query will execute rather fast, as only indexes would be used (obviously employee_id should also be indexed).
Note 2: Don't be confused by the NOT in front of count(*). You want 1 when no row differs; I count the differing rows and return NOT count(*), which is 1 if the count is 0, and 0 otherwise.
Seems a job for a window COUNT():
SELECT
…,
CASE COUNT(DISTINCT TaskTransHours.EmplId) OVER () WHEN 1 THEN 1 END
AS OneEmployee
FROM …
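Where a windowed COUNT(DISTINCT ...) is not available, one possible alternative is to compute the flag per task with GROUP BY. The sketch below uses Python's sqlite3 module as a stand-in engine; TaskId is a hypothetical join key, since the question does not name the column that links TaskTransHours to TaskTrans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# TaskId is a hypothetical join key; the question does not name it.
conn.execute("CREATE TABLE TaskTransHours (TaskId INTEGER, EmplId TEXT)")
conn.executemany(
    "INSERT INTO TaskTransHours VALUES (?, ?)",
    [(1, "John"), (1, "John"), (2, "John"), (2, "George")],
)

# 1 when a task has exactly one distinct employee, NULL otherwise.
flags = conn.execute(
    "SELECT TaskId, "
    "CASE WHEN COUNT(DISTINCT EmplId) = 1 THEN 1 END AS OneEmployee "
    "FROM TaskTransHours GROUP BY TaskId ORDER BY TaskId"
).fetchall()
print(flags)  # [(1, 1), (2, None)]
```

The grouped result could then be joined back to the main report query on the task key.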