Count the number of employees in each department and fill in the counts by experience bracket - SQL

I have two tables, Employee and Department. The Employee schema is (Eid, Ename, DOJ, Sal, Dept ID) and the Department schema is (Dept id, Dname). I want to count the number of employees in each department, broken down by experience.
Output:
dept    | 0-5yrs | 5-10yrs | 10-15yrs
HR      | 4      | 9       | 0
Account | 2      | 3       | 1
In other words, 4 employees in the HR department have less than 5 years of experience, 9 have between 5 and 10 years, and none have 10 to 15 years.

You can use the PIVOT operator in Microsoft SQL Server, like this:
select p.DName, p.Under5, p.From5To10, p.MoreThan10
from (
    select d.DName, e.Eid,
           -- approximate whole years of service from the day difference
           case when datediff(day, e.DOJ, getdate()) / 365 < 5 then 'Under5'
                when datediff(day, e.DOJ, getdate()) / 365 >= 10 then 'MoreThan10' -- 10 years or more
                else 'From5To10' end as ExperienceBucket
    from Employee e
    join Department d on e.[Dept Id] = d.[Dept Id]
) as s
pivot (
    count(Eid)
    for ExperienceBucket in (Under5, From5To10, MoreThan10)
) as p

Doing a distinct count on an employee history table, based on departments at a current point in time

I have an employee table with data on every employee since the beginning, and it contains everything I should need: the employee's start date, the end date (null if there is none), and the name of the department. If an employee's department has changed, that employee has a new row with the new department value. Two date columns, "DepValidFrom" and "DepValidto", determine the period during which the employee was in that specific department.
My goal is a matrix with all the departments as rows, year and month as columns, and the number of employees in that department at that time as values. I have all the data; I just cannot find the right way to write my Power BI measure, or perhaps even a SQL query.
I am trying to pull this into Power BI, and I am getting an incomplete view. I want my data to look like the following:
Department | Jan | Feb | Mar | Apr |
Dep1 | 3 | 5 | 6 | 4 |
Dep2 | 2 | 3 | 2 | 3 |
Dep3 | 1 | 1 | 2 | 3 |
Right now I am just using a very simple DISTINCTCOUNT(Emp_Table[EmployeeInitials]), which gives me an incomplete view: it only counts employees on the specific date and doesn't carry the number forward into a total, leaving a bunch of empty values.
I hope someone can understand what I mean and can help!
Thanks!
You can start by unpivoting the dates and generating a query that gives the running number of employees per department and date:
select e.dept, x.dt, sum(cnt) over(partition by dept order by dt) cnt
from employees e
cross apply (values (startdate, 1), (enddate, -1)) as x(dt, cnt)
where dt is not null
Then, you can do conditional aggregation to pivot the results - this requires enumerating the dates though:
select dept,
max(case when dt >= '20200101' and dt < '20200201' then cnt else 0 end) cnt_202001,
max(case when dt >= '20200201' and dt < '20200301' then cnt else 0 end) cnt_202002,
...
from (
select e.dept, x.dt, sum(cnt) over(partition by dept order by dt) cnt
from employees e
cross apply (values (startdate, 1), (enddate, -1)) as x(dt, cnt)
where dt is not null
) t
group by dept
When an employee changes department in the middle of a month, they are counted in both departments for that month.
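Another way to get the same matrix is to test, for each month, whether an employee's department validity interval overlaps that month. The sketch below shows this with an in-memory SQLite database driven from Python. The table layout (Emp_Table with EmployeeInitials, Department, DepValidFrom, DepValidTo), the helper table months and the sample rows are assumptions based on the question, not its real schema. The result is in long format (department, month, headcount), which a Power BI matrix can pivot directly.

```python
import sqlite3

def monthly_headcount(conn, months):
    """months: list of (label, first_day, first_day_of_next_month) as ISO date strings."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS months "
                 "(label TEXT, month_start TEXT, next_start TEXT)")
    conn.execute("DELETE FROM months")
    conn.executemany("INSERT INTO months VALUES (?, ?, ?)", months)
    sql = """
    SELECT e.Department, m.label, COUNT(DISTINCT e.EmployeeInitials) AS headcount
    FROM months m
    JOIN Emp_Table e
      ON e.DepValidFrom < m.next_start                          -- started before the month ended
     AND (e.DepValidTo IS NULL OR e.DepValidTo >= m.month_start) -- still valid when it began
    GROUP BY e.Department, m.label
    ORDER BY e.Department, m.label
    """
    return conn.execute(sql).fetchall()

# Hypothetical sample data: AB moves from Dep1 to Dep2 in mid-February.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp_Table (EmployeeInitials TEXT, Department TEXT,
                        DepValidFrom TEXT, DepValidTo TEXT);
INSERT INTO Emp_Table VALUES
  ('AB', 'Dep1', '2020-01-01', '2020-02-15'),
  ('AB', 'Dep2', '2020-02-16', NULL),
  ('CD', 'Dep1', '2020-02-01', NULL);
""")
months = [
    ("2020-01", "2020-01-01", "2020-02-01"),
    ("2020-02", "2020-02-01", "2020-03-01"),
    ("2020-03", "2020-03-01", "2020-04-01"),
]
result = monthly_headcount(conn, months)
for row in result:
    print(row)
```

As with the query above, an employee who changes department mid-month appears in both departments for that month (AB in February here). Months in which a department has nobody produce no row rather than a zero.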

Vertica SQL for running count distinct and running conditional count

I'm trying to build a department-level score table based on a more detailed product-URL-level score table.
Dates are not consecutive.
Not all URLs get score updates on the same day (they are independent of each other).
dist_url should be a running distinct count (cumulative count distinct).
Both "dist urls" and "urls score >=30" are distinct counts.
What I have now is:
Date   url  Store  Dept  Page  Score
10/1   a    US     A     X     10
10/1   b    US     A     X     30
10/1   c    US     A     X     60
10/4   a    US     A     X     20
10/4   d    US     A     X     60
10/6   b    US     A     X     22
10/9   a    US     A     X     40
10/9   e    US     A     X     10
What I want is:
Date   Store  Dept  Page  dist urls  urls score >=30
10/1   US     A     X     3          2
10/4   US     A     X     4          3
10/6   US     A     X     4          2
10/9   US     A     X     5          2
I think dist_url can be done with a window function, but I'm not sure how to write the query.
My current query is below, but it's wrong because the distinct count is not cumulative:
SELECT
bm.AnalysisDate,
su.SoID AS Store,
su.DptCaID AS DTID,
su.PageTypeID AS PTID,
COUNT(DISTINCT bm.SeoURLID) AS NumURLsWithDupScore,
SUM(CASE WHEN bm.DuplicationScore > 30 THEN 1 ELSE 0 END) AS Over30Count
FROM csn_seo.tblBotifyMetrics bm
INNER JOIN csn_seo.tblSEOURLs su
ON bm.SeoURLID = su.ID
WHERE su.DptCaID IS NOT NULL
AND su.DptCaID <> 0
AND su.PageTypeID IS NOT NULL
AND su.PageTypeID <> -1
AND bm.iscompliant = 1
GROUP BY bm.AnalysisDate, su.SoID, su.DptCaID, su.PageTypeID;
Please let me know if anyone has any idea.
Based on your question, you seem to want two levels of logic: first find the earliest date for each URL (and the earliest date on which it scored 30 or more), then take a running sum of those first appearances:
select date, store, dept, page,
       sum(sum(start)) over (partition by store, dept, page order by date) as distinct_urls,
       sum(sum(start_30)) over (partition by store, dept, page order by date) as distinct_urls_30
from ((select store, dept, page, url, min(date) as date, 1 as start, 0 as start_30
       from t
       group by store, dept, page, url
      ) union all
      (select store, dept, page, url, min(date) as date, 0, 1
       from t
       where score >= 30
       group by store, dept, page, url
      )
     ) t
group by date, store, dept, page;
I don't understand how your query is related to your question, and try as I might, I can't reproduce your expected output either.
But I think you can avoid the UNION of SELECTs. Does this do what you expect?
NULLs are ignored by COUNT(DISTINCT ...), and here you can combine an aggregate expression with an OLAP (window) one.
Vertica also has named windows to improve readability.
WITH
input(Date,url,Store,Dept,Page,Score) AS (
SELECT DATE '2019-10-01','a','US','A','X',10
UNION ALL SELECT DATE '2019-10-01','b','US','A','X',30
UNION ALL SELECT DATE '2019-10-01','c','US','A','X',60
UNION ALL SELECT DATE '2019-10-04','a','US','A','X',20
UNION ALL SELECT DATE '2019-10-04','d','US','A','X',60
UNION ALL SELECT DATE '2019-10-06','b','US','A','X',22
UNION ALL SELECT DATE '2019-10-09','a','US','A','X',40
UNION ALL SELECT DATE '2019-10-09','e','US','A','X',10
)
SELECT
date
, store
, dept
, page
, SUM(COUNT(DISTINCT url) ) OVER(w) AS dist_urls
, SUM(COUNT(DISTINCT CASE WHEN score >=30 THEN url END)) OVER(w) AS dist_urls_gt_30
FROM input
GROUP BY
date
, store
, dept
, page
WINDOW w AS (PARTITION BY store,dept,page ORDER BY date)
;
-- out date | store | dept | page | dist_urls | dist_urls_gt_30
-- out ------------+-------+------+------+-----------+-----------------
-- out 2019-10-01 | US | A | X | 3 | 2
-- out 2019-10-04 | US | A | X | 5 | 3
-- out 2019-10-06 | US | A | X | 6 | 3
-- out 2019-10-09 | US | A | X | 8 | 4
-- out (4 rows)
-- out
-- out Time: First fetch (4 rows): 45.321 ms. All rows formatted: 45.364 ms
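The query above sums each day's distinct count, so a URL that appears on several days is counted more than once (hence 3, 5, 6, 8). If a true cumulative distinct count is wanted, one option is a correlated subquery that counts distinct URLs seen on or before each date. The sketch below runs that idea on the question's sample rows in an in-memory SQLite database from Python; the table name scores and the column name score_date are assumptions. The second column is read here as "URLs that have ever scored 30 or more so far", which is only one possible interpretation and does not reproduce the asker's expected 2, 3, 2, 2.

```python
import sqlite3

RUNNING_DISTINCT_SQL = """
SELECT d.score_date, d.Store, d.Dept, d.Page,
       (SELECT COUNT(DISTINCT t.url) FROM scores t
         WHERE t.Store = d.Store AND t.Dept = d.Dept AND t.Page = d.Page
           AND t.score_date <= d.score_date) AS dist_urls,
       (SELECT COUNT(DISTINCT t.url) FROM scores t
         WHERE t.Store = d.Store AND t.Dept = d.Dept AND t.Page = d.Page
           AND t.score_date <= d.score_date
           AND t.Score >= 30) AS dist_urls_ge_30
FROM (SELECT DISTINCT score_date, Store, Dept, Page FROM scores) d
ORDER BY d.score_date
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score_date TEXT, url TEXT, Store TEXT, "
             "Dept TEXT, Page TEXT, Score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, 'US', 'A', 'X', ?)",
    [
        ("2019-10-01", "a", 10), ("2019-10-01", "b", 30), ("2019-10-01", "c", 60),
        ("2019-10-04", "a", 20), ("2019-10-04", "d", 60),
        ("2019-10-06", "b", 22),
        ("2019-10-09", "a", 40), ("2019-10-09", "e", 10),
    ],
)
running = conn.execute(RUNNING_DISTINCT_SQL).fetchall()
for row in running:
    print(row)
```

On this data dist_urls comes out as 3, 4, 4, 5, matching the "dist urls" column in the question. Correlated subqueries can be slow on large tables, so the first-appearance approach from the earlier answer may scale better.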

How to select items according to their sums in SQL?

I've got the following table:
ID Name Sales
1 Kalle 1
2 Kalle -1
3 Simon 10
4 Simon 20
5 Anna 11
6 Anna 0
7 Tina 0
I want to write a SQL query that only returns the rows that
represents a salesperson with sum of sales > 0.
ID Name Sales
3 Simon 10
4 Simon 20
5 Anna 11
6 Anna 0
Is this possible?
You can easily get the names of the people whose sum of sales is greater than 0 by using a HAVING clause:
select name
from yourtable
group by name
having sum(sales) > 0;
This query will return both Simon and Anna, then if you want to return all of the details for each of these names you can use the above in a WHERE clause to get the final result:
select id, name, sales
from yourtable
where name in (select name
from yourtable
group by name
having sum(sales) > 0);
You can also do it like this. I think the join will be more efficient than the WHERE name IN (...) clause.
SELECT Sales.ID, Sales.name, Sales.sales
FROM Sales
JOIN (SELECT name FROM Sales GROUP BY name HAVING SUM(sales) > 0) AS Sales2
  ON Sales2.name = Sales.name
This will work on databases that support window functions, such as Oracle, SQL Server, and DB2:
SELECT ID, Name, Sales
FROM
(
SELECT ID, Name, Sales, sum(sales) over (partition by name) sum1
FROM <table>
) a
WHERE sum1 > 0
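The HAVING-plus-IN approach from the first answer can be checked against the question's sample data. Here is a small sketch using an in-memory SQLite database from Python; the table name yourtable follows that answer, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, name TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        (1, "Kalle", 1), (2, "Kalle", -1),
        (3, "Simon", 10), (4, "Simon", 20),
        (5, "Anna", 11), (6, "Anna", 0),
        (7, "Tina", 0),
    ],
)

# Keep every row whose salesperson has a positive total.
positive_rows = conn.execute("""
    SELECT id, name, sales
    FROM yourtable
    WHERE name IN (SELECT name
                   FROM yourtable
                   GROUP BY name
                   HAVING SUM(sales) > 0)
    ORDER BY id
""").fetchall()

for row in positive_rows:
    print(row)
```

Kalle (total 0) and Tina (total 0) are filtered out, while Anna's zero-sales row is kept because her total is 11.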

SUM(CASE ...) results not right

I have a contact table that includes the length of time each contact lived in the neighborhood:
ID First_Name Last_Name Neighborhood_Time
1 John Smith 1-2 years
2 Mary Jones 2-5 years
3 Dennis White 2-5 years
4 Martha Olson 5+ years
5 Jeff Black 5+ years
6 Jean Rogers 2-5 years
I want to show the percentage of contacts in each time bracket. The result would look like this:
One_to_2_Years Two_to_5_Years 5+_Years
16 50 33
This is what I'm using:
select
sum(case when Neighborhoods_time ='1-2 years' then 1 else 0 end)*100/(select count(*) from contact) as One_to_2_Years,
sum(case when Neighborhoods_time ='2-5 years' then 1 else 0 end)*100/(select count(*) from contact) as Two_to_6_Years,
sum(case when Neighborhoods_time ='5+years' then 1 else 0 end)*100/(select count(*) from contact) as Six_to_10_Years
from dbo.contact
This is my result:
One_to_2_Years Two_to_5_Years 5+_Years
0 0 16
16 33 0
0 16 16
I see the numbers under each column are correct, I'm having a problem summing them.
What am I missing?
Thanks.
Add Group by Neighborhoods_time
The basis of your query can be produced like this:
select
Neighborhood_Time,
100*COUNT(*)/(Select COUNT(*) from contact) as percentvalue
from
contact
group by
Neighborhood_Time
If you want to arrange it horizontally, then you should use a pivot:
select
*
from
(
select
Neighborhood_Time,
100*COUNT(*)/(Select COUNT(*) from contact) as percentvalue
from
contact
group by
Neighborhood_Time
) src
PIVOT
( SUM(percentvalue) for Neighborhood_Time in ([1-2 years],[2-5 years],[5+ years])) as pt
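For comparison, the single-row layout can also be produced with conditional aggregation and no GROUP BY at all, so that each SUM runs over the whole table. This is a sketch on the question's sample data using an in-memory SQLite database from Python; the column aliases are illustrative (an alias such as 5+_Years would need quoting), and the integer division truncates in the same way as the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (ID INTEGER PRIMARY KEY, First_Name TEXT, "
             "Last_Name TEXT, Neighborhood_Time TEXT)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?, ?)",
    [
        (1, "John", "Smith", "1-2 years"),
        (2, "Mary", "Jones", "2-5 years"),
        (3, "Dennis", "White", "2-5 years"),
        (4, "Martha", "Olson", "5+ years"),
        (5, "Jeff", "Black", "5+ years"),
        (6, "Jean", "Rogers", "2-5 years"),
    ],
)

# No GROUP BY: the three SUMs aggregate over every row and return one row.
percentages = conn.execute("""
    SELECT SUM(CASE WHEN Neighborhood_Time = '1-2 years' THEN 1 ELSE 0 END) * 100 / COUNT(*) AS One_to_2_Years,
           SUM(CASE WHEN Neighborhood_Time = '2-5 years' THEN 1 ELSE 0 END) * 100 / COUNT(*) AS Two_to_5_Years,
           SUM(CASE WHEN Neighborhood_Time = '5+ years'  THEN 1 ELSE 0 END) * 100 / COUNT(*) AS Five_Plus_Years
    FROM contact
""").fetchone()

print(percentages)
```

The string literals must match the stored values exactly, including the space in '5+ years', otherwise that bucket counts zero rows.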

SQL Group By Help Required

I have a table named People in the following format:
Date | Name.
When I count the people, grouping by Date and Name, with
Select Date, Name, count(*)
From People
Group By Date, Name;
I get the following:
Date Name count(*)
10 Peter 25
10 John 30
10 Mark 25
11 Peter 15
11 John 10
11 Mark 5
But I would like the following result:
Date Peter John Mark
10 25 30 25
11 15 10 5
Is this possible? This is a simplified example of a more complicated database. If someone can help me solve this problem, I will apply the same concept to my own table.
Thanks!
Select Date
, count(case when Name = 'Peter' then 1 else null end) as Peter
, count(case when Name = 'John' then 1 else null end) as John
, count(case when Name = 'Mark' then 1 else null end) as Mark
From People
Group By Date;
Another option, different from turbanoff's, if for some reason you find yourself in a situation where you can't apply a GROUP BY:
Select distinct P.Date,
(select count(*) from People where date=p.date and name='Peter') as Peter,
(select count(*) from People where date=p.date and name='John') as John,
(select count(*) from People where date=p.date and name='Mark') as Mark
From People P
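Both answers hard-code the names. When the set of names is not known in advance, the column list has to be generated before the query runs. Below is a minimal sketch of that idea using Python and an in-memory SQLite database; the function name pivot_counts and the sample rows are made up for the illustration. Name values are passed as bound parameters, and the column aliases are quoted as identifiers.

```python
import sqlite3

def pivot_counts(conn):
    """Return (header, rows) with one count column per distinct Name in People."""
    names = [r[0] for r in conn.execute("SELECT DISTINCT Name FROM People ORDER BY Name")]
    select_list = ["Date"] + [
        # Value goes in as a parameter; alias is a quoted identifier.
        'COUNT(CASE WHEN Name = ? THEN 1 END) AS "' + n.replace('"', '""') + '"'
        for n in names
    ]
    sql = "SELECT " + ", ".join(select_list) + " FROM People GROUP BY Date ORDER BY Date"
    cur = conn.execute(sql, names)
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()

# Hypothetical sample data, smaller than the counts in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Date INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?)",
    [(10, "Peter"), (10, "Peter"), (10, "John"), (11, "Mark"), (11, "Peter")],
)
header, rows = pivot_counts(conn)
print(header)
for row in rows:
    print(row)
```

The generated statement is the same conditional-aggregation query as in the first answer, just with one COUNT(CASE ...) per name found in the table.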