I have two tables, Employee and Department. The Employee schema is (Eid, Ename, DOJ, Sal, Dept ID) and the Department schema is (Dept id, Dname). I want to count the number of employees in each department, grouped by years of experience.
Desired output:
dept    | 0-5yrs | 5-10yrs | 10-15yrs
HR      | 4      | 9       | 0
Account | 2      | 3       | 1
That is, 4 employees in the HR department have less than 5 years of experience, 9 have between 5 and 10 years, and 0 have between 10 and 15 years.
You can use the PIVOT operator in Microsoft SQL Server, like this:
select p.DName, p.Under5, p.From5To10, p.MoreThan10
from (
    select d.DName,
           -- approximate whole years of service from the day count
           case
               when datediff(day, e.DOJ, getdate()) / 365 < 5 then 'Under5'
               when datediff(day, e.DOJ, getdate()) / 365 >= 10 then 'MoreThan10'
               else 'From5To10'
           end as ExperienceBucket
    from Employee e
    join Department d on e.[Dept Id] = d.[Dept Id]
) as s
pivot (
    count(ExperienceBucket)
    for ExperienceBucket in (Under5, From5To10, MoreThan10)
) as p
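If PIVOT is not available, the same buckets can be counted with conditional aggregation. The following is only a minimal sketch, run through Python's built-in sqlite3 module so it is self-contained. The sample rows, the DeptId column name (without the space), and the fixed AS_OF reference date are assumptions made for illustration, and julianday() stands in for SQL Server's datediff().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department (DeptId INTEGER PRIMARY KEY, DName TEXT);
CREATE TABLE Employee (Eid INTEGER PRIMARY KEY, Ename TEXT, DOJ TEXT, DeptId INTEGER);
INSERT INTO Department VALUES (1, 'HR'), (2, 'Account');
-- hypothetical sample rows
INSERT INTO Employee VALUES
  (1, 'Ann', '2022-03-01', 1),
  (2, 'Bob', '2016-06-15', 1),
  (3, 'Cid', '2012-01-10', 2),
  (4, 'Dee', '2021-09-30', 2);
""")

# Fixed reference date (an assumption) so the result is reproducible;
# in practice you would use the current date.
AS_OF = "2024-01-01"

query = """
SELECT d.DName,
       SUM(CASE WHEN yrs < 5 THEN 1 ELSE 0 END)               AS Under5,
       SUM(CASE WHEN yrs >= 5 AND yrs < 10 THEN 1 ELSE 0 END) AS From5To10,
       SUM(CASE WHEN yrs >= 10 THEN 1 ELSE 0 END)             AS MoreThan10
FROM (
    -- approximate years of service from the day difference
    SELECT DeptId, (julianday(:as_of) - julianday(DOJ)) / 365.25 AS yrs
    FROM Employee
) e
JOIN Department d ON d.DeptId = e.DeptId
GROUP BY d.DName
ORDER BY d.DName
"""
rows = conn.execute(query, {"as_of": AS_OF}).fetchall()
for row in rows:
    print(row)
```

Each SUM(CASE ...) column plays the role of one value in the PIVOT IN list, so adding a bucket means adding one more column expression.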
I have an employee table with data on all employees since the beginning, and it contains everything I should need: the employee's start date and end date (null if there is none) and the name of the department. If an employee changes department, that employee gets a new row with the new department value, and two date columns called "DepValidFrom" and "DepValidTo" give the period during which the employee was in that specific department.
My goal is a matrix with all the departments as rows, year and month as columns, and the number of employees in each department at that time as values. I have all the data; I just cannot find the right way to write my Power BI measure, or perhaps even a SQL query.
I am trying to pull this into Power BI, and I am getting an incomplete view. I want my data to look like the following:
Department | Jan | Feb | Mar | Apr |
Dep1 | 3 | 5 | 6 | 4 |
Dep2 | 2 | 3 | 2 | 3 |
Dep3 | 1 | 1 | 2 | 3 |
Right now I am using a very simple DISTINCTCOUNT(Emp_Table[EmployeeInitials]), which gives me an incomplete view: it only counts employees on the specific date and does not carry the number forward into a running total, leaving a bunch of empty values.
I hope someone can understand what I mean, and that someone can help!
Thanks!
You can start by unpivoting the dates and generating a query that gives the number of employees per department and date:
select e.dept, x.dt, sum(cnt) over(partition by dept order by dt) cnt
from employees e
cross apply (values (startdate, 1), (enddate, -1)) as x(dt, cnt)
where dt is not null
Then, you can do conditional aggregation to pivot the results - this requires enumerating the dates though:
select dept,
max(case when dt >= '20200101' and dt < '20200201' then cnt else 0 end) cnt_202001,
max(case when dt >= '20200201' and dt < '20200301' then cnt else 0 end) cnt_202002,
...
from (
select e.dept, x.dt, sum(cnt) over(partition by dept order by dt) cnt
from employees e
cross apply (values (startdate, 1), (enddate, -1)) as x(dt, cnt)
where dt is not null
) t
group by dept
When an employee changes departments in the middle of a month, they are counted in both departments for that month.
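For comparison, here is a sketch of a different technique from the running-sum query above: join each month to the department intervals that overlap it and count distinct employees. It runs on SQLite through Python's sqlite3 module. The emp and months tables, their column names, and the sample rows are all hypothetical; valid_from and valid_to stand in for DepValidFrom and DepValidTo from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- one row per employee per department stint (hypothetical sample data)
CREATE TABLE emp (initials TEXT, dept TEXT, valid_from TEXT, valid_to TEXT);
INSERT INTO emp VALUES
  ('AA', 'Dep1', '2020-01-01', NULL),
  ('BB', 'Dep1', '2020-01-15', '2020-02-10'),
  ('BB', 'Dep2', '2020-02-11', NULL),
  ('CC', 'Dep2', '2020-03-01', NULL);

-- small calendar table: one row per month to report on
CREATE TABLE months (month_start TEXT, month_end TEXT);
INSERT INTO months VALUES
  ('2020-01-01', '2020-01-31'),
  ('2020-02-01', '2020-02-29'),
  ('2020-03-01', '2020-03-31');
""")

query = """
SELECT e.dept, m.month_start, COUNT(DISTINCT e.initials) AS headcount
FROM months m
JOIN emp e
  ON e.valid_from <= m.month_end                              -- stint started by month end
 AND (e.valid_to IS NULL OR e.valid_to >= m.month_start)      -- and had not ended before it
GROUP BY e.dept, m.month_start
ORDER BY e.dept, m.month_start
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this sample, BB moves from Dep1 to Dep2 in February and so appears in both departments for that month, which is the same mid-month behaviour noted above. ISO-formatted date strings are used so that plain string comparison orders them correctly in SQLite.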
I'm trying to build a department-level score table from a more granular product-URL-level score table.
Dates are not consecutive.
Not all URLs get score updates on the same day (they are independent of each other).
dist_url should be a running (cumulative) distinct count.
Both dist urls and urls score >=30 are distinct counts.
What I have now is:
Date url Store Dept Page Score
10/1 a US A X 10
10/1 b US A X 30
10/1 c US A X 60
10/4 a US A X 20
10/4 d US A X 60
10/6 b US A X 22
10/9 a US A X 40
10/9 e US A X 10
What I want is:
Date Store Dept Page dist urls urls score >=30
10/1 US A X 3 2
10/4 US A X 4 3
10/6 US A X 4 2
10/9 US A X 5 2
I think dist_url can be computed with a window function; I'm just not sure how to write the query.
My current query is below, but it is wrong because the distinct count is not cumulative:
SELECT
bm.AnalysisDate,
su.SoID AS Store,
su.DptCaID AS DTID,
su.PageTypeID AS PTID,
COUNT(DISTINCT bm.SeoURLID) AS NumURLsWithDupScore,
SUM(CASE WHEN bm.DuplicationScore > 30 THEN 1 ELSE 0 END) AS Over30Count
FROM csn_seo.tblBotifyMetrics bm
INNER JOIN csn_seo.tblSEOURLs su
ON bm.SeoURLID = su.ID
WHERE su.DptCaID IS NOT NULL
AND su.DptCaID <> 0
AND su.PageTypeID IS NOT NULL
AND su.PageTypeID <> -1
AND bm.iscompliant = 1
GROUP BY bm.AnalysisDate, su.SoID, su.DptCaID, su.PageTypeID;
Please let me know if anyone has any idea.
Based on your question, you seem to want two levels of logic:
select date, store, dept, page,
sum(sum(start)) over (partition by dept, page order by date) as distinct_urls,
sum(sum(start_30)) over (partition by dept, page order by date) as distinct_urls_30
from ((select store, dept, page, url, min(date) as date, 1 as start, 0 as start_30
from t
group by store, dept, page, url
) union all
(select store, dept, page, url, min(date) as date, 0, 1
from t
where score >= 30
group by store, dept, page, url
)
) t
group by date, store, dept, page;
I don't understand how your query relates to your question.
Try as I might, I can't reproduce your output either.
But I think you can avoid the UNION SELECTs. Does this do what you expect?
NULLs are ignored by COUNT(DISTINCT ...), and here you can combine an aggregate expression with an OLAP (window) one.
Vertica also has named windows to improve readability.
WITH
input(Date,url,Store,Dept,Page,Score) AS (
SELECT DATE '2019-10-01','a','US','A','X',10
UNION ALL SELECT DATE '2019-10-01','b','US','A','X',30
UNION ALL SELECT DATE '2019-10-01','c','US','A','X',60
UNION ALL SELECT DATE '2019-10-04','a','US','A','X',20
UNION ALL SELECT DATE '2019-10-04','d','US','A','X',60
UNION ALL SELECT DATE '2019-10-06','b','US','A','X',22
UNION ALL SELECT DATE '2019-10-09','a','US','A','X',40
UNION ALL SELECT DATE '2019-10-09','e','US','A','X',10
)
SELECT
date
, store
, dept
, page
, SUM(COUNT(DISTINCT url) ) OVER(w) AS dist_urls
, SUM(COUNT(DISTINCT CASE WHEN score >=30 THEN url END)) OVER(w) AS dist_urls_gt_30
FROM input
GROUP BY
date
, store
, dept
, page
WINDOW w AS (PARTITION BY store,dept,page ORDER BY date)
;
-- out date | store | dept | page | dist_urls | dist_urls_gt_30
-- out ------------+-------+------+------+-----------+-----------------
-- out 2019-10-01 | US | A | X | 3 | 2
-- out 2019-10-04 | US | A | X | 5 | 3
-- out 2019-10-06 | US | A | X | 6 | 3
-- out 2019-10-09 | US | A | X | 8 | 4
-- out (4 rows)
-- out
-- out Time: First fetch (4 rows): 45.321 ms. All rows formatted: 45.364 ms
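Note that the output above (3, 5, 6, 8) is a running sum of per-date distinct counts, so a URL that reappears on a later date is counted again. A cumulative distinct count should count each URL only on the date it first appears. Below is a minimal sketch of that idea on SQLite via Python's sqlite3 module, using the sample rows from the question. The table name t and the CTE names firsts and dates are assumptions, and only the dist_urls column is covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (date TEXT, url TEXT, store TEXT, dept TEXT, page TEXT, score INTEGER);
INSERT INTO t VALUES
  ('2019-10-01', 'a', 'US', 'A', 'X', 10),
  ('2019-10-01', 'b', 'US', 'A', 'X', 30),
  ('2019-10-01', 'c', 'US', 'A', 'X', 60),
  ('2019-10-04', 'a', 'US', 'A', 'X', 20),
  ('2019-10-04', 'd', 'US', 'A', 'X', 60),
  ('2019-10-06', 'b', 'US', 'A', 'X', 22),
  ('2019-10-09', 'a', 'US', 'A', 'X', 40),
  ('2019-10-09', 'e', 'US', 'A', 'X', 10);
""")

query = """
WITH firsts AS (
    -- the first date on which each URL appears in its store/dept/page group
    SELECT store, dept, page, url, MIN(date) AS first_date
    FROM t
    GROUP BY store, dept, page, url
),
dates AS (
    -- every reporting date per group
    SELECT DISTINCT date, store, dept, page FROM t
)
SELECT d.date, d.store, d.dept, d.page,
       COUNT(f.url) AS dist_urls          -- URLs first seen on or before this date
FROM dates d
JOIN firsts f
  ON f.store = d.store AND f.dept = d.dept AND f.page = d.page
 AND f.first_date <= d.date
GROUP BY d.date, d.store, d.dept, d.page
ORDER BY d.date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this gives 3, 4, 4, 5 for dist_urls, matching the desired output in the question. The same first-appearance idea is what the two-level UNION ALL query above relies on.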
I've got the following table:
ID Name Sales
1 Kalle 1
2 Kalle -1
3 Simon 10
4 Simon 20
5 Anna 11
6 Anna 0
7 Tina 0
I want to write a SQL query that returns only the rows that belong to a salesperson whose sum of sales is > 0.
ID Name Sales
3 Simon 10
4 Simon 20
5 Anna 11
6 Anna 0
Is this possible?
You can easily get the names of the people whose sum of sales is greater than 0 by using a HAVING clause:
select name
from yourtable
group by name
having sum(sales) > 0;
This query returns both Simon and Anna. If you want all of the details for each of these names, you can use the query above in a WHERE clause to get the final result:
select id, name, sales
from yourtable
where name in (select name
from yourtable
group by name
having sum(sales) > 0);
You can also do it like this; I think the join will be more efficient than the WHERE name IN (...) clause.
SELECT Sales.id, Sales.name, Sales.sales
FROM Sales
JOIN (SELECT name FROM Sales GROUP BY name HAVING SUM(sales) > 0) AS Sales2 ON Sales2.name = Sales.name
This will work on databases that support window functions, such as Oracle, SQL Server, and DB2:
SELECT ID, Name, Sales
FROM
(
SELECT ID, Name, Sales, sum(sales) over (partition by name) sum1
FROM <table>
) a
WHERE sum1 > 0
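To check that the IN/HAVING form and the window-function form return the same rows, here is a small sketch that runs both against the sample data from the question, using Python's sqlite3 module. The table name yourtable follows the first answer. SQLite itself is an assumption here, and its window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourtable (id INTEGER PRIMARY KEY, name TEXT, sales INTEGER);
INSERT INTO yourtable VALUES
  (1, 'Kalle', 1), (2, 'Kalle', -1),
  (3, 'Simon', 10), (4, 'Simon', 20),
  (5, 'Anna', 11), (6, 'Anna', 0),
  (7, 'Tina', 0);
""")

# Variant 1: filter names with HAVING, then fetch their detail rows.
in_rows = conn.execute("""
    SELECT id, name, sales
    FROM yourtable
    WHERE name IN (SELECT name FROM yourtable
                   GROUP BY name
                   HAVING SUM(sales) > 0)
    ORDER BY id
""").fetchall()

# Variant 2: attach the per-name total to every row with a window
# function, then filter on it in the outer query.
window_rows = conn.execute("""
    SELECT id, name, sales
    FROM (SELECT id, name, sales,
                 SUM(sales) OVER (PARTITION BY name) AS total
          FROM yourtable) AS a
    WHERE total > 0
    ORDER BY id
""").fetchall()

print(in_rows)
print(window_rows)
```

Kalle (sum 0) and Tina (sum 0) are dropped by both variants, while Anna's row with sales 0 is kept because her total is 11.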
I have a contact table that includes the length of time each contact lived in the neighborhood:
ID First_Name Last_Name Neighborhood_Time
1 John Smith 1-2 years
2 Mary Jones 2-5 years
3 Dennis White 2-5 years
4 Martha Olson 5+ years
5 Jeff Black 5+ years
6 Jean Rogers 2-5 years
I want to show the percentage of contacts in each time bracket; the result would look like this:
One_to_2_Years Two_to_5_Years 5+_Years
16 50 33
This is what I'm using:
select
sum(case when Neighborhoods_time ='1-2 years' then 1 else 0 end)*100/(select count(*) from contact) as One_to_2_Years,
sum(case when Neighborhoods_time ='2-5 years' then 1 else 0 end)*100/(select count(*) from contact) as Two_to_6_Years,
sum(case when Neighborhoods_time ='5+years' then 1 else 0 end)*100/(select count(*) from contact) as Six_to_10_Years
from dbo.contact
This is my result:
One_to_2_Years Two_to_5_Years 5+_Years
0 0 16
16 33 0
0 16 16
I can see that the numbers under each column are correct; my problem is summing them into a single row.
What am I missing?
Thanks.
Add Group by Neighborhoods_time
The basis of your query can be written like this:
select
Neighborhood_Time,
100*COUNT(*)/(Select COUNT(*) from contact) as percentvalue
from
contact
group by
Neighborhood_Time
If you want to arrange it horizontally, you can use a pivot:
select
*
from
(
select
Neighborhood_Time,
100*COUNT(*)/(Select COUNT(*) from contact) as percentvalue
from
contact
group by
Neighborhood_Time
) src
PIVOT
( SUM(percentvalue) for Neighborhood_Time in ([1-2 years],[2-5 years],[5+ years])) as pt
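As a cross-check, the single-row result can also be produced without PIVOT, by keeping the conditional sums and making sure the query has no GROUP BY. This is a sketch on SQLite via Python's sqlite3 module, using the six sample contacts from the question. The lowercase column aliases are assumptions, since an alias like 5+_Years would need quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name TEXT,
    neighborhood_time TEXT
);
INSERT INTO contact VALUES
  (1, 'John',   'Smith',  '1-2 years'),
  (2, 'Mary',   'Jones',  '2-5 years'),
  (3, 'Dennis', 'White',  '2-5 years'),
  (4, 'Martha', 'Olson',  '5+ years'),
  (5, 'Jeff',   'Black',  '5+ years'),
  (6, 'Jean',   'Rogers', '2-5 years');
""")

# No GROUP BY: every SUM runs over the whole table, so exactly one row
# comes back. Integer division truncates, as in the expected output.
row = conn.execute("""
    SELECT
      SUM(CASE WHEN neighborhood_time = '1-2 years' THEN 1 ELSE 0 END) * 100 / COUNT(*) AS one_to_2_years,
      SUM(CASE WHEN neighborhood_time = '2-5 years' THEN 1 ELSE 0 END) * 100 / COUNT(*) AS two_to_5_years,
      SUM(CASE WHEN neighborhood_time = '5+ years'  THEN 1 ELSE 0 END) * 100 / COUNT(*) AS five_plus_years
    FROM contact
""").fetchone()
print(row)
```

The string literals must match the stored values exactly, including the space in '5+ years'.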
I have a table named People in the following format:
Date | Name.
When I count the people, grouping by date and name, with
Select Date, Name, count(*)
From People
Group By Date, Name;
I get the following:
Date Name count(*)
10 Peter 25
10 John 30
10 Mark 25
11 Peter 15
11 John 10
11 Mark 5
But I would like the following result:
Date Peter John Mark
10 25 30 25
11 15 10 5
Is this possible? This is a simplified example of a more complicated database. If someone can help me solve this, I will apply the same idea to my real table.
Thanks!
Select Date
, count(case when Name = 'Peter' then 1 else null end) as Peter
, count(case when Name = 'John' then 1 else null end) as John
, count(case when Name = 'Mark' then 1 else null end) as Mark
From People
Group By Date;
Another option, different from turbanoff's, in case you find yourself in a situation where you can't apply a GROUP BY:
Select distinct(P.Date),
(select count(*) from People where date=p.date and name='Peter') as Peter,
(select count(*) from People where date=p.date and name='John') as John,
(select count(*) from People where date=p.date and name='Mark') as Mark
From People P
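Both answers hard-code the names. If the list of names is not known in advance, one option is to read the distinct names first and generate one COUNT(CASE ...) column per name. The sketch below does this from Python with the built-in sqlite3 module. The sample rows are made up (smaller counts than in the question), and building the column list in application code is an assumption about how you would run the query, not something the answers above prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (date INTEGER, name TEXT)")
# hypothetical sample rows: one row per person per occurrence
sample = (
    [(10, "Peter")] * 2 + [(10, "John")] * 3 + [(10, "Mark")]
    + [(11, "Peter")] + [(11, "Mark")] * 2
)
conn.executemany("INSERT INTO people VALUES (?, ?)", sample)

# Step 1: discover the pivot columns.
names = [r[0] for r in conn.execute(
    "SELECT DISTINCT name FROM people ORDER BY name")]

# Step 2: one conditional COUNT per name. Values are bound as parameters;
# only the alias is interpolated, with double quotes escaped.
cols = ", ".join(
    'COUNT(CASE WHEN name = ? THEN 1 END) AS "{}"'.format(n.replace('"', '""'))
    for n in names
)
query = f"SELECT date, {cols} FROM people GROUP BY date ORDER BY date"

cur = conn.execute(query, names)
header = [c[0] for c in cur.description]
rows = cur.fetchall()
print(header)
for row in rows:
    print(row)
```

COUNT ignores NULLs, so a CASE with no ELSE branch counts only the matching rows, and a name with no rows on a given date shows up as 0.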