As the title says the query needs to combine multiple select queries. The question is as follows:
Display the total number of employees, and of that total the number of employees hired in 1995,1996,1997,1998.
My query:
select (select count(*) from employees) as "Total",
(select count(*) from employees where hire_date between 'JAN-1-0095' and 'DEC-1-0095')as "1995",
(select count(*) from employees where hire_date between 'JAN-1-0096' and 'DEC-1-0096') as "1996",
(select count(*) from employees where hire_date between 'JAN-1-0097' and 'DEC-1-0097') as "1997",
(select count(*) from employees where hire_date between 'JAN-1-0098' and 'DEC-1-0098') as "1998"
from employees
But the issue is that instead of returning a single record, this query is executed for every record in the table, so the same result row is repeated once per employee.
You can use conditional counting:
select count(*) as total_count,
count(case when extract(year from hire_date) = 1995 then 1 end) as "1995",
count(case when extract(year from hire_date) = 1996 then 1 end) as "1996",
count(case when extract(year from hire_date) = 1997 then 1 end) as "1997",
count(case when extract(year from hire_date) = 1998 then 1 end) as "1998"
from employees;
This makes use of the fact that aggregate functions ignore NULL values, and therefore count() will only count those rows where the case expression returns a non-null value.
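For anyone who wants to try the conditional counting idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration, and SQLite's strftime('%Y', ...) stands in for Oracle's extract(year from ...), so treat it as an illustration of the technique rather than the exact Oracle query.

```python
import sqlite3

# In-memory database with a hypothetical employees table (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, hire_date TEXT)")
conn.executemany(
    "INSERT INTO employees (hire_date) VALUES (?)",
    [("1995-03-01",), ("1995-11-20",), ("1996-07-15",), ("1998-01-05",), ("2001-09-09",)],
)

# Each CASE yields NULL when the year does not match, and COUNT skips NULLs,
# so every column counts only the rows for its own year in a single table pass.
row = conn.execute(
    """
    SELECT COUNT(*) AS total_count,
           COUNT(CASE WHEN strftime('%Y', hire_date) = '1995' THEN 1 END) AS "1995",
           COUNT(CASE WHEN strftime('%Y', hire_date) = '1996' THEN 1 END) AS "1996",
           COUNT(CASE WHEN strftime('%Y', hire_date) = '1997' THEN 1 END) AS "1997",
           COUNT(CASE WHEN strftime('%Y', hire_date) = '1998' THEN 1 END) AS "1998"
    FROM employees
    """
).fetchone()

print(row)  # (5, 2, 1, 0, 1)
```

The query returns exactly one row because it is a plain aggregate with no GROUP BY.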
Your query returns one row for each row in the employees table because you do not apply any grouping. Each select is a scalar sub-select that gets executed for each and every row in the employees table.
You could make it only return a single row if you replace the final from employees with from dual - but you'd still count over all rows within each sub-select.
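The row multiplication is easy to reproduce. The sketch below uses Python's sqlite3 module with a three-row hypothetical employees table. SQLite has no DUAL table, so a SELECT without a FROM clause plays that role here; that substitution is an assumption of this example, not Oracle syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, hire_date TEXT)")
conn.executemany(
    "INSERT INTO employees (hire_date) VALUES (?)",
    [("1995-03-01",), ("1996-07-15",), ("1998-01-05",)],
)

scalar = "(SELECT COUNT(*) FROM employees)"

# Selecting the scalar sub-select FROM employees evaluates it once per employee row.
per_row = conn.execute(f"SELECT {scalar} AS total FROM employees").fetchall()

# Without a driving table (the SQLite stand-in for Oracle's "from dual") there is one row.
single = conn.execute(f"SELECT {scalar} AS total").fetchall()

print(per_row)  # [(3,), (3,), (3,)]
print(single)   # [(3,)]
```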
You should also avoid implicit data type conversion like you did. 'JAN-1-0095' is a string and will implicitly be converted to a date depending on your NLS settings. Your query would not run if executed from my computer (because of different NLS settings).
As you are looking for a complete year, just comparing the year is a bit shorter to write and easier to understand (at least in my eyes).
Another option would be to use proper date literals, e.g. where hire_date between DATE '1995-01-01' and DATE '1995-12-31' or a bit more verbose using Oracle's to_date() function: where hire_date between to_date('1995-01-01', 'yyyy-mm-dd') and to_date('1995-12-31', 'yyyy-mm-dd')
Assuming the years are really what you want, the problem with your query is that you are selecting from employees, so you get a row for each one. You could use:
select (select count(*) from employees) as "Total",
(select count(*) from employees where hire_date between 'JAN-1-0095' and 'DEC-1-0095')as "1995",
(select count(*) from employees where hire_date between 'JAN-1-0096' and 'DEC-1-0096') as "1996",
(select count(*) from employees where hire_date between 'JAN-1-0097' and 'DEC-1-0097') as "1997",
(select count(*) from employees where hire_date between 'JAN-1-0098' and 'DEC-1-0098') as "1998"
from dual;
And I would use date '1998-01-01' for the date constants.
However, I prefer @a_horse_with_no_name's solution.
You should avoid using a lot of subqueries. You should try this:
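SQL Server:
SELECT year(hire_date) AS hire_year, count(*) AS Total
FROM employees
WHERE year(hire_date) IN (1995, 1996, 1997, 1998)
GROUP BY year(hire_date) WITH CUBE
In Oracle:
SELECT extract(year from hire_date) AS hire_year, count(*) AS Total
FROM employees
WHERE extract(year from hire_date) IN (1995, 1996, 1997, 1998)
GROUP BY CUBE (extract(year from hire_date))
Grouping by the hire year rather than by the full hire_date gives one row per year. In addition to those rows, the CUBE extension generates a grand-total row (with a NULL hire_year). Note that this total covers only the four filtered years, not every employee in the table.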
Here is the table. My initial thought is to find which employees have a salary that increases every year, but I am confused about how to do that. Employee 1 is the only employee whose salary increases in each of the three years, but I am not sure how to single them out. Thanks!
You can do this using lead()/lag() and aggregation:
select employee_id
from (select t.*,
lag(salary) over (partition by employee_id order by year) as prev_salary
from t
) t
group by employee_id
having min(salary - prev_salary) > 0 and
count(*) = 3;
This compares salaries in adjacent years and returns employees where the value is always increasing. It assumes that there are no gaps in the years -- as in your sample data.
The advantage of this approach is that you don't need to know the years in advance.
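Here is a small runnable sketch of the lag() plus aggregation query, using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The table name t matches the query above, but the three employees and their salaries are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (employee_id INTEGER, year INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 2018, 100), (1, 2019, 110), (1, 2020, 120),  # increases every year
        (2, 2018, 100), (2, 2019, 120), (2, 2020, 110),  # drops in 2020
        (3, 2018, 100), (3, 2019, 100), (3, 2020, 110),  # flat in 2019
    ],
)

# prev_salary is NULL for each employee's first year, and MIN ignores that NULL,
# so the HAVING clause only looks at the real year-over-year differences.
rows = conn.execute(
    """
    SELECT employee_id
    FROM (SELECT t.*,
                 LAG(salary) OVER (PARTITION BY employee_id ORDER BY year) AS prev_salary
          FROM t
         ) t
    GROUP BY employee_id
    HAVING MIN(salary - prev_salary) > 0 AND COUNT(*) = 3
    ORDER BY employee_id
    """
).fetchall()

print(rows)  # [(1,)]
```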
One option uses aggregation:
select employee_id
from mytable t
group by employee_id
having max(salary) filter(where year = 2020) > max(salary) filter(where year = 2019)
and max(salary) filter(where year = 2019) > max(salary) filter(where year = 2018)
This returns employees whose 2020 salary is greater than their 2019 salary, and whose 2019 salary is greater than their 2018 salary - which is how I understood your question.
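A quick way to check the filtered-aggregate version is the sketch below, again with Python's sqlite3 module. It assumes SQLite 3.30 or newer, where FILTER is accepted on ordinary aggregates, and the rows in mytable are invented. Employee 3 has no 2018 row on purpose, to show that a missing year makes the comparison NULL and drops the employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (employee_id INTEGER, year INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 2018, 100), (1, 2019, 110), (1, 2020, 120),  # strictly increasing
        (2, 2018, 100), (2, 2019, 120), (2, 2020, 110),  # 2020 is lower than 2019
        (3, 2019, 100), (3, 2020, 130),                  # no 2018 row at all
    ],
)

# Each filtered MAX picks out the salary of one specific year per employee.
rows = conn.execute(
    """
    SELECT employee_id
    FROM mytable t
    GROUP BY employee_id
    HAVING MAX(salary) FILTER (WHERE year = 2020) > MAX(salary) FILTER (WHERE year = 2019)
       AND MAX(salary) FILTER (WHERE year = 2019) > MAX(salary) FILTER (WHERE year = 2018)
    ORDER BY employee_id
    """
).fetchall()

print(rows)  # [(1,)]
```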
I guess you don't want to hardcode the years in your query.
The best choice is the use of the window function LAG() to get the salary of the previous 2 years, but you should also check that the 3 years that you check are consecutive:
SELECT DISTINCT employee_id
FROM (
SELECT *,
LAG(year, 1) OVER (PARTITION BY employee_id ORDER BY year) year1,
LAG(salary, 1) OVER (PARTITION BY employee_id ORDER BY year) salary1,
LAG(year, 2) OVER (PARTITION BY employee_id ORDER BY year) year2,
LAG(salary, 2) OVER (PARTITION BY employee_id ORDER BY year) salary2
FROM tablename t
)
WHERE year1 = year - 1 AND year2 = year - 2 AND salary > salary1 AND salary1 > salary2
If you want to check only for the current year and the 2 previous years then add 1 more condition in the WHERE clause:
...AND year = strftime('%Y', CURRENT_DATE)
so you don't need to hardcode the current year.
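To see why the consecutive-year check matters, here is a minimal sqlite3 sketch in Python (SQLite 3.25+ for window functions). The sample data is invented: employee 2 has three increasing salaries, but with a gap between 2016 and 2019, so it should not be reported.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (employee_id INTEGER, year INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (1, 2018, 100), (1, 2019, 110), (1, 2020, 120),  # consecutive and increasing
        (2, 2016, 100), (2, 2019, 110), (2, 2020, 120),  # increasing, but with a gap
        (3, 2018, 100), (3, 2019, 90), (3, 2020, 95),    # consecutive, not increasing
    ],
)

# year1/year2 carry the years of the two previous rows, so the WHERE clause can
# insist that they really are the two years immediately before the current one.
rows = conn.execute(
    """
    SELECT DISTINCT employee_id
    FROM (
        SELECT *,
               LAG(year, 1)   OVER (PARTITION BY employee_id ORDER BY year) AS year1,
               LAG(salary, 1) OVER (PARTITION BY employee_id ORDER BY year) AS salary1,
               LAG(year, 2)   OVER (PARTITION BY employee_id ORDER BY year) AS year2,
               LAG(salary, 2) OVER (PARTITION BY employee_id ORDER BY year) AS salary2
        FROM tablename t
    )
    WHERE year1 = year - 1 AND year2 = year - 2
      AND salary > salary1 AND salary1 > salary2
    ORDER BY employee_id
    """
).fetchall()

print(rows)  # [(1,)]
```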
I am getting 'not a valid month' error for the following code:
SELECT last_name, employee_id, hire_date
FROM employees
WHERE EXTRACT(YEAR FROM TO_DATE(hire_date, 'DD-MON-RR')) > 1998
ORDER BY hire_date;
I don’t see the point of using extract() here. It is suboptimal, because the database needs to apply the function to all values in the column before it is able to filter. I would recommend direct filtering against a literal date:
where hire_date >= date '1999-01-01'
This predicate would take advantage of an index on hire_date. You can even add more columns to the index to entirely cover the query like: (hire_date, last_name, employee_id).
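The difference between the two predicates can be observed in a query plan. The sketch below is only an analogy: it uses SQLite through Python's sqlite3 module instead of Oracle, the index name idx_hire is made up, and strftime() stands in for extract(). It compares how the planner treats a plain range predicate versus a function wrapped around the column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER, last_name TEXT, hire_date TEXT)"
)
# Covering index in the column order suggested above (index name is hypothetical).
conn.execute("CREATE INDEX idx_hire ON employees (hire_date, last_name, employee_id)")

def plan(where_clause):
    """Return the first line of SQLite's query plan for the given WHERE clause."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT last_name, employee_id, hire_date FROM employees WHERE " + where_clause
    ).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable plan detail

# Comparing the bare column lets the planner seek into the index.
range_plan = plan("hire_date >= '1999-01-01'")

# Wrapping the column in a function forces every row to be evaluated.
func_plan = plan("CAST(strftime('%Y', hire_date) AS INTEGER) > 1998")

print(range_plan)
print(func_plan)
```

In SQLite the first plan is reported as a SEARCH (an index seek) and the second as a SCAN (every entry is visited). Oracle's plan output looks different, but the underlying reason is the same.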
Assuming this is the employees table from Oracle's tutorial, hire_date is already a date column. You don't need to use to_date on it:
SELECT last_name, employee_id, hire_date
FROM employees
WHERE EXTRACT(YEAR FROM hire_date) > 1998
ORDER BY hire_date;
I have made a table which has 3 columns: Transaction, EmpID, and Date, each holding some data. I need to write a query whose result shows how many transactions were done in each month by each EmpID.
So the result table must be like:
So how to get month wise number of transactions per EmpID?
Get all employees (I assume you have an employee table).
Get all months (you could get them from the transaction table).
Cross join the two in order to get all combinations, because you want to show these in your result.
Outer join the transaction counts per month and employee (an aggregation subquery). SQL Server's date to string conversion is a bit awkward compared to other DBMS. You need to convert to a predefined format and use a substring of that.
Use COALESCE to turn null (for no count for the employee and month) to zero.
The query:
select m.month, e.empid, coalesce(t.cnt, 0) as transaction_count
from employees e
cross join (select distinct convert(varchar(7), t.date, 126) as month from transactions t) m
left join
(
select convert(varchar(7), t.date, 126) as month, empid, count(*) as cnt
from transactions t
group by convert(varchar(7), t.date, 126), empid
) t on t.empid = e.empid and t.month = m.month
order by m.month, e.empid;
If you don't want all employees, but only those that have at least one transaction in some month, then replace from employees e with from (select distinct empid from transactions) e.
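The cross join, outer join and COALESCE steps can be tried end to end with the sketch below, written against SQLite via Python's sqlite3 module. SQLite's strftime('%Y-%m', ...) replaces SQL Server's convert(varchar(7), ..., 126), and the table contents and the trans_id column name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empid INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE transactions (trans_id INTEGER PRIMARY KEY, empid INTEGER, date TEXT)")
conn.executemany("INSERT INTO employees VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO transactions (empid, date) VALUES (?, ?)",
    [(1, "2019-10-05"), (1, "2019-10-20"), (2, "2019-11-02")],
)

# employees x months gives every combination; the left join attaches the counts
# that exist, and COALESCE turns the missing ones into 0.
rows = conn.execute(
    """
    SELECT m.month, e.empid, COALESCE(t.cnt, 0) AS transaction_count
    FROM employees e
    CROSS JOIN (SELECT DISTINCT strftime('%Y-%m', date) AS month FROM transactions) m
    LEFT JOIN (
        SELECT strftime('%Y-%m', date) AS month, empid, COUNT(*) AS cnt
        FROM transactions
        GROUP BY strftime('%Y-%m', date), empid
    ) t ON t.empid = e.empid AND t.month = m.month
    ORDER BY m.month, e.empid
    """
).fetchall()

print(rows)
# [('2019-10', 1, 2), ('2019-10', 2, 0), ('2019-11', 1, 0), ('2019-11', 2, 1)]
```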
SELECT Date, EmpID, COUNT(transaction)
FROM your_table
GROUP BY Date, EmpID
To get the date like "YYYY-MM" use
for MySQL (replace the string with the date column):
DATE_FORMAT('2019-11-25 10:49:30.000', '%Y-%m')
for oracle:
to_char(date, 'YYYY-MM')
I have an employee table with a hire_date column.
I am stuck on a query involving a date function. The hire date is stored in a DATE column, and I am using DATE_FORMAT to retrieve the number of employees hired in each month, but SQL Server does not support the DATE_FORMAT function.
I'm using SQL Server.
Query: list the number of employees hired in each month, in ascending order.
select date_format(hire_date,'%b') month, count(*)
from employee
group by DATE_FORMAT(hire_date,'%b')
order by month
date_format(hire_date,'%b') in MySQL returns the abbreviated month name. You can get the same result in SQL Server by combining DATENAME with LEFT.
select LEFT(DATENAME(MONTH,hire_date),3) month, count(*)
from employee
group by LEFT(DATENAME(MONTH,hire_date),3)
order by month
select Month(Hire_Date),count('x') 'count' from employee
group by Month(Hire_Date)
order by Month(Hire_Date) asc
Instead, you can directly use:
select MONTH(hire_date), count(*)
from employee
group by MONTH(hire_date)
order by MONTH(hire_date)
or
select DATEPART(MONTH, hire_date), count(*)
from employee
group by DATEPART(MONTH, hire_date)
order by DATEPART(MONTH, hire_date)
I have a table of employees that contains names, dates of employment and some more information.
I want to check in which year the most employees were hired.
I wrote a query which counts the hires for each year:
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date);
The result of this query is a set of rows:
YEAR | EMPL_NUMBER
1993 | 3
1997 | 2
and so on...
And now I want to get max of EMPL_NUMBER:
SELECT YEAR, MAX(EMPL_NUMBER)
FROM (SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date));
And then I get an error:
ORA-00937: not a single-group group function
I don't understand why I get an error, because the subquery returns rows with 2 columns.
You are using an aggregate function in the select list alongside a non-aggregated column, so if you need every distinct YEAR you must group by it:
SELECT T.YEAR, MAX(T.EMPL_NUMBER)
FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
) T
GROUP BY T.YEAR ;
Otherwise, if you need the year of the MAX(EMPL_NUMBER), you could use:
SELECT T.YEAR, T.EMPL_NUMBER
FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
) T
WHERE (T.EMPL_NUMBER) IN (SELECT MAX(EMPL_NUMBER)
FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
) T1 )
In Oracle 12C, you can do:
SELECT EXTRACT(YEAR FROM e1.empl_date) AS YEAR, COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT(YEAR FROM e1.empl_date)
ORDER BY COUNT(e1.id_empl) DESC
FETCH FIRST 1 ROW ONLY;
One way to do this is to use an aggregate query as you were doing already, and then to use aggregate functions to their full extent. For example, using the FIRST/LAST function (and using the SCOTT schema, EMP table for illustration):
select min(extract(year from hiredate)) keep (dense_rank last order by count(empno)) as yr,
max(count(empno)) as emp_count
from emp
group by extract(year from hiredate)
;
YR EMP_COUNT
---- ---------
1981 10
There are two problems with this solution. First, many developers (including many experienced ones) seem unaware of the FIRST/LAST function, or otherwise unwilling to use it. The other, more serious problem is that in this problem it is possible that there are several years with the same, highest number of hires. The problem requirement must be more detailed than in the Original Post. What is the desired output when there are ties for first place?
The query above returns the earliest of all the different years when the max hires were achieved. Change MIN in the SELECT clause to MAX and you will get the most recent year when the highest number of hires happened. However, often we want a query that, in the case of ties, will return all the years tied for most hires. That cannot be done with the FIRST/LAST function.
For that, a compact solution would add an analytic function to your original query, to rank the years by number of hires. Then in an outer query just filter for the rows where rank = 1.
select yr, emp_count
from (
select extract(year from hiredate) as yr, count(empno) as emp_count,
rank() over (order by count(empno) desc) as rnk
from emp
group by extract(year from hiredate)
)
where rnk = 1
;
Or, using the max() analytic function in the SELECT clause of the subquery (instead of a rank-type analytic function):
select yr, emp_count
from (
select extract(year from hiredate) as yr, count(empno) as emp_count,
max(count(empno)) over () as max_count
from emp
group by extract(year from hiredate)
)
where emp_count = max_count
;
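Tie handling is easy to verify on a tiny data set. The sketch below uses Python's sqlite3 module (SQLite 3.25+ for window functions) with an invented emp table in which 1981 and 1982 tie for the most hires. As a conservative choice for SQLite, the per-year counts are computed in a CTE first and rank() is applied on top of it, instead of nesting the aggregate inside the analytic function as the Oracle queries do; the filtering idea is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, hiredate TEXT)")
conn.executemany(
    "INSERT INTO emp (hiredate) VALUES (?)",
    [("1980-12-17",), ("1981-02-20",), ("1981-09-08",), ("1982-01-23",), ("1982-12-09",)],
)

# Step 1 (CTE): hires per year. Step 2: rank the years by that count.
# Step 3: keep every year ranked first, so ties are all returned.
rows = conn.execute(
    """
    WITH per_year AS (
        SELECT strftime('%Y', hiredate) AS yr, COUNT(empno) AS emp_count
        FROM emp
        GROUP BY strftime('%Y', hiredate)
    )
    SELECT yr, emp_count
    FROM (
        SELECT yr, emp_count,
               RANK() OVER (ORDER BY emp_count DESC) AS rnk
        FROM per_year
    )
    WHERE rnk = 1
    ORDER BY yr
    """
).fetchall()

print(rows)  # [('1981', 2), ('1982', 2)]
```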
I assume you want a single row showing the year in which the most people were hired:
SELECT * FROM (
SELECT EXTRACT (YEAR FROM e1.empl_date) AS YEAR,
COUNT(e1.id_empl) AS EMPL_NUMBER
FROM employees e1
GROUP BY EXTRACT (YEAR FROM e1.empl_date)
ORDER BY COUNT(*) DESC)
WHERE ROWNUM=1;