must appear in the GROUP BY clause in postgresql - sql

I am getting this error:
ERROR: column "programmer.pname" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select pname, min(age(doj)) from programmer ;
I have a table called programmer with date columns dob and doj.
Here doj is the date of joining.
I want to find the least experienced programmer of all the programmers.
This is my attempt:
SELECT pname, min(age(doj)) FROM programmer;
and I got the error above.
What is programmer.pname, and what is the correct query for this?

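The GROUP BY clause is used together with aggregate functions to group the result set by one or more columns. Every column in the select list that is not inside an aggregate function must appear in GROUP BY, which is why your query fails on pname.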
select pname, min(age(doj))
from programmer
group by pname
To find the least experienced programmer of all the programmers:
select pname
,min(age(doj)) mindoj
from programmer
group by pname
order by mindoj limit 1
or
select pname,doj
from programmer
order by doj desc limit 1 -- the latest doj means the least experience
There may be more than one least experienced programmer (programmers who joined on the same latest date). In that case use this:
select pname,doj
from programmer
where doj=(select max(doj) from programmer)
What is that programmer.pname and what is the correct query for the above?
programmer.pname is the column pname qualified with its table name: tablename.columnname.
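As a rough check of the tie-handling query, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite only stands in for PostgreSQL here, and the sample names and dates are made up. Note that the smallest age(doj) belongs to the latest doj, so the sketch filters on MAX(doj).

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE programmer (pname TEXT, doj TEXT)")
conn.executemany(
    "INSERT INTO programmer VALUES (?, ?)",
    [("anna", "2015-03-01"), ("bob", "2021-06-15"), ("cara", "2021-06-15")],
)

# Least experienced = most recent date of joining; ties are all returned.
least_experienced = conn.execute(
    "SELECT pname, doj FROM programmer "
    "WHERE doj = (SELECT MAX(doj) FROM programmer) "
    "ORDER BY pname"
).fetchall()
print(least_experienced)
```

Both bob and cara come back because they share the latest joining date; a plain LIMIT 1 would return only one of them.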

Related

Microsoft SQL server select statements on multiple tables?

I've been struggling with SELECT statements on multiple tables:
Employee table
Employee_ID
First_Name
Last_Name
Assignment table
Assignment_ID
Employee_ID
Host_Country_arrival_Date
Host_Country_departure_Date
Scheduled_End_Date
I'm being asked to write a query that displays the employee's full name, the number of days between the host country arrival date and the host country departure date, and the number of days between today's date and the assignment's scheduled end date, with the results sorted by host country arrival date, oldest date on top.
Also, I'm not familiar with sorting in SQL Server.
Here's my query; I've been getting syntax errors:
SELECT
First_Name
Last_Name
FROM Employee
SELECT
Host_Country_Arrival_Date
Host_Country_Departure_Date
FROM Assignment;
Basically, your code is running two different queries. The first gets all the employees' names, and the second gets the assignment dates. (The columns in each SELECT list also need commas between them.)
What you'll want to do here is take advantage of the relationship between the tables using a JOIN. That is basically saying "Give me all employees and all of their assignments". So, for each assignment that an employee has, the result will contain a row with the employee's name and the assignment info.
To get the difference in days, use DATEDIFF with three parameters: the unit in which to calculate the difference, the start date, and the end date. It subtracts the first date from the second and returns the result in the selected unit.
And finally the sorting: just add ORDER BY followed by each column that you want to sort on, and specify whether you want ascending (ASC) or descending (DESC) order.
Here is how I would answer if that question were posed to me in a coding challenge:
SELECT
CONCAT(E.First_Name, ' ', E.Last_Name) AS FullName,
DATEDIFF(DAY, A.Host_Country_Arrival_Date, A.Host_Country_Departure_Date) AS DaysInHostCountry,
DATEDIFF(DAY, GETDATE(), A.Scheduled_End_Date) AS DaysTillScheduledEnd
FROM Employee AS E -- aliases keep the query short
INNER JOIN
Assignment AS A
ON E.Employee_ID = A.Employee_ID -- read up on joins; there is a lot of material available, and they are essential going forward with SQL
ORDER BY A.Host_Country_Arrival_Date ASC -- the column to sort by; ASC puts the oldest date on top
You should use a JOIN to link the tables together on Employee_ID:
SELECT
First_Name,
Last_Name,
Host_Country_Arrival_Date,
Host_Country_Departure_Date
FROM Employee
JOIN Assignment ON Assignment.Employee_ID = Employee.Employee_ID;
What this is saying basically is that for each employee, go out to the assignments table and add the assignments for that employee. If there are multiple assignments, the employee columns will be repeated on each row with the assignment columns for the assignment.
You need to read up on JOIN and GROUP BY; see the linked tutorial for reference.
For now you may try this:
SELECT
CONCAT(Emp.First_Name, ' ', Emp.Last_Name) AS FullName,
DATEDIFF(DAY, Host_Country_Arrival_Date, Host_Country_Departure_Date) AS DaysInHostCountry,
DATEDIFF(DAY, GETDATE(), Scheduled_End_Date) AS DaysTillScheduledEnd
FROM Employee AS Emp
INNER JOIN Assignment AS Assign ON Emp.Employee_ID = Assign.Employee_ID
ORDER BY Host_Country_Arrival_Date ASC
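The join, the two day counts, and the oldest-first sort can be tried out with a small sketch like the one below. It uses Python's sqlite3 module as a stand-in for SQL Server, so julianday() differences replace DATEDIFF and || replaces CONCAT; the employees, dates, and the fixed "today" value are all made-up assumptions chosen to keep the result deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (Employee_ID INTEGER, First_Name TEXT, Last_Name TEXT);
CREATE TABLE Assignment (
    Assignment_ID INTEGER, Employee_ID INTEGER,
    Host_Country_Arrival_Date TEXT, Host_Country_Departure_Date TEXT,
    Scheduled_End_Date TEXT);
-- Hypothetical sample rows.
INSERT INTO Employee VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
INSERT INTO Assignment VALUES
    (10, 1, '2023-05-01', '2023-05-31', '2024-01-20'),
    (11, 2, '2022-01-10', '2022-01-15', '2024-01-15');
""")

today = "2024-01-10"  # fixed stand-in for GETDATE()

rows = conn.execute("""
SELECT E.First_Name || ' ' || E.Last_Name AS FullName,
       CAST(julianday(A.Host_Country_Departure_Date)
            - julianday(A.Host_Country_Arrival_Date) AS INTEGER) AS DaysInHostCountry,
       CAST(julianday(A.Scheduled_End_Date)
            - julianday(?) AS INTEGER) AS DaysTillScheduledEnd
FROM Employee AS E
INNER JOIN Assignment AS A ON E.Employee_ID = A.Employee_ID
ORDER BY A.Host_Country_Arrival_Date ASC  -- oldest arrival first
""", (today,)).fetchall()

for row in rows:
    print(row)
```

Each assignment produces one output row carrying the employee's name, which is the behaviour the JOIN explanation describes.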

What does GROUP BY on multiple columns mean?

I use Oracle 11g. I have read a lot of articles about this, but I don't understand
exactly how it happens in the database. Let's say we have two tables:
select * from Employee
select * from student
When we want to group by multiple columns:
SELECT SUBJECT, YEAR, Count(*)
FROM Student
GROUP BY SUBJECT, YEAR;
So my question is: what exactly happens in the database? Is count(*) computed first for every column in the group by, and then the result is sorted? Or something else? Can anyone explain it in detail?
SQL is a declarative language, not a procedural language.
What the query does is determine all rows in the original data where the group by keys are the same. It then reduces them to one row.
For example, in your data, these rows all have the same subject and year:
subject  year  name
English  1     Harsh
English  1     Pratik
English  1     Ramesh
You are saying to group by subject, year, so these become:
Subject  Year  Count(*)
English  1     3
Often, this aggregation is implemented using sorting. However, that is up to the database, and there are many other algorithms. You cannot assume that the database will sort the data. But if it is easier for you to think of it that way, you can imagine the data being sorted by the group by keys in order to identify the groups. Just one caution: the returned rows are not necessarily in any particular order (unless your query includes an order by).
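To see the grouping on a concrete table, here is a small sketch using Python's sqlite3 module in place of Oracle. The three English / year 1 rows come from the example above; the other two rows are made up so that more than one group appears. The ORDER BY is there only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (subject TEXT, year INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        ("English", 1, "Harsh"),
        ("English", 1, "Pratik"),
        ("English", 1, "Ramesh"),
        ("English", 2, "Mina"),   # hypothetical extra row
        ("Maths", 1, "Asha"),     # hypothetical extra row
    ],
)

# One output row per distinct (subject, year) pair.
counts = conn.execute(
    "SELECT subject, year, COUNT(*) FROM Student "
    "GROUP BY subject, year "
    "ORDER BY subject, year"
).fetchall()
print(counts)
```

Rows land in the same group only when both subject and year match, so English year 1 and English year 2 are counted separately.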

Select department id, seniority based on hire_time and the earliest hire date

I have this task:
Select the department id, the longest time worked in months, and the hire date of the person who was hired first.
I wrote something like this:
SELECT department_id, min(hire_date) as earliest_hire_date, sysdate-hire_date dni
FROM employees
It generates the following error:
ORA-00937: not a single-group group function
00937. 00000 - "not a single-group group function"
I use Oracle. Do you have any idea how to fix it?
You need to find a record that meets a certain criterion (which generally means you are going to need a WHERE clause). The criterion for the WHERE clause is that the record has the min(hire_date). So:
SELECT department_id, hire_date, sysdate-hire_date
FROM employees
WHERE hire_date = (SELECT min(hire_date) FROM employees);
Or something along those lines. If this returns more than one record, you could probably just toss a GROUP BY or DISTINCT in there. If that doesn't squash the multiples, then you'll have to implement further logic to pick the winner (in the event that multiple departments share the same "earliest hiring date").
You could also do something similar to what you were doing (Aggregating at min(hire_date)) and then order by that field picking out the top most record.
As for the error you were facing in your attempt, when you use an Aggregate function like sum(), avg(), min() and similar, then fields that are not being aggregated must be in the GROUP BY clause, which was absent in your query.
You would fix your problem by using min() in the select:
select department_id, min(hire_date) as earliest_hire_date,
sysdate - min(hire_date) as dni
from employees
group by department_id;
I'm not sure if this does what you really need -- your question doesn't have sample data, desired results, or a clear explanation of what you want. On the other hand, it does fix the syntax error.
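A minimal sketch of the per-department version, run through Python's sqlite3 module rather than Oracle: julianday() arithmetic stands in for sysdate - min(hire_date), and the rows and the fixed "today" value are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department_id INTEGER, hire_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(10, "2020-01-01"), (10, "2021-01-01"), (20, "2019-12-31")],  # hypothetical rows
)

today = "2020-01-11"  # fixed stand-in for sysdate

# Every selected expression is either the grouping column or built from an aggregate.
earliest = conn.execute(
    "SELECT department_id, "
    "       MIN(hire_date) AS earliest_hire_date, "
    "       CAST(julianday(?) - julianday(MIN(hire_date)) AS INTEGER) AS dni "
    "FROM employees "
    "GROUP BY department_id "
    "ORDER BY department_id",
    (today,),
).fetchall()
print(earliest)
```

The day count is computed from MIN(hire_date) rather than from the bare hire_date column, which is the change that removes the grouping error.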

Select and Group by together

I have my query like this:
Select
a.abc,
a.cde,
a.efg,
a.agh,
c.dummy,
p.test,
max(b.this),
sum(b.sugar),
sum(b.bucket),
sum(b.something)
followed by some outer joins and inner joins. Now the problem: when the group by is
group by
a.abc,
a.cde,
a.efg,
a.agh,
c.dummy,
p.test
The query works fine. But if I remove any one of them from group by it gives:
SQLSTATE: 42803
Can anyone explain the cause of this error?
Generally, any column that isn't in the group by section can only be included in the select section if it has an aggregating function applied to it. Or, put another way, any non-aggregated data in the select section must be grouped on.
Otherwise, how would the DBMS know what you want done with it? For example, if you group on a.abc, there can only be one value of a.abc for that grouped row (since all other values of a.abc will come out in a different row). Here's a short example, with a table containing:
LastName  FirstName  Salary
--------  ---------  ------
Smith     John       123456
Smith     George     111111
Diablo    Pax        999999
With the query select LastName, Salary from Employees group by LastName, you would expect to see:
LastName  Salary
--------  ------
Smith     ??????
Diablo    999999
The salary for the Smiths is incalculable since you don't know what function to apply to it, which is what's causing that error. In other words, the DBMS doesn't know what to do with 123456 and 111111 to get a single value for the grouped row.
If you instead used select LastName, sum(Salary) from Employees group by LastName (or max() or min() or avg() or any other aggregating function), the DBMS would know what to do. For sum(), it will simply add them and give you 234567.
In your query, the equivalent of trying to use Salary without an aggregating function is to change max(b.this) to just b.this but not include it in the group by section. Or alternatively, remove one of the group by columns without changing it to an aggregation in the select section.
In both cases, you'll have one row that has multiple possible values for the column.
The DB2 docs at publib for sqlstate 42803 describe your problem:
A column reference in the SELECT or HAVING clause is invalid, because it is not a grouping column; or a column reference in the GROUP BY clause is invalid.
SQL will insist that any column in the SELECT section is either included in the GROUP BY section or has an aggregate function applied to it in the SELECT section.
This article gives a nice explanation of why this is the case. The article is SQL Server-specific, but the principle should be roughly similar for all RDBMSs.
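The Smith/Diablo example can be reproduced with a short sketch using Python's sqlite3 module. One caveat: SQLite is lenient and accepts a bare non-aggregated column in a grouped query (it picks a value from an arbitrary row instead of raising an error), so this sketch only shows the aggregated query, not the 42803 error itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (LastName TEXT, FirstName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [("Smith", "John", 123456), ("Smith", "George", 111111), ("Diablo", "Pax", 999999)],
)

# SUM() tells the database how to collapse the two Smith salaries into one value.
totals = conn.execute(
    "SELECT LastName, SUM(Salary) FROM Employees "
    "GROUP BY LastName ORDER BY LastName"
).fetchall()
print(totals)
```

Swapping SUM for MAX or MIN would also work; what matters is that some aggregate decides which single value represents the group.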

Different ways to alias a column

What is the difference between
select empName as EmployeeName from employees
versus
select EmployeeName = empName from employees
from a technical point of view. Not sure if this is just SQL server specific or not.
Appreciate your answers.
I'd prefer the first one, since the second one is not portable -
select EmployeeName = empName from employees
is either a syntax error (at least in SQLite and Oracle), or it might not give you what you expect (comparing two columns EmployeeName and empName and returning the comparison result as a boolean/integer), whereas
select empName EmployeeName from employees
is the same as
select empName as EmployeeName from employees
which is my preferred variant.
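As a quick check that the two portable forms are equivalent, this sketch (Python's sqlite3 module, with a made-up one-column employees table) reads the result column name reported for each query. It does not exercise the SQL Server-only alias = expression form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empName TEXT)")  # hypothetical table

# cursor.description holds one tuple per result column; item 0 is the column name.
with_as = conn.execute("SELECT empName AS EmployeeName FROM employees")
without_as = conn.execute("SELECT empName EmployeeName FROM employees")

name_with_as = with_as.description[0][0]
name_without_as = without_as.description[0][0]
print(name_with_as, name_without_as)
```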
The main advantage of the second syntax is that it allows the column aliases to be all lined up which can be of benefit for long expressions.
SELECT foo,
       bar,
       baz = ROW_NUMBER() OVER (PARTITION BY foo ORDER BY bar)
FROM T
I don't think there's a technical difference. It's mainly a matter of preference. I go for the second, as it's easier to spot columns in big queries, especially if the query is properly indented.