SQL EXTRACT(YEAR FROM MYDATE) not a GROUP BY expression - sql

I have table MYTABLE with columns mydate and quantity of VARCHAR2 type.
|mydate| |quantity|
10/15/2010 15
01/20/2010 20
05/16/2005 30
04/29/2005 50
03/30/2008 5
I want to get:
|year| |quantity|
2010 35
2005 80
2008 5
I tried:
SELECT
to_char(mydate,'yyyy') YEAR,
SUM(to_number(quantity))
FROM MYTABLE
GROUP BY
to_char(mydate,'yyyy');
But I get an error
ORA-00979: not a GROUP BY expression
What did I do wrong?

You must put all columns of the SELECT in the GROUP BY or use functions on them which compress the results to a single value (like MIN, MAX or SUM).
A simple example to understand why this happens: Imagine you have a database like this:
FOO BAR
0 A
0 B
and you run SELECT * FROM table GROUP BY foo. This means the database must return a single row as the result, with the first column 0, to fulfill the GROUP BY, but there are now two values of bar to choose from. Which result would you expect: A or B? Or should the database return more than one row, violating the contract of GROUP BY?
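To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (the table name t is made up for illustration). Note that SQLite, unlike Oracle, tolerates bare non-grouped columns and silently picks a value, so the sketch only shows the unambiguous form where every non-grouped column is aggregated.

```python
import sqlite3

# In-memory SQLite database as a stand-in; the table name "t" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (foo INTEGER, bar TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(0, "A"), (0, "B")])

# Every non-grouped column is wrapped in an aggregate, so each group
# collapses to exactly one row and there is nothing left to choose between.
rows = conn.execute(
    "SELECT foo, MIN(bar), MAX(bar), COUNT(*) FROM t GROUP BY foo"
).fetchall()
print(rows)
```

Both candidate values of bar are still visible, but each one comes from an explicit rule (MIN or MAX) rather than from an arbitrary pick by the database.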

Try this
select extract(year from to_date(mydate, 'MM/DD/YYYY')) as yr, sum(to_number(quantity)) as total from mytable
group by extract(year from to_date(mydate, 'MM/DD/YYYY'));
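For anyone who wants to check the expected totals without an Oracle instance, here is a rough sketch in Python with sqlite3. It is not Oracle syntax: SQLite has no to_date or extract, so substr(mydate, 7, 4) is my assumed substitute for pulling the year out of the MM/DD/YYYY string. The point it illustrates is the same, namely that the expression in GROUP BY matches the non-aggregated expression in SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both columns are text, mirroring the VARCHAR2 columns in the question.
conn.execute("CREATE TABLE mytable (mydate TEXT, quantity TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("10/15/2010", "15"),
        ("01/20/2010", "20"),
        ("05/16/2005", "30"),
        ("04/29/2005", "50"),
        ("03/30/2008", "5"),
    ],
)

# The same expression appears in SELECT and in GROUP BY.
# substr(..., 7, 4) is a SQLite stand-in for extracting the year.
rows = conn.execute(
    """
    SELECT substr(mydate, 7, 4) AS yr,
           SUM(CAST(quantity AS INTEGER)) AS total
    FROM mytable
    GROUP BY substr(mydate, 7, 4)
    ORDER BY yr
    """
).fetchall()
print(rows)
```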

Related

SQL Query to get the Average age from multiple 'date' values

I need a SQL query that returns the average age across multiple rows in my table.
Households
User Dates
1 2002-01-01
2 2004-06-10
I want to take the dates of birth of users 1 and 2 and return their average age.
I managed to get each age from the date of birth using:
SELECT *, DATE_FORMAT(FROM_DAYS(DATEDIFF(NOW(), Date)), '%Y') + 0 AS age
FROM Households;
I just can't get the rest of it working to then average the ages out.
I assume that you are running MySQL, as the syntax of your SQL code suggests.
For starters, I would recommend simplifying the age computation. MySQL provides timestampdiff(), which we can use like so:
select user_id, user_date,
timestampdiff(year, user_date, current_date) age
from households
user_id user_date age
1 2002-01-01 20
2 2004-06-10 18
3 2004-11-10 17
To compute the average age over all rows of the table, we can use aggregate function avg():
select avg(timestampdiff(year, user_date, current_date)) avg_age
from households
avg_age
18.3333
Here is a small demo based on your sample data. Note that I renamed the columns so they do not clash with meaningful SQL names.
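If you want to reproduce the numbers above outside MySQL, here is a small sketch in Python with sqlite3. SQLite has no timestampdiff(), so the age expression below is my own approximation of "full years elapsed", and the reference date 2022-06-15 is an assumption chosen so that the ages match the ones shown above (it replaces current_date to keep the output reproducible).

```python
import sqlite3

# Assumed "current date", fixed so the result does not change over time.
REF_DATE = "2022-06-15"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE households (user_id INTEGER, user_date TEXT)")
conn.executemany(
    "INSERT INTO households VALUES (?, ?)",
    [(1, "2002-01-01"), (2, "2004-06-10"), (3, "2004-11-10")],
)

# Full years elapsed: difference of the year parts, minus 1 when the
# birthday has not yet occurred in the reference year.
AGE_EXPR = """
    (CAST(strftime('%Y', :ref) AS INTEGER)
     - CAST(strftime('%Y', user_date) AS INTEGER))
    - (strftime('%m-%d', :ref) < strftime('%m-%d', user_date))
"""

ages = [
    row[0]
    for row in conn.execute(
        f"SELECT {AGE_EXPR} FROM households ORDER BY user_id",
        {"ref": REF_DATE},
    )
]
avg_age = conn.execute(
    f"SELECT AVG({AGE_EXPR}) FROM households", {"ref": REF_DATE}
).fetchone()[0]
print(ages, round(avg_age, 4))
```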

Impala: values are in wrong columns in result query

In my result query the values are in wrong columns.
My SQL Query is like:
create table some_database.table_name as
select
extract(year from t.operation_date) operation_year,
extract(month from t.operation_date) operation_month,
extract(day from t.operation_date) operation_day,
d.status_name,
sum(t.operation_amount) operation_amt,
current_timestamp() calculation_moment
from operations t
left join status_dict d on
d.status_id = t.status_id
group by
extract(year from t.operation_date) operation_year,
extract(month from t.operation_date) operation_month,
extract(day from t.operation_date) operation_day,
d.status_name
(In fact, it's more complicated, but the main idea is that I'm aggregating source table and making some joins.)
The result I get is like:
# | operation_year | operation_month | operation_day | status_name | operation_amt
1 | 2021 | 1 | 1 | success | 100
2 | 2021 | 1 | 1 | success | 150
3 | 2021 | 1 | 2 | success | 120
4 | null | 2021-01-01 21:53:00 | success | 120 | null
The problem is in row 4:
The field t.operation_date is not nullable, but in the result the column operation_year is null.
In operation_month we get an untruncated timestamp.
In operation_day we get the string value from d.status_name.
In status_name we get the numeric aggregate of t.operation_amount.
In operation_amt we get null.
It looks very similar to wrong parsing of a CSV file, where values jump to other columns, but obviously that can't be the case here. I can't figure out how on earth this is possible. I'm new to Hadoop, and apparently I'm not aware of some important concept that causes the problem.

SQL Why am I getting the invalid identifier error?

I am trying to use columns that I created in this query to create another column.
Let me first show my messy query. It looks like this:
SELECT tb.team, tb.player, tb.type, tb.date, ToChar(Current Date-1, 'DD-MON-YY') as yesterday,
CASE WHEN to_date(tb.date) = yesterday then 1 else 0 end dateindicator,
FROM (
COUNT DISTINCT(*)
FROM TABLE_A, dual
where dateindicator = 1
Group by tb.team
)
What I am trying to do here is:
Create a column with yesterday's date.
Use that "yesterday" column to create another column called dateindicator, indicating whether each row is yesterday's data or not.
Then, using dateindicator, count the distinct number of players for each team where dateindicator is 1.
But I am getting the "invalid identifier" error. I am new to Oracle SQL and trying to learn here.
You cannot reference a column alias elsewhere in the same SELECT list.
See here: SQL: Alias Column Name for Use in CASE Statement
You need to repeat the full ToChar(.. expression in the CASE WHEN.
Also:
Your WHERE condition (Line 5) doesn't belong there. It should be:
SELECT DISTINCT ... FROM ... WHERE. You have to specify the table first; then you can filter it with WHERE.
If I follow your explanation correctly: for each team, you want to count the number of players whose date column is yesterday.
If so, you can just filter and aggregate:
select team, count(*) as cnt
from mytable
where mydate >= trunc(sysdate) - 1 and mydate < trunc(sysdate)
group by team
This assumes that the dates are stored in column mydate, that is of date datatype.
I am unsure what you mean by counting distinct players; presumably, a given player appears just once per team, so I used count(*). If you really need to, you can change that to count(distinct player).
Finally: if you want to allow teams where no player matches, you can move the filtering logic within the aggregate function:
select team,
sum(case when mydate >= trunc(sysdate) - 1 and mydate < trunc(sysdate) then 1 else 0 end) as cnt
from mytable
group by team
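To see the difference between the two queries, here is a minimal sketch in Python with sqlite3. The sample rows, player names and the fixed "today" of 2024-03-10 are all made up for illustration; the two bound timestamps play the role of trunc(sysdate) - 1 and trunc(sysdate).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (team TEXT, player TEXT, mydate TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("A", "p1", "2024-03-09 10:00:00"),  # yesterday
        ("A", "p2", "2024-03-09 18:30:00"),  # yesterday
        ("A", "p3", "2024-03-10 08:00:00"),  # today, excluded
        ("B", "p4", "2024-03-07 12:00:00"),  # older, excluded
    ],
)

# Hypothetical fixed bounds standing in for trunc(sysdate) - 1 and trunc(sysdate).
yesterday = "2024-03-09 00:00:00"
today = "2024-03-10 00:00:00"

# Filtering in WHERE drops teams that have no matching row at all.
filtered = conn.execute(
    """
    SELECT team, COUNT(*) FROM mytable
    WHERE mydate >= ? AND mydate < ?
    GROUP BY team ORDER BY team
    """,
    (yesterday, today),
).fetchall()

# Moving the condition inside the aggregate keeps every team, with 0 if none match.
conditional = conn.execute(
    """
    SELECT team,
           SUM(CASE WHEN mydate >= ? AND mydate < ? THEN 1 ELSE 0 END)
    FROM mytable
    GROUP BY team ORDER BY team
    """,
    (yesterday, today),
).fetchall()
print(filtered, conditional)
```

Team B disappears from the first result and shows up with a count of 0 in the second.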

I Want to write a subquery to Sum a column

This query returns the results below.
Select BookTitle,TotalNumberInStock
From CurrentStock
Where (Year<2000 OR Year >2010);
Table 1
BookTitle TotalNumberInStock
The Tower 2
Orange Goblins 1
The future of Metal 3
Chronicles of the Banjo 2
Opera 4
Advanced SQL 5
The GAA 4
I want to write a subquery that sums TotalNumberInStock, so I used this statement:
Select Sum(TotalNumberInstock)
From
(
Select BookTitle,TotalNumberInStock
From CurrentStock
Where (Year<2000 OR Year >2010)
);
I get an error saying:
Incorrect syntax near ';'.
What is wrong with this code?
You need to give an alias to your subquery (also known as a derived table):
select Sum(TotalNumberInstock)
From (Select BookTitle,TotalNumberInStock From CurrentStock
Where (Year<2000 OR Year >2010)) x ;
This correction makes your query work, but you don't need such complexity to get the result you want. You can simply use this query:
select sum(totalNumberInStock)
from CurrentStock
where year < 2000 or year > 2010
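As a quick sanity check that the two forms agree, here is a sketch in Python with sqlite3. The titles and stock counts come from the question, but the Year values and the extra excluded row are invented for illustration. SQLite does not insist on the derived-table alias, so this does not reproduce the original error; it only shows that the aliased subquery and the direct query return the same sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CurrentStock (BookTitle TEXT, TotalNumberInStock INTEGER, Year INTEGER)"
)
# Stock counts are from the question; the years are hypothetical.
conn.executemany(
    "INSERT INTO CurrentStock VALUES (?, ?, ?)",
    [
        ("The Tower", 2, 1995),
        ("Orange Goblins", 1, 2012),
        ("The future of Metal", 3, 1999),
        ("Chronicles of the Banjo", 2, 2015),
        ("Opera", 4, 1980),
        ("Advanced SQL", 5, 2011),
        ("The GAA", 4, 1998),
        ("Hypothetical excluded title", 10, 2005),  # filtered out by the WHERE
    ],
)

# Derived table with the alias x, as in the corrected query.
via_subquery = conn.execute(
    """
    SELECT SUM(TotalNumberInStock)
    FROM (SELECT BookTitle, TotalNumberInStock FROM CurrentStock
          WHERE (Year < 2000 OR Year > 2010)) x
    """
).fetchone()[0]

# Direct aggregate without a subquery.
direct = conn.execute(
    "SELECT SUM(TotalNumberInStock) FROM CurrentStock WHERE Year < 2000 OR Year > 2010"
).fetchone()[0]
print(via_subquery, direct)
```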
I'm not sure why you need a sub query.
select sum(totalNumberInStock)
from CurrentStock
where year < 2000 or year > 2010
This should do the trick:
Select BookTitle,SUM(TotalNumberInStock)
From CurrentStock
Where (Year < 2000 OR Year > 2010)
GROUP BY BookTitle

Group by two columns is possible?

I have this table:
ID Price Time
0 20,00 20/10/10
1 20,00 20/10/10
2 20,00 12/12/10
3 14,00 23/01/12
4 87,00 30/07/14
4 20,00 30/07/14
I use this SQL to get the list of all prices without repeated values:
SELECT * FROM myTable WHERE id in (select min(id) from %# group by Price)
This code returns the values (20,14,87,20).
But in this case I would like to add another check, so that rows are distinguished not only by price but also by date. The current query works by price only; if it also took the date into account, it would return the values (20,20,14,87,20).
That repeats 20 at the start, but the table has three rows with price 20 (two with the date 20/10/10 and one with the date 12/12/10), so this is exactly what I want.
Could somebody help me?
To group by multiple columns, just separate the columns in the list with commas.
SELECT price FROM myTable group by price, time order by time
The group by looks at all distinct combinations of the listed columns values, and discards duplicates. You can also use aggregate functions like sum or max to pull in additional columns to the results.
The following should work as long as all you need is the price/time combination. If you need to include the ID, things get more complicated:
SELECT `Price` FROM items
GROUP BY `Price`, `Time`
ORDER BY `Time`;
Here's a fiddle with the result in action: http://sqlfiddle.com/#!2/40821/1
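In case the fiddle is unavailable, here is a rough equivalent in Python with sqlite3, using the rows from the question. A few things are my own assumptions: the dates are rewritten in ISO format so that ORDER BY sorts them chronologically, the prices are stored as plain numbers instead of "20,00", and MIN(rowid) is added as a tie-breaker so the two rows dated 30/07/14 come out in a fixed order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (ID INTEGER, Price REAL, Time TEXT)")
# Rows from the question, with dates rewritten as YYYY-MM-DD (an assumption).
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (0, 20.0, "2010-10-20"),
        (1, 20.0, "2010-10-20"),
        (2, 20.0, "2010-12-12"),
        (3, 14.0, "2012-01-23"),
        (4, 87.0, "2014-07-30"),
        (4, 20.0, "2014-07-30"),
    ],
)

# One output row per distinct (Price, Time) combination.
# MIN(rowid) is only a tie-breaker for rows that share the same date.
prices = [
    row[0]
    for row in conn.execute(
        """
        SELECT Price FROM items
        GROUP BY Price, Time
        ORDER BY Time, MIN(rowid)
        """
    )
]
print(prices)
```

The two rows with price 20 on 20/10/10 collapse into one, while the row with price 20 on 12/12/10 stays separate.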