Date function on grouped item in a subquery - SQL

I'm using PostgreSQL to write a query that, for every day, calculates the sum of the diff values and gets the unit price from another table, il_costs, which I am trying to do with a subquery. The whole query is below:
SELECT date(read.readed_at),
SUM(read.diff),
(SELECT water_unit
FROM il_costs
WHERE EXTRACT(MONTH FROM created_at) = EXTRACT(MONTH FROM date(read.readed_at))
AND EXTRACT(YEAR FROM created_at) = EXTRACT(YEAR FROM date(read.readed_at)))
FROM il_communicators_readings read
GROUP BY date(read.readed_at)
ORDER BY date(read.readed_at) ASC;
I'm getting an error about an ungrouped column, even though date(read.readed_at) is also the expression I group by:
ERROR: subquery uses ungrouped column "read.readed_at" from outer query
LINE 1: ...(MONTH FROM created_at) = EXTRACT(MONTH FROM date(read.reade...

You can try using your query as a subquery and perform the correlated subquery in the outer query:
SELECT mydate, s_diff,
(SELECT water_unit
FROM il_costs
WHERE EXTRACT(MONTH FROM created_at) = EXTRACT(MONTH FROM mydate)
AND EXTRACT(YEAR FROM created_at) = EXTRACT(YEAR FROM mydate))
FROM (
SELECT date(read.readed_at) AS mydate,
SUM(read.diff) as s_diff
FROM il_communicators_readings read
GROUP BY date(read.readed_at) ) AS t
ORDER BY mydate ASC;

The problem is not read.readed_at itself; it is that the subquery is not part of the GROUP BY. I also don't think it's a good idea to use a subquery like this.
Try this:
SELECT date(read.readed_at),
SUM(read.diff),
costs.water_unit
FROM il_communicators_readings read
LEFT JOIN il_costs costs
ON EXTRACT(MONTH FROM costs.created_at) = EXTRACT(MONTH FROM date(read.readed_at))
AND EXTRACT(YEAR FROM costs.created_at) = EXTRACT(YEAR FROM date(read.readed_at))
GROUP BY date(read.readed_at),costs.water_unit
ORDER BY date(read.readed_at) ASC;
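
The two answers can also be combined: aggregate per day first, then join the monthly price to the already-grouped rows, so that water_unit does not have to appear in the GROUP BY. The following is a minimal PostgreSQL sketch, not a definitive implementation. The temporary tables, their column types, and the sample rows are assumptions made up for illustration, and it assumes il_costs holds at most one row per month.

```sql
-- Hypothetical minimal schema; the column types are assumptions.
-- Temporary tables shadow real tables of the same name for this session only.
CREATE TEMP TABLE il_costs (created_at timestamp, water_unit numeric);
CREATE TEMP TABLE il_communicators_readings (readed_at timestamp, diff numeric);

-- Made-up sample rows.
INSERT INTO il_costs VALUES
    ('2020-01-01 00:00', 2.50),
    ('2020-02-01 00:00', 2.75);
INSERT INTO il_communicators_readings VALUES
    ('2020-01-15 08:00', 3),
    ('2020-01-15 20:00', 4),
    ('2020-02-02 09:00', 5);

-- Step 1: sum diff per day in a derived table.
-- Step 2: join the price whose month matches the day's month.
SELECT t.mydate,
       t.s_diff,
       c.water_unit
FROM (
    SELECT date(r.readed_at) AS mydate,
           SUM(r.diff)       AS s_diff
    FROM il_communicators_readings r
    GROUP BY date(r.readed_at)
) AS t
LEFT JOIN il_costs c
       ON date_trunc('month', c.created_at) = date_trunc('month', t.mydate::timestamp)
ORDER BY t.mydate;
```

If il_costs can hold several rows for the same month, the join returns one output row per matching cost row, so the month would have to be made unique first.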

Related

Select and Count Multiple Group By SQL

Can someone tell me how to do this in a database?
I've tried some SQL like:
SELECT disastertype, YEAR(eventdate) as year,
COUNT(disastertype) AS disastertype_total
FROM v_disasterlogs_all
WHERE YEAR(eventdate) >= year(CURRENT_TIMESTAMP) - 4
GROUP BY YEAR(eventdate)
ORDER BY YEAR(eventdate) ASC
But it only shows this:

Include disastertype in your GROUP BY clause:
SELECT disastertype, YEAR(eventdate) as year,
COUNT(disastertype) AS disastertype_total
FROM v_disasterlogs_all
WHERE YEAR(eventdate) >= year(CURRENT_TIMESTAMP) - 4
GROUP BY YEAR(eventdate), disastertype
ORDER BY YEAR(eventdate) ASC
I am assuming you want a count (the column index) to be associated with each unique year?
If so, a possible solution in PostgreSQL is shown below.
select
dense_rank() over (order by date_part('year', (eventdate))) as index ,
date_part('year', (eventdate)) as year,
disastertype,
count(disastertype)
from
v_disasterlogs_all
where
date_part('year', (eventdate)) >= date_part('year', now()) - 4
group by
year,
disastertype
order by
year asc;
In PostgreSQL, I used the date_part function to extract the year from the timestamp.
Working solution on dbfiddle.

Issues with SQL window function- error that column must be aggregated or in group by

I have a table called "sales" with two columns: transaction_date, and transaction_amount: VALUES ('2020-01-16 00:05:54.000000', '122.02'), ('2020-01-07 20:53:04.000000', '1240.00')
I want to find the 3-day moving average for each day in January 2020. I get an error saying that transaction_amount must be included in either an aggregate function or the GROUP BY. It does not make sense to group by it, as I only want one entry per day in the resulting table. In my code, the amount is already inside the aggregate function SUM, so I am not sure what else to try. Here is my query so far:
SELECT EXTRACT(DAY FROM transaction_time) AS Jan20_day,
SUM(transaction_amount),
SUM(transaction_amount) OVER (ORDER BY EXTRACT(DAY FROM transaction_time) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rolling_average
FROM sales
WHERE EXTRACT(MONTH FROM transaction_time)=1
AND EXTRACT(YEAR FROM transaction_time)=2020
GROUP BY EXTRACT(DAY FROM transaction_time)
Any insight on why I am getting the following error?
Query Error: error: column "transactions.transaction_amount" must appear in the GROUP BY clause or be used in an aggregate function
I would expect something like this:
SELECT EXTRACT(DAY FROM transaction_time) AS Jan20_day,
SUM(transaction_amount),
SUM(SUM(transaction_amount)) OVER (ORDER BY EXTRACT(DAY FROM transaction_time) ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rolling_average
FROM sales
WHERE transaction_time >= DATE '2020-01-01' AND
transaction_time < DATE '2020-02-01'
GROUP BY EXTRACT(DAY FROM transaction_time);
But the basic issue with your query is that you need to apply the window function to SUM(), so SUM(SUM(transaction_amount)) . . . .
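
Since the question asks for a moving average rather than a moving sum, the same nesting also works with AVG as the window function. This is a minimal PostgreSQL sketch, assuming a sales table shaped like the one in the question; the column types and the third sample row are assumptions added for illustration.

```sql
-- Hypothetical table definition; the column types are assumptions.
CREATE TEMP TABLE sales (transaction_time timestamp, transaction_amount numeric);

-- The first two rows come from the question; the third is made up
-- so that one day has more than one transaction.
INSERT INTO sales VALUES
    ('2020-01-07 20:53:04', 1240.00),
    ('2020-01-16 00:05:54', 122.02),
    ('2020-01-16 10:00:00', 100.00);

-- SUM(...) collapses each day to one total; AVG(...) OVER then averages
-- those daily totals over the current row and the two rows before it.
SELECT transaction_time::date AS sale_day,
       SUM(transaction_amount) AS daily_total,
       AVG(SUM(transaction_amount)) OVER (
           ORDER BY transaction_time::date
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS rolling_average
FROM sales
WHERE transaction_time >= DATE '2020-01-01'
  AND transaction_time <  DATE '2020-02-01'
GROUP BY transaction_time::date
ORDER BY sale_day;
```

The frame counts rows, not calendar days, so a day without any transactions is skipped rather than treated as zero.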
GROUP BY has to come after the complete WHERE clause, not in the middle of it:
GROUP BY is used with the SELECT statement.
GROUP BY is placed after the WHERE clause.
GROUP BY is placed before the ORDER BY clause, if there is one.
The window function also has to be applied to the aggregate, i.e. SUM(SUM(transaction_amount)):
SELECT EXTRACT(DAY FROM transaction_time) AS Jan20_day, SUM(transaction_amount),
SUM(SUM(transaction_amount))
OVER(ORDER BY EXTRACT(DAY FROM transaction_time)
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rolling_average
FROM sales
WHERE EXTRACT(MONTH FROM transaction_time)=1
AND EXTRACT(YEAR FROM transaction_time)=2020
GROUP BY EXTRACT(DAY FROM transaction_time)

Is there any way to put a name to a column used in both GROUP BY and ORDER BY?

I have a query to find the closest date to 31 March for every year.
SELECT MAX(date_close)
FROM stock s
WHERE EXTRACT(MONTH FROM s.date_close) = '03'
GROUP BY EXTRACT(YEAR FROM s.date_close)
ORDER BY EXTRACT(YEAR FROM s.date_close);
Because EXTRACT(YEAR FROM s.date_close) appears twice, I would like to ask whether we can give it a name and so make the query more compact.
You can use LATERAL to introduce a reusable expression:
SELECT MAX(date_close)
FROM stock s
, lateral (SELECT EXTRACT(YEAR FROM s.date_close) y FROM DUAL) t
WHERE EXTRACT(MONTH FROM s.date_close) = '03'
GROUP BY t.y
ORDER BY t.y
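
Another option, which does not depend on LATERAL or DUAL, is to compute the expression once in a derived table and refer to its alias from the outer query. The following is a minimal sketch in PostgreSQL syntax; the temporary table, its column type, and the sample rows are assumptions for illustration, and the SELECT itself should carry over to most databases.

```sql
-- Hypothetical table; the column type and the rows are assumptions.
CREATE TEMP TABLE stock (date_close date);
INSERT INTO stock VALUES
    ('2019-03-28'), ('2019-03-29'),
    ('2020-03-30'), ('2020-03-31'), ('2020-04-01');

-- The year is computed once inside the derived table and named y,
-- so GROUP BY and ORDER BY can both refer to the alias.
SELECT MAX(t.date_close) AS last_march_close
FROM (
    SELECT s.date_close,
           EXTRACT(YEAR FROM s.date_close) AS y
    FROM stock s
    WHERE EXTRACT(MONTH FROM s.date_close) = 3
) t
GROUP BY t.y
ORDER BY t.y;
```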

SQL subquery using group by item from main query

I have a table with a created timestamp and id identifier.
I can get number of unique id's per week with:
SELECT date_trunc('week', created)::date AS week, count(distinct id)
FROM my_table
GROUP BY week ORDER BY week;
Now I want the accumulated number of unique ids created up to each week, something like this:
SELECT date_trunc('week', created)::date AS week, count(distinct id),
(SELECT count(distinct id)
FROM my_table
WHERE date_trunc('week', created)::date <= week) as acc
FROM my_table
GROUP BY week ORDER BY week;
But that doesn't work, as week is not accessible in the sub select (ERROR: column "week" does not exist).
How do I solve this?
I'm using PostgreSQL
Use a cumulative aggregation. But, I don't think you need the distinct, so:
SELECT date_trunc('week', created)::date AS week, count(*) as cnt,
SUM(COUNT(*)) OVER (ORDER BY MIN(created)) as running_cnt
FROM my_table
GROUP BY week
ORDER BY week;
In any case, as you've phrased the problem, you can change cnt to use count(distinct). Your subquery is not using distinct at all.
CTEs or a temp table should fix your problem. Here is an example using a CTE:
WITH abc AS (
SELECT date_trunc('week', created)::date AS week, count(distinct id) AS IDCount
FROM my_table
GROUP BY week
)
SELECT abc.week, abc.IDCount,
(SELECT count(*)
FROM my_table
WHERE date_trunc('week', created)::date <= abc.week) AS acc
FROM abc
ORDER BY abc.week;
Hope this helps
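
If the running total should count each id only once across all weeks, one approach is to assign every id to the week in which it first appears and then take a running sum of those per-week counts. This is a minimal PostgreSQL sketch; the column types and sample rows are assumptions, and the names first_seen, first_week, and new_ids are made up for illustration.

```sql
-- Hypothetical table definition; the column types are assumptions.
CREATE TEMP TABLE my_table (id integer, created timestamp);

-- Made-up rows: id 1 appears in two different weeks.
INSERT INTO my_table VALUES
    (1, '2020-01-06 10:00'),
    (2, '2020-01-07 11:00'),
    (1, '2020-01-14 09:00'),
    (3, '2020-01-15 12:00');

-- Inner query: the first week in which each id was created.
-- Outer query: how many ids are new in each week, plus a running total.
SELECT first_week AS week,
       COUNT(*) AS new_ids,
       SUM(COUNT(*)) OVER (ORDER BY first_week) AS acc
FROM (
    SELECT id,
           MIN(date_trunc('week', created))::date AS first_week
    FROM my_table
    GROUP BY id
) first_seen
GROUP BY first_week
ORDER BY first_week;
```

Weeks in which no new id appears produce no row in this version.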

To find the last updated record of each month for each policy(another field)

I have a table named a, with fields including eff_date and policy_no.
Now for each policy, consider all the records, and take out the last updated one (eff_date) from each month.
So I need the last updated record for each month for each policy. How would I write a query for this?
I'm not 100 percent on Teradata syntax, but I believe you're after this:
SELECT policy_no,eff_date
FROM (SELECT policy_no, eff_date,
ROW_NUMBER() OVER (PARTITION BY policy_no, EXTRACT(YEAR FROM eff_date), EXTRACT(MONTH FROM eff_date)
ORDER BY eff_date DESC) AS RowRank
FROM a) as sub
WHERE RowRank = 1
I'm assuming when you say by month you also want to differentiate by year, but if not, just remove the EXTRACT(YEAR FROM eff_date) from the PARTITION BY section.
Edit: Update for Teradata syntax.
SELECT * from a
qualify ROW_NUMBER() OVER (PARTITION BY policy_no, EXTRACT(YEAR FROM eff_date),
EXTRACT(MONTH FROM eff_date) ORDER BY eff_date DESC) = 1
The main difficulty is that the GROUP BY has to combine policy_no with the month (extracted from the date). For example, in MySQL:
SELECT policy_no,
month(eff_date),
year(eff_date),
max(eff_date)
FROM myTable
GROUP BY policy_no,
month(eff_date),
year(eff_date);
Update
I saw that derived tables are allowed in Teradata. Joining to a derived table gives access to the full rows:
select * from a,
(SELECT policy_no,
month(eff_date),
year(eff_date),
max(eff_date) as MaxMonthDate
FROM a
GROUP BY policy_no,
month(eff_date),
year(eff_date)
) as b
where a.policy_no = b.policy_no and
a.eff_date = b.MaxMonthDate;
http://www.sqlfiddle.com/#!2/1f728/5
Update (Using Extract)
select * from a,
(SELECT a2.policy_no,
EXTRACT(MONTH FROM a2.eff_date),
EXTRACT(YEAR FROM a2.eff_date),
max(a2.eff_date) as MaxMonthDate
FROM a as a2
GROUP BY a2.policy_no,
EXTRACT(MONTH FROM a2.eff_date),
EXTRACT(YEAR FROM a2.eff_date)
) as b
where a.policy_no = b.policy_no and
a.eff_date = b.MaxMonthDate;
I'd suggest looking into window aggregate functions and the QUALIFY clause. I believe the following SQL will work.
SELECT Policy_No
, EXTRACT(MONTH FROM Eff_Date) AS Eff_Month_
, Eff_Date
FROM TableA
QUALIFY ROW_NUMBER() OVER (PARTITION BY Policy_No, EXTRACT(MONTH FROM Eff_Date)
ORDER BY Eff_Date DESC) = 1;