Cumulative sums in SQL

I need to compute a cumulative sum for each month of each year. From what I found online, the solution is to use the OVER (ORDER BY ...) clause, but it fails with an error near "order" and no further explanation. The syntax looks correct according to what I have read, so I do not understand why it does not work. I also tried converting the date to a string, but that is not accepted either. Is there a solution for this?
This is my query:
SELECT YEAR(FECHA_IMPUT) AÑO,
MONTH(FECHA_IMPUT) MES,
COD_MAQUINA ,
SUM(CANTIDAD_OK) SUMA,
SUM(CANTIDAD_OK) OVER(ORDER BY DATEPART(mm,FECHA_IMPUT)) AS suma
FROM RTMAQUINA
WHERE COD_MAQUINA='LB_TRASVASE'
GROUP BY COD_MAQUINA, MONTH(FECHA_IMPUT),YEAR(FECHA_IMPUT)
ORDER BY YEAR(FECHA_IMPUT), MONTH(FECHA_IMPUT) ASC
The error is: incorrect syntax near "order".

I believe the syntax you want is:
SELECT YEAR(FECHA_IMPUT) AÑO,
MONTH(FECHA_IMPUT) MES,
COD_MAQUINA ,
SUM(CANTIDAD_OK) SUMA,
SUM(SUM(CANTIDAD_OK)) OVER (PARTITION BY YEAR(FECHA_IMPUT) ORDER BY MONTH(FECHA_IMPUT)) AS suma
FROM RTMAQUINA
WHERE COD_MAQUINA = 'LB_TRASVASE'
GROUP BY COD_MAQUINA,
MONTH(FECHA_IMPUT),
YEAR(FECHA_IMPUT)
ORDER BY YEAR(FECHA_IMPUT), MONTH(FECHA_IMPUT) ASC;
Note the nested SUM()s. This syntax looks awkward, but it is correct when combining window functions with aggregation. The inner SUM() is the aggregate computed per group; the outer SUM() is the window function applied to those per-group totals.
Also note the window clause. First, it can only reference expressions used in the GROUP BY, or aggregate functions. Second, I think you want to partition by year, based on how your question is phrased.
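To see the nested SUM() pattern run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table `production` and its columns `yr`, `mon`, and `qty` are hypothetical simplifications of RTMAQUINA (year and month are stored directly instead of being derived with YEAR() and MONTH()), and it assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Hypothetical sample data: (year, month, quantity) rows, several per month.
rows = [
    (2019, 1, 10), (2019, 1, 5), (2019, 2, 20),
    (2020, 1, 7), (2020, 2, 3), (2020, 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE production (yr INTEGER, mon INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO production VALUES (?, ?, ?)", rows)

# Inner SUM(qty) is the per-group aggregate; the outer SUM(...) OVER (...)
# accumulates those monthly totals, restarting for each year.
query = """
SELECT yr,
       mon,
       SUM(qty) AS monthly_total,
       SUM(SUM(qty)) OVER (PARTITION BY yr ORDER BY mon) AS running_total
FROM production
GROUP BY yr, mon
ORDER BY yr, mon
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this sample data the running total reaches 35 at the end of 2019 and starts again from 7 in January 2020, which is the effect of PARTITION BY on the year.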

Related

ORA-00923 error: FROM keyword not found where expected

When calculating retention on Oracle DB, I wrote this code:
select
sessions.sessionDate ,
count(distinct sessions.visitorIdd) as active_users,
count(distinct futureactivity.visitorIdd) as retained_users,
count(distinct futureactivity.visitorIdd) / count(distinct sessions.visitorIdd)::float as retention
FROM sessions
left join sessions futureactivity on
sessions.visitorIdd=futureactivity.visitorIdd
and sessions.sessionDate = futureactivity.sessionDate - interval '3' day
group by 3;
but I always get the error: "ORA-00923: mot-clé FROM absent à l'emplacement prévu" (ORA-00923 FROM keyword not found where expected)
Can you help me guys?
Oracle does not recognize the Postgres :: cast syntax, so the parser stops at that point and reports that the FROM keyword was not found where expected.
Use a cast instead:
count(distinct futureactivity.visitorIdd) / cast(count(distinct sessions.visitorIdd) as float) as retention
Here is a more "Oracle" way of writing the query:
select s.sessionDate ,
count(distinct s.visitorIdd) as active_users,
count(distinct fs.visitorIdd) as retained_users,
count(distinct fs.visitorIdd) / count(distinct s.visitorIdd) as retention
from sessions s left join
sessions fs
on s.visitorIdd = fs.visitorIdd and
s.sessionDate = fs.sessionDate - interval '3' day
group by s.sessionDate
order by s.sessionDate;
Notes:
Oracle does not require a conversion when dividing integers.
The GROUP BY should name the column (s.sessionDate) rather than a position; in any case, the session date is item 1 in the select list, not 3.
Shorter table aliases make the query easier to write and to read.
You'll probably want an ORDER BY, because otherwise the results will be in an indeterminate order.
There is probably a better way to write this query using window functions.
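The self-join retention query can be tried out with a small sketch in Python's sqlite3 module. This is an illustration under assumptions rather than Oracle code: the table and column names (`sessions`, `visitor_id`, `session_date`) and the sample rows are hypothetical, SQLite's `date(x, '+3 days')` replaces Oracle's `interval '3' day` arithmetic, and because SQLite (unlike Oracle) truncates integer division, the sketch multiplies by 1.0 before dividing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (visitor_id INTEGER, session_date TEXT)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [
        # Four visitors on day 0; two of them come back exactly 3 days later.
        (1, "2024-01-01"), (2, "2024-01-01"), (3, "2024-01-01"), (4, "2024-01-01"),
        (1, "2024-01-04"), (2, "2024-01-04"),
    ],
)

# Each session row is matched to the same visitor's session 3 days later, if any.
query = """
SELECT s.session_date,
       COUNT(DISTINCT s.visitor_id) AS active_users,
       COUNT(DISTINCT f.visitor_id) AS retained_users,
       COUNT(DISTINCT f.visitor_id) * 1.0 / COUNT(DISTINCT s.visitor_id) AS retention
FROM sessions s
LEFT JOIN sessions f
       ON f.visitor_id = s.visitor_id
      AND f.session_date = date(s.session_date, '+3 days')
GROUP BY s.session_date
ORDER BY s.session_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The LEFT JOIN keeps days with no returning visitors, so they show a retention of 0.0 instead of disappearing from the output.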

BigQuery - Cannot re-use lagged records

When using lag(value,offset), I don't seem to be able to re-use the output in other functions.
The output of the following code shows that previous_timestamp_utc exists, but neither DATE() nor DATEDIFF() returns a value when applied to it.
SELECT
id,
timestamp_utc,
DATE(timestamp_utc) AS date_timestamp_utc,
previous_timestamp_utc,
DATE(previous_timestamp_utc) AS date_previous_timestamp_utc,
DATEDIFF(timestamp_utc,previous_timestamp_utc),
FROM (
SELECT
id,
timestamp_utc,
LAG(timestamp_utc,1) OVER (PARTITION BY id ORDER BY timestamp_utc) AS previous_timestamp_utc,
FROM (
SELECT
SEC_TO_TIMESTAMP (timestamp) AS timestamp_utc,
id,
num_characters,
FROM
[publicdata:samples.wikipedia] ) )
ORDER BY
4 DESC
LIMIT
1000
Can anyone explain why this is occurring?
Workaround: I'm unclear as to why this works, but one workaround is to cast the field to a date() inside lag(), replacing
LAG(timestamp_utc,1) OVER (PARTITION BY id ORDER BY timestamp_utc)
with
LAG(date(timestamp_utc),1) OVER (PARTITION BY id ORDER BY timestamp_utc)
allows previous_timestamp_utc to be used in date() and datediff(). This is not something we should be expected to do when using the lag() function.
This is a bug in BigQuery in handling timestamps with the LAG function.
The timestamp type is lost in intermediate results. When the table is written, the column is correctly stored as a timestamp, but intermediate results treat the value as a raw integer, which produces the unexpected output.
You found the workaround: cast to a non-timestamp type before applying the LAG function.
This issue is logged in our internal issue tracker. Thank you for the bug report!

Not a group by function at a cumulative query

I'm building a cumulative query that shows the evolution of the number of clients in my database. For this query, I use the year and the week of the year in which they joined the client database.
I have the following query to search for relevant data:
SELECT DD.CAL_YEAR, DD.WEEK_OF_YEAR, SUM(COUNT(DISTINCT FAB.ID)) OVER ( ORDER BY DD.CAL_DATE ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS "Number of account statements"
FROM CLIENT_DATABASE FAB
JOIN DIM_DATE DD ON FAB.BALANCE_DATE_ID = DD.ID
GROUP BY DD.CAL_YEAR, DD.WEEK_OF_YEAR;
But when I run this query, I get the following error:
Error: ORA-00979: not a GROUP BY expression
SQLState: 42000 ErrorCode: 979
How can I fix this?
Since you are grouping by DD.CAL_YEAR, DD.WEEK_OF_YEAR, you can't use DD.CAL_DATE in the order by clause of your cumulative sum function.
It's hard for me to say exactly what you are trying to do without fully understanding your data. But, logically, it does seem like you should be able to simply use DD.CAL_YEAR, DD.WEEK_OF_YEAR in the order by clause instead of DD.CAL_DATE, and still get the results the way you are expecting.
So something like this:
SUM(COUNT(DISTINCT FAB.ID)) OVER ( ORDER BY DD.CAL_YEAR, DD.WEEK_OF_YEAR ...
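The suggestion of ordering the window by the grouped columns can be sketched with Python's sqlite3 module standing in for Oracle. The table `signups` and its columns are hypothetical; it flattens the CLIENT_DATABASE and DIM_DATE join into one table for brevity, and it assumes SQLite 3.25 or later for window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE signups (client_id INTEGER, cal_year INTEGER, week_of_year INTEGER)"
)
conn.executemany(
    "INSERT INTO signups VALUES (?, ?, ?)",
    [(1, 2020, 51), (2, 2020, 51), (3, 2020, 52),
     (4, 2021, 1), (5, 2021, 1), (6, 2021, 2)],
)

# The window's ORDER BY uses only the GROUP BY columns, so every expression
# in the OVER clause is well defined for each grouped row.
query = """
SELECT cal_year,
       week_of_year,
       SUM(COUNT(DISTINCT client_id)) OVER (
           ORDER BY cal_year, week_of_year
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS cumulative_clients
FROM signups
GROUP BY cal_year, week_of_year
ORDER BY cal_year, week_of_year
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Ordering by year first and week second keeps the running total increasing across the year boundary, where week 1 of 2021 follows week 52 of 2020.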

how to use median as a analytic function (oracle SQL)

Can you explain why the following works:
select recdate,avg(logtime)
over
(ORDER BY recdate rows between 10 preceding and 0 following) as logtime
from v_download_times;
and the following doesn't:
select recdate,median(logtime)
over
(ORDER BY recdate rows between 10 preceding and 0 following) as logtime
from v_download_times;
(median instead of avg)
I get an ORA-30487 error, and I would be grateful for a workaround.
The error message is ORA-30487: ORDER BY not allowed here. And sure enough, if we consult the documentation for the MEDIAN function it says:
"You can use MEDIAN as an analytic function. You can specify only the
query_partition_clause in its OVER clause."
ORDER BY is not redundant, however, if you only want the median over a certain number of rows preceding the current one.
A workaround may be to limit the data set just for the median calculation, for example:
select
median(field) over (partition by field2)
from ( select * from dataset
where period_back between 0 and 2 )
MEDIAN doesn't allow an ORDER BY clause. As APC points out in his answer, the documentation tells us we can only specify the query_partition_clause.
ORDER BY is redundant as we're looking for the central value -- it's the same regardless of order.
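If the goal is specifically a median over the current row and the 10 rows before it, one option not mentioned in the answers above is to fetch the rows already ordered by recdate and compute the rolling median in client code. The sketch below does this in plain Python; the function name `rolling_median` and the sample values are hypothetical, and it assumes the input list is already sorted by recdate.

```python
from statistics import median


def rolling_median(values, preceding=10):
    """Median of each value together with up to `preceding` earlier values."""
    result = []
    for i in range(len(values)):
        # Slice covers the current row plus at most `preceding` rows before it.
        window = values[max(0, i - preceding): i + 1]
        result.append(median(window))
    return result


# Hypothetical logtime values, already ordered by recdate.
log_times = [5, 1, 9, 3, 7]
medians = rolling_median(log_times, preceding=2)
print(medians)
```

This mirrors the frame "rows between 10 preceding and 0 following" from the question: early rows simply use however many preceding rows exist.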

SQL select invalid because it is not contained in aggregate function

Here's the problem: I want to display the month, count, and average of one column in a table, but I keep getting an error when I try to group it by month.
This is the code:
SELECT MONTH(ContractDate) AS Q,
DATENAME(month, ContractDate) AS M,
COUNT(ContractDate) AS C, SUM(ContractPrice) AS S
FROM dashboard
WHERE YEAR(ContractDate) = $year
AND ContractDate IS NOT NULL
AND ContractPrice IS NOT NULL
GROUP BY MONTH(ContractDate)
But this results in the error:
[Microsoft][SQL Server Native Client 10.0][SQL Server]
Column 'dashboard.ContractDate' is invalid in the select
list because it is not contained in either an aggregate
function or the GROUP BY clause.
But if I remove the MONTH() from the GROUP BY, it works fine. I need the rows grouped by month, though; otherwise the same month appears multiple times instead of being counted as one.
Sorry, I did search and there are heaps of answers, but as I said I'm a beginner and they didn't really help, because I don't understand why this happens.
You have to have all columns that are not aggregates in the GROUP BY. Either add your DATENAME column into the GROUP BY or remove it from the query altogether.
GROUP BY MONTH(ContractDate), DATENAME(month, ContractDate)
Try executing your query after removing DATENAME(month, ContractDate) AS M; I suspect it is causing the issue. You are grouping by MONTH(ContractDate) but also selecting an expression on ContractDate that is not in the GROUP BY list.