Sql Using Rounding and Averages together - google-bigquery

I am new to SQL and trying to complete my first project. I want to round the average deaths for each month to a whole number Right now, if the average death for March 2021 is 1345. 66666, I want to round it to 1345. How do I do that.
Here is my code for now:
SELECT FORMAT_DATE('%Y %b' , date) AS Month,
ROUND(AVG(death/1,0)), AVG(hospitalizedCurrently) AS Hospitalized_Currently, AVG(onVentilatorCurrently) AS ON_Ventilator
FROM `cogent-splicer-372320.covid_datat.missouri_data`
GROUP BY Month
ORDER BY PARSE_DATE('%Y %b', MONTH) DESC
Limit 12;
Everytime I run it, it tells me
"No matching signature for aggregate function AVG for argument types: FLOAT64, INT64. Supported signatures: AVG(INT64); AVG(FLOAT64); AVG(NUMERIC); AVG(BIGNUMERIC); AVG(INTERVAL) at [2:7]".
I want the rounded number of the average death rate for the month to be saved AS Death.

Related

Finding the initial sampled time window after using SAMPLE BY again

I can't seem to find a perhaps easy solution to what I'm trying to accomplish here, using SQL and, more importantly, QuestDB. I also find it hard to put my exact question into words so bear with me.
Input
My real input is different of course but a similar dataset or case is the gas_prices table on the demo page of QuestDB. On https://demo.questdb.io, you can directly write and run queries against some sample database, so it should be easy enough to follow.
The main task I want to accomplish is to find out which month was responsible for the year's highest galon price.
Output
Using the following query, I can get the average galon price per month just fine.
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
timestamp
avg_per_month
2000-06-05T00:00:00.000000Z
1.6724
2000-07-05T00:00:00.000000Z
1.69275
2000-08-05T00:00:00.000000Z
1.635
...
...
Then, I get all these monthly averages, group them by year and return the maximum galon price per year by wrapping the above query in a subquery, like so:
SELECT timestamp, max(avg_per_month) as max_per_year FROM (
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
) SAMPLE BY 12M
timestamp
max_per_year
2000-01-05T00:00:00.000000Z
1.69275
2001-01-05T00:00:00.000000Z
1.767399999999
2002-01-05T00:00:00.000000Z
1.52075
...
...
Wanted output
I want to know which month was responsible for the maximum price of a year.
Looking at the output of the above query, we see that the maximum galon price for the year 2000 was 1.69275. Which month of the year 2000 had this amount as average price? I'd like to display this month in an additional column.
For the first row, July 2000 is shown in the additional column for year 2000 because it is responsible for the highest average price in 2000. For the second row, it was May 2001 as that month had the highest average price of 2001.
timestamp
max_per_year
which_month_is_responsible
2000-01-05T00:00:00.000000Z
1.69275
2000-07-05T00:00:00.000000Z
2001-01-05T00:00:00.000000Z
1.767399999999
2001-05-05T00:00:00.000000Z
...
...
What did I try?
I tried by adding a subquery to the SELECT to have a "duplicate" of some sort for the timestamp column but that's apparently never valid in QuestDB (?), so probably the solution is by adding even more subqueries in the FROM? Or a UNION?
Who can help me out with this? The data is there in the database and it can be calculated. It's just a matter of getting it out.
I think 'wanted output' can be achieved with window functions.
Please have a look at:
CREATE TABLE electricity (ts TIMESTAMP, consumption DOUBLE) TIMESTAMP(ts);
INSERT INTO electricity
SELECT (x*1000000)::timestamp, rnd_double()
FROM long_sequence(10000000);
SELECT day, ts, max_per_day
FROM
(
SELECT timestamp_floor('d', ts) as day,
ts,
avg_in_15_min as max_per_day,
row_number() OVER (PARTITION BY timestamp_floor('d', ts) ORDER BY avg_in_15_min desc) as rn_per_day
FROM
(
SELECT ts, avg(consumption) as avg_in_15_min
FROM electricity
SAMPLE BY 15m
)
) WHERE rn_per_day = 1

How to create weekly YoY metric in AWS QuickSight?

I am trying to calculate a sum of a column for a specific calendar week, and want to compare that to the same week in the previous year.
So let's say I have the sum of WK21 for 2020 is 1000 and sum of WK21 in 2019 is 800. The field should return: 200.
Attached is how my dataset looks like.
I would like to sum the credit per WK for each segment, and see how the difference is to the sum of that week for that same segment, in the previous year. Later on I would change the difference to percentDifference, but I assume the main formula is the same.
Is that easy doable?
Tried this (percentDifference) for WoW (in the same year), and it worked using this formula:
percentDifference(sum({credit}),[{wk} ASC],-1,[{year}, industry, segment])
But when trying for weekly YoY the following formula, it didnt work (-52 because 52 weeks in year):
percentDifference(sum({credit}),[{wk} ASC, year ASC],-52,[industry, segment])
PS: for the country, I didnt take that in consideration, as I want to filter for the countries I want later on...
For Year-on-year difference create a calculated field like the following
difference(sum({credit}), [{wk} ASC, {year} ASC], -1, [{Industry}, {Segment}])
For year-on-year percentage difference your calculated field would be
percentDifference(sum({credit}), [{wk} ASC, {year} ASC], -1, [{Industry}, {Segment}])
This is because the lookup index in the function (-1) is based on the sort of the rows (sort order in the square brackets in the function)
Source: percentDifference and difference QuickSight Docs
Try -100 instead of -52. Since the 'wk' field is not a date-defined field, it is probably just considering 'wk' as an int.

How to round off a ratio to the lowest integral value in Postgresql?

I have data in column 'day', which is calculated by finding difference between two dates and this tells me the number of days it took for a person to complete an activity. I want to represent the column 'day' in weeks in the following manner.
6 days = 6/7 = 0 week
I tried rounding off but it is getting rounded to the highest integral value rather than the lowest integral value. Below is the query used by me.
SELECT ROUND(DATE_PART('day', '2011-12-31 00:00:00'::timestamp - '2011-12-26 00:00:00'::timestamp)/7)
The expected and actual results are as follows
Use the floor function instead of round.
The function date_part() returns a double precision number and not an integer, so the result of the division is not an integer.
So use the function trunc() to remove any decimal digits from the result of the division:
SELECT day,
trunc(day/7)
FROM Activity

SQL Average of total days in DATA per month

I have a SQL question.
I am trying to find the average injection volume per month. Currently my code takes the sum of all days of injection, and divides them by the TOTAL DAYS in the month.
Sum(W1."INJECTION_VOLUME" /
EXTRACT(DAY FROM LAST_DAY(W1."INJECTION_DATE"))) AS "AVGINJ"
This is not what I wanted.
I need to take the injection_volume and divide by the total days in the DATA .
ie. right now the data only 8 days of injection volume, lets say it is 3000.
So right now the sql is 3000/31.
I need to have it be 3000/8 (the total days in the data for the current month.)
Also, this should only be for the current month. All other completed months should be divided by the total days in the month.
Use
SELECT
SUM(W1.INJECTION_VOLUME) / COUNT(DISTINCT MyDateField)
FROM MyTable
WHERE X=Value
This gives you what you're after
SUM(W1.INJECTION_VOLUME) is the total volume for the dataset
Gives you the number of days, no matter how many records you have
COUNT(DISTINCT MyDateField)
So if you have 100 records but only 5 actual unique days in this time, this expression gives you 5
Note that this kind of calc is normally worked out with
SUM(A) / SUM(B)
not
SUM(A/B)
They give you completely different answers.
In order to get the average of the data for the current month you will need to divide by the count in the month:
SUM(`W1`.`INJECTION_VOLUME` / COUNT(EXTRACT(YEAR_MONTH FROM `W1`.`INJECTION_DATE`)))
To get all other data as the full month you'll need to combine your code:
SUM(`W1`.`INJECTION_VOLUME` / EXTRACT(DAY FROM LAST_DAY(`W1`.`INJECTION_DATE`)))
With an IF. So something like this:
SUM(
IF(
EXTRACT(YEAR_MONTH FROM `W1`.`INJECTION_DATE`) = EXTRACT(YEAR_MONTH FROM NOW()),
`W1`.`INJECTION_VOLUME` / COUNT(EXTRACT(YEAR_MONTH FROM `W1`.`INJECTION_DATE`)),
`W1`.`INJECTION_VOLUME` / EXTRACT(DAY FROM LAST_DAY(`W1`.`INJECTION_DATE`)
)
)
Note: this is untested and I'm not sure about the RDBMS you are using so you may need to change the code slightly to make it work.

SQL Statement for MS Access Query to Calculate Quarterly Growth Rate

I have a table named "Historical_Stock_Prices" in a MS Access database. This table has the columns: Ticker, Date1, Open1, High, Low, Close1, Volume, Adj_Close. The rows consist of the data for each ticker for every business day.
I need to run a query from inside my VB.net program that will return a table in my program that displays the growth rates for each quarter of every year for each ticker symbol listed. So for this example I would need to find the growth rate for GOOG in the 4th quarter of 2012.
To calculate this manually I would need to take the Close Price on the last BUSINESS day of the 4th quarter (12/31/2012) divided by the Open Price of the first BUSINESS day of the 4th quarter (10/1/2012). Then I need to subtract by 1 and multiply by 100 in order to get a percentage.
The actual calculation would look like this: ((707.38/759.05)-1)*100 = -6.807%
The first and last days of each quarter may vary due to weekend days.
I cannot come up with the correct syntax for the SQL statement to create a table of Growth Rates from a table of raw Historical Prices. Can anyone help me with the SQL statment?
Here's how I would approach the problem:
I'd start by creating a saved query Access named [Stock_Price_with_qtr] that calculates the year and quarter for each row:
SELECT
Historical_Stock_Prices.*,
Year([Date1]) AS Yr,
Switch(Month([Date1])<4,1,Month([Date1])<7,2,Month([Date1])<10,3,True,4) AS Qtr
FROM Historical_Stock_Prices
Then I'd create another saved query in Access named [Qtr_Dates] that finds the first and last business days for each ticker and quarter:
SELECT
Stock_Price_with_qtr.Ticker,
Stock_Price_with_qtr.Yr,
Stock_Price_with_qtr.Qtr,
Min(Stock_Price_with_qtr.Date1) AS Qtr_Start,
Max(Stock_Price_with_qtr.Date1) AS Qtr_End
FROM Stock_Price_with_qtr
GROUP BY
Stock_Price_with_qtr.Ticker,
Stock_Price_with_qtr.Yr,
Stock_Price_with_qtr.Qtr
That would allow me to use the following query in VB.NET (or C#, or Access itself) to calculate the quarterly growth rates:
SELECT
Qtr_Dates.Ticker,
Qtr_Dates.Yr,
Qtr_Dates.Qtr,
(([Close_Prices]![Close1]/[Open_Prices]![Open1])-1)*100 AS Qtr_Growth
FROM
(
Historical_Stock_Prices AS Open_Prices
INNER JOIN Qtr_Dates
ON (Open_Prices.Ticker = Qtr_Dates.Ticker)
AND (Open_Prices.Date1 = Qtr_Dates.Qtr_Start)
)
INNER JOIN
Historical_Stock_Prices AS Close_Prices
ON (Qtr_Dates.Ticker = Close_Prices.Ticker)
AND (Qtr_Dates.Qtr_End = Close_Prices.Date1)