SQL/HSQLDB query and sub-query in Aggregate Function

My database looks like this (very simple); the table is called "RideDate":
BikeDate Bike Miles
What I am looking for is a query that, for each month, returns the total (SUM) across all years, the average (AVG) across all years, and the total for one specific year
(WHERE YEAR("BikeDate") = '2014'). (I don't have my exact code in front of me: power fluctuations from high winds and wet, heavy snow have pushed me onto an iPad.)
My attempt goes something like this:
SELECT MONTH("BikeDate") AS "Month", SUM("Miles") AS "SMiles", AVG("AMiles") AS "Average",
(SELECT MONTH("BikeDate") SUM("Miles") WHERE YEAR("BikeDate") = '2014') AS "2014"
FROM "RideDate"
GROUP BY MONTH("BikeDate")
ORDER BY MONTH("BikeDate") ASC
The results should be:
(month) (sum of month over all years) (avg of month over all years) (sum of month for '14)
The last column will not collate by the 'group by month' and gives a sum for the whole year.
How can I write the sub-query to sum across the iterated month of the main query for the selected year? Is there another way of solving this?

You can try it with a CROSS JOIN:
SELECT *
FROM
  (SELECT MONTH("BikeDate") AS "Month", SUM("Miles") AS "SMiles", AVG("Miles") AS "Average"
   FROM "RideDate"
   GROUP BY MONTH("BikeDate")) a
CROSS JOIN
  (SELECT SUM("Miles") AS "YearSum"
   FROM "RideDate"
   WHERE YEAR("BikeDate") = 2014) b
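Note that a CROSS JOIN against a single-row subquery repeats the same whole-year total on every month row, which is the symptom described in the question. One alternative that yields a per-month figure is conditional aggregation: put the year test inside the SUM instead of in a sub-query. The sketch below is not HSQLDB; it uses Python's built-in sqlite3 module so it can be run anywhere, and therefore swaps MONTH()/YEAR() for strftime(). The function name monthly_summary and the "YearMiles" alias are made up for illustration, and dates are assumed to be stored as 'YYYY-MM-DD' text.

```python
import sqlite3

def monthly_summary(rows, year):
    """rows: (bike_date 'YYYY-MM-DD', miles) pairs; year: int, e.g. 2014."""
    con = sqlite3.connect(":memory:")
    con.execute('CREATE TABLE "RideDate" ("BikeDate" TEXT, "Miles" REAL)')
    con.executemany('INSERT INTO "RideDate" VALUES (?, ?)', rows)
    sql = """
        SELECT CAST(strftime('%m', "BikeDate") AS INTEGER) AS "Month",
               SUM("Miles") AS "SMiles",
               AVG("Miles") AS "Average",
               -- only rows from the chosen year contribute to this sum
               SUM(CASE WHEN CAST(strftime('%Y', "BikeDate") AS INTEGER) = ?
                        THEN "Miles" ELSE 0 END) AS "YearMiles"
        FROM "RideDate"
        GROUP BY CAST(strftime('%m', "BikeDate") AS INTEGER)
        ORDER BY 1
    """
    result = con.execute(sql, (year,)).fetchall()
    con.close()
    return result
```

In HSQLDB the same idea would presumably read SUM(CASE WHEN YEAR("BikeDate") = 2014 THEN "Miles" ELSE 0 END) inside the original GROUP BY MONTH("BikeDate") query, though that form is untested here.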

Related

SQL Pivot table, with multiple pivots on criteria

Here is my dataset.
Each row has a reservation (unique ID), a reservation_dt, a fiscal year (the same year for most rows), the month as both a number and a name, a reservation status, the total number reserved, and a counter (basically 1 for each reservation row).
These are my guidelines (each needs to be turned into a column, by month):
Requested (count of all distinct reservations)
Num_Requested (sum of total_number_requested by month)
Booked (count of all distinct reservations where status is order created)
Num_Booked (sum of total_number_requested by month where status is order created)
Not_Booked (count of all distinct reservations where status is unfulfilled)
Not_Num_Booked (sum of total_number_requested by month where status is unfulfilled)
I am trying to translate this into a pivot table. This is what I've got so far, and I can't figure out why it's not working.
I figured I would turn each of the above guidelines into a column, using either sum(total_number_requested) or count(total_requested) where the reservation status is the relevant value.
I'm open to any other ideas for making this simpler and getting it to work.
SELECT [month_name],
fyear AS fyear,
Requested,
Num_Requested
FROM (SELECT reservation,
reservation_status,
total_number_requested,
fyear,
[month_name],
[month],
total_requested
FROM #temp2) SourceTable
PIVOT (SUM(total_number_requested)
FOR reservation_status IN ([Requested])) PivotNumbRequested PIVOT(COUNT(reservation)
FOR total_requested IN ([Num_Requested])) PivotCountRequested
WHERE [month] = 7
ORDER BY fyear,
[month];

Use conditional expressions to emulate the pivot. Example:
SELECT fyear, [month], [month_name], Count(*) AS CountALL, Sum(total_number_requested) AS TotNum,
Sum(IIf(reservation_status = 'Order Created', total_number_requested, Null)) AS SumCreated
FROM tablename
GROUP BY fyear, [month], [month_name]
More info:
SQLServer - Multiple PIVOT on same columns
Crosstab Query on multiple data points
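As a rough illustration of how the conditional-expression approach could cover all six requested columns, here is a sketch run against an in-memory SQLite database from Python. It uses CASE rather than IIf so it is portable, and COUNT(DISTINCT CASE ...) for the distinct-reservation counts. The table name res, the function name reservation_summary, and the exact status strings ('Order Created', 'Unfulfilled') are assumptions; adjust them to the real data.

```python
import sqlite3

def reservation_summary(rows):
    """rows: (reservation, month, status, total_number_requested) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE res (reservation TEXT, month INTEGER, "
        "status TEXT, total_number_requested INTEGER)"
    )
    con.executemany("INSERT INTO res VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT month,
               COUNT(DISTINCT reservation) AS Requested,
               SUM(total_number_requested) AS Num_Requested,
               -- CASE without ELSE yields NULL, which COUNT ignores
               COUNT(DISTINCT CASE WHEN status = 'Order Created'
                                   THEN reservation END) AS Booked,
               SUM(CASE WHEN status = 'Order Created'
                        THEN total_number_requested ELSE 0 END) AS Num_Booked,
               COUNT(DISTINCT CASE WHEN status = 'Unfulfilled'
                                   THEN reservation END) AS Not_Booked,
               SUM(CASE WHEN status = 'Unfulfilled'
                        THEN total_number_requested ELSE 0 END) AS Not_Num_Booked
        FROM res
        GROUP BY month
        ORDER BY month
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Each output row is one month, with the six measures side by side, so no PIVOT operator is needed.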

Running Total - Create row for months that don't have any sales in the region (1 row for each region in each month)

I am working on the query below, which I will use inside Tableau to create a line chart that is color-coded by year and uses the region as a filter for the user. The query works, but I found that some regions have months without any sales. These gaps break up the line chart, and I am not able to fill them in (I am using a non-date dimension on the X-axis: the number of months until the end of the fiscal year).
I am looking for help altering my query to create a row for every month and every region in my dataset, so that my running total always has a value to display in the line chart. If there are no values in my table for a month, the row should show 0 and carry the running total for the region forward.
I have a dimDate table and also a Regions table I can use in the query.
My query's current results (sorted in Excel for easier viewing): [image: Results Table Now]
What I want to do, with the new rows highlighted in yellow: [image: What I want to do]
My Code using SQL Server:
SELECT b.gy,
b.sales_month,
b.region,
b.gs_year_total,
b.months_away,
Sum(b.gs_year_total)
OVER (
partition BY b.gy, b.region
ORDER BY b.months_away DESC) RT_by_Region_GY
FROM (SELECT a.gy,
a.region,
a.sales_month,
Sum(a.gy_total) Gs_Year_Total,
a.months_away
FROM (SELECT g.val_id,
g.[gs year] AS GY
,
g.sales_month
AS
Sales_Month,
g.gy_total,
Datediff(month, g.sales_month, dt.lastdayofyear) AS
months_away,
g.value_type,
val.region
FROM uv_sales g
JOIN dbo.dimdate AS dt
ON g.[gs year] = dt.gsyear
JOIN dimvalsummary val
ON g.val_id = val.val_id
WHERE g.[gs year] IN ( 2017, 2018, 2019, 2020, 2021 )
GROUP BY g.valuation_id,
g.[gs year],
val.region,
g.sales_month,
dt.lastdayofyear,
g.gy_total,
g.value_type) a
WHERE a.months_away >= 0
AND sales_month < Dateadd(month, -1, Getdate())
GROUP BY a.gy,
a.region,
a.sales_month,
a.months_away) b

It's tough to envision the best method without data or the meaning of all those fields. Here's a rough sketch of how one might attempt to solve it. It is not complete or tested, sorry.
Create a table called all_months and insert all the months from the oldest to whatever date in the future you need.
01/01/2017
02/01/2017
...
12/01/2049
You may need one query per region, unioned together. Select the year and month from the all_months table, and left join to your other table on month. Coalesce your dollar values.
select 'East' as region,
       year(m.month) as gy_year,
       m.month as sales_month,
       coalesce(g.gy_total, 0) as gy_total,
       datediff(month, m.month, dt.lastdayofyear) as months_away
from all_months m
left join uv_sales g on g.sales_month = m.month
--and so on
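Instead of one query per region, the calendar can be cross joined to the list of regions so that every (region, month) pair exists before the sales are attached. The sketch below shows that idea end to end with a running total, using Python's sqlite3 (window functions need SQLite 3.25 or later). The table and column names here (all_months, regions, sales, total) are simplified stand-ins, not the real uv_sales/dimdate schema, and months are assumed to be 'YYYY-MM' strings.

```python
import sqlite3

def running_totals(sales, months, regions):
    """sales: (region, month, total) tuples; months/regions: lists of strings."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE all_months (month TEXT)")
    con.execute("CREATE TABLE regions (region TEXT)")
    con.execute("CREATE TABLE sales (region TEXT, month TEXT, total REAL)")
    con.executemany("INSERT INTO all_months VALUES (?)", [(m,) for m in months])
    con.executemany("INSERT INTO regions VALUES (?)", [(r,) for r in regions])
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)
    sql = """
        SELECT r.region,
               m.month,
               COALESCE(SUM(s.total), 0) AS month_total,
               -- running total per region; empty months add 0
               SUM(COALESCE(SUM(s.total), 0))
                   OVER (PARTITION BY r.region ORDER BY m.month) AS running_total
        FROM all_months m
        CROSS JOIN regions r
        LEFT JOIN sales s ON s.region = r.region AND s.month = m.month
        GROUP BY r.region, m.month
        ORDER BY r.region, m.month
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Months with no sales now appear with a month_total of 0 and the previous running total carried forward, which is what the line chart needs.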

SQLite - Use a CTE to divide a query

Quick question for the SQL experts out there. I feel a bit stupid because I think I am close to the solution but have not been able to reach it.
If I have these two tables, how can I use the former one to divide a column of the second one?
WITH month_usage AS
(SELECT strftime('%m', starttime) AS month, SUM(slots) AS total
FROM Bookings
GROUP BY month)
SELECT strftime('%m', b.starttime) AS month, f.name, SUM(slots) AS usage
FROM Bookings as b
LEFT JOIN Facilities as f
ON b.facid = f.facid
GROUP BY name, month
ORDER BY month
The first one computes the total for each month.
The second one is the query whose usage column I want to divide by the total of each month to get the percentage.
When I JOIN both tables using month as an id, it messes up the content. Any suggestions?

I want to divide the usage column by the total of each month to get the percentage
Just use window functions:
SELECT
    strftime('%m', b.starttime) AS month,
    f.name,
    SUM(slots) AS usage,
    1.0 * SUM(slots)
        / SUM(SUM(slots)) OVER(PARTITION BY strftime('%m', b.starttime)) AS ratio
FROM Bookings as b
LEFT JOIN Facilities as f
    ON b.facid = f.facid
GROUP BY name, month
ORDER BY month
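To see the window-function approach working, here is a small harness that runs essentially the same query through Python's sqlite3 on a few made-up rows (window functions require SQLite 3.25 or later). The function name usage_share and the sample data are illustrative only; the Bookings and Facilities columns are limited to the ones the query needs.

```python
import sqlite3

def usage_share(bookings, facilities):
    """bookings: (facid, starttime, slots); facilities: (facid, name)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Bookings (facid INTEGER, starttime TEXT, slots INTEGER)")
    con.execute("CREATE TABLE Facilities (facid INTEGER, name TEXT)")
    con.executemany("INSERT INTO Bookings VALUES (?, ?, ?)", bookings)
    con.executemany("INSERT INTO Facilities VALUES (?, ?)", facilities)
    sql = """
        SELECT strftime('%m', b.starttime) AS month,
               f.name,
               SUM(b.slots) AS usage,
               -- inner SUM is the per-group total; outer SUM ... OVER adds
               -- those group totals up across the whole month
               1.0 * SUM(b.slots)
                   / SUM(SUM(b.slots)) OVER (PARTITION BY strftime('%m', b.starttime)) AS ratio
        FROM Bookings AS b
        LEFT JOIN Facilities AS f ON b.facid = f.facid
        GROUP BY f.name, strftime('%m', b.starttime)
        ORDER BY month, f.name
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Multiplying by 1.0 forces floating-point division; without it SQLite would do integer division and most ratios would come out as 0.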

I'm trying to calculate the difference between two weeks but I'm getting a weird peak when plotting the results ( SQL / BigQuery )

I have a daily table that contains the number of visitors per store, every day.
My tables columns are:
Date
Store
Number_of_Visitors
Views : number of views of the stores' ads.
So I first started with aggregating my table to a weekly table so that I can calculate the variance between a week and the next one.
Here is how I defined variance:
Variance = Number of Visitors in WEEK N+1 / Number of Visitors in WEEK N
I wrote the following query to do that (new table called: weekly)
SELECT
year_week,
min(date) as date,
Store,
SUM(Number_Of_Visitors) AS TOTAL_VISITORS
FROM (
SELECT
*,
CONCAT(cast((extract(YEAR from date)), LPAD(cast((extract(WEEK from date)) as string), 2, '0') ) AS year_week
FROM `my-project`)
GROUP BY
year_week, Store
ORDER BY year_week
Then, in order to calculate the variance, I used the following query as well:
SELECT
base.*,
((base.TOTAL_VISITORS-lw.TOTAL_VISITORS)/lw.TOTAL_VISITORS) AS VAR_FF,
FROM
`weekly` base
JOIN (
SELECT
* EXCEPT (date),
DATE_ADD(DATE(TIMESTAMP(date)), INTERVAL 1 Week)AS n_date
FROM
`weekly` ) lw
ON
base.date = lw.n_date
AND base.Store= lw.Store
When I plot the variance (VAR_FF) in Data Studio, I get the following plot, which doesn't seem to make sense given the high peak in the middle:

I am thinking your code should look like this:
SELECT date_trunc(date, week) as year_week,
       Store,
       SUM(Number_Of_Visitors) AS TOTAL_VISITORS,
       (1 -
        (LAG(SUM(Number_Of_Visitors)) OVER (PARTITION BY Store ORDER BY MIN(date)) /
         SUM(Number_Of_Visitors)
        )
       ) as VAR_FF
FROM `my-project`
GROUP BY year_week, Store
ORDER BY year_week;
I'm not sure what your weird calculations for calculating the week are really doing. This is based on the previous week in the data.
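The LAG pattern above can be tried out locally with a sketch like the following, which runs an equivalent query in SQLite through Python (SQLite 3.25 or later for window functions). It is not BigQuery: strftime('%Y-%W', ...) stands in for date_trunc(date, week), and the table name daily, the column visitors, and the function name week_over_week are invented for the example.

```python
import sqlite3

def week_over_week(daily):
    """daily: (date 'YYYY-MM-DD', store, visitors) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE daily (date TEXT, store TEXT, visitors INTEGER)")
    con.executemany("INSERT INTO daily VALUES (?, ?, ?)", daily)
    sql = """
        SELECT strftime('%Y-%W', date) AS year_week,
               store,
               SUM(visitors) AS total_visitors,
               -- previous week's total for the same store, divided by this week's;
               -- NULL for the first week because LAG has nothing to look back at
               1 - 1.0 * LAG(SUM(visitors))
                             OVER (PARTITION BY store ORDER BY MIN(date))
                       / SUM(visitors) AS var_ff
        FROM daily
        GROUP BY strftime('%Y-%W', date), store
        ORDER BY store, MIN(date)
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Because LAG looks at the previous row per store rather than joining on "date plus one week", it does not depend on each week's MIN(date) falling exactly seven days after the previous one.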

Average Group size per month Over previous ten years

I need to find the average size (average number of employees) of all the groups (employers) that we do business with per month for the last ten years.
So I have no problem getting the average group size for each month. For the Current month I can use the following:
Select count(*)
from Employees EE
join Employers ER on EE.employerid = ER.employerid
group by ER.EmployerName
This will give me a list of how many employees are in each group. I can then copy and paste the column into Excel to get the average for the current month.
For the previous month, I want to exclude any employees that were added after that month. I have a query for this too:
Select count(*)
from Employees EE
join Employers ER on EE.employerid = ER.employerid
where EE.dateadded <= DATEADD(month, -1,GETDATE())
group by ER.EmployerName
That will exclude all employees that were added this month. I can continue doing this all the way back ten years, but I know there is a better way. I have no problem running this query 120 times and copying and pasting the results into Excel to compute the average. However, I'd rather learn a more efficient way to do this.
Another question: I can't do the following. Does anyone know a way around it?
Select avg(count(*))
Thanks in advance guys!!
Edit: Employees that have been terminated can be found like this. NULL are employees that are currently employed.
Select count(*)
from Employees EE
join Employers ER on EE.employerid = ER.employerid
join Gen_Info gne on gne.id = EE.newuserid
where EE.dateadded <= DATEADD(month, -1,GETDATE())
and (gne.TerminationDate is NULL OR gne.TerminationDate < DATEADD(day, -14,GETDATE()))
group by ER.EmployerName

Are you after a query that shows the count by the year and month they were added? If so, this seems pretty straightforward.
This uses the MySQL date functions Year and Month.
Select AVG(cnt) FROM (
Select count(*) cnt, Year(dateAdded), Month(dateAdded)
from System_Users su
join system_Employers se on se.employerid = su.employerid
group by Year(dateAdded), Month(dateAdded)) B
The inner query breaks out the counts by year and month. We then wrap that in a query to show the average.
--2nd attempt but I'm Brain FriDay'd out.
This uses a Common Table Expression (CTE) to generate a set of data for the count by year and month of the employees, and then averages it out by month.
If this isn't what you're after, sample data with expected results would help frame the question better, so that I can avoid making assumptions about what you need or want.
With CTE AS (
Select Year(dateAdded) YR , Month(DateAdded) MO, count(*) over (partition by Year(dateAdded), Month(dateAdded) order by DateAdded Asc) as RunningTotal
from System_Users su
join system_Employers se on se.employerid = su.employerid)
Select avg(RunningTotal), mo from cte
group by mo
order by mo;
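If the goal is the average group size as of each month, one possible reading of the question, the 120 manual runs can be replaced by joining the employees to a table of month-end dates. Wrapping the per-employer COUNT(*) in a derived table is also the usual way around the fact that AVG(COUNT(*)) cannot be nested directly. The sketch below uses Python's sqlite3; the month_ends table, the trimmed-down employees table, and the function name avg_group_size are assumptions for illustration, and terminations are ignored to keep it short.

```python
import sqlite3

def avg_group_size(employees, month_ends):
    """employees: (employerid, dateadded 'YYYY-MM-DD'); month_ends: date strings."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employees (employerid INTEGER, dateadded TEXT)")
    con.execute("CREATE TABLE month_ends (month_end TEXT)")
    con.executemany("INSERT INTO employees VALUES (?, ?)", employees)
    con.executemany("INSERT INTO month_ends VALUES (?)", [(d,) for d in month_ends])
    sql = """
        SELECT month_end, AVG(cnt) AS avg_group_size
        FROM (
            -- head count per employer as of each month end
            SELECT m.month_end, e.employerid, COUNT(*) AS cnt
            FROM month_ends m
            JOIN employees e ON e.dateadded <= m.month_end
            GROUP BY m.month_end, e.employerid
        ) AS per_employer
        GROUP BY month_end
        ORDER BY month_end
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Employers with no employees yet at a given month end simply do not appear in the inner query, so they do not pull that month's average down.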
