Select part of an integer in SQL (GBQ) - sql

I have the following table in Google BigQuery (Standard SQL) with a date column and a revenue column, both stored as integers. I need the totals by month and year, but I am not able to extract the part of the date I am interested in: basically the digits in positions 1 through 6 of the first column.
Revenue Database:
This is what I have tried but then I need to run this code for every month separately:
SELECT sum(revenue)
from revenue.table
where date between 20210601 and 20210630
Any clue on how to do this? Thanks!

If the value is in int format, then use arithmetic:
select floor(date / 100) as yyyymm, sum(revenue)
from revenue.table
group by yyyymm;
If it were stored properly as a date, then you would use the built-in date_trunc():
select date_trunc(date, month) as yyyymm, sum(revenue)
from revenue.table
group by yyyymm;
There are similar functions for related data types: timestamp_trunc() and datetime_trunc().

If your date is actually an INTEGER, then I would:
GROUP BY DIV(date, 100)
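As a fuller sketch of the same integer-arithmetic idea, the query below splits the value into separate year and month columns. It assumes `date` is an INT64 in YYYYMMDD form and reuses the `revenue.table` name from the question; the output aliases are made up for illustration.

```sql
-- Sketch only: assumes `date` is an INT64 in YYYYMMDD form, e.g. 20210615.
SELECT
  DIV(date, 10000) AS year,            -- 20210615 -> 2021
  MOD(DIV(date, 100), 100) AS month,   -- 20210615 -> 202106 -> 6
  SUM(revenue) AS total_revenue
FROM `revenue.table`
GROUP BY year, month
ORDER BY year, month;
```

DIV performs integer division, so the result stays an INT64 rather than becoming a float.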

I would go with below
select
date_trunc(parse_date('%Y%m%d', '' || date), month) month,
sum(revenue) revenue
from `revenue.table`
group by month
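A possible variant of this approach, if a 'YYYY-MM' text label is preferred over a date value: the sketch below uses an explicit CAST and FORMAT_DATE. It assumes every integer in `date` is a valid YYYYMMDD value; the `year_month` alias is made up for illustration.

```sql
-- Sketch only: PARSE_DATE raises an error if a value is not a valid YYYYMMDD date.
SELECT
  FORMAT_DATE('%Y-%m', PARSE_DATE('%Y%m%d', CAST(date AS STRING))) AS year_month,
  SUM(revenue) AS revenue
FROM `revenue.table`
GROUP BY year_month
ORDER BY year_month;
```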

Try this:
SELECT substring(cast(Date as string),1,6) YearMonth,
sum(Revenue)
FROM `<Dataset_NAME>.<Table_NAME>`
group by YearMonth;

Adding my own solution in case it helps someone in the future:
SELECT substring(CAST(date AS STRING), 1, 6) AS month, sum(revenue) AS total_rev
FROM `revenue.table`
GROUP BY month
ORDER BY month DESC

Related

SQL No of count month wise

I have a data set as below,
The data is basically year and month in YYYYMM form. I need a count per month, e.g. 202001 appears 3 times, hence the count should be Nov 3 (the desired output is shared below).
I'm unable to even start on a query that produces the desired output; help would be much appreciated.
(Temp tables are not allowed to be created on the servers.)
Sample data: link
You can use to_date() to convert your number to a proper date, then group by that date:
select to_date(due_date_key::text, 'yyyymm') as due_date,
count(*)
from t
group by due_date;
The "due_date" column is a proper date, you can use the to_char() function to format it differently:
select to_char(due_date, 'yyyy') as year,
to_char(due_date, 'Mon') as output,
count
from (
select to_date(due_date_key::text, 'yyyymm') as due_date,
count(*)
from t
group by due_date
) t
order by due_date;

group by year month in postgresql

customer Date location
1 25Jan2018 texas
2 15Jan2018 texas
3 12Feb2018 Boston
4 19Mar2017 Boston.
I am trying to find the count of customers grouped by year and month of the Date column. The Date column is of text data type.
e.g. in Jan2018, the count is 2
I would do something like the following:
SELECT
date_part('year', formattedDate) as Year
,date_part('month', formattedDate) as Month
,count(*) as CustomerCountByYearMonth
FROM
(SELECT to_date(Date,'DDMonYYYY') as formattedDate from <table>) as tbl1
GROUP BY
date_part('year', formattedDate)
,date_part('month', formattedDate)
Any additional formatting for dates could be done on the inner query that will allow for adjustments in case some single digit days need to be padded or a month has four letters instead of three etc.
By converting to date type, you can properly order by date type and not alphabetical etc.
Optionally:
SELECT
Year
,Month
,count(*) as CustomerCountByYearMonth
FROM
(SELECT
date_part('year', to_date(Date,'DDMonYYYY')) as Year
,date_part('month', to_date(Date,'DDMonYYYY')) as Month
FROM <table>) as tbl1
GROUP BY
Year
,Month
You shouldn't store dates in a text column...
select substring(Date, length(Date)-6), count(*)
from tablename
group by substring(Date, length(Date)-6)
I thought @Jarlh asked a good question: what about dates like January 1? Is it 01Jan2019 or 1Jan2019? If it can be either, perhaps a regex would work.
select
substring (date from '\d+(\D{3}\d{4})') as month,
count (distinct customer)
from t
group by month
The 'distinct customer' also presupposes you may have the same customer listed in the same month, but you only want to count it once. If that's not the case, just remove 'distinct.'
And, if you wanted the output in date format:
select
to_date (substring (date from '\d+(\D{3}\d{4})'), 'monyyyy') as month,
count (distinct customer)
from t
group by month
If it is a date column, you can truncate the date:
select date_trunc('month', date) as yyyymm, count(*)
from t
group by yyyymm
order by yyyymm;
I really read that the type was date. For a string, just use string functions:
select substr(date, 3, 7) as mmmyyyy, count(*)
from t
group by mmmyyyy;
Unfortunately, ordering doesn't work in this case. You should really be storing dates using the proper type.
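One way to keep the text label and still sort chronologically is to parse the string first, group on the truncated date, and format only in the outer query. This is a sketch under the assumption that every value parses with the 'DDMonYYYY' pattern; the `month_start` and `customers` aliases are made up for illustration.

```sql
-- Sketch only: assumes every value in the text column parses as 'DDMonYYYY'.
select to_char(month_start, 'MonYYYY') as mmmyyyy,
       customers
from (
  select date_trunc('month', to_date(date, 'DDMonYYYY')) as month_start,
         count(*) as customers
  from t
  group by month_start
) m
order by month_start;
```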

SQL BigQuery : Calculate Value per time period

I'm new to SQL on BigQuery and I'm blocked on a project I have to compile.
I'm being asked to find the year-over-year growth of sales, in percent, on a table that doesn't even sum the revenues. I know I have to combine several queries, but I can't figure out how to calculate the growth of sales.
Here is where I am at:
Does anybody have an insight on how to do so?
Thanks a lot !
(1) Starting from what you have, group by product line to get this year and last year's revenue in each row:
#standardsql
with yearly_sales AS (
select year, product_line, sum(revenue) as revenue
from `dataset.sales`
group by product_line, year
),
year_on_year AS (
select array_agg(struct(year, revenue))
OVER(partition by product_line ORDER BY year
RANGE BETWEEN 1 PRECEDING AND CURRENT ROW) AS data
from yearly_sales
)
(2) Compute year-on-year growth from the two values you now have in each row
Below is for BigQuery Standard SQL
#standardSQL
SELECT product_line, year, revenue, prev_year_revenue,
ROUND(100 * (revenue - prev_year_revenue)/prev_year_revenue) year_over_year_growth_percent
FROM (
SELECT product_line, year, revenue,
LAG(revenue) OVER(PARTITION BY product_line ORDER BY year) prev_year_revenue
FROM (
SELECT product_line, year, SUM(revenue) revenue
FROM `project.dataset.table`
GROUP BY product_line, year
)
)
-- ORDER BY product_line, year
I tried with your information (plus my own made-up data for 2007) and arrived here:
SELECT
year,
sum(revenue) as year_sum
FROM
YearlyRevenue.SportCompany
GROUP BY
year
ORDER BY
year_sum
Whose result is:
R year year_sum
1 2005 1.159E9
2 2006 1.4953E9
3 2007 1.5708E9
Now the % growth should be added. Have a look here for inspiration.
Let me know if you don't succeed and I will try the hard part, with no guarantees.
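For what it's worth, a minimal sketch of that remaining step, building on the yearly totals query above: LAG fetches the previous year's total and SAFE_DIVIDE avoids a division-by-zero error. The `growth_percent` alias is made up for illustration, and the first year has no predecessor, so its growth is NULL.

```sql
-- Sketch only: reuses the YearlyRevenue.SportCompany table from the query above.
SELECT
  year,
  year_sum,
  ROUND(100 * SAFE_DIVIDE(year_sum - LAG(year_sum) OVER (ORDER BY year),
                          LAG(year_sum) OVER (ORDER BY year)), 1) AS growth_percent
FROM (
  SELECT year, SUM(revenue) AS year_sum
  FROM YearlyRevenue.SportCompany
  GROUP BY year
)
ORDER BY year;
```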

BigQuery RATIO_TO_REPORT for all data no partition

I want to calculate the ratio of a specific field. I know that in legacy SQL I can use the RATIO_TO_REPORT function, e.g.:
SELECT
month,
RATIO_TO_REPORT(totalPoint) over (partition by month)
FROM (
SELECT
format_datetime('%Y-%m', ts) AS month,
SUM(point) AS totalPoint
FROM
`userPurchase`
GROUP BY
month
ORDER BY
month )
But I want the ratio calculated over all data, without a partition, e.g. (this code does not work):
SELECT
month,
RATIO_TO_REPORT(totalPoint) over (partition by "all"),
# RATIO_TO_REPORT(totalPoint) over (partition by null)
FROM (
SELECT
format_datetime('%Y-%m', ts) AS month,
SUM(point) AS totalPoint
FROM
`userPurchase`
GROUP BY
month
ORDER BY
month )
It doesn't work. How can I do the same thing? Thanks!
Assuming the rest of the code is correct, just omit the PARTITION BY part:
RATIO_TO_REPORT(totalPoint) OVER ()
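If the query has to run as standard SQL (the backticks and format_datetime in the question suggest it does), RATIO_TO_REPORT is, as far as I know, a legacy SQL function and may not be available. A sketch of the equivalent calculation divides each row by a window SUM over the whole result. It assumes `ts` is a DATETIME column; the `ratio` alias is made up for illustration.

```sql
-- Sketch only: an empty OVER () makes the window span all rows of the subquery.
SELECT
  month,
  SAFE_DIVIDE(totalPoint, SUM(totalPoint) OVER ()) AS ratio
FROM (
  SELECT
    FORMAT_DATETIME('%Y-%m', ts) AS month,
    SUM(point) AS totalPoint
  FROM `userPurchase`
  GROUP BY month
)
ORDER BY month;
```

The ORDER BY is moved to the outer query because ordering inside a subquery does not determine the order of the final result.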

How do I correctly use the SQL Sum function with multiple variables and grouping?

I am trying to write an SQL statement based on the following code.
CREATE TABLE mytable (
year INTEGER,
month INTEGER,
day INTEGER,
hoursWorked INTEGER )
Assuming that each employee works multiple days over each month in a 3 year period.
I need to write an sql statement that returns the total hours worked in each month, grouped by earliest year/month first.
I tried doing this, but I don't think it is correct:
SELECT Sum(hoursWorked) FROM mytable
ORDER BY(year,month)
GROUP BY(month);
I am a little confused about how to use the SUM function in conjunction with the GROUP BY or ORDER BY clause. How does one go about doing this?
Try this:
SELECT year, month, SUM(hoursWorked)
FROM mytable
GROUP BY year, month
ORDER BY year, month
This way you will have, for example:
2014 12 30
2015 1 12
2015 2 40
Every non-aggregated column in the SELECT part of the query must also appear in GROUP BY. The reverse is not strictly required, but you usually want to select the columns you group by so that each row of the result can be identified.
SELECT year, month, Sum(hoursWorked)as workedhours
FROM mytable
GROUP BY year,month
ORDER BY year,month;
You have to group by year and month.
Is this what you are trying to do? This will sum by year/month and order by year/month.
Select [Year], [Month], Sum(HoursWorked) as WorkedHours
From mytable
Group By [Year], [Month]
Order by [Year], [Month]
You have to group by year and month, otherwise you will have the hours you worked on March 2014 and 2015 in one record :)
SELECT Sum(hoursWorked) as hoursWorked, year, month
FROM mytable
GROUP BY year, month
ORDER BY year, month
;