Sum entries of an sqlite query - sql

I have a table similar to this:
| id(INTEGER) | id2(INTEGER) | time(DATETIME) | value(REAL) |
|-------------|--------------|----------------------|-------------|
| 1 | 2000 | 2004-01-01 00:00:00 | 1000 |
which I query with Visual Basic. Now I want to sum all entries between the years 2004 and 2010, so the result looks like this:
2004 11,000
2005 35,000
2006 46,000
If I do it inside Visual Basic, it takes a few loops, but unfortunately this does not perform well.
Is it possible to create a query that yields the result between two years, grouped by year? Or between two months (within one year) grouped by month, or likewise by days (within one month), hours (within one day), or minutes (within one hour)?
EDIT:
Query for year interval:
SELECT STRFTIME('%Y', time) AS year, SUM(value) AS cumsum FROM mytable WHERE year >= STRFTIME('%Y','2005-01-01 00:00:00') AND year <= STRFTIME('%Y','2010-01-01 00:00:00') GROUP BY STRFTIME('%Y', mytable.time) ORDER BY year;
Now I need an idea for months, days and hours.

This is an aggregation query, but you need to extract the year from the date:
select strftime('%Y', time) as yr, sum(value)
from mytable
group by strftime('%Y', time)
order by yr;
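The same pattern should extend to the finer intervals asked about in the question, since only the strftime format string changes. Below is a minimal sketch using Python's built-in sqlite3 module; the helper name `sum_by`, the `FORMATS` mapping, and the sample rows are made up for illustration, while the table and column names follow the question.

```python
import sqlite3

# strftime format per grouping level (standard SQLite format codes).
FORMATS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%d %H",
    "minute": "%Y-%m-%d %H:%M",
}

def sum_by(conn, level, start, end):
    """Sum `value` per period for rows with start <= time < end (hypothetical helper)."""
    sql = (
        "SELECT strftime(?, time) AS period, SUM(value) AS total "
        "FROM mytable WHERE time >= ? AND time < ? "
        "GROUP BY period ORDER BY period"
    )
    return conn.execute(sql, (FORMATS[level], start, end)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, id2 INTEGER, time DATETIME, value REAL)"
)
# Made-up sample rows, stored as 'YYYY-MM-DD HH:MM:SS' text.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, 2000, "2004-01-01 00:00:00", 1000.0),
        (2, 2000, "2004-06-15 12:30:00", 500.0),
        (3, 2000, "2005-03-01 08:00:00", 200.0),
        (4, 2000, "2005-03-01 08:45:00", 300.0),
        (5, 2000, "2005-03-02 09:00:00", 400.0),
    ],
)

by_year = sum_by(conn, "year", "2004-01-01", "2011-01-01")
by_month = sum_by(conn, "month", "2005-01-01", "2006-01-01")
by_hour = sum_by(conn, "hour", "2005-03-01", "2005-03-02")
print(by_year, by_month, by_hour)
```

Filtering on the raw time column with a half-open range (start inclusive, end exclusive) relies on the timestamps being stored as ISO-formatted text, which is an assumption about the data here.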

Hi, below is the code for both SQL and SQLite. In SQLite we can group by the year alias instead of repeating the strftime call. Grouping by time itself would give one row per timestamp rather than one per year.
SQLITE:
select strftime('%Y', time) as year, sum(value) as value
from attendence
group by year
SQL:
select EXTRACT(YEAR FROM time) as year, sum(value) as value
from attendence
group by EXTRACT(YEAR FROM time)

Related

Remove Duplicates and show Total sales by year and month

I am trying to write a query that produces a list of all 11 years, with the 12 months within each year and the sales data for each month. Any suggestions? This is my query so far:
SELECT
distinct(extract(year from date)) as year
, sum(sale_dollars) as year_sales
from `project-1-349215.Dataset.sales`
group by date
It just creates a long list of over 2000 results, when I am expecting at most 132: one for each month in those years.
You should change your group by statement if you have more results than you expected.
You can try:
group by YEAR(date), MONTH(date)
or
group by EXTRACT(YEAR_MONTH FROM date)
A GROUP BY takes a part of the date, in your case year and month, collects all rows that match it, and sums them up.
So GROUP BY date makes no sense whatsoever, as you don't want the sum of every day.
So use this:
SELECT
extract(year from date) as year
,extract(MONTH from date) as month
, sum(sale_dollars) as year_sales
from `project-1-349215.Dataset.sales`
group by 1,2
Or you can combine both year and month
SELECT
extract(YEAR_MONTH from date) as year
, sum(sale_dollars) as year_sales
from `project-1-349215.Dataset.sales`
group by 1
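To see why grouping by the full date yields too many rows, here is a small sketch in SQLite via Python's sqlite3 module (chosen only because it is self-contained; strftime stands in for EXTRACT). The table `sales` and its rows are hypothetical: three sales in every month of two years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (date TEXT, sale_dollars REAL)")
# Hypothetical data: a 10.0 sale on days 1-3 of every month in 2020 and 2021.
rows = [
    (f"{y}-{m:02d}-{d:02d}", 10.0)
    for y in (2020, 2021)
    for m in range(1, 13)
    for d in (1, 2, 3)
]
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Grouping by the full date: one result row per distinct day.
per_day = conn.execute(
    "SELECT date, SUM(sale_dollars) FROM sales GROUP BY date"
).fetchall()

# Grouping by year and month: one result row per month.
per_month = conn.execute(
    "SELECT CAST(strftime('%Y', date) AS INTEGER) AS year, "
    "CAST(strftime('%m', date) AS INTEGER) AS month, "
    "SUM(sale_dollars) AS month_sales "
    "FROM sales GROUP BY year, month ORDER BY year, month"
).fetchall()

print(len(per_day), len(per_month))
```

With this sample data the first query returns one row per day and the second one row per month, which is the shape the question asks for.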

Occurrences by Year-Month in Presto DB

I have a table in Presto with this schema:
created_at Record
timestamp String
created_at has records from 2020 to 2022.
What's the best way to get the total number of records by month, like this output:
Date N_Records
2020-01 1000
2020-02 1500
----
2022-03 3000
What I did so far:
select date_format(created_at, '%b') month, count(*) count
from table
group by date_format(created_at, '%b')
order by 1 asc
Problems with my code:
I don't get the respective year, and the results are not sorted by month in ascending order.
Can someone help me improve my query?
You can use format including 4-digit year and 2-digit month:
select date_format(created_at, '%Y-%m') Date, count(*) N_Records
from table
group by 1
order by 1 asc

prestosql get average from last 7 days for each day

The question I have is very similar to the question here, but I am using Presto SQL (on aws athena) and couldn't find information on loops in presto.
To reiterate the issue, I want the query that:
Given table that contains: Day, Number of Items for this Day
I want: Day, Average Items for Last 7 Days before "Day"
So if I have a table that has data from Dec 25th to Jan 25th, my output table should have data from Jan 1st to Jan 25th. And for each day from Jan 1-25th, it will be the average number of items from last 7 days.
Is it possible to do this with presto?
Maybe you can try this one.
The calendar common table expression (CTE) generates every date in the given date range.
with calendar as (
select date_generated
from (
values (sequence(date'2021-12-25', date'2022-01-25', interval '1' day))
) as t1(date_array)
cross join unnest(date_array) as t2(date_generated)),
The temp CTE builds the date groups: for each date (date_groups) it lists the 7 dates ending on that date (dates).
temp as (select c1.date_generated as date_groups
, format_datetime(c2.date_generated, 'yyyy-MM-dd') as dates
from calendar c1, calendar c2
where c2.date_generated between c1.date_generated - interval '6' day and c1.date_generated
and c1.date_generated >= date'2021-12-25' + interval '6' day)
Output for this part:
| date_groups | dates |
|-------------|------------|
| 2022-01-01 | 2021-12-26 |
| 2022-01-01 | 2021-12-27 |
| 2022-01-01 | 2021-12-28 |
| 2022-01-01 | 2021-12-29 |
| 2022-01-01 | 2021-12-30 |
| 2022-01-01 | 2021-12-31 |
| 2022-01-01 | 2022-01-01 |
The last part joins the day column from your table with each date and then groups by the date group:
select temp.date_groups as day
, avg(your_table.num_of_items) avg_last_7_days
from your_table
join temp on your_table.day = temp.dates
group by 1
You want a running average (AVG OVER)
select
day, amount,
avg(amount) over (order by day rows between 6 preceding and current row) as avg_amount
from mytable
order by day
offset 6;
I tried many different variations of getting the "running average" (which I now know is what I was looking for thanks to Thorsten's answer), but couldn't get the output I wanted exactly with my other columns (that weren't included in my original question) in the table, but this ended up working:
SELECT day, <other columns>, avg(amount) OVER (
PARTITION BY <other columns>
ORDER BY date(day) ASC
ROWS 6 PRECEDING) as avg_7_days_amount FROM table ORDER BY date(day) ASC
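The windowed average from the answers above can be tried out locally. This is a minimal sketch in SQLite via Python's sqlite3 module rather than Presto, assuming an SQLite build with window functions (3.25 or later); the table contents are made up, and `LIMIT -1 OFFSET 6` is the SQLite spelling of the `offset 6` used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (day TEXT, amount REAL)")
# Hypothetical data: amounts 1..10 on 2022-01-01 .. 2022-01-10, no gaps.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(f"2022-01-{d:02d}", float(d)) for d in range(1, 11)],
)

rows = conn.execute(
    """
    SELECT day, amount,
           AVG(amount) OVER (ORDER BY day
                             ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS avg_amount
    FROM mytable
    ORDER BY day
    LIMIT -1 OFFSET 6  -- skip the first 6 days, whose windows are incomplete
    """
).fetchall()

for row in rows:
    print(row)
```

Note that a ROWS frame counts rows, not calendar days, so the result is a true 7-day average only when every day has exactly one row per partition.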

I want to find customers transacting in any 3 consecutive months from 2017 to 2018

I want to know the trick for finding the list of customers who transact in 3 consecutive months; it could be any 3 consecutive months, with any number of occurrences.
Example: suppose there is a customer who transacts in January and keeps transacting until March, then stops. I want the list of such customers from my database.
I am working on AWS Athena.
One method uses aggregation and window functions:
select distinct customer_id
from (select customer_id,
date_trunc('month', transactdate) as yyyymm,
lag(date_trunc('month', transactdate), 2) over (partition by customer_id order by date_trunc('month', transactdate)) as prev_yyyymm_2
from t
where transactdate >= date '2017-01-01' and
transactdate < date '2019-01-01'
group by customer_id, date_trunc('month', transactdate)
)
where prev_yyyymm_2 = yyyymm - interval '2' month;
This aggregates transactions by month and looks at the transaction date two rows earlier. The outer filter checks that that date is exactly 2 months earlier.
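The lag-by-two idea can be checked on a small data set. This is a sketch in SQLite via Python's sqlite3 module, not Athena, assuming an SQLite build with window functions (3.25 or later). Since SQLite has no date_trunc, each month is mapped to a running month number (year * 12 + month); that mapping, the column `month_no`, and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer_id INTEGER, transactdate TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        # Customer 1: Jan, Feb (twice), Mar 2017 -> three consecutive months.
        (1, "2017-01-05"), (1, "2017-02-10"), (1, "2017-02-20"), (1, "2017-03-01"),
        # Customer 2: every other month -> never three in a row.
        (2, "2017-01-05"), (2, "2017-03-10"), (2, "2017-05-20"),
        # Customer 3: Nov 2017 to Jan 2018 -> consecutive across the year boundary.
        (3, "2017-11-02"), (3, "2017-12-15"), (3, "2018-01-09"),
    ],
)

sql = """
WITH months AS (
    -- One row per customer and month, as a running month number.
    SELECT DISTINCT customer_id,
           CAST(strftime('%Y', transactdate) AS INTEGER) * 12
           + CAST(strftime('%m', transactdate) AS INTEGER) AS month_no
    FROM t
    WHERE transactdate >= '2017-01-01' AND transactdate < '2019-01-01'
),
lagged AS (
    SELECT customer_id, month_no,
           LAG(month_no, 2) OVER (PARTITION BY customer_id
                                  ORDER BY month_no) AS prev_month_no_2
    FROM months
)
SELECT DISTINCT customer_id
FROM lagged
WHERE prev_month_no_2 = month_no - 2
ORDER BY customer_id
"""
customers = [row[0] for row in conn.execute(sql)]
print(customers)
```

Because each customer has at most one row per month, the month two rows back can only be exactly two months earlier if the month in between is present as well.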

Running Sum for the last 30 days on BigQuery

I am trying to get the following query on Google Merchandise Store public dataset in BigQuery:
Date
Number of distinct users
Running sum of the number of distinct users in the last 30 days
For example (I used 3 days in the example for simplicity):
date distinct_users distinct_users_3days
15/07/2018 8 15
14/07/2018 2 12
13/07/2018 5 20
12/07/2018 5 15
11/07/2018 10 10
...
This is my current SQL code which gets the first two columns, but I can't figure out how to get the running sum:
SELECT
date,
COUNT(DISTINCT(fullVisitorId)) as daily_active_user
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_2017*`
WHERE
_table_suffix BETWEEN "0101"
AND "0715"
GROUP BY
date
Any help is appreciated! :)
I managed to figure out the answer to my question so I would like to share with the others who may encounter this problem in future.
The SQL code is:
SELECT
date,
COUNT(DISTINCT(fullVisitorId)) as daily_active_user,
SUM(count(Distinct(fullVisitorId))) OVER (ORDER BY date ROWS BETWEEN 29 PRECEDING AND CURRENT ROW) AS monthly_active_user
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_2017*`,
unnest(hits) as h
WHERE
_table_suffix BETWEEN "0101" AND "0715"
GROUP BY
date
This gives a column which sums the distinct users in a 30 day window.
Please try the following query for 3 days (SQL Server 2014):
SELECT date, COUNT(DISTINCT(fullVisitorId)) as daily_active_user,
sum(COUNT(DISTINCT(fullVisitorId))) over (PARTITION BY null ORDER BY date desc ROWS
BETWEEN CURRENT ROW AND 2 FOLLOWING) AS distinct_users_3days
FROM YOUR_TABLE_NAME
WHERE _table_suffix BETWEEN '0101' AND '0715'
GROUP BY date
For 30 days:
SELECT date, COUNT(DISTINCT(fullVisitorId)) as daily_active_user,
sum(COUNT(DISTINCT(fullVisitorId))) over (PARTITION BY null ORDER BY date desc ROWS
BETWEEN CURRENT ROW AND 29 FOLLOWING) AS distinct_users_30days
FROM YOUR_TABLE_NAME
WHERE _table_suffix BETWEEN '0101' AND '0715'
GROUP BY date
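The windowed sum can be checked against the 3-day example from the question. This is a minimal sketch in SQLite via Python's sqlite3 module, assuming an SQLite build with window functions (3.25 or later); the table `daily` is hypothetical and holds the already aggregated daily counts from the example, with dates rewritten in ISO format so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (date TEXT, distinct_users INTEGER)")
# Daily counts taken from the example table in the question.
conn.executemany(
    "INSERT INTO daily VALUES (?, ?)",
    [
        ("2018-07-11", 10),
        ("2018-07-12", 5),
        ("2018-07-13", 5),
        ("2018-07-14", 2),
        ("2018-07-15", 8),
    ],
)

rows = conn.execute(
    """
    SELECT date, distinct_users,
           SUM(distinct_users) OVER (ORDER BY date
                                     ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
               AS distinct_users_3days
    FROM daily
    ORDER BY date DESC
    """
).fetchall()

for row in rows:
    print(row)
```

This adds up the daily counts, so a user who is active on several days inside the window is counted once per day, and the frame spans 3 rows, which equals 3 days only if no date is missing.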