How to aggregate rows on BigQuery

I need to group the rows in my dataset by year so that I can see the total number of login_log_id values each year has (BigQuery).
SELECT login_log_id,
DATE(login_time) as login_date,
EXTRACT(YEAR FROM login_time) as login_year,
TIME(login_time) as login_time,
FROM `steel-time-347714.flex.logs`
GROUP BY login_log_id
I want a GROUP BY that shows the total number of login_log_id values generated in each year.
My columns are login_log_id and login_time.
I am getting the following error:
SELECT list expression references column login_time which is neither grouped nor aggregated at [2:6]

The error occurs because every column you refer to in the SELECT list needs to be either aggregated or included in the GROUP BY.
If you want the total logins by year, you can do:
SELECT
EXTRACT(YEAR FROM login_time) as login_year,
COUNT(1) as total_logins,
COUNT(DISTINCT login_log_id) as total_unique_logins
FROM `steel-time-347714.flex.logs`
GROUP BY login_year
But if you want the total by login_log_id and year:
SELECT
login_log_id,
EXTRACT(YEAR FROM login_time) as login_year,
COUNT(1) as total_logins
FROM `steel-time-347714.flex.logs`
GROUP BY login_log_id, login_year
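To see how COUNT(1) and COUNT(DISTINCT login_log_id) differ on concrete rows, here is a minimal sketch that uses Python's built-in sqlite3 module instead of BigQuery. The rows are made up for illustration, and strftime('%Y', ...) stands in for EXTRACT(YEAR FROM ...), which SQLite does not offer.

```python
import sqlite3

# In-memory database with a hypothetical stand-in for the logs table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (login_log_id INTEGER, login_time TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [
        (1, "2021-03-01 09:00:00"),
        (2, "2021-07-15 10:30:00"),
        (3, "2022-01-05 08:45:00"),
        (3, "2022-01-05 08:45:00"),  # same id logged twice
    ],
)

# Only the grouped expression and aggregates appear in the SELECT list,
# so the "neither grouped nor aggregated" error cannot occur.
rows_per_year = conn.execute(
    """
    SELECT strftime('%Y', login_time)   AS login_year,
           COUNT(*)                     AS total_logins,
           COUNT(DISTINCT login_log_id) AS total_unique_logins
    FROM logs
    GROUP BY login_year
    ORDER BY login_year
    """
).fetchall()
print(rows_per_year)
```

With these sample rows, 2022 has two login rows but only one distinct login_log_id, which is the case where the two counts diverge.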

Related

Is there a way to count how many strings in a specific column are seen for the 1st time?

The value in column 2 is sometimes repeated, because some clients make several transactions at different times (a client can make a transaction in the first month and then another one the following year).
Is there a way for me to count how many IDs are completely new per month through a group by (never seen before)?
Please let me know if you need more context.
Thanks!
A simple way is two levels of aggregation. The inner level gets the first date for each customer. The outer summarizes by year and month:
select year(min_date), month(min_date), count(*) as num_firsts
from (select customerid, min(date) as min_date
from t
group by customerid
) c
group by year(min_date), month(min_date)
order by year(min_date), month(min_date);
Note that date/time functions depend on the database you are using, so the syntax for getting the year/month from the date may differ in your database.
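The two-level aggregation above can be tried out with a small sketch in Python's sqlite3 module. The table name t and the customerid column follow the answer; the tx_date column name and the sample rows are assumptions, and strftime replaces year()/month(), which SQLite lacks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# tx_date is a hypothetical column name used for this illustration.
conn.execute("CREATE TABLE t (customerid TEXT, tx_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("a", "2021-01-10"),
        ("b", "2021-01-20"),
        ("a", "2021-02-03"),  # repeat customer, must not count again
        ("c", "2021-02-14"),
        ("b", "2022-01-09"),  # repeat customer a year later
    ],
)

# Inner query: first date per customer. Outer query: count firsts per month.
first_timers = conn.execute(
    """
    SELECT strftime('%Y', min_date) AS yr,
           strftime('%m', min_date) AS mon,
           COUNT(*)                 AS num_firsts
    FROM (SELECT customerid, MIN(tx_date) AS min_date
          FROM t
          GROUP BY customerid) AS c
    GROUP BY yr, mon
    ORDER BY yr, mon
    """
).fetchall()
print(first_timers)
```

Because the inner query returns exactly one row per customer, the outer COUNT(*) cannot count a customer twice.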
You can assign a rank to each customer's transactions, ordered by date, so that rank 1 marks the first order for that customer_id.
The ranking is done in an inline view, and the outer query then returns the month and the count of customer IDs for that month ONLY where rank = 1.
I have tested this on Oracle and it works as expected.
SELECT DISTINCT
EXTRACT(MONTH FROM date_of_transaction) AS month,
COUNT(customer_id)
FROM
(
SELECT
date_of_transaction,
customer_id,
RANK() OVER(PARTITION BY customer_id
ORDER BY
date_of_transaction ASC
) AS rank
FROM
table_1
)
WHERE
rank = 1
GROUP BY
EXTRACT(MONTH FROM date_of_transaction)
ORDER BY
EXTRACT(MONTH FROM date_of_transaction) ASC;
First, keep only the rows where an ID appears for the first time, then count those rows while grouping by year and month:
SELECT count(*) as new_customers, extract(year from t1.date) as year,
extract(month from t1.date) as month FROM table t1
WHERE not exists (SELECT 1 FROM table t2 WHERE t1.id = t2.id AND t2.date < t1.date)
GROUP BY year, month;
Your results will contain the new customer count, the year, and the month.
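A sketch of the NOT EXISTS approach, again using Python's sqlite3 module with made-up rows. The table name tx and column tx_date are hypothetical. One thing this sketch checks: when an ID has two rows on its earliest date, both rows survive the filter, so COUNT(*) and COUNT(DISTINCT id) can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# tx and tx_date are hypothetical names chosen for this illustration.
conn.execute("CREATE TABLE tx (id TEXT, tx_date TEXT)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?)",
    [
        ("a", "2021-01-10"),
        ("a", "2021-01-10"),  # same customer, two rows on the first day
        ("b", "2021-02-01"),
        ("a", "2021-02-05"),  # later row, filtered out by NOT EXISTS
    ],
)

new_per_month = conn.execute(
    """
    SELECT strftime('%Y', t1.tx_date) AS yr,
           strftime('%m', t1.tx_date) AS mon,
           COUNT(*)                   AS first_rows,
           COUNT(DISTINCT t1.id)      AS new_customers
    FROM tx AS t1
    WHERE NOT EXISTS (SELECT 1
                      FROM tx AS t2
                      WHERE t2.id = t1.id
                        AND t2.tx_date < t1.tx_date)
    GROUP BY yr, mon
    ORDER BY yr, mon
    """
).fetchall()
print(new_per_month)
```

If duplicate first-day rows are possible in your data, COUNT(DISTINCT id) is probably the safer of the two counts.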

Looking to create a query in SQL that states

I am relatively new to SQL and I'm looking to create a query that states how many records were created by users other than a certain "good" group of users (userids), grouped by month if possible. Any suggestions? I have some basic logic set out below.
Table is called newcompanies
SELECT COUNT(record_num), userid
FROM Newcompanies
WHERE userID <> (certain group of userIds)
GROUP BY Month
Will I be required to create a second table where the group of "good" userids is held?
There are a few ways to do this. Without knowing your exact columns, this will be a rough estimate.
SELECT id,
DATEPART(MONTH, created_date) AS created_month,
COUNT(*)
FROM your_table
WHERE id NOT IN(
--hardcode userID's here
)
GROUP BY
id,
DATEPART(MONTH, created_date)
Or you could have a table with your good IDs and then exclude those.
SELECT id,
DATEPART(MONTH, created_date) AS created_month,
COUNT(*)
FROM your_table
WHERE id NOT IN(
SELECT id
from your_good_id_table
)
GROUP BY
id,
DATEPART(MONTH, created_date)
-- If Month is not a column in the table, you will have to use a function to extract the month from a date column; which function depends on the SQL database you are using. In MS SQL you can use MONTH(datefield).
SELECT COUNT(record_num), userid, Month
FROM Newcompanies
WHERE userID NOT IN (
Select UserID
from ExcludeTheseUserIDs
)
GROUP BY Month, userid
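The exclusion-table variant can be sketched as follows with Python's sqlite3 module. The newcompanies, record_num and userid names come from the question; the created_date column, the good_users table and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# created_date and good_users are hypothetical; the question does not name them.
conn.execute(
    "CREATE TABLE newcompanies (record_num INTEGER, userid TEXT, created_date TEXT)"
)
conn.execute("CREATE TABLE good_users (userid TEXT)")
conn.executemany(
    "INSERT INTO newcompanies VALUES (?, ?, ?)",
    [
        (1, "alice", "2023-01-05"),
        (2, "bob", "2023-01-09"),
        (3, "carol", "2023-01-20"),
        (4, "bob", "2023-02-02"),
        (5, "alice", "2023-02-11"),
    ],
)
conn.execute("INSERT INTO good_users VALUES ('alice')")

# Count records per month, skipping anyone listed in good_users.
by_month = conn.execute(
    """
    SELECT strftime('%Y-%m', created_date) AS created_month,
           COUNT(record_num)               AS records
    FROM newcompanies
    WHERE userid NOT IN (SELECT userid FROM good_users)
    GROUP BY created_month
    ORDER BY created_month
    """
).fetchall()
print(by_month)
```

One caveat: if the subquery can return a NULL userid, NOT IN matches no rows at all, so NOT EXISTS may be the better choice when the exclusion column is nullable.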

Unique new users per period

In psql, I've written a query that returns unique users per week with
COUNT(DISTINCT user_id)
However, I am also interested in counting the number of unique new users per week, in other words, users that have never been active before in any of the previous weeks.
How would one write this query in postgresql?
Current query:
SELECT TO_CHAR(date_trunc('week', start_time::date), 'YYYY-MM-DD')
AS weekly, COUNT(*) AS total_transactions, COUNT(DISTINCT user_id) AS unique_users
FROM transactions
GROUP BY weekly ORDER BY weekly
Use MIN to get the first appearance of each user_id, then group by the week of that first appearance so that each user is counted only in the week they were first active. You may also want to include grouping on year.
select
TO_CHAR(date_trunc('week', first_appearance), 'YYYY-MM-DD') AS weekly,
COUNT(*) AS total_transactions,
COUNT(DISTINCT user_id) AS unique_users
from (SELECT t.*,
MIN(start_time::date) OVER(PARTITION BY user_id) AS first_appearance
FROM transactions t
) t
GROUP BY weekly
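The windowed MIN idea can be tried with a small sketch in Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (needed for window functions), uses made-up rows, and replaces date_trunc('week', ...) with SQLite date modifiers that roll a date back to the Monday of its week.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id TEXT, start_time TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("u1", "2023-01-02 10:00:00"),  # Monday, week starting 2023-01-02
        ("u2", "2023-01-04 12:00:00"),
        ("u1", "2023-01-10 09:00:00"),  # u1 returns, not new any more
        ("u3", "2023-01-11 15:00:00"),  # u3 is new in week 2023-01-09
    ],
)

# 'weekday 0' moves forward to Sunday (or stays on it), and '-6 days'
# then steps back to the Monday of the same Monday-to-Sunday week.
new_users_per_week = conn.execute(
    """
    SELECT date(first_appearance, 'weekday 0', '-6 days') AS weekly,
           COUNT(DISTINCT user_id)                        AS new_users
    FROM (SELECT user_id,
                 MIN(start_time) OVER (PARTITION BY user_id) AS first_appearance
          FROM transactions) AS t
    GROUP BY weekly
    ORDER BY weekly
    """
).fetchall()
print(new_users_per_week)
```

Here the count is labelled new_users, since grouping by the week of first appearance yields users who are new in that week rather than all users active in it.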

Unique values per time period

In my table trips, I have two columns: created_at and user_id.
My goal is to count unique user_ids per month with a query in postgres. So far I have written this, but it returns an error:
SELECT user_id,
to_char(created_at, 'YYYY-MM') as t COUNT(*)
FROM (SELECT DISTINCT user_id
FROM trips) group by t;
How should I change this query?
The query is much simpler than that:
SELECT to_char(created_at, 'YYYY-MM') as yyyymm, COUNT(DISTINCT user_id)
FROM trips
GROUP BY yyyymm
ORDER BY yyyymm;

Create new table with number of incidents per month

What's up mates, I have just started to learn SQL and I am confused here. I have to create a table with the number of incidents per month.
I already know how to create a table, but what about the rest?
SELECT
EXTRACT(month FROM dateofcall) AS x,
incidentnumber,
dateofcall
FROM
incidents
GROUP BY
incidentnumber,
x
ORDER BY
x ASC;
But it's not giving me the number of incidents per month. =(
It looks like you are grouping by too many items in your GROUP BY clause, and you are not COUNTing your incidents, just showing their details.
Try this:
SELECT EXTRACT(month FROM dateofcall) AS x,
COUNT(*) AS incidents
FROM
incidents
GROUP BY
EXTRACT(month FROM dateofcall)
ORDER BY
EXTRACT(month FROM dateofcall)
SELECT
EXTRACT(month FROM dateofcall) AS theMonth,
COUNT(*) AS theNumberOfIncidents
FROM
incidents
GROUP BY
EXTRACT(month FROM dateofcall)
ORDER BY
theMonth
Your original query wasn't counting anything. You were also grouping by incidentNumber, which I assume is your primary key. Grouping by a unique key puts every row in its own group, so nothing gets aggregated.
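In some databases (SQL Server, for example) you cannot use a column alias in the GROUP BY clause, which is why the EXTRACT(month FROM dateofcall) expression is repeated there. Others, such as PostgreSQL, MySQL and BigQuery, do accept the alias, but repeating the expression is the portable option.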
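Since the question asks for a new table holding the counts, one possible way to finish the job is CREATE TABLE ... AS SELECT, sketched here with Python's sqlite3 module. The incidents_per_month table name and the sample rows are assumptions, and strftime('%m', ...) replaces EXTRACT(month FROM ...), which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (incidentnumber INTEGER PRIMARY KEY, dateofcall TEXT)"
)
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?)",
    [
        (1, "2020-01-03"), (2, "2020-01-17"), (3, "2020-02-08"),
        (4, "2020-03-01"), (5, "2020-03-02"), (6, "2020-03-30"),
    ],
)

# incidents_per_month is a hypothetical name for the summary table.
# The month expression is repeated in GROUP BY instead of using the alias,
# which keeps the statement valid in databases that reject aliases there.
conn.execute(
    """
    CREATE TABLE incidents_per_month AS
    SELECT CAST(strftime('%m', dateofcall) AS INTEGER) AS the_month,
           COUNT(*)                                    AS incidents
    FROM incidents
    GROUP BY CAST(strftime('%m', dateofcall) AS INTEGER)
    """
)

incidents_per_month = conn.execute(
    "SELECT the_month, incidents FROM incidents_per_month ORDER BY the_month"
).fetchall()
print(incidents_per_month)
```

Note that grouping on the month alone merges the same month from different years; adding the year to the SELECT and GROUP BY would keep them apart if the data spans more than one year.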