How to count unique occurrences of a column in BigQuery

Given a table such as:
| ID | Value   |
|----|---------|
| 1  | "some"  |
| 1  | "some"  |
| 1  | "value" |
| 2  | "some"  |
| 3  | "some"  |
| 3  | "value" |
| 3  | "value" |
How can I count the number of unique occurrences of value for each ID?
So you end up with a table such as:
| ID | Value   | number |
|----|---------|--------|
| 1  | "some"  | 2      |
|    | "value" | 1      |
| 2  | "some"  | 1      |
| 3  | "some"  | 1      |
|    | "value" | 2      |
I attempted to use OVER(PARTITION BY ID ORDER BY Value) to split the table by ID and count the separate values. However, this produces a running count: it counts the occurrences of each value but then adds them to the counts of the values before it. So I end up with a table such as:
| ID | Value   | number |
|----|---------|--------|
| 1  | "some"  | 2      |
| 1  | "some"  | 2      |
| 1  | "value" | 3      |
| 2  | "some"  | 1      |
| 3  | "some"  | 1      |
| 3  | "value" | 3      |
| 3  | "value" | 3      |
Is there a way to count the unique values like the second example I gave?

The query below is for BigQuery Standard SQL:
#standardSQL
SELECT id, value, COUNT(1) number
FROM `project.dataset.table`
GROUP BY id, value
with the result:
| Row | id | value | number |
|-----|----|-------|--------|
| 1   | 1  | some  | 2      |
| 2   | 1  | value | 1      |
| 3   | 2  | some  | 1      |
| 4   | 3  | value | 2      |
| 5   | 3  | some  | 1      |
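
To see why GROUP BY gives the desired counts while the windowed count from the question does not, here is a minimal sketch that runs both against the sample data. It uses Python's built-in sqlite3 module as a stand-in for BigQuery (an assumption made only so the example is self-contained; window functions need SQLite 3.25 or later), and the table name `t` is made up.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here; the table name `t` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "some"), (1, "some"), (1, "value"), (2, "some"),
     (3, "some"), (3, "value"), (3, "value")],
)

# GROUP BY collapses each (id, value) pair into one row with its own count.
grouped = conn.execute(
    "SELECT id, value, COUNT(*) AS number FROM t "
    "GROUP BY id, value ORDER BY id, value"
).fetchall()

# The windowed version keeps every row and, because of the ORDER BY inside
# OVER, counts all rows of the partition up to and including the current value.
running = conn.execute(
    "SELECT id, value, COUNT(*) OVER (PARTITION BY id ORDER BY value) AS number "
    "FROM t ORDER BY id, value"
).fetchall()

print(grouped)
print(running)
```

The second result reproduces the running totals described in the question, which is why the ORDER BY inside OVER is the culprit rather than the partitioning itself.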

Related

Count rows in table that are the same in a sequence

I have a table that looks like this
+----+------------+------+
| ID | Session_ID | Type |
+----+------------+------+
| 1 | 1 | 2 |
| 2 | 1 | 4 |
| 3 | 1 | 2 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
| 6 | 3 | 2 |
| 7 | 3 | 1 |
+----+------------+------+
And I would like to count all occurrences of a type that appear in a consecutive sequence.
The output should look something like this:
+------------+------+-----+
| Session_ID | Type | cnt |
+------------+------+-----+
| 1 | 2 | 1 |
| 1 | 4 | 1 |
| 1 | 2 | 1 |
| 2 | 2 | 2 |
| 3 | 2 | 1 |
| 3 | 1 | 1 |
+------------+------+-----+
A simple group by like
SELECT session_id, type, COUNT(type)
FROM table
GROUP BY session_id, type
doesn't work, since I need to group only rows that are "touching".
Is this possible with a single SQL SELECT, or will I need some sort of procedural code, such as a stored procedure or application-side logic?
UPDATE: how a sequence is defined:
If the following row (ordered by ID) has the same type, it should be counted as part of the same sequence.
The sequence is determined by the ID within each Session_ID, since I only want to group rows that share the same Session_ID.
So suppose there are 3 rows in one session:
the row with ID 1 has type 1,
the second row has type 1,
and row 3 has type 2.
Input:
+----+------------+------+
| ID | Session_ID | Type |
+----+------------+------+
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 2 |
+----+------------+------+
The sequence is row 1 to row 2. These three rows should output:
Output:
+------------+------+-------+
| Session_ID | Type | count |
+------------+------+-------+
| 1 | 1 | 2 |
| 1 | 2 | 1 |
+------------+------+-------+

You can use the difference between id and row_number() to identify the gaps, and then perform your count on the resulting groups:
;with cte as
(
    select *,
           id - row_number() over (partition by session_id, type order by id) as grp
    from table
)
select session_id, type, count(*) as cnt
from cte
group by session_id, type, grp
order by max(id)
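
As a rough check of the id minus row_number() idea, the sketch below runs the same query shape against the first sample table from the question. It uses Python's built-in sqlite3 module rather than the asker's database (an assumption for the sake of a self-contained example; SQLite 3.25 or later is needed for window functions), and the table name `events` is hypothetical. Note that the trick relies on the ids being consecutive, as they are in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `events` is a made-up table name; the columns follow the question.
conn.execute("CREATE TABLE events (id INTEGER, session_id INTEGER, type INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 4), (3, 1, 2), (4, 2, 2),
     (5, 2, 2), (6, 3, 2), (7, 3, 1)],
)

# Within one (session_id, type) partition, consecutive ids produce the same
# value of id - row_number, so `grp` labels each unbroken run of rows.
rows = conn.execute("""
    WITH cte AS (
        SELECT *,
               id - ROW_NUMBER() OVER (PARTITION BY session_id, type
                                       ORDER BY id) AS grp
        FROM events
    )
    SELECT session_id, type, COUNT(*) AS cnt
    FROM cte
    GROUP BY session_id, type, grp
    ORDER BY MAX(id)
""").fetchall()

for row in rows:
    print(row)
```

For session 1, the two rows of type 2 (ids 1 and 3) get grp values 0 and 1, so they stay separate, while ids 4 and 5 in session 2 both get grp 3 and are counted together.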

SQL ORDER BY with a repeating ascending sequence

I'm using SQL Server 2014 and I'm having trouble with a query.
I have the scenario below:
| Number | Series | Name |
|--------|--------|---------|
| 9 | 1 | Name 1 |
| 5 | 3 | Name 2 |
| 8 | 2 | Name 3 |
| 7 | 3 | Name 4 |
| 0 | 1 | Name 5 |
| 1 | 2 | Name 6 |
| 9 | 2 | Name 7 |
| 3 | 3 | Name 8 |
| 4 | 1 | Name 9 |
| 0 | 1 | Name 10 |
and I need to get it ordered by series column like this:
| Number | Series | Name |
|--------|--------|---------|
| 9 | 1 | Name 1 |
| 8 | 2 | Name 3 |
| 5 | 3 | Name 2 |
| 0 | 1 | Name 5 |
| 1 | 2 | Name 6 |
| 7 | 3 | Name 4 |
| 4 | 1 | Name 9 |
| 9 | 2 | Name 7 |
| 3 | 3 | Name 8 |
| 0 | 1 | Name 10 |
Actually, it is more of a repeating sequence in the "series" column than a sort:
1, 2, 3, then again 1, 2, 3, ...
Could somebody help me?

You can do this using the ANSI standard function row_number():
select number, series, name
from (select t.*,
             row_number() over (partition by series order by number) as seqnum
      from t
     ) t
order by seqnum, series;
This assigns "1" to the first record for each series, "2" to the second, and so on. The outer order by then puts all the "1"s together, then all the "2"s, and so on. This has the effect of interleaving the values of the series.
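
The interleaving can be tried out with the sketch below, which loads the sample rows into an in-memory SQLite database through Python's sqlite3 module (an assumption for illustration only, since the question is about SQL Server; SQLite 3.25 or later is required). Because the query orders each series by number, the rows inside a series come out in ascending number order rather than in their original order; keeping the original order would need some column that records it, which the sample table does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (number INTEGER, series INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(9, 1, "Name 1"), (5, 3, "Name 2"), (8, 2, "Name 3"), (7, 3, "Name 4"),
     (0, 1, "Name 5"), (1, 2, "Name 6"), (9, 2, "Name 7"), (3, 3, "Name 8"),
     (4, 1, "Name 9"), (0, 1, "Name 10")],
)

# seqnum restarts at 1 inside every series, so sorting by (seqnum, series)
# emits one row per series in turn: 1, 2, 3, 1, 2, 3, ...
rows = conn.execute("""
    SELECT number, series, name
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY series ORDER BY number) AS seqnum
          FROM t) AS ranked
    ORDER BY seqnum, series
""").fetchall()

series_order = [series for _, series, _ in rows]
print(series_order)
```

Series 1 has four rows while the others have three, so the last row of the output is a lone series 1 row.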

Limit a sorted number of rows joined

I have two tables, A and B, and a join table M. For each A.id, I want to get the top 2 B.ids, sorted by the value in table M, producing the results below. This is running on an Azure SQL database.
Table A
+-----+
| Id  |
+-----+
| 1   |
| 2   |
| 3   |
| 4   |
+-----+
Table M
+-----+-----+-------+
| AId | BId | Value |
+-----+-----+-------+
| 1   | 3   | 4     |
| 1   | 2   | 3     |
| 3   | 2   | 3     |
| 3   | 5   | 6     |
| 3   | 3   | 4     |
| 4   | 1   | 2     |
| 4   | 2   | 1     |
| 4   | 4   | 3     |
+-----+-----+-------+
Table B
+-----+
| Id  |
+-----+
| 1   |
| 2   |
| 3   |
| 4   |
| 5   |
+-----+
Result
+-----+-----+-------+
| AId | BId | Value |
+-----+-----+-------+
| 1 | 3 | 4 |
| 1 | 2 | 3 |
| 3 | 5 | 6 |
| 3 | 3 | 4 |
| 4 | 1 | 2 |
| 4 | 4 | 3 |
+-----+-----+-------+
I know that I can select all the M.AId rows where they equal 1, sort it, and limit by 2, but I need to do this for every row in Table A. I've made an attempt to use group by, but I wasn't sure how to sort and limit it. I've also tried to search for resources associated with this issue but I couldn't find any resources.
(I also wasn't sure how to word the title for this issue)

You can just use ROW_NUMBER:
SELECT
    AId, BId, Value
FROM (
    SELECT *,
           Rn = ROW_NUMBER() OVER(PARTITION BY AId ORDER BY Value DESC)
    FROM M
) t
WHERE Rn <= 2
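
For a quick check of the top-2-per-group pattern, here is a minimal sketch against the sample rows of table M, run through Python's sqlite3 module (assumed here only to keep the example self-contained; SQLite 3.25 or later is required). The T-SQL form `Rn = ROW_NUMBER() ...` is written as `... AS rn`, which SQLite accepts, and an outer ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE M (AId INTEGER, BId INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO M VALUES (?, ?, ?)",
    [(1, 3, 4), (1, 2, 3), (3, 2, 3), (3, 5, 6),
     (3, 3, 4), (4, 1, 2), (4, 2, 1), (4, 4, 3)],
)

# Number the rows of each AId from the highest Value down, then keep ranks 1-2.
rows = conn.execute("""
    SELECT AId, BId, Value
    FROM (SELECT M.*,
                 ROW_NUMBER() OVER (PARTITION BY AId ORDER BY Value DESC) AS rn
          FROM M) AS ranked
    WHERE rn <= 2
    ORDER BY AId, rn
""").fetchall()

for row in rows:
    print(row)
```

A.Id 2 has no rows in M, so it does not appear in the result, which matches the expected output in the question.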

SQL - only display rows that have the max value

I have this table that is already sorted but I want it to only display the maximum values... so instead of this table:
+------+-------+
| id | value |
+------+-------+
| 1 | 3 |
| 5 | 3 |
| 4 | 3 |
| 9 | 2 |
| 8 | 2 |
| 3 | 2 |
| 2 | 1 |
| 6 | 1 |
| 7 | 1 |
+------+-------+
I want this:
+------+-------+
| id | value |
+------+-------+
| 1 | 3 |
| 5 | 3 |
| 4 | 3 |
+------+-------+
I'm using SQLite. Thanks for any help.

You can do this using a subquery. Here is one way:
select t.*
from t
where t.value = (select max(value) from t);
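
The subquery approach can be run as-is against the sample data. The sketch below does that through Python's sqlite3 module, with an ORDER BY added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 3), (5, 3), (4, 3), (9, 2), (8, 2), (3, 2), (2, 1), (6, 1), (7, 1)],
)

# The scalar subquery finds the single maximum value; the outer query keeps
# every row that carries it, so ties are all returned.
rows = conn.execute("""
    SELECT t.*
    FROM t
    WHERE t.value = (SELECT MAX(value) FROM t)
    ORDER BY t.id
""").fetchall()

print(rows)
```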

Marking records with 1 on first occurrence of unique value

I have a table that I'd like to add a column to that shows a 1 on the first occurrence of a given value for the record within the dataset.
So, for example, if I was using the ID field as where to look for unique occurrences, I'd want a "FirstOccur" column (like the one below) putting a 1 on the first occurrence of a unique ID value in the dataset and just ignoring (leaving as null) any other occurrence:
| ID | FirstOccur |
|------|--------------|
| 1 | 1 |
| 1 | |
| 1 | |
| 2 | 1 |
| 2 | |
| 3 | 1 |
| 4 | 1 |
| 4 | |
I have a working 2-step approach that first applies some ranking sql that will give me something like this:
| ID | FirstOccur |
|------|--------------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 3 | 1 |
| 4 | 1 |
| 4 | 2 |
...and then I just apply some update SQL to null any value above 1 to get the desired result.
I was just wondering if there was a (simpler) one-hit approach.

Assuming you have a creation date or auto incremented id or something that specifies the ordering, you can do:
update t
set firstoccur = 1
where creationdate = (select min(creationdate)
                      from t as t2
                      where t2.id = t.id
                     );
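
The sketch below tries the correlated update on the sample ids, using Python's sqlite3 module. The `creationdate` column is the hypothetical ordering column assumed in the answer, filled here with made-up increasing integers. As a possible alternative, not taken from the answer, it also computes the same flag in a single SELECT with ROW_NUMBER() and CASE (SQLite 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `creationdate` is a hypothetical ordering column with made-up values.
conn.execute("CREATE TABLE t (id INTEGER, creationdate INTEGER, firstoccur INTEGER)")
conn.executemany(
    "INSERT INTO t (id, creationdate) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (4, 7), (4, 8)],
)

# Correlated update: flag the row whose creationdate is the earliest for its id.
conn.execute("""
    UPDATE t
    SET firstoccur = 1
    WHERE creationdate = (SELECT MIN(t2.creationdate)
                          FROM t AS t2
                          WHERE t2.id = t.id)
""")
updated = [r[0] for r in conn.execute(
    "SELECT firstoccur FROM t ORDER BY creationdate")]

# Alternative sketch: derive the flag on the fly; CASE without ELSE yields NULL.
selected = [r[0] for r in conn.execute("""
    SELECT CASE WHEN ROW_NUMBER() OVER (PARTITION BY id
                                        ORDER BY creationdate) = 1
                THEN 1 END AS flag
    FROM t
    ORDER BY creationdate
""")]

print(updated)
print(selected)
```

Both lists mark only the first row of each id and leave the rest as NULL (None in Python), matching the FirstOccur column in the question.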
