How to count elements based on a unique value in BigQuery

I have this table (Table 1) in BigQuery and I need to count the elements in the segment and category columns that correspond to a single user id. The desired outcome is shown in Table 2. I haven't been able to figure out how to do it. Maybe by transforming those elements into arrays?
TABLE 1
TABLE 2

Use the query below:
select `desc`, count(distinct user_id) distinct_user_id
from (
select category as `desc`, user_id from your_table
union all
select segment, user_id from your_table,
unnest(split(segment, ';')) segment
)
where `desc` != ''
group by `desc`
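
Since Table 1 is not reproduced here, the following is only a sketch of what the query above computes, written in plain Python rather than BigQuery SQL. The sample rows and the names rows, users_by_desc and distinct_user_counts are hypothetical, and it assumes, as the query does, that segments are separated by ';'.

```python
from collections import defaultdict

# Hypothetical stand-in for TABLE 1: (user_id, category, segment)
rows = [
    (1, "sports", "a;b"),
    (2, "sports", "b"),
    (3, "news", ""),
    (1, "news", "a;c"),
]

users_by_desc = defaultdict(set)
for user_id, category, segment in rows:
    # First branch of the UNION ALL: the category value itself
    users_by_desc[category].add(user_id)
    # Second branch: one entry per ';'-separated segment, like UNNEST(SPLIT(...))
    for seg in segment.split(";"):
        users_by_desc[seg].add(user_id)

# WHERE desc != '' plus COUNT(DISTINCT user_id) ... GROUP BY desc
distinct_user_counts = {
    desc: len(users) for desc, users in users_by_desc.items() if desc != ""
}
print(distinct_user_counts)
```

Using a set per value mirrors COUNT(DISTINCT user_id): a user who appears several times under the same category or segment is counted once.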

Related

Using Derby SQL to calculate value for histogram

I have a table with various SKUs in totes.
The table is totecontents, with the columns below:
ToteID
SKU
Each Tote can contain a maximum of 6 SKUs. (programmatically constrained)
select toteid, count(*) as qtypertote
from totecontents
group by toteid;
gives me a list of totes with the number of skus in each.
I now want to get to a table with the following result columns:
SkuCount, Occurrences, where each row would have the ordinal value (1 through 6) and then the number of occurrences of that value.
My efforts included the following approach:
select count(*)
from
( select toteid, count(*) as qtypertote
from totecontents
group by toteid)
group by qtypertote;

Stung by the comments I performed more research. This works:
SELECT CountOfskus, COUNT(1) groupedCount
FROM
( SELECT COUNT(*) as countofskus, toteid
FROM totecontents
Group By toteid
) MyTable
GROUP BY countofskus;
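
As a quick check of the working query, here is a minimal sketch that runs it against an in-memory SQLite database from Python (SQLite stands in for Derby, and the sample totes are hypothetical). One visible difference from the first attempt is that the derived table carries an alias and the inner count is selected in the outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totecontents (toteid INTEGER, sku TEXT)")
# Hypothetical data: totes 1 and 2 hold two SKUs each, tote 3 holds one
conn.executemany(
    "INSERT INTO totecontents VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "C"), (3, "D")],
)

# Inner query: SKUs per tote. Outer query: how many totes share each count.
histogram = conn.execute(
    """
    SELECT countofskus, COUNT(1) AS groupedcount
    FROM (
        SELECT COUNT(*) AS countofskus, toteid
        FROM totecontents
        GROUP BY toteid
    ) mytable
    GROUP BY countofskus
    ORDER BY countofskus
    """
).fetchall()
print(histogram)
```

Counts that never occur (for example 3 through 6 here) simply produce no row; they would need a separate list of ordinals to join against if zero rows are wanted.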

how can I select rows that column does NOT have more than 1 value?

I am very new to SQL and I am wondering how to solve this issue. For example, my table looks as follows:
As you can see in the table, item_id 1 appears in both city_id 1 and 2, and so does item_id 4, but I want to get all the items that appear in only one city_id.
In this example, these would be item_id 2 (appearing only in city_id 2) and item_id 3 (appearing in city_id 1).

Use aggregation on item_id and count the distinct values of city_id. The having clause can be used to filter on aggregates.
select item_id from mytable group by item_id having count(distinct city_id) = 1

You can use the following query:
SELECT item_id
FROM table_name
GROUP BY item_id
HAVING COUNT(DISTINCT city_id) = 1
In case you want to see the city_id too, you can use this query:
SELECT item_id, MIN(city_id) AS city_id
FROM example
GROUP BY item_id
HAVING COUNT(DISTINCT city_id) = 1
Since there is only one city_id you can use MIN or MAX to get the id.

You want all the item ids that have only one distinct city:
SELECT item_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
It works by counting the different values that city_id takes for the same item_id. For item ids that appear many times but always with the same city_id, the count of unique city_id values is 1, and we can select these using a HAVING clause. HAVING is like a WHERE clause that runs after the GROUP BY operation has completed. It is the conceptual equivalent of this:
SELECT item_id
FROM
(
SELECT item_id, count(distinct city_id) as cdci
FROM table
GROUP BY item_id
) x
WHERE cdci = 1
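
To make the equivalence concrete, here is a small sketch that runs both forms against an in-memory SQLite database from Python. The table name items is hypothetical (the placeholder name table is a reserved word), and the rows follow the example in the question, with one repeated row added for item_id 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_id INTEGER, city_id INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    # item 3 appears twice, but always in city 1
    [(1, 1), (1, 2), (2, 2), (3, 1), (3, 1), (4, 1), (4, 2)],
)

# Filter on the aggregate directly with HAVING
having_rows = conn.execute(
    """
    SELECT item_id
    FROM items
    GROUP BY item_id
    HAVING COUNT(DISTINCT city_id) = 1
    ORDER BY item_id
    """
).fetchall()

# Same filter expressed as a WHERE over a grouped subquery
subquery_rows = conn.execute(
    """
    SELECT item_id
    FROM (
        SELECT item_id, COUNT(DISTINCT city_id) AS cdci
        FROM items
        GROUP BY item_id
    ) x
    WHERE cdci = 1
    ORDER BY item_id
    """
).fetchall()

print(having_rows, subquery_rows)
```

Both queries return items 2 and 3 for this data; the repeated row for item 3 does not matter because only distinct city_id values are counted.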
If you want the city id too you can either get the MAX city (because in this case there is only one city so it's safe to do):
SELECT item_id, MAX(city_id) as city_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
or you could join this query back to the item table as a subquery:
SELECT t.*
FROM
(
SELECT item_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
) x
INNER JOIN
table t
ON x.item_id = t.item_id
This technique is the more general process for performing a group by that finds some particular set of rows and then bringing in the rest of the data from those rows. You can't always stick every other column you want in a MAX, because it will mix row data up, and you can't put the extra columns in your group by, because that will subdivide what you're grouping on and give the wrong results. Doing the group as a subquery and joining it back is a typical way to get all the row data when you have to group it to find which rows are interesting.
In your case this form of query will bring back all the duplicated rows (whereas the group by/max won't). If you don't want the duplicate rows you can make the top line SELECT DISTINCT t.*, but don't make a habit of slapping DISTINCT in to get rid of duplicated rows. If your tables don't have duplicates to start with but you suddenly get duplicated rows after writing a JOIN, look up what a Cartesian product is in database queries and how to prevent it.
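
A minimal sketch of the join-back form, run against in-memory SQLite from Python. The table name items and the extra price column are hypothetical, added only so that there is other row data to bring back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_id INTEGER, city_id INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 11), (2, 2, 20), (3, 1, 30), (3, 1, 31), (4, 1, 40), (4, 2, 41)],
)

# Group in a subquery to find the interesting item ids,
# then join back to pick up every column of the matching rows.
joined_rows = conn.execute(
    """
    SELECT t.item_id, t.city_id, t.price
    FROM (
        SELECT item_id
        FROM items
        GROUP BY item_id
        HAVING COUNT(DISTINCT city_id) = 1
    ) x
    INNER JOIN items t ON x.item_id = t.item_id
    ORDER BY t.item_id, t.price
    """
).fetchall()
print(joined_rows)
```

Item 3 comes back twice, once per source row, which is the behaviour described above: the join returns every row of each qualifying item rather than one aggregated row.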

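You just need a GROUP BY on item_id with a HAVING clause:
Select item_id from table
group by item_id
having count(distinct city_id) = 1
Also, if you want to keep row-level output rather than one aggregated row per item, you can rank the rows within each item. The rank has to be computed in a subquery, because a window alias cannot be referenced in the WHERE clause of the same query level. Note that rn = 1 keeps the first-ranked row of every item, so on its own it does not restrict the result to single-city items:
Select item_id, city_id
From (
Select item_id, city_id, rank() over(partition by item_id order by city_id) rn
From table
) r
where rn = 1;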

How to delete the duplicate data in table (Postgres)

I want to delete the duplicated data in a table. I know there is a way to use
SELECT
fruit,
COUNT( fruit )
FROM
basket
GROUP BY
fruit
HAVING
COUNT( fruit )> 1
ORDER BY
fruit;
to find them, but I need to check that every column's value is equal, which means tableA.* = tableA.* (except id; id is the auto-increment primary key).
I tried this:
SELECT
*,
COUNT( * )
FROM
myTable
GROUP BY
*
HAVING
COUNT( * )> 1
ORDER BY
id;
but it says I can't use GROUP BY *, so how can I find and delete the duplicated data (every column's value must be equal except id)?

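Use
SELECT DISTINCT *
DISTINCT removes duplicate rows from the result. Note that it only affects the query output, and because id is unique, the id column has to be left out of the select list for duplicates to collapse.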

You need to try something similar to the query below. Apply PARTITION BY to the columns other than Id (as it is an incrementing unique value), that is, to the columns on which you want to check for duplicates.
Also refer to Row_Number in Postgres and Common Table Expressions in Postgres.
WITH DuplicateTableRows AS
(
SELECT Id, Row_Number() OVER (PARTITION BY col1, col2... ORDER BY Id) AS rn
FROM
Table1
)
DELETE FROM Table1
WHERE Id IN (SELECT Id FROM DuplicateTableRows WHERE rn > 1)
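
A runnable sketch of the ROW_NUMBER approach, using in-memory SQLite from Python instead of Postgres (window functions need SQLite 3.25 or later). The basket columns fruit and color and the sample rows are hypothetical; the row with the lowest id in each group of identical rows is the one that survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE basket (id INTEGER PRIMARY KEY AUTOINCREMENT, fruit TEXT, color TEXT)"
)
conn.executemany(
    "INSERT INTO basket (fruit, color) VALUES (?, ?)",
    [
        ("apple", "red"), ("apple", "red"), ("apple", "green"),
        ("pear", "green"), ("pear", "green"), ("pear", "green"),
    ],
)

# Number the rows inside each group of identical non-id columns,
# then delete everything except the first row of each group.
conn.execute(
    """
    WITH ranked AS (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY fruit, color ORDER BY id) AS rn
        FROM basket
    )
    DELETE FROM basket
    WHERE id IN (SELECT id FROM ranked WHERE rn > 1)
    """
)

remaining = conn.execute("SELECT id, fruit, color FROM basket ORDER BY id").fetchall()
print(remaining)
```

The PARTITION BY list has to name every column except id, since that is what defines two rows as duplicates here.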

You can do this using JSON:
select (to_jsonb(b) - 'id')
from basket b
group by 1
having count(*) > 1;
The result is returned as JSON. Unfortunately, to extract the values back into a record, you need to list the columns individually.

DB2 - how to find count multiple occurrences of column value

I'm new to DB2 and have tried a few things based on similar posts. I have a table where I need to find the count of IDs where status = P and primary = 1 occurs more than once.
So my result should be 2 here (9876, 3456).
I tried:
SELECT id, COUNT(isprimary) Counts
FROM table
GROUP BY id
HAVING COUNT(isprimary)=1;

Try the query below:
select ID as IDs, Count(isPrimary) as isPrimary
From Table
where Status = 'P'
Group by ID
Having Count(isPrimary) > 1

You are close. I think all you need to do is add a where clause and require more than one matching row, like:
SELECT id, COUNT(*) as Counted
FROM table
WHERE isprimary = 1
AND status = 'P'
GROUP BY id
HAVING COUNT(*) > 1
EDIT: if you need to count only the distinct IDs, then try:
SELECT COUNT(t.ID) FROM
(
SELECT id, COUNT(*) as Counted
FROM table
WHERE isprimary = 1
AND status = 'P'
GROUP BY id
HAVING COUNT(*) > 1
) as t
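
The original table is not shown, so the following is only a sketch with hypothetical rows, run against in-memory SQLite from Python rather than DB2. It assumes "more than once" means HAVING COUNT(*) > 1 on rows with isprimary = 1 and status = 'P'; the table name accounts is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, status TEXT, isprimary INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (9876, "P", 1), (9876, "P", 1),                  # qualifies: two primary rows
        (3456, "P", 1), (3456, "P", 1), (3456, "P", 0),  # qualifies: two primary rows
        (1234, "P", 1),                                  # only one primary row
        (5555, "A", 1), (5555, "A", 1),                  # wrong status
    ],
)

per_id_sql = """
    SELECT id, COUNT(*) AS counted
    FROM accounts
    WHERE isprimary = 1 AND status = 'P'
    GROUP BY id
    HAVING COUNT(*) > 1
"""

# The ids that qualify, then the number of such ids
qualifying_ids = [row[0] for row in conn.execute(per_id_sql + " ORDER BY id")]
id_count = conn.execute(f"SELECT COUNT(t.id) FROM ({per_id_sql}) AS t").fetchone()[0]
print(qualifying_ids, id_count)
```

With this data the two qualifying ids are 3456 and 9876, so the outer count is 2, matching the result the question asks for.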

SQL Server : Getting distinct count on every column in a large view

I have a large SQL Server 2012 database with a couple of views I need to analyse.
What I want to know for each view is the number of unique values of each column in the view. I could not find any script yet that would give me this.
So the input should be the view name and the output would be two columns like:
Column Uniques
accountid 200
accountname 178
numberofemp 23
telephone 154
notusedyet 0

You need to use COUNT() (an aggregate function) with Distinct to count only the unique values.
SELECT [column], COUNT(DISTINCT value) [Uniques]
FROM tableName
GROUP BY [column]

Get a distinct count for each column via count(distinct [ColA]) for each column you want to count (no group by). You can then unpivot to get the tabular format you desire. Here's an example:
;with DistinctColumnCount( Id, Description )
as
(
select
count(distinct Id) Id
, count(distinct Description) Description
from
EntityB
)
SELECT CountColumn, [Count].[DistinctCount]
FROM
DistinctColumnCount
unpivot
( DistinctCount for CountColumn in ( Id, [Description] ) ) as [Count]
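
The same idea (one COUNT(DISTINCT ...) per column in a single pass, then reshaping to one row per column) can be sketched without typing the column list by hand. This version uses in-memory SQLite from Python rather than SQL Server, reads the column names with PRAGMA table_info, and does the "unpivot" step in Python. The helper distinct_counts, the view account_view and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (accountid INTEGER, accountname TEXT, notusedyet TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(1, "Acme", None), (2, "Acme", None), (3, "Globex", None)],
)
conn.execute("CREATE VIEW account_view AS SELECT * FROM account")


def distinct_counts(conn, view_name):
    """Return [(column, distinct_count), ...] for every column of a view or table.

    view_name is interpolated into the SQL text, so it must be a trusted identifier.
    """
    # Column names are the second field of each PRAGMA table_info row
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({view_name})")]
    select_list = ", ".join(f'COUNT(DISTINCT "{col}")' for col in columns)
    counts = conn.execute(f"SELECT {select_list} FROM {view_name}").fetchone()
    # Pair each column with its count: one output row per column
    return list(zip(columns, counts))


uniques = distinct_counts(conn, "account_view")
print(uniques)
```

COUNT(DISTINCT ...) ignores NULLs, which is why a column that is never filled in, like notusedyet here, reports 0.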
