Count distinct by link on sql?

I'm trying to count distinct by the link between two columns.
Here is the example.
rownum  type  id
----------------
1       A     a
2       A     b
3       B     b
4       B     c
5       C     c
6       C     d
If I count distinct by the type column, it returns 3. However, I'd like to treat rownum 2 and 3, and rownum 4 and 5, as not distinct because they share the same value in the id column.
To rephrase,
type  array of id
-----------------
A     a, b
B     b, c
C     c, d
Since A and B share b, and B and C share c in their arrays, all three types are linked and the count should be 1.
I have no idea where to start. I would appreciate any hint.

Consider the below. You might use STRING_AGG as a window function to build the list of ids for each type:
WITH TMP_TBL AS
(
SELECT 1 AS ROWNUM, 'A' AS TYPE, 'a' AS ID UNION ALL
SELECT 2,'A','b' UNION ALL
SELECT 3,'B','b' UNION ALL
SELECT 4,'B','c' UNION ALL
SELECT 5,'C','c' UNION ALL
SELECT 6,'C','d'
)
SELECT DISTINCT TYPE,N_ID
FROM
(
SELECT TYPE,STRING_AGG(ID)OVER(PARTITION BY TYPE) AS N_ID FROM TMP_TBL
)
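
The rule in the question (types that share an id count as one) amounts to counting connected components, which the per-type id lists alone do not give. Here is a minimal sketch in Python, outside SQL, using a small union-find. The function name `count_linked_groups` and the in-memory `rows` list are hypothetical and only mirror the question's sample data.

```python
# Sample rows from the question: (rownum, type, id).
rows = [
    (1, "A", "a"), (2, "A", "b"),
    (3, "B", "b"), (4, "B", "c"),
    (5, "C", "c"), (6, "C", "d"),
]

def count_linked_groups(rows):
    """Count groups of types that are connected through shared ids."""
    parent = {}

    def find(x):
        # Walk up to the root of x's group, shortening the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        # Merge the groups containing x and y.
        parent[find(x)] = find(y)

    first_type_for_id = {}
    for _, type_, id_ in rows:
        find(type_)  # register the type even if it shares no id
        if id_ in first_type_for_id:
            union(type_, first_type_for_id[id_])
        else:
            first_type_for_id[id_] = type_

    # One distinct root per linked group of types.
    return len({find(t) for t in parent})

print(count_linked_groups(rows))  # 1 for the sample data
```

With the sample data, A links to B through b and B links to C through c, so a single group remains. Doing the same inside the database would need a recursive query or a procedural loop, which this sketch does not attempt.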

Related

How to get the record (by group) with max value? (Big Query)

Consider the following data
Column A  Column B  Column C
----------------------------
A         t         9
A         d         12
A         l         8
B         x         7
B         z         9
B         q         6
How do I extract the record with the max value in Col C for each value in Col A?
So the expected result would be...
Column A  Column B  Column C
----------------------------
A         d         12
B         z         9
Trying
select ColA, max(ColC) from table group by ColA
doesn't provide the value in ColB.
I'm sure there is a simple and elegant solution here, but it's escaping me....

Consider the below approach:
select * from your_table
qualify 1 = row_number() over(partition by colA order by colC desc)
If applied to the sample data in your question, the output is the expected result shown there: (A, d, 12) and (B, z, 9).

A window function can be used here:
with tbl as (
Select "A" as colA, "t" as colB, 9 as colC
union all select "A","d", 12
union all select "A","dd", 12
union all select "A","l", 8
union all select "B","x", 7
union all select "B","z", 9
union all select "B","q", 6
)
select
colA,
max(colC),
any_Value(colB_max)
from (select *, first_value(colB) over (partition by colA order by colC desc) as colB_max from tbl )
group by 1
I added an extra row with colA = "A" ("dd", 12), so there are now two rows tied for the max value of colC. Which colB value gets selected for that tie is more or less arbitrary.
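
The tie noted above can be made deterministic by adding a second sort key. Here is a minimal sketch that runs the idea on SQLite through Python's `sqlite3` module. It assumes SQLite 3.25 or later for window functions and uses a subquery because SQLite has no QUALIFY. Breaking ties on colB is an illustrative choice, not something the original answer prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (colA TEXT, colB TEXT, colC INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [("A", "t", 9), ("A", "d", 12), ("A", "dd", 12), ("A", "l", 8),
     ("B", "x", 7), ("B", "z", 9), ("B", "q", 6)],
)

# Rank rows within each colA group by colC descending; colB breaks ties
# so the winner no longer depends on the engine's internal ordering.
query = """
SELECT colA, colB, colC
FROM (
    SELECT tbl.*,
           ROW_NUMBER() OVER (PARTITION BY colA ORDER BY colC DESC, colB) AS rn
    FROM tbl
) AS ranked
WHERE rn = 1
ORDER BY colA
"""
top_rows = conn.execute(query).fetchall()
print(top_rows)  # [('A', 'd', 12), ('B', 'z', 9)]
```

Here "d" sorts before "dd", so group A always yields the "d" row.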

How to select the total count?

I have the following two tables (postgresql)
tableA
a b
----------
1 A
2 B
tableB
c b
----------
1 A
3 B
I want to count the rows for each value of column b across both tables, but if column a and column c hold the same value, it should be counted only once.
So the final result should be
b count
----------
A 1
B 2
How should I write the SQL?

You need union all for the 2 tables and then group by b to count distinct values of a:
select t.b, count(distinct t.a) counter
from (select * from tablea union all select * from tableb) t
group by t.b
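
As a quick check of the UNION ALL approach, here is a minimal sketch that loads the question's two tables into an in-memory SQLite database through Python's `sqlite3` module. SQLite stands in for PostgreSQL here, and the column lists are spelled out instead of `select *` only to make the pairing of a with c explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablea (a INTEGER, b TEXT);
CREATE TABLE tableb (c INTEGER, b TEXT);
INSERT INTO tablea VALUES (1, 'A'), (2, 'B');
INSERT INTO tableb VALUES (1, 'A'), (3, 'B');
""")

# Stack both tables, then count the distinct a/c values per b.
query = """
SELECT t.b, COUNT(DISTINCT t.a) AS counter
FROM (SELECT a, b FROM tablea UNION ALL SELECT c, b FROM tableb) AS t
GROUP BY t.b
ORDER BY t.b
"""
counts = conn.execute(query).fetchall()
print(counts)  # [('A', 1), ('B', 2)]
```

The value 1 appears under A in both tables and is counted once, while B has the two different values 2 and 3.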

Aggregate by column b and take the distinct count of column a:
SELECT b, COUNT(DISTINCT a) AS count
FROM yourTable
GROUP BY b
ORDER BY b;

Sum col2 of two tables based on duplicate matching col1

I have 2 tables both structured as (id, views)
Table 1:
id views
A 1
B 2
B 3
C 3
C 4
D 4
Table 2:
id views
C 1
D 3
D 4
E 5
E 7
F 8
I'm looking to sum the views of ids that appear in both table 1 and table 2 (ids C and D in this case), so the output would be:
Table 3:
id views
C 8
D 11

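You could use the following query in your case. UNION ALL keeps duplicate rows such as (D, 4), which a plain UNION would drop, and the WHERE clause keeps only the ids present in both tables:
select a.id, sum(a.views) as views
from (select * from table1 union all select * from table2) as a
where a.id in (select id from table1 intersect select id from table2)
group by a.id;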

select id, sum(views) from (select * from table1 union all select * from table2) a where a.id = 'C' or a.id = 'D' group by id;
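
To check the combine-then-sum idea against the sample data, here is a minimal sketch on an in-memory SQLite database through Python's `sqlite3` module. The INTERSECT subquery is an assumption about how to restrict the result to ids present in both tables without hardcoding 'C' and 'D'. The alias `u` is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id TEXT, views INTEGER);
CREATE TABLE table2 (id TEXT, views INTEGER);
INSERT INTO table1 VALUES ('A',1),('B',2),('B',3),('C',3),('C',4),('D',4);
INSERT INTO table2 VALUES ('C',1),('D',3),('D',4),('E',5),('E',7),('F',8);
""")

# UNION ALL keeps every row (including the duplicate ('D', 4));
# the IN filter keeps only ids that occur in both tables.
query = """
SELECT u.id, SUM(u.views) AS views
FROM (SELECT id, views FROM table1
      UNION ALL
      SELECT id, views FROM table2) AS u
WHERE u.id IN (SELECT id FROM table1 INTERSECT SELECT id FROM table2)
GROUP BY u.id
ORDER BY u.id
"""
totals = conn.execute(query).fetchall()
print(totals)  # [('C', 8), ('D', 11)]
```

Using UNION instead of UNION ALL would collapse the two ('D', 4) rows and give 7 for D instead of 11.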

Remove multiple entries for a column

I need two columns, A and B. A has repeated values and B has unique values. For each value of A, I have to fetch only the row that has the max(C) value, where C is another column.

You can use ROW_NUMBER.
ROW_NUMBER
Returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
PARTITION BY value_expression
Divides the result set produced by the FROM clause into partitions to which the ROW_NUMBER function is applied. value_expression specifies the column by which the result set is partitioned. If PARTITION BY is not specified, the function treats all rows of the query result set as a single group.
ORDER BY
The ORDER BY clause determines the sequence in which the rows are assigned their unique ROW_NUMBER within a specified partition. It is required.
Sample of ROW_NUMBER in your case:
SELECT A, B
FROM
(
SELECT ROW_NUMBER() OVER(PARTITION BY A ORDER BY C DESC) AS RowNum, A, B, C
FROM TableName
)
WHERE RowNum = 1

Use the ROW_NUMBER analytic function to do this:
select A,B
from
(
select row_number() over(partition by A order by C desc)rn,A,B,C
from yourtable
)
where RN=1

An alternative to @NoDisplayName's solution is to use keep dense_rank first/last:
with your_table as (select 1 a, 3 b, 10 c from dual union all
select 1 a, 2 b, 20 c from dual union all
select 1 a, 1 b, 30 c from dual union all
select 2 a, 4 b, 40 c from dual union all
select 2 a, 5 b, 60 c from dual union all
select 2 a, 3 b, 60 c from dual union all
select 3 a, 6 b, 70 c from dual union all
select 4 a, 2 b, 80 c from dual)
select a,
max(b) keep (dense_rank first order by c desc) b,
max(c) max_c
from your_table
group by a;
A B MAX_C
---------- ---------- ----------
1 1 30
2 5 60
3 6 70
4 2 80

Using the INTERSECT keyword, find the (ColA, ColC) pairs where ColC is the maximum for that ColA, then join back to get ColB:
select t.ColA, t.ColB
from Tabl t
join (
select ColA, ColC from Tabl
intersect
select ColA, max(ColC) from Tabl
group by ColA
) m on m.ColA = t.ColA and m.ColC = t.ColC

Use a calculated field in the where clause

Is there a way to use a calculated field in the where clause?
I want to do something like
SELECT a, b, a+b as TOTAL FROM (
select 7 as a, 8 as b FROM DUAL
UNION ALL
select 8 as a, 8 as b FROM DUAL
UNION ALL
select 0 as a, 0 as b FROM DUAL
)
WHERE TOTAL <> 0
;
but I get ORA-00904: "TOTAL": invalid identifier.
So I have to use
SELECT a, b, a+b as TOTAL FROM (
select 7 as a, 8 as b FROM DUAL
UNION ALL
select 8 as a, 8 as b FROM DUAL
UNION ALL
select 0 as a, 0 as b FROM DUAL
)
WHERE a+b <> 0
;

Logically, the select clause is one of the last parts of a query to be evaluated, so its aliases and derived columns are not yet available in the where clause. (The exception is order by, which logically happens last.)
Using a derived table is a way around this:
select *
from (SELECT a, b, a+b as TOTAL FROM (
select 7 as a, 8 as b FROM DUAL
UNION ALL
select 8 as a, 8 as b FROM DUAL
UNION ALL
select 0 as a, 0 as b FROM DUAL)
)
WHERE TOTAL <> 0
;
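
The derived-table workaround can be tried outside Oracle as well. Here is a minimal sketch on SQLite through Python's `sqlite3` module. SQLite has no DUAL table, so the inner SELECTs have no FROM clause. The aliases `src` and `with_total` are illustrative, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The inner query computes the alias; the outer query can then filter on it.
query = """
SELECT a, b, total
FROM (
    SELECT a, b, a + b AS total
    FROM (
        SELECT 7 AS a, 8 AS b
        UNION ALL
        SELECT 8, 8
        UNION ALL
        SELECT 0, 0
    ) AS src
) AS with_total
WHERE total <> 0
ORDER BY a
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(7, 8, 15), (8, 8, 16)]
```

The row with a = 0 and b = 0 is filtered out because its computed total is 0.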

This will work...
select *
from (SELECT a, b, a+b as TOTAL FROM (
select 7 as a, 8 as b FROM DUAL
UNION ALL
select 8 as a, 8 as b FROM DUAL
UNION ALL
select 0 as a, 0 as b FROM DUAL)
) Temp
WHERE TOTAL <> 0;
