I am trying to add a new column based on a condition in a group.
Could we do something like:
BOOL() OVER (PARTITION BY id 'D' in val)
That is, something like GROUP BY id and then checking whether the value 'D' appears in the val column.
Input:
-------------
| id | val |
-------------
| 1 | A |
| 1 | B |
| 1 | D |
| 2 | B |
| 2 | C |
| 2 | A |
Output:
-------------------
| id | val | res |
-------------------
| 1 | A | 1 |
| 1 | B | 1 |
| 1 | D | 1 |
| 2 | B | 0 |
| 2 | C | 0 |
| 2 | A | 0 |
You didn't specify your DBMS, but in standard ANSI SQL, you can use a filter() clause:
count(*) filter (where val = 'D') over (partition by id) > 0
In Postgres, you can use bool_or() for this:
select t.*, bool_or(val = 'D') over(partition by id) res
from mytable t
Demo on DB Fiddle:
id | val | res
-: | :-- | :--
1 | A | t
1 | B | t
1 | D | t
2 | B | f
2 | C | f
2 | A | f
This gives you a boolean result. If you want it as an integer value instead, then:
(bool_or(val = 'D') over(partition by id))::int
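For engines that have neither bool_or() nor a filter clause on window functions, a conditional MAX() over the same partition gives the same 0/1 flag. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database (version 3.25 or later for window functions) as a stand-in for the unspecified DBMS, and the table name mytable is borrowed from the answer above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's (unspecified) DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, val TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "D"), (2, "B"), (2, "C"), (2, "A")],
)

# The CASE expression yields 1 for rows where val = 'D' and 0 otherwise;
# MAX over the partition is therefore 1 if any row of that id has 'D'.
query = """
SELECT id, val,
       MAX(CASE WHEN val = 'D' THEN 1 ELSE 0 END) OVER (PARTITION BY id) AS res
FROM mytable
ORDER BY id, val
"""
flag_rows = conn.execute(query).fetchall()
for row in flag_rows:
    print(row)
```

Every row of id 1 gets res = 1 and every row of id 2 gets res = 0, which matches the requested output apart from row order.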
Related
I have a table like this in MS SQL SERVER
+------+------+
| ID | Cust |
+------+------+
| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 2 | A |
| 2 | A |
| 2 | A |
| 2 | B |
| 3 | A |
| 3 | B |
| 3 | B |
| 3 | C |
| 3 | C |
+------+------+
I don't know the values in column "Cust" and I want to return all rows where the value of "Cust" appears multiple times and where at least one of the "ID" values is "1".
Like this:
+------+------+
| ID | Cust |
+------+------+
| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 2 | A |
| 2 | A |
| 2 | A |
| 2 | B |
| 3 | A |
| 3 | B |
| 3 | B |
+------+------+
Any ideas? I can't find it.
You can use the COUNT window function as follows:
SELECT ID, Cust
FROM
(
SELECT ID, Cust,
COUNT(*) OVER (PARTITION BY Cust) cn,
COUNT(CASE WHEN ID=1 THEN 1 END) OVER (PARTITION BY Cust) cn2
FROM table_name
) T
WHERE cn>1 AND cn2>0
ORDER BY ID, Cust
COUNT(*) OVER (PARTITION BY Cust) checks whether the value of "Cust" appears multiple times.
COUNT(CASE WHEN ID=1 THEN 1 END) OVER (PARTITION BY Cust) checks that at least one of the "ID" values for that "Cust" is "1".
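To check the two window counts against the sample data, here is a small sketch. It assumes SQLite (via Python's sqlite3 module, SQLite 3.25+) in place of SQL Server; the query itself uses only constructs that both engines accept, and table_name is the placeholder name from the answer.

```python
import sqlite3

# Assumption: SQLite replaces MS SQL Server for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (ID INTEGER, Cust TEXT)")
sample = [
    (1, "A"), (1, "A"), (1, "B"), (1, "B"),
    (2, "A"), (2, "A"), (2, "A"), (2, "B"),
    (3, "A"), (3, "B"), (3, "B"), (3, "C"), (3, "C"),
]
conn.executemany("INSERT INTO table_name VALUES (?, ?)", sample)

query = """
SELECT ID, Cust
FROM (
    SELECT ID, Cust,
           -- how many rows share this Cust value
           COUNT(*) OVER (PARTITION BY Cust) AS cn,
           -- how many of those rows have ID = 1 (CASE yields NULL otherwise)
           COUNT(CASE WHEN ID = 1 THEN 1 END) OVER (PARTITION BY Cust) AS cn2
    FROM table_name
) T
WHERE cn > 1 AND cn2 > 0
ORDER BY ID, Cust
"""
kept_rows = conn.execute(query).fetchall()
print(kept_rows)
```

Only the "C" rows are dropped: "C" appears twice, but never with ID 1, so cn2 is 0 for that partition.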
I have a dataset that looks like this:
| Country | id |
-------------------
| a | 5 |
| a | 1 |
| a | 2 |
| b | 1 |
| b | 5 |
| b | 4 |
| b | 7 |
| c | 5 |
| c | 1 |
| c | 2 |
and I need a query that returns 2 random rows for each country in ('a', 'c'):
| Country | id |
------------------
| a | 2 | -- Two random rows from Country = 'a'
| a | 1 |
| c | 1 |
| c | 5 | --Two random rows from Country = 'c'
This should work:
select Country, id from
(select Country,
id,
row_number() over(partition by Country order by rand()) as rn
from table_name
) t
where Country in ('a', 'c') and rn <= 2
Replace rand() with random() if you're using Postgres or newid() in SQL Server.
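A runnable sketch of the same idea, assuming SQLite (3.25+) through Python's sqlite3 module, where the random function is spelled random() as in Postgres. Because the pick is random, the example checks only the shape of the result (two rows per requested country), not specific ids.

```python
import sqlite3

# Assumption: SQLite is used here, so the random function is random().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (Country TEXT, id INTEGER)")
source_rows = [
    ("a", 5), ("a", 1), ("a", 2),
    ("b", 1), ("b", 5), ("b", 4), ("b", 7),
    ("c", 5), ("c", 1), ("c", 2),
]
conn.executemany("INSERT INTO table_name VALUES (?, ?)", source_rows)

query = """
SELECT Country, id
FROM (
    SELECT Country, id,
           -- number the rows of each country in a random order
           ROW_NUMBER() OVER (PARTITION BY Country ORDER BY random()) AS rn
    FROM table_name
) t
WHERE Country IN ('a', 'c') AND rn <= 2
ORDER BY Country
"""
sampled = conn.execute(query).fetchall()
print(sampled)  # two rows for 'a' and two for 'c'; the ids vary between runs
```

Moving the Country filter into the inner query would give the same result and avoids numbering rows for countries that are discarded anyway.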
Suppose we have the following input table
cat | value | position
------------------------
1 | A | 1
1 | B | 2
1 | C | 3
1 | D | 4
2 | C | 1
2 | B | 2
2 | A | 3
2 | D | 4
As you can see, the values A, B, C, D change position in each category. I want to track this change by adding a change column next to each value. The output should look like this:
cat | value | position | change
---------------------------------
1 | A | 1 | NULL
1 | B | 2 | NULL
1 | C | 3 | NULL
1 | D | 4 | NULL
2 | C | 1 | 2
2 | B | 2 | 0
2 | A | 3 | -2
2 | D | 4 | 0
For example, C was in position 3 in category 1 and moved to position 1 in category 2, and therefore has a change of 2. I tried implementing this using the LAG() function with an offset of 4, but I failed. How can I write this query?
Use lag() - with the proper partition by clause:
select
t.*,
lag(position) over(partition by value order by cat) - position change
from mytable t
You can use lag() and then add an order by to keep the original row order:
select
*,
lag(position) over (partition by value order by cat) - position as change
from yourTable
order by
cat, position
output:
| cat | value | position | change |
| --- | ----- | -------- | ------ |
| 1 | A | 1 | null |
| 1 | B | 2 | null |
| 1 | C | 3 | null |
| 1 | D | 4 | null |
| 2 | C | 1 | 2 |
| 2 | B | 2 | 0 |
| 2 | A | 3 | -2 |
| 2 | D | 4 | 0 |
I think you just want lag() with the right partition by:
select t.*,
(lag(position) over (partition by value order by cat) - position) as change
from t;
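The key point in all three answers is partitioning by value rather than by cat, so that lag() looks at the same value's previous category. A minimal sketch that reproduces the expected output, assuming SQLite (3.25+) via Python's sqlite3 module and the table name mytable from the first answer:

```python
import sqlite3

# Assumption: SQLite stands in for the asker's (unspecified) DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (cat INTEGER, value TEXT, position INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "A", 1), (1, "B", 2), (1, "C", 3), (1, "D", 4),
        (2, "C", 1), (2, "B", 2), (2, "A", 3), (2, "D", 4),
    ],
)

query = """
SELECT cat, value, position,
       -- previous position of the same value minus the current position;
       -- NULL in the first category because there is no previous row
       LAG(position) OVER (PARTITION BY value ORDER BY cat) - position AS change
FROM mytable
ORDER BY cat, position
"""
change_rows = conn.execute(query).fetchall()
for row in change_rows:
    print(row)
```

SQL NULL comes back as Python None, so the category 1 rows print with None in the last column.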
How can I write an SQL query (DB2) that will run on this table:
+---+---+---+----+
| A | B | C | V |
+---+---+---+----+
| 1 | 1 | 1 | k1 |
| 1 | 1 | 2 | k1 |
| 1 | 2 | 3 | k2 |
| 2 | 3 | 4 | k2 |
| 1 | 2 | 3 | k3 |
| 1 | 3 | 5 | k3 |
| 1 | 4 | 6 | k3 |
+---+---+---+----+
and produce this result
+---+---+---+----+
| A | B | C | V |
+---+---+---+----+
| 1 | 1 | 2 | k1 |
| 2 | 3 | 4 | k2 |
| 1 | 4 | 6 | k3 |
+---+---+---+----+
That is, it should select one row per group based on the maximum of the "tuple" (A, B, C). For two rows R1 and R2:
if R1.A <> R2.A return the row where A = Max(R1.A, R2.A)
if R1.B <> R2.B return the row where B = Max(R1.B, R2.B)
return the row where C = Max(R1.C, R2.C)
I think row_number() does what you want -- if by "group" you mean V:
select t.*
from (select t.*,
row_number() over (partition by v order by a desc, b desc, c desc) as seqnum
from t
) t
where seqnum = 1;
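Ordering by a desc, b desc, c desc is a lexicographic comparison of the (A, B, C) tuple, so the first row in each partition is the tuple maximum. A small sketch to confirm this on the sample data, assuming SQLite (3.25+) via Python's sqlite3 module in place of DB2; the derived-table alias ranked is a name chosen here for readability.

```python
import sqlite3

# Assumption: SQLite replaces DB2 for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, v TEXT)")
data = [
    (1, 1, 1, "k1"), (1, 1, 2, "k1"),
    (1, 2, 3, "k2"), (2, 3, 4, "k2"),
    (1, 2, 3, "k3"), (1, 3, 5, "k3"), (1, 4, 6, "k3"),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", data)

query = """
SELECT a, b, c, v
FROM (
    SELECT t.*,
           -- rank rows within each v, largest (a, b, c) tuple first
           ROW_NUMBER() OVER (PARTITION BY v ORDER BY a DESC, b DESC, c DESC) AS seqnum
    FROM t
) ranked
WHERE seqnum = 1
ORDER BY v
"""
winners = conn.execute(query).fetchall()
print(winners)

# Cross-check: Python compares tuples lexicographically as well.
expected = [
    max(row for row in data if row[3] == v) for v in ("k1", "k2", "k3")
]
print(expected == winners)
```

If two rows in a group share the same (A, B, C), row_number() returns only one of them; rank() would return both.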
I have a table like this:
+-----+-------+-----+
| id | value | ... |
+-----+-------+-----+
| 1 | A | ... |
| 1 | B | ... |
| 1 | C | ... |
| 2 | B | ... |
| 2 | C | ... |
| 3 | A | ... |
| 3 | C | ... |
| 4 | B | ... |
| 4 | A | ... |
| ... | ... | ... |
+-----+-------+-----+
I want to limit this to just ids that have both rows with A and rows with B in the value column. In this case, the table would look like this:
+-----+-------+-----+
| id | value | ... |
+-----+-------+-----+
| 1 | A | ... |
| 1 | B | ... |
| 1 | C | ... |
| 4 | B | ... |
| 4 | A | ... |
| ... | ... | ... |
+-----+-------+-----+
… because neither id 2 nor 3 had both A and B in the value column.
Is there a succinct way to locate these IDs?
select id, value
from t
where id in (
select id
from t
group by id
having bool_or(value = 'A') and bool_or(value = 'B')
)
or
select id, value
from t t0
where
exists (
select 1
from t
where id = t0.id and value = 'A'
) and
exists (
select 1
from t
where id = t0.id and value = 'B'
)
One way to do this is to count the number of distinct A/B values each id has:
SELECT *
FROM mytable
WHERE id IN (SELECT id
FROM mytable
WHERE value in ('A', 'B')
GROUP BY id
HAVING COUNT(DISTINCT value) = 2)
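The COUNT(DISTINCT ...) and EXISTS approaches can be compared side by side. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, so the bool_or() variant is left out because bool_or() is a Postgres aggregate that SQLite does not provide, and the literals are upper-case 'A' and 'B' to match the sample data.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's DBMS; table name t as in the first answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "A"), (1, "B"), (1, "C"),
        (2, "B"), (2, "C"),
        (3, "A"), (3, "C"),
        (4, "B"), (4, "A"),
    ],
)

# Variant 1: keep ids that have two distinct values among 'A' and 'B'.
count_query = """
SELECT id, value
FROM t
WHERE id IN (SELECT id
             FROM t
             WHERE value IN ('A', 'B')
             GROUP BY id
             HAVING COUNT(DISTINCT value) = 2)
ORDER BY id, value
"""

# Variant 2: keep ids for which an 'A' row and a 'B' row both exist.
exists_query = """
SELECT id, value
FROM t AS t0
WHERE EXISTS (SELECT 1 FROM t WHERE id = t0.id AND value = 'A')
  AND EXISTS (SELECT 1 FROM t WHERE id = t0.id AND value = 'B')
ORDER BY id, value
"""

rows_by_count = conn.execute(count_query).fetchall()
rows_by_exists = conn.execute(exists_query).fetchall()
print(rows_by_count)
print(rows_by_count == rows_by_exists)
```

Both variants keep all rows of ids 1 and 4, including the 'C' row of id 1, and drop ids 2 and 3.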