Get the number of duplicate values in separate columns


I have a table:
+----+-----------+-----------+--------+
| id | peer | whom | action |
+----+-----------+-----------+--------+
| 1 | 200000001 | 213321213 | 0 |
| 2 | 200000001 | 124321213 | 1 |
| 3 | 200000001 | 124321213 | 1 |
| 4 | 200000001 | 124789123 | 1 |
+----+-----------+-----------+--------+
I need to get how many pluses and minuses each user received in total (action 0 is a minus, action 1 is a plus). The result should look like the following table:
+-----------+------+-------+
| whom | plus | minus |
+-----------+------+-------+
| 213321213 | 2 | 1 |
| 124789123 | 1 | 0 |
+-----------+------+-------+

With conditional aggregation (this relies on MySQL evaluating a comparison to 1 or 0, so sum() counts the matching rows):
select
    whom,
    sum(action = 1) plus,
    sum(action = 0) minus
from tablename
where peer = ?
group by whom

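You seem to want a simple aggregation. Since action is always 0 or 1, sum(action) counts the pluses and sum(1 - action) counts the minuses:
select whom, sum(action) as plus, sum(1 - action) as minus
from t
group by whom;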
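
To try the conditional aggregation on the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for whatever database the question targets, the table name t is taken from the second answer, and the CASE form is used as an assumption that it ports more widely than sum(action = 1).

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption: the real DBMS is unknown).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, peer INTEGER, whom INTEGER, action INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 200000001, 213321213, 0),
        (2, 200000001, 124321213, 1),
        (3, 200000001, 124321213, 1),
        (4, 200000001, 124789123, 1),
    ],
)

# Conditional aggregation: each CASE yields 1 for a matching row and 0 otherwise,
# so SUM counts the pluses (action = 1) and the minuses (action = 0) per whom.
query = """
SELECT whom,
       SUM(CASE WHEN action = 1 THEN 1 ELSE 0 END) AS plus,
       SUM(CASE WHEN action = 0 THEN 1 ELSE 0 END) AS minus
FROM t
WHERE peer = ?
GROUP BY whom
ORDER BY whom
"""
rows = conn.execute(query, (200000001,)).fetchall()
for row in rows:
    print(row)
```

The ORDER BY is added only to make the printed order deterministic; it is not part of either answer.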

Related

SQL return only rows where value exists multiple times and other value is present

I have a table like this in MS SQL Server:
+------+------+
| ID | Cust |
+------+------+
| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 2 | A |
| 2 | A |
| 2 | A |
| 2 | B |
| 3 | A |
| 3 | B |
| 3 | B |
| 3 | C |
| 3 | C |
+------+------+
I don't know the values in column "Cust" and I want to return all rows where the value of "Cust" appears multiple times and where at least one of the "ID" values is "1".
Like this:
+------+------+
| ID | Cust |
+------+------+
| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 2 | A |
| 2 | A |
| 2 | A |
| 2 | B |
| 3 | A |
| 3 | B |
| 3 | B |
+------+------+
Any ideas? I can't find it.

You can use the COUNT window function as follows:
SELECT ID, Cust
FROM
(
    SELECT ID, Cust,
        COUNT(*) OVER (PARTITION BY Cust) cn,
        COUNT(CASE WHEN ID=1 THEN 1 END) OVER (PARTITION BY Cust) cn2
    FROM table_name
) T
WHERE cn>1 AND cn2>0
ORDER BY ID, Cust
COUNT(*) OVER (PARTITION BY Cust) checks whether the value of "Cust" appears multiple times.
COUNT(CASE WHEN ID=1 THEN 1 END) OVER (PARTITION BY Cust) checks that at least one of the "ID" values for that "Cust" is 1.
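
The two window counts can be checked against the question's data with a small sketch. It uses Python's sqlite3 module as a stand-in for SQL Server (an assumption; window functions need SQLite 3.25 or newer), and the table name table_name comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (ID INTEGER, Cust TEXT)")
data = [
    (1, "A"), (1, "A"), (1, "B"), (1, "B"),
    (2, "A"), (2, "A"), (2, "A"), (2, "B"),
    (3, "A"), (3, "B"), (3, "B"), (3, "C"), (3, "C"),
]
conn.executemany("INSERT INTO table_name VALUES (?, ?)", data)

# cn  = number of rows sharing the same Cust
# cn2 = number of those rows whose ID is 1 (CASE yields NULL otherwise, and COUNT skips NULLs)
query = """
SELECT ID, Cust
FROM (
    SELECT ID, Cust,
           COUNT(*) OVER (PARTITION BY Cust) AS cn,
           COUNT(CASE WHEN ID = 1 THEN 1 END) OVER (PARTITION BY Cust) AS cn2
    FROM table_name
) AS T
WHERE cn > 1 AND cn2 > 0
ORDER BY ID, Cust
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Cust "C" appears more than once but never with ID 1, so its rows are filtered out while every "A" and "B" row is kept.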

Get last value in sequence

I am trying to return the last value in a sequence for a particular event. My initial thought was to use LAST_VALUE(), but I can't get it to work. I could do this with subqueries and joins, but is there a window function that would give this result more easily?
Right now the query pulls the max amount, but what I want is the last amount based on the Seq column.
Data
| PaymentID | Description | Result | Seq |
|-----------|----------------------|--------|-----|
| 1 | Entered Payment Page | Yes | 1 |
| 1 | Amount Entered | 50 | 2 |
| 1 | Amount Entered | 60 | 3 |
| 1 | Amount Entered | 20 | 4 |
| 1 | Amount Confirmed | Yes | 5 |
| 2 | Entered Payment Page | Yes | 1 |
| 2 | Amount Entered | 100 | 2 |
| 2 | Amount Confirmed | Yes | 3 |
| 3 | Entered Payment Page | Yes | 1 |
| 3 | Amount Entered | 4 | 2 |
| 3 | Amount Confirmed | No | 3 |
| 3 | Amount Entered | 8 | 4 |
| 3 | Amount Confirmed | Yes | 5 |
Current Query Result
| PaymentID | InPayment | Amount | Confirmed |
|-----------|-----------|--------|-----------|
| 1 | Yes | 60 | Yes |
| 2 | Yes | 100 | Yes |
| 3 | Yes | 8 | Yes |
Desired result
| PaymentID | InPayment | Amount | Confirmed |
|-----------|-----------|--------|-----------|
| 1 | Yes | 20 | Yes |
| 2 | Yes | 100 | Yes |
| 3 | Yes | 8 | Yes |

You can use row_number() and conditional aggregation:
select paymentid,
    max(case when description = 'Entered Payment Page' then result end) as inpayment,
    max(case when description = 'Amount Entered' then result end) as amount_entered,
    max(case when description = 'Amount Confirmed' then result end) as amount_confirmed
from (select t.*,
        row_number() over (partition by paymentid, description order by seq desc) as seqnum
    from paymentinfo t
) t
where seqnum = 1
group by paymentid;
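
As a rough check of the row_number() approach, the sketch below loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumed stand-in for the original database and needs version 3.25 or newer for window functions; Result is stored as text because the column mixes 'Yes' with amounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE paymentinfo (PaymentID INTEGER, Description TEXT, Result TEXT, Seq INTEGER)"
)
conn.executemany(
    "INSERT INTO paymentinfo VALUES (?, ?, ?, ?)",
    [
        (1, "Entered Payment Page", "Yes", 1),
        (1, "Amount Entered", "50", 2),
        (1, "Amount Entered", "60", 3),
        (1, "Amount Entered", "20", 4),
        (1, "Amount Confirmed", "Yes", 5),
        (2, "Entered Payment Page", "Yes", 1),
        (2, "Amount Entered", "100", 2),
        (2, "Amount Confirmed", "Yes", 3),
        (3, "Entered Payment Page", "Yes", 1),
        (3, "Amount Entered", "4", 2),
        (3, "Amount Confirmed", "No", 3),
        (3, "Amount Entered", "8", 4),
        (3, "Amount Confirmed", "Yes", 5),
    ],
)

# seqnum = 1 marks the row with the highest Seq per (PaymentID, Description),
# so after filtering each MAX(CASE ...) sees exactly one candidate value.
query = """
SELECT paymentid,
       MAX(CASE WHEN description = 'Entered Payment Page' THEN result END) AS inpayment,
       MAX(CASE WHEN description = 'Amount Entered' THEN result END) AS amount_entered,
       MAX(CASE WHEN description = 'Amount Confirmed' THEN result END) AS amount_confirmed
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY paymentid, description ORDER BY seq DESC) AS seqnum
    FROM paymentinfo t
) t
WHERE seqnum = 1
GROUP BY paymentid
ORDER BY paymentid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For PaymentID 1 this picks 20 (Seq 4) rather than the maximum 60, which matches the desired result in the question.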

SQL create a new field sessions given the value of another field

I have problems approaching the following task.
Given a table like
| user_id | hit_id | new_session |
|---------------|--------------|--------------|
| 1 | 1 | 0 |
| 1 | 2 | 0 |
| 1 | 3 | 1 |
| 1 | 4 | 0 |
| ... | ... | ... |
| 5 | 19 | 0 |
where:
- the combination of user_id and hit_id is unique
- new_session is a boolean that indicates whether the hit started a new session for this particular user
I want to create a new column, session_number, that splits the hit_ids into sessions, such that:
- the first row for each user_id, ordered by hit_id asc, gets session_number 1
- as long as new_session is 0, session_number stays the same
- when new_session is 1, session_number increases by 1
- the logic works over a partition by user_id ordered by hit_id asc, so the session count resets whenever user_id changes
I have created a db-fiddle with some example data
The expected output for user_id = 1 (which cover multiple corner cases) would be:
| user_id | hit_id | new_session | session_number |
|---------------|--------------|--------------|----------------|
| 1 | 1 | 0 | 1 |
| 1 | 2 | 0 | 1 |
| 1 | 3 | 1 | 2 |
| 1 | 4 | 0 | 2 |
| 1 | 5 | 0 | 2 |
| 1 | 6 | 1 | 3 |
| 1 | 7 | 0 | 3 |
| 1 | 8 | 1 | 4 |
| 1 | 9 | 1 | 5 |
I have tried combinations of lag(), rank(), and dense_rank(), but I always find a corner case that breaks the attempt. I am also sure there is a very easy approach that I am overlooking.

You can use a cumulative sum:
select pv.*,
(1 + sum(new_session) over (partition by user_id order by hit_id)) as session_number
from pageviews pv;
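
The running sum can be tried out with the sketch below, which uses Python's sqlite3 module as an assumed stand-in for the original database (SQLite 3.25 or newer for window functions). The rows follow the expected output for user_id 1, except that the last hit uses hit_id 9, an assumption made so that user_id and hit_id stay unique as the question requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pageviews (user_id INTEGER, hit_id INTEGER, new_session INTEGER)")
conn.executemany(
    "INSERT INTO pageviews VALUES (?, ?, ?)",
    [
        (1, 1, 0), (1, 2, 0), (1, 3, 1), (1, 4, 0), (1, 5, 0),
        (1, 6, 1), (1, 7, 0), (1, 8, 1),
        (1, 9, 1),   # hit_id 9 is an assumption (see the note above)
        (5, 19, 0),
    ],
)

# SUM(...) OVER (... ORDER BY hit_id) is a running total of the new_session flags
# within each user; adding 1 makes the first session start at 1.
query = """
SELECT user_id, hit_id, new_session,
       1 + SUM(new_session) OVER (PARTITION BY user_id ORDER BY hit_id) AS session_number
FROM pageviews
ORDER BY user_id, hit_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unique hit_ids matter here: with the default window frame, rows that tie on hit_id are treated as peers and would receive the same running total.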

Exclude select duplication

There is a table, prov_dl:
| ID | Code | Value |
+----+----------+-------+
| 2 | PRC | 0,1701|
| 2 | Stad | 3 |
The data is stored in this form, that is, there are several rows per ID, one for each code.
I need to pull the data in this form:
| ID | Stadya | Percent |
+----+----------+-----------+
| 2 | 3 | 0,1701 |
I tried this:
select id,
    case when code='Stad' then Value end Stadya,
    case when code='PRC' then Value end Percent
from prov_dl
but it returns one row per code instead of a single row per ID:
| ID | Stadya | Percent|
+----+----------+--------+
| 2 | | 0,1701 |
| 2 | 3 | |

Use max() together with group by id, so that the rows for each ID collapse into one:
select id,
    max(case when code='Stad' then Value end) as Stadya,
    max(case when code='PRC' then Value end) as Percent
from prov_dl
group by id
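
A minimal sketch of the max() pivot on the two sample rows, run through Python's sqlite3 module. SQLite is an assumed stand-in for the unspecified database, and Value is stored as text because the sample uses a decimal comma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prov_dl (ID INTEGER, Code TEXT, Value TEXT)")
conn.executemany(
    "INSERT INTO prov_dl VALUES (?, ?, ?)",
    [(2, "PRC", "0,1701"), (2, "Stad", "3")],
)

# Each CASE is NULL for rows with a different code; MAX ignores NULLs,
# so GROUP BY id folds the per-code rows into a single row per ID.
query = """
SELECT id,
       MAX(CASE WHEN code = 'Stad' THEN Value END) AS Stadya,
       MAX(CASE WHEN code = 'PRC' THEN Value END) AS Percent
FROM prov_dl
GROUP BY id
"""
rows = conn.execute(query).fetchall()
print(rows)
```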

Limit a sorted number of rows joined

I have two tables, A and B, and a join table M. For each A.Id, I want to get the top 2 B.Id values, sorted by Value in table M, producing the result below. This is running on an Azure SQL database.
Table A
+-----+
| Id |
+-----+
| 1 |
| 2 |
| 3 |
| 4 |
+-----+
Table M
+-----+-----+-------+
| AId | BId | Value |
+-----+-----+-------+
| 1 | 3 | 4 |
| 1 | 2 | 3 |
| 3 | 2 | 3 |
| 3 | 5 | 6 |
| 3 | 3 | 4 |
| 4 | 1 | 2 |
| 4 | 2 | 1 |
| 4 | 4 | 3 |
+-----+-----+-------+
Table B
+-----+
| Id |
+-----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+-----+
Result
+-----+-----+-------+
| AId | BId | Value |
+-----+-----+-------+
| 1 | 3 | 4 |
| 1 | 2 | 3 |
| 3 | 5 | 6 |
| 3 | 3 | 4 |
| 4 | 1 | 2 |
| 4 | 4 | 3 |
+-----+-----+-------+
I know that I can select all the M rows where AId equals 1, sort them, and limit the result to 2, but I need to do this for every row in Table A. I tried using group by, but I wasn't sure how to sort and limit within each group. I also searched for resources on this issue but couldn't find any.
(I also wasn't sure how to word the title for this issue)

You can just use ROW_NUMBER:
SELECT
    AId, BId, Value
FROM (
    SELECT *,
        Rn = ROW_NUMBER() OVER(PARTITION BY AId ORDER BY Value DESC)
    FROM M
) t
WHERE Rn <= 2
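
The top-2-per-group query can be exercised on the question's M table with the sketch below. It runs on SQLite through Python's sqlite3 module, which is an assumption standing in for Azure SQL; because SQLite does not accept the Rn = ... alias form, the sketch writes ... AS Rn instead, and it adds an ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE M (AId INTEGER, BId INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO M VALUES (?, ?, ?)",
    [
        (1, 3, 4), (1, 2, 3),
        (3, 2, 3), (3, 5, 6), (3, 3, 4),
        (4, 1, 2), (4, 2, 1), (4, 4, 3),
    ],
)

# ROW_NUMBER restarts at 1 for every AId and numbers rows from the highest Value down,
# so Rn <= 2 keeps the two highest-valued rows per AId.
query = """
SELECT AId, BId, Value
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY AId ORDER BY Value DESC) AS Rn
    FROM M
) AS t
WHERE Rn <= 2
ORDER BY AId, Value DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two rows in the same AId tie on Value, ROW_NUMBER picks between them arbitrarily; adding a tiebreaker column to the window's ORDER BY would make the choice stable.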
