Merging multiple rows into one using PostgreSQL

I am trying to combine multiple rows with the same ID into one.
My raw table looks like this:
ID | person_id | eur_amount
1  | 3         | 200
1  | 2         | 100
2  | 3         | 80
2  | 2         | 100
The output should look like this:
ID | person_1 | eur_amount_1 | person_2 | eur_amount_2
1  | 3        | 200          | 2        | 100
2  | 3        | 80           | 2        | 100
The maximum number of persons is the same for every ID. I already tried solving it with multiple JOIN statements and with the crosstab() function, as mentioned in "PostgreSQL Crosstab Query".
But I couldn't find a solution. Does anyone know a good way to achieve the desired output?
Thanks in advance!

You can do this using crosstab() or conditional aggregation. Either way, you first need to number the rows within each id, using row_number(). For instance:
select id,
       max(case when seqnum = 1 then person_id end) as person_id_1,
       max(case when seqnum = 1 then eur_amount end) as eur_amount_1,
       max(case when seqnum = 2 then person_id end) as person_id_2,
       max(case when seqnum = 2 then eur_amount end) as eur_amount_2
from (select t.*,
             -- "order by id" inside "partition by id" leaves the row order arbitrary;
             -- order by another column if the numbering must be deterministic
             row_number() over (partition by id order by id) as seqnum
      from t
     ) t
group by id;
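To see the conditional aggregation run against the sample rows from the question, here is a minimal sketch. It makes a few assumptions that are not in the original: an in-memory SQLite database (driven from Python) stands in for PostgreSQL, the table is named t as in the answer, and the rows are numbered by person_id descending so that the output matches the layout the question asks for.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, person_id integer, eur_amount integer)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(1, 3, 200), (1, 2, 100), (2, 3, 80), (2, 2, 100)],
)

query = """
select id,
       max(case when seqnum = 1 then person_id end)  as person_id_1,
       max(case when seqnum = 1 then eur_amount end) as eur_amount_1,
       max(case when seqnum = 2 then person_id end)  as person_id_2,
       max(case when seqnum = 2 then eur_amount end) as eur_amount_2
from (select t.*,
             -- assumed ordering: highest person_id gets slot 1
             row_number() over (partition by id order by person_id desc) as seqnum
      from t
     ) t
group by id
order by id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each case expression is non-null for exactly one row per id, so max() simply picks that value out of the group.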

Related

query a table with multiple rows for same id, into single data row in results

I have a few tables like this one, where a person has multiple data rows. The IDs are sequential but do not always start at 1. Is there a way to have the results come out as a single data row for each person? Ultimately I would like to join these tables via CLIENT_ID, but I'm a bit stumped. Is this possible?
I am using Oracle SQL.
CLIENT_ID  NAME   ID  ID_DESCRIPTION
5          joe    1   apple
5          joe    5   orange
68         brian  2   orange
68         brian  6   mango
68         brian  10  lemon
12         katie  3   watermelon
where the results look like this
CLIENT_ID  NAME   ID1  ID1_DESCRIPTION  ID2  ID2_DESCRIPTION  ID3  ID3_DESCRIPTION
5          joe    1    apple            5    orange
68         brian  2    orange           6    mango            10   lemon
12         katie  3    watermelon
If PIVOT is not available, this should do it:
select
    Client_id,
    sum(case when id_description = 'apple' then 1 else 0 end) as Apples,
    sum(case when id_description = 'orange' then 1 else 0 end) as Oranges
    -- ... etc.
from t
group by Client_ID
This might need some minor tweaking, as I wrote it off the top of my head, but something like the query below should work. Note that it doesn't account for more than 3 rows per CLIENT_ID. For that, you would need a dynamic pivot (there are plenty of online articles on this topic).
Pivoting Based on Order of Items
WITH cte_RowNum AS (
    -- the alias is rn because ROWNUM is a reserved word in Oracle
    SELECT t.*
          ,ROW_NUMBER() OVER (PARTITION BY CLIENT_ID ORDER BY ID) AS rn
    FROM YourTable t
)
SELECT CLIENT_ID
      ,NAME
      ,MAX(CASE WHEN rn = 1 THEN ID END) AS ID1
      ,MAX(CASE WHEN rn = 1 THEN ID_DESCRIPTION END) AS ID1_DESCRIPTION
      ,MAX(CASE WHEN rn = 2 THEN ID END) AS ID2
      ,MAX(CASE WHEN rn = 2 THEN ID_DESCRIPTION END) AS ID2_DESCRIPTION
      ,MAX(CASE WHEN rn = 3 THEN ID END) AS ID3
      ,MAX(CASE WHEN rn = 3 THEN ID_DESCRIPTION END) AS ID3_DESCRIPTION
FROM cte_RowNum
GROUP BY CLIENT_ID, NAME;
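The point of numbering the rows is that the slot a row lands in depends on its position within the client, not on the ID value itself, so IDs that do not start at 1 are handled and missing slots come out as NULL. A small sketch of that behaviour follows. The assumptions here are mine: in-memory SQLite (via Python) stands in for Oracle, the row-number alias is rn, and NAME is carried through the GROUP BY so it appears in the output.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle for this demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table YourTable (client_id integer, name text, id integer, id_description text)"
)
conn.executemany(
    "insert into YourTable values (?, ?, ?, ?)",
    [
        (5, "joe", 1, "apple"),
        (5, "joe", 5, "orange"),
        (68, "brian", 2, "orange"),
        (68, "brian", 6, "mango"),
        (68, "brian", 10, "lemon"),
        (12, "katie", 3, "watermelon"),
    ],
)

query = """
with cte_rownum as (
    select t.*,
           row_number() over (partition by client_id order by id) as rn
    from YourTable t
)
select client_id,
       name,
       max(case when rn = 1 then id end)             as id1,
       max(case when rn = 1 then id_description end) as id1_description,
       max(case when rn = 2 then id end)             as id2,
       max(case when rn = 2 then id_description end) as id2_description,
       max(case when rn = 3 then id end)             as id3,
       max(case when rn = 3 then id_description end) as id3_description
from cte_rownum
group by client_id, name
order by client_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # unused slots are printed as None (SQL NULL)
```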

SQL Find following session - different logic than cross join

I have a set of data that stores two types of sessions. It is mobile data usage versus wifi data usage.
ID Session_Type
1 Cell
2 WiFi
3 Cell
4 Cell
5 WiFi
.
.
.
.
1000 Cell
1001 WiFi
Desired Results
Cell_ID Next_WiFi_sess_id
1 2
3 5
4 5
.
.
1000 1001
I have gotten as far as joining the table to itself with a condition comparing the id to the WiFi id, but I am not sure this is the right solution. Can this be done with LAG for better performance?
select a.id, b.id
from
table a
join table b
where a.id > min(b.id)
You can use window functions -- specifically, a minimum taken over the current row and all following rows:
select t.*
from (select t.*,
             min(case when session_type = 'WiFi' then id end)
                 over (order by id
                       rows between current row and unbounded following) as next_wifi_id
      from t
     ) t
where session_type = 'Cell';
Here is one option that uses window functions: you can get the next WiFi session with a window min; the trick is to order the frame by descending id:
select id, next_wifi_id
from (
    select t.*,
           min(case when session_type = 'WiFi' then id end) over(order by id desc) next_wifi_id
    from mytable t
) t
where session_type = 'Cell'
Demo on DB Fiddle - this is Postgres, but the behavior is the same in Hive.
id | next_wifi_id
-: | -----------:
1 | 2
3 | 5
4 | 5
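The two answers express the same window in different ways: one keeps ascending order and looks forward with an explicit frame, the other reverses the order and relies on the default frame. The sketch below runs both against the first five sample rows and collects the results so they can be compared. It assumes an in-memory SQLite database driven from Python (not one of the engines mentioned above) and a table named t.

```python
import sqlite3

# Assumption: in-memory SQLite is used only to exercise the two queries.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, session_type text)")
conn.executemany(
    "insert into t values (?, ?)",
    [(1, "Cell"), (2, "WiFi"), (3, "Cell"), (4, "Cell"), (5, "WiFi")],
)

# Variant 1: ascending order, frame runs from the current row to the end.
forward_frame = """
select id, next_wifi_id
from (select t.*,
             min(case when session_type = 'WiFi' then id end)
                 over (order by id
                       rows between current row and unbounded following) as next_wifi_id
      from t) t
where session_type = 'Cell'
order by id
"""

# Variant 2: descending order, default frame (start of window up to current row).
descending_order = """
select id, next_wifi_id
from (select t.*,
             min(case when session_type = 'WiFi' then id end)
                 over (order by id desc) as next_wifi_id
      from t) t
where session_type = 'Cell'
order by id
"""

forward_rows = conn.execute(forward_frame).fetchall()
descending_rows = conn.execute(descending_order).fetchall()
print(forward_rows)
print(descending_rows)
```

In both variants the case expression yields NULL for Cell rows, and min() ignores NULLs, so only WiFi ids at or after the current row can be picked.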

Duplicate id rows with few columns to unique id row with many columns Oracle SQL

I have a pole table in which each pole can have one to four streetlights on it. Each row has a pole ID and the type (a description) of streetlight. I need the IDs to be unique, with a column for each of the possible streetlights. The type/description can be any one of 26 strings.
I have something like this:
ID Description
----------------
1 S 400
1 M 200
1 HPS 1000
1 S 400
2 M 400
2 S 250
3 S 300
What I need:
ID  Description_1  Description_2  Description_3  Description_4
------------------------------------------------------------------
1   S 400          M 200          HPS 1000       S 400
2   M 400          S 250
3   S 300
The order in which the descriptions populate the description columns is not important; e.g. for ID = 1 the HPS 1000 value could be in description column 1, 2, 3, or 4, so long as all values are present.
I tried to pivot it but I don't think that is the right tool.
select * from table t
pivot (
max(Description) for ID in (1, 2, 3))
Because there are ~3000 IDs I would end up with a table that is ~3001 columns wide...
I also looked at "Oracle SQL Cross Tab Query", but it is not quite the same situation.
What is the right way to solve this problem?
You can use row_number() and conditional aggregation:
select
    id,
    max(case when rn = 1 then description end) description_1,
    max(case when rn = 2 then description end) description_2,
    max(case when rn = 3 then description end) description_3,
    max(case when rn = 4 then description end) description_4
from (
    select t.*, row_number() over(partition by id order by description) rn
    from mytable t
) t
group by id
This handles up to 4 descriptions per id. To handle more, you can just expand the select clause with more conditional max()s.
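Since the only thing that changes when more slots are needed is the list of conditional max() expressions, that list can be generated rather than typed out. The following is a sketch only: build_pivot_sql is a hypothetical helper of mine, Python is used to build the statement, and in-memory SQLite stands in for Oracle to run it against the sample rows.

```python
import sqlite3


def build_pivot_sql(max_slots):
    """Hypothetical helper: build the query with one max(case ...) per slot."""
    columns = ",\n       ".join(
        f"max(case when rn = {i} then description end) as description_{i}"
        for i in range(1, max_slots + 1)
    )
    return f"""
select id,
       {columns}
from (select t.*,
             row_number() over (partition by id order by description) as rn
      from mytable t) t
group by id
order by id
"""


# Assumption: in-memory SQLite stands in for Oracle for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (id integer, description text)")
conn.executemany(
    "insert into mytable values (?, ?)",
    [
        (1, "S 400"), (1, "M 200"), (1, "HPS 1000"), (1, "S 400"),
        (2, "M 400"), (2, "S 250"),
        (3, "S 300"),
    ],
)

rows = conn.execute(build_pivot_sql(4)).fetchall()
for row in rows:
    print(row)
```

The number of output columns is still fixed when the statement is built; a single static SQL statement cannot grow its column list based on the data.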

Select a single ID for each group with the lowest value

Consider the following table:
ID GroupId Rank
1 1 1
2 1 2
3 1 1
4 2 10
5 2 1
6 3 1
7 4 5
I need a SQL (MS SQL Server) select query that selects a single ID for each group with the lowest rank. Each group must return only a single ID, even if two rows share the same rank (as IDs 1 and 3 do in the above table). I've tried selecting the min value, but the requirement that only one row be returned, and that the returned value be the ID column, is throwing me.
Does anyone know how to do this?
Use row_number():
select t.*
from (select t.*,
             row_number() over (partition by groupid order by rank) as seqnum
      from t
     ) t
where seqnum = 1;
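row_number() assigns distinct numbers even when ranks tie, which is why exactly one row per group survives the seqnum = 1 filter. Which of the tied rows wins is arbitrary unless the ordering says so. The sketch below adds id as a tie-breaker, which is my assumption and not part of the answer, and runs against the sample table in an in-memory SQLite database driven from Python (standing in for SQL Server).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server for this demo.
conn = sqlite3.connect(":memory:")
conn.execute('create table t (id integer, groupid integer, "rank" integer)')
conn.executemany(
    "insert into t values (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 1, 1), (4, 2, 10), (5, 2, 1), (6, 3, 1), (7, 4, 5)],
)

query = """
select id, groupid
from (select t.*,
             -- assumed tie-breaker: among equal ranks, the lowest id wins
             row_number() over (partition by groupid order by "rank", id) as seqnum
      from t
     ) t
where seqnum = 1
order by groupid
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Group 1 contains two rows with rank 1 (ids 1 and 3); with the tie-breaker in place, id 1 is returned every time.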

Is there a way to group this data?

The data looks like this:
1
2
3
1
2
2
2
3
1
5
4
1
2
Whenever there is a 1, it marks the beginning of a group, which includes all the elements up to the next 1. So here:
1 2 3 - group 1
1 2 2 2 3 - group 2
and so on..
What would be the SQL query to show the average for every such group?
I could not figure out how to group them without using for loops or PL/SQL code.
The result should have two columns: one with the actual data and a second with the average value:
1 - avg value of 1,2 3
2
3
1 - avg value of 1,2,2,2,3
2
2
2
3
1 - avg value of 1,5,4
5
4
1 - avg value of 1,2
2
SQL tables represent unordered sets. There is no ordering, unless a column specifies the ordering. Let me assume that you have such a column.
You can identify the groups using a cumulative sum:
select t.*,
sum(case when t.col = 1 then 1 else 0 end) over (order by ?) as grp
from t;
? is the column that specifies the ordering.
You can then calculate the average using aggregation:
select grp, avg(col)
from (select t.*,
             sum(case when t.col = 1 then 1 else 0 end) over (order by ?) as grp
      from t
     ) t
group by grp;
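To get the two-column layout from the question (every data row kept, with its group's average alongside), the aggregate can be swapped for a window average partitioned by grp. This is a sketch under assumptions of mine: the ordering column is a hypothetical pos holding the row position, and an in-memory SQLite database driven from Python is used to run it on the sample data.

```python
import sqlite3

# Assumption: in-memory SQLite is used only to exercise the query.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (pos integer primary key, col integer)")
values = [1, 2, 3, 1, 2, 2, 2, 3, 1, 5, 4, 1, 2]
conn.executemany(
    "insert into t values (?, ?)",
    list(enumerate(values, start=1)),  # pos is a hypothetical ordering column
)

query = """
select pos, col, avg(col) over (partition by grp) as grp_avg
from (select t.*,
             -- running count of 1s: increases each time a new group starts
             sum(case when col = 1 then 1 else 0 end) over (order by pos) as grp
      from t
     ) t
order by pos
"""

rows = conn.execute(query).fetchall()
for pos, col, grp_avg in rows:
    print(col, round(grp_avg, 3))
```

The running sum only increases on rows where col = 1, so every row up to the next 1 shares the same grp value.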