I am trying to merge the integer and numeric values from different SQL rows within the same table into one row, so that they are summed per ID.
|   | ID | Count | Total Payment |
|---|----|-------|---------------|
| 1 | 1  | 5     | 10.99         |
| 2 | 1  | 3     | 4.86          |
| 3 | 2  | 8     | 19.88         |
| 4 | 2  | 2     | 15.99         |
| 5 | 2  | 5     | 8.45          |
| 6 | 3  | 4     | 12.98         |
| 7 | 3  | 10    | 40.42         |
As such, I want to summarize the above rows into the rows below:
|   | ID | Count | Total Payment |
|---|----|-------|---------------|
| 1 | 1  | 8     | 15.85         |
| 2 | 2  | 15    | 44.32         |
| 3 | 3  | 14    | 53.40         |
How do I do this?
Thank you HonyBadger and Mathieu Guindon.
The correct code was:
SELECT [id], SUM([count]) AS [count], SUM([total_payment]) AS [total_payment]
FROM [table_name]
GROUP BY [id]
ORDER BY SUM([count]), SUM([total_payment]);
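The same aggregation in pandas, for readers coming from the DataFrame side, would look roughly like the minimal sketch below (the DataFrame and its column names are assumptions, not part of the original question):

import pandas as pd

# sample rows from the question (column names are assumptions)
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2, 3, 3],
    "count": [5, 3, 8, 2, 5, 4, 10],
    "total_payment": [10.99, 4.86, 19.88, 15.99, 8.45, 12.98, 40.42],
})

# one output row per id, with count and total_payment summed
summary = df.groupby("id", as_index=False)[["count", "total_payment"]].sum()
print(summary)
# id=1 -> count 8, total_payment 15.85
# id=2 -> count 15, total_payment 44.32
# id=3 -> count 14, total_payment 53.40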
I have columns amount and assets. Column target should be the cumulative sum of amount, but the sum should reset to the current amount whenever the previous row's assets value was zero.
Sample:
+--------+--------+--------+
| amount | assets | target |
+--------+--------+--------+
| 6 | 10 | 6 |
| 8 | 20 | 14 |
| -1 | 0 | 13 |
| 6 | 1 | 6 |
| -7 | 0 | -1 |
| 2 | 4 | 2 |
| -5 | 7 | -3 |
| 3 | 9 | 0 |
| 7 | 0 | 7 |
| 9 | 2 | 9 |
| 1 | 3 | 10 |
| -4 | 5 | 6 |
+--------+--------+--------+
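For anyone who wants to run the answer below as-is, here is a minimal sketch that builds this sample as a DataFrame (pandas assumed; the target column is kept only for comparison):

import pandas as pd

df = pd.DataFrame({
    "amount": [6, 8, -1, 6, -7, 2, -5, 3, 7, 9, 1, -4],
    "assets": [10, 20, 0, 1, 0, 4, 7, 9, 0, 2, 3, 5],
    "target": [6, 14, 13, 6, -1, 2, -3, 0, 7, 9, 10, 6],
})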
Use GroupBy.cumsum with group labels created by comparing the assets column to 0, shifting the result with Series.shift, handling the first NaN, and then taking Series.cumsum of the flags:
# flag rows whose *previous* row had assets == 0; the cumulative sum of the
# flags starts a new group label every time the running total must restart
g = df['assets'].eq(0).shift().bfill().cumsum()
# alternative: fill the first shifted value with 0 instead of backfilling
# g = df['assets'].eq(0).shift(fill_value=0).cumsum()
df['new'] = df.groupby(g)['amount'].cumsum()
print(df)
amount assets target new
0 6 10 6 6
1 8 20 14 14
2 -1 0 13 13
3 6 1 6 6
4 -7 0 -1 -1
5 2 4 2 2
6 -5 7 -3 -3
7 3 9 0 0
8 7 0 7 7
9 9 2 9 9
10 1 3 10 10
11 -4 5 6 6
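As a cross-check on the logic (not part of the original answer), the same reset rule written as a plain Python loop over the df built above:

# running total that restarts whenever the *previous* row had assets == 0
check = []
running = 0
previous_was_zero = False
for amount, assets in zip(df["amount"], df["assets"]):
    running = amount if previous_was_zero else running + amount
    check.append(running)
    previous_was_zero = assets == 0

assert check == df["new"].tolist()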
| ID | CUSTOMER_ID | LAST_TRAN_DATE | is_active | NO_OF_ACC |
|----|-------------|----------------|-----------|-----------|
| 1  | 1           | 3-Apr-15       | 0         | 5         |
| 2  | 2           | 26-Mar-04      | 0         | 4         |
| 3  | 2           | 25-Jul-14      | 0         | 4         |
| 4  | 2           | 3-Jan-13       | 0         | 4         |
| 5  | 2           | 28-Jun-13      | 0         | 4         |
| 6  | 3           | 19-Nov-08      | 0         | 3         |
| 7  | 3           | 21-May-09      | 0         | 3         |
| 8  | 3           | 24-Feb-12      | 0         | 3         |
| 9  | 1           | 1-Jun-16       | 0         | 5         |
| 10 | 1           | 8-Apr-19       | 1         | 5         |
| 11 | 1           | 25-Nov-17      | 0         | 5         |
| 12 | 1           | 22-Feb-19      | 1         | 5         |
My data looks like the above, and I want to calculate the number of active accounts for each CUSTOMER_ID, store it in a new column, and display it on every row.
I used
df.groupby(['CUSTOMER_ID', 'is_active']).size()
which gave me the following result.
CUSTOMER_ID  is_active
1            0            3
             1            2
2            0            4
3            0            3
dtype: int64
But I have no idea how to map these counts back onto each row by creating a new column.
Please help me
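For reproducibility, a small sketch of the sample data as a DataFrame (column names taken from the table above; is_active is assumed to be 0/1 integers):

import pandas as pd

df = pd.DataFrame({
    "ID": list(range(1, 13)),
    "CUSTOMER_ID": [1, 2, 2, 2, 2, 3, 3, 3, 1, 1, 1, 1],
    "LAST_TRAN_DATE": ["3-Apr-15", "26-Mar-04", "25-Jul-14", "3-Jan-13",
                       "28-Jun-13", "19-Nov-08", "21-May-09", "24-Feb-12",
                       "1-Jun-16", "8-Apr-19", "25-Nov-17", "22-Feb-19"],
    "is_active": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1],
    "NO_OF_ACC": [5, 4, 4, 4, 4, 3, 3, 3, 5, 5, 5, 5],
})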
IIUC, you need a groupby sum with an initial filter and .map to broadcast the result back onto the entire index of the dataframe.
df["active_accounts"] = df["CUSTOMER_ID"].map(
df[df["is_active"].eq(1)].groupby("CUSTOMER_ID")["NO_OF_ACC"].sum()
)
print(df)
    ID  CUSTOMER_ID LAST_TRAN_DATE  is_active  NO_OF_ACC  active_accounts
0    1            1       3-Apr-15          0          5             10.0
1    2            2      26-Mar-04          0          4              NaN
2    3            2      25-Jul-14          0          4              NaN
3    4            2       3-Jan-13          0          4              NaN
4    5            2      28-Jun-13          0          4              NaN
5    6            3      19-Nov-08          0          3              NaN
6    7            3      21-May-09          0          3              NaN
7    8            3      24-Feb-12          0          3              NaN
8    9            1       1-Jun-16          0          5             10.0
9   10            1       8-Apr-19          1          5             10.0
10  11            1      25-Nov-17          0          5             10.0
11  12            1      22-Feb-19          1          5             10.0
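Note that the .map above sums NO_OF_ACC over the active rows only, which is why customers 2 and 3 end up with NaN. If what you actually want is the number of rows with is_active == 1 per customer, shown on every row (2 for customer 1, 0 for the others), a transform-based sketch would be:

# count of active rows per customer, broadcast onto every row of that customer
df["active_accounts"] = df.groupby("CUSTOMER_ID")["is_active"].transform("sum")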
CREATE TABLE products(
id integer,
country_id integer,
category_id smallint,
product_count integer
);
INSERT INTO products VALUES
(1, 12, 1, 2),
(2, 12, 1, 4),
(3, 12, 2, 1),
(4, 45, 5, 2),
(5, 45, 5, 1),
(6, 45, 8, 5),
(7, 3, 1, 3),
(8, 3, 1, 3);
-----------------------------------------------------
id | country_id | category_id | product_count
-----------------------------------------------------
1 12 1 2
2 12 1 4
3 12 2 1
4 45 5 2
5 45 5 1
6 45 8 5
7 3 1 3
8 3 1 3
What I want to see is shown below: I want to sum product_count per category_id within each grouped country_id and show that total on every row;
---------------------------------------------------------------------
id | country_id | category_id | product_count | total_count
---------------------------------------------------------------------
1 12 1 2 6
2 12 1 4 6
3 12 2 1 1
4 45 5 2 3
5 45 5 1 3
6 45 8 5 5
7 3 1 3 6
8 3 1 3 6
I tried this, but it didn't help. It doesn't do the trick and bring back the summed product_count for each grouped category_id;
SELECT *,SUM(r.product_count) as sum FROM (
SELECT id,
country_id,
category_id,
product_count
FROM products
) r
GROUP BY r.country_id,r.category_id,r.product_count, r.id
ORDER BY r.country_id , r.category_id, r.product_count;
I partitioned by both country_id and category_id with a window function to get the requested result:
et=# select *,sum(product_count) over (partition by country_id,category_id) from products order by id;
id | country_id | category_id | product_count | sum
----+------------+-------------+---------------+-----
1 | 12 | 1 | 2 | 6
2 | 12 | 1 | 4 | 6
3 | 12 | 2 | 1 | 1
4 | 45 | 5 | 2 | 3
5 | 45 | 5 | 1 | 3
6 | 45 | 8 | 5 | 5
7 | 3 | 1 | 3 | 6
8 | 3 | 1 | 3 | 6
(8 rows)
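For comparison, the pandas analogue of this window function is a groupby().transform('sum') sketch (the DataFrame below simply mirrors the INSERT statement above):

import pandas as pd

products = pd.DataFrame(
    [(1, 12, 1, 2), (2, 12, 1, 4), (3, 12, 2, 1), (4, 45, 5, 2),
     (5, 45, 5, 1), (6, 45, 8, 5), (7, 3, 1, 3), (8, 3, 1, 3)],
    columns=["id", "country_id", "category_id", "product_count"],
)

# per-row group total, the pandas analogue of
# SUM(product_count) OVER (PARTITION BY country_id, category_id)
products["total_count"] = (
    products.groupby(["country_id", "category_id"])["product_count"].transform("sum")
)
print(products)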
I have a table like this
a | b
_____
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
and I want the result like this, where c is the running total of b
a | b | c
_________
1 | 1 | 1
2 | 2 | 3
3 | 3 | 6
4 | 4 | 10
5 | 5 | 15
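The question doesn't say which tool is in use; as one possibility, here is a pandas sketch of the running total (the DataFrame and column names are assumed from the tables above):

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [1, 2, 3, 4, 5]})

# c is the running (cumulative) sum of b down the rows
df["c"] = df["b"].cumsum()
print(df)
# c comes out as 1, 3, 6, 10, 15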