How can I get the matrix for these tables? - sql

I have two tables and need to produce a matrix of all combinations of their rows.
Table 1
Brand Company ID
1 1 1
2 2 2
3 3 3
Table 2
Prod1 Prod2 Prod3 Prod4 Prod5
4 5 6 18 19
5 6 7 20 5
The result I'm trying to achieve
Result table:
Brand Company ID Prod1 Prod2 Prod3 Prod4 Prod5
1 1 1 4 5 6 18 19
1 1 1 5 6 7 20 5
2 2 2 4 5 6 18 19
2 2 2 5 6 7 20 5
I could have worked with this if the tables shared some kind of ID, but without one I'm not sure how to approach getting the matrix.
Thank you

Not sure what happened to the third row from table1 in your query and why it isn't in the result, but I think you are looking for a cross join.
select Brand, Company, ID, Prod1, Prod2, Prod3, Prod4, Prod5
from table1
cross join table2
rextester demo: http://rextester.com/UOZ33372
returns (with added order by):
+-------+---------+----+-------+-------+-------+-------+-------+
| Brand | Company | ID | Prod1 | Prod2 | Prod3 | Prod4 | Prod5 |
+-------+---------+----+-------+-------+-------+-------+-------+
| 1 | 1 | 1 | 4 | 5 | 6 | 18 | 19 |
| 1 | 1 | 1 | 5 | 6 | 7 | 20 | 5 |
| 2 | 2 | 2 | 4 | 5 | 6 | 18 | 19 |
| 2 | 2 | 2 | 5 | 6 | 7 | 20 | 5 |
| 3 | 3 | 3 | 4 | 5 | 6 | 18 | 19 |
| 3 | 3 | 3 | 5 | 6 | 7 | 20 | 5 |
+-------+---------+----+-------+-------+-------+-------+-------+
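To try the cross join without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The column types and the ORDER BY clause are assumptions made for illustration; the table and column names come from the question.

```python
import sqlite3

# In-memory database holding the two sample tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Brand INTEGER, Company INTEGER, ID INTEGER);
    CREATE TABLE table2 (Prod1 INTEGER, Prod2 INTEGER, Prod3 INTEGER,
                         Prod4 INTEGER, Prod5 INTEGER);
    INSERT INTO table1 VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO table2 VALUES (4, 5, 6, 18, 19), (5, 6, 7, 20, 5);
""")

# CROSS JOIN pairs every row of table1 with every row of table2,
# so 3 rows x 2 rows gives 6 result rows.
rows = conn.execute("""
    SELECT Brand, Company, ID, Prod1, Prod2, Prod3, Prod4, Prod5
    FROM table1
    CROSS JOIN table2
    ORDER BY Brand, Prod1
""").fetchall()

for row in rows:
    print(row)
```

To drop Brand 3 as in the question's expected result, a WHERE Brand <> 3 filter could be added, but the question does not say why that row is missing.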

Related

SQL How to summarize integer/numeric values on different rows

I am trying to merge integer and numeric values from different SQL rows within the same table into one row so that they are summarized.
  | ID | Count | Total Payment
1 | 1  | 5     | 10.99
2 | 1  | 3     | 4.86
3 | 2  | 8     | 19.88
4 | 2  | 2     | 15.99
5 | 2  | 5     | 8.45
6 | 3  | 4     | 12.98
7 | 3  | 10    | 40.42
As such I want to summarize the above rows into the below rows.
  | ID | Count | Total Payment
1 | 1  | 8     | 15.85
2 | 2  | 15    | 44.32
3 | 3  | 14    | 53.40
How do I do this?
Thank you HonyBadger and Mathieu Guindon.
The correct code was:
SELECT [id], SUM([count]) AS [count], SUM([total_payment]) AS [total_payment]
FROM [table_name]
GROUP BY [id]
ORDER BY [id];
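As a quick check of the GROUP BY approach against the sample data, here is a small sketch using Python's sqlite3 module. The table name payments and the aliases total_count and total_paid are assumptions for illustration, and ROUND is only there to hide floating-point noise in the sums.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE payments (id INTEGER, "count" INTEGER, total_payment REAL)'
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        (1, 5, 10.99), (1, 3, 4.86),
        (2, 8, 19.88), (2, 2, 15.99), (2, 5, 8.45),
        (3, 4, 12.98), (3, 10, 40.42),
    ],
)

# One output row per id, with both numeric columns summed within the group.
totals = conn.execute("""
    SELECT id,
           SUM("count") AS total_count,
           ROUND(SUM(total_payment), 2) AS total_paid
    FROM payments
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in totals:
    print(row)
```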

Restart cumsum in Pandas with condition

I have columns amount & assets. Column target should be the cumsum of amount, but the sum should be reset to the current amount if the previous assets was equal to zero.
Sample:
+--------+--------+--------+
| amount | assets | target |
+--------+--------+--------+
| 6 | 10 | 6 |
| 8 | 20 | 14 |
| -1 | 0 | 13 |
| 6 | 1 | 6 |
| -7 | 0 | -1 |
| 2 | 4 | 2 |
| -5 | 7 | -3 |
| 3 | 9 | 0 |
| 7 | 0 | 7 |
| 9 | 2 | 9 |
| 1 | 3 | 10 |
| -4 | 5 | 6 |
+--------+--------+--------+
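Use GroupBy.cumsum, with the groups built by comparing assets to 0, shifting the result down one row with Series.shift, filling the leading NaN, and taking Series.cumsum:
# True marks a row whose previous row had assets == 0; cumsum turns that into group ids
g = df['assets'].eq(0).shift().bfill().cumsum()
#alternative
#g = df['assets'].eq(0).shift(fill_value=0).cumsum()
# cumulative sum of amount, restarted at the start of each group
df['new'] = df.groupby(g)['amount'].cumsum()
print(df)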
amount assets target new
0 6 10 6 6
1 8 20 14 14
2 -1 0 13 13
3 6 1 6 6
4 -7 0 -1 -1
5 2 4 2 2
6 -5 7 -3 -3
7 3 9 0 0
8 7 0 7 7
9 9 2 9 9
10 1 3 10 10
11 -4 5 6 6
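To cross-check the grouped result, the reset rule can also be written as a plain Python loop. This is only a reference sketch with no pandas involved; the function name restart_cumsum is made up for illustration.

```python
def restart_cumsum(amounts, assets):
    """Running sum of amounts that restarts on the row after assets == 0."""
    result = []
    running = 0
    previous_assets = None  # there is no previous row before the first one
    for amount, asset in zip(amounts, assets):
        if previous_assets == 0:
            running = 0  # previous row had zero assets: start over
        running += amount
        result.append(running)
        previous_assets = asset
    return result


# Sample columns from the question.
amounts = [6, 8, -1, 6, -7, 2, -5, 3, 7, 9, 1, -4]
assets = [10, 20, 0, 1, 0, 4, 7, 9, 0, 2, 3, 5]
target = [6, 14, 13, 6, -1, 2, -3, 0, 7, 9, 10, 6]

computed = restart_cumsum(amounts, assets)
print(computed == target)
```

The loop is easier to reason about but slower on large frames, which is why the vectorized groupby version is usually preferable in pandas.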

Group items in a data frame using a conditions

| ID | CUSTOMER_ID | LAST_TRAN_DATE | is_active | NO_OF_ACC |
|----|-------------|----------------|-----------|-----------|
| 1  | 1           | 3-Apr-15       | 0         | 5         |
| 2  | 2           | 26-Mar-04      | 0         | 4         |
| 3  | 2           | 25-Jul-14      | 0         | 4         |
| 4  | 2           | 3-Jan-13       | 0         | 4         |
| 5  | 2           | 28-Jun-13      | 0         | 4         |
| 6  | 3           | 19-Nov-08      | 0         | 3         |
| 7  | 3           | 21-May-09      | 0         | 3         |
| 8  | 3           | 24-Feb-12      | 0         | 3         |
| 9  | 1           | 1-Jun-16       | 0         | 5         |
| 10 | 1           | 8-Apr-19       | 1         | 5         |
| 11 | 1           | 25-Nov-17      | 0         | 5         |
| 12 | 1           | 22-Feb-19      | 1         | 5         |
My data looks like the table above. I want to calculate the number of active accounts for each customer ID and show it in a new column on every row.
I used
df.groupby(['CUSTOMER_ID', 'is_active']).size()
which gave me the following result.
CUSTOMER_ID  is_active
1            0            3
             1            2
2            0            4
3            0            3
dtype: int64
But I have no idea how to map these counts back onto each row as a new column.
Please help.
IIUC, you need to filter to the active rows, sum NO_OF_ACC per CUSTOMER_ID with groupby, and then use .map to broadcast the result back onto every row of the dataframe. Customers with no active rows get NaN.
df["active_accounts"] = df["CUSTOMER_ID"].map(
    df[df["is_active"].eq(1)].groupby("CUSTOMER_ID")["NO_OF_ACC"].sum()
)
print(df)
ID CUSTOMER_ID LAST_TRAN_DATE is_active Count_Column NO_OF_ACC \
2 1 1 3-Apr-15 0 5 5
3 2 2 26-Mar-04 0 4 4
4 3 2 25-Jul-14 0 4 4
5 4 2 3-Jan-13 0 4 4
6 5 2 28-Jun-13 0 4 4
7 6 3 19-Nov-08 0 3 3
8 7 3 21-May-09 0 3 3
9 8 3 24-Feb-12 0 3 3
10 9 1 1-Jun-16 0 5 5
11 10 1 8-Apr-19 1 5 5
12 11 1 25-Nov-17 0 5 5
13 12 1 22-Feb-19 1 5 5
active_accounts
2 10.0
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
10 10.0
11 10.0
12 10.0
13 10.0
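The answer above sums NO_OF_ACC over the active rows, which gives 10 for customer 1. If what is wanted is instead the number of rows flagged is_active == 1 per customer (2 for customer 1 in the sample), one possible sketch uses groupby with transform, which returns a value for every original row. The column name active_count is an assumption, and the frame below is rebuilt from the question's table without the date column.

```python
import pandas as pd

# Sample data from the question (LAST_TRAN_DATE omitted; it is not needed here).
df = pd.DataFrame({
    "ID": list(range(1, 13)),
    "CUSTOMER_ID": [1, 2, 2, 2, 2, 3, 3, 3, 1, 1, 1, 1],
    "is_active": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1],
    "NO_OF_ACC": [5, 4, 4, 4, 4, 3, 3, 3, 5, 5, 5, 5],
})

# is_active is 0/1, so summing it within each customer counts the active rows.
# transform("sum") broadcasts the per-customer total back to every row.
df["active_count"] = df.groupby("CUSTOMER_ID")["is_active"].transform("sum")

print(df)
```

Unlike the filter-then-map approach, this yields 0 rather than NaN for customers with no active rows.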

sum values grouping sets of columns in postgresql 9.6

CREATE TABLE products(
id integer,
country_id integer,
category_id smallint,
product_count integer
);
INSERT INTO products VALUES
(1,12,1,2),
(2,12,1,4),
(3,12,2,1),
(4,45,5,2),
(5,45,5,1),
(6,45,8,5),
(7,3,1,3),
(8,3,1,3);
-----------------------------------------------------
id | country_id | category_id | product_count
-----------------------------------------------------
1 12 1 2
2 12 1 4
3 12 2 1
4 45 5 2
5 45 5 1
6 45 8 5
7 3 1 3
8 3 1 3
What I want is the sum of product_count for each category_id within each country_id, shown on every row, like this:
---------------------------------------------------------------------
id | country_id | category_id | product_count | total_count
---------------------------------------------------------------------
1 12 1 2 6
2 12 1 4 6
3 12 2 1 1
4 45 5 2 3
5 45 5 1 3
6 45 8 5 5
7 3 1 3 6
8 3 1 3 6
I tried the following, but it doesn't do the trick: it does not return the summed product_count for each category_id group.
SELECT *,SUM(r.product_count) as sum FROM (
SELECT id,
country_id,
category_id,
product_count
FROM products
) r
GROUP BY r.country_id,r.category_id,r.product_count, r.id
ORDER BY r.country_id , r.category_id, r.product_count;
I partitioned by both country_id and category_id in a window function to get the requested result:
et=# select *,sum(product_count) over (partition by country_id,category_id) from products order by id;
id | country_id | category_id | product_count | sum
----+------------+-------------+---------------+-----
1 | 12 | 1 | 2 | 6
2 | 12 | 1 | 4 | 6
3 | 12 | 2 | 1 | 1
4 | 45 | 5 | 2 | 3
5 | 45 | 5 | 1 | 3
6 | 45 | 8 | 5 | 5
7 | 3 | 1 | 3 | 6
8 | 3 | 1 | 3 | 6
(8 rows)
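The same window-function query can be tried locally with Python's sqlite3 module. This is a sketch that assumes the bundled SQLite is version 3.25 or newer, since window functions were added in that release; the alias total_count is taken from the question's desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products(
        id integer,
        country_id integer,
        category_id smallint,
        product_count integer
    );
    INSERT INTO products VALUES
        (1,12,1,2), (2,12,1,4), (3,12,2,1), (4,45,5,2),
        (5,45,5,1), (6,45,8,5), (7,3,1,3), (8,3,1,3);
""")

# SUM(...) OVER (PARTITION BY ...) keeps every input row and attaches the
# total of its (country_id, category_id) partition, unlike GROUP BY, which
# would collapse each partition into a single row.
rows = conn.execute("""
    SELECT id, country_id, category_id, product_count,
           SUM(product_count) OVER (PARTITION BY country_id, category_id)
               AS total_count
    FROM products
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```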

Summary of row values according to summation of n number

I have a table like this:
a | b
_____
1 | 1
2 | 2
3 | 3
4 | 4
5 | 5
and I want a result like this:
a | b | c
_________
1 | 1 | 1
2 | 2 | 3
3 | 3 | 6
4 | 4 | 10
5 | 5 | 15
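Column c in the desired result is the running total of b. One possible way to get it, sketched here with Python's sqlite3 module, is a window SUM ordered by a. The table name t is hypothetical, since the question does not name the table or the database engine, and the query assumes an engine with window-function support (SQLite 3.25 or newer here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER);  -- hypothetical table name
    INSERT INTO t VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5);
""")

# With ORDER BY inside OVER, the default window frame runs from the first
# row up to the current row, so SUM(b) becomes a cumulative total.
rows = conn.execute("""
    SELECT a, b, SUM(b) OVER (ORDER BY a) AS c
    FROM t
    ORDER BY a
""").fetchall()

for row in rows:
    print(row)
```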