Transforming a dataset containing bank transactions into SQL Server - sql

I would like to transform a dataset containing some bank transactions.
The ultimate goal is to make a report in Power BI to track daily expenses.
For this, I have the following situation that gives me a headache. :)
This is an example:
Date | Transaction_Details | Debit | Credit
21 Jan 2023 | Transfer HomeBank | 500 | NULL
NULL | Reference: 4944 | NULL | NULL
NULL | Beneficiary: David John | NULL | NULL
NULL | In Bank Account: RO97INGB1111333380218 | 500 | NULL
20 Jan 2023 | POS Payment | 36 | NULL
NULL | Card number: xxxx xxxx xxxx 1020 | NULL | NULL
NULL | Terminal: OZKARDES A/S | NULL | NULL
NULL | Date: 19-01-2023 | NULL | NULL
The desired output is to move every Transaction_Details row that has NULL in the Date column into a new column (e.g. Other_Details), and to add a "Transaction_Key" column that numbers each transaction.
Below, I have attached an example:
Transaction_Key | Date | Transaction_Details | Other_Details | Debit | Credit
1 | 21 Jan 2023 | Transfer HomeBank | Reference: 4944, Beneficiary: David John, In Bank Account: RO97INGB1111333380218 | 500 | NULL
2 | 20 Jan 2023 | POS Payment | Card number: xxxx xxxx xxxx 1020, Terminal: OZKARDES A/S, Date: 19-01-2023 | 36 | NULL
I used some COALESCE functions but it didn't work.

Assuming you can create an Id/sequence column, either in the data source or when importing the data, so that each row gets an incrementing number, you can convert the data with a windowed running sum followed by an aggregation:
select Transaction_Key,
       Max(Date) Date,
       Max(case when date is not null then Transaction_Details end) Transaction_Details,
       -- String_Agg needs SQL Server 2017+; within group keeps the details in row order
       String_Agg(case when date is null then Transaction_Details end, ', ')
           within group (order by id) Other_Details,
       Max(case when date is not null then Debit end) Debit,
       Max(case when date is not null then Credit end) Credit
from (
    -- running count of dated rows: each detail row inherits its header's key
    select *,
           Sum(case when date is null then 0 else 1 end) over(order by id) as Transaction_Key
    from t
) t
group by Transaction_Key;
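To see why the running sum works, here is a minimal sketch that runs the same idea against the sample rows. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server, so group_concat replaces String_Agg; the id column and the column name txn_date are assumptions made for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; id is the assumed import sequence.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, txn_date TEXT, details TEXT, debit INTEGER, credit INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", [
    (1, "21 Jan 2023", "Transfer HomeBank", 500, None),
    (2, None, "Reference: 4944", None, None),
    (3, None, "Beneficiary: David John", None, None),
    (4, None, "In Bank Account: RO97INGB1111333380218", 500, None),
    (5, "20 Jan 2023", "POS Payment", 36, None),
    (6, None, "Card number: xxxx xxxx xxxx 1020", None, None),
    (7, None, "Terminal: OZKARDES A/S", None, None),
    (8, None, "Date: 19-01-2023", None, None),
])

# Step 1: the running sum goes up by one on each dated row, so every
# detail row inherits the key of the transaction header above it.
key_sql = """
    SELECT *,
           SUM(CASE WHEN txn_date IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY id)
               AS transaction_key
    FROM t
"""
keyed = conn.execute(
    "SELECT id, transaction_key FROM (" + key_sql + ") ORDER BY id"
).fetchall()

# Step 2: collapse each key to one row; header values come from the dated
# row, detail values are concatenated (group_concat skips NULLs).
grouped = conn.execute("""
    SELECT transaction_key,
           MAX(txn_date),
           MAX(CASE WHEN txn_date IS NOT NULL THEN details END),
           group_concat(CASE WHEN txn_date IS NULL THEN details END, ', '),
           MAX(CASE WHEN txn_date IS NOT NULL THEN debit END),
           MAX(CASE WHEN txn_date IS NOT NULL THEN credit END)
    FROM (""" + key_sql + """)
    GROUP BY transaction_key
    ORDER BY transaction_key
""").fetchall()

for row in grouped:
    print(row)
```

Note that SQLite does not promise a concatenation order for group_concat, which is one reason an explicit ordering matters on the SQL Server side if the details must appear in their original sequence.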

Related

Running SQL function in HUE IMPALA

I have started working on HUE IMPALA and I am stuck at a complex problem which I am not able to get through. So my table looks like this.
Id | Month | Base Rate | Payment | New Payment
a | Jan | 1 | 100 | NULL
a | Feb | 1 | 100 | NULL
a | Mar | 1 | 100 | NULL
a | Apr | 2 | NULL | NULL
a | May | 3 | NULL | NULL
a | Jun | 4 | NULL | NULL
a | Jul | 5 | NULL | NULL
My aim is to fill the New Payment column using this logic:
If Payment IS NULL then New Payment = ((current Base Rate - previous Base Rate) * previous Payment) + previous Payment, else New Payment = Payment.
E.g. for Mar, New Payment = 100.
But for Apr, New Payment = 100 + (100 * (1 - 1)) = 100.
For this I have written the following code:
SELECT id, month,
       CASE WHEN payment IS NULL THEN
                LAG(payment) OVER (PARTITION BY id ORDER BY month)
                + LAG(payment) OVER (PARTITION BY id ORDER BY month)
                  * (base_rate - LAG(base_rate) OVER (PARTITION BY id ORDER BY month))
            ELSE payment
       END AS new_payment
FROM my_table  -- placeholder: the question does not name the table
With this I get the following result:
Id | Month | Base Rate | Payment | New Payment
a | Jan | 1 | 100 | 100
a | Feb | 1 | 100 | 100
a | Mar | 1 | 100 | 100
a | Apr | 2 | NULL | 100
a | May | 3 | NULL | NULL
a | Jun | 4 | NULL | NULL
a | Jul | 5 | NULL | NULL
The problem is that New Payment stops being filled in from May onward, because the previous month (Apr) has NULL in the Payment column. Once Payment becomes NULL, I want the calculation to use the previous row's computed New Payment instead. The result I want is:
Id | Month | Base Rate | Payment | New Payment
a | Jan | 1 | 100 | 100
a | Feb | 1 | 100 | 100
a | Mar | 1 | 100 | 100
a | Apr | 2 | NULL | 100
a | May | 3 | NULL | 200
a | Jun | 4 | NULL | 400
a | Jul | 5 | NULL | 800
May -- New Payment = 100 + (100*(2-1)) = 200
June -- New Payment = 200 + (200 * (3-2))= 400
It's okay if a new variable needs to be created or if I have to split this code into multiple parts like create a table first then apply the rest of the logic. Entirely new logic which doesn't use the lag function is also welcome.
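The difficulty is that LAG only reads stored column values, so it cannot see a New Payment computed for the previous row; the calculation is a recurrence. As a rough sketch of that recurrence (not an Impala query), the Python below carries the previous computed value forward. It assumes the rate change is lagged by one month, because that is what the worked Apr, May and Jun figures imply; the function name fill_payments is made up for this illustration.

```python
from typing import List, Optional


def fill_payments(
    rates: List[float], payments: List[Optional[float]]
) -> List[Optional[float]]:
    """Carry the previous *computed* payment forward through NULL months."""
    filled: List[Optional[float]] = []
    for i, payment in enumerate(payments):
        if payment is not None:
            # A stored payment is used unchanged.
            filled.append(payment)
        elif i == 0 or filled[i - 1] is None:
            # Nothing has been computed yet, so there is nothing to carry.
            filled.append(None)
        else:
            # Assumption: the rate change is taken from the two previous
            # months (rates[i-1] - rates[i-2]), matching the worked examples.
            delta = rates[i - 1] - rates[i - 2] if i >= 2 else 0
            prev = filled[i - 1]
            filled.append(prev + prev * delta)
    return filled


rates = [1, 1, 1, 2, 3, 4, 5]                        # Jan .. Jul
payments = [100, 100, 100, None, None, None, None]   # Jan .. Jul
new_payments = fill_payments(rates, payments)
print(new_payments)
```

The rows must be processed in calendar order for this to work; ordering by the month name as text would sort Apr before Jan.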

Referencing different row in different column with same date

Please find sample data below. I want to create a new column, Payment_received, that for each row with a non-NULL Payment_Date holds (payment date - earliest SMS date for that account number). For example, for account number 12345, the calculation would be (2021-07-22 - 2021-07-20) = 2 days and for account number 99999, the calculation would be (2021-08-13 - 2021-08-10) = 3 days. I was thinking of using a CASE WHEN for this calculation, but I don't know how to reference different rows for the same account number.
SMS_Date Account Number Payment_Date Payment_received
2021-07-20 12345 NULL NULL
2021-07-21 12345 NULL NULL
2021-07-22 12345 2021-07-22 2
2021-08-10 99999 NULL NULL
2021-08-11 99999 NULL NULL
2021-08-12 99999 NULL NULL
2021-08-13 99999 2021-08-13 3
Use CROSS APPLY to find earliest SMS date for each row in original table.
SELECT
    pi.*,
    CASE
        WHEN pi.Payment_Date IS NOT NULL THEN DATEDIFF(d, smsdate.First_SMS_Date, pi.Payment_Date)
        ELSE NULL
    END AS Payment_Received
FROM
    PaymentsInfo pi
    CROSS APPLY (
        SELECT
            MIN(SMS_Date) AS First_SMS_Date
        FROM
            PaymentsInfo smst
        WHERE
            pi.Payment_Date IS NOT NULL AND pi.Account_Number = smst.Account_Number
    ) smsdate
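A different way to reach the same numbers, offered only as a sketch, is a windowed MIN partitioned by account number instead of CROSS APPLY. The example below runs it in Python's built-in sqlite3 module as a stand-in for SQL Server, so julianday subtraction takes the place of DATEDIFF; the table and column names follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PaymentsInfo (SMS_Date TEXT, Account_Number INTEGER, Payment_Date TEXT)"
)
conn.executemany("INSERT INTO PaymentsInfo VALUES (?, ?, ?)", [
    ("2021-07-20", 12345, None),
    ("2021-07-21", 12345, None),
    ("2021-07-22", 12345, "2021-07-22"),
    ("2021-08-10", 99999, None),
    ("2021-08-11", 99999, None),
    ("2021-08-12", 99999, None),
    ("2021-08-13", 99999, "2021-08-13"),
])

# MIN(...) OVER (PARTITION BY ...) attaches the earliest SMS date of the
# account to every row; the subtraction yields NULL when Payment_Date is NULL.
payment_rows = conn.execute("""
    SELECT SMS_Date,
           Account_Number,
           Payment_Date,
           CAST(julianday(Payment_Date)
                - julianday(MIN(SMS_Date) OVER (PARTITION BY Account_Number))
                AS INTEGER) AS Payment_Received
    FROM PaymentsInfo
    ORDER BY Account_Number, SMS_Date
""").fetchall()

for row in payment_rows:
    print(row)
```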

Postgres Crosstab query Dynamic pivot

Does anyone know how to create the following crosstab in Postgres?
For example I have the following table:
Store Month Sales
A Mar-2020 100
A Feb-2020 200
B Mar-2020 400
B Feb-2020 500
A Jan-2020 400
C Apr-2020 600
I would like the query to return the following crosstab, the column headings should not be hardcoded values but reflect the values in "month" column from the first table:
Store Jan-2020 Feb-2020 Mar-2020 Apr-2020
A 400 200 100 -
B - 500 400 -
C - - - 600
Is this possible?
Postgres does have a crosstab function, but I think using the built-in FILTER clause is simpler in this case:
select store,
sum(sales) filter (where month = 'Jan-2020') as Jan_2020,
sum(sales) filter (where month = 'Feb-2020') as Feb_2020,
sum(sales) filter (where month = 'Mar-2020') as Mar_2020,
sum(sales) filter (where month = 'Apr-2020') as Apr_2020
from t
group by store
order by store;
Note: This puts NULL values in the columns with no corresponding value, rather than -. If you really wanted a hyphen, you would need to convert the value to a string -- and that seems needlessly complicated.
Try this with a CASE expression inside SUM():
select
store,
sum(case when month = 'Jan-2020' then sales end) as "Jan-2020",
sum(case when month = 'Feb-2020' then sales end) as "Feb-2020",
sum(case when month = 'Mar-2020' then sales end) as "Mar-2020",
sum(case when month = 'Apr-2020' then sales end) as "Apr-2020"
from myTable
group by
store
order by
store
Output:
+---------------------------------------------------+
|store Jan-2020 Feb-2020 Mar-2020 Apr-2020|
+---------------------------------------------------+
| A 400 200 100 null |
| B null 500 400 null |
| C null null null 600 |
+---------------------------------------------------+
If you want to replace null values with 0 in the output then use coalesce()
e.g.
coalesce(sum(case when month = 'Jan-2020' then sales end), 0)
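Both queries above hardcode the month list, while the question asks for headings taken from the data. One common workaround is to build the SELECT list in client code from the distinct month values. The sketch below does that with Python's built-in sqlite3 module standing in for Postgres; the table name t follows the first answer, and it assumes English month abbreviations so that strptime can sort them chronologically.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (store TEXT, month TEXT, sales INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("A", "Mar-2020", 100), ("A", "Feb-2020", 200), ("B", "Mar-2020", 400),
    ("B", "Feb-2020", 500), ("A", "Jan-2020", 400), ("C", "Apr-2020", 600),
])

# Read the distinct months and sort them chronologically, not alphabetically.
months = sorted(
    (row[0] for row in conn.execute("SELECT DISTINCT month FROM t")),
    key=lambda m: datetime.strptime(m, "%b-%Y"),
)

# One conditional SUM per month. Values are bound as parameters; the alias
# is double-quoted (with embedded quotes doubled) so the hyphen is legal.
select_list = ", ".join(
    'SUM(CASE WHEN month = ? THEN sales END) AS "{}"'.format(m.replace('"', '""'))
    for m in months
)
sql = "SELECT store, {} FROM t GROUP BY store ORDER BY store".format(select_list)

cursor = conn.execute(sql, months)
headers = [col[0] for col in cursor.description]
pivot = cursor.fetchall()

print(headers)
for row in pivot:
    print(row)
```

The trade-off is two round trips (one to read the months, one to run the generated query), in exchange for not editing the query when a new month appears.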

how to format a table content

I am new to databases. I have executed a query like this:
SELECT name as request_name,
count(name) as no_of_open_req,
DATENAME(MM, Convert(DATE, created_date)) as month
FROM usm_request
WHERE DATENAME(YEAR, Convert(DATE,created_date)) = DATENAME(YEAR,GETDATE())
GROUP BY name,
DATENAME(YEAR, Convert(DATE, created_date)),
DATENAME(MM, Convert(DATE,created_date))
ORDER BY DATENAME(MM, Convert(DATE, created_date))
and got the result
request_name no_of_open_req month
Computer Request 1 April
Desk Phone Request 1 April
E-mail ID Creation Request 1 April
Computer Request 19 February
Desk Phone Request 12 February
Email ID Creation Request 8 February
Computer Request 45 January
Desk Phone Request 28 January
Email ID Creation Request 55 January
Computer Request 18 March
Desk Phone Request 24 March
E-mail ID Creation Request 35 March
But we need the result like
request_name January February March April
Computer Request 45 19 18 1
Desk Phone Request 28 12 24 1
E-mail ID Creation Request 55 8 35 1
Please help.
We have tried this query..
SELECT *
from (select name as [request_name],
             count(name) as [no_of_open_req],
             DATENAME(MM, Convert(DATE, created_date)) as [month]
      FROM usm_request
      WHERE DATENAME(YEAR, Convert(DATE, created_date)) = DATENAME(YEAR, GETDATE())
      group by name,
               DATENAME(YEAR, Convert(DATE, created_date)),
               DATENAME(MM, Convert(DATE, created_date))) as t
PIVOT(sum(no_of_open_req) FOR month IN (['January'],['February'],['March'],['April'])) AS PivotTable
and we are getting this as a result, with all values NULL:
request_name 'January' 'February' 'March' 'April'
Cell Phone Allocation NULL NULL NULL NULL
Computer Request NULL NULL NULL NULL
Desk Phone Request NULL NULL NULL NULL
Desk Phone Request test NULL NULL NULL NULL
Email ID Creation Request NULL NULL NULL NULL
E-mail ID Creation Request NULL NULL NULL NULL
International Dialing Request NULL NULL NULL NULL
New Employee Request NULL NULL NULL NULL
New Non-Employee Request NULL NULL NULL NULL
Onboard a Non-Employee NULL NULL NULL NULL
Onboard a Non-Employee – Step 1 NULL NULL NULL NULL
Onboard a Non-Employee - Step 2 NULL NULL NULL NULL
Thanks
Monika
I think the problem is in the following part of your PIVOT statement:
FOR month IN (['January'],['February'],['March'],['April'])
Instead it should be
FOR month IN ([January],[February],[March],[April])
So, without the single quotes. If you put single quotes inside the square brackets, the quotes become part of the name, so the statement looks for month values that themselves contain the quotes. For example, ['January'] in the PIVOT list matches only a month value of 'January', quotes included. The values in the month column do not contain single quotes, so nothing matches and every cell comes back NULL.
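The effect can be reproduced outside PIVOT. The sketch below uses Python's built-in sqlite3 module, which has no PIVOT operator, so conditional aggregation stands in for it; the table name monthly is made up for the illustration. The first SUM compares against January, the second against the seven-character value that includes the quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly (request_name TEXT, no_of_open_req INTEGER, month TEXT)"
)
conn.executemany("INSERT INTO monthly VALUES (?, ?, ?)", [
    ("Computer Request", 45, "January"),
    ("Computer Request", 19, "February"),
])

# 'January' matches the stored value; '''January''' is the SQL literal for
# the text 'January' with the quotes included, which no row contains.
unquoted, quoted = conn.execute("""
    SELECT SUM(CASE WHEN month = 'January'     THEN no_of_open_req END),
           SUM(CASE WHEN month = '''January''' THEN no_of_open_req END)
    FROM monthly
""").fetchone()

print(unquoted, quoted)
```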

query that is splitting a column into two

Hello I have an ID column and an amount column at the moment.
A value is represented as a Debit if the amount is positive. A credit if the amount is negative. I'm wondering how can I "Split" my amount column.
Select * from Test.dbo.Accounts
Produces
ID | Amount
1 | 500
2 | -600
So item 1 is a debit and item 2 is a credit. I want to query the database so that it displays as follows:
ID | Debit | Credit
1 | 500 | null
2 | null |-600
You can use a CASE expression to decide which column the amount belongs in:
SELECT id,
       CASE WHEN amount >= 0 THEN amount
            ELSE NULL
       END AS debit,
       CASE WHEN amount < 0 THEN amount
            ELSE NULL
       END AS credit
FROM Test.dbo.Accounts
I assumed 0 should go in debits but that'd be your call.
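Alternatively, select the debits and credits separately and combine them with UNION ALL:
Select ID, Amount as Debit, null as Credit
From Test.dbo.Accounts
Where Amount >= 0
Union All
Select ID, null as Debit, Amount as Credit
From Test.dbo.Accounts
Where Amount < 0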
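As a quick check that the CASE and UNION ALL approaches return the same rows for the sample data, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table is named Accounts without the Test.dbo prefix because SQLite has no such schema; the ORDER BY is added only so the two results can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (ID INTEGER, Amount INTEGER)")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 500), (2, -600)])

# One pass over the table: each CASE yields NULL when its test fails.
case_rows = conn.execute("""
    SELECT ID,
           CASE WHEN Amount >= 0 THEN Amount END AS Debit,
           CASE WHEN Amount <  0 THEN Amount END AS Credit
    FROM Accounts
    ORDER BY ID
""").fetchall()

# Two passes: debits and credits are selected separately, then stacked.
union_rows = conn.execute("""
    SELECT ID, Amount AS Debit, NULL AS Credit FROM Accounts WHERE Amount >= 0
    UNION ALL
    SELECT ID, NULL, Amount FROM Accounts WHERE Amount < 0
    ORDER BY ID
""").fetchall()

print(case_rows)
print(union_rows)
```

The CASE version reads the table once, which is usually the simpler choice; the UNION ALL version reads it twice.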