Converting Columns into Rows in Pentaho

Converting Columns into Rows in Pentaho - pentaho

i have a table below that looks like the current one below. How can i put the values for date, count2, amount2 from columns into rows just ONCE?
current:
date type count amount type2 date count2 amount2
1/1/20 00 5 13.49 ZZZ 1/1/20 0 5.00
1/1/20 12 6 14.69 ZZZ 1/1/20 0 5.00
1/1/20 11 10 20.66 ZZZ 1/1/20 0 5.00
expected
date type count amount
1/1/20 12 6 14.69
1/1/20 12 6 14.69
1/1/20 11 10 20.66
1/1/20 ZZZ 0 5.00

what can you do it just splits data flow into two select steps.
in the first select step use your first three columns.
in the second select step use your remaining other columns. and use filter rows to remove redundancy. then use dummy step and link to both select steps.

Related

PostgreSQL query not fetching correct result for the conditional comparison of aggregate function

I have a products table with following values with id as INT and profit as numeric data type
id profit
1 6.00
2 3.00
3 2.00
4 3.00
5 2.00
6 8.00
7 4.00
8 3.00
9 1.00
10 4.00
11 10.00
12 3.00
13 6.00
14 5.00
15 2.00
16 7.00
17 6.00
18 5.00
19 2.00
20 16.00
21 3.00
22 6.00
23 5.00
24 5.00
25 1.00
26 4.00
27 1.00
28 7.00
29 11.00
30 2.00
31 1.00
32 3.00
33 2.00
34 5.00
35 4.00
I want to fetch id's which have profit more than average profit
My QUERY:
SELECT product_id,profit
FROM products
GROUP BY product_id,profit
HAVING profit > AVG(profit)::INT
But, the above query return's empty result.

when you execute the group by query, the records are grouped based on the parameters and then where/having clauses are applied.
so first group is student id 1 and further grouped by profit 6.00 making its average as 6.00 and with having condition profit >avg(profit), there are no records which match the criteria.
same for all other records. that is why you get empty result set as no number can be > itself.
based on your description though, it can be achieved by multiple selects.
select * from products where profit >(select avg(profit) from products)

Selecting records from slowly changing table with a set of dates

I have a slowly changing table,a new row is created each time any of the source fields are changed. Some metadata is added to show when that version was valid. This is a simplified example(dates are dd/mm/yyyy format) that doesn't show the fields which have changed.
Startdate
Enddate
Currentrecord
unique id
serial_number
15/12/2020
31/12/2020
0
1
2345
15/12/2020
8/3/2021
0
2
1234
19/9/2020
15/2/2021
0
3
2345
15/12/2020
8/3/2021
0
4
3456
9/3/2021
10/3/2021
0
5
3456
16/2/2021
10/3/2021
0
6
2345
9/3/2021
26/3/2021
0
7
1234
27/3/2021
2/5/2021
0
8
1234
11/3/2021
17/5/2021
0
9
3456
3/3/2021
27/4/2021
0
10
4567
20/1/2021
7/4/2021
0
11
5678
3/5/2021
30/6/2021
1
12
1234
25/5/2021
31/5/2021
0
13
2345
8/4/2021
22/5/2021
0
14
5678
1/6/2021
26/6/2021
0
15
2345
18/5/2021
3/6/2021
0
16
3456
27/6/2021
2/8/2021
0
17
2345
28/4/2021
28/6/2021
0
18
4567
23/5/2021
6/9/2021
0
19
5678
4/6/2021
28/6/2021
0
20
3456
29/6/2021
25/7/2021
0
21
3456
3/8/2021
31/12/9999
1
22
2345
26/7/2021
31/12/9999
1
23
3456
15/10/2021
31/12/9999
1
24
4567
7/9/2021
1/11/2021
0
25
5678
22/9/2021
10/11/2021
0
26
6789
2/11/2021
16/11/2021
0
27
5678
17/11/2021
21/11/2021
0
28
5678
15/7/2021
31/12/9999
1
29
7891
22/11/2021
31/12/9999
1
30
5678
26/11/2021
31/12/9999
1
31
6789
15/6/2021
31/12/9999
1
32
8912
There is only one record for each serial_number for any given point in time (i.e. the dates ranges will not overlap for identical serial_numbers) but there might be gaps between episodes for a some serial_numbers (representing something leaving and returning after a gap in service).
I want to supply an arbitrary list of datetimes, say midnight on 01/01/2021, 15/03/2021, 27/05/2021. 23/10/2021. I want to return a set of records, containing every record which was in effect on each of the dates, with each row labelled with the date it was selected by. So the above example should return this.
date
unique id
serial_number
1/1/2021
2
1234
1/1/2021
3
2345
1/1/2021
4
3456
15/3/2021
7
1234
15/3/2021
9
3456
15/3/2021
10
4567
15/3/2021
11
5678
27/5/2021
12
1234
27/5/2021
13
2345
27/5/2021
16
3456
27/5/2021
18
4567
27/5/2021
19
5678
23/10/2021
22
2345
23/10/2021
23
3456
23/10/2021
24
4567
23/10/2021
25
5678
23/10/2021
26
6789
23/10/2021
29
7891
23/10/2021
32
8912
I can see how to do this with a cursor, stepping through each date putting them into a variable and using something like
select #date, [unique id], serial_number
from example
where #date between start_date and end_date
to get the rows.
I can’t work out a pattern that would do it in a set based approach. My preferred SQL version is TSQL. Sorry as this is almost certainly a repeat, but I can't find a form of words that hits a worked example.

You can use a temporary table to accomplish this.
CREATE TABLE #RequestedDates([Date] DATE)
You insert your dates you want into a temporary table.
INSERT INTO #RequestedDates([Date])
VALUES ('2021-01-01'), ('2021-03-15') /*Other dates*/
And then you join with the temporary table and use the between clause to get the valid results.
SELECT rd.[Date]
, t.UniqueId
, t.SerialNumber
FROM MyTable t
INNER JOIN #RequestedDates rd on rd.[Date] BETWEEN t.StartDate AND t.EndDate
ORDER BY rd.[Date]
, t.UniqueId
, t.SerialNumber

You can join to VALUES with the dates you need.
Then join the datetimes on the range.
SELECT
datetimes.dt as [date]
, t.[unique id]
, t.serial_number
FROM example t
JOIN (VALUES
(cast('2021-01-01 00:00:00' as datetime)),
('2021-03-15 00:00:00'),
('2021-05-27 00:00:00'),
('2021-10-23 00:00:00')
) datetimes(dt)
ON datetimes.dt >= t.start_date
AND datetimes.dt <= t.end_date
ORDER BY datetimes.dt, t.[unique id], t.serial_number

Combine 2 queries into 1 table with user entering the parameters twice for both queries

My project needs me to come up with a query which can compare any 2 months data side by side, by just keying in the dates of the 2 months.
I have done 2 separate queries that can only do single month data because I can only enter 1 date per query. I tried to combine this 2 separate query into 1 single query by selecting the columns from each table but it gives me blank data.
I will need some help in combining the 2 queries together, into 1 table as a view form and allowing the user to key in the 2 dates they want to get their data from. Below will be the 2 queries result I can achieve and also the end result I want to achieve from combining this 2 queries.
Conditions to merge the two table is that the company will be the same for both dates, and the item the company bought (if any). If the company did not buy the item on the month , data should be blank.
Query 1 : User will enter "First month" they want the data from
Inv Number Company Date Item Price Quantity Total
123 ABC 1/1/2018 Table 5 3 15
123 ABC 1/1/2018 Chair 2 4 8
345 XYZ 1/1/2018 Table 5 5 25
345 XYZ 1/1/2018 Chair 2 6 12
Query 2: User will enter "Second Month" they want the data from
Inv Number Company Date Item Price Quantity Total
999 ABC 1/2/2018 Table 4 3 12
999 ABC 1/2/2018 Chair 2 5 10
899 XYZ 1/2/2018 Table 4 3 12
End result : User will be allowed to key in both dates they want the data from
Inv Number Company Date Item Price Quantity Total Date Item Price Quantity Total Inv Number
123 ABC 1/1/2018 Table 5 3 15 1/2/2018 Table 4 3 12 999
123 ABC 1/1/2018 Chair 2 4 8 1/2/2018 Chair 2 5 10 999
345 XYZ 1/1/2018 Table 5 5 25 1/2/2018 Table 4 3 12 899
345 XYZ 1/1/2018 Chair 2 6 12

How to write the query to make report by month in sql

I have the receiving and sending data for whole year. so i want to built the monthly report base on that data with the rule is Fisrt in first out. It means is the first receiving will be sent out first ...
DECLARE #ReceivingTbl AS TABLE(Id INT,ProId int, RecQty INT,ReceivingDate DateTime)
INSERT INTO #ReceivingTbl
VALUES (1,1001,210,'2019-03-12'),
(2,1001,315,'2019-06-15'),
(3,2001,500,'2019-04-01'),
(4,2001,10,'2019-06-15'),
(5,1001,105,'2019-07-10')
DECLARE #SendTbl AS TABLE(Id INT,ProId int, SentQty INT,SendMonth int)
INSERT INTO #SendTbl
VALUES (1,1001,50,3),
(2,1001,100,4),
(3,1001,80,5),
(4,1001,80,6),
(5,2001,200,6)
SELECT * FROM #ReceivingTbl ORDER BY ProId,ReceivingDate
SELECT * FROM #SendTbl ORDER BY ProId,SendMonth
Id ProId RecQty ReceivingDate
1 1001 210 2019-03-12
2 1001 315 2019-06-15
5 1001 105 2019-07-10
3 2001 500 2019-04-01
4 2001 10 2019-06-15
Id ProId SentQty SendMonth
1 1001 50 3
2 1001 100 4
3 1001 80 5
4 1001 80 6
5 2001 200 6
--- And the below is what i want:
Id ProId RecQty ReceivingDate ... Mar Apr May Jun
1 1001 210 2019-03-12 ... 50 100 60 0
2 1001 315 2019-06-15 ... 0 0 20 80
5 1001 105 2019-07-10 ... 0 0 0 0
3 2001 500 2019-04-01 ... 0 0 0 200
4 2001 10 2019-06-15 ... 0 0 0 0
Thanks!

Your question is not clear to me.
If you want to purely use the FIFO approach, therefore ignore any data the table contains, you necessarely need to order by ID, which in your example you are providing, and looks like it is in order of insert.
The first line inserted should be also the first line appearing in the select (FIFO), in order to do so you have to use:
ORDER BY Id ASC
Which will place the lower value of the ID first (1, 2, 3, ...)
To me though, this doesn't make much sense, so pay attention to the meaning o the data you actually have and leverage dates like ReceivingDate, and order by that, maybe even filtering by month of the date, below an example for January data:
WHERE MONTH(ReceivingDate) = 1

Subtract nonconsecutive values in same row in t-SQL

I have a data table that has annual data points and quarterly data points. I want to subtract the quarterly data points from the corresponding prior annual entry, e.g. Annual 2014 - Q3 2014, using t-SQL. I have an id variable for each entry, plus a reconcile id variable that shows which quarterly entry corresponds to which annual entry. See below:
CurrentDate PreviousDate Value Entry Id Reconcile Id Annual/Quarterly
9/30/2012 9/30/2011 112 2 3 Annual
9/30/2013 9/30/2012 123 1 2 Annual
9/30/2014 9/30/2013 123.5 9 1 Annual
12/31/2013 9/30/2014 124 4 1 Quarterly
3/31/2014 12/31/2013 124.5 5 1 Quarterly
6/30/2014 3/31/2014 125 6 1 Quarterly
9/30/2014 6/30/2014 125.5 7 1 Quarterly
12/31/2014 9/30/2014 126 10 9 Quarterly
3/31/2015 12/31/2014 126.5 11 9 Quarterly
6/30/2015 3/31/2015 127 12 9 Quarterly
For example, Reconcile ID 9 for the quarterly entries corresponds to Entry ID 9, which is an annual entry.
I have code to just subtract the prior entry from the current entry, but I cannot figure out how to subtract quarterly entries from annual entries where the Entry ID and Reconcile ID are the same.
Here is the code I am using, which is resulting in the right calculation, but increasing the number of results by many rows. I have also tried this as an inner join. I only want the original 10 rows, plus a new difference column:
SELECT DISTINCT T1.[EntryID]
, [T1].[RECONCILEID]
, [T1].[CurrentDate]
, [T1].[Annual_Quarterly]
, [T1].[Value]
, [T1].[Value]-T2.[Value] AS Difference
FROM Table T1
LEFT JOIN Table T2 ON T2.EntryID = T1.RECONCILEID;

Your code should be fine, here's the results I'm getting:
EntryId Annual_Quarterly CurrentDate ReconcileId Value recVal diff
2 Annual 9/30/2012 3 112
1 Annual 9/30/2013 2 123 112 11
9 Annual 9/30/2014 1 123.5 123 0.5
4 Quarterly 12/31/2013 1 124 123 1
5 Quarterly 3/31/2014 1 124.5 123 1.5
6 Quarterly 6/30/2014 1 125 123 2
7 Quarterly 9/30/2014 1 125.5 123 2.5
10 Quarterly 12/31/2014 9 126 123.5 2.5
11 Quarterly 3/31/2015 9 126.5 123.5 3
12 Quarterly 6/30/2015 9 127 123.5 3.5
with your data and this SQL:
SELECT
tr.EntryId,
tr.Annual_Quarterly,
tr.CurrentDate,
tr.ReconcileId,
tr.Value,
te.Value AS recVal,
tr.[VALUE]-te.[VALUE] AS diff
FROM
t AS tr LEFT JOIN
t AS te ON
tr.ReconcileId = te.EntryId
ORDER BY
tr.Annual_Quarterly,
tr.CurrentDate;

Your question is a bit vague as far as how you're wanting to subtract these values, but this should give you some idea.
Select T1.*, T1.Value - Coalesce(T2.Value, 0) As Difference
From Table T1
Left Join Table T2 On T2.[Entry Id] = T1.[Reconcile Id]

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Converting Columns into Rows in Pentaho - pentaho

what can you do it just splits data flow into two select steps. in the first select step use your first three columns. in the second select step use your remaining other columns. and use filter rows to remove redundancy. then use dummy step and link to both select steps.

Related

PostgreSQL query not fetching correct result for the conditional comparison of aggregate function

Selecting records from slowly changing table with a set of dates

Combine 2 queries into 1 table with user entering the parameters twice for both queries

How to write the query to make report by month in sql

Subtract nonconsecutive values in same row in t-SQL

Categories

Resources