Nested table needs to be separated into columns - SQL

I have a nested table and I have tried to unnest it, using the following code:
SELECT purchaseId
     , CAST(PURCHASECREATED AS DATE) AS SALE_DT
     , (SELECT ARRAY_AGG(DISTINCT UPPER(DISCOUNTTYPE) IGNORE NULLS)
        FROM UNNEST(PURCHASEDISCOUNTS)) AS DISCOUNT_TYPE
     , (SELECT ARRAY_AGG(DISTINCT UPPER(DISCOUNTCODE) IGNORE NULLS)
        FROM UNNEST(PURCHASEDISCOUNTS)) AS DISCOUNT_CODE
     , UPPER(g.DISCOUNTTYPE) AS DISCOUNT_SALESSYSTEM
FROM `TABLE`
   , UNNEST(PURCHASELINEITEMS) f
   , UNNEST(ITEMDISCOUNTS) g
I get the following results:

+------------+---------------+---------------+----------------------+
| SALE_DT    | DISCOUNT_TYPE | DISCOUNT_CODE | DISCOUNT_SALESSYSTEM |
+------------+---------------+---------------+----------------------+
| 2023-01-04 | EMPLOYEE      | 9876          | EMPLOYEEDISCOUNT1    |
|            | SUBSCRIPTION  | 1234          |                      |
+------------+---------------+---------------+----------------------+

As you can see, SALE_DT and DISCOUNT_SALESSYSTEM have blank rows. I would like each row to be unique.
How can I separate the nested columns so they look like this:

+------------+-----------------+-----------------+----------------------+-----------------+-----------------+
| SALE_DT    | DISCOUNT_TYPE_1 | DISCOUNT_TYPE_2 | DISCOUNT_SALESSYSTEM | DISCOUNT_CODE_1 | DISCOUNT_CODE_2 |
+------------+-----------------+-----------------+----------------------+-----------------+-----------------+
| 2023-01-04 | EMPLOYEE        | SUBSCRIPTION    | EMPLOYEEDISCOUNT1    | 9876            | 1234            |
+------------+-----------------+-----------------+----------------------+-----------------+-----------------+
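One way to get that shape (a sketch, assuming BigQuery, which the UNNEST/ARRAY_AGG syntax suggests, and assuming at most two discounts per purchase) is to wrap the query above and index into the aggregated arrays with SAFE_OFFSET, which returns NULL past the end of an array:

SELECT purchaseId,
       SALE_DT,
       DISCOUNT_TYPE[SAFE_OFFSET(0)] AS DISCOUNT_TYPE_1,
       DISCOUNT_TYPE[SAFE_OFFSET(1)] AS DISCOUNT_TYPE_2,
       DISCOUNT_SALESSYSTEM,
       DISCOUNT_CODE[SAFE_OFFSET(0)] AS DISCOUNT_CODE_1,
       DISCOUNT_CODE[SAFE_OFFSET(1)] AS DISCOUNT_CODE_2
FROM (
  -- the query from the question, unchanged
  SELECT purchaseId
       , CAST(PURCHASECREATED AS DATE) AS SALE_DT
       , (SELECT ARRAY_AGG(DISTINCT UPPER(DISCOUNTTYPE) IGNORE NULLS)
          FROM UNNEST(PURCHASEDISCOUNTS)) AS DISCOUNT_TYPE
       , (SELECT ARRAY_AGG(DISTINCT UPPER(DISCOUNTCODE) IGNORE NULLS)
          FROM UNNEST(PURCHASEDISCOUNTS)) AS DISCOUNT_CODE
       , UPPER(g.DISCOUNTTYPE) AS DISCOUNT_SALESSYSTEM
  FROM `TABLE`
     , UNNEST(PURCHASELINEITEMS) f
     , UNNEST(ITEMDISCOUNTS) g
);

More than two discounts per purchase would need additional indexed columns, or a dynamic-SQL approach.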

Related

SQL query to allow for latest datasets per item

I have this table in a SQL Server database:

I would like a query that gives me the values of cw1, cw2, cw3 for a restricted date range. It should return the "latest" value of each of cw1, cw2, cw3, falling back to the previous non-null value whenever the value for the last plan_date is null.
So if the condition is plan_date between "02.01.2020" and "04.01.2020", then the result should be

1  04.01.2020  null, 9, 4
2  03.01.2020  30, 15, 2

where, for example, the "30" comes from the last previous date for item_nr 2.
You can get the last value using first_value(). Unfortunately, that is a window function, but select distinct solves that:
select distinct item_nr,
first_value(cw1) over (partition by item_nr
order by (case when cw1 is not null then 1 else 2 end), plan_date desc
) as imputed_cw1,
first_value(cw2) over (partition by item_nr
order by (case when cw2 is not null then 1 else 2 end), plan_date desc
) as imputed_cw2,
first_value(cw3) over (partition by item_nr
order by (case when cw3 is not null then 1 else 2 end), plan_date desc
) as imputed_cw3
from t;
You can add a where clause after the from.
The first_value() window function returns the first value from each partition. Each partition is ordered to put the non-NULL values first, then by time descending, so the most recent non-NULL value comes first.
The only downside is that it is a window function, so the select distinct is needed to get the most recent value for each item_nr.
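For example, applying the date condition from the question (a sketch; cw2 and cw3 follow the same pattern, and the literals assume an ISO date format):

select distinct item_nr,
       first_value(cw1) over (partition by item_nr
                              order by (case when cw1 is not null then 1 else 2 end),
                                       plan_date desc
                             ) as imputed_cw1
from t
where plan_date between '2020-01-02' and '2020-01-04';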

What is the total revenue generated by all the trips? The fare is stored in the column total_amount

The last column, total_amount, is of type decimal(9,6), and from it we need to find the revenue.
I tried:

select sum(total_amount) from taxidata;

(taxidata is the table and total_amount is the column), which gives 'NULL' as the answer.

select sum(total_amount) from taxidata group by vendor_id;

gives:

1  NULL
4  NULL
2  NULL

I also tried select sum(total_amount decimal(9,6)) from taxidata; but that failed as well.
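A sum over a column returns NULL only when every input value is NULL, which usually means the column is not being parsed at all (for example, a delimiter, column-order, or header mismatch in the table definition). A diagnostic sketch, assuming Hive and the table/column names from the question:

-- inspect raw values; if these are all NULL, the table definition
-- doesn't match the data file
select total_amount from taxidata limit 10;

-- compare total rows with rows that actually have a parsed value
select count(*) as total_rows, count(total_amount) as non_null_rows
from taxidata;

-- the failed attempt was missing CAST; the correct syntax would be
-- (using a wider precision so the sum has room):
select sum(cast(total_amount as decimal(18,6))) from taxidata;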

How to get rows which have non-null values in all periods but don't exist in another document of the same company?

I have a database where there are different components with 'current'/'historical' period values. These components can belong to different documents of the same company. "period" is a two-valued column holding either 'current' or 'historical'.
My derived table after multiple joins looks like this:

+------------+-------------+--------------+-------+------------+
| company_id | document_id | component_id | value | period     |
+------------+-------------+--------------+-------+------------+
| 1000       | 100         | 1            | 456   | current    |
| 1000       | 100         | 1            | 870   | historical |
| 1000       | 100         | 2            | 67    | current    |
| 1000       | 100         | 2            | NULL  | historical |
| 1000       | 200         | 2            | 67    | historical |
+------------+-------------+--------------+-------+------------+
I want to get component_id '1' from the above: it has non-null values in all periods for document_id '100', but it doesn't exist for document_id '200'. The values of the "document_id", "company_id" and "component_id" columns are not known in advance, so they can't be used as literals in the query.
If I understand correctly, you want components that have a document that does not have values for all periods. If so:
select distinct component_id
from t
group by component_id, document_id
having count(value) filter (where period = 'current') = 0 or
       count(value) filter (where period = 'historical') = 0;
If this is what you want, then this is one of the few occasions where select distinct can usefully be combined with group by.
I understand that you want components that, for a given company, belong to only one document_id and have no null value.
If so, you can aggregate by component_id and company_id, and implement the filtering logic in the having clause:
select company_id, component_id
from mytable t
group by company_id, component_id
having bool_and(value is not null)           -- no "value" is null
   and min(document_id) = max(document_id);  -- only one distinct "document_id"
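As a quick sanity check against the sample rows (a sketch, assuming PostgreSQL, since bool_and() is a PostgreSQL aggregate):

with mytable (company_id, document_id, component_id, value, period) as (
    values (1000, 100, 1, 456,  'current'),
           (1000, 100, 1, 870,  'historical'),
           (1000, 100, 2, 67,   'current'),
           (1000, 100, 2, null, 'historical'),
           (1000, 200, 2, 67,   'historical')
)
select company_id, component_id
from mytable
group by company_id, component_id
having bool_and(value is not null)
   and min(document_id) = max(document_id);
-- returns (1000, 1): component 1 has no null values and belongs to a single document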

How to find the minimum value in a postgres sql column which contains jsonb data?

I have a table t in postgres database. It has a column data which contains jsonb data in the following format (for each record)-
{
"20161214": {"4": ["3-14", "5-16", "642"], "9": ["3-10", "5-10", "664"] },
"20161217": {"3": ["3-14", "5-16", "643"], "7": ["3-10", "5-10", "661"] }
}
where 20161214 is the date, "4" is the month, 642 is the amount.
I need to find the minimum amount for each record of the table and the month that amount belongs to.
What I have tried:
Using the jsonb_each function to separate the key/value pairs and then using the min function. But I still can't get the month the amount belongs to.
How can this be achieved?
select j2.date
,j2.month
,j2.amount
from t
left join lateral
(select j1.date
,j2.month
,(j2.value->>2)::numeric as amount
from jsonb_each (t.data) j1 (date,value)
left join lateral jsonb_each (j1.value) j2 (month,value)
on true
order by amount
limit 1
) j2
on true
+----------+-------+--------+
| date | month | amount |
+----------+-------+--------+
| 20161214 | 4 | 642 |
+----------+-------+--------+
Alternatively (without joins):
select
min(case when amount = min_amount then month end) as month,
min_amount as amount
from (
select
key as month,
(select min((value->>2)::int) from jsonb_each(value)) as amount,
min((select min((value->>2)::int) from jsonb_each(value))) over(partition by rnum) as min_amount,
rnum
from (
select
(jsonb_each(data)).*,
row_number() over() as rnum
from t
) t
) t
group by
rnum, min_amount;
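For reference, a minimal setup to try either query against the sample document from the question (assuming PostgreSQL 9.4+ for jsonb and lateral joins):

create table t (data jsonb);

insert into t (data) values ('{
  "20161214": {"4": ["3-14", "5-16", "642"], "9": ["3-10", "5-10", "664"]},
  "20161217": {"3": ["3-14", "5-16", "643"], "7": ["3-10", "5-10", "661"]}
}');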

Fill values "down" when pivoting

I'm doing a PIVOT command. My row label is a date field. My columns are locations like [NY], [TX], etc. Some of the values from the source data are null, but once it's pivoted I'd like to "fill down" those nulls with the last known value in date order.
That is, if column NY has a value for 1/1/2010 but null for 1/2/2010, I want to fill the value from 1/1/2010 down into 1/2/2010, and into any other null dates below it until another value exists. Basically I'm filling in the null gaps with the data from the closest earlier date that has data, for each of the columns.
An example of my pivot query I currently have is:
SELECT ReadingDate, [NY], [TX], [WI]
FROM
    (SELECT NAME As 'NodeName',
            CAST(FORMAT(readingdate, 'M/d/yyyy') as Date) As 'ReadingDate',
            myvalue As 'Value'
     FROM MyTable) as SourceData
PIVOT (SUM(Value) FOR NodeName IN ([NY], [TX], [WI])) as PivotTable
ORDER BY ReadingDate
But I'm not sure how to do this "fill down" to fill in the null values.
Sample source data:

1/1/2010, TX, 1
1/1/2010, NY, 5
1/2/2010, NY, null
1/1/2010, WI, 3
1/3/2010, WI, 7
...

Notice how there is no WI row for 1/2 and no NY row for 1/3, which would produce nulls in the pivot result. There is also an explicit null record, which produces a null as well. For NY, once pivoted, 1/2 needs to be filled in with 5 because it's the last known value; 1/3 also needs to be filled in with 5, since that record doesn't even exist in the source but shows up as a null in the pivot because another location has a row for that date.
This can be a pain in SQL Server. ANSI SQL supports a nice feature on LAG(), called IGNORE NULLS, but SQL Server doesn't (yet) support it. I would start by using conditional aggregation (personal preference):
select cast(readingdate as date) as readingdate,
       sum(case when name = 'NY' then value end) as NY,
       sum(case when name = 'TX' then value end) as TX,
       sum(case when name = 'WI' then value end) as WI
from mytable
group by cast(readingdate as date);
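For the sample source data above (reading the 1/2/2010 NY record as an explicit null), this produces the pivoted rows but leaves the gaps:

+-------------+------+------+------+
| readingdate | NY   | TX   | WI   |
+-------------+------+------+------+
| 2010-01-01  | 5    | 1    | 3    |
| 2010-01-02  | NULL | NULL | NULL |
| 2010-01-03  | NULL | NULL | 7    |
+-------------+------+------+------+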
So, we have to be a bit more clever. We can assign the NULL values into groups based on the number of non-NULL values before them. Fortunately, this is easy to do using a cumulative COUNT() function. Then, we can get the one non-NULL value in this group by using MAX() (or MIN()):
with t as (
select cast(readingdate as date) as readingdate,
sum(case when name = 'NY' then value end) as NY,
sum(case when name = 'TX' then value end) as TX,
sum(case when name = 'WI' then value end) as WI
from mytable
group by cast(readingdate as date)
),
t2 as (
select t.*,
count(NY) over (order by readingdate) as NYgrp,
count(TX) over (order by readingdate) as TXgrp,
count(WI) over (order by readingdate) as WIgrp
from t
)
select readingdate,
coalesce(NY, max(NY) over (partition by NYgrp)) as NY,
coalesce(TX, max(TX) over (partition by TXgrp)) as TX,
coalesce(WI, max(WI) over (partition by WIgrp)) as WI
from t2;
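On that intermediate result, the cumulative counts are NYgrp = 1, 1, 1, TXgrp = 1, 1, 1 and WIgrp = 1, 1, 2, so every null row lands in the same group as the last preceding non-NULL value and gets filled:

+-------------+----+----+----+
| readingdate | NY | TX | WI |
+-------------+----+----+----+
| 2010-01-01  | 5  | 1  | 3  |
| 2010-01-02  | 5  | 1  | 3  |
| 2010-01-03  | 5  | 1  | 7  |
+-------------+----+----+----+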