How to pivot in PostgreSQL

I have a table like the following, and I would like to transform it.
year month week type count
2021 1 1 A 5
2021 1 1 B 6
2021 1 1 C 7
2021 1 2 A 0
2021 1 2 B 8
2021 1 2 C 9
I'd like to pivot it as follows.
year month week A B C
2021 1 1 5 6 7
2021 1 2 0 8 9
I tried the following statement, but it returned a lot of null columns.
I also wonder whether I must add columns one by one whenever a new type is added.
select
year,
month,
week,
case when type in ('A') then count end as A,
case when type in ('B') then count end as B,
case when type in ('C') then count end as C
from
table
If anyone has an opinion, please let me know.
Thanks

You can either use the FILTER clause:
SELECT
year, month, week,
MAX("count") FILTER (WHERE type = 'A') as A, -- 2
MAX("count") FILTER (WHERE type = 'B') as B,
MAX("count") FILTER (WHERE type = 'C') as C
FROM mytable
GROUP BY year, month, week -- 1
ORDER BY year, month, week
or you can use the CASE clause:
SELECT
year, month, week,
MAX (CASE WHEN type = 'A' THEN "count" END) AS A,
MAX (CASE WHEN type = 'B' THEN "count" END) AS B,
MAX (CASE WHEN type = 'C' THEN "count" END) AS C
FROM mytable
GROUP BY year, month, week
ORDER BY year, month, week
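In both cases you need a GROUP BY over year, month, week (marked -- 1 in the first query), so that the rows of one week collapse into a single row.
This makes an aggregate function such as MAX() or SUM() necessary (marked -- 2). Finally, you need some kind of filter (CASE or FILTER) so that each output column aggregates only the rows of its own type.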
Additionally: Please note that the words count, year, month, week are keywords of SQL. To avoid any complications you should think about other column names.
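On the second part of the question (adding columns one by one when a new type appears): a static SQL statement has a fixed column list, so one common workaround is to generate the pivot query from the distinct type values. Below is a minimal sketch of that idea, not something taken from the answer above. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL, and build_pivot_sql is a hypothetical helper name. It builds the same CASE-based query shown above.

```python
import sqlite3


def build_pivot_sql(types):
    """Build a CASE-based pivot query with one output column per type value.

    Hypothetical helper: string literals and identifiers are escaped by
    doubling quotes so that unusual type names cannot break the statement.
    """
    cols = ",\n".join(
        "  MAX(CASE WHEN type = '{lit}' THEN \"count\" END) AS \"{ident}\"".format(
            lit=t.replace("'", "''"), ident=t.replace('"', '""')
        )
        for t in types
    )
    return (
        "SELECT year, month, week,\n" + cols + "\n"
        "FROM mytable\n"
        "GROUP BY year, month, week\n"
        "ORDER BY year, month, week"
    )


# In-memory SQLite database standing in for the real PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE mytable (year INT, month INT, week INT, type TEXT, "count" INT)'
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (2021, 1, 1, "A", 5), (2021, 1, 1, "B", 6), (2021, 1, 1, "C", 7),
        (2021, 1, 2, "A", 0), (2021, 1, 2, "B", 8), (2021, 1, 2, "C", 9),
    ],
)

# Step 1: read the type values that currently exist.
types = [row[0] for row in conn.execute("SELECT DISTINCT type FROM mytable ORDER BY type")]
# Step 2: generate the pivot statement from them and run it.
pivot_rows = conn.execute(build_pivot_sql(types)).fetchall()
print(pivot_rows)
```

With the sample data this yields one row per (year, month, week) with the A, B and C counts side by side. A new type then only needs a re-run of the two steps, at the cost of an extra round trip and a result whose column set is not known in advance.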

This question has been asked many times, & there are decent (even dynamic) solutions. While CROSSTAB() is available in recent versions of Postgres, not everyone has sufficient user privileges to install the prerequisite extension.
One such solution involves a temp type (temp table) created by an anonymous function & JSON expansion of the resultant type.
See also: DB Fiddle (UK): https://dbfiddle.uk/Sn7iO4zL
Related question: How to pivot or crosstab in postgresql without writing a function?

Related

LAG function alternative. I need the results for the missing year in between

I have this table so far. However, I would like to obtain a result for 2019 as well; there are no records for that year, so its count should be 0. Are there any alternatives to the LAG function?
ID Year Year_Count
1 2018 10
1 2020 20
Whenever I use the LAG function in SQL it gives me the result for 2018. However, I would like to get 0 for 2019 and then 10 for 2018:
LAG(YEAR_COUNT) OVER (PARTITION BY ID ORDER BY YEAR) AS previous_year_count
untested notepad scribble
CASE
WHEN 1 = YEAR - LAG(YEAR) OVER (PARTITION BY ID ORDER BY YEAR)
THEN LAG(YEAR_COUNT) OVER (PARTITION BY ID ORDER BY YEAR)
ELSE 0
END AS previous_year_count
I'll add on to Nick's comment here with an example.
The YEARS CTE creates the table of years he suggested, and the RECORDS CTE reproduces the data posted above. They are then joined, with COALESCE filling in the null values left by the LEFT JOIN (I filled ID with 0, not sure what your case would be).
You need to LEFT JOIN onto the YEARS table and select the YEAR column from the YEARS table in the final query; otherwise you'd end up with only 2018 and 2020, or with those years plus some null values.
WITH
YEARS AS
(
SELECT 2016 AS YEAR UNION ALL
SELECT 2017 UNION ALL
SELECT 2018 UNION ALL
SELECT 2019 UNION ALL
SELECT 2020 UNION ALL
SELECT 2021 UNION ALL
SELECT 2022
)
,
RECORDS AS
(
SELECT 1 ID, 2018 YEAR, 10 YEAR_COUNT UNION ALL
SELECT 1, 2020, 20)
SELECT
COALESCE(ID, 0) AS ID,
Y.YEAR,
COALESCE(YEAR_COUNT, 0) AS YEAR_COUNT
FROM YEARS AS Y
LEFT JOIN RECORDS AS R
ON R.YEAR = Y.YEAR
Here is the dbfiddle so you can visualize - https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=9e777ad925b09eb8ba299d610a78b999
Vertica SQL is not an available test environment, so this may not work directly but should at least get you on the right track.
The LAG function alone would not work to get 2019, for a few reasons:
It's a window function and can only grab from rows that are present in the data. The default offset for LAG is 1, i.e. LAG(YEAR_COUNT, 1), which means one row back, not one year back.
Expressions in the SELECT list typically can't add rows to a result; you need to bring in the missing rows with JOINs.
If 2019 does exist in an upstream table and you're using GROUP BY to get the year count, it's possible that a WHERE clause is excluding the data.
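To make the combination concrete, here is a small sketch that first fills in the missing years and only then applies LAG. It is not the code from the answer: instead of a hard-coded list it generates the years with a recursive CTE, and it runs on Python's built-in sqlite3 module rather than on Vertica or MySQL, so the syntax for generating the year series is an assumption that may need adapting to your engine.

```python
import sqlite3

# In-memory SQLite database with the two rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INT, year INT, year_count INT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)", [(1, 2018, 10), (1, 2020, 20)]
)

sql = """
WITH RECURSIVE years(year) AS (
    -- every year between the first and last recorded year
    SELECT MIN(year) FROM records
    UNION ALL
    SELECT year + 1 FROM years WHERE year < (SELECT MAX(year) FROM records)
),
filled AS (
    -- one row per id and year; missing years get a count of 0
    SELECT i.id, y.year, COALESCE(r.year_count, 0) AS year_count
    FROM (SELECT DISTINCT id FROM records) AS i
    CROSS JOIN years AS y
    LEFT JOIN records AS r ON r.id = i.id AND r.year = y.year
)
SELECT id, year, year_count,
       -- the third argument is the default used when there is no previous row
       LAG(year_count, 1, 0) OVER (PARTITION BY id ORDER BY year) AS previous_year_count
FROM filled
ORDER BY id, year
"""
gap_rows = conn.execute(sql).fetchall()
for row in gap_rows:
    print(row)
```

Because the gap is filled before the window function runs, "one row back" and "one year back" mean the same thing: the 2019 row sees 10 (from 2018) and the 2020 row sees 0 (from the filled-in 2019).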

How can I view 1 result per line, whether the 2 rows and 2 columns match?

SELECT ID, YEAR, C1, C2
FROM test_table
ID YEAR C1 C2
1 2019 1 1
1 2018 1 1
How can I write a SQL statement that shows me if the ones are matching for both years for each column?
I am thinking of something like this, but it is wrong:
CASE
WHEN (YEAR = 2018 AND C1 = 1) = (YEAR = 2019 AND C1 = 1)
THEN 'matching records'
ELSE 'not matching'
END AS C1_Decision
I would like to achieve the following result
ID YEAR C1_Decision
1 2019 Matching
1 2018 Matching
LAG() and LEAD() can help here:
To test if the previous year matches the current record:
SELECT ID, YEAR, CASE WHEN C1 = LAG(C1) OVER (PARTITION BY ID ORDER BY YEAR) THEN 'Matching' ELSE 'Not Matching' END AS C1_Decision
FROM test_table
If you are interested, these fit into a wider family of functions called "Window Functions", which can be very handy for aggregating and comparing other rows in the result set based on partitions of data within that result set: https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AnalyzingData/SQLAnalytics/WindowPartitioning.htm
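One thing to keep in mind with LAG is that the earliest year of each ID has no previous row, so the comparison yields NULL there and that row falls into the ELSE branch. If every row of an ID should carry the same verdict, as in the desired output, a possible alternative is to compare the windowed MIN and MAX of the column. The sketch below runs on Python's built-in sqlite3 module as a stand-in for Vertica; the rows for ID 2 are made-up data added only to show a non-matching case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (id INT, year INT, c1 INT, c2 INT)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?, ?)",
    [
        (1, 2019, 1, 1), (1, 2018, 1, 1),   # rows from the question
        (2, 2019, 1, 0), (2, 2018, 0, 0),   # hypothetical rows: c1 differs
    ],
)

sql = """
SELECT id, year,
       -- all c1 values of an id are equal exactly when MIN and MAX coincide
       CASE WHEN MIN(c1) OVER (PARTITION BY id) = MAX(c1) OVER (PARTITION BY id)
            THEN 'Matching' ELSE 'Not Matching' END AS c1_decision
FROM test_table
ORDER BY id, year DESC
"""
match_rows = conn.execute(sql).fetchall()
for row in match_rows:
    print(row)
```

Both rows of ID 1 come out as Matching and both rows of ID 2 as Not Matching. The same expression can be repeated for C2.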

SQL SUM and value conversion

I'm looking to transform data in my SUM query to acknowledge that some numeric values are negative in nature, although not represented as such.
I look for customer balance where the example dataset includes also credit transactions that are not written as negative in the database (although all records that have value C for credit in inv_type column should be treated as negative in the SQL SUM function). As an example:
INVOICES
inv_no inv_type cust_no value
1 D 25 10
2 D 35 30
3 C 25 5
4 D 25 50
5 C 35 2
My simple SUM function would not give me the correct answer:
select cust_no, sum(value) from INVOICES
group by cust_no
This query would obviously sum the balance of customer no. 25 to 65 and of customer no. 35 to 32, although the expected answers are 10 - 5 + 50 = 55 and 30 - 2 = 28.
Should I perhaps utilize CAST function somehow? Unfortunately I'm not up to date on the underlying db engine, however good chance of it being of IBM origin. Most of the basic SQL code has worked out so far though.
You can use the case expression inside of a sum(). The simplest syntax would be:
select cust_no,
sum(case when inv_type = 'C' then - value else value end) as total
from invoices
group by cust_no;
Note that value could be a reserved word in your database, so you might need to escape the column name.
You should be able to write a projection (select) first to obtain a signed value column based on inv_type or whatever, and then do a sum over that.
Like this:
select cust_no, sum(value) from (
select cust_no
, case when inv_type='D' then [value] else -[value] end [value]
from INVOICES
) SUMS
group by cust_no
You can put an expression in the sum that calculates a negative value if the invoice is a credit:
select
cust_no,
sum
(
case inv_type
when 'C' then -[value]
else [value]
end
) as [Total]
from INVOICES
group by cust_no
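As a quick check against the sample data from the question, here is a small sketch of the CASE-inside-SUM approach. It runs on Python's built-in sqlite3 module, not on the (unknown, possibly IBM) engine from the question, and it quotes the column as "value" in case that word is reserved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE invoices (inv_no INT, inv_type TEXT, cust_no INT, "value" INT)'
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [(1, "D", 25, 10), (2, "D", 35, 30), (3, "C", 25, 5), (4, "D", 25, 50), (5, "C", 35, 2)],
)

sql = """
SELECT cust_no,
       -- credits (inv_type = 'C') are negated before summing
       SUM(CASE WHEN inv_type = 'C' THEN -"value" ELSE "value" END) AS total
FROM invoices
GROUP BY cust_no
ORDER BY cust_no
"""
balances = dict(conn.execute(sql).fetchall())
print(balances)  # {25: 55, 35: 28}
```

This reproduces the expected balances from the question, 55 for customer 25 and 28 for customer 35.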

How to write a LEFT JOIN in BigQuery's Standard SQL?

We have a query that works in BigQuery's Legacy SQL. How do we write it in Standard SQL so it works?
SELECT Hour, Average, L.Key AS Key FROM
(SELECT 1 AS Key, *
FROM test.table_L AS L)
LEFT JOIN
(SELECT 1 AS Key, Avg(Total) AS Average
FROM test.table_R) AS R
ON L.Key = R.Key ORDER BY Hour ASC
Currently the error it gives is:
Equality is not defined for arguments of type ARRAY<INT64> at [4:74]
BigQuery has two modes for queries: Legacy SQL and Standard SQL. We have looked at the BigQuery Standard SQL documentation and also see just one SO answer on Standard SQL joins in BigQuery - but so far, it is unclear to us what the key change needed might be.
Table_L looks like this:
Row Hour
1 A
2 B
3 C
Table_R looks like this:
Row Value
1 10
2 20
3 30
Results Desired:
Row Hour Average(OfR) Key
1 A 20 1
2 B 20 1
3 C 20 1
How do we rewrite this BigQuery Legacy SQL query to work in Standard SQL?
Based on your recent update in question and comments - try below
WITH Table_L AS (
SELECT 1 AS Row, 'A' AS Hour UNION ALL
SELECT 2 AS Row, 'B' AS Hour UNION ALL
SELECT 3 AS Row, 'C' AS Hour
),
Table_R AS (
SELECT 1 AS Row, 10 AS Value UNION ALL
SELECT 2 AS Row, 20 AS Value UNION ALL
SELECT 3 AS Row, 30 AS Value
)
SELECT
Row,
Hour,
(SELECT AVG(Value) FROM Table_R) AS AverageOfR,
1 AS Key
FROM Table_L
The above is for testing, with the sample data inlined as CTEs.
The query you should run in "production" is
SELECT
Row,
Hour,
(SELECT AVG(Value) FROM Table_R) AS AverageOfR,
1 AS Key
FROM Table_L
If for some reason you are bound to a JOIN, use the CROSS JOIN version below
SELECT
Row,
Hour,
AverageOfR,
1 AS Key
FROM Table_L
CROSS JOIN ((SELECT AVG(Value) AS AverageOfR FROM Table_R))
or the LEFT JOIN version below with the Key field involved (in case Key is really important for your logic, which somehow I feel is true)
SELECT
Row,
Hour,
AverageOfR,
L.Key AS Key
FROM (SELECT 1 AS Key, Row, Hour FROM Table_L) AS L
LEFT JOIN ((SELECT 1 AS Key, AVG(Value) AS AverageOfR FROM Table_R)) AS R
ON L.Key = R.Key
Your error message suggests that key is not a column in table_L. If it is not, then don't include it in the query.
It looks like you simply want the average of the total from table_R. You can approach this as:
SELECT l.*, r.average
FROM test.table_L as l CROSS JOIN
(SELECT Avg(Total) as average
FROM test.table_R
) R
ORDER BY l.hour ASC;

How to divide two values from the different row

I have used this formula.
Quote change = (current month data / previous month data) * 100
Then my data stored on SQL SERVER table look like below :
id DATE DATA
1 2015/01/01 10
2 2015/02/01 20
3 2015/03/01 30
4 2015/04/01 40
5 2015/05/01 50
6 2015/06/01 60
7 2015/07/01 70
8 2015/08/01 80
9 2015/09/01 90
How can I implement this formula in a SQL function?
For example:
current month is 2015/02/01
Quote change = (Current Month Data / Previous Month Data) * 100
Quote change = (20/10)*100
If the current date is 2015/01/01, there is no data before it, so I need to show 0 or #.
SQL Server 2012 has a window function called LAG that is very useful in situations like this.
LAG returns the value of a specific column from the previous row (as determined by the ORDER BY part of the OVER clause).
Try this:
;With cte as
(
SELECT Id, Date, Data, LAG(Data) OVER(ORDER BY Date) As LastMonthData
FROM YourTable
)
SELECT Id,
Date,
Data,
CASE WHEN ISNULL(LastMonthData, 0) = 0 THEN 0 ELSE Data * 100.0 / LastMonthData END As Quote -- 100.0 avoids integer division if Data is an integer column
FROM cte
I've used a CTE just so I wouldn't have to repeat the LAG twice.
The CASE expression is to prevent an exception in case the LastMonthData is 0 or null.
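You can also use a self join, as shown below. It should be a LEFT JOIN rather than an inner join, so that the first month, which has no previous row, is kept and shows 0:
select a.*, isnull(cast(a.data * 100.0 / nullif(b.data, 0) as decimal(10,2)), 0) as quote
from TableA as a
left join TableA as b
on b.date = dateadd(mm,-1,a.date)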
Let me know if this helps
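For anyone who wants to try the LAG approach without a SQL Server instance, here is a minimal sketch on Python's built-in sqlite3 module. It is an adaptation, not the exact code from the answers: the table name monthly is made up, COALESCE stands in for ISNULL, dates are stored as ISO text, and only the first four rows of the sample data are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly (id INT, date TEXT, data INT)")
conn.executemany(
    "INSERT INTO monthly VALUES (?, ?, ?)",
    [(1, "2015-01-01", 10), (2, "2015-02-01", 20), (3, "2015-03-01", 30), (4, "2015-04-01", 40)],
)

sql = """
WITH cte AS (
    -- previous row's value in date order; NULL for the first month
    SELECT id, date, data, LAG(data) OVER (ORDER BY date) AS last_month_data
    FROM monthly
)
SELECT id, date,
       CASE WHEN COALESCE(last_month_data, 0) = 0 THEN 0
            -- multiply by 100.0 first so the division is not integer division
            ELSE data * 100.0 / last_month_data END AS quote
FROM cte
ORDER BY date
"""
quote_rows = conn.execute(sql).fetchall()
for row in quote_rows:
    print(row)
```

The first month comes out as 0 because it has no previous row, February as 200.0 (20 / 10 * 100) and March as 150.0.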