Pivot Table with Redshift (PostgreSQL) with Count


I'm facing a challenge with Redshift:
I'm trying to dynamically pivot rows into columns and aggregate by count; however, I noticed the pivot table feature is only available from PostgreSQL 9.
Any idea how to do the following?
index fruit color
1 apple red
2 apple yellow
2 banana blue
2 banana blue
3 banana blue
3 banana green
3 pear green
3 pear red
to:
index red yellow blue green
1 1 0 0 0
2 0 1 2 0
3 1 0 1 2
Essentially, I want to group and count the occurrences of each color per index (fruit is not so important, although I'll use it as a filter later).
Note: I might also want to do a binary transformation later on (i.e. 0 for 0 and 1 if > 0).
Edit: If the above is not possible, is there any way to do this instead?
index color count
1 red 1
1 yellow 0
1 blue 0
1 green 0
2 red 0
2 yellow 1
2 blue 2
2 green 0
3 red 1
3 yellow 0
3 blue 1
3 green 2
(again, red, yellow, blue and green should be dynamic)

For the Edit, you could do:
select x.index, x.color, sum(case when y.index is not null then 1 else 0 end) as count
from
    ((select index
      from [table]
      group by index
      order by index) a
     inner join
     (select color
      from [table]
      group by color
      order by color) b
     on 1 = 1) x
left outer join
    [table] y
    on x.index = y.index
    and x.color = y.color
group by x.index, x.color
order by x.index, x.color
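The cross join above pairs every index with every color, and the left join then counts the matching rows, so missing combinations show up as 0. Here is a minimal sketch of that idea, run against an in-memory SQLite database as a stand-in for Redshift. The table name `fruits` is an assumption (the answer only says `[table]`), and `"index"` is quoted because it is a keyword.

```python
import sqlite3

# Sample rows from the question: (index, fruit, color).
rows = [
    (1, "apple", "red"), (2, "apple", "yellow"),
    (2, "banana", "blue"), (2, "banana", "blue"),
    (3, "banana", "blue"), (3, "banana", "green"),
    (3, "pear", "green"), (3, "pear", "red"),
]

conn = sqlite3.connect(":memory:")
# "fruits" is a hypothetical table name standing in for [table].
conn.execute('CREATE TABLE fruits ("index" INTEGER, fruit TEXT, color TEXT)')
conn.executemany("INSERT INTO fruits VALUES (?, ?, ?)", rows)

long_counts = conn.execute("""
    SELECT i."index", c.color, COUNT(f."index") AS cnt
    FROM (SELECT DISTINCT "index" FROM fruits) AS i
    CROSS JOIN (SELECT DISTINCT color FROM fruits) AS c   -- every index/color pair
    LEFT JOIN fruits AS f
           ON f."index" = i."index" AND f.color = c.color -- NULL when no match
    GROUP BY i."index", c.color
    ORDER BY i."index", c.color
""").fetchall()

for row in long_counts:
    print(row)
```

`COUNT(f."index")` ignores the NULLs produced by the left join, which plays the same role as the `sum(case when y.index is not null ...)` expression in the answer.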

If PIVOT is not available in Redshift, then you could always just use a standard pivot query:
SELECT
index,
SUM(CASE WHEN color = 'red' THEN 1 ELSE 0 END) AS red,
SUM(CASE WHEN color = 'yellow' THEN 1 ELSE 0 END) AS yellow,
SUM(CASE WHEN color = 'blue' THEN 1 ELSE 0 END) AS blue,
SUM(CASE WHEN color = 'green' THEN 1 ELSE 0 END) AS green
FROM yourTable
GROUP BY index
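Since the question asks for the color columns to be dynamic, one possible approach is to read the distinct colors first and generate the conditional-aggregation statement from them. The sketch below does this in Python against an in-memory SQLite database as a stand-in for Redshift; the table name `fruits` and the helpers `quote_ident` and `build_pivot_sql` are hypothetical names introduced for illustration. Swapping `SUM` for `MAX` gives the 0/1 binary transformation mentioned in the question.

```python
import sqlite3

rows = [
    (1, "apple", "red"), (2, "apple", "yellow"),
    (2, "banana", "blue"), (2, "banana", "blue"),
    (3, "banana", "blue"), (3, "banana", "green"),
    (3, "pear", "green"), (3, "pear", "red"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE fruits ("index" INTEGER, fruit TEXT, color TEXT)')
conn.executemany("INSERT INTO fruits VALUES (?, ?, ?)", rows)


def quote_ident(name):
    """Quote a value for use as a column alias (doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'


def build_pivot_sql(colors, binary=False):
    """Build one CASE column per color; color values are bound as parameters."""
    # SUM counts matching rows; MAX collapses the count to a 0/1 flag.
    agg = "MAX" if binary else "SUM"
    columns = ",\n".join(
        f"    {agg}(CASE WHEN color = ? THEN 1 ELSE 0 END) AS {quote_ident(c)}"
        for c in colors
    )
    return (
        f'SELECT "index",\n{columns}\n'
        'FROM fruits\nGROUP BY "index"\nORDER BY "index"'
    )


# Step 1: discover the column list. Step 2: run the generated statement.
colors = [r[0] for r in conn.execute(
    "SELECT DISTINCT color FROM fruits ORDER BY color")]
counts = conn.execute(build_pivot_sql(colors), colors).fetchall()
flags = conn.execute(build_pivot_sql(colors, binary=True), colors).fetchall()

print(colors)
print(counts)
print(flags)
```

The parameter placeholder style (`?`) is specific to the `sqlite3` module; a Redshift driver may use a different style, so treat that part as an assumption.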

Related

T-SQL LAG function for returning previous rows with different WHERE condition

I have data like:
table name: "Data"
ID Name Color Value
1 A Blue 1
2 B Red 2
3 A Blue 3
4 B Red 4
5 B Blue 3
6 A Red 4
Can I use the SQL LAG function to get, for each Name that is Red, the previous value for that name that was Blue (ordering by ID)?
Result set:
ID Name Color Value PreviousValue
2 B Red 2 NULL
4 B Red 4 NULL
6 A Red 4 3

select *
from
(
select *
,case when color = 'red' and color != lag(color) over(partition by name order by id) then lag(value) over(partition by name order by ID) end PreviousValue
from t
) t
where color = 'red'
order by id
ID Name Color Value PreviousValue
2 B Red 2 null
4 B Red 4 null
6 A Red 4 3
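To check the LAG query against the sample data, here is a small sketch using an in-memory SQLite database as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The table is named `Data` as in the question. SQLite string comparison is case-sensitive by default, so the literal is written as `'Red'` here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (ID INTEGER, Name TEXT, Color TEXT, Value INTEGER)")
conn.executemany("INSERT INTO Data VALUES (?, ?, ?, ?)", [
    (1, "A", "Blue", 1), (2, "B", "Red", 2), (3, "A", "Blue", 3),
    (4, "B", "Red", 4), (5, "B", "Blue", 3), (6, "A", "Red", 4),
])

result = conn.execute("""
    SELECT ID, Name, Color, Value, PreviousValue
    FROM (
        SELECT d.*,
               -- Take the previous row's value only when this row is Red
               -- and the previous row for the same Name has another color.
               CASE WHEN Color = 'Red'
                     AND Color <> LAG(Color) OVER (PARTITION BY Name ORDER BY ID)
                    THEN LAG(Value) OVER (PARTITION BY Name ORDER BY ID)
               END AS PreviousValue
        FROM Data AS d
    ) AS x
    WHERE Color = 'Red'   -- filter after LAG so Blue rows are still visible to it
    ORDER BY ID
""").fetchall()

for row in result:
    print(row)
```

Note that LAG looks only at the immediately preceding row for each Name, so this returns a value only when that row has a different color; the filter on Red has to sit in the outer query for the same reason.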

Total column in a pivot example

Check here for background if needed:
Pivoting a table with parametrization
We have 3 tables.
tid_color - parametrization table
--------------------------
ID ColorDescription
--------------------------
1 Green
2 Yellow
3 Red
-------------------------
tid_car - parametrization table
--------------------------
ID CARDescription
-------------------------
1 Car X
2 Car Y
3 Car Z
--------------------------
table_owners_cars
------------------------------------------------
ID CarID ColorID Owner
------------------------------------------------
1 1 1 John
2 1 2 Mary
3 1 3 Mary
4 1 3 Giovanni
5 2 2 Mary
6 3 1 Carl
7 1 1 Hawking
8 1 1 Fanny
------------------------------------------------
CarID is FOREIGN KEY to tid_car
ColorId is FOREIGN KEY to tid_color
If we code:
SELECT tcar.CarDescription, tco.ColorDescription, Count(*) as Total
FROM table_owners_cars tocar
LEFT JOIN tid_color tco ON tco.Id = tocar.ColorId
LEFT JOIN tid_Car tcar ON tcar.Id = tocar.CarId
GROUP BY CarDescription, ColorDescription
it results as:
Id CarDescription ColorDescription Total
1 CarX Green 3
2 CarX Yellow 1
3 CarX Red 1
4 CarY Yellow 1
5 CarZ Green 1
I want to pivot exactly as follows:
---------------------------------------------
Id Car Green Yellow Red Total
---------------------------------------------
1 CarX 3 1 1 5
2 CarY 0 1 0 1
3 CarZ 1 0 0 1
---------------------------------------------
Now:
We also want, for each car, the total number of rows in table_owners_cars. This value sits next to the total, as shown in the last column (in parentheses). Some Car X rows have a NULL ColorID (the same can happen with the other cars), and we want the count of all Car X, Car Y and Car Z rows, with or without an assigned ColorId (NULL or 0).
---------------------------------------------------
Id Car Green Yellow Red Violet Total
---------------------------------------------------
1 CarX 3 1 1 0 5 (40)
2 CarY 0 1 0 0 1 (35)
3 CarZ 1 0 0 0 1 (4)
---------------------------------------------------
DESIRED TABLE
One try with the code (very similar to one provided in the aforementioned hyperlink):
SELECT pvt.CarID, tc.Description AS Car, CONCAT (' [1] as 'Green', [2] as 'Yellow', [3] as 'Red', [1]+[2]+[3] as 'total'', '(', count(*), ')' )
FROM
(SELECT CarID, colorId
FROM table_owners_cars tocar
) p
PIVOT
(
COUNT (ColorId)
FOR ColorId IN ( [1], [2], [3])
) AS pvt
INNER JOIN tid_car tc ON pvt.CarId=tc.Id
group by p.Car
This does not work. Single quotes are also a nightmare with CONCAT. Thanks in advance.

I just find these queries easier to do with conditional aggregation:
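SELECT tocar.CarId, tcar.CarDescription,
SUM(CASE WHEN tco.ColorDescription = 'Green' THEN 1 ELSE 0 END) as Green,
SUM(CASE WHEN tco.ColorDescription = 'Yellow' THEN 1 ELSE 0 END) as Yellow,
SUM(CASE WHEN tco.ColorDescription = 'Red' THEN 1 ELSE 0 END) as Red,
SUM(CASE WHEN tco.ColorDescription IN ('Green', 'Yellow', 'Red') THEN 1 ELSE 0 END) as total_gyr,
COUNT(*) as total
FROM table_owners_cars tocar
LEFT JOIN tid_color tco ON tco.Id = tocar.ColorId
LEFT JOIN tid_car tcar ON tcar.Id = tocar.CarId
GROUP BY tocar.CarId, tcar.CarDescription;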
I see no reason to combine the two totals into a single string column -- as opposed to having them in separate integer columns. But, you can combine them if you want.
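For anyone who wants to try this, here is a minimal sketch of the conditional-aggregation approach on the three tables from the question, using an in-memory SQLite database as a stand-in for SQL Server. Row 9 (a Car X with a NULL ColorID) is a hypothetical extra row added only to show the two totals diverging. With the rows as listed in the question, Car X has two Red entries (IDs 3 and 4), so this sketch reports Red = 2. The combined string is built in Python rather than in SQL, which sidesteps the quoting problem with CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tid_color (ID INTEGER PRIMARY KEY, ColorDescription TEXT);
CREATE TABLE tid_car   (ID INTEGER PRIMARY KEY, CarDescription TEXT);
CREATE TABLE table_owners_cars (
    ID INTEGER PRIMARY KEY, CarID INTEGER, ColorID INTEGER, Owner TEXT);
INSERT INTO tid_color VALUES (1, 'Green'), (2, 'Yellow'), (3, 'Red');
INSERT INTO tid_car   VALUES (1, 'Car X'), (2, 'Car Y'), (3, 'Car Z');
INSERT INTO table_owners_cars VALUES
    (1, 1, 1, 'John'), (2, 1, 2, 'Mary'), (3, 1, 3, 'Mary'),
    (4, 1, 3, 'Giovanni'), (5, 2, 2, 'Mary'), (6, 3, 1, 'Carl'),
    (7, 1, 1, 'Hawking'), (8, 1, 1, 'Fanny'),
    (9, 1, NULL, 'Hypothetical');  -- assumed row: a car with no color
""")

pivot = conn.execute("""
    SELECT tocar.CarID, tcar.CarDescription,
           SUM(CASE WHEN tco.ColorDescription = 'Green'  THEN 1 ELSE 0 END) AS Green,
           SUM(CASE WHEN tco.ColorDescription = 'Yellow' THEN 1 ELSE 0 END) AS Yellow,
           SUM(CASE WHEN tco.ColorDescription = 'Red'    THEN 1 ELSE 0 END) AS Red,
           SUM(CASE WHEN tco.ColorDescription IN ('Green', 'Yellow', 'Red')
                    THEN 1 ELSE 0 END) AS total_gyr,   -- rows with a known color
           COUNT(*) AS total                           -- all rows, NULL color included
    FROM table_owners_cars AS tocar
    LEFT JOIN tid_color AS tco  ON tco.ID  = tocar.ColorID
    LEFT JOIN tid_car   AS tcar ON tcar.ID = tocar.CarID
    GROUP BY tocar.CarID, tcar.CarDescription
    ORDER BY tocar.CarID
""").fetchall()

# Optional: combine both totals into one display string, e.g. "6 (7)".
labels = [f"{gyr} ({total})" for (_, _, _, _, _, gyr, total) in pivot]

for row, label in zip(pivot, labels):
    print(row[:5], label)
```

Keeping `total_gyr` and `total` as separate integer columns, as the answer suggests, leaves them usable for sorting and arithmetic; the string form is only for display.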

Conditional formatting on MAX value row

Below is a table:
Partition by ID and capture the row with the MAX value where Role = Red
ID Role HistID Date Style
1 Yellow 101 1/1/17 M
1 Red 101 1/2/17 F
1 Red (Null) 1/5/17 C
2 Blue 101 5/1/17 a
2 Yellow 201 4/1/17 b
2 Red 301 5/5/17 C
3 Yellow (Null)
Reference the rows below:
ID Role HistID Date Style
1 Red (Null) 1/5/17 c
2 Red 301 5/5/17 c
Now based off those rows apply a condition.
CASE WHEN HistID IS NOT NULL AND Style = 'C' THEN 'Assigned'
ELSE 'Unassigned'
END Status
Output:
ID Role HistID Date Style Status
1 Yellow 101 1/1/17 M Unassigned
1 Red 101 1/2/17 F Unassigned
1 Red (Null) 1/5/17 C Unassigned
2 Blue 101 5/1/17 a Assigned
2 Yellow 201 4/1/17 b Assigned
2 Red 301 5/5/17 C Assigned
3 Yellow (Null) Unassigned
I'm not so much after the answer here; I would like to understand and learn the syntax behind applying MAX, a CASE expression and the KEEP clause.

Use window functions:
select t.*,
(case when matches_flag > 0 then 'Assigned' else 'Unassigned' end) as status
from (select t.*,
sum(case when role = 'Red' and histid is not null and style = 'C' then 1 else 0 end) over
(partition by id) as matches_flag
from t
) t;
EDIT:
The subquery is not actually needed. I just think it makes the logic easier to follow. You can do:
select t.*,
(case when sum(case when role = 'Red' and histid is not null and style = 'C' then 1 else 0 end) over (partition by id) > 0
then 'Assigned'
else 'Unassigned'
end) as status
from t;
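As a quick check of the windowed `SUM` approach, the sketch below loads the sample rows into an in-memory SQLite database (a stand-in for the original database; window functions need SQLite 3.25 or later) and runs the single-query form. The table name `t` follows the answer; dates are stored as plain text because the query does not compare them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE t (ID INTEGER, Role TEXT, HistID INTEGER, "Date" TEXT, Style TEXT)')
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", [
    (1, "Yellow", 101, "1/1/17", "M"),
    (1, "Red", 101, "1/2/17", "F"),
    (1, "Red", None, "1/5/17", "C"),
    (2, "Blue", 101, "5/1/17", "a"),
    (2, "Yellow", 201, "4/1/17", "b"),
    (2, "Red", 301, "5/5/17", "C"),
    (3, "Yellow", None, None, None),
])

statuses = conn.execute("""
    SELECT ID, Role,
           -- The window SUM counts qualifying Red rows per ID and repeats
           -- that count on every row of the same ID.
           CASE WHEN SUM(CASE WHEN Role = 'Red'
                               AND HistID IS NOT NULL
                               AND Style = 'C'
                              THEN 1 ELSE 0 END) OVER (PARTITION BY ID) > 0
                THEN 'Assigned'
                ELSE 'Unassigned'
           END AS status
    FROM t
    ORDER BY ID
""").fetchall()

# Collapse to one status per ID to make the effect easy to see.
status_by_id = {}
for id_, _role, status in statuses:
    status_by_id.setdefault(id_, set()).add(status)
print(status_by_id)
```

Because the aggregate is a window function rather than a `GROUP BY`, every input row is kept and simply gains the per-ID status.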

Show result with at least two matching values and one matching is set to primary

How do you display the people who have two matching rows, one with color blue and one with color red, where blue has primary set to 1 and red has primary set to 0?
PersonList primary color
person1 1 blue
person1 0 red
person2 1 blue
person3 1 red
person3 0 blue
person4 1 blue
person4 0 red
person4 1 blue
The result should display person1 and person4.
NOTE: A person qualifies only if blue has primary set to 1 and red has primary set to 0.
So far, this is the query that produces the result above:
Select * from person p Inner Join COLOR c ON p.person_colorid = c.person_colorid
I have also tried the query below, but I know it is wrong: it also returns rows where red has primary set to 1, as well as blue.
The person table contains [personList], [person_colorid] and [is_primary], while the color table contains [color] and [person_colorid].
Select * from person p Inner Join COLOR
c ON p.person_colorid = c.person_colorid where c.color IN ('blue', 'red') AND p.primary = 1

Use CASE expressions to count, for each PersonList, the rows that match each condition, then keep only the people who match both.
Query
select t.[PersonList] from(
select [PersonList],
sum(case when [color] = 'red' and [primary] = 0 then 1 else 0 end) as [red],
sum(case when [color] = 'blue' and [primary] = 1 then 1 else 0 end) as [blue]
from [your_table_name]
group by [PersonList]
)t
where t.[red] > 0 and t.[blue] > 0;
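Here is a small runnable sketch of that query on the sample rows, using an in-memory SQLite database as a stand-in for SQL Server (SQLite also accepts the square-bracket quoting). The table name `person_colors` is hypothetical and stands in for `[your_table_name]`, assumed to hold the already-joined result shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# person_colors is a hypothetical name for the joined person/color result.
conn.execute("CREATE TABLE person_colors ([PersonList] TEXT, [primary] INTEGER, [color] TEXT)")
conn.executemany("INSERT INTO person_colors VALUES (?, ?, ?)", [
    ("person1", 1, "blue"), ("person1", 0, "red"),
    ("person2", 1, "blue"),
    ("person3", 1, "red"), ("person3", 0, "blue"),
    ("person4", 1, "blue"), ("person4", 0, "red"), ("person4", 1, "blue"),
])

matches = conn.execute("""
    SELECT t.[PersonList]
    FROM (
        SELECT [PersonList],
               -- rows where red is NOT the primary color
               SUM(CASE WHEN [color] = 'red'  AND [primary] = 0 THEN 1 ELSE 0 END) AS [red],
               -- rows where blue IS the primary color
               SUM(CASE WHEN [color] = 'blue' AND [primary] = 1 THEN 1 ELSE 0 END) AS [blue]
        FROM person_colors
        GROUP BY [PersonList]
    ) AS t
    WHERE t.[red] > 0 AND t.[blue] > 0   -- both conditions met at least once
    ORDER BY t.[PersonList]
""").fetchall()

print([name for (name,) in matches])
```

person3 is excluded because its red row has primary set to 1, and person2 because it has no red row at all.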

Select records with the same name and another field having two values

Given a table of products that are available in different colors,
NAME COLOR
---- -----
pen red
pen blue
pen yellow
box red
mic red
tape blue
How can I find the names of the products that are available in both red and blue (pen), and the names of products available in red, but not in blue (box, mic)?
Fiddle: http://sqlfiddle.com/#!9/021a6/3

I like group by with having for these types of queries.
For both colors:
select name
from t
group by name
having sum(case when color = 'red' then 1 else 0 end) > 0 and
sum(case when color = 'blue' then 1 else 0 end) > 0;
For red but not blue:
select name
from t
group by name
having sum(case when color = 'red' then 1 else 0 end) > 0 and
sum(case when color = 'blue' then 1 else 0 end) = 0;
The conditions in the having clause count the number of rows that match the condition (for each name). So, > 0 means that at least one row matched and = 0 means that no row matched.
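Both queries can be tried on the sample products with a short sketch like the one below, which uses an in-memory SQLite database as a stand-in for the original database. The table name `t` follows the answer; the `ORDER BY` is added only to make the printed output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, color TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    ("pen", "red"), ("pen", "blue"), ("pen", "yellow"),
    ("box", "red"), ("mic", "red"), ("tape", "blue"),
])

# Names with at least one red row AND at least one blue row.
both = [name for (name,) in conn.execute("""
    SELECT name
    FROM t
    GROUP BY name
    HAVING SUM(CASE WHEN color = 'red'  THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN color = 'blue' THEN 1 ELSE 0 END) > 0
    ORDER BY name
""")]

# Names with at least one red row and NO blue row.
red_not_blue = [name for (name,) in conn.execute("""
    SELECT name
    FROM t
    GROUP BY name
    HAVING SUM(CASE WHEN color = 'red'  THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN color = 'blue' THEN 1 ELSE 0 END) = 0
    ORDER BY name
""")]

print(both)
print(red_not_blue)
```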
