Convert wide format data to long format using SQL

I recently started working with SQL for data manipulation. So, forgive me if this is too basic.
I have data table1 in the following format.
id pg_a pg_b
12 1 0
35 1 1
46 0 1
And I would like to convert this into a long format like shown in the following table.
id pg value
12 a 1
12 b 0
35 a 1
35 b 1
46 a 0
46 b 1
I have an SQL query using CASE WHEN, but with no luck. The query only ever evaluates the first WHEN in both CASE expressions.
This is the query:
select id,
(case when pg_a is not null then pg_a
when pg_b is not null then pg_b
else null
end) AS pg,
(case when pg_a is not null then a
when pg_b is not null then b
else null
end) AS value
from table1
What do I need to do differently? Any pointers will be appreciated.
Thank you in advance.

You can use a lateral join:
select t.id, v.*
from t cross join lateral
(values ('a', pg_a), ('b', pg_b)
) v(pg, value);
If you are new to SQL, you might be more comfortable with union all:
select id, 'a' as pg, pg_a as value
from t
union all
select id, 'b', pg_b
from t;
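To try the union all version without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table and the ORDER BY are my additions for the demo, and SQLite is only a stand-in for whatever DBMS you actually use (it does not support the lateral join form).

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, pg_a INTEGER, pg_b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(12, 1, 0), (35, 1, 1), (46, 0, 1)],
)

# One SELECT per wide column; UNION ALL stacks them into long format.
long_rows = conn.execute(
    """
    SELECT id, 'a' AS pg, pg_a AS value FROM t
    UNION ALL
    SELECT id, 'b', pg_b FROM t
    ORDER BY id, pg
    """
).fetchall()

for row in long_rows:
    print(row)
```

Each wide column needs its own SELECT branch, so this gets verbose when there are many pg_* columns.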

Related

Select distinct count after count?

I'll cut right to the chase: I have a select I'm currently writing with a rather lengthy where clause, and what I want to do is calculate percentages.
So what I need is the count of all results and then the count of each distinct value.
SELECT distinct count(*)
FROM mytable
WHERE mywhereclause
ORDER BY columnIuseInWhereClause
works fine for getting each individual value, but I want to avoid doing something like
Select (Select count(*) from mytable WHERE mywhereclause),
distinct count(*)
FROM mytable
WHERE mywhereclause
because I'd be using the same where-clause twice which just seems unnecessary.
This is for OracleDB but I'm only using standard SQL syntax, nothing database specific if I can help it.
Thanks for any ideas.
Edit:
Sample Data
__ID__,__someValue__
1 | A
2 | A
3 | B
4 | C
I want the occurrences of A, B, C as numbers as well as the overall count.
__CountAll__,__ACounts__,__BCounts__,__CCounts__
4 | 2 | 1 | 1
So I can get to
100% | 50% | 25% | 25%
That last part I can probably figure out on my own. Excuse my lack of experience or even logic thinking, it's early in the morning. ;)
Edit2:
I have written a query that works but is clumsy and long as all holy heck; this one is about trying it with group by.

Try:
select count(*) as CountAll,
count(distinct SomeColumn) as CountDistinct -- The DISTINCT goes inside the brackets
from myTable
where SomeOtherColumn = 'Something'

Use case expressions to do conditional counting:
select count(*) as CountAll,
count(case when someValue = 'A' then 1 end) as ACounts,
count(case when someValue = 'B' then 1 end) as BCounts,
count(case when someValue = 'C' then 1 end) as CCounts
FROM mytable
WHERE mywhereclause
Wrap it up in a derived table to make the percentage part easy:
select 100,
ACounts * 100 / CountAll,
BCounts * 100 / CountAll,
CCounts * 100 / CountAll
from
(
select count(*) as CountAll,
count(case when someValue = 'A' then 1 end) as ACounts,
count(case when someValue = 'B' then 1 end) as BCounts,
count(case when someValue = 'C' then 1 end) as CCounts
FROM mytable
WHERE mywhereclause
) dt
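A minimal runnable sketch of the derived-table approach, using Python's sqlite3 module and the sample data from the question. The where clause is left out, the column aliases APct, BPct and CPct are names I made up, and I multiply by 100.0 rather than 100 as an assumption to sidestep integer division, which some DBMSs apply to integer operands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, someValue TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "C")],
)

# The inner query scans the table once and produces all counts;
# the outer query turns the counts into percentages.
pct_row = conn.execute(
    """
    SELECT CountAll,
           ACounts * 100.0 / CountAll AS APct,
           BCounts * 100.0 / CountAll AS BPct,
           CCounts * 100.0 / CountAll AS CPct
    FROM (
        SELECT COUNT(*) AS CountAll,
               COUNT(CASE WHEN someValue = 'A' THEN 1 END) AS ACounts,
               COUNT(CASE WHEN someValue = 'B' THEN 1 END) AS BCounts,
               COUNT(CASE WHEN someValue = 'C' THEN 1 END) AS CCounts
        FROM mytable
    ) dt
    """
).fetchone()

print(pct_row)
```

COUNT ignores NULLs, and a CASE without ELSE yields NULL, which is why only the matching rows are counted.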

Here's an alternative using window functions (SQL Server syntax):
with data_table(ID, some_value)
AS
(SELECT 1,'A' UNION ALL
SELECT 2,'A' UNION ALL
SELECT 3,'B' UNION ALL
SELECT 4,'C'
)
SELECT DISTINCT [some_value],
COUNT([some_value]) OVER () AS Count_All,
COUNT([some_value]) OVER (PARTITION BY [some_value]) AS 'Counts' FROM [data_table]
ORDER BY [some_value]
The advantage is that you don't have to hard-code the values of [some_value].
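The same window-function idea can be tried in SQLite from Python. This is a sketch that assumes SQLite 3.25 or newer (the first version with window functions) and drops the square-bracket quoting, which is SQL Server specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_table (id INTEGER, some_value TEXT)")
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?)",
    [(1, "A"), (2, "A"), (3, "B"), (4, "C")],
)

# OVER () counts every row; OVER (PARTITION BY ...) counts per value.
# DISTINCT collapses the repeated per-row results to one row per value.
window_rows = conn.execute(
    """
    SELECT DISTINCT some_value,
           COUNT(*) OVER () AS count_all,
           COUNT(*) OVER (PARTITION BY some_value) AS counts
    FROM data_table
    ORDER BY some_value
    """
).fetchall()

for row in window_rows:
    print(row)
```

The result has one row per value instead of one wide row, so new values show up without changing the query.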

SQL AVG with Case

I have a DB which stores a value from C to AAA, where C is the worst and AAA the best.
Now I need the average of this value and I don't know how to first convert the values into an int, calculate the average, round the average to an int and convert it back.
Definitions:
C = 1
B = 2
A = 3
AA = 4
AAA = 5
Is that even possible with an SQL statement? I tried to combine AVG and CASE, but I can't get it to work...
Thanks for your help!
Regards,

select avg(case score
when 'C' then 1
when 'B' then 2
when 'A' then 3
when 'AA' then 4
when 'AAA' then 5
end) as avg_score
from the_table;
(this assumes that the column is called score)
To convert this back into the "character value", wrap the output in another case:
select case cast(avg_score as int)
when 1 then 'C'
when 2 then 'B'
when 3 then 'A'
when 4 then 'AA'
when 5 then 'AAA'
end as avg_score_value
from (
select avg(case score
when 'C' then 1
when 'B' then 2
when 'A' then 3
when 'AA' then 4
when 'AAA' then 5
end) as avg_score
from the_table
) t
The above cast(avg_score as int) assumes ANSI SQL. Your DBMS might have different ways to cast a value to an integer.
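Here is a small sketch of the round trip in SQLite via Python. The sample scores are invented for illustration, and I added an explicit ROUND before the cast as an assumption about what "round the average" should mean, since a plain cast to integer typically truncates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (score TEXT)")
# Hypothetical sample data: 5 + 4 + 4 + 2 = 15, average 3.75.
conn.executemany(
    "INSERT INTO the_table VALUES (?)",
    [("AAA",), ("AA",), ("AA",), ("B",)],
)

avg_score, avg_score_value = conn.execute(
    """
    SELECT avg_score,
           CASE CAST(ROUND(avg_score) AS INTEGER)
               WHEN 1 THEN 'C'
               WHEN 2 THEN 'B'
               WHEN 3 THEN 'A'
               WHEN 4 THEN 'AA'
               WHEN 5 THEN 'AAA'
           END AS avg_score_value
    FROM (
        SELECT AVG(CASE score
                       WHEN 'C' THEN 1
                       WHEN 'B' THEN 2
                       WHEN 'A' THEN 3
                       WHEN 'AA' THEN 4
                       WHEN 'AAA' THEN 5
                   END) AS avg_score
        FROM the_table
    ) t
    """
).fetchone()

print(avg_score, avg_score_value)
```

Without the ROUND, 3.75 would be cut down to 3 and map to 'A' instead of 'AA'. Also note that some DBMSs compute AVG over integers as an integer, so you may need to use 1.0, 2.0 and so on in the inner CASE there.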

I've created this example for you.
You can put your ranking into a temp table, then calculate, and drop the table when you're done.
create table sof (id int identity,a nvarchar (10))
insert into sof values ('a')
insert into sof values ('b')
insert into sof values ('c')
select case a when 'AAA' then 5
when 'AA' then 4
when 'A' then 3
when 'B' then 2
else 1
end as av
into #temp
from sof
----for rounded
select ROUND(AVG(CAST(av AS FLOAT)), 4)
from #temp
--not rounded
select AVG (av)
from #temp

Conditional SUM with SELECT statement

I would like to sum values in a table based on a condition taken from the same table. The structure of the table is shown below. The table is called Data.
Data
Type Value
1 5
1 10
1 15
1 25
1 15
1 20
1 5
2 10
3 5
If the Value of Type 2 is larger than the Value of Type 3, then I would like to subtract the Value of Type 2 from the sum of all the Values in the table. I'm not sure how to write the IF statements using Values looked up in the table. I have tried the queries below, but they don't work.
SELECT SUM(Value)-IF(SELECT Value FROM Data WHERE Type=2>SELECT Value
FROM Data WHERE Type=3 THEN SELECT Value FROM Data
WHERE Type=2 ELSE SELECT Value FROM Data WHERE Type=3) FROM Data
or
SELECT SUM(d.Value)-IIF(a.type2>b.type3, a.type2, b.type3)
FROM Data d, (SELECT Value AS type2 FROM Data WHERE Type=2) a,
(SELECT Value AS type3 FROM Data WHERE Type=3) b

If I follow your logic correctly, then this would seem to do what you want:
select d.value - (case when d2.value > d3.value then d2.value else 0 end)
from data d cross join
(select value from data where type = 2) d2 cross join
(select value from data where type = 3) d3 ;
EDIT:
If you want just one number, then use conditional aggregation:
select sum(value) -
(case when sum(case when type = 2 then value else 0 end) >
sum(case when type = 3 then value else 0 end)
then sum(case when type = 2 then value else 0 end)
else 0
end)
from data;
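A quick way to check the conditional aggregation against the sample data is to run it in SQLite from Python. This is only a sketch; the table is created in memory and the variable names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (type INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, 5), (1, 10), (1, 15), (1, 25), (1, 15), (1, 20), (1, 5), (2, 10), (3, 5)],
)

# Total of all values is 110; the type 2 value (10) exceeds the
# type 3 value (5), so 10 is subtracted from the total.
(adjusted_sum,) = conn.execute(
    """
    SELECT SUM(value) -
           (CASE WHEN SUM(CASE WHEN type = 2 THEN value ELSE 0 END) >
                      SUM(CASE WHEN type = 3 THEN value ELSE 0 END)
                 THEN SUM(CASE WHEN type = 2 THEN value ELSE 0 END)
                 ELSE 0
            END)
    FROM data
    """
).fetchone()

print(adjusted_sum)
```

Everything is computed in a single pass over the table, with no subqueries or joins.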

Thanks for pointing me in the right direction. This is what I came up with in the end. It is a little bit different from the reply above since I'm using MS Access.
SELECT SUM(Value)-IIf(SUM(IIf(Type=2, Value, 0))>SUM(IIf(Type=3, Value, 0)), SUM(IIf(Type=2, Value, 0)), SUM(IIf(Type=3, Value, 0))) FROM Data
It is the same approach as the second suggestion above, but adapted to MS Access SQL.

SQL (TSQL) - Select values in a column where another column is not null?

I will keep this simple: I would like to know if there is a good way to select all the values in a column that never have a NULL in another column. For example:
A B
----- -----
1 7
2 7
NULL 7
4 9
1 9
2 9
From the above set I would just want 9 from B and not 7 because 7 has a NULL in A. Obviously I could wrap this as a subquery and USE the IN clause etc. but this is already part of a pretty unique set and am looking to keep this efficient.
I should note that for my purposes this would only be a one-way comparison... I would only be returning values in B and examining A.
I imagine there is an easy way to do this that I am missing, but being in the thick of things I don't see it right now.

You can do something like this:
select *
from t
where t.b not in (select b from t where a is null);
If you want only distinct b values, then you can do:
select b
from t
group by b
having sum(case when a is null then 1 else 0 end) = 0;
And, finally, you could use window functions:
select a, b
from (select t.*,
sum(case when a is null then 1 else 0 end) over (partition by b) as NullCnt
from t
) t
where NullCnt = 0;
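The first two variants can be checked against the sample data with a short Python sqlite3 sketch. The in-memory table and the ORDER BY are my additions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 7), (2, 7), (None, 7), (4, 9), (1, 9), (2, 9)],
)

# Distinct b values whose group contains no NULL in a.
clean_groups = conn.execute(
    """
    SELECT b
    FROM t
    GROUP BY b
    HAVING SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) = 0
    """
).fetchall()

# All rows belonging to those groups.
clean_rows = conn.execute(
    """
    SELECT a, b
    FROM t
    WHERE b NOT IN (SELECT b FROM t WHERE a IS NULL)
    ORDER BY a
    """
).fetchall()

print(clean_groups)
print(clean_rows)
```

One caveat with the NOT IN form: if b itself can be NULL in a row where a is NULL, the subquery returns a NULL and NOT IN matches nothing, so the GROUP BY form is the safer of the two in that case.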

The query below outputs only one column in the final result. The rows are grouped by column B, and within each group the CASE expression adds 1 for every row where A is NULL. The HAVING clause keeps only the groups whose sum is 0.
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
If you want to get all the columns of the matching rows, you can join back to the table.
SELECT a.*
FROM TableName a
INNER JOIN
(
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
) b ON a.b = b.b

Select Distinct Attribute and Print out Count of another even when the count is 0

I don't quite know how I should describe the problem in the title, but here's my question.
I have a table named hello with two columns named time and state.
Time | State
Here's an example of the data I have
1 DC
1 VA
1 VA
2 DC
2 MD
3 MD
3 MD
3 VA
3 DC
I would like to get every time value and the count of "VA" for it (0 if "VA" doesn't appear at that time).
The output would look like this
Time Number
1 2
2 0
3 1
I tried to do
SELECT DISTINCT time,
COUNT(state) as Number
FROM hello
WHERE state = 'VA'
GROUP BY time
but it doesn't seem to work.

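Your where clause removes every row that is not 'VA' before the grouping happens, so a time with no 'VA' rows never reaches the output. Move the condition into the aggregate instead. This is a conditional aggregation: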
select time, sum(case when state = 'VA' then 1 else 0 end) as NumVA
from hello
group by time
I want to add that you should not use distinct when you already have a group by on the same columns. The two are redundant. Semantically, select distinct is just shorthand for grouping by all the selected columns.
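To see the difference between filtering in the where clause and counting conditionally, here is a minimal sketch in Python with sqlite3, using the sample data from the question. The variable names are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hello (time INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO hello VALUES (?, ?)",
    [(1, "DC"), (1, "VA"), (1, "VA"), (2, "DC"), (2, "MD"),
     (3, "MD"), (3, "MD"), (3, "VA"), (3, "DC")],
)

# Filtering first: time 2 has no 'VA' rows, so it disappears entirely.
filtered = conn.execute(
    """
    SELECT time, COUNT(state) AS Number
    FROM hello
    WHERE state = 'VA'
    GROUP BY time
    ORDER BY time
    """
).fetchall()

# Conditional aggregation: every time is kept, non-'VA' rows add 0.
conditional = conn.execute(
    """
    SELECT time, SUM(CASE WHEN state = 'VA' THEN 1 ELSE 0 END) AS NumVA
    FROM hello
    GROUP BY time
    ORDER BY time
    """
).fetchall()

print(filtered)
print(conditional)
```

The first query returns rows only for times 1 and 3, while the second also returns time 2 with a count of 0.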

SELECT TIME,
SUM(CASE WHEN State = 'VA' THEN 1 ELSE 0 END)
FROM tableName
GROUP BY Time

One rule of thumb is to get your counts first and put them into a temp table for use later.
See below:
Create table temp(Num int, [state] varchar(2))
Insert into temp(Num,[state])
Select 1,'DC'
UNION ALL
Select 1,'VA'
UNION ALL
Select 1,'VA'
UNION ALL
Select 2,'DC'
UNION ALL
Select 2,'MD'
UNION ALL
Select 3,'MD'
UNION All
Select 3,'MD'
UNION ALL
Select 3,'VA'
UNION ALL
Select 3,'DC'
Select t.Num [Time],t.[State]
, CASE WHEN t.[state] = 'VA' THEN Count(t.[State]) ELSE 0 END [Number]
INTO #temp2
From temp t
Group by t.Num, t.[state]
--drop table #temp2
Select
t2.[time]
,SUM(t2.[Number])
From #temp2 t2
group by t2.[time]
