How to turn values of a column into new individual columns in SQL - sql

Hello everyone. I am trying to pivot a categorical column named Educational Group, which has values like these:
State | Educational Group  | No of Persons |
------+--------------------+---------------+
A     | Below Metric       | 123           |
A     | metric/secondary   | 456           |
A     | diploma            | 789           |
A     | graduate and above | 101112        |
A     | post graduate      | 131415        |
B     | Below Metric       | 145           |
B     | metric/secondary   | 467           |
B     | diploma            | 564           |
B     | graduate and above | 987           |
B     | post graduate      | 875           |
I want this to be converted as
State | Below Metric_No of Persons | Metric/Secondary_No of Persons | Diploma_No of Persons | ...
------+----------------------------+--------------------------------+-----------------------+
A     | 123                        | 456                            | 789                   |
B     | 145                        | 467                            | 564                   |
and so on for all states and all educational levels.
Is this possible in SQL? I did the same in Python using the pivot function and it worked well; now I want the same done in Microsoft SQL Server Management Studio.
I want to convert this https://ibb.co/L15m2sS into this https://ibb.co/9tLpk7V

PIVOT should do the trick.
SELECT *
FROM
(
SELECT [State], [Educational_Group], [No_of_Persons]
FROM mytable
) AS SourceTable PIVOT(AVG([No_of_Persons]) FOR [Educational_Group] IN([Below Metric],
[metric/secondary],
[diploma],
[graduate and above],
[post graduate])) AS PivotTable;
Online demonstration using your table on db<>fiddle.
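For engines without a PIVOT operator, the same reshaping can be written as conditional aggregation: one SUM(CASE ...) per target column, grouped by State. The sketch below is only an illustration and runs against an in-memory SQLite database from Python, not SQL Server; the table and column names follow the answer above, while the output aliases (below_metric and so on) are made up for the example.

```python
import sqlite3

# In-memory SQLite stand-in for the asker's table (assumption: SQL Server is not used here).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE mytable (State TEXT, Educational_Group TEXT, No_of_Persons INTEGER)"
)
con.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("A", "Below Metric", 123),
        ("A", "metric/secondary", 456),
        ("A", "diploma", 789),
        ("A", "graduate and above", 101112),
        ("A", "post graduate", 131415),
        ("B", "Below Metric", 145),
        ("B", "metric/secondary", 467),
        ("B", "diploma", 564),
        ("B", "graduate and above", 987),
        ("B", "post graduate", 875),
    ],
)

# One SUM(CASE ...) per educational group turns row values into columns.
pivot_sql = """
SELECT State,
       SUM(CASE WHEN Educational_Group = 'Below Metric'       THEN No_of_Persons END) AS below_metric,
       SUM(CASE WHEN Educational_Group = 'metric/secondary'   THEN No_of_Persons END) AS metric_secondary,
       SUM(CASE WHEN Educational_Group = 'diploma'            THEN No_of_Persons END) AS diploma,
       SUM(CASE WHEN Educational_Group = 'graduate and above' THEN No_of_Persons END) AS graduate_and_above,
       SUM(CASE WHEN Educational_Group = 'post graduate'      THEN No_of_Persons END) AS post_graduate
FROM mytable
GROUP BY State
ORDER BY State
"""
rows = con.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

As with PIVOT, the list of groups has to be written out by hand; a group that is not listed simply gets no column.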

Related

Crosstab query to get results of three tables based on results table

This may have been asked many times, but I searched last night and came up with nothing.
I have three lookup tables and one final table:
Table A
ID | City
1  | LA
2  | NY
3  | LV
Table B
ID | Job
11 | Programmer
22 | Engineer
33 | Database Administrator
44 | Cyber Security Analyst
Table C
ID  | Job level
111 | Junior
222 | Associate
333 | Senior
444 | Director
Final table
ID   | EmployeeName | City_ID | Job_ID | Level_ID
1000 | Susie        | 1       | 11     | 333
1001 | Nora         | 2       | 11     | 222
1002 | Jackie       | 2       | 22     | 111
1003 | Mackey       | 1       | 11     | 444
1004 | Noah         | 1       | 11     | 111
I'd like a crosstab query in Microsoft Access that returns the following result, one table per city:
LA Table
Jobs                   | Junior | Associate | Senior | Director
Programmer             | 1      | -         | 1      | 1
Engineer               | -      | -         | -      | -
Database Administrator | -      | -         | -      | -
Cyber Security Analyst | -      | -         | -      | -
How can I do it?
The best approach for this is always:
1. Create a "base" query that joins the base tables and returns all data columns that you will need for the crosstab query.
2. Run the crosstab query wizard using the "base" query as input.
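The two steps can be sketched outside Access to show what each one produces. The example below is a rough illustration in Python with an in-memory SQLite database, not Access itself (where the wizard would generate a TRANSFORM ... PIVOT statement). The table names cities, jobs, levels and employees, the column name JobLevel, and the view name base are assumptions standing in for Table A, Table B, Table C and the final table. Empty cells come out as 0 rather than "-".

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE cities (ID INTEGER, City TEXT);
INSERT INTO cities VALUES (1,'LA'),(2,'NY'),(3,'LV');
CREATE TABLE jobs (ID INTEGER, Job TEXT);
INSERT INTO jobs VALUES (11,'Programmer'),(22,'Engineer'),
    (33,'Database Administrator'),(44,'Cyber Security Analyst');
CREATE TABLE levels (ID INTEGER, JobLevel TEXT);
INSERT INTO levels VALUES (111,'Junior'),(222,'Associate'),(333,'Senior'),(444,'Director');
CREATE TABLE employees (ID INTEGER, EmployeeName TEXT,
                        City_ID INTEGER, Job_ID INTEGER, Level_ID INTEGER);
INSERT INTO employees VALUES (1000,'Susie',1,11,333),(1001,'Nora',2,11,222),
    (1002,'Jackie',2,22,111),(1003,'Mackey',1,11,444),(1004,'Noah',1,11,111);

-- Step 1: the "base" query joins the lookup tables onto the employee rows.
CREATE VIEW base AS
SELECT c.City, j.Job, l.JobLevel, e.EmployeeName
FROM employees AS e
JOIN cities AS c ON c.ID = e.City_ID
JOIN jobs   AS j ON j.ID = e.Job_ID
JOIN levels AS l ON l.ID = e.Level_ID;
""")

# Step 2: the crosstab. Rows are jobs, columns are levels, cells count employees.
# Starting from jobs with a LEFT JOIN keeps jobs that have nobody in the city.
crosstab_sql = """
SELECT j.Job,
       COUNT(CASE WHEN b.JobLevel = 'Junior'    THEN 1 END) AS Junior,
       COUNT(CASE WHEN b.JobLevel = 'Associate' THEN 1 END) AS Associate,
       COUNT(CASE WHEN b.JobLevel = 'Senior'    THEN 1 END) AS Senior,
       COUNT(CASE WHEN b.JobLevel = 'Director'  THEN 1 END) AS Director
FROM jobs AS j
LEFT JOIN base AS b ON b.Job = j.Job AND b.City = ?
GROUP BY j.ID, j.Job
ORDER BY j.ID
"""
la = con.execute(crosstab_sql, ("LA",)).fetchall()
for row in la:
    print(row)
```

Passing a different city name as the parameter gives the table for that city.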

How do I join or concat 2 dataframes where I get a new column for each row where the left_on/right_on key is the same?

Given 2 dataframes:
DF1
ID  | Name
123 | Jim
456 | Bob
DF2
record_id | model_year | make_desc | model_desc | vin
123       | 2008       | Chevy     | Tahoe      | cvin
456       | 2020       | Hyundai   | Elantra    | hvin
456       | 2018       | Ford      | F-150      | fvin
I want to merge/join/groupby (I am not sure which) so that the result is:
ID  | Name | model_year1 | make_desc1 | model_desc1 | vin1
123 | Jim  | 2008        | Chevy      | Tahoe       | cvin
456 | Bob  | 2020        | Hyundai    | Elantra     | hvin
model_year2 | make_desc2 | model_desc2 | vin2
2008        | Chevy      | Tahoe       | cvin
2018        | Ford       | F-150       | fvin
(The second table is just more columns of the first table; I couldn't figure out the markup.)
So it is kind of like a join: I need to join the data on a value, but when there are multiple matches I want to add columns instead of rows. The number of matches can't be known up front, so it might need to add 10 columns.
I tried a horizontal concat but it doesn't seem to match on value.
I have also read up a bunch on groupby, but I can't get it.
Any help would be appreciated.
I didn't find a straightforward way. Please try the approach explained and coded below. Note that it operates on a JobCode column rather than the vehicle columns shown in the question.
import pandas as pd
df3 = pd.merge(df1, df2, how='left', on='ID')  # Merge the two dfs
df3 = df3.groupby(['ID', 'Name'])['JobCode'].unique().reset_index()  # Collect each person's JobCode values
df3[['JobCode', 'JobCode_x']] = pd.DataFrame(df3['JobCode'].tolist(), index=df3.index)  # Spread them across columns
    ID Name JobCode JobCode_x
0  123  Jim     H1B      None
1  456  Bob     H1B       H2B
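For the vehicle data in the question, one possible alternative is to number the matches within each record_id and then move that number into the column names. This is a sketch rather than the answerer's method: it swaps in groupby().cumcount() and unstack(), and the helper column n is a made-up name. It copes with any number of matches per ID without listing the columns up front.

```python
import pandas as pd

df1 = pd.DataFrame({"ID": [123, 456], "Name": ["Jim", "Bob"]})
df2 = pd.DataFrame({
    "record_id": [123, 456, 456],
    "model_year": [2008, 2020, 2018],
    "make_desc": ["Chevy", "Hyundai", "Ford"],
    "model_desc": ["Tahoe", "Elantra", "F-150"],
    "vin": ["cvin", "hvin", "fvin"],
})
fields = ["model_year", "make_desc", "model_desc", "vin"]

# Number each match within its record_id: 1, 2, 3, ... ("n" is a hypothetical helper column).
df2 = df2.assign(n=df2.groupby("record_id").cumcount() + 1)

# Move the match number into the columns, giving (field, n) column pairs.
wide = df2.set_index(["record_id", "n"]).unstack("n")

# Order the columns match by match, then flatten to model_year1, make_desc1, ...
order = [(f, n) for n in range(1, int(df2["n"].max()) + 1) for f in fields]
wide = wide[order]
wide.columns = [f"{f}{n}" for f, n in wide.columns]

# Attach the widened vehicle columns to the people.
result = (
    df1.merge(wide.reset_index(), left_on="ID", right_on="record_id", how="left")
       .drop(columns="record_id")
)
print(result)
```

IDs with fewer matches than the maximum get missing values in the unused columns, so Jim's second set of vehicle columns stays empty here.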

Complex multi level hierarchical SQL

How can I achieve the results below using a query in SQL Server?
Table: shares_info
Complex multilevel hierarchy:
comp_name investee
APPLE MS
APPLE INTEL
APPLE MRF
APPLE GOOG
MS GOOG
MS MRF
MRF STF
MRF ABC
GOOG INTEL
GOOG TRF
GOOG XYZ
The idea is this: APPLE has invested in MS, INTEL, MRF and GOOG, and so on. The input below is a list of shares to sell, and shares without dependencies must be sold first; that is what my output conveys. If I want to sell GOOG, which depends on INTEL/TRF/XYZ, then before selling GOOG I need to sell (123, XYZ) and (456, INTEL). Next, if I want to sell APPLE, which depends on MS/INTEL/MRF/GOOG, then as per the input below I first need to sell INTEL/MRF/GOOG.
Table: shares_sell_info
Some input
id comp_name
123 APPLE
456 APPLE
123 XYZ
789 GOOG
456 INTEL
243 MRF
432 ABC
The ordering should be as below:
123 XYZ (XYZ does not have any dependency and hence should come at the top)
432 ABC (MRF has a dependency on ABC and hence ABC comes first)
243 MRF (MRF's dependencies are all taken care of and hence we have MRF)
456 INTEL (APPLE and GOOG have a dependency on INTEL and hence INTEL comes before them)
789 GOOG (At this point we can add GOOG because all its dependencies are already above it)
123 APPLE (APPLE has a dependency on GOOG and hence GOOG comes before APPLE)
456 APPLE
In the above ordering either of XYZ/ABC could have been first; it does not matter because neither has any dependency.
dbfiddle
WITH
cte_com as (SELECT * FROM (VALUES
(123 ,'APPLE'),
(456 ,'APPLE'),
(123 ,'XYZ'),
(789 ,'GOOG'),
(456 ,'INTEL'),
(243 ,'MRF'),
(432 ,'ABC')) as cte_com(id, comp))
,cte_temp as (SELECT * FROM (VALUES
('APPLE', 'MS'),
('APPLE', 'INTEL' ),
('APPLE', 'MRF' ),
('APPLE', 'GOOG' ),
('MS', 'GOOG' ),
('MS', 'MRF' ),
('MRF', 'STF' ),
('MRF', 'ABC' ),
('GOOG', 'INTEL' ),
('GOOG', 'TRF' ),
('GOOG', 'XYZ')) as cte_temp(one, two))
SELECT id, comp , one
, count(*) as count
from cte_com
left join cte_temp on cte_temp.one=cte_com.comp
group by id, comp, one
order by count(*)
It is unclear, though, whether this gives exactly the ordering you want.
What is the difference between 'XYZ' and 'ABC'?
Both have a count of 1 here.
output:
id  | comp  | one   | count
123 | XYZ   |       | 1
432 | ABC   |       | 1
456 | INTEL |       | 1
243 | MRF   | MRF   | 2
789 | GOOG  | GOOG  | 3
123 | APPLE | APPLE | 4
456 | APPLE | APPLE | 4
7 rows
I think @Luuk's idea is right with some slight modifications. Here is the query which worked for me.
select * from shares_sell_info as ssi
left join (
select comp_name, count(*) as count
from shares_info si
group by comp_name
UNION
select comp_name, 0 as count
from shares_info
where investee is null
) temp on temp.comp_name = ssi.comp_name
where id in (
)
order by count
Here is the actual answer for my problem that I got from another post.
https://stackoverflow.com/questions/60420380/assign-weight-based-on-hierarchical-depth
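The linked post is about weighting rows by hierarchical depth. A rough sketch of that idea, as understood here and not copied from the link, is to walk down from every company being sold, take the longest chain of investees as its depth, and sell in ascending depth. The example runs a recursive CTE against an in-memory SQLite database from Python rather than SQL Server; the CTE name walk and its columns are made up, and the hierarchy is assumed to contain no cycles.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE shares_info (comp_name TEXT, investee TEXT);
INSERT INTO shares_info VALUES
    ('APPLE','MS'),('APPLE','INTEL'),('APPLE','MRF'),('APPLE','GOOG'),
    ('MS','GOOG'),('MS','MRF'),('MRF','STF'),('MRF','ABC'),
    ('GOOG','INTEL'),('GOOG','TRF'),('GOOG','XYZ');
CREATE TABLE shares_sell_info (id INTEGER, comp_name TEXT);
INSERT INTO shares_sell_info VALUES
    (123,'APPLE'),(456,'APPLE'),(123,'XYZ'),(789,'GOOG'),
    (456,'INTEL'),(243,'MRF'),(432,'ABC');
""")

# walk(root, node, depth): every company reachable from root and how many
# investment steps away it is. MAX(depth) per root is the longest chain below it.
order_sql = """
WITH RECURSIVE walk(root, node, depth) AS (
    SELECT comp_name, comp_name, 0 FROM shares_sell_info
    UNION
    SELECT w.root, s.investee, w.depth + 1
    FROM walk AS w
    JOIN shares_info AS s ON s.comp_name = w.node
)
SELECT ssi.id, ssi.comp_name, MAX(w.depth) AS depth
FROM shares_sell_info AS ssi
JOIN walk AS w ON w.root = ssi.comp_name
GROUP BY ssi.id, ssi.comp_name
ORDER BY depth, ssi.comp_name, ssi.id
"""
rows = con.execute(order_sql).fetchall()
for row in rows:
    print(row)
```

A company's depth is always greater than the depth of anything it has invested in, so every dependency is listed before the company that depends on it. The order among companies of equal depth is arbitrary, so the result need not match the listing in the question line for line.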

Access SQL - Select only the last sequence

I have a table with an ID and multiple informative columns. Sometimes however, I can have multiple data for an ID, so I added a column called "Sequence". Here is a shortened example:
ID Sequence Name Tel Date Amount
124 1 Bob 873-4356 2001-02-03 10
124 2 Bob 873-4356 2002-03-12 7
124 3 Bob 873-4351 2006-07-08 24
125 1 John 983-4568 2007-02-01 3
125 2 John 983-4568 2008-02-08 13
126 1 Eric 345-9845 2010-01-01 18
So, I would like to obtain only these lines:
124 3 Bob 873-4351 2006-07-08 24
125 2 John 983-4568 2008-02-08 13
126 1 Eric 345-9845 2010-01-01 18
Could anyone give me a hand with building a SQL query to do this?
Thanks!
You can calculate the maximum sequence per ID using group by. Then you can join that back to the original data to keep only the rows with the maximum sequence.
Assuming your table is called t:
select t.*
from t inner join
(select id, MAX(sequence) as maxs
from t
group by id
) as tmax
on (t.id = tmax.id) and
(t.sequence = tmax.maxs)
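To see the group-by-then-join pattern at work on the sample rows, here is a small sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for Access here, which is an assumption of the example; the table name t comes from the answer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE t (ID INTEGER, Sequence INTEGER, Name TEXT, Tel TEXT, Date TEXT, Amount INTEGER)"
)
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (124, 1, "Bob", "873-4356", "2001-02-03", 10),
        (124, 2, "Bob", "873-4356", "2002-03-12", 7),
        (124, 3, "Bob", "873-4351", "2006-07-08", 24),
        (125, 1, "John", "983-4568", "2007-02-01", 3),
        (125, 2, "John", "983-4568", "2008-02-08", 13),
        (126, 1, "Eric", "345-9845", "2010-01-01", 18),
    ],
)

# The subquery finds the highest Sequence per ID; the join keeps only those rows.
last_sql = """
SELECT t.*
FROM t
INNER JOIN (SELECT ID, MAX(Sequence) AS maxs FROM t GROUP BY ID) AS tmax
        ON t.ID = tmax.ID AND t.Sequence = tmax.maxs
ORDER BY t.ID
"""
rows = con.execute(last_sql).fetchall()
for row in rows:
    print(row)
```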

MS Access SQL, summing a few fields and comparing the value

My table, called let's say "table1", looks as follows:
Area | Owner | Numberid | Average
1200 | Fed_G | 998 | 1400
1220 | Priv | 1001 | 1600
1220 | Local_G | 1001 | 1430
1220 | Prov_G | 1001 | 1560
1220 | Priv | 1674 | 1845
1450 | Prov_G | 1874 | 1982
Ideally, I would like to sum rows in the Average column if:
1. they have the same numberid (say three rows all had numberid = 1000; then their averages would be added together)
2. Area = 1220
Then take that result and append it to the existing table, with the Owner field set to "ALL".
I just started working with Access so I'm not really sure how to do this; this is my horrible attempt:
SELECT ind.Area, ind.Owner, ind.numberid,
(SELECT SUM([Average]) FROM [table1]
WHERE [numberid]=ind.numberid) AS Average
FROM [table1] AS ind
WHERE (((ind.numberid)>="1000" And (ind.numberid)<"10000") AND ((ind.Area)="1220"))
Can anyone guide me through what I should do? I'm not used to sql syntax.
I tried to use "ind" as a variable to compare to.
So far it gives me column names but no output.
I'm still unsure whether I understand what you want. So I'll offer you this query and let you tell us if the result is what you want or how it differs from what you want.
SELECT
t.Area,
'ALL' AS Owner,
t.Numberid,
Sum(t.Average) AS SumOfAverage
FROM table1 AS t
GROUP BY t.Area, 'ALL', t.Numberid;
Using your sample data, that query gives me this result set.
Area Owner Numberid SumOfAverage
1200 ALL 998 1400
1220 ALL 1001 4590
1220 ALL 1674 1845
1450 ALL 1874 1982
I could probably give you a better answer if you improved the wording of your question.
However, to sum the averages you need to select the average column and group by the numberid and Area columns. Since the Owner field is going to be set to "ALL", I guess it doesn't matter in this query that I'm writing:
SELECT numberid, area, SUM(average)
FROM table1
WHERE owner = 'your-desired-owner-equal-to-all'
GROUP BY numberid, area
ORDER BY numberid, area
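Putting the question's conditions together (only Area 1220, summed per Numberid, appended with Owner set to "ALL"), one way to read the request is as an INSERT ... SELECT. The sketch below tries that reading against an in-memory SQLite database from Python. It is an illustration under that assumption, not Access syntax verified against the asker's database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (Area INTEGER, Owner TEXT, Numberid INTEGER, Average INTEGER)")
con.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1200, "Fed_G", 998, 1400),
        (1220, "Priv", 1001, 1600),
        (1220, "Local_G", 1001, 1430),
        (1220, "Prov_G", 1001, 1560),
        (1220, "Priv", 1674, 1845),
        (1450, "Prov_G", 1874, 1982),
    ],
)

# Sum Average per Numberid within Area 1220 and append the totals as Owner = 'ALL'.
con.execute("""
INSERT INTO table1 (Area, Owner, Numberid, Average)
SELECT Area, 'ALL', Numberid, SUM(Average)
FROM table1
WHERE Area = 1220
GROUP BY Area, Numberid
""")

appended = con.execute(
    "SELECT Area, Owner, Numberid, Average FROM table1 WHERE Owner = 'ALL' ORDER BY Numberid"
).fetchall()
total_rows = con.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(appended, total_rows)
```

Area and Numberid are stored as numbers here, so the comparison uses 1220 without quotes.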