SQLite impute missing values by mean for every group

I have an SQLite table as shown below.
students  grades
Nick      34
Nick      42
Nick      86
Nick      Null
John      38
John      12
John      74
John      Null
Colin     87
Colin     23
Colin     46
Colin     42
What I want to do is impute Null values with the mean of each student's grades.
For example, the missing value for Nick would be 54 (the mean of 34, 42 and 86) and for John 41.3.
How can I do this in SQL code? I am using SQLite.

Use a correlated subquery in the UPDATE statement:
UPDATE tablename AS t1
SET grades = (
    SELECT ROUND(AVG(t2.grades), 1)
    FROM tablename AS t2
    WHERE t2.students = t1.students
)
WHERE t1.grades IS NULL;
See the demo.
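
To try the statement without a database file, here is a minimal sketch using Python's built-in sqlite3 module and the sample data from the question. The table name tablename follows the answer; the REAL column type is an assumption, since the question does not show the schema. The outer table is referenced by name rather than through the t1 alias, which is only a stylistic variation of the same correlated subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column type REAL is an assumption; the question does not show the schema.
conn.execute("CREATE TABLE tablename (students TEXT, grades REAL)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        ("Nick", 34), ("Nick", 42), ("Nick", 86), ("Nick", None),
        ("John", 38), ("John", 12), ("John", 74), ("John", None),
        ("Colin", 87), ("Colin", 23), ("Colin", 46), ("Colin", 42),
    ],
)

# Remember which rows were NULL so the imputed values can be read back.
null_rowids = [
    row[0]
    for row in conn.execute("SELECT rowid FROM tablename WHERE grades IS NULL")
]

# AVG() skips NULLs, so each group's mean uses only the known grades.
conn.execute("""
    UPDATE tablename
    SET grades = (
        SELECT ROUND(AVG(t2.grades), 1)
        FROM tablename AS t2
        WHERE t2.students = tablename.students
    )
    WHERE grades IS NULL
""")

# Collect the imputed grade per student.
imputed = {}
for rowid in null_rowids:
    student, grade = conn.execute(
        "SELECT students, grades FROM tablename WHERE rowid = ?", (rowid,)
    ).fetchone()
    imputed[student] = grade

print(imputed)
```

Colin has no missing grade, so only the rows for Nick and John are changed.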

Related

Save a comma separated array in a string in a table [duplicate]

This question already has answers here:
Split function equivalent in T-SQL?
(16 answers)
Closed 2 years ago.
I inherited a table that has these columns
ID, Name, Subjects
-- ---- --------
33 Mike Math,English,Physics
24 Paul Art,French,Med,English,Math
58 Sami Physics,Biology
22 Nora Math,English,Art
76 Mona Math,English,French,Med,Physics
39 Lila Physics
19 Dave Math,Biology,Physics
48 Jade English,French,Physics
82 Mark Med,Biology,Physics
23 Nina Biology,English,Physics
I am trying to break this into my structured table.
ID, Name, Subject
-- ---- --------
33 Mike Math
33 Mike English
33 Mike Physics
24 Paul Art
24 Paul French
24 Paul Med
24 Paul English
I tried using STRING_SPLIT in the SELECT list:
SELECT ID, Name, STRING_SPLIT(Subjects, ',') AS Subject
FROM Students
but that did not work:
'STRING_SPLIT' is not a recognized built-in function name.
How can I split these subjects into rows?
This script should generate the table and data
DECLARE @Students AS TABLE (ID int, Name varchar(4), Subjects varchar(100))
INSERT INTO @Students (ID, Name, Subjects)
VALUES
(33,'Mike','Math,English,Physics'),
(24,'Paul','Art,French,Med,English,Math'),
(58,'Sami','Physics,Biology'),
(22,'Nora','Math,English,Art'),
(76,'Mona','Math,English,French,Med,Physics'),
(39,'Lila','Physics'),
(19,'Dave','Math,Biology,Physics'),
(48,'Jade','English,French,Physics'),
(82,'Mark','Med,Biology,Physics'),
(23,'Nina','Biology,English,Physics')
SELECT * FROM @Students
Response to suggested duplicate
Although this question was closed as a duplicate of "Split function equivalent in T-SQL?", it is actually a different problem.
That question covers the simple case, which a plain FROM string_split() can handle. I got the answer thanks to Gordon, and I thought it might help others with the same issue. If you have a similar problem, you may find the answer below.

Just another option using a little XML
Example
Declare @YourTable Table ([ID] varchar(50),[Name] varchar(50),[Subjects] varchar(50))
Insert Into @YourTable Values
 (33,'Mike','Math,English,Physics')
,(24,'Paul','Art,French,Med,English,Math')
,(58,'Sami','Physics,Biology')
,(22,'Nora','Math,English,Art')
,(76,'Mona','Math,English,French,Med,Physics')
,(39,'Lila','Physics')
,(19,'Dave','Math,Biology,Physics')
,(48,'Jade','English,French,Physics')
,(82,'Mark','Med,Biology,Physics')
,(23,'Nina','Biology,English,Physics')
Select A.ID
      ,A.Name
      ,B.*
 From @YourTable A
 Cross Apply (
    Select RetSeq = row_number() over (order by 1/0)
          ,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
    From ( values (cast('<x>' + replace((Select replace([Subjects],',','§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.'))) as A(x)
    Cross Apply x.nodes('x') AS B(i)
 ) B
Returns
ID Name RetSeq RetVal
33 Mike 1 Math
33 Mike 2 English
33 Mike 3 Physics
24 Paul 1 Art
24 Paul 2 French
24 Paul 3 Med
24 Paul 4 English
24 Paul 5 Math
58 Sami 1 Physics
58 Sami 2 Biology
22 Nora 1 Math
22 Nora 2 English
22 Nora 3 Art
76 Mona 1 Math
76 Mona 2 English
76 Mona 3 French
76 Mona 4 Med
76 Mona 5 Physics
39 Lila 1 Physics
19 Dave 1 Math
19 Dave 2 Biology
19 Dave 3 Physics
48 Jade 1 English
48 Jade 2 French
48 Jade 3 Physics
82 Mark 1 Med
82 Mark 2 Biology
82 Mark 3 Physics
23 Nina 1 Biology
23 Nina 2 English
23 Nina 3 Physics

You can use string_split() in SQL Server 2016 and later (the database must be at compatibility level 130 or higher):
SELECT ID, Name, ss.value as subject
FROM Students s CROSS APPLY
string_split(s.subjects, ',') ss;
You can also play with JSON or define your own split() function, although in older versions, I would just use a recursive CTE:
with cte as (
    select s.id,
           convert(varchar(max), null) as subject,
           convert(varchar(max), subjects + ',') as rest
    from students s
    union all
    select id,
           left(rest, charindex(',', rest) - 1),
           stuff(rest, 1, charindex(',', rest), '')
    from cte
    where rest <> ''
)
select *
from cte
where subject is not null;
Here is a db<>fiddle.
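
The recursive CTE idea carries over to other engines. The following is a minimal sketch of the same technique run against SQLite through Python's built-in sqlite3 module, chosen only because it needs no server. It is not T-SQL: left/charindex/stuff are replaced by SQLite's substr and instr, and the seq column is an addition (not in the answer) used to keep the subjects in their original order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, subjects TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [
        (33, "Mike", "Math,English,Physics"),
        (39, "Lila", "Physics"),
    ],
)

split_sql = """
WITH RECURSIVE cte(id, name, seq, subject, rest) AS (
    -- Anchor: nothing split off yet; append a trailing comma as terminator.
    SELECT id, name, 0, NULL, subjects || ','
    FROM students
    UNION ALL
    -- Step: peel the text before the first comma off the front of rest.
    SELECT id, name, seq + 1,
           substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM cte
    WHERE rest <> ''
)
SELECT id, name, subject
FROM cte
WHERE subject IS NOT NULL
ORDER BY id, seq
"""

rows = conn.execute(split_sql).fetchall()
for row in rows:
    print(row)
```

Each recursion step produces one output row per remaining subject, and the recursion stops for a given ID once rest is empty.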

Group columns in query

I have a query where I fetch the following columns:
ID Name Age Hobby
ID, Name and Age come from Table A
Hobby comes from Table B
Example of results I can get is the following:
ID Name Age Hobby
0 John 35 Fishing
0 John 35 Tennis
0 John 35 Hiking
1 Jane 31 Fishing
2 Nate 42 Fishing
2 Nate 42 Tennis
What I would like to have as result is the following instead:
ID Name Age Hobby
0 John 35 Fishing, Tennis, Hiking
1 Jane 31 Fishing
2 Nate 42 Fishing, Tennis
Any ideas of how to achieve that?

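Try this (the subquery inside STUFF needs its own parentheses, and STUFF removes the leading ', '):
;WITH CTE AS(
    SELECT DISTINCT ID,NAME,AGE
    FROM TableName
)
SELECT *,
    STUFF((SELECT ', ' + Hobby FROM TableName t1 WHERE t1.ID=CTE.ID FOR XML PATH('')),1,2,'') AS Hobby
FROM CTE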
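
For comparison, here is a minimal sketch of the same grouping in SQLite via Python's built-in sqlite3 module. It swaps the FOR XML PATH trick for SQLite's group_concat aggregate, so it illustrates the idea rather than the T-SQL above. The table names people and hobbies are hypothetical stand-ins for "Table A" and "Table B".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table names standing in for "Table A" and "Table B".
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, age INTEGER)")
conn.execute("CREATE TABLE hobbies (person_id INTEGER, hobby TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(0, "John", 35), (1, "Jane", 31), (2, "Nate", 42)],
)
conn.executemany(
    "INSERT INTO hobbies VALUES (?, ?)",
    [
        (0, "Fishing"), (0, "Tennis"), (0, "Hiking"),
        (1, "Fishing"),
        (2, "Fishing"), (2, "Tennis"),
    ],
)

# One output row per person; the hobbies are joined with ', '.
grouped = conn.execute("""
    SELECT p.id, p.name, p.age, group_concat(h.hobby, ', ') AS hobby
    FROM people AS p
    JOIN hobbies AS h ON h.person_id = p.id
    GROUP BY p.id, p.name, p.age
    ORDER BY p.id
""").fetchall()

for row in grouped:
    print(row)
```

The order of the hobbies inside each concatenated string is not guaranteed by this query, so code that depends on it should not rely on the order shown.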

query to find more than one name with different values

This is my table. For each row I need the number of rows that share the same name. I used COUNT in my query, but the name Timur appears with two different companies, so its count should be 2, not 1.
Name ID Company Name CompanyID Role Name
Ahmed 73 King & Spalding 55 Counsel
Timur 78 Chance CIS Ltd 39 Partner
Timur 78 Clifford LLP 28 Counsel
Rahail 80 Reed Smith ltd 97 Partner
The output should look like this:
Name ID Company Name CompanyID Role Name count
Ahmed 73 King & Spalding 55 Counsel 1
Timur 78 Chance CIS Ltd 39 Partner 2
Timur 78 Clifford LLP 28 Counsel 2
Rahail 80 Reed Smith ltd 97 Partner 1

I am assuming that Name and ID match each other. Because different people could share the same name, I partition by ID:
SELECT
*,
count(*) over (partition by ID) as [count]
FROM yourtable

Use correlated sub-query:
select t.*, (select count(*) from tablename where name = t.name) as count
from tablename t

If you're using SQL Server 2005 or above then you can use a window function to achieve this easily:
SELECT
T.Name,
T.ID,
T.CompanyName,
T.CompanyID,
T.RoleName,
COUNT(*) OVER (PARTITION BY T.Name)
FROM
My_Table T
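
The window-function and correlated-subquery answers can be checked against each other. The sketch below does so in SQLite through Python's built-in sqlite3 module, using the rows from the question. The snake_case column names are assumptions, and window functions require SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumptions based on the headers in the question.
conn.execute("""
    CREATE TABLE my_table (
        name TEXT, id INTEGER, company_name TEXT,
        company_id INTEGER, role_name TEXT
    )
""")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        ("Ahmed", 73, "King & Spalding", 55, "Counsel"),
        ("Timur", 78, "Chance CIS Ltd", 39, "Partner"),
        ("Timur", 78, "Clifford LLP", 28, "Counsel"),
        ("Rahail", 80, "Reed Smith ltd", 97, "Partner"),
    ],
)

# Window function: count the rows in each id partition, keeping every row.
windowed = conn.execute("""
    SELECT name, company_name, COUNT(*) OVER (PARTITION BY id) AS cnt
    FROM my_table
    ORDER BY id, company_id
""").fetchall()

# Correlated subquery: recount the matching rows for each outer row.
correlated = conn.execute("""
    SELECT t.name, t.company_name,
           (SELECT COUNT(*) FROM my_table WHERE id = t.id) AS cnt
    FROM my_table AS t
    ORDER BY t.id, t.company_id
""").fetchall()

for row in windowed:
    print(row)
```

Both queries keep one output row per input row, which is what distinguishes them from a plain GROUP BY.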

How to select students who got above average?

How can I list all students whose grade is above the average grade of their group in an SQL table? We have 6 group_ids, so there are six different average grades.
group_id student grade
1 James 85
1 Adam 96
2 Tom 56
2 Jane 89
2 Anny 90
Result:
group_id student grade
1 Adam 96
2 Jane 89
2 Anny 90

ashkufaraz's answer is closer but not quite right
select group_id,student,grade from students one where grade >
(select avg(grade) from students two where two.group_id = one.group_id)

The question is just tagged SQL, so this is an answer using standard SQL:
One option is to use a window function:
select group_id,student,grade
from (
    select group_id,student,grade,
           avg(grade) over (partition by group_id) as group_avg
    from students
) t
where grade > group_avg;
This has the additional benefit that you can also display the group average along with the result with no additional join or sub-select.
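
As a quick check, both approaches can be run on the five sample rows from the question. This is a minimal sketch using SQLite through Python's built-in sqlite3 module; the INTEGER and TEXT column types are assumptions, and the window-function variant requires SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (group_id INTEGER, student TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(1, "James", 85), (1, "Adam", 96), (2, "Tom", 56), (2, "Jane", 89), (2, "Anny", 90)],
)

# Correlated subquery: compare each grade with its own group's average.
by_subquery = conn.execute("""
    SELECT group_id, student, grade
    FROM students AS one
    WHERE grade > (
        SELECT AVG(grade) FROM students AS two WHERE two.group_id = one.group_id
    )
    ORDER BY group_id, grade
""").fetchall()

# Window function: attach the group average to every row, then filter.
by_window = conn.execute("""
    SELECT group_id, student, grade
    FROM (
        SELECT group_id, student, grade,
               AVG(grade) OVER (PARTITION BY group_id) AS group_avg
        FROM students
    ) AS t
    WHERE grade > group_avg
    ORDER BY group_id, grade
""").fetchall()

for row in by_subquery:
    print(row)
```

Group 1 averages 90.5 and group 2 averages about 78.3, so Adam, Jane and Anny are returned, matching the expected result in the question.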

select same data from two columns in a table, and using one sql statement to show all data

I'm new to SQL and I have a question.
The table looks like this:
Table Name:EM
ID name Birth High
1 Tom 11/23 65
2 Mary 11/23 65
3 Bill 03/02 55
4 Liny 01/08 45
5 Kevin 05/16 50
6 Lee 05/16 50
But I only need the rows below:
ID name Birth High
1 Tom 11/23 65
2 Mary 11/23 65
5 Kevin 05/16 50
6 Lee 05/16 50
So far I have used two separate, clumsy queries to get this data:
select * from em where birth = '11/23' and high = '65';
select * from em where birth = '05/16' and high = '50';
Please show me how to get this result in one SQL statement. Thank you very much.

You want IN with row values (supported in MySQL and PostgreSQL, for example, but not in SQL Server):
select * from em where (birth, high) in (('11/23','65'),('05/16','50'));

use OR to combine them:
select * from em where (birth = '11/23' and high = '65') or (birth = '05/16' and high = '50');

You may want to start with an SQL tutorial.
Use IN (and BETWEEN for ranges) for this kind of check.
Your query could follow this template:
SELECT * FROM YOUR_TABLE WHERE COL1 IN (DATE_HERE,ID_HERE) AND/OR COL2 IN (DATE_HERE,ID_HERE)

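The IN operator matches a column against multiple values. Note that with two independent IN lists, cross combinations such as birth '11/23' with high '50' would also match: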
select * from em where birth in ('11/23','05/16') and high in ('65','50');

You can use the OR operator. Parentheses are needed because AND binds more tightly than OR:
select * from em
where
(birth = '11/23' or birth = '05/16')
and (high = '65' or high = '50');
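
The answers above differ in whether they match birth and high as a pair or as two independent conditions. The sketch below shows the difference in SQLite through Python's built-in sqlite3 module. Row 7 is a hypothetical extra row, not part of the question, added only to expose that difference; storing high as TEXT is also an assumption, made to match the quoted literals in the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# high is stored as TEXT (an assumption) to match the quoted literals.
conn.execute("CREATE TABLE em (id INTEGER, name TEXT, birth TEXT, high TEXT)")
conn.executemany(
    "INSERT INTO em VALUES (?, ?, ?, ?)",
    [
        (1, "Tom", "11/23", "65"),
        (2, "Mary", "11/23", "65"),
        (3, "Bill", "03/02", "55"),
        (4, "Liny", "01/08", "45"),
        (5, "Kevin", "05/16", "50"),
        (6, "Lee", "05/16", "50"),
        (7, "Extra", "11/23", "50"),  # hypothetical row, not in the question
    ],
)

# Paired match: each (birth, high) combination is tested as a unit.
paired_ids = [row[0] for row in conn.execute("""
    SELECT id FROM em
    WHERE (birth = '11/23' AND high = '65')
       OR (birth = '05/16' AND high = '50')
    ORDER BY id
""")]

# Independent match: any listed birth combined with any listed high.
independent_ids = [row[0] for row in conn.execute("""
    SELECT id FROM em
    WHERE birth IN ('11/23', '05/16') AND high IN ('65', '50')
    ORDER BY id
""")]

print(paired_ids)
print(independent_ids)
```

On the original six rows both queries return the same result; only the paired form keeps excluding a row such as the hypothetical row 7.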
