Oracle - Group By Creating Duplicate Rows

Oracle - Group By Creating Duplicate Rows - sql

I have a query that looks like this:
select nvl(trim(a.code), 'Blanks') as Ward, count(b.apcasekey) as UNSP, count(c.apcasekey) as GRAPH,
count(d.apcasekey) as "ANI/PIG",
(count(b.apcasekey) + count(c.apcasekey) + count(d.apcasekey)) as "TOTAL ACTIVE",
count(a.apcasekey) as "TOTAL OPEN" from (etc...)
group by a.code
order by Ward
The reason I have nvl(trim(a.code), 'Blanks') as Ward is that sometimes a.code is a blank string, sometimes it's a null.
The problem is that when I use the Group By statement, I can't use Ward or I get the error
Ward: Invalid Identifier
I can only use a.code so I get 2 rows for 'Blanks', as per below
1 Blanks 7 0 0 7 7
2 Blanks 23 1 1 25 30
3 W01 75 4 0 79 91
4 W02 62 1 0 63 72
5 W03 140 2 0 142 162
6 W04 6 1 0 7 7
7 W05 46 0 1 47 48
8 W06 322 46 1 369 425
9 W07 91 0 1 92 108
10 W08 93 2 0 95 104
11 W09 28 1 0 29 30
12 W10 25 0 0 25 28
What I need, is for the row with 'Blanks' to combined into 1 row. Little help?
Thanks.

You can not use the alias in the GROUP BY, but you can use the expression that builds the value:
GROUP BY nvl(trim(a.code), 'Blanks')

Related

How to group merge columns based on one row identifier with pandas?

I have a dataset, in which it has a lot of entries for a single location. I am trying to find a way to sum up all of those entries without affecting any of the other columns. So, just in case I'm not explaining it well enough, I want to use a dataset like this:
Locations Cyclists maleRunners femaleRunners maleCyclists femaleCyclists
Bedford 10 12 14 17 27
Bedford 11 40 34 9 1
Bedford 7 1 2 3 3
Leeds 1 1 2 0 0
Leeds 20 13 6 1 1
Bath 101 20 33 41 3
Bath 11 2 3 1 0
And turn it into something like this:
Locations Cyclists maleRunners femaleRunners maleCyclists femaleCyclists
Bedford 28 53 50 29 31
Leeds 21 33 39 1 1
Bath 111 22 36 42 3
Now, I have read up that a groupby should work in a way, but from my understanding a group by will change it into 2 columns and I don't particularly want to make hundreds of 2 columns and then merge it all. Surely there's a much simpler way to do this?

IIUC, groupby+sum will work for you:
df.groupby('Locations',as_index=False,sort=False).sum()
Output:
Locations Cyclists maleRunners femaleRunners maleCyclists femaleCyclists
0 Bedford 28 53 50 29 31
1 Leeds 21 14 8 1 1
2 Bath 112 22 36 42 3

Pivot table should work for you.
new_df = pd.pivot_table(df, values=['Cyclists', 'maleRunners', 'femalRunners',
'maleCyclists','femaleCyclists'],index='Locations', aggfunc=np.sum)

SQL LOGIC using 3conditions

course_completions CC
id coursemodid userid state timemodified
370 23 2 1 1433582890
329 24 89 1 1427771915
333 30 39 1 1428309816
332 32 39 1 1428303307
327 33 40 1 1427689703
328 34 89 1 1427710711
303 35 41 1 1410258482
358 36 99 1 1432020067
365 25 2 1 1433142455
304 26 69 1 1410717866
353 37 95 1 1430387005
416 38 2 1 1438972465
300 27 70 1 1409824001
302 29 74 1 1412055704
297 30 2 1 1409582123
301 133 41 1 1410255923
336 133 91 1 1428398435
364 133 40 1 1433142348
312 133 85 1 1425863621
course_modules CM
id course
23 6
24 6
25 6
26 6
27 6
28 6
29 8
30 8
31 8
32 8
33 8
34 5
35 5
36 5
37 5
38 5
39 9
40 9
41 9
course_mod_settings CMS
id course modinstance
27 8 30
28 8 31
29 8 32
30 8 33
31 6 23
32 6 24
33 6 25
34 6 26
35 6 27
36 6 28
37 9 39
38 9 40
39 9 41
I need the count of each user has Completed modules, Inprocess modules and Notstarted modules for each course, where getting the count of userids from table CC by taking courseia from table CM, get number of modules that an user has completed from each course.
(A course can have morethan one module and a course can have number of users attempted all modules, few modules or not attempted at all).
So, I need number of users - has done number of modules - in a course. (3 logics)
Completed.Users means : If number of modules attempted is equal to number of modinstance from table CMS (ex: no. of modules attempted by a user per course= 9, no.modinstance = 9. Because 7 is not equal to 9, They are completed.)
Inprocess.Users means : Number of modules attempted should be >0, but not equal to [count(modinstance) per course] (ex: no. of modules attempted by a user per course= 7 , no.modinstance = 9. Because 7 is not equal to 9, They are Inprocess.)
Notstarted.Users means : Number of modules attempted should be equal to 0, (ex: no. of modules attempted by a user per course= 0. They are Notstarted).
OUTPUT :
Course No.Completed.Users No.Inprocess.Users No.Notstarted.Users
5 65 32 6
6 40 12 15
8 43 56 0
9 0 7 9
Sir, this is a very critical logic that I was trying, I couldn't get a solution. I hope stackoverflow developers could help me out. I tried with my query :
SELECT cm.course AS "Course",
(CASE WHEN
(SELECT count(cms.id) FROM course_mod_settings cms) =
(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 )
THEN COUNT(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 ) END) AS "No.Completed.Users",
(CASE WHEN
(SELECT count(cms.id) FROM course_mod_settings cms) > 0 AND
(SELECT count(cms.id) FROM course_mod_settings cms) !=
(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 )
THEN COUNT(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 ) END) AS "No.Inprocess.Users",
(CASE WHEN
(SELECT count(cms.id) FROM course_mod_settings cms) = 0
THEN COUNT(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 ) END) AS "No.Notstarted.Users"
FROM
mdl_course c
GROUP BY c.id

SQL Fiddle
SELECT course AS "Course",
SUM(CASE WHEN completion_count = module_count THEN 1 ELSE 0 END) AS "No.Completed.Users",
SUM(CASE WHEN completion_count > 0 AND completion_count < module_count THEN 1 ELSE 0 END) AS "No.Inprocess.Users",
SUM(CASE WHEN completion_count = 0 THEN 1 ELSE 0 END) AS "No.Notstarted.Users"
FROM (SELECT course, COUNT(*) AS module_count
FROM course_modules cm
GROUP BY course) course_module_counts JOIN
(SELECT cm.course AS courseid, users.id AS userid, SUM(CASE WHEN cc.state = 1 THEN 1 ELSE 0 END) completion_count
FROM ((SELECT DISTINCT userid AS id FROM course_completions) users CROSS JOIN course_modules cm) LEFT JOIN course_completions cc ON users.id = cc.userid AND cc.coursemodid = cm.id
GROUP BY cm.course, users.id) course_completion_counts
ON course_module_counts.course = course_completion_counts.courseid
GROUP BY course
gives this output, which matches the limited dataset that you provided in your question.
| course | No.Completed.Users | No.Inprocess.Users | No.Notstarted.Users |
|--------|--------------------|--------------------|---------------------|
| 5 | 0 | 5 | 7 |
| 6 | 0 | 4 | 8 |
| 8 | 0 | 4 | 8 |
| 9 | 0 | 0 | 12 |

how to update duplicate rows in a column to a new values

I will explain my problem briefly
have duplicate rino like below (actually this rino is the serial number in front end)
chqid rino branchid
----- ---- --------
876 6 2
14 6 2
18 10 2
828 10 2
829 11 2
19 11 2
830 12 2
20 12 2
78 40 2
1092 40 2
1094 41 2
79 41 2
413 43 2
1103 43 2
82 44 2
1104 44 2
1105 45 2
83 45 2
91 46 2
1106 46 2
here in my case I don't want to delete these duplicate rino instead of that I planned to update the rino having max date(column not specified in the above sample actually a date column is there) to the next rino number
what exactly I meant is :
if I sort out the above result according to the max(date) I will get
chqid rino branchid
----- ---- --------
876 6 2
828 10 2
19 11 2
830 12 2
1092 40 2
79 41 2
413 43 2
82 44 2
83 45 2
1106 46 2
(NOTE : total number of duplicate rows are 10 in branchid=2)
the last entered rino in the table for branchid=2 is 245
So I just want to update the 10 rows(Of column rino) with numbers starting from 246 to 255( just added 245+10 like this select lastno+ generate_series(1,10) nos from tab where cola=4 and branchid = 2 and vrid=20;)
Expected Output:
chqid rino branchid
----- ---- --------
876 246 2
828 247 2
19 248 2
830 249 2
1092 250 2
79 251 2
413 252 2
82 253 2
83 254 2
1106 255 2
using postgresql

Finally I found a solution, am using dynamic-sql to solve my issue
do
$$
declare
arow record;
begin
for arow in
select chqid,rino,branchid from (
select chqid,rino::int ,vrid,branchid ,row_number()over (partition by rino::int ) rn
from tab
where vrid =20
and branchid = 2)t
where rn >1
loop
execute format('
update tab
set rino=(select max(rino::int)+1 from gtab19 where acyrid=4 and branchid = 2 and vrid=20)
where chqid=%s
',arow.chqid);
end loop;
end;
$$;

Getting more data while converting data int to float and doing division and Multiplying with int?

I have three columns as shown in below tableA
Student Day Shifts
129 11 4
91 9 6
166 19 8
164 26 12
146 11 6
147 16 8
201 8 3
164 4 2
186 8 6
165 7 4
171 10 4
104 5 4
1834 134 67
I am writing a tvf to calculate Value of Points generated for Students as below
ALTER function Statagic(
#StartDate date
)
RETURNS TABLE
AS
RETURN
(
with src as
( select
Division=case when Shifts=0 then 0 else cast(Day as float)/cast(Shifts as float) end,*
from TableA
)
,tgt as
(select *,Points=Student*Division from src
)
select * from tgt)
When i execute above tvf(select * from Statagic('3/16/2014'))
My output is below
129 11 4 2.75 354.75
91 9 6 1.5 136.5
166 19 8 2.375 394.25
164 26 12 2.16666666666667 355.333333333333
146 11 6 1.83333333333333 267.666666666667
147 16 8 2 294
201 8 3 2.66666666666667 536
164 4 2 2 328
186 8 6 1.33333333333333 248
165 7 4 1.75 288.75
171 10 4 2.5 427.5
104 5 4 1.25 130
1834 134 67 2 3668
Note :
If you see the last row for three columns in the table is the total of rest column.So when you see the last row in the Output of TVF for last two columns when i am adding i am not getting same data i am getting more.
Guys please help me i am struggling to fix this bug i tried in all ways but i am unable to fix it.
select 354.75+136.5+394.25+355.333333333333+267.666666666667+294+536+328+248+288.75+427.5+130=3760.750000000000
3668 is not euql to 3760.75(I am getting more 100 value)

Exclude the specific kind of record

I am using SQL Server 2008 R2. I do have records as below in a table :
Id Sys Dia Type UniqueId
1 156 20 first 12345
2 157 20 first 12345
3 150 15 last 12345
4 160 17 Average 12345
5 150 15 additional 12345
6 157 35 last 891011
7 156 25 Average 891011
8 163 35 last 789521
9 145 25 Average 789521
10 156 20 first 963215
11 150 15 last 963215
12 160 17 Average 963215
13 156 20 first 456878
14 157 20 first 456878
15 150 15 last 456878
16 160 17 Average 456878
17 150 15 last 246977
18 160 17 Average 246977
19 150 15 additional 246977
Regarding this data, these records are kind of groups that have common UniqueId. The records can be of type "first, last, average and additional". Now, from these records I want to select "average" type of records only if they have "first" or "additional" kind of reading in group. Else I want to exclude them from selection..
The expected result is :
Id Sys Dia Type UniqueId
1 156 20 first 12345
2 157 20 first 12345
3 150 15 last 12345
4 160 17 Average 12345
5 150 15 additional 12345
6 157 35 last 891011
7 163 35 last 789521
8 156 20 first 963215
9 150 15 last 963215
10 160 17 Average 963215
11 156 20 first 456878
12 157 20 first 456878
13 150 15 last 456878
14 160 17 Average 456878
15 150 15 last 246977
16 160 17 Average 246977
17 150 15 additional 246977
In short, I don't want to select the record that have type="Average" and have only "last" type of record with same UniqueId. Any solution?

Using EXISTS operator along correlated sub-query:
SELECT * FROM dbo.Table1 t1
WHERE [Type] != 'Average'
OR EXISTS (SELECT * FROM Table1 t2
WHERE t1.UniqueId = t2.UniqueId
AND t1.[Type] = 'Average'
AND t2.[Type] IN ('first','additional'))
SQLFiddle DEMO

Try something like this:
SELECT * FROM MyTable WHERE [Type] <> 'Average'
UNION ALL
SELECT * FROM MyTable T WHERE [Type] = 'Average'
AND EXISTS (SELECT * FROM MyTable
WHERE [Type] IN ('first', 'additional')
AND UniqueId = T.UniqueId)
The first SELECT statement gets all records except the ones with Type = 'Average'. The second SELECT statement gets only the Type = 'Average' records that have at least one record with the same UniqueId, that is of type 'first' or 'additional'.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Oracle - Group By Creating Duplicate Rows - sql

You can not use the alias in the GROUP BY, but you can use the expression that builds the value: GROUP BY nvl(trim(a.code), 'Blanks')

Related

How to group merge columns based on one row identifier with pandas?

SQL LOGIC using 3conditions

how to update duplicate rows in a column to a new values

Getting more data while converting data int to float and doing division and Multiplying with int?

Exclude the specific kind of record

Categories

Resources