SQL percentage calculation over the hour

I have a table covering thousands of devices, similar to the one below. Using this table, I want to calculate, on an hourly basis, the percentage of time each device spends in each type of location.
(Values are given as an example.)
device            geohash  gridtype  total_hour_count  total_day_count  avg_spent_hour
67a47cd76baff7e2  sxk9g3   Work      500               25               20.00
67a47cd76baff7e2  swy9g3   Home      590               27               18.00
67a47cd76baff7e2  szbvfd   Other     420               18               9.28
02d171810d7ae1f5  swdvdf   Home      274               30               18.54
02d171810d7ae1f5  sdefvx   Work      184               22               17.51
02d171810d7ae1f5  dfvcxv   Other     122               19               14.12
...               ...      ...       ...               ...              ...
As an example, the desired output:
deviceid          home_percent  work_percent  other_percent
67a47cd76baff7e2  35            35            30
02d171810d7ae1f5  50            25            25
784faeff1c8b76c1  90            5             5
28fa9ca3dfff8a6f  80            10            10
f2f6324d5149e336  80            0             20
d84410d139981c19  25            50            25
...               ...           ...           ...
Thanks for your help.
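One possible approach, sketched under assumptions the question does not confirm, is conditional aggregation: sum total_hour_count per gridtype for each device and divide by the device's overall total. The table name device_stats is hypothetical, total_hour_count is assumed to be the right measure of time spent, and the SQL is run through Python's sqlite3 module only so that the sketch is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; columns follow the sample table in the question.
conn.execute(
    "CREATE TABLE device_stats ("
    " device TEXT, geohash TEXT, gridtype TEXT,"
    " total_hour_count INTEGER, total_day_count INTEGER, avg_spent_hour REAL)"
)
rows = [
    ("67a47cd76baff7e2", "sxk9g3", "Work", 500, 25, 20.00),
    ("67a47cd76baff7e2", "swy9g3", "Home", 590, 27, 18.00),
    ("67a47cd76baff7e2", "szbvfd", "Other", 420, 18, 9.28),
    ("02d171810d7ae1f5", "swdvdf", "Home", 274, 30, 18.54),
    ("02d171810d7ae1f5", "sdefvx", "Work", 184, 22, 17.51),
    ("02d171810d7ae1f5", "dfvcxv", "Other", 122, 19, 14.12),
]
conn.executemany("INSERT INTO device_stats VALUES (?, ?, ?, ?, ?, ?)", rows)

# One output row per device; each CASE keeps only the hours of one gridtype.
# Multiplying by 100.0 forces floating-point division.
query = """
SELECT device AS deviceid,
       ROUND(100.0 * SUM(CASE WHEN gridtype = 'Home'  THEN total_hour_count ELSE 0 END)
             / SUM(total_hour_count), 2) AS home_percent,
       ROUND(100.0 * SUM(CASE WHEN gridtype = 'Work'  THEN total_hour_count ELSE 0 END)
             / SUM(total_hour_count), 2) AS work_percent,
       ROUND(100.0 * SUM(CASE WHEN gridtype = 'Other' THEN total_hour_count ELSE 0 END)
             / SUM(total_hour_count), 2) AS other_percent
FROM device_stats
GROUP BY device
ORDER BY device
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the six sample rows, device 67a47cd76baff7e2 comes out at roughly 39% Home, 33% Work and 28% Other, which differs from the illustrative figures in the desired output because those were given only as an example.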

Related

How do I count a range in SQL?

I have data that looks like this:
$ Time : int 0 1 5 8 10 11 15 17 18 20 ...
$ NumOfFlights: int 1 6 144 91 504 15 1256 1 1 578 ...
The Time column is just 24-hour time, from 0 all the way up to 2400.
What I hope to get is:
hour | number of flight
-------------------------------------
1st | 240
2nd | 223
... | ...
24th | 122
Here the 1st hour runs from midnight to 1am, the 2nd from 1am to 2am, and so on until the 24th, which runs from 11pm to midnight. The number of flights is the total of NumOfFlights within each range.
I've tried:
dbGetQuery(conn, "
  SELECT
    flights.CRSDepTime AS Time,
    COUNT(flights.CRSDepTime) AS NumOnTimeFlights
  FROM flights
  GROUP BY CRSDepTime/60
")
But I realise it can't be done this way: the result I get has 40 values for Time instead of 24.
> head
Time NumOnTimeFlights
1 50 6055
2 105 2383
3 133 674
4 200 446
5 245 266
6 310 34
> tail
Time NumOnTimeFlights
35 2045 48136
36 2120 103229
37 2215 15737
38 2245 36416
39 2300 15322
40 2355 8018
If your CRSDepTime column is an integer-encoded time like HHmm, then CRSDepTime/100 extracts the hour, provided the database performs integer division on integer operands:
SELECT
    CRSDepTime/100 AS hh,
    COUNT(flights.CRSDepTime) AS NumOnTimeFlights
FROM flights
GROUP BY CRSDepTime/100
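A minimal sketch of the bucketing above, run against SQLite through Python's sqlite3 module so that it is self-contained. The departure times are made up for illustration, and the sketch relies on SQLite dividing two integers as integers; in an engine where `/` returns a decimal (MySQL, for example), the expression would need an explicit floor or integer-division operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (CRSDepTime INTEGER)")

# Made-up departure times in HHmm form (5 means 00:05, 2355 means 23:55).
times = [5, 50, 105, 133, 200, 245, 2300, 2355, 2359]
conn.executemany("INSERT INTO flights VALUES (?)", [(t,) for t in times])

# Integer division by 100 drops the minutes and leaves the hour (0 to 23).
query = """
SELECT CRSDepTime / 100 AS hh,
       COUNT(*) AS NumFlights
FROM flights
GROUP BY CRSDepTime / 100
ORDER BY hh
"""
hourly = conn.execute(query).fetchall()
for hh, n in hourly:
    print(hh, n)
```

Hours with no departures simply do not appear in the result, and a value of 2400, if the data contains one, lands in its own bucket 24.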

Display rows where multiple columns are different

I have data that looks like this. Thousands of rows returned, but this is just a sample.
Most days have the same numbers in them, but some do not. Note that ID 1 and 5 have identical numbers every day.
ID  Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
1   26      26      26       26         26        26      26
2   44      44      30       30         44        44      44
3   55      55      55       55         80        90      55
4   12      12      43       43         43        43      43
5   36      36      36       36         36        36      36
I'd like to only return rows where the days of the week have different numbers.
In this case, the only IDs returned should be 2, 3 & 4.
What would I want this query to look like?
Thanks!
One idea that should work in most RDBMS (with some syntax tweaks) is the following.
This version is SQL Server compatible: unpivot the day columns into rows, count the distinct values, and filter accordingly:
select id
from t
cross apply (
    select count(distinct d)
    from (
        values (sunday), (monday), (tuesday), (wednesday), (thursday), (friday), (saturday)
    ) days(d)
) cnt(v)
where cnt.v > 1
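For engines without CROSS APPLY, the same idea can be sketched with UNION ALL as the unpivot step, which is a swapped-in technique rather than the answer's exact query. The sketch below runs it on SQLite through Python's sqlite3 module, using the sample rows from the question and the table name t from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, sunday INTEGER, monday INTEGER, tuesday INTEGER,"
    " wednesday INTEGER, thursday INTEGER, friday INTEGER, saturday INTEGER)"
)
rows = [
    (1, 26, 26, 26, 26, 26, 26, 26),
    (2, 44, 44, 30, 30, 44, 44, 44),
    (3, 55, 55, 55, 55, 80, 90, 55),
    (4, 12, 12, 43, 43, 43, 43, 43),
    (5, 36, 36, 36, 36, 36, 36, 36),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Unpivot: one (id, d) row per day column, then keep ids with more than one distinct value.
query = """
SELECT id
FROM (
    SELECT id, sunday AS d FROM t
    UNION ALL SELECT id, monday FROM t
    UNION ALL SELECT id, tuesday FROM t
    UNION ALL SELECT id, wednesday FROM t
    UNION ALL SELECT id, thursday FROM t
    UNION ALL SELECT id, friday FROM t
    UNION ALL SELECT id, saturday FROM t
) AS unpivoted
GROUP BY id
HAVING COUNT(DISTINCT d) > 1
ORDER BY id
"""
ids = [r[0] for r in conn.execute(query)]
print(ids)
```

COUNT(DISTINCT ...) ignores NULLs, so a row whose only difference is a missing day value would not be reported by this sketch.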

How to plot ring by zones under two variables using matplotlib or seaborn?

Date       A ZONE_GEN  A ZONE_LOAD  B ZONE_GEN  B ZONE_LOAD
1-1-2010   20          15           30          25
1-2-2010   30          25           40          35
....       ...         ...          ...         ...
1-12-2010  15          20           20          14
I want to create two new columns named "Gen" and "Load": "Gen" should hold the row-wise sum of every column ending in "GEN", and "Load" the row-wise sum of every column ending in "LOAD". I would like to get the output below:
Date       A ZONE_GEN  A ZONE_LOAD  B ZONE_GEN  B ZONE_LOAD
1-1-2010   20          15           30          25
1-2-2010   30          25           40          35
....       ...         ...          ...         ...
1-12-2010  15          20           20          14
Gen  Load  A ZONE_GEN/Gen  ...  B ZONE_LOAD/LOAD
50   40    40%             ...  62.50%
70   60    42.86%          ...  58.33%
...  ...   ...             ...  ...
35   34    42.86%          ...  41.18%
Then, using matplotlib, I want to draw a ring plot of the percentage values for the row with the maximum "Gen", and likewise for the row with the maximum "Load", with one segment per zone. I hope the plot looks like the figure in the following link
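The data preparation described above can be sketched in plain Python, without pandas, as follows. The rows are the three sample rows from the question, the last column is assumed to be "B ZONE_LOAD", and the helper name shares is hypothetical.

```python
# Sample rows from the question; the last column name is an assumption.
rows = [
    {"Date": "1-1-2010", "A ZONE_GEN": 20, "A ZONE_LOAD": 15, "B ZONE_GEN": 30, "B ZONE_LOAD": 25},
    {"Date": "1-2-2010", "A ZONE_GEN": 30, "A ZONE_LOAD": 25, "B ZONE_GEN": 40, "B ZONE_LOAD": 35},
    {"Date": "1-12-2010", "A ZONE_GEN": 15, "A ZONE_LOAD": 20, "B ZONE_GEN": 20, "B ZONE_LOAD": 14},
]


def shares(row, suffix):
    """Return (total, {column: percent}) over the columns ending in suffix."""
    cols = [c for c in row if c.endswith(suffix)]
    total = sum(row[c] for c in cols)
    return total, {c: round(100 * row[c] / total, 2) for c in cols}


# Pick the row with the largest generation total, then the one with the largest load total.
peak_gen_row = max(rows, key=lambda r: shares(r, "_GEN")[0])
gen_total, gen_shares = shares(peak_gen_row, "_GEN")

peak_load_row = max(rows, key=lambda r: shares(r, "_LOAD")[0])
load_total, load_shares = shares(peak_load_row, "_LOAD")

print(peak_gen_row["Date"], gen_total, gen_shares)
print(peak_load_row["Date"], load_total, load_shares)
```

The two share dictionaries are the values a ring would display. In matplotlib, one way to draw them is `Axes.pie` with a `wedgeprops` dictionary that sets `width` below the radius, which leaves the center hollow.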

How to sum multiple columns ending a certain word and keep summing in a new column?

Date       A ZONE_GEN  A ZONE_LOAD  B ZONE_GEN  B ZONE_LOAD
1-1-2010   20          15           30          25
1-2-2010   30          25           40          35
....       ...         ...          ...         ...
1-12-2010  15          20           20          14
I want to create two new columns named "Gen" and "Load": "Gen" should hold the row-wise sum of every column ending in "GEN", and "Load" the row-wise sum of every column ending in "LOAD".
I would like to get the output below:
Date       Gen  Load
1-1-2010   50   40
1-2-2010   70   60
...
1-12-2010  35   34
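Try grouping the columns by their suffix:
def f(c):
    # Key each column by the text after its last underscore, e.g. 'A ZONE_GEN' -> 'GEN'.
    return c.rsplit('_', 1)[1]

# groupby(..., axis=1) is deprecated in recent pandas; grouping the transposed frame gives the same sums.
df.set_index('Date').T.groupby(f).sum().T.reset_index()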
        Date  GEN  LOAD
0   1-1-2010   50    40
1   1-2-2010   70    60
2  1-12-2010   35    34

Exclude the specific kind of record

I am using SQL Server 2008 R2 and have the following records in a table:
Id  Sys  Dia  Type        UniqueId
1   156  20   first       12345
2   157  20   first       12345
3   150  15   last        12345
4   160  17   Average     12345
5   150  15   additional  12345
6   157  35   last        891011
7   156  25   Average     891011
8   163  35   last        789521
9   145  25   Average     789521
10  156  20   first       963215
11  150  15   last        963215
12  160  17   Average     963215
13  156  20   first       456878
14  157  20   first       456878
15  150  15   last        456878
16  160  17   Average     456878
17  150  15   last        246977
18  160  17   Average     246977
19  150  15   additional  246977
These records form groups that share a common UniqueId. A record's type can be "first", "last", "Average" or "additional". From these records I want to select an "Average" record only if its group also contains a "first" or "additional" reading; otherwise I want to exclude it from the selection.
The expected result is :
Id  Sys  Dia  Type        UniqueId
1   156  20   first       12345
2   157  20   first       12345
3   150  15   last        12345
4   160  17   Average     12345
5   150  15   additional  12345
6   157  35   last        891011
7   163  35   last        789521
8   156  20   first       963215
9   150  15   last        963215
10  160  17   Average     963215
11  156  20   first       456878
12  157  20   first       456878
13  150  15   last        456878
14  160  17   Average     456878
15  150  15   last        246977
16  160  17   Average     246977
17  150  15   additional  246977
In short, I don't want to select a record of type "Average" when the only other records with the same UniqueId are of type "last". Any solution?
Use the EXISTS operator with a correlated subquery:
SELECT *
FROM dbo.Table1 t1
WHERE t1.[Type] != 'Average'
   OR EXISTS (SELECT *
              FROM dbo.Table1 t2
              WHERE t1.UniqueId = t2.UniqueId
                AND t1.[Type] = 'Average'
                AND t2.[Type] IN ('first', 'additional'))
Try something like this:
SELECT * FROM MyTable WHERE [Type] <> 'Average'
UNION ALL
SELECT * FROM MyTable T
WHERE [Type] = 'Average'
  AND EXISTS (SELECT * FROM MyTable
              WHERE [Type] IN ('first', 'additional')
                AND UniqueId = T.UniqueId)
The first SELECT statement gets all records except the ones with Type = 'Average'. The second SELECT statement gets only those Type = 'Average' records for which at least one record with the same UniqueId is of type 'first' or 'additional'.
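To check the EXISTS approach against the sample data, here is a minimal sketch that runs an equivalent query on SQLite through Python's sqlite3 module. It is not SQL Server, so two things are assumptions: the table is named Table1 without a schema, and string comparison is case-sensitive here, whereas SQL Server's default collation is typically case-insensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (Id INTEGER, Sys INTEGER, Dia INTEGER, Type TEXT, UniqueId INTEGER)"
)
rows = [
    (1, 156, 20, "first", 12345), (2, 157, 20, "first", 12345),
    (3, 150, 15, "last", 12345), (4, 160, 17, "Average", 12345),
    (5, 150, 15, "additional", 12345), (6, 157, 35, "last", 891011),
    (7, 156, 25, "Average", 891011), (8, 163, 35, "last", 789521),
    (9, 145, 25, "Average", 789521), (10, 156, 20, "first", 963215),
    (11, 150, 15, "last", 963215), (12, 160, 17, "Average", 963215),
    (13, 156, 20, "first", 456878), (14, 157, 20, "first", 456878),
    (15, 150, 15, "last", 456878), (16, 160, 17, "Average", 456878),
    (17, 150, 15, "last", 246977), (18, 160, 17, "Average", 246977),
    (19, 150, 15, "additional", 246977),
]
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?, ?)", rows)

# Keep every non-Average row; keep an Average row only when its group
# (same UniqueId) contains a 'first' or 'additional' reading.
query = """
SELECT t1.Id
FROM Table1 AS t1
WHERE t1.Type <> 'Average'
   OR EXISTS (SELECT 1
              FROM Table1 AS t2
              WHERE t2.UniqueId = t1.UniqueId
                AND t2.Type IN ('first', 'additional'))
ORDER BY t1.Id
"""
kept = [r[0] for r in conn.execute(query)]
print(kept)
```

The only rows dropped are the two Average records, with original Ids 7 and 9, whose groups contain nothing but "last" readings.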