Group by bursts of occurrences in TimescaleDB/PostgreSQL - sql

This is my first question on Stack Overflow, so any advice on how to ask a well-structured question is welcome.
So, I have a TimescaleDB database, which is a time-series database built on top of Postgres. It keeps most of Postgres's functionality, so if you don't know Timescale it won't be an issue.
I have a select statement which returns:
time | num_issues | actor_login
------------------------+------------+------------------
2015-11-10 01:00:00+01 | 2 | nifl
2015-12-10 01:00:00+01 | 1 | anandtrex
2016-01-09 01:00:00+01 | 1 | isaacrg
2016-02-08 01:00:00+01 | 1 | timbarclay
2016-06-07 02:00:00+02 | 1 | kcalmes
2016-07-07 02:00:00+02 | 1 | cassiozen
2016-08-06 02:00:00+02 | 13 | phae
2016-09-05 02:00:00+02 | 2 | phae
2016-10-05 02:00:00+02 | 13 | cassiozen
2016-11-04 01:00:00+01 | 6 | cassiozen
2016-12-04 01:00:00+01 | 4 | cassiozen
2017-01-03 01:00:00+01 | 5 | cassiozen
2017-02-02 01:00:00+01 | 8 | cassandraoid
2017-03-04 01:00:00+01 | 16 | erquhart
2017-04-03 02:00:00+02 | 3 | erquhart
2017-05-03 02:00:00+02 | 9 | erquhart
2017-06-02 02:00:00+02 | 5 | erquhart
2017-07-02 02:00:00+02 | 2 | greatwarlive
2017-08-01 02:00:00+02 | 8 | tech4him1
2017-08-31 02:00:00+02 | 7 | tech4him1
2017-09-30 02:00:00+02 | 17 | erquhart
2017-10-30 01:00:00+01 | 7 | erquhart
2017-11-29 01:00:00+01 | 12 | erquhart
2017-12-29 01:00:00+01 | 8 | tech4him1
2018-01-28 01:00:00+01 | 6 | ragasirtahk
And so on. Basically it returns a username per time bucket, in this case 30 days.
The SQL query is:
SELECT DISTINCT ON (time_bucket('30 days', created_at))
    time_bucket('30 days', created_at) AS time,
    count(id) AS num_issues,
    actor_login
FROM
    issues_event
WHERE action = 'opened' AND repo_name = 'netlify/netlify-cms'
GROUP BY time, actor_login
ORDER BY time, num_issues DESC
My question is: how can I detect or group the rows that have the same actor_login and are consecutive?
For example, I would like to group the cassiozen rows from 2016-10-05 to 2017-01-03, but not together with the other cassiozen rows in the column.
I have tried auxiliary columns and window functions such as LAG, but I don't think it is possible without a function or a DO statement.
I also tried with functions, but I couldn't find a way.
Any approach, idea or solution will be fully appreciated.
Edit: here is my desired output.
time | num_issues | actor_login | actor_group_id
------------------------+------------+------------------+----------------
2015-11-10 01:00:00+01 | 2 | nifl | 0
2015-12-10 01:00:00+01 | 1 | anandtrex | 1
2016-01-09 01:00:00+01 | 1 | isaacrg | 2
2016-02-08 01:00:00+01 | 1 | timbarclay | 3
2016-06-07 02:00:00+02 | 1 | kcalmes | 4
2016-07-07 02:00:00+02 | 1 | cassiozen | 5
2016-08-06 02:00:00+02 | 13 | phae | 6
2016-09-05 02:00:00+02 | 2 | phae | 6
2016-10-05 02:00:00+02 | 13 | cassiozen | 7
2016-11-04 01:00:00+01 | 6 | cassiozen | 7
2016-12-04 01:00:00+01 | 4 | cassiozen | 7
2017-01-03 01:00:00+01 | 5 | cassiozen | 7
2017-02-02 01:00:00+01 | 8 | cassandraoid | 12
2017-03-04 01:00:00+01 | 16 | erquhart | 13
2017-04-03 02:00:00+02 | 3 | erquhart | 13
2017-05-03 02:00:00+02 | 9 | erquhart | 13
2017-06-02 02:00:00+02 | 5 | erquhart | 13
2017-07-02 02:00:00+02 | 2 | greatwarlive | 17
2017-08-01 02:00:00+02 | 8 | tech4him1 | 18
2017-08-31 02:00:00+02 | 7 | tech4him1 | 18
2017-09-30 02:00:00+02 | 17 | erquhart | 16
2017-10-30 01:00:00+01 | 7 | erquhart | 16
2017-11-29 01:00:00+01 | 12 | erquhart | 16
2017-12-29 01:00:00+01 | 8 | tech4him1 | 21
2018-01-28 01:00:00+01 | 6 | ragasirtahk | 24
MatBaille's solution is almost perfect.
I just wanted to group the consecutive actors like this so I could extract a bunch of metrics using other attributes of the table.

You could use a so-called "gaps-and-islands" approach:
WITH sorted AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (ORDER BY time) AS rn,
        ROW_NUMBER() OVER (PARTITION BY actor_login ORDER BY time) AS rn_actor
    FROM
        your_results
)
SELECT
    *,
    rn - rn_actor AS actor_group_id
FROM
    sorted
Then the combination of (actor_login, actor_group_id) will group consecutive rows together.
db<>fiddle demo
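If you then want one row per burst (for example to compute metrics per consecutive run, as mentioned in the edit), a sketch along these lines should work, still assuming the query from the question is available as your_results:
WITH sorted AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (ORDER BY time) AS rn,
        ROW_NUMBER() OVER (PARTITION BY actor_login ORDER BY time) AS rn_actor
    FROM
        your_results
)
SELECT
    actor_login,
    rn - rn_actor   AS actor_group_id,
    MIN(time)       AS burst_start,
    MAX(time)       AS burst_end,
    SUM(num_issues) AS total_issues,
    COUNT(*)        AS buckets_in_burst
FROM
    sorted
GROUP BY actor_login, rn - rn_actor
ORDER BY burst_start;
Note that the actor_group_id values here will not match the ones in the desired output exactly, but each (actor_login, actor_group_id) pair still identifies one consecutive burst.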


How can I get the table like this?

I have three tables in my database - campaign, app, revenue.
Campaign:
id | name
----+------
1 | Gis1
2 | Gis2
3 | Gis3
App:
app_id | name | campaign_id
--------+-----------+-------------
1 | Paino | 1
2 | Guitar | 1
3 | DrumPads | 1
4 | Karaoke | 2
5 | Metronome | 3
Revenue:
date | app_id | revenue
------------+--------+-------------------
2018-01-01 | 3 | 78.538551844269
2018-01-01 | 4 | 38.8709466245191
2018-01-01 | 2 | 35.5413845637373
2018-01-01 | 1 | 28.6825649309465
2018-01-01 | 5 | 6.33375584214843
2018-01-02 | 4 | 75.162254483704
2018-01-02 | 1 | 73.2370500155917
2018-01-02 | 5 | 70.4319678991422
2018-01-02 | 2 | 61.6702865774691
2018-01-02 | 3 | 11.7512900955221
2018-01-03 | 3 | 96.3792688491068
2018-01-03 | 4 | 84.3478274916822
2018-01-03 | 2 | 78.6001262071822
2018-01-03 | 5 | 13.8776103129058
2018-01-03 | 1 | 1.68915693074764
2018-01-04 | 5 | 99.4360222634511
2018-01-04 | 4 | 90.1921250023309
2018-01-04 | 3 | 16.5334091972016
2018-01-04 | 2 | 10.5714115940407
2018-01-04 | 1 | 1.35598296965985
2018-01-05 | 5 | 80.2475503409425
2018-01-05 | 4 | 38.9817245329402
2018-01-05 | 2 | 34.409188396027
2018-01-05 | 3 | 20.4833489416672
2018-01-05 | 1 | 2.61399153047812
2018-01-06 | 3 | 87.8649452536831
2018-01-06 | 1 | 74.4561480870284
2018-01-06 | 5 | 21.6574319699022
2018-01-06 | 2 | 4.87542333346478
2018-01-06 | 4 | 1.14697005074565
2018-01-07 | 1 | 87.9779788101898
2018-01-07 | 4 | 77.7294346579956
2018-01-07 | 3 | 59.3464731223967
2018-01-07 | 2 | 40.95148445392
2018-01-07 | 5 | 5.06283105895021
2018-01-08 | 1 | 96.2285605244126
2018-01-08 | 2 | 95.07328406998
2018-01-08 | 3 | 92.0486340327792
2018-01-08 | 4 | 85.379685234924
2018-01-08 | 5 | 9.78507570055686
2018-01-09 | 1 | 62.8192365909115
2018-01-09 | 4 | 62.0064597273823
2018-01-09 | 5 | 48.0621315020228
2018-01-09 | 3 | 29.7547369619939
2018-01-09 | 2 | 12.2752425067087
2018-01-10 | 2 | 81.0502551311092
2018-01-10 | 3 | 48.9698039641851
2018-01-10 | 1 | 17.5580143188766
2018-01-10 | 5 | 16.961404890828
2018-01-10 | 4 | 15.8832169199418
2018-01-11 | 2 | 77.6197753309208
2018-01-11 | 4 | 37.7590440824396
2018-01-11 | 1 | 28.964817136957
2018-01-11 | 3 | 28.706793080089
2018-01-11 | 5 | 26.9639842717711
2018-01-12 | 3 | 87.2789863299996
2018-01-12 | 1 | 78.8013559572292
2018-01-12 | 4 | 57.4583081599463
2018-01-12 | 5 | 48.0822281547709
2018-01-12 | 2 | 0.839615458734033
2018-01-13 | 3 | 69.8766551973526
2018-01-13 | 2 | 58.3078275325981
2018-01-13 | 5 | 21.8336755576109
2018-01-13 | 4 | 11.370240413885
2018-01-13 | 1 | 1.86340769095961
2018-01-14 | 5 | 92.6937944833375
2018-01-14 | 4 | 87.4130741995654
2018-01-14 | 3 | 72.2022209237481
2018-01-14 | 1 | 17.323222911245
2018-01-14 | 2 | 14.1322298298443
2018-01-15 | 2 | 90.8789341373927
2018-01-15 | 4 | 74.78605271702
2018-01-15 | 1 | 65.674207749016
2018-01-15 | 5 | 33.0848315520449
2018-01-15 | 3 | 19.7583865950811
2018-01-16 | 5 | 66.2050914825085
2018-01-16 | 4 | 34.6843542862023
2018-01-16 | 1 | 29.5897929780101
2018-01-16 | 2 | 15.0023649485883
2018-01-16 | 3 | 7.54663420658891
2018-01-17 | 2 | 83.3703723270077
2018-01-17 | 3 | 61.088943523605
2018-01-17 | 4 | 46.5194411862903
2018-01-17 | 5 | 46.462239550764
2018-01-17 | 1 | 16.1838123321874
2018-01-18 | 5 | 78.0041560412725
2018-01-18 | 4 | 30.3052500891844
2018-01-18 | 2 | 29.8116578069311
2018-01-18 | 3 | 5.80476470204397
2018-01-18 | 1 | 2.28775040131831
2018-01-19 | 5 | 94.0447243349086
2018-01-19 | 2 | 93.2593723776554
2018-01-19 | 3 | 86.2968057525727
2018-01-19 | 1 | 42.7138322733396
2018-01-19 | 4 | 22.1327564577787
2018-01-20 | 3 | 98.8579713044872
2018-01-20 | 5 | 64.8200087378497
2018-01-20 | 4 | 64.7727513652878
2018-01-20 | 2 | 39.2598249004273
2018-01-20 | 1 | 25.6178488851919
2018-01-21 | 3 | 84.4040426309011
2018-01-21 | 1 | 52.0713063443698
2018-01-21 | 5 | 41.7424199787255
2018-01-21 | 4 | 35.3389400530059
2018-01-21 | 2 | 28.350741474429
2018-01-22 | 2 | 96.8320321290855
2018-01-22 | 3 | 74.0004402752697
2018-01-22 | 1 | 72.5235460636752
2018-01-22 | 5 | 53.607618058446
2018-01-22 | 4 | 41.3008316635055
2018-01-23 | 2 | 66.6286214457232
2018-01-23 | 3 | 54.1626139019933
2018-01-23 | 5 | 52.5239485716162
2018-01-23 | 4 | 25.7367743326983
2018-01-23 | 1 | 6.46491466744874
2018-01-24 | 2 | 83.5308430627458
2018-01-24 | 1 | 68.6328785122374
2018-01-24 | 4 | 55.6973785257225
2018-01-24 | 3 | 46.0264499615527
2018-01-24 | 5 | 16.4651600203735
2018-01-25 | 4 | 80.9564163429763
2018-01-25 | 5 | 62.5899942406707
2018-01-25 | 1 | 59.0336831992662
2018-01-25 | 2 | 46.4030509765701
2018-01-25 | 3 | 22.6888680448289
2018-01-26 | 4 | 76.5099290710172
2018-01-26 | 3 | 53.933127563048
2018-01-26 | 5 | 49.5466520893498
2018-01-26 | 2 | 45.1699294234721
2018-01-26 | 1 | 21.3764512981173
2018-01-27 | 1 | 90.5434132585012
2018-01-27 | 4 | 67.0016445981484
2018-01-27 | 3 | 11.2431627841556
2018-01-27 | 2 | 5.39719616685773
2018-01-27 | 5 | 2.11776835627748
2018-01-28 | 2 | 53.3541751891504
2018-01-28 | 1 | 32.9596394913923
2018-01-28 | 3 | 21.1895497351378
2018-01-28 | 4 | 16.2897762555689
2018-01-28 | 5 | 5.34709359321544
2018-01-29 | 1 | 64.5439256676011
2018-01-29 | 2 | 15.9776125576869
2018-01-29 | 4 | 11.0105036902667
2018-01-29 | 3 | 2.16601788703412
2018-01-29 | 5 | 0.555523083910259
2018-01-30 | 4 | 94.7839147312857
2018-01-30 | 5 | 72.6621727991897
2018-01-30 | 3 | 70.124043314061
2018-01-30 | 2 | 34.5961079425723
2018-01-30 | 1 | 33.2888204556319
2018-01-31 | 3 | 77.7231288650421
2018-01-31 | 1 | 56.8044673174345
2018-01-31 | 4 | 43.5046513642636
2018-01-31 | 5 | 41.5792942791069
2018-01-31 | 2 | 25.5788387345906
2018-02-01 | 3 | 95.2107725320766
2018-02-01 | 4 | 86.3486448391141
2018-02-01 | 5 | 78.3239590582078
2018-02-01 | 2 | 39.1536975881585
2018-02-01 | 1 | 36.0675078797763
2018-02-02 | 5 | 97.8803713050822
2018-02-02 | 1 | 79.4247662701352
2018-02-02 | 2 | 26.3779061699958
2018-02-02 | 3 | 22.4354942949645
2018-02-02 | 4 | 13.2603534317112
2018-02-03 | 3 | 96.1323726327063
2018-02-03 | 5 | 59.6632595622737
2018-02-03 | 1 | 27.389807545151
2018-02-03 | 2 | 7.76389782111102
2018-02-03 | 4 | 0.969840948318645
2018-02-04 | 3 | 75.3978559173567
2018-02-04 | 5 | 49.3882938530803
2018-02-04 | 2 | 39.1100010374179
2018-02-04 | 4 | 35.8242148224422
2018-02-04 | 1 | 7.23734382101905
2018-02-05 | 4 | 75.3672510776635
2018-02-05 | 5 | 64.5369740371526
2018-02-05 | 3 | 51.5082265591993
2018-02-05 | 1 | 32.3788448578061
2018-02-05 | 2 | 21.4472612365463
2018-02-06 | 3 | 53.4502002775965
2018-02-06 | 2 | 53.0717656757934
2018-02-06 | 1 | 40.8672220798649
2018-02-06 | 4 | 37.839976598642
2018-02-06 | 5 | 9.12020377129901
2018-02-07 | 5 | 97.4855788083418
2018-02-07 | 4 | 97.4608054761709
2018-02-07 | 1 | 95.6723225551752
2018-02-07 | 3 | 87.9714358507064
2018-02-07 | 2 | 38.2405435002047
2018-02-08 | 4 | 82.9874314133669
2018-02-08 | 5 | 82.6651133226406
2018-02-08 | 1 | 69.3052440890685
2018-02-08 | 2 | 51.1343060185741
2018-02-08 | 3 | 25.5081553094595
2018-02-09 | 2 | 77.6589355538231
2018-02-09 | 5 | 74.7649757096248
2018-02-09 | 4 | 74.0052834670764
2018-02-09 | 1 | 37.58471748555
2018-02-09 | 3 | 9.52726961562965
2018-02-10 | 4 | 63.5114625028904
2018-02-10 | 1 | 57.6003561091767
2018-02-10 | 3 | 33.8354238124814
2018-02-10 | 5 | 24.755497452165
2018-02-10 | 2 | 6.09719410861046
2018-02-11 | 3 | 99.3200679204704
2018-02-11 | 4 | 92.787953262445
2018-02-11 | 2 | 75.7916875546417
2018-02-11 | 1 | 74.1264023056354
2018-02-11 | 5 | 39.0543105010909
2018-02-12 | 4 | 73.9016911300489
2018-02-12 | 2 | 36.8834180654883
2018-02-12 | 3 | 30.824684325787
2018-02-12 | 5 | 29.1559120548307
2018-02-12 | 1 | 10.1162943083399
2018-02-13 | 1 | 88.0975197801571
2018-02-13 | 3 | 73.3753659668181
2018-02-13 | 4 | 63.0762892472857
2018-02-13 | 5 | 35.8151357458788
2018-02-13 | 2 | 13.4014942840453
2018-02-14 | 1 | 94.8739671484573
2018-02-14 | 2 | 91.6415916160249
2018-02-14 | 5 | 66.2281593912018
2018-02-14 | 4 | 42.94700050317
2018-02-14 | 3 | 26.5246491333787
2018-02-15 | 2 | 98.7486846642082
2018-02-15 | 5 | 69.6182587287506
2018-02-15 | 3 | 44.6821718318301
2018-02-15 | 4 | 21.9568740682904
2018-02-15 | 1 | 15.374522578894
2018-02-16 | 2 | 94.3365941896695
2018-02-16 | 1 | 53.269122319394
2018-02-16 | 3 | 39.6046035126169
2018-02-16 | 4 | 37.622514510779
2018-02-16 | 5 | 31.3474270053205
2018-02-17 | 3 | 70.0631248181593
2018-02-17 | 5 | 50.1262781461011
2018-02-17 | 2 | 43.9279952731992
2018-02-17 | 1 | 28.2582849814117
2018-02-17 | 4 | 21.0913544631149
2018-02-18 | 3 | 74.8909778287795
2018-02-18 | 2 | 74.2363801582102
2018-02-18 | 5 | 72.4878600270842
2018-02-18 | 1 | 25.6855071233935
2018-02-18 | 4 | 0.37039199763309
2018-02-19 | 3 | 83.3856751613489
2018-02-19 | 5 | 46.4974932948942
2018-02-19 | 2 | 6.43301299768522
2018-02-19 | 4 | 4.81320557633388
2018-02-19 | 1 | 2.15515010060456
2018-02-20 | 4 | 81.4230771798843
2018-02-20 | 5 | 57.7265346180577
2018-02-20 | 1 | 56.2984247130064
2018-02-20 | 2 | 49.0169450043801
2018-02-20 | 3 | 46.5627217436774
2018-02-21 | 1 | 96.5297614033189
2018-02-21 | 5 | 96.2494094090932
2018-02-21 | 3 | 31.3462847216426
2018-02-21 | 4 | 23.2941891242544
2018-02-21 | 2 | 19.9083254355315
2018-02-22 | 1 | 79.0770313884165
2018-02-22 | 2 | 64.9973229306064
2018-02-22 | 3 | 55.3855288854335
2018-02-22 | 4 | 53.814505037514
2018-02-22 | 5 | 24.401256997123
2018-02-23 | 3 | 94.6754099868804
2018-02-23 | 1 | 52.4266618064681
2018-02-23 | 5 | 43.3877704733184
2018-02-23 | 2 | 23.3815439158117
2018-02-23 | 4 | 5.92925014836784
2018-02-24 | 4 | 82.3691566567076
2018-02-24 | 3 | 59.14386332869
2018-02-24 | 1 | 56.3529858789623
2018-02-24 | 5 | 17.7818909222602
2018-02-24 | 2 | 8.08320409409884
2018-02-25 | 1 | 51.144611434977
2018-02-25 | 4 | 32.6423341915492
2018-02-25 | 2 | 25.7686248507202
2018-02-25 | 3 | 3.33917220111982
2018-02-25 | 5 | 1.98348143815742
2018-02-26 | 5 | 95.2717564467113
2018-02-26 | 2 | 89.9541470672166
2018-02-26 | 4 | 73.8019448592861
2018-02-26 | 3 | 41.1512130216618
2018-02-26 | 1 | 36.3474907902939
2018-02-27 | 5 | 79.2906637385048
2018-02-27 | 4 | 62.3354455191908
2018-02-27 | 2 | 41.5109752476831
2018-02-27 | 1 | 18.9144882775624
2018-02-27 | 3 | 2.1427167667481
2018-02-28 | 4 | 85.4665146107167
2018-02-28 | 3 | 46.1527380247259
2018-02-28 | 2 | 22.3016369603851
2018-02-28 | 1 | 7.070596022248
2018-02-28 | 5 | 4.55199247079415
As a result of the query I need to get a table that contains four columns:
period - a period by months and weeks: January, February, Week 1, Week 2, ..., Week 9
gis1 - revenue for the corresponding period for company Gis1
gis2 - the same for Gis2
gis3 - the same for Gis3
Here are the source files.
I wrote the following query, which creates the tables in the database and gives revenue for company Gis1 by week:
CREATE TABLE campaign
(
id INT PRIMARY KEY,
name VARCHAR
);
COPY campaign (id, name) FROM '/home/leonid/Campaign.csv' DELIMITER ';' CSV HEADER;
CREATE TABLE app
(
app_id INT PRIMARY KEY,
name VARCHAR,
campaign_id INT REFERENCES campaign (id)
);
COPY app (app_id, name, campaign_id) FROM '/home/leonid/App.csv' DELIMITER ';' CSV HEADER;
CREATE TABLE revenue
(
date DATE,
app_id INT REFERENCES app (app_id),
revenue DOUBLE PRECISION
);
COPY revenue (date, app_id, revenue) FROM '/home/leonid/Revenue.csv' DELIMITER ';' CSV HEADER;
ALTER TABLE revenue
ADD COLUMN N SERIAL PRIMARY KEY;
SELECT DISTINCT EXTRACT (WEEK FROM r.date) AS Week, SUM (r.revenue) AS Gis1
FROM revenue r
JOIN app a ON r.app_id = a.app_id
JOIN campaign c ON a.campaign_id = c.id
WHERE c.name = 'Gis1'
GROUP BY Week
ORDER BY Week;
Maybe I need to use crosstab?
Disclaimer: I'm not really sure I understood your problem, but it seems that you just want to achieve a little pivot:
demo: db<>fiddle
WITH alldata AS ( -- 1
    SELECT
        r.revenue,
        c.name,
        EXTRACT('isoyear' FROM date) as year, -- 2
        to_char(date, 'Month') as month,      -- 3
        EXTRACT('week' FROM date) as week     -- 4
    FROM
        revenue r
        JOIN app a ON a.app_id = r.app_id
        JOIN campaign c ON c.id = a.campaign_id
)
SELECT
    month || ' ' || week as period, -- 5
    SUM(revenue) FILTER (WHERE name = 'Gis1') as gis1, -- 7
    SUM(revenue) FILTER (WHERE name = 'Gis2') as gis2,
    SUM(revenue) FILTER (WHERE name = 'Gis3') as gis3
FROM
    alldata
GROUP BY year, month, week -- 6
ORDER BY year, week
The result (for a random extract of your very large sample data):
| period | gis1 | gis2 | gis3 |
|-------------|--------------------|--------------------|--------------------|
| January 1 | 690.6488198608687 | 192.90960581696436 | 103.48598677014304 |
| January 2 | 377.6251679591726 | 85.379685234924 | 9.78507570055686 |
| January 3 | 303.6608544533801 | 121.3054939033103 | 124.4663955920365 |
| January 4 | 59.0336831992662 | 80.9564163429763 | 62.5899942406707 |
| February 5 | 123.5221801778573 | (null) | 59.6632595622737 |
| February 6 | 368.2516021023734 | 175.3567225686088 | 182.1855864844304 |
| February 7 | 368.9193547506641 | 21.0913544631149 | 122.6141381731853 |
| February 8 | 154.03324156166846 | 81.4230771798843 | 57.7265346180577 |
| February 9 | 174.5234469014203 | 73.8019448592861 | 179.11441265601024 |
1. The WITH clause calculates the whole table. Could be done within a subquery as well.
2. Calculates the year out of the date. isoyear instead of year to ensure that there are no problems with the first week. isoyear defines exactly when the first week is ("the first Thursday of a year is in week 1 of that year.")
3. Gets the month name.
4. Gets the week.
5. Creates a "period" out of the month name and week number.
6. Groups by the period (grouping by year and week mainly).
7. Does the pivot using the FILTER clause. With that you are able to filter what you wish to aggregate (the company in your example).
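If you also need whole-month rows next to the weekly ones (the question lists January, February as well as Week 1 ... Week 9 as periods), GROUPING SETS might be one way to extend the query above. This is an untested sketch reusing the same alldata CTE, with the date column added only for ordering:
WITH alldata AS (
    SELECT
        r.revenue,
        c.name,
        r.date,
        EXTRACT('isoyear' FROM r.date) AS year,
        to_char(r.date, 'FMMonth')     AS month,  -- FM strips the padding blanks
        EXTRACT('week' FROM r.date)    AS week
    FROM
        revenue r
        JOIN app a ON a.app_id = r.app_id
        JOIN campaign c ON c.id = a.campaign_id
)
SELECT
    COALESCE('Week ' || week::int, month) AS period,  -- week rows vs. month subtotal rows
    SUM(revenue) FILTER (WHERE name = 'Gis1') AS gis1,
    SUM(revenue) FILTER (WHERE name = 'Gis2') AS gis2,
    SUM(revenue) FILTER (WHERE name = 'Gis3') AS gis3
FROM
    alldata
GROUP BY GROUPING SETS ((year, month), (year, week))
ORDER BY MIN(date);
Each (year, week) grouping set row keeps week and leaves month NULL, and vice versa, so the COALESCE picks the right label for the period column.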

SQL PIVOT with dynamic header

I have a SQL database table with the following structure, from which I'd like to give my users the option to export data in either a vertical or horizontal fashion.
SELECT [MEA_ID], [MEA_DateTime], [PTR_ID], [EXP_ID], [DUN_ID], [MEA_Value]
FROM [MeasurementData];
|--------|----------------------|--------|--------|--------|-----------|
| MEA_ID | MEA_DateTime | PTR_ID | EXP_ID | DUN_ID | MEA_Value |
|--------|----------------------|--------|--------|--------|-----------|
| 1 | 2009-08-10 00:00:00 | 24 | 14 | 2 | 15.1 |
| 2 | 2009-08-10 00:00:00 | 24 | 14 | 3 | 14.3 |
| 3 | 2009-08-10 00:00:00 | 24 | 14 | 4 | 16.7 |
| 4 | 2009-08-10 00:00:10 | 24 | 15 | 2 | 13.0 |
| 5 | 2009-08-10 00:00:10 | 24 | 15 | 4 | 13.4 |
| 6 | 2009-08-10 00:00:20 | 24 | 16 | 2 | 17.8 |
| 7 | 2009-08-10 00:00:20 | 24 | 16 | 3 | 17.7 |
| 8 | 2009-08-10 00:00:20 | 24 | 16 | 4 | 16.2 |
| 9 | 2009-08-10 00:00:00 | 25 | 14 | 3 | 34.0 |
| 10 | 2009-08-10 00:00:00 | 25 | 14 | 4 | 19.0 |
| 11 | 2009-08-10 00:00:10 | 25 | 15 | 2 | 22.1 |
| 12 | 2009-08-10 00:00:10 | 25 | 15 | 3 | 23.1 |
| 13 | 2009-08-10 00:00:20 | 25 | 16 | 2 | 24.6 |
| 14 | 2009-08-10 00:00:20 | 25 | 16 | 3 | 18.3 |
| 15 | 2009-08-10 00:00:20 | 25 | 16 | 4 | 18.2 |
This above table would be the vertical export.
Every combination of MEA_DateTime, PTR_ID, EXP_ID and DUN_ID is unique, so there can only ever be one row with a given combination. What I am trying to accomplish is to turn DUN_ID into horizontal columns, to be better able to compare values.
It should look like this:
SELECT [MEA_DateTime], [PTR_ID], [EXP_ID], [MEA_Value]
FROM [MeasurementData]
PIVOT
(
SUM([MEA_Value])
FOR [DUN_ID] IN (????)
);
DUN_ID DUN_ID DUN_ID
| | |
v v v
|----------------------|--------|--------|-------|-------|-------|
| MEA_DateTime | PTR_ID | EXP_ID | 2 | 3 | 4 |
|----------------------|--------|--------|-------|-------|-------|
| 2009-08-10 00:00:00 | 24 | 14 | 15.1 | 14.3 | 16.7 |
| 2009-08-10 00:00:10 | 24 | 15 | 13.0 | NULL | 13.4 |
| 2009-08-10 00:00:20 | 24 | 16 | 17.8 | 17.7 | 16.2 |
| 2009-08-10 00:00:00 | 25 | 14 | NULL | 34.0 | 19.0 |
| 2009-08-10 00:00:10 | 25 | 15 | 22.1 | 23.1 | NULL |
| 2009-08-10 00:00:20 | 25 | 16 | 24.6 | 18.3 | 18.2 |
I tried to make it work with a PIVOT, but unfortunately I have never done anything like that before and don't have much to show. From what I could figure out, you need to know the column header names beforehand for it to work, and I couldn't figure out how to use field values as column headers. Is what I'm trying to do possible, or should I just build that structure manually in Python afterwards?
Glad for any help.
EDIT: The database engine is Microsoft SQL Server.
EDIT: Solution in action: http://sqlfiddle.com/#!6/3afd7/1/0
You can use this query.
DECLARE @ColNames NVARCHAR(MAX) = ''
SELECT @ColNames = @ColNames + ', ' + QUOTENAME( DUN_ID ) FROM MyTable
GROUP BY DUN_ID
DECLARE @SqlText NVARCHAR(MAX) = '
SELECT * FROM (SELECT MEA_DateTime, PTR_ID, EXP_ID, DUN_ID, MEA_Value FROM MyTable ) SRC
PIVOT( MAX(MEA_Value) FOR DUN_ID IN ( '+ STUFF(@ColNames,1,1,'') +') ) AS PVT ORDER BY PTR_ID, EXP_ID'
EXEC(@SqlText)
Result:
MEA_DateTime PTR_ID EXP_ID 2 3 4
----------------------- ----------- ----------- ---------- -------- --------
2009-08-10 00:00:00.000 24 14 15.10 14.30 16.70
2009-08-10 00:00:10.000 24 15 13.00 NULL 13.40
2009-08-10 00:00:20.000 24 16 17.80 17.70 16.20
2009-08-10 00:00:00.000 25 14 NULL 34.00 19.00
2009-08-10 00:00:10.000 25 15 22.10 23.10 NULL
2009-08-10 00:00:20.000 25 16 24.60 18.30 18.20
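For comparison, if the DUN_ID values were known up front (here 2, 3 and 4), the asker's own PIVOT attempt could be completed statically as in the sketch below; the dynamic version above just removes that assumption:
SELECT [MEA_DateTime], [PTR_ID], [EXP_ID], [2], [3], [4]
FROM
(
    SELECT [MEA_DateTime], [PTR_ID], [EXP_ID], [DUN_ID], [MEA_Value]
    FROM [MeasurementData]
) SRC
PIVOT
(
    MAX([MEA_Value]) FOR [DUN_ID] IN ([2], [3], [4])
) PVT
ORDER BY [PTR_ID], [EXP_ID];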

Select min and max values while grouped by a third column

I have a table with campaign data and need to get a list of 'spend_perc' min and max values while grouping by the client_id AND timing of these campaigns.
sample data being:
camp_id | client_id | start_date | end_date | spend_perc
7257 | 35224 | 2017-01-16 | 2017-02-11 | 100.05
7284 | 35224 | 2017-01-16 | 2017-02-11 | 101.08
7308 | 35224 | 2017-01-16 | 2017-02-11 | 101.3
7309 | 35224 | 2017-01-16 | 2017-02-11 | 5.8
6643 | 35224 | 2017-02-08 | 2017-02-24 | 79.38
6645 | 35224 | 2017-02-08 | 2017-02-24 | 6.84
6648 | 35224 | 2017-02-08 | 2017-02-24 | 100.01
6649 | 78554 | 2017-02-09 | 2017-02-27 | 2.5
6650 | 78554 | 2017-02-09 | 2017-02-27 | 18.5
6651 | 78554 | 2017-02-09 | 2017-02-27 | 98.5
What I'm trying to get is the rows with the min and max 'spend_perc' values for each client_id AND within the same campaign timing (identical start/end_date):
camp_id | client_id | start_date | end_date | spend_perc
7308 | 35224 | 2017-01-16 | 2017-02-11 | 101.3
7309 | 35224 | 2017-01-16 | 2017-02-11 | 5.8
6645 | 35224 | 2017-02-08 | 2017-02-24 | 6.84
6648 | 35224 | 2017-02-08 | 2017-02-24 | 100.01
6649 | 78554 | 2017-02-09 | 2017-02-27 | 2.5
6651 | 78554 | 2017-02-09 | 2017-02-27 | 98.5
Something like this?
with a as
(
    select distinct
        camp_id, client_id, start_date, end_date,
        max(spend_perc) over (partition by start_date, end_date),
        min(spend_perc) over (partition by start_date, end_date)
    from tn
)
select camp_id, client_id, start_date, end_date,
       case when spend_perc = max then max
            when spend_perc = min then min
       end spend_perc
from a
order by camp_id, client_id, start_date, end_date, spend_perc
I think you will want to get rid of the camp_id field because that will be meaningless in this case. So you want something like:
SELECT client_id, start_date, end_date,
min(spend_perc) as min_spend_perc, max(spend_perc) as max_spend_perc
FROM mytable
GROUP BY client_id, start_date, end_date;
Group by the criteria you want to, and select min and max as columns per unique combination of these values (i.e. per row).
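If the camp_id of the extreme rows is still needed, as in the desired output, one possible (untested) sketch is to rank within each group with window functions and keep only the extremes:
SELECT camp_id, client_id, start_date, end_date, spend_perc
FROM
(
    SELECT t.*,
           RANK() OVER (PARTITION BY client_id, start_date, end_date
                        ORDER BY spend_perc ASC)  AS rn_min,
           RANK() OVER (PARTITION BY client_id, start_date, end_date
                        ORDER BY spend_perc DESC) AS rn_max
    FROM mytable t
) ranked
WHERE rn_min = 1 OR rn_max = 1
ORDER BY client_id, start_date, camp_id;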

Count rows each month of a year - SQL Server

I have a table "Product" as follows:
| ProductId | ProductCatId | Price | Date | Deadline |
--------------------------------------------------------------------
| 1 | 1 | 10.00 | 2016-01-01 | 2016-01-27 |
| 2 | 2 | 10.00 | 2016-02-01 | 2016-02-27 |
| 3 | 3 | 10.00 | 2016-03-01 | 2016-03-27 |
| 4 | 1 | 10.00 | 2016-04-01 | 2016-04-27 |
| 5 | 3 | 10.00 | 2016-05-01 | 2016-05-27 |
| 6 | 3 | 10.00 | 2016-06-01 | 2016-06-27 |
| 7 | 1 | 20.00 | 2016-01-01 | 2016-01-27 |
| 8 | 2 | 30.00 | 2016-02-01 | 2016-02-27 |
| 9 | 1 | 40.00 | 2016-03-01 | 2016-03-27 |
| 10 | 4 | 15.00 | 2016-04-01 | 2016-04-27 |
| 11 | 1 | 25.00 | 2016-05-01 | 2016-05-27 |
| 12 | 5 | 55.00 | 2016-06-01 | 2016-06-27 |
| 13 | 5 | 55.00 | 2016-06-01 | 2016-01-27 |
| 14 | 5 | 55.00 | 2016-06-01 | 2016-02-27 |
| 15 | 5 | 55.00 | 2016-06-01 | 2016-03-27 |
I want to create a stored procedure that counts the rows of Product for each month, with the condition Year = CurrentYear, like:
| Month| SumProducts | SumExpiredProducts |
-------------------------------------------
| 1 | 3 | 3 |
| 2 | 3 | 3 |
| 3 | 3 | 3 |
| 4 | 2 | 2 |
| 5 | 2 | 2 |
| 6 | 2 | 2 |
What should I do?
You can use a query like the following:
SELECT MONTH([Date]),
COUNT(*) AS SumProducts ,
COUNT(CASE WHEN [Date] > Deadline THEN 1 END) AS SumExpiredProducts
FROM mytable
WHERE YEAR([Date]) = YEAR(GETDATE())
GROUP BY MONTH([Date])
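Since the question asks for a stored procedure, one way to wrap that query might look like the following sketch; the procedure name and the optional @Year parameter are just assumptions:
CREATE PROCEDURE dbo.usp_CountProductsPerMonth  -- hypothetical name
    @Year INT = NULL                            -- defaults to the current year
AS
BEGIN
    SET NOCOUNT ON;
    SET @Year = COALESCE(@Year, YEAR(GETDATE()));

    SELECT MONTH([Date]) AS [Month],
           COUNT(*) AS SumProducts,
           COUNT(CASE WHEN [Date] > Deadline THEN 1 END) AS SumExpiredProducts
    FROM Product
    WHERE YEAR([Date]) = @Year
    GROUP BY MONTH([Date])
    ORDER BY MONTH([Date]);
END
It could then be called with EXEC dbo.usp_CountProductsPerMonth; to get the current year.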

Teradata conditional expand

I have a table with dates and val that I am trying to expand, filling in missing dates in order. Not shown is that I am doing this by group and location, but the crux of what I need to do is below. Say I have the following table:
dt | val
2014-01-01 | 10
2014-02-17 | 9
2014-04-21 | 5
I have expanded this into a weekly table, filling in missing weeks with zeros:
week_bgn_dt| week_end_dt| val
2014-01-01 | 2014-01-08 | 10
2014-01-09 | 2014-01-16 | 0
2014-01-17 | 2014-01-24 | 0
...
2014-02-10 | 2014-02-17 | 0
2014-02-18 | 2014-02-25 | 9
2014-02-26 | 2014-03-05 | 0
2014-03-06 | 2014-03-13 | 0
...
2014-03-30 | 2014-04-06 | 0
2014-04-07 | 2014-04-14 | 0
2014-04-15 | 2014-04-22 | 5
What I want is to fill in with the last value until it changes, so the output would look like:
week_bgn_dt| week_end_dt| val
2014-01-01 | 2014-01-08 | 10
2014-01-09 | 2014-01-16 | 10
2014-01-17 | 2014-01-24 | 10
...
2014-02-10 | 2014-02-17 | 10
2014-02-18 | 2014-02-25 | 9
2014-02-26 | 2014-03-05 | 9
2014-03-06 | 2014-03-13 | 9
...
2014-03-30 | 2014-04-06 | 9
2014-04-07 | 2014-04-14 | 9
2014-04-15 | 2014-04-22 | 5
In Teradata I have tried this:
case when val <> 0 then val
     else sum(val) over (partition by group, location
                         order by group, store, week_bgn_dt
                         rows between 1 preceding and current row)
end as val2
but this only gives the last value once, like so:
week_bgn_dt| week_end_dt| val | val2
2014-01-01 | 2014-01-08 | 10 | 10
2014-01-09 | 2014-01-16 | 0 | 10
2014-01-17 | 2014-01-24 | 0 | 0
...
2014-02-10 | 2014-02-17 | 0 | 0
2014-02-18 | 2014-02-25 | 9 | 9
2014-02-26 | 2014-03-05 | 0 | 9
2014-03-06 | 2014-03-13 | 0 | 0
...
2014-03-30 | 2014-04-06 | 0 | 0
2014-04-07 | 2014-04-14 | 0 | 0
2014-04-15 | 2014-04-22 | 5 | 5
If I make the window unbounded, the values get summed when I hit a new value:
case when val <> 0 then val
     else sum(val) over (partition by group, location
                         order by group, store, week_bgn_dt
                         rows unbounded preceding)
end as val2
week_bgn_dt| week_end_dt| val | val2
2014-01-01 | 2014-01-08 | 10 | 10
2014-01-09 | 2014-01-16 | 0 | 10
2014-01-17 | 2014-01-24 | 0 | 10
...
2014-02-10 | 2014-02-17 | 0 | 10
2014-02-18 | 2014-02-25 | 9 | 9
2014-02-26 | 2014-03-05 | 0 | 19
2014-03-06 | 2014-03-13 | 0 | 19
...
2014-03-30 | 2014-04-06 | 0 | 19
2014-04-07 | 2014-04-14 | 0 | 19
2014-04-15 | 2014-04-22 | 5 | 5
I have tried with max() and min(), but with similar results. Thank you for any assistance.
This seems to be an issue with the partitioning in the SUM operation. Remember that when an OVER clause is specified, SUM calculates its results for each partition separately, starting from zero in each partition. It appears that you want SUM to work across multiple partitions. As we cannot tell SUM in any way (that I'm aware of) to operate across multiple partitions, the workaround is to redefine the partitioning as something else.
In your case, it appears that SUM should not use partitions at all. All we need is the RESET WHEN feature and the windowing of OVER. Using your expanded results filled with zeros, I have achieved the required output with the following query.
SELECT
    week_bgn_dt,
    week_end_dt,
    val,
    SUM(val) OVER ( PARTITION BY 1
                    ORDER BY location ASC, week_bgn_dt ASC
                    RESET WHEN val <> 0
                    ROWS UNBOUNDED PRECEDING ) AS val2
FROM test
week_bgn_dt | week_end_dt | val | val2
2014-01-01 | 2014-01-08 | 10 | 10
2014-01-09 | 2014-01-16 | 0 | 10
2014-01-17 | 2014-01-24 | 0 | 10
2014-02-10 | 2014-02-17 | 0 | 10
2014-02-18 | 2014-02-25 | 9 | 9
2014-02-26 | 2014-03-05 | 0 | 9
2014-03-06 | 2014-03-13 | 0 | 9
2014-03-30 | 2014-04-06 | 0 | 9
2014-04-07 | 2014-04-14 | 0 | 9
2014-04-15 | 2014-04-22 | 5 | 5
You might have noticed that I have added only location to the provided data. I believe you can add the rest of the fields to the ORDER BY clause and get the right results.
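For example, following that suggestion, the full query might look something like this untested sketch (the grouping column is quoted as "group" here because it is a reserved word in Teradata):
SELECT
    "group",
    location,
    week_bgn_dt,
    week_end_dt,
    val,
    SUM(val) OVER ( PARTITION BY 1
                    ORDER BY "group" ASC, location ASC, week_bgn_dt ASC
                    RESET WHEN val <> 0
                    ROWS UNBOUNDED PRECEDING ) AS val2
FROM test;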