Why am I seeing multiple months in the results when I am joining with dim_date - sql

Here is my simple PostgreSQL query:
SELECT dd.year_actual as yr, sum("Ordered_Amount") from channel_sales cs
JOIN dim_date dd ON cs."date" = dd.date_actual
GROUP BY
dd.year_actual,
cs."Ordered_Amount"
Here is the result below. I was expecting a single row with the year and the total amount, but instead it is broken down into multiple rows for 2018. I am not sure what I am doing wrong here.
2018 2226
2018 357
2018 616
2018 1074
2018 1422
2018 3080
2018 2106
2018 924
2018 176
2018 580
2018 1587
2018 14350
2018 306
2018 2516
2018 1482
2018 2880
2018 8400
2018 5200
2018 16758
2018 781
2018 135
2018 4056
2018 150
2018 500
2018 2338
2018 3850
2018 1432
2018 1396
2018 1230
2018 274
2018 1494
2018 1068
2018 878
2018 1441
2018 1832
2018 3042
2018 4180
2018 2327
2018 206
2018 426
2018 2090
2018 1003
2018 62499
2018 900
2018 2274
2018 399
2018 1980
2018 278
2018 736
2018 24070
2018 561
2018 648
2018 1256
2018 120
2018 21912
2018 1639
2018 4452
2018 1008
2018 96577
2018 3240
2018 1386
2018 388
2018 260
2018 1080
2018 5525
2018 2672
2018 24674
2018 4392
2018 948
2018 801
2018 658
2018 1908
2018 692
2018 498
2018 630
2018 8999
2018 4056
2018 2990
2018 1745
2018 1280
2018 126
2018 988
2018 422
2018 936
Is it how I am making the join, or is it because I am using the GROUP BY clause in the wrong way? I cannot figure it out for the life of me.

Because you are not grouping by year only. You are also grouping by "Ordered_Amount", the same column you sum(). Thus you are effectively summing per year and per distinct ordered amount. If, say, 2018 has four ordered amounts of 100, they show up as 2018, 400, and this is repeated per distinct amount, i.e.:
2018,100
2018,100
2018,100
2018,100
2018,200
2018,300
2018,300
would be:
2018,400
2018,200
2018,600
Write it as:
SELECT dd.year_actual as yr, sum("Ordered_Amount")
from channel_sales cs
JOIN dim_date dd ON cs."date" = dd.date_actual
GROUP BY
dd.year_actual
Also note that if this is not a 1-to-many or 1-to-1 relation, the sums will be wrong (rows get counted more than once). To prevent that, you may first do the sum and then join, as sketched below. Depending on the table structures and which data is coming from where, a join may not even be needed.
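As a hedged sketch of the aggregate-then-join idea, reusing the table and column names from the question (adjust to your real schema): channel_sales is summed per calendar date first, so each date hits dim_date exactly once; and if all you need is the year, you can drop the join entirely and group by extract(year from "date") instead.
SELECT dd.year_actual AS yr,
       sum(s.daily_amount) AS total_amount
FROM (
    -- pre-aggregate the fact table per date
    SELECT "date", sum("Ordered_Amount") AS daily_amount
    FROM channel_sales
    GROUP BY "date"
) AS s
JOIN dim_date dd ON dd.date_actual = s."date"
GROUP BY dd.year_actual;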

Related

Pandas Sort Two Columns with Day of Year Wrap-Around to New Year

I have data where, around the first of each year, the day_of_year sequence wraps around, so the "year" column should change to the new year when day_of_year == 1. It is a trick I have not been able to figure out, and I am not quite sure how to start, so any help here is much appreciated. My data looks like this:
Here is my df1 =
day_of_year year var_1
364 2017 17.71666667
364 2018 5.166666667
364 2019 2
364 2020 1.595833333
364 2021 3.75
364 2022 6.8875
365 2017 14.83333333
365 2018 2.758333333
365 2019 4.108333333
365 2020 5.766666667
365 2021 5.291666667
365 2022 10.58636364
1 2017 2.0125
1 2018 14.0125
1 2019 -0.504166667
1 2020 7.666666667
1 2021 5.520833333
1 2022 1.229166667
2 2017 1.7625
2 2018 15.10416667
2 2019 -0.391666667
2 2020 9.5
2 2021 7.645833333
2 2022 0.9125
And, after the re-formatting, I need it to look like the sorted df below, with "n/a" for any missing or expected data in a year that is missing data. Thank you again.
final df:
day_of_year year var_1
364 2017 17.71666667
365 2017 14.83333333
1 2018 14.0125
2 2018 15.10416667
364 2018 5.166666667
365 2018 2.758333333
1 2019 -0.504166667
2 2019 -0.391666667
364 2019 2
365 2019 4.108333333
1 2020 7.666666667
2 2020 9.5
364 2020 1.595833333
365 2020 5.766666667
1 2021 5.520833333
2 2021 7.645833333
364 2021 3.75
365 2021 5.291666667
1 2022 1.229166667
2 2022 0.9125
364 2022 6.8875
365 2022 10.58636364
n/a n/a n/a
n/a n/a n/a
Why would you change the year based on the day? Just sort by the two columns:
df.sort_values(by=['year', 'day_of_year'])
Output:
day_of_year year var_1
12 1 2017 2.012500
18 2 2017 1.762500
0 364 2017 17.716667
6 365 2017 14.833333
13 1 2018 14.012500
19 2 2018 15.104167
1 364 2018 5.166667
7 365 2018 2.758333
14 1 2019 -0.504167
20 2 2019 -0.391667
2 364 2019 2.000000
8 365 2019 4.108333
15 1 2020 7.666667
21 2 2020 9.500000
3 364 2020 1.595833
9 365 2020 5.766667
16 1 2021 5.520833
22 2 2021 7.645833
4 364 2021 3.750000
10 365 2021 5.291667
17 1 2022 1.229167
23 2 2022 0.912500
5 364 2022 6.887500
11 365 2022 10.586364
If for some reason you really need to fix the year, use a conditional with mask:
(df.assign(year=df['year'].mask(df['day_of_year'].le(2), df['year'].add(1)))
.sort_values(by=['year', 'day_of_year'])
)
Or, if you want to update the years after a change from 365 to a lower day:
(df.assign(year=df['year'].add(df['day_of_year'].diff().lt(0).cumsum()))
.sort_values(by=['year', 'day_of_year'])
)
Output:
day_of_year year var_1
0 364 2017 17.716667
6 365 2017 14.833333
12 1 2018 2.012500
18 2 2018 1.762500
1 364 2018 5.166667
7 365 2018 2.758333
13 1 2019 14.012500
19 2 2019 15.104167
2 364 2019 2.000000
8 365 2019 4.108333
14 1 2020 -0.504167
20 2 2020 -0.391667
3 364 2020 1.595833
9 365 2020 5.766667
15 1 2021 7.666667
21 2 2021 9.500000
4 364 2021 3.750000
10 365 2021 5.291667
16 1 2022 5.520833
22 2 2022 7.645833
5 364 2022 6.887500
11 365 2022 10.586364
17 1 2023 1.229167
23 2 2023 0.912500
I would convert everything to datetime first. Just run:
pd.to_datetime(df['day_of_year'].astype(str) + '-' + df['year'].astype(str),
               format='%j-%Y')
I assign it to column ymd and sort, yielding the following:
>>> df.sort_values('ymd')
day_of_year year var_1 ymd
12 1 2017 2.012500 2017-01-01
18 2 2017 1.762500 2017-01-02
0 364 2017 17.716667 2017-12-30
6 365 2017 14.833333 2017-12-31
13 1 2018 14.012500 2018-01-01
19 2 2018 15.104167 2018-01-02
1 364 2018 5.166667 2018-12-30
7 365 2018 2.758333 2018-12-31
14 1 2019 -0.504167 2019-01-01
20 2 2019 -0.391667 2019-01-02
2 364 2019 2.000000 2019-12-30
8 365 2019 4.108333 2019-12-31
15 1 2020 7.666667 2020-01-01
21 2 2020 9.500000 2020-01-02
3 364 2020 1.595833 2020-12-29
9 365 2020 5.766667 2020-12-30
16 1 2021 5.520833 2021-01-01
22 2 2021 7.645833 2021-01-02
4 364 2021 3.750000 2021-12-30
10 365 2021 5.291667 2021-12-31
17 1 2022 1.229167 2022-01-01
23 2 2022 0.912500 2022-01-02
5 364 2022 6.887500 2022-12-30
11 365 2022 10.586364 2022-12-31

Calculating incremental values from a cumulative sum field in Teradata

Good afternoon -
I have a table in Teradata that stores a rolling cumulative sum that resets every month. I would like to calculate the incremental gain between each day of the month. Is this something I can accomplish with OLAP functions, or should it be handled in a recursive CTE? I would love assistance thinking through this. Thanks!
example source
date        month      cum_value
2022-07-02  July 2022  25
2022-07-01  July 2022  5
2022-06-30  June 2022  100
2022-06-29  June 2022  70
2022-06-28  June 2022  65
2022-06-27  June 2022  50
example result
date        month      cum_value  incremental_value
2022-07-02  July 2022  25         20
2022-07-01  July 2022  5          5
2022-06-30  June 2022  100        30
2022-06-29  June 2022  70         5
2022-06-28  June 2022  65         15
2022-06-27  June 2022  50         ..
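As a hedged sketch of the OLAP-function approach the question asks about: the window below fetches the previous day's cum_value within each month (MIN over a one-row frame works on Teradata releases that lack LAG; newer releases can use LAG instead). The table name my_cumulative_table is a placeholder for the real source table, and the column names are quoted because date is a keyword.
SELECT "date",
       "month",
       cum_value,
       cum_value
         - COALESCE(MIN(cum_value) OVER (PARTITION BY "month"
                                         ORDER BY "date"
                                         ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),
                    0) AS incremental_value
FROM my_cumulative_table;   -- placeholder table name
On the first day of each month there is no preceding row, so COALESCE falls back to 0 and the incremental value equals cum_value itself, matching the 2022-07-01 row in the example.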

T-SQL - Partition a running total

I've written a query that returns the size of my individual records in MB. These records contain BLOB data.
I would like to partition the records into 50 MB batches.
SELECT SourceId, Title, Description,
SUM(DATALENGTH(VersionData) * 0.000001) OVER (PARTITION BY DATALENGTH(SourceId) ORDER BY SourceId) AS RunningTotal,
RANK() OVER(ORDER BY SourceId) AS RowNo
FROM TargetContentVersion WITH(NOLOCK)
The data returned from this query currently looks like this, where RunningTotal is the running total in MB of the records:
SourceId Title RunningTotal RowNo
00Pf4000006gna3EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_37_32).pdf 5.242880 1
00Pf4000006gna8EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_37_38).doc 6.291456 2
00Pf4000006gnacEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_38_44).pdf 7.340032 3
00Pf4000006gnaDEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_41).doc 12.582912 4
00Pf4000006gnahEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_38_47).pdf 17.825792 5
00Pf4000006gnaIEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_46).doc 23.068672 6
00Pf4000006gnamEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_38_54).pdf 33.554432 7
00Pf4000006gnaNEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_52).txt 34.603008 8
00Pf4000006gnarEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_20).doc 35.651584 9
00Pf4000006gnaSEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_55).txt 40.894464 10
00Pf4000006gnawEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_24).doc 46.137344 11
00Pf4000006gnaXEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_38_0).txt 51.380224 12
00Pf4000006gnb1EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_30).doc 61.865984 13
00Pf4000006gnb6EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_50).txt 62.914560 14
00Pf4000006gnbaEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_29).doc 68.157440 15
00Pf4000006gnbBEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_39_58).txt 78.643200 16
00Pf4000006gnbfEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_34).doc 89.128960 17
00Pf4000006gnbGEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_7).pdf 90.177536 18
00Pf4000006gnbkEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_43).txt 91.226112 19
00Pf4000006gnbLEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_12).pdf 96.468992 20
00Pf4000006gnbpEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_46).txt 101.711872 21
00Pf4000006gnbQEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_17).pdf 112.197632 22
00Pf4000006gnbuEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_52).txt 122.683392 23
00Pf4000006gnbVEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_26).doc 123.731968 24
00Pf4000006gnbzEAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_0).pdf 124.780544 25
00Pf4000006gnc4EAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_5).pdf 130.023424 26
00Pf4000006gnc9EAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_11).pdf 140.509184 27
00Pf4000006gncdEAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_56).txt 145.752064 28
00Pf4000006gncEEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_30).doc 146.800640 29
00Pf4000006gnciEAA 001f400000ZP5yWAAT_3 Oct 2018 (14_42_3).txt 157.286400 30
00Pf4000006gncJEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_33).doc 162.529280 31
00Pf4000006gncKEAQ 001f400000ZP5ycAAD_3 Oct 2018 (14_48_11).txt 173.015040 32
00Pf4000006gncnEAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_12).pdf 174.063616 33
00Pf4000006gncsEAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_15).pdf 179.306496 34
00Pf4000006gncTEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_44).doc 189.792256 35
00Pf4000006gncxEAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_30).pdf 200.278016 36
00Pf4000006gncYEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_53).txt 201.326592 37
00Pf4000006gnd2EAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_46).doc 202.375168 38
00Pf4000006gnd7EAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_49).doc 207.618048 39
00Pf4000006gndbEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_23).pdf 212.860928 40
00Pf4000006gndCEAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_42_54).doc 223.346688 41
00Pf4000006gndgEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_30).pdf 233.832448 42
00Pf4000006gnDhEAI Snake_River_(5mb).jpg 239.077777 43
00Pf4000006gndHEAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_43_3).txt 240.126353 44
00Pf4000006gndlEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_39).doc 241.174929 45
00Pf4000006gndMEAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_43_6).txt 246.417809 46
00Pf4000006gndqEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_41).doc 251.660689 47
00Pf4000006gnDrEAI Pizigani_1367_Chart_10MB.jpg 261.835395 48
00Pf4000006gndREAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_43_11).txt 272.321155 49
00Pf4000006gndvEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_47).doc 282.806915 50
00Pf4000006gnDwEAI Spinner_Dolphin_Indian_Ocean_07-2017.jpg 284.109019 51
00Pf4000006gndWEAQ 001f400000ZP5yYAAT_3 Oct 2018 (14_43_20).pdf 285.157595 52
00Pf4000006gnDXEAY 440 Kb.jpg 285.609143 53
00Pf4000006gne0EAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_59).txt 286.657719 54
00Pf4000006gne5EAA 001f400000ZP5yYAAT_3 Oct 2018 (14_44_2).txt 291.900599 55
00Pf4000006gneaEAA 001f400000ZP5yZAAT_3 Oct 2018 (14_44_59).txt 302.386359 56
00Pf4000006gneAEAQ 001f400000ZP5yYAAT_3 Oct 2018 (14_44_7).txt 312.872119 57
00Pf4000006gneeEAA 001f400000ZP5yZAAT_3 Oct 2018 (14_44_40).doc 323.357879 58
I would like the results to look like this, where they are partitioned into 50 MB batches:
SourceId Title RunningTotal RowNo Batch
00Pf4000006gna3EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_37_32).pdf 5.242880 1 1
00Pf4000006gna8EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_37_38).doc 6.291456 2 1
00Pf4000006gnacEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_38_44).pdf 7.340032 3 1
00Pf4000006gnaDEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_41).doc 12.582912 4 1
00Pf4000006gnahEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_38_47).pdf 17.825792 5 1
00Pf4000006gnaIEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_46).doc 23.068672 6 1
00Pf4000006gnamEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_38_54).pdf 33.554432 7 1
00Pf4000006gnaNEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_52).txt 34.603008 8 1
00Pf4000006gnarEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_20).doc 35.651584 9 1
00Pf4000006gnaSEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_37_55).txt 40.894464 10 1
00Pf4000006gnawEAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_24).doc 46.137344 11 1
00Pf4000006gnaXEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_38_0).txt 51.380224 12 1
00Pf4000006gnb1EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_30).doc 61.865984 13 2
00Pf4000006gnb6EAA 001f400000ZP5yUAAT_3 Oct 2018 (14_39_50).txt 62.914560 14 2
00Pf4000006gnbaEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_29).doc 68.157440 15 2
00Pf4000006gnbBEAQ 001f400000ZP5yUAAT_3 Oct 2018 (14_39_58).txt 78.643200 16 2
00Pf4000006gnbfEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_34).doc 89.128960 17 2
00Pf4000006gnbGEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_7).pdf 90.177536 18 2
00Pf4000006gnbkEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_43).txt 91.226112 19 2
00Pf4000006gnbLEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_12).pdf 96.468992 20 2
00Pf4000006gnbpEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_46).txt 101.711872 21 3
00Pf4000006gnbQEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_17).pdf 112.197632 22 3
00Pf4000006gnbuEAA 001f400000ZP5yVAAT_3 Oct 2018 (14_40_52).txt 122.683392 23 3
00Pf4000006gnbVEAQ 001f400000ZP5yVAAT_3 Oct 2018 (14_40_26).doc 123.731968 24 3
00Pf4000006gnbzEAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_0).pdf 124.780544 25 3
00Pf4000006gnc4EAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_5).pdf 130.023424 26 3
00Pf4000006gnc9EAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_11).pdf 140.509184 27 3
00Pf4000006gncdEAA 001f400000ZP5yWAAT_3 Oct 2018 (14_41_56).txt 145.752064 28 3
00Pf4000006gncEEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_30).doc 146.800640 29 3
00Pf4000006gnciEAA 001f400000ZP5yWAAT_3 Oct 2018 (14_42_3).txt 157.286400 30 4
00Pf4000006gncJEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_33).doc 162.529280 31 4
00Pf4000006gncKEAQ 001f400000ZP5ycAAD_3 Oct 2018 (14_48_11).txt 173.015040 32 4
00Pf4000006gncnEAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_12).pdf 174.063616 33 4
00Pf4000006gncsEAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_15).pdf 179.306496 34 4
00Pf4000006gncTEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_44).doc 189.792256 35 4
00Pf4000006gncxEAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_30).pdf 200.278016 36 5
00Pf4000006gncYEAQ 001f400000ZP5yWAAT_3 Oct 2018 (14_41_53).txt 201.326592 37 5
00Pf4000006gnd2EAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_46).doc 202.375168 38 5
00Pf4000006gnd7EAA 001f400000ZP5yXAAT_3 Oct 2018 (14_42_49).doc 207.618048 39 5
00Pf4000006gndbEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_23).pdf 212.860928 40 5
00Pf4000006gndCEAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_42_54).doc 223.346688 41 5
00Pf4000006gndgEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_30).pdf 233.832448 42 5
00Pf4000006gnDhEAI Snake_River_(5mb).jpg 239.077777 43 5
00Pf4000006gndHEAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_43_3).txt 240.126353 44 5
00Pf4000006gndlEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_39).doc 241.174929 45 5
00Pf4000006gndMEAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_43_6).txt 246.417809 46 5
00Pf4000006gndqEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_41).doc 251.660689 47 6
00Pf4000006gnDrEAI Pizigani_1367_Chart_10MB.jpg 261.835395 48 6
00Pf4000006gndREAQ 001f400000ZP5yXAAT_3 Oct 2018 (14_43_11).txt 272.321155 49 6
00Pf4000006gndvEAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_47).doc 282.806915 50 6
00Pf4000006gnDwEAI Spinner_Dolphin_Indian_Ocean_07-2017.jpg 284.109019 51 6
00Pf4000006gndWEAQ 001f400000ZP5yYAAT_3 Oct 2018 (14_43_20).pdf 285.157595 52 6
00Pf4000006gnDXEAY 440 Kb.jpg 285.609143 53 6
00Pf4000006gne0EAA 001f400000ZP5yYAAT_3 Oct 2018 (14_43_59).txt 286.657719 54 6
00Pf4000006gne5EAA 001f400000ZP5yYAAT_3 Oct 2018 (14_44_2).txt 291.900599 55 6
00Pf4000006gneaEAA 001f400000ZP5yZAAT_3 Oct 2018 (14_44_59).txt 302.386359 56 7
00Pf4000006gneAEAQ 001f400000ZP5yYAAT_3 Oct 2018 (14_44_7).txt 312.872119 57 7
00Pf4000006gneeEAA 001f400000ZP5yZAAT_3 Oct 2018 (14_44_40).doc 323.357879 58 7
Help would be much appreciated, thank you.
You can use integer division:
SELECT (CAST(SUM(DATALENGTH(VersionData) * 0.000001)
             OVER (PARTITION BY DATALENGTH(SourceId)
                   ORDER BY SourceId) AS INT) / 50) + 1 AS Batch
FROM TargetContentVersion
Here's a quick sample that demonstrates how it works:
CREATE TABLE #t (id INT IDENTITY(1,1), size NUMERIC(8,6))
GO
INSERT INTO #t
SELECT RAND() * 20
GO 20 -- Create 20 sample rows with random sizes between 0 and 20
SELECT id, SUM(size) OVER (ORDER BY id) AS RunningTotal,
(CAST(SUM(size) OVER (ORDER BY id) AS INT) / 50) + 1 AS Batch
FROM #t
id RunningTotal Batch
1 2.303367 1
2 4.049776 1
3 19.177784 1
4 28.637981 1
5 29.675840 1
6 32.781603 1
7 33.859586 1
8 36.633733 1
9 39.413363 1
10 58.004502 2
11 70.363837 2
12 82.897268 2
13 83.946657 2
14 85.623044 2
15 87.432670 2
16 103.304830 3
17 103.709745 3
18 122.165664 3
19 126.554616 3
20 128.019929 3
I've worked it out.
Script below for those interested.
WITH cte1 AS (
    SELECT SourceId, Title,
           DATALENGTH(VersionData) * 0.000001 AS RecordSize,
           CAST(SUM(DATALENGTH(VersionData) * 0.000001)
                OVER (PARTITION BY DATALENGTH(SourceId) ORDER BY SourceId) AS INT) AS RunningTotal,
           RANK() OVER (ORDER BY SourceId) AS RowNo
    FROM TargetContentVersion WITH(NOLOCK)
)
SELECT SourceId, Title, RecordSize, RunningTotal, RowNo,
       SUM(RunningTotal) OVER (PARTITION BY SourceId ORDER BY SourceId) / 50 AS Batch
FROM cte1
Another option would be to use DENSE_RANK:
WITH CTE AS
(
    SELECT SourceId, Title, Description,
           SUM(DATALENGTH(VersionData) * 0.000001) OVER (PARTITION BY DATALENGTH(SourceId) ORDER BY SourceId) AS RunningTotal,
           RANK() OVER (ORDER BY SourceId) AS RowNo
    FROM TargetContentVersion WITH(NOLOCK)
)
SELECT SourceId, Title, Description, RunningTotal, RowNo,
       DENSE_RANK() OVER (ORDER BY CAST(RunningTotal AS INT) / 50) AS Batch
FROM CTE
Note the casting of RunningTotal to int.
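To see how the two approaches relate, here is a hedged sketch that reuses the #t sample table from the first answer (it assumes #t has already been created and populated as shown above). One design difference worth noting: integer division + 1 numbers every 50 MB range, while DENSE_RANK numbers only the ranges that actually contain rows, so the two can diverge if a range happens to be empty.
WITH demo AS (
    SELECT id,
           SUM(size) OVER (ORDER BY id) AS RunningTotal
    FROM #t
)
SELECT id,
       RunningTotal,
       (CAST(RunningTotal AS INT) / 50) + 1 AS BatchByDivision,
       DENSE_RANK() OVER (ORDER BY CAST(RunningTotal AS INT) / 50) AS BatchByDenseRank
FROM demo;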

can't filter a view (date type conversion error)

I have a view that I'm trying to query.
Select top 100 Expiration , year(Expiration) from CICPROD.ExpiredLots
--where year(Expiration) = 2017
which returns (with the WHERE clause commented out):
Expiration (No column name)
2017-09-10 2017
2021-06-20 2021
2017-01-16 2017
2017-01-04 2017
2017-08-22 2017
2017-01-25 2017
2021-07-18 2021
2017-04-28 2017
2017-09-14 2017
2017-01-04 2017
2010-06-10 2010
2020-04-24 2020
2019-03-03 2019
2020-09-11 2020
2020-06-10 2020
2020-03-26 2020
2020-07-14 2020
2017-05-13 2017
2018-02-16 2018
2015-05-25 2015
2015-08-29 2015
2016-04-04 2016
2017-03-31 2017
2017-03-31 2017
2017-03-31 2017
2015-08-15 2015
2018-02-27 2018
2018-02-16 2018
2016-01-31 2016
2017-03-31 2017
2014-02-01 2014
2018-08-09 2018
2007-08-01 2007
2017-05-27 2017
2020-12-15 2020
2012-03-31 2012
2012-03-22 2012
2016-01-05 2016
2018-01-10 2018
2013-03-05 2013
2015-08-05 2015
2017-11-30 2017
2013-06-12 2013
2019-11-22 2019
2013-04-27 2013
2016-04-17 2016
2018-01-10 2018
2018-02-16 2018
2018-01-10 2018
2018-02-16 2018
2016-04-30 2016
2020-01-05 2020
2016-12-21 2016
2017-11-08 2017
2018-01-10 2018
2014-09-14 2014
2018-01-10 2018
2016-06-25 2016
2014-01-31 2014
2020-03-20 2020
2017-02-15 2017
2016-02-01 2016
2015-08-05 2015
2016-03-24 2016
2013-08-28 2013
2016-09-08 2016
2018-02-16 2018
2014-12-09 2014
2017-08-13 2017
2018-01-10 2018
2016-10-23 2016
2018-02-17 2018
2009-05-28 2009
2017-07-12 2017
2017-03-31 2017
2016-04-23 2016
2015-04-11 2015
2018-01-10 2018
2017-11-17 2017
2018-01-10 2018
2017-11-08 2017
2017-11-08 2017
2017-03-31 2017
2017-03-31 2017
2017-10-02 2017
2011-05-03 2011
2010-12-10 2010
2014-11-14 2014
2017-08-17 2017
2015-06-30 2015
2017-10-12 2017
2016-03-23 2016
2018-05-10 2018
2017-08-17 2017
2017-01-01 2017
2015-12-19 2015
2016-02-28 2016
2018-02-27 2018
2017-07-07 2017
2016-09-08 2016
However, when I try to filter in the WHERE clause on the second column, say for 2017, I get the error message:
Msg 241, Level 16, State 1, Line 1
Conversion failed when converting date and/or time from character string.
But when I tried it with TOP 10, the query worked with no problem!
I checked the length of the field and the values are all 10 characters with the same format, so I'm wondering why this is happening.
Can anyone assist?
The original query is:
Select cast([STOLOTFCY].ITMREF_0 as varchar(20)) as 'Product',
[ITMMASTER].ITMDES1_0 as 'Desc1',
[STOLOTFCY].STOFCY_0 as Site, cast([STOLOTFCY].LOT_0 as varchar(30)) as Lot ,
[STOCK].STA_0 as Status,
( case when isdate([STOLOT].USRFLD1_0) = 0 then null else
convert(date,[STOLOT].USRFLD1_0,101) end) as Expiration,
[STOCK].QTYSTU_0 as 'Total Stk',
[ITMMASTER].STU_0 as 'STK', [STOLOTFCY].AVC_0 as 'avgcost' ,
[STOLOTFCY].AVC_0 * [STOCK].QTYSTU_0 as 'ExtendedValue' ,
cast([STOLOT].LOTCREDAT_0 as date) as 'Lotcreated',
[ITMMASTER].ITMWEI_0 * [STOCK].QTYSTU_0 as 'TotalWgt(Kg)'
from [CICPROD].[STOLOTFCY]
inner join [CICPROD].[ITMMASTER] on [STOLOTFCY].ITMREF_0 = [ITMMASTER].ITMREF_0
inner join [CICPROD].[STOLOT] on [STOLOT].ITMREF_0 = [STOLOTFCY].ITMREF_0 and [STOLOT].LOT_0 = [STOLOTFCY].LOT_0
inner join [CICPROD].[STOCK] on [STOCK].ITMREF_0 = [STOLOTFCY].ITMREF_0 and [STOLOTFCY].STOFCY_0 = [STOCK].STOFCY_0 and [STOCK].LOT_0 =
[STOLOTFCY].LOT_0 and [STOLOTFCY].SLO_0 = [STOCK].SLO_0
where [STOLOTFCY].[AAACUMQTY_0] + [STOLOTFCY].[QQQCUMQTY_0] + [STOLOTFCY].[RRRCUMQTY_0] > 0
Based on the discussion in the comments, I think the following code should help you find the data that is causing the query to fail. SQL Server does not guarantee the order in which the filter and the conversions inside the view are evaluated, so a single value of USRFLD1_0 that cannot be converted can break the query even when it looks like it should be filtered or guarded out; that is likely why TOP 10 happened to work while the filtered query did not.
SET DATEFORMAT mdy;

SELECT [STOLOT].USRFLD1_0, *
FROM CICPROD.STOLOT
WHERE ISDATE([STOLOT].USRFLD1_0) = 0
  AND [STOLOT].USRFLD1_0 IS NOT NULL
Try this:
Select top 100 Expiration , year(Expiration) from CICPROD.ExpiredLots
where year(cast(Expiration as date)) = 2017

How to change a date on the fly in SQL Server

My output from a procedure is like
Jan 1 1900 10:30PM
Jan 1 1900 10:45PM
Jan 1 1900 11:00PM
Jan 1 1900 11:30PM
Jan 1 1900 11:45PM
Jan 2 1900 12:00AM
Jan 2 1900 12:15AM
Jan 2 1900 12:30AM
Jan 2 1900 12:45AM
Jan 2 1900 1:00AM
I want to add the current date to the time, with the date rolling over after 12:00AM, like this:
Friday,MAY,18 10:30PM
Friday,MAY,18 10:45PM
Friday,MAY,18 11:00PM
Friday,MAY,18 11:30PM
Friday,MAY,18 11:45PM
Friday,MAY,19 12:00AM
Friday,MAY,19 12:15AM
Friday,MAY,19 12:30AM
Friday,MAY,19 12:45AM
Friday,MAY,19 1:00AM
How can I do this?
Thanks in advance.
From SQL Server 2008:
select YourTimeCol + cast(cast(getdate() as date) as datetime)
from YourTable
Pre SQL Server 2008:
select YourTimeCol+dateadd(day, datediff(day, 0, getdate()), 0)
from YourTable
I think you need this:
DECLARE @tt TABLE (Sday VARCHAR(50))

INSERT INTO @tt VALUES
('Jan 1 1900 10:30PM'),('Jan 1 1900 10:45PM'),('Jan 1 1900 11:00PM'),('Jan 1 1900 11:30PM'),
('Jan 1 1900 11:45PM'),('Jan 2 1900 12:00AM'),('Jan 2 1900 12:15AM'),('Jan 2 1900 12:30AM'),
('Jan 2 1900 12:45AM'),('Jan 2 1900 1:00AM')

SELECT Sday,
       DATEADD(DAY, DATEDIFF(DAY, '1900-01-01', GETDATE()), Sday) AS resultAsDatetime,
       CONVERT(VARCHAR(50), DATEADD(DAY, DATEDIFF(DAY, '1900-01-01', GETDATE()), Sday), 109) AS result
FROM @tt
which is returning
Jan 1 1900 10:30PM 2012-05-18 22:30:00.000 May 18 2012 10:30:00:000PM
Jan 1 1900 10:45PM 2012-05-18 22:45:00.000 May 18 2012 10:45:00:000PM
Jan 1 1900 11:00PM 2012-05-18 23:00:00.000 May 18 2012 11:00:00:000PM
Jan 1 1900 11:30PM 2012-05-18 23:30:00.000 May 18 2012 11:30:00:000PM
Jan 1 1900 11:45PM 2012-05-18 23:45:00.000 May 18 2012 11:45:00:000PM
Jan 2 1900 12:00AM 2012-05-19 00:00:00.000 May 19 2012 12:00:00:000AM
Jan 2 1900 12:15AM 2012-05-19 00:15:00.000 May 19 2012 12:15:00:000AM
Jan 2 1900 12:30AM 2012-05-19 00:30:00.000 May 19 2012 12:30:00:000AM
Jan 2 1900 12:45AM 2012-05-19 00:45:00.000 May 19 2012 12:45:00:000AM
Jan 2 1900 1:00AM 2012-05-19 01:00:00.000 May 19 2012 1:00:00:000AM
Obviously you can choose the right format for the conversion of the DATETIME to a VARCHAR, as documented for the CONVERT function, but I think you don't need help with that.
Hope this helps.
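If the exact layout from the question (Friday,MAY,18 10:30PM) is required, a hedged option on SQL Server 2012 or later is the FORMAT function. The format string below is an assumption about the intended layout, the query reuses the @tt sample from the previous answer (run it in the same batch as the DECLARE), and FORMAT returns mixed-case names, so wrap the result in UPPER() if the month really must be upper case.
SELECT Sday,
       -- shift the 1900-based time onto today's date, then render it
       FORMAT(DATEADD(DAY, DATEDIFF(DAY, '1900-01-01', GETDATE()), CAST(Sday AS DATETIME)),
              'dddd,MMM,dd hh:mmtt') AS result
FROM @tt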