I have a table with the following columns and values:
ID Values FirstCol 2ndCol 3rdCol 4thCol 5thCol
1 1stValue 5466 34556 53536 54646 566
1 2ndValue 3544 957 667 1050 35363
1 3rdValue 1040 1041 4647 6477 1045
1 4thValue 1048 3546 1095 1151 65757
2 1stValue 845 5466 86578 885 859
2 2ndValue 35646 996 1300 7101 456467
2 3rdValue 102 46478 565 657 107
2 4thValue 5509 55110 1411 1152 1144
3 1stValue 845 854 847 884 675
3 2ndValue 984 994 4647 1041 1503
3 3rdValue 1602 1034 1034 1055 466
3 4thValue 1069 1610 6111 1124 1144
Now I want a result set in the form below. Is this possible with a PIVOT or CASE statement?
ID Cols 1stValue 2ndValue 3rdValue 4thValue
1 FirstCol 5466 3544 1040 1048
1 2ndCol 34556 957 1041 3546
1 3rdCol 53536 667 4647 1095
1 4thCol 54646 1050 6477 1151
1 5thCol 566 35363 1045 65757
2 FirstCol 845 35646 102 5509
2 2ndCol 5466 996 46478 55110
2 3rdCol 86578 1300 565 1411
2 4thCol 885 7101 657 1152
2 5thCol 859 456467 107 1144
3 FirstCol 845 984 1602 1069
3 2ndCol 854 994 1034 1610
3 3rdCol 847 4647 1034 6111
3 4thCol 884 1041 1055 1124
3 5thCol 675 1503 466 1144
Assuming the table name is t1 this should do the trick:
SELECT * FROM t1
UNPIVOT (val FOR name IN ([FirstCol], [2ndCol], [3rdCol], [4thCol], [5thCol])) unpiv
PIVOT (SUM(val) FOR [Values] IN ([1stValue], [2ndValue], [3rdValue], [4thValue])) piv
There's a sorting issue: it would be good to rename FirstCol to 1stCol; then ORDER BY ID, name would put the rows in the required order.
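For readers more at home in pandas, the same unpivot-then-pivot can be sketched with melt and pivot_table. This is only an illustration of the technique, not the T-SQL answer itself; the sample uses two IDs and two columns from the question for brevity:

```python
import pandas as pd

# A small sample of the source table from the question
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Values": ["1stValue", "2ndValue", "1stValue", "2ndValue"],
    "FirstCol": [5466, 3544, 845, 35646],
    "2ndCol": [34556, 957, 5466, 996],
})

# UNPIVOT: melt the column names into rows
long = df.melt(id_vars=["ID", "Values"], var_name="Cols", value_name="val")

# PIVOT: spread the Values labels back out as columns
result = (long.pivot_table(index=["ID", "Cols"], columns="Values",
                           values="val", aggfunc="sum")
              .reset_index())
print(result)
```

The melt corresponds to UNPIVOT and the pivot_table to PIVOT(SUM(...)); the same rename-to-1stCol advice applies if you want the Cols values to sort naturally.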
REVISED POST
I need a query with the desired output shown in bullet #2. Below is a simple query of the data for a specific inventoryno. Notice that avgcost can fluctuate on any given date. I need the highest avgcost on the most recent date, distinct per inventoryno.
Note: I have included sample snippets for additional reference; however, Stack Overflow links my images instead of embedding them here because I am a new OP.
Current query and output
select inventoryno, avgcost, dts
from invtrans
where DTS < '01-JAN-23'
order by dts desc;
INVENTORYNO  AVGCOST    DTS
264          52.36411   12/31/2022
264          52.36411   12/31/2022
264          52.36411   12/31/2022
507          149.83039  12/31/2022
6005         57.45968   12/31/2022
6005         57.45968   12/31/2022
6005         57.45968   12/31/2022
1518         4.05530    12/31/2022
1518         4.05530    12/31/2022
1518         4.05530    12/31/2022
1518         4.15254    12/31/2022
1518         4.15254    12/31/2022
1518         4.1525     12/31/2022
365          0.00000    2/31/2022
365          0.00000    2/31/2022
365          0.00000    2/31/2022
Snippet for above
My proposed query, which doesn't work due to 'not a single-group group function':
Select distinct inventoryno, Max(avgcost), max(dts)
from invtrans
where DTS < '01-JAN-23'
order by inventoryno;
DESIRED OUTPUT
INVENTORYNO  AVGCOST    DTS
264          52.36411   12/31/2022
507          149.83039  12/31/2022
6005         57.45968   12/31/2022
1518         4.15254    12/31/2022
365          0.00000    2/31/2022
Desired for above snippet
I have included the raw table with a few rows below for better context.
Raw table for reference
select * from invtrans
KEY     SOURCE  INVENTORYNO  WAREHOUSENO  QUANTITY  QOH   AVGCOST  DTS        EMPNO  INVTRANSNO  TOTALAMT    CO_ID
1805    INVXFER 223          3            1200      2811  0.78377  5/22/2018  999    112029      940.80000   1
076394  PROJ    223          3            -513      2298  0.78376  5/23/2018  999    112030      -402.19000  1
111722  APVCHR  223          3            3430      5728  0.79380  6/1/2018   999    112033      2862.68000  1
073455  PROJ    223          3            -209      5519  0.79392  6/8/2018   999    112034      -163.86000  1
076142  PROJ    223          3            -75       5444  0.79396  6/12/2018  999    112035      -58.80000   1
073492  PROJ    223          3            -252      5192  0.79411  6/13/2018  999    112036      -197.57000  1
072377  PROJ    223          3            -1200     3992  0.79414  8/22/2018  999    112056      -952.80000  1
If anyone could assist me further, it would be ideal for the query below to also contain the avgcost column. Otherwise I can take the fixed query from step 2 and the one below into Excel and combine them there, but I would prefer not to.
Remember, Avgcost NEEDS to be the maximum avgcost based on the most recent date. I cannot figure it out. Thank you.
select inventoryno,
count(inventoryno),
MAX(DTS),
sum(quantity),
sum(totalamt)
from invtrans
where DTS < '01-JAN-23'
group by inventoryno
order by inventoryno;
INVENTORYNO  COUNT(INVENTORYNO)  MAX(DTS)               SUM(QUANTITY)  SUM(TOTALAMT)
1            103                 11/28/2022 7:07:46 AM  75             1153.46
10           888                 9/26/2022 9:31:20 AM   0              0
100          1287                12/31/2022             162            70486.77
1001         241                 11/28/2022 7:27:04 PM  181            14207.43
1002         759                 12/31/2022             566            76424.46
1003         936                 12/31/2022             120            25252.61
1004         263                 11/30/2022 10:48:00 AM 550            1627.62
1005         487                 11/28/2022 5:05:56 PM  750            4435.51
1006         9                   11/23/2022 8:38:05 AM  1311           504.63
1008         13                  11/30/2022 10:48:00 AM 0              0
1009         38                  10/31/2022 6:50:27 AM  90             2680.36
101          535                 12/31/2022             79             48153.44
102          238                 11/28/2022 6:42:01 PM  24             17802.91
1020         2                   12/13/2019             50             119.89
1021         262                 12/31/2022             2000           4844.37
1022         656                 11/23/2022 4:49:35 PM  300            1315.17
1023         1693                12/31/2022             1260           2002.56
1025         491                 11/28/2022 5:05:56 PM  225            864.75
1026         62                  9/23/2022 4:35:14 PM   375            11956.17
1027         109                 10/28/2022 8:44:21 AM  300            2157.97
1028         39                  9/4/2019 12:30:00 AM   50             244.62
Example output of what I ultimately need
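One way to get the highest avgcost on the most recent date per inventoryno is a window-function ranking. A minimal sketch, using Python's built-in sqlite3 as a stand-in for Oracle with a hypothetical subset of the data (in Oracle itself, the equivalent idea is MAX(avgcost) KEEP (DENSE_RANK LAST ORDER BY dts) with GROUP BY inventoryno, assuming dts is a date column):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE invtrans (inventoryno INTEGER, avgcost REAL, dts TEXT)")
con.executemany("INSERT INTO invtrans VALUES (?,?,?)", [
    (264,  52.36411, "2022-12-31"),
    (1518, 4.05530,  "2022-12-30"),
    (1518, 4.05530,  "2022-12-31"),
    (1518, 4.15254,  "2022-12-31"),  # highest avgcost on the latest date wins
])

# Rank rows per inventoryno: latest date first, then highest avgcost;
# rn = 1 keeps exactly one row per inventoryno
rows = con.execute("""
    SELECT inventoryno, avgcost, dts FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY inventoryno
                                  ORDER BY dts DESC, avgcost DESC) AS rn
        FROM invtrans t) x
    WHERE rn = 1
    ORDER BY inventoryno
""").fetchall()
print(rows)
```

The same ROW_NUMBER() pattern works in Oracle as-is; the added avgcost tiebreaker in the ORDER BY is what picks the maximum avgcost when several rows share the most recent date.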
How do I fill the root_cp_id column with the cp_id of the location that doesn't end with -?
The table I have
cp_id  location
1998   180
2294   180-1
2000   220
2150   2000
2001   240
2139   240-1
2157   120
2164   120-1
2244   120-2
2227   130
The expected result
cp_id  root_cp_id  location
1998   1998        180
2294   1998        180-1
2000   2000        220
2150   2000        2000
2001   2001        240
2139   2001        240-1
2157   2157        120
2164   2157        120-1
2244   2157        120-2
2227   2227        130
Use Series.mask to turn the non-root rows into missing values, then forward fill the previous non-NaN values:
df['root_cp_id'] = df['cp_id'].mask(df['location'].str.contains('-')).ffill()
print (df)
cp_id location root_cp_id
0 1998 180 1998.0
1 2294 180-1 1998.0
2 2000 220 2000.0
3 2150 2000 2150.0
4 2001 240 2001.0
5 2139 240-1 2001.0
6 2157 120 2157.0
7 2164 120-1 2157.0
8 2244 120-2 2157.0
9 2227 130 2227.0
Or if you need it as a new second column, use DataFrame.insert:
df.insert(1, 'root_cp_id', df['cp_id'].mask(df['location'].str.contains('-')).ffill())
print (df)
cp_id root_cp_id location
0 1998 1998.0 180
1 2294 1998.0 180-1
2 2000 2000.0 220
3 2150 2150.0 2000
4 2001 2001.0 240
5 2139 2001.0 240-1
6 2157 2157.0 120
7 2164 2157.0 120-1
8 2244 2157.0 120-2
9 2227 2227.0 130
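One side effect: because mask introduces NaNs before the ffill, the filled column comes back as float (hence the .0 values in the output). If an integer column matters, a cast afterwards restores it. A minimal sketch, assuming the first row is always a root (a leading NaN would make the cast fail):

```python
import pandas as pd

df = pd.DataFrame({"cp_id": [1998, 2294, 2000],
                   "location": ["180", "180-1", "220"]})

# Same mask + ffill as above, then cast back to int
root = df["cp_id"].mask(df["location"].str.contains("-")).ffill()
df["root_cp_id"] = root.astype(int)
print(df)
```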
I'm trying to do a weekly forecast in FBProphet for just 5 weeks ahead. The make_future_dataframe method doesn't seem to be working right: it makes the correct one-week intervals except for the gap between Jul 3 and Jul 5; every other interval is correct at 7 days. Code and output below:
INPUT DATAFRAME
ds y
548 2010-01-01 3117
547 2010-01-08 2850
546 2010-01-15 2607
545 2010-01-22 2521
544 2010-01-29 2406
... ... ...
4 2020-06-05 2807
3 2020-06-12 2892
2 2020-06-19 3012
1 2020-06-26 3077
0 2020-07-03 3133
CODE
future = m.make_future_dataframe(periods=5, freq='W')
future.tail(9)
OUTPUT
ds
545 2020-06-12
546 2020-06-19
547 2020-06-26
548 2020-07-03
549 2020-07-05
550 2020-07-12
551 2020-07-19
552 2020-07-26
553 2020-08-02
All you need to do is create a dataframe with the dates you need for the predict method; using make_future_dataframe is not necessary.
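The jump from Jul 3 to Jul 5 happens because pandas' 'W' frequency is anchored weekly on Sunday ('W-SUN'), so the first generated date snaps to the next Sunday; an unanchored '7D' frequency avoids that. A minimal sketch of building the future frame yourself (m is assumed to be your fitted Prophet model):

```python
import pandas as pd

# Continue weekly from the last observed date (2020-07-03), 5 steps ahead,
# using '7D' instead of the Sunday-anchored 'W'
last_ds = pd.Timestamp("2020-07-03")
future = pd.DataFrame({"ds": pd.date_range(last_ds + pd.Timedelta(weeks=1),
                                           periods=5, freq="7D")})
print(future)
# forecast = m.predict(future)  # m is the fitted Prophet model
```

Alternatively, passing freq='W-FRI' to make_future_dataframe should keep the Friday anchor of the input data.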
I have the following data in a table:
qincId ID lc1 lc2 Time SP
963 544 22.3000526428 73.1743087769 2019-03-31 17:00:46.000 15
965 544 22.2998828888 73.1746368408 2019-03-31 17:01:07.000 2
968 544 22.2998828888 73.1746368408 2019-03-31 17:01:40.000 2
997 544 22.3010215759 73.1744003296 2019-03-31 17:06:11.000 15
998 544 22.3011436462 73.1747131348 2019-03-31 17:06:21.000 17
1010 544 22.3034667969 73.1747512817 2019-03-31 17:08:04.000 0
1011 544 22.3032741547 73.1747512817 2019-03-31 17:08:03.000 0
1565 544 22.3032035828 73.1748123169 2019-03-31 18:45:26.000 0
1571 544 22.3028964996 73.1748123169 2019-03-31 18:46:03.000 16
1573 544 22.3023796082 73.1747131348 2019-03-31 18:46:21.000 15
1575 544 22.3021774292 73.1746444702 2019-03-31 18:46:37.000 0
1577 544 22.3019657135 73.1747665405 2019-03-31 18:46:50.000 15
1586 544 22.3009243011 73.1742477417 2019-03-31 18:47:33.000 5
1591 544 22.2998828888 73.1745300293 2019-03-31 18:48:19.000 5
1592 544 22.2998828888 73.1745300293 2019-03-31 18:48:28.000 5
1593 544 22.2998981476 73.1746063232 2019-03-31 18:48:29.000 4
1597 544 22.3000450134 73.1744232178 2019-03-31 18:49:08.000 0
1677 544 22.3000450134 73.1744232178 2019-03-31 19:03:28.000 0
Now I want to calculate the time difference, only for rows with sp = 0, to their next record.
Expected output:
qincId ID lc1 lc2 Time SP TimeDiff (Minute)
963 544 22.3000526428 73.1743087769 2019-03-31 17:00:46.000 15 NULL
965 544 22.2998828888 73.1746368408 2019-03-31 17:01:07.000 2 NULL
968 544 22.2998828888 73.1746368408 2019-03-31 17:01:40.000 2 NULL
997 544 22.3010215759 73.1744003296 2019-03-31 17:06:11.000 15 NULL
998 544 22.3011436462 73.1747131348 2019-03-31 17:06:21.000 17 NULL
1010 544 22.3034667969 73.1747512817 2019-03-31 17:08:04.000 0 0.01
1011 544 22.3032741547 73.1747512817 2019-03-31 17:08:03.000 0 97
1565 544 22.3032035828 73.1748123169 2019-03-31 18:45:26.000 0 1
1571 544 22.3028964996 73.1748123169 2019-03-31 18:46:03.000 16 NULL
1573 544 22.3023796082 73.1747131348 2019-03-31 18:46:21.000 15 NULL
1575 544 22.3021774292 73.1746444702 2019-03-31 18:46:37.000 0 0.21
1577 544 22.3019657135 73.1747665405 2019-03-31 18:46:50.000 15 NULL
1586 544 22.3009243011 73.1742477417 2019-03-31 18:47:33.000 5 NULL
1591 544 22.2998828888 73.1745300293 2019-03-31 18:48:19.000 5 NULL
1592 544 22.2998828888 73.1745300293 2019-03-31 18:48:28.000 5 NULL
1593 544 22.2998981476 73.1746063232 2019-03-31 18:48:29.000 4 NULL
1597 544 22.3000450134 73.1744232178 2019-03-31 18:49:08.000 0 14
1677 544 22.3000450134 73.1744232178 2019-03-31 19:03:28.000 0 NULL
So basically I just want to calculate the time difference in minutes. How can I do this?
If by next record you mean the row that has the minimum time that is greater than the current time:
select t.*,
round(case
when t.sp = 0 then
datediff(second, t.time,
(select min(time) from tablename where time > t.time)
)
else null
end / 60.0, 2) timediff
from tablename t
You can use lead() to look at the next row directly (SQL Server 2012+):
select *, case when sp = 0 then
datediff(second, time, lead(time) over(order by time)) / 60.0 end as timediff
from table_name
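As a sanity check of the next-row idea, a minimal sketch using Python's built-in sqlite3: LEAD looks at the following row, and julianday differences stand in for SQL Server's DATEDIFF. The table and values are a hypothetical subset of the question's data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trips (qincId INTEGER, time TEXT, sp INTEGER)")
con.executemany("INSERT INTO trips VALUES (?,?,?)", [
    (1575, "2019-03-31 18:46:37", 0),
    (1577, "2019-03-31 18:46:50", 15),
    (1597, "2019-03-31 18:49:08", 0),
    (1677, "2019-03-31 19:03:28", 0),   # last row: no next record, stays NULL
])

# Minutes to the next row, only where sp = 0
rows = con.execute("""
    SELECT qincId,
           CASE WHEN sp = 0 THEN
                ROUND((julianday(LEAD(time) OVER (ORDER BY time))
                       - julianday(time)) * 24 * 60, 2)
           END AS timediff_min
    FROM trips
    ORDER BY time
""").fetchall()
print(rows)
```

Rows with sp <> 0, and the final sp = 0 row with no successor, come back as NULL, matching the expected output in the question.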
I would like to clean up some data returned from a query. This query:
select seriesId,
startDate,
reportingCountyId,
countyId,
countyName,
pocId,
pocValue
from someTable
where seriesId = 147
and pocid = 2
and countyId in (2033,2040)
order by startDate
usually returns 2 county matches for all years:
seriesId startDate reportingCountyId countyId countyName pocId pocValue
147 2004-01-01 00:00:00.000 6910 2040 CountyOne 2 828
147 2005-01-01 00:00:00.000 2998 2033 CountyTwo 2 4514
147 2005-01-01 00:00:00.000 3000 2040 CountyOne 2 2446
147 2006-01-01 00:00:00.000 3018 2033 CountyTwo 2 5675
147 2006-01-01 00:00:00.000 4754 2040 CountyOne 2 2265
147 2007-01-01 00:00:00.000 3894 2033 CountyTwo 2 6250
147 2007-01-01 00:00:00.000 3895 2040 CountyOne 2 2127
147 2008-01-01 00:00:00.000 4842 2033 CountyTwo 2 5696
147 2008-01-01 00:00:00.000 4846 2040 CountyOne 2 2013
147 2009-01-01 00:00:00.000 6786 2033 CountyTwo 2 2578
147 2009-01-01 00:00:00.000 6817 2040 CountyTwo 2 1933
147 2010-01-01 00:00:00.000 6871 2040 CountyOne 2 1799
147 2010-01-01 00:00:00.000 6872 2033 CountyTwo 2 4223
147 2011-01-01 00:00:00.000 8314 2033 CountyTwo 2 3596
147 2011-01-01 00:00:00.000 8315 2040 CountyOne 2 1559
But note that the first entry has only CountyOne for 2004. I would like to return a fake row for CountyTwo for a graph I am doing. It would be sufficient to fill it like CountyOne, only with pocValue = 0.
Thanks!
Try this (if you need a blank row for that countyId):
; with CTE AS
(SELECT 2033 As CountyId UNION SELECT 2040),
CTE2 AS
(
  select seriesId, startDate, reportingCountyId,
         countyId, countyName, pocId, pocValue
  from someTable where
  seriesId = 147 and pocid = 2 and countyId in (2033,2040)
)
SELECT x1.CountyId, x2.*, IsNull(x2.pocValue, 0) NewpocValue FROM CTE x1
LEFT OUTER JOIN CTE2 x2 ON x1.CountyId = x2.countyId
ORDER BY x2.startDate
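To actually get one row per county and year, cross join the distinct dates with the county list and then left join the real rows, so missing combinations come back with pocValue 0. A minimal sketch using Python's built-in sqlite3, with someTable reduced to the relevant columns and a hypothetical sample where 2004 has CountyOne only:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE someTable "
            "(startDate TEXT, countyId INTEGER, pocValue INTEGER)")
con.executemany("INSERT INTO someTable VALUES (?,?,?)", [
    ("2004-01-01", 2040, 828),    # 2004: CountyOne only, CountyTwo missing
    ("2005-01-01", 2033, 4514),
    ("2005-01-01", 2040, 2446),
])

# Every (year, county) pair, with 0 filled in for missing combinations
rows = con.execute("""
    SELECT d.startDate, c.countyId, COALESCE(t.pocValue, 0) AS pocValue
    FROM (SELECT DISTINCT startDate FROM someTable) d
    CROSS JOIN (SELECT 2033 AS countyId UNION SELECT 2040) c
    LEFT JOIN someTable t
           ON t.startDate = d.startDate AND t.countyId = c.countyId
    ORDER BY d.startDate, c.countyId
""").fetchall()
print(rows)
```

The same CROSS JOIN + LEFT JOIN + ISNULL/COALESCE shape carries over to SQL Server directly.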