I have the following data in a table:
qincId ID lc1 lc2 Time SP
963 544 22.3000526428 73.1743087769 2019-03-31 17:00:46.000 15
965 544 22.2998828888 73.1746368408 2019-03-31 17:01:07.000 2
968 544 22.2998828888 73.1746368408 2019-03-31 17:01:40.000 2
997 544 22.3010215759 73.1744003296 2019-03-31 17:06:11.000 15
998 544 22.3011436462 73.1747131348 2019-03-31 17:06:21.000 17
1010 544 22.3034667969 73.1747512817 2019-03-31 17:08:04.000 0
1011 544 22.3032741547 73.1747512817 2019-03-31 17:08:03.000 0
1565 544 22.3032035828 73.1748123169 2019-03-31 18:45:26.000 0
1571 544 22.3028964996 73.1748123169 2019-03-31 18:46:03.000 16
1573 544 22.3023796082 73.1747131348 2019-03-31 18:46:21.000 15
1575 544 22.3021774292 73.1746444702 2019-03-31 18:46:37.000 0
1577 544 22.3019657135 73.1747665405 2019-03-31 18:46:50.000 15
1586 544 22.3009243011 73.1742477417 2019-03-31 18:47:33.000 5
1591 544 22.2998828888 73.1745300293 2019-03-31 18:48:19.000 5
1592 544 22.2998828888 73.1745300293 2019-03-31 18:48:28.000 5
1593 544 22.2998981476 73.1746063232 2019-03-31 18:48:29.000 4
1597 544 22.3000450134 73.1744232178 2019-03-31 18:49:08.000 0
1677 544 22.3000450134 73.1744232178 2019-03-31 19:03:28.000 0
Now I want to calculate the time difference, only for rows with SP = 0, between each such row and its next record.
Expected output:
qincId ID lc1 lc2 Time SP TimeDiff (Minute)
963 544 22.3000526428 73.1743087769 2019-03-31 17:00:46.000 15 NULL
965 544 22.2998828888 73.1746368408 2019-03-31 17:01:07.000 2 NULL
968 544 22.2998828888 73.1746368408 2019-03-31 17:01:40.000 2 NULL
997 544 22.3010215759 73.1744003296 2019-03-31 17:06:11.000 15 NULL
998 544 22.3011436462 73.1747131348 2019-03-31 17:06:21.000 17 NULL
1010 544 22.3034667969 73.1747512817 2019-03-31 17:08:04.000 0 0.01
1011 544 22.3032741547 73.1747512817 2019-03-31 17:08:03.000 0 97
1565 544 22.3032035828 73.1748123169 2019-03-31 18:45:26.000 0 1
1571 544 22.3028964996 73.1748123169 2019-03-31 18:46:03.000 16 NULL
1573 544 22.3023796082 73.1747131348 2019-03-31 18:46:21.000 15 NULL
1575 544 22.3021774292 73.1746444702 2019-03-31 18:46:37.000 0 0.21
1577 544 22.3019657135 73.1747665405 2019-03-31 18:46:50.000 15 NULL
1586 544 22.3009243011 73.1742477417 2019-03-31 18:47:33.000 5 NULL
1591 544 22.2998828888 73.1745300293 2019-03-31 18:48:19.000 5 NULL
1592 544 22.2998828888 73.1745300293 2019-03-31 18:48:28.000 5 NULL
1593 544 22.2998981476 73.1746063232 2019-03-31 18:48:29.000 4 NULL
1597 544 22.3000450134 73.1744232178 2019-03-31 18:49:08.000 0 14
1677 544 22.3000450134 73.1744232178 2019-03-31 19:03:28.000 0 NULL
So basically I just want to calculate the time difference in minutes.
How can I do this?
If by next record you mean the row that has the minimum time that is greater than the current time:
select t.*,
       round(case when t.sp = 0
                  then datediff(second, t.time,
                                (select min(time) from tablename where time > t.time))
             end / 60.0, 2) as timediff
from tablename t
You can try using lead() (SQL Server version >= 2012). Note that lead() looks at the next row, which is what "from their next record" requires; lag() would look at the previous row and give a negative difference:
select *,
       case when sp = 0
            then datediff(second, time, lead(time) over (order by time)) / 60.0
       end as timediff
from table_name
I'm trying to do a weekly forecast in FBProphet for just 5 weeks ahead. The make_future_dataframe method doesn't seem to be working right: it makes the correct one-week intervals except for a single gap between Jul 3 and Jul 5; every other interval is correct at 7 days. Code and output below:
INPUT DATAFRAME
ds y
548 2010-01-01 3117
547 2010-01-08 2850
546 2010-01-15 2607
545 2010-01-22 2521
544 2010-01-29 2406
... ... ...
4 2020-06-05 2807
3 2020-06-12 2892
2 2020-06-19 3012
1 2020-06-26 3077
0 2020-07-03 3133
CODE
future = m.make_future_dataframe(periods=5, freq='W')
future.tail(9)
OUTPUT
ds
545 2020-06-12
546 2020-06-19
547 2020-06-26
548 2020-07-03
549 2020-07-05
550 2020-07-12
551 2020-07-19
552 2020-07-26
553 2020-08-02
All you need to do is create a dataframe with the dates you need for the predict method; using the make_future_dataframe method is not necessary.
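A minimal sketch of building that dataframe by hand with pandas. The Jul 3 → Jul 5 jump happens because freq='W' is an alias for 'W-SUN', so the first generated date snaps from the last history date (a Friday) to the next Sunday; anchoring the alias to the history's weekday keeps a clean 7-day spacing. The dates below are taken from the question's data; the variable names are just for illustration:

```python
import pandas as pd

# History ends on Friday 2020-07-03. freq='W' means 'W-SUN', which snaps
# the first future date to Sunday 2020-07-05. Anchoring to Friday avoids that.
last_date = pd.Timestamp('2020-07-03')

# date_range includes the on-anchor start date itself, so drop it with [1:]
future_dates = pd.date_range(start=last_date, periods=6, freq='W-FRI')[1:]
future = pd.DataFrame({'ds': future_dates})
print(future)
```

The resulting frame has a single 'ds' column, which is the shape Prophet's predict method expects.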
I am using an old SQL Server 2000.
Here is some sample data:
ROOMDATE rate bus_id quantity
2018-09-21 00:00:00.000 129 346686 2
2018-09-21 00:00:00.000 162 354247 36
2018-09-21 00:00:00.000 159 382897 150
2018-09-21 00:00:00.000 120 556111 25
2018-09-22 00:00:00.000 129 346686 8
2018-09-22 00:00:00.000 162 354247 86
2018-09-22 00:00:00.000 159 382897 150
2018-09-22 00:00:00.000 120 556111 25
2018-09-23 00:00:00.000 129 346686 23
2018-09-23 00:00:00.000 162 354247 146
2018-09-23 00:00:00.000 159 382897 9
2018-09-23 00:00:00.000 94 570135 23
Essentially what I want is the MAX quantity of each day with its corresponding rate and bus_id.
For example, I would want the following rows from my sample data above:
ROOMDATE rate bus_id quantity
2018-09-21 00:00:00.000 159 382897 150
2018-09-22 00:00:00.000 159 382897 150
2018-09-23 00:00:00.000 162 354247 146
From what I have read, SQL Server 2000 does not support ROW_NUMBER, but we can phrase the query using a subquery that finds the max quantity for each day:
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT
CONVERT(char(10), ROOMDATE, 120) AS ROOMDATE,
MAX(quantity) AS max_quantity
FROM yourTable
GROUP BY CONVERT(char(10), ROOMDATE, 120)
) t2
ON CONVERT(char(10), t1.ROOMDATE, 120) = t2.ROOMDATE AND
t1.quantity = t2.max_quantity
ORDER BY
t1.ROOMDATE;
I am creating a DataFrame from a csv file, where my index (rows) is date and my column names are names of cities.
After I create the raw DataFrame, I am trying to create a DataFrame from selected columns. I have tried:
A=df['city1'] #city 1
B=df['city2']
C=pd.merge(A,B)
but it doesn't work. This is what A and B look like:
Date
2013-11-01 2.56
2013-12-01 1.77
2014-01-01 0.00
2014-02-01 0.38
2014-03-01 13.16
2014-04-01 10.29
2014-05-01 15.43
2014-06-01 11.48
2014-07-01 8.54
2014-08-01 11.11
2014-09-01 2.71
2014-10-01 4.16
2014-11-01 13.01
2014-12-01 9.59
Name: Seattle.Washington, dtype: float64
And this is what I am looking to create:
City1 City2
Date
2013-11-01 0.00 2.94
2013-12-01 8.26 3.41
2014-01-01 1.11 14.27
2014-02-01 32.86 84.26
2014-03-01 34.12 0.00
2014-04-01 68.39 0.00
2014-05-01 27.17 9.09
2014-06-01 10.47 32.00
2014-07-01 14.19 26.83
2014-08-01 14.91 6.36
2014-09-01 3.76 8.32
2014-10-01 5.83 2.19
2014-11-01 10.79 2.64
2014-12-01 21.24 8.08
Any suggestions?
Error Message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-222-ec50ff9f372f> in <module>()
14 S = df['City1']
15 A = df['City2']
16
---> 17 print merge(S,A)
18 #df2=pd.merge(A,A)
19 #print df2
C:\...\merge.pyc in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy)
36 right_on=right_on, left_index=left_index,
37 right_index=right_index, sort=sort, suffixes=suffixes,
---> 38 copy=copy)
39 return op.get_result()
40 if __debug__:
Answer (courtesy of @EdChum):
df[['City1', 'City2']]
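The error arises because indexing with a single column name returns a Series, and pd.merge expects DataFrames. Indexing with a list of names returns a DataFrame directly, so no merge is needed. A short sketch with made-up values standing in for the CSV-backed frame:

```python
import pandas as pd

# Toy stand-in for the CSV-backed df (values are hypothetical).
df = pd.DataFrame(
    {'City1': [2.56, 1.77], 'City2': [2.94, 3.41]},
    index=pd.to_datetime(['2013-11-01', '2013-12-01']),
)
df.index.name = 'Date'

# Double brackets select a list of columns -> DataFrame, not Series.
subset = df[['City1', 'City2']]

# If the columns already live in separate Series, concatenating
# along axis=1 (columns) also produces the desired two-column frame.
combined = pd.concat([df['City1'], df['City2']], axis=1)
```

Both subset and combined here are DataFrames indexed by Date with one column per city.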
I have a table with columns and values like:
ID Values FirstCol 2ndCol 3rdCol 4thCol 5thCol
1 1stValue 5466 34556 53536 54646 566
1 2ndValue 3544 957 667 1050 35363
1 3rdValue 1040 1041 4647 6477 1045
1 4thValue 1048 3546 1095 1151 65757
2 1stValue 845 5466 86578 885 859
2 2ndValue 35646 996 1300 7101 456467
2 3rdValue 102 46478 565 657 107
2 4thValue 5509 55110 1411 1152 1144
3 1stValue 845 854 847 884 675
3 2ndValue 984 994 4647 1041 1503
3 3rdValue 1602 1034 1034 1055 466
3 4thValue 1069 1610 6111 1124 1144
Now I want a result set in the form below. Is this possible with a Pivot or Case statement?
ID Cols 1stValue 2ndValue 3rdValue 4thValue
1 FirstCol 5466 3544 1040 1048
1 2ndCol 34556 957 1041 3546
1 3rdCol 53536 667 4647 1095
1 4thCol 54646 1050 6477 1151
1 5thCol 566 35363 1045 65757
2 FirstCol 845 35646 102 5509
2 2ndCol 5466 996 46478 55110
2 3rdCol 86578 1300 565 1411
2 4thCol 885 7101 657 1152
2 5thCol 859 456467 107 1144
3 FirstCol 845 984 1602 1069
3 2ndCol 854 994 1034 1610
3 3rdCol 847 4647 1034 6111
3 4thCol 884 1041 1055 1124
3 5thCol 675 1503 466 1144
Assuming the table name is t1 this should do the trick:
SELECT * FROM t1
UNPIVOT (val FOR name IN ([FirstCol], [2ndCol], [3rdCol], [4thCol], [5thCol])) unpiv
PIVOT (SUM(val) FOR [Values] IN ([1stValue], [2ndValue], [3rdValue], [4thValue])) piv
There's a sorting issue; it would be good to rename FirstCol to 1stCol, and then ORDER BY ID, name would put the rows in the required order.
I would like to clean up some data returned from a query. This query:
select seriesId,
startDate,
reportingCountyId,
countyId,
countyName,
pocId,
pocValue
from someTable
where seriesId = 147
and pocid = 2
and countyId in (2033,2040)
order by startDate
usually returns 2 county matches for all years:
seriesId startDate reportingCountyId countyId countyName pocId pocValue
147 2004-01-01 00:00:00.000 6910 2040 CountyOne 2 828
147 2005-01-01 00:00:00.000 2998 2033 CountyTwo 2 4514
147 2005-01-01 00:00:00.000 3000 2040 CountyOne 2 2446
147 2006-01-01 00:00:00.000 3018 2033 CountyTwo 2 5675
147 2006-01-01 00:00:00.000 4754 2040 CountyOne 2 2265
147 2007-01-01 00:00:00.000 3894 2033 CountyTwo 2 6250
147 2007-01-01 00:00:00.000 3895 2040 CountyOne 2 2127
147 2008-01-01 00:00:00.000 4842 2033 CountyTwo 2 5696
147 2008-01-01 00:00:00.000 4846 2040 CountyOne 2 2013
147 2009-01-01 00:00:00.000 6786 2033 CountyTwo 2 2578
147 2009-01-01 00:00:00.000 6817 2040 CountyTwo 2 1933
147 2010-01-01 00:00:00.000 6871 2040 CountyOne 2 1799
147 2010-01-01 00:00:00.000 6872 2033 CountyTwo 2 4223
147 2011-01-01 00:00:00.000 8314 2033 CountyTwo 2 3596
147 2011-01-01 00:00:00.000 8315 2040 CountyOne 2 1559
But please note that the first entry has only CountyOne for 2004. I would like to return a fake row for CountyTwo for a graph I am doing. It would be sufficient to fill it like the CountyOne row, only with pocValue = 0.
Thanks!
Try this (if you need a blank row for that countyId). The original snippet was missing the SELECT keyword in the second CTE, had an ORDER BY inside the CTE (not allowed), and mixed up the x/x1 aliases; it also joined on reportingCountyId, which holds surrogate ids, rather than countyId:
;WITH CTE AS
(
    SELECT 2033 AS CountyId UNION SELECT 2040
),
CTE2 AS
(
    SELECT seriesId, startDate, reportingCountyId,
           countyId, countyName, pocId, pocValue
    FROM someTable
    WHERE seriesId = 147 AND pocid = 2 AND countyId IN (2033, 2040)
)
SELECT x1.CountyId, x2.*, ISNULL(x2.pocValue, 0) AS NewpocValue
FROM CTE x1
LEFT OUTER JOIN CTE2 x2 ON x1.CountyId = x2.countyId
ORDER BY x2.startDate;