Sybase ASE looping over two tables for data calculation - sql

I'm not well versed in SQL and am wondering how to do this in a Sybase ASE stored procedure. I would appreciate any guidance on this.
I have table-1 (t1) and table-2 (t2) that I need to loop over for calculations like (t1.c4+t2.c3)*2+(t1.c5+t2.c4)*5.
Steps:
1. Get all the rows from table-1 whose datetime column value falls within a user-given datetime range.
2. For each row from table-1, get the row(s) from table-2 where the datetime value from the table-1 row falls between the start-datetime and end-datetime column values in table-2.
3. If only one row matched from table-2, take the values from the table-1 row and the table-2 row, do the calculations, and go to step 7.
4. If more than one row matched, find the row from table-2 whose start datetime is an exact match with the datetime from the table-1 row.
5. If no exact match is found, flag an error in the table-1 row and proceed with the next row from table-1.
6. If exactly one match is found, do the calculations and go to step 7.
7. Insert the calculation result into the current row of table-1.
8. Go to step 2 until no more rows are left in table-1.
What is the optimal approach to do this? Should I use cursors or temporary tables? (A set-based sketch follows the examples below.)
T1
------------------------------------------------
C1 C2 C3 C4 C5 C6
------------------------------------------------
ABC 10 15 2019-03-01 00:30
XYZ 12 13 2019-03-01 01:00
DEF 5 7 2019-03-01 02:00
IJK 17 3 2019-03-02 01:00
T2
------------------------------------------------
C1 C2 C3 C4 C5
------------------------------------------------
LMN 1 5 2019-03-01 00:30 2019-03-02 00:00
OPQ 2 3 2019-03-01 01:00 2019-03-01 01:30
STU 4 2 2019-03-01 01:30 2019-03-01 03:00
KJF 3 1 2019-03-01 02:30 2019-03-01 03:00
User input: 2019-03-01 00:00 to 2019-03-01 00:30 (ABC & LMN Rows match)
Expected output:
------------------------------------------------------------
C1 C2 C3 C4 C5 C6
--------------------------------------------------------------
ABC 10 15 2019-03-01 00:30 (10*1)+(15*5)
XYZ 12 13 2019-03-01 01:00
DEF 5 7 2019-03-01 02:00
IJK 17 3 2019-03-02 01:00
User input: 2019-03-01 01:00 to 2019-03-01 01:30 (XYZ & OPQ Rows match)
Expected output:
------------------------------------------------------------
C1 C2 C3 C4 C5 C6
--------------------------------------------------------------
ABC 10 15 2019-03-01 00:30
XYZ 12 13 2019-03-01 01:00 (12*2)+(13*3)
DEF 5 7 2019-03-01 02:00
IJK 17 3 2019-03-02 01:00
User input: 2019-03-01 23:59 to 2019-03-02 01:00 (IJK row & no matching rows in t2)
Expected output:
----------------------------------------------------
C1 C2 C3 C4 C5 C6
----------------------------------------------------
ABC 10 15 2019-03-01 00:30
XYZ 12 13 2019-03-01 01:00
DEF 5 7 2019-03-01 02:00
IJK 17 3 2019-03-02 01:00 No matching row
User input: 2019-03-01 01:30 to 2019-03-01 02:30 (DEF row)
Expected output:
Though the DEF datetime falls within both the STU and KJF date ranges, neither row's start datetime (C4) matches the DEF datetime exactly.
----------------------------------------------------
C1 C2 C3 C4 C5 C6
----------------------------------------------------
ABC 10 15 2019-03-01 00:30
XYZ 12 13 2019-03-01 01:00
DEF 5 7 2019-03-01 02:00 No unique match
IJK 17 3 2019-03-02 01:00
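A cursor will work, but in ASE a few set-based passes, staging the match counts in a temporary table, are usually much faster than row-by-row processing. Below is a minimal sketch, not a definitive implementation: it assumes the example schemas above (t1.c4 is the row datetime, t2.c4/t2.c5 are the start/end datetimes, c1 uniquely identifies a t1 row, and t1.c6 starts out null and receives either the computed value or a flag), and it uses the formula implied by the expected output, (t1.c2*t2.c2)+(t1.c3*t2.c3). The procedure and parameter names are made up.

create procedure p_calc
    @from datetime,
    @to   datetime
as
begin
    -- step 2: count the candidate t2 rows per t1 row in the range
    select t1.c1, count(*) as cnt
    into   #m
    from   t1, t2
    where  t1.c4 between @from and @to
    and    t1.c4 between t2.c4 and t2.c5
    group  by t1.c1

    -- steps 3/4/6/7: a single candidate, or the exact
    -- start-datetime match among several candidates
    update t1
    set    c6 = convert(varchar(30), (t1.c2 * t2.c2) + (t1.c3 * t2.c3))
    from   t1, t2, #m
    where  t1.c1 = #m.c1
    and    t1.c4 between t2.c4 and t2.c5
    and    (#m.cnt = 1 or t1.c4 = t2.c4)

    -- step 5: several candidates but no exact start-datetime match
    update t1
    set    c6 = 'No unique match'
    from   t1, #m
    where  t1.c1 = #m.c1
    and    #m.cnt > 1
    and    t1.c6 is null

    -- rows in range with no matching t2 row at all
    update t1
    set    c6 = 'No matching row'
    where  t1.c4 between @from and @to
    and    t1.c6 is null

    drop table #m
end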

Related

excel index match with pandas

I am trying to replicate Excel's INDEX/MATCH in pandas, so as to produce a new column which copies the date of the first occurrence of the value in colB being matched or exceeded by the value in colC.
date colA colB colC colD desired_output
0 2020-04-01 00:00:00 2 1 e 2020-04-02 00:00:00
1 2020-04-02 00:00:00 8 4 4 d 2020-04-02 00:00:00
2 2020-04-03 00:00:00 1 2 a 2020-04-03 00:00:00
3 2020-04-04 00:00:00 4 2 3 b 2020-04-04 00:00:00
4 2020-04-05 00:00:00 5 3 1 c 2020-04-07 00:00:00
5 2020-04-06 00:00:00 9 4 1 m
6 2020-04-07 00:00:00 5 3 3 c 2020-04-07 00:00:00
Here is the code that I have tried so far, unsuccessfully:
col_6 = []
for ind in df3.index:
    if df3['colC'][ind] >= df3['colB']:
        col_6.append(df3['date'][ind])
    else:
        col_6.append('')
df3['desired_output'] = col_6
and have also tried:
col_6 = []
for ind in df3.index:
    if df3['colB'][ind] <= df3['colC']:
        col_6.append(df3['date'][ind])
    else:
        col_6.append('')
df3['desired_output'] = col_6
This second attempt has come the closest, but it only produces results when the 'if' condition is satisfied within the same index row of the dataframe. For instance, the value of colB in index row 4 is exceeded by the value of colC in index row 6, but my attempted code fails to capture this sort of occurrence.
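One way to capture the cross-row case is to scan forward from each row for the first row whose colC matches or exceeds the current row's colB. A minimal sketch, assuming df3 has a default RangeIndex and the column names shown above (first_match_date is a helper name made up here for illustration):

import pandas as pd

def first_match_date(df):
    # For each row, look at the rows at or after it and copy the date
    # of the first one whose colC matches or exceeds this row's colB.
    out = []
    for i in df.index:
        hits = df.loc[i:, 'colC'] >= df.loc[i, 'colB']
        out.append(df.loc[hits.idxmax(), 'date'] if hits.any() else pd.NaT)
    return out

df3['desired_output'] = first_match_date(df3)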

Pandas Normalization using groupby

I have a DataFrame with two columns: the first column, A1, holds dates ranging from 2015-01-01 to 2019-01-01, and the second column, B1, has some random values. I want to create a new column which should look like the example below. My data looks like this:
A1 B1
2015-01-01 A
2015-02-01 A
2015-03-01 A
2015-04-01 A
2015-01-01 B
2015-02-01 B
-----
and I want a new column like below
A1 B1 B
2015-01-01 A 0
2015-02-01 A 1
2015-03-01 A 2
2015-04-01 A 3
2015-01-01 B 0
2015-02-01 B 1
I think I am supposed to use the groupby function on B1, but I am not sure how to do that.
Use groupby.cumcount:
df.assign(B=df.groupby('B1').cumcount())
A1 B1 B
0 2015-01-01 A 0
1 2015-02-01 A 1
2 2015-03-01 A 2
3 2015-04-01 A 3
4 2015-01-01 B 0
5 2015-02-01 B 1
In place:
df['B'] = df.groupby('B1').cumcount()

SQL How to order by and keep expanded row by another order?

My result table is this:
FDate FTime FNo FId FRID FRCont
2016-12-19 07:25:00 1254 A1 A1 1
2016-12-19 08:45:00 1322 A2 A1 2
2016-12-19 13:20:00 4521 B1 B1 1
2016-12-19 16:40:00 7841 B2 B1 2
2016-12-19 20:45:00 1258 B3 B1 3
2016-12-19 11:25:00 3254 C1 C1 1
2016-12-19 13:10:00 3145 C2 C1 2
2016-12-19 15:20:00 3333 C3 C1 3
2016-12-20 07:35:00 7777 C4 C1 4
2016-12-20 08:50:00 7851 D1 D1 1
2016-12-20 10:30:00 45123 D2 D1 2
I want to order by the date and time of the FRCont = 1 rows, but without changing the grouping given by the FRID and FRCont columns. It should look like this:
FDate FTime FNo FId FRID FRCont
2016-12-19 07:25:00 1254 A1 A1 1
2016-12-19 08:45:00 1322 A2 A1 2
2016-12-19 11:25:00 3254 C1 C1 1
2016-12-19 13:10:00 3145 C2 C1 2
2016-12-19 15:20:00 3333 C3 C1 3
2016-12-20 07:35:00 7777 C4 C1 4
2016-12-19 13:20:00 4521 B1 B1 1
2016-12-19 16:40:00 7841 B2 B1 2
2016-12-19 20:45:00 1258 B3 B1 3
2016-12-20 08:50:00 7851 D1 D1 1
2016-12-20 10:30:00 45123 D2 D1 2
Please show me any way to do this in a SQL Server query. Thanks a lot.
I think you are looking for something like this:
SELECT FDate, FTime, FNo, FId, FRID, FRCont
FROM (
    SELECT FDate, FTime, FNo, FId, FRID, FRCont,
           MIN(FDate) OVER (PARTITION BY FRID) AS Min_Date,
           MIN(FTime) OVER (PARTITION BY FRID) AS Min_Time
    FROM mytable ) AS t
ORDER BY Min_Date, Min_Time, FRID, FDate, FTime
The pair (Min_Date, Min_Time) gives the starting datetime value per FRID slice. Using this pair we can order the slices, placing first the slice having the lowest datetime value, followed by the slice having the next datetime value, and so on.
You seem to want to sort by the minimum date/time for each group:
select t.*
from t
order by min(date + time) over (partition by frid),
         frid,
         fid;
Note: You might have to convert the date/time to datetime for the addition to work.
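On SQL Server, that conversion could look like this (a sketch, assuming fdate is a date column and ftime a time column):

select t.*
from t
order by min(cast(fdate as datetime) + cast(ftime as datetime))
             over (partition by frid),
         frid,
         fid;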
Please try this:
select FDate, FTime, FNo, FId, FRID, FRCont
from (
    select t.*,
           min(fdate + ftime) over (partition by frid) as mn
    from t) t
order by mn, frcont;

pandas grouper int by frequency

I would like to group a Pandas dataframe by hour disregarding the date.
My data:
id opened_at count sum
2016-07-01 07:02:05 1 46.14
154 2016-07-01 07:34:02 1 479
2016-07-01 10:10:01 1 127.14
2016-07-02 12:01:04 1 8.14
2016-07-02 12:00:50 1 18.14
I am able to group by hour with the date taken into account by using the following:
groupByLocationDay = df.groupby([df.id,
                                 pd.Grouper(key='opened_at', freq='3h')])
I get the following
id opened_at count sum
2016-07-01 06:00:00 2 4296.14
154 2016-07-01 09:00:00 46 43716.79
2016-07-01 12:00:00 169 150827.14
2016-07-02 12:00:00 17 1508.14
2016-07-02 09:00:00 10 108.14
How can I group by the hour only, so that it would look like the following?
id opened_at count sum
06:00:00 2 4296.14
154 09:00:00 56 43824.93
12:00:00 203 152335.28
The original data is on an hourly basis, thus I need the 3h frequency.
Thanks!
You can do it this way:
In [134]: df
Out[134]:
id opened_at count sum
0 154 2016-07-01 07:02:05 1 46.14
1 154 2016-07-01 07:34:02 1 479.00
2 154 2016-07-01 10:10:01 1 127.14
3 154 2016-07-02 12:01:04 1 8.14
4 154 2016-07-02 12:00:50 1 18.14
5 154 2016-07-02 08:34:02 1 479.00
In [135]: df.groupby(['id', df.opened_at.dt.hour // 3 * 3]).sum()
Out[135]:
count sum
id opened_at
154 6 3 1004.14
9 1 127.14
12 2 26.28
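If you want the 3-hour buckets labeled as times of day (as in the desired output) rather than as integer hours, a possible follow-up, assuming the same df:

out = df.groupby(['id', df.opened_at.dt.hour // 3 * 3]).sum()
# relabel the integer hour buckets (6, 9, 12) as time-of-day strings
out = out.rename(index=lambda h: '%02d:00:00' % h, level='opened_at')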

how to find the date difference in hours between two records with nearest datetime value and it must be compared in same group

How can I find the date difference in hours between two records with the nearest datetime values, where the records must be compared within the same group?
Sample Data as follows:
Select * from tblGroup
Group FinishedDatetime
1 03-01-2009 00:00
1 13-01-2009 22:00
1 08-01-2009 03:00
2 01-01-2009 10:00
2 13-01-2009 20:00
2 10-01-2009 10:00
3 27-10-2008 00:00
3 29-10-2008 00:00
Expected Output :
Group FinishedDatetime Hours
1 03-01-2009 00:00 123
1 13-01-2009 22:00 139
1 08-01-2009 03:00 117
2 01-01-2009 10:00 216
2 13-01-2009 20:00 82
2 10-01-2009 10:00 82
3 27-10-2008 00:00 48
3 29-10-2008 00:00 48
Try this:
Select t1.[Group], t1.FinishedDatetime,
       DATEDIFF(HOUR, z.FinishedDatetime, t1.FinishedDatetime) AS Hours
FROM tblGroup t1
OUTER APPLY (SELECT TOP 1 *
             FROM tblGroup t2
             WHERE t2.[Group] = t1.[Group]
               AND t2.FinishedDatetime < t1.FinishedDatetime
             ORDER BY FinishedDatetime DESC) z
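Note that this only looks at earlier rows, so the earliest row in each group gets NULL. If "nearest" means the closest record in either direction, as the expected output suggests, a variant along these lines (a sketch, not tested against your data) orders the candidates by absolute distance:

SELECT t1.[Group], t1.FinishedDatetime,
       ABS(DATEDIFF(HOUR, z.FinishedDatetime, t1.FinishedDatetime)) AS Hours
FROM tblGroup t1
OUTER APPLY (SELECT TOP 1 t2.FinishedDatetime
             FROM tblGroup t2
             WHERE t2.[Group] = t1.[Group]
               AND t2.FinishedDatetime <> t1.FinishedDatetime
             ORDER BY ABS(DATEDIFF(HOUR, t2.FinishedDatetime, t1.FinishedDatetime))) z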