I'm trying to query release dates for a ticker using pdblp. This is my attempt:
import pdblp

with pdblp.bopen() as c:
    c.bulkref('AUM3 Index', 'ECO_RELEASE_DT_LIST')
But the results I'm getting are between Jan 2019 and Dec 2020. How can I get information about release dates further back? I've tried to override the start date but none of the attempts succeeded:
ovrds=[('start_dt', '20000101')]
ovrds=[('start_date', '20000101')]
ovrds=[('START_DATE', '20000101')]
I've also tried:
c.bdh('AUM3 Index', 'ECO_RELEASE_DT', '20000101', '20200609')
But this one generates no results at all.
The live Bloomberg help doesn't support the Python API.
Assuming your BDS call is:
=BDS("AUM3 Index", "ECO_RELEASE_DT_LIST", "START_DATE=20000101")
You could try something like this:
c.ref(["AUM3 Index"], "ECO_RELEASE_DT_LIST", [("START_DATE","20000101"),])
Reference: https://github.com/matthewgilbert/pdblp/issues/25
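For completeness, a minimal sketch that combines the bopen() context from the question with the START_DATE override (untested against a live terminal; the override name mirrors the Excel BDS call above, and bulkref accepts the same ovrds argument as ref):

import pdblp

with pdblp.bopen() as c:
    # START_DATE override mirrors the Excel BDS call; assumed, not verified live
    df = c.bulkref('AUM3 Index', 'ECO_RELEASE_DT_LIST',
                   ovrds=[('START_DATE', '20000101')])

print(df.head())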
I have the code below, where I format a time value into a date:
const arrivalDateTime = moment(
  `${todaysDate.format("YYYY-MM-DD")} ${arrivalTime}:00:00`,
).toDate();
When debug mode is off and I select the value 8 or 9 from the control, it is not able to format the value.
When I am in debug mode and select 8 or 9, it is able to format the value as:
Fri May 22 2020 09:00:00 GMT-0600 (Mountain Daylight Time)
I have seen many threads discussing the same issue, but the solutions provided in them have not helped me format this correctly.
When I try to print the arrivalDateTime value, it shows like this in the log:
Date { NaN }
I also tried this, but it does not work; it says toDate is not a function:
moment().format(`YYYY-MM-DD ${arrivalTime}:00:00`).toDate();
It took me a while, but I finally figured it out. This is what helped me:
This didn't work:
const arrivalDateTime = moment(
  `${todaysDate.format("YYYY-MM-DD")} ${arrivalTime}:00:00`,
).toDate();
This worked:
const arrivalDateTime = moment(
  `${todaysDate.format('YYYY/MM/DD')} ${arrivalTime}:00`,
).toDate();
// note: the engine does not seem to like parsing with '-', so I changed it to '/'
I'm still not sure of the reason behind it.
I'm trying to get a streaming aggregation/groupBy working in append output mode, to be able to use the resulting stream in a stream-to-stream join. I'm working on (Py)Spark 2.3.2, and I'm consuming from Kafka topics.
My pseudo-code is something like the below, running in a Zeppelin notebook:
from pyspark.sql.functions import window, collect_list, struct, count, sum, min  # sum/min shadow the builtins here

orderStream = spark.readStream.format("kafka").option("startingOffsets", "earliest").....

orderGroupDF = (orderStream
    .withWatermark("LAST_MOD", "20 seconds")
    .groupBy("ID", window("LAST_MOD", "10 seconds", "5 seconds"))
    .agg(
        collect_list(struct("attra", "attrb2",...)).alias("orders"),
        count("ID").alias("number_of_orders"),
        sum("PLACED").alias("number_of_placed_orders"),
        min("LAST_MOD").alias("first_order_tsd")
    )
)

debug = (orderGroupDF.writeStream
    .outputMode("append")
    .format("memory").queryName("debug").start()
)
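For reference, the memory sink registers an in-memory table under the queryName set above, so it can be inspected from the notebook like this:

spark.sql("SELECT * FROM debug").show(truncate=False)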
After that, I would expect data to appear on the debug query so that I can select from it (after the late-arrival window of 20 seconds has expired). But no data ever appears on the debug query (I waited several minutes).
When I change the output mode to update, the query works immediately.
Any hint what I'm doing wrong?
EDIT: after some more experimentation, I can add the following (but I still don't understand it).
When starting the Spark application, there is quite a lot of old data (with event timestamps << current time) on the topic from which I consume. After starting, it seems to read all these messages (MicroBatchExecution in the log reports "numRowsTotal = 6224" for example), but nothing is produced on the output, and the eventTime watermark in the log from MicroBatchExecution stays at epoch (1970-01-01).
After producing a fresh message onto the input topic with eventTimestamp very close to current time, the query immediately outputs all the "queued" records at once, and bumps the eventTime watermark in the query.
What I can also see is that there seems to be an issue with the timezone. My Spark program runs in CET (currently UTC+2). The timestamps in the incoming Kafka messages are in UTC, e.g. "LAST_MOD": "2019-05-14 12:39:39.955595000". I have set spark_sess.conf.set("spark.sql.session.timeZone", "UTC"). Still, the micro-batch report after that "new" message has been produced onto the input topic says:
"eventTime" : {
"avg" : "2019-05-14T10:39:39.955Z",
"max" : "2019-05-14T10:39:39.955Z",
"min" : "2019-05-14T10:39:39.955Z",
"watermark" : "2019-05-14T10:35:25.255Z"
},
So the eventTime somehow links up with the time in the input message, but it is 2 hours off; the UTC difference seems to have been subtracted twice. Additionally, I fail to see how the watermark calculation works. Given that I set the delay to 20 seconds, I would have expected the watermark to be 20 seconds older than the max event time, but apparently it is 4 minutes 14 seconds older. I fail to see the logic behind this.
I'm very confused...
It seems that this was related to the Spark version 2.3.2 that I used, and maybe more concretely to SPARK-24156. I have upgraded to Spark 2.4.3, and there I get the results of the groupBy immediately (well, of course only after the watermark lateThreshold has expired, but "in the expected timeframe").
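For what it's worth, the "queued output" behaviour and the 4 min 14 s gap described in the question are consistent with the documented watermark rule: the watermark is advanced at the start of a trigger to the maximum event time seen in previous triggers minus the delay threshold, and in append mode a window is only emitted once the watermark passes its end. A rough sketch of that rule (my own illustration, not Spark's actual code):

from datetime import datetime, timedelta

def next_watermark(max_event_time_prev_triggers, delay_seconds):
    # Watermark for the next trigger = max event time already seen - delay.
    # In append mode a window is emitted only after the watermark passes the
    # end of that window, which is why old data sits "queued" until a fresh
    # message pushes the watermark forward.
    return max_event_time_prev_triggers - timedelta(seconds=delay_seconds)

# Illustrative only: a watermark of 10:35:25.255 with a 20 s delay implies the
# max event time seen in the *previous* trigger was about 10:35:45.255, not
# the 10:39:39.955 of the current batch.
print(next_watermark(datetime(2019, 5, 14, 10, 35, 45, 255000), 20))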
Is it some sort of overflow?
phantomjs> new Date("1400-03-01T00:00:00.000Z")
"1400-03-01T00:00:00.000Z"
phantomjs> new Date("1400-02-28T20:59:59.000Z")
"1400-02-27T20:59:59.000Z"
What you would expect:
>>(new Date("1400-03-01T00:00:00.000Z")).toISOString()
"1400-03-01T00:00:00.000Z"
>>(new Date("1400-02-28T20:59:59.000Z")).toISOString()
"1400-02-28T20:59:59.000Z"
Apparently there is a gap of 24 hours when parsing dates between the 28th of February 1400 and the 1st of March 1400.
Any ideas?
PhantomJS is obsolete anyway, but still ... our legacy tests are failing when we try to upgrade to headless Chrome ...
PhantomJS uses a version of Qt WebKit which is maintained independently of Qt.
The date format you are using is part of the ISO 8601 date and time format.
The version of Qt WebKit that PhantomJS uses has a function that parses dates of the form defined in ECMA-262-5, section 15.9.1.15 (similar to RFC 3339 / ISO 8601: YYYY-MM-DDTHH:mm:ss[.sss]Z).
In the source code, we can see that the function used to parse these types of dates is called:
double parseES5DateFromNullTerminatedCharacters(const char* dateString)
The file that contains this function in the PhantomJS repository has not been updated since July 27, 2014, while the official file was updated as recently as October 13, 2017.
It appears that there is a problem in the logic having to do with handling leap years.
Here is a comparison of DateMath.cpp between the most recent versions from the official qtwebkit repository (left) and the PhantomJS qtwebkit repository (right).
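To make the leap-year angle concrete, here is a small illustration of my own (not taken from the PhantomJS sources): 1400 is a leap year under the naive divisible-by-4 rule but not under the proleptic Gregorian rule that ECMAScript Date uses, which is exactly the kind of mismatch that shifts dates by one day around the end of February of such a year.

def is_leap_naive(year):
    # Julian-style "divisible by 4" rule
    return year % 4 == 0

def is_leap_gregorian(year):
    # Proleptic Gregorian rule used by ECMAScript Date
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

for year in (1400, 1600, 2000, 2100):
    print(year, is_leap_naive(year), is_leap_gregorian(year))
# 1400 -> True under the naive rule, False under the Gregorian rule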
I am able to import data from my Google Cloud Storage. However, I am having trouble exporting data to Google Cloud Storage CSV files through the web console. The data set is small, and I am not getting any specific reason for the issue.
Extract 9:30am
gl-analytics:glcqa.Device to gs://glccsv/device.csv
Errors:
Unexpected. Please try again.
Job ID: job_f8b50cc4b4144e14a22f3526a2b76b75
Start Time: 9:30am, 24 Jan 2013
End Time: 9:30am, 24 Jan 2013
Source Table: gl-analytics:glcqa.Device
Destination URI: gs://glccsv/device.csv
It looks like you have a nested schema, which cannot be exported to CSV. Try setting the output format to JSON.
Note this bug has now been fixed internally, so after our next release you'll get a better error when this happens.
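For anyone hitting this today, a minimal sketch of the same extract with JSON output using the google-cloud-bigquery Python client (which did not exist when this question was asked; project, table, and bucket names are taken from the job details above):

from google.cloud import bigquery

client = bigquery.Client(project="gl-analytics")

# Export the nested table as newline-delimited JSON instead of CSV.
job_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.NEWLINE_DELIMITED_JSON
)
extract_job = client.extract_table(
    "gl-analytics.glcqa.Device",   # source table
    "gs://glccsv/device.json",     # destination in Cloud Storage
    job_config=job_config,
)
extract_job.result()  # wait for the export to finish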
When using the Omniture Data Warehouse API Explorer (https://developer.omniture.com/en_US/get-started/api-explorer#DataWarehouse.Request), the following request returns a 'Date_Granularity is invalid' response. Does anyone have experience with this? The API documentation (https://developer.omniture.com/en_US/documentation/data-warehouse/pdf) states that the following values are acceptable: "none, hour, day, week, month, quarter, year."
{
"Breakdown_List":[
"evar14",
"ip",
"evar64",
"evar65",
"prop63",
"evar6",
"evar16"
],
"Contact_Name":"[hidden]",
"Contact_Phone":"[hidden]",
"Date_From":"12/01/11",
"Date_To":"12/14/11",
"Date_Type":"range",
"Email_Subject":"[hidden]",
"Email_To":"[hidden]",
"FTP_Dir":"/",
"FTP_Host":"[hidden]",
"FTP_Password":"[hidden]",
"FTP_Port":"21",
"FTP_UserName":"[hidden]",
"File_Name":"test-report",
"Metric_List":[ ],
"Report_Name":"test-report",
"rsid":"[hidden]",
"Date_Granularity":"hour",
}
Response:
{
"errors":[
"Date_Granularity is invalid."
]
}
Old question, just noticing it now.
Data Warehouse did not support the Hour granularity correctly until Jan 2013 (the error you saw was a symptom of this). Then it was corrected for date ranges of less than 14 days. In the July 2013 maintenance release of v15, the 14-day limit should be gone, but I have not verified that myself.
As always, the more data you request, the longer the DW processing will take. So I recommend keeping ranges to a maximum of a month and uncompressed file sizes under 1 GB, though I hear 2 GB should now be supported.
If you still have issues please let us know.
Thanks C.