Omniture Data Warehouse API Not Allowing 'hour' Value for Date_Granularity - api

When using the Omniture Data Warehouse API Explorer ( https://developer.omniture.com/en_US/get-started/api-explorer#DataWarehouse.Request ), the following request returns a 'Date_Granularity is invalid' response. Does anyone have experience with this? The API documentation ( https://developer.omniture.com/en_US/documentation/data-warehouse/pdf ) states that the following values are acceptable: "none, hour, day, week, month, quarter, year."
{
    "Breakdown_List":[
        "evar14",
        "ip",
        "evar64",
        "evar65",
        "prop63",
        "evar6",
        "evar16"
    ],
    "Contact_Name":"[hidden]",
    "Contact_Phone":"[hidden]",
    "Date_From":"12/01/11",
    "Date_To":"12/14/11",
    "Date_Type":"range",
    "Email_Subject":"[hidden]",
    "Email_To":"[hidden]",
    "FTP_Dir":"/",
    "FTP_Host":"[hidden]",
    "FTP_Password":"[hidden]",
    "FTP_Port":"21",
    "FTP_UserName":"[hidden]",
    "File_Name":"test-report",
    "Metric_List":[],
    "Report_Name":"test-report",
    "rsid":"[hidden]",
    "Date_Granularity":"hour"
}
Response:
{
    "errors":[
        "Date_Granularity is invalid."
    ]
}

Old question, just noticing it now.
Data Warehouse did not support the Hour granularity correctly until Jan 2013 (the error you saw was a symptom of this). It was then corrected for date ranges of less than 14 days. In the July 2013 maintenance release of v15 the 14-day limit should be gone, but I have not verified that myself.
As always, the more data you request, the longer the DW processing will take, so I recommend keeping ranges to a maximum of a month and uncompressed file sizes under 1 GB, though I hear 2 GB should now be supported.
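For reference, here is a minimal Python sketch of an hour-granularity request kept under the 14-day limit. The REST endpoint and the WSSE-style auth header are assumptions on my part (adjust both for your environment); the payload fields come from the request in the question.
import requests

# Assumed endpoint for the DataWarehouse.Request method; adjust for your data
# center, and add the FTP/contact fields from the request above.
API_URL = "https://api.omniture.com/admin/1.4/rest/?method=DataWarehouse.Request"

payload = {
    "rsid": "[hidden]",
    "Report_Name": "test-report",
    "File_Name": "test-report",
    "Breakdown_List": ["evar14", "ip", "evar64", "evar65", "prop63", "evar6", "evar16"],
    "Metric_List": [],
    "Date_Type": "range",
    "Date_From": "12/01/11",
    "Date_To": "12/13/11",        # range kept under 14 days per the Jan 2013 fix
    "Date_Granularity": "hour",
}

# Placeholder auth: the REST API expects a WSSE header built from your API
# username and shared secret.
headers = {"X-WSSE": "[hidden]"}

response = requests.post(API_URL, json=payload, headers=headers)
print(response.json())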
If you still have issues please let us know.
Thanks C.

Related

Mediawiki API - Throwing error getting previous date

I am new to this API, the MediaWiki web service API.
I am trying to pull data from the table on this page: https://lessonslearned.em.se.com/lessons/Main_Page
I'm testing it using its sandbox: https://lessonslearned.em.se.com/lessons/Special:ApiSandbox
I can view the source of the page, for example:
{{#ask:[[Category:Lesson]][[Has region::NAM]][[Creation date::>{{#time: r | -1 year}}]]|format=count}}
The code above should return the lessons from NAM created within the last 12 months, which should be 1 based on the table. But I am getting an error in this part: [[Creation date::>{{#time: r | -1 year}}]].
Error message:
{ "error": { "query": [ "\"Thu, 31 Mar 2022 03:50:49 +0000 -1 03202233103\" contains an extrinsic dash or other characters that are invalid for a date interpretation." ] } }
I tried to break the code into parts and found that the root cause is the hyphen (-) in -1 year. I also checked the documentation of the #time parser function ( https://www.mediawiki.org/wiki/Help:Extension:ParserFunctions##time ) and it has samples similar to this, but in my case it is not working.
Hope someone can give me at least a reference for this problem. Thanks!
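One hedged workaround sketch, assuming the Semantic MediaWiki ask API module (action=ask) is enabled on that wiki and that api.php sits under /lessons/: compute the cutoff date outside of wikitext and pass it into the query, so #time and its hyphenated relative offset are never involved.
import datetime
import requests

# Assumed endpoint; the sandbox URL above suggests api.php lives under /lessons/.
API_URL = "https://lessonslearned.em.se.com/lessons/api.php"

# Compute "one year ago" in Python instead of with {{#time: r | -1 year}}.
cutoff = (datetime.date.today() - datetime.timedelta(days=365)).isoformat()

params = {
    "action": "ask",   # Semantic MediaWiki ask API module (assumed to be enabled)
    "query": f"[[Category:Lesson]][[Has region::NAM]][[Creation date::>{cutoff}]]",
    "format": "json",
}

response = requests.get(API_URL, params=params)
print(response.json())   # inspect the result; the ask output should include a result count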

Spark structured streaming groupBy not working in append mode (works in update)

I'm trying to get a streaming aggregation/groupBy working in append output mode, to be able to use the resulting stream in a stream-to-stream join. I'm working on (Py)Spark 2.3.2, and I'm consuming from Kafka topics.
My pseudo-code is something like the following, running in a Zeppelin notebook:
from pyspark.sql.functions import collect_list, count, min, struct, sum, window

# note: readStream is a property in PySpark, not a method
orderStream = spark.readStream.format("kafka").option("startingOffsets", "earliest").....

orderGroupDF = (orderStream
    .withWatermark("LAST_MOD", "20 seconds")
    .groupBy("ID", window("LAST_MOD", "10 seconds", "5 seconds"))
    .agg(
        collect_list(struct("attra", "attrb2", ...)).alias("orders"),
        count("ID").alias("number_of_orders"),
        sum("PLACED").alias("number_of_placed_orders"),
        min("LAST_MOD").alias("first_order_tsd")
    )
)

debug = (orderGroupDF.writeStream
    .outputMode("append")
    .format("memory")
    .queryName("debug")
    .start()
)
After that, I would expect data to appear on the debug query so I can select from it (after the late-arrival window of 20 seconds has expired). But no data ever appears on the debug query (I waited several minutes).
When I change the output mode to update, the query works immediately.
Any hint what I'm doing wrong?
EDIT: after some more experimentation, I can add the following (but I still don't understand it).
When starting the Spark application, there is quite a lot of old data (with event timestamps << current time) on the topic from which I consume. After starting, it seems to read all these messages (MicroBatchExecution in the log reports "numRowsTotal = 6224" for example), but nothing is produced on the output, and the eventTime watermark in the log from MicroBatchExecution stays at epoch (1970-01-01).
After producing a fresh message onto the input topic with eventTimestamp very close to current time, the query immediately outputs all the "queued" records at once, and bumps the eventTime watermark in the query.
What I can also see is that there seems to be an issue with the timezone. My Spark program runs in CET (UTC+2 currently). The timestamps in the incoming Kafka messages are in UTC, e.g. "LAST__MOD": "2019-05-14 12:39:39.955595000". I have set spark_sess.conf.set("spark.sql.session.timeZone", "UTC"). Still, the microbatch report after that "new" message has been produced onto the input topic says
"eventTime" : {
"avg" : "2019-05-14T10:39:39.955Z",
"max" : "2019-05-14T10:39:39.955Z",
"min" : "2019-05-14T10:39:39.955Z",
"watermark" : "2019-05-14T10:35:25.255Z"
},
So the eventTime somehow links up with the time in the input message, but it is 2 hours off; the UTC difference has been subtracted twice. Additionally, I fail to see how the watermark calculation works. Given that I set it to 20 seconds, I would have expected it to be 20 seconds older than the max eventTime, but apparently it is 4 min 14 s older. I fail to see the logic behind this.
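To make the expectation concrete, here is the arithmetic I had in mind (a rough sketch, using the timestamps from the micro-batch report above):
from datetime import datetime, timedelta

# Expected: watermark = max event time - watermark delay (20 seconds)
max_event_time = datetime.fromisoformat("2019-05-14T10:39:39.955")
expected_watermark = max_event_time - timedelta(seconds=20)
print(expected_watermark)   # 2019-05-14 10:39:19.955000
# Observed watermark in the report: 2019-05-14T10:35:25.255Z, about 4 min 14 s behind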
I'm very confused...
It seems that this was related to the Spark version 2.3.2 that I used, and maybe more concretely to SPARK-24156. I have upgraded to Spark 2.4.3 and there I get the results of the groupBy immediately (well, of course after the watermark lateThreshold has expired, but "in the expected timeframe").
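A quick sanity check for anyone reproducing this (a small sketch, assuming the memory sink named "debug" from the code above is still running): confirm the notebook session actually picked up the upgraded Spark before re-testing append mode.
# Confirm the session runs the upgraded Spark, then re-check the in-memory sink
print(spark.version)                      # expect 2.4.3 or later
spark.sql("SELECT * FROM debug").show()   # rows appear once the watermark passes a window's end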

Snapchat API Error: "The start time should be start of a Local Time Zone day for DAY query."

I am making the following request to the Snapchat API:
GET https://adsapi.snapchat.com/v1/ads/7e4ebe9a-f903-4849-bd46-c590dbb4345e/stats?
granularity=DAY
&fields=android_installs,attachment_avg_view_time_millis,attachment_impressions,attachment_quartile_1,attachment_quartile_2,attachment_quartile_3,attachment_total_view_time_millis,attachment_view_completion,avg_screen_time_millis,avg_view_time_millis,impressions,ios_installs,quartile_1,quartile_2,quartile_3,screen_time_millis,spend,swipe_up_percent,swipes,total_installs,video_views,view_completion,view_time_millis,conversion_purchases,conversion_purchases_value,conversion_save,conversion_start_checkout,conversion_add_cart,conversion_view_content,conversion_add_billing,conversion_searches,conversion_level_completes,conversion_app_opens,conversion_page_views,attachment_frequency,attachment_uniques,frequency,uniques,story_opens,story_completes,conversion_sign_ups,total_installs_swipe_up,android_installs_swipe_up,ios_installs_swipe_up,conversion_purchases_swipe_up,conversion_purchases_value_swipe_up,conversion_save_swipe_up,conversion_start_checkout_swipe_up,conversion_add_cart_swipe_up,conversion_view_content_swipe_up,conversion_add_billing_swipe_up,conversion_sign_ups_swipe_up,conversion_searches_swipe_up,conversion_level_completes_swipe_up,conversion_app_opens_swipe_up,conversion_page_views_swipe_up,total_installs_view,android_installs_view,ios_installs_view,conversion_purchases_view,conversion_purchases_value_view,conversion_save_view,conversion_start_checkout_view,conversion_add_cart_view,conversion_view_content_view,conversion_add_billing_view,conversion_sign_ups_view,conversion_searches_view,conversion_level_completes_view,conversion_app_opens_view,conversion_page_views_view
&swipe_up_attribution_window=28_DAY
&view_attribution_window=1_DAY
&start_time=2018-10-05T00:00:00.000-08:00
&end_time=2018-10-19T00:00:00.000-08:00
I am getting the following error:
{
    "request_status": "ERROR",
    "request_id": "5bf3f47e00ff060ab0faf7f4330001737e616473617069736300016275696c642d30666635373463642d312d3232302d350001010c",
    "debug_message": "The start time should be start of a Local Time Zone day for DAY query.",
    "display_message": "We're sorry, but the data provided in the request is incomplete or incorrect",
    "error_code": "E1008"
}
Certain date ranges will work and others won't. It also doesn't matter what timezone offset (Europe/London +00:00, Los Angeles -08:00) I use or how I format the request dates (2018-10-01T00:00:00Z, 2018-10-01T00:00:00.000, 2018-10-01T00:00:00.000-08:00, etc.) for the ad stats request date range; the error comes back the same. The error has a code, but it's not detailed in Snapchat's documentation. All they say is "it's a bad request".
For example, one ad would let me query 29/10/2018 to date, or even 29/10/2018 to 30/10/2018, but as soon as I change it to 28/10/2018, it fails with the same error.
There are no apparent start/end times on the ads, as I thought it might be related to that. It's also not related to the campaign start/end times in this one case we tested.
API DOC: https://developers.snapchat.com/api/docs/?shell#overview
Solved the issue with the above error. I forgot to consider daylight saving time when passing the timezone offset.
For example, we need to check whether daylight saving time is in effect for the start_time or end_time and adjust the offset accordingly for that timezone.
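A rough Python sketch of that adjustment (America/Los_Angeles is an assumption here, matching the -08:00 offset used in the question; zoneinfo requires Python 3.9+): derive the offset for local midnight per date instead of hard-coding it.
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/Los_Angeles")   # assumed ad account timezone

# Build start-of-day timestamps in the account's local timezone; the UTC offset
# (-07:00 during DST, -08:00 otherwise) is derived per date, not hard-coded.
start_time = datetime.combine(date(2018, 10, 5), time.min, tzinfo=tz)
end_time = datetime.combine(date(2018, 10, 19), time.min, tzinfo=tz)

print(start_time.isoformat())   # 2018-10-05T00:00:00-07:00 (DST in effect, not -08:00)
print(end_time.isoformat())     # 2018-10-19T00:00:00-07:00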

RabbitMQ API returning incorrect queue statistics

I'm working with RabbitMQ instances hosted at CloudAMQP. I'm calling the management API to get detailed queue statistics. About 1 in 10 calls to the API return invalid numbers.
The endpoint is /api/queues/[vhost]/[queue]?msg_rates_age=600&msg_rates_incr=30. I'm looking for average message rates at 30-second increments over a 10-minute span of time. Usually that returns valid data for the stats I'm interested in, e.g.:
{
    "messages": 16,
    "consumers": 30,
    "message_stats": {
        "ack_details": {
            "avg_rate": 441
        },
        "publish_details": {
            "avg_rate": 441
        }
    }
}
But sometimes I get incorrect results for one or both "avg_rate" values, often 714676 or higher. If I then wait 15 seconds and call the same API again, the numbers go back down to normal. There's no way the average over 10 minutes jumps by a multiple of 200 and then comes back down seconds later.
I haven't been able to reproduce the issue with a local install, only in production where the queue is always very busy. The data displayed on the admin web page always looks correct. Is there some other way to get the same stats accurately, like the UI does?
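Lacking a proper fix, one hedged workaround based on the observation above: re-query after a short delay whenever an avg_rate looks implausible, and keep the later reading. The host, credentials, queue name, and threshold below are placeholders; the response fields come from the JSON shown in the question.
import time
import requests

# Placeholders: management API base URL, credentials, vhost and queue name
BASE = "https://example.cloudamqp.com/api/queues/%2F/my-queue"
PARAMS = {"msg_rates_age": 600, "msg_rates_incr": 30}
AUTH = ("user", "password")

def get_avg_rates(max_plausible=10_000, retries=2, wait_seconds=15):
    """Fetch ack/publish avg_rate, re-querying when a value looks like the bogus spikes."""
    for _ in range(retries + 1):
        stats = requests.get(BASE, params=PARAMS, auth=AUTH).json()["message_stats"]
        ack = stats["ack_details"]["avg_rate"]
        publish = stats["publish_details"]["avg_rate"]
        if ack <= max_plausible and publish <= max_plausible:
            return ack, publish
        time.sleep(wait_seconds)   # the bad readings clear up after ~15 s per the question
    return ack, publish            # give up and return the last reading

print(get_avg_rates())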

Google Finance: How big is a normal delay for historical stock data or is something broken?

I tried to download historical data from Google with this code:
import pandas_datareader.data as wb
import datetime

web_df = wb.DataReader("ETR:DAI", 'google',
                       datetime.date(2017, 9, 1),
                       datetime.date(2017, 9, 7))
print(web_df)
and got this:
Open High Low Close Volume
Date
2017-09-01 61.38 62.16 61.22 61.80 3042884
2017-09-04 61.40 62.01 61.31 61.84 1802854
2017-09-05 62.01 62.92 61.77 62.42 3113816
My question: Is this a normal delay or is something broken?
Also, I would like to know: have you noticed that Google has removed the historical data pages at Google Finance? Is this a hint that they will remove, or already have removed, the download option for historical stock data, too?
Google Finance via pandas has stopped working since last night; I am trying to figure it out. I have also noticed that the links to the historical data on their website have been removed.
It depends on which stocks and which market.
For example, with the Indonesian market it is still able to get the latest data. Of course, it may soon follow the fate of other markets that stopped updating on 5 September 2017. A very sad thing.
web_df = wb.DataReader("IDX:AALI", 'google',
datetime.date(2017,9,1),
datetime.date(2017,9,7))
Open High Low Close Volume
Date
2017-09-04 14750.0 14975.0 14675.0 14700.0 475700
2017-09-05 14700.0 14900.0 14650.0 14850.0 307300
2017-09-06 14850.0 14850.0 14700.0 14725.0 219900
2017-09-07 14775.0 14825.0 14725.0 14725.0 153300