Quickest way to synchronously refresh TR formulas in VBA

Thanks to the help in this forum, I got my SQL connection and inserts working now.
The following TR formula is used to retrieve the data in Excel from Eikon:
#TR($C3,"TR.CLOSEPRICE (adjusted=0);TR.CompanySharesOutstanding;TR.Volume;TR.TURNOVER"," NULL=Null CODE=MULTI Frq=D SDate="&$A3&" EDate="&$A3)
For 100k RICs the formulas usually need between 30s and 120s to refresh. That would still be acceptable.
The problem is getting the same refresh speed in a VBA loop. Application.Run "EikonRefreshWorksheet" is currently used for a synchronous refresh, as recommended in this post:
https://community.developers.refinitiv.com/questions/20247/can-you-please-send-me-the-excel-vba-code-which-ex.html
The syntax of the code is correct and works for 100 RICs. But fetching already gets very slow at 1k and freezes completely at around 50k, even with a timeout interval of 5 minutes.
I isolated the refresh part; there is nothing else slowing it down. So is this maybe just not the right method for fetching larger data sets? Does anyone know a better alternative?

I finally got some good advice from the Refinitiv Developer Forum which I wanted to share here:
I think you should be using the APIs directly as opposed to opening a spreadsheet and taking the data from that - but all our APIs have limits in place. There are limits for the worksheet functions as well (which use the same backend services as our APIs) - as I think you have been finding out.
You are able to use our older Eikon COM APIs directly in VBA. In this instance you would want to use the DEX2 API to open a session and download the data. You can find more details and a DEX2 tutorial sample here:
https://developers.refinitiv.com/en/api-catalog/eikon/com-apis-for-use-in-microsoft-office/tutorials#tutorial-6-data-engine-dex-2
However, I would recommend using our Eikon Data API in the Python environment, as it is much more modern and will give you a much better experience than the COM APIs. If you have a list of, say, 50K instruments, you could make 10 API calls of 5K instruments each using some chunking, and it would all be much easier for you to manage, without even resorting to Excel. You can then use any Python SQL tool to ingest the data into any database you wish, all from one Python script.
import refinitiv.dataplatform.eikon as ek

ek.set_app_key('YOUR APPKEY HERE')
riclist = ['VOD.L', 'IBM.N', 'TSLA.O']
# get_data returns a (DataFrame, error) pair
df, err = ek.get_data(riclist, ['TR.CLOSEPRICE(adjusted=0).date', 'TR.CLOSEPRICE(adjusted=0)',
                                'TR.CompanySharesOutstanding', 'TR.Volume', 'TR.TURNOVER'])
df
# df.to_sql - see note below
# df.to_csv("test1.csv")
This will return a pandas DataFrame that you can write directly into any SQLAlchemy-supported database (see example here) or export to CSV / JSON.
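For reference, a minimal sketch (not from the original answer) of the chunking-plus-to_sql idea described above; the rics.csv file, the SQLite connection string, and the prices table name are all made up, and the 5,000-RIC chunk size simply mirrors the 50K / 10-call example.
import pandas as pd
import refinitiv.dataplatform.eikon as ek
from sqlalchemy import create_engine

ek.set_app_key('YOUR APPKEY HERE')

fields = ['TR.CLOSEPRICE(adjusted=0).date', 'TR.CLOSEPRICE(adjusted=0)',
          'TR.CompanySharesOutstanding', 'TR.Volume', 'TR.TURNOVER']

def chunks(items, size):
    # yield successive slices of at most `size` RICs
    for i in range(0, len(items), size):
        yield items[i:i + size]

riclist = pd.read_csv('rics.csv')['RIC'].tolist()   # hypothetical file holding the 50k RICs
engine = create_engine('sqlite:///prices.db')       # or any other SQLAlchemy connection string

frames = []
for batch in chunks(riclist, 5000):                 # 10 calls of 5k each for a 50k list
    df, err = ek.get_data(batch, fields)
    frames.append(df)

pd.concat(frames).to_sql('prices', engine, if_exists='append', index=False)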
Unfortunately, our company policy does not allow Python at the moment. But the VBA solution also worked, even though it took some time to understand the tutorial and it comes with more limitations.

Related

BigQuery connecting from GSheet without enabling API every time

I have some scripts running from GSheet getting data from BigQuery. However, in order to make the files run, I need to manually enable the API every time for a given sheet.
So the question is: how can I enable the API from within the code, so that if I share the GSheet or make a copy, I don't have to go to the script editor and enable the API from there?
Thanks
I am a huge fan of this particular use of the Google ecosystem, so I'm happy to help get others up and running using GSheets with BigQuery! Hopefully it is working well for you!
When sharing the sheet with others, there is no need to alter anything in the script editor at all. The scripts should run and query BigQuery without issue; this has been my experience at least. The obvious caveat to this is that the users you share it with must have access to the Google Developer Project that the BigQuery instance is associated with.
However, when copying the sheet, I do not believe it is possible to have it replicate the connection. This is because when the file is copied, it becomes associated with a new Google Developer Project. Thus, you have to go into the script editor, then go to Resources > Developers Console Project and change the project listed to the one in which you have BigQuery enabled.
Hopefully this helps! Sorry I don't have better news for you!

How to access results of Sonar metrics for use with applications like PowerPivot

I'm trying to run a number of applications with known failure rates through Sonar, with hopes of deciding which metrics are most valuable in determining whether a particular application will fail. Ultimately I'll be making some sort of algorithm that will look at the outputs of whatever metrics I'm using and generate a score from 1 to 100. I've got about 21 applications put through Sonar, and the results have been stored in a MySQL database. I originally planned to use PowerPivot to find relationships in the data, but it seems like the formatting of the tables doesn't lend itself well to that. Other questions on Stack Overflow have told me that Sonar's tables are unformatted, and that I should instead use the Web Service API to get the information. I'm unfamiliar with the API and was unsuccessful in trying to do what I wanted by looking at Sonar's API documentation.
From an answer to another question:
http://nemo.sonarsource.org/api/timemachine?resource=org.apache.cxf:cxf&format=csv&metrics=ncloc,violations_density,comment_lines_density,public_documented_api_density,duplicated_lines_density,blocker_violations,critical_violations,major_violations,minor_violations
This looks very similar to what I'd like to have, except I'm only looking at each application once (I'm analyzing a sample of all the live applications on a grid), which means Timemachine isn't really what I'm looking for. Would it be possible to generate a similar table, except instead of the stats for a particular application per date, it showed the statistics for an application and all of its classes, etc?
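(Aside, not from the original answers: a CSV endpoint like the one quoted above can be read straight into a dataframe for further slicing; a minimal Python sketch, assuming the nemo demo instance and the old timemachine web service are still reachable.)
import pandas as pd

url = ('http://nemo.sonarsource.org/api/timemachine'
       '?resource=org.apache.cxf:cxf&format=csv'
       '&metrics=ncloc,violations_density,comment_lines_density,'
       'public_documented_api_density,duplicated_lines_density,'
       'blocker_violations,critical_violations,major_violations,minor_violations')

df = pd.read_csv(url)   # load the metrics table; each row corresponds to one analysis snapshot
print(df.head())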
If you're not familiar with the WS API, you can also create your own Sonar plugin to achieve whatever you want: it is written in Java and it will execute on every analysis you run. This way, in the code of this custom plugin, you can do whatever you want: flush the metrics you need to an output file, push them into a third-party system, etc.
Just take a look at how to write a plugin (most probably you will create a Decorator). There are also concrete examples to help you get started faster.

How can I speed up batch processing job in Coldfusion?

Every once in a while I am fed a large data file that my client uploads and that needs to be processed through CFML. The problem is that if I put the processing on a CF page, then it runs into a timeout issue after 120 seconds. I was able to move the processing code to a CFC where it seems not to have the timeout issue. However, sometime during the processing, it causes ColdFusion to crash and it has to be restarted. There are a number of database queries (5 or more, a mixture of updates and selects) required for each line (8,000+) of the file I go through, as well as other logic provided by me in the form of CFML.
My question is: what would be the best way to go through this file? One caveat: I am not able to move the file to the database server and process it entirely with the DB. However, would it be more efficient to pass each line to a stored procedure that took care of everything? It would still be a lot of calls to the database, but nothing compared to what I have now. Also, what would be the best way to provide feedback to the user about how much of the file has been processed?
Edit:
I'm running CF 6.1
I just did a similar thing and use CF often for data parsing.
1) Maintain a file upload table (parent table). For every file you upload, keep a record of the file and what status it is in (uploaded, unprocessed, processed).
2) Temp table to store all the rows of the data file (child table). Import the entire data file into a temporary table; attempting to do it all in memory will inevitably lead to some errors. Each row in this table will link to a file upload table entry above.
3) Maintain a processing status - for each row of the data file you bring in, set a "processed/unprocessed" flag. This way, if it breaks, you can start from where you left off. As you run through each line, set it to "processed".
4) Transaction - use cftransaction if possible to commit all of it at once, or at least one line at a time (with your 5 queries). That way, if something goes boom, you don't have one row of data that is half computed/processed/updated/tested.
5) Once you're done processing, set the file name entry in the table from step 1 to "processed".
By using the approach above, if something fails, you can set it to start where it left off, or at least have a clearer path of where to start investigating, or worst case clean up your data. You will have a clear way of displaying to the user the status of the current upload processing, where it's at, and where it left off if there was an error. (A rough sketch of this schema follows below.)
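Not CFML, but a rough sketch of that parent/child/status-flag schema using Python's built-in sqlite3, purely to make the pattern concrete; the table, column, and helper names are invented.
import sqlite3

conn = sqlite3.connect('uploads.db')
conn.executescript("""
CREATE TABLE IF NOT EXISTS file_upload (           -- parent table: one row per uploaded file
    id       INTEGER PRIMARY KEY,
    filename TEXT,
    status   TEXT DEFAULT 'uploaded'               -- uploaded / processed
);
CREATE TABLE IF NOT EXISTS file_row (              -- child table: one row per line of the data file
    id        INTEGER PRIMARY KEY,
    upload_id INTEGER REFERENCES file_upload(id),
    line      TEXT,
    processed INTEGER DEFAULT 0                    -- 0 = unprocessed, 1 = processed
);
""")
conn.commit()

def run_per_line_queries(conn, line):
    # placeholder for the 5-or-so updates/selects you run per line
    pass

def process_pending(upload_id):
    rows = conn.execute(
        "SELECT id, line FROM file_row WHERE upload_id = ? AND processed = 0",
        (upload_id,)).fetchall()
    for row_id, line in rows:
        with conn:                                  # one transaction per line, like cftransaction
            run_per_line_queries(conn, line)
            conn.execute("UPDATE file_row SET processed = 1 WHERE id = ?", (row_id,))
    with conn:
        conn.execute("UPDATE file_upload SET status = 'processed' WHERE id = ?", (upload_id,))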
If you have any questions, let me know.
Other thoughts:
You can increase timeouts, give the VM more memory, put it in 64-bit, but all of those will only increase the capacity of your system so much. It's a good idea to do these per call and in conjunction with the above.
Java has some neat file processing libraries that are available as CFCs. If you run into a lot of issues with speed, you can use one of those to read the file into a variable and then into the database.
If you are playing with XML, do not use ColdFusion's XML parsing. It works well for smaller files but has fits when things get bigger. There are several CFCs out there (check RIAForge, etc.) that wrap some excellent Java libraries for parsing XML data. You can then create a cfquery manually if need be with this data.
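(For comparison only, not CF advice from the answer above: the same streaming idea in Python's standard library shows why parsing big XML incrementally, rather than loading the whole document, keeps memory flat; the file and element names are made up.)
import xml.etree.ElementTree as ET

# stream through a large XML file element by element instead of building the whole DOM
for event, elem in ET.iterparse('big_feed.xml', events=('end',)):
    if elem.tag == 'record':                               # hypothetical element name
        values = {child.tag: child.text for child in elem} # e.g. turn each record into a dict
        print(values)
        elem.clear()                                       # release the element so memory use stays flat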
It's hard to tell without more info, but from what you have said I'll throw out three ideas.
The first thing is that, with so many database operations, it's possible you are generating too much debugging output. Make sure that under Debug Output Settings in the administrator the following settings are turned off:
Enable Robust Exception Information
Enable AJAX Debug Log Window
Request Debugging Output
The second thing I would do is look at those DB queries and make sure they are optimized. Make sure selects are happening with indices, etc.
The third thing I would suspect is that keeping the whole file in memory is probably suboptimal.
I would try looping through the file using file looping:
<cfloop file="#VARIABLES.filePath#" index="VARIABLES.line">
<!--- Code to go here --->
</cfloop>
Have you tried an event gateway? I believe those threads are not subject to the same timeout settings as page request threads.
SQL Server Integration Services (SSIS) is the recommended tool for complex ETL (Extract, Transform, and Load) work, which is what this sounds like. (It can be configured to access files on other servers.) The question might be: can you work up an interface between ColdFusion and SSIS?
If you can, upgrade to CF8 and take advantage of cfloop file="", which would give you greater speed; the file would not be put in memory (which is probably the cause of the crashing).
Depending on the situation you are encountering you could also use cfthread to speed up processing.
Currently, an event gateway is the only way to get around the timeout limits of an HTTP request cycle. CF does not have a way to process CF pages offline; that is, there is no command-line invocation (one of my biggest gripes about CF - very little offline processing).
Your best bet is to use an Event Gateway or rewrite your parsing logic in straight Java.
I had to do the same thing. Ben Nadel has written a bunch of great articles on using Java file I/O to read and write files more speedily.
It really helped improve the performance of our CSV importing application.

Website data retrieval

A recent article has prompted me to pick up a project I have been working on for a while. I want to create a web service front end for a number of sites to allow automated completion of forms and data retrieval from the results, and from other areas of the site. I have achieved a degree of success using Selenium and custom code, but I am looking to extend this to a stage where adding additional sites is a trivial task (maybe one which doesn't even require a developer).
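For reference, the Selenium-plus-custom-code approach mentioned above typically looks something like the sketch below (Python bindings; the URL, field names, and selectors are invented):
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get('https://example.com/quote-form')           # hypothetical target site

# fill in and submit the form
driver.find_element(By.NAME, 'postcode').send_keys('AB1 2CD')
driver.find_element(By.NAME, 'submit').click()

# scrape the results page
prices = [el.text for el in driver.find_elements(By.CSS_SELECTOR, '.result-price')]
print(prices)
driver.quit()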
The Kapow web data server looks to achieve a lot of this; however, I am told it is quite expensive (currently awaiting a quote). Has anyone had experience with this, or can anyone suggest alternatives (open source, ideally)?
Disclaimer: I realise the potential legality issues around automating data retrieval from 3rd party websites - this tool is designed to be used in a price comparison system and all of the websites integrated with it will be done with the express permission of the owners. Where the sites provide an API this will clearly be the favoured approach.
Thanks
I realise it's been a while since I posted this; however, should anyone come across it, I have had lots of success using the WSO2 framework (particularly the Mashup Server) for this. For data mining tasks I have also used a Java library that it wraps - webharvest - which has achieved everything I needed.

Automating WebTrends analysis

Every week I access server logs processed by WebTrends (for about 7 profiles) and copy ad clickthrough and visitor information into Excel spreadsheets. A lot of it is just accessing certain sections and finding the right title and then copying the unique visitor information.
I tried using WebTrends' built-in query tool, but it is really poorly done (it only offers a drag-and-drop system instead of a text-based one) and it has a maximum number of parameters and a maximum query length. As far as I know, the tools in WebTrends are not suitable for my purpose of automating the entire web metrics gathering process.
I've gotten access to the raw server logs, but it seems redundant to parse them given that they are already being processed by WebTrends.
To me it seems very scriptable, but how would I go about doing that? Is screen-scraping an option?
I use ODBC for querying metrics and numbers out of WebTrends. We even fill a scorecard with all key performance metrics.
It's in German, but maybe the idea helps you: http://www.web-scorecard.net/
Michael
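(Not part of the original answer: if the ODBC route appeals, a minimal Python sketch with pyodbc might look like the following; the DSN, table, and column names are invented and depend entirely on how your WebTrends ODBC driver is configured.)
import pyodbc

conn = pyodbc.connect('DSN=WebTrends;UID=user;PWD=password')   # hypothetical DSN
cursor = conn.cursor()
cursor.execute(
    "SELECT PageTitle, UniqueVisitors FROM AdClickthroughs WHERE WeekStart = ?",  # hypothetical table/columns
    '2009-06-01')
for page_title, unique_visitors in cursor.fetchall():
    print(page_title, unique_visitors)
conn.close()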
Which version of WebTrends are you using? Unless this is a very old install, there should be options to schedule these reports to be emailed to you, and also to bookmark queries. Let me know which version it is and I can make some recommendations.