Posting huge datasets to a webservice from excel VBA - vba

Hello fellow Excel vba coders
I have this great macro in my excel sheet, where i compile XML based on the rows the user puts in - after this it post the xml to a webservice.
You can have a look at my code below - It is fairly simple:
Set XMLHttpRequest = New MSXML2.XMLHTTP
With XMLHttpRequest
.Open "POST", URL, False
.setRequestHeader "Content-Type", "text/xml; encoding='utf-8'"
.setRequestHeader "Content-Length", strLength
.send strXML
End With
Right now it works great when there is less than 200 rows, yet it times out when the row number gets above 1000 rows. The string of XML I post is really big, and i'm quite sure that's the reason it times out.
Now my problem is, how do i post this huge dataset, that exceed 1.000 rows, maybe even above 20.000 rows, to a webservice?
So far i have spend a lot of time to look for a possible solution around the web, but have yet to find a way to handle this. So far i have the following ideas to solve the problem:
Copy the sheet to a new workbook, take the location of the new workbook and convert the file to a Base64 string and post the entire file to a new .asmx service and handle the "workbook" in C# code.
Convert the huge string to some kind of byte array and post that to a new .asmx webservice and handle the C# code.
I really hope one of you guys can point me in the right direction and help me solve this problem?

Instead of sending the entire sheet as a single object, send each row individually.
This will require your web service to be modified to accept just a row, rather than, what I would guess is currently being sent as an array of rows.
This would allow you to process any number of rows as the most you are processing at any one time is exactly one.

It is possible the issue of configuration on server side... See this post
and this.

Ok - I found a solution to my problem, and it seems like the best way to handle this problem.
I use the following function to make a copy of the workbook:
Private Function saveAS(Path As String)
Application.EnableEvents = False
Application.DisplayAlerts = False
ActiveWorkbook.Sheets.Copy
ActiveWorkbook.saveAS Path, FileFormat:=51
ActiveWorkbook.Close savechanges:=True
Application.EnableEvents = True
Application.DisplayAlerts = True
End Function
Then i encode the file into a base64string, like so:
Private Function EncodeFileBase64(Filename As String) As String
Dim arrData() As Byte
Dim fileNum As Integer
Filename = Filename + ".xlsx"
fileNum = FreeFile
Open Filename For Binary As fileNum
ReDim arrData(LOF(fileNum) - 1)
Get fileNum, , arrData
Close fileNum
Dim objXML As MSXML2.DOMDocument
Dim objNode As MSXML2.IXMLDOMElement
Set objXML = New MSXML2.DOMDocument
Set objNode = objXML.createElement("b64")
objNode.DataType = "bin.base64"
objNode.nodeTypedValue = arrData
EncodeFileBase64 = objNode.Text
Set objNode = Nothing
Set objXML = Nothing
End Function
And then i send the encoded string to my own .asmx webservice and work with it in C# The webmethod looks like this:
[WebMethod]
[ScriptMethod(UseHttpGet = true)]
public string UploadXML(string base64string)
{
try
{
byte[] bytes = Convert.FromBase64String(base64string);
using (MemoryStream ms = new MemoryStream(bytes))
{
using (var package = new ExcelPackage(ms))
{
ExcelWorkbook workBook = package.Workbook;
ExcelWorksheet settings = workBook.Worksheets.SingleOrDefault(w => w.Name == "sheet1");
ExcelWorksheet data = workBook.Worksheets.SingleOrDefault(w => w.Name == "sheet2");
//Getting data
string SS1 = (string)settings.Cells[8, 3].Value;
string ss2 = (string)settings.Cells[7, 3].Value;
}
}
return "success";
}
catch (Exception ee)
{
return ee.Message;
}
}
I just need to find a good way to pull out all the data in a smart algorithm, i dont think that will be a problem at all :)

I know you have a solution but I'd like to give an opinion...
If the data is really so tabular then shouldn't it be in a database like SQL Server? If you are uploading to a database then SQL Server has some nice bulk upload features to load workbooks efficiently.

Related

Lotusscript save base64 encoded string to file (DLL)

[EDIT] I believe i left out my original problem. To me it seems like the issue resides in passing the content of the decoded MIMEEntity to a stream, which i'd like to write out to a file. No matter how i attempt it, i can not get lotus script to write the binary data to the file. If anyone has any helpful opinion/suggestion/etc.., I'd be more then grateful!
[ORIGINAL]
I have the following code
Dim a As String
a = "TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" & _
"AAAAgAAAAA4fug4AtAnNIbgBTM0hVGhpcyBwcm9ncmFtIGNhbm5vdCBiZSBydW4gaW4gRE9TIG1v" & _
...
...
Dim session As New NotesSession
Dim stream As NotesStream
Dim doc As NotesDocument
Dim body As NotesMimeEntity
Dim db As NotesDatabase
Set session = New NotesSession
'Create stream and display properties
Set stream = session.CreateStream
'check if the file exists
If Not stream.Open("C:\\Notes\\update.dll") Then
'if the file doesnot exist then create one and add a time stamp to it
Dim fileNum As Integer
fileNum% = Freefile()
Open "/ww414/notes/ebcdicfile.txt" For Output As fileNum%
Close fileNum%
'this should have created the file. see if it existis now
If Not stream.Open("C:\\Notes\\update.dll") Then
'if the file has not been created yet then let the user know of the error that blocks the operation
Messagebox("Log file Is inaccessible")
End If
End If
Dim b As NotesStream
Set b = session.CreateStream
Call b.WriteText(a)
'==========================================================
'update file with the b64 decoded content
Set db = session.CurrentDatabase
Set doc = db.CreateDocument
session.ConvertMime = False
Set body = doc.CreateMIMEEntity
Call body.SetContentFromText(b, "", ENC_BASE64)
Call body.DecodeContent
content = body.ContentAsText
Call stream.WriteText(content)
'close stream/file open in memory
Call stream.Close()
The problem is, the file gets created, but when it comes to the content, it simply puts a few bytes in it (instead of the 14kb of actual file data)
I have checked a bunch of forums and possible solutions, but none of them seem to work.
For instance:
https://www.nsftools.com/tips/Base64v14.lss
http://www-10.lotus.com/ldd/nd6forum.nsf/e5f5333619f2996885256a220009508f/a8bb2c21c99f9c4d852571ee005cede9?OpenDocument
https://ghads.wordpress.com/2008/10/17/vbscript-readwrite-binary-encodedecode-base64/
So, the solution was even simpler as i thought.
This was a huge help, as the root cause of my issue seemed to be the writing out the binary content to the disk. And that was due to creating the file the wrong way! While the file got created it couldn't output the content properly (for some "Lotus reasons"..)
Either way, taking a coffee break and starting everything from zero helped a lot! The code that worked (for future ref. if someone would need to get such a thing working):
Sub Initialize
Dim a As String
a ="BASE64 ENCODED STRING(In my case it was a DLL)"
Dim session As New NotesSession
Dim stream As NotesStream
Dim doc As NotesDocument
Dim body As NotesMimeEntity
Dim db As NotesDatabase
Set session = New NotesSession
Set stream = session.CreateStream
Dim b As NotesStream
Set b = session.CreateStream
Call b.WriteText(a, EOL_NONE)
Set db = session.CurrentDatabase
Set doc = db.CreateDocument
session.ConvertMime = False
Set mime = doc.CreateMIMEEntity
Call mime.SetContentFromText(b, "application/octet-stream", ENC_BASE64)
Call mime.DecodeContent
If Not(mime Is Nothing) Then
Set stream = session.CreateStream
pathname$ = "c:\temp\test.dll"
If Not stream.Open(pathname$, "binary") Then
Messagebox pathname$,, "Open failed"
Goto ExitSub
End If
Call mime.GetContentAsBytes(stream)
Call stream.Close
Else
Messagebox "Not MIME",, doc.GetItemValue("Subject")(0)
End If
ExitSub:
session.ConvertMIME = True ' Restore conversion
End Sub

VBA XMLHTTP 'Out-of-memory' state

Here is a piece of code that should connect to a web page with contents as follows: link1, description1, otherdata1, link2, description2, otherdata2, ..., linkN, descriptionN, otherdataN where N is 30 000 +.
From these links, the program finds one link of interest using regex, goes to that link, and downloads a file from there.
My problem is: at htmlWebInterfaceXML.send the program frequently, but not always, runs out of memory ('out-of-memory' error). It is difficult for me to test different solutions because normally the program runs smoothly, and it is difficult to notice changes if any.
Additional info:
It runs smoothly on some PCs and does not work for others
It usually runs until afternoon and throws an error then
Other infos that may be helpful:
the code shown is a private method of class and the class itself is a small part of the source code
other subs called which I do not explain are not relevant and run smoothly, the probelm always appears at htmlWebInterfaceXML.send.
One guess of mine is that I have declared local variable inside of a function that holds a very large object and can cause a stack overflow but it seems unlikely as VBA should handle those things on its own. Maybe you see a problem that I don't? Thank you.
Private Sub FileUpload()
' THE FUNCTION CANNOT BE CONNECTING FOR EACH CONTRACT ID! WILL TAKE TOO MUCH TIME - NEED TO ALTER
Dim member As Variant
Dim byteCounter As Byte
Dim byteMaxID As Byte
Dim strPathToXMLFile As String
Dim strURLToXMLFile
Dim strXMLFileStorageName As String
Dim domdocXMLText As New MSXML2.DOMDocument
Dim clctStrFoundMatches As New Collection
Dim clctInternalIDs As New Collection
Dim vrntContractID As Variant
Dim htmlHTMLMainPageXMLInterface As New MSHTML.HTMLDocument
Dim htmlTagElement As Variant
Dim htmclctFoundXMLs As MSHTML.IHTMLElementCollection
Dim htmlWebInterfaceXML As MSXML2.XMLHTTP60
Dim intNumberOfTradeUnderProcessing As Integer
UpdateProgressStatus "LOADING SERVER WITH SOURCE XML..." '<----------- UPDATE PROGRESS!
'----------------------> OPEN AND LOAD THE WEB SERVER, AND STORE ITS HTML INTO AN OBJECT
Set htmlWebInterfaceXML = New MSXML2.XMLHTTP
With htmlWebInterfaceXML
.Open "GET", p_cstrWebInterfaceXMLRootDirectory, False
.setRequestHeader "Authorization", "Basic" & Base64Encode( _
p_cstrXMLWebInterfaceAuthenticationUser & ":" & p_cstrXMLWebInterfaceAuthenticationPassword)
.setRequestHeader "Pragma", "no-cache"
.setRequestHeader "Cache-Control", "no-cache"
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
End With
htmlHTMLMainPageXMLInterface.body.innerHTML = htmlWebInterfaceXML.responseText ' how much text is the htmldocument able to store??
Set htmlWebInterfaceXML = Nothing
SetUpDirectory ' ------------> create or set directory where to store XML files
If Me.ContractiDs.Count <> Me.MailParts.Count And Me.ContractiDs.Count <> Me.MailParts.Count * 2 Then
Err.Raise 1504, "FileUpload", p_cstrError1504Message
Else
For Each vrntContractID In Me.ContractiDs
intNumberOfTradeUnderProcessing = intNumberOfTradeUnderProcessing + 1
UpdateProgressStatus "LOADING XML FOR THE TRADE NUMBER " & intNumberOfTradeUnderProcessing & "..." ' ----------------> UPDATE STATUS BAR
' ------------------------> find the tags containing the needed contract id in their names
Set htmclctFoundXMLs = htmlHTMLMainPageXMLInterface.getElementsByTagName("a")
Set clctStrFoundMatches = New Collection
For Each htmlTagElement In htmclctFoundXMLs
If htmlTagElement.getAttribute("href") Like "*" & vrntContractID & "*" Then
clctStrFoundMatches.Add htmlTagElement
End If
Next htmlTagElement
If clctStrFoundMatches.Count = 0 Then Err.Raise 1506, "FileUpload", p_cstrError1506Message
' -----------------------> exclude the archives from the collection
byteCounter = 0
For byteCounter = 1 To clctStrFoundMatches.Count
If blnContainsPattern("\.gz$", clctStrFoundMatches(byteCounter).innerText) Then
clctStrFoundMatches.Remove byteCounter
End If
Next byteCounter
' ----------------------> extract the contract ids and find the last contract id
Set clctInternalIDs = New Collection
For Each member In clctStrFoundMatches
clctInternalIDs.Add strReturnSingleMatch("\d{9}", member.innerText)
If clctInternalIDs(clctInternalIDs.Count) = "False" Then Err.Raise 1505, "FileUpload", p_cstrError1505Message
Next member
byteMaxID = FindMaximum(clctInternalIDs)
strPathToXMLFile = clctStrFoundMatches(byteMaxID).innerText
' -----------------------> check whether such file exists, and, if not, download it
If blnFileExists(strPathToXMLFile, p_cstrXMLDestination) Then
Else
strURLToXMLFile = p_cstrWebInterfaceXMLRootDirectory & strPathToXMLFile
Set htmlWebInterfaceXML = Nothing: Set htmlWebInterfaceXML = New MSXML2.XMLHTTP
htmlWebInterfaceXML.Open "GET", strURLToXMLFile, False
htmlWebInterfaceXML.setRequestHeader "Authorization", "Basic" & Base64Encode( _
p_cstrXMLWebInterfaceAuthenticationUser & ":" & p_cstrXMLWebIntervaceAuthenticationPassword)
htmlWebInterfaceXML.send
With domdocXMLText
.validateOnParse = False
.async = False
End With
domdocXMLText.LoadXML htmlWebInterfaceXML.responseText
domdocXMLText.Save p_cstrXMLDestination & "\" & strPathToXMLFile
End If
Next vrntContractID
End If
Set htmlHTMLMainPageXMLInterface = Nothing
End Sub
I had a similar problem that i solved. my script ran through a server and looked for a string in thousands of text files.
The way i solved it is (using ascync ServerXMLHTTP60) setting an if condition in the response handler to ABORT the xmlhttp object. After i was done comparing data, the object would get "dropped" out of memory, and now i can query (virtually) any amount of data. (took me about a month to test a lot of solution, and this was the right one)
This might work for you, so just add htmlWebInterfaceXML.abort after you are done with that data set.
Hope this helps! Cheers!

Single line thread/sequenced execution of formulas in Excel

I wrote a simple macro function in VBA for Excel to extract text appearing in a particular location in an HTML document, first retrieving the HTML document from a URL value in another cell. The macro function itself is not important, except for the fact that it sends an HTTP request and creates an HTML file object, which I fear will cause Excel to crash if I paste, say, a column of 100 or more URLs and it starts trying to calculate all the values at once. I can see that it stops and churns for a moment if I drag the formula down 10 cells where there is already a column of URLs. Is there a setting to force Excel to calculate one formula at a time, so that it may take longer but is less likely to freeze up or crash?
Update: I incorporated a static collection variable into the function to at least avoid repeated slowdowns retrieving the same HTML in the same worksheet:
Function GetUSPatentAbstract(ByVal url As String) As String
Static colAbstract As New Collection
Dim abstract As String
On Error Resume Next
abstract = colAbstract(url)
`If there is no abstract for the URL in the collection yet, then it is retrieved:
If Err.Number <> 0 Then
Dim description As String
Dim abstractStart As Long
Dim abstractEnd As Long
Dim abstractLength As Long
Set html_doc = CreateObject("htmlfile")
Set xml_obj = CreateObject("MSXML2.XMLHTTP")
xml_obj.Open "GET", url, False
xml_obj.send
html_doc.body.innerhtml = xml_obj.responseText
Set xml_obj = Nothing
description = html_doc.body.innertext
abstractStart = InStr(description, "Abstract") + 8
abstractEnd = InStr(description, "Inventors:")
abstractLength = abstractEnd - abstractStart
abstract = Mid(description, abstractStart, abstractLength)
colAbstract.Add Item:=abstract, Key:=url
End If
On Error GoTo 0
GetUSPatentAbstract = abstract
End Function

Rubbish in "GET" response from RallyDev API

I have an function in Excel:
Function getState(Defects As Object) As String
Dim str As String
Dim res As String
Dim was As Boolean
Dim sURL As String
Dim oRequest As Object
Set oRequest = CreateObject("WinHttp.WinHttpRequest.5.1")
was = False
For Each defect In Defects
If was = False Then
str = "(FormattedID = """ & defect & """)"
res = str
was = True
Else
res = res & " OR " & str
End If
Next defect
sURL = "https://rally1.rallydev.com/slm/webservice/v2.0/defect?query=(" & res & ")&fetch=FormattedID,State"
oRequest.Open "GET", sURL, True
oRequest.setRequestHeader "Content-Type", "application/json;charset=UTF-8"
oRequest.Send
oRequest.WaitForResponse
' Set Defects = oRequest.ResponseText
Debug.Print (oRequest.ResponseText)
End Function
Unfortunately, i get rubbish instead of json in response like:
? i?ANA0E?=A(utU
¤RoP°irC°%'.o$%·CI{OOYO????uApeaBRM?Zb?u?OWo?"{oSCy5?(}e??e?qBA"qnu~E·Uu?|?aRbE?a>anµ?c?9P=?A[­Oul?0i O {PZS?Af~???^??k??R˜?|©?#iEoPNO|?¦'y?vO^Ol? ]?g?#?AjAa?\aC¤y %©e»]"IHog??#:?· (??"¶E9yog?Az?7bw?/#?eWp^u?ZU?u??3?q?A)cy7µe?E
Could you please take a look at it and provide an solution how it can be fixed?
Thank you in advance!
Larry's answer is correct.
As an aside, if you need to pull data from Rally into Excel, your best bet is the Rally Add-in for Excel:
https://help.rallydev.com/rally-add-excel
Since you mentioned VBA, you may be working to build some automation in Excel, which the Excel add-in doesn't support. There is an alpha-level Rally Rest Toolkit for VBA. It handles the authentication and REST serialization/de-serialization for you, so it could ease some of your coding effort.
It is unofficial and not supported by Rally, but could be worth trying. Since it is unsupported though, Rally can't help you out if you run into issues. You'd have to refactor your own code against the VBA toolkit to pull the data you want:
https://github.com/markwilliams970/RallyRestToolkitForVBA
The response is compressed. Either try setting a request header for Accept-Encoding: identity or decompress the response.

GZipstream cuts off data from original file when decompressing

For a long time I have been trying to debug why my parsing counts were off when downloading files to be parsed and been made to look really dumb about this. I did some debugging and found that the file I download when trying to decompress using GZipStream shows that it misses data from the original file. Here is my code for decompressing:
Using originalFileStream As FileStream = fileItem.OpenRead()
Dim currentFileName As String = fileItem.FullName
Dim newFileName = currentFileName.Remove(currentFileName.Length - fileItem.Extension.Length)
newFile = newFileName
Using decompressedFileStream As FileStream = File.Create(newFileName)
Using decompressionStream As GZipStream = New GZipStream(originalFileStream, CompressionMode.Decompress)
decompressionStream.CopyTo(decompressedFileStream)
Console.WriteLine("Decompressed: {0}", fileItem.Name)
decompressionStream.Close()
originalFileStream.Close()
End Using
End Using
End Using
Now what I do is return the newfile to the calling function and read the contents from there:
Dim responseData As String = inputFile.ReadToEnd
Now pasting the url in the browser and downloading from there and then opening using winrar I can see the data is not the same. Now this does not happen all the time as some files parse and decompress correctly. Each downloaded file has check counter to compare how many posts I am supposed to be parsing from it and that triggered me to see the mismatch in counts.
EDIT
Here is what I found in addition. If I read the problem file (as I said that only some files happen this way) by individual lines I will get all the data:
Dim rData As String = inputFile.ReadLine
If Not rData = "" Then
While Not inputFile.EndOfStream
rData = inputFile.ReadLine + vbNewLine + rData
End While
getIndividualPosts(rData)
End If
Now if I try to read an individual line from a file that is not problematic it will return nothing and so I will have to readtoEnd. Can anyone explain this odd behavior and is it related to the GZIPSTREAM or some error in my code.