save to disk in append mode - rebol

save is used to store data in a format more directly usable by REBOL, as stated here.
write has an /append refinement, but it saves data in raw form.
My application needs to save a block of data (as a map!) to disk. Every couple of seconds it generates a new element, up to tens of thousands of elements.
So, my question: I could save the whole dataset every couple of seconds, but I'd like to know whether I can append just the new elements to disk using the save command or the save format. I guess I could mimic the save format using the write command in /append mode. Is this the best solution, or is there another one I don't know about?

save is a mezzanine function that is basically write mold, so it's possible to mimic the save format using write, or to extend the save function itself to support an /append refinement.
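For example, a rough sketch of the write/append route (Rebol 3 syntax; the file name and the new-key/new-value words are just placeholders):

; append each new key/value pair in molded (loadable) form
write/append %data.reb rejoin [mold new-key " " mold new-value newline]

; later, rebuild the map from the accumulated pairs
my-map: make map! load %data.reb

Since the file only ever grows by the molded form of the new element, you never have to rewrite the whole dataset.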

Related

Access removes trailing spaces from strings when exporting

Long story short, I am dealing with Excel files which need to be modified a little. As the files come in on a weekly basis, I decided to write a simple program in Access that will make the process fully automatic.
The first step was to import the Excel file into an Access database. I managed that by creating a custom function that simply uses the "DoCmd.TransferSpreadsheet acImport" approach.
The second step was to create two queries and update the table that I had just imported. That was also pretty straightforward.
However, the third step is what I am struggling with. Now that the table is updated, I want to export it back to .xlsx format. However, when I do that, whether manually via the "External Data" tab or with the "DoCmd.TransferText acExport" approach, I notice that a few columns that have a space at the end of the string are trimmed automatically. For example, the original is "string ", but after exporting it becomes "string".
I would be really grateful if someone could tell me how to specify to Access that the space after the string is intended and not there by mistake, preferably with a VBA solution rather than having to fix it manually. Thank you in advance for the help!
PS: I know that the .CSV format would be way better, but sadly I need it to be in XLSX format.

Creating a test-data container in Azure blob storage

I'm adding some testing to my current project which uses Azure blob storage to store telemetry data coming from a stream analytics job. I want to do testing of the routines that get the telemetry data, so I created a separate container for test data. I downloaded a sample set of data, modified the data to serve my needs and re-uploaded (using Azure storage explorer) everything back into the new container.
The tests immediately failed, and I quickly found out that this is because the LastModified date of the files changed to the date/time of the upload. That is fine, but the sequence of the upload was also different. My code uses the modified date of a file to find out which one is the most recent, and it would now return a different file based on the new dates.
I found that you cannot modify this property directly, although you can change another property to have it updated. So I know one solution: I could write a quick script that gets the sequence of files from my production instance and then touches every file in the test instance in the same sequence.
But... I was wondering whether this is the best option. I also read that it's 'best practice' to store a custom datetime in a separate property, but I don't think I can do that straight from Stream Analytics (which is writing the blobs). I also considered using an Azure Function to do this (new blob => update property), but then I'm adding complexity and something that might fail for whatever reason.
So I'm looking for the best way to solve this problem. Anyone?
Update: this one probably deserves a bit more explanation. Apart from using the LastModified date to sort on, I also use it to filter blobs. The blobs themselves are CSV files containing ASA output data, i.e. telemetry records. Each record has a timestamp, but that information is inside the file. When retrieving data, I don't want to have to dive into each file to find out the timestamps of the records it holds. So I use a pre-filter to select only the blobs within a certain timespan, and then download/open just those files to get to the records inside.
This works perfectly as long as you do not touch any of the blobs, but obviously it stops working as soon as any of them is modified for whatever reason. So I'm now convinced that I need a different/better way to solve this issue; but how?
It seems to me that you have two separate things: the data that you want to store in blob storage, and metadata about the blob such as the timestamp. I would create a separate (Azure) database for the metadata, or even simpler, just add metadata to the (block) blob:
// blockBlob is assumed to be a CloudBlockBlob from the classic WindowsAzure.Storage SDK
blockBlob.Metadata.Add("from", dateTime.ToString());
blockBlob.Metadata.Add("to", dateTime.ToString());
blockBlob.Metadata.Add("order", "1");
blockBlob.SetMetadata();   // pushes the metadata to the blob service
For sorting I would just add a simple order property.
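A rough sketch of how the reading side could then sort on that property (assuming the classic WindowsAzure.Storage SDK, where container is a CloudBlobContainer):

// using System.Linq; using Microsoft.WindowsAzure.Storage.Blob;
var blobs = container.ListBlobs(useFlatBlobListing: true)
                     .OfType<CloudBlockBlob>()
                     .ToList();
blobs.ForEach(b => b.FetchAttributes());                          // populates b.Metadata
var ordered = blobs.OrderBy(b => int.Parse(b.Metadata["order"]));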
The comment by @Vignesh deserves the credit here, but in order to get this question marked as answered I'll provide the answer myself.
With ASA, you can set the output to be structured by date/time. That means that in this case data is written to the blob store with a directory structure such as:
2016 / 06 / 27 / 15 / 23 (= 27-06-2016 15:23)
2016 / 06 / 28 / 11 / 02 (= 28-06-2016 11:02)
The ASA output allows you to specify how granular you want the structure to be; in my case I chose to store it by day (so not including a time path). The ASA runtime will then ensure that data from a certain point in time is stored within a blob that resides in the correct path.
I subsequently changed my logic to no longer use the datetime stamp of the individual blob files, but to simply read the files from the folders that fall within the time range I'm interested in. That ensures we only get data that was produced within that time range. And if there's more than one file in a folder, I need to load them all, since they were produced within the same time range anyway. As long as minutes are enough granularity for you, this works very well, even though it might feel a bit strange to use a folder structure for such a thing.
Having a separate 'index' for blobs which tracks their datetime would work too, of course, but it adds complexity which in this case I don't really need.
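A rough sketch of that prefix-based retrieval (assuming the classic WindowsAzure.Storage SDK; the date, container and path format are placeholders):

// using System; using Microsoft.WindowsAzure.Storage.Blob;
DateTime date = new DateTime(2016, 6, 27);                       // the day we want data for
string prefix = $"{date:yyyy}/{date:MM}/{date:dd}/";             // e.g. "2016/06/27/"

// list only the blobs under that virtual folder and read each one
foreach (var item in container.ListBlobs(prefix, useFlatBlobListing: true))
{
    if (item is CloudBlockBlob blob)
    {
        string csv = blob.DownloadText();                        // ASA writes CSV output
        // ... parse the telemetry records and filter on the in-file timestamps
    }
}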

Create a file which enables random access using CMTime

I am currently seeking a solution whereby I can store accelerometer data in a file and retrieve the results by indexing into the file by CMTime. That way I can pass in a time value like 1.5 seconds and retrieve the motion data (stored as a plain-text line).
AVAssetWriter allows me to write images/audio to a file encoded with CMTime and then retrieve them using copyCGImageAtTime. However, I'm looking for a way to store a plain-text line with a CMTime instead of images/audio.
Overall, I am storing accelerometer data to a file every 10 milliseconds, and once I finish writing the file, I would like to index into it using CMTime. Simultaneously, I will be writing a video file as well, so that I can retrieve the frame associated with that CMTime. Another solution could be to write each line of the file as the timestamp followed by the data, or perhaps to encode the accelerometer data alongside the video, but I would like to see if there is a better way of doing this.
Appreciate any thoughts.

downloading huge files - application using grails

I am developing a RESTful web service that allows users to download, in CSV and JSON formats, data that is dynamically retrieved from the database.
Right now I am using a StringWriter to write out the CSV data. My major concern is that the result set could get very large depending on the user input. In that case, holding it all in memory doesn't seem like a good idea to me.
I am thinking of creating a temp file, but how do I make sure the file gets deleted soon after the download completes?
Is there a better way to do this?
Thanks for the help.
If memory is the issue, you could simply write out to the response writer, which writes directly to the output stream. That way you're not storing anything (much) in memory and there's no need to write out temporary files:
// controller action for CSV download
def download = {
    response.setContentType("text/csv")
    response.setHeader("Content-disposition", "attachment;filename=downloadFile.csv")

    def out = response.writer           // stream straight back to the client
    def results = // get all your results
    results.each { result ->
        out << result.col1 << ',' << result.col2 // etc
        out << '\n'
    }
    out.flush()
}
This writes out to the output stream as it is looping round your results.
In theory you can make this even more memory-efficient by using a scrollable result set - see the "Using Scrollable Results" section of Querying with GORM - Criteria - and looping round that whilst writing out to the response writer. In theory this means you're also not loading all your DB results into memory, but in practice this may not work as expected if you're using MySQL (and its Java connector). Manually batching up the queries may work too (get DB rows 1-10000, write them out, get rows 10001-20000, and so on).
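A rough sketch of the manual batching approach (Result is a placeholder domain class; out is the response writer from the action above):

int batchSize = 10000
int offset = 0
while (true) {
    def batch = Result.list(max: batchSize, offset: offset)    // one page of rows
    if (!batch) break
    batch.each { r ->
        out << r.col1 << ',' << r.col2 << '\n'
    }
    offset += batchSize
    Result.withSession { session -> session.clear() }          // keep the Hibernate session small
}
out.flush()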
This kind of thing might be more difficult with JSON, depending on what library you're using to render your objects.
Well, the simplest solution to prevent temp files from sticking around too long would be a cron job that simply deletes any file in the temp directory with a modified time older than, say, 1 hour.
If you want it all to be done within Grails, you could design a Quartz job to clean up files. This job could either do as described above (simply check modification timestamps to decide what to delete), or you could run the job only "on demand" with a parameter of the file name to be deleted. Once the download action is called, you could schedule the cleanup of that specific file for X minutes later (to allow enough time for a successful download). The job would then simply be in charge of deleting the file.
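A rough sketch of the first variant using the Grails Quartz plugin (the temp directory path and the intervals are placeholders):

// grails-app/jobs/TempFileCleanupJob.groovy
class TempFileCleanupJob {
    static triggers = {
        simple repeatInterval: 15 * 60 * 1000L                      // run every 15 minutes
    }

    def execute() {
        def cutoff = System.currentTimeMillis() - 60 * 60 * 1000L   // older than 1 hour
        new File('/tmp/csv-downloads').listFiles()?.each { f ->
            if (f.isFile() && f.lastModified() < cutoff) {
                f.delete()
            }
        }
    }
}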
Depending on the number of files involved, you can always use http://download.oracle.com/javase/1.5.0/docs/api/java/io/File.html#deleteOnExit() to ensure the file is blown away when the VM shuts down.
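A minimal sketch of that (the file name prefix and suffix are arbitrary):

def tmp = File.createTempFile('export-', '.csv')
tmp.deleteOnExit()                      // the JVM removes the file on a clean shutdown
// ... write the CSV to tmp and stream it to the response

Note that deleteOnExit() only helps on a clean JVM shutdown, so for a long-running server it's best combined with one of the cleanup approaches above.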
To create a temp file that gets automatically deleted after the session has expired, you can use the Session Temp Files plugin.

Changing the hash of files

I have a folder full of binary files and I want to make a change to these files so that their hashes change. I want to do this in a fashion that doesn't permanently corrupt the files, meaning that the change should still allow the file to operate normally, or that I should be able to undo the change at any point in time.
Does anyone know of a script that I could use to do this, or maybe a program that will automate it?
Cheers
UPDATE
It's an edge case that I am trying to deal with. I have a system that only allows me to store a file with a given hash once. Hence I want to change the content hash of the file to allow the file to be stored. Note that the system in question is not one I control or can change.
Couldn't I just add a random 1 to the end of the file and then remove it afterwards without breaking anything? I'm just not sure how to script this - as in, how to modify the binary data in this way. Note I'm in a Windows environment.
Without knowing the format of the files, we can't tell. It may in fact be impossible - for instance, if these binary files are self-signed with some private key, changing any single bit within the file is likely to render it invalid.
Is your hash calculated purely from the contents, and not from any other metadata that you can change (such as the filename or modified date)? If so, you're probably out of luck. If the hash is meant to detect when the content changes, but you're trying to change the hash without actually changing the content, you've clearly got a problem...
What is the hash used for? Why do you want to change it? There may be an alternative solution if you could give us more information about the bigger picture.
EDIT: One alternative is to effectively create your own container format - so while a file is stored in your container format, it's not usable in its original form, but it can be extracted easily. Your container could be as simple as "add four bytes at the end as a seed to disturb the hash" - "extracting" the file would just involve copying it and removing the last four bytes. But the important point is that what you end up with isn't an MP3 file or whatever you started with - it's your custom format, simple as it is. You need to package/extract the file any time you interact with the store.
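A rough sketch of that pack/extract idea in Python (file paths are placeholders; any scripting language available on Windows would do):

import os

def pack(path):
    # append 4 random "seed" bytes so the content hash changes
    with open(path, 'ab') as f:
        f.write(os.urandom(4))

def unpack(path):
    # strip the 4 seed bytes again to restore the original file
    size = os.path.getsize(path)
    with open(path, 'r+b') as f:
        f.truncate(size - 4)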