Cannot view document in RavenDB Studio

When I try to view my document I get this error:
Client side exception:
System.InvalidOperationException: Document's property: "DocumentData" is too long to view in the studio (property length: 699.608, max allowed length: 500.000)
at Raven.Studio.Models.EditableDocumentModel.AssertNoPropertyBeyondSize(RavenJToken token, Int32 maxSize, String path)
at Raven.Studio.Models.EditableDocumentModel.AssertNoPropertyBeyondSize(RavenJToken token, Int32 maxSize, String path)
at Raven.Studio.Models.EditableDocumentModel.<LoadModelParameters>b__2a(DocumentAndNavigationInfo result)
at Raven.Studio.Infrastructure.InvocationExtensions.<>c__DisplayClass17`1.<>c__DisplayClass19.<ContinueOnSuccessInTheUIThread>b__16()
at AsyncCompatLibExtensions.<>c__DisplayClass55.<InvokeAsync>b__54()
I am saving a PDF in that field.
I want to be able to edit the other fields.
Is it possible for the Studio to ignore the field that's too big?
Thanks!

Don't save large binary (or Base64-encoded) data in the JSON document itself; that's a poor use of the database. Instead, you should consider one of these two options:
Option 1
Write the binary data to disk (or cloud storage) yourself.
Save a file path (or URL) to it in your document.
Option 2
Use Raven's attachments feature. This is a separate area in the database meant specifically for storing binary files.
The advantage is that your binary documents are included in database backups, and, if you like, you can take advantage of features like my Indexed Attachments Bundle or write your own custom bundles that use attachment triggers.
The disadvantage is that your database can grow very large. For this reason, many prefer Option 1.
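For illustration, here is a minimal sketch of both options in C#. The Invoice class, file locations, and key names are hypothetical, and the attachment call assumes the 2.x-era client implied by the stack trace (where attachments are written through DatabaseCommands.PutAttachment); check the API of the client version you actually run.

using System;
using System.IO;
using Raven.Client;       // IDocumentStore
using Raven.Json.Linq;    // RavenJObject

public class Invoice                                  // hypothetical document class
{
    public string Id { get; set; }
    public string CustomerName { get; set; }
    public string DocumentDataPath { get; set; }      // path (or URL) instead of the raw PDF bytes
}

public static class PdfStorage
{
    // Option 1: write the PDF to disk (or cloud storage) and store only its path in the document.
    public static void SaveWithFilePath(IDocumentStore store, byte[] pdfBytes, string customerName)
    {
        var path = Path.Combine(@"C:\Files", Guid.NewGuid().ToString() + ".pdf");   // hypothetical location
        File.WriteAllBytes(path, pdfBytes);

        using (var session = store.OpenSession())
        {
            session.Store(new Invoice { CustomerName = customerName, DocumentDataPath = path });
            session.SaveChanges();
        }
    }

    // Option 2: store the PDF as a RavenDB attachment (2.x-era client API).
    public static void SaveAsAttachment(IDocumentStore store, byte[] pdfBytes, string attachmentKey)
    {
        using (var stream = new MemoryStream(pdfBytes))
        {
            store.DatabaseCommands.PutAttachment(attachmentKey, null, stream, new RavenJObject());
        }
    }
}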

Reading *.cdpg file with Python without knowing the structure

I am trying to use Python to read a .cdpg file. It was generated by LabVIEW code. I do not have access to any information about the structure of the file. Using another post I have had some success, but the numbers do not make any sense. I do not know if my code is wrong or if my interpretation of the data is wrong.
The code I am using is:
import struct

with open(file, mode='rb') as file:  # 'b' is important -> binary mode
    fileContent = file.read()

# Skip a 20-byte header and a 4-byte trailer, then interpret the rest
# of the file as 32-bit signed integers (4 bytes each).
ints = struct.unpack("i" * ((len(fileContent) - 24) // 4), fileContent[20:-4])
print(ints)
The file is located here. Any guidance would be greatly appreciated.
Thank you,
T
According to the documentation at https://www.ni.com/pl-pl/support/documentation/supplemental/12/logging-data-with-national-instruments-citadel.html:
The .cdpg files contain trace data. Citadel stores data in a compressed format; therefore, you cannot read and extract data from these files directly. You must use the Citadel API in the DSC Module or the Historical Data Viewer to access trace data. Refer to the Citadel Operations section for more information about retrieving data from a Citadel database.
.cdpg is a closed format containing compressed data. You won't be able to interpret it properly without knowing the file format's structure. You can read the raw binary content, and that is what your example Python code is actually doing.

Rename filename.ext.crswap to filename.ext rather than copying

When performing this sequence:
Obtain a handle to a new file via window.showSaveFilePicker, say filename.ext
Obtain a writable file stream from the handle
Write some content into the file using the stream
Close the stream to signal completion
the File System API writes to filename.ext.crswap and, on close, copies filename.ext.crswap to filename.ext.
Is there a reason that filename.ext.crswap is not simply renamed to filename.ext instead of being copied?
The reason for this behavior is to avoid partial writes:
"User agents try to ensure that no partial writes happen, i.e. the file represented by fileHandle will either contain its old contents or it will contain whatever data was written through stream up until the stream has been closed."—Spec.

Migrating from Microsoft.Azure.Storage.Blob to Azure.Storage.Blobs - directory concepts missing

These are great guides for migrating between the different versions of the NuGet packages:
https://github.com/Azure/azure-sdk-for-net/blob/Azure.Storage.Blobs_12.6.0/sdk/storage/Azure.Storage.Blobs/README.md
https://elcamino.cloud/articles/2020-03-30-azure-storage-blobs-net-sdk-v12-upgrade-guide-and-tips.html
However I am struggling to migrate the following concepts in my code:
// Return if a directory exists:
container.GetDirectoryReference(path).ListBlobs().Any();
where GetDirectoryReference is no longer available and there appears to be no direct translation.
Also, the concept of a CloudBlobDirectory does not appear to have made it into Azure.Storage.Blobs, e.g.:
private static long GetDirectorySize(CloudBlobDirectory directoryBlob) {
    long size = 0;
    foreach (var blobItem in directoryBlob.ListBlobs()) {
        if (blobItem is BlobClient)
            size += ((BlobClient) blobItem).GetProperties().Value.ContentLength;
        if (blobItem is CloudBlobDirectory)
            size += GetDirectorySize((CloudBlobDirectory) blobItem);
    }
    return size;
}
where CloudBlobDirectory does not appear anywhere in the API.
There's no such thing as physical directories or folders in Azure Blob Storage. The directories you sometimes see are part of the blob name (e.g. folder1/folder2/file1.txt). The List Blobs request allows you to pass a prefix and a delimiter, which are used by the Azure Portal and Azure Storage Explorer to create a visualization of folders. For example, the prefix folder1/ and the delimiter / would let you see the content as if folder1 were opened.
That's exactly what happens in your code: GetDirectoryReference() adds a prefix, ListBlobs() fires a request, and Any() checks whether any items are returned.
For v12, the method that lets you do the same is GetBlobsByHierarchy (and its async version). In your particular case, where you only want to know whether any blobs exist in the directory, GetBlobs with a prefix would also suffice.
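To make this concrete, here is a rough v12 equivalent of both snippets. This is a sketch with assumed connection string, container, and prefix values; because a flat listing with a prefix already includes blobs in nested "subfolders", no recursion is needed for the size calculation.

using System;
using System.Linq;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Assumed connection string and container name -- replace with your own.
var container = new BlobContainerClient("<connection-string>", "my-container");

// "Does the directory exist?" becomes "is there at least one blob with this prefix?"
bool directoryExists = container.GetBlobs(prefix: "folder1/").Any();
Console.WriteLine(directoryExists);

// Directory size: sum the length of every blob under the prefix.
Console.WriteLine(GetDirectorySize(container, "folder1/"));

static long GetDirectorySize(BlobContainerClient containerClient, string prefix)
{
    long size = 0;

    // GetBlobs returns a flat listing, so blobs in nested "subdirectories"
    // are included automatically and no recursion is needed.
    foreach (BlobItem blob in containerClient.GetBlobs(prefix: prefix))
        size += blob.Properties.ContentLength ?? 0;

    return size;
}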

How to handle file inputs with changing schemas in Talend

Question: How do I continue to process files that differ substantially from a base schema and that trigger tSchemaComplianceCheck errors?
Background
Suppose I have a folder of Customer xls files called file1, file2, ..., file1000. Assume I have imported the file schema into the Talend repository, called it 6Columns, and configured the Talend job to iterate through each of the files and process them:
1-tFileInput ->2-tSchemaCompliance-6Columns -> 3-tMap ->4-FurtherProcessing
Read each excel file
Compare it to the schema 6Columns
Format the output (rename columns)
Take the collection of Customer data and process it more
While processing, I notice that the schema compliance check is generating errors (errorCode 16) pointing to a number of files (200) with a different schema, 13Columns, but there isn't a way to identify these files in advance so they can be filtered into a subjob.
How do I amend my processing to correctly integrate the files with the 13Columns schema (what's the recommended way of handling this), and how do I design for the case where other schema changes occur?
1-tFileInput -> 2-tSchemaCompliance-6Columns -> 3-tMap -> 4-FurtherProcessing
                        |
                        | Reject Flow (ErrorCode 16)
                        | Schema-13Columns
                        |
                        |-> ??
Current thinking when ErrorCode 16 is detected:
Option 1 (Parallel): Take the file path for the current file and process it against 13Columns using a new tFileInput, before merging the two flows back into one.
Option 2 (Serial): Collect the list of files that triggered the error and process them after I've finished with the compliant files?
You could try something like the below:
tFileList: read your input repository.
tFileInput "schema6" -> tSchemaComplianceCheck: read the files with the 6-column schema.
tMap_1: further processing.
In the reject part:
tMap after the reject link: add a new column containing the file path that has been rejected.
tFlowToIterate: used to get an iterate link, an acceptable input for the tFileInputDelimited that follows.
tFileInput: read the data with the 13-column schema. The following components are the same as in part 1.
After that, you can push your data to a tHashOutput in order to read it back in another subjob.

How to restore an unknown type BLOB field from Firebird

I am trying to restore a BLOB field stored in a Firebird database, and the only information I have is that the content of the BLOB field is a document.
I've tried using IBManager to right-click on the cell and select "Save BLOB to file", but the saved file is unreadable (as if it were encrypted). I tried to open it with Microsoft Word, Notepad, Adobe, etc., with no success. I also tried opening it with WinRAR (I thought it might have been compressed before being stored in the database), but still nothing.
Is there a way to find out whether and how the BLOB file was compressed, and how to restore it?
Thanks in advance!
Update:
I have converted the Firebird database to SQL and I am using the following code to extract the raw (still-encoded) BLOB documents:
conn.Open();
dr = comm.ExecuteReader();
while (dr.Read())
{
    // Column 1 holds the BLOB content; column 0 holds an identifier used as the folder name.
    byte[] document_byte = null;
    if (dr[1] != System.DBNull.Value)
    {
        document_byte = (byte[])dr[1];
    }
    string subPath = "C:\\Documents\\" + dr[0] + "\\";
    System.IO.Directory.CreateDirectory(subPath);
    if (document_byte != null)
    {
        // Write the bytes exactly as they were stored in the database.
        System.IO.File.WriteAllBytes(subPath + "Document", document_byte);
    }
}
How can I adjust my code to decode the BLOB content from Base64, since I know it is Base64 encoded?
Unless the field uses a BLOB filter, the data is stored in the database as-is, i.e. Firebird doesn't alter it in any way. Check the field's definition: if it has SUB_TYPE 0 (binary), then it is "ordinary" binary data and Firebird doesn't apply any filter to it. And even if the field uses some filter, unless there is a bug in the filter code you should get the original data back when reading the content of the BLOB.
So it comes down to the program which stored the document in the DB; it is quite possible that it compressed or encrypted the file, but there is no way Firebird can help you figure out what algorithm was used... One option would be to save the content of the BLOB into a file and then try the *nix file command; perhaps it is able to detect the file format used.
I would also check the DB for corruption, just in case (using Firebird's gfix command-line tool).
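As for the Base64 question in the update: if the stored bytes really are Base64 text, the standard .NET Convert.FromBase64String can decode them before the file is written. A minimal sketch, assuming the whole BLOB is a single Base64 string with no prefix or line breaks (the class, helper name, and output file name are hypothetical):

using System;
using System.IO;
using System.Text;

static class BlobDecoder
{
    // document_byte holds the raw BLOB content read from the database, as in the loop above.
    public static void WriteDecodedDocument(byte[] document_byte, string subPath)
    {
        // Interpret the stored bytes as Base64 text...
        string base64Text = Encoding.ASCII.GetString(document_byte).Trim();

        // ...and decode it back into the original binary document.
        byte[] decoded = Convert.FromBase64String(base64Text);

        Directory.CreateDirectory(subPath);
        File.WriteAllBytes(Path.Combine(subPath, "Document"), decoded);
    }
}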