Missing blob with JGit - Kotlin

I am trying the following with jgit:
val git = Git.open(File("/path/toMyRepo"))
val diffFormatter = DiffFormatter(DisabledOutputStream.INSTANCE).apply {
    setRepository(git.repository)
}
git.diff().call().forEach {
    if (it.changeType == DiffEntry.ChangeType.MODIFY) {
        diffFormatter.toFileHeader(it).toEditList().forEach {
            println(it)
        }
    }
}
but I am getting the following exception:
"org.eclipse.jgit.errors.MissingObjectException: Missing blob 9645ba8461cd88af20fd66a3e44055deb24f826e"
Does anyone see what is wrong with the code?
EDIT: full stack trace with a nearly empty repo (only one commit and a change to the only line in the only file):
Exception in thread "main" org.eclipse.jgit.errors.MissingObjectException: Missing blob f7891cbde46bbb6ca96065ecf1900ef6a223f679
at org.eclipse.jgit.internal.storage.file.WindowCursor.open(WindowCursor.java:149)
at org.eclipse.jgit.diff.ContentSource$ObjectReaderSource.open(ContentSource.java:140)
at org.eclipse.jgit.diff.ContentSource$Pair.open(ContentSource.java:276)
at org.eclipse.jgit.diff.DiffFormatter.open(DiffFormatter.java:1020)
at org.eclipse.jgit.diff.DiffFormatter.createFormatResult(DiffFormatter.java:950)
at org.eclipse.jgit.diff.DiffFormatter.toFileHeader(DiffFormatter.java:915)
at MainKt.main(Main.kt:17)

Not sure if this will be helpful, but I lost some time on this issue, so in case someone falls down this hole in the future: it appears to be an issue in the DiffFormatter class when the latest changes you are trying to diff have not been committed.
My use case was to get the changes for a single file, and to make it work I used the diff command interface directly, like so:
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
String filename = "mafile.java";
Git.open(new File("/path/toMyRepo")).diff() // calling the diff command
    .setPathFilter(PathFilter.create(filename)) // only look for changes in this specific file
    .setContextLines(0) // just the changes themselves, not the surrounding lines
    .setOutputStream(byteArrayOutputStream) // the output stream the changes are printed to
    .call();
System.out.println(byteArrayOutputStream.toString(StandardCharsets.UTF_8));
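For what it's worth, the MissingObjectException in the question seems to come from the fact that the entries returned by git.diff().call() reference working-tree blobs that do not exist in the object database, so a separate DiffFormatter that only has the repository set cannot open them. A hedged sketch of an alternative (plain Java, class and variable names are only illustrative, not tested against every JGit version) is to let the same formatter scan the index and the working tree itself, so it knows to read modified content from disk:
import java.io.File;
import java.io.IOException;
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.diff.DiffEntry;
import org.eclipse.jgit.diff.DiffFormatter;
import org.eclipse.jgit.diff.Edit;
import org.eclipse.jgit.dircache.DirCacheIterator;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.treewalk.FileTreeIterator;
import org.eclipse.jgit.util.io.DisabledOutputStream;

public class WorkingTreeDiff {
    public static void main(String[] args) throws IOException {
        try (Git git = Git.open(new File("/path/toMyRepo"));
             DiffFormatter formatter = new DiffFormatter(DisabledOutputStream.INSTANCE)) {
            Repository repo = git.getRepository();
            formatter.setRepository(repo);
            // Scanning index vs. working tree lets the formatter read modified
            // content from disk instead of looking it up in the object database.
            for (DiffEntry entry : formatter.scan(
                    new DirCacheIterator(repo.readDirCache()),
                    new FileTreeIterator(repo))) {
                if (entry.getChangeType() == DiffEntry.ChangeType.MODIFY) {
                    for (Edit edit : formatter.toFileHeader(entry).toEditList()) {
                        System.out.println(edit);
                    }
                }
            }
        }
    }
}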

AmazonS3: Getting warning: S3AbortableInputStream:Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection

Here's the warning that I am getting:
S3AbortableInputStream:Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
I tried using try-with-resources, but S3ObjectInputStream doesn't seem to close via this method.
try (S3Object s3object = s3Client.getObject(new GetObjectRequest(bucket, key));
     S3ObjectInputStream s3ObjectInputStream = s3object.getObjectContent();
     BufferedReader reader = new BufferedReader(new InputStreamReader(s3ObjectInputStream, StandardCharsets.UTF_8))) {
    // some code here blah blah blah
}
I also tried the code below, explicitly closing the streams, but that doesn't work either:
S3Object s3object = s3Client.getObject(new GetObjectRequest(bucket, key));
S3ObjectInputStream s3ObjectInputStream = s3object.getObjectContent();
try (BufferedReader reader = new BufferedReader(new InputStreamReader(s3ObjectInputStream, StandardCharsets.UTF_8))) {
    // some code here blah blah
    s3ObjectInputStream.close();
    s3object.close();
}
Any help would be appreciated.
PS: I am only reading two lines of the file from S3 and the file has more data.
Got the answer via another medium. Sharing it here:
The warning indicates that you called close() without reading the whole file. This is problematic because S3 is still trying to send the data and you're leaving the connection in a sad state.
There are two options here:
1. Read the rest of the data from the input stream so the connection can be reused.
2. Call s3ObjectInputStream.abort() to close the connection without reading the data. The connection won't be reused, so you take a performance hit on the next request while the connection is re-created. This may be worth it if it would take a long time to read the rest of the file.
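For option #2, a minimal sketch of what that might look like (assuming the v1 AWS SDK for Java; the read logic is only a placeholder):
try (S3Object s3Object = s3Client.getObject(new GetObjectRequest(bucket, key));
     S3ObjectInputStream stream = s3Object.getObjectContent()) {
    // Read only the bytes you actually need here...
    // ...then give up on the rest: abort() drops the HTTP connection instead of
    // draining it, so closing the stream afterwards does not log the warning.
    stream.abort();
}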
Following option #1 of Chirag Sejpal's answer, I used the statement below to drain the S3AbortableInputStream and ensure the connection can be reused:
com.amazonaws.util.IOUtils.drainInputStream(s3ObjectInputStream);
I ran into the same problem, and the following class helped me:
@Data
@AllArgsConstructor
public class S3ObjectClosable implements Closeable {
    private final S3Object s3Object;

    @Override
    public void close() throws IOException {
        s3Object.getObjectContent().abort();
        s3Object.close();
    }
}
and now you can use it without the warning:
try (final var s3ObjectClosable = new S3ObjectClosable(s3Client.getObject(bucket, key))) {
//same code
}
To add an example to Chirag Sejpal's answer (elaborating on option #1), the following can be used to read the rest of the data from the input stream before closing it:
S3Object s3object = s3Client.getObject(new GetObjectRequest(bucket, key));
try (S3ObjectInputStream s3ObjectInputStream = s3object.getObjectContent()) {
    try {
        // Read from stream as necessary
    } catch (Exception e) {
        // Handle exceptions as necessary
    } finally {
        while (s3ObjectInputStream != null && s3ObjectInputStream.read() != -1) {
            // Read the rest of the stream
        }
    }
    // The stream will be closed automatically by the try-with-resources statement
}
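Since the question only needs the first two lines of the object, the other suggestion in the warning, a ranged GET, is also worth showing. This is a hedged sketch with the v1 SDK: GetObjectRequest#withRange asks S3 for just the first few kilobytes, so whatever remains unread is tiny and cheap to drain. The 8 KB range is an arbitrary assumption; pick whatever comfortably covers two lines of your data.
GetObjectRequest request = new GetObjectRequest(bucket, key).withRange(0, 8191);
try (S3Object s3object = s3Client.getObject(request);
     S3ObjectInputStream stream = s3object.getObjectContent();
     BufferedReader reader = new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))) {
    String firstLine = reader.readLine();
    String secondLine = reader.readLine();
    // Drain the small remainder of the range so the connection can be reused.
    com.amazonaws.util.IOUtils.drainInputStream(stream);
}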
I ran into the same error.
As others have pointed out, the /tmp space in lambda is limited to 512 MB.
And if the lambda context is re-used for a new invocation, then the /tmp space is already half-full.
So, when reading the S3 objects and writing all the files to the /tmp directory (as I was doing),
I ran out of disk space somewhere in between.
Lambda exited with an error, but NOT all bytes from the S3ObjectInputStream had been read.
So, there are two things to keep in mind:
1) If the first execution causes the problem, be stingy with your /tmp space. We only have 512 MB.
2) If the second execution causes the problem, it can be resolved by attacking the root problem. It's not possible to delete the /tmp folder itself, so delete all the files in the /tmp folder after the execution is finished.
In Java, here is what I did, which successfully resolved the problem:
public String handleRequest(Map<String, String> keyValuePairs, Context lambdaContext) {
    try {
        // All work here
    } catch (Exception e) {
        logger.error("Error {}", e.toString());
        return "Error";
    } finally {
        deleteAllFilesInTmpDir();
    }
}

private void deleteAllFilesInTmpDir() {
    Path path = java.nio.file.Paths.get(File.separator, "tmp", File.separator);
    try {
        if (Files.exists(path)) {
            deleteDir(path.toFile());
            logger.info("Successfully cleaned up the tmp directory");
        }
    } catch (Exception ex) {
        logger.error("Unable to clean up the tmp directory");
    }
}

public void deleteDir(File dir) {
    File[] files = dir.listFiles();
    if (files != null) {
        for (final File file : files) {
            deleteDir(file);
        }
    }
    dir.delete();
}
This is my solution. I'm using Spring Boot 2.4.3.
Create an Amazon S3 client:
AmazonS3 amazonS3Client = AmazonS3ClientBuilder
.standard()
.withRegion("your-region")
.withCredentials(
new AWSStaticCredentialsProvider(
new BasicAWSCredentials("your-access-key", "your-secret-access-key")))
.build();
Create an Amazon transfer manager client:
TransferManager transferManagerClient = TransferManagerBuilder.standard()
.withS3Client(amazonS3Client)
.build();
Create a temporary file at /tmp/{your-s3-key} so that we can put the downloaded file into it:
File file = new File(System.getProperty("java.io.tmpdir"), "your-s3-key");
file.getParentFile().mkdirs(); // Create the parent directories of the temporary file
try {
    file.createNewFile(); // Create the temporary file
} catch (IOException e) {
    e.printStackTrace();
}
Then we download the file from S3 using the transfer manager client:
// Note that on this line the S3 object is downloaded into the temporary file we created
Download download = transferManagerClient.download(
new GetObjectRequest("your-s3-bucket-name", "your-s3-key"), file);
// This line blocks the thread until the download is finished
download.waitForCompletion();
Now that the S3 file has been successfully transferred into the temporary file that we created, we can get an InputStream for the temporary file.
InputStream input = new DataInputStream(new FileInputStream(file));
Because the temporary file is not needed anymore, we just delete it.
file.delete();

Google diff-match-patch : How to unpatch to get Original String?

I am using the Google diff-match-patch Java plugin to create a patch between two JSON strings and store the patch in a database.
diff_match_patch dmp = new diff_match_patch();
LinkedList<Patch> diffs = dmp.patch_make(latestString, originalString);
String patch = dmp.patch_toText(diffs); // Store patch to DB
Now is there any way to use this patch to re-create the originalString by passing the latestString?
I googled this and found a very old comment on the Google diff-match-patch wiki saying:
Unpatching can be done by just looping through the diff, swapping
DIFF_INSERT with DIFF_DELETE, then applying the patch.
But I did not find any useful code that demonstrates this. How could I achieve this with my existing code? Any pointers or code references would be appreciated.
Edit:
The problem I am facing is that in the front end I am showing a revisions module that lists all the transactions of a particular fragment (take, for example, an employee's details), i.e. which user updated which details, etc. I am recreating the fragment JSON by reverse-applying each patch to get the data for the current transaction and showing it as a table (using http://marianoguerra.github.io/json.human.js/). But some of the data is not valid JSON and I am getting a JSON.parse error.
I was looking to do something similar (in C#), and what is working for me with a relatively simple object is the patch_apply method. This use case seems somewhat missing from the documentation, so I'm answering here. The code is C#, but the API is cross-language:
static void Main(string[] args)
{
    var dmp = new diff_match_patch();
    string v1 = "My Json Object";
    string v2 = "My Mutated Json Object";
    var v2ToV1Patch = dmp.patch_make(v2, v1);
    var v2ToV1PatchText = dmp.patch_toText(v2ToV1Patch); // Persist text to db
    string v3 = "Latest version of JSON object";
    var v3ToV2Patch = dmp.patch_make(v3, v2);
    var v3ToV2PatchTxt = dmp.patch_toText(v3ToV2Patch); // Persist text to db
    // Time to re-hydrate the objects
    var altV3ToV2Patch = dmp.patch_fromText(v3ToV2PatchTxt);
    var altV2 = dmp.patch_apply(altV3ToV2Patch, v3)[0].ToString(); // .get(0) in Java I think
    var altV2ToV1Patch = dmp.patch_fromText(v2ToV1PatchText);
    var altV1 = dmp.patch_apply(altV2ToV1Patch, altV2)[0].ToString();
}
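Since the question's code is in Java, the equivalent re-hydration step there would look roughly like the sketch below (hedged: it assumes the standard diff_match_patch Java API, and patch/latestString are the variables from the question). Note that because the question used patch_make(latestString, originalString), the stored patch already transforms the latest string back into the original, so no swapping of DIFF_INSERT/DIFF_DELETE is needed; just apply it:
diff_match_patch dmp = new diff_match_patch();
// 'patch' is the patch text that was stored in the DB
LinkedList<Patch> patches = new LinkedList<>(dmp.patch_fromText(patch));
Object[] result = dmp.patch_apply(patches, latestString);
String recreatedOriginal = (String) result[0]; // the re-created original string
boolean[] applied = (boolean[]) result[1];     // true for each hunk that applied cleanly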
I am attempting to retrofit this as an audit log, where previously the entire JSON object was saved. As the audited objects have become more complex the storage requirements have increased dramatically. I haven't yet applied this to the complex large objects, but it is possible to check if the patch was successful by checking the second object in the array returned by the patch_apply method. This is an array of boolean values, all of which should be true if the patch worked correctly. You could write some code to check this, which would help check if the object can be successfully re-hydrated from the JSON rather than just getting a parsing error. My prototype C# method looks like this:
private static bool ValidatePatch(object[] patchResult, out string patchedString)
{
    patchedString = patchResult[0] as string;
    var successArray = patchResult[1] as bool[];
    foreach (var b in successArray)
    {
        if (!b)
            return false;
    }
    return true;
}

Dart file copy in Windows

The File class in the dart:io library doesn't yet include copy() and move() methods.
To tide me over until they arrive, I'm trying to roll my own copy function. I'm using the code below on Windows, but it just creates a 0 KB file.
void copyFile(String input, String output) {
  var inFile = new File(input), outFile = new File(output);
  if (outFile.existsSync()) outFile.deleteSync(); // I realize this isn't required
  var inStream = null, outStream = null;
  try {
    inStream = inFile.openInputStream();
    outStream = outFile.openOutputStream(FileMode.WRITE);
    inStream.pipe(outStream);
  } finally {
    if (outStream != null && !outStream.closed) outStream.close();
    if (inStream != null && !inStream.closed) inStream.close();
  }
}
I've also tried replacing the pipe line with print(inStream.read(100).toString()); and I get null. The input file does exist (otherwise I'd get a FileIOException). Am I doing something wrong, or are input streams broken under Windows?
I'm using:
Dart Editor version 0.3.1_r17463
Dart SDK version 0.3.1.2_r17463
Edit: The following works (although it doesn't "chunk"). Am I using the streams above incorrectly?
void copyFile(String input, String output) {
  var inFile = new File(input), outFile = new File(output);
  if (outFile.existsSync()) outFile.deleteSync(); // I realize this isn't required
  outFile.writeAsBytesSync(inFile.readAsBytesSync(), FileMode.WRITE);
}
With your first code snippet, you get an empty file because pipe is not a synchronous method. Thus, the copy from inputStream to outputStream has not started when the finally block is executed. By closing the streams in this finally block, you stop the pipe before it even starts. Without that finally block the copy is done correctly.
void copyFile(String input, String output) {
  final inStream = new File(input).openInputStream();
  final outStream = new File(output).openOutputStream(FileMode.WRITE);
  inStream.pipe(outStream);
}
Finally, you don't have to worry about closing the streams, because pipe closes them by default once it has completed. See InputStream.pipe.
For synchronous copy, use:
File(sourceFile).copySync(destinationFile);
For asynchronous copy, use:
File(sourceFile).copy(destinationFile);

NHibernate - Handling StaleObjectStateException to always commit client changes - Need advice/recommendation

I am trying to find the perfect way to handle this exception and force client changes to overwrite any other changes that caused the conflict. The approach I came up with is to wrap the call to Session.Transaction.Commit() in a loop. Inside the loop I use a try-catch block and handle each stale object individually: copy its properties (except the row-version property), refresh the object to get the latest DB data, copy the original values back onto the refreshed object, and then merge. Then I commit again, and if another StaleObjectStateException occurs the same handling applies. The loop keeps going until all conflicts are resolved.
This method is part of a UnitOfWork class. To make it clearer I'll post my code:
// 'Client-wins' rules, any conflicts found will always cause client changes to
// overwrite anything else.
public void CommitAndRefresh() {
    bool saveFailed;
    do {
        try {
            _session.Transaction.Commit();
            _session.BeginTransaction();
            saveFailed = false;
        } catch (StaleObjectStateException ex) {
            saveFailed = true;
            // Get the staled object with client changes
            var staleObject = _session.Get(ex.EntityName, ex.Identifier);
            // Extract the row-version property name
            IClassMetadata meta = _sessionFactory.GetClassMetadata(ex.EntityName);
            string rowVersionPropertyName = meta.PropertyNames[meta.VersionProperty] as string;
            // Store all property values from client changes
            var propertyValues = new Dictionary<string, object>();
            var publicProperties = staleObject.GetType().GetProperties();
            foreach (var p in publicProperties) {
                if (p.Name != rowVersionPropertyName) {
                    propertyValues.Add(p.Name, p.GetValue(staleObject, null));
                }
            }
            // Get latest data for staled object from the database
            _session.Refresh(staleObject);
            // Update the data with the original client changes except for row-version
            foreach (var p in publicProperties) {
                if (p.Name != rowVersionPropertyName) {
                    p.SetValue(staleObject, propertyValues[p.Name], null);
                }
            }
            // Merge
            _session.Merge(staleObject);
        }
    } while (saveFailed);
}
The above code works fine and handles concurrency with the client-wins rule. However, I was wondering whether there are any built-in capabilities in NHibernate to do this for me, or whether there is a better way to handle this.
Thanks in advance,
What you're describing is a lack of concurrency checking. If you don't use a concurrency strategy (optimistic-lock, version, or pessimistic), StaleObjectStateException will not be thrown and the update will be issued.
Okay, now I understand your use case. One important point is that the ISession should be discarded after an exception is thrown. You can use ISession.Merge to merge changes between a detached and a persistent object rather than doing it yourself. Unfortunately, Merge does not cascade to child objects, so you still need to walk the object graph yourself. So the implementation would look something like:
catch (StaleObjectStateException ex)
{
    if (isPowerUser)
    {
        var newSession = GetSession();
        // Merge will automatically get first
        newSession.Merge(staleObject);
        newSession.Flush();
    }
}

Uploading with multipart/form-data using OpenRasta and IMultipartHttpEntity

I'm trying to post some files using OpenRasta. I've gotten as far as getting my handler called, but by all appearances the stream in the entity is empty. Here's my handler:
public OperationResult Post(IEnumerable<IMultipartHttpEntity> entities)
{
    var foo = entities.ToList();
    foreach (var entity in foo)
    {
        if (entity.Stream != null && entity.ContentType != null)
        {
            var memoryStream = new MemoryStream();
            entity.Stream.CopyTo(memoryStream);
        }
    }
    return new OperationResult.Created();
}
Each time through the loop memoryStream has a length of 0. What am I doing wrong?
Nothing like posting on Stack Overflow to make the answer immediately obvious. Apparently you only get one enumeration of the entities from which to grab the stream. I had added the "foo" variable above to make debugging easier, but it was causing the streaming to fail. As I stored the stream to the database, I had also failed to reset memoryStream to the beginning before writing it. Fixing these two issues got the file to upload correctly.