A SHA value representing the state of a git repository - libgit2

Is there a SHA value that represents the current state of a git repository?
Such a SHA value would be updated each time the object database is updated or a reference changes.
In other words, the SHA value would represent the current version of the whole repository.
Here is my code for calculating a SHA over the current references. Is there a git-level API or interface for this?
private string CalcBranchesSha(bool includeTags = false)
{
    var sb = new StringBuilder();
    sb.Append(":HEAD");
    if (_repository.Head.Tip != null)
        sb.Append(_repository.Head.Tip.Sha);
    sb.Append(';');
    foreach (var branch in _repository.Branches.OrderBy(s => s.Name))
    {
        sb.Append(':');
        sb.Append(branch.Name);
        if (branch.Tip != null)
            sb.Append(branch.Tip.Sha);
    }
    sb.Append(';');
    if (includeTags)
    {
        foreach (var tag in _repository.Tags.OrderBy(s => s.Name))
        {
            sb.Append(':');
            sb.Append(tag.Name);
            if (tag.Target != null)
                sb.Append(tag.Target.Sha);
        }
    }
    return sb.ToString().CalcSha();
}
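For completeness, CalcSha() above is a string-hashing helper that isn't shown; a minimal stand-in, assuming it means a hex-encoded SHA-1 over the string's UTF-8 bytes, could look like this:
using System;
using System.Security.Cryptography;
using System.Text;

public static class ShaExtensions
{
    // Assumed implementation of the CalcSha() extension used above:
    // hex-encoded SHA-1 over the string's UTF-8 bytes.
    public static string CalcSha(this string value)
    {
        using (var sha1 = SHA1.Create())
        {
            var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(value));
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}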

Is there a SHA value that represents the current state of a git repository?
In git parlance, the status of a repository usually refers to the paths that have differences between the working directory, the index and the current HEAD commit.
However, it doesn't look like you're after this. From what I understand, you're trying to calculate a checksum that represents the state of the current repository (i.e. what your branches and tags point to).
Regarding the code, it may be improved in some ways to get a more precise checksum:
Account for all the references (besides refs/tags and refs/heads) in the repository (think refs/stash, refs/notes, or the backup references generated in refs/original when one rewrites the history of a repository).
Consider disambiguating symbolic from direct references (e.g. HEAD->master->08a4217 should lead to a different checksum than a detached HEAD->08a4217).
Below is a modified version of the code which deals with those two points, by leveraging the Refs namespace:
private string CalculateRepositoryStateSha(IRepository repo)
{
    var sb = new StringBuilder();
    sb.Append(":HEAD");
    sb.Append(repo.Refs.Head.TargetIdentifier);
    sb.Append(';');
    foreach (var reference in repo.Refs.OrderBy(r => r.CanonicalName))
    {
        sb.Append(':');
        sb.Append(reference.CanonicalName);
        sb.Append(reference.TargetIdentifier);
        sb.Append(';');
    }
    return sb.ToString().CalcSha();
}
Please keep in mind the following limitations:
This doesn't take changes to the index or the workdir into account (e.g. the same checksum will be returned whether a file has been staged or not).
There are ways to create objects in the object database without modifying any reference, and that kind of change won't be reflected by the code above. One possible hack-ish workaround is to also append the number of objects in the object database (i.e. repo.ObjectDatabase.Count()) to the StringBuilder, as sketched below, but that may hinder overall performance, as it enumerates all the objects each time the checksum is calculated.
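A minimal sketch of that workaround, to be placed just before the final return above (repo.ObjectDatabase implements IEnumerable&lt;GitObject&gt;, so Count() walks every object and can be slow on large repositories):
// Hack-ish: fold the object count into the checksum input, so loose objects
// created without any reference update still change the result.
sb.Append(":odb:");
sb.Append(repo.ObjectDatabase.Count());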
Is there a git-level API or interface?
I don't know of any equivalent native function in git (although a similar result may be achieved through some scripting). There's nothing native in the libgit2 or LibGit2Sharp APIs either.

Related

Memory cache dependencies

I'm trying to understand the Cache dependencies example, but some of it is eluding me.
var cts = new CancellationTokenSource();
_cache.Set(CacheKeys.DependentCTS, cts);
using (var entry = _cache.CreateEntry(CacheKeys.Parent))
{
    // expire this entry if the dependent entry expires.
    entry.Value = DateTime.Now;
    entry.RegisterPostEvictionCallback(DependentEvictionCallback, this);
    _cache.Set(CacheKeys.Child,
        DateTime.Now,
        new CancellationChangeToken(cts.Token));
}
CancellationTokenSource allows evicting multiple cache entries as a group. But what's the point of the using block in this example? I downloaded the sample project and replaced the using block with the following:
_cache.Set(CacheKeys.Parent, DateTime.Now, new CancellationChangeToken(cts.Token));
_cache.Set(CacheKeys.Child, DateTime.Now, new CancellationChangeToken(cts.Token));
While my two lines are simpler, they seem to have the same effect as the using block. The doc says:
With the using pattern in the code above, cache entries created inside the using block will inherit triggers and expiration settings.
What does "triggers and expiration settings" refer to here?
I must add dependent cache entries in my application, but they must be added at separate times. Therefore I can't take the approach shown in the example's using block since it requires both entries be added at the same time, right? What functionality am I missing out on by instead taking the approach shown above in my two lines of code?
Finally, shouldn't Dispose be called on CancellationTokenSource after the Cancel call? The sample doesn't do that.
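For reference, here is the standalone pattern I mean: entries added at separate times that share one CancellationTokenSource (a minimal sketch using Microsoft.Extensions.Caching.Memory; the key names are illustrative):
using System;
using System.Threading;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Primitives;

var cache = new MemoryCache(new MemoryCacheOptions());
var cts = new CancellationTokenSource();

// First entry, added now.
cache.Set("parent", DateTime.Now,
    new MemoryCacheEntryOptions().AddExpirationToken(new CancellationChangeToken(cts.Token)));

// Dependent entry, added at some later time, sharing the same token source.
cache.Set("child", DateTime.Now,
    new MemoryCacheEntryOptions().AddExpirationToken(new CancellationChangeToken(cts.Token)));

cts.Cancel();  // evicts both entries as a group
cts.Dispose(); // dispose once the source is no longer needed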

Migrating from Microsoft.Azure.Storage.Blob to Azure.Storage.Blobs - directory concepts missing

These are great guides for migrating between the different versions of the NuGet package:
https://github.com/Azure/azure-sdk-for-net/blob/Azure.Storage.Blobs_12.6.0/sdk/storage/Azure.Storage.Blobs/README.md
https://elcamino.cloud/articles/2020-03-30-azure-storage-blobs-net-sdk-v12-upgrade-guide-and-tips.html
However, I am struggling to migrate the following concepts in my code:
// Return if a directory exists:
container.GetDirectoryReference(path).ListBlobs().Any();
where GetDirectoryReference is not understood and there appears to be no direct translation.
Also, the concept of a CloudBlobDirectory does not appear to have made it into Azure.Storage.Blobs e.g.
private static long GetDirectorySize(CloudBlobDirectory directoryBlob)
{
    long size = 0;
    foreach (var blobItem in directoryBlob.ListBlobs())
    {
        if (blobItem is BlobClient)
            size += ((BlobClient)blobItem).GetProperties().Value.ContentLength;
        if (blobItem is CloudBlobDirectory)
            size += GetDirectorySize((CloudBlobDirectory)blobItem);
    }
    return size;
}
where CloudBlobDirectory does not appear anywhere in the API.
There's no such thing as physical directories or folders in Azure Blob Storage. The directories you sometimes see are part of the blob name (e.g. folder1/folder2/file1.txt). The List Blobs request allows you to pass a prefix and a delimiter, which are used by the Azure Portal and Azure Data Explorer to create a visualization of folders. For example, prefix folder1/ with delimiter / lets you see the content as if folder1 were opened.
That's exactly what happens in your code: GetDirectoryReference() adds a prefix, ListBlobs() fires the request, and Any() checks whether any items are returned.
For V12, the equivalent call is GetBlobsByHierarchy (or its async version). In your particular case, where you only want to know whether any blobs exist under the directory, GetBlobs with a prefix also suffices.
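A rough V12 sketch of both operations (connectionString, containerName and path are illustrative; GetBlobs with a prefix returns a flat listing of every blob under the "directory", so the recursion from the question disappears):
using System.Linq;
using Azure.Storage.Blobs;

var container = new BlobContainerClient(connectionString, containerName);

// "Does the directory exist?" becomes: does any blob carry this prefix?
bool directoryExists = container.GetBlobs(prefix: path).Any();

// Directory size: sum ContentLength over every blob under the prefix.
long directorySize = container.GetBlobs(prefix: path)
    .Sum(blob => blob.Properties.ContentLength ?? 0);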

App Folder files not visible after un-install / re-install

I noticed this in the debug environment where I have to do many re-installs in order to test persistent data storage, initial settings, etc... It may not be relevant in production, but I mention this anyway just to inform other developers.
Any files created by an app in its App Folder are not 'visible' to queries after manual un-install / re-install (from IDE, for instance). The same applies to the 'Encoded DriveID' - it is no longer valid.
It is probably 'by design', but it effectively creates 'orphans' in the app folder until they are manually cleaned via 'drive.google.com > Manage Apps > [yourapp] > Options > Delete hidden app data'. It also creates problems if an app relies on finding files by metadata, title, ... since these seem to be gone. As I said, not a production problem, but it can create some frustration during development.
Can any friendly Googlers confirm this? Is there any other way to get to these files after a re-install?
Try this approach:
Use requestSync() in onConnected() as:
@Override
public void onConnected(Bundle connectionHint) {
    super.onConnected(connectionHint);
    Drive.DriveApi.requestSync(getGoogleApiClient()).setResultCallback(syncCallback);
}
Then, in its callback, query the contents of the drive using:
final private ResultCallback<Status> syncCallback = new ResultCallback<Status>() {
    @Override
    public void onResult(@NonNull Status status) {
        if (!status.isSuccess()) {
            showMessage("Problem while retrieving results");
            return;
        }
        query = new Query.Builder()
                .addFilter(Filters.and(Filters.eq(SearchableField.TITLE, "title"),
                        Filters.eq(SearchableField.TRASHED, false)))
                .build();
        Drive.DriveApi.query(getGoogleApiClient(), query)
                .setResultCallback(metadataCallback);
    }
};
Then, in its callback, if found, retrieve the file using:
final private ResultCallback<DriveApi.MetadataBufferResult> metadataCallback =
        new ResultCallback<DriveApi.MetadataBufferResult>() {
    @SuppressLint("SetTextI18n")
    @Override
    public void onResult(@NonNull DriveApi.MetadataBufferResult result) {
        if (!result.getStatus().isSuccess()) {
            showMessage("Problem while retrieving results");
            return;
        }
        MetadataBuffer mdb = result.getMetadataBuffer();
        DriveId driveId = null;
        for (Metadata md : mdb) {
            Date createdDate = md.getCreatedDate();
            driveId = md.getDriveId(); // keep the last match
        }
        if (driveId != null)
            readFromDrive(driveId);
    }
};
Job done!
Hope that helps!
It looks like Google Play services has a problem. (https://stackoverflow.com/a/26541831/2228408)
For testing, you can do it by clearing Google Play services data (Settings > Apps > Google Play services > Manage Space > Clear all data).
Or, at this time, you need to implement it by using Drive SDK v2.
I think you are correct that it is by design.
By inspection, I have concluded that until an app places data in the AppFolder, Drive does not sync down to the device, however much you try to hassle it. Therefore it is impossible to check for the existence of AppFolder content placed by another device or a prior installation. I'd assume this was to try to create a consistent clean install.
I can see that there are a couple of strategies to work around this:
1) Place dummy data in the AppFolder, then sync and recheck.
2) Accept that in the first instance there is the possibility of duplicates: since by definition you cannot access the existing file, you will create a new copy, and use custom metadata to come up with a scheme to differentiate like-named files and choose which one you want to keep (essentially implementing your conflict-merge strategy across the two different files).
I've done the second: I keep an update number to compare data from different devices and decide which version I want, and hence whether to upload, download or leave alone. As my data is an SQLite DB, I also have some code to only sync once updates have settled down, and I deliberately treat people updating two devices at once as foolish; the results are consistent, but it is undefined which one will win.

How can I get references to BlockBlob objects from CloudBlobDirectory.ListBlobs?

I am using the Microsoft Azure .NET client libraries to interact with Azure cloud storage. I need to be able to access additional information about each blob in its metadata collection. I am currently using the CloudBlobDirectory.ListBlobs() method to get a list of blobs in a particular directory of a directory structure I've devised via the blob names. The ListBlobs() method returns a list of IListBlobItem objects. They only have a couple of properties: Url and references to the parent directory and parent container. I need to get to the metadata of the actual blob objects.
I envisioned there would be a way to either cast the IListBlobItem to a BlockBlob object or use the IListBlobItem to get a reference to the BlockBlob, but can't seem to find a way to do that.
My question is: Is there a way to get a BlockBlob object from this method, or do I have to use a different way of getting the actual BlockBlob objects? If different, then can you suggest a way to achieve this, while also being able to filter by the "directory" scheme?
OK... I found a way to do this, and while it seems a little clunky and indirect, it does achieve the main thing I thought should be doable, which is to cast the IListBlobItem directly to a CloudBlockBlob object.
What I am doing is getting the list from the Directory object's ListBlobs() method, looping over each item in the list, casting the item to a CloudBlockBlob object, and then calling the FetchAttributes() method to retrieve the properties (including the metadata). Then I add a new "info" object to a new list of info objects. Here's the code I'm using:
CloudBlobDirectory dir = container.GetDirectoryReference(dirPath);
var blobs = dir.ListBlobs(true);
foreach (IListBlobItem item in blobs)
{
    CloudBlockBlob blob = (CloudBlockBlob)item;
    blob.FetchAttributes();
    files.Add(new ImageInfo
    {
        FileUrl = item.Uri.ToString(),
        FileName = item.Uri.PathAndQuery.Replace(restaurantId.ToString().PadLeft(3, '0') + "/", ""),
        ImageName = blob.Metadata["Name"]
    });
}
The whole "Blob" concept seems needlessly complex and doesn't seem to achieve what I'd have thought would be one of the main features of the Blob wrapper: a way to expand search capabilities by allowing a query over name, directory, container and metadata. I'd have thought you could construct a LINQ query that would read somewhat like: "return a list of all blobs in the 'images' container that are in the 'natural/landscapes/' directory path and have a metadata key of 'category' with the value of 'sunset'". There doesn't seem to be a way to do that, and that seems like a missed opportunity to me. Oh, well.
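The closest I can get is filtering client-side after asking the listing to include metadata; a rough sketch of that wished-for "sunset" query (assuming the overload that accepts BlobListingDetails.Metadata):
using System.Linq;

var sunsets = container
    .GetDirectoryReference("natural/landscapes/")
    .ListBlobs(useFlatBlobListing: true, blobListingDetails: BlobListingDetails.Metadata)
    .OfType<CloudBlockBlob>()
    // Metadata came back with the listing, so this filter runs in memory.
    .Where(b => b.Metadata.ContainsKey("category") && b.Metadata["category"] == "sunset");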
If I'm wrong and way off base here, please let me know.
This approach was developed for Java, but I hope it can be adapted to any other supported language. Although the functionality you ask for has not been explicitly developed yet, I think I found a different (hopefully less clunky) way to access CloudBlockBlob data from a ListBlobItem element.
The following code can be used to delete, for example, every blob inside a specific directory.
CloudBlobClient blobClient = /* Obtain your blob client */;
String blobPrefix = /* Directory prefix, e.g. "folder1/" */;
try {
    CloudBlobContainer container = /* Obtain your blob container */;
    for (ListBlobItem blobItem : container.listBlobs(blobPrefix)) {
        if (blobItem instanceof CloudBlob) {
            CloudBlob blob = (CloudBlob) blobItem;
            if (blob.exists()) {
                System.out.println("Deleting blob " + blob.getName());
                blob.delete();
            }
        }
    }
} catch (URISyntaxException | StorageException ex) {
    Logger.getLogger(BlobOperations.class.getName()).log(Level.SEVERE, null, ex);
}
The previous answers are good. I just wanted to point out two things:
1) Nowadays async programming is recommended and well supported by the Azure SDK, so try to use it:
CloudBlobDirectory dir = container.GetDirectoryReference(dirPath);
var blobs = dir.ListBlobs(true);
foreach (IListBlobItem item in blobs)
{
    CloudBlockBlob blob = (CloudBlockBlob)item;
    await blob.FetchAttributesAsync(); // Use async calls...
}
2) Fetching metadata in a separate call is not efficient: the code above makes two HTTP requests per blob object. The ListBlobs() method supports fetching metadata in the same call by setting the blobListingDetails parameter:
CloudBlobDirectory dir = container.GetDirectoryReference(dirPath);
var blobs = dir.ListBlobs(useFlatBlobListing: true, blobListingDetails: BlobListingDetails.Metadata);
I recommend the second approach if possible, since it is the most efficient way to fetch metadata.
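With that overload, each blob's Metadata dictionary is already populated, so the loop from the first answer (here reusing the blobs variable from the snippet above) no longer needs a per-blob round-trip; roughly:
foreach (IListBlobItem item in blobs)
{
    if (item is CloudBlockBlob blob)
    {
        // No FetchAttributes() needed: metadata came back with the listing.
        string imageName = blob.Metadata.ContainsKey("Name") ? blob.Metadata["Name"] : null;
    }
}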

How do I get the name of the current branch in libgit2?

I am trying to use libgit2 to read the name of the current branch. Do I have to do some sort of resolve?
I tried using
git_branch_lookup
to look up the git_reference for HEAD, but it results in
Unable to find local branch 'HEAD'
Thanks!
Running git branch -a doesn't list HEAD. In libgit2, HEAD isn't considered a valid branch either. It's only a reference.
If you want to discover which reference is the current branch, then you should
Load the current HEAD reference (try the git_repository_head() convenience method)
Determine its type (using git_reference_type())
Depending on its type (GIT_REF_SYMBOLIC or GIT_REF_OID) retrieve one of the following
The name of the branch (using git_reference_symbolic_target())
The commit being pointed at (using git_reference_target())
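For readers coming from LibGit2Sharp (the .NET binding used in the first question on this page), a rough sketch of those same steps; the repository path is illustrative:
using System;
using LibGit2Sharp;

using (var repo = new Repository("/path/to/repo"))
{
    // 1. Load the current HEAD reference.
    Reference head = repo.Refs.Head;
    // 2./3. A symbolic HEAD points at a branch; a direct HEAD is detached.
    if (head is SymbolicReference)
        Console.WriteLine("On branch: " + head.TargetIdentifier); // e.g. "refs/heads/master"
    else
        Console.WriteLine("Detached HEAD at " + head.TargetIdentifier); // commit SHA
}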
I did not find the existing answer/comments helpful when I had this exact problem. Instead I combined git_reference_lookup() and git_reference_symbolic_target().
git_reference *head_ref = nullptr;
std::string result;
if (git_reference_lookup(&head_ref, repo, "HEAD") == 0) {
    // NULL when HEAD is a direct reference (detached HEAD).
    const char *symbolic_ref = git_reference_symbolic_target(head_ref);
    // Skip the leading "refs/heads/" -- 11 chars.
    if (symbolic_ref)
        result = &symbolic_ref[11];
    git_reference_free(head_ref);
}
This feels like something of a dirty hack, but it's the best I've managed. The result string either ends up empty (e.g. a detached HEAD, where no branch is checked out) or contains the name of the checked-out branch. The symbolic target is owned by the ref, so copy that value into the string before freeing it!
There is an example for this at https://libgit2.org/libgit2/ex/HEAD/status.html . It is based around git_reference_shorthand. This gives the branch name instead of the full reference, so no string manipulation is needed, and it also works in the edge case where branch and remote have different names.
static void show_branch(git_repository *repo, int format)
{
    int error = 0;
    const char *branch = NULL;
    git_reference *head = NULL;
    error = git_repository_head(&head, repo);
    if (error == GIT_EUNBORNBRANCH || error == GIT_ENOTFOUND)
        branch = NULL;
    else if (!error) {
        branch = git_reference_shorthand(head);
    } else
        check_lg2(error, "failed to get current branch", NULL);
    if (format == FORMAT_LONG)
        printf("# On branch %s\n",
            branch ? branch : "Not currently on any branch.");
    else
        printf("## %s\n", branch ? branch : "HEAD (no branch)");
    git_reference_free(head);
}
}