Appending instead of overwriting table in BigQuery API - google-bigquery

I currently use bigquery.tabledata().insertAll() to put data into BigQuery. However, it overwrites all previous content instead of appending to it. Is there a way to change the default behaviour, or should I use another method?
Code below:
GoogleCredential credential = GoogleCredential.fromStream(...);
if (credential.createScopedRequired()) {
    credential = credential.createScoped(BigqueryScopes.all());
}
bigquery = new Bigquery.Builder(new NetHttpTransport(), new GsonFactory(), credential)
        .setApplicationName("Bigquery Samples")
        .build();
TableDataInsertAllRequest.Rows r = new TableDataInsertAllRequest.Rows();
r.setInsertId("123");
ObjectMapper m = new ObjectMapper();
Map<String,Object> props = m.convertValue(person, Map.class);
r.setJson(props);
TableDataInsertAllRequest content =
        new TableDataInsertAllRequest().setRows(Arrays.asList(r));
content.setSkipInvalidRows(true);
content.setIgnoreUnknownValues(true);
TableDataInsertAllResponse execute = bigquery.tabledata().insertAll("", "", "", content).execute();

The solution is to assign a [globally] unique ID as the insertId.
BigQuery uses the insertId property to detect duplicate insertion requests on a best-effort basis.
If you ignore this, you may end up with unwanted duplicate rows!
See more at https://cloud.google.com/bigquery/streaming-data-into-bigquery#dataconsistency
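As a rough illustration of the advice above, here is a minimal sketch in plain Java of two ways to produce such IDs. InsertIds, random and forKey are hypothetical names, not part of the BigQuery client library, and forKey assumes you have some key that uniquely identifies each row (a primary key or event ID, for example).

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper, not part of the BigQuery client library.
final class InsertIds {
    private InsertIds() {}

    // A fresh random ID. Generate it once per row and reuse the same
    // value if that row's insert request has to be retried.
    static String random() {
        return UUID.randomUUID().toString();
    }

    // A name-based UUID: the same key always yields the same ID, so a
    // retried row carries the ID of the original attempt. Assumes rowKey
    // uniquely identifies the row.
    static String forKey(String rowKey) {
        return UUID.nameUUIDFromBytes(rowKey.getBytes(StandardCharsets.UTF_8)).toString();
    }
}
```

In the question's code this would replace the fixed value, for example r.setInsertId(InsertIds.random()) instead of r.setInsertId("123").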

Oh, found the answer.
A row inserted with an id set via setInsertId(id) is overridden by the next row inserted with the same id.
Solution: do not set insertId.
EDIT: see @Mikhail Berlayant's response for why you should care about insertId.

Related

Cosmos DB "Partial Update"/Patch, can't set new property value to null

I'm trying out the Cosmos DB SDK's new Patch/Partial Update functionality (for .NET).
When adding a new property, I use:
var patchOperations = new List<PatchOperation>() {
    PatchOperation.Add<string>("/FavoriteColor", null)
};
await container.PatchItemAsync<T>(
    id: myId,
    partitionKey: new PartitionKey(myPk),
    patchOperations: patchOperations);
The problem is that PatchOperation.Add() throws if I set the second parameter to null (with the message "Value cannot be null"). I can set any non-null string and it works well. I just wonder whether this isn't supported yet or whether I missed something.
Thanks to GitHub user rvdvelden (source), this workaround appears to work perfectly:
private JToken NullValue { get { return new JProperty("nullValue", null).Value; } }
Used in this way:
operations.Add(PatchOperation.Set("/customer", value ?? NullValue));
Remove is an alternative if the intent is to remove the field/property.
This is not supported with Patch yet.
However, if you want to remove the entire property, you need to use the Remove operation.

Table.PutItemAsync of AWSSDK.DynamoDBv2 always returns null

In a .NET Core 2.1 application, I am adding a new record to a DynamoDB table using Table.PutItemAsync from the AWSSDK.DynamoDBv2 (v3.3.101.18) library:
var doc = await _table.PutItemAsync(document);
I can see that the record is successfully added in AWS Console, but it always returns null whereas the expected return value should be a Document:
public Task<Document> PutItemAsync(Document doc, CancellationToken cancellationToken = default);
I wonder if I am missing something obvious?
You need to specify the ReturnValues enum type in your PutItemOperationConfig and include this config in your request. The default is to return None. If you specify ReturnValues.AllOldAttributes (the only other option for this request), then you will get back a document with the old item's attributes if you overwrote an item or an empty item if you added a new item.
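var putItemOperationConfig = new PutItemOperationConfig()
{
    ReturnValues = ReturnValues.AllOldAttributes
};
// Pass the config along with the document
var doc = await _table.PutItemAsync(document, putItemOperationConfig);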

Google diff-match-patch : How to unpatch to get Original String?

I am using the Google diff-match-patch Java library to create a patch between two JSON strings and store the patch in a database.
diff_match_patch dmp = new diff_match_patch();
LinkedList<Patch> diffs = dmp.patch_make(latestString, originalString);
String patch = dmp.patch_toText(diffs); // Store patch to DB
Now, is there any way to use this patch to re-create the originalString by passing the latestString?
I googled this and found this very old comment on the Google diff-match-patch wiki saying:
Unpatching can be done by just looping through the diff, swapping
DIFF_INSERT with DIFF_DELETE, then applying the patch.
But I did not find any useful code that demonstrates this. How could I achieve this with my existing code? Any pointers or code references would be appreciated.
Edit:
The problem I am facing is this: in the front end I am showing a revisions module that lists all the transactions of a particular fragment (for example, an employee's details), such as which user updated which details. I am recreating the fragment JSON by reverse-applying each patch to get the data for the current transaction and showing it as a table (using http://marianoguerra.github.io/json.human.js/). But some of the recreated data is not valid JSON and I am getting a JSON.parse error.
I was looking to do something similar (in C#) and what is working for me with a relatively simple object is the patch_apply method. This use case seems somewhat missing from the documentation, so I'm answering here. Code is C# but the API is cross language:
static void Main(string[] args)
{
    var dmp = new diff_match_patch();
    string v1 = "My Json Object";
    string v2 = "My Mutated Json Object";
    var v2ToV1Patch = dmp.patch_make(v2, v1);
    var v2ToV1PatchText = dmp.patch_toText(v2ToV1Patch); // Persist text to db
    string v3 = "Latest version of JSON object";
    var v3ToV2Patch = dmp.patch_make(v3, v2);
    var v3ToV2PatchTxt = dmp.patch_toText(v3ToV2Patch); // Persist text to db
    // Time to re-hydrate the objects
    var altV3ToV2Patch = dmp.patch_fromText(v3ToV2PatchTxt);
    var altV2 = dmp.patch_apply(altV3ToV2Patch, v3)[0].ToString(); // .get(0) in Java I think
    var altV2ToV1Patch = dmp.patch_fromText(v2ToV1PatchText);
    var altV1 = dmp.patch_apply(altV2ToV1Patch, altV2)[0].ToString();
}
I am attempting to retrofit this as an audit log, where previously the entire JSON object was saved. As the audited objects have become more complex the storage requirements have increased dramatically. I haven't yet applied this to the complex large objects, but it is possible to check if the patch was successful by checking the second object in the array returned by the patch_apply method. This is an array of boolean values, all of which should be true if the patch worked correctly. You could write some code to check this, which would help check if the object can be successfully re-hydrated from the JSON rather than just getting a parsing error. My prototype C# method looks like this:
private static bool ValidatePatch(object[] patchResult, out string patchedString)
{
    patchedString = patchResult[0] as string;
    var successArray = patchResult[1] as bool[];
    foreach (var b in successArray)
    {
        if (!b)
            return false;
    }
    return true;
}
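Since the question uses the Java port, here is a rough Java counterpart of the check above. It assumes, as in the C# version, that patch_apply returns a two-element Object[] holding the patched text and a boolean[] with one flag per patch. PatchResults and patchedText are hypothetical names, not part of the library.

```java
import java.util.Optional;

// Hypothetical helper, not part of diff-match-patch.
final class PatchResults {
    private PatchResults() {}

    // Returns the patched text only if every patch applied cleanly;
    // otherwise returns an empty Optional.
    static Optional<String> patchedText(Object[] patchResult) {
        if (patchResult == null || patchResult.length < 2
                || !(patchResult[0] instanceof String)
                || !(patchResult[1] instanceof boolean[])) {
            return Optional.empty();
        }
        for (boolean applied : (boolean[]) patchResult[1]) {
            if (!applied) {
                return Optional.empty();
            }
        }
        return Optional.of((String) patchResult[0]);
    }
}
```

Checking the result this way before calling a JSON parser lets a failed patch be reported as such, rather than surfacing later as a parse error.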

Add track to playlist SoundCloud API

In my Windows Phone app, I'm using the following code to add a track to a playlist (i.e. a PUT request to the playlists/id endpoint):
using (HttpClient httpClient = new HttpClient())
{
httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(AccessToken);
HttpResponseMessage response = await httpClient.PutAsync(endpoint, new StringContent(data));
response.EnsureSuccessStatusCode();
}
where "data" is JSON data of the form:
{"playlist":{"tracks":["TrackId(to be added)"]}}
The above code returns "OK"(200) response but the track is NOT added to the playlist!
What am I doing wrong? Stuck on it for two days. Thanks in advance!
I use PUT to replace the track IDs in a set.
Here is sample code:
for (String s : trackIds)
    nameValuePairs.add(new BasicNameValuePair("playlist[tracks][][id]", s.trim()));
String url = "https://api.soundcloud.com/playlists/" + setId + ".json";
httpPut(url, nameValuePairs);
The problem was that the JSON data (the request body) was not formatted correctly.
"data" must be of the form:
{"playlist":{"tracks":[{"id":"__"}, {"id":"__"}, {"id":"__"}]}}
Here an id-value pair must be present for every track already in the playlist, as well as for the track that you want to add.
(Remember, this is a PUT request, so you need to send the updated data, i.e. the full updated "tracks" property of the "playlist".)
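A minimal sketch of assembling that body, written in Java to match the other answer's sample. PlaylistBody and withAddedTrack are hypothetical names, and the sketch assumes the track IDs are plain values (numeric, for instance) that need no JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any SoundCloud SDK.
final class PlaylistBody {
    private PlaylistBody() {}

    // Builds the PUT body: the IDs already in the playlist first,
    // followed by the ID of the track being added.
    static String withAddedTrack(List<String> existingTrackIds, String newTrackId) {
        List<String> all = new ArrayList<>(existingTrackIds);
        all.add(newTrackId);
        String tracks = all.stream()
                .map(id -> "{\"id\":\"" + id + "\"}")
                .collect(Collectors.joining(", "));
        return "{\"playlist\":{\"tracks\":[" + tracks + "]}}";
    }
}
```

The existing IDs would first have to be read from the playlist, since the request replaces the whole track list.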

Datatables: How to reload server-side data with additional params

I have a table which gets its data server-side, using custom server-side initialization params which vary depending upon which report is produced. Once the table is generated, the user may open a popup in which they can add multiple additional filters on which to search. I need to be able to use the same initialization params as the original table, and add the new ones using fnServerParams.
I can't figure out how to get the original initialization params using the datatables API. I had thought I could get a reference to the object, get the settings using fnSettings, and pass those settings into a new datatables instance like so:
var oSettings = $('#myTable').dataTable().fnSettings();
// add additional params to the oSettings object
$('#myTable').dataTable(oSettings);
but the variable returned through fnSettings isn't what I need and doesn't work.
At this point, it seems like I'm going to re-architect things so that I can pass the initialization params around as a variable and add params as needed, unless somebody can steer me in the right direction.
EDIT:
Following tduchateau's answer below, I was able to get partway there by using
var oTable= $('#myTable').dataTable(),
oSettings = oTable.fnSettings(),
oParams = oTable.oApi._fnAjaxParameters(oSettings);
oParams.push({'name':'my-new-filter', 'value':'my-new-filter-value'});
and can confirm that my new serverside params are added on to the existing params.
However, I'm still not quite there.
$('#myTable').dataTable(oSettings);
gives the error:
DataTables warning(table id = 'myTable'): Cannot reinitialise DataTable.
To retrieve the DataTables object for this table, please pass either no arguments
to the dataTable() function, or set bRetrieve to true.
Alternatively, to destroy the old table and create a new one, set bDestroy to true.
Setting
oTable.bRetrieve = true;
doesn't get rid of the error, and setting
oSettings.bRetrieve = true;
causes the table to not execute the ajax call. Setting
oSettings.bDestroy = true;
loses all the custom params, while setting
oTable.bDestroy = true;
returns the above error. And simply calling
oTable.fnDraw();
causes the table to be redrawn with its original settings.
Finally got it to work using fnServerParams. Note that I'm both deleting unnecessary params and adding new ones, using a URL vars object:
"fnServerParams": function ( aoData ) {
    var l = aoData.length;
    // remove unneeded server params
    for (var i = 0; i < l; ++i) {
        // if the param name contains bRegex_, sSearch_, mDataProp_, bSearchable_, or bSortable_, remove it from the array
        if (aoData[i].name.search(/bRegex_|sSearch_|mDataProp_|bSearchable_|bSortable_/) !== -1 ){
            aoData.splice(i, 1);
            // since we've removed an element from the array, we need to decrement both the index and the length vars
            --i;
            --l;
        }
    }
    // add the url variables to the server array
    for (i in oUrlvars) {
        aoData.push( { "name": i, "value": oUrlvars[i]} );
    }
}
This is normally the right way to retrieve the initialization settings:
var oSettings = oTable.fnSettings();
Why is it not what you need? What's wrong with these params?
If you need to filter data depending on your additional filters, you can complete the array of "AJAX data" sent to the server using this:
var oTable = $('#myTable').dataTable();
var oParams = oTable.oApi._fnAjaxParameters( oTable.fnSettings() );
oParams.push({name: "your-additional-param-name", value: yourAdditionalParamValue });
You can see some example usages in the TableTools plugin.
But I'm not sure this is what you need... :-)