Unable to use the create_job function with S3 Control for batch operations - amazon-s3

I am unable to understand how to fill in the fields of the create_job function (specifically the Manifest parameter). I would really appreciate it if someone could give a real-world example of create_job, as I could not find one on the internet.
What I need to do is add tags to multiple objects at once.
The code I have written and understood so far is below:
client = boto3.client('s3control')
response = client.create_job(
    AccountId='682283364620',
    Operation={
        'S3PutObjectTagging': {
            'TagSet': [
                {
                    'Key': 'naturalnumber',
                    'Value': 'yo'
                },
            ]
        }
    },
    Report={
        'Bucket': 'shivam1052061',
        'Format': 'Report_CSV_20180820',
        'Enabled': True,
        'Prefix': 'string',
        'ReportScope': 'AllTasks'
    },
    ClientRequestToken='',
    Manifest={
        'Spec': {
            'Format': 'S3BatchOperations_CSV_20180820',
            'Fields': [
                'Bucket', 'Key',   # the docs offer 'Ignore' | 'Bucket' | 'Key' | 'VersionId'
            ]
        },
        'Location': {
            'ObjectArn': 'string',
            'ObjectVersionId': 'string',
            'ETag': 'string'
        }
    },
    Description='string',
    Priority=123,
    RoleArn='string'
)

You can use the manifest files that S3 Inventory generates, referencing the manifest's object ARN and ETag:
Manifest={
    'Spec': {
        'Format': 'S3InventoryReport_CSV_20161130'
    },
    'Location': {
        'ObjectArn': 'arn:aws:s3:::bucket/report/2020-08-17T00-00Z/manifest.json',
        'ETag': 'xxxxxxxxx'
    }
}
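For a complete, concrete call, here is a minimal sketch of the tagging job using a plain CSV manifest instead (each CSV line being bucket,key). The account ID and bucket are reused from the question; the manifest key, its ETag, and the IAM role name are placeholders you would substitute:

import boto3

client = boto3.client('s3control')

# manifest.csv was uploaded beforehand: one "bucket,key" line per object to tag
response = client.create_job(
    AccountId='682283364620',
    ConfirmationRequired=False,
    Operation={
        'S3PutObjectTagging': {
            'TagSet': [
                {'Key': 'naturalnumber', 'Value': 'yo'},
            ]
        }
    },
    Report={
        'Bucket': 'arn:aws:s3:::shivam1052061',   # note: a bucket ARN, not a bare name
        'Format': 'Report_CSV_20180820',
        'Enabled': True,
        'Prefix': 'batch-reports',
        'ReportScope': 'AllTasks'
    },
    Manifest={
        'Spec': {
            'Format': 'S3BatchOperations_CSV_20180820',
            'Fields': ['Bucket', 'Key']           # must match the CSV columns
        },
        'Location': {
            'ObjectArn': 'arn:aws:s3:::shivam1052061/manifest.csv',
            'ETag': '<etag-of-the-uploaded-manifest>'   # placeholder: ETag of manifest.csv
        }
    },
    Priority=10,
    RoleArn='arn:aws:iam::682283364620:role/S3BatchOperationsRole'   # placeholder role S3 assumes
)
print(response['JobId'])

ClientRequestToken can be omitted; boto3 autogenerates one. The job runs asynchronously, so you can poll client.describe_job(AccountId='682283364620', JobId=response['JobId']) to watch its progress.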

Related

Which class in AWS CDK has an option to configure dynamic partitioning for a Kinesis delivery stream?

I'm using a Kinesis delivery stream to send events from EventBridge to an S3 bucket, but I can't seem to find which class has the option to configure dynamic partitioning.
This is my code for the delivery stream:
new CfnDeliveryStream(this, `Export-delivery-stream`, {
    s3DestinationConfiguration: {
        bucketArn: bucket.bucketArn,
        roleArn: kinesisFirehoseRole.roleArn,
        prefix: `test/!{timestamp:yyyy/MM/dd}/`
    }
});
I have been working on the same issue for a few days, and have finally gotten something to work. Here is an example of how it can be implemented in CDK. In short, the partitioning has to be enabled as you have done, but you also need to set the key and .jq expression in the so-called processingConfiguration.
Our incoming JSON data looks something like this:
{
    "data":
    {
        "timestamp": 1633521266990,
        "defaultTopic": "Topic",
        "data":
        {
            "OUT1": "Inactive",
            "Current_mA": 3.92
        }
    }
}
The CDK code looks as following:
const DeliveryStream = new CfnDeliveryStream(this, 'deliverystream', {
    deliveryStreamName: 'deliverystream',
    extendedS3DestinationConfiguration: {
        cloudWatchLoggingOptions: {
            enabled: true,
        },
        bucketArn: Bucket.bucketArn,
        roleArn: deliveryStreamRole.roleArn,
        prefix: 'defaultTopic=!{partitionKeyFromQuery:defaultTopic}/!{timestamp:yyyy/MM/dd}/',
        errorOutputPrefix: 'error/!{firehose:error-output-type}/',
        bufferingHints: {
            intervalInSeconds: 60,
        },
        dynamicPartitioningConfiguration: {
            enabled: true,
        },
        processingConfiguration: {
            enabled: true,
            processors: [
                {
                    type: 'MetadataExtraction',
                    parameters: [
                        {
                            parameterName: 'MetadataExtractionQuery',
                            parameterValue: '{defaultTopic: .data.defaultTopic}',
                        },
                        {
                            parameterName: 'JsonParsingEngine',
                            parameterValue: 'JQ-1.6',
                        },
                    ],
                },
                {
                    type: 'AppendDelimiterToRecord',
                    parameters: [
                        {
                            parameterName: 'Delimiter',
                            parameterValue: '\\n',
                        },
                    ],
                },
            ],
        },
    },
})
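One note on the resulting layout: !{timestamp:yyyy/MM/dd} in the prefix resolves to the record's arrival time in Firehose, not to the timestamp field inside the payload, so the sample record above would land under a key like defaultTopic=Topic/2021/10/06/... when ingested on that date.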

Indexes: Search by Boolean?

I'm having some trouble with FaunaDB indexes. FQL is quite powerful, but the docs seem to be limited (for now) to only a few examples/use cases (searching by string).
I have a collection of Orders, with a few fields: status, id, client, material and date.
My goal is to search/filter for orders depending on their status, OPEN or CLOSED (boolean true/false).
Here is the Index I created:
CreateIndex({
    name: "orders_all_by_open_asc",
    unique: false,
    serialized: true,
    source: Collection("orders"),
    terms: [{ field: ["data", "status"] }],
    values: [
        { field: ["data", "unique_id"] },
        { field: ["data", "client"] },
        { field: ["data", "material"] },
        { field: ["data", "date"] }
    ]
})
So with this Index, I want to specify either TRUE or FALSE and get all corresponding orders, including their data (fields).
I'm having two problems:
When I pass TRUE or FALSE using the JavaScript driver, nothing is returned :( Is it possible to search by booleans at all, or only by strings/numbers?
Here is my Query (in FQL, using the Shell):
Match(Index("orders_all_by_open_asc"), true)
And unfortunately, nothing is returned. I'm probably doing this wrong.
Second (slightly unrelated) question: when I create an index and specify a bunch of values, the data returned is in array format, containing only the values, not the fields. An example:
[
    1001,
    "client1",
    "concrete",
    "2021-04-13T00:00:00.000Z"
],
[
    1002,
    "client2",
    "wood",
    "2021-04-13T00:00:00.000Z"
]
This format is bad for me, because my front-end expects to receive an object with the fields as keys and the values as properties. Example:
data:
{
    unique_id: 1001,
    client: "client1",
    material: "concrete",
    date: "2021-04-13T00:00:00.000Z"
},
{
    unique_id: 1002,
    client: "client2",
    material: "wood",
    date: "2021-04-13T00:00:00.000Z"
},
etc..
Is there any way to get the Field as well as the Value when using Index values, or will it always return an Array (and not an object)?
Could I use a Lambda or something for this?
I do have another Query that uses Map and Lambda to good effect, and returns the entire document, including the Ref and Data fields:
Map(
    Paginate(
        Match(Index("orders_by_date"), date)
    ),
    Lambda('item', Get(Var('item')))
)
This works very nicely but unfortunately, it also performs one Get request per Document returned and that seems very inefficient.
This new Index I'm wanting to build, to filter by Order Status, will be used to return hundreds of Orders, hundreds of times a day. So I'm trying to keep it as efficient as possible, but if it can only return an Array it won't be useful.
Thanks in advance!! Indexes are great but hard to grasp, so any insight will be appreciated.
You didn't show us exactly what you have done, so here's an example demonstrating that filtering on boolean values does work, using the index you created as-is:
> CreateCollection({ name: "orders" })
{
  ref: Collection("orders"),
  ts: 1618350087320000,
  history_days: 30,
  name: 'orders'
}
> Create(Collection("orders"), { data: {
    unique_id: 1,
    client: "me",
    material: "stone",
    date: Now(),
    status: true
  }})
{
  ref: Ref(Collection("orders"), "295794155241603584"),
  ts: 1618350138800000,
  data: {
    unique_id: 1,
    client: 'me',
    material: 'stone',
    date: Time("2021-04-13T21:42:18.784Z"),
    status: true
  }
}
> Create(Collection("orders"), { data: {
    unique_id: 2,
    client: "you",
    material: "muslin",
    date: Now(),
    status: false
  }})
{
  ref: Ref(Collection("orders"), "295794180038328832"),
  ts: 1618350162440000,
  data: {
    unique_id: 2,
    client: 'you',
    material: 'muslin',
    date: Time("2021-04-13T21:42:42.437Z"),
    status: false
  }
}
> CreateIndex({
    name: "orders_all_by_open_asc",
    unique: false,
    serialized: true,
    source: Collection("orders"),
    terms: [{ field: ["data", "status"] }],
    values: [
      { field: ["data", "unique_id"] },
      { field: ["data", "client"] },
      { field: ["data", "material"] },
      { field: ["data", "date"] }
    ]
  })
{
  ref: Index("orders_all_by_open_asc"),
  ts: 1618350185940000,
  active: true,
  serialized: true,
  name: 'orders_all_by_open_asc',
  unique: false,
  source: Collection("orders"),
  terms: [ { field: [ 'data', 'status' ] } ],
  values: [
    { field: [ 'data', 'unique_id' ] },
    { field: [ 'data', 'client' ] },
    { field: [ 'data', 'material' ] },
    { field: [ 'data', 'date' ] }
  ],
  partitions: 1
}
> Paginate(Match(Index("orders_all_by_open_asc"), true))
{ data: [ [ 1, 'me', 'stone', Time("2021-04-13T21:42:18.784Z") ] ] }
> Paginate(Match(Index("orders_all_by_open_asc"), false))
{ data: [ [ 2, 'you', 'muslin', Time("2021-04-13T21:42:42.437Z") ] ] }
It's a little more work, but you can compose whatever return format that you like:
> Map(
    Paginate(Match(Index("orders_all_by_open_asc"), false)),
    Lambda(
      ["unique_id", "client", "material", "date"],
      {
        unique_id: Var("unique_id"),
        client: Var("client"),
        material: Var("material"),
        date: Var("date"),
      }
    )
  )
{
  data: [
    {
      unique_id: 2,
      client: 'you',
      material: 'muslin',
      date: Time("2021-04-13T21:42:42.437Z")
    }
  ]
}
It's still an array of results, but each result is now an object with the appropriate field names.
Not too familiar with FQL, but I am somewhat familiar with SQL languages. Essentially, database languages usually treat all of your values as strings until they don't need to anymore. Instead, your query should use the string value that FQL is expecting; I believe it should be "OPEN" or "CLOSED" in your case. You can simply have an if statement in your JavaScript code to determine whether to search for "OPEN" or "CLOSED".
To answer your second question: I don't know for FQL, but if that is what is returned, then your approach with a lambda seems fine. There's not much else you can do from your end other than hope for a different way to get entries in API form somewhere in the future. At the end of the day, an O(n) operation in this context is not too bad, and only having to return a hundred or so orders shouldn't be the most painful thing in the world.
If you are truly worried about this, you can break up the request into portions, so you return only the first 100, then when the front-end wants the next set, you send the next 100. You can cache the results too, to make it feel very fast from the front-end perspective.
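For the chunked approach, FQL's Paginate supports this directly: pass a size option, and the result includes an after cursor for the next page. A sketch against the index above, fetching the first 100:

Paginate(Match(Index("orders_all_by_open_asc"), true), { size: 100 })

and then, using the after cursor returned by that call (shown here as cursor) to fetch the next 100:

Paginate(Match(Index("orders_all_by_open_asc"), true), { size: 100, after: cursor })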
Another suggestion, maybe I am wrong and failed at searching the docs, but I will post anyway just in case it's helpful.
My index was failing to return objects; the example data here is the client field:
"data": {
"status": "LIVRAISON",
"open": true,
"unique_id": 1001,
"client": {
"name": "TEST1",
"contact_name": "Bob",
"email": "bob#client.com",
"phone": "555-555-5555"
Here, the client field was returned as null even though it was specified in the index.
From reading the docs, here: https://docs.fauna.com/fauna/current/api/fql/indexes?lang=javascript#value
In the Value Objects section, I was able to understand that for objects, the index must define one field entry per object key. Example for my data:
{ field: ['data', 'client', 'name'] },
{ field: ['data', 'client', 'contact_name'] },
{ field: ['data', 'client', 'email'] },
{ field: ['data', 'client', 'phone'] },
This was slightly confusing, because my beginner brain expected that defining the 'client' field would simply return the entire object, like so:
{ field: ['data', 'client'] },
The only part about this in the docs was this sentence: The field ["data", "address", "street"] refers to the street field contained in an address object within the document’s data object.
This is enough information, but maybe it deserves its own section with a longer example? Of course the single sentence works, but a sub-section called 'Adding Objects to Fields' or something would make it extra clear.
Hoping my moments of confusion will help out. Loving FaunaDB so far, keep up the great work :)

Using if/then/else schema validation conditional on values in other JSON objects

I'm trying to use if/then/else to apply three different schemas to an object, based on the value of a JSON field outside of that object. I've only seen examples of people making their conditionals based on fields within the same object.
Here's what I've tried:
schema: {
    'MyOtherObject': {
        '$id': '#/properties/MyOtherObject',
        'type': 'object',
        'additionalProperties': false,
        'title': 'The MyOtherObject Schema',
        'required': [
            'PaymentMethod'
        ],
        'properties': {
            'PaymentMethod': {
                '$id': '#/properties/Deal/properties/PaymentMethod',
                'type': 'string',
                'title': 'The PaymentMethod Schema',
                'default': '',
                'examples': [
                    'Cash'
                ],
                'pattern': '^(.*)$'
            },
            ...
    },
    'MyConditionalSchemaObject': {
        '$id': '#/properties/MyConditionalSchemaObject',
        'type': 'object',
        'additionalProperties': false,
        'title': 'The MyConditionalSchemaObject Schema',
        'if': {
            // Trying to get at the value above but don't know how... my schema validations still failing
            '2/MyOtherObject/properties/PaymentMethod': { 'const': 'Cash' }
        },
        'then': {
            'required': [
                'PaymentStartDate'
            ],
            'properties': {
                'BankName': {
                    '$id': '#/properties/Financing/properties/BankName',
                    'type': ['string', 'null'],
                    'title': 'The Bankname Schema',
                    'default': '',
                    'examples': [
                        'Capital One Auto Finance'
                    ],
                    ...
                }
            }
I would expect this code to check the value of the PaymentMethod field in the MyOtherObject JSON object and then apply the schema when the conditional check passes (i.e. it equals Cash), but I'm still getting schema validation errors saying Should not be additional properties 'BankName', implying that the schema in the "then" block is not being used for validation.
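One thing worth noting here: in JSON Schema, an if subschema is evaluated against the same instance location as the schema it sits in, so an if inside MyConditionalSchemaObject cannot look over at a sibling object. The usual workaround is to lift the if/then up to the common parent schema, where both objects are visible. A minimal sketch of that shape (property names reused from the question, everything else illustrative):

{
    "type": "object",
    "properties": {
        "MyOtherObject": { ... },
        "MyConditionalSchemaObject": { ... }
    },
    "if": {
        "properties": {
            "MyOtherObject": {
                "properties": {
                    "PaymentMethod": { "const": "Cash" }
                }
            }
        }
    },
    "then": {
        "properties": {
            "MyConditionalSchemaObject": {
                "required": ["PaymentStartDate"]
            }
        }
    }
}

Because the if/then now applies to the parent object, its subschemas can reference one child in the condition and constrain the other in the consequence.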

Generate items in Ext.dataview.List from hasMany models the MVC way

I have a Blog model with hasMany Posts (and many other fields). Now I want to list these posts in a List view like this:
[My post #1]
[My post #2]
[My post #3]
As far as the API describes, I'm able to pass either a store or a data attribute to Ext.dataview.List. But I was not able to find out how to pass the hasMany records to the list so that it displays an item for each of them.
Do I really have to create another store? Isn't it possible to configure my dataview with something like store: 'Blog.posts' or data: 'Blog.posts' or even records: 'Blog.posts'?
Extend the dataview.List and define the itemTpl to loop through the posts:
itemTpl: new Ext.XTemplate(
    '<tpl for="posts">',
    '<div>{TheBlogPost}</div>',
    '</tpl>'
)
As @Adam Marshall said, this doesn't work as easily as I imagined.
Sencha autogenerates stores from associations, if you know how to access them.
So you can simply switch out the list's store for the autogenerated "substore" once it has loaded.
This approach probably has some problems, e.g. when the ListPaging plugin is used, but it is quick.
Example:
MODELS
Ext.define('Conversation', {
    extend: 'Ext.data.Model',
    config: {
        fields: [],
        associations: [
            {
                type: 'hasMany',
                model: "Message",
                name: "messages",
                associationKey: 'messages'
            }
        ]
    }
});
Ext.define('Message', {
    extend: "Ext.data.Model",
    config: {
        idProperty: 'id_message',
        fields: [
            { name: 'text', type: 'string' },
            { name: 'date', type: 'string' },
            { name: 'id_message', type: 'int' },
            { name: 'me', type: 'int' }   // actually boolean
        ]
    }
});
JSON
[
    {
        "messages": [
            { "id_message": 11761, "date": 1378033041, "me": 1, "text": "iiii" },
            { "id_message": 11762, "date": 1378044866, "me": 1, "text": "hallo" }
        ]
    }
]
CONTROLLER
this.getList().getStore().load({
    callback: function(records, operation, success) {
        // IMPORTANT LINE HERE: swap the list's store for the
        // autogenerated association store of the first record
        this.getList().setStore(Ext.getStore(this.getList().baseStore).getAt(0).messages());
    },
    scope: this
});
LIST-VIEW
{
    flex: 1,
    xtype: 'list',
    itemId: 'ConversationList',
    data: [],
    store: 'ConversationStore',
    baseStore: 'ConversationStore',   // custom property so the controller can find the original store
    itemTpl:
        '<div> {[app.util.Helpers.DateFromTimestamp(values.date)]}<br><b>{name}</b>' +
        ' {[app.util.Helpers.fixResidualHtml(values.text)]} </div>'
},

ExtJS4 - Load Store From Another Grid

I am trying to load a JSON store when I click on a particular row in another grid. Can anyone see what I am doing wrong here? In ext-all.js the error comes back as "data is undefined" (from the debugger).
Ext.define('Documents', {
    extend: 'Ext.data.Model',
    fields: [
        { name: 'index', type: 'int' },
        { name: 'path', type: 'string' }
    ]
});
var documents = new Ext.data.JsonStore({
    model: 'Documents',
    root: 'groupdocuments',
    autoLoad: false
});
// in the Ext.grid.Panel
listeners: {
    itemclick: function (view, rec) {   // the clicked record arrives as the second argument
        var itemgroupid = rec.get('groupid');
        Ext.Ajax.request({
            url: '/GetDocuments',
            params: { groupId: itemgroupid },
            success: function (result) {
                var jsondata = Ext.decode(result.responseText);
                documents.loadData(jsondata);
            }
        });
    }
}
// the sample json returned from url
// { "groupdocuments": [{ "index": 1, "path": "1.doc" }, { "index": 2, "path": "2.doc" }, { "index": 3, "path": "3.doc" }] }
It looks like you need to escape the path data; it should be { path: "C:\\something\\" }.
Also, why not use the grid Grouping feature?
http://docs.sencha.com/ext-js/4-0/#!/api/Ext.grid.feature.Grouping
Looking further, it seems the loadData function expects an array, not a JSON object with a root object like you are giving it. Change the listener to the following:
var jsondata = Ext.decode(result.responseText);
documents.loadData(jsondata.groupdocuments);
http://docs.sencha.com/ext-js/4-0/#!/api/Ext.data.Store-method-loadData
Alternatively, you should be able to use loadRawData with the full JSON object.
http://docs.sencha.com/ext-js/4-0/#!/api/Ext.data.Store-method-loadRawData
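For instance, the success handler could become (a sketch of that alternative; loadRawData runs the store's configured reader, so the root: 'groupdocuments' setting does the unwrapping):

success: function (result) {
    // the reader applies root: 'groupdocuments' itself,
    // so the decoded object can be passed through unchanged
    documents.loadRawData(Ext.decode(result.responseText));
}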