How to convert a pandas DataFrame to a dictionary of lists of dictionaries? - pandas

I have a movie link dataframe and I want it to look like this:
{
'tt0051744': [
{
'title': 'Torrent',
'infoHash': '9f86563ce2ed86bbfedd5d3e9f4e55aedd660960'
}
],
'tt1254207': [
{
'title': 'HTTP URL',
'url': 'http://clips.vorwaerts-gmbh.de/big_buck_bunny.mp4'
}
],
'tt0137523': [
{
'title': 'External URL',
'externalUrl': 'https://www.netflix.com/watch/26004747'
}
]
}
I tried to use:
df.to_json('filmes.json', orient='records')
But it did not work.
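A possible approach, sketched under assumptions: `to_json(orient='records')` writes a flat list of row objects, so it cannot produce a dictionary keyed by movie ID. The question does not show the DataFrame's columns, so the column names below (`imdb_id`, `title`, `infoHash`, `url`, `externalUrl`) and the helper name `frame_to_nested` are hypothetical. The idea is to group the rows by the ID column and drop the fields that are empty for a given row.

```python
import json

import pandas as pd

# Assumed layout: one row per link, with an ID column named "imdb_id".
# The real DataFrame's column names are not shown in the question.
df = pd.DataFrame([
    {"imdb_id": "tt0051744", "title": "Torrent",
     "infoHash": "9f86563ce2ed86bbfedd5d3e9f4e55aedd660960"},
    {"imdb_id": "tt1254207", "title": "HTTP URL",
     "url": "http://clips.vorwaerts-gmbh.de/big_buck_bunny.mp4"},
    {"imdb_id": "tt0137523", "title": "External URL",
     "externalUrl": "https://www.netflix.com/watch/26004747"},
])


def frame_to_nested(frame, key_col="imdb_id"):
    """Group rows by key_col into {key: [row_dict, ...]}, skipping missing values."""
    result = {}
    for record in frame.to_dict(orient="records"):
        key = record.pop(key_col)
        # Columns that do not apply to this row hold NaN; leave them out.
        entry = {name: value for name, value in record.items() if pd.notna(value)}
        result.setdefault(key, []).append(entry)
    return result


nested = frame_to_nested(df)
print(json.dumps(nested, indent=2))
```

To write the result to a file instead of printing it, `json.dump(nested, open_file)` can replace the last line.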

python: create directory structure in Json format from s3 bucket objects

I am getting the objects in an S3 bucket using the following:
import boto3

s3 = boto3.resource(
    service_name='s3',
    aws_access_key_id=key_id,
    aws_secret_access_key=secret
)
for summary_obj in s3.Bucket(bucket_name).objects.all():
    print(summary_obj.key)
It gives me all the object keys, like this:
'sub1/sub1_1/file1.zip',
'sub1/sub1_2/file2.zip',
'sub2/sub2_1/file3.zip',
'sub3/file4.zip',
'sub4/sub4_1/file5.zip',
'sub5/sub5_1/file6.zip',
'sub5/sub5_2/file7.zip',
'sub5/sub5_3/file8.zip',
'sub6/'
But I want a JSON list of all the objects that reflects the directory structure, like this, to show in my app:
[
{'sub1': [
{
'sub1_1': ['file1.zip'] // All files in sub1_1 folder
},
{
'sub1_2': ['file2.zip'] // All files in sub1_2 folder
},
]},
{'sub2': [
{
'sub2_1': [
'file3.zip'
]
}
]},
{'sub3': [
'file4.zip'
]},
{'sub4': [
{
'sub4_1': [
'file5.zip'
]
}
]},
{'sub5': [
{
'sub5_1': [
'file6.zip'
]
},
{
'sub5_2': [
'file7.zip'
]
},
{
'sub5_3': [
'file8.zip'
]
}
]},
{'sub6': []}
]
What is the best way to do this in Python 3.8?
I gave it a try, and the closest I could get to your JSON was through recursion, which works with any depth of folders and sub-folders:
import json
from collections import defaultdict

objects = ['sub1/sub1_1/file1.zip',
           'sub1/sub1_2/file2.zip',
           'sub2/sub2_1/file3.zip',
           'sub3/file4.zip',
           'sub4/sub4_1/file5.zip',
           'sub5/sub5_1/file6.zip',
           'sub5/sub5_2/file7.zip',
           'sub5/sub5_3/file8.zip',
           'sub5/sub5_3/file9.zip',
           'sub5/sub5_3/sub5_4/file1.zip',
           'sub5/sub5_3/sub5_4/file2.zip',
           'sub6/']

def construct_dict(in_list, accumulator):
    # Walk the path components, creating one nested dict per component.
    if not in_list:
        return
    if in_list[0] not in accumulator:
        accumulator[in_list[0]] = defaultdict(list)
    return construct_dict(in_list[1:], accumulator[in_list[0]])

accumulator = defaultdict(list)
for obj in objects:
    construct_dict(obj.split('/'), accumulator)
print(json.dumps(accumulator))
Which gives (the content is the same, but the structure is a bit different):
{
"sub1": {
"sub1_1": {
"file1.zip": {}
},
"sub1_2": {
"file2.zip": {}
}
},
"sub2": {
"sub2_1": {
"file3.zip": {}
}
},
"sub3": {
"file4.zip": {}
},
"sub4": {
"sub4_1": {
"file5.zip": {}
}
},
"sub5": {
"sub5_1": {
"file6.zip": {}
},
"sub5_2": {
"file7.zip": {}
},
"sub5_3": {
"file8.zip": {},
"file9.zip": {},
"sub5_4": {
"file1.zip": {},
"file2.zip": {}
}
}
},
"sub6": {
"": {}
}
}
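For comparison, here is a minimal sketch that aims at the exact shape asked for in the question: folders become single-key dictionaries holding a list, and files become plain strings. The helper names `build_tree` and `to_listing` are made up for this example. It assumes no key is used both as a file and as a folder prefix, and it treats a key ending in `/` (such as `sub6/`) as an empty-folder placeholder.

```python
import json


def build_tree(keys):
    """Turn 'a/b/file' style keys into nested dicts; files map to None."""
    tree = {}
    for key in keys:
        parts = key.split('/')
        node = tree
        for folder in parts[:-1]:
            node = node.setdefault(folder, {})
        # An empty last component means the key was a folder placeholder ('sub6/').
        if parts[-1]:
            node[parts[-1]] = None
    return tree


def to_listing(tree):
    """Convert the nested dicts into the list-of-dicts layout from the question."""
    listing = []
    for name, child in tree.items():
        if child is None:
            listing.append(name)  # a file
        else:
            listing.append({name: to_listing(child)})  # a folder
    return listing


keys = [
    'sub1/sub1_1/file1.zip',
    'sub1/sub1_2/file2.zip',
    'sub2/sub2_1/file3.zip',
    'sub3/file4.zip',
    'sub6/',
]
listing = to_listing(build_tree(keys))
print(json.dumps(listing, indent=2))
```

With the S3 loop from the question, `keys` would be the collected `summary_obj.key` values.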

FaunaDB get entries by date range with index binding not working

I am struggling to get an Index by Date to work with a Range.
I have this collection called orders:
CreateCollection({name: "orders"})
And I have these sample entries, each with one attribute called mydate. As you can see, it is just a string. I do need to store the date as a string: my DB already has around 12K records with dates like that, so I can't just start using Date() to create them.
Create(Collection("orders"), {data: {"mydate": "2020-07-10"}})
Create(Collection("orders"), {data: {"mydate": "2020-07-11"}})
Create(Collection("orders"), {data: {"mydate": "2020-07-12"}})
I have created this index, which converts the date string to an actual Date object:
CreateIndex({
name: "orders_by_my_date",
source: [
{
collection: Collection("orders"),
fields: {
date: Query(Lambda("order", Date(Select(["data", "mydate"], Var("order"))))),
},
},
],
terms: [
{
binding: "date",
},
],
});
If I try to fetch a single date the index works.
// this works
Paginate(
Match(Index("orders_by_my_date"), Date("2020-07-10"))
);
// ---
{
data: [Ref(Collection("orders"), "278496072502870530")]
}
But when I try to get a Range it never finds data.
// This does NOT work :(
Paginate(
Range(Match(Index("orders_by_my_date")), Date("2020-07-09"), Date("2020-07-15"))
);
// ---
{
data: []
}
Why does the index not work with Range?
Range operates on the values of an index, not on the terms.
See: https://docs.fauna.com/fauna/current/api/fql/functions/range?lang=javascript
You need to change your index definition to:
CreateIndex({
name: "orders_by_my_date",
source: [
{
collection: Collection("orders"),
fields: {
date: Query(Lambda("order", Date(Select(["data", "mydate"], Var("order"))))),
},
},
],
values: [
{ binding: "date" },
{ field: ["ref"] },
],
})
Then you can get the results that you expect:
> Paginate(Range(Match(Index('orders_by_my_date')), Date('2020-07-11'), Date('2020-07-15')))
{
data: [
[
Date("2020-07-11"),
Ref(Collection("orders"), "278586211497411072")
],
[
Date("2020-07-12"),
Ref(Collection("orders"), "278586213229658624")
],
[
Date("2020-07-13"),
Ref(Collection("orders"), "278586215000703488")
],
[
Date("2020-07-14"),
Ref(Collection("orders"), "278586216887091712")
],
[
Date("2020-07-15"),
Ref(Collection("orders"), "278586218585784832")
]
]
}
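The difference between terms and values in the answer above can be pictured with a toy in-memory model. This is only an illustration of the idea, not how Fauna is implemented: `ToyIndex`, `match`, and `value_range` are hypothetical names, terms are modeled as a dictionary lookup, and values as a sorted list that a range can slice.

```python
from bisect import bisect_left, bisect_right
from datetime import date


class ToyIndex:
    """Toy model: terms pick a bucket, values are stored sorted inside it."""

    def __init__(self, docs, term_fn, value_fn):
        self.buckets = {}
        for doc in docs:
            self.buckets.setdefault(term_fn(doc), []).append(value_fn(doc))
        for entries in self.buckets.values():
            entries.sort()

    def match(self, *terms):
        # Exact lookup on the terms; no terms means the empty tuple.
        return self.buckets.get(terms, [])


def value_range(entries, start, end):
    """Inclusive range over the first element of each sorted value tuple."""
    firsts = [entry[0] for entry in entries]
    return entries[bisect_left(firsts, start):bisect_right(firsts, end)]


def as_date(doc):
    return date.fromisoformat(doc["mydate"])


docs = [
    {"ref": 1, "mydate": "2020-07-10"},
    {"ref": 2, "mydate": "2020-07-11"},
    {"ref": 3, "mydate": "2020-07-12"},
]

# Date as a term: an exact match works, but there is nothing to range over.
by_term = ToyIndex(docs, lambda d: (as_date(d),), lambda d: (d["ref"],))
# Date as a value: one bucket, sorted by date, so a range can slice it.
by_value = ToyIndex(docs, lambda d: (), lambda d: (as_date(d), d["ref"]))

start, end = date(2020, 7, 9), date(2020, 7, 15)
print(by_term.match(date(2020, 7, 10)))
print(value_range(by_term.match(), start, end))
print(value_range(by_value.match(), start, end))
```

In this model the term-based index answers the single-date lookup but returns nothing for the range, while the value-based index returns all three entries.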
Another alternative is to use a filter with a lambda expression to validate which values you want
Filter(
Paginate(Documents(Collection('orders'))),
Lambda('order',
And(
GTE(Select(['data', 'mydate'], Var('order')), '2020-07-09'),
LTE(Select(['data', 'mydate'], Var('order')), '2020-07-15')
)
)
)
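You can adjust the conditions as needed.
I believe this will work with the strings you already have: because the dates are stored in YYYY-MM-DD form, comparing them as strings orders them the same way as comparing them as dates.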
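The filter above compares the stored strings directly. A short Python check, outside of Fauna, suggests why that is reasonable for zero-padded YYYY-MM-DD strings: a string comparison selects the same entries as a real date comparison. The sample values and function names are made up for the illustration.

```python
from datetime import date

# Hypothetical stored values, deliberately unsorted and partly out of range.
stored = ["2020-07-12", "2020-07-08", "2020-07-10", "2020-07-16", "2020-07-11"]
start, end = "2020-07-09", "2020-07-15"


def in_range_as_strings(values, lo, hi):
    """Keep values whose string form lies in [lo, hi]."""
    return [v for v in values if lo <= v <= hi]


def in_range_as_dates(values, lo, hi):
    """Keep values whose parsed date lies in [lo, hi]."""
    lo_d, hi_d = date.fromisoformat(lo), date.fromisoformat(hi)
    return [v for v in values if lo_d <= date.fromisoformat(v) <= hi_d]


by_string = in_range_as_strings(stored, start, end)
by_date = in_range_as_dates(stored, start, end)
print(by_string == by_date, by_string)
```

This only holds when every string uses the same zero-padded format; a value such as "2020-7-9" would break the ordering.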
There are some mistakes here. First of all, you have to create the documents this way:
Create(Collection("orders"), {data: {"mydate": ToDate("2020-07-10")}})
The index has to be created like this:
CreateIndex(
{
name: "orders_by_my_date",
source: Collection("orders"),
values:[{field:['data','mydate']},{field:['ref']}]
}
)
and finally, you can query your index and range:
Paginate(Range(Match('orders_by_my_date'),[Date("2020-07-09")], [Date("2020-07-15")]))
{ data:
[ [ Date("2020-07-10"),
Ref(Collection("orders"), "278532030954734085") ],
[ Date("2020-07-11"),
Ref(Collection("orders"), "278532033804763655") ],
[ Date("2020-07-12"),
Ref(Collection("orders"), "278532036737630725") ] ] }
or if you want to get the full doc:
Map(Paginate(Range(Match('orders_by_my_date'),[Date("2020-07-09")], [Date("2020-07-15")])),Lambda(['date','ref'],Get(Var('ref'))))
{ data:
[ { ref: Ref(Collection("orders"), "278532030954734085"),
ts: 1601887694290000,
data: { mydate: Date("2020-07-10") } },
{ ref: Ref(Collection("orders"), "278532033804763655"),
ts: 1601887697015000,
data: { mydate: Date("2020-07-11") } },
{ ref: Ref(Collection("orders"), "278532036737630725"),
ts: 1601887699800000,
data: { mydate: Date("2020-07-12") } } ] }

Points not showing on map using the TimestampedGeoJson Folium plugin

I followed https://nbviewer.jupyter.org/github/python-visualization/folium/blob/master/examples/Plugins.ipynb to create my own map using the TimestampedGeoJson Folium plugin. The time slider works, but the points aren't displayed on the map. I am using Pune city coordinates; the aim is to display multipoint coordinates with a changing icon and popup, with time-slider functionality over a month.
import folium
from folium import plugins

points = [
{
'time': '2019-09-01',
'popup': '<h1>address1</h1>',
'coordinates': [18.528387, 73.874251]
},
{
'time': '2019-09-02',
'popup': '<h1>address1</h1>',
'coordinates': [18.456863, 73.801601]
},
{
'time': '2019-09-03',
'popup':'<h1>address1</h1>',
'coordinates': [18.527615, 73.872384]
},
{
'time': '2019-09-04',
'popup': '<h1>address1</h1>',
'coordinates': [18.528387, 73.874251]},
{
'time': '2019-09-05',
'popup': '<h1>address1</h1>',
'coordinates': [18.456863, 73.801601]}]
features = [
{
'type': 'Feature',
'geometry': {
'type': 'Point',
'coordinates': point['coordinates'],
},
'properties': {
'time': point['time'],
'popup': point['popup']
}
} for point in points]
features.append(
{
'type': 'Feature',
'geometry': {
'type': 'LineString',
'coordinates':[
[18.528387, 73.874251],
[18.456863, 73.801601],
[18.527615, 73.872384],
[18.528387, 73.874251],
[18.456863, 73.801601]
] ,
},
'properties': {
'popup': 'Current address',
'times': [
'2019-09-01',
'2019-09-02',
'2019-09-03',
'2019-09-04',
'2019-09-05'
]
}
})
m = folium.Map(
location=[18.5204,73.8567],
tiles='cartodbpositron',
zoom_start=10,)
plugins.TimestampedGeoJson(
{
'type': 'FeatureCollection',
'features': features
},
auto_play=False,
loop=False,
#max_speed=1,
loop_button=True,
date_options='YYYY/MM/DD',
#time_slider_drag_update=True,
duration='P2D').add_to(m)
This is the output of the code in Jupyter.
The locations are being rendered on the map; try zooming out from the map's current view. I suspect the issue is the order of the latitude/longitude values in your coordinates: GeoJSON expects [longitude, latitude], whereas the location argument of folium.Map takes [latitude, longitude].
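A minimal sketch of the fix suggested above, assuming the points are kept as [latitude, longitude] pairs as in the question: swap each pair before it goes into the GeoJSON geometry. The helper names are hypothetical, and the snippet builds only the feature dictionaries, so it runs without folium installed.

```python
# Two of the points from the question, stored as [latitude, longitude].
points = [
    {"time": "2019-09-01", "popup": "<h1>address1</h1>",
     "coordinates": [18.528387, 73.874251]},
    {"time": "2019-09-02", "popup": "<h1>address1</h1>",
     "coordinates": [18.456863, 73.801601]},
]


def to_geojson_position(lat_lon):
    """GeoJSON positions are [longitude, latitude], so swap the pair."""
    lat, lon = lat_lon
    return [lon, lat]


def point_feature(point):
    """Build a GeoJSON Point feature with coordinates in GeoJSON order."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": to_geojson_position(point["coordinates"]),
        },
        "properties": {"time": point["time"], "popup": point["popup"]},
    }


features = [point_feature(p) for p in points]
print(features[0]["geometry"]["coordinates"])
```

The LineString coordinates in the question would need the same swap, while the `location` passed to `folium.Map` stays in [latitude, longitude] order.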

unable to use create_job function using s3 control for batch operations

I am unable to understand how to fill in the fields of the create_job function (specifically the Manifest parameter). I would really appreciate it if someone could give a real example of a create_job call, as I could not find one on the internet.
What I need to do is add tags to multiple objects at once.
The code I have written and understood so far is below:
client = boto3.client('s3control')
response = client.create_job(
AccountId='682283364620',
Operation={
'S3PutObjectTagging': {
'TagSet': [
{
'Key': 'naturalnumber',
'Value': 'yo'
},
]
}
},
Report={
'Bucket': 'shivam1052061',
'Format': 'Report_CSV_20180820',
'Enabled': True,
'Prefix': 'string',
'ReportScope': 'AllTasks'
},
ClientRequestToken='',
Manifest={
'Spec': {
'Format': 'S3BatchOperations_CSV_20180820',
'Fields': [
'Ignore'|'Bucket'|'Key'|'VersionId',
]
},
'Location': {
'ObjectArn': 'string',
'ObjectVersionId': 'string',
'ETag': 'string'
}
},
Description='string',
Priority=123,
RoleArn='string'
)
You can use the manifest file that S3 Inventory generates, referenced by its object ARN and ETag:
Manifest={
'Spec': {
'Format': 'S3InventoryReport_CSV_20161130'
},
'Location': {
'ObjectArn': 'arn:aws:s3:::bucket/report/2020-08-17T00-00Z/manifest.json',
'ETag': 'xxxxxxxxx'
}
}
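If no S3 Inventory report is available, the CSV manifest format named in the question (`S3BatchOperations_CSV_20180820`) is the other option. The sketch below only prepares the pieces locally: it builds the CSV text (one `bucket,key` row per object, with keys URL-encoded) and the matching `Manifest` dictionary. The helper names, the bucket name, the ARN, and the ETag value are placeholders; uploading the file and calling `create_job` are left out.

```python
import csv
import io
from urllib.parse import quote


def build_csv_manifest(bucket, keys):
    """Return CSV manifest text: one 'bucket,url-encoded-key' row per object."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for key in keys:
        writer.writerow([bucket, quote(key)])
    return buffer.getvalue()


def manifest_param(manifest_arn, etag):
    """Build the Manifest argument for a CSV manifest with Bucket and Key columns."""
    return {
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {"ObjectArn": manifest_arn, "ETag": etag},
    }


# Placeholder names for illustration only.
manifest_text = build_csv_manifest("example-bucket", ["photos/cat.jpg", "docs/my file.txt"])
manifest = manifest_param(
    "arn:aws:s3:::example-bucket/manifests/manifest.csv",
    "etag-of-uploaded-manifest",
)
print(manifest_text)
print(manifest)
```

The ETag has to be the one S3 reports for the uploaded manifest object, for example from the response of the upload call.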

Generate items in Ext.dataview.List from hasMany models the MVC way

I have a Blog model with hasMany Posts (and many other fields). Now I want to list these posts in a List-view like that:
[My post #1]
[My post #2]
[My post #3]
As far as the API describes, I'm able to pass either a store or a data attribute to Ext.dataview.List. But I was not able to find out how to pass the hasMany records to the list so it will display an item for each of them.
Do I really have to create another store? Isn't it possible to configure my dataview to something like store: 'Blog.posts' or data: 'Blog.posts' or even records: 'Blog.posts'?
Extend dataview.List and define an itemTpl that loops through the posts:
itemTpl: new Ext.XTemplate(
'<tpl for="posts">',
'<div>{TheBlogPost}</div>',
'</tpl>'
)
As @Adam Marshall said, this doesn't work as easily as I imagined.
Sencha autogenerates stores from associations if you know how to access them.
So you simply can switch out the list's store for the autogenerated "substore" when it has loaded.
This approach probably has some problems, e.g. when listpaging plugin is used, but it is quick.
Example:
MODELS
Ext.define('Conversation', {
extend: 'Ext.data.Model',
config: {
fields: [
],
associations:
[
{
type: 'hasMany',
model: "Message",
name: "messages",
associationKey: 'messages'
}
]
}
});
Ext.define('Message' ,
{
extend: "Ext.data.Model",
config: {
idProperty: 'id_message',
fields: [
{ name: 'text', type: 'string' },
{ name: 'date', type: 'string' },
{ name: 'id_message', type: 'int' },
{ name: 'me', type: 'int'} // actually boolean
]
}
}
);
JSON
[
{
"messages": [
{"id_message": 11761, "date": 1378033041, "me": 1, "text": "iiii"},
{"id_message": 11762, "date": 1378044866, "me": 1, "text": "hallo"}
]}
]
CONTROLLER
this.getList().getStore().load(
{
callback: function(records, operation, success) {
//IMPORTANT LINE HERE:
this.getList().setStore(Ext.getStore(this.getList().baseStore).getAt(0).messages());
},
scope: this
}
);
LIST-VIEW
{
flex: 1,
xtype: 'list',
itemId: 'ConversationList',
data: [],
store: 'ConversationStore',
baseStore: 'ConversationStore',
itemTpl:
'<div>{[app.util.Helpers.DateFromTimestamp(values.date)]}<br><b>{name}</b>' +
' {[app.util.Helpers.fixResidualHtml(values.text)]} </div>'
},