Pandas - Extract value from Dataframe based on certain key value - pandas

I have a Dataframe in the below format:
id, ref
101, [{'id': '74947', 'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]
102, [{'id': '74948', 'type': {'id': '105', 'name': 'Return', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}}]
103, [{'id': '74949', 'type': {'id': '106', 'name': 'Sales', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]
I am trying to extract rows that have name = Sales and return back the below output:
id, value
101, Prod-A
103, Prod-B

Use str[0] to take the first element of each list, then chain Series.str.get to look up values by dict key:
# if necessary, convert the string repr of each list/dict into real Python objects
import ast
df['ref'] = df['ref'].apply(ast.literal_eval)
df['names'] = df['ref'].str[0].str.get('type').str.get('name')
df['value'] = df['ref'].str[0].str.get('inwardIssue').str.get('key')
print (df)
id ref names value
0 101 [{'id': '74947', 'type': {'id': '104', 'name':... Sales Prod-A
1 102 [{'id': '74948', 'type': {'id': '105', 'name':... Return Prod-C
2 103 [{'id': '74949', 'type': {'id': '106', 'name':... Sales Prod-B
And then filter by boolean indexing:
df1 = df.loc[df['names'].eq('Sales'), ['id','value']]
print (df1)
id value
0 101 Prod-A
2 103 Prod-B
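For reference, here is a minimal self-contained sketch of the steps above. The sample frame is rebuilt by hand with trimmed dicts, and the ref column is stored as strings only to exercise the optional ast.literal_eval step; both are assumptions about how the data was loaded, not something the question states.

```python
import ast
import pandas as pd

# Hand-built sample; 'ref' is a string column here (an assumption) so that
# the literal_eval conversion has something to do.
df = pd.DataFrame({
    "id": [101, 102, 103],
    "ref": [
        "[{'id': '74947', 'type': {'id': '104', 'name': 'Sales'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]",
        "[{'id': '74948', 'type': {'id': '105', 'name': 'Return'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}}]",
        "[{'id': '74949', 'type': {'id': '106', 'name': 'Sales'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]",
    ],
})

# Turn each string into a real list of dicts.
df["ref"] = df["ref"].apply(ast.literal_eval)

# First element of every list, then nested dict lookups.
first = df["ref"].str[0]
df["names"] = first.str.get("type").str.get("name")
df["value"] = first.str.get("inwardIssue").str.get("key")

# Keep only the rows whose first entry is a Sales entry.
df1 = df.loc[df["names"].eq("Sales"), ["id", "value"]]
print(df1)
```

Note that this only inspects the first dict in each list, so a Sales entry in a later position would be missed.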

Related

Pandas - Extract value from Dataframe based on certain key value not in a sequence

I have a Dataframe in the below format:
id, ref
101, [{'id': '74947', 'type': {'id': '104', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-A'}}]
102, [{'id': '74948', 'type': {'id': '105', 'name': 'Return', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-C'}},
{'id': '750001', 'type': {'id': '342', 'name': 'Sales', 'inward': 'Sales', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-X'}}]
103, [{'id': '74949', 'type': {'id': '106', 'name': 'Sales', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-B'}}]
104, [{'id': '67543', 'type': {'id': '106', 'name': 'Other', 'inward': 'Return Order', 'outward': 'PO'}, 'inwardIssue': {'id': '76560', 'key': 'Prod-BA'}}]
I am trying to extract rows that have name = Sales and return back the below output:
101, Prod-A
102, Prod-X
103, Prod-B
I am able to extract the required data when the matching key/value pair is in the first element of the list, but not when it appears later, as in the case of id = 102.
df['names'] = df['ref'].str[0].str.get('type').str.get('name')
df['value'] = df['ref'].str[0].str.get('inwardIssue').str.get('key')
df['output'] = np.where(df['names'] == 'Sales', df['value'], 0)
Currently I am able to only get values for id = 101, 103
Use explode, keeping the original row index so that each match lines up with its own id:
e = df['ref'].explode()
# rebuild a frame from the dicts, reusing the exploded index
s = pd.DataFrame(e.tolist(), index=e.index)
s = s.loc[s['type'].str.get('name').eq('Sales'), 'inwardIssue'].str.get('key')
dfs = df.join(s, how='right')
print (dfs)
id ref inwardIssue
0 101 [{'id': '74947', 'type': {'id': '104', 'name':... Prod-A
1 102 [{'id': '74948', 'type': {'id': '105', 'name':... Prod-X
2 103 [{'id': '74949', 'type': {'id': '106', 'name':... Prod-B
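A self-contained sketch of the explode idea, written so that every exploded dict carries its own id and a Sales entry is found at any position in the list. The sample dicts are trimmed to the keys that matter, and the column names name and value are illustrative choices rather than anything from the question.

```python
import pandas as pd

# Trimmed sample data: id 102 has its Sales entry in second position.
df = pd.DataFrame({
    "id": [101, 102, 103, 104],
    "ref": [
        [{"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-A"}}],
        [{"type": {"name": "Return"}, "inwardIssue": {"key": "Prod-C"}},
         {"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-X"}}],
        [{"type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-B"}}],
        [{"type": {"name": "Other"}, "inwardIssue": {"key": "Prod-BA"}}],
    ],
})

# One row per dict; 'id' is repeated alongside each exploded element.
exploded = df[["id", "ref"]].explode("ref", ignore_index=True)
exploded["name"] = exploded["ref"].str.get("type").str.get("name")
exploded["value"] = exploded["ref"].str.get("inwardIssue").str.get("key")

# Filter on the nested name and keep only the columns of interest.
result = exploded.loc[exploded["name"].eq("Sales"), ["id", "value"]]
result = result.reset_index(drop=True)
print(result)
```

Exploding the id and ref columns together avoids having to join back on the index afterwards.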
If you already have a dataframe in that format, you can convert it to a list of records and use pd.json_normalize to flatten it, then slice/filter the flat dataframe.
df1 = pd.json_normalize(df.to_dict(orient='records'), 'ref')
The output of this flat dataframe df1
Out[83]:
id type.id type.name type.inward type.outward inwardIssue.id \
0 74947 104 Sales Sales PO 76560
1 74948 105 Return Return Order PO 76560
2 750001 342 Sales Sales PO 76560
3 74949 106 Sales Return Order PO 76560
4 67543 106 Other Return Order PO 76560
inwardIssue.key
0 Prod-A
1 Prod-C
2 Prod-X
3 Prod-B
4 Prod-BA
Finally, slicing on df1
df_final = df1.loc[df1['type.name'].eq('Sales'), ['type.id', 'inwardIssue.key']]
Out[88]:
type.id inwardIssue.key
0 104 Prod-A
2 342 Prod-X
3 106 Prod-B
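The flat frame above no longer contains the outer id (101, 102, ...) that the question asks for. One possible way to carry it along is the meta argument of pd.json_normalize; because the inner dicts also have an id key, a meta_prefix is needed to avoid a name clash. The prefix row_ and the trimmed sample data below are assumptions made for illustration.

```python
import pandas as pd

# Trimmed sample data; the inner dicts keep their own 'id' key on purpose.
df = pd.DataFrame({
    "id": [101, 102, 103, 104],
    "ref": [
        [{"id": "74947", "type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-A"}}],
        [{"id": "74948", "type": {"name": "Return"}, "inwardIssue": {"key": "Prod-C"}},
         {"id": "750001", "type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-X"}}],
        [{"id": "74949", "type": {"name": "Sales"}, "inwardIssue": {"key": "Prod-B"}}],
        [{"id": "67543", "type": {"name": "Other"}, "inwardIssue": {"key": "Prod-BA"}}],
    ],
})

records = df.to_dict(orient="records")

# meta=['id'] repeats the outer id for every flattened record;
# meta_prefix renames it to 'row_id' so it does not collide with the inner 'id'.
flat = pd.json_normalize(records, record_path="ref", meta=["id"], meta_prefix="row_")

out = flat.loc[flat["type.name"].eq("Sales"), ["row_id", "inwardIssue.key"]]
print(out)
```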

Pandas - Extracting value based of common key

I have a Dataframe in the below format:
id, key1, key2
101, {'key': 'key_1001', 'fields': {'type': {'subtask': False}, 'summary': 'Title_1' , 'id': '71150'}}, NaN
101, NaN,{'key': 'key_1002', 'fields': {'type': {'subtask': False}, 'summary': 'Title_2' , 'id': '71151'}}
102, {'key': 'key_2001', 'fields': {'type': {'subtask': False}, 'summary': 'Title_11' , 'id': '71160'}}, NaN
102, NaN,{'key': 'key_2002', 'fields': {'type': {'subtask': False}, 'summary': 'Title_12' , 'id': '71161'}}
I am trying to achieve the below output from the above Dataframe.
id, key_value_1, key_value_2
101, key_1001, key_1002
102, key_2001, key_2002
Output of df.to_dict()
{'id': {103: '101', 676: '101'}, 'key1' : {103: {'fields': {'type': {'subtask': False}, 'summary': 'Title_1' , 'id': '71150'},
676: nan}
You can use:
s=df.set_index('id').stack().str.get('key').unstack()
key1 key2
id
101 key_1001 key_1002
102 key_2001 key_2002
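The one-liner relies on stack discarding the NaN cells, and that behaviour is changing in newer pandas releases (see the future_stack option), so it may be worth knowing an equivalent that drops the missing cells explicitly. The following is a sketch using melt and pivot; the sample dicts are trimmed and the intermediate column names column, payload and key are illustrative.

```python
import numpy as np
import pandas as pd

# Trimmed sample data: each id has its key1 and key2 dicts on separate rows.
df = pd.DataFrame({
    "id": [101, 101, 102, 102],
    "key1": [{"key": "key_1001"}, np.nan, {"key": "key_2001"}, np.nan],
    "key2": [np.nan, {"key": "key_1002"}, np.nan, {"key": "key_2002"}],
})

# Long format: one row per (id, source column) cell.
long = df.melt(id_vars="id", var_name="column", value_name="payload")

# Drop the empty cells explicitly instead of relying on stack to do it.
long = long.dropna(subset=["payload"])

# Pull the 'key' entry out of each remaining dict.
long["key"] = long["payload"].str.get("key")

# Back to wide format: one row per id, one column per source column.
wide = long.pivot(index="id", columns="column", values="key")
print(wide)
```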

pandas same attribute comparison

I have the following dataframe:
df = pd.DataFrame([{'name': 'a', 'label': 'false', 'score': 10},
{'name': 'a', 'label': 'true', 'score': 8},
{'name': 'c', 'label': 'false', 'score': 10},
{'name': 'c', 'label': 'true', 'score': 4},
{'name': 'd', 'label': 'false', 'score': 10},
{'name': 'd', 'label': 'true', 'score': 6},
])
I want to return the names whose "false" label score is at least double the score of the "true" label. In my example, it should return only the name "c".
First pivot the data, then compute the ratio of the two columns and filter on it:
new_df = df.pivot(index='name',columns='label', values='score')
new_df[new_df['false'].div(new_df['true']).gt(2)]
output:
label false true
name
c 10 4
If you only want the names, you can do:
new_df.index[new_df['false'].div(new_df['true']).gt(2)].values
which gives
array(['c'], dtype=object)
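A runnable sketch of the pivot approach follows. It also compares gt(2) with ge(2): the question says "at least double", which reads as an inclusive bound, while gt(2) is strict. The extra name e, whose ratio is exactly 2, is a hypothetical row added only to make that difference visible.

```python
import pandas as pd

df = pd.DataFrame([
    {"name": "a", "label": "false", "score": 10},
    {"name": "a", "label": "true", "score": 8},
    {"name": "c", "label": "false", "score": 10},
    {"name": "c", "label": "true", "score": 4},
    {"name": "d", "label": "false", "score": 10},
    {"name": "d", "label": "true", "score": 6},
    # Hypothetical extra rows: false/true ratio is exactly 2.
    {"name": "e", "label": "false", "score": 10},
    {"name": "e", "label": "true", "score": 5},
])

# One row per name, one column per label.
wide = df.pivot(index="name", columns="label", values="score")
ratio = wide["false"].div(wide["true"])

strict = wide.index[ratio.gt(2)].tolist()     # more than double
inclusive = wide.index[ratio.ge(2)].tolist()  # at least double
print(strict, inclusive)
```

On the original six rows both variants return only c, so the choice matters only when a ratio lands exactly on 2.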
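Update: Since your data is the result of orig_df.groupby().count(), you could instead compute the share of "true" labels per name directly:
orig_df['label'].eq('true').groupby(orig_df['name']).mean()
and look at the rows with values <= 1/3 (a "false" count of at least twice the "true" count means "true" makes up at most one third of the rows).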
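A minimal sketch of the share-of-true idea. The original orig_df is not shown in the thread, so the frame below is entirely hypothetical: one row per observation, with only name and label columns.

```python
import pandas as pd

# Hypothetical raw data: 'a' has 2 false / 2 true, 'c' has 3 false / 1 true.
orig_df = pd.DataFrame({
    "name": ["a"] * 4 + ["c"] * 4,
    "label": ["false", "false", "true", "true",
              "false", "false", "false", "true"],
})

# Fraction of rows labelled 'true' within each name.
# Grouping by the 'name' column itself, since the boolean Series has no such index level.
share_true = orig_df["label"].eq("true").groupby(orig_df["name"]).mean()

# false_count >= 2 * true_count  is the same as  share_true <= 1/3
selected = share_true.index[share_true.le(1 / 3)].tolist()
print(share_true)
print(selected)
```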

Convert list of dictionaries in a dataframe to separate dataframe

I want to convert a list of dictionaries that is already present in the dataset to a dataframe.
The dataset looks something like this.
[{'id': 35, 'name': 'Comedy'}]
How do I convert this list of dictionary to dataframe?
Thank you for your time!
I want to retrieve:
Comedy
from the list of dictionary.
Use:
df = pd.DataFrame({'col':[[{'id': 35, 'name': 'Comedy'}],[{'id': 35, 'name': 'Western'}]]})
print (df)
col
0 [{'id': 35, 'name': 'Comedy'}]
1 [{'id': 35, 'name': 'Western'}]
df['new'] = df['col'].apply(lambda x: x[0].get('name'))
print (df)
col new
0 [{'id': 35, 'name': 'Comedy'}] Comedy
1 [{'id': 35, 'name': 'Western'}] Western
If a list may contain multiple dicts:
df = pd.DataFrame({'col':[[{'id': 35, 'name': 'Comedy'}, {'id':4, 'name':'Horror'}],
[{'id': 35, 'name': 'Western'}]]})
print (df)
col
0 [{'id': 35, 'name': 'Comedy'}, {'id': 4, 'name...
1 [{'id': 35, 'name': 'Western'}]
df['new'] = df['col'].apply(lambda x: [y.get('name') for y in x])
print (df)
col new
0 [{'id': 35, 'name': 'Comedy'}, {'id': 4, 'name... [Comedy, Horror]
1 [{'id': 35, 'name': 'Western'}] [Western]
And if you want to extract all values into a separate dataframe:
df1 = pd.concat([pd.DataFrame(x) for x in df['col']], ignore_index=True)
print (df1)
id name
0 35 Comedy
1 4 Horror
2 35 Western

How to order hardware servers with different package IDs with a single API call

How can I order several SoftLayer baremetal servers that have differing package IDs with a single order/API call?
UPDATE: Added details below, hopefully it will add clarity to what I am trying to do.
The 2 data structures below are SoftLayer_Container_Product_Order_Hardware_Server datatypes. Each is for a hardware server (billed monthly) with a different package_id, and they can each be passed individually to the placeOrder() API method.
My question is whether there is a way to combine them into a single Order, so I can make a single API call to placeOrder()?
first data structure, to order 2 servers with package_id 553
{'complexType': 'SoftLayer_Container_Product_Order_Hardware_Server',
'hardware': [{'domain': 'example.com',
'hostname': u'host1',
'primaryBackendNetworkComponent': {'networkVlan': {'id': 1234}},
'primaryNetworkComponent': {'networkVlan': {'id': 5678}}},
{'domain': 'example.com',
'hostname': u'host2',
'primaryBackendNetworkComponent': {'networkVlan': {'id': 1234}},
'primaryNetworkComponent': {'networkVlan': {'id': 5678}}}],
'location': 1441195,
'packageId': 553,
'prices': [{'id': 177613},
{'id': 49811},
{'id': 49811},
{'id': 50113},
{'id': 50113},
{'id': 50113},
{'id': 50113},
{'id': 50113},
{'id': 50113},
{'id': 49081},
{'id': 49427},
{'id': 141945},
{'id': 50359},
{'id': 35686},
{'id': 50223},
{'id': 34807},
{'id': 29403},
{'id': 34241},
{'id': 32627},
{'id': 25014},
{'id': 33483},
{'id': 35310},
{'id': 32500}],
'quantity': 2,
'quoteName': u'DAL10-qty2-rand9939',
'sshKeyIds': [{'sshKeyIds': [9876]}, {'sshKeyIds': [9876]}],
'storageGroups': [{'arrayTypeId': 2, 'hardDrives': [0, 1]},
{'arrayTypeId': 5, 'hardDrives': [2, 3, 4, 5, 6, 7]}],
'useHourlyPricing': False}
Second data structure, to order 2 servers with package_id 251
{'complexType': 'SoftLayer_Container_Product_Order_Hardware_Server',
'hardware': [{'domain': 'example.com',
'hostname': u'host3',
'primaryBackendNetworkComponent': {'networkVlan': {'id': 1234}},
'primaryNetworkComponent': {'networkVlan': {'id': 5678}}},
{'domain': 'example.com',
'hostname': u'host4',
'primaryBackendNetworkComponent': {'networkVlan': {'id': 1234}},
'primaryNetworkComponent': {'networkVlan': {'id': 5678}}}],
'location': 1441195,
'packageId': 251,
'prices': [{'id': 50659},
{'id': 49811},
{'id': 49811},
{'id': 50113},
{'id': 50113},
{'id': 50113},
{'id': 50113},
{'id': 49081},
{'id': 49437},
{'id': 141945},
{'id': 50359},
{'id': 26109},
{'id': 50223},
{'id': 34807},
{'id': 29403},
{'id': 34241},
{'id': 32627},
{'id': 25014},
{'id': 33483},
{'id': 35310},
{'id': 32500}],
'quantity': 2,
'quoteName': u'DAL10-qty2-rand3106',
'sshKeyIds': [{'sshKeyIds': [9876]}, {'sshKeyIds': [9876]}],
'storageGroups': [{'arrayTypeId': 2, 'hardDrives': [0, 1]},
{'arrayTypeId': 5, 'hardDrives': [2, 3, 4, 5]}],
'useHourlyPricing': False}
Yes, it is possible. If you take a look at the documentation for the SoftLayer_Container_Product_Order_Hardware_Server container, you will see this property:
orderContainers
Orders may contain an array of configurations. Populating this
property allows you to purchase multiple configurations within a
single order. Each order container will have its own individual
settings independent of the other order containers. For example, it is
possible to order a bare metal server in one configuration and a
virtual server in another. If orderContainers is populated on the base
order container, most of the configuration-specific properties are
ignored on the base container. For example, prices, location and
packageId will be ignored on the base container, but since the
billingInformation is a property that's not specific to a single order
container (but the order as a whole) it must be populated on the base
container.
So you just need to set that property on the base order container, e.g.
"orderContainers": [
order1,
order2
]
Note: replace order1 and order2 with the orders that you posted in your question
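As a rough illustration, the combined order could be assembled in Python along these lines. This is only a sketch under assumptions: combine_orders is a hypothetical helper, the two orders are heavily abbreviated stand-ins for the full structures in the question, and the choice of complexType for the base container follows the container named above rather than anything verified against the API.

```python
import copy

def combine_orders(*orders):
    """Hypothetical helper: wrap individual order containers in one base container."""
    return {
        # Assumption: the base container uses the same complexType as the answer names.
        "complexType": "SoftLayer_Container_Product_Order_Hardware_Server",
        # Each entry keeps its own packageId, prices, location and so on.
        "orderContainers": [copy.deepcopy(order) for order in orders],
    }

# Abbreviated stand-ins for the two order structures posted in the question.
order1 = {
    "complexType": "SoftLayer_Container_Product_Order_Hardware_Server",
    "packageId": 553,
    "location": 1441195,
    "quantity": 2,
    "useHourlyPricing": False,
}
order2 = {
    "complexType": "SoftLayer_Container_Product_Order_Hardware_Server",
    "packageId": 251,
    "location": 1441195,
    "quantity": 2,
    "useHourlyPricing": False,
}

combined = combine_orders(order1, order2)
print([c["packageId"] for c in combined["orderContainers"]])
```

The resulting dict would then be passed to placeOrder() in place of a single order. Per the quoted documentation, order-wide properties such as billingInformation belong on the base container rather than on the individual entries.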
Regards