I'm trying to make a DataFrame out of a list of dictionaries. I am quite new to programming, and Google just makes me more confused, so I am turning to you for some assistance.
The first two list values ('YV01', '3nP3RFgGnBrOfILK4DF2Tp') I would like to have under columns called Name and GlobalId. I would like to drop Pset_WallCommon, AC_Pset_RenovationAndPhasing, and BaseQuantities, and use the rest of the keys (if that is what they are called) as column names.
It would be great if someone could give me a push in the right direction :)
For the record: I am parsing an IFC file with the IfcOpenShell package.
The data:
['YV01', '3nP3RFgGnBrOfILK4DF2Tp', {'Pset_WallCommon': {'Combustible': False, 'Compartmentation': False, 'ExtendToStructure': False, 'SurfaceSpreadOfFlame': '', 'ThermalTransmittance': 0.0, 'Reference': '', 'AcousticRating': '', 'FireRating': '', 'LoadBearing': False, 'IsExternal': False}, 'AC_Pset_RenovationAndPhasing': {'Renovation Status': 'New'}, 'BaseQuantities': {'Length': 13786.7314346, 'Height': 2700.0, 'Width': 276.0, 'GrossFootprintArea': 3.88131387595, 'NetFootprintArea': 3.88131387595, 'GrossSideArea': 37.9693748734, 'NetSideArea': 37.9693748734, 'GrossVolume': 10.4795474651, 'NetVolume': 10.4795474651}}, 'YV01', '1M4JyBJhXD5xt8fBFUcjUU', {'Pset_WallCommon': {'Combustible': False, 'Compartmentation': False, 'ExtendToStructure': False, 'SurfaceSpreadOfFlame': '', 'ThermalTransmittance': 0.0, 'Reference': '', 'AcousticRating': '', 'FireRating': '', 'LoadBearing': False, 'IsExternal': False}, 'AC_Pset_RenovationAndPhasing': {'Renovation Status': 'New'}, 'BaseQuantities': {'Length': 6166.67382573, 'Height': 2700.0, 'Width': 276.0, 'GrossFootprintArea': 1.6258259759, 'NetFootprintArea': 1.6258259759, 'GrossSideArea': 15.9048193295, 'NetSideArea': 15.9048193295, 'GrossVolume': 4.38973013494, 'NetVolume': 4.38973013494}}
all_walls = ifc_file.by_type('IfcWall')
wallList = []
for wall in all_walls:
    propertySets = ifcopenshell.util.element.get_psets(wall)
    wallList.append(wall.Name)
    wallList.append(wall.GlobalId)
    wallList.append(propertySets)
print(wallList)
wall_table = pd.DataFrame.from_records(wallList)
print(wall_table)
I have tried the basic pd.DataFrame.from_dict/records/arrays(data) constructors, but the output does not come out as the table I want (the original screenshot is not reproduced here).
UPDATE: Thank you so much for your help, I am learning a lot from this!
So I made a dictionary out of wallList and flattened it, like this:
# list of walls
for wall in all_walls:
    propertySets = ifcopenshell.util.element.get_psets(wall)
    wallList.append(wall.Name)
    wallList.append(wall.GlobalId)
    wallList.append(propertySets)
# dict from list
wall_dict = {i: wallList[i] for i in range(0, len(wallList))}
new_dict = {}
# flattening dict
for key, value in wall_dict.items():
    if isinstance(value, dict):
        for key in value.keys():
            for key2 in value[key].keys():
                new_dict[key + '_' + key2] = value[key][key2]
    else:
        new_dict[key] = value
wall_table = pd.DataFrame.from_dict(new_dict, orient='index')
print(wall_table)
It seems to work pretty well; the only problem is that the DataFrame contains all walls but property-set data from only the first wall in the list. I can't seem to understand how the dict-flattening loop works. I would also like the index names (Pset_WallCommon_Combustible, and so on) to be the columns in my DataFrame. Is that possible?
EDIT: Simply flattening a list as I did goes nowhere. Actually, I think you should drop the list altogether and try to load the DataFrame from a dictionary. We'd need to see what all_walls looks like to help you with that, though.
Have you tried loading all_walls (if it is a dictionary) directly into a DataFrame: df = pd.DataFrame.from_dict(all_walls)?
If that doesn't work, I think flattening the dictionaries in a fashion similar to the following should do the trick.
new_dict = {}
for key, value in all_walls.items():
    if isinstance(value, dict):
        # value maps property-set names to dicts of properties
        for pset_name, props in value.items():
            for prop_name, prop_value in props.items():
                new_dict[pset_name + '_' + prop_name] = prop_value
    else:
        new_dict[key] = value
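One thing worth noting about flattening everything into a single dict: keys such as Pset_WallCommon_Combustible are the same for every wall, so each wall overwrites the previous wall's entries and only one set of property values survives. A possible way around this, sketched below under assumptions, is to build one flat dict per wall and collect them in a list. The helper name wall_to_row and the shortened sample values are hypothetical and only mirror the shape of the wallList shown in the question; the sketch also drops the property-set names, as the question asks, which assumes the inner keys do not clash between property sets.

```python
# Sample shaped like the flat wallList in the question (values shortened).
wall_list = [
    "YV01", "3nP3RFgGnBrOfILK4DF2Tp",
    {
        "Pset_WallCommon": {"LoadBearing": False, "IsExternal": False},
        "AC_Pset_RenovationAndPhasing": {"Renovation Status": "New"},
        "BaseQuantities": {"Length": 13786.7314346, "Height": 2700.0},
    },
    "YV01", "1M4JyBJhXD5xt8fBFUcjUU",
    {
        "Pset_WallCommon": {"LoadBearing": False, "IsExternal": False},
        "AC_Pset_RenovationAndPhasing": {"Renovation Status": "New"},
        "BaseQuantities": {"Length": 6166.67382573, "Height": 2700.0},
    },
]


def wall_to_row(name, global_id, psets):
    """Build one flat dict (one future DataFrame row) for a single wall."""
    row = {"Name": name, "GlobalId": global_id}
    for props in psets.values():  # drop the property-set names
        row.update(props)         # keep the inner keys as column names
    return row


# Walk the flat list three items at a time: name, id, property sets.
rows = [
    wall_to_row(*wall_list[i:i + 3])
    for i in range(0, len(wall_list), 3)
]

for row in rows:
    print(row)
```

Passing rows to pd.DataFrame(rows) should then give one row per wall, with Name, GlobalId and the property names as columns. In the real loop it would probably be simpler to call the helper directly with wall.Name, wall.GlobalId and the get_psets result, and skip the flat list entirely.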
In the Amadeus hotels API there is a set of amenity choices, and the search results can contain different amenity values as well.
To make amenities more readable for users, I'd like a FULL list of ALL possible amenities so that I can populate a database with amenity codes and their translations.
For a client searching for hotels, values like ACC_BATHS or SAFE_DEP_BOX are not very readable.
I'm referring to this:
{
"data": [
{
"type": "hotel-offers",
"hotel": {
"type": "hotel",
"cityCode": "MIA",
...
"amenities": [
"HANDICAP_FAC",
"ACC_BATHS",
"ACC_WASHBASIN",
"ACC_BATH_CTRLS",
"ACC_LIGHT_
Where can I find a CSV of all amenities?
I contacted Amadeus tech support and this is what they answered:
(You can copy this list; it is in CSV format: NAME_OF_AMENITY,amenity_code)
226 codes
PHOTOCOPIER,BUS.2
PRINTER,BUS.28
AUDIO-VIS_EQT,BUS.37
WHITE/BLACKBOARD,BUS.38
BUSINESS_CENTER,BUS.39
CELLULAR_PHONE_RENTAL,BUS.40
COMPUTER_RENTAL,BUS.41
EXECUTIVE_DESK,BUS.42
LCD/PROJECTOR,BUS.45
MEETING_ROOMS,BUS.46
OVERHEAD_PROJECTOR,BUS.48
SECRETARIAL_SERVICES,BUS.49
CONFERENCE_SUITE,BUS.94
CONVENTION_CTR,BUS.95
MEETING_FACILITIES,BUS.96
24_HOUR_FRONT_DESK,HAC.1
DISABLED_FACILITIES,HAC.101
MULTILINGUAL_STAFF,HAC.103
WEDDING_SERVICES,HAC.104
BANQUETING_FACILITIES,HAC.105
PORTER/BELLBOY,HAC.106
BEAUTY_PARLOUR,HAC.107
WOMENS_GST_RMS,HAC.110
PHARMACY,HAC.111
120_AC,HAC.113
120_DC,HAC.114
220_AC,HAC.115
220_DC,HAC.117
BARBECUE,HAC.118
BUTLER_SERVICE,HAC.136
CAR_RENTAL,HAC.15
CASINO,HAC.16
BAR,HAC.165
LOUNGE,HAC.165
TRANSPORTATION,HAC.172
WIFI,HAC.178
WIRELESS_CONNECTIVITY,HAC.179
BALLROOM,HAC.191
BUS_PARKING,HAC.192
CHILDRENS_PLAY_AREA,HAC.193
NURSERY,HAC.194
DISCO,HAC.195
24_HOUR_ROOM_SERVICE,HAC.2
COFFEE_SHOP,HAC.20
BAGGAGE_STORAGE,HAC.201
NO_KID_ALLOWED,HAC.217
KIDS_WELCOME,HAC.218
COURTESY_CAR,HAC.219
CONCIERGE,HAC.22
NO_PORN_FILMS,HAC.220
INT_HOTSPOTS,HAC.221
FREE_INTERNET,HAC.222
INTERNET_SERVICES,HAC.223
PETS_ALLOWED,HAC.224
FREE_BREAKFAST,HAC.227
CONFERENCE_FACILITIES,HAC.24
HI_INTERNET,HAC.259
EXCHANGE_FAC,HAC.26
LOBBY,HAC.276
DOCTOR_ON_CALL,HAC.28
24H_COFFEE_SHOP,HAC.281
AIRPORT_SHUTTLE,HAC.282
LUGGAGE_SERVICE,HAC.283
PIANO_BAR,HAC.284
VIP_SECURITY,HAC.285
DRIVING_RANGE,HAC.30
DUTY_FREE_SHOP,HAC.32
ELEVATOR,HAC.33
EXECUTIVE_FLR,HAC.34
GYM,HAC.35
EXPRESS_CHECK_IN,HAC.36
EXPRESS_CHECK_OUT,HAC.37
FLORIST,HAC.39
CONNECTING_ROOMS,HAC.4
FREE_AIRPORT_SHUTTLE,HAC.41
FREE_PARKING,HAC.42
FREE_TRANSPORTATION,HAC.43
GAMES_ROOM,HAC.44
GIFT_SHOP,HAC.45
HAIRDRESSER,HAC.46
ICE_MACHINES,HAC.52
GARAGE_PARKING,HAC.53
JACUZZI,HAC.55
JOGGING_TRACK,HAC.56
KENNELS,HAC.57
LAUNDRY_SVC,HAC.58
AIRLINE_DESK,HAC.6
LIVE_ENTERTAINMENT,HAC.60
MASSAGE,HAC.61
NIGHT_CLUB,HAC.62
SWIMMING_POOL,HAC.66
PARKING,HAC.68
ATM/CASH_MACHINE,HAC.7
POOLSIDE_SNACK_BAR,HAC.72
RESTAURANT,HAC.76
ROOM_SERVICE,HAC.77
SAFE_DEP_BOX,HAC.78
SAUNA,HAC.79
BABY-SITTING,HAC.8
SOLARIUM,HAC.83
SPA,HAC.84
CONVENIENCE_STOR,HAC.88
PICNIC_AREA,HAC.9
THEATRE_DESK,HAC.90
TOUR_DESK,HAC.91
TRANSLATION_SERVICES,HAC.92
TRAVEL_AGENCY,HAC.93
VALET_PARKING,HAC.97
VENDING_MACHINES,HAC.98
TELECONFERENCE,MRC.121
VOLTAGE_AVAILABLE,MRC.123
NATURAL_DAYLIGHT,MRC.126
GROUP_RATES,MRC.141
INTERNET-HIGH_SPEED,MRC.17
VIDEO_CONF_FACILITIES,MRC.53
ACC_BATHS,PHY.102
BR/L_PRINT_LIT,PHY.103
ADAPT_RM_DOORS,PHY.104
ACC_RM_WCHAIR,PHY.105
SERV_SPEC_MENU,PHY.106
WIDE_ENTRANCE,PHY.107
WIDE_CORRIDORS,PHY.108
WIDE_REST_ENT,PHY.109
ACC_LIGHT_SW,PHY.15
ACC_WCHAIR,PHY.28
SERV_DOGS_ALWD,PHY.29
ACC_WASHBASIN,PHY.3
ACC_TOILETS,PHY.32
ADAPT_BATHROOM,PHY.38
HANDRAIL_BTHRM,PHY.38
ADAPTED_PHONES,PHY.39
ACC_ELEVATORS,PHY.42
TV_SUB/CAPTION,PHY.45
DIS_PARKG,PHY.50
EMERG_COD/BUT,PHY.57
HANDICAP_FAC,PHY.6
DIS_EMERG_PLAN,PHY.60
HEAR_IND_LOOPS,PHY.65
BR/L_PRNT_MENU,PHY.66
DIS_TRAIN_STAF,PHY.71
PIL_ALARMS_AVL,PHY.76
ACC_BATH_CTRLS,PHY.79
PUTTING_GREEN,REC.5
TROUSER_PRESS,RMA.111
VIDEO,RMA.116
GAMES_SYSTEM_IN_ROOM,RMA.117
VOICEMAIL_IN_ROOM,RMA.118
WAKEUP_SERVICE,RMA.119
WI-FI_IN_ROOM,RMA.123
CD_PLAYER,RMA.129
BATH,RMA.13
MOVIE_CHANNELS,RMA.139
SHOWER,RMA.142
OUTLET_ADAPTERS,RMA.159
BIDET,RMA.16
DVD_PLAYER,RMA.163
CABLE_TELEVISION,RMA.18
OVERSIZED_ROOMS,RMA.185
TEA/COFFEE_MK_FACILITIES,RMA.19
AIR_CONDITIONING,RMA.2
TELEVISION,RMA.20
ANNEX_ROOM,RMA.204
FREE_NEWSPAPER,RMA.205
HONEYMOON_SUITES,RMA.206
INTERNETFREE_HIGH_IN_RM,RMA.207
MAID_SERVICE,RMA.208
PC_HOOKUP_INRM,RMA.209
PC_IN_ROOM,RMA.21
SATELLITE_TV,RMA.210
VIP_ROOMS,RMA.211
CORDLESS_PHONE,RMA.25
CRIBS_AVAILABLE,RMA.26
ALARM_CLOCK,RMA.3
PHONE-DIR_DIAL,RMA.31
FAX_FAC_INROOM,RMA.38
FREE_LOCAL_CALLS,RMA.45
HAIR_DRYER,RMA.50
INTERNET-HI_SPEED_IN_RM,RMA.51
IRON/IRON_BOARD,RMA.55
KITCHEN,RMA.59
BABY_LISTENING_DEVICE,RMA.6
LAUNDRY_EQUIPMENT_IN_ROOM,RMA.66
MICROWAVE,RMA.68
MINIBAR,RMA.69
NONSMOKING_RMS,RMA.74
REFRIGERATOR,RMA.88
ROLLAWAY_BEDS,RMA.91
SAFE,RMA.92
WATER_SPORTS,RST.110
ANIMAL_WATCHING,RST.126
BIRD_WATCHING,RST.127
SIGHTSEEING,RST.142
BEACH_WITH_DIRECT_ACCESS,RST.155
SKI_IN/OUT,RST.156
TENNIS_PROFESSIONAL,RST.157
FISHING,RST.20
GOLF,RST.27
FITNESS_CENTER,RST.36
BEACH,RST.5
HORSE_RIDING,RST.61
INDOOR_TENNIS,RST.62
MINIATURE_GOLF,RST.67
BOATING,RST.7
TENNIS,RST.71
SCUBA_DIVING,RST.82
SKEET_SHOOTING,RST.85
SNOW_SKIING,RST.88
BOWLING,RST.9
VOLLEYBALL,RST.98
ELEC_GENERATOR,SEC.15
EMERG_LIGHTING,SEC.19
FIRE_DETECTORS,SEC.22
GUARDED_PARKG,SEC.34
RESTRIC_RM_ACC,SEC.39
EXT_ROOM_ENTRY,SEC.40
INT_ROOM_ENTRY,SEC.41
SMOKE_DETECTOR,SEC.50
ROOMS_WITH_BALCONIES,SEC.51
SPRINKLERS,SEC.54
FIRST_AID_STAF,SEC.57
SECURITY_GUARD,SEC.58
VIDEO_SURVEIL,SEC.62
EXTINGUISHERS,SEC.89
FIRE_SAFETY,SEC.9
FEMA_FIRE_SAFETY_COMPLIANT,SEC.93
FIRE_SAF_NOT_STANDARD,SEC.95
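For populating a database from this list, a minimal sketch along these lines might help. It assumes the list has been saved as text in the NAME_OF_AMENITY,amenity_code order shown above; AMENITIES_CSV, load_amenities and fallback_label are hypothetical names used only for illustration. The lookup is keyed on the amenity name because at least two codes in the list (HAC.165 and PHY.38) are shared by two names.

```python
import csv
import io

# Excerpt of the list above, in NAME_OF_AMENITY,amenity_code order.
AMENITIES_CSV = """\
SWIMMING_POOL,HAC.66
SAFE_DEP_BOX,HAC.78
ACC_BATHS,PHY.102
BAR,HAC.165
LOUNGE,HAC.165
"""


def load_amenities(text):
    """Map each amenity name to its code (codes can repeat, so key on name)."""
    reader = csv.reader(io.StringIO(text))
    return {name: code for name, code in reader}


def fallback_label(name):
    """Rough display text to use until a real translation exists."""
    return name.replace("_", " ").capitalize()


amenities = load_amenities(AMENITIES_CSV)
for name, code in amenities.items():
    print(code, name, fallback_label(name))
```

The fallback label is only a placeholder: abbreviated names such as ACC_BATHS still need a hand-written translation stored alongside the name and code.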
According to the API, you can filter the offers by amenities:
https://developers.amadeus.com/self-service/category/hotel/api-doc/hotel-search/api-reference
I assume the multi-select list for the amenities parameter contains all the items you need.
EDIT: Unfortunately, I noticed that the response example contains additional values beyond those accepted as input, so the input list is not enough.
I use this method
allcity = dom.body.findAll(attrs={'id' : re.compile(r"\d{1,2}")})
to return a list like this:
[<a onmousedown="return c({'fm':'as','F':'77B717EA','F1':'9D73F1E4','F2':'4CA6DE6B','F3':'54E5243F','T':'1279189248','title':this.innerHTML,'url':this.href,'p1':1,'y':'B2D76EFF'})" href="http://www.ylyd.com/showurl.asp?id=6182" target="_blank"><font size="3">掳虏驴碌路驴碌脴虏煤脨脜脧垄脥酶 隆煤 脢脦脝路脦露脕卢陆脫</font></a>,
掳脵露脠驴矛脮脮]
How do I extract this href?
http://www.ylyd.com/showurl.asp?id=6182
Thanks. :)
You can use:
for a in dom.body.findAll(attrs={'id' : re.compile(r"\d{1,2}")}, href=True):
    print(a['href'])
In this example there's no real need for a regex; it can be as simple as selecting the <a> tag and reading its ['href'] attribute, like so:
get_me_url = soup.a['href'] # http://www.ylyd.com/showurl.asp?id=6182
# cached URL
get_me_cached_url = soup.find('a', class_='m')['href']
You can always use the prettify() method to see the HTML more clearly.
from bs4 import BeautifulSoup
string = '''
[
<a href="http://www.ylyd.com/showurl.asp?id=6182" onmousedown="return c({'fm':'as','F':'77B717EA','F1':'9D73F1E4','F2':'4CA6DE6B','F3':'54E5243F','T':'1279189248','title':this.innerHTML,'url':this.href,'p1':1,'y':'B2D76EFF'})" target="_blank">
<font size="3">
掳虏驴碌路驴碌脴虏煤脨脜脧垄脥酶 隆煤 脢脦脝路脦露脕卢陆脫
</font>
</a>
,
<a class="m" href="http://cache.baidu.com/c?m=9f65cb4a8c8507ed4fece763105392230e54f728629c86027fa3c215cc791a1b1a23a4fb7935107380843e7000db120afdf14076340920a3de95c81cd2ace52f38fb5023716c914b19c46ea8dc4755d650e34d99aa0ee6cae74596b9a1d6c85523dd58716df7f49c5b7003c065e76445&p=8b2a9403c0934eaf5abfc8385864&user=baidu" target="_blank">
掳脵露脠驴矛脮脮
</a>
]
'''
soup = BeautifulSoup(string, 'html.parser')
href = soup.a['href']
cache_href = soup.find('a', class_='m')['href']
print(f'{href}\n{cache_href}')
# output:
'''
http://www.ylyd.com/showurl.asp?id=6182
http://cache.baidu.com/c?m=9f65cb4a8c8507ed4fece763105392230e54f728629c86027fa3c215cc791a1b1a23a4fb7935107380843e7000db120afdf14076340920a3de95c81cd2ace52f38fb5023716c914b19c46ea8dc4755d650e34d99aa0ee6cae74596b9a1d6c85523dd58716df7f49c5b7003c065e76445&p=8b2a9403c0934eaf5abfc8385864&user=baidu
'''
Alternatively, you can do the same thing using the Baidu Organic Results API from SerpApi. It's a paid API with a free trial of 5,000 searches.
The main difference in this example is that you don't have to figure out how to grab particular elements, since that is already done for you and returned as JSON.
Code to grab the href/cached href from the first page of results:
from serpapi import BaiduSearch
params = {
    "api_key": "YOUR_API_KEY",
    "engine": "baidu",
    "q": "ylyd"
}
search = BaiduSearch(params)
results = search.get_dict()
for result in results['organic_results']:
    # .get() returns None when a result has no link/cached link
    link = result.get('link')
    cached_link = result.get('cached_page_link')
    print(f'{link}\n{cached_link}\n')
# Part of the output:
'''
http://www.baidu.com/link?url=7VlSB5iaA1_llQKA3-0eiE8O9sXe4IoZzn0RogiBMCnJHcgoDDYxz2KimQcSDoxK
http://cache.baiducontent.com/c?m=LU3QMzVa1VhvBXthaoh17aUpq4KUpU8MCL3t1k8LqlKPUU9qqZgQInMNxAPNWQDY6pkr-tWwNiQ2O8xfItH5gtqxpmjXRj0m2vEHkxLmsCu&p=882a9646d5891ffc57efc63e57519d&newp=926a8416d9c10ef208e2977d0e4dcd231610db2151d6d5106b82c825d7331b001c3bbfb423291505d3c77e6305a54d5ceaf13673330923a3dda5c91d9fb4c57479c77a&s=c81e728d9d4c2f63&user=baidu&fm=sc&query=ylyd&qid=e42a54720006d857&p1=1
'''
Disclaimer: I work for SerpApi.