My table looks like this:
id total avg test_no
1 445 89
2 434 85
3 378 75
4 421 84
I'm working with Matillion on Snowflake.
I need my result to look like this:
id total avg test_no
1 445 89 1
2 434 85 1
3 378 75 1
4 421 84 1
Just use a Calculator component and set the value of the calculated column to 1
In Snowflake, you would modify the table using:
update t
set test_no = 1;
I assume that Matillion supports this as well.
When I read Pina_Indian_Diabities.csv, some of the values are strings, something like this:
+AC0-5.4128147485
734 2
735 4
736 0
737 8
738 +AC0-5.4128147485
739 1
740 NaN
741 3
742 1
743 9
744 13
745 12
746 1
747 1
As in row 738, there are such values in other rows and columns as well.
How can I drop them?
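One possible approach, sketched under assumptions: coerce every column to numeric so that unparseable strings become NaN, then drop the rows that contain NaN. The column name `col` and the sample values are hypothetical stand-ins for the real CSV. Note that the `+AC0-` prefix looks like a UTF-7-encoded minus sign, so the underlying value may really be -5.4128147485; this sketch simply discards it, as asked.

```python
import pandas as pd

# Hypothetical sample mimicking the column shown in the question.
df = pd.DataFrame(
    {"col": ["2", "4", "0", "8", "+AC0-5.4128147485", "1", float("nan"), "3"]}
)

# Turn anything that is not a valid number into NaN, column by column.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Drop every row that has at least one NaN (bad strings and original NaNs alike).
cleaned = numeric.dropna()

print(cleaned)
```

If the malformed values should be kept rather than dropped, re-reading the file with a different encoding might be worth trying first, though whether that helps depends on how the CSV was produced.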
I have a table, dbo.JsonData, with a single varchar(max) column. It has x number of rows, each holding a JSON string with x number of properties.
How can I create a query that will allow me to turn the result set from a select * query into a row/column result set?
Here is what I have tried:
SELECT *
FROM JSONDATA
FOR JSON Path
But it returns a single row of the json data all in a single column:
JSON_F52E2B61-18A1-11d1-B105-00805F49916B
[{"Json_Data":"{\"Serial_Number\":\"12345\",\"Gateway_Uptime\":17,\"Defrost_Cycles\":0,\"Freeze_Cycles\":2304,\"Float_Switch_Raw_ADC\":1328,\"Bin_status\":2304,\"Line_Voltage\":0,\"ADC_Evaporator_Temperature\":0,\"Mem_Sw\":1280,\"Freeze_Timer\":2560,\"Defrost_Timer\":593,\"Water_Flow_Switch\":3328,\"ADC_Mid_Temperature\":2560,\"ADC_Water_Temperature\":0,\"Ambient_Temperature\":71,\"Mid_Temperature\":1259,\"Water_Temperature\":1259,\"Evaporator_Temperature\":1259,\"Ambient_Temperature_Off_Board\":0,\"Ambient_Temperature_On_Board\":0,\"Gateway_Info\":\"{\\\"temp_sensor\\\":0.00,\\\"temp_pcb\\\":82.00,\\\"gw_uptime\\\":17.00,\\\"winc_fw\\\":\\\"19.5.4\\\",\\\"gw_fw_version\\\":\\\"0.0.0\\\",\\\"gw_fw_version_git\\\":\\\"2a75f20-dirty\\\",\\\"gw_sn\\\":\\\"328\\\",\\\"heap_free\\\":11264.00,\\\"gw_sig_csq\\\":0.00,\\\"gw_sig_quality\\\":0.00,\\\"wifi_sig_strength\\\":-63.00,\\\"wifi_resets\\\":0.00}\",\"ADC_Ambient_Temperature\":1120,\"Control_State\":\"Bin Full\",\"Compressor_Runtime\":134215680}"},{"Json_Data":"{\"Serial_Number\":\"12345\",\"Gateway_Uptime\":200,\"Defrost_Cycles\":559,\"Freeze_Cycles\":510,\"Float_Switch_Raw_ADC\":106,\"Bin_status\":0,\"Line_Voltage\":119,\"ADC_Evaporator_Temperature\":123,\"Mem_Sw\":113,\"Freeze_Timer\":0,\"Defrost_Timer\":66,\"Water_Flow_Switch\":3328,\"ADC_Mid_Temperature\":2560,\"ADC_Water_Temperature\":0,\"Ambient_Temperature\":71,\"Mid_Temperature\":1259,\"Water_Temperature\":1259,\"Evaporator_Temperature\":54,\"Ambient_Temperature_Off_Board\":0,\"Ambient_Temperature_On_Board\":0,\"Gateway_Info\":\"{\\\"temp_sensor\\\":0.00,\\\"temp_pcb\\\":82.00,\\\"gw_uptime\\\":199.00,\\\"winc_fw\\\":\\\"19.5.4\\\",\\\"gw_fw_version\\\":\\\"0.0.0\\\",\\\"gw_fw_version_git\\\":\\\"2a75f20-dirty\\\",\\\"gw_sn\\\":\\\"328\\\",\\\"heap_free\\\":10984.00,\\\"gw_sig_csq\\\":0.00,\\\"gw_sig_quality\\\":0.00,\\\"wifi_sig_strength\\\":-60.00,\\\"wifi_resets\\\":0.00}\",\"ADC_Ambient_Temperature\":1120,\"Control_State\":\"Defrost\",\"Compressor_Runtime
\":11304}"},{"Json_Data":"{\"Seri...
What am I missing?
I can't specify the columns explicitly because the json strings aren't always the same.
This is what I expect:
Serial_Number Gateway_Uptime Defrost_Cycles Freeze_Cycles Float_Switch_Raw_ADC Bin_status Line_Voltage ADC_Evaporator_Temperature Mem_Sw Freeze_Timer Defrost_Timer Water_Flow_Switch ADC_Mid_Temperature ADC_Water_Temperature Ambient_Temperature Mid_Temperature Water_Temperature Evaporator_Temperature Ambient_Temperature_Off_Board Ambient_Temperature_On_Board ADC_Ambient_Temperature Control_State Compressor_Runtime temp_sensor temp_pcb gw_uptime winc_fw gw_fw_version gw_fw_version_git gw_sn heap_free gw_sig_csq gw_sig_quality wifi_sig_strength wifi_resets LastModifiedDateUTC Defrost_Cycle_time Freeze_Cycle_time
12345 251402 540 494 106 0 98 158 113 221 184 0 0 0 1259 1259 1259 33 0 0 0 Freeze 10833 0 78 251402 19.5.4 0.0.0 2a75f20-dirty 328.00000000 10976 0 0 -61 0 2018-03-20 11:15:28.000 0 0
12345 251702 540 494 106 0 98 178 113 517 184 0 0 0 1259 1259 1259 22 0 0 0 Freeze 10838 0 78 251702 19.5.4 0.0.0 2a75f20-dirty 328.00000000 10976 0 0 -62 0 2018-03-20 11:15:42.000 0 0
...
Thank you,
Ron
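To make the target shape concrete, here is a minimal sketch in Python (not T-SQL) of the reshaping being asked for: parse each row's JSON, unpack the nested Gateway_Info string into the same record, and build the column list from the union of keys so that rows with differing properties still line up. The helper name `flatten_row` and the shortened sample rows are hypothetical; only the property names come from the question.

```python
import json


def flatten_row(json_text):
    """Parse one JSON string and merge its nested Gateway_Info object into it."""
    record = json.loads(json_text)
    nested = record.pop("Gateway_Info", None)
    if isinstance(nested, str):
        # Gateway_Info is itself a JSON document stored as a string.
        record.update(json.loads(nested))
    return record


# Shortened, hypothetical rows; the second one has a different set of keys.
rows = [
    r'{"Serial_Number":"12345","Gateway_Uptime":17,'
    r'"Gateway_Info":"{\"temp_pcb\":82.00,\"gw_sn\":\"328\"}"}',
    r'{"Serial_Number":"12345","Gateway_Uptime":200,"Control_State":"Defrost"}',
]

records = [flatten_row(r) for r in rows]

# Union of all keys, in first-seen order, so no column has to be named up front.
columns = []
for rec in records:
    for key in rec:
        if key not in columns:
            columns.append(key)

# Missing properties become None in the row/column result.
table = [[rec.get(col) for col in columns] for rec in records]

print(columns)
for line in table:
    print(line)
```

The same idea (discover the keys first, then project every row onto them) is what a dynamic query would have to do on the database side.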
My dataframe is a list of football games with varying stats, around 300 entries.
game_id team opp_team avg_marks
0 2919 STK BL 122
1 2919 BL STK 114
2 2920 RICH SYD 135
3 2920 SYD RICH 108
I would like to add the opposition stats as a new column for each entry. The resulting dataframe would look like this:
game_id team opp_team avg_marks opp_avg_marks
0 2919 STK BL 122 114
1 2919 BL STK 114 122
2 2920 RICH SYD 135 108
3 2920 SYD RICH 108 135
Any suggestions would be most welcome; I'm new to this forum. I have tried mapping, but the entry is conditional on two columns, game_id and opp_team.
Ideally I would add it in the original spreadsheet, but I created a cumulative average for the season in pandas, so I was hoping there would be a way to incorporate this as well.
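You could group on game_id and reverse the avg_marks values (this assumes each game_id has exactly two rows). Using .values makes the reversal positional, so pandas cannot realign the reversed values back onto the original index.
In [725]: df.groupby('game_id')['avg_marks'].transform(lambda x: x.values[::-1])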
Out[725]:
0 114
1 122
2 108
3 135
Name: avg_marks, dtype: int64
In [726]: df['opp_avg_marks'] = (df.groupby('game_id')['avg_marks']
                                   .transform(lambda x: x.values[::-1]))
In [727]: df
Out[727]:
game_id team opp_team avg_marks opp_avg_marks
0 2919 STK BL 122 114
1 2919 BL STK 114 122
2 2920 RICH SYD 135 108
3 2920 SYD RICH 108 135
Or build a dict mapping team to avg_marks, then use map on opp_team:
In [729]: df['opp_team'].map(df.set_index('team')['avg_marks'].to_dict())
Out[729]:
0 114
1 122
2 108
3 135
Name: opp_team, dtype: int64
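With around 300 entries, each team will appear in many games, so a dict keyed on team alone would keep only one value per team. A sketch of an alternative that matches on both game_id and team, using a self-merge; the intermediate name `opp` is just illustrative, and the sample frame repeats the four rows from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "game_id": [2919, 2919, 2920, 2920],
    "team": ["STK", "BL", "RICH", "SYD"],
    "opp_team": ["BL", "STK", "SYD", "RICH"],
    "avg_marks": [122, 114, 135, 108],
})

# Each team's own stats, relabelled so they can be looked up as the opponent's.
opp = df[["game_id", "team", "avg_marks"]].rename(
    columns={"team": "opp_team", "avg_marks": "opp_avg_marks"}
)

# Match every row to its opponent's row within the same game.
result = df.merge(opp, on=["game_id", "opp_team"], how="left")

print(result)
```

The same pattern should extend to other stat columns by adding them to the selection and the rename mapping.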
For the following table
count_value
CPUCore Offline_RetentionAge
i7 183 4184
7 1981
30 471
i5 183 2327
7 831
30 250
Pentium 183 333
7 125
30 43
2 183 575
7 236
31 96
Is it possible to generate a seaborn countplot (or a normal countplot) like the following (generated using sns.countplot(x='CPUCore', hue="Offline_BackupSchemaIncrementType", data=dfCombined_df))?
The problem is that I need to use count_value as the count, rather than actually counting the occurrences of Offline_RetentionAge.
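I think you need seaborn.barplot, which draws the precomputed count_value as bar heights; reset_index() turns the CPUCore and Offline_RetentionAge index levels into ordinary columns:
sns.barplot(x="CPUCore", y="count_value", hue="Offline_RetentionAge", data=df.reset_index())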
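As a quick check of what the plotting call receives, here is a small sketch that rebuilds part of the table from the question (assuming it is a frame with a two-level index named CPUCore and Offline_RetentionAge) and prints the long-form result of reset_index(). Only pandas is used here; the plotting call itself is left as a comment.

```python
import pandas as pd

# Assumed shape: count_value indexed by (CPUCore, Offline_RetentionAge).
index = pd.MultiIndex.from_tuples(
    [("i7", 183), ("i7", 7), ("i7", 30), ("i5", 183), ("i5", 7), ("i5", 30)],
    names=["CPUCore", "Offline_RetentionAge"],
)
df = pd.DataFrame({"count_value": [4184, 1981, 471, 2327, 831, 250]}, index=index)

# Index levels become regular columns, one row per bar.
long_df = df.reset_index()
print(long_df)

# With seaborn imported as sns, the counts could then be plotted directly:
# sns.barplot(x="CPUCore", y="count_value", hue="Offline_RetentionAge", data=long_df)
```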