How do you create df column using str.contains to assign rows by category? - pandas

I want to add a column to a df and I want the values for the rows of that column to be a category based on strings contained in another attribute in the data.
For example, if another attribute contains a "P" or "S" I want the string within the new df column to say cat1. But if the other attribute contains an "O" I want the string in the df column to say cat2. Otherwise, I want the string to be cat3.
So essentially, I want to create a df column that considers what the string in another df column is and then assigns the row value based on what that string contains.
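One possible way to sketch this is with `str.contains` masks fed into `numpy.select`. The column names `code` and `category` and the sample values below are made up for illustration; only the "P"/"S", "O" and cat1/cat2/cat3 rules come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "code" stands in for the existing attribute.
df = pd.DataFrame({"code": ["P12", "S-9", "O77", "X01", None]})

# Boolean masks; na=False treats missing values as "does not contain".
is_cat1 = df["code"].str.contains("P|S", na=False)
is_cat2 = df["code"].str.contains("O", na=False)

# np.select takes the first condition that is True for each row,
# and falls back to the default when none match.
df["category"] = np.select([is_cat1, is_cat2], ["cat1", "cat2"], default="cat3")
```

Because the conditions are checked in order, a value containing both "P" and "O" would end up as cat1 in this sketch. Reorder the two lists if the other precedence is wanted.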

Related

Python : How to create a new boolean column in my dataframe if the value of another column is in a list

I have a dataframe and I want to create a new column that takes the value 1 if the value of another column is in a list and 0 otherwise. I tried this but it did not work. Thank you.
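The attempted code is not shown, so the following is only a minimal sketch of one common approach: `Series.isin` builds a boolean mask and `astype(int)` turns it into 1/0. The column names `fruit` and `in_list` and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data and lookup list.
df = pd.DataFrame({"fruit": ["apple", "kiwi", "pear", "plum"]})
wanted = ["apple", "pear"]

# isin() gives True/False per row; astype(int) converts that to 1/0.
df["in_list"] = df["fruit"].isin(wanted).astype(int)
```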

How to apply multiple functions to all sheets and then save as one excel workbook

So I imported all sheets from an Excel file using pd.read_excel('df.xlsx',sheet_name=None).
This gives me a dict of key-value pairs.
Each of these sheets contains a table. I want to make the first column the index and then insert a column from a separate dataframe that I already have.
What is the best way to approach this? Should I save all sheets into dfs individually, or is there a way to loop over the key-value pairs?
Info        | Col A | Col B
First Name  |       |
Second Name |       |
Phone       |       |
And then I wanted to insert a column in all sheets, so I was thinking I would use df.insert().
As you have a dict of key-value pairs where each value is a df, you can iterate over them: first set the index, then create a column from the other df. Use:
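for name, df in data.items():
    df = df.set_index('info')  # use the named column as the index
    df['new col'] = another_df['specified col']  # values are aligned on index labels
    data[name] = df  # store the result; rebinding df alone does not update the dict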
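To make the loop over the dict and the "save as one workbook" step concrete, here is a hedged sketch. The helper names `process_sheets` and `save_sheets`, the sheet names, and all sample values are hypothetical. The `sheets` dict stands in for the result of `pd.read_excel('df.xlsx', sheet_name=None)`. Writing the file needs an Excel engine such as openpyxl, so the save helper is defined but not called here.

```python
import pandas as pd

def process_sheets(sheets, extra, index_col, new_name):
    """Return a new dict with each sheet indexed by index_col and `extra` added."""
    out = {}
    for name, df in sheets.items():
        df = df.set_index(index_col)
        # Assigning a Series aligns on index labels; unmatched labels become NaN.
        df[new_name] = extra
        out[name] = df
    return out

def save_sheets(sheets, path):
    """Write every DataFrame to its own sheet of a single workbook."""
    with pd.ExcelWriter(path) as writer:
        for name, df in sheets.items():
            df.to_excel(writer, sheet_name=name)

# Hypothetical stand-in for pd.read_excel('df.xlsx', sheet_name=None).
labels = ["First Name", "Second Name", "Phone"]
sheets = {
    "Sheet1": pd.DataFrame({"Info": labels, "Col A": ["a1", "a2", "a3"]}),
    "Sheet2": pd.DataFrame({"Info": labels, "Col A": ["b1", "b2", "b3"]}),
}
# Hypothetical column taken from the separate dataframe, keyed by the same labels.
extra = pd.Series(["x", "y", "z"], index=labels)

processed = process_sheets(sheets, extra, "Info", "new col")
```

Calling `save_sheets(processed, 'out.xlsx')` afterwards would then write all sheets into one file (the file name is an assumption). If a specific column position matters, `df.insert(loc, column, value)` could replace the plain assignment.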

Is there a way to give column names to pd.read_clipboard(), as it is treating the first row of data as column names?

I am using the pd.read_clipboard() function to read an Excel table that doesn't have column names in its first row. The dataframe returned uses the first row of data as column labels. How can I fix that?
I would like results to be
and not this
Although it does not show up in the help for read_clipboard(), passing names, as in read_clipboard(names=['c1','c2']) where c1 and c2 are the column names, fixes this: when column names are provided, the function no longer treats the first row as column names.
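Since `read_clipboard` forwards extra keyword arguments to `read_csv`, the effect of `names` can be sketched without a clipboard by reading the same kind of text from a string. The tab-separated sample data below is made up, and `io.StringIO` is only a stand-in for the clipboard contents.

```python
import io
import pandas as pd

# Hypothetical clipboard contents: two rows of data, no header row.
raw = "1\t2\n3\t4\n"

# Without names, the first row is consumed as the header.
inferred = pd.read_csv(io.StringIO(raw), sep="\t")

# With names, every row is kept as data and the given labels are used.
named = pd.read_csv(io.StringIO(raw), sep="\t", names=["c1", "c2"])
```

Passing `header=None` instead of `names` should also keep the first row as data, with integer column labels.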

How to identify the first record in a column in PySpark

I have a dataframe with many columns.
I need to identify the first record of a column and assign it one value, and assign a different value to all other records, i.e.
if the record is the first one:
    df[price] = df[amt]
else:
    df[price] = df[amt] + df[delivery_charges]
How do I identify the first record in a column/dataframe?
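You can do this in the following way:
from pyspark.sql import Window, functions as f
# Order by 'Id' so that row number 1 marks the first record
window = Window.orderBy('Id')
df.withColumn('row', f.row_number().over(window)).withColumn('price', f.when(f.col('row') == 1, f.col('amt')).otherwise(f.col('amt') + f.col('delivery_charges'))).show()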

VLOOKUP to return multiple matches

I want to ask if there's a way/formula/VBA to return multiple values when using VLOOKUP. For example, when I look up a value that has multiple matching rows, I want it to return all of the corresponding values, not just the first. Thanks.
For something as generic as this, just use Google.
Step #1) www.google.com
Step #2) get your answer in less time than it takes you to post here.
Return MULTIPLE corresponding values for ONE Lookup Value
The Excel VLOOKUP Function searches for a value (ie. Lookup_value) in the first column of a table array and returns a value in the same row from another column in the table array. In case of multiple occurrences of the Lookup value, the function searches the first occurrence of the Lookup value, and returns the corresponding value in the same row from another column.
In case you want to return multiple corresponding values, for the one Lookup value which has multiple occurrences, we show how it can be done using INDEX, SMALL, IF & ROW excel functions, as follows.
Consider the table array ("A2:B8"), in which you want to lookup the value "Apples" in column A which has multiple occurrences, and return all corresponding values in column B.
Enter the lookup value "Apples" in cell A11. In cell B11, enter the formula below as an array formula (CTRL+SHIFT+ENTER), and copy it downward in the same column B, in 7 rows (i.e. as many rows as there are records in the table array "A2:B8"). Multiple corresponding values (of the lookup value "Apples") will get copied vertically, starting from cell B11 till B17. Refer Table 1.
=INDEX($B$2:$B$8, SMALL(IF($A$11=$A$2:$A$8, ROW($A$2:$A$8)-ROW($A$2)+1), ROW(1:1)))
http://www.globaliconnect.com/excel/index.php?option=com_content&view=article&id=119:vlookup-multiple-values-return-multiple-corresponding-values-for-one-lookup-value&catid=77&Itemid=473
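To show what the INDEX/SMALL/IF/ROW formula above computes, here is a rough Python sketch of the same steps. It is an illustration of the logic, not Excel code. The function name `lookup_all` and the cell contents are hypothetical, since the original table is not reproduced here.

```python
def lookup_all(keys, values, lookup_value):
    """Return every value whose key matches, in row order."""
    # IF(... = ..., ROW(...) - ROW(first) + 1): 1-based positions of matching rows.
    positions = [i + 1 for i, key in enumerate(keys) if key == lookup_value]
    # SMALL(positions, k) for k = 1, 2, ... walks the positions in ascending order,
    # and INDEX(values, position) fetches the value at each one.
    return [values[pos - 1] for pos in sorted(positions)]

# Hypothetical contents of A2:A8 and B2:B8.
column_a = ["Apples", "Pears", "Apples", "Plums", "Apples", "Pears", "Plums"]
column_b = [10, 20, 30, 40, 50, 60, 70]

matches = lookup_all(column_a, column_b, "Apples")
```

In the spreadsheet, rows copied beyond the number of matches show an error, because SMALL has no k-th smallest position left to return. The list here simply ends.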