pandas drop row if column contains string

I have a CSV file as follows:
message,name,userID,period,#timestamp,timediff
messagebody,Request URL,system,period_8,2021-05-10 09:21:31,1
messagebody,Request URL,system,period_9,2021-05-10 09:58:19,1
"Failed Logon for user ""user""",Logon Attempt,user,period_1,2021-05-10 08:00:22,1
"Failed Logon for user ""user""",Logon Attempt,user,period_1,2021-05-09 05:59:34,1
I am trying to check the userID column and remove all the rows that contain system.
I tried this:
f['userID'] = f[~f["userID"].str.contains("system", na=False)]
But it doesn't seem to drop the rows.
A little background on the userID column:
it is the result of merging two other columns.
f['userID'] = f[['destinationUserName','sourceUserName']].astype(str).agg(''.join,1).replace('nan','',regex=True)
f['userID'] = f[~f["userID"].str.contains("system", na=False)]
If I run my script, I get this error:
ValueError: Length mismatch: Expected axis has 239 elements, new values have 252 elements
Can anyone help me understand how to overcome this issue?
How can I target that column and remove the rows that contain a specific string?
Thank you so much for any help.
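
The attempted line assigns the filtered DataFrame back into the userID column instead of replacing the frame itself, which is the likely reason no rows disappear. A minimal sketch of the usual pattern follows, using the sample rows from the question loaded from an in-memory string (the `csv_text` and `mask` names are only for illustration):

```python
import io

import pandas as pd

# Sample data copied from the question, loaded from a string instead of a file.
csv_text = '''message,name,userID,period,#timestamp,timediff
messagebody,Request URL,system,period_8,2021-05-10 09:21:31,1
messagebody,Request URL,system,period_9,2021-05-10 09:58:19,1
"Failed Logon for user ""user""",Logon Attempt,user,period_1,2021-05-10 08:00:22,1
"Failed Logon for user ""user""",Logon Attempt,user,period_1,2021-05-09 05:59:34,1
'''
f = pd.read_csv(io.StringIO(csv_text))

# Boolean mask: True where userID contains "system"; missing values count as False.
mask = f["userID"].str.contains("system", na=False)

# Reassign the whole frame (not a single column) to keep only the other rows.
f = f[~mask].reset_index(drop=True)
print(f[["name", "userID"]])
```

The key difference from the question is the left-hand side: `f = ...` replaces the frame, while `f['userID'] = ...` tries to fit a whole filtered frame into one column.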

Related

Using to_datetime several columns names

I am working with several CSV files in which the first N columns hold general information and the next M columns (M is large) each hold information for a date.
This is the dataframe picture
I need to convert only the names of columns N+1 through N+M-1 to date format.
I tried the following. In this case N+1 = 5, whatever M is, and I assumed I could use -1 to leave the last column name untouched.
ContDiarios.columns[5:-1] = pd.to_datetime(ContDiarios.columns[5:-1])
but I get the following error:
TypeError: Index does not support mutable operations

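What you are trying is not feasible, because a pandas Index is immutable and does not support assignment to a slice. Try this instead:
def convert(x):
    try:
        return pd.to_datetime(x)
    except (ValueError, TypeError):
        # leave names that are not dates unchanged
        return x
ContDiarios.columns = [convert(c) for c in ContDiarios.columns]
You can also use the df.rename method to convert the names.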
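
If only the names in positions 5 through the second-to-last should change, as in the question, one option is to copy the names into a plain list, convert that slice, and assign the list back. This is a sketch on a made-up frame (the column names and values are hypothetical), not the asker's data:

```python
import pandas as pd

# Hypothetical frame: five information columns, two date columns, one trailing column.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 10, 20, 30]],
    columns=["a", "b", "c", "d", "e", "2021-01-01", "2021-01-02", "total"],
)

# An Index is immutable, so work on a mutable list copy of the names.
cols = list(df.columns)

# Convert only the slice of names that are dates; the rest stay as strings.
cols[5:-1] = pd.to_datetime(cols[5:-1])

# Replace the whole set of column labels in one assignment.
df.columns = cols
print(df.columns.tolist())
```

Assigning a complete new list to `df.columns` is allowed; only in-place changes to the existing Index raise the TypeError from the question.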

Pandas dataframe selection df['a'][50][:51]

I have a dataframe where one of the column names is 'a'.
I came across the following selection expression:
dataframe['a'][50][:50]
I understand dataframe['a'][50] selects the row 49 in column ['a'], but what does [:50] do?
Thank you

If dataframe['a'][50][:50] runs without an error and returns something, the value at index label 50 in column 'a' is a sequence type such as a list, string, or tuple. (Note that [50] on a Series is a label lookup, so with the default integer index it selects the row labeled 50, not row 49.)
dataframe['a'][50][:50] returns elements 0 through 49 of that value, i.e. its first 50 elements.
As noted above, if that value is not a sequence type, you will get an error. Check dataframe['a'][50] to see whether it is a sequence type.
Note: dataframe['a'][50] is chained indexing, which is not recommended. That is outside the scope of this question, so I won't go into detail.
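
To make the behaviour concrete, here is a small sketch on a made-up frame whose column 'a' holds 80-character strings (the data is purely illustrative). It also shows the `.loc` form that avoids chained indexing:

```python
import pandas as pd

# Hypothetical data: 60 rows, each cell in 'a' is an 80-character string.
df = pd.DataFrame({"a": ["x" * 80] * 60})

# Single-step label lookup: row labeled 50, column 'a'.
cell = df.loc[50, "a"]

# Ordinary Python slicing on the cell value: its first 50 characters.
first_50 = cell[:50]

# The chained form from the question yields the same result here.
chained = df["a"][50][:50]
print(len(first_50), first_50 == chained)
```

The trailing `[:50]` has nothing to do with pandas; it is a plain slice applied to whatever object the cell contains.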

Length of value issue with unique ids

I am trying to write some simple code and haven't found a simple answer for this. I am trying to assign a unique ID to each person based on when the file was last modified and their employee ID, and then add the column of unique IDs to the file.
excel1 = "Book1.xlsx"
df1 = pd.read_excel(excel1, header = 0)
time = time.strftime('%m%d%Y%H%m', time.gmtime(os.path.getmtime ("Book1.xlsx")))
unique_id=[df1["ID"] + time]
df1["CID"]=unique_id
When I try to run it, I keep getting this error:
ValueError: Length of values does not match length of index
Does anyone have an answer for this?
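
In the code above, wrapping the expression in square brackets produces a list with a single element (the whole Series), so its length no longer matches the number of rows. Two further details look unintended: `time = time.strftime(...)` overwrites the `time` module with a string, and `%m` is the month, while `%M` is the minute (that minutes were meant is an assumption). A sketch of one possible fix, using an in-memory frame and a hypothetical helper `add_cid` in place of the Excel file:

```python
import time

import pandas as pd


def add_cid(df, mtime_seconds):
    """Return a copy of df with a CID column built from ID plus a timestamp.

    mtime_seconds is a modification time in seconds since the epoch,
    for example the value returned by os.path.getmtime("Book1.xlsx").
    """
    # Store the formatted stamp under its own name so the time module is not shadowed.
    stamp = time.strftime("%m%d%Y%H%M", time.gmtime(mtime_seconds))
    out = df.copy()
    # Element-wise string concatenation; no surrounding list, so lengths match.
    out["CID"] = out["ID"].astype(str) + stamp
    return out


# Hypothetical stand-in for the contents of Book1.xlsx.
df1 = pd.DataFrame({"ID": [101, 102], "Name": ["Ann", "Bob"]})
result = add_cid(df1, 0)  # 0 = 1970-01-01 00:00 UTC, used only for the demo
print(result)
```

Casting ID to string first matters when the IDs are numeric, because adding a string to an integer column raises a TypeError.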

Pandas giving length must be equal error when trying to replace one column with another

I am trying to fill the null values in one column with the values from another column. I tried two ways:
df.NAME1 = np.where(df.NAME1.isnull(), df.NAME2, df.NAME1)
df['NAME1'] = df['NAME1'].fillna(df['NAME2'])
I get:
Lengths must be equal.
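
The question does not include the data, so the cause of the error cannot be determined from the text. Both lines do work on a frame with unique column and index labels, so checking for duplicated labels is one possible first diagnostic step; that duplicates are involved here is purely an assumption. A sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with unique column names and a unique index.
df = pd.DataFrame(
    {"NAME1": ["Ann", np.nan, "Cy"], "NAME2": ["x", "Bob", "z"]}
)

# Diagnostics worth running on the real data before the fill.
has_duplicate_columns = df.columns.duplicated().any()
has_duplicate_index = df.index.duplicated().any()
print(has_duplicate_columns, has_duplicate_index)

# Fill missing NAME1 values from NAME2, aligned on the index.
df["NAME1"] = df["NAME1"].fillna(df["NAME2"])
print(df)
```

If either check prints True on the real frame, inspecting `df.columns` and `df.index` would be a reasonable next step.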

Error exceptions.IndexError while importing in Odoo products

I'm trying to import 8500 products, so I split the CSV into files of 1000 rows each. Everything goes fine, but when I get to 2500, I get this error:
Unknown error during import: : list index out of range at row 2
name,categ_id,standard_price,list_price,Public Price,default_code,description_purchase,Main Supplier,sale_delay,taxes_id,Id. Externo,property_account_expense,route_ids/id,Acabado,product_variant_ids/attribute_line_ids/attribute_id,product_variant_ids/attribute_line_ids/value_ids
Mueble Base Encajonada con Estante Metal,Category / Subcategory,999.00,999.24,999.24,A037073000,MOBILETTO BASE SCATOLATA,Provider,35,IVA 21%,A037073000,400000080,"purchase.route_warehouse0_buy,stock.route_warehouse0_mto",A03,Color,D7
Any idea what the problem is?

This error occurs because a row contains more values than there are column labels.
For example:
If you have 16 columns, each row must supply exactly 16 values. If a row supplies 17 values, you get the "list index out of range" error.
Solution:
Recheck each row and make sure its number of values matches the number of column labels.
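
Counting fields by hand is error-prone with 16 columns and quoted values, so a short script can do the check before importing. This is a sketch using Python's standard csv module on a small made-up sample; the helper name `find_mismatched_rows` and the sample rows are hypothetical:

```python
import csv
import io


def find_mismatched_rows(csv_file):
    """Return the header width and a list of (row_number, field_count) mismatches.

    Row numbers count the header as row 1, matching how the import reports rows.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    expected = len(header)
    bad = []
    for row_no, row in enumerate(reader, start=2):
        if len(row) != expected:
            bad.append((row_no, len(row)))
    return expected, bad


# Hypothetical sample: the last row has one value too many.
sample = io.StringIO(
    "name,categ_id,list_price\n"
    "Chair,Furniture,10.0\n"
    'Desk,"Office, Home",25.0,extra\n'
)
expected, bad = find_mismatched_rows(sample)
print(expected, bad)
```

For a real file, pass an object opened with `open(path, newline="")` so the csv module handles quoted line breaks correctly.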
