Merge rows based on value (pandas to excel - xlsxwriter) - addition

Following up on this question, I would like to ask whether rows can be merged across multiple columns.
Following the answer from Dimitris Thomas, I am able to obtain the following:
I am wondering how the code could be updated to get:
Thanks

IIUC:
First, read the Excel file into pandas:
import pandas as pd
df = pd.read_excel('filename.xlsx')
Then try set_index() followed by to_excel(). With several columns in the index, to_excel() writes repeated index values as merged cells by default:
df.set_index(df.columns[:-1].tolist()).to_excel('filename.xlsx', header=False)
# OR (the data was not provided as text, so I am not sure which one will work)
df.set_index(df.columns.tolist()).to_excel('filename.xlsx', header=False)
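To make the idea above concrete, here is a minimal sketch on made-up data. The column names, the helper names `to_merged_index` and `write_merged`, and the output path are all hypothetical, since the original data was only shown as images. It relies on `merge_cells=True`, the default in `to_excel()`, which writes a MultiIndex as merged cells.

```python
import pandas as pd

# Hypothetical sample data; the question's real data was not given as text.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "B", "B"],
    "Sub": ["x", "x", "y", "z", "z"],
    "Value": [1, 2, 3, 4, 5],
})


def to_merged_index(frame, n_index_cols):
    """Move the first n_index_cols columns into a (Multi)Index."""
    return frame.set_index(frame.columns[:n_index_cols].tolist())


def write_merged(frame, path):
    """Write the frame to Excel; MultiIndex levels are written as merged cells.

    Needs an Excel engine such as openpyxl or xlsxwriter to be installed.
    """
    frame.to_excel(path, merge_cells=True)


indexed = to_merged_index(df, 2)
print(indexed)
```

Calling `write_merged(indexed, "merged.xlsx")` should then produce a sheet in which the repeated "Group" and "Sub" values span merged cells. If merging is needed in columns that cannot live in the index, writing the cells explicitly with xlsxwriter is probably the more flexible route.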

Related

read_excel only read cells formatted as table

This is how I currently import data from an Excel file in which all rows that contain data are formatted as a table. Row 13 is the header.
df = pd.read_excel('path_to_file', skiprows=12, usecols="C:T", na_values='N/A')
My question: given that I skip the rows (skiprows) and columns (usecols) that hold no information, does pandas read only down to the last cell that contains a value (currently 65,500 rows, but increasing every day), or does it always read a fixed number of rows (e.g. 1M)?
Is there any way to improve performance by reading only the necessary rows (cells with values)?
Thank you!

How to remove redundant data entered into xlsx file using pandas?

I need to remove redundant data from an xlsx file whose data was written using pandas. Any help on how to do this?
Thanks!

You can use pandas.read_excel to load your sheet as a pandas dataframe. Then use pandas.DataFrame.drop_duplicates to remove the redundant data. You can then write it as you want.
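As a rough illustration of that approach, the sketch below uses a small in-memory frame in place of the sheet. The column names and file names are hypothetical, and "redundant" is assumed here to mean exact duplicate rows.

```python
import pandas as pd

# Hypothetical data standing in for a sheet loaded with pd.read_excel("data.xlsx")
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Ann", "Cid"],
    "city": ["Oslo", "Rome", "Oslo", "Lima"],
})

# Drop rows that repeat an earlier row exactly, keeping the first occurrence
deduped = df.drop_duplicates().reset_index(drop=True)

# Alternatively, treat rows as duplicates when only some columns match
deduped_by_name = df.drop_duplicates(subset=["name"], keep="first")

print(deduped)
# The cleaned frame could then be written back with, for example,
# deduped.to_excel("clean.xlsx", index=False)
```

The `subset` argument is worth considering when rows differ only in columns that do not matter for identity, such as a timestamp.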

How can two PsychoPy .psydat files be combined into a .csv I can import into R?

I made an error while running an experiment. If PsychoPy has the correct participant codes, the two conditions are combined into one .csv per session. (If not, two .psydat files are created.) I know that .psydat is a format that contains more information than the .csv format I need, but I'm not sure how to access that information so it can be added to my dataframe and analysed in R. For example, I need to combine the data in MA22BI1.psydat and MAMA22BI1.psydat into MA22BI1.csv. I couldn't find anything remotely related in the PsychoPy forum. Thanks for any ideas!

You can regenerate a .csv file from a .psydat file using the code supplied here: psychopy.org/general/dataOutputs.html.

Reading xlsx into R and creating new header

I'm a newbie to R.
I read in an Excel file for a survey, but I started reading observations from the 3rd row, because the survey download puts the question strings in the first row (one per question) and the multiple-choice options in the second row. Each option gets its own column, except the first option, which sits in the second row of the same column as its question.
So now, my dataframe starts with Row 3.
But now I need to create custom variable names, i.e. a new variable name for each column, before I manipulate the data further. I'm looking for tips on how best to accomplish this.
What I am thinking:
Create an Excel file with the variable names, and then use this as the header. I'm not quite sure which code I would use to do this.
Code the names as an empty dataframe, and then somehow merge it so that the empty dataframe's column names become the column names of the file I imported.
I would appreciate some suggestions on how best to do this!

How to handle SSIS solution with an excel source having multiple headers

I am working on a package that requires me to import an Excel sheet with multiple headers. I am supposed to unpivot the top header and make it a row, and also unpivot some of the columns under the actual bottom header.
I have created the package using just the actual bottom columns. I then hard-coded the top header into the table using a derived column.
Here's an example of what I am looking for
Sample Raw Data and Target Result

You are trying to unpivot (reverse-pivot) the input. In SSIS you can use the Unpivot transformation.
A good walkthrough is linked below; hope this helps.
https://www.sqlshack.com/an-overview-of-ssis-pivot-and-ssis-unpivot-transformation/
KR,
Alex
