I'm new to programming and teaching myself Python 3. My wife asked me to write a script that reads data from one column of an Excel file and copies every other row into another column.
She has an Excel file of 12k rows in the form: row1=string, row2=(int/float/date/time), row3=string, row4=(int/float/date/time)...
What I did is this:
import xlrd
import xlwt
workbook = xlrd.open_workbook("MyExcel.xls")
sheet = workbook.sheet_by_index(0)
wb = xlwt.Workbook()
ws = wb.add_sheet("Sheet1")
i = 0
data = []
for i in range(sheet.nrows):
    if i % 2 == 0:
        value = sheet.cell(i, 0).value
        data.append(value)
j = 0
for j in range(len(data)):
    ws.write(j, 0, data[j])
    j += 1
wb.save("MyOutput.xls")
It works fine with strings and integers, but not with dates or times: it converts those into floats.
I just need it to copy the value of each cell "as is", with no formatting or anything, just copy and paste. If a cell has a value of "monkey/73.0:blah", I just want it copied as "monkey/73.0:blah" into the new Excel file.
Any idea how to achieve this?
PS: I know it can be achieved within Excel itself, without using Python (with INDEX and ROWS), but I'm curious how to transfer data this way as a normal copy and paste, without it, for example, turning my dates into a float if the value is 1/1/2016.
Thx
You can't do exactly what you want to, because you don't understand what Excel stores in the file.
Dates and times are stored as floating-point values. Time values are between 0 and 1.0, and date/time values are stored as larger numbers (negative dates are not handled correctly by many versions of Excel). You can't just copy the cell values "as-is" (i.e., without formatting) because formatting is the only way a cell is known to be a date or time!
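To make the storage model concrete, here is a minimal Python sketch of how such a serial number maps to a calendar date and time. It assumes the workbook uses the 1900 date system and a serial of 61 or more, which sidesteps Excel's fictitious 29 February 1900. The name `excel_serial_to_datetime` is purely illustrative and is not part of xlrd.

```python
from datetime import datetime, timedelta

# Assumption: 1900 date system, serial >= 61 (on or after 1 March 1900).
EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_datetime(serial):
    """Convert an Excel serial number to a datetime (hypothetical helper)."""
    if serial < 61:
        raise ValueError("serials below 61 need special handling")
    # The integer part counts days; the fractional part is the time of day.
    return EXCEL_EPOCH + timedelta(days=serial)


print(excel_serial_to_datetime(42370))     # 2016-01-01 00:00:00
print(excel_serial_to_datetime(42370.75))  # 2016-01-01 18:00:00
```

This is why the date 1/1/2016 from the question shows up as the float 42370.0 when the cell is read without looking at its type.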
To read the dates & times, you could replace your block of code that builds the data list with this:
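from datetime import datetime
# Cell type 3 (xlrd.XL_CELL_DATE) marks a date/time; other values pass through unchanged.
cv = lambda v, t: (v if t != xlrd.XL_CELL_DATE
                   else datetime(*(xlrd.xldate_as_tuple(v, workbook.datemode))))
# Time-only cells (value < 1.0) give a (0, 0, 0, h, m, s) tuple, which datetime() rejects.
# This starts at row index 1; use range(0, sheet.nrows, 2) to match the question's loop.
data = [cv(sheet.cell(i, 0).value, sheet.cell_type(i, 0))
        for i in range(1, sheet.nrows, 2)]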
This will give you a list of string, float, and datetime objects you can write to your output file (writing dates using xlwt is left as an exercise for the reader because I haven't done it, and it's late here & now).
I have a sample Excel document. Two of its sheets contain a long description of the document. I am writing a script to generate some dynamic Excel data, and I need to add these two static sheets to my result. What is the best way? I can only think of creating a long data list and then calling add_table, but the text is quite large, so I wonder if there is a better way.
If the long descriptions are static then you could store them in text files and add them to each workbook like this:
import xlsxwriter
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet('Desc 1')
textfile = open('description_1.txt')
# Write the first description worksheet.
row = col = 0
for line in textfile:
    worksheet.write(row, col, line.rstrip("\n"))
    row += 1
worksheet = workbook.add_worksheet('Desc 2')
textfile = open('description_2.txt')
# Write the second description worksheet.
row = col = 0
for line in textfile:
    worksheet.write(row, col, line.rstrip("\n"))
    row += 1
# Now add a new worksheet and add new data.
worksheet = workbook.add_worksheet()
worksheet.write('A1', 'Hello')
workbook.close()
The repeated code could be put into a function. If you are reading non ASCII text then use the correct encoding. See the examples in the docs.
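As a rough sketch of that refactoring, the loop could be wrapped like this. The function name `write_text_lines` and the `encoding` and `col` parameters are my own choices, not part of XlsxWriter, and UTF-8 is only an assumed default. The function relies solely on the worksheet having a `write(row, col, value)` method, as in the code above.

```python
def write_text_lines(worksheet, path, encoding="utf-8", col=0):
    """Write each line of a text file into successive rows of one column.

    `worksheet` is any object with a write(row, col, value) method, such as
    the object returned by workbook.add_worksheet(). Returns the row count.
    """
    rows_written = 0
    # The with-block closes the file even if a write fails part-way through.
    with open(path, encoding=encoding) as textfile:
        for row, line in enumerate(textfile):
            worksheet.write(row, col, line.rstrip("\n"))
            rows_written = row + 1
    return rows_written
```

Each description sheet then becomes a single call, for example `write_text_lines(workbook.add_worksheet('Desc 1'), 'description_1.txt')`.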
I want to get a list of column headers for each cell that contains a text value.
Eg.
     A          B          C          BC (desired output)
1    Header1    Header2    Header3
2    M                     T          Header1, Header3
3    T          MT                    Header1, Header2
4               TMW                   Header2
In the final product I want to use two final columns with formulas listing headers from cells with values across 9 columns and a second with the other 40 odd columns.
I have the vague notion that I might need to use INDEX, MATCH and IF functions - but as a novice have no idea how to string them together coherently.
Here I will make use of VBA's Join function. VBA functions aren't directly available in Excel, so I wrap Join in a user-defined function that exposes the same functionality:
Function JoinXL(arr As Variant, Optional delimiter As String = " ")
    JoinXL = Join(arr, delimiter)
End Function
The formula in D2 is:
=JoinXL(IF(NOT(ISBLANK(A2:C2)),$A$1:$C$1&", ",""),"")
entered as an array formula (using Ctrl-Shift-Enter). It is then copied down.
Explanation:
NOT(ISBLANK(A2:C2)) detects which cells have text in them; returns this array for row 2: {TRUE,FALSE,TRUE}
IF(NOT(ISBLANK(A2:C2)),$A$1:$C$1&", ","") converts those boolean values to row 1 contents followed by a comma delimiter; returns the array {"Header1, ","","Header3, "}.
JoinXL joins the contents of that array into a single string.
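The three steps above can be mirrored in a few lines of Python, which may make the intermediate arrays easier to see. This is only an illustration of the logic, using row 2 of the question's example with an empty string standing in for a blank cell; it is not a replacement for the formula.

```python
headers = ["Header1", "Header2", "Header3"]
row2 = ["M", "", "T"]  # row 2 from the question; "" stands for a blank cell

# NOT(ISBLANK(A2:C2)) -> one boolean per cell
has_text = [cell != "" for cell in row2]

# IF(..., $A$1:$C$1 & ", ", "") -> header plus delimiter, or an empty string
pieces = [(header + ", ") if flag else "" for header, flag in zip(headers, has_text)]

# JoinXL(..., "") -> concatenate everything with an empty delimiter
joined = "".join(pieces)

print(has_text)      # [True, False, True]
print(pieces)        # ['Header1, ', '', 'Header3, ']
print(repr(joined))  # 'Header1, Header3, '
```

Note that, as written, the joined string ends with a trailing ", " that would need trimming if it matters.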
If you want to use worksheet functions, and not VBA, I suggest returning each column header in a separate cell. You can do this by entering a formula such as:
This formula must be array-entered:
BC: =IFERROR(INDEX($A$1:$C$1,1,SMALL((LEN($A2:$C2)>0)*COLUMN($A2:$C2),COUNTBLANK($A2:$C2)+COLUMNS($A:A))),"")
Adjust the range references A:C to reflect the columns actually used for your data. Be sure to use the same mixed address format as above. Do NOT change the $A:A reference, however.
Then fill right until you get blanks; and fill down as far as required.
You can reverse the logic to get a list of the "other" headers.
To array-enter a formula, after entering the formula into the cell or formula bar, hold down ctrl-shift while hitting enter. If you did this correctly, Excel will place braces {...} around the formula.
If you really need to have the results as comma-separated values in two different columns, I would suggest the following User Defined Function.
To enter this User Defined Function (UDF), alt-F11 opens the Visual Basic Editor. Ensure your project is highlighted in the Project Explorer window. Then, from the top menu, select Insert/Module and paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like
=Headers($A2:$BA2,$A$1:$BA$1,True)
or, to get the headers that do NOT contain text:
=Headers($A2:$BA2,$A$1:$BA$1,FALSE)
in some cell.
=====================================================
Option Explicit
Function Headers(rData As Range, rHeaders As Range, Optional bTextPresent As Boolean = True) As String
    Dim colHeaders As Collection
    Dim vData, vHeaders
    Const sDelimiter As String = ", "
    Dim sRes() As String
    Dim I As Long
    ' Read both ranges into arrays in one go
    vData = rData
    vHeaders = rHeaders
    Set colHeaders = New Collection
    ' Collect the header of every column whose "has text" state matches bTextPresent
    For I = 1 To UBound(vData, 2)
        If (Len(vData(1, I)) > 0) = bTextPresent Then colHeaders.Add vHeaders(1, I)
    Next I
    ' Nothing matched: return an empty string instead of failing on ReDim sRes(1 To 0)
    If colHeaders.Count = 0 Then Exit Function
    ReDim sRes(1 To colHeaders.Count)
    For I = 1 To colHeaders.Count
        sRes(I) = colHeaders(I)
    Next I
    Headers = Join(sRes, sDelimiter)
End Function
==========================================
You should probably add some logic to the routine to ensure your range arguments are a single row, and that the two arguments are of the same size.
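For illustration, here is a small Python analogue of the UDF with the size check included. It is a sketch of the logic only: the name `headers_for_row` and its parameters are hypothetical, a flat list stands in for a single-row range (so the single-row check is implicit), and `None` or an empty string is treated as a blank cell.

```python
def headers_for_row(data_row, header_row, text_present=True, delimiter=", "):
    """Join the headers of the columns that do (or do not) contain a value."""
    # The check suggested above: both arguments must be the same size.
    if len(data_row) != len(header_row):
        raise ValueError("data_row and header_row must be the same length")
    selected = [
        str(header)
        for value, header in zip(data_row, header_row)
        # Treat None and "" as blank, like Len(cell) = 0 in the VBA version.
        if (value not in (None, "")) == text_present
    ]
    return delimiter.join(selected)


headers = ["Header1", "Header2", "Header3"]
print(headers_for_row(["T", "MT", ""], headers))         # Header1, Header2
print(headers_for_row(["T", "MT", ""], headers, False))  # Header3
```

A row with no matching columns simply yields an empty string.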
I am looking to enable Word to save with a file name using data contained within the document.
At the top of the document (an airline release letter), there is a table containing 2 columns with 3 rows containing alpha in one column and alpha-numeric data in column 2.
Column 1,
Cell 1: AETC; Cell 2: MAWB; Cell 3: HAWB
Column 2,
Cell 1: 80123; Cell 2, 0161234567; Cell 3: 00112345678
Basically, the first column will be the static labels for the variable data to be entered into column 2.
From all this, I want to generate a save-as file name: AETC80123_MAWB0161234567_HAWB00112345678_ReleaseLetter.doc
I've barely scratched the surface of VBA as I am more an operations supervisor than a techie so I'm not certain if this is even possible.
Any help/direction/copy-paste coding (if it's super easy and of little trouble) would be awesome!
Thanks!
I am not going to work it all out in detail for you, nor is this tested at all (just written off the top of my head), but this should give a hint how you read cell content in a Word document:
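' Set tbl to the first table in the document (Word collections are 1-based)
Dim tbl As Table
Set tbl = ActiveDocument.Tables(1)
Dim r As Integer
Dim c As Integer
Dim cellText As String
Dim filename As String
filename = ""
For r = 1 To tbl.Rows.Count
    For c = 1 To tbl.Columns.Count
        ' Get the text in the cell
        cellText = tbl.Cell(r, c).Range.Text
        ' Strip the two-character end-of-cell marker (Chr(13) & Chr(7))
        cellText = Left(cellText, Len(cellText) - 2)
        ' Append label and value without a separator
        filename = filename & cellText
    Next c
    ' Separate the rows with an underscore
    filename = filename & "_"
Next r
' Finish with the fixed part of the name from the question
filename = filename & "ReleaseLetter.doc"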
Finally, save your document using
ActiveDocument.SaveAs FileName:=filename
See Microsoft's documentation for more information about the SaveAs parameters.
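The string handling itself is easy to check outside Word. The following Python sketch builds the requested file name from the three label/value rows in the question. The function name `build_filename` is hypothetical, and the sample rows carry the end-of-cell marker that Word appends to each cell's text so that the stripping step is exercised.

```python
CELL_END = "\r\x07"  # Word ends each cell's text with Chr(13) & Chr(7)


def build_filename(rows, suffix="ReleaseLetter.doc"):
    """Join label+value pairs with underscores and append the fixed suffix."""
    parts = []
    for label, value in rows:
        # Remove the end-of-cell marker and stray whitespace from both cells.
        label = label.replace(CELL_END, "").strip()
        value = value.replace(CELL_END, "").strip()
        parts.append(label + value)
    parts.append(suffix)
    return "_".join(parts)


rows = [
    ("AETC\r\x07", "80123\r\x07"),
    ("MAWB\r\x07", "0161234567\r\x07"),
    ("HAWB\r\x07", "00112345678\r\x07"),
]
print(build_filename(rows))
# AETC80123_MAWB0161234567_HAWB00112345678_ReleaseLetter.doc
```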
I have an excel book that has two sheets: 1) Import 2) Pricing Rules.
Pricing Rules Sheet
The A column is what I need to match on. Example values include STA_PNP4, STA_PST.. and others. There are potentially around 50 different rows in the sheet, and it will continue to grow over time. Then for each row, there are pricing values in columns B to CF.
Import Sheet
This sheet has the same number of columns, but only Column A is filled out. Example values include STA_PNP4_001_00, STA_PNP4_007_00, STA_PST_010_00.. and many more.
What I need to do:
If the text in Import Sheet Column A before the second "_" matches the column identifier in Pricing Rules Sheet Column A, copy the rest of B to CF of Pricing Rules sheet for that row into the Import sheet for the row it matched on.
Any idea on where to begin with this one?
Why don't you do it using formulas only?
Assuming :
1.) Data in Import Sheet is
(col A)
STA_PNP4_007_00
STA_PNP4_001_00
STA_PNP4_001_00
.
.
2.) Data in Pricing Rules Sheet
(Col A) (col B) (ColC) (Col D) .......
STA_PNP4 1 2 3 .....
STA_PST 4 5 6 .....
STA_ASA2 7 8 9 .....
Then write this formula in B1 cell of Import Sheet
=IFERROR(VLOOKUP(LEFT(A1,FIND("_",A1,FIND("_",A1)+1)-1),PricingRules!$A$1:$CF$100,2,0),"")
Drag it down in column B.
For columns C, D and so on, just change the column index number from 2 to 3 for C, 4 for D, and so on.
Because it will continue to grow over time, you may be best off using VBA. However, even with code I would start by applying the ‘groups’ via formula, so as not to have a spreadsheet overburdened with formulae and hence potentially slow and easy to corrupt. Something like part of #xtremeExcel’s solution, which I repeat here because the underscores were originally treated as formatting commands in that answer:
=LEFT(A1,FIND("_",A1,1+FIND("_",A1))-1)
I’d envisage this (copied down) as an additional column in your Import Sheet - to serve as a key field to link to your Pricing Rules Sheet. Say on the extreme left so available for use by VLOOKUP across the entire sheet.
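To show what that key field evaluates to, here is the same "text before the second underscore" rule as a short Python sketch. The function name is made up for the example, and returning `None` when there are fewer than two underscores is my own choice (the worksheet formula would return an error in that case).

```python
def key_before_second_underscore(code):
    """Return the text before the second "_", or None if there is no second "_"."""
    parts = code.split("_")
    if len(parts) < 3:
        return None
    return "_".join(parts[:2])


print(key_before_second_underscore("STA_PNP4_001_00"))  # STA_PNP4
print(key_before_second_underscore("STA_PST_010_00"))   # STA_PST
```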
With that as a key field then either:
1. Write the code to populate Pricing Rules Sheet as frequently as run/desired. Either populating ‘from scratch’ each time (perhaps best for low volumes) or incrementally (likely advisable for high volumes).
2. Use VLOOKUP (as suggested). However with at least 84 columns and, presumably, many more than 50 rows that is a lot of formulae, though may be viable as a temporary ‘once off’ solution (ie after population Copy/Paste Special/Values).
3. A compromise. As 2. But preserve a row or a cell with the appropriate formulae/a and copy that to populate the other columns for your additions to your ColumnA and/or ColumnA:B.
Thanks for the input guys.
I got it implemented via a method like this:
{=VLOOKUP(LEFT($A4,7),PricingRules!A3:CF112,{2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84},FALSE)}
That is my ugly function, applied across a whole row, to look up and copy from my pricing rules every column when it finds a match.
Below is the function that I have created for the above scenario. It's working as per the requirement that you mentioned.
Sub CopyData()
    Dim wb As Workbook
    Dim importws As Worksheet
    Dim PricingRulesws As Worksheet
    Dim Pricingrowcount As Long
    Dim importRowCount As Long
    Dim FindValue As String
    Dim textvalue As String
    Dim columncount As Integer
    Dim stringarray() As String
    'Enter full address of your file ex: "C:\newfolder\datafile.xlsx"
    Set wb = Workbooks.Open("C:\newfolder\datafile.xlsx")
    'Enter the name of your "import" sheet
    Set importws = wb.Sheets("Import")
    'Enter the name of your "Pricing" sheet
    Set PricingRulesws = wb.Sheets("PricingRules")
    For Pricingrowcount = 1 To PricingRulesws.UsedRange.Rows.Count
        FindValue = PricingRulesws.Cells(Pricingrowcount, 1)
        For importRowCount = 1 To importws.UsedRange.Rows.Count
            textvalue = importws.Cells(importRowCount, 1)
            stringarray = Split(textvalue, "_")
            'Skip cells without an underscore (blank or header cells) to avoid a subscript error
            If UBound(stringarray) >= 1 Then
                textvalue = stringarray(0) & "_" & stringarray(1)
                If FindValue = textvalue Then
                    'Copy columns B onwards from the matching pricing row
                    For columncount = 2 To PricingRulesws.UsedRange.Columns.Count
                        importws.Cells(importRowCount, columncount) = PricingRulesws.Cells(Pricingrowcount, columncount)
                    Next columncount
                End If
            End If
        Next importRowCount
    Next Pricingrowcount
End Sub
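The macro compares every pricing row with every import row. As a sketch of an alternative under simplified assumptions, the same matching can be done in one pass by indexing the pricing rules by key first. The Python below works on plain lists and a dictionary rather than on a workbook, and the name `fill_import_rows` and the sample values are illustrative only.

```python
def fill_import_rows(import_codes, pricing_rules):
    """Return one row per import code: the code followed by its pricing values.

    import_codes  -- list of column-A strings from the Import sheet
    pricing_rules -- dict mapping a key such as "STA_PNP4" to its list of
                     values (columns B onwards of the pricing sheet)
    """
    filled = []
    for code in import_codes:
        parts = code.split("_")
        # Same key as the macro: the first two underscore-separated parts.
        key = "_".join(parts[:2]) if len(parts) >= 2 else None
        # Codes without a matching rule keep only their own value.
        filled.append([code] + list(pricing_rules.get(key, [])))
    return filled


pricing = {"STA_PNP4": [1, 2, 3], "STA_PST": [4, 5, 6]}
codes = ["STA_PNP4_007_00", "STA_PST_010_00", "OTHER"]
for row in fill_import_rows(codes, pricing):
    print(row)
# ['STA_PNP4_007_00', 1, 2, 3]
# ['STA_PST_010_00', 4, 5, 6]
# ['OTHER']
```

The dictionary lookup means each import row is examined once, however many pricing rules are added later.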