With Excel VBA Create Multiple Rows From One Row - vba

I have an Excel sheet (InData) which has individual rows of data by unique "ID NUMBER". Each ID Number may have multiple "deductions" and "benefits" contained in the one row. But I need to convert the single row of data into multiple rows by ID Number and write the results into a new sheet (OutData).
I tried to attach my sample Excel file but can't find way to do it. So attached sample images for InData and OutData.
This is InData...
This is OuData...
Below is code I'm using.
Option Explicit
'Found original VBA here...
'http://stackoverflow.com/questions/3698442/convert-row-with-columns-of-data-into-column-with-multiple-rows-in-excel
Sub reOrgV2_New(inSource As Range, inTarget As Range)
'' This version works directly on the worksheet
'' and transfers the result directly to the target
'' given as the top-left cell of the result.
Dim resNames()
Dim propNum As Integer
Dim srcRows As Integer
Dim resRows As Integer
Dim i As Integer
Dim j As Integer
Dim g As Integer
'' Shape the result
resNames = Array("Deduction Desc", "Deduction Amount", "Deduction Start Date", "Deduction Stop Date", _
"Benefit Desc", "Benefit Amount", "Benefit Start Date", "Benefit Stop Date")
propNum = 1 + UBound(resNames)
'' Row counts
srcRows = inSource.Rows.Count
resRows = srcRows * propNum
'' re-org and transfer source to result range
inTarget = inTarget.Resize(resRows, 7)
g = 1
For i = 1 To srcRows
For j = 0 To 7
inTarget.Item(g + j, 1) = inSource.Item(i, 1) '' ID NUMBER
inTarget.Item(g + j, 2) = inSource.Item(i, 2) '' LAST NAME
inTarget.Item(g + j, 3) = inSource.Item(i, 3) '' FIRST NAME
inTarget.Item(g + j, 4) = resNames(j) '' Column Type
inTarget.Item(g + j, 5) = inSource.Item(i, j + 4) '' Value
Next j
g = g + propNum
Next i
End Sub
'' Call ReOrgV2_New with input and output ranges
Sub ReOrg_New()
Dim ws As Worksheet
Dim i As Integer
i = Range("InData!A:A").Find("").Row - 2
reOrgV2_New Range("InData!A2").Resize(i, 7), [OutData!A2]
With Sheets("OutData")
'We select the sheet so we can change the window view
.Select
'' apply column headings and autofit/align
.Range("A1:E1").Value = Array("ID NUMBER", "LAST NAME", "FIRST NAME", "Column Type", "Value")
.Columns("A:E").EntireColumn.AutoFit
.Columns("E:E").HorizontalAlignment = xlRight
End With
End Sub

Pertinent to your task definition, it seems that you can achieve the result simply by deletion of the unnecessary Worksheet Columns, which could be performed as, for example: Columns("H").Delete, or Columns(7).EntireColumn.Delete and so on (see the following sample VBA code snippet):
Sub DeleteColumns()
'delete columns
Columns("AR:AU").Delete
Columns("H:AL").Delete
' re-arrange columns order
Columns("D").Cut
Columns("F").Insert Shift:=xlToRight
End Sub
Then you can just re-arrange the order of residual data columns.
Hope this may help.

Related

VBA - to compare 2 tables from 2 sheets, and get result in another sheets creating comparison table

It is above my possibilities.
I know how to create Vba code with Vlookup or Hlookup for single comparision. However whatIi am trying is beyond my knowledge.
I need to compare 2 tables. 1st is requirements where 2nd is DB extract.
Both tables contains same "Action" column and other columns with Employer role as a header. Values from Action column for both tables are the same however in different order ( Those values need to act as primary key).
Columns with Employer role as a header - same header value for both tables - however columns in different order.
Amount of columns with Employer role as a header is not constants and it gets change every time I get this files. Those columns in extract are in different order than in requirements.
Amount of values from "Action" columns ( primary key) also not constants and change every time I receive files. So I cannot set specific range.
Example of Requirements table
Example of Extract table
Example of what is expected
New target worksheet need to be created where Comparison table will be created.
VBA to create Comparison table in newly created worksheet.
This table should have "Action" column + all columns with Employers role as header form requirements + all columns with Employers role as header form extract set in same order like columns in requirements + comparison table which compare values between Employers roles from Requirements and Extract and show values YES or NO
Try the next code, please. It will return in a newly created sheet (after the last existing). The new file is named "New Sheet". If it exists, it is cleared and reused:
Sub testMatchTables()
Dim sh As Worksheet, sh1 As Worksheet, shNew As Worksheet
Dim tbl1 As ListObject, tbl2 As ListObject, rightH As Long
Dim arrR, arrE, arrH1, arrH2, arrFin, i As Long, j As Long, k As Long
Dim first As Long, sec As Long, refFirst As Long, refSec As Long
Set sh = Set sh = Worksheets("Requirements")'use here the sheet keeping the first table
Set sh1 = Worksheets("Extract Table") 'use here your appropriate sheet
Set tbl1 = sh.ListObjects(1) 'use here your first table name (instead of 1)
Set tbl2 = sh1.ListObjects(1) 'use here your second table name (instead of 1)
arrR = tbl1.Range.value 'put the table range in an array
arrE = tbl2.Range.value 'put the table range in an array
'working with arrays will condiderably increase the processing speed
ReDim arrFin(1 To UBound(arrR), 1 To UBound(arrR, 2) * 3 + 2) 'redim the array to keep the processing result
'UBound is a property telling the number of array elements
arrH1 = Application.Index(arrR, 1, 0) 'make a slice in the array (1D array), the first row, which keeps the headers
arrH2 = Application.Index(arrE, 1, 0) 'make a slice in the array (1D array), the first row, which keeps the headers
'build the column headers:
For i = 1 To UBound(arrFin, 2)
If i <= UBound(arrH1) Then 'firstly the headers of the first table are filled in the final array
arrFin(1, i) = arrH1(i)
ElseIf refSec = 0 Then 'refSec is the column where a blanck column will exist
first = first + 1 'the code incrementes this variable to allow making empty only for the following row
If first = 1 Then
arrFin(1, i) = Empty: refFirst = i 'make the empty column between the two tables data and create a reference
'to be decreated from the already incremented i variable
Else
arrFin(1, i) = arrH1(i - refFirst) 'place each header column values
If i - refFirst = UBound(arrH1) Then refSec = i + 1 'when the code reaches the end of the first array
'it creates a reference for referencing the second time
End If
Else
sec = sec + 1 'the same philosophy as above, to create the second empty column
If sec = 1 Then
arrFin(1, i) = Empty 'create the empty column (for each processed row)
Else
arrFin(1, i) = arrH1(i - refSec) 'fill the header columns
End If
End If
Next
Dim C As Long, r As Long, eT As Long, T As Long
eT = UBound(arrR) 'mark the ending of the first array (where to be the first empty column)
T = UBound(arrR, 2) * 2 + 2 'mark the begining of the third final array part
'after the second empty column
For i = 2 To UBound(arrR) 'iterating between the first array rows
For j = 2 To UBound(arrE) 'iterating the second array rows
If arrR(i, 1) = arrE(j, 1) Then 'if the both arrays first column matches
arrFin(i, 1) = arrR(i, 1): arrFin(i, T + 1) = arrR(i, 1) 'put the Action values in the first area columns
arrFin(i, eT) = arrR(i, 1) 'put the Action values in the last area column
For C = 2 To UBound(arrR, 2) 'iterate between the array columns
rightH = Application.match(arrR(1, C), arrH2, 0) 'find the match of the first array header in the second one
arrFin(i, C) = arrR(i, C): arrFin(i, C + eT - 1) = arrE(j, rightH) 'place the matching header in the final array
If arrR(i, C) = arrE(j, rightH) Then
arrFin(i, T + C) = "TRUE" 'place 'TRUE' in case of matching
Else
arrFin(i, T + C) = "FALSE" 'place 'FALSE' in case of NOT matching
End If
Next C
End If
Next j
Next i
On Error Resume Next 'necessary to return an error if worksheet "New Sheet" does not exist
Set shNew = Worksheets("New Sheet")
If err.Number = 9 Then 'if it raises error number 9, this means that the sheet does not exist
err.Clear: On Error GoTo 0 'clear the error and make the code to return other errors, if any
Set shNew = Worksheets.Add(After:=Worksheets(Worksheets.count)) 'set shNew as new inserted sheet
shNew.name = "New Sheet" 'name the newly inserted sheet
Else
shNew.cells.Clear: On Error GoTo 0 ' in case of sheet exists, it is clear and the code is made to return errors
End If
'set the range where the final array to drop its values:
With shNew.Range("A1").Resize(UBound(arrFin), UBound(arrFin, 2))
.value = arrFin 'drop the array content
.EntireColumn.AutoFit 'AutoFit the involved columns
End With
End Sub
Please, test it and send some feedback.
Edited:
I commented the code as detailed I could. If still something unclear, please do not hesitate to ask for clarifications.

Copy and paste information based on matching IDs where one sheet has rows in the pivot table

I have a code that allows me to copy and paste thousands of rows of information based on matching IDs. However the code does not seem to run well in a pivot table. In sheet 4, the IDs are put into a pivot table while in sheet 1 the IDs and the information are not in pivot table (Both IDs in sheet 4 and 1 are in the same column which is column A). However, the IDs appeared more than once in sheet 1. Thus, when i try to run the code, it gave an error that said Cannot enter a null value as an item or field name in pivot table report" on the line 'rngTracker.Value = arrT found below.
Sub Sample()
Dim rngTracker As Range
Dim rngMaster As Range
Dim arrT, arrM
Dim dict As Object, r As Long, tmp
With Workbooks("FAST_Aug2015_Segment_Out_V1.xlsm")
Set rngTracker = .Sheets("Sheet4").Range("A5:D43000")
Set rngMaster = .Sheets("Sheet1").Range("A2:C200000")
End With
'get values in arrays
arrT = rngTracker.Value
arrM = rngMaster.Value
'load the dictionary
Set dict = CreateObject("scripting.dictionary")
For r = 1 To UBound(arrT, 1)
dict(arrT(r, 1)) = r
Next r
'map between the two arrays using the dictionary
For r = 1 To UBound(arrM, 1)
tmp = arrM(r, 1)
If dict.exists(tmp) Then
arrT(dict(tmp), 4) = arrM(r, 3)
End If
Next r
rngTracker.Value = arrT 'Error shown on this line'
End Sub
Above is the code that i have and gave error as mention above. Would appreciate any help. Thank you. :) Below is the image of the pivot table in sheet 4. The column header called "Acc Seg" is not part of the pivot table but it is where the data will be paste from sheet 1 when both IDs in sheet 4 and sheet 1 matched.
Option Explicit
Public Sub Sample()
Const T As Long = 43000
Const M As Long = 200000
Dim arrT1 As Variant, arrM1 As Variant, rngT2 As Range
Dim arrT2 As Variant, arrM2 As Variant, dict As Object, r As Long
With Workbooks("TEST2.xlsm") 'get values in arrays
Set rngT2 = .Sheets("Sheet4").Range("D5:D" & T)
arrM1 = .Sheets("Sheet1").Range("A2:A" & M)
arrM2 = .Sheets("Sheet1").Range("C2:C" & M)
arrT1 = .Sheets("Sheet4").Range("A5:A" & T)
arrT2 = rngT2
End With
Set dict = CreateObject("Scripting.Dictionary")
For r = 1 To UBound(arrT1) 'load the dictionary
dict(arrT1(r, 1)) = r
Next r
For r = 1 To UBound(arrM1, 1) 'map between the arrays using the dictionary
If dict.exists(arrM1(r, 1)) Then arrT2(dict(arrM1(r, 1)), 1) = arrM2(r, 1)
Next r
rngT2 = arrT2
End Sub

VBA code for Extracting Symbols like "&&","&&-","&-" and numbers into different columns

I am having a sheet which contains range of values like "5670&&2","1281&&-3&-5&&7",... etc. in Column A.
Kindly help me to extract the output in VBA in following way:
For E.g 5670&&2 I require A1 cell contains 5670,B1 cell contains &&,C1 cell contains 2.
For E.g 1281&&-3&-5&&7,I would require that A1 cell contains 1281,B1 cell contains &&-,C1 cell contains 3,D1 cell contains &-,E1 cell contains 5,F1 cell contains && and G1 cell contains 7.
Pls help in the same .
Thanks.,
Here i have tried to write code to separate numbers from non-numbers. Numbers and non-numbers are copied to different columns, like Excel Text-To-Columns. Code is a little crazy, if u need i will provide comments. As input the ActiveSheet.UsedRange.Columns(1).Cells is used.
Option Explicit
Sub SeparateNumbers()
Dim targetRange As Range
Dim cellRange As Range
Dim charIndex As Integer
Dim oneChar As String
Dim nextChar As String
Dim start As Integer
Dim copiedCharsCount As Integer
Dim cellValue As String
Dim columnIndex As Integer
Set targetRange = ActiveSheet.UsedRange.Columns(1).Cells
For Each cellRange In targetRange
columnIndex = cellRange.Column
start = 1
copiedCharsCount = 0
cellValue = cellRange.Value
If (VBA.Strings.Len(cellValue) <= 1) Then GoTo nextCell
For charIndex = 2 To Len(cellValue)
oneChar = VBA.Strings.Mid(cellValue, charIndex - 1, 1)
nextChar = VBA.Strings.Mid(cellValue, charIndex, 1)
If VBA.IsNumeric(oneChar) And VBA.IsNumeric(nextChar) Then GoTo nextCharLabel
If Not VBA.IsNumeric(oneChar) And Not VBA.IsNumeric(nextChar) Then GoTo nextCharLabel
cellRange.Offset(0, columnIndex).Value = VBA.Strings.Mid(cellValue, start, charIndex - start)
columnIndex = columnIndex + 1
copiedCharsCount = copiedCharsCount + (charIndex - start)
start = charIndex
nextCharLabel:
If charIndex = Len(cellValue) Then
cellRange.Offset(0, columnIndex).Value = VBA.Strings.Right(cellValue, charIndex - copiedCharsCount)
End If
Next charIndex
nextCell:
Next cellRange
End Sub
Here is one more code. As a side product, function TextSplitToNumbersAndOther can be used independently as a formula to achieve the same effect.
To prevent accidental firing of the macro in a wrong sheet or a wrong column and overwriting neighbouring columns with scrap, named range "Start_point" should be defined by a user. Below this range in the same column, all data will be processed till the first blank row.
Spreadsheet example: http://www.bumpclub.ee/~jyri_r/Excel/Extracting_symbols_into_columns.xls
Option Explicit
Sub ExtractSymbolsIntoColumns()
Dim rng As Range
Dim row_processed As Integer
Dim string_to_split As String
Dim columns_needed As Long
Dim counter As Long
row_processed = 1
counter = 0
Set rng = Range("Start_point")
While rng.Offset(row_processed, 0).Value <> ""
string_to_split = rng.Offset(row_processed, 0).Value
columns_needed = TextSplitToNumbersAndOther(string_to_split)
For counter = 1 To columns_needed
rng.Offset(row_processed, counter).Value = _
TextSplitToNumbersAndOther(string_to_split, counter)
Next
row_processed = row_processed + 1
Wend
End Sub
Function TextSplitToNumbersAndOther(InputText As String, _
Optional SplitPieceNumber As Long) As Variant
Dim piece_from_split(100) As Variant
Dim char_from_input As String
Dim word_count As Long
Dim counter As Long
Dim char_type(100) As Variant
InputText = Trim(InputText)
If Not IsNull(InputText) Then
word_count = 1
piece_from_split(word_count) = ""
For counter = 1 To Len(InputText)
char_from_input = CharFromTextPosition(InputText, counter)
char_type(counter) = CharTypeAsNumber(char_from_input)
If counter = 1 Then
piece_from_split(word_count) = char_from_input
Else
If (char_type(counter - 1) = char_type(counter)) Then
piece_from_split(word_count) = piece_from_split(word_count) & char_from_input
'Merge for the same type
Else
word_count = word_count + 1
piece_from_split(word_count) = char_from_input
End If
End If
Next
End If
If SplitPieceNumber = 0 Then
TextSplitToNumbersAndOther = word_count
Else
If SplitPieceNumber > word_count Then
TextSplitToNumbersAndOther = ""
Else
TextSplitToNumbersAndOther = piece_from_split(SplitPieceNumber)
End If
End If
End Function
Function CharTypeAsNumber(InputChar As String, Optional PositionInString As Long) As Long
If PositionInString = 0 Then PositionInString = 1
If Not IsNull(InputChar) Then
InputChar = Mid(InputChar, PositionInString, 1)
Select Case InputChar
Case 0 To 9
CharTypeAsNumber = 1
Case "a" To "z"
CharTypeAsNumber = 2
Case "A" To "Z"
CharTypeAsNumber = 3
Case Else
CharTypeAsNumber = 4
End Select
Else
CharTypeAsNumber = 0
End If
End Function
Function CharFromTextPosition(InputString As String, TextPosition As Long) As String
CharFromTextPosition = Mid(InputString, TextPosition, 1)
End Function
You can write a UDF (user defined function) to achieve the objective.
Your two example are in an order (ascending) to filter out into adjacent columns in Excel (A, B, C, D...)
So is it correct to assume logically, that you will never have scenarios where you will have to break the string into non-adjacent columns? e.g. 1234 goes to A, && goes to C, 3 goes to D... resulting in A, C, D.
Asumption 2: That your splitted-string is not going to need columns more than Excel can provide.
Steps you may try:
1. Check your string is not empty
2. Split it by the characters other than numerics
3. At the start and end of each non-numeric character you may proceed to the next adjacent column.
search help: Split a string into multiple columns in Excel - VBA

macro extract data block

I have worked on this problem for my entire day and can't solve it.
The input data consists of several data blocks with the same number of rows and columns. Each data block has its name in the first line within the block. Besides, they are further separated by a blank row.
block1
name score value
a 2 3
b 3 5
c 1 6
block2
name score value
a 4 6
b 7 8
c 2 6
block3
name score value
a 5 4
b 7 8
c 2 9
The desired output is to extract the name and value column of each block, and then parallel them in columns. Like this:
value block1 block2 block3
a 3 6 4
b 5 8 8
c 6 6 9
Thanks for your help!
UPDATE
Thanks for your answer, Tony, and others!
I just have another requirement. It is possible that some row in some tables are missing. In other words, as you mentioned previously, the row number may vary. Is it possible to fill in the corresponding cell in these tables with NA? i.e. the new input is like:
block1
name score value
a 2 3
c 1 6
block2
name score value
a 4 6
b 7 8
c 2 6
block3
name score value
a 5 4
b 7 8
The desired output now is like this:
value block1 block2 block3
a 3 6 4
b NA 8 8
c 6 6 NA
UPDATE on Jul.3 (If it's inappropriate to make the question too long, I will move this part and make it a new question)
block1
name score value
a 2 3
b 3 5
c 1 6
block2
name score value
a 4 6
b 7 8
c 2 6
block3
name score value
a 5 4
b 7 8
c 2 9
How can I pull both the value and its corresponding score and put them into one cell? Like this: The code indicates that the value is put into an dynamic array. Then the .range is assigned to this array. My first thought is to construct another array to store the value of the "score" column. Then loop through each element in both array, and concatenate them together. However, it seems that VBA does allow me to loop through the array, since its dimension is not defined. I tried REDIM, but it did not work.
value block1 block2 block3
a 3(2) 6(4) 4(5)
b 5(3) 8(7) 8(7)
c 6(1) 6(2) 9(2)
First answer - introduction to issues and request for clarification
This is not a solution - you do not give enough information for a solution - but introduces the issues and possible techniques. Warning: I have typed this into NotePad; no guarantees that there are no syntax errors.
You say each table is the same size although I assume not 3x3. But if they were 3x3, could I say table 1 starts in row 1, table 2 starts in row 7 and table N starts in 6(N-1)+1? That is, can you calculate the position of each table or do you need to search?
If you need to search, the following might help:
Dim ColSrcLast as Long
Dim RowSrcCrnt As Long
RowSrcCrnt = 1 ' Assumed start of Table 1
With Worksheets("xxxx")
ColSrcLast = .Cells(RowCrnt,Columns.Count).End(xlToLeft).Column
End With
ColSrcLast = .Cells(RowCrnt,Columns.Count).End(xlToLeft).Column is the VBA equivalent of placing the cursor in the last column of row RowCrnt+1 and then clicking Control+Left. This is probably the easiest way of finding the last used column in table 1.
Control+ArrowKey moves the cursor in the indicated direction and:
if the current cell is blank, stops at the first non-blank cell,
if the current cell is non-blank and so is the next, stops at the last non-blank cells before a blank cell,
if the current cell is non-blank but the next cell is blank, stops at the next non-blank cell,
if no cell meets the above criteria, stops at the end of range.
Experiment and the above will become clearer.
If the number of blank lines between tables might vary, I think the following will be the easiest method of locating each table:
Dim Found As Boolean
Dim RowSrcCrnt As Long
Dim RowSrcLast As Long
Dim RowSrcTableTitle As Long
Dim RowSrcTableLast As Long
With Worksheets("xxxx")
' Find last used row of worksheet
RowSrcLast = .Cells(Rows.Count,"A").End(xlUp).Row
End With
RowSrcCrnt = 1
Do While RowSrcCrnt <= RowSrcLast
With Worksheets("xxxx")
Found = False
Do While RowSrcCrnt <= RowSrcLast
If .Cells(RowSrcCrnt,"A").Value = "" then
' Have found start of next (first) table
RowSrcTableTitle = RowSrcCrnt
Found = True
Exit Do
End If
RowSrcCrnt = RowSrcCrnt+1
Loop
If Not Found Then
' No more tables
Exit Do
End If
RowSrcTableLast = .Cells(RowSrcTableTitle,"A").End(xlDown).Row
End With
' Process table RowSrcTableTitle to RowSrcTableLast
RowSrcCrnt = RowSrcTableLast+1
Loop
Within the above loop we have: Process table RowSrcTableTitle to RowSrcTableLast.
Is the Name column always column "A"? Is the Value column always the last column? If not, you will have to search across the header row for the column names.
Is every table in the same sequence? If not, you will have to sort them. Does every table contain every row? If not, your code for combining the tables will have to allow for this.
I hope the above gets you started. Come back if you have specific questions.
Second answer - Response to clarification
I created a test worksheet Jia Source which looks like this:
You say that the tables are all the same size. In this situation, the following code outputs to the Immediate Window the dimensions of each table. The output from this code is:
Table A1:C6
Table A8:C13
Table A15:C20
For your tables you will need to change the values of constants TableHeight and TableWidth. You will also have to change "Jia Source" to the name of your source worksheet.
Option Explicit
Sub ExtractValue()
Dim ColSrcLeft As Long
Dim ColSrcRight As Long
Dim RowSrcTitle As Long ' First row or table
Dim RowSrcHeader As Long ' Header row of table
Dim RowSrcEnd As Long ' Last row of table
Const TableHeight As Long = 4
Const TableWidth As Long = 3
RowSrcTitle = 1
Do While True
With Worksheets("Jia Source")
If .Cells(RowSrcTitle, "A").Value = "" Then
Exit Do
End If
RowSrcHeader = RowSrcTitle + 1
RowSrcEnd = RowSrcHeader + TableHeight
ColSrcLeft = 1
ColSrcRight = ColSrcLeft + TableWidth - 1
Debug.Print "Table " & colNumToCode(ColSrcLeft) & RowSrcTitle & ":" & _
colNumToCode(ColSrcRight) & RowSrcEnd
End With
' Code to handle table goes here.
RowSrcTitle = RowSrcEnd + 2
Loop
End Sub
Function colNumToCode(ByVal colNum As Integer) As String
' Convert Excel column number to column identifier or code
' Last updated 3 Feb 12. Adapted to handle three character codes.
Dim code As String
Dim partNum As Integer
If colNum = 0 Then
colNumToCode = "0"
Else
code = ""
Do While colNum > 0
partNum = (colNum - 1) Mod 26
code = Chr(65 + partNum) & code
colNum = (colNum - partNum - 1) \ 26
Loop
colNumToCode = code
End If
End Function
I have left the code that shows how to search for the tables if they vary in size. If the above code does not produce the correct results for your worksheet, you may need to merge the two routines.
The following assumes RowSrcTitle, RowSrcHeader, RowSrcLast, ColSrcLeft and ColSrcRight are correct. It is the code from ExtractValue() plus the code to copy the data to the destination sheet which I have named "Jia Destination". Its output is:
Have a play. Come back with questions if necessary.
Sub ExtractValue2()
Dim ColDestCrnt As Long
Dim ColSrcCrnt As Long
Dim ColSrcLeft As Long
Dim ColSrcRight As Long
Dim Found As Boolean
Dim RowDestBottom As Long
Dim RowDestTop As Long
Dim RowSrcTitle As Long ' First row or table
Dim RowSrcHeader As Long ' Header row of table
Dim RowSrcEnd As Long ' Last row of table
Dim TableTitle As String
Dim CellArray() As Variant
Const TableHeight As Long = 4
Const TableWidth As Long = 3
RowSrcTitle = 1
ColDestCrnt = 1
RowDestTop = 1
RowDestBottom = RowDestTop + TableHeight
Do While True
With Worksheets("Jia Source")
If .Cells(RowSrcTitle, "A").Value = "" Then
Exit Do
End If
RowSrcHeader = RowSrcTitle + 1
RowSrcEnd = RowSrcHeader + TableHeight
ColSrcLeft = 1
ColSrcRight = ColSrcLeft + TableWidth - 1
End With
If ColDestCrnt = 1 Then
' Column 1, the list of names, has not been output.
' This assumes all tables have the same rows in the same
' sequence
With Worksheets("Jia Source")
' This statement loads all the values in a range to an array in a
' single statements. Ask if you want more detail on what I am doing.
' Load name column for this table
CellArray = .Range(.Cells(RowSrcHeader, ColSrcLeft), _
.Cells(RowSrcEnd, ColSrcLeft)).Value
End With
With Worksheets("Jia Destination")
' Clear destination sheet
.Cells.EntireRow.Delete
' Write array containing name column to destination sheet
.Range(.Cells(RowDestTop, 1), _
.Cells(RowDestBottom, 1)).Value = CellArray
End With
ColDestCrnt = ColDestCrnt + 1
End If
With Worksheets("Jia Source")
' Find Value column.
Found = False
For ColSrcCrnt = ColSrcLeft + 1 To ColSrcRight
If LCase(.Cells(RowSrcHeader, ColSrcCrnt).Value) = "value" Then
Found = True
Exit For
End If
Next
End With
' If Found is False, the table has no value column and is ignored
If Found Then
With Worksheets("Jia Source")
' Extract title of title
TableTitle = .Cells(RowSrcTitle, ColSrcLeft).Value
' Load name column (excluding header) for this table
CellArray = .Range(.Cells(RowSrcHeader + 1, ColSrcCrnt), _
.Cells(RowSrcEnd, ColSrcCrnt)).Value
End With
With Worksheets("Jia Destination")
' Copy title
.Cells(1, ColDestCrnt).Value = TableTitle
' Write array containing name column to destination sheet
.Range(.Cells(RowDestTop + 1, ColDestCrnt), _
.Cells(RowDestBottom, ColDestCrnt)).Value = CellArray
End With
ColDestCrnt = ColDestCrnt + 1
End If
RowSrcTitle = RowSrcEnd + 2
Loop
End Sub
Answer to new question
If your final clarification is correct, this code is more complicated than you need. Before you posted it I had created a routine capable of handling much more varied tables than you assume you need. Since you have not seen the "real" files, I have not removed the code to handle the full, possible complexity.
I creates a test worksheet like this:
I suggest you duplicate this worksheet since it contains every nasty problem I could think of. Try out this code with this worksheet. Try to understand what the code is doing and why. You should then be ready for anything the real tables throw at you.
Some of the code is complex and I had to define a User-Defined Data Type. I tried googling "vba User-Defined Data Type" and was very disappointed by the tutorials I found so I will have a go myself.
Suppose my macro needs to hold name and age for a number of people. I will clearly need some arrays:
Dim NameFamily() As String
Dim NameGiven() As String
Dim Age() As Long
ReDim NameFamily(1 to 20)
ReDim NameGiven(1 to 3, 1 to 20)
ReDim Age(1 to 20)
NameFamily(5) = "Dallimore"
NameGiven(1, 5) = "Anthony"
NameGiven(2, 5) = "John"
NameGiven(3, 5) = ""
Age(5) = 65
You can very easily end up with a lot of code that can be difficult to maintain; particularly as the number of variables per person increases.
The alternative is to use what most languages call a structure and VBA calls a user-defined data type:
Type Person
NameFamily As String
NameGiven() As String
NumGivenNames as Long
Age As Long
End Type
Person is a new data type and I can declare variables using this type:
Dim Boss As Person
Dim OtherStaff() As Person
ReDim OtherStaff(1 to 20)
OtherStaff(5).NameFamily = "Dallimore"
OtherStaff(5).NumGivenNames = 2
Redim OtherStaff(5).NameGiven(1 To OtherStaff(5).NumGivenNames)
OtherStaff(5).NameGiven(1) = "Anthony"
OtherStaff(5).NameGiven(2) = "John"
OtherStaff(5).Age = 65
This probably does not look any easier. The benefits become more obvious when you want to add another item of information about people; perhaps number of children. With regular arrays, you first have to add a new array. You then have to find every point within the code where you resize the person arrays and add a ReDim statement for the new array. You get strange errors if you miss any ReDim. With the user defined data types, you add one line to the Type definition:
Type Person
NameFamily As String
NameGiven() As String
NumGivenNames as Long
Age As Long
NumChildren As Long
End Type
All existing code is now fully updated for this new variable.
The above is a very brief introduction but I believe it covers every feature of user-defined data types that I used in the code.
I hope I have included enough comments to allow you to understand my code. Work through it slowly and ask questions if necessary.
The code below is a third version having been updated to address questions on the earlier versions.
Variable naming conventions
Names are of the form AaaaBbbbCccc where each name part reduces the scope of name. So "Col" is short for column. Any variable used as a column number starts "Col". "Dest" is short for destination and "Src" is short for "Source". So any variable starting "ColSrc" is a column number for the source worksheet.
If I have an array AaaaBbbbCccc, any indices for that array will start InxAaaaBbbbCccc unless the resulting name is too long in which case Aaaa, Bbbb and Cccc are abbreviated or discarded. So all indices for "NameDtl()" start "InxName" because I think "InxNameDtl" is too long.
"Crnt" is short for "Current" and typically indicates a for-loop variable or a value extracted from an array for one iteration of a for-loop.
Option Explicit
Type typNameDtl
InxPredCrntMax As Long
Name As String
Output As Boolean
Predecessor() As String
End Type
Sub ExtractValue3()
Dim ColDestCrnt As Long ' Current column of destination worksheet
Dim ColSrcCrnt As Long ' Current column of source worksheet
Dim ColSrcSheetLast As Long ' Last column of worksheet
Dim InxNISCrnt As Long ' Current index into NameInSeq array
Dim InxNISCrntMax As Long ' Index of last used entry in NameInSeq array
Dim InxNISFirstThisPass As Long ' Index of first entry in NameInSeq array
' used this pass
Dim InxNameCrnt As Long ' Current index into NameDtl array
Dim InxNameCrntMax As Long ' Index of last used entry in NameDtl array
Dim InxPredCrnt As Long ' Current index into NameDtl(N).Predecessor
' array
Dim InxPredCrntMaxCrnt As Long ' Temporary copy of
' NameDtl(N).InxPredecessorCrntMax
Dim InxTableCrnt As Long ' Current index into RowSrcTableTitle and
' RowSrcTableEnd arrays
Dim InxTableCrntMax As Long ' Last used entry in RowSrcTableTitle and
' RowSrcTableEnd arrays
Dim Found As Boolean ' Set to True if a loop finds what is
' being sought
Dim NameCrnt As String ' Current index into NameDtl array
Dim NameInSeq() As String ' Array of names in output sequence
Dim NameLenMax As Long ' Maximum length of a name. Only used to
' align columns in diagnostic output.
Dim NameDtl() As typNameDtl ' Array of names found and their predecessors
Dim PredNameCrnt As String ' Current predecessor name. Used when
' searching NameDtl(N).Predecessor
Dim RowDestCrnt As Long ' Current row of destination worksheet
Dim RowSrcCrnt1 As Long ' \ Indices into source worksheet allowing
Dim RowSrcCrnt2 As Long ' / nested searches
Dim RowSrcTableEnd() As Long ' Array holding last row of each table within
' source worksheet
Dim RowSrcTableEndCrnt As Long ' The last row of the current table
Dim RowSrcSheetLast As Long ' Last row of source worksheet
Dim RowSrcTableTitle() As Long ' Array holding title row of each table within
' source worksheet
Dim RowSrcTableTitleCrnt As Long ' Title row of current table
Dim SheetValue() As Variant ' Copy of source worksheet.
' Column A of source worksheet used to test this code:
' Start
' row Values in starting and following rows
' 2 block1 name c d e f
' 9 block2 name b c d e
' 16 block3 name a c d
' 22 block4 name a d e
' 29 block5 name a d f
' 36 block6 name d e f
' Note that a and b never appear together in a table; it is impossible
' to deduce their preferred sequence from this data.
' Stage 1: Load entire source worksheet into array.
' =================================================
With Worksheets("Jia Source")
' Detrmine dimensions of worksheet
RowSrcSheetLast = .Cells.Find("*", .Range("A1"), xlFormulas, , _
xlByRows, xlPrevious).Row
ColSrcSheetLast = .Cells.Find("*", .Range("A1"), xlFormulas, , _
xlByColumns, xlPrevious).Column
SheetValue = .Range(.Cells(1, 1), _
.Cells(RowSrcSheetLast, ColSrcSheetLast)).Value
' SheetValue is a one-based array with rows as the first dimension and
' columns as the second. An array loaded from a worksheet is always one-based
' even if the range does not start at Cells(1,1). Because this range starts
' at Cells(1,1), indices into SheetValue match row and column numbers within
' the worksheet. This match is convenient for diagnostic output but is not
' used by the macro which does not reference the worksheet, RowSrcSheetLast or
' ColSrcSheet again.
End With
' Stage 2: Locate each table and store number of
' title row and last data row in arrays.
' ==============================================
' 100 entries may be enough. The arrays are enlarged if necessary.
ReDim RowSrcTableEnd(1 To 100)
ReDim RowSrcTableTitle(1 To 100)
InxTableCrntMax = 0 ' Arrays currently empty
RowSrcCrnt1 = 1
' Loop identifying dimensions of tables
Do While RowSrcCrnt1 <= RowSrcSheetLast
' Search down for the first row of a table
Found = False
Do While RowSrcCrnt1 <= RowSrcSheetLast
If SheetValue(RowSrcCrnt1, 1) <> "" Then
RowSrcTableTitleCrnt = RowSrcCrnt1
Found = True
Exit Do
End If
RowSrcCrnt1 = RowSrcCrnt1 + 1
Loop
If Not Found Then
' All tables located
Exit Do
End If
' Search down for the last row of a table
Found = False
Do While RowSrcCrnt1 <= RowSrcSheetLast
If SheetValue(RowSrcCrnt1, 1) = "" Then
RowSrcTableEndCrnt = RowSrcCrnt1 - 1
Found = True
Exit Do
End If
RowSrcCrnt1 = RowSrcCrnt1 + 1
Loop
If Not Found Then
' Last table extends down to bottom of worksheet
RowSrcTableEndCrnt = RowSrcSheetLast
End If
' Store details of this table.
InxTableCrntMax = InxTableCrntMax + 1
' Enlarge arrays if they are full
If InxTableCrntMax > UBound(RowSrcTableTitle) Then
' Redim Preserve requires the interpreter find a block of memory
' of the new size, copy values across from the old array and
' release the old array for garbage collection. I always allocate
' extra memory in large chunks and use an index like
' InxTableCrntMax to record how much of the array has been used.
ReDim Preserve RowSrcTableTitle(UBound(RowSrcTableTitle) + 100)
ReDim Preserve RowSrcTableEnd(UBound(RowSrcTableTitle) + 100)
End If
RowSrcTableTitle(InxTableCrntMax) = RowSrcTableTitleCrnt
RowSrcTableEnd(InxTableCrntMax) = RowSrcTableEndCrnt
Loop
' Output the arrays to the Immediate window to demonstrate they are correct.
' For my test data, the output is:
' Elements: 1 2 3 4 5 6
' Title: 2 9 16 22 29 36
' Last data: 7 14 20 26 33 40
Debug.Print "Location of each table"
Debug.Print " Elements:";
For InxTableCrnt = 1 To InxTableCrntMax
Debug.Print Right(" " & InxTableCrnt, 3);
Next
Debug.Print
Debug.Print " Title:";
For InxTableCrnt = 1 To InxTableCrntMax
Debug.Print Right(" " & RowSrcTableTitle(InxTableCrnt), 3);
Next
Debug.Print
Debug.Print "Last data:";
For InxTableCrnt = 1 To InxTableCrntMax
Debug.Print Right(" " & RowSrcTableEnd(InxTableCrnt), 3);
Next
Debug.Print
' Stage 3. Build arrays listing predecessors of each name
' ========================================================
' The names within the tables are all in the same sequence but no table
' contains more than a few names so that sequence is not obvious. This
' stage accumulates data from the tables so that Stage 4 can deduce the full
' sequence. More correctly, Stage 4 deduces a sequence that does not
' contradict the tables because the sequence of a and b and the sequence
' of f and g is not defined by these tables.
' For Stage 4, I need a list of every name used in the tables and, for each
' name, a list of its predecessors. Consider first the list of names.
' NameDtl is initialised to NameDtl(1 to 50) and InxNameCrntMax is initialised
' to 0 to record the array is empty. In table 1, the code below finds c, d,
' e and f. NameDtl and InxNameCrntMax are updated as these names are found:
'
' Initial state: InxNameCrntMax = 0 NameDtl empty
' Name c found : InxNameCrntMax = 1 NameDtl(1).Name = "c"
' Name d found : InxNameCrntMax = 2 NameDtl(2).Name = "d"
' Name e found : InxNameCrntMax = 3 NameDtl(3).Name = "e"
' Name f found : InxNameCrntMax = 4 NameDtl(4).Name = "f"
' In table 2, the code finds; b, c, d and e. b is new but c, d and e are
' already recorded and they must not be added again. For each name found,
' the code checks entries 1 to InxNameCrntMax. Only if the new name is not
' found, is it added.
' For each name, Stage 4 needs to know its predecessors. From table 1 it
' records that:
' d is preceeded by c
' e is preceeded by c and d
' f is preceeded by c, d and e
' The same technique is used for build the list of predecessors. The
' differences are:
' 1) Names are accumulated in NameDtl().Name while the predecessors of
' the fifth name are accumulated in NameDtl(5).Predecessor.
' 2) InxNameCrntMax is replaced, for the fifth name, by
' NameDtl(5).InxPredCrntMax.
' Start with space for 50 names. Enlarge if necessary.
ReDim NameDtl(1 To 50)
InxNameCrntMax = 0 ' Array is empty
' For each table
For InxTableCrnt = 1 To InxTableCrntMax
RowSrcTableTitleCrnt = RowSrcTableTitle(InxTableCrnt)
RowSrcTableEndCrnt = RowSrcTableEnd(InxTableCrnt)
' For each data row in the current table
For RowSrcCrnt1 = RowSrcTableTitleCrnt + 2 To RowSrcTableEndCrnt
' Look in NameDtl for name from current data row
NameCrnt = SheetValue(RowSrcCrnt1, 1)
Found = False
For InxNameCrnt = 1 To InxNameCrntMax
' Not this comparison is case sensitive "John" and "john" would not
' match. Use LCase if case insensitive comparison required.
If NameCrnt = NameDtl(InxNameCrnt).Name Then
Found = True
Exit For
End If
Next
If Not Found Then
' This is a new name. Create entry in NameDtl for it.
InxNameCrntMax = InxNameCrntMax + 1
If InxNameCrntMax > UBound(NameDtl) Then
ReDim Preserve NameDtl(UBound(NameDtl) + 50)
End If
InxNameCrnt = InxNameCrntMax
NameDtl(InxNameCrnt).Output = False
NameDtl(InxNameCrnt).Name = NameCrnt
' Allow for up to 20 predecessors
ReDim NameDtl(InxNameCrnt).Predecessor(1 To 20)
NameDtl(InxNameCrnt).InxPredCrntMax = 0
End If
' Check that each predecessor for the current name within the
' current table is recorded against the current name
For RowSrcCrnt2 = RowSrcTableTitleCrnt + 2 To RowSrcCrnt1 - 1
Found = False
PredNameCrnt = SheetValue(RowSrcCrnt2, 1)
' Move current number of predecessors from array to variable
' to make code more compact and easier to read
InxPredCrntMaxCrnt = NameDtl(InxNameCrnt).InxPredCrntMax
For InxPredCrnt = 1 To InxPredCrntMaxCrnt
If PredNameCrnt = _
NameDtl(InxNameCrnt).Predecessor(InxPredCrnt) Then
Found = True
Exit For
End If
Next
If Not Found Then
' This predecessor has not been recorded against the current name
InxPredCrntMaxCrnt = InxPredCrntMaxCrnt + 1
If InxPredCrntMaxCrnt > _
UBound(NameDtl(InxNameCrnt).Predecessor) Then
ReDim Preserve NameDtl(UBound(NameDtl) + 20)
End If
NameDtl(InxNameCrnt).Predecessor(InxPredCrntMaxCrnt) = PredNameCrnt
' Place new value for number of predecessors in its permenent store.
NameDtl(InxNameCrnt).InxPredCrntMax = InxPredCrntMaxCrnt
End If
Next
Next
Next
' Output NameDtl to the Immediate window to demonstrate it is correct.
' Find length of longest name so columns can be justified
NameLenMax = 4 ' Minimum length is that of title
For InxNameCrnt = 1 To InxNameCrntMax
If Len(NameDtl(InxNameCrnt).Name) > NameLenMax Then
NameLenMax = Len(NameDtl(InxNameCrnt).Name)
End If
Next
' Output headings
Debug.Print vbLf & "Contents of NameDtl table"
Debug.Print Space(NameLenMax + 10) & "Max"
Debug.Print Left("Name" & Space(NameLenMax), NameLenMax + 2) & _
"Output inx Predecessors"
' Output table contents
For InxNameCrnt = 1 To InxNameCrntMax
Debug.Print Left(NameDtl(InxNameCrnt).Name & Space(NameLenMax), _
NameLenMax + 4) & _
IIf(NameDtl(InxNameCrnt).Output, " True ", " False") & _
" " & Right(" " & _
NameDtl(InxNameCrnt).InxPredCrntMax, 3) & " ";
For InxPredCrnt = 1 To NameDtl(InxNameCrnt).InxPredCrntMax
Debug.Print " " & _
NameDtl(InxNameCrnt).Predecessor(InxPredCrnt);
Next
Debug.Print
Next
' Stage 4: Sequence names for list.
' =================================
' The output from the above routine for the test data is:
' Max
' Name Output inx Predecessors
' c False 2 b a
' d False 3 c b a
' e False 4 c d b a
' g False 3 c d e
' b False 0
' a False 0
' f False 3 a d e
' Note 1: All this information is in the sequence found.
' Note 2: We do not know the "true" sequence of b and a or of g and f.
' The loop below has three steps:
' 1) Transfer any names to NamesInSeq() that have not already been
' transferred and have a value of 0 for Max inx.
' 2) If no names are transferred, the loop has completed its task.
' 3) Remove any names transferred during this pass from the predecessor
' lists and mark the name as output.
' Before the loop NameInSeq() is empty, InxNISCrntMax = 0 and
' InxNISFirstThisPass = InxNISCrntMax+1 = 1.
' After step 1 of pass 1:
' NameInSeq(1) = "b" and NameInSeq(2) = "a"
' InxNISCrntMax = 2
' Entries InxNISFirstThisPass (1) to InxNISCrntMax (2) of NamesInSeq have
' been transferred during this pass so names a and b are removed from the
' lists by copying the last entry in each list over the name to be removed
' and reducing Max inx. For pass 1, only the list for f is changed.
' At the end of pass 1, NameDtl is:
' Max
' Name Output inx Predecessors
' c False 0
' d False 1 c
' e False 2 c d
' g False 3 c d e
' b True 0
' a True 0
' f False 2 e d
' During pass 2, c is moved to NamesInSeq and removed form the lists to give:
' Max
' Name Output inx Predecessors
' c True 0
' d False 0
' e False 1 d
' g False 2 e d
' b True 0
' a True 0
' f False 2 e d
' This process continues until all names have been transferred.
' Size array for total number of names.
ReDim NameInSeq(1 To InxNameCrntMax)
InxNISCrntMax = 0 ' Array empty
' Loop until every name has been moved
' from ProdecessorDtl to NameInSeq.
Do While True
Found = False ' No name found to move during this pass
' Record index of first name, if any, to be added during this pass
InxNISFirstThisPass = InxNISCrntMax + 1
' Transfer names without predecessors to NameInSeq()
For InxNameCrnt = 1 To InxNameCrntMax
If Not NameDtl(InxNameCrnt).Output Then
' This name has not been output
If NameDtl(InxNameCrnt).InxPredCrntMax = 0 Then
' This name has no predecessors or no predecessors that
' have not already been transferred to NameInSeq()
InxNISCrntMax = InxNISCrntMax + 1
NameInSeq(InxNISCrntMax) = NameDtl(InxNameCrnt).Name
NameDtl(InxNameCrnt).Output = True
Found = True
End If
End If
Next
If Not Found Then
' All names already transferred to NameInSeq
Exit Do
End If
' Remove references to names transferred to NameinSeq()
' during this pass
For InxNISCrnt = InxNISFirstThisPass To InxNISCrntMax
NameCrnt = NameInSeq(InxNISCrnt)
For InxNameCrnt = 1 To InxNameCrntMax
If Not NameDtl(InxNameCrnt).Output Then
' This name has not been output
For InxPredCrnt = 1 To NameDtl(InxNameCrnt).InxPredCrntMax
If NameCrnt = _
NameDtl(InxNameCrnt).Predecessor(InxPredCrnt) Then
' Remove this name by overwriting it
' with the last name in the list
NameDtl(InxNameCrnt).Predecessor(InxPredCrnt) = _
NameDtl(InxNameCrnt).Predecessor _
(NameDtl(InxNameCrnt).InxPredCrntMax)
NameDtl(InxNameCrnt).InxPredCrntMax = _
NameDtl(InxNameCrnt).InxPredCrntMax - 1
Exit For
End If
Next
End If
Next
Next
Loop
Debug.Print vbLf & "Name list"
For InxNISCrnt = 1 To InxNISCrntMax
Debug.Print NameInSeq(InxNISCrnt)
Next
' Stage 5: Transfer data
' ======================
' We now have everything we need for the transfer:
' * NameInSeq() contains the names in the output sequence
' * SheetValue() contains all the data from the source worksheet
' * RowSrcTableTitle() and RowSrcTableEnd() identify the
' start and end row of each table
With Worksheets("Jia Destination")
.Cells.EntireRow.Delete ' Clear destination sheet
ColDestCrnt = 1
.Cells(1, ColDestCrnt).Value = "Name"
' Output names
RowDestCrnt = 2
For InxNISCrnt = 1 To InxNISCrntMax
.Cells(RowDestCrnt, ColDestCrnt).Value = NameInSeq(InxNISCrnt)
RowDestCrnt = RowDestCrnt + 1
Next
' Output values from each table
For InxTableCrnt = 1 To InxTableCrntMax
RowSrcTableTitleCrnt = RowSrcTableTitle(InxTableCrnt)
RowSrcTableEndCrnt = RowSrcTableEnd(InxTableCrnt)
' Find value column, if any
Found = False
ColSrcCrnt = 2
Do While SheetValue(RowSrcTableTitleCrnt + 1, ColSrcCrnt) <> ""
If LCase(SheetValue(RowSrcTableTitleCrnt + 1, ColSrcCrnt)) = _
"value" Then
Found = True
Exit Do
End If
ColSrcCrnt = ColSrcCrnt + 1
Loop
If Found Then
' Value column found for this table
ColDestCrnt = ColDestCrnt + 1
' Transfer table name
.Cells(1, ColDestCrnt).Value = SheetValue(RowSrcTableTitleCrnt, 1)
' Transfer values
RowDestCrnt = 2
RowSrcCrnt1 = RowSrcTableTitleCrnt + 2
For InxNISCrnt = 1 To InxNISCrntMax
If NameInSeq(InxNISCrnt) = SheetValue(RowSrcCrnt1, 1) Then
' Value for this name in this table
.Cells(RowDestCrnt, ColDestCrnt).Value = _
SheetValue(RowSrcCrnt1, ColSrcCrnt)
' Value transferred from this row. Step to next if any
RowSrcCrnt1 = RowSrcCrnt1 + 1
If RowSrcCrnt1 > RowSrcTableEndCrnt Then
' No more rows in this table
Exit For
End If
End If
RowDestCrnt = RowDestCrnt + 1
Next
Else
Call MsgBox("Table starting at row " & RowSrcTableTitleCrnt & _
" does not have a value column", vbOKOnly)
End If
Next
End With
End Sub

How to "flatten" or "collapse" a 2D Excel table into 1D?

I have a two dimensional table with countries and years in Excel. eg.
1961 1962 1963 1964
USA a x g y
France u e h a
Germany o x n p
I'd like to "flatten" it, such that I have Country in the first col, Year in the second col, and then value in the third col. eg.
Country Year Value
USA 1961 a
USA 1962 x
USA 1963 g
USA 1964 y
France 1961 u
...
The example I present here is only a 3x4 matrix, but the real dataset i have is significantly larger (roughly 50x40 or so).
Any suggestions how I can do this using Excel?
You can use the excel pivot table feature to reverse a pivot table (which is essentially what you have here):
Good instructions here:
http://spreadsheetpage.com/index.php/tip/creating_a_database_table_from_a_summary_table/
Which links to the following VBA code (put it in a module) if you don't want to follow the instructions by hand:
Sub ReversePivotTable()
' Before running this, make sure you have a summary table with column headers.
' The output table will have three columns.
Dim SummaryTable As Range, OutputRange As Range
Dim OutRow As Long
Dim r As Long, c As Long
On Error Resume Next
Set SummaryTable = ActiveCell.CurrentRegion
If SummaryTable.Count = 1 Or SummaryTable.Rows.Count < 3 Then
MsgBox "Select a cell within the summary table.", vbCritical
Exit Sub
End If
SummaryTable.Select
Set OutputRange = Application.InputBox(prompt:="Select a cell for the 3-column output", Type:=8)
' Convert the range
OutRow = 2
Application.ScreenUpdating = False
OutputRange.Range("A1:C3") = Array("Column1", "Column2", "Column3")
For r = 2 To SummaryTable.Rows.Count
For c = 2 To SummaryTable.Columns.Count
OutputRange.Cells(OutRow, 1) = SummaryTable.Cells(r, 1)
OutputRange.Cells(OutRow, 2) = SummaryTable.Cells(1, c)
OutputRange.Cells(OutRow, 3) = SummaryTable.Cells(r, c)
OutputRange.Cells(OutRow, 3).NumberFormat = SummaryTable.Cells(r, c).NumberFormat
OutRow = OutRow + 1
Next c
Next r
End Sub
-Adam
In Excel 2013 need to follow next steps:
select data and convert to table (Insert -> Table)
call Query Editor for table (Power Query -> From Table)
select columns that contain years
in context menu select 'Unpivot Columns'-command.
Support Office: Unpivot columns (Power Query)
In Excel 2016, Power Query is called Get & Transform and it is found in the Data tab.
#Adam Davis's answer is perfect, but just in case you're as clueless as I am about Excel VBA, here's what I did to get the code working in Excel 2007:
Open the workbook with the Matrix that needs to be flattened to a table and navigate to that worksheet
Press Alt-F11 to open the VBA code editor.
On the left pane, in the Project box, you'll see a tree structure representing the excel objects and any code (called modules) that already exist. Right click anywhere in the box and select "Insert->Module" to create a blank module file.
Copy and paste #Adman Davis's code from above as is into the blank page the opens and save it.
Close the VBA editor window and return to the spreadsheet.
Click on any cell in the matrix to indicate the matrix you'll be working with.
Now you need to run the macro. Where this option is will vary based on your version of Excel. As I'm using 2007, I can tell you that it keeps its macros in the "View" ribbon as the farthest right control. Click it and you'll see a laundry list of macros, just double click on the one called "ReversePivotTable" to run it.
It will then show a popup asking you to tell it where to create the flattened table. Just point it to any empty space an your spreadsheet and click "ok"
You're done! The first column will be the rows, the second column will be the columns, the third column will be the data.
Flattening a data matrix (aka Table) can be accomplished with one array formula¹ and two standard formulas.
      
The array formula¹ and two standard formulas in G3:I3 are is,
=IFERROR(INDEX(A$2:A$4, MATCH(0, IF(COUNTIF(G$2:G2, A$2:A$4&"")<COUNT($1:$1), 0, 1), 0)), "")
=IF(LEN(G3), INDEX($B$1:INDEX($1:$1, MATCH(1E+99,$1:$1 )), , COUNTIF(G$3:G3, G3)), "")
=INDEX(A:J,MATCH(G3,A:A,0),MATCH(H3,$1:$1,0))
Fill down as necessary.
While array formulas can negatively impact performance due to their cyclic calculation, your described working environment of 40 rows × 50 columns should not overly impact performance with a calculation lag.
¹ Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula. Try and reduce full-column references to ranges more closely representing the extents of your actual data. Array formulas chew up calculation cycles logarithmically so it is good practise to narrow the referenced ranges to a minimum. See Guidelines and examples of array formulas for more information.
For anyone who wants to use the PivotTable to do this and is following the below guide:
http://spreadsheetpage.com/index.php/tip/creating_a_database_table_from_a_summary_table/
If you want to do it in Excel 2007 or 2010 then you first need to enable the PivotTable Wizard.
To find the option you need to go to "Excel Options" via the Main Excel Window icon, and see the options selected in the "customize" section, then select "Commands Not in the Ribbon" from the "Choose Commands from:" dropdown and "PivotTable and PivotChart Wizard" needs to be added to the right.. see the image below.
Once that is done there should be a small pivottable wizard icon in the quickbar menu at the top of the Excel window, you can then follow the same process as shown in the link above.
I developed another macro because I needed to refresh the output table quite often (input table was filled by other) and I wanted to have more info in my output table (more copied column and some formulas)
Sub TableConvert()
Dim tbl As ListObject
Dim t
Rows As Long
Dim tCols As Long
Dim userCalculateSetting As XlCalculation
Dim wrksht_in As Worksheet
Dim wrksht_out As Worksheet
'##block calculate and screen refresh
Application.ScreenUpdating = False
userCalculateSetting = Application.Calculation
Application.Calculation = xlCalculationManual
'## get the input and output worksheet
Set wrksht_in = ActiveWorkbook.Worksheets("ressource_entry")'## input
Set wrksht_out = ActiveWorkbook.Worksheets("data")'## output.
'## get the table object from the worksheet
Set tbl = wrksht_in.ListObjects("Table14") '## input
Set tb2 = wrksht_out.ListObjects("Table2") '## output.
'## delete output table data
If Not tb2.DataBodyRange Is Nothing Then
tb2.DataBodyRange.Delete
End If
'## count the row and col of input table
With tbl.DataBodyRange
tRows = .Rows.Count
tCols = .Columns.Count
End With
'## check every case of the input table (only the data part)
For j = 2 To tRows '## parse all row from row 2 (header are not checked)
For i = 5 To tCols '## parse all column from col 5 (first col will be copied in each record)
If IsEmpty(tbl.Range.Cells(j, i).Value) = False Then
'## if there is time enetered create a new row in table2 by using the first colmn of the selected cell row and cell header plus some formula
Set oNewRow = tb2.ListRows.Add(AlwaysInsert:=True)
oNewRow.Range.Cells(1, 1).Value = tbl.Range.Cells(j, 1).Value
oNewRow.Range.Cells(1, 2).Value = tbl.Range.Cells(j, 2).Value
oNewRow.Range.Cells(1, 3).Value = tbl.Range.Cells(j, 3).Value
oNewRow.Range.Cells(1, 4).Value = tbl.Range.Cells(1, i).Value
oNewRow.Range.Cells(1, 5).Value = tbl.Range.Cells(j, i).Value
oNewRow.Range.Cells(1, 6).Formula = "=WEEKNUM([#Date])"
oNewRow.Range.Cells(1, 7).Formula = "=YEAR([#Date])"
oNewRow.Range.Cells(1, 8).Formula = "=MONTH([#Date])"
End If
Next i
Next j
ThisWorkbook.RefreshAll
'##unblock calculate and screen refresh
Application.ScreenUpdating = True
Application.Calculate
Application.Calculation = userCalculateSetting
End Sub
VBA solution may not be acceptable under some situations (e.g. cannot embed macro due to security reasons, etc.). For these situations, and otherwise too in general, I prefer using formulae over macro.
I am trying to describe my solution below.
input data as shown in question (B2:F5)
column_header (C2:F2)
row_header (B3:B5)
data_matrix (C3:F5)
no_of_data_rows (I2) = COUNTA(row_header) + COUNTBLANK(row_header)
no_of_data_columns (I3) = COUNTA(column_header) + COUNTBLANK(column_header)
no_output_rows (I4) = no_of_data_rows*no_of_data_columns
seed area is K2:M2, which is blank but referenced, hence not to be deleted
K3 (drag through say K100, see comments description) = ROW()-ROW($K$2) <= no_output_rows
L3 (drag through say L100, see comments description) = IF(K3,IF(COUNTIF($L$2:L2,L2)
M3 (drag through say M100, see comments description) = IF(K3,IF(M2 < no_of_data_columns,M2+1,1),"-")
N3 (drag through say N100, see comments description) = INDEX(row_header,L3)
O3 (drag through say O100, see comments description) = INDEX(column_header,M3)
P3 (drag through say P100, see comments description) = INDEX(data_matrix,L3,M3)
Comment in K3: Optional: Check if expected no. of output rows has been achieved. Not required, if one only prepares this table limited to no. of output rows.
Comment in L3: Goal: Each RowIndex (1 .. no_of_data_rows) must repeat no_of_data_columns times. This will provide index lookup for row_header values. In this example, each RowIndex (1 .. 3) must repeat 4 times. Algorithm: Check how many times RowIndex has occurred yet. If it less than no_of_data_columns times, continue using that RowIndex, else increment the RowIndex. Optional: Check if expected no. of output rows has been achieved.
Comment in M3: Goal: Each ColumnIndex (1 .. no_of_data_columns) must repeat in a cycle. This will provide index lookup for column_header values. In this example, each ColumnIndex (1 .. 4) must repeat in a cycle. Algorithm: If ColumnIndex exceeds no_of_data_columns, restart the cycle at 1, else increment the ColumnIndex. Optional: Check if expected no. of output rows has been achieved.
Comment in R4: Optional: Use column K for error handling, as shown in column L and column M. Check if looked up value IsBlank to avoid incorrect "0" in the output because of blank input in data_matrix.
updated ReversePivotTable function so i can specify number of header columns and rows
Sub ReversePivotTable()
' Before running this, make sure you have a summary table with column headers.
' The output table will have three columns.
Dim SummaryTable As Range, OutputRange As Range
Dim OutRow As Long
Dim r As Long, c As Long
Dim lngHeaderColumns As Long, lngHeaderRows As Long, lngHeaderLoop As Long
On Error Resume Next
Set SummaryTable = ActiveCell.CurrentRegion
If SummaryTable.Count = 1 Or SummaryTable.Rows.Count < 3 Then
MsgBox "Select a cell within the summary table.", vbCritical
Exit Sub
End If
SummaryTable.Select
Set OutputRange = Application.InputBox(prompt:="Select a cell for the 3-column output", Type:=8)
lngHeaderColumns = Application.InputBox(prompt:="Header Columns")
lngHeaderRows = Application.InputBox(prompt:="Header Rows")
' Convert the range
OutRow = 2
Application.ScreenUpdating = False
'OutputRange.Range("A1:D3") = Array("Column1", "Column2", "Column3", "Column4")
For r = lngHeaderRows + 1 To SummaryTable.Rows.Count
For c = lngHeaderColumns + 1 To SummaryTable.Columns.Count
' loop through all header columns and add to output
For lngHeaderLoop = 1 To lngHeaderColumns
OutputRange.Cells(OutRow, lngHeaderLoop) = SummaryTable.Cells(r, lngHeaderLoop)
Next lngHeaderLoop
' loop through all header rows and add to output
For lngHeaderLoop = 1 To lngHeaderRows
OutputRange.Cells(OutRow, lngHeaderColumns + lngHeaderLoop) = SummaryTable.Cells(lngHeaderLoop, c)
Next lngHeaderLoop
OutputRange.Cells(OutRow, lngHeaderColumns + lngHeaderRows + 1) = SummaryTable.Cells(r, c)
OutputRange.Cells(OutRow, lngHeaderColumns + lngHeaderRows + 1).NumberFormat = SummaryTable.Cells(r, c).NumberFormat
OutRow = OutRow + 1
Next c
Next r
End Sub
Code with the claim for some universality
The book should have two sheets:
Sour = Source data
Dest = the "extended" table will drop here
Option Explicit
Private ws_Sour As Worksheet, ws_Dest As Worksheet
Private arr_2d_Sour() As Variant, arr_2d_Dest() As Variant
' https://stackoverflow.com/questions/52594461/find-next-available-value-in-excel-cell-based-on-criteria
Public Sub PullOut(Optional ByVal msg As Variant)
ws_Dest_Acr _
arr_2d_ws( _
arr_2d_Dest_Fill( _
arr_2d_Sour_Load( _
arr_2d_Dest_Create( _
CountA_rng( _
rng_2d_For_CountA( _
Init))))))
End Sub
Private Function ws_Dest_Acr(Optional ByVal msg As Variant) As Variant
ws_Dest.Activate
End Function
Public Function arr_2d_ws(Optional ByVal msg As Variant) As Variant
If IsArray(arr_2d_Dest) Then _
ws_Dest.Cells(1, 1).Resize(UBound(arr_2d_Dest), UBound(arr_2d_Dest, 2)) = arr_2d_Dest
End Function
Private Function arr_2d_Dest_Fill(Optional ByVal msg As Variant) As Variant
Dim y_Sour As Long, y_Dest As Long, x As Long
y_Dest = 1
For y_Sour = LBound(arr_2d_Sour) To UBound(arr_2d_Sour)
' without the first column
For x = LBound(arr_2d_Sour, 2) + 1 To UBound(arr_2d_Sour, 2)
If arr_2d_Sour(y_Sour, x) <> Empty Then
arr_2d_Dest(y_Dest, 1) = arr_2d_Sour(y_Sour, 1) 'iD
arr_2d_Dest(y_Dest, 2) = arr_2d_Sour(y_Sour, x) 'DTLx
y_Dest = y_Dest + 1
End If
Next
Next
End Function
Private Function arr_2d_Sour_Load(Optional ByVal msg As Variant) As Variant
arr_2d_Sour = ReDuce_rng(ws_Sour.UsedRange, 1, 0).Offset(1, 0).Value
End Function
Private Function arr_2d_Dest_Create(ByVal iRows As Long)
Dim arr_2d() As Variant
ReDim arr_2d(1 To iRows, 1 To 2)
arr_2d_Dest = arr_2d
arr_2d_Dest_Create = arr_2d
End Function
Public Function CountA_rng(ByVal rng As Range) As Double
CountA_rng = Application.WorksheetFunction.CountA(rng)
End Function
Private Function rng_2d_For_CountA(Optional ByVal msg As Variant) As Range
' without the first line and without the left column
Set rng_2d_For_CountA = _
ReDuce_rng(ws_Sour.UsedRange, 1, 1).Offset(1, 1)
End Function
Public Function ReDuce_rng(rng As Range, ByVal iRow As Long, ByVal iCol As Long) _
As Range
With rng
Set ReDuce_rng = .Resize(.Rows.Count - iRow, .Columns.Count - iCol)
End With
End Function
Private Function Init()
With ThisWorkbook
Set ws_Sour = .Worksheets("Sour")
Set ws_Dest = .Worksheets("Dest")
End With
End Function
'https://youtu.be/oTp4aSWPKO0