I am trying to write a macro on MS Excel, which will enable me to create random samples and pick random values from those samples for each category in the data.
To be more specific, the data is at 2 levels: firm and year, where each row represents a firm-year-peer observation. For each firm i, in a given year j, we have a number of actual peers k.
What I want to do is assign to each firm, from the whole sample throughout many years, a random firm from the list of all available firms at that specific year. The trick is that the number of firms to be assigned should be identical to the number of actual peers that a firm has at that year. Also, the randomly assigned values should be different from the firm's actual peers, and of course, the firm itself.
i j k
1 2006 100
1 2006 105
1 2006 110
2 2006 113
2 2006 155
2 2006 200
2 2006 300
For example, Firm 1's actual peers in year 2006 are 100, 105 and 110. However, all possible firms available are 100, 105, 110, 113, 155, 200 and 300. This means that I have to select 3 (because Firm 1 has 3 actual peers) random fictional peers from the 4 firms that are not Firm 1's peer that year (i.e. 113, 155, 200 and 300). Applying the same procedure for Firm 2, I need to select 4 random firms that are not Firm 2's actual peers from all possible firms.
I hope this was clear.
I started trying this function out on MS Excel, but I am open to suggestions if you think other platforms would be more useful.
Your help would be very much appreciated!
Thanks!
Many thanks to everyone who has visited my post.
After some initial struggling, I have managed to figure out the code myself. I am posting it below for anyone who might need it.
Basically, I used the randomisation code posted by this gentle soul and enhanced it for my needs, using a couple of flags to mark each new firm and each new year. I hope it is clear for everyone.
Best
Sub Random_Sampling()
'
'Each variable needs its own As clause; otherwise it is declared as Variant
Dim PeerCount As Long, FirmCount As Long, YearCount As Long
Dim Focal_CIK As Long, fiscalYear As Long
Const nItemsTotal As Long = 1532
Dim rngList As Range
Dim FirmYearRange As Range
Dim FirmStart As Long, FirmStartRow As Long, YearStartRow As Long
Dim ExistingPeers As Range
Dim idx() As Long
Dim varRandomItems() As Variant
Dim i As Long, j As Long, k As Long, m As Long, n As Long
Dim iCntr As Long, jCntr As Long
Dim booIndexIsUnique As Boolean
Set rngList = Sheets("Sheet2").Range("A2").Resize(nItemsTotal, 1)
FirmCount = Cells(2, 10).Value
For k = 1 To FirmCount
FirmStart = Application.WorksheetFunction.Match(k, Columns("E"), 0)
Focal_CIK = Cells(FirmStart, 1).Value
YearCount = Cells(FirmStart, 7).Value
For m = 1 To YearCount
Set FirmYearRange = Range("H" & FirmStart & ":H200000")
YearStartRow = Application.WorksheetFunction.Match(m, FirmYearRange, 0) + FirmStart - 1
fiscalYear = Cells(YearStartRow, 3).Value
PeerCount = Cells(YearStartRow, 9).Value
Set ExistingPeers = Range(Cells(YearStartRow + PeerCount, 2), Cells(YearStartRow + PeerCount, 2))
ReDim idx(1 To PeerCount)
ReDim varRandomItems(1 To PeerCount)
For i = 1 To PeerCount
Do
booIndexIsUnique = True ' Innocent until proven guilty
idx(i) = Int(nItemsTotal * Rnd + 1)
'Compare the firm ID stored at the drawn position, not the position itself
If rngList.Cells(idx(i), 1).Value = Focal_CIK Then 'Is the firm itself
booIndexIsUnique = False 'If true, don't pick it
End If
For j = 1 To i - 1
If idx(i) = idx(j) Then 'Is already picked
booIndexIsUnique = False 'If true, don't pick it
Exit For
End If
Next j
'Checked outside the j loop so that the first pick is tested as well
For n = 1 To PeerCount
If rngList.Cells(idx(i), 1).Value = Cells(YearStartRow + n - 1, 2).Value Then 'Is one of the actual peers
booIndexIsUnique = False 'If true, don't pick it
Exit For
End If
Next n
If booIndexIsUnique = True Then
Exit Do
End If
Loop
varRandomItems(i) = rngList.Cells(idx(i), 1)
Rows(YearStartRow + PeerCount).EntireRow.Insert
'The order of the columns are very important for the following lines
Cells(YearStartRow + PeerCount, 1) = Focal_CIK
Cells(YearStartRow + PeerCount, 2) = varRandomItems(i)
Cells(YearStartRow + PeerCount, 3) = fiscalYear
Cells(YearStartRow + PeerCount, 4) = "0"
Next i
Next m
Next k
End Sub
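Since the question is open to other platforms, the same sampling logic can also be sketched outside Excel. This is only a minimal Python sketch, not a translation of the macro above: the function name `sample_fictional_peers` and the in-memory `rows` layout are assumptions for illustration, and it draws without replacement via `random.sample` instead of using a retry loop.

```python
import random
from collections import defaultdict

def sample_fictional_peers(rows, seed=None):
    """rows: (firm, year, peer) tuples. Returns (firm, year, fictional_peer) tuples."""
    rng = random.Random(seed)
    actual = defaultdict(set)    # (firm, year) -> actual peers
    universe = defaultdict(set)  # year -> every firm listed as a peer that year
    for firm, year, peer in rows:
        actual[(firm, year)].add(peer)
        universe[year].add(peer)

    result = []
    for (firm, year), peers in actual.items():
        # Candidates: all firms of that year except the actual peers and the firm itself
        candidates = sorted(universe[year] - peers - {firm})
        # Assumption: if there are fewer candidates than actual peers, take them all
        k = min(len(peers), len(candidates))
        for pick in rng.sample(candidates, k):
            result.append((firm, year, pick))
    return result

rows = [(1, 2006, 100), (1, 2006, 105), (1, 2006, 110),
        (2, 2006, 113), (2, 2006, 155), (2, 2006, 200), (2, 2006, 300)]
picks = sample_fictional_peers(rows, seed=1)
```

On the sample table, Firm 1 receives 3 of its 4 non-peers. Firm 2 has 4 actual peers but only 3 non-peers exist that year, so the sketch caps the draw at 3 rather than looping forever.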
I have used Variants in the past for something similar, but it was one dimensional in its solution. I am wondering if utilizing Variants with two dimensions would be feasible.
I have an ever-changing list of dates, each corresponding to a week, that will be entered chronologically. The next column is the count for that week. Column C is the building that it took place in.
For example, the first row shown in the image above takes place the week of "10/2/2016" and there was a count of 8 that week which means the buildings in rows 2-9 are correlated with that week, and it continues on for each corresponding week.
I have the sum of the counts for each year, so for the chart in "E1:G14", I want to count each time the building is counted for each year, respectively. I am just confused as to how to approach it and if Variants would be useful or if using a CountIfs for the ranges would work best.
Thank you in advance.
CODE
Private Sub maybe()
Dim sht As Worksheet: Set sht = Worksheets("Sheet3")
Dim wk_cnt As Double: wk_cnt = sht.Range("A1", sht.Range("A1").End(xlDown)).Rows.Count
Dim bld_cnt As Double: bld_cnt = sht.Range("C2", sht.Range("C2").End(xlDown)).Rows.Count
Dim cnt As Double
Dim yrs_cnt As Double
If sht.Range("D3").Value = "" Then
yrs_cnt = 1
Else:
yrs_cnt = sht.Range("D2", sht.Range("D2").End(xlDown)).Rows.Count
End If
Dim yrsArray As Range
If sht.Range("D3").Value = "" Then
Set yrsArray = sht.Range("D2")
Else:
Set yrsArray = sht.Range("D2", sht.Range("D2").End(xlDown))
End If
Dim vCnts As Variant
ReDim vCnts(1 To 12, 1 To yrs_cnt)
vCnts(1, 1) = "Irving Building"
vCnts(2, 1) = "Memorial Building"
vCnts(3, 1) = "West Tower"
vCnts(4, 1) = "Witting Surgical Center"
vCnts(5, 1) = "Madison Irving Surgery Center"
vCnts(6, 1) = "Marley Education Center"
vCnts(7, 1) = "410 South Crouse"
vCnts(8, 1) = "Physicians Office Building"
vCnts(9, 1) = "Crouse Business Center"
vCnts(10, 1) = "Commonwealth Place"
vCnts(11, 1) = "Crouse Garage"
vCnts(12, 1) = "CNY Medical Center"
For x = 1 To yrs_cnt
cnt = 0
For y = 2 To wk_cnt
If Year(sht.Cells(y, 1).Value) = sht.Cells(1, x + 5).Value Then
cnt = cnt + sht.Cells(y, 2).Value
sht.Cells(14, x + 5) = cnt
End If
Next y
Next x
End Sub
EDIT
With Column C
With only Columns A & B
I need the numbers to match the second image, but when I include all three columns it looks like the first image after I group it by year. How can I fix that?
It looks like your PivotTable is summarizing the "Count" column incorrectly. Where it says "Count of Count", the number shown is how many rows of your data range fit the selected criteria, not the total of the values in those rows. I think if you change the Value Field Settings to Sum you will be pleased with the difference.
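To make the difference concrete, here is a small Python sketch of what the two summary settings compute. The three sample rows are hypothetical, not taken from the workbook in the question.

```python
from collections import defaultdict

# Hypothetical source rows: (year, weekly count), as in columns A and B
rows = [(2016, 8), (2016, 5), (2017, 3)]

count_of_count = defaultdict(int)  # "Count of Count": number of rows per year
sum_of_count = defaultdict(int)    # "Sum of Count": total of the values per year
for year, count in rows:
    count_of_count[year] += 1
    sum_of_count[year] += count
```

For 2016 the first setting reports 2 (two rows) while the second reports 13 (8 + 5), which is the kind of figure the yearly totals need.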
First time poster!
I am hoping someone can help with my VBA code. I have some experience with VBA coding, but I don't have the knowledge or expertise to handle the task I am facing.
I have a report of entities and their owners.
With this report, you can follow the ownership chain of each entity.
Here is an example of the Report:
Entity # | Entity Name | Parent # | Parent Name | Owner % | Inside
100 | Entity 1 | 200 | Entity 2 | 100 | Yes
200 | Entity 2 | 300 | Entity 3 | 50 | Yes
200 | Entity 2 | 400 | Entity 4 | 50 | Yes
500 | Entity 5 | 600 | Entity 6 | 100 | Yes
600 | Entity 6 | 700 | Entity 7 | 25 | Yes
600 | Entity 6 | 800 | Entity 8 | 25 | Yes
600 | Entity 6 | 900 | Entity 9 | 50 | Yes
800 | Entity 8 | 1200 | Entity 12 | 100 | Yes
900 | Entity 9 | 1000 | Entity 10 | 25 | No
900 | Entity 9 | 1100 | Entity 11 | 75 | Yes
So basically, Entity 1 is owned 100% by Entity 2. Entity 2 is owned 50% by Entity 3 and 50% by Entity 4. Entities 3 and 4 are not owned by any affiliates. Entity 5 is owned 100% by Entity 6. Entity 6 is owned 25% by Entity 7, 25% by Entity 8 and 50% by Entity 9. Entity 8 is owned 100% by Entity 12. Entity 9 is owned 25% by Entity 10 and 75% by Entity 11. Entity 10 is not an affiliate.
The code should calculate the pick-up % of the lowest entities (100 and 500). In this case, the pick-up % for 100 will be 100% because all of the entities in the chain are affiliates, while the pick-up % for 500 is 75% because entity 1000 is not an affiliate.
I have started and stopped writing this code at least ten times, and each time I get stuck along the way. Here is my issue: in reality, the chain could go up 7 to 8 levels. Once I get past level two, I do not know how to calculate the pick-up % if the entity has multiple owners. For instance, looking at the table above, once I calculate the ownership for 600, I can't figure out how to extend the chain to the owners of 800 and 900.
Here is a diagram of the ownership structures:
Here is the code I have so far:
Sub ownerinterest()
Sheets("Copyii").Activate
Set dict3 = New Dictionary
nRowCount = Cells(Rows.Count, "B").End(xlUp).Row
arowcount = Cells(Rows.Count, "AA").End(xlUp).Row
ReportArray = Range(Cells(1, "AA"), Cells(arowcount, "AB"))
For i = 2 To nRowCount
GemC = Left(Cells(i, "a"), 5)
ParentC = Cells(i, "d")
PctC = (Cells(i, "J") / 100)
OwnerC = Cells(i, "h")
EntityC = Cells(i, "b")
d = i
If (Not (dict3.Exists(GemC))) Then
Set GEMclass = New Gclass
dict3.Add GemC, GEMclass
dict3(GemC).e = EntityC
dict3(GemC).P = ParentC
dict3(GemC).O = OwnerC
dict3(GemC).Num = d
dict3(GemC).g = GemC
End If
Call countlevels
dict3(GemC).Pct = PctC
Next i
Call Calculepickup
End Sub
Sub countlevels()
For e = LBound(ReportArray, 1) To UBound(ReportArray, 1)
If GemC = ReportArray(e, 1) Then
If ReportArray(e, 2) > 1 Then
Pcount = ReportArray(e, 2)
PctC = 0
For f = 1 To Pcount
TPct = Cells(i + f - 1, "J")
PctC = TPct + PctC
Next f
Exit For
Else
PctC = PctC
Exit For
End If
End If
Next e
End Sub
Sub Calculepickup()
Dim g As Long, h As Integer, j As Integer, m As Integer
Dim NewGem As String
Dim Tpct2 As Double
Dim MainArray() As Variant
Dim MainRange As Range
m = Cells(Rows.Count, "A").End(xlUp).Row
Set MainRange = Range("a1:J" & m)
MainArray() = MainRange
For g = 0 To dict3.Count - 1
Set GEMclass = dict3.Items(g)
ReportGEM = GEMclass.P
GemC = GEMclass.g
PctC = GEMclass.Pct
Debug.Print GemC & "|" & ReportGEM & "|" & PctC
For h = 0 To dict3.Count - 1
If (dict3.Exists(ReportGEM)) Then
NewGem = ReportGEM
For j = LBound(ReportArray) To UBound(ReportArray)
If NewGem = ReportArray(j, 1) Then
If ReportArray(j, 2) > 1 Then
Pcount = 0
Pcount = ReportArray(j, 2)
Tpct2 = 0
Dim K As Integer
For K = LBound(MainArray, 1) To UBound(MainArray, 1)
Dim GEMk As String
GEMk = MainArray(K, 1)
If NewGem = GEMk Then
Debug.Print GEMk & "|" & K
For f = 1 To Pcount
TPct = Cells(K + f - 1, "J")
Debug.Print TPct
Tpct2 = TPct + Tpct2
Debug.Print Tpct2
Next f
Exit For
End If
Next K
End If
End If
Next j
End If
Next h
Next g
End Sub
I believe that the following will do what you want. (It's probably the only real way to associate an "ownership percentage" based on multiple parents each with their own "ownership percentage".)
Public entities As New Dictionary
Public MainArray() As Variant
'I have assumed that the table you posted in the question represented columns A to F of an Excel spreadsheet.
'Change the following constants so it suits your actual layout.
Const colEntity As Integer = 1 ' Assumed column A
Const colParent As Integer = 3 ' Assumed column C
Const colPct As Integer = 5 ' Assumed column E
Const colInside As Integer = 6 ' Assumed column F
Sub Calculepickup()
Dim g As Integer, r As Integer, m As Integer
Dim MainRange As Range
m = Cells(Rows.Count, "A").End(xlUp).Row
Set MainRange = Range("a2:J" & m)
MainArray() = MainRange
'Add each entity to a dictionary, and flag the percentage as uncalculated by setting it to -1
For g = 1 To UBound(MainArray, 1)
If Not entities.Exists(MainArray(g, colEntity)) Then
entities.Add MainArray(g, colEntity), -1
End If
If Not entities.Exists(MainArray(g, colParent)) Then
If MainArray(g, colInside) = "No" Then
'If the entity isn't "inside" store the fact that it is 0% owned
entities.Add MainArray(g, colParent), 0
Else
entities.Add MainArray(g, colParent), -1
End If
End If
Next
r = 0
For Each e In entities.Keys
CalculatePct e
'Write results to columns N and O just so that we can see them
r = r + 1
Cells(r, 14) = e
Cells(r, 15) = entities(e)
Next
End Sub
Sub CalculatePct(e As Variant)
Dim g As Integer
Dim pct As Double
Dim Owned100Pct As Boolean
If entities(e) < 0 Then
pct = 0
Owned100Pct = True ' Keeps track if the entity exists in the table other than as a parent
For g = 1 To UBound(MainArray, 1)
If MainArray(g, colEntity) = e Then
Owned100Pct = False
If entities(MainArray(g, colParent)) = -1 Then
'If we don't know the parent's ownership percentage, go and calculate it
CalculatePct MainArray(g, colParent)
End If
pct = pct + CDbl(MainArray(g, colPct)) / 100 * entities(MainArray(g, colParent))
End If
Next
If Owned100Pct Then
'Assume 100% owned if we don't know the parentage
'("Outside" entities won't go through here as they are already set to 0%)
entities(e) = 1
Else
'Store the entity's percentage
entities(e) = pct
End If
End If
End Sub
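The recursive idea in the code above (an entity's percentage is the sum, over its parents, of the share held by each parent times that parent's own percentage) can be checked quickly outside Excel. The following is a minimal Python sketch of that idea under stated assumptions: the function name `pickup_percentages` and the tuple layout are hypothetical, any parent flagged "No" is treated as 0%, and ownership is assumed to contain no cycles.

```python
def pickup_percentages(rows):
    """rows: (entity, parent, owner_pct, inside) tuples. Returns {entity: fraction}."""
    parents = {}  # entity -> list of (parent, share held by that parent)
    pct = {}      # resolved fractions, filled in lazily
    for entity, parent, owner_pct, inside in rows:
        parents.setdefault(entity, []).append((parent, owner_pct / 100.0))
        if inside == "No":
            pct.setdefault(parent, 0.0)  # non-affiliate parents contribute nothing

    def resolve(e):
        if e not in pct:
            owners = parents.get(e)
            if not owners:
                pct[e] = 1.0  # no listed parents: assume fully inside
            else:
                pct[e] = sum(share * resolve(p) for p, share in owners)
        return pct[e]

    entities = set(parents) | {p for owners in parents.values() for p, _ in owners}
    return {e: resolve(e) for e in entities}

rows = [(100, 200, 100, "Yes"), (200, 300, 50, "Yes"), (200, 400, 50, "Yes"),
        (500, 600, 100, "Yes"), (600, 700, 25, "Yes"), (600, 800, 25, "Yes"),
        (600, 900, 50, "Yes"), (800, 1200, 100, "Yes"),
        (900, 1000, 25, "No"), (900, 1100, 75, "Yes")]
result = pickup_percentages(rows)
```

On the sample table this weighting gives 1.0 for entity 100 and 0.875 for entity 500 (0.25 + 0.25 + 0.5 × 0.75). The 75% mentioned in the question is what entity 900 gets on its own, so it is worth confirming which definition of pick-up % is intended.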
So, I made a fun and simple macro that randomly selects R, G, and B values until it has used every possible combination (skipping repeats), and sets the color of a cell in a 10x10 square to each new color.
The only problem is that I have run into the limit for the number of cell formats. Microsoft says that the limit should be around 64000, but I found it to be exactly 65429 on a blank workbook in Excel 2013.
I've included a clear format code, but it seems to have no effect:
Cells(X, Y).ClearFormats
Microsoft lists some resolutions, but 3 out of the 4 of them are essentially "Don't make too many formats", and the 4th is to use a third-party application.
Is there really nothing that can be done in VBA?
A1:J10 will print a new color
K1 will print the percentage to completion
L1 will print the number of colors used
M1 will print the number of times a color combination is repeated
Dim CA(255, 255, 255) As Integer
Dim CC As Long
Dim RC As Long
Dim R As Integer
Dim G As Integer
Dim B As Integer
Dim X As Integer
Dim Y As Integer
CC = 0
RC = 0
X = 1
Y = 1
Do While CC < 16777216 'CC is the counter that is actually incremented below
R = ((Rnd * 256) - 0.5)
G = ((Rnd * 256) - 0.5)
B = ((Rnd * 256) - 0.5)
If CA(R, G, B) <> 1 Then
CA(R, G, B) = 1
'Step down to the next row
'If at the 10th row, jump back to the first and move to the next column
If X < 10 Then
X = X + 1
Else
X = 1
If Y < 10 Then
Y = Y + 1
Else
Y = 1
End If
End If
Cells(X, Y).ClearFormats 'doesn't do what I hope :(
Cells(X, Y).Interior.Color = RGB(R, G, B)
CC = CC + 1
Cells(1, 11).Value = (CC / 16777216) * 100
Cells(1, 12).Value = CC
Else
RC = RC + 1
Cells(1, 13).Value = RC
End If
Loop
There are several ways to resolve this issue, but the cleanest and easiest method is to remove all extra styles (I have seen workbooks with 9000+ styles).
With the following simple VBA code you can remove all non-builtin styles and in the vast majority of cases this fixes the error.
Sub removeStyles()
Dim li As Long
On Error Resume Next
With ActiveWorkbook
For li = .Styles.Count To 1 Step -1
If Not .Styles(li).BuiltIn Then
.Styles(li).Delete
End If
Next
End With
End Sub
I have raw data I'm trying to sort out by date, the data is in this form:
month:april-2014
offer | value
ofr x | 2132
ofr y | 135
.
.
.
month:mai-2014
offer | value
ofr x | 5115
ofr z | 513
ofr y | 651
and it goes on; there are offers that appear every month and others that disappear.
I wanted it to look like this :
offer | april-2014 |mai 14 | june ....
ofr x 123 5 6
ofr y 5 1 6
ofr z
ofr a
.
.
any help would be appreciated, thank you
Try to restructure the data like this and use pivot tables?
Date | offer | value
may-2014 |ofr x | 5115
may-2014 |ofr z | 513
may-2014 |ofr y | 651
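The restructuring suggested above can be sketched in a few lines. This is a minimal Python illustration rather than an Excel solution: the helper names `to_long` and `pivot` are hypothetical, and it assumes each block starts with a line beginning `month:` followed by `offer | value` rows, as in the question.

```python
def to_long(lines):
    """Turn 'month:<name>' blocks of 'offer | value' lines into (month, offer, value) rows."""
    rows, month = [], None
    for line in lines:
        line = line.strip()
        if line.startswith("month:"):
            month = line[len("month:"):]
        elif "|" in line:
            offer, value = (part.strip() for part in line.split("|", 1))
            if offer != "offer":  # skip the repeated header row
                rows.append((month, offer, int(value)))
    return rows

def pivot(rows):
    """Build {offer: {month: value}} plus the months in the order they appear."""
    months, table = [], {}
    for month, offer, value in rows:
        if month not in months:
            months.append(month)
        table.setdefault(offer, {})[month] = value
    return months, table

raw = """month:april-2014
offer | value
ofr x | 2132
ofr y | 135
month:mai-2014
offer | value
ofr x | 5115
ofr z | 513
ofr y | 651""".splitlines()

months, table = pivot(to_long(raw))
```

Offers that are missing in a given month simply have no entry for it, which corresponds to an empty cell in the wide layout.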
This first chunk of code goes through the data and rearranges it for you. It also passes only the first column of your selected range to the CountUnique function. Keep in mind that you may need to rewrite the search criteria if the keyword "month" is not at the start of the text, or if the word "offer" does not appear by itself, without spaces, in the row that follows. Another point to note is that everything is copied as is: if the source cell was text, the destination cell will be text. Converting a date stored as text into an Excel serial date is a separate issue, and there are plenty of ways to achieve that as well.
Option Explicit
Sub SortOffer(OfferList As Range)
Dim CounterX As Long, CounterY As Long, jCounter As Long, icounter As Long, MonthCount As Long, UniqueOffers As Long
Dim inlist As Boolean
Dim unsorted() As Variant
Dim sorted() As Variant
MonthCount = WorksheetFunction.CountIf(OfferList, "month*")
UniqueOffers = CountUnique(OfferList.Columns(1).Cells) - MonthCount - 1
ReDim sorted(1 To UniqueOffers + 1, 1 To MonthCount + 1) As Variant
unsorted = OfferList
CounterX = 1
jCounter = 1
sorted(1, 1) = "offer"
For CounterY = LBound(unsorted, 1) To UBound(unsorted, 1)
If Left(unsorted(CounterY, 1), 5) = "month" Then
CounterX = CounterX + 1
sorted(1, CounterX) = Right(unsorted(CounterY, 1), Len(unsorted(CounterY, 1)) - 6)
Else
inlist = False
For icounter = 2 To jCounter
If unsorted(CounterY, 1) = sorted(icounter, 1) Then
sorted(icounter, CounterX) = unsorted(CounterY, 2)
inlist = True
End If
Next icounter
If Not inlist And unsorted(CounterY, 1) <> "offer" And unsorted(CounterY, 1) <> "" Then
jCounter = jCounter + 1
sorted(jCounter, 1) = unsorted(CounterY, 1)
sorted(jCounter, CounterX) = unsorted(CounterY, 2)
End If
End If
Next CounterY
Range("F1").Resize(UBound(sorted, 1), UBound(sorted, 2)).Value = sorted
End Sub
This next function counts the number of unique entries in a range, ignoring blank cells. I stumbled across this code on this web page. If you subtract the number of months from this count (and one more for the "offer" header), you will know how many offers are in your table. This matters because it tells you how to size the array that you will later write back as your results.
Function CountUnique(ByVal MyRange As Range) As Integer
Dim Cell As Range
Dim J As Integer
Dim iNumCells As Integer
Dim iUVals As Integer
Dim sUCells() As String
iNumCells = MyRange.Count
ReDim sUCells(iNumCells) As String
iUVals = 0
For Each Cell In MyRange
If Cell.Text > "" Then
For J = 1 To iUVals
If sUCells(J) = Cell.Text Then
Exit For
End If
Next J
If J > iUVals Then
iUVals = iUVals + 1
sUCells(iUVals) = Cell.Text
End If
End If
Next Cell
CountUnique = iUVals
End Function
Now, just in case the links don't cover it: this answer, which was a learning lesson for me, was taught to me in various parts by #JNevill, #Ralph, #findwindow, #Gary'sStudent and #ScottCraner. Apologies if I missed someone. I am also sure any of these individuals could do it slicker and take less than 10 hours to write it 8).
I am currently debugging a piece of code. It works as intended: it assigns a date to the finaldate variable, then looks through the data and deletes all rows with dates later than finaldate. The only problem is that the sub procedure needs to be run multiple times for this to take full effect. For instance, when I run it once it removes about half of the dates; when I run it again it does the same. I usually press F5 about 5 times to confirm it is complete. While this is fine for debugging, I need to know this will work perfectly every time.
Sub Remove_Unecessary_Data_1()
Dim ALLCs As Worksheet
Dim DS As Worksheet
Dim finaldate As Date
Set DS = Sheets("Data Summary")
Set ALLCs = Sheets("Asset LLC (Input)")
ALLCs.Select
For y = 1 To 40
If InStr(1, Cells(13, y), "Timestamp of Execution") Then
finaldate = ALLCs.Cells(50, y)
End If
Next
ALLCs.Select
For u = 1 To 40
If InStr(1, Cells(13, u), "Start Date") Then
For p = 2 To 69584
If Cells(p + 14, u) > finaldate Then
Cells(p + 14, u).EntireRow.Delete
End If
Next
End If
Next
End Sub
EDIT: Sample Data
Cells(50,y) = 1/12/15
finaldate = Cells(50,Y)
the column headed Start date contains dates that range anywhere from 1/05/15 to 1/30/15.
When working properly all dates after 1/12/15 should have their entire row eliminated.
When deleting rows, you have to work your way from bottom to top, otherwise you end up skipping rows.
For example, you have:
Line 1
>Line 2
Line 3
Line 4
When your code deletes Line 2, what was row 3 becomes row 2, but your code moves on to row 3, which now holds Line 4. Your data now looks like this:
Line 1
Line 3
>Line 4
If you change this bit of your code:
For p = 2 To 69584
If Cells(p + 14, u) > finaldate Then
Cells(p + 14, u).EntireRow.Delete
End If
Next
to this:
For p = 69598 To 16 Step -1
If Cells(p, u) > finaldate Then
Cells(p, u).EntireRow.Delete
End If
Next
Everything should be fine.
*Note: I adjusted your start & end points up by 14, and removed the + 14 from the Cells() reference. No sense in doing the extra math in there...
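The skipping effect is easy to reproduce outside Excel. Below is a minimal Python sketch, with a plain list standing in for the worksheet rows and small integers standing in for the dates; both function names are made up for the illustration.

```python
def delete_forward(values, limit):
    """Top-to-bottom deletion: the index advances even after a delete, so rows get skipped."""
    rows = list(values)
    i = 0
    while i < len(rows):
        if rows[i] > limit:
            del rows[i]  # the next row moves up into position i ...
        i += 1           # ... but the counter moves past it anyway
    return rows

def delete_backward(values, limit):
    """Bottom-to-top deletion: rows that shift up have already been checked."""
    rows = list(values)
    for i in range(len(rows) - 1, -1, -1):
        if rows[i] > limit:
            del rows[i]
    return rows

data = [5, 13, 14, 20, 10]
forward_result = delete_forward(data, 12)
backward_result = delete_backward(data, 12)
```

With a limit of 12, the forward pass leaves `[5, 14, 10]` because 14 slid into the slot of the deleted 13 and was never tested, while the backward pass leaves `[5, 10]`.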
When you delete a row using:
Cells(p + 14, u).EntireRow.Delete
the row below the deleted row moves up to occupy that space. If that row contains a date that should be deleted it will be ignored because the counter automatically moves onto the next row. For example, say we wish to delete any rows with C or D in the Data column:
Row Number Data
1 A
2 B
3 C
4 D
5 E
becomes:
Row Number Data
1 A
2 B
3 D
4 E
The row counter moves on to 4 without checking the new value in row 3, so the D will not be deleted.
You can solve this by changing your If...Then statement to a Do...While loop:
Sub Remove_Unecessary_Data_1()
Dim ALLCs As Worksheet
Dim DS As Worksheet
Dim finaldate As Date
Set DS = Sheets("Data Summary")
Set ALLCs = Sheets("Asset LLC (Input)")
ALLCs.Select
For y = 1 To 40
If InStr(1, Cells(13, y), "Timestamp of Execution") Then
finaldate = ALLCs.Cells(50, y)
End If
Next
ALLCs.Select
For u = 1 To 40
If InStr(1, Cells(13, u), "Start Date") Then
For p = 2 To 69584
Do While (Cells(p + 14, u) > finaldate)
Cells(p + 14, u).EntireRow.Delete
Loop
Next
End If
Next
End Sub
This should keep checking that cell after it has deleted the previous row to ensure the replacement row should not be deleted also.
Because you delete rows while the row number increases, you skip the row right after each one you delete: what was row i+1 becomes row i, yet the counter still moves on to the next row.
Here is your code taking that into account (and got rid of the useless Select)
Sub Remove_Unecessary_Data_1()
Dim ALLCs As Worksheet, _
DS As Worksheet, _
FinalDate As Date
Set DS = Sheets("Data Summary")
Set ALLCs = Sheets("Asset LLC (Input)")
For y = 1 To 40
If InStr(1, ALLCs.Cells(13, y), "Timestamp of Execution") Then
FinalDate = ALLCs.Cells(50, y)
End If
Next
For u = 1 To 40
If InStr(1, ALLCs.Cells(13, u), "Start Date") Then
For p = 69584 To 2 Step -1
'Qualify Cells with the worksheet, since the sheet is no longer selected
If ALLCs.Cells(p + 14, u) > FinalDate Then
ALLCs.Cells(p + 14, u).EntireRow.Delete
End If
Next
End If
Next
End Sub