VBA: Pull specific row data [closed] - vba

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I have Excel sheet data (which I converted into array format) that looks like the following:
1st row......['one', , , , 'Folder', 'Folder', 'Extended Data', 'Extended Data', 'Extended Data','Extended Data' ],
2nd row.....['ID', 'Label', 'Longitude', 'Latitude', 'Country', 'City', 'Inventory', 'Safety stock', 'weight', 'hdsjka'],
3rd row......['AFKBL', 'Kabul, Afghanistan', 69.136749, 34.53091, 'Afghanistan', 'Kabul', 12, 1845, 12, 1845],
4th row......['AFKDH', 'Kandahar, Afghanistan', 65.700279, 31.61087, 'Afghanistan', 'Kandahar', 18, 1193, 18, 1193], ....etc etc
I want to pull all the values in the 2nd row that come under 'Extended Data' (which is in the 1st row)
and write them into a single-column array in a different file.
I want to use this column array to create a control wrapper in Google Charts.
I would really appreciate it if anybody could write a macro and help me with this.

I can't quite make out what your array looks like, but it seems to me that the data you want should be simple to obtain by looping across the array cells.
Say the data you want is in row 2 and columns 4 to 7 of the array you already have ("oldarr"). Then you just create a new array, newarr, with 4 rows and 1 column.
Dim newarr(1 To 4, 1 To 1) As Variant
Dim j As Long
For j = 1 To 4
    newarr(j, 1) = oldarr(2, j + 3) ' cycles across columns 4 to 7 of the second row
Next j
You can then paste the contents of newarr wherever you like.
Now, this seems much too simple to require a macro to do, which is why I think I have to be missing something. However, the general approach holds as long as you know which array columns will contain the information you want. The only subtleties I could think of would be if you don't know how many rows or columns you need to copy in each iteration (in which case you could use a dynamic array), or if the columns containing "Extended Data" can change.
Hope this can at least help you get started.
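For the case mentioned above where the 'Extended Data' columns can move, the column positions can be looked up from the first row instead of being hard-coded. The following is only a rough sketch of that idea in Python rather than VBA. The two rows are copied from the question, the blank header cells are assumed to be None, and the helper name labels_under is made up for illustration.

```python
# First two rows from the question; blank header cells are assumed to be None.
header = ['one', None, None, None, 'Folder', 'Folder',
          'Extended Data', 'Extended Data', 'Extended Data', 'Extended Data']
labels = ['ID', 'Label', 'Longitude', 'Latitude', 'Country', 'City',
          'Inventory', 'Safety stock', 'weight', 'hdsjka']


def labels_under(group, header_row, label_row):
    """Return the second-row labels whose first-row cell equals `group`."""
    return [label for head, label in zip(header_row, label_row) if head == group]


# One value per row, i.e. a single-column array.
column = [[label] for label in labels_under('Extended Data', header, labels)]
print(column)
```

The same loop translates directly to VBA: walk across row 1, and whenever the cell equals 'Extended Data', copy the cell below it into the next free row of the output array.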

Related

Can anyone explain what (1, 88, 2) means when getting error: "ValueError: Must pass 2-d input. shape=(1, 88, 2)"

I've been messing with dataframes and lists, trying to understand how they work, and I was wondering if someone could explain something for me about a list I can't seem to make into a dataframe because it's not a 2-d input...
So I am downloading the companies listed on a stock exchange. The stock exchange has about 500 companies. Each company can be in one or more index.
bovespa = pd.read_csv(r'D:\Libraries\Downloads\IbovParts.csv', sep=';')
This makes a dataframe from a file, which is a list of all the listed companies on the Brazilian B3 index, with 4 columns: the company name, type of stock, the code and which indexes the stock is part of, for example:
From this dataframe, I want to create a set of smaller dataframes, each of which will contain all the companies in that particular index.
I'm not sure it's the best way, but I found some similar code that creates a dictionary, where the index name is the key and the value is a list of all the stocks in that particular index.
First I manually made a list of the indexes:
list_of_indexes = ['AGFS', 'BDRX', 'GPTW', 'IBOV', 'IBRA', 'IBXL', 'IBXX', 'ICO2', 'ICON', 'IDIV', 'IFIL', 'IFIX', 'IFNC', 'IGCT', 'IGCX', 'IGNM', 'ISEE', 'ITAG', 'IVBX', 'MLCX', 'SMLL', 'UTIL']
Then this is the code that creates a dictionary of keys (index name) and values (empty lists) then fills the lists:
indexes = {key: [] for key in list_of_indexes}
for k in indexes:
    # rows whose 'InIndexes' text mentions this index name
    mask = bovespa['InIndexes'].str.contains(k)
    subset = bovespa.loc[mask, ['Empresa', 'Code']]
    indexes[k].append(subset)
This seems to work fine. Checking the printout it does what I want it to do.
Now, I want to choose one of the indexes (for example 'IBOV') and create a new dataframe which contains ONLY the codes of the companies in IBOV. I can then use this list of codes in the yf library to download the financial data for the companies of 'IBOV'.
To do this I tried this code, hoping to get a dataframe with an index, the company name and the company code:
IBOV_codes_df = pd.DataFrame(indexes.get('IBOV'))
and got this error:
ValueError: Must pass 2-d input. shape=(1, 88, 2)
The type of the data I'm using (indexes.get('IBOV')) is a list:
type(indexes.get('IBOV'))
returns list, but pd.DataFrame can't use it. Also, I can't access any of the individual elements in the list. This is what the list looks like (in Jupyter):
indexes.get('IBOV')
At first I thought it was a 'normal' list with 88 rows and 2 columns, then I noticed the second square bracket AFTER the columns, and len(list) told me this list had only one line. I'm still fuzzy on lists and dataframes etc...
Anyway, this error seems to be quite common, and I found a solution here on stackoverflow:
pd.DataFrame(IBOV_codes[0])
Unfortunately, the post on stackoverflow just told the original poster to "do this" with no explanation and it worked. It also worked for me, and created a dataframe that is identical in appearance to the list (but without the brackets, obviously.)
Logically, as there is only one line in the list, [0] is the only callable line to use, so it makes sense. My first question is... why?? What the heck's going on? How can Python make a dataframe from a list with only one long, confusing string(?) element? I know it's pretty smart, but seriously? How? Also, if there is only one line, why does Python throw the error: shape=(1, 88, 2)? How is that possible? What does shape=(1, 88, 2) mean or look like? I thought the shape would be (1,1): one row and one column. Very confusing.
My second question is about indexing...
In the original dataframe made from the csv, the list of ALL companies, the index (I assume) is the list of numbers: 0, 1, 2 ... 513.
When I start slicing, and create the final dataframe, using pd.DataFrame(IBOV_codes[0]), the index column is 1, 12,17,24,34... 492, 496, 497, 506, 511. Each company has the same 'index' it had when read from the csv.
The numbers are still sequential, but the index is missing loads of numbers. Are these indexes still integers? Or have they become strings/objects? What would be the best code of practice? To reindex to 0, 1,2,3,4 etc?
If anyone can clear things up, "Thanks!"
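One way to read shape=(1, 88, 2) is as a count at each nesting level: 1 element in the outer list, 88 rows inside that element, and 2 columns in each row. The sketch below uses plain nested lists as a stand-in for the DataFrame that was appended to the list. The helper nested_shape and the sample rows are invented for illustration and are not part of pandas.

```python
def nested_shape(obj):
    """Return the shape of a regularly nested list, outermost dimension first."""
    shape = []
    while isinstance(obj, list):
        shape.append(len(obj))
        obj = obj[0] if obj else None
    return tuple(shape)


# Stand-in for one index's table: 88 rows, each with a company name and a code.
table = [[f"Company {i}", f"CODE{i}"] for i in range(88)]

# Appending the whole table to an empty list makes it the list's single element.
wrapped = [table]

print(nested_shape(wrapped))     # (1, 88, 2): three levels, so not 2-d
print(nested_shape(wrapped[0]))  # (88, 2): the table itself is 2-d
```

Under that reading, [0] simply takes the one table back out of the wrapper list, which is why it works. On the second question, the row labels that survive filtering should still be the original integers; if a fresh 0, 1, 2... numbering is wanted, reset_index(drop=True) is the usual pandas call for that.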

What does ++num1var[num2var] mean? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 months ago.
Page 109 of The AWK Programming Language book has a statement to create a 2-D array named attr:
attr[nrel, $1] = ++nattr[nrel]
nrel = an integer representing the number of relations (tables)
nattr = an integer representing the number of attributes (table columns)
Substituting country for $1, 1 for nrel, and 1 for nattr we have:
attr[1, country] = 1[1]
What does the right-hand side of that statement mean? It appears to be referencing subscript 1 of array 1. Can an array be named 1? Would you explain what that expression means, please?
Page 109(...)
I have looked into archive.org's version of the book, and the line
attr[nrel, $1] = ++nattr[nrel]
is the sole line referencing nattr, so this is where nattr is created. Because you are asking for the value under a key, nattr is an array, not an integer. You can check that by substituting everything except nattr as you proposed and using the typeof function:
awk 'BEGIN{attr[1, "France"] = ++nattr[1];print typeof(nattr)}' emptyfile
which gives the output
array
Therefore the line shown first increments the value stored in array nattr under key nrel and then assigns the incremented value to array attr under key nrel, $1.
(tested in gawk 4.2.1)
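For readers more used to another language, here is a rough Python analogue of that one awk statement. It is only a sketch: defaultdict(int) stands in for awk's rule that a missing array element counts as 0 in numeric context, a tuple key stands in for awk's nrel, $1 subscript, and the function name add_attribute and the sample attribute names are made up.

```python
from collections import defaultdict

nattr = defaultdict(int)  # per-relation attribute counter; missing keys start at 0
attr = {}                 # (relation number, attribute name) -> position


def add_attribute(nrel, name):
    """Mimic: attr[nrel, $1] = ++nattr[nrel]"""
    nattr[nrel] += 1                   # ++nattr[nrel]: increment first
    attr[(nrel, name)] = nattr[nrel]   # then store the incremented value


add_attribute(1, "country")
add_attribute(1, "area")
add_attribute(2, "name")
print(attr)  # {(1, 'country'): 1, (1, 'area'): 2, (2, 'name'): 1}
```

So nattr[1] is "how many attributes relation 1 has so far", and each new attribute gets the next number.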

Find and replace numeric values preceding a substring [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 1 year ago.
I have a dataframe which looks like the following:
df['col1'].values
array(['cat 113kd29', 'do56goat24kdasd', 'pig145kd'])
I need to create a new column df['vals'] with following values:
cat 29
do56goatasd
pig
i.e. first I need to look for substring kd and then find the numeric value preceding it. I am not sure how to go about this.
There can be multiple numeric values in each string so I need to find only ones before kd. Please note the string 'cat 113kd29'. Also look at 'do56goat24kdasd'
I tried the following but it didn't work:
df['col1'].str.replace(r'(\d+)kd', '')
Your pattern for str.replace is correct, but you need to assign the result back to the Pandas column on the left-hand side of an assignment:
df["col1"] = df["col1"].str.replace(r'\d+kd', '', regex=True)
Note that str.replace replaces every match by default, so there is no need for any sort of global flag. In pandas 2.0 and later the pattern is treated as a literal string unless regex=True is passed, so include it.
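Another way is to match the digits preceding kd together with kd, anchored to the end of the string, and replace them with nothing:
df["col1"] = df.col1.str.replace(r'\d+kd\Z', '', regex=True)
Note that \Z only matches at the very end of the string, so this handles 'pig145kd' but leaves 'cat 113kd29' and 'do56goat24kdasd' unchanged; drop the \Z to cover those rows as well.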
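To see what the two patterns do without needing a dataframe, the same regular expressions can be tried with Python's built-in re module on the three sample strings from the question. This is just a quick check of the patterns, not a replacement for the pandas call.

```python
import re

values = ['cat 113kd29', 'do56goat24kdasd', 'pig145kd']

# Digits immediately followed by "kd", anywhere in the string.
anywhere = [re.sub(r'\d+kd', '', v) for v in values]
print(anywhere)     # ['cat 29', 'do56goatasd', 'pig']

# Same pattern anchored with \Z: only matches when "kd" ends the string.
at_end_only = [re.sub(r'\d+kd\Z', '', v) for v in values]
print(at_end_only)  # ['cat 113kd29', 'do56goat24kdasd', 'pig']
```

Only the unanchored pattern produces the values the question asks for.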

SSIS Conditional Split Reject files [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I want to migrate the data to a target table.
However, I want to create a reject file for null values and for values longer than 20 characters. How do I do this with a Conditional Split?
I tried this, but it doesn't work:
"if len(mail)>10 characters"
I want to export these values to the reject file.
How can I do this, please?
Design
You can do this directly in a Conditional Split, but I advise against doing so. Instead, compute the boolean (true/false) condition in a Derived Column and add that to your data flow. Then, if you get unexpected results, you can add a data viewer between the Derived Column step and the Conditional Split.
Implementation
Add a Derived Column to the data flow. Add a new column called BadMail. If it's true, then we'll route the row to the bad file. If it's false, the row will proceed to the destination.
The Expression language for SSIS will use the ternary operator (test) ? true_condition : false_condition
I am going to test for three things: null with ISNULL(mail), longer than 20 characters with len(mail) > 20, and zero length with len(mail) == 0.
The || is a logical or, so if any of those three conditions is true, we need to set BadMail to true:
(ISNULL(mail) || len(mail) > 20 || len(mail) == 0) ? true : false
You could simplify that to eliminate the ternary operator, but I find being explicit in my intentions helpful in these situations. As a side note, if you are still having issues with unexpected results, add a preceding Derived Column transformation with one column for each criterion (null, zero length, or longer than 20 characters), and then you can inspect them individually.
Now, we add the Conditional Split.
The expression here is just our new column BadMail and that will route to Output Path 1 or whatever you name it. The good mail will pass through to the default output path.
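The routing logic itself is easy to sanity-check outside SSIS. The following Python sketch mirrors the BadMail expression and the two output paths. The function name is_bad_mail, the sample addresses, and the good/rejects lists are all made up for illustration, and None stands in for a database NULL.

```python
def is_bad_mail(mail, max_len=20):
    """Mirror of: ISNULL(mail) || len(mail) > 20 || len(mail) == 0"""
    return mail is None or len(mail) > max_len or len(mail) == 0


rows = [None, "", "a@b.co", "x" * 21]

good, rejects = [], []
for mail in rows:
    # Conditional Split: BadMail rows go to the reject output, the rest pass through.
    if is_bad_mail(mail):
        rejects.append(mail)
    else:
        good.append(mail)

print(good)          # ['a@b.co']
print(len(rejects))  # 3
```

As in the SSIS expression, the null test comes first so that the length checks are never applied to a missing value.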

Wide Method Call VB.NET [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
I've just written this:
ldb.Update(emp.Code,emp.number, "text", String.Empty, "EMP", emp.scheme, emp.status, emp.tod, emp.timestamp, emp.Code, emp.oldfrmd)
It's far too wide! How can I shorten this method call? The problem is this isn't my method, so I can't edit it.
It depends on what your concern is:
Too many parameters? Can't really change that without changing the method, although you could introduce a proxying method containing fewer parameters, where each parameter may map to several of the original ones. It looks like you might want to make this a method on whatever the type of emp is, but it's hard to know without more information.
Just too wide on the screen? Use line continuations:
ldb.Update(emp.Code, emp.number, "text", String.Empty, "EMP", _
emp.scheme, emp.status, emp.tod, emp.timestamp, _
emp.Code, emp.oldfrmd)
(IIRC the "_" isn't actually needed in VB10.)
Too many characters? Introduce some local variables, potentially, shortening the eventual call to something like:
ldb.Update(code, number, "text", "", "EMP", scheme, status, _
tod, timestamp, code, oldfrmd)
(Although your overall code will be bigger, of course.)
Since you can't change the method signature, you must really be passing all those fields of emp into it. I would tend to write my own function (pardon my terribly rusty VB; I'm sure there's something wrong with this):
updateLdb(Employee e)
which simply calls ldb's function and does nothing more. Using a single letter for a variable name is generally a bad idea, but in this case it saves your line 16 characters, and in a one-line function, "e" isn't particularly less informative than "emp". As Jon says, if you move this function into the Employee class, you can get rid of another 16 characters, and it does appear to really belong there.
I would not use "e" as a variable or parameter name in any function that is longer than one or two lines, but in that small a scope, I think you can get away with it without significantly sacrificing readability.
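The wrapper idea from both answers can be sketched in a language-neutral way. The Python below is only an illustration of the shape of the solution, not VB.NET: the Employee fields and their types are guessed from the call in the question, and Ldb is a stand-in for the real object whose method cannot be edited.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    # Field names taken from the question; the types are assumptions.
    code: str
    number: int
    scheme: str
    status: str
    tod: str
    timestamp: str
    oldfrmd: str


class Ldb:
    """Stand-in for the third-party object whose wide method cannot be changed."""

    def update(self, *args):
        return args  # the real method would write to the database


def update_ldb(ldb, e):
    """Thin wrapper: the wide call lives in exactly one place."""
    return ldb.update(e.code, e.number, "text", "", "EMP",
                      e.scheme, e.status, e.tod, e.timestamp,
                      e.code, e.oldfrmd)


emp = Employee("E1", 7, "S", "active", "tod", "2020-01-01", "old")
result = update_ldb(Ldb(), emp)
print(len(result))  # 11 arguments reach the original method
```

Every caller then writes the short update_ldb(ldb, emp) form, and the long argument list only has to be read and maintained once.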