Extracting rows based on a cell value on dynamic data - vba

I am trying to extract rows from an Excel sheet based on a cell value. Consider the example below:
Sheet 1 columns:
Group | Question | Option1 | Option2 | Answer
Sheet 2 columns:
Question | Option1 | Option2 | Answer
As demonstrated above, I'm filtering the rows from Sheet 1 based on the Group and populating the data in a separate Excel sheet.
Issue: If data is added to Sheet 1, how can it be reflected in Sheet 2 at the same time?

Related

Replace rows from table that has specific value of multiple columns?

Let's say I have a table like this called MyTable.
| Column A | Column B | Column C | Column D |
| -------- | -------- | -------- | -------- |
| Cell 1 | Cell 2 | Cell 3 | Cell 4 |
| Cell 5 | Cell 6 | Cell 7 | Cell 8 |
And now I am inserting a new row into this table, in this format:
| Cell 1 | Cell 2 | Cell 3 | Cell Something else |
What I want to do is replace an existing row in MyTable if the row I am inserting has the same values in the first 3 columns of MyTable (column A, column B, column C). As my real table has 250+ columns and I want to replace rows if they have the same values in 5 columns, I don't think INSERT ON CONFLICT UPDATE is good for this. In my opinion, it would be best to DELETE rows that need to be replaced and just INSERT new ones, but I don't know how to write that query.
I was thinking of INSERT ON CONFLICT UPDATE but firstly: I don't think I can specify more columns in ON CONFLICT part, and secondly: I think that I would need to specify 250 columns in UPDATE part, so that also doesn't work for me.
There is no problem specifying multiple columns in the ON CONFLICT clause; you just need a unique constraint on those columns (see demo). As for having 250 columns (a highly questionable design, but another question altogether), there is no way around it: you must list every column you want updated.
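As a rough illustration of the answer above, the column list in the UPDATE part does not have to be typed by hand; it can be generated from the column names. The sketch below uses Python's built-in sqlite3 module as a stand-in database (an assumption: the question does not say which engine is used, and SQLite needs version 3.24 or later for this syntax). The table name my_table, the column names col_a to col_d, and the helper build_upsert are hypothetical.

```python
import sqlite3

# Hypothetical column names standing in for the real 250+ columns.
ALL_COLS = ["col_a", "col_b", "col_c", "col_d"]
KEY_COLS = ["col_a", "col_b", "col_c"]  # columns that identify a "same" row


def build_upsert(table, all_cols, key_cols):
    """Build an INSERT ... ON CONFLICT ... DO UPDATE statement from column lists."""
    placeholders = ", ".join("?" for _ in all_cols)
    # Every non-key column is overwritten with the incoming (excluded) value.
    updates = ", ".join(f"{c} = excluded.{c}" for c in all_cols if c not in key_cols)
    return (
        f"INSERT INTO {table} ({', '.join(all_cols)}) VALUES ({placeholders}) "
        f"ON CONFLICT ({', '.join(key_cols)}) DO UPDATE SET {updates}"
    )


conn = sqlite3.connect(":memory:")
# The multi-column ON CONFLICT target requires a matching unique constraint.
conn.execute(
    "CREATE TABLE my_table (col_a TEXT, col_b TEXT, col_c TEXT, col_d TEXT, "
    "UNIQUE (col_a, col_b, col_c))"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [("Cell 1", "Cell 2", "Cell 3", "Cell 4"),
     ("Cell 5", "Cell 6", "Cell 7", "Cell 8")],
)

# Same first three columns as an existing row, so that row is replaced.
conn.execute(build_upsert("my_table", ALL_COLS, KEY_COLS),
             ("Cell 1", "Cell 2", "Cell 3", "Cell Something else"))
rows = conn.execute("SELECT * FROM my_table ORDER BY col_a").fetchall()
```

After this runs, the table still has two rows, and the first one carries the new value in its last column. In a real schema the column list could be read from the database catalog rather than hard-coded.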

Populate specific cells in a column in a Pandas Dataframe using a Lookup table

I would like to populate specific cells in a column of a large dataframe using a Lookup table. For a table of type (df_original) with many columns:
| Main_col_1 | Main_col_2 | Main_col_3 |
|:--------------|:--------------:| ----------:|
|common_value_1 | common_value_2 | Old_Value |
and a Lookup table of type (df_lookup), where the column names Main_col_1 and Main_col_2 are common to both tables, but only a subset of the cell values from df_original will be present in df_lookup.
| Main_col_1 | Main_col_2 | Lookup_col_3 |
|:--------------|:--------------:| ----------------:|
|common_value_1 | common_value_2 | substitute_value |
Some values in Lookup_col_3 (actually, there are two other lookup columns that have not been shown for brevity) are to be copied into df_original. Not all rows in df_original are to have replacements, only those rows whose values in Main_col_1 and Main_col_2 are present in df_lookup. The revised table df_original is to look like this:
| Main_col_1 | Main_col_2 | Main_col_3 |
|:--------------|:--------------:| ----------------:|
|common_value_1 | common_value_2 | substitute_value |
Whenever I use Merge or Join in the simple, well-known manner, the resulting dataframe gives a larger number of rows than was present in df_original, instead of the same number of rows. What additional operations have to be performed for this to work?
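One common cause of extra rows after a merge is duplicate key combinations in the lookup table, since each duplicate matches the same original row again. The sketch below assumes that is what is happening here (the question does not confirm it) and uses small made-up frames; only the column names come from the question. It de-duplicates the lookup on the key columns, does a left merge, and then overwrites Main_col_3 only where a lookup value was found.

```python
import pandas as pd

# Made-up sample data; only the column names come from the question.
df_original = pd.DataFrame({
    "Main_col_1": ["a", "b", "c"],
    "Main_col_2": ["x", "y", "z"],
    "Main_col_3": ["old_1", "old_2", "old_3"],
})
df_lookup = pd.DataFrame({
    "Main_col_1": ["a", "a", "c"],   # ("a", "x") appears twice on purpose
    "Main_col_2": ["x", "x", "z"],
    "Lookup_col_3": ["new_1", "new_1", "new_3"],
})

keys = ["Main_col_1", "Main_col_2"]

# Keep one lookup row per key pair so each original row matches at most once.
lookup_unique = df_lookup.drop_duplicates(subset=keys)

# A left merge keeps every row of df_original; validate raises if the
# lookup still has duplicate keys.
merged = df_original.merge(lookup_unique, on=keys, how="left",
                           validate="many_to_one")

# Take the lookup value where one exists, otherwise keep the old value.
merged["Main_col_3"] = merged["Lookup_col_3"].combine_first(merged["Main_col_3"])
result = merged.drop(columns="Lookup_col_3")
```

If duplicate keys in the lookup carry different substitute values, dropping duplicates silently picks the first one, so that case would need its own rule.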

Comparing Column1 to Column 2 and writing to Column3 if match

I have an Excel worksheet linked to a SQL query in column [Raw Data]. After adding a few columns with formulas to clean up the raw data, I need to find out whether the value in column [ProcDataQ] exists in column [ProcDataO]. Together, these columns make up Table1.
ProcDataQ | ProcDataO | Stat
--------- | --------- | ----
C1234 | C7126 | Ordered
C8372 | C6152 | No Order
C7126 | C1234 | Ordered
I am able to do this with the formula below, but I have more than 20,000 records and it takes around 30 seconds to load or refresh the table. I figured I could speed this up with a little VBA that I'll trigger to run on the query refresh.
=IF(AND(LEFT([@[Raw Data]],1)="q", (NOT(ISERROR(MATCH([@ProcDataQ],[ProcDataO], 0))))),"Ordered", "No Order Placed")
FYI, I am running Excel 2010 on a PC.
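Just use IF together with COUNTIF:
=IF(COUNTIF(range, item to look up)>0,"Ordered","Not ordered")
For the table above, that would be something like:
=IF(COUNTIF([ProcDataO],[@ProcDataQ])>0,"Ordered","No Order")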
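The check that the COUNTIF formula performs is a membership test: does this ProcDataQ value appear anywhere in ProcDataO? The sketch below shows the same logic in Python rather than VBA, using the three sample rows from the question; the function name order_status is hypothetical. Building a set once and testing each value against it avoids rescanning the whole column for every row, which is the usual way to keep this fast at 20,000+ records.

```python
def order_status(proc_data_q, proc_data_o):
    """Return 'Ordered' for each Q value that appears anywhere in the O column."""
    ordered = set(proc_data_o)  # built once; each lookup is then constant time
    return ["Ordered" if q in ordered else "No Order" for q in proc_data_q]


# Sample rows from the question's table.
proc_data_q = ["C1234", "C8372", "C7126"]
proc_data_o = ["C7126", "C6152", "C1234"]

status = order_status(proc_data_q, proc_data_o)
```

With the sample data, status matches the Stat column shown in the question. The same idea carries over to VBA by loading ProcDataO into a dictionary object before looping over ProcDataQ.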

Reference lookup side column values based on cell value

I've got two worksheets: (a) contains rows which need to be dynamically updated based on a cell (ID), and (b) contains data for over 10k products, arranged in columns.
How do I get worksheet (a) to look up data in worksheet (b) and, based on the ID, pick up the data from the neighbouring columns? For example, when I change the ID to 02, the rows above it should populate automatically.
Worksheet A
Name
Price
Qty
ID
Worksheet B
ID | Name | Price | Qty
01 | Screw| 0.5 | 500
02 | Nail | 0.4 | 1000
03 | Cap | 0.2 | 800
Yes, VLOOKUP will work.
In each cell (except ID) you will need to put this formula:
=VLOOKUP(B1,WORKSHEETB!A1:D4,2,FALSE)
where "B1" is your ID, which is used to look up the matching row in your table in worksheet B.
WORKSHEETB!A1:D4 is your table array (your table in worksheet B).
"2" is the column that you are referencing. For example, Name is found in column 2 of that table array (it doesn't matter where the table is located within the sheet; Name will always be column 2 of the array). Use 3 for Price and 4 for Qty.
Assuming Name is in cell A1, Price is in cell A2, Qty is in cell A3 and ID is in cell A4 (so the ID value itself is typed into B4), then:
Cell B1 Formula: =VLOOKUP(B4,'Worksheet B'!A:B,2,0)
Cell B2 Formula: =VLOOKUP(B4,'Worksheet B'!A:C,3,0)
Cell B3 Formula: =VLOOKUP(B4,'Worksheet B'!A:D,4,0)
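For readers who want to see what an exact-match VLOOKUP does step by step, here is a small Python sketch of the same behaviour using the Worksheet B sample data. The function vlookup below is a hypothetical helper written for illustration, not an Excel or library API, and it treats IDs as text.

```python
# Worksheet B sample rows: (ID, Name, Price, Qty)
PRODUCTS = [
    ("01", "Screw", 0.5, 500),
    ("02", "Nail", 0.4, 1000),
    ("03", "Cap", 0.2, 800),
]


def vlookup(lookup_value, table, col_index):
    """Exact-match lookup: find the row whose first column equals lookup_value
    and return the value in the 1-based column col_index, or None if absent."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None


product_id = "02"
name = vlookup(product_id, PRODUCTS, 2)   # column 2 = Name
price = vlookup(product_id, PRODUCTS, 3)  # column 3 = Price
qty = vlookup(product_id, PRODUCTS, 4)    # column 4 = Qty
```

Changing product_id and re-running the three lookups mirrors what happens in the worksheet when the ID cell is edited.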

ADO - how to select column from xls file where two or more columns have the same name?

I have an excel file like this:
| | A | B | C | D |
| 1 | Name 1 | Name 2 | Name 3 | Name 2 |
| 2 | Data | Data | Data | Data |
| 3 | Data | Data | Data | Data |
As you can see, headers of two columns have the same name - Name 2.
My question is, is it possible to tell the ADO engine from which column to select data?
Currently my select looks like this:
SELECT [Name 1], [Name 2] FROM [REPORT7_RAW$] WHERE [Name 1] IS NOT NULL
and ADO picks up the data from the column that sits under column B in Excel. In other words, it takes the first column that has the given name. Unfortunately I have two columns with the same name and I would like to pull the data from column D. Is that possible?
I could not find any way to select a column by its index rather than by its name.
You will need to change your connection string so that data header names are not used. The normal connection string would look something like this:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myExcel2007file.xlsx;
Extended Properties="Excel 12.0 Xml;HDR=YES";
You need to change the last bit, HDR=YES, to HDR=NO.
With that type of connection, the columns(fields) then become F1, F2, etc., where F1 = column A, F2 = column B, etc.
This is not ideal, since you are now essentially running the query based on the number of the column rather than the name, but with duplicate column names, this is the only way around that.
Per the comment from @barrowc: This format of the connection string will treat your column names as data. So depending on your query, you may need to include code to filter out the row that contains your column names.
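To make the effect of HDR=NO concrete, the sketch below emulates it in plain Python: it is not ADO code and does not open an Excel file. Each row becomes a record with fields F1, F2, and so on, the header row arrives as ordinary data, and the code skips it before selecting columns A and D. The data values (a1, d1, ...) and the function read_without_headers are made up for illustration.

```python
# Stand-in for the sheet contents; the first row holds the duplicate headers.
SHEET_ROWS = [
    ["Name 1", "Name 2", "Name 3", "Name 2"],
    ["a1", "b1", "c1", "d1"],
    ["a2", "b2", "c2", "d2"],
]


def read_without_headers(rows):
    """Emulate HDR=NO: name fields F1..Fn by position and keep every row as data."""
    return [{f"F{i}": value for i, value in enumerate(row, start=1)}
            for row in rows]


records = read_without_headers(SHEET_ROWS)

# With HDR=NO the header row is just the first data row, so drop it explicitly.
header, data = records[0], records[1:]

# Equivalent in spirit to selecting F1 and F4 where F1 is not null.
selected = [(r["F1"], r["F4"]) for r in data if r["F1"] is not None]
```

Here F4 corresponds to column D, so the second "Name 2" column is picked unambiguously. In the real query, the header row would need to be excluded in a similar way, for example by filtering on a known header value.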