Merging Datasets (Right Join) in VBA - vba

I'm in the process of creating a VBA script where I import two files into separate worksheets. Sheet 1's data includes masked account numbers, actual account numbers, account type and sub-account type. Sheet 2 includes the account number, account type, sub-account type and other columns that are important. What I need to do is merge sheet 1 with sheet 2 so that Sheet 2 will also include the masked account numbers (after the merge I will delete the account numbers and just keep the masked account numbers). Below is an example of what my datasets look like:
I've been doing this in R with ease: I can simply merge the datasets with the right_join function. Now I need to replicate the process in VBA. I've searched for how to merge worksheets (datasets), but so far I've only found examples of merging substrings, which is not what I want. I believe I can do this with a nested VLOOKUP, but I was hoping to find a better way in VBA. Any advice and/or literature that would help me better understand how to write a script to merge the data would be appreciated. Thanks!
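For reference, the logic of the right join described above can be sketched outside Excel. This is a minimal Python sketch, not VBA, and the column names (account, type, sub_type, masked, balance) and sample rows are made up for illustration. The idea is to index Sheet 1 by the shared key columns and then walk Sheet 2, which is also how a dictionary-based lookup (for example with a Scripting.Dictionary) would likely be structured in VBA.

```python
def right_join(left_rows, right_rows, keys):
    """Keep every row of right_rows and add the left-only columns where the key matches."""
    # Index the left table by its key tuple (if a key repeats, the last row wins).
    lookup = {tuple(row[k] for k in keys): row for row in left_rows}

    # Collect the columns that exist only on the left side, in first-seen order.
    extra_cols = []
    for row in left_rows:
        for col in row:
            if col not in keys and col not in extra_cols:
                extra_cols.append(col)

    joined = []
    for row in right_rows:
        match = lookup.get(tuple(row[k] for k in keys), {})
        merged = dict(row)
        for col in extra_cols:
            merged[col] = match.get(col)  # None when Sheet 1 has no matching row
        joined.append(merged)
    return joined


# Hypothetical sample data standing in for the two worksheets.
sheet1 = [
    {"account": "1001", "type": "A", "sub_type": "X", "masked": "M-01"},
    {"account": "1002", "type": "B", "sub_type": "Y", "masked": "M-02"},
]
sheet2 = [
    {"account": "1002", "type": "B", "sub_type": "Y", "balance": 250},
    {"account": "1003", "type": "A", "sub_type": "X", "balance": 75},
    {"account": "1001", "type": "A", "sub_type": "X", "balance": 100},
]

merged = right_join(sheet1, sheet2, keys=("account", "type", "sub_type"))
for row in merged:
    print(row)
```

Every Sheet 2 row survives the merge, and rows without a match in Sheet 1 simply get an empty masked value, which is the behaviour that distinguishes a right join from an inner join.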

Related

Compare sheet1 columns(400) data matching with sheet2 columns(500) as different column order

Sheet 1 has 400 columns and up to 10,000 rows of data. I want to compare it with sheet 2, which has 600 columns, meaning it has additional columns and they are in a different order.
I need to compare the column data in sheet 1 with sheet 2 (which has a different column order), highlight the differences in sheet 2, and put the differences in sheet 3 with the mismatched cells highlighted. I'm new to VBA and need your support.
You can pick any one column in sheet 1 as the primary column.
Instead of hoping someone will write your code or teach you from scratch, you might have better luck researching how to do this with Excel's built-in tools. My answer is therefore "don't use VBA": organizing data is what Excel is meant for, and there is plenty of built-in functionality that you're probably not aware of.
First, there's Spreadsheet Compare (available in certain versions), which compares two workbooks (or two versions of the same workbook) and helps you see and organize differences between them. You can also identify potential issues, such as changes in formulas or calculations, or manually entered values.
Also built in is Consolidate data in multiple worksheets, which lets you summarize and report results from separate worksheets in a single document. The sheets can be in the same workbook or in other workbooks. When you consolidate data, you assemble it so that you can more easily update and aggregate it as necessary.
If the columns are in different orders between the worksheets and that causes an issue for you (or for the built-in tools), a simple fix is to sort the columns alphabetically on both sheets. To do this, choose Left-to-Right in the Sort Options.
I'm sure there are other relevant features I'm not thinking of; take a look through "all" the commands available on your version's ribbon to see if there are others you're not aware of.
There are also a number of worksheet functions that could help with a process like this. Exactly which ones depends on your needs (impossible for others to advise on without knowing the details of your current organization method).
The Insert Function Dialog
Off the top of my head, VLOOKUP, HLOOKUP, INDEX, MATCH, FIND, MID, LEFT and RIGHT could all potentially be beneficial to this task. (And still, no VBA required.) Find out more on those at the official Categorized List of Excel Functions, and also see the Lookup Functions section specifically.
Finally, there are a number of free or paid third-party add-ins specifically for Comparing & consolidating worksheets. For example, here is a lengthy list of the comparison functionality offered by DiffEngineX.
It's very common for Excel users to have a task at hand and assume that it's necessary to dive into VBA, without realizing Excel already provides the functionality they need. As a rule of thumb, ask yourself "Is this unique, or is this something that someone might have needed to do before?" If it's not unique, chances are it's either already built in, or there's an existing solution somewhere online.
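To make the comparison itself concrete, here is a minimal Python sketch of what "compare despite a different column order" amounts to: match rows on the primary column and match cells by header name rather than by position. The headers (id, name, city, extra) and the data are hypothetical, and this is only an illustration of the logic, not something the built-in tools above require.

```python
def diff_by_header(sheet1, sheet2, key):
    """Compare rows that share the same key value, matching columns by header name."""
    index2 = {row[key]: row for row in sheet2}
    mismatches = []  # (key value, column, value in sheet 1, value in sheet 2)
    missing = []     # key values present in sheet 1 but absent from sheet 2

    for row1 in sheet1:
        row2 = index2.get(row1[key])
        if row2 is None:
            missing.append(row1[key])
            continue
        for col, value in row1.items():
            # Columns that exist only in sheet 2 are ignored; order is irrelevant.
            if col in row2 and row2[col] != value:
                mismatches.append((row1[key], col, value, row2[col]))
    return mismatches, missing


# Hypothetical data: sheet 2 has an extra column and a different column order.
sheet1 = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Rome"},
    {"id": 3, "name": "Cy", "city": "Bonn"},
]
sheet2 = [
    {"city": "Oslo", "extra": "x", "id": 1, "name": "Ann"},
    {"city": "Paris", "extra": "y", "id": 2, "name": "Bob"},
]

diffs, absent = diff_by_header(sheet1, sheet2, key="id")
print(diffs)
print(absent)
```

The list of mismatches corresponds to what would be written to sheet 3, and each entry identifies the cell that would be highlighted in sheet 2.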

Excel VBA - Adding keys to duplicated couples

I'm working on a table in Excel with two columns with repeated values (text), and I need to create a new column (same sheet), where each (sorted) couple is associated with an integer.
Here is a simplified example:
--> Starting point
--> Expected output
Since the number of rows is really huge (not known a priori - data are imported from external files), I need to write the code in a very efficient way!
All suggestions are warmly welcomed!
Manual method
Manually sort the columns
Insert 1 into cell C1
Insert =IF(AND(A2=A1,B2=B1),C1,C1+1) into cell C2 and copy this cell down.
VBA method (via macro recorder)
Start recording a macro
Manually do all 3 steps from above (manual method)
Stop macro recording
Now you have a basic macro that you can modify to your needs and learn from.
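The logic behind the manual method (sort, then increase the counter whenever the couple changes) can be sketched as follows. This is a Python illustration with made-up sample values, not the recorded macro; it mirrors the formula =IF(AND(A2=A1,B2=B1),C1,C1+1) applied to sorted rows.

```python
def number_couples(pairs):
    """Sort the couples and give each distinct couple the next integer key."""
    keyed = []
    key = 0
    previous = None
    for pair in sorted(pairs):
        # Same rule as the worksheet formula: a new couple gets previous key + 1.
        if pair != previous:
            key += 1
            previous = pair
        keyed.append((pair[0], pair[1], key))
    return keyed


# Hypothetical sample values for columns A and B.
sample = [("b", "x"), ("a", "y"), ("a", "x"), ("a", "y"), ("b", "x")]
result = number_couples(sample)
for row in result:
    print(row)
```

Sorting dominates the cost, and the numbering itself is a single pass, so the approach should scale to large row counts.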

Merging two different spreadsheets database data

Newbie here, working on something a bit complicated. I'm not sure how to start or what the best way is, so I'm looking for some advice and tips.
We have two systems running MS Dynamics POS 2009 and have extracts of all the data (inventory/stock) in spreadsheets. Both databases contain pretty much the same items, but because they have been run separately, all the naming and Part Numbers are in different formats.
I need to create one database (one Excel file) from both, where a partial match on Part Number is identified and "merged" (keeping the Part Number and Description from sheet1 and updating Stock to sheet1 stock + sheet2 stock).
The problem is that the Part Numbers are written in completely different styles (by different people) and can only be matched by some partial match (I guess the last 3-6 characters of the Part Numbers).
I am not an Excel expert, so any advice and tips would be appreciated.
I have also thought of loading those Excel sheets into two separate SQL databases and doing it from SSMS, as I'm not sure Excel can cope with this.
Thanks
I'm not 100% sure of the source data, but based on the available information here are some possible steps:
-Create a new database in SSMS.
-Load the data from your Excel extracts with the import data tool (right-click your newly created database, then Tasks, Import Data). This pulls up a wizard that turns your Excel spreadsheet into a table in SQL Server. Do this for all spreadsheets.
http://searchsqlserver.techtarget.com/feature/The-SQL-Server-Import-and-Export-Wizard-how-to-guide
-You may be able to do some matching based on the start/end characters and use a MERGE statement to get unique data. The MERGE statement lets you set match criteria and then take certain actions depending on a positive or negative match. For example, if your two POS systems have spreadsheets of products with some overlap, but also some products that are unique to each system, you could use the table from the first system as the target and insert only the products that are unique to the other system; if there is a match, do nothing. Something like:
MERGE ProductA AS A
USING ProductB AS B
ON RIGHT(A.ProductID, 5) = RIGHT(B.ProductID, 5)
WHEN NOT MATCHED BY TARGET THEN
INSERT (ProductID, Description)
VALUES (B.ProductID, B.Description);
https://www.simple-talk.com/sql/learn-sql-server/the-merge-statement-in-sql-server-2008/
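The question also asks for the stock of matched items to be added together, which the MERGE example above does not cover. Here is a minimal Python sketch of that part, assuming (as in the SQL) that the last five characters of the part number identify an item. The field names and sample parts are hypothetical, and a real data set would need a check for suffix collisions before trusting the result.

```python
def merge_stock(sheet1, sheet2, suffix_len=5):
    """Keep sheet1's part number and description; add sheet2's stock on a suffix match."""
    combined = [dict(item) for item in sheet1]  # copy so the inputs stay untouched
    # Assumption: the suffix is unique within sheet1 (a later duplicate would win).
    by_suffix = {item["part_number"][-suffix_len:]: item for item in combined}

    unmatched = []  # sheet2 items with no partner, left for manual review
    for item in sheet2:
        target = by_suffix.get(item["part_number"][-suffix_len:])
        if target is None:
            unmatched.append(item)
        else:
            target["stock"] += item["stock"]
    return combined, unmatched


# Hypothetical extracts from the two POS systems.
system1 = [
    {"part_number": "ABC-12345", "description": "Brake pad", "stock": 4},
    {"part_number": "ABC-67890", "description": "Oil filter", "stock": 10},
]
system2 = [
    {"part_number": "bp12345", "description": "brake pads", "stock": 6},
    {"part_number": "zz-00001", "description": "Wiper", "stock": 2},
]

combined, unmatched = merge_stock(system1, system2)
print(combined)
print(unmatched)
```

Keeping the unmatched items in a separate list makes it easy to review them by hand, which is probably unavoidable when the part numbers were typed in different styles.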

Combine Columns in multiple Excel workbooks and auto-de-dupe

I have three workbooks with IDs in column A. I want to create a fourth workbook, which should combine the IDs and de-dupe them automatically so that I can perform a vlookup on them to reference data on the other workbooks. The 3 workbooks with data in them will be constantly updated with new ID numbers added, so I need the master/summary workbook to automatically grab newly added ID numbers and perform vlookups against the other workbooks.
The goal of this is to give a summary view of each record (which corresponds to a person), letting the user know which workbooks that person exists in.
I have tried using =MAX() to retrieve the number of IDs in each workbook and combining the results, which tells me the total number of IDs that exist. Then I tried this: =SMALL(IF(FREQUENCY(Test1:Test2$A$2:$A$1000, ROW($1:$28))<>0, ROW($1:$28), ""), ROW(A1))
+ CTRL + SHIFT + ENTER
But I'm 1. not sure if that'll work and 2. not sure how the syntax works with 3 separate workbooks.
I also tried the union method in VBA with no luck - again I think I'm messing up the syntax.
You can retrieve a unique list of the id numbers in [ID_First.xlsx]Sheet1!$A$2:$A$999 using the following array formula in the master worksheet's A2 (needs a row above it to avoid circular references).
=IFERROR(INDEX([ID_First.xlsx]Sheet1!$A$2:$A$999,MATCH(0, IF(LEN([ID_First.xlsx]Sheet1!$A$2:$A$999),COUNTIF(A$1:A1,[ID_First.xlsx]Sheet1!$A$2:$A$999),1),0)),"")
If you stack similar formulas consecutively, passing calculation on to them with IFERROR(), you can gain a unique list from three separate workbooks.
=IF(LEN(A1),IFERROR(INDEX([ID_First.xlsx]Sheet1!$A$2:$A$999,MATCH(0, IF(LEN([ID_First.xlsx]Sheet1!$A$2:$A$999),COUNTIF(A$1:A1,[ID_First.xlsx]Sheet1!$A$2:$A$999),1),0)),IFERROR(INDEX([ID_Second.xlsx]Sheet1!$A$2:$A$999,MATCH(0, IF(LEN([ID_Second.xlsx]Sheet1!$A$2:$A$999),COUNTIF(A$1:A1,[ID_Second.xlsx]Sheet1!$A$2:$A$999),1),0)),IFERROR(INDEX([ID_Third.xlsx]Sheet1!$A$2:$A$999,MATCH(0, IF(LEN([ID_Third.xlsx]Sheet1!$A$2:$A$999),COUNTIF(A$1:A1,[ID_Third.xlsx]Sheet1!$A$2:$A$999),1),0)),""))),"")
Array formulas require Ctrl+Shift+Enter to finalize. Once entered correctly, fill down as necessary to collect all unique IDs.
With a unique list of id numbers, you can use the same method of nested IFERROR functions to look through a series of three workbooks for additional data.
=IFERROR(VLOOKUP($A2, [ID_First.xlsx]Sheet1!$A$2:$Z$999, 2, FALSE),IFERROR(VLOOKUP($A2, [ID_Second.xlsx]Sheet1!$A$2:$Z$999, 2, FALSE),IFERROR(VLOOKUP($A2, [ID_Third.xlsx]Sheet1!$A$2:$Z$999, 2, FALSE),"")))
I'm offering this as you've mentioned a total of 50 member IDs. This method can quickly eat up calculation resources when applied to larger groups of numbers, because every new row recounts the whole list built so far.
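As a cross-check of what the stacked formulas are meant to produce, here is a minimal Python sketch of the same result: a de-duplicated ID list in first-seen order, plus the workbooks each ID appears in. The workbook names reuse the file names from the formulas above, while the ID values are invented for illustration.

```python
def summarize_ids(workbooks):
    """workbooks maps a workbook name to its list of IDs (column A)."""
    unique_ids = []
    seen = set()
    for ids in workbooks.values():
        for id_ in ids:
            # Skip blanks, like the LEN() test in the array formula.
            if id_ != "" and id_ not in seen:
                seen.add(id_)
                unique_ids.append(id_)

    # For each unique ID, list the workbooks that contain it.
    membership = {
        id_: [name for name, ids in workbooks.items() if id_ in ids]
        for id_ in unique_ids
    }
    return unique_ids, membership


# Hypothetical IDs for the three workbooks.
workbooks = {
    "ID_First": [101, 102, 101, ""],
    "ID_Second": [102, 103],
    "ID_Third": [104, 101],
}

unique_ids, membership = summarize_ids(workbooks)
print(unique_ids)
for id_, names in membership.items():
    print(id_, names)
```

The membership table is the "summary view" from the question: one row per person, showing which workbooks that person exists in.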
If you've got Excel 2016 or later, you could unpivot the data using PowerQuery, which is now built in to Excel under the Get & Transform section of the Data tab in the ribbon. Plenty of examples of how to do this if you search Google for 'unpivot' and 'Powerquery'.
If you have Excel 2010 or 2013, you can download the PowerQuery addin for free from https://www.microsoft.com/en-nz/download/details.aspx?id=39379 (assuming IT let you do so).
PowerQuery is a revolution in Excel data transformation, and the learning curve is a lot less steep than advanced formulas or VBA.

Creating Power point slides with tables in Excel

I have a large workbook that has several connections and queries to an Oracle database to gather data.
I have roughly 6 sheets in this workbook that contain my final data.
I would like to move all of this data to a PowerPoint presentation. I have seen many examples of how to move charts and graphs, but I have none of these in my workbook nor do I need them.
3 of my sheets display data generated by a pivot table on a separate sheet. I have done this because I am trying to avoid showing the pivot filter arrows.
The other three sheets are a table created from the Oracle query. Each sheet has a separate query to display data specific to a certain customer.
I would like to take the data I have in my spreadsheets and build tables in PowerPoint containing that data. I have tried importing the objects to PowerPoint, but since the data can change from minute to minute having to update the links and then refresh the data is rather clumsy. Also, I never know how many rows of data I will have. This is also due to the fact that the data can change minute by minute.
In short I am trying to look at Sheet one. Take all of the data there and build a table in PowerPoint to match. When building the table in PowerPoint only place a max number of 6 rows per PowerPoint slide. Continue to add slides until all of the data is moved.
You'll probably need to cobble together bits of the following, which my friends Brian and Naresh have allowed me to post on the PowerPoint FAQ site I maintain. Between the two, it should get you there:
Controlling Office Applications from PowerPoint (by Naresh Nichani and Brian Reilly)
http://www.pptfaq.com/FAQ00795_Controlling_Office_Applications_from_PowerPoint_-by_Naresh_Nichani_and_Brian_Reilly-.htm
Where it starts: the DisplayData project by Naresh Nichani and Brian Reilly
http://www.pptfaq.com/FAQ00784_Where_it_starts-_the_DisplayData_project_by_Naresh_Nichani_and_Brian_Reilly.htm
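Whichever automation route is used, the paging requirement from the question (at most six rows per slide, adding slides until the data runs out) comes down to splitting the rows into fixed-size chunks. A minimal Python sketch of just that step, with made-up rows, might look like this; building the actual PowerPoint tables is deliberately left out.

```python
def paginate(rows, rows_per_slide=6):
    """Split the rows into consecutive chunks, one chunk per slide."""
    return [rows[i:i + rows_per_slide] for i in range(0, len(rows), rows_per_slide)]


# Hypothetical sheet contents: 14 data rows of (customer, amount).
data = [[f"customer {n}", n * 10] for n in range(1, 15)]

slides = paginate(data)
for number, chunk in enumerate(slides, start=1):
    print(f"slide {number}: {len(chunk)} rows")
```

Because the chunking depends only on the current row count, re-running it after each data refresh handles the fact that the number of rows is never known in advance.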