Excel Macro to combine cells of data when data matches in another column - vba

The best way I can explain my problem is by showing a few screenshots.
I need to turn data like this:
[image]
Into something that displays like this:
[image: After Data]
The file contains multiple part numbers, and I need the macro to take all the data for a matching part number and transform it into what is shown in the second image. The rows for each part number are grouped together, so the macro would not need to loop from the top of the sheet every time; it could simply append to the current entry with each new row. The years also need handling: the data gives a range of years, and I need one entry for each year in that range.
Additional Information:
I am using this data to prepare category data for a BigCommerce site that uses a year/make/model plugin to provide a vehicle lookup system. For users to look up their vehicles accurately, the categories need to be listed the way they are in the second picture, which must be the result of the macro.
I thank anyone who takes the time to look into this; it will cut down the time I spend doing this manually by a huge amount.

You can do this with a formula (without actual VBA):
In cell F2 write: ="YMM/"&C2&"/"&D2&"/"&E2&";"
In cell F3 write: =F2&"YMM/"&C3&"/"&D3&"/"&E3&";"
Drag the formula in F3 down to the last row.
The last row will then contain the entire string of all vehicles.
I just noticed you may have duplicate values. You can use the built-in Remove Duplicates feature to remove those before using the above technique.
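For comparison, here is a minimal Python sketch of the same idea with two things the formula does not do: it groups by part number and it expands year ranges. It is not the asker's macro. It assumes each row is a (part number, year range, make, model) tuple, that a range is written like "2005-2007", and that the output should use the same "YMM/year/make/model;" pattern as the formula above. The function names are hypothetical.

```python
def expand_years(year_range):
    """Turn '2005-2007' into [2005, 2006, 2007]; a single year gives a one-item list."""
    parts = [int(p) for p in str(year_range).split("-")]
    start, end = parts[0], parts[-1]
    return list(range(start, end + 1))


def build_categories(rows):
    """Group rows by part number and build one category string per part.

    rows: iterable of (part_number, year_range, make, model) tuples (assumed layout).
    """
    entries_by_part = {}
    for part, years, make, model in rows:
        entries = entries_by_part.setdefault(part, [])
        for year in expand_years(years):
            entry = f"YMM/{year}/{make}/{model};"
            if entry not in entries:  # skip duplicates, like Remove Duplicates
                entries.append(entry)
    return {part: "".join(entries) for part, entries in entries_by_part.items()}


rows = [
    ("P1", "2005-2006", "Ford", "Ranger"),
    ("P1", "2006", "Ford", "Ranger"),
    ("P2", "2010", "Honda", "Civic"),
]
result = build_categories(rows)
```

Each part number ends up with a single string, so the result can be written back to one cell per part.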

Related

Number of unique IDs and sum of values per category in a large dataset

I have a large dataset (approx. 250.000 records) where I have an ID column, different categories an ID can belong to and a value column:
Now I want to count the unique occurrences of each ID per category, and likewise sum the values per category. The result for the example should look like this:
In the example I was able to do this manually, but that is not feasible for the full dataset. I considered several approaches without finding a good one. One option is to compute each cell separately in the Power Query Editor and enter the resulting number by hand (this is how I created the solution for the example), but that means repeating the Power Query step for every cell. Writing ordinary Excel formulas for each single cell also involves a lot of manual work. I would like to avoid manual work, and I assume there must be a better way. An Excel solution would be fine; if VBA is necessary, I am OK with that too.
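The calculation described here can be sketched in a few lines of Python. This only shows the logic and is not an Excel solution. It assumes the data is in "long" form, one (ID, category, value) record per row, which the question does not confirm; the function name is hypothetical.

```python
from collections import defaultdict


def summarize(records):
    """Return {category: (number of unique IDs, sum of values)}.

    records: iterable of (record_id, category, value) tuples (assumed layout).
    """
    ids_by_category = defaultdict(set)
    total_by_category = defaultdict(float)
    for record_id, category, value in records:
        ids_by_category[category].add(record_id)  # a set keeps each ID once
        total_by_category[category] += value
    return {
        category: (len(ids_by_category[category]), total_by_category[category])
        for category in ids_by_category
    }


records = [
    (1, "A", 10.0),
    (1, "A", 5.0),
    (2, "A", 1.0),
    (2, "B", 4.0),
]
summary = summarize(records)
```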

Complex IF formula in Excel

I checked various posts on IF formulas, but I cannot find a way to get the correct result in my report. I manage deliveries and would like to calculate the delay in days based on the data from the delivery report. The trick is that the delay depends on the status of the delivery, because each status requires a different date from a different column in Excel. This is the data:
Status of delivery:
Confirmed
Unloaded
Unloading
Not confirmed
Started
In route
Pick-up pending
Prepared
The delivery status is updated in column C of my Raw Data report. For each status I have to calculate the delay in a different way, so I figured an IF formula could be of use.
Below you can see the columns that contain the relevant dates for the calculations:
Status of delivery and reference date:
Confirmed - D
Unloaded - D
Unloading - D
Not confirmed - S
Started - D
In route - S
Pick-up pending - E
Prepared - S
I wrote the formula below. Sadly, only the first record is calculated correctly; the rest of the delays are "null".
=IF(C2="Confirmed";(TODAY()-D2);IF(C2="Unloaded";(TODAY()-D2);IF(C2="Unloading";(TODAY()-D2);IF(C2="Not confirmed";(TODAY()-S2);IF(C2="Started";(TODAY()-D2);IF(C2="In route";(TODAY()-S2);IF(C2="Pick-up pending";(TODAY()-E2);IF(C2="Prepared";(TODAY()-S2);"null"))))))))
Do you happen to have any idea where I am making the error that I don't see? I will be grateful for any help. In case it is relevant, I am using Excel 2016.
Breaking it down so the long line becomes readable.
=IF(C2="Confirmed";
    (TODAY()-D2);
    IF(C2="Unloaded";
        (TODAY()-D2);
        IF(C2="Unloading";
            (TODAY()-D2);
            IF(C2="Not confirmed";
                (TODAY()-S2);
                IF(C2="Started";
                    (TODAY()-D2);
                    IF(C2="In route";
                        (TODAY()-S2);
                        IF(C2="Pick-up pending";
                            (TODAY()-E2);
                            IF(C2="Prepared";
                                (TODAY()-S2);
                                "null"
                            )
                        )
                    )
                )
            )
        )
    )
)
At first glance, this is basically a vertical lookup table.
So, I created these two columns. One is the text status we're looking for.
The other is the calculated date.
Single date VLOOKUP
Assuming D2, S2 and E2 are fixed fields I made the formula =TODAY() - $D$2
Then it's a simple matter of doing the VLOOKUP in the correct field.
Because we're working with a date type field, we need to convert the number to text to get a meaningful date. (I used JJ in the screenshot for years because I have a Dutch locale.)
Then we also need to handle when VLOOKUP can't find anything, for that we use IFERROR.
=IFERROR(TEXT(VLOOKUP(F6;$A$2:$B$9;2;FALSE);"MM-DD-YY");"NULL")
And now you have an easy-to-expand lookup table that you can put anywhere (hidden on a worker tab, for example), where you can calculate your values and add or remove entries.
Many rows, many dates
But say you have many rows with different statuses and dates, and you wish to know the number of days it'll take. Then this vertical lookup doesn't look so useful, because it can only be used for one row.
We can still leverage VLOOKUP to make our life a bit easier.
Assuming there are dates in columns D, E and S:
=IFERROR(DATEDIF(TODAY();INDIRECT(ADDRESS(ROW();VLOOKUP(F2;$A$2:$B$9;2;FALSE)));"d");"NULL")
We use VLOOKUP to see which column we need to look in. We use a number here for column, not a letter.
We then use ADDRESS to get an Excel address reference for the current ROW() and the column we found via VLOOKUP.
We funnel that through INDIRECT so we can get the value from that targeted cell.
Then we use DATEDIF to get the difference in days, counted from today.
We wrap it all in an IFERROR to keep things clean.
You could use INDEX($D2:$S2;0;VLOOKUP($F2;$A$2:$B$9;2;FALSE)) to get the same effect, as pointed out by Dirk Reichel, but you have to be mindful that the index then counts from the start of the matrix range. Here the matrix starts at column D, so D2 is index 1 instead of the 4 it is with my original method, and you'd need to adjust the lookup table accordingly.
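The "look up which column to read, then read it" idea can also be sketched outside Excel. The Python below is an illustration only: it assumes each report row is a dict keyed by column letter, takes the status-to-column mapping from the question's list, and computes the delay as today minus the reference date, as in the question's original formula. The names REFERENCE_COLUMN and delay_days are hypothetical.

```python
from datetime import date

# Status -> column holding the reference date (taken from the question's list).
REFERENCE_COLUMN = {
    "Confirmed": "D",
    "Unloaded": "D",
    "Unloading": "D",
    "Not confirmed": "S",
    "Started": "D",
    "In route": "S",
    "Pick-up pending": "E",
    "Prepared": "S",
}


def delay_days(row, today):
    """Return today minus the reference date in days, or None if it cannot be computed.

    row: dict keyed by column letter (assumed layout), with the status in "C".
    """
    column = REFERENCE_COLUMN.get(row.get("C"))  # plays the role of the VLOOKUP
    if column is None or row.get(column) is None:
        return None  # plays the role of IFERROR(...;"NULL")
    return (today - row[column]).days


row = {"C": "In route", "D": date(2024, 1, 1), "E": date(2024, 1, 5), "S": date(2024, 1, 10)}
delay = delay_days(row, today=date(2024, 1, 15))
```

Passing today as a parameter, rather than calling date.today() inside the function, keeps the calculation repeatable.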

Dynamically creating a pivot table using fuzzy matching

So, I'm constantly being given data in new and different formats. I'm on a crusade to get my work to standardize data for easy use, and if I manage to convince the powers that be to do so, this problem becomes entirely moot. Until then, I have the following problem:
I get data in a variety of ways. Sometimes my gross sales are called total sales. Sometimes gross sales before discounts, total sales before discounts, Gross_Sales, etc. Discounts, deductions, exempt amounts, etc. form another column. So on and so forth. I'd like to be able to do the following:
1) Figure out what columns I want,
2) Turn those columns into a pivot table.
For part 1, I have two options, and I'm wondering if there are any more. The first is to use Microsoft's fuzzy-matching add-in to help me match; I'd have a separate tab dedicated to fuzzy matching each column I need. The second is to generate a long list of all the variants and test each one until I find a hit, assign it, and move on to testing the next one.
The second part is turning all of this into a pivot table. The resources I have so far are https://www.thespreadsheetguru.com/blog/2014/9/27/vba-guide-excel-pivot-tables and How to Create a Pivot Table in VBA
Is there a better method? Is there another way?
Edit: Slightly better method - Grab the data columns, place them into a table, and pivot everything off of that table - it removes the need to re-create pivot tables, just need to move the data over.
Having the same problem, I use a mix of your two methods.
My data consists of a bunch of logs for rejected x-ray images, and the reject reason is a free text field. My solution was to create a table where the first column contains my desired output categories, and then each subsequent column contains a different variation of it.
For example, a row might have (column one/output first entry):
Positioning, POS, Positioning Error, Patient Positioning
Note that these are all fairly different from each other. This is where the fuzzy matching comes in: it captures the smaller differences and misspellings around those variants. When the fuzzy matching section decides a given reason matches a column's entry, the reason is replaced with the appropriate desired output reason from column 1 of the table. In my example, a reason of 'Possitioning Err' [sic] would match column 3 (Positioning Error) and then get converted to Positioning.
Then wash, rinse, repeat over the rest of your data as needed. This approach was super useful and fairly flexible in helping standardize my data. It was also computationally more expensive, but I guess you'd only need to run the matching portion once.
As for the actual mechanics of going about doing this - I use 2010, so no inbuilt functionality. I run the fuzzy matching code on a temporary worksheet until best percentage matches are found, and then overwrite the actual source data afterwards.
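The variant-table approach described above can be sketched in Python with the standard library's difflib. This is not the answerer's code, and the answer does not say which matching algorithm it uses. The "Positioning" row comes from the example; the "Motion" row, the 0.6 cutoff, and the names VARIANTS and normalize are assumptions made for illustration.

```python
from difflib import get_close_matches

# Desired output category -> known variants. The "Motion" row is hypothetical.
VARIANTS = {
    "Positioning": ["Positioning", "POS", "Positioning Error", "Patient Positioning"],
    "Motion": ["Motion", "Patient Motion", "Motion Artifact"],
}

# Flatten to a lowercase variant -> category lookup.
CATEGORY_BY_VARIANT = {
    variant.lower(): category
    for category, variants in VARIANTS.items()
    for variant in variants
}


def normalize(reason, cutoff=0.6):
    """Map a free-text reason to its output category, or None if nothing is close enough."""
    matches = get_close_matches(
        reason.strip().lower(), list(CATEGORY_BY_VARIANT), n=1, cutoff=cutoff
    )
    return CATEGORY_BY_VARIANT[matches[0]] if matches else None


category = normalize("Possitioning Err")
```

The same function could map column headers such as "Total Sales" or "Gross_Sales" onto a single canonical name before the pivot table is built. A higher cutoff makes the matching stricter.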

Text Manipulation nestled within a Query (Or ArrayFormula) (Google Sheets)

I'm trying to Query some data in my spreadsheet, returning a manufacturer based on product code. We code our products with a three digit suffix that corresponds to different customers. I know the codes but people viewing the sheet may not.
Right now, I'm trying to split the suffix from the product code and perform the query in the same formula.
I can do this in two steps, splitting the suffix from the code and then querying just the suffix, but I want to know if I can do it all in one formula. My current formula returns the data I want, but it does not fill the entire range of the sheet. I would rather have this happen automatically, as the workbook will be dynamic.
My current formula is:
=QUERY(CxSeries,"select B where C CONTAINS '"&right(Code,3)&"' ")
https://docs.google.com/spreadsheets/d/190kom4q0XOJP4UdLTJpZf5tuJCQTflcuokRp_FJ4pBc/edit?usp=sharing
I'm not sure if QUERY is the right way to go about this, but I'd prefer to stick to it (just because I honestly can't wrap my head around ArrayFormulas).
Thank you,
Clear all formulas you have in column C and enter in C7
=ArrayFormula(vlookup(regexextract(D7:D16,"-(\d+)$")+0, {Sheet5!C6:C,Sheet5!B6:B}, 2, 0))
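To show what that formula does step by step, here is a rough Python equivalent. It is a sketch only: it assumes product codes end in a hyphen followed by digits (which is what the pattern "-(\d+)$" matches), and the lookup table contents and names are made up for illustration.

```python
import re

SUFFIX = re.compile(r"-(\d+)$")  # same pattern as in the REGEXEXTRACT call


def name_for(code, names_by_suffix):
    """Extract the numeric suffix from a product code and look it up.

    names_by_suffix: dict of numeric suffix -> name (hypothetical data,
    standing in for the {Sheet5!C6:C, Sheet5!B6:B} array).
    """
    match = SUFFIX.search(code)
    if match is None:
        return None
    suffix = int(match.group(1))  # the "+0" in the formula also converts text to a number
    return names_by_suffix.get(suffix)


names_by_suffix = {101: "Acme", 202: "Globex"}
codes = ["WIDGET-101", "GADGET-202", "NOSUFFIX"]
names = [name_for(code, names_by_suffix) for code in codes]
```

The list comprehension plays the part of ArrayFormula: one expression is applied to every code in the range.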

Excel - How do I find all relevant rows by typing unique invoice# listed Col A

I have a worksheet with 10 columns and a data range of A1:J55. Column A has the invoice # and the rest of the columns have other demographic data. The goal is to type the invoice number in a cell and display all the rows matching that invoice number in column A.
Besides the AutoFilter function, the only thing that comes to mind is VBA. Please advise on the best way to get the data. Thanks in advance for your help.
Alright, I'm pretty proud of this one. Again avoiding VBA, this one uses the volatile formula OFFSET to keep moving its VLOOKUP search down the table until it's found all matches. Just make sure you paste enough rows of the formula that if there are many matches, there's room for all of them to appear. If you put a border around your match area then it would be clear if you ever ran out of room and needed to copy down the formula some more.
Again, in the main section, it's just a single formula (using index):
=IFERROR(INDEX($A$1:$J$200,$M3,MATCH(N$2,$A$1:$J$1,0)),"")
This gets to be so simple because the hard work of the lookup is done by an initial column which looks up the next row that matches the invoice number. It has the formula:
=IFERROR(MATCH($L$2,OFFSET($A$1:$A$200,M2,0),0)+M2," ")
Here is the working example that goes with those formulas:
Let me know if you need any further description of how it works, but it mostly uses the same rules as above so that it's robust in copying and moving around.
I've uploaded the Excel file so you can play with it, but everything you need to reproduce this feature should be in this solution.
Google Docs - Click link and hit Ctrl+S to download and open in Excel.
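The helper column in this answer restarts the search just below the previous hit. The Python sketch below shows that logic; it assumes the invoice column is a plain list whose first item is the header, and the function name is hypothetical.

```python
def find_all_matches(invoice_column, wanted):
    """Return the 1-based row numbers of every cell equal to `wanted`.

    Each search starts just below the previous hit, which is what
    MATCH over OFFSET(range, previous_row, 0) does in the helper column.
    """
    rows = []
    start = 0  # rows already skipped, like the previous helper-column value
    while True:
        try:
            position = invoice_column.index(wanted, start)
        except ValueError:
            break  # no further match, like IFERROR returning " "
        rows.append(position + 1)  # convert the 0-based index to a 1-based row
        start = position + 1       # continue below the row just found
    return rows


column_a = ["Invoice", "A1", "B2", "A1", "C3", "A1"]
matching_rows = find_all_matches(column_a, "A1")
```

Each returned row number corresponds to one value in the helper column, which the INDEX formula then uses to pull the remaining fields.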
A popular solution to this problem is a simple VLOOKUP. Look up the invoice number the user types in against the table A1:J55, and then return an adjacent column's data.
Here's an example of it working:
The formula in the highlighted cell is:
=VLOOKUP($L3,$A:$J,MATCH(N$2,$1:$1,0),FALSE)
What's nice about this formula is you only need to type it once and then you can copy it across and it'll automatically pick out the correct column of the table (that's the match part). The rest is very simple:
The first part says lookup value $L3 (the invoice number typed in),
The second part says look it up in range $A:$J (which is where your table is located). I've shown how you can select the entire columns $A:$J so that you can add and remove data without worrying about adjusting the range in your lookups. (Excel takes care of optimizing the formula so that unused cells aren't checked.)
The third part picks the column from which the resulting data will be drawn once a matching row is found.
The FALSE part is an indication that the invoice number must match exactly (no approximate matching allowed)
The $ signs ensure that fixed ranges like the location of your source table ($A:$J) and your lookup value ($L3) don't get automatically changed as you copy the formula across for multiple columns.
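The parts of the formula described above map onto a short Python sketch. It assumes the table is a list of rows with the header row first and the invoice number in the first column; the column names and data are made up for illustration.

```python
def vlookup(table, key, column_name):
    """Return the value under `column_name` from the first row whose first cell equals `key`.

    table: list of rows, header row first (assumed layout).
    """
    header, *body = table
    column = header.index(column_name)  # the MATCH(N$2,$1:$1,0) part
    for row in body:
        if row[0] == key:  # exact match only, like passing FALSE
            return row[column]
    return None  # Excel would show #N/A here


table = [
    ["Invoice", "Name", "City"],
    ["1001", "Smith", "Boston"],
    ["1002", "Jones", "Denver"],
]
city = vlookup(table, "1002", "City")
```

Like VLOOKUP, this returns only the first matching row, so it suits invoice numbers that appear once in column A.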
The formula is pretty easy to adapt if you want to move around your table and the area where you do your lookup. Here's an example:
Bonus
If you want to add a little spiff, you can add a dropdown to the Invoice # field so that the user gets auto-completion and the option to browse existing values like so: