This Excel doc is kind of like a guestbook: it holds numerous people's names and the date each one visited (written like 11/17/2014).
The doc is sorted by date, so there are around 100 or 200 names for 11/17/2014, then 200 or so for 11/18/2014, and so on for a bunch of consecutive dates.
I want to write a formatting rule that highlights names that are duplicated within a single day in the date column. The goal is to get rid of duplicates and have an accurate count of people visiting per day.
Things I tried:
regular dupe checker in conditional formatting - it is easy to run a dupe check all within one column. But this is a dupe check based on two columns. There will be many dupes for visitors returning daily.
The built-in formatting rule custom formula writer - it is easy to say "highlight all occurrences where column A cell and column B cell are equal" but not if the two cell patterns occurred more than once.
The Macro writer - I'm pretty rusty on Visual Basic, so I might be faster running the generic dupe check on the 100k or so entries by manually highlighting each day's range.
TL;DR - Highlight all where 'name' and 'date' pattern occur more than once.
Any suggestions?
The conditional formatting formula would be:
=COUNTIFS($A:$A,$A1,$B:$B,$B1)>1
Dates are in column A and names in column B (it makes no difference if they are the other way around). Select the whole range with A1 as the active cell, then add the conditional formatting rule. It WILL be slow, though.
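To make the logic of that rule easier to check outside Excel, here is a minimal Python sketch of the same idea: count each (date, name) pair and flag every row whose pair occurs more than once. The function name and the sample data are made up for illustration. Note that COUNTIFS ignores case when comparing text, while this sketch compares strings exactly.

```python
from collections import Counter

def flag_duplicates(rows):
    """Return one boolean per row: True when its (date, name) pair occurs more than once."""
    counts = Counter(rows)  # rows must be hashable, e.g. (date, name) tuples
    return [counts[row] > 1 for row in rows]

# Hypothetical sample data
visits = [
    ("11/17/2014", "Ann"),
    ("11/17/2014", "Bob"),
    ("11/17/2014", "Ann"),  # same name, same day: flagged
    ("11/18/2014", "Ann"),  # same name, different day: not flagged
]
flags = flag_duplicates(visits)
```

Like the COUNTIFS rule, this flags every occurrence of a repeated pair, not just the second and later ones.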
I have a Google sheet with 186,000 rows. I have included a dummy spreadsheet to give you an idea of the data. I need to select ALL duplicates, that includes rows where the first names might not match (i.e. Cathy vs Catherine), but they still refer to the same individual. There are also instances where the addresses might be slightly different (like omitting "Ave" in one row but including it in another).
I need to write a query to account for all of these instances, including just regular duplicates. Or I could do multiple queries and just copy the results into one spreadsheet. In any case, I'm at a loss.
Dummy spreadsheet. I have included one example of each case I am trying to account for (3 total).
I have something that may be useful. See my example sheet here:
https://docs.google.com/spreadsheets/d/19h28go-nzunW6zexcMD61QjySUKJA3Q2Ci2Hu3OMuAg/edit?usp=sharing
Basically I build a key value for each record, along the lines you asked for.
All of the last name, part of the first name, part of the address, and the ZIP code. Other variations are easily added.
The formula is just a string concatenation of parts of these fields, as follows:
=ArrayFormula(
IF(ROW(A2:A)=2,"DupeKey",
IF(A2:A<>"",A2:A &LEFT(B2:B,$N$1) &LEFT(G2:G,$N$1) &K2:K,"")))
A valuable option is to allow varying the length of the required matching sub-string, from the first name and address. This is controlled for the formula by selecting a substring length of 1 to 6 in cell N1, and seeing how this changes the duplicate records that are found. The shorter the substring length, the more duplicate (or possibly duplicate) records will be found.
Conditional formatting is used to highlight the duplicate records.
And you can use the column filters to sort by different data columns - to put all of the duplicates at the top, sort by column N, in Z-A order, and exclude blanks.
Note that this isn't perfect. If someone accidentally types a space, or anything else, at the start of a data field, it will not be considered a duplicate. Better logic would be required to catch those.
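For anyone who wants to try the key idea outside the sheet, here is a rough Python sketch of the same concatenation. It assumes the four inputs correspond to the last name, first name, address and ZIP columns used in the formula; the function name, parameter names and sample values are hypothetical, and n plays the role of cell N1.

```python
def dupe_key(last, first, address, zip_code, n=3):
    """Build a key from the whole last name, the first n characters of the
    first name and address, and the whole ZIP code."""
    if not last:
        return ""  # mirrors the formula returning "" for blank rows
    return last + first[:n] + address[:n] + zip_code

# Hypothetical records that refer to the same person
key_a = dupe_key("Smith", "Cathy", "12 Oak Ave", "90210", n=3)
key_b = dupe_key("Smith", "Catherine", "12 Oak", "90210", n=3)
```

With n=3 the two keys are identical, so the records are treated as duplicates; with n=6 the first-name parts differ and they no longer match. Stripping whitespace from each field before building the key would be one way to handle the stray-space case, but that goes beyond what the sheet formula does.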
Let me know if this helps.
You can use these formulas:
If cell B3 contains "John", write "match"; otherwise write "no":
=IF(REGEXMATCH(B3,"John"), "match", "no")
If cell F2 contains the content of cell B3, write "match"; otherwise write "no". SEARCH returns an error rather than 0 when nothing is found, so wrap it in ISNUMBER:
=IF(ISNUMBER(SEARCH(B3, F2)),"match","no")
References:
REGEXMATCH
SEARCH
Good day. I'm fairly new to VBA. I have several worksheets in a workbook for a work station that is prone to shortages or surpluses in daily delivery as a result of human error. I want to search the entire workbook and extract data from Column K (which displays the shortage or surplus), but only for rows that meet certain criteria in Column A (date of delivery) and Column D (location of delivery). In other words, I would like to search Column K to find out whether there was a shortage or surplus on any date I choose. Any form of assistance is highly appreciated. Thanks.
Your solution could include an InputBox and the Range.Find Method, where you search column A for date, read in the row, and look at column K of the same row. In fact, the Range.Find example is pretty easily modified for your needs.
But you mention location in Column D, so what constraints are on column D? You'll also need to be more specific about where/how you want to extract the data from column K.
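As a language-neutral illustration of that lookup (not VBA, and not the Range.Find API), here is a small Python sketch: match the date in column A, optionally match the location in column D, and read column K from the same row. The row layout as dictionaries keyed by column letter, the function name and the sample values are all assumptions for illustration.

```python
def find_discrepancies(rows, date, location=None):
    """Return the column K values of rows whose column A equals `date`
    and, if `location` is given, whose column D equals `location`."""
    return [
        row["K"]
        for row in rows
        if row["A"] == date and (location is None or row["D"] == location)
    ]

# Hypothetical delivery rows: A = date, D = location, K = shortage/surplus
deliveries = [
    {"A": "2015-07-30", "D": "Depot 1", "K": -5},
    {"A": "2015-07-30", "D": "Depot 2", "K": 0},
    {"A": "2015-07-31", "D": "Depot 1", "K": 12},
]
on_date = find_discrepancies(deliveries, "2015-07-30")
at_depot = find_discrepancies(deliveries, "2015-07-30", "Depot 1")
```

In VBA the same loop would run once per worksheet in the workbook; how the results should be collected still depends on the answers to the questions above.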
I have data on 'JobSheet1' in Column D, I have Invoice Numbers in ascending order (some are repeated for different products on same order), in Column E, I have amounts i.e £50.00.
On a second sheet 'InvoicesSheet1' in Column B, I have the invoice numbers and Column C is where I would like the total for each invoice number to appear.
Can anyone help with very simple VBA or a formula that will look up the invoice number it is sitting next to in 'JobSheet1' Column D and add up all the matching amounts from Column E?
Scott Craner is right: with the schema you described, you will get the result you want by entering the following into cells Ci:Cj (where "i" and "j" are the start and end rows of your table, respectively):
=SUMIFS('JobSheet1'!E:E,'JobSheet1'!D:D,B{i...j})
If this doesn't work, likely issues you need to watch out for would be:
Sheets are not named exactly as you typed here. Maybe they have a leading or trailing space.
Your copy of Excel/Windows may be set to a different regional setting, which requires that formula parameters be separated by semicolons (;) instead of commas.
Your invoice numbers may not be typed precisely the same in the two different sheets.
Your amounts in column E may not be stored as numeric values. You can test for this by selecting a few values from column E - if Excel doesn't show their sum and average in the bottom right corner, they are stored as text and you can't perform math operations on them.
I'd need to see your data to see what could be the issue, but that's not what this forum is for - Try constructing a new table with dummy data set up exactly as you described it here and try using this formula, to verify if it works. Then, adjust accordingly as needed.
Assuming your first invoice number in InvoicesSheet1 is in cell B3, you can use:
=SUMIF(JobSheet1!D:D,InvoicesSheet1!B3,JobSheet1!E:E)
If it's in another row, replace InvoicesSheet1!B3 with the relevant cell where your data starts. Copy the formula down for the other invoices.
SUMIFS is not necessary with just one lookup condition.
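If it helps to see what SUMIF is doing with this layout, here is a minimal Python sketch of the same aggregation: add up the amounts for each invoice number, then look up the total per invoice. The function name and the sample invoice numbers and amounts are made up for illustration.

```python
from collections import defaultdict

def invoice_totals(job_rows):
    """Sum the amounts (column E) for each invoice number (column D)."""
    totals = defaultdict(float)
    for invoice_no, amount in job_rows:
        totals[invoice_no] += amount
    return dict(totals)

# Hypothetical JobSheet1 rows: (invoice number, amount)
jobs = [("INV001", 50.0), ("INV001", 25.5), ("INV002", 10.0)]
totals = invoice_totals(jobs)
```

The sketch also shows why the amounts must be numeric: if they were stored as text, the addition would not work, which is the same failure described for column E above.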
I solved this by amending the SUMIFS suggestion from ScottCraner and this is what I ended up with
=SUMIFS(Jobs!K:K,Jobs!A:A,D3)+SUMIFS(Jobs!L:L,Jobs!A:A,D3)
Does the job!
Please see here for snippet from my spreadsheet, what I am trying to do is fairly simple, however I am unable to find a way to do this after searching through online forums extensively.
Column A contains my order numbers and column B the line items that correspond to each order number.
Column D contains the delivery date as it appears on my printed order sheet. You will see this only pulls through for the first line item on each order - the raw data displays this way and there is no way to change the raw data.
Column E simply extracts just the date from the format Delivery Date: dd/mm/yyyy.
What I would like then, is for column E to have the delivery date copied down to all corresponding cells for each order number - so as per the attached sheet, 30 Jul 2015 would appear for all line items that correspond to order no #1192.
I feel VLOOKUP etc. will only work to manipulate the data once I have these dates copied down. I have tried INDEX/MATCH but it doesn't seem to do what I want it to do.
Is there a way to copy down the dates for all line items relative to their order number? I understand that it will probably require copying full lines down column D first and keeping the formula in column E to extrapolate just the date.
Any help is much appreciated
You don't need a macro for this. There are many ways to go about this, I'll show you two, you can figure out the one you like from there.
Select column E, go to Home, Editing, Find & Select, Go To Special. Choose Formulas (if the values are not formulas, choose Constants) and tick only Errors. Now type =E2 (or whatever cell is above your active cell) and hit Ctrl+Enter. It is a good idea to copy the whole of column E and paste it back as values after this.
Another way would be entering this formula in column F (cell F2, then fill it down). Use a comma instead of the semicolon if your regional settings use commas as the argument separator:
=IFERROR(E2;F1)
Or you could combine this with your original formula, or use a macro to insert the formula in the empty/#error cells etc...
Assuming you are using =RIGHT(D2,LEN(D2)-FIND(":",D2)-1) in E2, then you are well on the way to a solution.
You also mentioned INDEX/MATCH which, if used in column F will pull the Delivery Date in Column E for each Order No:
=INDEX($E$2:$E$31,MATCH(A2,$A$2:$A$31,0))
This finds the position of the first match for your Order No and returns the Delivery Date from column E.
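The INDEX/MATCH approach can be sketched in Python as follows: remember the date found on the first row of each order number, then give that date to every row of the same order. The function name and the sample rows are hypothetical, and None stands in for the cells where column E has no date.

```python
def fill_delivery_dates(rows):
    """rows: list of (order_no, date_or_None).
    Return one date per row, taken from the first row of the same order."""
    first_date = {}
    for order_no, date in rows:
        # setdefault keeps only the first value seen, like MATCH(..., 0)
        first_date.setdefault(order_no, date)
    return [first_date[order_no] for order_no, _ in rows]

# Hypothetical order lines
lines = [
    ("#1192", "30 Jul 2015"),
    ("#1192", None),
    ("#1193", "31 Jul 2015"),
    ("#1193", None),
]
filled = fill_delivery_dates(lines)
```

As with the formula, this relies on the date sitting on the first line item of each order; if it appeared on a later line, the first match would return the empty value.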
Okay, so here is a sample of my current code. I have this in several cells, where B7:H7 hold the absence type it is searching for.
=COUNTIFS('Old Data'!$A:$A,Tracking!A8,'Old Data'!$B:$B,$E$7, 'Old Data'!D:D, ">"&$B$5, 'Old Data'!D:D, "<"&$C$5)
What this formula does is look through the data sheet for rows where the employee name is found, compare the absence types, sort them into their respective columns and count them. The last part of the formula restricts the search to a date range.
That being said, I need to add conditions to this that I'm not sure I can without taking it into VBA. In the "Old Data" Sheet in column D I have start time, which displays in MM/DD/YY HH:MM format. In Column E I have End Time, which displays in the same MM/DD/YY HH:MM format.
I need to have a way to
A.) Have the program count the number of days between these dates and add +1 to the count for each respective day.
B.) If the start and end date are the same, have the program compare the number of hours. If it is less than 4, only add .5 to the counter.
My first thought is to scratch the countifs formula and loop through and parse it out using VBA, but I thought I'd check first to see if it can be done with just the formula as the power of the built in function has been pretty surprising to me so far.
I think I should probably take this from a formula to a VBA function and call it in the cells, but I'm not entirely sure, pretty new to the VBA/Excel scene.
Also, I'm in Excel 2007.
Thanks for any input on this issue!
It's possible to do with a formula but not with COUNTIFS. This array formula should do it
=SUM(IF(('Old Data'!$A$2:$A$100=Tracking!A8)*('Old Data'!$B$2:$B$100=$E$7)*('Old Data'!$D$2:$D$100>$B$5)*('Old Data'!$D$2:$D$100<$C$5),IF('Old Data'!$E$2:$E$100-'Old Data'!$D$2:$D$100<"4:00"+0,0.5,INT('Old Data'!$E$2:$E$100)-INT('Old Data'!$D$2:$D$100)+1)))
Confirm it with CTRL+SHIFT+ENTER.
I restricted the data range to rows 2 to 100; adjust as required. Whole columns are possible, but that may slow down the formula considerably.
To count workdays only change to this version:
=SUM(IF(('Old Data'!$A$2:$A$100=Tracking!A8)*('Old Data'!$B$2:$B$100=$E$7)*('Old Data'!$D$2:$D$100>$B$5)*('Old Data'!$D$2:$D$100<$C$5),IF('Old Data'!$E$2:$E$100-'Old Data'!$D$2:$D$100<"4:00"+0,0.5,NETWORKDAYS('Old Data'!$D$2:$D$100+0,'Old Data'!$E$2:$E$100+0))))
You can also exclude holidays if you add a holiday range to the NETWORKDAYS function.
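To sanity-check the per-row logic of the first array formula, here is a small Python sketch of its inner IF only: a duration under four hours counts as 0.5, anything else counts as the inclusive number of calendar days. The name, absence type and date range filters are left out, the NETWORKDAYS variant is not covered, and the function name and sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

def absence_days(start, end):
    """Mirror the inner IF of the array formula for a single row."""
    if end - start < timedelta(hours=4):
        return 0.5  # short absence counts as half a day
    # INT(end) - INT(start) + 1: whole calendar days, inclusive
    return (end.date() - start.date()).days + 1

half_day = absence_days(datetime(2015, 1, 5, 8, 0), datetime(2015, 1, 5, 11, 0))
full_day = absence_days(datetime(2015, 1, 5, 8, 0), datetime(2015, 1, 5, 16, 0))
three_days = absence_days(datetime(2015, 1, 5, 8, 0), datetime(2015, 1, 7, 17, 0))
```

Note that, like the formula, this tests only the duration, so an absence shorter than four hours that crosses midnight would also count as 0.5.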