Comparing text files using Excel - vba

This is not a question about excel vba in particular. The question is what approach would be the best.
Here is the problem I have. I have 2 text files (one for the current month and one for a prior month) that have account information as the first piece of information, followed by other information that I am not concerned about. Here is an example. The account number is the first field (1030887-7 below) and, unfortunately for me, the records are not sorted, i.e. the account numbers could be in any order:
1030887-7 JAMES SMITH 12/15/13 03/05/13 212.50 180+
This kind of information exists in both files. I need to create a report of what is new in the current month file and what was carried over from the prior month. I am not concerned about any that was present in the prior month and not in the current.
I was thinking of reading one set of information into an array, sorting it, and then reading the second file to do the comparison. Can anyone suggest a better method? The text files have almost 20000 lines in them.
I should mention that the text file I am trying to compare is a report, so it has multiple headers, trailers, etc., and that is complicating the comparison. Also, these accounts are by branch and I have to ensure that I don't mix two of them up. It seems doable but a little complex.

Instead of using Excel, I might suggest using a tool like diff. Please see "Modern version of WinDiff?" for a discussion.
(Win)Diff will perform a line-by-line comparison and tell you what lines are changed, deleted, or inserted.
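Since the records are unsorted and the report has headers and trailers mixed in, another option is to pull out just the account numbers and compare them as sets, branch by branch. Here is a minimal sketch in Python (not VBA), under two assumptions that are not in the question: an account line starts with digits, a hyphen and one digit (like 1030887-7), and each branch section begins with a hypothetical header line such as "BRANCH: 012". Both patterns would need adjusting to the real report.

```python
import re
from collections import defaultdict

# Assumption: an account line starts with digits, a hyphen, and one digit,
# e.g. "1030887-7 JAMES SMITH ...". Headers and trailers will not match.
ACCOUNT_RE = re.compile(r"^\s*(\d+-\d)\s")
# Assumption (hypothetical): each branch section starts with a line such as
# "BRANCH: 012". Adjust this pattern to the real report header.
BRANCH_RE = re.compile(r"^\s*BRANCH:\s*(\S+)")


def accounts_by_branch(lines):
    """Map each branch to the set of account numbers listed under it."""
    accounts = defaultdict(set)
    branch = None  # accounts seen before any branch header go under None
    for line in lines:
        header = BRANCH_RE.match(line)
        if header:
            branch = header.group(1)
            continue
        record = ACCOUNT_RE.match(line)
        if record:
            accounts[branch].add(record.group(1))
    return accounts


def compare_months(prior_lines, current_lines):
    """Return (new, carried_over) as dicts of branch -> sorted account list."""
    prior = accounts_by_branch(prior_lines)
    current = accounts_by_branch(current_lines)
    new, carried = {}, {}
    for branch, accts in current.items():
        before = prior.get(branch, set())
        new[branch] = sorted(accts - before)      # only in the current month
        carried[branch] = sorted(accts & before)  # in both months
    return new, carried
```

Set lookups make sorting unnecessary, and 20000 lines per file is small enough to hold in memory. Accounts that appear only in the prior month are simply never reported, which matches what was asked.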

Related

Compare Dates in a column corresponding to Changes in another row

I have a large sample of medical data in an Excel worksheet that I need to analyze for patterns.
I also put it into an MS Access database to do my first filters and so on.
I have the patients' info, with test results (POS or NEG) and the dates of the samples.
I need to be able to check, for each patient, when the results change from POS to NEG and from NEG to POS,
and compare the dates of those two samples.
So far I was doing it manually, which isn't viable for my sample.
I was trying to do something in SQL, but that didn't work out for me.
I am also trying to do some VBA or Excel formulas, but I admit I'm getting kind of stumped.
I know I should do some kind of for-each-cell loop or something, but I really am lost.
I already grouped each patient's info together using subtotals and so on.
Your help or at least pointers would be greatly appreciated :D
Here's an example of my data.
Use a formula that, when the name is the same as in the row before and the POS/NEG result is different from the row before, gives you the number of days in between, and a blank otherwise.
Of course it will give you an error if you try to use it on the 1st line, so just enter it in the 2nd line and copy/paste it into all the rest.
This should give you the basis for the rest of your analysis.
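To make the row-versus-previous-row logic concrete, here is a rough sketch of the same idea in Python rather than as a worksheet formula. It assumes the rows are already sorted by patient and then by sample date, and that each row is a (patient, date, result) tuple; those names and that layout are my own assumptions, since the sample data is not shown.

```python
from datetime import date


def result_changes(rows):
    """Find result flips between consecutive rows of the same patient.

    rows: (patient, sample_date, result) tuples sorted by patient, then date.
    Returns (patient, previous_result, new_result, days_between) tuples.
    """
    changes = []
    previous = None
    for patient, sample_date, result in rows:
        if previous is not None:
            prev_patient, prev_date, prev_result = previous
            # Same patient as the row before, but a different POS/NEG result
            if patient == prev_patient and result != prev_result:
                days = (sample_date - prev_date).days
                changes.append((patient, prev_result, result, days))
        previous = (patient, sample_date, result)
    return changes


# Hypothetical sample data, for illustration only
sample = [
    ("A", date(2020, 1, 1), "NEG"),
    ("A", date(2020, 1, 11), "POS"),
    ("B", date(2020, 1, 5), "POS"),
]
print(result_changes(sample))
```

The first row is never compared against anything, which is the same reason the formula has to start on the 2nd line.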

What is the best way to approach transferring an 'empty row delimited' excel spreadsheet into two tables in an SQL database?

I have a collection of excel spreadsheets that are formatted... less than ideally.
I'm testing out some solutions involving SQLBulkCopy and OleDB, but I'm a bit concerned about how to handle the format of this sheet.
I was considering writing a custom Insert statement, but would like to see if there may be some easier way to implement a heuristic.
Below is a sample of the data I will be parsing:
The highlighted columns are the ones I'll be loading into the two tables. One table will hold order #s, and the other table will hold all the lines below that order number.
Any suggestions on tackling this would be lovely. The Excel sheets are hand entered, so some weird cases exist (one order number with multiple carriers, which raises the question of whether I should treat the first row with the order number as a line in the database structure I designed).
I'm implementing this importer within VB.net, to my dismay, to avoid being looked at funny by my coworkers :).
One approach would be to save the worksheet to a text file (e.g., CSV) and then use AWK to split it at the empty row. Some examples are in this SO answer: Bash how to split file on empty line with awk
You could then import the CSV files directly into the database.
Amusingly, if I wrote anything in VB.NET I'd definitely get looked at funny by my coworkers.
So I'd use a library called EPPlus to read the Excel file and not have to worry about converting it. How you do the blank line detection is an open question; checking that the Value of ten cells on the row is Nothing or empty would suffice. Then take the next row as your parent, and proceed with subsequent rows as children until the next blank.
Take a look at this answer for more info on how to detect blank rows in Excel. If you get stuck turning any of the C# into VB, shoot us a question. Online converters exist because the two languages compile to the same .NET intermediate language under the hood.
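The parent/children grouping itself is independent of the library used to read the sheet. As a rough illustration in Python (not VB.NET or EPPlus), assuming the rows have already been read into lists of cell values with None or "" for empty cells, it could look like this. The function name and row layout are hypothetical.

```python
def split_orders(rows):
    """Group rows into (parent, children) pairs, using blank rows as separators.

    rows: a list of rows, each a list of cell values (None or "" when empty).
    """
    def is_blank(row):
        return all(cell is None or str(cell).strip() == "" for cell in row)

    orders = []
    block = []
    for row in rows:
        if is_blank(row):
            if block:  # close the current block; extra blank rows are ignored
                orders.append((block[0], block[1:]))
                block = []
        else:
            block.append(row)
    if block:  # the last block has no trailing blank row
        orders.append((block[0], block[1:]))
    return orders
```

Each parent would become a row in the orders table and each child a row in the lines table. Whether the parent row should also be stored as a line (the multiple-carrier case) is a design decision this sketch leaves open.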

VBA challenge: Spent a while with VBA, just messing this up more

So I'm in a rush to put together an Excel file for something quick and dirty at work. I've spent several days learning VBA / macros and have learned many of the individual pieces needed for this, but putting them all together is just not working the way I want.
So I'm taking something similar to the following table of data and trying to reorganize it:
(I can't post the image bc of rep)
![Sample Data Table] https://imgur.com/a/FrwEp
Columns D-F each hold a list of items. What I want to do is start with the first date in column D, find the date range in column A where it fits, and copy the data there. For an expense report, for example, the data in e1:f1 would get copied over to b1:c1 since that row corresponds to its date range. In a nutshell, the dates in column A are income dates. Each one is set to pay all items listed on the right that are scheduled to be paid before the next pay date. See the finished example here:
![Final Sample Data Table] https://imgur.com/a/niaqB
How might you throw this together to make it work? I'm looking for simplicity, as I'm going to have to heavily modify it for what it's actually being applied to.
Sorry for the weird post, this is my first time creating a post myself :)
A handy way to get started with VBA is to record a macro of the actions you need to duplicate, and then analyze the macro's generated VBA code, line by line, revising the code to fit your needs.
(Also: Using Comparison Operators in VBA)
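Before writing the VBA, it may help to pin down the matching rule on its own: each item goes to the latest pay date that is on or before the item's date. Below is a small sketch of that rule in Python, not VBA. It assumes the pay dates are sorted ascending and that each item is a (due_date, description, amount) tuple; the tuple layout is a guess, since the linked images are not reproduced here.

```python
from bisect import bisect_right
from datetime import date


def assign_to_pay_dates(pay_dates, items):
    """Assign each item to the latest pay date on or before its due date.

    pay_dates: list of dates sorted ascending (column A in the question).
    items: (due_date, description, amount) tuples (columns D-F, layout assumed).
    Returns (schedule, unassigned); unassigned holds items due before the
    first pay date.
    """
    schedule = {pay_date: [] for pay_date in pay_dates}
    unassigned = []
    for item in items:
        due_date = item[0]
        # Index of the last pay date that is <= due_date, or -1 if none
        index = bisect_right(pay_dates, due_date) - 1
        if index < 0:
            unassigned.append(item)
        else:
            schedule[pay_dates[index]].append(item)
    return schedule, unassigned
```

In VBA the same rule is a nested loop or a comparison of each item date against the current and next pay dates, which is where the comparison operators come in.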

Pull data from multiple Excel sheets, count how many of a certain lead within a month

A continuation of a previous question... where I have run into another formula issue.
We have an Excel spreadsheet where we track our leads. My boss wants to know, specifically "how many leads for each month and how many of each source and the result."
The users on this forum were incredibly helpful, and gtwebb gave me this formula (thank you!):
=COUNTIF(Sheet1!A14:A21,12012)
Which worked beautifully for counting how many leads we had in each month.
Now I need to know how many leads came from the Web, SOI, VP/Sign, etc., WITHIN that month. So I'm hoping there is a formula where I can ask Excel to only look at a certain month, say 12012, and then how many leads we got from the Web or other places. I know I'll need to change our lead source for each formula, but I can't get the basic formula to work. I've tried: =countif(Sheet1!e2:e20,12012, if(Sheet1!m2:m20,Web)), and other variations on this, trying to elaborate on the original formula that worked correctly.
Thank you for your help!
A Pivot Table didn't really ring a bell with me, so I looked into a formula solution.
Pnuts posted a suggestion to try SUMIFS(). I could only get this to work if there was a column with "1" in it for every line.
A similar function, COUNTIFS(), seems to work. Here's a formula for counting the number of lines with a month of "12012" and a source of "Web":
=COUNTIFS(Sheet1!A:A,"=12012",Sheet1!B:B,"=Web")
It assumes that column A is month and column B is source. Good luck!
Your boss wants a Pivot Table made from the data. Unless you have other reasons for using formulas or VBA (not stated in the question), using the pivot table functionality instead will save you a lot of effort in this case.

Comparing two columns and matching values from separate worksheets

The goal of my project is to create an out-of-office program that will allow easy tracking and auditing of our SharePoint site, as it doesn't have a built-in system to do so. I have no background in VBA, but I have done quite a bit of Python. That being said, I've run into my first issue: I'm not sure how the syntax works or what commands I should be using to get the results I want, e.g. Sheets vs. Worksheet vs. Worksheets.
I have a workbook. One sheet is Raw Data, into which I import data from a SharePoint site. It displays the following columns:
Resource Name -- Absence Type -- ID -- Start Time -- End Time -- Created -- Modified by
The next sheet is called Tracking. On this sheet the user inputs the Resource Names they want to track into column A, and the remaining columns display the number of absences that name has, so it will look something like:
Resource Name -- Vacation -- Sick -- WFH
Clooney, George -- 2 -- 0 -- 7
A counter will run for each instance that appears in Raw Data and add to the count for the matching absence type.
I need a way to loop through Raw Data and look for the names that appear in the Tracking data. If possible, I'd like to store them in a third worksheet just for testing purposes. I know the logic I need to use, but what I don't know is the syntax to reference the sheets together. Any insight on the best way to accomplish this?
Question: I need to search Raw Data for every instance of a Resource Name from the Tracking sheet and store the matches in another worksheet.
If you don't want to use PivotTables (can be hard to search later) this is the way to do it with COUNTIFS. This formula goes in the "Sick" column of Tracking in row 2 (assuming row 1 is headers).
=COUNTIFS('Raw Data'!A:A,Tracking!A2,'Raw Data'!B:B,"Sick")
It assumes that in Raw Data Name is in column A and AbsenceType is in Column B, but it doesn't matter how many records there are.
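Since you mention a Python background, here is roughly what that COUNTIFS is doing, written as a Python sketch. It assumes the Raw Data rows have been reduced to (resource_name, absence_type) pairs; the function name and the default list of absence types are my own choices for illustration, not anything from your workbook.

```python
from collections import Counter


def absence_table(raw_rows, tracked_names, absence_types=("Vacation", "Sick", "WFH")):
    """Count absences per tracked name and absence type.

    raw_rows: (resource_name, absence_type) pairs taken from Raw Data.
    Returns {name: {absence_type: count}} for the tracked names only.
    """
    tracked = set(tracked_names)
    # One counter keyed by (name, type), like one COUNTIFS per Tracking cell
    counts = Counter(
        (name, kind) for name, kind in raw_rows if name in tracked
    )
    return {
        name: {kind: counts[(name, kind)] for kind in absence_types}
        for name in tracked_names
    }
```

Each cell of the Tracking sheet corresponds to one (name, absence type) lookup, which is why the formula can simply be copied across the Vacation, Sick and WFH columns with the last argument changed.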
The way I understand your question (and it wasn't easy), you are dealing with a bunch of timesheet info. You seem to be trying to count the number of instances of different kinds of time off that people are taking - whether that be vacation, sick or working from home(WFH).
I've never heard someone's name referred to as a "Resource Name" lol.
You really don't need to use VBA for this problem, at least not anything that you can't just record a macro for. It seems to be a rather simple problem that you can solve using a pivot table.
If you want to, you can set up a VLOOKUP reference to this pivot table to create the little form that you seem to be trying to create. But really, I think you're better off just teaching whoever is going to be using this about pivot tables. Let me know if I misunderstood your question and I'll be happy to delete this post.