I am working on a project where we need to search a set of our network drives to examine each file and look for credit card numbers and social security numbers. I have been trying to use the Cornell Spider program without success since it seems to crash every time I use it.
I would like to know if there is a way to use PowerShell, or another scripting language available on Windows, to perform an analysis (I am assuming a string match) that would match the patterns for credit card numbers and social security numbers (probably a regex). If there is a way, and since I am not a programmer, I was curious whether there is some code that I could use to do this. The ability to save/dump the results to a file (text or CSV) would be very helpful as well.
Any ideas or help that you can provide would be greatly appreciated.
=======================================================
Okay, I have been working on a test script and have come up with the following:
$spath = "C:\Users\name\Desktop\"
$opath = "C:\Users\name\Desktop\Results.txt"
$Old_SSN_Regex = "[0-9]{3}[-| ][0-9]{2}[-| ][0-9]{4}"
$SSN_Regex = "^(?!000)([0-6]\d{2}|7([0-6]\d|7[012]))([ -]?)(?!00)\d\d\3(?!0000)\d{4}$"
$CC_Regex = "^((?:4\d{3})|(?:5[1-5]\d{2})|(?:6011)|(?:3[68]\d{2})|(?:30[012345]\d))[ -]?(\d{4})[ -]?(\d{4})[ -]?(\d{4}|3[4,7]\d{13})$"
$CC_2_Regex = "^(\d{4}-){3}\d{4}$|^(\d{4} ){3}\d{4}$|^\d{16}$"
Get-ChildItem $spath -Include *.txt -Recurse | Select-String -Pattern $SSN_Regex | Select-Object Path,Filename,Matches | Out-File $opath
Get-ChildItem $spath -Include *.txt -Recurse | Select-String -Pattern $CC_Regex | Select-Object Path,Filename,Matches | Out-File $opath -Append
Get-ChildItem $spath -Include *.txt -Recurse | Select-String -Pattern $CC_2_Regex | Select-Object Path,Filename,Matches | Out-File $opath -Append
This seems to be working well. The problem is that if there is a space before or after the item to be matched, the regexes listed do not catch it. Is there something that I can do differently so that it will catch the item if it has a space before or after the pattern to be matched within a file?
See this PowerGUI.org thread for the solution: PowerShell Script to locate Social Security Numbers (SSN) and Credit Card numbers in files across the network.
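The `^` and `$` anchors in the patterns above require the number to fill the whole line, which would explain why a leading or trailing space defeats them. One possible fix (an assumption on my part, not necessarily what the linked thread does) is to swap the anchors for digit lookarounds. A minimal sketch, with a hypothetical `$SSN_Anywhere` variable and made-up sample lines:

```powershell
# Assumption: same SSN rule as $SSN_Regex above, with ^...$ replaced by
# lookarounds so the number may sit anywhere inside a line.
$SSN_Anywhere = '(?<!\d)(?!000)([0-6]\d{2}|7([0-6]\d|7[012]))([ -]?)(?!00)\d\d\3(?!0000)\d{4}(?!\d)'

# Hypothetical sample lines standing in for file content.
$sample = @(
    'SSN: 123-45-6789 on file',
    '   078-05-1120   ',
    'order 1234567890123'
)

# Collect just the matched text from every line that contains a hit.
$hits = $sample |
    Select-String -Pattern $SSN_Anywhere -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Value }

$hits
```

The first two sample lines match despite the surrounding text and spaces; the 13-digit order number does not, because the lookarounds refuse a match that is glued to further digits. For real files the same pattern could be used as `Get-ChildItem $spath -Include *.txt -Recurse | Select-String -Pattern $SSN_Anywhere | Select-Object Path, LineNumber, Line | Export-Csv $opath -NoTypeInformation`.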
For my example, I'm looking to compare a source file with a second file in which some attributes in the table have been changed.
What I want to achieve is
Sourcefile.csv
Newfile.csv
Deltafile.csv
(this would contain only the changed rows, i.e. the deltas, between the two files)
What I would like to achieve is that the whole row containing a change is exported as the delta, not just the changed column.
Rows that match in both files do not need to be exported.
100,Renie,Stav,Renie.Stav#yopmail.com,Renie.Stav#gmail.com,CHANGE
101,Neila,Germann,CHANGE,Neila.Germann#gmail.com,developer
I've looked at PowerShell, FC and SSIS incremental loading to see if any of them will work for my needs, but I need some guidance in the right direction. Any help is greatly appreciated! :)
**Current Method**
I looked in depth at the PowerShell I tried from https://www.reddit.com/r/PowerShell/comments/cea8ax/compare_2_csv_files_and_export_the_rows_that_do/
which is
# Compare work
$csv1 = Import-Csv -Path C:\Users\G23\Documents\Hackingfolder\source.csv
$csv2 = Import-Csv -Path C:\Users\G23\Documents\Hackingfolder\new.csv
$head = (Get-Content -Path C:\Users\G23\Documents\Hackingfolder\source.csv | Select-Object -First 1) -split ","
Compare-Object $csv1 $csv2 -Property $head -PassThru| Export-Csv C:\Users\G23\Documents\Hackingfolder\TheDiff.csv -NoTypeInformation
# Remove side indicator if you don't care to know where the diff came from
Compare-Object $csv1 $csv2 -Property $head | Select-Object -Property $head
Ignoring the side indicators, I get both versions of every row that does not match. It is not smart enough to know which one is the updated one. I want only the changed rows from the new file to be exported.
PowerShell output
instead of the desired output below, which contains only the changed rows in CSV form:
100,Renie,Stav,Renie.Stav#yopmail.com,Renie.Stav#gmail.com,CHANGE
101,Neila,Germann,CHANGE,Neila.Germann#gmail.com,developer
Thanks!
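One way this could be approached, sketched under assumptions: `Compare-Object -PassThru` tags each differing row with a `SideIndicator`, and `=>` marks rows that exist only in the difference (new) set, so filtering on it keeps just the updated rows. The column names and the third row below are hypothetical, since the real header is not shown in the question.

```powershell
# Hypothetical stand-ins for Sourcefile.csv and Newfile.csv.
$source = @'
Id,First,Last,Email1,Email2,Role
100,Renie,Stav,Renie.Stav#yopmail.com,Renie.Stav#gmail.com,developer
101,Neila,Germann,Neila.Germann#yopmail.com,Neila.Germann#gmail.com,developer
102,Ada,Moss,Ada.Moss#yopmail.com,Ada.Moss#gmail.com,developer
'@ | ConvertFrom-Csv

$new = @'
Id,First,Last,Email1,Email2,Role
100,Renie,Stav,Renie.Stav#yopmail.com,Renie.Stav#gmail.com,CHANGE
101,Neila,Germann,CHANGE,Neila.Germann#gmail.com,developer
102,Ada,Moss,Ada.Moss#yopmail.com,Ada.Moss#gmail.com,developer
'@ | ConvertFrom-Csv

# Compare on every column, taking the column names from the first row.
$head = $source[0].PSObject.Properties.Name

# '=>' means the row appears only in the difference set ($new).
$delta = Compare-Object $source $new -Property $head -PassThru |
    Where-Object { $_.SideIndicator -eq '=>' } |
    Select-Object -Property $head

$delta
```

With real files, `Import-Csv` would replace the here-strings, and `$delta | Export-Csv Deltafile.csv -NoTypeInformation` would write the result. Note that a row that exists only in the new file is also reported as `=>`, so this sketch does not distinguish inserts from updates.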
I'm having trouble coming up with a solution that compares the line numbers of matching pairs from two lists. I will show what I mean with an example.
I have one SQL script in which I insert some data into existing tables. To make the script repeatable, before every "insert into" I delete the previous content of the table with a "delete" statement. I am able to parse the file and check whether every "insert into database1.table1" also has a "delete from database1.table1" in the file. But I don't know how to check that the delete statement for a particular table comes before the insert statement (you need to delete the content of the table before you load new data into it). I figured I would need to use the LineNumber property, but I really don't know how to combine it with the database.table check.
This is what I got in the first variable with this command:
$insertinto = Get-ChildItem "$packagepath\Init\" -Include 03_Init_*.txt -Recurse | Select-String -Pattern "insert into "
#content of variable
C:\Users\hanus\Documents\sql_init.txt:42:insert into database1.table1
C:\Users\hanus\Documents\sql_init.txt:130:insert into database1.table2
C:\Users\hanus\Documents\sql_init.txt:282:insert into database2.table3
Here is what I got into second variable with this command:
$deletefrom = Get-ChildItem "$packagepath\Init\" -Include 03_Init_*.txt -Recurse | Select-String -Pattern "delete from "
#content of the variable
C:\Users\hanus\Documents\sql_init.txt:40:delete from database1.table1;
C:\Users\hanus\Documents\sql_init.txt:128:delete from database1.table2;
C:\Users\hanus\Documents\sql_init.txt:280:delete from database2.table3;
The expected output would be something like: This "delete from" statement is not before the "insert into" statement, even though it's in the file.
I hope I described the problem well. I am new to PowerShell and scripting, so please be patient with me. Thank you in advance for any help!
You're already using Select-String, so this should be pretty simple. The content of those variables is far more than you're seeing there. Run this:
$deletefrom | Format-List * -Force
You'll see that each match contains an object with properties for the file the match is from, the line number the match was found on, and more. If you capture the table being modified in your Select-String, using a lookbehind for what you're searching on now, you can group on the table name and then alert whenever the delete happens after the insert.
Get-ChildItem "$packagepath\Init\*" -Include 03_Init_*.txt -Recurse |
    Select-String "(?<=delete from |insert into )([^;]+)" |
    Group-Object {$_.Matches[0].Value} |
    ForEach-Object {
        # Matches arrive in file order, so the first entry of each group should be the delete
        if($_.Group[0] -notmatch 'delete from'){Write-Warning "Inserting into $($_.Name) before deleting"}
    }
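If you would rather compare the line numbers explicitly, as the question suggests, something along these lines might work. This is only a sketch: the sample file content, the named capture groups `action` and `table`, and the message wording are all my own assumptions, and it handles a single file (for several files you would also group on `Path`).

```powershell
# Hypothetical sample standing in for one 03_Init_*.txt script.
$file = [System.IO.Path]::GetTempFileName()
Set-Content -Path $file -Value @(
    'delete from database1.table1;',
    'insert into database1.table1',
    'insert into database1.table2',
    'delete from database1.table2;'
)

# Capture the statement type and the database.table name separately.
$pattern = '(?<action>delete from|insert into)\s+(?<table>[\w.]+)'

$problems = Select-String -Path $file -Pattern $pattern |
    ForEach-Object {
        [pscustomobject]@{
            Table  = $_.Matches[0].Groups['table'].Value
            Action = $_.Matches[0].Groups['action'].Value
            Line   = $_.LineNumber
        }
    } |
    Group-Object Table |
    ForEach-Object {
        $insert = $_.Group | Where-Object Action -eq 'insert into' | Select-Object -First 1
        $delete = $_.Group | Where-Object Action -eq 'delete from' | Select-Object -First 1
        # Report a table when its first insert has no delete, or the delete comes later.
        if ($insert -and ((-not $delete) -or ($delete.Line -gt $insert.Line))) {
            "{0}: 'delete from' is not before 'insert into' (line {1})" -f $_.Name, $insert.Line
        }
    }

Remove-Item $file
$problems
```

In the sample, `database1.table1` is deleted on line 1 and filled on line 2, so it passes; `database1.table2` is filled on line 3 and deleted on line 4, so it is the only table reported.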
Been pulling my hair out for a couple days trying to understand why I cannot delete (or list) files using the Last Access Time file attribute. All examples I have found on the WWW return Last Write Time... Despite passing { $_.LastAccessTime }
I think it has to do with the examples being written in PowerShell 2.0.
For example,
Get-ChildItem -Recurse -Path c:\ | Where-Object {$_.LastAccessTime -le (Get-Date).AddDays(1)}
Returns
Mode LastWriteTime Length Name
---- ------------- ------ ----
However, using
Select-Object -Property LastAccessTime, FullName
DOES show Last Access Time, but I don't know how to take that info and make PS delete the files.
Get-ChildItem -Recurse -Path c:\ | Select-Object -Property LastAccessTime, FullName
What I want is an updated script that works in PowerShell 5.0 (or whatever Windows 2016 uses), accepts 2 parameters -- PATH and DAYS -- and deletes the files under PATH that have not been accessed in DAYS days.
Bonus to have it first move the files to a folder instead of delete, such as Archive. I will then run the same script later that can delete the files in the Archive folder.
I also have run FSUTIL (and rebooted) -
fsutil behavior set disablelastaccess 0
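Two observations may help here. The default table view of `Get-ChildItem` always prints a LastWriteTime column, whichever property `Where-Object` filtered on, so the filter on `LastAccessTime` can be working even though the column is not shown. Also, `AddDays(1)` yields tomorrow's date, so every file passes that test; a cutoff in the past needs a negative number. Below is a rough sketch of the move-to-archive variant. The function name `Move-StaleFile` and the `-Archive` parameter are hypothetical, and it assumes the archive folder is not inside `-Path` and that file names do not collide.

```powershell
# Hypothetical helper: moves files not accessed in the last $Days days.
function Move-StaleFile {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [int]    $Days,
        [Parameter(Mandatory)] [string] $Archive
    )

    # Files last accessed before this moment count as stale.
    $cutoff = (Get-Date).AddDays(-$Days)

    if (-not (Test-Path -LiteralPath $Archive)) {
        New-Item -ItemType Directory -Path $Archive | Out-Null
    }

    Get-ChildItem -LiteralPath $Path -File -Recurse |
        Where-Object { $_.LastAccessTime -lt $cutoff } |
        Move-Item -Destination $Archive -PassThru
}
```

A call such as `Move-StaleFile -Path D:\Data -Days 90 -Archive D:\Archive -WhatIf` (paths made up) would list what would be moved without touching anything. Replacing `Move-Item -Destination $Archive -PassThru` with `Remove-Item` would give the delete variant for the later clean-up of the archive folder.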
Today I received an email about getting unused drive letters. This was their solution:
Get-ChildItem function:[d-z]: -Name | Where-Object {-not (Test-Path -Path $_)}
PowerShell Magazine BrainTeaser had this for a solution, same thing.
ls function:[d-z]: -n|?{!(test-path $_)}|random
I have no idea how function:[d-z]: works. I know that each character from 'd' to 'z' is used, but I don't know why the syntax works.
Testing Get-ChildItem function:[d-a]: -Name gives you an error saying Get-ChildItem : Cannot retrieve the dynamic parameters for the cmdlet. The specified wildcard pattern is not valid:[d-a]:
So is that a dynamic parameter? How come it does not show up with Get-Help gci -full?
function: is a PSDrive which exposes the set of functions defined in the current session. PowerShell creates a function for each single letter drive, named as the letter followed by a colon.
So, function:[d-z]: lists the functions from "d:" through "z:"
function:[d-a]: doesn't work because d-a isn't a valid range of letters: the start of the range comes after its end.
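The `[d-z]` part is an ordinary PowerShell wildcard range, the same kind the `-like` operator understands, so its effect can be seen without touching the function: drive at all. A small sketch (the variable names are mine):

```powershell
# Build the strings 'a:' through 'z:' to stand in for the drive-letter function names.
$candidates = [char[]](97..122) | ForEach-Object { "${_}:" }

# Keep only the names the wildcard range [d-z] followed by a colon accepts.
$matching = $candidates | Where-Object { $_ -like '[d-z]:' }

$matching -join ' '
```

This keeps the 23 names from `d:` to `z:` and drops `a:`, `b:` and `c:`, which is the same selection `Get-ChildItem function:[d-z]:` makes among the functions in the session.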
I'm trying to set up a script designed to change a bit over 100 placeholders in probably some 50 files. I have a list of possible placeholders and their values. I have some applications that have exe.config files as well as ini files. These applications are stored in c:\programfiles(x86)\ and in d:\. I managed to make it work with one path, but not with two. I could easily write the code to replace twice, but that leaves me with a lot of messy code that would be harder for others to read.
ls c:\programfiles(x86) -Recurse | where-object {$_.Extension -eq ".config" -or $_.Extension -eq ".ini"} | %{(gc $PSPath) | %{
$_ -replace "abc", "qwe" `
-replace "lkj", "hgs" `
-replace "hfd", "fgd"
} | sc $_PSPath; Write-Host "Processed: " + $_.Fullname}
I've tried to include 2 paths by putting $a = path1, $b = path2, $c = $a + $b, and that seems to work as far as getting the ls command to run in two different places. However, it does not seem to store the path the files are in, so it tries to replace the file names it has found in the folder you are currently running the script from. Thus, even if I am in one of the places where the files are supposed to be, I am not in the other ...
So, any idea how I can get PowerShell to list files in 2 different places and replace the same placeholders in both places without having to have the code twice? I thought about putting the code I would have to use twice into a variable and calling it when needed instead of writing it again, but it seemed to resolve the code before using it, and that didn't give me results since the data comes from the first part.
If you got a cool pipeline, then every problem looks like ... uhm ... fluids? objects? I have no clue. But anyway, just add another layer (and fix a few problems along the way):
$places = 'C:\Program Files (x86)', 'D:\some other location'
$places |
    Get-ChildItem -Recurse -Include *.ini,*.config |
    ForEach-Object {
        # FullName avoids relying on how the FileInfo object converts to a string
        (Get-Content $_.FullName) -replace 'abc', 'qwe' `
                                  -replace 'lkj', 'hgs' `
                                  -replace 'hfd', 'fgd' |
            Set-Content $_.FullName
        'Processed: {0}' -f $_.FullName
    }
Notable changes:
Just iterate over the list of folders to crawl as the first step.
Doing the filtering directly in Get-ChildItem makes it faster and saves the Where-Object.
-replace can be applied directly to an array, no need for another ForEach-Object there.
If the number of replacements is large you may consider using a hashtable to store them so that you don't have twenty lines of -replace 'foo', 'bar'.
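The hashtable idea could look roughly like this. It is a sketch under assumptions: `Update-Placeholder` is a hypothetical helper name, and the keys are escaped so they match literally, which differs slightly from the plain `-replace` calls above where the search text is a regex.

```powershell
# Placeholder/value pairs from the example above. [ordered] keeps them in
# the order written, which matters if one replacement feeds another.
$replacements = [ordered]@{
    'abc' = 'qwe'
    'lkj' = 'hgs'
    'hfd' = 'fgd'
}

# Hypothetical helper: applies every pair to each line it is given.
function Update-Placeholder {
    param(
        [string[]] $Line,
        [System.Collections.IDictionary] $Map
    )
    foreach ($key in $Map.Keys) {
        # -replace treats its first operand as a regex, so escape the key.
        $Line = $Line -replace ([regex]::Escape($key)), $Map[$key]
    }
    $Line
}

$result = Update-Placeholder -Line 'x=abc', 'y=lkj;z=hfd' -Map $replacements
$result
```

Inside the pipeline, the three `-replace` lines would then shrink to something like `Update-Placeholder -Line (Get-Content $_.FullName) -Map $replacements | Set-Content $_.FullName`. Replacement values containing `$` would still be read as regex group references, so those would need extra care.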