Using PowerShell, how can a SQL file be split into several files based on its content?

I'm trying to use PowerShell to automate the division of a SQL file into separate files based on where the headings are located.
An example SQL file to be split is below:
/****************************************
Section 1
****************************************/
Select 1
/****************************************
Section 2
****************************************/
Select 2
/****************************************
Section 3
****************************************/
Select 3
I want the new files to be named as per the section headings in the file i.e. 'Section 1', 'Section 2' and 'Section 3'. The content of the first file should be as follows:
/****************************************
Section 1
****************************************/
Select 1
The string: /**************************************** is only used in the SQL file for the section headings and therefore can be used to identify the start of a section. The file name will always be the text on the line directly below.
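For what it's worth, a heading marker like that is easy to pick out with a regex match. A minimal sketch of just the detection step, assuming each marker line begins with /* followed by a run of asterisks (the input path below is a placeholder):
#sketch only: list the heading-marker lines; the path is a placeholder
Get-Content "C:\placeholder\FileToSplit.sql" | Where-Object { $_ -match '^/\*{4,}' }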

You can try something like this (the split here is based on empty lines between sections):
#create an index for our output files
$fileIndex = 1
#load the SQL file contents into an array
$sqlite = Get-Content "G:\input\sqlite.txt"
#for each line of the SQL file
$sqlite | % {
    if ($_ -eq "") {
        #if the line is empty, increment the output file index to start a new file
        $fileIndex++
    } else {
        #if the line is not empty, build the output path
        $outFile = "G:\output\section$fileIndex.txt"
        #push the line to the current output file (appending to existing contents)
        $_ | Out-File $outFile -Append
    }
}
#load the generated files into an array
$tempfiles = Get-ChildItem "G:\output"
#for each file
$tempfiles | % {
    #load the file contents into an array
    $data = Get-Content $_.FullName
    #rename the file after the contents of its second line
    Rename-Item $_.FullName "$($data[1]).txt"
}
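One thing to be aware of: Out-File -Append keeps appending, so running the script a second time would duplicate the content of each section file. A small hedged addition you could place before the loop, assuming G:\output can safely be emptied:
#clear any previous output so -Append doesn't duplicate content on a re-run
Remove-Item "G:\output\*" -Force -ErrorAction SilentlyContinue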

The code below uses the heading names found within the comment blocks and splits the SQL file into several SQL files based on the location of those blocks.
#load the SQL file contents into an array
$SQL = Get-Content "U:\Test\FileToSplit.sql"
$OutputPath = "U:\TestOutput"
#find the section names and count the number of sections
$sectioncounter = 0
$checkcounter = 0
$filenames = @()
$SQL | % {
    #add the file name to the array if a new section was found on the previous line
    If ($checkcounter -lt $sectioncounter)
    {
        $filenames += $_
        $checkcounter = $sectioncounter
    }
    Else
    {
        If ($_.StartsWith("/*"))
        {
            $sectioncounter += 1
        }
    }
}
#return if too many sections were found
If ($sectioncounter -gt 50) { return "Too many sections found" }
$sectioncounter = 0
$endcommentcounter = 0
#for each line of the SQL file (Ref: sodawillow)
$SQL | % {
    #if a new comment block is found, point to the next section name, unless it's the start of the first section
    If ($_.StartsWith("/*") -And ($endcommentcounter -gt 0))
    {
        $sectioncounter += 1
    }
    If ($_.EndsWith("*/"))
    {
        $endcommentcounter += 1
    }
    #build the output path
    $tempfilename = $filenames[$sectioncounter]
    $outFile = "$OutputPath\$tempfilename.sql"
    #push the line to the current output file (appending to existing contents)
    $_ | Out-File $outFile -Append
}
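A possible caveat: the section name is used verbatim as the file name, so leading spaces or characters that Windows forbids in file names would make Out-File fail. A hypothetical guard you could swap in when building $tempfilename, assuming the headings only ever need trimming and stripping of reserved characters:
#hypothetical guard: trim the heading and strip characters not allowed in file names
$tempfilename = ($filenames[$sectioncounter] -replace '[\\/:*?"<>|]', '').Trim()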

Related

eliminating all values that occur in all files in folder with awk

I have a folder with several files, and I want to eliminate all of the terms that they have in common using awk.
Here is the script that I have been using:
awk '
FNR==1 {
    if (seen[FILENAME]++) {
        firstPass = 0
        outfile = FILENAME "_new"
    }
    else {
        firstPass = 1
        numFiles++
        ARGV[ARGC++] = FILENAME
    }
}
firstPass { count[$2]++; next }
count[$2] != numFiles { print > outfile }
' *
An example of the information in the files would be:
File1
3 coffee
4 and
8 milk
File2
4 dog
2 and
9 cat
The output should be:
File1_new
3 coffee
8 milk
File2_new
4 dog
9 cat
It works when I use a small number of files (e.g. 10), but when I start to increase that number, I get the following error message:
awk: file20_new makes too many open files input record number 27, file file20_new source line number 14
Where is the error coming from when I use a larger number of files?
My main goal is to run this script over all of the files in a folder to generate new files that do not contain any words occurring in all of the files in the folder.
When you use >, a file is opened for writing (and truncated). As suggested in the comments, you need to close your files as you go along. Try something like this:
awk '
FNR==1 {
    if (seen[FILENAME]++) {
        firstPass = 0
        if (outfile) close(outfile)   # <-- close the previous file
        outfile = FILENAME "_new"
    }
    else {
        firstPass = 1
        numFiles++
        ARGV[ARGC++] = FILENAME
    }
}
firstPass { count[$2]++; next }
count[$2] != numFiles { print > outfile }
' *

PowerShell finding a file and creating a new one

The script I'm working on produces a log file every time it runs. The problem is that when the script runs in parallel, the current log file becomes inaccessible to Out-File. This is normal, because the previous instance of the script is still writing to it.
So I would like the script to detect, when it starts, that there is already a log file present, and if so, create a new log file name with an increased number between the brackets [<nr>].
It's tricky to check whether a file already exists, as it can carry a different number each time the script starts. It would be great if the script could pick up the number between the brackets and increment it by one for the new file name.
The code:
$Server = "UNC"
$Destination = "\\domain.net\share\target\folder 1\folder 22"
$LogFolder = "\\server\c$\my logfolder"
# Format log file name
$TempDate = (Get-Date).ToString("yyyy-MM-dd")
$TempFolderPath = $Destination -replace '\\','_'
$TempFolderPath = $TempFolderPath -replace ':',''
$TempFolderPath = $TempFolderPath -replace ' ',''
$script:LogFile = "$LogFolder\$(if($Server -ne "UNC"){"$Server - $TempFolderPath"}else{$TempFolderPath.TrimStart("__")})[0] - $TempDate.log"
$script:LogFile
# Create new log file name
$parts = $script:LogFile.Split('[]')
$script:NewLogFile = '{0}[{1}]{2}' -f $parts[0],(1 + $parts[1]),$parts[2]
$script:NewLogFile
# Desired result
# \\server\c$\my logfolder\domain.net_share_target_folder1_folder22[0] - 2014-07-30.log
# \\server\c$\my logfolder\domain.net_share_target_folder1_folder22[1] - 2014-07-30.log
#
# Usage
# "stuff" | Out-File -LiteralPath $script:LogFile -Append
As mentioned in my answer to your previous question, you can auto-increment the number in the filename with something like this:
while (Test-Path -LiteralPath $script:LogFile) {
    $script:LogFile = Increment-Index $script:LogFile
}
where Increment-Index implements the program logic that increments the index in the filename by one, e.g. like this:
function Increment-Index($f) {
    $parts = $f.Split('[]')
    '{0}[{1}]{2}' -f $parts[0],(1 + $parts[1]),$parts[2]
}
or like this:
function Increment-Index($f) {
    $callback = {
        $v = [int]$args[0].Groups[1].Value
        $args[0] -replace $v,++$v
    }
    ([Regex]'\[(\d+)\]').Replace($f, $callback)
}
The while loop increments the index until it produces a non-existing filename. The parameter -LiteralPath in the condition is required because the filename contains square brackets, which would otherwise be treated as wildcard characters.
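To illustrate the difference (the paths below are placeholders):
#with -Path, [0] is parsed as a wildcard character range, so this actually looks for "name0.log"
Test-Path 'C:\logs\name[0].log'
#with -LiteralPath, the brackets are taken literally and the exact file name is checked
Test-Path -LiteralPath 'C:\logs\name[0].log'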

Powershell Code for List of Distinct Directories

I have a file that contains a list such as:
tables\mytable1.sql
tables\myTable2.sql
procedures\myProc1.sql
functions\myFunction1.sql
functions\myFunction2.sql
From this data (and there will always be a path, and it will always be only one level), I want to retrieve a list of distinct paths (e.g. tables\, procedures\, functions\)
To maybe make it simpler: the file that contains this data will already have been read into a list (named $fileList), so the new list ($directoryList?) can likely be derived from it.
I've found references to the -unique parameter, but I need to look from the start of the line up to (and including) the '\', of which there will be only one occurrence.
Assuming you already have the data in $fileList, try this:
$directoryList = $fileList | %{ $_.split("\")[0]} | select -unique
It will do a foreach (the %{}) on the elements of your list, then split each one on the \ and keep only the first part (in your case, the folder name). After that, select -unique keeps just the distinct values.
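For example, feeding in a few of the sample lines from the question:
$fileList = 'tables\mytable1.sql', 'tables\myTable2.sql', 'procedures\myProc1.sql'
$fileList | %{ $_.split("\")[0] } | select -unique
# tables
# procedures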
Alternatively, you could do it like this:
$fileList | %{ $_ -replace "\\.*$","" } | select -unique
Using -replace to remove everything after the \.
Also, if for some reason you don't have the values of your text file in $fileList already, you can load them with:
$fileList = Get-Content yourFile.txt
Your file may contain empty lines, and more often than not the last line is empty, so this accounts for that.
It also uses a slightly different regular expression that matches, at the end of the string, the last \ and everything after it, which works for paths with multiple levels, including your example.
If you have a text file with the following:
Z:\Path to somewhere\Files\some file 1.txt
Z:\Path to somewhere\Files\some file 2.txt
tables\mytable1.sql
tables\myTable2.sql
procedures\myProc1.sql
functions\myFunction1.sql
functions\myFunction2.sql
With this code, which also shows the output after the function call:
$fileListToProcess = "$([Environment]::GetFolderPath(""Desktop""))\list.txt"
Function Get-UniqueDirectoriesFromFile {
    Param
    (
        [Parameter(Mandatory = $true, HelpMessage = 'The file where the list of files is.')]
        [string]$LiteralPath
    )
    if (Test-Path -LiteralPath $LiteralPath -PathType Leaf) {
        $fileList = [IO.File]::ReadAllLines($LiteralPath)
        return $fileList | %{ $_ -replace '\\[^\\]*$', '' } | ? { $_.Trim() -ne "" } | Select -Unique
    }
    else {
        return $null
    }
}
$uniqueDirs = Get-UniqueDirectoriesFromFile -LiteralPath $fileListToProcess
# Display the results:
$uniqueDirs
# PS>
# Z:\Path to somewhere\Files
# tables
# procedures
# functions
$uniqueDirs.count
# PS> 4

How to merge multiple text files into one new csv file with column header created using Perl

I am new to Perl and need some guidance. I have multiple text files and want to merge them all into a new CSV file. Then, from the CSV file, I want to split the string into multiple columns as shown in the "Output" formats below. Can someone please help me?
Text File#1.txt
Name:A
Test1:80
Test2:60
Test3:50
Text File#2.txt
Name:B
Test1:85
Test2:78
Test3:60
Output (format #1):
New Text File#3.csv
Name Test1 Test2 Test3
A 80 60 50
B 85 78 60
Output (format #2):
New Text File#3.csv
Name Test Data
A 1 80
A 2 60
A 3 50
B 1 85
B 2 78
Reading the files:
open FILE, "filename.txt" or die $!;
#create hash
%hash = ();
#read file - you have to do this for all files
while (<FILE>) {
    chomp;
    if (/^Name:/) {
        $name = (split ':')[1];
    } else {
        $points = (split ':')[1];
        push( @{$hash{$name}}, $points );
    }
}
at this point you have a hash:
A -> [80,60,50]
B -> [85,78,60]
You can now use this hash to print your CSV files.
For file 1:
open CSV1, ">", "1.csv" or die $!;
foreach my $name ( keys %hash )
{
    $points = $hash{$name};
    print CSV1 $name . ";" . join(";", @{$points}) . "\n";
}
For file 2:
open CSV2, ">", "2.csv" or die $!;
foreach my $name ( keys %hash )
{
    $points = $hash{$name};
    $testnumber = 0;
    foreach $point ( @$points ) {
        $testnumber++;
        print CSV2 $name . ";" . $testnumber . ";" . $point . "\n";
    }
}
Hope this helps you out; if anything is not clear, you can ask.
Don't copy-paste blindly, as it may contain minor errors, but I assume the way of thinking is clear.
Update: the ";" is used because CSV splits columns on that character.
Please give feedback.

Replace text from select statement from one file to another

I have a bunch of views for my database and need to update the select statements within each view.
I have all the select statements in files called viewname.txt in one dir, and in a subdirectory called sql I have all the views as viewname.sql. I want to run a script to take the text from viewname.txt and replace the select statement in the correct viewname.sql in the sql subdirectory.
I have tried this to append the text after the SELECT in each .sql file:
for i in */*; do
    if ["../$(basename "${i}")" == "$(basename "${i}")"]
    then
        sed '/SELECT/a "$(basename "$i" .txt)"' "$(basename "$i" .sql)"
    fi
done
Any assistance is greatly appreciated!
Dickie
This is an awk answer that's close - the output is placed in the sql directory under corresponding "viewname.sql.new" files.
#!/usr/bin/awk -f
# absorb the whole viewname.txt file into arr when the first line is read
FILENAME ~ /\.txt$/ && FILENAME != last_filename {
    last_filename = FILENAME
    # get the viewname part of the file name
    split( FILENAME, file_arr, "." )
    while( getline file_data <FILENAME > 0 ) {
        old_data = arr[ file_arr[ 1 ] ]
        arr[ file_arr[ 1 ] ] = \
            old_data (old_data == "" ? "" : "\n") file_data
    }
    next
}
# process each line of the sql/viewname.sql files
FILENAME ~ /\.sql$/ {
    # strip the "sql/" prefix from FILENAME for lookup in arr
    split( substr( FILENAME, 5 ), file_arr, "." )
    if( file_arr[ 1 ] in arr ) {
        if( $0 ~ /SELECT/ )
            print arr[ file_arr[ 1 ] ] > FILENAME ".new"
        else
            print $0 > FILENAME ".new"
    }
}
I put this into a file called awko, made it executable with chmod +x, and ran it like this:
awko *.txt sql/*
You'll have to mv the new files into place, but it's as close as I can get right now.