How to remove space from an array (powershell) - sql

I have a script that reads a TAB-separated .TXT file, grabs information from a table, and then creates a .SQL script based on the names in the list. But every time a $Variable[#] is referenced, an extra space is added.
This space does not exist in the source data. I am looking for a way to trim it.
$start | Out-File -filepath $target1 -append
$infile = $source1
$reader = [System.IO.File]::OpenText($infile)
$writer = New-Object System.IO.StreamWriter $file1;
$counter = 1
try {
while (($line = $reader.ReadLine()) -ne $null)
{
$myarray=$line -split "\t" | foreach {$_.Trim()}
If ($myarray[1] -eq "") {$myarray[1]=”~”}
If ($myarray[2] -eq "") {$myarray[2]=”~”}
If ($myarray[3] -eq "") {$myarray[3]=”~”}
If ($myarray[4] -eq "") {$myarray[4]=”~”}
If ($myarray[5] -eq "") {$myarray[5]=”~”}
if ($myarray[0] -Match "\d{1,4}\.\d{1,3}"){
"go
Insert into #mytable Select convert(varchar(60),replace('OSFI Name: "+$myarray[1],$myarray[2],$myarray[3],$myarray[4],$myarray[5],"')), no_,branch,name,surname,midname,usual,bname2
from cust where cust.surname in ('"+$myarray[2].,"',"+$myarray[1],"',"+$myarray[3],"',"+$myarray[4],"',"+$myarray[5],"')' and ( name in ('"+$myarray[1],"','"+$myarray[2],"','"+$myarray[3],"','"+$myarray[4],"','"+$myarray[5],"') or
midname in ('"+$myarray[1],"','"+$myarray[2],"','"+$myarray[3],"','"+$myarray[4],"','"+$myarray[5],"') or
usualy in ('"+$myarray[1],"','"+$myarray[2],"','"+$myarray[3],"','"+$myarray[4],"','"+$myarray[5],"') or
bname2 in ('"+$myarray[1],"','"+$myarray[2],"','"+$myarray[3],"','"+$myarray[4],"','"+$myarray[5],"') )
" -join "," | foreach {$_.Trim()} | Out-File -filepath $target1 -append
}
#$writer.WriteLine($original);
#Write-Output $original;
#Write-Output $newlin
}
}
finally {
$reader.Close()
$writer.Close()
}
$end | Out-File -filepath $target1 -append
Every time it calls $myarray[1] or any other number it adds a space. This is not good as this will create a duplicate entry for every name it pulls in my DB.
We have an existing ".Java" script that does what I am trying to achieve so I know what my output should look like.
The output I should be getting looks like:
go
Insert into #mytable Select convert(varchar(60),replace('OSFI Name: Fake Name Faker unreal ','''''','''')), no_,branch,name,surname,midname,usual,bname2
from cust where cust.surname in ('Faker unreal','Fake Name','~','~','~') and ( name in ('Fake Name', 'Faker unreal', '~', '~', '~') or
midname in ('Fake Name', 'Faker unreal', '~', '~', '~') or
usual in ('Fake Name', 'Faker unreal', '~', '~', '~') or
bname2 in ('Fake Name', 'Faker unreal', '~', '~', '~') )
But instead I am getting
go
Insert into #mytable Select convert(varchar(60),replace('OSFI Name: Fake Name Fake Faker unreal ~ ~ ')), no_,branch,name,surname,midname,usual,bname2
from cust where cust.surname in ('Fake Name ',unreal ',~ ',~ ')' and ( name in ('Fake Name ','Fake Faker ','unreal ','~ ','~ ') or
midname in ('Fake Name ','Fake Faker ','unreal ','~ ','~ ') or
usualy in ('Fake Name ','Fake Faker ','unreal ','~ ','~ ') or
bname2 in ('Fake Name ','Fake Faker ','unreal ','~ ','~ ') )

The commas in your string building are causing this issue. I see that you also use -join at the end of that string, which could explain the commas, since -join is an array operator. In essence, you are muddying the waters by mixing array construction with basic string concatenation. A fragment of your code that shows the issue:
'"+$myarray[2].,"',"+$myarray[1],"',"+$myarray[3],"'
Fix those and your issue should go away (see the section below about better approaches). The comma makes PowerShell treat the next string as an array element as opposed to a string concatenation. When arrays are flattened to strings in PowerShell, the elements are space delimited by default. The issue is not a space in the array element but how you are building your string.
Compare these results to see what I mean
$myarray = 65..90 | %{[string]([char]$_) * 4}
"'"+$myarray[2],"'"
"'"+$myarray[2]+"'"
'CCCC '
'CCCC'
In the first example, the two-element array $myarray[2],"'" is added to the string "'", so it is flattened with a space between its elements. In the second, the quote, the array element, and another quote are concatenated in turn: pure string concatenation.
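Applied to the IN lists in the question, the same idea can be sketched as follows. This is a minimal illustration, not the asker's script: the sample values in $myarray are made up, and $inList is a hypothetical variable name. Each element is quoted first and the results are then joined, so no array is ever flattened into a string.

```powershell
# Hypothetical sample row; in the question these values come from the split line.
$myarray = 'ignored', 'Fake Name', 'Faker unreal', '~', '~', '~'

# Quote each element, then join with commas: no stray spaces are introduced.
$inList = ($myarray[1..5] | ForEach-Object { "'$_'" }) -join ','
$inList

# The same list embedded in a larger string with the format operator.
"name in ({0})" -f $inList
```

Building the list once and reusing it also avoids repeating the five array references in every IN clause.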
Consider better string building approaches
Know that there are other ways to do this as well. You can use subexpressions and the format operator if that helps.
"'$($myarray[2])','$($myarray[3])'"
"'{0}','{1}'" -f $myarray[2],$myarray[3]
SQL Injection Warning
You are manually building SQL code with string concatenation. This means that an attacker could put something malicious in your input file and you could very well execute it. You need to be using command parameterization. Look up SQL injection, as this is out of scope of the question.

Related

Timestamp issues with Powershell

I have a small PowerShell script that pulls the last hour of punch data from a SQL db and outputs it to a .csv file. The script is working, but the timestamp looks like this:
hh:mm:ss.xxx. I need it to be only hh:mm. Any help would be greatly appreciated!
Below is the script and a snippet of the output:
sqlcmd -h-1 -S ZARIRIS\IRIS -d IA3000SDB -Q "SET NOCOUNT ON; Select Distinct TTransactionLog_1.DecisionTimeInterval,
TTransactionLog_1.UserID, TTransactionLog_1.OccurDateTime, TTransactionLog_1.StableTimeInterval
From TTransactionLog_1
Inner join TSystemLog1 On TTransactionLog_1.NodeID=TSystemLog1.NodeID
Inner join TUser On TTransactionLog_1.UserID=Tuser.UserID
where TSystemLog1.NodeID = 3 and TTransactionLog_1.OccurDateTime >= dateadd(HOUR, -1, getdate())" -s "," -W -o "C:\atr\karen\adminreport3.csv"
Get-Content "C:\ATR\Karen\adminreport3.csv" | ForEach-Object {$_ -replace "44444444","IN PUNCH"} | ForEach-Object {$_ -replace "11111111","OUT PUNCH"} | Set-Content "C:\ATR\Karen\punchreport1.csv" -Force
Output (where I need the hh:mm format; it needs to read 12:08, not 12:08:19.000):
112213,2022-10-31 12:08:19.000,OUT PUNCH
It would probably be best if your script were to write out a date formatted the way you want in the first place,
but if that's not an option, you really should consider using Import-Csv and Export-Csv to manipulate the data inside.
If the standard quoted csv output is something you don't want, please see this code to safely remove the quotes where possible.
Having said that, here's one way of doing it in a line-by-line fashion:
Get-Content "C:\ATR\Karen\adminreport3.csv" | ForEach-Object {
$line = $_ -replace "44444444","IN PUNCH" -replace "11111111","OUT PUNCH"
$fields = $line -split ','
# reformat the date by first parsing it out as DateTime object
$fields[1] = '{0:yyyy-MM-dd HH:mm}' -f [datetime]::ParseExact($fields[1], 'yyyy-MM-dd HH:mm:ss.fff',$null)
# or use regex on the date and time string as alternative
# $fields[1] = $fields[1] -replace '^(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}).*', '$1'
# rejoin the fields with a comma
$fields -join ','
} | Set-Content "C:\ATR\Karen\punchreport1.csv" -Force
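The object-based route mentioned earlier in this answer could look roughly like the sketch below. It uses ConvertFrom-Csv on the sample line from the question so that it runs without the real file; Import-Csv accepts the same -Header parameter. The column names UserID, OccurDateTime and PunchType are assumptions, since sqlcmd -h-1 suppresses the header row.

```powershell
# Sample line taken from the question; the column names are assumptions,
# because sqlcmd -h-1 suppresses the header row.
$sample = '112213,2022-10-31 12:08:19.000,OUT PUNCH'
$rows = $sample | ConvertFrom-Csv -Header 'UserID', 'OccurDateTime', 'PunchType'

foreach ($row in $rows) {
    # Parse the text into a DateTime, then write it back without seconds.
    $parsed = [datetime]::ParseExact($row.OccurDateTime, 'yyyy-MM-dd HH:mm:ss.fff',
        [cultureinfo]::InvariantCulture)
    $row.OccurDateTime = $parsed.ToString('yyyy-MM-dd HH:mm', [cultureinfo]::InvariantCulture)
}

# Emits quoted CSV text; Export-Csv would write the same rows to a file.
$rows | ConvertTo-Csv -NoTypeInformation
```

Working with named properties keeps the date handling independent of the field position, at the cost of quoted output.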

Find a character in a string using Powershell?

I know I could use Contains to find it but it doesn't work.
Full Story:
I have to get the PartNo, Ver, and Rev from a SQL db and check whether they occur in the first line of the text file. I get the first line of the file and store it in $EiaContent.
The PartNo is associated with MAFN as in $partNo=Select PartNo Where MAFN=xxx. Most of the time MAFN returns one PartNo. But in some cases for one MAFN there could be multiple PartNo. So the query returns multiple PartNo(PartNo_1,PartNo_2,PartNo_3,and PartNo_4) but only one of these will be in the text file.
The issue is that each of these PartNo. is treated as a single character in PowerShell. $partNo.Length is 4. Therefore, my check If ($EiaContent.Contains("*$partNo*")) fails and it shouldn't in this case because I can see that one of the PartNo is mentioned in the file. Also, Contains wouldn't work if there was one PartNo. I use like as in If ($EiaContent -like "*$partNo*") to match the PartNo and it worked but it doesn't work when there are multiple PartNo.
Data type of $partNo is string and so is $EiaContent. The data type of PartNo. in SQL is varchar(50) collation is COLLATE SQL_Latin1_General_CP1_CI_AS
I am using PowerShell Core 7.2 and SQL 2005
Code:
$EiaContent = (Get-Content $aidLibPathFolder\$folderName\$fileName -TotalCount 1)
Write-host $EiaContent
#Sql query to get the Part Number
$partNoQuery = "SELECT PartNo FROM [NML_Sidney].[dbo].[vMADL_EngParts] Where MAFN = $firstPartTrimmed"
$partNoSql = Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $partNoQuery
#Eliminate trailing spaces
$partNo = $partNoSql.PartNo.Trim()
If ($EiaContent.Contains("*$partNo*")) {
Write-Host "Part Matches"
}
Else {
#Send an email stating the PartNo discrepancy
}
Thank you in advance to those who try to help.
EDIT
Screenshot
[1]: https://i.stack.imgur.com/hIqJB.png
A1023 A1023MD C0400 C0400MD is the output of the variable $partNo and O40033( C0400 REV N VER 004, 37 DIA 4.5 BRAKE DRUM OP3 ) is the output of the variable $EiaContent
So the query returns multiple PartNo(PartNo_1,PartNo_2,PartNo_3,and PartNo_4) but only one of these will be in the text file.
A1023 A1023MD C0400 C0400MD is the output of the variable $partNo and O40033( C0400 REV N VER 004, 37 DIA 4.5 BRAKE DRUM OP3 ) is the output of the variable $EiaContent
So you first have to split $partNo and then for each sub string of $partNo, search for it in $EiaContent:
If ($partNo -split ' ' | Where-Object { $EiaContent.Contains( $_ ) }) {
Write-Host "Part Matches"
}
This is the generic form that most people are used to. We can simplify the query using the unary form of -split (as we split on the default separator) and use the intrinsic array method .Where() which is faster as it does not involve pipeline overhead.
If ((-split $partNo).Where{ $EiaContent.Contains( $_ ) }) {
Write-Host "Part Matches"
}
As correctly noted in comments, wildcards are not supported by the .Contains() string method.
Wildcards are supported only by the PowerShell -like operator. The following example is just for educational purposes, I wouldn't use it in your case as .Contains() string method is simpler and faster.
If ((-split $partNo).Where{ $EiaContent -like "*$_*" }) {
Write-Host "Part Matches"
}
Note that -contains would not be suitable here. A common misconception is that -contains does a substring search, when the LHS operand is a string. It doesn't! The operator tests whether a collection (such as an array) on the LHS contains the value given on the RHS.
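The difference between the three tests can be seen side by side in this small sketch, which reuses the sample values from the question. The variable names on the left are only there for illustration.

```powershell
$EiaContent = 'O40033( C0400 REV N VER 004, 37 DIA 4.5 BRAKE DRUM OP3 )'
$partNo     = 'A1023 A1023MD C0400 C0400MD'

$substring  = $EiaContent.Contains('C0400')         # substring search (case-sensitive)
$wildcard   = $EiaContent -like '*C0400*'           # wildcard match
$membership = $EiaContent -contains 'C0400'         # scalar LHS acts as a one-element collection
$inArray    = (-split $partNo) -contains 'C0400'    # membership test on a real array

"$substring $wildcard $membership $inArray"         # True True False True
```

Only the third test fails, because the single string in $EiaContent is not equal to 'C0400' as a whole.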

Finding table names in SSIS .dtsx packages

I am trying to scan SSIS .dtsx packages for table names. Yes, I know that I should use [xml] and a tool that parses SQL language. That does not seem to be possible at this time. PowerShell can understand [xml], but SQL parsers generally cost++ and using ANTLR is more of an investment than is acceptable at this time. I am open to suggestions, but I am not asking for a tool recommendation.
There are two (2) problems.
1) `&.;` does not appear to be recognized as separate from the table name capture item
2) TABLE5 does not appear to be found
Yes, I also know that schema names should not be hardcoded into source. It makes it difficult/impossible for DBAs to manage the database. That is the way it is done here.
How can I make the regex omit the &.*; from the capture and recognize dbo.TABLE5?
Here is the code I am using to scan the .dtsx files.
PS C:\src\sql> Get-Content .\Find-FromJoinSql.ps1
Get-ChildItem -File -Filter '*.dtsx' |
ForEach-Object {
$Filename = $_.Name
Select-String -Pattern '(FROM|JOIN)(\s|&.*;)+(\S+)(\s|&.*;)+' -Path $_ -AllMatches |
ForEach-Object {
if ($_.Matches.Groups.captures[3].value -match 'dbo') {
"$Filename === $($_.Matches.Groups.captures[3].value)"
}
}
}
Here is a tiny sample of the type of text from the .dtsx file.
PS C:\src\sql> Get-Content .\sls_test.dtsx
USE ADATABASE;
SELECT * FROM dbo.TABLE1 WHERE F1 = 3;
SELECT * FROM dbo.TABLE2 T2
FULL OUTER JOIN dbo.TABLEJ TJ
ON T2.KEY = TJ.KEY;
SELECT * FROM dbo.TABLE3 T3
INNER JOIN ADATABASE2.dbo.TABLEK
TK ON
T3.user_id = TK.user_id
SELECT * FROM dbo.TABLE4 T4 FULL OUTER JOIN dbo.TABLE5 T5
ON T4.F1 = T5.F1;
EXIT
Running the script on this data produces:
PS C:\src\sql> .\Find-FromJoinSql.ps1
sls_test.dtsx === dbo.TABLE1
sls_test.dtsx === dbo.TABLE2
sls_test.dtsx === dbo.TABLEJ
sls_test.dtsx === dbo.TABLE3
sls_test.dtsx === ADATABASE2.dbo.TABLEK
TK
sls_test.dtsx === dbo.TABLE4
PS C:\src\sql> $PSVersionTable.PSVersion.ToString()
7.1.5
Indeed, it is strange that some entities (the encoded line breaks) are not replaced in those files.
Change the regex pattern a bit to capture the dbo.table names like below.
Using Get-Content
$regex = [regex] '(?im)(?:FROM|JOIN)(?:\s|&[^;]+;)+([^\s&]+)(?:\s|&[^;]+;)*'
Get-ChildItem -Path D:\Test -File -Filter '*.dtsx' |
ForEach-Object {
$match = $regex.Match((Get-Content -Path $_.FullName -Raw))
while ($match.Success) {
"$($_.Name) === $($match.Groups[1].Value)"
$match = $match.NextMatch()
}
}
Using Select-String
As to why Select-String -AllMatches skipped your TABLE5, from the docs: "When Select-String finds more than one match in a line of text, it still emits only one MatchInfo object for the line, but the Matches property of the object contains all the matches."
That means you need another loop over the Matches property of each MatchInfo object to get every match into your output:
$pattern = '(?:FROM|JOIN)(?:\s|&[^;]+;)+([^\s&]+)(?:\s|&[^;]+;)*'
Get-ChildItem -Path 'D:\Test' -File -Filter '*.dtsx' |
ForEach-Object {
$Filename = $_.Name
Select-String -Pattern $pattern -Path $_.FullName -AllMatches |
ForEach-Object {
# loop again, because each $MatchInfo object may contain multiple
# $Matches objects if more matches were found in the same line
foreach ($match in $_.Matches) {
if ($match.Groups[1].value -match 'dbo') {
"$Filename === $($match.Groups[1].value)"
}
}
}
}
Output:
sls_test.dtsx === dbo.TABLE1
sls_test.dtsx === dbo.TABLE2
sls_test.dtsx === dbo.TABLEJ
sls_test.dtsx === dbo.TABLE3
sls_test.dtsx === ADATABASE2.dbo.TABLEK
sls_test.dtsx === dbo.TABLE4
sls_test.dtsx === dbo.TABLE5
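The one-MatchInfo-per-line behaviour can be checked in isolation with the sketch below, which pipes a single line from the sample file into Select-String instead of reading a .dtsx file. The variable names $line and $info are illustrative only.

```powershell
$pattern = '(?:FROM|JOIN)(?:\s|&[^;]+;)+([^\s&]+)(?:\s|&[^;]+;)*'
$line    = 'SELECT * FROM dbo.TABLE4 T4 FULL OUTER JOIN dbo.TABLE5 T5'

$info = $line | Select-String -Pattern $pattern -AllMatches

@($info).Count          # 1: a single MatchInfo object for the line
$info.Matches.Count     # 2: both matches live in its Matches property

# Capture group 1 of each match holds the table name.
$tables = $info.Matches | ForEach-Object { $_.Groups[1].Value }
$tables                 # dbo.TABLE4, dbo.TABLE5
```

Indexing into the Matches collection without looping over it is what dropped dbo.TABLE5 in the original script.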
Regex details:
(?im) Use case-insensitive matching and have '^' and '$' match at linebreaks
(?: Match the regular expression below
Match either the regular expression below (attempting the next alternative only if this one fails)
FROM Match the characters “FROM” literally
| Or match regular expression number 2 below (the entire group fails if this one fails to match)
JOIN Match the characters “JOIN” literally
)
(?: Match the regular expression below
| Match either the regular expression below (attempting the next alternative only if this one fails)
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
| Or match regular expression number 2 below (the entire group fails if this one fails to match)
& Match the character “&” literally
[^;] Match any character that is NOT a “;”
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
; Match the character “;” literally
)+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
( Match the regular expression below and capture its match into backreference number 1
[^\s&] Match a single character NOT present in the list below
A whitespace character (spaces, tabs, line breaks, etc.)
The character “&”
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
(?: Match the regular expression below
| Match either the regular expression below (attempting the next alternative only if this one fails)
\s Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
| Or match regular expression number 2 below (the entire group fails if this one fails to match)
& Match the character “&” literally
[^;] Match any character that is NOT a “;”
+ Between one and unlimited times, as many times as possible, giving back as needed (greedy)
; Match the character “;” literally
)* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)

Generate SQL with Powershell & Regex

I have the following ddl(Data definition language) structure in sql:
CREATE TABLE tablename(
key ...,
key1 ...,
u_version ...,
field1 ...,
field2 ...
)
CREATE UNIQUE clustered... (key ASC, key1 ASC)
I want to create multiple tables with that structure; only the table names, keys and field names differ. So I want to use a PowerShell script to scan every file in the directory and generate a SQL script for each file.
The generated script should look like this:
INSERT INTO tablename(key, key1, u_version, field1, field2)
SELECT key, key1, field1, field2
FROM tablename_temp t
WHERE NOT EXISTS (SELECT key, key1
FROM tablename l
WHERE l.key = t.key AND l.key1 = t.key1)
For tablename & table_name_tempdb, I use the following:
switch -Regex ( $line ) {
'(^\s*create\s+table\s+)(?<tablename>[^(]*)' {
$table_name = $matches["tablename"]
$table_name_tempdb="${table_name}_tmp"
break
}
}
Now I want to do the same to keys and other fields. Are there any suggestions?
My idea is to scan every line from CREATE TABLE to ")" and add every word that begins with " " and ends with " " to a list. Lines before u_version are keys; the others are considered fields, and the query is built from those.
Here is a working example, based upon the information provided, that outputs the new SQL.
I tweaked your SQL (your insert as is would have failed). Also, if your input files differ from your example, the expression will need tweaking.
Reading content with -Raw allows the whole file to be parsed together.
Get-ChildItem *.sql | ForEach-Object {
# Read the whole file as one string so the regex can span lines
$filecontents = Get-Content $_.FullName -Raw
# Use $match rather than the automatic variable $matches
$match = [Regex]::Match($filecontents, "CREATE TABLE (?'tablename'[^( ]+)\((\W+(?'key'\w+)\W(?'type'[^,)])+[,)])+(\W+u_version\W([^,)])+[,)])(\W+(?'field'\w+)\W(?'type'[^,)])+[,)])+\W+CREATE UNIQUE")
$tablename = $match.Groups["tablename"].Value
# Columns before u_version are keys, columns after it are fields
$keys = $match.Groups["key"].Captures.Value
$fields = $match.Groups["field"].Captures.Value
'=================================================================='
# Here-string: starts with @" and must end with "@ at the start of a line
@"
INSERT INTO $tablename($($keys -join ','), u_version, $($fields -join ','))
SELECT $($keys -join ','), u_version, $($fields -join ',')
FROM $($tablename)_temp t
WHERE NOT EXISTS (SELECT 1
FROM $tablename l
WHERE $(($keys | ForEach-Object { "l.$_ = t.$_" }) -join ' AND '))
"@
}

regex to split name=value,* into csv of name,* and value,*

I would like to split a line such as:
name1=value1,name2=value2, .....,namen=valuen
to produce two lines as follows:
name1,name2, .....,namen
value1,value2, .....,valuen
the goal being to construct an sql insert along the lines of:
input="name1=value1,name2=value2, .....,namen=valuen"
namescsv=$( echo $input | sed 's/=[^,]*//g' )
valuescsv=$( echo $input | ?????? )
INSERT INTO table_name ( $namescsv ) VALUES ( $valuescsv )
I'd like to do this as simply as possible; perl, awk, or multiple pipes through tr, cut etc. seem too complicated. Given that the names part seems simple enough, I figure there must be something similar for the values, but I can't work it out.
You can just invert your character match:
echo $input | sed 's/[^,]*=//g'
I think your best bet is still sed -re 's/[^=,]*=([^,]*)/\1/g', though I guess the input would have to match your table exactly.
Note that in some RDBMS you can use the following syntax:
INSERT INTO table_name SET name=value, name2=value2, ...;
http://dev.mysql.com/doc/refman/5.5/en/insert.html
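The following shell script does what you are asking for and takes care of escaping (not only because of injection, but also because you may want to insert values with quotes in them). The escape_sql helper was not shown; the version below is a minimal assumed definition that doubles single quotes:
# Assumed helper: double any single quote so the value is safe inside '...'
escape_sql() { printf '%s' "$1" | sed "s/'/''/g"; }
_IFS="$IFS"; IFS=","
line="name1=value1,name2=value2,namen=valuen"
names=""; values=""
for pair in $line; do
names="$names,${pair%=*}"
values="$values,'$(escape_sql "${pair#*=}")'"
done
IFS="$_IFS"
echo "INSERT INTO table_name ( ${names#,} ) VALUES ( ${values#,} )"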
Output:
INSERT INTO table_name ( name1,name2,namen ) VALUES ( 'value1','value2','valuen' )