Powershell 5 Get-content command to select specific word from text file with select-string command I get entire line - powershell-5.0

Using PowerShell 5, when I select a specific word from a text file with Get-Content and Select-String, I get the entire line.
For example, I am running the command below and want to select only a specific word, but the output is the whole line in which the word test1 exists:
PS C:\> Get-Content C:\temp\testfile.txt | Select-String test1
hostname is test1, buildhistory 3 hours
I am looking for a command that will write only test1 to the output.

You could try this:
Get-Content C:\temp\testfile.txt |
    Select-String -Pattern '(test\d+)' |
    ForEach-Object -Process { $_.Matches[0].Value }

$failures = Get-Content "C:\Users\Documents\abc.txt" | Select-String -Pattern 'Error' -Context 0, 1
This returns each line containing the word Error in abc.txt, plus the one line after it (requested by -Context 0, 1).


Save list of txt files by referencing an array [Powershell]

I have a file that has some names (table_names.txt) whose contents are:
ALL_Dog
ALL_Cat
ALL_Fish
and another file that has some entries (test.txt) whose contents include the above names, like:
INSERT INTO ALL_Dog VALUES (1,2,3)
INSERT INTO ALL_Cat VALUES (2,3,4)
INSERT INTO ALL_Fish VALUES (3,4,5)
I need to write a for loop in powershell that creates, within my current directory three separate files: ALL_Dog.txt whose contents are "INSERT INTO ALL_Dog VALUES (1,2,3)", ALL_Cat.txt whose contents are "INSERT INTO ALL_Cat VALUES (2,3,4)", ALL_Fish.txt whose contents are "INSERT INTO ALL_Fish VALUES (3,4,5)"
Here's what I have so far:
[string[]]$tableNameArray = (Get-Content -Path '.\table_names.txt') | foreach {$_ + " VALUES"}
[string[]]$namingArray = (Get-Content -Path '.\table_names.txt') | foreach {$_}
For ($i = 0; $i -lt $tableNameArray.Length; $i++) {
    Get-Content test.txt |
        Select-String -Pattern $tableNameArray[$i] -Encoding ASCII |
        Select-Object -ExpandProperty Line |
        Out-File -LiteralPath $namingArray[$i]
}
The problem with what I currently have is that I cannot define the output files as .txt files, so my output files are just "ALL_Dog", "ALL_Cat", and "ALL_Fish".
The solution I'm looking for involves iteration through this namingArray to actually name the output files.
I feel like I'm really close to a solution and would mightily appreciate anyone's assistance or guidance to the correct result.
If I understand the question properly, you would like to get all lines from one file containing a certain table name and create a new text file with these lines, named after the table with a .txt extension, correct?
In that case, I would do something like below:
$outputPath = 'D:\Test' # the folder where the output files should go
$inputNames = 'D:\Test\table_names.txt'
$inputCommands = 'D:\Test\test.txt'
# make sure the table names from this file do not have leading or trailing whitespaces
$table_names = Get-Content -Path $inputNames | ForEach-Object { $_.Trim() }
$sqlCommands = Get-Content -Path $inputCommands
# loop through the table names
foreach ($table in $table_names) {
    # prepare the regex pattern: \b (word boundary) means you are searching for a whole word
    $pattern = '\b{0}\b' -f [regex]::Escape($table)
    # construct the output file path and name
    $outFile = Join-Path -Path $outputPath -ChildPath ('{0}.txt' -f $table)
    # get the string(s) using the pattern and write the file
    ($sqlCommands | Select-String -Pattern $pattern).Line | Out-File -FilePath $outFile -Append
}

How to display the date of each file as the first element of each line with bash/awk?

I have 7 txt files which are the output of the df -m command on AIX 7.2.
I need to keep only the first and second columns for one filesystem, so I do this:
cat *.txt | grep hd4 | awk '{print $1","$2}' > test1.txt
And the output is :
/dev/hd4,384.00
/dev/hd4,394.00
/dev/hd4,354.00
/dev/hd4,384.00
/dev/hd4,484.00
/dev/hd4,324.00
/dev/hd4,384.00
Each file is created from the crontab, and their filenames are:
df_command-2019-09-03-12:50:00.txt
df_command-2019-08-28-12:59:00.txt
df_command-2019-08-29-12:51:00.txt
df_command-2019-08-30-12:52:00.txt
df_command-2019-08-31-12:53:00.txt
df_command-2019-09-01-12:54:00.txt
df_command-2019-09-02-12:55:00.txt
I would like to keep only the date from the filename. I'm able to do this:
test=df_command-2019-09-03-12:50:00.txt
echo $test | cut -d'-' -f2,3,4
output:
2019-09-03
But I would like to put each date as the first element of each line of my test1.txt:
2019-08-28,/dev/hd4,384.00
2019-08-29,/dev/hd4,394.00
2019-08-30,/dev/hd4,354.00
2019-08-31,/dev/hd4,384.00
2019-09-01,/dev/hd4,484.00
2019-09-02,/dev/hd4,324.00
2019-09-03,/dev/hd4,384.00
Do you have any idea how to do that?
This awk may do:
awk '/hd4/ {split(FILENAME,a,"-");print a[2]"-"a[3]"-"a[4]","$1","$2}' *.txt > test1.txt
/hd4/ finds lines containing hd4
split(FILENAME,a,"-") splits the filename into array a on "-"
print a[2]"-"a[3]"-"a[4]","$1","$2 prints year-month-day, field 1, and field 2
> test1.txt redirects the output to test1.txt
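To illustrate, here is a minimal sketch using two tiny sample files named like the cron output (the file contents here are hypothetical; real df -m output has more columns, but only the first two matter to the one-liner):

```shell
# Work in a scratch directory and create two small sample files
# named like the cron output (hypothetical data)
cd "$(mktemp -d)"
printf '/dev/hd4 384.00 extra\n/dev/other 1.00 x\n' > df_command-2019-08-28-12:59:00.txt
printf '/dev/hd4 394.00 extra\n' > df_command-2019-08-29-12:51:00.txt
# The date comes from FILENAME, the fields from the matching line
awk '/hd4/ {split(FILENAME,a,"-"); print a[2]"-"a[3]"-"a[4]","$1","$2}' *.txt
# prints:
# 2019-08-28,/dev/hd4,384.00
# 2019-08-29,/dev/hd4,394.00
```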
Date output file: dates.txt
2019-08-20
2019-08-08
2019-08-01
File system data: fsys.txt
/dev/hd4,384.00
/dev/hd4,394.00
/dev/hd4,354.00
paste can be used to append the files as columns. Use -d to specify comma as the separator.
paste -d ',' dates.txt fsys.txt
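As a sketch, both helper files can be built from the pieces already shown: cut extracts the dates from the (hypothetical) cron filenames into dates.txt, and paste glues them onto the filesystem data:

```shell
cd "$(mktemp -d)"
# dates.txt: one date per line, extracted from the filenames
printf '%s\n' df_command-2019-08-28-12:59:00.txt df_command-2019-08-29-12:51:00.txt |
    cut -d'-' -f2,3,4 > dates.txt
# fsys.txt: the filesystem/size pairs produced earlier
printf '%s\n' /dev/hd4,384.00 /dev/hd4,394.00 > fsys.txt
paste -d ',' dates.txt fsys.txt
# prints:
# 2019-08-28,/dev/hd4,384.00
# 2019-08-29,/dev/hd4,394.00
```

Note that paste pairs the files line by line, so this relies on the dates and the data rows being in the same order.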

exclude sequences depending on description ID in AWK

I have fasta files which have some description IDs (isoform 2, ..., isoform 9), and I want to exclude those sequences from the fasta files.
I used this command line to see which files contain the isoform 2 to 9 IDs:
for i in `ls *.fasta`; do l=`grep 'isoform X[2-9]' $i | head -1`; echo $i $l; done | awk '(NF==1){print}' | head
Is there a way to include something in my command line to remove them all?
Thanks.
sed 's/isoform [2-9]//g' *.fasta
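If the goal is to drop the whole record (header line plus its sequence) rather than just the label, a hedged sketch, assuming a simple FASTA layout where every entry starts with '>' (input.fasta is a placeholder name), is to treat each '>' block as one awk record:

```shell
# Split the input on '>' so each FASTA entry is one awk record; keep only
# entries that do not contain "isoform X2" through "isoform X9" anywhere.
awk 'BEGIN{RS=">"; ORS=""} NR>1 && $0 !~ /isoform X[2-9]/ {print ">" $0}' input.fasta > filtered.fasta
```

This matches the pattern anywhere in the entry, which is fine as long as that text only occurs in description lines.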

Passing multiple variables to Export-CSV in Powershell

Hey guys I'm having a Powershell 2.0 problem that is driving me crazy. My objective: Create a script that determines the size of the Documents folder along with the current user on the same row, but in two different fields in a csv file. I have tried the following scripts so far:
$startFolder= "C:\Users\$env:username\Documents"
$docinfo = Get-ChildItem $startFolder -recurse | Measure-Object -property length -sum
$docinfo | Export-Csv -Path C:\MyDocSize\docsize.csv -Encoding ascii -NoTypeInformation
This script works and exports the folder size (sum), along with some columns that I don't need, "Average, Maximum, Minimum, and a property column that has "Length" as a value. Does anyone know how to just show the sum column and none of the other stuff? My main question is however, how do I pass "$env:username" into "$docinfo" and then get "$docinfo" to pass that into a CSV as an additional column and an additional value in the same row as the measurement value?
I tried this:
$startFolder= "C:\Users\$env:username\Documents"
$docinfo = Get-ChildItem $startFolder -recurse | Select-Object $env:username
$docinfo | Export-Csv -Path C:\MyDocSize\docsize.csv -Encoding ascii -NoTypeInformation
This will pass just the current username to the csv file, but without a column name, and then I can't figure out how to incorporate the measurement value with this. Also I'm not even sure why this will pass the username because if I take the "Get-ChildItem $startFolder -recurse" out it will stop working.
I've also tried this script:
$startFolder= "C:\Users\$env:username\Documents"
$docinfo = Get-ChildItem $startFolder -recurse | Measure-Object -property length -sum
New-Object -TypeName PSCustomObject -Property @{
UserName = $env:username
DocFolderSize = $docinfo
} | Export-Csv -Path C:\MyDocSize\docsize.csv -Encoding ascii -NoTypeInformation
This script will pass the username nicely with a column name of "UserName", however in the "DocFolderSize" column instead of the measurement values I get this string: Microsoft.PowerShell.Commands.GenericMeasureInfo
Not sure what to do now or how to get around this, I would be really appreciative of any help! Thanks for reading.
Give this a try:
Get-ChildItem $startFolder -Recurse | Measure-Object -property length -sum | Select Sum, @{Label="username";Expression={$env:username}}
The @{Label="username";Expression={$env:username}} part lets you set a custom column header and value.
You can customize the Sum column using the same technique:
Get-ChildItem $startFolder -Recurse | Measure-Object -property length -sum | Select @{Label="FolderSize";Expression={$_.Sum}}, @{Label="username";Expression={$env:username}}
And if you want to show the folder size in MB:
Get-ChildItem $startFolder -Recurse | Measure-Object -property length -sum | Select @{Label="FolderSize";Expression={$_.Sum / 1MB}}, @{Label="username";Expression={$env:username}}

line count within text files having multi-line and single-line records

I am using the UTL_FILE utility in Oracle to get the data into a csv file. From this script I get a set of text files.
Case 1:
A sample of the output in the test1.csv file is:
"sno","name"
"1","hari is in singapore
ramesh is in USA"
"2","pong is in chaina
chang is in malaysia
vilet is in uk"
Now I am counting the number of records in test1.csv with this Linux command:
egrep -c "^\"[0-9]" test1.csv
This gives a record count of:
2 (according to Linux)
But if I calculate the number of records with select count(*) from test; I get:
COUNT(*)
---------- (according to the database)
2
Case 2:
A sample of the output in the test2.csv file is:
"sno","name","p"
"","",""
"","","ramesh is in USA"
"","",""
Now I am counting the number of records in test2.csv with this Linux command:
egrep -c "^\"[0-9]" test2.csv
This gives a record count of:
0 (according to Linux)
But if I calculate the number of records with select count(*) from test; I get:
COUNT(*)
---------- (according to the database)
2
Can anybody help me count the exact number of records in case 1 and case 2 with a single command?
Thanks in advance.
The columns in the two cases are different. To make it generic I wrote a Perl script which prints the row count. It generates a regex from the header and uses it to count the rows. I assumed that the first line always defines the number of columns.
#!/usr/bin/perl -w
open(FH, $ARGV[0]) or die "Failed to open file";

# Get the columns from the HEADER and use them to construct the regex
my $head = <FH>;
my @col = split(",", $head); # columns array
my $col_cnt = scalar(@col);  # column count

# Read the rest of the rows
my $rows;
while (<FH>) {
    $rows .= $_;
}

# Create the regex based on the number of columns
# e.g. for 3 columns the regex should be
# ".*?",".*?",".*?"
# which matches anything between " and "
my $i = 0;
while ($i < $col_cnt) {
    $col[$i++] = "\".*?\"";
}
my $regex = join(",", @col);

# /s to treat the data as a single line
# /g for global matching
my @row_cnt = $rows =~ m/($regex)/sg;
print "Row count: " . scalar(@row_cnt);
Just store it as row_count.pl and run it as ./row_count.pl filename
egrep -c test1.csv doesn't have a search term to match for, so it's going to try to use test1.csv as the regular expression it tries to search for. I have no idea how you managed to get it to return 2 for your first example.
A useable egrep command that will actually produce the number of records in the files is egrep '"[[:digit:]]*"' test1.csv assuming your examples are actually accurate.
timp@helez:~/tmp$ cat test.txt
"sno","name"
"1","hari is in singapore
ramesh is in USA"
"2","pong is in chaina
chang is in malaysia
vilet is in uk"
timp@helez:~/tmp$ egrep -c '"[[:digit:]]*"' test.txt
2
timp@helez:~/tmp$ cat test2.txt
"sno","name"
"1","hari is in singapore"
"2","ramesh is in USA"
timp@helez:~/tmp$ egrep -c '"[[:digit:]]*"' test2.txt
2
Alternatively you might do better to add an extra value to your SELECT statement. Something like SELECT 'recmatch.,.,',sno,name FROM TABLE; instead of SELECT sno,name FROM TABLE; and then grep for recmatch.,., though that's something of a hack.
In your second example the lines do not start with " followed by a number; that's why the count is 0. You can try egrep -c "^\"([0-9]|\")" to catch empty first-column values. But in fact it might be simpler to count all lines and subtract 1 for the header row.
e.g.
count=$(( $(wc -l < test.csv) - 1 ))
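A small sketch of the count-and-subtract approach, using the single-line-per-record file from the answer above (redirecting into wc -l makes it print just the number, without the filename):

```shell
cd "$(mktemp -d)"
# Rebuild the single-line-per-record sample file
printf '"sno","name"\n"1","hari is in singapore"\n"2","ramesh is in USA"\n' > test2.txt
# Total lines minus the header line
count=$(( $(wc -l < test2.txt) - 1 ))
echo "$count"   # prints: 2
```

Note this counts physical lines, so it only matches the database count when each record occupies one line; for case 1's multi-line fields a pattern-based count is still needed.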