How to avoid a PowerShell confirmation prompt while copying data - azure-storage

How can I avoid this confirmation message while copying data to another blob? Using -Confirm prompts before overwriting the file, but I don't want to overwrite at all; I just want to skip a file that already exists and move on to the next one.
Message from PowerShell:
Confirm
Are you sure to overwrite 'https://destinationblob111.blob.core.windows.net/container1/file1.doc'?
[Y] Yes  [N] No  [S] Suspend  [?] Help (default is "Y"):
 
The code I am using is below:
$BlobCopy = Start-CopyAzureStorageBlob -Context $SourceStorageContext -SrcContainer $ContainerName -SrcBlob $BlobName -DestContext $DestStorageContext -DestContainer $ContainerName -DestBlob $BlobName -Confirm
$BlobCpyAry += $BlobCopy

If you don't want to overwrite the destination blob, you can check whether the destination blob already exists and only start the copy when it does not, as sketched below.
If you do want to overwrite, adding -Force will overwrite the destination blob without prompting.
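A minimal sketch of that check, assuming the same variables as in your snippet and the classic Azure.Storage cmdlets (Get-AzureStorageBlob alongside Start-CopyAzureStorageBlob):

# Look up the blob in the destination container; $null means it does not exist yet.
$existing = Get-AzureStorageBlob -Context $DestStorageContext -Container $ContainerName -Blob $BlobName -ErrorAction SilentlyContinue

if ($null -eq $existing) {
    # Not there yet - start the copy. -Force suppresses the overwrite prompt
    # in case the blob appears between the check and the copy.
    $BlobCopy = Start-CopyAzureStorageBlob -Context $SourceStorageContext -SrcContainer $ContainerName -SrcBlob $BlobName -DestContext $DestStorageContext -DestContainer $ContainerName -DestBlob $BlobName -Force
    $BlobCpyAry += $BlobCopy
}
else {
    # Already exists - skip it and move on to the next blob.
    Write-Verbose "Skipping $BlobName - destination blob already exists."
}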

Related

Microsoft R - Tidyverse: If a data file fails to load, create an empty tibble/table in its place

I am really not sure how to phrase this concisely. My question is: is it possible to add error handling so that if a data file (such as a CSV) fails to load as a table/tibble, a blank version of it is created instead?
Here is what I mean:
My normal csv load looks like this:
Monday2 <- paste0("my_file_location/my_file_name",Monday,".csv")
leads1 <- tibble(read.csv(Monday2))
Tuesday2 <- paste0("my_file_location/My_file_name",Tuesday,".csv")
leads2 <- tibble(read.csv(Tuesday2))
Wednesday2 <- paste0("my_file_location/my_file_name",Wednesday,".csv")
leads3 <- tibble(read.csv(Wednesday2))
If for some reason my csv failed to load (the file doesn't exist, or I entered the name incorrectly for example) can a blank version of it be created?
My idea for the blank tibble would look like this:
Leads21 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
Leads22 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
Leads23 <- tibble("Column1"= "", "Column2"= "", "Column3"= "")
This blank tibble would have the exact same columns as a properly loaded file. I have 5 files I bind each Friday in an automated process, and if a file fails to load I can catch it downstream in my process (one of the columns is the file name/date), but I don't want the whole process to fail.
A typical 'failed to load' error looks like this:
In file(file, "rt") : cannot open file 'my_file_location/My_file_name_2022-03-27.csv': No such
file or directory
The bind of all 5 files then fails with an error message like:
### Join full weeks worth of leads into 1 file
Leads <- bind_rows(leads1,leads2,leads3, leads4, leads5)
Error in list2(...) : object 'leads1' not found
This then causes the rest of my code to fail or act incorrectly. If I can bind an empty tibble instead, my code can finish running and I can check for missing files at the end. Ultimately, a missing file is not as important as processing the existing files (so stopping my code to locate/fix the failed load is not a priority).
My background is in Microsoft Access VBA and I keep trying to write something like:
If tibble Leads1 exists, use it; if tibble Leads1 does not exist, use Leads21.
I'm not sure how to do this in R. I have been trying to read/understand the try() wrapper, but I don't understand how to use it in my case.
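One way to do this in R is to wrap each read.csv() call in tryCatch() and fall back to a pre-built empty tibble when the read fails. A minimal sketch, assuming the files really have the three columns shown above (read_or_blank is just a hypothetical helper name):

library(dplyr)   # provides tibble(), as_tibble() and bind_rows()

# Hypothetical helper: try to read the CSV; if anything goes wrong (missing file,
# misspelled name, ...), return a zero-row tibble with the same columns so the
# later bind_rows() still works. Adjust the column names/types to match the real files.
read_or_blank <- function(path) {
  blank <- tibble(Column1 = character(), Column2 = character(), Column3 = character())
  tryCatch(
    as_tibble(read.csv(path)),
    error = function(e) blank
  )
}

Monday2 <- paste0("my_file_location/my_file_name", Monday, ".csv")
leads1  <- read_or_blank(Monday2)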

How to use the taildir source in Flume to append only newest lines of a .txt file?

I recently asked the question Apache Flume - send only new file contents
I am rephrasing the question in order to learn more and provide more benefit to future users of Flume.
Setup: Two servers, one with a .txt file that gets lines appended to it regularly.
Goal: Use flume TAILDIR source to append the most recently written line to a file on the other server.
Issue: Whenever the source file has a new line of data added, the current configuration appends everything in the file on server 1 to the file on server 2. This results in duplicate lines in file 2 and does not properly recreate the file from server 1.
Configuration on server 1:
#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1
#using memory channel to hold up to 1000 events
agent.channels.k1.type=memory
agent.channels.k1.capacity=1000
agent.channels.k1.transactionCapacity=100
#connect source, channel,sink
agent.sources.r1.channels=k1
agent.sinks.c1.channel=k1
#define source
agent.sources.r1.type=TAILDIR
agent.sources.r1.channels=k1
agent.sources.r1.filegroups=f1
agent.sources.r1.filegroups.f1=/home/tail_test_dir/test.txt
agent.sources.r1.maxBackoffSleep=1000
#connect to another box using avro and send the data
agent.sinks.c1.type=avro
agent.sinks.c1.hostname=10.10.10.4
agent.sinks.c1.port=4545
Configuration on server 2:
#configure the agent
agent.sources=r1
agent.channels=k1
agent.sinks=c1
#using memory channel to hold up to 1000 events
agent.channels.k1.type=memory
agent.channels.k1.capacity=1000
agent.channels.k1.transactionCapacity=100
#connect source, channel, sink
agent.sources.r1.channels=k1
agent.sinks.c1.channel=k1
#here source is listening at the specified port using AVRO for data
agent.sources.r1.type=avro
agent.sources.r1.bind=0.0.0.0
agent.sources.r1.port=4545
#use file_roll and write file at specified directory
agent.sinks.c1.type=file_roll
agent.sinks.c1.sink.directory=/home/Flume_dump
You have to set a position JSON file. The source then checks the recorded position and writes only the newly added lines to the sink, for example:
agent.sources.s1.positionFile = /var/log/flume/tail_position.json
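In the server 1 configuration above, that corresponds to adding a positionFile property to the r1 source (the path below is just an example; any directory the Flume agent can write to will do):

#define source, tracking the last read position so only new lines are shipped
agent.sources.r1.type=TAILDIR
agent.sources.r1.channels=k1
agent.sources.r1.positionFile=/home/flume/taildir_position.json
agent.sources.r1.filegroups=f1
agent.sources.r1.filegroups.f1=/home/tail_test_dir/test.txt
agent.sources.r1.maxBackoffSleep=1000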

Bigquery error (ASCII 0) encountered for external table and when loading table

I'm getting this error
"Error: Error detected while parsing row starting at position: 4824. Error: Bad character (ASCII 0) encountered."
The data is not compressed.
My external table points to multiple CSV files, and one of them contains a couple of lines with that character. In my table definition I added "MaxBadRecords", but that had no effect. I also get the same problem when loading the data in a regular table.
I know I could use DataFlow or even try to fix the CSVs, but is there an alternative that does not involve writing a parser, and is hopefully just as easy and efficient?
Try the following in the Google Cloud SDK Shell (using the tr utility):
gsutil cp gs://bucket/badfile.csv - | tr -d '\000' | gsutil cp - gs://bucket/fixedfile.csv
This will:
read your "bad" file,
remove the ASCII 0 characters, and
save the "fixed" content into a new file.
Once you have the new file, just make sure your table points to that fixed one.
Sometimes a stray final null byte appears in the file. What can help is replacing it with:
tr '\0' ' ' < file1 > file2
You can clean the file using an external tool such as Python or PowerShell; there is no way to load a file containing ASCII 0 into BigQuery.
This is a script that cleans the file with Python:
import os
import shutil
from tempfile import mkstemp

def replace_chars(file_path, original_string, new_string):
    # Create temp file
    fh, abs_path = mkstemp()
    with os.fdopen(fh, 'w', encoding='utf-8') as new_file:
        with open(file_path, encoding='utf-8', errors='replace') as old_file:
            print("\nCurrent line: \t")
            i = 0
            for line in old_file:
                print(i, end="\r", flush=True)
                i = i + 1
                line = line.replace(original_string, new_string)
                new_file.write(line)
    # Copy the file permissions from the old file to the new file
    shutil.copymode(file_path, abs_path)
    # Remove the original file
    os.remove(file_path)
    # Move the new file into place
    shutil.move(abs_path, file_path)
The same but for PowerShell:
(Get-Content "C:\Source.DAT") -replace "`0", " " | Set-Content "C:\Destination.DAT"

SQL - How to attach FileStream enabled db without log file

I'm trying to attach a FileStream enabled database without a log file. My SQL looks something like this:
USE master
CREATE DATABASE MyDB
ON PRIMARY(NAME = N'MyDB', FILENAME = 'C:\myDB.MDF' ),
FILEGROUP myFileGroup CONTAINS FILESTREAM ( NAME = myData, FILENAME = 'C:\myFileGroup')
For Attach
Here is the error I'm receiving:
Msg 5173, Level 16, State 3, Line 2
One or more files do not match the primary file of the database.
If you are attempting to attach a database, retry the operation with the correct files.
If this is an existing database, the file may be corrupted and should be restored from a backup.
Does anyone know if it's possible to attach a FileStream enabled database without the original log file?
Try this blog post:
http://blog.sqlauthority.com/2010/04/26/sql-server-attach-mdf-file-without-ldf-file-in-database/
I would personally go with this one:
CREATE DATABASE TestDb ON
(FILENAME = N'C:\Database\Test\TestDb.mdf')
FOR ATTACH_REBUILD_LOG
GO
And when you have your log rebuilt, you can enable FILESTREAM, or try to reattach while also specifying the FILESTREAM location, as sketched below.
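A rough sketch of that second option, reusing the file and filegroup names from the question (the paths are the question's own, and rebuilding the log still requires the database to have been shut down cleanly):

USE master
GO
CREATE DATABASE MyDB
ON PRIMARY (NAME = N'MyDB', FILENAME = 'C:\myDB.MDF'),
FILEGROUP myFileGroup CONTAINS FILESTREAM (NAME = myData, FILENAME = 'C:\myFileGroup')
FOR ATTACH_REBUILD_LOG
GO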

MS SQL Server use old log file location after detach/copy/attach

I create database "Test" in folder "d:\test". The database files are "d:\test\Test.mdf" and "d:\Test\Test_log.ldf". I detach the database from MS SQL Server 2008 R2, copy all files to a new folder ("d:\test_new"), delete the log file ("d:\test_new\Test_log.ldf"), and try to attach the database again from the new location. When I use SQL Server Management Studio and choose the "d:\test_new\Test.mdf" file, it determines that the log file is located at "d:\test\Test_log.ldf" (the old location). How can I attach this database and rebuild the log in the new location? Just imagine that I cannot copy the ldf file again to the new location, and that it is still available in the old one, so SQL Server sees it anyway. I want to say to SQL Server: "please forget that log file, and create a new log file here". A T-SQL script would be best, but if the answer is steps in Management Studio, I will convert them to a script myself.
What I have already tried:
1.
CREATE DATABASE [test]
ON ( FILENAME = N'D:\test_new\test.mdf' )
FOR ATTACH_REBUILD_LOG
attaches the log file from the old location (FOR ATTACH does the same)
2.
CREATE DATABASE [test]
ON ( FILENAME = N'D:\test_new\test.mdf' )
LOG ON ( FILENAME = N'D:\test_new\test_log.ldf' )
FOR ATTACH_REBUILD_LOG
returns an error: Unable to open the physical file "D:\test_new\test_log.ldf". Operating system error 2: "2(File not found.)".
3.
sp_attach_db and sp_attach_single_file_db
were tried too. I even checked their source code - they just build dynamic SQL and call a CREATE DATABASE ... FOR ATTACH statement.
The question is slowly changing into: "Is it possible?"
UPDATE
Well, it looks like it's not possible with current versions of SQL Server. If anybody knows a way to do it, please share - I will be very pleased to know it too!
Edit2: To my knowledge, it is not possible for SQL Server to recreate a log file. It can shrink the ldf, but not create it when only the mdf exists.
When you copy your files from d:\test\ to d:\test_new\, do not delete the d:\test_new\Test_log.ldf.
Leave the log file there, because you cannot reattach the new DB without that log file. Afterwards, you can shrink that log to a minimum size.
So, to synthesize:
1. Copy your files from d:\test\ to d:\test_new\ and leave the log file there.
2. Run the create database script that you posted in your question (point 2).
3. Run the following script to shrink the log to a minimum size:
USE test
GO
DBCC SHRINKFILE(logicalFileName, 1)
GO
To find out what logicalFileName is, run sp_helpfile; it will give you the logical file name for your log file:
USE test
GO
EXEC sp_helpfile
GO
Edit:
I think you first need to detach the test database from the old location (you might create a script that does it all from the following commands):
C:\> osql -E
1> sp_detach_db 'test'
2> go
3> quit
C:\>
Then copy the files to the new location.
C:\> copy d:\test\* d:\test_new\*
Next, attach the test DB to the new path location:
C:\> osql -E
1> sp_attach_db @dbname = N'test', @filename1 = N'd:\test_new\Test.mdf', @filename2 = N'd:\test_new\Test_log.ldf'
2> go
3> quit
C:\>
To test whether the new database was successfully attached:
C:\> osql -E
1> use test
2> go
3> quit
C:\>
If there are no errors after the go command, then all is OK.
Hope this helps.
Microsoft article on how to move files
The users must copy BOTH the .mdf and .ldf files. They then have to use the following command (one or the other).
sp_attach_db (deprecated; use CREATE DATABASE ... FOR ATTACH in the future)
EXEC sp_attach_db @dbname = 'dbname', @filename1 = 'd:\test_new\test.mdf', @filename2 = 'd:\test_new\test.ldf'
This will result in the database using the data file (mdf) and transaction log (ldf) from the \test_new directory.
CREATE DATABASE FOR ATTACH
CREATE DATABASE dbname ON (FILENAME = 'd:\test_new\test.mdf'), (FILENAME = 'd:\test_new\test.ldf') FOR ATTACH