Azure Java SDK: container with multiple volumes - azure-storage

I need to mount 2 separate directories as volumes to a newly created container.
So far I've only found a way to mount a single volume, since withNewAzureFileShareVolume cannot be called more than once to add additional file share volumes.
Here's my code:
ContainerGroup containerGroup = azure.containerGroups()
    .define(containerName)
    .withRegion(Region.US_EAST)
    .withExistingResourceGroup("myResourceGroup")
    .withLinux()
    .withPrivateImageRegistry("registry")
    .withNewAzureFileShareVolume("aci-share", shareName)
    .defineContainerInstance(containerName)
        .withImage("image")
        .withVolumeMountSetting("aci-share", "/usr/local/dir1/")
        .withVolumeMountSetting("aci-share-2", "/usr/local/dir2/")
        .attach()
    .withDnsPrefix(team)
    .create();
A new storage account gets created with a single file share and I get the following error:
Volumes 'aci-share-2' in container 'team44783530d' are not found

Related

Automate unmounting and mounting step because of Service principal expiration in Databricks

I need to automate a task related to Azure Databricks. We have configured a job in Azure Databricks, and when my service principal secret expired my notebook failed. The next day I created new secrets, unmounted and remounted the storage, and the job worked again. I know how to create a new secret, or how to get an alert before it expires using a Logic App and change it manually.
But how can I manage the unmount --> mount step automatically? The SP can be used in multiple projects.
This is how I am mounting, and the mount_point is used across the notebook.
Here is sample code for an unmounting and mounting notebook.
It checks whether the path is already mounted; if it is not mounted yet, it mounts the path with the given condition.
def mount_blob_storage_from_sas(dbutils, storage_account_name, container_name, mount_path, sas_token, unmount_if_exists=True):
    # If the mount point is already taken, either unmount it first or bail out
    if [item.mountPoint for item in dbutils.fs.mounts()].count(mount_path) > 0:
        if unmount_if_exists:
            print('Mount point already taken - unmounting: ' + mount_path)
            dbutils.fs.unmount(mount_path)
        else:
            print('Mount point already taken - ignoring: ' + mount_path)
            return
    print('Mounting external storage in: ' + mount_path)
    dbutils.fs.mount(
        source = "wasbs://{0}@{1}.blob.core.windows.net".format(container_name, storage_account_name),
        mount_point = mount_path,
        extra_configs = {"fs.azure.sas.{0}.{1}.blob.core.windows.net".format(container_name, storage_account_name): sas_token})
Create a job to run the notebook at a given time period.
Provide the notebook path and cluster to create the job.
Schedule a time to trigger the notebook under Add Schedule.
Select the Scheduled option to provide the trigger time.
The job will then trigger at the given time and run the notebook (a programmatic sketch follows below).
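If you prefer to create that job programmatically rather than through the UI, a minimal sketch against the Databricks Jobs API 2.1 could look like this; the workspace URL, token, notebook path, cluster ID, and cron expression are all placeholder assumptions:
import requests

# Hypothetical workspace URL and personal access token
host = "https://<workspace-url>"
token = "<personal-access-token>"

job_spec = {
    "name": "remount-storage",
    "tasks": [{
        "task_key": "remount",
        "notebook_task": {"notebook_path": "/Shared/remount_storage"},
        "existing_cluster_id": "<cluster-id>",
    }],
    # Example: run every day at 06:00 UTC
    "schedule": {
        "quartz_cron_expression": "0 0 6 * * ?",
        "timezone_id": "UTC",
    },
}

resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json=job_spec)
print(resp.json())  # contains the new job_id on success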

How to add a fargate profile to an existing cluster with CDK

I want to add a new Fargate profile to an existing EKS cluster.
The cluster is created in another stack, and in my tenant-specific stack I am importing the EKS cluster via attributes.
self.cluster: Cluster = Cluster.from_cluster_attributes(
    self, 'cluster', cluster_name=cluster,
    open_id_connect_provider=eks_open_id_connect_provider,
    kubectl_role_arn=kubectl_role
)
The error is:
Object of type @aws-cdk/core.Resource is not convertible to @aws-cdk/aws-eks.Cluster
and it appears on this line:
FargateProfile(self, f"tenant-{self.tenant}", cluster=self.cluster, selectors=[Selector(namespace=self.tenant)])
If I try calling
self.cluster.add_fargate_profile(f"tenant-{self.tenant}", selectors=[Selector(namespace=self.tenant)])
I get an error saying that the object self.cluster does not have the attribute add_fargate_profile.
While you might think that something is off with importing the cluster, adding manifests and Helm charts works just fine:
self.cluster.add_manifest(...) <-- this is working
This is not currently possible in CDK.
As per the docs, eks.Cluster.fromClusterAttributes returns an ICluster, while FargateProfile expects a Cluster explicitly.
Currently a FargateCluster can only be created in CDK, not imported.
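A common workaround, when both stacks live in the same CDK app, is to hand the concrete Cluster construct from the cluster stack to the tenant stack instead of re-importing it by attributes. A minimal sketch, with stack and property names invented for illustration:
from aws_cdk import core
from aws_cdk.aws_eks import Cluster, FargateProfile, Selector

class TenantStack(core.Stack):
    def __init__(self, scope, construct_id, *, cluster: Cluster, tenant: str, **kwargs):
        super().__init__(scope, construct_id, **kwargs)
        # 'cluster' is the real Cluster object created in the other stack,
        # so FargateProfile accepts it directly.
        FargateProfile(
            self, f"tenant-{tenant}",
            cluster=cluster,
            selectors=[Selector(namespace=tenant)])

# In the app entry point (hypothetical ClusterStack exposing .cluster):
# app = core.App()
# cluster_stack = ClusterStack(app, "cluster-stack")
# TenantStack(app, "tenant-stack", cluster=cluster_stack.cluster, tenant="team-a")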

Amazon SageMaker notebook rl_deepracer_coach_robomaker - Write log CSV on S3 after simulation

I created my first notebook instance on Amazon SageMaker.
Next I opened the Jupyter notebook and used the SageMaker example rl_deepracer_coach_robomaker.ipynb from the Reinforcement Learning section. The question is addressed principally to those who are familiar with this notebook.
There you can launch a training process and a RoboMaker simulation application to start the learning process for an autonomous car.
When a simulation job is launched, one can access the log file, which is visualised by default in the CloudWatch console. Some of the information that appears in the log file can be modified in the script deepracer_env.py in the /src/robomaker/environments subdirectory.
I would like to "bypass" the CloudWatch console and save log information such as episode, total reward, number of steps, coordinates of the car, steering, throttle, etc. in a dataframe or CSV file to be written somewhere on S3 at the end of the simulation.
Something similar has been done in the main notebook rl_deepracer_coach_robomaker.ipynb to plot the metrics for a training job, namely the training reward per episode. There one can see that
csv_file_name = "worker_0.simple_rl_graph.main_level.main_level.agent_0.csv"
is read from S3, but I simply cannot find where this CSV is generated so that I can mimic the process.
You can create a csv file in the /opt/ml/output/intermediate/ folder, and the file will be saved in the following directory:
s3://<s3_bucket>/<s3_prefix>/output/intermediate/<csv_file_name>
However, it is not clear to me where exactly you will create such a file. The DeepRacer notebook uses two machines, one for training (the SageMaker instance) and one for simulations (the RoboMaker instance). The above method will only work on the SageMaker instance, but much of what you would like to log (such as the total reward in an episode) actually lives on the RoboMaker instance. For RoboMaker instances, the intermediate folder feature doesn't exist, and you'll have to save the file to S3 yourself using the boto library. Here is an example of doing that: https://qiita.com/hengsokvisal/items/329924dd9e3f65dd48e7
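For the upload itself, a minimal boto3 sketch along the lines of that example; the local path, bucket, and key are placeholders:
import boto3

# Hypothetical local file and S3 destination
local_path = "/tmp/deepracer_episode_log.csv"
bucket = "<s3_bucket>"
key = "<s3_prefix>/logs/deepracer_episode_log.csv"

# Upload the CSV written during the simulation to S3
s3 = boto3.client("s3")
s3.upload_file(local_path, bucket, key)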
There is a way to download the CloudWatch logs to a file. This way you can just print, save, and parse the logs. Assuming you are executing from a notebook cell:
import json

STREAM_NAME = <your stream name as given by RoboMaker CloudWatch logs>
task = !aws logs create-export-task --task-name "copy_deepracer_logs" --log-group-name "/aws/robomaker/SimulationJobs" --log-stream-name-prefix $STREAM_NAME --destination "<s3_bucket>" --destination-prefix "<s3_prefix>" --from <unix timestamp in milliseconds> --to <unix timestamp in milliseconds>
task_id = json.loads(''.join(task))['taskId']
The export is an asynchronous call, so give it a few minutes to complete. If the task_id printed, the export task was created, and you can use it to check when the export is done.
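If you want to wait for it from the notebook, a small sketch that polls the export task with boto3 (task_id comes from the cell above):
import time
import boto3

logs = boto3.client("logs")

# Poll the export task until CloudWatch Logs reports a terminal status
while True:
    task = logs.describe_export_tasks(taskId=task_id)["exportTasks"][0]
    status = task["status"]["code"]
    print("Export status:", status)
    if status in ("COMPLETED", "FAILED", "CANCELLED"):
        break
    time.sleep(30)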

Using Microsoft Sync Framework to sync files across network

The file synchronization example given here - http://code.msdn.microsoft.com/Release/ProjectReleases.aspx?ProjectName=sync&ReleaseId=3424 only talks about syncing files on the same machine. Has anyone come across a working example of using something like WCF to enable this to work for files across a network?
Bryant's example - http://bryantlikes.com/archive/2008/01/03/remote-file-sync-using-wcf-and-msf.aspx - is not complete, covers only a one-way sync, and is less than ideal.
The Sync framework can synchronize files across the network as long as you have an available network share.
In the constructor of the FileSyncProvider set the rootDirectoryPath to a network share location that you have read and write permissions to:
string networkPath = @"\\machinename\sharedfolderlocation";
FileSyncProvider provider = new FileSyncProvider(networkPath);
To do a two way sync in this fashion you will need to create a FileSyncProvider for both the source and destination systems and use the SyncOrchestrator to do the heavy lifting for you.
An example:
string firstLocation = @"\\sourcemachine\sourceshare";
string secondLocation = @"\\sourcemachine2\sourceshare2";
FileSyncProvider firstProvider = new FileSyncProvider(firstLocation);
FileSyncProvider secondProvider = new FileSyncProvider(secondLocation);
SyncOrchestrator orchestrator = new SyncOrchestrator();
orchestrator.LocalProvider = firstProvider;
orchestrator.RemoteProvider = secondProvider;
orchestrator.Direction = SyncDirectionOrder.DownloadAndUpload;
This defines two file sync providers, and the orchestrator syncs the files in both directions. It tracks creations, modifications, and deletions of files in the directories set on the providers.
All that is needed at this point is to call Synchronize on the SyncOrchestrator:
orchestrator.Synchronize();

Snapshots on Amazon EC2

I used CreateImageRequest to take a snapshot of a running EC2 machine. When I log into the EC2 console I see the following:
AMI - An image that I can launch
Volume - I believe that this is the disk image?
Snapshot - Another entry related to the snapshot?
Can anyone explain the difference in usage of each of these? For example, is there any way to create a 'snapshot' without also having an associated 'AMI', and in that case how do I launch an EBS-backed copy of this snapshot?
Finally, is there a simple API to delete an AMI and all associated data (snapshot, volume and AMI)? It turns out that our scripts only store the AMI identifier, and not the rest of the data, so it seems that's only enough information to just deregister the image.
The AMI represents the launchable machine configuration - it does NOT actually contain any of the machine's data, just references to it. An AMI can get its disk image either from S3 or (in your case) an EBS snapshot.
The EBS Volume is associated with a running instance. It's basically a read-write disk image. When you terminate the instance, the volume will automatically be destroyed (this may take a few minutes, note).
The snapshot is a frozen image of the EBS volume at the point in time when you created the AMI. Snapshots can be associated with AMIs, but not all snapshots are part of an AMI - you can create them manually too.
More information on EBS-backed AMIs can be found in the user's guide. It is important to have a good grasp on these concepts, so I would recommend giving the entire users guide a good read-over before going any further.
If you want to delete all data associated with an AMI, you will have to use the DescribeImageAttribute API call on the AMI's blockDeviceMapping attribute to find the snapshot ID; then delete the AMI and snapshot, in that order.
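A minimal boto3 sketch of that clean-up sequence; it reads the block device mapping via describe_images (which exposes the same information as DescribeImageAttribute for this purpose), and the AMI ID is a placeholder:
import boto3

ec2 = boto3.client("ec2")
ami_id = "ami-XXXX"  # placeholder: the AMI you want to clean up

# Collect the snapshot IDs referenced by the AMI's block device mapping
image = ec2.describe_images(ImageIds=[ami_id])["Images"][0]
snapshot_ids = [
    mapping["Ebs"]["SnapshotId"]
    for mapping in image.get("BlockDeviceMappings", [])
    if "Ebs" in mapping
]

# Deregister the AMI first, then delete its snapshots
ec2.deregister_image(ImageId=ami_id)
for snapshot_id in snapshot_ids:
    ec2.delete_snapshot(SnapshotId=snapshot_id)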
This small PowerShell script takes the AMI ID (stored in a variable), grabs the snapshots of the given AMI by storing them in an array, and finally performs the required clean-up (unregisters the AMI and removes the snapshots).
# Unregister and clean AMI snapshots
$amiName = 'ami-XXXX' # replace this with the AMI ID you need to clean-up
$myImage = Get-EC2Image $amiName
$count = $myImage[0].BlockDeviceMapping.Count
# Loop and store snapshotID(s) to an array
$mySnaps = @()
for ($i=0; $i -lt $count; $i++)
{
$snapId = $myImage[0].BlockDeviceMapping[$i].Ebs | foreach {$_.SnapshotId}
$mySnaps += $snapId
}
# Perform the clean up
Write-Host "Unregistering" $amiName
Unregister-EC2Image $amiName
foreach ($item in $mySnaps)
{
Write-Host 'Removing' $item
Remove-EC2Snapshot $item
}
Clear-Variable mySnaps