Couldn't query a file from Azure Blob Storage (dbt)

Here is the detailed config for the yml files.
--- the profiles.yml config ---
default: dbt_project
dbt_project:
  target: dev
  outputs:
    dev:
      type: synapse   #synapse #type: Azuresynapse
      driver: 'ODBC Driver 17 for SQL Server'   # (The ODBC Driver installed on your system)
      server: XXXXXXX
      database: XXXXXXX
      port: 1433
      schema: XXXXXXX
      #authentication: sqlpassword
      user: XXXXXXX
      password: XXXXXXX
    azure_blob:
      type: azure_blob
      account_name: XXXXXXX
      account_key: XXXXXXX
      container: data-platform-archive   #research-container/Bronze/Freedom/ABS_VESSEL/
      prefix: abc/FGr1/fox/
--------------- dbt_project.yml -------------------------
# ... name or the intended use of these models
name: 'dbt_project'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'dbt_project'

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

target-path: "target"
clean-targets:
  - "target"
  - "dbt_packages"

models:
  dbt_project:
    staging:
      +materialized: table
    utilities:
      +materialized: view
  azure_Blob:
    staging:
      +materialized: view
--------------- the model (dbt_stg_DL_abs_acm_users) ---------------
{{ config(
    materialized='view',
    connection='azure_blob'
) }}

select *
from {{ source('data-platform-archive/abc/FGr1/fox/', 'abc.parquet') }}
Compilation Error in model dbt_stg_DL_abs_acm_users
Model 'model.dbt_project.dbt_stg_DL_abs_acm_users' depends on a source named 'abc.parquet' which was not found
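For context (this is not from the post itself): dbt's source() function expects the name of a source declared in a sources yml file plus a table within that source, rather than a blob path and a file name. A declaration typically looks roughly like the sketch below, with purely illustrative names:

version: 2

sources:
  - name: data_platform_archive    # illustrative source name
    schema: external_archive       # illustrative schema / external location
    tables:
      - name: abs_vessel           # illustrative table name

# which a model would then reference as:
#   select * from {{ source('data_platform_archive', 'abs_vessel') }}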


Tekton build Docker image with Kaniko - please provide a valid path to a Dockerfile within the build context with --dockerfile

I am new to Tekton (https://tekton.dev/) and I am trying to:
1. Clone the repository
2. Build a Docker image with the Dockerfile
I have a Tekton pipeline, and when I try to execute it, I get the following error:
Error: error resolving dockerfile path: please provide a valid path to a Dockerfile within the build context with --dockerfile
Please find the Tekton manifests below:
1. Pipeline.yml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: clone-read
spec:
  description: |
    This pipeline clones a git repo, then echoes the README file to the stdout.
  params:
    - name: repo-url
      type: string
      description: The git repo URL to clone from.
    - name: image-name
      type: string
      description: for Kaniko
    - name: image-path
      type: string
      description: path of Dockerfile for Kaniko
  workspaces:
    - name: shared-data
      description: |
        This workspace contains the cloned repo files, so they can be read by the
        next task.
  tasks:
    - name: fetch-source
      taskRef:
        name: git-clone
      workspaces:
        - name: output
          workspace: shared-data
      params:
        - name: url
          value: $(params.repo-url)
    - name: show-readme
      runAfter: ["fetch-source"]
      taskRef:
        name: show-readme
      workspaces:
        - name: source
          workspace: shared-data
    - name: build-push
      runAfter: ["show-readme"]
      taskRef:
        name: kaniko
      workspaces:
        - name: source
          workspace: shared-data
      params:
        - name: IMAGE
          value: $(params.image-name)
        - name: CONTEXT
          value: $(params.image-path)
2. PipelineRun.yml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: clone-read-run
spec:
  pipelineRef:
    name: clone-read
  podTemplate:
    securityContext:
      fsGroup: 65532
  workspaces:
    - name: shared-data
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
    # - name: git-credentials
    #   secret:
    #     secretName: git-credentials
  params:
    - name: repo-url
      value: https://github.com/iamdempa/tekton-demos.git
    - name: image-name
      value: "python-test"
    - name: image-path
      value: $(workspaces.shared-data.path)/BuildDockerImage2
And here's my repository structure:
. . .
.
├── BuildDockerImage2
│ ├── 1.show-readme.yml
│ ├── 2. Pipeline.yml
│ ├── 3. PipelineRun.yml
│ └── Dockerfile
├── README.md
. . .
7 directories, 25 files
Could someone help me figure out what is wrong here?
Thank you.
I was able to find the issue. It was with the way I had provided the path.
In the kaniko task, the CONTEXT parameter determines the path of the Dockerfile. Its default value is ./, and the workspace path is prefixed to it as below:
$(workspaces.source.path)/$(params.CONTEXT)
That means the workspace path is already being prepended, so I don't need to include it myself, as I had done in the image-path value below:
$(workspaces.shared-data.path)/BuildDockerImage2
Instead, I had to put just the folder name, as below:
- name: image-path
  value: BuildDockerImage2
This fixed the problem I had.
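For reference, with that change the params section of the PipelineRun would look roughly like this (the repository URL and image name are the ones from the question; only image-path changes):

params:
  - name: repo-url
    value: https://github.com/iamdempa/tekton-demos.git
  - name: image-name
    value: "python-test"
  - name: image-path
    value: BuildDockerImage2   # resolved by the kaniko task as $(workspaces.source.path)/BuildDockerImage2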

Install dbt vault

I am following a tutorial that can be found here: dbtset_up. It is a tutorial that I will run on Snowflake, so that I can provide the metadata and have it generate the SQL and the required links and hubs for me. dbtvault has already been added to the packages.yml file, and the project yml file looks like this:
name: dbtvault_snowflake_demo
profile: dbtvault_snowflake_demo
version: '5.3.0'
require-dbt-version: ['>=1.0.0', '<2.0.0']
config-version: 2

analysis-paths:
  - analysis
clean-targets:
  - target
seed-paths:
  - seeds
macro-paths:
  - macros
model-paths:
  - models
test-paths:
  - tests
target-path: target

vars:
  load_date: '1992-01-08'
  tpch_size: 10   #1, 10, 100, 1000, 10000

models:
  dbtvault_snowflake_demo:
    raw_stage:
      tags:
        - 'raw'
      materialized: view
    stage:
      tags:
        - 'stage'
      enabled: true
      materialized: view
    raw_vault:
      tags:
        - 'raw_vault'
      materialized: incremental
      hubs:
        tags:
          - 'hub'
      links:
        tags:
          - 'link'
      sats:
        tags:
          - 'satellite'
      t_links:
        tags:
          - 't_link'
I am getting an error when I run this command:
dbt depsdbt
The error is as follows:
usage: dbt [-h] [--version] [-r RECORD_TIMING_INFO] [-d] [--log-format {text,json,default}]
           [--no-write-json] [--use-colors | --no-use-colors] [--printer-width PRINTER_WIDTH]
           [--warn-error] [--no-version-check] [--partial-parse | --no-partial-parse]
           [--use-experimental-parser] [--no-static-parser] [--profiles-dir PROFILES_DIR]
           [--no-anonymous-usage-stats] [-x] [--event-buffer-size EVENT_BUFFER_SIZE] [-q]
           [--no-print] [--cache-selected-only | --no-cache-selected-only]
           {docs,source,init,clean,debug,deps,list,ls,build,snapshot,run,compile,parse,test,seed,run-operation} ...
dbt: error: argument {docs,source,init,clean,debug,deps,list,ls,build,snapshot,run,compile,parse,test,seed,run-operation}:
invalid choice: 'depsdbt' (choose from 'docs', 'source', 'init', 'clean', 'debug', 'deps', 'list', 'ls', 'build',
'snapshot', 'run', 'compile', 'parse', 'test', 'seed', 'run-operation')
I believe you tried to execute dbt depsdbt, which is not a command. The command is:
dbt deps
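As an aside, a packages.yml entry for dbtvault usually looks something like the sketch below; the version shown is illustrative, so check dbt Hub for the release that matches your dbt version:

packages:
  - package: Datavault-UK/dbtvault
    version: 0.8.3   # illustrative; pick the version compatible with your dbt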

How to attach a volume to docker running in tekton pipelines

I have a problem attaching a volume to a Docker container running inside Tekton Pipelines. I have used the task below:
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: distributor-base
  namespace: cicd
  labels:
    app.kubernetes.io/version: "0.1"
  annotations:
    tekton.dev/pipelines.minVersion: "0.12.1"
    tekton.dev/platforms: "linux/amd64"
spec:
  params:
    - name: builder_image
      description: The location of the docker builder image.
      default: docker:stable
    - name: dind_image
      description: The location of the docker-in-docker image.
      default: docker:dind
    - name: context
      description: Path to the directory to use as context.
      default: .
  workspaces:
    - name: source
  steps:
    - name: docker-build
      image: docker
      env:
        # Connect to the sidecar over TCP, with TLS.
        - name: DOCKER_HOST
          value: tcp://localhost:2376
        # Verify TLS.
        - name: DOCKER_TLS_VERIFY
          value: '1'
        # Use the certs generated by the sidecar daemon.
        - name: DOCKER_CERT_PATH
          value: /certs/client
        - name: DOCKER_USER
          valueFrom:
            secretKeyRef:
              key: username
              name: docker-auth
        - name: DOCKER_TOKEN
          valueFrom:
            secretKeyRef:
              key: password
              name: docker-auth
        - name: DIND_CONFIG
          valueFrom:
            configMapKeyRef:
              key: file
              name: dind-env
      workingDir: $(workspaces.source.path)
      args:
        - --storage-driver=vfs
        - --debug
      securityContext:
        privileged: true
      script: |
        #!/usr/bin/env sh
        set -e
        pwd
        ls -ltr /workspace/source
        docker run --privileged -v "/workspace/source:/workspace" busybox ls -ltr /workspace
      volumeMounts:
        - mountPath: /certs/client
          name: dind-certs
  sidecars:
    - image: $(params.dind_image)
      name: server
      args:
        - --storage-driver=vfs
        - --debug
        - --userland-proxy=false
      resources:
        requests:
          memory: "512Mi"
      securityContext:
        privileged: true
      env:
        # Write generated certs to the path shared with the client.
        - name: DOCKER_TLS_CERTDIR
          value: /certs
      volumeMounts:
        - mountPath: /certs/client
          name: dind-certs
      # Wait for the dind daemon to generate the certs it will share with the
      # client.
      readinessProbe:
        periodSeconds: 1
        exec:
          command: ['ls', '/certs/client/ca.pem']
  volumes:
    - name: dind-certs
      emptyDir: {}
In the above task, the workspace comes from another git-clone task:
workspaces:
  - name: source
In this task, I am trying to run a Docker image that has access to the workspace folder, because I have to modify some files in that folder.
When we look into the script:
pwd
ls -ltr /workspace/source
docker run --privileged -v "/workspace/source:/workspace"
Below is the console log of the above 3 commands:
workspace/source
total 84
-rwxr-xr-x 1 50381 50381 3206 Jun 1 10:13 README.md
-rwxr-xr-x 1 50381 50381 10751 Jun 1 10:13 Jenkinsfile.next
-rwxr-xr-x 1 50381 50381 5302 Jun 1 10:13 wait-for-it.sh
drwxr-xr-x 4 50381 50381 6144 Jun 1 10:13 overlays
-rwxr-xr-x 1 50381 50381 2750 Jun 1 10:13 example-distributor.yaml
drwxr-xr-x 5 50381 50381 6144 Jun 1 10:13 bases
-rw-r--r-- 1 50381 50381 0 Jun 1 10:13 semantic.out
-rw-r--r-- 1 50381 50381 44672 Jun 1 10:13 final.yaml
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
462eb288b104: Pulling fs layer
462eb288b104: Verifying Checksum
462eb288b104: Download complete
462eb288b104: Pull complete
Digest: sha256:ebadf81a7f2146e95f8c850ad7af8cf9755d31cdba380a8ffd5930fba5996095
Status: Downloaded newer image for busybox:latest
total 0
Basically, the pwd command gives me results, and the ls -ltr command also gives me results,
but when I try to attach the /workspace/source folder as a volume to the busybox container, I am not able to see its contents.
Since I have attached a volume onto the directory /workspace, I would expect the contents of the local folder /workspace/source, but I see 0 results in the above log.
Basically, the volume is not getting attached properly.
Can anyone please help me fix this issue?
Below is my pipeline run, triggered by tekton-triggers:
apiVersion: triggers.tekton.dev/v1alpha1
kind: TriggerTemplate
metadata:
  name: github-gitops-template
  namespace: cicd
spec:
  params:
    - name: gitRevision
      description: The git revision (SHA)
      default: master
    - name: gitRepoUrl
      description: The git repository url ("https://github.com/foo/bar.git")
    - name: gitRepoName
      description: The git repository name
    - name: branchUrl
      description: The git repository branch url
    - name: repoFullName
      description: The git repository full name
    - name: commitSha
      description: The git commit sha
  resourcetemplates:
    - apiVersion: tekton.dev/v1beta1
      kind: PipelineRun
      metadata:
        generateName: $(tt.params.gitRepoName)-
      spec:
        timeout: 0h10m
        pipelineRef:
          name: gitops-pipeline
        serviceAccountName: github-service-account
        params:
          - name: url
            value: $(tt.params.gitRepoUrl)
          - name: branch
            value: $(tt.params.gitRevision)
          - name: repoName
            value: $(tt.params.gitRepoName)
          - name: branchUrl
            value: $(tt.params.branchUrl)
          - name: repoFullName
            value: $(tt.params.repoFullName)
          - name: commitSha
            value: $(tt.params.commitSha)
        workspaces:
          - name: ws
            volumeClaimTemplate:
              spec:
                accessModes:
                  - ReadWriteOnce
                resources:
                  requests:
                    storage: 50Mi
Below is my task run status:
completionTime: '2022-06-01T10:13:47Z'
conditions:
  - lastTransitionTime: '2022-06-01T10:13:47Z'
    message: All Steps have completed executing
    reason: Succeeded
    status: 'True'
    type: Succeeded
podName: gitops-core-business-tzb7f-distributor-base-pod
sidecars:
  - container: sidecar-server
    imageID: 'docker-pullable://gcr.io/tekton-releases/github.com/tektoncd/pipeline/cmd/nop@sha256:1d65a20cd5fbc79dc10e48ce9d2f7251736dac13b302b49a1c9a8717c5f2b5c5'
    name: server
    terminated:
      containerID: 'docker://d5e96143812bb4912c6297f7706f141b9036c6ee77efbffe2bcb7edb656755a5'
      exitCode: 0
      finishedAt: '2022-06-01T10:13:49Z'
      message: Sidecar container successfully stopped by nop image
      reason: Completed
      startedAt: '2022-06-01T10:13:37Z'
startTime: '2022-06-01T10:13:30Z'
steps:
  - container: step-docker-build
    imageID: 'docker-pullable://docker@sha256:5bc07a93c9b28e57a58d57fbcf437d1551ff80ae33b4274fb60a1ade2d6c9da4'
    name: docker-build
    terminated:
      containerID: 'docker://18aa9111f180f9cfc6b9d86d5ef1da9f8dbe83375bb282bba2776b5bbbcaabfb'
      exitCode: 0
      finishedAt: '2022-06-01T10:13:46Z'
      reason: Completed
      startedAt: '2022-06-01T10:13:42Z'
taskSpec:
  params:
    - default: 'docker:stable'
      description: The location of the docker builder image.
      name: builder_image
      type: string
    - default: 'docker:dind'
      description: The location of the docker-in-docker image.
      name: dind_image
      type: string
    - default: .
      description: Path to the directory to use as context.
      name: context
      type: string
  sidecars:
    - args:
        - '--storage-driver=vfs'
        - '--debug'
        - '--userland-proxy=false'
      env:
        - name: DOCKER_TLS_CERTDIR
          value: /certs
      image: $(params.dind_image)
      name: server
      readinessProbe:
        exec:
          command:
            - ls
            - /certs/client/ca.pem
        periodSeconds: 1
      resources:
        requests:
          memory: 512Mi
      securityContext:
        privileged: true
      volumeMounts:
        - mountPath: /certs/client
          name: dind-certs
  steps:
    - args:
        - '--storage-driver=vfs'
        - '--debug'
      env:
        - name: DOCKER_HOST
          value: 'tcp://localhost:2376'
        - name: DOCKER_TLS_VERIFY
          value: '1'
        - name: DOCKER_CERT_PATH
          value: /certs/client
        - name: DOCKER_USER
          valueFrom:
            secretKeyRef:
              key: username
              name: docker-auth
        - name: DOCKER_TOKEN
          valueFrom:
            secretKeyRef:
              key: password
              name: docker-auth
        - name: DIND_CONFIG
          valueFrom:
            configMapKeyRef:
              key: file
              name: dind-env
      image: docker
      name: docker-build
      resources: {}
      script: |
        #!/usr/bin/env sh
        set -e
        pwd
        ls -ltr /workspace/source
        docker run --privileged -v "/workspace/source:/workspace" busybox ls -ltr /workspace
      securityContext:
        privileged: true
      volumeMounts:
        - mountPath: /certs/client
          name: dind-certs
      workingDir: $(workspaces.source.path)
  volumes:
    - emptyDir: {}
      name: dind-certs
  workspaces:
    - name: source
Basically, we have to attach the workspace volume to the sidecar as well, since docker run happens in the sidecar:
volumeMounts:
  - mountPath: /certs/client
    name: dind-certs
  - name: $(workspaces.source.volume)
    mountPath: $(workspaces.source.path)
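Putting that into the Task above, the dind sidecar would carry both mounts, roughly like this fragment (only the relevant part of the sidecar is shown):

sidecars:
  - image: $(params.dind_image)
    name: server
    # ... args, env, readinessProbe as before ...
    volumeMounts:
      - mountPath: /certs/client
        name: dind-certs
      # Mount the task workspace into the dind daemon as well, so that the
      # -v "/workspace/source:/workspace" bind in docker run resolves to the cloned repo.
      - name: $(workspaces.source.volume)
        mountPath: $(workspaces.source.path)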

Serverless: TypeError: Cannot read property 'stage' of undefined

frameworkVersion: '2'

plugins:
  - serverless-step-functions
  - serverless-python-requirements
  - serverless-parameters
  - serverless-pseudo-parameters

provider:
  name: aws
  region: us-east-2
  stage: ${opt:stage, 'dev'}
  runtime: python3.7
  versionFunctions: false
  iam:
    role: arn:aws:iam::#{AWS::AccountId}:role/AWSLambdaVPCAccessExecutionRole
  apiGateway:
    shouldStartNameWithService: true
  lambdaHashingVersion: 20201221

package:
  exclude:
    - node_modules/**
    - venv/**

# Lambda functions
functions:
  generateAlert:
    handler: handler.generateAlert
  generateData:
    handler: handler.generateDataHandler
    timeout: 600
  approveDenied:
    handler: handler.approveDenied
    timeout: 600

stepFunctions:
  stateMachines:
    "claims-etl-and-insight-generation-${self:provider.stage}":
      loggingConfig:
        level: ALL
        includeExecutionData: true
        destinations:
          - Fn::GetAtt: ["ETLStepFunctionLogGroup", Arn]
      name: "claims-etl-and-insight-generation-${self:provider.stage}"
      definition:
        Comment: "${self:provider.stage} ETL Workflow"
        StartAt: RawQualityJob
        States:
          # Raw Data Quality Check Job Start
          RawQualityJob:
            Type: Task
            Resource: arn:aws:states:::glue:startJobRun.sync
            Parameters:
              JobName: "data_quality_v2_${self:provider.stage}"
              Arguments:
                "--workflow-name": "${self:provider.stage}-Workflow"
                "--dataset_id.$": "$.datasetId"
                "--client_id.$": "$.clientId"
            Next: DataQualityChoice
            Retry:
              - ErrorEquals: [States.ALL]
                MaxAttempts: 2
                IntervalSeconds: 10
                BackoffRate: 5
            Catch:
              - ErrorEquals: [States.ALL]
                Next: GenerateErrorAlertDataQuality
          # End Raw Data Quality Check Job
          DataQualityChoice:
            Type: Task
            Resource:
              Fn::GetAtt: [approveDenied, Arn]
            Next: Is Approved ?
          Is Approved ?:
            Type: Choice
            Choices:
              - Variable: "$.quality_status"
                StringEquals: "Denied"
                Next: FailState
            Default: HeaderLineJob
          FailState:
            Type: Fail
            Cause: "Denied status"
          # Header Line Job Start
          HeaderLineJob:
            Type: Parallel
            Branches:
              - StartAt: HeaderLineIngestion
                States:
                  HeaderLineIngestion:
                    Type: Task
                    Resource: arn:aws:states:::glue:startJobRun.sync
                    Parameters:
                      JobName: headers_lines_etl_rs_v2
                      Arguments:
                        "--workflow-name.$": "$.Arguments.--workflow-name"
                        "--dataset_id.$": "$.Arguments.--dataset_id"
                        "--client_id.$": "$.Arguments.--client_id"
                    End: True
                    Retry:
                      - ErrorEquals: [States.ALL]
                        MaxAttempts: 2
                        IntervalSeconds: 10
                        BackoffRate: 5
                    Catch:
                      - ErrorEquals: [States.ALL]
                        Next: GenerateErrorAlertHeaderLine
            End: True
          # Header Line Job End
          GenerateErrorAlertDataQuality:
            Type: Task
            Resource:
              Fn::GetAtt: [generateAlert, Arn]
            End: true

resources:
  Resources:
    # Cloudwatch Log
    "ETLStepFunctionLogGroup":
      Type: AWS::Logs::LogGroup
      Properties:
        LogGroupName: "ETLStepFunctionLogGroup_${self:provider.stage}"
This is what my serverless.yml file looks like.
When I run the command:
sls deploy --stage staging
it shows:
Type Error ----------------------------------------------
TypeError: Cannot read property 'stage' of undefined
    at Variables.getValueFromOptions (/snapshot/serverless/lib/classes/Variables.js:648:37)
    at Variables.getValueFromSource (/snapshot/serverless/lib/classes/Variables.js:579:17)
    at /snapshot/serverless/lib/classes/Variables.js:539:12
Your Environment Information ---------------------------
Operating System: linux
Node Version: 14.4.0
Framework Version: 2.30.3 (standalone)
Plugin Version: 4.5.1
SDK Version: 4.2.0
Components Version: 3.7.4
How can I fix this? I have tried different versions of Serverless.
The error is in the yamlParser file, which is provided by serverless-step-functions.
Above is my serverless config file.
It looks like a $ sign is missing from your provider -> stage?
provider:
  name: aws
  region: us-east-2
  stage: ${opt:stage, 'dev'} # $ sign is missing?
  runtime: python3.7
  versionFunctions: false
  iam:
    role: arn:aws:iam::#{AWS::AccountId}:role/AWSLambdaVPCAccessExecutionRole
  apiGateway:
    shouldStartNameWithService: true
  lambdaHashingVersion: 20201221
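For what it's worth, ${opt:stage, 'dev'} is the Serverless variable syntax that reads the --stage CLI option with a 'dev' fallback, so with the dollar sign in place the command above resolves as sketched here:

provider:
  stage: ${opt:stage, 'dev'}   # 'sls deploy --stage staging' -> staging; no flag -> 'dev'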

Unable to deploy application on EC2 instance using AWS CloudFormation template through cfn-init and UserData script

I am trying to deploy a sample.war application on an EC2 instance at launch time, i.e. when an instance is launched, the application should be deployed on it automatically using cfn-init and Metadata. I added a user with a policy and authentication, with no luck. If I wget the S3 path directly, the file is downloaded. Below is my script. What am I missing here, or is there any other way to do this?
---
AWSTemplateFormatVersion: 2010-09-09
Description: Test QA Template
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref AMIIdParam
      InstanceType: !Ref InstanceType
    Metadata:
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              java-1.8.0-openjdk.x86_64: []
              tomcat: []
              httpd.x86_64: []
          services:
            sysvinit:
              httpd:
                enabled: true
                ensureRunning: true
          files:
            /usr/share/tomcat/webapps/sample.zip:
              source: https://s3.amazonaws.com/mybucket/sample.zip
              mode: '000500'
              owner: tomcat
              group: tomcat
              authentication: S3AccessCreds
      AWS::CloudFormation::Authentication:
        S3AccessCreds:
          type: 'S3'
          accessKeyId: !Ref HostKeys
          secretKey:
            Fn::GetAtt:
              - HostKeys
              - SecretAccessKey
          buckets: !Ref BucketName
  CfnUser:
    Type: AWS::IAM::User
    Properties:
      Path: '/'
      Policies:
        - PolicyName: 'S3Access'
          PolicyDocument:
            Statement:
              - Effect: 'Allow'
                Action: s3:*
                Resource: '*'
  HostKeys:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref CfnUser
I was unable to reproduce this using the following template:
---
AWSTemplateFormatVersion: 2010-09-09
Description: Test QA Template
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-08589eca6dcc9b39c
      InstanceType: t2.micro
      KeyName: default
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -xe
          /opt/aws/bin/cfn-init -s ${AWS::StackId} --resource MyInstance --region ${AWS::Region}
    Metadata:
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              java-1.8.0-openjdk.x86_64: []
              tomcat: []
              httpd.x86_64: []
          services:
            sysvinit:
              httpd:
                enabled: true
                ensureRunning: true
          files:
            /usr/share/tomcat/webapps/sample.zip:
              source: https://s3.amazonaws.com/mybucket/sample.zip
              mode: '000500'
              owner: tomcat
              group: tomcat
(In other words, use of the above template allowed me to install a sample.zip file using cfn-init.)
Thus there is something permissions-related in the way you're accessing the S3 bucket.
Suffice to say, it is bad practice to use access keys. Have a look at this document on the best practice of assigning an IAM Role to an EC2 instance, and then adding a Bucket Policy that grants appropriate access to that Role.
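As a rough sketch of that approach (the resource names here are illustrative and not taken from the question), the instance gets an instance profile instead of access keys, and the cfn-init authentication block then references the role:

  InstanceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal: { Service: ec2.amazonaws.com }
            Action: sts:AssumeRole
      Policies:
        - PolicyName: S3ReadArtifacts
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action: s3:GetObject
                Resource: arn:aws:s3:::mybucket/*
  InstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Roles: [!Ref InstanceRole]
  # On MyInstance, reference the profile instead of HostKeys:
  #   Properties:
  #     IamInstanceProfile: !Ref InstanceProfile
  # and in AWS::CloudFormation::Authentication, use roleName instead of access keys:
  #   S3AccessCreds:
  #     type: S3
  #     roleName: !Ref InstanceRole
  #     buckets:
  #       - mybucket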