I am running Hive as an action in Oozie. Is there a way I can use property variables in Hive? If yes, how do I set them? For example, when creating an external table, I would like to set the location as a property.
CREATE EXTERNAL TABLE IF NOT EXISTS test(
id bigint,
name string
)
row format DELIMITED FIELDS TERMINATED BY "^"
location "/user/test/data";
So is it possible to set the location as
location "${input}"
where ${input} is set in my properties file?
Following the convention from the above question, you can access the property by using ${hiveconf:input} in your Hive commands.
In order to define a property named input, you would have to modify hive-site.xml and add a snippet like
<property>
  <name>input</name>
  <value>input_value</value>
</property>
However, if input is an environment variable (say, from bash), you can access it using ${env:input}; for example, ${env:HOME} or ${env:PATH}.
You can set one with set input=/user/test/data and retrieve it with ${hiveconf:input}.
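Putting this together for the Oozie case, a minimal sketch (the property name input and the path are taken from the question; the exact wiring depends on your Oozie setup):
-- create_table.hql: the location comes from the substituted property
CREATE EXTERNAL TABLE IF NOT EXISTS test (
id bigint,
name string
)
row format DELIMITED FIELDS TERMINATED BY "^"
location '${hiveconf:input}';
Invoked from the command line as:
hive -hiveconf input=/user/test/data -f create_table.hql
In an Oozie Hive action you would typically pass the value through a <param> element (e.g. <param>input=${input}</param>, resolved from your job.properties). Note that depending on the Oozie version, params are handed to Hive as -hivevar rather than -hiveconf, in which case you reference ${hivevar:input} (or simply ${input}) instead.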
A more detailed description of this can be found here: Using variables.
I have got a requirement where blob storage has multiple files named file_1.csv, file_2.csv, file_3.csv, file_4.csv, file_5.csv, file_6.csv, file_7.csv. From these, I have to read only files 5 to 7.
How can we achieve this in an ADF/Synapse pipeline?
I have reproduced this in my lab; please see the repro steps below.
ADF:
Using the Get Metadata activity, get a list of all files.
(Parameterize the source file name in the source dataset to pass ‘*’ in the dataset parameters to get all files.)
Get Metadata output:
Pass the Get Metadata output child items to the ForEach activity.
@activity('Get Metadata1').output.childItems
Add an If Condition activity inside the ForEach and add the True-case expression to copy only the required files to the sink:
@and(greater(int(substring(item().name,5,1)),4),lessOrEquals(int(substring(item().name,5,1)),7))
(substring() is zero-based, so index 5 picks out the digit after the underscore in file_N.csv.)
When the If Condition is True, add a Copy data activity to copy the current item (file) to the sink.
Source:
Sink:
Output:
I took a slightly different approach, using a Filter activity and the endsWith function:
The filter expression is:
@or(or(endsWith(item().name, '_5.csv'),endsWith(item().name, '_6.csv')),endsWith(item().name, '_7.csv'))
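For completeness, the Filter activity's Items setting points at the same Get Metadata output as in the previous answer:
@activity('Get Metadata1').output.childItems
Its output.value array then feeds the downstream copy, so only the matching files are processed.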
Slightly different approaches, similar results; it depends on what you need.
You can always do what @NiharikaMoola-MT suggested. But since you already know the range of the files (5-7), I suggest:
Declare two parameters as an upper and lower range.
Create a ForEach loop and pass the parameters to create a range [lowerlimit, upperlimit] (see the sketch below).
Create a parameterized dataset for the source.
Use the file number from the ForEach loop to create a dynamic expression like
@concat('file_', string(item()), '.csv')
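A minimal sketch of the ForEach Items expression, assuming integer pipeline parameters named lowerlimit and upperlimit (hypothetical names; note that ADF's range() takes a start index and a count, not an end index):
@range(pipeline().parameters.lowerlimit, add(sub(pipeline().parameters.upperlimit, pipeline().parameters.lowerlimit), 1))
With lowerlimit = 5 and upperlimit = 7 this yields [5, 6, 7], and each item() feeds the concat expression above to build the file name for the parameterized source dataset.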
We're working on converting our Classic Azure Pipelines to YAML Pipelines. One thing that is not clear is how to ensure that two different variable groups containing variables with the same name but different meanings don't step on each other.
For example, if I have variable groups vg1 and vg2, each with a variable named secretDataDestination, how do I ensure that the correct secretDataDestination is used in a YAML Pipeline?
A more concerning example is, if we initially have two variable groups without overlapping variable names, how do we ensure that adding a newly-overlapping variable name to a group doesn't replace use of the variable as originally intended?
A workaround is leveraging output variables in Azure DevOps with some small inline PowerShell task code.
First, create two jobs, each with its own variable group, in this case Staging and Prod. Both groups contain the variables apimServiceName and apimPrefix. Add the variables as job outputs by echoing them with isOutput=true, like this:
- job: StagingVars
  dependsOn: []
  variables:
    - group: "Staging"
  steps:
    - powershell: |
        echo "##vso[task.setvariable variable=apimServiceName;isOutput=true]$(apimServiceName)"
        echo "##vso[task.setvariable variable=apimPrefix;isOutput=true]$(apimPrefix)"
      name: setvarStep
- job: ProdVars
  dependsOn: []
  variables:
    - group: "Prod"
  steps:
    - powershell: |
        echo "##vso[task.setvariable variable=apimServiceName;isOutput=true]$(apimServiceName)"
        echo "##vso[task.setvariable variable=apimPrefix;isOutput=true]$(apimPrefix)"
      name: setvarStep
Then use the variables in a new job, where you specify a new variable name that reads the other job's output. This works because the variable groups are each placed in their own job, so they cannot overwrite each other:
- job:
  dependsOn:
    - StagingVars
    - ProdVars
  variables:
    ServiceNameSource: "$[ dependencies.StagingVars.outputs['setvarStep.apimServiceName'] ]"
    UrlprefixSource: "$[ dependencies.StagingVars.outputs['setvarStep.apimPrefix'] ]"
    ServiceNameDestination: "$[ dependencies.ProdVars.outputs['setvarStep.apimServiceName'] ]"
    UrlprefixDestination: "$[ dependencies.ProdVars.outputs['setvarStep.apimPrefix'] ]"
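A step in that final job can then consume the disambiguated values like any other variable; for example, appending to the job above (the echo is only illustrative):
  steps:
    - script: echo "Deploying from $(ServiceNameSource) to $(ServiceNameDestination)"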
if I have variable groups vg1 and vg2, each with a variable named secretDataDestination, how do I ensure that the correct secretDataDestination is used in a YAML Pipeline?
Whether we use classic mode or YAML, it is not recommended to define a variable with the same name in different variable groups, because when you reference variable groups containing the same variable name in the same pipeline, you cannot stop them from stepping on each other.
When you use the same variable name in different variable groups in the same pipeline, then, just as Matt said:
"You can reference multiple variable groups in the same pipeline. If
multiple variable groups include the same variable, the variable group
included last in your YAML file will set the variable's value."
variables:
  - group: variable-group1
  - group: variable-group2
That means the variable value in the variable group listed later will overwrite the value from the variable group listed first.
I guess you already know this, which is why you posted your second question. Let us turn to that now.
if we initially have two variable groups without overlapping variable names, how do we ensure that adding a newly-overlapping variable name to a group doesn't replace use of the variable as originally intended?
Indeed, Azure DevOps currently has no mechanism to detect that different variable groups contain the same variable name and warn about it.
I think this is a reasonable request, so I have added it on our UserVoice site, which is our main forum for product suggestions:
The ability to detect the same variable in a variable group
As a workaround, the simplest and most direct way is to open the variable groups linked to your pipeline in the Library tab and use Ctrl+F to search for the variable in question.
Another way is to use the REST API (Variablegroups - Get Variable Groups By Id) to get all the variables, then loop through them and check whether the variable you are about to add already exists.
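A rough sketch of that second approach in PowerShell, assuming a PAT with read access to variable groups; the organization, project, and group IDs below are placeholders:
$org     = "https://dev.azure.com/yourorg"
$project = "yourproject"
$pat     = "<personal-access-token>"
$headers = @{ Authorization = "Basic " + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(":$pat")) }
# Collect every variable name from the groups the pipeline uses
$names = foreach ($id in 1, 2) {
    $vg = Invoke-RestMethod -Headers $headers -Uri "$org/$project/_apis/distributedtask/variablegroups/$($id)?api-version=6.0"
    $vg.variables.PSObject.Properties.Name
}
# Any name that appears more than once is defined in multiple groups
$names | Group-Object | Where-Object Count -gt 1 |
    ForEach-Object { Write-Warning "Variable '$($_.Name)' exists in more than one group" }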
I am a newbie to Data Factory. As part of my pipeline, I execute a stored procedure to fetch the next record to process using a Lookup activity, and then use the returned value in a Set Variable activity.
If the stored procedure returns nothing, the Set Variable fails with the following error:
Activity SetBatchId failed: The expression 'activity('usp_get_next_archive_batch').output.firstRow.id' cannot be evaluated because property 'firstRow' doesn't exist, available properties are 'effectiveIntegrationRuntime'.
Is there a way in Data Factory to check that the property exists before using it?
Thanks
Please add a question mark after 'output', i.e. 'output?.firstRow'.
See also this post.
Azure Data Factory: For each item() value does not exist for a particular attribute
The expression should be: @activity('usp_get_next_archive_batch').output['firstRow']['id']
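Combining both answers into a defensive Set Variable expression, a sketch assuming a String variable and -1 as an arbitrary fallback sentinel:
@string(coalesce(activity('usp_get_next_archive_batch').output?['firstRow']?['id'], -1))
The ? makes each property access null-safe, and coalesce() supplies the fallback when the lookup returns no row.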
I have got a function module that counts some items in the SAP system and exports the count as a single INT4. But when I try to use this in a gateway service, it gives me
"no output table mapped". How can I overcome this? I tried to put this variable in a table and export that instead, but I couldn't get it to work.
DATA: EV_ENQ       TYPE STANDARD TABLE OF seqg3,
      EV_TABLESIZE TYPE sy-tabix. " declaration added; holds the total lock count (type assumed)

CALL FUNCTION 'ENQUEUE_READ'
  EXPORTING
    guname = '*'
  IMPORTING
    number = EV_TABLESIZE
  TABLES
    enq    = EV_ENQ.
Ev_Tablesize is the variable that I want to export. It holds the total lock count.
Your parameter should be mapped under your service implementation in SEGW. If it is not, you should map it again and make sure the parameter is displayed.
I have set up the code as described in this question.
Creating an alias works, as does dropping it.
For members that I have created myself, this is working correctly, but for existing members I get the following error when selecting from the alias:
SQL State: 42704
Vendor Code: -204
Message: [SQL0204] MyMemberName in MyLib type *FILE not found.
Cause . . . . . : MyMemberName in
TPLWHS type *FILE was not found. If the member name is *ALL, the table
is not partitioned. If this is an ALTER TABLE statement and the type
is *N, a constraint or partition was not found. If this is not an
ALTER TABLE statement and the type is *N, a function, procedure,
trigger or sequence object was not found. If a function was not found,
MyMemberName is the service program that contains the function. The
function will not be found unless the external name and usage name
match exactly. Examine the job log for a message that gives more
details on which function name is being searched for and the name that
did not match.
Recovery . . . : Change the name and try the request
again. If the object is a node group, ensure that the DB2 Multisystem
product is installed on your system and create a nodegroup with the
CRTNODGRP CL command. If an external function was not found, be sure
that the case of the EXTERNAL NAME on the CREATE FUNCTION statement
exactly matches the case of the name exported by the service program.
Any help you can offer is much appreciated. Thanks!
EDIT: Here is my code:
create alias MyLib.MyAlias for MyLib.MyLogicalFile(MyMember);
select * from MyLib.MyAlias;
drop alias MyLib.MyAlias;
The format Lib.Alias has worked for me when I directly created the physical and logical members. Perhaps the logical file is missing? I'll double-check...
This error message can indicate that the file/logical file/member does not exist.
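One quick check is to list the file's members from the QSYS2 catalog before creating the alias; a sketch assuming the SYSPARTITIONSTAT view is available on your release (library and file names are the placeholders from above):
-- Each row is one member (partition); if MyMember is not listed,
-- the SELECT from the alias will fail with SQL0204 even though CREATE ALIAS succeeded.
SELECT table_partition, table_schema, table_name
  FROM qsys2.syspartitionstat
  WHERE table_schema = 'MYLIB'
    AND table_name = 'MYLOGICALFILE';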