I have the function list_for_android = list(df[df['device_os'] == 'Android'].device_brand.unique()), which should return a list of brands whose operating system is Android. On its own, this code works fine, but I need to move it into a Pipeline, and there df[df['device_os'] == 'Android'] doesn't work. When shape is called inside the Pipeline, it outputs (0, 18), i.e. there is no data at all, although there should be about 1.8 million rows. How do I rework this piece of code for the Pipeline?
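For illustration, a minimal sketch of how such a filter could run as a step in a scikit-learn Pipeline (assuming scikit-learn is the Pipeline in question; the FunctionTransformer step and all names below are illustrative, not taken from the post):

import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

# Toy stand-in for the real 1.8M-row frame.
df = pd.DataFrame({
    'device_os': ['Android', 'iOS', 'Android'],
    'device_brand': ['samsung', 'apple', 'xiaomi'],
})

def keep_android(frame):
    # Returns an empty frame if 'device_os' was already dropped or encoded
    # by an earlier pipeline step; a common cause of (0, n) shapes.
    return frame[frame['device_os'] == 'Android']

pipe = Pipeline([
    ('android_only', FunctionTransformer(keep_android)),
])

filtered = pipe.fit_transform(df)
print(filtered.shape)
list_for_android = list(filtered['device_brand'].unique())
print(list_for_android)  # ['samsung', 'xiaomi']

If an earlier step in the Pipeline drops or encodes device_os before the filter runs, the comparison silently matches nothing, which could explain the (0, 18) shape.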
I am using JMeter to test the performance of a ride booking app. I need to run a While Controller which runs the events-fetching API continuously until the ride is completed or no driver is available. This runs correctly for one user, but if I run the plan for multiple users, the While Controller enters an infinite loop. How can I fix this?
The While Controller keeps executing until its condition (a Function or Variable) resolves to false.
If it runs into an endless loop, most probably your server responds with something you don't expect, e.g. an error because it gets overloaded.
So I would suggest taking two actions:
Temporarily enable storing of responses into the .jtl or a separate file and inspect what the server returns, then amend your While Controller's condition accordingly.
And/or limit the maximum number of iterations of the While Controller to some reasonable number, e.g. 10 or 20 or whatever value is acceptable, for example using the __jexl3() function:
${__jexl3("${status}" != "running" && ${__jm__While Controller__idx} < 20,)}
I have two render pipelines in one single render command encoder. The first pipeline writes to a buffer which is used in the second pipeline. This does not seem to work, and I suspect a synchronization problem. When I use a separate render command encoder for each render pipeline, I get the desired result. Can this be solved with a single render command encoder, or do I need two separate encoders to synchronize the buffer?
Here is the more specific case:
The first pipeline is a non-rasterizing pipeline that only runs a vertex shader, writing MTLDrawPrimitivesIndirectArguments to a MTLBuffer that is then used for the drawPrimitives call of the second pipeline. It looks like this:
// renderCommandEncoder is MTLRenderCommandEncoder
// firstPipelineState and secondPipelineState are two different MTLRenderPipelineState
// indirectArgumentsBuffer is a MTLBuffer containing MTLDrawPrimitivesIndirectArguments
// numberOfVertices is number of vertices suited for first pipeline
// first pipeline
renderCommandEncoder.setRenderPipelineState(firstPipelineState)
renderCommandEncoder.setVertexBuffer(indirectArgumentsBuffer, offset: 0, index: 0)
renderCommandEncoder.drawPrimitives(type: .point, vertexStart: 0, vertexCount: numberOfVertices)
// second pipeline
renderCommandEncoder.setRenderPipelineState(secondPipelineState)
renderCommandEncoder.setVertexBuffer(secondPipelineBuffer, offset: 0, index: 0)
renderCommandEncoder.drawPrimitives(type: .point, indirectBuffer: indirectArgumentsBuffer, indirectBufferOffset: 0)
renderCommandEncoder.endEncoding()
How can I make sure that the indirectArgumentsBuffer has been written to by the first pipeline when issuing a call to drawPrimitives for the second pipeline, which uses and needs the contents of indirectArgumentsBuffer?
I believe you need to use separate encoders. In this (somewhat dated) documentation about function writes, only atomic operations are synchronized for buffers shared between draw calls.
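To illustrate, a minimal sketch of the two-encoder variant, reusing the question's variables (commandBuffer, firstPassDescriptor and secondPassDescriptor are assumed to exist and are illustrative names):

// First pass: fill indirectArgumentsBuffer from the vertex shader.
let firstEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: firstPassDescriptor)!
firstEncoder.setRenderPipelineState(firstPipelineState)
firstEncoder.setVertexBuffer(indirectArgumentsBuffer, offset: 0, index: 0)
firstEncoder.drawPrimitives(type: .point, vertexStart: 0, vertexCount: numberOfVertices)
firstEncoder.endEncoding() // pass boundary; buffer writes become visible to later passes

// Second pass: consume the indirect arguments.
let secondEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: secondPassDescriptor)!
secondEncoder.setRenderPipelineState(secondPipelineState)
secondEncoder.setVertexBuffer(secondPipelineBuffer, offset: 0, index: 0)
secondEncoder.drawPrimitives(type: .point, indirectBuffer: indirectArgumentsBuffer, indirectBufferOffset: 0)
secondEncoder.endEncoding()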
Refer to the following code:
def parse(height_id):
    driver2 = webdriver.Firefox()
    driver2.get(height_id)
    block_details_web_element = driver2.find_element_by_xpath('//*[@id="__next"]/div[3]/div/div[3]/div[1]/div/div')
    print(block_details_web_element)
    print(block_details_web_element.text)
    print(" --- BLOCK PARSED SUCCESSFULLY! ---\n")
    driver2.quit()

# The following parts are outside the function:
p = Pool(4)
records = p.map(parse, height_list_on_each_page2)  # height_list_on_each_page2 is a list of URLs
p.terminate()
p.join()
(Eg. of a typical URL: https://www.blockchain.com/btc/block/000000000000000000028e473f1c95060a63100c9861525105b1f5ced81a7fa0 )
Now, this works fine, but it takes a huge amount of time. So I planned to move the statement driver2 = webdriver.Firefox() outside the parse function, so that I don't re-create the WebDriver instance each time the function is called. [I also removed the statement driver2.quit() from the function.] This saves time; however, almost half of the time the print statement print(block_details_web_element) returns None as output.
I suspect that the find_element_by_xpath method is not working correctly. Is this because of the multiprocessing I am using?
Any insight into why this is happening? Please suggest a solution.
EDIT: (this is the error)
(https://www.dropbox.com/s/1yvknbdg9wefuwy/selenium.jpeg?dl=0)
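A hedged sketch of the usual fix: give each worker process its own WebDriver, created once per worker via the Pool initializer, instead of sharing a single driver across processes (the initializer pattern and names are illustrative, not from the post; the Selenium 3-era find_element_by_xpath API is kept to match the question):

from multiprocessing import Pool
from selenium import webdriver

driver2 = None  # each worker process gets its own instance

def init_worker():
    global driver2
    driver2 = webdriver.Firefox()

def parse(height_id):
    driver2.get(height_id)
    element = driver2.find_element_by_xpath('//*[@id="__next"]/div[3]/div/div[3]/div[1]/div/div')
    return element.text

if __name__ == '__main__':
    height_list_on_each_page2 = [
        'https://www.blockchain.com/btc/block/000000000000000000028e473f1c95060a63100c9861525105b1f5ced81a7fa0',
    ]
    p = Pool(4, initializer=init_worker)
    records = p.map(parse, height_list_on_each_page2)
    p.terminate()  # note: the per-worker drivers are not quit explicitly here
    p.join()

This keeps one Firefox per worker (four in total) rather than one per URL, while still avoiding a single driver instance being shared by several processes.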
I have a PCollection as the result of a pipeline after doing BigQuery processing; now I want to use some part of that data separately from the pipeline. How do I transfer a PCollection to a List so that I can iterate through it and use the content?
Am I doing something wrong conceptually?
Once you are done with data processing inside your Dataflow pipeline, you'd likely want to write the data into persistent storage, such as files in Cloud Storage (GCS), a table in BigQuery, etc.
You can then consume the data outside Dataflow, for example, to read it into a List. Obviously, it would need to fit into memory for that specific action.
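A minimal sketch of that approach in the Beam-style Java API (bucket paths and class names are illustrative; the older Dataflow 1.x SDK spells these TextIO.Read/TextIO.Write):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class PersistResults {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Stand-in for the BigQuery processing described in the question.
    PCollection<String> results = p.apply(TextIO.read().from("gs://my-bucket/input/*"));

    // Persist to GCS; once the job finishes, read these files back
    // outside Dataflow (e.g. into a List, if they fit in memory).
    results.apply(TextIO.write().to("gs://my-bucket/bq-output/part"));

    p.run().waitUntilFinish();
  }
}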
What I would do is create "side outputs" (https://cloud.google.com/dataflow/model/par-do), i.e. another PCollection that you create together with your main output, so in the end you will have two PCollections as the result of your BQ process.
Just ensure that in your process function you have a condition for adding elements to the side output collection. Something like this:
public final void processElement(final ProcessContext context) throws Exception {
    context.output(bqProcessResult);
    if (condition) {
        context.sideOutput(myFilterTag, bqProcessResult);
    }
}
The result of that process is not a PCollection but a PCollectionTuple, so you just have to do the following:
PCollectionTuple myTuples = ...; // the previous process using the function above
PCollection<MyType> bqCollection = myTuples.get(bqTag);
PCollection<MyType> filteredCollection = myTuples.get(myFilterTag);
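The tag declarations and the ParDo wiring are not shown above; a hedged sketch of what they could look like (Dataflow 1.x-era API, matching the sideOutput call above; in Beam 2.x sideOutput became context.output(tag, value); MyType, MyProcessFn and bqResults are illustrative names):

final TupleTag<MyType> bqTag = new TupleTag<MyType>() {};        // main output
final TupleTag<MyType> myFilterTag = new TupleTag<MyType>() {};  // side output

PCollectionTuple myTuples = bqResults.apply(
    ParDo.of(new MyProcessFn())
         .withOutputTags(bqTag, TupleTagList.of(myFilterTag)));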
I have a batch job in AX 2012 R2 that essentially iterates over a table, creating an instance of a class (one that extends RunBaseBatch) which gets added as a task.
I also have some post-processing items I need to do after all the tasks have completed.
So far, the following is working:
while select stagingTable where stagingTable.OperationNo == params.paramOperationNo()
{
    batchHeader = this.getCurrentBatchHeader();
    batchTask = OperationTask::construct();
    batchHeader.addRuntimeTask(batchTask, this.getCurrentBatchTask().RecId);
}
batchHeader.save();

postTask = PostProcessingTask::construct();
batchHeader.addRuntimeTask(postTask, this.getCurrentBatchTask().RecId);
batchHeader.addDependency(postTask, batchTask, BatchDependencyStatus::FinishedOrError);
batchHeader.save();
My thought was that this would add a dependency so that the post-process task does not start until the last task added in the loop reaches Finished or Error. What I get instead is the exception "The dependency could not be created because task '' does not exist."
I'm uncertain what I'm missing: the tasks all get added and executed successfully; it seems that just the dependency doesn't want to work.
Several things. Where this code is being called from matters: is the code already running in batch? Is it called in doBatch() before/after the super call? etc.
You have a while-select; does it create multiple batch tasks? If it does, then you need to create a dependency on each batch task object. This is one problem I see. If your while-select statement only selects one record and adds one task, then the problem is something else, but you shouldn't use a while-select to select one record.
Also, you call batchHeader.save(); twice. I'd probably remove the first call. I'd need to see what instantiates your code.
Where you have this.getCurrentBatchTask().RecId, depending on whether your code runs in batch or not, try replacing it with BatchHeader::getCurrentBatchTask().RecId.
And where you have batchHeader = this.getCurrentBatchHeader();, replace it with batchHeader = BatchHeader::getCurrentBatchHeader();
EDIT: Try this code (fix whatever is needed to make it compile):
BatchHeader batchHeader = BatchHeader::getCurrentBatchHeader();
Set set = new Set(Types::Class);
SetEnumerator se;
BatchTask batchTask;
PostTask postTask;

while select stagingTable where stagingTable.OperationNo == params.paramOperationNo()
{
    batchTask = OperationTask::construct();
    set.add(batchTask);
    batchHeader.addRuntimeTask(batchTask, BatchHeader::getCurrentBatchTask().RecId);
}

// Create post task
postTask = PostProcessingTask::construct();
batchHeader.addRuntimeTask(postTask, BatchHeader::getCurrentBatchTask().RecId);

// Create dependencies
se = set.getEnumerator();
while (se.moveNext())
{
    batchTask = se.current(); // task to make the post task dependent on
    batchHeader.addDependency(postTask, batchTask, BatchDependencyStatus::FinishedOrError);
}

batchHeader.save();