Download from an HTTP location using Apache Camel - apache

I need to download a file from an http location to my local system using apache camel. When I gave the below code
from("http://url/filename.xml")
.to("file://C:location")
This works for FTP but not when the URL is "http": the file is not downloaded from the HTTP location to the local path given in to().

This should work.
from("direct:abc")
.setHeader("Accept", simple("application/xml"))//Change it according to the file content
.setHeader(Exchange.HTTP_METHOD, constant("GET"))
.to("http://url/filename.xml")
.to("file:///tmp/?fileName=yourFileName.xml");
You cannot use from("http://...") here. The route above is triggered whenever a message arrives on the direct:abc endpoint. You can change yourFileName.xml to whatever filename you want the file stored as.
Instead of triggering it from another route, you can also use a timer or any other self-triggering consumer.
The reason you cannot consume from a REST endpoint like this
from("http://url/filename.xml")
is that the http component cannot act as a consumer, so the route needs a separate trigger. In fact, the exception message makes this clear when you try it. It says
org.apache.camel.spring.boot.CamelSpringBootInitializationException: org.apache.camel.FailedToCreateRouteException: Failed to create route route1: Route(route1)[[From[http://url/filename.xml]] -> [To[... because of Cannot consume from http endpoint

The http component cannot be used as a consumer, i.e. you cannot have a route that starts with from("http://...").
You need a consumer component to start the route.
You could try something like this
from("timer:foo?fixedRate=true&period=5000")
.to("http://url/filename.xml")
.to("file://C:location");

Related

Apache proxy module: how to put back data read with ap_get_client_block()?

I am modifying the Apache proxy module mod_proxy_http. I want to inspect the content before the request is proxied. I am able to use ap_setup_client_block(), ap_should_client_block() and ap_get_client_block() in a loop to retrieve the client request message body. Unfortunately, this also consumes the message body, so when the final message reaches the API server, the message body is gone.
Is there a non-destructive version of ap_get_client_block() or is there a way to put back the content I read using ap_get_client_block()?

How to pass original URI, with arguments, to Traefik ErrorPage handler specified in `query`?

I'm trying to use nginx to serve a custom error page using the Error Page middleware so that 404 requests to a lambda service (which I don't control) can be handled with a custom error page. I want to be able to get the context of this original request on that error page, either in Nginx for further forwarding, or else as a header for further handling e.g. in PHP or whatnot so I can provide contextual links on the 404 page.
However, right now after the redirection to Nginx in Traefik's ErrorPage middleware it seems the request has lost all the headers and data from the original service query.
The relevant part of my dockerfile:
traefik.port=8080
traefik.protocol=http
traefik.docker.network=proxy
traefik.frontend.rule=PathPrefix:/myservice;ReplacePathRegex:^/myservice/(.*) /newprefix/$$1
traefik.frontend.errors.myservice.status=404
traefik.frontend.errors.myservice.service=nginx
traefik.frontend.errors.myservice.query=/myservice-{status}
Nginx receives the forwarded 404 request, but the request URI comes through as nothing more than the path /myservice-404 specified in query (or /, if I omit traefik.frontend.errors.myservice.query). After the ReplacePathRegex I have the path of the original request available in the HTTP_X_REPLACED_PATH header, but any query arguments are no longer accessible in any header, and nginx can't see anything else about the original URI. For example, if I requested mysite.com/myservice/some/subpath?with=parameters, the HTTP_X_REPLACED_PATH header will show /myservice/some/subpath but not include the parameters.
Is it possible in Traefik to pass another service the complete context about the original request?
What I'm really looking for is something like try_files, where I could say "if this traefik request fails, try this other path instead", but I'd settle for being able to access the original, full request arguments within the handling backend server. If there was a way to send Nginx a request with the full path and query received by Traefik, that would be ideal.
tl;dr:
I am routing a request to a specific service in Traefik
If that request 404s, I want to be able to pass that request to Nginx for further processing / a contextual error page
I want Nginx and/or the page which receives the ErrorPage redirect to be able to know about the request that 404'd in the service
Unfortunately this is not possible with Traefik. I tried to achieve something similar but I realized that the only information that we are able to pass to the error page is the HTTP code, that's it.
The only options available are mentioned in their docs: https://doc.traefik.io/traefik/middlewares/errorpages/

Redirect url based on ID using lua

I'm extremely new to Lua as well as nginx. We're trying to set up authentication.
I'm trying to write a script that could be injected into my nginx and would listen on an endpoint.
My API would give me a token. I would receive this token and check whether it exists in my YAML (or possibly JSON) file.
Based on the privilege listed in the file, I would like to redirect the request to the respective URL with the necessary permissions.
Any help would be highly appreciated.
First of all, nginx on its own has no Lua integration whatsoever; so if you just have a stock nginx server, you can't script it in Lua at all.
What you probably mean is OpenResty, or more precisely the lua-nginx-module that it bundles, which lets you run Lua code in nginx to handle requests programmatically.
Assuming that you have a working nginx + lua-nginx-module installed and running, what you're looking for is the rewrite_by_lua directive, which lets you redirect the client to a different address based on their request.
(Realistically, you'd likely want to use rewrite_by_lua_block or rewrite_by_lua_file instead)
Within the Lua block, you can make API calls, execute some logic, etc. and then redirect to some URI internally with ngx.exec or send an actual redirect to the client with ngx.redirect.
If you want to read in a JSON or YAML file, you should do so in init_by_lua so the file gets loaded only once and then stays in memory. The lua-cjson module ships with OpenResty (not with stock nginx), so there you can just use it to parse your JSON data into a Lua table.

Calling a REST Dynamic URI from Biztalk

I am struggling to work out how to set-up a WCF-WebHttp send port in BizTalk 2013 with a dynamic REST URI. Does anyone know the correct combination of Address URI and Http Method/URL Mapping in the endpoint settings?
The correct destination URI (that works when calling it with Postman etc) needs to be of the form:
http://servername/NYCC.Portal/resources/serviceinstance/case/{caseIDhere}/status/ServiceFulfilled
This is the form I am trying in the send port (that uses the new message variables), but I am getting a 404 on this:
Send Port Config
Please refer to the link below; it might be helpful.
http://soa-thoughts.blogspot.com/2013/03/biztalk-server-2013-new-adapters-series.html
You will need to use the following as the Address URI:
http://servername/NYCC.Portal/resources/serviceinstance
and the following in the HTTP Method and URL Mapping:
case/{caseIDhere}/status/ServiceFulfilled
Then do the variable mapping for caseIDhere.
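To see why the split matters, here is a small Python illustration (not BizTalk code) of how a base address and a URL template with one placeholder combine into the final request URL. The function name and the case ID value are made up for the example; the adapter does the equivalent internally once the variable is mapped.

```python
def build_request_url(address_uri, url_template, variables):
    """Fill the {placeholders} in the template, then append it to the base address."""
    path = url_template.format(**variables)
    return address_uri.rstrip("/") + "/" + path.lstrip("/")


url = build_request_url(
    "http://servername/NYCC.Portal/resources/serviceinstance",  # Address URI
    "case/{caseIDhere}/status/ServiceFulfilled",                # URL mapping template
    {"caseIDhere": "12345"},                                    # hypothetical case ID
)
print(url)
```

If the placeholder is left in the Address URI itself, nothing substitutes it, which is one plausible cause of the 404.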

Advanced Scrapy use Middleware

I want to develop several middlewares to make sure websites get parsed.
This is the workflow I have in mind:
First try with TOR + Polipo
After 2 HTTP errors, try without TOR (so the website sees my IP)
After 2 HTTP errors, try with a proxy (use one of my other servers to make the HTTP request)
After 2 HTTP errors, try with a random proxy (from a list of 100). This is repeated 5 times
If none of these work, I save the information to an ElasticSearch database so I can see it on my control panel
I'll create a custom middleware with a process_request function which contains all 5 of these methods. But I can't find how to save the type of connection (for example, if TOR does not work but a direct connection does, I want to use that setting for all my other scrapes of the same website). How can I save these settings?
Another thing: I have a pipeline which downloads the images of items. Is there a way to use this middleware (ideally with the saved settings) for it as well?
Thanks in advance for your help.
I think you could use the retry middleware as a starting point:
You could use request.meta["proxy_method"] to keep track of which one you are using
You could reuse request.meta["retry_times"] in order to track how many times you have retried a given method, and then set the value to zero when you change the proxy method.
You could use request.meta["proxy"] to use the proxy server you want via the existing HTTP proxy middleware. You may want to tweak the middleware ordering so that the retry middleware runs before the proxy middleware.
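As a rough sketch of that bookkeeping, the functions below work on a plain dict standing in for request.meta, so they run without Scrapy installed. The method names, proxy URLs and retry limits are assumptions taken from the workflow in the question (8123 is Polipo's usual default port); "proxy_method" is a custom key, while "retry_times" and "proxy" are the keys the built-in retry and HTTP proxy middlewares read.

```python
# Fallback order from the question; every URL below is a placeholder.
METHODS = ["tor", "direct", "own_proxy", "random_proxy"]
PROXIES = {
    "tor": "http://127.0.0.1:8123",            # assumed local Polipo in front of TOR
    "direct": None,                            # no proxy at all
    "own_proxy": "http://proxy.example:3128",  # hypothetical private proxy
    "random_proxy": "http://random.example:8080",  # would be picked from a list
}
MAX_RETRIES = {"tor": 2, "direct": 2, "own_proxy": 2, "random_proxy": 5}


def apply_method(meta, method):
    """Point meta at the given method and reset its retry counter."""
    meta["proxy_method"] = method
    meta["retry_times"] = 0
    proxy = PROXIES[method]
    if proxy is None:
        meta.pop("proxy", None)  # direct connection: make sure no proxy is set
    else:
        meta["proxy"] = proxy


def on_failure(meta):
    """Record one failed attempt; return False once every method is exhausted."""
    method = meta.setdefault("proxy_method", METHODS[0])
    meta["retry_times"] = meta.get("retry_times", 0) + 1
    if meta["retry_times"] < MAX_RETRIES[method]:
        return True  # retry with the same method
    index = METHODS.index(method)
    if index + 1 == len(METHODS):
        return False  # give up; the caller can log the failure
    apply_method(meta, METHODS[index + 1])
    return True


meta = {}
apply_method(meta, "tor")
on_failure(meta)
on_failure(meta)  # second TOR failure switches to the direct connection
print(meta["proxy_method"], meta["retry_times"])
```

In a real downloader middleware, on_failure would be called from process_response or process_exception, and the method that finally succeeded could be stored in a dict keyed by domain so later requests to the same site start from it.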