Server returns "not logged-in" when using fetch api - react-native

I'm sending a request to an instagram.com/p/SHORTCODE URL in Postman and it gives me a proper response. But when I try the same request using the fetch API in React Native, it doesn't give the same response. It returns an HTML document starting with:
<html lang="en" class="no-js not-logged-in ">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>
Page Not Found • Instagram
</title>
Is it related to the browser that I use with Postman? I'm copying all the cookies from the Postman request to the fetch call, so it should give me the same response, but it does not.

Related

Unable to scrape parts of a page webpage with scrapy

I'm using Scrapy to crawl an e-commerce website. I'm experienced with simpler websites, where Scrapy alone or with Splash/Selenium handles most cases.
I now have a situation that I have no experience dealing with. From my investigations, it could be something like a captcha, but without any prompt to the user.
I've tried to solve it with Scrapy alone and with Scrapy plus Selenium, with no success.
With my Scrapy request I receive the following response:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<title>Challenge Validation</title>
<link rel="stylesheet" type="text/css" href="/_sec/cp_challenge/sec-2-9.css">
<script type="text/javascript">function cp_clge_done(){location.reload(true);}</script>
<script src="/_sec/cp_challenge/sec-cpt-int-2-9.js" async defer></script>
<script type="text/javascript">sessionStorage.setItem('data-duration', 5);</script>
</head>
<body>
<div class="sec-container">
<div id="sec-text-container"><iframe id="sec-text-if" class="custmsg" src="https://beta.elcorteingles.es/sgfm/statics/eci_non_food/contents/cc/cca.html"></iframe></div>
<div id="sec-if-container">
<iframe id="sec-cpt-if" class="crypto" data-key="" data-duration=5 src="/_sec/cp_challenge/ak-challenge-2-9.htm"></iframe>
</div>
</div>
</body>
</html>
With the Chrome inspector I also noticed two GET requests (non-JavaScript) that might be related:
check -> returns HTML ( ... <title>RP iframe</title> ...)
check-session?origin=https%3A%2F%2Fwww.elcorteingles.es -> returns HTML (...<title>OP iframe</title>...)
Using scrapy shell with view(response), it looks like a captcha situation, waiting for something. An example page:
scrapy shell "https://www.elcorteingles.es/supermercado/0110120903000022-coosur-aceite-de-oliva-intenso-1-botella-1-l/"
The title 'Challenge Validation' suggests it. I have no idea how to handle this case. From my research, I've seen solutions involving Scrapy middleware, but only for cases where input was requested from the user. I found no example similar to this one. Any guidance on how to proceed is appreciated.
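Not a solution to the challenge itself, but a minimal sketch of how a spider could at least recognise that it received the challenge page instead of the product page, so it can retry or hand the URL to a real browser. The two markers are taken from the HTML quoted above; the function name looks_like_challenge is made up for illustration, and other sites or challenge versions may need different markers.

```python
import re

# Markers copied from the challenge page quoted in the question.
# Assumption: other sites or challenge versions may use different ones.
CHALLENGE_TITLE = re.compile(r"<title>\s*Challenge Validation\s*</title>", re.IGNORECASE)
CHALLENGE_PATH = "/_sec/cp_challenge/"


def looks_like_challenge(html: str) -> bool:
    """Return True if the HTML resembles the interstitial challenge page."""
    return bool(CHALLENGE_TITLE.search(html)) or CHALLENGE_PATH in html


challenge_page = "<html><head><title>Challenge Validation</title></head></html>"
product_page = "<html><head><title>Aceite de oliva</title></head></html>"
print(looks_like_challenge(challenge_page))  # True
print(looks_like_challenge(product_page))    # False
```

Inside a Scrapy callback the same check could be applied to response.text before parsing.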

Linkedin Open Graph Sharing not working

I have a page set up for the Open Graph Protocol because our app is built on Angular 1.x. When we share a URL using LinkedIn, the share popup opens, but sometimes it does not crawl the Open Graph tags and sometimes it shows the properly crawled tags. It was working fine until last week. Here is the image which shows the preview area:
Scenario for sharing a link:
A user comes to our site at www.example.com/event/[EVENT_ID] and clicks share to LinkedIn.
A popup opens using: https://www.linkedin.com/shareArticle?mini=true&url=https://example.com/event/0u83s43rf6r/4295028179 where 4295028179 is the event ID and 0u83s43rf6r is a random key added for cache busting.
We use Apache mod_rewrite to redirect the LinkedIn, Facebook, and Twitter bots to our crawler page, where the Open Graph tags are rendered.
Apache mod_rewrite settings in the .htaccess file:
RewriteCond %{HTTP_USER_AGENT} ^(facebookexternalhit/(.*)|Facebot|Twitter(.*)|Pinterest|LinkedIn(.*)|LinkedInBot)$ [NC]
RewriteRule ^(event)/([_0-9a-zA-Z]+)/([0-9]+)$ https://share.example.com/web/crawler/details/$3 [R=301,L]
So when a crawler is redirected based on its User-Agent, the final URL, where the Open Graph tags are rendered, becomes: https://share.example.com/web/crawler/details/4295028179
Here are the rendered HTML tags:
<html>
<head>
<script type="text/javascript">window.location = 'https://example.com/event/236129271' // if it's a browser then redirect it to website</script>
<meta property="og:title" content="Event Title" />
<meta property="og:description" content="Event Description" />
<meta property="og:image" content="Event Thumbnail" />
<meta name="title" content="LinkedIn Share Test" />
<meta name="description" content="Event Description" />
<meta property="og:image:width" content="188" />
<meta property="og:image:height" content="71" />
<!-- Twitter Card Working Fine-->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Event Title">
<meta name="twitter:description" content="Event Description">
<meta name="twitter:image" content="Event Image">
</head>
<body>
</body>
</html>
Last week this logic was working fine on LinkedIn, but now it's not.
Your code seems fine; you have the right og: tags, etc.
Whenever you're not sure that you're using the LinkedIn share API correctly, check your website with the LinkedIn Post Inspector, which will tell you how LinkedIn is looking at your webpage. It covers many things, from <title> tags to og: tags to oEmbed tags, etc.
Worried about caching? Why not test a URL like example.com?someFakeParameter=123? This will similarly bypass the caching at the LinkedIn Post Inspector.
If you could post your actual URL that you're sharing, I could give you a better answer, but hopefully something here helps!
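To make the cache-busting suggestion concrete, here is a minimal sketch that appends a throwaway query parameter to a URL before pasting it into the Post Inspector. The function name add_cache_buster is made up for illustration, and the parameter name and value are the placeholders from this answer; any unused parameter should do.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_cache_buster(url: str, name: str = "someFakeParameter", value: str = "123") -> str:
    """Return the URL with one extra query parameter appended."""
    parts = urlparse(url)
    # Keep any parameters the URL already has and add the new one at the end.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunparse(parts._replace(query=urlencode(query)))


print(add_cache_buster("https://example.com/event/4295028179"))
# https://example.com/event/4295028179?someFakeParameter=123
```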

Google Plus Share not working for dynamic title description and images

I know similar questions have been asked multiple times, but they do not have correct answers and none of them works for me. I have been trying to share my URL, title, and description on Google+, but it doesn't seem to work.
I tried everything given on the page https://developers.google.com/+/web/snippet/
My web page has a dynamic title, description, and image, but Google+ is not able to take that information from the Open Graph parameters (og:image, og:title, etc.) provided.
Following are my Open Graph parameters, which are filled in during page load. I checked with the debugger, and all the information is coming through correctly.
<title>this is test </title>
<meta charset="UTF-8"/>
<meta name="description" content="this is test desc"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<meta property="og:locale" content="en_US" />
<meta property="og:type" content="website" />
<meta property="og:image" content=""/>
<meta property="og:title" content="" />
<meta property="og:description" content="" />
<meta property="og:url" content="" />
<meta property="og:site_name" content="" />
The sharing link I have been using is: https://plus.google.com/share?hl=en&url=xxxx
Can someone please suggest how this can work with dynamic content?
Note: Open Graph is not working for Facebook either, so I had to explicitly provide all parameters in the Facebook share link.
Google+ Snippets do not work with dynamic apps. The info has to be returned in the HTML since Google's parser does not execute JavaScript.
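As an illustration of "returned in the HTML", here is a minimal sketch of filling the Open Graph tags on the server before the page is sent, rather than in the browser during page load. The function name render_og_tags and the sample values are assumptions made up for the example; how this plugs into your server-side templates depends on your stack.

```python
from html import escape


def render_og_tags(title: str, description: str, image_url: str, page_url: str) -> str:
    """Build Open Graph <meta> tags as a string for the page's <head>."""
    tags = {
        "og:title": title,
        "og:description": description,
        "og:image": image_url,
        "og:url": page_url,
    }
    # Escape the values so quotes or angle brackets cannot break the markup.
    return "\n".join(
        f'<meta property="{prop}" content="{escape(value, quote=True)}" />'
        for prop, value in tags.items()
    )


# Hypothetical values; a real page would take them from its own data.
print(render_og_tags(
    "this is test",
    "this is test desc",
    "https://example.com/image.png",
    "https://example.com/page",
))
```

Since the parser reads only the HTML it receives, the content attributes must already be filled in when the response leaves the server.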

How to capture JS redirects in Selenium?

Is there any way to capture all the redirects on a page that are performed in JS? For instance, let's take a look at this web page, which redirects using window.location:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Redirect JS</title>
</head>
<body>
<script>
window.location = "http://www.example.com";
</script>
</body>
</html>
or using a meta tag:
<meta http-equiv="refresh" content="0; url=http://example.com/">
I would like to render the web page and get all the URLs to which the user has been redirected. Is it possible? How can I do that in Selenium?
In Python: http://selenium-python.readthedocs.org/en/latest/api.html : webdriver has property current_url. After you driver.get() the page, I would assume current_url is the redirected URL. Is it not?
Your requirement "in Selenium" will make this impossible. Selenium interacts with a browser as a human would - a human should generally not know or care about all the redirects. If you are willing to abandon Selenium for this purpose, then there are libraries such as HttpBuilder (in the Java world) and many others (for other languages) that allow you to manipulate and watch HTTP traffic, which is what you are after here.
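If you do step outside Selenium, one rough option for the two cases in the question is to inspect the fetched HTML statically. The sketch below is an assumption-laden illustration, not a general solution: it only finds meta refresh tags and literal window.location = "..." assignments, it does not execute JavaScript, and the names RedirectFinder and find_redirects are made up for the example.

```python
import re
from html.parser import HTMLParser


class RedirectFinder(HTMLParser):
    """Collects redirect targets declared via meta refresh or window.location."""

    # Matches simple literal assignments such as: window.location = "http://..."
    JS_REDIRECT = re.compile(r"""window\.location(?:\.href)?\s*=\s*["']([^"']+)["']""")

    def __init__(self):
        super().__init__()
        self.targets = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            self._in_script = True
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # The content attribute looks like "0; url=http://example.com/"
            match = re.search(r"url\s*=\s*(.+)", attrs.get("content") or "", re.IGNORECASE)
            if match:
                self.targets.append(match.group(1).strip().strip("'\""))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Only look for JS redirects inside <script> bodies.
        if self._in_script:
            self.targets.extend(self.JS_REDIRECT.findall(data))


def find_redirects(html: str) -> list:
    """Return redirect URLs found in the HTML, in document order."""
    finder = RedirectFinder()
    finder.feed(html)
    finder.close()
    return finder.targets


page = """<html><head>
<meta http-equiv="refresh" content="0; url=http://example.com/">
</head><body>
<script>window.location = "http://www.example.com";</script>
</body></html>"""
print(find_redirects(page))  # ['http://example.com/', 'http://www.example.com']
```

Redirects built dynamically in JS (string concatenation, timers, location.replace) will be missed by this kind of static check.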

Facebook Scraper cannot retrieve image

In my page, I have this:
<meta property="og:image" content="<?php echo $picURL; ?>"/>
Which when executed is rendered like this:
<meta property="og:image" content="http://a3.sphotos.ak.fbcdn.net/hphotos-ak-ash3/556898_400257580012798_100000856787624_1059515_311974781_n.jpg"/>
But the Facebook scraper is seeing it like this:
<meta property="og:image" content="">
It seems it is not considering images from Facebook.
See my answer here, but in short: you can't use images hosted on Facebook in your meta tags. It used to give a clearer error message saying that you couldn't hotlink Facebook images.
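If the image URL comes from user data, a small guard can flag Facebook-hosted images before they end up in og:image. This is only a sketch: the function name is_facebook_hosted is made up, and the host list is an assumption based on the fbcdn.net URL in the question, so it is probably not exhaustive.

```python
from urllib.parse import urlparse

# Host suffixes assumed to belong to Facebook. "fbcdn.net" comes from the
# URL in the question; the list is an assumption and not exhaustive.
FACEBOOK_HOST_SUFFIXES = ("fbcdn.net", "facebook.com")


def is_facebook_hosted(url: str) -> bool:
    """Return True if the URL's host is one of the assumed Facebook domains."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in FACEBOOK_HOST_SUFFIXES
    )


pic_url = (
    "http://a3.sphotos.ak.fbcdn.net/hphotos-ak-ash3/"
    "556898_400257580012798_100000856787624_1059515_311974781_n.jpg"
)
print(is_facebook_hosted(pic_url))                          # True
print(is_facebook_hosted("https://example.com/image.jpg"))  # False
```

When the check returns True, the page could fall back to a copy of the image hosted on your own domain.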