how to get pjscrape to print out current url in a file?


I am using pjscrape to scrape content from dynamic pages generated by a site. Please see the code below.
I can't figure out how to get it to print the URL of the scraped page in the JSON variables dumped to a file. I have tried various approaches, including document.url etc. (see the commented-out lines in the code below), but I can't get the urlFound variable to hold the right value. The answer might be dead simple, but it's eluding me. Is there any other way of doing this? Help!
var scraper = function() {
return {
//urlFound:$(window.location.href),
//urlFound: $(this).window.location.href,
//urlFound: _pjs.toFullUrl($(this).attr('href')),
//urlFound: _pjs.toFullUrl($(this).URL),
// Heck - how to print out the url being scraped???
name: $('h1').text(),
marin: _pjs.getText($("script:contains('marin')"))
}
};
pjs.config({
// options: 'stdout', 'file' (set in config.logFile) or 'none'
log: 'stdout',
// options: 'json' or 'csv'
format: 'json',
// options: 'stdout' or 'file' (set in config.outFile)
writer: 'file',
outFile: 'scrape_output.json'
});
pjs.addSuite({
url: 'http://www.mophie.com/index.html',
moreUrls: function() {
return _pjs.getAnchorUrls('li a');
},
scraper: scraper
});

You don't need jQuery to read window.location.href. I'm not sure how to get access to pjscrape's internal URL, but changing your code to this works:
var scraper = function() {
return {
urlFound: window.location.href,
name: $('h1').text(),
marin: _pjs.getText($("script:contains('marin')"))
}
};

Or you can just use document.URL: save it in a variable and then write it to a file (see "How to read and write into file using JavaScript").

Related

How to download file from Sanity via HTTP?

I would like to know whether it is possible to download a file from Sanity with an HTTP request.
I only have the reference ID:
{
file: {
asset: {
_ref: "file-fxxxxxxxxxxxxxxxxxxxx-xlsx"
_type: "reference"
}
}
}
I would like to do this in the following scenario:
<a href="https://cdn.sanity.io/assets/clientID/dataset/file-xxxxxxxxxxx-xlsx">
Download File
</a>

You can, indeed 🎉 With a bit of custom code you can do it from the _ref alone, which is the file document's _id.
Creating the URL from the _ref/_id of the file
The _ref/_id structure is something like this: file-{ID}-{EXTENSION} (example: file-207fd9951e759130053d37cf0a558ffe84ddd1c9-mp3).
With this, you can generate the downloadable URL, which has the following structure: https://cdn.sanity.io/files/{PROJECT_ID}/{DATASET}/{ID_OF_FILE}.{EXTENSION}. Here's some JavaScript pseudocode for the operation (PROJECT_ID and DATASET are placeholders for your own project's values):
const getUrlFromId = ref => {
// Example ref: file-207fd9951e759130053d37cf0a558ffe84ddd1c9-mp3
// We don't need the first part, unless we're using the same function for files and images
const [_file, id, extension] = ref.split('-');
return `https://cdn.sanity.io/files/${PROJECT_ID}/${DATASET}/${id}.${extension}`
}
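A slightly more defensive variant of the function above, as a minimal sketch. The name getFileUrl and its projectId and dataset parameters are assumptions made for illustration, and the regular expression assumes the file-{ID}-{EXTENSION} shape described above.

```javascript
// Hypothetical helper: build the CDN URL from a file reference,
// rejecting anything that does not look like file-{ID}-{EXTENSION}.
function getFileUrl(ref, projectId, dataset) {
  const match = /^file-([^-]+)-([^-]+)$/.exec(ref);
  if (!match) {
    throw new Error('Not a file asset reference: ' + ref);
  }
  const id = match[1];
  const extension = match[2];
  return `https://cdn.sanity.io/files/${projectId}/${dataset}/${id}.${extension}`;
}

// 'myProject' and 'production' are placeholder values.
const exampleUrl = getFileUrl(
  'file-207fd9951e759130053d37cf0a558ffe84ddd1c9-mp3',
  'myProject',
  'production'
);
console.log(exampleUrl);
```

Passing the project ID and dataset as arguments avoids relying on globals, and the early throw makes a malformed reference fail loudly instead of producing a broken link.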
Querying the URL directly
However, if you can query for the file's document with GROQ that'd be easier:
*[(YOUR FILTER HERE)] {
file->{ url } // gets the URL from the referenced file
}
You can do the same with images, too.

Supply Test Data into Nightwatch

I tried to supply test data to Nightwatch but I don't know how. How can I supply dynamic test data to Nightwatch tests?
I don't want to hardcode the values in the code. I want to supply them from a file.
EDIT:
.setValue('selector', 'DEBBIE A/P EKU')

Since you mentioned in one of your comments that you want to read the values from a file, I recommend doing it via pseudo-JSON (actually .js). This is also the solution I applied at my company.
I have multiple such files containing test data that I didn't want to have in the code. Their basic structure looks like this:
module.exports = {
WHATEVER_IDENTIFIER_I_WANT: 'Some shiny value'
}
My Page Object contains a line like this:
const STATIC = require('path/to/static/file')
…
.setValue('selector', STATIC.WHATEVER_IDENTIFIER_I_WANT)
And yes, it is not highly sophisticated, but it fulfils the purpose.
If you don't want to use module.exports and .js, you can still use Node's built-in methods to load and parse a JSON file, e.g.
fs.readFileSync / fs.readFile (to load the JSON file)
const fs = require('fs')
const file = fs.readFileSync('path/to/file', 'utf8')
JSON.parse() (to retrieve a JavaScript Object)
const STATIC = JSON.parse(file)
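Put together, the two steps can be sketched as follows. The helper name loadTestData, the temporary file, and the PATIENT_NAME key are assumptions made for illustration and are not part of Nightwatch.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read a JSON file and return the parsed object,
// with an error message that names the offending file.
function loadTestData(filePath) {
  const raw = fs.readFileSync(filePath, 'utf8');
  try {
    return JSON.parse(raw);
  } catch (err) {
    throw new Error('Could not parse test data in ' + filePath + ': ' + err.message);
  }
}

// Demo with a throwaway file so the sketch runs on its own.
const demoFile = path.join(os.tmpdir(), 'nightwatch-test-data-demo.json');
fs.writeFileSync(demoFile, JSON.stringify({ PATIENT_NAME: 'DEBBIE A/P EKU' }));
const STATIC = loadTestData(demoFile);
fs.unlinkSync(demoFile);
console.log(STATIC.PATIENT_NAME);
```

In a test or page object the value would then be used as .setValue('selector', STATIC.PATIENT_NAME).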
Hope this is useful for you :)

I've been through the same issue. At the moment my setup is like this:
The raw data is in an Excel sheet. I use Node.js to convert the Excel sheet into a JSON file, then use the JSON data in Nightwatch.
Here is the code to read the JSON file in Nightwatch:
module.exports = {
tags: ['loginpage'],
// if not regular size logout button is not visible
'Login' : function (client) {
var credentials;
try{
credentials = require('./path/to/inputJsonData.json');
} catch(err) {
console.log(err);
console.log ('Couldn\'t load the inputJsonData file. Please ensure that ' +
'you have the inputJsonData.json in subfolder ./path/to ' +
'in the same folder as the tests');
process.exit();
}
// Here is the code that uses the data from it:
client
.url(credentials.url)
.waitForElementVisible('body', 1000)
.assert.title('Sign In Home Page')
.login(credentials.username,credentials.password)
// some more steps here
.logout()
.end();
}
};
Contents of inputJsonData.json:
{
"url": "http://path/to/my/input/credentials/file",
"username": "yourUserName",
"password": "yourPassword"
}
My problem/question:
How do I find the number of elements read into the JSON object from a file when the file has the following format?
[
{
....
},
{
....
},
.....
{
....
}
]
My failed attempt to get the number of elements: JSON.parse(company).count, where company is another JSON file read like credentials in the code above.
Answer: use the standard JavaScript array property length, i.e. company.length.
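A minimal sketch of the difference, using an inline string in place of the file; the company data here is made up for illustration.

```javascript
// Stand-in for the file contents: a JSON array of objects.
const rawCompanyJson = '[{"name": "A"}, {"name": "B"}, {"name": "C"}]';

// JSON.parse turns the text into a regular JavaScript array.
const company = JSON.parse(rawCompanyJson);

// Arrays have no "count" property; "length" gives the number of elements.
const elementCount = company.length;
const missingCount = company.count;
console.log(elementCount, missingCount);
```

Note that when a .json file is loaded with require(), Node has already parsed it, so the result is the array itself and calling JSON.parse on it again is not needed.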

TheBayOr answered the question concisely regarding the use of files. Just to add: if you don't literally mean a non-code file but simply a different location to store the values, the most common approach is to use globals.
You can place a set of values in your nightwatch.json...
"test_settings" : {
"default" : {
"selenium_port" : 4444,
"selenium_host" : "localhost",
"silent": true,
"globals" : {
"VARIABLE_1" : "i'm a variable",
"VARIABLE_2" : "i'm a variable too"
},
"desiredCapabilities": {
"browserName": "chrome",
"javascriptEnabled": true,
"acceptSslCerts": true
}
},
"other_environment" : {
"globals" : {
"VARIABLE_1" : "i'm a different variable",
"VARIABLE_2" : "i'm a different variable too"
}
}
}
You can use them in tests with something like:
.url(browser.globals.VARIABLE_1)
Notice that in the above you can have sets of globals under different environments. The advantage is that you can keep multiple sets and pick the one you want by running nightwatch -e 'my desired environment'.
Similarly, this can be achieved by putting your data in a globals file, e.g. globals.js, and referencing it in your 'globals_path' setting.
If you want to get really into it, you can even store your variables in globals.js, then use the 'fs' library to write the values to a file and have your tests read from there. I'd recommend a new question if that's what you intend.
Hopefully that adds something :)

In my case I just created a function which reads variables, data, etc.
More details here: https://stackoverflow.com/a/64616920/3957754

WL.download with multiple files (OneDrive API)

I'm trying to implement a OneDrive picker. The user can select files and then, when saving, I can get these files and download them.
I followed the OneDrive API documentation and ended up with this:
WL.init({ client_id: clientId, redirect_uri: redirectUri });
WL.login({ "scope": "wl.skydrive wl.signin" }).then(
function(response) {
openFromSkyDrive();
},
function(response) {
log("Failed to authenticate.");
}
);
function openFromSkyDrive() {
WL.fileDialog({
mode: 'open',
select: 'single'
}).then(
function(response) {
log("The following file is being downloaded:");
log("");
var files = response.data.files;
for (var i = 0; i < files.length; i++) {
var file = files[i];
log(file.name);
WL.download({ "path": file.id + "/content" });
}
},
function(errorResponse) {
log("WL.fileDialog errorResponse = " + JSON.stringify(errorResponse));
}
);
}
function log(message) {
var child = document.createTextNode(message);
var parent = document.getElementById('JsOutputDiv') || document.body;
parent.appendChild(child);
parent.appendChild(document.createElement("br"));
}
In the select option, you can set 'single' or 'multi' to let the user select one or more files from the picker.
But when I set 'multi', the WL.download method only works for the last file.
Thanks for your help!
PS: I haven't found a real solution on Stack Overflow or any other forum.

This is a quirk of the WL.download() function. It creates a hidden iframe to execute the download, but it reuses the same iframe for every download. So if you queue up two downloads in quick succession, it will navigate the iframe twice and you'll only end up downloading the last file. WL.download() does not expose when a download is complete, so you can't simply wait for one to finish before starting the next.
Unfortunately, the code sample is a bit misleading in putting the WL.download() calls in a for-loop. We've taken note of these issues.
In the meantime, to unblock yourself, you can get the download URL from the 'file.source' property and initiate the download yourself.
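One way to initiate the downloads yourself is to give each file its own hidden iframe, so that no navigation replaces another. This is only a sketch: downloadAll is a hypothetical helper, not part of the WL SDK, and it assumes the URL in file.source is served in a way that makes the browser download it rather than display it. The document object is passed in as a parameter so the logic can be exercised outside a browser.

```javascript
// Hypothetical helper: start one download per file, each in its own hidden iframe.
function downloadAll(files, doc) {
  return files.map(function (file) {
    var frame = doc.createElement('iframe');
    frame.style.display = 'none';
    frame.src = file.source; // direct download URL taken from the file object
    doc.body.appendChild(frame);
    return frame;
  });
}

// Minimal stand-in for `document`, only so the sketch runs outside a browser.
var fakeDocument = {
  body: {
    children: [],
    appendChild: function (el) { this.children.push(el); }
  },
  createElement: function (tag) { return { tagName: tag, style: {} }; }
};

// Example URLs are placeholders.
var frames = downloadAll(
  [
    { name: 'a.txt', source: 'https://example.com/a' },
    { name: 'b.txt', source: 'https://example.com/b' }
  ],
  fakeDocument
);
console.log(frames.length);
```

In the page itself, the WL.fileDialog success callback would call downloadAll(response.data.files, document) instead of looping over WL.download().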

Loading store with params

What I am trying to do is load store with params like below so I only get the first ten items of my store.
app.stores.actualites.load({
params : {
start:0,
limit:10,
},
callback : function(records, operation, success) {
app.loadmask.hide();
}
});
But this is not working, it returns all the 18 store items.
If I put the start param to 1, it will return 17 items, so this param is working but not the other.
Update : Store code
app.stores.actualites = new Ext.data.Store({
model: 'app.models.Actualites',
proxy: {
type: 'ajax',
url: app.stores.baseAjaxURL + '&jspPage=%2Fajax%2FlistActualites.jsp',
reader: {
type: 'json',
root: 'actualite',
successProperty: 'success',
totalProperty: 'total',
idProperty: 'blogEntryInfosId'
}
}
});
The weird thing here is when I try the URL in a browser and add &start=0&limit=1 it works just fine...
Update : Try with extraParams
I also tried to do it with extraParams, but this still doesn't work:
app.stores.actualites.getProxy().extraParams.start = 1;
app.stores.actualites.getProxy().extraParams.limit = 2;
app.stores.actualites.load({
callback : function(records, operation, success) {
app.loadmask.hide();
}
});

The pagination itself has to be implemented on your server side. Sencha only keeps track of the pages and sends the proper start and limit values. You need to read these values in your server-side script and return the matching slice of results.
If you are using a list, you can use Sencha's built-in ListPaging plugin, which takes care of the start/limit parameters on its own.
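The server-side logic amounts to slicing the full result set by start and limit. The question's server is a JSP page, so the following JavaScript is only an illustration of the logic, not the actual implementation; paginate is a hypothetical function, and the response keys mirror the reader config from the question (actualite, success, total).

```javascript
// Hypothetical server-side logic: return only the requested window of items.
// startParam and limitParam arrive as strings from the query string.
function paginate(items, startParam, limitParam) {
  var start = Math.max(0, parseInt(startParam, 10) || 0);
  var parsedLimit = parseInt(limitParam, 10);
  // Without a usable limit, fall back to returning everything from start onward.
  var limit = (isNaN(parsedLimit) || parsedLimit < 0) ? items.length : parsedLimit;
  return {
    success: true,
    total: items.length, // total count, not the size of the page
    actualite: items.slice(start, start + limit)
  };
}

// 18 placeholder items, matching the count mentioned in the question.
var allItems = [];
for (var i = 0; i < 18; i++) {
  allItems.push({ blogEntryInfosId: i });
}

var firstPage = paginate(allItems, '0', '10');
var noLimit = paginate(allItems, '1', undefined);
console.log(firstPage.actualite.length, noLimit.actualite.length);
```

The second call reproduces the symptom from the question: when the limit never reaches the slicing code, a start of 1 yields 17 of the 18 items.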

This might sound weird, but I changed the name of the param 'limit' to 'stop' on both the client and the server, and it worked...

It should be something like this:
app.stores.actualites.getProxy().setExtraParams({
start:1,
limit:2
})

ExtJS 4: Changing Store param names

Right now I'm running into a problem where I can't seem to change the param names page, start, limit, and dir for an Ext.data.Store.
In ExtJS 3 I could do this:
paramNames :
{
start : 'startIndex',
limit : 'pageSize',
sort : 'sortCol',
dir : 'sortDir'
}
I tried adding this configuration to the Ext.data.Store in ExtJS 4, but 'start', 'limit', 'sort', and 'dir' still show up as the default param names. I need to change them because the server side requires the other names. As it stands, paging and remote sorting don't work, since the param names don't match what the server-side resource expects.
So is there a new way in ExtJS 4 to change these param names like in ExtJS 3?

Take a look at the proxy configs directionParam, limitParam, and so on;
see http://docs.sencha.com/ext-js/4-0/#/api/Ext.data.proxy.Server

To dynamically modify the parameters just before the load of a store you can do this:
/* set an additional parameter before loading, not nice but effective */
var p = store.getProxy();
p.extraParams.searchSomething = search;
p.extraParams.somethingelse = 'This works too';
store.load({
scope : this,
callback: function() {
// do something useful here with the results
}
});

Use this code:
proxy: {
type: 'ajax',
url: '/myurl',
method: 'GET',
extraParams: { myKeyword: 'abcd' },
reader: {
type: 'json',
root: 'rows'
}
}
Now you can change your myKeyword value from abcd to xyz in the following way:
gridDataStore.proxy.extraParams.myKeyword = 'xyz';
gridDataStore.load();
This sets the parameter's value and reloads the store.

The keys were renamed and moved to the Ext.data.Proxy object. Here's a simple example that tells ExtJS to use the default Grails parameter names:
Ext.create('Ext.data.Store', {
// Other store properties removed for brevity
proxy: {
// Other proxy properties removed for brevity
startParam: "offset",
limitParam: "max",
sortParam: "sort",
directionParam: "order",
simpleSortMode: true
}
});
I also set simpleSortMode so that each of the parameters is sent to the server as a discrete request parameter.
