I'm sure my problem is based on a lack of understanding of asynchronous programming in node.js, but here goes.
For example: I have a list of links I want to crawl. When each asynchronous request returns, I want to know which URL it was for. But, presumably because of race conditions, every callback sees the URL set to the last value in the list.
var links = ['http://google.com', 'http://yahoo.com'];
for (link in links) {
    var url = links[link];
    require('request')(url, function() {
        console.log(url);
    });
}
Expected output:
http://google.com
http://yahoo.com
Actual output:
http://yahoo.com
http://yahoo.com
So my question is either:
- How do I pass url (by value) to the callback function? OR
- What is the proper way of chaining the HTTP requests so they run sequentially? OR
- Something else I'm missing?
PS: For the first option, I don't want a solution that examines the callback's parameters, but a general way for a callback to know about variables 'from above'.
Answer
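Your url variable is not scoped to the for loop: a variable declared with var is scoped to the enclosing function (or to the global scope), never to a block. (ES2015 let and const are block-scoped, but var is not.) Every callback therefore shares the same url, and by the time any request completes, the loop has already set it to the last link. This is not a race condition. To fix it, create a function scope around the request call that captures the url value in each iteration of the loop, using an immediately invoked function: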
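var links = ['http://google.com', 'http://yahoo.com'];
// Declare link with var so it does not leak as an implicit global.
for (var link in links) {
    // The immediately invoked function gives each iteration its own url.
    (function(url) {
        require('request')(url, function() {
            console.log(url);
        });
    })(links[link]);
}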
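To see the difference without touching the network, here is a minimal sketch. It assumes a hypothetical fakeRequest function (not part of the request module) that only queues its callback, which mimics the fact that a real request's callback runs after the loop has finished.

```javascript
// Hypothetical stand-in for an asynchronous API: it only queues the callback,
// so nothing runs until the loop has finished (as with a real HTTP request).
var queue = [];
function fakeRequest(url, callback) {
  queue.push(callback);
}

var links = ['http://google.com', 'http://yahoo.com'];
var shared = [];   // results when every callback reads the same `url`
var captured = []; // results when each callback gets its own `url`

for (var i = 0; i < links.length; i++) {
  var url = links[i]; // one function-scoped variable, overwritten each pass
  fakeRequest(url, function () {
    shared.push(url);
  });
  (function (url) { // new scope per iteration; this `url` is a parameter
    fakeRequest(url, function () {
      captured.push(url);
    });
  })(links[i]);
}

// Run the queued callbacks only now, after the loop is done.
queue.forEach(function (callback) {
  callback();
});

console.log(shared);   // [ 'http://yahoo.com', 'http://yahoo.com' ]
console.log(captured); // [ 'http://google.com', 'http://yahoo.com' ]
```

The first callback in each pass closes over the single outer url, so both see its final value. The second closes over the function parameter, which is created fresh on every call.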
BTW, calling require inside the loop isn't good practice. Load the module once, outside the loop; the code should probably be rewritten as:
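// Load the module once, outside the loop.
var request = require('request');
var links = ['http://google.com', 'http://yahoo.com'];
// Declare link with var so it does not leak as an implicit global.
for (var link in links) {
    // The immediately invoked function gives each iteration its own url.
    (function(url) {
        request(url, function() {
            console.log(url);
        });
    })(links[link]);
}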
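If your runtime supports ES2015 (an assumption; the code above only needs var), you can probably avoid the wrapper function altogether. The sketch below shows two alternatives: a for...of loop with const, which creates a new binding per iteration, and Array.prototype.forEach, where url is simply a parameter of the per-element function. The fakeRequest helper is hypothetical and only queues callbacks so the example runs without network access.

```javascript
// Hypothetical stand-in for an asynchronous API: it only queues the callback.
const queue = [];
function fakeRequest(url, callback) {
  queue.push(callback);
}

const links = ['http://google.com', 'http://yahoo.com'];
const viaBlockScope = [];
const viaForEach = [];

// for...of with const: a fresh `url` binding is created on every iteration.
for (const url of links) {
  fakeRequest(url, function () {
    viaBlockScope.push(url);
  });
}

// forEach: `url` is a parameter of the function called once per element.
links.forEach(function (url) {
  fakeRequest(url, function () {
    viaForEach.push(url);
  });
});

// Run the queued callbacks after both loops have finished.
queue.forEach(function (callback) {
  callback();
});

console.log(viaBlockScope); // [ 'http://google.com', 'http://yahoo.com' ]
console.log(viaForEach);    // [ 'http://google.com', 'http://yahoo.com' ]
```

Both forms give each callback its own url, which is the general "variables from above" mechanism the question asks about: a closure keeps whatever binding was in scope when the callback was created.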