A patch to limit PHPCrawl crawling depth

PHPCrawl is a webcrawler/webspider-library written in PHP. It supports filters, limiters, cookie-handling, robots.txt-handling, multiprocessing and much more.

Unfortunately, PHPCrawl offers no way to limit the crawl depth. Here I propose a patch for its current version (0.82) that adds two methods to the PHPCrawler class: getMaxDepth and setMaxDepth.

The usage is intuitive:

$crawler = new PHPCrawler();
$crawler->setURL($startingURL);
$crawler->setMaxDepth($n);

The crawler will get pages from level 0 (the $startingURL) to level $n – 1.

By default, the crawling depth limit is set to PHPCrawler::UNLIMITED_CRAWLING_DEPTH = 0. This means that the crawler will get any web page, regardless of its depth from the starting URL.
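
To make the depth semantics concrete, here is a minimal, self-contained sketch in plain PHP. It is not the patch's code and it does not use PHPCrawl: the function crawlWithDepthLimit, the constant UNLIMITED_DEPTH and the in-memory $links map (a stand-in for actually fetching pages) are all made up for illustration. It only shows the rule described above: a limit of n keeps levels 0 to n - 1, and 0 means no limit.

```php
<?php
// Hypothetical constant mirroring the "0 means unlimited" convention.
const UNLIMITED_DEPTH = 0;

/**
 * Breadth-first walk over an in-memory link graph.
 * $links maps a URL to the URLs it links to (stand-in for real fetching).
 * Returns every visited URL mapped to its depth from $start.
 */
function crawlWithDepthLimit(array $links, string $start, int $maxDepth = UNLIMITED_DEPTH): array
{
    $depthOf = [$start => 0];
    $queue = [$start];

    while ($queue) {
        $url = array_shift($queue);
        $nextDepth = $depthOf[$url] + 1;

        // With a limit of n, only levels 0 .. n-1 are visited,
        // so links leading to level n or deeper are not followed.
        if ($maxDepth !== UNLIMITED_DEPTH && $nextDepth >= $maxDepth) {
            continue;
        }

        foreach ($links[$url] ?? [] as $target) {
            if (!isset($depthOf[$target])) {
                $depthOf[$target] = $nextDepth;
                $queue[] = $target;
            }
        }
    }

    return $depthOf;
}

// A tiny made-up site: "/" links to "/a" and "/b", and so on.
$links = [
    '/'    => ['/a', '/b'],
    '/a'   => ['/a/1'],
    '/a/1' => ['/a/1/x'],
];

$limited   = crawlWithDepthLimit($links, '/', 2); // levels 0 and 1 only
$unlimited = crawlWithDepthLimit($links, '/');    // every reachable page
```

With this toy graph, a limit of 2 visits only "/", "/a" and "/b", while the default visits all five pages.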

To apply the patch, download it and run:

patch -p1 -d PHPCrawl_082/ < PHPCrawl_082_maxcrawlingdepth_rev_2_1.path

from the parent directory of the PHPCrawl source code.

Download: patch (revision 2)

UPDATE: In the comments section, Hiruka reported that this patch is not easy to apply with the NetBeans patch facility. I recommend using the patch command from the command line, or some other tool. For instance, Hiruka succeeded in applying the patch using git.

UPDATE 2 (28/07/2014): I have uploaded a patch revision. This should fix the small bug reported by Sylvain LAVIELLE in the comment section (‘undefined offset’).

UPDATE 3 (28/08/2014): some more bug fixes. Also, I introduced an (experimental) feature to set the HTTP Accept-Language header:

// set preferred language
$crawler->setAcceptLanguage("it, en;q=0.8");


A social networks hub


In December 2012, a student, Daniele Orlando, organized a hackathon at my university. The contest topic was the social networking divide. For various reasons, I couldn’t attend.
My idea would have been to create a hub for my study course. We currently have (at least) three information channels: the official site, the Facebook group, and the super-cool Google group. It can be pretty chaotic to follow, or update, all of these.
The plan was to create a single point of access to these channels, leveraging technologies like RSS and the Facebook API. I’ve lately implemented this idea. It was quite easy: it probably took me less than a day to develop the proof of concept you’ll find attached to this post. You can also find a running demo here.

My social hub uses SimplePie to read RSS content (this covers the website and the Google group) and the Facebook Graph API to read (open) Facebook groups. The hub can also republish the aggregated content via RSS.
Since I suck at web design, I just used one of Srinivas Tamada’s templates.
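
The aggregation idea itself (collect items from several channels, order them by date, and republish them as one feed) can be sketched in a few lines of plain PHP. This is only an illustration under stated assumptions, not the code in the download: the item layout (title, link, timestamp, source), the functions mergeChannels, xmlText and toRss, and the example.org URLs are all hypothetical. In the real hub the items would come from SimplePie and the Facebook Graph API instead of hard-coded arrays.

```php
<?php
/**
 * Merge items from several channels into one list, newest first.
 * Each item is a hypothetical array with 'title', 'link', 'timestamp', 'source'.
 */
function mergeChannels(array ...$channels): array
{
    $items = array_merge([], ...$channels);
    usort($items, function (array $a, array $b): int {
        return $b['timestamp'] <=> $a['timestamp']; // descending by time
    });
    return $items;
}

/** Escape a string for use as XML text. */
function xmlText(string $s): string
{
    return htmlspecialchars($s, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

/** Render the merged items as a minimal RSS 2.0 document. */
function toRss(string $title, string $link, array $items): string
{
    $out  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $out .= '<rss version="2.0"><channel>' . "\n";
    $out .= '<title>' . xmlText($title) . '</title>' . "\n";
    $out .= '<link>' . xmlText($link) . '</link>' . "\n";
    $out .= '<description>' . xmlText($title) . '</description>' . "\n";

    foreach ($items as $item) {
        $out .= '<item>';
        $out .= '<title>' . xmlText($item['title']) . '</title>';
        $out .= '<link>' . xmlText($item['link']) . '</link>';
        $out .= '<pubDate>' . gmdate(DATE_RSS, $item['timestamp']) . '</pubDate>';
        $out .= '</item>' . "\n";
    }

    return $out . '</channel></rss>' . "\n";
}

// Made-up sample items standing in for the website feed and a Facebook group.
$site = [
    ['title' => 'Exam dates published', 'link' => 'http://example.org/news/1',
     'timestamp' => 1356048000, 'source' => 'website'],
];
$group = [
    ['title' => 'Study group tonight?', 'link' => 'http://example.org/group/42',
     'timestamp' => 1356134400, 'source' => 'facebook'],
];

$merged = mergeChannels($site, $group);
$rss = toRss('Social hub', 'http://example.org/', $merged);
```

Keeping every channel in the same item shape is what makes the merge and the RSS output independent of where an item came from.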

To run my code on your machine you’ll just need a LAMP installation and a Facebook application key and secret. The Facebook app is needed only if you want to read from an open Facebook group. Make sure to request the user_groups permission when creating the Facebook app.

Be aware!! This is just a proof of concept release!!

Download: SocialHub.tar.gz
