Perl: HTTP::Tiny delete leaves broken anchor tags

You’re misunderstanding what delete does. All your code does is remove the href attribute from that element in your Mojo::DOM representation. The <a> element itself stays in the document, which is why you end up with broken anchor tags. It has nothing to do with HTTP::Tiny.

What you actually want to do is call ->strip on the <a> element, which removes it from the DOM, but keeps its content intact.
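To make the difference concrete, here is a minimal sketch that applies both operations to the same snippet. It assumes your code did something like `delete $element->{href}`; the variable names are only for illustration.

```perl
use strict;
use warnings;
use Mojo::DOM;

my $html = 'This <a href="http://httpstat.us/404">one does not</a>!';

# Deleting the attribute leaves an empty <a> element behind.
my $deleted = Mojo::DOM->new($html);
delete $deleted->at('a')->{href};
print "$deleted\n";    # This <a>one does not</a>!

# strip removes the element itself but keeps its child nodes.
my $stripped = Mojo::DOM->new($html);
$stripped->at('a')->strip;
print "$stripped\n";   # This one does not!
```

The first result is still an anchor, just one that points nowhere. The second is plain text.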

Since you are already using Mojo::DOM, you might as well use Mojo::UserAgent. There is no need to pull in another UA module, because you’ve already got the whole of Mojolicious installed anyway.

You can use a HEAD request rather than a GET request to check if a resource is available. There is no need to download the whole body, because the status line and headers are sufficient.
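As a rough sketch, the check can be wrapped in a small helper. The name `link_is_alive`, the redirect limit and the timeouts are my own assumptions, not something from your code. Mojo::UserAgent does not follow redirects by default, so without `max_redirects` a 301 or 302 would count as a dead link here.

```perl
use strict;
use warnings;
use Mojo::UserAgent;

# Hypothetical helper: true if a HEAD request to $url ends in a 2xx response.
# Connection errors produce a response without a status code, so they are
# reported as dead links as well.
sub link_is_alive {
    my ($ua, $url) = @_;
    return $ua->head($url)->res->is_success ? 1 : 0;
}

# Assumed settings: follow a few redirects, give up on slow hosts.
my $ua = Mojo::UserAgent->new(
    max_redirects   => 3,
    connect_timeout => 5,
    request_timeout => 10,
);

# Check every URL passed on the command line.
for my $url (@ARGV) {
    printf "%s %s\n", (link_is_alive($ua, $url) ? 'ok  ' : 'dead'), $url;
}
```

Run it with one or more URLs as arguments. Keep in mind that a few servers reject HEAD requests outright, so a link reported as dead may still deserve a second look with GET.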

Your code (without the DB part) can be reduced to this:

use strict;
use warnings;
use Mojo::DOM;
use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new;
# Slurp DATA into one string, since new() expects a single HTML string.
my $dom = Mojo::DOM->new(do { local $/; <DATA> });

# Unwrap every link whose target does not answer a HEAD request with 2xx.
foreach my $element ($dom->find('a[href]')->each) {
    $element->strip
        unless $ua->head($element->attr('href'))->res->is_success;
}

print $dom;

__DATA__
This <a href="http://example.org">link works</a>.
This <a href="http://httpstat.us/404">one does not</a>!

This outputs:

This <a href="http://example.org">link works</a>.
This one does not!
