================================================================================

Challenge

In this challenge, we will modify our original rssbot.pl to include 8
preloaded RSS feeds. For example, to get the headlines from the IRCNow
Almanack, the user would type:

!ircnow

We also need to add the ability for the user to add and delete RSS feeds.

To add an RSS feed, a user can type:

!add name URL

To delete an RSS feed, a user can type:

!delete name

Finally, the old RSS bot displayed every single article in the RSS feed.
Some feeds can be very long, with hundreds of articles in them. Let's update
the bot so it only displays 5 items at a time.

================================================================================

Modifying rssbot.pl

We're going to change the name of RSSBot to NewsBot, so the filename will
change from rssbot.pl to newsbot.pl.

Next, we're going to replace the scalar $url with the hash %feedURLs so we
can download from multiple RSS feeds:

--- /home/perl103/rssbot.pl     Tue Aug 31 04:59:42 2021
+++ /home/perl104/newsbot.pl    Wed Sep  1 10:38:42 2021
@@ -6,21 +6,66 @@
 use base qw(Bot::BasicBot);
 use XML::RSS::Parser;
-my $url = 'https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss';
+my %feedURLs = (
+    "undeadly" => "http://undeadly.org/cgi?action=rss",
+    "eff" => "https://www.eff.org/rss/updates.xml",
+    "hackernews" => "https://news.ycombinator.com/rss",
+    "krebs" => "https://krebsonsecurity.com/feed",
+    "ircnow" => "https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss",
+    "schneier" => "https://www.schneier.com/blog/atom.xml",
+    "slashdot" => "http://rss.slashdot.org/Slashdot/slashdotMain",
+    "theregister" => "https://www.theregister.com/headlines.rss",
+);

The keys for %feedURLs are the names of the news sites, and the values are
the URLs of the RSS feeds.

Inside the subroutine said, we need to check for two new commands, !add and
!delete, plus a command named after each feed (for example, !ircnow).
 sub said {
     my $self = shift;
     my $arguments = shift;
-    if ($arguments->{body} =~ /^!rss/) {
+    if ($arguments->{body} =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {
+        my ($name, $url) = ($1, $2);
+        $feedURLs{$name} = $url;
+        $self->say(
+            channel => $arguments->{channel},
+            body => "$name added.",
+        );
+    }

We first check to see if the user typed !add. Here, we use Perl regular
expressions (regex for short) to see if the user typed in a valid feed name
and URL.

NOTE: It is very important to check that data is valid. If you don't, it can
become a source of security holes which attackers can use to take control of
your program.

Let's take a closer look at the if condition:

+    if ($arguments->{body} =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {

We check if the message $arguments->{body} fits the right format. It must
begin with the string !add, followed by one or more whitespace characters,
then a feed name made up of one or more word characters (letters, digits, or
underscores), then one or more whitespace characters, then http:// or
https://, then one or more printing characters up to the end of the string.
The feed name is captured in $1 and the URL is captured in $2.

If the IRC message matches our regex, we then store the name and URL as a
key-value pair in our hash %feedURLs, with the name as key and the URL as
value. We then send a message to the channel saying that $name has been
added.

In the next block, we check to see if the user typed !delete followed by a
feed name:

+    if ($arguments->{body} =~ m{^!delete\s+(\w+)$}) {
+        my $name = $1;
+        delete($feedURLs{$name});
+        $self->say(
+            channel => $arguments->{channel},
+            body => "$name deleted.",
+        );
+    }

If it matches our regular expression, we delete the key-value pair from
%feedURLs and then send a message to the channel.
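To try the two regular expressions without connecting to IRC, here is a
minimal sketch. It assumes a hypothetical helper named handle_command, which
is not part of newsbot.pl; it returns the reply text instead of calling
$self->say, and the one-entry %feedURLs is only a stand-in for the real hash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the bot's feed table (the real one has more entries).
my %feedURLs = (
    "ircnow" => "https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss",
);

# handle_command is a hypothetical helper for illustration only.
# It returns the reply the bot would send, or undef if no command matched.
sub handle_command {
    my ($body) = @_;
    # Same pattern as in said: !add, a feed name, then an http(s) URL.
    if ($body =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {
        my ($name, $url) = ($1, $2);
        $feedURLs{$name} = $url;
        return "$name added.";
    }
    # Same pattern as in said: !delete followed by a feed name only.
    if ($body =~ m{^!delete\s+(\w+)$}) {
        my $name = $1;
        delete($feedURLs{$name});
        return "$name deleted.";
    }
    return;
}

# Try a valid !add, an !add with a rejected URL scheme, and a !delete.
foreach my $line ('!add eff https://www.eff.org/rss/updates.xml',
                  '!add eff ftp://example.com/feed',
                  '!delete eff') {
    my $reply = handle_command($line);
    print "$line => ", (defined($reply) ? $reply : "(no match)"), "\n";
}
```

The second message is rejected because the URL does not begin with http://
or https://, so nothing is stored in the hash for it.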
Now, if a user sends any other command, we check to see if a key-value pair
is defined for the feed:

+    if ($arguments->{body} =~ /^!(\w+)$/) {
+        my $name = $1;
+        if (!exists($feedURLs{$name})) {
+            $self->say(
+                channel => $arguments->{channel},
+                body => "Error: $name has not been added",
+            );
+            return;
+        }

If none is defined, we send an error message to the channel and return.

If a URL is defined for the feed, then we create a new XML::RSS::Parser
object. We're going to replace the old foreach loop because it printed out
every single item in an RSS feed. Some of the new feeds we add have hundreds
of articles; a for loop with a counter allows us to limit the articles to 5
per feed.

         my $p = XML::RSS::Parser->new;
+        my $url = $feedURLs{$name};
         my $feed = $p->parse_uri($url);
-        foreach my $i ( $feed->query('//item') ) {
-            my $title = $i->query('title');
-            my $contributor = $i->query('dc:contributor');
-            my $link = $i->query('link');

In the code below, we first find the feed's title, then loop through the
items in the feed using a for loop. We start with index $i = 0 and stop when
we have printed all items or after we have finished 5, whichever comes
first. Each time through the loop, we increment $i (add one to it).

+        my $qtitle = $feed->query('/channel/title');
+        my $feed_title = $qtitle->text_content;
+        my @qitems = $feed->query('//item');
+        for (my $i = 0; $i < scalar(@qitems) && $i < 5; $i++) {

Inside the loop, we store the query result for the current item in $qitem.
We create a hash called %item for each item, and we store the feed's title
and the text of each tag inside. If a tag is undefined, we store an empty
string.
+            my $qitem = $qitems[$i];
+            my %item;
+            $item{feed_title} = $feed_title;
+            foreach my $tag (qw(title dc:contributor link comments)) {
+                my $qtag = $qitem->query($tag);
+                if(defined($qtag)) {
+                    $item{$tag} = $qtag->text_content;
+                } else {
+                    $item{$tag} = "";
+                }
+            }

We then send a formatted message to the channel with the feed's title and
the value of the tags for each item.

             $self->say(
                 channel => $arguments->{channel},
-                body => $title->text_content.' - '.$contributor->text_content.': '.$link->text_content,
+                body => "[\002$item{feed_title}\002] $item{title} ($item{'dc:contributor'}) $item{link}: $item{comments}",
             );
         }
     }

Many IRC clients interpret \002 as the control code that toggles bold text,
so the feed title appears in bold.

(Hint: sample code is in /home/perl104/newsbot.pl)

================================================================================

Username: perl104
Password: Hp9XsPhANc6
Server: freeirc.org
Port: 22

================================================================================