================================================================================

Challenge

In this challenge, we will modify our original rssbot.pl to include 8
preloaded RSS feeds. For example, to get the headlines from the IRCNow
Almanack, the user would type: !ircnow

We also need to add the ability for the user to add and delete RSS feeds.
To add an RSS feed, a user can type !add name URL
To delete an RSS feed, a user can type !delete name

Finally, the old RSS bot displayed every single article in the RSS feed.
Some feeds can be very long, with hundreds of articles in them. Let's
update the bot so it only displays 5 items at a time.

================================================================================

Modifying rssbot.pl

We're going to change the name of RSSBot to NewsBot, so the filename
will change from rssbot.pl to newsbot.pl.

Next, we're going to replace the scalar $url with the hash %feedURLs
so we can download from multiple RSS feeds:

--- /home/perl103/rssbot.pl Tue Aug 31 04:59:42 2021
+++ /home/perl104/newsbot.pl Wed Sep 1 10:38:42 2021
@@ -6,21 +6,66 @@
 use base qw(Bot::BasicBot);
 use XML::RSS::Parser;

-my $url = 'https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss';
+my %feedURLs = (
+    "undeadly" => "http://undeadly.org/cgi?action=rss",
+    "eff" => "https://www.eff.org/rss/updates.xml",
+    "hackernews" => "https://news.ycombinator.com/rss",
+    "krebs" => "https://krebsonsecurity.com/feed",
+    "ircnow" => "https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss",
+    "schneier" => "https://www.schneier.com/blog/atom.xml",
+    "slashdot" => "http://rss.slashdot.org/Slashdot/slashdotMain",
+    "theregister" => "https://www.theregister.com/headlines.rss",
+);

The keys for %feedURLs are the names of the news sites, and the values
are the URLs of the RSS feeds.

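As a minimal sketch of how such a hash can be used on its own, the snippet
below looks up one feed by name and lists the known feed names. The helpers
feed_url and feed_names are hypothetical and are not part of newsbot.pl; only
two of the feeds are repeated here to keep the example short.

```perl
use strict;
use warnings;

# A shortened copy of %feedURLs, for illustration only.
my %feedURLs = (
    "eff"    => "https://www.eff.org/rss/updates.xml",
    "ircnow" => "https://wiki.ircnow.org/index.php?n=Site.AllRecentChanges?action=rss",
);

# Hypothetical helper: return the URL stored for a feed name.
# Looking up a missing key returns undef.
sub feed_url {
    my ($name) = @_;
    return $feedURLs{$name};
}

# Hypothetical helper: return all feed names in alphabetical order.
sub feed_names {
    return sort keys %feedURLs;
}

print "Known feeds: ", join(", ", feed_names()), "\n";
print "eff => ", feed_url("eff"), "\n";
```

Because the name is the key, a command such as !eff only needs one hash
lookup to find the matching URL.
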
Inside the subroutine said, we need to check for two new commands,
!add and !delete, plus the name of the feed itself (for example, !ircnow).

 sub said {
     my $self = shift;
     my $arguments = shift;
-    if ($arguments->{body} =~ /^!rss/) {
+    if ($arguments->{body} =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {
+        my ($name, $url) = ($1, $2);
+        $feedURLs{$name} = $url;
+        $self->say(
+            channel => $arguments->{channel},
+            body => "$name added.",
+        );
+    }

We first check to see if the user typed !add <name> <url>. Here, we use
Perl regular expressions (regex for short) to see if the user typed in
a valid feed name and URL.

NOTE: It is very important to check that data is valid. If you don't,
it can become a source of security holes which attackers can use to
take control of your program.

Let's take a closer look at the if condition:

+    if ($arguments->{body} =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {

We check if the message $arguments->{body} fits the right format. It must
begin with the string !add, followed by one or more whitespace characters,
then a feed name made of one or more word characters (letters, digits, or
underscores), then more whitespace, then http:// or https://, then one or
more printable characters up to the end of the string. The feed name is
captured in $1 and the URL is captured in $2.

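To see which messages the regex accepts, here is a small sketch that runs it
against a few sample strings outside the bot. The helper parse_add and the
sample messages are assumptions made for illustration; only the regex itself
comes from newsbot.pl.

```perl
use strict;
use warnings;

# Hypothetical helper: return (name, url) if $body is a valid !add command,
# or an empty list otherwise. The regex is the one used in newsbot.pl.
sub parse_add {
    my ($body) = @_;
    if ($body =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$}) {
        return ($1, $2);
    }
    return;
}

foreach my $body (
    "!add eff https://www.eff.org/rss/updates.xml",   # valid
    "!add eff ftp://example.com/feed",                # wrong scheme
    "!add https://example.com/feed",                  # missing feed name
) {
    my ($name, $url) = parse_add($body);
    if (defined $name) {
        print "OK:      name=$name url=$url\n";
    } else {
        print "Invalid: $body\n";
    }
}
```

Only the first message is accepted. Note that [[:print:]] also matches the
space character, so the captured URL may contain spaces; a stricter bot might
use \S+ instead.
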
If the IRC message matches our regex, we then store the name and URL as
a key-value pair in our hash %feedURLs, with the name as key and the URL
as value. We then send a message to the channel saying that $name has
been added.

In the next block, we check to see if the user typed
!delete <name>

+    if ($arguments->{body} =~ m{^!delete\s+(\w+)$}) {
+        my $name = $1;
+        delete($feedURLs{$name});
+        $self->say(
+            channel => $arguments->{channel},
+            body => "$name deleted.",
+        );
+    }

If it matches our regular expression, we delete the key-value pair
from %feedURLs and then send a message to the channel.

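The block above reports "$name deleted." even when no feed with that name
exists. As one possible refinement (not what newsbot.pl does), the sketch
below uses the return value of delete to tell the two cases apart. The helper
delete_feed is hypothetical.

```perl
use strict;
use warnings;

# A one-entry copy of %feedURLs, for illustration only.
my %feedURLs = ("eff" => "https://www.eff.org/rss/updates.xml");

# Hypothetical helper: delete a feed and return the reply text for the channel.
sub delete_feed {
    my ($name) = @_;
    # delete returns the removed value, or undef if the key did not exist.
    my $old = delete $feedURLs{$name};
    if (defined $old) {
        return "$name deleted.";
    }
    return "Error: $name has not been added";
}

print delete_feed("eff"), "\n";   # first call removes the feed
print delete_feed("eff"), "\n";   # second call finds nothing to delete
```
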
Now, if a user sends any other command of the form !<name>, we check
whether a key-value pair exists for that feed name:

+    if ($arguments->{body} =~ /^!(\w+)$/) {
+        my $name = $1;
+        if (!exists($feedURLs{$name})) {
+            $self->say(
+                channel => $arguments->{channel},
+                body => "Error: $name has not been added",
+            );
+            return;
+        }

If none exists, we send an error message to the channel and return
from the subroutine.

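The three checks in said can be summarized in a small sketch that only
classifies a message, without talking to IRC. The helper classify and its
return values are assumptions made for illustration; the three regexes are the
ones shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a message with the same three regexes
# that said() uses. Returns the command type followed by its captures.
sub classify {
    my ($body) = @_;
    return ("add", $1, $2)
        if $body =~ m{^!add\s+(\w+)\s+(https?://[[:print:]]+)$};
    return ("delete", $1) if $body =~ m{^!delete\s+(\w+)$};
    return ("feed", $1)   if $body =~ m{^!(\w+)$};
    return ("ignored");
}

foreach my $body ("!add eff https://www.eff.org/rss/updates.xml",
                  "!delete eff", "!ircnow", "hello everyone") {
    my ($type, @args) = classify($body);
    print "$body => $type (@args)\n";
}
```

One consequence of the third regex is that a bare !add or !delete with no
arguments is treated as a request for a feed named add or delete, which
produces the "has not been added" error.
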
If a URL is defined for the feed, then we create a new XML::RSS::Parser
object. We're going to replace the old foreach loop because it printed
out every single item in an RSS feed. Some of the new feeds we add have
hundreds of articles; a C-style for loop with a counter allows us to
limit the output to 5 articles per feed.

         my $p = XML::RSS::Parser->new;
+        my $url = $feedURLs{$name};
         my $feed = $p->parse_uri($url);
-        foreach my $i ( $feed->query('//item') ) {
-            my $title = $i->query('title');
-            my $contributor = $i->query('dc:contributor');
-            my $link = $i->query('link');

In the code below, we first find the feed's title, then loop through
the items in the feed using a for loop. We start with index $i = 0 and
stop when we have printed all items or after we have printed 5, whichever
comes first. Each time through the loop, we increment $i (add one to it).

+        my $qtitle = $feed->query('/channel/title');
+        my $feed_title = $qtitle->text_content;
+        my @qitems = $feed->query('//item');
+        for (my $i = 0; $i < scalar(@qitems) && $i < 5; $i++) {

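The counting logic in that for loop can be tried in isolation. The sketch
below uses a plain array of made-up headlines in place of parsed feed items;
the helper first_n and the variable $max are hypothetical names, not part of
newsbot.pl.

```perl
use strict;
use warnings;

# Stand-in data: eight fake headlines instead of real parsed <item> nodes.
my @qitems = map { "Headline $_" } 1 .. 8;
my $max = 5;   # assumed limit, matching the 5 used in newsbot.pl

# Hypothetical helper: return at most $limit elements from the front of a list.
sub first_n {
    my ($limit, @list) = @_;
    my @out;
    # Same loop condition as in newsbot.pl: stop at the end of the list
    # or once $limit items have been collected, whichever comes first.
    for (my $i = 0; $i < scalar(@list) && $i < $limit; $i++) {
        push @out, $list[$i];
    }
    return @out;
}

print "$_\n" for first_n($max, @qitems);
```

With eight headlines this prints the first five; with a shorter list it
prints every element and stops without running past the end of the array.
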
Inside the loop, we store the current item in $qitem. We create
a hash called %item for each item, and we store the feed's title
and the text of each tag inside. If an item does not have a tag, we
store an empty string for it.

+            my $qitem = $qitems[$i];
+            my %item;
+            $item{feed_title} = $feed_title;
+            foreach my $tag (qw(title dc:contributor link comments)) {
+                my $qtag = $qitem->query($tag);
+                if (defined($qtag)) {
+                    $item{$tag} = $qtag->text_content;
+                } else {
+                    $item{$tag} = "";
+                }
+            }

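The same "use an empty string when the tag is missing" idea can be shown
without XML::RSS::Parser. In the sketch below, a plain hash stands in for one
parsed item, and the defined-or operator // replaces the if/else. The helper
fill_tags and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;

# Stand-in for one parsed item: a missing key plays the role of a tag
# that the feed does not provide.
my %raw = (title => "Example headline", link => "https://example.com/1");

# Hypothetical helper: copy the wanted tags, replacing missing ones with "".
sub fill_tags {
    my ($raw, @tags) = @_;
    my %item;
    foreach my $tag (@tags) {
        # The defined-or operator // needs Perl 5.10 or later.
        $item{$tag} = $raw->{$tag} // "";
    }
    return %item;
}

my %item = fill_tags(\%raw, qw(title dc:contributor link comments));
print "$_ = '$item{$_}'\n" for sort keys %item;
```

Filling every key up front means the later string interpolation never sees an
undefined value, which would otherwise trigger "uninitialized value" warnings.
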
We then send a message to the channel, properly formatted, with the feed's
title and the value of the tags for each item.

             $self->say(
                 channel => $arguments->{channel},
-                body => $title->text_content.' - '.$contributor->text_content.': '.$link->text_content,
+                body => "[\002$item{feed_title}\002] $item{title} ($item{'dc:contributor'}) $item{link}: $item{comments}",
             );

Many IRC clients interpret the control character \002 as a toggle for
bold text, so the feed title between the two \002 characters is shown in bold.

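As a small sketch of what the \002 escape produces, the snippet below wraps a
string in the control character and inspects the result. The helper bold and
the sample title are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap text in the IRC bold control character.
# "\002" is octal notation for the byte with value 2 (also written "\x02").
sub bold {
    my ($text) = @_;
    return "\002$text\002";
}

my $line = "[" . bold("Example Feed") . "] Example headline";

# The control characters are invisible in most terminals, so print
# their numeric value instead of the raw line.
printf "length=%d, byte after '[' has value %d\n",
    length($line), ord(substr($line, 1, 1));
```
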

(Hint: sample code is in /home/perl104/newsbot.pl)

================================================================================

Username: perl104
Password: Hp9XsPhANc6
Server: freeirc.org
Port: 22

================================================================================