<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Flameeyes's Weblog: Service entry: looking for information about a "Yeti bot"</title>
    <link>http://blog.flameeyes.eu/articles/2008/01/18/service-entry-looking-for-information-about-a-yeti-bot</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Proud to be European</description>
    <item>
      <title>Service entry: looking for information about a &amp;quot;Yeti bot&amp;quot;</title>
      <description>&lt;p&gt;The stats for my blog were all messed up yesterday. With a fairly usual number of unique visits (about 900), the number of requested pages and of total hits skyrocketed, totaling more than 400 MB of traffic against the usual ~100 MB (depending on the blog posts and on whether they are picked up by other sites too).&lt;/p&gt;


	&lt;p&gt;Sure, these weren&amp;#8217;t stats on the scale of last year&amp;#8217;s Slashdot effect, but that was still quite a bit of bandwidth being used. AWstats wasn&amp;#8217;t picking up any new bot, nor a single specific IP biasing the stats, so I had to do some manual analysis to find the cause&amp;#8230;&lt;/p&gt;


	&lt;p&gt;There are two possible culprits. One is a German IP (reporting Opera as its user agent), which seemed to refresh &lt;a href="http://blog.flameeyes.eu/articles/2008/01/15/tech-side-overcomes-differences"&gt;my last post on Gentoo&amp;#8217;s &#8220;issue&#8221;&lt;/a&gt; constantly from 9 AM until 10 PM. It seems a legitimate request, although I&amp;#8217;d suggest that this reader, if (s)he&amp;#8217;s reading this, use the &lt;span class="caps"&gt;RSS&lt;/span&gt; feed for the comments instead; that will save both their bandwidth and mine ;)&lt;/p&gt;


	&lt;p&gt;The other is clearly a bot, as it advertises itself as such: &amp;#8220;Yeti/0.01 (nhn/1noon, yetibot@naver.com, check robots.txt daily and follow it)&amp;#8221;. The requests from this bot come from a single class B network, although they are mixed with requests from a different &amp;#8220;NaverBot&amp;#8221; (which points to &lt;a href="http://help.naver.com/delete_main.asp"&gt;http://help.naver.com/delete_main.asp&lt;/a&gt; in its user agent).&lt;/p&gt;


	&lt;p&gt;The netblock owner is &lt;span class="caps"&gt;NHN&lt;/span&gt; Corporation, which seems to be the entity behind the &lt;a href="http://naver.com/"&gt;Naver&lt;/a&gt; site. That in turn seems to be some kind of search engine, likely something similar to Technorati, but my Korean is&amp;#8230; well, let&amp;#8217;s just say the only Asian language I can even barely understand is Japanese.&lt;/p&gt;


	&lt;p&gt;I don&amp;#8217;t mind indexing, I don&amp;#8217;t block any robot in my robots.txt, and right now bandwidth is far from being a problem (it would have been a &lt;em&gt;very big&lt;/em&gt; problem if the blog were still hosted on my home connection, though). But they hit the robots.txt file 384 times yesterday alone, out of 542 hits total for the day! At this point I&amp;#8217;d very much like to write to them about this.&lt;/p&gt;


	&lt;p&gt;So the question is: am I the only one out there being hit by this &amp;#8220;Yeti bot&amp;#8221;? And do any of my readers understand Korean and can tell me what the NaverBot page linked above says?&lt;/p&gt;


	&lt;p&gt;Sorry for the service posting.&lt;/p&gt;</description>
      <pubDate>Fri, 18 Jan 2008 02:27:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:9650c6d8-fca8-489c-8509-753fa5ca1598</guid>
      <author>flameeyes@gmail.com (Diego "Flameeyes" Petten&#242;)</author>
      <link>http://blog.flameeyes.eu/articles/2008/01/18/service-entry-looking-for-information-about-a-yeti-bot</link>
      <comments>http://blog.flameeyes.eu/articles/2008/01/18/service-entry-looking-for-information-about-a-yeti-bot#comments</comments>
      <category>English</category>
      <category>Naver</category>
      <category>SearchEngines</category>
      <category>Robots</category>
      <category>Service</category>
    </item>
    <item>
      <title>"Service entry: looking for information about a "Yeti bot"" by blog.coldtobi.de</title>
      <description>&lt;p&gt;Yes, YETI was also here, reloading the very same page at least 20 times&amp;#8230;&lt;/p&gt;</description>
      <pubDate>Tue, 22 Jan 2008 23:18:45 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:aaa8e22e-837b-4042-8648-31196d45b526</guid>
      <link>http://blog.flameeyes.eu/articles/2008/01/18/service-entry-looking-for-information-about-a-yeti-bot#comment-1817</link>
    </item>
    <item>
      <title>"Service entry: looking for information about a "Yeti bot"" by Ken</title>
      <description>&lt;p&gt;Hi,&lt;/p&gt;


	&lt;p&gt;Just checked today&amp;#8217;s log on &lt;a href="http://www.college-loans.us" rel="nofollow"&gt;www.college-loans.us&lt;/a&gt; and found about eight hits from yetibot @ naver.com. It&amp;#8217;s the first time I&amp;#8217;ve noticed it, but I wasn&amp;#8217;t looking before&amp;#8230; I&amp;#8217;ll keep an eye out now.&lt;/p&gt;</description>
      <pubDate>Tue, 22 Jan 2008 22:01:47 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:8939cc8f-b3eb-495b-b743-6c455ac8c2a3</guid>
      <link>http://blog.flameeyes.eu/articles/2008/01/18/service-entry-looking-for-information-about-a-yeti-bot#comment-1812</link>
    </item>
    <item>
      <title>"Service entry: looking for information about a "Yeti bot"" by Infirit</title>
      <description>&lt;p&gt;I found it &lt;a rel="nofollow"&gt;here&lt;/a&gt;. But I would block it, as this is not how a bot should behave.&lt;/p&gt;</description>
      <pubDate>Fri, 18 Jan 2008 20:24:41 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:95179812-9bf5-4e97-badd-48cca2dd8148</guid>
      <link>http://blog.flameeyes.eu/articles/2008/01/18/service-entry-looking-for-information-about-a-yeti-bot#comment-1792</link>
    </item>
  </channel>
</rss>
