| Five Practical Reasons for Fighting Plagiarism |
| Published: July 25, 2007, 12:43 pm |
| Tags: Prevention, Products, Legal Issues, Articles, Dmca, Content Theft, Copyright Infringement, Copyright Law, Creative Commons, Google, Marketing, Plagiarism, Reputation, Scraping, Search Engines, Seo |
| nsfw) as it pertains to scraping, one penalty is certain: increased competition. Even if there is no algorithmic “penalty” placed on your site, the plagiarists will still show up in your keyword results. For example, if you had a keyword unique to your site, you’d be number one for certain. If you were plagiarized six |
|
|
| WordPress and Comment Spam |
| Published: July 24, 2007, 3:56 pm |
| Tags: Prevention, Personal Experiences, Articles, Akismet, Automattic, Comment Spam, Content Theft, Copyright Infringement, Copyright Law, Plagiarism, Scraping, Spam, Spam Blogs, Wordpress |
| I would like to take an aside and delve into a related topic that has been on my mind for the past few months: comment spam. Though it doesn’t have much to do with content theft, I have several reasons for wanting to cover this. First, many of the RSS scrapers and spam bloggers also use this technique to supplement their work. Second, in some |
|
|
| Picking a Dead Man's Pocket |
| Published: July 19, 2007, 12:23 pm |
| Tags: Prevention, Legal Issues, Articles, Dmca, Archive, Caching, Content Theft, Copyright, Copyright Infringement, Internet Archive, Meta Tags, Plagiarism, Robots Txt, Rss, Scraping, Spam, Spammers, Splogs |
| theft, plagiarism and scraping is making you think about packing up and shutting down your site, you might want to think again. These days, even death does not put an end to content theft. It merely opens up new avenues for it. As a recent article on Blue Hat SEO pointed out, nothing is ever really deleted from the Web. Caching sites and |
|
|
| Protecting the Comment Feed |
| Published: August 1, 2007, 2:51 pm |
| Tags: Prevention, Articles, Atom, Comments, Content Theft, Copyright Infringement, Feed, Plagiarism, Rss, Scraping, Spam Blogs, Splogs, Wordpress |
| compared to traditional RSS scraping, comment scraping provides a different set of problems. First, the Webmaster does not hold the copyright to the majority of the comments posted on a site, just the ones they posted. Thus, they cannot always file a DMCA notice or a cease and desist letter for the comments that are taken. This makes |
|
|
| Plagiarism Today Turns Two |
| Published: August 2, 2007, 2:24 pm |
| Tags: Housekeeping, Personal, Articles, Anniversary, Birthday, Content Theft, Copyright Infringement, Plagiarism, Plagiarism Today, Scrapers, Scraping, Spam Blogs, Splogs, Two |
| just plagiarism, including scraping, spam blogging and image hotlinking. That, in turn, is where Plagiarism Today has been for the past year or so, working with Webmasters, companies and organizations to raise awareness and create solutions to help artists, writers and musicians who post their work on the Web get the credit they deserve. That |
|
|
| A Scrape of a Scrape |
| Published: August 7, 2007, 11:07 am |
| Tags: Prevention, Personal Experiences, Articles, Content Theft, Copyright Infringement, Ie7, Plagiarism, Rss, Scraping, Spam, Spam Blogging, Spam Blogs, Splogs |
| alike exactly how bad scraping is on the Web. I discuss my past experiments on the topic and how, depending on your keywords, suspicious traffic starts showing up with the first post. However, as I was searching for information on IE7 security flaws for another site I’m working on, I ran across something that was truly mind-blowing. |
|
|
| Transcraping: Multi-Lingual Content Theft |
| Published: August 29, 2007, 3:55 pm |
| Tags: Prevention, Legal Issues, Articles, Blogging, Blogs, Content Theft, Copyright Infringement, Copyright Law, Plagiarism, Rss, Scraping, Transcraping, Translations |
| to Worry Though this type of scraping is still definitely copyright infringement, since translation and derivative works rights go to the copyright holder, there are several reasons why it is less worrisome than more traditional scraping. The good news and the bad news are one and the same: Search engines can’t detect this kind of plagiarism. |
|
|
| ASP.NET: REGEX Parse the RSS / ATOM Feed Url from a Page |
| Published: August 18, 2007, 12:07 pm |
| Tags: Regex, Asp Net, Screen Scraping |
| 2. - Grabel's Law I've been scraping again, I confess. Just can't resist it. One of the things I've run into when grabbing a bunch of web pages in a threadpool callback is how to determine if the page sports the autodiscovery tags (e.g. there is a feed for the site). Here is one way to do this with a little bit of REGEX: using |
|
|
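The ASP.NET excerpt above is cut off before its code, so the original regex is not shown here. As a rough illustration of the same idea, here is a minimal Python sketch (not the author's C# code) that pulls feed URLs out of a page's autodiscovery tags. It assumes a feed is advertised by a `<link>` tag with `rel="alternate"` and a type of `application/rss+xml` or `application/atom+xml`; the function name `find_feed_urls` and the sample page are hypothetical.

```python
import re

# Find each <link ...> tag first, then read its attributes separately so the
# order of rel/type/href inside the tag does not matter.
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)
ATTR = re.compile(r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")

# Assumption: these two MIME types are the ones treated as feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


def find_feed_urls(html):
    """Return the href of every autodiscovery <link> tag found in html."""
    urls = []
    for tag in LINK_TAG.findall(html):
        attrs = {}
        for name, double_quoted, single_quoted in ATTR.findall(tag):
            attrs[name.lower()] = double_quoted or single_quoted
        rel_values = attrs.get("rel", "").lower().split()
        is_feed = attrs.get("type", "").lower() in FEED_TYPES
        if "alternate" in rel_values and is_feed and attrs.get("href"):
            urls.append(attrs["href"])
    return urls


if __name__ == "__main__":
    # Hypothetical page used only to exercise the function.
    sample = (
        '<head>'
        '<link rel="alternate" type="application/rss+xml" href="http://example.com/feed/">'
        '<link rel="stylesheet" href="style.css">'
        '</head>'
    )
    print(find_feed_urls(sample))
```

A regex like this only handles quoted attribute values and will miss unusual markup; a real HTML parser is the safer choice when pages are messy.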
| Autodiscovery and RSS Scraping |
| Published: September 5, 2007, 2:49 pm |
| Tags: Prevention, Articles, Autodiscovery, Content Theft, Copyright Infringement, Plagiarism, Rss, Scraping, Spam, Spam Blogs, Splogs |
| the spammers who have been scraping his feed. Though the idea was tempting, the sad truth is that autodiscovery does not play a major role one way or another in dealing with feed scraping. Though it helps browsers and users find the feed, spammers have other methods of feed detection that bypass not only the tags in your HTML, but your site |
|
|
| The DMCA on Seven Blog Hosts |
| Published: September 6, 2007, 4:07 pm |
| Tags: Legal Issues, Personal Experiences, Articles, Dmca, Aol, Blogger, Blogging, Blogsome, Blogspot, Content Theft, Copyright, Copyright Infringement, Copyright Law, Livejournal, Microsoft, Msn, Plagiarism, Rss, Scraping, Sixapart, Typepad, Windows Live Spaces, Wordpress |
| For the next chapter in the “DMCA Seven” series, we’re taking a look at one of the most common types of hosts out there, blog hosts. Many of these hosts have been copyright headaches for Webmasters. They are prime targets for spam blogs and scrapers and some have played a huge role in the rise of the “splogosphere”. |
|
|
| Legal and Ethical Link Blogging |
| Published: September 12, 2007, 6:04 pm |
| Tags: Legal Issues, Articles, News, Content Theft, Copyright Infringement, Duncan Riley, Google, Google Reader, Plagiarism, Robert Scoble, Rss, Scoble, Scraping, Spam Blogs, Splogging, Splogs, Techcrunch |
| on this site about why RSS scraping is not acceptable. Yet, many in the pro-syndication camp continue to talk of implied licenses or variations of the theme. Unfortunately for them, the implied license theory has no basis in law, has never actually been tested and seems likely to fail when it is. The notion of an implied license with RSS |
|
|
| RSS Brief: Another Scraping/Spam Threat |
| Published: September 14, 2007, 12:08 pm |
| Tags: Legal Issues, Articles, News, Content Theft, Copyright Infringement, Icerocket, Pay Per Post, Ppp, Rss, Rss Brief, Scraping, Spam Blogging, Splogging, Splogs, Technorati |
| fewer copyright issues than scraping full feeds. Though an RSS Brief feed might be less keyword rich, it would also be much more modified from the original, making it harder for search engines and Webmasters to spot. Depending on the nature of the spammer, they might find these RSS Brief feeds preferable to the existing alternatives. Also, much |
|
|
| Attributor Signs Up Reuters |
| Published: September 17, 2007, 12:24 pm |
| Tags: Products, Articles, News, Attributor, Content Theft, Copyright Infringement, Fingerprinting, Msm, Plagiarism, Reuters, Rss, Scraping |
| In a press release dated today, content monitoring company Attributor announced that they have signed a deal with the British news service Reuters. This deal closely mirrors a similar arrangement Attributor announced with the Associated Press in May of this year. According to its press release, Attributor will “fingerprint original |
|
|
| Modified Scraping on the Rise |
| Published: November 8, 2007, 5:33 pm |
| Tags: Articles, Legal Issues, News, Personal Experiences, Prevention, Blogspot, Content Theft, Copyright Infringement, Duplicate Content, Google, Plagiarism, Rss, Scraping, Search Engine Spam, Spam Blogs, Splogs, Synonymized, Thesaurus |
| struggle with this kind of scraping as will all other plagiarism checkers that work by looking for verbatim copying. This includes high-end academic solutions. Even Google Alerts can be thwarted by this if the phrase being searched for is modified in the process of spinning the article. Though Blogwerx is working on a product that can detect |
|
|
| workFRIENDLY: An Accidental Scraper |
| Published: November 9, 2007, 4:07 pm |
| Tags: Articles, Dmca, Legal Issues, News, Prevention, Anonymouse, Content Theft, Copyright Infringement, Fair Use Dmca, Google, Google Cache, Plagiarism, Proxies, Scraping, Spam Blogs, Splogs, Workfriendly |
| one of the most prolific scraping sites I’ve seen and certainly one of the best at getting their results listed in Google. What is Going On Fundamentally, workFRIENDLY is little different than other proxy services on the Web including Anonymouse and even Google cache. What separates workFRIENDLY from these services is that it modifies |
|
|