Fighting email spam in Thunderbird

Fighting email spam isn’t just about saving time; it’s a social responsibility. If you spend some time rendering spam useless, you are helping to worsen the risk-to-benefit ratio for spammers, so you are actually working to stop them.

Though I don’t use email intensively (I could go a few days without checking it and the sky wouldn’t fall), I don’t like spending needless time on it either. I use Thunderbird, and its spam filter takes care of some of the dirty work, but it still misses some spam, so I decided to make my incoming mail folders more Zen using some ‘this is not your room’ rules.

Thunderbird hasn’t implemented global filters (note: here ‘filter’ refers to the whole policy and ‘rule’ to each condition within the filter), so the provided workaround is to use the Global Inbox and then use filters, column sorting or searches to browse mail by account. The Global Inbox is activated in the account settings, under Server Settings, in the Advanced dialog. The built-in Global Inbox apparently has the annoying feature of making mail already received in the former account folders inaccessible (no wizard is offered to solve that), so you must manually move all the mail out of the account folders before activating it. Don’t try to replace the Global Inbox with a redirect filter: when a message is moved, it won’t be filtered again by the destination folder. I know, Thunderbird filtering is a bit clumsy. It doesn’t even allow regular expressions, oh well…

After turning on the Global Inbox, I renamed my ‘local folders’ to ‘classified e-mails’, just for semantic aesthetics. I also added the Account column to the view (click the rightmost square thingie with a tiny down arrow).

Next, I went to Preferences | Advanced | General, then opened the configuration editor to lower mail.adaptivefilters.junk_threshold to 30. That’s the confidence threshold, as a percentage, of the Bayesian junk control. The default value of 90 is very conservative, but be careful over the following days until the junk control is better trained (I almost missed an administrative alert right after changing the threshold). Note that you can set the junk control to ignore mail from senders in your address book.
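The effect of lowering the threshold can be sketched in a few lines of Python. This is only an illustration, not Thunderbird’s code: the function name, the 0 to 100 score and the ‘at or above’ comparison are all assumptions made for the example.

```python
def is_junk(score_percent: int, threshold_percent: int = 90) -> bool:
    """Flag a message as junk when its score reaches the threshold.

    score_percent is a hypothetical 0-100 confidence that the message is spam;
    the 'at or above' comparison is an assumption of this sketch.
    """
    return score_percent >= threshold_percent


borderline = 45  # a message the classifier is unsure about

print(is_junk(borderline))      # conservative default threshold of 90
print(is_junk(borderline, 30))  # lowered threshold of 30
```

With the lower threshold the borderline message is flagged, which is exactly why a legitimate alert can end up in the junk folder while the classifier is still being trained.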

Classifying legitimate mail is still the best way to avoid spam’s white noise (if you use email for business, you should probably be classifying mail already). I created the folders ‘10. Work’, ‘20. Contacts’, ‘30. Whitelist’, ‘40. Other senders’, ‘50. Blog comments’, ‘60. Mailing lists’, ‘70. Newsletters’, ‘80. Archive’, ‘90. Filtered Spam’ (by Sturgeon’s Law). Numbering the folders lets you control their order, since Thunderbird sorts them alphabetically by default. Because I’m a retro guy, I used old BASIC-style line numbering.

Now it’s time to set up the filters (they are in the Tools menu). Remember that order matters: the filters that move mail must go after those that don’t.

First, I set up filters to send mail from the work and contact address books to folders 10 and 20, and mail from whitelisted and other known addresses to folder 30. This ensures that I’ll always read legitimate mail first (well, as long as I don’t make spammer friends).

Then I wanted to back up the junk control with some keyword checking for things I know I’m not going to be emailed about, like Rolex watches, xx% off offers, penises and so on. These can be included in the same filter as several rules under ‘match any of these’ (be sure to pick the right radio button). Because there’s some risk of false positives, I made these messages go to folder 90 instead of the junk folder, so I can review them more easily or set up a different retention policy. Anyway, non-native English speakers like me can safely assume that fangirls and fanboys aren’t going to praise our physical attributes in Anglo-Saxon.
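The ‘match any’ logic of that keyword filter can be sketched as follows. This is a minimal Python illustration, not how Thunderbird implements it: the keyword list is a shortened, hypothetical one, and the case-insensitive substring comparison is an assumption of the sketch.

```python
# Hypothetical keyword list; "% off" approximates the "xx% off" offers,
# since plain substring rules cannot express "any number".
SPAM_KEYWORDS = ("rolex", "% off")


def matches_any(subject: str, keywords=SPAM_KEYWORDS) -> bool:
    """Return True if at least one keyword appears in the subject line."""
    lowered = subject.lower()
    return any(keyword in lowered for keyword in keywords)


for subject in ("Cheap ROLEX watches", "Today only: 50% off", "Meeting notes"):
    folder = "90. Filtered Spam" if matches_any(subject) else "(left in place)"
    print(f"{subject!r} -> {folder}")
```

Choosing ‘match all’ instead would require every keyword to appear in the same subject, so almost nothing would be caught.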

Blog comments (from my WordPress blog), mailing lists, newsletters, etc., have addresses and subjects that are easy to match and filter. They add a lot of bulk, so keeping them apart makes it a whole lot easier to spot junk later. Some can be whitelisted and classified directly; others you may want to place after the keyword filter.

The last filter sent mail from all the unknown senders still left in the inbox to folder 40 (to keep it within a closer visual space and not separated by all the sent/drafts/junk folders). Additional filters and subfolders can be added before this one to organize mail that comes from different accounts or uses certain subjects.
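The whole chain works as a ‘first match wins’ list, which is why the order of the filters matters. Here is a rough Python sketch of that idea under stated assumptions: the addresses, the helper names and the dictionary-based message are all hypothetical, and the blog comment, mailing list and newsletter filters are left out for brevity.

```python
from typing import Callable, List, Tuple

Mail = dict  # a message reduced to {"sender": ..., "subject": ...}
Rule = Tuple[Callable[[Mail], bool], str]

# Hypothetical address books
WORK = {"boss@example.com"}
CONTACTS = {"friend@example.org"}
WHITELIST = {"shop@example.net"}


def sender_in(book: set) -> Callable[[Mail], bool]:
    """Build a condition that checks the sender against an address book."""
    return lambda mail: mail["sender"].lower() in book


def subject_has(*keywords: str) -> Callable[[Mail], bool]:
    """Build a condition that looks for any keyword in the subject."""
    return lambda mail: any(k in mail["subject"].lower() for k in keywords)


# Order matters: trusted senders are filed before the keyword check runs.
FILTERS: List[Rule] = [
    (sender_in(WORK), "10. Work"),
    (sender_in(CONTACTS), "20. Contacts"),
    (sender_in(WHITELIST), "30. Whitelist"),
    (subject_has("rolex", "% off"), "90. Filtered Spam"),
    (lambda mail: True, "40. Other senders"),  # catch-all, always last
]


def route(mail: Mail) -> str:
    """Return the folder of the first filter whose condition matches."""
    for matches, folder in FILTERS:
        if matches(mail):
            return folder
    return "Inbox"


print(route({"sender": "boss@example.com", "subject": "Team lunch, 50% off"}))
print(route({"sender": "stranger@example.io", "subject": "Rolex for you"}))
print(route({"sender": "stranger@example.io", "subject": "Hello"}))
```

Because the work address book is checked first, a message from a known sender is never diverted by the keyword filter, even if its subject happens to contain a suspicious phrase.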

Finally, I reviewed the unknown senders and put all those I could trust into a Whitelist address book. Spam exploits the weaknesses of blacklist systems but can’t do much against whitelist systems, because gaining trust can hardly be automated. The address books are your whitelist system, so it’s important to do frequent housekeeping.

After Rain

El castillo estaba indefenso tras la lluvia.
The castle was defenseless after the rain.

WordPress Comment Spam Attack

This blog has been targeted over the last two days by a comment spam bot that issued about 120 spammy comments. The comments never went live because they landed in the moderation queue: I had set up a moderation rule during a previous ‘benign’ attack by the same kind of bot, and that rule caught them all. If you receive a similar flood of comments, look through them for a common pattern and block that pattern at Options | Discussion. This particular bot can be blocked by adding a rule against “mail.com”. You can also use the tips from Google guy Matt Cutts to increase the security of your WordPress install (they are good for other CMSs too).

The bot I was talking about wasn’t acting subtly at all. For backlinks, getting one spam comment through is mostly enough, and one is more likely to go unnoticed than hundreds. The URLs were just random, so it could look like unskilled or pointless spam. Reading here and there, though, I’ve heard that the bot actually seems to “bomb” the anti-spam plugins until they are no longer able to filter the bad comments. I don’t use spam killers since this isn’t a popular blog; I just changed some code to make it look and work differently from a vanilla WordPress install. A good idea, if you don’t mind repeating it with each update, is to hack comments.php and wp-comments-post.php so the form fields have different names, and also to rename the latter file so it can’t be hit directly; you can even leave a hidden fake comment form with the default field names as chaff for the simplest bots.
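The decoy form idea boils down to one check on the server side: anything that arrives under the default field names came from the fake form, so it wasn’t typed by a person. The sketch below shows that logic in Python rather than in the blog’s PHP, purely as an illustration. The default names (author, email, url, comment) are the ones a stock WordPress comment form uses; the renamed fields and the function name are hypothetical.

```python
# Field names of a stock WordPress comment form, used here only by the decoy.
DEFAULT_FIELDS = ("author", "email", "url", "comment")

# Hypothetical replacement names used by the real, visible form.
RENAMED_FIELDS = ("who", "contact", "site", "message")


def looks_like_bot(form: dict) -> bool:
    """Return True when any decoy field carries a value.

    Human visitors never see the hidden form, so only a bot that posts
    blindly with the default names would fill these in.
    """
    return any(form.get(name) for name in DEFAULT_FIELDS)


bot_post = {"author": "xyz", "email": "xyz@mail.com", "comment": "Impressive webpage!"}
human_post = {"who": "Ana", "contact": "ana@example.org", "message": "Nice photo."}

print(looks_like_bot(bot_post))
print(looks_like_bot(human_post))
```

A bot that actually parses the page and uses the renamed fields would get past this check, so it only works as chaff against the simplest ones.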

I’d also suggest filling in the moderated keywords box. I’m not posting my whole list here, since some of the words are the last thing I’d like to be indexed for, but to stop praiser-bots don’t forget to include “impressive”, “information”, “informative” or “webpage!” Real readers use those words very rarely (or maybe it’s just my blog that isn’t very impressive at all).
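What a moderation keyword list does can be sketched like this. It is a simplified Python illustration, not WordPress code: the function name and the dictionary layout are hypothetical, and the assumption is that each keyword is matched as a case-insensitive substring across the comment’s author, email, URL and text.

```python
# The pattern and keywords mentioned in this post.
MODERATION_KEYS = ["mail.com", "impressive", "information", "informative", "webpage!"]

FIELDS = ("author", "email", "url", "content")


def held_for_moderation(comment: dict, keys=MODERATION_KEYS) -> bool:
    """Return True if any keyword occurs anywhere in the checked fields."""
    haystack = " ".join(comment.get(field, "") for field in FIELDS).lower()
    return any(key.lower() in haystack for key in keys)


spam = {"author": "bot", "email": "qwe@mail.com", "content": "Very Informative webpage!"}
ham = {"author": "Ana", "email": "ana@example.org", "content": "Nice photo."}

print(held_for_moderation(spam))
print(held_for_moderation(ham))
```

Under that substring assumption, a rule such as “mail.com” would also hold comments from addresses that merely end in it, such as gmail.com, so those would need manual approval.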

These tips are very, very far from being a bulletproof way to stop spam. Any human specifically willing to spam you is going to spam you. But bots are rarely designed to hit uncommon installs, and they get lower returns from hitting pages that show signs of actively fighting spam, because anything that gets through is likely to be deleted manually. A bot that is too clever could even annoy the real net gurus out there… I don’t think spammers would be very happy to discover that a feral bot had made some net honchos take a personal interest in tracking them down.

In case you wonder, I’m not really into that kind of techy knowledge (even my PHP is clumsy), but I learned about bot strategies and code evolution from artificial life simulations. You can only wonder at how often natural diversity beats intelligent design. If you make your CMS install unique in its own way, it’s very likely to be skipped by most regular bots.

Hunger in the New Century

The fall of real estate investments is nothing compared with the shadow of hunger that follows the rise in food prices, but there may be links between them. Here in eastern Spain, local food production doesn’t look as vigorous as before: the promises of residential projects, the diversion of water to urbanized areas and the lack of the dedicated labor we had some decades ago have left many fields abandoned.

Then there are the diversified speculative businesses that targeted bioproducts suitable for processing into fuel. Their business is speculation, and that’s not going to go away with the fall of real estate; they will hold on to the food market until it has to be pried from their cold, dead hands. While governments are patching and repairing the damage done to holders of home loans, they’ll feed the food bubble until some unsustainable damage is done. This means that the economic model that created both the real estate and the food price problems is still kicking around, and there’s little to achieve by running behind it with a mop funded by public taxes.

There’s no excuse for an economic crisis right in the 21st century. They should know better by now. A failure leading to a shortage of basic necessities would be a serious global failure, and we can remember what that led to in the past.

How people talk about the Iraq war

The philologist’s idea of amusement goes like this: playing with web searches and looking at which words people use (Whooha, carnival of excitement!).

Do you want to know how people talk about the Iraq war? Don’t bother with polls and brainy dudes, Google has it. See, let’s ask him (or her?):

The “freedom speech” seems to be used more often than negative words by both supporters and detractors. I see some “Is Iraq a success?” but fewer “Is Iraq a failure?” Maybe the “freedom speech” is too strong a marketing campaign in itself not to be “bought”, even by those with critical opinions?

Qwertyuiopasdfghjklzxcvbnm

You cannot help but have an issue with the Internet when that title word has about ten thousand more page results than you do.

But I think it can be a good measurement of self-relevance. The net is a lot of noise and a few meaningful things. If you want to rise above the white noise, you must at least beat the most irrational searches. At Level 1 of progress, you should be able to reach beyond the conscious, deliberate typing of the whole series of letters on an English qwerty keyboard. At Level 2 (now we get into serious business), you should be ahead of a full qwertyuiop string. Level 3 is only available to the most advanced net hackers. There, you must be more relevant than the asdf keyword of doom! Only then will you be sure that your word stands above the purest form of idle typing.

You are a rational human. That means that you should be able to produce more relevant things than what a monkey could do randomly banging a keyboard. Or a bored blogger, in any case.

Playa

Valencia Beach
A veces es difícil encerrarse en casa a escribir.
Sometimes it’s hard to shut yourself in at home to write.

Microformats on Technorati

Just a quick post.

Technorati (here’s my Technorati Profile) is implementing Microformats.

And what’s that? Well, it’s a way to embed machine-collectable data within web pages using just class names and proper rel attributes. That way, bots can easily identify your tags, profile or license and offer properly classified searches and directories.
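To give an idea of how little a bot needs in order to collect that data, here is a small Python sketch that pulls tags out of links marked with rel="tag", taking the last path segment of the link as the tag. The class name, the helper function and the sample HTML are made up for the example; only the standard library’s html.parser module is used.

```python
from html.parser import HTMLParser


class RelTagParser(HTMLParser):
    """Collect tag names from <a rel="tag" href="..."> links."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").split()
        href = attributes.get("href")
        if "tag" in rel_values and href:
            # The tag is taken from the last segment of the link's path.
            self.tags.append(href.rstrip("/").rsplit("/", 1)[-1])


def extract_tags(html: str) -> list:
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags


# Hypothetical snippet of a blog post footer
sample = (
    '<a href="http://example.com/tag/spam" rel="tag">spam</a> '
    '<a href="http://example.com/tag/thunderbird/" rel="tag">thunderbird</a> '
    '<a href="http://example.com/about">about</a>'
)
print(extract_tags(sample))
```

The plain link without a rel attribute is ignored, which is the whole point: the markup itself tells the bot which links are tags.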

There is, of course, the usual catch. The Microformats beta specification allows for creating profile cards with all sorts of personal data. You know, there will always be something intrinsically wrong with publishing your telephone number in a publicly accessible, highly bot-friendly format.

The blogosphere is dead, long live the blogosphere

No, really, I’m just anticipating events. Someday, somewhere, a big blog will suddenly crash, and some ‘smart’ dude will say: “this day the blogosphere died”. Though blogs will continue to exist, whatever one meant by “blogosphere” won’t be there anymore.

But then, what if it has happened already? Three years ago, when I thought “blogosphere” I thought, because that’s how it was, of a decentralized, heavily interlinked community of self-aware individuals (and sorry for the IT slang). Has something changed radically in either concept or scale in these last three years?

Yes. Advertising. And blogs as low-quality, high-spread commercial platforms.

Advertising networks, and with them the Search Engine Circus competition, have moved the network from the blog-to-blog sphere to the blog-to-ads-network sphere, and have changed the core of the interlinking from «let’s form a super-cool bloggers microsphere» to «let’s form a cartel to powergame the Google slot machine». But the worst part of the link-rush fever, as always, is paid by the small imitators lured by the big promises. The once natural and balanced ecosystem of blog links is now spammed to death by the massive creation of artificial relations between blogs. You no longer know whether something is being linked because it’s worth it, because the blogger wants to place some keywords, or because the blogger is being paid for the link (which is not evil in itself) while pretending that it reflects sincere personal taste (which is blatantly shameless). And we cannot forget, either, that the bulk of the chatty, sociable people have moved away from regular blogs to the meat market of look-at-me web profiles, so they are no longer greasing the interstices of the read-what-I-think sphere.

So, is the blogosphere of the good old days dead already? Do we bloggers live on tiny islands after the global-web-warming apocalypse? Or is that still waiting to happen?

Books in the corner

Siempre hay una necesidad imperiosa de espacio en la habitación de un lector.
There’s always a pressing need for space in a reader’s room.
