[vox-tech] Is there better spam detection?

Wed Oct 19 14:30:30 PDT 2011

On Wed, Oct 19, 2011 at 12:48:20PM -0700, Alex Mandel wrote:
> On 10/19/2011 12:07 PM, Brian Lavender wrote:
> > It seems that spammers have gotten keen to the Bag O Words Bayes analysis and
> > now pump out some unrelated paragraph before they put in their spam.
> >
> > Is someone working on a meaningful spam analysis detection that I can use?
> >
> > brian
> 
> I don't have a solution, but that's not new news. I've seen emails with 
> quotes from random famous books in them for years. Thunderbird's learning
> Junk filter seems to do fine with most of that, as does Gmail.
> 
> Is this for a desktop app, or for a mail server you run?
> 

Mail server I run. I was looking at some of Ken's papers on sentiment
analysis, working with ANTLR, and thinking about hidden Markov
models. It seems that someone has got to be working on an analyzer that
derives context. Picture a random quote from a famous book, and then comes
"Would you like a loan?". The two don't follow each other. I would think
that if you could gather some meaning out of a paragraph, put some sort
of measurement on it, do the same for the next paragraph, and apply Bayes'
rule, you could nail spam a lot better. The difficult part seems to be
putting meaning on a group of words, but that seems to be what Ken is
working on.
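
Here is a rough sketch of that idea in Python, not a working filter. It
stands in for "meaning" with a plain bag of words per paragraph, measures
how well neighbouring paragraphs follow each other with cosine similarity,
and then applies Bayes' rule to the observation "this message contains a
topic break". Everything named here is hypothetical: the function names,
the 0.1 threshold, and the likelihoods 0.8 and 0.2 are made-up values for
illustration. A real filter would have to estimate them from labelled mail
and would need a far better measure of meaning than word overlap.

```python
import math
import re
from collections import Counter


def tokenize(paragraph):
    """Lowercase a paragraph and return a bag of words (word -> count)."""
    return Counter(re.findall(r"[a-z']+", paragraph.lower()))


def cosine_similarity(bag_a, bag_b):
    """Cosine similarity between two bags of words, in [0, 1]."""
    dot = sum(bag_a[w] * bag_b[w] for w in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def min_adjacent_similarity(message):
    """Lowest similarity between any two neighbouring paragraphs."""
    paragraphs = [p for p in re.split(r"\n\s*\n", message) if p.strip()]
    bags = [tokenize(p) for p in paragraphs]
    if len(bags) < 2:
        return 1.0  # a single paragraph has no neighbour to break from
    return min(cosine_similarity(a, b) for a, b in zip(bags, bags[1:]))


def posterior_spam(prior_spam, p_break_given_spam, p_break_given_ham, has_break):
    """Bayes' rule: update P(spam) on whether a topic break was observed."""
    if has_break:
        like_spam, like_ham = p_break_given_spam, p_break_given_ham
    else:
        like_spam, like_ham = 1 - p_break_given_spam, 1 - p_break_given_ham
    evidence = like_spam * prior_spam + like_ham * (1 - prior_spam)
    return like_spam * prior_spam / evidence


# Book quote followed by a pitch: the two paragraphs share no words.
message = (
    "It was the best of times, it was the worst of times.\n\n"
    "Would you like a loan? Low rates today."
)
has_break = min_adjacent_similarity(message) < 0.1  # hypothetical threshold
print(posterior_spam(0.5, 0.8, 0.2, has_break))  # prints 0.8
```

Word overlap is a crude stand-in: legitimate mail with short, unrelated
paragraphs would also score as a "break", and common words like "the"
inflate the similarity of paragraphs that have nothing to do with each
other. That gap is exactly the hard part mentioned above.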

Did you know that Bayes discovered his theorem when he was trying to
prove the existence of God?

brian
-- 
Brian Lavender
http://www.brie.com/brian/

"There are two ways of constructing a software design. One way is to
make it so simple that there are obviously no deficiencies. And the other
way is to make it so complicated that there are no obvious deficiencies."

Professor C. A. R. Hoare
The 1980 Turing award lecture