Wednesday, January 11, 2012

Anti-spam techniques--wiki

These techniques serve as a reference for working out a method to distinguish relevant from non-relevant content. Source: http://en.wikipedia.org/wiki/Anti-spam_techniques

1.1.
Detecting spam based on the content of the e-mail, either by detecting keywords such as "viagra" or by statistical means (content or non-content based), is very popular. However, content alone does not determine whether the e-mail was unsolicited or sent in bulk, the two key features of spam. Non-content-based statistical methods can help lower false positives because they rely on statistical properties of the mail rather than blocking on content or keywords. --- keyword detection and behavior pattern analysis;
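As a rough illustration of the keyword approach described above, here is a minimal sketch. The keyword list, the function names and the `min_hits` parameter are all made up for this note; a real filter would use a far larger, tuned list.

```python
import re

# Hypothetical keyword list, chosen only for illustration.
SPAM_KEYWORDS = {"viagra", "lottery", "winner"}

def keyword_hits(text, keywords=SPAM_KEYWORDS):
    """Return the sorted keywords that occur as whole words in the text."""
    # Matching whole words avoids false hits on substrings of longer words.
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(words & set(keywords))

def looks_like_spam(text, min_hits=1):
    """Flag the text when at least min_hits distinct keywords are present."""
    return len(keyword_hits(text)) >= min_hits
```

Note that this only looks at content, so it says nothing about whether the message was unsolicited or sent in bulk.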
1.2. 
Many filtering systems take advantage of machine learning techniques, which improve their accuracy over manual methods. --- keyword detection by machine learning, the objective for the next stage;
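One common machine learning technique for this kind of filtering is a naive Bayes classifier over word counts. The sketch below is only an assumption about how such a filter could look: the class name, the "spam"/"ham" labels and the add-one smoothing are my own choices, not something taken from the article.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class NaiveBayesFilter:
    """Toy naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labelled example; label must be 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens, label):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Smoothed class prior, so an untrained class never yields log(0).
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        for tok in tokens:
            count = self.word_counts[label][tok]
            # The extra +1 in the denominator reserves mass for unseen words.
            score += math.log((count + 1) / (total_words + len(vocab) + 1))
        return score

    def classify(self, text):
        """Return the label with the higher log score."""
        tokens = tokenize(text)
        return max(("spam", "ham"), key=lambda label: self._log_score(tokens, label))
```

The accuracy of such a filter depends entirely on the training mail it has seen, which is why it is usually tuned per user or per site.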
1.3.
Other, more advanced techniques analyze message patterns in real time to detect spam-like behavior and then compare it to global databases of spam. This pattern detection approach was pioneered by Commtouch. The method is more automated than most because the service provider, rather than the system administrator, maintains the comparative spam database. --- real-time analysis of behavior patterns based on a database; could be part of our further research;
1.4.
Enforcing RFC standards and checksum-based filtering. --- differentiation by format and behavior;
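Checksum-based filtering can be sketched as follows: reduce each message body to a checksum and compare it against checksums of messages already reported as spam. The normalization step, the choice of SHA-256 and the `ChecksumFilter` class are assumptions made for this example, not a description of any particular product.

```python
import hashlib
import re

def message_checksum(body):
    """Hash a normalized body so that case and whitespace changes are ignored."""
    normalized = re.sub(r"\s+", " ", body.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class ChecksumFilter:
    """Keeps checksums of reported spam and checks new messages against them."""

    def __init__(self):
        self.known_spam = set()

    def report_spam(self, body):
        self.known_spam.add(message_checksum(body))

    def is_known_spam(self, body):
        return message_checksum(body) in self.known_spam
```

An exact hash like this is defeated by any change to the wording of the message, so it only catches near-identical bulk copies.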




2.1.
The most popular DNSBLs (DNS Blacklists) are lists of IP addresses of known spammers, known open relays, known proxy servers, and compromised “zombie” spammers. --- list of sites;
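A DNSBL is conventionally queried by reversing the octets of the IPv4 address and appending the blacklist's zone name; an answer means the address is listed. The sketch below follows that convention. The zone `dnsbl.example.org` in the usage note is a placeholder, not a real blacklist, and `is_listed` needs network access.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone):
    """Return True if the zone resolves the query name (requires network access).

    Any resolution failure is treated as "not listed" in this sketch.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` builds the name `99.2.0.192.dnsbl.example.org`.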
2.2.
Other lists record sites authorized to send email and are (sometimes) used to determine the reputation of those sites. --- site reputation and article view counts as parameter weights in relevance ranking;


3.1

Some systems let individual users have some control over this balance (between missed spam and wrongly blocked legitimate mail) by setting "spam score" limits, etc. The open source programs SpamAssassin and Policyd-weight use some or all of the various tests for spam and assign a numerical score to each test. Each message is scanned for these patterns, and the applicable scores are tallied up. If the total is above a fixed value, the message is rejected or flagged as spam. --- weighted scoring;
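The weighted scoring idea can be sketched in a few lines. The rule names, weights and threshold below are invented for illustration and are not the actual tests or scores used by SpamAssassin or Policyd-weight.

```python
def _upper_ratio(msg):
    """Fraction of alphabetic characters that are uppercase."""
    letters = [c for c in msg if c.isalpha()]
    return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0

# Hypothetical rules: (name, weight, test function).
RULES = [
    ("mentions_viagra", 3.0, lambda msg: "viagra" in msg.lower()),
    ("many_exclamations", 1.5, lambda msg: msg.count("!") >= 3),
    ("mostly_uppercase", 2.0, lambda msg: _upper_ratio(msg) > 0.6),
]

def spam_score(msg, rules=RULES):
    """Return the total score and the names of the rules that matched."""
    hits = [(name, weight) for name, weight, test in rules if test(msg)]
    return sum(weight for _, weight in hits), [name for name, _ in hits]

def is_spam(msg, threshold=5.0):
    """Flag the message when its total score is above the threshold."""
    score, _ = spam_score(msg)
    return score > threshold
```

Letting each user choose their own `threshold` is what gives them control over how aggressive the filter is. The same pattern could carry over to relevance ranking, with each parameter contributing a weighted score.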







