Optimal Queueing Strategies for E-mail Virus Scanning

Abstract

Optimal queueing strategies are derived for an email virus scanning system consisting of multiple queues of varying message size limits running in parallel. The general VirScan system is described, a queueing model is defined, and expressions are derived for the overall average time that messages spend waiting in the queues. The distributions used for message sizes are based on statistics from real email servers. Queue size limits are then determined numerically to minimize the wait time under two different queueing strategies for several email server examples.

Introduction

Email predates the Internet, and was one of the first services to take advantage of the Internet. In the early 1990s, commercial vendors started to realize the potential of Internet email, and began to move from proprietary message formats and closed systems, to standardized message encoding schemes and servers with integrated Internet SMTP (Simple Mail Transfer Protocol) capabilities [1].

Today, graphical desktop email programs such as Netscape Communicator and Microsoft Outlook are staples of Internet users' desktops. These easy to use but powerful programs have the ability to send and receive plain text messages, as well as messages containing binary attachments, such as word processing documents, spreadsheets, executable scripts, hyperlink tags to web sites, and program binaries.

Vendors have given insufficient attention to risks involved with the handling of potentially malicious or disruptive message content. Some of these risks have been documented in a series of Carnegie Mellon University CERT Coordination Center advisories [2,3,4,5,6,7,8,9]. The ease of pointing and clicking has, for many end users, erased the distinction between launching applications and accessing data with applications.
Users no longer need to distinguish: they just point at what they want to access and click with the mouse, and the operating system determines which application to launch. To further complicate matters, common spreadsheets and word processors have embedded scripting capabilities in their document file formats, so even supposed documents can contain hidden executable program instructions [10]. Even technically advanced users are easily tricked into launching executable attachments received via email. Furthermore, bugs in email software can force execution of message attachments, even without any user intervention [6].

Internet email has become a significant transport mechanism for computer viruses, as well as for junk messages that waste time and resources. Email is a primary means of business communication, but is simultaneously a huge liability. The Internet has become both a necessary business tool and a hostile working environment.

System and network administrators must protect their users and networks from the results of email-propagated virus attacks. They also need to prevent large volumes of unsolicited junk email from consuming their system resources. Desktop protection is helpful, but is hard to update and support reliably for large numbers of users simultaneously. There is a clear demand for fast, reliable, centralized, server-side email filtering, such as the VirScan system described here.

Email Virus Scanning

In the VirScan email virus scanning system [11], two sendmail [12] daemons are run. One daemon listens on the SMTP port and stores incoming messages in the in queue; the other daemon scans the out queue to deliver messages.
A VirScan feed process scans the in queue and distributes messages among N feed queues, feed1, ..., feedN, depending on the message size and queue load. Each feed queue except for feedN has a limit on message size. In the example setup shown in Figure 1, messages over 50 KB in size will only be placed in queues feed4 and feed5. Each feed queue load is measured by the time at which the queue is expected to be empty, and the feed process places each message in the least-loaded queue that can accept the message size. This helps smaller messages get processed faster.

N VirScan work processes are run, one for each feed queue. We assume that the system has at least N CPUs so that the work processes are run in parallel. Alternatively, each work process could run on a separate single-CPU system, with the feed process run on a separate load-balancing front-end system.

Each work process scans messages for viruses, bad file name extensions in attachments, and spam. Bad messages are moved to the virus, ext, or spam quarantine queues. Clean messages are moved to the out queue. If an error occurs while processing a message, the message is moved to the error queue.
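
The feed rule described above (choose, among the queues whose size limit admits the message, the one expected to be empty soonest) can be illustrated with a minimal Python sketch. It is not the VirScan implementation. The names `FeedQueue`, `choose_queue`, and `make_example_queues` are hypothetical, as are the size limits for feed1 through feed4, which are chosen only to be consistent with the statement that messages over 50 KB go to feed4 or feed5. The sketch also assumes the caller supplies an estimated scan time for each message; how VirScan estimates that time is not stated here.

```python
from dataclasses import dataclass
from typing import List, Optional

KB = 1024


@dataclass
class FeedQueue:
    name: str
    size_limit: Optional[int]  # largest accepted message in bytes; None = no limit (feedN)
    empty_at: float = 0.0      # time at which the queue is expected to be empty


def make_example_queues() -> List[FeedQueue]:
    # Hypothetical limits: only "over 50 KB goes to feed4 or feed5" comes from the text.
    return [
        FeedQueue("feed1", 10 * KB),
        FeedQueue("feed2", 20 * KB),
        FeedQueue("feed3", 50 * KB),
        FeedQueue("feed4", 1024 * KB),
        FeedQueue("feed5", None),
    ]


def choose_queue(queues: List[FeedQueue], size: int, now: float,
                 service_time: float) -> FeedQueue:
    """Place a message of `size` bytes in the least-loaded eligible queue.

    `service_time` is the caller's estimate of the scan time (an assumption
    of this sketch). Ties go to the first eligible queue in list order.
    """
    eligible = [q for q in queues
                if q.size_limit is None or size <= q.size_limit]
    if not eligible:
        raise ValueError("no feed queue accepts a message of this size")
    # A queue that has already drained is treated as empty as of `now`.
    best = min(eligible, key=lambda q: max(q.empty_at, now))
    best.empty_at = max(best.empty_at, now) + service_time
    return best


if __name__ == "__main__":
    queues = make_example_queues()
    for size in (60 * KB, 60 * KB, 1 * KB):
        q = choose_queue(queues, size, now=0.0, service_time=1.0)
        print(f"{size // KB} KB message placed in {q.name}")
```

Because only feed4 and feed5 accept the two 60 KB messages in this example, the small-limit queues stay free for the 1 KB message, which is the effect the size limits are meant to produce.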