Automatic survey of inbound mail (MX) servers in academic domains in the UK

by Ramón Casero Cañas on 6 June 2007

Archived: this page has been archived and its content will not be updated.

Introduction

This report presents the results and methodology of a semi-automated survey of inbound mail (MX) servers in all academic domains in the UK (ac.uk) with a focus on Further and Higher Education (FE and HE). The survey was performed in June 2007.

A mail server or Mail Transfer Agent (MTA) is a software system that transfers email messages from one computer to another, so all MTAs act both as a receiver and a sender.

MTAs are an interesting focus of study for OSS Watch because email is ubiquitous in both FE and HE, but also in commercial companies and public administration. It is therefore a heavily used tool that allows the study of academic institutions with respect to the rest of the world.

MTAs can be used as inbound or outgoing servers, or both. Inbound servers receive email sent to the domain, and outgoing servers deal with email sent from the domain.

This study only takes into account the former, as they can be queried in an easily automated way in order to conduct extensive studies. This opens many possibilities: comparison of automated and traditional surveys (e.g. emailing mail managers), easy update of the results to study evolution over time, larger samples, etc.

The results of this survey show that external providers and academic institutions in general clearly prefer open source MTAs over proprietary alternatives, with the exception of FE institutions, which show a moderate preference for proprietary systems.

Methodology

Overview and limitations

The ideal goal of this survey would be to evaluate which MTAs are used in academic institutions in the UK in an automated way. Due to a number of factors the actual goal is to evaluate which front-end inbound MTAs are used in ac.uk institutions in a semi-automated way.

The ideal goal is limited to the actual goal for several reasons:

Only inbound servers
Inbound and outgoing servers need to be treated separately and using different approaches; limited resources for this study meant that we chose to deal only with inbound servers. See section “Identification of email servers” for details
Only front-end servers
Inbound email servers can be front-end (visible to the external world) or internal (hidden from the external world). This study deals only with front-end servers, as to the best of our knowledge there is no automated way to survey internal servers. See section “Identification of email servers” for details
ac.uk institutions instead of academic institutions
Domains in ac.uk correspond to a heterogeneous mixture of organizations and projects, so this study uses the term institution in a loose sense. See sections “Acquisition of lists of institutions” and “Mapping between domains and institutions” for details
Semi-automated
Although scripts have been written to automate data acquisition and compilation, human intervention is still necessary to keep the list of known MTA systems updated, run the scripts and publish the results

There are further limitations to this study:

Load balancing
We treat load balancing schemes as single servers. See section “Identification of email servers” for details
Security through obscurity
Some mail managers choose to hide the information about the MTA
Security through active misleading
Some mail managers may doctor the information provided by the MTA so that it identifies itself as another system. We have no figures to quantify the extent of this practice

Acquisition of lists of institutions

OSS Watch national surveys have studied the usage of open source software in FE and HE institutions. Colleges, universities, etc. are readily classified as academic institutions in the sense that they are organizations situated in complexes of buildings for the promotion of learning and research.

  • Further Education (FE): Institutions for students of age 14-19 to prepare for GCSE and A level examinations (e.g. sixth form colleges) or vocational training (e.g. vocational colleges), and for adult learning
  • Higher Education (HE): Institutions that award academic degrees, typically universities and vocational universities, for students of age 18 onwards

When this survey was conducted, HERO, the official gateway to universities, colleges and research organisations in the UK, was able to provide comprehensive lists of FE institutions and HE institutions. (As of June 2009 the HERO.ac.uk website is closed. The British Council provides lists of educational institutions too, but there is no perfect correspondence with HERO’s lists. It would also be possible to obtain lists of organisations connected to education and research from JANET, the UK’s education and research network.) The HERO lists can easily be parsed and compiled into lists of domains, one domain per institution, e.g.

    abcol.ac.uk
    abingdon-witney.ac.uk
    accross.ac.uk
    fife.ac.uk
    altoncollege.ac.uk
    amersham.ac.uk
    …

However, 3641 ac.uk domains were registered at the time of this study, of which only 659 (18.1%) correspond to HE and FE institutions.

This suggests that in order to obtain a more realistic picture of the usage of MTAs in education in the UK, other types of organizations and projects (not necessarily institutions) need to be considered, the diversity of which is illustrated in the list below.

  • academies (e.g. Academy of Medical Sciences), national institutes (e.g. National Institute for Environmental eScience)
  • research centres (e.g. Accelerator Science and Technology Centre, Centre for Contemporary British History)
  • research projects (e.g. Ancient Cyprus Webproject, Collaborative Computational Projects)
  • research councils (e.g. Science and Technology Facilities Council)
  • national education programmes (e.g. Aimhigher)
  • mission training (e.g. All Nations Christian College)
  • associations (e.g. Association for the Study of German Politics), unions (e.g. British Ornithologists’ Union), societies (e.g. Computers and the History of Art)
  • JISC projects (e.g. Authenticated Networked Guided Environment for Learning), other projects (e.g. DfEE Project)
  • websites for students with disabilities (e.g. Being Inclusive in the Creative and Performing Arts)
  • schools and faculties of FE/HE centres (e.g. Birmingham School of Acting)
  • combined clinical and research units (e.g. Blood Pressure Unit at St. George’s Hospital Medical School)
  • consortia of colleges to collaborate in administrative matters (e.g. The Bloomsbury Consortium)
  • consortia of HE centres (e.g. Consortium of Arts & Design Institutions in Southern England)
  • police training centres (e.g. Bramshill Police Training College)
  • military training centres (e.g. Britannia Royal Naval College)
  • associations (e.g. British Association for International & Comparative Education)
  • non-profit organisations for research (e.g. British Atherosclerosis Society)
  • institutes (e.g. British Institute at Ankara, Institute of Rural Health)
  • research data centres (e.g. The British Isles GPS archive Facility, The COntinuous REcording System)
  • portals (e.g. Bulletin Board For Libraries)
  • information and advisory services (e.g. OSS Watch)
  • committees or councils of schools/departments (e.g. Committee of Heads of University Geoscience Departments, Medical Schools Council)
  • national committees (e.g. JISC)
  • digital archives (e.g. The Corpus of Romanesque Sculpture in Britain and Ireland)
  • survey services (e.g. Destinations of Leavers from Higher Education (DLHE) Survey)
  • administrative and support bodies for HE (e.g. Higher Education External Relations Association)
  • infrastructure (e.g. JANET)
  • museums (e.g. Leicester City Museums)

Nonetheless, for simplicity we use the term institution in a loose sense so that ac.uk domains can be classified into 3 groups: FE, HE and Other. That is, we denote both actual academic institutions and activities or organisations closely related to education as institutions.

Mapping between domains and institutions

It would greatly simplify this study if we could assume that there is a one-to-one mapping between domains and institutions, but this is not the case:

  • Some institutions use domain redundancy, e.g. both ox.ac.uk and oxford.ac.uk are University of Oxford domains
  • Some domains are registered but not being used, e.g. aac.ac.uk
  • Some institutions are split between different centres, each with one domain, e.g. The White Rose Centre for Enterprise Teaching and Learning has registered domains sheffieldcetle.ac.uk and leedscetle.ac.uk, for their centres at the Universities of Sheffield and Leeds, respectively
  • Some schools, faculties and organizations have their own domain, but they are in fact part of a larger institution, e.g. bristollawschool.ac.uk is registered to the Faculty of Law at the University of the West of England

In this study we have decided to identify institutions and domain redundancy by the entry “Registered for” in the WHOIS database (see Appendix A). This method can differentiate institutions with domains that use the same mail servers and authoritative Domain Name System (DNS) server, e.g.

  • OSS Watch: oss-watch.ac.uk registered for “JISC Open Source Advisory Service”
  • University of Cambridge: cam.ac.uk registered for “The University of Cambridge”

It can also detect redundancy between domains, although some refinement is necessary, as the WHOIS entries are entered by hand and are prone to small differences or typos, e.g.

  • ed.ac.uk registered for “The University of Edinburgh”
  • edinburgh.ac.uk registered for “University of Edinburgh”
  • bcftcs.ac.uk registered for “Birmingham College of Food Tourism & Creative Studies”
  • cof.ac.uk registered for “Birmingham College of Food - Tourism and Creative Studies”

The steps followed to map domains to institutions are:

  • The WHOIS database is queried for each domain
  • The “Registered For” fields are recorded as institution identifiers
  • Redundancies are removed to build a list of unique institution identifiers
  • Each domain is tagged with a unique institution identifier

Using this approach 2914 institutions for 3641 domains were identified.
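The report does not say which refinement was applied to the “Registered For” strings. The following Python sketch shows one possible approach, under stated assumptions: the WHOIS answers have already been collected into a dictionary (the names `normalise`, `group_domains` and `registered_for_by_domain` are hypothetical), and small differences such as a leading “The”, “&” versus “and”, and punctuation are the only variations to absorb.

```python
import re
from collections import defaultdict


def normalise(registered_for):
    """Reduce a "Registered For" string to a comparison key (assumed rules)."""
    name = registered_for.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]+", " ", name)  # drop punctuation such as "-"
    words = name.split()
    if words and words[0] == "the":  # ignore a leading article
        words = words[1:]
    return " ".join(words)


def group_domains(registered_for_by_domain):
    """Map each institution key to the sorted list of domains registered for it."""
    institutions = defaultdict(list)
    for domain, name in sorted(registered_for_by_domain.items()):
        institutions[normalise(name)].append(domain)
    return dict(institutions)


example = {
    "ed.ac.uk": "The University of Edinburgh",
    "edinburgh.ac.uk": "University of Edinburgh",
    "bcftcs.ac.uk": "Birmingham College of Food Tourism & Creative Studies",
    "cof.ac.uk": "Birmingham College of Food - Tourism and Creative Studies",
}
groups = group_domains(example)
```

With the four example entries quoted above, the sketch yields two institutions with two domains each. Rules this simple would not catch genuine typos, so a manual review of the resulting list would still be needed.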

An incorrect method of mapping between domains and institutions

An idea for the mapping that we discarded was to use WWW servers, under the assumption that each institution will have a website, and that all redundant domains will be aliases of the same homepage.

This is unfortunately not the case. For example, the University of Oxford web servers have different IP addresses:

    $ host www.ox.ac.uk
    www.ox.ac.uk has address 163.1.0.45
    $ host www.oxford.ac.uk
    www.oxford.ac.uk has address 163.1.0.90

OSS Watch is a separate institution, but its web server shares the same IP address save for the last byte:

    $ host www.oss-watch.ac.uk
    www.oss-watch.ac.uk has address 163.1.0.145

Identification of email servers

A list of front-end inbound email servers associated with a certain domain can be obtained with a DNS query (see Appendix B). This list can be empty, or have 1 or more entries depending on how many servers are running.

The list assigns a priority to each server (low priority servers should be contacted only if higher priority servers are not responding), but in this study we make no distinction between them.

It is also worth noting that some institutions may use load balancing schemes. Load balancing is a special case where several servers are running, but they all appear as a single machine to the outside world. In this study, load balancing is treated as a single MTA.

Domain redundancy is another hurdle for a study like this. In some cases, several domains of an institution are valid email domains, e.g. both user@ed.ac.uk and user@edinburgh.ac.uk are valid email addresses for the University of Edinburgh. But sometimes only one domain is valid, e.g. addresses of the form user@ox.ac.uk are valid for the University of Oxford, but those of the form user@oxford.ac.uk are not.

Another difficulty is found with outsourced email. Institutions may run their own mail servers, outsource them to another institution or company, or a combination of both. For example, OSS Watch email is handled by the University of Oxford, even though they can be considered separate institutions. As another example, the Architectural Association School of Architecture’s primary server is internal, while 2 secondary servers are outsourced to a private company.

Following the considerations above, in this study email servers associated with an institution are found in the following way:

  • All domains belonging to an institution are grouped
  • A DNS query is run per domain to find all email servers
  • Redundancies are removed, i.e. each server is counted only once per institution
  • Email servers are tagged as either belonging to a private company (those with org.uk, co.uk, org, net, gov.uk, com and biz domains) or an academic institution
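The grouping and tagging steps above can be sketched in Python as follows. This is an illustration only, not the scripts used in the study: the function names and the `mx_by_domain` dictionary are hypothetical, and the tagging rule is simplified to "inside ac.uk" versus "anything else" rather than checking the individual suffixes listed above.

```python
def is_academic(hostname):
    """Assumed rule: a server is academic if its host name is under ac.uk."""
    return hostname.rstrip(".").lower().endswith(".ac.uk")


def servers_for_institution(mx_by_domain, domains):
    """Collect the MX hosts of all domains of one institution.

    Each server is counted once per institution, however many of the
    institution's domains list it, and is tagged by where it is hosted.
    """
    servers = set()
    for domain in domains:
        for host in mx_by_domain.get(domain, []):
            servers.add(host.rstrip(".").lower())
    return {
        host: ("academic" if is_academic(host) else "external")
        for host in sorted(servers)
    }


# Hypothetical input: MX host names already obtained from the DNS, per domain.
mx_by_domain = {
    "aaschool.ac.uk": [
        "relay2.red.net.",
        "relay3.red.net.",
        "smtp.aaschool.ac.uk.",
    ],
}
tagged = servers_for_institution(mx_by_domain, ["aaschool.ac.uk"])
```

For the Architectural Association example, this tags smtp.aaschool.ac.uk as academic and the two red.net relays as external.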

Identification of MTAs

In most cases, emails are transmitted using the SMTP or Simple Mail Transfer Protocol (RFC 2821).

Information about the MTA running on an email server can be obtained by opening a telnet session to port 25. This process can be automated to query large numbers of servers (see Appendix C).

Information obtained from the telnet session falls in one of 3 categories:

  • open: the email server responds giving the name of the MTA, and usually the version number, e.g. Exim 4.62. Mail administrators who use the “open” category rely on the doctrine of security by design
  • obscured: the mail manager has removed the name of the MTA from the telnet output, replaced it by a nonstandard identifier (e.g. “RHCMarshal” instead of “MailMarshal”), or replaced the welcome banner by a string of asterisks. These are examples of security through obscurity
  • misleading: the mail manager has doctored the output of the telnet session so that it looks as if another MTA is running. This is an active form of security through obscurity

The steps followed to identify MTAs are

  • A telnet 25 session is attempted for each email server
  • The welcome banner is recorded for successful connections
  • An SMTP HELP command is sent
  • The output is recorded
  • The telnet session is closed
  • The collated output of the telnet session is matched to a list of known MTA systems to look for matches
  • The email server is tagged as running an unknown or known MTA
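The matching step can be illustrated with a short Python sketch. The list of names below is a placeholder built from MTAs mentioned in this report, not the list actually used in the study, and the case-insensitive substring match is an assumption. Systems whose banner does not contain the product name would need their own patterns.

```python
# Placeholder list of known MTA names (assumption, not the study's list).
KNOWN_MTAS = ["Exim", "Postfix", "Sendmail", "qmail", "MMDF", "MDaemon", "Merak"]


def identify_mta(session_output):
    """Return the first known MTA name found in the collated session output."""
    text = session_output.lower()
    for name in KNOWN_MTAS:
        if name.lower() in text:
            return name
    return "unknown"


banner = "220 relay0.mail.ox.ac.uk ESMTP Exim 4.62 Mon, 16 Jul 2007 19:42:04 +0100"
result = identify_mta(banner)
```

An obscured banner, such as a string of asterisks, falls through to "unknown". A misleading banner cannot be detected this way, because it matches the wrong entry in the list.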

Results

Overall statistics

Table 1 shows that at the time of this study there are 3641 top domains in ac.uk, of which only a negligible number (0.3%) are not registered to any institution. Of the rest, those registered to FE (17.5%) are roughly twice as many as those registered to HE (9.7%).

According to the lists obtained from HERO, there are ~2.7 times more FE than HE institutions (454 FE, 166 HE). Table 2 shows that 441 (97.1%) and 163 (98.2%), respectively, were identified in the WHOIS record of one of the queried domains.

“Server names” refer to the host names provided by the MX record in the DNS. A small number of these names (3.7%) had no IP address and could not be queried. The rest were considered valid mail servers (“With IP address”).

FE and HE have an average of 1.4 and 2.2 valid mail servers per institution, respectively, while the remaining institutions have only 0.6.

These results are to be expected, with HE institutions being less numerous but having in general more IT resources than FE, and taking into account that the “Other” category includes many pseudo-institutions (see section “Acquisition of lists of institutions” for details).

Note that the number of institutions in FE, HE and Other add up to the total number of institutions. This is not the case with servers, because the same machine can be serving more than one institution.

Email by institution

Table 3 shows that almost all FE and HE institutions (> 98.0%) use email, while those in the “Other” category have a much lower rate (< 60.0%). This can be explained because many of the “Other” domains are projects, ventures, etc. hosted by FE and HE institutions. Typically, the “Other” domain is used for a website, but email is routed via the hosting institution.

“Unknown provider” means that the domain of the mail server name could not be identified. In this study, email providers for ac.uk institutions were found in one of these domains: ac.uk, org.uk, co.uk, .org, .net, .gov.uk, .com, .biz, .de, .dk, .net.uk, .uk, .edu. In this case, the 2 unknown providers correspond to MX records that instead of containing a server name contain its IP address.

79.1% of HE institutions run their own email servers, while 25.8% have outsourced email, either to another ac.uk institution or to a commercial company. “Overlap” means that some institutions have both in house servers and outsourced servers.

The overlap is 3 times more common in FE than in HE. In FE, fewer institutions run their own email servers (69.8%) and more outsource them (46.3%).

Interestingly, in the “Other” category most of the institutions outsource their email. Note that the 48.0% figure refers to the total number of institutions; if only institutions with email were considered, the outsourcing rate would be 80.4%.
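The two outsourcing figures for the “Other” category differ only in their denominator. As a hedged consistency check (the function name is hypothetical, and the result is derived from the two percentages quoted above rather than taken from the tables), the share of “Other” institutions that use email can be recovered from them:

```python
def implied_share(rate_of_total, rate_of_subset):
    """Share of the total (in %) that the subset must represent.

    rate_of_total:  % of all institutions that outsource email
    rate_of_subset: % of institutions with email that outsource it
    """
    return 100.0 * rate_of_total / rate_of_subset


share_with_email = implied_share(48.0, 80.4)
print(f"{share_with_email:.1f}%")
```

The result is a little under 60%, which agrees with the rate of email use reported for the “Other” category.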

Table 4 shows the split between ac.uk and other domains (typically commercial companies) for those institutions that outsource email. Between 71.4% (HE) and 78.4% (FE) of the outsourcing is to non-ac.uk domains.

We can also see from Table 4 that in general outsourcing is done to either ac.uk or non-ac.uk domains, but not both, as the overlap is very small.

Email by server

Tables 5 and 6 are in the same spirit as those in the previous section, but looking at usage of email by server instead of by institution, to get a more complete picture.

Table 5 shows that most servers providing ac.uk institutions with an email service are within ac.uk itself (76.7% and 89.8% for FE and HE, respectively). In the “Other” category the rate is lower but still the majority, 64.0%.

Correspondingly, the rates of servers outside ac.uk are 23.1%, 10.2% and 35.9%.

Table 6 shows that for servers within ac.uk an overwhelming majority are used to provide the service to the same institution that hosts them. The results suggest that a few FE and HE institutions (5.2% and 6.4%, respectively) host email for other ac.uk institutions. In fact, most hosting of email for other institutions happens in the Other category (44.6%).

External email providers

Table 7 presents a list of non-ac.uk providers, and the number of institutions they serve. The providers’ names have been anonymized to avoid promoting specific companies. The results clearly show that the first 2 providers have 40.8% and 70.6% of the FE and HE market, respectively. Both are commercial companies. Provider 1 leads in HE and FE, while Provider 2 leads in the Other category. There are 225 providers with fewer than 5 customers that have been removed from the table.

5 providers serve 62.7% of FE institutions that outsource email, while this figure grows to 85.3% in the case of HE institutions. Quite interestingly, the market in the Other category is much more fragmented, with 22 providers serving 50.4% of institutions.

Table 8 presents the same list of providers, but in this case evaluated in terms of number of servers.

While Providers 1 and 2 are also leading for FE and HE by number of servers, the rates are lower, with 29.7% and 56.7%, respectively. This can be explained by many providers who have only 1 or 2 customers, but use several servers for them. Provider 17 appears second in this table because of the high server/customer ratio it has in the Other category (a ratio of 4.0). There are 231 providers with fewer than 5 servers that have been removed from the table.

5 providers contribute 44.0% and 81.1% of servers for FE and HE, respectively, and in the same fashion as above, the Other category is much more fragmented, with 50 providers contributing 50.2% of the servers.

MTAs and filters by institution

Systems running in email servers were classified as either MTAs or filters (and only servers hosted by ac.uk domains were taken into account). The difference is that an MTA’s main purpose is to transfer email, while a filter searches incoming emails for spam, viruses, etc., and then passes the clean ones onto an MTA that is often hidden from the outside world.

In cases where the system could not be identified, it was tagged as “unknown MTA”. There are two reasons why this happens. The first originates in the way this study identifies MTAs, searching for a known string in the output of the telnet session, e.g. “Exim”. If a system is not included in our list, then it cannot be identified. However, we did a systematic review of telnet outputs and we are reasonably confident that all known systems used by ac.uk institutions or their external providers have been added to the list.

The second reason is that some postmasters delete or edit the information provided by the server, as explained in section “Identification of MTAs”. We think that most, if not all, of the “Unknown” cases can be assigned to the categories of security through obscurity or active misleading.

Table 9 shows that 65.3% (FE) and 63.2% (HE) of institutions obscure their mail server’s identification. At the same time, 41.8% (FE) and 50.7% (HE) make that information available. (There is some overlap because an institution can be using both identified and unidentified MTAs.)

The most popular MTA for FE is MS Exchange, in 21.7% of institutions. The most popular for HE is Exim, in 33.1% of institutions. In the Other category, Exim is also the most popular, with 22.7% vs. 10.5% of MS Exchange.

The percentages in Table 9 refer to “Institutions in ac.uk with email”. Regarding institutions “Running known MTAs”, the 3 most popular open source systems (Exim, postfix and MMDF) are used in 54.9% of cases (33.3% in FE, 72.4% in HE, 58.2% in Other), while the 4 most popular proprietary MTAs (MS Exchange, Qmail, MDaemon and Merak) are used in 28.6% of cases (54.1% in FE, 13.0% in HE and 24.0% in Other). Sendmail is a special case, as it is distributed under a licence that some people recognise as open source but it is not in the list of approved OSI licences. The OSS Watch position on this is that unless software is released under an OSI approved licence then it is not open source. Sendmail is used in 13.4% of institutions (7.4% in FE, 11.6% in HE, 15.1% in Other).

Thus, in general, open source or Sendmail is 2.4 times more likely to be used than proprietary alternatives for MTAs. In HE the imbalance is very noticeable, as HE institutions are 6.5 times more likely to choose an open source or Sendmail MTA than a proprietary one. On the other hand, FE institutions are 1.3 times more likely to choose a proprietary system than an open source or Sendmail one.
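These ratios follow from the percentages quoted for institutions running known MTAs. A small Python sketch of the arithmetic, with a hypothetical helper name, adds the open source and Sendmail shares and divides by the proprietary share, or the reverse for FE:

```python
def ratio(numerator_parts, denominator_parts):
    """Ratio between two groups of percentage shares."""
    return sum(numerator_parts) / sum(denominator_parts)


# (open source + Sendmail) / proprietary, all institutions
overall = ratio([54.9, 13.4], [28.6])
# (open source + Sendmail) / proprietary, HE only
he = ratio([72.4, 11.6], [13.0])
# proprietary / (open source + Sendmail), FE only
fe = ratio([54.1], [33.3, 7.4])

print(round(overall, 1), round(he, 1), round(fe, 1))
```

Rounded to one decimal place, the three values reproduce the 2.4, 6.5 and 1.3 quoted above.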

Table 10 shows that the use of filters (or at least, filters that could be identified) is negligible, with 2 proprietary systems (MailMarshal and MIMEsweeper) and 1 open source system (qpsmtpd) leading the classification.

MTAs and filters by server

The results in Table 11 are comparable to those of Table 9. The 3 most popular open source MTAs run in 49.8% of servers (31.7% in FE, 81.6% in HE, 53.4% in Other), while the 4 most popular proprietary solutions run in 37.2% of servers (53.7% in FE, 8.4% in HE, 33.2% in Other). That is, a server hosted in ac.uk is 1.6 times more likely to run an open source or Sendmail MTA than a proprietary one. But there is again a big difference between FE and HE. Servers in HE are 10.6 times more likely to run open source or Sendmail MTAs than proprietary, while servers in FE are 1.3 times more likely to run proprietary than open source or Sendmail MTAs.

Table 12 shows the results for servers hosted by domains outside ac.uk, typically commercial companies to which academic institutions outsource the mail service.

Positive identification of MTAs is ~10% less frequent in external domains than in ac.uk. Of those systems that could be identified, the most popular open source MTAs are Exim and postfix (53.2% of servers with known MTAs). The 3 most popular proprietary MTAs are Qmail, MS Exchange and Merak (19.0% of servers). Sendmail is in 17.8% of servers. Thus, it is 3.7 times more likely to find an open source or Sendmail MTA than a proprietary one in external providers.

The results of Table 13 are very similar to Table 10, and suggest low usage of filter systems. According to Table 14, this is even lower for external providers.

Conclusions

All ac.uk domains were systematically queried to find information about the type of MTAs that academic institutions (in a broad sense) are using.

Both the methodology and the results show that the evaluation of MTA usage is quite complex. To obtain a full picture it is necessary to consider deployment by institution and server, and it is difficult to define what an “institution” is. In fact, most of the domains are registered to pseudo-institutions that are neither in HE nor in FE.

Other hurdles for the study are that servers can overlap between institutions, and that institutions can use 1 or more servers, either in house, from another academic institution or from an external provider, or combinations thereof.

We are limited to studying inbound front-end servers, and even for those, in roughly half of the cases the MTA cannot be identified because the information has been edited in the spirit of security through obscurity.

However, emails in response to a question we posted to the UK mail manager mailing lists suggested interest from the community for this kind of data. All domains were given the option of opting out of the study, and only 3 in 3,631 chose to do so.

Some results were to be expected. For example, that 98%+ of FE and HE institutions use email. More surprising is the fact that ~46% of FE institutions outsource their email (for HE this is only ~26%). At the same time, ~70% of FE and ~80% of HE have in house servers. (There is some overlap due to some institutions having both.) Of those institutions that outsource, ~80% in FE and ~70% in HE use non-ac.uk providers, typically commercial companies. In fact, just 2 companies cover ~41% of the FE and ~70% of the HE outsourcing market. The 5 most popular external providers cover ~63% and ~85% of the market, respectively. The market is much more fragmented for institutions in the Other category, where ~50% is covered by 22 providers.

In total, academic institutions are 2.4 times more likely to choose open source or Sendmail MTAs than proprietary ones. Academic servers are 1.6 times more likely to run open source or Sendmail MTAs, and this figure grows to 3.7 for external servers.

However, these figures are not evenly distributed among institutions, as HE institutions are 6.5 times more likely to choose open source or Sendmail MTAs, while FE institutions are 1.3 times more likely to choose proprietary systems. In terms of servers, HE ones are 10.6 times more likely to run open source or Sendmail MTAs, while FE servers are 1.3 times more likely to run proprietary ones.

The most popular open source MTAs are Exim, postfix and MMDF, while the proprietary ones are MS Exchange, Qmail, MDaemon and Merak. Sendmail is a special case in that it is not open source but neither is it considered closed source.

It was also noted that some institutions and external providers run filters to protect their front-end servers, but the number is negligible.

Thus the results confirm trends observed in the previous Open Source Surveys conducted by OSS Watch in 2003, 2006, 2008 and 2010. Namely, that there is a healthy uptake of open source software in academic institutions, with higher percentages in HE than in FE. (The latter tend to prefer proprietary software.)

Acknowledgements

The contribution of Andy Saunders of OUCS was invaluable in terms of technical guidance and general suggestions about the study. We are also grateful to those UK mail managers who chipped in and offered advice or pointed out errors and omissions. And finally, thanks to all those OUCS and OSS Watch staff members who at some point offered their help.

Appendix A: Querying the WHOIS

WHOIS is a protocol that can be used to query special databases for information about domains, e.g. owner, contact address, servers, etc. Queries can be made from JANET’s WHOIS web interface [http://www.ja.net/services/whois/lookup.php] or from the command line with a WHOIS client (e.g. the program whois in Unix systems).

    $ whois -h whois.ja.net cam.ac.uk
    Domain: cam.ac.uk
    Registered For: The University of Cambridge
    Domain Owner: The University of Cambridge
    Registered By: JANET
    Servers:
        authdns0.csx.cam.ac.uk 131.111.8.37
        authdns1.csx.cam.ac.uk 131.111.12.37
        dns0.cl.cam.ac.uk 128.232.0.19
        dns1.cl.cam.ac.uk 128.232.0.18
        dns0.eng.cam.ac.uk 129.169.8.8
        ns2.ic.ac.uk
        bitsy.mit.edu
    Registrant Contact: Chris Thompson
    Registrant Address:
        University of Cambridge Computing Service
        New Museums Site
        Pembroke Street
        Cambridge
        CB2 3QH
        United Kingdom
        +44 1223 334715 (Phone)
        +44 1223 334679 (FAX)
        hostmaster@ucs.cam.ac.uk
    Entry updated: Monday 14th May 2007
    Entry created: Wednesday 17th September 2003

Appendix B: Querying the DNS

Domain Name System (DNS) servers work in a similar way to phone books. Each human-readable hostname (e.g. www.ox.ac.uk) is linked to a number called an IP address (163.1.0.45). For domains, DNS servers provide the hostnames of the email servers associated with them. This information can be obtained from the command line with a DNS lookup application (e.g. the program host in Unix systems).

    $ host -t mx aaschool.ac.uk
    aaschool.ac.uk mail is handled by 10 relay2.red.net.
    aaschool.ac.uk mail is handled by 10 relay3.red.net.
    aaschool.ac.uk mail is handled by 5 smtp.aaschool.ac.uk.

The number by the hostname indicates the priority level (a lower number means higher priority). That is, the primary mail server at aaschool.ac.uk is smtp.aaschool.ac.uk while relay2.red.net and relay3.red.net are secondary or backup servers. In this study we have made no distinction between primary and secondary servers.
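For automated processing, the output of the host command has to be turned into a list of servers. The following Python sketch assumes the line format shown in the example above (“… mail is handled by N hostname.”); the function name is hypothetical, and this is not the parser used in the study.

```python
import re

# One MX line as printed by "host -t mx": domain, priority, mail server.
MX_LINE = re.compile(r"^(\S+) mail is handled by (\d+) (\S+)$")


def parse_host_mx(output):
    """Return (priority, hostname) pairs, highest priority (lowest number) first."""
    records = []
    for line in output.splitlines():
        match = MX_LINE.match(line.strip())
        if match:
            _domain, priority, hostname = match.groups()
            records.append((int(priority), hostname.rstrip(".")))
    return sorted(records)


sample = """\
aaschool.ac.uk mail is handled by 10 relay2.red.net.
aaschool.ac.uk mail is handled by 10 relay3.red.net.
aaschool.ac.uk mail is handled by 5 smtp.aaschool.ac.uk.
"""
servers = parse_host_mx(sample)
```

Lines that do not match the pattern are ignored, so a domain without MX records simply yields an empty list.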

Appendix C: Querying the email server

Email transmission is performed using SMTP (Simple Mail Transfer Protocol). The mail server listens to SMTP connections on port 25, and it is usually possible to open a telnet session on this port and find information about the MTA being used in the welcome banner. Telnet sessions can be opened with the command line program telnet in Unix systems.

    $ telnet oxmail.ox.ac.uk 25
    Trying 129.67.1.161...
    Connected to oxmail.ox.ac.uk.
    Escape character is '^]'.
    220 relay0.mail.ox.ac.uk ESMTP Exim 4.62 Mon, 16 Jul 2007 19:42:04 +0100

In this case, the MTA used by the mail server oxmail.ox.ac.uk is Exim v4.62. If the name of the MTA is not provided in the welcome banner, sometimes it is possible to obtain it with a HELP command, e.g.

    $ telnet mail.aims.ac.uk 25
    Trying 69.44.19.51...
    Connected to mail.aims.ac.uk.
    Escape character is '^]'.
    220 ns03beta.netscanuk.com ESMTP
    help
    214 qmail home page: http://pobox.com/~djb/qmail.html

Telnet sessions are interactive, but they can be automated with an expect script in Unix systems.
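As an alternative to an expect script, the same exchange can be sketched with Python’s standard socket module. This is not the script used in the study; the function names, the 10-second timeout and the handling of multi-line replies are assumptions made for illustration.

```python
import socket


def read_reply(stream):
    """Read one SMTP reply from a binary stream.

    Continuation lines carry "-" after the 3-digit code (e.g. "214-...");
    the last line of a reply carries a space instead.
    """
    lines = []
    while True:
        line = stream.readline().decode("ascii", errors="replace").rstrip("\r\n")
        if not line:  # connection closed or empty line
            break
        lines.append(line)
        if len(line) < 4 or line[3] != "-":
            break
    return "\n".join(lines)


def query_server(host, port=25, timeout=10.0):
    """Open an SMTP session, record the banner and the HELP output, then quit."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        with conn.makefile("rwb") as stream:
            banner = read_reply(stream)
            stream.write(b"HELP\r\n")
            stream.flush()
            help_text = read_reply(stream)
            stream.write(b"QUIT\r\n")
            stream.flush()
            read_reply(stream)  # closing reply from the server, not recorded
    return banner, help_text
```

A call such as query_server("oxmail.ox.ac.uk") would return the welcome banner and the HELP output as two strings, which can then be matched against the list of known MTA systems. Connection errors and timeouts are left to the caller in this sketch.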

Appendix D: General flowchart of the study

Appendix E: Opt out email

The following text was emailed to the postmaster address of every domain queried in this survey, with an option for them to opt-out of the study.

Dear postmaster,

You have been sent this email to inform you that your domain has been included in the “OSS Watch automated survey of inbound Mail Transfer Agents (MTAs) in academic domains in the UK” study.

OSS Watch is a JISC-funded organization hosted by Oxford University.

OSS Watch promotes awareness and understanding of the legal, social, technical and economic issues that arise when educational institutions engage with free and open source software. It does this by providing free and unbiased advice and guidance to UK higher and further education.

The purpose of this study is to evaluate the usage of inbound MTAs in ac.uk domains, and the results will be published under a Creative Commons Attribution-ShareAlike licence.

The methodology of the study is the following: A telnet session is opened to port 25 of every surveyed inbound email server in ac.uk domains, and “help” and “quit” commands are issued. Then, if possible, the MTA running in the server is identified from the output of the telnet session.

The information collected from the server is available to any person with an internet connection and an open SMTP port. It is acquired using standard protocols and is freely provided by the server as part of its normal functionality.

For this study, each server needs to be accessed only once, and if it were to become periodic, this would mean no more than 1 access every month per server under normal circumstances. Thus, we believe that the impact in the work load of the studied servers is negligible.

However, if for some reason you object to these data being gathered from your domain, please let us know and your domain will be removed from future surveys.

Yours faithfully,

Ramon Casero ramon.casero@oucs.ox.ac.uk.

OSS Watch Development Officer.

http://www.oss-watch.ac.uk/