Statistics from the Autoreporter service as open data

Autoreporter statistical data

Provided by the NCSC-FI at FICORA, Autoreporter is a service that automatically collects malware and information security incident observations concerning Finnish networks. The observations are forwarded to the parties responsible for the information security of those networks so that the problems can be remedied. Autoreporter monitors over 200 AS numbers.

The statistical data produced by the Autoreporter service is published as open data on this page. The data cover all malware and information security incident observations made by the service from 6 March 2005 onwards. The following data on the observations have been published: time window (UTC time zone), AS number, IP address, and a more detailed classification of the observation.
The time window consists of full days, and one IP address is shown in the time window only once per observation type. Because IP addresses are often assigned dynamically, a single IP address may have been used by several users during the time window.
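
The once-per-window rule can be illustrated with a short Python sketch. It is not the service's actual pipeline, which this page does not describe. The function name, the tuple layout, and the sample observations are assumptions made for illustration. Subcategories are treated as part of the observation type because the published example lists the same address twice on one day with different subcategories.

```python
from datetime import datetime, timezone

def daily_unique(observations):
    """Keep one record per (UTC day, IP, main category, subcategory)."""
    seen = set()
    unique = []
    for timestamp, ip, main, sub in observations:
        # The published time windows are full days in UTC.
        day = timestamp.astimezone(timezone.utc).date()
        key = (day, ip, main, sub)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

# Hypothetical raw observations; 192.0.2.1 is a documentation address.
raw = [
    (datetime(2011, 12, 8, 1, 0, tzinfo=timezone.utc), "192.0.2.1", "bot", "dnschanger"),
    (datetime(2011, 12, 8, 17, 30, tzinfo=timezone.utc), "192.0.2.1", "bot", "dnschanger"),
    (datetime(2011, 12, 8, 18, 0, tzinfo=timezone.utc), "192.0.2.1", "bot", "unspecified bot"),
]
records = daily_unique(raw)
```

Here the two dnschanger observations collapse into one record, while the observation with a different subcategory is kept separately.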

AS numbers and IP addresses have been anonymised by means of a hash function. IP addresses have been anonymised in order to protect the identity of individual users, whereas AS numbers have been anonymised at the request of network administrators. However, the geographic location of each IP address was determined at the city level before the address was anonymised. The location was looked up in the commercial database provided by the American company MaxMind (database date: 2012-07-03). The data in that database are incomplete, so the geographic location may be inaccurate. In addition to the name of the city, the country code (cc) comes from the database.
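
The order of operations described above (look up the location first, then anonymise) might look roughly like the following Python sketch. The page does not say which hash function is used or whether a secret key or salt is involved. HMAC-SHA1 is assumed here only because it yields 40 hexadecimal characters, the same length as the identifiers in the examples. The names anonymise, anonymise_record, lookup_city and the key are hypothetical.

```python
import hashlib
import hmac

def anonymise(value: str, secret_key: bytes) -> str:
    """Return a keyed hash of value as 40 hex characters (assumed scheme)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha1).hexdigest()

def anonymise_record(ip, asn, lookup_city, secret_key):
    """Resolve the location from the clear-text address, then hash the identifiers."""
    cc, city = lookup_city(ip)  # hypothetical geolocation callable
    return {
        "asn": anonymise(asn, secret_key),
        "ip": anonymise(ip, secret_key),
        "cc": cc,
        "city": city,
    }

key = b"example-secret"  # hypothetical key, for illustration only
record = anonymise_record("192.0.2.1", "AS64496",
                          lambda ip: ("FI", "Helsinki"), key)
```

Because the same input always maps to the same output, observations of one address can still be linked across days without revealing the address itself.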

The statistical data have been published as text files in the CSV and JSON formats, following the examples below. The CSV files use the pipe character (|) as the field separator. The character set used in the files is UTF-8.

CSV

# date from|date to|anon AS-number|anon IP-address|IP version|maincategory|subcategory|cc|city
...
2011-12-08 00:00:00+0000|2011-12-08 23:59:59+0000|52258e89bec324649e6a1b56c797902662a6c0e7|10940f75591111d1baa5d908df817c512bc778f5|4|bot|dnschanger|FI|Helsinki
2011-12-08 00:00:00+0000|2011-12-08 23:59:59+0000|52258e89bec324649e6a1b56c797902662a6c0e7|10940f75591111d1baa5d908df817c512bc778f5|4|bot|unspecified bot|FI|Helsinki
2011-12-08 00:00:00+0000|2011-12-08 23:59:59+0000|52258e89bec324649e6a1b56c797902662a6c0e7|1442a64c5cef7e3a3f6d356899fd77ee45de1a51|4|suspected spambot|blocklist|FI|Vanda
...
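
Reading this layout could be sketched in Python as follows. The field names in FIELDS are adapted from the header line above and are otherwise an assumption, as is the function name parse_rows. The sketch skips the comment header and any line that does not have nine fields.

```python
import csv
import io

# Field names adapted from the '#' header line of the published example.
FIELDS = ["date_from", "date_to", "asn", "ip", "ip_version",
          "main_category", "sub_category", "cc", "city"]

def parse_rows(lines):
    """Yield one dict per data line, skipping the header and blank lines."""
    data = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.reader(data, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(FIELDS):
            continue  # skip malformed or elided lines
        yield dict(zip(FIELDS, row))

sample = io.StringIO(
    "# date from|date to|anon AS-number|anon IP-address|IP version|"
    "maincategory|subcategory|cc|city\n"
    "2011-12-08 00:00:00+0000|2011-12-08 23:59:59+0000|"
    "52258e89bec324649e6a1b56c797902662a6c0e7|"
    "10940f75591111d1baa5d908df817c512bc778f5|4|bot|dnschanger|FI|Helsinki\n"
)
rows = list(parse_rows(sample))
```

For a downloaded file, the same function can be given a file object opened with encoding="utf-8".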

JSON

{
  "autoreporter": {
    "opendata": [
      ...
      {
        "date": {
          "from": "2011-12-08 00:00:00+0000",
          "to": "2011-12-08 23:59:59+0000"
        },
        "asn": [
          {
            "number": "52258e89bec324649e6a1b56c797902662a6c0e7",
            "ipaddress": [
              {
                "address": "10940f75591111d1baa5d908df817c512bc778f5",
                "version": "4",
                "cc": "FI",
                "city": "Helsinki",
                "incident": [
                  {
                    "category": {
                      "main": "bot",
                      "sub": "dnschanger"
                    }
                  },
                  {
                    "category": {
                      "main": "bot",
                      "sub": "unspecified bot"
                    }
                  }
                ]
              },
              {
                "address": "1442a64c5cef7e3a3f6d356899fd77ee45de1a51",
                "version": "4",
                "cc": "FI",
                "city": "Vanda",
                "incident": [
                  {
                    "category": {
                      "main": "suspected spambot",
                      "sub": "blocklist"
                    }
                  }
                ]
              },
              ...
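
The nested JSON layout can be flattened into the same nine columns as the CSV format. The Python sketch below assumes that the elided parts of the example close the structure in the obvious way. The function name flatten and the shortened sample document are illustrative.

```python
import json

def flatten(doc):
    """Yield one tuple per incident, in the column order of the CSV format."""
    for entry in doc["autoreporter"]["opendata"]:
        date = entry["date"]
        for asn in entry["asn"]:
            for ip in asn["ipaddress"]:
                for incident in ip["incident"]:
                    cat = incident["category"]
                    yield (date["from"], date["to"], asn["number"],
                           ip["address"], ip["version"],
                           cat["main"], cat["sub"], ip["cc"], ip["city"])

# Shortened sample; the closing brackets are assumed, not taken from the page.
sample = json.loads("""
{"autoreporter": {"opendata": [
  {"date": {"from": "2011-12-08 00:00:00+0000",
            "to": "2011-12-08 23:59:59+0000"},
   "asn": [{"number": "52258e89bec324649e6a1b56c797902662a6c0e7",
            "ipaddress": [{"address": "1442a64c5cef7e3a3f6d356899fd77ee45de1a51",
                           "version": "4", "cc": "FI", "city": "Vanda",
                           "incident": [{"category": {"main": "suspected spambot",
                                                      "sub": "blocklist"}}]}]}]}
]}}
""")
rows = list(flatten(sample))
```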

Observation types

The main categories of the observation types are:

  • bot: A workstation infected by malware is usually connected to a network controlled by the attacker. The workstation becomes a bot client that can be commanded via the existing botnet.
  • bruteforce: An IP address from which attempts to infiltrate commonly used network services have originated. Random character strings or words found in dictionaries have been used to systematically crack the passwords of network services. A workstation at that IP address may have been the subject of a data break-in or it may be infected by malware.
  • cc: A workstation infected by malware is usually connected – in one way or another – to a botnet managed by the attacker. An administration server that is used to command such botnets has been identified at the IP address.
  • dameware: Old versions of the DameWare Mini Remote Control software contain a vulnerability that can be exploited remotely. The vulnerability allows malware to be installed on a workstation. Signs of possible successful exploitation of this DameWare vulnerability have been detected at the IP address.
  • ddos: This category lists both workstations participating in DoS attacks and targets subjected to DoS attacks.
  • defacement: The IP address points to a defaced website.
  • dipnet: Dipnet is a worm infecting Windows-based workstations using the LSASS vulnerability.
  • fastflux: Fastflux is a way, implemented at the name server level, to hide the administration servers of a botnet, websites participating in phishing, and websites used to spread malware. The name server continuously returns new IP addresses to a specific domain name. The returned IP addresses are usually those of infected workstations that are part of a botnet.
  • malware: The IP address has been used to distribute malware.
  • malweb: A malicious website has been detected at the IP address. This category usually consists of IP addresses hosting websites with harmful JavaScript, iFrame references, or other malicious components.
  • phishing: A website participating in phishing has been detected at the IP address. It is usually a workstation or web server that has been the subject of a data break-in.
  • proxy: A workstation or server has been turned into an open proxy server that is being abused. The abuse can take many forms, for instance sending spam or commanding botnets.
  • router: An active network device has been turned into a proxy server that is being abused. The abuse can take many forms.
  • scan: Inconsiderate network scanning has been conducted from the IP address. Alternatively, the system at the address has been compromised or includes a workstation infected by malware.
  • spreaders: Malware uses a variety of spreading mechanisms, one of which is exploiting operating system vulnerabilities. Signs indicating that the IP address is being used to spread malware have been observed.
  • suspected spambot: The IP address has been used in an attempt to send spam. There is reason to suspect that the workstation has been infected by malware and it is being used to send spam.
  • worm: The IP address has been used in an attempt to spread a worm. Network worms usually use a vulnerability of the network service to spread.

The main categories of the observations are refined with subcategories. A subcategory can be, for example, the name of the malware or the port number targeted by a network scan or a break-in attempt.
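
Because every record carries both a main category and a subcategory, the data can be summarised at either level. The following Python sketch tallies both. The dictionary keys and the function name are assumptions, and the sample records mirror the three example lines shown earlier on this page.

```python
from collections import Counter

def count_by_category(rows):
    """Count observations per main category and per (main, sub) pair."""
    main_counts = Counter()
    sub_counts = Counter()
    for row in rows:
        main_counts[row["main_category"]] += 1
        sub_counts[(row["main_category"], row["sub_category"])] += 1
    return main_counts, sub_counts

# Categories taken from the three example records on this page.
rows = [
    {"main_category": "bot", "sub_category": "dnschanger"},
    {"main_category": "bot", "sub_category": "unspecified bot"},
    {"main_category": "suspected spambot", "sub_category": "blocklist"},
]
main_counts, sub_counts = count_by_category(rows)
```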

Files

  • Open data 2016 (CSV and JSON format)
    MD5: e3d2bffd7b9fc445a15ee3bd55f71c48
    SHA1: 64c3bb660b692b159d772aaf79b040f575e26f96
  • Open data 2015 (CSV and JSON format)
    MD5: aad730f3476bca6513a39cd75f4eea87
    SHA1: dd5212ae6440e9503b65a3a635cf668b6dce3fbd
  • Open data 2014 (CSV and JSON format)
    MD5: 2c0637b724c5d5be10588259e31d5258
    SHA1: ac72477541fe45713260c7fa906ee9813509786f
  • Open data 2013 (CSV and JSON format)
    MD5: f4ad9f934b5e310c58ba6651e05f1873
    SHA1: 4384d49c3b681f4a1e1e92995e11bb3e10ca7b82
  • Download the open data 2005 - 2012 in the CSV format [zip file, 39 MB]
    MD5: 6634cc151e2fb16a763ba1f3e6bb3405
    SHA1: b2e88f7d766934a9b8a9edc482504c3f07111092
  • Download the open data 2005 - 2012 in the JSON format [zip file, 41 MB]
    MD5: d02e7051ea961c8f38bcfe6059f3f571
    SHA1: bc23bf87e420b93afd5e7f68a3e834e0359bf2d7
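
A downloaded file can be compared against the checksums listed above. The following Python sketch uses the standard hashlib module. The function names are illustrative, and the local file name is left to the reader because this page does not state the file names.

```python
import hashlib
import io

def stream_digest(stream, algorithm="sha1", chunk_size=1 << 20):
    """Return the hex digest of a binary stream, read in chunks."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def file_matches(path, expected_hex, algorithm="sha1"):
    """Compare a local file's digest with a published checksum."""
    with open(path, "rb") as fh:
        return stream_digest(fh, algorithm) == expected_hex.strip().lower()

# Self-check against the well-known digests of the three bytes b"abc".
demo_sha1 = stream_digest(io.BytesIO(b"abc"), "sha1")
demo_md5 = stream_digest(io.BytesIO(b"abc"), "md5")
```

To check a download, call file_matches with the path of the local file, the SHA1 or MD5 value from the list, and the matching algorithm name ("sha1" or "md5").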

Feedback

We gladly receive your feedback and development suggestions via the
NCSC-FI's Facebook page, Twitter, or the electronic contact form.

Key words: Information security, Malware, NCSC-FI, Statistics
