Botnet:
What are bots:
The word botnet comes from the words robot and network. A bot is a device infected by malicious software that then becomes part of a network, or "net", of infected devices controlled by a single attacker or attack group (a bot-master).
Infected devices are controlled remotely by threat actors, often cybercriminals, and are used for specific functions, so the malicious operations stay hidden from the user. The most common uses of botnets are to send message spam and to generate malicious traffic for DDoS attacks.
What is a botnet:
A botnet is a collection of internet-connected devices, which may include PCs, servers, mobile devices and Internet of Things (IoT) devices, that are infected and controlled by a common type of malware. Users are often unaware that a botnet is modifying or using their system.
How it works:
Botnet malware typically scans the internet for vulnerable devices rather than targeting specific people, companies or industries. The objective when building a botnet is to infect as many connected devices as possible and to use their combined computing power and resources for automated tasks that remain hidden from the devices' users.
For example, an ad fraud botnet that infects a user's PC will take over the system's web browsers to divert fraudulent traffic to certain online advertisements. To stay hidden, however, the botnet will not take over the browsers completely, as this would alert the user. Instead, it uses a small share of the browser's processes, running in the background, to send a small amount of traffic from the infected device to the targeted ads.
On its own, the fraction of bandwidth taken from a single infected device won't do much for the cybercriminals running the ad fraud campaign. A botnet combining millions of devices, however, can generate a massive amount of fake traffic for ad fraud while still avoiding detection by the individual users of those devices.
Types of attacks:
Cross-site scripting: Inserting malicious JavaScript into the header of an otherwise legitimate Web site.
DNS cache poisoning: Corrupting a DNS server so that it directs people who enter legitimate URLs to the attacker's malicious Web site.
iFrames: The "inline frame" HTML element that can create invisible frames capable of executing malware.
Pharming: Creating an illegitimate copy of a real Web site and redirecting traffic to the phony site to obtain information or download malicious code.
Pretexting: Pretending to be a legitimate entity to lure people to malicious sites.
Toxic blogs: Uploading links to malicious Web sites, or when blogs support HTML or scripts, uploading malicious code or using iFrames.
The scale of the problem:
Current estimates of the scale of the problem:
Statistics for 2014 / UK Organisations
• 73% of respondents have outsourced business processes over the Internet.
• 52% consider use of social networking sites to be important to their business.
• 75% of large businesses allow staff to use smartphones and tablets to connect to their systems.
• Over 80% of manufacturing, leisure, retail and financial firms have confidential data on the Internet.
• 85% use a wireless network.
• 55% use Voice over IP telephony.
• 84% are heavily dependent on their IT systems.
Global State of Information Security Survey 2017:
• 28% of survey respondents reported security compromises of mobile devices.
• 38% of survey respondents reported phishing scams.
• 48% of IT services are delivered via the Cloud.
• 51% of respondents actively monitor and analyse threat intelligence to help detect risks and incidents.
• 62% of respondents use managed security services for authentication, identity and access management, real-time monitoring and analytics, and threat intelligence.
The typical lifecycle of Botnets:
Including how machines become Botnets:
It begins with the infection stage. The victim computer can be exploited through any of the following: unpatched vulnerabilities, backdoors left by Trojans, or password guessing and brute-force attacks. The infected machine is called a zombie or a drone. Once a host is infected, it downloads the bot binary from a remote server and installs it automatically. The bot then looks up the addresses of IRC servers via DNS lookups; these IRC servers are called Command and Control (C&C) servers. On obtaining a C&C server's address, the bot logs into it and authenticates itself as part of that particular botnet. The bots can then update their bot software (usually to add functionality), if an update is available, and add more C&C servers. Bot-masters use IRC servers for C&C for the following reasons:
• Easy to install: a private network can be set up easily.
• Easy to control: using features such as usernames and passwords.
• Interactive: two-way communication between the bot-master and the zombie machine is possible.
When a zombie machine is up and connected to the Internet, it logs into the C&C server and waits for the bot-master's commands. The bot-master logs into the C&C server and can then issue commands for the bots to perform.
How they are controlled:
In a traditional botnet, the bots are typically infected with a Trojan horse and use Internet Relay Chat (IRC) to communicate with a central C&C server. Botnets are often used to distribute malware and gather misappropriated information, such as credit card numbers. Depending on the purpose and structure of the botnet, the C&C server may also issue commands to begin a DDoS (distributed denial of service) attack.
Command and control servers (C&C servers) are computers that issue commands to members of a botnet. A botnet member may be referred to as a zombie, and the botnet itself may be referred to as a zombie army.
How are they utilised:
Popular botnet topologies include:
• Star topology – the bots are organised around a central server.
• Multi-server topology – there are multiple C&C servers for redundancy.
• Hierarchical topology – multiple C&C servers are organised into tiered groups.
• Random topology – co-opted computers communicate as a peer-to-peer botnet (P2P botnet).
Since IRC communication is typically used to command botnets, it is often guarded against, which has motivated the drive for more covert ways for C&C servers to issue commands. Alternative channels used for botnet command include JPG images, Microsoft Word files and posts from LinkedIn or Twitter dummy accounts.
Description of existing research based solutions and their limitations:
Peer to peer:
How peer to peer botnet works: (diagrams)
Peer-to-Peer (P2P) botnets try to solve the problem of security researchers and authorities targeting domains or servers by creating a decentralised network. The idea of P2P is that all the bots connect and communicate with each other, removing the need for a centralised server; however, it's not as straightforward as that.
If bots are communicating with each other, the bot-master needs to make sure only he can command them. This is usually done using digital signing. Signing is performed using asymmetric encryption, a type of encryption that requires two keys (public and private): if one key is used to encrypt a message, it can only be decrypted with the other key. If the bot-master keeps one key secret (the private key) and embeds the other key in the bot (the public key), he can use his key to encrypt commands, and the bots can then decrypt them using the public key. Without the bot-master's private key, no one can encrypt valid commands.
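The signing scheme described above can be sketched with a toy example. This uses textbook RSA with deliberately tiny primes for readability; the numbers, function names, and command string are all illustrative, and any real system would use a proper cryptographic library with large keys:

```python
import hashlib

# Toy RSA-style signing. The primes are the classic textbook example
# and are far too small for real use.
P, Q = 61, 53
N = P * Q     # public modulus (3233)
E = 17        # public exponent, embedded in every bot
D = 2753      # private exponent, kept only by the bot-master

def digest(command: str) -> int:
    # Reduce a hash of the command into the modulus range.
    h = hashlib.sha256(command.encode()).digest()
    return int.from_bytes(h, "big") % N

def sign(command: str) -> int:
    # Bot-master: "encrypt" the digest with the private key.
    return pow(digest(command), D, N)

def verify(command: str, signature: int) -> bool:
    # Bot: "decrypt" the signature with the embedded public key
    # and compare it against a freshly computed digest.
    return pow(signature, E, N) == digest(command)
```

A tampered command or signature fails verification, which is exactly why researchers cannot inject commands into the botnet without the private key.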
Computers that are behind NAT or firewalls, or that use a proxy server to access the internet, cannot accept incoming connections but can make outgoing connections. This is a problem, as it would prevent most bots from being connected to by other bots. In traditional botnets this obviously isn't an issue, because the bots connect to the server; so a peer-to-peer network still requires servers of a sort.
Figure 2
Bots that can accept incoming connections (i.e. not behind a proxy, NAT, or firewall) act as servers (usually referred to as nodes or peers); the bots that cannot accept connections (usually referred to as workers) then connect to one or more nodes to receive commands (Figure 2). Although the nodes are technically servers, they are used in a way that prevents takedown: the workers are distributed between many nodes, allowing them to shift to another node if one is taken down. P2P botnets only work if there are enough nodes that it is impractical to take them all down; the bad news is that because the nodes are legitimate computers, they can't simply be seized the way a server would be.
Each node maintains a list of IP addresses of other nodes, which it shares with the workers; the workers store these lists, allowing them to switch nodes if the current one dies. At this stage the botnet would just be many small groups of bots connected to many different nodes, which would be impossible to command as a whole. For commands to circulate the entire network, either the bots connect to multiple nodes and pass any commands received on to the other nodes, the nodes connect to other nodes and pass commands between themselves, or a combination of the two.
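The command-circulation options above all amount to flooding with de-duplication, which can be sketched as a graph traversal. The overlay layout and machine names here are hypothetical:

```python
def flood(overlay, start):
    """Return the set of machines a command reaches from `start`."""
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for neighbour in overlay.get(node, []):
            if neighbour not in seen:  # each machine forwards a command only once
                seen.add(neighbour)
                frontier.append(neighbour)
    return seen

# Hypothetical overlay: "A" and "B" are nodes that relay to each other
# and serve workers; w1-w3 are workers that only receive.
OVERLAY = {
    "A": ["B", "w1", "w2"],
    "B": ["A", "w3"],
}
```

Because the nodes relay to each other, a command injected at any node eventually reaches every worker, with no single server to take down.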
Nearly all peer-to-peer botnets in existence have a vulnerability in the peer-sharing mechanism. As explained earlier, the nodes are required to maintain and share a list of other nodes with the workers, to distribute the workers among the vast number of nodes. It would be incredibly time-consuming, or even impossible, for the bot-master to manually provide each node with a list of other nodes, so the nodes do it automatically. When a new bot is identified as being capable of accepting connections, the node it is connected to adds it to the node list and shares it with the other nodes.
Propose a theoretical solution to detect peer to peer botnet:
Objective:
The objective of Bot-Miner is to detect groups of compromised machines within a monitored network that are part of a botnet. We do so by passively analysing network traffic in the monitored network.
Note that we do not aim to detect botnets at the very moment when victim machines are compromised and infected with malware (bot) code. In many cases these events may not be observable by passively monitoring network traffic. For example, an already infected laptop may be carried in and connected to the monitored network, or a user may click on a malicious email attachment and get infected. In this paper, we are not concerned with the way internal hosts become infected (e.g., by malicious email attachments, remote exploiting, and Web drive-by download). We focus on the detection of groups of already compromised machines inside the monitored network that are part of a botnet.
This detection approach meets several goals:
• it is independent of the protocol and structure used for communicating with the bot-master (the C&C channel) or peers, and is resistant to changes in the location of the C&C server(s).
• it is independent of the content of the C&C communication. That is, we do not inspect the content of the C&C communication itself, because C&C could be encrypted or use a customized (obscure) protocol.
• it generates a low number of false positives and false negatives.
• the analysis of network traffic employs a reasonable amount of resources and time, making detection relatively efficient.
Architecture:
The architecture of our Bot-Miner detection system consists of five main components: C-plane monitor, A-plane monitor, C-plane clustering module, A-plane clustering module, and cross-plane correlator.
The two traffic monitors in C-plane and A-plane can be deployed at the edge of the network examining traffic between internal and external networks, like Bot-Hunter and Bot-Sniffer. They run in parallel and monitor the network traffic. The C-plane monitor is responsible for logging network flows in a format suitable for efficient storage and further analysis, and the A-plane monitor is responsible for detecting suspicious activities (e.g., scanning, spamming, and exploit attempts). The C-plane clustering and A-plane clustering components process the logs generated by the C-plane and A-plane monitors, respectively. Both modules extract several features from the raw logs and apply clustering algorithms to find groups of machines that show very similar communication (in the C-plane) and activity (in the A-plane) patterns. Finally, the cross-plane correlator combines the results of the C-plane and A-plane clustering and makes a final decision on which machines are possibly members of a botnet. In an ideal situation, the traffic monitors should be distributed on the Internet, and the monitor logs are reported to a central repository for clustering and cross-plane analysis.
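The cross-plane idea can be sketched in heavily simplified form. Here hosts are "clustered" by identical feature tuples rather than by the real clustering algorithms, and a host is flagged only if it shares both a C-plane cluster and an A-plane cluster with at least one other host; all host names and feature values are invented:

```python
from collections import defaultdict

def cluster(records):
    """Group hosts that share identical feature tuples; keep only
    clusters with 2+ members (similarity across hosts)."""
    groups = defaultdict(set)
    for host, features in records:
        groups[features].add(host)
    return [hosts for hosts in groups.values() if len(hosts) > 1]

def correlate(c_records, a_records):
    """Flag hosts that fall into a shared cluster in BOTH planes."""
    suspects = set()
    for c in cluster(c_records):
        for a in cluster(a_records):
            suspects |= c & a  # similar communication AND similar activity
    return suspects
```

A host that merely scans (A-plane only), or merely shares a traffic pattern with another host (C-plane only), is not flagged; only the conjunction of the two planes marks likely botnet membership.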
In our current prototype system, the traffic monitors are implemented in C for efficiency (working on real-time network traffic). The clustering and correlation analysis components are implemented mainly in Java and R, and they work offline on logs generated by the monitors.
Justify why it’s better:
Propose a practical policy for home broadband users to try to reduce the probability of home computers becoming a bot in a botnet:
What security should home users use:
Firewalls:
A firewall restricts people trying to enter to a carefully controlled point, preventing attackers from getting close to other defences, and restricts people leaving to a controlled point. Firewalls are a focus for security decisions: they enforce the security policy, can limit the device's exposure, and can log internet activity efficiently.
Packet-Filtering Firewalls:
A screening router can send, drop, reject, and log packets. One screening router can help protect an entire network, and simple packet filtering is extremely efficient. Packet filtering is widely available in software and hardware products.
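The rule matching a screening router performs can be sketched as a top-down rule list with a default-deny policy. The rules themselves are illustrative, not a recommended configuration:

```python
# Each rule is (action, protocol, destination port), evaluated top-down.
RULES = [
    ("allow", "tcp", 443),   # HTTPS out
    ("allow", "tcp", 80),    # HTTP out
    ("deny",  "tcp", 6667),  # block plain IRC, a classic C&C channel
]

def filter_packet(protocol, dst_port, rules=RULES):
    """Return the action for a packet: first matching rule wins."""
    for action, r_proto, r_port in rules:
        if protocol == r_proto and dst_port == r_port:
            return action
    return "deny"  # default-deny: anything unmatched is dropped
```

The default-deny fallback is what makes the policy safe: traffic is only passed when a rule explicitly allows it.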
Proxy Services
Proxy services are specialised applications or server programs that take users' requests for Internet services (such as FTP and Telnet) and forward them to the actual services. Transparency is the major benefit of proxy services. The proxy server can control what users do because it can make decisions about the requests it processes.
Some proxy servers do in fact just forward requests on, no matter what they are. These may be called generic proxies or port forwarders. Proxy services can be good at logging, can provide caching and perform intelligent filtering. Proxy systems can perform user-level authentication and automatically provide protection for weak or faulty IP implementations.
Guard
A guard is a sophisticated proxy firewall. Guards decide what services to perform on the user's behalf based on the knowledge available to them. There is no clear-cut definition of when something is a guard rather than a proxy firewall.
Network Address Translation (NAT)
Network address translation can help restrict incoming traffic. Network address translation helps to enforce the firewall's control over outbound connections.
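Why NAT blocks unsolicited inbound traffic (which is also why NATed machines cannot act as P2P nodes, as discussed earlier) can be sketched as a translation table that only packets matching an existing outbound mapping can traverse. Addresses and ports are illustrative:

```python
class Nat:
    """Minimal sketch of NAT port translation with a default-drop
    policy for unsolicited inbound packets."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}      # (internal ip, internal port) -> public port
        self.reverse = {}    # public port -> (internal ip, internal port)
        self.next_port = 40000

    def outbound(self, internal_ip, internal_port):
        # An outbound connection creates (or reuses) a mapping.
        key = (internal_ip, internal_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.reverse[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.table[key]

    def inbound(self, public_port):
        # Inbound traffic is delivered only if a mapping exists;
        # unsolicited connections hit no mapping and are dropped.
        return self.reverse.get(public_port)
```
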
A Virtual Private Network (VPN) employs encryption and integrity protection so that a public network can be used as if it were private.
Intrusion Detection System:
An Intrusion Detection System (IDS) is a device or software application that monitors a network and/or information system for malicious activities or policy violations. They respond to suspicious activity by warning the system administrator, displaying an alert, and logging the event.
Unlike firewalls or access controls, IDSs are not intended to deter or prevent attacks, they’re only one piece of the whole security package and must be supplemented by other security and protection mechanisms. They are a very important part of a security architecture but don’t solve all our problems. The accuracy of intrusion detection is generally measured in terms of false positives (false alarms) and false negatives (attacks not detected).
An IDS is capable of: monitoring and analysing user and system activities; auditing system and configuration vulnerabilities; assessing the integrity of critical system and data files; recognising patterns that reflect known attacks; statistically analysing abnormal activity; providing a data trail, tracing activities from the point of entry up to the point of exit; and supporting the installation of decoy servers (honeypots) and, in some IDSs, of vendor patches.
On the other hand, an IDS is not capable of: compensating for weak authentication and identification mechanisms; investigating attacks without human intervention; guessing the content of your organisation's security policy; compensating for weaknesses in networking protocols (e.g. IP spoofing); compensating for problems in the integrity or confidentiality of information; analysing all traffic on a very high-speed network; or dealing effectively with attacks at the packet level or with modern network hardware.
Active and passive IDS:
An active IDS, also known as an Intrusion Prevention System (IPS), is configured to automatically block suspected attacks in progress without any intervention required by an operator.
An IPS can itself be used to effect a Denial of Service (DoS) attack: an attacker can intentionally flood the system with alarms that cause it to block connections until no connections or bandwidth are available.
A passive IDS is a system that is configured only to monitor and analyse network traffic activity and alert an operator to potential vulnerabilities and attacks.
Network-based and host-based IDS
A network-based IDS usually consists of a network appliance (or sensor) with a Network Interface Card (NIC) operating in promiscuous mode and a separate management interface. A host-based IDS requires small programs (or agents) to be installed on individual systems to be monitored.
Knowledge-based and behaviour-based IDS
A knowledge-based (or signature-based) IDS references a database of previous attack profiles and known system vulnerabilities to identify active intrusion attempts.
A behaviour-based (or anomaly–based) IDS references a baseline or learned pattern of normal system activity to identify active intrusion attempts.
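A behaviour-based check can be sketched as a learned baseline plus a deviation threshold. The monitored feature (a connection rate) and the three-standard-deviations threshold are illustrative assumptions, not from the text:

```python
import statistics

def build_baseline(samples):
    """Learn 'normal' activity from historical samples,
    e.g. connections per minute."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a sample that deviates from the baseline by more than
    `threshold` standard deviations."""
    mean, stdev = baseline
    return abs(value - mean) > threshold * stdev
```

A bot that suddenly starts scanning or spamming produces rates far outside the learned baseline, which is what an anomaly-based IDS catches even without a known attack signature.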
Centralised approach
A centralised IDS has one (or more) major components: a Central Intrusion Analysis Engine (CIAE) and Network Monitors (NMs), with traffic details passed to the CIAE for analysis and response. The CIAE performs the central tasks, and the placement of the CIAE and NMs is a key design decision. Most existing IDSs perform their data processing centrally even though their data collection is distributed.
Decentralised approach
The decentralised approach attempts to address the issues of scalability, ease of configuration and fault tolerance. It has a layered structure in which only events that are part of a distributed attack are forwarded to a higher-level entity: decentralised detection nodes analyse traffic on the network, and autonomous distributed agents co-operate.
Characteristic: Run continually
• Centralised: A relatively small number of components need to be kept running.
• Distributed: Harder, because a larger number of components need to be kept running.

Characteristic: Reliable
• Centralised: The state of the IDS is centrally stored, making it easier to recover after a crash.
• Distributed: The state of the IDS is distributed, making it more difficult to store in a consistent and recoverable manner.

Characteristic: Resist subversion
• Centralised: A smaller number of components need to be monitored. However, these components are larger and more complex, making them more difficult to monitor.
• Distributed: A larger number of components need to be monitored. However, because of the larger number, components can crosscheck each other. The components are also usually smaller and less complex.

Characteristic: Minimal overhead
• Centralised: Little or no overhead is imposed on most systems, except the ones where the analysis components run, where a large load is imposed. Those hosts may need to be dedicated to the analysis task.
• Distributed: Little overhead is imposed on each system because the components running on it are smaller. However, some extra load is imposed on most of the systems being monitored.

Characteristic: Configurable
• Centralised: Easier to configure globally, because of the smaller number of components. It may be difficult to tune for specific characteristics of the different hosts being monitored.
• Distributed: Each component may be localised to the set of hosts it monitors, and may be easier to tune to its specific tasks or characteristics.

Characteristic: Adaptable
• Centralised: By having all the information in fewer locations, it is easier to detect changes in global behaviour. Local behaviour is more difficult to analyse.
• Distributed: Data are distributed, which may make it more difficult to adjust to global changes in behaviour. Local changes are easier to detect.

Characteristic: Scalable
• Centralised: The size of the IDS is limited by its fixed number of components. As the number of monitored hosts grows, the analysis components will need more computing and storage resources to keep up with the load.
• Distributed: A distributed IDS can scale to a larger number of hosts by adding components as needed. Scalability may be limited by the need to communicate between the components, and by the existence of central coordination components.

Characteristic: Graceful degradation of service
• Centralised: If one of the analysis components stops working, most likely the whole IDS stops working. Each component is a single point of failure.
• Distributed: If one analysis component stops working, part of the network may stop being monitored, but the rest of the IDS can continue working.

Characteristic: Dynamic reconfiguration
• Centralised: A small number of components analyse all the data. Reconfiguring them likely requires the IDS to be restarted.
• Distributed: Individual components may be reconfigured and restarted without affecting the rest of the IDS.
As a baseline, home users should use antivirus software to protect their devices. Antivirus software was originally designed to detect and/or remove computer viruses. It can be broken down into four subsections:
Real-time protection
Real-time protection (also called on-access scanning, background guard, resident shield, auto-protect, and other synonyms) refers to the automatic protection provided by most antivirus, anti-spyware, and other anti-malware programs. It monitors computer systems for suspicious activity such as computer viruses, spyware, adware, and other malicious objects in "real time", that is, while data is loaded into the computer's active memory: when inserting a CD, opening an email, browsing the web, or when a file already on the computer is opened or executed.
Rootkit detection
Anti-virus software can attempt to scan for rootkits. A rootkit is a type of malware designed to gain administrative-level control over a computer system without being detected. Rootkits can change how the operating system functions and, in some cases, can tamper with the anti-virus program and render it ineffective. Rootkits are also difficult to remove, in some cases requiring a complete re-installation of the operating system.
Heuristics:
Many viruses start as a single infection and through either mutation or refinements by other attackers, can grow into dozens of slightly different strains, called variants. Generic detection refers to the detection and removal of multiple threats using a single virus definition.
Signature-based detection: Traditional antivirus uses signatures to find malware. When malware reaches the hands of an antivirus vendor, it is analysed by malware researchers or by dynamic analysis systems. Once it is determined to be malware, a signature of the file is extracted and added to the signature database of the antivirus software.
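The simplest form of signature matching can be sketched as a hash lookup against a database of known-malware digests. The "malware" bytes below are obviously invented, and real products also use more flexible signatures than whole-file hashes:

```python
import hashlib

# Hypothetical signature database of SHA-256 digests of known malware.
SIGNATURE_DB = {
    hashlib.sha256(b"EVIL_BOT_BINARY_v1").hexdigest(),
}

def is_known_malware(file_bytes: bytes) -> bool:
    """Flag a file whose digest matches a known-malware signature."""
    return hashlib.sha256(file_bytes).hexdigest() in SIGNATURE_DB
```

The limitation is also visible here: changing a single byte of the malware changes the hash, which is why variants (and the heuristic/generic detection described above) matter.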