diff --git a/content/posts/mapping-fail2ban.md b/content/posts/mapping-fail2ban.md
new file mode 100644
index 0000000..150a935
--- /dev/null
+++ b/content/posts/mapping-fail2ban.md
@@ -0,0 +1,149 @@
+---
+title: "Mapping Fail2ban's list of malicious IPv4 scanners"
+date: 2021-08-30T23:47:11+02:00
+draft: true
+---
+
+# Some context
+
+Back when I built [my gitea server](https://git.roboces.dev/) for the first time, I noticed something strange: it would work nicely, but only for so many hours at a time. Soon enough, it would crash or stop responding for no apparent reason, leaving me scratching my head.
+
+I had opened sshd's well-known port to the Internet with the naive impression that a server with no valid ssh login would be more than enough protection. What could possibly happen? Someone stealing my laptop and using my ssh key to push into my random dark-web repo to assert dominance?
+
+Oh boy, was I wrong. Within days of first exposing the ssh port to the Internet, the sheer amount of malicious traffic was making my server crash. The Chinese botnets neither knew nor cared that sshd wasn't accepting logins: they just kept trying to brute-force their way in. It is well known that there is a gigantic amount of IPv4 scanning going on, on the order of thousands of packets per day, but that is a mere fraction of what you get by exposing a well-known port. Before installing [fail2ban](https://www.fail2ban.org), my server was receiving hundreds of login attempts per second.
+
+# Meeting the scanners
+
+Fail2ban's method for keeping law and order is fairly straightforward: you give it a number of failed attempts to tolerate and a ban duration, and it adds temporary `iptables` rules when someone has tried and failed to connect one too many times. I will be using its daily log to get a better grasp of where all the botting is coming from.
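For the curious, the "too many failures means a temporary ban" idea fits in a few lines of Python. This is only a toy model, not fail2ban's actual implementation: the class and parameter names (`ToyBanTracker`, `max_retry`, `find_time`, `ban_time`) are made up for illustration, loosely echoing fail2ban's `maxretry`, `findtime` and `bantime` settings, and the default values are assumptions.

```python
from collections import defaultdict, deque


class ToyBanTracker:
    """Toy model: too many failures inside a time window means a temporary ban."""

    def __init__(self, max_retry=5, find_time=600, ban_time=600):
        self.max_retry = max_retry  # failures needed to trigger a ban (assumed value)
        self.find_time = find_time  # length of the counting window, in seconds
        self.ban_time = ban_time    # how long a ban lasts, in seconds
        self.failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self.banned_until = {}              # ip -> time at which the ban expires

    def is_banned(self, ip, now):
        """True while the ban for this IP has not expired yet."""
        return ip in self.banned_until and self.banned_until[ip] > now

    def record_failure(self, ip, now):
        """Register a failed login; return True if this failure triggers a ban."""
        if self.is_banned(ip, now):
            return False
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that have fallen out of the counting window
        while recent and recent[0] <= now - self.find_time:
            recent.popleft()
        if len(recent) >= self.max_retry:
            self.banned_until[ip] = now + self.ban_time
            recent.clear()
            return True
        return False
```

Feeding it five failures a few seconds apart, say `tracker.record_failure("221.181.185.159", t)` for `t` in `(0, 2, 5, 8, 11)`, makes the fifth call return `True`. The real thing then goes on to install a firewall rule, which this sketch does not attempt.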
+Fail2ban's logfiles look something like this:
+
+```
+$ head /var/log/fail2ban.log
+2021-08-29 00:00:34,222 fail2ban.server [94]: INFO rollover performed on /var/log/fail2ban.log
+2021-08-29 00:01:21,092 fail2ban.actions [94]: NOTICE [sshd] Unban 186.3.164.76
+2021-08-29 00:03:05,205 fail2ban.actions [94]: NOTICE [sshd] Unban 222.186.30.112
+2021-08-29 00:09:41,049 fail2ban.filter [94]: INFO [sshd] Found 221.181.185.159 - 2021-08-29 00:09:40
+2021-08-29 00:09:42,651 fail2ban.filter [94]: INFO [sshd] Found 221.181.185.159 - 2021-08-29 00:09:42
+2021-08-29 00:09:45,665 fail2ban.filter [94]: INFO [sshd] Found 221.181.185.159 - 2021-08-29 00:09:45
+2021-08-29 00:09:48,369 fail2ban.filter [94]: INFO [sshd] Found 221.181.185.159 - 2021-08-29 00:09:48
+2021-08-29 00:09:51,574 fail2ban.filter [94]: INFO [sshd] Found 221.181.185.159 - 2021-08-29 00:09:51
+2021-08-29 00:09:51,638 fail2ban.actions [94]: NOTICE [sshd] Ban 221.181.185.159
+2021-08-29 00:09:53,229 fail2ban.filter [94]: INFO [sshd] Found 221.181.185.159 - 2021-08-29 00:09:53
+```
+
+The only data I am interested in are the IP addresses (and how many of them there are), so I trim the file accordingly, taking care to remove duplicates:
+
+```
+$ grep -E "\WBan" /var/log/fail2ban.log | awk '{ print $8 }' | sort --unique | tee banlog
+1.116.211.170
+1.117.214.250
+1.15.106.44
+1.15.151.58
+1.15.183.51
+1.15.21.246
+1.179.137.10
+1.226.12.132
+1.53.89.181
+1.85.216.176
+[...]
+```
+
+Note the use of `sort --unique` rather than a bare `uniq`, which only detects adjacent duplicates.
+
+# Scanning the scanners
+
+Now that we have their IPs, we can get a rough estimate of where the traffic is coming from. There are many online services you can use to get this data, but they won't let you run queries in bulk without charging you for some kind of database subscription. If someone knows a program that _just works_ with batteries included, please tell me.
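If you would rather build the ban list without the shell, here is a rough Python sketch of the same extraction step. It assumes the log format shown in the sample above (action lines ending in `Ban` followed by an IPv4 address); `BAN_LINE` and `banned_ips` are names I made up for this example.

```python
import re

# Matches "Ban" as a whole word followed by an IPv4-looking address at the
# end of the line, as in:
# "2021-08-29 00:09:51,638 fail2ban.actions [94]: NOTICE [sshd] Ban 221.181.185.159"
BAN_LINE = re.compile(r"\bBan\s+(\d{1,3}(?:\.\d{1,3}){3})\s*$")


def banned_ips(lines):
    """Return the de-duplicated addresses found in Ban lines, sorted as strings."""
    found = set()
    for line in lines:
        match = BAN_LINE.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)
```

Calling `banned_ips(open("/var/log/fail2ban.log"))` should give roughly what ends up in `banlog`. The set takes care of duplicates wherever they sit in the file, which is the same reason for reaching for `sort --unique`.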
+
+Anyhow, I ended up using [IP2Location's](https://www.ip2location.com) BIN-format database along with its [Python API](https://www.ip2location.com/development-libraries/ip2location/python). They require a free account to download their database files, but a burner email or [an alias]( {{< ref "/automating-aliases.md" >}}) will do just fine.
+
+IP2Location's module can be installed in the usual fashion:
+
+```
+$ pip install IP2Location --user
+```
+
+After which we can get our hands dirty. I'm not much of a pythoner myself, so I decided to make a simple .py that outputs formatted lines so I can keep using my shiny UNIX tools:
+
+
+```python
+#!/usr/bin/env python
+import sys
+
+import IP2Location
+
+def main():
+
+    # Argument checking: expects the IP list and the BIN database, in that order
+    if len(sys.argv) < 3:
+        print("Usage: ip_query.py IPS_FILE DATABASE_BIN")
+        return
+
+    # Get a list of IPs as trimmed strings, skipping blank lines
+    with open(sys.argv[1], "r") as ips_file:
+        ip_list = [line.strip() for line in ips_file if line.strip()]
+
+    # Open the binary database
+    database = IP2Location.IP2Location(sys.argv[2], "SHARED_MEMORY")
+
+    # Field delimiter
+    d = "~"
+
+    for ip in ip_list:
+        record = database.get_all(ip)
+        # str() guards against fields that are not returned as strings
+        print(d.join(str(field) for field in (
+            record.ip,
+            record.country_short,
+            record.country_long,
+            record.region,
+            record.city,
+            record.latitude,
+            record.longitude,
+            record.zipcode,
+            record.timezone)))
+
+if __name__ == '__main__':
+    main()
+```
+
+Depending on which database you choose, it may have more or fewer fields available. Later I will cut what I don't need, but for now I'm dumping everything. You can use whichever delimiter you want, but I don't recommend using `,` or any other character that could appear in a country's name or timezone info.
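The payoff of picking a delimiter that never shows up inside a field is that the lines split back into fields with no quoting or escaping. A minimal sketch of reading them back, assuming the nine fields are printed in the order used by the script above (`Record`, `parse_record` and `country_counts` are hypothetical names, not part of IP2Location):

```python
from collections import Counter
from typing import NamedTuple


class Record(NamedTuple):
    """One output line of the script above, in the order it prints the fields."""
    ip: str
    country_short: str
    country_long: str
    region: str
    city: str
    latitude: str
    longitude: str
    zipcode: str
    timezone: str


def parse_record(line, delimiter="~"):
    """Split one delimited line back into named fields."""
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != len(Record._fields):
        # A wrong count usually means the delimiter also occurs inside a field
        raise ValueError(f"expected {len(Record._fields)} fields, got {len(fields)}: {line!r}")
    return Record(*fields)


def country_counts(lines):
    """Count records per (country_short, country_long) pair."""
    return Counter((r.country_short, r.country_long) for r in map(parse_record, lines))
```

With a delimiter such as `,` the field-count check would trip on any value that happens to contain a comma, which is exactly the situation the advice above is meant to avoid.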
+
+```
+$ chmod +x ip_query.py
+$ ./ip_query.py banlog IP2LOCATION-LITE-DB11.BIN | tee ipstats
+1.116.211.170~CN~China~Beijing~Beijing~39.907501~116.397232~100006~+08:00
+1.117.214.250~CN~China~Beijing~Beijing~39.907501~116.397232~100006~+08:00
+1.15.106.44~CN~China~Beijing~Beijing~39.907501~116.397232~100006~+08:00
+1.15.151.58~CN~China~Beijing~Beijing~39.907501~116.397232~100006~+08:00
+1.15.183.51~CN~China~Beijing~Beijing~39.907501~116.397232~100006~+08:00
+1.15.21.246~CN~China~Beijing~Beijing~39.907501~116.397232~100006~+08:00
+1.179.137.10~TH~Thailand~Krung Thep Maha Nakhon~Bangkok~13.750000~100.516670~10200~+07:00
+[...]
+```
+
+Now, this is looking much better. I was curious about which countries were the biggest culprits, although it isn't much of a surprise:
+
+```
+$ cut -d'~' -f2,3 ipstats | sort | uniq -c | sort -r | head
+    239 CN~China
+    118 US~United States of America
+     45 IN~India
+     33 ID~Indonesia
+     26 VN~Viet Nam
+     25 NL~Netherlands
+     23 SG~Singapore
+     22 KR~Korea (Republic of)
+     22 DE~Germany
+     21 RU~Russian Federation
+```
+
+We just made Our Very Own Top 10 Of Shame! And remember that this is _just one day's worth of logs_, from a server that barely half-a-dozen people use, and not even taking into account repeated offenses from the same IP. Goes to show you how crazy IPv4 scanning has gotten.
+
+# Mapping the scanners
+
+To top it off, I would like to have some sort of graphical visualization of these heinous crimes. There are some great libraries out there to plot coordinate data onto a world map. I would consider [something like folium](https://georgetsilva.github.io/posts/mapping-points-with-folium/) if I were to do more with the Python side of this blog post. But that's not what we're here for today. Today we're using crappy sites and copy-pasting.
+
+```
+$ cut -d'~' -f6,7 --output-delimiter=',' ipstats | xclip -selection clipboard
+```
+
+This will do just what we want.
+`--output-delimiter` is a cool flag that replaces the input delimiter with a different one in the output. Most places that let you paste coordinates in bulk require comma-separated lines, and that is what we just copied to our clipboard.
+
+We can use a place like [mapcustomizer](https://www.mapcustomizer.com) for our very first, crappy data visualization:
+
+![Image]({{< static "img/mapcustomizer.png" >}})
diff --git a/layouts/shortcodes/static b/layouts/shortcodes/static
new file mode 100644
index 0000000..c8e90ac
--- /dev/null
+++ b/layouts/shortcodes/static
@@ -0,0 +1,5 @@
+{{- .Scratch.Set "path" (.Get 0) -}}
+{{- if hasPrefix (.Scratch.Get "path") "/" -}}
+  {{- .Scratch.Set "path" (slicestr (.Scratch.Get "path") 1) -}}
+{{- end -}}
+{{- .Scratch.Get "path" | absURL -}}
diff --git a/static/img/mapcustomizer.png b/static/img/mapcustomizer.png
new file mode 100644
index 0000000..1252fd1
Binary files /dev/null and b/static/img/mapcustomizer.png differ