Monday, September 14, 2015

How I Ditched Logstash

Ever since Logstash came on the scene several years ago, ELK (Elasticsearch, Logstash, Kibana) has become the de facto standard for log processing for me and pretty much everybody I talk to. Like any product you use frequently, over time you just can't help but develop a love-hate relationship with it. But lately there has been a lot more hate than love, at least for me.

Our setup is pretty simple and straightforward: a cluster of load-balanced logstash processor nodes sits in front of a cluster of elasticsearch nodes, and we use logstash-forwarder to push logs from our servers to logstash.

The first issue we encountered was that our logstash 1.4 cluster did not scale well and started to crash under high lumberjack connection counts. The issue was addressed in logstash 1.5 with the implementation of "circuit breakers". While that solves the problem of logstash crashing (we jokingly called it the "shark attack," since the graph of memory utilization and active connection counts looks like a row of shark fins: climbing slowly, then dropping sharply each time logstash crashes), it does not, however, solve the problem of logstash-forwarder connections constantly being dropped and having to reconnect all the time.

Because the bottleneck is how logstash handles lumberjack connections, not CPU load (during a "shark attack" the CPU was 95% idle), the obvious and sensible solution is to put a buffering or queuing service in front of the incoming messages and let logstash pull from it at its own pace. However, logstash-forwarder's outright refusal to implement such a feature (https://github.com/elastic/logstash-forwarder/issues/18) forced our hand to move away from logstash-forwarder altogether. (And no, running an actual logstash in agent mode is not going to work, because it is extremely resource hungry.)

After a couple of months of smooth sailing, thinking all our logstash problems were behind us because we had implemented a pretty robust Beaver + SQS solution that scaled the logstash cluster on the SQS visible message count, an unexpected problem hit us again.
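For context, the scaling hook itself is nothing exotic: a CloudWatch alarm on the queue's visible message count driving an Auto Scaling policy. Here is a sketch of that wiring, not necessarily our exact setup; the queue name, threshold, and policy ARN are placeholders:

 aws cloudwatch put-metric-alarm --alarm-name logstash-sqs-backlog \
   --namespace AWS/SQS --metric-name ApproximateNumberOfMessagesVisible \
   --dimensions Name=QueueName,Value=log-events \
   --statistic Average --period 300 --evaluation-periods 2 \
   --threshold 10000 --comparison-operator GreaterThanThreshold \
   --alarm-actions <scale-out-policy-arn>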

This time, we got alerted that our SQS queue had grown alarmingly long, but all of our logstash instances (including the ones launched by auto scaling) were just sitting idle. After some digging around, we noticed tons of 400 responses from elasticsearch before logstash quit processing anything entirely, with NO ERROR MESSAGES about how or why the log processor stopped. What's worse, it's not an easily detectable crash: the logstash process still sits there running, just not doing anything. A restart of logstash would make it start processing logs again, but it would quit again after about 5 minutes. Users have reported the problem (https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/144), but it seems nobody knows the exact underlying cause, nor is there a plan to fix it.

We limped along using cron to restart logstash every 5 minutes (yeah, it was that bad) while furiously looking for a fix or an alternative. Fortunately, while all this was going on, we were also looking for a way to log docker container activity, and I was in the process of evaluating fluent (http://www.fluentd.org/). Since we had pretty much hit a brick wall with logstash, I started experimenting with letting fluent pull messages off SQS, process them, and push them directly to elasticsearch. What I found was simply astonishing:

As soon as we brought a single fluent instance online to pull messages off the same SQS queue, using the same type of EC2 instance as the logstash nodes, that single fluent instance handled about 90% of the log messages while three logstash instances combined processed the other 10%. Basically, fluent out-processed logstash at a ratio of roughly 30 to 1. On top of that, from our observation, fluent's memory footprint is tiny (about 40MB, roughly what the fluent web site claims), compared to logstash, where a 300~400MB memory footprint is quite common.
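The fluent side is conceptually tiny, too. Here is a minimal sketch of the kind of configuration involved, assuming the fluent-plugin-sqs input and fluent-plugin-elasticsearch output plugins; the queue, host, and credential values are placeholders, and the exact option names differ between plugin versions, so check the docs for the versions you install:

 <source>
   type sqs
   queue_name log-events
   aws_key_id YOUR_ACCESS_KEY
   aws_sec_key YOUR_SECRET_KEY
   tag sqs.logs
 </source>

 <match sqs.**>
   type elasticsearch
   host es.internal.example.com
   port 9200
   logstash_format true
   flush_interval 10s
 </match>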

At this point, we have replaced the majority of our logstash cluster with fluent, leaving only a few logstash instances behind to handle legacy nodes that are still using the logstash-forwarder/lumberjack protocol, as fluent doesn't have a lumberjack input plugin. We replaced a cluster of 8 logstash instances with 3 fluent instances, and it is handling as much or more throughput than logstash used to. In about a month, once the legacy nodes still on logstash-forwarder are switched over to pushing logs to SQS via fluent, we will ditch logstash for good.

Thursday, March 27, 2014

Something I learned Integrating OpenStack with Active Directory

So during the last couple of weeks, I've been testing and migrating my OpenStack environment to use Active Directory as the backend identity provider for Keystone (the identity component of OpenStack). The process itself is actually fairly simple and well documented in the OpenStack documentation. But there were a few gotchas that took a little head scratching and googling before I figured them out. Here is what I learned in the process; maybe it will be useful for somebody trying to do the same:

Havana vs Grizzly

If you're still running the Grizzly release or older, you probably want to get to Havana before trying to integrate your system with Active Directory or LDAP in general, because before Havana there is no separation between identity (authentication) and authorization (project and group memberships and roles). So you either have to allow keystone to write to your LDAP server, or manage all of the OpenStack-related information through the LDAP server. Basically, what it amounts to is that you can't just "plug and play" your AD server into keystone--you'll need to do some schema planning before it can be used.

In the Havana release, you can use separate backends for identity and authorization, meaning you can plug keystone right into AD without modifying AD's schema (well, almost--see the service account section below), and still use your SQL server for authorization.
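In keystone.conf, that split looks roughly like this (a sketch based on my reading of Havana's driver names; double-check the class paths against your release):

 [identity]
 driver = keystone.identity.backends.ldap.Identity

 [assignment]
 driver = keystone.assignment.backends.sql.Assignment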

If you have been using the SQL backend for authentication

If you have already set up your environment to use the SQL backend, probably due to the limitations of the LDAP backend in releases prior to Havana, there are a couple of things you'll want to watch out for when switching to the LDAP backend:

While there is no problem leaving your old SQL-backend users as members of your projects/tenants, they will not resolve once the identity backend is switched, unless the user id of the LDAP backend user matches exactly the one stored in your SQL database. That normally wouldn't cause any problem unless you're trying to list membership for a project/tenant, which I believe is what the dashboard does when you try to modify projects/tenants.

My recommendation would be to either record what you have in project/tenant and role assignments and remove the existing entries prior to switching the identity backend, or, if you feel adventurous, go find the corresponding LDAP entry for every user_id that currently exists in the SQL backend and swap them out in the "user_project_metadata" table in the Keystone database.
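If you do take the adventurous route, the swap is conceptually a per-user UPDATE in the keystone database. A rough sketch (the table and column names reflect my understanding of the Havana schema; back up the database and verify the columns before touching anything):

 UPDATE user_project_metadata
    SET user_id = '<LDAP backend user id>'
  WHERE user_id = '<old SQL backend user id>';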

Set up your service users

Switching the identity backend means ditching your existing backend, which means all your existing users are no longer accessible, INCLUDING YOUR SERVICE USERS!

You'll need to set up your existing service users (nova, glance, cinder, swift, etc.) on your LDAP server before you switch your identity backend; otherwise you'll run into some strange problems because the OpenStack services can no longer talk to the identity service to confirm authorization for operations.
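Depending on how the user IDs line up, you may also need to re-grant those service users their roles after the switch, since role assignments are keyed by user id. With the old keystone CLI that looks roughly like this (the tenant and role names here match a typical packstack install; verify yours):

 keystone user-role-add --user nova --tenant services --role admin
 keystone user-role-add --user glance --tenant services --role admin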

Use Active Directory Global Catalog

If your Active Directory or LDAP server sends out referral references, python-ldap puts them in the search results as a tuple of the form (None, ['referral_uri']).

In Havana, Keystone does not take this into consideration, and will fail with a message 

AttributeError: 'list' object has no attribute 'iteritems'


because it's expecting the second value in the tuple to be a dictionary. This problem will be fixed in the next (Icehouse) release, as seen from this github commit. But if you're on Havana, you either have to patch it yourself (lines 602-604 of core.py in that github commit), or, if you're using Active Directory, talk to the Global Catalog ports (3268/3269) instead of the standard LDAP ports (389/636), since the Global Catalog does not send out referral references.
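Concretely, that just means pointing the [ldap] url in keystone.conf at the Global Catalog port (the hostname here is a placeholder):

 [ldap]
 url = ldap://ad.example.com:3268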

Set page_size

If you have enough users in your LDAP/AD, non-paged LDAP calls will get a 'Size limit exceeded' error. You'll need to set the "page_size" option in keystone's [ldap] section to make sure you don't run into this problem.
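For example, in keystone.conf (1000 matches Active Directory's default MaxPageSize; anything at or below your server's limit will do):

 [ldap]
 page_size = 1000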


Make sure your python-ldap is version 2.3

Due to a change in python-ldap, paged queries have a different API in version 2.4, and it is not backward compatible. If for some weird reason (you probably won't hit this; I only stumbled on it because of a bizarre case in my test environment) your python-ldap is version 2.4 or above (check with 'pip show python-ldap'), you'll want to downgrade it to version 2.3. Otherwise you'll see "AttributeError: 'module' object has no attribute 'LDAP_CONTROL_PAGE_OID'" showing up in your logs.
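The check and the downgrade are both one-liners:

 pip show python-ldap
 pip uninstall python-ldap
 pip install "python-ldap<2.4"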

Thursday, September 26, 2013

The Wolf's Adventure in OpenStack Land, Part 2: Tangled in the Network

Like I said in the last post, getting OpenStack to work is fairly easy if you're running RHEL/CentOS/Oracle/Scientific Linux. Aside from the automagic script, packstack, OpenStack also provides step-by-step instructions for manual installation. Personally, I'd recommend you go through it at least once, because it'll give you a much better understanding of how the components of OpenStack are connected.

The advantage of using packstack, though, is that it'll take care of setting up the database (MySQL) and the AMQP broker (Qpid) and hooking all the components up to them correctly, so you don't have to do that manually (or write your own script to do it).

Now that the foundation is set, here comes the interesting part: networking.

Unless you omitted the

 --os-quantum-install=n  

flag during your packstack run, the default network manager it sets up will be Nova-Network's FlatDHCPManager in a single-host setup. If you chose to install Quantum (or Neutron, depending on the release you have), then Quantum/Neutron will be your default network manager. While you can build much more complex network topologies with Quantum/Neutron, there are several reasons you might not want to use it in production:

  • Lack of Active/Active High Availability - As of the Grizzly release, there is no equivalent of multi_host mode (we'll cover this network option later), so you either have to run the network node on a compute node (which defeats the purpose of a dedicated network node), or put in extra hardware just to be a standby whose power and bandwidth you can't even utilize (there's no active/active support for Quantum/Neutron just yet), or run the risk of your entire infrastructure depending on a single point of network failure, where all your VMs lose access to the public network when the network node goes down. I for one am not comfortable with that.
  • Available Bandwidth - When using Quantum/Neutron, your VMs' traffic to the public network is routed through your network node. So if you have VMs with high traffic to the outside network, you can't utilize the bandwidth of each compute node's network connection; instead, it all funnels through your network node. It may not be a big deal in most cases, but I just don't like the idea that my packets have to take an unnecessary hop before they get onto the network.

So for those reasons, I would personally recommend against running Quantum/Neutron in a production environment and would stick with Nova-Network, at least for now.

Nova-Network ships with three network managers:

  • Flat Network Manager (FlatManager)
  • Flat DHCP Network Manager (FlatDHCPManager)
  • Vlan Network Manager (VlanManager)

Rather than trying to explain it all myself, I'll point out that the folks at Mirantis have some very good blog articles about how these different network managers work and how to set them up: FlatManager and FlatDHCPManager, Single Host FlatDHCPManager Network, and VlanManager for larger deployments.

The articles' coverage ranges from small, simple networks to large deployments utilizing VLANs. But they miss the very thing that applies to my situation: my environment is probably best described as a small to medium sized private cloud deployment, where I'm not at a large enough scale to worry about broadcast domain size or tenant counts, yet I would like to configure the system in multi_host mode to eliminate the single point of failure. What I need is a Flat DHCP network setup with multi_host enabled.
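In nova.conf terms, the goal boils down to something like this (just a sketch of the two key flags, assuming Grizzly-era option names):

 network_manager=nova.network.manager.FlatDHCPManager
 multi_host=True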

The official documentation is rather brief and has some errors that you need to correct by trial and error, and Internet documentation on how to set this up is surprisingly nonexistent. So, to save you a good 30 minutes of trial and error, I'll walk through how to set up your system in multi_host mode, more specifically, how to go from a default packstack install to multi_host DHCP mode, in my next post. So, stay tuned. ;-)

Tuesday, September 24, 2013

The Wolf's Adventure in OpenStack Land, Part 1: Meet the Grizzly

Prologue

Times are changing, and so is technology. Infrastructure management these days is nothing like it was just a decade ago. Every institution moving along with the current wave of technological revolution is looking to get involved with the five-letter word: CLOUD.

Being someone in charge of a moderately sized infrastructure inside an institution, I too am caught in the tidal wave. After studying the use cases that could benefit my infrastructure, I decided to roll it over to an OpenStack-based environment.

OpenStack, the most popular open source cloud software suite, has tons of features and great community support. Even though it provides detailed and thorough documentation, it can still be a bit overwhelming and chaotic, even for veteran sysadmins who have spent years in traditional infrastructure management.

This is the record of my adventure, along with the lessons I've learned, during the quest of rolling my existing infrastructure from the old brick-and-mortar environment over to OpenStack. Getting OpenStack up and running is actually pretty easy. But getting OpenStack to run the way you want it to takes some effort and an understanding of how OpenStack works. Having gone through that entire process myself, I hope this adventure log will be of some use to admins who are new to OpenStack and in the same boat I was, trying to roll their entire infrastructure over to an OpenStack-based environment.

Adventure 1

First things first: as much as I like Ubuntu-based systems, most of my servers are equipped with LSI controllers, and it's a known fact that LSI doesn't play well with Ubuntu/Debian. Since I was rolling my infrastructure over from a completely different platform (Solaris 10/11), I didn't have any particular urge to marry one distro or another. After a brief uphill battle trying to get LSI working on Ubuntu, I settled on CentOS/RHEL/Oracle/Scientific Linux. (Yes, I probably could have gotten LSI working on Ubuntu if I'd spent a little more time, but Ubuntu doesn't give me any significant advantage, nor do I want to run my entire production environment on a "hack.") So this adventure is all about running OpenStack on RHEL-based systems.

Getting OpenStack up and running is fairly easy to do, thanks to RDO, the OpenStack distribution built specifically for RedHat-based distros. Simply follow the Quick Start Guide and you'll have an all-in-one setup of OpenStack running in no time. You can play around with it to get familiar with the interface, the API, and the concepts in general. When it comes to a production environment, however, the all-in-one mode is definitely not the preferred choice. Instead, you'd want to separate your controller node from your compute nodes, so you'd probably do something like:

 packstack --install-hosts=10.0.0.2,10.0.0.3 --os-quantum-install=n  

Assuming:
  • Controller node IP 10.0.0.2
  • Compute node IP 10.0.0.3
Packstack will ssh to each IP as root (I'm a little annoyed by this, but since these machines will never host applications, only VMs, I'll let it slide. If you disable ssh root login by default like I do, you can always enable it temporarily and disable it again later.)

While we're at it, might as well throw in the ntp setting:

 packstack --install-hosts=10.0.0.2,10.0.0.3 --os-quantum-install=n --ntp-servers=pool.ntp.org  

Allow the command to run to completion, and it should finish a single-host FlatDHCP setup (I'll explain that later) with one controller node (10.0.0.2) and one compute node (10.0.0.3).

It's preferable to run this on a minimal install of CentOS/RHEL/SL; a previously installed Qpid or MySQL on the controller could cause part of the Puppet run to fail.

After the installation completes, you'll find a file named keystonerc_admin in /root that contains your admin password. You can use it to log into the web interface running on your controller node's port 80, or just source it to play with the command line clients.
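For example, from the controller node, a quick way to confirm the services are answering:

 source /root/keystonerc_admin
 keystone user-list
 nova list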

Now the introduction to Grizzly (the codename for the current stable release as of this writing) is complete. Next time we'll talk a little more about the networking elements of OpenStack so we can arrive at a finalized packstack command.