04/07/2014

April Showers Bring More Talks to Yelp

Yelp is hosting many exciting talks this month! Let’s take a closer look at two, but be sure to check out all of the links below to see what’s in the pipeline.

What does it take to monitor an architecture that is continually in flux and being changed by hundreds of developers? A seriously flexible monitoring framework: Sensu! Kyle Anderson, a Site Reliability Engineer here at Yelp, will give a tour of how Sensu works at Yelp. He’ll explain how Sensu hooks into Yelp’s service-oriented architecture to empower individual development teams to monitor their own services in real time. He’ll also discuss the difficulty of monitoring servers in AWS, and how Sensu can be used to track them.

Yelp hosts this talk, put on by San Francisco Dev Ops, tomorrow April 8th!

Next up is a talk about RxJava. Functional reactive programming is an interesting, relatively new paradigm that models application state as a set of dependencies updated via a stream of data. As a concrete example, one way of building a UI that depends on mouse clicks is to write a callback function. When the mouse is clicked, the function is called to do some processing and optionally update the UI. However, callbacks can quickly get complicated, especially when they involve multiple threads that may interact with each other.

The reactive alternative treats all mouse clicks as an iterable stream of events (an Observable, in Rx jargon). An Observer can apply functional constructs -- such as filter, map, or zip -- to the stream of clicks and can in turn update other Observable objects. For examples involving real code, check out the blog post by Netflix.

Sound cool? Then come check out Greg Benson, a software engineer on the Android @ Netflix team, as he presents on incorporating reactive programming into your Android apps with RxJava. Sign up for this April 10th presentation at San Francisco Android Livecode!

  • Tuesday, April 8, 2014 - 6:00PM - Sensu @ Yelp: A Guided Tour (San Francisco Dev Ops)
  • Wednesday, April 9, 2014 - 6:00PM - Java 8 Lambda Expressions & Streams (Java User Group)
  • Thursday, April 10, 2014 - 7:00PM - Reactive Programming on Android Using RxJava with Greg Benson! (San Francisco Android Livecode)
  • Thursday, April 17, 2014 - 6:30PM - Working Hard, or Artily Twerking Kongsciousness (Designers + Geeks)
  • Tuesday, April 22, 2014 - 6:30PM - Women Who Code: Lightning Talks (Women Who Code)
  • Wednesday, April 23, 2014 - 6:45PM - Best Practices For Lean User Acquisition and Engagement (Products That Count)
  • Thursday, April 24, 2014 - Large-Scale Machine Learning with Apache Spark (SF Machine Learning)
  • Wednesday, April 30, 2014 - 6:15PM - Django 1.7 and You (Django Meetup Group)

03/31/2014

MySQL replication, error 1159, and why you might want to upgrade

At Yelp, we use MySQL replication to provide the performance we need to serve our 120 million monthly unique visitors. In this post Josh Snyder walks us through some of the internals of MySQL’s replication system, and discusses an issue that we saw in production.


MySQL replication and the slave I/O thread’s purpose

On a MySQL slave, the task of replication is split between two threads: the I/O and SQL threads. The I/O thread’s job is to download the log of events executed on the master and write them to disk in the relay log. The SQL thread then reads and executes the downloaded events on the slave. For the purposes of this post we’ll discuss the I/O thread.
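
If you want to see the two threads side by side, SHOW SLAVE STATUS reports on both of them; the output below is illustrative and trimmed to the fields that matter here:

mysql> SHOW SLAVE STATUS\G
          Slave_IO_State: Waiting for master to send event
        Slave_IO_Running: Yes
       Slave_SQL_Running: Yes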

The slave I/O thread connects to a master server using the MySQL protocol, just like any other client. It executes certain housekeeping queries on the master, and immediately afterward asks for a dump of the binary log starting at a specific log file and position. Once the slave I/O thread requests the binlog dump, the connection will be devoted entirely to the binlog stream until the TCP stream is closed. This means that the results of the housekeeping queries mentioned above will need to be cached for use during the entire lifetime of the TCP stream. Practically speaking this is not a problem, because it’s very cheap to break and reestablish the connection.

What happens when the connection breaks?

Nothing right away. The MySQL slave can’t recognize the difference between silence due to a lack of events in the binary log and silence due to connection issues.

The slave I/O thread can take a guess, though, which is where slave_net_timeout and MASTER_HEARTBEAT_PERIOD come in.

The manual describes slave_net_timeout as “the number of seconds to wait for more data from the master before the slave considers the connection broken, aborts the read, and tries to reconnect.” This is correct, but the actual behavior is somewhat more subtle. As implemented, slave_net_timeout is the number of seconds that any read on the slave I/O socket will block before timing out. It’s passed directly to setsockopt(2) as SO_RCVTIMEO. Therefore, if the I/O thread ever spends more than slave_net_timeout seconds reading from its socket, the read will time out and an error will be returned. MySQL handles this error internally as ER_NET_READ_INTERRUPTED, and its ultimate reaction is to terminate the connection and initiate a new one.
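
The variable can be inspected and changed from any client; a minimal sketch looks like this (the 60-second value is only an example, and the I/O thread only picks up the new value when it next connects):

mysql> SHOW GLOBAL VARIABLES LIKE 'slave_net_timeout';
mysql> SET GLOBAL slave_net_timeout = 60;
mysql> STOP SLAVE IO_THREAD;
mysql> START SLAVE IO_THREAD;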

A small value of slave_net_timeout could then pose a problem. If slave_net_timeout is two seconds, MySQL will terminate and reconnect the replication stream whenever the master spends more than two seconds without publishing an event to its binary log. To remedy this, the slave can request that the master send periodic heartbeats using the MASTER_HEARTBEAT_PERIOD replication setting (which defaults to half of slave_net_timeout). If the master heartbeat period elapses and the master has no events to send, it will insert a Heartbeat_log_event into the binary log stream. These events go over the wire, but are never written to disk; they don’t even get recorded in the slave’s relay log.
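
The heartbeat period itself is configured on the slave with CHANGE MASTER TO, which requires replication to be stopped first; a quick sketch, with 15 seconds as an arbitrary choice:

mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_HEARTBEAT_PERIOD = 15;
mysql> START SLAVE;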

Heartbeat events serve to “fill the silence” created by an idle master. Busy masters (i.e. those that always have an event to send within one heartbeat period) will never send heartbeat events. This can be verified on a slave by checking that the Slave_received_heartbeats status variable is never incremented when the master is busily accepting writes.
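
Since Slave_received_heartbeats is a global status variable, that check is a one-liner on the slave; if the counter stays flat while the master is busy, no heartbeats are being sent:

mysql> SHOW GLOBAL STATUS LIKE 'Slave_received_heartbeats';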

What does a slave reconnection look like?

I set up a master-slave MySQL pair within Docker containers, with an active binary log stream coming down from the master. I set slave_net_timeout to 30 seconds and then broke replication by removing the replication TCP flow from the tables of my stateful firewall.

Using iptables/conntrack to break a replication connection

First, set up a firewall rule that uses conntrack to disallow all invalid TCP sessions:

root@mysql_master # iptables -A INPUT -p tcp --dport 3306 -m state --state INVALID -j LOG
root@mysql_master # iptables -A INPUT -p tcp --dport 3306 -m state --state INVALID -j DROP

Be sure to disable conntrack’s TCP loose mode, which will otherwise allow TCP ACKs to establish valid connections:

root@mysql_master # echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose

Then simply delete the conntrack entry for MySQL replication, and our stateful firewall will block any MySQL TCP sessions that were in flight:

root@mysql_master # conntrack -D -p tcp --dport 3306

As expected, the slave I/O thread stopped advancing once the connection broke. Seconds_Behind_Master displays zero, because the slave SQL thread is not “actively processing updates”. Nothing of consequence happens until slave_net_timeout elapses.

Once that happens, MySQL reconnects and Seconds_Behind_Master spikes to...wait what?

Seconds_Behind_Master: 1634
...
Last_IO_Errno: 0

How did that happen? We spent only the thirty-second slave_net_timeout disconnected from the master host, but the slave SQL thread thinks it’s a whopping 1634 seconds behind. This is due to MySQL bug #66921, discussed by Domas Mituzas here. Succinctly, when the connection is reestablished, the slave SQL thread re-executes the format description event from the master’s binary log, noting its timestamp as the latest executed event. Seconds_Behind_Master becomes equal to the age of the format description event in the master’s binary log file, which in this case was 1634 seconds.
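
If you’re curious what that format description event looks like, it sits at the very start of every binary log file. Something like the following shows it on the master (the file name here is just an example); to see its timestamp, and therefore its age, you’d need mysqlbinlog:

mysql> SHOW BINLOG EVENTS IN 'mysql-bin.000003' LIMIT 1;
-- the first event in every binlog file is of type Format_desc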

MySQL doesn’t bother to print to the error log that the I/O thread recovered from a connection failure. It considers these kinds of issues routine, so it gracefully recovers without comment. This presents an issue for a database administrator who would like to know how frequently reconnects are occurring. So far the only way I’ve found to gather this information is to interrogate the master about the age of its ‘Binlog Dump’ threads. [1]

select id, host, time_ms, date_sub(now(), interval time second) as started
from information_schema.processlist
where command = 'Binlog Dump'
order by time_ms;

In some cases, the I/O thread’s attempt to reconnect will fail. This will cause a cheery error message to appear in the error log:

[ERROR] Slave I/O: error reconnecting to master 'repl@169.254.0.5:3306' - retry-time: 60 retries: 86400, Error_code: 2003

Error code 2003 is the familiar “Cannot connect to MySQL server” error. Replication isn’t currently happening, but MySQL will gracefully handle the situation by continuing its attempts to reconnect to the master [2]. The error message means that the slave I/O thread will reconnect at 60 second (MASTER_CONNECT_RETRY) intervals, and will persist in doing so 86400 (MASTER_RETRY_COUNT) times.
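
Both knobs can be tuned. MASTER_CONNECT_RETRY is a CHANGE MASTER TO option; the retry count, on the other hand, is the --master-retry-count server option in the 5.5 series and only became a CHANGE MASTER TO option (MASTER_RETRY_COUNT) in 5.6. A sketch, assuming a 5.6 or newer slave and reusing the defaults from the error message above:

mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_CONNECT_RETRY = 60, MASTER_RETRY_COUNT = 86400;
mysql> START SLAVE;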

An annoying outage

On one occasion, packet loss due to a network provider issue caused some of Yelp’s MySQL 5.5 series databases to print different and scarier messages:

[ERROR] Slave I/O: The slave I/O thread stops because a fatal error is encountered when it try to get the value of SERVER_ID variable from master. Error: , Error_code: 1159
[ERROR] The slave I/O thread stops because SET @master_heartbeat_period on master failed. Error: Error_code: 1593

perror informs us that these errors are network timeout and fatal slave errors, respectively.

$ perror 1159 1593
MySQL error code 1159 (ER_NET_READ_INTERRUPTED): Got timeout reading communication packets
MySQL error code 1593 (ER_SLAVE_FATAL_ERROR): Fatal error: %s

In both cases, MySQL did not gracefully handle the situation. The slave I/O thread did not attempt to reconnect: it was stone dead. A human had to intervene to restart replication. These problems arose because the I/O thread wasn’t correctly handling errors when doing its start-of-connection housekeeping queries. Our investigation revealed that each error message implicates a different code site.

The first error, ER_NET_READ_INTERRUPTED, is already familiar to us. It’s the error that occurs when slave_net_timeout elapses without receiving any data from the master. In this case, the replication client executed SHOW VARIABLES LIKE 'SERVER_ID', and did not receive a response in time. As explained above, MySQL drops the connection and reconnects when it encounters this error during a binlog dump. At this specific code site, however, MySQL handles errors a little differently: instead of catching and handling ER_NET_READ_INTERRUPTED, the code checks for any network error. The code that matches the error codes looks like:

bool is_network_error(uint errorno)
{
  if (errorno == CR_CONNECTION_ERROR ||
    errorno == CR_CONN_HOST_ERROR ||
    errorno == CR_SERVER_GONE_ERROR ||
    errorno == CR_SERVER_LOST ||
    errorno == ER_CON_COUNT_ERROR ||
    errorno == ER_SERVER_SHUTDOWN)
    return TRUE;
  return FALSE;
}

As you can see, ER_NET_READ_INTERRUPTED is missing from this list. Because of this, MySQL decided to terminate the slave I/O thread instead of gracefully reconnecting.

The second error was similar. Setting the heartbeat period on the master is one of the housekeeping queries the slave runs before requesting a binlog dump. The query looks like SET @master_heartbeat_period=%s. Due to the way this query’s result was handled, any error at all would cause the death of the slave I/O thread.

We worked with Percona to get fixes for these two issues included in the Percona Server 5.5.36 release. The patch, written by Vlad Lesin, also includes numerous test cases to exercise this behavior.

Rubber Meets Road

I wanted to see Vlad’s change in practice, so I constructed a testbed to mimic packet loss. I chose some very aggressive settings in an attempt to get a high frequency of connection breakage.

On the master, I left my existing stateful firewall rules in place. I then emulated 60% packet loss:

tc qdisc add dev eth0 root netem loss 60%

On the slave, I set

SET GLOBAL slave_net_timeout=2;
CHANGE MASTER TO master_connect_retry=1;

I continuously wrote into the master at a rate of 5 queries/second, with each event being 217 bytes. This translates to 1085 bytes/second of binlog traffic.
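
The exact writes don’t matter much; something as simple as the following, issued about five times per second, keeps a steady trickle of events flowing into the binary log (the table and payload here are purely illustrative):

CREATE TABLE IF NOT EXISTS test.load_gen (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  created_at DATETIME NOT NULL,
  payload CHAR(64) NOT NULL
) ENGINE=InnoDB;

INSERT INTO test.load_gen (created_at, payload) VALUES (NOW(), REPEAT('x', 64));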

Using Percona Server 5.5.36 as a baseline, I used tcpdump to determine that these network conditions caused MySQL to reconnect 74 times over a five minute period. At no point did the I/O thread fail entirely. Percona Server 5.5.35 is a different story. Over a five minute period under the same conditions, MySQL reconnected 69 times. On five occasions the I/O thread failed completely:

3x [ERROR] Slave I/O: The slave I/O thread stops because SET @master_heartbeat_period on master failed. Error: , Error_code: 1593
2x [ERROR] Slave I/O: The slave I/O thread stops because a fatal error is encountered when it try to get the value of SERVER_ID variable from master. Error: , Error_code: 1159

Based on these results, I’d say the fix is effective. Moreover, it seems unlikely that there are any further bugs of this kind in the reconnection code.

In conclusion

I was very happy to have a checkout of MySQL handy while debugging this. In general, I’ve found the MySQL replication source code to be readable enough (if not simple), especially when you’re searching for a specific error code. Most importantly, though, I wouldn’t be able to live without the ability to spin up a throwaway MySQL instance for testing.

Footnotes

  1. [1] Well, I lied. I know of one other way: you can configure replication in a way that causes the slave I/O thread to print a warning to the slave’s error log when it connects. I’ve seen this, for instance, when a MySQL 5.6 replica connects to a MySQL 5.5 master.
  2. [2] Whether your application will gracefully handle the resulting replication delay is a different story entirely.

03/06/2014

March Events at Yelp HQ: Python Madness & More!

Love Python but can’t make it all the way to Montreal for PyCon this year? Never fear, Yelp is hosting a sneak peek of many of the talks! Instead of buying plane tickets, a hotel room, and a conference pass, you can kick back, drink beer, and have pizza with us! Next week’s Python meetups, both PyLadies and SF Python, will be hosting practice talks for the conference. Let’s dive into two of the talks: a novice session on machine learning, and an intermediate talk on working with external languages.

At a high level, Melanie Warrick will get us started on machine learning. “Big Data” and analytics don’t have to be just scary buzzwords. Python is a great language to start tinkering in this area, and Melanie will show us how! Join us March 11th at the PyLadies event to learn more. Note, this event is open to both PyLadies and PyGents!

On the low-level side of things, Christine Spang will describe best practices for invoking subprocesses and wrapping C code. Subprocesses can be used to run other binaries, shell processes, or even other Python processes you’d like to keep in a separate namespace. But understanding all the options is important to using that power effectively and correctly. “Dropping down” to C code can be an effective way to speed up a Python program, and Python provides idiomatic ways to interact with highly optimized libraries. SF Python will host this talk and others on March 12th.

And there’s plenty more where that came from! Lada Adamic, a data scientist at Facebook, will model information spread at the Products That Count meetup, and at Designers + Geeks we’ll hear about cutting-edge techniques to create narrative-driven experiences in the world around us.

See you at Yelp!
