mr mr cote

Educational, Investigative, and Absurd Writings by M. R. Côté

Phabricator and Lando Launched

| Comments

The Engineering Workflow team at Mozilla is happy to announce that Phabricator and Lando are now ready for use with mozilla-central! This represents about a year of work integrating Phabricator with our systems and building out Lando.

There are more details in my post to the dev.platform list.

A Vision for Engineering Workflow at Mozilla (Part One)

| Comments

The OED’s second definition of “vision” is “the ability to think about or plan the future with imagination or wisdom.” Thus I felt more than a little trepidation when I was tasked with creating a vision for my team. What should this look like? How do I scope it? What should it cover? The Internet was of surprisingly little help; it seems that either no one thinks about tooling and engineering processes at this level, or (perhaps more likely) they keep it a secret when they do. The best article I found was from Microsoft Research in which they studied how tools are adopted at Microsoft, and their conclusion was essentially that they had no overarching strategy.

Around six months later, I presented a Vision for Engineering Workflow at our fortnightly managers' meeting. But first, some context: a bit about Mozilla’s Engineering Workflow team, and about the challenges we face.

Engineering Workflow

The Engineering Workflow team was created in the Great Reorg of 2017, when, amongst other large changes, its predecessor, the Automation & Tools team (aka the A-Team) was split into two, with the part focussed more on test automation joining the newly formed Product Integrity org. The other half remained in the Engineering Operations org, along with a related team, managed by coop, that worked on the build and version-control systems. In February, these two teams were consolidated into a single team, with kmoir joining the team as a new manager while coop headed off to manage the Taskcluster team.

We named this new team “Engineering Workflow” to reflect that it is focussed on the first stages of the Firefox engineering pipeline, that is, tools and processes that most developers use on a day-to-day basis. Our mission is as follows:

The engineering workflow team exists to improve the quality, clarity, and efficiency of Firefox development through the integration and development of tools and automation.

More specifically, the major pieces of the engineering pipeline that we work on are

  • Tracking issues
  • Reviewing code
  • Landing code
  • Building Firefox

Just as importantly, there are many related systems that we don’t own. These include

  • Tests and test frameworks. As mentioned above, these are the responsibility of the Product Integrity org.
  • Release and update infrastructure. This is the domain of release engineering and release management.
  • Metrics related to product use. Although we are starting to collect our own metrics, data related to Firefox itself is collected and analyzed by the data and product teams.
  • Firefox Developer Experience (aka devtools). I mention this only because they have (or at least had) a similar name. This is the team that works on the developer tools that are shipped as part of Firefox.
  • Low-level tools. These tools are very product focussed, requiring intimate knowledge of the Firefox codebase and C++ development. This team is managed by Anthony Jones and is part of the Runtime Engineering group.
  • Products not built from mozilla-central. To allow us to focus (I seem to really love that word), we prioritize work to help developers working within the mozilla-central codebase. Many of our tools are also used by other teams (including ourselves!) but support requests from them are considered lower priority.

Of course, we can and do work with many of these other teams on joint ventures. Over time I would like to better coordinate our respective road maps to deliver even more impact to engineering at Mozilla.

Challenges

Mozilla is a unique place. Not only are we a nonprofit that works in the open, but we develop a massive application with contributors, both paid and volunteer, located all around the world. This means we also face unique challenges when it comes to figuring out what tools and automation to integrate, build, and/or improve to maximize impact. I’ll touch on a few here.

Diverse workflows and strong preferences

Tales of “religious wars” within software develop stretch back decades, so it is no surprise that many Mozillians have strong opinions about the way they prefer to work. What is less common is that Mozilla has generally shied away from defining official (or even officially supported) tools and processes. I won’t get into the merits of this approach, but it does impact tooling teams, who have to either support multiple workflows in their tools or unilaterally decide to prioritize some over others.

A few examples:

  • git versus modern hg versus mq. Not only are developers split across two VCSes, but even within Mercurial users there are differences (though thankfully mq usage seems to be much lower than a few years ago).
  • microcommits and commit series. Some developers tend to create a single patch per bug. A good number create a few patches per bug, sometimes as followups but often as one chunk of work. And there are a small number who, at least at times, create long series of commits, sometimes on the order of 20 to 40. Furthermore, despite the growing popularity of the commit-series philosophy, including at Mozilla, proper support in review tools remains rare.
  • importance of features in code-review and issue-tracking apps. Unsurprisingly, as developers spend much of their days working with bugs and code changes, they tend to get opinionated as to how the tools could be made better. It’s tricky to know which features to prioritize when both improving and migrating tools.

I am happy to report, however, that there is more and more support for consolidating workflows at Mozilla.

Engineering scale

Firefox is a huge application. A full Mercurial clone currently takes up 3.6 GB of disk space. Without Mercurial metadata the codebase, including build, test, and third-party libs and apps, contains over 245 000 files in more than 17 000 directories totalling almost 20 million lines of code. There aren’t too many projects the size of Firefox, open source or otherwise.

Unsurprisingly, since Firefox remains very active, there are a lot of changes going into the codebase: about 180 per day in April 2018. Contrast this with about 25-30 per day going into the Linux kernel. This also doesn’t count pushes to the try server for testing works in progress. April saw 210 compute years in our CI system.

Finally, we have complicated security requirements. Mozilla is open by design, with many tools and processes exposed to the public (and the public Internet!). Our approach to governance does not restrict positions of authority and responsibility to employees. These complexities and subleties can be seen in BMO’s permission system, which is much more fine-grained than what is built into most issue-tracking and code-review tools.

All these factors create difficult problems when integrating third-party tools into our systems. In addition, although we do this less than we used to, our scale means that there are not always existing solutions out there that meet our needs, requiring us to write custom applications that need to be highly scalable and secure.

Legacy

Related to the scale of Firefox development is its long legacy. The Mozilla Foundation is 20 years old, with the Netscape code dating back even further. Although Mozilla has grown dramatically as an entity, many workflows persist over the years, including the reviewing of patches in Bugzilla and the use of Mercurial queues. Understandably, when developers have used a certain workflow for many years, they are often skeptical of change. Yet newer contributors are more familiar with modern workflows, so modern tooling can help attract and retain both employees and volunteers, in addition to the various other advantages in terms of ergonomics, usability, and efficiency. Contending with these two perspectives can be difficult.

In addition to legacy workflows, we also have a number of legacy systems. Many of these systems continue to serve us well, and we are constantly making improvements to them. However, large-scale changes can be difficult, both because of the age of these systems and codebases, but also because over time they have been integrated with many other applications and used in ways we aren’t aware of and sometimes don’t expect. This makes planning challenging and requires a lot of communication.

Decision-making and responsibility

I’m happy to say that this set of challenges has seen the most improvement of the ones I’ve highlighted. I’ll mention them regardless as we can always be improving, and an understanding of our history helps.

Decision-making at Mozilla has been challenging for a number of reasons, mainly due to its history and rapid growth. In particular, there was a common view that we aimed for consensus on all major decisions, which, while well-intentioned, did not scale, and in fact was contradicted by both our management and module systems. This has led both to stalled decisions and sudden decisions that avoided discussions altogether. I’ve previously written about my perspectives and experiences in making decisions at Mozilla. Thankfully, as I also noted, this culture is changing, and making effective, reasoned decisions is getting easier.

Within my own team, or at least its previous incarnation as the A-Team, we experienced our own difficulties making decisions and prioritizing work. There were no clear lines of authority and responsibility when it came to tooling, which also contributed to our team becoming too service oriented. Again, this is changing for the better with existence of both Engineering Operations and Product Integrity, whose directors are peers of those of the product-focussed departments.

Thus ends my preamble on the context of developing a Vision for Engineering Workflow. In my next post, I’ll delve into the Vision itself.

Lando Demo

| Comments

Lando is so close now that I can practically smell the tibanna. Israel put together a demo of Phabricator/BMO/Lando running on his local system, which is only a few patches away from being a deployed reality.

One caveat: this demo uses Phabricator’s web UI to post the patch. We highly recommend using Arcanist, Phabricator’s command-line tool, to submit patches instead, mainly because it preserves all the relevant changeset metadata.

With that out of the way, fasten your cape and take a look:

More Lessons From MozReview: Mozilla and Microcommits

| Comments

There is a strong argument in modern software engineering that a sequence of smaller changes is preferable to a single large change. This approach facilitates development (easier to debug, quicker to land), testing (less functionality to verify at once), reviews (less code to keep in the reviewer’s head), and archaeology (annotations are easier to follow). Recommended limits are in the realm of 300-400 changed lines of code per patch (see, for example, the great article “How to Do Code Reviews Like a Human (Part Two)”).

400 lines can still be a fairly complex change. Microcommits is the small-patch approach taken to its logical conclusion. The idea is to make changes as small as possible but no smaller, resulting in a series of atomic commits that incrementally implement a fix or new feature. It’s not uncommon for such a series to contain ten or more commits, many changing only 20 or 30 lines. It requires some discipline to keep commits small and cohesive, but it is a skill that improves over time and, in fact, changes how you think about building software.

Former Mozillian Lucas Rocha has a great summary of some of the benefits. Various other Mozillians have espoused their personal beliefs that Firefox engineering would do well to more widely adopt the microcommits approach. I don’t recall ever seeing an organized push towards this philosophy, however; indeed, for better or for worse Mozilla tends to shy away from this type of pronouncement. This left me with a question: have many individual engineers started working with microcommits? If we do not have a de juror decision to work this way, do have a de facto decision?

We designed MozReview to be repository-based to encourage the microcommit philosophy. Pushing up a series of commits automatically creates one review request per commit, and they are all tied together (albeit through the “parent review request” hack which has understandably caused some amount of confusion). Updating a series, including adding and removing commits, just works. Although we never got around to implementing support for confidential patches (a difficult problem given that VCSs aren’t designed to have a mix of permissions in the same repo), we were pretty proud of the fact that MozReview was unique in its first-class support for publishing and reviewing microcommit series.

While MozReview was never designated the Firefox review tool, through organic growth it is now used to review (and generally land) around 63% of total commits to mozilla-central, looking at stats for bugs in the Core, Firefox, and Toolkit products:

To be honest, I was a little surprised at the numbers. Not only had MozReview grown in popularity over the last year, but much of its growth occurred right around the time its pending retirement was announced. In fact, it continued to grow slightly over the rest of the year.

However, we figured that, owing to MozReview’s support for microcommits, this wasn’t quite a fair comparison. Bugzilla’s attachment system discourages multiple patches per bug, even with command-line tools like bzexport. So we figured that, generally, a fix submitted to MozReview would have more parts than a corresponding fix submitted as a traditional BMO patch. Thus the bug-to-MozReview-request ratio would be lower than the bug-to-patch ratio. We ran a query on the number of MozReview requests per bug in about the last seven months. These results yielded further surprises:

About 75% of MozReview commit “series” contain only a single commit. 12% contain only two commits, 5% contain three, and 2.7% contain four. Series with five or more commits make up only 5.3%.

Still, it seems MozReview has perhaps encouraged the splitting up of work per bug to some degree, given that 25% of series had more than one commit. We decided to compare this to traditional patches attached to bugs, which are both more annoying to create and to apply:

Well then. Over approximately the same time period, of bugs with old-style attachments, 76% had a single patch. For bugs with two, three, and four patches, the proportions were 13%, 7%, and 1.5%, respectively. This is extremely close to the MozReview case. The mean is almost equal in both cases, in fact, slightly higher in the old-style-attachment case: 1.65 versus 1.61. The median in both cases is 1.

Okay, maybe the growing popularity of MozReview in 2017 influenced the way people now use BMO. Perhaps a good number of authors use both systems, or the reviewers preferring MozReview are being vocal about wanting at least two or three patches over a single one when reviewing in BMO. So we looked back to the situation with BMO patches in early 2016:

Huh. One-, two-, three-, and four-patch bugs accounted for 74%, 14%, 5%, and 2.6%, respectively.

For one more piece of evidence, this scatter plot shows that, on average, we’ve been using both BMO and MozReview in about the same way, in terms of discrete code changes per bug, over the last two years:

There are a few other angles we could conceivably consider, but the evidence strongly suggests that developers are (1) creating, in most cases, “series” of only one or two commits in MozReview and (2) working in approximately the same way in both BMO and MozReview, in terms of splitting up work.

I strongly believe we would benefit a great deal from making more of engineering’s assumptions and expectations clearer; this is a foundation of driving effective decisions. We don’t have to be right all the time, but we do have to be conscious, and we have to own up to mistakes. The above data leads me to conclude that the microcommit philosophy has not been widely adopted at Mozilla. We don’t, as a whole, care about using series of carefully structured and small commits to solve a problem. This is not an opinion on how we should work, but a conclusion on how we do work, informed by data. This is, in effect, a decision that has already been made, whether or not we realized it.

Although I am interested in this kind of thing from an academic perspective, it also has a serious impact to my direct responsibilities as an engineering manager. Recognizing such decisions will allow us to better prioritize our improvements to tooling and automation at Mozilla, even if it has to first precipitate a serious discussion and perhaps a difficult, conscious decision.

I will have more thoughts on why we have neither organically nor structurally adopted the microcommits approach in my next blog post. Spoiler: it may have to do with prevailing trends in open-source development, likely influenced by GitHub.

Phabricator and Lando November Update

| Comments

With work on Phabricator–BMO integration wrapping up, the development team’s focus has switched to the new automatic-landing service that will work with Phabricator. The new system is called “Lando” and functions somewhat similarly to MozReview–Autoland, with the biggest difference being that it is a standalone web application, not tightly integrated with Phabricator. This gives us much more flexibility and allows us to develop more quickly, since working within extension systems is often painful for anything nontrivial.

Lando is split between two services: the landing engine, “lando-api”, which transforms Phabricator revisions into a format suitable for the existing autoland service (called the “transplant server”), and the web interface, “lando-ui”, which displays information about the revisions to land and kicks off jobs. We split these services partly for security reasons and partly so that we could later have other interfaces to lando, such as command-line tools.

When I last posted an update I included an early screenshot of lando-ui. Since then, we have done some user testing of our prototypes to get early feedback. Using a great article, “Test Your Prototypes: How to Gather Feedback and Maximise Learning”, as a guide, we took our prototype to some interested future users. Refraining from explaining anything about the interface and providing only some context on how a user would get to the application, we encouraged them to think out loud, explaining what the data means to them and what actions they imagine the buttons and widgets would perform. After each session, we used the feedback to update our prototypes.

These sessions proved immensely useful. The feedback on our third prototype was much more positive than on our first prototype. We started out with an interface that made sense to us but was confusing to someone from outside the project, and we ended with one that was clear and intuitive to our users.

For comparison, this is what we started with:

And here is where we ended:

A partial implementation of the third prototype, with a few more small tweaks raised during the last feedback session, is currently on http://lando.devsvcdev.mozaws.net/revisions/D6. There are currently some duplicated elements there just to show the various states; this redundant data will of course be removed as we start filling in the template with real data from Phabricator.

Phabricator remains in a pre-release phase, though we have some people now using it for mozilla-central reviews. Our team continues to use it daily, as does the NSS team. Our implementation has been very stable, but we are making a few changes to our original design to ensure it stays rock solid. Lando was scheduled for delivery in October, but due to a few different delays, including being one person down for a while and not wanting to launch a new tool during the flurry of the Firefox 57 launch, we’re now looking at a January launch date. We should have a working minimal version ready for Austin, where we have scheduled a training session for Phabricator and a Lando demo.

Decisions, Decisions, Decisions: Driving Change at Mozilla

| Comments

As the manager responsible for driving the decision and process behind the move to Phabricator at Mozilla, I’ve been asked to write about my recent experiences, including how this decision was implemented, what worked well, and what I might have done differently. I also have a few thoughts about decision making both generally and at Mozilla specifically.

Please note that these thoughts reflect only my personal opinions. They are not a pronouncement of how decision making is or will be done at Mozilla, although I hope that my account and ideas will be useful as we continue to define and shape processes, many of which are still raw years after we became an organization with more than a thousand employees, not to mention the vast number of volunteers.


Mozilla has used Bugzilla as both an issue tracker and a code-review tool since its inception almost twenty years ago. Bugzilla was arguably the first freely available web-powered issue tracker, but since then, many new apps in that space have appeared, both free/open-source and proprietary. A few years ago, Mozilla experimented with a new code-review solution, named (boringly) “MozReview”, which was built around Review Board, a third-party application. However, Mozilla never fully adopted MozReview, leading to code review being split between two tools, which is a confusing situation for both seasoned and new contributors alike.

There were many reasons that MozReview didn’t completely catch on, some of which I’ve mentioned in previous blog and newsgroup posts. One major factor was the absence of a concrete, well-communicated, and, dare I say, enforced decision. The project was started by a small number of people, without a clearly defined scope, no consultations, no real dedicated resources, and no backing from upper management and leadership. In short, it was a recipe for failure, particularly considering how difficult change is even in a perfect world.

Having recognized this failure last year, and with the urging of some people at the director level and above, my team and I embarked on a process to replace both MozReview and the code-review functionality in Bugzilla with a single tool and process. Our scope was made clear: we wanted the tool that offered the best mechanics for code-review at Mozilla specifically. Other bits of automation, such as “push-to-review” support and automatic landings, while providing many benefits, were to be considered separately. This division of concerns helped us focus our efforts and make decisions clearer.

Our first step in the process was to hold a consultation. We deliberately involved only a small number of senior engineers and engineering directors. Past proposals for change have faltered on wide public consultation: by their very nature, you will get virtually every opinion imaginable on how a tool or process should be implemented, which often leads to arguments that are rarely settled, and even when “won” are still dominated by the loudest voices—indeed, the quieter voices rarely even participate for fear of being shouted down. Whereas some more proactive moderation may help, using a representative sample of engineers and managers results in a more civil, focussed, and productive discussion.

I would, however, change one aspect of this process: the people involved in the consultation should be more clearly defined, and not an ad-hoc group. Ideally we would have various advisory groups that would provide input on engineering processes. Without such people clearly identified, there will always be lingering questions as to the authority of the decision makers. There is, however, still much value in also having a public consultation, which I’ll get to shortly.

There is another aspect of this consultation process which was not clearly considered early on: what is the honest range of solutions we are considering? There has been a movement across Mozilla, which I fully support, to maximize the impact of our work. For my team, and many others, this means a careful tradeoff of custom, in-house development and third-party applications. We can use entirely custom solutions, we can integrate a few external apps with custom infrastructure, or we can use a full third-party suite. Due to the size and complexity of Firefox engineering, the latter is effectively impossible (also the topic for a series of posts). Due to the size of engineering-tools groups at Mozilla, the first is often ineffective.

Thus, we really already knew that code-review was a very likely candidate for a third-party solution, integrated into our existing processes and tools. Some thorough research into existing solutions would have further tightened the project’s scope, especially given Mozilla’s particular requirements, such as Mercurial support, which are in part due to a combination of scale and history. In the end, there are few realistic solutions. One is Review Board, which we used in MozReview. Admittedly we introduced confusion into the app by tying it too closely to some process-automation concepts, but it also had some design choices that were too much of a departure from traditional Firefox engineering processes.

The other obvious choice was Phabricator. We had considered it some years ago, in fact as part of the MozReview project. MozReview was developed as a monolithic solution with a review tool at its core, so the fact that Phabricator is written in PHP, a language without much presence at Mozilla today, was seen as a pretty big problem. Our new approach, though, in which the code-review tool is seen as just one component of a pipeline, means that we limit customizations largely to integration with the rest of the system. Thus the choice of technology is much less important.

The fact that Phabricator was virtually a default choice should have been more clearly communicated both during the consultation process and in the early announcements. Regardless, we believe it is in fact a very solid choice, and that our efforts are truly best spent solving the problems unique to Mozilla, of which code review is not.

To sum up, small-scale consultations are more effective than open brainstorming, but it’s important to really pay attention to scope and constraints to make the process as effective and empowering as possible.


Lest the above seem otherwise, open consultation does provide an important part of the process, not in conceiving the initial solution but in vetting it. The decision makers cannot be “the community”, at least, not without a very clear process. It certainly can’t be the result of a discussion on a newsgroup. More on this later.

Identifying the decision maker is a problem that Mozilla has been wrestling with for years. Mitchell has previously pointed out that we have a dual system of authority: the module system and a management hierarchy. Decisions around tooling are even less clear, given that the relevant modules are either nonexistent or sweepingly defined. Thus in the absence of other options, it seemed that this should be a decision made by upper management, ultimately the Senior Director of Engineering Operations, Laura Thomson. My role was to define the scope of the change and drive it forward.

Of course since this decision affects every developer working on Firefox, we needed the support of Firefox engineering management. This has been another issue at Mozilla; the directorship was often concerned with the technical aspects of the Firefox product, but there was little input from them on the direction of the many supporting areas, including build, version control, and tooling. Happily I found out that this problem has been rectified. The current directors were more than happy to engage with Laura and me, backing our decision as well as providing some insights into how we could most effectively communicate it.

One suggestion they had was to set up a small hosted test instance and give accounts to a handful of senior engineers. The purpose of this was to both give them a heads up before the general announcement and to determine if there were any major problems with the tool that we might have missed. We got a bit of feedback, but nothing we weren’t already generally aware of.

At this point we were ready for our announcement. It’s worth pointing out again that this decision had effectively already been made, barring any major issues. That might seem disingenuous to some, but it’s worth reiterating two major points: (a) a decision like this, really, any nontrivial decision, can’t be effectively made by a large group of people, and (b) we did have to be honestly open to the idea that we might have missed some big ramification of this decision and be prepared to rethink parts, or maybe even all, of the plan.

This last piece is worth a bit more discussion. Our preparation for the general announcement included several things: a clear understanding of why we believe this change to be necessary and desirable, a list of concerns we anticipated but did not believe were blockers, and a list of areas that we were less clear on that could use some more input. By sorting out our thoughts in this way, we could stay on message. We were able to address the anticipated concerns but not get drawn into a long discussion. Again this can seem dismissive, but if nothing new is brought into the discussion, then there is no benefit to debating it. It is of course important to show that we understand such concerns, but it is equally important to demonstrate that we have considered them and do not see them as critical problems. However, we must also admit when we do not yet have a concrete answer to a problem, along with why we don’t think it needs an answer at this point—for example, how we will archive past reviews performed in MozReview. We were open to input on this issues, but also did not want to get sidetracked at this time.

All of this was greatly aided by having some members of Firefox and Mozilla leadership provide input into the exact wording of the announcement. I was also lucky to have lots of great input from Mardi Douglass, this area (internal communications) being her specialty. Although no amount of wordsmithing will ensure a smooth process, the end result was a much clearer explanation of the problem and the reasons behind our specific solution.

Indeed, there were some negative reactions to this announcement, although I have to admit that they were fewer than I had feared there would be. We endeavoured to keep the discussion focussed, employing the above approach. There were a few objections we hadn’t fully considered, and we publicly admitted so and tweaked our plans accordingly. None of the issues raised were deemed to be show-stoppers.

There were also a very small number of messages that crossed a line of civility. This line is difficult to determine, although we have often been too lenient in the past, alienating employees and volunteers alike. We drew the line in this discussion at posts that were disrespectful, in particular those that brought little of value while questioning our motives, abilities, and/or intentions. Mozilla has been getting better at policing discussions for toxic behaviour, and I was glad to see a couple people, notably Mike Hoye, step in when things took a turn for the worse.

There is also a point in which a conversation can start to go in circles, and in the discussion around Phabricator (in fact in response to a progress update a few months after the initial announcement) this ended up being around the authority of the decision makers, that is, Laura and myself. At this point I requested that a Firefox engineering director, in this case Joe Hildebrand, get involved and explain his perspective and voice his support for the project. I wish I didn’t have to, but I did feel it was necessary to establish a certain amount of credibility by showing that Firefox leadership was both involved with and behind this decision.

Although disheartening, it is also not surprising that the issue of authority came up, since as I mentioned above, decision making has been a very nebulous topic at Mozilla. There is a tendency to invoke terms like “open” and “transparent” without in any way defining them, evoking an expectation that everyone shares an understanding of how we ought to make decisions, or even how we used to make decisions in some long-ago time in Mozilla’s history. I strongly believe we need to lay out a decision-making framework that values openness and transparency but also sets clear expectations of how these concepts fit into the overall process. The most egregious argument along these lines that I’ve heard is that we are a “consensus-based organization”. Even if consensus were possible in a decision that affects literally hundreds, if not thousands, of people, we are demonstrably not consensus-driven by having both module and management systems. We do ourselves a disservice by invoking consensus when trying to drive change at Mozilla.

On a final note, I thought it was quite interesting that the topic of decision making, in the sense of product design, came up in the recent CNET article on Firefox 57. To quote Chris Beard, “If you try to make everyone happy, you’re not making anyone happy. Large organizations with hundreds of millions of users get defensive and try to keep everybody happy. Ultimately you end up with a mediocre product and experience.” I would in fact extend that to trying to make all Mozillians happy with our internal tools and processes. It’s a scary responsibility to drive innovative change at Mozilla, to see where we could have a greater impact and to know that there will be resistance, but if Mozilla can do it in its consumer products, we have no excuse for not also doing so internally.

Phabricator Update

| Comments

tl;dr

  • Development Phabricator instance is up at https://mozphab.dev.mozaws.net/, authenticated via bugzilla-dev.allizom.org.
  • Development, read-only UI for Lando (the new automatic-landing service) has been deployed.
  • Work is proceeding on matching viewing restrictions on Phabricator revisions (review requests) to associated confidential bugs.
  • Work is proceeding on the internals of Lando to land Phabricator revisions to the autoland Mercurial branch.
  • Pre-release of Phabricator, without Lando, targeted for mid-August.
  • General release of Phabricator and Lando targeted for late September or early October.
  • MozReview and Splinter turned off in early December.

Work on Phabricator@Mozilla has been progressing well for the last couple months. Work has been split into two areas: Phabricator–Bugzilla integration and automatic landings.

Let me start with what’s live today:

Our Phabricator development instance is up at https://mozphab.dev.mozaws.net/. We’ve completed and deployed a Phabricator extension to use Bugzilla for authentication and identity; on our mozphab.dev instance, this is tied to bugzilla-dev.allizom.org. If you would like to poke around our development instance, please be our guest! Note that it is a development server, so we make no guarantees as to functionality, data preservation, and such, as with bugzilla-dev. Also, if you haven’t used bugzilla-dev in the last year or two (or ever), you’ll either need to log in with GitHub or get an admin to reset your password, since email is disabled on this server. Ping mcote or holler in #bmo on IRC. I’ll have a follow-up post on exactly what’s involved in using Bugzilla as an authentication and identity provider and how it affects you.

The skeleton of our new automatic-landing service, called Lando, is also deployed to development servers. While it doesn’t actually do any landings yet, the UI has been fleshed out. It pulls the current status of a “revision” (which is Phabricator’s term for a review request) and displays relevant details. It is currently pulling live data from mozphab.dev. This is what it looks like at the moment, although we will continue to iterate on it:

What we’re working on now:

The other part of Bugzilla integration is ensuring that we can support confidential revisions (review requests) in Phabricator tied to confidential bugs in a seamless way. The goal is to have the set of people who can view a confidential bug in Bugzilla be equal to the set of people who can view any Phabricator revisions associated with that bug. We knew that matching any third-party tool to Bugzilla’s fine-grained authorization system would not be easy, but Phabricator has proven even trickier to integrate than we anticipated. We have implemented the code that sets the visibility appropriately for a new revision, and we have the skeleton code for keeping it in sync, but there are some holes in our implementation that we need to plug. We’re continuing to dig into this and have set a goal to have a solid plan within two weeks, with implementation to follow immediately.

In parallel, within Lando we are working on the logic to take a diff from a Phabricator revision, verify the lander’s credentials and permissions, and turn it into a commit on the autoland branch of hg.mozilla.org. We have much of the first point done now, are consulting with IT on the best solution for the second, and will be starting work on the third shortly (which is actually the easiest, since we’re leveraging pieces of MozReview’s Autoland service).

Launch plans:

At the point that we have completed the Bugzilla-integration work described above, we’ll have what we need for a production Phabricator environment integrated with Bugzilla. This is planned for mid-August. We are calling this our pre-release launch, as Lando will not be complete, but we will be inviting some teams to try out Phabricator, to catch issues and frustrations before going to general release. Lando and the general rollout of Phabricator to all Firefox enginering will follow in late September or early October. We’ll have some brownbags to introduce Phabricator and our integrations, and we will ensure documentation is available and discoverable both for general Phabricator usage and our customizations, including automatic landings.

Due to the importance of the Firefox 57 release, Splinter and MozReview will remain functional but will be considered deprecated. New contributors should be directed to Phabricator to avoid the frustration of having to switch processes. Splinter will be turned off and MozReview will be moved to a read-only mode in early December.

More updates to follow!

Conduit's Commit Index

| Comments

As with MozReview, Conduit is being designed to operate on changesets. Since the end result of work on a codebase is a changeset, it makes sense to start the process with one, so all the necessary metadata (author, message, repository, etc.) are provided from the beginning. You can always get a plain diff from a changeset, but you can’t get a changeset from a plain diff.

Similarly, we’re keeping the concept of a logical series of changesets. This encourages splitting up a unit of work into incremental changes, which are easier to review and to test than large patches that do many things at the same time. For more on the benefits of working with small changesets, a few random articles are Ship Small Diffs, Micro Commits, and Large Diffs Are Hurting Your Ability To Ship.

In MozReview, we used the term commit series to refer to a set of one or more changesets that build up to a solution. This term is a bit confusing, since the series itself can have multiple revisions, so you end up with a series of revisions of a series of changesets. For Conduit, we decided to use the term topic instead of commit series, since the commits in a single series are generally related in some way. We’re using the term iteration to refer to each update of a topic. Hence, a solution ends up being one or more iterations on a particular topic. Note that the number of changesets can vary from iteration to iteration in a single topic, if the author decides to either further split up work or to coalesce changesets that are tightly related. Also note that naming is hard, and we’re not completely satisfied with “topic” and “iteration”, so we may change the terminology if we come up with anything better.

As I noted in my last post, we’re working on the push-to-review part of Conduit, the entrance to what we sometimes call the commit pipeline. However, technically “push-to-review” isn’t accurate, as the first process after pushing might be sending changesets to Try for testing, or static analysis to get quick automated feedback on formatting, syntax, or other problems that don’t require a human to look at the code. So instead of review repository, which we’ve used in MozReview, we’re calling it a staging repository in the Conduit world.

Along with the staging repository is the first service we’re building, the commit index. This service holds the metadata that binds changesets in the staging repo to iterations of topics. Eventually, it will also hold information about how changesets moved through the pipeline: where and when they were landed, if and when they were backed out, and when they were uplifted into release branches.

Unfortunately a simple “push” command, whether from Mercurial or from Git, does not provide enough information to update the commit index. The main problem is that not all of the changesets the author specifies for pushing may actually be sent. For example, I have three changesets, A, B, and C, and pushed them up previously. I then update C to make C′ and push again. Despite all three being in the “draft” phase (which is how we differentiate work in progress from changes that have landed in the mainline repository), only C′ will actually be sent to the staging repo, since A and B already exist there.

Thus, we need a Mercurial or Git client extension, or a separate command-line tool, to tell the commit index exactly what changesets are part of the iteration we’re pushing up—in this example, A, B, and C′. When it receives this information, the commit index creates a new topic, if necessary, and a new iteration in that topic, and records the data in a data store. This data will then be used by the review service, to post review requests and provide information on reviews, and by the autoland service, to determine which changesets to transplant.

The biggest open question is how to associate a push with an existing topic. For example, locally I might be working on two bugs at the same time, using two different heads, which map to two different topics. When I make some local changes and push one head up, how does the commit index know which topic to update? Mercurial bookmarks, which are roughly equivalent to Git branch names, are a possibility, but as they are arbitrarily named by the author, collisions are too great a possibility. We need to be sure that each topic is unique.

Another straightforward solution is to use the bug ID, since the vast majority of commits to mozilla-central are associated with a bug in BMO. However, that would restrict Conduit to one topic per bug, requiring new bugs for all follow-up work or work in parallel by multiple developers. In MozReview, we partially worked around this by using an “ircnick” parameter and including that in the commit-series identifiers, and by allowing arbitrary identifiers via the --reviewid option to “hg push”. However this is unintuitive, and it still requires each topic to be associated with a single bug, whereas we would like the flexibility to associate multiple bugs with a single topic. Although we’re still weighing options, likely an intuitive and flexible solution will involve some combination of commit-message annotations and/or inferences, command-line options, and interactive prompts.

Conduit Field Report, March 2017

| Comments

For background on Conduit, please see the previous post and the Intent to Implement.

Autoland

We kicked off Conduit work in January starting with the new Autoland service. Right now, much of the Autoland functionality is located in the MozReview Review Board extension: the permissions model, the rewriting of commit messages to reflect the reviewers, and the user interface. The only part that is currently logically separate is the “transplant service”, which actually takes commits from one repo (e.g. reviewboard-hg) and applies it to another (e.g. try, mozilla-central). Since the goal of Conduit is to decouple all the automation from code-review tools, we have to take everything that’s currently in Review Board and move it to new, separate services.

The original plan was to switch Review Board over to the new Autoland service when it was ready, stripping out all the old code from the MozReview extension. This would mean little change for MozReview users (basically just a new, separate UI), but would get people using the new service right away. After Autoland, we’d work on the push-to-review side, hooking that up to Review Board, and then extend both systems to interface with BMO. This strategy of incrementally replacing pieces of MozReview seemed like the best way to find bugs as we went along, rather than a massive switchover all at once.

However, progress was a bit slower than we anticipated, largely due to the fact that so many things were new about this project (see below). We want Autoland to be fully hooked up to BMO by the end of June, and integrating the new system with both Review Board and BMO as we went along seemed increasingly like a bad idea. Instead, we decided to put BMO integration first, and then follow with Review Board later (if indeed we still want to use Review Board as our rich-code-review solution).

This presented us with a problem: if we wouldn’t be hooking the new Autoland service up to Review Board, then we’d have to wait until the push service was also finished before we hooked them both up to BMO. Not wanting to turn everything on at once, we pondered how we could still launch new services as they were completed.

Moving to the other side of the pipeline

The answer is to table our work on Autoland for now and switch to the push service, which is the entrance to the commit pipeline. Building this piece first means that users will be able to push commits to BMO for review. Even though they would not be able to Autoland them right away, we could get feedback and make the service as easy to use as possible. Think of it as a replacement for bzexport.

Thanks to our new Scrum process (see also below), this priority adjustment was not very painful. We’ve been shipping Autoland code each week, so, while it doesn’t do much yet, we’re not abandoning any work in progress or leaving patches half finished. Plus, since this new service is also being started from scratch (although involving lots of code reuse from what’s currently in MozReview), we can apply the lessons we learned from the last couple months, so we should be moving pretty quickly.

Newness

As I mentioned above, although the essence of Conduit work right now is decoupling existing functionality from Review Board, it involves a lot of new stuff. Only recently did we realize exactly how much new stuff there was to get used to!

New team members

We welcomed Israel Madueme to our team in January and threw him right into the thick of things. He’s adapted tremendously well and started contributing immediately. Of course a new team member means new team dynamics, but he already feels like one of us.

Just recently, we’ve stolen dkl from the BMO team, where he’s been working since joining Mozilla 6 years ago. I’m excited to have a long-time A-Teamer join the Conduit team.

A new process

At the moment we have five developers working on the new Conduit services. This is more people on a single project than we’re usually able to pull together, so we needed a process to make sure we’re working to our collective potential. Luckily one of us is a certified ScrumMaster. I’ve never actually experienced Scrum-style development before, but we decided to give it a try.

I’ll have a lot more to say about this in the future, as we’re only just hitting our stride now, but it has felt really good to be working with solid organizational principles. We’re spending more time in meetings than usual, but it’s paying off with a new level of focus and productivity.

A new architecture

Working within Review Board was pretty painful, and the MozReview development environment, while amazing in its breadth and coverage, was slow and too heavily focussed on lengthy end-to-end tests. Our new design follows more of a microservice-based approach. The Autoland verification system (which checks users permissions and ensures that commits have been properly reviewed) is a separate service, as is the UI and the transplant service (as noted above, this last part was actually one of the few pieces of MozReview that was already decoupled, so we’re one step ahead there). Similarly, on the other side of the pipeline, the commit index is a separate service, and the review service may eventually be split up as well.

We’re not yet going whole-hog on microservices—we don’t plan, for starters at least, to have more than 4 or 5 separate services—but we’re already benefitting from being able to work on features in parallel and preventing runaway complexity. The book Building Microservices has been instrumental to our new design, as well as pointing out exactly why we had difficulties in our previous approach.

New operations

As the A-Team is now under Laura Thomson, we’re taking advantage of our new, closer relationship to CloudOps to try a new deployment and operations approach. This has freed us of some of the constraints of working in the data centre while letting us take advantage of a proven toolchain and process.

New technologies

We’re using Python 3.5 (and probably 3.6 at some point) for our new services, which I believe is a first for an A-Team project. It’s new for much of the team, but they’ve quickly adapted, and we’re now insulated against the 2020 deadline for Python 2, as well as benefitting from the niceties of Python 3 like better Unicode support.

We also used a few technologies for the Autoland service that are new to most of the team: React and Tornado. While the team found it interesting to learn them, in retrospect using them now was probably a case of premature optimization. Both added complexity that was unnecessary right now. React’s URL routing was difficult to get working in a way that seamlessly supported a local, Docker-based development environment and a production deployment scenario, and Tornado’s asynchronous nature led to extra complexity in automated tests. Although they are both fine technologies and provide scalable solutions for complex apps, the individual Conduit services are currently too small to really benefit.

We’ve learned from this, so we’re going to use Flask as the back end for the push services (commit index and review-request generator), for now at least, and, if we need a UI, we’ll probably use a relatively simple template approach with JavaScript just for enhancements.

Next

In my next post, I’m going to discuss our approach to the push services and more on what we’ve learned from MozReview.

Project Conduit

| Comments

In 2017, Engineering Productivity is starting on a new project that we’re calling “Conduit”, which will improve the automation around submitting, testing, reviewing, and landing commits. In many ways, Conduit is an evolution and course correction of the work on MozReview we’ve done in the last couple years.

Before I get into what Conduit is exactly, I want to first clarify that the MozReview team has not really been working on a review tool per se, aside from some customizations requested by users (line support for inline diff comments). Rather, most of our work was building a whole pipeline of automation related to getting code landed. This is where we’ve had the most success: allowing developers to push commits up to a review tool and to easily land them on try or mozilla-central. Unfortunately, by naming the project “MozReview” we put the emphasis on the review tool (Review Board) instead of the other pieces of automation, which are the parts unique to Firefox’s engineering processes. In fact, the review tool should be a replaceable part of our whole system, which I’ll get to shortly.

We originally selected Review Board as our new review tool for a few reasons:

  • The back end is Python/Django, and our team has a lot of experience working with both.

  • The diff viewer has a number of fancy features, like clearly indicating moved code blocks and indentation changes.

  • A few people at Mozilla had previously contributed to the Review Board project and thus knew its internals fairly well.

However, we’ve since realized that Review Board has some big downsides, at least with regards to Mozilla’s engineering needs:

  • The UI can be confusing, particularly in how it separates the Diff and the Reviews views. The Reviews view in particular has some usability issues.

  • Loading large diffs is slow, but also conversely it’s unable to depaginate, so long diffs are always split across pages. This restricts the ability to search within diffs. Also, it’s impossible to view diffs file by file.

  • Bugs in interdiffs and even occasionally in the diffs themselves.

  • No offline support.

In addition, the direction that the upstream project is taking is not really in line with what our users are looking for in a review tool.

So, we’re taking a step back and evaluating our review-tool requirements, and whether they would be best met with another tool or by a small set of focussed improvements to Review Board. Meanwhile, we need to decouple some pieces of MozReview so that we can accelerate improvements to our productivity services, like Autoland, and ensure that they will be useful no matter what review tool we go with. Project Conduit is all about building a flexible set of services that will let us focus on improving the overall experience of submitting code to Firefox (and some other projects) and unburden us from the restrictions of working within Review Board’s extension system.

In order to prove that our system can be independent of review tool, and to give developers who aren’t happy with Review Board access to Autoland, our first milestone will be hooking the commit repo (the push-to-review feature) and Autoland up to BMO. Developers will be able to push a series of one or more commits to the review repo, and reviewers will be able to choose to review them either in BMO or Review Board. The Autoland UI will be split off into its own service and linked to from both BMO and Review Board.

(There’s one caveat: if there are multiple reviewers, the first one gets to choose, in order to limit complexity. Not ideal, but the problem quickly gets much more difficult if we fork the reviews out to several tools.)

As with MozReview, the push-to-BMO feature won’t support confidential bugs right away, but we have been working on a design to support them. Implementating that will be a priority right after we finish BMO integration.

We have an aggressive plan for Q1, so stay tuned for updates.