Our Selenium to Cypress Journey

Here at ServiceTitan, we are always working to improve our automation test framework. Back in May 2020, we realized that our team had outgrown our existing Selenium framework, so a few of our key members got together and discussed what was next for us. Then came Cypress: an all-in-one JavaScript testing framework with a built-in assertion library, mocking, and stubbing, all without Selenium. As a team, we instantly fell in love with it and pushed forward on a plan for the transition. In this post, I am going to talk about all the fun and challenges we encountered as we converted our technology from Selenium to Cypress.

Our “Selenium” Issues

Changing tools or frameworks is never easy, but a good plan helps. Before we dive into the details of how we transitioned from Selenium to Cypress, let’s take a look at the issues with our existing Selenium framework.

Existing Framework: C# with Selenium WebDriver, bundled within the main ServiceTitan application codebase. Since our tests/framework reside within our main application, our test data generation objects were basically extended from some of the core application’s existing models and controllers. Our framework was designed to run our tests against a locally hosted/built application. The main approach of this architecture allowed each developer to run tests locally against their work branch before merging back to the main release/master branch. With that method, we reduced long regression test cycles.

Developing with Selenium

Our company is growing very quickly, with developers being hired at lightning speed, and with our current approach, scaling our tests had become a problem for the following reasons:

Expensive

Slow startup, setup, and teardown
High test maintenance costs

Unstable

Out-of-process communication
Dependence on waits, builder specs, and amount of data
The dependency on the main application creates a lot of flaky tests

Rigid

Not portable enough to run on a live environment
Dependency on the main application.

How Does Cypress Help?

We started looking at other frameworks based on our problems. We ran POCs with a few others, like Nightwatch.js and Puppeteer. But when we started to explore Cypress, we felt like we might have found a match! Given all the features provided by Cypress, let’s circle back to the problems I mentioned earlier:

Inexpensive

Cypress is fast to deploy and fast at executing tests
The Cypress debugger runs live in the browser, allowing for fast test updates and maintenance, and requires no external driver

Stable

Automatic waits and retries

Flexible

Can run on any environment by pointing tests at a new base URL
With options like ‘decoupling’ from the main app, tests are able to run against any environment

Cypress Architecture

Cypress uses a different architecture from Selenium. The Cypress engine operates directly inside the browser. In other words, it is the browser that executes your test code. It also means Cypress has native access to your Document Object Model (DOM) and all the web elements on your page, giving you absolute control.

from https://www.edgewordstraining.co.uk/cypress-vs-selenium/

At ServiceTitan, we established 4 basic principles for our QA automation engineers to follow when implementing UI tests in our Cypress framework:

Isolation: Keep each test flow on specific pages and isolate it from the rest of the application. Tests should not need to navigate outside of the target flow.
Separation of Concerns: Stub the backend service calls with mock data as much as possible.
Independent: Tests should not depend on each other. One test should not know of the existence of the others and, therefore, should not conflict with them when run in parallel.
Stateless: Tests should be able to restart at any time and shouldn’t depend on the state of test data.

All QA automation engineers follow the above principles when writing integration UI tests. This keeps the implementation clean and the support boundary for these tests well defined.

Scaling

One of the biggest drawbacks to building tests on our old Selenium framework was scalability and performance. We didn’t use Selenium Grid; we set up 20-core Windows servers as our test agents so that we could customize our framework to run tests in parallel. We also realized that once we ran more than 10 threads (10 parallel tests) at the same time, the test results were flakier than when we ran tests with less parallelism. Basically, we had to pick between test quality and test performance.

With Cypress, on the other hand, we do not need to sacrifice either test performance or test quality. Because we build our Cypress tests in JavaScript/TypeScript on top of Node.js, we can package all the tests and the framework and run them on low-cost virtual Linux-based agents instead of our expensive Windows servers. Test results are also much more stable.

Timeline and Results

We started our POC about 2 years ago, and today we are finally live, having fully replaced our old Selenium framework. The results of the switch are stunning. Let’s review the following improvements:

Within 18 months, we have built more than 3,000 UI tests, compared with about 800 tests in 4 years on our old framework.

The test performance is significantly faster, meaning we can run more regression cycles within a week.

And most important of all, we have happier and more motivated QA engineers.

Conclusion

The Cypress framework is far from perfect, and implementing and adopting it can still be challenging. There is currently no support for multiple tabs/browsers. It also has a smaller support community than Selenium. But these problems are small compared to the numerous benefits we have gained through the transition, and we are excited about the future.

I would like to give a shoutout to my team for all their hard work, especially Carlos S, Parin P, Michael R, and the entire ST Cypress-Council group. We would not be here without your hard work!

April 4, 2019

Year 1 at ServiceTitan

Another year is done and dusted! Time flies! I have worked at ServiceTitan for over a year now. Looking back at the QA organization at the time I joined and comparing it to now, we have made a lot of changes. So, it’s time to review all of them and grade the results.

Here are my observations from my first two months:

All the great stuff: (Highlights)

We have some of the most incredibly talented and hard-working engineers, QAs, and product managers I have ever worked with.
The overall testing process is good, and the organization sees great value in our QAs.
Developers, QAs, and PMs collaborate very well with each other.

Stuff that might need some help: (Opportunities)

The application produced a lot of bugs, mostly regressions, after each release.
There was a substantial automation test backlog.

The solution seems simple; we need to run regression tests consistently, and invest more in test automation! Well, what if you don’t have the time and resources to do both? From here, I will discuss some of the changes I made in attempting to address these two issues as listed. I will also walk through the outcome of these changes.

Organizational Decision

The team I inherited when I joined ServiceTitan had the following organizational structure:

This simple QA organization structure worked very well when we were small, and there was only a single product owner/development team. When the team started to grow, the QA team was running into a lot of scaling issues with this organization structure.

Why, you might ask? Here are the main concerns:

Manual testers didn’t do regression testing; the assumption was that automation QAs would automate all the test cases handed off to them as part of the CI/CD automation process. Because of this assumption, our regression test coverage was low, and we often released code with a lot of bugs.
With the ever-increasing complexity of our software design, automation QA engineers depended on manual testers to “explain” how the feature(s) worked (reading test cases alone, without understanding the use cases, is not enough to create good automation). Often, manual testers were too busy to accommodate the needs of the automation QAs, which caused a vast backlog and delays in the automation tests.

So after I studied all these problems carefully, I decided to roll out the following organizational changes.

Break down the silo between automation QA teams and manual testers, and create QA teams (squads) based on the functionality of the application, aligned with the Dev/Product teams. By doing this, we were able to identify QA leads for each group and provide growth opportunities.
Since the cost difference between a QA engineer and a manual tester is minimal, I decided that, going forward, we would only hire QA engineers who have experience in automation (who know how to code).
Create a new team called QA Framework. I hired senior developers into this team, whose primary function is to manage the test frameworks and train all the existing manual testers in automation.

Now the new structure looks like this:

Technical Decision

The ST core application is written in C# on the .NET Framework. Our automation framework is also written in C#, using NUnit with Selenium WebDriver. NUnit’s Parallelizable attribute allows us to run tests in parallel, which lets the test suite scale as we add more tests.

Performance-first approach: The depth of the tests is important, but as a team, we decided to make “test performance” the priority in our technical decisions. Adhering to this decision made technical trade-offs easy for the team, such as whether a test should be part of this framework, whether refactoring is needed, or whether to approve a test code PR. We also refactored test steps within cases to use Headless Chrome (no visual verification needed) instead of using WebDriver. This saves us 5% of execution time across all tests.

CI/CD ready: We always build our tests/framework on the premise that they need to be CI/CD ready. Tests can be run against each branch and PR, and anyone on the team should be able to start them.

Process Decision

Team Communication

Something simple: set up a weekly 15–30 minute team meeting for a quick check-in. It also serves as a forum for passing down leadership-level information to the team. This meeting became the primary communication channel for the overall team.

Introducing functional/Regression test plan

There was no test planning, as manual testers were treated more like a support team. Stories came into their queue on a first-come, first-served basis. There wasn’t any document that captured what we tested as a team and how we tested it.

The first thing a QA should do when testing code is to plan how to test. I picked a couple of the more experienced manual testers and automation QAs on the team and asked them to create a test plan template. As a group, we also developed a process for reviewing these test plans. After trying it with a couple of teams, we rolled it out to every QA.

Create QA onboarding training plan for new hires

Training material for new hires is critical. I spent my first six months coming up with a 25-point onboarding checklist of things that, had I known them when I was hired, would have put me in much better shape. The checklist ranges from which group mailing aliases and required Slack channels to join, to walking through the deployment process, to running your automation tests locally. I established eight goals that each new hire must complete within the first three months. I also provided a go-to contact within the team for each point of the checklist so that new hires can find help when needed.

Summary

As a result

Post Release Production Issues

This graph shows our production bugs post-release. In other places, we also call these “site issues”.

Automation Test Coverage:

Automation QA Engineer by Quarter:

Automation QAs finally outnumbered Manual Testers.

Nothing is perfect; some things are still not working as expected…

Communication within the team is still not perfect, but it’s improving.

We still can’t catch all the bugs! As our test coverage grows, so does the number of edge cases out there. Ideally, we would create a bug finder that keeps crawling our application code and chews up bugs like Pac-Man. That would be awesome!

Our process is still not perfect, just like every other place I worked.

Finally! We can improve “eat club” menu. J/K. I am pretty satisfied with the stuff that they provided, but don’t mind if they throw in a couple of lobsters as the appetizer.

January 9, 2019 (updated February 21, 2019)

Love and hate against the usage of SLACK!!

I love the invention of Slack; but at the same time, I hate it so much that I want to throw my phone away every time I hear that “ding” sound.

Slack is a very powerful tool, especially at work. It allows me to talk to people and get an instant response without leaving my desk or taking my hands off my computer. The conversations between me and my team members are recorded and easily searchable. Communication and work collaboration become easy with the call and screen share features.

Because it’s so easy to use, it makes me never stop working! My co-workers add me to all sorts of private and public channels, and it is quite overwhelming. The same discussions happen in multiple channels with different groups of people. At the end of the day, people end up not responding to any critical Slack messages or commenting on useful facts.

This is when I miss the good old days, when meetings ran on pen and paper.

September 3, 2018 (updated December 10, 2018)

Interview Questions for QA Automation Engineer

Throughout my career, I believe I’ve interviewed at least 500 candidates for QA positions. I’ve been “lucky” enough to hire some very bright engineers, who ended up having very successful careers. Now that I come to think about it… how did I find these bright engineers? I guess most people today would think “LinkedIn.” But I can tell you that my most successful hires have always come from referrals.

I believe that the screening and interview process is very important. I also believe that each hiring manager needs to create an interview plan and identify the “features” of the type of engineers they are looking for.

What do I mean by a “feature”? A feature is a set of traits that you believe will improve an engineer’s chance of being successful in the role. For me, I usually look for interviewees who love to listen to others, are patient, have good communication skills, and are good at the process of elimination.

Once I’ve identified the features I’m looking for, I need a hiring/interview plan.

Here you go:

Create a clear job description and define roles/responsibilities for the hiring role.
Identify a few key interviewers for the hiring panel.
Set up a good set of technical tests for candidates. I like to use HackerRank to set up my tests. One of the features of HackerRank that I really like is that you can create test cases and run each candidate’s solution against them.
Make sure you come up with a list of problems to pose to your candidates that can help you judge whether they have all the features you’re looking for.
Don’t forget to sell your team and vision to the candidates! Remember, the evaluation process goes both ways! You want to make sure the candidate likes what you’re offering.

Good luck with your next hire!

PS: I have a list of my favorite questions. DM me and I’ll share it with you!

February 26, 2018

My Experience Using the JIRA Cloud API to customize your release and quality data

I am sure a lot of us use JIRA for bug tracking, sprint planning, storytelling, feature logging, etc. Most of the QAs I know touch JIRA one way or another. I will walk you through how to connect to the JIRA Cloud API and share some of the pain points from my experience.

Before I list the steps, I expect you to know some JIRA basics: how to set up projects as an admin, create issues, and use JQL. If not, please view this video

or (https://confluence.atlassian.com/jira/jira-documentation-1556.html) to get help.

Assuming you already have JIRA Cloud set up, for example at http://yourproject.atlassian.net, and have an Atlassian cloud user account, here is how to connect to your JIRA Cloud API, step by step.

1. First, you need to encode your username (JIRA Cloud email login) and password in base64. Let’s say your JIRA Cloud login address is abc@hello.com and your password is 1234. Run: echo -n "abc@hello.com:1234" | base64 to get the encoded string for your basic JIRA API authentication. Make sure you use “-n”, because otherwise echo appends a trailing newline character, which would end up in the encoded string.

2. Now you can connect using this simple Ruby script:
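If you would rather stay in Ruby for this step, the same encoding can be sketched as below. This is only an illustration: the credentials are the example values from this step, and the variable names are mine, not anything JIRA requires.

```ruby
# Example credentials from step 1 (placeholders; replace with your own).
username = "abc@hello.com"
password = "1234"

# pack("m0") base64-encodes without adding any line breaks, so there is
# no trailing newline to strip (same effect as `echo -n ... | base64`).
encoded = ["#{username}:#{password}"].pack("m0")

# This is the value that goes into the Authorization header.
auth_header = "Basic " + encoded
puts auth_header
```

The resulting string is what you would paste into the script in the next step.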

require 'httparty'
require 'json'
require 'uri'

## Create your JQL query here
jql = 'text ~ "find X" and project = YOURPROJECT'
yourencoded = "" ## Put your encoded string from step 1 above

## Search URL, with the JQL query URL-encoded
jurl = "https://yourcompany.atlassian.net/rest/api/2/search?jql=" + URI.encode_www_form_component(jql)

## Send the search request with basic authentication
response = HTTParty.get(
  jurl,
  headers: {
    "Authorization" => "Basic " + yourencoded,
    "Content-Type" => "application/json"
  }
)

## Now loop through each issue from the search
result = JSON.parse(response.body)
result["issues"].each do |issue|
  ## Refer to the JIRA issue doc; you can do your logic here
end

3. You are done. I will put this in some kind of chart tool so you can create your own dashboard.
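As a rough idea of what the “do your logic here” part could look like before the data goes into a chart, here is a small sketch that counts issues per status. The sample response is made up and heavily trimmed, and count_by_status is a hypothetical helper of mine; only the issues / fields / status / name layout comes from the JIRA search response.

```ruby
require 'json'

# Made-up, trimmed-down search response; a real one carries many more fields.
sample_body = <<~BODY
  {"issues": [
    {"key": "QA-1", "fields": {"status": {"name": "Open"}}},
    {"key": "QA-2", "fields": {"status": {"name": "Done"}}},
    {"key": "QA-3", "fields": {"status": {"name": "Open"}}}
  ]}
BODY

# Hypothetical helper: count how many issues are in each status.
def count_by_status(issues)
  issues.each_with_object(Hash.new(0)) do |issue, counts|
    counts[issue.dig("fields", "status", "name")] += 1
  end
end

counts = count_by_status(JSON.parse(sample_body)["issues"])

# Print "status,count" rows that most chart tools can import as CSV.
counts.each { |status, total| puts "#{status},#{total}" }
```

In the real script you would pass result["issues"] to the helper instead of the sample data.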

February 8, 2018

Should I “dress up” for a software QA engineer interview?

One of my previous reports asked me recently, “Should I wear a tie for my interview?” (Sorry, my friend is a guy. More generally, he meant: should we dress up before we head to a QA interview?)

My simple answer is yes. “Dressing up” doesn’t necessarily mean you have to wear a tie, a suit, or a nice evening gown. “Dressing up” to me means business casual: something that is proper for the business environment of the company you are interviewing with.

A typical QA interview includes 4 major parts:

Personality test (whether you fit the team, able to work with others)
Technical test (mostly coding skill)
Testing process/development process test (tests your knowledge of testing and development, and whether you can prioritize tasks)
Problem solving skill

Dressing up before you go to an interview doesn’t guarantee you the job. But it gives your interviewer the impression that you respect the company and take the interview seriously. If you and another candidate both did very well on all 4 parts, and you dressed up and look more professional than the other candidate, you will have a slight advantage in getting the job.

Rule of thumb, dress to impress, but don’t over-dress.

January 20, 2018 (updated January 21, 2018)

How to make a QA team full of manual testers into automation engineers

This is a real question from one of my interviewers…

Hi Dickson,

Currently, I have a team full of manual QA testers, how do you convert them and change them into Automation test Engineers?

What’s in my head…

Fire them all and rehire real automation QA engineers
No, you can’t, but what you can do is have them work 16 more hours a day of unpaid overtime to complete all the test scenarios.
I automate everything myself while all the manual QA testers work on their foosball skills

That’s great. My potential hiring manager shakes my hand goodbye. The recruiter blacklists me everywhere: LinkedIn, Glassdoor, ZipRecruiter, Monster…

Okay, without even getting an offer, you are fired!

Joking aside, after more serious revision of my draft, here are the steps you can take as a QA manager to help turn a team into automation engineers:

Provide them with ALL the training necessary to get them started. That means you have to be their personal trainer. Every QA tester has a different technical background and level, and you need to find the right training program for each of them. For example, some of them already have basic knowledge, like setting up a DB, using git, or some coding/scripting skills. Some might not even have those. You will have to create different training and mentoring programs for each of your team members.
Show them that you can port an example case. A simple fact for the folks who have kids: if your kids see you wash your hands every time before a meal, eventually they will do the same thing after a few meals/days/weeks. As a leader, show that you can, and WANT to, do it yourself. Then good things will follow.
Provide a good starting automation framework. It has to be simple, easy to use, and easy for anyone to pick up. Look at all the current BDD test frameworks and discuss them with all your QAs as a group. Learn it together.
Encourage and praise their progress and results. Always measure each QA tester’s automation progress and provide feedback on where they might need more refinement.

Trust me, no manual QA wants to be just a manual QA. By doing this, you are providing them with a growth path and something more exciting than just replaying the same test steps over and over again.