Photon: The Swiss Army Knife for Deployments

This blog post walks you through how we’ve automated our microservice deployments using photon.


If working on such problems excites you, our SRE team is hiring. If you care about developer productivity and tooling, check out the job description and apply here.


Feb 2021

At slice, our team is growing 3X, and the number of engineers and microservices has started increasing, so naturally we were looking for a way to automate our CI/CD pipelines. We had already done POCs with Packer and Jenkins-based scripts, but we found they were not universal: we couldn’t reuse the learnings from one pipeline across all services and environments. And every time a service had different deployment requirements, we had to customise the pipeline again.

March 2021

We went to the whiteboard. We already had experience with hercules, and we loved the idea of using Docker images as immutable carriers of an application. As a result, we designed photon.

The idea was simple: we wanted to build a tool that captures a developer’s wishes and deploys, updates, and deletes the application accordingly. We left it to the devs to decide how they’d want to consume photon’s CRUD interface.

Developers package their code as Docker images pushed to private registries and provide a configuration file. Photon reads the configuration and immediately deploys the application over AWS Fargate. Photon takes care of logging, distributed tracing, and the deployment pattern itself.
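Conceptually, the configuration file is a small declaration of what the service needs. The sketch below is purely illustrative: the field names are assumptions for this post, not photon’s actual schema.

```json
{
  "service": "payments-api",
  "image": "<registry>/payments-api:1.4.2",
  "environment": "staging",
  "cpu": 512,
  "memory": 1024,
  "deployment_pattern": "canary",
  "logging": { "driver": "fluentbit" },
  "tracing": { "enabled": true }
}
```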

To use photon, a developer only needs to run the following commands:

photon --file deployment.json             # creates and updates the infra
photon --file deployment.json --terminate # deletes the infra

We then used photon for one of our use cases.

  • A developer commits to their feature branch; a photon pipeline runs and posts an endpoint for their application server over Slack.
  • The developer then merges the branch into staging. photon terminates all their feature-branch infra, and the photon pipeline updates the staging deployment.
  • We use a similar flow for staging, beta and production environments.
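The branch-driven flow above can be sketched as a small CI step. The photon binary is stubbed out with a shell function here so the sketch is self-contained, and the branch and file names are illustrative:

```shell
#!/bin/sh
# Stub standing in for the real photon binary, so this sketch runs anywhere.
photon() { echo "photon $*"; }

deploy_for_branch() {
  BRANCH="$1"   # assumed to come from the CI system
  case "$BRANCH" in
    staging)
      # Merging to staging: tear down the feature-branch infra, update staging.
      photon --file feature-deployment.json --terminate
      photon --file staging-deployment.json
      ;;
    *)
      # Any feature branch gets its own environment.
      photon --file feature-deployment.json
      ;;
  esac
}

deploy_for_branch "staging"
# → photon --file feature-deployment.json --terminate
#   photon --file staging-deployment.json
```

The real pipelines are of course richer, but the point stands: the whole deployment interface is the two photon commands.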

Oct 2021

photon now supports a lot of features out of the box. Here are a few:

  • Logging – using Fluent Bit (which can be directed to any standard logging mechanism) and CloudWatch.
  • Distributed tracing.
  • Different deployment patterns – blue/green (B/G), canary, and linear deployments.
  • Secrets and Configuration Management.

The developer can enable, disable, and customise these features from the same configuration file.

The interface remains the same.

Jan 2022

photon is currently used to deploy 50+ microservices for different consumers across different environments at slice. It creates, updates, and cleans up the infra at the developer’s will. Onboarding a service for deployment with photon across all environments currently takes a developer less than an hour.

photon has made service deployment a declarative, productive process: enjoyable and hassle-free. We will open-source it in the coming quarter.

Watch this space for the next blogs where Adersh and Kumaran will dig deeper into the philosophy and architecture of photon.

CloudCustodian At Slice

Regulation is an interesting topic: it is directly proportional to how fair and clean a process is, yet inversely related to the freedom with which that process can be performed. Take Indian markets, for instance: when the 1991 reforms removed regulation, the productivity of the market increased, but lawmakers still have to step in every year to amend market processes to keep them fair and clean.

At slice, our infrastructure is growing, and so is our team. We want developers to be productive: a developer should ideally get a bucket up and ready in seconds. But we also want to make sure it is launched with the right configuration and is compliant with our organisation’s policy. It’s extremely tricky to convey every required configuration to every developer in an ever-growing team. Would you want them to just create the bucket, or first go through a document on how to launch a bucket in the org? Security and productivity are inversely proportional, just like regulation and markets.

Enter Cloud Custodian.

Cloud Custodian runs a set of policies against your cloud resources. It checks whether a resource is compliant with the filters defined in the policy and, optionally, performs an action, e.g. making private a bucket that you accidentally made public.

A typical Cloud Custodian policy for the above use case looks like this:
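For instance, a sketch of a policy for the public-bucket case might look like the following. The policy name is illustrative; `global-grants` and `delete-global-grants` are standard Cloud Custodian S3 filters/actions, but check the docs for your version:

```yaml
policies:
  - name: s3-strip-public-grants
    resource: s3
    filters:
      # Buckets whose ACL grants access to AllUsers / AuthenticatedUsers
      - global-grants
    actions:
      # Remove those grants, making the bucket private again
      - delete-global-grants
```

Running it is a single command, e.g. `custodian run --output-dir=out policy.yml`.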

It has two major components:

  1. Filters: the set of filters run against your resources. You can apply operators and combine filters together. Cloud Custodian ships with predefined standard filters for each resource type, or you can write custom filters against the values returned by the resource’s describe API call (e.g. aws ec2 describe-instances).
  2. Actions: the set of actions you want to perform on the resources selected by the filters. Again, Cloud Custodian ships with predefined standard actions, or you can attach a Lambda handler to perform any custom action.

Our infrastructure team defines the right configuration for launching an S3 bucket and commits it into a git repo. Our custodian then runs on the hercules platform, picks these policies up from an S3 bucket, runs them against all resources in our multiple AWS accounts, sends alerts, and aggregates findings into AWS Security Hub. Finally, it performs an action if there is a critical configuration mismatch.

Drawing on the market analogy, the custodian runs like a constant regulator in a market, keeping it fair and clean. The developers are like entrepreneurs who remain productive without ever having a second thought about security.

So, this was all about the CloudCustodian at slice. To know more about the amazing things that slice’s engineering team does, keep an eye on this space! 

Session Manager: Driving operational excellence at slice!

Goodbye SSH and bastion hosts. Hello SSM!

As much as we’d like to run our servers like cattle (pets vs cattle mantra), there are times that call for interactive shell access to instances. 

This translates to audited secure access to cloud resources either through bastion hosts or through SSH keys, which in turn opens up a Pandora’s box of bastion management and tight SSH security. 

Surely this age-old problem of remote server access was waiting for a cloud-native solution. Enter Session Manager, the future of remote server management.

So how does Session Manager improve upon traditional remote access technologies? Here is a list of its features:

  • No inbound security rules are required to access instances. This means zero ports have to be opened to allow remote access.
  • All user sessions and commands are logged with optional encryption via KMS.
  • Integration with existing IAM policies to allow robust access control.
  • SSH tunneling over session manager. 

The architecture diagram below provides a high level overview of how session manager works.

session manager architecture

Let’s look at how to set up and enable Session Manager for AWS instances.

Configuring session manager

1. IAM permissions for instance: The easiest way to get started is to attach the AmazonSSMRoleForInstancesQuickSetup role to your instance.

IAM role for session manager

If your instance already has a role attached to it, the AmazonSSMManagedInstanceCore policy can be attached to the existing role.

IAM policy for session manager
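For instance, with the AWS CLI that attachment is a single command (the role name here is illustrative):

```shell
# Attach the SSM core permissions to an existing instance role
aws iam attach-role-policy \
  --role-name my-existing-instance-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
```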

2. IAM permissions for users: You need to create policies to allow access to an EC2 instance for specific IAM users and roles. The below policy grants access to EC2 instances with the name tag of API:

{
   "Version": "2012-10-17",
   "Statement": [
     {
       "Effect": "Allow",
       "Action": [
         "ssm:StartSession"
       ],
       "Resource": "arn:aws:ec2:::instance/*",
       "Condition": {
         "StringEquals": {
           "ssm:resourceTag/name": "API"
         }
       }
     },
     {
       "Effect": "Allow",
       "Action": [
         "ssm:TerminateSession"
       ],
       "Resource": [
         "arn:aws:ssm:::session/${aws:username}-*"
       ]
     },
     {
       "Effect": "Allow",
       "Action": [
         "ssm:GetConnectionStatus",
         "ssm:DescribeSessions",
         "ssm:DescribeInstanceProperties",
         "ec2:DescribeInstances"
       ],
       "Resource": "*"
     }
   ]
 }

More info on configuring policy can be found here

3. SSM agent installation: You need to make sure your Amazon Machine Images (AMIs) have SSM Agent installed. SSM Agent is preinstalled by default on popular AMIs like Amazon Linux and Ubuntu Server. If not, the agent can be installed manually with the following command (substitute region with your AWS Region):

sudo yum install -y https://s3.region.amazonaws.com/amazon-ssm-region/latest/linux_amd64/amazon-ssm-agent.rpm

More info on installing and enabling agent can be found here

4. Audit logs: Session Manager can store audit logs in a CloudWatch log group or an S3 bucket. However, the option has to be enabled in Session Manager -> Preferences.

S3 logging for session manager

Using session manager

A session can be started by an authenticated user either from the AWS management console or through CLI. 

1. Starting a session (console): Either the EC2 console or the Systems Manager console can be used to start a session.
Connect through the EC2 console

2. Starting a session (AWS CLI): Using session manager through the CLI calls for an additional requirement of installing the SSM plugin:

  • Prerequisites: 
    1. AWS CLI version 1.16.12 or higher
    2. Session manager plugin – Install instructions for different systems here
  • Starting a session: 
aws ssm start-session --target "<instance_id>"

3. Using SSH and SCP with session manager: One of the major limitations of session manager when it was launched was its inability to copy files without going through S3. 

Now the AWS-StartSSHSession document supports tunnelling SSH traffic through session manager.

Note: Using this functionality requires the use of a key that is associated with the instance. Logging is unavailable for sessions that connect through SSH as SSH encrypts all transit data.

Steps to use SSH/SCP with session manager: 

  1. Verify that prerequisites mentioned above are met.
  2. Add the below lines to SSH config to allow session manager tunneling. The SSH configuration file is typically located at ~/.ssh/config.
# SSH over Session Manager

host i-* mi-*
ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"

SSH into instance with Session Manager: SSH can be performed as normal using the instance-id as the hostname. Example:

% ssh ec2-user@<instance_id>
Last login: Wed Oct 28 10:53:22 2020 from ip-<instance_ip>.ap-south-1.compute.internal
[ec2-user@ip-<instance_ip> ~]$

SCP to copy files with Session Manager: SCP can be performed as normal using the instance-id as the hostname. Example:

% scp test ec2-user@<instance_id>:test
test           100%    0     0.0KB/s   00:00

Wrapping up

Session Manager defies the saying,

“Convenience is the enemy of security,”

by being both convenient and secure.

The ease of using session manager along with its ability to tunnel SSH traffic allows us to phase out SSH and switch completely to session manager. No more open SSH ports!

Combining Session Manager with the extended capabilities Systems Manager provides, like patching, automation documents, and Run Command, makes for a powerful ops workflow.

If you are invested in the AWS cloud, leveraging Session Manager is a no-brainer!

Here at slice, we are constantly working towards creating new tools, every day, to streamline our workflow. So, stay tuned for more!