Tracing a production incident back to git commits #
In this 5-minute tutorial you'll learn how Kosli can trace a production incident in cyber-dojo back to git commits.
Something has gone wrong and https://cyber-dojo.org is displaying a 500 error!
It was working an hour ago. What has happened in the last hour?
Start with the environment #
https://cyber-dojo.org is running in an AWS environment
that reports to Kosli as aws-prod.
Get a log of this environment's changes:
kosli log env aws-prod
You will see more than 177 snapshots because
aws-prod has moved on since this incident: it was resolved with new
commits, which created new deployments. To reproduce the output shown here,
restrict the command to a snapshot interval:
kosli log env aws-prod --interval 175..177
SNAPSHOT  EVENT                                                                          FLOW     DEPLOYMENTS
#177      Artifact: 274425519734.dkr.ecr.eu-central-1.amazonaws.com/creator:31dee35      creator  #87
          Fingerprint: 5d1c926530213dadd5c9fcbf59c8822da56e32a04b0f9c774d7cdde3cf6ba66d
          Description: 1 instance stopped running (from 1 to 0).
          Reported at: Tue, 06 Sep 2022 16:53:28 CEST

#176      Artifact: 274425519734.dkr.ecr.eu-central-1.amazonaws.com/creator:b7a5908      creator  #89
          Fingerprint: 860ad172ace5aee03e6a1e3492a88b3315ecac2a899d4f159f43ca7314290d5a
          Description: 1 instance started running (from 0 to 1).
          Reported at: Tue, 06 Sep 2022 16:52:28 CEST
...
These two snapshots belong to the same blue-green deployment.
You see artifact
creator:b7a5908 starting in snapshot #176, and artifact
creator:31dee35 exiting in snapshot #177.
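The question "what has happened in the last hour?" can also be answered mechanically once the log is available as text. The following is a minimal Python sketch, not part of Kosli: it parses an excerpt of the output above and keeps only snapshots in which an artifact started running within an hour of the incident. The names `parse_snapshots` and `started_recently` and the incident time of 17:30 are assumptions made for illustration, and the parsing assumes the exact line layout of this excerpt and English month abbreviations.

```python
import re
from datetime import datetime, timedelta

# Excerpt of the `kosli log env` output shown above (two snapshots only).
LOG = """\
#177 Artifact: 274425519734.dkr.ecr.eu-central-1.amazonaws.com/creator:31dee35
Fingerprint: 5d1c926530213dadd5c9fcbf59c8822da56e32a04b0f9c774d7cdde3cf6ba66d
Description: 1 instance stopped running (from 1 to 0).
Reported at: Tue, 06 Sep 2022 16:53:28 CEST
#176 Artifact: 274425519734.dkr.ecr.eu-central-1.amazonaws.com/creator:b7a5908
Fingerprint: 860ad172ace5aee03e6a1e3492a88b3315ecac2a899d4f159f43ca7314290d5a
Description: 1 instance started running (from 0 to 1).
Reported at: Tue, 06 Sep 2022 16:52:28 CEST
"""

# One match per snapshot entry in the excerpt above.
ENTRY = re.compile(
    r"#(\d+) Artifact: (\S+)\n"
    r"Fingerprint: ([0-9a-f]{64})\n"
    r"Description: (.+)\n"
    r"Reported at: \w{3}, (\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) \w+"
)


def parse_snapshots(text):
    """Turn the text excerpt into one dict per snapshot entry."""
    return [
        {
            "snapshot": int(number),
            "artifact": artifact.rsplit("/", 1)[-1],  # drop the registry prefix
            "fingerprint": fingerprint,
            "description": description,
            # The zone name (CEST) is dropped; all times here share one zone.
            "reported_at": datetime.strptime(stamp, "%d %b %Y %H:%M:%S"),
        }
        for number, artifact, fingerprint, description, stamp in ENTRY.findall(text)
    ]


def started_recently(snapshots, incident_time, window=timedelta(hours=1)):
    """Keep snapshots where an artifact started running within `window` before the incident."""
    return [
        s for s in snapshots
        if "started running" in s["description"]
        and incident_time - window <= s["reported_at"] <= incident_time
    ]


# Hypothetical time at which the 500 error was noticed (not given in the tutorial).
incident_time = datetime(2022, 9, 6, 17, 30, 0)
suspects = started_recently(parse_snapshots(LOG), incident_time)
for s in suspects:
    print(s["snapshot"], s["artifact"], s["fingerprint"][:7])
```

With the excerpt above this prints `176 creator:b7a5908 860ad17`, which is the snapshot and short fingerprint to investigate next.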
Dig into the artifact #
You are interested in #176, showing the newly running artifact,
with the fingerprint starting 860ad17.
Let's learn more about this artifact:
kosli get artifact creator@860ad17
Name:         cyberdojo/creator:b7a5908
Flow:         creator
Fingerprint:  860ad172ace5aee03e6a1e3492a88b3315ecac2a899d4f159f43ca7314290d5a
Created on:   Tue, 06 Sep 2022 16:48:07 CEST • 21 hours ago
Git commit:   b7a590836cf140e17da3f01eadd5eca17d9efc65
Commit URL:   https://github.com/cyber-dojo/creator/commit/b7a590836cf140e17da3f01eadd5eca17d9efc65
Build URL:    https://github.com/cyber-dojo/creator/actions/runs/3001102984
State:        COMPLIANT
History:
    Artifact created                               Tue, 06 Sep 2022 16:48:07 CEST
    Deployment #88 to aws-beta environment         Tue, 06 Sep 2022 16:49:59 CEST
    Deployment #89 to aws-prod environment         Tue, 06 Sep 2022 16:51:12 CEST
    Started running in aws-beta#196 environment    Tue, 06 Sep 2022 16:51:42 CEST
    Started running in aws-prod#176 environment    Tue, 06 Sep 2022 16:52:28 CEST
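The History entries give a compact timeline from build to production. As a rough illustration, the Python sketch below computes how long after creation each event happened, using the timestamps copied from the output above. The helper names `parse` and `elapsed_since_creation` are hypothetical, and the zone name is ignored on the assumption that every entry uses the same zone.

```python
from datetime import datetime

# History entries copied from the `kosli get artifact` output above.
# The zone name (CEST) is omitted because every entry uses the same zone.
HISTORY = [
    ("Artifact created", "06 Sep 2022 16:48:07"),
    ("Deployment #88 to aws-beta environment", "06 Sep 2022 16:49:59"),
    ("Deployment #89 to aws-prod environment", "06 Sep 2022 16:51:12"),
    ("Started running in aws-beta#196 environment", "06 Sep 2022 16:51:42"),
    ("Started running in aws-prod#176 environment", "06 Sep 2022 16:52:28"),
]


def parse(stamp):
    """Parse a timestamp such as '06 Sep 2022 16:48:07'."""
    return datetime.strptime(stamp, "%d %b %Y %H:%M:%S")


def elapsed_since_creation(history):
    """Return (event, seconds since the first entry) pairs."""
    start = parse(history[0][1])
    return [
        (event, int((parse(stamp) - start).total_seconds()))
        for event, stamp in history
    ]


timeline = elapsed_since_creation(HISTORY)
for event, seconds in timeline:
    print(f"+{seconds // 60}m{seconds % 60:02d}s  {event}")
```

The last line printed is `+4m21s  Started running in aws-prod#176 environment`: the artifact was running in production a little over four minutes after it was built, and the snapshot number matches the #176 found in the environment log.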
Follow to the commit #
You can follow the commit URL.
The incident was caused by a simple typo in the
Perhaps someone accidentally inserted the "s" while trying to save the file?
Either way, this is clearly the problem because the function is called
respond_to, without the "s".
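To see why a one-letter typo in a function name can surface as a 500 error only once the code is running, here is a small Python analogy. It is not cyber-dojo's code: the `serve` function is a hypothetical stand-in for a web framework, and the misspelling `responds_to` is an assumed example, since this page does not show exactly where the extra "s" was inserted.

```python
def respond_to(path):
    """Stand-in for the correctly named helper."""
    return f"ok: {path}"


def good_handler(path):
    return respond_to(path)


def broken_handler(path):
    # Hypothetical misspelling with one extra "s". Python only looks the
    # name up when this line runs, so the module still loads without error.
    return responds_to(path)


def serve(handler, path):
    """Minimal stand-in for a web framework: unhandled errors become a 500."""
    try:
        return 200, handler(path)
    except Exception as error:
        return 500, f"{type(error).__name__}: {error}"


ok = serve(good_handler, "/")
broken = serve(broken_handler, "/")
print(ok[0], broken[0])
```

The first request returns status 200 and the second returns status 500 with a `NameError`. In a dynamic language a misspelled call like this can pass the build and deployment steps and fail only when a request reaches it.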
You were able to trace the problem back to a specific commit without any access to cyber-dojo's