Learn how to use Kosli to trace a production 500 error in cyber-dojo back to the specific git commit that caused it — without any access to the production environment.
By the end of this tutorial, you will have traced a production incident from a 500 error all the way back to the git commit that caused it, using only Kosli CLI queries against the public cyber-dojo organization.
https://cyber-dojo.org is showing a 500 error. It was working an hour ago. What changed?
When this incident happened, the flow was simply named creator. The flow has since been archived, and archiving a flow currently renames it by appending -archived-at-<timestamp>. The historical evidence is unchanged; only the displayed name is longer.
These two snapshots are part of the same deployment: creator:b7a5908 started in snapshot #176, and creator:31dee35 stopped in snapshot #177. The new artifact, creator:b7a5908, arrived just before the 500 error, so that is the one to investigate.
Get the full history of creator:b7a5908 with kosli search, using the fingerprint prefix from snapshot #176:
kosli search 860ad17
You should see:
Search result resolved to artifact with fingerprint 860ad172ace5aee03e6a1e3492a88b3315ecac2a899d4f159f43ca7314290d5a
Name:              cyberdojo/creator:b7a5908
Fingerprint:       860ad172ace5aee03e6a1e3492a88b3315ecac2a899d4f159f43ca7314290d5a
Has provenance:    true
Flow:              creator-archived-at-1707630496
Git commit:        b7a590836cf140e17da3f01eadd5eca17d9efc65
Commit URL:        https://github.com/cyber-dojo/creator/commit/b7a590836cf140e17da3f01eadd5eca17d9efc65
Build URL:         https://github.com/cyber-dojo/creator/actions/runs/3001102984
Artifact URL:      https://app.kosli.com/cyber-dojo/flows/creator-archived-at-1707630496/artifacts/860ad172ace5aee03e6a1e3492a88b3315ecac2a899d4f159f43ca7314290d5a
Compliance state:  COMPLIANT
Running in:        [ ]
Exited from:       [ aws-beta, aws-prod ]
History:
    Artifact created                                 Tue, 06 Sep 2022 16:48:07 CEST
    Started running in aws-beta#196 environment      Tue, 06 Sep 2022 16:51:42 CEST
    Started running in aws-prod#176 environment      Tue, 06 Sep 2022 16:52:28 CEST
    No longer running in aws-beta#199 environment    Tue, 06 Sep 2022 21:28:42 CEST
    No longer running in aws-prod#179 environment    Tue, 06 Sep 2022 21:30:28 CEST
The artifact started running in aws-prod at 16:52 — right when the incident began. The output includes a direct link to the git commit. (You can also see the artifact exiting both environments later that evening, once the incident was fixed by a newer commit.)
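If you want to pull just the commit link and the production start event out of the search result in a script, a minimal sketch might look like the following. It assumes the CLI prints each field on its own line, with a `Commit URL:` label and history entries of the form `Started running in <environment>#<snapshot> environment`. The function names `commit_url` and `started_in` are hypothetical helpers written for this illustration; they are not part of the Kosli CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the Kosli CLI). Both read the plain-text
# output of `kosli search` on stdin and assume one field per line.

# Print the value of the "Commit URL:" line.
commit_url() {
  awk '$1 == "Commit" && $2 == "URL:" { print $3; exit }'
}

# Print the first history entry that records the artifact starting in the
# named environment, with runs of whitespace squeezed to single spaces.
started_in() {
  awk -v name="$1" '
    index($0, "Started running in " name "#") { $1 = $1; print; exit }
  '
}
```

Under those assumptions, `kosli search 860ad17 | commit_url` would print the GitHub commit link, and `kosli search 860ad17 | started_in aws-prod` would print the line showing when the artifact first appeared in aws-prod. Parsing human-readable output is fragile, so treat this as a convenience for ad hoc investigation rather than something to build automation on.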
Following the commit URL reveals a simple typo in app.rb: an extra s inserted into a method name. The method is called respond_to, not responds_to. That one character caused the 500 error.
You traced a production 500 error back to a specific git commit without any direct access to aws-prod. By querying the environment log and artifact history in Kosli, you identified exactly which deployment introduced the incident and which code change caused it.
From here you can:
Learn more about environment and artifact queries in the Querying Kosli tutorial