Building a Bot to Handle 18,000+ SIMKATMAWA Entries

Some technical projects start from curiosity.

This one started from exhaustion.

I was dealing with a reporting workflow in SIMKATMAWA that looked simple on paper, but became unreasonable once the data volume grew. There was no API, no bulk import, and no practical shortcut. Every record had to be entered manually through the browser, one by one, along with supporting details and uploaded files.

For a small number of entries, maybe that is still manageable. But once the records reach the thousands, it stops being routine administrative work. It becomes a bottleneck.

In the case I handled, 2,759 certification records expanded into 18,272 executable jobs. The manual estimate was around 5 to 10 minutes per entry, which means the total workload could easily reach 1,500 to 3,000 hours. That is not just inefficient. It is the kind of work that drains focus, increases error risk, and quietly consumes too much time from people who should be doing more meaningful things.

That was the point where I felt this process should not stay manual.

Why I Built It

What pushed me to build this was not only the number of records, but the nature of the work itself. Repetitive input across thousands of browser forms is exactly the kind of task that sounds harmless until you actually have to live through it.

And once you do, the pain becomes obvious.

The challenge was also quite specific. I could not rely on backend integration because the platform did not provide it. So the only realistic option was to automate the browser itself, carefully and responsibly, as if a human operator were doing the work, but with better consistency and endurance.

I did not want to build a gimmick. I wanted something I could trust to run for days.

What I Built

The result was a browser automation system with a web dashboard on top of it.

At the center of it was Playwright running on Chromium, responsible for navigating pages, filling forms, uploading files, retrying when needed, and keeping the process moving. Around that engine, I built a dashboard to control the bot in real time, with start, pause, stop, and resume actions, plus log streaming and job monitoring. The backend handled imported CSV or Excel files, converted them into executable jobs, and managed user credentials securely.
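To make the import step concrete, here is a minimal sketch of how spreadsheet rows might be turned into executable jobs. The field names (`nim`, `certName`, `filePath`) and type names are my own for illustration, not the actual schema:

```typescript
// Illustrative sketch: turning imported CSV/Excel rows into executable jobs.
// Field names here are assumptions, not the real SIMKATMAWA schema.
type RawRow = { nim: string; certName: string; filePath: string };

type Job = {
  id: number;
  row: RawRow;
  status: "pending" | "done" | "failed" | "skipped";
  attempts: number;
};

function buildJobs(rows: RawRow[]): Job[] {
  return rows
    .filter((r) => r.nim && r.certName) // drop incomplete rows up front
    .map((row, i) => ({
      id: i + 1,
      row,
      status: "pending" as const,
      attempts: 0,
    }));
}
```

Doing this conversion once, up front, means the engine only ever deals with a uniform queue of jobs rather than raw spreadsheet quirks.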

The stack was quite straightforward:

  • Node.js with TypeScript
  • Playwright with Chromium
  • Express
  • MySQL / MariaDB
  • PM2
  • Ubuntu with Nginx

Nothing about this stack was chosen to look impressive. I picked it because I needed something practical, stable, and easy to keep running.

Core Features

To handle the scale and complexity of 18,000+ automated entries, I designed the system around several features that focused not only on execution, but also on durability, visibility, and operational stability.

  1. State Persistence and Crash Recovery.
    The engine persists its state to disk every 5 jobs, recording each job's status (pending, done, failed, or skipped) along with progress counters, the current index, and error messages. This allows the bot to recover from crashes or server restarts and continue from the last known position without replaying thousands of completed jobs. For an automation process that runs 60 to 70 hours non-stop, this feature is essential.
  2. Multi-Phase Retry and Exponential Backoff.
    Each job is divided into several execution phases such as navigation, form filling, file upload, and submit. If a failure happens in a particular phase, the retry starts from that phase instead of restarting the entire job from the beginning. The retry delay uses exponential backoff, moving from 3 seconds to 6 seconds and then 12 seconds, giving the target platform enough time to recover before the bot attempts the next action.
  3. Smart Throttle.
    The bot adjusts its delay dynamically based on the platform response time. When the server becomes slower or busier, the bot backs off. When conditions improve, it proceeds faster again. This helps the automation remain stable while also behaving responsibly toward a shared system that may be used by many institutions at the same time.
  4. Server-Side Pagination and Smart Refresh.
    The dashboard uses server-side pagination, returning only 50 rows per page with filtering handled on the server. This keeps the interface responsive even while the bot is processing thousands of jobs in the background. Data refresh is also optimized. Instead of reloading on every Server-Sent Events (SSE) tick, the dashboard only refreshes when meaningful counters change, such as the success or failed totals increasing.
  5. Live Dashboard and Monitoring.
    The web dashboard provides full operational control with start, pause, stop, and resume actions. It also includes live log streaming through Server-Sent Events, progress tracking, screenshot-on-failure for debugging, and a job table with status filters. I did not want the automation to feel like a black box. If something went wrong, I wanted enough context to understand what happened and where it happened.
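The checkpointing idea in point 1 can be sketched as follows. The file name, checkpoint shape, and atomic-write detail are my assumptions of how such a mechanism is typically built, not the actual implementation:

```typescript
import * as fs from "fs";

// Hypothetical checkpoint shape; the real engine stores more fields.
interface Checkpoint {
  currentIndex: number;
  counters: { done: number; failed: number; skipped: number };
  statuses: Record<number, "pending" | "done" | "failed" | "skipped">;
}

const SAVE_EVERY = 5; // persist every 5 jobs, as described above

function saveCheckpoint(path: string, cp: Checkpoint): void {
  // Write to a temp file, then rename over the target, so a crash
  // mid-write never leaves a corrupt checkpoint behind.
  const tmp = path + ".tmp";
  fs.writeFileSync(tmp, JSON.stringify(cp));
  fs.renameSync(tmp, path);
}

function loadCheckpoint(path: string): Checkpoint | null {
  if (!fs.existsSync(path)) return null;
  return JSON.parse(fs.readFileSync(path, "utf8"));
}

function maybeSave(path: string, cp: Checkpoint): void {
  if (cp.currentIndex % SAVE_EVERY === 0) saveCheckpoint(path, cp);
}
```

On startup, the engine would call `loadCheckpoint` and, if a checkpoint exists, resume from `currentIndex` instead of job zero.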
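The phase-level retry in point 2 might look roughly like this. The phase names and retry cap are illustrative; the 3s → 6s → 12s progression is the one described above:

```typescript
// Sketch of phase-level retries: a failed phase is retried with exponential
// backoff (3s -> 6s -> 12s) without redoing the phases that already passed.
type Phase = { name: string; run: () => Promise<void> };

const BASE_DELAY_MS = 3000;

function backoffDelay(attempt: number): number {
  return BASE_DELAY_MS * 2 ** attempt; // 3000, 6000, 12000, ...
}

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

async function runJob(phases: Phase[], maxRetries = 3): Promise<void> {
  for (const phase of phases) {
    let attempt = 0;
    for (;;) {
      try {
        await phase.run();
        break; // phase succeeded; earlier phases are never replayed
      } catch (err) {
        if (++attempt > maxRetries) throw err; // give up on this job
        await sleep(backoffDelay(attempt - 1));
      }
    }
  }
}
```

A job would then be expressed as a list of phases such as navigation, form filling, file upload, and submit, so a flaky upload only repeats the upload step.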
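The smart throttle from point 3 boils down to a moving average of response times driving the pause between jobs. The smoothing factor, bounds, and multiplier below are illustrative tuning, not the real values:

```typescript
// Sketch of the adaptive delay in point 3: track an exponentially weighted
// average of the platform's response time and scale the pause between jobs.
class SmartThrottle {
  private avgMs = 1000; // assumed starting estimate

  constructor(
    private minDelayMs = 500,
    private maxDelayMs = 10000,
    private smoothing = 0.3,
  ) {}

  // Call after each request with how long the platform took to respond.
  record(responseMs: number): void {
    this.avgMs =
      this.smoothing * responseMs + (1 - this.smoothing) * this.avgMs;
  }

  // Delay proportional to observed latency, clamped to sane bounds:
  // slow server -> longer pauses, fast server -> shorter ones.
  nextDelayMs(): number {
    return Math.min(
      this.maxDelayMs,
      Math.max(this.minDelayMs, this.avgMs * 1.5),
    );
  }
}
```

The clamping matters: without a floor the bot could hammer a momentarily fast server, and without a ceiling one slow response could stall the whole run.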
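Points 4 and 5 meet in the SSE stream and the refresh decision layered on top of it. The event framing below is the standard SSE wire format; the counter names are my own illustration:

```typescript
// Standard Server-Sent Events framing: each event is "data: <payload>\n\n".
function formatSseEvent(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// "Smart refresh": the dashboard receives frequent SSE ticks, but only
// re-fetches the job table when counters that matter actually change.
type Counters = { success: number; failed: number };

function shouldRefresh(prev: Counters, next: Counters): boolean {
  return next.success !== prev.success || next.failed !== prev.failed;
}
```

This is what keeps the table at 50 server-paginated rows per fetch from being re-requested on every heartbeat while the bot grinds through thousands of jobs.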

The Part That Actually Matters

For me, the interesting part of this project was never simply that the bot could click forms automatically.

That part is easy to say. The harder part is making the automation reliable enough that you are willing to let it run unattended for more than 60 hours without constantly worrying that everything will fall apart halfway through.

That is why most of my attention went into reliability rather than just speed.

I wanted the system to preserve progress, recover gracefully, and remain observable during long execution windows. Over time, that made the project feel less like a script and more like a real internal tool that could actually be trusted.

What Came Out of It

On the real run, the bot was able to process the workload at a scale that would have been very difficult to handle manually. What mattered most to me was not only that the automation worked, but that it stayed stable throughout a long-running execution and kept the overall progress intact.

Metric                    Result
------------------------  ----------------------
Total jobs                18,272
Success rate              99.8%
Recoverable failures      6 to 10 jobs
Average throughput        4 to 5 jobs per minute
Total runtime             Around 65 hours
Estimated manual effort   1,500 to 3,000 hours
Estimated time saved      1,435 to 2,935 hours

The bot ran continuously on a VPS with PM2 auto-restart and persistence enabled, so even when interruptions happened at the server level, the process could continue without losing the broader execution state.

What Stayed With Me

This project reminded me that not every useful system has to be flashy.

Sometimes the most valuable thing you can build is something that quietly removes friction from other people’s work. Something that takes a process everyone has accepted as tedious and makes it lighter, cleaner, and more survivable.

A lot of digital work is like that. The problem is not always a lack of ideas. Sometimes the problem is that too much energy is being spent on tasks that should have been automated long ago.

This bot was my answer to one of those tasks.

And honestly, I think that is why I like projects like this. They sit in that space between engineering and empathy. You are still solving a technical problem, but what you are really improving is someone’s working day.