Lewis Jackson Self-Improving Trading Agent Framework

Last updated: 24 May 2026

Deploy a 24/7 autonomous trading agent that learns from its own trade outcomes and iteratively self-improves its strategy using the Hermes agent's built-in self-learning loop — without manual retraining.

// TL;DR

The Lewis Jackson Self-Improving Trading Agent Framework is a method for deploying a 24/7 autonomous trading agent that learns from its own trade outcomes using the Hermes agent's built-in self-learning loop. It applies the scientific method — changing one variable per cycle — to iteratively improve a trading strategy without manual retraining. Use it when you want to move beyond static trading bots and build an adaptive system that compounds improvements over time, hosted continuously in the cloud via Railway so it never goes offline.


// When should I use the Lewis Jackson Self-Improving Trading Agent Framework?

Use this skill when you want to build or upgrade an automated trading agent that goes beyond static rules and can adapt its strategy over time. Trigger it when you have a trading strategy (or want one built) and want to architect it into a self-improving system running continuously in the cloud.

// What inputs do I need to build a self-improving trading agent?

Existing trading strategy (optional)
A pre-built strategy file or documentation on your current approach (e.g. entry/exit rules, asset class, position sizing). If absent, the onboarding agent will scaffold a basic baseline strategy for you.
Target asset (required)
The asset or market you want the agent to trade (e.g. Bitcoin, Ethereum, Solana, a specific subnet token, forex pair, etc.).
Success definition (required)
A specific, measurable goal — e.g. target monthly return percentage, minimum Sharpe score, maximum drawdown threshold. Must be within realistic bounds given your starting capital.
Failure definition (required)
The threshold or condition that constitutes failure, e.g. drawdown exceeding X% or return below Y% over Z days. The agent uses it to orient the direction of improvement.
Starting capital amount (required)
The capital available for trading. Used to sanity-check whether the success definition is achievable (e.g. targeting $1M/month on $10 capital is flagged as impossible).
Railway account (required)
A Railway.app account for 24/7 cloud hosting of the agent so it runs regardless of whether your local machine is on.
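
The capital sanity check described under the starting capital input can be sketched as below. This is a minimal illustration, not the framework's actual check: the function names and the 50% per month plausibility ceiling are assumptions chosen for the example.

```python
def implied_monthly_return(starting_capital: float, target_monthly_profit: float) -> float:
    """Monthly return (as a fraction of capital) needed to reach the profit target."""
    if starting_capital <= 0:
        raise ValueError("starting capital must be positive")
    return target_monthly_profit / starting_capital


def is_goal_plausible(starting_capital: float, target_monthly_profit: float,
                      max_plausible_return: float = 0.5) -> bool:
    """Flag goals whose implied monthly return exceeds an assumed ceiling.

    max_plausible_return is a hypothetical cutoff (50% per month), not a value
    taken from the framework.
    """
    return implied_monthly_return(starting_capital, target_monthly_profit) <= max_plausible_return


print(is_goal_plausible(10, 1_000_000))   # $1M/month on $10 capital -> False
print(is_goal_plausible(10_000, 1_500))   # 15% per month on $10,000 -> True
```

A check of this kind only rules out targets that are out of proportion to capital; it says nothing about whether a target that passes is actually achievable.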

// What core principles make the self-improving trading agent work?

Accuracy

The data feeding the agent must be clean, consistent, and objectively interpreted. Unreliable API connections or ambiguous news-feed parsing introduce errors that corrupt every downstream decision. Establish strict rules so that conclusions drawn from incoming data are accurate and objective, not dependent on run-to-run variation in how the AI interprets the same input.

Reliability

The agent must operate 24/7 regardless of local machine state. Host on a cloud server (Railway) so execution is never interrupted by a shutdown or network drop. Reliability means the system is always executing — not just when you're watching.

Well-Defined Goal

The agent needs a destination, not vibes. Define exactly what success looks like (specific return target, Sharpe score threshold) and exactly what failure looks like. The agent uses this polarity — toward-goal vs. away-from-goal — to orient every improvement cycle. Without this, 90% of trading agents are flying blind.
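
One way to make the toward-goal vs. away-from-goal polarity concrete is to measure how far a cycle's results fall short of the success thresholds and compare that shortfall between cycles. The sketch below assumes hypothetical names (`Goal`, `CycleResult`, `shortfall`, `direction`) and an unweighted sum of gaps; the framework does not specify how Hermes scores direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """Success thresholds; field names are illustrative."""
    target_monthly_return: float  # e.g. 0.15 for 15%
    min_sharpe: float             # e.g. 1.2
    max_drawdown: float           # e.g. 0.12 for 12%


@dataclass(frozen=True)
class CycleResult:
    monthly_return: float
    sharpe: float
    drawdown: float


def shortfall(result: CycleResult, goal: Goal) -> float:
    """Sum of the gaps to each success threshold; 0.0 means every threshold is met."""
    return (max(0.0, goal.target_monthly_return - result.monthly_return)
            + max(0.0, goal.min_sharpe - result.sharpe)
            + max(0.0, result.drawdown - goal.max_drawdown))


def direction(previous: CycleResult, current: CycleResult, goal: Goal) -> str:
    """Compare two cycles by their shortfall against the same goal."""
    before, after = shortfall(previous, goal), shortfall(current, goal)
    if after < before:
        return "toward-goal"
    if after > before:
        return "away-from-goal"
    return "unchanged"


goal = Goal(target_monthly_return=0.15, min_sharpe=1.2, max_drawdown=0.12)
previous = CycleResult(monthly_return=0.06, sharpe=0.8, drawdown=0.14)
current = CycleResult(monthly_return=0.09, sharpe=1.0, drawdown=0.12)
print(direction(previous, current, goal))  # toward-goal
```

Because the three gaps are in different units, a real scorer would likely weight them; the equal weighting here is only for illustration.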

Self-Improving (Scientific Method Loop)

The agent must assemble outcomes, analyze whether they moved toward or away from the goal, form a hypothesis about why, then form a second hypothesis about what to change next — and apply only ONE variable change per cycle. This is the scientific method applied to strategy iteration: change one variable, observe the outcome, make the winning version the new baseline, repeat.

One Variable At A Time

When iterating the strategy, change only a single variable per cycle. If you changed multiple variables and profitability improved, you cannot know which variable was responsible. Single-variable testing produces a clean learning signal that compounds over cycles.
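
The one-variable rule can be enforced mechanically by diffing the candidate strategy against the baseline before a cycle starts. This is a minimal sketch that assumes strategies are flat dictionaries of parameters; the parameter names and the helper names are hypothetical.

```python
_MISSING = object()  # sentinel so a missing key differs from a key set to None


def changed_parameters(baseline: dict, candidate: dict) -> list[str]:
    """Sorted names of parameters whose values differ between the two strategies."""
    keys = baseline.keys() | candidate.keys()
    return sorted(k for k in keys
                  if baseline.get(k, _MISSING) != candidate.get(k, _MISSING))


def require_single_change(baseline: dict, candidate: dict) -> str:
    """Return the one changed parameter, or raise if zero or several changed."""
    changed = changed_parameters(baseline, candidate)
    if len(changed) != 1:
        raise ValueError(f"expected exactly one changed parameter, got {changed}")
    return changed[0]


baseline = {"max_position_size": 0.10, "stop_loss": 0.05, "entry_signal": "momentum"}
candidate = {**baseline, "max_position_size": 0.08}
print(require_single_change(baseline, candidate))  # max_position_size
```

Rejecting a candidate with zero changes matters as much as rejecting one with several: a cycle that changes nothing produces no learning signal.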

Oneshot Prompt Architecture

The entire agent setup — environment detection, strategy onboarding, scaffolding, cloud deployment, and Hermes installation — is triggered by a single copy-paste prompt fed into Claude Code. This reduces setup friction to near-zero and ensures reproducibility. The oneshot prompt improves over time as community feedback is incorporated.

// How do you set up the self-improving trading agent step by step?

1. Define your four agent criteria before writing a single line of code
Before touching any tooling, answer in writing: (1) How will I ensure data accuracy? (2) How will I guarantee 24/7 reliability? (3) What is my exact success definition (target return, Sharpe score, drawdown limit)? (4) What is my exact failure definition? These four answers become the scoring and improvement backbone of the entire agent. Do not skip this — without a well-defined goal, the self-improvement loop has no direction.
2. Obtain the oneshot prompt from the creator's community resource
Retrieve the most current version of the oneshot prompt (Lewis Jackson stores these in 01 Systems community > Classroom > YouTube Video Prompts). Always use the latest version — these prompts are themselves iterated based on community feedback and supersede any version shown in the video.
3. Open Claude Code in your terminal and paste the oneshot prompt
Launch a terminal session with Claude Code. Paste the oneshot prompt in full. The prompt initiates a guided multi-phase onboarding flow — do not try to shortcut or pre-answer phases; let the agent walk through each phase in sequence.
4. Complete Phase 1: Environment Check
The agent detects your OS (Mac or Windows) and available runtimes (e.g. Node.js). It will fork into OS-specific instructions. Confirm your environment when prompted. No action needed beyond confirmation if your environment is detected correctly.
5. Complete Phase 2: Strategy Definition
Choose one of three paths: (A) Point the agent at an existing strategy file on your machine by name; it will locate and parse the file and extract goals and parameters from it. (B) Tell the agent you have no strategy and want a basic one scaffolded; it will create a baseline. (C) Build a new strategy interactively with the onboarding agent. Whichever path you choose, the agent outputs a strategy document that includes the asset, position limits, slippage tolerance, gas reserve, score weights, target return, Sharpe floor, max drawdown, and failure thresholds. Confirm or correct these before proceeding.
6. Complete Phase 3: Scaffold the Hermes Side-State
The agent creates all folders, files, and a Hermes-readable trade ledger. Existing trade history (wins and losses) is converted into structured data Hermes can analyze. Review the generated strategy document and goals document to confirm they accurately reflect your intent — this is the source of truth the self-improvement loop will reference.
7. Complete Phase 4: Strategy Deployment (if not already live)
If you already have a live strategy, this phase may be skipped automatically. If not, the agent will walk you through connecting the APIs needed to execute trades. Validate the API connections before proceeding, because unreliable or misconfigured connections undermine the Accuracy criterion.
8. Authenticate with Railway for 24/7 cloud hosting
When prompted, the agent will attempt a Railway CLI login. If the interactive login fails inside the session, open a split terminal, paste the provided login command, complete browser authentication, then return and type 'done, continuing'. If you don't have a Railway account, create one during this step; check Railway's current pricing for what its free or trial allowance covers. This step ensures Reliability: the agent runs 24/7 regardless of local machine state.
9. Allow the agent to install Hermes automatically
The oneshot prompt installs the Hermes agent as part of the handoff phase. Verify installation by opening a new terminal and typing 'hermes'; if it launches, installation succeeded. Hermes is now the self-learning brain that reviews trades on a weekly cadence, owns portfolio mechanics and score weights, and writes updated strategy iterations.
10. Review the final configuration summary and confirm
The agent will output a full confirmation: deployed strategy name, live asset, Railway hosting status, Hermes brain assignment, review cadence, Sharpe floor, max drawdown, and operating mode. The FIRST Hermes cycle is READ-ONLY — Hermes observes and produces a markdown review but does not write to the live strategy yet. You must manually approve the transition to live mode by editing the Hermes trading strategy YAML. Do not flip to live mode until you have reviewed the first cycle's output.
11. Monitor improvement cycles and approve strategy promotions
Hermes reviews trades on a weekly cadence. It will produce scored hypotheses, change ONE variable, and run the next cycle against the new baseline. Use the check-in commands provided in the final summary to inspect cycle outputs. When Hermes produces a cycle result that moves toward your success definition, it becomes the new baseline. Track directional progress: toward-goal outputs are good signals; away-from-goal outputs inform the next hypothesis.
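
The read-only first cycle and the baseline promotion described in the last two steps can be sketched as a small state update. Every name here (`LoopState`, `run_cycle`, the `score` callable) is hypothetical, and the stand-in scorer returns made-up numbers; the real switch to live mode is made in the Hermes trading strategy YAML, whose schema is not shown in this document.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopState:
    baseline: dict            # current best strategy parameters
    baseline_score: float     # higher means closer to the goal (assumed convention)
    live_mode: bool = False   # False = read-only: review only, never write
    reviews: list = field(default_factory=list)


def run_cycle(state: LoopState, candidate: dict,
              score: Callable[[dict], float]) -> LoopState:
    """Record a review; promote the candidate only in live mode and only if it scores higher."""
    candidate_score = score(candidate)
    improved = candidate_score > state.baseline_score
    state.reviews.append({"candidate": candidate, "score": candidate_score,
                          "improved": improved})
    if state.live_mode and improved:
        state.baseline, state.baseline_score = candidate, candidate_score
    return state


def score(params: dict) -> float:
    """Stand-in scorer with invented values, for illustration only."""
    return 0.55 if params["stop_loss"] == 0.04 else 0.40


state = LoopState(baseline={"stop_loss": 0.05}, baseline_score=0.40)
run_cycle(state, {"stop_loss": 0.04}, score)  # read-only: reviewed, not promoted
state.live_mode = True                         # stands in for the manual approval
run_cycle(state, {"stop_loss": 0.04}, score)  # live: promoted to the new baseline
```

Keeping the review log separate from the baseline means the first cycle still yields a record to inspect even though nothing is written to the strategy.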

// What does the self-improving trading agent look like in practice?

A trader has a manually-built momentum strategy for a Layer-1 token with 6 weeks of trade history (roughly 50 trades, mix of wins and losses) but has never been able to systematize improvement.

Feed the existing strategy file into Phase 2 path A. The agent extracts current parameters (position limits, entry signals, slippage tolerance) and trade history. Phase 3 converts the 50 trades into a Hermes-readable ledger. Success is defined as 15% monthly return with Sharpe ≥ 1.2 and max drawdown ≤ 12%; failure is defined as two consecutive months below 5% return. Hermes runs read-only for Week 1, identifies that position sizing on losing trades was 40% larger than on winning trades (one variable), proposes reducing max position size by 20% as the sole change for Cycle 2, and the trader approves. This becomes the new baseline if Cycle 2 moves toward goal.
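
The Week 1 finding in this example (larger positions on losing trades) is the kind of comparison that can be computed directly from a trade ledger. The sketch below assumes a ledger as a list of records with hypothetical `size` and `pnl` fields; the actual Hermes-readable ledger format is not specified here, and the four trades are invented for illustration.

```python
from statistics import mean


def sizing_gap(ledger: list[dict]) -> float:
    """How much larger the average losing position is than the average winning one.

    A result of 0.40 means losing trades were on average 40% larger.
    Trades with pnl == 0 are ignored.
    """
    wins = [t["size"] for t in ledger if t["pnl"] > 0]
    losses = [t["size"] for t in ledger if t["pnl"] < 0]
    if not wins or not losses:
        raise ValueError("need at least one winning and one losing trade")
    return mean(losses) / mean(wins) - 1.0


# Illustrative ledger, not real trade data.
ledger = [
    {"size": 100.0, "pnl": 12.0},
    {"size": 100.0, "pnl": 8.0},
    {"size": 150.0, "pnl": -20.0},
    {"size": 130.0, "pnl": -9.0},
]
print(f"{sizing_gap(ledger):.0%}")  # prints 40%
```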

A complete beginner has no existing strategy, only a small starting capital, and wants to trade a major crypto asset.

In Phase 2, select path B (no existing strategy). The agent scaffolds a basic baseline strategy appropriate for the asset. Success and failure definitions are set conservatively given the capital size — the agent will flag any success definition that is mathematically impossible given starting capital (e.g. targeting 10x returns in 30 days on minimal capital). Hermes begins the self-improvement loop from this baseline, changing one variable per weekly cycle (e.g. first testing a tighter stop-loss, then a different entry signal), compounding improvements toward the defined goal over time.

// What mistakes should I avoid when building a self-improving trading agent?

Defining a success goal that is impossible relative to your starting capital — the agent cannot self-improve toward an unreachable target and will waste cycles. Sanity-check: $10 starting capital cannot target $1M/month.
Skipping or vaguely defining the failure threshold — without a clear failure definition, the agent has no polarity to orient its improvement direction. 'Losing money' is not a failure definition; 'drawdown exceeding 15% in any 30-day window' is.
Changing multiple strategy variables between cycles — if you or a secondary agent modifies several parameters simultaneously, the learning signal is corrupted. You cannot attribute the result to any single change. Enforce the scientific method: one variable per cycle.
Flipping to live trading mode before reviewing the first Hermes read-only cycle — the first cycle exists to validate that Hermes has correctly understood your strategy and goals. Skipping this review risks live capital being managed by a misconfigured loop.
Using inaccurate or inconsistently sourced data — different AI interpretations of the same news article can produce different conclusions. Establish objective, rules-based interpretation criteria for all non-numerical inputs before deploying.
Hosting the agent locally instead of on Railway — a local-only agent goes offline when your machine does, violating the Reliability criterion. The 24/7 cloud hosting is non-negotiable for the system to function as designed.
Treating the oneshot prompt as static — the prompt improves over time based on community feedback. Always pull the latest version from the community resource rather than reusing a version from a video recording.
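
A failure definition such as 'drawdown exceeding 15% in any 30-day window' is precise enough to test in code, which is what separates it from 'losing money'. The sketch below is one possible reading of that rule, assuming a series of positive daily equity values and treating a window as 30 consecutive daily values; the function names are hypothetical.

```python
def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline in the series, as a fraction of the peak."""
    if not equity:
        return 0.0
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def breaches_failure(daily_equity: list[float], window: int = 30,
                     limit: float = 0.15) -> bool:
    """True if drawdown exceeds `limit` within any `window` consecutive daily values."""
    starts = range(max(1, len(daily_equity) - window + 1))
    return any(max_drawdown(daily_equity[i:i + window]) > limit for i in starts)


print(breaches_failure([100, 104, 98, 90, 87, 95]))  # True: 104 -> 87 is about 16.3%
print(breaches_failure([100, 104, 98, 95, 97]))      # False: 104 -> 95 is about 8.7%
```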

// What key terms should I know for the self-improving trading agent framework?

Self-Improving Trading Agent: An autonomous trading agent that completes a continuous loop: execute strategy → observe outcome → analyze toward-goal or away-from-goal → form hypothesis → change one variable → update strategy → repeat. Distinct from a static bot that executes fixed rules without learning.
Hermes Agent: The self-learning AI brain installed into the trading agent stack. Hermes natively learns from every interaction and engagement without requiring manual retraining instructions. It owns portfolio mechanics, score weights, and weekly strategy review cycles.
Oneshot Prompt: A single copy-paste prompt that, when fed into Claude Code, orchestrates the entire agent setup end-to-end: environment detection, strategy onboarding, scaffolding, cloud deployment, and Hermes installation. No multi-step manual configuration required.
Well-Defined Goal: A specific, measurable destination that includes both a success definition (e.g. target return, minimum Sharpe score) and a failure definition (e.g. maximum drawdown, minimum return floor). The agent uses this polarity to orient every improvement cycle.
Sharpe Score: A numerical metric for the risk-adjusted return of a trading strategy, commonly called the Sharpe ratio: average return in excess of a risk-free rate, divided by the standard deviation of returns. Used as a quantitative success/failure threshold in the agent's goal definition rather than relying on subjective performance assessment.
Hermes-Readable Ledger: A structured file of all historical trades (wins and losses) converted into a format Hermes can parse, score, and learn from. Generated automatically during Phase 3 scaffolding from existing trade history.
Scientific Method Loop: The self-improvement cycle protocol: change only ONE variable per iteration, observe the outcome, promote the better-performing version to the new baseline, then iterate again. Ensures clean attribution of performance changes to specific strategy modifications.
Railway: Cloud hosting platform used to run the trading agent 24/7, independent of the user's local machine. Integrates with the terminal via CLI so strategy updates are pushed automatically without manual redeployment.
Read-Only Cycle: The first Hermes review cycle, in which Hermes observes, scores, and produces a markdown analysis of the strategy but does not write any changes to the live strategy. The user must manually approve the transition to live write mode.
Score Weights: Configurable parameters in the strategy document that determine how different trade outcomes are scored relative to the defined goal. Hermes owns and adjusts score weights as part of its self-improvement process.
Weekly Cadence: The default review and improvement cycle frequency for Hermes — once per week, with a 3-day offset from any secondary agent (e.g. Cornelius) to prevent simultaneous conflicting parameter updates.
01 Systems: Lewis Jackson's free community resource where all oneshot prompts from his videos are stored, versioned, and updated. Always retrieve prompts from here rather than manually transcribing from video to ensure you have the most improved version.
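
For reference, the Sharpe Score listed among the terms above can be computed from a series of per-period returns. This is a minimal sketch of the standard Sharpe ratio, not necessarily how Hermes calculates it; the square-root annualisation is a common convention that assumes returns are independent from period to period.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(returns: list[float], risk_free_rate: float = 0.0,
                 periods_per_year: int = 1) -> float:
    """Mean excess return divided by the sample standard deviation of returns.

    `returns` and `risk_free_rate` are per-period fractions. Setting
    periods_per_year above 1 (e.g. 52 for weekly returns) scales the result
    by sqrt(periods_per_year) to annualise it.
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free_rate for r in returns]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero variance")
    return mean(excess) / spread * sqrt(periods_per_year)


weekly_returns = [0.02, -0.01, 0.03, 0.00]  # illustrative values
print(round(sharpe_ratio(weekly_returns), 4))                       # per-period
print(round(sharpe_ratio(weekly_returns, periods_per_year=52), 4))  # annualised
```

With only a handful of weekly observations the estimate is noisy, so a Sharpe floor is more meaningful once the ledger covers many cycles.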

// FREQUENTLY ASKED QUESTIONS

What is the Lewis Jackson Self-Improving Trading Agent Framework?

It is a framework for building an autonomous trading agent that continuously learns from its own trade outcomes and iteratively improves its strategy using the Hermes agent's self-learning loop. Unlike static trading bots, it applies the scientific method — changing one variable per cycle — to compound improvements over time. The entire setup is triggered by a single oneshot prompt pasted into Claude Code and hosted 24/7 on Railway.

What is the Hermes agent in the self-improving trading framework?

Hermes is the self-learning AI brain installed into the trading agent stack. It natively learns from every trade interaction without requiring manual retraining. Hermes owns portfolio mechanics, score weights, and weekly strategy review cycles. Its first cycle is always read-only — it observes and produces analysis but doesn't modify the live strategy until you manually approve the transition to live write mode.

How do I set up the Lewis Jackson self-improving trading agent?

Paste the latest oneshot prompt from the 01 Systems community into Claude Code. The prompt initiates a multi-phase guided flow: environment detection, strategy definition (use an existing strategy or have one scaffolded), Hermes side-state scaffolding, Railway cloud deployment, and Hermes installation. The entire process is orchestrated by the single prompt — no multi-step manual configuration is needed. Always pull the latest prompt version from the community resource.

How does the self-improvement loop work in the Lewis Jackson trading agent?

Hermes reviews trades on a weekly cadence. It assembles outcomes, analyzes whether they moved toward or away from the defined goal, forms a hypothesis about why, then proposes changing exactly one variable. If the cycle result improves performance, that version becomes the new baseline. This scientific method loop — single-variable testing with clean attribution — compounds improvements over successive cycles without corrupting the learning signal.

How does the Lewis Jackson trading agent compare to a regular trading bot?

A regular trading bot executes fixed rules without learning — if market conditions change, performance degrades until a human intervenes. The Lewis Jackson framework adds a continuous self-improvement loop where the agent analyzes its own outcomes, identifies what worked or didn't, and iterates its strategy automatically. It also enforces the scientific method (one variable change per cycle) and requires well-defined success and failure thresholds, which most static bots lack entirely.

When should I use the self-improving trading agent framework?

Use it when you have a trading strategy (or want one built) and want to architect it into a system that adapts over time without manual retraining. It's ideal when you're tired of static bots that degrade in changing markets, when you want 24/7 uninterrupted execution via cloud hosting, or when you want a structured scientific approach to strategy iteration rather than ad-hoc parameter tweaking.

What results can I expect from the Lewis Jackson self-improving trading agent?

Results depend on your starting strategy, asset, capital, and how realistic your success definition is. The framework is designed for directional progress: each weekly cycle either moves toward your goal (becoming the new baseline) or produces a learning signal that informs the next hypothesis. Expect incremental, compounding improvements rather than overnight transformation. The agent will flag mathematically impossible targets, such as $1M/month from $10 starting capital, before wasting cycles.

Do I need coding experience to use the Lewis Jackson trading agent framework?

No deep coding experience is required. The oneshot prompt architecture handles environment detection, scaffolding, deployment, and Hermes installation automatically through Claude Code. You interact via guided prompts in a terminal. However, you should be comfortable using a terminal, creating a Railway account, and reviewing strategy documents. Understanding basic trading concepts like Sharpe ratio, drawdown, and position sizing will help you define meaningful goals.

What inputs do I need before building the self-improving trading agent?

You need five required inputs: the target asset you want to trade, a specific measurable success definition (e.g., 15% monthly return, Sharpe ≥ 1.2), a failure definition (e.g., drawdown exceeding 12%), your starting capital amount, and a Railway.app account for 24/7 cloud hosting. An existing trading strategy is optional — if you don't have one, the onboarding agent will scaffold a basic baseline for you.

Why does the framework only change one variable per improvement cycle?

Single-variable testing produces a clean learning signal. If you change multiple variables and profitability improves, you cannot determine which change was responsible. This corrupted attribution compounds over time, leading to strategies built on false assumptions. By enforcing one variable per cycle — the scientific method — every improvement is clearly attributable, and the compounding effect of verified improvements produces reliable, directional progress toward your goal.

What is the oneshot prompt in the Lewis Jackson framework?

The oneshot prompt is a single copy-paste prompt that orchestrates the entire agent setup when fed into Claude Code. It handles environment detection, strategy onboarding, file scaffolding, cloud deployment to Railway, and Hermes installation — no multi-step manual configuration needed. The prompt itself improves over time based on community feedback, so always pull the latest version from the 01 Systems community resource rather than reusing one from a video.
