
Senior ML Engineer
Added
2/5/2026
This Role is Closed
About Evercred
Evercred is transforming physician credentialing from a 90–120 day manual process into a roughly 14–30 day automated workflow using AI agents. We are building a production system that orchestrates multi-step verification across state medical boards, education institutions, and employers, with real-time monitoring, intelligent escalation, and compliant audit trails.
Stage: Pre-revenue, 7 signed LOIs (~$250K), targeting $1M ARR by Summer 2026
Stack: Next.js, TypeScript, PostgreSQL, Prisma, Anthropic Claude API, HashiCorp Vault, LangChain-style orchestration
The Role
This is a fractional CTO-level individual contributor role focused on architecture, orchestration, observability, and reliability of a multi-agent verification system.
We are working with an external agency on agent implementation and need a senior in-house engineer to:
- Establish production monitoring and observability
- Review architecture for security and scalability
- Optimize agent performance and cost
- Build orchestration, decision logic, and escalation systems
- Ensure readiness for scale (25K+ physician wallets, 150+ enterprise customers)
Why this is interesting
- First production AI agent system in healthcare credentialing
- High-impact problem: eliminating 90–120 day bottlenecks that cost hospitals millions
- True multi-agent orchestration (parallel workflows, failure modes, latency variance)
- Non-deterministic systems engineering on LLM infrastructure
- Regulated environment: HIPAA-adjacent, Joint Commission auditability, long-term retention
What You’ll Do
Phase 1: Monitoring & Architecture Review
(Weeks 1–6 ½-time | 1–3 weeks full-time)
Observability & reliability
- Implement monitoring for agent workflows (performance, cost, success/failure)
- Build dashboards, alerting, and escalation detection
- Create debugging tools for non-deterministic agent behavior
Security & compliance
- Review architecture for sensitive PII handling (SSN, credentials, portals)
- Validate encryption, access controls, audit logging
Deliverables:
Monitoring dashboards, alerting system, security review doc, cost projection model
Phase 2: Orchestration & Decision Logic
(Weeks 7–12 ½-time | 4–8 weeks full-time)
Agent orchestration
- Design multi-agent workflow orchestration
- Implement job queues, polling, webhooks, retries, timeouts
- Build agent health monitoring and failover to manual workflows
Decision & escalation intelligence
- Auto-verify vs manual review logic
- Confidence scoring (data quality, source reliability, discrepancies)
- Explainable escalation routing
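As a rough illustration of what the decision logic above could look like, here is a minimal sketch that combines the three listed signals into a score and returns an explainable routing decision. The type names, weights, and the 0.85 threshold are assumptions made up for this example, not part of Evercred's system.

```typescript
// Illustrative only: signal names, weights, and the threshold are assumptions.
interface VerificationSignals {
  dataQuality: number;       // 0..1, completeness and consistency of the returned record
  sourceReliability: number; // 0..1, how much the responding source is trusted
  discrepancyCount: number;  // mismatches between claimed and verified fields
}

interface RoutingDecision {
  route: "auto_verify" | "manual_review";
  score: number;
  reasons: string[]; // human-readable explanation for any escalation
}

const WEIGHTS = { dataQuality: 0.4, sourceReliability: 0.6 };
const AUTO_VERIFY_THRESHOLD = 0.85;

function routeVerification(signals: VerificationSignals): RoutingDecision {
  // Weighted average of the two quality signals.
  const score =
    WEIGHTS.dataQuality * signals.dataQuality +
    WEIGHTS.sourceReliability * signals.sourceReliability;

  // Every reason collected here forces manual review and explains why.
  const reasons: string[] = [];
  if (signals.discrepancyCount > 0) {
    reasons.push(`${signals.discrepancyCount} discrepancy(ies) between claimed and verified data`);
  }
  if (score < AUTO_VERIFY_THRESHOLD) {
    reasons.push(`score ${score.toFixed(2)} is below threshold ${AUTO_VERIFY_THRESHOLD}`);
  }

  return {
    route: reasons.length === 0 ? "auto_verify" : "manual_review",
    score,
    reasons,
  };
}
```

Returning the reasons alongside the route is one simple way to keep escalations explainable: a reviewer sees why a case was not auto-verified without re-running the agent.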
Performance & integrations
- Prompt optimization, caching, and cost controls
- API integrations (FSMB PDC, ABMS CertiFACTS, NPI, OIG/SAM, others)
- Structured/unstructured response parsing
Deliverables:
Orchestration framework, decision logic system, 3+ API integrations live, 20–30% cost-per-verification reduction
Phase 3: Scale & Production Readiness (Ongoing)
- Scale to 25K+ wallets and 150+ organizations
- Rate limiting, circuit breakers, graceful degradation
- Immutable audit trails with 7-year retention
- Agent performance analytics, anomaly detection
- Benchmarking and regression testing
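For candidates less familiar with the circuit-breaker pattern mentioned above, the following is a minimal sketch of the idea: after repeated failures against an external source, stop calling it for a cooldown period, then allow a single trial request. The class name, thresholds, and injectable clock are assumptions for illustration and do not describe an existing Evercred component.

```typescript
// Illustrative sketch of a circuit breaker; all names and values are assumptions.
type BreakerState = "closed" | "open" | "half_open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, so the cooldown can be tested without waiting
  }

  // Returns true if a request may be attempted right now.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half_open"; // cooldown elapsed: allow a trial request
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial request, or too many consecutive failures, opens the breaker.
    if (this.state === "half_open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

While the breaker is open, callers can degrade gracefully, for example by queuing the verification for manual handling instead of retrying a source that is already failing.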
What You’ll Bring
Required (5–8 years of experience)
Production AI systems
- Shipping LLM-powered systems to production
- Building reliability on non-deterministic models
- Debugging hallucinations, regressions, agent failures
- AI observability and monitoring
Technical
- LLM systems: prompting, error handling, cost management
- Backend: TypeScript/Node.js, PostgreSQL, APIs, job queues
- Orchestration: LangChain/LlamaIndex-style frameworks
- Security: encryption, access controls, audit logging
- Observability: metrics, logging, alerting, distributed debugging
Strongly Preferred
- Healthcare compliance (PII, Joint Commission, NCQA)
- Web automation/scraping (Playwright, Puppeteer)
- Multi-agent coordination patterns
- LLM cost optimization
Work Style
- Comfortable in pre-revenue ambiguity
- Balances speed with compliance
- Strong async communication
- Effective with external contractors
- Pragmatic about technical debt
What You’ll Get
Impact
- Architect the intelligence layer of a first-to-market AI verification system
- Direct contribution to revenue milestones and scale
Learning
- Deep production LLM and agent orchestration experience
- Regulated healthcare systems engineering
- Fractional CTO-level strategic ownership
Flexibility
- Fully remote, async-first
- Deliverables over hours
- Flexible schedule with core overlap
Compensation
- $8K+/month (scales with hours + experience)
- Potential conversion to full-time CTO role
- 0.25%–6% equity (range depends on part-time vs. full-time; 1-year cliff)
How We Work
- Kanban, weekly delivery cycles
- Weekly demos (async-recorded)
- Daily async standups
- Weekly sync (45 min)
- Tools: GitHub, Slack
- Definition of Done: merged, deployed, documented/demoed
- Target: 5–7 medium stories/week (quality > quantity)
Interview Process
- Intro call (30 min)
- Technical deep-dive (60 min)
- Async code/architecture review
- Culture fit (30 min)
- Paid trial (1 week) — ship monitoring or orchestration component
Timeline: 1–2 weeks end-to-end
How to Apply
Note: This is a syndicated job post. Fractional Jobs found it on the web, but we are not working with the client directly, so we don't have control over or knowledge of the application process. To apply, click on the "View Application" button and follow the application's instructions. Let us know how it goes!
How to Get in Touch
Hit that "Request Intro" button below. Include any relevant links so we can get to know you better.
Your brief intro note should clearly address:
If we think there's a fit, we'll reach out to schedule an intro call. Looking forward!