Phase 1 · Feasibility · Time · Pilot recommendation

One workflow, built to be owned by your team in a month.

From the seven candidate areas in VOG's bottleneck worksheet, we recommend a shipping-document cross-check pilot covering PI, Commercial Invoice, Packing List and Import Certificate. It is the highest-value, lowest-risk first move — and it builds the foundation that later unlocks finance follow-up, supplier tracking and reporting.

Recommended pilot
Document cross-check · PI · Invoice · PL · IC
Weighted score
54/55 · Next: C, 43
Phase 1 effort
23.5–36.5 developer-days · mid ≈ 30
Delivery window
≈ 1 month · Tomek 25–30% · Lulu 70–75%
01

Why this is the right first move

The biggest source of manual checking in the worksheet is also the cleanest to scope: one team, one document set, one acceptance metric — and the same engine later powers the next phases.

A

Largest, most-reported pain. Document work is 21 items / ~124 hrs per week — the team's biggest category. Cross-check pain is named independently by Sophia, Jessica, Eva and Kelly, so it is structural rather than personal.

B

Bounded, measurable, easy to hand over. Read → compare → a person approves. No ERP write-back, no full automation. Acceptance is a number on an evaluation set, not a feeling.

C

Builds a reusable foundation. The DCSA-aligned extraction this pilot produces is exactly what later unlocks finance follow-up (C), supplier tracking (D), and a real dashboard (G).

D

De-risked by precedent. A 2024 production deployment at a Silicon Valley fintech runs the same extract-and-cross-check pattern on heterogeneous shipping documents, confirming feasibility at production scale.

02

Seven areas, ranked transparently

Each area scored 1–5 against six criteria (business value ×3, feasibility ×2, time-to-value ×1, measurability ×2, scope & handover ×1, strategic leverage ×2). Max 55. Bars below are the resulting weighted totals.

Legend: Lead pilot · Part of lead · Fast-follow (1b) · Add-on · Phase 2 · Defer
A · Document cross-check
PI / Invoice / PL / IC · 13–20 d · Tomek 3–5 / Lulu 10–15
Lead pilot · High feasibility · clean acceptance
54
B · Shipping-doc review
Extends A · +3.5–6 d · Tomek 0.5–1 / Lulu 3–5
Part of A · Same engine, more doc types
C · Finance & payment follow-up
6–10 d · Tomek 1–2 / Lulu 5–8 · ERP-dependent
Fast-follow (Phase 1b) · High value · gated by ERP access
43
F · English email support
1.5–3.5 d · Tomek ≈ 0.5 / Lulu 1–3
Add-on · Fast · saves 8 hrs/wk · not approval-gated
39
D · Supplier & delivery tracking
10–15 d · Tomek 2–3 / Lulu 8–12
Phase 2 · Data consolidation first
34
G · Reporting / visibility
4–7 d · Tomek ≈ 1 / Lulu 3–6
Add-on / Phase 2 · Best as a layer on A's output
34
E · Complaint summaries
3.5–6 d · Tomek 0.5–1 / Lulu 3–5
Defer · Low volume → low lead-ROI
32
The Excel worksheet's own Top-20 is hours-weighted, which puts design/CAD at the top even though those items are marked low AI-fit. The weighted ranking above corrects for that bias and surfaces the genuine document/finance/email cluster.
03

What the workflow actually does

End-to-end, six steps. The model reads, code compares, a person decides — no step does more than one job.

01
Ingest
code
Pick up a shipment's document set (PI, Commercial Invoice, Packing List, Import Certificate) from an agreed location.
02
Classify
model
Identify each document's type and route it to a type-specific extraction prompt.
03
Extract → DCSA schema
model
Read key fields into a standardised JSON record aligned to the DCSA eBL 3.0 trade-document model.
04
Compare · 5 anchors
code
Deterministic field-by-field check against five cross-document anchors. Item-code aliases normalised first (XXL ≡ 2XL).
05
Review & approve
human
A reviewer sees the documents, extracted fields, and any flagged discrepancies side-by-side; approves, rejects or corrects.
06
Log & report
code
Outcome is logged and surfaced in a thin reporting view: flagged vs cleared, by discrepancy category, over time.
Model: language understanding only · Code: deterministic, auditable, no hallucination · Human: final decision authority
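The alias normalisation in step 04 can be sketched as follows. This is a minimal illustration, not the pilot's implementation: only the XXL ≡ 2XL pair comes from the text above, while the dash-separated item-code format (e.g. "JKT-100-XXL"), the function names and the alias table layout are assumptions to be replaced by VOG's real alias table once it is confirmed in Phase 0.

```python
# Hypothetical alias table: only the XXL -> 2XL pair is taken from the proposal.
SIZE_ALIASES = {"XXL": "2XL"}


def normalise_item_code(code: str, aliases: dict[str, str] = SIZE_ALIASES) -> str:
    """Trim, upper-case, unify separators, and map known aliases token by token."""
    tokens = code.strip().upper().replace("_", "-").split("-")
    return "-".join(aliases.get(token, token) for token in tokens)


def same_item(a: str, b: str) -> bool:
    """Two item codes match if they are identical after normalisation."""
    return normalise_item_code(a) == normalise_item_code(b)


# Assumed code format, for illustration only.
print(same_item("JKT-100-XXL", "jkt_100_2xl"))
```

Keeping this step in plain code, rather than asking the model to judge equivalence, keeps the comparison deterministic and auditable.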

The five cross-document anchors

The comparison checks these consistency points across every shipment. They are the contract between documents — and the basis of acceptance testing.

01
Reference / PO
primary alignment key
02
Parties
name · tax id · role
03
Cargo
desc · HS · qty · weight · volume
04
Charges
unit × qty = amount · ccy · term
05
Container / seal
container no. · seal no.
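A deterministic comparison over these anchors might look like the sketch below. It is illustrative only: the field names are hypothetical (they are not DCSA attribute names), the Parties anchor is omitted for brevity, and the sample values are invented.

```python
from dataclasses import dataclass, replace
from decimal import Decimal


@dataclass(frozen=True)
class LineRecord:
    """One extracted line. Field names are hypothetical, not DCSA attribute names."""
    po_reference: str     # anchor 01
    quantity: int         # anchor 03 (cargo)
    unit_price: Decimal   # anchor 04 (charges)
    amount: Decimal
    currency: str
    container_no: str     # anchor 05
    seal_no: str


COMPARED_FIELDS = ("po_reference", "quantity", "currency", "container_no", "seal_no")


def charge_flags(rec: LineRecord) -> list[str]:
    """Anchor 04: unit price x quantity must equal the stated amount."""
    expected = rec.unit_price * rec.quantity
    if expected == rec.amount:
        return []
    return [f"amount: expected {expected}, found {rec.amount}"]


def cross_check(a: LineRecord, b: LineRecord) -> list[str]:
    """Field-by-field comparison of two documents, plus the arithmetic check on each."""
    flags = [
        f"{name}: {getattr(a, name)!r} != {getattr(b, name)!r}"
        for name in COMPARED_FIELDS
        if getattr(a, name) != getattr(b, name)
    ]
    return flags + charge_flags(a) + charge_flags(b)


# Invented sample values: a PI and a commercial invoice that disagree on quantity.
pi = LineRecord("PO-1", 100, Decimal("2.50"), Decimal("250.00"), "USD", "CONT-1", "SEAL-1")
invoice = replace(pi, quantity=90, amount=Decimal("225.00"))
print(cross_check(pi, invoice))  # ['quantity: 100 != 90']
```

Using `Decimal` rather than floats avoids rounding noise in the amount check, so a flag always reflects a real difference in the documents.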
In scope · what it does
  • Reads documents, extracts fields, flags discrepancies
  • Presents flagged items for human approval
  • Writes outcomes to a thin reporting view
  • First cut: 1 doc pair × 1 supplier lane (e.g. IXS), digital-first
Out of scope · what it does not do
  • No write-back to ERP or any system of record
  • No automatic emails or actions without human approval
  • No action on flagged shipments without a person's decision
  • Other doc types / supplier lanes added later via add-on B
04

One month, gated and parallelised

Discovery is a real step with an exit gate, not a kickoff. The build clock starts only when sample documents, schema, access, a single point of contact and tooling are confirmed.

Timeline: Week 1 · Week 2 · Week 3 · Week 4
Phase 0 · Gated discovery: Samples · schema · access · SPOC · tooling
Tomek · Architect: Architecture · DCSA schema · data-source audit
Lulu · Implementer: Extraction · comparison engine, then Review UI · reporting view
Both · Acceptance & handover: Eval · acceptance · training · handover
Why the gate matters

The biggest risk to the timeline is not work effort but elapsed time — many stakeholders, scattered documents, format surprises. The gated discovery and a single point of contact at VOG turn this from silent overrun into a visible client dependency: if those items are delayed, the timeline shifts via change control, not by squeezing the build window.

Phase 1 component | Low | High | Mid | Notes
Phase 0: Discovery & setup (gated) | 3 | 5 | 4 | Samples, schema, access, SPOC, tooling
Core A: extraction + comparison + review UI | 13 | 20 | 16.5 | First cut: 1 doc pair × 1 supplier lane
Add-on: thin reporting view | 2.5 | 3.5 | 3 | Rides on the structured output
Cross-cutting: eval, testing, docs, handover | 3 | 5 | 4 | Including team training
Contingency / coordination buffer | 2 | 3 | 2.5 | Scan quality, ERP access unknowns
Phase 1 total (developer-days) | 23.5 | 36.5 | ≈ 30 | ~25–30% Tomek · ~70–75% Lulu
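The Phase 1 total is the plain sum of the component estimates above. A short check of that arithmetic, using only the figures from the table (the dictionary keys are shortened labels):

```python
# (low, high, mid) developer-days per component, copied from the estimate table.
COMPONENTS = {
    "Phase 0 discovery & setup": (3.0, 5.0, 4.0),
    "Core A build": (13.0, 20.0, 16.5),
    "Thin reporting view": (2.5, 3.5, 3.0),
    "Eval, testing, docs, handover": (3.0, 5.0, 4.0),
    "Contingency / coordination": (2.0, 3.0, 2.5),
}

# Sum each column across all components.
low, high, mid = (sum(column) for column in zip(*COMPONENTS.values()))
print(low, high, mid)  # 23.5 36.5 30.0
```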
05

Acceptance is a number, not a feeling

Targets are proposed below and confirmed jointly with VOG during Phase 0 against an agreed evaluation set built from real historical documents (including the known problem cases).

≥ 95%
Discrepancy recall on the eval set — share of real issues we catch
≥ 90%
Discrepancy precision — share of flags that are real (not false alarms)
0
False positives on a clean control set — reviewers are not flooded
≥ 70%
Reduction in per-shipment check time vs Phase 0 baseline
≥ 0.95
Field-extraction F1 on the key cross-document anchor fields

Acceptance is signed off when all proposed targets are met on the agreed eval set and the workflow has been demonstrated end-to-end with a VOG reviewer. A regression set of top templates (IXS / HY / Bangladesh) is run before any prompt change ships.
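How the recall and precision targets could be computed on the eval set is sketched below. The discrepancy IDs and counts are invented for illustration, the function names are assumptions, and the thresholds are the proposed values, which remain subject to confirmation in Phase 0. The same F1 formula would apply to field extraction, computed per anchor field.

```python
def precision_recall(flagged: set[str], actual: set[str]) -> tuple[float, float]:
    """Precision: share of flags that are real. Recall: share of real issues caught."""
    true_positives = len(flagged & actual)
    precision = true_positives / len(flagged) if flagged else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def meets_targets(precision: float, recall: float,
                  min_precision: float = 0.90, min_recall: float = 0.95) -> bool:
    """Compare against the proposed (not yet final) acceptance thresholds."""
    return precision >= min_precision and recall >= min_recall


# Invented eval data: 20 labelled discrepancies; the workflow catches 19 and raises 1 false alarm.
actual = {f"issue-{i}" for i in range(20)}
flagged = {f"issue-{i}" for i in range(19)} | {"false-alarm-1"}

p, r = precision_recall(flagged, actual)
print(p, r, meets_targets(p, r))  # 0.95 0.95 True
```

Because the eval set is labelled once and kept fixed, the same script can be rerun on the regression templates before any prompt change ships.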

06

SOW essentials at a glance

Mapped one-to-one to the items the client asked the Statement of Work to define. The full text lives in the Phase 1 Assessment document.

10.1 · Deliverables

What VOG receives

  • Live cross-check workflow
  • Review & approval interface
  • Thin reporting view
  • Labelled eval set + test report
  • SOP, prompt library, runbook
  • Two training sessions (recorded)
10.2 · Selected workflow

The specific pilot

Shipping-document cross-check for PI / Invoice / Packing List / Import Certificate. First cut: 1 doc pair × 1 supplier lane, digital-first, locked at the end of Phase 0.

10.3 · Acceptance

Sign-off basis

Targets in section 05 above, met on the agreed eval set, plus end-to-end demonstration with a VOG reviewer. Final targets fixed in Phase 0.

10.4 · IP ownership

VOG owns everything built

Workflow, prompts, schema, review interface, reporting view, dashboards, code, documentation, configurations and training materials. IP assigns to VOG on payment.

10.5 · Data security

No training on VOG data

  • AI tool agreed in Phase 0 on a contractual no-training tier
  • Confidentiality, least-privilege access, residency confirmed
  • Access revoked at handover
10.6 · Fixed fee includes

What is covered

  • Phase 0 discovery and setup
  • Core build of the first-cut workflow
  • Thin reporting view
  • Eval set + acceptance testing
  • Handover pack & two training sessions
  • Tomek & Lulu time during Phase 1

Not included: third-party LLM / cloud subscriptions (VOG-owned account) and any travel.

10.7 · Out of scope

Explicitly not in Phase 1

  • ERP write-back / full automation
  • Doc types or supplier lanes beyond the first cut
  • Phase 2: C finance, D tracking, full G dashboard
  • Historical-data cleanup beyond what the pilot needs
  • Production SLAs / 24-7 / managed service
10.8 · Change control

How extras get approved

  1. Either party proposes a change
  2. ATS notes impact: days, schedule, fee
  3. VOG approves in writing (email is fine)
  4. Only then does ATS start the extra work
10.9 · Training & handover

So VOG can run it

  • Session 1 (90 min): operating the workflow
  • Session 2 (90 min): configuration & maintenance
  • SOP, recordings, runbook, FAQ
  • Two weeks of post-handover support included
07

How decisions were made

Scoring criteria & weights

Designed so high scores are good for VOG and ATS alike — the best first pilot maximises client value while minimising delivery risk.

01
Business value
Hours saved + cost of errors avoided (freight/customs rework, payment disputes)
×3 · highest
02
Feasibility / confidence
Capability maturity × data accessibility — discounts high-value but risky items
×2
03
Time-to-value
Less effort to a working result scores higher
×1
04
Measurability
Whether a clean acceptance metric exists — protects both sides
×2
05
Scope & handover
Bounded to one team so VOG can own it after handover
×1
06
Strategic leverage
Builds a reusable foundation that unlocks later phases
×2
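The weighted total follows directly from these six weights. A minimal sketch, in which the criterion keys are shortened labels chosen for illustration and the per-criterion scores in the usage lines are hypothetical, not the worksheet's actual scores:

```python
# Weights as listed above; they sum to 11, so the maximum is 11 x 5 = 55.
WEIGHTS = {
    "business_value": 3,
    "feasibility": 2,
    "time_to_value": 1,
    "measurability": 2,
    "scope_handover": 1,
    "strategic_leverage": 2,
}


def weighted_total(scores: dict[str, int]) -> int:
    """Sum of weight x score over the six criteria; each score must be 1 to 5."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the six criteria")
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name}: score {score} is outside 1-5")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical score sheets, for illustration only.
all_fives = {name: 5 for name in WEIGHTS}
print(weighted_total(all_fives))                          # 55
print(weighted_total({**all_fives, "time_to_value": 4}))  # 54
```

With integer scores, a total of 54 is reachable only by scoring 4 on one of the two ×1 criteria and 5 on everything else, which shows how close the lead pilot sits to the ceiling.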

Open questions — what to confirm next

These cross-cutting items most affect the estimate. They are sent to the client during Phase 0 and the answers feed the SOW.

Document format — digital PDF vs scanned ratio?
scopes A
Document volume — per week / month?
ROI
ERP system — API or export available?
gates C / D / G
Where documents live — NAS / Google / email?
access
Approved AI tooling — security & no-training tier?
security
Item-code alias table — does one already exist?
comparison
08

Excel source references

For cross-checking against the VOG bottleneck worksheet. Each entry points to a sheet, item number and priority score so the assessment can be verified at the source.

Aggregate · 'Team Overview' / '團隊總覽' sheets: 139 items, 11 staff · ≈ 711 hrs/wk · 28 High AI-fit · 39 scored ≥ 15 · Documents 21 items / 124h, Finance 49h, Email 38.5h.
Method caveat · 'Team Overview EN' ranks 1, 2, 6, 7, 8, 16 = Ken (design / CAD), score 18–22, AI-fit Low / "No": the hours-weighted bias the assessment corrects for.
A · Sophia #18 IXS IC+PI+HY PL cross-check, ≈ 10 h/wk, score 17; document revised 6× · Jessica #5 Invoice / PL compare, 3 h/wk, 'Team Overview EN' rank 5 / 20.2 · Eva invoice / packing qty, amount, model errors.
B · Sophia #2 IN+PL (score 13), #4 customs-declaration check, #5 cert-of-origin check, #6 BL draft, #9 advance-payment PI.
C · Jessica #4 payment tracking (rank 9 / 19), #9 accounts receivable (13 / 18.5), #7 bookkeeping review (12 / 18.5) · Jane AR + cash-level alert · Kelly #9 rank 4 / 20.4.
D · Jessica #3 shipment tracking (11 / 18.5) · Kelly #5 lead-time · Eva #5 / #21 · Jason data scattered across ERP / Excel / Email / Google / Evernote.
E · Jessica #11 complaint triage (14 / 18.5).
F · Jessica #1 English email, 8 h/wk, 'Team Overview EN' rank 3 / 21.5, the top high-fit item · Eva email / translation · Noella translation.
G · Jessica #3 dashboard · Sophia #10 (score 15) · Jane P&L / cashflow.