SECA
On-device Hardware Design Copilot · Demo
RUNNING ON SNAPDRAGON / DRAGONWING (≤ 7B local)

On-device Inference Optimization · KV Cache & Memory

SECA targets sub-7B-parameter models running directly on Snapdragon / Dragonwing edge AI hardware. Achieving useful latency on a workstation-class NPU without compromising hardware-domain reasoning depends on KV-cache compression, retrieval-based context packing, and attention-aware token eviction. The visualization below shows the same prompt running with and without these optimizations.

value 0 · platform feasibility · no cloud · ≤ 7B params · KV-cache · context pruning · LoRA adapters
PROMPT TOKEN STREAM (synthetic)
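For the curious, here is a minimal sketch of the attention-aware eviction idea: a heavy-hitter heuristic scored over a synthetic attention trace. This is an illustration of the technique, not SECA's production eviction policy.

```python
import numpy as np

def evict_kv(attn_weights: np.ndarray, keep_budget: int, recent_window: int = 32):
    """Pick which cached KV positions survive eviction (heavy-hitter heuristic)."""
    seq_len = attn_weights.shape[1]
    importance = attn_weights.sum(axis=0)              # cumulative attention per cached position
    recent = set(range(max(0, seq_len - recent_window), seq_len))
    keep = set(recent)                                 # the recent window is always kept
    for idx in np.argsort(-importance):                # then highest-attention positions first
        if len(keep) >= keep_budget:
            break
        keep.add(int(idx))
    return sorted(keep)

rng = np.random.default_rng(0)
attn = rng.random((64, 128))                           # synthetic trace, as in the demo stream
print(len(evict_kv(attn, keep_budget=48)))             # 48 of 128 cached positions survive
```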

Generic 70B cloud LLM vs SECA · same input, side-by-side extraction

For the exact same artifacts (Ver D spec, BOM, datasheets), a generic 70B cloud LLM lacks hardware-domain priors and either drops or hallucinates the revision-critical fields. SECA's domain-tuned 7B + retrieval recovers all five Ver D deltas. Wrong fields on the left are highlighted in red.

Customer Design Artifacts — Masked Mock

The mock folder contains a real customer cluster system specification (Ver C / Ver D), datasheets, a power-domain block diagram and a BOM. Customer-identifying information is masked for the demo. In production, every artifact stays on the engineer's workstation or on a Snapdragon / Dragonwing edge box — none of it is sent to a cloud LLM.

PDF · DOCX · PPTX · XLSX · on-device only · customer data masked

Live Upload · Drop & Extract

Drop files from demo_dump/ (or mock/) here — datasheets, BOM, cluster spec, block diagram (BD) — and click Run. SECA's on-device pipeline normalizes each artifact into hardware-domain JSON below, matched by filename. The accuracy comparison against a generic cloud LLM is shown in 00 TECHNOLOGY.

drag & drop · multi-file · parsing happens locally · JSON dump output
Drop your design artifacts here
or click to choose files — PDF · DOCX · PPTX · XLSX accepted
Run on-device extraction
No files yet — drop or choose at least one file to enable Run.
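For reference, one normalized record might look like the sketch below. The file name, field names, and part number are illustrative stand-ins, not values drawn from the masked mock.

```python
# Hypothetical shape of one entry in SECA's JSON dump (all values illustrative).
example_dump = {
    "source_file": "bom_verD.xlsx",        # matched back to the uploaded file by name
    "artifact_type": "BOM",
    "records": [{
        "entity": "BomLine",               # ontology entity type (see next section)
        "ref_des": "U12",
        "part_number": "TPS54620",
        "fields": {"v_in_max": "17 V", "qty": 2},
        "evidence": {"sheet": "Power", "row": 14},   # traceable back to the artifact
    }],
}
```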

Hardware-specific KG Schema · JSON → Triples

Generic LLMs emit free-form JSON. To build a knowledge graph we constrain extraction to a hardware ontology — entity types like Spec, Component, BomLine, Datasheet, and BdBlock, and typed predicates like requires, has_attr, instance_of, and contradicts. Each JSON field is matched against this schema and rewritten as a triple before it is added to the graph (a minimal mapping sketch follows the panel below).

domain ontology · field → predicate mapping · triple-store ready
All extracted JSON sources · 7 sources
HW ontology graph · 5 types · 6 preds
node-type — predicate → node-type · edge widens with each emitted triple
Unified KG · triples: 0
idle · 0 / 0 triples
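A minimal sketch of that field → predicate rewrite, assuming a toy mapping table — the specific field names and predicate assignments here are illustrative, not SECA's full schema:

```python
# Sketch: rewrite extracted JSON fields as ontology triples
# (predicate names from the demo schema; the mapping itself is illustrative).
FIELD_TO_PREDICATE = {
    "v_in_max": "has_attr",
    "supply_rail": "driven_by",
}

def json_to_triples(record: dict) -> list[tuple[str, str, str]]:
    subject = f'{record["entity"]}:{record.get("ref_des", record["entity"])}'
    triples = []
    for field, value in record.get("fields", {}).items():
        pred = FIELD_TO_PREDICATE.get(field, "has_attr")   # default predicate
        triples.append((subject, pred, str(value)))
    if "part_number" in record:                            # BomLine → concrete part
        triples.append((subject, "instance_of", record["part_number"]))
    return triples

print(json_to_triples({
    "entity": "BomLine", "ref_des": "U12",
    "part_number": "TPS54620", "fields": {"v_in_max": "17 V"},
}))
# [('BomLine:U12', 'has_attr', '17 V'), ('BomLine:U12', 'instance_of', 'TPS54620')]
```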

Unified Design Graph · single & revision compare

Each artifact contributes nodes (Spec, Component, BomLine, BdBlock, DsAttr) and edges (requires, instance_of, has_attr, driven_by, contradicts). The single view animates the staged build of one revision; revision compare runs two cytoscape.js instances side by side so you can see exactly which nodes and edges were added, changed, or now contradict the spec when going Ver C → Ver D (a set-diff sketch follows the trace below).

cytoscape.js · fcose layout · cross-artifact merge · graph stored locally
LIVE TRACE · spec.verC → spec.verD · stage 0 / 5 · nodes 0 · edges 0
STAGE 00 · idle
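At its core, the compare view reduces to a set diff over the two revisions' triples before styling the two cytoscape.js views. A minimal sketch, with invented spec values for illustration:

```python
# Classify each triple as added / removed / unchanged across Ver C → Ver D
# (the power-budget values below are invented for illustration).
def diff_revisions(ver_c: set[tuple], ver_d: set[tuple]) -> dict[str, set]:
    return {
        "added":     ver_d - ver_c,   # new in Ver D → highlight in the right view
        "removed":   ver_c - ver_d,   # dropped from Ver C → fade in the left view
        "unchanged": ver_c & ver_d,
    }

ver_c = {("Spec:power_budget", "has_attr", "750 W")}
ver_d = {("Spec:power_budget", "has_attr", "900 W")}
print(diff_revisions(ver_c, ver_d))
```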

Cross-artifact Consistency Errors — Value 1

The moment requirements move from Ver C to Ver D, SECA re-evaluates the affected slice of the graph and surfaces five cross-artifact conflicts. Each finding shows which two artifacts disagree, on which field, with what evidence — so the engineer can fix it before tape-out, not after a USD 50K–500K re-spin.

value 1 · cross-artifact consistency · 5 conflicts detected · latency < 2 s on-device
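One such cross-artifact rule, sketched with invented values: the rail voltage a block diagram drives a component with is checked against the input limit its datasheet attributes claim. The predicate names follow the demo schema; the check itself is a simplified illustration.

```python
# Toy consistency rule over the unified triple list (values illustrative):
# the BD's driven_by rail must not exceed the datasheet's has_attr input limit.
def check_rail_vs_limit(graph: list[tuple[str, str, str]]):
    rails  = {s: v for s, p, v in graph if p == "driven_by"}   # block diagram says
    limits = {s: v for s, p, v in graph if p == "has_attr"}    # datasheet says
    findings = []
    for comp, rail_v in rails.items():
        lim = limits.get(comp)
        if lim is not None and float(rail_v.split()[0]) > float(lim.split()[0]):
            findings.append((comp, f"rail {rail_v} exceeds datasheet limit {lim}"))
    return findings

g = [("Component:U12", "driven_by", "20 V"),
     ("Component:U12", "has_attr", "17 V")]
print(check_rail_vs_limit(g))
# [('Component:U12', 'rail 20 V exceeds datasheet limit 17 V')]
```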

Value 1 summary

SECA's Value 1 stops here. Five cross-artifact conflicts are surfaced with traceable evidence; three are introduced by the Ver D revision and two were latent in Ver C and likely to be missed in manual review. Every finding is evidence-grounded — engineers do the deciding, not the searching.

Top-K Active Learning — Value 2

Beyond hard 1:1 mismatches, the scorer model proposes a ranked list of top-K candidate conflicts. The engineer's confirm / reject signals feed two parallel updates: ① the scorer is updated with RLHF so next round's ranking improves, and ② the Knowledge Graph acts as an external memory network — confirmed patterns become permanent rules edited directly into the graph.

value 2 · self-improving · RLHF · KG-as-memory · all loops on-device

① Top-K candidates · scorer output (current round)

Latent conflicts the model is unsure about. Confirm or reject each — your decisions drive both updates below.

0 confirmed · 0 rejected · 6 pending

② Twin update — RLHF scorer + KG memory edit

① Top-K scorer · RLHF weight update

policy : score(candidate) → engineer signal
awaiting Apply
Hidden-layer activations and last-layer weights will animate when you press Apply. Confirmed signals push the policy toward the green class; rejected signals push it away.
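A minimal sketch of that update, reduced to a single logistic-regression step per confirm/reject signal — a bandit-style simplification of the RLHF loop described above, with hypothetical candidate features:

```python
import numpy as np

def update_scorer(w: np.ndarray, feats: np.ndarray, confirmed: bool, lr: float = 0.1):
    """One gradient step: confirmed pushes score(candidate) up, rejected pushes it down."""
    p = 1.0 / (1.0 + np.exp(-feats @ w))               # current conflict probability
    grad = (p - (1.0 if confirmed else 0.0)) * feats   # log-loss gradient w.r.t. w
    return w - lr * grad

w = np.zeros(4)
candidate = np.array([0.8, 0.1, 0.3, 1.0])             # hypothetical features of one candidate
w = update_scorer(w, candidate, confirmed=True)        # engineer pressed Confirm
```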

② Knowledge Graph · memory network edit

graph := graph ⊕ confirmed_pattern
awaiting Apply
Confirmed candidates become red-highlighted nodes that are rewritten as permanent rules; rejected candidates fade out. The graph's rule layer grows without enlarging the model itself.
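A minimal sketch of graph := graph ⊕ confirmed_pattern as a set edit; the rule layer and the fade-out behavior are simplified here to plain set operations:

```python
# Confirmed pattern → frozen into the rule layer (no model weights touched);
# rejected pattern → removed from the graph so it is not re-proposed.
def apply_feedback(graph: set[tuple], rules: set[tuple],
                   candidate: tuple, confirmed: bool):
    if confirmed:
        rules.add(candidate)       # permanent rule, grows the KG memory
        graph.add(candidate)
    else:
        graph.discard(candidate)   # fades out of the graph
    return graph, rules
```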

Hardware Design Agent — Final Goal

The accumulated KG is no longer just a consistency-checking artifact. It becomes the external context / feature memory for a hardware-design agent. Type a question or pick one from the side panel — the agent decomposes it into sub-queries, traverses the KG, packs the relevant slice as context, and a local SLM streams an evidence-grounded answer. Every step is shown explicitly (a sketch of the loop follows below).

final goal · agent platform · decompose → traverse → pack → generate · all on-device
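A minimal sketch of that loop. The decompose and traverse helpers are toy stand-ins, and local_slm is a placeholder for whatever on-device model wrapper you use — none of this is SECA's actual API.

```python
# decompose → traverse → pack → generate, in miniature.
def decompose(question: str) -> list[str]:
    # Toy decomposition: treat capitalized tokens as entity sub-queries.
    return [w.strip("?.,") for w in question.split() if w[:1].isupper()]

def traverse(kg: list[tuple[str, str, str]], key: str):
    # Toy 1-hop traversal: any triple that mentions the key.
    return [t for t in kg if key in t[0] or key in t[2]]

def answer(question: str, kg: list[tuple[str, str, str]], local_slm):
    slice_ = [t for q in decompose(question) for t in traverse(kg, q)]
    context = "\n".join(f"{s} {p} {o}" for s, p, o in slice_)   # pack the KG slice
    return local_slm(f"Context:\n{context}\n\nQ: {question}\nA:")
```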