How this reading is produced
Karuthu Vellam re-implements the Pol.is opinion-mapping pipeline in TypeScript inside this archive. We do not run Pol.is. We re-implement its documented math so the methodology is owned, versioned, and auditable per reading.
1 · The vote
For each Keeper-approved statement you cast agree, disagree, or pass. Pass is first-class — there is no "you must answer". A vote is keyed to sha256(cookie + per-set salt). The cookie is httpOnly. No phone, no email, no name. A per-set salt prevents linking the same cookie across sets.
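The keying scheme above can be sketched in a few lines. This is a minimal illustration, not the archive's actual code: the helper name `voteKey`, the plain string concatenation order, and the hex encoding are all assumptions.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derives the pseudonymous key a vote is stored under.
// Concatenation order and hex output are assumptions made for illustration.
function voteKey(cookie: string, setSalt: string): string {
  return createHash("sha256").update(cookie + setSalt, "utf8").digest("hex");
}

const keyInSetOne = voteKey("cookie-value", "salt-for-set-1");
const keyInSetTwo = voteKey("cookie-value", "salt-for-set-2");
console.log(keyInSetOne.length);          // 64 hex characters
console.log(keyInSetOne === keyInSetTwo); // false: same cookie, different sets
```

Because the salt differs per set, the same cookie yields unrelated keys in different sets, which is what prevents cross-set linking.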
2 · The live counter (ε-differential privacy)
Once per minute the server reads the total vote count and adds a sample from a Laplace distribution with scale b = 1/ε (a single vote changes the count by at most 1). The published value is shown as ≈ X ± Y, where Y ≈ 2.3/ε is the band that contains the added noise 90% of the time.
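The ≈ 2.3/ε figure follows from the Laplace tail, reading the band as the half-width that holds the noise with probability 0.9 (an interpretation; the text does not define the band further):

```latex
Z \sim \mathrm{Laplace}(0, b),\quad b = \tfrac{1}{\varepsilon}
\qquad\Longrightarrow\qquad
\Pr\big[\,|Z| > t\,\big] = e^{-t/b}

e^{-Y/b} = 0.1
\;\Longrightarrow\;
Y = b \ln 10 \approx \frac{2.303}{\varepsilon}
```

At ε = 1.0 this gives Y ≈ 2.3 votes.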
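Default ε = 1.0, a tight, "strong privacy" budget by the standards of the differential-privacy literature. The exact ε is published on every reading and on the stream surface itself. The DP guarantee bounds what the counter can reveal about any single vote: adding or removing one vote changes the probability of any published value by a factor of at most e^ε. Because the counter is republished every minute, the guarantee against an adversary who watches the live stream indefinitely depends on how noise is managed across releases, which is the problem the continual-observation work referenced below addresses.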
Reference: Dwork et al., "Differential privacy under continual observation" (STOC 2010).
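A single release of the counter described above might look like the sketch below. The function names, the injectable `uniform` source, and the rounding of the published value are assumptions for illustration. The sketch covers one release only and does not implement the continual-observation mechanism from the reference.

```typescript
// Draws one Laplace(0, scale) sample by inverting the CDF.
// `uniform` must return values in [0, 1); it is injectable so runs can be replayed.
function sampleLaplace(scale: number, uniform: () => number = Math.random): number {
  let r = uniform();
  while (r === 0) r = uniform(); // avoid log(0) at the edge of the range
  const u = r - 0.5;             // u in (-0.5, 0.5)
  return -scale * Math.sign(u) * Math.log(1 - 2 * Math.abs(u));
}

interface NoisyCount {
  value: number;   // published count, after noise
  band: number;    // half-width holding the noise with probability 0.9
  epsilon: number; // budget used for this release
}

// Hypothetical publisher for one release of the live counter.
function publishCount(
  trueCount: number,
  epsilon: number,
  uniform: () => number = Math.random,
): NoisyCount {
  if (!(epsilon > 0)) throw new RangeError("epsilon must be positive");
  const scale = 1 / epsilon;           // sensitivity 1: one vote moves the count by 1
  const band = Math.log(10) / epsilon; // about 2.3 / epsilon
  // Rounding is post-processing and is an assumption of this sketch.
  const value = Math.round(trueCount + sampleLaplace(scale, uniform));
  return { value, band, epsilon };
}

const release = publishCount(1234, 1.0);
console.log(`≈ ${release.value} ± ${release.band.toFixed(1)} (ε = ${release.epsilon})`);
```

`Math.random` is not a cryptographic source, so a production mechanism would need a stronger generator than this sketch uses.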
3 · The era-day aggregate
Once per era-day, for each statement, we count agree/disagree/pass and write an aggregate row. Each count is itself noised. Cohort sub-cells (generation, region, language) are suppressed when N < 25. No cell below the k-anonymity floor is ever written, so it cannot ever be published.
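The suppression rule above can be sketched as a filter applied before anything is written. The type names, `K_FLOOR`, and `aggregateCohorts` are hypothetical, and so is the choice to test the floor against the true count rather than the noised one. The `noise` parameter stands in for whichever mechanism the pipeline uses.

```typescript
type Choice = "agree" | "disagree" | "pass";

interface CohortVote { cohort: string; choice: Choice }
interface AggregateRow { cohort: string; agree: number; disagree: number; pass: number }

const K_FLOOR = 25; // cells with fewer votes than this are never written

// Counts votes per cohort for one statement and drops any cell under the floor.
function aggregateCohorts(
  votes: CohortVote[],
  noise: () => number = () => 0, // placeholder: inject the real noise source here
): AggregateRow[] {
  const cells = new Map<string, AggregateRow>();
  for (const v of votes) {
    let row = cells.get(v.cohort);
    if (!row) {
      row = { cohort: v.cohort, agree: 0, disagree: 0, pass: 0 };
      cells.set(v.cohort, row);
    }
    row[v.choice] += 1;
  }

  const written: AggregateRow[] = [];
  cells.forEach((row) => {
    const n = row.agree + row.disagree + row.pass;
    if (n < K_FLOOR) return; // suppressed: the row is never created
    written.push({
      cohort: row.cohort,
      agree: row.agree + noise(),
      disagree: row.disagree + noise(),
      pass: row.pass + noise(),
    });
  });
  return written;
}
```

Filtering before the write, rather than at read time, matches the stated property that a suppressed cell has no stored row that could later leak.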
4 · The era-week reading (PCA + k-means)
- Build the sparse vote matrix V[participant × statement] with values {+1, -1, 0, null} (agree, disagree, pass, no vote recorded).
- Mean-centre each column (ignoring nulls).
- Compute the statement-by-statement covariance and power-iterate for the top 2 eigenvectors (deterministic seed for reproducibility).
- Project each participant onto those 2 axes — the opinion space.
- k-means in 2D with k ∈ 2..5; pick by silhouette score.
- For each cluster, surface representative statements (highest in-cluster vs out-cluster agree-share).
- Across all clusters, surface consensus statements (high agreement, N ≥ 25) and divisive statements (closest to 50/50).
- Write the reading with math_version stamped so historical readings stay reproducible.
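The first four steps of the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the archive's implementation: nulls are imputed with the column mean after centring, the start vector is a fixed deterministic sequence standing in for a seeded one, and all function names are hypothetical.

```typescript
type Vote = 1 | -1 | 0 | null;

const dot = (a: number[], b: number[]): number =>
  a.reduce((sum, x, i) => sum + x * b[i], 0);

// Mean-centre each column over its non-null entries. Nulls become 0 after
// centring, i.e. they are imputed with the column mean (an assumption).
function centreColumns(V: Vote[][]): number[][] {
  const means: number[] = [];
  for (let j = 0; j < V[0].length; j++) {
    let sum = 0;
    let seen = 0;
    for (const row of V) {
      const x = row[j];
      if (x !== null) { sum += x; seen += 1; }
    }
    means.push(seen > 0 ? sum / seen : 0);
  }
  return V.map((row) => row.map((x, j) => (x === null ? 0 : x - means[j])));
}

// Statement-by-statement sample covariance of the centred matrix.
function covariance(X: number[][]): number[][] {
  const m = X[0].length;
  const denom = Math.max(1, X.length - 1);
  const C = Array.from({ length: m }, () => new Array<number>(m).fill(0));
  for (const row of X)
    for (let a = 0; a < m; a++)
      for (let b = 0; b < m; b++) C[a][b] += (row[a] * row[b]) / denom;
  return C;
}

// Removes components along `basis`, then normalises; null if nothing is left.
function orthonormalise(w: number[], basis: number[][]): number[] | null {
  let r = w.slice();
  for (const u of basis) {
    const p = dot(r, u);
    r = r.map((x, i) => x - p * u[i]);
  }
  const norm = Math.sqrt(dot(r, r));
  return norm < 1e-12 ? null : r.map((x) => x / norm);
}

// Power iteration with Gram-Schmidt deflation for the top k eigenvectors.
function topEigenvectors(C: number[][], k = 2, iters = 500): number[][] {
  const found: number[][] = [];
  for (let c = 0; c < k; c++) {
    // Deterministic start vector, so repeated runs give the same axes.
    const start = C.map((_, i) => Math.sin(i + 1 + c));
    let v = orthonormalise(start, found) ?? start;
    for (let it = 0; it < iters; it++) {
      const next = orthonormalise(C.map((row) => dot(row, v)), found);
      if (next === null) break; // remaining variance is numerically zero
      v = next;
    }
    found.push(v);
  }
  return found;
}

// Coordinates of each participant in the opinion space.
function project(X: number[][], axes: number[][]): number[][] {
  return X.map((row) => axes.map((axis) => dot(row, axis)));
}

const demoVotes: Vote[][] = [[1, 1, -1], [1, null, -1], [-1, -1, 1], [-1, -1, null]];
const demoCentred = centreColumns(demoVotes);
console.log(project(demoCentred, topEigenvectors(covariance(demoCentred), 2)));
```

The sign of each axis is arbitrary in PCA, so a reproducible pipeline would also need a fixed sign convention; this sketch does not impose one.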
Reference: Small et al., "Polis: Scaling Deliberation by Mapping High-Dimensional Opinion Spaces", Recerca 26.2 (2021). Reproduction reference: polis-community/red-dwarf (MPL-2.0).
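The "pick by silhouette score" step in the list above can be sketched as below, assuming cluster labels for each candidate k have already been produced by k-means. The function names and the input shape are hypothetical; singleton clusters score 0 by the usual convention.

```typescript
type Point = [number, number];

const dist = (p: Point, q: Point): number => Math.hypot(p[0] - q[0], p[1] - q[1]);

// Mean silhouette over all points; labels[i] is the cluster index of points[i].
function meanSilhouette(points: Point[], labels: number[]): number {
  if (new Set(labels).size < 2) return 0; // undefined for a single cluster
  let total = 0;
  points.forEach((p, i) => {
    // Sum of distances from p to the members of each cluster, excluding p.
    const sums = new Map<number, { sum: number; n: number }>();
    points.forEach((q, j) => {
      if (i === j) return;
      const entry = sums.get(labels[j]) ?? { sum: 0, n: 0 };
      entry.sum += dist(p, q);
      entry.n += 1;
      sums.set(labels[j], entry);
    });
    const own = sums.get(labels[i]);
    if (!own) return; // singleton cluster contributes 0
    const a = own.sum / own.n; // mean distance within p's own cluster
    let b = Infinity;          // mean distance to the nearest other cluster
    sums.forEach((entry, cluster) => {
      if (cluster !== labels[i]) b = Math.min(b, entry.sum / entry.n);
    });
    const scale = Math.max(a, b);
    total += scale === 0 ? 0 : (b - a) / scale;
  });
  return total / points.length;
}

// Returns the k whose labelling has the highest mean silhouette.
function pickK(points: Point[], candidates: { k: number; labels: number[] }[]): number {
  let best = candidates[0];
  let bestScore = -Infinity;
  for (const candidate of candidates) {
    const score = meanSilhouette(points, candidate.labels);
    if (score > bestScore) { bestScore = score; best = candidate; }
  }
  return best.k;
}

const demoPoints: Point[] = [[0, 0], [0, 1], [10, 0], [10, 1]];
console.log(pickK(demoPoints, [
  { k: 2, labels: [0, 0, 1, 1] },
  { k: 3, labels: [0, 1, 2, 2] },
])); // 2
```

This version is quadratic in the number of participants, which is acceptable for a sketch but may need a sampled variant on large readings.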
5 · What this method cannot do
- It cannot produce "99% of Tamils support X". DP noise + k=25 suppression + opinion-mapping (not yes/no aggregation) make that output structurally impossible.
- It cannot tell you who voted for what. Per-participant identifiers are pseudonymous hashes; there is no public read path on the vote table.
- It cannot represent the diaspora. The sample is always self-selected: N counts only those who chose to take part.
- It cannot be safely streamed at the per-statement level — only the global DP counter and the rotating consensus statement are stream-safe.
6 · Statement source policy
During the founding era, statements come only from MP Packs, Unmai desks, Magalir Avai, and the Case Organ. Open member submission begins only after the graduation gates close.
7 · Why we do not run Pol.is itself
Pol.is (pol-is/polisMath) is AGPL-3.0 and runs as a Clojure service paired with Postgres. We chose to re-implement the well-documented math inside our own stack because (a) we already run TanStack Start on Postgres, (b) owning the implementation lets us audit it line by line, and (c) the Pol.is community's MPL-2.0 reference port (red-dwarf) and the Recerca 2021 paper make the algorithm reproducible without inheriting the AGPL surface.
