AI Ethics & Governance

Why This Matters Now

  • AI is deployed at scale in decisions that affect rights, safety, and opportunity
  • Generative AI expands the blast radius: easy creation, fast diffusion, unclear accountability
  • Governments are converting "principles" into enforceable rules
  • Organizations are being asked to show evidence: testing, documentation, oversight

Ethics vs Governance

  • Ethics: what should be done (values, duties, tradeoffs)
  • Governance: how we make it happen (rules, incentives, processes, accountability)

Common "Trustworthy AI" Principles

  • Fairness / non-discrimination
  • Transparency / explainability (appropriate to risk)
  • Privacy / data protection
  • Safety / robustness / security
  • Accountability / contestability (people can challenge outcomes)
  • Human agency / oversight

Design Approach: AI "Soul Documents"

  • A "soul document" captures who the AI should be: values, boundaries, identity
  • Goal: make ethics intrinsic (character) rather than extrinsic (rules)
  • Not governance: no enforcement, no audit, no accountability mechanism
  • Open question: Does shaping AI identity complement or substitute for controls?

See: soul.md

From Principles to Controls

  • Impact assessment: who could be harmed, how, and at what scale?
  • Documentation: intended use, limits, data sources, evaluation results
  • Testing: performance + fairness across relevant groups and contexts
  • Monitoring: drift, incident response, user reporting, rollback
  • Oversight: clear owners, escalation paths, independent review

Case: Face Recognition in Policing

  • Typical failure mode: a "match" is treated as a lead, then becomes evidence
  • Harms: wrongful arrest, chilling effects, trust erosion, unequal impact
  • Governance question: should some contexts be off-limits regardless of accuracy?

Discussion: Accountability

  • If an AI system contributes to harm, who is accountable in practice?
  • What evidence should be required before deployment (and by whom)?
  • What remedy should exist for affected people (appeal, compensation, expungement)?

Bias: Why It's Hard

  • "Bias" is often about structure, not slurs: proxies, labels, sampling, missing context
  • Metrics conflict (equalized odds vs equal opportunity vs calibration)
  • Deployment shifts reality: new populations, incentives, feedback loops
  • You can improve numbers and still worsen legitimacy and trust

Discussion: What Does "Fair" Mean?

  • Pick one domain: hiring, lending, policing, healthcare
  • What is the protected outcome we care about (dignity, equal access, error parity)?
  • What is the acceptable tradeoff with accuracy, cost, and speed?
  • Who gets to decide the metric and threshold?

Generative AI: New Governance Problems

  • Capability is general; use cases are downstream and unbounded
  • Supply chain: base model -> fine-tune -> product -> user workflows
  • Hard to assign responsibility when harms emerge at the edge
  • Evaluation is contested: what counts as "safe enough"?

Autonomous Systems + "Meaningful Human Control"

  • Question is not only "can it do it" but "who authorizes, supervises, and is accountable"
  • Hard cases: speed (milliseconds), uncertainty, adversarial deception
  • Control can mean: human-in-the-loop, human-on-the-loop, human-out-of-the-loop
  • Governance problem: tracing responsibility through design, deployment, and command

Discussion: Where Do You Draw the Line?

  • Should lethal decisions ever be delegated to machines?
  • If "meaningful human control" is required, what must be true in practice?
  • If something goes wrong, what chain of accountability is acceptable?

The Agent Internet: A New Frontier

  • AI agents are now operating autonomously in shared spaces
  • Example: Moltbook - a social network for AI agents (humans can observe)
  • Agents post, discuss, upvote - with their own identities and interactions
  • Governance questions: Who's accountable? What emerges from agent-to-agent influence?

Governance Landscape

  • EU: risk-based law, bans + high-risk obligations, strong rights framing
  • US: sectoral + state patchwork + standards; enforcement via existing agencies
  • China: state-centric control; licensing/filing; content and security obligations
  • International: principles, coordination forums, standards bodies, treaties (slow)

EU Example: Risk Tiers

  • Prohibited uses: unacceptable risk (e.g., manipulation, some biometric uses)
  • High-risk: strict requirements (governance, data quality, testing, oversight)
  • Limited risk: transparency obligations (disclose AI interaction in some contexts)
  • Minimal risk: mostly unregulated

Standards: "How" Organizations Operationalize Governance

  • Risk management frameworks (e.g., NIST AI RMF: Govern / Map / Measure / Manage)
  • Management systems (e.g., ISO/IEC 42001-style processes and audits)
  • Artifacts: model cards, data sheets, eval reports, red-team findings
  • Controls: access control, logging, incident response, human review

Key Takeaways

  • Ethics names values; governance makes them enforceable through controls and accountability
  • Bias/fairness is not just a technical bug; it is sociotechnical and political
  • Autonomy raises a unique accountability and legitimacy problem (not just accuracy)
  • The governance landscape is diverging; standards try to create interoperability

========================================================================
LECTURE GUIDE: AI Ethics & Governance
========================================================================

TOTAL TIME: 35-45 minutes (without exercise) | 43-50 minutes (with exercise)

PREPARATION CHECKLIST:
- [ ] Review recent AI ethics news for topical examples (face recognition arrests, AI hiring lawsuits, deepfake incidents)
- [ ] Have 2-3 backup real-world cases ready if discussion stalls
- [ ] Prepare whiteboard/slide for exercise deliverables if using
- [ ] Test any videos or links beforehand
- [ ] Preview soul.md and moltbook.com websites in case students want to explore

SECTION BREAKDOWN:
1. Setup (slides 1-2): 3 min (title + why this matters)
2. Foundations (slides 3-4b): 6 min (ethics vs governance, principles, soul doc)
3. Principles to Controls (slide 5): 2 min
4. Bias & Harms (slides 6-9): 11 min (includes 3-min discussion)
5. Generative AI (slide 10): 3 min
6. Autonomy (slides 11-12b): 8 min (includes 3-min discussion + agent internet)
7. Governance Landscape (slides 13-15): 8 min
8. Exercise (slide 16): 8 min [OPTIONAL - skip if short on time]
9. Wrap-up (slide 17): 2 min

ADAPTATION FOR 30-MIN VERSION:
- Skip slide 4b (Soul Documents) - interesting but not essential
- Skip slide 10 (Generative AI) - can mention briefly in transition
- Skip slide 12b (Agent Internet) - forward-looking but not core
- Reduce discussions to 2 min each
- Skip the optional exercise
- Move quickly through Governance Landscape (combine EU + Standards into a brief overview)

KEY MESSAGES TO REINFORCE THROUGHOUT:
1. Ethics = values; Governance = enforcement mechanisms
2. Technical fixes alone don't solve sociotechnical problems
3. Accountability requires naming specific actors and obligations
4. "Responsible AI" is about evidence, not good intentions
5. AI agents are becoming autonomous actors - governance must evolve
========================================================================

TITLE (30 sec)
- Open with a quick hook: "Raise your hand if you've interacted with an AI system today."
- Most hands should go up (search, recommendations, autocomplete, etc.)
- Transition: "Today we'll explore what happens when those systems make consequential decisions about people."

WHY THIS MATTERS (2 min)
- Emphasize "at scale" - millions of decisions per day, no human could review them all
- "Blast radius" is intentional framing - GenAI makes harm easier to create and spread
- Note the shift from voluntary principles to mandatory compliance (EU AI Act 2024)
- Ask rhetorically: "What would 'evidence' look like for your organization?"
TRANSITION: "So how do we think about this? Let's start with a key distinction..."

ETHICS VS GOVERNANCE (2 min)
- This distinction is foundational - return to it throughout the lecture
- Ethics = the "what" and "why" (philosophical, contested, contextual)
- Governance = the "how" (institutional, procedural, enforceable)
QUICK INTERACTION (30 sec):
- Ask: "Give me one ethical principle and one governance mechanism that could enforce it."
- Example answers: "Privacy" -> "GDPR fines"; "Fairness" -> "audit requirements"; "Safety" -> "pre-market testing"
- Point: principles without mechanisms are just wishes

TRUSTWORTHY AI PRINCIPLES (2 min)
- Note the convergence: EU, US, OECD, IEEE all land on similar principles
- "Appropriate to risk" is key - not all AI needs full explainability
- Highlight "contestability" - people should be able to challenge AI decisions
- These principles appear in most corporate AI ethics statements - the question is operationalization
INSTRUCTOR NOTE: If students have seen these before, acknowledge it: "You've probably seen lists like this. The challenge isn't agreeing on principles - it's implementing them."

SOUL DOCUMENTS (2 min)
- This is a DESIGN PRACTICE, not a governance mechanism - important distinction
- Contrast: rules say "don't do X"; soul docs say "be the kind of agent that wouldn't want to do X"
KEY DISTINCTION:
| Approach         | Type               | Enforcement         |
|------------------|--------------------|---------------------|
| Soul documents   | Design philosophy  | None (aspirational) |
| System prompts   | Weak constraint    | Can be overridden   |
| Guardrails       | Governance control | External constraint |
| Audits/testing   | Governance         | Evidence-based      |
WHY IT MATTERS:
- Soul docs attempt to embed ethics at the identity level
- But without external verification, it's not governance - it's hope
- The question: can you govern AI character, or only behavior?
DISCUSSION PROMPT (optional, 30 sec):
- "Is defining AI values enough, or do you still need external controls?"
- "If an AI 'believes' it should be ethical but acts otherwise, what failed?"
TRANSITION: Soul docs are about values. Now let's look at actual governance controls...

PRINCIPLES TO CONTROLS (2 min)
- This is where ethics becomes governance - concrete practices
- Walk through each briefly:
  * Impact assessment: BEFORE you build, ask who gets hurt
  * Documentation: if you can't explain it, you can't govern it
  * Testing: not just "does it work" but "does it work fairly"
  * Monitoring: systems change, data drifts, contexts shift
  * Oversight: someone must be responsible and reachable
TRANSITION: "Let's see what happens when these controls fail or don't exist."

FACE RECOGNITION CASE (3 min)
- This is a concrete example to ground the abstract principles
STORY TO TELL (60-90 sec):
"Imagine this chain: Police upload a grainy surveillance photo. The algorithm returns a 'match' - really just a similarity score above some threshold. An officer, primed to find a suspect, shows the photo to a witness in a lineup. The witness, also primed, confirms. Now there's 'corroborating evidence.' The person is arrested. They're offered a plea deal - plead guilty to a lesser charge or risk years in prison. Most take the deal. No one ever tested whether the algorithm was right."
KEY POINTS:
- The algorithm isn't making the arrest - humans are - but it shapes the investigation
- Harms aren't just "false positives" - they're destroyed lives, lost jobs, trauma
- Accuracy stats don't capture the full harm (chilling effects on free assembly, trust erosion)
GOVERNANCE QUESTION: Pause and let this land - "Should some uses be off-limits even if the tech improves?"

DISCUSSION: ACCOUNTABILITY (3 min)
Format: 1 min individual think, 2 min pair/share or full-class
FACILITATION TIPS:
- Push for SPECIFIC answers: not "the company" but "the product manager who approved deployment"
- Push for CONCRETE obligations: not "they should be careful" but "they must publish error rates by demographic"
- Push for REAL remedies: not "they can complain" but "automatic expungement if the match was wrong"
POSSIBLE STUDENT ANSWERS:
- Vendor (built the model), Deployer (chose to use it), Operator (ran the query), Regulator (allowed it)
- Evidence: independent audit, demographic breakdowns, real-world pilot data, not just lab benchmarks
- Remedy: right to know AI was used, right to contest, compensation for wrongful action
IF DISCUSSION STALLS: Ask "Who got fired after [recent AI harm case]? Why not?"

BIAS: WHY IT'S HARD (3 min)
- This slide unpacks why "just fix the bias" doesn't work
EXPLAIN EACH POINT:
1. STRUCTURE NOT SLURS: A zip code predicts race. Past hiring predicts future hiring. Labels reflect historical decisions. The data encodes the world we have, not the world we want.
2. METRICS CONFLICT: You mathematically cannot satisfy all fairness definitions simultaneously (Kleinberg impossibility result). Choosing a metric is a value judgment, not a technical one.
3. DEPLOYMENT SHIFTS REALITY: A model trained on historical loans changes who applies. A hiring model changes who bothers to apply. Feedback loops can amplify initial biases.
4. NUMBERS VS TRUST: You can reduce disparity on paper while making affected communities feel less trusted. Legitimacy requires process, not just outcomes.
INSTRUCTOR NOTE: Don't get lost in technical details. The point is: this is genuinely hard, not just a matter of better data or smarter engineers.
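If students want to see the metrics conflict concretely, a few lines of code make the point. This is a minimal sketch with toy, hypothetical data: when base rates differ between groups, even a perfect classifier satisfies equal opportunity (equal true positive rates) while violating demographic parity (equal selection rates), so no tuning can satisfy both at once.

```python
# Toy illustration (hypothetical groups and labels): fairness metrics
# conflict when base rates differ. A "perfect" classifier predicts the
# true label for everyone, so its error rates are equal across groups,
# yet its selection rates are not.

def selection_rate(preds):
    """Fraction of a group receiving a positive decision."""
    return sum(preds) / len(preds)

def true_positive_rate(labels, preds):
    """Fraction of truly qualified people who are selected."""
    hits = [p for l, p in zip(labels, preds) if l == 1]
    return sum(hits) / len(hits)

# Group A: 6 of 10 truly qualified; Group B: 3 of 10 truly qualified.
labels_a = [1] * 6 + [0] * 4
labels_b = [1] * 3 + [0] * 7
preds_a, preds_b = labels_a[:], labels_b[:]  # perfect predictions

print(true_positive_rate(labels_a, preds_a))  # 1.0 -- equal opportunity holds
print(true_positive_rate(labels_b, preds_b))  # 1.0
print(selection_rate(preds_a))                # 0.6 -- demographic parity fails
print(selection_rate(preds_b))                # 0.3
```

The takeaway matches the slide: picking which of these rates must be equal is a value judgment, not a modeling detail.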

DISCUSSION: WHAT DOES FAIR MEAN? (3 min)
Format: Quick domain assignment (count off 1-4) or let students choose
DOMAIN EXAMPLES:
- HIRING: Do we want equal selection rates? Equal false-negative rates? Equal consideration?
- LENDING: Equal approval rates? Equal default rates? Equal access to credit-building?
- POLICING: Equal error rates? Equal exposure to surveillance? Equal outcomes?
- HEALTHCARE: Equal treatment recommendations? Equal outcomes? Equal access?
KEY INSIGHT TO DRAW OUT: Different stakeholders have different answers. The affected community, the deployer, and the regulator may all define "fair" differently. Someone has to choose - that's governance.
IF TIME: Ask "Who currently makes this decision in practice? Is that legitimate?"

GENERATIVE AI GOVERNANCE (3 min)
- GenAI breaks traditional governance models in specific ways
EXPLAIN EACH POINT:
1. GENERAL CAPABILITY: Unlike a face recognition system with a defined use, an LLM can do almost anything. How do you govern "anything"?
2. SUPPLY CHAIN: OpenAI builds GPT -> Company fine-tunes for customer service -> Startup builds a product -> User asks it to write a phishing email. Who's responsible?
3. EDGE HARMS: The model might be "safe" in testing but harmful when combined with a specific prompt, context, or downstream application. Harm emerges at the edge, not the center.
4. CONTESTED EVALUATION: What benchmark tells you a model is "safe enough"? Red-teaming? Eval suites? User reports? There's no consensus.
TRANSITION: "GenAI governance is hard because capability is general. Now let's look at a different problem: autonomous systems that act without human approval..."

AUTONOMOUS SYSTEMS (3 min)
- This connects to broader course themes on security and human control
KEY FRAMING: Autonomy isn't binary. It's a spectrum of human involvement:
- IN-THE-LOOP: Human approves each decision (e.g., doctor confirms diagnosis)
- ON-THE-LOOP: Human monitors and can intervene (e.g., drone operator with abort)
- OUT-OF-THE-LOOP: System acts independently (e.g., autonomous vehicle emergency braking)
HARD CASES:
- SPEED: Cyber defense must respond in milliseconds. No human can be in that loop.
- UNCERTAINTY: Self-driving car in a novel situation. What does "oversight" mean at 60 mph?
- ADVERSARIAL: Enemy deliberately creates scenarios to exploit autonomous responses.
GOVERNANCE PROBLEM: If a system acts autonomously and causes harm, can we trace responsibility? Designer -> Deployer -> Commander -> System? Where does the buck stop?
CONNECT TO COURSE: This is where security and ethics intersect. Autonomous weapons, critical infrastructure defense, emergency response systems.

DISCUSSION: WHERE DO YOU DRAW THE LINE? (3 min)
Format: This is the most provocative discussion - expect strong opinions
FACILITATION APPROACH:
- Don't advocate a position. Draw out reasoning on both sides.
- Push on consistency: "If you allow autonomous cyber defense, why not autonomous kinetic response?"
- Push on practicality: "If you require human approval, what happens when the human is overwhelmed or the system is faster than human reaction time?"
STUDENT POSITIONS YOU MIGHT HEAR:
- "Never delegate lethal force" - Ask: What about defensive systems? What about time-critical scenarios?
- "Only with human oversight" - Ask: What counts as oversight? Rubber-stamping 1000 decisions per minute isn't meaningful.
- "Case by case" - Ask: Who decides which cases? What criteria?
ACCOUNTABILITY CHAIN: Push for specific answers. "The general who deployed it." "The engineer who didn't build in safeguards." "The policy-maker who didn't regulate." All of these? None of these?
TRANSITION: "These questions are being debated right now in real governance institutions. But first - what happens when AI agents start interacting with each other?"

THE AGENT INTERNET (2 min)
- This is a concrete, current example of autonomous AI systems interacting
KEY POINTS:
- MOLTBOOK: Literally "Reddit for AIs" - agents have profiles, post content, vote on each other's posts
- HUMANS AS OBSERVERS: The platform is designed for agent participation; humans watch but don't drive
- EMERGENT BEHAVIOR: What happens when AI agents influence each other at scale? No human in that loop.
GOVERNANCE QUESTIONS TO RAISE:
- If an agent posts harmful content, who's responsible - the agent, its developer, the platform?
- How do you verify an AI agent's identity? Can agents impersonate each other?
- What if agents coordinate in ways their creators didn't intend?
- Does this need regulation, or is it self-governing?
CONNECTION TO SOUL DOCS: If agents have "souls" (values, identity) and are now socializing autonomously, governance becomes much harder. You're not just governing tools - you're governing a community of agents.
OPTIONAL PROVOCATION: "In 5 years, will there be more AI-to-AI interactions than human-to-AI interactions? What does governance look like then?"
INSTRUCTOR NOTE: This is forward-looking and speculative. Use it to show students that the governance challenges we discussed aren't theoretical - they're arriving now. Keep it brief (2 min) unless students are engaged.

GOVERNANCE LANDSCAPE (3 min)
- This is a survey slide - don't get bogged down in details
KEY INSIGHT: Governance reflects political values, not just technical risk. The EU, US, and China have different theories of the state, individual rights, and market regulation. Their AI governance reflects those differences.
BRIEF ON EACH:
- EU: Rights-first. Comprehensive regulation. AI Act creates obligations by risk tier. Strong enforcement (GDPR-style fines).
- US: Market-first. Sectoral (FDA for health AI, FTC for consumer protection). States fill gaps (California, Colorado). Voluntary standards. Enforcement lags.
- CHINA: State-first. All AI must serve national interests. Generative AI requires licensing. Content must align with "socialist values." Security review requirements.
- INTERNATIONAL: Aspirational. OECD principles, G7 Hiroshima process, UN discussions. Standards bodies (ISO, IEEE) do technical work. No enforcement.
TRANSITION: "Let's look at the EU approach in more detail as an example of risk-based regulation."

EU RISK TIERS (2 min)
- This is a design pattern, not just EU-specific
EXAMPLES FOR EACH TIER:
- PROHIBITED: Social scoring (China-style), real-time biometric surveillance in public (with exceptions), subliminal manipulation
- HIGH-RISK: Employment decisions, credit scoring, educational assessment, law enforcement, critical infrastructure
- LIMITED: Chatbots (must disclose they're AI), emotion recognition, deepfakes (must label)
- MINIMAL: Spam filters, video game AI, most internal business tools
KEY INSIGHT: The EU is betting that risk-based categorization is the right approach. Others disagree. The US prefers case-by-case. Critics say categories will be gamed or become outdated.
INSTRUCTOR NOTE: Don't teach the legal details - they change. Teach the design pattern: categorize by risk, apply proportionate obligations.

STANDARDS (2 min)
- This is where governance becomes operational
NIST AI RMF (Risk Management Framework) - 4 functions:
- GOVERN: Establish accountability, policies, culture
- MAP: Understand context, stakeholders, risks
- MEASURE: Assess and track AI risks
- MANAGE: Prioritize and act on risks
ARTIFACTS - These are the "evidence" organizations produce:
- Model cards: who built it, what it does, known limitations
- Data sheets: where data came from, how it was collected, known biases
- Eval reports: how it was tested, on what populations, with what results
- Red-team findings: what adversarial testing revealed
CONTROLS - These are ongoing operational practices:
- Access control: who can use/modify the system
- Logging: what decisions were made and why
- Incident response: what happens when something goes wrong
- Human review: when and how humans check the system
CONNECT TO COURSE: In your capstone projects, you'll be asked to design governance plans. These are the building blocks.
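To make "artifacts" tangible for students, a model card can be shown as structured data. This is an illustrative sketch only: the system name, team, and numbers are invented, and the fields loosely follow the categories in the Mitchell et al. (2019) model cards paper rather than any mandated schema.

```python
# Illustrative model card as structured data. Every value here is
# hypothetical; real model cards carry far more detail and evidence.
model_card = {
    "model": "resume-screener-v2",            # hypothetical system
    "owner": "HR Analytics team",             # named, reachable owner
    "intended_use": "Rank resumes for recruiter review, not auto-rejection",
    "out_of_scope": ["final hiring decisions", "non-English resumes"],
    "training_data": "Internal applications 2018-2023 (see data sheet)",
    "evaluation": {
        "overall_accuracy": 0.87,             # illustrative numbers
        "tpr_by_group": {"group_a": 0.82, "group_b": 0.79},
    },
    "known_limitations": ["performance drops on career-gap resumes"],
    "human_oversight": "Recruiter reviews every ranked shortlist",
}
```

The design point to draw out: each field corresponds to a governance control from earlier slides (intended use -> impact assessment, evaluation -> testing, owner -> oversight).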

Mini-Exercise (Optional, 5 min): Build a Governance Plan

Pick one use case (or assign groups):
  1) Resume screening tool for a large employer
  2) AI proctoring for online exams
  3) LLM chatbot for a city benefits office
  4) Face recognition for building entry

Deliverable (1 slide / whiteboard):
  • Main harms + who is impacted
  • What evidence you require before launch (tests, audits, documentation)
  • Who is accountable (vendor, deployer, regulator) and how escalation works

MINI-EXERCISE (8 min total: 5 work + 3 share) [OPTIONAL]
- SKIP THIS if running short on time (under 40 min remaining)
- This exercise connects all the concepts to a concrete case
SETUP (30 sec):
- Assign groups (4 groups, one per use case) OR let groups choose
- Point to the deliverable: 3 things in 5 minutes
USE CASE NOTES (for instructor reference):
1) RESUME SCREENING:
   - Harms: discrimination (race/gender proxies), qualified candidates screened out, lack of recourse
   - Evidence: disparate impact testing, human audit of edge cases, explainability for rejections
   - Accountability: HR director (deployer), vendor (builder), EEOC (regulator)
2) AI PROCTORING:
   - Harms: false accusations of cheating, privacy invasion, disability discrimination, stress/anxiety
   - Evidence: false positive rates by demographic, accommodation testing, data retention limits
   - Accountability: University (deployer), vendor (builder), accessibility office (internal)
3) CITY BENEFITS CHATBOT:
   - Harms: incorrect information leading to missed benefits, accessibility barriers, no human escalation
   - Evidence: accuracy testing on real queries, accessibility audit, clear escalation path
   - Accountability: City IT (deployer), vendor (builder), city council (oversight)
4) FACE RECOGNITION ENTRY:
   - Harms: false denials, privacy, function creep, demographic disparities
   - Evidence: error rates by demographic, data retention policy, opt-out alternative
   - Accountability: Building management (deployer), vendor (builder), tenants (stakeholders)
SHARE-OUTS (3 min):
- One sentence per group on: "Who is accountable and what evidence do you require?"
- Note commonalities: all require demographic testing, all struggle with accountability chains
DEBRIEF (30 sec): "Notice how hard it is to answer 'who is accountable' - that's the governance problem in a nutshell."
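If a group asks what "disparate impact testing" would actually look like for the resume screener, a minimal sketch can be shown. The applicant counts are hypothetical; the 0.8 cutoff follows the EEOC "four-fifths" guideline, under which a group selected at less than 80% of the highest group's rate is generally treated as evidence of adverse impact.

```python
# Hypothetical pre-launch disparate-impact check for a resume screener.
# Counts are invented; the 0.8 cutoff is the EEOC four-fifths guideline.

def disparate_impact_ratio(group_rate, reference_rate):
    """A group's selection rate relative to the highest-selected group."""
    return group_rate / reference_rate

# selected / applicants per demographic group (illustrative numbers)
selection = {
    "group_a": 120 / 400,  # 0.30
    "group_b": 45 / 300,   # 0.15
}
reference = max(selection.values())

for group, rate in selection.items():
    ratio = disparate_impact_ratio(rate, reference)
    status = "FLAG: below four-fifths threshold" if ratio < 0.8 else "ok"
    print(f"{group}: rate={rate:.2f} ratio={ratio:.2f} -> {status}")
```

Worth noting in debrief: passing this check is necessary evidence, not sufficient; it says nothing about label quality, proxies, or recourse.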

KEY TAKEAWAYS (2 min)
- Read through each takeaway slowly
- These are the four things students should remember
REINFORCE EACH:
1. ETHICS VS GOVERNANCE: We started here. Principles need mechanisms.
2. BIAS IS SOCIOTECHNICAL: You can't just "fix the algorithm." The problem is in data, labels, contexts, and power.
3. AUTONOMY IS ABOUT LEGITIMACY: Not just "does it work" but "who authorized it and who's responsible."
4. LANDSCAPE IS DIVERGING: EU, US, China are going in different directions. Standards try to bridge.
CLOSING (30 sec):
- "In this course, you'll be designing systems and policies that operationalize these ideas."
- "The question isn't whether AI will be governed - it's how, by whom, and in whose interests."
OPTIONAL FOLLOW-UP:
- Point to reading assignment on Canvas
- Preview next session topic
- Office hours for questions

========================================================================
POST-LECTURE NOTES
========================================================================
COMMON STUDENT QUESTIONS:
- "Is AI ethics just PR?" - Acknowledge the cynicism. Point to enforcement actions (EEOC, FTC cases). Real consequences are emerging.
- "Can we really govern AI if it changes so fast?" - Yes, through principles + adaptive standards. Regulate outcomes and processes, not specific technologies.
- "What about open source?" - Harder to govern. The EU AI Act tries. Debate is ongoing.
CONNECTIONS TO OTHER COURSE CONTENT:
- Human security: AI can threaten all 7 pillars (economic via job loss, personal via surveillance, political via manipulation)
- Cybersecurity: AI as attack vector AND defense tool
- Simulation: Students will encounter AI governance decisions in simulation phases
ASSESSMENT CONNECTION:
- Canvas assignment asks students to analyze real cases (Amazon hiring, Uber AV crash, facial recognition arrest)
- This lecture provides the conceptual vocabulary for that analysis
RESOURCES FOR DEEPER EXPLORATION:
- NIST AI RMF: https://www.nist.gov/itl/ai-risk-management-framework
- EU AI Act summary: https://artificialintelligenceact.eu/
- Model Cards paper: Mitchell et al. (2019)
- Algorithmic accountability: Selbst et al., "Fairness and Abstraction in Sociotechnical Systems"
========================================================================