Four short links: 16 July 2019

Four short links
  1. Introducing a new game: Quantum TiqTaqToe — This experience was essential to the birth of Quantum TiqTaqToe. In my quest to understand Unity and Quantum Games, I set out to implement a “simple” game to get a handle on how all the different game components worked together. Having a game based on quantum mechanics is one thing; making sure it is fun to play requires an entirely different skill set.
  2. Association of Screen Time and Depression in Adolescence (JAMA) — Time-varying associations between social media, television, and depression were found, which appeared to be more explained by upward social comparison and reinforcing spirals hypotheses than by the displacement hypothesis. (via Slashdot)
  3. CAST Handbook — How to learn more from incidents and accidents.
  4. ML-Agents — Unity Machine Learning Agents Toolkit, open source.

Managing machine learning in the enterprise: Lessons from banking and health care

As companies use machine learning (ML) and AI technologies across a broader suite of products and services, it’s clear that new tools, best practices, and organizational structures will be needed. In recent posts, we described the foundational technologies needed to sustain machine learning practices within organizations, as well as specialized tools for model development, model governance, and model operations/testing/monitoring.

What cultural and organizational changes will be needed to accommodate the rise of machine learning and AI? In this post, we’ll address this question through the lens of one highly regulated industry: financial services. Financial services firms have a rich tradition of being early adopters of many new technologies, and AI is no exception:

Figure 1. Stage of adoption of AI technologies (by industry). Image by Ben Lorica.

Alongside health care, another heavily regulated sector, financial services companies have historically had to build explainability and transparency into some of their algorithms (e.g., credit scores). In our experience, many of the most popular conference talks on model explainability and interpretability are given by speakers from finance.

Figure 2. AI projects in financial services and health care. Image by Ben Lorica.

After the 2008 financial crisis, the Federal Reserve issued a new set of guidelines governing models—SR 11-7: Guidance on Model Risk Management. The goal of SR 11-7 was to broaden an earlier set of guidelines that focused mainly on model validation. While SR 11-7 contains few surprises, it pulls together the important considerations that arise once an organization starts using models to power key products and services. In the remainder of this post, we’ll list the key areas and recommendations covered in SR 11-7, and explain how they are relevant to recent developments in machine learning. (Note that the emphasis of SR 11-7 is on risk management.)

Sources of model risk

We should clarify that SR 11-7 also covers models that aren’t necessarily based on machine learning; it defines a model as any “quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates.” With this in mind, there are many potential sources of model risk; SR 11-7 highlights incorrect or inappropriate use of models, and fundamental errors. Machine learning developers are beginning to look at an even broader set of risk factors. In earlier posts, we listed things ML engineers and data scientists may have to manage, such as bias, privacy, security (including attacks aimed against models), explainability, and safety and reliability.

Figure 3. Model risk management. Image by Ben Lorica and Harish Doddi.

Model development and implementation

The authors of SR 11-7 emphasize the importance of having a clear statement of purpose so models are aligned with their intended use. This is consistent with something ML developers have long known: models built and trained for a specific application are seldom usable off the shelf in other settings. The regulators behind SR 11-7 also emphasize the importance of data—specifically data quality, relevance, and documentation. While models garner the most press coverage, the reality is that data remains the main bottleneck in most ML projects. With these important considerations in mind, research organizations and startups are building tools focused on data quality, governance, and lineage. Developers are also building tools that enable model reproducibility, collaboration, and partial automation.
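
To make the data-and-reproducibility point concrete, here is a minimal sketch of the kind of training record a team might write alongside each model artifact, so a run can be traced back to the exact data and code that produced it. The field names, file paths, and model name are our own illustration, not anything prescribed by SR 11-7:

    import hashlib
    import json
    import subprocess
    from dataclasses import dataclass, asdict
    from datetime import datetime, timezone

    @dataclass
    class TrainingRecord:
        """Metadata needed to reproduce (or audit) a training run."""
        model_name: str
        intended_use: str        # SR 11-7: clear statement of purpose
        dataset_path: str
        dataset_sha256: str      # fingerprint of the exact training data
        code_git_commit: str     # exact version of the training code
        hyperparameters: dict
        trained_at: str

    def sha256_of_file(path: str) -> str:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def current_git_commit() -> str:
        return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()

    # Hypothetical usage: persist the record next to the model artifact.
    record = TrainingRecord(
        model_name="credit_default_scorer",        # illustrative name
        intended_use="Rank retail loan applicants by default risk",
        dataset_path="data/loans_2019q2.parquet",  # illustrative path
        dataset_sha256=sha256_of_file("data/loans_2019q2.parquet"),
        code_git_commit=current_git_commit(),
        hyperparameters={"max_depth": 6, "n_estimators": 300},
        trained_at=datetime.now(timezone.utc).isoformat(),
    )
    with open("training_record.json", "w") as f:
        json.dump(asdict(record), f, indent=2)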

Model validation

SR 11-7 has some specific organizational suggestions for how to approach model validation. The fundamental principle it advances is that organizations need to enable critical analysis by competent teams that can identify the limitations of proposed models. First, model validation teams should be made up of people who weren’t responsible for the development of the model. This is similar to recommendations made in a recent report released by The Future of Privacy Forum and Immuta (their report is specifically focused on ML). Second, given the tendency to showcase and reward the work of model builders over that of model validators, appropriate authority, incentives, and compensation policies should be in place to reward teams that perform model validation. In particular, SR 11-7 introduces the notion of “effective challenge”:

Staff conducting validation work should have explicit authority to challenge developers and users, and to elevate their findings, including issues and deficiencies. … Effective challenge depends on a combination of incentives, competence, and influence.

Finally, SR 11-7 recommends that there be processes in place to select and validate models developed by third parties. Given the rise of SaaS and the proliferation of open source research prototypes, this issue is highly relevant to organizations that use machine learning.
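
To illustrate what independent validation can look like in practice, here is a small sketch of a sign-off check a validation team might run against a candidate model, using a holdout set the developers never touched. The thresholds, function names, and choice of AUC as the metric are our own assumptions (using scikit-learn), not a procedure SR 11-7 prescribes:

    from typing import Optional
    from sklearn.metrics import roc_auc_score

    def validate_candidate(model, X_holdout, y_holdout,
                           min_auc: float = 0.75,
                           max_gap_vs_claimed: float = 0.02,
                           claimed_auc: Optional[float] = None) -> dict:
        """Independent check on a holdout set the developers never saw.

        Returns findings rather than raising, so issues can be
        "elevated" (in SR 11-7's language) to decision-makers.
        """
        auc = roc_auc_score(y_holdout, model.predict_proba(X_holdout)[:, 1])
        findings = {"holdout_auc": auc, "passed": True, "issues": []}

        if auc < min_auc:
            findings["passed"] = False
            findings["issues"].append(
                f"Holdout AUC {auc:.3f} is below the agreed floor of {min_auc}")

        # Compare against the figure the development team reported.
        if claimed_auc is not None and claimed_auc - auc > max_gap_vs_claimed:
            findings["passed"] = False
            findings["issues"].append(
                f"AUC is {claimed_auc - auc:.3f} below the claimed value; "
                "possible leakage or overfitting in development")
        return findings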

Model monitoring

Once a model is deployed to production, the authors of SR 11-7 emphasize the importance of having monitoring tools and targeted reports aimed at decision-makers. This is in line with our recent recommendation that ML operations teams provide dashboards with custom views for all principals (operations, ML engineers, data scientists, and business owners). The authors also cite another important reason to set up independent risk-monitoring teams: in some instances, the incentive to challenge specific models might be asymmetric. Depending on the reward structure within an organization, some parties might be less likely to challenge models that help elevate their own key performance indicators (KPIs).
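
One concrete drift signal an independent monitoring team might track is the population stability index (PSI), which flags when the distribution of a model’s production scores departs from the baseline it was validated on. The sketch below is our own illustration (SR 11-7 does not mandate any particular metric); the alert thresholds follow a common rule of thumb:

    import numpy as np

    def population_stability_index(baseline: np.ndarray,
                                   production: np.ndarray,
                                   n_bins: int = 10) -> float:
        """PSI between a baseline sample (e.g., validation-time scores)
        and a production sample. Rule of thumb: < 0.1 stable,
        0.1-0.25 worth investigating, > 0.25 significant shift."""
        edges = np.quantile(baseline, np.linspace(0, 1, n_bins + 1))
        edges[0] -= 1e-9   # widen the outer edges slightly so every
        edges[-1] += 1e-9  # baseline value lands inside a bin
        production = np.clip(production, edges[0], edges[-1])

        base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
        prod_pct = np.histogram(production, bins=edges)[0] / len(production)

        # Guard empty bins against log(0) and division by zero.
        base_pct = np.clip(base_pct, 1e-6, None)
        prod_pct = np.clip(prod_pct, 1e-6, None)
        return float(np.sum((prod_pct - base_pct) * np.log(prod_pct / base_pct)))

    # Hypothetical nightly job with stand-in score distributions:
    baseline_scores = np.random.beta(2, 5, 10_000)
    todays_scores = np.random.beta(2.5, 5, 2_000)
    psi = population_stability_index(baseline_scores, todays_scores)
    if psi > 0.25:
        print(f"ALERT: score distribution shift, PSI={psi:.3f}")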

Governance, policies, controls

SR 11-7 highlights the importance of maintaining a model catalog that contains complete information for all models, including those currently deployed, recently retired, and under development. The authors also emphasize that documentation should be detailed enough so that “parties unfamiliar with a model can understand how the model operates, its limitations, and its key assumptions.” These recommendations are directly relevant to ML, and the early tools and open source projects for ML lifecycle development and model governance will need to be supplemented with tools that facilitate the creation of adequate documentation.
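
As a sketch of what a minimal model catalog entry could capture, here is one possible schema. The fields and statuses are our own illustration; a production registry would add access controls, approval workflows, and links to full documentation:

    from dataclasses import dataclass, field
    from enum import Enum
    from typing import List

    class ModelStatus(Enum):
        UNDER_DEVELOPMENT = "under_development"
        DEPLOYED = "deployed"
        RECENTLY_RETIRED = "recently_retired"  # SR 11-7: retired models stay on record

    @dataclass
    class CatalogEntry:
        model_name: str
        version: str
        status: ModelStatus
        owner: str                  # the accountable model owner
        intended_use: str
        key_assumptions: List[str]  # lets unfamiliar parties judge fit
        known_limitations: List[str]
        upstream_data_sources: List[str]
        downstream_consumers: List[str] = field(default_factory=list)

    catalog = [
        CatalogEntry(
            model_name="credit_default_scorer",
            version="2.3.1",
            status=ModelStatus.DEPLOYED,
            owner="retail-risk-ds-team",
            intended_use="Rank retail loan applicants by default risk",
            key_assumptions=["Applicant mix resembles 2017-2019 vintages"],
            known_limitations=["Not validated for small-business loans"],
            upstream_data_sources=["data/loans_2019q2.parquet"],
            downstream_consumers=["loan-approval-service"],
        ),
    ]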

This section of SR 11-7 also has specific recommendations on roles that might be useful for organizations that are beginning to use more ML in products and services:

  • Model owners make sure that models are properly developed, implemented, and used. In the ML world, these are data scientists, machine learning engineers, or other specialists.
  • Risk-control staff take care of risk measurement, limits, monitoring, and independent validation. In the ML context, this would be a separate team of domain experts, data scientists, and ML engineers.
  • Compliance staff ensure there are specific processes in place for model owners and risk-control staff.
  • External regulators are responsible for making sure these measures are being properly followed across all the business units.

Aggregate exposure

There have been many examples of seemingly well-prepared financial institutions caught off guard by rogue units or rogue traders whose activities weren’t properly accounted for in risk models. With such cases in mind, SR 11-7 recommends that financial institutions consider not only the risk from individual models but also the aggregate risk that stems from model interactions and dependencies. Many ML teams have yet to think about tools and processes for managing the risks that come with deploying multiple models simultaneously, but it’s clear that many applications will require this sort of planning. Creators of emerging applications that depend on many different data sources, pipelines, and models (e.g., autonomous vehicles, smart buildings, and smart cities) will need to manage risk in the aggregate. New digital-native companies (in media, e-commerce, finance, etc.) that rely heavily on data and machine learning also need systems to monitor many machine learning models, individually and in aggregate.
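
Building on the monitoring sketch above, an aggregate view can be as simple as rolling per-model health checks up across the whole catalog, so that simultaneous drift in several models becomes visible. Again, the model names and stand-in checks below are hypothetical:

    from typing import Callable, Dict

    # Per-model health checks (e.g., the PSI function from the monitoring
    # section); each returns a current drift score for one model.
    health_checks: Dict[str, Callable[[], float]] = {
        "credit_default_scorer": lambda: 0.08,  # stand-ins for real checks
        "fraud_detector": lambda: 0.31,
        "marketing_uplift_model": lambda: 0.12,
    }

    PSI_ALERT = 0.25

    def aggregate_risk_report() -> dict:
        """Roll individual model drift up into one portfolio-level view,
        so simultaneous degradation across models is visible."""
        scores = {name: check() for name, check in health_checks.items()}
        alerting = [name for name, s in scores.items() if s > PSI_ALERT]
        return {
            "per_model_psi": scores,
            "models_alerting": alerting,
            # Crude aggregate signal: share of the portfolio in alert.
            "portfolio_alert_fraction": len(alerting) / len(scores),
        }

    print(aggregate_risk_report())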

Health care and other industries

While we focused this post on guidelines written specifically for financial institutions, companies in every industry will need to develop tools and processes for model risk management. Many companies are already affected by existing (GDPR) and forthcoming (CCPA) privacy regulations. And, as mentioned, ML teams are beginning to build tools to help detect bias, protect privacy, guard against attacks aimed at models, and ensure model safety and reliability.

Health care is another highly regulated industry that AI is rapidly changing. Earlier this year, the U.S. FDA took a big step forward by publishing a Proposed Regulatory Framework for Modifications to AI/ML Based Software as a Medical Device. The document starts by stating that “the traditional paradigm of medical device regulation was not designed for adaptive AI/ML technologies, which have the potential to adapt and optimize device performance in real time to continuously improve health care for patients.”

The document goes on to propose a framework for risk management and best practices for evolving such ML/AI-based systems. As a first step, the authors list the modifications that affect users and thus need to be managed:

  • modifications to analytical performance (i.e., model re-training)
  • changes to the software’s inputs
  • changes to its intended use.

The FDA proposes a total product lifecycle approach in which different kinds of changes require different levels of regulatory approval. For the initial system, a premarket assurance of safety and effectiveness is required. For real-time performance, monitoring is required—along with logging, tracking, and other processes supporting a culture of quality—but not regulatory approval of every change.

This regulatory framework is new and was published to solicit comments from the public before full implementation. It still lacks requirements for localized measurement of safety and effectiveness, as well as for the evaluation and elimination of bias. However, it’s an important first step toward giving the fast-growing AI industry in health care and biotech a clear regulatory framework, and we recommend that practitioners stay informed as it evolves.

Summary

Every important new wave of technology brings benefits and challenges. Managing risks in machine learning is something organizations will increasingly need to grapple with. SR 11-7 from the Federal Reserve contains many recommendations and guidelines that map well onto the needs of companies that are integrating machine learning into products and services.

Four short links: 15 July 2019

Four Short Links
  1. NASA Climbing Robot — a four-limbed robot named LEMUR (Limbed Excursion Mechanical Utility Robot) can scale rock walls, gripping with hundreds of tiny fishhooks in each of its 16 fingers and using artificial intelligence to find its way around obstacles.
  2. Programming and Programming Languages — a new edition of a book that introduces programming and programming languages at the same time.
  3. IINA — The modern media player for macOS. Open source, and very good.
  4. Job Burnout in Professional and Economic Contexts (PDF) — In recent times, we are seeing the development of new ‘burnout shops’ that are not short-term projects, but are long-term models for doing business. A new word in my lexicon, on a subject of interest to me.

Four short links: 12 July 2019

Four short links
  1. The Dirty Business of Hosting Hate Online (Gizmodo) — an interesting rundown of who is hosting some of the noxious sites on the web.
  2. Releasing Fast and Slow — Our research shows that: rapid releases are more commonly delayed than their non-rapid counterparts; however, rapid releases have shorter delays; rapid releases can be beneficial in terms of reviewing and user-perceived quality; rapidly released software tends to have a higher code churn, a higher test coverage, and a lower average complexity; challenges in rapid releases are related to managing dependencies and certain code aspects—e.g., design debt.
  3. Embracing Innovation in Government (OECD) — a global review that explores how governments are innovating and taking steps to make innovation a routine and integrated practice across the globe.
  4. Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning — We present a multispeaker, multilingual text-to-speech (TTS) synthesis model based on Tacotron that is able to produce high-quality speech in multiple languages. Moreover, the model is able to transfer voices across languages—e.g., synthesize fluent Spanish speech using an English speaker’s voice, without training on any bilingual or parallel examples. Such transfer works across distantly related languages—e.g. English and Mandarin.

Four short links: 11 July 2019

Four short links
  1. The Great Wave: What Hokusai’s Masterpiece Tells Us About Museums, Copyright, and Online Collections Today — If we consider the customer journey of acquiring a digital image of “The Great Wave” from our 14 museums, a definite trend emerges—the more open the policy of a museum is, the easier it is to obtain its pictures. Like the other open access institutions in our sample group, The Art Institute of Chicago’s collections website makes the process incredibly simple: clicking once on the download icon triggers the download of a high-resolution image. In contrast, undertaking the same process on the British Museum’s website entails mandatory user registration and the submission of personal data.
  2. Introducing the Twitter Engineering Apprenticeship Program — Through our new apprenticeship program, participants will go through a one-year rotation program with full-time employment benefits. Upon completion of the program, they will graduate and join one of our engineering teams. It’s for people from under-represented and non-traditional backgrounds, but I’d love to see more apprenticeship models in software orgs.
  3. Regulation of AI (LOC) — This report examines the emerging regulatory and policy landscape surrounding artificial intelligence (AI) in jurisdictions around the world and in the European Union.
  4. A Primer for Computational Biology — an open textbook from Oregon State.

AI and retail

This is a keynote highlight from the O’Reilly Artificial Intelligence Conference in Beijing 2019. Watch the full version of this keynote on the O’Reilly online learning platform.

You can also see other highlights from the event.

Four short links: 8 July 2019

Four short links
  1. Algorithmic Governance and Political Legitimacy (American Affairs Journal) — Mechanized judgment resembles liberal proceduralism. It relies on our habit of deference to rules, and our suspicion of visible, personified authority. But its effect is to erode precisely those procedural liberties that are the great accomplishment of the liberal tradition, and to place authority beyond scrutiny. I mean “authority” in the broadest sense, including our interactions with outsized commercial entities that play a quasi-governmental role in our lives. That is the first problem. A second problem is that decisions made by an algorithm are often not explainable, even by those who wrote the algorithm, and for that reason cannot win rational assent. This is the more fundamental problem posed by mechanized decision-making, as it touches on the basis of political legitimacy in any liberal regime.
  2. The 27-Factor Assessment Model for DevOps — The factors are the cross-product of current best practices for three dimensions (people, process, and technology) with nine pillars (leadership, culture, app development/design, continuous integration, continuous testing, infrastructure on demand, continuous monitoring, continuous security, continuous delivery/deployment).
  3. Millfork — a middle-level programming language targeting 6502- and Z80-based microcomputers and home consoles.
  4. FossaSat-1 (Hackaday) — FossaSat-1 will provide free and open source IoT communications for the globe using inexpensive LoRa modules, where anyone will be able to communicate with a satellite using modules found online for under 5€ and basic wire mono-pole antennas.

Four short links: 10 July 2019

Four short links
  1. Security Implications of Compiler Optimizations on Cryptography—A Review — This paper is a literature review of (1) the security complications caused by compiler optimizations, (2) approaches used by developers to mitigate optimization problems, and (3) recent academic efforts toward enabling security engineers to communicate implicit security requirements to the compiler. In addition, we present a short study of six cryptographic libraries and how they approach the issue of ensuring security requirements. With this paper, we highlight the need for software developers and compiler designers to work together in order to design efficient systems for writing secure software.
  2. Pillman — Pac-Man in 512 bytes, small enough to fit on a boot sector. Impressive feat, and nicely documented.
  3. Gotta Catch ‘Em All: Understanding How IMSI-Catchers Exploit Cell Networks (EFF) — with this post, we hope to make accessible the technical inner workings of CSSs [Cell Site Simulators, the IMSI catchers used by law enforcement and others], or rather, the details of the kind of attacks they might rely on. For example, what are the different kinds of location tracking attacks and how do they actually work? Another example: it’s also widely believed that CSSs are capable of communication interception, but what are the known limits around cell network communication interception and how does that actually work? (via BoingBoing)
  4. Memelearning — In this post, we’ll share how we used TensorFlow’s object detection API to build a custom image annotation service for Eyeson. Below you can see an example where Philipp makes the “thinking” 🤔 gesture during a meeting, which automatically triggers a GIF reaction. I don’t think automatically triggering is awesome, but certainly having them queued up for you to use would be good.

Data orchestration for AI, big data, and cloud

This is a keynote highlight from the O’Reilly Artificial Intelligence Conference in Beijing 2019. Watch the full version of this keynote on the O’Reilly online learning platform.

You can also see other highlights from the event.

Toward learned algorithms, data structures, and systems

This is a keynote highlight from the O’Reilly Artificial Intelligence Conference in Beijing 2019. Watch the full version of this keynote on the O’Reilly online learning platform.

You can also see other highlights from the event.
