- Software Patents Slipping Back (BoingBoing) — USPTO issuing new guidance that re-enables crappy software patenting.
- Unsupervised Learning of Artistic Styles with Archetypal Style Analysis — Our objective is to automatically discover, summarize, and manipulate artistic styles present in the collection. (via Adrian Colyer)
- Dropgangs, or the Future of Darknet Markets — The other major change is the use of “dead drops” instead of the postal system, which has proven vulnerable to tracking and interception. Now, goods are hidden in publicly accessible places like parks, and the location is given to the customer on purchase. The customer then goes to the location and picks up the goods.
- Boards — open source tool for collaboratively organizing notes.
- Rook — storage orchestration for Kubernetes.
- Why We Can’t Have Nice Things (MIT Press) — Trolls’ actions are born of and fueled by culturally sanctioned impulses—which are just as damaging as the trolls’ most disruptive behaviors. […] For trolls, exploitation is a leisure activity; for media, it’s a business strategy. (via Greg J. Smith)
- Language Bias in Accident Investigation — The SAIG [Forest Service’s Serious Accident Investigation Guide] influences investigators to apply linear, hindsight-biased, “cause and effect” reasoning toward human actors in the event. The guide’s use of agentive descriptions, binary opposition, and the active verb voice creates a seemingly exclusive causal attribution toward humans. Objective analysis was found to be impossible, using the SAIG’s language and report structure. This stands in contrast to the agency’s goal of accident prevention. nota bene, post-mortem facilitators. (via John Allspaw)
- Artificial Intelligence: American Attitudes and Trends — This report is based on findings from a nationally representative survey conducted by the Center for the Governance of AI, housed at the Future of Humanity Institute, University of Oxford, using the survey firm YouGov. The survey was conducted between June 6 and 14, 2018, with a total of 2,000 American adults (18+) completing the survey. Findings include: Demographic characteristics account for substantial variation in support for developing high-level machine intelligence. There is substantially more support for developing high-level machine intelligence among those with larger reported household incomes, such as those earning over $100,000 annually (47%), than among those earning less than $30,000 (24%); among those with computer science or programming experience (45%) than those without (23%); and among men (39%) than women (25%). These differences are not easily explained away by other characteristics (they are robust to our multiple regression). (via Miles Brundage)
- Post Mortems (Dan Luu) — a collection of outage postmortems from big and small companies. (via Laurent Vanbever)
- Guide to GDPR — UK’s guide. It explains each of the data protection principles, rights, and obligations. It summarizes the key points you need to know, answers frequently asked questions, and contains practical checklists to help you comply.
- Featuretools — open source Python framework for automated feature engineering.
- The State of Security in 2019 — The high-order bit in much of the below is complexity. Hardware, software, platforms, and ecosystems are often way too complex, and a whole lot of our security, privacy, and abuse problems stem from that. Lots of really good links and ideas here.
There’s a passage in Ernest Hemingway’s novel The Sun Also Rises in which a character named Mike is asked how he went bankrupt. “Two ways,” he answers. “Gradually, then suddenly.”
Technological change happens much the same way. Small changes accumulate, and suddenly the world is a different place. Throughout my career at O’Reilly Media, we’ve tracked and fostered a lot of “gradually, then suddenly” movements: the World Wide Web, open source software, big data, cloud computing, sensors and ubiquitous computing, and now the pervasive effects of AI and algorithmic systems on society and the economy.
What are some of the things that are in the middle of their “gradually, then suddenly” transition right now? The list is long; here are a few of the areas that are on my mind.
AI and algorithms are everywhere
The most important trend for readers to focus on is the development of new kinds of partnership between human and machine. We take for granted that algorithmic systems do much of the work at online sites like Google, Facebook, Amazon, and Twitter, but we haven’t fully grasped the implications. These systems are hybrids of human and machine. Uber, Lyft, and Amazon Robotics brought this pattern to the physical world, reframing the corporation as a vast, buzzing network of humans both guiding and guided by machines. In these systems, the algorithms decide who gets what and why; they’re changing the fundamentals of market coordination in ways that gradually, then suddenly, will become apparent.
The rest of the world is leapfrogging the US
The volume of mobile payments in China is $13 trillion versus the US's $50 billion, in a country where credit cards never took hold. Already Zipline's on-demand drones are delivering 20% of all blood supplies in Rwanda and will soon be coming to other countries (including the US). In each case, the lack of existing infrastructure turned out to be an advantage in adopting a radically new model. Expect to see this pattern recur, as incumbents and old thinking hold back the adoption of new models.
China and the transformation of Africa
Speaking of Africa, if it isn’t on your radar, it should be. Gradually, then suddenly, it’s becoming “the next factory of the world.” That’s the title of a 2017 book by McKinsey’s Irene Sun. There’s also a detailed McKinsey report, Dance of the Lions and Dragons, based on a study of more than 1,000 Chinese-owned factories in Africa. As the US has withdrawn into a kind of neo-isolationism, China is stepping up. There’s a lot of misinformation, rooted in denial, about its “One Belt, One Road” initiative. Expect to wake up one day and realize that China has done to the US what the US did to the UK in the 20th century, becoming the new leader of the world economy, for good or ill. Up until now, China has spent a lot more time copying us than we have spent copying them; that’s suddenly going to go into reverse. For a detailed look at the competition between the two “AI superpowers,” read Kai-Fu Lee’s book of that name. See trend 1.
The next agricultural revolution
Last year, when I spoke at the Food+Tech Connect Conference in Amsterdam, I got an eyeful of the agricultural revolution that is happening in the Netherlands. Did you know that this tiny country, 1/270th the size of the US, is the world’s second-largest food exporter? That’s a testament to the way that precision farming and other new technologies are transforming agriculture. Silicon Valley is waking up to the opportunity, and so are consumers. I stopped in at an Oakland sports bar recently, and what did I see on the menu but an Impossible Burger. This new meatless meat is no longer just a treat for tech elites. Expect meaningful change in the makeup of our food supply, what we consume, and how it gets to us. If you’re skeptical, remember that 25 years ago, the internet was just becoming mainstream, and even the smartphone revolution is only 10 years old. Gradually, then suddenly, both have transformed the world.
Climate change
You have to have huge ideological blinders on not to see that the effects of climate change are less and less “gradual” and that we are rushing headlong toward a “suddenly” moment. One of the most interesting discoveries for me in the past year has been the work of groups like the Initiative for the Science of the Human Past at Harvard, which have been looking at the connection between climate change events and the fall of ancient civilizations. My friend Malcolm Wiener pointed out to me that climate events trigger mass migrations, which often bring with them new plagues, and whether a civilization survives (as the Roman Empire did, albeit on a reduced scale) or falls depends on the quality of its ruling elites. I leave you to consider the implications of the current political moment.
Genomics
Genetic engineering is an important driver of food innovation, but it’s also a huge part of the possible response to climate change. Bring the woolly mammoth back to life? Save coral reefs? But climate adaptation is just the tip of the iceberg. Could we replace chemical dyes with bacterial by-products? And don’t get me started on the application of genomics to healthcare. Back in 2010, George Church pointed out the equivalent of Moore’s law for gene sequencing. As a result of that acceleration, we’re now approaching the “suddenly” moment for precision medicine. And of course, AI is in the middle of all that, helping with drug discovery, synthesis of new materials, and biological pathways. But I suspect that there’s also a hidden intersection with …
Neural interfaces
One of my biggest “Wow!” moments of 2018 took place in the offices of neural interface company CTRL-labs. Their demo involves someone playing the old Asteroids computer game without touching a keyboard, using machine learning to interpret the nerve signals that are sent to the hands. But it isn’t quite what you think. Moving things in the digital realm without moving your hands seems startling enough (though it’s worth remembering that it was once considered remarkable to be able to read silently without moving your lips). But that’s just the first stage. Essentially, users of this technology “grow” another virtual hand, which they can move independently of their physical hands. One of the researchers bowled me over when he said he was “working on controlling nine cursors at once.” Gradually, then suddenly, our children will interface with machines in deeper and deeper ways. Humanity is already going cyborg (see trend 1); expect it to accelerate. Don’t fall into the trap of thinking that AI will replace humans when it can be used even more powerfully to augment them.
Online learning
Online learning isn’t just about online schools like Udacity and Coursera or O’Reilly’s own learning platform. What’s too often overlooked is how education and cognitive augmentation go hand in hand. The reason Uber and Lyft have a seemingly unlimited supply of drivers is because no training is required; the app itself does the heavy lifting of telling the driver where to pick up the passenger and how to get to the destination. At O’Reilly, we call this “performance-adjacent learning.” Josh Bersin calls it “learning in the flow of work.” So many of the attempts to create online education seem to be reproducing 20th century models online; instead, we’ve hitched our platform squarely to the “gradually, then suddenly” trend of knowledge on demand, understanding that the supporting role of coursework is to get you to the point where you can take in and use on-demand knowledge. (We call this “structural literacy.”) See trend 1.
The crisis of faith in government
Ever since Jennifer Pahlka and I began working on the Gov 2.0 Summit back in 2008, we’ve been concerned that if we can’t get government up to speed on 21st century technology, a critical pillar of the good society will crumble. When we started that effort, we were focused primarily on government innovation; over time, through Jen’s work at Code for America and the United States Digital Service, that shifted to a focus on making sure that government services actually work for those who need them most. Michael Lewis’ latest book, The Fifth Risk, highlights just how bad things might get if we continue to neglect and undermine the machinery of government. It’s not just the political fracturing of our country that should concern us; it’s the fact that government plays a critical role in infrastructure, in innovation, and in the safety net. That role has gradually been eroded, and the cracks that are appearing in the foundation of our society are coming at the worst possible time.
The Next Economy
Economics is my learning frontier right now as I explore the connections between the business ecosystems of the great tech platforms and trends in what I’ve been calling the Next Economy. Some of the books that I’ve taken the most from this year include Doughnut Economics, by Kate Raworth; The Value of Everything, by Mariana Mazzucato; How Asia Works, by Joe Studwell; The Assumptions Economists Make, by Jonathan Schlefer; Prediction Machines, by Ajay Agrawal, Joshua Gans, and Avi Goldfarb; and How Adam Smith Can Change Your Life, by Russ Roberts. That final book is not at all what most people will expect from the title. It’s not about the “invisible hand” or The Wealth of Nations but about Adam Smith’s other great book, The Theory of Moral Sentiments, which explores the role of social norms as a check on self-interest. We must rediscover and reinvent those norms, or gradually, then suddenly, we’ll continue the descent into economic and political barbarism.
Rather than ending this newsletter on a down note, let me remind you that the future is not inevitable. As I wrote in my book last year, it’s up to us:
This is my faith in humanity: that we can rise to great challenges. Moral choice, not intelligence or creativity, is our greatest asset. Things may get much worse before they get better. But we can choose instead to lift each other up, to build an economy where people matter, not just profit. We can dream big dreams and solve big problems. Instead of using technology to replace people, we can use it to augment them so they can do things that were previously impossible.
- Quantum Computing Zines — from EPiQC, the University of Chicago-led quantum research collaboration. Topics: history, hype, measurement, operations, notation, reversibility, superposition, and entanglement.
- Surprising People Have Access to Your Phone’s Location (VICE) — T-Mobile, Sprint, and AT&T are selling access to their customers’ location data, and that data is ending up in the hands of bounty hunters and others not authorized to possess it, letting them track most phones in the country.
- Underclocking the ESP8266 Leads to Wi-Fi Weirdness (Hackaday) — underclock an 8266 and the channel width decreases proportionally. Underclock two by the same amount and you can create a channel so narrow that non-underclocked devices can’t understand it. Clever!
- Incompleteness Ex Machina — In this essay we’ll prove Gödel’s incompleteness theorems twice. First, we’ll prove them the good old-fashioned way. Then we’ll repeat the feat in the setting of computation. In the process, we’ll discover that Gödel’s work, rightly viewed, needs to be split into two parts: the transport of computation into the arena of arithmetic on the one hand and the actual incompleteness theorems on the other. After we’re done, there will be cake. (via Daniel Bilar)
- Implicit Model of Other People’s Visual Attention as an Invisible, Force-Carrying Beam Projecting from the Eyes — I wonder how that affects VR/AR interaction design. Here we report that people automatically and unconsciously treat other people’s eyes as if beams of force-carrying energy emanate from them, gently pushing on objects in the world.
- OneDev — The opinionated but practical self-hosted git server. Interesting set of pro features for power users. The product manager in me always says, “cool, but how do you compete with GitHub and GitLab? Any useful features can be copied by their armies of developers. Features are not defensible.” Good luck to ’em, though. (And if this is open source, they don’t need to “compete” in a classic way; winning can be whatever the developers want it to be.)
- Successful 51% Attack on Ethereum Classic — though, as Sam Minnée said on Twitter, “Ethereum Classic is the Windows XP of Ethereum.” Meanwhile, Bitcoin is less secure than most people think: As an example, Budish shows that if the attacker has just 5% more computational power than the honest nodes, then on average it takes 26.5 blocks (a little over four hours) for the attacker to have the longest chain. (Most of the time it takes far fewer blocks, but occasionally it takes hundreds of blocks for the attacker to produce the longest chain.) The attack will always be successful eventually; the key question is what is the cost of the attack?
- Pirate’s Take on Strategy vs. Tactics — useful to give to That Person on your team who misuses the words.
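Budish's longest-chain estimate in the 51%-attack item above is easy to sanity-check with a toy Monte Carlo. The sketch below is my own simplification, not Budish's model: one block per step, a fixed hash-power split, and success defined as the attacker's chain pulling one block ahead. It reproduces the qualitative claim in the quote: the median race is very short, the mean is much longer, and the distribution has a long tail.

```python
import random

def blocks_until_lead(q, rng, max_steps=1_000_000):
    """Total blocks mined (by either side) until the attacker's private
    chain is one block longer than the honest chain."""
    lead, steps = 0, 0
    while lead < 1 and steps < max_steps:
        lead += 1 if rng.random() < q else -1
        steps += 1
    return steps

rng = random.Random(42)
q = 1.05 / 2.05  # attacker has 5% more hash power than the honest nodes
trials = sorted(blocks_until_lead(q, rng) for _ in range(2_000))
mean = sum(trials) / len(trials)
median = trials[len(trials) // 2]
print(f"mean = {mean:.1f} blocks, median = {median} blocks")
```

The exact averages depend on modeling details Budish spells out in the paper; the point here is the skew: most races end almost immediately, while rare long races drag the mean far above the median.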
Whether you’re a business leader or a practitioner, here are key data trends to watch and explore in the months ahead.
Increasing focus on building data culture, organization, and training
In a recent O’Reilly survey, we found that the skills gap remains one of the key challenges holding back the adoption of machine learning. The demand for data skills (“the sexiest job of the 21st century”) hasn’t dissipated. LinkedIn recently found that demand for data scientists in the US is “off the charts,” and our survey indicated that the demand for data scientists and data engineers is strong not just in the US but globally.
With the average shelf life of a skill today at less than five years and the cost to replace an employee estimated at between six and nine months of the position’s salary, there is increasing pressure on tech leaders to retain and upskill rather than replace their employees in order to keep data projects (such as machine learning implementations) on track. We are also seeing more training programs aimed at executives and decision makers, who need to understand how these new ML technologies can impact their current operations and products.
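As a back-of-envelope illustration of the retention math (the salary figure below is hypothetical; the six-to-nine-month range is the estimate cited above):

```python
# Replacement cost as six to nine months of the position's salary.
salary = 120_000  # hypothetical annual salary for the role
low = salary * 6 / 12
high = salary * 9 / 12
print(f"Estimated cost to replace: ${low:,.0f}-${high:,.0f}")
# prints: Estimated cost to replace: $60,000-$90,000
```

Against numbers like these, an upskilling program costing a few thousand dollars per employee is an easy case to make.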
Beyond investments in narrowing the skills gap, companies are beginning to put processes in place for their data science projects, for example creating analytics centers of excellence that centralize capabilities and share best practices. Some companies are also actively maintaining a portfolio of use cases and opportunities for ML.
Cloud for data infrastructure
Cloud platforms will continue to draw companies that need to invest in data infrastructure: not only do the cloud platforms have improving foundational technologies and managed services, but increasingly software vendors and popular open source data projects are making sure their offerings are easy to run in the cloud. According to a recent O’Reilly survey, 85% of respondents said they already had some of their data infrastructure in the cloud, and other surveys of IT executives reveal that many are planning to increase their investments in SaaS and cloud tools. Data engineers and data scientists are beginning to use new cloud technologies, like serverless, for some of their tasks.
Continuing investments in (emerging) data technologies
For most companies, the road toward machine learning (ML) involves simpler analytic applications. This is good news because ML demands data, and many of the simpler analytic tools that precede ML already require data infrastructure to be in place. The growing interest in ML will spur companies to continue to invest in the foundational data technologies that are required to scale ML initiatives. This includes items like data ingestion and integration, storage and data processing, and data preparation and cleaning.
Tools for secure and privacy-preserving analytics
Companies will continue to invest in tools for data security and privacy, but we expect to see an increased focus on tools for privacy-preserving analytics—areas where researchers and startups have been actively engaged. Organizations will begin to identify and manage risks that accompany the use of machine learning in products and services, such as security and privacy, bias, safety, and lack of transparency.
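One classic building block in privacy-preserving analytics is differential privacy. Below is a minimal sketch (my own illustration, not any particular product's API) of the Laplace mechanism applied to a counting query; real deployments also track a privacy budget across repeated queries.

```python
import random

def dp_count(values, predicate, epsilon, rng):
    """Differentially private count via the Laplace mechanism.
    A counting query changes by at most 1 when one person is added or
    removed (sensitivity 1), so the noise scale is 1/epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    # The difference of two iid Exponential(epsilon) draws is Laplace(1/epsilon).
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

rng = random.Random(7)
salaries = [rng.randint(20_000, 200_000) for _ in range(1_000)]
noisy = dp_count(salaries, lambda s: s > 100_000, epsilon=0.5, rng=rng)
print("noisy count:", round(noisy, 1))
```

Smaller epsilon means stronger privacy and noisier answers; the analyst sees an approximate count, but no individual record can be inferred from it.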
Sustaining machine learning in an enterprise
Early indications are that many organizations are correctly focusing their initial machine learning projects (and investments) in use cases that improve their most mission-critical analysis projects. For example, financial services companies are applying ML to risk analysis, telecom companies are applying AI to service operations, and automotive companies are focusing their initial ML implementations on manufacturing. This is also reflected by the emergence of tools that are specific to machine learning, including data science platforms, data lineage, metadata management and analysis, data governance, and model lifecycle management.
Burgeoning IoT technologies
A few years ago, most internet of things (IoT) examples involved smart cities and smart governments. But the rise of cloud platforms, cheap sensors, and machine learning has IoT poised to make a comeback in industry. We’ll still hear about municipal and public sector applications, but there are other interesting use cases involving closed systems (factories, buildings, homes) and enterprise and consumer applications (edge computing).
Automation in data science and data
As the use of machine learning and analytics becomes more widespread, we need tools that will allow data scientists and data engineers to scale so they can tackle many more problems and maintain more systems. This will lead to more automation tools for the many stages involved in data science, including data preparation, feature engineering, model selection, and hyperparameter tuning, as well as data engineering and data operations. There are already some early applications of machine learning aimed at the partial automation of tasks in data science, software development, and IT operations.
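At its core, much of this automation is search: score candidate configurations on held-out data and keep the best. A toy sketch of automated hyperparameter selection (synthetic data, 1-D k-nearest-neighbor regression; no real AutoML library is assumed):

```python
import random

def knn_predict(train, x, k):
    """1-D k-nearest-neighbor regression: average the targets of the
    k training points closest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in neighbors) / k

def mse(train, val, k):
    """Mean squared error of k-NN predictions on a validation set."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in val) / len(val)

rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
data = [(x, x * x + rng.gauss(0, 0.1)) for x in xs]  # noisy quadratic
train, val = data[:150], data[150:]

# "AutoML" in miniature: exhaustively score each hyperparameter value
# on held-out data and keep the best configuration.
best_k = min(range(1, 31), key=lambda k: mse(train, val, k))
print("selected k =", best_k, "val MSE =", round(mse(train, val, best_k), 4))
```

Production tools replace the exhaustive loop with smarter search (random, Bayesian, bandit-based) and extend it to feature engineering and model selection, but the score-and-select loop is the same.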
- Tensor Considered Harmful — Trap 1: Privacy by Convention; Trap 2: Broadcasting by Alignment; Trap 3: Access by Comments. Author proposes a named tensor to tackle these problems. (via Daniel Bilar)
- 100 Lessons Learned for Project Managers (NASA) — This material first appeared in the October 2003 issue of NASA’s ASK Magazine, which now lists 122 of these aphorisms. Examples: People who monitor work and don’t help get it done, never seem to know exactly what is going on. Integrity means your subordinates trust you. An agency’s age can be estimated by the number of reports and meetings it has. The older it gets, the more the paperwork increases and the less product is delivered per dollar. Many people have suggested that an agency self-destruct every 25 years and be reborn starting from scratch.
- The Man Turning China into a Quantum Superpower (MIT TR) — One of the reasons China has done so well in quantum science is the close coordination between its government research groups, the Chinese Academy of Sciences, and the country’s universities. Europe now has its own quantum master plan to prompt such collaborations, but the U.S. has been slow to produce a comprehensive strategy for developing the technologies and building a future quantum workforce. Where’s quantum’s Licklider?
- Dive Into Deep Learning — a UC Berkeley course. Uses Jupyter Notebooks and MXNet (not TensorFlow or PyTorch).
- Bruce Sterling’s State of the World — this year’s guest, James Bridle. It’s quite clear that many things being currently constructed, from large-scale capitalist enterprises to social media timelines to microinteractions on smartphone apps, are specifically designed as attacks on our ability to think clearly and act autonomously: “the race to the bottom of the brain stem,” as Tristan Harris puts it. What you’re feeling is not some weird emergent effect of too much screen time: it’s deliberate. (via BoingBoing)
- Flair — very simple framework for state-of-the-art NLP. Multilingual, built on PyTorch.
- Towards a Human Artificial Intelligence for Human Development — Sandy Pentland was a co-author, so it caught my eye. This paper discusses the possibility of applying the key principles and tools of current artificial intelligence (AI) to design future human systems in ways that could make them more efficient, fair, responsive, and inclusive.
- TS100 — new open source firmware for your soldering iron. You had me at “soldering iron with flashable firmware”…
In this episode of the Data Show, I spoke with Haoyuan Li, CEO and founder of Alluxio, a startup commercializing the open source project with the same name (full disclosure: I’m an advisor to Alluxio). Our discussion focuses on the state of Alluxio (the open source project that has roots in UC Berkeley’s AMPLab), specifically emerging use cases here and in China. Given the large-scale use in China, I also wanted to get Li’s take on the state of data and AI technologies in Beijing and other parts of China.
Here are some highlights from our conversation:
A much-needed layer between compute and storage in a world with disparate storage systems
This new layer, which we call a virtual distributed file system, sits in the middle between the compute and storage layers. This new layer virtualizes data from different storage systems and presents a unified API with a global namespace for the data-driven applications to interact with all of the data in the enterprise environment.
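This is not Alluxio's actual API, but the core idea (one global namespace that routes paths to different backing stores) can be sketched in a few lines; the class and mount names below are invented for illustration:

```python
class MemStore:
    """Stand-in for one backing storage system (an object store, HDFS, ...)."""
    def __init__(self):
        self.blobs = {}
    def read(self, key):
        return self.blobs[key]
    def write(self, key, data):
        self.blobs[key] = data

class VirtualFS:
    """Toy 'virtual distributed file system': several backing stores
    mounted under one global namespace, routed by path prefix."""
    def __init__(self):
        self.mounts = {}  # mount prefix -> store
    def mount(self, prefix, store):
        self.mounts[prefix] = store
    def _route(self, path):
        # Longest-prefix match so more specific mounts win.
        for prefix in sorted(self.mounts, key=len, reverse=True):
            if path.startswith(prefix):
                return self.mounts[prefix], path[len(prefix):]
        raise FileNotFoundError(path)
    def read(self, path):
        store, key = self._route(path)
        return store.read(key)
    def write(self, path, data):
        store, key = self._route(path)
        store.write(key, data)

fs = VirtualFS()
fs.mount("/s3/", MemStore())    # hypothetical object-store mount
fs.mount("/hdfs/", MemStore())  # hypothetical HDFS mount
fs.write("/s3/logs/day1", b"log bytes")
fs.write("/hdfs/warehouse/t1", b"table bytes")
print(fs.read("/s3/logs/day1"))
```

Applications talk only to the unified API; which physical system holds the bytes becomes a mount-table detail rather than application code.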
AI and machine learning applications
One key reason people use an object store is that it is cheap. Per gigabyte or per terabyte, it’s cheaper than other solutions in the market, …but performance is not as good. From that perspective, putting open source Alluxio on top improves performance through Alluxio’s caching functionality. On top of that, in many cases, machine learning libraries cannot directly talk with object stores, and Alluxio can also serve as a translation layer.
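The caching point is easy to picture with a toy read-through layer (again a sketch, not Alluxio's API): repeated reads of the same object, for example across training epochs, hit the slow backend only once.

```python
class SlowObjectStore:
    """Stand-in for a cheap-but-slow remote object store."""
    def __init__(self, blobs):
        self.blobs = blobs
        self.round_trips = 0  # how often the backend is actually hit
    def get(self, key):
        self.round_trips += 1
        return self.blobs[key]

class CachingLayer:
    """Toy read-through cache: the first access fetches from the backend,
    repeat accesses are served locally."""
    def __init__(self, backend):
        self.backend = backend
        self.cache = {}
    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.backend.get(key)
        return self.cache[key]

store = SlowObjectStore({"train.parquet": b"training bytes"})
layer = CachingLayer(store)
for _ in range(5):  # e.g., five epochs re-reading the same training data
    layer.get("train.parquet")
print("backend round trips:", store.round_trips)  # prints: backend round trips: 1
```

The translation-layer role is analogous: the ML library sees one familiar interface while the layer adapts it to whatever the backend actually speaks.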
Adoption in China
Things are moving very fast in that region. People are eager to adopt new technology, particularly for AI and big data. Some users we know very quickly boosted their Alluxio deployments to hundreds or even thousands of nodes. It’s amazing to see how fast they can adapt.
… Of the top 10 internet companies in China, nine are using open source Alluxio in production today. All nine of them have big data and AI use cases for Alluxio. … I also travel back and forth between these two regions quite often, and every time I go there, I see more use cases, more applications, and more innovation.